hi,
i'm trying to extract the UCSC sequence levels for Canis familiaris using the function extractSeqlevelsByGroup() from the GenomeInfoDb package, but I stumbled onto the following error:
> library(GenomeInfoDb)
> library(BSgenome.Cfamiliaris.UCSC.canFam3)
> extractSeqlevelsByGroup(species=organism(Cfamiliaris), style="UCSC", group="auto")
Error in extractSeqlevelsByGroup(species = organism(Cfamiliaris), style = "UCSC", :
  The style specified by 'UCSC' does not have a compatible entry for the species Canis lupus familiaris
while the analogous call works perfectly for Homo sapiens:
> library(BSgenome.Hsapiens.UCSC.hg38)
> extractSeqlevelsByGroup(species=organism(Hsapiens), style="UCSC", group="auto")
 [1] "chr1"  "chr2"  "chr3"  "chr4"  "chr5"  "chr6"  "chr7"
 [8] "chr8"  "chr9"  "chr10" "chr11" "chr12" "chr13" "chr14"
[15] "chr15" "chr16" "chr17" "chr18" "chr19" "chr20" "chr21"
[22] "chr22"
i guess something is wrong with the mapping from the output of organism() to the GenomeInfoDb data files at GenomeInfoDb/inst/extdata/dataFiles, see:
> organism(Cfamiliaris)
[1] "Canis lupus familiaris"
> organism(Hsapiens)
[1] "Homo sapiens"
thanks for your help, session information below.
robert.

ps:
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Fedora release 12 (Constantine)

locale:
 [1] LC_CTYPE=en_US.UTF8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF8        LC_COLLATE=en_US.UTF8
 [5] LC_MONETARY=en_US.UTF8    LC_MESSAGES=en_US.UTF8
 [7] LC_PAPER=en_US.UTF8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices
[6] utils     datasets  methods   base

other attached packages:
 [1] BSgenome.Hsapiens.UCSC.hg38_1.4.1
 [2] BSgenome.Cfamiliaris.UCSC.canFam3_1.4.0
 [3] BSgenome_1.38.0
 [4] rtracklayer_1.30.1
 [5] Biostrings_2.38.0
 [6] XVector_0.10.0
 [7] GenomicRanges_1.22.0
 [8] GenomeInfoDb_1.6.0
 [9] IRanges_2.4.1
[10] S4Vectors_0.8.0
[11] BiocGenerics_0.16.0
[12] vimcom_1.2-3
[13] setwidth_1.0-4
[14] colorout_1.1-0

loaded via a namespace (and not attached):
 [1] zlibbioc_1.16.0            GenomicAlignments_1.6.1
 [3] BiocParallel_1.4.0         tools_3.2.2
 [5] SummarizedExperiment_1.0.0 Biobase_2.30.0
 [7] lambda.r_1.1.7             futile.logger_1.4.1
 [9] futile.options_1.0.0       bitops_1.0-6
[11] RCurl_1.95-4.7             Rsamtools_1.22.0
[13] XML_3.98-1.3
I wonder if it is a clash between the organism name reported by the canFam3 BSgenome package ("Canis lupus familiaris") and the organism name that GenomeInfoDb expects? Passing "Canis familiaris" directly works:
> extractSeqlevelsByGroup(species = "Canis familiaris", style = "UCSC", group = "auto")
[1] "chr1" "chr2" "chr3" "chr4" "chr5" "chr6" "chr7" "chr8" "chr9"
[10] "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17" "chr18"
[19] "chr19" "chr20" "chr21" "chr22" "chr23" "chr24" "chr25" "chr26" "chr27"
[28] "chr28" "chr29" "chr30" "chr31" "chr32" "chr33" "chr34" "chr35" "chr36"
[37] "chr37" "chr38"
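One way to check which organism names GenomeInfoDb has style tables for is sketched below. It assumes that genomeStyles(), called without arguments, returns a list named by organism, with underscores in place of spaces; the exact entries depend on the installed GenomeInfoDb version.

```r
library(GenomeInfoDb)

# Names of the organisms for which GenomeInfoDb ships seqlevel style tables
# (assumption: underscore-separated names such as "Homo_sapiens").
supported <- names(genomeStyles())

# Look for any dog-related entry among the supported organism names.
grep("Canis", supported, value = TRUE)
```

If only a two-word name shows up there, that would be consistent with the three-word name returned by organism() not being recognised.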
yes, but i'd like to have it working in general, that is, if i have a BSgenome-class object 'x':
this is something i use within my package VariantFiltering. i thought this could be an error but if not i guess i should move the question to bioc-devel.
any hint will be appreciated.
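Until the name mapping is sorted out, a possible workaround is to retry the lookup with a shortened organism name. The sketch below is only a heuristic, not part of GenomeInfoDb: the helper names candidate_species_names() and seqlevels_by_group_fallback() are hypothetical, and it assumes that dropping the middle word of a three-word name ("Canis lupus familiaris" to "Canis familiaris") gives a name the package knows, which will not hold for every organism.

```r
# Hypothetical helper: build candidate species names from an organism string.
# The name is tried as given first, then as "first word + last word".
candidate_species_names <- function(org) {
  words <- strsplit(trimws(org), "[[:space:]]+")[[1]]
  candidates <- org
  if (length(words) > 2) {
    candidates <- c(candidates, paste(words[1], words[length(words)]))
  }
  unique(candidates)
}

# Hypothetical helper: call 'lookup' on each candidate name in turn and
# return the first result that does not raise an error.
seqlevels_by_group_fallback <- function(org, lookup) {
  for (sp in candidate_species_names(org)) {
    res <- tryCatch(lookup(sp), error = function(e) NULL)
    if (!is.null(res)) return(res)
  }
  stop("no compatible entry found for organism: ", org)
}

# Usage with a BSgenome-class object 'x' (requires GenomeInfoDb and BSgenome):
# seqlevels_by_group_fallback(
#   organism(x),
#   function(sp) extractSeqlevelsByGroup(species = sp, style = "UCSC",
#                                        group = "auto")
# )
```

Passing the lookup as a function keeps the name handling separate from the GenomeInfoDb call, so the same fallback could be reused for other style or group arguments.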