predictCoding errors with: sequence ^1$ not found
tony j (@tony-j-14276)

I am trying to run predictCoding on a small set of variants spread across the whole genome, but it fails with a "sequence not found" error:

> predictCoding(vcf, txdb_hg19, Hsapiens)
Error in .getOneSeqFromBSgenomeMultipleSequences(x, names[i], start[i],  : 
  sequence ^1$ not found

The seqlevel naming conventions of the TxDb and the VCF match, and all of my VCF ranges fall within the sequence lengths:

> seqlevels(txdb_hg19)
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "X"  "Y" 
> seqlevels(my_vcf)
 [1] "1"  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "2"  "20" "21" "22" "3"  "4"  "5"  "6"  "7"  "8"  "9"  "X"  "Y" 
> which(end(my_vcf) > seqlengths(txdb_hg19)[as.character(seqnames(my_vcf))])
named integer(0)

Please let me know what other details would aid in troubleshooting.

Thanks in advance for any direction!

TJ

> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 15063)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] SNPlocs.Hsapiens.dbSNP142.GRCh37_0.99.5 BSgenome.Hsapiens.UCSC.hg19_1.4.0       BSgenome_1.44.2                         rtracklayer_1.36.6                      org.Hs.eg.db_3.4.1                     
 [6] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicFeatures_1.28.5                  AnnotationDbi_1.38.2                    biomaRt_2.32.1                          plyr_1.8.4                             
[11] VariantAnnotation_1.22.3                Rsamtools_1.28.0                        Biostrings_2.44.2                       XVector_0.16.0                          SummarizedExperiment_1.6.5             
[16] DelayedArray_0.2.7                      matrixStats_0.52.2                      Biobase_2.36.2                          GenomicRanges_1.28.6                    GenomeInfoDb_1.12.3                    
[21] IRanges_2.10.5                          S4Vectors_0.14.7                        BiocGenerics_0.22.1                     variants_0.99.129254                   

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.13                     compiler_3.4.2                   bitops_1.0-6                     tools_3.4.2                      zlibbioc_1.22.0                  digest_0.6.12                    bit_1.1-12                      
 [8] RSQLite_2.0                      memoise_1.1.0                    tibble_1.3.4                     lattice_0.20-35                  pkgconfig_2.0.1                  rlang_0.1.2                      Matrix_1.2-11                   
[15] DBI_0.7                          GenomeInfoDbData_0.99.0          bit64_0.9-7                      grid_3.4.2                       cgdv17_0.14.0                    XML_3.98-1.9                     BiocParallel_1.10.1             
[22] PolyPhen.Hsapiens.dbSNP131_1.0.2 blob_1.1.0                       GenomicAlignments_1.12.2         RCurl_1.95-4.8                  
variantannotation predict coding locateVariants granges

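Yep - figured it out almost immediately after posting. The Hsapiens seqlevels were not set to "NCBI", so the BSgenome object still used UCSC-style names ("chr1", "chr2", ...) while the TxDb and VCF used "1", "2", ...: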

seqlevelsStyle(Hsapiens) <- "NCBI"
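
For anyone hitting the same error: the checks in the question compare the VCF against the TxDb, but not against the BSgenome object that predictCoding also reads from. Below is a minimal sketch of that missing check. The helper missing_seqlevels() is hypothetical (it is not part of any Bioconductor package), and the name vectors are typed in by hand to mimic the objects in this thread, so the snippet runs in base R without the annotation packages installed.

```r
# Hypothetical helper (not from any package): list the sequence names in
# `query` that do not occur in `reference`.
missing_seqlevels <- function(query, reference) {
  setdiff(as.character(query), as.character(reference))
}

# NCBI-style names, as reported for the VCF and TxDb in the question.
vcf_levels <- c(as.character(1:22), "X", "Y")

# UCSC-style names, which BSgenome.Hsapiens.UCSC.hg19 uses by default.
# (Hand-built here for illustration; the real object has more sequences.)
genome_levels <- paste0("chr", vcf_levels)

# Every VCF name is absent from the genome, hence "sequence ^1$ not found".
missing_seqlevels(vcf_levels, genome_levels)

# After switching the genome to NCBI-style names, nothing is missing.
missing_seqlevels(vcf_levels, sub("^chr", "", genome_levels))

# With the real objects loaded, the equivalent check and fix would be:
#   missing_seqlevels(seqlevels(my_vcf), seqnames(Hsapiens))
#   seqlevelsStyle(Hsapiens) <- "NCBI"
```

If the first call returns anything other than character(0) for your own objects, the genome and the variants disagree on naming, and one of them needs its seqlevelsStyle changed before calling predictCoding.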