Question: predictCoding errors with: sequence ^1$ not found
tony j wrote (28 days ago):

I am running predictCoding as follows on a small set of variants spread across the whole genome, and it fails with a "sequence not found" error:

> predictCoding(vcf, txdb_hg19, Hsapiens)
Error in .getOneSeqFromBSgenomeMultipleSequences(x, names[i], start[i],  : 
  sequence ^1$ not found

The seqlevels naming conventions of the TxDb and the VCF match, and my VCF ranges fall within the sequence lengths:

> seqlevels(txdb_hg19)
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "X"  "Y" 
> seqlevels(my_vcf)
 [1] "1"  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "2"  "20" "21" "22" "3"  "4"  "5"  "6"  "7"  "8"  "9"  "X"  "Y" 
> which(end(my_vcf) > seqlengths(txdb_hg19)[as.character(seqnames(my_vcf))])
named integer(0)

Please let me know what other details would aid in troubleshooting.

Thanks in advance for any direction!

TJ

> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 15063)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] SNPlocs.Hsapiens.dbSNP142.GRCh37_0.99.5 BSgenome.Hsapiens.UCSC.hg19_1.4.0       BSgenome_1.44.2                         rtracklayer_1.36.6                      org.Hs.eg.db_3.4.1                     
 [6] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicFeatures_1.28.5                  AnnotationDbi_1.38.2                    biomaRt_2.32.1                          plyr_1.8.4                             
[11] VariantAnnotation_1.22.3                Rsamtools_1.28.0                        Biostrings_2.44.2                       XVector_0.16.0                          SummarizedExperiment_1.6.5             
[16] DelayedArray_0.2.7                      matrixStats_0.52.2                      Biobase_2.36.2                          GenomicRanges_1.28.6                    GenomeInfoDb_1.12.3                    
[21] IRanges_2.10.5                          S4Vectors_0.14.7                        BiocGenerics_0.22.1                     variants_0.99.129254                   

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.13                     compiler_3.4.2                   bitops_1.0-6                     tools_3.4.2                      zlibbioc_1.22.0                  digest_0.6.12                    bit_1.1-12                      
 [8] RSQLite_2.0                      memoise_1.1.0                    tibble_1.3.4                     lattice_0.20-35                  pkgconfig_2.0.1                  rlang_0.1.2                      Matrix_1.2-11                   
[15] DBI_0.7                          GenomeInfoDbData_0.99.0          bit64_0.9-7                      grid_3.4.2                       cgdv17_0.14.0                    XML_3.98-1.9                     BiocParallel_1.10.1             
[22] PolyPhen.Hsapiens.dbSNP131_1.0.2 blob_1.1.0                       GenomicAlignments_1.12.2         RCurl_1.95-4.8                  

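Yep - figured it out almost immediately after posting. The seqlevels style of Hsapiens (the BSgenome object) was not set to "NCBI", so its sequence names ("chr1", "chr2", ...) did not match the "1", "2", ... used by the TxDb and the VCF: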

seqlevelsStyle(Hsapiens) <- "NCBI"
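
For anyone hitting the same error, here is a rough sketch of how the mismatch could be spotted before calling predictCoding. The helper `missing_seqlevels` and the vectors `vcf_levels` and `ucsc_levels` are made up for illustration and are not part of any package. The first part is base R only. The second part assumes BSgenome.Hsapiens.UCSC.hg19 is installed and is skipped otherwise. The error text suggests the sequence name is looked up as an anchored pattern, which the `grep` call imitates, but that is my reading of the message, not something I checked in the BSgenome source.

```r
# Hypothetical helper: names used by the query that the genome does not have.
missing_seqlevels <- function(query_levels, genome_levels) {
  setdiff(query_levels, genome_levels)
}

# Sequence names as shown in the question (VCF/TxDb, NCBI style) ...
vcf_levels  <- c(as.character(1:22), "X", "Y")
# ... and UCSC-style names of the main chromosomes ("chr1", "chr2", ...).
ucsc_levels <- paste0("chr", vcf_levels)

# Every VCF name is missing from the UCSC-style set.
missing_seqlevels(vcf_levels, ucsc_levels)

# An anchored lookup for "1" finds nothing among UCSC-style names.
grep("^1$", ucsc_levels)

# Same check against the real genome object, if the package is available.
if (requireNamespace("BSgenome.Hsapiens.UCSC.hg19", quietly = TRUE)) {
  library(BSgenome.Hsapiens.UCSC.hg19)
  genome <- Hsapiens
  print(seqlevelsStyle(genome))                          # style before the fix
  print(missing_seqlevels(vcf_levels, seqlevels(genome)))
  seqlevelsStyle(genome) <- "NCBI"                       # the fix from this thread
  print(missing_seqlevels(vcf_levels, seqlevels(genome)))
}
```

With the real objects from the question, the same check would be `missing_seqlevels(seqlevels(my_vcf), seqlevels(Hsapiens))`. Comparing the VCF only against the TxDb, as in the original post, cannot reveal the problem because the third argument to predictCoding is never inspected.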
Reply written 28 days ago by tony j