BSGenome::getSeq gives error: Error in NSBS(i, x, exact = exact, upperBoundIsStrict = !allow.append)
1
0
Entering edit mode
@dborgesrmit-9232
Last seen 8.4 years ago
United States

Hello,

Whenever I try to use getSeq from the BSGenome package to fetch a sequence i get this error:

getSeq(BSgenome.Hsapiens.UCSC.hg19,"chr7")
Error in NSBS(i, x, exact = exact, upperBoundIsStrict = !allow.append) :
  subscript contains NAs or out-of-bounds indices

 

sessionInfo()

R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.9.5 (Mavericks)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BSgenome.Hsapiens.UCSC.hg19_1.4.0 BSgenome_1.38.0                   Biostrings_2.38.1                 XVector_0.10.0                   
 [5] biomaRt_2.26.0                    rtracklayer_1.30.1                GenomicRanges_1.22.1              GenomeInfoDb_1.6.1               
 [9] IRanges_2.4.2                     S4Vectors_0.8.3                   BiocGenerics_0.16.1               reshape2_1.4.1                   
[13] ggplot2_1.0.1                     shiny_0.12.2                      BiocInstaller_1.20.1             

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.2                futile.logger_1.4.1        plyr_1.8.3                 futile.options_1.0.0       bitops_1.0-6               tools_3.2.2               
 [7] zlibbioc_1.16.0            digest_0.6.8               RSQLite_1.0.0              gtable_0.1.2               DBI_0.3.1                  proto_0.3-10              
[13] stringr_1.0.0              grid_3.2.2                 Biobase_2.30.0             R6_2.1.1                   AnnotationDbi_1.32.0       XML_3.98-1.3              
[19] BiocParallel_1.4.0         lambda.r_1.1.7             magrittr_1.5               GenomicAlignments_1.6.1    scales_0.3.0               Rsamtools_1.22.0          
[25] htmltools_0.2.6            MASS_7.3-45                SummarizedExperiment_1.0.1 mime_0.4                   xtable_1.8-0               colorspace_1.2-6          
[31] httpuv_1.3.3               stringi_1.0-1              RCurl_1.95-4.7             munsell_0.4.2 

 

bsgenome getseq • 2.2k views
ADD COMMENT
0
Entering edit mode

Actually, this happens also to other use cases of getSeq (the code below is taken from GenomicFeatures extractTranscriptSeqs help page):

library(GenomicFeatures)
library(BSgenome.Hsapiens.UCSC.hg19)
genome <- BSgenome.Hsapiens.UCSC.hg19

## Load a TxDb object:
txdb_file <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                         package="GenomicFeatures")
txdb <- loadDb(txdb_file)

## Check that 'txdb' is based on the hg19 assembly:
txdb

## Extract the exon ranges grouped by transcript from 'txdb':
transcripts <- exonsBy(txdb, by="tx", use.names=TRUE)

## Extract the transcript sequences from the genome:
tx_seqs <- extractTranscriptSeqs(genome, transcripts)
tx_seqs

Error in NSBS(i, x, exact = exact, upperBoundIsStrict = !allow.append) :
  subscript contains NAs or out-of-bounds indices

 

Eventually this is caused by some updated package. I remember I did a package update today, but I can't remember which packages got updated (IRanges?).

Did you also update your packages recently?

cheers, jo

 

My session info:

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin15.0.0/x86_64 (64-bit)
Running under: OS X 10.11.2 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] GenomicFeatures_1.22.4            AnnotationDbi_1.32.0             
 [3] Biobase_2.30.0                    BSgenome.Hsapiens.UCSC.hg19_1.4.0
 [5] BSgenome_1.38.0                   rtracklayer_1.30.1               
 [7] Biostrings_2.38.1                 XVector_0.10.0                   
 [9] GenomicRanges_1.22.1              GenomeInfoDb_1.6.1               
[11] IRanges_2.4.2                     S4Vectors_0.8.3                  
[13] BiocGenerics_0.16.1              

loaded via a namespace (and not attached):
 [1] zlibbioc_1.16.0            GenomicAlignments_1.6.1   
 [3] BiocParallel_1.4.0         tools_3.2.2               
 [5] SummarizedExperiment_1.0.1 DBI_0.3.1                 
 [7] lambda.r_1.1.7             futile.logger_1.4.1       
 [9] futile.options_1.0.0       bitops_1.0-6              
[11] biomaRt_2.26.0             RCurl_1.95-4.7            
[13] RSQLite_1.0.0              Rsamtools_1.22.0          
[15] XML_3.98-1.3        
ADD REPLY
0
Entering edit mode
@martin-morgan-1513
Last seen 6 days ago
United States

I think it is A: IRanges/ShortRead causing crash?, a bad commit to IRanges 2.4.2 with a fix working it's way toward biocLite() or available now in svn.

ADD COMMENT

Login before adding your answer.

Traffic: 510 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6