NA values when importing a BED file using the import function

Dear all,

I am trying to import a BED file using rtracklayer's import function (the same problem appears with import.bed) so that I can later use getSeq to retrieve the FASTA sequences for the ranges in this BED file (loaded as a GRanges object).

When I run getSeq I get this error:

getSeq(hg38, testbed)
Error in NSBS(i, x, exact = exact, upperBoundIsStrict = !allow.append) : 
  subscript contains NAs or out-of-bounds indices

So I went back and checked the imported GRanges object, testbed, for NAs, and there are a lot of them!

Any suggestions?

I am new to R and rtracklayer, but have used the getSeq function successfully before.

My R session is:

R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux stretch/sid

 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8       
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                  LC_ADDRESS=C              

attached base packages:
 [1] stats4    grid      parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] rtracklayer_1.30.1   Gviz_1.14.0          GenomicRanges_1.22.1 GenomeInfoDb_1.6.1   IRanges_2.4.2       
 [6] S4Vectors_0.8.3      genomation_1.2.1     CNEr_1.6.1           AnnotationHub_2.2.2  BiocGenerics_0.16.1 
[11] BiocInstaller_1.20.1

loaded via a namespace (and not attached):
 [1] Biobase_2.30.0               httr_1.0.0                   splines_3.2.2               
 [4] Formula_1.2-1                shiny_0.12.2                 interactiveDisplayBase_1.8.0
 [7] latticeExtra_0.6-26          BSgenome_1.38.0              Rsamtools_1.22.0            
[10] impute_1.44.0                RSQLite_1.0.0                lattice_0.20-33             
[13] biovizBase_1.18.0            chron_2.3-47                 digest_0.6.8                
[16] RColorBrewer_1.1-2           XVector_0.10.0               colorspace_1.2-6            
[19] htmltools_0.2.6              httpuv_1.3.3                 plyr_1.8.3                  
[22] XML_3.98-1.3                 biomaRt_2.26.0               zlibbioc_1.16.0             
[25] xtable_1.8-0                 scales_0.3.0                 BiocParallel_1.4.0          
[28] ggplot2_1.0.1                seqPattern_1.2.0             SummarizedExperiment_1.0.1  
[31] GenomicFeatures_1.22.4       nnet_7.3-11                  proto_0.3-10                
[34] survival_2.38-3              magrittr_1.5                 mime_0.4                    
[37] MASS_7.3-45                  foreign_0.8-66               tools_3.2.2                 
[40] data.table_1.9.6             matrixStats_0.15.0           gridBase_0.4-7              
[43] stringr_1.0.0                munsell_0.4.2                cluster_2.0.3               
[46] plotrix_3.6                  AnnotationDbi_1.32.0         lambda.r_1.1.7              
[49] Biostrings_2.38.1            futile.logger_1.4.1          RCurl_1.95-4.7              
[52] dichromat_2.0-0              VariantAnnotation_1.16.3     bitops_1.0-6                
[55] gtable_0.1.2                 DBI_0.3.1                    reshape2_1.4.1              
[58] R6_2.1.1                     GenomicAlignments_1.6.1      gridExtra_2.0.0             
[61] Hmisc_3.17-0                 futile.options_1.0.0         KernSmooth_2.23-15          
[64] readr_0.2.2                  stringi_1.0-1                Rcpp_0.12.2                 
[67] rpart_4.1-10                 acepack_1.3-3.3







Update your packages using BiocInstaller::biocLite(); your IRanges version has a bug that has since been fixed, although I'm not sure that updating will solve your problem entirely. The NA part of the message would refer to the granges() part of the GRanges, not the mcols() part. The other part of the error message (out-of-bounds) might occur if the ranges had negative coordinates or extended past the ends of the sequences themselves. After running example(import.bed) I did

gr = import(test_bed)
grl = splitAsList(gr, seqnames(gr))
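
The out-of-bounds possibilities mentioned above can be checked directly. The following is only a rough sketch in base R, not part of rtracklayer or BSgenome: the helper name flag_bad_ranges and the toy sequence lengths are made up for illustration, and it assumes the sequence lengths are available as a named integer vector (for a BSgenome object, seqlengths() returns one).

```r
# Hypothetical helper: flag ranges that could trip getSeq(), i.e. sequence
# names missing from the genome, NA or non-positive coordinates, or ends
# that run past the end of the sequence.
# `seqlens` is a named integer vector of sequence lengths.
flag_bad_ranges <- function(chrom, start, end, seqlens) {
  chrom <- as.character(chrom)
  len <- seqlens[chrom]                 # NA where the name is not in the genome
  unknown <- is.na(len)
  bad_coord <- is.na(start) | is.na(end) | start < 1L
  too_long <- !unknown & !is.na(end) & end > len
  data.frame(chrom, start, end,
             unknown_seqname = unname(unknown),
             bad_coordinate = bad_coord,
             past_seq_end = unname(too_long),
             stringsAsFactors = FALSE)
}

# Toy data with made-up lengths: row 2 runs past the end of chr2,
# row 3 uses a name ("1") that the genome does not have, row 4 starts at 0.
seqlens <- c(chr1 = 1000L, chr2 = 500L)
toy <- flag_bad_ranges(
  chrom = c("chr1", "chr2", "1", "chr1"),
  start = c(10L, 450L, 5L, 0L),
  end   = c(20L, 600L, 15L, 9L),
  seqlens = seqlens
)
print(toy)

# With real objects (assuming `testbed` is a GRanges and `hg38` a BSgenome):
# flags <- flag_bad_ranges(as.character(seqnames(testbed)),
#                          start(testbed), end(testbed), seqlengths(hg38))
```

A mismatch in naming style (for example "1" in the BED file versus "chr1" in the genome) would show up in the unknown_seqname column.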

Thanks a lot for your kind and prompt reply, Martin.

I have updated Bioconductor and tried your suggestions, but it still doesn't work for me. In the meantime I also tried on my local machine, and there it worked as expected. I still do not know what the problem was, but for the record this is the current session on my machine:

R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.1 (El Capitan)

[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BSgenome.Hsapiens.UCSC.hg38_1.4.1 BSgenome_1.38.0                  
 [3] rtracklayer_1.30.1                GenomicRanges_1.22.1             
 [5] GenomeInfoDb_1.6.1                BiocInstaller_1.20.1             
 [7] Biostrings_2.38.1                 XVector_0.10.0                   
 [9] IRanges_2.4.3                     S4Vectors_0.8.3                  
[11] BiocGenerics_0.16.1              

loaded via a namespace (and not attached):
 [1] XML_3.98-1.3               Rsamtools_1.22.0           GenomicAlignments_1.6.1   
 [4] bitops_1.0-6               futile.options_1.0.0       zlibbioc_1.16.0           
 [7] futile.logger_1.4.1        lambda.r_1.1.7             BiocParallel_1.4.0        
[10] tools_3.2.2                Biobase_2.30.0             RCurl_1.95-4.7            
[13] SummarizedExperiment_1.0.1


Maybe you could divide testbed in half, testbed[1:(length(testbed)/2)], and use getSeq() with that. Continue dividing until you have one or two ranges that generate the error, and then compare the granges() with the seqnames() and seqlengths() of hg38.
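
The halving can also be automated. This is a minimal sketch in base R, with the function name bisect_failures and the toy failure test invented for illustration. It assumes you supply a function fails(idx) that returns TRUE when the subset idx triggers the error.

```r
# Hypothetical helper: recursively halve a set of indices, following only
# the halves that still fail, until single failing elements remain.
bisect_failures <- function(idx, fails) {
  if (length(idx) == 0L || !fails(idx)) return(integer(0))
  if (length(idx) == 1L) return(idx)
  mid <- length(idx) %/% 2L
  c(bisect_failures(idx[seq_len(mid)], fails),
    bisect_failures(idx[-seq_len(mid)], fails))
}

# Toy stand-in for getSeq(): a subset "fails" if any end is NA or above 500.
ends <- c(20L, 600L, 15L, NA, 90L)
toy_fails <- function(idx) any(is.na(ends[idx]) | ends[idx] > 500L)
bad <- bisect_failures(seq_along(ends), toy_fails)
print(bad)

# With real objects (assuming `testbed` is a GRanges and `hg38` a BSgenome):
# fails <- function(idx)
#   inherits(try(getSeq(hg38, testbed[idx]), silent = TRUE), "try-error")
# testbed[bisect_failures(seq_along(testbed), fails)]
```

The ranges returned by the last line are the ones to compare against seqnames() and seqlengths() of hg38.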

