rtracklayer import only subset of ranges
0
0
Entering edit mode
Jake ▴ 90
@jake-7236
Last seen 2.3 years ago
United States

Hi,

I have a number of bigBed files and I am trying to import only the ranges overlapping a specific window. The rtracklayer import vignette has an example of how to do this, but when I try to run the vignette it doesn't seem to import any ranges.


> test_path <- system.file("tests", package = "rtracklayer")
> test_bb <- file.path(test_path, "test.bb")
> 
> ## Returns ranges with all fields
> gr <- import(test_bb)
> gr
GRanges object with 10 ranges and 4 metadata columns:
       seqnames        ranges strand |        name     score signalValue
          <Rle>     <IRanges>  <Rle> | <character> <integer>   <numeric>
   [1]     chr1 237640-237790      * |           .        70          10
   [2]     chr1 521500-521650      * |           .       140          20
   [3]     chr1 565725-565875      * |           .       210          30
   [4]     chr1 565900-566050      * |           .       280          40
   [5]     chr1 566760-566910      * |           .       350          50
   [6]    chr10 119905-120055      * |           .       420          60
   [7]    chr10 122525-122675      * |           .       490          70
   [8]    chr10 173925-174075      * |           .       560          80
   [9]    chr10 179865-180015      * |           .       630          90
  [10]    chr10 180185-180335      * |           .       700         100
            peak
       <integer>
   [1]        -1
   [2]        -1
   [3]        -1
   [4]        -1
   [5]        -1
   [6]        -1
   [7]        -1
   [8]        -1
   [9]        -1
  [10]        -1
  -------
  seqinfo: 2 sequences from an unspecified genome
> 
> ## Retuns ranges only for 'chr10'
> ## between 180185-180185 with all fields
> which <- GRanges(c("chr10"), IRanges(c(180185, 180185)))
> rtracklayer::import(test_bb, which = which)
GRanges object with 0 ranges and 4 metadata columns:
   seqnames    ranges strand |        name     score signalValue      peak
      <Rle> <IRanges>  <Rle> | <character> <integer>   <numeric> <integer>
  -------
  seqinfo: 2 sequences from an unspecified genome
> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Matrix products: default
BLAS:   /usr/local/lib/R/lib/libRblas.so
LAPACK: /usr/local/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] rtracklayer_1.54.0   GenomicRanges_1.46.1 GenomeInfoDb_1.30.1 
[4] IRanges_2.28.0       S4Vectors_0.32.4     BiocGenerics_0.40.0 

loaded via a namespace (and not attached):
 [1] XVector_0.34.0              zlibbioc_1.40.0            
 [3] GenomicAlignments_1.30.0    BiocParallel_1.28.3        
 [5] lattice_0.20-44             rjson_0.2.21               
 [7] tools_4.1.1                 grid_4.1.1                 
 [9] SummarizedExperiment_1.24.0 parallel_4.1.1             
[11] Biobase_2.54.0              matrixStats_0.61.0         
[13] yaml_2.3.5                  crayon_1.5.0               
[15] BiocIO_1.4.0                Matrix_1.3-4               
[17] GenomeInfoDbData_1.2.7      restfulr_0.0.13            
[19] bitops_1.0-7                RCurl_1.98-1.6             
[21] DelayedArray_0.20.0         compiler_4.1.1             
[23] MatrixGenerics_1.6.0        Biostrings_2.62.0          
[25] Rsamtools_2.10.0            XML_3.99-0.9
rtracklayer • 808 views
ADD COMMENT
0
Entering edit mode

Looks like it might be exclusion of the start but inclusive of stop:

>  which <- GRanges(c("chr10"), IRanges(start=180185, end=180185))
>  rtracklayer::import(test_bb, which = which)
GRanges object with 0 ranges and 4 metadata columns:
   seqnames    ranges strand |        name     score signalValue      peak
      <Rle> <IRanges>  <Rle> | <character> <integer>   <numeric> <integer>
  -------
  seqinfo: 2 sequences from an unspecified genome


>  which <- GRanges(c("chr10"), IRanges(start=180185, end=180335))
>  rtracklayer::import(test_bb, which = which)
GRanges object with 1 range and 4 metadata columns:
      seqnames        ranges strand |        name     score signalValue
         <Rle>     <IRanges>  <Rle> | <character> <integer>   <numeric>
  [1]    chr10 180185-180335      * |           .       700         100
           peak
      <integer>
  [1]        -1
  -------
  seqinfo: 2 sequences from an unspecified genome




>  which <- GRanges(c("chr10"), IRanges(start=180186, end=180186))
>  rtracklayer::import(test_bb, which = which)
GRanges object with 1 range and 4 metadata columns:
      seqnames        ranges strand |        name     score signalValue
         <Rle>     <IRanges>  <Rle> | <character> <integer>   <numeric>
  [1]    chr10 180185-180335      * |           .       700         100
           peak
      <integer>
  [1]        -1
  -------
  seqinfo: 2 sequences from an unspecified genome


>  which <- GRanges(c("chr10"), IRanges(start=180335, end=180335))
>  rtracklayer::import(test_bb, which = which)
GRanges object with 1 range and 4 metadata columns:
      seqnames        ranges strand |        name     score signalValue
         <Rle>     <IRanges>  <Rle> | <character> <integer>   <numeric>
  [1]    chr10 180185-180335      * |           .       700         100
           peak
      <integer>
  [1]        -1
  -------
  seqinfo: 2 sequences from an unspecified genome
ADD REPLY

Login before adding your answer.

Traffic: 512 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6