Can GenomicRanges findOverlaps ignore seqnames?
@o-william-mcclung-22004

findOverlaps in the GenomicRanges package has an argument, ignore.strand, which when set to TRUE makes the overlap computation use only the pair (seqnames, ranges), ignoring the strand. Is there a way to make findOverlaps ignore seqnames instead, so that only the pair (ranges, strand) is used to compute an overlap? If not, is there another way to compute overlaps using only (ranges, strand)?

Any pointers will be gratefully received.

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
[1] GenomicRanges_1.36.1 GenomeInfoDb_1.20.0  IRanges_2.18.2      
[4] S4Vectors_0.22.1     BiocGenerics_0.30.0 

loaded via a namespace (and not attached):
[1] zlibbioc_1.30.0        compiler_3.6.1         XVector_0.24.0        
[4] GenomeInfoDbData_1.2.1 RCurl_1.95-4.12        bitops_1.0-6          


Would you please provide more details on the use case?

@herve-pages-1542

Just set the seqnames of all the ranges in the query and subject to the same value. This can be done with something like:

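library(GenomicRanges)
example(GRanges)  # runs the GRanges examples, which create the object 'gr'
# Rebuild 'gr' with a single dummy seqname "A", keeping its ranges and strand
GRanges("A", ranges(gr), strand(gr))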
# GRanges object with 10 ranges and 0 metadata columns:
#     seqnames    ranges strand
#        <Rle> <IRanges>  <Rle>
#   a        A      1-10      -
#   b        A      2-10      +
#   c        A      3-10      +
#   d        A      4-10      *
#   e        A      5-10      *
#   f        A      6-10      +
#   g        A      7-10      +
#   h        A      8-10      +
#   i        A      9-10      -
#   j        A        10      -
#   -------
#   seqinfo: 1 sequence from an unspecified genome; no seqlengths

However, I can't think of any real-world situation where doing something like this would actually be meaningful.
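
For completeness, here is a minimal sketch of how the same trick could be applied to both sides of a findOverlaps call. The query and subject objects and the helper drop_seqnames() are made up for illustration and are not part of GenomicRanges. Note that rebuilding the objects this way keeps only the ranges (with their names) and the strand, so any metadata columns are dropped from the rebuilt copies.

```r
library(GenomicRanges)

# Hypothetical query and subject that share no seqnames.
query <- GRanges(
  seqnames = c("chr1", "chr2"),
  ranges   = IRanges(start = c(1, 50), end = c(10, 60)),
  strand   = c("+", "-")
)
subject <- GRanges(
  seqnames = c("chr3", "chr3", "chr4"),
  ranges   = IRanges(start = c(5, 5, 55), end = c(15, 15, 65)),
  strand   = c("+", "-", "-")
)

# Hypothetical helper (not a GenomicRanges function): rebuild a GRanges
# with one dummy seqname, keeping only its ranges and strand.
drop_seqnames <- function(gr, dummy = "A") {
  GRanges(dummy, ranges(gr), strand(gr))
}

# Strand-aware overlaps computed on (ranges, strand) only.
hits <- findOverlaps(drop_seqnames(query), drop_seqnames(subject))

# The hit indices are positions in the original query and subject.
queryHits(hits)    # 1 2
subjectHits(hits)  # 1 3
```

Because the rebuilt objects have the same length and order as the originals, the indices in hits can be used directly to subset query and subject, for example query[queryHits(hits)].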

@o-william-mcclung-22004

@Hervé: Many thanks. This solution clearly works.
@Hervé and Michael: Thanks for pointing out that this use case should never occur. I need to go back and rethink my pipeline.
