Using findOverlaps() on a GRanges object, I would like to retrieve the following hits:
Sub: -------|||||||||||||------------------
Query Ranges:
Hit5: ---------||||||||||||||||||------------
Hit6: -----------|||||||||||||||||||---------
Hit4: ---------|||||||---------------------
.
.
.
In other words, I want the query ranges that start within a subject range, whether or not they extend beyond its end.
Minimal example:
> sub <- GRanges(c(1),strand=Rle(c("+"),c(1)), IRanges(c(5), c(7)),mcols=data.frame(id=c("T1")))
> sub
GRanges object with 1 range and 1 metadata column:
      seqnames    ranges strand | mcols.id
         <Rle> <IRanges>  <Rle> | <factor>
  [1]        1    [5, 7]      + |       T1
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
> query <- GRanges(c(1,1,1,1,1,1),strand=Rle(c("+","+","+","+","+","+"),c(1,1,1,1,1,1)), IRanges(c(4,4,6,6,7,7), c(5,5,6,6,8,8)),mcols=data.frame(id=c("T8","T9","T10","T11","T12","T13")))
> query
GRanges object with 6 ranges and 1 metadata column:
      seqnames    ranges strand | mcols.id
         <Rle> <IRanges>  <Rle> | <factor>
  [1]        1    [4, 5]      + |       T8
  [2]        1    [4, 5]      + |       T9
  [3]        1    [6, 6]      + |      T10
  [4]        1    [6, 6]      + |      T11
  [5]        1    [7, 8]      + |      T12
  [6]        1    [7, 8]      + |      T13
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
I tried countOverlaps() with type="start", but that only counts hits whose start is at exactly the same position as the subject's start:
> sum(countOverlaps(query,sub))
[1] 6
> sum(countOverlaps(query,sub,type="start"))
[1] 0
There must be a way, thanks for looking into that!
Hi Michael! Oh, I understand what you are doing here!
However:
Here, subject has to be an IRanges object, which doesn't account for strand information. So with ranges(sub) I get the IRanges, and then I have to take care of the strand information myself.
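A minimal sketch of what handling the strand by hand might look like on the IRanges level. It assumes the query starts are turned into width-1 IRanges and matched against ranges(sub); the toy data mirror the minimal example above, except that "id" is stored directly as a metadata column, and the names query_starts, q, s and keep are only illustrative.

```r
library(GenomicRanges)

# Toy data mirroring the minimal example (assumption: "id" stored directly in mcols).
sub <- GRanges("1", IRanges(start = 5, end = 7), strand = "+", id = "T1")
query <- GRanges(rep("1", 6),
                 IRanges(start = c(4, 4, 6, 6, 7, 7), end = c(5, 5, 6, 6, 8, 8)),
                 strand = rep("+", 6),
                 id = paste0("T", 8:13))

# Width-1 ranges at the leftmost coordinate of every query range.
# Plain IRanges carry neither seqnames nor strand, so both are lost here.
query_starts <- IRanges(start = start(query), width = 1)
hits <- findOverlaps(query_starts, ranges(sub))

# Restore the checks that GRanges would normally perform:
# keep only pairs on the same sequence and the same strand.
# (Simplification: "*" is compared literally instead of matching any strand.)
q <- queryHits(hits)
s <- subjectHits(hits)
keep <- as.character(seqnames(query))[q] == as.character(seqnames(sub))[s] &
        as.character(strand(query))[q] == as.character(strand(sub))[s]
hits <- hits[keep]

query[queryHits(hits)]  # queries whose leftmost position lies inside a subject range
```

Note that start() is always the leftmost coordinate, so for ranges on the "-" strand the biological start would be end() and would need separate treatment in this approach.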
This is a nice solution, thank you very much Michael!
Sorry, here is a better way for GRanges:
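One strand-aware way to express this directly on GRanges, given here as a sketch under assumptions and not necessarily the code originally posted: shrink every query range to its first position with resize(), then call findOverlaps() against the unchanged subject. The toy data mirror the minimal example above (with "id" stored directly as a metadata column), and query_starts is an illustrative name.

```r
library(GenomicRanges)

# Toy data mirroring the minimal example (assumption: "id" stored directly in mcols).
sub <- GRanges("1", IRanges(start = 5, end = 7), strand = "+", id = "T1")
query <- GRanges(rep("1", 6),
                 IRanges(start = c(4, 4, 6, 6, 7, 7), end = c(5, 5, 6, 6, 8, 8)),
                 strand = rep("+", 6),
                 id = paste0("T", 8:13))

# resize() with fix = "start" is strand-aware on GRanges:
# it anchors at start() for "+" and "*" ranges and at end() for "-" ranges.
query_starts <- resize(query, width = 1, fix = "start")

# Seqnames and strand are still attached, so findOverlaps() checks them itself.
hits <- findOverlaps(query_starts, sub)

query[queryHits(hits)]            # the original (full-length) query ranges that qualify
countOverlaps(query_starts, sub)  # per-query counts
```

With the example data, the queries starting at 6 and 7 fall inside the subject [5, 7] and are reported, while the two starting at 4 are not.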