countOverlaps with weights
1
1
Entering edit mode
maltethodberg ▴ 180
@maltethodberg-9690
Last seen 10 hours ago
Denmark

Similar to how the coverage-function works with the weights argument, is there a way to use countOverlaps with weights? For example, using the score column of a bed file.

I currently do it via findOverlaps and then manipulating the Hits object, but this seems a bit slow and cumbersome - I feel like there's a GRanges trick I'm missing.

 

genomicranges countoverlaps • 1.2k views
ADD COMMENT
4
Entering edit mode
@martin-morgan-1513
Last seen 4 months ago
United States

Here is a query and subject

query = GRanges(c("1:5-10", "1:7-12"))
subject = GRanges(c("1:6-6", "1:10-10"), score=10 * 1:2)

My implementation of your 'slow and cumbersome' is

hits = as(findOverlaps(query, subject), "List")
weightedCount = sum(extractList(subject$score, hits))

which I guess is a bit cumbersome but I think quite fast. In detail... us findOverlaps and coerce the return value to an IntegerList with one element for each query

> hits = as(findOverlaps(query, subject), "List")
> hits
IntegerList of length 2
[[1]] 1 2
[[2]] 2

Use extractList() to reformat the score column in a way that parallels the hits

> extractList(subject$score, hits)
NumericList of length 2
[[1]] 10 20
[[2]] 20

and finally use sum() to accumulate the score for each query element

> sum(extractList(subject$score, hits))
[1] 30 20

 

ADD COMMENT
0
Entering edit mode

The Hits to List coercion is not an obvious one, and sometimes I wonder whether we shoud have extractByQueryHits(subject, hits) or something.

ADD REPLY
0
Entering edit mode

Thanks! I was unaware that you could coerce a Hits object to list. I was changing everything to a data.table, but your version is slightly faster and much more elegant code-wise.

ADD REPLY

Login before adding your answer.

Traffic: 786 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6