r.a.policastro
Hello all,
I am trying to use plyranges to do a strand-specific reduction and aggregation of score with gaps allowed. This is conceptually similar to section 4.1 in the HelloRanges tutorial, except using the min.gapwidth argument of GenomicRanges::reduce so that ranges separated by small gaps are merged. The closest function I see in plyranges is reduce_ranges_directed, but this does not allow gaps.
This question was posted yesterday in Biostars as well, but a consensus could not be reached on a best method.
Example data.
library("plyranges")
df <- data.frame(
seqnames="chrI", start=c(1, 10, 20), end=c(5, 15, 25), strand=c("+", "+", "-"),
score=c(8, 3, 6)
)
gr <- as_granges(df)
> gr
GRanges object with 3 ranges and 1 metadata column:
      seqnames    ranges strand |     score
         <Rle> <IRanges>  <Rle> | <numeric>
  [1]     chrI       1-5      + |         8
  [2]     chrI     10-15      + |         3
  [3]     chrI     20-25      - |         6
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
Example of the desired output, with a maximum allowed gap width of 10 and the scores aggregated by summing.
desired_output <- data.frame(
seqnames="chrI", start=c(1, 20), end=c(15, 25), strand=c("+", "-"),
score=c(11, 6)
)
desired_output <- as_granges(desired_output)
> desired_output
GRanges object with 2 ranges and 1 metadata column:
      seqnames    ranges strand |     score
         <Rle> <IRanges>  <Rle> | <numeric>
  [1]     chrI      1-15      + |        11
  [2]     chrI     20-25      - |         6
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
Cheers!
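For reference, the desired output above can be reproduced with GenomicRanges directly. This is only a sketch of one possible approach, not a plyranges solution: it assumes reduce() with with.revmap = TRUE to record which input ranges were merged, and extractList() to regroup the scores before summing. The name max_gap is made up for illustration. Since min.gapwidth is the smallest gap that is NOT merged, a maximum allowed gap of 10 corresponds to min.gapwidth = 11.

```r
library(GenomicRanges)

# Same example data as above, built directly as a GRanges
gr <- GRanges(
  seqnames = "chrI",
  ranges = IRanges(start = c(1, 10, 20), end = c(5, 15, 25)),
  strand = c("+", "+", "-"),
  score = c(8, 3, 6)
)

# Hypothetical helper variable: largest gap that should still be merged
max_gap <- 10L

# Strand-aware reduce (ignore.strand = FALSE is the default).
# revmap records, for each merged range, the indices of the input ranges.
merged <- reduce(gr, min.gapwidth = max_gap + 1L, with.revmap = TRUE)

# Regroup the scores by merged range and sum within each group
merged$score <- sum(extractList(gr$score, merged$revmap))
merged$revmap <- NULL

merged
```

Swapping sum() for another list-aware summary such as mean() or max() changes the aggregation without touching the reduction step.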