Add up RLEs that are in lists?
@gen-7383
United States

Hi, I previously asked about summing RLEs but I'm now curious about extending the concept to summing lists of RLEs of arbitrary lengths. It seems fairly simple but I haven't seen a way to do it outside of loops.

Given something like...

$uc001aaa.3
numeric-Rle of length 249250621 with 5 runs
Lengths:     14101        25       219        30 249236246
Values :         0         1         0         1         0

$uc001aac.4
numeric-Rle of length 249250621 with 124 runs
Lengths:             14345                30 ...                25         249221232
Values :                 0 0.909090909090909 ... 0.909090909090909                 0

$uc001aae.4
numeric-Rle of length 249250621 with 29 runs
Lengths:     14345        30      1163        13 ...       489        25 249231381
Values :         0         1         0         1 ...         0         1         0

I know you can add two RLEs element-wise with result <- RLE1 + RLE2, i.e.

integer-Rle of length 249250621 with 69798 runs
Lengths:  20000646        25      2330        25 ...      2121        25 199255312
Values :         0         1         0         1 ...         0         1         0

integer-Rle of length 249250621 with 69798 runs
Lengths:  20000646        25      2330        25 ...      2121        25 199255312
Values :         0         1         0         1 ...         0         1         0

=

integer-Rle of length 249250621 with 69798 runs
Lengths:  20000646        25      2330        25 ...      2121        25 199255312
Values :         0         2         0         2 ...         0         2         0

but I need to do it automatically for a list of arbitrary length. Somehow, being in list form seems to prevent operations that would normally work.

reduce('+',RLElist)

sum(RLElist)

don't work, and I can't even sum elements of the list as I could RLEs that are not in a list.

thanks

RLE GenomicFeatures rnaseq
@herve-pages-1542
Seattle, WA, United States

Hi,

An important hint: if you know how to do it on an ordinary numeric vector, or list of numeric vectors, then you know how to do it on an Rle or RleList.

So let's first try to do this on a list of integer vectors:

x <- list(11:15,1:-3)
Reduce('+', x)
# [1] 12 12 12 12 12

Works! (Note that I used Reduce(), not reduce().)

Now on an RleList:

Reduce('+', RleList(x))
# integer-Rle of length 5 with 1 run
#   Lengths:  5
#   Values : 12


Also works!

In other words, there is nothing special about using Rles vs. using ordinary numeric vectors or lists as far as basic arithmetic is concerned.
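A slightly larger sketch of the same idea, assuming the IRanges package (which attaches S4Vectors) is installed. The list `cov_list` and its element names are made up for illustration; the only requirement is that all Rles have the same length.

```r
library(IRanges)  # attaches S4Vectors, which provides Rle()

# Three hypothetical coverage-like Rles, each of length 10
cov_list <- list(
  a = Rle(c(0, 1, 0), c(3, 2, 5)),
  b = Rle(c(0, 1, 0), c(4, 3, 3)),
  c = Rle(c(1, 0),    c(2, 8))
)

# Element-wise sum across all list elements, on a plain list of Rles
total <- Reduce(`+`, cov_list)

# Same thing after wrapping the list in an RleList
total_from_rlelist <- Reduce(`+`, RleList(cov_list))

# Cross-check against the same computation on ordinary numeric vectors
expected <- Reduce(`+`, lapply(cov_list, as.numeric))
stopifnot(identical(as.numeric(total), expected))
```

The result is a single Rle of length 10, no matter how many elements the list has.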

H.

@ryan-c-thompson-5618
Scripps Research, La Jolla, CA

If you just want to take the sum of all the RLEs at once, you can do sum(sapply(RLEList, sum)). If you really want to combine them all into one big RLE, you can do that with unlist(RleList(x)). For summing, I would guess that the former is faster and more memory-efficient because it does not need to construct the full concatenation of all the RLEs.
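To make the difference between these options concrete, here is a small sketch on toy data (the object `x` is hypothetical, and IRanges is assumed to be installed). It also includes the element-wise `Reduce()` sum for comparison, since the three operations return differently shaped results.

```r
library(IRanges)  # attaches S4Vectors, which provides Rle()

# Two toy Rles, each of length 10
x <- list(Rle(c(0, 1), c(6, 4)), Rle(c(2, 0), c(3, 7)))

grand_total  <- sum(sapply(x, sum))  # a single number: the sum of every value
concatenated <- unlist(RleList(x))   # one Rle of length 10 + 10 = 20
elementwise  <- Reduce(`+`, x)       # one Rle of length 10

length(concatenated)  # 20
length(elementwise)   # 10
```

Only the last form keeps positions aligned, which is what a per-position sum over a list of coverage vectors needs.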


Hi, yeah, I want to get one big RLE, not a single value. Unfortunately, the second option seems to give integer overflow errors.

Error in unlist(RleList(ExampleRLElist)) :
error in evaluating the argument 'x' in selecting a method for function 'unlist': Error in .Call2("Rle_constructor", values, lengths, check, 0L, PACKAGE = "S4Vectors") :
integer overflow while summing elements in 'lengths'

What do you think could be a solution? Maybe something like runmean?


Integer overflow means you will need to convert to numeric (i.e. floating point) because your sum is larger than the maximum representable integer. Note that you might lose some precision in the result.


Hi, somewhat of a tangent, but if you wanted to shift a numeric RLE, unchanged, to a certain position or by a certain amount, how could you do it?

The closest I can find is shiftApply, which I believe lines up two RLEs. But it doesn't seem to work on an RLE in isolation.


Please ask this as a new question. Also make sure to explain what you mean by shifting a numeric Rle. What does it mean for example to shift an ordinary numeric vector? Do you mean padding with zeroes? Again, if you know how to do it on an ordinary numeric vector, then you know how to do it on a numeric Rle. For example, padding with 100 zeroes on the left:

x <- Rle(c(-0.4, 3), 14:15)
c(Rle(0, 100), x)
# numeric-Rle of length 129 with 3 runs
#   Lengths:  100   14   15
#   Values :    0 -0.4    3

H.


The length of an Rle object cannot exceed 2^31-1. Unlisting your RleList object would result in an Rle object that exceeds that limit, hence the error. (The error message doesn't help, we should improve it.) Note that this limit is not just for Rle objects but for any vector-like or list-like object that derives from the Vector or List class (e.g. Hits, CharacterList, GRanges, GRangesList, GAlignments, etc...)
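As a back-of-the-envelope check in base R (no Bioconductor packages needed), this is how many Rles of the length shown in the question (249250621) can be concatenated before the limit is hit. The variable names are just for illustration.

```r
# Maximum length of an Rle (and of any R integer): 2^31 - 1
max_len <- .Machine$integer.max   # 2147483647

# Length of each Rle in the question
rle_len <- 249250621

# How many such Rles fit in one concatenated Rle
max_elems <- floor(max_len / rle_len)
max_elems                         # 8

# A ninth element pushes the total length past the limit
9 * rle_len > max_len             # TRUE
```

So unlisting a list with 9 or more such elements fails, while an element-wise sum with `Reduce('+', ...)` keeps the length at 249250621 and is not affected by this limit.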

H.