Dear BioC developers and community,
I am using more and more DataFrames of Rle values, typically for transcriptome expression data, and I end up writing more and more functions that take a DataFrame, lapply a function that unpacks the Rle, apply a second function, repack the Rle and convert the resulting list into a DataFrame. I was wondering (I searched and did not find anything) whether there are already classes or packages that provide such functionality, or that provide methods such as colSums, rowsum, cor, etc., adapted to be efficient in that context.
Have a nice day!
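
For readers following along, the unpack/apply/repack pattern described in this post might look roughly like the sketch below. The helper name `rleColumnApply` and the toy data are hypothetical and are not taken from any package; only `DataFrame()`, `Rle()` and `decode()` from S4Vectors are assumed.

```r
library(S4Vectors)

# Toy expression table (made-up values): one Rle column per sample.
DF <- DataFrame(
  s1 = Rle(c(0, 0, 0, 5, 5, 2)),
  s2 = Rle(c(1, 1, 0, 0, 0, 3))
)

# Hypothetical helper: decode each Rle column, apply FUN to the plain
# vector, re-encode the result as an Rle, then rebuild a DataFrame.
rleColumnApply <- function(df, FUN, ...) {
  cols <- lapply(df, function(x) Rle(FUN(decode(x), ...)))
  do.call(DataFrame,
          c(cols, list(row.names = rownames(df), check.names = FALSE)))
}

# Example: log-transform every column while keeping the Rle encoding.
logDF <- rleColumnApply(DF, log1p)
```

Only one column is held in decoded form at any time, so peak memory stays close to the size of a single dense vector.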

Thanks a lot Hervé! It took me some time to understand the obvious, but the DelayedArray wrappers are exactly what I needed.
Would you recommend that I wrap in DelayedArrays just before performing matrix-like operations, or that I use the DelayedArray class as the base class for the assays in the SummarizedExperiment objects that I produce?
(The background of my questions is that I am refactoring the CAGEr package to use MultiAssayExperiments, SummarizedExperiments and DataFrames of Rles extensively.)

Hi Hervé,
I have been using rowSums(DelayedArray(DF)) for almost 6 years now, but this week I got curious about performance and did a benchmark. Interestingly, it is much faster to decode the values and sum them than to wrap the DataFrame in a DelayedArray, or to sum the Rle values without decoding them. I hope it can be useful to you and others. Incidentally, ChatGPT did not give working code because it confused runValue and decode...

Hi Charles,
Thanks for the feedback. Operating _natively_ on the DF of Rle objects will always be more efficient than wrapping the object first in a DelayedArray object. The latter is only a quick and easy way to expedite things by getting access to all the operations supported by DelayedArray objects in general. However nothing replaces operations that are implemented to work directly on a specific type of DelayedArray seed.
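
As an illustration only, the three routes compared in the benchmark mentioned above could be written as follows. The toy `DF` and the variable names are assumptions made for this sketch; the benchmark code itself is not shown in the thread, so this is not a reproduction of it.

```r
library(S4Vectors)
library(DelayedArray)

# Toy DataFrame of Rle columns (made-up values).
DF <- DataFrame(
  s1 = Rle(c(0, 0, 0, 5, 5, 2)),
  s2 = Rle(c(1, 1, 0, 0, 0, 3))
)

# Route 1: wrap the DataFrame in a DelayedArray and use its rowSums().
viaDelayed <- rowSums(DelayedArray(DF))

# Route 2: add the Rle columns together without decoding them;
# arithmetic on Rle objects returns an Rle, decoded only at the end.
viaRle <- decode(Reduce(`+`, as.list(DF)))

# Route 3: decode each column to a plain vector, then add the vectors.
viaDecode <- Reduce(`+`, lapply(DF, decode))
```

All three should return the same row totals; relative timings will depend on the data and package versions, so it is worth timing them on a realistic object.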
Note that these "native operations" must be careful to avoid expanding all the Rle's in the DF _at once_. This is easy to do with rowSums(), but is sometimes a little bit less straightforward like in the case of rowVars().

Best,
H.
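
A minimal sketch of what "not expanding everything at once" could mean in practice is given below. The helpers `rleRowSums` and `rleRowVars` are hypothetical names, and the use of Welford's online update for the variance is an assumption chosen for this illustration, not necessarily how any Bioconductor package implements rowVars().

```r
library(S4Vectors)

# Toy DataFrame of Rle columns (made-up values).
DF <- DataFrame(
  s1 = Rle(c(0, 0, 0, 5, 5, 2)),
  s2 = Rle(c(1, 1, 0, 0, 0, 3)),
  s3 = Rle(c(2, 2, 2, 2, 4, 4))
)

# Hypothetical helper: row sums, decoding a single column per iteration.
rleRowSums <- function(df) {
  acc <- numeric(nrow(df))
  for (j in seq_len(ncol(df))) acc <- acc + decode(df[[j]])
  acc
}

# Hypothetical helper: row variances via Welford's online update,
# applied element-wise so that only one decoded column is in memory.
rleRowVars <- function(df) {
  n <- 0
  rowMean <- numeric(nrow(df))
  m2 <- numeric(nrow(df))  # running sum of squared deviations
  for (j in seq_len(ncol(df))) {
    x <- decode(df[[j]])
    n <- n + 1
    delta <- x - rowMean
    rowMean <- rowMean + delta / n
    m2 <- m2 + delta * (x - rowMean)
  }
  m2 / (n - 1)  # sample variance, same denominator as var()
}

sums <- rleRowSums(DF)
vars <- rleRowVars(DF)
```

The sum needs a single accumulator, whereas the variance needs a running mean and a running sum of squared deviations, which is one way to see why rowVars() is the less straightforward case.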