Question: Generic functions for DataFrames of Rle objects?
12 months ago, Charles Plessy wrote:

Dear BioC developers and community,

I am using DataFrames of Rle values more and more, typically for transcriptome expression data, and I keep writing functions that take a DataFrame, lapply a function that unpacks each Rle, apply a second function, repack the result as an Rle, and convert the resulting list into a DataFrame. I searched but could not find out whether there are already classes or packages that provide this functionality, or that provide methods such as colSums, rowsum, cor, etc., adapted to be efficient in that context.
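To make the pattern concrete, here is a minimal sketch of the kind of wrapper described above. The toy data and the helper name rleApply are hypothetical and only for illustration; they are not taken from any package.

```r
library(S4Vectors)

# Hypothetical toy data: two Rle columns standing in for expression counts.
df <- DataFrame(a = Rle(c(0, 0, 0, 5, 5)),
                b = Rle(c(1, 1, 0, 0, 0)))

# Hypothetical helper: unpack each Rle column to a plain vector, apply f,
# repack the result as an Rle, then rebuild a DataFrame from the list.
rleApply <- function(df, f) {
  cols <- lapply(df, function(col) Rle(f(as.vector(col))))
  do.call(DataFrame, cols)
}

log_df <- rleApply(df, log1p)
```

Every call fully decompresses each column, which is the overhead I would like to avoid.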

12 months ago, Hervé Pagès wrote:

Hi Charles,

FWIW the DelayedArray package allows you to manipulate a DataFrame of Rle columns as a matrix-like object by just wrapping it inside a DelayedArray object (you do this by calling DelayedArray() on it). See ?DelayedArray for more information. After applying (delayed) operations on it, you can turn the DelayedArray object back into a DataFrame of Rle columns by just coercing it to DataFrame (i.e. with as(  , "DataFrame")). Note that I just added the coercion method to DataFrame in DelayedArray 0.2.5. This new version of the package should become available via biocLite() in 48h or less.
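A minimal sketch of that round trip, assuming DelayedArray 0.2.5 or later. The toy data and variable names are hypothetical, and the exact column representation of the result may depend on the package version.

```r
library(S4Vectors)
library(DelayedArray)

# Hypothetical toy data: a DataFrame with two Rle columns.
df <- DataFrame(s1 = Rle(c(0, 0, 0, 5, 5)),
                s2 = Rle(c(1, 1, 0, 0, 0)))

# Wrap the DataFrame so it can be handled as a matrix-like object.
da <- DelayedArray(df)

# Matrix-like summaries work directly on the wrapper
# (column totals are 10 and 2 for this toy data).
totals <- colSums(da)

# Element-wise operations are delayed until the data is actually needed.
scaled <- log1p(da)

# Coerce back to a DataFrame once the operations are set up.
back <- as(scaled, "DataFrame")
```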

Thanks a lot Hervé! It took me some time to understand the obvious, but the DelayedArray wrappers are exactly what I needed.

Would you recommend that I wrap the data in DelayedArrays just before performing matrix-like operations, or that I use the DelayedArray class as the base class for the assays in the SummarizedExperiment objects that I produce?

(The background of my questions is that I am refactoring the CAGEr package to use MultiAssayExperiments, SummarizedExperiments and DataFrames of Rles extensively).

written 9 months ago by Charles Plessy