Summary statistics of RleListViews object
@sean-davis-490
Last seen 6 weeks ago
United States

I have an RleViewsList representing coverage over the CDS regions of the genome, and I want to calculate summary statistics on all the values in the contained views. In particular, I would like to create a per-base histogram of coverage over all the views. Any suggestions on how to do this efficiently?

rle Views

Hi Sean,

Maybe this recent discussion would help?

It doesn't take care of the plotting but it shows you how to efficiently compute a simple function like extracting the mean coverage over the views.

Otherwise, do you have code that does "create a per-base histogram of coverage over all the Views" the inefficient way? That would be a good starting point for us to understand exactly what you want and to suggest a more efficient approach.

Thanks,

H.

@sean-davis-490

To answer my own question, this works.

> class(covView)
[1] "SimpleRleViewsList"
attr(,"package")
[1] "IRanges"
> z = unlist(RleList(unlist(lapply(covView,function(v) do.call(c,viewApply(v,function(x) x))))))
> class(z)
[1] "Rle"
attr(,"package")
[1] "S4Vectors"



There is a fair amount of coercion going on in there to make this work, but it is fast enough and does what I wanted. Using bplapply (from BiocParallel) instead of lapply also speeds things up, roughly in proportion to the number of workers.
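
For the histogram step itself, here is a minimal sketch of one way to go from a single Rle like z to per-base counts without expanding it to a full vector. The toy z below is a hypothetical stand-in for the real coverage (the values are made up), and the name counts is only for illustration. The idea is to sum the run lengths for each distinct run value.

```r
library(S4Vectors)

# Hypothetical stand-in for the Rle 'z' built above:
# coverage values (runs) and how many bases each run spans.
z <- Rle(c(0L, 2L, 5L, 2L, 0L), c(3L, 4L, 2L, 1L, 2L))

# Per-base histogram: for each distinct coverage value, add up the
# lengths of all runs carrying that value. This works on the runs
# only, so the Rle is never expanded to one element per base.
counts <- tapply(runLength(z), runValue(z), sum)

# The counts cover every base exactly once.
stopifnot(sum(counts) == length(z))
```

The result is a named array (names are the coverage values, entries are the number of bases at that coverage), so it can be passed to something like barplot(counts) for plotting.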