Summary statistics of an RleViewsList object
@sean-davis-490

I have an RleViewsList representing coverage on the CDS regions of the genome, and I want to calculate summary statistics on all the values in the contained views. In particular, I would like to create a per-base histogram of coverage over all the views. Any suggestions on how to do this efficiently?

rle Views • 1.3k views

Hi Sean,

Maybe this recent discussion would help?

  making a RleViewsList object gets slow with many chromosomes

It doesn't take care of the plotting but it shows you how to efficiently compute a simple function like extracting the mean coverage over the views.

Otherwise, do you have code that "create[s] a per-base histogram of coverage over all the Views" the inefficient way? That would be a good starting point for us to understand exactly what you want to do and to suggest a more efficient way to do it.

Thanks,

H.

@sean-davis-490

To answer my own question, this works.

> class(covView)
[1] "SimpleRleViewsList"
attr(,"package")
[1] "IRanges"
> z = unlist(RleList(unlist(lapply(covView,function(v) do.call(c,viewApply(v,function(x) x))))))
> class(z)
[1] "Rle"
attr(,"package")
[1] "S4Vectors"

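There is a fair amount of coercion going on in there to make this work, but it is fast enough and does what I wanted: z is a single Rle holding the per-base coverage of every view, concatenated across all list elements. Using bplapply (from BiocParallel) instead of lapply speeds things up proportionally as well.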
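
For anyone landing here later, below is a minimal sketch of how the flattened Rle could be turned into the per-base histogram counts. It is not the exact code from above: the toy objects `cov` and `cds` are made up for illustration, and the views are extracted with `subject(v)[ranges(v)]` rather than `viewApply()`, which assumes every view lies within the bounds of its subject. The counts are obtained by summing run lengths per distinct coverage value, so the Rle never has to be expanded to a full vector.

```r
library(IRanges)

# Toy data (hypothetical): coverage on two "chromosomes" plus CDS ranges.
cov <- RleList(
  chr1 = Rle(c(0L, 2L, 5L, 2L), c(3L, 4L, 2L, 1L)),
  chr2 = Rle(c(1L, 3L), c(5L, 5L))
)
cds <- IRangesList(
  chr1 = IRanges(start = c(2, 7), end = c(5, 9)),
  chr2 = IRanges(start = 4, end = 7)
)
covView <- RleViewsList(rleList = cov, rangesList = cds)

# One Rle per list element: the coverage values under all of its views.
# Assumes no view extends past the ends of its subject.
# lapply() could be swapped for BiocParallel::bplapply() on large inputs.
per_chrom <- lapply(covView, function(v) subject(v)[ranges(v)])

# Concatenate everything into a single Rle of per-base coverage.
z <- unlist(RleList(per_chrom), use.names = FALSE)

# Histogram counts: number of bases observed at each coverage depth,
# computed on the runs rather than on the expanded vector.
depth_counts <- tapply(runLength(z), runValue(z), sum)
print(depth_counts)
```

The names of `depth_counts` are the coverage depths and its values are base counts, so something like `barplot(depth_counts)` would give the plot; summary statistics such as `mean(z)` or `quantile(as.vector(z))` can be computed from `z` directly.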
