Most efficient way to compute width of overlap of multiple features
1
0
Entering edit mode
Charles Berry ▴ 290
@charles-berry-5754
Last seen 5.1 years ago
United States
Michael Lawrence <lawrence.michael at="" ...=""> writes: > Thanks for your reply and for GenomicRanges. Please see inline comments below. > > On Tue, Jan 7, 2014 at 9:30 AM, Charles Berry <ccberry at="" ...=""> wrote: > > > Vince S. Buffalo <vsbuffalo <at=""> ...> writes: > > > > > > > > Hi Bioconductor folks, > > > > > > cover2.gr <- subset( as( coverage(gr2)>0, "GRanges"), score) > > > > seqinfocover2.gr) <- seqinfo(gr2) > > > > I think you just want cover2.gr <- reduce(gr2) instead of the above two > lines. Right. I should have recalled that. > > > > hits <- findOverlaps(gr,cover2.gr) > > > gr.over <- > > + pintersect(gr[queryHits(hits)],cover2.gr[subjectHits(hits)]) > > > > w <- width(ranges(hits, ranges(gr), rangescover2.gr))) *should* work as > an alternative to the pintersect call. > It does, but I cannot imagine I would ever have figured this idiom out. Maybe add an example to one of the help pages?? > > > gr.counts <- tapply(gr.over,queryHits(hits),FUN=function(x) > > sum(width(x))) > > > > This tapply is going to kill performance. Use something like: > > gr$overlap <- sum(splitAsList(w, factor(queryHits(hits), > seq_len(queryLength(hits))))) > Awesome! AFAICS, splitAsList has no help page. (It is used but not described in the GenomicRanges HOWTOs vignette.) I find GenomicRanges is an essential toolkit these days and would like to be able to figure out idioms like these. Anything that can be done to make idioms like these easier to find and use is much appreciated. > Or even: > > gr$overlap <- sum(relist(w, as(hits, "List"))) > Ditto the last comment, I think. Chuck
GenomicRanges GenomicRanges • 825 views
ADD COMMENT
0
Entering edit mode
@michael-lawrence-3846
Last seen 2.4 years ago
United States
I'd love to come up with a nice way of communicating these aspects. In particular, the use of lists and partitioning, which is always overlooked. Just not sure of the best way. A cookbook? A series of workflow vignettes? A book-length manuscript that explores all the nooks and crannies? Just better man pages? There's just not enough time... Michael On Tue, Jan 7, 2014 at 12:30 PM, Charles Berry <ccberry@ucsd.edu> wrote: > Michael Lawrence <lawrence.michael@...> writes: > > > > > Thanks for your reply and for GenomicRanges. > > Please see inline comments below. > > > > > On Tue, Jan 7, 2014 at 9:30 AM, Charles Berry <ccberry@...> wrote: > > > > > Vince S. Buffalo <vsbuffalo <at=""> ...> writes: > > > > > > > > > > > Hi Bioconductor folks, > > > > > > > > > cover2.gr <- subset( as( coverage(gr2)>0, "GRanges"), score) > > > > > > seqinfocover2.gr) <- seqinfo(gr2) > > > > > > > I think you just want cover2.gr <- reduce(gr2) instead of the above two > > lines. > > Right. I should have recalled that. > > > > > > hits <- findOverlaps(gr,cover2.gr) > > > > gr.over <- > > > + pintersect(gr[queryHits(hits)],cover2.gr[subjectHits(hits)]) > > > > > > > w <- width(ranges(hits, ranges(gr), rangescover2.gr))) *should* work > as > > an alternative to the pintersect call. > > > > It does, but I cannot imagine I would ever have figured this idiom out. > > Maybe add an example to one of the help pages?? > > > > > > gr.counts <- tapply(gr.over,queryHits(hits),FUN=function(x) > > > sum(width(x))) > > > > > > > This tapply is going to kill performance. Use something like: > > > > gr$overlap <- sum(splitAsList(w, factor(queryHits(hits), > > seq_len(queryLength(hits))))) > > > > Awesome! > > AFAICS, splitAsList has no help page. (It is used but not described > in the GenomicRanges HOWTOs vignette.) > > I find GenomicRanges is an essential toolkit these days and would like > to be able to figure out idioms like these. > > Anything that can be done to make idioms like these easier to find > and use is much appreciated. > > > > Or even: > > > > gr$overlap <- sum(relist(w, as(hits, "List"))) > > > > Ditto the last comment, I think. > > Chuck > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > [[alternative HTML version deleted]]
ADD COMMENT

Login before adding your answer.

Traffic: 568 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6