Question: IRanges:::coverage() speedup/enhancement
9.9 years ago, Charles Berry (United States) wrote:
The semantics of the IRanges package, and especially the RangedData class, are very appropriate for some of the applications I deal with.

Unfortunately, coverage() is too slow to be useful to me.

I wonder if the Biocore Team would consider retooling it to make it faster? Below I provide a link to a revised coverage.c that might suffice.

The kind of case I need to handle has width values in the 10 kbase to 10 Mbase range. As a toy example, I need to be able to run something like

    tmp <- coverage( IRanges( start=seq(1,by=1000,length=10000), width=1e7 ) )

quickly.

A revised version of coverage.c is available at

http://cabig2.ucsd.edu:8080/Plone/Members/ccberry/software/coverage.c/view

It handles the case above almost instantly (the existing version needs about 8 minutes on my machine) and seems about equal to the existing version for cases with width=30. In the cases I've looked at, gc() reports the same memory usage.

---

Also, I wonder if the Biocore Team would entertain allowing the 'weight' argument of coverage to be of type double? This would help in cases in which downweighting of counts of some genomic features is desired.

Thanks,

Chuck

--
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
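The revised coverage.c is not reproduced in this thread, so its approach is not known here. As a rough illustration of why very wide ranges need not be slow, the base-R sketch below uses one well-known technique: record +weight at each range start and -weight just past each range end, then take a cumulative sum. The cost then depends on the number of ranges plus the covered length, not on the sum of the widths. The function name `coverage_sketch` and its arguments are hypothetical and are not part of IRanges.

```r
# Hypothetical helper (NOT the IRanges implementation, and not necessarily
# what the revised coverage.c does): coverage via a difference array.
coverage_sketch <- function(start, width, weight = 1) {
  end <- start + width - 1            # inclusive end of each range
  n <- max(end)                       # covered positions are 1..n
  weight <- rep_len(weight, length(start))
  delta <- numeric(n + 1)             # one extra slot for "end + 1"
  for (i in seq_along(start)) {
    # Coverage rises by weight[i] at the start of range i ...
    delta[start[i]] <- delta[start[i]] + weight[i]
    # ... and falls by weight[i] just past its end.
    delta[end[i] + 1] <- delta[end[i] + 1] - weight[i]
  }
  # The running total of the changes is the coverage at each position.
  cumsum(delta)[seq_len(n)]
}

# Three small ranges: 1-4, 3-4 and 3-7.
cov_unweighted <- coverage_sketch(start = c(1, 3, 3), width = c(4, 2, 5))

# The same ranges with double-precision weights.
cov_weighted <- coverage_sketch(start = c(1, 3, 3), width = c(4, 2, 5),
                                weight = c(0.5, 1, 0.25))
```

Because the accumulator here is a double vector, fractional weights work without any special handling; a real implementation would more likely return a run-length encoded result than a dense vector.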
coverage iranges • 413 views
Answer: IRanges:::coverage() speedup/enhancement
9.9 years ago, Michael Lawrence wrote:
On Mon, Nov 30, 2009 at 11:10 AM, Charles C. Berry <cberry@tajo.ucsd.edu> wrote:

> The semantics of the IRanges package and especially the RangedData class
> are very appropriate for some of the applications I deal with.
>
> Unfortunately, coverage() is too slow to be useful to me.
>
> I wonder if the Biocore Team would consider retooling it to make it
> faster? Below I provide a link to a revised coverage.c that might suffice.
>
> The kind of case I need to handle has width values in 10kbase to 10Mbase
> range. As a toy example, being able to run stuff like
>
>     tmp <- coverage( IRanges( start=seq(1,by=1000,length=10000),
>                               width=1e7 ) )
>
> quickly is needed.
>
> A revised version of coverage.c is available at
>
> http://cabig2.ucsd.edu:8080/Plone/Members/ccberry/software/coverage.c/view
>
> It will handle the case above almost instantly (while the existing version
> needs about 8 minutes on my machine) and seems about equal to the
> existing version for cases with width=30. In the cases I've looked at
> gc() reports the same memory usage.
>
> ---
>
> Also, I wonder if the Biocore Team would entertain allowing the 'weight'
> argument of coverage to be of type double? This would help in cases in
> which downweighting of counts of some genomic features is desired.

In many use cases, it's probably sufficient to multiply the floating-point weights by a power of 10 and round them to integers. That only goes so far, though, so supporting double precision seems reasonable. The type of the output would simply depend on the type of the weights.

> Thanks,
>
> Chuck
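The rounding workaround mentioned above can be sketched in base R as follows. The helper name `scale_weights` and its `digits` argument are hypothetical, introduced only for illustration. The idea is to pass the returned integer weights to an integer-only coverage computation and divide the result by the returned scale factor afterwards.

```r
# Hypothetical helper: turn double weights into integers by scaling with a
# power of 10, so they can be used where only integer weights are accepted.
scale_weights <- function(weight, digits = 3L) {
  scale <- 10^digits
  scaled <- round(weight * scale)     # anything finer than 10^-digits is lost
  # R integers are 32-bit, so very large scaled values cannot be represented.
  if (any(abs(scaled) > .Machine$integer.max)) {
    stop("scaled weights do not fit in an integer; use fewer digits")
  }
  list(weight = as.integer(scaled), scale = scale)
}

sw <- scale_weights(c(0.5, 0.125, 2.75))
sw$weight   # integer weights to supply in place of the doubles
sw$scale    # divide the resulting integer coverage by this to undo the scaling
```

This also shows where the approach runs out: precision beyond the chosen number of digits is rounded away, and raising `digits` brings the scaled values (and, presumably, their sums over overlapping ranges) closer to the 32-bit integer limit.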