delhomme@embl.de
Hi all,
While calculating some statistics for an RNA-seq experiment, I stumbled
upon the following problem. Applying the IRanges coverage function to
my IRanges, I get back an integer Rle object. However, trying to get
the mean or sum of that Rle object results in an integer overflow. The
following example demonstrates that overflow.
library(IRanges)
rC <- Rle(values=as.integer(c(1,(2^31)-1,1)))
sum(rC)
mean(rC)
Both result in an integer overflow.
[1] NA
Warning message:
In sum(runValue(x) * runLength(x), ..., na.rm = na.rm) :
Integer overflow - use sum(as.numeric(.))
A workaround is to convert to numeric before multiplying, so that the
product itself cannot overflow either:
sum(as.numeric(runValue(rC)) * runLength(rC))
but IMO this should be handled in the Rle code itself; i.e. an integer
Rle can clearly have a sum, a mean, etc. whose value lies outside the
integer range. Is there anything that speaks against having these
functions convert the integer values to numeric internally before
calculating the sum or mean?
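To make the suggestion concrete, here is a minimal sketch of what such an internal conversion could look like. It uses base R only; the helper names rle_sum_numeric and rle_mean_numeric are hypothetical and are not part of IRanges. The run values and run lengths are passed as plain vectors, standing in for runValue(rC) and runLength(rC).

```r
# Hypothetical helpers (not IRanges API): compute the sum and mean of a
# run-length encoding in double precision, so that neither the
# value * length products nor their total can overflow the integer range.
rle_sum_numeric <- function(values, lengths) {
  sum(as.numeric(values) * as.numeric(lengths))
}

rle_mean_numeric <- function(values, lengths) {
  rle_sum_numeric(values, lengths) / sum(as.numeric(lengths))
}

# Same data as in the example above: three runs of length 1.
values  <- as.integer(c(1, 2^31 - 1, 1))
lengths <- c(1L, 1L, 1L)

total <- rle_sum_numeric(values, lengths)
avg   <- rle_mean_numeric(values, lengths)
print(total)
print(avg)
```

With an actual Rle, the same helpers could be called as rle_sum_numeric(runValue(rC), runLength(rC)). Doubles represent integers exactly up to 2^53, so this stays exact for typical coverage totals, at the cost of returning a numeric rather than an integer result.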
Looking forward to hearing your thoughts on this,
Cheers,
Nico
sessionInfo()
R Under development (unstable) (2012-02-07 r58290)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] IRanges_1.13.24 BiocGenerics_0.1.4
loaded via a namespace (and not attached):
[1] tools_2.15.0
---------------------------------------------------------------
Nicolas Delhomme
Genome Biology Computational Support
European Molecular Biology Laboratory
Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany