Hello,
I am working with a count table that has very large library sizes:
> dge@.Data[[2]]$lib.size
[1] 3.2e+08 4.2e+08 4.5e+08 3.8e+08 2.3e+08 2.1e+08 3.3e+08 2.8e+08
This makes the CPM values very small and, consequently, the logCPM values very negative.
Here is the head of my cpm(counts):
C1 C2 C3 C4 T1 T2 T3 T4
00000001 0.000 0.0000 0.0000 0.0026 0.0042 0.000 0.000 0.0035
00000002 0.012 0.0092 0.0086 0.0103 0.0042 0.014 0.006 0.0070
00000003 0.073 0.0554 0.0474 0.0620 0.0584 0.056 0.057 0.0525
00000004 0.073 0.0624 0.0496 0.0620 0.0626 0.056 0.060 0.0525
00000005 0.076 0.0624 0.0496 0.0594 0.0584 0.056 0.060 0.0490
00000006 0.067 0.0624 0.0474 0.0620 0.0584 0.046 0.066 0.0630
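As a sanity check (my own arithmetic, not edgeR output), the smallest possible non-zero CPM in each sample is one read divided by the library size in millions, which matches the magnitudes above:

lib.size <- c(3.2e8, 4.2e8, 4.5e8, 3.8e8, 2.3e8, 2.1e8, 3.3e8, 2.8e8)
signif(1/(1e-06 * lib.size), 2)   # CPM contributed by a single read
## [1] 0.0031 0.0024 0.0022 0.0026 0.0043 0.0048 0.0030 0.0036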
What concerns me here is that the limited number of decimal places, and
the rounding of these numbers, may lose sensitivity. Is this something
that can affect the outcome of the analysis? If it does, should I simply
scale the counts up before putting the data through my workflow? (I
sketch a quick check of this below.)
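To illustrate the scaling question, here is a toy sketch (made-up Poisson counts, not my real data), using edgeR's cpm() with the default log=FALSE. Unlogged CPM does not change when all counts are multiplied by a constant, because the library sizes scale by the same factor and it cancels:

library(edgeR)
set.seed(1)
counts <- matrix(rpois(40, lambda = 5), nrow = 10)  # toy counts, 10 genes x 4 samples
all.equal(cpm(counts), cpm(10 * counts))            # TRUE: the common factor cancels

Log-CPM would shift slightly, though, because the prior count stays fixed while the counts grow.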
##### body of 'cpm' function/method #######
{
    x <- as.matrix(x)
    if (is.null(lib.size))
        lib.size <- colSums(x)      # default library sizes: column sums
    if (log) {
        ## scale the prior count per sample so its relative effect is the
        ## same regardless of library size, then inflate lib.size to match
        prior.count.scaled <- lib.size/mean(lib.size) * prior.count
        lib.size <- lib.size + 2 * prior.count.scaled
    }
    lib.size <- 1e-06 * lib.size    # convert library sizes to millions
    if (log)
        log2(t((t(x) + prior.count.scaled)/lib.size))
    else
        t(t(x)/lib.size)
}
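Reading the body above, I also worked out (my own arithmetic, using the library sizes quoted at the top and assuming the default prior.count of 0.25 in this version) what a zero count maps to. Because the prior count is scaled per sample, the floor is the same finite value in every sample, about -10.4 logCPM rather than -Inf:

lib.size <- c(3.2e8, 4.2e8, 4.5e8, 3.8e8, 2.3e8, 2.1e8, 3.3e8, 2.8e8)
prior.count <- 0.25
prior.count.scaled <- lib.size/mean(lib.size) * prior.count
## logCPM that a zero count receives in each sample:
round(log2(prior.count.scaled/(1e-06 * (lib.size + 2 * prior.count.scaled))), 1)
## [1] -10.4 -10.4 -10.4 -10.4 -10.4 -10.4 -10.4 -10.4

So the "very negative" values I mentioned bottom out around -10.4 for libraries of this size.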
Kind regards,
Vang Quy Le
Bioinformatician, Molecular Biologist, PhD
+45 97 66 56 29
vql@rn.dk
AALBORG UNIVERSITY HOSPITAL
Section for Molecular Diagnostics,
Clinical Biochemistry
Reberbansgade
DK 9000 Aalborg
www.aalborguh.rn.dk