I'm using edgeR to do differential expression analysis. In the output, there is a column named logFC, and I'm wondering how it is calculated.
Let's take one of my real cases as an example. I have the expression of one gene from 6 treated samples and 6 control samples. The raw counts are as follows:
treated samples: 411 359 497 349 1091 861
control samples: 18 5 17 13 26 27
The normalized counts (normalized using the "TMM" method integrated in edgeR) were generated as follows:
exp.normalized.counts <- calcNormFactors(exp.raw.counts, method="TMM")
cpm.normalized.counts <- cpm(exp.normalized.counts)
And what I got from cpm.normalized.counts for this gene is:
treated samples: 29.86926837 26.2474782 36.72150731 19.91655285 66.49821842 56.14122252
control samples: 1.193338995 0.423771513 1.353243931 1.081332845 2.074234472 2.15046386
Suppose the logFC is calculated by dividing the mean of the treated samples by the mean of the control samples and then taking log2. Calculating manually with the numbers above, the logFC from the raw counts is 5.072979445, and the logFC from the normalized counts is 4.82993439.
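For anyone who wants to reproduce the manual calculation, here is a minimal sketch in base R. It only implements the naive "log2 of the ratio of group means" assumption described above and is not how edgeR computes its logFC. The helper name `naive_logfc` and the vector names are made up for illustration; the numbers are copied from the values listed above.

```r
# Raw counts and TMM-normalized CPM values copied from the post above.
raw_treated <- c(411, 359, 497, 349, 1091, 861)
raw_control <- c(18, 5, 17, 13, 26, 27)
cpm_treated <- c(29.86926837, 26.2474782, 36.72150731,
                 19.91655285, 66.49821842, 56.14122252)
cpm_control <- c(1.193338995, 0.423771513, 1.353243931,
                 1.081332845, 2.074234472, 2.15046386)

# Hypothetical helper: naive log fold change, i.e. log2 of the
# ratio of the two group means. This is NOT edgeR's method.
naive_logfc <- function(treated, control) {
  log2(mean(treated) / mean(control))
}

logfc_raw <- naive_logfc(raw_treated, raw_control)
logfc_cpm <- naive_logfc(cpm_treated, cpm_control)

# Print both values side by side for comparison with edgeR's output.
print(c(raw = logfc_raw, cpm = logfc_cpm))
```

The two printed values should correspond to the manually calculated figures quoted above (about 5.07 from the raw counts and about 4.83 from the normalized counts).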
But the logFC in the output from edgeR is: 4.8144125776515
It matches neither of my manually calculated results (though it is close to what I got from the normalized counts). So I'm wondering how exactly edgeR calculates the logFC...