How does edgeR calculate fold change?
zhxiaokang ▴ 10
@zhxiaokang-13311
Norway

I'm using edgeR for differential expression analysis. The output contains a logFC column, and I'm wondering how it is calculated.

Let me use one of my real cases as an example. I have the expression of one gene in 6 treated samples and 6 control samples. The raw counts are as follows:

treated samples: 411 359 497 349 1091 861

control samples: 18 5 17 13 26 27


The normalized counts (normalized using the "TMM" method built into edgeR) were generated as follows:

library(edgeR)
# exp.raw.counts is assumed to be a DGEList; cpm() then uses lib.size * norm.factors
exp.normalized.counts <- calcNormFactors(exp.raw.counts, method="TMM")
cpm.normalized.counts <- cpm(exp.normalized.counts)


And what I got from cpm.normalized.counts for this gene is:

treated samples: 29.86926837 26.2474782 36.72150731 19.91655285 66.49821842 56.14122252

control samples: 1.193338995 0.423771513 1.353243931 1.081332845 2.074234472 2.15046386
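As a side check, the raw counts and CPM values quoted above can be used to back out the effective library size of each sample, assuming `cpm()` computed count / (lib.size * norm.factors) * 1e6 with no prior count. This is only a sketch in base R using the numbers from this post; the variable names are made up for illustration.

```r
# Raw counts and TMM-normalized CPMs for the one gene, copied from the post
raw <- c(411, 359, 497, 349, 1091, 861,      # treated
         18, 5, 17, 13, 26, 27)              # control
cpm_vals <- c(29.86926837, 26.2474782, 36.72150731,
              19.91655285, 66.49821842, 56.14122252,
              1.193338995, 0.423771513, 1.353243931,
              1.081332845, 2.074234472, 2.15046386)
group <- rep(c("treated", "control"), each = 6)

# Implied effective library size (lib.size * norm.factors) per sample,
# assuming CPM = count / effective library size * 1e6
eff_lib <- raw / cpm_vals * 1e6
print(data.frame(group, raw, cpm = cpm_vals, eff_lib = round(eff_lib)))
```

The implied library sizes differ from sample to sample, which is one reason a ratio of raw-count means and a ratio of CPM means do not agree.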


Suppose the logFC is calculated by dividing the mean of the treated samples by the mean of the control samples and then taking log2. Calculated manually from the numbers above, the logFC from the raw counts is 5.072979445, and the logFC from the normalized counts is 4.82993439.

But the logFC in the output from edgeR is: 4.8144125776515

It matches neither of my manually calculated results (although it is only slightly different from the one based on the normalized counts). So I'm wondering how exactly edgeR calculates the logFC...
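For reference, the manual calculation described above (log2 of the ratio of group means) can be reproduced in base R. This is just the naive calculation from the question, not what edgeR does internally; the variable names are illustrative.

```r
# Raw counts for the gene, copied from the post
treated_raw <- c(411, 359, 497, 349, 1091, 861)
control_raw <- c(18, 5, 17, 13, 26, 27)

# TMM-normalized CPMs for the same gene, copied from the post
treated_cpm <- c(29.86926837, 26.2474782, 36.72150731,
                 19.91655285, 66.49821842, 56.14122252)
control_cpm <- c(1.193338995, 0.423771513, 1.353243931,
                 1.081332845, 2.074234472, 2.15046386)

# Naive logFC: log2 of (mean of treated / mean of control)
logfc_raw <- log2(mean(treated_raw) / mean(control_raw))
logfc_cpm <- log2(mean(treated_cpm) / mean(control_cpm))

print(c(raw = logfc_raw, cpm = logfc_cpm))
```

Both values land close to, but not exactly on, the 4.8144 that edgeR reports.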

DifferentialExpression DEA FC FoldChange edgeR
rproendo ▴ 20
@rproendo-17985
United States

edgeR (and similar tools, e.g., DESeq2) shrinks logFC values towards zero. Your manual calculation gets close, but edgeR applies additional steps. Gordon gave a very informative answer to a similar question, which you can check out here
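To get a feel for what shrinking towards zero means, here is a toy illustration in base R: adding the same small positive "prior" value to both group means before taking the ratio pulls the log2 ratio towards zero. This is not edgeR's actual computation (edgeR fits a negative binomial model and, as far as I know, controls the amount of shrinkage through a `prior.count` argument, so check the edgeR documentation for the details). The prior values below are arbitrary and chosen only for illustration.

```r
# Raw counts for the gene, copied from the original question
treated <- c(411, 359, 497, 349, 1091, 861)
control <- c(18, 5, 17, 13, 26, 27)

# Hypothetical prior values; 0 means no shrinkage
prior <- c(0, 0.125, 1, 5)

# Toy shrunken logFC: the same constant is added to numerator and denominator
shrunk <- log2((mean(treated) + prior) / (mean(control) + prior))

print(data.frame(prior, logFC = shrunk))
```

The larger the prior, the closer the logFC moves to zero, and the effect is strongest for genes with low counts.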

swbarnes2 ▴ 850
@swbarnes2-14086
San Diego

If it were that trivial to calculate fold change, no one would use fancy software. edgeR is not just dividing one mean by the other. The math is far more complex.


That makes sense. I should have thought of this LOL


Though you can see that in a very simple case, with no other experimental factors being corrected for, and counts that are not near noise level, the manually calculated value is pretty darn close to what fancy software tells you.


Yep, pretty close to what I got with the normalized counts