Question: fold change calculation in edgeR
3.3 years ago by
tonja.r40
United Kingdom
tonja.r40 wrote:

After normalizing the data with the TMM method in edgeR, one gets many values <1. Unfortunately, I could not find a description of how the log fold change is calculated in either the edgeR manual or the edgeR paper, and in particular of how those values <1 are treated. For instance, the log2 fold change of 0.7/0.03 is 4.544321, which is quite high, but it results only from dividing by a small number. How are low normalized counts treated when the log2 fold change is calculated?

normalization edger logfc • 2.7k views
modified 3.3 years ago by Michael Love23k • written 3.3 years ago by tonja.r40

Give a minimal working example with code, because it's not clear what you're describing.

Answer: fold change calculation in edgeR
3.3 years ago by
Denali
Steve Lianoglou12k wrote:

edgeR uses the value from the prior.count parameter (in glmFit, for example) to mitigate the impact of low counts in the denominator that you are anticipating.

You'll find some detail in the ?glmFit documentation. Searching through the archives on this forum should also bring up some useful information.

Even if I add the default value of prior.count (0.125), it does not mitigate the impact, as (0.7+0.125)/(0.03+0.125) = 5.322581. Is there a description somewhere of exactly how the log2 fold change is calculated in edgeR?

Add a larger number to prior.count and check again. Try prior.count=5, for example. Passing the default value would not have changed your result, because that is the value that was already used.

> log2(0.7/0.03)
[1] 4.544321
> log2((0.7+0.125)/(0.03+0.125))
[1] 2.412126

So clearly, it does mitigate the impact of the small denominator. See ?predFC for details.
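
To make the arithmetic above easier to play with, here is a minimal sketch in base R. The helper name `shrunk_log2fc` and the grid of prior values are made up for illustration. It simply adds the same pseudo-count to both values before taking the log2 ratio, which is a simplification: as far as I understand from ?predFC, edgeR adds the prior count to the observed counts in proportion to library size before fitting, not to normalized values like these.

```r
# Naive illustration only: add the same pseudo-count to numerator and
# denominator before taking the log2 ratio. This is NOT edgeR's internal code.
shrunk_log2fc <- function(a, b, prior = 0.125) {
  log2((a + prior) / (b + prior))
}

# Hypothetical grid of prior counts, from no shrinkage to strong shrinkage
priors <- c(0, 0.125, 1, 5)
lfc <- sapply(priors, function(p) shrunk_log2fc(0.7, 0.03, prior = p))

# One row per prior count; larger priors pull the log2 ratio towards zero
print(data.frame(prior = priors, log2FC = lfc))
```

The first two rows reproduce the 4.544321 and 2.412126 shown above, and the later rows show the log2 ratio moving further towards zero as the prior grows.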

Answer: fold change calculation in edgeR
3.3 years ago by
Michael Love23k
United States
Michael Love23k wrote:

Details on statistical methods are often described in the help pages for the relevant functions. The typical edgeR steps are to calculate normalization factors, estimate dispersions, and then perform F-tests or likelihood ratio tests on coefficients of a GLM, as shown in the quick start section of the user guide. If you look up, for example, ?glmFit you get:

Description
Fit a negative binomial generalized log-linear model to the read counts for each gene. Conduct
genewise statistical tests for a given coefficient or coefficient contrast.
Usage
## S3 method for class 'DGEList'
glmFit(y, design=NULL, dispersion=NULL, prior.count=0.125, start=NULL, ...)
...
prior.count average prior count to be added to observation to shrink the estimated log-foldchanges
towards zero.
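
As a rough sketch of those steps, and of the effect of prior.count, something like the following should work if edgeR is installed. The simulated count matrix, the two-group design, and the object names are all assumptions made up for this example, not taken from the question.

```r
library(edgeR)

# Simulated data (an assumption for illustration): 1000 genes, 2 groups x 3 samples
set.seed(1)
counts <- matrix(rnbinom(1000 * 6, mu = 10, size = 5), nrow = 1000)
rownames(counts) <- paste0("gene", seq_len(nrow(counts)))
group <- factor(rep(c("A", "B"), each = 3))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)            # TMM normalization factors (the default method)
design <- model.matrix(~ group)
y <- estimateDisp(y, design)       # estimate negative binomial dispersions

# Fit the GLM twice: default prior.count (0.125) and a larger value
fit_default <- glmFit(y, design)
fit_shrunk  <- glmFit(y, design, prior.count = 5)

# Likelihood ratio test on the group coefficient; table$logFC is on the log2 scale
lrt_default <- glmLRT(fit_default, coef = 2)
lrt_shrunk  <- glmLRT(fit_shrunk, coef = 2)

# Compare the typical size of the log2 fold changes under each prior
print(summary(abs(lrt_default$table$logFC)))
print(summary(abs(lrt_shrunk$table$logFC)))
```

With low counts like these, the log fold changes from the larger prior.count should on the whole sit closer to zero, which is the shrinkage the help page describes.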