After normalizing the data with the TMM method in edgeR, one gets many values <1. Unfortunately, I could not find a description of how the log fold change is calculated in either the edgeR manual or the edgeR paper. In particular, how are values <1 treated? For instance, the log2 fold change of 0.7/0.03 is 4.544321, which is quite high, but it arises only from dividing by a small number. How are low normalized counts treated when the log2 fold change is calculated?
edgeR uses the value from the prior.count parameter (in glmFit, for example) to mitigate the impact of low counts in the denominator that you are anticipating.
You'll find some detail in the ?glmFit documentation. Searching through the archives on this forum should also bring up some useful information.
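To see why a prior count tames ratios like the one in the question, here is a minimal base-R sketch. It is a simplified illustration, not edgeR's actual computation: it just adds a constant to numerator and denominator, whereas edgeR applies the prior count on the count scale and adjusts it for library size. The helper name `shrunk_log2fc` is hypothetical, and the values 0.7 and 0.03 are taken from the question.

```r
# Simplified illustration only: edgeR works on counts and scales the prior
# count by library size; here a constant is added to both values directly.
# shrunk_log2fc is a hypothetical helper, not an edgeR function.
shrunk_log2fc <- function(a, b, prior_count = 0) {
  log2((a + prior_count) / (b + prior_count))
}

raw    <- shrunk_log2fc(0.7, 0.03)          # no prior count: about 4.54
shrunk <- shrunk_log2fc(0.7, 0.03, 0.125)   # with prior count: about 2.41

print(c(raw = raw, shrunk = shrunk))
```

The larger the prior count, the more strongly fold changes driven by very small values are pulled towards zero, while ratios of large values are barely affected.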
Details on statistical methods are often described in the help pages for the relevant functions. Typical edgeR steps are to calculate normalization factors, estimate dispersions, then perform F-tests or likelihood ratio tests on the coefficients of a GLM, as shown in the quick start section of the user guide. If you look up, for example, ?glmFit you get:
Description

Fit a negative binomial generalized log-linear model to the read counts for each gene. Conduct genewise statistical tests for a given coefficient or coefficient contrast.

Usage

## S3 method for class 'DGEList'
glmFit(y, design=NULL, dispersion=NULL, prior.count=0.125, start=NULL, ...)

...

prior.count    average prior count to be added to observation to shrink the estimated log-foldchanges towards zero.
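If you want to see the effect of prior.count on your own data, something along these lines may help. It is only a sketch under several assumptions: the counts are simulated, the dispersion is fixed at an arbitrary 0.1 instead of being estimated, and `compare_prior_counts` is a hypothetical helper name. It fits the same model with two prior.count values and returns the log2 fold changes reported by glmLRT for each.

```r
# Sketch only: simulated counts, an assumed fixed dispersion, and a
# hypothetical helper (compare_prior_counts) that is not part of edgeR.
compare_prior_counts <- function(counts, group,
                                 prior_counts = c(0.125, 5),
                                 dispersion = 0.1) {
  y <- edgeR::DGEList(counts = counts, group = group)
  y <- edgeR::calcNormFactors(y)              # TMM normalization factors
  design <- model.matrix(~ group)
  # One column of log2 fold changes per prior.count value
  sapply(prior_counts, function(pc) {
    fit <- edgeR::glmFit(y, design, dispersion = dispersion,
                         prior.count = pc)
    edgeR::glmLRT(fit, coef = 2)$table$logFC  # second group vs first
  })
}

if (requireNamespace("edgeR", quietly = TRUE)) {
  set.seed(1)
  counts <- matrix(rnbinom(200 * 6, mu = 50, size = 10), nrow = 200)
  counts[1, ] <- c(0, 0, 0, 3, 2, 4)          # a gene with very low counts
  group <- factor(rep(c("A", "B"), each = 3))
  logfc <- compare_prior_counts(counts, group)
  colnames(logfc) <- c("prior.count=0.125", "prior.count=5")
  print(head(logfc, 3))
}
```

Comparing the two columns for the first gene shows how strongly the reported log fold change of a low-count gene depends on the prior count, while well-expressed genes change very little.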