edgeR: the F-statistics changed after removing lowly expressed genes
Pei • 0
@e9de1a10
United States

Hello everyone:

I used the glmQLFTest function in edgeR to detect differentially expressed genes. I have one dataset and two gene lists: (1) all genes, and (2) the genes remaining after removing lowly expressed genes with filterByExpr.

In the result table, I found that logFC and logCPM for each gene remained unchanged between (1) and (2), as expected. However, the F statistic changed for some genes, and it tended to decrease.

As a result, some genes identified as significant (FDR < 0.05) in (1) become non-significant in (2).

Does this make sense? Thanks in advance!


glmQLFTest edgeR • 169 views
@gordon-smyth
WEHI, Melbourne, Australia

The glmQLFTest() function is designed to be used after expression filtering. You should apply expression filtering regardless of whether it results in more DE genes or not.
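For readers who want to see the order of operations, here is a minimal sketch of a filter-then-test workflow. The counts are simulated purely for illustration (they are not the poster's data), and the two-group design and `coef = 2` are assumptions; adapt them to your own experiment.

```r
library(edgeR)

# Simulated counts for illustration only: 2000 genes x 6 samples,
# with the first half of the genes given very low expected counts.
set.seed(1)
ngenes <- 2000
mu <- rep(c(0.2, 50), each = ngenes / 2)
counts <- matrix(rnbinom(ngenes * 6, mu = mu, size = 10), nrow = ngenes)
rownames(counts) <- paste0("gene", seq_len(ngenes))

# Assumed design: two groups of three samples each.
group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)

y <- DGEList(counts = counts, group = group)

# Filter first, then recompute library sizes and normalize.
keep <- filterByExpr(y, design)
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)

# Quasi-likelihood pipeline on the filtered data.
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)
res <- glmQLFTest(fit, coef = 2)
topTags(res)
```

The dispersion estimation and the quasi-likelihood fit both run after filtering here, so the low-count genes do not enter either step.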

Filtering often increases the number of DE genes. If that is not what you are seeing, then the residual deviance may be getting underestimated by an overabundance of very low counts. The quasi-likelihood method is not designed to be used when the average count is much less than 1 for some genes, which can happen if filtering is not done.
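As a rough check of whether an unfiltered dataset contains many genes with an average count well below 1, something like the following can be used. This is only a sketch in base R on simulated counts (not the poster's data); the threshold of 1 follows the description above and is not a tuning recommendation.

```r
# Simulated counts for illustration only: half of the genes have
# very low expected counts, the other half are well expressed.
set.seed(1)
ngenes <- 2000
mu <- rep(c(0.2, 50), each = ngenes / 2)
counts <- matrix(rnbinom(ngenes * 6, mu = mu, size = 10), nrow = ngenes)

# Average raw count per gene across all samples.
avg_count <- rowMeans(counts)

# Number and proportion of genes whose average count is below 1.
n_low <- sum(avg_count < 1)
cat("Genes with average count < 1:", n_low,
    "of", ngenes, sprintf("(%.1f%%)\n", 100 * n_low / ngenes))
```

If a large share of genes falls below this level, that is a sign the data have not been filtered in the way the quasi-likelihood method expects.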


Thank you Prof. Smyth! In my case, I found that the number of significant genes tends to decrease after filtering when using glmQLFTest, but tends to increase after filtering when using the LRT.

I am not sure what's going on, but the LRT results looked more reasonable.


There is nothing to be gained by doing analyses without filtering. We don't recommend it.