Question

Filtering lowly expressed genes: before or after normalization?

burcu.atasu • 0

@burcuatasu-14358

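Hi, I need to figure out which approach is more appropriate for filtering lowly expressed genes. The tximport manual recommends the following commands for edgeR analysis:
library(edgeR)

cts <- txi$counts
normMat <- txi$length
# Scale the length matrix by each gene's geometric mean so the magnitude of the counts is unchanged
normMat <- normMat/exp(rowMeans(log(normMat)))
# Log effective library sizes from length-scaled counts (normalization factor times column sum)
o <- log(calcNormFactors(cts/normMat)) + log(colSums(cts/normMat))
y <- DGEList(cts)
# Per-observation offsets: log length factors plus log effective library sizes
y$offset <- t(t(log(normMat)) + o)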

and to continue with y as a DGEList object. In my analysis I filtered out the lowly expressed genes based on CPM (for instance, keeping genes with CPM greater than 1 in at least as many samples as the smallest group), subsetting with "keep.lib.sizes=FALSE", after doing the normalization above.
I am now unsure whether this approach is appropriate, or whether I should do the normalization after filtering.

Thanks for your help.
Best,

edgeR TXIMPORT • 1.0k views

updated 6.8 years ago by Michael Love 43k • written 6.8 years ago by burcu.atasu • 0

score 0 · Answer 1 · 2018-03-20

Michael Love 43k

@mikelove

This is answered in the edgeR user guide, right? In the section on filtering it says "It is also recommended to recalculate the library sizes of the DGEList object after the filtering though the difference is usually negligible."
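
To illustrate why the difference is usually negligible, here is a rough base-R sketch on simulated counts. It is not edgeR code: the data, the group size of 3, and all variable names are made up for illustration. It applies the CPM > 1 filter described in the question and compares library sizes before and after, which is the recalculation that keep.lib.sizes=FALSE requests when subsetting a DGEList.

```r
# Simulated counts (hypothetical data): 1000 genes x 6 samples, two groups of 3.
set.seed(1)
n_genes <- 1000
n_samples <- 6
mu <- rexp(n_genes, rate = 1 / 200)   # gene-level mean counts
mu[1:300] <- mu[1:300] / 200          # make 30% of the genes lowly expressed
counts <- matrix(
  rnbinom(n_genes * n_samples, mu = rep(mu, n_samples), size = 10),
  nrow = n_genes, ncol = n_samples
)

# Library sizes and counts per million before filtering.
lib_before <- colSums(counts)
cpm_before <- t(t(counts) / lib_before) * 1e6

# Keep genes with CPM > 1 in at least 3 samples (the smallest group size here).
keep <- rowSums(cpm_before > 1) >= 3
filtered <- counts[keep, , drop = FALSE]

# Library sizes recalculated from the retained genes only.
lib_after <- colSums(filtered)

# Fraction of each library that the filter removed.
lost_fraction <- 1 - lib_after / lib_before
print(round(lost_fraction, 5))
```

Because the genes that fail the filter contribute very few reads by construction, the recalculated library sizes in this simulation stay close to the originals. Real data may behave differently, so it is worth checking the same fractions on your own count matrix.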
