Hi, I need to figure out which approach to filtering lowly expressed genes is more appropriate. The tximport manual recommends the following commands for an edgeR analysis:
library(edgeR)
cts <- txi$counts
normMat <- txi$length
normMat <- normMat/exp(rowMeans(log(normMat)))
o <- log(calcNormFactors(cts/normMat)) + log(colSums(cts/normMat))
y <- DGEList(cts)
y$offset <- t(t(log(normMat)) + o)
and then continuing with y as the DGEList object. In my analysis, I filtered out the lowly expressed genes by CPM after the normalization above (for instance, keeping genes with CPM greater than 1 in at least as many samples as the smallest group), subsetting with "keep.lib.sizes=FALSE".
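To make the order of operations concrete, here is a minimal sketch of the workflow described above. It does not answer whether the order is right. The count and length matrices are simulated stand-ins for txi$counts and txi$length, and the `group` factor and the `min_n` name are hypothetical, introduced only for illustration.

```r
library(edgeR)

set.seed(1)
# Simulated stand-ins for txi$counts and txi$length (hypothetical data)
cts <- matrix(rnbinom(6000, mu = 20, size = 2), ncol = 6,
              dimnames = list(paste0("gene", 1:1000), paste0("s", 1:6)))
len <- matrix(runif(6000, 500, 3000), ncol = 6, dimnames = dimnames(cts))
group <- factor(rep(c("A", "B"), each = 3))  # hypothetical two-group design

# Offsets computed as in the tximport commands quoted above
normMat <- len / exp(rowMeans(log(len)))
o <- log(calcNormFactors(cts / normMat)) + log(colSums(cts / normMat))
y <- DGEList(cts, group = group)
y$offset <- t(t(log(normMat)) + o)

# Filtering step: keep genes with CPM > 1 in at least as many samples
# as the smallest group contains
min_n <- min(table(group))
keep <- rowSums(cpm(y) > 1) >= min_n

# Subset the rows; keep.lib.sizes = FALSE recomputes the library sizes
# from the remaining counts
y <- y[keep, , keep.lib.sizes = FALSE]
```

In this sketch the offset matrix is computed from all genes before filtering and is then subset along with the counts, which is the ordering the question is about.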
I am now unsure whether this approach is appropriate, or whether I should do the normalization after filtering instead.
Thanks for your help.
Best,