Question: Filtering RNA-seq counts
17 months ago by
Aurora10 wrote:


Filtering RNA-seq counts before performing differential expression analysis is generally recommended. Why is it recommended? What makes the analysis better than if no filtering were performed?


Thank you for your answers.


Tags: rna-seq, filtering
Answer: Filtering RNA-seq counts
17 months ago by
Gordon Smyth39k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth39k wrote:

To quote from the edgeR Workflow:

"Genes that have very low counts across all the libraries should be removed prior to downstream analysis. This is justified on both biological and statistical grounds. From biological point of view, a gene must be expressed at some minimal level before it is likely to be translated into a protein or to be considered biologically important. From a statistical point of view, genes with consistently low counts are very unlikely be assessed as significantly DE because low counts do not provide enough statistical evidence for a reliable judgement to be made. Such genes can therefore be removed from the analysis without any loss of information."

Filtering improves dispersion estimation (because one doesn't try to estimate dispersions for genes with no information), improves statistical power (because it reduces the amount of testing) and decreases computation. Most important of all, filtering allows good empirical Bayes estimation across genes because it makes the remaining genes more homogeneous.
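
As a rough illustration of the kind of low-count filter described above, the following Python sketch keeps only genes whose counts per million (CPM) exceed a cutoff in a minimum number of libraries. It is not edgeR's own filtering rule: the function names, the cutoff of 1 CPM, the minimum of 2 libraries, and the toy count table are all assumptions chosen for the example.

```python
# Illustrative sketch only: a simplified CPM-based low-count filter.
# Names and thresholds are assumptions for this example, not edgeR's rule.

def cpm(counts):
    """Convert a gene -> per-sample counts table to counts per million."""
    n_samples = len(next(iter(counts.values())))
    # Library size = total count in each sample (assumed non-zero here).
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [1e6 * c / lib_sizes[j] for j, c in enumerate(row)]
        for gene, row in counts.items()
    }

def filter_low_counts(counts, min_cpm=1.0, min_samples=2):
    """Keep genes whose CPM exceeds min_cpm in at least min_samples libraries."""
    cpms = cpm(counts)
    return {
        gene: row
        for gene, row in counts.items()
        if sum(v > min_cpm for v in cpms[gene]) >= min_samples
    }

# Toy data: four libraries of roughly one million reads each (made up).
counts = {
    "geneA": [500_000, 450_000, 520_000, 480_000],
    "geneB": [500_000, 550_000, 480_000, 520_000],
    "geneC": [0, 1, 0, 0],   # essentially unexpressed
    "geneD": [0, 0, 0, 0],   # never observed
    "geneE": [5, 8, 6, 7],   # low but consistently detected
}

kept = filter_low_counts(counts)
print(sorted(kept))
```

Here geneC and geneD carry essentially no information for testing and are dropped, while geneE is retained because it is detected above the cutoff in every library. In practice the cutoff and the minimum number of libraries would be chosen with the library sizes and group sizes of the actual experiment in mind.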


Thanks a lot!

Reply written 17 months ago by Aurora10