Non-specific filtering based on variance prior to differential expression with limma
chris86 ▴ 420
@chris86-8408
Last seen 4.4 years ago
UCL, United Kingdom

Hi

I am doing some non-specific gene filtering prior to DE analysis with limma, and I am wondering how much variance-based filtering I can do and still call my limma DE results 'valid'. For instance, if I select the top 10% or 5% of the most variable genes by coefficient of variation, can I still do differential expression between subgroups of the data set and call my conclusions valid? I think the answer is yes, because I am not selecting any genes with respect to the groups, only removing the low-variance ones that can be thought of as background. However, it would be nice to have some other opinions about this, as I may have misled myself!

Thanks,

Chris

limma microarray normalization • 1.5k views
svlachavas ▴ 830
@svlachavas-7225
Last seen 6 months ago
Germany/Heidelberg/German Cancer Resear…

Dear Chris,

From the literature (http://www.ncbi.nlm.nih.gov/pubmed/20460310) and other tutorials you can see that filtering based on variance in conjunction with limma is generally not recommended. In detail, a non-specific filter based on variance is likely to remove many of the probesets that cause problems in the ordinary Student's t-test: for instance, probesets that have a small p-value not because of differential expression, but because their estimated variance is close to zero. In a volcano plot these probesets typically show a relatively small p-value (y-axis) together with a very small fold change (x-axis). limma usually "corrects" these low-variance probesets by moderating their variance estimates, so if you apply a variance filter first, limma loses much of its benefit. So, generally, if you want to filter prior to the statistical analysis and still use limma, you should probably use a non-specific filter based on intensity, which is considered beneficial in many cases.
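To make the low-variance point concrete, here is a minimal sketch in plain Python (not limma itself). It compares an ordinary two-sample t-statistic with a moderated one whose variance is shrunk towards a prior value, using the posterior variance formula (d0*s0^2 + d*s^2)/(d0 + d). The function names, the example intensities, and the prior values `prior_var` and `prior_df` are all assumptions for illustration. limma estimates the prior from all genes by empirical Bayes, whereas here it is simply hard-coded.

```python
import math
from statistics import mean, variance


def pooled_variance(a, b):
    """Pooled sample variance of two groups."""
    da, db = len(a) - 1, len(b) - 1
    return (da * variance(a) + db * variance(b)) / (da + db)


def ordinary_t(a, b):
    """Ordinary equal-variance two-sample t-statistic."""
    s2 = pooled_variance(a, b)
    return (mean(a) - mean(b)) / math.sqrt(s2 * (1 / len(a) + 1 / len(b)))


def moderated_t(a, b, prior_var, prior_df):
    """t-statistic using a variance shrunk towards prior_var.

    prior_var and prior_df are hypothetical fixed values here; limma
    would estimate them from the whole data set.
    """
    d = len(a) + len(b) - 2
    s2 = pooled_variance(a, b)
    post_var = (prior_df * prior_var + d * s2) / (prior_df + d)
    return (mean(a) - mean(b)) / math.sqrt(post_var * (1 / len(a) + 1 / len(b)))


# Hypothetical probeset: tiny variance and a tiny log2 fold change (0.05).
group_a = [8.00, 8.01, 8.02]
group_b = [8.05, 8.06, 8.07]

t_ord = ordinary_t(group_a, group_b)
t_mod = moderated_t(group_a, group_b, prior_var=0.05, prior_df=4.0)
print(f"ordinary t = {t_ord:.2f}, moderated t = {t_mod:.2f}")
```

With these made-up numbers the ordinary statistic is large in magnitude even though the fold change is negligible, while the moderated statistic is small. Shrinking towards a prior only helps if the low-variance probesets are still in the data when the prior is estimated, which is why a variance filter undermines it.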

You should also check this post, which addresses your question in detail (https://support.bioconductor.org/p/69467/#69744).


Thanks for your very useful reply.

@gordon-smyth
Last seen 33 minutes ago
WEHI, Melbourne, Australia

Variance filtering is likely to be harmful whenever you have a decreasing mean-variance relationship or if you are planning to use any of the better DE statistical tests (like limma, maanova, SAM, CyberT, edgeR etc).

Please filter by mean intensity or some similar measure. It is still "non-specific" but much safer.
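As a rough illustration of filtering by mean intensity, the sketch below keeps probesets whose average log2 intensity across all samples reaches a cutoff. It is written in plain Python rather than R, and the function name, probe names, values, and the cutoff of 5.0 are all hypothetical. A sensible cutoff depends on the platform and the data.

```python
from statistics import mean


def filter_by_mean_intensity(expr, min_mean):
    """Keep probesets whose average log2 intensity is at least min_mean.

    expr maps a probeset ID to its log2 intensities across all samples.
    The filter ignores group labels, so it stays non-specific.
    """
    return {probe: values for probe, values in expr.items()
            if mean(values) >= min_mean}


# Hypothetical log2 intensities for three probesets across four samples.
expr = {
    "probe_1": [3.1, 3.0, 3.2, 2.9],      # low intensity, likely background
    "probe_2": [8.0, 8.1, 10.2, 10.0],    # expressed and variable
    "probe_3": [11.5, 11.4, 11.6, 11.5],  # expressed, low variance
}

kept = filter_by_mean_intensity(expr, min_mean=5.0)
print(sorted(kept))
```

Note that `probe_3` is retained here although its variance is small. A variance or coefficient-of-variation filter would have dropped it before the DE test could use it.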
