gene filtering for limma lmFit
Peng, Fred ▴ 10
@peng-fred-3738
Hello all,

I have a question about using the limma package to identify differentially expressed genes: should I filter genes after normalization, to exclude those that are likely unexpressed in the samples, before fitting the linear model? With my limited stats knowledge, I believe that including 'unexpressed' genes may affect the BH multiple testing correction by unnecessarily increasing the number of genes being tested.

Previously, however, when I ran a global test (using the globaltest package) on Affy data, I found that the gene filtering step had no noticeable effect on the final P-value and so was not required. So I wonder whether limma's ability to detect differentially expressed genes is affected by whether or not 'unexpressed' genes are filtered out.

Thanks very much in advance.

Fred Peng
Normalization affy limma • 1.6k views
@james-w-macdonald-5106
Hi Fred,

You are correct that including a bunch of unexpressed genes when adjusting for multiplicity will reduce power. However, with limma you don't want to remove the 'unexpressed' genes too early. (This doesn't apply to 'bad' data, where the spots are demonstrably unreliable for one reason or another.)

Remember that the eBayes() step adjusts the denominator of the t-statistic based on a prior variance estimate that is calculated from all the genes under consideration. If you filter out genes before this step, you can bias that estimate.

So the recommended method is to perhaps remove demonstrably bad spots first, do the normalization, model fitting, etc., and then filter out the genes you consider unexpressed before doing the multiplicity adjustment.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
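For readers who want to see the order of operations, here is a minimal sketch in R of the workflow described in the answer: fit and run eBayes() on all probes, then drop the 'unexpressed' ones, then adjust for multiplicity. It is not from the original post. The simulated matrix `expr`, the two-group design, and the intensity cutoff of 5 are all assumptions made up for illustration; a sensible cutoff depends on your platform and data.

```r
library(limma)

set.seed(1)

# Hypothetical data: 1000 probes x 6 arrays of normalized log2 intensities.
n_genes <- 1000
expr <- matrix(rnorm(n_genes * 6, mean = 7, sd = 0.5), nrow = n_genes,
               dimnames = list(paste0("probe", seq_len(n_genes)),
                               paste0("s", 1:6)))
# Make the first 300 probes look 'unexpressed' (low intensity).
expr[1:300, ] <- expr[1:300, ] - 4

# Two groups of three arrays each (assumed design).
group  <- factor(rep(c("ctrl", "trt"), each = 3))
design <- model.matrix(~ group)

# 1. Fit the model and moderate the variances using ALL probes,
#    so the prior variance estimate is not biased by filtering.
fit <- lmFit(expr, design)
fit <- eBayes(fit)

# 2. Only now drop probes considered unexpressed.
#    The cutoff (mean log2 intensity > 5) is an arbitrary illustration.
keep <- rowMeans(expr) > 5
fit_filtered <- fit[keep, ]

# 3. BH adjustment is applied by topTable() to the retained probes only.
tab <- topTable(fit_filtered, coef = 2, number = Inf, adjust.method = "BH")
head(tab)
```

Because the fitted object is subset after eBayes(), the moderated t-statistics and raw p-values are the same as in the unfiltered fit; only the number of tests entering the BH adjustment changes.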