gene set enrichment analysis with missing values
heyi xiao ▴ 360
@heyi-xiao-3308
Last seen 8.2 years ago
United States
Dear all, I have an expression data matrix with genes as rows and samples as columns. Many genes (~30%) have missing values in one or more samples. I would like to do a gene set enrichment type of analysis. Should I remove the whole rows for all these genes? I am a little concerned that this may reduce the power of the test when so many genes are dropped from the analysis. Is there a better way to go? Any suggestions would be appreciated. Thank you! Heyi
Luo Weijun ★ 1.6k
@luo-weijun-1783
Last seen 17 months ago
United States
Hi Heyi, You may want to try the GAGE method. GAGE does differential expression tests on gene sets based on one-on-one comparisons between samples. This approach, together with carefully designed NA handling, makes GAGE tolerant to missing values (NAs). You don't have to remove genes with missing values. In fact, it is better not to remove them, because the expression values that do exist for these genes can still be used, which makes the analysis more sensitive. The gage package is newly available with Bioconductor 2.7 at http://bioconductor.org/help/bioc-views/release/bioc/html/gage.html. The GAGE method has been published at http://www.biomedcentral.com/1471-2105/10/161. Let me know if you have other questions or need help. Thanks! Weijun

--- In reply to heyi xiao, "gene set enrichment analysis with missing values", sent to bioconductor@stat.math.ethz.ch on Wednesday, November 3, 2010, 10:19 PM.
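To illustrate why keeping genes with partial NAs can help, here is a minimal Python sketch of a pairwise, NA-tolerant gene set score. It is not the gage package's code or API: the function names (`paired_fold_changes`, `set_scores`, `complete_case_genes`), the simulated data, and the simplified set-versus-background t-like statistic are all assumptions made for illustration. The point is only that a missing value removes a gene from the one sample pair it touches, not from the whole analysis.

```python
import numpy as np

def paired_fold_changes(expr, ref_cols, samp_cols):
    # Log-scale expression assumed: difference = log fold change per ref/sample pair.
    # A NaN in either member of a pair gives NaN for that pair only.
    return expr[:, samp_cols] - expr[:, ref_cols]

def set_scores(fold, set_mask):
    # For each pair (column), compare the mean fold change of the set genes
    # with the mean over all genes, using only the values that are present.
    scores = []
    for j in range(fold.shape[1]):
        col = fold[:, j]
        present = ~np.isnan(col)
        in_set = col[present & set_mask]
        background = col[present]
        n = in_set.size
        if n < 2:
            scores.append(np.nan)  # too few observed set genes in this pair
            continue
        se = np.sqrt(in_set.var(ddof=1) / n + background.var(ddof=1) / n)
        scores.append((in_set.mean() - background.mean()) / se)
    return np.array(scores)

def complete_case_genes(expr):
    # Number of genes that survive if every row containing a NaN is dropped.
    return int((~np.isnan(expr).any(axis=1)).sum())

# Simulated data (hypothetical): 200 genes, columns = ref1, ref2, samp1, samp2.
rng = np.random.default_rng(0)
n_genes = 200
expr = rng.normal(8.0, 1.0, size=(n_genes, 4))
set_mask = np.zeros(n_genes, dtype=bool)
set_mask[:20] = True
expr[set_mask, 2:] += 2.0  # the gene set is up-regulated in the sample columns

# Give 30% of the genes exactly one missing value.
missing_rows = rng.choice(n_genes, size=60, replace=False)
missing_cols = rng.integers(0, 4, size=60)
expr[missing_rows, missing_cols] = np.nan

fold = paired_fold_changes(expr, [0, 1], [2, 3])
scores = set_scores(fold, set_mask)
genes_per_pair = (~np.isnan(fold)).sum(axis=0)

print("genes kept by row removal:", complete_case_genes(expr))
print("genes used in each pair:  ", genes_per_pair)
print("per-pair set scores:      ", scores)
```

In this sketch, row removal discards every gene with a missing value, while the pairwise approach loses each such gene in only one of the two comparisons. How the per-pair results are combined into a final p-value is left out here; see the GAGE paper linked above for the actual procedure.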