Limma Modeling
@katarzyna-bryc-1808
Dear List,

I have a question about correctly modeling a situation in order to find significantly differentially expressed genes with Limma. I have Affy arrays for pediatric patients, collected before the patients were treated with a drug for 4 months. After this period, some patients had a side effect of significant weight gain, while others did not. I wish to find the genes that significantly differentiate patients who gained weight with the treatment from those who did not suffer this side effect. Since these are pediatric patients, I also wish to control for Sex and Age (a continuous variable).

I understand that with Limma I can model the following:

Gene Expression = Age + Sex + Weight Gain

but I actually wish to treat Weight Gain as the dependent variable and Gene Expression as one of the independent variables (still controlling for Age and Sex). That is, I wish to model

Weight Gain = Age + Sex + Gene Expression

My questions are:
1. Will these two models give me the same results for finding genes that are significant in predicting Weight Gain?
2. If not, is there a way to model this using either Limma or another Bioconductor method?

Thank you for any helpful words,
Kasia Bryc
Naomi Altman ★ 6.0k
@naomi-altman-380
In reply to Katarzyna Bryc's message of 7/20/2006, 12:00 PM:

The problem is that "gene expression" here stands for thousands of genes. With more genes than patients, many different sets of genes can give a perfect fit to weight gain.

So you usually need to do some gene selection before you fit the model. You could then do ordinary variable selection, as in ordinary linear regression, or possibly use a method like ridge regression.

--Naomi

Naomi S. Altman
Associate Professor, Dept. of Statistics
Penn State University, University Park, PA 16802-2111
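A minimal sketch of the perfect-fit problem and of what ridge regression does about it, in plain Python with simulated data. Everything here is an assumption made for illustration: the sizes (8 patients, 20 genes), the pure-noise "weight gain" values, the two penalty values, and the helper names `solve`, `ridge` and `rss`. It omits the intercept and the Age and Sex covariates, and it is not how Limma or any Bioconductor package fits models.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ridge(X, y, lam):
    """Coefficients minimising ||y - X b||^2 + lam * ||b||^2 (no intercept)."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

def rss(X, y, b):
    """Residual sum of squares of the fit X b to y."""
    return sum((yi - sum(xij * bj for xij, bj in zip(row, b))) ** 2
               for row, yi in zip(X, y))

random.seed(0)
n_patients, n_genes = 8, 20  # hypothetical sizes: more genes than patients
X = [[random.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_patients)]
y = [random.gauss(0, 1) for _ in range(n_patients)]  # noise, unrelated to X

b_weak = ridge(X, y, 1e-6)    # almost unpenalised: fits the noise almost exactly
b_strong = ridge(X, y, 10.0)  # heavier penalty: smaller coefficients, looser fit

print("RSS with tiny penalty: ", rss(X, y, b_weak))
print("RSS with large penalty:", rss(X, y, b_strong))
```

With the tiny penalty the residual sum of squares is essentially zero even though the response is unrelated to the genes, which is the perfect-fit problem. The larger penalty shrinks the coefficients and gives up that spurious fit. In practice the penalty would be chosen by cross-validation.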
In reply to Naomi Altman's message of 7/20/06, 11:19:

Another way to think of this is as a classification problem. If you have two groups, those who gain weight and those who do not, you can use tools like SVMs or random forests (the randomForest package) to determine which genes are most predictive of weight gain. As Naomi points out, there may be anywhere from zero to MANY genes that classify your samples correctly. With random forests, at least, the question can be framed as a regression problem as well as a classification problem.

Sean
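To illustrate how "MANY" genes can classify a small sample correctly, here is a rough simulation sketch in plain Python. All of it is assumed for illustration: 5 patients per group, 5000 genes, and expression values that are pure noise with no real group difference. The `separates` helper is hypothetical and checks only whether a single cutoff on one gene splits the two groups perfectly. This is a much simpler stand-in for an SVM or random forest, not an implementation of either.

```python
import random

random.seed(1)
n_per_group, n_genes = 5, 5000  # hypothetical sizes
labels = [0] * n_per_group + [1] * n_per_group  # 0 = no weight gain, 1 = weight gain

def separates(values, labels):
    """True if a single threshold on `values` splits the two label groups perfectly."""
    g0 = [v for v, lab in zip(values, labels) if lab == 0]
    g1 = [v for v, lab in zip(values, labels) if lab == 1]
    return max(g0) < min(g1) or max(g1) < min(g0)

# Each "gene" is independent noise, so any perfect separation is due to chance.
perfect = sum(
    separates([random.gauss(0, 1) for _ in labels], labels)
    for _ in range(n_genes)
)
print(perfect, "of", n_genes, "pure-noise genes separate the two groups perfectly")
```

With 5 samples per group, a noise gene separates the groups perfectly with probability 2/252 (about 0.8%), so among thousands of genes some will look like perfect classifiers by chance. This is why predictive accuracy should be judged on held-out samples or by cross-validation rather than on the samples used for gene selection.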