choosing explanatory variables for linear model in limma
Yannick Wurm ▴ 220
@yannick-wurm-2314
Last seen 10.2 years ago
Hello Jim & List,

How would you go about doing model selection using two-color data, given that the response variable is actually a ratio? I'm surprised that no "formal/mathematical" linear model is written out in the limma Users Guide... any comments?

Kind regards,
Yannick

On 2009-09-03 13:33:0, Jim Macdonald wrote:
> Hi Andre,
>
> If you want to do model selection, then limma is probably not the tool
> for the job.
>
> Instead, what I would do would be to choose some (one, five, ten,
> whatever) genes and use lm() for the model selection process. That way
> you can do all the conventional model selection steps, and once you are
> satisfied with the model you have chosen you can go back to limma and
> fit the model on all the genes.
>
> Best,
>
> Jim
>
> Andre J. Aberer wrote:
> > Dear list members,
> >
> > Short version of my question:
> > How can I determine whether introducing additional explanatory
> > variables improves the quality of a linear model (in limma)? Is
> > there an equivalent to feature selection (as in machine learning) for
> > choosing the explanatory variables?
> >
> > The complete story:
> > We are analysing a dataset of about ninety single-channel microarray
> > chips, and we want to search for differentially expressed genes and
> > enriched gene sets. The chips are annotated with information (at least
> > 20 factors, could be extended to 50) such as the organ from which the
> > RNA was extracted, the experimenter who did the lab work, the labelling
> > kit she used, and a large number of features describing e.g. the
> > genotype of the individual or different aspects of the disease.
> >
> > We would like to build one linear model (i.e. one design matrix) with
> > all of the factors of interest mentioned above as explanatory variables
> > in order to test various contrasts. Of course, we have to include all
> > the variables that we might want to test in the linear model. But
> > what about "technical" factors like the labelling kit that was
> > used? One might never want to test a contrast using this explanatory
> > variable, yet the net chip intensity could be influenced by a
> > technical factor like this. So how can I determine whether it makes
> > sense to include this variable?
> >
> > I am using the standard procedure as described in the limma guide:
> >     designMatrix <- model.matrix(~0 + var1 + var2, data=someTable)
> >     fitBoth <- lmFit(eset, designMatrix)
> > where var1 and var2 are variables like "diseaseOutcome" and
> > "labellingKit".
> >
> > We thought that maybe an anova table could help us here, showing us the
> > influences of var1 and var2. As far as I have read
> > (e.g. http://data.princeton.edu/R/linearModels.html), the anova function
> > can simply be applied to an lm object or be used to compare two lm
> > instances. Of course, in that case it is applied to only one linear
> > model and not to one per gene as in the limma setting.
> > So, if I try anova on one or two limma fit objects (MArrayLM), R
> > complains that there is no applicable method, and other anova variants
> > (like anova.lm) do not work either. The same holds when I want to
> > do an anova for just one extracted linear model for one gene
> > (like anova(lmFit[1,])).
> >
> > Our ultima ratio so far is to build one design matrix with and another
> > one without a certain explanatory variable. Then we would determine the
> > top DEGs and, for each DEG, compare its two fitted linear models in an
> > anova table. Finally, we could check for how many of the top DEGs the
> > additional variable makes a difference.
> > However, this does not seem to be the golden path... or are we
> > completely on the wrong track?
> >
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826
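
For readers who want to try the per-gene approach Jim describes in the quoted reply, here is a minimal sketch in base R. It is only an illustration under stated assumptions: the data are simulated, the factor levels ("good"/"poor", "kitA"/"kitB"/"kitC") and the object names `fitSmall` and `fitFull` are hypothetical, and only the variable names `diseaseOutcome`, `labellingKit` and `someTable` come from the thread. With real data, the response `y` would presumably be one row of the expression matrix, e.g. `exprs(eset)[i, ]`.

```r
# Sketch: compare nested models for ONE gene with lm() and anova().
# All data below are simulated for illustration; nothing here is from
# the original dataset.
set.seed(1)
n <- 90  # roughly the number of chips mentioned in the thread

# Hypothetical annotation table with one factor of interest and one
# technical factor (levels are made up).
someTable <- data.frame(
  diseaseOutcome = factor(rep(c("good", "poor"), length.out = n)),
  labellingKit   = factor(rep(c("kitA", "kitB", "kitC"), each = n / 3))
)

# Simulated log-expression for a single gene, with a kit effect built in.
# With real data one might use: y <- exprs(eset)[i, ]  (assumption)
y <- 8 +
  0.5 * (someTable$diseaseOutcome == "poor") +
  1.0 * (someTable$labellingKit == "kitB") +
  rnorm(n, sd = 0.5)

# Model without and with the technical factor
fitSmall <- lm(y ~ diseaseOutcome, data = someTable)
fitFull  <- lm(y ~ diseaseOutcome + labellingKit, data = someTable)

# F-test for the nested models: does labellingKit explain extra variance?
cmp <- anova(fitSmall, fitFull)
print(cmp)

# Information criteria as a second, complementary view
print(AIC(fitSmall, fitFull))
```

Repeating this for a handful of genes gives a feel for whether the technical factor matters. Once a model has been chosen, the corresponding design matrix can be built with `model.matrix()` and passed to `lmFit()` for all genes, as in the quoted code. For two-color data the same idea should carry over with the per-gene log-ratio as the response, assuming the design is expressed in terms of log-ratios.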
GO limma PROcess • 1000 views