Question: Limma; matrix of microarray data and design matrix
john herbert wrote (6.1 years ago):
Dear Bioconductors.

I have a six-column matrix of one-colour array data (first 3 columns are case, second 3 are control), quantile normalized.

I would like to do simple differential gene expression using limma.

Is there a line or two of code that generates a simple design matrix for this scenario?

I usually use a design matrix created from a targets file, and I never really understand lines like... design <- model.matrix(~0+f) (what is ~0+f)?
James W. MacDonald (United States) wrote (6.1 years ago):
Hi John,

On 6/3/2011 9:20 AM, john herbert wrote:
> I usually use a design matrix created from a targets file, and I never
> really understand lines like... design <- model.matrix(~0+f) (what is
> ~0+f)?

No idea what f is here (other than the obvious; it is a variable holding a factor). But constructing the design matrix is simple.

    f <- factor(rep(c("case","control"), each = 3)) # ;-D
    design <- model.matrix(~f)
    fit <- lmFit(<yourmatrix>, design)
    fit2 <- eBayes(fit)
    topTable(fit2, coef=2)    # coefficient 2 is control - case

-OR-

    design <- model.matrix(~0+f)
    fit <- lmFit(<yourmatrix>, design)
    fit2 <- contrasts.fit(fit, c(1,-1))    # case - control
    fit2 <- eBayes(fit2)
    topTable(fit2, coef=1)

These results will be identical, except the signs will be flipped for your coefficients (and I would normally prefer the sign in the second case). It is worth your while to figure out why, and what the difference is between the two design matrices.

Best,

Jim
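To see how the two design matrices relate, here is a minimal base-R sketch on a single made-up gene. The expression values in `y` are invented for illustration, and `lm.fit()` stands in for the per-gene least-squares fit; limma itself is not used, so this only covers the coefficient estimates, not the moderated statistics from `eBayes()`.

```r
# Hypothetical log-expression values for one gene: 3 case arrays, then 3 control arrays.
f <- factor(rep(c("case", "control"), each = 3))
y <- c(8.1, 7.9, 8.3, 6.0, 6.2, 5.8)

# Design with an intercept: column 1 is the "case" mean (the baseline level),
# column 2 is the difference control - case.
design_ref <- model.matrix(~f)

# Design without an intercept: one column per group mean.
design_means <- model.matrix(~0 + f)

# Ordinary least-squares fits for this one gene.
coef_ref   <- lm.fit(design_ref, y)$coefficients
coef_means <- lm.fit(design_means, y)$coefficients

# Contrasts applied by hand to the two group means.
control_minus_case <- sum(c(-1, 1) * coef_means)
case_minus_control <- sum(c(1, -1) * coef_means)

print(coef_ref)
print(coef_means)
print(c(control_minus_case = control_minus_case,
        case_minus_control = case_minus_control))
```

The second coefficient of the intercept design equals the c(-1, 1) contrast of the group means (control - case), while c(1, -1) gives the same magnitude with the opposite sign (case - control).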
john herbert wrote (6.1 years ago):
Thank you James, that is very helpful.

In terms of why, I am not sure at the moment. To be honest, I don't have any idea about the stats here.

Take the tilde, for instance. Searching online finds:

1. In asymptotic notation <http://mathworld.wolfram.com/AsymptoticNotation.html>, f ~ phi is used to mean that f/phi -> 1.
2. In statistics <http://en.wikipedia.org/wiki/Statistics> and probability theory <http://en.wikipedia.org/wiki/Probability_theory>, "~" means "is distributed as".

How does that fit in with ~f? 0 compared factor variables?
Hi John,

On 6/6/2011 6:47 AM, john herbert wrote:
> How does that fit in with ~f?
> 0 compared factor variables?

No. The tilde has a different meaning within R: in a formula, what follows it specifies the right-hand side of a model equation. The default in R is to fit an intercept in all linear models (which in the context of ANOVA is better thought of as a 'baseline' sample, to which all other samples are compared). So when you do something like

    f <- factor(rep(c("A","B"), each = 3))
    design <- model.matrix(~f)

you are by default setting the 'A' samples as the baseline sample, and the second coefficient in the model is the B - A comparison.

To eliminate the intercept, you add either a 0 or a -1 to the right-hand side of the equation:

    design <- model.matrix(~0+f)

which will then compute the average expression of the A and B samples separately, so you have to explicitly create a contrasts matrix in order to compute the B - A contrast.

See the limmaUsersGuide, and ?formula for more information. You might also consider looking at Julian Faraway's excellent book on using R to fit linear models. This used to be a pdf he gave away for free, but is now published. However, some work with the googles might get you to the pdf if it is floating around on somebody's website.

Best,

Jim
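The effect of the intercept on the design matrix can be inspected directly in base R. The following is a small sketch using the same two-level factor as above; the object names are arbitrary, and the `relevel()` step is an addition for illustration that shows one way to choose a different baseline level.

```r
f <- factor(rep(c("A", "B"), each = 3))

# Default: intercept plus one column for each non-baseline level.
with_intercept <- model.matrix(~f)

# Intercept removed: one indicator column per level.
no_intercept      <- model.matrix(~0 + f)
also_no_intercept <- model.matrix(~f - 1)   # same design, different spelling

print(with_intercept)
print(no_intercept)

# The baseline is the first factor level ("A" here, by alphabetical order).
# relevel() makes "B" the baseline instead, so the second column then
# represents A - B rather than B - A.
f2 <- relevel(f, ref = "B")
print(colnames(model.matrix(~f2)))
```

In the intercept design the `fB` column is 0 for the A samples and 1 for the B samples, which is why its coefficient is the B - A difference; in the no-intercept design each column simply marks the members of one group.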
Reply written 6.1 years ago by James W. MacDonald