Limma; matrix of microarray data and design matrix
john herbert ▴ 560
@john-herbert-4612
Dear Bioconductors,

I have a six-column matrix of one-colour array data (the first 3 columns are case, the second 3 are control), quantile normalized. I would like to do a simple differential gene expression analysis using limma.

Is there a line or two of code that generates a simple design matrix for this scenario?

I usually use a design matrix created from a targets file, and I never really understand lines like design <- model.matrix(~0+f) (what is ~0+f?).
@james-w-macdonald-5106
Hi John,

On 6/3/2011 9:20 AM, john herbert wrote:
> I usually use a design matrix created from a targets file, and I never
> really understand lines like... design <- model.matrix(~0+f) (what is
> ~0+f)?

No idea what f is here (other than the obvious: it is a variable pointing to a factor). But constructing the design matrix is simple.

    f <- factor(rep(c("case","control"), each = 3)) # ;-D
    design <- model.matrix(~f)
    fit <- lmFit(yourmatrix, design)      # yourmatrix: your six-column expression matrix
    fit2 <- eBayes(fit)
    topTable(fit2, coef=2)                # control - case

-OR-

    design <- model.matrix(~0+f)
    fit <- lmFit(yourmatrix, design)
    fit2 <- contrasts.fit(fit, c(1,-1))   # case - control
    fit2 <- eBayes(fit2)
    topTable(fit2, coef=1)

These results will be identical, except that the signs of the coefficients will be flipped (and I would normally prefer the sign in the second case). It is worth your while to figure out why, and what the difference is between the two design matrices.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
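One way to see the difference between the two design matrices is simply to print them. The following is a minimal sketch in base R (model.matrix() comes with the stats package, so limma is not needed for this step). The names design_int and design_means are made up for illustration; the factor f is the one from the reply.

```r
# Same grouping as in the reply: three case arrays, then three control arrays.
f <- factor(rep(c("case", "control"), each = 3))

# With an intercept: column 1 is the baseline group (the first factor
# level, "case"), column 2 is the control - case difference.
design_int <- model.matrix(~f)

# Without an intercept: one indicator column per group, so each
# coefficient is that group's mean.
design_means <- model.matrix(~0 + f)

print(design_int)
print(design_means)
```

Because factor levels are sorted alphabetically by default, "case" becomes the baseline in the intercept model. Calling relevel() on f, or setting levels= explicitly in factor(), changes which group plays that role.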
john herbert ▴ 560
@john-herbert-4612
Thank you James, that is very helpful. In terms of why, I am not sure at the moment.

To be honest, I don't have any idea about the stats here.

Take the tilde, for instance. Searching online finds:

1. In asymptotic notation <http://mathworld.wolfram.com/asymptoticnotation.html>, f ~ phi is used to mean that f/phi -> 1.
2. In statistics <http://en.wikipedia.org/wiki/statistics> and probability theory <http://en.wikipedia.org/wiki/probability_theory>, ~ means "is distributed as".

How does that fit in with ~f? Is it 0 compared with the factor variables?
Hi John,

On 6/6/2011 6:47 AM, john herbert wrote:
> How does that fit in with ~f?
> 0 compared factor variables?

No. The tilde has a different meaning within R: it separates the left-hand side of a model formula (the response) from the right-hand side (the predictors), so a one-sided formula like ~f specifies only the right-hand side. The default in R is to fit an intercept in all linear models (which in the context of ANOVA is better thought of as a 'baseline' group, to which all other groups are compared). So when you do something like

    f <- factor(rep(c("A","B"), each = 3))
    design <- model.matrix(~f)

you are by default setting the 'A' samples as the baseline, and the second coefficient in the model is the B - A comparison.

To eliminate the intercept, you add either a 0 or a -1 to the right-hand side of the formula:

    design <- model.matrix(~0+f)

which will then estimate the average expression of the A and B samples separately, so you have to explicitly create a contrasts matrix in order to compute the B - A contrast.

See the limma User's Guide (limmaUsersGuide()) and ?formula for more information. You might also consider looking at Julian Faraway's excellent book on using R to fit linear models. This used to be a PDF he gave away for free, but it is now published. However, some work with the googles might get you to the PDF if it is floating around on somebody's website.

Best,

Jim
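The baseline-versus-group-means point can be checked numerically for a single gene. This is only a sketch in base R with made-up expression values, and the variable names are hypothetical. limma's lmFit() fits the same kind of linear model to every row of the matrix; the moderated statistics from eBayes() are not reproduced here.

```r
# Toy log-expression values for ONE gene (made-up numbers, illustration only).
y <- c(5.1, 4.9, 5.3, 7.0, 7.2, 6.8)
f <- factor(rep(c("A", "B"), each = 3))

# Intercept model: (Intercept) is the mean of A, fB is mean(B) - mean(A).
coef_int <- coef(lm(y ~ f))

# No-intercept model: fA and fB are the two group means.
coef_means <- coef(lm(y ~ 0 + f))

# The B - A contrast applied by hand to the group means.
contrast_BA <- sum(c(-1, 1) * coef_means)

# Both parametrizations give the same B - A estimate.
stopifnot(isTRUE(all.equal(unname(coef_int["fB"]), contrast_BA)))

print(coef_int)
print(coef_means)
print(contrast_BA)
```

Swapping the contrast vector to c(1, -1) gives the A - B comparison instead, which is the same estimate with the sign flipped.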