Question

Question: How does limma derive its logFC value in two colored arrays?

Gordon Smyth 50k

@gordon-smyth

WEHI, Melbourne, Australia

Dear Sunny,

> Date: Mon, 27 Sep 2010 03:14:01 -0400
> From: Sunny Srivastava <research.baba at gmail.com>
> To: bioconductor <bioconductor at stat.math.ethz.ch>
> Subject: [BioC] Question: How does limma derives its logFC value in
>         two colored arrays?
>
> Hello Bioconductor Gurus,
>
> I have data about gene expression from a TWO COLORED Agilent array. I
> wanted to check differential expression between p3 and wild strain of yeast.
> In one array p3 is colored with Cy5 and wild is colored with Cy3 and in the
> second array the dyes are swapped. Assuming I have normalized my data using
> VSN and obtained M values for the two arrays, I now want to use limma to
> derive the differentially expressed genes.
>
> My model matrix (say design) in this case will be
>
>   p3
>    1
>   -1
>
> if wild type is the reference.
>
> If my understanding is correct about how limma analyzes differential
> expression, then M value is the dependent variable, sample annotation
> (whether p3 or wild, provided by design) is the independent (explanatory)
> variable, and a linear model is fit per gene using the following equation.
>
> lmFit( M , design)

This is all correct.

> As the data per gene is small, it is better to use eBayes method to obtain
> genewise p-value. But the object obtained from eBayes (say fit3) doesn't
> contain the value *logFC*.

Yes it does--the logFC is fit3$coefficients. This terminology is because the logFCs are estimated as the coefficients of the linear model.

Best wishes
Gordon

> When I use topTable to order the genes, then
> logFC appears.
>
> The concept of logFC is clear to me in the case of an Affy single colored array
> (ie log (Int_trt/ Int_control) ), but somehow I am still confused how to
> interpret this in two colored arrays.
>
> In my opinion M value (for each array) should represent logFC if color bias
> is ignored. How does limma derive its logFC value in two colored arrays? Is
> it based on the B statistics? Please enlighten me!
>
> Thanks in advance for any help.
>
> Best Regards,
> S.
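The statement that the logFC is simply the fitted coefficient can be checked numerically. The following is a minimal sketch, not limma itself: it uses base R's `lm.fit` as a stand-in for the per-gene least-squares fit, so it runs without limma installed. The M-values and gene names are made up for illustration; only the dye-swap design comes from the question.

```r
# Hypothetical M-values (log-ratios, Cy5 over Cy3) for three genes on two arrays.
# Array 1: p3 in Cy5, wild type in Cy3. Array 2: dyes swapped.
M <- matrix(c( 1.2, -1.0,
               0.1, -0.3,
              -2.0,  1.6),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("geneA", "geneB", "geneC"),
                            c("array1", "array2")))

# Design from the question: +1 where p3 is in Cy5, -1 where p3 is in Cy3.
design <- cbind(p3 = c(1, -1))

# Ordinary least squares, one regression per gene.
# lm.fit() expects observations (arrays) in rows, hence t(M).
fit <- lm.fit(x = design, y = t(M))

# The single coefficient per gene is the estimated log fold change p3 vs wild.
logFC <- fit$coefficients["p3", ]

# With this design the estimate reduces to (M on array 1 - M on array 2) / 2.
by_hand <- (M[, "array1"] - M[, "array2"]) / 2

print(cbind(logFC, by_hand))
```

With limma installed, `lmFit(M, design)$coefficients` should give the same numbers for unweighted data like this. `eBayes` moderates the variance estimates rather than the coefficients, which is consistent with `topTable` reporting `fit3$coefficients` under the column name logFC.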

Yeast affy vsn limma • 1.7k views

updated 13.6 years ago by Sunny Srivastava ▴ 350 • written 13.6 years ago by Gordon Smyth 50k

score 0 · Answer 1 · 2010-09-27

Dear Dr. Smyth,

Thank you very much for the answer. That clarifies a lot of things! I have one last question to make sure I understood you correctly: in the case of two colored arrays, even though we are regressing the M values on the treatment (sample annotation), do we still call the coefficients logFC?

Thanks in advance for your help.

Best Regards,
S.

On Mon, Sep 27, 2010 at 7:17 PM, Gordon K Smyth <smyth@wehi.edu.au> wrote:

> Yes it does--the logFC is fit3$coefficients. This terminology is because
> the logFCs are estimated as the coefficients of the linear model.
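Why the regression coefficient deserves the name logFC can be seen from a short derivation for a single gene. This is a sketch under assumptions not stated in the thread: dye bias is ignored, M is treated as a log2 ratio of Cy5 to Cy3 (with VSN it is a generalized log-ratio), and the symbols $M_1$, $M_2$, $\beta$ and $\varepsilon_i$ are notation introduced here.

```latex
\begin{aligned}
M_1 &= \log_2\frac{\text{p3}}{\text{wild}} = \beta + \varepsilon_1
  &&\text{(array 1: p3 in Cy5)} \\
M_2 &= \log_2\frac{\text{wild}}{\text{p3}} = -\beta + \varepsilon_2
  &&\text{(array 2: dyes swapped)} \\
X &= \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \qquad
\hat\beta = (X^\top X)^{-1} X^\top
  \begin{pmatrix} M_1 \\ M_2 \end{pmatrix}
  = \frac{M_1 - M_2}{2}
\end{aligned}
```

So $\beta$ is the log fold change of p3 over wild type, and its least-squares estimate averages the two arrays after flipping the sign of the dye-swapped one. Each M-value is itself a single-array logFC, as suggested in the original question; the coefficient pools them across arrays.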