Limma-Contrasts-Question

Gordon Smyth 53k

@gordon-smyth

WEHI, Melbourne, Australia

Dear Biju,

Using the contrast (A1+B1)/2 - (A2+B2)/2 will find genes which respond in the same direction, and perhaps by about the same fold change, in both A and B.

Doing separate tests for A1-A2 and B1-B2 does not require genes to be changing in the same direction in A and B.

Best wishes
Gordon

> Date: Thu, 11 Mar 2010 14:49:35 +0100
> From: "Biju Joseph" <bjoseph at hygiene.uni-wuerzburg.de>
> To: <bioconductor at stat.math.ethz.ch>
> Subject: [BioC] Limma-Contrasts-Question
>
> Dear all
>
> Sorry if this is a trivial question.
> My experiment design is the following:
> 2 strains (A & B) subjected to 4 conditions each (1,2,3,4), compared using a
> common reference design.
> We are interested in various contrasts between the 8 samples.
> Using limma, I was able to generate topTables for the required contrasts,
> e.g.:
> A1-B1, A2-B2, A3-B3, A4-B4
> A1-A2, A2-A3, B1-B2, B2-B3 and so on
>
> A comparison of, for example, the topTables for A1-A2 and B1-B2 would
> represent the common response in A and B from condition 1 and condition 2.
>
> Using this manual comparison of the 2 topTables, I saw that around 400 genes
> are commonly differentially regulated in strain A and strain B in
> conditions 1 and 2.
>
> Now when I include the following contrast in my model in limma
>
> (A1+B1)/2 - (A2+B2)/2
>
> which in my understanding also generates the common response between
> condition 1 and 2 in the 2 strains A and B, the topTable generated using
> this contrast shows only 10 genes to be commonly differentially regulated
> between condition 1 and 2.
>
> It would be great if someone could explain this discrepancy to me, and say
> which method is safer (comparison of the 2 individual topTables, or the
> topTable generated using makeContrasts).
>
> Best
> Biju
>
> Institut für Hygiene und Mikrobiologie
> Universität Würzburg
> Josef-Schneider-Str. 2, Gebäude E1
> 97080 Würzburg
> Email: bjoseph at hygiene.uni-wuerzburg.de
> Tel.: 0931 201 46708
> Fax: 0931 201 46445
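Gordon's point about direction can be checked with a little arithmetic. The sketch below is plain Python rather than limma, and the gene names and log2 group means are made up purely for illustration. It only shows the contrast estimates, not the moderated t-tests limma would perform. Note that the averaged contrast equals the mean of the two separate contrasts, so opposite responses cancel.

```python
# Hypothetical log2 group means for three genes (illustrative values only).
genes = {
    "same_direction":     {"A1": 3.0, "A2": 1.0, "B1": 3.0, "B2": 1.0},
    "opposite_direction": {"A1": 3.0, "A2": 1.0, "B1": 1.0, "B2": 3.0},
    "A_only":             {"A1": 3.0, "A2": 1.0, "B1": 2.0, "B2": 2.0},
}


def separate_contrasts(m):
    """Return the two separate contrast estimates (A1-A2, B1-B2)."""
    return m["A1"] - m["A2"], m["B1"] - m["B2"]


def averaged_contrast(m):
    """Return the estimate of (A1+B1)/2 - (A2+B2)/2."""
    return (m["A1"] + m["B1"]) / 2 - (m["A2"] + m["B2"]) / 2


for name, m in genes.items():
    a_resp, b_resp = separate_contrasts(m)
    print(f"{name}: A1-A2={a_resp}, B1-B2={b_resp}, "
          f"averaged={averaged_contrast(m)}")
```

A gene that moves in opposite directions in A and B can pass both separate tests while its averaged contrast is zero, which is one way the two gene lists can differ in size.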

Biju Joseph ▴ 30

@biju-joseph-3955

Thanks Gordon for your answer.

In my question, I was referring to the DE genes in the same direction.

What I am actually unclear about is the following.

Let's say I generate topTables for
1. A1-A2
2. B1-B2

Now, using these 2 individual tables, I could pull out the genes common in either direction using, let's say, Access.

Is this method valid and safe?
How comparable should this manually generated common response between strains A and B in conditions 1 and 2 be to the topTable generated using (A1+B1)/2 - (A2+B2)/2? Of course I think that these 2 methods should generate more or less the same results, at least with respect to numbers of DE genes.

What FC is better to report for the common response? The one from the (A1+B1)/2 - (A2+B2)/2 contrast, or the FC from the individual topTables?

Best
Biju

In reply to Biju Joseph (3/15/2010, 02:17 AM):

Hi Biju,

One thing to consider is that the question you're asking - "which genes have the SAME response in A and B?" - is not what a statistical test is designed to measure. Instead, the null hypothesis is that the responses are the same, and only if there is enough evidence of a difference between the responses will the statistical test become significant. One possibility would be to do the 3 following contrasts:

1) A1-A2
2) B1-B2
3) (A1-A2) - (B1-B2)

The third one tests whether the response in A is the same as the response in B. You could do a Venn diagram on these three contrasts, and those genes that are significant in 1) and 2) but not significant in 3) could be considered genes that have a significant response in A and the "same" (i.e., not significantly different) significant response in B.

Note that genes could be significant for 3), even if they change in the same direction, if they change by differing amounts (2-fold up versus 20-fold up). Whether you want to call this the "same" response depends on your research questions...

HTH,
Jenny

Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML, 1201 W. Gregory Dr., Urbana, IL 61801, USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at illinois.edu
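The selection Jenny describes (significant in 1 and 2 but not in 3) is a simple set operation once each contrast has a list of significant genes. The sketch below is plain Python, not limma, and the gene IDs and significance calls are hypothetical placeholders. The second part works through her 2-fold versus 20-fold example on the log2 scale to show that the interaction estimate can be far from zero even when both strains change in the same direction.

```python
import math

# Hypothetical sets of gene IDs called significant for each contrast.
sig_a = {"g1", "g2", "g3", "g4"}       # 1) A1-A2
sig_b = {"g2", "g3", "g4", "g5"}       # 2) B1-B2
sig_interaction = {"g4", "g6"}         # 3) (A1-A2) - (B1-B2)

# Significant in both strains, and the responses are not significantly different.
same_response = (sig_a & sig_b) - sig_interaction
print(sorted(same_response))

# Same direction, different size: 2-fold up in A versus 20-fold up in B.
response_a = math.log2(2)     # log2 fold change in A
response_b = math.log2(20)    # log2 fold change in B
interaction = response_a - response_b
print(round(interaction, 2))
```

Whether a gene such as the hypothetical "g4" should count as a common response is the research-question judgment mentioned above; the set arithmetic only makes the chosen rule explicit.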

In reply to Jenny Drnevich (Tuesday, 16 March 2010, 15:04):

Thanks Jenny, your suggestion answers my question.

That is, I make the following contrasts as you suggest:
1. A1-A2
2. B1-B2
3. (A1-A2)-(B1-B2)

And then the genes that are significant in contrasts 1 and 2 but not in contrast 3 would be considered as those that have a significant response in A and B. This is perfect and answers my question as well.

But what is still not clear to me is what happens when I use the contrast

(A1+B1)/2 - (A2+B2)/2.

Why do I get a much larger list of genes called as significantly DE when I use this contrast compared to the method "significant in contrast 1 and 2 but not in contrast 3"?

Best
Biju

In reply to Biju Joseph (3/18/2010, 07:37 AM):

Hi Biju,

I suggest you consult a statistics textbook or a local statistician to get some more information on 2x2 factorial designs, which is what your experiment has (for the two strains and conditions 1 and 2). In particular, contrast 3 is known as the interaction term, whereas (A1+B1)/2 - (A2+B2)/2 is the main effect of condition 1 vs. condition 2 AVERAGED over the levels (A & B) of the second factor.

The reason to use the interaction term instead of the main effect term is that the values for A2, B1 and B2 could be very similar, but if A1 is much larger, the average of A1 and B1 could be enough to give you a significant main effect term, even though the response is not really the same. In a typical 2x2 ANOVA, if the interaction term is significant, then you ignore the results for the main effects because they may be misleading...

Jenny

Jenny Drnevich, Ph.D.
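The scenario in the reply above (A2, B1 and B2 similar, A1 much larger) can be worked through numerically. This is a plain-Python sketch with made-up log2 group means, not limma output, and it shows only the contrast estimates. Whether each would be called significant depends on the variance and replication, which are not modeled here.

```python
# Hypothetical log2 group means: A2, B1 and B2 are similar, A1 is much larger.
means = {"A1": 6.0, "A2": 2.0, "B1": 2.0, "B2": 2.0}

# Separate responses to condition 1 vs. condition 2 in each strain.
response_a = means["A1"] - means["A2"]
response_b = means["B1"] - means["B2"]

# Main effect of condition, averaged over strains: (A1+B1)/2 - (A2+B2)/2.
main_effect = (means["A1"] + means["B1"]) / 2 - (means["A2"] + means["B2"]) / 2

# Interaction: (A1-A2) - (B1-B2).
interaction = response_a - response_b

print(f"A response:  {response_a}")
print(f"B response:  {response_b}")
print(f"main effect: {main_effect}")
print(f"interaction: {interaction}")
```

Here the main effect estimate is nonzero although strain B does not respond at all, so a gene like this could appear in the averaged-contrast list while being excluded by the rule "significant in 1 and 2 but not in 3".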
