Question

Help interpreting many contrasts in one contrast versus many individual contrast matrices


Gordon Smyth 53k

@gordon-smyth


WEHI, Melbourne, Australia

Dear Belisa,

Your experiment has 17 different conditions, so you obviously cannot analyse it as a 2x2 experiment. (A 2x2 experiment has only 4 conditions in total.)

The simplest way to analyse your experiment is to create a single factor with 17 levels, one for each condition, and to analyse your data as in Section 8.3 in the limma User's Guide. This allows you to test any hypothesis you like, including testing for interactions.

If you have lots of contrasts, but you don't tell topTable() which contrast you want to test for, then topTable() will test whether *any* of the contrasts are different from zero. This is analogous to an F-test where the numerator degrees of freedom are the number of contrasts. The help page for topTable() says:

"topTableF ranks genes on the basis of moderated F-statistics for one or more coefficients. If topTable is called with coef that has length greater than 1, then the specified columns will be extracted from fit and topTableF called on the result. topTable with coef=NULL is the same as topTableF, unless the fitted model fit has only one column."

You might find it very helpful to collaborate with a statistical bioinformatician at your own institute, if one is available.

Best wishes
Gordon

> Date: Sat, 24 Nov 2012 06:06:22 -0800 (PST)
> From: "Belisa Santos [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, belisa.santos.duarte at gmail.com
> Subject: [BioC] Help interpreting many contrasts in one contrast matrix versus many individual contrast matrices
>
> Hello everybody,
>
> I am having a hard time interpreting in a meaningful way the output from a contrast matrix with many contrasts versus a small contrast matrix with few contrasts, and how they compare to each other.
>
> # Description of my dataset:
>
> Control: No treatment and time zero (total 6 replicates)
> Treatment A: time1, time2, time3 and time4 (3 replicates each, total 12)
> Treatment AB: time1, time2, time3 and time4 (3 replicates each, total 12)
> Treatment AC: time1, time2, time3 and time4 (3 replicates each, total 12)
> Treatment ABC: time1, time2, time3 and time4 (3 replicates each, total 12)
>
> Total of 54 microarrays, where A, B and C are different compounds used for the growth media of the cells.
>
> - I do not have ONE unique research question. I want to see the effect of time, the effect of treatment and the effect of the interaction time-treatment. Also, I have one very specific question, which is: What is the effect of the interaction BC? (Not interested in the effect of time for this one...)
>
> # My approach:
>
> - I made a design matrix using Control as intercept (so first column (control) filled with 1s)
> - Then made 3 BIG contrast matrices: one for the treatment factor (i.e. all combinations of contrasts between same time, different treatment), one for the time factor (i.e. all combinations of same treatment, different time) and one for the interaction treatment-time (all combinations treatment-time). (Still have to come up with a clever way to find the effect of the interaction BC...)
>
> # My doubts are:
>
> 1) Can I describe my experiment as a 2x2 factorial design (2 factors: time and treatment)? (I ask this because I also have that extra control I used as intercept...)
>
> 2) Am I correct to interpret that, given that I have used the control as intercept in the design matrix, all subsequent contrasts will have the effect of control "subtracted"?
>
> 2.1) Is this a correct approach for my case? (Is this conceptually correct? Is it done frequently? Is it the most elegant way to do it, or are there "better" alternatives?)
>
> 3) Finally, I am having problems interpreting the outcome of my contrasts from the matrices with many contrasts. For example, for my contrast matrix for the treatment factor (there are 24 individual contrasts), when I ask for a topTable (without specifying any particular coefficient), what is exactly the meaning of that list? Are those the union of all the genes that are differently expressed in all contrasts and then ordered? Or is there any other testing done that makes this DEG list more meaningful than just doing individual contrasts, uniting the sets and ordering them... I feel these cannot be the same... but do not know... and I need help to interpret it correctly.
>
> I would really appreciate some help with these doubts. I have read the documentation several times now, but my experimental design is not fully covered by any example... and I would like to be sure that I am analyzing my data correctly.
>
> Thank you in advance for your attention and patience. Kind regards,
>
> Belisa
>
> -- output of sessionInfo():
>
> > sessionInfo()
> R version 2.15.0 (2012-03-30)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] C/en_US.UTF-8/C/C/C/C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] limma_3.14.1 annotate_1.36.0 hgu133plus2cdf_2.11.0 hgu133plus2.db_2.8.0
> [5] org.Hs.eg.db_2.8.0 RSQLite_0.11.2 DBI_0.2-5 AnnotationDbi_1.20.2
> [9] affy_1.36.0 Biobase_2.18.0 BiocGenerics_0.4.0
>
> loaded via a namespace (and not attached):
> [1] BiocInstaller_1.8.3 IRanges_1.16.4 XML_3.95-0.1 affyio_1.26.0
> [5] parallel_2.15.0 preprocessCore_1.20.0 stats4_2.15.0 tools_2.15.0
> [9] xtable_1.7-0 zlibbioc_1.4.0
>
> --
> Sent via the guest posting facility at bioconductor.org.
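For readers who want to see the shape of the single-factor analysis, here is a minimal sketch in R. It is not taken from the original posts: the expression matrix is simulated, and the object names (`group`, `treat_time`, `expr`) and contrast names are made up for illustration. Only three of the possible treatment contrasts are shown. The point is the difference between calling topTable() with and without `coef`.

```r
library(limma)

set.seed(1)

# Hypothetical sample sheet mirroring the design in the question:
# 6 controls plus 4 treatments x 4 time points x 3 replicates = 54 arrays.
treat_time <- as.vector(outer(c("A", "AB", "AC", "ABC"),
                              paste0("t", 1:4), paste, sep = "_"))
group <- factor(c(rep("Control", 6), rep(treat_time, each = 3)),
                levels = c("Control", treat_time))
nlevels(group)  # 17 conditions

# Simulated log-expression values standing in for real, normalised data.
expr <- matrix(rnorm(200 * length(group)), nrow = 200,
               dimnames = list(paste0("gene", 1:200), NULL))

# One coefficient per condition and no intercept (group-means model).
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
fit <- lmFit(expr, design)

# A few treatment contrasts at time 1 (illustrative, not the full set).
cont <- makeContrasts(
  AB_vs_A_t1  = AB_t1 - A_t1,
  AC_vs_A_t1  = AC_t1 - A_t1,
  ABC_vs_A_t1 = ABC_t1 - A_t1,
  levels = design
)
fit2 <- eBayes(contrasts.fit(fit, cont))

# No coef given: genes are ranked by the moderated F-statistic, which
# tests whether any of the three contrasts differs from zero.
tab_F <- topTable(fit2, number = Inf)

# One coef given: genes are ranked for that single contrast only.
tab_t <- topTable(fit2, coef = "AB_vs_A_t1", number = Inf)
```

In this sketch `tab_F` has one log-fold-change column per contrast plus a single F column, so it is a joint test and not a union of separate per-contrast gene lists.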

hgu133plus2 • 1.6k views

updated 13.1 years ago by wang peter ★ 2.0k • written 13.1 years ago by Gordon Smyth 53k

score 0 · Answer 1 · 2012-11-25

I suggest you have a control for each time point.

>> Control: No treatment and time zero (total 6 replicates)
>> Treatment A: time1, time2, time3 and time4 (3 replicates each, total 12)
>> Treatment AB: time1, time2, time3 and time4 (3 replicates each, total 12)
>> Treatment AC: time1, time2, time3 and time4 (3 replicates each, total 12)
>> Treatment ABC: time1, time2, time3 and time4 (3 replicates each, total 12)

Then do pairwise comparisons:

Treatment A vs control: time1, time2, time3 and time4

You can also use the time-zero control for all of time points 1 to 4 to save money, but that is not as good. Then do:

Treatment AB vs control: time1, time2, time3 and time4

Consider just one factor in each comparison; do not consider more than one.

--
shan gao
Room 231 (Dr. Fei lab)
Boyce Thompson Institute
Cornell University
Tower Road, Ithaca, NY 14853-1801
Office phone: 1-607-254-1267 (day)
Official email: sg839 at cornell.edu
Facebook: http://www.facebook.com/profile.php?id=100001986532253

score 0 · Answer 2 · 2012-11-26

On Mon, 26 Nov 2012, Gordon K Smyth wrote:

> Dear Belisa,
>
> Your experiment has 17 different conditions, so you obviously cannot analyse it as a 2x2 experiment. (A 2x2 experiment has only 4 conditions in total.)
>
> The simplest way to analyse your experiment is to create a single factor with 25 levels, and to analyse your data as in Section 8.3 in the limma User's Guide.

That should read "17 levels", one for each condition.

Gordon

> This allows you to test any hypothesis you like, including testing for interactions.
>
> If you have lots of contrasts, but you don't tell topTable() which contrast you want to test for, then topTable() will test whether *any* of the contrasts are different from zero. This is analogous to an F-test where the numerator degrees of freedom are the number of contrasts. The help page for topTable() says:
>
> "topTableF ranks genes on the basis of moderated F-statistics for one or more coefficients. If topTable is called with coef that has length greater than 1, then the specified columns will be extracted from fit and topTableF called on the result. topTable with coef=NULL is the same as topTableF, unless the fitted model fit has only one column."
>
> You might find it very helpful to collaborate with a statistical bioinformatician at your own institute, if one is available.
>
> Best wishes
> Gordon
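With one level per condition, the "interaction BC" question from the original post can be written as a contrast. The sketch below is one possible reading, not something stated in the thread. It assumes the interaction means "the effect of adding B when C is present, minus the effect of adding B when C is absent", i.e. (ABC - AC) - (AB - A) at each time point. The data are simulated and all object and contrast names are hypothetical.

```r
library(limma)

set.seed(2)

# Same hypothetical 17-level layout as in the question: 54 arrays in total.
times <- paste0("t", 1:4)
treat_time <- as.vector(outer(c("A", "AB", "AC", "ABC"),
                              times, paste, sep = "_"))
group <- factor(c(rep("Control", 6), rep(treat_time, each = 3)),
                levels = c("Control", treat_time))

# Simulated log-expression values standing in for real data.
expr <- matrix(rnorm(100 * length(group)), nrow = 100,
               dimnames = list(paste0("gene", 1:100), NULL))

design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
fit <- lmFit(expr, design)

# Assumed definition of the B-by-C interaction at each time point:
# (effect of B in the presence of C) - (effect of B in the absence of C).
bc_formulas <- sprintf("(ABC_%s - AC_%s) - (AB_%s - A_%s)",
                       times, times, times, times)
cont <- makeContrasts(contrasts = bc_formulas, levels = design)
colnames(cont) <- paste0("BC_", times)

fit2 <- eBayes(contrasts.fit(fit, cont))

# Interaction at time 1 only (moderated t-test).
bc_t1 <- topTable(fit2, coef = "BC_t1", number = Inf)

# Interaction at any time point (moderated F-test over the 4 contrasts).
bc_any <- topTable(fit2, number = Inf)
```

Testing the four contrasts jointly is one way to ask about the BC interaction without singling out a time point. Averaging the four contrasts into a single one would be an alternative if an overall direction is of interest.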