2x2 factorial loop without common reference (pool)


Gordon Smyth 50k

@gordon-smyth

Last seen 9 hours ago

WEHI, Melbourne, Australia

Dear Francois,

Let's assume you're working with your second (simpler) targets file. All you need is

    design <- modelMatrix(targets, ref="a")

This will give you a matrix with columns "b", "c" and "d", which actually represent b vs a, c vs a and d vs a. To get the four comparisons you ask for, you need

    cont.matrix <- makeContrasts(avsb=b, avsc=c, bvsd=d-b, cvsd=d-c, levels=design)

then

    fit <- lmFit(MA, design)
    fit2 <- contrasts.fit(fit, cont.matrix)
    fit2 <- eBayes(fit2)

If you wanted the interaction term, this would be: int=d-b-c (that is, (d-c) minus (b-a), with a as the reference).

Best wishes
Gordon

PS. Although you don't say explicitly, I'm assuming that a1, a2 etc represent some sort of biological replication. The above analysis does not keep track of which array has which biological replicate of each treatment. If you wanted to do a careful job of that, you would have no choice but to do a "separate channel" analysis, as Naomi Altman has suggested separately. If your biological replicates a1, a2 etc are not very different, compared to microarray measurement error, then the above simpler analysis may be good enough.

> Date: Sun, 23 Apr 2006 13:41:22 -0400
> From: "francois fauteux" <francois.fauteux at gmail.com>
> Subject: [BioC] 2x2 factorial loop without common reference (pool)
> To: bioconductor at stat.math.ethz.ch, "François fauteux" <francois.fauteux at gmail.com>, "Richard Bélanger" <richard.belanger at plg.ulaval.ca>
> Message-ID: <53328b400604231041v51db3863i8bb48b2fbf725229 at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi;
>
> We are doing an experiment with agilent 44K (3 biological reps, complete dye-swap):
>
> a - control
> b - treatment 1
> c - treatment 2
> d - treatment 1 + treatment 2
>
> and I would like to output evidence of the interaction between two treatments and effect on gene expression.
>
> 24 chips:
>
> SlideNumber  Cy3  Cy5
> 1            a1   b1
> 2            a2   b2
> 3            a3   b3
> 4            b1   a1
> 5            b2   a2
> 6            b3   a3
> 7            a1   c1
> 8            a2   c2
> 9            a3   c3
> 10           c1   a1
> 11           c2   a2
> 12           c3   a3
> 13           b1   d1
> 14           b2   d2
> 15           b3   d3
> 16           d1   b1
> 17           d2   b2
> 18           d3   b3
> 19           c1   d1
> 20           c2   d2
> 21           c3   d3
> 22           d1   c1
> 23           d2   c2
> 24           d3   c3
>
> I've done several tests with limma to isolate significant results in the following:
> 1- a vs b;
> 2- a vs c;
> 3- b vs d;
> 4- c vs d;
>
> with this "targets.txt":
>
> SlideNumber  Cy3  Cy5
> 1            a    b
> 2            a    b
> 3            a    b
> 4            b    a
> 5            b    a
> 6            b    a
> 7            a    c
> 8            a    c
> 9            a    c
> 10           c    a
> 11           c    a
> 12           c    a
> 13           b    d
> 14           b    d
> 15           b    d
> 16           d    b
> 17           d    b
> 18           d    b
> 19           c    d
> 20           c    d
> 21           c    d
> 22           d    c
> 23           d    c
> 24           d    c
>
> First option:
>
>     f <- paste(targets$Cy3, targets$Cy5, sep = ".")
>     f <- factor(f, levels = c("a.b", "b.a", "a.c", "c.a", "b.d", "d.b", "c.d", "d.c"))
>     design1 <- model.matrix(~0 + f)
>     design1
>
>     a.b b.a a.c c.a b.d d.b c.d d.c
> 1    1   0   0   0   0   0   0   0
> 2    1   0   0   0   0   0   0   0
> 3    1   0   0   0   0   0   0   0
> 4    0   1   0   0   0   0   0   0
> 5    0   1   0   0   0   0   0   0
> 6    0   1   0   0   0   0   0   0
> 7    0   0   1   0   0   0   0   0
> 8    0   0   1   0   0   0   0   0
> 9    0   0   1   0   0   0   0   0
> 10   0   0   0   1   0   0   0   0
> 11   0   0   0   1   0   0   0   0
> 12   0   0   0   1   0   0   0   0
> 13   0   0   0   0   1   0   0   0
> 14   0   0   0   0   1   0   0   0
> 15   0   0   0   0   1   0   0   0
> 16   0   0   0   0   0   1   0   0
> 17   0   0   0   0   0   1   0   0
> 18   0   0   0   0   0   1   0   0
> 19   0   0   0   0   0   0   1   0
> 20   0   0   0   0   0   0   1   0
> 21   0   0   0   0   0   0   1   0
> 22   0   0   0   0   0   0   0   1
> 23   0   0   0   0   0   0   0   1
> 24   0   0   0   0   0   0   0   1
>
> This gives significant results for each one of the "levels" but does not take into account the dye-swap (i.e. "a.b" and "b.a" are considered independent).
>
> Other tested option is:
>
>     design2 <- modelMatrix(targets, ref="a")
>     design2
>
>       p   s  sp
> ab1   0   1   0
> ab2   0   1   0
> ab3   0   1   0
> ba1   0  -1   0
> ba2   0  -1   0
> ba3   0  -1   0
> ac1   1   0   0
> ac2   1   0   0
> ac3   1   0   0
> ca1  -1   0   0
> ca2  -1   0   0
> ca3  -1   0   0
> bd1   0  -1   1
> bd2   0  -1   1
> bd3   0  -1   1
> db1   0   1  -1
> db2   0   1  -1
> db3   0   1  -1
> cd1  -1   0   1
> cd2  -1   0   1
> cd3  -1   0   1
> dc1   1   0  -1
> dc2   1   0  -1
> dc3   1   0  -1
>
> This gives results for "b" effect, "c" effect, and "d" effect. However, I couldn't get results for the 4 comparisons of interest (even though the matrix is coherent).
>
> Questions:
>
> 1 - What would be the best option (design and operations) to get to the contrasts of interest, considering that the experiment has 4 treatments in a factorial design without common reference (a vs b, a vs c, b vs d, c vs d), and taking into account the dye effect?
>
> 2 - Is this method (4 contrasts) the best one, considering that treatment "d" is a combination of treatments "b" and "c" (factorial type design)? How could one directly identify genes differentially expressed due to the interaction between treatment "b" and treatment "c" (i.e. effect of "d" over "b" and "c")?
>
> In the Limma Users Guide and elsewhere on this forum, I could not find a clear description of how this type of analysis should be performed, even though it is a simple design (i.e. 2x2 factorial without a common reference, two-color arrays, complete dye swap).
>
> Thanks for your time, best regards.
>
> François Fauteux
> Master's student in plant biology
> Centre de recherche en horticulture
> Université Laval
> francois.fauteux at gmail.com
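The design and contrasts in Gordon's reply can be checked by hand. What follows is a minimal base-R sketch, not limma itself: it assumes the simplified targets file, builds the reference design (ref = "a") manually instead of calling modelMatrix(), and uses made-up, noise-free effects (b = 1, c = 2, d = 5 relative to a) for a single hypothetical gene. The names cy3, cy5, beta, M, coef_hat and est are illustrative only.

```r
# Dye assignments for the 24 arrays, as in the simplified targets file.
cy3 <- rep(c("a", "b", "a", "c", "b", "d", "c", "d"), each = 3)
cy5 <- rep(c("b", "a", "c", "a", "d", "b", "d", "c"), each = 3)

# Hand-built reference design with "a" as reference: each array's log-ratio
# (Cy5 over Cy3) gets +1 for the treatment in Cy5 and -1 for the one in Cy3.
design <- sapply(c("b", "c", "d"), function(trt) {
  as.numeric(cy5 == trt) - as.numeric(cy3 == trt)
})

# The four contrasts from the reply plus the interaction, written explicitly.
cont.matrix <- cbind(
  avsb = c(b =  1, c =  0, d = 0),
  avsc = c(b =  0, c =  1, d = 0),
  bvsd = c(b = -1, c =  0, d = 1),
  cvsd = c(b =  0, c = -1, d = 1),
  int  = c(b = -1, c = -1, d = 1)
)

# Hypothetical noise-free log-ratios for one gene with known effects vs a.
beta <- c(b = 1, c = 2, d = 5)
M <- drop(design %*% beta)

# Least-squares fit, then apply the contrasts to the fitted coefficients.
coef_hat <- lm.fit(design, M)$coefficients
est <- drop(crossprod(cont.matrix, coef_hat))
print(round(est, 6))
```

With these assumed effects the recovered contrasts are 1, 2, 4, 3 and 2. Note that each contrast is oriented as "second minus first" (avsb estimates b minus a), and that the dye-swapped arrays simply enter the design with the opposite sign rather than as separate groups.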

Microarray limma • 945 views

Written 18.0 years ago by Gordon Smyth • updated 18.0 years ago by Jenny Drnevich


Jenny Drnevich ★ 2.2k

@jenny-drnevich-382

Last seen 9.6 years ago

Hi everyone,

Comments from Naomi and Gordon (below) about the technical replication in the 2x2 factorial loop experiment are very close to an issue I have been struggling with for several analyses: when (if ever) is it OK to treat technical replicates as biological replicates? Often this is done when there is more than one random effect (e.g. there are also duplicate spots, blocking effects, etc.) because, as Gordon has said previously, the between-gene smoothing of limma cannot currently be done with more than one random effect. I know there have been many discussions of this on the list previously, but I can see two problems with treating tech reps as biological reps, and only one of them has been addressed:

1. There is likely to be artificially decreased variance within treatment groups, because tech reps should have higher correlations than biological reps. This problem has been addressed several times, and probably the best answer has come from Gordon along the lines of: often measurement error is larger than biological variation, so IF there are not higher correlations among tech reps, then variance estimates should not be artificially decreased.

2. The DF is artificially increased due to pseudoreplication of the biological replicates, which leads to artificially lower p-values. This, combined with even minor changes to the variance components, can lead to large changes in p-values in my experience. As far as I know, this second problem has not been addressed.

As a case in point, in the 2x2 factorial loop from before, each of the three biological replicates has 4 technical replicates, and even if there are not higher correlations, treating them as biological reps yields N=12 for each group instead of N=3. Shouldn't we be worried about this effect as well?

In such cases, when the experimental design really has more than one random effect, wouldn't the analysis be better off modelling the random effects properly with a multilevel model such as lme/nlme, rather than getting the benefits of the empirical Bayes shrinkage either through ignoring technical replication or averaging dye swaps?

Thanks,
Jenny

Naomi's comment:

> I would use single channel analysis for this. The only problem is that limma allows only 1 level of random effects. Hence, you will need to average the dye-swaps.

Gordon's comment:

> PS. Although you don't say explicitly, I'm assuming that a1, a2 etc represent some sort of biological replication. The above analysis does not keep track of which array has which biological replicate of each treatment. If you wanted to do a careful job of that, you would have no choice but to do a "separate channel" analysis, as Naomi Altman has suggested separately. If your biological replicates a1, a2 etc are not very different, compared to microarray measurement error, then the above simpler analysis may be good enough.

Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801 USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu
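To make the pseudoreplication concern concrete, here is a minimal base-R simulation sketch. It is a deliberately simplified single-channel analogue rather than the two-colour loop itself: two groups with no true difference, 3 biological units per group and 4 technical measurements per unit. The function name sim_pvalues and the standard deviations (sd_bio = 1, sd_tech = 0.5) are hypothetical choices for illustration. It compares a t-test that treats all 12 measurements per group as independent with one that first averages the technical replicates.

```r
set.seed(1)

# Simulate one gene under the null and return p-values from both analyses.
sim_pvalues <- function(n_bio = 3, n_tech = 4, sd_bio = 1, sd_tech = 0.5) {
  unit  <- rep(seq_len(2 * n_bio), each = n_tech)       # biological unit id
  group <- rep(c("ctl", "trt"), each = n_bio * n_tech)  # treatment label

  # Each measurement = biological unit effect + technical error.
  y <- rnorm(2 * n_bio, sd = sd_bio)[unit] + rnorm(length(unit), sd = sd_tech)

  # Naive analysis: every measurement treated as an independent replicate.
  p_naive <- t.test(y ~ group, var.equal = TRUE)$p.value

  # Averaged analysis: one mean per biological unit, so N = n_bio per group.
  ybar <- as.vector(tapply(y, unit, mean))
  gbar <- rep(c("ctl", "trt"), each = n_bio)
  p_avg <- t.test(ybar ~ gbar, var.equal = TRUE)$p.value

  c(naive = p_naive, averaged = p_avg)
}

# Proportion of null simulations declared significant at the 5% level.
p <- replicate(2000, sim_pvalues())
fpr <- rowMeans(p < 0.05)
print(fpr)
```

Under these assumed variance components the naive analysis should reject well above the nominal 5% rate, while the averaged analysis should stay close to it. Setting sd_bio to 0 in sim_pvalues() gives the situation Gordon describes, where technical replicates are no more correlated than biological ones.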

Written 18.0 years ago by Jenny Drnevich
