Question

Limma design for single channel arrays with technical repeats

Giulio Di Giovanni ▴ 10

@giulio-di-giovanni-6017

Last seen 11.5 years ago

Hi all,

I work with one-channel arrays, where on each slide the sample is repeated three times (in three blocks called subarrays). To perform the differential expression analysis (usually a two-group comparison), up to now we have always averaged the three repetitions (after normalization). But I have always wanted to make more effective use of the information that we get from the repetitions.

For example, we now have two groups to compare, 15 + 15 samples. Looking at the user guide and the mailing list archive, I came up with this design.

    # group 2 vs. 3
    require(limma)
    res.P23 <- cbind(res.grp2, res.grp3)
    CR23 <- c(rep("g2", ncol(res.grp2)), rep("g3", ncol(res.grp3)))
    design23 <- model.matrix(~0+factor(CR23))
    colnames(design23) <- unique(CR23)
    biolrep23 <- sort(rep(1:30,3)) # 1,1,1,2,2,2,3,3,3, ..., 30,30,30
    corfit23 <- duplicateCorrelation(res.P23, design23, ndups=1, block=biolrep23)
    fit23 <- lmFit(res.P23, design23, ndups=1, block=biolrep23, cor=corfit23$consensus)
    cont.matrix23 <- makeContrasts(g2vsg3 = g2-g3, levels=design23)
    fit2.23 <- contrasts.fit(fit23, cont.matrix23)
    fit2.23 <- eBayes(fit2.23)

I know that it is a banal question, but I would like to know if the formulation of the biolrep sequence (and the rest) is correct. I ask because I get really different results compared with the analysis done using averaged arrays. I tried a few different datasets, with very different corfit$consensus values. I'm not shocked by these results, on the contrary, but I just want to be sure, especially after reading somewhere that it shouldn't change that much.

I thank you in advance.
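The design above relies on the columns of res.P23 being ordered so that the three subarrays of each sample sit next to each other; otherwise a positional block vector would mislabel the technical repeats. A minimal base-R sketch of one way to check this follows. The column-name scheme ("s01_a", "s01_b", ...) is a hypothetical assumption, since the post does not show the actual column names.

```r
# Hypothetical column names: sample ID, underscore, subarray letter.
# This naming scheme is an assumption; the post does not show the names of res.P23.
sample_ids <- sprintf("s%02d", 1:30)
col_names  <- paste(rep(sample_ids, each = 3), c("a", "b", "c"), sep = "_")

# Recover the biological sample of each column from its name.
sample_of_col <- sub("_[abc]$", "", col_names)

# Build the block vector from the names rather than from column position.
biolrep_from_names <- as.integer(factor(sample_of_col, levels = unique(sample_of_col)))

# The positional block vector from the post.
biolrep23 <- sort(rep(1:30, 3))

# TRUE only when the three subarrays of every sample occupy adjacent columns.
blocks_match <- identical(biolrep_from_names, biolrep23)
print(blocks_match)
```

If the check fails on real data, deriving the block vector from the column names (as biolrep_from_names does here) is likely safer than relying on position.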

• 1.3k views

updated 12.6 years ago by Gordon Smyth 53k • written 12.6 years ago by Giulio Di Giovanni ▴ 10

score 0 · Answer 1 · 2013-06-27

score 0 · Answer 2 · 2013-06-29

Dear Giulio,

I do not see any obvious problems with your code.

duplicateCorrelation will give different results from simple averaging. It will downweight genes that are inconsistent between technical repeats, which averaging cannot do, and it will usually give more statistical significance. The method is primarily designed to rescue experiments with small numbers of biological replicates. For your experiment with n=15 in each group, you may get good results without it.

Best wishes
Gordon

> Date: Thu, 27 Jun 2013 16:28:07 +0200
> From: Giulio Di Giovanni <de.molay at hotmail.com>
> To: "bioconductor at r-project.org" <bioconductor at r-project.org>
> Subject: [BioC] Limma design for single channel arrays with technical repeats
>
> Hi all,
>
> I work with one-channel arrays, where on each slide the sample is
> repeated three times (in three blocks called subarrays). In order to
> perform the differential expression analysis (usually a two-groups
> comparison), up to now we always averaged the three repetitions (after
> normalization). But I always wanted to use more effectively the
> information that we get from the repetitions.
>
> For example now we have two groups to compare, 15 + 15 samples.
> Looking at the userguide and at the mailing list archive I came up
> with this design.
>
> # group 2 vs. 3
>
> require(limma)
> res.P23 <- cbind(res.grp2, res.grp3)
> CR23 <- c(rep("g2", ncol(res.grp2)), rep("g3", ncol(res.grp3)))
> design23 <- model.matrix(~0+factor(CR23))
> colnames(design23) <- unique(CR23)
> biolrep23 <- sort(rep(1:30,3)) # 1,1,1,2,2,2,3,3,3, ..., 30,30,30

Or

biolrep12 <- rep(1:30, each=3)

> corfit23 <- duplicateCorrelation(res.P23, design23, ndups=1, block=biolrep23)
> fit23 <- lmFit(res.P23, design23, ndups=1, block=biolrep23, cor=corfit23$consensus)
> cont.matrix23 <- makeContrasts(g2vsg3 = g2-g3, levels=design23)
> fit2.23 <- contrasts.fit(fit23, cont.matrix23)
> fit2.23 <- eBayes(fit2.23)
>
> I know that it is a banal question, but I would like to know if the
> formulation of the biolrep sequence (and the rest) is correct. This
> because I get really different results compared with the analysis done
> by using averaged arrays. I tried with a few different datasets, with
> very different corfit$consensus value. I'm not shocked by these results,
> at the contrary, but I just want to be sure, especially after I've read
> somewhere that it shouldn't change that much.
>
> I thank you in advance,
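To make the comparison in Gordon's reply concrete, here is a small base-R sketch of the averaging route that the blocked analysis is being contrasted with. It also confirms that the two ways of writing the block vector agree. The data are simulated and the matrix dimensions are made up for illustration; nothing here reproduces the poster's actual results.

```r
# Simulated stand-in for res.P23: 100 probes x 90 columns (30 samples x 3 subarrays).
# The values and dimensions are invented for illustration only.
set.seed(1)
x <- matrix(rnorm(100 * 90), nrow = 100, ncol = 90)

# The two ways of writing the block vector give the same result.
biolrep_sorted <- sort(rep(1:30, 3))
biolrep_each   <- rep(1:30, each = 3)
same_blocks <- identical(biolrep_sorted, biolrep_each)

# Average the three subarrays of each sample: sum columns by block, divide by 3.
x_avg <- t(rowsum(t(x), group = biolrep_each)) / 3

# Group labels for the averaged data: one per sample instead of one per subarray.
# Assumes, as in the post, that the first 15 samples are g2 and the rest g3.
CR23_avg <- rep(c("g2", "g3"), each = 15)
design_avg <- model.matrix(~ 0 + factor(CR23_avg))
colnames(design_avg) <- unique(CR23_avg)

print(same_blocks)
print(dim(x_avg))
print(dim(design_avg))
```

In the averaged analysis, x_avg and design_avg would go to lmFit without the block and correlation arguments, since each column is then an independent biological sample. limma also provides avearrays for averaging technical replicate columns, which may be preferable to the hand-rolled rowsum step used here.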