Question regarding handling technical replicates for Affy arrays


Gordon Smyth 53k

@gordon-smyth


WEHI, Melbourne, Australia

Dear Noel,

I'm sorry I didn't fully appreciate from your first email that you have only one replicated sample, because you didn't give the whole biolrep vector. A single replicate is simply not enough to estimate the technical-replicate variance component. You need at least two. That is the reason why duplicateCorrelation() returns a NaN answer.

I guess that most people would average the technical replicates or would choose the "best" one. It isn't likely to make a lot of difference. There's no perfect solution because this isn't a perfect experimental design.

Best wishes
Gordon

> Date: Mon, 29 May 2006 01:20:10 -0700 (PDT)
> From: "noel0925 at sbcglobal.net" <noel0925 at sbcglobal.net>
> Subject: Re: [BioC] Question regarding handling technical replicates for Affy arrays
> To: bioconductor at stat.math.ethz.ch
>
> Hi Gordon,
>
> Thank you for your reply.
>
> Actually, I have looked at the consensus correlation and I obtain [1] NaN. This doesn't seem sensible.
>
> Perhaps I have specified the biological replicates incorrectly. The description of dupcor states that "Typically the blocks are biological replicates and the repeated observations are technical replicates." As such, I thought that it made sense to create a vector of the replicates as follows:
>
> biolrep <- c(1,2,3,4,5,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,
>              21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39)
>
> Thus, there are 39 DIFFERENT RNA samples and 1 sample which is replicated (hybridized to two different arrays). The number 5 is repeated twice since it is the only sample for which there is a technical replicate. Samples 1-20 are of RNA1, samples 21-30 are RNA2, and samples 31-40 are RNA3.
>
> Have I specified the biological replicates properly? The "biolrep" examples I have seen in the literature confused me a bit since they seem to specify both a block of biological replicates and technical reps within those blocks. But the cases given are for two-color arrays. For example, in section 23.5 of the Limma book chapter, the first example is for the case where two wt and two mut mice from the same strain are compared using two arrays for each pair, so that the 1st and 2nd and the 3rd and 4th are technical reps. So here,
>
> biolrep <- c(1,1,2,2)
>
> This is different, however, from the Affy data I describe, since the 3 different genotypes are all on separate arrays rather than both wt and mut on the same array.
>
> If I do:
>
> biolrep <- c(1,2,3,4,5,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,
>              21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39)
>
> then corfit$consensus yields NaN. Though the biological reps are not explicitly defined here, I would assume they are inferred from
>
> f <- factor(targets$Target, levels = c("RNA1", "RNA2", "RNA3"))
>
> If I do:
>
> biolrep <- c(rep(1,20), rep(2,10), rep(3,10))
>
> then corfit$consensus yields Inf, and this does not indicate which arrays are technical reps.
>
> Any further insight you could offer would be great.
> Thanks very much,
>
> Noelle
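As a rough illustration of the option Gordon mentions, averaging the technical replicates, here is a minimal base R sketch. The expression matrix `y` is simulated and is only a stand-in for the poster's normalized Affy data; the block vector is the one quoted in the thread. limma's avearrays() function is intended for the same job, but the sketch avoids depending on it.

```r
# Simulated stand-in for a matrix of log-expression values:
# 100 probe sets (rows) by 40 arrays (columns). This is NOT the poster's data.
set.seed(1)
y <- matrix(rnorm(100 * 40), nrow = 100, ncol = 40)

# Block vector from the thread: RNA sample 5 was hybridized to two arrays.
biolrep <- c(1:5, 5:39)
stopifnot(length(biolrep) == ncol(y))

# Count arrays per RNA sample and list the samples with more than one array.
block_sizes <- table(biolrep)
replicated <- names(block_sizes)[block_sizes > 1]

# Average the columns that share a block, giving one column per RNA sample.
# rowsum() adds up rows by group, so transpose, sum, divide by the group
# sizes, and transpose back.
sums  <- rowsum(t(y), group = biolrep)        # 39 x 100, rows ordered by block
y_avg <- t(sums / as.vector(block_sizes))     # 100 x 39

dim(y_avg)
```

After averaging, the group factor has to be reduced in the same way (one entry per RNA sample) before fitting the linear model.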




Naomi Altman ★ 6.0k

@naomi-altman-380


United States

Dear Noel,

If you average the technical replicates, the averages have somewhat less variance than the data for the samples with only 1 array. So you are probably best off doing 2 analyses, selecting each array in turn. This may lead to slightly different "top gene" lists, but it will be better than selecting the "best" array and not understanding how this affects your analysis.

Since you appear to have adequate biological replication, the two results should be very similar. You should take a careful gene-by-gene look at any genes that show up as statistically significant in one of the analyses but not the other.

--Naomi

Naomi S. Altman, 814-865-3791 (voice)
Associate Professor
Dept. of Statistics, 814-863-7114 (fax)
Penn State University, 814-865-1348 (Statistics)
University Park, PA 16802-2111
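The two-analysis approach Naomi describes might be sketched in base R as follows. Everything here is illustrative: the data are simulated, the positions of the technical pair (arrays 5 and 6) are taken from the biolrep vector in the thread, and the hypothetical helper `top_genes()` uses a plain per-gene one-way ANOVA as a stand-in for whatever analysis is actually run (for example a limma fit).

```r
# Simulated stand-in data: 200 probe sets by 40 arrays, three RNA groups
# laid out as in the thread (arrays 1-20, 21-30, 31-40).
set.seed(2)
y <- matrix(rnorm(200 * 40), nrow = 200, ncol = 40)
rownames(y) <- paste0("gene", seq_len(200))
group <- factor(rep(c("RNA1", "RNA2", "RNA3"), times = c(20, 10, 10)))

# Arrays 5 and 6 are the two hybridizations of the same RNA sample.
tech_pair <- c(5, 6)

# Hypothetical stand-in analysis: per-gene one-way ANOVA p-value,
# returning the names of the n genes with the smallest p-values.
top_genes <- function(y, group, n = 20) {
  p <- apply(y, 1, function(g) anova(lm(g ~ group))[["Pr(>F)"]][1])
  names(sort(p))[seq_len(n)]
}

# Run the analysis twice, leaving out one member of the pair each time,
# so each run uses exactly one array per RNA sample.
tops <- lapply(tech_pair, function(drop) {
  top_genes(y[, -drop, drop = FALSE], group[-drop])
})

# Genes that appear in only one of the two lists deserve a closer look.
only_first  <- setdiff(tops[[1]], tops[[2]])
only_second <- setdiff(tops[[2]], tops[[1]])
```

With adequate biological replication the two lists would be expected to overlap heavily, so `only_first` and `only_second` should be short.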

