Question

Question regarding handling technical replicates for Affy arrays

0

Entering edit mode

noel0925@sbcglobal.net ▴ 90

@noel0925sbcglobalnet-1574

Last seen 9.6 years ago

Hi All, There seems to have been much discussion regarding the statistical issues surrounding handling of technical replicates in gene expression data using the Limma package as pertains to two-color arrays, but less so for Affy data. As such, I am still uncertain regarding how best to handle a single technical replicate in a dataset 40 Affy arrays from 3 tumor types. Clearly, I want to model the biological variation between these three tumor types as a fixed effect. Limma, I understand, is only able to handle 1 random effect. Unlike two-color arrays, single color arrays do not necessitate allocating random effects to within array spot replicates or dye swaps so, I should still be able to model the lack of dependence between these two technical reps as a random effect using the duplicateCorrelation function. It is unclear to me how to specify this single technical replicate however. The examples I have read about in the limma manual or other desciptions of its functions usually deal with cDNA arrays and usually there is a nice structure to the technical replication- eg pairs of technical replicates for each sample (dye swap or otherwise). In my case, I only have one technical rep (and 20, 10, and 10 biolgical reps). What seemed to make sense to me was: >classes<- c(rep(1,20), rep(2,10), rep(3,10)) >f<- factor(targets$Target, levels = c("RNA1", "RNA2", "RNA3")) >design<- model.matrix(~0+f) >colnames(design)<- c("RNA1", "RNA2", "RNA3") >biolrep <- c(1,2,3,4,5,5,6,7,8,9 40) >corfit<- duplicateCorrelation(eset, design, ndups=1, block= biolrep) >fit<-lmFit(eset,design, ndups=1, block=biolrep ,cor=corfit$consensus) >contrast.matrix<- makeContrasts(RNA1-RNA3, RNA2-RNA3,RNA1-RAN2, levels=design) >fit2<- contrasts.fit(fit, contrast.matrix) >fit2<-eBayes(fit2) How else can I specify biolrep? Obviously it would be preferable to model the lack of dependence rather than go with the alternatives. i.) Averaging the two technical reps so as to treat them as primary data. I assume this would be done post-normalization and summarization? ii.) I have also read that simply treating the technical rep as a biological rep is not too dangerous since- the measurement error can be larger than the biological variation- but am hesitant since I want to perform downstream analysis like clustering and don't want an 'extra' sample that really comes from the same patient. iii.) Discard the data from one of the technical reps (worse still). I would greatly appreciate any insight into the best way to handle this issue. Thanks in advance. Noelle

GO Clustering affy limma GO Clustering affy limma • 1.1k views

ADD COMMENT • link 17.9 years ago noel0925@sbcglobal.net ▴ 90