paired microarray samples

0

Entering edit mode

Chris Fjell ▴ 60

@chris-fjell-3696

Last seen 9.6 years ago

I've a question on how to properly deal with paired samples in limma. For example, I've got an experiment with 3 blood donors (D1, D2, D3) and 4 treatments (T1, T2, T3, T4) of the blood cells. One single-channel array is done for each donor-treatment (e.g. D1 cells treated with T2). In the past I've treated these as simple biological replicates, so the code flow is like: treatments=pData(BSData.quantile)$Sample_Group # T1, T2, T3, T4, from Illumina beadarray data treatments=as.factor(treatments) design=model.matrix(~0+treatments) colnames(design)=levels(treatments) fit = lmFit(exprs(BSData.quantile), design) cont.matrix = makeContrasts(T1vT2 = T1 - T2, T3vT1 = T3 - T1, levels = design) fit = contrasts.fit(fit, cont.matrix) ebFit = eBayes(fit) for( comp in colnames( cont.matrix ) ) { tt = topTable(ebFit, coef = comp) ... [write results to file...] } But donors are responding quite differently so I don't want to simple pool all the treatments, I want to consider response of each donor, i.e. "Paired Samples". I found this snippet on the BioC newsgroup at https://stat.ethz.ch/pipermail/bioconductor/2006-May/012969.html Calculate the logFC ratios manually per donor, then pass it to lmFit() - seems like the standard paired t-test. For my example, for the comparison of T1 to T2 exM = exprs(BSData.quantile) diffD1 = exM[, which( donors == "D1" & samples == "T1" )] - exM[, which( donors == "D1" & samples == "T2" )] diffD2 = exM[, which( donors == "D2" & samples == "T1" )] - exM[, which( donors == "D2" & samples == "T2" )] diffD3 = exM[, which( donors == "D3" & samples == "T1" )] - exM[, which( donors == "D3" & samples == "T2" )] paired_ex = cbind( diffD1, diffD2, diffD3) fit = lmFit(paired_ex) The example cited above also uses "block = origin, correlation = dupcor$cons" parameters to lmFit() but this is for technical replicates (?) This works but I get dramatically worse p-values (due I assume to using half the number of "samples": 3 ratios tested against 0 versus testing two groups of 3). But the limma user's guide (Oct 22, 2008 version, Chapter 8, p.40) gives another example, using model.matrix() not makeContrasts(): > SibShip <- factor(targets$SibShip) > Treat <- factor(targets$Treatment, levels=c("C","T")) > design <- model.matrix(~SibShip+Treat) > fit <- lmFit(eset, design) > fit <- eBayes(fit) > topTable(fit, coef="TreatT") I do the similar thing for my data ... design_paired_treatments = model.matrix( ~donors+treatments ) fit_ps = lmFit( exprs(BSData.quantile),design_paired_treatments) fit_ps = eBayes( fit_ps ) and look at the columns of my design matrix and it's like colnames( design_paired_treatments ) [1] "(Intercept)" "donorsD1" "donorsD2" "treatmentsT1" [5] "treatmentsT2" "treatmentsT3" Donor D3 and treatment T4 don't appear in the design. and I get some results... tt = topTable( fit_ps, coef="donorsD1") tt ID logFC AveExpr t P.Value adj.P.Val B 6370315 -6.58 8.74 -119.4 4.26e-17 2.08e-12 17.6 7160474 3.59 8.77 56.5 7.43e-14 1.37e-09 16.6 5260484 -5.69 8.71 -55.8 8.39e-14 1.37e-09 16.6 This is a regression model, I think... I can see why the two terms (donor D3 and treatment T4) disappeared - they have to be restated in terms of the rest, if I remember right for doing ANOVA with indicator variables. Is there a recommended way to do this with limma? Or another package? Thanks for any advice. -Chris

Regression limma beadarray Regression limma beadarray • 1.6k views

ADD COMMENT • link updated 14.5 years ago by Naomi Altman ★ 6.0k • written 14.5 years ago by Chris Fjell ▴ 60

0

Entering edit mode

Naomi Altman ★ 6.0k

@naomi-altman-380

Last seen 3.0 years ago

United States

Have a look at the randomized complete block example. Donor is a block. --Naomi At 03:05 PM 10/16/2009, Chris Fjell wrote: >I've a question on how to properly deal with paired samples in limma. > >For example, I've got an experiment with 3 blood donors (D1, D2, D3) and >4 treatments (T1, T2, T3, T4) of the blood cells. One single-channel >array is done for each donor-treatment (e.g. D1 cells treated with T2). > >In the past I've treated these as simple biological replicates, so the >code flow is like: > >treatments=pData(BSData.quantile)$Sample_Group # T1, T2, T3, T4, from >Illumina beadarray data >treatments=as.factor(treatments) >design=model.matrix(~0+treatments) >colnames(design)=levels(treatments) >fit = lmFit(exprs(BSData.quantile), design) >cont.matrix = makeContrasts(T1vT2 = T1 - T2, T3vT1 = T3 - T1, levels = >design) >fit = contrasts.fit(fit, cont.matrix) >ebFit = eBayes(fit) > >for( comp in colnames( cont.matrix ) ) { > tt = topTable(ebFit, coef = comp) > ... > [write results to file...] >} > > >But donors are responding quite differently so I don't want to simple >pool all the treatments, I want to consider response of each donor, i.e. >"Paired Samples". > > > >I found this snippet on the BioC newsgroup at >https://stat.ethz.ch/pipermail/bioconductor/2006-May/012969.html >Calculate the logFC ratios manually per donor, then pass it to lmFit() - >seems like the standard paired t-test. > >For my example, for the comparison of T1 to T2 >exM = exprs(BSData.quantile) >diffD1 = exM[, which( donors == "D1" & samples == "T1" )] - exM[, which( >donors == "D1" & samples == "T2" )] >diffD2 = exM[, which( donors == "D2" & samples == "T1" )] - exM[, which( >donors == "D2" & samples == "T2" )] >diffD3 = exM[, which( donors == "D3" & samples == "T1" )] - exM[, which( >donors == "D3" & samples == "T2" )] > >paired_ex = cbind( diffD1, diffD2, diffD3) >fit = lmFit(paired_ex) > >The example cited above also uses "block = origin, correlation = >dupcor$cons" parameters to lmFit() but this is for technical replicates (?) >This works but I get dramatically worse p-values (due I assume to using >half the number of "samples": 3 ratios tested against 0 versus testing >two groups of 3). > > >But the limma user's guide (Oct 22, 2008 version, Chapter 8, p.40) gives >another example, using model.matrix() not makeContrasts(): > > > SibShip <- factor(targets$SibShip) > > Treat <- factor(targets$Treatment, levels=c("C","T")) > > design <- model.matrix(~SibShip+Treat) > > fit <- lmFit(eset, design) > > fit <- eBayes(fit) > > topTable(fit, coef="TreatT") > >I do the similar thing for my data ... > >design_paired_treatments = model.matrix( ~donors+treatments ) >fit_ps = lmFit( exprs(BSData.quantile),design_paired_treatments) >fit_ps = eBayes( fit_ps ) > >and look at the columns of my design matrix and it's like >colnames( design_paired_treatments ) >[1] "(Intercept)" "donorsD1" "donorsD2" "treatmentsT1" >[5] "treatmentsT2" "treatmentsT3" > >Donor D3 and treatment T4 don't appear in the design. > >and I get some results... >tt = topTable( fit_ps, coef="donorsD1") >tt > ID logFC AveExpr t P.Value adj.P.Val B > 6370315 -6.58 8.74 -119.4 4.26e-17 2.08e-12 17.6 > 7160474 3.59 8.77 56.5 7.43e-14 1.37e-09 16.6 > 5260484 -5.69 8.71 -55.8 8.39e-14 1.37e-09 16.6 > >This is a regression model, I think... I can see why the two terms >(donor D3 and treatment T4) disappeared - they have to be restated in >terms of the rest, if I remember right for doing ANOVA with indicator >variables. > > > >Is there a recommended way to do this with limma? Or another package? > >Thanks for any advice. > >-Chris > >_______________________________________________ >Bioconductor mailing list >Bioconductor at stat.math.ethz.ch >https://stat.ethz.ch/mailman/listinfo/bioconductor >Search the archives: >http://news.gmane.org/gmane.science.biology.informatics.conductor Naomi S. Altman 814-865-3791 (voice) Associate Professor Dept. of Statistics 814-863-7114 (fax) Penn State University 814-865-1348 (Statistics) University Park, PA 16802-2111

ADD COMMENT • link 14.5 years ago Naomi Altman ★ 6.0k

Login before adding your answer.