Question

technical reps, limma - theory

0

Entering edit mode

Gordon Smyth 52k

@gordon-smyth

Last seen 2 hours ago

WEHI, Melbourne, Australia

Hi Naomi, The limma duplicateCorrelation() function fits the random effects model as a general mixed model by maximising the REML likelihood. This is the same as lme() and lmer() etc. This is equivalent to the ANOVA approach in the balanced case, although not otherwise. Once you get to lmFit(), a contraint has been placed on the random effects model (common within-block correlation across genes) to reduce it to a single error component model at the genewise level. So by this time limma is already fitting a different model to the usual genewise random effects approach. The residual degrees of freedom returned by lmFit() are correct for this single error component approach. I am sorry that the theory behind the limma approach to technical reps has not yet been written up and published. I am trying to find time to do this. The theory is closely analogous to the within-array duplicate spot theory which is published in Bioinformatics. Gordon > Date: Tue, 19 Jul 2005 13:13:50 -0400 > From: Naomi Altman <naomi at="" stat.psu.edu=""> > Subject: [BioC] technical reps, limma - theory > To: bioconductor at stat.math.ethz.ch > > Consider the simple one-way design, with biological and technical > reps. Generally we would consider the biological reps to be blocks. (For > simplicity, we can think of this as a one-channel analysis). The usual > ANOVA seems to be at odds with what we get from limma (ignoring the eBayes > step, which clearly cannot be recovered from the classical treatment). > > The usual ANOVA (t treatments, b biological reps, n technical reps within > biological reps, giving ntb arrays) > > source df MS F > > treatment t-1 MS(T) MS(T)/MS(B) > bio rep b-1 MS(B) MS(B)/MSE (although we don't > really care about this) > error=T*B ntb-t-b+1 MSE > > However, the Limma manual suggests using duplicateCorrelation and block to > handle block designs, and this gives a different ANOVA. In particular, the > error d.f. for this ANOVA is ntb-t. Using this method, the within block > correlation is used in computing the t-statistics for the treatment, so you > do not get the simple 1-way ANOVA that would come from ignoring block, but > you cannot recover the p-value from the usual ANOVA, either. > > If you put in bio rep as a fixed factor, then Limma will use the MSE as the > denominator for the contrast tests, so this also does not recover the ANOVA. > > I have not tried this computation with replicate spots (only replicate > arrays) but either: > > 1) the usual ANOVA is right and duplicateCorrelation is doing something odd > 2) duplicate Correlation is right and I don't understand the usual ANOVA > 3) both methods are correct for somewhat different models, and I don't > understand the statistical implications of this > > I am not discounting 3 - as the statisticians in the crowd know, there are > 2 versions, constrained and unconstrained, for the simplest case of > balanced ANOVA with fixed and random effects which lead to different > p-values for some effects and both are defensible for almost any data set. > > Anyways, I would like to understand this better. So, I would welcome comments. > > Naomi S. Altman 814-865-3791 (voice) > Associate Professor > Bioinformatics Consulting Center > Dept. of Statistics 814-863-7114 (fax) > Penn State University 814-865-1348 (Statistics) > University Park, PA 16802-2111

limma limma • 1.2k views

ADD COMMENT • link 19.6 years ago Gordon Smyth 52k