Entering edit mode
Hi Naomi,
The limma duplicateCorrelation() function fits the random effects
model as a general mixed model
by maximising the REML likelihood. This is the same as lme() and
lmer() etc. This is equivalent
to the ANOVA approach in the balanced case, although not otherwise.
Once you get to lmFit(), a contraint has been placed on the random
effects model (common
within-block correlation across genes) to reduce it to a single error
component model at the
genewise level. So by this time limma is already fitting a different
model to the usual genewise
random effects approach. The residual degrees of freedom returned by
lmFit() are correct for this
single error component approach.
I am sorry that the theory behind the limma approach to technical reps
has not yet been written up
and published. I am trying to find time to do this. The theory is
closely analogous to the
within-array duplicate spot theory which is published in
Bioinformatics.
Gordon
> Date: Tue, 19 Jul 2005 13:13:50 -0400
> From: Naomi Altman <naomi at="" stat.psu.edu="">
> Subject: [BioC] technical reps, limma - theory
> To: bioconductor at stat.math.ethz.ch
>
> Consider the simple one-way design, with biological and technical
> reps. Generally we would consider the biological reps to be blocks.
(For
> simplicity, we can think of this as a one-channel analysis). The
usual
> ANOVA seems to be at odds with what we get from limma (ignoring the
eBayes
> step, which clearly cannot be recovered from the classical
treatment).
>
> The usual ANOVA (t treatments, b biological reps, n technical reps
within
> biological reps, giving ntb arrays)
>
> source df MS F
>
> treatment t-1 MS(T) MS(T)/MS(B)
> bio rep b-1 MS(B) MS(B)/MSE (although we
don't
> really care about this)
> error=T*B ntb-t-b+1 MSE
>
> However, the Limma manual suggests using duplicateCorrelation and
block to
> handle block designs, and this gives a different ANOVA. In
particular, the
> error d.f. for this ANOVA is ntb-t. Using this method, the within
block
> correlation is used in computing the t-statistics for the treatment,
so you
> do not get the simple 1-way ANOVA that would come from ignoring
block, but
> you cannot recover the p-value from the usual ANOVA, either.
>
> If you put in bio rep as a fixed factor, then Limma will use the MSE
as the
> denominator for the contrast tests, so this also does not recover
the ANOVA.
>
> I have not tried this computation with replicate spots (only
replicate
> arrays) but either:
>
> 1) the usual ANOVA is right and duplicateCorrelation is doing
something odd
> 2) duplicate Correlation is right and I don't understand the usual
ANOVA
> 3) both methods are correct for somewhat different models, and I
don't
> understand the statistical implications of this
>
> I am not discounting 3 - as the statisticians in the crowd know,
there are
> 2 versions, constrained and unconstrained, for the simplest case of
> balanced ANOVA with fixed and random effects which lead to different
> p-values for some effects and both are defensible for almost any
data set.
>
> Anyways, I would like to understand this better. So, I would
welcome comments.
>
> Naomi S. Altman 814-865-3791 (voice)
> Associate Professor
> Bioinformatics Consulting Center
> Dept. of Statistics 814-863-7114 (fax)
> Penn State University 814-865-1348
(Statistics)
> University Park, PA 16802-2111