Dear Jenny,
The second issue you identify (the DF issue) isn't an intrinsic
characteristic of multi-level variation. Rather, the whole issue of DF
is model-dependent.

If there really is no biological-replicate effect (the variance
component is relatively small, so the correlation is small), and you
fit a model without it, then the whole issue with DF doesn't arise. In
this case there is only one error level and you are perfectly justified
in using all the available DF to estimate it.
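
A rough numerical way to see why the size of the correlation matters is
the standard design-effect formula for clustered observations: with k
biological replicates, m technical replicates each, and intraclass
correlation rho among technical replicates, the group mean is as precise
as the mean of n_eff = k*m / (1 + (m - 1)*rho) independent observations.
This describes the precision of the mean rather than the residual DF
itself, and the function name and the rho values below are illustrative
assumptions, not part of limma. The figures k = 3 and m = 4 are taken
from the 2x2 loop example quoted below.

```python
def effective_sample_size(n_bio, n_tech, icc):
    """Effective number of independent observations per group.

    n_bio  : number of biological replicates (clusters)
    n_tech : technical replicates per biological replicate
    icc    : assumed intraclass correlation among technical replicates
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    # Design effect: variance inflation relative to independent sampling.
    design_effect = 1.0 + (n_tech - 1) * icc
    return n_bio * n_tech / design_effect


if __name__ == "__main__":
    # Hypothetical correlations, from "no biological effect" to
    # "technical replicates carry no extra information".
    for icc in (0.0, 0.1, 0.5, 1.0):
        n_eff = effective_sample_size(3, 4, icc)
        print(f"icc={icc:.1f}  effective N per group = {n_eff:.2f}")
```

At rho = 0 the effective N is the full 12, and at rho = 1 it collapses
to the 3 biological replicates.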

BTW, in the limma approach to multilevel models, the full DF is always
used. The DF issues associated with ANOVA models are side-stepped. This
is a consequence of the extreme smoothing across genes used by the
approach. A full explanation of this would need a lot of space ...
Best wishes
Gordon
On Thu, April 27, 2006 2:00 am, Jenny Drnevich wrote:
> Hi everyone,
>
> Comments from Naomi and Gordon (below) about the technical replication in
> the 2x2 factorial loop experiment are very close to an issue I have been
> struggling with for several analyses: When (if ever) is it OK to treat
> technical replicates as biological replicates? Often this is done when
> there is more than one random effect (e.g. also have duplicate spots,
> blocking effects, etc.) because as Gordon has said previously, the between
> gene smoothing of limma cannot currently be done with more than one random
> effect. I know there have been many discussions on this on the list
> previously, but I can see two problems with treating tech reps as
> biological reps, and only one of them has been addressed:
>
> 1. There is likely to be artificially decreased variance within treatment
> groups because tech reps should have higher correlations than biological
> reps. This problem has been addressed several times and probably the best
> answer has come from Gordon along the lines of: often measurement error is
> larger than biological variation, so IF there are not higher correlations
> among tech reps then variance estimates should not be artificially decreased.
>
> 2. The DF is artificially increased due to pseudoreplication of the
> biological replicates, which leads to artificially lower p-values. This
> combined with even minor changes to the variance components can lead to
> large changes in p-values in my experience.
>
> As far as I know, this second problem has not been addressed. As a case in
> point, in the 2x2 factorial loop from before, each of the three biological
> replicates has 4 technical replicates, and even if there are not higher
> correlations, treating them as biological reps yields N=12 for each group
> instead of N=3. Shouldn't we be worried about this effect as well? In such
> cases when the experiment design really has more than one random effect,
> wouldn't the analysis be better off to model the random effects properly
> with a multilevel model such as lme/nlme rather than get the benefits of
> the empirical Bayes shrinkage either through ignoring technical replication
> or averaging dye swaps?
>
> Thanks,
> Jenny
>
> Naomi's comment:
> I would use single channel analysis for
> this. The only problem is that Limma allows only
> 1 level of random effects. Hence, you will need to average the dye-swaps.
>
> Gordon's comment:
>>PS. Although you don't say explicitly, I'm assuming that a1, a2 etc
>>represent some sort of biological replication. The above analysis
>>does not keep track of which array has which biological replicate of
>>each treatment. If you wanted to do a careful job of that, you would
>>have no choice but to do a "separate channel" analysis, as Naomi
>>Altman has suggested separately. If your biological replicates a1, a2
>>etc are not very different, compared to microarray measurement error,
>>then the above simpler analysis may be good enough.