Question

When to treat technical reps as biological reps? WAS:Re: 2x2 factorial loop without common reference (pool)


Gordon Smyth 53k

@gordon-smyth


WEHI, Melbourne, Australia

Dear Jenny,

The second issue you identify (the DF issue) isn't an intrinsic characteristic of multi-level variation. Rather, the whole issue of DF is model-dependent. If there really is no biological-replicate effect (the variance component is relatively small, so the correlation is small), and you fit a model without it, then the DF issue doesn't arise. In this case there is only one error level and you are perfectly justified in using all the available DF to estimate it.

BTW, in the limma approach to multilevel models, the full DF is always used. The DF issues associated with ANOVA models are side-stepped. This is a consequence of the extreme smoothing across genes used by the approach. A full explanation of this would need a lot of space ...

Best wishes
Gordon

On Thu, April 27, 2006 2:00 am, Jenny Drnevich wrote:
> Hi everyone,
>
> Comments from Naomi and Gordon (below) about the technical replication in
> the 2x2 factorial loop experiment are very close to an issue I have been
> struggling with for several analyses: When (if ever) is it OK to treat
> technical replicates as biological replicates? Often this is done when
> there is more than one random effect (e.g. also have duplicate spots,
> blocking effects, etc.) because, as Gordon has said previously, the
> between-gene smoothing of limma cannot currently be done with more than
> one random effect. I know there have been many discussions on this on the
> list previously, but I can see two problems with treating tech reps as
> biological reps, and only one of them has been addressed:
>
> 1. There is likely to be artificially decreased variance within treatment
> groups because tech reps should have higher correlations than biological
> reps. This problem has been addressed several times and probably the best
> answer has come from Gordon along the lines of: often measurement error is
> larger than biological variation, so IF there are not higher correlations
> among tech reps then variance estimates should not be artificially
> decreased.
>
> 2. The DF is artificially increased due to pseudoreplication of the
> biological replicates, which leads to artificially lower p-values. This,
> combined with even minor changes to the variance components, can lead to
> large changes in p-values in my experience.
>
> As far as I know, this second problem has not been addressed. As a case in
> point, in the 2x2 factorial loop from before, each of the three biological
> replicates has 4 technical replicates, and even if there are not higher
> correlations, treating them as biological reps yields N=12 for each group
> instead of N=3. Shouldn't we be worried about this effect as well? In such
> cases, when the experimental design really has more than one random effect,
> wouldn't the analysis be better off modelling the random effects properly
> with a multilevel model such as lme/nlme rather than getting the benefits
> of the empirical Bayes shrinkage either through ignoring technical
> replication or averaging dye swaps?
>
> Thanks,
> Jenny
>
> Naomi's comment:
> I would use single channel analysis for
> this. The only problem is that Limma allows only
> 1 level of random effects. Hence, you will need to average the dye-swaps.
>
> Gordon's comment:
>>PS. Although you don't say explicitly, I'm assuming that a1, a2 etc
>>represent some sort of biological replication. The above analysis
>>does not keep track of which array has which biological replicate of
>>each treatment. If you wanted to do a careful job of that, you would
>>have no choice but to do a "separate channel" analysis, as Naomi
>>Altman has suggested separately. If your biological replicates a1, a2
>>etc are not very different, compared to microarray measurement error,
>>then the above simpler analysis may be good enough.
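
To put rough numbers on the exchange above, here is a minimal sketch (not part of the original posts) for the design Jenny describes: 3 biological replicates per group, each with 4 technical replicates. It assumes a simple two-level model with a biological variance component and a technical one, and compares the variance of a group mean under that model with the variance one would assume if all 12 arrays were independent. The function names and the example variance values are hypothetical, chosen only for illustration.

```python
def true_mean_variance(sigma_b2, sigma_t2, n_bio, n_tech):
    """Variance of a group mean under a two-level (biological + technical) model."""
    return sigma_b2 / n_bio + sigma_t2 / (n_bio * n_tech)


def naive_mean_variance(sigma_b2, sigma_t2, n_bio, n_tech):
    """Variance assumed when every array is treated as an independent replicate."""
    return (sigma_b2 + sigma_t2) / (n_bio * n_tech)


def design_effect(sigma_b2, sigma_t2, n_tech):
    """Ratio true/naive, which equals 1 + (n_tech - 1) * rho."""
    rho = sigma_b2 / (sigma_b2 + sigma_t2)  # correlation between tech reps of one sample
    return 1 + (n_tech - 1) * rho


if __name__ == "__main__":
    n_bio, n_tech = 3, 4  # the 2x2 loop design discussed in the thread
    # Hypothetical correlations; total variance fixed at 1 for readability.
    for rho in (0.0, 0.1, 0.5):
        sigma_b2, sigma_t2 = rho, 1.0 - rho
        t = true_mean_variance(sigma_b2, sigma_t2, n_bio, n_tech)
        n = naive_mean_variance(sigma_b2, sigma_t2, n_bio, n_tech)
        print(f"rho={rho:.1f}  true={t:.4f}  naive={n:.4f}  ratio={t / n:.2f}")
```

When the biological component is zero the two variances coincide, which is the case Gordon describes where using all the available DF is justified. As the correlation grows, the naive variance understates the true one by the factor 1 + (n_tech - 1) * rho.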

Microarray limma • 1.2k views


score 0 · Answer 1 · 2006-05-01

On Mon, May 1, 2006 10:16 am I wrote
> BTW, in the limma approach to multilevel models, the full DF is always used. The DF issues
> associated with ANOVA models are side-stepped. This is a consequence of the extreme smoothing
> across genes used by the approach. A full explanation of this would need a lot of space ...

I thought of a way to explain it, which I hope will make some sense. The limma approach to multilevel models (blocking) collapses the two levels of variation down to one level by imposing a constraint. The constraint is that the ratio of biological to technical variation is roughly the same for each gene. This allows all the arrays, even the technical replicates, to be used in estimating the standard error for contrasts of interest.

Best wishes
Gordon
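
The constraint described above can be illustrated for a single gene with a small sketch. It is not limma's code: it assumes a balanced design, one treatment group, and a within-block correlation `rho` that is taken as known (in limma the common value would be estimated across genes). Fixing the ratio of biological to technical variation is the same as fixing this correlation, so only one variance remains to be estimated per gene, and every array contributes to its degrees of freedom. The function name and the data values are hypothetical.

```python
import math


def gls_mean_and_se(blocks, rho):
    """GLS estimate of a group mean with a fixed within-block correlation rho.

    blocks: list of lists; each inner list holds the technical replicates of
    one biological replicate (balanced design assumed).
    Returns (mean, standard error, residual degrees of freedom).
    """
    n_tech = len(blocks[0])
    n_total = sum(len(b) for b in blocks)
    # In a balanced design the GLS mean equals the plain average.
    mean = sum(sum(b) for b in blocks) / n_total

    # Quadratic form r' V^{-1} r, where V is block-diagonal with blocks
    # (1 - rho) * I + rho * J (unit diagonal, correlation rho within a block).
    shrink = rho / (1 + (n_tech - 1) * rho)
    q = 0.0
    for b in blocks:
        r = [y - mean for y in b]
        q += (sum(x * x for x in r) - shrink * sum(r) ** 2) / (1 - rho)

    df = n_total - 1  # all arrays count; one parameter (the mean) is estimated
    s2 = q / df       # estimate of the total (biological + technical) variance
    # Var(mean) = s2 * (1 + (n_tech - 1) * rho) / n_total under this V.
    se = math.sqrt(s2 * (1 + (n_tech - 1) * rho) / n_total)
    return mean, se, df


if __name__ == "__main__":
    # Hypothetical log-ratios: 3 biological replicates x 4 technical replicates.
    data = [[1.0, 1.2, 0.9, 1.1], [1.5, 1.4, 1.6, 1.5], [0.8, 0.7, 0.9, 0.8]]
    for rho in (0.0, 0.3):
        mean, se, df = gls_mean_and_se(data, rho)
        print(f"rho={rho:.1f}  mean={mean:.3f}  se={se:.3f}  df={df}")
```

With `rho` set to zero this reduces to the ordinary standard error from 12 independent arrays. With a positive `rho` the degrees of freedom stay at 11, but the standard error formula accounts for the correlation among technical replicates, which is the sense in which the two levels of variation are collapsed to one.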