I am aware that limma does not have any built-in facility for testing random effects, but I have an idea for how to do it and would like advice on whether or not it is valid. By making minor modifications to the gls.series() and lm.series() functions I can extract the AIC values of the fitted models, and then can compare two otherwise identical models. This mirrors the process that I would follow in a traditional linear model if I wanted to test a random effect, although of course it will need to be repeated separately for each individual gene in question.
E.g.
M1 <- lmFit(v, design)
M2 <- lmFit(v, design, block = rand.fac, correlation = corfit$consensus.correation)
for (i in 1:ngenes) {
M2$AIC[i] - M1$AIC[i] # Or, some hacked-together equivalent
}
I have given this a go and it is certainly possible in practice, but I am not sure whether it is statistically/mathematically valid. One specific concern is that both M1 and M2 show the same value for df.residual
, when I would expect the random term to have "used up" one degree of freedom. This then makes me worried that the gls.series() and lm.series() functions are fitting the model in a fundamentally different way for which the AIC values will not be directly comparable (an analogous situation I'm aware of: it is not valid to compare AIC values from lmms fitted via REML). But I am getting out of my statistical depth, so it would be great to get some advice from someone with a better understanding of the underlying mathematics.
Background: I am using limma to analyse differential expression in a dataset with repeated measures from the same individuals. There are too many different individuals to set id as a fixed effect, so I have set it as a random effect. I would now like to test the contribution of the random variable to my model, because I am interested in whether or not individuals have their own distinctive and consistent levels of expression. In other words, is there significant repeatability within an individual? To determine this requires knowing whether there are significant levels of within-individual variation.
If what you'd like to know is the contribution of within-donor differences to the variability of gene expression, then perhaps you can use variancePartition. This package builds on top of
limma
to fit random effects with a linear mixed model.Thank you, that looks promising, I will take a look!