From limmaUsersGuide() section "9.7 Multi-level Experiments" I learned about using duplicateCorrelation() and passing its output to lmFit() (using the block and correlation arguments) to handle observations from the same individual. Similarly, from "9.6.2 Many time points" I learned how to multiply splines by a group variable and extract the F-statistic for the differences between groups across time using topTable(coef). Those sections don't use voom(), as they were likely written before voom() was available.
In any case, I'm analyzing a data set with 2 regions and individuals spanning many age groups. Some individuals yielded data for both regions, some did not. To explore the correlation by individual, I ran limma-voom with and without duplicateCorrelation(). The p-values and adjusted p-values are smaller for the model that takes the individual correlation into account, but the two sets are fairly similar. However, a subset of the F-statistics are very different.
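A minimal sketch of the comparison described above, for readers who want to reproduce the setup. It runs on simulated counts, not the real data, and the sample table, the natural spline with df = 3, and all object names are assumptions made for illustration (the actual analysis used linear splines and its own design matrix).

```r
library(limma)
library(splines)

set.seed(20240101)

# Simulated sample table (hypothetical): 12 individuals, 2 regions each.
samples <- data.frame(
  individual = factor(rep(seq_len(12), each = 2)),
  region     = factor(rep(c("R1", "R2"), times = 12)),
  age        = rep(round(runif(12, 0, 80)), each = 2)
)
# Drop a few rows so that some individuals contribute only one region.
samples <- samples[-c(2, 7, 20), ]

# Simulated counts with no true effects (illustration only).
n_genes <- 200
counts <- matrix(
  rnbinom(n_genes * nrow(samples), mu = 200, size = 5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("s", seq_len(nrow(samples))))
)

# Region main effect, age spline, and their interaction.
age_spline <- ns(samples$age, df = 3)
design <- model.matrix(~ region * age_spline, data = samples)

v <- voom(counts, design)

# Model 1: ignore the pairing of samples by individual.
fit_plain <- eBayes(lmFit(v, design))

# Model 2: estimate a consensus within-individual correlation and use it.
corfit <- duplicateCorrelation(v, design, block = samples$individual)
fit_block <- eBayes(lmFit(v, design,
                          block = samples$individual,
                          correlation = corfit$consensus.correlation))

# Test all region-by-spline interaction coefficients jointly (F-test).
interaction_cols <- grep(":", colnames(design))
top_plain <- topTable(fit_plain, coef = interaction_cols,
                      number = Inf, sort.by = "none")
top_block <- topTable(fit_block, coef = interaction_cols,
                      number = Inf, sort.by = "none")

# Gene order is preserved by sort.by = "none", so rows can be compared directly.
summary(log10(top_block$F) - log10(top_plain$F))
```

Using sort.by = "none" keeps both tables in the original gene order, so F-statistics and p-values from the two models are compared gene by gene rather than rank by rank.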
First of all, I was surprised by the range of some of the F-statistics (over 3 on the log10 scale). Second, this plot doesn't agree with how similar the p-values look. How can the 2 models yield similar p-values for genes with widely different F-statistics? (The degrees of freedom should be the same.)
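The reasoning behind that question can be checked with base R: for fixed degrees of freedom, the p-value of an F-test is the upper tail of the F distribution, which pf() computes. The sketch below uses made-up F values and degrees of freedom (none of them come from this analysis), and f_to_p is a hypothetical helper name.

```r
# Hypothetical helper: upper-tail p-value for an F-statistic.
f_to_p <- function(f, df1, df2) pf(f, df1, df2, lower.tail = FALSE)

# Made-up values for illustration only.
df1 <- 4      # number of coefficients tested jointly (assumed)
df2 <- 60     # denominator degrees of freedom (assumed)
f_small <- 5
f_large <- 5000   # 3 units larger on the log10 scale

p_small <- f_to_p(f_small, df1, df2)
p_large <- f_to_p(f_large, df1, df2)

# Difference in log10 p-values implied by the difference in F.
log10(p_small) - log10(p_large)
```

With identical degrees of freedom, F-statistics this far apart should map to clearly different p-values, so similar p-values next to very different F-statistics suggest checking how the two tables were extracted and matched.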
I then explored the coefficients of interest (the interaction between the 2nd region and the linear time splines) and they look quite different between the two models (see the last pages of the PDF in the gist).
Assuming that I used the functions correctly, do you expect the F-statistics to be so different for a few genes? What about the coefficients? I know that lmFit() switches between lm.series and gls.series for the 2 different ways I used it, but I expected the coefficients to look mostly similar given that the consensus correlation is about 0.24 (much lower than the default of 0.75). Or could lm.series vs gls.series explain the difference here? ***At this point I realize that I didn't use topTable() correctly***
Code used (shows parts of the design matrix and session info):