Hi,
I've read the excellent vignettes and the DESeq paper, and discussed this with colleagues, but I am still struggling to find the correct way to use DESeq2 for a repeated measures experiment that includes 2 separate groups. I want to be sure I do this correctly, so any guidance is greatly appreciated.
I am analyzing an experiment in which 2 groups of volunteers (x and y) were measured at 4 separate time points (a, b, c and d). I want to know A) whether changes in gene counts over time differ between groups (i.e., a group-by-time interaction), and B) for any genes that don't exhibit a significant interaction, whether gene counts differ over time when both groups are combined.
For objective A, the "Group-specific condition effects, individuals nested within groups" design explained here: http://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#group-specific-condition-effects-individuals-nested-within-groups seems most appropriate. There the appropriate model is described as ~ grp + grp:ind.n + grp:cnd.
Question 1: why is there no cnd term (i.e., a main effect of time/condition), as there would be in a general linear model? In other words, why isn't the correct model ~ grp + cnd + grp:ind.n + grp:cnd?
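To make Question 1 concrete, the columns that each formula generates can be listed with base R's model.matrix. This is only a minimal sketch on a hypothetical toy layout (3 individuals per group, names chosen to match the vignette's grp, ind.n and cnd), not the real sample table. As far as I can tell, DESeq2 reports interaction columns with a dot in place of the colon, as in the grpY.cndB name used in this post.

```r
# Hypothetical toy layout: 2 groups x 3 individuals per group x 4 time points.
samples <- expand.grid(
  cnd   = factor(c("A", "B", "C", "D")),
  ind.n = factor(1:3),
  grp   = factor(c("X", "Y"))
)

# Design from the vignette section (no cnd main effect).
m_nested <- model.matrix(~ grp + grp:ind.n + grp:cnd, data = samples)

# Design with an explicit cnd main effect added.
m_main <- model.matrix(~ grp + cnd + grp:ind.n + grp:cnd, data = samples)

# Compare which coefficients each parameterization estimates.
print(colnames(m_nested))
print(colnames(m_main))
```

Comparing the two printed column lists shows which terms carry the time effect in each parameterization.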
To test for an interaction, I plan to use a likelihood ratio test (LRT) to compare the full model against a reduced model that omits the grp:cnd term.
Question 2: is that approach correct?
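A minimal sketch of that comparison, assuming the DESeq2 package is installed and using simulated counts from makeExampleDESeqDataSet in place of the real data. The sample layout (3 individuals per group) and the factor names are hypothetical stand-ins that follow the vignette's naming.

```r
library(DESeq2)

set.seed(1)

# Simulated counts only: 200 genes, 24 samples (2 groups x 3 individuals x 4 time points).
dds <- makeExampleDESeqDataSet(n = 200, m = 24)

# Hypothetical sample annotation mirroring the vignette's grp / ind.n / cnd columns.
dds$grp   <- factor(rep(c("X", "Y"), each = 12))
dds$ind.n <- factor(rep(rep(1:3, each = 4), times = 2))
dds$cnd   <- factor(rep(c("A", "B", "C", "D"), times = 6))

# Full model: group-specific condition effects, individuals nested within groups.
design(dds) <- ~ grp + grp:ind.n + grp:cnd

# LRT against a reduced model that drops the grp:cnd term.
dds <- DESeq(dds, test = "LRT", reduced = ~ grp + grp:ind.n, quiet = TRUE)

# With test = "LRT", the p-values refer to the full-vs-reduced comparison.
res <- results(dds)
print(head(res))
```

Which hypothesis this reduced formula actually tests is exactly what Question 2 asks, so the sketch shows only the mechanics of the call.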
To test whether a specific condition/timepoint effect (e.g., b vs a) differs between groups, I can use the code: results(dds, contrast=list("grpY.cndB","grpX.cndB")).
Question 3: how do I interpret those results? Do they indicate log2(grpY.cndB/grpX.cndB) while controlling for cndA, or (log2(grpY.cndB/grpY.cndA) - log2(grpX.cndB/grpX.cndA)), or something else?
For objective B, my initial thought was to use a likelihood ratio test to compare the models ~ grp + cnd + grp:ind.n + grp:cnd and ~ grp + grp:ind.n + grp:cnd. However, if the first model is incorrect, that obviously doesn't work. My next thought was to use the paired samples approach noted in the vignette above. The vignette states that the model would be ~ subject + cnd, so my thought was to use ~ subject + cnd + group. I also looked at the vignettes for time course experiments, but those models don't appear to account for repeated measurements being taken from the same individual (i.e., measurements over time are not independent), so I didn't think the time course example would be applicable here.
Question 4: which approach, if any, is correct?
Thank you for the help,
Phil