Hi,
I am using DEseq2 to analyse RNAsequencing data in a timecourse setup. I am however unsure about setting up the correct experimental design. In particular, I struggle to decide when to combine factors and use the group to extract results and when to use interaction terms. I have tried to find answers by reading extensively on experimental design, however it only made me less confident that I have the correct experimental design for my analyses. I would be grateful for advice on how to properly setup the experimental design in the following case:
I am interested in differences of a particular gene knockout in comparison to wild type controls at different timepoints after injury. For this I have generated RNA sequencing data in biological triplicates. At the moment, I am combining factors running design = ~ condition such as I would have for the gene knockout: KO_timepoint1, KO_timepoint_2, KO_timepoint_3 and equivalently for the wild type control: wt_timepoint1, wt_timepoint_2, wt_timepoint_3
With this I think I can get all kinds of needed results by using contrasts e.g.
res=results(dds,contrast = c("condition","KO_timepoint1","wt_timepoint1")))
or
res=results(dds,contrast = c("condition","KO_timepoint2","KO_timepoint1")))
and so on. However, I have read in posts and vignette that an experimental design such as ~ genotype+time+genotype:time is used. Sometimes also using LRT. I am now unsure if I need to change my experimental design.
I would be grateful for advice on which strategy is most suitable in my case and how the two different designs would influence results. Thanks a lot in advance!
Best, Katja
Now that DESeq() doesn't do shrinkage, it will be equivalent with DESeq() and results().
It's true that shrinkage involves an interdependency between coefficients (as you can see in the LASSO trace plots). The way we are moving with lfcShrink is to focus on individual coefficients at a time with new shrinkage estimators, using the 'coef' argument. We will continue to support the old-DESeq2-style of shrinkage with the 'contrast' argument.
Thanks for the very fast reply and clarifying that both ways are producing the same design. To make sure I got it right, you also say that in this case choosing Wald's or LRT gives the same results?
Regarding the shrinkage, I read in a recent post that it is recommended to use lfcShrink for bulk RNA sequencing data. I am currently using it for my data as well again in combination with contrasts.
Would you not advise to do that? and would I then have to switch to the interaction design since designs are not equivalent anymore?
At the moment you can only use the group design and 'contrast' with lfcShrink, because I haven't implemented it for interaction terms, but in the next release you can use 'coef'. We will have recommendations for which to use in the next release notes.
Sorry but I still have two follow up questions:
1.) I am still a bit confused if I should now apply shrinking of foldchanges in my example or not? If designs are not equivalent anymore when foldchanges are shrunk (e.g. by using betaPrior=TRUE or lfcShrink), do I then need to change my design or should I simply not shrink foldchanges?
2.) In my previous analyses using the "old" DEseq2 versions, I kept the defaults (betaPrior=TRUE and test="Wald") using the same design as described above. Would I then here need to use the LRT- test instead or turn off shrinking of foldchanges?
Hmm, maybe this is getting too complicated. You can do either design and use DESeq() and results(). This will give you maximum likelihood estimates of LFC and adjusted p-values.
Then, if you use the ~group design, you can additionally shrink the LFC for visualization. This doesn't change the p-values or adjusted p-values. This last step is up to you. Some users find the moderated LFCs useful for visualization.
(2) There is no "need" to do anything in particular. My general advice is to use the latest version of DESeq2 with the defaults. The defaults for a software are basically the suggestions of the authors put into code. This would mean, version 1.16 of DESeq2 (current release as of August 2017) with DESeq() and results(). You can choose to do whatever you like really.