Question: DESeq2 3 factor design - whether to split up data or add interaction terms
11 days ago
atongz652

I have 60 birds in a study, split so that there are 15 birds per treatment (4 treatments in total). At each of 3 time points, 5 birds per treatment were sacrificed. Each bird contributed 6 different sample types (different areas of the body were sampled). I want to study the influence of treatment (i.e. contrast different treatments), but I already know that both sample type and time point have a major impact on gene expression.

For example, I want to know, at timepoint 0 and sample type 1, which genes are differentially expressed between treatment 1 and treatment 2, or between treatment 2 and treatment 4, etc.

Should I split the data up into 18 smaller datasets so that each smaller dataset only contains samples of the same sample type and time point, then use design= ~treatment?

Alternatively, would design=~Site+Timepoint+Treatment on my entire dataset give me a good idea of the genes that are differentially expressed after accounting for site and timepoint? Are there interaction terms I need to take into account? Thank you!

modified 11 days ago by Michael Love
Answer: DESeq2 3 factor design - whether to split up data or add interaction terms
11 days ago
Michael Love
United States
Michael Love wrote:

The question of whether and which interactions to include when you have three factors is beyond the amount of support I can give here. It’s more of a statistical consulting question than a software question and I don’t have the extra time these days to get into it.

Any of these models can be set up in DESeq2 and coefficients or combinations of coefficients pulled out, similarly to a linear model in R.
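
As a rough illustration of how such designs look as ordinary R model formulas, the base-R sketch below builds a hypothetical sample table matching the layout in the question (the column names and level labels are assumptions, not taken from the actual data) and compares the number of coefficients in an additive design with that of a design containing all interactions. It says nothing about which model is appropriate.

```r
# Hypothetical sample table: 6 sites x 5 birds x 3 time points x 4 treatments.
# Column names and level labels are illustrative assumptions.
coldata <- expand.grid(
  site      = paste0("S", 1:6),
  bird_rep  = 1:5,
  timepoint = paste0("T", 0:2),
  treatment = paste0("Trt", 1:4)
)  # expand.grid() converts the character columns to factors by default

# Design matrices as R would build them for a linear model
additive <- model.matrix(~ site + timepoint + treatment, data = coldata)
full     <- model.matrix(~ site * timepoint * treatment, data = coldata)

ncol(additive)  # intercept + 5 site + 2 timepoint + 3 treatment terms
ncol(full)      # one coefficient per site/timepoint/treatment combination
```

The same formulas can be passed as the `design` argument when constructing a DESeq2 dataset.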

Rather than splitting into separate datasets, if you simply want to compare levels of treatment within each site and timepoint, see the Interaction section of the vignette. We have a relevant suggestion there.
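
The Interaction section of the DESeq2 vignette describes combining the factors of interest into a single grouping factor and using `~ group` as the design. A minimal sketch of that approach, assuming this is the suggestion meant here, is shown below. The sample table, the level labels, and the helper `run_group_contrast` are hypothetical; `count_matrix` stands for your own raw count matrix, with columns in the same order as the rows of `coldata`.

```r
# Hypothetical sample table mirroring the study layout:
# 6 sites x 5 birds x 3 time points x 4 treatments = 360 samples.
coldata <- expand.grid(
  site      = paste0("S", 1:6),
  bird_rep  = 1:5,
  timepoint = paste0("T", 0:2),
  treatment = paste0("Trt", 1:4)
)

# Combine the three factors into one grouping factor, e.g. "Trt1_T0_S1"
coldata$group <- factor(
  paste(coldata$treatment, coldata$timepoint, coldata$site, sep = "_")
)

# Hypothetical helper: fit the model once on the full dataset, then extract
# one treatment comparison within a given time point and site.
run_group_contrast <- function(count_matrix, coldata, numerator, denominator) {
  dds <- DESeq2::DESeqDataSetFromMatrix(
    countData = count_matrix,
    colData   = coldata,
    design    = ~ group
  )
  dds <- DESeq2::DESeq(dds)
  DESeq2::results(dds, contrast = c("group", numerator, denominator))
}

# Example call (not run here): treatment 2 vs treatment 1 at T0, site S1
# res <- run_group_contrast(count_matrix, coldata, "Trt2_T0_S1", "Trt1_T0_S1")
```

Because the model is fit once on all samples, each comparison is simply another call to `results()` with a different pair of group levels.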

written 11 days ago by Michael Love
