Time-series experiment with patients from 2 conditions and 5 time-points from each patient
Hi everyone, I am trying to create a DEseq2 model matrix for my RNA-seq experiment. I have two conditions (treated vs untreated) and in each condition 10 patients (10 different patients in treated, 10 different patients in untreated group) and from each patient I have 5 time-points (baseline,10min, 3h, 6h, 24h). I would like to identify DE genes whose expression profiles across my time-points are different in the treated vs untreated group. I am now struggling with how to create my design for DESeq2. If my time-points were independent (not always 5 from the same patient), I would go for full model ~ condition + time + condition:time
and reduced ~ condition + time
. But when I ignore the non-independency of my time-points, in the PCA plot I can see that samples cluster more based on the patient they originated from rather than time-points or conditions. So I have a patient batch effect I need to remove. But I am not able to find out how (or if) that is possible to do with DESeq2, because if I add my patient effect into the model (like this ~ patient + condition + time + condition:time
) I get an error that my model matrix is not full rank (which makes sense since I do not have the same patients in both groups).
My experimental design looks like this:
sample condition time patient
01_A_SE treated baseline pat01
01_C_SE treated 10min pat01
01_D_SE treated 3h pat01
01_E_SE treated 6h pat01
01_F_SE treated 24h pat01
02_A_SE treated baseline pat02
02_C_SE treated 10min pat02
02_D_SE treated 3h pat02
02_E_SE treated 6h pat02
02_F_SE treated 24h pat02
. . . .
. . . .
. . . .
18_A_SE untreated baseline pat18
19_C_SE untreated 10min pat18
19_D_SE untreated 3h pat18
19_E_SE untreated 6h pat18
19_F_SE untreated 24h pat18
20_A_SE untreated baseline pat20
20_C_SE untreated 10min pat20
20_D_SE untreated 3h pat20
20_E_SE untreated 6h pat20
20_F_SE untreated 24h pat20
Is there a way how do this in DESeq2, or do I have to look into limma or some other software?
Thank you very much!
Karolina
Thank you! I went through the section and discussed it with a colleague who is more statistics-oriented and came to this conclusion (maybe it will be useful for someone in future, so I am giving a reproducible example):
Because I have uneven number of patients in groups, I removed all-zero interactions manually from both m1 and m2. In the end I am most interested in genes whose time expression profiles differ between my two conditions, so I use coefficient
conditiontreated
.Hope I am not too far from truth.