DESeq2 time-series with two conditions and several patients
1
0
Entering edit mode
@02755e9a
Last seen 17 months ago
CEITEC, Masaryk University, Brno, Czech…

Time-series experiment with patients from 2 conditions and 5 time-points from each patient

Hi everyone, I am trying to create a DEseq2 model matrix for my RNA-seq experiment. I have two conditions (treated vs untreated) and in each condition 10 patients (10 different patients in treated, 10 different patients in untreated group) and from each patient I have 5 time-points (baseline,10min, 3h, 6h, 24h). I would like to identify DE genes whose expression profiles across my time-points are different in the treated vs untreated group. I am now struggling with how to create my design for DESeq2. If my time-points were independent (not always 5 from the same patient), I would go for full model ~ condition + time + condition:time and reduced ~ condition + time. But when I ignore the non-independency of my time-points, in the PCA plot I can see that samples cluster more based on the patient they originated from rather than time-points or conditions. So I have a patient batch effect I need to remove. But I am not able to find out how (or if) that is possible to do with DESeq2, because if I add my patient effect into the model (like this ~ patient + condition + time + condition:time) I get an error that my model matrix is not full rank (which makes sense since I do not have the same patients in both groups).

My experimental design looks like this:

sample  condition   time        patient
01_A_SE treated     baseline    pat01
01_C_SE treated     10min       pat01
01_D_SE treated     3h          pat01
01_E_SE treated     6h          pat01
01_F_SE treated     24h         pat01
02_A_SE treated     baseline    pat02
02_C_SE treated     10min       pat02
02_D_SE treated     3h          pat02
02_E_SE treated     6h          pat02
02_F_SE treated     24h         pat02
.       .           .           .
.       .           .           .
.       .           .           .
18_A_SE untreated   baseline    pat18
19_C_SE untreated   10min       pat18
19_D_SE untreated   3h          pat18
19_E_SE untreated   6h          pat18
19_F_SE untreated   24h         pat18
20_A_SE untreated   baseline    pat20
20_C_SE untreated   10min       pat20
20_D_SE untreated   3h          pat20
20_E_SE untreated   6h          pat20
20_F_SE untreated   24h         pat20


Is there a way how do this in DESeq2, or do I have to look into limma or some other software?

Thank you very much!

Karolina

DESeq2 • 611 views
1
Entering edit mode
@mikelove
Last seen 1 day ago
United States

Take a look at the vignette section under Interactions, on group-specific condition effects, individuals nested within groups. This applies to your design with group = treatment and condition = time.

1
Entering edit mode

Thank you! I went through the section and discussed it with a colleague who is more statistics-oriented and came to this conclusion (maybe it will be useful for someone in future, so I am giving a reproducible example):

coldata <- DataFrame(condition=factor(c(rep("treated",each=15),rep("untreated",each=20))),
patient=factor(rep(1:5,each=7)),
time=factor(rep(c("baseline","10min", "3h", "6h", "24h"),7)))
as.data.frame(coldata)

mrcounts <- counts(makeExampleDESeqDataSet(n=10000, m = nrow(coldata)))

coldata$patient.n <- factor(c(rep(1:3,each=5),rep(1:4,each=5))) coldata$condition <- relevel(coldata\$condition, "untreated")

as.data.frame(coldata)

m1 <- model.matrix(~ condition + condition:patient.n + condition:time, coldata)
m2 <- model.matrix(~ condition + condition:patient.n, coldata)

idx <- which(apply(m1, 2, function(x) all(x==0)))
m1 <- m1[,-idx]

idx <- which(apply(m2, 2, function(x) all(x==0)))
m2 <- m2[,-idx]

dds<-DESeqDataSetFromMatrix(countData = mrcounts, colData = coldata, design = m1)
ddsLRT <- DESeq(dds, test="LRT", reduced = m2)

resultsNames(ddsLRT)
res <- results(ddsLRT, name="conditiontreated", independentFiltering = T, alpha=0.05)
summary(res)


Because I have uneven number of patients in groups, I removed all-zero interactions manually from both m1 and m2. In the end I am most interested in genes whose time expression profiles differ between my two conditions, so I use coefficient conditiontreated.

Hope I am not too far from truth.