Dear DESeq2 developer,
Thank your very much for this great tool! I have a design that is similar to the one mentioned in deseq2 user guide "Group-specific condition effects, individuals nested within groups" http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#group-specific-condition-effects-individuals-nested-within-groups.
The colData is shown below. Basically, this randomized clinical trial has three treatment groups, patients are nested within each group, while each patient has two measurements: pre and post treatment (time).
treatment patient.n time
1 Placebo 1 pre
2 Placebo 1 post
3 Placebo 2 pre
4 Placebo 2 post
5 Placebo 3 pre
6 Placebo 3 post
7 LD_Drug 1 pre
8 LD_Drug 1 post
9 LD_Drug 2 pre
10 LD_Drug 2 post
11 LD_Drug 3 pre
12 LD_Drug 3 post
13 HD_Drug 1 pre
14 HD_Drug 1 post
15 HD_Drug 2 pre
16 HD_Drug 2 post
17 HD_Drug 3 pre
18 HD_Drug 3 post
19 HD_Drug 4 pre
20 HD_Drug 4 post
My DESeq2 workflow is as following:
dds = DESeqDataSetFromMatrix(countData=count, colData=colinfo, design=~patient.n+time)
dds <- dds[rowSums(counts(dds)) >= 10,]
mm = model.matrix(~treatment + treatment:patient.n + treatment:time, colinfo)
mm <- mm[,colSums(mm) > 0]
dds = DESeq(dds, full = mm, betaPrior = F)
"Intercept" "treatmentLD_Drug" "treatmentHD_Drug" "treatmentPlacebo.patient.n2"
"treatmentLD_Drug.patient.n2" "treatmentHD_Drug.patient.n2" "treatmentPlacebo.patient.n3" "treatmentLD_Drug.patient.n3" "treatmentHD_Drug.patient.n3" "treatmentHD_Drug.patient.n4" "treatmentPlacebo.timepost" "treatmentLD_Drug.timepost" "treatmentHD_Drug.timepost"
Regarding this design, the comparisons that I am interested are:
1. time effect (post vs pre) within each treatment group
I think results(dds, name="treatmentPlacebo.timepost"), results(dds, name="treatmentLD_Drug.timepost") and results(dds, name="treatmentHD_Drug.timepost") will extract time effect results for Placebo, LD_Drug and HD_Drug respectively. If this way is correct, when I compare to a simple design(~patient.n+time) for each group separately, the results are different, I have more significant genes in the simple design. Can you explain why and which way do you think is better?
2. more importantly, I am interested in the different post effect between treatment groups while adjusting for baseline (pre) level, which is similar to either of the ANCOVA models: post ~ treatment + pre, post-pre ~ treatment + pre, (post-pre)/pre ~ treatment + pre. I am wondering how to set up contrast to extract such information for each two treatment groups?
3. this model contains a interaction term "treatment:time", which also allows us to examine the genes that have different post vs pre time effect between treatment groups. Using results(dds, contrast=list("treatmentLD_Drug.timepost","treatmentPlacebo.timepost")), results(dds, contrast=list("treatmentHD_Drug.timepost","treatmentPlacebo.timepost")), results(dds, contrast=list("treatmentHD_Drug.timepost","LD_Drug.timepost")) will extract these genes for each two groups. Together with question #1 (not sure for question #2), all these terms can be extracted from the above model specified, do you think whether it is better to analyze all data using one model or analyze them separately like in question #1?
4. regarding main effect for treatment, with all the interaction terms, how to interpret for example "treatmentLD_Drug"? When should we add "time" main effect?
Sorry for all these questions, this model matrix is too complex and I want to make sure that I am doing the right thing.
Thank you very much for your help and I am looking forward to your response,