Question on DESeq2 on concatenated factors model vs. full interaction model
1
0
Entering edit mode
Dr Wei Chen ▴ 10
@dr-wei-chen-6318
Last seen 8.5 years ago
United States

I'm running DESeq2 for my expression analysis and have a question on the design formula. I have 2 cell lines, 2 time points, and 4 treatments in 3 batches. I tried the following 2 designs to compare the same 2 treatments and the results are slightly different in terms of the number of significant genes and their ranks. Why is that? Are these two models not the same?

1. dds$group <- factor(paste0(dds$cell, ".", dds$time, ".", dds$treat))
design(dds)<- ~ group + batch
dds<-DESeq(dds)

2. design(dds)<- ~ batch + cell*time*treat
dds<-DESeq(dds)

Thanks for the help!

Best,

Wei

deseq2 • 1.4k views
ADD COMMENT
0
Entering edit mode
Simon Anders ★ 3.7k
@simon-anders-3855
Last seen 3.7 years ago
Zentrum für Molekularbiologie, Universi…

First: How have you called "results", and how was the comparison described in the first line of the output from "results"? This might already give you a hint what is going on.

Second: No, the two models are not the same. 

About the first model: By default, DESeq2 performs a Wald test, checking for significance of the last model coefficient. This would be a comparison between two specific levels of the concatenated factor; something like: differences between cell line 3, treatment 2, time point 3 and cell line 1, treatment 1, time point 1. You can pass extra arguments to "results" to select which levels you want to compare.

In the second model, the comparison is for the double interaction between the three factors, which might be rather hard to interpret.

 

It might be easier to advice you if you told us more about what comparison you actually want to do.

ADD COMMENT
0
Entering edit mode

Thanks Simon. Take time comparison as an example (I have 2 time points: 3hr and 24hr), this is how I call the results function:

Using the first model:

> results(dds, contrast=c("group","19154.3hr.Typhi","19154.24hr.Typhi"), addMLE=T)
log2 fold change (MAP): group 19154.3hr.Typhi vs 19154.24hr.Typhi
Wald test p-value: group 19154.3hr.Typhi vs 19154.24hr.Typhi

Using the second model:

> results(dds, name="time_3hr_vs_24hr", addMLE=F)
log2 fold change (MLE): time 3hr vs 24hr
Wald test p-value: time 3hr vs 24hr

As I mentioned, the results are slightly different. My dilemma is that: I have to use the first model to make pairwise comparisons, and use the second model to get the 2-way and 3-way interaction terms. Would be nice if I can get all of them using the same model.

Best,

Wei

ADD REPLY
0
Entering edit mode
The results would be the same (to numerical convergence diffs) if you set betaPrior=FALSE in DESeq(). While the LFC shrinkage has nice properties, it is dependent on how you specify the design. The pvalues are statistically valid with or without beta prior.
ADD REPLY

Login before adding your answer.

Traffic: 776 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6