DESeq2 comparison with multiple cell types under 2 conditions
kmu004 ▴ 50


I am new to DESeq2 and to multi-factor designs. I have read the forum discussions on DESeq2 and multifactorial designs, but I would like to check whether my design for the following sample table is correct.

Sample Celltype Condition
1 1 KO
2 1 WT
3 2 KO
4 2 WT
5 3 KO
6 3 WT
7 1 KO
8 1 WT
9 2 KO
10 2 WT
11 3 KO
12 3 WT
13 1 KO
14 1 WT
15 2 KO
16 2 WT
17 3 KO
18 3 WT
19 1 KO
20 1 WT
21 2 KO
22 2 WT
23 3 KO
24 3 WT

That is, we have 3 cell types for each condition (knockout (KO) and wild type (WT)), with 4 replicates per combination. I would like to perform the following differential analyses:

1. Within the KO condition, compare each cell type against the others.

2. Within each cell type, compare KO against WT.

Should I set up the design as follows?  ~CellType+CellType:Condition

Thank you in advance; any pointers or help are appreciated.
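For reference, the coefficients that the proposed formula would produce can be inspected with base R's model.matrix() before fitting anything. This is only a minimal sketch, assuming factor columns named CellType and Condition and assuming WT is set as the reference level (by default R orders levels alphabetically, which would make KO the reference); the `coldata` object here is hypothetical.

```r
# Hypothetical sample table: 3 cell types x 2 conditions x 4 replicates
coldata <- data.frame(
  CellType  = factor(rep(rep(1:3, each = 2), times = 4)),
  # Assumption: WT is made the reference level explicitly
  Condition = factor(rep(c("KO", "WT"), times = 12), levels = c("WT", "KO"))
)

# Columns generated by the design proposed in the question
mm <- model.matrix(~ CellType + CellType:Condition, data = coldata)
colnames(mm)
```

Under these assumptions, each CellTypeX:ConditionKO column is the KO vs WT effect within one cell type, which corresponds to comparison 2. Comparing cell types within KO (comparison 1) would require combining several coefficients.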





The easiest way to make all of these comparisons is to combine the two factors into a single grouping factor and then use the contrast argument of results() to pick out each comparison:

dds$group <- factor(paste0(dds$celltype, dds$condition))
design(dds) <- ~ group
dds <- DESeq(dds)
# e.g. for condition KO cell type 2 vs cell type 1
results(dds, contrast=c("group","2KO","1KO")) 
# e.g. for cell type 1 KO vs WT
results(dds, contrast=c("group","1KO","1WT")) 
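To cover both sets of comparisons from the question with this grouped design, the contrast specifications can be generated programmatically. The following is a minimal base R sketch, assuming sample-table columns named `celltype` and `condition` as in the code above; `coldata`, `ko_pairs`, `ko_vs_wt` and `contrasts_all` are hypothetical names, and the final call to results() is left as a comment because it needs a fitted `dds`.

```r
# Sample table mirroring the question: 3 cell types x 2 conditions x 4 replicates
coldata <- data.frame(
  celltype  = rep(rep(1:3, each = 2), times = 4),
  condition = rep(c("KO", "WT"), times = 12)
)
coldata$group <- factor(paste0(coldata$celltype, coldata$condition))

# 1. Within KO: every pairwise comparison between cell types
ko_levels <- grep("KO$", levels(coldata$group), value = TRUE)
ko_pairs  <- combn(ko_levels, 2, simplify = FALSE,
                   FUN = function(p) c("group", p[2], p[1]))

# 2. Within each cell type: KO vs WT
ko_vs_wt <- lapply(1:3, function(ct) {
  c("group", paste0(ct, "KO"), paste0(ct, "WT"))
})

contrasts_all <- c(ko_pairs, ko_vs_wt)

# With a fitted object from the code above, each entry could be passed on:
# res_list <- lapply(contrasts_all, function(ct) results(dds, contrast = ct))
```

Each element of `contrasts_all` has the form c("group", numerator, denominator), the same three-element form used in the two example calls above.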

Hi! Sorry to resurrect this post, but I have a similar question.

I have this dataset:

sample_name status tissue gender
A 0 Br F
B 0 Br F
C 0 Br F
D 1 Br F
E 1 Br F
F 1 Br F
G 0 Br M
H 0 Br M
I 0 Br M
J 1 Br M
K 1 Br M
L 1 Br M
M 0 Go F
N 0 Go F
O 0 Go F
P 1 Go F
Q 1 Go F
R 1 Go F
S 0 Go M
T 0 Go M
U 0 Go M
V 1 Go M
W 1 Go M
X 1 Go M
Y 0 Mu F
Z 0 Mu F
A1 0 Mu F
B1 1 Mu F
C1 1 Mu F
D1 1 Mu F
E1 0 Mu M
F1 0 Mu M
G1 0 Mu M
H1 1 Mu M
I1 1 Mu M
J1 1 Mu M


I need to test differential expression for "status" (0 vs. 1), then for "tissue" (all pairwise comparisons), then for "gender" (M vs. F), and finally between all groups (every combination against every other).

The solution for the groups is the same as in this post, right? But for the spliced analysis, should I use a different design formula, such as ~ status + tissue + gender?


Best regards,


I'd recommend speaking with a local statistician to help you formulate your design. There are many ways to make these comparisons.
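As a starting point for that conversation, it may help to see what the two candidate designs encode. The following is a minimal base R sketch, assuming the 36-sample layout posted above (3 replicates per status/tissue/gender combination); the `coldata` object and the `rep` column are hypothetical, and nothing here is a recommendation of one design over the other.

```r
# Rebuild the posted layout: rep varies fastest, then status, gender, tissue
coldata <- expand.grid(
  rep    = 1:3,
  status = c("0", "1"),
  gender = c("F", "M"),
  tissue = c("Br", "Go", "Mu"),
  stringsAsFactors = TRUE
)

# Option 1: a single combined factor, one level per combination
coldata$group <- factor(
  paste(coldata$tissue, coldata$gender, coldata$status, sep = "_")
)
levels(coldata$group)

# Option 2: additive main effects only
mm_add <- model.matrix(~ status + tissue + gender, data = coldata)
colnames(mm_add)
```

The combined factor gives each of the 12 combinations its own mean, so any pair of groups can be compared with a contrast. The additive formula has only 5 coefficients and assumes, for example, that the status effect is the same in every tissue and in both genders.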


Dear Michael,

I have a similar experiment to this one: Design formula and Design matrix in DESeq

I have 2 groups for genotype (Mut vs. Ctr), and 4 time points, with 8 biological replicates each. As I am interested in the interactions, I ran my model as you suggested in the manual and in similar posts:

dge <- DESeqDataSetFromMatrix(countData = counts, colData = phenomodel1, design = ~group)

where group takes the values Ctr_time1, Ctr_time2, Ctr_time3, Ctr_time4, Mut_time1, Mut_time2, Mut_time3, Mut_time4.

dgeWALD <- DESeq(dge)

This works well for extracting the genotype effect at each age. For example, for the genotype effect at age 1:

Waldresults1 <- results(dgeWALD, contrast=c("group", "Mut_time1", "Ctr_time1"))

In order to extract the overall effect of genotype (an average of the effect in time 1, 2, 3, and 4), I used the following:

Waldresultsgenotype <- results(dgeWALD, contrast=list(c("Mut_time1", "Mut_time2", "Mut_time3", "Mut_time4"), c("Ctr_time1", "Ctr_time2", "Ctr_time3", "Ctr_time4")), listValues=c(1,-1))
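One way to express an averaged genotype effect under the ~group design is a numeric contrast built from the model matrix. The sketch below uses only base R and assumes the balanced layout described above (8 replicates in each of the 8 groups, with Ctr_time1 as the reference level); the `mod`, `mut_mean`, `ctr_mean` and `avg_contrast` names are hypothetical. Note that listValues=c(1,-1) weights each listed coefficient by 1 or -1, so an average over four time points would correspond to weights of 1/4 and -1/4.

```r
genotype <- rep(c("Ctr", "Mut"), each = 4)
time     <- rep(paste0("time", 1:4), times = 2)

# 8 biological replicates per genotype/time combination
group <- factor(rep(paste(genotype, time, sep = "_"), each = 8))

# Same parameterization as design = ~group
mod <- model.matrix(~ group)

# Average model-matrix row for each genotype
mut_mean <- colMeans(mod[grepl("^Mut", group), ])
ctr_mean <- colMeans(mod[grepl("^Ctr", group), ])

# Mean of the four Mut groups minus mean of the four Ctr groups
avg_contrast <- mut_mean - ctr_mean
avg_contrast

# With a fitted object this could be passed as a numeric contrast:
# results(dgeWALD, contrast = avg_contrast)
```

A numeric contrast needs one element per coefficient, in the same order as resultsNames(dgeWALD), so it is worth checking that order against colnames(mod) before using it.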

Now I am also interested in the effect of genotype across aging, and in the aging effect in general, so I am trying to use the likelihood ratio test (LRT). However, after reading the manuals and searching for similar posts, I still can't work out how to run it. Here are my specific questions:

From my understanding, I should run something like this

1) dgeLRT_genotype_aging <- DESeq(dge, test="LRT", full = ~???, reduced = ~age) # for the effect of genotype across aging (meaning I take out the effect of age and thus stay with the effect of genotype)

2) dgeLRT_genotype_aging <- DESeq(dge, test="LRT", full = ~???, reduced = ~genotype) # for the effect of age, independently of genotype (meaning I take out the effect of genotype and thus stay with the effect of aging)

But my full model was ~group (interaction!), so I really don't understand very well how I define the full model and the reduced model. In every case I see the LRT being used the full model is something like ~condition+genotype, but in this example one would be looking at the effect of genotype whilst controlling for the effect of condition, right? Which is not my interest.

Therefore, my question is how exactly should I define (for the LRT):

a) DESeqDataSetFromMatrix (colData and design)

c) DESeq

d) DESeq LRT (full and reduced)

If you know of any posts about similar studies, I would really appreciate it if you could point me to them.

Thank you in advance!
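For readers with the same question, one common way to set up an LRT for this kind of layout is to keep genotype and time as separate columns rather than a single group factor. The sketch below uses base R only and assumes a hypothetical `coldata` with factor columns `genotype` and `time` (8 replicates per combination); it just shows how many coefficients each candidate reduced model drops, which is what the LRT tests.

```r
# Hypothetical sample table: 2 genotypes x 4 time points x 8 replicates
coldata <- expand.grid(
  rep      = 1:8,
  time     = paste0("time", 1:4),
  genotype = c("Ctr", "Mut"),
  stringsAsFactors = TRUE
)

full        <- model.matrix(~ genotype + time + genotype:time, data = coldata)
no_interact <- model.matrix(~ genotype + time, data = coldata)
time_only   <- model.matrix(~ time, data = coldata)

# Coefficients removed by each reduced model
ncol(full) - ncol(no_interact)  # interaction terms only
ncol(full) - ncol(time_only)    # genotype main effect plus interaction terms

# Sketch of the corresponding DESeq2 calls (counts must match coldata rows):
# dge2 <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
#                                design = ~ genotype + time + genotype:time)
# dge2 <- DESeq(dge2, test = "LRT", reduced = ~ genotype + time)
```

Under these assumptions, reduced = ~ genotype + time asks whether the genotype effect differs at any time point, while reduced = ~ time asks whether genotype has any effect at all, at any time point. The full model has as many coefficients as ~group, so it describes the same 8 group means in a different parameterization.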



Hi Isabel,

Can you post this as a new question? It will help keep the thread organized. In the new post, can you include your colData and say specifically which variable in the colData you are referring to when you say "aging"?


Done. Thank you!

