Question: DESeq2 comparison with multiple cell types under 2 conditions
kmu004 (United States) wrote, 4.0 years ago:


I am new to DESeq2 and to multi-factor designs. I have read the forum discussions on DESeq2 and multifactorial designs, but I would like to check whether my design is correct for the following sample table:

Sample Celltype Condition
1 1 KO
2 1 WT
3 2 KO
4 2 WT
5 3 KO
6 3 WT
7 1 KO
8 1 WT
9 2 KO
10 2 WT
11 3 KO
12 3 WT
13 1 KO
14 1 WT
15 2 KO
16 2 WT
17 3 KO
18 3 WT
19 1 KO
20 1 WT
21 2 KO
22 2 WT
23 3 KO
24 3 WT

That is, we have 3 cell types under each condition (knockout (KO) and wild type (WT)), with 4 replicates per combination. I would like to perform the following differential analyses:

1. Within the KO condition, compare each cell type against the others.

2. Within each cell type, compare KO against WT.

Should I set up the design as follows?  ~CellType+CellType:Condition

Thank you in advance and any pointers or help is appreciated.




written 4.0 years ago by kmu004 • modified 4.0 years ago by Michael Love
Answer: DESeq2 comparison with multiple cell types under 2 conditions
Michael Love (United States) wrote, 4.0 years ago:

The easiest way for you to make all these comparisons is to simply join the two factors and then use contrast to compare in the end:

dds$group <- factor(paste0(dds$celltype, dds$condition))
design(dds) <- ~ group
dds <- DESeq(dds)
# e.g. for condition KO cell type 2 vs cell type 1
results(dds, contrast=c("group","2KO","1KO")) 
# e.g. for cell type 1 KO vs WT
results(dds, contrast=c("group","1KO","1WT")) 
written 4.0 years ago by Michael Love
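
The combined-factor approach in the answer above can be checked without running DESeq2. The following is a minimal base R sketch, assuming the sample table from the question is stored in a data frame called `coldata` (a hypothetical name) with columns `celltype` and `condition`. It builds the `group` factor and enumerates the contrast vectors for both sets of comparisons the question asks for.

```r
# Sample sheet reconstructed from the question:
# 24 samples, 3 cell types x 2 conditions, 4 replicates each.
coldata <- data.frame(
  sample    = 1:24,
  celltype  = factor(rep(rep(1:3, each = 2), times = 4)),
  condition = factor(rep(c("KO", "WT"), times = 12))
)

# Join the two factors into one, as in the answer.
coldata$group <- factor(paste0(coldata$celltype, coldata$condition))
print(levels(coldata$group))
print(table(coldata$group))

# Comparison 1: every pair of cell types within KO.
ko_levels <- paste0(levels(coldata$celltype), "KO")
ko_pairs  <- t(combn(ko_levels, 2))
ko_contrasts <- lapply(seq_len(nrow(ko_pairs)), function(i) {
  # numerator level first, denominator level second
  c("group", ko_pairs[i, 2], ko_pairs[i, 1])
})

# Comparison 2: KO vs WT within each cell type.
kowt_contrasts <- lapply(levels(coldata$celltype), function(ct) {
  c("group", paste0(ct, "KO"), paste0(ct, "WT"))
})

print(ko_contrasts)
print(kowt_contrasts)
```

Each element of `ko_contrasts` or `kowt_contrasts` has the same shape as the `contrast` argument used in the answer, so it could be passed as `results(dds, contrast = ko_contrasts[[1]])` once `DESeq(dds)` has been run with `design(dds) <- ~ group`.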

Hi! Sorry to resurrect this post, but I have a similar question.

I have this dataset:

sample_name status tissue gender
A 0 Br F
B 0 Br F
C 0 Br F
D 1 Br F
E 1 Br F
F 1 Br F
G 0 Br M
H 0 Br M
I 0 Br M
J 1 Br M
K 1 Br M
L 1 Br M
M 0 Go F
N 0 Go F
O 0 Go F
P 1 Go F
Q 1 Go F
R 1 Go F
S 0 Go M
T 0 Go M
U 0 Go M
V 1 Go M
W 1 Go M
X 1 Go M
Y 0 Mu F
Z 0 Mu F
A1 0 Mu F
B1 1 Mu F
C1 1 Mu F
D1 1 Mu F
E1 0 Mu M
F1 0 Mu M
G1 0 Mu M
H1 1 Mu M
I1 1 Mu M
J1 1 Mu M


I need to test differential expression for "status" (0 against 1), then "tissue" (all against all), then "gender" (M against F), and finally every group against every other group.

The solution for the group comparison is the same as in this post, right? But for the spliced analysis, should I use a different design formula, such as ~ status + tissue + gender?


Best regards,

written 2.2 years ago by gear.hq

I'd recommend speaking with a local statistician to help you formulate your design. There are many ways to make these comparisons.

written 2.2 years ago by Michael Love
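
To make the difference between the two candidate designs concrete, here is a small base R sketch. It assumes the 36-sample table above is held in a hypothetical data frame `coldata`, and it only inspects the design matrices; it is not a recommendation of either design. The additive formula estimates one main effect per factor, while the joined `group` factor gives one coefficient per status/tissue/gender combination.

```r
# Rebuild the sample table: replicate varies fastest, then status, gender, tissue.
coldata <- expand.grid(
  rep    = 1:3,
  status = c("0", "1"),
  gender = c("F", "M"),
  tissue = c("Br", "Go", "Mu"),
  stringsAsFactors = TRUE
)
coldata$sample_name <- c(LETTERS, paste0(LETTERS[1:10], "1"))

# Joined factor, analogous to the 'group' approach in the answer.
coldata$group <- factor(paste(coldata$status, coldata$tissue,
                              coldata$gender, sep = "_"))

additive <- model.matrix(~ status + tissue + gender, data = coldata)
grouped  <- model.matrix(~ group, data = coldata)

print(colnames(additive))       # intercept plus main effects only
print(nlevels(coldata$group))   # number of distinct groups
print(table(coldata$group))     # replicates per group
```

The additive design has 5 coefficients and assumes each factor acts the same way at every level of the others; the grouped design has 12 and makes no such assumption, at the cost of estimating each group from 3 replicates.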

Dear Michael,

I have a similar experiment to this one: Design formula and Design matrix in DESeq

I have 2 groups for genotype (Mut vs. Ctr), and 4 time points, with 8 biological replicates each. As I am interested in the interactions, I ran my model as you suggested in the manual and in similar posts:

dge <- DESeqDataSetFromMatrix(countData = counts, colData = phenomodel1, design = ~group), where group is Ctr_time1, Ctr_time2, Ctr_time3, Ctr_time4, Mut_time1, Mut_time2, Mut_time3, Mut_time4.

dgeWALD <- DESeq(dge)

This works well for extracting the genotype effect at each age. For example, the genotype effect at age 1 is: Waldresults1 <- results(dgeWALD, contrast=c("group", "Mut_time1", "Ctr_time1"))

In order to extract the overall effect of genotype (an average of the effect in time 1, 2, 3, and 4), I used the following:

Waldresultsgenotype <- results(dgeWALD, contrast=list(c("Mut_time1", "Mut_time2", "Mut_time3", "Mut_time4"), c("Ctr_time1", "Ctr_time2", "Ctr_time3", "Ctr_time4")), listValues=c(1,-1))

Now I am also interested in the effect of genotype across aging... And the aging effect in general really... And so I am trying to use the likelihood ratio. However, after reading the manuals and searching for similar posts, I still can't understand how I can run this. So here are my specific questions:

From my understanding, I should run something like this

1) dgeLRT_genotype_aging <- DESeq(dge, test="LRT", full = ~???, reduced = ~age) # for the effect of genotype across aging (meaning I take out the effect of age and am left with the effect of genotype)

2) dgeLRT_genotype_aging <- DESeq(dge, test="LRT", full = ~???, reduced = ~genotype) # for the effect of age, independently of genotype (meaning I take out the effect of genotype and am left with the effect of aging)

But my full model was ~group (interaction!), so I really don't understand very well how I define the full model and the reduced model. In every case I see the LRT being used the full model is something like ~condition+genotype, but in this example one would be looking at the effect of genotype whilst controlling for the effect of condition, right? Which is not my interest.

Therefore, my question is how exactly should I define (for the LRT):

a) DESeqDataSetFromMatrix (colData and design)

b) DESeq

c) DESeq LRT (full and reduced)

If you know of any posts on similar studies, I would really appreciate it if you could help me find them.

Thank you in advance!


written 2.1 years ago by I.Castanho
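
For readers with the same question, the relationship between a full and a reduced model can be illustrated in base R. This is a sketch under assumptions, not an answer given in this thread: it assumes the colData carries separate `genotype` and `time` columns (hypothetical names) alongside `group`, and it follows the pattern of the time-series example in the DESeq2 vignette, where the full design contains an interaction term and the reduced design drops it.

```r
# Hypothetical colData: 2 genotypes x 4 time points x 8 replicates.
coldata <- expand.grid(
  rep      = 1:8,
  time     = paste0("time", 1:4),
  genotype = c("Ctr", "Mut"),
  stringsAsFactors = TRUE
)
coldata$group <- factor(paste(coldata$genotype, coldata$time, sep = "_"))

full    <- model.matrix(~ genotype + time + genotype:time, data = coldata)
reduced <- model.matrix(~ genotype + time, data = coldata)
grouped <- model.matrix(~ group, data = coldata)

# Coefficients that the likelihood ratio test would evaluate jointly:
tested <- setdiff(colnames(full), colnames(reduced))
print(tested)

# ~ group and the interaction design describe the same 8 group means:
# stacking their columns side by side adds no new directions.
print(qr(cbind(grouped, full))$rank)
```

Under these assumptions, a call of the form `DESeq(dds, test = "LRT", reduced = ~ genotype + time)` on an object whose design is `~ genotype + time + genotype:time` would test the three interaction columns listed in `tested`, that is, whether the genotype effect changes across time points. Changing which terms are dropped in the reduced formula changes which hypothesis is tested.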

Hi Isabel,

Can you post this as a new question? It will help to organize the thread better. In the new post, can you also include your colData and say specifically which variable in the colData you are referring to when you say "aging"?

written 2.1 years ago by Michael Love

Done. Thank you!

written 2.0 years ago by I.Castanho