DESeq2 design model that considers 3 factors
1
1
Entering edit mode
sandra ▴ 10
@sandra-17197
Last seen 5.1 years ago

Hi there,

I am new to DESeq2 and multiple factor design. I have read the DESeq2 vignette and some forum discussions that deal with design questions of multiple factors. But I’m still unsure if my approach is reasonable and/or what the best approach would be.

My experimental design is comprised of 3 tissues. For each tissue, I have 4 genotypes, and for most of the genotypes (but not for all) I have 2 colours. With 3 replicates for each, I have a total of 48 samples with 16 possible condition combinations (as there are not always 2 colours for each genotype in each tissue).

How do I write the design code to control for tissue, genotype and colour?

 

My matrix looks like this (samples.txt):

sample_name rep tissue genotype colour
1 A L c g
2 B L c g
3 C L c g
4 A L f g
5 B L f g
6 C L f g
7 A L f w
8 B L f w
9 C L f w
10 A L g g
11 B L g g
12 C L g g
13 A L gxf g
14 B L gxf g
15 C L gxf g
16 A L gxf w
17 B L gxf w
18 C L gxf w
19 A C c g
20 B C c g
21 C C c g
22 A C f g
23 B C f g
24 C C f g
25 A C g g
26 B C g g
27 C C g g
28 A C g w
29 B C g w
30 C C g w
31 A C gxf g
32 B C gxf g
33 C C gxf g
34 A C gxf w
35 B C gxf w
36 C C gxf w
37 A E c g
38 B E c g
39 C E c g
40 A E f w
41 B E f w
42 C E f w
43 A E g g
44 B E g g
45 C E g g
46 A E gxf w
47 B E gxf w
48 C E gxf w

 

I would like to compare the following:

  1. For each tissue:
    • a. Compare between the 4 genotypes
    • b. Compare between the 2 colours
  2. For each genotype:
    • a. Compare between the 3 tissues
    • b. Compare between the 2 colours
  3. For each colour:
    • a. Compare between the 3 tissues
    • b. Compare between the 4 genotypes

 

However, I’m mainly interested in 1a and 1b.

 

1) Initially, I tried using a design with ~group

# created the column ‘group’ in my samples table 

samples$group <- factor(paste0(samples$tissue, samples$genotype, samples$colour))

# imported the data from tximport using design = ~group

ddsTxi <- DESeqDataSetFromTximport(txi,colData=samples,design=~group)

# pre-filtered to keep genes with at least 100 counts in total across all samples

keep <- rowSums(counts(ddsTxi)) >= 100

dds <- ddsTxi[keep,]

# differential expression with Wald test

dds <- DESeq(dds)

resultsNames(dds)

 

> resultsNames(dds)

 [1] "Intercept"                                          "group_Cfg_vs_Ccg"     
 [3] "group_Cgxfg_vs_Ccg" "group_Cgxfw_vs_Ccg"
 [5] "group_Cgg_vs_Ccg"     "group_Cgw_vs_Ccg"    
 [7] "group_Ecg_vs_Ccg"        "group_Efw_vs_Ccg"        
 [9] "group_Egxfw_vs_Ccg"    "group_Egg_vs_Ccg"       
[11] "group_Lcg_vs_Ccg"          "group_Lfg_vs_Ccg"          
[13] "group_Lfw_vs_Ccg"           "group_Lgxfg_vs_Ccg"     
[15] "group_Lgxfw_vs_Ccg"      "group_Lgg_vs_Ccg"  

 

I guess here it takes the Ccg combination as the base level (as I understand this is set alphabetically). However, I would like to compare, for example, all samples of one genotype to all samples of another genotype within a tissue (i.e. 1a).

 

2) Then I tried using the design = ~colour + tissue + genotype + tissue:genotype

# imported the data from tximport

ddsTxi <- DESeqDataSetFromTximport(txi,colData=samples,design= ~colour + tissue + genotype + tissue:genotype)

# pre-filtered to keep genes with at least 100 counts in total across all samples

keep <- rowSums(counts(ddsTxi)) >= 100

dds <- ddsTxi[keep,]

# differential expression with Wald test

dds <- DESeq(dds)

resultsNames(dds)

 

> resultsNames(dds)
 [1] "Intercept"                     "colour_w_vs_g"         "tissue_E_vs_C"    "tissue_L_vs_C"     
 [5] "genotype_f_vs_c"          "genotype_g_vs_c"         "genotype_gxf_vs_c"     "tissueE.genotypef"     
 [9] "tissueL.genotypef"        "tissueE.genotypeg"     "tissueL.genotypeg"       "tissueE.genotypegxf"
[13] "tissueL.genotypegxf"  

 

I think the base level genotype is: c and the base level tissue is: C (as set alphabetically).

I could then obtain the base level genotype effects for tissue C with:

results(dds, contrast=list("genotype_f_vs_c"))

results(dds, contrast=list("genotype_g_vs_c"))

results(dds, contrast=list("genotype_gxf_vs_c"))

 

But how do I obtain, for example, “genotype_f_vs_g” for tissue C?

 

The below command lines gave the same results as above:

results(dds, contrast=c("genotype","f", "c"))

results(dds, contrast=c("genotype","g", "c"))

results(dds, contrast=c("genotype","gxf", "c"))

 

So I tried this for genotype_f_vs_g and it seemed to work:

results(dds, contrast=c("genotype","f", "g"))

 

Is the above the correct way to obtain "genotype_f_vs_g" for tissue C? Is there another way using the above results(dds, contrast = list()) command?

 

I would like to do the same for the other two tissues.

The genotype effect for tissue L would be:

results(dds, contrast=list(c("genotype_f_vs_c","tissueL.genotypef")))

results(dds, contrast=list(c("genotype_g_vs_c","tissueL.genotypeg")))

results(dds, contrast=list(c("genotype_gxf_vs_c","tissueL.genotypegxf")))

 

But how do I obtain, for example, “genotype_f_vs_g” for tissue L?

 

Thank you in advance and any help will be much appreciated.

 

Kind regards,

Sandra

 

 

deseq2 multiple factor design • 1.8k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 9 hours ago
United States

I don’t have much time these days to provide extensive statistical consulting and experimental design advice on the support site. I can provide software support, but it seems you are already familiar with the DESeq2 results() function and how it can be used to construct contrasts. I’d recommend speaking with someone familiar with linear models and contrasts. There is nothing particular to DESeq2 here.

You say “However, I would like to compare, for example, all samples of one genotype to all samples of another genotype within a tissue (i.e. 1a).”

I don’t see why you can’t use the group design and the ‘contrast’ argument to obtain this. Did you try this?

ADD COMMENT
0
Entering edit mode

Hi Michael, Thank you very much for your response and sorry I hadn’t replied before - I’ve been on maternity leave.

Regarding your suggestion to try group design and the ‘contrast’ argument. I have indeed tried this but I only get 16 result tables (please see resultsNames(dds) above under 1).

All of my samples are compared to only one sample, which is the base level sample Ccg. However, some sample comparisons are missing in the list, for example Ecg _vs _Efw.

Hence, I’ve tried using the design = ~colour + tissue + genotype + tissue:genotype (explained above under 2) but I ran into problems there too. For example, I don’t know how to obtain the comparison “genotype _f _vs _g” for the tissue L?

Could you please explain why DESeq2 doesn’t compare all samples with each other, but compares samples only against the base level sample? And is there a way to obtain the other comparisons?

Kind regards and many thanks, Sandra

ADD REPLY
0
Entering edit mode

some sample comparisons are missing in the list, for example Ecg vs Efw

results(dds, contrast=c(“group”,”ecg”,”efw”))

ADD REPLY
0
Entering edit mode

Thank you for your quick reply!

So even when the table "group _Ecg _vs _Efw" is missing under resultsNames(dds), I can still get the comparison using the 'contrast' argument? I didn't realise that's possible.

Thanks again!

ADD REPLY

Login before adding your answer.

Traffic: 427 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6