Question: DESeq2 design model that considers 3 factors
0
12 months ago by
sandra0
sandra0 wrote:

Hi there,

I am new to DESeq2 and multiple factor design. I have read the DESeq2 vignette and some forum discussions that deal with design questions of multiple factors. But I’m still unsure if my approach is reasonable and/or what the best approach would be.

My experimental design is comprised of 3 tissues. For each tissue, I have 4 genotypes, and for most of the genotypes (but not for all) I have 2 colours. With 3 replicates for each, I have a total of 48 samples with 16 possible condition combinations (as there are not always 2 colours for each genotype in each tissue).

How do I write the design code to control for tissue, genotype and colour?

My matrix looks like this (samples.txt):

 sample_name rep tissue genotype colour 1 A L c g 2 B L c g 3 C L c g 4 A L f g 5 B L f g 6 C L f g 7 A L f w 8 B L f w 9 C L f w 10 A L g g 11 B L g g 12 C L g g 13 A L gxf g 14 B L gxf g 15 C L gxf g 16 A L gxf w 17 B L gxf w 18 C L gxf w 19 A C c g 20 B C c g 21 C C c g 22 A C f g 23 B C f g 24 C C f g 25 A C g g 26 B C g g 27 C C g g 28 A C g w 29 B C g w 30 C C g w 31 A C gxf g 32 B C gxf g 33 C C gxf g 34 A C gxf w 35 B C gxf w 36 C C gxf w 37 A E c g 38 B E c g 39 C E c g 40 A E f w 41 B E f w 42 C E f w 43 A E g g 44 B E g g 45 C E g g 46 A E gxf w 47 B E gxf w 48 C E gxf w

I would like to compare the following:

1. For each tissue:
• a. Compare between the 4 genotypes
• b. Compare between the 2 colours
2. For each genotype:
• a. Compare between the 3 tissues
• b. Compare between the 2 colours
3. For each colour:
• a. Compare between the 3 tissues
• b. Compare between the 4 genotypes

However, I’m mainly interested in 1a and 1b.

1) Initially, I tried using a design with ~group

# created the column ‘group’ in my samples table

samples$group <- factor(paste0(samples$tissue, samples$genotype, samples$colour))

# imported the data from tximport using design = ~group

ddsTxi <- DESeqDataSetFromTximport(txi,colData=samples,design=~group)

# pre-filtered to keep genes with at least 100 counts in total across all samples

keep <- rowSums(counts(ddsTxi)) >= 100

dds <- ddsTxi[keep,]

# differential expression with Wald test

dds <- DESeq(dds)

resultsNames(dds)

> resultsNames(dds)

 [1] "Intercept"                                          "group_Cfg_vs_Ccg"       [3] "group_Cgxfg_vs_Ccg" "group_Cgxfw_vs_Ccg"  [5] "group_Cgg_vs_Ccg"     "group_Cgw_vs_Ccg"      [7] "group_Ecg_vs_Ccg"        "group_Efw_vs_Ccg"          [9] "group_Egxfw_vs_Ccg"    "group_Egg_vs_Ccg"        [11] "group_Lcg_vs_Ccg"          "group_Lfg_vs_Ccg"           [13] "group_Lfw_vs_Ccg"           "group_Lgxfg_vs_Ccg"      [15] "group_Lgxfw_vs_Ccg"      "group_Lgg_vs_Ccg"  

I guess here it takes the Ccg combination as the base level (as I understand this is set alphabetically). However, I would like to compare, for example, all samples of one genotype to all samples of another genotype within a tissue (i.e. 1a).

2) Then I tried using the design = ~colour + tissue + genotype + tissue:genotype

# imported the data from tximport

ddsTxi <- DESeqDataSetFromTximport(txi,colData=samples,design= ~colour + tissue + genotype + tissue:genotype)

# pre-filtered to keep genes with at least 100 counts in total across all samples

keep <- rowSums(counts(ddsTxi)) >= 100

dds <- ddsTxi[keep,]

# differential expression with Wald test

dds <- DESeq(dds)

resultsNames(dds)

> resultsNames(dds)  [1] "Intercept"                     "colour_w_vs_g"         "tissue_E_vs_C"    "tissue_L_vs_C"       [5] "genotype_f_vs_c"          "genotype_g_vs_c"         "genotype_gxf_vs_c"     "tissueE.genotypef"       [9] "tissueL.genotypef"        "tissueE.genotypeg"     "tissueL.genotypeg"       "tissueE.genotypegxf" [13] "tissueL.genotypegxf"  

I think the base level genotype is: c and the base level tissue is: C (as set alphabetically).

I could then obtain the base level genotype effects for tissue C with:

results(dds, contrast=list("genotype_f_vs_c"))

results(dds, contrast=list("genotype_g_vs_c"))

results(dds, contrast=list("genotype_gxf_vs_c"))

But how do I obtain, for example, “genotype_f_vs_g” for tissue C?

The below command lines gave the same results as above:

results(dds, contrast=c("genotype","f", "c"))

results(dds, contrast=c("genotype","g", "c"))

results(dds, contrast=c("genotype","gxf", "c"))

So I tried this for genotype_f_vs_g and it seemed to work:

results(dds, contrast=c("genotype","f", "g"))

Is the above the correct way to obtain "genotype_f_vs_g" for tissue C? Is there another way using the above results(dds, contrast = list()) command?

I would like to do the same for the other two tissues.

The genotype effect for tissue L would be:

results(dds, contrast=list(c("genotype_f_vs_c","tissueL.genotypef")))

results(dds, contrast=list(c("genotype_g_vs_c","tissueL.genotypeg")))

results(dds, contrast=list(c("genotype_gxf_vs_c","tissueL.genotypegxf")))

But how do I obtain, for example, “genotype_f_vs_g” for tissue L?

Thank you in advance and any help will be much appreciated.

Kind regards,

Sandra

modified 12 months ago by Michael Love25k • written 12 months ago by sandra0
Answer: DESeq2 design model that considers 3 factors
0
12 months ago by
Michael Love25k
United States
Michael Love25k wrote:

I don’t have much time these days to provide extensive statistical consulting and experimental design advice on the support site. I can provide software support, but it seems you are already familiar with the DESeq2 results() function and how it can be used to construct contrasts. I’d recommend speaking with someone familiar with linear models and contrasts. There is nothing particular to DESeq2 here.

You say “However, I would like to compare, for example, all samples of one genotype to all samples of another genotype within a tissue (i.e. 1a).”

I don’t see why you can’t use the group design and the ‘contrast’ argument to obtain this. Did you try this?