Hi there,
I am new to DESeq2 and multiple factor design. I have read the DESeq2 vignette and some forum discussions that deal with design questions of multiple factors. But I’m still unsure if my approach is reasonable and/or what the best approach would be.
My experimental design comprises 3 tissues. For each tissue, I have 4 genotypes, and for most (but not all) of the genotypes I have 2 colours. With 3 replicates each, I have a total of 48 samples in 16 condition combinations (not every genotype has both colours in every tissue).
How do I write the design code to control for tissue, genotype and colour?
My matrix looks like this (samples.txt):
I would like to compare the following:
1. For each tissue:
   a. Compare between the 4 genotypes
   b. Compare between the 2 colours
2. For each genotype:
   a. Compare between the 3 tissues
   b. Compare between the 2 colours
3. For each colour:
   a. Compare between the 3 tissues
   b. Compare between the 4 genotypes
However, I’m mainly interested in 1a and 1b.
1) Initially, I tried using a design with ~group
# created the column 'group' in my samples table
samples$group <- factor(paste0(samples$tissue, samples$genotype, samples$colour))
# imported the data from tximport using design = ~group
ddsTxi <- DESeqDataSetFromTximport(txi, colData = samples, design = ~group)
# pre-filtered to keep genes with at least 100 counts in total across all samples
keep <- rowSums(counts(ddsTxi)) >= 100
dds <- ddsTxi[keep,]
# differential expression with Wald test
dds <- DESeq(dds)
resultsNames(dds)
> resultsNames(dds)
[1] "Intercept" "group_Cfg_vs_Ccg"
[3] "group_Cgxfg_vs_Ccg" "group_Cgxfw_vs_Ccg"
[5] "group_Cgg_vs_Ccg" "group_Cgw_vs_Ccg"
[7] "group_Ecg_vs_Ccg" "group_Efw_vs_Ccg"
[9] "group_Egxfw_vs_Ccg" "group_Egg_vs_Ccg"
[11] "group_Lcg_vs_Ccg" "group_Lfg_vs_Ccg"
[13] "group_Lfw_vs_Ccg" "group_Lgxfg_vs_Ccg"
[15] "group_Lgxfw_vs_Ccg" "group_Lgg_vs_Ccg"
I guess here it takes the Ccg combination as the base level (as I understand this is set alphabetically). However, I would like to compare, for example, all samples of one genotype to all samples of another genotype within a tissue (i.e. 1a).
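One way to read "all samples of one genotype versus all samples of another within a tissue" is as an average over the colour groups of each genotype. The base-R sketch below uses simulated numbers for a single gene and only four of the tissue C groups from the post (the values and the use of `lm` in place of DESeq2's model are assumptions for illustration). It shows that a weighted combination of the `~group` coefficients reproduces that averaged comparison.

```r
# Hypothetical data: four tissue-C groups, 3 replicates each, one gene.
group <- factor(rep(c("Ccg", "Cfg", "Cgg", "Cgw"), each = 3),
                levels = c("Ccg", "Cfg", "Cgg", "Cgw"))
set.seed(1)
true_mean <- c(Ccg = 5, Cfg = 6, Cgg = 8, Cgw = 9)   # made-up values
y <- rnorm(length(group), mean = true_mean[as.character(group)], sd = 0.1)

# Same parameterisation as design = ~group: intercept plus "vs Ccg" terms.
fit  <- lm(y ~ group)
beta <- coef(fit)

# Genotype g (mean of its two colour groups) vs genotype f, within tissue C.
w   <- c("(Intercept)" = 0, groupCfg = -1, groupCgg = 0.5, groupCgw = 0.5)
est <- sum(w * beta[names(w)])

# Direct calculation from the per-group sample means.
group_means <- tapply(y, group, mean)
direct <- mean(group_means[c("Cgg", "Cgw")]) - group_means[["Cfg"]]

isTRUE(all.equal(est, direct))
```

With the fitted `dds` from the `~group` design, the corresponding call would presumably be `results(dds, contrast = list(c("group_Cgg_vs_Ccg", "group_Cgw_vs_Ccg"), "group_Cfg_vs_Ccg"), listValues = c(1/2, -1))`. Whether averaging over colours is appropriate depends on the biological question.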
2) Then I tried using the design = ~colour + tissue + genotype + tissue:genotype
# imported the data from tximport
ddsTxi <- DESeqDataSetFromTximport(txi, colData = samples, design = ~colour + tissue + genotype + tissue:genotype)
# pre-filtered to keep genes with at least 100 counts in total across all samples
keep <- rowSums(counts(ddsTxi)) >= 100
dds <- ddsTxi[keep,]
# differential expression with Wald test
dds <- DESeq(dds)
resultsNames(dds)
> resultsNames(dds)
[1] "Intercept" "colour_w_vs_g" "tissue_E_vs_C" "tissue_L_vs_C"
[5] "genotype_f_vs_c" "genotype_g_vs_c" "genotype_gxf_vs_c" "tissueE.genotypef"
[9] "tissueL.genotypef" "tissueE.genotypeg" "tissueL.genotypeg" "tissueE.genotypegxf"
[13] "tissueL.genotypegxf"
I think the base level genotype is: c and the base level tissue is: C (as set alphabetically).
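As a small illustration of how the base level is chosen, the sketch below builds a toy genotype factor (the values are made up) and shows that `factor()` sorts the levels, so the first one becomes the reference, and that `relevel()` changes it.

```r
# Toy genotype vector using the genotype labels from the post.
genotype <- factor(c("gxf", "g", "f", "c", "g", "c"))

# Levels are sorted; the first level is the reference (base) level.
levels(genotype)

# Make genotype g the reference instead.
genotype_g_ref <- relevel(genotype, ref = "g")
levels(genotype_g_ref)
```

If the genotype column of `colData` were releveled like this before running `DESeq()`, the coefficient names would be expressed relative to g, so an f versus g term for the reference tissue would appear directly in `resultsNames(dds)`. The model has to be refitted after changing the reference level.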
I could then obtain the base level genotype effects for tissue C with:
results(dds, contrast=list("genotype_f_vs_c"))
results(dds, contrast=list("genotype_g_vs_c"))
results(dds, contrast=list("genotype_gxf_vs_c"))
But how do I obtain, for example, “genotype_f_vs_g” for tissue C?
The below command lines gave the same results as above:
results(dds, contrast=c("genotype","f", "c"))
results(dds, contrast=c("genotype","g", "c"))
results(dds, contrast=c("genotype","gxf", "c"))
So I tried this for genotype_f_vs_g and it seemed to work:
results(dds, contrast=c("genotype","f", "g"))
Is the above the correct way to obtain "genotype_f_vs_g" for tissue C? Is there another way using the above results(dds, contrast = list()) command?
I would like to do the same for the other two tissues.
The genotype effect for tissue L would be:
results(dds, contrast=list(c("genotype_f_vs_c","tissueL.genotypef")))
results(dds, contrast=list(c("genotype_g_vs_c","tissueL.genotypeg")))
results(dds, contrast=list(c("genotype_gxf_vs_c","tissueL.genotypegxf")))
But how do I obtain, for example, “genotype_f_vs_g” for tissue L?
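To make the arithmetic behind this question concrete, here is a base-R sketch with simulated values for one gene. It uses two tissues and three genotypes only, leaves colour out, and uses `lm` as a stand-in for DESeq2's model (all of these are simplifying assumptions). In an interaction design, the f versus g effect in a non-reference tissue is the difference between two sums: the main genotype effect plus the matching interaction term, for each genotype.

```r
# Hypothetical balanced layout: 2 tissues x 3 genotypes x 3 replicates.
design <- expand.grid(rep = 1:3, genotype = c("c", "f", "g"),
                      tissue = c("C", "L"), stringsAsFactors = FALSE)
design$tissue   <- factor(design$tissue,   levels = c("C", "L"))
design$genotype <- factor(design$genotype, levels = c("c", "f", "g"))

set.seed(2)
cell_mean <- c(C.c = 5, C.f = 6, C.g = 7, L.c = 5.5, L.f = 9, L.g = 7.5)  # made up
cell <- paste(design$tissue, design$genotype, sep = ".")
design$y <- rnorm(nrow(design), mean = cell_mean[cell], sd = 0.1)

fit <- lm(y ~ tissue + genotype + tissue:genotype, data = design)
b <- coef(fit)

# f vs g in reference tissue C: difference of the main effects only.
f_vs_g_in_C <- b[["genotypef"]] - b[["genotypeg"]]

# f vs g in tissue L: (main + interaction for f) minus (main + interaction for g).
f_vs_g_in_L <- (b[["genotypef"]] + b[["tissueL:genotypef"]]) -
               (b[["genotypeg"]] + b[["tissueL:genotypeg"]])

# Check against the observed cell means.
m <- tapply(design$y, cell, mean)
c(isTRUE(all.equal(f_vs_g_in_C, m[["C.f"]] - m[["C.g"]])),
  isTRUE(all.equal(f_vs_g_in_L, m[["L.f"]] - m[["L.g"]])))
```

Translated to the coefficient names printed by `resultsNames(dds)` above, the tissue L comparison would presumably be `results(dds, contrast = list(c("genotype_f_vs_c", "tissueL.genotypef"), c("genotype_g_vs_c", "tissueL.genotypeg")))`, where the first list element is the numerator and the second the denominator.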
Thank you in advance and any help will be much appreciated.
Kind regards,
Sandra
Hi Michael, Thank you very much for your response and sorry I hadn’t replied before - I’ve been on maternity leave.
Regarding your suggestion to try group design and the ‘contrast’ argument. I have indeed tried this but I only get 16 result tables (please see resultsNames(dds) above under 1).
All of my samples are compared to only one sample, which is the base level sample Ccg. However, some sample comparisons are missing in the list, for example Ecg_vs_Efw.
Hence, I’ve tried using the design = ~colour + tissue + genotype + tissue:genotype (explained above under 2) but I ran into problems there too. For example, I don’t know how to obtain the comparison “genotype_f_vs_g” for the tissue L?
Could you please explain why DESeq2 doesn’t compare all samples with each other, but compares samples only against the base level sample? And is there a way to obtain the other comparisons?
Kind regards and many thanks, Sandra
results(dds, contrast=c("group","Ecg","Efw"))
Thank you for your quick reply!
So even when the table "group_Ecg_vs_Efw" is missing under resultsNames(dds), I can still get the comparison using the 'contrast' argument? I didn't realise that's possible.
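For anyone wondering why this works: a `~group` model has one intercept plus one coefficient per non-reference level, so only "versus reference" terms are listed, yet any pair of levels can be compared as a difference of two of those coefficients. The base-R sketch below rebuilds the 16 group labels from the output above (3 replicates each, as in the post) and checks this on the design matrix; it does not use DESeq2 itself.

```r
# The 16 group labels shown by resultsNames(dds), reference level first.
levels_group <- c("Ccg", "Cfg", "Cgxfg", "Cgxfw", "Cgg", "Cgw",
                  "Ecg", "Efw", "Egxfw", "Egg",
                  "Lcg", "Lfg", "Lfw", "Lgxfg", "Lgxfw", "Lgg")
group <- factor(rep(levels_group, each = 3), levels = levels_group)

# Design matrix for ~group: intercept + 15 "vs Ccg" columns.
X <- model.matrix(~ group)
n_coef  <- ncol(X)                            # 16 coefficients
n_pairs <- choose(length(levels_group), 2)    # 120 possible pairwise comparisons

# Contrast vector for Ecg vs Efw: +1 on the Ecg term, -1 on the Efw term.
w <- setNames(numeric(ncol(X)), colnames(X))
w["groupEcg"] <- 1
w["groupEfw"] <- -1

# The same vector arises as (design row of an Ecg sample) - (row of an Efw sample),
# so the intercept and the reference level cancel out.
row_Ecg <- X[match("Ecg", group), ]
row_Efw <- X[match("Efw", group), ]
isTRUE(all.equal(row_Ecg - row_Efw, w))
```

So the 16 fitted coefficients are enough to answer all 120 pairwise questions; `results(dds, contrast = c("group", "Ecg", "Efw"))` asks for one such difference.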
Thanks again!