Hi,
I have a design with three variables: tissue, time and phenotype, where the tissue has three levels (X, Y and Z, where X is the reference level), time as three levels (12h, 24h and 48h, where 12h is the reference level), and phenotype has three levels (ctrl, A and B, where ctrl is the reference level). I have added the design below my post (first column is just the sample ID) as the dput. I am using the following formula to consider all possible variables and interaction terms in the dds object:
~ tissue * time * phenotype
The reason I am doing this is because I want to generate a single dds object, and then be able to use the contrasts to retrieve the differential gene expression for any combination of these variables. For example, I could then want to get differential gene expression for (tissue Y time 48h phenotype B) vs (tissue Y time 48h phenotype ctrl), or another example could be (tissue X time 24h phenotype A) vs (tissue X time 24h phenotype ctrl). Please note that I do know how to run DESeq2 for simple pairwise comparisons, but this is not the goal of this post.
So right now with this formula the resultsNames(dds) are:
tissueY
tissueZ
time24h
time48h
phenotypeA
phenotypeB
tissueY:time24h
tissueZ:time24h
tissueY:time48h
tissueZ:time48h
tissueY:phenotypeA
tissueZ:phenotypeA
tissueY:phenotypeB
tissueZ:phenotypeB
time24h:phenotypeA
time48h:phenotypeA
time24h:phenotypeB
time48h:phenotypeB
tissueY:time24h:phenotypeA
tissueZ:time24h:phenotypeA
tissueY:time48h:phenotypeA
tissueZ:time48h:phenotypeA
tissueY:time24h:phenotypeB
tissueZ:time24h:phenotypeB
tissueY:time48h:phenotypeB
tissueZ:time48h:phenotypeB
To extract differential expression between for example (tissue Y time 48h phenotype B) and (tissue Y time 48h phenotype ctrl), what would be the correct contrasts?
Two of my guesses were these:
results(dds, contrast = list(c("tissueY:time48h", "tissueY:time48h:phenotypeB")))
or
results(dds, contrast = list(c("phenotypeB", "tissueY:time48h:phenotypeB")))
Are one of these correct? Is it something else? And what would it become if were are interested in one of the reference levels, for example differential expression between (tissue X time 24h phenotype A) and (tissue X time 24h phenotype ctrl), where tissue X was set as the reference?
Thank you!!
dput(design) structure(list(tissue = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("X", "Y", "Z"), class = "factor"), time = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("12h", "24h", "48h"), class = "factor"), phenotype = structure(c(3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L ), .Label = c("A", "B", "ctrl"), class = "factor")), class = "data.frame", row.names = c("s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8", "s9", "s10", "s11", "s12", "s13", "s14", "s15", "s16", "s17", "s18", "s19", "s20", "s21", "s22", "s23", "s24", "s25", "s26", "s27", "s28", "s29", "s30", "s31", "s32", "s33", "s34", "s35", "s36", "s37", "s38", "s39", "s40", "s41", "s42", "s43", "s44", "s45", "s46", "s47", "s48", "s49", "s50", "s51", "s52", "s53", "s54", "s55", "s56", "s57", "s58", "s59", "s60", "s61", "s62", "s63", "s64", "s65", "s66", "s67", "s68", "s69", "s70", "s71", "s72", "s73", "s74", "s75", "s76", "s77", "s78", "s79", "s80", "s81"))
The point of starting with all interactions is not to guess the contrasts. I am developing a function that uses DESeq as part of it, but this function needs to be flexible and return a dds object from which the user can extract differential expression using the appropriate contrasts, and an example of this could be the one I mentioned in my post. I need to test it to make sure that it is giving me the same results as a simple pairwise comparison. I know how to do simple pairwise comparisons, as in the DESeq2 documentation. However, this is not the point of my post. I hope you understand. Thank you
Edited my reply for typos
I see. Sorry that I missed this line in your initial post, "Please note that I do know how to run DESeq2 for simple pairwise comparisons, but this is not the goal of this post."
I don't have much spare time to work out which pairwise comparisons are reflected by which names in the model matrix. You could create the model matrix and examine the 1's and 0's in the matrix to get a better sense of what coefficients represent which pairwise comparisons.
I added that line to clarify it. I will try doing what you suggested. I am a beginner with using model matrices and contrasts, so this is why I wanted to make sure that I am doing it properly.
(deleted, the same message posted twice)