This is a radically simplified version of: Confused by "composite" designs. (After I read the comments and answers that earlier post received, I realized that my query had conflated several different questions, which made it pretty confusing. In this post I attempt to narrow the focus to only one question, using a simpler, even if less realistic, example. The phrasing of this second attempt is sufficiently different from that of the first one that I thought it would be too confusing to just edit or update the original post in-place. Hence this second post.)
This is a hypothetical example. I have 10 samples, as described in the metadata table below:
    > metadata
       drug dose
    1  NONE  NaN
    2     A    1
    3     A   10
    4     A  100
    5     B    1
    6     B   10
    7     B  100
    8     C    1
    9     C   10
    10    C  100
Now, I create two deseq datasets, with different designs:
    dds1 <- DESeq(DESeqDataSetFromMatrix(countData = counts,
                                         colData = metadata,
                                         design = ~ drug))
    dds2 <- DESeq(DESeqDataSetFromMatrix(countData = counts,
                                         colData = metadata,
                                         design = ~ drug + dose))
Finally, I compute two sets of results, as follows:
    results1 <- list(A = results(dds1, contrast = c("drugA", "drugNONE")),
                     B = results(dds1, contrast = c("drugB", "drugNONE")),
                     C = results(dds1, contrast = c("drugC", "drugNONE")))
    results2 <- list(A = results(dds2, contrast = c("drugA", "drugNONE")),
                     B = results(dds2, contrast = c("drugB", "drugNONE")),
                     C = results(dds2, contrast = c("drugC", "drugNONE")))
The two RHS expressions above are identical, except that one uses dds1 (hence, design = ~ drug) and the other uses dds2 (hence, design = ~ drug + dose).
How do results1 and results2 differ? I don't mean, how do their values differ, but rather how do they differ semantically?
What I'm trying to get at is whether the interpretation of the expression contrast = c("drugA", "drugNONE") is context-dependent or not, and if it is, how.
Here, the only context differences I'm interested in are those resulting from differences in the formulae used as values for the design parameter.
(Of course, in some trivial ways, the meaning of the expression contrast = c("drugA", "drugNONE") is very much context-dependent. For example, if

    dds0 <- DESeq(DESeqDataSetFromMatrix(countData = counts,
                                         colData = metadata,
                                         design = ~ dose))

...then the expression results(dds0, contrast = c("drugA", "drugNONE")) is downright invalid, so in this case it may be fair to say that the expression contrast = c("drugA", "drugNONE") is basically meaningless.)
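To make the "context" concrete, here is a minimal sketch of the design-matrix columns that each formula produces. It uses only base R's model.matrix(), not DESeq2, so it does not settle what results() does with a contrast. It only shows which coefficients exist under each design. The metadata reconstruction is an assumption on my part: I treat drug as a factor with NONE as its first (reference) level and dose as numeric.

```r
# Hypothetical reconstruction of the metadata table above.
# Assumptions: drug is a factor with NONE as the reference level,
# and dose is a numeric covariate (NaN for the untreated sample).
metadata <- data.frame(
  drug = factor(c("NONE", rep(c("A", "B", "C"), each = 3)),
                levels = c("NONE", "A", "B", "C")),
  dose = c(NaN, rep(c(1, 10, 100), times = 3))
)

# Column names of the design matrix for each formula (base R only).
# For formulas that involve dose, model.matrix() drops the sample whose
# dose is NaN, because the default na.action is na.omit.
cols_drug      <- colnames(model.matrix(~ drug, data = metadata))
cols_drug_dose <- colnames(model.matrix(~ drug + dose, data = metadata))
cols_dose      <- colnames(model.matrix(~ dose, data = metadata))

print(cols_drug)       # coefficients available under design = ~ drug
print(cols_drug_dose)  # coefficients available under design = ~ drug + dose
print(cols_dose)       # no drug coefficients at all under design = ~ dose
```

Under these assumptions, the drug columns carry the same names in the first two designs and are absent in the third, which is the sense in which the contrast has nothing to refer to with design = ~ dose.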
As a further wrinkle, suppose that I now define
    dds3 <- DESeq(DESeqDataSetFromMatrix(countData = counts,
                                         colData = metadata,
                                         design = ~ dose + drug))
    results3 <- list(A = results(dds3, contrast = c("drugA", "drugNONE")),
                     B = results(dds3, contrast = c("drugB", "drugNONE")),
                     C = results(dds3, contrast = c("drugC", "drugNONE")))
Now, the only difference between dds2 and dds3 is in the ordering of the factors in the design formulas. Is there any semantic difference between results2 and results3?
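As a rough check on the ordering question, the following sketch compares the design matrices for ~ drug + dose and ~ dose + drug using base R's model.matrix() only. It says nothing about DESeq2's internals. It just tests whether the two formulas yield the same columns in a different order. As before, the metadata reconstruction (drug as a factor with NONE as reference, dose as numeric) is my assumption.

```r
# Hypothetical reconstruction of the metadata table (same assumptions:
# drug is a factor with NONE first, dose is numeric with NaN for NONE).
metadata <- data.frame(
  drug = factor(c("NONE", rep(c("A", "B", "C"), each = 3)),
                levels = c("NONE", "A", "B", "C")),
  dose = c(NaN, rep(c(1, 10, 100), times = 3))
)

# Design matrices for the two term orderings (base R only).
# The sample with NaN dose is dropped by the default na.action (na.omit).
mm2 <- model.matrix(~ drug + dose, data = metadata)
mm3 <- model.matrix(~ dose + drug, data = metadata)

print(colnames(mm2))
print(colnames(mm3))

# Same set of columns, regardless of order?
same_columns <- setequal(colnames(mm2), colnames(mm3))

# Same values once mm2's columns are put in mm3's order?
same_values <- all(mm2[, colnames(mm3)] == mm3)

print(c(same_columns = same_columns, same_values = same_values))
```

If both checks come back TRUE, the two designs describe the same model up to column order, at least at the level of the design matrix. Whether results() treats them identically is exactly what I am asking.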