Hi!
I have problems understanding the contrasts
argument of the results
function in DESeq2
.
This is my dataset:
dds <- makeExampleDESeqDataSet(m = 18)
dds$condition = factor(c("exp", "exp", "exp", "stat", "stat", "stat", "pla", "pla", "pla", "ser.stat", "ser.stat", "ser.stat", "ser.exp", "ser.exp", "ser.exp", "rif.exp", "rif.exp", "rif.exp"))
design(dds) <- ~0 + condition # 0 because no intercept ("control")
dds <- DESeq(dds, betaPrior = FALSE) # betaPrior = FALSE because no intercept
I have 18 samples, 6 conditions with 3 replicates each. I have now a bit of a mixed dataset when it comes to a "control" condition. For example, my "exp" and "stat" are two different treatments for which I would like to perform the DGE, but there's also the comparison of "exp" vs. "rif.exp", for which "exp" would be the control (compound not added) for "rif.exp" (compound added). I therefore decided not to set an intercept, because I do not have a control for all conditions.
Therefore, the following
> resultsNames(dds)
[1] "conditionexp" "conditionpla" "conditionrif.exp" "conditionser.exp" "conditionser.stat" "conditionstat"
returns 6 conditions.
When I now want my results for one comparison, I run:
padj.cutoff = 0.05
lfc.cutoff = log2(2)
res <- results(dds,
contrast=c("condition", "exp", "stat"),
alpha = padj.cutoff, lfcThreshold = lfc.cutoff)
I then cannot run lfcShink
because I do not have an intercept. So far so good!
Then, I want to make the comparison between two conditions (averaged) - "exp" & "stat" vs "pla". Now, I don't understand how to specify the contrast
, and what nominator/denominator means for listValues
.
I am not sure if the following code is correct, but here it is:
results(dds,
contrast=list("conditionpla", c("conditionexp","conditionstat")),
listValues=c(1, -1/3),
alpha = padj.cutoff, lfcThreshold = lfc.cutoff)
Would someone please let me know if my thoughts on the intercept are right, and how I specify the contrast for one condition vs multiple conditions? I could of course run the code twice, once for my one-vs-one condition and once for my one-vs-multiple conditions, or is this cheating because my pvalue
are not adjusted correctly, ie. would this be cheating?
Thank you!
Thank you, dear Michael, and thank you for being so active in this forum in general! Could you have a look at this line, in which I wand to compare one vs. two conditions combined:
And tell me if this is correct?
And, what is the difference between
And
The first one gives me for
results
:And the second one
And the second one is the one I want I believe. Bit it just has
conditionexp
andconditionstat
switched? Does the order of the arguments matter?Order shouldn’t matter. Can you check again that the first line of code gives the three coefficients with “+”? Eg if you copy the code from the comment back into R you get the coefficients added and no 0.5?
If you can make a reproducible example using makeSimulatedDESeqDataSet with the same design as yours, that’s even better.
Hi Michael, sorry for not getting back to this earlier. Here is my example:
And here's the
results
that give me a different output based on the order of factors:And this is what I don't understand -.- My feeling is that the second
results
is the correct way, but why?EDIT: I apparently constructed the
list
forcontrasts
wrong.Really is not the same as:
Ah, that's it, you've solved the riddle! Yes
c()
will just concatenate everything, while you want two list elements: list(x, y) where y is a vector.thanks again for you help! 🧙