Importance of Order in Design Formula
@dennism9251-21198

Can someone quickly clarify whether the order of terms matters in a design formula? In this post Anderson nicely explains that the design below gives the "Effect of treatment, accounting for the sample pairing"

~ Patient.ID + Treatment


But I thought that the variables included in the design formula (written in Wilkinson notation) simply listed the factors you wanted to take into account when fitting the linear regression model.

What would be the difference between:

"Effect of treatment, accounting for the sample pairing"

~ Patient.ID + Treatment


and "Effect of sample pairing, accounting for the treatment"

~ Treatment + Patient.ID


The same question extends to interactions (Patient.ID:Treatment and Treatment:Patient.ID).

deseq2 • 3.3k views

Also, the design matrix does not change (apart from the order of its columns) when the terms are given in a different order to model.matrix().
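For anyone who wants to check this, here is a minimal base-R sketch. The `coldata` table (three patients, each sampled under Ctrl and Trt) is made up for illustration and is not from the original post. It suggests that the two formulas produce the same set of columns, just arranged in a different order.

```r
# Hypothetical paired design: 3 patients, each with a Ctrl and a Trt sample.
coldata <- data.frame(
  Patient.ID = factor(rep(c("P1", "P2", "P3"), each = 2)),
  Treatment  = factor(rep(c("Ctrl", "Trt"), times = 3),
                      levels = c("Ctrl", "Trt"))
)

# Build the design matrix with the terms in both orders.
mm1 <- model.matrix(~ Patient.ID + Treatment, data = coldata)
mm2 <- model.matrix(~ Treatment + Patient.ID, data = coldata)

# Column names follow the order of the terms in the formula.
print(colnames(mm1))
print(colnames(mm2))

# Same set of columns in both matrices ...
same_columns <- setequal(colnames(mm1), colnames(mm2))
# ... and identical entries once mm1 is put into mm2's column order.
same_values <- all(mm1[, colnames(mm2)] == mm2)

print(same_columns)
print(same_values)
```

So the information in the matrix is the same either way; only the position of the Treatment column moves.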

swbarnes2 ★ 1.4k
@swbarnes2-14086

https://rdrr.io/bioc/DESeq2/man/results.html

If results is run without specifying contrast or name, it will return the comparison of the last level of the last variable in the design formula over the first level of this variable.

If you specify the contrast you want, order doesn't matter.
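To make the "last variable" rule concrete, here is a rough base-R sketch. It does not call DESeq2 at all. It uses the last column of `model.matrix()` as a stand-in for "last level of the last variable", which is an assumption made purely for illustration. The `coldata` table and the `last_coef()` helper are hypothetical.

```r
# Hypothetical paired design: 3 patients, each with a Ctrl and a Trt sample.
coldata <- data.frame(
  Patient.ID = factor(rep(c("P1", "P2", "P3"), each = 2)),
  Treatment  = factor(rep(c("Ctrl", "Trt"), times = 3),
                      levels = c("Ctrl", "Trt"))
)

# Hypothetical helper: name of the last column of the design matrix,
# used here only as a proxy for the comparison a default call would pick.
last_coef <- function(design, data) {
  mm <- model.matrix(design, data = data)
  colnames(mm)[ncol(mm)]
}

default1 <- last_coef(~ Patient.ID + Treatment, coldata)
default2 <- last_coef(~ Treatment + Patient.ID, coldata)

print(default1)  # a Treatment coefficient
print(default2)  # a Patient.ID coefficient
```

With Treatment last, the final coefficient is a treatment comparison. With Patient.ID last, it is a comparison between two patients. That is why a default call can report different comparisons for the two designs, while an explicitly named contrast does not depend on the order.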

Thank you for your answer, but the focus of this question isn't really results() or choosing the correct resultsNames/contrast. I was wondering whether calling results() with the same contrast/name on two DESeqDataSets whose designs differ only in term order would change the outcome. To clarify my question:

design(dds1) <- ~ Patient.ID + Treatment
design(dds2) <- ~ Treatment + Patient.ID
dds1 <- DESeq(dds1)
dds2 <- DESeq(dds2)
res1 <- results(dds1)  # Yes, I know no contrast/name is specified
res2 <- results(dds2)


Is res1 == res2?

Also, while on the topic of specifying contrasts: is there a difference between the two calls shown below?

results(dds, contrast=c("condition", "Trt", "Ctrl"))
results(dds, name="condition_Trt_vs_Ctrl")

As swbarnes2 points out above, the order of the terms in the design doesn't matter once you specify the contrast you want.

When someone pulls out the coefficient associated with variable x with a design formula ~z + x, they will often write about "the effect of x, controlling for z" or "...while adjusting for z", etc. Or you could be more explicit and say "the coefficient associated with x, in a linear model including terms for z and x".
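As a small illustration of "the coefficient associated with x, in a linear model including terms for z and x", here is a sketch using an ordinary `lm()` fit on made-up numbers. This is only an analogy: DESeq2 fits a negative binomial GLM per gene, not `lm()`, and the data frame `d` and its `y` values are invented for the example.

```r
# Hypothetical paired data: 3 patients, each with a Ctrl and a Trt measurement.
d <- data.frame(
  Patient.ID = factor(rep(c("P1", "P2", "P3"), each = 2)),
  Treatment  = factor(rep(c("Ctrl", "Trt"), times = 3),
                      levels = c("Ctrl", "Trt")),
  y          = c(5.1, 6.0, 4.8, 6.1, 5.5, 6.3)  # made-up response values
)

# Same model, terms written in the two possible orders.
fit_zx <- lm(y ~ Patient.ID + Treatment, data = d)
fit_xz <- lm(y ~ Treatment + Patient.ID, data = d)

# The Treatment coefficient ("effect of x, adjusting for z") from each fit.
trt_zx <- unname(coef(fit_zx)["TreatmentTrt"])
trt_xz <- unname(coef(fit_xz)["TreatmentTrt"])
print(c(trt_zx, trt_xz))

# All coefficients agree once they are matched by name.
coefs_match <- isTRUE(all.equal(coef(fit_zx)[names(coef(fit_xz))],
                                coef(fit_xz)))
print(coefs_match)
```

Both fits describe the same model, so the Treatment coefficient is the same whichever way the formula is written; only its position in the coefficient vector differs.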

Wallace • 0
@63cbfce3

Thank you for this!