Question: All Pairwise Comparisons with DESeq2 While Controlling for Batch Effects: Does Reference Level Matter?
1
gravatar for wunderl
5 months ago by
wunderl20
wunderl20 wrote:

I have the following design matrix:

batch treatment
   A         1
   A         2
   A         3
   B         1
   B         2
   B         3
   C         1
   C         2
   C         3

> batch
[1] A A A B B B C C C
Levels: A B C

> treatment
[1] 1 2 3 1 2 3 1 2 3
Levels: 1 2 3

I would like to determine all of the pairwise differences between the three treatments while controlling for batch. There is no "control/baseline" level for either batch or treatment here, so there is no clear choice for the reference levels. So far I have been modeling this using the design ~ batch + treatment. I realize using the default approach in DESeq2 this produces the following:

#model matrix
 (Intercept) batchB batchC treatment2 treatment3
           1      0      0          0          0
           1      0      0          1          0
           1      0      0          0          1
           1      1      0          0          0
           1      1      0          1          0
           1      1      0          0          1
           1      0      1          0          0
           1      0      1          1          0
           1      0      1          0          1

#resultNames(dds)
[1] "Intercept"             "batch_B_vs_A"     "batch_C_vs_A"    
[4] "treatment_2_vs_1" "treatment_3_vs_1"

#Missing comparison
results(dds, contrast = c("treatment", "2", "3"))

My concern with this is that both the reference level for batch and for treatment do not get coefficients in the resulting model but are instead convolved together in the intercept term. I am wondering if this would make the "missing" missing pairwise comparison inaccurate or different from the other two explicitly fit by the model. It also prevents running lfcShrink for the missing comparison when using apeglm.

I am wondering if the following expanded/full model matrix is the appropriate way to handle this situation:

(Intercept) batchA batchB batchC treatment1 treatment2 treatment3
          1      1      0      0          1          0          0
          1      1      0      0          0          1          0
          1      1      0      0          0          0          1
          1      0      1      0          1          0          0
          1      0      1      0          0          1          0
          1      0      1      0          0          0          1
          1      0      0      1          1          0          0
          1      0      0      1          0          1          0
          1      0      0      1          0          0          1

This would theoretically give me coefficients for all 3 levels of both batch and treatment, and would allow me to run lfcShrink for all 3 possible pairwise comparisons between treatments. This would make the intercept term effectively 0 however, and would also remove the reference against with the fold changes are being determined. This all seems problematic.

Is this an appropriate approach, or is there another alternative approach I should be using?

ADD COMMENTlink modified 4 months ago by Michael Love25k • written 5 months ago by wunderl20
Answer: All Pairwise Comparisons with DESeq2 While Controlling for Batch Effects: Does R
0
gravatar for Michael Love
4 months ago by
Michael Love25k
United States
Michael Love25k wrote:

It doesn't make a difference numerically, but it just takes one more step to run lfcShrink. If you look up in the vignette, we have some simple code to produce e.g. the 3 vs 2 effect as one of the coefficients in resultsNames(dds).

ADD COMMENTlink written 4 months ago by Michael Love25k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 208 users visited in the last hour