**20**wrote:

I have the following design matrix:

```
batch treatment
A 1
A 2
A 3
B 1
B 2
B 3
C 1
C 2
C 3
> batch
[1] A A A B B B C C C
Levels: A B C
> treatment
[1] 1 2 3 1 2 3 1 2 3
Levels: 1 2 3
```

I would like to determine all of the pairwise differences between the three treatments while controlling for batch. There is no "control/baseline" level for either `batch`

or `treatment`

here, so there is no clear choice for the reference levels. So far I have been modeling this using the design `~ batch + treatment`

. I realize using the default approach in `DESeq2`

this produces the following:

```
#model matrix
(Intercept) batchB batchC treatment2 treatment3
1 0 0 0 0
1 0 0 1 0
1 0 0 0 1
1 1 0 0 0
1 1 0 1 0
1 1 0 0 1
1 0 1 0 0
1 0 1 1 0
1 0 1 0 1
#resultNames(dds)
[1] "Intercept" "batch_B_vs_A" "batch_C_vs_A"
[4] "treatment_2_vs_1" "treatment_3_vs_1"
#Missing comparison
results(dds, contrast = c("treatment", "2", "3"))
```

My concern with this is that both the reference level for `batch`

and for `treatment`

do not get coefficients in the resulting model but are instead convolved together in the `intercept`

term. I am wondering if this would make the "missing" missing pairwise comparison inaccurate or different from the other two explicitly fit by the model. It also prevents running `lfcShrink`

for the missing comparison when using `apeglm`

.

I am wondering if the following expanded/full model matrix is the appropriate way to handle this situation:

```
(Intercept) batchA batchB batchC treatment1 treatment2 treatment3
1 1 0 0 1 0 0
1 1 0 0 0 1 0
1 1 0 0 0 0 1
1 0 1 0 1 0 0
1 0 1 0 0 1 0
1 0 1 0 0 0 1
1 0 0 1 1 0 0
1 0 0 1 0 1 0
1 0 0 1 0 0 1
```

This would theoretically give me coefficients for all 3 levels of both `batch`

and `treatment`

, and would allow me to run `lfcShrink`

for all 3 possible pairwise comparisons between `treatments`

. This would make the intercept term effectively `0`

however, and would also remove the reference against with the fold changes are being determined. This all seems problematic.

Is this an appropriate approach, or is there another alternative approach I should be using?

**25k**• written 3 months ago by wunderl •

**20**