Hello,
I was hoping someone could help me determine which of the following contrast formulations is correct. Like others, I am getting confused about what to specify when one of the treatments being contrasted is the reference level.
Here is an example which aligns with my real dataset. In this case, I did RNASeq on samples collected from different Donors, each of which was given a different treatment. I am interested in the contrasts between each pair of treatments, while blocking for Donor. Specifically, I want to test for DEGs for B vs. A, C vs. A, and C vs. B.
> eg = data.frame(Donor = rep(letters[1:5], each = 3),
+            Trt = rep(LETTERS[1:3], 5))
> eg
   Donor Trt
1      a   A
2      a   B
3      a   C
4      b   A
5      b   B
6      b   C
7      c   A
8      c   B
9      c   C
10     d   A
11     d   B
12     d   C
13     e   A
14     e   B
15     e   C
In the edgeR manual, it suggests putting blocking variables first, and treatment variables second, into the design matrix. The first two attempts at building a design matrix and contrasts below follow this paradigm. They differ by the exclusion of the intercept. While I think I understand how the intercept (or not) works when there are only treatment variables, I realize I am not so sure how it works when there are also blocking variables.
It is clearer to me to do the opposite, however, and also provide no intercept, so that I can specify A, B, and C directly in the contrasts. So, assuming it is doing what I think it is doing, the third option below is my preferred formulation.
I'd very much appreciate insight into which formula(e) and contrasts are testing what I'm looking for. Thanks!
Try 1:
> #try 1, no intercept, blocking variable first
> design1 = model.matrix(~0 + Donor + Trt, data = eg)
> colnames(design1)
[1] "Donora" "Donorb" "Donorc" "Donord" "Donore" "TrtB"   "TrtC"  
> makeContrasts(B_A = TrtB, 
+               C_A = TrtC,
+               C_B = TrtC - TrtB, levels = design1)
        Contrasts
Levels   B_A C_A C_B
  Donora   0   0   0
  Donorb   0   0   0
  Donorc   0   0   0
  Donord   0   0   0
  Donore   0   0   0
  TrtB     1   0  -1
  TrtC     0   1   1
Try 2:
> # try 2, intercept, blocking variable first
> design2 = model.matrix(~Donor + Trt, data = eg)
> colnames(design2)
[1] "(Intercept)" "Donorb"      "Donorc"      "Donord"      "Donore"      "TrtB"        "TrtC"       
> makeContrasts(B_A = TrtB, 
+               C_A = TrtC,
+               C_B = TrtC - TrtB, levels = design2)
           Contrasts
Levels      B_A C_A C_B
  Intercept   0   0   0
  Donorb      0   0   0
  Donorc      0   0   0
  Donord      0   0   0
  Donore      0   0   0
  TrtB        1   0  -1
  TrtC        0   1   1
Warning message:
In makeContrasts(B_A = TrtB, C_A = TrtC, C_B = TrtC - TrtB, levels = design2) :
  Renaming (Intercept) to Intercept
Try 3:
> # try 3, no intercept, treatment variable first
> design3 = model.matrix(~0 + Trt + Donor, data = eg)
> colnames(design3)
[1] "TrtA"   "TrtB"   "TrtC"   "Donorb" "Donorc" "Donord" "Donore"
> makeContrasts(B_A = TrtB - TrtA,
+               C_A = TrtC - TrtA,
+               C_B = TrtC - TrtB, levels = design3)
        Contrasts
Levels   B_A C_A C_B
  TrtA    -1  -1   0
  TrtB     1   0  -1
  TrtC     0   1   1
  Donorb   0   0   0
  Donorc   0   0   0
  Donord   0   0   0
  Donore   0   0   0
                    
                
                
Thanks! Great, that's what I was hoping.