I'm having a hard time understanding how to pull out the relevant contrasts when using a model with covariates in
Let's say I have the following gene expression matrix and experimental variables:
library(limma) set.seed(16) mat <- matrix(rnorm(600), 100, 6) age <- c(45, 55, 47, 56, 43, 56) gender <- factor(c("M", "F", "F", "F", "M", "F")) condition <- factor(c(rep("healthy", 3), rep("disease", 3)), levels = c("healthy", "disease"))
If I were to use a very simple design (
~ condition) then I could easily identify genes differentially expressed in disease vs healthy both using a treatment-contrast parametrization (case A below) or using a group-means parametrization and manually defining the contrast (case B below):
# case A: simple design; treatment-contrast parametrization design_A <- model.matrix(~ condition) fit_A <- lmFit(mat, design_A) fit_A <- eBayes(fit_A) tt_A <- topTable(fit_A, coef = "conditiondisease") # case B: simple design; group-means parametrization design_B <- model.matrix(~ 0 + condition) fit_B <- lmFit(mat, design_B) cont.matrix_B <- makeContrasts(diseaseVShealthy = conditiondisease - conditionhealthy, levels = design_B) fit_B <- contrasts.fit(fit_B, cont.matrix_B) fit_B <- eBayes(fit_B) tt_B <- topTable(fit_B, coef = "diseaseVShealthy")
In fact, in this case the tt_A and tt_B tables contain the same results.
However, if I want to use a more complex design (
~ age + gender + condition), I can still do the treatment-contrast parametrization (case C below) but I can no longer do the group-means parametrization because I'm missing the
conditionhealthy column in the design matrix (case D below):
# case C: complex design; treatment-contrast parametrization design_C <- model.matrix(~ age + gender + condition) fit_C <- lmFit(mat, design_C) fit_C <- eBayes(fit_C) tt_C <- topTable(fit_C, coef = "conditiondisease") # case D: simple design; group-means parametrization design_D <- model.matrix(~ 0 + age + gender + condition) fit_D <- lmFit(mat, design_D) cont.matrix_D <- makeContrasts(diseaseVShealthy = conditiondisease - conditionhealthy, levels = design_D)
Error in eval(ej, envir = levelsenv) : object 'conditionhealthy' not found#fit_D <- contrasts.fit(fit_D, cont.matrix_D) #fit_D <- eBayes(fit_D) #tt_D <- topTable(fit_D, coef = "diseaseVShealthy")
I would like to use the group-means parametrization with the more complex design because I think it's much clearer what's going on.
How can I do that?
P.S.: Note that I also don't completely understand the expressions "treatment-contrasts" and "group-means" parametrization, but this is what they are called on the limma user guide (section 9.2).