Question

Limma model with different intercepts/references

0

Entering edit mode

Pat • 0

@pat-20045

Last seen 5.2 years ago

I need your help with this please. I would like to design a limma model to analyze some RNA seq data. I have got four sample groups (A,B,C,D) and am particular interested in lets say group B. I am interested in a sort of intercept model, in which B is compared to all other groups each one time as an intercept. For example:

If I design an intercept model like this:

modelMatrix <- model.matrix(~ groups, data = samples)

B is different from the intercept A, while C is not different to intercept A AND while D is not different to intercept A. Graphical output for a gene would look like this (I hope that makes sense.., star is significant difference to intercept (). Underscore is the average gene expression of that group, here upregulated compared to intercept):

_____________
     B_*     
(A)_    C_ D_
_____________
Numerical output:
(A) B   C   D
0   1   0   0
_____________

Then I could write the respective contrasts with this model, but I would also like to know which genes are DE when C would be intercept and D would be intercept. Like this here:

C is different from the intercept A, while D is not different to intercept A AND while B is not different to intercept A.

D is different from the intercept A, while B is not different to intercept A AND while C is not different to intercept A.

_____________
     B_*     
(A)_    C_ D_
_____________
(A) B   C   D
0   1   0   0
_____________

_____________
   B_*     
A_    (C)_ D_
_____________
A   B   (C) D
0   1   0   0
_____________

_____________
   B_*     
A_    C_ (D)_
_____________
A   B   C   (D)
0   1   0   0
_____________

If I design my model with group B as the intercept then I get the following, which I dont want.

samples$groups <- factor(samples$groups, 
                              levels = c("B", "A", "C", "D"))
modelMatrix <- model.matrix(~ groups, data = samples)

_____________
     A_*     
(B)_    C_ D_
_____________
(B) A   C   D
0   1   0   0
_____________

_____________
        C_*
(B)_ A_    D_
_____________
(B) A   C   D
0   0   1   0
_____________

_____________
            D_*
(B)_ A_  C_ 
_____________
(B) A   C   D
0   0   0   1
_____________

When I write a model without intercept and fit a contrast like this one: design <- model.matrix(~0 + group, data = samples)

cont.matrix <- makeContrasts(B.vs.A = B - A, B.vs.C = B - C, B.vs.D = B - D, levels=design)
cont.matrix

I get the same thing as the model with B as intercept, which I also dont want.

How do I have to write the contrasts please so that I can compare the uniquely DE genes from B compared to each of the other groups? I could write the respective factors three times and then refit the model, like this:

samples$groups <- factor(samples$groups, 
                              levels = c("A", "B", "C", "D"))
modelMatrix1 <- model.matrix(~ groups, data = samples)
#
samples$groups <- factor(samples$groups, 
                              levels = c("C", "B", "A", "D"))
modelMatrix2 <- model.matrix(~ groups, data = samples)
#
samples$groups <- factor(samples$groups, 
                              levels = c("D", "B", "C", "A"))
modelMatrix3 <- model.matrix(~ groups, data = samples)

But I am pretty sure that I dont have to write three models with three different intercepts and only take one comparison from each. Cant figure it out at the moment how to write the contrasts with the different intercepts/references.

Many thanks for your input!

limma • 888 views

ADD COMMENT • link 5.2 years ago • updated 5.1 years ago Pat • 0

score 3 · Answer 1 · 2019-03-02

Perhaps you should look at makeContrasts and contrasts.fit?

y <- matrix(rnorm(6000), ncol=6)
g <- gl(3, 2)

design <- model.matrix(~g) # group 1 is first.
fit <- lmFit(y, design)
fit <- eBayes(fit)
topTable(fit, coef=2) # comparison of 2 - 1
topTable(fit, coef=3) # comparison of 3 - 1

con <- makeContrasts(g3 - g2, levels=design)
fit2 <- contrasts.fit(fit, con)
fit2 <- eBayes(fit2)
topTable(fit2) # comparison of 3 -2

The limma user's guide has many such examples.

score 0 · Answer 2 · 2019-03-05

Many thanks for the reply. I have drafted below two approaches that would look at the same thing, how GroupA is uniquely different to the other groups. One is with an intercept model and then refitting with different factor levels as intercepts, and the other one with no intercept and then using contrasts to get the same information.

Would you recommend one over the other approach? Is there a more elegant way to write this?

INTERCEPT MODEL

INTERCEPT MODEL

# Filtering genes
gene_filt <- filterByExpr(gene_DGE, design = model.matrix(~ Group, data = gene_DGE$samples), min.count = 40)
#
gene_filtered1 <- gene_DGE[gene_filt, keep.lib.sizes = F]
#
gene_filtered2 <- calcNormFactors(gene_filtered1)


# Fit INTERCEPT model 3 times to get the genes that are uniquely expressed between two groups
## 1) What makes the GroupA unique from GroupB (intercept)?

modelMatrix <- model.matrix(~ Group, data = samples)
#
samples$Group <- factor(samples$Group, 
                              levels = c("GroupB", "GroupA", "GroupC", "GroupD"))
modelMatrix <- model.matrix(~ Group, data = samples)
modelMatrix
#
gene_v_NEW <- voom(gene_filtered2, design = modelMatrix, plot = T)
#
gene_fit_NEW <- lmFit(gene_v_NEW, design = modelMatrix) %>% 
  eBayes()
#
genes_INTERCEPT <- decideTests(gene_fit_NEW) %>% as.data.frame() %>% bind_cols(gene_fit_NEW$genes)

# Data export
comparison1_Unique_up <- filter(genes_INTERCEPT, GroupA == 1, GroupC == 0 | GroupC == -1, GroupD == 0 | GroupD == -1) %>% .[order(.$Annotation),]
comparison1_Unique_down <- filter(genes_INTERCEPT, GroupA == -1, GroupC == 0 | GroupC == 1, GroupD == 0 | GroupD == 1) %>% .[order(.$Annotation),]
#
write.xlsx(comparison1_Unique_up, file="DGE_expression_list-INTERCEPT.xlsx", sheetName="comparison1_Unique_up", append=TRUE, row.names=FALSE)
#
write.xlsx(comparison1_Unique_down, file="DGE_expression_list-INTERCEPT.xlsx", sheetName="comparison1_Unique_down", append=TRUE, row.names=FALSE)  





## 2) What makes the GroupA unique from GroupC (new intercept)?

modelMatrix <- model.matrix(~ Group, data = samples)
#
samples$Group <- factor(samples$Group, 
                              levels = c("GroupC", "GroupA", "GroupB", "GroupD"))
modelMatrix <- model.matrix(~ Group, data = samples)
modelMatrix
#
gene_v_NEW <- voom(gene_filtered2, design = modelMatrix, plot = T)
#
gene_fit_NEW <- lmFit(gene_v_NEW, design = modelMatrix) %>% 
  eBayes()
#
genes_INTERCEPT <- decideTests(gene_fit_NEW) %>% as.data.frame() %>% bind_cols(gene_fit_NEW$genes)

# Data export
comparison2_Unique_up <- filter(genes_INTERCEPT, GroupD == 0 | GroupD == -1, GroupB == 0 | GroupB == -1, GroupA == 1) %>% .[order(.$Annotation),]
comparison2_Unique_down <- filter(genes_INTERCEPT, GroupD == 0 | GroupD == 1, GroupB == 0 | GroupB == 1, GroupA == -1) %>% .[order(.$Annotation),]
#
write.xlsx(comparison2_Unique_up, file="DGE_expression_list-INTERCEPT.xlsx", sheetName="comparison2_Unique_up", append=TRUE, row.names=FALSE)
#
write.xlsx(comparison2_Unique_down, file="DGE_expression_list-INTERCEPT.xlsx", sheetName="comparison2_Unique_down", append=TRUE, row.names=FALSE)  






## 3) What makes the GroupA unique from GroupD (new intercept)?

modelMatrix <- model.matrix(~ Group, data = samples)
#
samples$Group <- factor(samples$Group, 
                              levels = c("GroupD", "GroupA", "GroupB", "GroupC"))
modelMatrix <- model.matrix(~ Group, data = samples)
modelMatrix
#
gene_v_NEW <- voom(gene_filtered2, design = modelMatrix, plot = T)
#
gene_fit_NEW <- lmFit(gene_v_NEW, design = modelMatrix) %>% 
  eBayes()
#
genes_INTERCEPT <- decideTests(gene_fit_NEW) %>% as.data.frame() %>% bind_cols(gene_fit_NEW$genes)

# Data export
comparison3_Unique_up <- filter(genes_INTERCEPT, GroupC == 0 | GroupC == -1, GroupB == 0 | GroupB == -1, GroupA == 1) %>% .[order(.$Annotation),]
comparison3_Unique_down <- filter(genes_INTERCEPT, GroupC == 0 | GroupC == 1, GroupB == 0 | GroupB == 1, GroupA == -1) %>% .[order(.$Annotation),]
#
write.xlsx(comparison3_Unique_up, file="DGE_expression_list-INTERCEPT.xlsx", sheetName="comparison3_Unique_up", append=TRUE, row.names=FALSE)
#
write.xlsx(comparison3_Unique_down, file="DGE_expression_list-INTERCEPT.xlsx", sheetName="comparison3_Unique_down", append=TRUE, row.names=FALSE)

score 0 · Answer 3 · 2019-03-05

CONTRAST MODEL

CONTRAST MODEL

# Filtering genes
gene_filt2 <- filterByExpr(gene_DGE, design = model.matrix(~0 +Group, data = gene_DGE$samples), min.count = 40)
#
gene_filtered3 <- gene_DGE[gene_filt2, keep.lib.sizes = F]
#
gene_filtered4 <- calcNormFactors(gene_filtered3)


# Fit model with 3 contrasts to get the genes that are uniquely expressed between two groups
gene_v_NoIntercept <- voom(gene_filtered4, design = model.matrix(~0 + Group, data = gene_filtered4$samples), plot = T)
#
gene_fit_NoIntercept <- lmFit(gene_v_NoIntercept, design = model.matrix(~0 + Group, data = gene_filtered4$samples)) 
#
design <- model.matrix(~0 + Group, data = gene_filtered3$samples)


## CONTRASTS
## 1) What makes the GroupA unique from GroupB?
cont.matrix1 <- makeContrasts(B.vs.A = GroupB - GroupA, B.vs.C = GroupB - GroupC, B.vs.D = GroupB - GroupD, levels=design)
cont.matrix1
## fit contrasts
fit.cont1 <- contrasts.fit(gene_fit_NoIntercept, cont.matrix1) %>% eBayes()
#
genes_NoIntercept1 <- decideTests(fit.cont1) %>% as.data.frame() %>%  bind_cols(fit.cont1$genes)

# Data export
comparison1_Unique_up <- filter(genes_NoIntercept1, GroupA == 1, GroupC == 0 | GroupC == -1, GroupD == 0 | GroupD == -1) %>% .[order(.$Annotation),]
comparison1_Unique_down <- filter(genes_NoIntercept1, GroupA == -1, GroupC == 0 | GroupC == 1, GroupD == 0 | GroupD == 1) %>% .[order(.$Annotation),]
#
write.xlsx(comparison1_Unique_up, file="DGE_expression_list-CONTRASTS.xlsx", sheetName="comparison1_Unique_up", append=TRUE, row.names=FALSE)
#
write.xlsx(comparison1_Unique_down, file="DGE_expression_list-CONTRASTS.xlsx", sheetName="comparison1_Unique_down", append=TRUE, row.names=FALSE)  





## 2) What makes the GroupA unique from GroupC?
cont.matrix2 <- makeContrasts(C.vs.A = GroupC - GroupA, C.vs.B = GroupC - GroupB, C.vs.D = GroupC - GroupD, levels=design)
cont.matrix2
## fit contrasts
fit.cont2 <- contrasts.fit(gene_fit_NoIntercept, cont.matrix2) %>% eBayes()
#
genes_NoIntercept2 <- decideTests(fit.cont2) %>% as.data.frame() %>%  bind_cols(fit.cont2$genes)

# Data export
comparison2_Unique_up <- filter(genes_NoIntercept2, GroupA == 1, GroupB == 0 | GroupB == -1, GroupD == 0 | GroupD == -1,) %>% .[order(.$Annotation),]
comparison2_Unique_down <- filter(genes_NoIntercept2, GroupA == -1, GroupB == 0 | GroupB == 1, GroupD == 0 | GroupD == 1) %>% .[order(.$Annotation),]
#
write.xlsx(comparison2_Unique_up, file="DGE_expression_list-CONTRASTS.xlsx", sheetName="comparison2_Unique_up", append=TRUE, row.names=FALSE)
#
write.xlsx(comparison2_Unique_down, file="DGE_expression_list-CONTRASTS.xlsx", sheetName="comparison2_Unique_down", append=TRUE, row.names=FALSE)  




## 3) What makes the GroupA unique from GroupD?
cont.matrix3 <- makeContrasts(D.vs.A = GroupC - GroupA, D.vs.B = GroupD - GroupB, D.vs.C = GroupD - GroupC, levels=design)
cont.matrix3
## fit contrasts
fit.cont3 <- contrasts.fit(gene_fit_NoIntercept, cont.matrix3) %>% eBayes()
#
genes_NoIntercept3 <- decideTests(fit.cont3) %>% as.data.frame() %>%  bind_cols(fit.cont3$genes)

# Data export
comparison3_Unique_up <- filter(genes_NoIntercept3, GroupA == 1, GroupB == 0 | GroupB == -1, GroupC == 0 | GroupC == -1) %>% .[order(.$Annotation),]
comparison3_Unique_down <- filter(genes_NoIntercept3, GroupA == -1, GroupB == 0 | GroupB == 1, GroupC == 0 | GroupC == 1) %>% .[order(.$Annotation),]
#
write.xlsx(comparison3_Unique_up, file="DGE_expression_list-CONTRASTS.xlsx", sheetName="comparison3_Unique_up", append=TRUE, row.names=FALSE)
#
write.xlsx(comparison3_Unique_down, file="DGE_expression_list-CONTRASTS.xlsx", sheetName="comparison3_Unique_down", append=TRUE, row.names=FALSE)

score 0 · Answer 4 · 2019-03-05

0

Entering edit mode

Pat • 0

@pat-20045

Last seen 5.2 years ago

There should be a way to write this all in one contrast, like the trial below (which does not work). One contrasts would have the advantage that I dont have to merge the different tables after fitting the different intercepts or contrasts from above..

cont.matrix_x <- makeContrasts(A.vs.B = GroupA - GroupB, A.vs.C = GroupA - GroupC, A.vs.D = GroupA - GroupD, C.vs.D = GroupC - GroupD, C.vs.B = GroupC - GroupB, D.vs.B = GroupD - GroupB, levels=design)
cont.matrix_x

fit.cont3 <- contrasts.fit(gene_fit_NoIntercept, cont.matrix_x) %>% eBayes()

genes_NoIntercept4 <- decideTests(fit.cont3) %>% as.data.frame() %>%  bind_cols(fit.cont3$genes)

comparison_x_Unique_up_and_down <- filter(genes_NoIntercept4, "A.vs.B" == 1 | "A.vs.B" == -1, "A.vs.C" == 1 | "A.vs.C" == -1, "A.vs.D" == 1 | "A.vs.D" == 1, "C.vs.D" == 0, "C.vs.B" == 0, "D.vs.B" == 0) %>% .[order(.$Annotation),]


vennDiagram(comparison_x_Unique_up_and_down[,1:3],include=c("up","down"),
            counts.col=c("brown3", "darkblue"),
            circle.col = c("darkred", "blue", "green4"))
title("all genes", line = 1)

# NO RESULT

ADD COMMENT • link 5.1 years ago Pat • 0

0

Entering edit mode

As I said above, unless you're answering your own question, use the "Add comment" button.

You say you get "NO RESULT", but that is not helpful. No plot? An empty plot? A Venn diagram full of zeroes? A Venn diagram with a zero in the middle? I think your code is right but the %>% don't lend themselves to easy reading.

ADD REPLY • link 5.1 years ago Aaron Lun ★ 28k

0

Entering edit mode

Thanks for your quick reply. I will use less code next time. You are right, the Venn diagram is full of zeros. May be "A.vs.B" should be written without "". Then I do get a few numbers here and there, but not what I would have expected from the models with the different contrasts. I guess, I will just do it manually. Thanks for referring to the ?contrasts.fit. I will use the non-intercept model with the different contrasts.

ADD REPLY • link 5.1 years ago Pat • 0