Question

Multiple comparison of coefficients in limma

0

Entering edit mode

Seymoo • 0

@seymoo-12522

Last seen 12 months ago

Oslo

In a cohort of cancer samples there are 4 groups, G1 to G4. I want to perform differential gene expression between group1 Vs all other groups, i,e, G1 = G1-(G2+G3+G4)/3, to determine DEGs for group 1.

I could just find an example in limma guide where pair-wise comparisons being made ( under 9.3. several groups) with following example

 f <- factor(targets$Target, levels=c("RNA1","RNA2","RNA3"))
design <- model.matrix(~0+f)
colnames(design) <- c("RNA1","RNA2","RNA3")
To make all pair-wise comparisons between the three groups one could proceed
fit <- lmFit(eset, design)
contrast.matrix <- makeContrasts(RNA2-RNA1, RNA3-RNA2, RNA3-RNA1, levels=design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)

I am wondering how should I formulate that in contrast matrix to be statistically correct?

would this be correct to write it like this:

design1 <- model.matrix(~0+factor(annot$grp, levels = unique(annot$grp)))
colnames(design1) <- unique(annot$grp)
row.names(design1) <- row.names(annot)
wt <- arrayWeights(HTA)
fitHTA <- lmFit(HTAEXP, design1, weights = wt)
genas(fitHTA, subset="Fpval", plot=TRUE, alpha=0.4)
cm1 <- makeContrasts(G1 = G1-(G2+G3+G4)/3,
                     G2 = G2-(G1+G3+G4)/3,
                     G3 = G3-(G1+G2+G4)/3,
                     G4 = G4-(G1+G2+G3)/3, levels=design1)

Thanks in advance.

limma • 1.9k views

ADD COMMENT • link updated 6.4 years ago by James W. MacDonald 65k • written 6.4 years ago by Seymoo • 0

score 1 · Answer 1 · 2017-12-06

1

Entering edit mode

James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 22 hours ago

United States

That looks OK, if your intention is to compare each group against the mean of all the remaining groups.

ADD COMMENT • link 6.4 years ago James W. MacDonald 65k

0

Entering edit mode

Hi James,

So because now I am comparing them against the mean, to do similar comparison would it be better to formulate it like this

cm1 <- makeContrasts(G1VsG2 = G1-G2,
                     G1VsG3 = G1-G3,
                     G1VsG4 = G1-G4,
                     G2VsG3 = G2-G3,
                     G2VsG4 = G2-G4,
                     G3VsG4 = G3-G4,
                     levels=design1)

fitHTA2 <- contrasts.fit(fitHTA, cm1)
fitHTA2 <- eBayes(fitHTA2, robust=TRUE, trend=TRUE)

topTable(fitHTA2, coef=c("G1VsG2","G1VsG3","G1VsG4"), adjust.method="BH", p.value=0.01, sort.by="F",number = 20)

because the result is different, and I am not sure what I am getting here is similar to the comparison I did before.

ADD REPLY • link 6.4 years ago Seymoo • 0

1

Entering edit mode

In your original post, you were comparing each group to the average of the others. For example, the null hypothesis for contrast G1 is that the average expression in group G1 is equal to the grand mean of the average expression in groups G2, G3 and G4. By comparison, the topTable call in your above comment does an ANOVA-like contrast where your null hypothesis is that all groups have the same average expression. (This is true even though you only define comparisons involving G1; for example, if G2 and G3 vary, then at least one of them must differ from G1.)

The two contrasts are quite different in their null hypotheses and their scientific usefulness. The contrasts in your original post provide an average direction and magnitude of DE for each group compared to the others; this is a convenient summary but may be misleading when the average log-fold change is not representative of the differences between specific pairs of groups.

The contrast in your comment only determines whether there is any difference between groups, it makes no statement on the significance of the direction or magnitude of the log-fold change to the other groups. Nonetheless, this can be useful if we just want some genes with differences, e.g., for plotting or some other meta-analyses. The ANOVA is more powerful than combining pairwise comparisons, and it isn't confused by one of the "other groups" going up and another one going down (such that the average of G2/3/4 would be equal to G1, which would escape detection with the other contrast).