Question

Limma model for 4 factor (2*2*2*2) design to detect the factor specific effect and their interaction effect on gene expression

0

Entering edit mode

baibing7713661 • 0

@baibing7713661-9047

Last seen 8.5 years ago

Netherlands

I'm currently analysing an microarray experiment including 2 genotype, 2 treatment, 2 time point and 2 expression levels with three biological replicates (48 samples). When I treat each factor independently, I could find gene affacted by each factor separately. Since my result indicated one factor is strongly afffected by the other, now I would like to also check the interaction between each factor pair on gene expression. But I don't know how to build model in Limma to solve my problem. Anyone could help? Thanks.

Bing

Ph.D

Wageningen University

limma microarray • 1.6k views

ADD COMMENT • link 8.5 years ago baibing7713661 • 0

0

Entering edit mode

Please give more details (i.e., code) as to exactly how you've parameterized your design matrix. I get the impression that you've fitted an additive model for all factors, which will not have any interaction terms.

ADD REPLY • link 8.5 years ago Aaron Lun ★ 28k

0

Entering edit mode

Hi Aaron,

Thanks for the reply. For now I have made all the treatment as a single factor with design as below. I have 4 factors with 2 level each which makes it 16 single factors. Then I used different contrasts to get the interaction between each factor pairs. This may work for two factors as in Limma manual 9.5.2. However not sure whether this is correct for multi-factors.

This four factors are not blocked. So I'm interested to see the interaction between each pair and also maybe each 3 and even all 4 factors

design <- model.matrix(~0+factor(c(1,1,2,2,2,3,3,4,4,4,5,5,6,6,6,7,7,7,8,8,8,9,9,9,10,10,10,11,11,11,12,12,12,13,13,13,14,14,14,15,15,15,16,16,16)))
colnames(design) <- c("P_Cor_D24", "P_Cor_D6", "P_Cor_ND24", "P_Cor_ND6", "P_Water_D24", "P_Water_D6", "P_Water_ND24", "P_Water_ND6", "T_Cor_D24", "T_Cor_D6", "T_Cor_ND24", "T_Cor_ND6", "T_Water_D24", "T_Water_D6", "T_Water_ND24","T_Water_ND6")

contrast.matrix <- makeContrasts((P_Water_D6-P_Water_ND6)-(T_Water_D6-T_Water_ND6),(P_Water_D24-P_Water_ND24)-(T_Water_D24-T_Water_ND24),(P_Cor_D6-P_Water_D6)-(T_Cor_D6-T_Water_D6),(P_Cor_ND6-P_Water_ND6)-(T_Cor_ND6-T_Water_ND6),(P_Cor_D24-P_Water_D24)-(T_Cor_D24-T_Water_D24),(P_Cor_ND24-P_Water_ND24)-(T_Cor_ND24-T_Water_ND24),(P_Water_D24-P_Water_D6)-(T_Water_D24-T_Water_D6),(P_Water_ND24-P_Water_ND6)-(T_Water_ND24-T_Water_ND6),(P_Cor_D24-P_Cor_D6)-(T_Cor_D24-T_Cor_D6),(P_Cor_ND24-P_Cor_ND6)-(T_Cor_ND24-T_Cor_ND6),levels=design)

Cheers

Bing

ADD REPLY • link 8.5 years ago baibing7713661 • 0

3

Entering edit mode

Aaron Lun ★ 28k

@alun

Last seen 2 hours ago

The city by the bay

Each of the contrasts you've specified will test for a differential effect between P and T, i.e., whether the effect of treatment/time/whatever for factor P is the same as that for factor T. This represents the interaction between the P/T factor and another factor, depending on the exact contrast you're looking at.

If you want to test for a three-factor interaction, that's a bit more complicated. To illustrate, let's use an example:

A <- rep(c("A", "a"), each=4)
B <- rep(c("B", "b"), 4)
E <- rep(rep(c("E", "e"), each=2), 2)
grouping <- factor(paste(A, B, E, sep="."))
design <- model.matrix(~0+grouping)
colnames(design) <- levels(grouping)

Let's treat the lower case as the "baseline", and the upper case as the "treatment" for each factor. Assume that we want to find the three-way interaction effect in the sample with all three treatments, i.e., A.B.E. This is done with:

con <- makeContrasts(A.B.E - a.b.e # effect of triple treatment
     - ((A.b.e - a.b.e) + (a.B.e - a.b.e) + (a.b.E - a.b.e)) # "main effects"
     - (((A.B.e - A.b.e) - (a.B.e - a.b.e))   # interaction effect between A and B
      + ((A.b.E - A.b.e) - (a.b.E - a.b.e))   # interaction effect between A and E
      + ((a.B.E - a.B.e) - (a.b.E - a.b.e))), # interaction effect between B and E
     levels=design)

The null hypothesis here is that the effect of the triple treatment is equal to the sum of the main effects of the individual treatments and the interaction effects of the pairwise treatments.

ADD COMMENT • link 8.5 years ago Aaron Lun ★ 28k

0

Entering edit mode

baibing7713661 • 0

@baibing7713661-9047

Last seen 8.5 years ago

Netherlands

Hi Aaron,

Thanks for this. Indeed the contrast I made is to look at P/T interactions. I also have the interactions contrast for other factors. However, this way of contrast will only compare the sample with the interacted factors in the different level of the third factors separately. for example

(P_Water_D6-P_Water_ND6)-(T_Water_D6-T_Water_ND6): for looking at P/T interaction at 6 hours in water

(P_Water_D24-P_Water_ND24)-(T_Water_D24-T_Water_ND24): for looking at P/T interaction at 24 hours in water

While in comparison of 4 way anova, it will take into account of all the P sample and T samples regardless of other factors for P/T interactions.

I'm not so sure which is the best way to look for P/T interaction. Do you recommend multi-factor ANOVA (MANOVA) for this specific question? and what is the best tool for this. Thanks.

Cheers

Bing

ADD COMMENT • link 8.5 years ago baibing7713661 • 0

0

Entering edit mode

If you want to look at pairwise interactions between P/T and another factor, then there is no choice but to define a separate contrast for each combination of those other factors. However, you can consolidate all of these contrasts into a single ANOVA-like test, by running topTable (after contrasts.fit and eBayes) without specifying coef. This will test if any of the pairwise interactions are non-zero. In this manner, it will use all of the P and T samples in your data set.

If you want to look at three- or four-factor interactions, then use the code example I've given in my original answer. Note that referring to a "P/T interaction" is quite vague; the P/T factor has to be interacting with at least one other factor.

ADD REPLY • link 8.5 years ago Aaron Lun ★ 28k

0

Entering edit mode

Great, this is the point. How should I make model.matrix and makeContrasts if I would like to run this ANOVA-like test since the contrast I made before is all about the interaction between one for example P/T factor and another for example D/ND factor or 6/24 factor. I don't know how to make a single contrast for test all pairwise interaction considering all the sample involved.

ADD REPLY • link 8.5 years ago baibing7713661 • 0

0

Entering edit mode

You can use exactly the same sort of contrast matrix that you put in the comment to your original post, where you've specified a contrast for the pairwise interaction effect between P/T and another factor (for each combination of all other factors). Each contrast will form a column in the contrast matrix that's returned by makeContrasts. You can then go through contrasts.fit and eBayes as you would usually do. The key step is to not specify coef when you run topTable, such that all of the contrasts in the matrix will be tested simultaneously in an ANOVA-like test.

Edit: By my estimation, there should be 12 contrasts in the final matrix, to test for all possible interaction effects between P/T and each other factor (conditioning on all possible combinations of those factors). I think you're missing:

(P_Cor_D6-P_Cor_ND6)-(T_Cor_D6-T_Cor_ND6)
(P_Cor_D24-P_Cor_ND24)-(T_Cor_D24-T_Cor_ND24)

ADD REPLY • link 8.5 years ago Aaron Lun ★ 28k

0

Entering edit mode

Yes, in total for P/T interaction there are 12 contrasts. However, In this way and for each contrast, not all the sample is considered right? such like (P_Cor_D6-P_Cor_ND6)-(T_Cor_D6-T_Cor_ND6) which only considered N/ND factor in Cor treatment but not D/ND sample in water treatment or N/ND sample in 24 hours treatment while multi-factor ANOVA will normally consider all the D/ND samples in the experiment, right ?

ADD REPLY • link 8.5 years ago baibing7713661 • 0

0

Entering edit mode

For each individual contrast (i.e., for each column of the contrast matrix), only the relevant groups corresponding to that contrast will be considered. However, when you do an ANOVA-like test with topTable as I've described, all of the contrasts in the matrix are simultaneously tested as part of a global null hypothesis (i.e., that the individual nulls for all contrasts are true). This means that all samples will be used in calculating the final p-value.

ADD REPLY • link 8.5 years ago Aaron Lun ★ 28k

score 1 · Accepted Answer · 2015-10-29

1

Entering edit mode

baibing7713661 • 0

@baibing7713661-9047

Last seen 8.5 years ago

Netherlands

Hi Aaron,

I would like to know your interpretation for the gene list produced from the formula (T_Cor_D24 - T_Water_D24) - (T_Cor_ND24 - T_Water_ND24). Basically is the interaction between the Cor/water effect and D/ND effect at 24 hours. This fomular produced two gene lists up- (1) and down (-1) regulated. I explained as the positively and negative effect of Cor on the difference between D and ND. Does that mean the up list is the genes that increased the difference between D and ND by Cor and the down list is the genes reduced the difference between D and ND by Cor? what's your explanation. Thanks

Bing

ADD COMMENT • link 8.5 years ago baibing7713661 • 0

0

Entering edit mode

Mathematically speaking, that's correct; the "up" genes (with positive interaction terms) are those where the Cor treatment increases the log-fold change between D and ND, compared to the corresponding D/ND log-fold change in water. I'll rearrange your contrast slightly so that this makes a bit more sense:

(T_Cor_D24 - T_Cor_ND24) - (T_Water_D24 - T_Water_ND24)

You can see that if this contrast is positive, then the left term in parentheses (i.e., the D/ND effect in Cor) must be larger than the right term for the corresponding effect in water. However, this may not be as easy to interpret as one might think - when I refer to "increase", I literally mean an increase, in the way that, for example, -1 is an increase over -2.

In particular, there is no guarantee that the "up" genes have a larger D/ND effect size in Cor treatment compared to water. For example, imagine a case where the D/ND log-fold change is zero with Cor but is negative in water. This results in a large positive interaction term, even though the absolute size of the D/ND difference is much larger in water compared to Cor (where it is zero).

If you want to know what's going on, then you also need to look at the D/ND log-fold changes for each treatment individually:

If both D/ND log-FCs are positive, then this represents a situation where D is upregulated compared to ND in both treatments, and upregulation is amplified in the Cor treatment. This is the most obvious interpretation.
If the log-FC is positive for Cor treatment and negative in water, then the direction of DE is reversed between treatments (upregulated in Cor and downregulated in water). This could be considered to be a fairly extreme change.
If both log-FCs are negative, then the gene is being downregulated in D compared to ND upon treatment with water. However, this downregulation is mitigated in the Cor-treated samples.

All of the above scenarios assume a positive interaction term, but it's easy enough to flip them around for a negative interaction (i.e., those genes in the "down" set). For example, if both log-FCs and the interaction are negative, then downregulation is amplified.

ADD REPLY • link 8.5 years ago Aaron Lun ★ 28k

0

Entering edit mode

Thanks Aaron, this is a perfect interpretation.

Bing

ADD REPLY • link 8.5 years ago baibing7713661 • 0