Explanation of an ANOVA approach of multiple coefficients in limma with topTable in a microarray analysis
University of Salerno, Salerno, Italy

Dear Bioconductor Community,

I would like to ask a specific question about interpreting an "ANOVA" approach in limma with the topTable function. In a previous post of mine ( C: Questions about complex design in limma regarding an agilent microarray dataset ), Aaron helpfully explained the difference between testing each coefficient separately in topTable (e.g. coef=1, a single statistical comparison) and testing several coefficients at once (e.g. coef=1:4), which essentially performs an ANOVA test for DE in any of my comparisons. My crucial (and perhaps naive) question is the following: is it sensible to get a significantly greater number of DE genes from the ANOVA approach than from the sum of the genes found by testing each coefficient separately? Could this be due to the "nature" of ANOVA testing? In other words, what is the key difference in how the statistics and DE genes are computed when moving from, e.g., coef=2 (a specific comparison) to coef=1:4? For instance, does the ANOVA approach also test for a difference in means between coef=1 and coef=2? Or is this irrelevant, since all the comparisons have already been specified in the makeContrasts function (see the link above for the code)?

Please excuse this beginner question, but I am new to R and statistics, and this specific point is crucial for my analysis!

Best Regards,

Konstantinos Yeles

limma ANOVA topTable microarray
WEHI, Melbourne, Australia

There's no secret difference between ANOVA and t-tests; it's mostly common sense.

You can think of the anova test as pooling the four separate t-tests into an F-test. If all four t-tests were borderline significant, then the anova F-test will naturally be more powerful than doing separate t-tests, because it will accumulate information from all the tests. Getting one borderline t-statistic may not be surprising, but getting four of them is less likely by chance and will lead to a small p-value. The F-test therefore can be more significant than any of the individual t-tests.

On the other hand, if only one of the t-tests is large and the other three are small, then the F-test will be less powerful than the t-tests because the small statistics will dilute the large one. So the F-test may be much less significant than the most significant of the t-tests.

In general, if all four contrasts are similar in size then the F-test will be more powerful than the t-tests. If only one of the contrasts tends to be DE, then individual t-tests will tend to be more powerful.
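The two scenarios above can be illustrated numerically with a deliberately simplified sketch. It assumes four independent z-statistics with known variance, so the pooled test is a chi-square test on 4 degrees of freedom rather than limma's moderated F-test, which additionally estimates the variance and accounts for correlation between contrasts. The function names and the example statistics are hypothetical and chosen only for illustration.

```python
import math

def two_sided_p(z):
    """Two-sided normal p-value for a single z-statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def pooled_p(zs):
    """P-value for the sum of squares of four independent z-statistics.

    Assumption: independence and known variance, so the sum of squares is
    chi-square with 4 df, whose upper tail is exp(-x/2) * (1 + x/2).
    This is a stand-in for the F-test, not limma's actual computation.
    """
    if len(zs) != 4:
        raise ValueError("the closed form used here is only valid for 4 statistics")
    x = sum(z * z for z in zs)
    return math.exp(-x / 2) * (1 + x / 2)

# Hypothetical statistics for one gene across four contrasts.
borderline = [1.9, 1.9, 1.9, 1.9]   # all four contrasts borderline
one_large = [3.5, 0.2, 0.2, 0.2]    # only one contrast clearly non-zero

for label, zs in [("four borderline", borderline), ("one large", one_large)]:
    best_single = min(two_sided_p(z) for z in zs)
    print(f"{label}: best single p = {best_single:.4g}, pooled p = {pooled_p(zs):.4g}")
```

In the first case the pooled p-value comes out smaller than any individual one; in the second it is larger than the best individual p-value, because the three small statistics dilute the large one.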


Dear Gordon, thank you very much for your answer! Just two quick points, to be on the safe side:

1) About testing the coefficients together with coef=1:4: essentially, ANOVA will perform only the comparisons that have already been defined as coefficients with makeContrasts, right? For example, if coef=1 represents bystander samples vs control samples at 0.5h, ANOVA will NOT also perform a comparison between coefficients (e.g. coef=2 vs coef=4), correct?

2) Or is my notion above incorrect, and ANOVA actually tests, for each gene, whether any one of the contrasts (defined in the previous post) is significantly different from the other three?

Please excuse the follow-up question, but this is the point that confuses me the most in interpreting the results!


It depends on how the coefficients are defined, i.e., what they mean in the context of the fitted model. Sorry, but I only want to answer the question you asked here. I don't have time to read your earlier post and the long question and answer series with Aaron.
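
For reference, the null hypothesis tested when four coefficients are passed together to topTable (coef=1:4) can be sketched as follows. The notation is an assumption introduced here, not taken from the thread: \beta_{gj} denotes the j-th selected coefficient for gene g.

```latex
H_0:\ \beta_{g1} = \beta_{g2} = \beta_{g3} = \beta_{g4} = 0
\qquad \text{versus} \qquad
H_1:\ \beta_{gj} \neq 0 \ \text{for at least one } j \in \{1,2,3,4\}
```

The test asks only whether all four coefficients are zero for a given gene, so what a rejection means is determined by what each coefficient represents in the fitted model. It does not report a separate test of one coefficient against another, although a gene for which two coefficients differ necessarily has at least one non-zero coefficient.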

