Posted in answer to https://support.bioconductor.org/p/16672/
Dear Misha,
> Date: Fri, 6 Apr 2007 13:00:00 +0100
> From: "Misha Kapushesky" <o******@ebi.ac.uk>
> Subject: [BioC] Limma - decideTests - separate/nestedF questions
> To: bioconductor at stat.math.ethz.ch
>
> Hi all,
>
> Several questions concerning intricacies of limma's decideTests() have
> emerged from a discussion with some colleagues here at the EBI. Perhaps
> someone can enlighten us.
>
> 1. The docs (and previously on this list) say that nestedF is especially
> powerful in identifying genes that are diff. expressed in many contrasts,
> and less powerful for ones diff. expressed in only one contrast.
I have said this on the list, but I don't think the limma documentation says so.
> On the
> other hand we also read that with nestedF "at least one contrast will be
> classified as significant if and only if the overall F-statistic is
> significant" - meaning that it should pick up genes diff. expressed in only
> one contrast, shouldn't it? Why less powerful?
You can verify for yourself in ordinary ANOVA that F-tests are less powerful than t-tests for detecting sparse effects. This is because the null effects tend to dilute the truly different effects.
Suppose for example that you're testing 10 contrasts with a common residual standard deviation at a significance level of p=0.001. For simplicity, suppose the residual degrees of freedom are large, so I can use normal calculations instead of the t distribution. Suppose that only one contrast is truly different. For all the other contrasts, the null hypothesis of no difference is true.
If you do individual t-tests, you need a t-statistic of 3.9 to be significant after Bonferroni adjustment for multiple testing.
Now consider the F-test. A null contrast typically has t^2=1, so the F-statistic will typically be about F=(t^2+9)/10, where t is the t-statistic for the truly different contrast. To be significant at 0.001 you need F=2.96, which implies t=4.54. In other words, the t-statistic needs to be larger to stand out as significant in an F-statistic than it does as an individual test.
On the other hand, if several contrasts were truly different, then the F-test would be more powerful than the t-tests.
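The thresholds quoted above can be checked with a few lines of Python. This is only a sketch of the back-of-envelope argument, not limma code: it uses the same large-sample approximation (normal in place of t, chi-square/10 in place of F), and the helper names are made up for the illustration. The last step, where several contrasts share the same t-statistic, is an added assumption used to illustrate the final remark.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

ALPHA = 0.001        # significance level used in the example
N_CONTRASTS = 10     # number of contrasts tested per gene

def chisq_upper_tail(x, df):
    """P(chi-square_df > x); this closed form is valid for even df only."""
    half = x / 2.0
    return exp(-half) * sum(half**k / factorial(k) for k in range(df // 2))

def chisq_critical(p, df, lo=0.0, hi=200.0):
    """Find x with upper tail probability p by bisection."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chisq_upper_tail(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Individual two-sided tests with Bonferroni adjustment over 10 contrasts.
t_bonferroni = NormalDist().inv_cdf(1 - ALPHA / (2 * N_CONTRASTS))

# Large-sample critical value of the F-statistic on 10 numerator df.
f_crit = chisq_critical(ALPHA, N_CONTRASTS) / N_CONTRASTS

def t_needed(n_true):
    """|t| needed when n_true contrasts share the same t and the rest have t^2 = 1."""
    return sqrt((N_CONTRASTS * f_crit - (N_CONTRASTS - n_true)) / n_true)

print(round(t_bonferroni, 2), round(f_crit, 2))
print(round(t_needed(1), 2), round(t_needed(3), 2))
```

With one truly different contrast the F-test demands a larger t than the Bonferroni t-test; with three contrasts sharing the effect, the required t drops below the Bonferroni threshold.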
> 2. Say we have 5 contrasts and we run decideTests() both with "separate" and
> with "nestedF" methods and are comparing the results. Suppose some gene is
> marked as differentially expressed in 2 out of 5 "separate" contrasts, but
> is significant in only 1 out of 5 "nestedF" ones. What's the best way to
> interpret such a result? Shall we say this gene is not sufficiently variable
> overall? And vice versa, if it's marked significant in 2 columns in
> "nestedF" but only in 1 in "separate" results, does "nestedF" overestimate
> its significance, or is it that "separate" failed to pick it up in some
> contrast?
You're starting from the assumption that one of the methods is correct and the other is wrong, and that there is a way to figure out which one is correct in each case. This is not the right way to think about it. It is perfectly possible for the two methods to give different results and for both to be correct. (Otherwise limma wouldn't offer more than one method.) It is not even possible to say that one method is consistently more stringent than the other. If that were so, it would be possible to consolidate the two methods into one.
method="separate" and method="nestedF" do quite different things. "separate" controls the FDR on a per-contrast basis only. It does not control the FDR globally across all contrasts. "nestedF" controls the FDR on a per-gene basis only. It does not offer any formal FDR control at the contrast-level.
In practice you will find that "separate" gives more significant results when there are other significant results for the same contrast, i.e., significant results beget other results down the same contrast. Hence you will find that the t-statistic threshold for significance varies between contrasts. On the other hand, "nestedF" will give more significant results for genes for which there are other significant results, i.e., significant results beget other results for the same gene. You will find that the t-statistic threshold for significance is much lower for genes with many significant contrasts.
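The per-contrast behaviour can be illustrated with a toy example in Python. This is a sketch, not limma's implementation: it assumes a Benjamini-Hochberg adjustment applied to each contrast on its own, and the p-values are invented for the illustration.

```python
def bh_reject(pvalues, fdr=0.05):
    """Benjamini-Hochberg step-up: True for each p-value declared significant."""
    m = len(pvalues)
    cutoff = 0.0
    for rank, p in enumerate(sorted(pvalues), start=1):
        if p <= rank / m * fdr:
            cutoff = p          # largest p-value passing its step-up bound
    return [p <= cutoff for p in pvalues]

# Hypothetical p-values for five genes in each of two contrasts.
contrast_a = [0.001, 0.008, 0.02, 0.03, 0.6]   # several strong effects
contrast_b = [0.03, 0.2, 0.5, 0.7, 0.9]        # one isolated modest effect

# Adjust each contrast on its own, one column at a time.
calls_a = bh_reject(contrast_a)
calls_b = bh_reject(contrast_b)

print(calls_a)
print(calls_b)
```

The same p-value of 0.03 is called significant in the first contrast, where it has company, and not in the second, where it stands alone. That is the sense in which the threshold varies between contrasts.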
The bottom line is that you should not attempt to mix-n-match the different methods. You should decide in advance what sort of errors you want to control, choose the appropriate multiple testing method, and stick to it. I personally use nestedF when I'm most interested in finding genes which respond to more than one treatment. Otherwise I would use other methods.
I wish I could make this simpler, but multiple testing in two dimensions (genes and contrasts) is intrinsically subtle.
> 3. Does it make sense to rank genes in order of significance of differential
> expression by looking at how many columns of "nestedF" results have non-zero
> values for each gene?
No. The F-statistic orders the genes in terms of significance.
> If a gene is classified by "nestedF" as diff.
> expressed in only 1 contrast, is it still one of the most variable genes
> across all contrasts?
You need to define what you mean by "variable".
> 4. Say we have an experiment with treatment A, treatment B and a compound
> treatment A+B (not time course), is it legitimate to apply "nestedF" to all
> pair-wise contrasts to identify the most responsive overall genes, but then
> to look at the results matrix separately to say which treatments contributed
> most to this overall significance? Or is it more sensible to do it with the
> method "separate"?
Both are legitimate methods, and both could be sensible in your case depending on what you know about your treatments and what sort of effects you're most interested in finding.
To throw in another consideration, why not use method="global"? It is by far the simplest method, using the same threshold for all genes and all contrasts, and it provides global FDR control in most cases.
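To illustrate what a single threshold across genes and contrasts means, here is a toy Python sketch. It is not limma's code: it assumes a Benjamini-Hochberg adjustment applied to all p-values pooled into one vector, and the p-values are invented.

```python
def bh_cutoff(pvalues, fdr=0.05):
    """Largest p-value declared significant by Benjamini-Hochberg (0.0 if none)."""
    m = len(pvalues)
    cutoff = 0.0
    for rank, p in enumerate(sorted(pvalues), start=1):
        if p <= rank / m * fdr:
            cutoff = p
    return cutoff

# Hypothetical p-values: rows are genes, columns are two contrasts.
pvalue_matrix = [
    [0.001, 0.03],
    [0.008, 0.2],
    [0.02,  0.5],
    [0.03,  0.7],
    [0.6,   0.9],
]

# Pool every entry of the matrix and derive one cutoff for all of them.
pooled = [p for row in pvalue_matrix for p in row]
global_cutoff = bh_cutoff(pooled)

# Apply that one cutoff to every gene and every contrast.
global_calls = [[p <= global_cutoff for p in row] for row in pvalue_matrix]
print(global_cutoff, global_calls)
```

Every entry is compared with the same cutoff, so two equal p-values always receive the same call regardless of which gene or contrast they belong to.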
Best wishes
Gordon
> Many thanks in advance for any answers to these questions!
>
> --Misha K. and colleagues
> Microarray Informatics Team, EBI