Search
Question: Moderated t-test in Anova
0
3.5 years ago by
Korea, Republic Of
shaheryar0 wrote:

How  can i apply moderated t-test in Anova as ebayes function used in limma?

Thanks

modified 3.5 years ago by Gordon Smyth35k • written 3.5 years ago by shaheryar0

Have you read the limma User's guide? There are multiple worked examples of how to do what you ask.

ADD REPLYlink written 3.5 years ago by James W. MacDonald48k
1
3.5 years ago by
Aaron Lun21k
Cambridge, United Kingdom
Aaron Lun21k wrote:

Following up on James' comment, the best example is in Section 9.3 of the user's guide. This does a moderated F-test for the ANOVA-style comparison, rather than a t-test. Generally speaking, we need to:

1. Use makeContrasts to define all of the comparisons you're interested in. For an ANOVA-style test for equality across multiple groups, we can define all pairwise comparisons between groups (note that we don't really need all pairwise comparisons, but this approach gives us all of the relevant log-fold changes in the downstream results, so we might as well do it).
2. Use contrasts.fit to re-fit the linear model according to the contrast matrix from step 1.
3. Use eBayes to perform empirical Bayes shrinkage on the refitted model from step 2.
4. Use topTable without any coef specified (i.e., set to NULL). This will do an ANOVA-style test for all of the comparisons at once.

If you need more advice, you'll have to give specific details about your samples, experimental design and comparisons of interest.

ADD COMMENTlink modified 3.5 years ago • written 3.5 years ago by Aaron Lun21k

In an attempt to provide some clarification for the OP, I'd like to ask (and try to answer) what was left unsaid here.

You say that "note that we don't really need all pairwise comparisons", which might lead one to ask "well, how many contrast do I have to input to get the same result?" So I thought I'd go about the guerrilla statistician's way to answer this question.

Using some data I have lying around here, I cut it down so that it very much resembles the dataset that is being used in Section 9.3, with the slightly modified code below in the hopes that it makes comparisons more clear.

Here is the setup described in the user's guide:

fit0 <- lmFit(eset, design)
cm1 <- makeContrasts(RNA2-RNA1, RNA3-RNA2, RNA3-RNA1, levels=design)
fit1 <- eBayes(contrasts.fit(fit0, cm1))

And now the pvalues for the ANOVA are in fit1$F.p.value So what is the minimal number of contrasts we need to setup to get the same value? Replacing cm1 with either of these two gives you practically equal, but not identical $F.p.value's as above.

cm2a <- makeContrasts(RNA2-RNA1, RNA3-RNA1, levels=design)
fit2a <- eBayes(contrasts.fit(fit0, cm2a))

cm2b <- makeContrasts(RNA2-RNA1, RNA3-RNA2, levels=design)
fit2b <- eBayes(contrasts.fit(fit0, cm2b))

A plot of plot(-log10(fit1$F.p.value), -log10(fit2a$F.p.value)) (or y=-log10(fit2b$F.p.value)) gets you a plot that's basically on the 45, except that a few points are off the diagonal. These points, however, are from pvalues that are super highly significant (ie. > 7 in -log10 space) that the downstream results you get from these would most likely be identical. Still, the question remains: are these two analysis really (theoretically) equivalent to the one outlined in Section 9.3? As an aside, you could also get the same $F.p.value you get from cm2a by fitting a design with an intercept, and grabbing the multi-coef comparison's pvlaue (topTable(..., coef=2:3)$P.Value) like so: design.icept <- model.matrix(~ Target, target) fit.icept <- eBayes(lmFit(eset, design.icept)) tt <- topTable(fit.icept, 2:3, Inf, sort.by='none') Now tt$P.Value is exactly (not approximately) the same as fit2a\$F.p.value

ADD REPLYlink modified 3.5 years ago • written 3.5 years ago by Steve Lianoglou12k
1

There's some approximation in contrasts.fit, but that's only relevant if you have non-orthogonal designs and weights (so it shouldn't have any effect here). The differences that you mention between fit1, fit2a and fit2b are probably due to numerical issues or differences in termination of the fitting iterations.

Of course, I'm sure you're aware that - in theory, at least - there should be no difference. The null hypothesis in cm2a and cm2b should be the same as that in cm. For example, with cm2a, if RNA2 - RNA1 = 0, and RNA3 - RNA1 = 0, then RNA3 - RNA2 must also equal zero. There's no need to state the last one.

Yup, just spelling things out a bit more explicitly for the OP. Thanks for all the help/advice you provide on this forum!

1
3.5 years ago by
Gordon Smyth35k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth35k wrote:

limma easily does anova-style F-tests, but with empirical Bayes moderation. You can use in limma on any model formula that you would use for the anova() function (except for those with random factors).

Suppose you have a matrix of expression values y and an experimental factor group. To do a oneway anova for the first gene, you might use:

out <- anova(y[1,] ~ group)
summary(out)


In limma you do this:

design <- model.matrix(~group)
fit <- lmFit(y, design)
fit <- eBayes(fit)
topTable(fit)


This does a moderated oneway anova for every gene in your dataset, not just one.

There is absolutely no way to use anova() to do an empirical Bayes analysis like limma does. anova() does a univariate analysis for one gene only, and empirical Bayes analysis is not possible for just one gene.

ADD COMMENTlink modified 3.5 years ago • written 3.5 years ago by Gordon Smyth35k
0
3.5 years ago by
United States
Andrew Bass0 wrote:

Hi,

Sorry if I am confusing your question but I think you're asking how to calculate a moderated t-statistic without using limma (for high-dimensional data)? I think you have to write your own function or just simply use limma.

If, for whatever reason, you do not want to use limma:

Here's how I would approach this but there may be a more efficient way someone else can point out:

(0) How many samples do you have? If you have a lot then you can just do a t-test.

(1) First take a look at Gorfon Smyth's paper on the moderated t statistic (formula on page 8): http://www.statsci.org/smyth/pubs/ebayes.pdf

(2) Using the notation in the paper, there are two quantities needed for the moderated t-test: d_{0} and s_{0}, the prior d.o.f and standard deviation. So if you download the limma package, there is a function called "squeezeVar()" (I think there's a file in the R directory called "squeezeVar.R") that returns both d_{0} and s_{0}.

So I would create a new function, fit your linear model,  use squeezeVar() to calculate s_{0} and d_{0} and then calculate the moderated t-statistic.

ADD COMMENTlink modified 3.5 years ago • written 3.5 years ago by Andrew Bass0
0
3.5 years ago by
Korea, Republic Of
shaheryar0 wrote:

As u know limma and Anova are two different approaches for gene analysis. I want to ask that in limma we can perform moderated t-test using ebayes function as it take parameter of model created by lm.fit but we use lm function for creating model in anova and the model created by annova does not work in ebayes. So can we apply moderated t-test in annova or what is the test name for anova similiar to ebays in limma or what is the best test for annova type model.

You cannot use the output of the anova function as an input into eBayes. If you want to do an ANOVA-style test in limma, follow the instructions above or in the user's guide.

ADD REPLYlink modified 3.5 years ago • written 3.5 years ago by Aaron Lun21k

Actullay i want to do an limma-style test in annova :( My actual point is that

The anova function isn't built to deal with empirical Bayes shrinkage. If you want to do moderation, you should use the limma analysis pipeline as I've described.

Ok i am trying to do in that manner. i will let you know if it doesn't work well. Thanks for your guidance

Do u have any user guide for annova (specific)?