Anova; can I use lmFit and eBayes from Limma, or should it be aov?
1
2
Entering edit mode
john herbert ▴ 560
@john-herbert-4612
Last seen 10.2 years ago

Dear Bioconductors,

This may be more a question of statistics understanding but it is related to Limma.

Is it possible to perform an Anova using functions from the Limma package? If so, how can a post hoc test be applied to the eBayes "MArrayLM" resulting object?

For me, it is a complex design I have array data, different clients, time course, stage of disease and treatment. The Limma manual eludes to Limma being equivalent to Anova in some cases for single channel arrays (on pages 35 and 41). I am using single channel data.

This code runs;

dd <- data.frame(patient=targets$patient,
  stage=as.factor(targets$stage),
  treat=as.factor(targets$treat),time=targets$time)

head(dd)
  patient stage treat time
1       1     A     Y   24
2       1     A     Y   72
3       2     B     N   24
4       2     B     N   72
5       2     B     N    0
6       2     B     Y   72

design <- model.matrix(~patient+stage*treat*time,data=dd)
fit <- lmFit(temp,design)
fit2 <- eBayes(fit)
topTable(fit2)
ID X.Intercept. patient stageB stageC treatY time stageB.treatY stageC.treatY stageB.time stageC.time
 A_23_P113716 A_23_P113716 17.57842 -0.0042698998 -0.18798801 -0.18726778 -0.008996453 1.282717e-03 0.38532184 0.18501711 0.0025256233 0.0005692329 A_23_P77779 A_23_P77779 17.07476 0.0015734113 -0.14199010 -0.07687905 -0.132257119 8.288697e-05 0.04068160 0.03375537 0.0024200711 0.0014022044 

treatY.time stageB.treatY.time stageC.treatY.time AveExpr F P.Value adj.P.Val A_23_P113716 -1.340200e-03 -0.0072744604 -0.0022265946 17.49773 144347.32 6.481891e-159 2.657575e-154 A_23_P77779 -8.857291e-04 0.0008166484 -0.0004867999 17.01806 98921.73 8.930817e-153 1.830817e-148

How do I interpret the results, if I am performing Anova in Limma correctly? What are the numbers for X.intercept and stageB etc?

If I take all probes into a variable with

top_data <- topTable(fit2,n=200000000)

and look for ones that are not significant, I don't find any.

 which(top_data$adj.P.Val > 0.01)
 integer(0)

So it looks like every single probe varies significantly between any of the conditions, which my instincts say is not correct.

Maybe I should run aov like this instead? But there is still a problem of post hoc testing as

aof2 <- function(x,y){ aov(x ~ stage*treat*time+Error(patient), y) } anovaresults <- apply(temp, 1, aof2,dd)
lapply(anovaresults,TukeyHSD)
Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "c('aovlist', 'listof')"

If anyone knows how to solve either of these issues, it will be appreciated.

Thanks,

John.

limma • 7.3k views
ADD COMMENT
0
Entering edit mode

Question reposted

ADD REPLY
3
Entering edit mode
@gordon-smyth
Last seen 3 hours ago
WEHI, Melbourne, Australia

Dear John,

What limma does is always equivalent to Anova. Although, of course, it is not classical Anova but rather an empirical Bayes extension of Anova that is appropriate for microarray data.

Post hoc tests are done in limma using decideTests(), and many options are offered. You won't find classical methods like TukeyHSD though, because limma isn't doing classical Anova and because methods like TukeyHSD don't generalize well to high-dimensional datasets like microarrays. There are a couple of problems with your code below.

First, you have overlooked the fact that limma, unlike classical Anova, does not automatically adjust for the intercept term. (This is for historical reasons because limma was originally developed for two colour microarrays.) Instead limma expects you to specify which coefficients or contrasts you want to test for. When you run

topTable(fit2)

without specifying any coefficients, limma will test whether any of the coefficients are different from zero, including the intercept term. You can see that it has done this because it has included a column for the intercept in the toptable. This is not what you want.

Second, you have failed to specify in your model that patient should be a factor in your model rather than a numerical covariate. It is also unclear to me whether time should be a factor or a covariate. At the moment both are being treated as a covariates.

It is a good idea to look at colnames(design) or colnames(fit) to check that the terms in the model are what you intended.

Once you fix your model, you can conduct Anova-style F-tests using limma simply by specifying a range of coefficients to topTable(), as discussed in the User's Guide.

Best wishes
Gordon

ADD COMMENT

Login before adding your answer.

Traffic: 854 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6