Question: Order of call (before or after eBayes)
gravatar for andrew.birnberg
8 months ago by
andrew.birnberg20 wrote:

Hello, I have run into a new issue (for me) in fitting a limma model.

The model I am fitting includes a number of covariates/additional factors, such as ages and weights of individuals associated with a given "array", several random factors, such as race/ethnicity, region, and the factor defining my primary groups of interest.

E.g. model.matrix(~ 0 + groups + race + region + scale(age) + scale(weight), data = full_dataset)

I am making contrasts from pairwise interactions of levels of the groups factor, e.g. - high - medium - medium - low - high - low

mat_data <- # defined earlier
design <- # defined earlier
contrasts <- # defined earlier

When fitting a limma model, is there any difference between the following, and if so, what exactly:

fit <- lmFit(mat_data, design)
fit <-, contrasts)
fit <- eBayes(fit)


fit <- lmFit(mat_data, design)
fit <- eBayes(fit)
fit <-, contrasts)

Clearly there is some difference, since I see large differences in topTable for each coefficient, particularly in the t-statistic and P-values. However, logFC and AveExpr stay the same, so I believe that the coefficients being estimated are the same.

The only reason I stumbled up this scenario is that I wanted to check the residuals of my fit and found that after calling I could no longer use residuals or fitted with the MArrayLM object. It seems that after adjusting the model with, since the coefficients no longer match the design, you can't compute with the original coefficients. So, rather than computing the fit twice, I wanted to see if I could use the fit after the eBayes call to extract the residuals, and then call to get the contrasts of interest.

Thank you for any help you can provide!

limma • 272 views
ADD COMMENTlink modified 8 months ago by Gordon Smyth38k • written 8 months ago by andrew.birnberg20
Answer: Order of call (before or after eBayes)
gravatar for Ryan C. Thompson
8 months ago by
The Scripps Research Institute, La Jolla, CA
Ryan C. Thompson7.4k wrote:

I recommend saving the result of as a separate object, so that you still have access to the original lmFit result as well. They both have their uses, as you have noted. Run eBayes on both of them. My typical workflow looks something like this:

fit <- lmFit(mat_data, design)
fit <- eBayes(fit)
cfit <-, contrasts)
cfit <- eBayes(cfit)
ADD COMMENTlink written 8 months ago by Ryan C. Thompson7.4k

I'm not sure I'm clear on your suggestion. Is it necessary to call eBayes twice your workflow? What would be the effect in that case?

ADD REPLYlink written 8 months ago by andrew.birnberg20

It's only necessary to call eBayes if you need the statistics that it calculates. This will almost always be the case for the contrast fit (since you want to call topTable on it), but sometimes you also want it for the regular fit as well. It normally only takes a second or two to run it, so I just always run it on every fit.

ADD REPLYlink written 8 months ago by Ryan C. Thompson7.4k

Ok, so you're saying that the statistics reported by topTable after, but before the second eBayes are based on incorrect variances? If so, what is topTable using to calculate its statistics? I had thought that only the gene-wise variances were updated by eBayes and that the coefficient variances were functions of those gene-wise variances (and that the contrasts of the coefficients were functions of those initial coefficient variances...).

ADD REPLYlink written 8 months ago by andrew.birnberg20

You can read the code of the functions to find out exactly what they are doing. I did that a while ago but I don't recall the exact details. I just remember that the rule I came up with was "always run eBayes on every fit", and I haven't had any trouble since.

ADD REPLYlink written 8 months ago by Ryan C. Thompson7.4k
Answer: Order of call (before or after eBayes)
gravatar for Gordon Smyth
8 months ago by
Gordon Smyth38k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth38k wrote:

I think you already know which order is correct. The order with before eBayes is standard and correct. You've observed that the second order gives different results so you know it can't be right.

It's easy to see what's going on:

  1. lmFit computes coefficients, residual variances and standard errors.

  2. converts the coefficients and standard errors to reflect the contrasts rather than the original design matrix, but does not compute t-statistics or p-values.

  3. eBayes computes t-statistics and p-values from the coefficients and standard errors.

So you can see that, if you run after eBayes without running eBayes again, then the object contains coefficients and standard errors matching the contrasts but t-statistics and p-values matching the original coefficients. Obviously not sensible.

If you run the code as Ryan suggested:

fit <- lmFit(mat_data, design)
fit2 <-, contrasts)
fit2 <- eBayes(fit2)

then you preserve both the original coefficients and the contrasts. It's no coincidence that the example code is written exactly like this on the help page for and every time the function is used in the limma User's Guide.

It is true that you can't compute fitted values or residuals from fit2, but I find that I hardly ever want to do that. I could have programmed so that it would store both the original coefficients and the contrasts but I didn't want to make the output object any bigger than it needed to be.

ADD COMMENTlink modified 8 months ago • written 8 months ago by Gordon Smyth38k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 231 users visited in the last hour