Question

vooma lm.fit versus voom lmFit

biominer ▴ 10

@biominer-7701

European Union

Hello,

I'd like to use the vooma function for my microarray data because, when checking the model, I found a noticeable trend in the variances.

My input is an ExpressionSet and the fData got lost, so I looked into the code to see why. (Anything that is not an EList gets reduced to the bare expression matrix. That's not a major problem, and copying some code from the first lines of the voom function could easily fix it.)

Looking further, I realized that there are quite a few differences between the voom and vooma functions. I had expected vooma to be very much like voom, just without the transformation to log-cpm.

However, vooma seems to use base R's lm.fit function for the modeling instead of limma's own lmFit function. Since the latter provides considerably more flexibility, I was wondering what the rationale behind that decision was.

What are the relevant differences between log-cpm and array expression values that make these differences necessary?

Best,

Maik

limma • 2.9k views

ADD COMMENT • link updated 8.9 years ago by Gordon Smyth 50k • written 8.9 years ago by biominer ▴ 10

score 5 · Accepted Answer · 2015-05-12

Gordon Smyth 50k

@gordon-smyth

WEHI, Melbourne, Australia

vooma() is actually an edit of voom() as it was three years ago. Originally both functions were written primarily for matrices and EList objects. We've made voom() more general in the meantime but neglected to update vooma() in parallel. vooma() does work on ExpressionSet objects but doesn't preserve all the annotation information.
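As a rough illustration of one way around the lost annotation, the feature data can be copied into the `genes` slot of the EList that vooma() returns. This is only a sketch under stated assumptions: the toy ExpressionSet, the `Symbol` column and the two-group design are invented for the example, and newer limma versions may already carry the annotation through.

```r
library(limma)
library(Biobase)

set.seed(1)

# Toy ExpressionSet (hypothetical data): 100 probes, 6 arrays
exprs_mat <- matrix(rnorm(600, mean = 8), nrow = 100, ncol = 6,
                    dimnames = list(paste0("probe", 1:100), paste0("s", 1:6)))
fd <- data.frame(Symbol = paste0("SYM", 1:100), row.names = rownames(exprs_mat))
eset <- ExpressionSet(assayData = exprs_mat,
                      featureData = AnnotatedDataFrame(fd))

group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)

# Run vooma() on the expression matrix, then re-attach the annotation by hand
v <- vooma(exprs(eset), design)
v$genes <- fData(eset)

# lmFit() copies v$genes into the fit, so topTable() reports the annotation
fit <- eBayes(lmFit(v, design))
head(topTable(fit, coef = 2))
```

Because lmFit() takes the probe annotation from the `genes` component of an EList, the `Symbol` column should then appear in the topTable() output.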

The reason for our lack of attention to vooma() is that the function is rarely used. Setting trend=TRUE when you run eBayes() will give you the same beneficial effect without having to run vooma(). In other words:

fit <- lmFit(y, design)
fit <- eBayes(fit, trend=TRUE)

gives (for microarray data) essentially the same effect as:

v <- vooma(y, design)
fit <- lmFit(v, design)
fit <- eBayes(fit)

We generally recommend the former pipeline over the latter for microarray data.
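To see how close the two pipelines are on a given data set, one can run both and compare the moderated t-statistics. The sketch below uses simulated log-expression values with an artificial mean-variance trend; the simulation settings (`mu`, `sdev`, the two-group design) are assumptions made purely for illustration, not part of the original answer.

```r
library(limma)

set.seed(1)
ngenes <- 1000
nsamples <- 6
group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)

# Simulated log-expression matrix whose standard deviation falls as the
# mean rises (an invented trend, for illustration only)
mu <- runif(ngenes, 4, 12)
sdev <- 0.2 + 1 / mu
y <- matrix(rnorm(ngenes * nsamples, mean = mu, sd = sdev), ngenes, nsamples)
rownames(y) <- paste0("gene", seq_len(ngenes))

# Pipeline 1: ordinary fit, trend handled in the empirical Bayes step
fit1 <- eBayes(lmFit(y, design), trend = TRUE)

# Pipeline 2: vooma() precision weights, then a standard eBayes()
v <- vooma(y, design)
fit2 <- eBayes(lmFit(v, design))

# Correlation of the moderated t-statistics for the group effect
cor(fit1$t[, 2], fit2$t[, 2])
```

A correlation close to 1 would be consistent with the two approaches giving essentially the same ranking on data of this kind.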

The real justification for the vooma() function is as a prototype for the voomaByGroup() function, which fits a heteroscedastic model with different variances for different groups.

ADD COMMENT • link 8.9 years ago Gordon Smyth 50k

Thanks a lot, that's very helpful! Just out of curiosity: is the reason you can't simply use eBayes with the trend parameter for log-cpm values as well the nonlinear mean-variance trend, or a higher degree of heteroscedasticity overall?

ADD REPLY • link 8.9 years ago biominer ▴ 10

trend=TRUE is not a fully adequate substitute for voom() for RNA-seq data because the sequencing depths can be different for different libraries. In the voom paper:

http://genomebiology.com/2014/15/2/R29

we showed that trend=TRUE and voom() give equivalent results when the library sizes are equal, but voom() performs best when the library sizes are considerably unequal.
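The role of library size can be explored with a small simulation. The sketch below is an illustration under stated assumptions: the negative binomial counts, the alternating 2 million and 20 million read depths and the filtering threshold are invented, and the log-cpm values for the trend pipeline are computed by hand rather than with any particular helper function.

```r
library(limma)

set.seed(2)
ngenes <- 2000
group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)

# Alternating shallow and deep libraries (hypothetical depths)
lib.size <- c(2, 20, 2, 20, 2, 20) * 1e6
deep <- lib.size > 1e7

# Negative binomial counts with expected value proportional to library size
prop <- rgamma(ngenes, shape = 0.5)
prop <- prop / sum(prop)
mu <- outer(prop, lib.size)
counts <- matrix(rnbinom(ngenes * 6, mu = mu, size = 10), ngenes, 6)
counts <- counts[rowSums(counts) >= 60, ]  # drop very low counts

# voom: observation-level weights that depend on the expected count size
v <- voom(counts, design)
fit_voom <- eBayes(lmFit(v, design))

# limma-trend: log-cpm values with one trended prior variance per gene
lib <- colSums(counts)
logcpm <- t(log2((t(counts) + 0.5) / (lib + 1) * 1e6))
fit_trend <- eBayes(lmFit(logcpm, design), trend = TRUE)

# Ratio of mean voom weights in deep versus shallow libraries, per gene
summary(rowMeans(v$weights[, deep]) / rowMeans(v$weights[, !deep]))
```

voom() assigns a separate weight to every observation, so the same gene can be weighted differently in a deep and a shallow library, whereas trend=TRUE models the variance only as a function of each gene's average log-expression.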

ADD REPLY • link 8.9 years ago Gordon Smyth 50k