Differences between edgeR's and limma+voom's implementation of camera/roast/romer
maltethodberg ▴ 140

I'm currently using the excellent camera/roast/romer functions from limma for analysing groups of genes in count data.

Currently there are (to my knowledge) two different ways of calling the same functions: using edgeR's methods directly on a DGEList, or first transforming the data with voom and then using standard limma.

In some cases there can be quite a difference between the outputs of the two. I'm wondering what's currently considered best practice when choosing between the edgeR and the limma+voom implementations? Are there any important statistical and/or practical aspects to take into account when choosing?
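For concreteness, the two call paths can be sketched as below. This is a minimal sketch, assuming you already have a DGEList `dge` with normalisation factors and dispersions estimated, a design matrix `design`, and a list of gene-set indices `idx` (e.g. from `ids2indices()`); none of these objects are defined in the thread itself.

```r
library(edgeR)
library(limma)

## Route 1: edgeR's method, applied directly to the DGEList.
## Internally the counts are transformed using the fitted
## negative binomial model before the common test is run.
res_edger <- camera(dge, idx, design)

## Route 2: voom-transform the counts first, then call the
## standard limma method on the EList that voom returns.
v        <- voom(dge, design)
res_voom <- camera(v, idx, design)
```

The same pattern applies to `mroast()` and `romer()`, which also have methods for both DGEList and voom-transformed objects.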

Tags: limma, edgeR, mroast, romer, camera
Gordon Smyth (WEHI, Melbourne, Australia)

Well, the idea is that the gene set tests should be consistent with the main differential expression analysis. If you are planning to use edgeR for the main DE analysis, then you should also use the edgeR gene set test methods. On the other hand, if you are planning to use limma for the main DE analysis, then you should use the gene set test methods in that package. Both the edgeR and limma gene set test methods call the same underlying test functions; the only difference is in how the counts are transformed at the beginning. The edgeR methods use a transformation based on the fitted negative binomial model, which is obviously not relevant for a limma analysis.


As a small addition to Gordon's answer, I once compared roast and camera results between edgeR and voom here:

A: Limma camera function never gives any differentially expressed gene sets

This is only an n-of-1 result, but the p-values returned by the two versions were more or less consistent.

This part is more speculative: I've glanced at similar results from analyses of different datasets, and my impression (again, this is not a systematic comparison) is that the two largely agree. There are differences in the extremes of the p-values, but these wouldn't affect which gene sets you would identify as "significant".

Again, the previous statement is just from me spot checking a few results here and there and not from a formal analysis, so take that with a grain of salt.

However, you say that in some cases there can be quite large differences between the two, so it sounds like you have some datasets in hand.

Could you, perhaps, plot the -log10(pvalues) from each method as I did in the link above?
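A hedged sketch of such a comparison plot, assuming `res_edger` and `res_voom` are the data frames returned by running `camera()` via the edgeR route and the voom+limma route on the same data and gene sets (both contain a `PValue` column, indexed by gene set name):

```r
## Align the two camera results by gene set name, then plot the
## -log10 p-values from one route against the other. Points near
## the dashed identity line indicate the two methods agree.
common <- intersect(rownames(res_edger), rownames(res_voom))
plot(-log10(res_edger[common, "PValue"]),
     -log10(res_voom[common, "PValue"]),
     xlab = "edgeR camera: -log10(p-value)",
     ylab = "voom + limma camera: -log10(p-value)")
abline(0, 1, lty = 2)
```

Disagreement confined to the far upper-right of such a plot would match the impression above: the extremes differ while the overall ranking agrees.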

You are right that the internal rankings are usually more consistent than the p-values. I haven't done any systematic comparison either, but I have noted that, at a p-value threshold of 0.05, one method will sometimes give more significant gene sets than the other.

Yeah, I wouldn't focus too much on differences at a given cutoff, as you may just be observing a threshold effect, but it's hard to know until you make something like the plot I linked to.

