Differences between edgeR's and limma+voom's implementation of camera/roast/romer
maltethodberg ▴ 140

I'm currently using the excellent camera/roast/romer functions from limma for analysing groups of genes in count data.

Currently there are (to my knowledge) two different ways of calling the same functions: using edgeR's methods directly on a DGEList, or first transforming the data with voom and then using standard limma.

In some cases there can be quite a difference between the outputs of the two. I'm wondering what's currently considered best practice when choosing between the edgeR and the limma+voom implementations? Are there any important statistical and/or practical aspects to take into account when choosing?
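For concreteness, the two call paths can be sketched as below. This is a minimal sketch, assuming you already have a DGEList `dge` with normalisation factors and dispersions estimated, a design matrix `design`, and a list of gene-set indices `idx` (e.g. from `ids2indices()`); none of these objects are defined in the thread itself.

```r
library(edgeR)
library(limma)

## Route 1: edgeR's method, applied directly to the DGEList.
## Internally the counts are transformed using the fitted
## negative binomial model before the common test is run.
res_edger <- camera(dge, idx, design)

## Route 2: voom-transform the counts first, then call the
## standard limma method on the EList that voom returns.
v        <- voom(dge, design)
res_voom <- camera(v, idx, design)
```

The same pattern applies to `mroast()` and `romer()`, which also have methods for both DGEList and voom-transformed objects.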

Tags: limma, edgeR, mroast, romer, camera
Gordon Smyth (WEHI, Melbourne, Australia)

Well, the idea is that the gene set tests should be consistent with the main differential expression analysis. If you are planning to use edgeR for the main DE analysis, then you should also use the edgeR gene set test methods. On the other hand, if you are planning to use limma for the main DE analysis, then you should use the gene set test methods in that package. Both the edgeR and limma gene set test methods call the same underlying test functions; the only difference is in how the counts are transformed at the beginning. The edgeR methods use a transformation based on the fitted negative binomial model, which is obviously not relevant for a limma analysis.


As a small addition to Gordon's answer, I once compared roast and camera results between edgeR and voom here:

A: Limma camera function never gives any differentially expressed gene sets

This is only an n-of-1 result, but the p-values returned by the two versions were more or less consistent.

This part is more speculative: I've glanced at similar results from analyses of different datasets, and my impression (again, this is not a systematic comparison) is that the two largely agree. There are differences in the extremes of the p-values, but these wouldn't affect which gene sets you would identify as "significant".

Again, the previous statement is just from me spot checking a few results here and there and not from a formal analysis, so take that with a grain of salt.

However, you say that in some cases there can be quite large differences between the two, so it sounds like you have some datasets in hand.

Could you, perhaps, plot the -log10(pvalues) from each method as I did in the link above?
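A hedged sketch of such a comparison plot, assuming `res_edger` and `res_voom` are the data frames returned by running `camera()` via the edgeR route and the voom+limma route on the same data and gene sets (both contain a `PValue` column, indexed by gene set name):

```r
## Align the two camera results by gene set name, then plot the
## -log10 p-values from one route against the other. Points near
## the dashed identity line indicate the two methods agree.
common <- intersect(rownames(res_edger), rownames(res_voom))
plot(-log10(res_edger[common, "PValue"]),
     -log10(res_voom[common, "PValue"]),
     xlab = "edgeR camera: -log10(p-value)",
     ylab = "voom + limma camera: -log10(p-value)")
abline(0, 1, lty = 2)
```

Disagreement confined to the far upper-right of such a plot would match the impression above: the extremes differ while the overall ranking agrees.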

You are right that the internal rankings are usually more consistent than the p-values. I haven't done any systematic comparison either, but I have noted that, at a p-value threshold of 0.05, one method will sometimes give more significant gene sets than the other.

Yeah, I wouldn't focus too much on differences at a given cutoff, as you may just be observing a threshold effect, but it's hard to know until you make something like the plot I linked to.

