Question

cameraPR vs geneSetTest and ROAST/CAMERA in general

BharathAnanth ▴ 80

@bharathananth-10049

Hi!

I learnt from this old post (limma roast syntax for overall anova) that ROAST cannot be performed on multiple contrasts (i.e., with the F-test). This is still true, I suppose? In that post it was suggested to use geneSetTest with the F-statistic (from topTable).

First, my understanding is that roast and geneSetTest test different hypotheses, i.e., self-contained vs competitive. So this is not exactly comparing apples to apples, is it?

Second, cameraPR is now also available in limma. What is the recommendation for using geneSetTest vs cameraPR? In my analysis, I get strong significance (p<0.001) with geneSetTest and no significance (p>0.5) with cameraPR (once the inter-gene correlation is included). I would like to know which result to believe and interpret.

Third, more generally, I find situations in my analyses where applying roast and camera to the same gene set gives discordant conclusions. I understand they test different hypotheses. I do not want to indulge in p-value hacking and pick the test that fits my story. So do you have suggestions for how to go about this in a consistent manner? (As an aside, I can suggest an explanation for no significance in ROAST but significance in CAMERA: the effects are so small and limited that they are insignificant overall after multiple testing, but the gene set of interest contains all the genes on which there is some effect.)

Thanks

limma voomwithqualityweights gene set analysis • 2.9k views

updated 6.0 years ago by Aaron Lun ★ 28k • written 6.0 years ago by BharathAnanth ▴ 80

score 9 · Answer 1 · 2018-04-02

For your first question: that's right, they are not testing the same null hypothesis. But roast() does not support F-statistics yet, so if you want to do a gene set test with multiple contrasts, you'll have to use geneSetTest(). This may be sufficient to answer your broader scientific question regarding the function of DE genes.

For your second question: reading ?geneSetTest should pretty much tell you what you need to know. To paraphrase the documentation, geneSetTest() assumes that the genes are independent, which is generally inappropriate in the presence of co-regulated genes. If geneSetTest() disagrees with cameraPR(), I would be inclined to believe the latter, as it is more robust to correlations between genes.
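To see why the independence assumption matters, here is a toy simulation. It is a minimal sketch, not limma's implementation: it is written in dependency-free Python rather than R, it tests a set mean against zero instead of against the other genes, and the set size, the correlation of 0.05 and the function name `false_positive_rate` are all assumptions chosen for illustration. No gene is DE, but the genes in the set share a small positive correlation. The adjusted version divides by a variance inflation factor of 1 + (m - 1) * rho, which is the kind of correction a correlation-aware test applies.

```python
import math
import random

def false_positive_rate(m, rho, adjust, n_sim=4000, seed=1):
    """Fraction of null simulations where a set-mean z-test gives two-sided p < 0.05."""
    rng = random.Random(seed)
    # Variance of the set mean is inflated by this factor when genes are correlated.
    vif = 1 + (m - 1) * rho if adjust else 1.0
    crit = 1.959964  # two-sided 5% critical value of N(0, 1)
    hits = 0
    for _ in range(n_sim):
        shared = rng.gauss(0, 1)  # component common to all genes in the set
        stats = [math.sqrt(rho) * shared + math.sqrt(1 - rho) * rng.gauss(0, 1)
                 for _ in range(m)]
        z = (sum(stats) / m) * math.sqrt(m / vif)
        if abs(z) > crit:
            hits += 1
    return hits / n_sim

naive = false_positive_rate(m=50, rho=0.05, adjust=False)
adjusted = false_positive_rate(m=50, rho=0.05, adjust=True)
print(f"independence assumed:       {naive:.3f}")
print(f"variance inflation applied: {adjusted:.3f}")
```

The unadjusted test rejects far more often than the nominal 5% even though nothing is DE, which is one way a very small p-value from an independence-based test can coexist with a non-significant result from a test that accounts for inter-gene correlation.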

For your third question: if you understand that they test different hypotheses, you should be able to pick the test that addresses your scientific question. Perhaps an example might be illustrative. Imagine a situation where 50% of genes are DE between your conditions of interest. These genes are spread evenly throughout all gene sets, meaning that a competitive gene set test would not give any significant hits. However, a self-contained test would return significant hits for all gene sets. The outcomes for both tests are correct in their own way, but the scientific conclusions obtained are quite different.

For example, say I was interested in the immune response gene set. The self-contained test would reject the null, which tells me that the immune response is affected by the differences between conditions. Fair enough, as 50% of genes are DE in this set; I might then start to think about the biological consequences of altered immune activity between conditions. However, the competitive test would fail to reject the null, which tells me that there is no evidence that the immune response is more affected than other gene sets by the differences between conditions. This is a different but also useful result, as it suggests that the immune response is not the primary distinguishing feature between conditions. Thus, I might be inclined to prioritize other pathways for follow-up work to characterize the differences between conditions.
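The 50% scenario can be sketched with a toy simulation. This is not ROAST or CAMERA: it is dependency-free Python rather than R, genes are simulated as independent, and the two tests are simple normal-approximation z-tests on squared gene-level statistics. The constants (`N_GENES`, `SET_SIZE`, `EFFECT`) and helper names are assumptions made up for illustration.

```python
import math
import random

N_GENES, SET_SIZE, EFFECT = 2000, 100, 3.0

def upper_tail_p(z):
    """One-sided p-value P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def mean_var(x):
    """Sample mean and unbiased sample variance."""
    m = sum(x) / len(x)
    return m, sum((a - m) ** 2 for a in x) / (len(x) - 1)

def one_experiment(rng):
    # Every second gene is DE, so the set and its complement are both 50% DE.
    z = [rng.gauss(EFFECT if g % 2 == 0 else 0.0, 1.0) for g in range(N_GENES)]
    sq = [v * v for v in z]  # squared statistics ignore direction of change
    (m_set, v_set), (m_rest, v_rest) = mean_var(sq[:SET_SIZE]), mean_var(sq[SET_SIZE:])

    # Self-contained: is the set mean of z^2 above 1, its expectation with no DE at all?
    z_self = (m_set - 1.0) / math.sqrt(2.0 / SET_SIZE)

    # Competitive: is the set mean of z^2 above that of the remaining genes?
    se = math.sqrt(v_set / SET_SIZE + v_rest / (N_GENES - SET_SIZE))
    z_comp = (m_set - m_rest) / se
    return upper_tail_p(z_self), upper_tail_p(z_comp)

rng = random.Random(2018)
results = [one_experiment(rng) for _ in range(300)]
self_rate = sum(p_self < 0.05 for p_self, _ in results) / len(results)
comp_rate = sum(p_comp < 0.05 for _, p_comp in results) / len(results)
print(f"self-contained rejection rate: {self_rate:.2f}")
print(f"competitive rejection rate:    {comp_rate:.2f}")
```

On the same simulated data, the self-contained test rejects essentially every time while the competitive test rarely does. Neither is wrong; they answer different questions about the same gene set.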