Question: Is number of rotations dependent on number of gene sets in mroast?
9 months ago by
siajunren0 wrote:

When testing more gene sets with mroast, should the number of rotations be increased to maintain the accuracy of the p-values? If I use x rotations to test 1 gene set, should I use perhaps 10x rotations for 10 gene sets, so that each gene set gets x rotations? This question obviously demonstrates my lack of understanding of how mroast works, sorry about that.

limma mroast
Answer: Is number of rotations dependent on number of gene sets in mroast?
9 months ago by
Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote:

No, the number of rotations is not dependent on the number of gene sets in the way that you are thinking.

All the rotations are applied simultaneously to all genes (and hence to all gene sets). If you set nrot=10000, then every gene and every gene set will get 10,000 rotations. That remains true no matter how many genes or sets there are.

There is a dependence, however, between the number of gene sets and the number of rotations that you need to get a significant FDR value. The smallest two-sided p-value that mroast can give for any set is 1/(nrot+1). If you have a lot of gene sets, then you will need to choose nrot large so that small p-values are possible, otherwise you won't end up with any significant gene sets after adjustment for multiple testing.
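To put rough numbers on this, here is a minimal sketch of the arithmetic in Python. It is not limma code. It uses only the 1/(nrot+1) floor stated above and a deliberately conservative assumption that is mine, not mroast's: that only one gene set reaches that floor, so its Benjamini-Hochberg adjusted p-value is about n_sets times the raw value. The function names are hypothetical.

```python
import math
from fractions import Fraction


def min_p_value(nrot):
    """Smallest two-sided p-value attainable with nrot rotations: 1/(nrot+1)."""
    return 1.0 / (nrot + 1)


def rotations_for_single_hit(n_sets, alpha=0.05):
    """Smallest nrot for which one top-ranked set could reach FDR <= alpha.

    Assumption (illustrative only): exactly one set attains the minimum
    p-value, so its BH-adjusted value is n_sets / (nrot + 1).
    Solving n_sets / (nrot + 1) <= alpha gives nrot >= n_sets / alpha - 1.
    """
    # Exact rational arithmetic avoids floating-point surprises in ceil().
    bound = n_sets / Fraction(str(alpha)) - 1
    return max(0, math.ceil(bound))


for n_sets in (5, 100, 10000):
    print(n_sets, rotations_for_single_hit(n_sets))
```

Under that assumption, 5 gene sets need only about 99 rotations to make an FDR of 0.05 reachable, whereas 100 gene sets need about 1999, and the requirement grows linearly with the number of sets. In practice you would want nrot comfortably above this bound, since it only makes significance possible rather than giving fine resolution among the top sets.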

If the number of gene sets is very large, then the number of rotations needed to get fine resolution of the most significant sets can be prohibitive, so we recommend that you switch to fry() or camera() instead.


Does the number of rotations you should use depend on the number of gene sets? I thought the OP was asking something along those lines. For instance, if you run mroast with, say, 5 gene sets, the multiplicity burden isn't that great, so maybe the default 999 rotations is sufficient. But if you have 100 gene sets and the minimum possible p-value is 0.001 with 999 rotations, should you bump that up to, say, 10K or 20K so that your minimum possible p-value is small enough to survive multiplicity correction?

I don't know how the interplay between midp and BH works out in this situation, but I was thinking that you would probably want to bump the rotations up as the number of gene sets increases, all else being equal.

What you say about multiple testing is right, but it seems to me that the OP was asking a much more basic question. I've added a bit more detail to my reply.