Hi

I've been using ROAST and Camera, but I am bit unsure of how much to trust each. Let's say that I want to test if a just a a couple of gene sets are enriched in an RNAseq dataset. roast/mroast should do the trick, right? Well, I am under the impression that it's very easy for roast to produce interesting p-values, even if it's just random data.

As an example, let's randomly generate gene expression for 10000 genes, and make 1000 of these DE expressed between 2 groups. Then generate two lists with 100 genes: one where 50% of the genes are DE, and the other where genes are picked entirely at random.

```
set.seed(0)
# generate random expression 10000 genes
y <- matrix(rnorm(10000 * 4), 10000,4)
design <- cbind(Intercept = 1, Group = c(0, 0, 1, 1))
# make 1000 genes differentially expressed
y[1:1000, 3:4] <- y[1:1000, 3:4] + 3
# index 1 - 50% genes are DE
DE_50 <- 951:1050
# index 2 - random genes
DE_random <- sample(1:10000, 100)
mroast(y, list(DE_50 = DE_50, DE_random = DE_random),
design = design, contrast = 2, set.statistic = 'mean50')
```

This gives:

```
NGenes PropDown PropUp Direction PValue FDR PValue.Mixed FDR.Mixed
DE_50 100 0.02 0.55 Up 0.002 0.003 0.002 0.003
DE_random 100 0.11 0.17 Up 0.030 0.030 0.009 0.009
```

Quite interesting p-values for a random list! ROAST appears to have a huge false positive rate!

In contrast, Camera behaves as expected:

```
> camera(y, DE_50, design = design, contrast = 2)
NGenes Direction PValue
set1 100 Up 5.534736e-14
> camera(y, DE_random, design = design, contrast = 2)
NGenes Direction PValue
set1 100 Down 0.3403619
```

This makes me quite unsure about trusting ROAST. Camera is/was said to be quite conservative, but partly this could be just because ROAST gives many false positives. Are there any real advantages of using ROAST over Camera?

Thanks, Nuno

Why did you choose set.statistic="mean50"?

roast and camera would give the same p-value for the random set if you stuck to the default settings, so the whole phenomenon that's worrying you wouldn't exist.

I am often interested in testing for significance in both directions (a subset of genes is up- and another is down-regulated). The default setting does not allow that, but

`floormean`

,`mean50`

and`msq`

do. And in the original ROAST paper you suggested that`mean50`

is a good compromise in many biological situations.Partially related to this: what exactly is the meaning of

`PValue.Mixed`

? According to limma's help, it's a 'non-directional p-value'. So how does this work in the context of using`set.statistic="mean"`

? Note that even in your counter-example below (using the default options), PValue.Mixed is very low for the random set.That is what the mixed p-value is for: it tests whether the set contains DE genes in any direction. If the mixed p-value is low but the directional p-value is not, then that shows that the set contains DE genes in both directions.

I tried hard to describe PValue.Mixed clearly, both in the roast paper and in the roast documentation. People do find the whole concept to be difficult however. You didn't find the documentation to be clear enough?

I did recommend "mean50" back in 2010, but since then I found it was very confusing to (sometimes) get both up and down p-values significant for the same set, so I changed "mean" to be the default instead.

camera obviously cannot detect subsets of genes changing in different directions so, if you want that, you have to use roast.

It seems to me that you may be wanting the gene set test to have a combination of properties that are impossible for a single test to achieve. A test that is sensitive to subsets of genes changing in different directions has to be sensitive to relatively small DE proportions, and that will then lead to the behaviour that you complained about in your post.