Entering edit mode

Dear listserv,
Consider the following vector of P-values:
p.values <- c(
0.266845129,
0.353675275,
0.030585143,
0.791285527,
0.60805832,
0.433466716,
0.039369439,
0.48918122,
0.740626586
)
When I run the qvalue function of the qvalue package, I get a rather
surprising outcome: the q-values are much smaller than the
corresponding p-values!
See the following code below:
source("http://bioconductor.org/biocLite.R")
biocLite("qvalue")
library(qvalue)
q.values <- qvalue(p.values)$qvalue
comparison <- cbind(p.values, q.values)
colnames(comparison) <- c("p.values", "q.values")
comparison
p.values q.values
[1,] 0.26684513 0.037111560
[2,] 0.35367528 0.037111560
[3,] 0.03058514 0.008960246
[4,] 0.79128553 0.040020397
[5,] 0.60805832 0.039540110
[6,] 0.43346672 0.037111560
[7,] 0.03936944 0.008960246
[8,] 0.48918122 0.037111560
[9,] 0.74062659 0.040020397
So what is happening here? Can I trust the q-values? In each case they
are substantially larger than the P-values. Especially troubling to me
are rows 4, 5, and 9.
Thanks in advance for any information/advice!
-----------------------------------
Josh Banta, Ph.D
Assistant Professor
Department of Biology
The University of Texas at Tyler
Tyler, TX 75799
Tel: (903) 565-5655
http://plantevolutionaryecology.org
[[alternative HTML version deleted]]

Hi Josh,

Of course it's not statistically defensible, and you can't use these qvalues. As Mike Love points out, qvalue just isn't intended for small sets of p-values like this.

The main problem is that qvalue's estimate of pi0 (the proportion of true nulls) is not reliable for small numbers of tests. For your data, qvalue estimates pi0 to be

which is equivalent to saying that most likely all the null hypotheses should be rejected.

Some other methods of estimating pi0 are more robust. For example the limma function propTrueNull gives pi0=0.75 for the same data:

The qvalue method of estimating pi0 is sensitive to the size of the largest p.values. If you changed just the 0.79 p-value to 0.99, all the qvalues would look very different:

For very small sets of p-values, it seems to me to be safer to use the Benjamini and Hochberg algorithm (available in the p.adjust function). BH is more conservative than qvalue, but it doesn't rely on an estimator for pi0 and is trustworthy for any number of tests.

I generally use BH myself for any number of tests (my packages limma and edgeR use BH), because I don't feel that estimation of pi0 can be made bullet proof even for large numbers of tests.

Best wishes