Question: Limma Voom produces many identical adjusted p-values
8 months ago by
b.nota
Netherlands
b.nota wrote:

I have used limma-voom for a simple RNA-seq experiment (with two conditions), and in the end many genes have exactly the same adjusted p-value. Is this an artifact, an error in the program, or is there another explanation?

In a nutshell, here is the essential code.

library(limma)
library(edgeR)

# counts: matrix of raw counts; design: model matrix for the two conditions
dge <- DGEList(counts = counts)
keep <- rowSums(cpm(dge) > 1) >= 3   # keep genes with CPM > 1 in at least 3 samples
dge2 <- dge[keep, ]
y1 <- calcNormFactors(dge2)
dim(y1)
# 13597 genes remain after filtering

v <- voom(y1, design, plot = TRUE)

fit <- lmFit(v, design)

contr <- makeContrasts(
    TreatsControl = Treat - Control,
    levels = design)

fit2 <- eBayes(contrasts.fit(fit, contr))

topTable(fit2)
logFC  AveExpr        t      P.Value   adj.P.Val        B
ENSG00000x 1.2876594 5.909082 16.08384 2.420048e-06 0.008157177 5.642434
ENSG00000x 1.6732008 4.513546 16.50495 2.064063e-06 0.008157177 5.629066
ENSG00000x 1.6447710 4.407068 14.42668 4.717211e-06 0.008157177 4.903185
ENSG00000x 0.8961535 9.105814 13.99453 5.681749e-06 0.008157177 4.866960
ENSG00000x 1.8755434 8.735065 13.73668 6.365353e-06 0.008157177 4.755742
ENSG00000x 1.0970506 8.481634 13.47671 7.152532e-06 0.008157177 4.641336
ENSG00000x 2.0074732 2.637772 15.16813 3.469509e-06 0.008157177 4.612408
ENSG00000x 1.0538233 4.869446 13.48162 7.136669e-06 0.008157177 4.601666
ENSG00000x 1.6564010 5.035228 13.41057 7.370392e-06 0.008157177 4.576627
ENSG00000x 0.8405514 6.563975 13.07226 8.611902e-06 0.008157177 4.457263

As you can see, the adjusted p-values are exactly the same, while the raw p-values differ. Any ideas about the cause? Has anyone seen this before?

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] edgeR_3.20.5 limma_3.34.5

loaded via a namespace (and not attached):
[1] compiler_3.4.3  Rcpp_0.12.14    grid_3.4.3      locfit_1.5-9.1
[5] lattice_0.20-35

modified 8 months ago by Gordon Smyth • written 8 months ago by b.nota
8 months ago by
Perry Moerland
Bioinformatics Laboratory, Academic Medical Center, Amsterdam, the Netherlands
Perry Moerland wrote:

Hi Benjamin,

There is indeed a logical explanation for these runs of identical adjusted p-values when using the Benjamini-Hochberg (BH) method. To quote Wikipedia:

"The Benjamini–Hochberg procedure (BH step-up procedure) controls the FDR at level alpha. It works as follows [for p-values in ascending order P_1,...,P_m and corresponding null hypotheses H_1,...,H_m]:

1. For a given alpha, find the largest k such that P_k <= alpha*(k/m).
2. Reject the null hypothesis (i.e., declare discoveries) for all H_i for i=1,...,k.

Geometrically, this corresponds to plotting P_k vs. k (on the y and x axes respectively), drawing the line through the origin with slope alpha/m, and rejecting all null hypotheses that are to the left of where the points cross the line."

The adjusted p-value as calculated by the function p.adjust (invoked by topTable) is then equal to the smallest alpha at which the corresponding null hypothesis is rejected. Depending on the shape of the P_k vs. k curve this might indeed lead to runs of identical adjusted p-values. See also:
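You can reproduce the collapse with p.adjust directly. A minimal sketch with made-up p-values (not the poster's data), computing the BH step-up adjustment by hand next to p.adjust:

```r
# Benjamini-Hochberg by hand on hypothetical raw p-values.
p <- c(1e-4, 2e-4, 3e-4, 9e-4, 1e-3)
m <- length(p)

# Sort, scale P_(k) by m/k, then enforce monotonicity from the largest
# p-value downwards (cummin over the reversed vector), cap at 1.
o <- order(p)
adj_sorted <- pmin(1, rev(cummin(rev(p[o] * m / seq_len(m)))))
adj <- adj_sorted[order(o)]

adj
# the three smallest raw p-values collapse to the same adjusted value

stopifnot(all.equal(adj, p.adjust(p, method = "BH")))
```

The cummin step is where the ties arise: whenever P_(k) * m/k is not increasing in k, a run of genes inherits the same minimum, so distinct raw p-values map to one adjusted value.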

Best,

Perry

Perry, thanks for the explanation and the link to earlier posts!

8 months ago by
Gordon Smyth
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth wrote:

You have discovered why it is better to plot p-values rather than FDR values!

See for example: Volcanoplot with limma - RAW P-values or Adj.P-Values

8 months ago by
Axel Klenk
Switzerland
Axel Klenk wrote:

Dear b.nota,

yes, it is a known effect of the BH p-value adjustment, not of limma or voom.

Cheers,

- axel

Thanks Axel. So can I just use these adjusted p-values, or is there a way to fix it?

Yes, they are fine; you can use them. Keep in mind that they are false discovery rates, not p-values.

When I put them in a plot it looks kind of weird, and I got complaints from my fellow researchers. They should be adjusted p-values, right? Isn't that something different from just false discovery rates?

Yep, that's why I usually use unadjusted p-values for plotting and ordering. With respect to p-values vs. FDR, see
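One reason ordering by raw p-values is safe: BH adjustment is a monotone transform of the raw p-values, so the gene ranking is unchanged and the only effect is the ties. A quick check with made-up values:

```r
# BH adjustment never reorders genes relative to their raw p-values;
# it only flattens some of them into ties.
p <- c(0.003, 0.0005, 0.0004, 0.02, 0.0006)   # hypothetical raw p-values
adj <- p.adjust(p, method = "BH")

# Adjusted values are non-decreasing along the raw p-value order,
stopifnot(all(diff(adj[order(p)]) >= 0))
# yet some distinct raw p-values share one adjusted value.
stopifnot(anyDuplicated(adj) > 0)
```

So a volcano plot on raw p-values shows the same ranking as the FDR column, just without the flat bands of tied values.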