Question

Limma Voom produces many identical adjusted p-values

0

Entering edit mode

b.nota ▴ 360

@bnota-7379

Last seen 3.6 years ago

Netherlands

I have used limma voom for a simple RNAseq experiment (with two conditions), and in the end many genes have exactly the same adjusted p-value. I was wondering if this is an artifact? Or an error in the program? Or any other explanation?

In a nutshell here the essential code.

dge <- DGEList(counts=counts)
keep <- rowSums(cpm(dge)>1) >= 3
dge2 <- dge[keep,]
y1 <- calcNormFactors(dge2)
dim(y1)
# 13597

v <- voom(y1, design, plot=TRUE)

fit <- lmFit(v, design)

contr <- makeContrasts(
TreatsControl = Treat-Control,
levels = design)

fit2 <- eBayes(contrasts.fit(fit, contr))

topTable(fit2)
                    logFC  AveExpr        t      P.Value   adj.P.Val        B
ENSG00000x 1.2876594 5.909082 16.08384 2.420048e-06 0.008157177 5.642434
ENSG00000x 1.6732008 4.513546 16.50495 2.064063e-06 0.008157177 5.629066
ENSG00000x 1.6447710 4.407068 14.42668 4.717211e-06 0.008157177 4.903185
ENSG00000x 0.8961535 9.105814 13.99453 5.681749e-06 0.008157177 4.866960
ENSG00000x 1.8755434 8.735065 13.73668 6.365353e-06 0.008157177 4.755742
ENSG00000x 1.0970506 8.481634 13.47671 7.152532e-06 0.008157177 4.641336
ENSG00000x 2.0074732 2.637772 15.16813 3.469509e-06 0.008157177 4.612408
ENSG00000x 1.0538233 4.869446 13.48162 7.136669e-06 0.008157177 4.601666
ENSG00000x 1.6564010 5.035228 13.41057 7.370392e-06 0.008157177 4.576627
ENSG00000x 0.8405514 6.563975 13.07226 8.611902e-06 0.008157177 4.457263

As you can see, the adjusted p-values are exactly the same, while the normal p-values differ. Any ideas about the cause? Has anyone seen this before?

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] edgeR_3.20.5 limma_3.34.5

loaded via a namespace (and not attached):
[1] compiler_3.4.3  Rcpp_0.12.14    grid_3.4.3      locfit_1.5-9.1
[5] lattice_0.20-35

limma voom adjusted pvalue • 5.3k views

ADD COMMENT • link updated 6.3 years ago by Gordon Smyth 50k • written 6.3 years ago by b.nota ▴ 360

score 4 · Answer 1 · 2018-01-11

Hi Benjamin,

There is indeed a logical explanation for these runs of identical adjusted p-values when using the Benjamini-Hochberg (BH) method. To quote Wikipedia:

"The Benjamini–Hochberg procedure (BH step-up procedure) controls the FDR at level alpha. It works as follows [for p-values in ascending order P_1,...,P_m and corresponding null hypotheses H_1,...,H_m]:

1. For a given alpha, find the largest k such that P_k <= alpha*(k/m).
2. Reject the null hypothesis (i.e., declare discoveries) for all H_i for i=1,...,k.

Geometrically, this corresponds to plotting P_k vs. k (on the y and x axes respectively), drawing the line through the origin with slope alpha/m, and rejecting all null hypotheses that are to the left of where the points cross the line."

The adjusted p-value as calculated by the function p.adjust (invoked by topTable) is then equal to the smallest alpha at which the corresponding null hypothesis is rejected. Depending on the shape of the P_k vs. k curve this might indeed lead to runs of identical adjusted p-values. See also:

See C: adj.P metod = \"BH\" many exact the same value for the answer of Mike Love to basically the same question asked a few years ago.
A nice geometric explanation [under the header 'Runs of identical adjusted p-values'] from (again) Mike Love.

Best,

Perry

score 2 · Answer 2 · 2018-01-11

2

Entering edit mode

Gordon Smyth 50k

@gordon-smyth

Last seen 42 minutes ago

WEHI, Melbourne, Australia

You have discovered why it is better to plot p-values rather than FDR values!

See for example: Volcanoplot with limma - RAW P-values or Adj.P-Values

ADD COMMENT • link 6.3 years ago Gordon Smyth 50k

score 1 · Answer 3 · 2018-01-11

1

Entering edit mode

Axel Klenk ★ 1.0k

@axel-klenk-3224

Last seen 7 hours ago

UPF, Barcelona, Spain

Dear b.nota,

yes, it is a known effect of the BH p-value adjustment, not limma or voom.

Cheers,

- axel

ADD COMMENT • link 6.3 years ago Axel Klenk ★ 1.0k

0

Entering edit mode

Thanks Axel, so can I just use these adjusted p-values? Or can I fix it?

ADD REPLY • link 6.3 years ago b.nota ▴ 360

0

Entering edit mode

Yes, they are fine, you can use them. Keep in mind that they are false discovery rates not p-values.

ADD REPLY • link 6.3 years ago Axel Klenk ★ 1.0k

0

Entering edit mode

When I put them in a plot, it seems kind of weird, I got complaints from my fellow researchers. They should be adjusted p-values, right? That is something else than just false discovery rates.

ADD REPLY • link 6.3 years ago b.nota ▴ 360

0

Entering edit mode

Yep, that's why I usually use unadjusted p-values for plotting and ordering. W.r.t. p-value vs. FDR, see

?p.adjust

ADD REPLY • link 6.3 years ago Axel Klenk ★ 1.0k