Dear list,
I was converting the p-values of my DEG list to adjusted p-values
(using Benjamini-Hochberg, via topTable from the limma package) and
found that many adjusted p-values have exactly the same value.
Is this normal and what is the reason for this observation?
thanks in advance,
cheers,
Bas
-- output of sessionInfo():
R version 2.15.3 (2013-03-01)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=C LC_NAME=C
LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] limma_3.14.4 affy_1.36.1 Biobase_2.18.0
BiocGenerics_0.4.0
loaded via a namespace (and not attached):
[1] affyio_1.26.0 BiocInstaller_1.8.3 preprocessCore_1.20.0
tools_2.15.3 zlibbioc_1.4.0
hi Bas,
This is a direct consequence of the method of Benjamini and Hochberg. The paper is quite accessible and worth a look:
http://www.jstor.org/stable/2346101
See formula (1) under "False Discovery Rate Controlling Procedure".
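The procedure finds the largest k such that p_k <= (k/m) q, where p_1 <= ... <= p_m are the sorted p-values, and then rejects all of p_1, ..., p_k at FDR q, including those that lie above the line (i/m) q. The adjusted p-value of a gene is the smallest q at which it is rejected, so every p_i that is rejected only because of a later p_k inherits the adjusted p-value of p_k, which produces the ties. For a visualization of this formula, take a few sorted p-values, with the adjusted p-value written above each point: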
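p <- sort(c(.01, .2, .21, .22, .5, .51, .52, .8, .9))
padj <- p.adjust(p, method = "BH")
# sorted p-values plotted against their rank
plot(p, ylim = c(0, 1), xlim = c(0, length(p)))
# label each point with its adjusted p-value
text(seq_along(p), p, round(padj, 3), pos = 3)
# BH line through the origin with slope q/m, where q = padj[4]
abline(0, padj[4] / length(p))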
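Here k = 4, m = length(p), and q = padj[4]. The line has slope q/m and passes through the 4th point. Points 2 and 3 lie above the line but are still rejected at q because point 4 lies on it, so all three share the same adjusted p-value.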
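To see where the ties come from numerically, here is a minimal sketch that computes the BH adjustment by hand and compares it with p.adjust. The function name bh_adjust is made up for this illustration and is not part of limma or base R; it is only meant to mirror the step-up logic described above.

```r
# Hand-rolled BH adjustment, for illustration only.
# bh_adjust is a hypothetical helper, not a limma or base R function.
bh_adjust <- function(p) {
  m  <- length(p)
  o  <- order(p, decreasing = TRUE)  # indices from largest to smallest p-value
  ro <- order(o)                     # permutation that restores the input order
  raw <- p[o] * m / (m:1)            # m * p_i / i for ranks i = m, ..., 1
  # Running minimum from the largest p-value downwards: a p-value can never
  # get a larger adjusted value than one ranked after it. This step is what
  # creates runs of identical adjusted p-values.
  pmin(1, cummin(raw))[ro]
}

p    <- sort(c(.01, .2, .21, .22, .5, .51, .52, .8, .9))
padj <- bh_adjust(p)

# Compare with the built-in adjustment
all.equal(padj, p.adjust(p, method = "BH"))

# Count how many p-values share each adjusted value
table(round(padj, 3))
```

Looking at m * p / seq_along(p) for the sorted p-values before the running minimum is applied shows which entries get pulled down to the value of a later, better-ranked neighbour.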
Mike