p.adjust BH generates duplicate values
David Young
@david-young-4857
Hi all,

I was doing an RMA->limma (ebayes) analysis of an affymetrix mouse 430a experiment and noticed that while the p-values listed in toptable were all different, the adjusted p-values (adjust="BH") contained duplicate values. I don't think this is incorrect necessarily, but I was wondering why a different alpha wasn't generated for each gene.

From what I understand, the BH method gets the adjusted p-value (alpha) from [P_k*n*c(n)] / k < alpha, where n = total number of genes (tests), P_k = p-value at kth gene (genes ordered from low to high p-value), and k = number of genes with p-value less than or equal to P_k. I'm not entirely sure how the c(n) (dependence correction) part works, but it seems like a unique adjusted p-value (alpha) could be generated for each gene.

Instead I get:

> top<-topTable(efit, adjust="BH", n=nrow(exprs(rmadata)))
> write.table(top, "output.xls", sep="\t")

from output.xls...

ID          adj.P.Val    P.Value
Mm.277921   0.039259664  3.17E-06
Mm.272646   0.050424143  9.93E-06
Mm.148886   0.050424143  1.64E-05
Mm.235998   0.050424143  2.02E-05
Mm.4598     0.050424143  2.04E-05
Mm.10728    0.101013086  4.89E-05
Mm.162744   0.106930684  6.34E-05
Mm.247564   0.106930684  6.91E-05
Mm.269384   0.115716969  8.62E-05
Mm.212428   0.115716969  9.34E-05
Mm.457989   0.118548889  0.000126578
Mm.154662   0.118548889  0.000128005
Mm.21005    0.118548889  0.000133975
Mm.5109     0.149489879  0.000196053
Mm.207432   0.149489879  0.00020444

Does anyone know why several probesets have the same adjusted p value even though the regular p value is different for each gene? I'm 90% sure this is just my ignorance about the BH method, but I'll be very thankful to anyone who can point me in the right direction.
Thanks in advance,
Dave Young

> sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] limma_3.8.3              mouse430a2mmugcdf_14.1.0 simpleaffy_2.28.0        gcrma_2.24.1
[5] genefilter_1.34.0        affy_1.30.0              Biobase_2.12.2

loaded via a namespace (and not attached):
[1] affyio_1.20.0         annotate_1.30.1       AnnotationDbi_1.14.1  Biostrings_2.20.3
[5] DBI_0.2-5             IRanges_1.10.6        preprocessCore_1.14.0 RSQLite_0.9-4
[9] splines_2.13.1        survival_2.36-9       tools_2.13.1          xtable_1.5-6
@gordon-smyth
WEHI, Melbourne, Australia

Dear David,

You have described the first step of the BH algorithm. However there is a second step which ensures that the adjusted p-values are monotonic in the original p-values. It is this second step that sometimes causes a series of genes to get the same adjusted p-value. This occurs whenever the first-step adjusted p-value for a less significant gene is lower than that for a more significant gene.

Best wishes
Gordon
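
The two steps described in the answer above can be illustrated with a minimal sketch in base R. The p-values below are made up for illustration and are not the poster's data; the variable names (step1, step2, bh) are likewise just for this example. For method "BH" the c(n) factor mentioned in the question is 1; the harmonic-sum factor belongs to the separate "BY" method of p.adjust. The sketch assumes no missing p-values.

```r
# Illustrative raw p-values (invented for this sketch, all distinct)
p <- c(0.001, 0.008, 0.009, 0.010, 0.040, 0.200)
n <- length(p)

# Step 1: sort the p-values and scale the k-th smallest by n / k
o <- order(p)
step1 <- p[o] * n / seq_len(n)

# Step 2: enforce monotonicity. Each adjusted value becomes the minimum of
# its own step-1 value and all step-1 values of less significant tests
# (a running minimum taken from the largest p-value downwards), capped at 1.
step2 <- pmin(1, rev(cummin(rev(step1))))

# Put the adjusted values back in the original order of p
bh <- numeric(n)
bh[o] <- step2

# Side-by-side view: ties appear wherever step 1 was not already monotonic
print(data.frame(p = p[o], step1 = step1, step2 = step2))

# Check against the built-in implementation
print(all.equal(bh, p.adjust(p, method = "BH")))
```

In this example the step-1 values for the second and third p-values are larger than the step-1 value for the fourth, so the running minimum pulls all three down to the same adjusted value, even though the raw p-values differ.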
