3.5 years ago by

Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia

edgeR and DESeq use the same method: both call the p.adjust() function in the stats package with method "BH".

The FDR (aka adjusted p-value) given for each rank position in the topTags table is an upper bound for the expected FDR among the genes down to and including that position. One does not need to know the true state of the null hypotheses to compute the expected FDR. The actual (unknown) FDR in any specific situation can be greater or less than the FDR value given, but the theorem guarantees that on average it will not exceed it.

The BH adjusted p-values also have another, looser (unpublished) empirical Bayes interpretation. They give a conservative estimate of the posterior probability that the null hypothesis is true, supposing that one chooses at random either the specified gene or one above it in the top table. See:

http://www.slideshare.net/AustralianBioinformatics/the-value-of-p-values-gordon-smyth

**Background**: Benjamini and Hochberg (1995) defined the concept of FDR and created an algorithm to control the expected FDR below a specified level given a list of independent p-values. When I was writing the limma package in 2002, I re-interpreted Benjamini and Hochberg's algorithm in terms of adjusted p-values. Specifically I considered each gene to define a list of p-values, being the p-values for all genes above and including that gene in the toptable. Then I defined the adjusted p-value to be the smallest pre-specified FDR for which the gene list would be considered significant according to the BH algorithm. I contributed the code to the R project as part of the p.adjust function, and that's what we use now.
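To make the adjusted p-value definition concrete, here is a minimal sketch in plain Python of the computation as I understand it. It is not the R source of p.adjust; the function name `bh_adjust` and the example p-values are made up for illustration. For the gene at rank i out of m (rank 1 being the smallest p-value), the smallest FDR level at which BH would reject the top i genes is p·m/i, and the adjusted p-value is the minimum of that quantity over the gene's own rank and all ranks below it in the table.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Illustrative sketch only (hypothetical helper, not the R implementation).
    """
    m = len(pvalues)
    # Indices of the p-values, ordered from the largest p-value to the smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    for pos, idx in enumerate(order):
        rank = m - pos  # rank m for the largest p-value, 1 for the smallest
        # p * m / rank is the smallest FDR level at which BH rejects the top
        # `rank` genes; the running minimum enforces monotonicity down the table.
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for five genes, in arbitrary order.
pvals = [0.01, 0.04, 0.03, 0.005, 0.5]
print(bh_adjust(pvals))
```

For these five values the result is approximately 0.025, 0.05, 0.05, 0.025 and 0.5, in the input order. Calling the top genes significant whenever the adjusted p-value is at most 0.05 gives the same gene list as running the original BH procedure at an FDR level of 0.05.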