Question

BH vs BY in p.adjust + FDR interpretation

0

Entering edit mode

lemerle@embl.de ▴ 50

@lemerleemblde-1669

Last seen 9.7 years ago

dear list, i am analysing a set of affymetrix chips using limma to get lists of differentially expressed probesets. concerning BH and BY, the two methods available with p.adjust that adjust p-values controlling for the false discovery rate, i was surprised to find a large discrepancy in the number of differentially expressed genes in the case of one contrast of interest: > fit <- lmFit(eset.f,design) > fit2 <- contrasts.fit(fit,c.matn) > fit2 <- eBayes(fit2) # adjustment for multiple testing of the set of p-values derived from the first contrast: > p.value <- fit2$p.value[,1] > bh <- p.adjust(p.value, method = "BH") > by <- p.adjust(p.value, method = "BY") # number of differentially expressed genes below the cutoff value of 0.01 for the 2 methods > length(which(bh<0.01)) [1] 1032 > length(which(by<0.01)) [1] 519 # size of the original set: 3549 probesets (with 15 observations for each) > dim(exprs(eset.f)) [1] 3549 15 i don't think there is a bug anywhere (but i'd gladly send you my data to reproduce this), i'm more inclined to think that i know too little to make sense of this at this stage.... could this difference be due to the fact that these numbers correspond to a large fraction of the whole set tested and that the two methods behave differently with respect to this? if estimates of false positives at this cutoff were reliable with both methods, wouldn't this imply that BH gives a far better coverage of the set of truly differentially expressed genes than BY (TP/(TP+FN)), for the same control over false discovery (here, 1%)? i've seen mentioned but only vaguely that the magnitude of change is an issue. that the methods work best when it is low.... if you think that may be it, could you point me to something i might read to understand this better? a related question i have is about the interpretation of FDR values close to 1, in such cases where there are many changes, ie when the fraction of non-differentially expressed genes in the whole set is far below 1... i've come to understand that methods controlling fdr substitute the p-value of each gene for a value that corresponds to the minimum cut-off to apply in order for that gene to be called positive, with the added info that such a set obtained by applying this minimum cutoff can be expected to contain that percentage of false positives. but as the fdr rate ranges from 0 to 1, the chosen cutoff can fall above the true percentage of non-differentially expressed sets and i don't see how such an estimate can still be reliable up to 1 if the true number of false positives is far lower.... unless it represents only an upper bound of how many false positives to expect... is this the case ? thanks for any comment, i'm confused about this.... caroline -- Caroline Lemerle PhD student, lab of Luis Serrano Structural Biology and Biocomputing Dept EMBL, Heidelberg, Germany tel 00-49-6221-387-8335 --

Coverage limma Coverage limma • 2.4k views

ADD COMMENT • link updated 8.3 years ago by Gordon Smyth 50k • written 17.8 years ago by lemerle@embl.de ▴ 50

score 0 · Answer 1 · 2006-07-28

Hi Caroline, what does hist(fit2$p.value[,1], breaks=100, col="orange") look like? Assuming that the p-values are uniform distributed under the null hypothesis (which they should be if they are worth their name), then the shape of this distribution can give you an idea of the proportion of differential vs non-differential probesets. Best wishes Wolfgang -- ------------------------------------------------------------------ Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber > dear list, > > i am analysing a set of affymetrix chips using limma to get lists of > differentially expressed probesets. > concerning BH and BY, the two methods available with p.adjust that adjust > p-values controlling for the false discovery rate, i was surprised to find a > large discrepancy in the number of differentially expressed genes in the case > of one contrast of interest: > >> fit <- lmFit(eset.f,design) >> fit2 <- contrasts.fit(fit,c.matn) >> fit2 <- eBayes(fit2) > > # adjustment for multiple testing of the set of p-values derived from the first > contrast: > >> p.value <- fit2$p.value[,1] >> bh <- p.adjust(p.value, method = "BH") >> by <- p.adjust(p.value, method = "BY") > > # number of differentially expressed genes below the cutoff value of 0.01 for > the 2 methods > >> length(which(bh<0.01)) > > [1] 1032 > >> length(which(by<0.01)) > > [1] 519 > > # size of the original set: 3549 probesets (with 15 observations for each) > >> dim(exprs(eset.f)) > > [1] 3549 15 > > i don't think there is a bug anywhere (but i'd gladly send you my data to > reproduce this), i'm more inclined to think that i know too little to make > sense of this at this stage.... > > could this difference be due to the fact that these numbers correspond to a > large fraction of the whole set tested and that the two methods behave > differently with respect to this? > > if estimates of false positives at this cutoff were reliable with both methods, > wouldn't this imply that BH gives a far better coverage of the set of truly > differentially expressed genes than BY (TP/(TP+FN)), for the same control over > false discovery (here, 1%)? > > i've seen mentioned but only vaguely that the magnitude of change is an issue. > that the methods work best when it is low.... if you think that may be it, > could you point me to something i might read to understand this better? > > a related question i have is about the interpretation of FDR values close to 1, > in such cases where there are many changes, ie when the fraction of > non-differentially expressed genes in the whole set is far below 1... > > i've come to understand that methods controlling fdr substitute the p-value of > each gene for a value that corresponds to the minimum cut-off to apply in order > for that gene to be called positive, with the added info that such a set > obtained by applying this minimum cutoff can be expected to contain that > percentage of false positives. > but as the fdr rate ranges from 0 to 1, the chosen cutoff can fall above the > true percentage of non-differentially expressed sets and i don't see how such > an estimate can still be reliable up to 1 if the true number of false positives > is far lower.... unless it represents only an upper bound of how many false > positives to expect... is this the case ? > > thanks for any comment, i'm confused about this.... > > caroline >

score 0 · Answer 2 · 2016-01-19

0

Entering edit mode

Gordon Smyth 50k

@gordon-smyth

Last seen 7 hours ago

WEHI, Melbourne, Australia

An answer was posted here: BH vs BY in p.adjust + FDR interpretation

ADD COMMENT • link 8.3 years ago Gordon Smyth 50k