edgeR multiple contrasts Vs. One test
1
1
Entering edit mode
Michael Breen ▴ 370
@michael-breen-5999
Last seen 7.3 years ago
Hi All, If we have an design for which we have 4 groups, lets call: 1.Control Untreated 2. Control Treated 3. Cases Untreated 4. Cases Treated. and we were interested in differences between: -treated and untreated for Control -treated and untreated for Cases -treated differences between cases and controls -untreated differences between cases and controls. -differences between treated and untreated. 5 tests in total. We can then use edgeR contrast function as something like this... contrasts <- makeContrasts( Case.TreatedvsUntreated = Case.Treated-Case.Untreated, Control.TreatedvsUntreated = Control.Treated-Control.Untreated, CasevsControl.Untreated = Case.Untreated-Control.Untreated, etc..... levels=design) This produces an appropriate rank order of significance for each contrast. However, what is the cost of having no correction for the fact that I just performed 5 tests on each gene instead of just 1 test?? Any insight? Yours, Michael [[alternative HTML version deleted]]
edgeR edgeR • 1.3k views
1
Entering edit mode
@gordon-smyth
Last seen 5 hours ago
WEHI, Melbourne, Australia

Dear Michael,

The default in edgeR is to adjust the p-value in order to control the expected false discovery rate (FDR). FDR is a scalable quantity in a numerical sense, so the fact that you control the FDR for each contrast separately is less important than if you were trying to control the type I error rate. What I mean by "scalable" is this: if the true FDR is below a given level for each of 5 contrasts separately, then the true FDR must automatically be below that level across all 5 contrasts together. That's just a numerical identity.

This doesn't mean that applying the Benjamini-Hochberg algorithm (BH) to each contrast separately is equivalent to applying the algorithm across all the p-values together. The BH algorithm controls the expected FDR below a given value, not the actual FDR. So if you apply BH to lots of contrasts, then the chance that the true FDR will be above the nominal level for one or more of the contrasts is quite high. But it does mean nevertheless that applying BH across each of the contrasts separately remains a relatively robust practice in terms of overall FDR control.

The situation would be different if you used adjust.method="holm". Holm's method controls the family-wise type I error rate, and the type I error rate does not scale over multiple contrasts.

Note also that applying the BH algorithm to all the contrasts simultaneously is not a panacea anyway. Controlling the expected FDR across all genes and all contrasts does not also control the expected FDR within each contrast separately when the percentage of DE genes differs between the contrasts. In particular, the expected FDR for individual contrasts with relatively DE genes can become overly high. Applying FDR globally to all contrasts is conservative for contrasts with many DE genes and liberal for those with few, even while it controls the expected FDR overall.

If you want to get fancy, the decideTests() function in the limma package provides a variety of ways to control either the FDR or the family-wise error rate within and between contrasts. However the default practice remains to apply FDR to each contrast separately, and I continue to recommend that for the majority of analyses.

Best wishes
Gordon

0
Entering edit mode

What method is used for FDR control in exactTest?

Thanks,
Naomi

0
Entering edit mode

Hi Naomi,

exactTest just computes p-values.  Just as in limma, adjustment for multiple testing is done at the collation stage, rather than at the time of constructing the test and the p-value.

To see the default for topTags:

   > args(topTags)
function (object, n = 10, adjust.method = "BH", sort.by = "PValue")
NULL

Regards
Gordon

0
Entering edit mode

Great to know!

So the FDR does scale across multiple contrasts.

Thank you!