Question

edgeR: exact test vs QL F-test

kaihami ▴ 40

Dear All,

Sorry if my question sounds silly...

I'm using edgeR for RNA-Seq analysis.

In my case I have just 2 conditions, control and treatment.

I am therefore wondering whether I should use the exact test (et), the QL F-test, or both. If I am not wrong, the latter is more conservative and leads to a reduced type I error rate.

So, which one is better for my design, statistically speaking?

Is there any (real) difference between them? Or, if I use the QL F-test with only two conditions, am I applying the wrong model?

Thanks in advance,

edger rnaseq • 9.4k views

ADD COMMENT • link updated 3.8 years ago by Sarah • 0 • written 9.6 years ago by kaihami ▴ 40

score 9 · Answer 1 · 2016-06-26

The QL framework provides more accurate type I error rate control, as it accounts for the uncertainty of the dispersion estimates. In contrast, the exact test assumes that the estimated dispersion is the true value, which can result in some inaccuracy. (The "exact" refers to the fact that the p-value is calculated exactly rather than relying on approximations; however, this only results in exact type I error control when the estimated and true dispersions are the same.) For this reason, I prefer using the QL methods whenever I apply edgeR.
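For a two-group comparison like the one in the question, both tests can be run on the same data object. The following is a minimal sketch, not a recommended pipeline: the count matrix is simulated and the sample layout (3 control, 3 treatment) is an assumption made purely for illustration.

```r
library(edgeR)

set.seed(1)
# Simulated counts (assumption): 1000 genes, 3 control + 3 treatment libraries
counts <- matrix(rnbinom(6000, mu = 50, size = 10), ncol = 6)
group  <- factor(rep(c("control", "treatment"), each = 3))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)
design <- model.matrix(~group)
y <- estimateDisp(y, design)

# Exact test: the estimated dispersion is treated as if it were the true value
et <- exactTest(y)

# QL F-test: uncertainty in the dispersion estimate is carried into the test
fit <- glmQLFit(y, design)
qlf <- glmQLFTest(fit, coef = 2)

topTags(et)
topTags(qlf)
```

With real data, `counts` and `group` would come from your own experiment, and low-count genes would usually be filtered before estimating dispersions.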

The QL methods (and GLM methods) are also more flexible with respect to the experimental design. For example, if you got a second batch of samples, all you would need to do in a GLM framework would be to change the design matrix, while the exact test methods can't handle an extra blocking factor for the batch.
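As a rough illustration of that flexibility, the sketch below adds a batch term to the design matrix and tests the treatment effect with the QL F-test. The data are simulated, and the factor names and levels (`batch` with `b1`/`b2`, `group` with `control`/`treatment`) are hypothetical.

```r
library(edgeR)

set.seed(2)
# Simulated counts (assumption): 1000 genes, 8 libraries in 2 batches
counts <- matrix(rnbinom(8000, mu = 50, size = 10), ncol = 8)
group  <- factor(rep(c("control", "treatment"), times = 4))
batch  <- factor(rep(c("b1", "b2"), each = 4))

# Batch enters as an additive blocking factor; only the design changes
design <- model.matrix(~batch + group)

y <- DGEList(counts = counts)
y <- calcNormFactors(y)
y <- estimateDisp(y, design)

fit <- glmQLFit(y, design)
# Test treatment vs control, adjusting for batch
qlf <- glmQLFTest(fit, coef = "grouptreatment")
topTags(qlf)
```

The rest of the workflow is the same as in the two-group case; the blocking factor is handled entirely through `design`.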

In summary, while both of the methods will work for your data set, the QL F-test is probably the better choice. There are some situations where the QL F-test doesn't work well - for example, if you don't have replicates, you'd have to supply a fixed dispersion, which defeats the whole point of modelling estimation uncertainty. Another situation is where the dispersions are very large and the counts are very small, whereby some of the approximations in the QL framework seem to fail. In such cases, I usually switch to the LRT rather than using the exact test, for the reasons of experimental flexibility that I mentioned above.
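For reference, switching from the QL F-test to the LRT only changes the fitting and testing calls. This is a minimal sketch on simulated data with replicates (the counts and group layout are assumptions), not a demonstration of the problem cases described above.

```r
library(edgeR)

set.seed(3)
# Simulated counts (assumption): 1000 genes, 3 control + 3 treatment libraries
counts <- matrix(rnbinom(6000, mu = 50, size = 10), ncol = 6)
group  <- factor(rep(c("control", "treatment"), each = 3))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)
design <- model.matrix(~group)
y <- estimateDisp(y, design)

# Negative binomial GLM fit followed by a likelihood ratio test
fit <- glmFit(y, design)
lrt <- glmLRT(fit, coef = 2)
topTags(lrt)
```

Because the LRT uses the same design matrix as the QL F-test, it keeps the flexibility for blocking factors that the exact test lacks.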

score 1 · Answer 2 · 2016-06-26

Gordon Smyth 53k

WEHI, Melbourne, Australia

This QL workflow article might be of interest: http://f1000research.com/articles/5-1438