Hi.
I was wondering whether I can use dispersion estimates computed by other packages or methods in place of those computed by edgeR.
The reason I ask is that I still want to use the GLM machinery of edgeR, but with custom dispersions estimated for each gene.
Do you also need to supply an appropriate prior.df when supplying your own dispersion? I know at least some of the statistical tests make use of the prior.df, right?
That's a good question, and the answer is (generally) no. For example, if I'm specifying a dispersion because I don't have any replicates, then there's no point in specifying a prior d.f. value, because there's no gene-specific information to shrink towards a common value or trend.
In another example, I could specify a custom NB dispersion for use in the quasi-likelihood framework. In this case, the prior d.f. is automatically estimated for empirical Bayes shrinkage of the QL dispersions. So, no manual specification of the prior d.f. is necessary.
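To make the two cases above concrete, here is a minimal sketch of passing your own dispersions to the GLM functions. The simulated counts, the two-group design and the custom.disp vector are all made up for illustration; in practice custom.disp would hold whatever per-gene NB dispersions you computed elsewhere.

```r
library(edgeR)

set.seed(1)

# Simulated counts, for illustration only: 200 genes, 2 groups x 3 replicates.
ngenes <- 200
group <- factor(rep(c("A", "B"), each = 3))
counts <- matrix(rnbinom(ngenes * 6, mu = 50, size = 10), nrow = ngenes)
rownames(counts) <- paste0("gene", seq_len(ngenes))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)
design <- model.matrix(~group)

# Hypothetical per-gene NB dispersions obtained from another package or method.
# A constant value is used here purely as a placeholder.
custom.disp <- rep(0.1, ngenes)

# Likelihood ratio test: the supplied dispersions are used as-is,
# so no prior d.f. is involved.
fit <- glmFit(y, design, dispersion = custom.disp)
lrt <- glmLRT(fit, coef = 2)

# Quasi-likelihood framework: the supplied NB dispersions are held fixed,
# and the prior d.f. for the QL dispersions is estimated automatically.
qfit <- glmQLFit(y, design, dispersion = custom.disp)
qres <- glmQLFTest(qfit, coef = 2)

# Inspect the automatically estimated prior d.f.
summary(qfit$df.prior)
```

In both calls the dispersion argument overrides anything stored in the DGEList, so there is no need to run estimateDisp first if you only want your own values to be used.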
If I remember correctly, the only test which explicitly uses the prior d.f. is glmQLFTest. The others are only affected by the prior d.f. through shrinkage of the tagwise dispersions. That should be moot if you're going to specify your own dispersions.
glmLRT(test="F") seems to use the prior.df as well.
That's true. However, the last time I checked, the test="F" setting was fairly experimental. I wouldn't recommend using it routinely, especially in a complicated scenario where the dispersion estimates are being sourced from somewhere else.