Can I use dispersions estimated by other methods in place of those computed by edgeR?
lwc628 ▴ 20

Hi. 

I was wondering whether I can use dispersion estimates computed by other packages or methods in place of those computed by edgeR.

I ask because I still want to use the GLM machinery of edgeR, but with custom dispersion estimates for each gene.

 

 

Aaron Lun ★ 28k

Technically, yes. You can supply your own dispersions as a vector (one for each gene) through the dispersion argument in glmFit. The GLM-based steps will then be performed using these user-specified values.
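As a minimal sketch of what that looks like in practice (the simulated counts and the my.disp vector below are placeholders I made up for illustration, not output from any real package):

```r
library(edgeR)

set.seed(1)
ngenes <- 200

# Simulated counts: 2 groups x 3 replicates (illustrative data only)
counts <- matrix(rnbinom(ngenes * 6, mu = 50, size = 10), nrow = ngenes)
rownames(counts) <- paste0("gene", seq_len(ngenes))
group <- factor(rep(c("A", "B"), each = 3))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)
design <- model.matrix(~ group)

# Hypothetical per-gene dispersions obtained elsewhere:
# one value per row of y, in the same gene order
my.disp <- runif(ngenes, min = 0.05, max = 0.2)
stopifnot(length(my.disp) == nrow(y))

# Fit the GLMs with the user-supplied dispersions, then test the group effect
fit <- glmFit(y, design, dispersion = my.disp)
lrt <- glmLRT(fit, coef = 2)
topTags(lrt)
```

The main thing to watch is that the dispersion vector is in the same gene order as the rows of the DGEList, since glmFit has no way of checking that for you.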

Whether this approach makes any sense, though, depends on what you're trying to do. The dispersion estimation functions in edgeR are fairly sophisticated and should perform well in most situations. For routine analyses, it'd take some effort to convince me that a dispersion estimate from another package is necessary.

Personally, I've only needed to specify the dispersion manually when there's no biological replication in my dataset.
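For the no-replicate case, a rough sketch might look like the following. The BCV of 0.2 is an arbitrary placeholder, not a recommendation; you would need to pick a value that is plausible for your own system, and the counts are simulated.

```r
library(edgeR)

set.seed(2)
ngenes <- 200

# Simulated counts: one sample per group, so no replication (illustrative only)
counts <- matrix(rnbinom(ngenes * 2, mu = 50, size = 10), nrow = ngenes)
rownames(counts) <- paste0("gene", seq_len(ngenes))
group <- factor(c("A", "B"))

y <- DGEList(counts = counts, group = group)
design <- model.matrix(~ group)

# Assumed biological coefficient of variation (placeholder value);
# the NB dispersion is the square of the BCV
bcv <- 0.2

# A single value is used as a common dispersion for all genes
fit <- glmFit(y, design, dispersion = bcv^2)
lrt <- glmLRT(fit, coef = 2)
topTags(lrt)
```

The resulting p-values are only as good as the assumed dispersion, so they are best treated as a rough ranking rather than a rigorous test.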


Do you also need to supply an appropriate prior.df when supplying your own dispersion? I know at least some of the statistical tests make use of the prior.df, right?


That's a good question, and the answer is (generally) no. For example, if I'm specifying a dispersion because I don't have any replicates, then there's no point in specifying a prior d.f. value, because there's no gene-specific information to shrink towards a common value or trend.

In another example, I could specify a custom NB dispersion for use in the quasi-likelihood framework. In this case, the prior d.f. is automatically estimated for empirical Bayes shrinkage of the QL dispersions. So, no manual specification of the prior d.f. is necessary.
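A sketch of that second case, assuming simulated counts and a made-up my.disp vector standing in for dispersions from another method:

```r
library(edgeR)

set.seed(3)
ngenes <- 200

# Simulated counts: 2 groups x 3 replicates (illustrative data only)
counts <- matrix(rnbinom(ngenes * 6, mu = 50, size = 10), nrow = ngenes)
rownames(counts) <- paste0("gene", seq_len(ngenes))
group <- factor(rep(c("A", "B"), each = 3))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)
design <- model.matrix(~ group)

# Hypothetical custom NB dispersions, one per gene
my.disp <- runif(ngenes, min = 0.05, max = 0.2)

# The NB dispersions are supplied by the user; the QL dispersions and
# their prior d.f. are estimated from the data by glmQLFit
fit <- glmQLFit(y, design, dispersion = my.disp)
fit$df.prior  # estimated automatically, not set by hand

res <- glmQLFTest(fit, coef = 2)
topTags(res)
```

Note that no prior d.f. appears anywhere in the call; it only shows up in the fitted object after the empirical Bayes step.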

If I remember correctly, the only test which explicitly uses the prior d.f. is glmQLFTest. The others are only affected by the prior d.f. through shrinkage of the tagwise dispersions. That should be moot if you're going to specify your own dispersions.


glmLRT(test="F") seems to use the prior.df as well.


That's true. However, the last time I checked, the test="F" setting was fairly experimental. I wouldn't recommend using it routinely, especially in a complicated scenario where the dispersion estimates are being sourced from somewhere else.
