Question

Can I use dispersions estimated by other methods in place of those computed by edgeR?

lwc628 ▴ 20

Hi.

I was wondering whether I can use dispersion estimates computed by other packages or methods in place of those computed by edgeR.

I ask because I still want to use the GLM machinery of edgeR, but with my own custom dispersion estimate for each gene.

edger • 810 views

updated 9.2 years ago by Aaron Lun ★ 28k • written 9.2 years ago by lwc628 ▴ 20

score 2 · Accepted Answer · 2015-02-11

Aaron Lun ★ 28k

Technically, yes. You can supply your own dispersions as a vector (one for each gene) through the dispersion argument in glmFit. The GLM-based steps will then be performed using these user-specified values.
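As a rough sketch of what that looks like in practice (not a recommendation): the counts below are simulated, and custom.disp is a hypothetical stand-in for per-gene dispersions obtained elsewhere. The only requirement assumed here is one positive value per gene, in the same order as the rows of the DGEList.

```r
library(edgeR)

set.seed(1)
ngenes <- 200
group <- factor(c("A", "A", "A", "B", "B", "B"))
design <- model.matrix(~ group)

# Simulated counts, purely for illustration.
counts <- matrix(rnbinom(ngenes * 6, mu = 50, size = 10), nrow = ngenes)
rownames(counts) <- paste0("gene", seq_len(ngenes))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)

# Hypothetical externally estimated dispersions: one value per gene,
# in the same order as the rows of y.
custom.disp <- runif(ngenes, min = 0.05, max = 0.2)

# Fit the GLMs with the user-supplied dispersions instead of edgeR's estimates.
fit <- glmFit(y, design, dispersion = custom.disp)

# Likelihood ratio test for the group effect (second design column).
lrt <- glmLRT(fit, coef = 2)
topTags(lrt)
```

Because the dispersions are passed directly to glmFit, none of the estimateDisp-style functions are called in this sketch.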

Whether this approach makes any sense, though, depends on what you're trying to do. The dispersion estimation functions in edgeR are fairly sophisticated and should perform well in most situations. For routine analyses, it'd take some effort to convince me that a dispersion estimate from another package is necessary.

Personally, I've only needed to specify the dispersion manually when there's no biological replication in my dataset.
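A minimal sketch of the no-replicate case, assuming one sample per group. The BCV of 0.4 is an assumed value chosen for illustration (the edgeR user's guide mentions values of this order for human data); it is not something estimated from the data, and the counts are simulated.

```r
library(edgeR)

set.seed(2)
ngenes <- 200
group <- factor(c("A", "B"))  # one sample per group, no replication
design <- model.matrix(~ group)

# Simulated counts, purely for illustration.
counts <- matrix(rnbinom(ngenes * 2, mu = 50, size = 10), nrow = ngenes)
rownames(counts) <- paste0("gene", seq_len(ngenes))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)

# Assumed biological coefficient of variation; the dispersion is its square.
bcv <- 0.4

# A single value is used as a common dispersion for all genes.
fit <- glmFit(y, design, dispersion = bcv^2)
lrt <- glmLRT(fit, coef = 2)
topTags(lrt)
```

The results depend entirely on the chosen value, so it is worth rerunning with a few plausible BCVs to see how sensitive the gene list is.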

9.2 years ago • Aaron Lun ★ 28k

Do you also need to supply an appropriate prior.df when supplying your own dispersion? I know at least some of the statistical tests make use of the prior.df, right?

9.2 years ago • Ryan C. Thompson ★ 7.9k

That's a good question, and the answer is (generally) no. For example, if I'm specifying a dispersion because I don't have any replicates, then there's no point in specifying a prior d.f. value, because there's no gene-specific information to shrink towards a common value or trend.

In another example, I could specify a custom NB dispersion for use in the quasi-likelihood framework. In this case, the prior d.f. is automatically estimated for empirical Bayes shrinkage of the QL dispersions. So, no manual specification of the prior d.f. is necessary.
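A sketch of that quasi-likelihood route, under the same caveats: simulated counts, and custom.disp as a hypothetical stand-in for NB dispersions obtained elsewhere. Note that no prior d.f. is passed in anywhere; glmQLFit estimates it and stores it in the fit.

```r
library(edgeR)

set.seed(3)
ngenes <- 200
group <- factor(c("A", "A", "A", "B", "B", "B"))
design <- model.matrix(~ group)

# Simulated counts, purely for illustration.
counts <- matrix(rnbinom(ngenes * 6, mu = 50, size = 10), nrow = ngenes)
rownames(counts) <- paste0("gene", seq_len(ngenes))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)

# Hypothetical externally estimated NB dispersions, one per gene.
custom.disp <- runif(ngenes, min = 0.05, max = 0.2)

# The NB dispersions are supplied by the user; the QL dispersions and their
# prior d.f. are estimated by glmQLFit via empirical Bayes shrinkage.
fit <- glmQLFit(y, design, dispersion = custom.disp)
fit$df.prior  # prior d.f. estimated from the data

# Quasi-likelihood F-test for the group effect.
qlf <- glmQLFTest(fit, coef = 2)
topTags(qlf)
```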

If I remember correctly, the only test that explicitly uses the prior d.f. is glmQLFTest. The others are only affected by the prior d.f. through shrinkage of the tagwise dispersions, which is moot if you're going to specify your own dispersions.

9.2 years ago • Aaron Lun ★ 28k

glmLRT(test="F") seems to use the prior.df as well.

9.2 years ago • Ryan C. Thompson ★ 7.9k

That's true. However, the last time I checked, the test="F" setting was fairly experimental. I wouldn't recommend using it routinely, especially in a complicated scenario where the dispersion estimates are sourced from somewhere else.

9.2 years ago • Aaron Lun ★ 28k