How does edgeR estimate the common dispersion?

In the paper Small-sample estimation of negative binomial dispersion, with applications to SAGE data, section 4.1 Conditional maximum likelihood:

For RNA sequencing data, assume the counts for a single tag across n libraries are negative binomial random variables. Consider $$Y_1, \cdots , Y_n$$ as independent and $$NB(\mu_i = m_i \lambda, \phi)$$, where $$m_i$$ is the library size (i.e. the total number of tags sequenced for library i) and $$\lambda$$ is the proportion of the library made up of that particular tag. The probability mass function is:

$$f(y_i;\mu,\phi)=P(Y=y_i)=\frac{\Gamma(\phi^{-1}+y_i)}{\Gamma(\phi^{-1})\Gamma(y_i+1)}\left(\frac{1}{\phi^{-1}\mu^{-1}+1}\right)^{y_i}\left(\frac{1}{\phi\mu+1}\right)^{\phi^{-1}} \tag 1$$

$$\text{E}(Y)=\mu$$ and $$\text{Var}(Y)=\mu+\phi\mu^2$$
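
As a quick sanity check (this numeric sketch is mine, not from the paper; the values of phi, mu and y are arbitrary), the pmf in (1) is the same negative binomial that base R's dnbinom() gives with size = 1/ϕ and mu = μ, and the mean/variance relation can be checked by simulation:

```r
## Minimal check that equation (1) is dnbinom() with size = 1/phi, mu = mu.
phi <- 0.2; mu <- 50; y <- 0:5

f1 <- gamma(1/phi + y) / (gamma(1/phi) * gamma(y + 1)) *   # pmf exactly as in (1)
      (1 / (1/(phi * mu) + 1))^y *
      (1 / (phi * mu + 1))^(1/phi)
f2 <- dnbinom(y, size = 1/phi, mu = mu)                    # built-in NB density
all.equal(f1, f2)                                          # TRUE

x <- rnbinom(1e6, size = 1/phi, mu = mu)
c(mean(x), mu)                 # ~ mu
c(var(x),  mu + phi * mu^2)    # ~ mu + phi * mu^2
```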

If all libraries are the same size (i.e. $$m_i \equiv m$$), the sum $$Z = Y_1 + \cdots + Y_n \sim NB(nm\lambda, \phi n^{-1})$$.
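
A small simulation sketch (my own, with made-up values for n, m, λ and ϕ) illustrates this convolution result: the first two moments of the simulated sum match those of $$NB(nm\lambda, \phi n^{-1})$$.

```r
## Sum of n equal-library NB counts vs. a single NB(n*m*lambda, phi/n).
## All parameter values are illustrative.
set.seed(1)
n <- 4; m <- 1e6; lambda <- 5e-5; phi <- 0.1
mu <- m * lambda                                   # per-library mean = 50

z <- replicate(2e5, sum(rnbinom(n, size = 1/phi, mu = mu)))

c(mean(z), n * mu)                                 # means agree
c(var(z),  n * mu + (phi / n) * (n * mu)^2)        # variances agree
```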

If we write out the log-likelihood, dropping terms that don't involve ϕ, we get:

$$\sum\log \Gamma(y_i+\phi^{-1}) - n \log\Gamma(\phi^{-1}) + \sum y_i\log\left({\phi\mu \over 1 + \phi\mu}\right) + n\phi^{-1}\log\left({1 \over 1 + \phi\mu}\right)$$

Rewriting the log terms (using $$\log\frac{\phi\mu}{1+\phi\mu} = \log\phi\mu - \log(1+\phi\mu)$$ and $$\log\frac{1}{1+\phi\mu} = -\log(1+\phi\mu)$$) and rearranging gives us:

$$ \sum\log \Gamma(y_i+\phi^{-1}) - n \log\Gamma(\phi^{-1}) + \sum y_i\log\phi\mu - \left(\sum y_i+n\phi^{-1}\right)\log(1+\phi\mu)$$

Some more rearranging:

$$\sum\log \Gamma(y_i+\phi^{-1}) - n \log\Gamma(\phi^{-1}) + z\log\phi\mu -(z+n\phi^{-1})\log(1+\phi\mu)$$

Substituting $$\bar{y} = (1/n)\sum y_i$$ for μ (it is the MLE of μ, and this substitution is what lets us eliminate λ, which is hidden inside μ) gives the one-dimensional optimization problem:

$$\max_{\phi} \left[\sum\log \Gamma(y_i+\phi^{-1}) - n \log\Gamma(\phi^{-1}) + n\bar{y}\log\phi\bar{y} -n(\bar{y}+\phi^{-1})\log(1+\phi\bar{y})\right]$$
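
In case it helps, here is a minimal sketch of that one-dimensional maximization in plain R (my own code, not edgeR's implementation; the counts and search interval are hypothetical). lgamma() keeps the Gamma terms numerically stable and optimize() does the 1-D search:

```r
## Profile log-likelihood in phi after substituting ybar for mu,
## i.e. exactly the objective written above (not edgeR code).
prof_loglik <- function(phi, y) {
  n    <- length(y)
  ybar <- mean(y)
  sum(lgamma(y + 1/phi)) - n * lgamma(1/phi) +
    n * ybar * log(phi * ybar) -
    n * (ybar + 1/phi) * log(1 + phi * ybar)
}

y <- c(12, 20, 15, 25, 18)    # hypothetical counts for one tag
optimize(prof_loglik, interval = c(1e-4, 10), y = y, maximum = TRUE)$maximum
```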

How do we derive the conditional maximum likelihood for ϕ that is given in the paper, namely:

$$l_{Y|Z=z}(\phi)=\left[\sum_{i=1}^{n}\log\Gamma(y_i+\phi^{-1})\right]+\log\Gamma(n\phi^{-1})-\log\Gamma(z+n\phi^{-1})-n\log\Gamma(\phi^{-1})$$

I'll try to answer this question. Based on the definition of conditional likelihood in https://rss.onlinelibrary.wiley.com/doi/epdf/10.1111/j.2517-6161.1996.tb02101.x, the conditional log-likelihood of ϕ for Y given Z = z, dropping terms that don't involve ϕ, is: $$l_{Y|Z=z}(\mathbf y; \phi)=\log f(\mathbf y;\mu,\phi)-\log f_{Z}(z;n\mu,n^{-1}\phi)$$

$$=\sum\log \Gamma(y_i+\phi^{-1}) - n \log\Gamma(\phi^{-1}) + \sum y_i\log\left({\phi\mu \over 1 + \phi\mu}\right) + n\phi^{-1}\log\left({1 \over 1 + \phi\mu}\right) $$

$$-\log\Gamma(z+n\phi^{-1})+\log\Gamma(n\phi^{-1})-z\log\left({\phi\mu \over 1 + \phi\mu}\right) - n\phi^{-1}\log\left({1 \over 1 + \phi\mu}\right)$$

Since $$z = \sum y_i$$, the terms involving μ cancel, leaving:

$$=\left[\sum_{i=1}^{n}\log\Gamma(y_i+\phi^{-1})\right]+\log\Gamma(n\phi^{-1})-\log\Gamma(z+n\phi^{-1})-n\log\Gamma(\phi^{-1})$$
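
To make the result concrete, here is a small sketch (again my own code, not edgeR's internals; the counts are illustrative) that evaluates this conditional log-likelihood and maximizes it over ϕ for one tag with equal library sizes. As I understand it, edgeR's estimateCommonDisp() maximizes this quantity summed over all tags (after adjusting for unequal library sizes) to obtain the common dispersion; the single-tag version below is just an illustration:

```r
## Conditional log-likelihood of phi given Z = z, as derived above.
## (Hypothetical helper; not part of edgeR.)
cond_loglik <- function(phi, y) {
  n <- length(y)
  z <- sum(y)
  sum(lgamma(y + 1/phi)) + lgamma(n / phi) -
    lgamma(z + n / phi) - n * lgamma(1/phi)
}

y <- c(12, 20, 15, 25, 18)    # counts for one tag, equal library sizes
optimize(cond_loglik, interval = c(1e-4, 10), y = y, maximum = TRUE)$maximum
```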
