In the paper "Small-sample estimation of negative binomial dispersion, with applications to SAGE data", Section 4.1 (Conditional maximum likelihood):
For RNA sequencing data, assume the counts for a single tag across n libraries are negative binomial random variables. Let $$Y_1, \cdots , Y_n$$ be independent with $$Y_i \sim NB(\mu_i = m_i \lambda, \phi)$$, where $$m_i$$ is the library size (i.e. the total number of tags sequenced for library i) and λ is the proportion of the library made up of that particular tag. The probability mass function is:
$$f(y_i;\mu,\phi)=P(Y=y_i)=\frac{\Gamma(\phi^{-1}+y_i)}{\Gamma(\phi^{-1})\Gamma(y_i+1)}\left(\frac{1}{\phi^{-1}\mu^{-1}+1}\right)^{y_i}\left(\frac{1}{\phi\mu+1}\right)^{\phi^{-1}} \tag 1$$
$$\text{E}(Y)=\mu$$ and $$\text{Var}(Y)=\mu+\phi\mu^2$$
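As a sanity check on this parameterization, here is a minimal sketch that codes equation (1) on the log scale and sums it numerically to recover the stated mean and variance. The function names, the truncation point `y_max`, and the values of `mu` and `phi` are illustrative assumptions, not something taken from the paper.

```python
import math

def nb_logpmf(y, mu, phi):
    """Log of the pmf in equation (1): mean mu, dispersion phi."""
    r = 1.0 / phi  # phi^{-1}
    return (math.lgamma(r + y) - math.lgamma(r) - math.lgamma(y + 1)
            + y * math.log(phi * mu / (1.0 + phi * mu))
            - r * math.log(1.0 + phi * mu))

def nb_moments(mu, phi, y_max=2000):
    """Total mass, mean and variance from summing the pmf over 0..y_max."""
    probs = [math.exp(nb_logpmf(y, mu, phi)) for y in range(y_max + 1)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var

mu, phi = 5.0, 0.4  # arbitrary illustrative values
total, mean, var = nb_moments(mu, phi)
# Compare the numerical moments with mu and mu + phi * mu^2.
print(total, mean, var, mu + phi * mu ** 2)
```

The truncated sum is only an approximation, but for moderate `mu` and `phi` the tail beyond `y_max` is negligible.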
If all libraries are the same size (i.e. $$m_i \equiv m$$), then the sum $$Z = Y_1 + \cdots + Y_n \sim NB(nm\lambda, \phi n^{-1})$$.
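The distribution of the sum can be checked numerically. The sketch below convolves the single-library pmf n times and compares the result with the closed form, writing `mu` for the common mean mλ. All names and parameter values are illustrative assumptions.

```python
import math

def nb_pmf(y, mu, phi):
    """Equation (1): NB pmf with mean mu and dispersion phi."""
    r = 1.0 / phi
    return math.exp(math.lgamma(r + y) - math.lgamma(r) - math.lgamma(y + 1)
                    + y * math.log(phi * mu / (1.0 + phi * mu))
                    - r * math.log(1.0 + phi * mu))

def pmf_of_sum(n, mu, phi, z_max):
    """pmf of Y_1 + ... + Y_n on 0..z_max by repeated convolution."""
    single = [nb_pmf(y, mu, phi) for y in range(z_max + 1)]
    dist = [1.0] + [0.0] * z_max  # point mass at 0 (sum of zero variables)
    for _ in range(n):
        dist = [sum(dist[k] * single[z - k] for k in range(z + 1))
                for z in range(z_max + 1)]
    return dist

n, mu, phi, z_max = 3, 2.0, 0.5, 20  # arbitrary illustrative values
by_convolution = pmf_of_sum(n, mu, phi, z_max)
closed_form = [nb_pmf(z, n * mu, phi / n) for z in range(z_max + 1)]
max_abs_diff = max(abs(a - b) for a, b in zip(by_convolution, closed_form))
print(max_abs_diff)
```

Because the counts are non-negative, the convolution on 0..`z_max` needs only pmf values on the same range, so no truncation error enters the comparison.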
If we write out the log-likelihood for a common mean μ (equal library sizes), dropping terms that don't involve ϕ, we get:
$$\sum\log \Gamma(y_i+\phi^{-1}) - n \log\Gamma(\phi^{-1}) + \sum y_i\log\left({\phi\mu \over 1 + \phi\mu}\right) + n\phi^{-1}\log\left({1 \over 1 + \phi\mu}\right)$$
Expanding the log terms and rearranging gives us:
$$\sum\log \Gamma(y_i+\phi^{-1}) - n \log\Gamma(\phi^{-1}) + \sum y_i\log(\phi\mu) - \left(\sum y_i+n\phi^{-1}\right)\log(1+\phi\mu)$$
Writing $$z = \sum y_i$$ for the observed total:
$$\sum\log \Gamma(y_i+\phi^{-1}) - n \log\Gamma(\phi^{-1}) + z\log(\phi\mu) -(z+n\phi^{-1})\log(1+\phi\mu)$$
Substituting $$\bar{y} = (1/n)\sum y_i$$ for μ (it is the MLE of μ, and the substitution eliminates λ, which is hidden inside μ) gives us the one-dimensional optimization problem:
$$\max_{\phi} \left[\sum\log \Gamma(y_i+\phi^{-1}) - n \log\Gamma(\phi^{-1}) + n\bar{y}\log(\phi\bar{y}) -n(\bar{y}+\phi^{-1})\log(1+\phi\bar{y})\right]$$
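To make the one-dimensional problem concrete, here is a rough sketch that evaluates this objective on a log-spaced grid of ϕ values and picks the maximizer. The counts in `y` are made up for illustration, and the grid search is a simple stand-in for a proper optimizer. It assumes the sample mean is positive and the data are overdispersed, so that the maximum is interior.

```python
import math

def profile_loglik(phi, y):
    """Objective above: log-likelihood in phi with mu replaced by ybar."""
    n = len(y)
    ybar = sum(y) / n  # must be > 0 for the log terms to be defined
    r = 1.0 / phi
    return (sum(math.lgamma(yi + r) for yi in y) - n * math.lgamma(r)
            + n * ybar * math.log(phi * ybar)
            - n * (ybar + r) * math.log(1.0 + phi * ybar))

y = [0, 3, 12, 1, 25, 7]  # made-up overdispersed counts
# Log-spaced grid from 0.001 to 100 (hypothetical search range).
grid = [10 ** (k / 100) for k in range(-300, 201)]
phi_hat = max(grid, key=lambda phi: profile_loglik(phi, y))
print(phi_hat)
```

If the data were not overdispersed, the objective would keep increasing as ϕ approaches 0 and the grid maximizer would sit on the lower boundary.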
How do I derive the conditional log-likelihood for ϕ that is given in the paper?
$$l_{Y|Z=z}(\phi)=\left[\sum_{i=1}^{n}\log\Gamma(y_i+\phi^{-1})\right]+\log\Gamma(n\phi^{-1})-\log\Gamma(z+n\phi^{-1})-n\log\Gamma(\phi^{-1})$$
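One way to probe this expression numerically, assuming equal library sizes so that the sum Z is negative binomial as stated earlier, is to compare it with the joint log-pmf of the $$y_i$$ minus the log-pmf of z. The sketch below checks that the two differ only by a quantity free of ϕ and μ. Function names and test values are illustrative assumptions.

```python
import math

def nb_logpmf(y, mu, phi):
    """Log of equation (1): NB with mean mu and dispersion phi."""
    r = 1.0 / phi
    return (math.lgamma(r + y) - math.lgamma(r) - math.lgamma(y + 1)
            + y * math.log(phi * mu / (1.0 + phi * mu))
            - r * math.log(1.0 + phi * mu))

def cond_loglik(phi, y):
    """Conditional log-likelihood l_{Y|Z=z}(phi) as written above."""
    n, z, r = len(y), sum(y), 1.0 / phi
    return (sum(math.lgamma(yi + r) for yi in y) + math.lgamma(n * r)
            - math.lgamma(z + n * r) - n * math.lgamma(r))

def joint_minus_marginal(phi, mu, y):
    """log f(y_1..y_n) - log f(z), using Z ~ NB(n*mu, phi/n)."""
    n, z = len(y), sum(y)
    joint = sum(nb_logpmf(yi, mu, phi) for yi in y)
    marginal = nb_logpmf(z, n * mu, phi / n)
    return joint - marginal

y = [0, 3, 12, 1, 25, 7]  # made-up counts
# Terms that involve neither phi nor mu (dropped from the formula above).
const = math.lgamma(sum(y) + 1) - sum(math.lgamma(yi + 1) for yi in y)
gaps = [joint_minus_marginal(phi, mu, y) - cond_loglik(phi, y) - const
        for phi in (0.1, 1.0, 5.0) for mu in (0.5, 8.0, 40.0)]
print(max(abs(g) for g in gaps))
```

In this check the terms containing μ cancel between the joint and the marginal, which is why the conditional expression has no μ (and hence no λ) in it.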