In the paper "Small-sample estimation of negative binomial dispersion, with applications to SAGE data", section 4.1, "Conditional maximum likelihood":
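For RNA sequencing data, assume the count for a single tag in each of n libraries is a negative binomial random variable. Consider independent $Y_1, \dots, Y_n$ with $Y_i \sim \mathrm{NB}(\mu_i = m_i\lambda, \phi)$, where $m_i$ is the library size and $\lambda$ is the proportion of the library made up of that tag.
If all libraries are the same size (i.e. $m_i \equiv m$), the sum $Z = Y_1 + \dots + Y_n \sim \mathrm{NB}(nm\lambda, \phi n^{-1})$.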
If we write out the log-likelihood, dropping terms that don't involve ϕ, we get:
Rewriting the log terms and rearranging gives us:
Some more rearranging:
Substituting
How do I derive the conditional log-likelihood for ϕ that the paper maximizes to get the conditional maximum likelihood estimate?
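
One way to reach a conditional log-likelihood of this kind is sketched below. It assumes the usual mean/dispersion parameterization of the negative binomial (mean $\mu = m\lambda$, variance $\mu + \phi\mu^2$), which the post does not spell out. The shorthand $p = 1/(1 + \mu\phi)$ is notation introduced here for convenience, not taken from the paper. The key facts are that a sum of $n$ i.i.d. negative binomials with size $\phi^{-1}$ and common $p$ is again negative binomial with size $n\phi^{-1}$ and the same $p$, and that dividing the joint probability by the probability of the sum cancels every factor involving $p$ (and hence $\lambda$).

```latex
% Assumed parameterization: size r = \phi^{-1}, p = 1/(1 + \mu\phi), \mu = m\lambda.
\begin{align*}
P(Y_i = y_i) &= \frac{\Gamma(y_i + \phi^{-1})}{\Gamma(\phi^{-1})\, y_i!}\,
                p^{\phi^{-1}} (1-p)^{y_i}, \\
P(Z = z) &= \frac{\Gamma(z + n\phi^{-1})}{\Gamma(n\phi^{-1})\, z!}\,
            p^{n\phi^{-1}} (1-p)^{z},
            \qquad z = \sum_{i=1}^{n} y_i, \\
P(Y_1 = y_1, \dots, Y_n = y_n \mid Z = z)
  &= \frac{\prod_{i=1}^{n} P(Y_i = y_i)}{P(Z = z)}
   = \frac{\Gamma(n\phi^{-1})\, z!}{\Gamma(z + n\phi^{-1})}
     \prod_{i=1}^{n} \frac{\Gamma(y_i + \phi^{-1})}{\Gamma(\phi^{-1})\, y_i!}, \\
\ell_{Y \mid Z = z}(\phi)
  &= \sum_{i=1}^{n} \log\Gamma(y_i + \phi^{-1}) + \log\Gamma(n\phi^{-1})
     - \log\Gamma(z + n\phi^{-1}) - n \log\Gamma(\phi^{-1}) + \text{const}.
\end{align*}
```

The constant collects $\log z! - \sum_i \log y_i!$, which does not involve ϕ. Since $\lambda$ no longer appears, the conditional maximum likelihood estimate of ϕ is the value that maximizes $\ell_{Y \mid Z = z}(\phi)$, typically found numerically.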