Suppose I have a design like this, with a large number of control samples and a single test sample:
design = ~ condition
table(condition)
   control treatmentA
        50          1
When I estimate the dispersions and run differential expression testing, I would expect all the information about the dispersion to come from the control samples, since the treatment sample is fit perfectly by the linear model and adds no residual degrees of freedom.
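To make that expectation concrete, here is a minimal base R sketch. It uses ordinary least squares as a Gaussian stand-in for the negative binomial GLM, so it only illustrates the intuition and says nothing about what the package does internally. The simulated values and all variable names are hypothetical.

```r
# Gaussian analogue only: lm() instead of a negative binomial GLM.
set.seed(1)
condition <- factor(c(rep("control", 50), "treatmentA"))
y <- rnorm(51, mean = 10)        # simulated values for one hypothetical gene

fit_full <- lm(y ~ condition)    # 50 controls plus the single treated sample
fit_ctrl <- lm(y[1:50] ~ 1)      # controls only, intercept-only design

# The lone treated sample gets its own coefficient, so its residual is zero
# (up to floating point) and it contributes nothing to the residual variance.
resid_treated <- unname(residuals(fit_full)[51])

df_full <- df.residual(fit_full) # 51 samples - 2 coefficients
df_ctrl <- df.residual(fit_ctrl) # 50 samples - 1 coefficient

s2_full <- summary(fit_full)$sigma^2
s2_ctrl <- summary(fit_ctrl)$sigma^2

print(c(df_full = df_full, df_ctrl = df_ctrl))
print(all.equal(s2_full, s2_ctrl))
```

In this Gaussian setting both fits have the same residual degrees of freedom and the same variance estimate, which is the behaviour I was expecting for the dispersion as well.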
But in practice that's not what happens: if I exclude the treatment sample and set design = ~ 1, or if I include one additional sample labelled, say, "treatmentB", I get quite different estimates for the dispersion.
Is there an intuitive explanation of why?
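For anyone who wants to reproduce the comparison, here is a rough sketch on simulated counts, assuming the package is DESeq2. The data come from `makeExampleDESeqDataSet()` rather than my real experiment, and `fit_disp` is a hypothetical helper written just for this post. Note that size factors are re-estimated for each subset of samples, so that is one more thing that differs between the three fits.

```r
library(DESeq2)

set.seed(1)
# Simulated counts: 1000 genes, 52 samples, no true condition effect assumed here.
counts_all <- counts(makeExampleDESeqDataSet(n = 1000, m = 52))
condition_all <- factor(c(rep("control", 50), "treatmentA", "treatmentB"))

# Hypothetical helper: build a dataset from the chosen samples and
# return the final per-gene dispersion estimates.
fit_disp <- function(keep, design) {
  cts <- counts_all[, keep]
  coldata <- data.frame(condition = droplevels(condition_all[keep]),
                        row.names = colnames(cts))
  dds <- DESeqDataSetFromMatrix(cts, coldata, design)
  dds <- estimateSizeFactors(dds)
  dds <- estimateDispersions(dds)
  dispersions(dds)
}

disp_ctrl <- fit_disp(1:50, ~ 1)          # controls only
disp_one  <- fit_disp(1:51, ~ condition)  # controls + treatmentA
disp_two  <- fit_disp(1:52, ~ condition)  # controls + treatmentA + treatmentB

# Per-gene log2 ratios of dispersion estimates between the designs.
summary(log2(disp_one / disp_ctrl))
summary(log2(disp_two / disp_one))
```

If the single sample really carried no information, I would expect both summaries to be centred tightly on zero.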