Question

How do singleton samples contribute to the dispersion estimate in DESeq2?

Victor ▴ 10

@73f9cb53

Suppose I have a design like this, with a large number of control samples and a single test sample:

design = ~ condition
table(condition)

control  treatmentA
     50           1

When I estimate the dispersions and run differential expression testing, I would have expected that all the information about the dispersion comes from the control samples, since the treatment sample is perfectly fit by the linear model and does not add any degrees of freedom.

But in practice that's not what happens: if I exclude the treatment sample and set design = ~ 1, or if I include an additional single sample with, say, "treatmentB", I get quite different estimates for the dispersion.

Is there an intuitive explanation of why?

DESeq2

score 0 · Answer 1 · 2022-07-06

Michael Love 41k

@mikelove

For that type of design, the singleton samples do not really add much information about the dispersion. But the dispersion estimates are maximum likelihood estimates (MLE) and are not based on a closed-form formula like the pooled variance estimate for t-tests.
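One way to see why a perfectly fit singleton can still move an MLE is to look at the negative binomial log-likelihood directly. Even when the fitted mean equals the observed count, the log-likelihood of that observation still changes with the dispersion, so the sample is not "invisible" to the optimizer the way it would be in a pooled-variance formula. The sketch below is a plain Python illustration of that point, not DESeq2's code: the function name `nb_logpmf`, the count of 100, and the dispersion grid are all made up for illustration. DESeq2's actual estimate also involves a Cox-Reid adjustment that depends on the design matrix, plus shrinkage toward a fitted trend, and none of that is reproduced here.

```python
import math


def nb_logpmf(y, mu, alpha):
    """Log-PMF of a negative binomial with mean mu and dispersion alpha.

    Parameterized so that variance = mu + alpha * mu**2.
    (Hypothetical helper for illustration only.)
    """
    r = 1.0 / alpha  # the "size" parameter
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


# A singleton sample: with its own coefficient in the design,
# the fitted mean equals the observed count exactly.
y = 100          # assumed count, chosen arbitrarily
mu_fitted = y    # perfect fit

alphas = [0.01, 0.1, 1.0]
singleton_loglik = [nb_logpmf(y, mu_fitted, a) for a in alphas]

for a, ll in zip(alphas, singleton_loglik):
    print(f"alpha = {a:<5} log-likelihood of the singleton = {ll:.3f}")
```

The printed values differ across the dispersion grid, which is the point: the singleton's term is not constant in the dispersion, so adding or removing such a sample changes the function being maximized, even though it adds no residual degrees of freedom.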
