Question

How do singleton samples contribute to the dispersion estimate in DESeq2?

Victor ▴ 10

@73f9cb53

Last seen 2.6 years ago

United States

Suppose I have a design like this, with a large number of control samples and a single test sample:

design = ~ condition
table(condition)

   control treatmentA
        50          1

When I estimate the dispersions and run differential expression testing, I would have expected that all the information about the dispersion comes from the control samples, since the treatment sample is perfectly fit by the linear model and does not add any degrees of freedom.

But in practice that's not what happens -- if I exclude the treatment sample and set design = ~ 1, or if I include an additional single sample labeled, say, "treatmentB", I get quite different estimates for the dispersion.

Is there an intuitive explanation of why?

DESeq2 • 687 views

updated 2.6 years ago by Michael Love 43k • written 2.6 years ago by Victor ▴ 10

score 0 · Answer 1 · 2022-07-06

Michael Love 43k

@mikelove

Last seen 1 day ago

United States

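For that type of design, the singleton samples do not really add much information about the dispersion. But the dispersion estimates are maximum likelihood estimates (MLE), not based on a closed-form formula like the pooled variance estimate for t-tests, so adding or removing a sample can still shift them.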
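One way to build intuition for this is a minimal sketch in base R. It is a toy illustration under stated assumptions, not DESeq2's actual estimator, which also applies a Cox-Reid adjustment and shrinks gene-wise estimates toward a fitted trend. The simulated counts, the treated count of 150, and the helper names `singleton_loglik`, `ctrl_loglik`, and `fit_alpha` are all hypothetical. The sketch shows that the negative binomial log-likelihood of a sample whose fitted mean equals its own count still depends on the dispersion, so a "perfectly fit" singleton can move an unadjusted MLE.

```r
set.seed(1)

# Hypothetical toy data for one gene: 50 controls plus a single treated sample.
# Parameterization: Var(y) = mu + alpha * mu^2, so size = 1 / alpha in dnbinom().
y_ctrl  <- rnbinom(50, mu = 100, size = 1 / 0.2)
y_treat <- 150

# Log-likelihood of the lone sample when its fitted mean equals its own count.
singleton_loglik <- function(alpha, y) {
  dnbinom(y, mu = y, size = 1 / alpha, log = TRUE)
}

# Profile log-likelihood of the controls; for a fixed alpha the MLE of the
# control mean is the sample mean.
ctrl_loglik <- function(alpha, y) {
  sum(dnbinom(y, mu = mean(y), size = 1 / alpha, log = TRUE))
}

# The singleton term is not flat in alpha, even though the fit is "perfect".
alphas <- c(0.01, 0.1, 0.5, 1)
ll_single <- singleton_loglik(alphas, y_treat)

# Maximize over alpha with and without the singleton term (unadjusted MLE).
fit_alpha <- function(f) {
  optimize(f, interval = c(1e-4, 10), maximum = TRUE, tol = 1e-8)$maximum
}
alpha_without <- fit_alpha(function(a) ctrl_loglik(a, y_ctrl))
alpha_with    <- fit_alpha(function(a) ctrl_loglik(a, y_ctrl) +
                                        singleton_loglik(a, y_treat))

print(ll_single)
print(c(without = alpha_without, with = alpha_with))
```

In this toy setup the singleton's log-likelihood falls as alpha grows, so including it pulls the unadjusted estimate slightly downward. In a real DESeq2 run, other factors are likely to contribute as well, such as changes in the size factors and in the fitted dispersion trend when samples are added or removed.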

2.6 years ago • Michael Love 43k