Question

DESeq2 empirical bayes: variance of the prior for log fold changes (LFC)

Chenghao ▴ 10

Hello,

I'm reading the DESeq2 paper, and I have two questions regarding the prior for the LFCs.

1) For the variance of the normal prior for LFC, I am wondering why the variance contributed by sampling distribution of LFC isn't separated out, unlike the prior for log dispersion.

2) The model's coefficients (LFC's) estimates for a gene should be correlated, while the priors for them are set as independent. Is there a specific reason for this choice? Wouldn't a multivariate normal prior be more appropriate here?

Any help is greatly appreciated.

DESeq2 • 3.9k views

updated 7 months ago by Michael Love • written 7 months ago by Chenghao

score 2 · Accepted Answer · 2025-06-06

why the variance contributed by sampling distribution of LFC isn't separated out

Good question. In the paper we say: "though here we do not subtract the expected sampling variance from the observed variance of maximum likelihood estimates".

The reason was practical: subtracting the sampling variance resulted in too much shrinkage with the Normal prior. Later we developed the Cauchy prior in Zhu, Ibrahim, Love 2019 (apeglm), which is a better fit to typical RNA-seq experiments and does use the standard errors:

quote from apeglm paper
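
To make the variance point concrete, here is a minimal simulation sketch of what "subtracting the expected sampling variance" means. It is a simple method-of-moments illustration, not the estimator DESeq2 actually uses, and every number in it (the prior SD, the range of standard errors, the number of genes) is a made-up assumption.

```python
import random

random.seed(1)

n_genes = 20000
true_prior_sd = 0.5  # hypothetical SD of the true LFCs (assumption)

# Hypothetical per-gene standard errors of the MLE LFCs (assumption).
std_errors = [random.uniform(0.2, 1.0) for _ in range(n_genes)]

# True effects come from a zero-centered Normal; the MLE adds sampling noise.
true_lfc = [random.gauss(0.0, true_prior_sd) for _ in range(n_genes)]
mle_lfc = [b + random.gauss(0.0, se) for b, se in zip(true_lfc, std_errors)]

# Observed variance of the MLEs around zero: prior variance + sampling variance.
observed_var = sum(x * x for x in mle_lfc) / n_genes

# Expected sampling variance: the average squared standard error.
expected_sampling_var = sum(se * se for se in std_errors) / n_genes

# Subtracting the sampling variance gives an estimate of the prior variance alone.
deconvolved_var = observed_var - expected_sampling_var

print("observed variance of MLEs:  ", round(observed_var, 3))
print("expected sampling variance: ", round(expected_sampling_var, 3))
print("variance after subtraction: ", round(deconvolved_var, 3))
print("true prior variance:        ", true_prior_sd ** 2)
```

The subtracted estimate is always narrower than the raw variance of the MLEs, and a narrower Normal prior means stronger shrinkage, which is the direction of the practical problem described above.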

The model's coefficients (LFC's) estimates for a gene should be correlated

The sampling distribution of the estimated LFCs is correlated through (X'X)^-1, but that is a property of the estimates. Here we are specifying a prior for the true effects, which need not share that correlation.
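
As a small sketch of where that correlation comes from, the snippet below computes (X'X)^-1 for a hypothetical design with three groups (A, B, C), two samples each, and treatment coding against A. It assumes an ordinary linear model with equal variances; the GLM used for counts adds observation weights, which are ignored here. The design and the helper names are illustrative assumptions.

```python
from fractions import Fraction
import math


def invert(matrix):
    """Invert a small nonsingular square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment the matrix with the identity, using exact fractions.
    aug = [
        [Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Swap a row with a nonzero pivot into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical design matrix; columns: intercept, B_vs_A, C_vs_A.
X = [
    [1, 0, 0], [1, 0, 0],  # group A
    [1, 1, 0], [1, 1, 0],  # group B
    [1, 0, 1], [1, 0, 1],  # group C
]

p = len(X[0])
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
XtX_inv = invert(XtX)


def correlation(cov, i, j):
    """Correlation implied by a covariance matrix between coefficients i and j."""
    return float(cov[i][j]) / math.sqrt(float(cov[i][i]) * float(cov[j][j]))


corr_b_c = correlation(XtX_inv, 1, 2)
print("correlation of B_vs_A and C_vs_A estimates:", corr_b_c)
```

The two LFC estimates are positively correlated because both are contrasts against the same reference group, so noise in the group A samples moves them together. That says nothing about whether the true B vs A and C vs A effects are related, which is what a prior would describe.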

In the end, we perform shrinkage on each coefficient separately with apeglm. This is because attempting shrinkage on multiple parameters at once can behave counterintuitively (this also occurs in other regularization methods such as ridge and lasso).