Question

Which of apeglm and ashr may be more appropriate for pseudobulked DESeq2 analysis of single-cell RNA-seq data?

KS • 0

Hello!

I'm analyzing pseudobulked single-cell RNA-seq data with DESeq2 and comparing apeglm and ashr for log fold change shrinkage. Even in my largest cell type (8 vs. 11 samples, with 25 to 7771 cells per sample and a median of 835), I'm observing substantial differences between the two methods that significantly impact downstream analysis.

Key observations:

- apeglm appears much more aggressive in shrinkage, pushing many genes toward logFC = 0
- ashr maintains a gradient of shrinkage values and preserves more moderate effect sizes
- In downstream GSEA analysis, apeglm yields very few significant enrichments while ashr produces many biologically plausible pathway enrichments

Below I'm showing the correlation between the original and shrunken logFCs:

[Figure: original vs. shrunken logFCs]

Specific question:

For pseudobulked scRNA-seq data, are there methodological reasons to prefer one approach over the other? I'm particularly interested in:

- Whether the distributional assumptions of each method are better suited to the characteristics of pseudobulked data
- Whether the more aggressive shrinkage by apeglm might be overly conservative for this data type, potentially masking true biological signal
- How to objectively evaluate which approach is more appropriate when the biological interpretation seems more coherent with one method

I want to avoid confirmation bias in method selection. The ashr results align better with my biological expectations, and I'm concerned that this might influence my judgment. Are there principled ways to evaluate which shrinkage method is appropriate, beyond downstream biological plausibility?

Thank you!

apeglm DESeq2 • 730 views

updated 3 months ago by Michael Love 43k • written 4 months ago by KS • 0

score 0 · Answer 1 · 2025-09-08

> apeglm appears much more aggressive in shrinkage, pushing many genes toward logFC = 0
> ashr maintains a gradient of shrinkage values and preserves more moderate effect sizes

I have seen this before on the support site, and we saw it as well in the publication.

> In downstream GSEA analysis, apeglm yields very few significant enrichments while ashr produces many biologically plausible pathway enrichments

Yes, it could be that there is information in the genes that apeglm shrinks to 0 but ashr does not. The prior used by ashr is more flexible, which may make it more appropriate for the distribution of true effects in your data.
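
For readers who want to look at this kind of comparison on data where the true effects are known, here is a minimal sketch in R. It is not taken from the thread. It assumes that DESeq2, apeglm and ashr are installed, and it uses counts simulated with DESeq2's `makeExampleDESeqDataSet()` rather than pseudobulked data. The simulated true log2 fold changes are drawn from a normal distribution, which may not resemble the effect distribution in a real experiment, so the numbers say nothing about which method is better for any particular dataset.

```r
# Sketch: run both shrinkage estimators on the same simulated dataset.
# Assumptions (not from the thread): simulated counts, 10 vs 10 samples,
# normally distributed true effects (betaSD = 1).
library(DESeq2)

set.seed(1)
dds <- makeExampleDESeqDataSet(n = 2000, m = 20, betaSD = 1)
dds <- DESeq(dds, quiet = TRUE)

# Coefficient name produced by the simulated design (~ condition, levels A and B).
coef_name <- "condition_B_vs_A"
stopifnot(coef_name %in% resultsNames(dds))

# Unshrunken (MLE) estimates and the two shrunken versions.
res_mle <- results(dds, name = coef_name)
res_ape <- lfcShrink(dds, coef = coef_name, type = "apeglm", quiet = TRUE)
res_ash <- lfcShrink(dds, coef = coef_name, type = "ashr", quiet = TRUE)

lfc <- data.frame(
  mle    = res_mle$log2FoldChange,
  apeglm = res_ape$log2FoldChange,
  ashr   = res_ash$log2FoldChange,
  truth  = mcols(dds)$trueBeta  # simulated true log2 fold changes
)

# Mean absolute error of each estimate against the simulated truth.
mae <- sapply(lfc[c("mle", "apeglm", "ashr")],
              function(x) mean(abs(x - lfc$truth), na.rm = TRUE))
print(mae)

# Shrunken vs. unshrunken estimates, one panel per method.
par(mfrow = c(1, 2))
plot(lfc$mle, lfc$apeglm, xlab = "MLE log2FC", ylab = "apeglm log2FC")
abline(0, 1, col = "grey")
plot(lfc$mle, lfc$ashr, xlab = "MLE log2FC", ylab = "ashr log2FC")
abline(0, 1, col = "grey")
```

On real data there is no `truth` column, so only the scatter plots carry over. To try it on your own analysis, replace the simulated `dds` with your fitted object and set `coef_name` to one of the entries in `resultsNames(dds)`.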