Question: Log2FoldChange shrinkage method in DESeq2
3
14 months ago by
United States
Benjamin Ostendorf80 wrote:

Hi everyone,

I noticed that applying the newly recommended apeglm method for logfoldchange-shrinkage when using the DESeq2 package resulted in a very different MA-plot in comparison to setting the lfcShrink method to normal ; it seems to shrink the logfoldchanges a lot more for most genes, possibly yielding a better signal/noise ratio. Are differences this dramatic to be expected when changing the lfcShrink method?

Thank you!
Benjamin

Shrinkage method set to normal

Shrinkage method set to apeglm

deseq2 lfcshrink apeglm • 719 views
modified 14 months ago by Michael Love24k • written 14 months ago by Benjamin Ostendorf80
Answer: Log2FoldChange shrinkage method in DESeq2
3
14 months ago by
Michael Love24k
United States
Michael Love24k wrote:

hi Benjamin,

Great question:

Yes, apeglm does a much better job at preserving signal while shrinking away noise. We found that it's better both in ranking gene by effect size and in mean absolute error, comparing to the true LFC when it is known. The kind of LFC it shrinks to zero are those in which there is very little information to estimate even the sign of the LFC. Also the shrinkage is more aggressive when most of the effect sizes are consistent with LFC very close to 0.

Can you tell me a little about the sample size here?

I'd be interested to see also how "ashr" performs if you can try this.

I'm comparing 8 versus 5 biological replicates; the data were generated using Lexogen's 3' Quantseq kit, which generates just one fragment per transcript close to the 3' end.

Please find below the MA-plot using ashr-shrunken LFCs. lfcShrink with ashr gave this warning: Due to absence of package REBayes, switching to EM algorithm (I couldn't install REBayes because installation of the dependent package Rmosek failed).

I'm impressed by the increased signal/noise of apeglm, but it seems unusual that apeglm shrinkage yields many genes with a padj<0.1 but  abs(LFC)<0.01 (please see volcano plot below)?

Shrinkage method set to ashr

Volcano plot with shrinkage method set to apeglm

1

That's a good point. The reason for this is that the padj comes from null hypothesis testing then BH correction, while the shrinkage comes from a Bayesian procedure to get a reliable effect size, and the shrinkage procedure tends to be more aggressive at shrinking potential noise than the testing procedure. One way to think about it is that 1 out of 10 of the FDR set could have LFC=0. The shrinkage procedure will tend to put genes that are compatible with LFC=0 to 0, and it's more aggressive at this than the padj generating procedure. If you want something like a padj but that is in agreement with the shrunken estimate, we have followed the Stephens 2016 paper and added the capability to compute s-values and attach them to the results. You just need to set svalue=TRUE when you run lfcShrink, and then you will have an aggregate posterior probability to work with. Instead of bounding the rate of LFC=0, it bounds the rate of false signs (e.g. LFC is estimated as positive, but it's truly negative). I tend to use lower thresholds for svalue than for padj, so I might consider a 0.01 or a 0.005 svalue cutoff (the maximum s-value being ~0.5 corresponding to just guessing the sign of the LFC).

Thanks Michael, I'll read up on and try out the s-values!

I also noticed a similar behavior of apeglm in my own RNAseq data set (10 samples), and I was wondering if in this case it is advised to use the "normal" method for shrinking the values instead of the newer apeglm. Any advise would be greatly appreciated.