Question

Log2FoldChange shrinkage method in DESeq2

4

Entering edit mode

Benjamin Ostendorf ▴ 90

@benjamin-ostendorf-15752

Last seen 4.5 years ago

United States

Hi everyone,

I noticed that applying the newly recommended apeglm method for logfoldchange-shrinkage when using the DESeq2 package resulted in a very different MA-plot in comparison to setting the lfcShrink method to normal ; it seems to shrink the logfoldchanges a lot more for most genes, possibly yielding a better signal/noise ratio. Are differences this dramatic to be expected when changing the lfcShrink method?

Thank you!
Benjamin

Shrinkage method set to normal:

Shrinkage method set to apeglm

deseq2 apeglm lfcshrink • 7.8k views

ADD COMMENT • link updated 7.6 years ago by Michael Love 43k • written 7.6 years ago by Benjamin Ostendorf ▴ 90

score 5 · Accepted Answer · 2018-05-23

5

Entering edit mode

Michael Love 43k

@mikelove

Last seen 23 hours ago

United States

hi Benjamin,

Great question:

Yes, apeglm does a much better job at preserving signal while shrinking away noise. We found that it's better both in ranking gene by effect size and in mean absolute error, comparing to the true LFC when it is known. The kind of LFC it shrinks to zero are those in which there is very little information to estimate even the sign of the LFC. Also the shrinkage is more aggressive when most of the effect sizes are consistent with LFC very close to 0.

Can you tell me a little about the sample size here?

I'd be interested to see also how "ashr" performs if you can try this.

ADD COMMENT • link 7.6 years ago Michael Love 43k

0

Entering edit mode

Thanks for your reply, Michael!

I'm comparing 8 versus 5 biological replicates; the data were generated using Lexogen's 3' Quantseq kit, which generates just one fragment per transcript close to the 3' end.

Please find below the MA-plot using ashr-shrunken LFCs. lfcShrink with ashr gave this warning: Due to absence of package REBayes, switching to EM algorithm (I couldn't install REBayes because installation of the dependent package Rmosek failed).

I'm impressed by the increased signal/noise of apeglm, but it seems unusual that apeglm shrinkage yields many genes with a padj<0.1 but abs(LFC)<0.01 (please see volcano plot below)?

Shrinkage method set to ashr

Volcano plot with shrinkage method set to apeglm

ADD REPLY • link 7.6 years ago Benjamin Ostendorf ▴ 90

1

Entering edit mode

That's a good point. The reason for this is that the padj comes from null hypothesis testing then BH correction, while the shrinkage comes from a Bayesian procedure to get a reliable effect size, and the shrinkage procedure tends to be more aggressive at shrinking potential noise than the testing procedure. One way to think about it is that 1 out of 10 of the FDR set could have LFC=0. The shrinkage procedure will tend to put genes that are compatible with LFC=0 to 0, and it's more aggressive at this than the padj generating procedure. If you want something like a padj but that is in agreement with the shrunken estimate, we have followed the Stephens 2016 paper and added the capability to compute s-values and attach them to the results. You just need to set svalue=TRUE when you run lfcShrink, and then you will have an aggregate posterior probability to work with. Instead of bounding the rate of LFC=0, it bounds the rate of false signs (e.g. LFC is estimated as positive, but it's truly negative). I tend to use lower thresholds for svalue than for padj, so I might consider a 0.01 or a 0.005 svalue cutoff (the maximum s-value being ~0.5 corresponding to just guessing the sign of the LFC).

ADD REPLY • link 7.6 years ago Michael Love 43k

0

Entering edit mode

Thanks Michael, I'll read up on and try out the s-values!

ADD REPLY • link 7.6 years ago Benjamin Ostendorf ▴ 90

0

Entering edit mode

I also noticed a similar behavior of apeglm in my own RNAseq data set (10 samples), and I was wondering if in this case it is advised to use the "normal" method for shrinking the values instead of the newer apeglm. Any advise would be greatly appreciated.

MAplots

ADD REPLY • link 6.9 years ago ricardo3889 • 0

0

Entering edit mode

This means the counts for nearly all genes are compatible with LFC=0. Likely small sample size? If you want unshrunken LFC just go with the output of results().

ADD REPLY • link 6.9 years ago Michael Love 43k