Question: Interpreting the results from lfcShrink() with apeglm in DESeq2
12 months ago by
charles.foster40 wrote:

Dear all,

I'm relatively new to differential expression analysis of RNA-Seq data, and I'm working my way through the DESeq2 vignette to come to terms with how it works. I'm looking to obtain a set of genes with a log2 fold change >2, and an adjusted p-value of <0.001.

Following the "standard" procedure, I'm able to get the results using the following command:

res_groupA_vs_groupB <- results(dds,contrast=c("Tissue","groupA","groupB"),lfcThreshold=2,alpha=0.001)

This gave me sensible results, including adjusted p-values that I'm comfortable interpreting. I then noticed the lfcShrink function, and read about its benefits. I ran the following command:

res2_groupA_vs_groupB <- lfcShrink(dds,coef=2,type="apeglm",lfcThreshold=2)

In this case, s-values are provided, and from the documentation I can see these "provide the probability of false signs among the tests with equal or smaller s-value than a given gene's s-value". I read through the Stephens (2016) reference to try to understand these more, but I'm still a little uncertain as to the interpretation of s-values. As stated above, I've decided a priori to focus on those genes with an adjusted p-value of <0.001, but I'm uncertain whether this same logic can be applied to s-values (i.e., focusing on genes with s-value <0.001). Can the interpretation be analogous, or am I on the wrong track here?

Any advice or suggestions for further reading would be greatly appreciated.

Answer: Interpreting the results from lfcShrink() with apeglm in DESeq2
12 months ago by
Michael Love25k wrote:

Hi Charles,

The adjusted p-values and s-values are similar, but they use different definitions of error. Adjusted p-values focus on falsely rejecting genes that are truly null, while s-values focus on getting the sign of the LFC wrong. You can use whichever you prefer or feel more comfortable with.
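
To make the "probability of false signs among the tests with equal or smaller s-value" idea concrete, here is a minimal base R sketch. It assumes the Stephens (2016) definition, in which a gene's s-value is the average local false sign rate (lfsr) of all genes at least as confident as that gene. The function name `svalue_from_lfsr` and the four lfsr values are hypothetical and only for illustration; this is not a copy of the DESeq2 or apeglm internals.

```r
# Toy illustration (assumed definition, not DESeq2/apeglm internals):
# an s-value is the running mean of the sorted local false sign rates.
svalue_from_lfsr <- function(lfsr) {
  ord <- order(lfsr)                                  # most confident sign first
  running_mean <- cumsum(lfsr[ord]) / seq_along(ord)  # mean lfsr of the top-k genes
  s <- numeric(length(lfsr))
  s[ord] <- running_mean                              # restore original gene order
  names(s) <- names(lfsr)
  s
}

# Hypothetical per-gene local false sign rates
lfsr <- c(geneA = 0.0001, geneB = 0.20, geneC = 0.0005, geneD = 0.002)
sval <- svalue_from_lfsr(lfsr)

# Keep genes whose s-value is below the chosen cutoff
selected <- names(sval)[sval < 0.001]
print(sval)
print(selected)
```

In this toy example geneD is kept even though its own lfsr (0.002) is above the cutoff, because the average false sign rate over the set {geneA, geneC, geneD} is still below 0.001. Under this definition, thresholding s-values bounds the expected fraction of wrong signs in the selected set, much as thresholding adjusted p-values bounds the expected fraction of false rejections.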

Because we were computing posterior distributions in apeglm, and because we were adding ashr to lfcShrink at the same time, we decided it made sense to provide s-values as output. I think there is one clear benefit to outputting s-values: during method development, we can assess and benchmark power and error control simultaneously with real data. With a point null hypothesis of LFC=0, it's very difficult to find real datasets that contain both non-null genes and truly null genes.


Hi Michael,

Great, thanks for the clarification and for the great program!

Reply written 12 months ago by charles.foster