Question: Interpreting the results from lfcShrink() with apeglm in DESeq2
charles.foster wrote (8 months ago):

Dear all,

I'm relatively new to differential expression analysis of RNA-Seq data, and I'm working my way through the DESeq2 vignette to come to terms with how it works. I'm looking to obtain a set of genes with a log2 fold change >2, and an adjusted p-value of <0.001.

Following the "standard" procedure, I'm able to get the results using the following command:

res_groupA_vs_groupB <- results(dds, contrast = c("Tissue", "groupA", "groupB"), lfcThreshold = 2, alpha = 0.001)

This gave me sensible results, including adjusted p-values that I'm comfortable interpreting. I then noticed the lfcShrink function, and read about its benefits. I ran the following command:

res2_groupA_vs_groupB <- lfcShrink(dds, coef = 2, type = "apeglm", lfcThreshold = 2)

In this case, s-values are provided, and from the documentation I can see these "provide the probability of false signs among the tests with equal or smaller s-value than a given gene's s-value". I read through the Stephens (2016) reference to try to understand these better, but I'm still a little uncertain about how to interpret s-values. As stated above, I've decided a priori to focus on genes with an adjusted p-value of <0.001, but I'm uncertain whether the same logic can be applied to s-values (i.e., focusing on genes with an s-value <0.001). Can the interpretation be analogous, or am I on the wrong track here?

Any advice or suggestions for further reading would be greatly appreciated.

Tags: deseq2, lfcshrink, apeglm
Answer: Interpreting the results from lfcShrink() with apeglm in DESeq2
Michael Love wrote (8 months ago):

Hi Charles,

The adjusted p-values and s-values are similar, but they use different definitions of error. Adjusted p-values focus on falsely rejecting genes that are truly null, whereas s-values focus on getting the sign of the LFC wrong. You can use whichever you prefer or feel more comfortable with.
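To make the s-value side of this concrete, here is a minimal base R sketch. It assumes each gene already has a local false sign rate (lfsr), meaning the posterior probability that the sign of its estimated LFC is wrong, and it turns these into s-values by averaging the lfsr over all genes ranked at least as confidently, following the definition in Stephens (2016). The function name `svalue_sketch` and the toy numbers are hypothetical. This illustrates the idea and is not necessarily the exact code that DESeq2 or apeglm runs.

```r
# Toy local false sign rates for four genes (made-up numbers).
lfsr <- c(0.01, 0.20, 0.03, 0.50)

# s-value of a gene = mean lfsr over all genes whose lfsr is at least as small,
# i.e. the expected fraction of wrong signs in the set called at that cutoff.
# Note: this sketch does not handle NA values or treat ties specially.
svalue_sketch <- function(lfsr) {
  ord <- order(lfsr)                                  # most confident genes first
  running_mean <- cumsum(lfsr[ord]) / seq_along(lfsr) # cumulative mean of sorted lfsr
  out <- numeric(length(lfsr))
  out[ord] <- running_mean                            # map back to the input order
  out
}

sval <- svalue_sketch(lfsr)
print(sval)
```

Read this way, keeping genes with an s-value below 0.001 targets a set in which fewer than 0.1% of the calls are expected to have the wrong sign, much as `padj < 0.001` targets a set with fewer than 0.1% false discoveries. When `lfcThreshold` is set in `lfcShrink()`, the DESeq2 documentation describes the s-values as covering "false sign or small" events relative to that threshold, so it is worth checking the help page for the version you are using.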

Because we were computing posterior distributions in apeglm, and because we were adding ashr to lfcShrink at the same time, we decided it made sense to provide s-values as output. I think there is one clear benefit to outputting s-values: during method development, we can assess and benchmark power and error control simultaneously with real data. With a point null hypothesis of LFC = 0, it is very difficult to find real datasets that contain both non-null and truly null genes.
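As a rough illustration of how a posterior distribution leads to a sign-error probability, the base R sketch below assumes each gene's posterior for the LFC is approximately normal with a given mean and standard deviation. The numbers and the function name `lfsr_normal` are hypothetical, and the normal approximation is an assumption made for this example and not a description of apeglm's internals.

```r
# Hypothetical posterior summaries for three genes (made-up numbers).
post_mean <- c(3.0, 0.2, -1.5)
post_sd   <- c(0.5, 0.4, 0.5)

# Local false sign rate: the posterior mass on the less likely side of zero.
lfsr_normal <- function(m, s) {
  p_neg <- pnorm(0, mean = m, sd = s)  # P(LFC < 0 | data) under the normal assumption
  pmin(p_neg, 1 - p_neg)               # probability that the reported sign is wrong
}

lfsr <- lfsr_normal(post_mean, post_sd)
print(signif(lfsr, 3))
```

A gene whose posterior sits far from zero relative to its spread gets a tiny lfsr, while a gene whose posterior straddles zero gets a value close to 0.5, regardless of whether a point null of LFC = 0 is exactly true.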


Hi Michael,

Great, thanks for the clarification, and for the great program!

Reply written 8 months ago by charles.foster