Question

Cutoffs of padj and fold change in DESeq2

2

Entering edit mode

pmpierry ▴ 20

@pmpierry-11422

Last seen 6.7 years ago

Hello everyone,

I have a doubt about the cutoffs in DESeq2 related to padj and fold change. I initially did all my analyses using the default parameters in "results" function. Afterwards, I decided to use cutoffs of padj<=0.05 e FC>=2.0. To do so, I used the results table from those default analysis and filtered based on these cutoffs I wanted.

Later, I read again DESeq2 documentation and saw the arguments "alpha" and "lfcThreshold" in "results" function. My question is if the way I did before (to filter the results table after default analysis) was correct or I need to do all my analysis again using the arguments in results function.

Thank you!

deseq2 • 9.8k views

ADD COMMENT • link 7.6 years ago pmpierry ▴ 20

0

Entering edit mode

Hello Michael,

I followed your suggestions to add alpha and lfcThreshold arguments to results():

res <- results(dds, alpha=0.05, lfcThreshold=1, altHypothesis="greaterAbs")

But, when I call "summary(res)", only alpha has changed, but LFC did not:

> summary(res)

out of 2154 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up) : 84, 3.9%
LFC < 0 (down) : 37, 1.7%
outliers [1] : 0, 0%
low counts [2] : 0, 0%
(mean count < 0)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

I thought it would show "LFC > 1 (up)" and "LFC < -1 (down)" or "LFC < |1| (down).

Am I missing some other argument or setup in results()?

Thanks again for all attention and help!

ADD REPLY • link 7.6 years ago pmpierry ▴ 20

0

Entering edit mode

I see what you mean. summary() is just a helper function to make a table and I don't have it coded to look at the lfcThreshold. I could try to change that, but anyway, you can see that you are in fact testing against LFC of 1 by looking at the MA plot.

And to get numbers of genes you can just do:

table(res$padj < 0.1 & res$log2FoldChange > 0)

table(res$padj < 0.1 & res$log2FoldChange < 0)

ADD REPLY • link 7.6 years ago Michael Love 41k

0

Entering edit mode

Ok, I got it! Thank you again for your help!

ADD REPLY • link 7.6 years ago pmpierry ▴ 20

0

Entering edit mode

By the way, I just fixed this in devel (so released in April 2018) so that it shows the LFC threshold value:

> dds <- makeExampleDESeqDataSet()
> dds <- DESeq(dds, quiet=TRUE)
> res <- results(dds, lfcThreshold=log2(1.5))
> summary(res)

out of 997 with nonzero total read count
adjusted p-value   < 0.1
LFC > 0.58 (up)    : 0, 0%
LFC < -0.58 (down) : 0, 0%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 0)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

> res <- lfcShrink(dds, coef=2, lfcThreshold=log2(1.5))
> summary(res)

out of 997 with nonzero total read count
adjusted p-value   < 0.1
LFC > 0.58 (up)    : 0, 0%
LFC < -0.58 (down) : 0, 0%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 0)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

> summary(results(dds))

out of 997 with nonzero total read count
adjusted p-value   < 0.1
LFC > 0 (up)       : 0, 0%
LFC < 0 (down)     : 0, 0%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 0)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?result

ADD REPLY • link 6.0 years ago Michael Love 41k

score 0 · Answer 1 · 2016-09-05

If you want to use a LFC filter you should use the lfcThreshold argument. This is preferable because it is actually testing for a LFC greater than the value you choose. It makes less sense IMO to test for a non-zero LFC and then filter a different value. For more on this, see the "Hypothesis tests with thresholds on effect size" section of the DESeq2 paper.

Regarding the adjusted p-value threshold you can choose what you want, but by supplying the threshold to the 'alpha' argument, you will increase power. The independent filtering routing within results() uses alpha to optimize the low count filtering. Choosing a different value for alpha than the one you use to filter is sub-optimal.