Variation in number of DEGs on LFC shrinkage
Deevanshu • 0

Hi, I am using DESeq2 to identify DEGs in a dataset of 6 samples, 3 in each of 2 conditions (patient versus control). The cutoffs for calling a DEG are: |log2FC| > 2 and p-value < 0.05.

I ran DESeq2 on the dataset in two ways: (i) without LFC shrinkage, and (ii) with LFC shrinkage.

I didn't expect a large difference in the number of DEGs detected, but the results showed otherwise (1136 DEGs without shrinkage versus 37 with shrinkage). This has left me unsure about when to use LFC shrinkage in DE analysis. The code and output are below.

Is this the expected difference between using and not using LFC shrinkage, and if so, when should shrinkage be used?

> contrast_oe <- c("SampleType", "LC", "Control")
> res_tableOE_unshrunken <- results(dds, contrast=contrast_oe)
> unshrunk_tb <- res_tableOE_unshrunken %>% data.frame() %>% rownames_to_column(var="gene") %>% as_tibble()
> unshrunk_tb$diffexpressed <- "NO"
> unshrunk_tb$diffexpressed[unshrunk_tb$log2FoldChange > 2 & unshrunk_tb$pvalue < 0.05] <- "UP"
> unshrunk_tb$diffexpressed[unshrunk_tb$log2FoldChange < -2 & unshrunk_tb$pvalue < 0.05] <- "DOWN"
> aggregate(gene~diffexpressed, unshrunk_tb, function(x) c(count = length(x)))
  diffexpressed  gene
1          DOWN   293
2            NO 25349
3            UP   843
> res_tableOE_shrunk <- lfcShrink(dds, contrast = contrast_oe, type = "ashr")
using 'ashr' for LFC shrinkage. If used in published research, please cite:
    Stephens, M. (2016) False discovery rates: a new deal. Biostatistics, 18:2.
    https://doi.org/10.1093/biostatistics/kxw041
> shrunk_tb <- res_tableOE_shrunk %>% data.frame() %>% rownames_to_column(var="gene") %>% as_tibble()
> shrunk_tb$diffexpressed <- "NO"
> shrunk_tb$diffexpressed[shrunk_tb$log2FoldChange > 2 & shrunk_tb$pvalue < 0.05] <- "UP"
> shrunk_tb$diffexpressed[shrunk_tb$log2FoldChange < -2 & shrunk_tb$pvalue < 0.05] <- "DOWN"
> aggregate(gene~diffexpressed, shrunk_tb, function(x) c(count = length(x)))
  diffexpressed  gene
1          DOWN     2
2            NO 26448
3            UP    35
DESeq2 lfcShrink • 738 views
@mikelove

Filtering the output of the first results() call on log2FoldChange is not how we recommend thresholding against an LFC value. Use the lfcThreshold argument instead (see the vignette or paper).

Even so, the results will not be identical, because the methods are not identical. If you use lfcThreshold with lfcShrink, it outputs s-values (aggregate posterior tail probabilities). To compare the two, plot on the -log10 scale the p-value from results() with lfcThreshold against the s-value from lfcShrink() with lfcThreshold.
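The comparison described above could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses a simulated dataset from makeExampleDESeqDataSet() as a stand-in for the poster's dds, so the design variable is condition (levels A and B) rather than SampleType, and it uses type = "apeglm" (which needs the apeglm package installed and takes a coef name rather than a contrast) instead of the "ashr" call in the question. The threshold of 2 simply mirrors the cutoff in the question.

```r
library(DESeq2)  # type = "apeglm" below also requires the apeglm package

set.seed(1)
# Assumption: simulated stand-in for the thread's dds (6 samples, 3 per condition)
dds <- makeExampleDESeqDataSet(n = 2000, m = 6, betaSD = 1)
dds <- DESeq(dds)

# Coefficient name for the simulated design; check resultsNames(dds) on real data
coef_name <- "condition_B_vs_A"
stopifnot(coef_name %in% resultsNames(dds))

# Test |LFC| > 2 directly, instead of filtering log2FoldChange after the fact
res_thr <- results(dds, name = coef_name,
                   lfcThreshold = 2, altHypothesis = "greaterAbs")

# Shrunken LFCs with the same threshold; the result carries s-values
res_shr <- lfcShrink(dds, coef = coef_name,
                     type = "apeglm", lfcThreshold = 2)

# Compare thresholded p-values and s-values on the -log10 scale
plot(-log10(res_thr$pvalue), -log10(res_shr$svalue),
     xlab = "-log10 p-value (results, lfcThreshold = 2)",
     ylab = "-log10 s-value (lfcShrink, lfcThreshold = 2)")
abline(0, 1, lty = 2)
```

On real data, replace the simulated dds and coef_name with your own object and the matching entry from resultsNames(dds); the reference level of the design factor determines which coefficient names are available.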
