[DESeq2] Results summary indicates LFC > and < 0 even with changed lfcThreshold
1
0
Entering edit mode
Mike ▴ 10
@mike-12142
Last seen 2.6 years ago
Canada
results1 <- results( dds, contrast=c("combined","B_treated","B_untreated") )
results2 <- results( dds, contrast=c("combined","B_treated","B_untreated"), lfcThreshold=0.58496 )

summary(results1)
summary(results2)

Output. The results are different but they both say "LFC > 0 (up)" and "LFC < 0 (down)":

> summary(results1)
adjusted p-value < 0.1
LFC > 0 (up)     : 1548, 3.7%
LFC < 0 (down)   : 2116, 5%
outliers [1]     : 24, 0.057%
low counts [2]   : 24080, 57%
(mean count < 19)
> summary(results2)
adjusted p-value < 0.1
LFC > 0 (up)     : 42, 0.1%
LFC < 0 (down)   : 25, 0.06%
outliers [1]     : 24, 0.057%
low counts [2]   : 20071, 48%
(mean count < 6)

 

A second question. When I look at the list of genes from results1 there are 775 significantly increased and with log2fc > 0.58496 and 924 significantly decreased log2fs < 0.58496, yet from results2 the numbers are much lower. Does this seem right?

 

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 14393)

Matrix products: default

locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252    LC_MONETARY=English_Canada.1252
[4] LC_NUMERIC=C                    LC_TIME=English_Canada.1252

attached base packages:
 [1] grid      parallel  stats4    stats     graphics  grDevices utils     datasets  methods
[10] base

other attached packages:
 [1] biomaRt_2.34.2             ggbeeswarm_0.6.0           ggrepel_0.7.0
 [4] extrafont_0.17             ggplot2_2.2.1              PoiClaClu_1.0.2
 [7] RColorBrewer_1.1-2         pheatmap_1.0.8             DESeq2_1.18.1
[10] SummarizedExperiment_1.8.1 DelayedArray_0.4.1         matrixStats_0.53.1
[13] Biobase_2.38.0             GenomicRanges_1.30.0       GenomeInfoDb_1.14.0
[16] IRanges_2.12.0             S4Vectors_0.16.0           BiocGenerics_0.24.0

loaded via a namespace (and not attached):
 [1] httr_1.3.1               RMySQL_0.10.13           bit64_0.9-7
 [4] splines_3.4.3            Formula_1.2-2            assertthat_0.2.0
 [7] latticeExtra_0.6-28      blob_1.1.0               vipor_0.4.5
[10] GenomeInfoDbData_1.0.0   Rsamtools_1.30.0         progress_1.1.2
[13] Rttf2pt1_1.3.5           pillar_1.1.0             RSQLite_2.0
[16] backports_1.1.2          lattice_0.20-35          extrafontdb_1.0
[19] digest_0.6.15            XVector_0.18.0           checkmate_1.8.5
[22] colorspace_1.3-2         htmltools_0.3.6          Matrix_1.2-12
[25] plyr_1.8.4               XML_3.98-1.10            genefilter_1.60.0
[28] zlibbioc_1.24.0          xtable_1.8-2             scales_0.5.0
[31] BiocParallel_1.12.0      htmlTable_1.11.2         tibble_1.4.2
[34] annotate_1.56.1          GenomicFeatures_1.30.3   nnet_7.3-12
[37] lazyeval_0.2.1           survival_2.41-3          magrittr_1.5
[40] memoise_1.1.0            foreign_0.8-69           beeswarm_0.2.3
[43] tools_3.4.3              data.table_1.10.4-3      prettyunits_1.0.2
[46] stringr_1.3.0            munsell_0.4.3            locfit_1.5-9.1
[49] cluster_2.0.6            AnnotationDbi_1.40.0     Biostrings_2.46.0
[52] compiler_3.4.3           rlang_0.2.0              RCurl_1.95-4.10
[55] rstudioapi_0.7           htmlwidgets_1.0          bitops_1.0-6
[58] base64enc_0.1-3          gtable_0.2.0             curl_3.1
[61] DBI_0.7                  R6_2.2.2                 GenomicAlignments_1.14.1
[64] gridExtra_2.3            rtracklayer_1.38.3       knitr_1.20
[67] bit_1.1-12               Hmisc_4.1-1              stringi_1.1.6
[70] Rcpp_0.12.15             geneplotter_1.56.0       rpart_4.1-12
[73] acepack_1.4.1
DESeq2 lfcThreshold • 2.2k views
ADD COMMENT
1
Entering edit mode
@mikelove
Last seen 6 hours ago
United States

"they both say "LFC > 0 (up)" and "LFC < 0 (down)"

Yeah, I should have -- but didn't -- attach the lfcThreshold as metadata to the results object when I wrote results(). I could add this, so that I could then print out different threshold here when called by summary(). The function is currently telling you the truth, those are in fact the number of positive and negative LFCs, but instead it would be nicer if it would print out the lfcThreshold value. This has been requested before, and it's on a long list of things to do.

"When I look at the list of genes from results1 there are 775 significantly increased and with log2fc > 0.58496 and 924 significantly decreased log2fs < 0.58496, yet from results2 the numbers are much lower. Does this seem right?"

Yes, these are different operations. See:

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#tests-of-log2-fold-change-above-or-below-a-threshold

And also this paragraph from the DESeq2 paper:

For well-powered experiments, however, a statistical test against the conventional null hypothesis of zero LFC may report genes with statistically significant changes that are so weak in effect strength that they could be considered irrelevant or distracting. A common procedure is to disregard genes whose estimated LFC β ir is below some threshold, |β ir |≤θ. However, this approach loses the benefit of an easily interpretable FDR, as the reported P value and adjusted P value still correspond to the test of zero LFC. It is therefore desirable to include the threshold in the statistical testing procedure directly, i.e., not to filter post hoc on a reported fold-change estimate, but rather to evaluate statistically directly whether there is sufficient evidence that the LFC is above the chosen threshold.

ADD COMMENT
0
Entering edit mode

By the way, I just fixed this in devel (so released in April 2018) so that it shows the LFC threshold value:

> dds <- makeExampleDESeqDataSet()
> dds <- DESeq(dds, quiet=TRUE)
> res <- results(dds, lfcThreshold=log2(1.5))
> summary(res)

out of 997 with nonzero total read count
adjusted p-value   < 0.1
LFC > 0.58 (up)    : 0, 0%
LFC < -0.58 (down) : 0, 0%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 0)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

> res <- lfcShrink(dds, coef=2, lfcThreshold=log2(1.5))
> summary(res)

out of 997 with nonzero total read count
adjusted p-value   < 0.1
LFC > 0.58 (up)    : 0, 0%
LFC < -0.58 (down) : 0, 0%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 0)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

> summary(results(dds))

out of 997 with nonzero total read count
adjusted p-value   < 0.1
LFC > 0 (up)       : 0, 0%
LFC < 0 (down)     : 0, 0%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 0)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?result
ADD REPLY

Login before adding your answer.

Traffic: 642 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6