Question: DESeq2 lfcShrink time for analysis
18 months ago, jshouse wrote:

First, this is a rather large experiment: roughly 3000 features by 1300 samples. Creating the `dds` object takes 12.5 hours with 20 workers at 3.1 GHz, and `resultsNames(dds)` contains 487 items.

> resultsNames(dds)[1:8]
  [1] "Intercept"                                             "group_003_CGO_1_vs_MethodBlank_0"                     
  [3] "group_003_CGO_10_vs_MethodBlank_0"                     "group_003_CGO_100_vs_MethodBlank_0"                   
  [5] "group_006_HFO_1_vs_MethodBlank_0"                      "group_006_HFO_10_vs_MethodBlank_0"                    
  [7] "group_006_HFO_100_vs_MethodBlank_0"                    "group_007_HFO_1_vs_MethodBlank_0"                     


The treatment set consists of 161 chemicals. For each chemical, we compare the dose = 100 group to the MethodBlank controls as follows: `results(dds, name = "group_006_HFO_100_vs_MethodBlank_0", parallel = TRUE)`. This takes about 5 minutes per contrast.
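For illustration, the dose = 100 coefficients could be picked out of `resultsNames(dds)` by pattern and then looped over. This is only a minimal sketch: `select_dose100` is a hypothetical helper, the input is just the eight names printed above, and the `results()` loop assumes a fitted `dds` already exists in the session.

```r
# Hypothetical helper: keep only the dose-100 vs MethodBlank coefficient names.
select_dose100 <- function(coef_names) {
  grep("_100_vs_MethodBlank_0$", coef_names, value = TRUE)
}

# Stand-in input: the first eight entries of resultsNames(dds) from the post.
example_names <- c(
  "Intercept",
  "group_003_CGO_1_vs_MethodBlank_0",
  "group_003_CGO_10_vs_MethodBlank_0",
  "group_003_CGO_100_vs_MethodBlank_0",
  "group_006_HFO_1_vs_MethodBlank_0",
  "group_006_HFO_10_vs_MethodBlank_0",
  "group_006_HFO_100_vs_MethodBlank_0",
  "group_007_HFO_1_vs_MethodBlank_0"
)

dose100 <- select_dose100(example_names)
print(dose100)

# Assumption: `dds` is a DESeqDataSet that has already been through DESeq().
# The block is skipped when no such object is present.
if (exists("dds")) {
  res_list <- lapply(
    setNames(dose100, dose100),
    function(nm) DESeq2::results(dds, name = nm)
  )
}
```

On the real object, `select_dose100(resultsNames(dds))` would replace the stand-in vector.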

When I try `lfcShrink(dds, coef = "group_006_HFO_100_vs_MethodBlank_0", parallel = TRUE, BPPARAM = SnowParam(18))`, it runs for hours until I interrupt R, and the whole time it pegs all 20 workers at 100%.

Am I doing something wrong here? The vignette indicates this shouldn't take very long, especially with parallel workers.

Thanks for your time.

***System Info***

Microsoft R Open 3.4.2
The enhanced R distribution from Microsoft
Microsoft packages Copyright (C) 2017 Microsoft Corporation

Using the Intel MKL for parallel mathematical computing (using 10 cores).

Default CRAN mirror snapshot taken on 2017-10-15.

> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggplot2_2.2.1              dplyr_0.7.4                BiocParallel_1.12.0        DESeq2_1.18.1              SummarizedExperiment_1.8.0 DelayedArray_0.4.1         matrixStats_0.52.2         Biobase_2.38.0            
 [9] GenomicRanges_1.30.0       GenomeInfoDb_1.14.0        IRanges_2.12.0             S4Vectors_0.16.0           BiocGenerics_0.24.0        RevoUtils_10.0.6           BiocInstaller_1.28.0       RevoUtilsMath_10.0.1      

loaded via a namespace (and not attached):
 [1] tidyr_0.7.2             bit64_0.9-7             splines_3.4.2           Formula_1.2-2           assertthat_0.2.0        latticeExtra_0.6-28     blob_1.1.0              GenomeInfoDbData_0.99.1 yaml_2.1.16            
[10] RSQLite_2.0             backports_1.1.1         lattice_0.20-35         glue_1.2.0              digest_0.6.12           RColorBrewer_1.1-2      XVector_0.18.0          checkmate_1.8.5         colorspace_1.3-2       
[19] htmltools_0.3.6         Matrix_1.2-12           plyr_1.8.4              XML_3.98-1.9            pkgconfig_2.0.1         genefilter_1.60.0       zlibbioc_1.24.0         purrr_0.2.4             xtable_1.8-2           
[28] scales_0.5.0            htmlTable_1.11.0        tibble_1.3.4            annotate_1.56.1         nnet_7.3-12             lazyeval_0.2.1          survival_2.41-3         magrittr_1.5            memoise_1.1.0          
[37] foreign_0.8-69          tools_3.4.2             data.table_1.10.4-3     stringr_1.2.0           locfit_1.5-9.1          munsell_0.4.3           cluster_2.0.6           AnnotationDbi_1.40.0    bindrcpp_0.2           
[46] compiler_3.4.2          rlang_0.1.4             grid_3.4.2              RCurl_1.95-4.8          rstudioapi_0.7          htmlwidgets_0.9         bitops_1.0-6            base64enc_0.1-3         gtable_0.2.0           
[55] DBI_0.7                 R6_2.2.2                gridExtra_2.3           knitr_1.17              bit_1.1-12              bindr_0.1               Hmisc_4.0-3             stringi_1.1.6           Rcpp_0.12.14           
[64] geneplotter_1.56.0      rpart_4.1-11            acepack_1.4.1

Answer: DESeq2 lfcShrink time for analysis
18 months ago, Michael Love wrote:

I myself use limma-voom when there are hundreds of samples. At that scale the negative binomial (NB) GLM is overkill, and the iterative steps needed to fit its parameters are unavoidable.
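As a rough illustration of the limma-voom route mentioned above, here is a minimal sketch on simulated counts. The data, sample counts, and the `HFO_100` group label are stand-ins, not the experiment from the question, and the real design would have many more groups. It assumes the `limma` and `edgeR` packages are installed.

```r
library(limma)
library(edgeR)

set.seed(1)

# Simulated stand-in data: 200 features by 12 samples (not the real experiment).
n_genes <- 200
n_samples <- 12
counts <- matrix(
  rnbinom(n_genes * n_samples, mu = 100, size = 10),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("sample", seq_len(n_samples)))
)

# Two illustrative groups, with the method blank as the reference level.
group <- factor(
  rep(c("MethodBlank_0", "HFO_100"), each = n_samples / 2),
  levels = c("MethodBlank_0", "HFO_100")
)
design <- model.matrix(~ group)

# Normalize library sizes, then let voom estimate the mean-variance trend
# and attach precision weights to the log-CPM values.
dge <- DGEList(counts = counts)
dge <- calcNormFactors(dge)
v <- voom(dge, design)

# Weighted linear model per feature, followed by empirical Bayes moderation.
fit <- eBayes(lmFit(v, design))

# Full table for the treated-vs-blank coefficient.
tt <- topTable(fit, coef = "groupHFO_100", number = Inf)
head(tt)
```

Because the linear model is fit once for all coefficients, each further comparison is just another `topTable()` call on a different `coef`.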

