Hi, I have a question about core usage in the DESeq2 differential expression pipeline. The issue occurs when I call the nbinomLRT function.
The object I pass to the function is:
> ddsDisp
class: DESeqDataSet
dim: 843 100
metadata(1): version
assays(2): counts mu
rownames(843): OTU_2 OTU_3 ... OTU_970 OTU_971
rowData names(9): baseMean baseVar ... dispOutlier dispMAP
colnames(100): Sample_1_grp1 Sample_2_grp1 ... Sample_99_grp2 Sample_100_grp2
colData names(3): grp NF.poscounts sizeFactor
So it is a count matrix with 843 rows and 100 samples: 50 from experimental condition grp1 and 50 from grp2.
I call the function like this:
nbinomLRT(ddsDisp, reduced = ~ 1, full = ~ grp)
Since I have to run many simulations on a server, I need all calculations to stay on a single core. So at the beginning of the script I call:
register(SerialParam())
But that is not what happens: when the script reaches this function, all 20 cores of the server are saturated and it takes more than 7 minutes to return (for an 843x100 matrix, isn't that strange?)
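A quick way to check whether the saturation comes from the linear algebra library rather than from BiocParallel is to time a plain base-R matrix operation with no DESeq2 involved. This is only a rough sketch: the matrix size is arbitrary and the interpretation of the ratio is a heuristic assumption, not a DESeq2 or BiocParallel feature.

```r
# Time a plain BLAS-backed operation; no DESeq2 or BiocParallel involved.
set.seed(1)
m <- matrix(rnorm(1500 * 1500), nrow = 1500)  # arbitrary test size

timing <- system.time(xtx <- crossprod(m))    # t(m) %*% m, computed by the BLAS
print(timing)

# Heuristic: user CPU time well above elapsed wall-clock time suggests
# that the BLAS ran on several threads at once.
ratio <- timing[["user.self"]] / max(timing[["elapsed"]], 1e-6)
print(ratio)
```

If the ratio stays close to 1 here, the BLAS is probably single-threaded and the extra core usage has to come from somewhere else.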
I have also tried calling the DESeq wrapper instead of the separate functions:
ddsRes <- DESeq(object = dds, test = "LRT", reduced = ~1, full = ~ grp, parallel = FALSE)
# or even this
ddsRes <- DESeq(object = dds, test = "LRT", reduced = ~1, full = ~ grp, parallel = TRUE, BPPARAM = MulticoreParam(1))
My guess is that, during the QR decomposition inside nbinomLRT, the sample size of the dataset (100) is somehow too big and all cores get involved, because with smaller sample sizes (10, 20, 50) the problem does not occur. That is why I tried setting the useQR option to FALSE; this reduced the waiting time, but all cores were still in use. Is there something I can do to avoid using all cores?
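One possibility worth testing, given that the sessionInfo() in this post shows Microsoft R Open with RevoUtilsMath attached, is that the threads come from a multithreaded BLAS (Intel MKL) rather than from BiocParallel, in which case register(SerialParam()) would have no effect on them. The sketch below is an assumption along those lines, not a confirmed diagnosis: limit_blas_threads is a hypothetical helper name, and the environment variables are best effort because many BLAS builds only read them at startup.

```r
# Hypothetical helper (not part of DESeq2 or BiocParallel): try to pin
# BLAS/OpenMP threading to a fixed number of threads.
limit_blas_threads <- function(n = 1L) {
  n <- as.character(as.integer(n))
  # Read by OpenMP, MKL and OpenBLAS. Most reliable when exported in the
  # shell before R starts; setting them here is best effort only.
  Sys.setenv(OMP_NUM_THREADS = n,
             MKL_NUM_THREADS = n,
             OPENBLAS_NUM_THREADS = n)
  # Microsoft R Open ships RevoUtilsMath, which exposes setMKLthreads().
  # Skipped silently on R installations that do not have it.
  if (requireNamespace("RevoUtilsMath", quietly = TRUE)) {
    RevoUtilsMath::setMKLthreads(as.integer(n))
  }
  invisible(as.integer(n))
}

# Call once at the top of the simulation script, before any model fitting.
limit_blas_threads(1L)
```

On Microsoft R Open, RevoUtilsMath::getMKLthreads() can be used afterwards to confirm the setting took effect.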
Here is my sessionInfo() (I know there is a newer version of R, but on the server I have to use this one :( ):
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS

Matrix products: default
BLAS: /opt/microsoft/ropen/3.4.4/lib64/R/lib/libRblas.so
LAPACK: /opt/microsoft/ropen/3.4.4/lib64/R/lib/libRlapack.so

locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8
[6] LC_MESSAGES=C.UTF-8 LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] crayon_1.3.4 bindrcpp_0.2.2 Seurat_2.3.0 cowplot_0.9.3
[5] ggplot2_3.0.0 scde_1.99.1 flexmix_2.3-13 lattice_0.20-35
[9] MAST_1.4.1 genefilter_1.60.0 AUC_0.3.0 BiocParallel_1.12.0
[13] zinbwave_1.0.0 SingleCellExperiment_1.0.0 samr_2.0 impute_1.52.0
[17] ROCR_1.0-7 gplots_3.0.1 reshape2_1.4.3 plyr_1.8.4
[21] phyloseq_1.22.3 metagenomeSeq_1.20.1 RColorBrewer_1.1-2 glmnet_2.0-16
[25] foreach_1.4.4 Matrix_1.2-14 DESeq2_1.20.0 SummarizedExperiment_1.8.1
[29] DelayedArray_0.4.1 matrixStats_0.54.0 Biobase_2.38.0 GenomicRanges_1.30.3
[33] GenomeInfoDb_1.14.0 IRanges_2.12.0 S4Vectors_0.16.0 BiocGenerics_0.24.0
[37] edgeR_3.20.9 limma_3.34.9 RevoUtils_10.0.9 RevoUtilsMath_10.0.1

loaded via a namespace (and not attached):
[1] SparseM_1.77 prabclus_2.2-6 ModelMetrics_1.2.0 R.methodsS3_1.7.1
[5] tidyr_0.8.1 acepack_1.4.1 bit64_0.9-7 knitr_1.20
[9] irlba_2.3.2 R.utils_2.7.0 Rook_1.1-1 data.table_1.11.8
[13] rpart_4.1-13 RCurl_1.95-4.11 metap_1.0 snow_0.4-3
[17] RSQLite_2.1.1 RANN_2.6 VGAM_1.0-6 proxy_0.4-22
[21] bit_1.1-14 lubridate_1.7.4 assertthat_0.2.0 gower_0.1.2
[25] RMTstat_0.3 hms_0.4.2 DEoptimR_1.0-8 caTools_1.17.1.1
[29] readxl_1.1.0 igraph_1.2.2 DBI_1.0.0 geneplotter_1.56.0
[33] htmlwidgets_1.3 ddalpha_1.3.4 RcppArmadillo_0.9.100.5.0 purrr_0.2.5
[37] dplyr_0.7.6 backports_1.1.2 permute_0.9-4 trimcluster_0.1-2.1
[41] annotate_1.56.2 gbRd_0.4-11 quantreg_5.36 Cairo_1.5-9
[45] abind_1.4-5 caret_6.0-80 withr_2.1.2 sfsmisc_1.1-2
[49] robustbase_0.93-3 checkmate_1.8.5 vegan_2.5-2 mclust_5.4.1
[53] softImpute_1.4 cluster_2.0.7-1 gsl_1.9-10.3 segmented_0.5-3.0
[57] ape_5.2 ADGofTest_0.3 diffusionMap_1.1-0.1 lazyeval_0.2.1
[61] recipes_0.1.3 pkgconfig_2.0.2 nlme_3.1-131.1 nnet_7.3-12
[65] bindr_0.1.1 rlang_0.2.2 diptest_0.75-7 pls_2.7-0
[69] MatrixModels_0.4-1 extRemes_2.0-9 doSNOW_1.0.16 cellranger_1.1.0
[73] lmtest_0.9-36 distillery_1.0-4 carData_3.0-2 zoo_1.8-4
[77] base64enc_0.1-3 ggridges_0.5.1 png_0.1-7 rjson_0.2.20
[81] stabledist_0.7-1 bitops_1.0-6 R.oo_1.22.0 Lmoments_1.2-3
[85] KernSmooth_2.23-15 Biostrings_2.46.0 blob_1.1.1 DRR_0.0.3
[89] lars_1.2 stringr_1.3.1 brew_1.0-6 scales_1.0.0
[93] ica_1.0-2 memoise_1.1.0 magrittr_1.5 bibtex_0.4.2
[97] gdata_2.18.0 zlibbioc_1.24.0 compiler_3.4.4 lsei_1.2-0
[101] pcaMethods_1.70.0 dimRed_0.1.0 fitdistrplus_1.0-11 ade4_1.7-13
[105] dtw_1.20-1 XVector_0.18.0 pbapply_1.3-4 htmlTable_1.12
[109] magic_1.5-9 Formula_1.2-3 MASS_7.3-49 mgcv_1.8-23
[113] tidyselect_0.2.5 stringi_1.2.4 forcats_0.3.0 copula_0.999-18
[117] yaml_2.2.0 locfit_1.5-9.1 latticeExtra_0.6-28 grid_3.4.4
[121] tools_3.4.4 rio_0.5.10 rstudioapi_0.8 foreign_0.8-69
[125] gridExtra_2.3 prodlim_2018.04.18 scatterplot3d_0.3-41 Rtsne_0.13
[129] digest_0.6.18 FNN_1.1.2.1 lava_1.6.3 fpc_2.1-11.1
[133] Rcpp_0.12.19 car_3.0-2 broom_0.5.0 SDMTools_1.1-221
[137] AnnotationDbi_1.40.0 npsurv_0.4-0 kernlab_0.9-27 Rdpack_0.10-1
[141] colorspace_1.3-2 ranger_0.10.1 XML_3.98-1.16 CVST_0.2-2
[145] splines_3.4.4 RcppRoll_0.3.0 multtest_2.34.0 xtable_1.8-3
[149] jsonlite_1.5 geometry_0.3-6 timeDate_3043.102 modeltools_0.2-22
[153] ipred_0.9-7 tclust_1.4-1 R6_2.2.2 Hmisc_4.1-1
[157] pillar_1.3.0 htmltools_0.3.6 glue_1.3.0 pspline_1.0-18
[161] class_7.3-14 codetools_0.2-15 tsne_0.1-3 pcaPP_1.9-73
[165] mvtnorm_1.0-8 tibble_1.4.2 mixtools_1.1.0 numDeriv_2016.8-1
[169] curl_3.2 gtools_3.8.1 zip_1.0.0 openxlsx_4.1.0
[173] survival_2.41-3 biomformat_1.6.0 munsell_0.5.0 rhdf5_2.22.0
[177] GenomeInfoDbData_1.0.0 iterators_1.0.10 haven_1.1.2 gtable_0.2.0
Thank you in advance for your help,
Matteo
Thank you very much Davide, you solved my problem. :)