Hi,
I am trying to run DiffBind with parallel execution, but it does not detect multiple cores, even though I can detect them with:
> parallel::detectCores()
[1] 8
Here's what I see when I try to run DiffBind on a test set:
> test = dba(sampleSheet = "TEST.xls")
wt_D_1 wt D 1 bayes
wt_D_2 wt D 2 bayes
> test.counts = dba.count(test, minOverlap=1)
Sample: 01_wt_D_1_BAM_MD_asBED.bed125
Sample: 02_wt_D_2_BAM_MD_asBED.bed125
Sample: Input_files/25_wt_D_INP_BAM_MD_asBED.bed125
Warning message:
In dba.multicore.init(DBA$config) : Parallel execution unavailable: executing serially.
What should I do to run dba.count in parallel? It would reduce the analysis time a lot for me. If you need any more info or me to run any other diagnostic command, please ask.
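In case it is useful, this is roughly how I understood parallel execution is requested. RunParallel and bParallel are the names I found in the DiffBind documentation (?dba and ?dba.count); the rest is just my guess at the intended usage, so please correct me if this is not how it is meant to be done:

library(DiffBind)

test <- dba(sampleSheet = "TEST.xls")
test$config$RunParallel <- TRUE                  # config flag described in ?dba
test.counts <- dba.count(test, minOverlap = 1, bParallel = TRUE)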
Thanks!
PS: Here is my sessionInfo() output in case it helps
> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252 LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] DiffBind_2.4.8 SummarizedExperiment_1.6.5 DelayedArray_0.2.7 matrixStats_0.52.2 Biobase_2.36.2
[6] GenomicRanges_1.28.6 GenomeInfoDb_1.12.3 IRanges_2.10.5 S4Vectors_0.14.7 BiocGenerics_0.22.1
loaded via a namespace (and not attached):
[1] edgeR_3.18.1 bit64_0.9-7 splines_3.4.0 gtools_3.5.0 assertthat_0.2.0
[6] latticeExtra_0.6-28 amap_0.8-14 RBGL_1.52.0 blob_1.1.0 GenomeInfoDbData_0.99.0
[11] Rsamtools_1.28.0 ggrepel_0.7.0 Category_2.42.1 pillar_1.1.0 RSQLite_2.0
[16] backports_1.1.2 lattice_0.20-35 glue_1.2.0 limma_3.32.10 digest_0.6.14
[21] RColorBrewer_1.1-2 XVector_0.16.0 checkmate_1.8.5 colorspace_1.3-2 Matrix_1.2-12
[26] plyr_1.8.4 GSEABase_1.38.2 XML_3.98-1.9 pkgconfig_2.0.1 pheatmap_1.0.8
[31] ShortRead_1.34.2 biomaRt_2.32.1 genefilter_1.58.1 zlibbioc_1.22.0 xtable_1.8-2
[36] GO.db_3.4.1 scales_0.5.0 brew_1.0-6 gdata_2.18.0 BiocParallel_1.10.1
[41] tibble_1.4.1 annotate_1.54.0 ggplot2_2.2.1 GenomicFeatures_1.28.5 lazyeval_0.2.1
[46] XLConnect_0.2-13 magrittr_1.5 survival_2.41-3 memoise_1.1.0 systemPipeR_1.10.2
[51] gplots_3.0.1 hwriter_1.3.2 GOstats_2.42.0 graph_1.54.0 tools_3.4.0
[56] data.table_1.10.4-3 BBmisc_1.11 sendmailR_1.2-1 munsell_0.4.3 locfit_1.5-9.1
[61] bindrcpp_0.2 AnnotationDbi_1.38.2 Biostrings_2.44.2 compiler_3.4.0 caTools_1.17.1
[66] rlang_0.1.6 grid_3.4.0 RCurl_1.95-4.10 rjson_0.2.15 AnnotationForge_1.18.2
[71] base64enc_0.1-3 bitops_1.0-6 gtable_0.2.0 DBI_0.7 R6_2.2.2
[76] GenomicAlignments_1.12.2 dplyr_0.7.4 rtracklayer_1.36.6 bit_1.1-12 bindr_0.1
[81] XLConnectJars_0.2-13 KernSmooth_2.23-15 rJava_0.9-9 stringi_1.1.6 BatchJobs_1.7
[86] Rcpp_0.12.14
Thanks for the quick reply!
I do not know if it's appropriate, but instead of starting a new thread I wanted to ask you a follow-up question.
I have a large dataset (32 files), and when I run the dba.count command it sometimes skips some files and does not count their reads. I have run it several times, and every time the files that get "skipped" are different. I do not know if I am running out of memory or what else could cause this behavior. I have resorted to re-running it until all the files are read, but that is very time-consuming.
This is an example of the message I obtain after running dba.count:
Hmm. Since you are running serially anyway, it may be best to set
bParallel=FALSE
and see if that is any better. This is a tough one to debug remotely. One thing you could try if you get desperate is to set:
and then type "c" whenever it stops.
I tried it with bParallel=FALSE but I am still getting files skipped. Sometimes it's just one, sometimes it's 10 of them.
I am running the debug. What should I look for? Does it give an automated report at the end?
Alternatively, I could add a line so it stops after finding any library with size 0. At least that way I do not have to wait until it has processed all the files to know if it has read them.
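Even a check after the fact would help. Something along these lines is the kind of thing I had in mind (a rough sketch; I am assuming the counted object is called test.counts and that dba.show() reports the library sizes in a column named Reads):

info <- dba.show(test.counts)                    # per-sample summary after counting
zero.libs <- info$ID[info$Reads == 0]            # samples whose library size came back as 0
if (length(zero.libs) > 0) {
  stop("Counting appears to have skipped these samples: ",
       paste(zero.libs, collapse = ", "))
}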
Thanks a lot for your help!