Hi, Dr Stark
It seems that parallel execution does not work in DiffBind when running under Windows.
library(DiffBind)
library(BiocParallel)
> tamoxifen <- dba(sampleSheet="tamoxifen.csv")
BT4741 BT474 ER Resistant Full-Media 1 bed
BT4742 BT474 ER Resistant Full-Media 2 bed
MCF71 MCF7 ER Responsive Full-Media 1 bed
MCF72 MCF7 ER Responsive Full-Media 2 bed
MCF73 MCF7 ER Responsive Full-Media 3 bed
T47D1 T47D ER Responsive Full-Media 1 bed
T47D2 T47D ER Responsive Full-Media 2 bed
MCF7r1 MCF7 ER Resistant Full-Media 1 bed
MCF7r2 MCF7 ER Resistant Full-Media 2 bed
ZR751 ZR75 ER Responsive Full-Media 1 bed
ZR752 ZR75 ER Responsive Full-Media 2 bed
> tamoxifen <- dba.count(tamoxifen, summits=250)
Computing summits...
Sample: reads/Chr18_BT474_ER_1.bam125
Sample: reads/Chr18_BT474_ER_2.bam125
Sample: reads/Chr18_MCF7_ER_1.bam125
Sample: reads/Chr18_MCF7_ER_2.bam125
Sample: reads/Chr18_MCF7_ER_3.bam125
Sample: reads/Chr18_T47D_ER_1.bam125
Sample: reads/Chr18_T47D_ER_2.bam125
Sample: reads/Chr18_TAMR_ER_1.bam125
Sample: reads/Chr18_TAMR_ER_2.bam125
Sample: reads/Chr18_ZR75_ER_1.bam125
Sample: reads/Chr18_ZR75_ER_2.bam125
Sample: reads/Chr18_BT474_input.bam125
Sample: reads/Chr18_MCF7_input.bam125
Sample: reads/Chr18_T47D_input.bam125
Sample: reads/Chr18_TAMR_input.bam125
Sample: reads/Chr18_ZR75_input.bam125
Re-centering peaks...
Reads will be counted as Single-end.
Sample: reads/Chr18_BT474_ER_1.bam125
Sample: reads/Chr18_BT474_ER_2.bam125
Sample: reads/Chr18_MCF7_ER_1.bam125
Sample: reads/Chr18_MCF7_ER_2.bam125
Sample: reads/Chr18_MCF7_ER_3.bam125
Sample: reads/Chr18_T47D_ER_1.bam125
Sample: reads/Chr18_T47D_ER_2.bam125
Sample: reads/Chr18_TAMR_ER_1.bam125
Sample: reads/Chr18_TAMR_ER_2.bam125
Sample: reads/Chr18_ZR75_ER_1.bam125
Sample: reads/Chr18_ZR75_ER_2.bam125
Sample: reads/Chr18_BT474_input.bam125
Sample: reads/Chr18_MCF7_input.bam125
Sample: reads/Chr18_T47D_input.bam125
Sample: reads/Chr18_TAMR_input.bam125
Sample: reads/Chr18_ZR75_input.bam125
Warning messages:
1: Parallel execution unavailable: executing serially.
2: UNKNOWN PARALLEL PACKAGE: 0
sessionInfo()
#> R version 4.0.2 (2020-06-22)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19041)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=Chinese (Simplified)_China.936
#> [2] LC_CTYPE=Chinese (Simplified)_China.936
#> [3] LC_MONETARY=Chinese (Simplified)_China.936
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=Chinese (Simplified)_China.936
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> loaded via a namespace (and not attached):
#> [1] compiler_4.0.2 magrittr_2.0.1 tools_4.0.2 htmltools_0.5.0
#> [5] yaml_2.2.1 stringi_1.5.3 rmarkdown_2.6 highr_0.8
#> [9] knitr_1.30 stringr_1.4.0 xfun_0.20 digest_0.6.27
#> [13] rlang_0.4.10 evaluate_0.14
Created on 2021-01-28 by the reprex package (v0.3.0)
Best wishes
Guandong Shang
It is not correct that BiocParallel does not support parallel evaluation on Windows. Windows doesn't support forked (MulticoreParam()) evaluation, but it does support parallel evaluation in independent processes (SnowParam()). There are some complications, in particular because the workers are completely independent, implying that they must load DiffBind (and pay the time cost of loading it) each time a new bplapply() is evaluated. There are some hints in the vignette, but I'll try to improve that with some additional guidance.
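To illustrate the SnowParam() point, here is a minimal sketch of running bplapply() on independent worker processes, which should also work on Windows. It only exercises BiocParallel itself. The worker count of 2 and the toy squaring function are my own assumptions for illustration, and whether DiffBind's internal counting picks up this backend is not something the sketch shows.

```r
library(BiocParallel)

# SnowParam() launches independent R processes rather than forking,
# so it is available on Windows as well as Linux/macOS.
# The worker count here is an arbitrary choice for illustration.
param <- SnowParam(workers = 2)

# Start the workers once up front so that several bplapply() calls
# can reuse them instead of launching new processes each time.
bpstart(param)

# Each worker is a fresh R session, so any package the function needs
# has to be loaded inside the function (or otherwise on the worker).
res <- bplapply(1:4, function(i) {
    # library(DiffBind)  # would go here if the function used DiffBind
    i^2
}, BPPARAM = param)

# Shut the workers down when finished.
bpstop(param)
```

Starting the cluster explicitly with bpstart() and reusing it is one way to reduce the per-call start-up cost mentioned above; the workers still need to load any packages at least once.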