Question: bplapply slow to decide number of cores
Charles Plessy wrote:

I am surprised that bplapply() takes a few seconds to detect the number of available cores, while multicoreWorkers() or MulticoreParam() are fast when they run by themselves:

> system.time(BiocParallel::bplapply(1:1e2 , function(x) order(rnorm(n=1e3)), BPPARAM = MulticoreParam()))
   user  system elapsed
  0.060   0.180   4.033

Multicore setup is fast if the number of cores is fixed:

> system.time(BiocParallel::bplapply(1:1e2 , function(x) order(rnorm(n=1e3)), BPPARAM = MulticoreParam(1)))
   user  system elapsed
  0.036   0.004   0.042

Slow if the choice is delegated to multicoreWorkers():

> system.time(BiocParallel::bplapply(1:1e2 , function(x) order(rnorm(n=1e3)), BPPARAM = MulticoreParam(multicoreWorkers())))
   user  system elapsed
  0.056   0.140   4.034

But by itself, multicoreWorkers() is fast!

> system.time(multicoreWorkers())
   user  system elapsed
  0.000   0.032   0.037

I am running BiocParallel 1.10.0.

Answer: bplapply slow to decide number of cores
Martin Morgan wrote:

The 'fast' version is fast only because you've set the number of cores to 1; BiocParallel cheats in this circumstance and just does lapply locally. For the other versions the cost is creating the forked processes and sending data to and from the workers. For me the time is < 0.5 s, so I'm surprised at how slow your computations seem to be. You could separate out the setup costs (and amortize them over several uses in a script, for instance) with

param = MulticoreParam()
bpstart(param)

and then separately

bplapply(seq_len(100), function(x) order(rnorm(n=1e3)), BPPARAM=param)

with a final clean-up

bpstop(param)

It would be helpful to know your sessionInfo().
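The steps above can be combined into one script that times worker setup and computation separately. This is a minimal sketch, assuming BiocParallel is installed and the platform supports forking (so not Windows); the worker count of 2 and the variable names `setup_time`, `work_time` and `res` are illustrative choices, not part of the original answer.

```r
library(BiocParallel)

# Illustrative worker count (an assumption); MulticoreParam() with no
# arguments would use multicoreWorkers() instead.
param <- MulticoreParam(workers = 2)

# Time the one-off cost of starting the forked worker processes.
setup_time <- system.time(bpstart(param))

# Time the computation on workers that are already running.
work_time <- system.time(
    res <- bplapply(seq_len(100), function(x) order(rnorm(n = 1e3)),
                    BPPARAM = param)
)

# Final clean-up: shut the workers down.
bpstop(param)

print(setup_time[["elapsed"]])
print(work_time[["elapsed"]])
```

If most of the elapsed time shows up in `setup_time`, reusing one started `param` across several `bplapply()` calls spreads that cost over all of them.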

Indeed, the speed has a lot to do with the session. I tried the bplapply command in a fresh session and it took 0.6 s. Then I loaded GenomicRanges and it took 1.4 s; then I added SummarizedExperiment, MultiAssayExperiment and rtracklayer, and the elapsed time rose to 2.0, 2.3 and 2.8 s respectively!
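One way to reproduce this kind of measurement is to time the same call after each package is loaded. The sketch below is only an illustration: `time_bplapply()` is a hypothetical helper, it should be run in a fresh R session, and any of the listed packages that is not installed is simply skipped. The numbers will differ from machine to machine.

```r
library(BiocParallel)

# Hypothetical helper: elapsed seconds for the bplapply() call from the question.
time_bplapply <- function() {
    system.time(
        bplapply(seq_len(100), function(x) order(rnorm(n = 1e3)),
                 BPPARAM = MulticoreParam())
    )[["elapsed"]]
}

# Packages mentioned in the reply, loaded one after another.
pkgs <- c("GenomicRanges", "SummarizedExperiment",
          "MultiAssayExperiment", "rtracklayer")

# Baseline timing before any of the packages is loaded.
timings <- c(baseline = time_bplapply())

for (pkg in pkgs) {
    # Skip packages that are not installed rather than failing.
    if (requireNamespace(pkg, quietly = TRUE)) {
        suppressPackageStartupMessages(library(pkg, character.only = TRUE))
        timings[pkg] <- time_bplapply()
    }
}

print(timings)
```

Reporting `sessionInfo()` alongside these timings shows which packages were attached when each measurement was taken.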

Reply written 16 months ago by Charles Plessy