Question: Parallel Join Excessively Slow
3.0 years ago by
Boston University, Boston, MA
Lucas Schiffer wrote:

Within the curatedMetagenomicData package, parallelization was used to improve performance. Profiling showed, however, that the parallel version was actually slower than the same tasks run serially. The result is difficult to make sense of, and a small example has been constructed here to reproduce the scenario. Any helpful comments would be welcome.

Answer: Parallel Join Excessively Slow
3.0 years ago by
Martin Morgan ♦♦ 24k
United States
Martin Morgan ♦♦ 24k wrote:

Here are several examples that illustrate the costs of parallel evaluation. First, a serial baseline:

> library(BiocParallel)
> v = integer(1e8)
> system.time(lapply(1:8, function(i, v) i, v))
   user  system elapsed 
  0.004   0.000   0.001 

Cost of starting up the nodes

> system.time(bplapply(1:8, function(i, v) i))
   user  system elapsed 
  0.148   0.012   0.481 
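One way to keep this start-up cost from being paid on every call is to start the workers once and reuse them. The sketch below assumes a socket cluster with two workers (`SnowParam(workers = 2)` is an illustrative choice, not something taken from the question) and uses `bpstart()` and `bpstop()` from BiocParallel:

```r
library(BiocParallel)

# Assumption: two socket workers; adjust to the machine at hand.
param <- SnowParam(workers = 2)

bpstart(param)   # pay the worker start-up cost once

# Both calls reuse the workers that are already running.
res1 <- bplapply(1:8, function(i) i, BPPARAM = param)
res2 <- bplapply(1:8, function(i) i * 2L, BPPARAM = param)

bpstop(param)    # shut the workers down when finished
```

Reusing a started back end only removes the start-up component; the data transfer costs shown in the following examples still apply.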

Cost of transferring data to the workers

> system.time(bplapply(1:8, function(i, v) i, v))
   user  system elapsed 
  0.092   0.476   1.727 

Cost of retrieving data from the workers

> system.time(bplapply(1:8, function(i, v) v, v))
   user  system elapsed 
  0.600   1.704   3.378 
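A rough way to see why the two transfer examples are expensive is to measure how many bytes a vector occupies once serialized, since that is the form in which data typically travels between separate R processes. This is a minimal sketch using base R only; it uses a vector 100 times smaller than `v` above to keep it quick:

```r
# 1e6 integers at 4 bytes each; the vector above has 1e8 elements.
v_small <- integer(1e6)

# serialize() with connection = NULL returns a raw vector,
# so its length is the number of bytes that would be sent.
n_bytes <- length(serialize(v_small, NULL))

n_bytes / 1e6   # size in MB, a little over 4
```

Scaled up to `integer(1e8)`, each serialized copy is roughly 400 MB, and the last example above returns such a copy for every one of the 8 tasks.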

and of course the dominant cost, iteration instead of vectorization

> system.time(1:8)
   user  system elapsed 
      0       0       0 
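The same point can be made with a calculation that does some real work. The sketch below, in base R only, computes a square root element by element and then in one vectorized call; `sqrt()` and the input size are illustrative choices, and timings will vary by machine:

```r
x <- runif(1e5)

# Iteration: one function call per element.
t_iter <- system.time(
    y_iter <- vapply(x, function(xi) sqrt(xi), numeric(1))
)

# Vectorization: a single call over the whole vector.
t_vec <- system.time(
    y_vec <- sqrt(x)
)

# Same answer either way; only the cost differs.
stopifnot(isTRUE(all.equal(y_iter, y_vec)))

rbind(iteration = t_iter, vectorized = t_vec)
```

If the iterated version is then distributed across workers, the per-element overhead is kept and the communication costs above are added on top.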

It seems likely that you've replaced a vectorized calculation with an iteration, and are moving large amounts of data to and from the workers.

bpvec() might be a better fit for your needs. More generally, iterating over n assays implies potentially polynomial (here roughly quadratic) scaling: the first assay is copied in the first iteration, the first and second assays in the second iteration, the first, second, and third assays in the third iteration, and so on, for about n(n+1)/2 assay copies in total. It would be better to develop a more efficient algorithm.
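As a minimal sketch of what `bpvec()` does: it splits a vector into one chunk per worker, applies a vectorized function to each chunk, and concatenates the results, so the vectorized call is kept rather than replaced by per-element iteration. The function (`sqrt()`) and the input are illustrative assumptions, and the default registered back end is used:

```r
library(BiocParallel)

x <- runif(1e5)

# Each worker receives a contiguous chunk of x and calls sqrt() on it once.
y <- bpvec(x, sqrt)

# The chunked result matches the plain vectorized call.
stopifnot(identical(y, sqrt(x)))
```

For a function as cheap as `sqrt()`, the plain vectorized call will usually still be faster; `bpvec()` is worth trying only when the per-chunk work is large relative to the cost of moving the data.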

