Question: BiocParallel not proceeding
petyuk70 (United States) wrote, 13 months ago:

After refreshing my R installation and the packages, this simple example of BiocParallel::bplapply doesn't work properly.  It looks like it starts, but never finishes.

Thanks for taking a look into it!

Vlad

 

> library(BiocParallel)
> fun <- function(v) {
+     message("working") ## 10 tasks
+     sqrt(v)
+ }
> bplapply(1:10, fun)

# it does not finish (at least in a minute or so)

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.4 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] BiocParallel_1.6.6

loaded via a namespace (and not attached):
[1] parallel_3.3.1 tools_3.3.1   
> 
Modified 12 months ago by Martin Morgan ♦♦ 20k • written 13 months ago by petyuk70

  1. Upgrading to BiocParallel_1.7.9 didn't help.
  2. It works fine on a Windows machine, though.

 

Reply written 13 months ago by petyuk70

It hangs at this line in bplapply:

BPPARAM <- bpstart(BPPARAM, length(X))

Reply written 13 months ago by petyuk70

Within bpstart, the issue is at:

bpbackend(x) <- .bpmakeForkCluster(nnodes, bptimeout(x), getOption("ports", NA_integer_))

Reply written 13 months ago by petyuk70

A likely candidate is that ports on the Mac are not accessible; I'm not familiar enough with how things work, but I think the overall strategy would be to identify open ports and use these via options(ports=123).

Reply written 13 months ago by Martin Morgan ♦♦ 20k
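
One way to act on this suggestion is sketched below. It is only a sketch under stated assumptions: port_is_free() is a hypothetical helper (not part of BiocParallel), it relies on serverSocket(), which base R provides from version 4.0.0 onward (newer than the R 3.3.1 used in this thread), and a port that can be bound at check time is not guaranteed to stay free afterwards. The "ports" option name is the one read by getOption("ports", NA_integer_) in the bpstart line quoted above.

```r
# Hypothetical helper: TRUE if a listening socket can be bound to `port`.
# Assumes R >= 4.0.0, where base R provides serverSocket().
port_is_free <- function(port) {
    sock <- tryCatch(
        suppressWarnings(serverSocket(port)),
        error = function(e) NULL
    )
    if (is.null(sock)) return(FALSE)
    close(sock)
    TRUE
}

# Scan the 11001..12000 range (the range .bpmakeForkCluster samples from)
# and keep the first port that could be bound.
candidates <- 11000L + seq_len(1000L)
free_port <- NULL
for (p in candidates) {
    if (port_is_free(p)) {
        free_port <- p
        break
    }
}

# Hand the port to BiocParallel through the global option read in bpstart.
if (!is.null(free_port)) options(ports = free_port)
```

If no candidate can be bound, free_port stays NULL and the option is left untouched.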

Not sure if this is relevant, but this is what I get (regardless of the port). Maybe this is the expected behavior, though, since no process is communicating on port 11709. Anyway, BiocParallel:::.bpmakeForkCluster is not cooperating with me today.

> port <- 11000L + sample(1000L, 1L)
> port
[1] 11709
> socketConnection("localhost", port=port, TRUE, TRUE, "a+b", timeout = 3)
Error in socketConnection("localhost", port = port, TRUE, TRUE, "a+b",  : 
  cannot open the connection
In addition: Warning message:
In socketConnection("localhost", port = port, TRUE, TRUE, "a+b",  :
  problem in listening on this socket
Reply modified 13 months ago • written 13 months ago by petyuk70

What is odd is that it starts with MulticoreParam(), which is the FORK type, but later switches to the SOCK type. I thought SOCK was for Windows, not for POSIX systems.

Reply written 13 months ago by petyuk70

The 'SOCK' class of the return value is a red herring here. A simple fork isn't rich enough to support the BiocParallel behavior, which requires communication between the forked and master process. Sockets are used for that communication.

Reply written 13 months ago by Martin Morgan ♦♦ 20k
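
The same pattern can be seen in base R's parallel package, which is used here only as an analogy and not as BiocParallel's own code. On Unix-alikes, makeForkCluster() forks its workers but talks to them over sockets, so the cluster object carries a socket-cluster class. A minimal sketch, assuming a Unix-alike system where forking and local socket connections are permitted:

```r
library(parallel)

# Unix-alikes only: forked workers are not available on Windows.
cl <- makeForkCluster(2L)

# The workers are forked, but the class of the cluster object reflects
# the socket-based communication channel between master and workers.
cluster_class <- class(cl)
print(cluster_class)

# The cluster is used like any other parallel cluster.
res <- parLapply(cl, 1:4, sqrt)

stopCluster(cl)
```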

 

If I intercept the execution and rename the host from "WE25743" (the name of my machine) to "localhost", then .bpmakeForkCluster proceeds just fine. Hope that helps.

 

debugging in: BiocParallel:::.bpmakeForkCluster(8, 10, NA_integer_)
debug: {
    nnodes <- as.integer(nnodes)
    if (is.na(nnodes) || nnodes < 1L)
        stop("'nnodes' must be >= 1")
    if (is.na(port))
        port <- 11000L + sample(1000L, 1L)
    else if (length(port) != 1L)
        stop("'port' must be integer(1)")
    host <- Sys.info()[["nodename"]]
    cl <- vector("list", nnodes)
    for (rank in seq_len(nnodes)) {
        .bpmakeForkChild(host, port, rank, timeout)
        cl[[rank]] <- .bpconnectForkChild(host, port, rank, timeout)
    }
    class(cl) <- c("SOCKcluster", "cluster")
    cl
}

 

Reply written 13 months ago by petyuk70
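
A quick way to check whether the machine name is the culprit is to see whether it resolves to an address at all. The sketch below is a diagnostic under stated assumptions: resolve_host() is a hypothetical helper, and it relies on utils::nsl(), which is available on Unix-alikes only and returns NULL when the lookup fails.

```r
# Hypothetical helper: resolve a host name to an IP address string,
# or NA if the lookup fails (or nsl() is unavailable, e.g. on Windows).
resolve_host <- function(name) {
    addr <- tryCatch(
        suppressWarnings(utils::nsl(name)),
        error = function(e) NULL
    )
    if (is.null(addr)) NA_character_ else addr
}

# The name .bpmakeForkCluster uses by default, per the debug output above.
host <- Sys.info()[["nodename"]]

message("nodename '", host, "' resolves to: ", resolve_host(host))
message("'localhost' resolves to: ", resolve_host("localhost"))
```

If the node name comes back as NA while "localhost" resolves, that would be consistent with the observation that renaming the host lets .bpmakeForkCluster proceed.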
Martin Morgan ♦♦ 20k (United States) wrote, 12 months ago:

Sorry for the delay and thanks for debugging. I can sort of guess at the issue and have provided a patched version in BiocParallel 1.8.1 (current Bioc release version 3.4) and 1.9.1 (bioc devel), available with either

BiocInstaller::biocLite("Bioconductor-mirror/BiocParallel@release-3.4")  ## if you're using Bioconductor v. 3.4, release
BiocInstaller::biocLite("Bioconductor-mirror/BiocParallel")  ## if you're using Bioconductor v. 3.5, devel

or after the next build (likely Sunday afternoon, eastern time) via biocLite("BiocParallel").

The workaround lets you set the host using a global option

options(bphost="localhost")

instead of the default Sys.info()[["nodename"]]. Set the option any time before calling bplapply(), bpvec(), or bpstart(). I'd appreciate hearing whether this works for you (and others); if so, I will make a more permanent change.
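
Putting the workaround together with the original example might look like the sketch below. It assumes a BiocParallel version that honors the bphost option (1.8.1 or 1.9.1 and later, per this answer). The requireNamespace() guard and the restoring of the previous option value are illustrative additions and not part of the suggested fix.

```r
# Set the host before any call to bplapply(), bpvec(), or bpstart(),
# remembering the previous value so it can be restored afterwards.
old <- options(bphost = "localhost")

res <- NULL
if (requireNamespace("BiocParallel", quietly = TRUE)) {
    fun <- function(v) {
        message("working")
        sqrt(v)
    }
    res <- BiocParallel::bplapply(1:10, fun)
}

# Restore the previous value of the option.
options(old)
```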
