sub-processes with Rmpi and R versions later than 3.6
hstern2 • 0
@hstern2-12803

I'm seeing a problem where the sub-processes spawned by Rmpi, snow, and BiocParallel do not terminate at the end of the program after a call to Rmpi::mpi.finalize().

The problem does not occur with an older version of R (3.4.4); it occurs only with the most recent 3.x versions and with 4.x.

Rmpi • 693 views

Can you provide a simple reproducible example, one that I can cut and paste into my session to see the problem?


library(BiocParallel)
bplapply(1:4, sqrt, BPPARAM = SnowParam(type = "MPI"))
Rmpi::mpi.finalize()

(This hangs on the call to Rmpi::mpi.finalize() with R 3.6.3 but works with R 3.4.4. Environment: Open MPI 1.10.7 on RHEL 7.6 Linux.)

The call to Rmpi::mpi.finalize() can be omitted, but mpirun then reports:

mpirun has exited due to process rank 0 with PID 16135 on node bhc0003 exiting improperly. There are three reasons this could occur:

  1. this process did not call "init" before exiting, but others in the job did. This can cause a job to hang indefinitely while it waits for all processes to call "init". By rule, if one process calls "init", then ALL processes must call "init" prior to termination.

  2. this process called "init", but exited without calling "finalize". By rule, all processes that call "init" MUST call "finalize" prior to exiting or it will be considered an "abnormal termination"

  3. this process called "MPI_Abort" or "orte_abort" and the mca parameter orte_create_session_dirs is set to false. In this case, the run-time cannot detect that the abort call was an abnormal termination. Hence, the only error message you will receive is this one.

This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here).

You can avoid this message by specifying -quiet on the mpirun command line.
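
To narrow down where the hang occurs, one variant worth trying starts and stops the workers explicitly before MPI is finalized. This is only a sketch and not a confirmed fix: the helper name run_mpi_repro, the n_workers argument, and the default of 4 workers are assumptions for illustration and do not come from the original report.

```r
# Hypothetical helper: runs the same bplapply() call as the example above,
# but manages the worker lifecycle explicitly with bpstart()/bpstop().
run_mpi_repro <- function(n_workers = 4L) {
  # Assumption: the worker count is arbitrary; the report did not specify one.
  param <- BiocParallel::SnowParam(workers = n_workers, type = "MPI")

  BiocParallel::bpstart(param)   # spawn the MPI workers
  res <- BiocParallel::bplapply(1:4, sqrt, BPPARAM = param)
  BiocParallel::bpstop(param)    # ask the workers to shut down

  unlist(res)
}

# Usage, from a script launched under mpirun:
#   print(run_mpi_repro())
#   print(sessionInfo())     # record R and package versions for the report
#   Rmpi::mpi.finalize()     # the call that hangs in the original example
```

If the hang persists after bpstop() has returned, that would suggest the problem lies in finalizing MPI itself rather than in workers that are still running.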
