DoparParam() execution fails when foreach() works
boris • 0

I'm trying to run DESeq2 on a SLURM cluster via clustermq, registered as a foreach (%dopar%) backend, like so:


library(foreach)
library(clustermq)
library(DESeq2)
library(doParallel)
library(BiocParallel)

# CLUSTER_MQ and FOREACH working well for a toy example

load("/fsx/home/bhayete/Projects/Deseq2DoPar/DeSeq_ToyData.RData")

# USING CLUSTERMQ TO PARALLELIZE DESEQ ------------------------ NOT WORKING


TIMEOUT = 10000
NJOBS = 100
options(
  clustermq.scheduler = "slurm",
  clustermq.template = 'slurmMq.tmpl',
  clustermq.data.warning=5000 #megabytes
)
register_dopar_cmq(n_jobs=NJOBS,
                   fail_on_error=FALSE,
                   verbose=TRUE,
                   log_worker=TRUE,
                   timeout = TIMEOUT, #how long to wait on MQ side
                   pkgs=c('BiocParallel', 'DESeq2'), 
                   template=list(
                      timeout=TIMEOUT, #how long to wait on SLURM side
                      memory=5000,
                      cores=1,#how many cores to use (to throttle down memory usage),
                      partition = 'compute-spot',
                      r_path = file.path(R.home("bin"), "R")
                   )  
)

dds <- DESeqDataSetFromMatrix(countData = Count_Filt, colData = Metadata, design = ~ CoarseCondition)
print(paste(getDoParWorkers(), "workers", sep = '_'))
doparam <- DoparParam()
# Define workers; otherwise only 1 worker will be used
doparam$workers <- NJOBS
register(doparam)
x = foreach(i=1:300) %dopar% sqrt(i)
x2 = bplapply(1:300, sqrt, BPPARAM = doparam, log_worker=TRUE)

The resulting output is shown below. Note that x is computed on the cluster correctly, while x2 fails. It is as though some internals of the S4 object for DESeq2 are not correctly exported to the cluster. What does this error mean, and has anyone been able to run bplapply over SLURM by this mechanism?

x = foreach(i=1:300) %dopar% sqrt(i)
Submitting 100 worker jobs (ID: cmq7587) ...
Running 300 calculations (1 objs/0 Mb common; 1 calls/chunk) ...
Master: [2.1s 20.9% CPU]; Worker: [avg 72.7% CPU, max 284.7 Mb]

x2 = bplapply(1:300, sqrt, BPPARAM = doparam, log_worker=TRUE)
Submitting 100 worker jobs (ID: cmq9511) ...
Running 100 calculations (1 objs/0 Mb common; 1 calls/chunk) ...
Master: [2.8s 9.7% CPU]; Worker: [avg 78.8% CPU, max 290.2 Mb]
Warning in summarize_result(job_result, n_errors, n_warnings, cond_msgs, :
  100/100 jobs failed (0 warnings)
(Error #1) could not find function ".bpworker_EXEC"
(Error #10) could not find function ".bpworker_EXEC"
(Error #100) could not find function ".bpworker_EXEC"
(Error #11) could not find function ".bpworker_EXEC"
(Error #12) could not find function ".bpworker_EXEC"
(Error #13) could not find function ".bpworker_EXEC"
(Error #14) could not find function ".bpworker_EXEC"
(Error #15) could not find function ".bpworker_EXEC"
(Error #17) could not find function ".bpworker_EXEC"
(Error #19) could not find function ".bpworker_EXEC"
(Error #2) could not find function ".bpworker_EXEC"
(Error #21) could not find function ".bpworker_EXEC"
(Error #3) could not find function ".bpworker_EXEC"
(Error #4) could not find function ".bpworker_EXEC"

clustermq doparallel DESeq2 • 528 views
ATpoint ★ 4.1k

I am not 100% sure what you're asking. The usual way to parallelize DESeq2 is the BPPARAM argument, which accepts a MulticoreParam object with a user-defined number of cores/workers. Why go through all the hassle of registering a backend rather than just passing a MulticoreParam object (MulticoreParam(workers=10), for example) to DESeq() and letting BiocParallel figure out everything under the hood (which is its purpose)?
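
As a rough sketch of that single-machine route (it does not address the SLURM setup in the question): the simulated dataset from makeExampleDESeqDataSet() and the worker count of 2 are assumptions for illustration, standing in for the poster's own Count_Filt and Metadata objects.

```r
library(DESeq2)
library(BiocParallel)

# Simulated counts stand in for the poster's Count_Filt / Metadata (assumption).
dds <- makeExampleDESeqDataSet(n = 500, m = 6)

# Two local workers; MulticoreParam relies on forking, so it targets Linux/macOS.
param <- MulticoreParam(workers = 2)

# DESeq() only uses BPPARAM when parallel = TRUE is also set.
dds <- DESeq(dds, parallel = TRUE, BPPARAM = param)
res <- results(dds, parallel = TRUE, BPPARAM = param)
```

Everything here runs on one node, so the number of workers is bounded by that machine's cores and memory.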


Registration allows me to use ClusterMQ as the backend for foreach/BiocParallel, linking to SLURM and unlocking far larger compute resources than possible with just one machine. I do use BPPARAM=DoparParam() in the process.
