Hello,
I have the following example code:
library(BiocParallel)
param <- BatchtoolsParam(workers=5, cluster="slurm", template=tmpl)
register(param)
## do work
FUN <- function(x, y) {
    library(pkg)  # works
}
xx <- bplapply(1:10, FUN)
FUN is an exported function in my package. FUN runs on independent workers, so I have to load pkg inside FUN, but I remember that library(pkg) is not allowed in a function body by rcmdcheck/BiocCheck. Where should I place library(pkg)?
Regards
Let me describe my question in more detail.
My package mypkg has a large function largeFun and a small function smallFun. Since jobs are submitted to remote, independent cluster nodes, I need library(mypkg) or mypkg::smallFun(x, y) on each of them. Otherwise smallFun is not found.
Do I have to use the double-colon form mypkg::smallFun(x, y)? If I use library(mypkg), BiocCheck::BiocCheck raises a warning: "The following files call library or require on mypkg. This is not necessary." What is the right way to use library(mypkg) in this case?

smallFun
should be available on the worker automatically. To confirm this I created a test package, then added a simple file R/funs.R.
Then I created the NAMESPACE and installed the package.
And then in a new session I can
I don't have access to a slurm cluster, but the underlying machinery is the same and I would expect
large(BatchtoolsParam(2, "slurm"))
to work, too.
The reason I'm confident that this works is mentioned in the last paragraph of the 'Introduction to BiocParallel' vignette.
This in turn is because an R function is actually the function plus the environment in which it is defined.
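The "function plus environment" point can be checked in base R without any cluster. This is only a sketch: make_large() is a hypothetical stand-in for a package namespace, and the bodies of small and large are made up for illustration. One caveat: a plain closure environment like this one is serialized by value, whereas a real package namespace is, as far as I know, serialized by reference, so the package still has to be installed on the worker.

```r
# Hypothetical stand-in for a package namespace: `small` exists only in the
# environment where `large` is defined, not in the global environment.
make_large <- function() {
    small <- function(x) x * 2
    large <- function(x) small(x) + 1
    large
}
large <- make_large()

# `small` is not visible at top level
small_is_global <- exists("small", envir = globalenv(), inherits = FALSE)

# Round-trip through serialization, which is roughly what happens when a
# function is shipped to a worker process.
large_copy <- unserialize(serialize(large, connection = NULL))

result <- large_copy(3)                 # small(3) + 1, i.e. 7
helpers <- ls(environment(large_copy))  # "large" "small"
```

The copy still finds small because its environment travelled with it; no library() call is involved.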
large is defined in the package namespace (environment), and so the definition of FUN includes the other functions in that environment, in this case small.
If you've tested this and it fails, then I suspect that you've misdiagnosed the problem; perhaps you could share your repository and actual code to reproduce the problem (easily!). It could be that the slurm implementation of batchtools is actually different from other implementations (I would be surprised), so you might confirm that the problem you are having still occurs when using, e.g., SnowParam().

You are right: in bplapply(), the environment of FUN (other than the global environment) is serialized to the workers. I tested
smallFun(x, y) instead of mypkg::smallFun(x, y) on SLURM. It works fine. Thanks for your explanation!
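
For anyone who wants to repeat the check without SLURM, here is a minimal sketch using SnowParam(), whose workers are separate R processes. It assumes BiocParallel is installed. make_worker_fun() and the body of smallFun are hypothetical; the enclosing environment only plays the role that the mypkg namespace plays in the real package.

```r
library(BiocParallel)

# Hypothetical stand-in for the package namespace: smallFun is defined only
# in the environment that encloses the function sent to the workers.
make_worker_fun <- function() {
    smallFun <- function(x, y) x + y   # made-up helper for illustration
    function(i) smallFun(i, 10)        # no library() call, no mypkg::
}
FUN <- make_worker_fun()

# Two independent worker processes on the local machine
param <- SnowParam(workers = 2)
res <- bplapply(1:4, FUN, BPPARAM = param)
values <- unlist(res)
```

If values comes back as 11, 12, 13, 14, the workers found smallFun through FUN's environment alone. In a real package the same call should work with the unqualified smallFun(x, y), provided mypkg is installed on every node.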