I am starting to look into BiocParallel, probably later than I should.
I am on a 64-core node. As far as I know, I have done nothing except load GenomicFiles. If I do
> registered()
$MulticoreParam
class: MulticoreParam; bpisup: TRUE; bpworkers: 64; catch.errors: TRUE
setSeed: TRUE; recursive: TRUE; cleanup: TRUE; cleanupSignal: 15; verbose: FALSE

$SnowParam
class: SnowParam; bpisup: FALSE; bpworkers: 64; catch.errors: TRUE
cluster spec: 64; type: PSOCK

$BatchJobsParam
class: BatchJobsParam; bpisup: TRUE; bpworkers: NA; catch.errors: TRUE
cleanup: TRUE; stop.on.error: FALSE; progressbar: TRUE

$SerialParam
class: SerialParam; bpisup: TRUE; bpworkers: 1; catch.errors: TRUE

If I understand it correctly, I now have 4 registered parallel backends (without doing anything) and the default is multicore. I think it is highly problematic for multi-user systems that the default is selected in this way. Specifically, in this case I have not requested 64 cores from my scheduler. Instead, I believe the default parallel backend should always be serial, and that we need to have user intervention to do more.
In line with this, and wearing my admin cap for this paragraph, I think it would be very convenient if the default choices and settings could be modified using environment variables. That way, suitable choices could be made for each user in a multi-user environment, based on their scheduling requests. For example, I would like to write something in Rprofile.site which sets the default number of cores in a MulticoreParam, based not on the number of cores in the machine but on the number of cores in the scheduling request.
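To make the request concrete, here is a rough sketch of what I imagine a site profile could contain. It is only an illustration under assumptions: scheduler_cores() is a hypothetical helper, not part of BiocParallel, and the environment variable names (NSLOTS for Grid Engine, SLURM_CPUS_PER_TASK for Slurm) depend on the scheduler and how the site configures it. Note also that it loads the BiocParallel namespace at startup, which may not be what every site wants.

```r
# Hypothetical helper (not part of BiocParallel): return the number of
# cores granted by the scheduler, or 'fallback' if none can be found.
# The variable names are assumptions; adjust them for your site.
scheduler_cores <- function(vars = c("NSLOTS", "SLURM_CPUS_PER_TASK"),
                            fallback = 1L) {
    for (v in vars) {
        # Unset, empty, or non-numeric values all become NA here
        n <- suppressWarnings(as.integer(Sys.getenv(v, unset = NA)))
        if (!is.na(n) && n >= 1L)
            return(n)
    }
    fallback
}

# Register a default backend sized to the scheduling request; stay
# serial when only one core (or no request at all) is detected.
if (requireNamespace("BiocParallel", quietly = TRUE)) {
    n <- scheduler_cores()
    param <- if (n > 1L) {
        BiocParallel::MulticoreParam(workers = n)
    } else {
        BiocParallel::SerialParam()
    }
    BiocParallel::register(param, default = TRUE)
}
```

With something like this in place, a session started outside the scheduler would default to serial evaluation, and a job that requested 8 slots would get a MulticoreParam with 8 workers.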
Also, I don't understand why the registered SnowParam (64 workers) is different from what I see when I construct one directly:
> SnowParam()
class: SnowParam; bpisup: FALSE; bpworkers: 0; catch.errors: TRUE
cluster spec: 0; type: PSOCK