
Question: Issue with BiocParallel
milos-91 wrote (4 months ago):

Hi!

I have a problem running PureCN when the --parallel flag is enabled. The error is:

Error in serialize(data, node$con, xdr = FALSE) :
  error writing to connection
Calls: runAbsoluteCN ... .send_EXEC -> <Anonymous> -> sendData.SOCK0node -> serialize

A similar error has been reported before, where it was attributed to a lack of RAM. In this case, however, I checked the metrics, and only around 15% of RAM and 14% of CPU are in use.

Any idea why this is failing?

Thank you very much!

Tags: biocparallel, purecn

The error isn't that too much memory is being used, but that the amount of data being sent between the worker and the manager (in either direction) is too large for the type of connection implemented. I'm not familiar with PureCN, but the overall strategy is to reduce the amount of data sent to or returned from the workers, e.g., by analyzing smaller chromosome regions?

Written 4 months ago by Martin Morgan (modified 4 months ago)
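
To illustrate the suggestion above, here is a minimal sketch of how one might check how much data a single task would push through a socket connection, and how splitting the input into smaller chunks shrinks that payload. Everything here is an assumption for illustration: `regions` is made-up data (not a PureCN object), and `payload_bytes` and `summarize_chunk` are hypothetical helpers. It is not how PureCN distributes its work internally.

```r
# Made-up stand-in data: 24 "regions", each a numeric vector (NOT PureCN objects).
set.seed(1)
regions <- lapply(1:24, function(i) rnorm(1e4))

# Hypothetical helper: bytes produced when serializing an object with the
# same settings that appear in the error message (xdr = FALSE).
payload_bytes <- function(x) as.numeric(length(serialize(x, NULL, xdr = FALSE)))

whole <- payload_bytes(regions)

# Split the input into 6 chunks of 4 regions; each task then only needs its chunk.
chunks <- split(regions, rep(1:6, each = 4))
per_chunk <- vapply(chunks, payload_bytes, numeric(1))

# Hypothetical per-task work: return a small summary rather than the raw data,
# so the result sent back to the manager stays small as well.
summarize_chunk <- function(chunk) vapply(chunk, mean, numeric(1))

if (requireNamespace("BiocParallel", quietly = TRUE)) {
    param <- BiocParallel::SnowParam(workers = 2)
    means <- BiocParallel::bplapply(chunks, summarize_chunk, BPPARAM = param)
} else {
    # Serial fallback so the sketch still runs without BiocParallel installed.
    means <- lapply(chunks, summarize_chunk)
}

print(whole)
print(max(per_chunk))
```

Comparing `whole` with `max(per_chunk)` gives a rough idea of how much smaller each message becomes. Whether payload size is really the cause here is not confirmed in this thread, so treat this as a diagnostic aid rather than a fix.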

Thanks Martin and Milos. There is definitely some room for improvement, although I've never seen this error even in our whole exomes. Martin, is there an easy way to profile the memory usage of workers?

Written 4 months ago by markus.riester

Actually, I'm not 100% sure that the cause is the amount of data being serialized; the first step would be to get a reproducible example...

Written 4 months ago by Martin Morgan

Is it possible that this happens when a worker has been idle for a long time? Some workers exit early after a few minutes, while others can run for more than an hour on big datasets. This is something I can probably find an easy workaround for.

Written 4 months ago by markus.riester

Do you mind trying version 1.13.4? It should balance the workload much better across nodes. This should reduce the runtime significantly and might decrease the chance of such communication errors.

Written 3 months ago by markus.riester
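
For readers wondering what balancing the workload can look like on the BiocParallel side, below is a rough sketch using the `tasks` argument of `SnowParam`. This is only an assumption about one generic way to do it, not a description of what PureCN 1.13.4 actually changed. The `durations` vector and the `work` function are made up to mimic one slow job among many fast ones.

```r
# Made-up uneven workload: one slow task and eleven fast ones (seconds of sleep).
durations <- c(0.5, rep(0.05, 11))

# Hypothetical task: sleep for the given time, then return it.
work <- function(seconds) {
    Sys.sleep(seconds)
    seconds
}

if (requireNamespace("BiocParallel", quietly = TRUE)) {
    # With the default tasks = 0, the input is divided evenly among workers up
    # front. Setting tasks to the number of elements makes each element its own
    # task, so a worker that finishes early can pick up the next one.
    param <- BiocParallel::SnowParam(workers = 2, tasks = length(durations))
    done <- BiocParallel::bplapply(durations, work, BPPARAM = param)
} else {
    # Serial fallback so the sketch still runs without BiocParallel installed.
    done <- lapply(durations, work)
}

print(length(done))
```

More, smaller tasks mean more messages between manager and workers, so there is a trade-off between balance and communication overhead.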