Question: [DiffBind] Memory issues with dba.count()
enricoferrero (United Kingdom) wrote, 2.2 years ago:

Hi Rory et al., 

I'm hitting the memory limits of my server (96GB RAM) when using DiffBind::dba.count(), which results in my job getting killed.

I'm trying to generate a count matrix from many samples (>30), which translates to many sites/peaks. I suspect the massive matrix cannot be allocated by R into memory.

I've seen the argument bLowMem mentioned in some previous discussions, but it doesn't seem to be recognised by dba.count() any longer, is that right?

Is there any way to use dba.count() in this scenario? Would something like the bigmemory package be helpful here?

Thank you,

 

Written 2.2 years ago by enricoferrero; modified 2.2 years ago by Rory Stark
Rory Stark (CRUK, Cambridge, UK) wrote, 2.2 years ago:

Hello-

The bLowMem parameter was replaced by bUseSummarizeOverlaps. You can try setting this to TRUE when calling dba.count(). You can also set the configuration value $config$yieldSize in your DBA object to a lower value (like 50000).
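A minimal sketch of the two settings described above, in case it helps. The wrapper name count_low_memory and the sample sheet path in the usage note are hypothetical; the DiffBind calls themselves (dba(), dba.count() with bUseSummarizeOverlaps, and the $config$yieldSize entry) are the ones named in this thread.

```r
# Hypothetical wrapper (not part of DiffBind): build a DBA object from a
# sample sheet, lower the chunk size, then count via summarizeOverlaps.
count_low_memory <- function(sample_sheet, yield_size = 50000) {
  # dba() reads the sample sheet and loads the peaksets.
  samples <- DiffBind::dba(sampleSheet = sample_sheet)

  # Smaller yieldSize means fewer reads held in memory per chunk.
  samples$config$yieldSize <- yield_size

  # bUseSummarizeOverlaps replaced the older bLowMem argument.
  DiffBind::dba.count(samples, bUseSummarizeOverlaps = TRUE)
}
```

Usage would look like counts <- count_low_memory("samples.csv"), where "samples.csv" stands in for your own sample sheet. Smaller yieldSize values should lower peak memory at the cost of longer run time.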

Another approach is to use a consensus peakset with fewer peaks. If you are relying on the minOverlap parameter (default value 2), you can set it higher. Calling dba.overlap() with mode=DBA_OLAP_RATE will return a vector with the number of consensus peaks for successively greater values of minOverlap so you can choose an appropriate one.
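One way to pick minOverlap from that vector is sketched below. The helper choose_min_overlap and the idea of a peak budget (max_peaks) are assumptions for illustration, not part of DiffBind; the helper only expects a numeric vector whose i-th entry is the number of peaks found in at least i samples.

```r
# Hypothetical helper (not part of DiffBind): return the smallest minOverlap
# whose consensus peakset has at most max_peaks peaks, or NA if none does.
choose_min_overlap <- function(rate, max_peaks) {
  ok <- which(rate <= max_peaks)
  if (length(ok) == 0) {
    return(NA_integer_)
  }
  min(ok)
}

# Usage with DiffBind, assuming `samples` is a DBA object built with dba()
# and that 100000 peaks is an acceptable budget on your machine:
# rate   <- DiffBind::dba.overlap(samples, mode = DiffBind::DBA_OLAP_RATE)
# k      <- choose_min_overlap(rate, max_peaks = 100000)
# counts <- DiffBind::dba.count(samples, minOverlap = k)
```

A higher minOverlap keeps only peaks supported by more samples, so the binding matrix has fewer rows and needs less memory, at the price of dropping peaks that are specific to a few samples.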

I am currently looking at memory usage in DiffBind, as it does seem to occasionally balloon very high, and hope to have a fix in the next version.

Regards-

Rory

Written 2.2 years ago by Rory Stark

Thanks Rory,

I'll try using the summarizeOverlaps option with a lower yieldSize.

It's great to hear that you're looking at the memory consumption - it's the one thing that is keeping me from using DiffBind more extensively across projects.

Best,

Reply written 2.2 years ago by enricoferrero

FYI, in the development version of DiffBind (1.17.6 and later), we have made significant improvements in peak memory usage, reducing it by an order of magnitude, especially in the case where a binding matrix is being constructed (e.g. dba.count). I have an analysis that was taking >70GB to run and now takes 5GB. Give it a try!

Cheers-

Rory

Reply written 23 months ago by Rory Stark