Question: Plans for multicore block processing in DelayedArray?
7 weeks ago, maltethodberg (UCPH) wrote:

Following up on the vignette in the devel version of DelayedArray: is there a timeline for implementing multicore processing across blocks?

Modified 7 weeks ago by Hervé Pagès • written 7 weeks ago by maltethodberg
7 weeks ago, Hervé Pagès (United States) wrote:

The timeline is: as soon as possible! I'll do my very best to have this in the next BioC release.

Cheers.

H.

Written 7 weeks ago by Hervé Pagès

That sounds good! I am considering rewriting an R package I am developing to take advantage of the new features in DelayedArray. While the DelayedArray version is much more memory-efficient than my current implementation, it is quite a bit slower. The main bottlenecks are calls to rowSums, colSums, etc. on large arrays. I assume simple cases like these would see a fairly significant boost in speed, especially on server-grade computers with multiple cores?

Written 7 weeks ago by maltethodberg

I expect multicore block processing to improve speed, but how much exactly will depend on the type of DelayedArray object. Let's keep in mind that with on-disk DelayedArray objects (HDF5Array), the cores will compete for I/O. And it's not clear to me that N cores trying to read N HDF5 blocks concurrently are going to do a much better job than 1 core reading the N blocks sequentially. Of course it will also depend on your hard drive, with SSDs being better at handling concurrent read access than rotating drives. For in-memory DelayedArray objects (e.g. RleArray), multicore block processing will probably give a more significant speed boost.

This was for read-only multicore block processing. Note that operations that write the data to disk (e.g. realization or matrix multiplication) won't be able to support multicore block processing if the realization backend is HDF5 because HDF5 does not support concurrent write access to a dataset yet. However, we should be able to support multicore realization of a DelayedArray as an RleArray object.
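
To make the read-only case concrete, here is a minimal, language-agnostic sketch (written in Python, not R) of block processing for column sums: the rows are split into blocks, each block is summed by a worker, and the partial results are combined. This is not how DelayedArray implements it; all function names are hypothetical, and threads are used only as a stand-in for worker cores (in CPython they will not give a real multicore speedup for this kind of loop).

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(n_rows, block_size):
    """Return (start, stop) row ranges that together cover 0..n_rows."""
    return [(start, min(start + block_size, n_rows))
            for start in range(0, n_rows, block_size)]


def block_col_sums(matrix, start, stop):
    """Column sums restricted to rows start..stop-1 (one block)."""
    sums = [0] * len(matrix[0])
    for row in matrix[start:stop]:
        for j, value in enumerate(row):
            sums[j] += value
    return sums


def parallel_col_sums(matrix, block_size=2, workers=2):
    """Sum each block on a worker, then add up the per-block results."""
    blocks = split_into_blocks(len(matrix), block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda b: block_col_sums(matrix, *b), blocks))
    # Combine step: element-wise sum of the partial column sums.
    return [sum(column) for column in zip(*partials)]


matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]]
total = parallel_col_sums(matrix, block_size=2, workers=2)
print(total)
```

The workers only read from the shared data and each produces its own partial result, which is why this pattern does not run into the concurrent-write limitation mentioned above.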

Cheers,

H.

Written 7 weeks ago by Hervé Pagès

I am only using in-memory objects at the moment, primarily a DataFrame of Rle's wrapped in a DelayedArray. I was actually wondering what the difference is between a DelayedArray around a DataFrame and an RleMatrix. Conceptually they seem very similar. Is one more efficient, speed- or memory-wise? Will both be able to use parallel block processing?

Written 7 weeks ago by maltethodberg

The plan for RleArray and RleMatrix objects is to use a seed (RleArraySeed) that supports chunking. This will allow better compression and more efficient block processing (especially in the case of multicore block processing) than a DataFrame of Rle's wrapped in a DelayedArray.
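
As a rough illustration of why chunking helps block processing, here is a small sketch (in Python, not R) of a run-length encoded vector stored as independent chunks, so that reading a block only decodes the chunks it overlaps. The actual layout of RleArraySeed is not described in this thread; the chunk size and all function names below are assumptions made for the example.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a sequence as a list of (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into a flat list."""
    out = []
    for value, length in runs:
        out.extend([value] * length)
    return out


def chunked_rle(values, chunk_size):
    """Encode each fixed-size chunk independently of the others."""
    return [rle_encode(values[i:i + chunk_size])
            for i in range(0, len(values), chunk_size)]


def read_block(chunks, chunk_size, start, stop):
    """Return values[start:stop], decoding only the overlapping chunks.

    Assumes 0 <= start < stop <= total length.
    """
    first = start // chunk_size
    last = (stop - 1) // chunk_size
    decoded = []
    for k in range(first, last + 1):
        decoded.extend(rle_decode(chunks[k]))
    offset = first * chunk_size
    return decoded[start - offset:stop - offset]


values = [0, 0, 0, 0, 5, 5, 0, 0, 0, 7, 7, 7]
chunks = chunked_rle(values, chunk_size=4)
block = read_block(chunks, chunk_size=4, start=3, stop=6)
print(chunks)
print(block)
```

Because each chunk can be decoded without touching the others, different workers could read different blocks at the same time.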

Furthermore, this chunking will allow multicore realization of a DelayedArray as an RleArray or RleMatrix object. Not something that is really feasible with a DataFrame of Rle's wrapped in a DelayedArray.
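
A minimal sketch of what chunk-wise realization could look like (in Python, not R; every name here is hypothetical and this is not DelayedArray's implementation): each worker evaluates the delayed operation on its own chunk and compresses the result independently, and the compressed chunks are then collected in order. Threads stand in for worker cores.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby


def rle_encode(values):
    """Run-length encode a sequence as a list of (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def realize_chunk(compute_block, start, stop):
    """Evaluate the delayed values for one chunk, then compress them."""
    return rle_encode(compute_block(start, stop))


def parallel_realize(compute_block, length, chunk_size, workers=2):
    """Realize all chunks on a worker pool; pool.map keeps chunk order."""
    ranges = [(s, min(s + chunk_size, length))
              for s in range(0, length, chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: realize_chunk(compute_block, *r), ranges))


source = [1, 1, 1, 1, 2, 2, 3, 3, 3, 3]


def delayed_double(start, stop):
    """Stand-in for a delayed operation: double the values in a range."""
    return [2 * x for x in source[start:stop]]


realized = parallel_realize(delayed_double, len(source), chunk_size=4)
print(realized)
```

No worker needs to write into a structure owned by another worker, since each chunk is a separate compressed unit.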

Cheers,

H.

Written 7 weeks ago by Hervé Pagès

Interesting! So, going forward, in order to unlock most of the features of DelayedArray (better compression + parallel block processing), is it better to use an RleMatrix instead of a DataFrame of Rle's?

One reason for using a DataFrame of Rle's is that it seems much faster to create than an RleMatrix. What's a good way to build a very large RleMatrix? Obviously, first creating a normal matrix is not possible, since that would take up too much memory. Coercing a DataFrame of Rle's to an RleMatrix seems very slow as well.

Written 6 weeks ago by maltethodberg