How to incrementally load the nzdata and nzindex from a SparseArraySeed
Koki ▴ 10
@koki-7888
Last seen 13 months ago
Japan

I have implemented several functions based on DelayedArray and succeeded in extracting small blocks from an array and processing them out-of-core.

However, computation time remains an issue.

Next, I tried using SparseArraySeed so that only the non-zero elements (nzdata) and their indices (nzindex) are processed. Calculations involving zeros can then be skipped, which I expect to speed things up.

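library("DelayedArray")

# Small random binary array used as a toy example
arr <- array(rbinom(2*3*4, 1, 0.5), dim=c(2,3,4))
# Convert the dense array to a SparseArraySeed
sarr <- dense2sparse(arr)
sarr@nzdata   # non-zero values
sarr@nzindex  # their array coordinates, one row per non-zero value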

However, nzdata and nzindex are assumed to be held in memory, and for extremely large sparse arrays it may not be possible to load them all at once.

Do you know of a way to read nzdata and nzindex sequentially from a file and use them?

According to the documentation of DelayedArray, HDF5Array, and TileDBArray, the as.sparse option of writeHDF5Array only sets a sparse flag to TRUE while the data itself is still stored in dense format, whereas the as.sparse option of writeTileDBArray actually stores the data in sparse format in TileDB. So I am thinking that using TileDBArray may solve this problem.

Thank you in advance.

Koki

TileDBArray HDF5Array DelayedArray • 1.0k views

Sorry, I realized that I can set as.sparse=TRUE in read_block and then use the @nzindex and @nzdata slots of the returned SparseArraySeed.
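For anyone finding this later, here is a minimal sketch of that block-wise approach. It assumes a DelayedArray version in which read_block(..., as.sparse=TRUE) returns a SparseArraySeed, as in the post above; other releases may return a different sparse class. The grid spacings and the running totals are only illustrative choices, not something prescribed by the package.

```r
library(DelayedArray)

set.seed(1)
arr <- array(rbinom(2*3*4, 1, 0.5), dim=c(2,3,4))
# Wrap a sparse seed so that the DelayedArray is treated as sparse.
darr <- DelayedArray(dense2sparse(arr))

# Illustrative grid (an assumption): one block per slice along the 3rd dimension.
grid <- RegularArrayGrid(dim(darr), spacings=c(2L, 3L, 1L))

total <- 0
nnz <- 0L
for (bid in seq_along(grid)) {
    viewport <- grid[[bid]]
    # Assumed to return a SparseArraySeed (version-dependent).
    block <- read_block(darr, viewport, as.sparse=TRUE)

    # nzindex is relative to the block, so shift it by the viewport
    # offset to recover coordinates in the full array.
    offset <- start(viewport) - 1L
    global_index <- sweep(block@nzindex, 2, offset, "+")

    # Sanity check: the non-zero values match the original array.
    stopifnot(all(arr[global_index] == block@nzdata))

    # Only the non-zero values of the current block are touched here.
    total <- total + sum(block@nzdata)
    nnz <- nnz + length(block@nzdata)
}
```

Only one block's nzdata and nzindex are held in memory at a time, so the same loop should carry over to a file-backed DelayedArray as long as its seed can be read sparsely.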
