I am using HDF5Array inside my package. Some of my functions call getAutoBlockSize internally so that the block size does not exceed the value set by setAutoBlockSize, and then use HDF5Array::write_block as documented, so each calculation is done sequentially.
By the way, is it safe to assume that all the functions implemented in HDF5Array are basically block-size aware?
For example, the following functions are used in my functions, and I haven't written any explicit block-processing code around them. Can I assume that they recognize the block size and process the data sequentially?
I couldn't find any place in the code where getAutoBlockSize is called explicitly, though.
Also, I would like to know if there is a way to find out whether a given piece of source code is block-size aware or not. If there is a list somewhere, it would be helpful.
So, if I run delayed operations without performing the actual calculation, simply stacking them, and then realize the calculation afterwards, can I assume that the writes to the file (e.g. HDF5) required during the calculation are block-size aware?
For example, the following code combines delayed operations and block-processed operations, but as a whole it does not exceed the block size. Is that correct?
Also, I think that even a simple delayed operation can cause a memory error (e.g., Error: C stack usage of HDF5Array). Does that mean there is not enough memory to stack the calculation?
Yes, that's correct. More precisely: realize(x, "HDF5Array") just calls as(x, "HDF5Array"), which just calls writeHDF5Array(x), so the three are equivalent. The workhorse behind them is DelayedArray::BLOCK_write_to_sink() (this is an internal helper, so it is not documented). As its name suggests, DelayedArray::BLOCK_write_to_sink() is block-size aware, i.e. it will define a grid of blocks based on getAutoBlockSize(), walk over the blocks of that grid, and realize each block before writing it to disk.
Note, however, that choosing blocks that respect getAutoBlockSize() isn't a guarantee that the code won't use more memory than the block size. This is a common misconception. See the last paragraph of the Details section in ?getAutoBlockSize for more information about this.
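To make the above concrete, here is a minimal sketch, assuming a recent DelayedArray/HDF5Array installation in which defaultAutoGrid() is available. The matrix dimensions and the 1e5-byte block size are arbitrary values picked for illustration only. It realizes the same delayed object through the three equivalent entry points and shows the grid of blocks implied by the current automatic block size.

```r
library(HDF5Array)  # also attaches DelayedArray

# Small in-memory matrix wrapped as a DelayedArray, with one delayed
# operation stacked on top (nothing is computed at this point).
m <- matrix(runif(2000 * 50), nrow = 2000)
x <- log1p(DelayedArray(m))

# Temporarily lower the automatic block size (in bytes) so that this
# small example is split into several blocks. 1e5 is an arbitrary demo value.
old_size <- getAutoBlockSize()
setAutoBlockSize(1e5)

# Three equivalent ways to realize x to an HDF5 file, block by block.
h1 <- realize(x, "HDF5Array")
h2 <- as(x, "HDF5Array")
h3 <- writeHDF5Array(x)

# The grid of blocks derived from getAutoBlockSize() for this object.
grid <- defaultAutoGrid(x)
length(grid)  # number of blocks in the grid

# Restore the previous setting.
setAutoBlockSize(old_size)
```

Keep in mind that the block size only bounds the size of each block, not the total memory used while a block is being processed.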
Well, not a simple delayed operation. You need to stack tens of thousands of delayed operations on an object to end up with a "C stack usage is too close to the limit" problem. This typically happens when you apply a delayed operation in a loop, which is almost never a good idea.
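If the loop really cannot be avoided, one possible pattern is to realize the object every so often so that the pending delayed operations are flushed to disk. The following is only a sketch: the variable realize_every and its value are hypothetical, and the right interval depends on the data and the operations involved.

```r
library(HDF5Array)  # also attaches DelayedArray

x <- DelayedArray(matrix(1, nrow = 100, ncol = 10))

# Hypothetical setting: flush pending delayed operations every 50 iterations.
realize_every <- 50L

y <- x
for (i in seq_len(200L)) {
  y <- y + 1  # delayed: only recorded, not computed
  if (i %% realize_every == 0L) {
    # Compute the pending operations block by block and write the result
    # to a new HDF5 dataset. y then carries no pending operations.
    y <- realize(y, "HDF5Array")
  }
}
```

When the loop body can be collapsed into a single operation (here, x + 200), that is usually preferable to looping at all.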
Ok, I got the gist of it.
If I run into the same situation as before (Error: C stack usage of HDF5Array), where I have to apply delayed operations repeatedly, I had better call realize often to avoid the "C stack usage is too close to the limit" error.
Thanks a lot.