H5Sset_extent_simple is not exposed in rhdf5, what it means for datasets where dims != maxdims
@nathaniel-hayden-6327
Last seen 9.4 years ago
United States

I am interested in being able to write to an HDF5 file iteratively, while making use of HDF5's ability to expand the extent of datasets after their creation to minimize file size (until the extra space is needed). Based on the HDF5 documentation, it looks like the way to do this is with H5Sset_extent_simple. But apparently the function is not exposed in the rhdf5 package.

This seems to mean that, although it's possible to create datasets whose dataspace extent differs from the maximum extent (set, for example, via the maxdims argument to h5createDataset), the difference isn't actually useful unless one resorts to external software.

Here's an example script of how I would expect it to work; is there another way to accomplish the same thing?

library(rhdf5)
h5fl <- tempfile()
h5createFile(file=h5fl)
h5createDataset(h5fl, "foo", 5, 9)
h5write(11:15, h5fl, "foo")
h5ls(h5fl, all=TRUE)[c("dim", "maxdim")]
## call something like H5Sset_extent_simple to expand to 9
h5write(16:19, h5fl, "foo", index=list(6:9)) 

Thanks, Nate.

rhdf5 hdf5 • 2.0k views
Bernd Fischer ▴ 550
@bernd-fischer-5348
Last seen 7.9 years ago
Germany / Heidelberg / DKFZ

Dear Nate,

I first implemented the function H5Sset_extent_simple, which is now available in the rhdf5 interface. Having done this, I noticed that it doesn't answer your question: it changes the dimensions of a (virtual) dataspace, but does not change anything in the HDF5 file. The function H5Dset_extent, however, does the job. This low-level function was already implemented but is difficult to use, so I added a new high-level function, h5set_extent, that should fulfill your needs. The following code should work from version 2.11.4 on. See the example below:

> library(rhdf5)
> tmpfile <- tempfile()
> h5createFile(file=tmpfile)
[1] TRUE
> h5createDataset(tmpfile, "A", c(10,12), c(20,24))
[1] TRUE
> h5ls(tmpfile, all=TRUE)[c("dim", "maxdim")]
      dim  maxdim
0 10 x 12 20 x 24
> h5set_extent(tmpfile, "A", c(20,24))
> h5ls(tmpfile, all=TRUE)[c("dim", "maxdim")]
      dim  maxdim
0 20 x 24 20 x 24
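
Applied to the 1-D example from the question, the same approach might look like the sketch below. It assumes rhdf5 2.11.4 or later (for h5set_extent); the explicit storage.mode and chunk arguments are choices made for this illustration, not something the thread prescribes. The chunk size is spelled out because HDF5 only allows chunked datasets to be resized.

```r
library(rhdf5)

h5fl <- tempfile(fileext = ".h5")
h5createFile(h5fl)

# 1-D dataset: current extent 5, maximum extent 9.
# The chunk size is given explicitly (an illustrative choice);
# only chunked datasets can be extended later.
h5createDataset(h5fl, "foo", dims = 5, maxdims = 9,
                storage.mode = "integer", chunk = 5)
h5write(11:15, h5fl, "foo")

# Grow the dataset to its maximum extent, then write the new block
# into the freshly added positions 6 to 9.
h5set_extent(h5fl, "foo", 9)
h5write(16:19, h5fl, "foo", index = list(6:9))

# Read everything back to check the result.
foo <- h5read(h5fl, "foo")
h5ls(h5fl, all = TRUE)[c("dim", "maxdim")]
```

For iterative writing, the h5set_extent and h5write pair can be repeated, growing the extent by one block each time, as long as the new extent does not exceed maxdims.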

 


Wonderful, Bernd. Thank you very much!
