rhdf5 H5Dget_create_plist implementation
@chris-jewell-9295
I have an application that generates chunked datasets in HDF5 format. I want to read these in R, for which the rhdf5 package is very helpful -- thanks! To speed up access to the data, I would like to set the chunk cache size from within R, and to do that I first need to query the chunk size used to create each dataset.

I notice that H5Dget_create_plist is not yet implemented in rhdf5. I have therefore tried to write my own wrapper around the C H5Dget_create_plist() function.  However, the hid_t IDs of my rhdf5-provided datasets are not being recognised:

f <- H5Fcreate("myfile.h5")
s <- H5Screate_simple(c(10, 10))
d <- H5Dcreate(f, "data", "H5T_IEEE_F32LE", s)
myH5Dget_create_plist(d@ID)
HDF5-DIAG: Error detected in HDF5 (1.8.14) thread 0:
#000: H5D.c line 571 in H5Dget_create_plist(): not a dataset
major: Invalid arguments to routine
minor: Inappropriate type


I wonder if rhdf5 is using a different HDF5 library compared to my system-wide installation which my C-wrapper uses?

Thanks,

Chris

@bernd-fischer-5348
H5Dget_create_plist is implemented in rhdf5 2.15.1 (available tomorrow evening).

With

library(rhdf5)
f = H5Fopen("myfile.h5")
d = H5Dopen(f, "A")
p = H5Dget_create_plist(d)
H5Pget_chunk(p)

you can access the chunk size of dataset A in the HDF5 file "myfile.h5".
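
Once the chunk dimensions are known, they can be turned into a chunk cache size. The following is only a rough sketch in base R: chunk_cache_bytes is a hypothetical helper (not part of rhdf5), and the chunk dimensions, element size and number of cached chunks are example values assumed for illustration. In practice chunk_dims would be the result of H5Pget_chunk(p).

```r
# Hypothetical helper (not part of rhdf5): number of bytes needed to hold
# `n_chunks` complete chunks in the chunk cache.
chunk_cache_bytes <- function(chunk_dims, type_size, n_chunks = 1) {
  stopifnot(all(chunk_dims > 0), type_size > 0, n_chunks >= 1)
  prod(chunk_dims) * type_size * n_chunks
}

# Assumed example values: 100 x 100 chunks of 4-byte floats (H5T_IEEE_F32LE),
# with room for 8 chunks in the cache.
chunk_dims <- c(100, 100)
nbytes <- chunk_cache_bytes(chunk_dims, type_size = 4, n_chunks = 8)
nbytes
```

In the HDF5 C API this value corresponds to the rdcc_nbytes argument of H5Pset_chunk_cache(), which is set on a dataset access property list. Whether your rhdf5 version exposes a wrapper for that function is something to check in its documentation.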

@wolfgang-huber-3550
Chris,

That is indeed the case: rhdf5 ships its own copy of the HDF5 library, see e.g. https://github.com/Bioconductor-mirror/rhdf5/tree/master/src

Can you try directing your C wrapper to the code in there? And if you think it's generally useful, perhaps even contribute your code back to Bernd Fischer.

[There are pros and cons for an R package bringing its own copy of such a library, or using one found on the system. From experience with both approaches, installation and maintenance tend to be far easier with the 'own copy' approach.]

HTH, Wolfgang