rhdf5 read/write concurrency surprise
Bernd Fischer ▴ 550
@bernd-fischer-5348
Last seen 7.3 years ago
Germany / Heidelberg / DKFZ
Dear Brad,

I investigated this issue and found that similar issues happen with R file connections. I added example code at the end of this message. Therefore, I decided to leave the behavior as is for the low-level HDF5 functions (upper-case H5? functions and HDF5 object identifiers). However, for the high-level HDF5 functions (lower-case h5?), I added a warning that is issued if an open HDF5 handle already exists for the specified filename (rhdf5 >= 2.9.5). In your example, the second call to "h5dump(hf)" should throw a warning like:

> h5dump(hf)
named list()
Warning message:
In h5checktypeOrOpenLoc(file) :
  An open HDF5 file handle exists. If the file has changed on disk meanwhile, the function may not work properly. Run 'H5close()' to close all open HDF5 object handles.

and a call to H5close() should solve the problem.

> H5close()
> h5dump(hf)
$foo
[1] 1

On 26.06.2014, at 08:13, Brad Friedman <friedman.brad at gene.com> wrote:

> An open rhdf5 handle becomes corrupted when another process writes to the hdf5 file.
>
> This example requires you to start two different R processes, a writer and a reader.
>
> ## Create empty HDF5 file in the writer process
> > library(rhdf5)
> > hf <- "x.hdf5"
> > h5createFile(hf)
> [1] TRUE
> > h5dump(hf)
> named list()
>
> ## Then, in the reader process open a handle and dump the file
> > library(rhdf5)
> > hf <- "x.hdf5"
> > fid <- H5Fopen(hf)
> > h5dump(fid)
> named list()
>
> ## Now leave the rhdf5 handle open in the reader and go back to the
> ## writer process and write a data set
> > h5write(1, hf, "foo")
> > h5dump(hf)
> $foo
> [1] 1
>
> ## Now go back to the reader and try to read it:
> > h5dump(fid)
> named list()
> ## That is not right---it doesn't reflect the change.
> ## Maybe the handle is bad. Try to read it using the filename instead
> > h5dump(hf)
> named list()
> ## still can't see it. Try a new rhdf5 handle
> > fid2 <- H5Fopen(hf)
> > h5dump(fid2)
> named list()
> ## Still can't see it. Turns out if I close all the open rhdf5 handles
> ## I can see it.
> > H5Fclose(fid)
> > H5Fclose(fid2)
> > h5dump(hf)
> $foo
> [1] 1
>
> A workaround for this is that whenever the file is modified by the writer process, the reader process has to make sure to close all open handles for the file and then reopen fresh ones. Another workaround is to never explicitly open handles with H5Fopen, and to only use rhdf5's interface that accepts a file name instead of an open HDF5 handle.
>
> > sessionInfo()
> R version 3.1.0 Patched (2014-05-17 r65643)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] rhdf5_2.9.1
>
> loaded via a namespace (and not attached):
> [1] zlibbioc_1.10.0
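
The workaround discussed above (close every open handle, then read by file name) could be sketched roughly as follows. This is only an illustration: `dump_fresh` is a hypothetical helper name, not part of rhdf5, and the temporary file stands in for the shared "x.hdf5". It runs in a single R session, so it shows the close-then-reread pattern rather than reproducing the two-process staleness itself.

```r
library(rhdf5)

# Hypothetical helper (not an rhdf5 function): close all open HDF5 handles
# in this R session, then dump the file by name so the read starts fresh.
dump_fresh <- function(filename) {
  H5close()          # closes all open HDF5 object handles in this session
  h5dump(filename)   # high-level call that takes a file name
}

# Stand-in for the shared file from the thread (assumption: a temp file).
hf <- tempfile(fileext = ".h5")
h5createFile(hf)
h5write(1, hf, "foo")

# Simulate a reader that has left a low-level handle open.
fid <- H5Fopen(hf)

# After this call, 'fid' is no longer a valid handle and must be reopened
# with H5Fopen() if low-level access is still needed.
contents <- dump_fresh(hf)
print(contents)

unlink(hf)
```

Note that `H5close()` closes every HDF5 handle in the session, not only those for one file, so any handle the reader still needs has to be reopened afterwards.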
GO rhdf5 PROcess • 2.3k views