HDF5Array Single cell experiment objects - can on-disk files change?
@sarahwilliams1-11887
Melbourne, Monash University

A basic question about working with HDF5-backed / sparse / DelayedArray SingleCellExperiment objects:

Because the data is stored on disk, could operations on a large dataset end up being realised directly onto the on-disk file that is currently loaded?

i.e. if I do the following,

sce <- loadHDF5SummarizedExperiment("original_data/")
## make lots of changes to big dataset sce ##
## maybe something computationally nasty that can't be 'delayed'?
sce2 <- saveHDF5SummarizedExperiment(sce, dir="altered_data")

Is 'original_data' guaranteed to be unchanged?

Thanks.

Aaron Lun ★ 28k
@alun
The city by the bay

Any non-delayed operations will be realized to the specified backend. By default, this is an ordinary matrix, represented fully in random access memory. If you do:

setRealizationBackend("HDF5Array")

... you will dump results into an HDF5Matrix. Results will be dumped into a file specified by:

getHDF5DumpFile()
## [1] "/tmp/RtmpFodSvr/HDF5Array_dump/auto00001.h5"

... which is not your original HDF5 file. This avoids problems from overwriting, which would be disastrous as it would silently change data that other HDF5Array objects might still be pointing to.
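As a rough illustration of the point above, here is a small sketch that realizes a delayed operation on disk and then compares file paths. The temporary file, the dataset name "counts" and the toy matrix are made up for the example; it uses an explicit coercion with as() rather than changing the realization backend, on the assumption that the coercion writes to the current dump file.

```r
library(HDF5Array)  # Bioconductor package; also attaches DelayedArray

# Toy stand-in for the original on-disk data (hypothetical file and dataset name).
src_file <- tempfile(fileext = ".h5")
x <- writeHDF5Array(matrix(runif(20), nrow = 4),
                    filepath = src_file, name = "counts")

# A delayed operation: nothing is computed or written yet.
y <- x * 2

# Explicit on-disk realization. The result is written to the HDF5 dump file,
# not to the file that backs 'x'.
z <- as(y, "HDF5Array")

path(x)  # the original file
path(z)  # the dump file, somewhere under getHDF5DumpDir()
```

If the two paths differ, the realized result lives in a separate file and the source file was never opened for writing.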

FYI, to achieve the dangerous behaviour, one would need to do:

setHDF5DumpFile("original_data/name_of_hdf5.h5")

... but then one would be silly. I'm not even sure what would happen if the HDF5 C library is asked to read from and write to the same file at once, or even the same data set - you might see some very interesting behaviour here.

So, yes, original_data is guaranteed to be unchanged unless you actively tell it otherwise.
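If you want to convince yourself, one option is to checksum the original file before and after your processing. The sketch below does this on a toy file; the file names, the dataset name "counts" and the log2 transformation are placeholders for illustration, not part of the workflow in the question.

```r
library(HDF5Array)  # Bioconductor package; also attaches DelayedArray

# Toy stand-in for the file inside "original_data" (hypothetical names).
orig_file <- tempfile(fileext = ".h5")
m <- matrix(as.numeric(1:20), nrow = 4)
writeHDF5Array(m, filepath = orig_file, name = "counts")
checksum_before <- unname(tools::md5sum(orig_file))

# Open the data and apply a delayed operation; nothing is written at this point.
x <- HDF5Array(orig_file, name = "counts")
y <- log2(x + 1)

# Realize the result into a separate, new HDF5 file.
new_file <- tempfile(fileext = ".h5")
z <- writeHDF5Array(y, filepath = new_file, name = "counts")

# The original file should be byte-for-byte identical to what it was before.
checksum_after <- unname(tools::md5sum(orig_file))
identical(checksum_before, checksum_after)
```

The same before/after comparison could be run on the .h5 file inside a directory written by saveHDF5SummarizedExperiment().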


Thanks for clarifying. Good to be sure :)
