I am curious about the use of parquet files as an on-disk storage back-end for Bioconductor objects. So far, I have found some references to parquet files in Bioconductor packages, e.g.
but there doesn't seem to be broader support yet, e.g. via the (awesome!) DelayedArray package.
I am aware of existing support for true matrix-storage formats, e.g.
- HDF5 files in HDF5Array (thank you, Herve!), but I am looking for even better support for cloud storage systems, or
- TileDB, supported via TileDBArray (thank you, Aaron!), but, unlike parquet files, TileDB has not yet been adopted in my work environment.
Before I experiment further with marrying parquet and Bioconductor, I was wondering whether "parquet-backed Bioconductor objects" are a bad idea to begin with (and if so, why!), or whether there are ongoing efforts that I might benefit from (or contribute to).
Many thanks for any thoughts and pointers,