Guest User ★ 12k@guest-user-4897
Last seen 7.2 years ago
Hello,

I have a database (22 GB) in SQLite that I query from R for numerical analysis. I'm considering converting the database to HDF5 for faster read times (because reading the population is slow). I have two questions about the rhdf5 package that I haven't been able to answer from my own experimenting.

(i) Suppose that I save an R data frame to an HDF5 file. Is it possible to read subsets of the data frame based on variable names and variable values? Often, I don't want to read the full data frame into memory (~100 million observations and ~30 variables).

(ii) I frequently use indexes in my SQLite database to quickly join related tables. Does rhdf5 have a similar feature? If not, will converting to an HDF5 database create substantial bottlenecks if I rely on these joins frequently?

Thanks so much for your help.

-- output of sessionInfo(): N/A

-- Sent via the guest posting facility at bioconductor.org.
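
To make question (i) concrete, here is a minimal sketch of the access pattern being asked about. It assumes each variable is stored as its own one-dimensional dataset, which is only one possible layout and not something rhdf5 requires; the file, the `df` group, and the column names are hypothetical.

```r
library(rhdf5)

# Hypothetical file and toy data, for illustration only.
h5file <- tempfile(fileext = ".h5")
df <- data.frame(
  id  = 1:10,
  age = c(23, 35, 41, 19, 52, 38, 27, 60, 33, 45)
)

# Assumed layout: one 1-D dataset per variable under a common group.
h5createFile(h5file)
h5createGroup(h5file, "df")
for (v in names(df)) {
  h5write(df[[v]], h5file, paste0("df/", v))
}

# Subset by variable name: read only the column that is needed.
age <- h5read(h5file, "df/age")

# Subset by value: find the matching rows in the filter column,
# then read just those rows from another column via `index`.
rows <- which(age > 40)
ids  <- h5read(h5file, "df/id", index = list(rows))

h5closeAll()
```

In this sketch the filter column is still read in full before any rows are selected, which is the step an SQLite index would avoid.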