Hello,
First of all, apologies if this is not the right place for this type of question, but I am stuck and would really appreciate any help or pointers.
I am trying to use rhdf5 to read a subset of a large dataset that was originally created with PyTables and stored in an HDF5 file. As the dataset is quite large, I only want to read in a few rows at a time (say three in the following example), but I am running into issues.
library('rhdf5')
library(bit64)
h5ls('LargeDataset.h5', recursive=2)
     group        name       otype   dclass     dim
0        /     ESLarge   H5I_GROUP
1 /ESLarge _i_features   H5I_GROUP
2 /ESLarge    features H5I_DATASET COMPOUND 4327078
data <- h5read('LargeDataset.h5', 'ESLarge/features', index=list(1:3), bit64conversion='bit64')
Warning message:
In `[.data.frame`(list(a_inferred = c("unknown", "unknown", :
'drop' argument will be ignored
I know that the table consists of 4327078 rows and 11 columns, so in the call above the data variable should contain 3 rows and 11 columns. But when I look at data, I can only see 3 rows and 3 columns.
data
  a_inferred a_label     h_mean
1    unknown unknown 0.14034226
2    unknown unknown 0.05577267
3    unknown unknown 0.03498855
Can someone suggest how I can read a few rows with all the columns, please? Changing the argument of list gives me a square result of a different size, e.g. list(1:6) gives a 6x6 data variable. Adding a second index element gives an error instead:
data <- h5read('LargeDataset.h5', 'ESLarge/features', index=list(1:3, NULL), bit64conversion='bit64')
Error in h5read("LargeDataset.h5", "ESLarge/features", :
length of index has to be equal to dimensional extension of HDF5 dataset.
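My own guess, which I have not confirmed against the rhdf5 source, is that the requested rows are read correctly and the same index is then applied to the resulting data.frame with a single subscript, which in base R selects columns rather than rows. That would explain both the warning and the square results. The sketch below only reproduces that base R behaviour on a toy data.frame; the 11 columns here are hypothetical placeholders (V1 to V11), not the real columns of my table.

```r
# Toy stand-in for the 3 rows x 11 columns I expected to get back.
# Column names V1..V11 are placeholders, not the real field names.
df <- as.data.frame(matrix(0, nrow = 3, ncol = 11))

# Single-subscript indexing of a data.frame is list-like: it picks COLUMNS.
# Passing 'drop' in this form triggers the same warning I am seeing.
msg <- tryCatch(df[1:3, drop = FALSE],
                warning = function(w) conditionMessage(w))
print(msg)

cols <- suppressWarnings(df[1:3, drop = FALSE])
print(dim(cols))  # 3 3: three rows, but only the first three columns

# Two-subscript indexing with an empty column slot keeps every column.
rows <- df[1:3, , drop = FALSE]
print(dim(rows))  # 3 11
```

If this guess is right, list(1:6) would read 6 rows and then keep columns 1 to 6, which matches the 6x6 result.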
Any ideas please?
