Problem in extracting a row from matrix in rhdf5
filipe • 0
@filipe-13062

Hello,

I'm having trouble extracting a specific row from a big matrix using the rhdf5 package. It works if I give it an integer as the row index, but not if I assign that integer to a variable and then use that variable when subsetting. Please see the minimal example below for the error and session info.

If anyone has encountered this and/or can spot an error in my code that would be great! Thanks in advance!

Cheers,

Filipe

>
> library(rhdf5)
>
> h5f <- H5Fopen("/path/to/file.h5")
>
> big_matrix <- h5f&"big_matrix"
>
> # This works
> extracted_row <- big_matrix[293857, ]

>
> # This doesn't
> row_index <- 293857
> big_matrix[row_index, ]
Error in as.integer(index[[i]]) :
  cannot coerce type 'symbol' to vector of type 'integer'
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  :
  HDF5. Dataset. Read failed.
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) :
  'x' must be atomic
>
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.8 (Santiago)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] rhdf5_2.18.0

loaded via a namespace (and not attached):
[1] zlibbioc_1.20.0 tools_3.3.1     packrat_0.4.8-1
>
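The first error ("cannot coerce type 'symbol' to vector of type 'integer'") suggests that the index expression reached as.integer() without being evaluated. The base R sketch below shows how that can happen in general. It is not rhdf5's actual code: index_unevaluated() and index_evaluated() are hypothetical helpers written only for this illustration.

```r
# Hypothetical helper (not part of rhdf5): captures its argument unevaluated.
index_unevaluated <- function(i) {
  expr <- substitute(i)   # a literal stays a number, a variable name becomes a symbol
  as.integer(expr)        # fails when expr is a symbol
}

# Same idea, but the captured expression is evaluated in the caller's frame first.
index_evaluated <- function(i) {
  expr <- substitute(i)
  as.integer(eval(expr, parent.frame()))
}

row_index <- 293857

# Works: the captured expression is already the number itself.
index_unevaluated(293857)

# Fails: the captured expression is the symbol `row_index`, not its value.
msg <- tryCatch(index_unevaluated(row_index),
                error = function(e) conditionMessage(e))
print(msg)

# Works: the symbol is resolved to its value before coercion.
index_evaluated(row_index)
```

If something similar is going on inside the `[` method for dataset handles, it would explain why a literal index works while a variable holding the same value does not.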


Update to my initial post: this code still doesn't work using R-3.4.0 and rhdf5_2.20.0. However, it does give a different error message:

> row_index <- 293857
> big_matrix[row_index,]
Error in H5Sselect_index(h5spaceFile, index) :
  index exceeds HDF5-array dimension.
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,  :
  HDF5. Dataset. Read failed.
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) :
  'x' must be atomic
>
> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.4 (Santiago)

Hi Filipe,

Thanks for reporting this.  I've managed to reproduce this with a simple example.  I'll look into why it's happening and report back once I've got a solution.

Mike Smith ★ 6.5k
@mike-smith
EMBL Heidelberg

This should be fixed in rhdf5 version 2.21.2.  I use H5Dopen() to access the dataset rather than the & operator, but hopefully this example demonstrates the fix.

library(rhdf5)

tf <- tempfile(fileext = ".h5")
download.file(url = "http://msmith.de/data/h5tutr_dset.h5",
              destfile = tf,
              quiet = TRUE)
fid <- H5Fopen(tf)
did <- H5Dopen(fid, "dset")
> did[2,]
[1] 0 0 0 0
> 
> row_index <- 2
> did[row_index, ]
[1] 0 0 0 0
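
For anyone who cannot upgrade yet, h5read() with its index argument might be worth trying as an alternative to subsetting the dataset handle. This is only a sketch: the small file and the dataset name "big_matrix" are stand-ins created here so the example is self-contained, and I have not checked it against rhdf5 2.18.0 or 2.20.0.

```r
library(rhdf5)

# Stand-in file and dataset, created only so the example runs on its own.
h5file <- tempfile(fileext = ".h5")
h5createFile(h5file)
m <- matrix(1:20, nrow = 5, ncol = 4)
h5write(m, h5file, "big_matrix")

# index takes one list element per dimension; NULL selects everything
# along that dimension, so this reads one full row.
row_index <- 2
extracted_row <- h5read(h5file, "big_matrix", index = list(row_index, NULL))

# The result keeps its dimensions (1 x 4 here); as.vector() flattens it.
as.vector(extracted_row)

h5closeAll()
```

Because the row index is passed as an ordinary function argument, it is evaluated like any other R value before the file is read.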

Thank you! I'll install the development version, then.

 
