rhdf5 missing information in read
@joseph-nathaniel-paulson-6442

I'm writing a few wrappers for reading and writing files in the biom format, which is stored as HDF5. The rhdf5 package is great, but the beginning of every file (for example, https://github.com/biocore/biom-format/blob/master/examples/rich_sparse_otu_table_hdf5.biom) contains information that is missing when I read it with rhdf5, although I can see it with the command-line h5dump.

 

Running h5dump from HDF5 1.8.7, I can see creation-date, format-url, format-version, etc. (see below).

However, when I run h5read/h5ls/h5dump on the same file, none of these categories/groups come up. My goal is to get format-version and the other entries that are not showing up.

In particular, the rhdf5::h5dump function does not open the file, unlike h5dump in the terminal.

Example:

# in R

> h5dump("~/Desktop/rich_sparse_otu_table_hdf5.biom")
HDF5-DIAG: Error detected in HDF5 (1.8.7) thread 0:
  #000: H5F.c line 1522 in H5Fopen(): unable to open file
    major: File accessability
    minor: Unable to open file
  #001: H5F.c line 1211 in H5F_open(): unable to open file: time = Tue Oct 28 00:27:02 2014
, name = '~/Desktop/rich_sparse_otu_table_hdf5.biom', tent_flags = 1
    major: File accessability
    minor: Unable to open file
  #002: H5FD.c line 1086 in H5FD_open(): open failed
    major: Virtual File Layer
    minor: Unable to initialize object
  #003: H5FDsec2.c line 348 in H5FD_sec2_open(): unable to open file: name = '~/Desktop/rich_sparse_otu_table_hdf5.biom', errno = 2, error message = 'No such file or directory', flags = 1, o_flags = 2
    major: File accessability
    minor: Unable to open file
HDF5: unable to open file
Error in h5checktypeOrOpenLoc(file) : 
Error in h5checktypeOrOpenLoc(). File '~/Desktop/rich_sparse_otu_table_hdf5.biom' is not a valid HDF5 file.
> str(h5read("./rich_sparse_otu_table_hdf5.biom","/"))
List of 2
 $ observation:List of 4
  ..$ group-metadata: NULL
  ..$ ids           : chr [1:5(1d)] "GG_OTU_1" "GG_OTU_2" "GG_OTU_3" "GG_OTU_4" ...
  ..$ matrix        :List of 3
  .. ..$ data   : num [1:15(1d)] 1 5 1 2 3 1 1 4 2 2 ...
  .. ..$ indices: int [1:15(1d)] 2 0 1 3 4 5 2 3 5 0 ...
  .. ..$ indptr : int [1:6(1d)] 0 1 6 9 13 15
  ..$ metadata      :List of 1
  .. ..$ taxonomy: chr [1:7, 1:5] "k__Bacteria" "p__Proteobacteria" "c__Gammaproteobacteria" "o__Enterobacteriales" ...
 $ sample     :List of 4
  ..$ group-metadata: NULL
  ..$ ids           : chr [1:6(1d)] "Sample1" "Sample2" "Sample3" "Sample4" ...
  ..$ matrix        :List of 3
  .. ..$ data   : num [1:15(1d)] 5 2 1 1 1 1 1 1 1 2 ...
  .. ..$ indices: int [1:15(1d)] 1 3 1 3 4 0 2 3 4 1 ...
  .. ..$ indptr : int [1:7(1d)] 0 2 5 9 11 12 15
  ..$ metadata      :List of 4
  .. ..$ BODY_SITE           : chr [1:6(1d)] "gut" "gut" "gut" "skin" ...
  .. ..$ BarcodeSequence     : chr [1:6(1d)] "CGCTTATCGAGA" "CATACCAGTAGC" "CTCTCTACCTGT" "CTCTCGGCCTGT" ...
  .. ..$ Description         : chr [1:6(1d)] "human gut" "human gut" "human gut" "human skin" ...
  .. ..$ LinkerPrimerSequence: chr [1:6(1d)] "CATGCTGCCTCCCGTAGGAGT" "CATGCTGCCTCCCGTAGGAGT" "CATGCTGCCTCCCGTAGGAGT" "CATGCTGCCTCCCGTAGGAGT" ...

> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
[1] rhdf5_2.10.0         BiocInstaller_1.16.0
loaded via a namespace (and not attached):
[1] tools_3.1.0     zlibbioc_1.12.0

# Terminal 

./hdf5-1.8.7-mac-intel-x86_64-static/bin/h5dump ./rich_sparse_otu_table_hdf5.biom 
HDF5 "./rich_sparse_otu_table_hdf5.biom" {
GROUP "/" {
   ATTRIBUTE "creation-date" {
      DATATYPE  H5T_STRING {
            STRSIZE H5T_VARIABLE;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
      DATASPACE  SCALAR
      DATA {
      (0): "2014-07-29T16:16:36.617320"
      }
   }
   ATTRIBUTE "format-url" {
      DATATYPE  H5T_STRING {
            STRSIZE H5T_VARIABLE;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
      DATASPACE  SCALAR
      DATA {
      (0): "http://biom-format.org"
      }
   }
   ATTRIBUTE "format-version" {
      DATATYPE  H5T_STD_I64LE
      DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
      DATA {
      (0): 2, 1
      }
   }
   ATTRIBUTE "generated-by" {
      DATATYPE  H5T_STRING {
            STRSIZE H5T_VARIABLE;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
      DATASPACE  SCALAR
      DATA {
      (0): "example"
      }
   }
   ATTRIBUTE "id" {
      DATATYPE  H5T_STRING {
            STRSIZE H5T_VARIABLE;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
      DATASPACE  SCALAR
      DATA {
      (0): "No Table ID"
      }
   }
   ATTRIBUTE "nnz" {
      DATATYPE  H5T_STD_I64LE
      DATASPACE  SCALAR
      DATA {
      (0): 15
      }
   }
   ATTRIBUTE "shape" {
      DATATYPE  H5T_STD_I64LE
      DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
      DATA {
      (0): 5, 6
      }
   }
   ATTRIBUTE "type" {
      DATATYPE  H5T_STRING {
            STRSIZE H5T_VARIABLE;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
      DATASPACE  SCALAR
      DATA {
      (0): "otu table"
      }
   }
.....
@nathaniel-hayden-6327

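See the man page for h5read. The entries you are missing (creation-date, format-version, etc.) are HDF5 attributes on the root group rather than groups, as the ATTRIBUTE lines in your h5dump output show. The argument you're looking for is 'read.attributes'. By default, h5read does not include attributes.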

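?h5read   # documents the 'read.attributes' argument
library(rhdf5)
fl = "min_sparse_otu_table_hdf5.biom"
# read everything under the root group and keep the HDF5 attributes
withatts = h5read(fl, "/", read.attributes=TRUE)
withatts
# the HDF5 attributes are returned as R attributes of the result
attributes(withatts)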
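
If only the root-group attributes are needed, a possible alternative to the snippet above is rhdf5's h5readAttributes. The following is a minimal sketch rather than a definitive implementation. The helper name read_root_attributes is hypothetical (it is not part of rhdf5), and the temporary file is only a stand-in for a real .biom file. The call to path.expand is there because the error log in the question reports errno = 2 for the literal path '~/Desktop/...', which suggests the tilde reached the HDF5 library unexpanded.

```r
library(rhdf5)

# Hypothetical helper (not part of rhdf5): return the attributes stored on the
# root group of an HDF5 file as a named list.
read_root_attributes <- function(path) {
  # Expand a leading "~" in R before the path reaches the HDF5 library.
  path <- path.expand(path)
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  h5readAttributes(path, "/")
}

# Stand-in for a real .biom file: a temporary HDF5 file with one root attribute.
fl <- tempfile(fileext = ".h5")
h5createFile(fl)
fid <- H5Fopen(fl)
h5writeAttribute(c(2L, 1L), fid, "format-version")
H5Fclose(fid)

atts <- read_root_attributes(fl)
names(atts)
atts[["format-version"]]
```

With a real biom file, read_root_attributes("~/Desktop/rich_sparse_otu_table_hdf5.biom") should return the same entries that h5dump lists as ATTRIBUTE on GROUP "/", assuming the installed rhdf5 version can decode the variable-length string attributes.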