The example in ?biomformat::make_biom is not behaving as I expect. I assume that this is a bug, but perhaps I am misunderstanding the example. The example seems to imply that the components should be identical after writing the BIOM files and reading them back in, but the sample and observation metadata are not, because their column names differ.
This is particularly confusing because these components are identical between the BIOM instance read from the package data and the instance created from its components using make_biom; the difference only appears after the write/read round trip.
library(biomformat)
The code below is taken from the Examples section of ?biomformat::make_biom:
# import with default parameters, specify a file
biomfile = system.file("extdata", "rich_dense_otu_table.biom", package = "biomformat")
x = read_biom(biomfile)
data = biom_data(x)
smd = sample_metadata(x)
omd = observation_metadata(x)
# Make a new biom object from component data
y = make_biom(data, smd, omd)
# Won't be identical to x because of header info.
identical(x, y)
## [1] FALSE
# The data components should be, though.
identical(observation_metadata(x), observation_metadata(y))
## [1] TRUE
identical(sample_metadata(x), sample_metadata(y))
## [1] TRUE
identical(biom_data(x), biom_data(y))
## [1] TRUE
## Quickly show that writing and reading still identical.
# Define a temporary directory to write .biom files
tempdir = tempdir()
write_biom(x, biom_file=file.path(tempdir, "x.biom"))
write_biom(y, biom_file=file.path(tempdir, "y.biom"))
x1 = read_biom(file.path(tempdir, "x.biom"))
y1 = read_biom(file.path(tempdir, "y.biom"))
identical(observation_metadata(x1), observation_metadata(y1))
## [1] FALSE
identical(sample_metadata(x1), sample_metadata(y1))
## [1] FALSE
identical(biom_data(x1), biom_data(y1))
## [1] TRUE
Observations
I expect identical(observation_metadata(x1), observation_metadata(y1)) and identical(sample_metadata(x1), sample_metadata(y1)) to be TRUE, but they are not. The problem appears to be the column names:
colnames(observation_metadata(x1))
## [1] "taxonomy1" "taxonomy2" "taxonomy3" "taxonomy4" "taxonomy5" "taxonomy6"
## [7] "taxonomy7"
colnames(observation_metadata(y1))
## [1] "V1" "V2" "V3" "V4" "V5" "V6" "V7"
colnames(sample_metadata(x1))
## [1] "BarcodeSequence" "LinkerPrimerSequence" "BODY_SITE"
## [4] "Description"
colnames(sample_metadata(y1))
## [1] "V1" "V2" "V3" "V4"
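To check whether the column names are the only difference, the metadata tables can be compared with the names ignored. This is a minimal sketch using base R only. The helper identical_ignoring_colnames is a hypothetical name, not part of biomformat, and the toy data frames stand in for the real metadata tables.

```r
# Hypothetical helper (not part of biomformat): compare two data frames
# while ignoring their column names.
identical_ignoring_colnames <- function(a, b) {
  # Different shapes can never match, so bail out early.
  if (!identical(dim(a), dim(b))) return(FALSE)
  # Copy the column names over so that only the contents are compared.
  colnames(b) <- colnames(a)
  identical(a, b)
}

# Toy stand-ins for the metadata tables: same contents, different names.
a <- data.frame(taxonomy1 = c("k__Bacteria", "k__Archaea"),
                taxonomy2 = c("p__Firmicutes", "p__Euryarchaeota"),
                row.names = c("OTU_1", "OTU_2"))
b <- a
colnames(b) <- c("V1", "V2")

identical(a, b)                    # FALSE: the column names differ
identical_ignoring_colnames(a, b)  # TRUE: the contents match
```

The same helper could be applied to observation_metadata(x1) and observation_metadata(y1), and to the two sample_metadata tables. I have not confirmed that it returns TRUE there; if it returns FALSE, something other than the column names also changes in the round trip.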
sessionInfo()
## R version 3.4.1 (2017-06-30)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.3 LTS
##
## Matrix products: default
## BLAS: /usr/lib/openblas-base/libblas.so.3
## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] datasets utils stats grDevices graphics methods base
##
## other attached packages:
## [1] biomformat_1.6.0 GGally_1.3.2 broom_0.4.3
## [4] openintro_1.7.1 rvest_0.3.2 xml2_1.1.1
## [7] stringr_1.2.0 lubridate_1.6.0 googlesheets_0.2.2
## [10] ggplot2_2.2.1 rmarkdown_1.8.6 knitr_1.18
## [13] downloader_0.4
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.14 compiler_3.4.1 RColorBrewer_1.1-2
## [4] cellranger_1.1.0 pillar_1.0.1 plyr_1.8.4
## [7] bindr_0.1 zlibbioc_1.24.0 bitops_1.0-6
## [10] tools_3.4.1 digest_0.6.13 rhdf5_2.22.0
## [13] jsonlite_1.5 lattice_0.20-35 nlme_3.1-131
## [16] evaluate_0.10.1 tibble_1.4.1 gtable_0.2.0
## [19] pkgconfig_2.0.1 rlang_0.1.6 Matrix_1.2-10
## [22] psych_1.7.8 yaml_2.1.16 parallel_3.4.1
## [25] bindrcpp_0.2 dplyr_0.7.4 httr_1.3.1
## [28] rprojroot_1.3-2 grid_3.4.1 reshape_0.8.7
## [31] glue_1.2.0 R6_2.2.2 foreign_0.8-69
## [34] reshape2_1.4.3 tidyr_0.7.2 purrr_0.2.4
## [37] magrittr_1.5 backports_1.1.2 scales_0.5.0
## [40] htmltools_0.3.6 mnormt_1.5-5 assertthat_0.2.0
## [43] colorspace_1.3-2 stringi_1.1.6 RCurl_1.95-4.8
## [46] lazyeval_0.2.0 munsell_0.4.3