Question: biomformat::make_biom roundtrip problem?
0
gravatar for jgranek
18 months ago by
jgranek0
jgranek0 wrote:

The example from ?biomformat::make_biom is not behaving as I expect. I assume this is a bug, but perhaps I am misunderstanding the example. It seems to imply that the components should be identical after writing the BIOM files and reading them back in, but the sample and observation metadata are not identical, because their column names differ.

This is particularly confusing because these components are identical between the BIOM instance read from the package data and the instance created from its components using make_biom.

library(biomformat)

The code below is from the Examples section of ?biomformat::make_biom:

# import with default parameters, specify a file
biomfile = system.file("extdata", "rich_dense_otu_table.biom", package = "biomformat")
x = read_biom(biomfile)
data = biom_data(x)
smd = sample_metadata(x)
omd = observation_metadata(x)
# Make a new biom object from component data
y = make_biom(data, smd, omd)
# Won't be identical to x because of header info.
identical(x, y)
## [1] FALSE
# The data components should be, though.
identical(observation_metadata(x), observation_metadata(y))
## [1] TRUE
identical(sample_metadata(x), sample_metadata(y))
## [1] TRUE
identical(biom_data(x), biom_data(y))
## [1] TRUE
## Quickly show that writing and reading still identical.
# Define a temporary directory to write .biom files
tempdir = tempdir()
write_biom(x, biom_file=file.path(tempdir, "x.biom"))
write_biom(y, biom_file=file.path(tempdir, "y.biom"))
x1 = read_biom(file.path(tempdir, "x.biom"))
y1 = read_biom(file.path(tempdir, "y.biom"))
identical(observation_metadata(x1), observation_metadata(y1))
## [1] FALSE
identical(sample_metadata(x1), sample_metadata(y1))
## [1] FALSE
identical(biom_data(x1), biom_data(y1))
## [1] TRUE

Observations

I expect identical(observation_metadata(x1), observation_metadata(y1)) and identical(sample_metadata(x1), sample_metadata(y1)) to be TRUE, but they are not. The problem appears to be the column names:

colnames(observation_metadata(x1))
## [1] "taxonomy1" "taxonomy2" "taxonomy3" "taxonomy4" "taxonomy5" "taxonomy6"
## [7] "taxonomy7"
colnames(observation_metadata(y1))
## [1] "V1" "V2" "V3" "V4" "V5" "V6" "V7"
colnames(sample_metadata(x1))
## [1] "BarcodeSequence"      "LinkerPrimerSequence" "BODY_SITE"           
## [4] "Description"
colnames(sample_metadata(y1))
## [1] "V1" "V2" "V3" "V4"
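
To check whether the column names are the only thing that differs, the tables can be compared with the names ignored. The following is a minimal base R sketch, not part of biomformat: same_ignoring_colnames is a hypothetical helper, the toy data frames a and b only stand in for the metadata tables, and it assumes both arguments are data frames.

```r
# Hypothetical helper (not part of biomformat): compare two data frames
# while ignoring their column names. Assumes both inputs are data frames.
same_ignoring_colnames <- function(a, b) {
  # Tables of different shape can never match
  if (!identical(dim(a), dim(b))) return(FALSE)
  # Overwrite b's column names with a's, then compare everything else
  colnames(b) <- colnames(a)
  identical(a, b)
}

# Toy stand-ins for the metadata tables (made-up values)
a <- data.frame(taxonomy1 = c("k__Bacteria", "k__Archaea"),
                taxonomy2 = c("p__Firmicutes", "p__Euryarchaeota"),
                row.names = c("OTU1", "OTU2"),
                stringsAsFactors = FALSE)
b <- a
colnames(b) <- c("V1", "V2")

identical(a, b)
## [1] FALSE
same_ignoring_colnames(a, b)
## [1] TRUE
```

With biomformat loaded, the same helper could be called as same_ignoring_colnames(observation_metadata(x1), observation_metadata(y1)), and likewise for sample_metadata. A TRUE result would suggest that only the column names are lost in the round trip.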
sessionInfo()
## R version 3.4.1 (2017-06-30)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.3 LTS
## 
## Matrix products: default
## BLAS: /usr/lib/openblas-base/libblas.so.3
## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] datasets  utils     stats     grDevices graphics  methods   base     
## 
## other attached packages:
##  [1] biomformat_1.6.0   GGally_1.3.2       broom_0.4.3       
##  [4] openintro_1.7.1    rvest_0.3.2        xml2_1.1.1        
##  [7] stringr_1.2.0      lubridate_1.6.0    googlesheets_0.2.2
## [10] ggplot2_2.2.1      rmarkdown_1.8.6    knitr_1.18        
## [13] downloader_0.4    
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.14       compiler_3.4.1     RColorBrewer_1.1-2
##  [4] cellranger_1.1.0   pillar_1.0.1       plyr_1.8.4        
##  [7] bindr_0.1          zlibbioc_1.24.0    bitops_1.0-6      
## [10] tools_3.4.1        digest_0.6.13      rhdf5_2.22.0      
## [13] jsonlite_1.5       lattice_0.20-35    nlme_3.1-131      
## [16] evaluate_0.10.1    tibble_1.4.1       gtable_0.2.0      
## [19] pkgconfig_2.0.1    rlang_0.1.6        Matrix_1.2-10     
## [22] psych_1.7.8        yaml_2.1.16        parallel_3.4.1    
## [25] bindrcpp_0.2       dplyr_0.7.4        httr_1.3.1        
## [28] rprojroot_1.3-2    grid_3.4.1         reshape_0.8.7     
## [31] glue_1.2.0         R6_2.2.2           foreign_0.8-69    
## [34] reshape2_1.4.3     tidyr_0.7.2        purrr_0.2.4       
## [37] magrittr_1.5       backports_1.1.2    scales_0.5.0      
## [40] htmltools_0.3.6    mnormt_1.5-5       assertthat_0.2.0  
## [43] colorspace_1.3-2   stringi_1.1.6      RCurl_1.95-4.8    
## [46] lazyeval_0.2.0     munsell_0.4.3