biomformat::make_biom roundtrip problem?
jgranek • 0
@jgranek-9614
Last seen 6.8 years ago

The example from ?biomformat::make_biom does not behave as I expect. I assume this is a bug, but perhaps I am misunderstanding the example. It seems to imply that the components should still be identical after writing the BIOM objects to files and reading them back in, but the sample and observation metadata are not identical, because their column names differ.

This is particularly confusing because these components are identical between the BIOM object read from the package data and the object created from its components with make_biom.

library(biomformat)

The code below is from the Examples section of ?biomformat::make_biom:

# import with default parameters, specify a file
biomfile = system.file("extdata", "rich_dense_otu_table.biom", package = "biomformat")
x = read_biom(biomfile)
data = biom_data(x)
smd = sample_metadata(x)
omd = observation_metadata(x)
# Make a new biom object from component data
y = make_biom(data, smd, omd)
# Won't be identical to x because of header info.
identical(x, y)
## [1] FALSE
# The data components should be, though.
identical(observation_metadata(x), observation_metadata(y))
## [1] TRUE
identical(sample_metadata(x), sample_metadata(y))
## [1] TRUE
identical(biom_data(x), biom_data(y))
## [1] TRUE
## Quickly show that writing and reading still identical.
# Define a temporary directory to write .biom files
tempdir = tempdir()
write_biom(x, biom_file=file.path(tempdir, "x.biom"))
write_biom(y, biom_file=file.path(tempdir, "y.biom"))
x1 = read_biom(file.path(tempdir, "x.biom"))
y1 = read_biom(file.path(tempdir, "y.biom"))
identical(observation_metadata(x1), observation_metadata(y1))
## [1] FALSE
identical(sample_metadata(x1), sample_metadata(y1))
## [1] FALSE
identical(biom_data(x1), biom_data(y1))
## [1] TRUE

Observations

I expect identical(observation_metadata(x1), observation_metadata(y1)) and identical(sample_metadata(x1), sample_metadata(y1)) to be TRUE, but they are not. The problem appears to be the column names:

colnames(observation_metadata(x1))
## [1] "taxonomy1" "taxonomy2" "taxonomy3" "taxonomy4" "taxonomy5" "taxonomy6"
## [7] "taxonomy7"
colnames(observation_metadata(y1))
## [1] "V1" "V2" "V3" "V4" "V5" "V6" "V7"
colnames(sample_metadata(x1))
## [1] "BarcodeSequence"      "LinkerPrimerSequence" "BODY_SITE"           
## [4] "Description"
colnames(sample_metadata(y1))
## [1] "V1" "V2" "V3" "V4"
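
To check whether the column names are the only difference, one option is to copy the names from one table onto the other and compare again. Below is a minimal sketch in base R. It uses small toy data frames as stand-ins for observation_metadata(x1) and observation_metadata(y1), so the values are made up for illustration, and the helper name same_apart_from_colnames is hypothetical (it is not part of biomformat).

```r
# Toy stand-ins for observation_metadata(x1) and observation_metadata(y1).
# The values are invented; only the column-name pattern mirrors the output above.
omd_x <- data.frame(
  taxonomy1 = c("k__Bacteria", "k__Archaea"),
  taxonomy2 = c("p__Firmicutes", "p__Euryarchaeota"),
  row.names = c("OTU_1", "OTU_2"),
  stringsAsFactors = FALSE
)
omd_y <- omd_x
colnames(omd_y) <- c("V1", "V2")

# Hypothetical helper: TRUE if a and b are identical once b is given a's
# column names, i.e. the column names are the only difference.
same_apart_from_colnames <- function(a, b) {
  if (!identical(dim(a), dim(b))) return(FALSE)
  colnames(b) <- colnames(a)
  identical(a, b)
}

identical(omd_x, omd_y)                 # column names differ
same_apart_from_colnames(omd_x, omd_y)  # values, row names and types agree
```

If the same helper returns TRUE for the real x1 and y1 metadata, that would suggest the values survive the round trip and only the column names are lost.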
sessionInfo()
## R version 3.4.1 (2017-06-30)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.3 LTS
## 
## Matrix products: default
## BLAS: /usr/lib/openblas-base/libblas.so.3
## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] datasets  utils     stats     grDevices graphics  methods   base     
## 
## other attached packages:
##  [1] biomformat_1.6.0   GGally_1.3.2       broom_0.4.3       
##  [4] openintro_1.7.1    rvest_0.3.2        xml2_1.1.1        
##  [7] stringr_1.2.0      lubridate_1.6.0    googlesheets_0.2.2
## [10] ggplot2_2.2.1      rmarkdown_1.8.6    knitr_1.18        
## [13] downloader_0.4    
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.14       compiler_3.4.1     RColorBrewer_1.1-2
##  [4] cellranger_1.1.0   pillar_1.0.1       plyr_1.8.4        
##  [7] bindr_0.1          zlibbioc_1.24.0    bitops_1.0-6      
## [10] tools_3.4.1        digest_0.6.13      rhdf5_2.22.0      
## [13] jsonlite_1.5       lattice_0.20-35    nlme_3.1-131      
## [16] evaluate_0.10.1    tibble_1.4.1       gtable_0.2.0      
## [19] pkgconfig_2.0.1    rlang_0.1.6        Matrix_1.2-10     
## [22] psych_1.7.8        yaml_2.1.16        parallel_3.4.1    
## [25] bindrcpp_0.2       dplyr_0.7.4        httr_1.3.1        
## [28] rprojroot_1.3-2    grid_3.4.1         reshape_0.8.7     
## [31] glue_1.2.0         R6_2.2.2           foreign_0.8-69    
## [34] reshape2_1.4.3     tidyr_0.7.2        purrr_0.2.4       
## [37] magrittr_1.5       backports_1.1.2    scales_0.5.0      
## [40] htmltools_0.3.6    mnormt_1.5-5       assertthat_0.2.0  
## [43] colorspace_1.3-2   stringi_1.1.6      RCurl_1.95-4.8    
## [46] lazyeval_0.2.0     munsell_0.4.3