Question: bsseq combineList: how to get an HDF5Array-backed object as return value?
0
gravatar for Martin Aryee
2.1 years ago by
Martin Aryee20
Boston, MA, USA
Martin Aryee20 wrote:

I'm using HDF5Array-backed BSseq objects. I'd like to combine several of them into a new HDF5Array-backed object using combineList. I'd like to know 1) how to get a disk-backed BSseq object returned from combineList and 2) how to specify the file to be used. Below is an example of what I'm doing. Note that the output of combineList(hdf5_BS1, hdf5_BS2) is an in-memory object.

Thanks,

Martin

 

> library(bsseq)
> library(HDF5Array)
> 
> M <- matrix(0:8, 3, 3)
> Cov <- matrix(1:9, 3, 3)
> 
> hdf5_M <- writeHDF5Array(M)
> hdf5_Cov <- writeHDF5Array(Cov)
> hdf5_BS1 <- BSseq(chr = c("chr1", "chr2", "chr1"),
+                   pos = c(1, 2, 3),
+                   M = hdf5_M,
+                   Cov = hdf5_Cov,
+                   sampleNames = c("A", "B", "C"))
> hdf5_BS1
An object of type 'BSseq' with
  3 methylation loci
  3 samples
has not been smoothed
Some assays are HDF5Array-backed
> hdf5_BS2 <- BSseq(chr = c("chr1", "chr1", "chr1"),
+                   pos = c(3, 4, 5),
+                   M = hdf5_M,
+                   Cov = hdf5_Cov,
+                   sampleNames = c("D", "E", "F"))
> hdf5_BS2
An object of type 'BSseq' with
  3 methylation loci
  3 samples
has not been smoothed
Some assays are HDF5Array-backed
> 
> combineList(hdf5_BS1, hdf5_BS2)
An object of type 'BSseq' with
  5 methylation loci
  6 samples
has not been smoothed
All assays are in-memory
> 
> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.5

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] HDF5Array_1.4.8            rhdf5_2.20.0               bsseq_1.12.1              
 [4] SummarizedExperiment_1.6.3 DelayedArray_0.2.7         matrixStats_0.52.2        
 [7] Biobase_2.36.2             GenomicRanges_1.28.3       GenomeInfoDb_1.12.1       
[10] IRanges_2.10.2             S4Vectors_0.14.3           BiocGenerics_0.22.0       

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.11            XVector_0.16.0          zlibbioc_1.22.0         munsell_0.4.3          
 [5] colorspace_1.3-2        lattice_0.20-35         plyr_1.8.4              tools_3.4.0            
 [9] grid_3.4.0              data.table_1.10.4       R.oo_1.21.0             gtools_3.5.0           
[13] permute_0.9-4           Matrix_1.2-10           GenomeInfoDbData_0.99.0 R.utils_2.5.0          
[17] bitops_1.0-6            RCurl_1.95-4.8          limma_3.32.2            compiler_3.4.0         
[21] R.methodsS3_1.7.1       scales_0.4.1            locfit_1.5-9.1         
>
bsseq • 543 views
ADD COMMENTlink modified 2.1 years ago by Peter Hickey450 • written 2.1 years ago by Martin Aryee20
Answer: bsseq combineList: how to get an HDF5Array-backed object as return value?
2
gravatar for Peter Hickey
2.1 years ago by
Peter Hickey450
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Peter Hickey450 wrote:

Hi Martin,

combineList(hdf5_BS1, hdf5_BS2, BACKEND = "HDF5Array") will give you a HDF5Array-backed BSseq object. It's somewhat buried in the docs, but see ?combineList. 

Currently, the default location for HDF5 files is specified via HDF5Array::setHDF5DumpDir(). You can check the current dump dir (by default, a temp dir) using HDF5Array::getHDF5DumpDir(). Using combineList() you'll end up with one HDF5 file per assay, per sample (I'm 95% sure on this).

Once you've made your final object, you may want to simplify the number of HDF5 files and datasets backing the object. The easiest way to do this, and what I'd recommend, is using SummarizedExperiment::saveHDF5SummarizedExperiment(). Alternatively, you can run DelayedArray::realize() on each assay. Please see the docs of these functions for control over where they write their output files. Both these options will write yet more HDF5 file(s), but 'consolidate' intermediate files made along the way.

I plan to write a vignette documenting best practises for using HDF5-backed BSseq objects, but it's a bit of a moving target at the moment as we figure things out. I'd certainly appreciate any feedback and suggestions you have from your experience.

Cheers,

Pete

ADD COMMENTlink written 2.1 years ago by Peter Hickey450

Thanks! That works exactly as expected. I now have the combined HDF5Array-backed BSseq object. I'll report back on my experience working with these objects.

ADD REPLYlink written 2.1 years ago by Martin Aryee20
We should make this (and other) function(s) auto-guess the backend of the output. Obv. the default when input is all HDF5 should be HDF5 and likewise if all input is in-memory matrices. Should have capability of overriding of course. On Sat, Jun 10, 2017 at 1:20 AM, Martin Aryee [bioc] < noreply@bioconductor.org> wrote: > Activity on a post you are following on support.bioconductor.org > > User Martin Aryee <https: support.bioconductor.org="" u="" 5585=""/> wrote Comment: > bsseq combineList: how to get an HDF5Array-backed object as return value? > <https: support.bioconductor.org="" p="" 96658="" #96914="">: > > Thanks! That works exactly as expected. I now have the > combined HDF5Array-backed BSseq object. I'll report back on my experience > working with these objects. > > ------------------------------ > > Post tags: bsseq > > You may reply via email or visit https://support.bioconductor. > org/p/96658/#96914 >
ADD REPLYlink written 2.1 years ago by Kasper Daniel Hansen6.4k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 170 users visited in the last hour