I can change the dimensions of a standard array object with the dim function, but it seems that the on-disk array packages (e.g. DelayedArray, HDF5Array, and TileDBArray) cannot do this.
Does anyone know of a convenient way to do this?
library("DelayedArray")
library("HDF5Array")
library("TileDBArray")
arr <- array(runif(2*3*4), dim=2:4)
darr <- DelayedArray(arr)
hdf5arr <- as(arr, "HDF5Array")
tilearr <- as(arr, "TileDBArray")
# This can be performed
dim(arr) <- c(2, 3*4)
# These cannot be performed
dim(darr) <- c(2, 3*4)
dim(hdf5arr) <- c(2, 3*4)
dim(tilearr) <- c(2, 3*4)
sessionInfo()
R Under development (unstable) (2021-03-18 r80099)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS
Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] TileDBArray_1.1.3 HDF5Array_1.19.14 rhdf5_2.35.2
[4] rTensor_1.4.1 DelayedArray_0.17.10 IRanges_2.25.9
[7] S4Vectors_0.29.15 MatrixGenerics_1.3.1 matrixStats_0.58.0
[10] BiocGenerics_0.37.2 Matrix_1.3-2 testthat_3.0.2
[13] BiocManager_1.30.12
loaded via a namespace (and not attached):
[1] Rcpp_1.0.6 RcppCCTZ_0.2.9 magrittr_2.0.1 bit_4.0.4
[5] pkgload_1.2.1 nanotime_0.3.2 lattice_0.20-41 R6_2.5.0
[9] rlang_0.4.10 tools_4.1.0 grid_4.1.0 tiledb_0.9.0
[13] withr_2.4.2 bit64_4.0.5 rprojroot_2.0.2 crayon_1.4.1
[17] Rhdf5lib_1.13.4 rhdf5filters_1.3.4 compiler_4.1.0 desc_1.3.0
[21] zoo_1.8-9
Thanks for letting me know about ReshapedHDF5Array. I'll try this one for now.
I'm looking forward to the dim() setter.
Best,
Koki
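For readers following along, here is a minimal sketch of the ReshapedHDF5Array approach mentioned above. It assumes the data is first written to an HDF5 file; the temporary file path and the dataset name "arr" are made up for illustration. As far as I understand the HDF5Array documentation, only reshapings that collapse one group of consecutive dimensions into a single dimension are supported, which is the case here (3 and 4 are merged into 12).

```r
library(HDF5Array)

# Same toy data as in the question.
arr <- array(runif(2 * 3 * 4), dim = 2:4)

# Write the array to an HDF5 file (path and dataset name are illustrative).
h5file <- tempfile(fileext = ".h5")
writeHDF5Array(arr, filepath = h5file, name = "arr")

# View the 2 x 3 x 4 dataset as a 2 x 12 matrix without copying the data.
# The last two dimensions are collapsed into one.
reshaped <- ReshapedHDF5Array(h5file, "arr", dim = c(2L, 12L))
dim(reshaped)
```

The result is still a DelayedArray derivative, so the data stays on disk until it is realized, for example with as.matrix(reshaped).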
It would be nice if reshaping that increases the number of dimensions could also be supported by the dim<- setter.
I will look into this although I don't promise anything. Even though many of the operations supported by standard arrays can be implemented as delayed operations, some of them cannot.
So, what is a good way to change the dimensions of a DelayedArray for now?
As shown below, I also found that ReshapedHDF5Array cannot be used not only when the number of dimensions increases, but also when it decreases by two or more.
I tried to use an ordinary for loop to assign the data sequentially, but it didn't work. I think it probably corresponds to the 1D-style subassignment in the documentation's description of subassignment, but the error message is difficult for me to interpret. https://www.bioconductor.org/packages/release/bioc/manuals/DelayedArray/man/DelayedArray.pdf
As mentioned somewhere else, using delayed subassignments in a loop is almost never a good idea. Not only because it might lead to an "Error: C stack usage is too close to the limit", but also because, even if it works, it will probably be a very inefficient solution. It's almost always better to write things directly to disk as you go instead of using a loop to modify the entire content of a DelayedArray via delayed subassignments.
The writing-to-disk-as-you-go solution looks something like this:
Then:
Note that this approach of walking on two grids simultaneously (one for the input, one for the output) is described in EXAMPLE 2 of the ?write_block man page.
Hope this helps,
H.
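To illustrate the two-grid idea described above, here is a rough sketch (not the code from the original reply) for the 2 x 3 x 4 to 2 x 12 case from the question. The helper name collapse_last_dims_to_HDF5 is hypothetical and not part of any package. It assumes a 3D input whose last two dimensions are to be collapsed, and it relies on the fact that, in column-major order, slab x[, , k] lands in columns 3*(k-1)+1 to 3*k of the result.

```r
library(DelayedArray)
library(HDF5Array)

# Hypothetical helper: collapse the last two dimensions of a 3D array-like
# object and write the result to an HDF5 file block by block.
collapse_last_dims_to_HDF5 <- function(x) {
    d <- dim(x)
    stopifnot(length(d) == 3L)
    out_dim <- c(d[1], d[2] * d[3])

    # On-disk destination with the new dimensions.
    sink <- HDF5RealizationSink(out_dim, type = type(x))

    # Input grid: one block per slab x[, , k].
    in_grid <- RegularArrayGrid(d, spacings = c(d[1], d[2], 1L))
    # Output grid: one block per group of d[2] consecutive columns.
    out_grid <- RegularArrayGrid(out_dim, spacings = c(d[1], d[2]))
    stopifnot(length(in_grid) == length(out_grid))

    # Walk on both grids simultaneously.
    for (bid in seq_along(in_grid)) {
        block <- read_block(x, in_grid[[bid]])
        dim(block) <- dim(out_grid[[bid]])  # drop the trailing dimension of 1
        sink <- write_block(sink, out_grid[[bid]], block)
    }
    close(sink)
    as(sink, "DelayedArray")
}

arr <- array(runif(2 * 3 * 4), dim = 2:4)
res <- collapse_last_dims_to_HDF5(DelayedArray(arr))
dim(res)
```

Only one slab is held in memory at a time, so the same pattern should scale to inputs that do not fit in memory, as long as a single slab does. For other reshapings the two grids have to be chosen so that block number bid of the input holds exactly the elements of block number bid of the output.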