I tried to create a very large three-dimensional array.
It cannot be created as an ordinary in-memory array because of an out-of-memory error,
so I used HDF5Array,
but the writing step stopped with a C stack error.
Do I need some special setting when using HDF5Array?
I increased the C stack size with ulimit -s,
but the error persists.
Thank you in advance.
library("HDF5Array")

# cf. https://rdrr.io/bioc/DelayedArray/man/write_block.html
.sarray <- function(dim){
    dim <- as.integer(dim)
    setAutoRealizationBackend("HDF5Array")
    sink <- AutoRealizationSink(dim, as.sparse=TRUE)
    close(sink)
    as(sink, "DelayedArray")
}

human <- array(runif(13889*1977), dim=c(13889, 1977))
mouse <- array(runif(13889*1907), dim=c(13889, 1907))
new_modes <- c(ncol(human), ncol(mouse), nrow(human))
darr <- .sarray(new_modes)

for(i in seq(dim(darr)[3])){
    print(paste0(i, " / ", dim(darr)[3]))
    darr[,,i] <- outer(human[i,], mouse[i,])
}
# After several steps (e.g. 90 / 13889),
# the calculation stops with the following error:
# Error: C stack usage 1947092 is too close to the limit
I'm using the devel versions of R and Bioconductor.
sessionInfo()
R Under development (unstable) (2021-03-18 r80099)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] rTensor_1.4.1        testthat_3.0.2       BiocSingular_1.7.2
 [4] HDF5Array_1.19.15    rhdf5_2.35.2         DelayedArray_0.17.11
 [7] IRanges_2.25.10      S4Vectors_0.29.17    MatrixGenerics_1.3.1
[10] matrixStats_0.58.0   BiocGenerics_0.37.4  Matrix_1.3-2
[13] BiocManager_1.30.12

loaded via a namespace (and not attached):
 [1] rprojroot_2.0.2     compiler_4.1.0      tools_4.1.0
 [4] rsvd_1.0.5          Rcpp_1.0.6          rhdf5filters_1.3.4
 [7] beachmat_2.7.7      irlba_2.3.3         desc_1.3.0
[10] ScaledMatrix_0.99.2 BiocParallel_1.25.5 rlang_0.4.10
[13] lattice_0.20-41     Rhdf5lib_1.13.4     magrittr_2.0.1
[16] R6_2.5.0            withr_2.4.2         crayon_1.4.1
[19] grid_4.1.0
OK, I didn't fully understand the documentation at first because it was a little technical for me, but I think I finally got it.
I see that I need to align the blocks of the in-memory object with the blocks that are written to the HDF5 file.
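For anyone who finds this later, here is a minimal sketch of what I mean, under a few assumptions of my own. It uses tiny toy matrices instead of the real 13889-row data. It writes one outer-product slice per block with write_block() while the sink is still open, rather than assigning into the DelayedArray after close(). It also sets chunkdim so that one HDF5 chunk matches one slice. The chunk shape is only my choice for illustration; for the real data a smaller chunk may be preferable. My understanding, which I have not verified in the source, is that each darr[,,i] <- ... on a DelayedArray is recorded as a delayed operation instead of being written to disk, so a long loop stacks them up until the C stack limit is reached.

```r
library(HDF5Array)

# Toy inputs (hypothetical sizes): rows are shared features, columns are samples.
set.seed(1)
human <- matrix(runif(5 * 4), nrow = 5)
mouse <- matrix(runif(5 * 3), nrow = 5)
dims <- c(ncol(human), ncol(mouse), nrow(human))

# Open an on-disk sink whose HDF5 chunks match one [ , , i] slice.
sink <- HDF5RealizationSink(dims, type = "double",
                            chunkdim = c(dims[1], dims[2], 1L))

# A grid over the sink with exactly one block per slice along the 3rd dimension.
grid <- RegularArrayGrid(dims, spacings = c(dims[1], dims[2], 1L))

for (b in seq_along(grid)) {
    viewport <- grid[[b]]
    i <- start(viewport)[3]  # index of the slice covered by this block
    # The block must have the same dimensions as the viewport it is written to.
    block <- array(outer(human[i, ], mouse[i, ]), dim = dim(viewport))
    sink <- write_block(sink, viewport, block)
}

# Close the sink only after all blocks are written, then wrap it.
close(sink)
darr <- as(sink, "DelayedArray")
```

With this pattern only one slice is held in memory at a time, and every write goes straight to the HDF5 file, so no delayed operations accumulate on the result.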