Question: error with HDF5Array
17 months ago by
cath30
cath30 wrote:

The HDF5Array package seems to install correctly, but I cannot load it. It gives me the following error:

library(HDF5Array)

Error : package or namespace load failed for ‘HDF5Array’:

.onLoad failed in loadNamespace() for 'HDF5Array', details :

call : H5Fcreate(file)

error : HDF5. File accessibilty. Unable to open file.

biocValid() tells me my BiocParallel version is out of date, but I cannot install the latest version from source because I get a "non-zero exit status" error.

sessionInfo()

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 16299)

Matrix products: default

locale:
[1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252    LC_MONETARY=French_France.1252 LC_NUMERIC=C                   LC_TIME=French_France.1252

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] rhdf5_2.24.0         DelayedArray_0.6.0   BiocParallel_1.13.1  IRanges_2.14.1       S4Vectors_0.18.1     BiocGenerics_0.26.0  matrixStats_0.53.1   httr_1.3.1
[9] BiocInstaller_1.30.0

loaded via a namespace (and not attached):
[1] digest_0.6.15   withr_2.1.2     R6_2.2.2        git2r_0.21.0    curl_3.2        devtools_1.13.5 Rhdf5lib_1.2.0  tools_3.5.0     compiler_3.5.0  memoise_1.1.0

EDIT 2018/05/22 : outputs of library(HDF5Array) and traceback()

library(HDF5Array)
Loading required package: DelayedArray
Loading required package: stats4
Loading required package: matrixStats
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

anyDuplicated, append, as.data.frame, basename, cbind, colMeans, colnames, colSums, dirname, do.call, duplicated, eval,
evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, lengths, Map, mapply, match, mget, order, paste,
pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rowMeans, rownames, rowSums, sapply, setdiff, sort, table,
tapply, union, unique, unsplit, which, which.max, which.min

Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

expand.grid

Loading required package: IRanges

Attaching package: ‘IRanges’

The following object is masked from ‘package:grDevices’:

windows

Loading required package: BiocParallel

Attaching package: ‘DelayedArray’

The following objects are masked from ‘package:matrixStats’:

colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges

The following objects are masked from ‘package:base’:

aperm, apply

Loading required package: rhdf5
Error: package or namespace load failed for ‘HDF5Array’:
.onLoad failed in loadNamespace() for 'HDF5Array', details:
call: H5Fcreate(file)
error: HDF5. File accessibilty. Unable to open file.

> traceback()
6: stop(msg, call. = FALSE, domain = NA)
5: value[[3L]](cond)
4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
3: tryCatchList(expr, classes, parentenv, handlers)
2: tryCatch({
       attr(package, "LibPath") <- which.lib.loc
       ns <- loadNamespace(package, lib.loc)
       env <- attachNamespace(ns, pos = pos, deps)
   }, error = function(e) {
       P <- if (!is.null(cc <- conditionCall(e)))
           paste(" in", deparse(cc)[1L])
       else ""
       msg <- gettextf("package or namespace load failed for %s%s:\n %s",
           sQuote(package), P, conditionMessage(e))
       if (logical.return)
           message(paste("Error:", msg), domain = NA)
       else stop(msg, call. = FALSE, domain = NA)
   })
1: library(HDF5Array)

biocparallel hdf5array • 788 views
modified 16 months ago by Mike Smith4.0k • written 17 months ago by cath30
Answer: error with HDF5Array
16 months ago by
Mike Smith4.0k
EMBL Heidelberg / de.NBI
Mike Smith4.0k wrote:

This seems to be an issue with the way Windows and HDF5 interact when trying to access files whose path is UTF-8 encoded.  I'm not entirely clear whether the fundamental issue lies with Windows or HDF5 (or maybe both), but there are some more details at https://support.hdfgroup.org/HDF5/doc/Advanced/UsingUnicode/index.html.

It also seems that some of R's internal path processing can change the encoding, which is why the first call below will fail and the second will succeed, even though the first is basically just a wrapper around the second.

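setwd("éxample")
## fails: h5createFile() is a wrapper around H5Fcreate()
rhdf5::h5createFile("test.h5")
## succeeds: direct call to the low-level function
rhdf5::H5Fclose( rhdf5::H5Fcreate("test.h5") )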


I've updated rhdf5 to try and ensure paths are presented using the native encoding for the system.  This should work in the case above where 'é' can be represented in the Latin-1 encoding, but I expect it will still cause issues if more exotic characters are present.

If you'd like to test this then you can install rhdf5 version 2.25.3 from GitHub with

BiocInstaller::biocLite('grimbough/rhdf5')


and then see if library(HDF5Array) works for you.
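
A minimal sketch of how one might inspect a path's encoding from R before handing it to rhdf5. It uses only base R functions (Encoding(), enc2native(), enc2utf8(), utf8ToInt()); the path and the helper has_non_ascii() are made up for illustration, and this is not the change that was made in rhdf5 itself.

```r
# Hypothetical path with a non-ASCII character. The 'é' is written as a
# Unicode escape so the example does not depend on the source file encoding.
path <- "\u00e9xample/test.h5"

# Hypothetical helper: TRUE for each string that contains a character
# outside 7-bit ASCII.
has_non_ascii <- function(x) {
  vapply(enc2utf8(x), function(s) any(utf8ToInt(s) > 127L), logical(1),
         USE.NAMES = FALSE)
}

Encoding(path)        # how R has marked the string internally
has_non_ascii(path)   # flags paths that may need care on Windows

# Re-encode explicitly: native encoding for the current session, or UTF-8.
native_path <- enc2native(path)
utf8_path <- enc2utf8(path)
```

Whether the native or the UTF-8 form works with a given HDF5 build on Windows is exactly what is uncertain here, so this is only a way to see what is being passed around.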

modified 16 months ago • written 16 months ago by Mike Smith4.0k

I just tried and it works perfectly :-)

Answer: error with HDF5Array
17 months ago by
Ljubljana
Roman Luštrik20 wrote:

Cath mentioned that when building the package from source, she received this error:

Warning in system(cmd) : 'make' not found
ERROR: compilation failed for package 'BiocParallel'

Based on this error, I would wager that Rtools is not found in PATH. Here's what my PATH looks like:

C:\RBuildTools\3.3\bin;C:\RBuildTools\3.3\gcc-4.6.3\bin64;C:\RBuildTools\3.3\gcc-4.6.3\bin;C:\RBuildTools\3.3\gcc-4.6.3\i686-w64-mingw32\bin

This is because the latest version of BiocParallel has not built successfully on the release builds; it is available as an older '.zip' version, and I believe that this can be installed with biocLite("BiocParallel", type = "win.binary"). It might also be necessary to set options(install.packages.check.source = "no"). We will work to get BiocParallel building on Windows ASAP.

written 17 months ago by Martin Morgan ♦♦ 23k

For what it's worth, BiocParallel should now install on Windows without needing Rtools or to build from source.

written 17 months ago by Martin Morgan ♦♦ 23k

I ran the lines above and now I get TRUE from biocValid(), but I still get the error when trying to load HDF5Array. (Before running them, I had added Rtools to my PATH.) Hopefully it will be solved soon with the different fixes...

modified 17 months ago • written 17 months ago by cath30

I guess your installation is valid

BiocInstaller::biocValid()

Can you copy and paste the output of your attempt to attach HDF5Array, followed immediately by the output of traceback()?

library(HDF5Array)
traceback()
written 17 months ago by Martin Morgan ♦♦ 23k

Sorry for the late reply; I didn't get a notification of your comment. My installation is indeed valid (or at least biocValid() says so). I'm afraid I won't be able to copy/paste the lines until next Tuesday (I work at two places, and I can't be sure I have exactly the same configuration where I am now).

@MartinMorgan, I edited the question with both outputs. Thank you


Try this

library(rhdf5)
debug(H5Fcreate)
library(HDF5Array)

You'll enter the debugger browser (see ?browser)

debugging in: H5Fcreate(file)
debug: {
if (length(name) != 1 || !is.character(name))
stop("'name' must be a character string of length 1"
...
Browse[2]>

The HDF5Array package is trying to open a temporary file. I have the following, yours should be similar

Browse[2]> name
[1] "/tmp/RtmpOLoewc/HDF5Array_dump/auto00001.h5"
Browse[2]> file.exists(dirname(name))
[1] TRUE
Browse[2]> file.exists(name)
[1] FALSE

The first part of the file path is the result of tempdir(), and it should exist

Browse[2]> tempdir()
[1] "/tmp/RtmpOLoewc"
Browse[2]> file.exists(tempdir())
[1] TRUE

You can step through the function by pressing the 'n' (next) key followed by carriage return <cr> several times

Browse[2]> n<cr>
debug: if (length(name) != 1 || !is.character(name)) stop("'name' must be a character string of length 1")
Browse[2]><cr>
...
debug: fid <- .Call("_H5Fcreate", name, flags, fcpl@ID, fapl@ID, PACKAGE = "rhdf5")
Browse[2]> 

At this point you could copy and paste the line above into the browser

Browse[2]> fid <- .Call("_H5Fcreate", name, flags, fcpl@ID, fapl@ID, PACKAGE = "rhdf5")
Browse[2]> 

I guess for you it will throw an error. If you've gotten this far and everything has been OK, then it would be interesting to try to create a file in your user directory, e.g.,

Browse[2]> name = file.choose()
Browse[2]> fid <- .Call("_H5Fcreate", name, flags, fcpl@ID, fapl@ID, PACKAGE = "rhdf5")

Hopefully that gets us enough information for further troubleshooting.

modified 17 months ago • written 17 months ago by Martin Morgan ♦♦ 23k

Everything went fine until the last lines. I tried with file.choose(), but I only get the error "HDF5. File accessibilty. Unable to open file." when I run the last line.

N.B.: I don't know if this is relevant, but if I try H5Fcreate("auto00001.h5") outside of the library(HDF5Array) call (i.e. directly in the console), it works fine.

modified 17 months ago • written 17 months ago by cath30

Can you share the name, i.e., the location the file is being written to? Is it on a different share ('hard drive') compared to where HDF5Array is being installed? Is there enough space at that location?

written 17 months ago by Martin Morgan ♦♦ 23k

It's on the same hard drive as where HDF5Array is installed (file location: "C:/Users/Catherine Guérin/AppData/Local/Temp/Rtmp6nPtSB/HDF5Array_dump/auto00001.h5"; location of HDF5Array: "C:/Program Files/R/R-3.5.0/library").

There is enough space there (> 52 GB) and I have write permission (I tried writing a trivial file with write.table with the same target file and it worked without problem).
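
A check like the one described above can be sketched in base R as follows. The helper name can_write_to() is made up for illustration; it only shows whether an ordinary file can be created in a directory, which helps separate permission or disk-space problems from an HDF5-specific failure.

```r
# Hypothetical helper: TRUE if an ordinary (non-HDF5) file can be created
# in `dir`. The probe file is removed again afterwards.
can_write_to <- function(dir) {
  probe <- tempfile(pattern = "probe_", tmpdir = dir, fileext = ".txt")
  ok <- suppressWarnings(file.create(probe))
  if (ok) unlink(probe)
  ok
}

target_dir <- tempdir()   # the directory HDF5Array writes its dump files under
file.exists(target_dir)   # does the directory exist?
can_write_to(target_dir)  # can base R create a file there?
```

If this succeeds while H5Fcreate() on the same directory fails, the problem is more likely in how the path reaches HDF5 than in the file system itself.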

I tried different things, and it seems to work when using a single slash ("/") instead of two backslashes ("\\") in the path. I don't see how that is possible (but I don't understand how H5Fcreate works, so...), but I hope it still makes sense...


My guess is that this is going to be an issue with the 'é' in your user name, and (probably) HDF5 is handling it badly.  I'll try to make a quick example to test.

Update:

I just tried the following on my Windows machine. Not sure how easy it will be to replicate unless you can write to a folder outside of your home directory.

## this doesn't work
dir.create('Catherine Guérin')
rhdf5::h5createFile("Catherine Guérin/test.h5")

## this does work
dir.create('Catherine Guerin')
rhdf5::h5createFile("Catherine Guerin/test.h5")


Both of these work fine on Linux, suggesting this is a Windows/HDF5 problem. I'll keep digging to see if I can find anything related to this in existing HDF5 documentation.

modified 17 months ago • written 17 months ago by Mike Smith4.0k

Very likely to be the problem.

I just tried to change name to paste0(tempdir(), "\\HDF5Array_dump\\auto00001.h5") while debugging H5Fcreate and it worked. Basically, I have:

tempdir()
[1] "C:\\Users\\CATHER~1\\AppData\\Local\\Temp\\Rtmp02Yt0G"

but

name
[1] "C:\\Users\\Catherine Guérin\\AppData\\Local\\Temp\\Rtmp02Yt0G\\HDF5Array_dump\\auto00001.h5"

So I guess either I redefine the temporary directory for R, or the source code of H5Fcreate could be modified a bit?
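
For the first option, a rough sketch: R chooses its temporary directory at startup from the TMPDIR, TMP and TEMP environment variables, so it cannot be changed with Sys.setenv() in a running session. The snippet below only detects the situation and prints a suggestion; the directory C:/Rtmp is a made-up example and would need to exist and be writable.

```r
# Where is this session's temporary directory, and which environment
# variables could have determined it?
tmp <- tempdir()
Sys.getenv(c("TMPDIR", "TMP", "TEMP"))

# iconv() returns NA when the string cannot be represented in ASCII,
# i.e. when the path contains a character such as 'é'.
tmp_has_non_ascii <- is.na(iconv(tmp, from = "", to = "ASCII"))

if (tmp_has_non_ascii) {
  message("tempdir() contains non-ASCII characters: ", tmp)
  # Assumption: C:/Rtmp is a hypothetical ASCII-only directory.
  message("Consider adding a line such as TMPDIR=C:/Rtmp to your .Renviron ",
          "file and restarting R.")
}
```

This works around the problem on one machine; it does not address the underlying handling of non-ASCII paths.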

modified 17 months ago • written 17 months ago by cath30