Problem with readFast5Summary in IONiseR related to HDF5
@nathaniellegall-12718
Last seen 7.6 years ago

I'm having a bit of trouble running the readFast5Summary function in IONiseR. The error message mentions HDF5, so I updated the HDF5 libraries from the command line using Homebrew and pip:

brew install hdf5

pip3 install h5py

I then ran the code again, but it still gives me this error message:

> fast5files <- list.files(path = "/Volumes/NGS Lab/MinION/data/downloads/pass/batch_1487083485858", pattern = ".fast5$", full.names = TRUE)
> example.summary <- readFast5Summary( fast5files )
Checking file validity
Reading Channel Data
Reading Raw Data
Reading Template Data
Error in H5Aopen(did, "duration") : 
  HDF5. Attribute. Unable to initialize object.

I think that I have installed all of the additional packages.

> sessionInfo()

R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] BiocStyle_2.2.1 knitr_1.15.1    rmarkdown_1.4   testthat_1.0.2  gridExtra_2.2.1 ggplot2_2.2.1   IONiseR_1.4.4  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.10               RColorBrewer_1.1-2         GenomeInfoDb_1.10.3        plyr_1.8.4                
 [5] XVector_0.14.1             bitops_1.0-6               tools_3.3.3                zlibbioc_1.20.0           
 [9] digest_0.6.12              evaluate_0.10              tibble_1.2                 gtable_0.2.0              
[13] rhdf5_2.18.0               lattice_0.20-35            Matrix_1.2-8               DBI_0.6                   
[17] parallel_3.3.3             stringr_1.2.0              hwriter_1.3.2              dplyr_0.5.0               
[21] Biostrings_2.42.1          S4Vectors_0.12.2           IRanges_2.8.2              rprojroot_1.2             
[25] stats4_3.3.3               grid_3.3.3                 data.table_1.10.4          Biobase_2.34.0            
[29] R6_2.2.0                   BiocParallel_1.8.1         latticeExtra_0.6-28        tidyr_0.6.1               
[33] magrittr_1.5               backports_1.0.5            htmltools_0.3.5            scales_0.4.1              
[37] Rsamtools_1.26.1           GenomicAlignments_1.10.1   BiocGenerics_0.20.0        GenomicRanges_1.26.4      
[41] ShortRead_1.32.1           assertthat_0.1             SummarizedExperiment_1.4.0 colorspace_1.3-2          
[45] stringi_1.1.3              RCurl_1.95-4.8             lazyeval_0.2.0             munsell_0.4.3             
[49] crayon_1.3.2  

Any help to resolve this would be appreciated.

ioniser bioinformatics minion

I notice you have a space in the path to your data, which is generally not a good idea. It may be unrelated to this issue, but I would suggest testing with a path that has no spaces anyway.

Mike Smith ★ 6.6k
@mike-smith
Last seen 19 hours ago
EMBL Heidelberg

That's an unusual error. It means that an attribute I always expect to be present in the fast5 files (essentially how long the sequencing of that read took) can't be found. Unfortunately I can't tell from the message whether that affects all your files or just one.

To begin with, I would suggest running the code on a single file and seeing if you get the error, e.g.

example.summary <- readFast5Summary( fast5files[1] )

If that works fine, you might have to go through the slightly painful process of identifying the offending file by running subsets of the file list until you hit the error.

If you manage to find an example file that throws the error, please send it to me (email, Google Drive, ftp, etc.) and I'll see if I can determine whether this is something odd in that file, or if this is a bug in IONiseR that needs patching.
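
The file-by-file search described above can be automated. The following is a minimal sketch, not part of IONiseR: find_failing_files() is a hypothetical helper that calls a reader function on each file separately and records the error message for any file that fails.

```r
# Hypothetical helper (not part of IONiseR): run `reader` on each file
# separately and collect the error message for every file that fails.
find_failing_files <- function(files, reader) {
  failures <- character(0)
  for (f in files) {
    msg <- tryCatch({
      reader(f)
      NA_character_          # no error was raised for this file
    }, error = function(e) conditionMessage(e))
    if (!is.na(msg)) {
      failures[f] <- msg     # name = file path, value = error message
    }
  }
  failures
}
```

Assuming readFast5Summary accepts a single path (as in the one-file call above), find_failing_files(fast5files, readFast5Summary) would return a named character vector whose names are the offending files. Reading one file per call is slower than a single call over the whole list, so this is only meant for diagnosis.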

@nathaniellegall-12718
Last seen 7.6 years ago

 

I restarted the system and readFast5Summary now appears to work fine on test data from another package. Maybe the HDF5 update needed a restart before it took effect.

> fast5files <- list.files(path = "/Users/NLsMacBook/Documents/poretools-pfaucon/test_data", pattern = ".fast5$", full.names = TRUE)

> example.summary <- readFast5Summary( fast5files )
Checking file validity
Reading Channel Data
Reading Raw Data
Reading Template Data
Reading Complement Data
Reading Template FASTQ
Reading Complement FASTQ
Reading 2D FASTQ
Done

All of the other functions in the tutorial (https://www.bioconductor.org/packages/devel/bioc/vignettes/IONiseR/inst/doc/IONiseR.html) work as well, so I am satisfied that the code works. But when I run it on my own basecalled files, R returns another error.

> lambda <- list.files(path="/Volumes/NL 16GB/reads", pattern = ".fast5$", full.names = TRUE)
> lambda.summary <- readFast5Summary(lambda)
Checking file validity
Error in which(fileStatus) : argument to 'which' is not logical

I think someone else had a similar issue that was related to the file structure, so I will continue this as part of their thread. If I can't find it, I will start another thread.
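
One thing worth ruling out first, as an assumption that is not confirmed anywhere in this thread, is that list.files() matched nothing (for example, if the reads sit in subdirectories). In base R, which() raises "argument to 'which' is not logical" when it is handed a non-logical value, such as the empty list that sapply() returns for zero-length input. A minimal sketch of a guard, where check_fast5_list() is a hypothetical helper and not an IONiseR function:

```r
# Hypothetical guard: stop with a clear message when no files matched,
# otherwise return the file paths unchanged.
check_fast5_list <- function(files) {
  if (length(files) == 0) {
    stop("no .fast5 files found; check the path and pattern")
  }
  files
}
```

Wrapping the file list in check_fast5_list() before passing it to readFast5Summary would make an empty result obvious. If the list is not empty, the cause lies elsewhere.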

 

@lescai-francesco-5078
Last seen 6.2 years ago
Denmark

Hi there,
I now have a similar problem related to HDF5 reading.
We sequenced with Nanopore last week and, although I tried to update all packages, I get this error message:

```
> library(IONiseR)
> fast5files <- list.files(path = "/data/nanopore/nano_20170831/fast5/pass/0/", pattern = ".fast5$", full.names = TRUE)
> summary <- readFast5Summary(fast5files)
Checking file validity
Reading Channel Data
Error in H5Gopen(fid, "/Analyses/EventDetection_000/Reads") : 
  HDF5. Symbol table. Can't open object.
```

I've successfully loaded the metadata with the package poRe, so I would be tempted to say the data are OK.
How would you suggest tracking down or solving the source of the error?

Here's my session info:

```
sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.9 (Final)

Matrix products: default
BLAS/LAPACK: /scratch/.com/extra/OpenBLAS/20150505/lib/libopenblas_sandybridgep-r0.2.14.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bindrcpp_0.2         IONiseR_2.0.0        BiocInstaller_1.26.1

   
```

 

 

 

 


Thanks for reporting the error. If you could make one of your Fast5 files available to me, I'll try to figure out what has changed in the file format and patch IONiseR accordingly. You can find my email address in the package DESCRIPTION file, or put a link to Dropbox, FTP, etc. here.


Hi there,

Any progress on this issue? It seems that I have the same problem, but unfortunately I am not able to provide a fast5 file at the moment.

Best,
Frank

@lescai-francesco-5078
Last seen 6.2 years ago
Denmark

I sent one of my fast5 files to Mike via email. I suppose they're working on it. Looking forward to hearing something too.

 


It looks like EventDetection data is not available in the file you provided. IONiseR assumed this would always be present and had no checks to confirm it. I've patched it so the example file you sent can be read. The fix is available in IONiseR version 2.1.1. That should be built by Bioconductor in the next few days, or you can install it directly with BiocInstaller::biocLite("grimbough/IONiseR").

There might be some knock-on effects with other functions that expect event data to be present, so if you experience any other problems please let me know and I'll fix them.
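
To check in advance whether a given file contains the EventDetection group, one option is to inspect its contents listing. This is a minimal sketch and not the check that was added to IONiseR: has_h5_path() is a hypothetical helper that takes a data frame with group and name columns, which is the layout returned by rhdf5::h5ls().

```r
# Hypothetical helper: `listing` is a data frame with `group` and `name`
# columns, the layout returned by rhdf5::h5ls().
has_h5_path <- function(listing, path) {
  # Join parent group and object name; objects in the root group "/"
  # would otherwise start with a double slash.
  full <- sub("^//", "/", paste(listing$group, listing$name, sep = "/"))
  path %in% full
}
```

With rhdf5 installed, has_h5_path(rhdf5::h5ls(fast5files[1]), "/Analyses/EventDetection_000/Reads") should be FALSE for a file like the one described above, where the event data is missing.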
