Question: failure importing data with tximport with error information
6 weeks ago by
me553710
me553710 wrote:

Hello, I am new to bioinformatics and Galaxy, so my question may be naive. I'm reading RNA-seq count data generated by kallisto using tximport:

library(EnsDb.Hsapiens.v75)
library(tximport)

# Build the transcript-to-gene table from the Ensembl v75 annotation
humandb <- EnsDb.Hsapiens.v75
tran.human <- transcripts(humandb, return.type = "DataFrame")
tran.human <- tran.human[, c("tx_id", "gene_id")]

gene.human <- genes(humandb, return.type = "DataFrame")
gene.human <- gene.human[, c("gene_id", "symbol")]

tx2genes.human <- merge(tran.human, gene.human, by = "gene_id")
tx2genes <- as.data.frame(tx2genes.human[, 2:3])  # columns: tx_id, symbol
colnames(tx2genes)[1] <- "target_id"

# `path` and `candidate` are defined elsewhere in the session
loadname <- c("ERR2814755", "ERR2814756")
name <- candidate$sample
files <- file.path(path, loadname, "abundance.h5")
names(files) <- loadname

# Import only after `files` has been defined
txi.kallisto <- tximport(files, type = "kallisto", tx2gene = tx2genes, ignoreTxVersion = TRUE)

It used to work, but after 10.29, when I ran it again without any changes, it gave the following error:

Error: Unable to read dataset. Not all required filters available. Missing filters: deflate

My session info:
> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS/LAPACK: /pub/anaconda3/lib/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=en_US.UTF-8          LC_NUMERIC=C                  LC_TIME=en_US.UTF-8          
 [4] LC_COLLATE=en_US.UTF-8        LC_MONETARY=en_US.UTF-8       LC_MESSAGES=en_US.UTF-8      
 [7] LC_PAPER=en_US.UTF-8          LC_NAME=en_US.UTF-8           LC_ADDRESS=en_US.UTF-8       
[10] LC_TELEPHONE=en_US.UTF-8      LC_MEASUREMENT=en_US.UTF-8    LC_IDENTIFICATION=en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 ggord_1.1.4                            
 [3] EnsDb.Hsapiens.v75_2.99.0               ensembldb_2.6.2                        
 [5] AnnotationFilter_1.8.0                  clusterProfiler_3.12.0                 
 [7] sva_3.32.1                              genefilter_1.66.0                      
 [9] mgcv_1.8-30                             nlme_3.1-141                           
[11] ggpubr_0.2.3                            magrittr_1.5                           
[13] dplyr_0.8.3                             data.table_1.12.6                      
[15] xlsx_0.6.1                              gplots_3.0.1.1                         
[17] hexbin_1.27.3                           rhdf5_2.28.1                           
[19] readxl_1.3.1                            RColorBrewer_1.1-2                     
[21] pheatmap_1.0.12                         tximportData_1.12.0                    
[23] stringr_1.4.0                           stringi_1.4.3                          
[25] DESeq2_1.24.0                           SummarizedExperiment_1.14.1            
[27] DelayedArray_0.10.0                     BiocParallel_1.18.1                    
[29] matrixStats_0.55.0                      readr_1.3.1                            
[31] tximport_1.12.3                         edgeR_3.26.8                           
[33] limma_3.40.6                            GenomicFeatures_1.36.4                 
[35] AnnotationDbi_1.46.1                    Biobase_2.44.0                         
[37] GenomicRanges_1.36.1                    GenomeInfoDb_1.20.0                    
[39] IRanges_2.18.3                          ggplot2_3.2.1                          
[41] S4Vectors_0.22.1                        BiocGenerics_0.30.0                    

loaded via a namespace (and not attached):
  [1] backports_1.1.5          Hmisc_4.2-0              fastmatch_1.1-0         
  [4] plyr_1.8.4               igraph_1.2.4.1           lazyeval_0.2.2          
  [7] splines_3.6.1            urltools_1.7.3           digest_0.6.22           
 [10] htmltools_0.4.0          GOSemSim_2.10.0          viridis_0.5.1           
 [13] GO.db_3.8.2              gdata_2.18.0             checkmate_1.9.4         
 [16] memoise_1.1.0            cluster_2.1.0            Biostrings_2.52.0       
 [19] annotate_1.62.0          graphlayouts_0.5.0       enrichplot_1.4.0        
 [22] prettyunits_1.0.2        colorspace_1.4-1         blob_1.2.0              
 [25] ggrepel_0.8.1            xfun_0.10                jsonlite_1.6            
 [28] crayon_1.3.4             RCurl_1.95-4.12          zeallot_0.1.0           
 [31] survival_2.44-1.1        glue_1.3.1               polyclip_1.10-0         
 [34] gtable_0.3.0             zlibbioc_1.30.0          XVector_0.24.0          
 [37] UpSetR_1.4.0             Rhdf5lib_1.6.3           scales_1.0.0            
 [40] DOSE_3.10.2              DBI_1.0.0                Rcpp_1.0.2              
 [43] viridisLite_0.3.0        xtable_1.8-4             progress_1.2.2          
 [46] htmlTable_1.13.2         gridGraphics_0.4-1       europepmc_0.3           
 [49] foreign_0.8-72           bit_1.1-14               Formula_1.2-3           
 [52] htmlwidgets_1.5.1        httr_1.4.1               fgsea_1.10.1            
 [55] acepack_1.4.1            pkgconfig_2.0.3          XML_3.98-1.20           
 [58] rJava_0.9-11             farver_1.1.0             nnet_7.3-12             
 [61] locfit_1.5-9.1           ggplotify_0.0.4          tidyselect_0.2.5        
 [64] rlang_0.4.1              reshape2_1.4.3           munsell_0.5.0           
 [67] cellranger_1.1.0         tools_3.6.1              RSQLite_2.1.2           
 [70] ggridges_0.5.1           yaml_2.2.0               knitr_1.25              
 [73] bit64_0.9-7              tidygraph_1.1.2          caTools_1.17.1.2        
 [76] purrr_0.3.3              ggraph_2.0.0             xml2_1.2.2              
 [79] DO.db_2.9                biomaRt_2.40.5           compiler_3.6.1          
 [82] rstudioapi_0.10          curl_4.2                 ggsignif_0.6.0          
 [85] tibble_2.1.3             tweenr_1.0.1             geneplotter_1.62.0      
 [88] lattice_0.20-38          ProtGenerics_1.16.0      Matrix_1.2-17           
 [91] vctrs_0.2.0              pillar_1.4.2             lifecycle_0.1.0         
 [94] BiocManager_1.30.9       triebeard_0.3.0          cowplot_1.0.0           
 [97] bitops_1.0-6             rtracklayer_1.44.4       qvalue_2.16.0           
[100] R6_2.4.0                 latticeExtra_0.6-28      KernSmooth_2.23-16      
[103] gridExtra_2.3            MASS_7.3-51.4            gtools_3.8.1            
[106] assertthat_0.2.1         xlsxjars_0.6.1           withr_2.1.2             
[109] GenomicAlignments_1.20.1 Rsamtools_2.0.3          GenomeInfoDbData_1.2.1  
[112] hms_0.5.1                grid_3.6.1               rpart_4.1-15            
[115] tidyr_1.0.0              rvcheck_0.1.5            ggforce_0.3.1           
[118] base64enc_0.1-3
software error hdf5 tximport • 147 views
ADD COMMENTlink modified 6 weeks ago by Michael Love26k • written 6 weeks ago by me553710
Answer: failure importing data with tximport with error information
6 weeks ago by
Michael Love26k
United States
Michael Love26k wrote:

Can you check that all your packages are up to date? It looks like something may have been downgraded possibly?

BiocManager::valid()

ADD COMMENTlink written 6 weeks ago by Michael Love26k

I checked the packages as follows, but it seems that this would not affect reading the dataset?

BiocManager::valid()

* sessionInfo()

R version 3.6.1 (2019-07-05)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS

Matrix products: default
BLAS/LAPACK: /pub/anaconda3/lib/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=en_US.UTF-8          LC_NUMERIC=C                  LC_TIME=en_US.UTF-8
 [4] LC_COLLATE=en_US.UTF-8        LC_MONETARY=en_US.UTF-8       LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8          LC_NAME=en_US.UTF-8           LC_ADDRESS=en_US.UTF-8
[10] LC_TELEPHONE=en_US.UTF-8      LC_MEASUREMENT=en_US.UTF-8    LC_IDENTIFICATION=en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] EnsDb.Hsapiens.v75_2.99.0   sva_3.32.1                  mgcv_1.8-30
 [4] nlme_3.1-141                ggpubr_0.2.3                magrittr_1.5
 [7] dplyr_0.8.3                 data.table_1.12.6           xlsx_0.6.1
[10] gplots_3.0.1.1              hexbin_1.27.3               rhdf5_2.28.1
[13] readxl_1.3.1                RColorBrewer_1.1-2          pheatmap_1.0.12
[16] tximportData_1.12.0         stringr_1.4.0               stringi_1.4.3
[19] DESeq2_1.24.0               SummarizedExperiment_1.14.1 DelayedArray_0.10.0
[22] BiocParallel_1.18.1         matrixStats_0.55.0          readr_1.3.1
[25] tximport_1.12.3             edgeR_3.26.8                limma_3.40.6
[28] ggplot2_3.2.1               ensembldb_2.8.1             GenomicFeatures_1.36.4
[31] AnnotationDbi_1.46.1        GenomicRanges_1.36.1        GenomeInfoDb_1.20.0
[34] IRanges_2.18.3              S4Vectors_0.22.1            AnnotationFilter_1.8.0
[37] affyPLM_1.60.0              preprocessCore_1.46.0       simpleaffy_2.60.0
[40] gcrma_2.56.0                genefilter_1.66.0           affyQCReport_1.62.0
[43] lattice_0.20-38             affy_1.62.0                 Biobase_2.44.0
[46] BiocGenerics_0.30.0

Bioconductor version '3.9'

  * 2 packages out-of-date
  * 0 packages too new

create a valid installation with

  BiocManager::install(c("hms", "tinytex"), update = TRUE, ask = FALSE)

ADD REPLYlink written 5 weeks ago by me553710

Try reinstalling rhdf5. Something isn’t working with this library. The error isn’t from the tximport package but one of its dependencies, which must have been changed by some other action on the system.

ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by Michael Love26k
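
One way to test whether the installed rhdf5 can handle deflate (gzip) compression at all, independently of tximport, is a small round trip: write a compressed dataset to a temporary file and read it back. This is only a sketch; the function name `check_rhdf5_deflate` is made up for illustration, and whether a missing filter shows up as an error on write or only on read may differ between rhdf5 versions.

```r
# Round-trip check: write a deflate-compressed dataset and read it back.
# Returns TRUE on success, FALSE if any HDF5 call fails, NA if rhdf5 is absent.
# `check_rhdf5_deflate` is a hypothetical helper, not part of rhdf5 or tximport.
check_rhdf5_deflate <- function() {
  if (!requireNamespace("rhdf5", quietly = TRUE)) {
    message("rhdf5 is not installed; nothing to test")
    return(NA)
  }
  h5file <- tempfile(fileext = ".h5")
  on.exit(unlink(h5file), add = TRUE)
  x <- as.numeric(1:1000)
  tryCatch({
    rhdf5::h5createFile(h5file)
    # level > 0 requests gzip (deflate) compression for this chunked dataset
    rhdf5::h5createDataset(h5file, "x", dims = length(x),
                           storage.mode = "double", chunk = 100, level = 6)
    rhdf5::h5write(x, h5file, "x")
    isTRUE(all.equal(as.numeric(rhdf5::h5read(h5file, "x")), x))
  }, error = function(e) {
    message("HDF5 round trip failed: ", conditionMessage(e))
    FALSE
  })
}

ok <- check_rhdf5_deflate()
print(ok)
```

If this returns FALSE with the same "Missing filters: deflate" message, the problem lies in the rhdf5/Rhdf5lib build rather than in the kallisto files.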

I reinstalled rhdf5 2.3.0, but it still doesn't work. I then tried to read the dataset from the TSV files with tximport using the following code:

files <- file.path(path,loadname, "abundance.tsv")
names(files) <- loadname                  
txi.kallisto <- tximport(files, type = "kallisto", tx2gene = tx2genes, ignoreAfterBar = TRUE)

but it doesn't work either.

reading in files with read_tsv
Error: Unable to read dataset. Not all required filters available. Missing filters: deflate

but it is .tsv format…… T-T

ADD REPLYlink written 5 weeks ago by me553710

Oh, I didn’t realize this was TSV. I think it’s actually an error from readr. Could you try updating/reinstalling that package?

ADD REPLYlink written 5 weeks ago by Michael Love26k

Actually, I have tried both the "abundance.h5" and "abundance.tsv" files with tximport, and both gave the same error message.

ADD REPLYlink written 5 weeks ago by me553710

Ok I take it back, searching through my email I've found that this is linked to rhdf5. See the identical error message here:

https://stat.ethz.ch/pipermail/bioc-devel/2019-July/015326.html

It looks like this was addressed in the devel branch in July, so the fix is now available in the latest Bioc release (3.10). Can you update Bioc to 3.10 and see if the issue is fixed on your end?

ADD REPLYlink written 5 weeks ago by Michael Love26k
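
Before and after an upgrade it can be useful to record which Bioconductor release and which HDF5-related package versions R actually sees. The sketch below assumes nothing beyond base R plus an optional BiocManager; `report_versions` is a hypothetical helper name. The upgrade itself would normally be done with `BiocManager::install(version = "3.10")`, which is left as a comment here because it modifies the installation.

```r
# Report the Bioconductor release and the versions of the packages involved.
# Packages that are not installed are reported as NA.
# `report_versions` is a hypothetical helper written for this thread.
report_versions <- function(pkgs = c("rhdf5", "Rhdf5lib", "tximport")) {
  installed <- vapply(pkgs, function(p) {
    if (nzchar(system.file(package = p))) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  }, character(1))
  bioc <- NA_character_
  if (nzchar(system.file(package = "BiocManager"))) {
    # BiocManager::version() may need network access, so guard against failure
    bioc <- tryCatch(as.character(BiocManager::version()),
                     error = function(e) NA_character_)
  }
  list(bioconductor = bioc, packages = installed)
}

v <- report_versions()
print(v)

# To move to the 3.10 release (changes the installation, so not run here):
# BiocManager::install(version = "3.10")
```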

Thank you very much. I asked the administrator to help me update Bioc to 3.10, but he refused because not all of our R packages can be used with Bioc 3.10.

The administrator told me that the reason I suddenly couldn't use tximport is that the servers use NFS: the software is deployed on one server and shared by the other 8 servers, so some of them may produce errors.

This was really troubling me, so I switched to another server provider and found that it works there even with Bioc 3.9.

Thank you all the way!!! I do really appreciate that!!

ADD REPLYlink written 5 weeks ago by me553710

Hi Michael. I am getting this same error even with Bioconductor 3.10 and after reinstalling Rhdf5lib & rhdf5:

> files <- file.path(kallistodir, metadata$sample, "abundance.h5")
> names(files) <- metadata$fastqName
> txi <- tximport(files, type = "kallisto", tx2gene = tx2gene)
Error: Unable to read dataset.
Not all required filters available.
Missing filters: deflate

This is my session info:

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

attached base packages:
 [1] grid      parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Rhdf5lib_1.8.0              fgsea_1.12.0                Rcpp_1.0.3                  GO.db_3.10.0                rmarkdown_1.17              ggplotify_0.0.4             genefilter_1.68.0          
 [8] pheatmap_1.0.12             ggpubr_0.2.4                magrittr_1.5                RColorBrewer_1.1-2          limma_3.42.0                Glimma_1.14.0               EnhancedVolcano_1.4.0      
[15] ggrepel_0.8.1               systemPipeR_1.20.0          ShortRead_1.44.0            GenomicAlignments_1.22.1    clusterProfiler_3.14.0      org.Hs.eg.db_3.10.0         vsn_3.54.0                 
[22] DESeq2_1.26.0               SummarizedExperiment_1.16.0 DelayedArray_0.12.0         BiocParallel_1.20.0         matrixStats_0.55.0          pasilla_1.14.0              colorRamps_2.3             
[29] geneplotter_1.64.0          annotate_1.64.0             XML_3.98-1.20               AnnotationDbi_1.48.0        lattice_0.20-38             Biobase_2.46.0              mclust_5.4.5               
[36] gsubfn_0.7                  proto_1.0.0                 Rsamtools_2.2.1             Biostrings_2.54.0           XVector_0.26.0              data.table_1.12.6           refGenome_1.7.7            
[43] RSQLite_2.1.2               doBy_4.6-3                  rtracklayer_1.46.0          GenomicRanges_1.38.0        GenomeInfoDb_1.22.0         IRanges_2.20.1              S4Vectors_0.24.0           
[50] BiocGenerics_0.32.0         biomaRt_2.42.0              rhdf5_2.30.0                forcats_0.4.0               stringr_1.4.0               purrr_0.3.3                 readr_1.3.1                
[57] tidyr_1.0.0                 tibble_2.1.3                ggplot2_3.2.1               tidyverse_1.3.0             dplyr_0.8.3                 tximeta_1.4.2               tximport_1.14.0            

loaded via a namespace (and not attached):
  [1] rappdirs_0.3.1           AnnotationForge_1.28.0   acepack_1.4.1            bit64_0.9-7              knitr_1.26               rpart_4.1-15             hwriter_1.3.2            RCurl_1.95-4.12         
  [9] AnnotationFilter_1.10.0  generics_0.0.2           GenomicFeatures_1.38.0   preprocessCore_1.48.0    cowplot_1.0.0            europepmc_0.3            bit_1.1-14               enrichplot_1.6.0        
 [17] base64url_1.4            xml2_1.2.2               lubridate_1.7.4          assertthat_0.2.1         batchtools_0.9.11        viridis_0.5.1            xfun_0.11                hms_0.5.2               
 [25] evaluate_0.14            fansi_0.4.0              progress_1.2.2           dbplyr_1.4.2             readxl_1.3.1             Rgraphviz_2.30.0         igraph_1.2.4.1           DBI_1.0.0               
 [33] htmlwidgets_1.5.1        backports_1.1.5          vctrs_0.2.0              ensembldb_2.10.2         withr_2.1.2              ggforce_0.3.1            triebeard_0.3.0          BSgenome_1.54.0         
 [41] checkmate_1.9.4          prettyunits_1.0.2        cluster_2.1.0            DOSE_3.12.0              lazyeval_0.2.2           crayon_1.3.4             edgeR_3.28.0             pkgconfig_2.0.3         
 [49] tweenr_1.0.1             nlme_3.1-142             ProtGenerics_1.18.0      nnet_7.3-12              rlang_0.4.2              lifecycle_0.1.0          affyio_1.56.0            BiocFileCache_1.10.2    
 [57] GOstats_2.52.0           modelr_0.1.5             cellranger_1.1.0         tcltk_3.6.1              polyclip_1.10-0          graph_1.64.0             Matrix_1.2-17            urltools_1.7.3          
 [65] reprex_0.3.0             base64enc_0.1-3          ggridges_0.5.1           viridisLite_0.3.0        rjson_0.2.20             bitops_1.0-6             blob_1.2.0               qvalue_2.18.0           
 [73] brew_1.0-6               gridGraphics_0.4-1       ggsignif_0.6.0           scales_1.1.0             memoise_1.1.0            GSEABase_1.48.0          plyr_1.8.4               zlibbioc_1.32.0                     
[105] BiocManager_1.30.10                
ADD REPLYlink written 17 days ago by oliver.ziff0

Can you try to open the h5 file using rhdf5? I think the error is coming from that package. That will help you/me debug, because the error isn't directly from one of the functions in tximport.

Alternatively, if you can't seem to nail down why rhdf5 doesn't want to import this file, I think you can just specify dropInfReps=TRUE and it will work without loading the information in the h5 file.

ADD REPLYlink written 17 days ago by Michael Love26k
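
The two suggestions above could be tried roughly as follows. This is a sketch under assumptions: the dataset name "est_counts" is what kallisto HDF5 output is expected to contain, the abundance.tsv files are assumed to sit next to the abundance.h5 files, and the helper names `inspect_kallisto_h5`, `to_tsv_paths` and `import_without_h5` are invented for illustration. Note that the original poster reported the same error with TSV files, so the fallback is not guaranteed to help.

```r
# Try to open one kallisto HDF5 file directly with rhdf5.
# h5ls() only lists the contents; h5read() actually decompresses a dataset,
# so that is where a missing deflate filter would be expected to surface.
inspect_kallisto_h5 <- function(h5path) {
  if (!file.exists(h5path)) {
    message("File not found: ", h5path)
    return(invisible(NULL))
  }
  if (!requireNamespace("rhdf5", quietly = TRUE)) {
    message("rhdf5 is not installed")
    return(invisible(NULL))
  }
  tryCatch({
    print(rhdf5::h5ls(h5path))
    counts <- rhdf5::h5read(h5path, "est_counts")  # assumed dataset name
    length(counts)
  }, error = function(e) {
    message("rhdf5 could not read the file: ", conditionMessage(e))
    invisible(NULL)
  })
}

# Swap each abundance.h5 path for the abundance.tsv file in the same directory.
to_tsv_paths <- function(files) {
  out <- file.path(dirname(files), "abundance.tsv")
  names(out) <- names(files)
  out
}

# Fallback import: plain-text files, no inferential replicates.
import_without_h5 <- function(files, tx2gene) {
  stopifnot(requireNamespace("tximport", quietly = TRUE))
  tsv <- to_tsv_paths(files)
  stopifnot(all(file.exists(tsv)))
  tximport::tximport(tsv, type = "kallisto", tx2gene = tx2gene,
                     dropInfReps = TRUE)
}
```

With the objects from the post above, this would be called as `inspect_kallisto_h5(files[1])` and, if that fails, `txi <- import_without_h5(files, tx2gene)`.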

Thank you Michael! I managed to solve it by installing rhdf5 via conda directly:

conda install --channel https://conda.anaconda.org/bioconda bioconductor-rhdf5

rather than via BiocManager::install("rhdf5") which caused the issue.

ADD REPLYlink written 16 days ago by oliver.ziff0
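
Both session infos in this thread show a conda build of R (platform `x86_64-conda_cos6-linux-gnu`). A quick way to check whether a given session is in the same situation, and where rhdf5 is being loaded from, is sketched below. The helper names are hypothetical, and the idea that mixing a conda-built R with packages compiled by `BiocManager::install()` causes the error is an inference from the fix reported above, not something confirmed in the thread.

```r
# Does this R come from a conda build? Checks the platform string only.
# `is_conda_r` and `pkg_location` are hypothetical helpers for illustration.
is_conda_r <- function(platform = R.version$platform) {
  grepl("conda", platform, fixed = TRUE)
}

# Where is a package installed? Returns NA if it is not installed.
pkg_location <- function(pkg) {
  p <- system.file(package = pkg)
  if (nzchar(p)) p else NA_character_
}

cat("conda-built R:", is_conda_r(), "\n")
cat("rhdf5 loaded from:", pkg_location("rhdf5"), "\n")
cat("library paths:", paste(.libPaths(), collapse = ", "), "\n")
```

If R is a conda build and rhdf5 lives inside the conda environment's library path, installing it through conda (as in the command above) keeps the package and its compiled libraries consistent with that environment.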