failure importing data with tximport with error information
me5537 ▴ 10
@me5537-22246
Last seen 5.1 years ago

Hello, I am new to bioinformatics and Galaxy, so maybe my question is naive. I'm reading RNA-seq count data generated by kallisto using tximport:

library(EnsDb.Hsapiens.v75)
library(tximport)

# Build a transcript-to-gene table from the Ensembl v75 annotation
humandb <- EnsDb.Hsapiens.v75
tran.human <- transcripts(humandb, return.type = "DataFrame")
tran.human <- tran.human[, c("tx_id", "gene_id")]

gene.human <- genes(humandb, return.type = "DataFrame")
gene.human <- gene.human[, c("gene_id", "symbol")]

# After the merge the columns are gene_id, tx_id, symbol; keep tx_id and symbol
tx2genes.human <- merge(tran.human, gene.human, by = "gene_id")
tx2genes <- as.data.frame(tx2genes.human[, 2:3])
colnames(tx2genes)[1] <- "target_id"

# Locate the kallisto output for each sample before calling tximport
loadname <- c("ERR2814755", "ERR2814756")
name <- candidate$sample
files <- file.path(path, loadname, "abundance.h5")
names(files) <- loadname

txi.kallisto <- tximport(files, type = "kallisto", tx2gene = tx2genes, ignoreTxVersion = TRUE)

It used to work, but after 10.29, when I ran it again without any changes, it gave the following error:

Error: Unable to read dataset. Not all required filters available. Missing filters: deflate

My session info:
> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS/LAPACK: /pub/anaconda3/lib/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=en_US.UTF-8          LC_NUMERIC=C                  LC_TIME=en_US.UTF-8          
 [4] LC_COLLATE=en_US.UTF-8        LC_MONETARY=en_US.UTF-8       LC_MESSAGES=en_US.UTF-8      
 [7] LC_PAPER=en_US.UTF-8          LC_NAME=en_US.UTF-8           LC_ADDRESS=en_US.UTF-8       
[10] LC_TELEPHONE=en_US.UTF-8      LC_MEASUREMENT=en_US.UTF-8    LC_IDENTIFICATION=en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 ggord_1.1.4                            
 [3] EnsDb.Hsapiens.v75_2.99.0               ensembldb_2.6.2                        
 [5] AnnotationFilter_1.8.0                  clusterProfiler_3.12.0                 
 [7] sva_3.32.1                              genefilter_1.66.0                      
 [9] mgcv_1.8-30                             nlme_3.1-141                           
[11] ggpubr_0.2.3                            magrittr_1.5                           
[13] dplyr_0.8.3                             data.table_1.12.6                      
[15] xlsx_0.6.1                              gplots_3.0.1.1                         
[17] hexbin_1.27.3                           rhdf5_2.28.1                           
[19] readxl_1.3.1                            RColorBrewer_1.1-2                     
[21] pheatmap_1.0.12                         tximportData_1.12.0                    
[23] stringr_1.4.0                           stringi_1.4.3                          
[25] DESeq2_1.24.0                           SummarizedExperiment_1.14.1            
[27] DelayedArray_0.10.0                     BiocParallel_1.18.1                    
[29] matrixStats_0.55.0                      readr_1.3.1                            
[31] tximport_1.12.3                         edgeR_3.26.8                           
[33] limma_3.40.6                            GenomicFeatures_1.36.4                 
[35] AnnotationDbi_1.46.1                    Biobase_2.44.0                         
[37] GenomicRanges_1.36.1                    GenomeInfoDb_1.20.0                    
[39] IRanges_2.18.3                          ggplot2_3.2.1                          
[41] S4Vectors_0.22.1                        BiocGenerics_0.30.0                    

loaded via a namespace (and not attached):
  [1] backports_1.1.5          Hmisc_4.2-0              fastmatch_1.1-0         
  [4] plyr_1.8.4               igraph_1.2.4.1           lazyeval_0.2.2          
  [7] splines_3.6.1            urltools_1.7.3           digest_0.6.22           
 [10] htmltools_0.4.0          GOSemSim_2.10.0          viridis_0.5.1           
 [13] GO.db_3.8.2              gdata_2.18.0             checkmate_1.9.4         
 [16] memoise_1.1.0            cluster_2.1.0            Biostrings_2.52.0       
 [19] annotate_1.62.0          graphlayouts_0.5.0       enrichplot_1.4.0        
 [22] prettyunits_1.0.2        colorspace_1.4-1         blob_1.2.0              
 [25] ggrepel_0.8.1            xfun_0.10                jsonlite_1.6            
 [28] crayon_1.3.4             RCurl_1.95-4.12          zeallot_0.1.0           
 [31] survival_2.44-1.1        glue_1.3.1               polyclip_1.10-0         
 [34] gtable_0.3.0             zlibbioc_1.30.0          XVector_0.24.0          
 [37] UpSetR_1.4.0             Rhdf5lib_1.6.3           scales_1.0.0            
 [40] DOSE_3.10.2              DBI_1.0.0                Rcpp_1.0.2              
 [43] viridisLite_0.3.0        xtable_1.8-4             progress_1.2.2          
 [46] htmlTable_1.13.2         gridGraphics_0.4-1       europepmc_0.3           
 [49] foreign_0.8-72           bit_1.1-14               Formula_1.2-3           
 [52] htmlwidgets_1.5.1        httr_1.4.1               fgsea_1.10.1            
 [55] acepack_1.4.1            pkgconfig_2.0.3          XML_3.98-1.20           
 [58] rJava_0.9-11             farver_1.1.0             nnet_7.3-12             
 [61] locfit_1.5-9.1           ggplotify_0.0.4          tidyselect_0.2.5        
 [64] rlang_0.4.1              reshape2_1.4.3           munsell_0.5.0           
 [67] cellranger_1.1.0         tools_3.6.1              RSQLite_2.1.2           
 [70] ggridges_0.5.1           yaml_2.2.0               knitr_1.25              
 [73] bit64_0.9-7              tidygraph_1.1.2          caTools_1.17.1.2        
 [76] purrr_0.3.3              ggraph_2.0.0             xml2_1.2.2              
 [79] DO.db_2.9                biomaRt_2.40.5           compiler_3.6.1          
 [82] rstudioapi_0.10          curl_4.2                 ggsignif_0.6.0          
 [85] tibble_2.1.3             tweenr_1.0.1             geneplotter_1.62.0      
 [88] lattice_0.20-38          ProtGenerics_1.16.0      Matrix_1.2-17           
 [91] vctrs_0.2.0              pillar_1.4.2             lifecycle_0.1.0         
 [94] BiocManager_1.30.9       triebeard_0.3.0          cowplot_1.0.0           
 [97] bitops_1.0-6             rtracklayer_1.44.4       qvalue_2.16.0           
[100] R6_2.4.0                 latticeExtra_0.6-28      KernSmooth_2.23-16      
[103] gridExtra_2.3            MASS_7.3-51.4            gtools_3.8.1            
[106] assertthat_0.2.1         xlsxjars_0.6.1           withr_2.1.2             
[109] GenomicAlignments_1.20.1 Rsamtools_2.0.3          GenomeInfoDbData_1.2.1  
[112] hms_0.5.1                grid_3.6.1               rpart_4.1-15            
[115] tidyr_1.0.0              rvcheck_0.1.5            ggforce_0.3.1           
[118] base64enc_0.1-3
software error tximport hdf5 • 2.3k views
@mikelove
Last seen 3 hours ago
United States

Can you check that all your packages are up to date? It looks like something may have been downgraded possibly?

BiocManager::valid()


I checked the packages as follows, but it seems that this doesn't affect reading the dataset?

BiocManager::valid()

  • sessionInfo()

    R version 3.6.1 (2019-07-05)
    Platform: x86_64-conda_cos6-linux-gnu (64-bit)
    Running under: Ubuntu 16.04.6 LTS

    Matrix products: default
    BLAS/LAPACK: /pub/anaconda3/lib/R/lib/libRblas.so

    locale:
     [1] LC_CTYPE=en_US.UTF-8          LC_NUMERIC=C                  LC_TIME=en_US.UTF-8
     [4] LC_COLLATE=en_US.UTF-8        LC_MONETARY=en_US.UTF-8       LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8          LC_NAME=en_US.UTF-8           LC_ADDRESS=en_US.UTF-8
    [10] LC_TELEPHONE=en_US.UTF-8      LC_MEASUREMENT=en_US.UTF-8    LC_IDENTIFICATION=en_US.UTF-8

    attached base packages:
    [1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
     [1] EnsDb.Hsapiens.v75_2.99.0   sva_3.32.1                  mgcv_1.8-30
     [4] nlme_3.1-141                ggpubr_0.2.3                magrittr_1.5
     [7] dplyr_0.8.3                 data.table_1.12.6           xlsx_0.6.1
    [10] gplots_3.0.1.1              hexbin_1.27.3               rhdf5_2.28.1
    [13] readxl_1.3.1                RColorBrewer_1.1-2          pheatmap_1.0.12
    [16] tximportData_1.12.0         stringr_1.4.0               stringi_1.4.3
    [19] DESeq2_1.24.0               SummarizedExperiment_1.14.1 DelayedArray_0.10.0
    [22] BiocParallel_1.18.1         matrixStats_0.55.0          readr_1.3.1
    [25] tximport_1.12.3             edgeR_3.26.8                limma_3.40.6
    [28] ggplot2_3.2.1               ensembldb_2.8.1             GenomicFeatures_1.36.4
    [31] AnnotationDbi_1.46.1        GenomicRanges_1.36.1        GenomeInfoDb_1.20.0
    [34] IRanges_2.18.3              S4Vectors_0.22.1            AnnotationFilter_1.8.0
    [37] affyPLM_1.60.0              preprocessCore_1.46.0       simpleaffy_2.60.0
    [40] gcrma_2.56.0                genefilter_1.66.0           affyQCReport_1.62.0
    [43] lattice_0.20-38             affy_1.62.0                 Biobase_2.44.0
    [46] BiocGenerics_0.30.0

    Bioconductor version '3.9'

    • 2 packages out-of-date
    • 0 packages too new

    create a valid installation with

    BiocManager::install(c( "hms", "tinytex" ), update = TRUE, ask = FALSE)


Try reinstalling rhdf5. Something isn’t working with this library. The error isn’t from the tximport package but one of its dependencies, which must have been changed by some other action on the system.


I have tried reinstalling rhdf5 2.3.0, but it still doesn't work. I also tried to read the dataset with tximport using the following code:

files <- file.path(path,loadname, "abundance.tsv")
names(files) <- loadname                  
txi.kallisto <- tximport(files, type = "kallisto", tx2gene = tx2genes, ignoreAfterBar = TRUE)

but it doesn't work either:

reading in files with read_tsv
Error: Unable to read dataset. Not all required filters available. Missing filters: deflate

But the file is in .tsv format…… T-T


Oh, I didn’t realize this was TSV. I think it’s actually an error from readr. Could you try updating/reinstalling that package?


Actually, I have tried both the "abundance.h5" and "abundance.tsv" files with tximport, and both gave the same error.


Ok I take it back, searching through my email I've found that this is linked to rhdf5. See the identical error message here:

https://stat.ethz.ch/pipermail/bioc-devel/2019-July/015326.html

It looks like this was addressed in the devel branch in July, so the fix is now available in the latest Bioc release (3.10). Can you update Bioc to 3.10 and see if the issue is fixed on your end?
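Before and after any update, it may help to confirm which versions R actually picks up. The following is a minimal sketch in base R; `installed_versions` is a hypothetical helper name, not part of any package. For reference, the first session info in this thread shows rhdf5 2.28.1 alongside Bioconductor 3.9 packages, while the later one shows rhdf5 2.30.0 alongside tximport 1.14.0.

```r
# Hypothetical helper: report the installed version of each package,
# or NA if the package cannot be found in the current library paths.
installed_versions <- function(pkgs) {
  vapply(pkgs, function(p) {
    tryCatch(as.character(utils::packageVersion(p)),
             error = function(e) NA_character_)
  }, character(1))
}

# Packages involved in reading kallisto HDF5 output in this thread
print(installed_versions(c("rhdf5", "Rhdf5lib", "tximport")))

# The library paths searched, in order; on a shared or conda-based
# install, more than one copy of a package may be visible here.
print(.libPaths())
```

If the versions differ between servers or between R sessions, that would suggest the packages are being loaded from different library locations.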


Thank you very much…… I asked the administrator to help me update Bioc to 3.10, but he refused because not all of the R packages can be used with Bioc 3.10.

The administrator told me that the reason I suddenly couldn't use tximport is that the servers use NFS: the software is deployed on one server and shared by the other 8 servers, so some of them may produce errors.

This really troubled me, so I switched to another server provider and found that it works even with Bioc 3.9……

Thank you for all your help!!! I really appreciate it!!


Hi Michael. I am getting this same error even with BiocManager 3.10 and after reinstalling Rhdf5lib & rhdf5

> files <- file.path(kallistodir, metadata$sample, "abundance.h5")
> names(files) <- metadata$fastqName
> txi <- tximport(files, type = "kallisto", tx2gene = tx2gene)
Error: Unable to read dataset.
Not all required filters available.
Missing filters: deflate

This is my session info:

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

attached base packages:
 [1] grid      parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Rhdf5lib_1.8.0              fgsea_1.12.0                Rcpp_1.0.3                  GO.db_3.10.0                rmarkdown_1.17              ggplotify_0.0.4             genefilter_1.68.0          
 [8] pheatmap_1.0.12             ggpubr_0.2.4                magrittr_1.5                RColorBrewer_1.1-2          limma_3.42.0                Glimma_1.14.0               EnhancedVolcano_1.4.0      
[15] ggrepel_0.8.1               systemPipeR_1.20.0          ShortRead_1.44.0            GenomicAlignments_1.22.1    clusterProfiler_3.14.0      org.Hs.eg.db_3.10.0         vsn_3.54.0                 
[22] DESeq2_1.26.0               SummarizedExperiment_1.16.0 DelayedArray_0.12.0         BiocParallel_1.20.0         matrixStats_0.55.0          pasilla_1.14.0              colorRamps_2.3             
[29] geneplotter_1.64.0          annotate_1.64.0             XML_3.98-1.20               AnnotationDbi_1.48.0        lattice_0.20-38             Biobase_2.46.0              mclust_5.4.5               
[36] gsubfn_0.7                  proto_1.0.0                 Rsamtools_2.2.1             Biostrings_2.54.0           XVector_0.26.0              data.table_1.12.6           refGenome_1.7.7            
[43] RSQLite_2.1.2               doBy_4.6-3                  rtracklayer_1.46.0          GenomicRanges_1.38.0        GenomeInfoDb_1.22.0         IRanges_2.20.1              S4Vectors_0.24.0           
[50] BiocGenerics_0.32.0         biomaRt_2.42.0              rhdf5_2.30.0                forcats_0.4.0               stringr_1.4.0               purrr_0.3.3                 readr_1.3.1                
[57] tidyr_1.0.0                 tibble_2.1.3                ggplot2_3.2.1               tidyverse_1.3.0             dplyr_0.8.3                 tximeta_1.4.2               tximport_1.14.0            

loaded via a namespace (and not attached):
  [1] rappdirs_0.3.1           AnnotationForge_1.28.0   acepack_1.4.1            bit64_0.9-7              knitr_1.26               rpart_4.1-15             hwriter_1.3.2            RCurl_1.95-4.12         
  [9] AnnotationFilter_1.10.0  generics_0.0.2           GenomicFeatures_1.38.0   preprocessCore_1.48.0    cowplot_1.0.0            europepmc_0.3            bit_1.1-14               enrichplot_1.6.0        
 [17] base64url_1.4            xml2_1.2.2               lubridate_1.7.4          assertthat_0.2.1         batchtools_0.9.11        viridis_0.5.1            xfun_0.11                hms_0.5.2               
 [25] evaluate_0.14            fansi_0.4.0              progress_1.2.2           dbplyr_1.4.2             readxl_1.3.1             Rgraphviz_2.30.0         igraph_1.2.4.1           DBI_1.0.0               
 [33] htmlwidgets_1.5.1        backports_1.1.5          vctrs_0.2.0              ensembldb_2.10.2         withr_2.1.2              ggforce_0.3.1            triebeard_0.3.0          BSgenome_1.54.0         
 [41] checkmate_1.9.4          prettyunits_1.0.2        cluster_2.1.0            DOSE_3.12.0              lazyeval_0.2.2           crayon_1.3.4             edgeR_3.28.0             pkgconfig_2.0.3         
 [49] tweenr_1.0.1             nlme_3.1-142             ProtGenerics_1.18.0      nnet_7.3-12              rlang_0.4.2              lifecycle_0.1.0          affyio_1.56.0            BiocFileCache_1.10.2    
 [57] GOstats_2.52.0           modelr_0.1.5             cellranger_1.1.0         tcltk_3.6.1              polyclip_1.10-0          graph_1.64.0             Matrix_1.2-17            urltools_1.7.3          
 [65] reprex_0.3.0             base64enc_0.1-3          ggridges_0.5.1           viridisLite_0.3.0        rjson_0.2.20             bitops_1.0-6             blob_1.2.0               qvalue_2.18.0           
 [73] brew_1.0-6               gridGraphics_0.4-1       ggsignif_0.6.0           scales_1.1.0             memoise_1.1.0            GSEABase_1.48.0          plyr_1.8.4               zlibbioc_1.32.0                     
[105] BiocManager_1.30.10                

Can you try to open the h5 file using rhdf5? I think the error is coming from that package. That will help you/me debug, because the error isn't directly from one of the functions in tximport.
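One way to run that check is sketched below. It assumes the kallisto file contains a dataset named "est_counts" (an assumption about the kallisto HDF5 layout), and `check_h5` is a hypothetical helper name. `rhdf5::h5read()` is the only rhdf5 call used.

```r
# Hypothetical helper: try to read one dataset from an HDF5 file with
# rhdf5 alone, and return a one-line status message instead of failing.
check_h5 <- function(h5_file, dataset = "est_counts") {
  if (!requireNamespace("rhdf5", quietly = TRUE)) {
    return("rhdf5 is not installed")
  }
  if (!file.exists(h5_file)) {
    return(paste("file not found:", h5_file))
  }
  tryCatch({
    x <- rhdf5::h5read(h5_file, dataset)
    paste("read", length(x), "values from", dataset)
  }, error = function(e) {
    # A "Missing filters: deflate" message here would point at the
    # rhdf5/Rhdf5lib installation rather than at tximport.
    paste("rhdf5 error:", conditionMessage(e))
  })
}

# Example call; replace the path with one of your own abundance.h5 files
print(check_h5(file.path("kallisto_out", "ERR2814755", "abundance.h5")))
```

If the same "Missing filters: deflate" message comes back from this call, tximport is not involved at all.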

Alternatively, if you can't seem to nail down why rhdf5 doesn't want to import this file, I think you can just specify dropInfReps=TRUE and it will work without loading the information in the h5 file.
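As a rough sketch of that workaround, reusing the sample names from the original question: `path` is a placeholder directory here, and `tx2genes` is assumed to have been built as in the question. `dropInfReps` is an argument of `tximport()`. The call is guarded so the snippet does nothing if the files or the mapping table are not available.

```r
# Placeholder location of the kallisto output directories (assumption)
path <- "kallisto_out"
loadname <- c("ERR2814755", "ERR2814756")

# Point at the plain-text output rather than abundance.h5
files <- file.path(path, loadname, "abundance.tsv")
names(files) <- loadname

# Only attempt the import when tximport, the files, and the
# transcript-to-gene table from the question are all present.
if (requireNamespace("tximport", quietly = TRUE) &&
    all(file.exists(files)) && exists("tx2genes")) {
  txi.kallisto <- tximport::tximport(files, type = "kallisto",
                                     tx2gene = tx2genes,
                                     ignoreTxVersion = TRUE,
                                     dropInfReps = TRUE)
}
```

The trade-off is that the inferential replicates (bootstraps) are not imported, which only matters for downstream methods that use them.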


Thank you Michael! I managed to solve it by installing rhdf5 via conda directly:

conda install --channel https://conda.anaconda.org/bioconda bioconductor-rhdf5

rather than via BiocManager::install("rhdf5") which caused the issue.
