Hi,
Lately I have been running into the following issue with biomaRt. Every time I use the useEnsembl() function, I get the error message shown below.
> ensembl <- biomaRt::useEnsembl("ensembl", dataset="mmusculus_gene_ensembl", host = "http://dec2017.archive.ensembl.org")
Error: database disk image is malformed
In addition: Warning messages:
1: Couldn't set cache size: database disk image is malformed
Use `cache_size` = NULL to turn off this warning.
2: Couldn't set synchronous mode: database disk image is malformed
Use `synchronous` = NULL to turn off this warning.
sessionInfo( )
R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /g/easybuild/x86_64/CentOS/7/haswell/software/FlexiBLAS/3.0.4-GCC-10.3.0/lib64/libflexiblas.so.3.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] DESeq2_1.33.5 SummarizedExperiment_1.23.4 Biobase_2.53.0 MatrixGenerics_1.5.4
[5] matrixStats_0.61.0 forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7
[9] purrr_0.3.4 readr_2.0.2 tidyr_1.1.4 tibble_3.1.5
[13] ggplot2_3.3.5 tidyverse_1.3.1 rGREAT_1.25.2 GenomicRanges_1.45.0
[17] GenomeInfoDb_1.29.8 IRanges_2.27.2 S4Vectors_0.31.5 BiocGenerics_0.39.2
[21] ChIPseeker_1.29.1
loaded via a namespace (and not attached):
[1] utf8_1.2.2 tidyselect_1.1.1
[3] htmlwidgets_1.5.4 RSQLite_2.2.8
[5] AnnotationDbi_1.55.1 grid_4.1.0
[7] BiocParallel_1.27.12 devtools_2.4.2
[9] scatterpie_0.1.7 munsell_0.5.0
[11] preprocessCore_1.55.2 codetools_0.2-18
[13] withr_2.4.2 colorspace_2.0-2
[15] GOSemSim_2.19.1 filelock_1.0.2
[17] knitr_1.36 rstudioapi_0.13
[19] DOSE_3.19.3 GenomeInfoDbData_1.2.7
[21] polyclip_1.10-0 bit64_4.0.5
[23] farver_2.1.0 rprojroot_2.0.2
[25] vctrs_0.3.8 treeio_1.17.2
[27] generics_0.1.0 xfun_0.26
[29] BiocFileCache_2.1.1 fastcluster_1.2.3
[31] R6_2.5.1 doParallel_1.0.16
[33] clue_0.3-59 graphlayouts_0.7.1
[35] locfit_1.5-9.4 bitops_1.0-7
[37] cachem_1.0.6 fgsea_1.19.4
[39] gridGraphics_0.5-1 DelayedArray_0.19.4
[41] csdR_0.99.8 assertthat_0.2.1
[43] BiocIO_1.3.0 scales_1.1.1
[45] nnet_7.3-16 ggraph_2.0.5
[47] enrichplot_1.13.1 gtable_0.3.0
[49] processx_3.5.2 WGCNA_1.70-3
[51] tidygraph_1.2.0 rlang_0.4.11
[53] genefilter_1.75.1 GlobalOptions_0.1.2
[55] splines_4.1.0 rtracklayer_1.53.1
[57] lazyeval_0.2.2 impute_1.67.0
[59] checkmate_2.0.0 broom_0.7.9
[61] modelr_0.1.8 BiocManager_1.30.16
[63] yaml_2.2.1 reshape2_1.4.4
[65] backports_1.2.1 GenomicFeatures_1.45.2
[67] Hmisc_4.5-0 qvalue_2.25.0
[69] usethis_2.0.1 tools_4.1.0
[71] ggplotify_0.1.0 ellipsis_0.3.2
[73] gplots_3.1.1 RColorBrewer_1.1-2
[75] dynamicTreeCut_1.63-1 sessioninfo_1.1.1
[77] Rcpp_1.0.7 plyr_1.8.6
[79] base64enc_0.1-3 progress_1.2.2
[81] zlibbioc_1.39.0 RCurl_1.98-1.5
[83] ps_1.6.0 prettyunits_1.1.1
[85] rpart_4.1-15 GetoptLong_1.0.5
[87] viridis_0.6.1 haven_2.4.3
[89] ggrepel_0.9.1 cluster_2.1.2
[91] fs_1.5.0 magrittr_2.0.1
[93] data.table_1.14.2 DO.db_2.9
[95] circlize_0.4.13 reprex_2.0.1
[97] pkgload_1.2.2 hms_1.1.1
[99] patchwork_1.1.1 evaluate_0.14
[101] xtable_1.8-4 RhpcBLASctl_0.21-247
[103] XML_3.99-0.8 jpeg_0.1-9
[105] readxl_1.3.1 gridExtra_2.3
[107] shape_1.4.6 testthat_3.1.0
[109] compiler_4.1.0 biomaRt_2.49.4
[111] KernSmooth_2.23-20 crayon_1.4.1
[113] shadowtext_0.0.9 htmltools_0.5.2
[115] tzdb_0.1.2 ggfun_0.0.4
[117] Formula_1.2-4 geneplotter_1.71.0
[119] aplot_0.1.1 lubridate_1.7.10
[121] DBI_1.1.1 tweenr_1.0.2
[123] dbplyr_2.1.1 ComplexHeatmap_2.9.4
[125] MASS_7.3-54 rappdirs_0.3.3
[127] boot_1.3-28 Matrix_1.3-4
[129] cli_3.0.1 parallel_4.1.0
[131] igraph_1.2.6 pkgconfig_2.0.3
[133] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicAlignments_1.29.0
[135] foreign_0.8-81 xml2_1.3.2
[137] foreach_1.5.1 ggtree_3.1.5
[139] annotate_1.71.0 XVector_0.33.0
[141] rvest_1.0.1 yulab.utils_0.0.2
[143] callr_3.7.0 digest_0.6.28
[145] Biostrings_2.61.2 cellranger_1.1.0
[147] rmarkdown_2.11 fastmatch_1.1-3
[149] htmlTable_2.2.1 tidytree_0.3.5
[151] restfulr_0.0.13 curl_4.3.2
[153] Rsamtools_2.9.1 gtools_3.9.2
[155] rjson_0.2.20 lifecycle_1.0.1
[157] nlme_3.1-153 jsonlite_1.7.2
[159] desc_1.4.0 viridisLite_0.4.0
[161] fansi_0.5.0 pillar_1.6.3
[163] lattice_0.20-45 KEGGREST_1.33.0
[165] fastmap_1.1.0 httr_1.4.2
[167] plotrix_3.8-2 pkgbuild_1.2.0
[169] survival_3.2-13 GO.db_3.14.0
[171] remotes_2.4.1 glue_1.4.2
[173] png_0.1-7 iterators_1.0.13
[175] bit_4.0.4 ggforce_0.3.3
[177] stringi_1.7.5 blob_1.2.2
[179] latticeExtra_0.6-29 caTools_1.18.2
[181] memoise_2.0.0 ape_5.5
Is there anything I could do to correct this? I'm currently unable to generate a Mart object using useEnsembl(). Curiously though, useMart() works as it should.
Thank you very much in advance!
Best, Víctor
Thanks James. I spoke to victorcampos1995 outside of this forum, and even those functions were throwing the same error. We established it was likely because the drive with the default cache location had run out of space, which broke any query to BiocFileCache.
The fix was to set the location of the biomaRt cache by setting the environment variable BIOMART_CACHE to a location with more space. Setting this in ~/.Rprofile ensured it was set for future R sessions too.

Ha! That was actually what I thought it was to begin with. But I know you are caching things these days, and it said 'cache', so... Anyway, glad to hear you got it sorted.
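For anyone landing here later, the BIOMART_CACHE fix described above might look roughly like the sketch below. The directory is hypothetical (a temporary folder is used only so the snippet runs anywhere); substitute any path on a drive with enough free space. It uses base R only and assumes the variable is set before any biomaRt function is called in the session.

```r
# Hypothetical cache location. tempdir() is used only to keep this sketch
# runnable; in practice pick a persistent directory with free space.
cache_dir <- file.path(tempdir(), "biomart_cache")
dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)

# Point biomaRt at the new cache location for this session.
# Set this before calling useEnsembl() or other biomaRt functions.
Sys.setenv(BIOMART_CACHE = cache_dir)

# Confirm the variable is set as intended.
Sys.getenv("BIOMART_CACHE")
```

Putting the same Sys.setenv() line (with a persistent path) in ~/.Rprofile should make the setting apply to future R sessions, as mentioned above.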
UNBORKING the biomaRt cache
biomaRt::biomartCacheClear()
also fixes the trouble with PANTHER.db installation, which also has something to do with the "database disk image" being "malformed".
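A slightly more defensive way to run the cache clear is sketched below. It assumes biomaRt is installed and wraps the call in tryCatch(), since (per the earlier reply in this thread) the call can fail with the same error when the drive holding the cache is full. The variable name cache_cleared is just for illustration.

```r
# Sketch only: attempts to clear the biomaRt cache and reports whether it worked.
cache_cleared <- FALSE

if (requireNamespace("biomaRt", quietly = TRUE)) {
  cache_cleared <- tryCatch({
    biomaRt::biomartCacheClear()  # removes the cached biomaRt files
    TRUE
  }, error = function(e) {
    # If this still fails, the cache location may be unwritable or out of space.
    message("Could not clear the biomaRt cache: ", conditionMessage(e))
    FALSE
  })
}

cache_cleared
```

If this reports FALSE, pointing BIOMART_CACHE at a location with free space, as described earlier in the thread, is the other thing to try.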