EnsDb: bad_weak_ptr
@lluis-revilla-sancho

I'm trying to use AnnotationHub and cache the result so that the data is not downloaded again every time I run the code. My code is:

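library(AnnotationHub)
library(here)

# Work from the local AnnotationHub cache only
ah <- AnnotationHub(localHub = TRUE)
ensembl_version <- 111
code <- paste0("Ensembl ", ensembl_version, " EnsDb for Homo sapiens")
if (ah["AH116291"]$title == code && !file.exists(here("data", "aqh.rds"))) {
  # First run: fetch the EnsDb and serialise the R object
  aqh <- ah[["AH116291"]]
  saveRDS(aqh, here("data", "aqh.rds"))
} else if (file.exists(here("data", "aqh.rds"))) {
  # Later runs: read the serialised object back in
  aqh <- readRDS(here("data", "aqh.rds"))
} else {
  # The record no longer matches the expected title: look up the newest one
  q <- query(ah, c("EnsDb", "Homo sapiens"))
  ensembl111 <- names(q)[length(q)]
  stop("Updated ensembl reference")
}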
aqh
## Error: bad_weak_ptr

But once the data is saved and I load it again, I get an error from almost every EnsDb-specific method. I am not sure what the problem is, since the file is saved and loaded without complaint (is(aqh) works). Some search results point to C++ pointers or something similar, but I don't see how that would produce this error in R code.

I would appreciate any help understanding what is going on and how to improve my code (by using BiocFileCache?). Thanks.

AnnotationHub BiocFileCache

@lshep

I'm curious why you are calling saveRDS and readRDS. AnnotationHub already caches the downloaded files in the background, so this isn't necessary. Your code also isn't using BiocFileCache at the moment; you are just saving the object locally and trying to read it back in. It would help if you showed the error from one of the methods you mentioned, and provided your sessionInfo() so we know which versions you are running.

> aqh <- ah[["AH116291"]]
> aqh
EnsDb for Ensembl:
|Backend: SQLite
|Db type: EnsDb
|Type of Gene ID: Ensembl Gene ID
|Supporting package: ensembldb
|Db created by: ensembldb package from Bioconductor
|script_version: 0.3.10
|Creation time: Tue Jan 16 10:37:47 2024
|ensembl_version: 111
|ensembl_host: localhost
|Organism: Homo sapiens
|taxonomy_id: 9606
|genome_build: GRCh38
|DBSCHEMAVERSION: 2.2
|common_name: human
|species: homo_sapiens
| No. of genes: 72035.
| No. of transcripts: 278721.
|Protein data available.
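
To see where AnnotationHub keeps that background cache, something like the following sketch may help. The function name hub_cache_info is hypothetical, and the sketch assumes that fileName() (the BiocGenerics generic) has a method for hub objects that reports the cached path of a record, returning NA if it has not been downloaded yet.

```r
# Hypothetical helper: report where AnnotationHub stores a given record.
hub_cache_info <- function(ah_id) {
  # localHub = TRUE asks AnnotationHub to work from the local cache only
  ah <- AnnotationHub::AnnotationHub(localHub = TRUE)
  list(
    # Directory holding all locally cached hub resources
    cache_dir = AnnotationHub::hubCache(ah),
    # Assumed: path of the cached file behind this single record
    cached_file = BiocGenerics::fileName(ah[ah_id])
  )
}
```

Calling hub_cache_info("AH116291") after the resource has been fetched once should point at the SQLite file that ah[["AH116291"]] opens, which is why a second copy made with saveRDS is not needed.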

I am using saveRDS and readRDS because, when this ran inside a Quarto report, it sometimes failed to connect to the remote server (that was before I started using the localHub = TRUE argument). I know I am not using BiocFileCache.

The error message is the one shown above: after loading the object with readRDS and printing it with aqh, I see "Error: bad_weak_ptr" rather than the summary you posted.

Here is the sessionInfo() for the project. I'm using renv, so if you wish I can post the lock file to reproduce the R environment:

sessionInfo()
## R version 4.4.0 (2024-04-24)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=es_ES.UTF-8        LC_COLLATE=en_GB.UTF-8    
##  [5] LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=en_GB.UTF-8   
##  [7] LC_PAPER=es_ES.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Europe/Madrid
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices datasets  utils    
## [7] methods   base     
## 
## other attached packages:
##  [1] readxl_1.4.3                rutils_0.0.1.9004          
##  [3] PCAtools_2.16.0             ggrepel_0.9.5              
##  [5] ensembldb_2.28.0            AnnotationFilter_1.28.0    
##  [7] GenomicFeatures_1.56.0      forcats_1.0.0              
##  [9] dplyr_1.1.4                 here_1.0.1                 
## [11] ggplot2_3.5.1               DelayedMatrixStats_1.26.0  
## [13] DelayedArray_0.30.1         SparseArray_1.4.8          
## [15] S4Arrays_1.4.1              abind_1.4-5                
## [17] Matrix_1.7-0                robustbase_0.99-2          
## [19] GO.db_3.19.1                AnnotationDbi_1.66.0       
## [21] BiocSingular_1.20.0         patchwork_1.2.0            
## [23] scDblFinder_1.18.0          scds_1.20.0                
## [25] BiocParallel_1.38.0         scuttle_1.14.0             
## [27] biomaRt_2.60.0              AnnotationHub_3.12.0       
## [29] BiocFileCache_2.12.0        dbplyr_2.5.0               
## [31] DropletUtils_1.24.0         SingleCellExperiment_1.26.0
## [33] SummarizedExperiment_1.34.0 Biobase_2.64.0             
## [35] GenomicRanges_1.56.0        GenomeInfoDb_1.40.1        
## [37] IRanges_2.38.0              S4Vectors_0.42.0           
## [39] BiocGenerics_0.50.0         MatrixGenerics_1.16.0      
## [41] matrixStats_1.3.0           targets_1.7.0              
## 
## loaded via a namespace (and not attached):
##   [1] BiocIO_1.14.0            bitops_1.0-7            
##   [3] filelock_1.0.3           tibble_3.2.1            
##   [5] R.oo_1.26.0              cellranger_1.1.0        
##   [7] pROC_1.18.5              XML_3.99-0.16.1         
##   [9] lifecycle_1.0.4          httr2_1.0.1             
##  [11] edgeR_4.2.0              rprojroot_2.0.4         
##  [13] processx_3.8.4           lattice_0.22-6          
##  [15] MASS_7.3-60.2            backports_1.5.0         
##  [17] magrittr_2.0.3           limma_3.60.2            
##  [19] yaml_2.3.8               metapod_1.12.0          
##  [21] cowplot_1.1.3            DBI_1.2.3               
##  [23] zlibbioc_1.50.0          purrr_1.0.2             
##  [25] R.utils_2.12.3           RCurl_1.98-1.14         
##  [27] rappdirs_0.3.3           GenomeInfoDbData_1.2.12 
##  [29] irlba_2.3.5.1            dqrng_0.4.1             
##  [31] codetools_0.2-20         xml2_1.3.6              
##  [33] tidyselect_1.2.1         farver_2.1.2            
##  [35] UCSC.utils_1.0.0         ScaledMatrix_1.12.0     
##  [37] viridis_0.6.5            GenomicAlignments_1.40.0
##  [39] jsonlite_1.8.8           BiocNeighbors_1.22.0    
##  [41] scater_1.32.0            tools_4.4.0             
##  [43] progress_1.2.3           Rcpp_1.0.12             
##  [45] glue_1.7.0               gridExtra_2.3           
##  [47] xfun_0.44                HDF5Array_1.32.0        
##  [49] withr_3.0.0              BiocManager_1.30.23     
##  [51] fastmap_1.2.0            rhdf5filters_1.16.0     
##  [53] bluster_1.14.0           fansi_1.0.6             
##  [55] callr_3.7.6              digest_0.6.35           
##  [57] rsvd_1.0.5               secretbase_0.5.0        
##  [59] R6_2.5.1                 colorspace_2.1-0        
##  [61] scattermore_1.2          RSQLite_2.3.7           
##  [63] R.methodsS3_1.8.2        utf8_1.2.4              
##  [65] generics_0.1.3           renv_1.0.7.9000         
##  [67] data.table_1.15.4        rtracklayer_1.64.0      
##  [69] prettyunits_1.2.0        httr_1.4.7              
##  [71] pkgconfig_2.0.3          gtable_0.3.5            
##  [73] blob_1.2.4               XVector_0.44.0          
##  [75] base64url_1.4            ProtGenerics_1.36.0     
##  [77] scales_1.3.0             png_0.1-8               
##  [79] scran_1.32.0             knitr_1.47              
##  [81] rstudioapi_0.16.0        reshape2_1.4.4          
##  [83] rjson_0.2.21             curl_5.2.1              
##  [85] cachem_1.1.0             rhdf5_2.48.0            
##  [87] stringr_1.5.1            BiocVersion_3.19.1      
##  [89] parallel_4.4.0           vipor_0.4.7             
##  [91] restfulr_0.0.15          pillar_1.9.0            
##  [93] grid_4.4.0               vctrs_0.6.5             
##  [95] beachmat_2.20.0          cluster_2.1.6           
##  [97] beeswarm_0.4.0           cli_3.6.2               
##  [99] locfit_1.5-9.9           compiler_4.4.0          
##  [ reached getOption("max.print") -- omitted 24 entries ]

Many thanks for your assistance, Lori.


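You cannot save and reuse an EnsDb object like that, because it is mostly just a pointer to a SQLite database plus functions to query it. The pointer is only valid in the R session that created it, so it does not survive a saveRDS/readRDS round trip.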

> library(AnnotationHub)
> hub <- AnnotationHub()
> z <- hub[["AH116291"]]
> class(z)
[1] "EnsDb"
attr(,"package")
[1] "ensembldb"
> z
EnsDb for Ensembl:
|Backend: SQLite
|Db type: EnsDb
|Type of Gene ID: Ensembl Gene ID
|Supporting package: ensembldb
|Db created by: ensembldb package from Bioconductor
|script_version: 0.3.10
|Creation time: Tue Jan 16 10:37:47 2024
|ensembl_version: 111
|ensembl_host: localhost
|Organism: Homo sapiens
|taxonomy_id: 9606
|genome_build: GRCh38
|DBSCHEMAVERSION: 2.2
|common_name: human
|species: homo_sapiens
| No. of genes: 72035.
| No. of transcripts: 278721.
|Protein data available.
> dbconn(z)
<SQLiteConnection>
  Path: C:\Users\jmacdon\AppData\Local\R\cache\R\AnnotationHub\599041ba51cf_123037
  Extensions: TRUE
## try saving
> saveRDS(z, "tmp.Rds")
> zz <- readRDS("tmp.Rds")
> zz
Error: bad_weak_ptr
> class(zz)
[1] "EnsDb"
attr(,"package")
[1] "ensembldb"
> dbconn(zz)
<SQLiteConnection>
  DISCONNECTED
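
The same behaviour can be reproduced without any EnsDb at all. The sketch below assumes only that DBI and RSQLite are installed; it round-trips a plain in-memory SQLite connection through saveRDS/readRDS and checks its validity before and after. The variable names are illustrative.

```r
# Open a throwaway in-memory SQLite database
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
valid_before <- DBI::dbIsValid(con)

# Serialise the connection object and read it back
tmp <- tempfile(fileext = ".rds")
saveRDS(con, tmp)
con2 <- readRDS(tmp)

# The restored object keeps its class and slots, but the external
# pointer to the open database handle is not restored
valid_after <- DBI::dbIsValid(con2)

print(c(before = valid_before, after = valid_after))

# Clean up the original connection and the temporary file
DBI::dbDisconnect(con)
unlink(tmp)
```

The restored connection is expected to report as invalid, matching the DISCONNECTED state shown for dbconn(zz) above.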

I sometimes do what you are trying to do, to ensure that a downloaded AnnotationHub resource remains static (for clients who won't understand if any annotations change), by stepping outside the usual workflow and doing something like this:

> file.copy(dbconn(z)@dbname, "./ensdb.sqlite")
## and then later, in an Rmd file
> library(ensembldb)
> zzz <- EnsDb("ensdb.sqlite")
> zzz
EnsDb for Ensembl:
|Backend: SQLite
|Db type: EnsDb
|Type of Gene ID: Ensembl Gene ID
|Supporting package: ensembldb
|Db created by: ensembldb package from Bioconductor
|script_version: 0.3.10
|Creation time: Tue Jan 16 10:37:47 2024
|ensembl_version: 111
|ensembl_host: localhost
|Organism: Homo sapiens
|taxonomy_id: 9606
|genome_build: GRCh38
|DBSCHEMAVERSION: 2.2
|common_name: human
|species: homo_sapiens
| No. of genes: 72035.
| No. of transcripts: 278721.
|Protein data available.
> dbconn(zzz)
<SQLiteConnection>
  Path: C:\Users\jmacdon\Desktop\ensdb.sqlite
  Extensions: TRUE

But ideally you would just use the cached SQLite file that AnnotationHub should find for you.
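
The copy-then-reopen approach could be wrapped up roughly as follows. This is only a sketch: load_frozen_ensdb and its arguments are hypothetical names, and it assumes the SQLite path can be read from the connection's dbname slot as in the transcript above.

```r
# Hypothetical helper: return an EnsDb backed by a project-local SQLite copy.
load_frozen_ensdb <- function(ah_id, sqlite_path) {
  if (!file.exists(sqlite_path)) {
    # First run: let AnnotationHub resolve the resource, then copy the
    # underlying SQLite file instead of serialising the R object
    ah <- AnnotationHub::AnnotationHub()
    edb <- ah[[ah_id]]
    dir.create(dirname(sqlite_path), recursive = TRUE, showWarnings = FALSE)
    copied <- file.copy(BiocGenerics::dbconn(edb)@dbname, sqlite_path)
    if (!copied) stop("Could not copy the SQLite file to ", sqlite_path)
  }
  # Every run: open a fresh connection to the local file
  ensembldb::EnsDb(sqlite_path)
}
```

In the original script this would replace the saveRDS/readRDS branches, for example aqh <- load_frozen_ensdb("AH116291", here("data", "ensdb.sqlite")). Once the file exists, later runs do not touch the hub at all.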


I suspected this. Thanks for the answer; that solution might work! At least now I understand why this happens (though I wish R gave a better error message). My main goal is to avoid problems connecting to the server and re-downloading the data when it has not changed (I think localHub = TRUE already takes care of that), but I'm fine with updating the references.
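
Until the error message improves, a small guard can turn bad_weak_ptr into something readable. This is a sketch with a hypothetical function name; it relies on dbconn() still returning a (disconnected) connection object for a deserialised EnsDb, as the transcript above shows.

```r
# Hypothetical guard: stop with a readable message if the EnsDb's
# SQLite connection is no longer usable.
check_ensdb_connection <- function(edb) {
  con <- BiocGenerics::dbconn(edb)
  if (!DBI::dbIsValid(con)) {
    stop(
      "This EnsDb has no live SQLite connection. ",
      "Re-create it with ensembldb::EnsDb() or from AnnotationHub ",
      "instead of restoring it with readRDS().",
      call. = FALSE
    )
  }
  # Return the object invisibly so the check can sit in a pipeline
  invisible(edb)
}
```

Calling check_ensdb_connection(aqh) right after loading would fail early with an explanation rather than at the first EnsDb method call.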


It seems to be an issue with saveRDS and readRDS, not with the hub:

> aqh
EnsDb for Ensembl:
|Backend: SQLite
|Db type: EnsDb
|Type of Gene ID: Ensembl Gene ID
|Supporting package: ensembldb
|Db created by: ensembldb package from Bioconductor
|script_version: 0.3.10
|Creation time: Tue Jan 16 10:37:47 2024
|ensembl_version: 111
|ensembl_host: localhost
|Organism: Homo sapiens
|taxonomy_id: 9606
|genome_build: GRCh38
|DBSCHEMAVERSION: 2.2
|common_name: human
|species: homo_sapiens
| No. of genes: 72035.
| No. of transcripts: 278721.
|Protein data available.
> saveRDS(aqh, here("data", "aqh.rds"))
> aqh <- readRDS(here("data", "aqh.rds"))
> aqh
Error: bad_weak_ptr