I have projects split between R 3.6.3 and 4.0.0, and I am having trouble managing cached versions / snapshotDates of AnnotationHub resources. I first noticed this because the newest snapshotDate, "2020-04-27", appears to be missing most of the OrgDb records from NCBI. The AnnotationHub How To vignette has:
library(AnnotationHub)
ah <- AnnotationHub()
## snapshotDate(): 2020-03-31
query(ah, "OrgDb")
## AnnotationHub with 1708 records
## # snapshotDate(): 2020-03-31
However, there is a new snapshotDate available but it is missing most of these OrgDb:
> library(AnnotationHub)
> ah <- AnnotationHub()
snapshotDate(): 2020-04-27
> query(ah, "OrgDb")
AnnotationHub with 19 records
# snapshotDate(): 2020-04-27
> sessionInfo()
R version 4.0.0 (2020-04-24)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] AnnotationHub_2.20.0 BiocFileCache_1.12.0 dbplyr_1.4.3 BiocGenerics_0.34.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.4.6 pillar_1.4.4 compiler_4.0.0 BiocManager_1.30.10
[5] later_1.0.0 tools_4.0.0 digest_0.6.25 bit_1.1-15.2
[9] RSQLite_2.2.0 memoise_1.1.0 lifecycle_0.2.0 tibble_3.0.1
[13] pkgconfig_2.0.3 rlang_0.4.6 shiny_1.4.0.2 DBI_1.1.0
[17] rstudioapi_0.11 curl_4.3 yaml_2.2.1 fastmap_1.0.1
[21] dplyr_0.8.5 httr_1.4.1 IRanges_2.22.1 vctrs_0.3.0
[25] S4Vectors_0.26.1 rappdirs_0.3.1 stats4_4.0.0 bit64_0.9-7
[29] tidyselect_1.1.0 Biobase_2.48.0 glue_1.4.1 R6_2.4.1
[33] AnnotationDbi_1.50.0 purrr_0.3.4 blob_1.2.1 magrittr_1.5
[37] promises_1.1.0 ellipsis_0.3.1 htmltools_0.4.0 assertthat_0.2.1
[41] xtable_1.8-4 mime_0.9 interactiveDisplayBase_1.26.0 httpuv_1.5.2
[45] crayon_1.3.4 BiocVersion_3.11.1
The main AnnotationHub vignette seems to say that I can switch to a different snapshotDate by simply doing the below, but it doesn't work; I still get the same truncated list of OrgDbs:
> possibleDates(ah)
[1] "2013-03-19" "2013-03-21" "2013-03-26" "2013-04-04" "2013-04-29" "2013-06-24" "2013-06-25" "2013-06-26" "2013-06-27"
# ...
[127] "2019-10-29" "2020-01-28" "2020-02-28" "2020-03-31" "2020-04-27"
> snapshotDate(ah) <- "2020-03-31"
> query(ah, "OrgDb")
AnnotationHub with 19 records
# snapshotDate(): 2020-03-31
I am also having weird issues when running two instances of RStudio at the same time, one with R 4.0.0 and one with R 3.6.3. I am having trouble replicating everything weird that I saw but this seems to be replicable:
# 1. Open RStudio running R 4.0.0. Force a refresh of the cache with:
> library(AnnotationHub)
> ah <- refreshHub(hubClass="AnnotationHub")
|================================================================| 100%
snapshotDate(): 2020-04-27
> query(ah, "OrgDb")
AnnotationHub with 19 records
# snapshotDate(): 2020-04-27
# 2. Open another RStudio running 3.6.3. I had previously downloaded a snapshotDate of "2019-10-29" and it seems to find this one at first:
> library(AnnotationHub)
> ah <- AnnotationHub()
snapshotDate(): 2019-10-29
> query(ah, "OrgDb")
AnnotationHub with 1708 records
# snapshotDate(): 2019-10-29
> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] AnnotationHub_2.18.0 BiocFileCache_1.10.2 dbplyr_1.4.2 BiocGenerics_0.32.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.4.6 later_1.0.0 pillar_1.4.3
[4] compiler_3.6.3 BiocManager_1.30.10 tools_3.6.3
[7] digest_0.6.25 bit_1.1-15.2 RSQLite_2.2.0
[10] memoise_1.1.0 lifecycle_0.2.0 tibble_3.0.0
[13] pkgconfig_2.0.3 rlang_0.4.5 shiny_1.4.0.2
[16] DBI_1.1.0 cli_2.0.2 rstudioapi_0.11
[19] curl_4.3 yaml_2.2.1 fastmap_1.0.1
[22] dplyr_0.8.5 httr_1.4.1 IRanges_2.20.2
[25] vctrs_0.2.4 S4Vectors_0.24.3 rappdirs_0.3.1
[28] stats4_3.6.3 bit64_0.9-7 tidyselect_1.0.0
[31] Biobase_2.46.0 glue_1.4.0 R6_2.4.1
[34] AnnotationDbi_1.48.0 fansi_0.4.1 purrr_0.3.3
[37] blob_1.2.1 magrittr_1.5 promises_1.1.0
[40] ellipsis_0.3.0 htmltools_0.4.0 assertthat_0.2.1
[43] xtable_1.8-4 mime_0.9 interactiveDisplayBase_1.24.0
[46] httpuv_1.5.2 crayon_1.3.4 BiocVersion_3.10.1
# 3. Switch back to 4.0.0 and then refresh hub again
> ah <- refreshHub(hubClass="AnnotationHub")
|================================================================| 100%
snapshotDate(): 2020-04-27
> query(ah, "OrgDb")
AnnotationHub with 19 records
# snapshotDate(): 2020-04-27
# 4. Switch back to 3.6.3; query OrgDb and see that it is wrong. Refresh hub and query again
> query(ah, "OrgDb")
AnnotationHub with 0 records
# snapshotDate(): 2019-10-29
> ah <- refreshHub(hubClass="AnnotationHub")
|=======================================================================================| 100%
snapshotDate(): 2019-10-29
> query(ah, "OrgDb")
AnnotationHub with 1708 records
# snapshotDate(): 2019-10-29
# 5. Switch back to 4.0.0 and query for OrgDb again and see that it is wrong:
> query(ah, "OrgDb")
AnnotationHub with 0 records
# snapshotDate(): 2020-04-27
I've been wading through the help page for ?AnnotationHub, and I probably have to use some combination of the cache and localHub options, but I cannot find any good examples of how to do this. So my specific questions are:
- How do I set and switch between local caches of specific snapshotDates for different versions of R?
- How do I switch to the 2020-03-31 snapshotDate in 4.0.0?
- What happened to all the OrgDb packages in snapshotDate 2020-04-27?!?
Thanks
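For the first question above, one possible approach (a sketch only, not confirmed anywhere in this thread) is to give each R version its own cache directory so the two sessions stop overwriting a shared hub database. The helper name `hub_cache_dir` and the directory layout are hypothetical; the only AnnotationHub feature assumed is the `cache` argument of `AnnotationHub()`, and the base directory below is a temporary one that should be replaced with a persistent location.

```r
# Hypothetical helper (not part of AnnotationHub): build a cache directory
# path that is unique per R major.minor version, e.g. "<base_dir>/R-4.0".
hub_cache_dir <- function(base_dir, r_version = getRversion()) {
  parts <- strsplit(as.character(r_version), ".", fixed = TRUE)[[1]]
  file.path(base_dir, paste0("R-", parts[1], ".", parts[2]))
}

# Assumption: tempdir() is used only so the sketch runs anywhere;
# use a persistent directory of your choice for real work.
cache_dir <- hub_cache_dir(file.path(tempdir(), "AnnotationHub"))
dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)

# Only contact the hub in an interactive session with the package installed.
if (interactive() && requireNamespace("AnnotationHub", quietly = TRUE)) {
  # Assumes the `cache` argument documented in ?AnnotationHub.
  ah <- AnnotationHub::AnnotationHub(cache = cache_dir)
}
```

Run the same code in each R installation; the R 3.6.3 and R 4.0.0 sessions would then use separate directories. Whether this resolves the 0-record results shown above is untested here.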
I apologize for the inconvenience this has caused and will let you know when they are available.
No worries! Thank you for all your work on AnnotationHub. We work with many non-model organisms and AnnotationHub is extremely useful.
Thanks for clarifying this, but it would have been so nice if this special case were documented as such somewhere. I just spent an hour or so trying to figure this out. In other words, can this be documented? Or did I not look in the right place?
Philip
I'll look through the package documentation and see if there is an appropriate place to mention this, if it is not already there.