Hello,
I discovered that several AnnotationHub objects corresponding to genome assemblies are no longer available after updating to R 3.6.0. Specifically:
> hub = AnnotationHub()
snapshotDate(): 2019-04-29
> hub[["AH134"]]
Error: Defunct
> hub[["AH188"]]
Error: Defunct
> hub[["AH47190"]]
Error: Defunct
> hub[["AH80"]]
Error: Defunct
The above objects correspond to the hg19, mm10, danRer10, and dm5 genome assemblies. My lab relies on these assemblies, which are still commonly used. I'm unsure when these objects became unavailable, but as of R 3.5.3 (and the corresponding versions of Bioconductor packages), I was able to access them via my cached copies. After upgrading to R 3.6.0, however, I can no longer access them even though I have copies in my local cache.
I searched AnnotationHub() to see if I could find equivalents under other names, but could not find any.
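For anyone repeating that search, here is a minimal sketch of how it can be done with `query()`, which keeps only the records whose metadata match every pattern supplied. The helper name `find_genome_resources` and the pattern values in the usage comments are illustrative assumptions, not anything defined by AnnotationHub, and which records come back depends on the hub snapshot in use.

```r
# Sketch only: `find_genome_resources` is a hypothetical helper, not part of
# AnnotationHub. It assumes the AnnotationHub package is installed.
find_genome_resources <- function(hub, genome, resource_class = NULL) {
  # query() returns the hub records that match ALL of the given patterns.
  patterns <- c(genome, resource_class)
  AnnotationHub::query(hub, patterns)
}

# Example usage (needs network access or a populated local cache):
# hub  <- AnnotationHub::AnnotationHub()
# hits <- find_genome_resources(hub, "hg19", "TwoBitFile")
# hits   # prints the matching record IDs and titles, if any
```

Passing `resource_class = NULL` leaves the search unrestricted, so every record that mentions the build is listed.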
Here is my sessionInfo():
> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.4
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel grDevices utils datasets stats graphics methods base
other attached packages:
[1] AnnotationHub_2.15.15 BiocFileCache_1.7.10 dbplyr_1.4.0 BiocGenerics_0.29.2 readr_1.3.1 dplyr_0.8.0.1 tibble_2.1.1 magrittr_1.5
loaded via a namespace (and not attached):
[1] Rcpp_1.0.1 pillar_1.3.1 compiler_3.6.0 BiocManager_1.30.4 later_0.8.0
[6] tools_3.6.0 digest_0.6.18 bit_1.1-14 RSQLite_2.1.1 memoise_1.1.0
[11] pkgconfig_2.0.2 rlang_0.3.4 shiny_1.3.2 DBI_1.0.0 curl_3.3
[16] yaml_2.2.0 httr_1.4.0 IRanges_2.17.5 S4Vectors_0.21.24 rappdirs_0.3.1
[21] hms_0.4.2 stats4_3.6.0 bit64_0.9-7 tidyselect_0.2.5 Biobase_2.43.1
[26] glue_1.3.1 R6_2.4.0 AnnotationDbi_1.45.1 purrr_0.3.2 blob_1.1.1
[31] promises_1.0.1 htmltools_0.3.6 assertthat_0.2.1 mime_0.6 interactiveDisplayBase_1.21.0
[36] xtable_1.8-4 httpuv_1.5.1 crayon_1.3.4
Thank you for the reply. Would you mind explaining what changed in Rsamtools to prevent parsing of the FASTA files? I didn't see any changes that I would have expected to cause such a problem from the Rsamtools change log.
Thank you also for pointing me to the GTF and 2bit files. The GTF files correspond to genome annotations rather than primary genome sequence. I might be able to adapt code to use the 2bit files; however, is it correct that those aren't available for the hg19 / GRCh37 assembly?
Migration notes can be found here: https://github.com/Bioconductor/Rsamtools/blob/master/migration_notes.md
Thank you for the reference. Is the problem that AH134 and other AnnotationHub objects were compressed with razip instead of bgzip? If so, would it be possible to update those files to be stored with bgzip? Many people use those assemblies--some of which correspond to the latest version of the genome assembly available--and so I think that such an update would help many people.
Yes, the problem was that those objects were compressed with razip. We are discussing how to proceed and will hopefully have a solution soon.
That's wonderful to hear. Thank you! I think that a solution will benefit many people (and my own group as well).
Would it be possible to create versions that are compressed with bgzip instead?
As mentioned above, we started providing 2bit files instead of bgzip resources. We recommend switching to the 2bit files, as this is what Bioconductor will provide by default.
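As a rough sketch of what adapting code to a 2bit resource might look like: a TwoBitFile retrieved from the hub can be read for a given region with `rtracklayer::import()`. The helper name `read_twobit_region` and the placeholder record ID in the usage comments are assumptions for illustration; substitute the ID of the record you actually want, and check that the sequence names match those stored in the file.

```r
# Sketch only: `read_twobit_region` is a hypothetical helper. It assumes the
# rtracklayer, GenomicRanges and IRanges packages are installed.
read_twobit_region <- function(twobit, seqname, start, end) {
  # Describe the region of interest as a GRanges object.
  region <- GenomicRanges::GRanges(
    seqnames = seqname,
    ranges = IRanges::IRanges(start = start, end = end)
  )
  # import() on a TwoBitFile returns the requested sequence(s) as a DNAStringSet.
  rtracklayer::import(twobit, which = region)
}

# Example usage (record ID is a placeholder, not a real hub ID):
# hub    <- AnnotationHub::AnnotationHub()
# twobit <- hub[["<TwoBitFile record ID>"]]
# read_twobit_region(twobit, "chr1", 1000000, 1000050)
```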
See the above post for the mouse and zebrafish options...
It seems like the following could be used for the Homo sapiens hg19 build
and the following for the fruit fly
Thank you for the suggestion. I will give it a try.