celldex ensembl gene annotation issue
immuno.dh ▴ 20
@6554a238
Last seen 2.5 years ago
United States

Anyone know how I can fix this?


> ref.data <- celldex::HumanPrimaryCellAtlasData(ensembl = T)
snapshotDate(): 2021-10-18
see ?celldex and browseVignettes('celldex') for documentation
loading from cache
see ?celldex and browseVignettes('celldex') for documentation
loading from cache
snapshotDate(): 2021-10-20
loading from cache
Error: failed to load resource
  name: AH73881
  title: Ensembl 97 EnsDb for Homo sapiens
  reason: Table gene is missing required columns canonical_transcript!
SingleR celldex HumanPrimaryCellAtlasData ensembl • 3.0k views

I ran into the same issue while trying to annotate Mus musculus data.

ah <- AnnotationHub()
ens.mm.98 <- query(ah, c("Mus musculus", "Ensembl", 98))[[1]]

But the R console showed:

loading from cache
Error: failed to load resource
  name: AH75036
  title: Ensembl 98 EnsDb for Mus musculus
  reason: Table gene is missing required columns canonical_transcript!


Also running into this issue using tximeta...

Error: failed to load resource
name: AH78783
title: Ensembl 99 EnsDb for Homo sapiens
reason: Table gene is missing required columns canonical_transcript!

Could you post your session information?


This is mine:

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ensembldb_2.18.1            AnnotationFilter_1.18.0     GenomicFeatures_1.46.1      AnnotationDbi_1.56.1        DESeq2_1.34.0               SummarizedExperiment_1.24.0 Biobase_2.54.0             
 [8] MatrixGenerics_1.6.0        matrixStats_0.61.0          GenomicRanges_1.46.0        GenomeInfoDb_1.30.0         IRanges_2.28.0              S4Vectors_0.32.1            BiocGenerics_0.40.0        
[15] tximeta_1.12.0              readxl_1.3.1                ggrepel_0.9.1               forcats_0.5.1               stringr_1.4.0               dplyr_1.0.7                 purrr_0.3.4                
[22] readr_2.0.2                 tidyr_1.1.4                 tibble_3.1.5                ggplot2_3.3.5               tidyverse_1.3.1            

loaded via a namespace (and not attached):
  [1] colorspace_2.0-2              rjson_0.2.20                  ellipsis_0.3.2                XVector_0.34.0                fs_1.5.0                      rstudioapi_0.13               bit64_4.0.5                  
  [8] interactiveDisplayBase_1.32.0 fansi_0.5.0                   lubridate_1.8.0               xml2_1.3.2                    splines_4.1.2                 tximport_1.22.0               cachem_1.0.6                 
 [15] geneplotter_1.72.0            knitr_1.36                    jsonlite_1.7.2                Rsamtools_2.10.0              annotate_1.72.0               broom_0.7.10                  dbplyr_2.1.1                 
 [22] png_0.1-7                     shiny_1.7.1                   BiocManager_1.30.16           compiler_4.1.2                httr_1.4.2                    backports_1.3.0               lazyeval_0.2.2               
 [29] assertthat_0.2.1              Matrix_1.3-4                  fastmap_1.1.0                 cli_3.1.0                     later_1.3.0                   htmltools_0.5.2               prettyunits_1.1.1            
 [36] tools_4.1.2                   gtable_0.3.0                  glue_1.5.0                    GenomeInfoDbData_1.2.7        rappdirs_0.3.3                Rcpp_1.0.7                    cellranger_1.1.0             
 [43] vctrs_0.3.8                   Biostrings_2.62.0             rtracklayer_1.54.0            xfun_0.28                     rvest_1.0.2                   mime_0.12                     lifecycle_1.0.1              
 [50] restfulr_0.0.13               XML_3.99-0.8                  AnnotationHub_3.2.0           zlibbioc_1.40.0               scales_1.1.1                  vroom_1.5.5                   ProtGenerics_1.26.0          
 [57] hms_1.1.1                     promises_1.2.0.1              parallel_4.1.2                RColorBrewer_1.1-2            yaml_2.2.1                    curl_4.3.2                    memoise_2.0.0                
 [64] biomaRt_2.50.0                stringi_1.7.5                 RSQLite_2.2.8                 genefilter_1.76.0             BiocVersion_3.14.0            BiocIO_1.4.0                  filelock_1.0.2               
 [71] BiocParallel_1.28.0           rlang_0.4.12                  pkgconfig_2.0.3               bitops_1.0-7                  lattice_0.20-45               evaluate_0.14                 GenomicAlignments_1.30.0     
 [78] bit_4.0.4                     tidyselect_1.1.1              magrittr_2.0.1                R6_2.5.1                      generics_0.1.1                DelayedArray_0.20.0           DBI_1.1.1                    
 [85] pillar_1.6.4                  haven_2.4.3                   withr_2.4.2                   survival_3.2-13               KEGGREST_1.34.0               RCurl_1.98-1.5                modelr_0.1.8                 
 [92] crayon_1.4.2                  utf8_1.2.2                    BiocFileCache_2.2.0           tzdb_0.2.0                    rmarkdown_2.11                progress_1.2.2                locfit_1.5-9.4               
 [99] grid_4.1.2                    blob_1.2.2                    reprex_2.0.1                  digest_0.6.28                 xtable_1.8-4                  httpuv_1.6.3                  munsell_0.5.0
Aaron Lun ★ 28k
@alun
Last seen 15 hours ago
The city by the bay

I don't really know what the cause might be; looks like the same error is occurring on the BioC build machines, despite the lack of any recent changes to celldex. I don't have a copy of the latest BioC release right now, but I would guess that cached versions of various EnsDb objects are not compatible with the latest ensembldb. This can be tested by running:

library(AnnotationHub)
AnnotationHub()[["AH78783"]] # should fail 
AnnotationHub()[["AH78783", force=TRUE]] # might work. or not.

If the second one also fails, then ensembldb is busted and I'll need to complain to someone.
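To compare the two attempts in one run without stopping at the first error, each call can be wrapped in `tryCatch`. This is only a sketch: `try_load` and `broken` are hypothetical names, and the stand-in loader just mimics the error message so the example runs without AnnotationHub.

```r
# Hypothetical helper: run a loader function and report success or the error message.
try_load <- function(loader) {
  tryCatch(
    {
      loader()
      "loaded"
    },
    error = function(e) paste("failed:", conditionMessage(e))
  )
}

# Stand-in loader that mimics the failure, so the sketch runs without AnnotationHub.
broken <- function() {
  stop("Table gene is missing required columns canonical_transcript!")
}
print(try_load(broken))

# With AnnotationHub installed, the two attempts above would be:
# ah <- AnnotationHub::AnnotationHub()
# try_load(function() ah[["AH78783"]])
# try_load(function() ah[["AH78783", force = TRUE]])
```

If both real calls report a failure, that points at ensembldb rather than at a stale cached copy.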


Thanks! Just tested and they both still fail for me – might be a problem with ensembldb then!


I'd suggest making a new question with this example, or posting this on the ensembldb GitHub repo; otherwise the ensembldb maintainers won't see it.


Just to add that Johannes Rainer, who kindly provides the EnsDb objects through AnnotationHub, indicated that the field canonical_transcript has been added to the EnsDb objects from Ensembl release 102 onwards. In other words, you may need to use a more recent version of the EnsDb object...?

See: https://github.com/jorainer/ensembldb/issues/109#issuecomment-726741705

Edit: I noticed on this list that the package tximeta also seems to be affected by the same issue, so it looks like a problem with ensembldb rather than with the EnsDb objects? See the thread "ensembldb support for Ensembl's transcript_name column".


Johannes is aware of this and has an issue here:

https://github.com/jorainer/ensembldb/issues/122


Hi, Michael and Aaron,

Thank you!

I just tried running the following code again.

ah <- AnnotationHub()
ens.hs.98 <- query(ah, c("Homo sapiens", "Ensembl", 98))[[1]]
ens.mm.98 <- query(ah, c("Mus musculus", "Ensembl", 98))[[1]]

But I still get the following error:

Error: failed to load resource
  name: AH75011
  title: Ensembl 98 EnsDb for Homo sapiens
  reason: Table gene is missing required columns canonical_transcript!

My session info is as follows:

R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] grid stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] ggvenn_0.1.9 patchwork_1.1.1 forcats_0.5.1
[4] stringr_1.4.0 dplyr_1.0.7 purrr_0.3.4
[7] readr_2.0.2 tidyr_1.1.4 tibble_3.1.5
[10] tidyverse_1.3.1 BiocParallel_1.28.0 AnnotationHub_3.2.0
[13] BiocFileCache_2.2.0 dbplyr_2.1.1 ensembldb_2.18.1
[16] AnnotationFilter_1.18.0 GenomicFeatures_1.46.1 AnnotationDbi_1.56.1
[19] scater_1.22.0 ggplot2_3.3.5 scuttle_1.4.0
[22] DropletUtils_1.14.0 SingleCellExperiment_1.16.0 SummarizedExperiment_1.24.0
[25] Biobase_2.54.0 GenomicRanges_1.46.0 GenomeInfoDb_1.30.0
[28] IRanges_2.28.0 S4Vectors_0.32.1 BiocGenerics_0.40.0
[31] MatrixGenerics_1.6.0 matrixStats_0.61.0 Matrix_1.3-4

loaded via a namespace (and not attached):
[1] snow_0.4-4 readxl_1.3.1 backports_1.2.1
[4] lazyeval_0.2.2 digest_0.6.28 htmltools_0.5.2
[7] viridis_0.6.2 fansi_0.5.0 magrittr_2.0.1
[10] memoise_2.0.0 ScaledMatrix_1.2.0 tzdb_0.1.2
[13] limma_3.50.0 Biostrings_2.62.0 modelr_0.1.8
[16] R.utils_2.11.0 prettyunits_1.1.1 colorspace_2.0-2
[19] rvest_1.0.2 blob_1.2.2 rappdirs_0.3.3
[22] ggrepel_0.9.1 haven_2.4.3 crayon_1.4.2
[25] RCurl_1.98-1.5 jsonlite_1.7.2 glue_1.4.2
[28] gtable_0.3.0 zlibbioc_1.40.0 XVector_0.34.0
[31] DelayedArray_0.20.0 BiocSingular_1.10.0 Rhdf5lib_1.16.0
[34] HDF5Array_1.22.0 scales_1.1.1 DBI_1.1.1
[37] edgeR_3.36.0 Rcpp_1.0.7 viridisLite_0.4.0
[40] xtable_1.8-4 progress_1.2.2 dqrng_0.3.0
[43] bit_4.0.4 rsvd_1.0.5 httr_1.4.2
[46] ellipsis_0.3.2 pkgconfig_2.0.3 XML_3.99-0.8
[49] R.methodsS3_1.8.1 locfit_1.5-9.4 utf8_1.2.2
[52] tidyselect_1.1.1 rlang_0.4.11 later_1.3.0
[55] cellranger_1.1.0 munsell_0.5.0 BiocVersion_3.14.0
[58] tools_4.1.1 cachem_1.0.6 cli_3.0.1
[61] generics_0.1.1 RSQLite_2.2.8 broom_0.7.10
[64] fastmap_1.1.0 yaml_2.2.1 fs_1.5.0
[67] bit64_4.0.5 KEGGREST_1.34.0 sparseMatrixStats_1.6.0
[70] mime_0.12 R.oo_1.24.0 xml2_1.3.2
[73] biomaRt_2.50.0 compiler_4.1.1 rstudioapi_0.13
[76] beeswarm_0.4.0 filelock_1.0.2 curl_4.3.2
[79] png_0.1-7 interactiveDisplayBase_1.32.0 reprex_2.0.1
[82] stringi_1.7.5 lattice_0.20-44 ProtGenerics_1.26.0
[85] vctrs_0.3.8 pillar_1.6.4 lifecycle_1.0.1
[88] rhdf5filters_1.6.0 BiocManager_1.30.16 BiocNeighbors_1.12.0
[91] bitops_1.0-7 irlba_2.3.3 httpuv_1.6.3
[94] rtracklayer_1.54.0 R6_2.5.1 BiocIO_1.4.0
[97] promises_1.2.0.1 gridExtra_2.3 vipor_0.4.5
[100] assertthat_0.2.1 rhdf5_2.38.0 rjson_0.2.20
[103] withr_2.4.2 GenomicAlignments_1.30.0 Rsamtools_2.10.0
[106] GenomeInfoDbData_1.2.7 parallel_4.1.1 hms_1.1.1
[109] beachmat_2.10.0 DelayedMatrixStats_1.16.0 lubridate_1.8.0
[112] shiny_1.7.1 ggbeeswarm_0.6.0 restfulr_0.0.13

I am very new to this and would like to learn how to annotate an scRNA-seq matrix.tsv file.

Thank you very much!


This bug has been fixed in ensembldb version 2.18.2, so you will need to update; you are still using version 2.18.1.
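A minimal sketch of how one might check whether an installed version predates the fix. `needs_update` is a hypothetical helper, not part of any package, and the commented lines assume that BiocManager is installed.

```r
# Hypothetical helper: TRUE if the installed version is older than the fixed one.
needs_update <- function(installed, fixed = "2.18.2") {
  package_version(installed) < package_version(fixed)
}

print(needs_update("2.18.1"))  # the version shown in the session info above
print(needs_update("2.18.2"))  # the version that contains the fix

# On a real installation (assumes ensembldb and BiocManager are installed):
# if (needs_update(as.character(packageVersion("ensembldb")))) {
#   BiocManager::install("ensembldb")
# }
```

Restarting R after the update is a sensible precaution so that the new version is the one actually loaded.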


Hi, Guido,

Thank you so much! I reinstalled ensembldb and now it works!
