Question: Missing SYMBOL keytype in EnsDb.Hsapiens.v75
Zach Roe wrote (16 months ago):

Hi,

I'm following the AnnotationDbi introduction PDF on the use of the EnsDb.Hsapiens.v75 database, as I have gene symbols that need to be mapped back to Ensembl IDs. (They were originally Ensembl IDs that a previous scientist mapped to gene symbols, but that mapping is lost to me, so I want to use an Ensembl-specific database to see if I can recover the original IDs.)

Please refer to Section 0.6 of this manual, published July 7, 2016:

https://www.bioconductor.org/packages/devel/bioc/vignettes/AnnotationDbi/inst/doc/IntroToAnnotationPackages.pdf

 

However, when I use SYMBOL as the keytype to map to the GENEID column, I get the following error:

> mapIds(EnsDb.Hsapiens.v75, keys=keys, column="GENEID", keytype="SYMBOL", multiVals="first")

Error in .select(x = x, keys = keys, columns = columns, keytype = keytype,  : 
  keytype SYMBOL not available in the database. Use keytypes method to list all available keytypes.
In addition: Warning message:
In .select(x = x, keys = keys, columns = columns, keytype = keytype,  :
  The following columns are not available in the database and have thus been removed: SYMBOL

Checking the available keytypes and columns as described in the manual shows that SYMBOL is no longer available, contrary to the example in Section 0.6:

> library(EnsDb.Hsapiens.v75)
> edb <- EnsDb.Hsapiens.v75
> columns(edb)
 [1] "ENTREZID"       "EXONID"         "EXONIDX"        "EXONSEQEND"     "EXONSEQSTART"   "GENEBIOTYPE"   
 [7] "GENEID"         "GENENAME"       "GENESEQEND"     "GENESEQSTART"   "ISCIRCULAR"     "SEQCOORDSYSTEM"
[13] "SEQLENGTH"      "SEQNAME"        "SEQSTRAND"      "TXBIOTYPE"      "TXCDSSEQEND"    "TXCDSSEQSTART" 
[19] "TXID"           "TXSEQEND"       "TXSEQSTART"    
> keytypes(edb)
[1] "ENTREZID"    "EXONID"      "GENEBIOTYPE" "GENEID"      "GENENAME"    "SEQNAME"     "SEQSTRAND"   "TXBIOTYPE"  
[9] "TXID"

Session info is below.

Thank you very much!

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
 [1] parallel  stats4    grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] EnsDb.Hsapiens.v75_0.99.12 ensembldb_1.4.7            GenomicFeatures_1.24.5     gageData_2.10.0           
 [5] gage_2.22.0                pathview_1.12.0            vsn_3.40.0                 BiocParallel_1.6.3        
 [9] arrayQualityMetrics_3.28.2 HGNChelper_0.3.1           DESeq2_1.12.3              SummarizedExperiment_1.2.3
[13] GenomicRanges_1.24.2       GenomeInfoDb_1.8.3         org.Hs.eg.db_3.3.0         AnnotationDbi_1.34.4      
[17] IRanges_2.6.1              S4Vectors_0.10.2           Biobase_2.32.0             BiocGenerics_0.18.0       
[21] calibrate_1.7.2            MASS_7.3-45                xlsx_0.5.7                 xlsxjars_0.6.1            
[25] rJava_0.9-8                gridExtra_2.2.1            ggplot2_2.1.0              pheatmap_1.0.8            
[29] lattice_0.20-33            RColorBrewer_1.1-2         gplots_3.0.1               reshape2_1.4.1            
[33] reshape_0.8.5              tidyr_0.5.1                dplyr_0.5.0                circlize_0.3.7            
[37] migest_1.7.2               BiocInstaller_1.22.3  
Johannes Rainer wrote (16 months ago):

Hi

It is not that the column/keytype SYMBOL is no longer there; it is not there yet. I've included support for the SYMBOL filter in version 1.5.9 of ensembldb (the current version in Bioc devel), while your session uses ensembldb 1.4.7.

SYMBOL is, however, only an alias for GENENAME, so in your case you can simply use keytype = "GENENAME" instead and get the same results:

> library(ensembldb)
> library(EnsDb.Hsapiens.v75)
> keys <- c("BCL2", "BCL2L11", "ZBTB16", "NR3C1")
> mapIds(EnsDb.Hsapiens.v75, keys = keys, column = "GENEID",
+        keytype = "GENENAME", multiVals = "first")
             BCL2           BCL2L11            ZBTB16             NR3C1
"ENSG00000171791" "ENSG00000153094" "ENSG00000109906" "ENSG00000113580"
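
If the same script has to run under both older and newer ensembldb versions, one option is to pick the keytype at run time. The following is only a sketch under assumptions: pick_keytype is a hypothetical helper (it is not part of ensembldb or AnnotationDbi) that uses SYMBOL where the installed version offers it and falls back to GENENAME otherwise. It also passes multiVals = "list", so that a symbol matching more than one Ensembl gene ID is not silently reduced to the first hit, which may matter when trying to recover the original IDs.

```r
# Hypothetical helper (not part of ensembldb or AnnotationDbi): return the
# first preferred keytype that the database actually offers.
pick_keytype <- function(available, preferred = c("SYMBOL", "GENENAME")) {
  hit <- preferred[preferred %in% available]
  if (length(hit) == 0L) {
    stop("none of the preferred keytypes is available: ",
         paste(preferred, collapse = ", "))
  }
  hit[[1L]]
}

# Only touch the database when the annotation package is installed.
if (requireNamespace("EnsDb.Hsapiens.v75", quietly = TRUE)) {
  library(ensembldb)
  library(EnsDb.Hsapiens.v75)
  edb <- EnsDb.Hsapiens.v75
  keys <- c("BCL2", "BCL2L11", "ZBTB16", "NR3C1")

  # "SYMBOL" if this ensembldb version lists it, otherwise "GENENAME".
  kt <- pick_keytype(keytypes(edb))

  # multiVals = "list" keeps every matching gene ID per symbol instead of
  # only the first one.
  ids <- mapIds(edb, keys = keys, column = "GENEID",
                keytype = kt, multiVals = "list")
  print(ids)
}
```

Symbols that come back with more than one gene ID in the list are the ones to check by hand.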