Question

Does AnnotationHub replace AnnotationDbi for ID conversion (e.g. Entrez, RefSeq, symbols)?

dalloliogm ▴ 50


Hello,

I am exploring the new AnnotationHub package and I find it fantastic. Thank you for all the effort, and also for the nice tutorials on Coursera and YouTube.

I do not understand to what extent AnnotationHub will replace existing packages such as AnnotationDbi or the organism packages. For example, is there a way to convert Entrez Gene IDs to symbols and other IDs using AnnotationHub, or do I still have to rely on AnnotationDbi and the organism packages? More generally, I am a bit confused about the correct way to convert gene IDs from one type to another in Bioconductor without using biomaRt. Can anyone please help me clarify this?

Thanks


updated 8.9 years ago by James W. MacDonald 66k • written 8.9 years ago by dalloliogm ▴ 50

score 2 · Accepted Answer · 2015-09-21

The AnnotationHub is just an easier way to supply annotation data to end users. It doesn't change how you interact with the data that you download; it only changes how you search for and retrieve the data. For example, if there were an org.Hs.eg.db on the AnnotationHub (I don't believe there is, btw), then you would still need AnnotationDbi to query it. Here is what that looks like for an OrgDb package that is on the hub:

> # assumes library(AnnotationHub) and hub <- AnnotationHub() were run first
> query(hub, c("OrgDb"))
AnnotationHub with 1145 records
# snapshotDate(): 2015-08-26
# $dataprovider: NCBI
# $species: 'Nostoc azollae'_0708, Acaryochloris marina_MBIC11017, Acetobact...
# $rdataclass: OrgDb
# additional mcols(): taxonomyid, genome, description, tags, sourceurl,
#   sourcetype
# retrieve records with, e.g., 'object[["AH12818"]]'

            title                                                    
  AH12818 | org.Pseudomonas_mendocina_NK-01.eg.sqlite                
  AH12819 | org.Streptomyces_coelicolor_A3(2).eg.sqlite              
  AH12820 | org.Cricetulus_griseus.eg.sqlite                         
  AH12821 | org.Streptomyces_cattleya_NRRL_8057_=_DSM_46488.eg.sqlite
  AH12822 | org.Cavia_porcellus.eg.sqlite                            
  ...       ...                                                      
  AH13958 | org.Ochotona_princeps.eg.sqlite                          
  AH13959 | org.Aeromonas_veronii_B565.eg.sqlite                     
  AH13960 | org.Oryctolagus_cuniculus.eg.sqlite                      
  AH13961 | org.Tetraodon_nigroviridis.eg.sqlite                     
  AH13962 | org.Burkholderia_gladioli_BSR3.eg.sqlite                 
> z <- hub[["AH12818"]]
> z
OrgDb object:
| DBSCHEMAVERSION: 2.1
| DBSCHEMA: NOSCHEMA_DB
| ORGANISM: Pseudomonas mendocina_NK-01
| SPECIES: Pseudomonas mendocina_NK-01
| CENTRALID: GID
| TAXID: 1001585
| Db type: OrgDb
| Supporting package: AnnotationDbi

> select(z, head(keys(z)), "SYMBOL")
       GID   SYMBOL
1 10454784 MDS_0002
2 10454785 MDS_0003
3 10454786 MDS_0004
4 10454787 MDS_0005
5 10454788 MDS_0006
6 10454789 MDS_0007

Long term I think everything is going to end up in AnnotationHub, rather than being downloaded and installed as packages. But for now there is still one set of annotation packages that you install, and another set that is on the AnnotationHub. Either way, you interact with the data in the same way; only how you get the data differs.
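To make the comparison concrete, here is a minimal sketch of the same kind of query against an installed organism package. It assumes AnnotationDbi and org.Hs.eg.db are installed from Bioconductor; the two Entrez Gene IDs are just illustrative picks (7157 is TP53, 672 is BRCA1), not anything from the original question.

```r
# Assumption: AnnotationDbi and org.Hs.eg.db are already installed.
library(AnnotationDbi)
library(org.Hs.eg.db)

# Illustrative Entrez Gene IDs (7157 = TP53, 672 = BRCA1); keys must be character
entrez_ids <- c("7157", "672")

# One value per key: returns a character vector named by the input keys
symbols <- mapIds(org.Hs.eg.db,
                  keys = entrez_ids,
                  column = "SYMBOL",
                  keytype = "ENTREZID",
                  multiVals = "first")

# One-to-many mappings: returns a data.frame with one row per key/value pair
# (a gene usually has several RefSeq accessions, so expect repeated ENTREZID rows)
id_table <- AnnotationDbi::select(org.Hs.eg.db,
                                  keys = entrez_ids,
                                  columns = c("SYMBOL", "REFSEQ"),
                                  keytype = "ENTREZID")

# List which ID types this particular OrgDb object can convert between
available_id_types <- columns(org.Hs.eg.db)
```

The same select() call works unchanged on an OrgDb object pulled from the hub, such as z in the transcript above; only the first argument and the key type (GID there, ENTREZID here) differ. Calling select with the AnnotationDbi:: prefix is just a precaution in case another loaded package, dplyr for instance, masks the name.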