Question: source of GO ontology and GO annotations in Bioconductor for GO.db
5 weeks ago by
marcinjoachimiak10 wrote:

Hello,

I started using a package called GO.db, which is part of Bioconductor. Along the way I needed to find out the source of the GO ontology used by GO.db.

As far as I can tell from the package source, GO.db relies on Bioconductor to provide the GO data file. However, I have not been able to find the corresponding code in the Bioconductor install files. If you could point me to where this occurs I would be very grateful. It looks like the GO.db package is maintained by Bioconductor...

many thanks,

marcin

5 weeks ago by
United States
James W. MacDonald48k wrote:

This package is simply a repackaging of data from geneontology.org into a more user-friendly format. It's not obvious how one would find that out, but if you do something like

> library(GO.db)

> ls(2)
 [1] "GO"            "GO.db"         "GO_dbconn"     "GO_dbfile"
 [5] "GO_dbInfo"     "GO_dbschema"   "GOBPANCESTOR"  "GOBPCHILDREN"
 [9] "GOBPOFFSPRING" "GOBPPARENTS"   "GOCCANCESTOR"  "GOCCCHILDREN"
[13] "GOCCOFFSPRING" "GOCCPARENTS"   "GOMAPCOUNTS"   "GOMFANCESTOR"
[17] "GOMFCHILDREN"  "GOMFOFFSPRING" "GOMFPARENTS"   "GOOBSOLETE"
[21] "GOSYNONYM"     "GOTERM"

and then, as an example,

> ?GOTERM

Under the Details section there is this:

    Mappings were based on data provided by: Gene Ontology
ftp://ftp.geneontology.org/pub/go/godatabase/archive/latest-lite/
With a date stamp from the source of: 2018-Mar28
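
A side note on the ls(2) call above: it lists whatever sits in position 2 of the search path, which is GO.db only if it was the most recently attached package. Naming the search-path entry is a bit more robust. The sketch below wraps that in a small helper; list_attached is a made-up name for illustration, not something GO.db or base R provides.

```r
# Hypothetical helper: list the objects of an attached package by name
# rather than by its position on the search path.
list_attached <- function(pkg) {
  entry <- paste0("package:", pkg)
  if (!entry %in% search()) {
    stop("package '", pkg, "' is not attached; call library(", pkg, ") first")
  }
  ls(entry)
}

# Usage, assuming GO.db is installed:
#   library(GO.db)
#   list_attached("GO.db")
```

With GO.db attached, this should return the same names as the ls(2) listing, regardless of what else has been loaded since.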

Alternatively, you could do

> GO_dbInfo()
              name                                                             value
1     GOSOURCENAME                                                     Gene Ontology
2      GOSOURCEURL ftp://ftp.geneontology.org/pub/go/godatabase/archive/latest-lite/
3     GOSOURCEDATE                                                        2018-Mar28
4          Db type                                                              GODb
5          package                                                     AnnotationDbi
6         DBSCHEMA                                                             GO_DB
7   GOEGSOURCEDATE                                                         2018-Apr4
8   GOEGSOURCENAME                                                       Entrez Gene
9    GOEGSOURCEURL                              ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
10 DBSCHEMAVERSION                                                               2.1

That also reports some other metadata, such as the Entrez Gene source and the schema version, though it may be too 'inside baseball' to be useful.
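
If you want the provenance fields programmatically rather than reading them off the console, something like the following might work. It is only a sketch: go_provenance is a hypothetical helper, and it assumes the package's SQLite file has a metadata table with name and value columns, which is what the output above suggests.

```r
# Hypothetical helper (not part of GO.db or AnnotationDbi): keep only the
# rows of a name/value metadata table that describe the GO source.
go_provenance <- function(info) {
  stopifnot(all(c("name", "value") %in% names(info)))
  wanted <- c("GOSOURCENAME", "GOSOURCEURL", "GOSOURCEDATE")
  found <- info[info$name %in% wanted, , drop = FALSE]
  # Return a named character vector, e.g. result["GOSOURCEDATE"]
  setNames(as.character(found$value), found$name)
}

# Only runs when GO.db is installed. GO_dbconn() is one of the objects in
# the listing above; the table and column names in the query are assumptions.
if (requireNamespace("GO.db", quietly = TRUE)) {
  meta <- DBI::dbGetQuery(GO.db::GO_dbconn(),
                          "SELECT name, value FROM metadata")
  print(go_provenance(meta))
}
```

Recording these three values alongside your results is a cheap way to document which GO release an analysis was based on.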