How to retrieve all GO terms at level 5 as well as their annotated genes
@peter-davidsen-4584
Dear list, I'm looking for a way to get the names of all Gene Ontology terms for Biological Processes at level 5 as well as the genes (human gene symbols) annotated to each of the level 5 GO terms. I have tried to query the DAVID knowledgebase, but the online tool doesn't seem to respond to any requests. Hence, could anybody maybe point me in the direction of a package that could provide me with the same information? Kind regards, Peter
Hi Peter,

Probably not the most elegant way, but you could do something like this (assuming I understand correctly what a "level 5" term is):

    library(GO.db)

    # Return the unique direct BP children of a set of GO IDs
    getAllBPChildren <- function(goids)
    {
        ans <- unique(unlist(mget(goids, GOBPCHILDREN), use.names=FALSE))
        ans[!is.na(ans)]  # terms without children map to NA
    }

    level1_BP_terms <- getAllBPChildren("GO:0008150")     # 23 terms
    level2_BP_terms <- getAllBPChildren(level1_BP_terms)  # 256 terms
    level3_BP_terms <- getAllBPChildren(level2_BP_terms)  # 3059 terms
    level4_BP_terms <- getAllBPChildren(level3_BP_terms)  # 9135 terms
    level5_BP_terms <- getAllBPChildren(level4_BP_terms)  # 15023 terms

    library(org.Hs.eg.db)
    # Entrez Gene IDs annotated to each level 5 term
    level5_genes <- mget(intersect(level5_BP_terms, keys(org.Hs.egGO2EG)),
                         org.Hs.egGO2EG)

Cheers,
H.

--
Hervé Pagès
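The original question asked for term names and human gene symbols, whereas org.Hs.egGO2EG returns Entrez Gene IDs. A possible way to get both is sketched below using the select() interface from AnnotationDbi. The helper name describe_go_terms is hypothetical (not part of any package), and the sketch assumes GO.db and org.Hs.eg.db are installed and that at least one of the supplied IDs has a direct human annotation.

```r
# Sketch only: attach term names and human gene symbols to a vector of GO IDs.
# `describe_go_terms` is a hypothetical helper name, not a Bioconductor function.
describe_go_terms <- function(go_ids) {
  go_db <- GO.db::GO.db
  hs_db <- org.Hs.eg.db::org.Hs.eg.db

  # Term names: GO.db exposes the GOID and TERM columns.
  term_names <- AnnotationDbi::select(go_db, keys = go_ids,
                                      columns = "TERM", keytype = "GOID")

  # Keep only the IDs that carry at least one direct human annotation.
  annotated <- intersect(go_ids, AnnotationDbi::keys(hs_db, keytype = "GO"))

  # One row per (GO term, gene) pair, with the gene symbol attached.
  genes <- AnnotationDbi::select(hs_db, keys = annotated,
                                 columns = "SYMBOL", keytype = "GO")

  # Collapse to a list of unique symbols per GO ID.
  symbols_by_term <- lapply(split(genes$SYMBOL, genes$GO), unique)

  list(terms = term_names, symbols = symbols_by_term)
}
```

It could then be called as res <- describe_go_terms(level5_BP_terms). Like org.Hs.egGO2EG, the "GO" keytype only covers genes annotated directly to a term; the "GOALL" keytype would also include genes annotated to its descendant terms.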
Hi Peter,

I recently went through this with a client, and he had a hard time understanding that there is not really a unique "level" for a GO term. Many of your level 5 terms can also be level 4 terms, level 3 terms, level 6 terms, etc. This is due to the directed acyclic graph structure of GO, which allows multiple paths of different lengths from one ancestor to one descendant. Just wanted to point this out!

Cheers,
Jenny
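To make the point about multiple paths concrete, here is a small base R sketch on a made-up graph. The term names (root, A, B, C) and the children list are purely illustrative and are not real GO data; get_children mimics the expansion idea of getAllBPChildren() without needing GO.db.

```r
# Toy DAG with hypothetical term names: each entry lists the direct children.
children <- list(
  root = c("A", "B"),
  A    = c("B", "C"),   # B is also a child of A, so root -> A -> B exists
  B    = c("C"),
  C    = character(0)
)

# Same idea as getAllBPChildren(), applied to the toy graph.
get_children <- function(ids) {
  unique(unlist(children[ids], use.names = FALSE))
}

level1 <- get_children("root")
level2 <- get_children(level1)
level3 <- get_children(level2)

# Terms that show up at more than one level.
both_1_and_2 <- intersect(level1, level2)
both_2_and_3 <- intersect(level2, level3)
```

Here B ends up in both level1 and level2, and C in both level2 and level3, simply because each can be reached from the root by paths of different lengths.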
Hi Jenny,

Good point. And to illustrate this:

    > length(intersect(level4_BP_terms, level5_BP_terms))
    [1] 7738

which means 7738 terms belong to both level 4 and level 5.

What is unique, however, is the "minimum level" of a term, i.e. the length of the shortest path between the term and the root of the ontology. In other words, the "minimum level" of a term is its distance to the root.

If you want the BP terms that are at distance 5 from the root, just do:

    > dist5_BP_terms <- setdiff(level5_BP_terms, c(level4_BP_terms, level3_BP_terms, level2_BP_terms, level1_BP_terms))
    > length(dist5_BP_terms)
    [1] 7072

Playing a little bit more with this, it seems that all the terms in the BP ontology are at a distance <= 12 from the root term. There are only 3 terms at distance 12: GO:0051564, GO:0051565, and GO:0007035.

Cheers,
H.

--
Hervé Pagès
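The "minimum level" idea generalizes to every distance at once with a breadth-first walk from the root. The following is a minimal base R sketch on a made-up graph; the children list, the term names, and the function name min_distance are all hypothetical and only illustrate the technique.

```r
# Toy DAG with hypothetical term names: each entry lists the direct children.
children <- list(
  root = c("A", "B"),
  A    = c("B", "C"),
  B    = c("C", "D"),
  C    = "D",
  D    = character(0)
)

# Breadth-first search: a term's distance is the first level at which it appears.
min_distance <- function(children, root) {
  dist <- setNames(0L, root)   # named integer vector: term -> distance
  frontier <- root
  d <- 0L
  while (length(frontier) > 0) {
    d <- d + 1L
    nxt <- as.character(unique(unlist(children[frontier], use.names = FALSE)))
    nxt <- setdiff(nxt, names(dist))  # drop terms already reached by a shorter path
    dist[nxt] <- d
    frontier <- nxt
  }
  dist
}

dist_to_root <- min_distance(children, "root")
terms_at_2 <- names(dist_to_root)[dist_to_root == 2L]
```

With real data, the children list could presumably be built from as.list(GOBPCHILDREN) after dropping the NA entries that mark terms without children, and "GO:0008150" would serve as the root.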