KEGG REST: retrieving genes
Guido Hooiveld ★ 3.9k
@guido-hooiveld-2020
Wageningen University, Wageningen, the …
Hi,

I am exploring the KEGGREST package. I would like to retrieve the genes that belong to a specific pathway, e.g. all human genes that are in the Arachidonic Acid Metabolism pathway (= map00590). For now the topology of the pathway is not relevant to me.

I have checked the KEGGREST vignette but could not find how to do this, so if this is possible a pointer would be appreciated.

Thanks,
Guido

As a side note (for the maintainer): I noticed that the API has recently been updated (18 January 2013); among other things, KGML files can now be retrieved and the conversion options from/to KEGG IDs have been expanded.

> sessionInfo()
R Under development (unstable) (2012-11-21 r61136)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] KEGGREST_0.99.1

loaded via a namespace (and not attached):
[1] BiocGenerics_0.5.6 Biostrings_2.27.10 digest_0.6.2 httr_0.2
[5] IRanges_1.17.30 parallel_2.16.0 png_0.1-4 RCurl_1.91-1.1
[9] stats4_2.16.0 stringr_0.6.2 tools_2.16.0
Dan Tenenbaum ★ 8.2k
@dan-tenenbaum-4256
United States
Hi Guido,

On Tue, Jan 29, 2013 at 2:24 PM, Hooiveld, Guido wrote:
> I would like to retrieve the genes that belong to a specific pathway, e.g. all human genes that are in the Arachidonic Acid Metabolism pathway (= map00590).

Normally the answer would be:

    keggGet("map00590")[[1]]$GENE

But it looks like KEGG does not have gene data for this particular pathway. See the underlying URL, http://rest.kegg.jp/get/path:map00590; we would expect a GENE section like the one you'd see in a different pathway, such as http://rest.kegg.jp/get/path:hsa05200.

You can find some (possibly outdated) genes for this pathway by doing the following:

    library(org.Hs.eg.db)
    select(org.Hs.eg.db, keys="00590", columns=c("ENTREZID","SYMBOL"), keytype="PATH")

This is old KEGG data and I do not know why their REST interface doesn't contain this data.

> As a side note (for the maintainer): I noticed that the API has recently been updated (18 January 2013); among other things, KGML files can now be retrieved and the conversion options from/to KEGG IDs have been expanded.

Thanks! I will update the package.

Dan
Hi Dan,

Actually, after a slight modification your suggestion does work! I realized that the pathways referred to as 'mapxxxxx' are the reference pathways; to retrieve the genes for a specific organism, 'map' has to be replaced by the abbreviation of that organism, e.g. 'hsa' or 'mmu'.

Thus, to retrieve all human genes that are in the Arachidonic Acid Metabolism pathway:

> head(keggGet("hsa00590")[[1]]$GENE)
8399
"PLA2G10; phospholipase A2, group X [KO:K01047] [EC:3.1.1.4]"
26279
"PLA2G2D; phospholipase A2, group IID [KO:K01047] [EC:3.1.1.4]"
30814
"PLA2G2E; phospholipase A2, group IIE [KO:K01047] [EC:3.1.1.4]"
50487
"PLA2G3; phospholipase A2, group III [KO:K01047] [EC:3.1.1.4]"
64600
"PLA2G2F; phospholipase A2, group IIF [KO:K01047] [EC:3.1.1.4]"
81579
"PLA2G12A; phospholipase A2, group XIIA [KO:K01047] [EC:3.1.1.4]"

Thanks,
Guido
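For readers who want to script the 'map' to organism-code swap described above, here is a minimal sketch in base R. The helper names `to_organism_pathway()` and `gene_table()` are hypothetical and not part of KEGGREST. The sketch also assumes that the GENE field is either a named character vector (as in the output in this thread) or, as some KEGGREST versions appear to return, an unnamed vector alternating between gene ID and description; check the structure of your own result with `str()` before relying on it.

```r
# Hypothetical helper: turn a reference pathway ID such as "map00590"
# into an organism-specific one such as "hsa00590".
to_organism_pathway <- function(pathway_id, organism = "hsa") {
  id <- sub("^path:", "", pathway_id)   # drop an optional "path:" prefix
  digits <- sub("^[a-z]*", "", id)      # drop the leading letters ("map", "ko", ...)
  paste0(organism, digits)
}

# Hypothetical helper: convert a GENE field into a data frame.
# Assumption: 'gene' is either a named character vector (names = gene IDs)
# or an unnamed vector alternating ID, description, ID, description, ...
gene_table <- function(gene) {
  if (is.null(gene) || length(gene) == 0) {
    return(data.frame(id = character(0), symbol = character(0),
                      description = character(0), stringsAsFactors = FALSE))
  }
  if (!is.null(names(gene))) {
    ids  <- names(gene)
    desc <- unname(gene)
  } else {
    stopifnot(length(gene) %% 2 == 0)   # alternating layout needs an even length
    ids  <- gene[seq(1, length(gene), by = 2)]
    desc <- gene[seq(2, length(gene), by = 2)]
  }
  data.frame(id = ids,
             symbol = sub(";.*$", "", desc),  # text before the first ";"
             description = desc,
             stringsAsFactors = FALSE)
}

# Usage (needs network access and the KEGGREST package, so left commented out):
# library(KEGGREST)
# entry <- keggGet(to_organism_pathway("map00590", "hsa"))[[1]]
# gene_table(entry$GENE)
```

Keeping the parsing separate from the `keggGet()` call means it can be tried on a small hand-made vector first, which is useful because the layout of the GENE field may differ between package versions.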