KEGGgraph: gene ids for nodes in a pathway


Tim Smith ★ 1.1k

@tim-smith-1532


Hi David,

I tried the package to get at the genes for each node, using the code you suggested. However, it gives an error for some pathways. For example:

    library(KEGGgraph)
    # Glycolysis pathway
    xfile <- system.file("/extdata/hsa00010.xml", package="KEGGgraph")
    p <- parseKGML(xfile)

This gives 'Error: does not seem to be XML, nor to identify a file name'.

Is there a way around this?

Thanks again!

From: Jitao David Zhang
Sent: Tuesday, May 5, 2009 1:21:04 PM
Subject: Re: [BioC] KEGG: gene ids for nodes in a pathway

> Hi Tim,
>
> Using the KEGGgraph package may solve the problem. As an example:
>
> library(KEGGgraph)
> # use human MAPK pathway as an example
> xfile <- system.file("/extdata/hsa04010.xml", package="KEGGgraph")
> p <- parseKGML(xfile)
> pNodes <- nodes(p)
> displayNames <- sapply(pNodes, getDisplayName)
> geneids <- sapply(pNodes, function(x) translateKEGG2GeneID(getName(x)))
>
> The displayNames now contain the labels (the visible names of the nodes), while the geneids are the Entrez Gene IDs (in the human case) of the genes contained in each node.
>
> To install KEGGgraph, just type
>
> source("http://www.bioconductor.org/biocLite.R")
> biocLite("KEGGgraph")
>
> Best wishes,
> David
>
>> Hi,
>>
>> I wanted a list of genes for a particular pathway arranged nodewise. For example, if I select the Jak-STAT pathway ("http://www.genome.jp/kegg/pathway/hsa/hsa04630.html"), how do I get the Entrez IDs of the genes associated with the node 'STAT'? Currently, I use the following code:
>>
>> x <- toTable(org.Hs.egPATH)
>>
>> and then select the genes associated with a particular pathway (e.g. for Jak-STAT: "04630"). But this gives the entire set of genes associated with the pathway. Is there a way to get the Entrez IDs of the genes associated with each of the nodes ('JAK', 'STAT', 'STAM', 'PIAS' etc.) in the pathway?
>>
>> thanks!
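To get from the per-node results in David's snippet to "all genes under the node labelled STAT", one option is to match on the display names and pool the IDs of the matching nodes. The following is only a sketch in base R: `genes_for_label()` is a hypothetical helper, and the two small objects stand in for `displayNames` and `geneids`; their values are made-up placeholders, not real KEGG labels or Entrez IDs.

```r
# Sketch: collect the gene IDs of every node whose display name matches a label.
# `genes_for_label()` is a hypothetical helper, not a KEGGgraph function.
genes_for_label <- function(display_names, gene_ids, pattern) {
  # TRUE for each node whose visible label contains the pattern
  hits <- grepl(pattern, display_names, ignore.case = TRUE)
  # Several nodes can carry the same label, so pool their IDs and drop duplicates
  unique(unlist(gene_ids[hits], use.names = FALSE))
}

# Placeholder stand-ins for displayNames / geneids (values are invented)
display_names <- c("10" = "STAT1, STAT2...", "11" = "JAK1, JAK2...",
                   "12" = "STAT1, STAT2...")
gene_ids <- list("10" = c("1001", "1002"), "11" = c("2001"),
                 "12" = c("1002", "1003"))

stat_genes <- genes_for_label(display_names, gene_ids, "STAT")
print(stat_genes)
```

With the real objects, the call would presumably be `genes_for_label(displayNames, geneids, "STAT")`, assuming the node labels actually contain the text you search for.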

KEGGgraph KEGGgraph • 2.0k views

updated 15.0 years ago by Jitao David Zhang ▴ 340 • written 15.0 years ago by Tim Smith ★ 1.1k


Tony Chiang ▴ 20

@tony-chiang-3438


Hi Tim,

The issue is that in David's example, the XML file, hsa04010.xml, had been downloaded and put into the KEGGgraph package to be used as an example. The file you want, hsa00010.xml, has yet to be downloaded. The system.file command tells the R session to look in the extdata directory of the KEGGgraph package for hsa00010.xml; if the file is not there, you will get an error.

I have not played around with KEGGgraph all that much, but from the example David gave, I would suggest that you download the XML file you want from KEGG onto your desktop or some other working directory. Once you have the file, the xfile object should be a character string giving the path to this file. Then call the parseKGML function on this string.

tony
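The behaviour described above can be checked without parsing anything: `system.file()` returns an empty string when the requested file is not part of the package, and that empty string is what the parser then rejects. Here is a minimal sketch of a guard around it. `bundled_file()` is a hypothetical helper, and the base package `stats` is used in place of KEGGgraph only so that the example runs on any R installation.

```r
# Sketch: fail early when a file is not shipped with a package.
# `bundled_file()` is a hypothetical helper, not part of KEGGgraph.
bundled_file <- function(relative_path, package) {
  # system.file() yields "" if the file (or the package) cannot be found
  path <- system.file(relative_path, package = package)
  if (!nzchar(path)) {
    stop("'", relative_path, "' is not bundled with package '", package,
         "'; download it and pass its path directly", call. = FALSE)
  }
  path
}

# A file that every installed package has resolves to a full path ...
desc <- bundled_file("DESCRIPTION", "stats")

# ... while a file that is not shipped produces a readable error message
msg <- tryCatch(bundled_file("extdata/hsa00010.xml", "stats"),
                error = function(e) conditionMessage(e))
print(msg)
```

Applied to the case in this thread, `bundled_file("extdata/hsa00010.xml", "KEGGgraph")` would stop with this message instead of passing an empty string on to `parseKGML()`.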

15.0 years ago • Tony Chiang ▴ 20


Jitao David Zhang ▴ 340

@jitao-david-zhang-3188


Hi Tim and Tony,

Thanks to Tony for the reply; I share his impression. A few KGML files ship with the package release for demonstration purposes. To use any other file (I am sure 00010.xml is not among them), first download it from ftp://ftp.genome.jp/pub/kegg/xml/, for example ftp://ftp.genome.jp/pub/kegg/xml/organisms/hsa/hsa00010.xml. Save the file to any destination you like (say /my/KGML/dir/goes/here/hsa00010.xml), then run:

    xfile <- "/my/KGML/dir/goes/here/hsa00010.xml"
    p <- parseKGML(xfile)

Best wishes,
David

--
Jitao David Zhang
Computational Biology Ph.D.
Division of Molecular Genome Analysis
DKFZ, Heidelberg D-69120, Germany
http://sites.google.com/site/jazzydevzoo/
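The download-then-parse steps above can be wrapped so that an existing local copy is reused and the download only happens when the file is missing. This is a sketch under assumptions: `ensure_kgml()` is a hypothetical helper built on base R's `download.file()`, and whether the FTP address quoted in this thread is still reachable is not something this example checks.

```r
# Sketch: reuse a local KGML file if present, otherwise try to download it.
# `ensure_kgml()` is a hypothetical helper, not part of KEGGgraph.
ensure_kgml <- function(destfile, url = NULL) {
  if (!file.exists(destfile)) {
    if (is.null(url)) {
      stop("No local copy at '", destfile, "' and no URL given", call. = FALSE)
    }
    # Create the target directory if needed, then fetch the file
    dir.create(dirname(destfile), recursive = TRUE, showWarnings = FALSE)
    utils::download.file(url, destfile, mode = "wb")
  }
  normalizePath(destfile)
}

# Offline demonstration with a temporary placeholder file (not real KGML)
tmp <- tempfile(fileext = ".xml")
writeLines("<pathway/>", tmp)
found <- ensure_kgml(tmp)

# With KEGGgraph installed, the result would then be parsed as in the post:
# xfile <- ensure_kgml("/my/KGML/dir/goes/here/hsa00010.xml", url = "<KGML URL>")
# p <- KEGGgraph::parseKGML(xfile)
```

Keeping the URL as an argument rather than hard-coding it means the helper does not depend on any particular download location staying available.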

15.0 years ago • Jitao David Zhang ▴ 340


Thanks Tony & David, that worked.

15.0 years ago • Tim Smith ★ 1.1k
