Gene Ontology: Shortest path from root to node
Guest User ★ 13k
@guest-user-4897
Last seen 10.3 years ago
Given some GO BP terms for a gene, I wish to find out which of the terms has the more specific meaning. To do that, I want the length of the shortest path between the BP root term (GO:0008150) and a given term. Is there a suitable way to do that using an R package? Something equivalent to:

    my $length = $node->lengthOfShortestPathToRoot;

in Perl's "GO-TermFinder" package.

Thanks in advance

-- output of sessionInfo():

> sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

--
Sent via the guest posting facility at bioconductor.org.
Marc Carlson ★ 7.2k
@marc-carlson-2264
Last seen 8.4 years ago
United States
Hi Nicos,

You could use the GO.db package to get at this. In there you will find an object called GOBPANCESTOR, which acts like a classic R environment object and can be used with get() to pull out the ancestor terms of a given term all the way back to the root.

So for your example you could have done this:

    library(GO.db)
    get("GO:0008150", GOBPANCESTOR)

And you can see that the only ancestor of this term is in fact the root node: "all"

What about terms further down? The same trick works for all terms:

    get("GO:0006955", GOBPANCESTOR)

So you probably want to do something a bit like this:

    length(get("GO:0006955", GOBPANCESTOR))

And (for example) compare that to:

    length(get("GO:0008150", GOBPANCESTOR))

etc.

Of course it's all a little more complicated than that, because the gene ontologies are actually DAGs (so a term can have more than one route back to the main node), and so your ancestor list may be longer than the simple path back to the "all" node. In fact, this is true in the example above for the further-down term "GO:0006955", which has two routes back to the main node, and hence its "distance" (as hinted at by length) has been inflated by one in this case.

Anyhow, I hope this helps,

Marc
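To make the caveat about DAGs concrete, here is a minimal base-R sketch on a toy graph. The term names ("A", "B", "C", "D") and the `parents` list are hypothetical stand-ins, not real GO IDs; with GO.db one could presumably build a similar list with as.list(GOBPPARENTS). It only shows that the number of ancestors can be larger than the number of edges on the shortest route to the root.

```r
## Toy DAG (hypothetical term names): each entry lists a term's immediate parents.
parents <- list(
  all  = character(0),
  root = "all",
  A    = "root",
  B    = "root",
  C    = "A",
  D    = c("C", "B")   # D has two parents, so two routes back to root
)

## Collect every ancestor of a term by walking up through all of its parents.
ancestors <- function(term, parents) {
  found <- character(0)
  todo <- parents[[term]]
  while (length(todo) > 0) {
    cur <- todo[1]
    todo <- todo[-1]
    if (!(cur %in% found)) {
      found <- c(found, cur)
      todo <- c(todo, parents[[cur]])
    }
  }
  found
}

n_anc_D <- length(ancestors("D", parents))  # counts C, B, A, root and all
n_anc_B <- length(ancestors("B", parents))  # counts root and all
```

In this toy graph D is only two edges from "root" (via B), yet it has five ancestors, so the ancestor count is at best an upper bound on depth.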
Also, I forgot to mention:

You can get the immediate parents of any term by using the appropriate PARENTS object. So for example:

    get("GO:0006955", GOBPPARENTS)

will tell you that there are TWO immediate parents for this term, which is the situation I was describing earlier.

Marc
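Since a parents lookup gives one step up the graph at a time, every route back to the root can be listed by following it recursively. The sketch below does this in base R on a toy graph; the term names and the `routes_to` helper are hypothetical, and the approach is only sensible for small graphs because the number of routes can grow quickly.

```r
## Toy DAG (hypothetical term names): each entry lists a term's immediate parents.
parents <- list(
  all  = character(0),
  root = "all",
  A    = "root",
  B    = "root",
  C    = "A",
  D    = c("C", "B")
)

## Recursively list every route from `term` up to `target`.
## Each route is a character vector starting at `term` and ending at `target`.
routes_to <- function(term, target, parents) {
  if (term == target) return(list(term))
  out <- list()
  for (p in parents[[term]]) {
    for (r in routes_to(p, target, parents)) {
      out[[length(out) + 1]] <- c(term, r)
    }
  }
  out
}

routes <- routes_to("D", "root", parents)
## Number of edges on each route is one less than the number of terms on it.
route_lengths <- vapply(routes, function(r) length(r) - 1L, integer(1))
shortest <- min(route_lengths)
```

Here `routes` holds the two routes D, C, A, root and D, B, root, and `shortest` is the edge count of the shorter one.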
A few more comments on this, because to answer your question you can also use some of the nice graph tools in the project:

    ## So you can load a few useful libraries
    library(GO.db)   ## for GO
    library(graph)   ## for basic graph containers etc.
    library(RBGL)    ## for algorithms to compute distance etc.

    ## First, let's make the PARENTS object into a nice tabular format
    ## with the toTable() method:
    xx = toTable(GOBPPARENTS)

    ## For now, let's assume that we don't care what kind of relationship
    ## is represented, and just make the whole table into a graph with all
    ## edge weights set to 1. Setting the weights to 1 makes it easier to
    ## compute distances later on.
    ## 'from / to' columns, with weights 'W' all set to 1
    gg = ftM2graphNEL(as.matrix(xx[, 1:2]), W=rep(1, dim(xx)[1]))

    ## If you want to visualize your graph, you can use Rgraphviz
    library(Rgraphviz)   ## for plotting graph objects

    ## But let's not draw everything. Instead, let's grab a subgraph
    ## containing only the nodes that we care about...
    sg <- subGraph(c(get("GO:0006955", GOBPANCESTOR), "GO:0006955"), gg)

    ## Then plot it
    plot(sg)

    ## And then we can use tools from RBGL to compute distance in terms
    ## of the number of edges...
    dijkstra.sp(sg, "GO:0006955")$distances["GO:0008150"]
    dijkstra.sp(sg, "GO:0006955")$distances["all"]

Hope this helps,

Marc
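When every edge has weight 1, a plain breadth-first search gives the same distances as Dijkstra's algorithm, so the original question (which of several terms is deepest) can be sketched without any graph package. The following base-R example uses a toy graph; the term names and the `depth_to` helper are hypothetical, and treating the shortest distance to the root as a measure of specificity is itself an assumption rather than something GO guarantees.

```r
## Toy DAG (hypothetical term names): each entry lists a term's immediate parents.
parents <- list(
  all  = character(0),
  root = "all",
  A    = "root",
  B    = "root",
  C    = "A",
  D    = c("C", "B")
)

## Breadth-first search upward from `term`. Returns the number of edges on
## the shortest route to `target`, or NA if `target` is not an ancestor.
depth_to <- function(term, target, parents) {
  frontier <- term
  seen <- term
  d <- 0L
  while (length(frontier) > 0) {
    if (target %in% frontier) return(d)
    nxt <- unique(unlist(parents[frontier], use.names = FALSE))
    frontier <- setdiff(nxt, seen)   # only expand terms not visited yet
    seen <- c(seen, frontier)
    d <- d + 1L
  }
  NA_integer_
}

## Compare several terms and pick the one furthest from the root.
terms <- c("A", "B", "D")
depths <- vapply(terms, depth_to, integer(1), target = "root", parents = parents)
most_specific <- names(depths)[which.max(depths)]
```

Ties are resolved by which.max() in favor of the first term, so in practice one may want to inspect `depths` directly instead of relying on `most_specific` alone.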