BioPAX parsing
@oliver-ruebenacker-5312
Hello,

I created a prototype that can read RDF (using rJava and OpenRDF Sesame) and turn it into a data frame. Then I discovered that Egon Willighagen did almost the same and created RRDF (except that he used Jena instead of Sesame). RRDF does not provide a simple method to get a data frame, but that should not be hard to add. It is probably less work than turning my prototype into a deployable package.

Take care
Oliver

On Fri, Jul 20, 2012 at 3:02 PM, Paul Shannon <pshannon at fhcrc.org> wrote:
> Hi Oliver,
>
> Just checking in prior to the Bioconductor 2012 conference. Have you had any luck with parsing BioPAX format as RDF triples?
>
> Thanks!
>
> - Paul
>
> On Jun 16, 2012, at 3:10 AM, Oliver Ruebenacker wrote:
>
>> Hello,
>>
>> Thanks a lot for the endorsement!
>>
>> I will try to create a prototype in the next few days, and then you can
>> probably advise me on how to turn that into a package of the desired
>> quality.
>>
>> Take care
>> Oliver
>>
>> On Fri, Jun 15, 2012 at 6:08 PM, Paul Shannon <pshannon at fhcrc.org> wrote:
>>> Oliver and Martin,
>>>
>>> It would be very helpful to have easy access to BioPAX data in Bioconductor.
>>>
>>> Just now, at the weekly Bioconductor dev-team meeting, we discussed your ideas, and want to endorse them. Oliver's proposal to parse the RDF triples into a data.frame has a lot to recommend it. It would be immediately useful, and yet also allow for more sophisticated uses later. With these relationships in R, annotated as BioPAX data often are, we can imagine interested parties writing S4 classes which use the data, which might provide flexible querying capabilities, and be able to transform those triples into graphs and networks, for further computation and display.
>>>
>>> Please let us know if we can help.
>>>
>>> - Paul
>>>
>>>
>>> On Jun 15, 2012, at 12:23 PM, Oliver Ruebenacker wrote:
>>>
>>>> Hello Martin,
>>>>
>>>> I don't have code in R to test yet, but I do have extensive
>>>> experience handling BioPAX in Java, so I'm assuming reading BioPAX
>>>> using rJava should not be too difficult.
>>>>
>>>> The best target format depends on what people would like to do with
>>>> the data. For visualization, a bi-partite graph in a popular
>>>> graph-layout package should be best. Is there any particular graph
>>>> package in Bioconductor or R in general you would recommend?
>>>>
>>>> For actual analysis, people probably have more specific requirements.
>>>>
>>>> BioPAX is a format based on RDF/OWL, which in turn is based on
>>>> organizing data in triples, which could be stored in a three-column
>>>> data frame (or perhaps with a fourth column for data type). For example
>>>> (incomplete, for illustration only):
>>>>
>>>> ex:mapPhosphorylization rdf:type bp:BiochemicalReaction.
>>>> ex:atp rdf:type bp:SmallMolecule.
>>>> ex:adp rdf:type bp:SmallMolecule.
>>>> ex:map rdf:type bp:Protein.
>>>> ex:mapPhosphorylized rdf:type bp:Protein.
>>>> ex:mapPhosphorylization bp:left ex:atp.
>>>> ex:mapPhosphorylization bp:left ex:map.
>>>> ex:mapPhosphorylization bp:right ex:adp.
>>>> ex:mapPhosphorylization bp:right ex:mapPhosphorylized.
>>>>
>>>> Take care
>>>> Oliver
>>>>
>>>> On Fri, Jun 15, 2012 at 3:03 PM, Martin Preusse
>>>> <martin.preusse at googlemail.com> wrote:
>>>>> Hi Oliver,
>>>>>
>>>>> I think there is a lot of interest in a Bioconductor package!
>>>>>
>>>>> Personally, I would like to read pathways stored in the BioPAX format into any kind of graph. It's a philosophical question whether reactions should be nodes or should sit on the edges :) So far I have not used any R graph package. But I assume there are some very generic packages which are flexible enough to support both direct and bi-partite pathway structure. I used e.g. the JUNG graph API for Java extensively.
>>>>>
>>>>> I'm not sure what you mean by RDF/OWL triples. For me BioPAX is only a format to store a pathway. And I would like to bring it back into its natural form: a network!
>>>>>
>>>>> Do you have any code to test? I have used rJava before. All this RDF and XML file format stuff kind of puzzles me though :)
>>>>>
>>>>> Cheers
>>>>> Martin
>>>>>
>>>>>
>>>>> On Friday, June 15, 2012 at 18:32, Oliver Ruebenacker wrote:
>>>>>
>>>>>> Hello Martin,
>>>>>>
>>>>>> I'm currently looking into reading BioPAX into R using rJava and
>>>>>> OpenRDF Sesame. If there is interest, I may look into submitting
>>>>>> a package to Bioconductor.
>>>>>>
>>>>>> It would be very helpful if you could tell me what you need the
>>>>>> BioPAX data for, and in what form it would be best for you. Possible
>>>>>> options are:
>>>>>>
>>>>>> - A data frame of the RDF/OWL triples
>>>>>> - A graph of the RDF/OWL triples
>>>>>> - A data frame with one row for each reaction-participant
>>>>>> - A bi-partite graph with nodes for reactions and nodes for substances
>>>>>> - A graph with nodes for substances only, with edges for interactions
>>>>>> - A genetic interaction graph
>>>>>>
>>>>>> This list is roughly sorted from the easiest to the most
>>>>>> difficult to provide.
>>>>>>
>>>>>> Take care
>>>>>> Oliver
>>>>>>
>>>>>> On Thu, Jun 14, 2012 at 10:01 AM, Martin Preusse
>>>>>> <martin.preusse at googlemail.com> wrote:
>>>>>>> Many biological pathway resources provide their data in the BioPAX format (http://www.biopax.org/index.php), a special XML format for biological interaction networks. Examples are Pathway Commons (http://www.pathwaycommons.org/pc/) and Reactome (http://www.reactome.org).
>>>>>>>
>>>>>>> A Java library for parsing BioPAX files exists: http://www.biopax.org/paxtools.php
>>>>>>>
>>>>>>> Has anybody used BioPAX files with R? Is it possible to read BioPAX files into any R-based graph structure? A solution similar to the KEGGgraph package for KEGG pathways would be great, since more and more databases are starting to use BioPAX.
>>>>>>>
>>>>>>> Any ideas are appreciated!
>>>>>>>
>>>>>>> Cheers
>>>>>>> Martin
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Bioconductor mailing list
>>>>>>> Bioconductor at r-project.org
>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>>
>>>>>> --
>>>>>> Oliver Ruebenacker
>>>>>> Bioinformatics Consultant (http://www.knowomics.com/wiki/Oliver_Ruebenacker)
>>>>>> Knowomics, The Bioinformatics Network (http://www.knowomics.com)
>>>>>> SBPAX: Turning Bio Knowledge into Math Models (http://www.sbpax.org)

--
Java Developer (Bioinformatics) at PanGenX (http://www.pangenx.com)
President and Founder of Knowomics (http://www.knowomics.com/wiki/Oliver_Ruebenacker)
Consultant at Predictive Medicine (http://predmed.com/people/oliverruebenacker.html)
SBPAX: Turning Bio Knowledge into Math Models (http://www.sbpax.org)
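
The two simplest target formats discussed in the thread, a three-column table of triples and a bi-partite graph of reactions and substances, can be illustrated with a small language-neutral sketch. It is written in Python rather than R, uses only the example triples quoted above, and is not the prototype or the RRDF code mentioned in the thread. The helpers parse_triples and bipartite_edges are hypothetical names made up for this illustration. The parser handles only the simplified one-line "subject predicate object." notation of the example; real BioPAX files are RDF/XML and would need a proper RDF library. Treating bp:left as input and bp:right as output of a reaction is a simplifying assumption.

```python
# Example triples copied from the thread (incomplete, for illustration only).
TRIPLES_TEXT = """
ex:mapPhosphorylization rdf:type bp:BiochemicalReaction.
ex:atp rdf:type bp:SmallMolecule.
ex:adp rdf:type bp:SmallMolecule.
ex:map rdf:type bp:Protein.
ex:mapPhosphorylized rdf:type bp:Protein.
ex:mapPhosphorylization bp:left ex:atp.
ex:mapPhosphorylization bp:left ex:map.
ex:mapPhosphorylization bp:right ex:adp.
ex:mapPhosphorylization bp:right ex:mapPhosphorylized.
"""


def parse_triples(text):
    """Hypothetical helper: turn 'subject predicate object.' lines into rows.

    Each row is a (subject, predicate, object) tuple, i.e. one row of the
    three-column table described in the thread.
    """
    rows = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop the trailing '.' and split on whitespace into three fields.
        subject, predicate, obj = line.rstrip(".").split()
        rows.append((subject, predicate, obj))
    return rows


def bipartite_edges(rows):
    """Hypothetical helper: derive directed edges of a bi-partite graph.

    Assumption: bp:left participants are inputs (substance -> reaction)
    and bp:right participants are outputs (reaction -> substance).
    """
    edges = []
    for subject, predicate, obj in rows:
        if predicate == "bp:left":
            edges.append((obj, subject))
        elif predicate == "bp:right":
            edges.append((subject, obj))
    return edges


rows = parse_triples(TRIPLES_TEXT)
edges = bipartite_edges(rows)

# Print the three-column table, then the edge list.
for subject, predicate, obj in rows:
    print(f"{subject}\t{predicate}\t{obj}")
for source, target in edges:
    print(f"{source} -> {target}")
```

In R, the rows would map naturally onto a data.frame with subject, predicate and object columns, and the edge list could be handed to whichever graph package is chosen.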
Pathways Visualization Network graph