Oliver Ruebenacker
Hello,
I created a prototype that can read RDF (using rJava and OpenRDF
Sesame) and turn it into a data frame.
Then I discovered that Egon Willighagen did almost the same and
created RRDF (except that he used Jena instead of Sesame).
RRDF does not provide a simple method to get a data frame, but that
should not be hard to add. It is probably less work than turning my
prototype into a deployable package.
Take care
Oliver
On Fri, Jul 20, 2012 at 3:02 PM, Paul Shannon <pshannon at fhcrc.org> wrote:
> Hi Oliver,
>
> Just checking in prior to the Bioconductor 2012 conference. Have you
> had any luck with parsing BioPAX format as RDF triples?
>
> Thanks!
>
> - Paul
>
> On Jun 16, 2012, at 3:10 AM, Oliver Ruebenacker wrote:
>
>> Hello,
>>
>> Thanks a lot for the endorsement!
>>
>> I will try to create a prototype in the next few days, and then you
>> can probably advise me on how to turn that into a package of the
>> desired quality.
>>
>> Take care
>> Oliver
>>
>> On Fri, Jun 15, 2012 at 6:08 PM, Paul Shannon <pshannon at fhcrc.org> wrote:
>>> Oliver and Martin,
>>>
>>> It would be very helpful to have easy access to BioPAX data in
>>> Bioconductor.
>>>
>>> Just now, at the weekly Bioconductor dev-team meeting, we discussed
>>> your ideas, and want to endorse them. Oliver's proposal to parse the
>>> RDF triples into a data.frame has lots to recommend it. It would be
>>> immediately useful, and yet also allow for more sophisticated uses
>>> later. With these relationships in R, annotated as BioPAX data often
>>> are, we can imagine interested parties writing S4 classes which use
>>> the data, which might provide flexible querying capabilities, and be
>>> able to transform those triples into graphs and networks, for further
>>> computation and display.
>>>
>>> Please let us know if we can help.
>>>
>>> - Paul
>>>
>>>
>>> On Jun 15, 2012, at 12:23 PM, Oliver Ruebenacker wrote:
>>>
>>>> Hello Martin,
>>>>
>>>> I don't have code in R to test yet, but I do have extensive
>>>> experience handling BioPAX in Java, so I'm assuming reading BioPAX
>>>> using rJava should not be too difficult.
>>>>
>>>> The best target format depends on what people would like to do with
>>>> the data. For visualization, a bi-partite graph in a popular
>>>> graph-layout package should be best. Is there any particular graph
>>>> package in Bioconductor or R in general you would recommend?
>>>>
>>>> For actual analysis, people probably have more specific requirements.
>>>>
>>>> BioPAX is a format based on RDF/OWL, which in turn is based on
>>>> organizing data in triples, which could be stored in a three-column
>>>> data frame (or perhaps with a fourth column for data type). For
>>>> example (incomplete, for illustration only):
>>>>
>>>> ex:mapPhosphorylization rdf:type bp:BiochemicalReaction.
>>>> ex:atp rdf:type bp:SmallMolecule.
>>>> ex:adp rdf:type bp:SmallMolecule.
>>>> ex:map rdf:type bp:Protein.
>>>> ex:mapPhosphorylized rdf:type bp:Protein.
>>>> ex:mapPhosphorylization bp:left ex:atp.
>>>> ex:mapPhosphorylization bp:left ex:map.
>>>> ex:mapPhosphorylization bp:right ex:adp.
>>>> ex:mapPhosphorylization bp:right ex:mapPhosphorylized.
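
To make the three-column idea concrete, here is a minimal sketch that splits the example statements above into subject, predicate and object columns. It is written in Python purely for illustration (the thread itself is about R). It handles only the simplified one-statement-per-line form shown above, not real RDF/XML, and the names `parse_triples` and `to_columns` are made up for this sketch; they are not part of RRDF, Sesame or Jena.

```python
# Hypothetical sketch: the example triples as a three-column table.
# All names here are illustrative and do not come from any package.

TRIPLES_TEXT = """
ex:mapPhosphorylization rdf:type bp:BiochemicalReaction.
ex:atp rdf:type bp:SmallMolecule.
ex:adp rdf:type bp:SmallMolecule.
ex:map rdf:type bp:Protein.
ex:mapPhosphorylized rdf:type bp:Protein.
ex:mapPhosphorylization bp:left ex:atp.
ex:mapPhosphorylization bp:left ex:map.
ex:mapPhosphorylization bp:right ex:adp.
ex:mapPhosphorylization bp:right ex:mapPhosphorylized.
"""


def parse_triples(text):
    """Split each 'subject predicate object.' line into a 3-tuple.

    Assumes exactly one statement per line with three whitespace-separated
    terms; this is not a real RDF parser.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        subject, predicate, obj = line.rstrip(".").split()
        rows.append((subject, predicate, obj))
    return rows


def to_columns(rows):
    """Arrange the rows column-wise, the way a data frame stores them."""
    return {
        "subject": [r[0] for r in rows],
        "predicate": [r[1] for r in rows],
        "object": [r[2] for r in rows],
    }


rows = parse_triples(TRIPLES_TEXT)
table = to_columns(rows)

# Print the table, one triple per row.
print(f"{'subject':<26}{'predicate':<12}object")
for s, p, o in rows:
    print(f"{s:<26}{p:<12}{o}")
```

In R the same rows would map naturally onto a data.frame with three character columns, and a fourth column for the data type could be added in the same way.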
>>>>
>>>> Take care
>>>> Oliver
>>>>
>>>> On Fri, Jun 15, 2012 at 3:03 PM, Martin Preusse
>>>> <martin.preusse at googlemail.com> wrote:
>>>>> Hi Oliver,
>>>>>
>>>>> I think there is a lot of interest in a Bioconductor package!
>>>>>
>>>>> Personally, I would like to read pathways stored in the BioPAX
>>>>> format into any kind of graph. It's a philosophical question
>>>>> whether reactions should be nodes or should sit on the edges :)
>>>>> So far I have not used any R graph package. But I assume there are
>>>>> some very generic packages which are flexible enough to support
>>>>> both direct and bi-partite pathway structure. I have used e.g. the
>>>>> JUNG graph API for Java extensively.
>>>>>
>>>>> I'm not sure what you mean by RDF/OWL triples. For me BioPAX is
>>>>> only a format to store a pathway. And I would like to bring it
>>>>> back into its natural form: a network!
>>>>>
>>>>> Do you have any code to test? I have used rJava before. All this
>>>>> RDF and XML file format stuff kind of puzzles me, though :)
>>>>>
>>>>> Cheers
>>>>> Martin
>>>>>
>>>>>
>>>>>
>>>>> On Friday, June 15, 2012 at 18:32, Oliver Ruebenacker wrote:
>>>>>
>>>>>> Hello Martin,
>>>>>>
>>>>>> I'm currently looking into reading BioPAX into R using rJava and
>>>>>> OpenRDF Sesame. If there is interest, I may look into submitting
>>>>>> a package to Bioconductor.
>>>>>>
>>>>>> It would be very helpful if you could tell me what you need the
>>>>>> BioPAX data for, and in what form it would be best for you.
>>>>>> Possible options are:
>>>>>>
>>>>>> - A data frame of the RDF/OWL triples
>>>>>> - A graph of the RDF/OWL triples
>>>>>> - A data frame with one row for each reaction-participant
>>>>>> - A bi-partite graph with nodes for reactions and nodes for substances
>>>>>> - A graph with nodes for substances only, with edges for interactions
>>>>>> - A genetic interaction graph
>>>>>>
>>>>>> This list is roughly sorted from the easiest to the most
>>>>>> difficult to provide.
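
Two of the graph options above can be derived from the reaction-participant table. The following is a rough Python sketch, not code from any existing package: the function names and the `PARTICIPANTS` rows are invented for illustration (reusing the identifiers from the example quoted earlier in the thread), and it assumes that "left" participants are inputs and "right" participants are outputs. That is a simplification, since real BioPAX data may specify the direction of a conversion separately.

```python
# Hypothetical sketch: a bi-partite reaction/substance graph and its
# projection onto substances only. All names are illustrative.
from collections import defaultdict
from itertools import product

# One row per reaction participant: (reaction, side, substance).
PARTICIPANTS = [
    ("ex:mapPhosphorylization", "left", "ex:atp"),
    ("ex:mapPhosphorylization", "left", "ex:map"),
    ("ex:mapPhosphorylization", "right", "ex:adp"),
    ("ex:mapPhosphorylization", "right", "ex:mapPhosphorylized"),
]


def bipartite_edges(participants):
    """Directed edges input -> reaction and reaction -> output.

    Assumes left = input and right = output (a simplification).
    """
    edges = []
    for reaction, side, substance in participants:
        if side == "left":
            edges.append((substance, reaction))
        else:
            edges.append((reaction, substance))
    return edges


def substance_edges(participants):
    """Drop the reaction nodes: link every input to every output.

    Each edge keeps the reaction as a label, so the reaction 'sits on
    the edge' instead of being a node.
    """
    left = defaultdict(list)
    right = defaultdict(list)
    for reaction, side, substance in participants:
        (left if side == "left" else right)[reaction].append(substance)
    edges = []
    for reaction in left:
        for source, target in product(left[reaction], right[reaction]):
            edges.append((source, target, reaction))
    return edges


for edge in bipartite_edges(PARTICIPANTS):
    print("bi-partite:", edge)
for edge in substance_edges(PARTICIPANTS):
    print("substances only:", edge)
```

Either edge list could then be handed to whichever graph package is preferred. The projection loses information when a reaction has several inputs and outputs, which is one reason the bi-partite form is the easier of the two to provide faithfully.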
>>>>>>
>>>>>> Take care
>>>>>> Oliver
>>>>>>
>>>>>> On Thu, Jun 14, 2012 at 10:01 AM, Martin Preusse
>>>>>> <martin.preusse at googlemail.com> wrote:
>>>>>>> Many biological pathway resources provide their data in the BioPAX
>>>>>>> format (http://www.biopax.org/index.php), a special XML format for
>>>>>>> biological interaction networks. Examples are Pathway Commons
>>>>>>> (http://www.pathwaycommons.org/pc/) and Reactome
>>>>>>> (http://www.reactome.org).
>>>>>>>
>>>>>>> A Java library for parsing BioPAX files exists:
>>>>>>> http://www.biopax.org/paxtools.php
>>>>>>>
>>>>>>> Has anybody used BioPAX files with R? Is it possible to read
>>>>>>> BioPAX files into any R-based graph structure? A solution similar
>>>>>>> to the KEGGgraph package for KEGG pathways would be great, since
>>>>>>> more and more databases are starting to use BioPAX.
>>>>>>>
>>>>>>>
>>>>>>> Any ideas are appreciated!
>>>>>>>
>>>>>>> Cheers
>>>>>>> Martin
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Bioconductor mailing list
>>>>>>> Bioconductor at r-project.org
>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>> Search the archives:
>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Oliver Ruebenacker
>>>>>> Bioinformatics Consultant (http://www.knowomics.com/wiki/Oliver_Ruebenacker)
>>>>>> Knowomics, The Bioinformatics Network (http://www.knowomics.com)
>>>>>> SBPAX: Turning Bio Knowledge into Math Models (http://www.sbpax.org)
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>
--
Java Developer (Bioinformatics) at PanGenX (http://www.pangenx.com)
President and Founder of Knowomics
(http://www.knowomics.com/wiki/Oliver_Ruebenacker)
Consultant at Predictive Medicine
(http://predmed.com/people/oliverruebenacker.html)
SBPAX: Turning Bio Knowledge into Math Models (http://www.sbpax.org)