Question: reactome.db is not updated?
4.7 years ago by
Guangchuang Yu1.1k
China/Guangzhou/Southern Medical University
Guangchuang Yu1.1k wrote:

Dear all,

A user of my package ReactomePA found that a reported enriched pathway does not exist on the Reactome website: http://www.reactome.org/cgi-bin/link?SOURCE=Reactome&ID=1445148

The pathway does exist in reactome.db:
> require(reactome.db)
> get("1445148", reactomePATHID2NAME)
[1] "Homo sapiens: Translocation of GLUT4 to the plasma membrane"

This is why ReactomePA reports it.

After searching the website, I found that the pathway ID was changed to 147867:
http://www.reactome.org/content/detail/REACT_147867

It seems that the reactome.db package is not updated.

Best Regards,

Guangchuang Yu

modified 4.3 years ago by willem.ligtenberg150 • written 4.7 years ago by Guangchuang Yu1.1k

Please post your sessionInfo() so we know what version of reactome.db you are looking at.

> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8       LC_NAME=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] reactome.db_1.50.0   RSQLite_1.0.0        DBI_0.3.1
[4] AnnotationDbi_1.28.1 GenomeInfoDb_1.2.3   IRanges_2.0.0
[7] S4Vectors_0.4.0      Biobase_2.26.0       BiocGenerics_0.12.1

4.7 years ago by
Marc Carlson7.2k
United States
Marc Carlson7.2k wrote:

Hi Pablo,

Yes that doc you were reading is out of date.  I have actually been updating reactome.db for a while now (with occasional input from Willem).  There has been an updated version every release and they are numbered to match the version of reactome that they contain.  But after the most recent update (merely a few weeks ago) the reactome folks pulled the denormalized db dumps from their site.  And now I can see why since it looks like the most recent denormalized dumps were not actually ever updated to reactome version 50.  That's a bummer since it basically means that the most recent package has some stale information in it (which I am working on getting updated).  This is the ultimate cause of both of the different problems that Guangchuang has mentioned.  These fields are all ones that originate in the denormalized database.

Anyhow, I am currently discussing a solution to this problem with them, but the conversation has been proceeding at a pace of about one reply per day, so it hasn't wrapped up yet.  It all looks hopeful though (the reactome guys are helping me understand their long-term plans for this data resource).  We are also discussing the best alternative routes for me to get the information that previously came from their denormalized database.  When the conversation has concluded, I will make an updated reactome.db package, push it online, and post about it here.

Marc

4.5 years ago by
Marc Carlson7.2k
United States
Marc Carlson7.2k wrote:

Hi Guangchuang,

Unfortunately, I am still waiting for some key replacement data from reactome.  On a positive note though, a few days ago the reactome guys emailed me again confirming that they planned to really do this very soon.

Marc

4.3 years ago by
Netherlands
willem.ligtenberg150 wrote:

I can finally say that we have found a solution.

It was a long journey, and we will still need a long term solution, since this time they had to generate the file at Reactome, instead of me being able to do it myself. We will continue working together for a better long term solution.

I have submitted the package, and I hope it will make it into the next release. In the meantime, you can download the latest version here:
https://share.openanalytics.eu/data/public/reactome.php

Why is the file downloaded from the above link only 14M in size, while the one at http://www.bioconductor.org/packages/3.1/data/annotation/html/reactome.db.html is more than 400M?

Both of them still have the issue:

> get("71593", reactomePATHID2EXTID)
[1] "178"  "2992" "8908"
> get("71593", reactomePATHID2NAME)
Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :



OK so Willem was finally able to get the reactome people to give one of us an updated file.  For that he deserves some appreciation!  Yesterday he gave me a package, which I then modified extensively.  This was needed so that things like select() will work.  I also (in order to be consistent with the past) included the full reactome database for release 52 (which is the most recent release and also the one that matches the important denormalized data that Willem secured for us).  This is the package that is in devel.  Unfortunately, we waited so darned long for these people to get us the data that we basically lost/skipped a release of reactome.  As was pointed out here before, the version that was in release was not really fully release 50, and the reactome people never helped us with that version.  So in order to keep things honest and transparent, I changed the version number of the older reactome.db that has been in release to correctly reflect that, which is why it will now say release 48 if you go looking for it.

As for the 'issue' that you are reporting, the trouble is that you are comparing two different bimaps.  So that means that even though both bimaps contain keys of the same type, they are not each a full set of all keys of that kind ever used.  So for example if you do this:

p2el <- as.list(reactomePATHID2EXTID)
length(names(p2el))
## And compare that to this:
p2nl <- as.list(reactomePATHID2NAME)
length(names(p2nl))

What you will notice is that you get different sets of keys.  They are the same kind of key, but they are different sets of that kind of key.  If you look at the intersection of these two sets of keys you will find that they only mostly overlap.

table(names(p2nl) %in% names(p2el))

But they do not overlap completely since the mappings themselves represent very different relationships.  I hope that this helps explain things better.
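The comparison described above can be made explicit with set operations on the key names. The following is only a minimal sketch in base R: `compare_keys` is a hypothetical helper, and the two small lists are made-up stand-ins for `as.list(reactomePATHID2EXTID)` and `as.list(reactomePATHID2NAME)`, not real Reactome content.

```r
# Hypothetical helper: compare the key sets of two named lists.
compare_keys <- function(a, b) {
  ka <- names(a)
  kb <- names(b)
  list(
    shared = intersect(ka, kb),  # keys present in both maps
    only_a = setdiff(ka, kb),    # keys with no entry in b
    only_b = setdiff(kb, ka)     # keys with no entry in a
  )
}

# Made-up stand-ins for the two bimaps after as.list(); IDs are illustrative.
p2el <- list(pathway_1 = c("178", "2992"),
             pathway_2 = "8908",
             reaction_1 = "178")
p2nl <- list(pathway_1 = "Example pathway one",
             pathway_2 = "Example pathway two")

cmp <- compare_keys(p2el, p2nl)
cmp$only_a  # keys that map to genes but have no name entry
```

With reactome.db loaded, the same helper could presumably be called as `compare_keys(as.list(reactomePATHID2EXTID), as.list(reactomePATHID2NAME))` to list the IDs that appear in only one of the two maps.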

Also: I noticed while replying to you that the keys method for the "PATHID" keytype is not returning as many results as it should.  So I have patched that behavior and will be checking it in soon.  In the future, that result should be more complete than it is now.

Marc

Thanks Marc. Now it's very clear.

I really appreciate both of your efforts in maintaining reactome.db.

Bests,

Guangchuang


The size difference is because the one that is published on Bioconductor also includes a full SQLite version of the entire Reactome database. Useful for people who want to go beyond the mappings provided in the package itself.

The other issue is caused by an error in the query that was run.
PATHID2EXTID also contains reactions, not only pathways, whereas PATHID2NAME actually only has pathways.
We will try to address this before release.

Thanks Willem.

4.7 years ago by
Marc Carlson7.2k
United States
Marc Carlson7.2k wrote:

Hi Guangchuang,

I actually did update this package to version 50 of the reactome database for the most recent release.  But now I am searching for the source of the issue that you mentioned.

I will post again here when I work out what happened with these latest files from reactome.

Marc

4.7 years ago by
Guangchuang Yu1.1k
China/Guangzhou/Southern Medical University
Guangchuang Yu1.1k wrote:

Dear Marc,

There is also another issue. Some valid pathway IDs in reactome.db don't have a pathway name.

## > get("5493857", reactomePATHID2EXTID)
##  [1] "510850" "523328" "282187" "282188" ...
## > get("5493857", reactomePATHID2NAME)
## Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :
##   value for "5493857" not found

I have a dirty hack to solve it by removing these pathIDs.
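A workaround of this kind might look like the sketch below. It is not the actual ReactomePA code: `drop_unnamed` is a hypothetical helper, and the toy lists stand in for `as.list(reactomePATHID2EXTID)` and `as.list(reactomePATHID2NAME)`.

```r
# Hypothetical helper: keep only the IDs that also have a pathway name.
drop_unnamed <- function(path2ext, path2name) {
  path2ext[names(path2ext) %in% names(path2name)]
}

# Toy data: the gene IDs for "5493857" are the four shown above;
# "example_id" and its values are made up for illustration.
path2ext <- list(
  "5493857" = c("510850", "523328", "282187", "282188"),
  "example_id" = c("1", "2")
)
path2name <- list("example_id" = "Example pathway name")

named_only <- drop_unnamed(path2ext, path2name)
names(named_only)  # IDs that survive the filter
```

Filtering this way hides the unnamed IDs from downstream results but does not fix the underlying data, so it is best treated as a stopgap.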

Can you also check this issue?

Bests,

Guangchuang

4.7 years ago by
University of Cambridge, UK
Pablo Moreno0 wrote:

Hi,

The reactome.db SQLite database was generated from files that are no longer available for download from Reactome.org (a denormalized version of the database, from 2010 according to the doc, but that might be inaccurate). Please see the following thread by Willem (the maintainer of reactome.db):

C: reactome.db: reactome IDs not mapped to pathway names

All the best,

Pablo

4.5 years ago by
Guangchuang Yu1.1k
China/Guangzhou/Southern Medical University
Guangchuang Yu1.1k wrote:

Any updates?