Best way to convert uniprot accessions to entrez gene identifiers in R
Aditya ▴ 160
@aditya-7667
Last seen 2.5 years ago
Germany

What is the best way to convert uniprot accessions to entrez gene identifiers?

What is the best way to reverse the map org.Hs.eg.db::org.Hs.egUNIPROT ?

Is there any better approach (a pity there is no 'org.Hs.uniprot.db' package)?

Just discovered revmap()

revmap() is part of the old BiMap interface. You will be better served using select().

@james-w-macdonald-5106
Last seen 2 hours ago
United States

You could either use the UniProt.ws package, or as you note, you could use the org.Hs.eg.db package. But you don't have to reverse any maps, as the BiMap interface is simply an artifact of a bygone era. These days the cool kids use select().

> library(org.Hs.eg.db)
> uniprots <- Rkeys(org.Hs.egUNIPROT)[1:5]
> select(org.Hs.eg.db, keys = uniprots, columns = "ENTREZID", keytype = "UNIPROT")
  UNIPROT ENTREZID
1  P04217        1
2  V9HWD8        1
3  P01023        2
4  P18440        9
5  Q400J6        9

Or, with UniProt.ws:

> library(UniProt.ws)
Loading required package: RCurl
Loading required package: bitops
> up <- UniProt.ws(taxId=9606)
> select(up, uniprots, "ENTREZ_GENE")
Getting mapping data for P04217 ... and P_ENTREZGENEID
  UNIPROTKB ENTREZ_GENE
1    P04217           1
2    V9HWD8           1
3    P01023           2
4    P18440           9
5    Q400J6           9

Thanks James! Love your 'cool kids' motivation to switch to the new interface :-). Will definitely do!

Small additional question: should I use

import org.Hs.eg.db
importFrom AnnotationDbi select

or

importFrom org.Hs.eg.db org.Hs.eg.db
importFrom AnnotationDbi select

 

I assume this is a package you are developing, and you are asking about your NAMESPACE file?

Yep, my package has a dependency on the functionality we have been discussing here

The org.Hs.eg.db package is just a wrapper that allows easy interrogation of an underlying SQLite database. So if you need that package specifically, I would just put it in the Depends field of your DESCRIPTION file.

You should note that select() will return multiple rows for any one-to-many mappings. As an example, say you have a UniProt ID that maps to two Entrez Gene IDs (this may or may not occur; I haven't checked). In that situation select() will return a data.frame like

  UNIPROT ENTREZID
1  P12345    23434
2  P12345   321234

If your code naively expects just one Entrez ID per UniProt ID, you will have problems. If you are just mapping from one ID type to another, you can use mapIds() with multiVals = "first", or with a different multiVals setting, depending on how you want to handle the duplicates. Either way, that is an easy way to control for one-to-many mappings.
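To make the one-to-many point concrete, here is a minimal sketch. The toy data.frame only imitates the shape of select() output, using the made-up IDs from the example above plus one pair from the earlier output; the final mapIds() call runs only if org.Hs.eg.db happens to be installed, and the two keys passed to it are just taken from the earlier example.

```r
# Toy data.frame shaped like select() output. P12345 and its two
# Entrez IDs are the made-up values from the post, not real mappings.
res <- data.frame(
  UNIPROT  = c("P12345", "P12345", "P04217"),
  ENTREZID = c("23434", "321234", "1"),
  stringsAsFactors = FALSE
)

# Keep only the first Entrez ID per UniProt accession
# (the same idea as multiVals = "first").
first_hit <- res[!duplicated(res$UNIPROT), ]

# Or keep every hit, as a list keyed by accession
# (the same idea as multiVals = "list").
all_hits <- split(res$ENTREZID, res$UNIPROT)

# With the real annotation package, if it is installed:
if (requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  ids <- AnnotationDbi::mapIds(
    org.Hs.eg.db::org.Hs.eg.db,
    keys      = c("P04217", "P01023"),
    column    = "ENTREZID",
    keytype   = "UNIPROT",
    multiVals = "first"
  )
  print(ids)
}
```

Unlike select(), mapIds() returns a named vector (or a list, with multiVals = "list") rather than a data.frame, so there is exactly one element per input key.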

And back to the question at hand, if you are only using select() or mapIds(), then you can just importFrom, rather than importing the whole namespace.
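A different option from either set of import directives, not something suggested in this thread, is to skip the imports and use fully qualified `::` calls inside your package functions. A rough sketch, where uniprot_to_entrez() is a hypothetical helper name and both packages are assumed to be listed in your DESCRIPTION:

```r
# Hypothetical helper: map UniProt accessions to Entrez Gene IDs
# without relying on any NAMESPACE imports.
uniprot_to_entrez <- function(accessions) {
  if (!requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
    stop("Package 'org.Hs.eg.db' is required for this mapping.")
  }
  # select() may return several rows per accession (one-to-many).
  AnnotationDbi::select(
    org.Hs.eg.db::org.Hs.eg.db,
    keys    = accessions,
    columns = "ENTREZID",
    keytype = "UNIPROT"
  )
}
```

You would call it as uniprot_to_entrez(c("P04217", "P01023")). The cost is slightly more verbose code; the benefit is that it is obvious at each call site where select() and the annotation object come from.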

Thank you so much James!
