Accession ID Conversion and Missing Results
Dario Strbenac ★ 1.5k

Rarely, select seems to return missing values, although there should be a result. Consider

database <- = 9606)
select(x = database, keys = c("P01613", "P01861"), columns = "GENES", keytype = "UNIPROTKB")
Getting extra data for P01861
'select()' returned 1:1 mapping between keys and columns
1    P01613  <NA>
2    P01861 IGHG4

If you check the UniProt website, both P01613 and P01861 have a gene symbol. Why do I get NA for P01613?
Entering edit mode

The answer is at the top of the first page you showed. Note that what you got isn't P01613. It's P01593. When you query UniProt directly with a deprecated UniProt KB ID, it silently converts to the new one and presents the page.

Internally, the package first gets all the available keys and then removes any keys in your query that don't match those available for the species you are interested in. This is in some sense necessary, as you can run into problems if you provide UniProt KB IDs that aren't from the species you are asking about. Under the hood, a URI is generated and sent to UniProt. What you are sending right now is

after having the second ID stripped off because it's not current. Ideally you would send both, because UniProt is happy to return what you are asking for, and is even nice enough not to map to the new KB ID: …,P01613

If you paste that into a browser, you will see it returns what you expect. However, if you add a mouse ID as well: …,P01613,Q8K3W0

You still get all the results, only now there is a mouse symbol that's infiltrated your results.
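
The filtering step described above can be illustrated with a small base R sketch. It does not call UniProt.ws; `available_keys`, `filter_keys`, and the contents of the vectors are hypothetical stand-ins for whatever the package holds internally. They are chosen only to show why a deprecated accession drops out of the query, and why a foreign-species ID would too.

```r
# Hypothetical stand-in for the set of current accessions for one species.
# The real package obtains this from UniProt; these values are illustrative.
available_keys <- c("P01593", "P01861")

# Keep only the query keys that appear in the available set.
filter_keys <- function(query, available) {
  query[query %in% available]
}

query <- c("P01613", "P01861", "Q8K3W0")
sent <- filter_keys(query, available_keys)

# Keys that were silently dropped before the request was built.
dropped <- setdiff(query, sent)

# Comma-separated form of the surviving keys (assumed here to mirror how
# the IDs are joined at the end of the query URI).
query_string <- paste(sent, collapse = ",")
```

Comparing `query` with `sent` like this is also a quick way to spot which of your own keys would come back as NA.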

Entering edit mode

So, is there a way to convert the IDs from such legacy data sets automatically? Is there backwards compatibility built into the R package?

Entering edit mode

If there were a simple way around this I would have told you rather than explain why it doesn't do what you expect. I mean, what's the profit in telling you why it doesn't do what you want if I can just say 'do it this way'?

You are free to fork the package and then modify lines 116-120 to ignore any keys that aren't current (or from the species you are querying on) and then it will be 'backward compatible'.
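
Short of forking the package, one manual workaround is to translate deprecated accessions yourself before calling select. The sketch below is base R only and assumes you maintain the replacement table by hand. `replacements` and `update_accessions` are hypothetical names, and the single entry is the P01613 to P01593 change mentioned earlier in this thread.

```r
# Hand-maintained map from deprecated accession to its current replacement.
# Only the one pair discussed in this thread is included.
replacements <- c(P01613 = "P01593")

# Swap any deprecated keys for their replacements; leave the rest untouched.
update_accessions <- function(keys, map) {
  hit <- keys %in% names(map)
  keys[hit] <- unname(map[keys[hit]])
  keys
}

legacy <- c("P01613", "P01861")
current <- update_accessions(legacy, replacements)
```

The updated vector can then be passed as `keys` to select, with `legacy` kept alongside so the results can be matched back to the original IDs.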

