Convert UniProt Entry Names into Gene Symbols
Dario Strbenac ★ 1.5k
@dario-strbenac-5916

How can I convert UniProt entry names, such as LYSC_HUMAN, into HUGO gene symbols, such as LYZ? The UNIPROTKB key type seems to use a different identifier format.

library(UniProt.ws)
up <- UniProt.ws(taxId = 9606)  # assumed setup: a UniProt.ws object for human
head(keys(up, "UNIPROTKB"))
[1] "O95825" "Q9Y2J0" "Q13905" "Q5TD94" "Q9HA92" "Q9UHA2"

These are accessions, whereas I want to convert entry names. keytypes(up) does not list anything that looks like an entry name key type. Can any other Bioconductor package do this for me?

Tags: Proteomics, ProteomicsWorkflow, UniProt.ws
@james-w-macdonald-5106

Ideally UniProt.ws would do that natively, since the 'regular' query ID is ACC+ID, where ACC is the accession (the keys you show there) and ID is the entry name (ENTRY_NAME). Unfortunately, all the accessions are downloaded as part of instantiating the UniProt.ws object, and if you then do a query using an entry name you get an error:

> select(up, "LYSC_HUMAN", "ENTREZ_GENE", "UNIPROTKB")
Error in .select(x, keys, columns, keytype) : 
  No data is available for the keys provided.
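Since the error comes from passing an entry name where an accession is expected, it can help to check which kind of identifier you are holding before querying. The following is a minimal sketch in base R, not part of UniProt.ws: `classify_uniprot_id` is a hypothetical helper, the accession pattern is the one published in the UniProt help pages, and the entry name pattern is only a rough format heuristic (MNEMONIC_SPECIES).

```r
# Hypothetical helper (not part of UniProt.ws): guess from the format alone
# whether each identifier is a UniProtKB accession or an entry name.
classify_uniprot_id <- function(ids) {
  # Accession format as documented by UniProt (6 or 10 characters).
  acc_pattern <- paste0("^([OPQ][0-9][A-Z0-9]{3}[0-9]|",
                        "[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2})$")
  # Assumed heuristic for entry names: MNEMONIC_SPECIES, e.g. LYSC_HUMAN.
  entry_pattern <- "^[A-Z0-9]{1,10}_[A-Z0-9]{1,5}$"
  ifelse(grepl(acc_pattern, ids), "accession",
         ifelse(grepl(entry_pattern, ids), "entry_name", "unknown"))
}

ids <- c("P61626", "LYSC_HUMAN", "O95825", "ALBU_HUMAN", "not an id")
print(data.frame(id = ids, type = classify_uniprot_id(ids)))
```

This only inspects the string format; it does not confirm that the identifier actually exists in UniProt.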

The main issue with UniProt.ws is that there are lots of things that can be queried at uniprot.org using the API, but not all of them are included in UniProt.ws. So as people ask for things we can add them. Anyway, with UNIPROTKB_ID and GENENAME added as key types, this now works:

> select(up, c("LYSC_HUMAN","ALBU_HUMAN"), "GENENAME", "UNIPROTKB_ID")
Getting mapping data for LYSC_HUMAN ... and ACC
Getting mapping data for P61626 ... and GENENAME
'select()' returned 1:1 mapping between keys and columns
  UNIPROTKB_ID GENENAME
1   LYSC_HUMAN      LYZ
2   ALBU_HUMAN      ALB

That's the devel version. You could just 'fix' your current install by modifying the keytypes.txt file in the package's extdata directory (I added rows 2 and 3):

> head(read.table(system.file("extdata", "keytypes.txt", package = "UniProt.ws"), sep = "\t"))
            V1       V2       V3
1    UNIPROTKB   ACC+ID   ACC+ID
2 UNIPROTKB_ID       ID       ID
3     GENENAME GENENAME GENENAME
4      UNIPARC    UPARC    UPARC
5     UNIREF50     NF50     NF50
6     UNIREF90     NF90     NF90
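The manual edit described above could be scripted roughly as follows. This is only a sketch under stated assumptions: `add_keytype_rows` is a hypothetical helper, the file is assumed to be headerless and tab-separated with three columns (as the output above suggests), and the example works on a temporary stand-in file rather than the installed one. Whether the position of the new rows matters is not something the thread confirms; the sketch simply mirrors the layout shown.

```r
# Stand-in for the installed file. In a real install the path would come from
# system.file("extdata", "keytypes.txt", package = "UniProt.ws").
keytypes_file <- tempfile(fileext = ".txt")
writeLines(c("UNIPROTKB\tACC+ID\tACC+ID",
             "UNIPARC\tUPARC\tUPARC"), keytypes_file)

# Hypothetical helper: insert new key type rows after row `after`,
# skipping any key type that is already listed in the first column.
add_keytype_rows <- function(path, new_rows, after = 1) {
  kt <- read.table(path, sep = "\t", stringsAsFactors = FALSE)
  names(new_rows) <- names(kt)
  new_rows <- new_rows[!new_rows[[1]] %in% kt[[1]], , drop = FALSE]
  top <- seq_len(nrow(kt)) <= after
  kt <- rbind(kt[top, , drop = FALSE], new_rows, kt[!top, , drop = FALSE])
  write.table(kt, path, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
  invisible(kt)
}

new_rows <- data.frame(V1 = c("UNIPROTKB_ID", "GENENAME"),
                       V2 = c("ID", "GENENAME"),
                       V3 = c("ID", "GENENAME"),
                       stringsAsFactors = FALSE)
add_keytype_rows(keytypes_file, new_rows)
print(read.table(keytypes_file, sep = "\t"))
```

Editing the real file needs write permission to the package library, and the change would presumably be lost when the package is reinstalled or updated, so installing a version that already includes these key types is the more durable option.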