org.Mm.eg.db UNIPROT missing values
@juliechevalier-13173
France

Hi,

I'm working with the latest versions of AnnotationDbi (1.38.2) and org.Mm.eg.db (3.4.1) in R 3.3.3.

I tried to retrieve the SYMBOL matching a UNIPROT ID with this command:

select(org.Mm.eg.db, keys="P53784", columns="SYMBOL", keytype="UNIPROT")

I obtained this error:

Error in .testForValidKeys(x, keys, keytype, fks) :
  None of the keys entered are valid keys for 'UNIPROT'. Please use the keys method to see a listing of valid arguments.

 

It seems that this UNIPROT ID doesn't exist in the database (I verified this by listing all UNIPROT keys of org.Mm.eg.db with keys(org.Mm.eg.db,keytype="UNIPROT")), even though it exists for mouse and corresponds to the "Sox3" gene.

When I try the same command with the Sox3 gene as input, I obtain this result:

select(org.Mm.eg.db, keys="Sox3", columns="UNIPROT", keytype="SYMBOL")
'select()' returned 1:many mapping between keys and columns
  SYMBOL UNIPROT
1   Sox3  A2AM37
2   Sox3  Q5RKW0

The status of these two UNIPROT IDs is "unreviewed" on the UniProtKB website, while P53784 is "reviewed" but not contained in the DB!

I have the same problem with several UNIPROT IDs. Is this normal? Is there another version of the database containing all UNIPROT IDs?

Thanks in advance

Julie

 

Annotation mouse UNIPROT
Mike Smith ★ 6.5k
@mike-smith
EMBL Heidelberg

If you take a look at the manual page for org.Mm.eg.db, it gives the following detail on how the UniProt mappings are derived:

"This object is a simple mapping of Entrez Gene identifiers https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene to Uniprot Accession Numbers"

So we can take a look at the NCBI entry for Sox-3 at https://www.ncbi.nlm.nih.gov/gene/20675 to try and understand a little more.

If you jump to the RefSeq section of that page you'll see two values listed next to UniProtKB/TrEMBL: A2AM37 & Q5RKW0. These are the two reported in the package.

P53784 is mentioned further down the page as a "related sequence". This doesn't necessarily answer the question as to which mapping you want to use, but it at least explains why you find the discrepancy in org.Mm.eg.db.
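
If you have several accessions to look up, it may help to check first which ones the package knows about. As far as I know, select() only raises the error above when none of the supplied keys are valid. The following is a minimal sketch, not an official recipe: partition_keys is a hypothetical helper name I made up for illustration, and the three accessions are simply the ones discussed in this thread.

```r
# Split a vector of query IDs into those present in a reference set
# and those missing from it. Pure base R; `partition_keys` is a
# hypothetical helper name, not part of AnnotationDbi.
partition_keys <- function(query, valid) {
  query <- unique(as.character(query))
  found <- query %in% valid
  list(present = query[found], missing = query[!found])
}

# Usage with org.Mm.eg.db, guarded so the helper can be tried even
# when the annotation package is not installed.
if (requireNamespace("org.Mm.eg.db", quietly = TRUE) &&
    requireNamespace("AnnotationDbi", quietly = TRUE)) {
  db <- org.Mm.eg.db::org.Mm.eg.db
  ids <- c("P53784", "A2AM37", "Q5RKW0")  # accessions from this thread
  parts <- partition_keys(ids, AnnotationDbi::keys(db, keytype = "UNIPROT"))

  # Query only the accessions that are valid UNIPROT keys in the package.
  if (length(parts$present) > 0) {
    print(AnnotationDbi::select(db, keys = parts$present,
                                columns = c("ENTREZID", "SYMBOL"),
                                keytype = "UNIPROT"))
  }

  # Accessions the package does not map; these need another source.
  print(parts$missing)
}
```

Anything that ends up in the missing element is not part of the Entrez Gene based mapping shipped with the package, so you would have to resolve those accessions some other way.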
