Question: org.Mm.eg.db UNIPROT missing values
9 weeks ago, julie.chevalier wrote:

Hi,

I'm working with the latest versions of AnnotationDbi (1.38.2) and org.Mm.eg.db (3.4.1) in R 3.3.3.

I tried to retrieve the SYMBOL matching a UNIPROT ID with this command:

select(org.Mm.eg.db, keys="P53784", columns="SYMBOL", keytype="UNIPROT")

I got this error:

Error in .testForValidKeys(x, keys, keytype, fks) :
  None of the keys entered are valid keys for 'UNIPROT'. Please use the keys method to see a listing of valid arguments.

 

It seems that this UNIPROT ID doesn't exist in the database (I verified this by listing all UNIPROT keys of org.Mm.eg.db with keys(org.Mm.eg.db, keytype="UNIPROT")), even though the ID exists for mouse and corresponds to the "Sox3" gene.

When I run the same command with the Sox3 gene as input, I get this result:

select(org.Mm.eg.db, keys="Sox3", columns="UNIPROT", keytype="SYMBOL")
'select()' returned 1:many mapping between keys and columns
  SYMBOL UNIPROT
1   Sox3  A2AM37
2   Sox3  Q5RKW0

The status of these two UNIPROT IDs is "unreviewed" on the UniProtKB website, while P53784 is "reviewed" but not contained in the DB!

I have the same problem with several UNIPROT IDs. Is this normal? Is there another version of the database that contains all UNIPROT IDs?

Thanks in advance

Julie

 

9 weeks ago, Mike Smith (EMBL Heidelberg / de.NBI) wrote:

If you take a look at the manual pages for org.Mm.eg.db, they give the following details on how the UniProt mappings are derived:

"This object is a simple mapping of Entrez Gene identifiers https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene to Uniprot Accession Numbers"

So we can take a look at the NCBI entry for Sox3 at https://www.ncbi.nlm.nih.gov/gene/20675 to try to understand a little more.

If you jump to the RefSeq section of that page, you'll see two values listed next to UniProtKB/TrEMBL: A2AM37 and Q5RKW0. These are the two reported in the package.

P53784 is mentioned further down the page as a "related sequence". This doesn't necessarily answer the question of which mapping you want to use, but it at least explains why you find the discrepancy in org.Mm.eg.db.
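
Since several accessions are affected, it may help to check each ID against keys() before calling select(), so that absent accessions are set aside instead of triggering the "None of the keys entered are valid keys" error. The following is only a minimal sketch: partition_keys is a hypothetical helper name (not part of AnnotationDbi), and the two example accessions are the ones from this thread. Whether a given accession is present depends on the installed version of org.Mm.eg.db.

```r
# Hypothetical helper (not part of AnnotationDbi): split the requested IDs
# into those present in a vector of valid keys and those that are missing.
partition_keys <- function(ids, valid_keys) {
  ids <- unique(ids)
  found <- ids %in% valid_keys
  list(present = ids[found], missing = ids[!found])
}

# The annotation lookup only runs if the package is installed.
if (requireNamespace("org.Mm.eg.db", quietly = TRUE)) {
  library(org.Mm.eg.db)  # also attaches AnnotationDbi (keys, select)

  # P53784 is the reviewed accession from the question; A2AM37 is one of the
  # accessions that select() returned for Sox3.
  ids <- c("P53784", "A2AM37")
  parts <- partition_keys(ids, keys(org.Mm.eg.db, keytype = "UNIPROT"))

  # Query only the accessions the package knows about.
  if (length(parts$present) > 0) {
    print(select(org.Mm.eg.db, keys = parts$present,
                 columns = "SYMBOL", keytype = "UNIPROT"))
  }

  # Accessions that need to be resolved some other way.
  print(parts$missing)
}
```

The missing accessions can then be looked up separately, for example on the UniProtKB website, as was done for P53784 above.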
