Missing ensembl_ids in biomaRt uniprot query
@antonio-miguel-de-jesus-domingues-5182
Last seen 10 months ago
Germany

I am trying to do a "classical" match of UniProt IDs, using protein IDs identified in a zebrafish mass-spec experiment, to find the corresponding Ensembl gene IDs. However, there are several proteins for which my biomaRt query fails to retrieve any information, although they are present in the UniProt database and have an Ensembl gene ID assigned. Am I missing something?

Here is my code:

library(biomaRt)

# UniProt accessions identified in the zebrafish mass-spec experiment
prot_ids <- c("F1QCB4", "F1R8H7", "A0JMF6", "F1QU18", "A0JMK7", "A0MTA1")

uniProt <- useMart("unimart", dataset = "uniprot")

getBM(
    attributes = c("accession", "name", "ensembl_id", "gene_name"),
    filters = "accession",
    values = prot_ids,
    mart = uniProt)

[1] accession  name       ensembl_id gene_name
<0 rows> (or 0-length row.names)

sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] biomaRt_2.22.0     VennDiagram_1.6.9  RColorBrewer_1.1-2

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.28.1 Biobase_2.26.0       BiocGenerics_0.12.1
 [4] bitops_1.0-6         DBI_0.3.1            GenomeInfoDb_1.2.3  
 [7] IRanges_2.0.0        parallel_3.1.2       RCurl_1.95-4.5      
[10] RSQLite_1.0.0        S4Vectors_0.4.0      stats4_3.1.2        
[13] tcltk_3.1.2          tools_3.1.2          XML_3.98-1.1 
biomaRt uniprot ensembl • 1.7k views
me • 0
@me-7151
Last seen 10.0 years ago
European Union

Some of the accessions you are trying to map have been deleted from UniProt, e.g. F1QCB4. Your code should return results with, for example, the accession P05067. However, it is likely that you will need to rerun the protein accession (ACC) mapping using the zebrafish mass-spec results.

See this example from the Biomart that shows the basic query is correct but no value exists for F1QCB4.
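
To see which accessions the query did not return, one option is to compare the submitted IDs against the accession column of the result. The snippet below is only a sketch: the `result` data frame is a hypothetical stand-in for the output of getBM(), assuming that only the control accession P05067 comes back.

```r
# Accessions submitted to the query: the six from the question plus the
# control accession P05067 mentioned above.
queried <- c("F1QCB4", "F1R8H7", "A0JMF6", "F1QU18", "A0JMK7", "A0MTA1",
             "P05067")

# Hypothetical stand-in for the data frame returned by getBM(); it is
# assumed here that only the control accession is returned.
result <- data.frame(accession = "P05067", stringsAsFactors = FALSE)

# Accessions without a row in the result; these are the ones to look up
# by hand in UniProt (e.g. to see whether the entry was deleted).
unmapped <- setdiff(queried, result$accession)
print(unmapped)
```

With a real result, replace the stand-in with the data frame returned by getBM() and keep the setdiff() step as it is.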


Yes, some correspond to deleted entries, but I do not understand what you mean by "rerun the protein ACC mapping to using the Zebrafish mass-spec results". Could you please explain in more detail?
