I am trying to do a "classical" match of UniProt IDs, using protein IDs identified in a zebrafish mass-spec experiment, to find the corresponding Ensembl gene IDs. However, there are several proteins for which my biomaRt query fails to retrieve any information, although they are present in the UniProt database and have an Ensembl gene ID assigned. Am I missing something?
Here is my code:
library(biomaRt)

prot_ids <- c("F1QCB4", "F1R8H7", "A0JMF6", "F1QU18", "A0JMK7", "A0MTA1")
uniProt <- useMart("unimart", dataset = "uniprot")
getBM(attributes = c("accession", "name", "ensembl_id", "gene_name"),
      filters = "accession",
      values = prot_ids,
      mart = uniProt)
[1] accession name ensembl_id gene_name
<0 rows> (or 0-length row.names)
sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] biomaRt_2.22.0     VennDiagram_1.6.9  RColorBrewer_1.1-2

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.28.1 Biobase_2.26.0       BiocGenerics_0.12.1
 [4] bitops_1.0-6         DBI_0.3.1            GenomeInfoDb_1.2.3
 [7] IRanges_2.0.0        parallel_3.1.2       RCurl_1.95-4.5
[10] RSQLite_1.0.0        S4Vectors_0.4.0      stats4_3.1.2
[13] tcltk_3.1.2          tools_3.1.2          XML_3.98-1.1
Yes, some correspond to deleted entries, but I do not understand what you mean by "rerun the protein ACC mapping to using the Zebrafish mass-spec results". Could you please explain in more detail?
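To separate deleted entries from genuinely missing mappings, it may help to list exactly which accessions a query returned nothing for, so that each can be checked by hand in UniProt. The following is only a minimal base-R sketch: `res` is a hypothetical stand-in for a `getBM()` result with the same columns as the query above (it is not real output), and `missing_ids` is a name introduced here for illustration.

```r
# Accessions from the mass-spec experiment (same vector as in the question)
prot_ids <- c("F1QCB4", "F1R8H7", "A0JMF6", "F1QU18", "A0JMK7", "A0MTA1")

# Hypothetical stand-in for a getBM() result: same columns, zero rows.
# In practice this would be the data frame returned by the real query.
res <- data.frame(accession  = character(0),
                  name       = character(0),
                  ensembl_id = character(0),
                  gene_name  = character(0),
                  stringsAsFactors = FALSE)

# Accessions present in the input but absent from the result
missing_ids <- setdiff(prot_ids, res$accession)
print(missing_ids)
```

With a partially successful query, `missing_ids` would contain only the accessions that need checking for deletion or merging.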