AnnotationDbi::select() on org.Ss.eg.db returns two distinct matches for UNIPROT P59083 (PHP14_PIG and MAMDC4_PIG); I expected one. Cross-checking at uniprot.org: P59083 comes up as PHPT1 (PHP14_PIG), and the site lists (among others) F1RW01 for MAMDC4. Checking at https://www.ncbi.nlm.nih.gov/search/, the ENTREZIDs shown below map to the same gene symbols. So the UNIPROT mapping appears to be wrong, but the error does not appear to come from uniprot.org.
A BLAST of the P59083 FASTA sequence hits many orthologs of PHPT1, but there are no hits to any other proteins within Sus scrofa, so MAMDC4 is not PHPT1. The evidence suggests that some database wrongly maps MAMDC4 to P59083, but I don't know which one. Since the mapping is fine at UniProt and Entrez, it seems plausible that the error is within the org.Ss.eg.db object itself.
So I'm posting here as a starting point, since I verified that UniProt and Entrez do not show the double hit.
Code to reproduce is below.
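# Running inside RStudio:
BiocManager::install(c("AnnotationDbi", "org.Ss.eg.db"))
library(AnnotationDbi)
library(org.Ss.eg.db)  # provides the org.Ss.eg.db annotation object queried below
AnnotationDbi::select(org.Ss.eg.db, keys = "P59083", keytype = "UNIPROT",
                      columns = c("UNIPROT", "SYMBOL", "ENTREZID", "GENENAME"))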
'select()' returned 1:many mapping between keys and columns
  UNIPROT SYMBOL  ENTREZID                       GENENAME
1  P59083 MAMDC4 100513261        MAM domain containing 4
2  P59083  PHPT1 126964416 phosphohistidine phosphatase 1
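One way to narrow down where the extra mapping lives might be to query in the other direction and to look at what the package records about its own sources. The sketch below is only a suggestion, not a confirmed diagnosis: it assumes the two ENTREZIDs from the output above, and it uses only select() and metadata() on the org.Ss.eg.db object. The variable names ids and rev_map are mine. If P59083 is attached to 100513261 in the reverse lookup as well, the association is stored in the package's tables rather than produced by the join in select().

```r
library(AnnotationDbi)
library(org.Ss.eg.db)

# The two Entrez Gene IDs returned for P59083 above (MAMDC4 and PHPT1).
ids <- c("100513261", "126964416")

# Reverse lookup: which UniProt accessions does the package attach to each gene?
rev_map <- AnnotationDbi::select(org.Ss.eg.db, keys = ids, keytype = "ENTREZID",
                                 columns = c("SYMBOL", "UNIPROT"))
print(rev_map)

# Source databases and snapshot dates recorded when the package was built.
print(metadata(org.Ss.eg.db))
```

The metadata table should indicate which upstream sources and dates the UNIPROT column was built from, which may help decide where to report the discrepancy.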
# sessionInfo()
R version 4.5.1 (2025-06-13 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)
Matrix products: default
LAPACK version 3.12.1
locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8 LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C LC_TIME=English_United States.utf8
time zone: America/Los_Angeles
tzcode source: internal
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] clusterProfiler_4.16.0 edgeR_4.6.3 limma_3.64.3 ggrepel_0.9.6 ggridges_0.5.6
[6] stringi_1.8.7 biomaRt_2.64.0 UniProt.ws_2.48.0 org.Mm.eg.db_3.21.0 org.Ss.eg.db_3.21.0
[11] org.Hs.eg.db_3.21.0 AnnotationDbi_1.70.0 IRanges_2.42.0 S4Vectors_0.46.0 Biobase_2.68.0
[16] BiocGenerics_0.54.0 generics_0.1.4 BiocManager_1.30.26 openxlsx_4.2.8 readxl_1.4.5
[21] lubridate_1.9.4 forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4 purrr_1.1.0
[26] readr_2.1.5 tidyr_1.3.1 tibble_3.3.0 ggplot2_3.5.2 tidyverse_2.0.0
loaded via a namespace (and not attached):
[1] RColorBrewer_1.1-3 rstudioapi_0.17.1 jsonlite_2.0.0 magrittr_2.0.3 ggtangle_0.0.7
[6] farver_2.1.2 fs_1.6.6 vctrs_0.6.5 memoise_2.0.1 ggtree_3.16.3
[11] BiocBaseUtils_1.10.0 progress_1.2.3 curl_7.0.0 cellranger_1.1.0 gridGraphics_0.5-1
[16] pROC_1.19.0.1 caret_7.0-1 parallelly_1.45.1 plyr_1.8.9 httr2_1.2.1
[21] cachem_1.1.0 igraph_2.1.4 lifecycle_1.0.4 iterators_1.0.14 pkgconfig_2.0.3
[26] gson_0.1.0 Matrix_1.7-3 R6_2.6.1 fastmap_1.2.0 GenomeInfoDbData_1.2.14
[31] future_1.67.0 aplot_0.2.8 enrichplot_1.28.4 digest_0.6.37 patchwork_1.3.1
[36] RSQLite_2.4.3 filelock_1.0.3 timechange_0.3.0 httr_1.4.7 compiler_4.5.1
[41] bit64_4.6.0-1 withr_3.0.2 BiocParallel_1.42.1 DBI_1.2.3 rjsoncons_1.3.2
[46] R.utils_2.13.0 MASS_7.3-65 lava_1.8.1 rappdirs_0.3.3 ModelMetrics_1.2.2.2
[51] tools_4.5.1 ape_5.8-1 zip_2.3.3 future.apply_1.20.0 nnet_7.3-20
[56] R.oo_1.27.1 glue_1.8.0 nlme_3.1-168 GOSemSim_2.34.0 grid_4.5.1
[61] reshape2_1.4.4 fgsea_1.34.2 recipes_1.3.1 gtable_0.3.6 tzdb_0.5.0
[66] R.methodsS3_1.8.2 class_7.3-23 data.table_1.17.8 hms_1.1.3 xml2_1.4.0
[71] XVector_0.48.0 foreach_1.5.2 pillar_1.11.0 yulab.utils_0.2.1 splines_4.5.1
[76] treeio_1.32.0 BiocFileCache_2.16.1 lattice_0.22-7 survival_3.8-3 bit_4.6.0
[81] tidyselect_1.2.1 GO.db_3.21.0 locfit_1.5-9.12 Biostrings_2.76.0 statmod_1.5.0
[86] hardhat_1.4.2 timeDate_4041.110 UCSC.utils_1.4.0 lazyeval_0.2.2 ggfun_0.2.0
[91] codetools_0.2-20 qvalue_2.40.0 AnVILBase_1.2.0 ggplotify_0.1.2 cli_3.6.5
[96] rpart_4.1.24 Rcpp_1.1.0 GenomeInfoDb_1.44.2 globals_0.18.0 dbplyr_2.5.0
[101] png_0.1-8 parallel_4.5.1 gower_1.0.2 blob_1.2.4 prettyunits_1.2.0
[106] DOSE_4.2.0 listenv_0.9.1 tidytree_0.4.6 ipred_0.9-15 scales_1.4.0
[111] prodlim_2025.04.28 crayon_1.5.3 rlang_1.1.6 fastmatch_1.1-6 cowplot_1.2.0
[116] KEGGREST_1.48.1
