Hello!
I am trying to use your tool to annotate some proteins and I've stumbled upon an error:
> library(org.Hs.eg.db)
# This works fine
> mapIds(org.Hs.eg.db, keys = "Q8NGN2", keytype="UNIPROT", column="ENTREZID")
'select()' returned 1:1 mapping between keys and columns
Q8NGN2
"219873"
# And this does not work
> mapIds(org.Hs.eg.db, keys = "P0DPD7", keytype="UNIPROT", column="ENTREZID")
Error in .testForValidKeys(x, keys, keytype, fks) :
None of the keys entered are valid keys for 'UNIPROT'. Please use the keys method to see a listing of valid arguments.
It seems that there are genes for both of these proteins on UniProt (https://pir3.uniprot.org/uniprot/Q8NGN2, https://pir3.uniprot.org/uniprot/P0DPD7).
Is there a reason why I cannot annotate some of the proteins?
> sessionInfo( )
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=ru_RU.UTF-8 LC_NUMERIC=C LC_TIME=ru_RU.UTF-8 LC_COLLATE=ru_RU.UTF-8 LC_MONETARY=ru_RU.UTF-8
[6] LC_MESSAGES=ru_RU.UTF-8 LC_PAPER=ru_RU.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=ru_RU.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] org.Hs.eg.db_3.12.0 AnnotationDbi_1.52.0 IRanges_2.24.1 S4Vectors_0.28.1 Biobase_2.50.0 BiocGenerics_0.36.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.6 DBI_1.1.1 RSQLite_2.2.4 cachem_1.0.4 rlang_0.4.10 blob_1.2.1 vctrs_0.3.6
[8] tools_4.0.3 bit64_4.0.5 bit_4.0.4 fastmap_1.1.0 compiler_4.0.3 pkgconfig_2.0.3 BiocManager_1.30.10
[15] memoise_2.0.0
Thank you!
Huh. Somehow in querying around I got from P0DPD7 to P0DPD6. But anyway, if you want to map from UniProt to NCBI Gene IDs, it's often easier to use UniProt instead of NCBI (which is what the org.Hs.eg.db package is based on). One alternative is to use the UniProt.ws package, but it's not working for me right now for some reason.
For direct queries like this it's just as easy to query the UniProt REST server directly. As an example
This will create the required format, which looks like
And which you can just paste into the address bar of any browser to bring up the table of results. We use read.table to then just read those data into a data.frame.
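The code from the reply is not preserved here, so the following is only a minimal sketch of the approach described (build a query URL, then read the tab-delimited result with read.table), not the original code. It assumes the current UniProt REST search endpoint at rest.uniprot.org with the `accession` and `xref_geneid` return fields and `format=tsv`; the helper name `build_uniprot_url` is hypothetical, and the endpoint and field names may differ from what was used at the time.

```r
# Hypothetical helper (not from the original reply): builds a UniProt REST
# search URL for a set of accessions.
# Assumption: the endpoint and the field names below match the current
# UniProt REST API; adjust them if the service has changed.
build_uniprot_url <- function(accessions,
                              fields = c("accession", "xref_geneid"),
                              base = "https://rest.uniprot.org/uniprotkb/search") {
  # Join accessions into one query, e.g. "accession:A OR accession:B"
  query <- paste0("accession:", accessions, collapse = " OR ")
  paste0(base,
         "?query=", utils::URLencode(query, reserved = TRUE),
         "&fields=", paste(fields, collapse = ","),
         "&format=tsv")
}

url <- build_uniprot_url(c("Q8NGN2", "P0DPD7"))
print(url)

# The URL can be pasted into a browser, or read directly into a data.frame.
# The network call is left commented out so the sketch runs offline:
# res <- read.table(url, header = TRUE, sep = "\t", quote = "",
#                   stringsAsFactors = FALSE)
```

If the request succeeds, `res` should have one row per accession found, with the NCBI Gene ID cross-reference (if any) in the second column.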