Hello! I'm trying to convert a set of 6422 gene symbols into Entrez IDs with AnnotationDbi. (I got this list of genes from a differential expression analysis with TCGAbiolinks on a TCGA hepatocellular carcinoma dataset.) When I do this, 838 genes return NA for ENTREZID. However, some of them do have an Entrez ID associated with the symbol provided in the original dataset. For example, one of my genes of interest is SNAI2, whose Entrez ID is 6591, yet it was still among the NAs.
Any ideas on why this could be happening? I'm new to Bioconductor packages, so I'm sorry if this is too dumb!
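library(AnnotationDbi)
library(org.Hs.eg.db)
# The data frame with my differentially expressed genes is called DEedgeR
Gene_names <- DEedgeR$Genes
# Map gene symbols to Entrez IDs; the namespace prefix avoids a clash with dplyr::select
Genes2 <- AnnotationDbi::select(org.Hs.eg.db, keys = Gene_names, columns = 'ENTREZID', keytype = 'SYMBOL')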
sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=Spanish_Argentina.1252 LC_CTYPE=Spanish_Argentina.1252
[3] LC_MONETARY=Spanish_Argentina.1252 LC_NUMERIC=C
[5] LC_TIME=Spanish_Argentina.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] pathview_1.30.0 DO.db_2.9 KEGG.db_3.2.4 KEGGprofile_1.32.0
[5] org.Hs.eg.db_3.12.0 AnnotationDbi_1.52.0 IRanges_2.24.0 S4Vectors_0.28.0
[9] Biobase_2.50.0 BiocGenerics_0.36.0 clusterProfiler_3.18.0 TCGAbiolinks_2.18.0
[13] BiocManager_1.30.10 ggthemes_4.2.0 survival_3.2-7 survminer_0.4.8
[17] ggpubr_0.4.0 ggplot2_3.3.2 xlsx_0.6.5 tidyr_1.1.2
[21] readxl_1.3.1 dplyr_1.0.2
loaded via a namespace (and not attached):
[1] shadowtext_0.0.7 backports_1.2.0 fastmatch_1.1-0
[4] BiocFileCache_1.14.0 plyr_1.8.6 igraph_1.2.6
[7] splines_4.0.3 BiocParallel_1.24.1 GenomeInfoDb_1.26.1
[10] digest_0.6.27 GOSemSim_2.16.1 viridis_0.5.1
[13] GO.db_3.12.1 fansi_0.4.1 magrittr_2.0.1
[16] memoise_1.1.0 openxlsx_4.2.3 Biostrings_2.58.0
[19] readr_1.4.0 graphlayouts_0.7.1 matrixStats_0.57.0
[22] R.utils_2.10.1 askpass_1.1 enrichplot_1.10.1
[25] prettyunits_1.1.1 colorspace_2.0-0 blob_1.2.1
[28] rvest_0.3.6 rappdirs_0.3.1 ggrepel_0.8.2
[31] haven_2.3.1 xfun_0.19 crayon_1.3.4
[34] RCurl_1.98-1.2 jsonlite_1.7.1 graph_1.68.0
[37] scatterpie_0.1.5 zoo_1.8-8 glue_1.4.2
[40] polyclip_1.10-0 gtable_0.3.0 zlibbioc_1.36.0
[43] XVector_0.30.0 DelayedArray_0.16.0 car_3.0-10
[46] Rgraphviz_2.34.0 abind_1.4-5 scales_1.1.1
[49] DOSE_3.16.0 DBI_1.1.0 rstatix_0.6.0
[52] Rcpp_1.0.5 viridisLite_0.3.0 xtable_1.8-4
[55] progress_1.2.2 foreign_0.8-80 bit_4.0.4
[58] km.ci_0.5-2 httr_1.4.2 fgsea_1.16.0
[61] RColorBrewer_1.1-2 ellipsis_0.3.1 pkgconfig_2.0.3
[64] XML_3.99-0.5 rJava_0.9-13 R.methodsS3_1.8.1
[67] farver_2.0.3 dbplyr_2.0.0 tidyselect_1.1.0
[70] rlang_0.4.8 reshape2_1.4.4 TeachingDemos_2.12
[73] munsell_0.5.0 cellranger_1.1.0 tools_4.0.3
[76] cli_2.2.0 downloader_0.4 generics_0.1.0
[79] RSQLite_2.2.1 broom_0.7.2 stringr_1.4.0
[82] knitr_1.30 bit64_4.0.5 tidygraph_1.2.0
[85] zip_2.1.1 survMisc_0.5.5 purrr_0.3.4
[88] KEGGREST_1.30.1 ggraph_2.0.4 R.oo_1.24.0
[91] KEGGgraph_1.50.0 xml2_1.3.2 biomaRt_2.46.0
[94] compiler_4.0.3 rstudioapi_0.13 png_0.1-7
[97] curl_4.3 ggsignif_0.6.0 tibble_3.0.4
[100] tweenr_1.0.1 stringi_1.5.3 TCGAbiolinksGUI.data_1.10.0
[103] forcats_0.5.0 lattice_0.20-41 Matrix_1.2-18
[106] KMsurv_0.1-5 vctrs_0.3.5 pillar_1.4.7
[109] lifecycle_0.2.0 data.table_1.13.2 cowplot_1.1.0
[112] bitops_1.0-6 GenomicRanges_1.42.0 qvalue_2.22.0
[115] R6_2.5.0 gridExtra_2.3 rio_0.5.16
[118] MASS_7.3-53 assertthat_0.2.1 SummarizedExperiment_1.20.0
[121] xlsxjars_0.6.1 openssl_1.4.3 withr_2.3.0
[124] GenomeInfoDbData_1.2.4 hms_0.5.3 grid_4.0.3
[127] rvcheck_0.1.8 MatrixGenerics_1.2.0 carData_3.0-4
[130] ggforce_0.3.2 tinytex_0.27
I imagined that some gene symbols wouldn't have a corresponding Entrez ID, but I knew this one in particular did because it's the one I work with. I tried your code and it worked, so thank you very much! I've only been working with Bioconductor packages for a few days, so this is really helpful for me.
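For anyone who runs into the same NAs, here is a minimal sketch of how they could be diagnosed. It is not necessarily the code referred to above. It assumes the usual suspects: stray whitespace around the symbols, a factor column instead of character, or names that org.Hs.eg.db only knows as aliases rather than official symbols. The input vector is made up for illustration; in the post it would be DEedgeR$Genes.

```r
library(AnnotationDbi)
library(org.Hs.eg.db)

# Hypothetical input for illustration; replace with DEedgeR$Genes
gene_names <- c("SNAI2", " SNAI2 ", "SLUG", "NOT_A_GENE")

# Convert to character (in case of a factor), trim whitespace, drop duplicates
gene_names_clean <- unique(trimws(as.character(gene_names)))

# Result vector, NA until a mapping is found
entrez <- setNames(rep(NA_character_, length(gene_names_clean)), gene_names_clean)

# First pass: names that are official symbols in this annotation release
is_symbol <- gene_names_clean %in% keys(org.Hs.eg.db, keytype = "SYMBOL")
if (any(is_symbol)) {
  entrez[is_symbol] <- mapIds(org.Hs.eg.db,
                              keys = gene_names_clean[is_symbol],
                              column = "ENTREZID",
                              keytype = "SYMBOL",
                              multiVals = "first")
}

# Second pass: remaining names that are known aliases.
# An alias can point to several genes; multiVals = "first" just picks one,
# so these hits deserve a manual check.
is_alias <- !is_symbol & gene_names_clean %in% keys(org.Hs.eg.db, keytype = "ALIAS")
if (any(is_alias)) {
  entrez[is_alias] <- mapIds(org.Hs.eg.db,
                             keys = gene_names_clean[is_alias],
                             column = "ENTREZID",
                             keytype = "ALIAS",
                             multiVals = "first")
}

# Names that still have no Entrez ID after both passes
still_missing <- names(entrez)[is.na(entrez)]
print(entrez)
print(still_missing)
```

Filtering the keys before each mapIds call avoids the error that is raised when none of the supplied keys is valid for the chosen keytype. Whatever remains in still_missing is worth inspecting by hand, since it may be retired symbols or identifiers of another type.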