I use the “annotate” package to convert Affymetrix 3’ probe IDs to gene IDs. For some probes it does not find a corresponding gene ID. For example, looking up “1368587_at” and “1385248_a_at” returns NA for both.
library("annotate")
library("rat2302.db")
getEG(c("1368587_at","1385248_a_at"),"rat2302")
getSYMBOL(c("1368587_at","1385248_a_at"),"rat2302")
But when I google "1385248_a_at", I find the gene symbol “Ogn” (gene ID: 291015) in the Rat Genome Database.
When I used biomaRt, "1385248_a_at" mapped to two gene IDs: “291015” and “100910855”. Gene ID 291015 (Ogn) seems more relevant; the other ID is LOC100910855.
I assume this is a common problem.
Is there an automated way to check for missing probe-to-gene mappings after using “annotate”?
(or) What is the best way to check whether a probe ID really lacks annotation?
I tried extracting the “NA” rows and passing them to biomaRt. biomaRt returns a list with many duplicates (often a second mapping to a LOC… ID) that still need to be checked manually. Is there an alternative?
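To make the manual step concrete, here is a minimal base-R sketch of the filtering I currently do by eye. It is only an illustration under assumptions: the data frame `bm` stands in for the biomaRt result, its column names (`probe`, `entrez`, `symbol`) and the helper `collapse_loc` are hypothetical, and treating a symbol of the form LOC followed by digits as a placeholder is my own heuristic, not something annotate or biomaRt does. The only rows are the mapping quoted above.

```r
# Toy stand-in for the biomaRt result for the NA probes.
# Column names are hypothetical; the rows are the one mapping quoted in this post.
bm <- data.frame(
  probe  = c("1385248_a_at", "1385248_a_at"),
  entrez = c("291015", "100910855"),
  symbol = c("Ogn", "LOC100910855"),
  stringsAsFactors = FALSE
)

# Heuristic (assumption): a symbol of the form LOC<digits> is a placeholder,
# so drop it whenever the same probe also has a named gene.
collapse_loc <- function(df) {
  is_loc <- grepl("^LOC[0-9]+$", df$symbol)
  # TRUE for every row whose probe has at least one non-LOC hit
  has_named <- ave(!is_loc, df$probe, FUN = any)
  # Keep named hits; keep LOC hits only for probes with nothing better
  kept <- unique(df[!is_loc | !has_named, , drop = FALSE])
  # Flag probes that still map to more than one gene after filtering
  n_genes <- tapply(kept$entrez, kept$probe, function(x) length(unique(x)))
  kept$ambiguous <- as.vector(n_genes[kept$probe] > 1)
  rownames(kept) <- NULL
  kept
}

resolved <- collapse_loc(bm)
print(resolved)
```

Rows with `ambiguous` set to TRUE would still need a manual look, which is the part I would like to avoid.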
Any suggestions would be helpful. Thanks for your time,
sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
LC_MONETARY=English_United States.1252 LC_NUMERIC=C
LC_TIME=English_United States.1252

attached base packages:
parallel stats graphics grDevices utils datasets methods base

other attached packages:
biomaRt_2.20.0 rat2302.db_2.14.0 org.Rn.eg.db_2.14.0 RSQLite_0.11.4
DBI_0.3.1 annotate_1.42.1 eisa_1.16.0 AnnotationDbi_1.26.1
GenomeInfoDb_1.0.2 Biobase_2.24.0 BiocGenerics_0.10.0 isa2_0.3.3
RankProd_2.36.0 BiocInstaller_1.14.3

loaded via a namespace (and not attached):
Category_2.30.0 genefilter_1.46.1 graph_1.42.0 grid_3.1.1 GSEABase_1.26.0
IRanges_1.22.10 lattice_0.20-29 Matrix_1.1-4 RBGL_1.40.1 RCurl_1.95-4.3
splines_3.1.1 stats4_3.1.1 survival_2.37-7 tools_3.1.1 XML_3.98-1.1
xtable_1.7-4