I'm using the illuminaHumanv3.db package to obtain updated probe annotations as part of the analysis pipeline for a project I'm currently working on. While examining the results of this analysis, I noticed an inconsistency in the annotations.
Consider the following:
library(illuminaHumanv3.db)
annot <- illuminaHumanv3fullReannotation()
dplyr::select(dplyr::filter(annot, SymbolReannotated == "HSPA1A"),
              IlluminaID:NuID, EntrezReannotated, SymbolReannotated)
This produces the following output:
As you can see, these two probes are annotated with the same gene symbol but different Entrez IDs. As far as I can tell, the gene symbol associated with Entrez 3304 is actually HSPA1B (see here: http://www.ncbi.nlm.nih.gov/gene/?term=3304%5Buid%5D). The Entrez IDs appear to be consistent with the provided probe locations (chr6:31785490:31785539:+ and chr6:31797684:31797733:+), suggesting that the symbol is incorrect for the second of these two probes.
I haven't checked systematically for other inconsistencies, but this seems a bit concerning to me. Or am I missing something obvious here?
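For a systematic check, something along the following lines might work. This is only a rough sketch, not something I have run against the full annotation: the helper find_symbol_mismatches() and the column SymbolExpected are names I made up for illustration, the probe IDs in the toy data are invented, and I'm assuming EntrezReannotated holds a single Entrez ID (or NA) per probe. The idea is to look up the symbol that a reference mapping (e.g. org.Hs.eg.db) gives for each Entrez ID and flag probes whose SymbolReannotated differs.

```r
# Flag probes whose reannotated symbol disagrees with the symbol that a
# reference lookup gives for the same Entrez ID.
# `entrez_to_symbol` is a named character vector (names = Entrez IDs,
# values = gene symbols). Hypothetical helper, not part of any package.
find_symbol_mismatches <- function(annot, entrez_to_symbol) {
  entrez <- as.character(annot$EntrezReannotated)
  # Unknown or NA Entrez IDs yield NA and are not reported as mismatches.
  annot$SymbolExpected <- unname(entrez_to_symbol[entrez])
  mismatch <- !is.na(annot$SymbolExpected) &
    !is.na(annot$SymbolReannotated) &
    annot$SymbolReannotated != annot$SymbolExpected
  annot[mismatch, , drop = FALSE]
}

# Toy data with invented probe IDs, mimicking the HSPA1A case above.
toy_annot <- data.frame(
  IlluminaID = c("probe_a", "probe_b", "probe_c"),
  EntrezReannotated = c("3303", "3304", NA),
  SymbolReannotated = c("HSPA1A", "HSPA1A", NA),
  stringsAsFactors = FALSE
)
toy_lookup <- c("3303" = "HSPA1A", "3304" = "HSPA1B")
toy_mismatches <- find_symbol_mismatches(toy_annot, toy_lookup)
print(toy_mismatches)

# With the real packages, the lookup could presumably be built like this
# (assumes mapIds() is available in the installed AnnotationDbi):
# library(illuminaHumanv3.db)
# library(org.Hs.eg.db)
# annot <- illuminaHumanv3fullReannotation()
# ids <- unique(na.omit(as.character(annot$EntrezReannotated)))
# lookup <- AnnotationDbi::mapIds(org.Hs.eg.db, keys = ids,
#                                 column = "SYMBOL", keytype = "ENTREZID",
#                                 multiVals = "first")
# find_symbol_mismatches(annot, lookup)
```

Note that a mismatch found this way only means the two annotation sources disagree. It could also reflect a symbol that was renamed between releases, so each hit would still need a manual look.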
sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux stretch/sid

locale:
 LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
 LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
 parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
 Biostrings_2.36.1 XVector_0.8.0 illuminaHumanv3.db_1.26.0 org.Hs.eg.db_3.1.2 RSQLite_1.0.0
 DBI_0.3.1 AnnotationDbi_1.30.1 GenomeInfoDb_1.4.1 IRanges_2.2.5 S4Vectors_0.6.2
 Biobase_2.28.0 BiocGenerics_0.14.0

loaded via a namespace (and not attached):
 zlibbioc_1.14.0 tools_3.2.1