key lookup changes with AnnotationDbi version (bug?)
james ▴ 10
@james-21577

I thought that all of the information about each chipset (i.e. platform) was in the corresponding R package.

For example, hgu219.db is the annotation package for the hgu219 platform.

However, my key lookup results differ depending on the package version of AnnotationDbi, even when the hgu219.db package versions are the same.

So for example,

keys(hgu219.db, keytype = 'UNIPROT')


gives a different list of UNIPROTs depending on the AnnotationDbi version.

I thought all of the info was in the hgu219.db package. My thinking must be incorrect?

Can someone explain why this is happening? I'm not sure if I should be filing a bug report.

annotation • 330 views

I think this is related to org.Hs.eg.db.

Some UNIPROT identifiers that used to exist are now missing. Example with org.Hs.eg.db:

select(org.Hs.eg.db, c("SKOR1"), c("ENTREZID","UNIPROT"), "SYMBOL")


v3.11.4:

'select()' returned 1:1 mapping between keys and columns
  SYMBOL ENTREZID UNIPROT
1  SKOR1   390598    <NA>


v3.8.2:

'select()' returned 1:1 mapping between keys and columns
  SYMBOL ENTREZID UNIPROT
1  SKOR1   390598  P84550


The result is NA where it should link to P84550, as it did in previous versions. There doesn't seem to be any clear reason for the removal: https://www.uniprot.org/uniprot/P84550
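One way to find which identifiers disappeared between releases is to save the key list in each installation and compare the two sets. The following is only a sketch: `compare_keys` is a hypothetical helper written for this example (it is not part of AnnotationDbi), the `.rds` file names are placeholders, and the toy vectors at the end are made up for illustration.

```r
# `compare_keys` is a hypothetical helper, not an AnnotationDbi function.
# It reports which keys were dropped and which were added between two releases.
compare_keys <- function(old_keys, new_keys) {
  list(
    lost   = setdiff(old_keys, new_keys),  # present before, missing now
    gained = setdiff(new_keys, old_keys)   # only in the later release
  )
}

# In the older installation (file name is a placeholder):
#   saveRDS(keys(org.Hs.eg.db, keytype = "UNIPROT"), "uniprot_old.rds")
# In the newer installation:
#   saveRDS(keys(org.Hs.eg.db, keytype = "UNIPROT"), "uniprot_new.rds")
# Then, in either session:
#   res <- compare_keys(readRDS("uniprot_old.rds"), readRDS("uniprot_new.rds"))
#   "P84550" %in% res$lost

# Toy illustration with made-up key vectors
toy <- compare_keys(c("P84550", "A0000"), c("A0000", "B0000"))
toy$lost
```

Saving the keys to disk avoids having to install two versions of the annotation package side by side in one R library.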


To expand on this comment: IMO this is because the current annotation info provided by the NCBI no longer includes the link to the UniProt ID (although it previously did).

Please realize that NCBI and UniProt are two independent groups/consortia that provide annotation info, and that org.Hs.eg.db (or any other OrgDb) is simply a repackaged, R-compatible one-to-one copy of the info provided by the NCBI.

James MacDonald has expanded on this multiple times, for example in this post (although it deals with GO annotations, I assume you will get the point).

@james-w-macdonald-5106

As the previous commenters have alluded to, the query you are making has nothing to do with the ChipDb you are using, which only contains a single table that maps array IDs to NCBI gene IDs. If you ask for UniProt keys, the query is made to the org.Hs.eg.db package, which changes from release to release to reflect changes to the various annotation services.

The org.Hs.eg.db package only contains those IDs that can be mapped to NCBI gene IDs, and that set changes over time as our understanding improves and things get updated by the annotation services.
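Since the UniProt lookup depends on org.Hs.eg.db rather than on hgu219.db alone, it can help to record the versions of all the packages involved when comparing results across installations. A minimal sketch, assuming the three packages named in this thread; `annotation_versions` is a hypothetical helper written for this example, not something the packages provide.

```r
# `annotation_versions` is a hypothetical helper, not part of AnnotationDbi.
# It returns the installed version of each package, or NA if not installed.
annotation_versions <- function(pkgs = c("AnnotationDbi", "hgu219.db", "org.Hs.eg.db")) {
  vapply(pkgs, function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  }, character(1))
}

annotation_versions()

# If org.Hs.eg.db is installed, metadata() shows the data sources and
# date stamps recorded when the package was built.
if (requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  library(org.Hs.eg.db)
  print(metadata(org.Hs.eg.db))
}
```

Two installations with the same hgu219.db version but different org.Hs.eg.db versions would be expected to show the kind of difference described in the question.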