Question: Use of org.Hs.eg.db and TxDb.Hsapiens.UCSC.hg19.knownGene
Lna wrote, 23 months ago:


I was trying to make a list of SNPs and the names of the genes they are related to, so I used the VariantAnnotation package:

locateVariants(target, TxDb.Hsapiens.UCSC.hg19.knownGene, AllVariants())

and got a list of the respective gene IDs. As far as I understand, VariantAnnotation gets the gene IDs from the TxDb.Hsapiens.UCSC.hg19.knownGene package, and these are ENTREZIDs, which can be used directly as keys by the org.Hs.eg.db package. When I do this,

select(org.Hs.eg.db, keys=gid, columns=c("GENENAME"), keytype="ENTREZID")

it works for some of the entries, but then I get the error:

Error in .testForValidKeys(x, keys, keytype, fks) :
  None of the keys entered are valid keys for 'ENTREZID'. Please use the keys method to see a listing of valid arguments.

I looked up the gene ID causing the error on the NCBI site and found that it has been replaced by another one. So it seems TxDb.Hsapiens.UCSC.hg19.knownGene is providing an outdated gene ID that org.Hs.eg.db cannot deal with. I checked the package version of TxDb.Hsapiens.UCSC.hg19.knownGene; it should be the latest version.

Now my question: am I doing anything wrong, or is this an inconsistency between the two packages that I have to deal with? Is there a simple way to solve this problem?

Thanks for any help!

Vincent J. Carey, Jr. (United States) wrote, 23 months ago:

You can get information about the sources of the annotation resources by printing them, i.e. by typing their names at the R prompt.

> TxDb.Hsapiens.UCSC.hg19.knownGene

TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: UCSC
# Genome: hg19
# Organism: Homo sapiens
# Taxonomy ID: 9606
# UCSC Table: knownGene
# Resource URL:
# Type of Gene ID: Entrez Gene ID
# Full dataset: yes
# miRBase build ID: GRCh37
# transcript_nrow: 82960
# exon_nrow: 289969
# cds_nrow: 237533
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2015-10-07 18:11:28 +0000 (Wed, 07 Oct 2015)


> org.Hs.eg.db

OrgDb object:
| Db type: OrgDb
| Supporting package: AnnotationDbi
| ORGANISM: Homo sapiens
| SPECIES: Human
| EGSOURCEDATE: 2016-Sep26
| TAXID: 9606

I am not sure mine are up to date, but in any case there is no guarantee that the two references are fully consistent -- one is made at UCSC and one at NCBI.  You can avoid the error by checking for the existence of your gid elements among the keys() result for the resource you are querying, and removing those that cannot be resolved.  Note that it will not fail if there is at least one valid key supplied:

> select(org.Hs.eg.db, c("1", "8"), columns="GENENAME", keytype="ENTREZID")
'select()' returned 1:1 mapping between keys and columns
  ENTREZID               GENENAME
1        1 alpha-1-B glycoprotein
2        8                   <NA>
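
The key check described above can be sketched roughly as follows. This is only a minimal illustration, not a tested recipe for the original data: `partition_keys` is a hypothetical helper name, and `gid` is filled with the two example IDs from the call above in place of the gene IDs returned by locateVariants(). The annotation lookup is guarded so the snippet still runs where org.Hs.eg.db is not installed.

```r
# Split a vector of IDs into those present in a set of valid keys and those not.
# `partition_keys` is a hypothetical helper, not part of any Bioconductor package.
partition_keys <- function(ids, valid_keys) {
  ids <- unique(as.character(ids[!is.na(ids)]))  # drop missing IDs and duplicates
  ok <- ids %in% valid_keys
  list(valid = ids[ok], unresolved = ids[!ok])
}

# Example IDs taken from the select() call above; in practice this would be the
# vector of gene IDs obtained from locateVariants().
gid <- c("1", "8")

if (requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  orgdb <- org.Hs.eg.db::org.Hs.eg.db
  parts <- partition_keys(gid, AnnotationDbi::keys(orgdb, keytype = "ENTREZID"))

  # Only query when at least one key can be resolved, so that select()
  # never receives a key set that is entirely invalid.
  if (length(parts$valid) > 0) {
    res <- AnnotationDbi::select(orgdb, keys = parts$valid,
                                 columns = "GENENAME", keytype = "ENTREZID")
    print(res)
  }

  # IDs left for manual follow-up, e.g. ones NCBI has replaced or withdrawn.
  print(parts$unresolved)
}
```

Keeping the unresolved IDs in a separate vector, rather than silently dropping them, makes it easy to look them up at NCBI afterwards.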

