Question

Using EnsDb.Hsapiens.v86 database doesn't fetch the ENSEMBL IDs

sutturka • 0

@sutturka-14580

Last seen 12 months ago

United States

Recently I noticed that the org.Hs.eg.db database annotates certain genes (e.g. CASC15, ACTR3BP5, STAG3L3) with an ENSEMBL ID of "NA". This could be a limitation of the database.

I tried EnsDb.Hsapiens.v86 which could determine the correct ENSEMBL IDs for these features.

library(EnsDb.Hsapiens.v86)

edb <- EnsDb.Hsapiens.v86
transcripts(edb, filter = GeneNameFilter("CASC15"))

I was able to use the same database in ChIPseeker without any issues, but somehow it does not capture the ENSEMBL ID, which is available in the column named gene_id.

library(ChIPseeker)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)

Anno <- annotatePeak(bedfile, tssRegion = c(-3000, 3000), TxDb = TxDb.Hsapiens.UCSC.hg38.knownGene, annoDb = EnsDb.Hsapiens.v86)

This returned the annotation columns "geneId", "transcriptId", "distanceToTSS", "SYMBOL" and "GENENAME". How can I modify the ChIPseeker call to also get the "gene_id" column from the EnsDb.Hsapiens.v86 database?

Thanks

ChIPseeker • 1.6k views

ADD COMMENT • link 4.6 years ago sutturka • 0

annotatePeak is hard-coded to look for geneId, not gene_id - see here: https://github.com/YuLab-SMU/ChIPseeker/blob/master/R/annotatePeak.R#L178

You may want to create an issue on the GitHub page, and/or literally edit the column name in the EnsDb.Hsapiens.v86 object.
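Another possible workaround, sketched here rather than tested against the original data, is to leave annotatePeak as it is and add the Ensembl IDs afterwards by matching the SYMBOL column against the EnsDb. The helper name add_ensembl_ids and the toy data frame are made up for illustration. The sketch assumes that ensembldb::select() accepts keytype = "SYMBOL" and returns columns named SYMBOL and GENEID. In practice anno_df would come from as.data.frame(Anno).

```r
# Hypothetical helper: add an ENSEMBL column to a peak annotation data frame
# by matching gene symbols against a SYMBOL -> GENEID lookup table.
add_ensembl_ids <- function(anno_df, lookup) {
  # match() keeps only the first hit, so a symbol with several Ensembl IDs
  # gets just one of them; symbols without a hit (or NA) become NA.
  anno_df$ENSEMBL <- lookup$GENEID[match(anno_df$SYMBOL, lookup$SYMBOL)]
  anno_df
}

# Stand-in for as.data.frame(Anno); only the SYMBOL column matters here.
anno_df <- data.frame(SYMBOL = c("CASC15", "STAG3L3", NA),
                      stringsAsFactors = FALSE)

# The database lookup runs only if the annotation package is installed.
if (requireNamespace("EnsDb.Hsapiens.v86", quietly = TRUE)) {
  edb <- EnsDb.Hsapiens.v86::EnsDb.Hsapiens.v86
  keys <- unique(anno_df$SYMBOL[!is.na(anno_df$SYMBOL)])
  # Assumption: the result has columns named SYMBOL and GENEID.
  lookup <- ensembldb::select(edb, keys = keys, keytype = "SYMBOL",
                              columns = c("SYMBOL", "GENEID"))
  anno_df <- add_ensembl_ids(anno_df, lookup)
}
```

Matching on SYMBOL rather than on geneId avoids relying on what kind of identifier the TxDb puts in geneId, but symbols are not always unique, so it is worth checking any genes that map to more than one Ensembl ID.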

ADD REPLY • link 4.6 years ago Kevin Blighe ★ 3.9k