Question: Annotation Database for ChIPpeakAnno
Haiying.Kong (Germany) wrote, 13 months ago:

If I use annotatePeakInBatch in the ChIPpeakAnno package to annotate genomic regions with

    annotatePeakInBatch(apple, AnnotationData=TSS.human.GRCh37)

the annotated gene IDs are Ensembl IDs.

To be consistent with my other results, I had to convert them to Hugo_Symbols, but a significant number of ENSG IDs do not have a corresponding Hugo_Symbol.

I would like to use another annotation database that gives me Hugo_Symbols directly.

Which database can I use?

Ou, Jianhong (United States) wrote, 13 months ago:

Thank you for using the ChIPpeakAnno package to annotate your data.

In the current release of ChIPpeakAnno, any annotation data in GRanges format is accepted. In your case, you can download the most recent annotations from HUGO or Ensembl (for the same assembly). You can also follow the ChIPpeakAnno vignettes to use EnsDb.Hsapiens.v75 or TxDb.Hsapiens.UCSC.hg19.knownGene, or use AnnotationHub to get the annotations you want.
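As a rough illustration of the EnsDb route, the sketch below builds a gene-level GRanges from EnsDb.Hsapiens.v75 and passes it to annotatePeakInBatch. The `peaks` object is a made-up stand-in for the `apple` object in the question (its coordinates are illustrative only), and the `symbol` column name is an arbitrary choice. It assumes both packages are installed.

```r
library(ChIPpeakAnno)
library(EnsDb.Hsapiens.v75)

# Gene-level annotation (GRCh37) as a GRanges object. The names are Ensembl
# gene IDs; the gene symbol is carried in the gene_name metadata column.
annoData <- toGRanges(EnsDb.Hsapiens.v75, feature = "gene")

# Toy peaks standing in for `apple` (illustrative coordinates, not real data).
peaks <- GRanges("chr1", IRanges(start = c(11900, 762000), width = 200))
names(peaks) <- c("peak1", "peak2")

# Use the same chromosome naming style for peaks and annotation.
seqlevelsStyle(peaks) <- seqlevelsStyle(annoData)[1]

anno <- annotatePeakInBatch(peaks, AnnotationData = annoData)

# Attach the symbol for each annotated feature by matching on the Ensembl ID.
anno$symbol <- annoData$gene_name[match(anno$feature, names(annoData))]
anno
```

Keeping the Ensembl IDs as feature names and adding the symbol as an extra column avoids a separate ID conversion step after annotation.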

Let me know if you still have any questions.


Thank you very much for your reply.

(1) The TxDb.Hsapiens.UCSC.hg19.knownGene database does not seem to have gene IDs after conversion to a GRanges object, unless the row names (1, 10, 100, 1000, ...) represent gene IDs.

GRanges object with 23056 ranges and 0 metadata columns:
        seqnames                 ranges strand
           <Rle>              <IRanges>  <Rle>
      1    chr19 [ 58858172,  58874214]      -
     10     chr8 [ 18248755,  18258723]      +
    100    chr20 [ 43248163,  43280376]      -
   1000    chr18 [ 25530930,  25757445]      -
  10000     chr1 [243651535, 244006886]      -
    ...      ...                    ...    ...
   9991     chr9 [114979995, 115095944]      -
   9992    chr21 [ 35736323,  35743440]      +
   9993    chr22 [ 19023795,  19109967]      -
   9994     chr6 [ 90539619,  90584155]      +
   9997    chr22 [ 50961997,  50964905]      -
  -------
  seqinfo: 93 sequences (1 circular) from hg19 genome

 

(2) But I think I got what I wanted. I used the EnsDb.Hsapiens.v75 database and replaced the feature names (row names) with the gene_name column from the elementMetadata slot. I presume this gene_name is the Hugo_Symbol.
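One caveat with using gene_name as feature names: Ensembl gene names are largely, but not exclusively, HGNC-derived, and they are not guaranteed to be unique. The base-R sketch below uses a toy lookup table to show a duplicate check and a match()-based lookup that keeps the IDs as the key. The first four ID/symbol pairs are real; the two `ENSG_HYPOTHETICAL_*` rows are invented purely to produce a duplicated symbol.

```r
# Toy lookup table. The last two IDs are hypothetical placeholders that exist
# only to create a duplicated symbol.
ids     <- c("ENSG00000223972", "ENSG00000227232", "ENSG00000141510",
             "ENSG00000012048", "ENSG_HYPOTHETICAL_1", "ENSG_HYPOTHETICAL_2")
symbols <- c("DDX11L1", "WASH7P", "TP53", "BRCA1", "Y_RNA", "Y_RNA")

# Symbols that occur more than once cannot serve as unique feature names.
dup_symbols <- unique(symbols[duplicated(symbols)])

# Alternative: keep the IDs as names and add the symbol as a separate column.
feature <- c("ENSG00000141510", "ENSG_HYPOTHETICAL_2", "ENSG00000012048")
result  <- data.frame(feature = feature,
                      symbol  = symbols[match(feature, ids)],
                      stringsAsFactors = FALSE)
result
```

With a real annotation object, `ids` would correspond to the names of the GRanges and `symbols` to its gene_name column.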

 

written 13 months ago by Haiying.Kong (modified 13 months ago)

The names on the GRanges are Entrez gene IDs, which are the IDs on which the knownGene track is based; e.g., '1' is https://www.ncbi.nlm.nih.gov/gene/1
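Since the names are Entrez gene IDs, they can be translated to symbols with an organism annotation package. A minimal sketch, assuming org.Hs.eg.db is installed and using the first four IDs from the printed output above as keys:

```r
library(AnnotationDbi)
library(org.Hs.eg.db)

# Entrez gene IDs taken from the row names of the GRanges shown above.
entrez_ids <- c("1", "10", "100", "1000")

# Map each Entrez ID to a gene symbol; if an ID maps to several symbols,
# multiVals = "first" keeps the first one.
symbols <- mapIds(org.Hs.eg.db,
                  keys = entrez_ids,
                  column = "SYMBOL",
                  keytype = "ENTREZID",
                  multiVals = "first")
symbols
```

The result is a character vector named by Entrez ID, so it can be indexed by the names of the GRanges. ChIPpeakAnno also provides addGeneIDs for adding IDs to annotated peaks; see its help page for the supported ID types.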

written 13 months ago by Martin Morgan