How to extract BOTH exon and matched transcript ID from TxDb.Hsapiens.UCSC.hg19.knownGene database?
@shinhengchiou-15264
Hi,
As the title suggests, I wonder whether I can extract exon information (e.g. through exonsBy) that gives each exon's genomic coordinates, exon ID, and the ID of the gene it belongs to (by = "gene") and, on top of all that, the ID of the transcript each exon belongs to?
Thank you very much!
Shin
txdb.hsapiens.ucsc.hg19.knowngene
@james-w-macdonald-5106
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
> ex <- exonsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, by = "tx", use.names = TRUE)
> exgr <- unlist(ex)
> mcols(exgr)$txid <- names(exgr)
> mcols(exgr)$geneid <- mapIds(TxDb.Hsapiens.UCSC.hg19.knownGene, names(exgr), "GENEID","TXNAME")
'select()' returned 1:1 mapping between keys and columns
> exgr
GRanges object with 742493 ranges and 5 metadata columns:
                   seqnames         ranges strand |   exon_id   exon_name
                      <Rle>      <IRanges>  <Rle> | <integer> <character>
  uc001aaa.3           chr1 [11874, 12227]      + |         1        <NA>
  uc001aaa.3           chr1 [12613, 12721]      + |         3        <NA>
  uc001aaa.3           chr1 [13221, 14409]      + |         5        <NA>
  uc010nxq.1           chr1 [11874, 12227]      + |         1        <NA>
  uc010nxq.1           chr1 [12595, 12721]      + |         2        <NA>
         ...            ...            ...    ... .       ...         ...
  uc011mgv.2 chrUn_gl000241 [22732, 22846]      - |    289961        <NA>
  uc011mgv.2 chrUn_gl000241 [20433, 20481]      - |    289960        <NA>
  uc011mgw.1 chrUn_gl000243 [11501, 11530]      + |    289967        <NA>
  uc022brq.1 chrUn_gl000243 [13608, 13637]      + |    289968        <NA>
  uc022brr.1 chrUn_gl000247 [ 5787,  5816]      - |    289969        <NA>
             exon_rank        txid      geneid
             <integer> <character> <character>
  uc001aaa.3         1  uc001aaa.3   100287102
  uc001aaa.3         2  uc001aaa.3   100287102
  uc001aaa.3         3  uc001aaa.3   100287102
  uc010nxq.1         1  uc010nxq.1   100287102
  uc010nxq.1         2  uc010nxq.1   100287102
         ...       ...         ...         ...
  uc011mgv.2         6  uc011mgv.2        <NA>
  uc011mgv.2         7  uc011mgv.2        <NA>
  uc011mgw.1         1  uc011mgw.1        <NA>
  uc022brq.1         1  uc022brq.1        <NA>
  uc022brr.1         1  uc022brr.1        <NA>
  -------
  seqinfo: 93 sequences (1 circular) from hg19 genome
>
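The transcript above groups exons by transcript, unlists them, and then maps each transcript name to a gene ID. A possible alternative, offered only as a sketch, is to ask exons() for the extra columns directly. This returns one range per unique exon, with the transcript names and gene IDs held as list-like columns because a single exon can belong to several transcripts. The object names txdb, ex_all, n_tx and ex_by_tx are illustrative and not part of the original answer.

```r
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

# One range per unique exon; tx_name and gene_id are list-like columns
# (an exon may be shared by several transcripts).
ex_all <- exons(txdb, columns = c("exon_id", "tx_name", "gene_id"))

# Optionally expand to one row per exon/transcript pair.
n_tx <- lengths(mcols(ex_all)$tx_name)              # transcripts per exon
ex_by_tx <- ex_all[rep(seq_along(ex_all), n_tx)]    # repeat each exon n_tx times
mcols(ex_by_tx)$tx_name <- unlist(mcols(ex_all)$tx_name, use.names = FALSE)
```

Whether this is preferable depends on what you need downstream: the exonsBy() route keeps the exon rank within each transcript, which this sketch does not.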