Hello,
I have a list of transcripts (from an RNA-seq quantification output) that I would like to annotate. For each transcript I would like the gene name, the NM ID or some other information about the mRNA, the chromosome, and the start and end coordinates, as in this example:
ucscid | gene | mrna | refseq | ucscid | chr | beg | end |
uc001yee.1 | AK127179 | AK127179 | | uc001yee.1 | chr14 | 95643819 | 95646270 |
uc010hxc.3 | MFN1 | U95822 | NM_033540 | uc010hxc.3 | chr3 | 179080145 | 179111008 |
uc021xcy.1 | GOLGB1 | AB593126 | | uc021xcy.1 | chr3 | 121382047 | 121468602 |
uc010jai.3 | LOC644936 | NR_004845 | | uc010jai.3 | chr5 | 79594916 | 79596297 |
uc001lkt.3 | PPP2R2D | BC045531 | | uc001lkt.3 | chr10 | 133747959 | 133770053 |
uc002wgt.4 | EBF4 | NM_001110514 | NM_001110514 | uc002wgt.4 | chr20 | 2673523 | 2740754 |
uc001kxg.4 | CALHM3 | NM_001129742 | NM_001129742 | uc001kxg.4 | chr10 | 105232560 | 105238997 |
I tried querying the TxDb object with a SQL-like select(), but the resulting annotation (a) only has the gene ID, not the gene name, and (b) returns multiple rows per transcript instead of one.
Question: Should I be using a different hg19 object or a different type of query? Any advice is appreciated.
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
columns(txdb)   # check which columns are available
keytypes(txdb)
x <- c("uc001aal.1", "uc001aaa.3", "uc001aae.4")   # example input
cols <- columns(txdb)
m <- select(txdb, keys = x, columns = cols, keytype = "TXNAME")   # returns one-to-many rows
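A minimal sketch of one possible direction, not a confirmed solution. Requesting every column pulls in the exon- and CDS-level columns, which is what expands each transcript into many rows, so the sketch asks only for transcript-level columns. It also assumes that the org.Hs.eg.db package (not used in the code above) is an acceptable source for gene symbols, and that the GENEID values in this TxDb are Entrez gene IDs. It does not attempt to fill in the mRNA or RefSeq accession columns.

```r
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(org.Hs.eg.db)   # assumption: used here only to look up gene symbols

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
x <- c("uc001aal.1", "uc001aaa.3", "uc001aae.4")   # example input

# Ask only for transcript-level columns; exon/CDS columns multiply the rows.
tx <- AnnotationDbi::select(
  txdb,
  keys    = x,
  columns = c("GENEID", "TXCHROM", "TXSTART", "TXEND"),
  keytype = "TXNAME"
)

# Map Entrez gene IDs to gene symbols, skipping transcripts with no gene ID.
has_gene  <- !is.na(tx$GENEID)
tx$SYMBOL <- NA_character_
tx$SYMBOL[has_gene] <- AnnotationDbi::mapIds(
  org.Hs.eg.db,
  keys      = tx$GENEID[has_gene],
  column    = "SYMBOL",
  keytype   = "ENTREZID",
  multiVals = "first"   # keep a single symbol per gene ID
)

tx
```

The calls are written as AnnotationDbi::select() and AnnotationDbi::mapIds() so that they are not masked by a select() from another attached package such as dplyr.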
Thanks, K