Paul Leo
id2name(txdb, feature.type="cds") and id2name(txdb,
feature.type="exon") both return all NAs for Ensembl and RefSeq.
Perhaps the cds_ids don't have names? But the exon result is strange
for Ensembl.
Using the.cds <- cds(txdb, columns=c("cds_id","tx_id","tx_name")) takes a
*VERY* long time, but perhaps it is not intended for use on a
whole-genome scale (often)?
I am looking for a quick way to map cds_ids or exon_ids to exon names
etc., so I can complete the annotations with biomaRt when needed.
> txdb
TranscriptDb object:
| Db type: TranscriptDb
| Data source: UCSC
| Genome: hg19
| UCSC Table: ensGene
| Type of Gene ID: Ensembl gene ID
| Full dataset: yes
| transcript_nrow: 151222
| exon_nrow: 470051
| cds_nrow: 264558
| Db created by: GenomicFeatures package from Bioconductor
| Creation time: 2010-09-24 11:00:14 +1000 (Fri, 24 Sep 2010)
| GenomicFeatures version at creation time: 1.1.12
| RSQLite version at creation time: 0.9-2
> the.cds<-cds(txdb)
> the.cds
GRanges with 264558 ranges and 1 elementMetadata value
seqnames ranges strand | cds_id
<rle> <iranges> <rle> | <integer>
[1] chr1 [ 69091, 70008] + | 10762
[2] chr1 [367659, 368597] + | 10763
[3] chr1 [721406, 721912] + | 10765
[4] chr1 [861322, 861393] + | 10766
[5] chr1 [865535, 865716] + | 10767
[6] chr1 [865692, 865716] + | 10782
[7] chr1 [866419, 866469] + | 10768
[8] chr1 [871152, 871173] + | 10772
[9] chr1 [871152, 871276] + | 10769
... ... ... ... ... ...
[264550] chrY [26951104, 26951167] - | 139000
[264551] chrY [26951604, 26951655] - | 139001
[264552] chrY [26952216, 26952307] - | 139002
[264553] chrY [26952582, 26952728] - | 139003
[264554] chrY [26959330, 26959332] - | 139004
[264555] chrY [27184245, 27184263] - | 139018
[264556] chrY [27184956, 27185061] - | 139019
[264557] chrY [27187916, 27188033] - | 139020
[264558] chrY [27190093, 27190170] - | 139021
seqlengths
chr1 chr2 ... chr18_gl000207_random
249250621 243199373 ... 4262
> ?id2name
> cds.id.to.name<-id2name(txdb, feature.type="cds")
> length(cds.id.to.name)
[1] 264558
> sum(!is.na(cds.id.to.name))
[1] 0 ## ALL NA's
> exon.id.to.name<-id2name(txdb, feature.type="exon")
> exon.id.to.name[40000:40100]
40000 40001 40002 40003 40004 40005 40006 40007 40008 40009 40010 40011 40012
   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
40013 40014 40015 40016 40017 40018 40019 40020 40021 40022 40023 40024 40025
   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
40026 40027 40028 40029 40030 40031 40032 40033 40034 40035 40036 40037 40038
   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
40039 40040 40041 40042 40043 40044 40045 40046 40047 40048 40049 40050 40051
   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
40052 40053 40054 40055 40056 40057 40058 40059 40060 40061 40062 40063 40064
   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
40065 40066 40067 40068 40069 40070 40071 40072 40073 40074 40075 40076 40077
   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
40078 40079 40080 40081 40082 40083 40084 40085 40086 40087 40088 40089 40090
   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
40091 40092 40093 40094 40095 40096 40097 40098 40099 40100
   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
> length(exon.id.to.name)
[1] 470051
> sum(!is.na(exon.id.to.name))
[1] 0
> tx.id.to.n
################# they are all missing same is true for
> sessionInfo()
R version 2.13.0 Under development (unstable) (2010-09-20 r52949)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_AU.UTF-8 LC_COLLATE=en_AU.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_AU.UTF-8
[7] LC_PAPER=en_AU.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] BSgenome.Hsapiens.UCSC.hg19_1.3.16 BSgenome_1.17.7
[3] Biostrings_2.17.47 GenomicFeatures_1.1.12
[5] GenomicRanges_1.1.25 IRanges_1.7.34
[7] biomaRt_2.5.1
loaded via a namespace (and not attached):
[1] Biobase_2.9.1 DBI_0.2-5 RCurl_1.4-3 RSQLite_0.9-2
[5] rtracklayer_1.9.9 tools_2.13.0 XML_3.1-1
>