TxDb basic question
Brian Smith ▴ 120
@brian-smith-6197
United States

Hi,

I was exploring the rhesus macaque package:

> TxDb.Mmulatta.UCSC.rheMac8.refGene
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: UCSC rheMac8 refGene
# Organism: Macaca mulatta
# Taxonomy ID: 9544
# miRBase build ID: NA
# Genome: NA
# transcript_nrow: 6378
# exon_nrow: 46356
# cds_nrow: 43281
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2017-10-02 12:20:15 -0400 (Mon, 02 Oct 2017)
# GenomicFeatures version at creation time: 1.29.11
# RSQLite version at creation time: 2.0
# DBSCHEMAVERSION: 1.1
# Resource URL: http://genome.ucsc.edu/cgi-bin/hgTables

 

Does "# transcript_nrow: 6378" mean that this package has information for at most 6378 genes? Sorry, I'm not familiar with this organism, but doesn't 6378 genes seem a bit on the small side, considering it is for the entire genome?

Or have I misunderstood something?

many thanks!

txdb TxDb.Mmulatta.UCSC.rheMac8.refGene • 1.2k views
@danielvantwisk-13028

It appears that the transcript_nrow is correct.  This link points us to the RefSeq Genes table from UCSC:

https://genome.ucsc.edu/cgi-bin/hgTables?db=rheMac8&hgta_group=genes&hgta_track=refGene&hgta_table=refGene&hgta_doSchema=describe+table+schema

Given that each row in the table accounts for one transcript, the 6484 rows in that table are very close to the 6378 transcripts in question.  The remaining hundred-or-so transcripts may have been excluded from TxDb.Mmulatta.UCSC.rheMac8.refGene for failing various constraints, such as a CDS whose length is not a multiple of three.
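
Note that transcript_nrow counts transcripts, not genes, so the number of genes has to be checked separately. A minimal sketch of how one might inspect both counts from R, assuming the GenomicFeatures and TxDb.Mmulatta.UCSC.rheMac8.refGene packages are installed (the variable names txdb, tx and tx_by_gene are just illustrative):

```r
# Assumption: both Bioconductor packages below are already installed.
library(GenomicFeatures)
library(TxDb.Mmulatta.UCSC.rheMac8.refGene)

txdb <- TxDb.Mmulatta.UCSC.rheMac8.refGene

# One range per transcript; this should agree with transcript_nrow
# in the printout of the TxDb object.
tx <- transcripts(txdb)
length(tx)

# Group transcripts by gene ID. The length of this list is the number
# of gene IDs that have at least one transcript in the package.
tx_by_gene <- transcriptsBy(txdb, by = "gene")
length(tx_by_gene)

# How many genes have 1, 2, 3, ... transcripts.
table(lengths(tx_by_gene))
```

Comparing length(tx) with length(tx_by_gene) shows how far the transcript count and the gene count differ for this annotation.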
