Question: Converting exon counts (coordinates) to transcript/gene IDs
16 months ago by
Shrule0 wrote:


I have a question regarding the input data for DEXSeq. I'm working with level 3 TCGA exon_quantification data, which looks like this:

Hybridization REF TCGA-3C-AAAU-01A-11R-A41B-07 TCGA-3C-AALI-01A-11R-A41B-07 TCGA-3C-AALJ-01A-31R-A41B-07

exon raw_counts raw_counts raw_counts raw_counts

chr1:11874-12227:+ 29 23 18 2

chr1:12595-12721:+ 7 10 1 0

chr1:12613-12721:+ 7 10 1 0

chr1:12646-12697:+ 6 5 1 0

I have the raw counts for each exon. The problem is turning this into something I can use in DEXSeq. I don't have access to any of the original SAM/BAM files; my dataset only has raw counts, which I think I can use in DEXSeq, but I'm stuck on converting the exon coordinates into transcript IDs. I have a GTF file downloaded. I was wondering if anyone has any suggestions on how to do this, or knows of any Bioconductor tools that will give me the output I need.


Thanks in advance.
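
A minimal sketch of one way to do the lookup described above: index the `exon` records of a GTF by (chromosome, start, end, strand) and look up each TCGA-style exon string against that index. This is an illustration in plain Python, not a DEXSeq or Bioconductor function. It assumes the GTF uses the same genome build, chromosome naming and 1-based inclusive coordinates as the exon strings, and that only exact matches are wanted. The helper names, the toy GTF lines and the IDs `GENE_A`, `TX_A1` and `TX_A2` are made up for the example.

```python
import re
from collections import defaultdict


def parse_gtf_attributes(field):
    """Turn 'gene_id "X"; transcript_id "Y";' into a dict."""
    return dict(re.findall(r'(\S+) "([^"]*)"', field))


def index_gtf_exons(gtf_lines):
    """Map (chrom, start, end, strand) -> set of (gene_id, transcript_id)."""
    index = defaultdict(set)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue  # only exon records carry the coordinates we need
        attrs = parse_gtf_attributes(cols[8])
        key = (cols[0], int(cols[3]), int(cols[4]), cols[6])
        index[key].add((attrs.get("gene_id"), attrs.get("transcript_id")))
    return index


def parse_exon_id(exon_id):
    """Split 'chr1:11874-12227:+' into (chrom, start, end, strand)."""
    chrom, span, strand = exon_id.split(":")
    start, end = span.split("-")
    return chrom, int(start), int(end), strand


# Toy annotation with hypothetical IDs; a real run would iterate over
# the lines of the downloaded GTF file instead.
toy_gtf = [
    'chr1\ttoy\texon\t11874\t12227\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A1";',
    'chr1\ttoy\texon\t12613\t12721\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A1";',
    'chr1\ttoy\texon\t12613\t12721\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A2";',
]
exon_index = index_gtf_exons(toy_gtf)

for exon_id in ["chr1:11874-12227:+", "chr1:12613-12721:+", "chr1:12646-12697:+"]:
    hits = sorted(exon_index.get(parse_exon_id(exon_id), set()))
    print(exon_id, hits if hits else "no exact match in annotation")
```

An exon shared by several transcripts maps to several transcript IDs, so the result is a set rather than a single ID. Exon strings with no exact match are reported rather than dropped, since those may be the ones of interest.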


Why not download transcript-level data instead, e.g. from ? It has already been compiled, and you can be confident that there aren't any missing exons, etc.

written 16 months ago by biomiha10

I probably should have mentioned that the reason I'm working with exon-level data is that I'm looking at a novel transcript. I have the chromosome coordinates for the novel transcript, so the idea was to add the novel transcript to the GTF file and use DEXSeq to look at expression levels.

written 16 months ago by Shrule0
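
The idea of adding a novel transcript to the GTF could be sketched as follows: format each exon of the novel transcript as a standard nine-column GTF `exon` record and append those lines to the annotation. This is only an illustration in plain Python. The coordinates, the IDs `NOVEL_GENE` and `NOVEL_TX1`, the `source` label and the function name are all hypothetical placeholders, since the thread does not give the real ones.

```python
def exon_to_gtf_line(chrom, start, end, strand, gene_id, transcript_id, source="novel"):
    """Format one exon as a tab-separated, nine-column GTF record."""
    attributes = f'gene_id "{gene_id}"; transcript_id "{transcript_id}";'
    fields = [
        chrom,       # seqname
        source,      # source label (free text)
        "exon",      # feature type
        str(start),  # start, 1-based inclusive
        str(end),    # end, 1-based inclusive
        ".",         # score (not used)
        strand,      # "+" or "-"
        ".",         # frame (not used for exon records)
        attributes,  # gene and transcript identifiers
    ]
    return "\t".join(fields)


# Hypothetical exons for the novel transcript: (chrom, start, end, strand).
novel_exons = [
    ("chr1", 1000, 1200, "+"),
    ("chr1", 1500, 1650, "+"),
]

novel_gtf_lines = [
    exon_to_gtf_line(chrom, start, end, strand, "NOVEL_GENE", "NOVEL_TX1")
    for chrom, start, end, strand in novel_exons
]

for line in novel_gtf_lines:
    print(line)
```

The printed lines can be appended to a copy of the GTF. If the novel exons belong to a gene that is already annotated, using that gene's existing `gene_id` keeps them grouped with its other exons.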