Question: Converting exon counts (coordinates) to transcript/gene IDs
5 months ago by
Shrule0 wrote:


I have a question regarding the input data for DEXSeq. I'm working with level 3 TCGA exon_quantification data, which looks like this:

Hybridization REF    TCGA-3C-AAAU-01A-11R-A41B-07  TCGA-3C-AALI-01A-11R-A41B-07  TCGA-3C-AALJ-01A-31R-A41B-07
exon                 raw_counts  raw_counts  raw_counts  raw_counts
chr1:11874-12227:+   29          23          18          2
chr1:12595-12721:+   7           10          1           0
chr1:12613-12721:+   7           10          1           0
chr1:12646-12697:+   6           5           1           0

I have the raw counts for each exon. The problem is turning this into something I can use in DEXSeq. I don't have access to any of the original SAM/BAM files; my dataset has only raw counts, which I think DEXSeq can take, but I need to convert the exon coordinates into transcript IDs. I have downloaded a GTF file. Does anyone have suggestions on how to do this, or know of any Bioconductor tools that will give me the output I need?


Thanks in advance.
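
To make the coordinate-to-ID step concrete, here is a minimal sketch of one way it could be done: index the exon records of the GTF by (chromosome, start, end, strand) and look each count-table key up in that index. It is written in Python with the standard library only rather than as a Bioconductor workflow. The function names and the two GTF lines are hypothetical illustration data, and the sketch assumes the GTF uses the same genome build and the same 1-based inclusive coordinates as the count table.

```python
import re

# Hypothetical helpers for illustration; not part of DEXSeq or any library.
COORD_RE = re.compile(r"^(chr[^:]+):(\d+)-(\d+):([+-])$")
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_exon_key(key):
    """Split a key such as 'chr1:11874-12227:+' into (chrom, start, end, strand)."""
    match = COORD_RE.match(key)
    if match is None:
        raise ValueError(f"unrecognised exon key: {key!r}")
    chrom, start, end, strand = match.groups()
    return chrom, int(start), int(end), strand


def index_gtf_exons(gtf_lines):
    """Map (chrom, start, end, strand) to a set of (gene_id, transcript_id) pairs."""
    index = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon features are of interest here
        attrs = dict(ATTR_RE.findall(fields[8]))
        key = (fields[0], int(fields[3]), int(fields[4]), fields[6])
        pair = (attrs.get("gene_id"), attrs.get("transcript_id"))
        index.setdefault(key, set()).add(pair)
    return index


def annotate_exon(key, index):
    """Return the sorted (gene_id, transcript_id) pairs whose exon matches exactly."""
    return sorted(index.get(parse_exon_key(key), set()))


# Made-up GTF records: the IDs are placeholders, not real annotation.
EXAMPLE_GTF = [
    'chr1\texample\texon\t11874\t12227\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A1";',
    'chr1\texample\texon\t12613\t12721\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A2";',
]

exon_index = index_gtf_exons(EXAMPLE_GTF)
hits = annotate_exon("chr1:11874-12227:+", exon_index)
misses = annotate_exon("chr1:12646-12697:+", exon_index)
```

An exact-match lookup like this only works if the GTF is the annotation the counts were generated against; keys that come back empty would point to a build or annotation mismatch rather than to a missing exon.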


Why not download transcript level data instead, e.g. from ? It's all already been compiled and you can be confident that there aren't any missing exons etc...

written 5 months ago by biomiha10

I probably should have mentioned: the reason I'm working with exon-level data is that I'm looking at a novel transcript. I have the chromosomal coordinates for the novel transcript, so the idea was to annotate the GTF file with the novel transcript and use DEXSeq to look at expression levels.
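
As an illustration of what annotating the GTF with a novel transcript might involve, the sketch below builds GTF exon records from a list of exon coordinates so they can be appended to an existing annotation file. The function name, the IDs, and the coordinates are all hypothetical placeholders (the real coordinates are not given in this thread), and numbering exons in transcript order is an assumed convention.

```python
def novel_transcript_gtf(chrom, strand, exons, gene_id, transcript_id, source="novel"):
    """Build GTF exon lines for one transcript.

    `exons` is a list of (start, end) pairs, 1-based and inclusive.
    Exons are numbered in transcript order, so the order is reversed
    on the minus strand (an assumed convention, not a GTF requirement).
    """
    if strand not in ("+", "-"):
        raise ValueError(f"strand must be '+' or '-', got {strand!r}")
    ordered = sorted(exons, reverse=(strand == "-"))
    lines = []
    for number, (start, end) in enumerate(ordered, start=1):
        if start > end:
            raise ValueError(f"exon start {start} is after end {end}")
        attrs = (
            f'gene_id "{gene_id}"; transcript_id "{transcript_id}"; '
            f'exon_number "{number}";'
        )
        fields = [chrom, source, "exon", str(start), str(end), ".", strand, ".", attrs]
        lines.append("\t".join(fields))
    return lines


# Placeholder coordinates and IDs, purely for illustration.
novel_lines = novel_transcript_gtf(
    chrom="chr1",
    strand="+",
    exons=[(2000, 2100), (1000, 1200)],
    gene_id="NOVEL_GENE",
    transcript_id="NOVEL_TX1",
)
```

The resulting lines could be appended to a copy of the downloaded GTF. If the novel exons are absent from the TCGA exon table, there will be no counts for them, so it is worth checking that first.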

written 5 months ago by Shrule0