Making a GFF file for introns to be used with DEXSeq
gv ▴ 40
@gv-6516
United States

Hi Everyone,

I am trying to use DEXSeq to find intron retention events in my RNA-seq data. As a first step, I want the read counts associated with each intron. I made a test GFF file and ran the dexseq_count.py script on it, but it did not produce any counts for the introns. Here is what my test.gff file for introns looks like:

chrI    .    intron    87388    87500    .    +    .    ID=intron1
chrI    .    intron    139188    139218    .    +    .    ID=intron2
chrI    .    intron    142254    142619    .    +    .    ID=intron3
chrI    .    intron    151007    151096    .    -    .    ID=intron4
chrI    .    intron    181179    181210    .    +    .    ID=intron5

Can anyone tell me what I have to change in this format? GTF files contain exon information, so how can I make an intron GFF file from one?

Also, once I have the counts, should I just run the DEXSeq package downstream as we do for exons? Hope to hear from you soon.

 

Thanks in advance.

Regards,
Varun

@herve-pages-1542
Seattle, WA, United States

Hi Varun,

I'm not familiar with DEXSeq, but it seems that the dexseq_count.py script expects a GFF file with features of type exonic_part, which is why you get zero counts with your test.gff file. A quick glance at the vignette suggests that you need to start with a GTF file (containing features of type exon) and use the dexseq_prepare_annotation.py script to extract the "exonic parts", that is, to turn it into a GFF file containing exonic_part features. So if you start with a GTF file that contains only intron features, dexseq_prepare_annotation.py will produce an empty GFF file.
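A quick way to check which feature types a GFF or GTF file actually contains is to tally its third column. The snippet below is only a minimal sketch: count_feature_types is a made-up helper name, not part of DEXSeq, and the two records are copied from the test.gff in the question.

```python
from collections import Counter


def count_feature_types(gff_lines):
    """Tally the third column (feature type) of GFF/GTF records."""
    counts = Counter()
    for line in gff_lines:
        line = line.strip()
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        # GFF is tab-delimited; fall back to any whitespace for pasted text.
        fields = line.split("\t") if "\t" in line else line.split()
        if len(fields) < 3:
            continue
        counts[fields[2]] += 1
    return counts


# Two records copied from the test.gff shown in the question.
test_gff = [
    "chrI\t.\tintron\t87388\t87500\t.\t+\t.\tID=intron1",
    "chrI\t.\tintron\t139188\t139218\t.\t+\t.\tID=intron2",
]

types = count_feature_types(test_gff)
print(dict(types))
print("has exonic_part features:", "exonic_part" in types)
```

For a real file, passing an open file handle (for example open("test.gff")) in place of the list should work the same way, since the function only iterates over lines.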

Instead of trying to get the intron counts by running the dexseq_count.py script on a GFF or GTF file, it might be simpler to do this from within R. The "Preprocessing within R" section of the DEXSeq vignette explains how to do this for the exon counts. To do the same for introns, you just need to replace the disjointExons(hse, aggregateGenes=FALSE) call with intronicParts(hse, linked.to.single.gene.only=TRUE).
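If it helps to see where intron coordinates come from, here is a rough Python sketch of the basic idea: within one transcript, introns are the gaps between consecutive exons. This is not how intronicParts() is implemented (that function also has to deal with multiple transcripts and genes), and the resulting intron records would still not be counted by dexseq_count.py as they are. The function names, the transcript and gene IDs, and the exon coordinates are all made up for illustration; the exons were chosen so that the gap matches intron1 from the question.

```python
import re


def parse_exons(gtf_lines):
    """Collect exon intervals per transcript from GTF records.

    Returns {transcript_id: (chrom, strand, [(start, end), ...])}.
    """
    transcripts = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        entry = transcripts.setdefault(match.group(1), (fields[0], fields[6], []))
        entry[2].append((int(fields[3]), int(fields[4])))
    return transcripts


def introns_from_exons(exons):
    """Return the gaps between consecutive exons (1-based, inclusive).

    Assumes the exons of a single transcript do not overlap each other.
    """
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


def intron_gff_lines(transcripts):
    """Format introns as GFF lines in the style of the question's test.gff."""
    lines = []
    for tx_id, (chrom, strand, exons) in sorted(transcripts.items()):
        # Introns are numbered by genomic position, not transcription order.
        for n, (start, end) in enumerate(introns_from_exons(exons), 1):
            lines.append("\t".join(
                [chrom, ".", "intron", str(start), str(end),
                 ".", strand, ".", f"ID={tx_id}.intron{n}"]))
    return lines


# Hypothetical two-exon transcript; IDs and coordinates are illustrative only.
example_gtf = [
    'chrI\texample\texon\t87286\t87387\t.\t+\t.\tgene_id "gene1"; transcript_id "tx1";',
    'chrI\texample\texon\t87501\t87752\t.\t+\t.\tgene_id "gene1"; transcript_id "tx1";',
]

for gff_line in intron_gff_lines(parse_exons(example_gtf)):
    print(gff_line)
```

Note that introns computed per transcript like this can overlap each other, or exons of other transcripts, when a gene has several isoforms, which is one reason doing this within R is likely the easier route.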

Some important notes about intronicParts():

  1. intronicParts() is a function I just added to GenomicFeatures 1.27.10 (devel), so to use it you would need to upgrade your installation to Bioc devel (Bioc devel will be released next month as BioC 3.5 and requires R 3.4).
  2. You need to allow about 24h for GenomicFeatures 1.27.10 to propagate to our public repositories and become available via biocLite().
  3. intronicParts() is not documented yet, but will be as soon as possible.

Hope this helps,

H.
