Generate 3' UTR and 5' UTR ranges from a gff file
hwu12 ▴ 10
@hwu12-12353
Last seen 4 weeks ago
United States

I am working on a GFF file that is missing the 5' UTR and 3' UTR information. For example:

ctg123 . gene      1050  9000  .  +  .  ID=gene00001;Name=EDEN
ctg123 . mRNA      1050  9000  .  +  .  ID=mRNA00001;Parent=gene00001;Name=EDEN.1
ctg123 . exon      1050  1500  .  +  .  ID=exon00002;Parent=mRNA00001
ctg123 . exon      3000  3902  .  +  .  ID=exon00003;Parent=mRNA00001
ctg123 . exon      5000  5500  .  +  .  ID=exon00004;Parent=mRNA00001
ctg123 . exon      7000  9000  .  +  .  ID=exon00005;Parent=mRNA00001
ctg123 . CDS       1201  1500  .  +  0  ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
ctg123 . CDS       3000  3902  .  +  0  ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
ctg123 . CDS       5000  5300  .  +  0  ID=cds00001;Parent=mRNA00001;Name=edenprotein.1

Is there a way to generate rows with the 5'UTR and 3'UTR ranges? Many thanks!

gff genomicranges • 2.4k views
@michael-lawrence-3846
Last seen 14 months ago
United States

To get the ranges of the UTRs, as a GRangesList or GRanges object:

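library(rtracklayer)
# Import the annotation as a GRanges object
gtf <- import.gff3("tmp.gtf")
tx <- subset(gtf, type == "mRNA")
cds <- subset(gtf, type == "CDS")
# Collapse the CDS segments of each transcript into one spanning range
cds <- range(multisplit(cds, cds$Parent))
# Subtract the CDS span from each transcript, leaving the flanking regions
utrs <- psetdiff(tx, cds[tx$ID])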

Thanks so much, Michael. This method generates the UTR ranges efficiently. However, is it possible to split them further into 5' UTR and 3' UTR?
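
One possible way to make that split, shown as a minimal sketch rather than a tested pipeline: take the exonic bases not covered by CDS, then label each piece by which side of the CDS it falls on, flipping the labels on the minus strand. The coordinates are hard-coded from the example transcript mRNA00001 in the question, and the object names (`exons`, `cds`, `utr`, `cds_span`) and the `type` labels are chosen here for illustration only.

```r
library(GenomicRanges)

# Exons and CDS segments of the example transcript (plus strand),
# copied from the GFF rows in the question
exons <- GRanges("ctg123",
                 IRanges(start = c(1050, 3000, 5000, 7000),
                         end   = c(1500, 3902, 5500, 9000)),
                 strand = "+")
cds <- GRanges("ctg123",
               IRanges(start = c(1201, 3000, 5000),
                       end   = c(1500, 3902, 5300)),
               strand = "+")

# Exonic bases that are not coding are UTR (introns are excluded
# because we start from exons, not from the whole mRNA range)
utr <- setdiff(exons, cds)

# Single range spanning the first to the last CDS base
cds_span <- range(cds)

# A piece at lower coordinates than the CDS is 5' on the plus strand
# and 3' on the minus strand
lower_side <- end(utr) < start(cds_span)
is_plus <- as.character(strand(cds_span)) == "+"
utr$type <- ifelse(lower_side == is_plus, "five_prime_UTR", "three_prime_UTR")

utr
```

For this transcript the sketch labels 1050-1200 as the 5' UTR and 5301-5500 plus 7000-9000 as the 3' UTR. It assumes one transcript at a time with all features on the same strand; for a whole annotation the same steps would need to be applied per transcript, for example after splitting exons and CDS by their Parent attribute.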

arfranco ▴ 130
@arfranco-8341
Last seen 19 hours ago
European Union

It depends on whether this is a model organism. If it is, try BioMart, where you can export the annotation you need.

Another possibility is to convert this GTF file to BED and use bedtools to get the same answer. The bedtools tutorials and help pages explain how to do this.

Hi arfranco, could you please be more specific about how to use bedtools to get the UTR rows? I spent a lot of time looking, but it seems that bedtools cannot generate the 5' UTR and 3' UTR ranges for me. Thanks!