Generate 3' UTR and 5' UTR ranges from a gff file
hwu12 ▴ 10
@hwu12-12353
Last seen 14 months ago
United States

I am working on a GFF file that is missing the 5' UTR and 3' UTR information. For example:

  ctg123 . gene      1050  9000  .  +  .  ID=gene00001;Name=EDEN
  ctg123 . mRNA      1050  9000  .  +  .  ID=mRNA00001;Parent=gene00001;Name=EDEN.1
  ctg123 . exon      1050  1500  .  +  .  ID=exon00002;Parent=mRNA00001
  ctg123 . exon      3000  3902  .  +  .  ID=exon00003;Parent=mRNA00001
  ctg123 . exon      5000  5500  .  +  .  ID=exon00004;Parent=mRNA00001
  ctg123 . exon      7000  9000  .  +  .  ID=exon00005;Parent=mRNA00001
  ctg123 . CDS       1201  1500  .  +  0  ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
  ctg123 . CDS       3000  3902  .  +  0  ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
  ctg123 . CDS       5000  5300  .  +  0  ID=cds00001;Parent=mRNA00001;Name=edenprotein.1

Is there a way to generate rows with the 5' UTR and 3' UTR ranges? Many thanks!

gff genomicranges • 3.3k views
@michael-lawrence-3846
Last seen 2.3 years ago
United States

To get the ranges of the UTRs, as a GRangesList or GRanges object:

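library(rtracklayer)
# Import the annotation as a GRanges object
gtf <- import.gff3("tmp.gtf")
tx <- subset(gtf, type == "mRNA")
cds <- subset(gtf, type == "CDS")
# Collapse the CDS segments of each transcript into one spanning range
cds <- range(multisplit(cds, cds$Parent))
# Subtract each transcript's CDS span from the transcript range;
# what remains are the flanking UTR regions (including any introns within them)
utrs <- psetdiff(tx, cds[tx$ID])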


Thanks so much, Michael. This method generates the UTR ranges efficiently. However, is it possible to split them further into 5' UTR and 3' UTR?
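One possible way to make that split, shown here only as a minimal sketch for a single transcript: take the exonic bases not covered by CDS as UTR, then assign each piece to the 5' or 3' side depending on whether it lies before or after the coding region, swapping the two sides on the minus strand. The coordinates below are typed in by hand from the example in the question, and the variable names (`exons`, `cds_span`, `left`, `right`, and so on) are illustrative, not part of any package. It assumes the transcript has at least one CDS row and a defined strand.

```r
library(GenomicRanges)

# Exons and CDS segments of mRNA00001, copied from the example in the question
exons <- GRanges("ctg123",
                 IRanges(start = c(1050, 3000, 5000, 7000),
                         end   = c(1500, 3902, 5500, 9000)),
                 strand = "+")
cds <- GRanges("ctg123",
               IRanges(start = c(1201, 3000, 5000),
                       end   = c(1500, 3902, 5300)),
               strand = "+")

# Exonic bases that are not coding are UTR (introns are excluded this way)
utr <- setdiff(exons, cds)

# Genomic span of the coding region, from first to last CDS base
cds_span <- range(cds)

# UTR pieces to the left and right of the coding region in genomic coordinates
left  <- utr[end(utr) < start(cds_span)]
right <- utr[start(utr) > end(cds_span)]

# On the plus strand the 5' UTR is on the left; on the minus strand it is on the right
on_plus <- as.character(strand(cds_span)) == "+"
five_prime_utr  <- if (on_plus) left else right
three_prime_utr <- if (on_plus) right else left

five_prime_utr   # ctg123:1050-1200
three_prime_utr  # ctg123:5301-5500 and ctg123:7000-9000
```

For a whole annotation file the same logic would have to be applied per transcript, for example by grouping the exon and CDS rows on their Parent attribute first.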

arfranco ▴ 130
@arfranco-8341
Last seen 9 months ago
European Union

It depends on whether it is a model organism or not. If so, try BioMart, where you can generate whatever you need.

Another possibility is to convert this GFF file to BED and use bedtools to get the same answer. The bedtools tutorials and help pages explain how to do this.


Hi arfranco, could you please be more specific about how to use bedtools to get the UTR rows? I spent a lot of time looking, but it seems that bedtools cannot generate the 5' UTR and 3' UTR ranges for me. Thanks!
