
Question: Extracting UTRs from exon and CDS data
Asked 18 months ago by rubi:

Hi,

I have a data.frame of the exons of each transcript and a corresponding data.frame of the CDS intervals:

exon.df <- data.frame(id=c(rep("id1",4),rep("id2",3),rep("id3",5)),
                      start=c(10,20,30,40,100,200,300,1000,2000,3000,4000,5000),
                      end=c(15,25,35,45,150,250,350,1500,2500,3500,4500,5500))


cds.df <- data.frame(id=c(rep("id1",3),rep("id2",3),rep("id3",3)),
                      start=c(20,30,40,125,200,300,2250,3000,4000),
                      end=c(25,35,45,150,250,325,2500,3500,4250))

 

I would like to extract the 5' and 3' UTRs of each transcript from these data. For this example, the expected output is:

utr5.df <- data.frame(id=c("id1","id2","id3","id3"),
                     start=c(10,100,1000,2000),
                     end=c(15,124,1500,2249))

utr3.df <- data.frame(id=c("id2","id3","id3"),
                     start=c(326,4251,5000),
                     end=c(350,4500,5500))

Can GenomicRanges or any other package be used in any way for that?

 

 

Answer: Extracting UTRs from exon and CDS data
Answered 18 months ago by Michael Lawrence (United States):

One way would be to add a dummy "chr" variable and call GRanges() on both your exon.df and cds.df to get GRanges objects. Then, split() them by "id" into GRangesList objects. Call range() on the exons to get the transcript bounds, then subtract the CDS regions from those to get the UTRs.

Something like (untested):

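library(GenomicRanges)

# Add a dummy sequence name so that GRanges() can coerce the data frames
exon.df$chr <- "foo"
cds.df$chr <- "foo"
exon.gr <- GRanges(exon.df)
cds.gr <- GRanges(cds.df)
# One list element per transcript id
exon.grl <- split(exon.gr, ~ id)
cds.grl <- split(cds.gr, ~ id)
# Transcript bounds minus CDS; the result also contains the introns,
# not only the exonic UTR sequence
utr.grl <- psetdiff(unlist(range(exon.grl)), cds.grl)
stack(utr.grl, "id")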
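
Because subtracting the CDS from the transcript bounds leaves the introns in the result, a variant that subtracts the CDS from the exons themselves may be closer to the output requested in the question. The following is only a sketch under some assumptions that are not in the original post: every transcript is on the plus strand, every transcript has at least one CDS interval, and the helper names to_grl and to_df are made up for illustration.

```r
library(GenomicRanges)

# Example data from the question.
exon.df <- data.frame(id = rep(c("id1", "id2", "id3"), c(4, 3, 5)),
                      start = c(10, 20, 30, 40, 100, 200, 300, 1000, 2000, 3000, 4000, 5000),
                      end = c(15, 25, 35, 45, 150, 250, 350, 1500, 2500, 3500, 4500, 5500))
cds.df <- data.frame(id = rep(c("id1", "id2", "id3"), each = 3),
                     start = c(20, 30, 40, 125, 200, 300, 2250, 3000, 4000),
                     end = c(25, 35, 45, 150, 250, 325, 2500, 3500, 4250))

# Hypothetical helper: one GRangesList element per transcript id,
# using "foo" as a dummy sequence name.
to_grl <- function(df) {
  gr <- GRanges(seqnames = "foo", ranges = IRanges(df$start, df$end))
  split(gr, df$id)
}
exon.grl <- to_grl(exon.df)
cds.grl <- to_grl(cds.df)

# The parallel set operation below pairs elements by position,
# so both lists must hold the same transcripts in the same order.
stopifnot(identical(names(exon.grl), names(cds.grl)))

# Exonic positions not covered by CDS, computed transcript by transcript.
utr <- unlist(psetdiff(exon.grl, cds.grl))

# First CDS position of the transcript each UTR piece belongs to.
cds.range <- unlist(range(cds.grl))
cds.start <- start(cds.range)[match(names(utr), names(cds.range))]

# Assumption: plus strand, so pieces ending before the CDS are 5' UTR
# and all remaining pieces are 3' UTR.
is5 <- end(utr) < cds.start

# Hypothetical helper: back to the id/start/end layout of the question.
to_df <- function(gr) data.frame(id = names(gr), start = start(gr), end = end(gr))
utr5.df <- to_df(utr[is5])
utr3.df <- to_df(utr[!is5])
```

On the example data this should give the same intervals as the utr5.df and utr3.df in the question. For minus-strand transcripts the 5'/3' assignment would need to be flipped, and transcripts without any CDS would have to be dropped or handled separately before the psetdiff() call.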