Question: Extracting UTRs from exon and CDS data
rubi wrote (2.1 years ago):

Hi,

I have a data.frame of the exons of each transcript and a corresponding data.frame of the CDS intervals:

exon.df <- data.frame(id=c(rep("id1",4),rep("id2",3),rep("id3",5)),
                      start=c(10,20,30,40,100,200,300,1000,2000,3000,4000,5000),
                      end=c(15,25,35,45,150,250,350,1500,2500,3500,4500,5500))


cds.df <- data.frame(id=c(rep("id1",3),rep("id2",3),rep("id3",3)),
                      start=c(20,30,40,125,200,300,2250,3000,4000),
                      end=c(25,35,45,150,250,325,2500,3500,4250))

 

I would like to extract the UTRs from these data for each transcript. For this example, the expected output is:

utr5.df <- data.frame(id=c("id1","id2","id3","id3"),
                     start=c(10,100,1000,2000),
                     end=c(15,124,1500,2249))

utr3.df <- data.frame(id=c("id2","id3","id3"),
                     start=c(326,4251,5000),
                     end=c(350,4500,5500))

Can GenomicRanges, or any other package, be used for this?

 

 

Answer: Extracting UTRs from exon and CDS data
Michael Lawrence wrote (2.1 years ago):

One way would be to add a dummy "chr" variable and call GRanges() on both exon.df and cds.df to get GRanges objects. Then split() each of them by "id" into GRangesList objects. Call range() on the exons to get the transcript bounds, then subtract the CDS regions from those bounds to get the UTRs.

Something like (untested):

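library(GenomicRanges)

# GRanges() needs a sequence name, so add a dummy chromosome column
exon.df$chr <- "foo"
cds.df$chr <- "foo"
exon.gr <- GRanges(exon.df)
cds.gr <- GRanges(cds.df)
# One list element per transcript id
exon.grl <- split(exon.gr, ~ id)
cds.grl <- split(cds.gr, ~ id)
# Subtract each transcript's CDS from its overall span
# (note: that span covers the introns as well as the exons)
utr.grl <- psetdiff(unlist(range(exon.grl)), cds.grl)
# Flatten back to a single object with an "id" column
stack(utr.grl, "id")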
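
Note that subtracting the CDS from the full transcript range would also return the intronic gaps, so it may be simpler to trim the exons directly. As a cross-check, here is a minimal base-R sketch (no GenomicRanges) that reproduces the utr5.df and utr3.df given in the question. It is only an illustration under some assumptions: every transcript is on the plus strand, each CDS interval lies inside an exon, and the CDS is contiguous along the transcript. The helper names (cds.lo, cds.hi, lo, hi, is5, is3) are made up for this example.

```r
exon.df <- data.frame(id=c(rep("id1",4),rep("id2",3),rep("id3",5)),
                      start=c(10,20,30,40,100,200,300,1000,2000,3000,4000,5000),
                      end=c(15,25,35,45,150,250,350,1500,2500,3500,4500,5500))

cds.df <- data.frame(id=c(rep("id1",3),rep("id2",3),rep("id3",3)),
                     start=c(20,30,40,125,200,300,2250,3000,4000),
                     end=c(25,35,45,150,250,325,2500,3500,4250))

# First and last coding position of each transcript
cds.lo <- tapply(cds.df$start, cds.df$id, min)
cds.hi <- tapply(cds.df$end, cds.df$id, max)

# Look up the CDS bounds for every exon row (by name, not by factor code)
id <- as.character(exon.df$id)
lo <- as.numeric(cds.lo[id])
hi <- as.numeric(cds.hi[id])

# 5' UTR (plus strand assumed): exon parts lying before the first coding base
is5 <- exon.df$start < lo
utr5.df <- data.frame(id=id[is5],
                      start=exon.df$start[is5],
                      end=pmin(exon.df$end, lo - 1)[is5])

# 3' UTR (plus strand assumed): exon parts lying after the last coding base
is3 <- exon.df$end > hi
utr3.df <- data.frame(id=id[is3],
                      start=pmax(exon.df$start, hi + 1)[is3],
                      end=exon.df$end[is3])

print(utr5.df)
print(utr3.df)
```

For minus-strand transcripts the two labels would simply swap, since the 5' end then sits at the higher coordinates.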