Question: Extract coordinates of overlapping genomic intervals
3.3 years ago by
rubi90
rubi90 wrote:

Hi,

I have two sets of genomic ranges which I'm intersecting using the findOverlaps function of the GenomicRanges package:

df1 <- data.frame(chr=rep("chr1",6), start=c(10033259,10060726,98674166,10067579,10067607,11169988), end=c(10033289,10060783,98674223,10067654,10067664,11170044), strand=c("-","-","+","+","+","+"))

df2 <- data.frame(chr=rep("chr1",3),start=c(10024601,10033258,10033258),end=c(10038168,10033323,10033323),strand=c("-","-","-"))

df1.gr <- makeGRangesFromDataFrame(df1,seqnames.field="chr",start.field="start",end.field="end",strand.field="strand")

df2.gr <- makeGRangesFromDataFrame(df2,seqnames.field="chr",start.field="start",end.field="end",strand.field="strand")

dfs.ol <- findOverlaps(df1.gr,df2.gr)

My question is how to extract the actual overlapping coordinates of each of the hits in the returned value of findOverlaps (dfs.ol)?

I know that the intersect function returns the collapsed intervals of the query genomic ranges that intersect with the subject genomic ranges. But what I really need, for each overlap between df1.gr and df2.gr, are the coordinates of the overlap, in addition to the indices of the overlapping genomic ranges (as in the returned Hits object).

Answer: Extract coordinates of overlapping genomic intervals
3.3 years ago by
United States
Jeff Johnston90 wrote:

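You can use pintersect, which intersects two GRanges objects element by element (the i-th range of the first with the i-th range of the second):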

overlaps.gr <- pintersect(df1.gr[queryHits(dfs.ol)], df2.gr[subjectHits(dfs.ol)])

If you want all the results in one object, you can add the indices as metadata columns:

overlaps.gr$df1_hit <- queryHits(dfs.ol)
overlaps.gr$df2_hit <- subjectHits(dfs.ol)
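
To make it concrete what the per-pair intersection amounts to, here is a minimal base-R sketch of the same arithmetic on the data from the question. It does not use GenomicRanges at all; the hit test (same chromosome, same strand, ranges touching) is only an approximation of the default findOverlaps behaviour for "+"/"-" strands, and the names pairs, is_hit and hits are made up for this illustration.

```r
# Base-R illustration only; no Bioconductor packages required.
# stringsAsFactors = FALSE so that chr/strand compare as plain strings.
df1 <- data.frame(chr = rep("chr1", 6),
                  start = c(10033259, 10060726, 98674166, 10067579, 10067607, 11169988),
                  end = c(10033289, 10060783, 98674223, 10067654, 10067664, 11170044),
                  strand = c("-", "-", "+", "+", "+", "+"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(chr = rep("chr1", 3),
                  start = c(10024601, 10033258, 10033258),
                  end = c(10038168, 10033323, 10033323),
                  strand = c("-", "-", "-"),
                  stringsAsFactors = FALSE)

# Enumerate every query/subject index pair (fine for small inputs only).
pairs <- expand.grid(df1_hit = seq_len(nrow(df1)), df2_hit = seq_len(nrow(df2)))
q <- pairs$df1_hit
s <- pairs$df2_hit

# Assumed hit rule: same chromosome, same strand, and the intervals overlap.
is_hit <- df1$chr[q] == df2$chr[s] &
  df1$strand[q] == df2$strand[s] &
  df1$start[q] <= df2$end[s] &
  df2$start[s] <= df1$end[q]
hits <- pairs[is_hit, ]

# The overlap of two intervals runs from the larger start to the smaller end.
hits$start <- pmax(df1$start[hits$df1_hit], df2$start[hits$df2_hit])
hits$end <- pmin(df1$end[hits$df1_hit], df2$end[hits$df2_hit])
print(hits)
```

On this data only the first range of df1 overlaps anything, and it overlaps all three ranges of df2, so hits should have three rows, each with the overlap 10033259 to 10033289. For real data sets the findOverlaps/pintersect route is the one to use; enumerating all pairs does not scale.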

 


Does that report the overlap interval though?

 

written 3.3 years ago by rubi90

Yes, it generates the overlapping interval for each row (a query/subject pair) in your Hits object.

written 3.3 years ago by Jeff Johnston90

In the overlaps.gr object?

I only see the indices of the query and hit but not the overlap's coordinates. How do you extract that?

written 3.3 years ago by rubi90

Since overlaps.gr is a GRanges object, you can use seqnames(), start() and end() to extract the coordinates of the overlapping intervals.
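
Putting the thread together, a possible end-to-end sketch looks like the following. It assumes the GenomicRanges package is installed, reuses df1 and df2 from the question, and collects the accessor results into a plain data frame called coords (a name made up for this example).

```r
library(GenomicRanges)

df1 <- data.frame(chr = rep("chr1", 6),
                  start = c(10033259, 10060726, 98674166, 10067579, 10067607, 11169988),
                  end = c(10033289, 10060783, 98674223, 10067654, 10067664, 11170044),
                  strand = c("-", "-", "+", "+", "+", "+"))
df2 <- data.frame(chr = rep("chr1", 3),
                  start = c(10024601, 10033258, 10033258),
                  end = c(10038168, 10033323, 10033323),
                  strand = c("-", "-", "-"))

df1.gr <- makeGRangesFromDataFrame(df1, seqnames.field = "chr", start.field = "start",
                                   end.field = "end", strand.field = "strand")
df2.gr <- makeGRangesFromDataFrame(df2, seqnames.field = "chr", start.field = "start",
                                   end.field = "end", strand.field = "strand")

# One row per overlapping query/subject pair.
dfs.ol <- findOverlaps(df1.gr, df2.gr)

# Element-wise intersection of each hit pair gives the overlapping interval.
overlaps.gr <- pintersect(df1.gr[queryHits(dfs.ol)], df2.gr[subjectHits(dfs.ol)])

# Pull the coordinates out with the standard GRanges accessors.
coords <- data.frame(chr = as.character(seqnames(overlaps.gr)),
                     start = start(overlaps.gr),
                     end = end(overlaps.gr),
                     strand = as.character(strand(overlaps.gr)),
                     df1_hit = queryHits(dfs.ol),
                     df2_hit = subjectHits(dfs.ol))
print(coords)
```

With the data from the question this should give three rows, all with the overlap chr1:10033259-10033289 on the "-" strand, paired with df2 indices 1, 2 and 3. If a data frame is all that is needed, as.data.frame(overlaps.gr) is a shorter alternative that also carries over any metadata columns.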

written 3.3 years ago by Jeff Johnston90