Finding overlaps between two data sets with GenomicRanges findOverlaps, using a different minoverlap for each range
@husainmanagori1998-22771

I am trying to find overlaps with the findOverlaps function from the GenomicRanges package. I have two data sets and want to use the minoverlap parameter, with a different minoverlap value for every read.

Ex. df1:

chr2 2800 3270
chr2 3600 4152
chr2 3719 5092
chr2 3893 4547

Ex. df2:

chr2 263 20091
chr2 342 17222
chr2 414 2612
chr2 846 4265
chr2L 1030 11575

I want to find overlaps using a different value of the minoverlap parameter for each range. I tried the code shown after the example output, but it did not work. The desired output is the usual Hits object returned by GenomicRanges.

Example Output:
Hits object with 2 hits and 0 metadata columns:
      queryHits subjectHits
      <integer>   <integer>
  [1]         1           1
  [2]         1           2
  -------
  queryLength: 1 / subjectLength: 10

Code I tried:

df11<-toGRanges(df1)
df22<-toGRanges(df2)

minOv<-mapply("-", df1$V3,df1$V2)+1

for (j in minOv) {
    ff<-findOverlaps(query = df11,subject = df22,minoverlap = j)
}

df1:

structure(list(df1c = c("chr2", "chr2", "chr2", "chr2"), df1c2 = c(2800, 
3600, 3719, 3893), df1c3 = c(3270, 4152, 5092, 4547)), class = "data.frame", row.names = c(NA, 
-4L))

df2:

structure(list(df1c = c("chr2", "chr2", "chr2", "chr2", "chr2L"
), df1c2 = c(263, 342, 424, 846, 1030), df1c3 = c(20091, 17222, 
2612, 4265, 11575)), class = "data.frame", row.names = c(NA, 
-5L))

I have also posted on StackOverflow.

Biostrings GenomicRanges • 951 views
Basti ▴ 770

Hi, please have a look at your df1 and df2 objects. To convert a data frame to a GRanges object with toGRanges, the column names should be seqnames/start/end, which is not the case here: your column names are df1c/df1c2/df1c3. Additionally, to create minOv you call df1$V3 and df1$V2, but these columns do not exist.
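The column-name fix could be sketched in two ways, as below. This is a minimal sketch that assumes makeGRangesFromDataFrame from GenomicRanges is used for the conversion instead of toGRanges; the object names df1_renamed, gr_a and gr_b are only illustrative.

```r
library(GenomicRanges)

# Data frame with the original (non-standard) column names from the question
df1 <- data.frame(df1c  = c("chr2", "chr2", "chr2", "chr2"),
                  df1c2 = c(2800, 3600, 3719, 3893),
                  df1c3 = c(3270, 4152, 5092, 4547))

# Option 1: rename the columns to names the converter recognises
df1_renamed <- setNames(df1, c("seqnames", "start", "end"))
gr_a <- makeGRangesFromDataFrame(df1_renamed)

# Option 2: keep the names and state explicitly which column is which
gr_b <- makeGRangesFromDataFrame(df1,
                                 seqnames.field = "df1c",
                                 start.field    = "df1c2",
                                 end.field      = "df1c3")
```

Either way, the minimum overlap values can then be taken from the GRanges object itself with width(gr_a), which avoids referring to data frame columns by name.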

I assume that if you rename the data frame columns to seqnames/start/end and refer to them consistently in the rest of the code, it should work.

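library(GenomicRanges)

df1 <- structure(list(seqnames = c("2", "2", "2", "2"),
                      start = c(2800, 3600, 3719, 3893),
                      end = c(3270, 4152, 5092, 4547)),
                 class = "data.frame", row.names = c(NA, -4L))

df2 <- structure(list(seqnames = c("2", "2", "2", "2", "2L"),
                      start = c(263, 342, 424, 846, 1030),
                      end = c(20091, 17222, 2612, 4265, 11575)),
                 class = "data.frame", row.names = c(NA, -5L))

# Convert both data frames to GRanges (columns are matched by name)
df11 <- makeGRangesFromDataFrame(df1)
df22 <- makeGRangesFromDataFrame(df2)

# One minoverlap value per query range: its full width
minOv <- df1$end - df1$start + 1

# minoverlap takes a single value, so query one range at a time and
# keep every result in a list instead of overwriting ff on each pass
ff <- lapply(seq_along(df11), function(i) {
  findOverlaps(query = df11[i], subject = df22, minoverlap = minOv[i])
})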
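Because minoverlap accepts only a single value per call, another option is to call findOverlaps once and then filter the hits against a per-query threshold. The following is a minimal sketch of that idea, not necessarily what the original poster needs; the names gr1, gr2, min_ov, ov_width and hits_kept are illustrative, and the thresholds are assumed to be the full width of each query range, as in minOv above.

```r
library(GenomicRanges)

gr1 <- GRanges("2", IRanges(start = c(2800, 3600, 3719, 3893),
                            end   = c(3270, 4152, 5092, 4547)))
gr2 <- GRanges(c("2", "2", "2", "2", "2L"),
               IRanges(start = c(263, 342, 424, 846, 1030),
                       end   = c(20091, 17222, 2612, 4265, 11575)))

# Assumed per-query thresholds: the full width of each query range
min_ov <- width(gr1)

# Step 1: all overlaps, with no minimum overlap requirement
hits <- findOverlaps(gr1, gr2)
q <- queryHits(hits)
s <- subjectHits(hits)

# Step 2: width of the overlapping segment for each hit
# (both ranges of a hit are on the same sequence, so plain arithmetic works)
ov_width <- pmin(end(gr1)[q], end(gr2)[s]) - pmax(start(gr1)[q], start(gr2)[s]) + 1L

# Step 3: keep a hit only if it meets the threshold of its own query range
hits_kept <- hits[ov_width >= min_ov[q]]
```

The result is a single Hits object whose queryHits index into the full query set. Note that when the threshold equals the full width of the query, a hit is kept only if the query lies entirely inside the subject, which is the same condition that findOverlaps(gr1, gr2, type = "within") tests.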