Features within 2kb window, but with resampling
@bruce01campus-15323

Hi,

I have a GRanges object:

library(GenomicRanges)
gr <- GRanges("chr1", IRanges(c(10000, 11000, 12050, 13000, 14500, 20000, 23000), width=100))

I want to get a list of sets of regions in which each region falls within 2 kb of the next, e.g. [1], [2], [3], [4], [5] as a single set. I do not want fixed 2 kb tiles, which could take in [1], [2] and, separately, [3], [4], [5], but could not join them as one element of the list.

GRanges object with 7 ranges and 0 metadata columns:
      seqnames         ranges strand
         <Rle>      <IRanges>  <Rle>
  [1]     chr1 [10000, 10099]      *
  [2]     chr1 [11000, 11099]      *
  [3]     chr1 [12050, 12149]      *
  [4]     chr1 [13000, 13099]      *
  [5]     chr1 [14500, 14599]      *
  [6]     chr1 [20000, 20099]      *
  [7]     chr1 [23000, 23099]      *

The issue is how to allow the 2 kb limit to 'reset', and hence 'resample', rows [3], [4], [5] in my example. I hope that is clear. I looked at slidingWindow(), but it is not clear to me whether it can achieve this. Essentially, I would like the window to keep extending until the next region is more than 2 kb away from the last one. I have a Perl script for this, but it should be relatively easy with GRanges, no? I have searched, but I am unsure what this kind of operation is called.

Kind regards,

Bruce.

 

genomicranges granges slidingwindow
@michael-lawrence-3846

You probably want something like:

bins <- reduce(gr, min.gapwidth=2000L)

If you want to know which ranges fell into each bin, add with.revmap=TRUE to the call.
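
For the example ranges in the question, the call might look like the sketch below. It assumes the GenomicRanges package is installed and rebuilds the gr object from the question; the variable name groups is only illustrative. As I understand the documentation, ranges separated by a gap of min.gapwidth or more are not merged, so regions keep chaining together as long as each gap (end of one range to start of the next) stays under 2000 bp.

```r
library(GenomicRanges)

# Example ranges from the question
gr <- GRanges("chr1",
              IRanges(c(10000, 11000, 12050, 13000, 14500, 20000, 23000),
                      width = 100))

# Merge ranges whose gap to the next range is below 2000 bp;
# with.revmap = TRUE records which input rows went into each merged bin
bins <- reduce(gr, min.gapwidth = 2000L, with.revmap = TRUE)

# One integer vector of row indices of gr per bin
# ("groups" is an illustrative name)
groups <- as.list(mcols(bins)$revmap)
groups
```

With these inputs, rows 1 to 5 should end up in one bin, and rows 6 and 7 in bins of their own, because the gaps after row 5 and row 6 are 5400 bp and 2900 bp.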


Hi Michael, yes, this is exactly it. I will use revmap to catch the bins with more than n rows, and then reannotate based on those row ids. Many thanks, Bruce.
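
One possible way to do the filtering and reannotation described here, as a sketch only. The threshold n, the variable names keep and big, and the metadata column name cluster are all hypothetical, chosen for illustration; GenomicRanges is assumed to be installed.

```r
library(GenomicRanges)

# Example ranges and bins as in the thread above
gr <- GRanges("chr1",
              IRanges(c(10000, 11000, 12050, 13000, 14500, 20000, 23000),
                      width = 100))
bins <- reduce(gr, min.gapwidth = 2000L, with.revmap = TRUE)
revmap <- mcols(bins)$revmap

# Hypothetical threshold: keep bins built from more than n rows
n <- 2L
keep <- elementNROWS(revmap) > n

# Label each original row with the index of the bin it fell into
# ("cluster" is an illustrative column name, not part of the package)
cluster <- integer(length(gr))
cluster[unlist(revmap)] <- rep(seq_len(length(bins)), elementNROWS(revmap))
mcols(gr)$cluster <- cluster

# Original rows that belong to bins with more than n members
big <- gr[unlist(revmap[keep])]
big
```

Keeping the bin index as a metadata column on gr means the original rows can be subset or summarised by bin later without recomputing the reduce() call.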
