Question: I want have a score vector of length of grange that is overlapping with other grange.
0
3.5 years ago by
vinod.acear30
India
vinod.acear30 wrote:

hi i have two granges gr1 and gr2, i want a have score vector for each range of gr1  that is equal to width of its range. The values of score vectors are such that positions of gr1 overlapping with gr2 have must have score of gr2 on overlapping positions and zero on other positions.

modified 3.4 years ago by Michael Lawrence11k • written 3.5 years ago by vinod.acear30

Hi,

It's a little bit confusing. The question in the title of your post looks very much like the question you already asked here  how much positions overlapped for each pair of the pairs obtained by findoverlap. 8 days ago and for which you got an answer (that you accepted). But then the details you give make it sound a little bit different, maybe.

Could you please clarify with an example? Would help a lot if you could show 2 small GRanges objects gr1 and gr2 and what how expect the score vector to be. Thanks!

H.

Hi Herve in previous post i was sub-part of this problem which has become complex for me to implement and taking time. I recently shifted to R platform so, i have problem in implementing methods in effective manner.
My problem is described by this example

I have two gr1 and gr2. All gr1 ranges have width  equal to 6.
gr1
GRanges object with 8 ranges and 0 metadata columns:
seqnames    ranges strand
<Rle> <IRanges>  <Rle>
a     chr1  [ 5, 10]      -
b     chr1  [11, 16]      +
c     chr1  [19, 24]      +
d     chr2  [ 9, 14]      +
e     chr2  [12, 17]      +
f     chr3  [ 4,  9]      +
g     chr3  [11, 16]      +
h     chr4  [ 6, 11]      -
 -------
seqinfo: 4 sequences from an unspecified genome; no seqlengths
gr2
GRanges object with 4 ranges and 1 metadata column:
seqnames    ranges strand |     score
<Rle> <IRanges>  <Rle> | <numeric>
a     chr1  [ 6,  7]      * |       0.4
b     chr1  [ 8, 10]      * |       0.3
c     chr1  [13, 14]      * |       0.7
d     chr2  [10, 18]      * |       0.1
-------
seqinfo: 2 sequences from an unspecified genome; no seqlengths

output

I want vectors for each genomic range of gr1 of length equal to width of gr1. Required vector is such that it have value of score of gr2 at overlapping positions of gr1 else zero at non-overlaping positions. eg. out put

vector for first genomic range of gr1 : [0 .4 .4 .3 .3 .3]   reverse this, as it is on negative strand of gr1

vector for second genomic range of gr1 : [0 0 .7 .7 0 0] leave this as, it is as it is on +ve strand of gr1

..

vector for fourth genomic range of gr1 : [0 .1 .1 .1 .1 .1] leave this as, it is as it is on +ve strand of gr1

Answer: I want have a score vector of length of grange that is overlapping with other gr
0
3.4 years ago by
United States
Michael Lawrence11k wrote:

v <- Views(coverage(gr2, weight="score"), as(gr1, "RangesList"))

That object is a list by chromosome, where each element is a set of "views" on the coverage vector. Perhaps it does what you need.

If you really need a list that aligns with your original GRanges, we need to do something different and it gets more complicated. First, we compute the coverage from gr2, as before. It is a list of vectors, one per chromosome. We then subset the vectors for the values inside the ranges of gr1.

cvg <- coverage(gr2, weight="score")
cvg.gr1 <- cvg[as(gr1, "RangesList")]

If your gr1 is sorted by chromosome, just unlist the list into a single vector (Rle):

rle <- unlist(cvg.gr1)

If it is NOT sorted, do this instead:

rle <- unsplit(cvg.gr1, rep(seqnames(gr1), width(gr1)))

Now, we just have to form the list with one element per range:

rle.range <- relist(rle, ranges(gr1))

And finally reverse the elements that are on the negative strand:

rle.range[strand(gr1)=="-"] <- revElements(rle.range[strand(gr1)=="-"])

Sorry that it's an insane 5 lines of code to do something so simple. In theory it could be made as simple as cvg[gr1]. Actually, it looks like that actually works, except for the strand-sensitive reversing, so we are here:

scores <- coverage(gr2, weight="score")[gr1]
scores[strand(gr1)=="-"] <- revElements(scores[strand(gr1)=="-"])