I have a set of genomic intervals stored as GRanges objects, and I filter each GRanges object into two distinct sets based on its score column, so that every object ends up with a Confirm set and a Discard set. My approach works, but its output is a nested structure that is awkward to work with and needs simplifying. I suspect there is a better way to get a convenient output format when filtering large GRanges objects. Can anyone point me to an easy way to do this? Thanks a lot!
Note: the toy data below is simulated to mirror the structure of my real dataset.
# toy data
library(GenomicRanges)
grs <- GRangesList(
  foo = GRanges(
    seqnames = Rle("chr1", 3),
    ranges = IRanges(c(2, 7, 16), c(5, 14, 20)),
    rangeName = c("a1", "a2", "a3"),
    score = c(4, 6, 9)),
  bar = GRanges(
    seqnames = Rle("chr1", 3),
    ranges = IRanges(c(4, 13, 26), c(11, 17, 28)),
    rangeName = c("b1", "b2", "b3"),
    score = c(11, 7, 8)),
  bleh = GRanges(
    seqnames = Rle("chr1", 4),
    ranges = IRanges(c(1, 4, 10, 23), c(3, 8, 14, 29)),
    rangeName = c("c1", "c2", "c3", "c4"),
    score = c(4, 6, 3, 8))
)
This is what I came up with. It works, but the result is hard to work with, and I am stuck on simplifying it:
res <- lapply(grs, function(x) split(x, c("Confirm", "Discard")[(x$score > 6)+1]))
I want to simplify it for this reason: I want to compare the Confirm and Discard sets, e.g. res[]$Discard. For example, if a region exists in both the Confirm and the Discard set, I want to remove it. I think it would be better to get rid of the nested list first, that is, to split it into individual lists and then access each subset separately.
How can I perform this simplification? Does anyone know a trick for this kind of manipulation?
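To make the target shape concrete, here is a rough sketch of the kind of flat output I have in mind. It is not meant as a definitive solution. It assumes the same threshold as in my code above (score > 6 goes to Discard, everything else to Confirm), and the names `confirm` and `discard` are just illustrative. Instead of splitting each GRanges and nesting the result, it subsets each element directly, which gives two plain lists keyed by sample name:

```r
library(GenomicRanges)

# Same toy data as above, repeated so the sketch runs on its own.
grs <- GRangesList(
  foo = GRanges(seqnames = Rle("chr1", 3),
                ranges = IRanges(c(2, 7, 16), c(5, 14, 20)),
                rangeName = c("a1", "a2", "a3"), score = c(4, 6, 9)),
  bar = GRanges(seqnames = Rle("chr1", 3),
                ranges = IRanges(c(4, 13, 26), c(11, 17, 28)),
                rangeName = c("b1", "b2", "b3"), score = c(11, 7, 8)),
  bleh = GRanges(seqnames = Rle("chr1", 4),
                 ranges = IRanges(c(1, 4, 10, 23), c(3, 8, 14, 29)),
                 rangeName = c("c1", "c2", "c3", "c4"), score = c(4, 6, 3, 8))
)

# Assumption: same rule as my split() call, i.e. score > 6 is "Discard"
# and score <= 6 is "Confirm".
confirm <- lapply(grs, function(x) x[x$score <= 6])
discard <- lapply(grs, function(x) x[x$score > 6])

# Each result is a plain named list with one GRanges per sample,
# so a subset is reached with a single level of indexing.
confirm$foo   # ranges a1 and a2
discard$foo   # range a3
```

With this layout, an element with no ranges on one side (for example `confirm$bar` in the toy data) is simply an empty GRanges rather than a missing entry, which should make the Confirm/Discard comparison easier to write.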