Question: How can I integrate sets of CompressedList objects from multiple lists into one without duplication?
jian_liangli0 wrote:

Hi everyone:
I have several lists, each holding a set of CompressedList objects. Each CompressedList holds the overlap position indices of one set of genomic intervals against another, and the order of the CompressedList objects differs from list to list. I am trying to integrate these CompressedList objects into one list without duplication. To clarify, the data come from finding overlapping regions between two GRanges objects whose queryLength and subjectLength differ. FYI, all data and output are simulated to keep the post readable. Does anyone know a trick for this sort of manipulation? In the final list, I expect each list element to be a CompressedList. Is it possible to integrate them easily and efficiently?

Things are a bit different when I integrate more than two symmetric findOverlaps() results for multiple GRanges, where the order of the GRanges objects in the function call really matters. I just want to figure out how to get my expected output when all Hits objects are held as IntegerList objects. Any ideas or approaches are highly appreciated.

# mini example to run (simulated)
library(IRanges)  # provides IntegerList()

v1 <- list(
  a = IntegerList(1, 2, 3),
  b = IntegerList(3, 3, 4))

v2 <- list(
  b = IntegerList(1, 2, 3, 4),
  a = IntegerList(integer(0), 1, 2, 3))

# desired output:
I am trying to integrate v1 and v2 into one list with a specific order (i.e., my_output <- list(a=IntegerList(), b=IntegerList())), where each list element must be a CompressedList object. For example, this is my expected output (manually generated):

> desired_output
[[1]]
IntegerList of length 5
[[1]] 0
[[2]] 1
[[3]] 1
[[4]] 2
[[5]] 3

[[2]]
IntegerList of length 5
[[1]] 1
[[2]] 2
[[3]] 3
[[4]] 3
[[5]] 4

How can I get my expected output easily and efficiently? Can anyone propose ideas to solve my problem? Thanks a lot.

Sorry, I don't understand what you want. Could you give a reproducible set of inputs, along with the desired output and a better explanation? I'm not sure how the code for those inputs even works, since coercing a Hits object to a List produces an IntegerList, but you are calling &() on them.
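
To make the coercion mentioned above concrete, here is a minimal sketch of how such an IntegerList can arise from a Hits object. The ranges and the names query, subject, hits, and ol are made up for illustration, and the sketch assumes a GenomicRanges/IRanges version in which as(hits, "List") is available for the Hits object returned by findOverlaps().

```r
library(GenomicRanges)  # also loads IRanges and S4Vectors

# Hypothetical toy ranges, not the poster's data
query   <- GRanges("chr1", IRanges(start = c(1, 10, 20), width = 5))
subject <- GRanges("chr1", IRanges(start = c(3, 12, 40), width = 5))

# Hits object: one row per overlapping (query, subject) pair
hits <- findOverlaps(query, subject)

# One list element per query range, holding the indices of the
# subject ranges it overlaps (integer(0) when there is no overlap)
ol <- as(hits, "List")
ol
```

Each element of ol lines up with one query range, which is why the list length follows queryLength rather than the number of hits.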

Thanks for your quick response to my post. I have updated it with a quick reproducible example. The sketch code I gave was only an illustration, not serious code to run, but the mini example is simulated from the problem I actually faced. I am trying to integrate v1 and v2 into one common list where each list element is a CompressedIntegerList object.

Answer: How can I integrate set of compressedList in multiple list into one without dupl
Michael Lawrence wrote:

If all you want is to regroup the list elements by name, then in principle regroup() is what you want. But it looks like there is some sort of deduplication happening to get to your result, which I don't understand.

I just checked in some changes to devel (IRanges 2.7.14, S4Vectors 0.11.12) that make this work.

v <- List(c(v1, v2))
rg <- regroup(v, names(v))

That looks like:

> as.list(rg)
$a
IntegerList of length 7
[[1]] 1
[[2]] 2
[[3]] 3
[[4]] integer(0)
[[5]] 1
[[6]] 2
[[7]] 3

$b
IntegerList of length 7
[[1]] 3
[[2]] 3
[[3]] 4
[[4]] 1
[[5]] 2
[[6]] 3
[[7]] 4
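
For installations without the devel changes mentioned above, the same regrouping by name can probably be approximated with base R plus c() on the IntegerList objects. This is only a sketch: regroup_by_name is a hypothetical helper, not part of IRanges or S4Vectors, and it does not attempt the deduplication step, since the rule behind the desired output is unclear.

```r
library(IRanges)

v1 <- list(a = IntegerList(1, 2, 3),
           b = IntegerList(3, 3, 4))
v2 <- list(b = IntegerList(1, 2, 3, 4),
           a = IntegerList(integer(0), 1, 2, 3))

# Hypothetical helper: concatenate all list elements that share a name,
# keeping groups in order of first appearance
regroup_by_name <- function(x) {
  groups <- unique(names(x))
  out <- lapply(groups, function(g) {
    # unname() so c() does not treat the element names as argument names
    do.call(c, unname(x[names(x) == g]))
  })
  setNames(out, groups)
}

rg <- regroup_by_name(c(v1, v2))
rg
```

The result is an ordinary list with one IntegerList per name, so any later deduplication rule can be applied to rg$a and rg$b separately.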