Question

Assessing genomic segment enrichment with GenomicRanges (R package)

geretshabile • 0

@geretshabile-20947

Last seen 4.9 years ago

I previously asked this question on Biostars and thought that this might be the more appropriate forum for it. Apologies in advance for double posting; I saw a warning on another forum that it is poor practice. However, I still haven't received any suggestions on the first forum.

I am using GenomicRanges to analyse genomic segment enrichment. The first three columns in my dataset are chr, start and end, followed by 3 additional metadata columns. All the columns are separated by tabs.

I have successfully run subsetByOverlaps(cases, controls, type="within", invert=TRUE). According to here, my output should be the genomic segments that lie within my chromosome start and end points and are exclusive to my cases. Conversely, I also ran subsetByOverlaps(controls, cases, type="within", invert=TRUE) to look for segments exclusive to controls. I then looked for segments found in both by removing the invert option. In one instance my queryLength was approximately 4000 segments and my subjectLength about 200 segments. Given the size of my queryLength, if I run subsetByOverlaps(cases, controls, type="within") I get more than 200 segments in the resulting GRanges object. Am I missing something about the behaviour of the function? I expected my output to contain fewer than 200 segments, assuming that the segments are treated as sets.

The second question is: if I then swap the cases and controls and run subsetByOverlaps(controls, cases, type="within"), how can I combine the data from the 2 runs? Finally, am I correct to assume that combining the two in a data frame would give me the equivalent of the union of the genomic segments found within my cases and controls? If not, is there a way to use GRanges to obtain that union without doing it in 2 steps?
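The behaviour described above can be reproduced on toy data. The sketch below uses made-up coordinates (the `cases` and `controls` objects here are hypothetical, not the real dataset) and suggests why the result can be longer than the subject: subsetByOverlaps() returns elements of its first argument, so several case segments lying within the same control segment are each kept.

```r
library(GenomicRanges)

# Hypothetical toy data: 5 case segments and 2 control segments on chr1
cases <- GRanges("chr1",
                 IRanges(start = c(10, 20, 30, 110, 300),
                         end   = c(15, 25, 35, 120, 310)))
controls <- GRanges("chr1",
                    IRanges(start = c(1, 100),
                            end   = c(50, 150)))

# Case segments lying wholly within some control segment
shared <- subsetByOverlaps(cases, controls, type = "within")
length(shared)      # 4, although there are only 2 control segments

# Case segments not within any control segment
only_cases <- subsetByOverlaps(cases, controls, type = "within", invert = TRUE)
length(only_cases)  # 1
```

In this toy case the output size is bounded by length(cases), not by length(controls).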

GenomicRanges R genomic enrichment • 976 views

updated 4.9 years ago by Michael Lawrence ★ 11k • written 4.9 years ago by geretshabile • 0

score 0 · Answer 1 · 2019-06-03

Michael Lawrence ★ 11k

@michael-lawrence-3846

Last seen 2.4 years ago

United States

To treat ranges as sets, look at the Ranges methods for setdiff(), intersect() and union().
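As a minimal sketch of those set operations on GRanges objects (the coordinates and the object names `cases` and `controls` are made up for illustration):

```r
library(GenomicRanges)

# Hypothetical toy data, unstranded, all on chr1
cases <- GRanges("chr1", IRanges(start = c(10, 40), end = c(30, 60)))
controls <- GRanges("chr1", IRanges(start = c(20, 100), end = c(50, 120)))

# Positions covered by both sets: chr1:20-30 and chr1:40-50
both <- intersect(cases, controls)

# Positions covered by either set, in one step: chr1:10-60 and chr1:100-120
either <- union(cases, controls)

# Positions covered only by cases: chr1:10-19 and chr1:51-60
cases_only <- setdiff(cases, controls)

# Positions covered only by controls: chr1:31-39 and chr1:100-120
controls_only <- setdiff(controls, cases)
```

Note that these functions work on covered positions rather than on whole segments, so overlapping segments are merged or trimmed, and as far as I know the metadata columns are not carried over to the result. If whole segments and their metadata need to be kept, subsetByOverlaps() is probably still the better fit.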

4.9 years ago • Michael Lawrence ★ 11k