GenomicRanges Use Cases - subsetByOverlaps
James Perkins ▴ 120
@james-perkins-4948
Hi,

I am having some problems following the example in the vignette for GenomicRanges, specifically:

3.4 Identifying reads that do NOT overlap known annotation
...
> filtData <- subsetByOverlaps(aligns, exonRanges)
> length(filtData)
[1] 17311

At this point, the filtData object only contains ranges that did not overlap with any of the known exons from Saccharomyces cerevisiae.

My understanding of subsetByOverlaps is that it would bring back exactly the ranges that DO overlap with the known exons? From the help page:

'subsetByOverlaps(query, subject, maxgap = 0L, minoverlap = 1L, type = c("any", "start", "end", "within", "equal"))': Returns the subset of 'query' that has an overlap hit with a range in 'subject' using the specified 'findOverlaps' parameters. Both 'query' and 'subject' should be 'Ranges', 'RangesList' or 'RangedData' objects.

I don't see how this gets the reads mapping in non-exon ranges. Surely it gets the reads mapping in the exon ranges, since exonRanges is obtained using:

exonRanges <- exonsBy(txdb, "tx")

Shouldn't I be looking for the subset that *doesn't* overlap? Something like subsetByOverlaps(! aligns, exonRanges)? Or have I missed something obvious (quite likely!)?

Many thanks,

Jim

--
James Perkins, PhD student
Institute of Structural and Molecular Biology
Division of Biosciences
University College London
Gower Street
London, WC1E 6BT
UK
email: j.perkins at ucl.ac.uk
phone: 0207 679 2198
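The reading of the help text above can be checked on toy data. What follows is a minimal sketch, not the vignette's yeast example: the `exons` and `reads` objects and their coordinates are made up for illustration, and it assumes a reasonably recent GenomicRanges installation.

```r
library(GenomicRanges)

# Hypothetical toy data: two "exons" and five 20-bp "reads" on one chromosome.
exons <- GRanges("chrI", IRanges(start = c(100, 300), end = c(200, 400)))
reads <- GRanges("chrI", IRanges(start = c(90, 150, 210, 390, 500), width = 20))

# subsetByOverlaps() keeps the query ranges that DO overlap the subject.
hits <- subsetByOverlaps(reads, exons)

# Reads starting at 90, 150 and 390 touch an exon; 210 and 500 do not.
start(hits)
```

On this toy input the result contains only the reads that touch an exon, which agrees with the help text rather than with the sentence in the vignette.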
@steve-lianoglou-2771
Hi,

On Tue, Nov 8, 2011 at 4:27 AM, James Perkins <j.perkins at ucl.ac.uk> wrote:
> [...]
> Shouldn't I be looking for the subset that *doesn't* overlap?
> Something like subsetByOverlaps(! aligns, exonRanges)? Or have I
> missed something obvious (quite likely!)?

One thing you can do is call `gaps` on your exonRanges to get the regions between exons, and then keep the reads that hit those "gaps":

R> not.exons <- subsetByOverlaps(aligns, gaps(exonRanges))

This will still return reads that partially overlap both exonic and non-exonic regions.

You can also do `! ... %in% ...`:

R> not.exons <- aligns[!aligns %in% exonRanges]

This will (should) only return reads that don't overlap with any `exonRanges` at all.
HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
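To make the difference between the two suggestions concrete, here is a small sketch on made-up data. The `exons` and `reads` objects, the coordinates, and the chromosome length of 1000 are all hypothetical. It assumes a current GenomicRanges release, where `subsetByOverlaps()` accepts `invert = TRUE` and `%outside%` is available; neither may exist in the 2011 release discussed in this thread. As far as I know, `%in%` on ranges tests for equality rather than overlap in later releases, which is why the sketch uses `%outside%` instead.

```r
library(GenomicRanges)

# Hypothetical toy data: two "exons" and five 20-bp "reads" on one chromosome.
exons <- GRanges("chrI", IRanges(start = c(100, 300), end = c(200, 400)))
reads <- GRanges("chrI", IRanges(start = c(90, 150, 210, 390, 500), width = 20))

# Option 1: invert the overlap filter, keeping reads with no exon overlap at all.
no_overlap <- subsetByOverlaps(reads, exons, invert = TRUE)

# Option 2: the same selection written with the %outside% operator.
no_overlap2 <- reads[reads %outside% exons]

# Option 3: overlap against the gaps between exons. gaps() needs a chromosome
# length to produce the trailing gap (1000 is an assumed value), and it returns
# gaps per strand, so only the unstranded ("*") gaps are kept here.
seqlengths(exons) <- c(chrI = 1000)
intergenic <- gaps(exons)
intergenic <- intergenic[strand(intergenic) == "*"]
via_gaps <- subsetByOverlaps(reads, intergenic)

start(no_overlap)  # reads entirely outside exons
start(via_gaps)    # also includes reads straddling an exon boundary
```

On this input the gaps-based filter also keeps the reads starting at 90 and 390, which straddle an exon boundary, while the inverted filter keeps only the reads starting at 210 and 500. That matches the caveat above about partially overlapping reads.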