summarizeOverlaps: number of dropped reads
1
0
Entering edit mode
@gabriele-schweikert-6510
Last seen 9.6 years ago
Hello, I'm using summarizeOverlaps with mode=union and inter.feature=TRUE to count reads that overlap annotated features. This is great and exactly what I need. However, I would also like to know how many reads are dropped in this setting because they map to multiple features (as opposed to reads not mapping to any of the annotated features.) Is there an easy way to get this number? I was thinking of counting again, this time with inter.feature=FALSE and subtracting the results, but this is not quite what I want because multiple reads are then also counted multiple times. Thanks for the assistance best wishes Gabriele -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
• 725 views
ADD COMMENT
0
Entering edit mode
@valerie-obenchain-4275
Last seen 2.3 years ago
United States
Hi Gabriele, There are two approaches you can take. 1) Reverse the role of 'reads' and 'features' in summarizeOverlaps() If your reads are in an R object (not bam file) you can possibly reverse the role of the arguments. Check showMethods("summarizeOverlaps") to confirm a method exists for your object types. summarizeOverlaps() reports overlaps in terms of the annotation, i.e., the return count vector is the same length as the 'features' arg. To answer your question we need overlap counts in terms of the reads. When you call summarizeOverlaps() provide your reads as the 'features' arg and vice versa. Use 'mode' Union with 'inter.feature=FALSE'. This will return all counts like a straightforward countOverlaps(). summarizeOverlaps(features=my_reads, reads=my_annotation, mode=Union, inter.feature=FALSE) The count result is the same length as my_reads. Reads with a count >1 are those that hit multiple features. From this output you can also distinguish the reads with no hits. 2) countOverlaps() Alternatively you can just use countOverlaps(). If reads are in a GRanges / GAlignments: co <- countOverlaps(my_reads, my_annotation) As with the summarizeOverlaps() example, the return vector is the same length as my_reads and counts the number of times that read was overlapped. You want the reads with counts > 1. my_reads[co > 1] or just sum(co > 1) If reads are in a bam file use readGAlignments() for single-end and readGAlignmentPairs() or readGAlignmentsList() for paired-end. my_reads <- readGAlignments(pathToBam) then countOverlaps() as above. Valerie On 04/23/2014 06:43 AM, Gabriele Schweikert wrote: > Hello, > > I'm using summarizeOverlaps with mode=union and inter.feature=TRUE to > count reads that overlap annotated features. This is great and exactly > what I need. However, I would also like to know how many reads are > dropped in this setting because they map to multiple features (as > opposed to reads not mapping to any of the annotated features.) Is there > an easy way to get this number? > I was thinking of counting again, this time with inter.feature=FALSE and > subtracting the results, but this is not quite what I want because > multiple reads are then also counted multiple times. > > Thanks for the assistance > > best wishes > > Gabriele > > > > -- Valerie Obenchain Program in Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, Seattle, WA 98109 Email: vobencha at fhcrc.org Phone: (206) 667-3158
ADD COMMENT

Login before adding your answer.

Traffic: 1087 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6