Question: How to write a preprocess.reads function for summarizeOverlaps that requires different logic for paired and single end reads?
5 months ago by
Ryan C. Thompson 6.8k wrote:

I want to write a function that I can use as the preprocess.reads argument to summarizeOverlaps that will reduce each read to just the midpoint of the fragment represented by the read. For paired-end reads, this involves taking the midpoint of the outer ends of the two reads, while for single-end reads, this involves specifying the fragment length and then taking the point half that distance downstream of the 5-prime end of the read. What is the best way for me to write a function that can handle both? Should the function simply check if its argument is an instance of GAlignmentPairs in order to decide between the paired-end and single-end logic, or is there a better way?

5 months ago by
Valerie Obenchain ♦♦ 6.6k wrote:

If you're calling summarizeOverlaps with the reads already in a GAlignmentPairs object then yes, checking the class of the argument is the way to go. If instead the reads are in a BAM file, they will be read into different classes depending on the value of the fragments argument.

When you provide a BAM file of paired-end reads with fragments=TRUE, the reads are read into a GAlignmentsList before counting. With fragments=FALSE (the default), they are read into a GAlignmentPairs object.
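
As an illustration of the class check discussed above, here is a minimal sketch of a preprocess.reads function. The name fragment_midpoints and its fragment.length argument (with a default of 200) are hypothetical, chosen only for this example. It assumes paired-end reads arrive as a GAlignmentPairs and single-end reads as a GAlignments; the GAlignmentsList case (fragments=TRUE) can mix paired and unpaired reads, so it is deliberately left unhandled here. Pairs whose mates map to different sequences may also need to be filtered out before calling granges().

```r
library(GenomicAlignments)  # also attaches GenomicRanges

# Hypothetical helper: reduce each read (or read pair) to the 1-bp midpoint
# of the fragment it represents. `fragment.length` is only used for
# single-end reads. `...` absorbs any extra arguments that
# summarizeOverlaps() may forward to preprocess.reads.
fragment_midpoints <- function(reads, fragment.length = 200L, ...) {
    if (is(reads, "GAlignmentPairs")) {
        # Paired-end: granges() gives one range per pair, spanning the
        # outer ends of the two mates.
        frags <- granges(reads)
    } else if (is(reads, "GAlignments")) {
        # Single-end: extend from the 5' end to the assumed fragment length.
        # resize(fix = "start") is strand-aware, so minus-strand reads are
        # anchored at their right-hand end.
        frags <- resize(granges(reads), width = fragment.length, fix = "start")
    } else {
        stop("Unsupported reads class: ", class(reads)[1])
    }
    # Collapse each fragment to a single base at its center.
    resize(frags, width = 1L, fix = "center")
}

# Small single-end demonstration with two made-up alignments.
demo_reads <- GAlignments(
    seqnames = Rle(factor(c("chr1", "chr1"))),
    pos      = c(101L, 501L),
    cigar    = c("50M", "50M"),
    strand   = strand(c("+", "-"))
)
mid <- fragment_midpoints(demo_reads, fragment.length = 200L)
print(mid)
```

A function like this would be supplied as summarizeOverlaps(features, reads, preprocess.reads = fragment_midpoints), adding singleEnd=FALSE when reads is a BAM file of paired-end data. Keeping both branches in one function means the same call works for either library type, at the cost of a fragment length default that has to be set sensibly for the single-end case.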

