Read counts between QSEA and MEDIPS
I'm comparing MEDIPS and QSEA for methylation analysis of pulldown sequencing data. I've split my genome into 200 bp windows and then added the data from my BAM files.

My question is about a difference between the two: MEDIPS adds a read to every window that it overlaps, whereas QSEA only assigns the read to the window where the read's centre of mass lies. This makes a big difference to the normalised counts, particularly for windows that are split around the centre of a peak.

I'm interested in the rationale behind the two design decisions, and the potential ramifications downstream.

The difference in counting between MEDIPS and QSEA is mainly a design choice. I preferred to unambiguously assign each fragment to exactly one window, such that each fragment is counted exactly once. When a fragment is counted for each overlapping window, its total count depends on the number of window boundaries it spans, which is arbitrary. As in your example, when counting reads for all overlapping windows, a peak centered at a window split would "contain" more counts than a peak centered within a window, even though the number of reads is the same.
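
To make the arithmetic concrete, here is a toy Python sketch of the two counting modes. It is not the MEDIPS or QSEA code: the function names, the 0-based half-open coordinates, and the simulated 150 bp fragments are all assumptions for illustration only.

```python
from collections import Counter

WINDOW = 200  # window size in bp, as in the question


def count_overlap(fragments, window=WINDOW):
    """Count each fragment once in every window it overlaps."""
    counts = Counter()
    for start, end in fragments:  # half-open interval [start, end)
        first = start // window
        last = (end - 1) // window
        for w in range(first, last + 1):
            counts[w] += 1
    return counts


def count_center(fragments, window=WINDOW):
    """Count each fragment exactly once, in the window holding its midpoint."""
    counts = Counter()
    for start, end in fragments:
        center = (start + end) // 2
        counts[center // window] += 1
    return counts


def toy_peak(center, length=150):
    """Ten hypothetical fragments scattered a few bp around a peak centre."""
    half = length // 2
    return [(center - half + d, center + half + d) for d in range(-10, 10, 2)]


at_boundary = toy_peak(200)  # peak sits on the boundary between windows 0 and 1
mid_window = toy_peak(300)   # peak sits in the middle of window 1

for name, frags in [("boundary", at_boundary), ("mid-window", mid_window)]:
    print(name,
          "overlap:", dict(count_overlap(frags)),
          "center:", dict(count_center(frags)))
```

With these toy fragments, overlap counting gives the boundary peak a total of 20 counts (10 in each of the two windows) but the mid-window peak only 10, while center counting gives a total of 10 in both cases (split 5 and 5 across the boundary).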

This also has an impact on follow-up analysis. In QSEA, the estimation of background reads and the enrichment factor analysis assume that reads are counted only once, at the center position. For this reason, the counting mode is not optional.

Hi Matthias, Sorry, I didn't realise that you had responded, as I didn't get a notification. Thank you for your reply; to be honest, that is what I expected the reason to be. I guess it means that I should try to make windows that are centred on the regions I am particularly interested in.

Thanks, Simon


