Question

ChIPseq-style peak calling for paired-end sequencing?

4

Entering edit mode

samuel collombet ▴ 140

@samuel-collombet-6574

Last seen 7.4 years ago

France

Hi,

I was wondering of there was any package implementing a peak calling method that would take paired-end data (for ATACseq precisely).

I am checked the packages narrowPeaks, BayesPeaks and SPP (which is not in bioconductor), but none seems to work with paired-end data.
I am currently using macs2, which works fine, but as I do all further step in R, it would have come handy to have an R software do to this step as well.

Chipseq DNAseq ATAC bayespeak narrowpeaks • 6.5k views

ADD COMMENT • link updated 9.4 years ago by Gordon Smyth 51k • written 9.5 years ago by samuel collombet ▴ 140

score 3 · Answer 1 · 2015-05-27

3

Entering edit mode

Gordon Smyth 51k

@gordon-smyth

Last seen 2 hours ago

WEHI, Melbourne, Australia

csaw works with paired-end reads. It won't call peaks for an individual library in isolation, but it can compare a ChIP library to input.

ADD COMMENT • link 9.4 years ago Gordon Smyth 51k

0

Entering edit mode

Thanks Gordon! In the case of ATAC we do not have a background/input to compare to (like DNAse), but I could use the method of filtering by local enrichment you propose in the csaw user guide, to mimic peak calling. Thanks!

ADD REPLY • link 9.4 years ago samuel collombet ▴ 140

score 0 · Answer 2 · 2015-05-18

0

Entering edit mode

Jonathan Cairns ▴ 10

@jonathan-cairns-6949

Last seen 7.4 years ago

Cambridge, UK

Hi,

I am not aware of any ChIP-seq peak-callers that specifically use the paired-end information, rather than just pretending the reads are single-end. Historically, paired-end ChIP-seq is pretty rare compared to single-end, which is probably why the problem has not been addressed fully. Of course, ATAC-seq has big differences from ChIP-seq. Perhaps a paired-end peak-caller will be of much greater value in this setting; somebody more familiar with ATAC-seq may be able to offer you an opinion.

As far as ChIP-seq analysis goes, two big advantages of paired-end data are 1) more specific removal of PCR duplicates and 2) better estimation of average fragment length, both of which can often be done as pre-processing steps without a paired-end specific peak-caller. Then, the data can be fed in to your favourite peak-caller as single-end reads. How appropriate that is for ATAC-seq analysis is something I don't feel qualified to say with any certainty.

ADD COMMENT • link 9.5 years ago Jonathan Cairns ▴ 10

0

Entering edit mode

I might have misunderstood something, but I think that macs2 in mode BAMPE do take the fact that data are paired-end into account (extending reads to the fragment length in the bam file, not an average fragment length).

ADD REPLY • link 9.5 years ago samuel collombet ▴ 140

0

Entering edit mode

I believe that they use paired-end reads to predict average fragment length, as per point 2) in my second paragraph. However, as far as I'm aware, the pairings do not affect the actual model they use to call peaks.

ADD REPLY • link 9.5 years ago Jonathan Cairns ▴ 10

1

Entering edit mode

from https://groups.google.com/forum/#!searchin/macs-announcement/BAMPE/macs-announcement/Vh916L3tOOE/TUjLeQtsCfkJ :
"In BAMPE mode, the 5’ end plus the observed template length will both be recorded so in later analysis, MACS2 piles up the actual entire observed fragment/template instead of estimating a fixed DNA fragment length"
So it looks like they do not comptue an average length, but use the indicated fragment length for each fragment :)

ADD REPLY • link 9.5 years ago samuel collombet ▴ 140

0

Entering edit mode

I stand corrected then. Thanks! (Though whether that procedure is a sensible thing to do or not is a separate question!)

ADD REPLY • link 9.5 years ago Jonathan Cairns ▴ 10

score 0 · Answer 3 · 2015-05-26

0

Entering edit mode

Pedro Madrigal • 0

@pedro-madrigal-7834

Last seen 9.4 years ago

Cambridge, UK

Hi,

I think for ATAC-seq you might consider peak callers that have worked good with DNase. Have a look to this paper

Koohy H, Down TA, Spivakov M, Hubbard T (2014) A Comparison of Peak Callers Used for DNase-Seq Data. PLoS ONE 9(5): e96303.

We also reviewed many peak callers here (ChIP-seq only)

Bailey T, Krajewski P, Ladunga I, Lefebvre C, Li Q, Liu T, et al. (2013) Practical Guidelines for the Comprehensive Analysis of ChIP-seq Data. PLoS Comput Biol 9(11): e1003326.

ADD COMMENT • link 9.4 years ago Pedro Madrigal • 0

0

Entering edit mode

In the original paper describing ATACseq, they used ZINBA. In fact macs2 works very well on our data, but I was wodnering if something in R existed that would handle paired end :)

ADD REPLY • link 9.4 years ago samuel collombet ▴ 140