DiffBind dropping peaks in ATAC-seq data
Nebat • 0
@67938a65
Last seen 9 months ago
United States

Hi all,

I'm new to ATAC-seq analysis and have recently been trying to use DiffBind to systematically identify the differential peaks that I can see by eye when looking at MACS2 output in IGV. I have two conditions in triplicate, and I have done combined MACS2 runs on nucleosome-free regions for each condition as well as on the entire dataset. In IGV I can identify peaks in the pooled dataset that are unique to one condition or the other, but when I run DiffBind only a few peaks, or none, are called as differential, depending on the parameters. It seems I might only be getting hits for peaks that are present in both conditions and differ significantly in read count within the peak, but I'm not sure. Any tips or recommendations for analyzing this sort of ATAC-seq dataset with DiffBind would be greatly appreciated.

DiffBind ATACSeq • 806 views
Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 15 days ago
Cambridge, UK

It is difficult to know what exactly is going on without more information (script, output).

Generally, changes need to be consistent across the replicates within each sample group, and different between sample groups. The higher the variance within a sample group, the more replicates are required to be confident that the changes are real. Pooling samples can mask this variance, so it is important to look at the counts for every replicate. You can retrieve the counts using the dba.peakset() function with bRetrieve=TRUE.
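
To make the point about pooling concrete, here is a minimal Python sketch (not DiffBind code). The counts are made-up numbers for a single hypothetical peak, and the `summarize` helper is purely illustrative. It shows how a pooled total can suggest a condition-specific peak even though only one replicate supports it.

```python
from statistics import mean, variance

# Hypothetical normalized read counts for ONE peak in each replicate.
# These numbers are invented for illustration; they are not from the thread.
counts = {
    "condition_A": [120, 8, 10],  # a single replicate drives the pooled signal
    "condition_B": [9, 11, 10],
}


def summarize(replicates):
    """Return the pooled total, mean, and sample variance of replicate counts."""
    return {
        "pooled": sum(replicates),
        "mean": mean(replicates),
        "variance": variance(replicates),
    }


summary = {group: summarize(reps) for group, reps in counts.items()}
for group, stats in summary.items():
    print(group, stats)
```

In this toy case the pooled count for condition A is several times that of condition B, which would look like a unique peak in a pooled IGV track, yet the within-group variance for A is very large because two of its three replicates look just like B. A replicate-aware differential test would be expected to treat such a peak with caution.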

Feel free to send me your full DBA object and I can have a look.

Malcolm Cook ★ 1.6k
@malcolm-cook-6293
Last seen 9 days ago
United States

Nice to see you have 3 replicates per condition.

Try this:

Create two peak sets independently, one per condition, using Genrich (or something like it), which appropriately handles multiple replicates (unlike macs2, which doesn't). (Note: if using Genrich, I recommend you first try using parameter a=0.)

Combine the peak sets into a single reference peak set based on overlap and/or proximity. R/Bioconductor's GenomicRanges can make this easy, as can bedtools.

Perform differential chromatin accessibility analysis on the reference peakset. I use csaw at this point, but DiffBind might perform well.
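
The "combine by overlap and/or proximity" step can be sketched in plain Python as below. This is only an illustration of the idea, not the GenomicRanges or bedtools implementation; the function name `merge_peaks`, the `max_gap` parameter, and the toy coordinates are all assumptions made for the example. In practice, `reduce()` from GenomicRanges or `bedtools merge` does this kind of job.

```python
def merge_peaks(peak_sets, max_gap=0):
    """Merge peaks from several peak sets into one reference peak set.

    Each peak is a (chrom, start, end) tuple with half-open coordinates,
    as in BED. Peaks on the same chromosome are merged when they overlap,
    touch, or are separated by at most `max_gap` bases.
    """
    # Pool all peaks and sort by chromosome, then start, then end.
    all_peaks = sorted(peak for peaks in peak_sets for peak in peaks)
    merged = []
    for chrom, start, end in all_peaks:
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            # Close enough to the previous interval: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            # Different chromosome or too far away: start a new interval.
            merged.append((chrom, start, end))
    return merged


# Toy peak calls for the two conditions (hypothetical coordinates).
cond_a = [("chr1", 100, 200), ("chr1", 500, 600)]
cond_b = [("chr1", 150, 250), ("chr1", 640, 700), ("chr2", 100, 200)]

reference = merge_peaks([cond_a, cond_b], max_gap=50)
print(reference)
```

Note that peaks called in only one condition (here the chr2 peak) are kept in the reference set, so regions unique to one condition remain available for counting and testing in the differential step.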
