Question: DiffBind for open chromatin mapping
19 months ago, rbronste wrote:

Hi everyone, I wanted to run my open chromatin mapping methodology with DiffBind by you and see if there is room for improvement. I have 3-4 replicates per experiment, and the vast majority of underlying peaks will be shared between groups. The sample sheet just lists the BAMs and the narrowPeak files from the macs2 output, and I am using CONDITION as the contrast. The data come from paired-end 75 bp NextSeq high-output runs.

Running the following:

# Sample sheet lists the BAMs and macs2 narrowPeak files for each replicate
samples <- read.csv(file.path(system.file("extra", package="DiffBind"), "RPMG_DNAse.csv"))

RPMG <- dba(minOverlap = 2, sampleSheet = "RPMG_DNAse.csv", peakCaller = "macs", peakFormat = "narrow", config = data.frame(AnalysisMethod = DBA_EDGER, fragmentSize = 151))

RPMG <- dba.count(RPMG, summits = 250)

RPMG <- dba.contrast(RPMG, categories = DBA_CONDITION)

RPMG <- dba.analyze(RPMG)

# Retrieve the differentially bound sites before viewing them
RPMG.DB <- dba.report(RPMG)
View(RPMG.DB)

I noticed that I am getting slightly more peaks using edgeR than DESeq2, but still very few differential peaks given a consensus peak set of over 90K. Wondering about everyone's thoughts, thanks!
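For readers who want to make the edgeR versus DESeq2 comparison explicit, a minimal sketch follows. It assumes `RPMG` already holds counts and a contrast as in the code above, and that the installed DiffBind version defines `DBA_ALL_METHODS`. The contrast index `1` is an assumption about which comparison is of interest.

```r
library(DiffBind)

# Assumes RPMG has already been through dba.count() and dba.contrast().
# Run the differential analysis with every supported method in one call.
RPMG <- dba.analyze(RPMG, method = DBA_ALL_METHODS)

# List the contrasts together with the number of differential sites
# reported by each method.
dba.show(RPMG, bContrasts = TRUE)

# Draw the overlap between the sites called by each method for the
# first contrast (contrast = 1 is an assumption; adjust as needed).
dba.plotVenn(RPMG, contrast = 1, method = DBA_ALL_METHODS)
```

If the two methods largely agree on a small set of sites, that would suggest the low count is not simply an artifact of the choice of method.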

modified 18 months ago • written 19 months ago by rbronste


You might try again without setting summits in dba.count. If the regions are relatively broad, setting summits=250 might only capture the middle of each region. Also, fragmentSize looks a little short... is that the actual mean fragment size? Other than that, the steps look right (though it's not useful to read the sample sheet into 'samples' and then read it again in the call to dba).
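The suggestions above could be applied roughly as follows. This is only a sketch: `frag_size` is a hypothetical placeholder, not a value from this thread, and should be replaced with the mean fragment size measured from the actual libraries. It also assumes a DiffBind version in which leaving out `summits` leaves the consensus intervals untouched. In more recent releases (3.0 and later, as far as I know) `dba.count` re-centers peaks by default, so `summits = FALSE` may need to be passed explicitly there.

```r
library(DiffBind)

# Assumption: placeholder only. Substitute the mean fragment size
# estimated from your own libraries.
frag_size <- 200

# dba() reads the sample sheet itself, so a separate read.csv() is not needed.
RPMG <- dba(sampleSheet = "RPMG_DNAse.csv", minOverlap = 2,
            peakCaller = "macs", peakFormat = "narrow",
            config = data.frame(AnalysisMethod = DBA_EDGER,
                                fragmentSize = frag_size))

# No summits argument: reads are counted over the consensus intervals
# as merged from the peak files, not over fixed-width windows.
RPMG <- dba.count(RPMG)

RPMG <- dba.contrast(RPMG, categories = DBA_CONDITION)
RPMG <- dba.analyze(RPMG)

# Differentially bound sites for the first contrast.
RPMG.DB <- dba.report(RPMG)
```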

Also, maybe update DiffBind... a bug-fix was just submitted on Friday (though it shouldn't affect your example here).

How many peaks are you expecting, and how many are you getting differentially bound?  Can you post a screen shot showing a region that seems like it ought to be called as differentially-bound, but isn't?

 - Gord


written 18 months ago by Gord Brown


Hi Gord,

Thanks for the advice. I amended the fragment size, as you were right that it was a little low, and removed the summits option. I am still getting a small number of differential peaks, but only for one specific comparison. It may represent actual biology, but I am just trying to look at every angle. I would expect 100-200 differential peaks but am only getting on the order of 50 or so. Will work on finding that region.


written 18 months ago by rbronste

Is there anything in particular you would do throughout this analysis, keeping in mind it's not ChIP-seq but DNase-seq?


written 18 months ago by rbronste

I'm not specifically experienced with DNase-seq. The criteria that are likely to matter are those I've already mentioned... fragment length and peak width. Other than that, how much variability is there within your groups? If the within-group variability is high, then it's harder to identify differentially-bound sites. If you plot the principal components analysis via dba.plotPCA (after carrying out the differential binding analysis), is there clear separation between the groups? What about the unbiased PCA (i.e. before differential analysis)?
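The two PCA views mentioned above might be produced like this. It is a sketch that assumes `RPMG` has already been through `dba.count`, `dba.contrast` and `dba.analyze`, and that the comparison of interest is the first contrast.

```r
library(DiffBind)

# PCA on all consensus sites (the "unbiased" view), with samples
# coloured by condition and labelled by sample ID.
dba.plotPCA(RPMG, attributes = DBA_CONDITION, label = DBA_ID)

# PCA restricted to the sites called differentially bound in the
# first contrast (contrast = 1 is an assumption; adjust as needed).
dba.plotPCA(RPMG, contrast = 1, attributes = DBA_CONDITION, label = DBA_ID)
```

If replicates from the same condition do not cluster together in the first plot, high within-group variability is a plausible reason for the short list of differential sites.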

Where does your expectation of 100-200 differentially-bound peaks come from?  It's hard to guess why you're getting fewer peaks, without any idea why that's the expected number.  Can you provide more information on that?

 - Gord

written 18 months ago by Gord Brown