There are a number of reasons why you may see differences between peaks called in different sample groups (such as peaks unique to one group) and those identified as significantly differentially bound in a quantitative analysis.
On the issue of consensus peaks, I would consider a consensus peakset that includes more peaks, including more peaks that are only identified in a few samples (such as a subset of replicates), to be more lenient, representing a union of possible peaks. By default, DiffBind
should result in such an expansive consensus peakset, relying on the statistical analysis to assign low confidence scores to poor peaks rather than eliminating them up front. A more "strict" consensus could be formed by taking an intersection, with the problem that only the peaks seen in all samples -- which may be changing the least -- will be retained.
You can check which peaks are present in the consensus peakset by retrieving them in a GRanges
object using dba.peakset()
with bRetrieve=TRUE
. Are the genomic intervals corresponding to the peaks you see in each of your sample groups present? Alternatively, instead of using the default consensus peakset (all peaks called in at least two of the samples in the experiment), the DiffBind
vignette contains a section showing how to more systematically create a consensus peakset along the lines you suggest, by first taking an intersection of the replicates in each sample group, then combining these (union) into a consensus.
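As a minimal sketch of both checks (assuming a DiffBind object called dbObj with a Condition metadata column; the object and variable names here are illustrative, not from your experiment):

```r
library(DiffBind)

# Retrieve the current consensus peakset as a GRanges object, so you can
# check whether the intervals you expect are actually present:
consensus.peaks <- dba.peakset(dbObj, bRetrieve=TRUE)

# Vignette-style systematic consensus: first take the peaks shared by at
# least two replicates within each Condition...
dbObj.consensus <- dba.peakset(dbObj, consensus=DBA_CONDITION, minOverlap=2)

# ...then form the union of the per-condition consensus peaksets...
dbObj.consensus <- dba(dbObj.consensus,
                       mask=dbObj.consensus$masks$Consensus,
                       minOverlap=1)
consensus.peaks <- dba.peakset(dbObj.consensus, bRetrieve=TRUE)

# ...and re-count reads against this consensus for the quantitative analysis:
dbObj.counted <- dba.count(dbObj, peaks=consensus.peaks)
```

The minOverlap value controls how strict the within-group consensus is; it can also be given as a fraction of the replicates rather than an absolute count.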
So why aren't you seeing the differential peaks you expect? Often the reason is that the experiment does not have enough power to confidently identify differences. This occurs when there is a lot of variance between replicates without enough replicates to capture their distribution.
You can check for this in a number of ways. After counting reads, look at the clustering (using dba.plotHeatmap()
) and PCA (using dba.plotPCA()
) to see what is driving the variance in your samples. After performing an analysis, you can retrieve all of the statistics using dba.report()
and setting th=1
to see all the consensus peaks. This will show you the relative concentrations of reads in each sample group and the calculated fold change, along with the p-value and FDR confidence scores. You can also set bCounts=TRUE
to see the normalized read scores for each peak in each of the samples to see if this corresponds with what you are seeing in your genome browser.
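Put together, these checks look something like the following (a sketch assuming an analyzed DiffBind object dbObj, i.e. after dba.count(), dba.contrast(), and dba.analyze(); the object name is illustrative):

```r
# Clustering and PCA to see what is driving variance between samples:
dba.plotHeatmap(dbObj)
dba.plotPCA(dbObj, attributes=DBA_CONDITION, label=DBA_ID)

# Retrieve statistics for ALL consensus peaks (th=1 disables the FDR
# threshold), including the normalized read counts for each sample:
report <- dba.report(dbObj, th=1, bCounts=TRUE)

# The result is a GRanges object with the group concentrations, fold
# change, p-value, FDR, and per-sample count columns:
report
```

Sorting this report by a peak of interest's coordinates lets you compare the normalized counts directly against what you see in the genome browser.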
Another reason for not seeing the differential peaks you expect can be an inappropriate normalization, where the biological variance of interest is normalized away. You can use the dba.plotMA()
function to look at the distributions of reads in the raw data, and compare to how this changes under different normalization schemes.
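For example (again a sketch assuming a counted DiffBind object dbObj with a contrast set; the specific normalization choices here are just two common options to compare):

```r
# MA plot of the raw, non-normalized data:
dba.plotMA(dbObj, bNormalized=FALSE)

# Library-size normalization, then analyze and plot:
dbObj <- dba.normalize(dbObj, normalize=DBA_NORM_LIB)
dbObj <- dba.analyze(dbObj)
dba.plotMA(dbObj)

# RLE normalization on background bins, then analyze and plot again:
dbObj <- dba.normalize(dbObj, normalize=DBA_NORM_RLE, background=TRUE)
dbObj <- dba.analyze(dbObj)
dba.plotMA(dbObj)
```

If one scheme centers the bulk of unchanged peaks on zero fold change while another shifts or flattens genuine differences, that is a sign the latter is normalizing away the biological signal.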
The bottom line is that using overlaps of called peaks is not really a reliable way to detect differences in enrichment patterns. We have shown that peaks unique to one condition do not necessarily stand up to a robust quantitative analysis, and, conversely, that peaks called in all samples across conditions can still have very low FDR scores, indicating that their binding affinity is consistently different between samples in different sample groups.