I have ChIP-seq data with two conditions. In condition A, there should be a lot of peaks (normal binding); in condition B, there should be none (no binding). Calling peaks with MACS confirms this. When I perform differential binding analysis, I would expect most of the significant peaks to be up in condition A. However, I get half of the peaks up and half down, which does not make sense.
I am using DiffBind. My first instinct was that the read counts were being normalized, so that in condition B (where there are few reads in the peak regions) those few reads get scaled up to a much higher level. However, that should not be the case according to the manual:
When dba.analyze is invoked using the default method=DBA_EDGER, a standardized differential analysis is performed using the edgeR package. ... First, a matrix of counts is constructed for the contrast, with columns for all the samples in the first group, followed by columns for all the samples in the second group. The raw read count is used for this matrix; if the bSubControl parameter is set to TRUE (as it is by default), the raw number of reads in the control sample (if available) will be subtracted (with a minimum final read count of 1). Next the library size is computed for each sample for use in subsequent normalization. By default, this is the total number of reads in the library (calculated from the source BAM/BED file). Alternatively, if the bFullLibrarySize parameter is set to FALSE, the total number of reads in peaks (the sum of each column) is used.
That disproves my initial theory. Is there another explanation?
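For reference, here is a minimal sketch of how one could check whether the normalization choice is driving the symmetric result, by running the same contrast with bFullLibrarySize set to TRUE and then FALSE. This assumes the DiffBind 2.x-style parameters described in the quoted vignette; "samples.csv", the Condition metadata column, and the minMembers value are placeholders for my actual setup.

```r
## A minimal sketch, assuming the DiffBind 2.x-style API described in the
## quoted vignette. "samples.csv", the Condition column, and minMembers
## are placeholders for the actual experiment setup.
library(DiffBind)

dba_obj <- dba(sampleSheet = "samples.csv")   # peaksets + BAM locations
dba_obj <- dba.count(dba_obj)                 # build the consensus count matrix
dba_obj <- dba.contrast(dba_obj,
                        categories = DBA_CONDITION,
                        minMembers = 2)       # condition A vs condition B

## Default normalization: full library size (total reads in each BAM)
res_full <- dba.analyze(dba_obj, method = DBA_EDGER,
                        bFullLibrarySize = TRUE)

## Alternative: normalize to reads in peaks (column sums of the count matrix)
res_rip <- dba.analyze(dba_obj, method = DBA_EDGER,
                       bFullLibrarySize = FALSE)

## Count how many significant sites go up vs down under each normalization
rep_full <- dba.report(res_full, method = DBA_EDGER)
rep_rip  <- dba.report(res_rip,  method = DBA_EDGER)
table(rep_full$Fold > 0)
table(rep_rip$Fold > 0)
```

If the up/down split changes markedly between the two normalizations, that would point back at normalization after all; if it is stable, the explanation presumably lies elsewhere (for example in how the consensus peakset or the control subtraction is handled).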