Re-centering around summits isn't really meant to embed an assumption regarding the expected width of enriched areas.
Rather, it is there to address issues relating to highly variable peak widths, especially after merging peak calls from multiple replicates.
We are trying to reduce the proportion of peaks that consist mostly of background reads, because such peaks increase technical variance.
The more refined (narrow) the peak boundaries are, the fewer background bases are included, leading to higher confidence in assessing differential enrichment.
The idea is that we don't really need to know the "true" boundaries of enriched areas, but rather work with subsets of those regions that are likely to exhibit enrichment across replicates in at least one sample group, and test those higher-confidence areas for differential enrichment.
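As a rough illustration of the re-centering idea (not DiffBind's actual implementation), the sketch below takes merged peaks of very different widths and replaces each with a fixed-width interval around its summit. The peak coordinates, the `recenter` helper, and the half-width of 200 bp are all made up for the example; coordinates are assumed to be 1-based and inclusive.

```python
def recenter(summit, half_width, chrom_length=None):
    """Return a 1-based inclusive (start, end) interval centred on `summit`.

    The interval spans `half_width` bases on each side of the summit, so its
    width is 2 * half_width + 1 unless it is clipped at a chromosome end.
    """
    start = max(1, summit - half_width)
    end = summit + half_width
    if chrom_length is not None:
        end = min(end, chrom_length)
    return start, end


# Hypothetical merged peaks: (chromosome, start, end, summit position).
merged_peaks = [
    ("chr1", 1000, 4500, 1800),  # broad peak after merging replicates
    ("chr1", 9000, 9300, 9150),  # already narrow
    ("chr2", 100, 2600, 150),    # summit close to the chromosome start
]

original_widths = [end - start + 1 for _chrom, start, end, _summit in merged_peaks]

# Re-center every peak on its summit with 200 bp on each side.
recentered = [
    (chrom, *recenter(summit, 200))
    for chrom, _start, _end, summit in merged_peaks
]
recentered_widths = [end - start + 1 for _chrom, start, end in recentered]

print(original_widths)    # widths before re-centering
print(recentered_widths)  # widths after re-centering
```

After re-centering, the widths are uniform apart from the interval clipped at the start of the chromosome, so the broad merged peak no longer contributes thousands of mostly-background bases to the read counts.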
Remember, when a tool like DiffBind calculates that a certain region is significantly differentially enriched, it is not saying that areas outside of these regions are not differentially enriched.
The important thing to keep in mind is the scientific purpose of performing the analysis, in particular, what you are going to do with the regions identified as exhibiting differential enrichment.
In many cases, the next step is to annotate these regions and calculate their proximity to known genomic features (such as promoters or gene bodies), or to calculate their proximity to other differentially enriched features (such as histone marks indicating an active enhancer).
Often these are then correlated with some other functional feature, such as transcript expression or chromatin looping.
For these purposes, it is not necessary to know the precise boundaries of the enrichment, only that there is differential enrichment proximal to features of interest.
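To make the proximity point concrete, here is a minimal sketch of computing the distance from a differential region to the nearest feature position, such as a transcription start site. It assumes a single chromosome, a sorted list of feature positions, and made-up coordinates; `distance_to_nearest` is a hypothetical helper, not a function from any annotation package.

```python
import bisect


def distance_to_nearest(region_start, region_end, positions):
    """Distance in bases from a region to the nearest feature position.

    `positions` must be sorted and on the same chromosome as the region.
    Returns 0 if a feature lies inside the region, or None if there are
    no features at all.
    """
    if not positions:
        return None
    # Index of the first feature at or after the region start.
    i = bisect.bisect_left(positions, region_start)
    if i < len(positions) and positions[i] <= region_end:
        return 0  # a feature falls within the region
    candidates = []
    if i > 0:
        candidates.append(region_start - positions[i - 1])  # nearest upstream
    if i < len(positions):
        candidates.append(positions[i] - region_end)  # nearest downstream
    return min(candidates)


# Hypothetical TSS positions on one chromosome, sorted.
tss_positions = [500, 5000, 12000]

# Hypothetical re-centered regions found to be differentially enriched.
regions = [(1600, 2000), (4900, 5300), (12500, 12900)]

distances = [distance_to_nearest(s, e, tss_positions) for s, e in regions]
print(distances)
```

Moving a region's boundaries by a few hundred bases changes these distances only slightly, which is why the exact extent of enrichment rarely matters for this kind of annotation.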
If more precision regarding the extent of enrichment (e.g., open chromatin) is required to meet the objectives of the study, the re-centered regions identified as differential can be examined across replicates to more precisely determine the enrichment boundaries.
An approach that does not rely on peak calling, such as that used in csaw, makes fewer assumptions about the extent of enriched regions and can test small windows (even down to the base pair level) individually.