DiffBind version 2.6.6 to do ChIP-seq differential binding analysis for a transcription factor that binds at genes transcription start sites (TSS).
I generated the peaks intervals with MACS2 and since I'm specifically interested only in TSS regions I used
bedtools interesect to intersect the TF peaks with a bed file of TSS +/- 1250bp. In this way I generate a bed file with peaks that are centred on the TSS and all have 2500bp width.
Now, when I move to
DiffBind for the differential analysis I used the function
dba.count() with the parameter
summits set to 1250 since I've read that in this way I can:
- Keep all the peaks with the same width
- Re-center each peak around the point of greatest enrichment
I'm happy with the fact that the peaks maitain the same width but is it possible instead NOT to re-center the peaks but keep the original peaks coordinates I input to
dba.count()? The re-centering does not drastically change the start and end position of the peaks, usually it moves them around of 5-10 bp centered on the peak summit instead of the TSS.
For example there's a peak with coordinates
chr6 91472296 91474796
that after using
dba.count(dba, summits = 1250) it becomes
chr6 91472294 91474794
Here is how I create the dba object:
Peaks <- "~/TSS_intersected/All_TSS_Intersect_peaks_q0.001_2500bp.bed" metadata <- data.frame(SampleID, ControlID, Tissue, Factor, Condition,
Replicate, bamReads, bamControl, Peaks, PeakCaller) config_df <- data.frame(RunParallel = TRUE, reportInit = "TF_DBA", DataType = DBA_DATA_FRAME, AnalysisMethod = DBA_EDGER, minQCth = 10, fragmentSize = 250, bCorPlot = TRUE, th = 0.1, bUsePval = FALSE, bFullLibrarySize = TRUE, bReduceObjects = F, bUseSummarizeOverlaps = F) dba <- dba(sampleSheet = metadata, config = config_df) dba_cnt <- dba.count(dba, summits = 1250)
I'm interested in this because I'd like to compared my dataset with another TF ChiP-seq, so I care to have both peak-sets with the same coordinates.