Counting reads in fixed width peaks with DiffBind::dba.count without re-centering?
1
1
Entering edit mode
Ni-Ar ▴ 10
@ni-ar-15663
Last seen 14 months ago
Spain/Barcelona

Hello,

I'm using DiffBind version 2.6.6 to do ChIP-seq differential binding analysis for a transcription factor that binds at genes transcription start sites (TSS).

I generated the peaks intervals with MACS2 and since I'm specifically interested only in TSS regions I used bedtools interesect  to intersect the TF peaks with a bed file of TSS +/- 1250bp. In this way I generate a bed file with peaks that are centred on the TSS and all have 2500bp width.

Now, when I move to DiffBind for the differential analysis I used the function dba.count() with the parameter summits set to 1250 since I've read that in this way I can:

1. Keep all the peaks with the same width
2. Re-center each peak around the point of greatest enrichment

I'm happy with the fact that the peaks maitain the same width but is it possible instead NOT to re-center the peaks but keep the original peaks coordinates I input to dba.count()The re-centering does not drastically change the start and end position of the peaks, usually it moves them around of 5-10 bp centered on the peak summit instead of the TSS.

For example there's a peak with coordinates

chr6 91472296 91474796


that after using dba.count(dba, summits = 1250) it becomes

chr6 91472294 91474794


Here is how I create the dba object:

Peaks <- "~/TSS_intersected/All_TSS_Intersect_peaks_q0.001_2500bp.bed"

metadata <- data.frame(SampleID, ControlID, Tissue, Factor, Condition,
                       Replicate, bamReads, bamControl, Peaks, PeakCaller)

config_df <- data.frame(RunParallel = TRUE, reportInit = "TF_DBA",
DataType = DBA_DATA_FRAME, AnalysisMethod = DBA_EDGER,
minQCth = 10, fragmentSize = 250, bCorPlot = TRUE,
th = 0.1, bUsePval = FALSE, bFullLibrarySize = TRUE,
bReduceObjects = F, bUseSummarizeOverlaps = F)

dba <- dba(sampleSheet = metadata, config = config_df)

dba_cnt <- dba.count(dba, summits = 1250)

I'm interested in this because I'd like to compared my dataset with another TF ChiP-seq, so I care to have both peak-sets with the same coordinates.

Thank you!

diffbind chipseq counts peaks bedtools • 694 views
1
Entering edit mode
Rory Stark ★ 3.9k
@rory-stark-5741
Last seen 37 minutes ago
CRUK, Cambridge, UK

If you do not want to re-center the peaks, there is no need to use the summits parameter. You can use the peaks parameter to specify the exact intervals you want:

> dba_cnt <- dba.count(dba, peaks=Peaks)

Cheers-

Rory

0
Entering edit mode

Hi Rory-

I realize this is an old post, but I have a related question and stumbled across this answer. I'm wanting to allow variable-width peaks (I'd like to use the same intervals as specified by the MACS2 narrowPeak files I'm using, with merging across replicatesby DiffBind). I've tried dba_cnt <- dba.count(dba, peaks=Peaks) but I get the error "Error in dba.count(dba, peaks = Peaks) : object 'Peaks' not found". Peaks is a column in my samples file with paths to the narrowPeak files. What is Peaks meant to be?

Thanks very much!

0
Entering edit mode

You don't need to specify anything for the peaks parameter; the generation of a consensus peakset consisting of variable length, merged peaks from specified MACS files is the default behavior, so long as your also specify summits=FALSE (which prevents the peaks from being reset to consistent intervals around the consensus summit).

The peaks= parameter is for when you have a pre-defined set of peaks to want to pass in so as to override the generation of a consensus peakset via overlaps.