Question: Counting reads in fixed width peaks with DiffBind::dba.count without re-centering?
1
gravatar for Ni-Ar
19 months ago by
Ni-Ar0
Germany
Ni-Ar0 wrote:

Hello,

I'm using DiffBind version 2.6.6 to do ChIP-seq differential binding analysis for a transcription factor that binds at genes transcription start sites (TSS). 

I generated the peaks intervals with MACS2 and since I'm specifically interested only in TSS regions I used bedtools interesect  to intersect the TF peaks with a bed file of TSS +/- 1250bp. In this way I generate a bed file with peaks that are centred on the TSS and all have 2500bp width.

Now, when I move to DiffBind for the differential analysis I used the function dba.count() with the parameter summits set to 1250 since I've read that in this way I can:

  1. Keep all the peaks with the same width
  2. Re-center each peak around the point of greatest enrichment

I'm happy with the fact that the peaks maitain the same width but is it possible instead NOT to re-center the peaks but keep the original peaks coordinates I input to dba.count()The re-centering does not drastically change the start and end position of the peaks, usually it moves them around of 5-10 bp centered on the peak summit instead of the TSS.

For example there's a peak with coordinates

chr6 91472296 91474796

that after using dba.count(dba, summits = 1250) it becomes

chr6 91472294 91474794

Here is how I create the dba object:

Peaks <- "~/TSS_intersected/All_TSS_Intersect_peaks_q0.001_2500bp.bed"

metadata <- data.frame(SampleID, ControlID, Tissue, Factor, Condition,
                       Replicate, bamReads, bamControl, Peaks, PeakCaller)

config_df <- data.frame(RunParallel = TRUE, reportInit = "TF_DBA",
                     DataType = DBA_DATA_FRAME, AnalysisMethod = DBA_EDGER,
                     minQCth = 10, fragmentSize = 250, bCorPlot = TRUE,
                     th = 0.1, bUsePval = FALSE, bFullLibrarySize = TRUE,
                     bReduceObjects = F, bUseSummarizeOverlaps = F)

dba <- dba(sampleSheet = metadata, config = config_df)

dba_cnt <- dba.count(dba, summits = 1250)

I'm interested in this because I'd like to compared my dataset with another TF ChiP-seq, so I care to have both peak-sets with the same coordinates. 

Thank you!

ADD COMMENTlink modified 18 months ago by Rory Stark3.0k • written 19 months ago by Ni-Ar0
Answer: Counting reads in fixed width peaks with DiffBind::dba.count without re-centerin
1
gravatar for Rory Stark
18 months ago by
Rory Stark3.0k
CRUK, Cambridge, UK
Rory Stark3.0k wrote:

If you do not want to re-center the peaks, there is no need to use the summits parameter. You can use the peaks parameter to specify the exact intervals you want:

> dba_cnt <- dba.count(dba, peaks=Peaks)

Cheers-

Rory

ADD COMMENTlink modified 18 months ago • written 18 months ago by Rory Stark3.0k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 297 users visited in the last hour