DiffBind merging and consensus peaks
omiguele • 0
@omiguele-11985

Dear Bioconductor community,

I am interested in using DiffBind. I am following the procedure in "DiffBind: Differential binding analysis of ChIP-Seq peak data" and have become a bit confused; I hope you can help me.

My first question is a bit general. If I understand correctly, from the very beginning (reading the peaksets) DiffBind only takes into account peaks that are merged (shared) between all the samples. So if there were two experimental conditions with two replicates each, and a peak was consistently found in both condition1 replicates but in neither condition2 replicate, that peak would not be taken into account. Is that correct?

My second question concerns the "Deriving consensus peaksets" section. On page 20, the text says:

"Alternatively, a master consensus peakset could be generated, and reads counted, directly using dba.count: tamoxifen <- dba.count(tamoxifen, peaks=tamoxifen$masks$Consensus)"

If I try this, I receive the following error:

"Error in pv.counts(DBA, peaks = peaks, minOverlap = minOverlap, defaultScore = score,  :
  Can't count: some peaksets are not associated with a .bam file."

I have my consensus peaksets (one per group of replicates) in the dba object, but they have no BAM files associated with them in the original "sampleSheet". Would you recommend merging the BAM files and loading a new sample sheet?

Thank you for your time and your attention,

Regards,

 

diffbind • 5.2k views

Regarding the second issue: there was an error in that section of the vignette. I have fixed the text to explain more clearly how to count reads with a separately constructed consensus peakset, and the fix should be released soon in DiffBind 2.2.6.

-R

Gord Brown ▴ 670
@gord-brown-5664
United Kingdom

Hi,

In regard to your first question, you can control how many peak sets have to include a peak for it to be included in the analysis. In both dba and dba.count, the parameter minOverlap controls this: if, for example, you supply the argument minOverlap=2, then any peak that occurs in at least 2 peak sets will be included. With that setting, a peak found in both condition1 replicates but in neither condition2 replicate would still be kept.
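To make the effect of minOverlap concrete for the two-condition, two-replicate case in the question, here is a minimal Python sketch. It is not DiffBind code: the sample names, site labels, and the helper `sites_kept` are all hypothetical, and it assumes overlapping peaks have already been merged into shared sites, so it only counts how many peaksets contain each site.

```python
# Hypothetical presence table: which merged sites were called in which peakset.
peaksets = {
    "cond1_rep1": {"siteA", "siteB", "siteC"},
    "cond1_rep2": {"siteA", "siteB"},
    "cond2_rep1": {"siteA", "siteD"},
    "cond2_rep2": {"siteA"},
}


def sites_kept(peaksets, min_overlap):
    """Return the sites that are present in at least `min_overlap` peaksets."""
    counts = {}
    for sites in peaksets.values():
        for site in sites:
            counts[site] = counts.get(site, 0) + 1
    return sorted(site for site, n in counts.items() if n >= min_overlap)


# siteB occurs only in the two condition1 replicates, yet it passes a
# threshold of 2; requiring all four peaksets would drop it.
print(sites_kept(peaksets, 2))
print(sites_kept(peaksets, 4))
```

In this toy table, a threshold of 2 keeps siteA and siteB, while a threshold of 4 (present in every peakset) keeps only siteA, which is the behaviour the question assumed was the default.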

I'll have to leave the second part to Rory... I don't really understand what he is (or you are) trying to accomplish there.

Cheers,

 - Gord


Hi Dr. Brown,

Thank you for your answer. Maybe I am confused about the merging concept. For example, in the tamoxifen dataset, once the peaksets are loaded, the first line of the dba object says:

“11 Samples, 2603 sites in matrix (3558 total)”

The 2603 sites are the ones shared by at least two of the 11 datasets (minOverlap=2), but if I use minOverlap=0 (or minOverlap=1) I get all 3558 sites, because those are all the available sites. However, 3558 is not the sum of the intervals in the 11 samples. Is this because you perform a "merge" (like a bedtools merge) on each of the 11 samples independently?

Regards,

Oscar Migueles


The merging process is described in section 7.2 of the package vignette. Peaks that overlap by at least one base between samples are "widened" to encompass the entire enriched region. We recommend using the summits parameter in dba.count() to center the peaks on the consensus summit and make them a uniform width.
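As a rough illustration of what recentering does to the widened peaks, the Python sketch below turns merged peaks of very different widths into intervals of one fixed width. It is not DiffBind code. The coordinates, the summit positions, and the helper `recenter` are hypothetical, and it assumes the summits value is the number of bases kept on each side of the summit; how the consensus summit itself is derived is not covered in this thread, so the summit is simply taken as an input.

```python
def recenter(summit, half_width):
    """Return a closed interval spanning `half_width` bases either side of `summit`."""
    return (summit - half_width, summit + half_width)


# Hypothetical merged peaks (closed intervals) with very different widths,
# and an assumed consensus summit position for each.
merged_peaks = [(1000, 1300), (5000, 6400)]
summit_positions = [1210, 5820]

uniform_peaks = [recenter(s, 250) for s in summit_positions]
widths = [end - start + 1 for start, end in uniform_peaks]

print(uniform_peaks)
print(widths)
```

Under these assumptions every recentered peak has the same width (2 * 250 + 1 = 501 bases), regardless of how wide the merged region was.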

Regarding the second issue, I'll look into this further in the next day or so.

 


Just to clarify, we run (the equivalent of) a bedtools merge on all of the samples together, not independently. We then count how many samples contributed to each (merged) peak. If that number is at least minOverlap, the peak is included.
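The merge-then-count procedure described here can be sketched in a few lines of Python. This is an illustration of the idea only, not DiffBind's implementation: the function names, sample names, and coordinates are hypothetical, intervals are treated as closed and on a single chromosome, and a sample is counted once per merged peak even if it contributes several intervals to it.

```python
def merge_and_count(samples):
    """Merge overlapping intervals pooled from all samples.

    `samples` maps a sample name to a list of (start, end) closed intervals.
    Returns a list of (start, end, number_of_contributing_samples).
    """
    # Pool every interval from every sample, remembering its source sample.
    pooled = sorted(
        (start, end, name)
        for name, intervals in samples.items()
        for start, end in intervals
    )
    merged = []  # each entry: [start, end, set of contributing sample names]
    for start, end, name in pooled:
        if merged and start <= merged[-1][1]:
            # Shares at least one base with the current merged peak: widen it.
            merged[-1][1] = max(merged[-1][1], end)
            merged[-1][2].add(name)
        else:
            merged.append([start, end, {name}])
    return [(s, e, len(names)) for s, e, names in merged]


def consensus(samples, min_overlap=2):
    """Keep merged peaks supported by at least `min_overlap` samples."""
    return [(s, e) for s, e, n in merge_and_count(samples) if n >= min_overlap]


samples = {
    "s1": [(100, 200), (500, 600)],
    "s2": [(150, 250), (900, 950)],
    "s3": [(240, 300)],
}
print(merge_and_count(samples))
print(consensus(samples, min_overlap=2))
```

Here five input intervals collapse into three merged sites, and only one of them is supported by at least two samples. This is why the total number of sites (3558 in the tamoxifen example) is smaller than the sum of the per-sample interval counts, and why the number of sites in the matrix (2603) is smaller still.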


Thank you for all your help,

Regards,

Oscar Migueles

 
