inconsistency between dba.plotVenn and input peaks
Silvia • 0

Hi there,

While using dba.plotVenn, I noticed that the peak counts shown in a sample's circle of the Venn diagram do not sum to the number of peaks in my input file. Going back to the DiffBind vignette ( http://bioconductor.org/packages/release/bioc/vignettes/DiffBind/inst/doc/DiffBind.pdf ), I found the same behaviour: in figure 14, the total number of peaks for MCF72 (43+47+57+885 = 1032) differs from the number of peaks for the MCF7 2nd replicate listed in the output of dba(sampleSheet="tamoxifen.csv") on page 4, which is 1037, so 5 peaks are missing from the Venn diagram.
Is there a specific reason for this behaviour of dba.plotVenn (or of the dba.overlap function it is based on), or is it a bug? If it is not a bug, how can I tune it so that the totals displayed match the number of peaks in the input file?
Thank you for your help!

Rory Stark ★ 5.2k

Hi Silvia-

The numbers don't add up because of overlapping peaks.

Consider a case where you have two peaksets, A and B. A consists of two small peaks, while B contains one peak. The two peaks in A both overlap the one peak in B. So A has two overlapping peaks, and B has one overlapping peak, and they all refer to one interval that contains all of them. There is no single value you could put in the middle of the Venn diagram -- it would have to be "2 from A and 1 from B".

To deal with this, DiffBind merges overlapping peaks. So in the above example, it would replace the two peaks in A and the peak in B with a single (likely wider) peak that encompasses all of them. As a result, the total number of peaks shown for a sample (the ones unique to that sample plus the ones that overlap with other samples) may be less than, but never greater than, the original number, because several peaks in that peakset can end up merged into a single peak.
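
To make the counting concrete, here is a minimal sketch of the merging idea in plain Python. It is an illustration only, not DiffBind's actual implementation: the coordinates are made up, the function names merge_intervals and count_covered are hypothetical, and it assumes closed intervals on a single chromosome where any shared position counts as an overlap.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals into their unions, sorted by start."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous merged interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def count_covered(merged, peaks):
    """Count merged intervals that overlap at least one peak in `peaks`."""
    return sum(
        any(s <= m_end and e >= m_start for s, e in peaks)
        for m_start, m_end in merged
    )


# Hypothetical peaksets: two peaks in A overlap the single peak in B,
# and A has one further peak that overlaps nothing.
peaks_a = [(100, 200), (250, 350), (1000, 1100)]
peaks_b = [(150, 300)]

merged = merge_intervals(peaks_a + peaks_b)
count_a = count_covered(merged, peaks_a)
count_b = count_covered(merged, peaks_b)

print("merged intervals:", merged)
print("A: input peaks =", len(peaks_a), ", merged peaks =", count_a)
print("B: input peaks =", len(peaks_b), ", merged peaks =", count_b)
```

In this toy case A goes in with three peaks but contributes only two to the Venn totals (one shared with B, one unique), because its first two peaks collapse into the same merged interval.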

The merging function is described in the document you reference above in Section 7.2.

Cheers-

Rory


Right, I hadn't thought about that... Thanks!
