DiffBind: dba.peakset vs dba.overlap
cperez5 • 0


Hello, all

I am wondering whether I am misunderstanding what a consensus peakset is. I am under the impression that it is the set of peaks contained within a group of samples, for example all peaks in the control replicates. When I use dba.overlap, the last number matches my Venn diagram, but when I add the consensus peaksets to my object, the number of intervals does not:

 

dba.overlap(test,test$masks$het  ,mode=DBA_OLAP_RATE)
#[1] 12102  6197  3116
dba.overlap(test,test$masks$control  ,mode=DBA_OLAP_RATE)
#[1] 36185 16097  8398

dba.peakset(test, consensus=DBA_CONDITION)
Add consensus: control
Add consensus: het
8 Samples, 16672 sites in matrix (36577 total):
       ID Tissue Condition Replicate Caller Intervals
1   IM-C1   iPSC   control         1   macs     29242
2   IM-C2   iPSC   control         2   macs     22238
3   IM-C3   iPSC   control         3   macs     12179
4  IM-H11   iPSC       het         1   macs      5177
5  IM-H12   iPSC       het         2   macs      8969
6  IM-H13   iPSC       het         3   macs      8320
7 control   iPSC   control     1-2-3   macs     16097
8     het   iPSC       het     1-2-3   macs      6197

 

dba.peakset instead adds the middle number, which I am not sure how to interpret and which does not appear in my Venn diagram. Is there any way to add the peaks identified in my replicates?

 

Thanks in advance.

Rory Stark ★ 5.2k

The issue has to do with the minOverlap parameter to dba.peakset() (and other functions, such as dba.count). The key is that this parameter has a default value minOverlap=2.

The vector returned by dba.overlap(...,mode=DBA_OLAP_RATE) contains the number of peaks that remain for different settings of minOverlap. So in your first example, if minOverlap=1, there will be 12,102 peaks, while if minOverlap=2, there will be 6,197 peaks. The last number, 3,116, which is the one that matches your Venn diagram, is the number of overlapping peaks that are identified in all three peaksets (minOverlap=3).
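To make the cumulative nature of that vector concrete, here is a minimal base-R sketch. It is only an illustration: it uses made-up peak IDs in place of genomic intervals (DiffBind merges overlapping intervals, which this toy ignores), and the names `peaksets`, `n_sets` and `overlap_rate` are hypothetical, not part of DiffBind.

```r
# Toy stand-in for three replicate peaksets; the letters are hypothetical peak IDs.
peaksets <- list(
  rep1 = c("a", "b", "c", "d", "e"),
  rep2 = c("a", "b", "c", "f"),
  rep3 = c("a", "b", "g")
)

# For each distinct peak, count how many peaksets contain it.
n_sets <- table(unlist(lapply(peaksets, unique)))

# Entry k is the number of peaks found in at least k peaksets,
# i.e. the peaks that would survive minOverlap = k in this toy model.
overlap_rate <- sapply(seq_along(peaksets), function(k) sum(n_sets >= k))
print(overlap_rate)
```

The vector can only decrease from left to right, which is the same shape as the three numbers returned in the question.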

In a three-way Venn diagram, the number in the middle is the number of peaks that overlap all three peaksets (minOverlap=3). When you call dba.peakset(...,consensus=DBA_CONDITION), by default minOverlap=2. This will include all the peaks in the middle of the Venn diagram (present in all peaksets), as well as those that overlap exactly two peaksets. If you add up the four central numbers of the three-way Venn (the three pairwise-only regions plus the middle), leaving out the peaks that are unique to one peakset, you should find this equals the number of peaks you are getting when you add the consensus peaksets.
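The Venn arithmetic can be checked on a small example. This is a base-R sketch with hypothetical peak IDs rather than real intervals, so it sidesteps the interval merging DiffBind performs; all variable names are illustrative.

```r
# Hypothetical peak IDs for three replicates.
rep1 <- c("a", "b", "c", "d", "h")
rep2 <- c("a", "b", "c", "e", "f")
rep3 <- c("a", "b", "d", "e", "g")

# Middle of the Venn diagram: present in all three peaksets.
all_three <- Reduce(intersect, list(rep1, rep2, rep3))

# The three regions shared by exactly two peaksets.
only_12 <- setdiff(intersect(rep1, rep2), rep3)
only_13 <- setdiff(intersect(rep1, rep3), rep2)
only_23 <- setdiff(intersect(rep2, rep3), rep1)

# Sum of the four central regions of the Venn diagram.
central_sum <- length(all_three) + length(only_12) +
  length(only_13) + length(only_23)

# Peaks present in at least two peaksets (the minOverlap = 2 analogue).
membership <- table(c(rep1, rep2, rep3))
at_least_two <- sum(membership >= 2)

print(c(central_sum = central_sum, at_least_two = at_least_two))
```

Both counts agree, while the middle region alone corresponds to the stricter minOverlap=3 case.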

So if you only want the lowest number in each vector (a strict consensus of peaks that overlap in all three samples), you should set minOverlap=3 when you call dba.peakset(). If you are happy with peaks that overlap in at least two of the three replicates, you can use the default minOverlap=2.
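As a sketch of what that call might look like, the snippet below wraps it in a small helper. The helper name `add_strict_consensus` and its `min_overlap` argument are hypothetical; only dba.peakset(), its consensus and minOverlap arguments, and DBA_CONDITION come from DiffBind. It assumes `dba_obj` is a DBA object like `test` in the question, before any consensus peaksets have been added.

```r
# Hypothetical helper: adds one consensus peakset per condition, keeping only
# peaks found in at least `min_overlap` of that condition's replicates.
# Requires the DiffBind package to be installed when the function is called.
add_strict_consensus <- function(dba_obj, min_overlap = 3) {
  DiffBind::dba.peakset(dba_obj,
                        consensus = DiffBind::DBA_CONDITION,
                        minOverlap = min_overlap)
}

# Usage with the object from the question (not run here):
# test_strict <- add_strict_consensus(test)
# test_strict   # the consensus rows should now report the minOverlap = 3 counts
```

With three replicates per condition, the Intervals column for the added control and het rows would then be expected to match the last entry of the corresponding dba.overlap(...,mode=DBA_OLAP_RATE) vector.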

-Rory

 


Yes! This helped a lot.

 

Thanks again.

 
