Question: DiffBind dba.plotVenn - Venn doesn't match Intervals
t.severson (Netherlands) wrote, 5.0 years ago:

Hello all, when I use dba.plotVenn in DiffBind to draw Venn diagrams of my peaksets, the counts in the Intervals column don't match the numbers in the Venn diagram.

> tumors <- dba(sampleSheet='file.csv',minOverlap=F)
AFTER001_pre Breast ER pre  1 bed
AFTER001_post Breast ER post  1 bed

> tumors
17 Samples, 24627 sites in matrix:
              ID Tissue Factor Condition Replicate Peak.caller Intervals
1   AFTER001_pre Breast     ER       pre         1         bed      2071
8  AFTER001_post Breast     ER      post         1         bed      3381

But the Venn diagram shows 1204 AFTER001_pre-only peaks, 242 AFTER001_post-only peaks, and 839 overlapping peaks, and 1204 + 839 != 2071. Does anyone know what I'm doing wrong?

Thanks, Tesa

 


Sorry, I forgot session info.

> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] DiffBind_1.6.2       GenomicRanges_1.12.5 IRanges_1.18.3
[4] BiocGenerics_0.6.0   BiocInstaller_1.10.3

loaded via a namespace (and not attached):
 [1] amap_0.8-12        bitops_1.0-6       caTools_1.17.1     edgeR_3.2.4
 [5] gdata_2.13.3       gplots_2.14.2      gtools_3.4.1       KernSmooth_2.23-13
 [9] limma_3.16.8       RColorBrewer_1.0-5 stats4_3.1.1       tools_3.1.1
[13] zlibbioc_1.6.0

 

Answer: DiffBind dba.plotVenn - Venn doesn't match Intervals
Gord Brown (United Kingdom) wrote, 5.0 years ago:

Hi, Tesa,

You're most likely not doing anything wrong. To make the Venn diagram, DiffBind merges the peak sets of the samples, using a not-very-sophisticated algorithm that merges regions if they overlap at all. In your case, there are probably instances where two regions in one sample overlap the same region in the other. They'll all be merged into one big region, which is then counted once, so the numbers won't add up to the per-sample Intervals counts.
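To make the effect concrete, here is a minimal sketch in base R. It is not DiffBind's actual code: the coordinates are made up, and `merge_overlapping` and `hits` are hypothetical helpers written only for this illustration. It merges two toy peak sets whenever regions overlap at all, then counts merged regions the way a Venn diagram would.

```r
# Toy peak sets on a single chromosome; all coordinates are invented.
# Intervals are treated as closed, so touching ends count as overlapping.
pre  <- data.frame(start = c(100, 300, 500, 900, 2000),
                   end   = c(200, 400, 600, 1000, 2100))
post <- data.frame(start = c(150, 880, 3000),
                   end   = c(550, 950, 3100))

# Hypothetical helper: merge any intervals that overlap at all.
merge_overlapping <- function(df) {
  df <- df[order(df$start, df$end), ]
  out_start <- df$start[1]
  out_end   <- df$end[1]
  for (i in seq_len(nrow(df))[-1]) {
    k <- length(out_end)
    if (df$start[i] <= out_end[k]) {
      out_end[k] <- max(out_end[k], df$end[i])  # extend the current region
    } else {
      out_start <- c(out_start, df$start[i])    # open a new region
      out_end   <- c(out_end, df$end[i])
    }
  }
  data.frame(start = out_start, end = out_end)
}

# Hypothetical helper: TRUE for each merged region hit by at least one peak.
hits <- function(merged, peaks) {
  vapply(seq_len(nrow(merged)), function(i) {
    any(peaks$start <= merged$end[i] & peaks$end >= merged$start[i])
  }, logical(1))
}

merged  <- merge_overlapping(rbind(pre, post))
in_pre  <- hits(merged, pre)
in_post <- hits(merged, post)

venn <- c(pre_only  = sum(in_pre & !in_post),
          post_only = sum(!in_pre & in_post),
          both      = sum(in_pre & in_post))
print(venn)
print(nrow(pre))  # compare with venn[["pre_only"]] + venn[["both"]]
```

Here three "pre" peaks overlap one wide "post" peak, so all four collapse into a single merged region. The "pre" sample has 5 peaks, but its circle in the Venn counts only 1 + 2 = 3 regions. With real data you would more likely use GenomicRanges (for example `reduce()` to merge overlapping ranges) than a hand-written loop.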

Now and again we talk about more clever algorithms (perhaps employing PeakSplitter or something along those lines), but it's never made it to the top of our to-do list, alas. Alternatively, we could report two numbers in the overlapping region, one for the first sample and another for the second. But we haven't done that either... :(
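For anyone who wants the "two numbers" view in the meantime, a rough sketch in base R follows. It is not a DiffBind feature: the peak coordinates are invented and `overlaps_any` is a hypothetical helper. Instead of merging, it counts, per sample, how many of that sample's own peaks overlap any peak in the other sample.

```r
# Toy peak sets on a single chromosome; all coordinates are invented.
# Intervals are treated as closed, so touching ends count as overlapping.
pre  <- data.frame(start = c(100, 300, 500, 900, 2000),
                   end   = c(200, 400, 600, 1000, 2100))
post <- data.frame(start = c(150, 880, 3000),
                   end   = c(550, 950, 3100))

# Hypothetical helper: TRUE for each peak in `a` that overlaps
# at least one peak in `b`.
overlaps_any <- function(a, b) {
  vapply(seq_len(nrow(a)), function(i) {
    any(b$start <= a$end[i] & b$end >= a$start[i])
  }, logical(1))
}

pre_shared  <- sum(overlaps_any(pre, post))   # "pre" peaks in the overlap
post_shared <- sum(overlaps_any(post, pre))   # "post" peaks in the overlap

counts <- c(pre_only    = nrow(pre) - pre_shared,
            pre_shared  = pre_shared,
            post_shared = post_shared,
            post_only   = nrow(post) - post_shared)
print(counts)
```

Because each peak is counted in its own sample, `pre_only + pre_shared` always equals the number of "pre" peaks (and likewise for "post"), at the cost of the overlap no longer being a single number.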

Hope this helps... or at least explains what's happening.

Cheers,

 - Gord


Thanks for your reply, Gord. That makes sense. I've used the package quite a bit and had never seen this happen, so it was a bit concerning. Now I understand.

Cheers!

Tesa
