Entering edit mode
Last seen 3.5 years ago


When I try to calculate the overlap rate between peaksets, I get the following errors:

> names(broad_gtsf1$masks) # just list all 
 [1] "K562"        "H3K9Me3"     "75"          "313"         "WT"          "Mutant"      "counts"      "Consensus"   "Replicate.1" "Replicate.2" "Replicate.3" "Replicate.4" "All"        
[14] "None"       
> dba.overlap(broad_gtsf1, broad_gtsf1$masks$Mutant & broad_gtsf1$masks$WT, mode = DBA_OLAP_RATE)
Error in matrix(0, nrow(pv$peaks[[1]]), length(pv$peaks) + 3) : 
  non-numeric matrix extent
> dba.overlap(broad_gtsf1, broad_gtsf1$masks$`313` & broad_gtsf1$masks$WT, mode = DBA_OLAP_RATE)
Error: No peak overlap information available

The Venn diagrams are similar, only the second line produces the diagram:

> dba.plotVenn(broad_gtsf1, broad_gtsf1$masks$Mutant & broad_gtsf1$masks$WT)
Error in `[.data.frame`(peaks, onlyA, c(1:3, (3 + A))) : 
  undefined columns selected
> dba.plotVenn(broad_gtsf1, broad_gtsf1$masks$`313` & broad_gtsf1$masks$WT)
diffbind • 386 views
Entering edit mode
Rory Stark ★ 4.5k
Last seen 5 days ago
CRUK, Cambridge, UK

Two issues here.

In the first example (mask=broad_gtsf1$masks$Mutant & broad_gtsf1$masks$WT), there are no peaksets that are both Mutant and WT:

> sum(broad_gtsf1$masks$Mutant & broad_gtsf1$masks$WT)
[1] 0

You should use | (or) -- peaks that are either Mutant or WT:

> sum(broad_gtsf1$masks$Mutant | broad_gtsf1$masks$WT)
[1] 8

The second issue is that you have stumbled on a limitation in DiffBind: when you pass in a DBA object after a consensus set has been calculated and returned by dba.count(), and also specify a mask, the original peak calling information is lost.

In general, the dba.overlap() functionality works best using a peakset-based DBA object (after loading peaks but prior to calling dba.count()).

For the first example, there is a workaround: as all the peaksets are either Mutant or WT, you don't need to specify a mask at all -- you can get the overlap rate for everything without it.

There is workaround for the second example as well. As this overlap involves only two peaksets, you can get the overlaps directly:

olaps <- dba.overlap(broad_gtsf1,mask=broad_gtsf1$masks$`313` & broad_gtsf1$masks$WT)
length(olaps$inAll) # Number of peaks in both sets (intersection)
length(olaps$onlyA) + length(onlyB) + length(inAll) # Total number of peaks in both sets (union)

You can do this for up to four peaksets.

Cheers- Rory


Login before adding your answer.

Traffic: 332 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6