Search
Question: Curious behaviour of ChIPpeakAnno
0
3.6 years ago by

Hello everyone,

I am getting some surprising behaviour out of the ChIPpeakAnno R package.  The question I’m trying to ask is ‘Which features intersect between samples ‘WT312’, ‘WT340’, and ‘WT354’, that also intersect with the intersection between samples ‘WT313’, ‘WT341’, and ‘WT355’. To visualize this, I am using makeVennDiagram to create two three-way venn diagrams: one for samples ‘WT312’, ‘WT340’, and ‘WT354’, and one for samples ‘WT313’, ‘WT341’, and ‘WT355’.  Then I use findOverlappingPeaks to make an intersection of WT312, WT340, and WT354, which I call WT312_WT340_WT354.  Separately, I also use findOverlappingPeaks to make an intersection of WT313, WT341, and WT355, which I call WT313_WT341_WT355.  Then I take these two three-way intersected lists and pass those back through to makeVennDiagrams.  The issue is that the total number of elements contained within the three-way venn diagrams for WT312-WT340-WT354 and WT313-WT341-WT355 do not sum up to the total number of elements from the two-way venn diagram.  I’m attaching three PNG files to show what I’m talking about.  Specifically, the number of elements that are found in WT312-WT340-WT354 in the two-way venn diagram exceeds the number of elements found to intersect between WT312-WT340-WT354 in the three-way venn diagram.  The same basic story exists for WT313-WT341-WT355.

I can’t understand what would cause this behaviour, but it is a bit concerning for me.  The intersection between the sets WT312-WT340-WT354 and WT313_WT341_WT355 should exactly sum to the intersection of the set WT312-WT340-WT354 and the the intersection of set WT313-WT341-WT355.

I'm pasting image links so you can see what I'm referring to.

487+395 = 882, not 852

and

292+395 = 687, not 674

Thanks,

modified 3.6 years ago by Ou, Jianhong1.1k • written 3.6 years ago by Brad Davis20
2
3.6 years ago by
Ou, Jianhong1.1k
United States
Ou, Jianhong1.1k wrote:

Thank you for your message. This question is very typical one for calculating intersection of peaks. As you know a peak is not a point, it is a range of continuous points. If we consider intersection of set A (1-2, 4-5, 7-9) with set B (2-8), how many intersect peaks should we got? This is a question. If I say 1, but setA has three peaks. If I say 3 but setB has only one peak. In ChIPpeakAnno (release version), we are using the minimal number for intersect peaks. That is to say we are using 1 in this case. This is way you will get smaller number than it looks like.

I know this issue make people uncomfortable. I am thinking how to resolve this problem. I am thinking to build a new method to show the two numbers in the intersection part of venn diagram. I will let you know once it is done.

1
3.6 years ago by
Ou, Jianhong1.1k
United States
Ou, Jianhong1.1k wrote:

Please have a try the newest devel version at http://pgfe.umassmed.edu/ChIPpeakAnno/. Try makeVennDiagram with connectedPeaks="keepAll"

Hi Jianhong,

Thanks for pointing me to the latest development version of ChIPpeakAnno.  I am trying to use it but I have encountered what appears to be a bug.  While trying to use the function findOverlappingPeaks I get an error indicating that it cannot find the function .Deprecated.  I've pasted the snippet of code and error message below.

> WT312_WT340c= findOverlappingPeaks(WT312c, WT340c, select="all", connectedPeaks="keepAll")
Error in findOverlappingPeaks(WT312c, WT340c, select = "all", connectedPeaks = "keepAll") :
could not find function ".Deprected"

Do you have any suggestions on how to deal with this?

Thanks,

Edit: A partial solution to this seems to be to use findOverlapsOfPeaks instead of findOverlappingPeaks; however, it doesn't allow me to do what I was hoping to do.  I was using findOverlappingPeaks in a two step process to intersect three sets of intervals (and I did that twice on two separate sets of three sets of intervals).  Then I used the resulting two sets of intervals to make a two-way venn diagram.  Below is a snippet of code illustrating what I was doing:

#intersect normal peaks
WT312_WT340c= findOverlappingPeaks(WT312c, WT340c, select="all")
W
T312_WT340_WT354c = findOverlappingPeaks(WT312_WT340c$MergedPeaks, WT354c, select="all") #intersect disease peaks WT313_WT341c= findOverlappingPeaks(WT313c, WT341c, select="all") WT313_WT341_WT355c = findOverlappingPeaks(WT313_WT341c$MergedPeaks, WT355c, select="all")

#make two way venn diagram
makeVennDiagram(RangedDataList(WT312_WT340_WT354c$MergedPeaks, WT313_WT341_WT355c$MergedPeaks), NameOfPeaks=c("Normal", "Diseased"), fill=twocolors, col=twocolors, totalTest=100, main="allNormalIntersect intersected with allDiseasedIntersect\nthreshold=1.0", cex=3.0)

There doesn't seem to be any way to do this with findOverlapOfPeaks, because it doesn't return a list of peaks in a format that can be used with makeVennDiagrams, nor can I use the output of findOverlapsOfPeaks as input for a second round of findOverlapOfPeaks.

Do you see what I'm trying to accomplish?

1

Thanks for reporting the bug. It is fixed. You could download the new devel version from http://pgfe.umassmed.edu/ChIPpeakAnno/ now.

For the new version, you can try:

WT312_WT340_WT354c.ol <- findOverlapsOfPeaks(WT312c, WT340c, WT354c, maxgap=0)

WT313_WT341_WT355c.ol <- findOverlapsOfPeaks(WT313c, WT341c, WT355c, maxgap=0)

WT312_WT340_WT354c <- WT312_WT340_WT354c.ol$peaklist$WT312c///WT340c///WT354c

WT313_WT341_WT355c <- WT313_WT341_WT355c.ol$peaklist$WT313c///WT341c///WT355c

ol <- findOverlapsOfPeaks(WT312_WT340_WT354c, WT313_WT341_WT355c, maxgap=0, connectedPeaks="keepAll")

makeVennDiagram(ol, NameOfPeaks=c("Normal", "Diseased"), fill=twocolors, col=twocolors, main="allNormalIntersect intersected with allDiseasedIntersect\nthreshold=1.0", cex=3.0)

Let me known it work or not.

Hi Jianhong,

Thank you for your quick response. I have installed the latest version of ChIPpeakAnno and I am trying out the example code you have provided me.

What is the best way for me to make the two three-way venn diagrams so that they are comparable with this two-way venn diagram?  My gut reaction is to use

makeVennDiagram(RangedDataList(WT312c, WT340c, WT354c), NameOfPeaks=c("WT312", "WT340", "WT354"),fill=brewer.pal(n=3, "Set1"), col=brewer.pal(n=3, "Set1"),scaled=FALSE, totalTest=100, euler.d=F, main=sprintf("%s: enriched regions, E.T. = %2.1f", mark, threshold), cex=3.0, connectedPeaks="keepAll");

Would it be easier if I sent you the bed files I'm working with?

It will be great if you can send me the bed files you are working at. I really want some real data to test my code.

Another thing, you need to determine the total test number by yourself, for example some reasonable number of possible binding sites.

Hi Jianhong,

I've tried the latest version of the code and worked through your example, but the numbers still do not add up correctly.  The total number of peaks in the two way venn diagram (i.e. both for WT313_WT313_WT355 alone and the intersection between WT313_WT341_WT355 and WT312_WT340_WT354) does not equal the number of peaks identified in the intersection between WT313, WT341, and WT355 in the three way venn diagram.  The numbers are off by a few hundred.  For example, for the diseased samples (WT313, WT341. WT355), the number of features in the two-way venn diagram are 2149 and 1163 (1217.1245) for the diseased exclusive and the intersection, respectively), while the number of features for the three way venn diagram for the diseased samples equals 3502 (4030/3740/3927).  I can send you the bed files that I am working with if that would help.

could you show me the output of

ol$venn_cnt WT312_WT340_WT354c.ol$venn_cnt

WT313_WT341_WT355c.ol\$venn_cnt

0
3.6 years ago by

Hello Jianhong,

Cheers,

0
3.6 years ago by

Hi Jianhong,

It seems like this behaviour should be related to the 'select' parameter option in findOverlappingPeaks command?  If so, shouldn't there be a way to generate identical venn diagrams by setting the 'select' command to a defined argument, such as 'left'?

Also, would it be possible to report the percentage (as opposed to the raw number) for each of the elements in the venn diagram? That would make it easier to compare between comparisons.