Is it possible to change how DiffBind merges peaks?
@zhenfengliu1-21576
Last seen 4.3 years ago

The default behavior of DiffBind when merging peaks from different samples is that peaks with at least 1 bp overlap will be merged. For example, peak chr1:100-300 will be merged with peak chr1:299-500. Is it possible to change this behavior, for example, requiring at least 50% bp overlap between the two peaks, so the above example won't merge? If anyone knows how to do this in DiffBind or ways using other R packages that would still be compatible with other parts of DiffBind, that would be great.

Thank you.

DiffBind ChIPseq • 3.3k views
Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 13 days ago
Cambridge, UK

I'm adding another answer in response to the comments of AMA regarding the $intersectMode parameter not being properly passed in to summarizeOverlaps.

I can confirm that this is indeed a bug. Besides silently not allowing the default mode to be changed, the actual default being used has been "Union", not "IntersectionNotEmpty" as documented. In most cases, where the consensus peakset contains non-overlapping intervals, these are the same. However, if the $mergeOverlap configuration parameter was set to a value that leaves overlapping consensus peaks, the behavior may not have been as expected.

This bug has been fixed and checked in; it will appear in the next day or two as DiffBind_3.2.5.

That version also has added support for a new configuration parameter, $inter.feature, that, if present, will additionally be passed in to summarizeOverlaps. This is documented in the help page for dba.count().
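As a rough sketch of how these two parameters might be set before counting: `myDBA` below is a placeholder for your own DBA object, and the values simply mirror the summarizeOverlaps call reported in the comments on this thread. Passing the mode as the string "Union" is an assumption here; the help page for dba.count() is the place to confirm the accepted values.

```r
# `myDBA` is a placeholder for your own DBA object (DiffBind >= 3.2.5 assumed);
# nothing is changed if no such object exists in the session.
if (exists("myDBA")) {
  myDBA$config$intersectMode <- "Union"  # counting mode handed to summarizeOverlaps
  myDBA$config$inter.feature <- FALSE    # keep reads that overlap more than one peak
  # myDBA <- dba.count(myDBA)            # recount so the settings take effect
}
```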

Thanks a lot, Rory Stark, for your quick response. When will DiffBind_3.2.5 be available? Or is there a repo on GitHub I can use to update what I have?

Looks like it just went live on the Bioconductor site!

Is there a way to install this version on R 3.6.2?

I don't think so. Bioconductor releases have been dependent on R 4.x for well over a year, and there is no way to go back and change a release prior to that. If you have a build environment you could install from the tar.gz, but there are many dependencies that would also need to be compatible.

Rory Stark ★ 5.2k
@rory-stark-5741

We are looking at exposing this feature in DiffBind, but currently there is no way to override the 1bp overlap.

If you are able to separately derive a merged consensus set using alternative criteria, note that you can pass it in to dba.count() and that set will be used for all the subsequent steps (so long as the intervals you pass in are not themselves overlapping).
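A minimal base R sketch of deriving such a set, assuming the criterion from the question (merge only when the overlap covers at least 50% of the shorter peak). The helper `merge_by_fraction`, the threshold, and the rule of comparing each peak against the growing merged interval are all illustrative choices, not anything DiffBind provides. It handles one chromosome at a time and treats coordinates as inclusive.

```r
# Hypothetical helper (not part of DiffBind): merge peaks on one chromosome
# only if the overlap covers at least `min_frac` of the shorter interval.
merge_by_fraction <- function(start, end, min_frac = 0.5) {
  if (length(start) == 0) {
    return(data.frame(start = integer(0), end = integer(0)))
  }
  ord <- order(start, end)
  start <- start[ord]
  end <- end[ord]
  out_start <- start[1]
  out_end <- end[1]
  for (i in seq_along(start)[-1]) {
    k <- length(out_start)
    # bp shared with the current merged interval (zero or negative means no overlap)
    shared <- min(out_end[k], end[i]) - max(out_start[k], start[i]) + 1
    shorter <- min(out_end[k] - out_start[k], end[i] - start[i]) + 1
    if (shared >= min_frac * shorter) {
      out_end[k] <- max(out_end[k], end[i])   # extend the merged interval
    } else {
      out_start <- c(out_start, start[i])     # start a new consensus interval
      out_end <- c(out_end, end[i])
    }
  }
  data.frame(start = out_start, end = out_end)
}

# The two peaks from the question share only 2 bp, so they stay separate.
peaks <- merge_by_fraction(c(100, 299), c(300, 500))

# With GenomicRanges and DiffBind installed, the result could then be handed on
# (myDBA is a placeholder for your own DBA object):
#   consensus <- GenomicRanges::GRanges("chr1",
#                  IRanges::IRanges(peaks$start, peaks$end))
#   myDBA <- dba.count(myDBA, peaks = consensus)
```

Peaks kept separate under this rule can still overlap each other, which conflicts with the non-overlap requirement above, so a further step (for example, trimming both peaks at the midpoint of the shared region) would be needed before passing the set to dba.count().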

Hi Rory, did you solve this problem in the new DiffBind? I read the manual and didn't find anything, but maybe I'm not looking in the right place.

See separate answer.

Rory Stark ★ 5.2k
@rory-stark-5741

There will be a version of the feature in the next release (scheduled for May 20). It is available in the Development version starting from DiffBind_3_1_7.

A specific overlap value (in basepairs) can be specified by setting a configuration parameter:

DBA$config$mergeOverlap

The default is 1, meaning all peaks that overlap by at least 1 basepair will be merged. If you set it higher, for example to 100, peaks won't be merged unless they overlap by at least 100bp. Note that this means you can have separate consensus peaks that actually overlap, which may impact the counting, as by default any reads that overlap more than one consensus peak will not be counted. (You can control this with another configuration option, DBA$config$intersectMode).

Negative values can also be used to specify that peaks that do not overlap, but are within a "gap" of a set number of basepairs of each other, will be merged.

There isn't an option to specify the overlap amount using a percentage, just a constant.
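To make the positive and negative settings concrete, here is a small illustrative model in base R. It is not DiffBind's internal code: the function name `would_merge`, the inclusive coordinates, and the exact handling at the boundary (a gap of exactly the stated size counts as mergeable) are assumptions made for the example.

```r
# Illustrative model only, not DiffBind's implementation.
# Each peak is c(start, end) with inclusive coordinates.
would_merge <- function(a, b, mergeOverlap = 1) {
  # bp shared by the two peaks; zero or negative means they are separated,
  # and -shared is then the size of the gap between them
  shared <- min(a[2], b[2]) - max(a[1], b[1]) + 1
  if (mergeOverlap > 0) {
    shared >= mergeOverlap            # need at least this many shared bp
  } else {
    -shared <= abs(mergeOverlap)      # tolerate a gap of up to |mergeOverlap| bp
  }
}

would_merge(c(100, 300), c(299, 500))                      # default: 2 bp shared, merged
would_merge(c(100, 300), c(299, 500), mergeOverlap = 100)  # fewer than 100 bp shared, kept apart
would_merge(c(100, 300), c(351, 500), mergeOverlap = -50)  # 50 bp gap, merged

# On a real DBA object (placeholder name myDBA) the setting itself would be:
#   myDBA$config$mergeOverlap <- 100
```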

Hi Rory Stark

I installed the latest version of DiffBind, and I'm trying to test your solution. I found that I can use mergeOverlap to configure the merging step. However, I'm not sure which intersectMode would count the reads without discarding the ones that overlap multiple (unmerged) peaks. I figured the counting is based on the summarizeOverlaps function, but when I inspected the DiffBind source code, I couldn't find where intersectMode is used! I saw that it's assigned, but it isn't actually used in any count function, including summarizeOverlaps. It seems the default is enforced no matter what value you give intersectMode.

Could you please let me know what you think, and if it's still impossible to implement the idea using DiffBind?

Thank you

After inspecting the summarizeOverlaps function, it seems the settings that work in my case are: example <- summarizeOverlaps(gr, reads, mode="Union", inter.feature=FALSE)

However, I'm not sure if it's possible to use these settings with DiffBind

Good catch, this is a bug! I've fixed it and added support for $inter.feature as well. See my new answer below for more details.
