minOverlap in DiffBind in R- what does it mean?
Rory Stark ★ 4.1k
@rory-stark-5741
CRUK, Cambridge, UK
Hi-

The minOverlap parameter in dba.read provides a simple way to define the consensus peakset used for further analysis. When you have loaded a number of different peaksets, a single consensus peakset is defined, consisting of the (merged) peaks that overlap with at least X of the original peaksets. So if X=1, all the (merged) peaks are included in the consensus peakset. If X=2, only peaks that were identified in at least two of the samples are included. In the vignette example, where there are 11 samples, setting X=11 would analyze only peaks that were identified in all 11 samples.

You can see how many peaks would be in each possible consensus peakset defined using minOverlap, and how this number diminishes as the overlap criterion becomes more stringent (larger values of minOverlap), by calling dba.overlap with mode=DBA_OLAP_RATE. This returns a vector in which each element is the number of peaks that would be in the consensus peakset if minOverlap were set to that element's index (so the first element is the total number of merged peaks, as if minOverlap=1, the second element is the count as if minOverlap=2, etc.).

In general, setting minOverlap to a higher number results in a set of intervals that are more likely to represent genuine binding sites (as they were identified in more samples), but may result in some truly differentially bound sites being eliminated. A lower value of minOverlap may include more spurious sites (noise), but the nature of the differential analysis should prevent these from being identified as differentially bound with low FDR (although to a certain extent an FDR "penalty" is paid by including more sites in the multiple testing correction). The default of minOverlap=2 was chosen because it eliminates only sites that were identified in a single sample.
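To make the overlap-rate vector described above concrete, here is a minimal base-R sketch on made-up data. It does not use DiffBind and is not DiffBind's actual implementation: the three toy peaksets, the single-chromosome assumption, and every variable name are hypothetical, and the merging rule (peaks that overlap are fused into one interval) is a simplification.

```r
# Toy peaksets on one chromosome: one data frame of (start, end) per sample.
# All coordinates are invented for illustration.
peaksets <- list(
  s1 = data.frame(start = c(100, 500, 900), end = c(200, 600, 1000)),
  s2 = data.frame(start = c(150, 520),      end = c(250, 580)),
  s3 = data.frame(start = c(180, 1500),     end = c(220, 1600))
)

# Stack all peaks, remembering which sample each one came from.
all_peaks <- do.call(rbind, lapply(names(peaksets), function(s) {
  data.frame(sample = s, peaksets[[s]])
}))
all_peaks <- all_peaks[order(all_peaks$start), ]

# Merge overlapping peaks: a new merged interval starts whenever a peak
# begins after the running maximum end of all peaks seen so far.
running_end  <- cummax(all_peaks$end)
new_interval <- c(TRUE, all_peaks$start[-1] > head(running_end, -1))
all_peaks$merged_id <- cumsum(new_interval)

# For each merged interval, count the distinct samples that contributed a peak.
samples_per_interval <- tapply(all_peaks$sample, all_peaks$merged_id,
                               function(x) length(unique(x)))

# Element k is the size of the consensus peakset if minOverlap were k.
overlap_rate <- sapply(seq_along(peaksets),
                       function(k) sum(samples_per_interval >= k))
print(overlap_rate)  # 4 2 1
```

Here four merged intervals exist in total, two are supported by at least two samples, and one by all three, so raising the threshold from 1 to 3 shrinks the consensus set from 4 to 1 peaks. On a real DBA object, the corresponding vector is what dba.overlap with mode=DBA_OLAP_RATE reports.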
In the vignette, minOverlap is set to 3 only because this reduced the size of the resulting DBA data objects so that they could be included with the package and remain under the size limits imposed by Bioconductor.

Cheers-
Rory

________________________________
From: Theresa Stueve [theresas@usc.edu]
Sent: 15 February 2014 02:01
To: Rory Stark; Gordon Brown
Subject: minOverlap in DiffBind in R- what does it mean?

Greetings Drs. Stark and Brown,

My name is Theresa. I have just started using DiffBind and can't thank you enough for such a powerful and easy-to-use package. I have gone through your tamoxifen tutorial and have read through forums and threads on DiffBind online, but I still can't ferret out what "minOverlaps= X" does. Everyone online seems to be quite comfortable calling "minOverlaps" with different values from the default, so I apologize in advance if the answer is apparent and I missed it.

I really appreciate your time and this wonderful program.

--
(Theresa) Ryan Stueve
T32 Postdoctoral Fellow in Environmental Genomics
Department of Preventive Medicine
Ite Laird-Offringa Lab
NTT 6420