Hi,
I have been using DiffBind to perform differential binding analysis on ChIP-seq data, comparing two groups with 3 replicates for each. I am now facing an error when using its dba.count function.
I created a DBA object, Expt, using a sample sheet.
Expt <- dba(sampleSheet="Expt.csv")
After creating consensus peaksets by
Expt <- dba.peakset(Expt, consensus = -DBA_REPLICATE)
I used the dba.count function as follows.
Expt <- dba.count(Expt, peaks=Expt$masks$Consensus)
I then got an error message as follows.
Error in pv.counts(DBA, peaks = peaks, minOverlap = minOverlap,defaultScore = score,
Can’t count: some peaksets are not associated with a .bam file.
What is weird is that when I use the dba.count function on all the samples, without creating consensus peaksets, it works without any error message. In addition, this error did not appear when I used an older version of DiffBind. It started after I updated R to version 3.3.1 and installed the latest version of DiffBind.
Thank you so much.
A couple more thoughts on this:
First, I looked through the vignette and I see there is a reference to accomplishing this in exactly the way you originally tried:
I'll fix this reference in the next release.
The other issue is that limiting the consensus set to sites that are identified in at least two replicates of both conditions may be too restrictive, depending on what questions you are trying to answer. It would be reasonable to expect that you would exclude many potentially interesting differentially bound sites this way. If you want to include sites that are identified in at least two replicates of either sample group, set minOverlap=1 in the second line of code in my original answer.
Cheers-
Rory
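For readers following along, here is a minimal sketch of how the consensus workflow discussed above could look. It is not a copy of the original answer; it assumes the pattern described in the DiffBind vignette, where the consensus peaks are retrieved as a GRanges object and passed to dba.count on the original samples. The error presumably arises because the consensus peaksets added by dba.peakset have no .bam file of their own. The function name count_on_consensus and its arguments are made up for this example.

```r
# Sketch only: count reads on a consensus peakset without passing the
# consensus mask directly to dba.count(). `count_on_consensus`,
# `sample_sheet`, and `min_reps` are hypothetical names for illustration.
count_on_consensus <- function(sample_sheet, min_reps = 2) {
  # Original samples, each with its own peak file and .bam file
  expt <- DiffBind::dba(sampleSheet = sample_sheet)

  # Add one consensus peakset per sample group: peaks found in at least
  # `min_reps` replicates of that group
  expt_cons <- DiffBind::dba.peakset(expt,
                                     consensus = -DiffBind::DBA_REPLICATE,
                                     minOverlap = min_reps)

  # Keep only the consensus peaksets; minOverlap = 1 keeps sites present
  # in either group (use 2 to require both groups)
  expt_cons <- DiffBind::dba(expt_cons,
                             mask = expt_cons$masks$Consensus,
                             minOverlap = 1)

  # Retrieve the merged peaks as a GRanges object
  consensus_peaks <- DiffBind::dba.peakset(expt_cons, bRetrieve = TRUE)

  # Count on the original object, whose samples all have .bam files
  DiffBind::dba.count(expt, peaks = consensus_peaks)
}
```

Usage would be something like Expt <- count_on_consensus("Expt.csv"). Keeping the consensus peaksets in a separate object means the object passed to dba.count contains only samples with reads attached.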
Thank you very much for the detailed comments, which are very helpful. It works now. What I wanted to try was to have a consensus peak set that corresponds to the sites identified in at least two replicates of either sample group, as you recommended. So, I will set minOverlap=1 in that command line.
Best regards,
Shohei
Hello,
I have another question related to this. When I used the dba.count function as you suggested using an older version (1.12.2) of DiffBind, I got warning messages as follows.
In sampvec | pv.whichCalled(spare, samp, masternum): the length of the longer object is not a multiple of that of the shorter one. (This is translated from the message in the Japanese version of R, so it may be different in the English version...)
Despite those messages, the dba.count function apparently worked because I was able to perform downstream analyses such as dba.contrast and dba.analyze functions... In the latest version of DiffBind, such warning messages did not appear.
I wonder whether I can ignore those warning messages and, if not, what I should do to avoid them.
Thank you very much for your kind help.
Shohei
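For reference, the English wording of this warning is "longer object length is not a multiple of shorter object length". The small base-R example below is unrelated to DiffBind's internals; it only illustrates, under made-up inputs, when R raises this warning for the | operator: the shorter vector is recycled, and its length does not divide the longer one evenly.

```r
# Ask R for English messages so the warning text is not translated
Sys.setenv(LANGUAGE = "en")

a <- c(TRUE, FALSE, FALSE)   # length 3
b <- c(FALSE, TRUE)          # length 2, recycled to FALSE, TRUE, FALSE

# Capture the warning text instead of printing it
msg <- tryCatch(a | b, warning = function(w) conditionMessage(w))

# The element-wise result is still computed when the warning is suppressed
res <- suppressWarnings(a | b)

print(msg)
print(res)
```

The operation still returns a value, which is consistent with downstream steps running despite the warning. Whether that value is the intended one depends on the code that triggered it.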
Hi Shohei-
There was a bug in DiffBind 1.12 that has been fixed. It does not impact analyses, but could impact viewing which samples had peaks called (bCalled=TRUE and/or bCalledDetail=TRUE in dba.report()) or looking at some overlaps (i.e. dba.plotVenn() after counting).
So yes, you can ignore them. I do advise updating if you can, as 1.12 is at least three versions old and no longer really supported.
-Rory
Hi Rory,
Thank you very much for your prompt response. I am relieved to know that the analyses were not affected by the bug. I will use the latest version of DiffBind.
Shohei