Hello,
I'm using DiffBind to extract peaks that appear in my replicates. I have four replicates, and I want to retrieve the peaks that are present in at least 3 of them. It doesn't matter which 3 of the 4 (or all 4) replicates they appear in. How do I achieve this? I expect it has to do with the minOverlap parameter, but I can't figure out what changes I need.
The code I currently have:
dbObj <- dba(sampleSheet=samples)
dbObj2 <- dba.peakset(dbObj, consensus = DBA_FACTOR)
ExptConsensus <- dba(dbObj2, mask=dbObj2$masks$Consensus, minOverlap=1)
ConsensusPeaks <- dba.peakset(ExptConsensus, bRetrieve=TRUE)
Hi,
I'm simply trying to filter peaks based on how many replicates they appear in. I have 4 replicates with the same factors, same conditions, etc.
What confuses me is that if I set minOverlap=.75, I get exactly the same result as with minOverlap=1, which shouldn't be the case.
How should I proceed?
I see now.
This line in your original script creates a single consensus set:
This consensus set includes all merged peaks that overlap any sample (because minOverlap=1). When you reference this consensus peakset using mask=dbObj2$masks$Consensus, the mask refers to only a single consensus peakset. So it doesn't matter what you specify for minOverlap= in subsequent operations; you can only ever get this consensus peakset again.
If you want a consensus peakset including merged peaks that overlap at least 3 of the 4 samples, you can get it using:
You could use dba.peakset() to add the consensus peakset you want using:
But this more complicated way doesn't really gain you anything.
I see.
I tested both of the ways you suggest, but I get a different number of ranges (peaks) depending on which method I use. Why is that?
Vs
Sorry, the second snippet should read:
But there isn't really any point to doing it this way.
If you want to try different overlaps without re-reading the sample sheet, you can do:
etc.
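To make the effect of the overlap threshold concrete, here is a toy sketch in base R. It is not DiffBind's own code: the matrix `called`, its values, and the helper `peaks_in_at_least()` are made up for illustration, and reading a value between 0 and 1 as a proportion of the replicates is an assumption of this sketch.

```r
# Toy occupancy matrix: rows are merged peak intervals, columns are replicates.
# TRUE means the replicate has a called peak overlapping that interval.
# All values are invented for illustration.
called <- matrix(
  c(TRUE,  TRUE,  TRUE,  TRUE,   # peak1: called in 4 of 4 replicates
    TRUE,  TRUE,  TRUE,  FALSE,  # peak2: called in 3 of 4
    TRUE,  FALSE, TRUE,  FALSE,  # peak3: called in 2 of 4
    FALSE, FALSE, FALSE, TRUE),  # peak4: called in 1 of 4
  ncol = 4, byrow = TRUE,
  dimnames = list(paste0("peak", 1:4), paste0("rep", 1:4))
)

# Hypothetical helper: names of peaks called in at least `min_overlap` replicates.
# A value strictly between 0 and 1 is converted to a replicate count
# (assumption of this sketch).
peaks_in_at_least <- function(called, min_overlap) {
  if (min_overlap > 0 && min_overlap < 1) {
    min_overlap <- ceiling(min_overlap * ncol(called))
  }
  rownames(called)[rowSums(called) >= min_overlap]
}

kept_1    <- peaks_in_at_least(called, 1)     # every merged peak
kept_3    <- peaks_in_at_least(called, 3)     # peaks in 3 or 4 replicates
kept_frac <- peaks_in_at_least(called, 0.75)  # 75% of 4 replicates = 3
```

In this toy data a threshold of 1 keeps all four intervals, while 3 (or 0.75 of four replicates) keeps only the first two. That difference can only show up when the threshold is applied across the individual replicate peaksets, not to a single consensus peakset that has already been merged.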
Ah, okay. I see how it works now. Thank you so much for the help!