Diffbind Consensus vs Differential peak sets from same count
Entering edit mode
rbronste ▴ 60
Last seen 23 months ago

Trying to get consensus and differential set from the same dba.count and not sure the best way to parse the two. Here is my command string:

BN<- dba(sampleSheet = "Adult_AT_BN.csv", config=data.frame(AnalysisMethod="DBA_DESEQ2",th=.05),peakFormat="narrow")

BN_count<- dba.count(BN, score=DBA_SCORE_READS, fragmentSize=0)


M_BN_Veh<-masks$MALE & masks$VEHICLE

M_BN_vehEB_contrast<-dba.contrast(BN_count, group1 = M_BN_Veh, group2 = M_BN_EB)

M_BN_vehEB_analysis<-dba.analyze(M_BN_vehEB_contrast, method = DBA_DESEQ2)

M_BN_vehEB_report<-dba.report(M_BN_vehEB_analysis, method = DBA_DESEQ2, th=.01, fold=2)

So wondering how I can get differential peaks of .01 FDR larger than 5 and -5 fold change and then consensus peaks between 5 and -5 with the same FDR limits? Thanks.



diffbind chipseq • 507 views
Entering edit mode
Rory Stark ★ 4.1k
Last seen 15 days ago
CRUK, Cambridge, UK

I'm not entirely sure what you are trying to do, but if you are just wanting to see the differentially bound peaks with FDR values lower than a threshold and with absolute fold change greater than a value, you last line should do it (in that case, peaks with FDR < 0.01 and absolute fold change > 4, as the value for the fold parameter is taken to be a log2, so fold=2 returns fold changes > 4).

What I'm not clear on is what you mean by the difference between "differential peaks" and "consensus peaks between 5 and -5 with the same FDR limits". All the peaks returned by dba.report() are consensus peaks -- the only thing that labels them as "differential" is setting some threshold on the FDR.

Two notes worth considering:

First, you can work directly on the full consensus peakset by setting th=1 in the call to dba.report(), then subsetting it. For example:

> M_BN_vehEB_report<-dba.report(M_BN_vehEB_analysis,th=1)
> M_BN_vehEB_report[M_BN_vehEB_report$FDR <.01 & abs(M_BN_vehEB_report$Fold) > 5,]

Second, it has been shown that if you threshold by fold change, the FDR values computed for the whole set are no longer correct (see McCarthy, D.J. and Smyth, G.K., 2009. Testing significance relative to a fold-change threshold is a TREATBioinformatics25(6)). We're hoping to add this correction to DiffBind soon, but it is worth being aware. The good news is that FDR values calculated this way will generally be lower, so using the existing ones is "conservative." 



Login before adding your answer.

Traffic: 456 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6