Search
Question: Diffbind Consensus vs Differential peak sets from same count
0
13 months ago by
rbronste60
rbronste60 wrote:

Trying to get consensus and differential set from the same dba.count and not sure the best way to parse the two. Here is my command string:

BN<- dba(sampleSheet = "Adult_AT_BN.csv", config=data.frame(AnalysisMethod="DBA_DESEQ2",th=.05),peakFormat="narrow")

masks<-BN_count$masks M_BN_Veh<-masks$MALE & masks$VEHICLE M_BN_vehEB_contrast<-dba.contrast(BN_count, group1 = M_BN_Veh, group2 = M_BN_EB) M_BN_vehEB_analysis<-dba.analyze(M_BN_vehEB_contrast, method = DBA_DESEQ2) M_BN_vehEB_report<-dba.report(M_BN_vehEB_analysis, method = DBA_DESEQ2, th=.01, fold=2) So wondering how I can get differential peaks of .01 FDR larger than 5 and -5 fold change and then consensus peaks between 5 and -5 with the same FDR limits? Thanks. ADD COMMENTlink modified 13 months ago by Rory Stark2.5k • written 13 months ago by rbronste60 1 13 months ago by Rory Stark2.5k CRUK, Cambridge, UK Rory Stark2.5k wrote: I'm not entirely sure what you are trying to do, but if you are just wanting to see the differentially bound peaks with FDR values lower than a threshold and with absolute fold change greater than a value, you last line should do it (in that case, peaks with FDR < 0.01 and absolute fold change > 4, as the value for the fold parameter is taken to be a log2, so fold=2 returns fold changes > 4). What I'm not clear on is what you mean by the difference between "differential peaks" and "consensus peaks between 5 and -5 with the same FDR limits". All the peaks returned by dba.report() are consensus peaks -- the only thing that labels them as "differential" is setting some threshold on the FDR. Two notes worth considering: First, you can work directly on the full consensus peakset by setting th=1 in the call to dba.report(), then subsetting it. For example: > M_BN_vehEB_report<-dba.report(M_BN_vehEB_analysis,th=1) > M_BN_vehEB_report[M_BN_vehEB_report$FDR <.01 & abs(M_BN_vehEB_report\$Fold) > 5,]

Second, it has been shown that if you threshold by fold change, the FDR values computed for the whole set are no longer correct (see McCarthy, D.J. and Smyth, G.K., 2009. Testing significance relative to a fold-change threshold is a TREATBioinformatics25(6)). We're hoping to add this correction to DiffBind soon, but it is worth being aware. The good news is that FDR values calculated this way will generally be lower, so using the existing ones is "conservative."

-Rory