DiffBind report and MA-plot DB difference
2
0
Entering edit mode
@hlszlaszlo-14791
Last seen 6.3 years ago

Hi all,

I using DiffBind to compare WT vs Treatment ATAC-seq samples and I have two questions/problems regarding the DiffBind dba.report() and dba.plotMA() functions.

1.: dba.plotMA() indicates more differential sites than dba.report() with the same thresholds. 

The input csv:

SampleID,Tissue,Factor,Condition,Replicate,bamReads,bamControl,Peaks
WT_1,hs,ATAC,WT,1,WT_1.bam,,WT_1.bed
WT_2,hs,ATAC,WT,2,WT_2.bam,,WT_2.bed
Treat_1,hs,ATAC,Treat,1,Treat_1.bam,,Treat_1.bed
Treat_2,hs,ATAC,Treat,2,Treat_2.bam,,Treat_2.bed

Code:

## Load DiffBind sample design.
ATAC.dba = dba(sampleSheet = "csv/samples.csv", peakFormat = "bed")

## Count reads in binding site intervals. 4 Samples, 53778 sites in matrix:
ATAC.dba.count = dba.count(ATAC.dba, bLog = T, score = DBA_SCORE_TMM_READS_FULL, filter = 10)

## Set up contrasts for differential binding affinity analysis.
ATAC.dba.count = dba.contrast(ATAC.dba.count, categories = DBA_CONDITION, minMembers = 2)

## Perform differential binding affinity analysis.  Contrast:
##  Group1     Members1   Group2    Members2     DB.DESeq2
##     WT               2             Treat              2                   2805
ATAC.dba.count = dba.analyze(ATAC.dba.count, bFullLibrarySize = TRUE)

## The MA-plot title: ' 1277 FDR < 0.050'

dba.plotMA(ATAC.dba.count, th = 0.05, bUsePval = FALSE, fold = 2, bFlip = TRUE)

## dba.report gives 1274 differential sites however, with the same parameters.

ATAC.dba.count.DB = data.frame(dba.report(ATAC.dba.count, th = 0.05, bUsePval = FALSE, fold = 2, bFlip = TRUE))


## With the approach below, I can reproduce the MA-plot DB count.

ATAC.dba.count.MA = data.frame(dba.report(ATAC.dba.count, th = 1, bCounts = FALSE, bFlip = TRUE))

ATAC.dba.count.MA = mutate(ATAC.dba.count.MA, sig = ifelse(abs(Fold) >= 2 & FDR <= 0.05, "sig", "ns"))
table(ATAC.dba.count.MA$sig) ##    ns (52501)  sig (1277)

## With the code below I could find the three missing DB site, and they are true DB sites after manual inspections.

tmpdf = merge(ATAC.dba.count.MA, ATAC.dba.count.DB, by = c("seqnames", "start", "end"), all = TRUE)
subset(tmpdf, sig == "sig" & is.na(width.y) == T)

 

2.: dba.report() seems to report not normalised values for a sample.  This is also present in the tamoxifen dataset. When I try to report the DB sites MCF71 sample have the same values even is bNormalized is turned on. Am I missing something?

data("tamoxifen_analysis")
dba.report(tamoxifen, th = 0.05, bUsePval = FALSE, fold = 2, bCounts = TRUE, bNormalized = FALSE)
dba.report(tamoxifen, th = 0.05, bUsePval = FALSE, fold = 2, bCounts = TRUE, bNormalized = TRUE)

Thank you for the help!

- Laszlo

diffbind differential binding analysis R report • 1.4k views
ADD COMMENT
2
Entering edit mode
Rory Stark ★ 5.1k
@rory-stark-5741
Last seen 8 days ago
Cambridge, UK

In the first instance, where there is a discrepancy between the number of differentially bound sites in the report vs the MA plot, this is indeed a bug. It relates to different levels of precision used when thresholding in the report vs. the plot(s). In this case, the report is correct and the plot is wrong. I am checking in a fix now, that will appear as DiffBind_2_6_6.

In the second case, where one sample appears to be non-normalized, this sample has a normalization factor of 1.0. The default normalization (bFullLibrarySize) is to normalize to the smallest library, so the counts for that library remain unchanged.

-Rory

ADD COMMENT
0
Entering edit mode

Thank you! 

ADD REPLY
0
Entering edit mode
Rory Stark ★ 5.1k
@rory-stark-5741
Last seen 8 days ago
Cambridge, UK

What version of DiffBind are you using? (output of sessionInfo()). Using the current release, I tried your second issue (same count values with either setting of  bNormalized) and can not reproduce it -- I get two different sets of counts, with the non-normalized counts being integers.

If you are using a recent version and you are still getting the discrepancy with dba.plotMA, the best thing is to send me your DBA object ATAC.dba.count so I can track it down.

Cheers-

Rory

rory.stark [at] cruk.cam.ac.uk

ADD COMMENT
0
Entering edit mode

Thank you for the response.

I am using R version 3.4.3 (2017-11-30) and DiffBind_2.6.5.

I also using tidyverse_1.2.1, which generates some namespace conflicts, but even if I don't load tidyverse I get the same values (norm/notnorm).

I send you the ATAC.dba.count DBA object and R code to your email address.

Thanks,

Laszlo

 

ADD REPLY

Login before adding your answer.

Traffic: 637 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6