Question: DiffBind report and MA-plot DB difference
0
gravatar for hlsz.laszlo
22 months ago by
hlsz.laszlo0 wrote:

Hi all,

I using DiffBind to compare WT vs Treatment ATAC-seq samples and I have two questions/problems regarding the DiffBind dba.report() and dba.plotMA() functions.

1.: dba.plotMA() indicates more differential sites than dba.report() with the same thresholds. 

The input csv:

SampleID,Tissue,Factor,Condition,Replicate,bamReads,bamControl,Peaks
WT_1,hs,ATAC,WT,1,WT_1.bam,,WT_1.bed
WT_2,hs,ATAC,WT,2,WT_2.bam,,WT_2.bed
Treat_1,hs,ATAC,Treat,1,Treat_1.bam,,Treat_1.bed
Treat_2,hs,ATAC,Treat,2,Treat_2.bam,,Treat_2.bed

Code:

## Load DiffBind sample design.
ATAC.dba = dba(sampleSheet = "csv/samples.csv", peakFormat = "bed")

## Count reads in binding site intervals. 4 Samples, 53778 sites in matrix:
ATAC.dba.count = dba.count(ATAC.dba, bLog = T, score = DBA_SCORE_TMM_READS_FULL, filter = 10)

## Set up contrasts for differential binding affinity analysis.
ATAC.dba.count = dba.contrast(ATAC.dba.count, categories = DBA_CONDITION, minMembers = 2)

## Perform differential binding affinity analysis.  Contrast:
##  Group1     Members1   Group2    Members2     DB.DESeq2
##     WT               2             Treat              2                   2805
ATAC.dba.count = dba.analyze(ATAC.dba.count, bFullLibrarySize = TRUE)

## The MA-plot title: ' 1277 FDR < 0.050'

dba.plotMA(ATAC.dba.count, th = 0.05, bUsePval = FALSE, fold = 2, bFlip = TRUE)

## dba.report gives 1274 differential sites however, with the same parameters.

ATAC.dba.count.DB = data.frame(dba.report(ATAC.dba.count, th = 0.05, bUsePval = FALSE, fold = 2, bFlip = TRUE))


## With the approach below, I can reproduce the MA-plot DB count.

ATAC.dba.count.MA = data.frame(dba.report(ATAC.dba.count, th = 1, bCounts = FALSE, bFlip = TRUE))

ATAC.dba.count.MA = mutate(ATAC.dba.count.MA, sig = ifelse(abs(Fold) >= 2 & FDR <= 0.05, "sig", "ns"))
table(ATAC.dba.count.MA$sig) ##    ns (52501)  sig (1277)

## With the code below I could find the three missing DB site, and they are true DB sites after manual inspections.

tmpdf = merge(ATAC.dba.count.MA, ATAC.dba.count.DB, by = c("seqnames", "start", "end"), all = TRUE)
subset(tmpdf, sig == "sig" & is.na(width.y) == T)

 

2.: dba.report() seems to report not normalised values for a sample.  This is also present in the tamoxifen dataset. When I try to report the DB sites MCF71 sample have the same values even is bNormalized is turned on. Am I missing something?

data("tamoxifen_analysis")
dba.report(tamoxifen, th = 0.05, bUsePval = FALSE, fold = 2, bCounts = TRUE, bNormalized = FALSE)
dba.report(tamoxifen, th = 0.05, bUsePval = FALSE, fold = 2, bCounts = TRUE, bNormalized = TRUE)

Thank you for the help!

- Laszlo

ADD COMMENTlink modified 22 months ago by Rory Stark2.9k • written 22 months ago by hlsz.laszlo0
Answer: DiffBind report and MA-plot DB difference
2
gravatar for Rory Stark
22 months ago by
Rory Stark2.9k
CRUK, Cambridge, UK
Rory Stark2.9k wrote:

In the first instance, where there is a discrepancy between the number of differentially bound sites in the report vs the MA plot, this is indeed a bug. It relates to different levels of precision used when thresholding in the report vs. the plot(s). In this case, the report is correct and the plot is wrong. I am checking in a fix now, that will appear as DiffBind_2_6_6.

In the second case, where one sample appears to be non-normalized, this sample has a normalization factor of 1.0. The default normalization (bFullLibrarySize) is to normalize to the smallest library, so the counts for that library remain unchanged.

-Rory

ADD COMMENTlink written 22 months ago by Rory Stark2.9k

Thank you! 

ADD REPLYlink written 22 months ago by hlsz.laszlo0
Answer: DiffBind report and MA-plot DB difference
0
gravatar for Rory Stark
22 months ago by
Rory Stark2.9k
CRUK, Cambridge, UK
Rory Stark2.9k wrote:

What version of DiffBind are you using? (output of sessionInfo()). Using the current release, I tried your second issue (same count values with either setting of  bNormalized) and can not reproduce it -- I get two different sets of counts, with the non-normalized counts being integers.

If you are using a recent version and you are still getting the discrepancy with dba.plotMA, the best thing is to send me your DBA object ATAC.dba.count so I can track it down.

Cheers-

Rory

rory.stark [at] cruk.cam.ac.uk

ADD COMMENTlink written 22 months ago by Rory Stark2.9k

Thank you for the response.

I am using R version 3.4.3 (2017-11-30) and DiffBind_2.6.5.

I also using tidyverse_1.2.1, which generates some namespace conflicts, but even if I don't load tidyverse I get the same values (norm/notnorm).

I send you the ATAC.dba.count DBA object and R code to your email address.

Thanks,

Laszlo

 

ADD REPLYlink written 22 months ago by hlsz.laszlo0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 307 users visited in the last hour