Question

DiffBind - Retrieving the differentially bound sites - HOW to calculate fold-change

lin.pei26 • 0

@868cea86

United States

Hi!

I use the function dba.report to retrieve differentially bound sites (th = 1). The reported fold changes tend to be very small, and I do not know how they are computed.

For example, at one site the mean for control is 1.6973 while the mean for treatment is 4.231, and the Fold is -0.001057009, p-value is 0.0051515283, FDR = 0.99.

How were the fold changes computed? My code is below.

Thank you!

cond = "treatment"
obj = dba(sampleSheet=data0)
obj = dba.count(obj,minOverlap=2)
obj = dba.normalize(obj,normalize=DBA_NORM_RLE,library=DBA_LIBSIZE_PEAKREADS)
obj = dba.contrast(obj,reorderMeta=list(Condition=cond),minMembers=2)
obj = dba.analyze(obj)
dbp = dba.report(obj,th=1)

ChIPSeq DiffBind • 1.0k views

ADD COMMENT • link updated 11 months ago by Rory Stark ★ 5.2k • written 11 months ago by lin.pei26 • 0

I used DiffBind v3.10.0.

According to https://www.biostars.org/p/9467075/

NB: Since version 3.0, DiffBind has changed how the Fold values are reported. In the default case where a design formula is used, the Fold values included in the report are those calculated by the underlying differential analysis package (DESeq2 or edgeR). These may include shrinkage adjustments, and no longer correspond to a simple subtraction of log concentrations.
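To make the quoted note concrete, here is a minimal sketch in base R using only the three numbers from the question. It assumes the two "means" are either log2 concentrations (as in the Conc columns of a dba.report result) or raw mean counts; which one applies is not stated in the post, so both readings are computed. The variable names are illustrative only.

```r
# Values quoted in the question for a single site.
conc_control   <- 1.6973
conc_treatment <- 4.231
reported_fold  <- -0.001057009   # Fold column from dba.report()

# Reading 1 (assumption): the means are log2 concentrations,
# so a "simple subtraction" fold is just their difference.
naive_diff <- conc_treatment - conc_control

# Reading 2 (assumption): the means are raw normalized counts,
# so the unshrunken fold is the log2 of their ratio.
naive_ratio <- log2(conc_treatment / conc_control)

# Under either reading the naive value is far larger in magnitude
# than the reported Fold, consistent with strong shrinkage.
print(c(naive_diff = naive_diff,
        naive_ratio = naive_ratio,
        reported = reported_fold))
```

The sign of the naive value depends on which group the contrast treats as the baseline, so only the magnitudes are comparable here.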

ADD REPLY • link 11 months ago lin.pei26 • 0

score 1 · Answer 1 · 2023-05-25

Rory Stark ★ 5.2k

@rory-stark-5741

Cambridge, UK

As you note, the fold change calculations are being performed by DESeq2, so you may want to look at the documentation for the DESeq2::lfcShrink function (and the References it lists) to see how it works.

My guess is that you are getting very low shrunken LFC values, and very high FDR values, because you have only two replicates, which may not agree very well. This leaves you with an under-powered experiment. You may want to check the read counts (by calling dba.report() with bCounts=TRUE) to see how consistent they are within sample groups.
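One way to act on the replicate check suggested above is to compute a per-site coefficient of variation within each group once the counts are in a matrix. The sketch below is base R only; within_group_cv is a hypothetical helper, not part of DiffBind, and the toy counts are invented for illustration rather than taken from the poster's data. It assumes the count columns of the bCounts=TRUE report have been extracted into a sites-by-samples numeric matrix.

```r
# Hypothetical helper (not a DiffBind function): per-site coefficient of
# variation (sd / mean) of the counts within each sample group.
# `counts` is a numeric matrix (sites x samples); `groups` labels each column.
within_group_cv <- function(counts, groups) {
  stopifnot(ncol(counts) == length(groups))
  vapply(split(seq_along(groups), groups), function(idx) {
    sub <- counts[, idx, drop = FALSE]
    apply(sub, 1, sd) / rowMeans(sub)
  }, FUN.VALUE = numeric(nrow(counts)))
}

# Toy counts, invented purely for illustration.
toy <- matrix(c(10, 12, 40, 44,
                 5, 50, 20, 22),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("site1", "site2"),
                              c("ctl_1", "ctl_2", "trt_1", "trt_2")))

cv <- within_group_cv(toy, c("control", "control", "treatment", "treatment"))
print(round(cv, 2))
```

In this toy matrix the two control replicates disagree strongly at site2, which is the kind of within-group inconsistency that would inflate dispersion estimates and push shrunken fold changes toward zero.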

I'm also wondering if you had a good reason to choose to normalize with library=DBA_LIBSIZE_PEAKREADS (reads in peaks, RiP), as this tends to over-normalize, which also lessens apparent fold changes.

ADD COMMENT • link 11 months ago Rory Stark ★ 5.2k

Thank you Rory!

(1) Yes, we have very high FDR values

(2) Using library=DBA_LIBSIZE_FULL, we found all differential peaks are higher in our control group, no matter what normalization was applied. Using library=DBA_LIBSIZE_PEAKREADS, the pattern makes more sense to us.

ADD REPLY • link 11 months ago lin.pei26 • 0

Are you confident that in the non-control condition there is not a systematic loss of binding levels reflecting a true biological signal? If there is such a loss, PEAKREADS normalization can mask it. You may also want to try BACKGROUND normalization, which is usually, but not always, similar to FULL library normalization.
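The masking effect described above can be illustrated with a toy calculation in base R. This is a deliberately simplified sketch: it uses plain library-size scaling rather than the RLE method in the original code, and all counts and depths are invented. It assumes the treatment loses half its binding at every site while total sequencing depth stays equal.

```r
# Toy counts, invented for illustration: treatment loses half its
# binding at every site (a systematic, genuine loss).
control   <- c(site1 = 100, site2 = 200, site3 = 300)
treatment <- control / 2

# Hypothetical full library sizes: equal, since most reads lie outside peaks.
full_depth <- c(control = 1e6, treatment = 1e6)
# Reads in peaks differ precisely because binding differs.
peak_reads <- c(control = sum(control), treatment = sum(treatment))

# Simple library-size scaling followed by a per-site log2 fold change.
log2_fold <- function(lib) {
  scale <- lib / mean(lib)
  log2((treatment / scale[["treatment"]]) / (control / scale[["control"]]))
}

fold_full <- log2_fold(full_depth)   # about -1 at every site: loss is visible
fold_peak <- log2_fold(peak_reads)   # about 0 at every site: loss is normalized away

print(rbind(full = fold_full, peak_reads = fold_peak))
```

Scaling by reads in peaks treats the global drop in binding as a depth difference and removes it, whereas scaling by full library size keeps it as a fold change of about -1.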

ADD REPLY • link 11 months ago Rory Stark ★ 5.2k