Question

Is there a way to run Diffbind without the normalization ?

1

Entering edit mode

Thomas ▴ 10

@thomas-16893

Last seen 5.7 years ago

Hello,

I was wondering if there was an option to run the differential analysis without the normalisation step ?

Apparently there is an option like that for some plots but I can't happen to find anything for the differential analysis.

Thank you for your time

Thomas

diffbind differential binding analysis • 1.4k views

ADD COMMENT • link updated 3.5 years ago by Rory Stark ★ 5.1k • written 5.7 years ago by Thomas ▴ 10

score 0 · Answer 1 · 2018-10-03

0

Entering edit mode

Rory Stark ★ 5.1k

@rory-stark-5741

Last seen 8 days ago

Cambridge, UK

I am currently working on giving users much more control over normalization in DiffBind. In the meantime, there is a trick you can do you avoid normalization when using DESeq2 as the analysis method.

After running dba.count(), you can change the individual library sizes to all be the same. Then if you run dba.analyze() with bFullLibrarySize=TRUE (default) the normalization facts will all be equal to 1. You probably want to run dba.analyze()with bSubControl=FALSE to avoid having the control reds subtracted if do not want normalization of any kind.

You can set the library size values as follows:

> myDBA <- dba.count(myDBA, ...)
> myDBA$class[8,] <- 1e+07
> myDBA <- dba.analyze(myDBA, method=DBA_DESEQ2, bFullLibrarySize=TRUE, bSubControl=FALSE)

Also, there is a "backdoor" in DiffBind to allow you to provide your own normalized read counts -- if that is of interest, I can show the code required to use it.

ADD COMMENT • link 5.5 years ago Rory Stark ★ 5.1k

0

Entering edit mode

In my opinion, for visualization purposes it's accurate enough to convert the raw reads (bam files) to tdf format and load them in IGV with the option "normalize coverage data" enabled, this will rescale the raw counts to 1000000 / total counts.thus correcting for different library sizes. This assuming you are using/willing to use IGV, other browsers surely have similar options.

DiffBind, or rather the underlying edgeR and DESeq packages, don't make use of the read profiles. All they use are the raw counts of reads in given intervals (ChipSeq peaks, genes, whatever), in fact the read profile within an interval doesn't enter in the analysis at all.

If you want to show the log fold change you could prepare a bedgraph file where each line has as coordinates the region tested, or maybe the midpoint of the region tested, whichever looks better, and as value (4th column) the log fold change. Then load this file in IGV with or without the raw reads file as above.

ADD REPLY • link 5.5 years ago cchristopherdeveauret • 0

score 0 · Answer 2 · 2020-11-06

0

Entering edit mode

Rory Stark ★ 5.1k

@rory-stark-5741

Last seen 8 days ago

Cambridge, UK

The recently released version of DiffBind includes a new interface function, dba.normalize(), that allows library sizes and normalization factors (or offsets) to be set and retrieved directly. The updated vignette includes a detailed exploration of the impact of different normalization schemes.

ADD COMMENT • link 3.5 years ago Rory Stark ★ 5.1k