
Hi,

I'm trying to understand the normalization for DESeq2 analysis within DiffBind. If I run:

`Pool1 = dba.analyze(Pool1, method=DBA_DESEQ2, bFullLibrarySize=FALSE, bCorPlot=FALSE);`

`Pool1.DB = dba.report(Pool1, file="test", method=DBA_DESEQ2, th=1, bCounts=TRUE);`

Then the normalized counts contained within elementMetadata(Pool1.DB) are calculated by taking the raw counts for each peak and dividing by the normalization factor s_j, computed via the median of ratios method described in (http://genomebiology.com/2010/11/10/R106). Is this correct? When I test this, I get something close: colMeans(originalcounts/outputfromPool1.DB) is highly correlated with s_j but not exactly the same. (Note that this may be because I'm using a blocking factor in my model?)
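To make the check I'm describing concrete, here is a minimal base-R sketch of the median of ratios calculation as I understand it from the paper. The count matrix is made up, and `median_of_ratios` is my own hypothetical helper, not a DiffBind or DESeq2 function, so this only illustrates what I expect and not what the package does internally.

```r
# Toy raw count matrix: rows are peaks, columns are samples (made-up values).
counts <- matrix(c( 10,  20,  30,
                   100, 210, 290,
                    50,  95, 160,
                     5,  10,  15),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("peak", 1:4), c("s1", "s2", "s3")))

# Hypothetical helper: median of ratios size factors.
# s_j = median over peaks i of (k_ij / geometric mean of peak i across samples)
median_of_ratios <- function(k) {
  log_geo_mean <- rowMeans(log(k))    # log geometric mean per peak
  usable <- is.finite(log_geo_mean)   # skip peaks with a zero count in any sample
  apply(k, 2, function(col) {
    exp(median(log(col[usable]) - log_geo_mean[usable]))
  })
}

s <- median_of_ratios(counts)

# Normalize by dividing each sample (column) by its size factor.
normalized <- sweep(counts, 2, s, "/")

# If normalization is a pure per-sample division, raw / normalized is
# constant within each column and equals s_j.
recovered <- colMeans(counts / normalized)
```

If the reported counts really were raw counts divided by a single per-sample factor, `counts / normalized` would be the same value in every row of a column, so any spread across peaks in the real output would suggest something more than a plain division by s_j.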

If I run instead,

`Pool1 = dba.analyze(Pool1, method=DBA_DESEQ2, bFullLibrarySize=TRUE, bCorPlot=FALSE);`

`Pool1.DB = dba.report(Pool1, file="test", method=DBA_DESEQ2, th=1, bCounts=TRUE);`

Then, the normalized counts contained within elementMetadata(Pool1.DB) are calculated by taking the raw counts for each peak divided by librarysize/min(librarysize). Is this correct? I can test this as well, and again I get something close: colMeans(originalcounts/outputfromPool1.DB) is highly correlated to librarysize/min(librarysize) but not exactly the same.
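Again as a sketch of what I'm assuming rather than a statement of what DiffBind does, the library size scaling would look like this in base R. The counts and the `lib_size` values are invented for illustration.

```r
# Toy raw count matrix: rows are peaks, columns are samples (made-up values).
counts <- matrix(c( 10,  20,  30,
                   100, 210, 290),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("peak1", "peak2"), c("s1", "s2", "s3")))

# Hypothetical total read counts per sample (assumed values).
lib_size <- c(s1 = 1e6, s2 = 2e6, s3 = 4e6)

# Scale factor relative to the smallest library.
scale_factor <- lib_size / min(lib_size)

# Divide each sample (column) by its scale factor.
normalized <- sweep(counts, 2, scale_factor, "/")
```

With these numbers the smallest library keeps its raw counts, and the other samples are scaled down in proportion to their depth.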

So, by setting bFullLibrarySize=TRUE (the default), am I using only the library size as the normalization factor, with no other normalization applied? As I understand it, this can be biased by very highly "expressed" peaks, which is why the DESeq2 authors proposed the median of ratios method. And if I set bFullLibrarySize=FALSE, do I use the median of ratios method as my normalization factor instead of the library size?

That was a lot of questions, but thanks for helping me figure this out, and also thanks for making such a useful and well-supported package!

Jason

**3.0k**• written 4.5 years ago by JasonLouisStein •
