Hi, using DESeq2 and edgeR in DiffBind gives a huge difference in the total number of differentially bound sites: around 8000 for DESeq2 and around 700 for edgeR. From reading the vignette, the numbers should be fairly similar. Also, do we need to use "DBA_SCORE_TMM_READS_FULL" in dba.count if I am setting "bFullLibrarySize=TRUE" in dba.analyze? Currently I am using "DBA_SCORE_READS" in dba.count with "bFullLibrarySize=TRUE" in dba.analyze.
Thank you for your reply. In the case of edgeR, using bFullLibrarySize = TRUE/FALSE doesn't make much difference and the MA plot is almost the same. With DESeq2, however, when bFullLibrarySize = TRUE the number of sites with a negative fold change increases sharply, hence the total of around 8000; if changed to FALSE, the number is drastically reduced to around 200.
This is consistent with an experiment that induces a large change in binding, all in one direction. The TMM normalization used by edgeR assumes a core of relatively unchanged binding and will over-normalize. In this case, you should use DESeq2 and bFullLibrarySize = TRUE. This is actually the reason we changed this to be the default. In future I hope to make the normalization more transparent and better separated from the analysis method.
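To make the over-normalization point concrete, here is a toy numeric sketch in plain Python. It is not DiffBind, edgeR, or TMM code: it uses simple total-count scaling as a stand-in, and the counts, sequencing depths, and the helper name log2_fold_changes are all made up for illustration. Every site loses half its binding in the treated sample while both samples are sequenced to the same depth.

```python
import math

def log2_fold_changes(control, treated, control_depth, treated_depth):
    """Scale treated counts to the control depth, then return per-site log2(treated / control)."""
    scale = control_depth / treated_depth
    return [math.log2((t * scale) / c) for c, t in zip(control, treated)]

# Hypothetical read counts at four sites; binding is halved everywhere in the treated sample.
control = [100, 200, 300, 400]
treated = [50, 100, 150, 200]

# Depth taken from the full libraries (assumed equal here): the loss is preserved.
full_library = log2_fold_changes(control, treated, 1_000_000, 1_000_000)

# Depth taken from reads in peaks only: the global loss is scaled away.
reads_in_peaks = log2_fold_changes(control, treated, sum(control), sum(treated))

print(full_library)    # every site is -1.0
print(reads_in_peaks)  # every site is 0.0
```

When the depth is derived from the peak counts themselves, a genuine one-directional change looks like a difference in sequencing depth and is removed. Scaling by the full library size keeps it.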
What I see in ?dba.analyze is
So can I take that to mean: if TRUE, the read counts from the BAM files are used and the resulting libSize values are passed to sizeFactors(), so DESeq2 will not call estimateSizeFactors(). If FALSE, the read counts in the peak set are used, and DESeq2's estimateSizeFactors() will estimate the size factors from these per-sample peak counts?
Correct, the factors are only estimated by DESeq2 if bFullLibrarySize=FALSE; otherwise they are based directly on the libSizes.
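The two ways of obtaining size factors can be sketched in plain Python as follows. This is a rough illustration rather than the DiffBind or DESeq2 source. The median-of-ratios function follows the general idea behind estimateSizeFactors(). Dividing each library size by the smallest one is an assumption of this sketch about how libSize values might be turned into factors. The function names and all numbers are hypothetical.

```python
import math
from statistics import median

def size_factors_from_lib_sizes(lib_sizes):
    """Factors taken directly from full library sizes (scaling by the smallest is an assumption)."""
    smallest = min(lib_sizes)
    return [size / smallest for size in lib_sizes]

def size_factors_median_of_ratios(counts):
    """Median-of-ratios factors estimated from a sites x samples table of peak counts."""
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if any(c <= 0 for c in row):
            continue  # sites with a zero count have no usable geometric mean
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]

# Hypothetical data: columns are (control, treated); most sites lose half their binding.
peak_counts = [[120, 60], [300, 150], [80, 40], [500, 500], [40, 20]]
lib_sizes = [1_200_000, 1_000_000]

from_libs = size_factors_from_lib_sizes(lib_sizes)        # reflects sequencing depth only
from_peaks = size_factors_median_of_ratios(peak_counts)   # absorbs the binding loss as well

print(from_libs)
print(from_peaks)
```

In this toy table the peak-based factors differ by about twofold between the samples, while the library-based factors differ by only 1.2-fold. Dividing by the peak-based factors would therefore cancel most of the real loss of binding.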