Hi,
I ran DiffBind and used both edgeR and DESeq2.
I have a question regarding the normalized counts. In the DESeq2 analysis, the normalized counts of one of the samples were always integers. In edgeR, no column contained only integers.
As I understand it, in edgeR normalization one column is chosen as the reference for all the others, whereas DESeq2 creates an additional column of averages that is used for normalization.
So I would expect edgeR to give one column of integers after normalization, and DESeq2 none. I actually get the opposite.
Can you help with this please?
Also, in the DiffBind publication edgeR was the default. Now DESeq2 is the default.
Do you have a preference for one of the methods?
Thanks a lot.
OK, I've looked into this a bit more deeply. For the default case when using DESeq2, where bFullLibrarySize=TRUE, DiffBind sets the factors to be the full library sizes (the number of reads in the .bam files) normalized to the smallest library (dividing by the minimum library size). So the smallest library gets a normalization factor equal to 1, while the others are greater than 1. This simple normalization method is only used when bFullLibrarySize=TRUE and method=DBA_DESEQ2. If you set bFullLibrarySize=FALSE, using only the number of reads that overlap consensus peaks, then estimateSizeFactors() is called and the standard median-of-ratios method is used to calculate the normalization factors, none of which should be equal to 1.
-R
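
To make the difference concrete, here is a minimal sketch of the arithmetic behind the two schemes described above. It is written in plain Python rather than R, and it is not DiffBind's or DESeq2's actual code: the function names, library sizes, and counts are all made up for illustration. The first function divides each full library size by the smallest one; the second follows the usual description of the ratio-based method behind estimateSizeFactors() (median of each sample's ratios to the per-peak geometric mean).

```python
import math
from statistics import median


def library_size_factors(lib_sizes):
    """Hypothetical helper: full library sizes divided by the smallest one."""
    smallest = min(lib_sizes)
    return [size / smallest for size in lib_sizes]


def median_ratio_factors(counts):
    """Hypothetical helper: median-of-ratios size factors.

    counts is a list of rows (one per peak), each row holding one count
    per sample. Rows containing a zero are skipped, since their
    geometric mean is zero.
    """
    n_samples = len(counts[0])
    log_rows = [[math.log(c) for c in row]
                for row in counts if all(c > 0 for c in row)]
    # Log of the geometric mean of each peak across samples.
    log_geo_means = [sum(row) / n_samples for row in log_rows]
    factors = []
    for j in range(n_samples):
        log_ratios = [row[j] - g for row, g in zip(log_rows, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Made-up data: three samples, three consensus peaks.
lib_sizes = [20_000_000, 25_000_000, 30_000_000]
counts = [[10, 20, 30],
          [100, 210, 290],
          [50, 95, 160]]

lib_factors = library_size_factors(lib_sizes)   # [1.0, 1.25, 1.5]
ratio_factors = median_ratio_factors(counts)

# Normalized counts are raw counts divided by the sample's factor.
norm_by_lib = [[c / f for c, f in zip(row, lib_factors)] for row in counts]
norm_by_ratio = [[c / f for c, f in zip(row, ratio_factors)] for row in counts]
```

With the library-size factors, the first sample is divided by exactly 1, so its normalized counts stay integers, which matches the all-integer column seen in the DESeq2 results. With the ratio-based factors no sample is divided by exactly 1 in this example, so no column stays integer.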