I'm running an ATAC-seq differential analysis with DiffBind, using peaks called by MACS2. I have 10 samples from batch 3 and 20 samples from batch 5. How can I remove the batch effect and normalize the samples in DiffBind? Sorry, I'm quite new to this area. Thanks!
If you see a batch effect (e.g., in the PCA plots), the best way to handle it is to model it. You don't want to adjust the count matrix directly, as that violates the assumptions of the differential analysis methods in DESeq2 and edgeR, both of which expect raw, unadjusted counts.
To model the batch, use the block parameter in dba.contrast(). For example, if the batch numbers are stored in the Replicate field of your sample sheet, you would specify block=DBA_REPLICATE. Then, when you run dba.analyze(), it will do the analysis two ways: once as a single-factor analysis, and once with the "blocking" factor representing the batch. To use the results from the blocked analysis, you need to specify method=DBA_DESEQ2_BLOCK (or method=DBA_EDGER_BLOCK) when calling dba.report() or any of the plotting functions.
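A minimal sketch of that workflow might look like the following. It assumes a sample sheet named samples.csv (a hypothetical file name) whose Condition column holds the groups being compared and whose Replicate column holds the batch number (3 or 5); the object name atac is also just a placeholder. Treat it as an outline to adapt rather than a tested pipeline.

```r
library(DiffBind)

# ASSUMPTION: "samples.csv" is a hypothetical sample sheet with the usual
# DiffBind columns (SampleID, Condition, Replicate, bamReads, Peaks, ...).
# Here the batch number (3 or 5) is stored in the Replicate column.
atac <- dba(sampleSheet = "samples.csv")

# Count reads in the consensus peak set
atac <- dba.count(atac)

# Check whether samples separate by batch before modelling it
dba.plotPCA(atac, attributes = DBA_REPLICATE, label = DBA_CONDITION)

# Compare conditions, using the batch (Replicate) as a blocking factor
atac <- dba.contrast(atac, categories = DBA_CONDITION, block = DBA_REPLICATE)

# Runs both the single-factor and the blocked analysis
atac <- dba.analyze(atac, method = DBA_DESEQ2)

# Single-factor results (batch ignored)
res_single <- dba.report(atac, method = DBA_DESEQ2)

# Blocked results (batch included in the model)
res_blocked <- dba.report(atac, method = DBA_DESEQ2_BLOCK)

# Plotting functions take the same method argument
dba.plotMA(atac, method = DBA_DESEQ2_BLOCK)
```

Comparing res_single with res_blocked is a quick way to see how much the batch term changes the set of differentially accessible sites. How contrasts are specified can differ between DiffBind releases (newer ones also accept a design formula in dba.contrast()), so it is worth checking ?dba.contrast for the version you have installed.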