How to use ComBat to remove batch effects?
Emma ▴ 10

Hello! I have RNA-seq data and I need to use ComBat to remove the batch effects. Somehow, when I run it, it isn't actually doing anything.

dds <- DESeqDataSetFromMatrix(countData=data, 
                              colData=metadata,
                              design=~Batch + dex, tidy = TRUE)
dds <- DESeq(dds, betaPrior=TRUE)
normalized_counts <- counts(dds, normalized=TRUE)
log2 = log2(normalized_counts+1)

modcombat = model.matrix(~dex, metadata)

Here, metadata is a data frame containing the treatment or control status (in the dex column), the batch, and the name of each patient.

com<-ComBat(log2, metadata$Batch, mod = modcombat)

There are supposed to be 4 different batches, but the values in the com variable are the same as in log2.
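
One quick way to confirm whether a correction step changed anything is to compare the two matrices numerically rather than by eye. The sketch below uses made-up stand-in objects (log2_mat, com_same, com_diff, batch, and the helper unchanged() are all hypothetical names, not from the post) and base R only:

```r
# Hypothetical stand-ins for the log2 matrix and two possible ComBat outputs
set.seed(1)
log2_mat <- matrix(rnorm(20), nrow = 5)
com_same <- log2_mat          # a "correction" that changed nothing
com_diff <- log2_mat + 0.5    # a "correction" that shifted the values

# Check how many batches R actually sees and how many samples are in each
batch <- factor(c("b1", "b1", "b2", "b2"))
table(batch)

# TRUE when the two matrices are numerically equal (within tolerance)
unchanged <- function(before, after) isTRUE(all.equal(before, after))

unchanged(log2_mat, com_same)
unchanged(log2_mat, com_diff)

# Largest absolute change introduced by the correction
max(abs(com_diff - log2_mat))
```

Running table() on the real metadata$Batch column would also show whether all 4 batches are actually present as distinct levels.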

What could be wrong? Would appreciate any help!

BatchEffect DESeq2 RNASeqData • 7.9k views


It may help to show how you created the dds object.

Nevertheless, I would not use ComBat in this way. In your case, I would either use ComBat-seq on the raw counts prior to any DESeq2 command, or, I would use limma::removeBatchEffect in this way:

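dds <- DESeq(dds)
# Normalized counts, kept for reference; not used in the batch correction below
normalized_counts <- counts(dds, normalized = TRUE)
# Variance-stabilizing transformation that takes the design into account
vsd <- vst(dds, blind = FALSE)
mat <- assay(vsd)
# Remove the batch effect from the transformed values
mat <- limma::removeBatchEffect(mat, batch = vsd$Batch)
assay(vsd) <- mat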
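
As a rough illustration of what removeBatchEffect does, here is a small self-contained sketch on simulated log-scale data. Everything in it (the simulated matrix, the 2-batch layout, the helper batch_gap()) is an assumption made for the example and not taken from the original data. It also passes a design matrix so that the treatment effect (dex) is protected while the batch term is removed, which is an optional extra beyond the call above:

```r
library(limma)

set.seed(42)
n_genes <- 50
batch <- factor(rep(c("b1", "b2"), each = 4))
dex   <- factor(rep(c("control", "treated"), times = 4))

# Hypothetical log-scale expression matrix with an additive shift in batch b2
mat <- matrix(rnorm(n_genes * 8, mean = 8), nrow = n_genes)
mat[, batch == "b2"] <- mat[, batch == "b2"] + 2

# Design matrix describing the effect to preserve (treatment)
design <- model.matrix(~ dex)

corrected <- removeBatchEffect(mat, batch = batch, design = design)

# Per-gene difference between batch means, before and after correction
batch_gap <- function(m) {
  rowMeans(m[, batch == "b2"]) - rowMeans(m[, batch == "b1"])
}
summary(batch_gap(mat))
summary(batch_gap(corrected))
```

With a balanced layout like this one, the gap between batch means should shrink to essentially zero after correction. The corrected matrix is meant for visualization (PCA, heatmaps) and similar uses, not as input to DESeq().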

Please also see "Why after VST are there still batches in the PCA plot?"



But I want to batch-correct the normalized and log2 values, and vst only takes integer numbers (my normalized values are not integers). Also, I was told by the unit at my university to use ComBat on my data. Why do you think it's not a good idea?



The unit at your university is incorrect, unfortunately, unless they meant ComBat-seq?

ComBat was / is not designed for bulk RNA-seq data - it was originally developed for microarray data, which is measured on very different scales compared to RNA-seq. By applying ComBat to log2(normalized_counts+1), you are not really following good practice.

If you definitely want batch-corrected normalised counts, then use ComBat-seq and apply it to the raw counts prior to any DESeq2 function. In this case, you would be batch-correcting the raw counts, and would ultimately, therefore, obtain batch-corrected normalised counts, too. Please see:
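
For illustration only, the ComBat-seq route described above might look roughly like the following. This is a sketch on simulated counts: raw_counts and metadata are hypothetical stand-ins, the 2-batch layout is made up, and using design = ~ dex after the adjustment is an assumption of this example, not a recommendation from the thread.

```r
library(sva)
library(DESeq2)

set.seed(1)
n_genes <- 200
# Hypothetical raw count matrix: genes in rows, 8 samples in columns
gene_means <- 2^runif(n_genes, 5, 10)
raw_counts <- matrix(
  rnbinom(n_genes * 8, mu = gene_means, size = 5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), paste0("s", 1:8))
)
metadata <- data.frame(
  Batch = factor(rep(c("b1", "b2"), each = 4)),
  dex   = factor(rep(c("control", "treated"), times = 4)),
  row.names = colnames(raw_counts)
)

# Batch-adjust the raw counts; 'group' tells ComBat-seq which
# biological condition to preserve
adjusted_counts <- ComBat_seq(raw_counts,
                              batch = metadata$Batch,
                              group = metadata$dex)

# The adjusted counts then enter DESeq2 in place of the raw counts
dds <- DESeqDataSetFromMatrix(countData = adjusted_counts,
                              colData = metadata,
                              design = ~ dex)
dds <- estimateSizeFactors(dds)
normalized_counts <- counts(dds, normalized = TRUE)
```

Only the size factors are estimated here, to keep the sketch short. For differential expression, the usual DESeq(dds) call would go in the same place.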


Hi Kevin.

Considering that applications such as WGCNA call for VST-normalized RNA-seq counts and that, unlike ComBat, ComBat-seq requires raw counts as input, I'm a bit confused about where the appropriate normalization step comes in. The authors of ComBat-seq mention that, after batch adjustment, the data can be used directly as input for algorithms such as DESeq2 (which have their own internal normalization). Am I correct to assume that, after batch-effect correction with ComBat-seq, the data can be VST-normalized and used in downstream applications such as WGCNA?

Thanks in advance!
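
To make the question concrete, the workflow being asked about could be sketched as below. This only shows that the steps chain together mechanically on simulated data; it is not a confirmation that this is the recommended approach. raw_counts, metadata and wgcna_input are hypothetical names, and varianceStabilizingTransformation() is used instead of vst() because the toy matrix has fewer rows than vst() subsamples by default.

```r
library(sva)
library(DESeq2)

set.seed(7)
n_genes <- 500
# Hypothetical raw counts with gene-specific means
gene_means <- 2^runif(n_genes, 5, 10)
raw_counts <- matrix(
  rnbinom(n_genes * 8, mu = gene_means, size = 5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), paste0("s", 1:8))
)
metadata <- data.frame(
  Batch = factor(rep(c("b1", "b2"), each = 4)),
  dex   = factor(rep(c("control", "treated"), times = 4)),
  row.names = colnames(raw_counts)
)

# Step 1: batch-adjust the raw counts with ComBat-seq
adjusted_counts <- ComBat_seq(raw_counts,
                              batch = metadata$Batch,
                              group = metadata$dex)

# Step 2: variance-stabilizing transformation of the adjusted counts
dds <- DESeqDataSetFromMatrix(countData = adjusted_counts,
                              colData = metadata,
                              design = ~ dex)
vsd <- varianceStabilizingTransformation(dds, blind = FALSE)
vst_mat <- assay(vsd)

# Step 3: WGCNA expects samples in rows and genes in columns
wgcna_input <- t(vst_mat)
dim(wgcna_input)
```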

