How to use ComBat to remove batch effects?
Emma

Hello! I have RNA-seq data and I need to use ComBat to remove the batch effects. Somehow, when I run it, it isn't actually doing anything.

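# Build the DESeq2 object; tidy = TRUE means the first column of `data` holds the gene IDs
dds <- DESeqDataSetFromMatrix(countData = data,
                              colData = metadata,
                              design = ~ Batch + dex, tidy = TRUE)
dds <- DESeq(dds, betaPrior = TRUE)
normalized_counts <- counts(dds, normalized = TRUE)
log2 <- log2(normalized_counts + 1)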

modcombat <- model.matrix(~dex, data = metadata)

Here metadata is a data frame containing the treatment or control label (in the dex column), the batch, and the name of each patient.

com <- ComBat(log2, metadata$Batch, mod = modcombat)

There are supposed to be 4 different batches, but the values in com are the same as in log2. What could be wrong? I would appreciate any help!

Hi,

It may help to show how you created the dds object. Nevertheless, I would not use ComBat in this way. In your case, I would either use ComBat-seq on the raw counts prior to any DESeq2 command, or use limma::removeBatchEffect in this way:

dds <- DESeq(dds)
normalized_counts <- counts(dds, normalized = TRUE)
# variance-stabilising transformation of the counts
vsd <- vst(dds, blind = FALSE)
mat <- assay(vsd)
# remove the batch effect from the transformed matrix
mat <- limma::removeBatchEffect(mat, vsd$Batch)
assay(vsd) <- mat
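
For illustration only, here is a minimal self-contained sketch of the removeBatchEffect step on simulated data. The matrix `mat`, the `batch` and `dex` vectors, and the size of the batch shift are all made up for the example and merely stand in for assay(vsd) and the real sample annotation. Passing `design` is an optional addition, not part of the code above: it asks limma to keep the treatment effect while removing the batch term.

```r
library(limma)

set.seed(42)
# Hypothetical balanced layout: 2 batches x 2 conditions, 8 samples
batch <- factor(rep(c("A", "B"), each = 4))
dex   <- factor(rep(c("control", "treated"), times = 4))

# Simulated log-scale expression for 100 genes (stand-in for assay(vsd))
mat <- matrix(rnorm(100 * 8, mean = 8), nrow = 100,
              dimnames = list(paste0("gene", 1:100), paste0("s", 1:8)))
mat[, batch == "B"] <- mat[, batch == "B"] + 2   # additive batch shift

# Design matrix describing the effect we want to keep (treatment)
design <- model.matrix(~ dex)

corrected <- removeBatchEffect(mat, batch = batch, design = design)

# Per-gene difference between the batch means, before and after correction
before <- rowMeans(mat[, batch == "B"]) - rowMeans(mat[, batch == "A"])
after  <- rowMeans(corrected[, batch == "B"]) - rowMeans(corrected[, batch == "A"])
summary(before)
summary(after)
```

Comparing the two summaries is a quick way to check that the correction actually changed the matrix, which is the check that failed in the original question.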


Please also see: "Why after VST are there still batches in the PCA plot?"

Kevin

But I want to batch-correct the normalized and log2 values, and vst only takes integers (my normalized values are not integers). Also, I was told to use ComBat on my data by the unit at my university. Why do you think it's not a good idea?

Thanks!

The unit at your university is incorrect, unfortunately, unless they meant ComBat-seq?

ComBat was / is not designed for bulk RNA-seq data - it was originally developed for microarray data, which is measured on very different scales compared to RNA-seq. By applying ComBat to log2(normalized_counts+1), you are not really following good practice.

If you definitely want batch-corrected normalised counts, then use ComBat-seq and apply it to the raw counts prior to any DESeq2 function. In this case, you would be batch-correcting the raw counts, and would ultimately, therefore, obtain batch-corrected normalised counts, too. Please see: https://github.com/zhangyuqing/ComBat-seq
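
A rough sketch of that order of operations is given below, on simulated counts rather than real data. It assumes the `ComBat_seq` function shipped in recent releases of the sva Bioconductor package; the count matrix, the `batch` and `dex` vectors, and the choice of `~ dex` as the design after correction are assumptions made for the example.

```r
library(sva)      # assumed to provide ComBat_seq (recent sva releases)
library(DESeq2)

set.seed(1)
# Hypothetical balanced layout: 2 batches x 2 conditions, 8 samples
batch <- rep(c("A", "B"), each = 4)
dex   <- rep(c("control", "treated"), times = 4)

# Simulated raw integer counts for 200 genes, with batch B inflated two-fold
raw_counts <- matrix(rnbinom(200 * 8, mu = 100, size = 5), nrow = 200,
                     dimnames = list(paste0("gene", 1:200), paste0("s", 1:8)))
raw_counts[, batch == "B"] <- raw_counts[, batch == "B"] * 2L

# Step 1: batch-correct the RAW counts, telling ComBat-seq which
# biological grouping to preserve
adjusted <- ComBat_seq(raw_counts, batch = batch, group = dex)
dimnames(adjusted) <- dimnames(raw_counts)
storage.mode(adjusted) <- "integer"   # DESeq2 expects integer counts

# Step 2: only now hand the adjusted counts to DESeq2
coldata <- data.frame(dex = factor(dex), row.names = colnames(adjusted))
dds <- DESeqDataSetFromMatrix(countData = adjusted,
                              colData = coldata,
                              design = ~ dex)
dds <- estimateSizeFactors(dds)

# Batch-corrected normalised counts
normalized_counts <- counts(dds, normalized = TRUE)
```

With real data, `raw_counts`, `batch` and `dex` would come from the count table and the sample metadata instead of being simulated.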

Hi Kevin.

For applications such as WGCNA, one would want VST-normalized RNA-seq counts. Unlike ComBat, ComBat-seq requires the input matrix to be raw counts, so I'm a bit confused about where the appropriate normalization step comes in. The authors of ComBat-seq mention that, after the data have been adjusted for batch effects by ComBat-seq, they can be used directly as input for algorithms such as DESeq2 (which have their own internal normalization). Am I correct to assume that after batch effect correction with ComBat-seq, the data can be VST-normalized and used in downstream applications such as WGCNA?