How to use ComBat to remove batch effects?
Emma ▴ 10
@emma-25007

Hello! I have RNA-seq data and I need to use ComBat to remove the batch effects. When I run it, though, it doesn't seem to change anything.

dds <- DESeqDataSetFromMatrix(countData=data, 
                              colData=metadata, 
                              design=~ Batch + dex, tidy = TRUE)
dds <- DESeq(dds, betaPrior=TRUE)
normalized_counts <- counts(dds, normalized=TRUE)
log2 = log2(normalized_counts+1)

modcombat = model.matrix(~dex, metadata)

Here metadata is a variable containing the treatment or control status (in the dex column), the batch, and the name of each patient.

com<-ComBat(log2, metadata$Batch, mod = modcombat)

There are supposed to be 4 different batches, but the values in the com variable are the same as in log2.

What could be wrong? Would appreciate any help!

BatchEffect DESeq2 RNASeqData • 13k views
Kevin Blighe ★ 4.0k
@kevin

Hi,

It may help to show how you created the dds object.

Nevertheless, I would not use ComBat in this way. In your case, I would either use ComBat-seq on the raw counts prior to any DESeq2 command, or, I would use limma::removeBatchEffect in this way:

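dds <- DESeq(dds)
normalized_counts <- counts(dds, normalized = TRUE)
# variance-stabilising transformation, using the design stored in dds
vsd <- vst(dds, blind = FALSE)
mat <- assay(vsd)
# remove the batch effect from the transformed values (e.g. for PCA or clustering)
mat <- limma::removeBatchEffect(mat, batch = vsd$Batch)
assay(vsd) <- mat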

Please also see Why after VST are there still batches in the PCA plot?

Kevin


But I want to batch-correct the normalized, log2-transformed values, and vst only takes integers (my normalized values are not integers). Also, the unit at my university told me to use ComBat on my data. Why do you think it's not a good idea?

Thanks!


The unit at your university is incorrect, unfortunately, unless they meant ComBat-seq?

ComBat was not designed for bulk RNA-seq data. It was originally developed for microarray data, which are measured on a very different scale from RNA-seq counts. Applying ComBat to log2(normalized_counts+1) is therefore not really good practice.

If you definitely want batch-corrected normalised counts, then use ComBat-seq and apply it to the raw counts prior to any DESeq2 function. In this case, you would be batch-correcting the raw counts, and would ultimately, therefore, obtain batch-corrected normalised counts, too. Please see: https://github.com/zhangyuqing/ComBat-seq
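
For anyone who wants to see the order of operations described above, here is a minimal sketch. It assumes the ComBat_seq() function that is now shipped in the sva package, and it uses simulated toy counts (raw_counts, and a toy metadata with Batch and dex columns) purely as stand-ins for the real data. Using design = ~ dex after the correction is also an assumption of this sketch, not something stated in the thread.

```r
library(sva)     # provides ComBat_seq()
library(DESeq2)

# Toy data for illustration only: 200 genes x 16 samples, 4 batches,
# each batch containing 2 control and 2 treated samples.
set.seed(1)
metadata <- data.frame(
  Batch = factor(rep(1:4, each = 4)),
  dex   = factor(rep(c("control", "treated"), times = 8)),
  row.names = paste0("sample", 1:16)
)
raw_counts <- matrix(
  rnbinom(200 * 16, mu = 100, size = 5),
  nrow = 200,
  dimnames = list(paste0("gene", 1:200), rownames(metadata))
)

# 1. Batch-correct the raw integer counts before any DESeq2 step.
#    'group' tells ComBat_seq which biological effect to preserve.
adjusted_counts <- ComBat_seq(raw_counts,
                              batch = metadata$Batch,
                              group = metadata$dex)

# 2. Build the DESeq2 object from the adjusted counts. The input is a
#    plain matrix with gene IDs as row names, so tidy = TRUE is not used.
dds <- DESeqDataSetFromMatrix(countData = adjusted_counts,
                              colData   = metadata,
                              design    = ~ dex)

# 3. Size factors are all that is needed for normalised counts;
#    DESeq(dds) would be run instead for differential expression.
dds <- estimateSizeFactors(dds)
normalized_counts <- counts(dds, normalized = TRUE)
```

With real data, raw_counts would be the unnormalised count matrix, with its columns in the same order as the rows of metadata.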


Hi Kevin.

For applications such as WGCNA, one would want VST-normalized RNA-seq counts, but unlike ComBat, ComBat-seq requires the input matrix to be raw counts, so I'm a bit confused about where the appropriate normalization step comes in. The authors of ComBat-seq mention that, after adjusting for batch effects, the data can be used directly as input for tools such as DESeq2 (which have their own internal normalization). Am I correct to assume that, after batch-effect correction with ComBat-seq, the data can be VST-normalized and used in downstream applications such as WGCNA?

Thanks in advance!
