Hello! I have RNA-seq data and I need to use ComBat to remove the batch effects. Somehow, when I run it, it isn't actually doing anything.
dds <- DESeqDataSetFromMatrix(countData = data,
                              colData = metadata,
                              design = ~ Batch + dex, tidy = TRUE)
dds <- DESeq(dds, betaPrior=TRUE)
normalized_counts <- counts(dds, normalized=TRUE)
log2 = log2(normalized_counts+1)
# metadata is a data frame containing the treatment or control label (dex column), the batch, and the name of each patient
modcombat = model.matrix(~dex, metadata)
com<-ComBat(log2, metadata$Batch, mod = modcombat)
There are supposed to be 4 different batches, but in the com variable I can see that the values have stayed the same as in log2.
What could be wrong? Would appreciate any help!
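A quick way to confirm whether a correction step changed anything is to compare the two matrices numerically rather than by eye. The following is only a minimal base-R sketch: `summarise_change()` is a hypothetical helper, and `before` and `after` are made-up stand-in matrices, not real expression data.

```r
# Hypothetical helper: summarise how much a correction step changed a matrix.
summarise_change <- function(before, after) {
  stopifnot(identical(dim(before), dim(after)))
  abs_diff <- abs(after - before)
  list(
    unchanged    = isTRUE(all.equal(before, after)),  # TRUE if equal within tolerance
    max_abs_diff = max(abs_diff),                     # largest single change
    n_changed    = sum(abs_diff > 1e-8)               # number of entries that moved
  )
}

# Stand-in data: 5 genes x 4 samples (NOT real expression values).
set.seed(42)
before <- matrix(rnorm(20, mean = 8), nrow = 5)
after  <- before
after[, 3:4] <- after[, 3:4] - 0.5  # pretend two samples were shifted

res <- summarise_change(before, after)
str(res)
```

With the objects from the question, the equivalent call would be `summarise_change(log2, com)`. It may also be worth looking at `table(metadata$Batch)` and checking that the rows of `metadata` are in the same order as the columns of the expression matrix, since ComBat reads the batch vector in column order.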
But I want to batch-correct the normalized, log2-transformed values, and vst only takes integers (my normalized values are not integers). Also, I was told to use ComBat on my data by the unit at my university. Why do you think it's not a good idea?
Thanks!
The unit at your university is incorrect, unfortunately, unless they meant ComBat-seq?
ComBat was not, and is not, designed for bulk RNA-seq data. It was originally developed for microarray data, which is measured on a very different scale from RNA-seq. By applying ComBat to log2(normalized_counts+1), you are not really following good practice.
If you definitely want batch-corrected normalised counts, then use ComBat-seq and apply it to the raw counts prior to any DESeq2 function. In this case, you would be batch-correcting the raw counts and would therefore ultimately obtain batch-corrected normalised counts, too. Please see: https://github.com/zhangyuqing/ComBat-seq
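A rough sketch of that order of operations, assuming the sva package (recent Bioconductor releases include `ComBat_seq()`) and DESeq2 are installed. The counts and metadata below are simulated stand-ins. `Batch` and `dex` mirror the column names used in the question, and passing `dex` as the `group` argument and using `~ dex` as the design are assumptions for illustration, not something confirmed in this thread.

```r
library(sva)     # provides ComBat_seq()
library(DESeq2)

# Simulated stand-in data: 200 genes x 8 samples (NOT real counts).
set.seed(1)
n_genes   <- 200
n_samples <- 8
gene_means <- runif(n_genes, min = 20, max = 500)
raw_counts <- matrix(
  rnbinom(n_genes * n_samples, mu = rep(gene_means, times = n_samples), size = 5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("sample", seq_len(n_samples)))
)
metadata <- data.frame(
  Batch = factor(rep(c("b1", "b2"), each = 4)),
  dex   = factor(rep(c("control", "treated"), times = 4)),
  row.names = colnames(raw_counts)
)

# Step 1: batch-correct the RAW counts, before any DESeq2 function is called.
adjusted_counts <- ComBat_seq(raw_counts,
                              batch = metadata$Batch,
                              group = metadata$dex)
storage.mode(adjusted_counts) <- "integer"  # DESeq2 expects integer counts

# Step 2: hand the adjusted counts to DESeq2 and normalise as usual.
dds <- DESeqDataSetFromMatrix(countData = adjusted_counts,
                              colData   = metadata,
                              design    = ~ dex)
dds <- estimateSizeFactors(dds)
normalized_counts <- counts(dds, normalized = TRUE)
```

Only `estimateSizeFactors()` is run here because the aim is normalised counts. A full differential expression analysis would call `DESeq()` on `dds` instead.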
Hi Kevin.
For applications such as WGCNA, one would want VST-normalized RNA-seq counts, and unlike ComBat, ComBat-seq requires the input matrix to be raw counts, so I'm a bit confused as to where the appropriate normalization step comes in. The authors of ComBat-seq mention that, after the data have been adjusted for batch effects by ComBat-seq, they can be used directly as input for tools such as DESeq2 (which have their own internal normalization). Am I correct to assume that, after batch effect correction with ComBat-seq, the data can be VST-normalized and used in downstream applications such as WGCNA?
Thanks in advance!