Question

sva ComBat implementation for multiple sources of batch effects

0

Entering edit mode

Zach Roe ▴ 10

@zach-roe-11189

Last seen 4.0 years ago

I have multiple sources of batch effects that show associations with highest principal components that I want to remove from my expression matrix for downstream analysis (clustering, pca).

I am following the sva::ComBat implementation and wanted to see if I can model more that one batch effect at the same time (I do not see an example of how I could do this on the vignette) or if I should iterate and remove each at a time as such:

> modcombat = model.matrix(~1, data=pheno) combat_edata1 =
> ComBat(dat=edata, batch=batch1, mod=modcombat, par.prior=TRUE,
> prior.plots=FALSE) combat_edata2 = ComBat(dat=combat_edata1,
> batch=batch2, mod=modcombat, par.prior=TRUE, prior.plots=FALSE)

Or should I set my model matrix with the second batch, while combat-adjusting for first batch, then set the model matrix with the first batch while combat-adjusting for the second batch as follows:

> modcombat1 = model.matrix(~batch2, data=pheno)
> combat_edata1 = ComBat(dat=edata, batch=batch1, mod=modcombat1, par.prior=TRUE, prior.plots=FALSE)

> modcombat2 = model.matrix(~batch1, data=pheno)
> combat_edata2 = ComBat(dat=combat_edata1, batch=batch2, mod=modcombat2, par.prior=TRUE, prior.plots=FALSE)

Or alternatively, look at the combination of batch1-batch2 and treat them together, which just gives the original implementation in the vignette.

I want to add that I am not using ComBat for DE analysis as I'm modeling the batch effects in my design matrix for that purpose, but only interested in producing batch-corrected values for visualization purposes.

sva combat batch-adjustment • 1.6k views

ADD COMMENT • link updated 4.0 years ago by W. Evan Johnson ▴ 850 • written 4.0 years ago by Zach Roe ▴ 10

score 2 · Answer 1 · 2020-05-05

2

Entering edit mode

James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 15 hours ago

United States

If you just need the data for visualization purposes you should probably use removeBatchEffect in limma, which will emulate what you are doing within your linear model.

ADD COMMENT • link 4.0 years ago James W. MacDonald 65k

0

Entering edit mode

Does removeBatchEffect work on RNA-seq? I've only used limma in microarray. For RNA-seq I have rlog values from DESeq2.

ADD REPLY • link 4.0 years ago Zach Roe ▴ 10

2

Entering edit mode

For bulk RNA-seq, you would apply removeBatchEffects() to the variance-stabilised or regularised log expression levels (both DESeq2), or the log [base 2] CPM expression levels (EdgeR and limma-voom), i.e., the transformed expression levels after the data has already been normalised. I provide a use-case, here: https://www.biostars.org/p/434736/#434769

ADD REPLY • link 4.0 years ago Kevin Blighe ★ 3.9k

score 2 · Answer 2 · 2020-05-05

Zach, if you are doing RNA-seq, you should try ComBat_seq. Its available in the most recent Bioconductor release. If you have a strong batch effect, the methodology mitigates the exaggerated significance problem so we recommend that it can be used for differential expression (which incidentally, removeBatchEffect doesn’t mitigate exaggerated significance in the case of strong variance batch effects).

score 1 · Answer 3 · 2020-05-05

1

Entering edit mode

W. Evan Johnson ▴ 850

@w-evan-johnson-5447

Last seen 16 hours ago

United States

The publication for ComBat_seq is pending, but here is the Bioxiv: https://www.biorxiv.org/content/10.1101/2020.01.13.904730v1

ADD COMMENT • link 4.0 years ago W. Evan Johnson ▴ 850

0

Entering edit mode

Hi Evan, thank you. I'm also working with DNA methylation data, and my question above is specific to that. Do you have recommendation when there are more than 2 batch sources if I can implement ComBat twice? Each time correcting for one type of batch effect?

In my case I have both 450K and EPIC array data. I first implement ComBat in each platform to correct for array batches. I then merged the two platforms, and then want to implement ComBat again to correct for platform as batch.

ADD REPLY • link 4.0 years ago Zach Roe ▴ 10