Question: Removing batch effect with limma::removeBatchEffect() actually exacerbates the effect
5 months ago, lech.kaczmarczyk wrote:

Hello,

I am attempting to remove batch effects from my data using limma::removeBatchEffect(). I have two batches of samples and four conditions. In the figures below, batches are color-coded. I'm wondering why the batch effect seems stronger after applying limma::removeBatchEffect().

The functions were run with default parameters, as follows:

      vst <- vst(dds)
      plotPCA(vst, "Sac")
      assay(vst) <- limma::removeBatchEffect(assay(vst), vst$Sac)
      plotPCA(vst, "Sac")

Before correction: [Figure: PCA plot before limma batch correction]

After correction: [Figure: PCA plot after limma batch correction]

Tags: limma, deseq2
Answer: Removing batch effect with limma::removeBatchEffect() actually exacerbates the e
5 months ago, Michael Love (United States) wrote:

When you run the removeBatchEffect function, it removes shifts in the group means associated with the grouping factor you provide, per row of the matrix. It seems that the shift is not shared across the conditions. Are these really just two batches, or were the condition samples divided further?


Many thanks for your response, Michael, I appreciate it. This was RNA-seq of mouse brain region- and cell-specific RNA immunoprecipitations. Groups denote the days the mice were sacrificed. Conditions were not divided further.

Since the outliers coincided with the days on which the specimens were sacrificed, I thought it was a sound approach to treat this as a batch effect (importantly, the mice sacrificed later were also born later, so it should not be related to age).

Of course, it may be i) a coincidence or ii) a tissue preparation (experimental) artifact (e.g. lack of reproducibility in brain region dissection). If I understand correctly, the shift between those samples is inconsistent and therefore does not resemble a typical batch effect, hence the observed output of the removeBatchEffect function. Would it be a good use of time to try other tools to handle this?

If this is not a batch effect, I would hesitate between i) using the samples as they are for comparisons or ii) using only "red" ones, and tossing the "green" batch.

Reply written 5 months ago by lech.kaczmarczyk

I might try SVA or RUV.

Another thing I would do is find batch-y genes (via an LRT removing the batch variable) and look at plotCounts() for these genes to see if the batch effect is consistent. The important thing for DE analysis is what happens at the gene level, while the PCA is just a QC plot, to give an overview of the variation.
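A rough sketch of that check, run on simulated data rather than the real experiment. Everything specific here is an assumption for illustration: `makeExampleDESeqDataSet()` stands in for the poster's `dds`, and the `batch` column is a made-up label playing the role of `Sac`. The likelihood ratio test compares the full design `~ batch + condition` against the reduced design `~ condition`, so small p-values flag genes where the batch term matters.

```r
library(DESeq2)

set.seed(1)

# Simulated stand-in for the real dataset (assumption, not the poster's data).
# The example dataset already has a two-level `condition` column.
dds <- makeExampleDESeqDataSet(n = 500, m = 12)

# Hypothetical batch label, here playing the role of the `Sac` column
dds$batch <- factor(rep(c("b1", "b2"), times = 6))
design(dds) <- ~ batch + condition

# LRT of the full design against a reduced design without the batch term
dds <- DESeq(dds, test = "LRT", reduced = ~ condition)
res <- results(dds)

# Gene with the strongest evidence for a batch term
top <- rownames(res)[which.min(res$pvalue)]

# Per-sample normalized counts for that gene, split by batch and condition
d <- plotCounts(dds, gene = top, intgroup = c("batch", "condition"),
                returnData = TRUE)
print(d)
```

With real data, one would look at several top genes and check whether the batch shift has the same direction and similar size within every condition. The simulated counts above contain no real batch effect, so the selected gene only illustrates the mechanics.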

Reply written 5 months ago by Michael Love
Answer: Removing batch effect with limma::removeBatchEffect() actually exacerbates the e
5 months ago, Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote:

Two points.

First, your PCA plot does not suggest a substantial batch effect, so I wonder whether you need to worry about it.

Second, when you run removeBatchEffect you need to set the design argument so that the function knows what the four treatment conditions are. The batches are unbalanced with respect to conditions, and we only want to remove the batch effect within each condition level. For example:

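      # Assumes the treatment factor is stored as a `condition` column in colData(vst)
      design0 <- model.matrix(~condition, data = as.data.frame(colData(vst)))
      assay(vst) <- limma::removeBatchEffect(assay(vst), batch = vst$Sac, design = design0)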

Without setting the design argument, the effect you have seen is to be expected.
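To see why this happens, here is a small base R simulation for a single gene. It is only a sketch of the idea, not limma's code: `remove_batch()` is a hypothetical helper that fits a linear model and subtracts the fitted batch term, and the sample layout (4 conditions, 2 batches, unbalanced) and effect sizes are invented for illustration.

```r
set.seed(1)

# Hypothetical layout: 4 conditions, 2 batches, unbalanced across conditions
condition <- factor(rep(c("A", "B", "C", "D"), each = 4))
batch <- factor(c(1, 1, 1, 2,  1, 1, 1, 2,  1, 2, 2, 2,  1, 2, 2, 2))

# One simulated gene: C and D are 3 units higher, batch 2 adds 1 unit
cond_effect <- unname(c(A = 0, B = 0, C = 3, D = 3)[as.character(condition)])
y <- cond_effect + ifelse(batch == "2", 1, 0) + rnorm(16, sd = 0.1)

# Illustrative helper (not limma's implementation): fit y on the design plus
# a sum-to-zero coded batch term, then subtract only the fitted batch part.
remove_batch <- function(y, batch, design = matrix(1, length(y), 1)) {
  contrasts(batch) <- contr.sum(nlevels(batch))
  Xb <- model.matrix(~batch)[, -1, drop = FALSE]
  fit <- lm.fit(cbind(design, Xb), y)
  beta_b <- fit$coefficients[-seq_len(ncol(design))]
  y - as.vector(Xb %*% beta_b)
}

# Average batch 1 minus batch 2 difference, computed within each condition
within_gap <- function(v) {
  gaps <- sapply(levels(condition), function(k) {
    mean(v[condition == k & batch == "1"]) - mean(v[condition == k & batch == "2"])
  })
  mean(gaps)
}

design0 <- model.matrix(~condition)
adj_naive  <- remove_batch(y, batch)           # batch only, no design
adj_design <- remove_batch(y, batch, design0)  # batch fitted alongside condition

print(c(raw = within_gap(y),
        naive = within_gap(adj_naive),
        with_design = within_gap(adj_design)))
```

In this layout batch 2 contains mostly C and D samples, so the batch-only fit attributes part of the condition difference to batch and over-corrects: the within-condition gap between batches ends up larger than before and with the opposite sign. When the condition design is supplied, the gap is removed.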

SVA and RUV don't seem to me to be appropriate here, because they are intended to discover the batch factor whereas you already know what it is. If you do use those algorithms, then you will have the same issue that you have with removeBatchEffect. When you do the actual batch correction, the batch correction algorithm will need to know the treatment conditions as well as the batch factor or surrogate variables.
