Question: limma removeBatchEffect -- strange values in presence of NAs
10 months ago by jf10

When trying to adjust data for batch effects with the limma (v3.34.9) function removeBatchEffect(), I got values that look strange, at least to my understanding.

My question: in the example below, wouldn't one expect row 1 to behave like row 3, i.e. every non-NA entry replaced by the mean of the non-NA values in that row? What am I missing here?

> library(limma)
> a = cbind(1,4,c(1,4,NA),c(NA,1,1))
> a
     [,1] [,2] [,3] [,4]
[1,]    1    4    1   NA
[2,]    1    4    4    1
[3,]    1    4   NA    1

> removeBatchEffect(a,1:4)
       1   2   3   4
[1,] 1.0 1.0 1.0  NA
[2,] 2.5 2.5 2.5 2.5
[3,] 2.0 2.0  NA 2.0
Warning message:
Partial NA coefficients for 2 probe(s)

Answer: limma removeBatchEffect -- strange values in presence of NAs
10 months ago by Gordon Smyth
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia

According to your code, every observation belongs to a different batch. The whole concept of batch removal doesn't make any sense in this context. Why are you doing this?

The reason why the returned value isn't always the mean of the non-NA values is that removeBatchEffect doesn't reorthogonalize the design matrix when some of the batch effects become non-estimable. The function isn't designed for situations where one of the batches is entirely missing. But it hardly matters. You're removing all the signal from the data, so the value that remains doesn't matter from a differential expression (DE) point of view.
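
To make the mechanism concrete, here is a rough sketch in base R. It is not limma's actual code. It assumes that the batch factor is coded with sum-to-zero contrasts, that each row is fitted by least squares on its non-NA observations only, and that non-estimable batch coefficients are treated as zero before the batch term is subtracted. The helper name adjust_row is made up for this illustration.

```r
# Data from the question: 3 rows, 4 samples, each sample its own batch.
a <- cbind(1, 4, c(1, 4, NA), c(NA, 1, 1))
batch <- factor(1:4)

# Sum-to-zero coding of the batch factor (assumed to mirror the function's
# internal coding); the intercept column is kept separately.
contrasts(batch) <- contr.sum(levels(batch))
X.batch <- model.matrix(~batch)[, -1, drop = FALSE]
design <- cbind(Intercept = 1, X.batch)

# Hypothetical helper: fit one row on its observed samples only, then
# subtract the fitted batch term.
adjust_row <- function(y) {
  obs <- !is.na(y)
  fit <- lm.fit(design[obs, , drop = FALSE], y[obs])
  beta <- fit$coefficients[-1]   # drop the intercept
  beta[is.na(beta)] <- 0         # non-estimable batch effects set to zero
  y - drop(X.batch %*% beta)
}

adjusted <- t(apply(a, 1, adjust_row))
print(adjusted)
```

Under these assumptions the sketch should give the same values as in the question. In row 3 the missing sample is batch 3, and the remaining three design columns still sum to zero over the observed samples, so the intercept is the row mean (2). In row 1 the missing sample is batch 4, the level that the sum-to-zero coding expresses as minus the sum of the others. With that sample gone and the last coefficient dropped, the intercept is simply the fitted value for batch 3, which is 1, not the row mean.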

Thank you so much for the quick answer! My reason for doing this: I was just curious what effect NAs would have, so I constructed this minimal example with synthetic values, and the results puzzled me.