How to remove more than two batches from my datasets
yi.huang ▴ 30
@yihuang-11146

Hi all,

I've used the removeBatchEffect function from the limma package for my previous datasets, but it's limited to two batches only.

I'm now working with a dataset that contains more than two batches. How can I handle this?

Many thanks,

Yi

limma RNAseq removebatcheffect()
@gordon-smyth
WEHI, Melbourne, Australia

removeBatchEffect() is not limited to two batches. It works just the same with any number of batches. For example, if you have three batches (A, B, C) you might use:

batch <- c("A","A","A","B","B","B","C","C","C")
removeBatchEffect(y, batch)

or

removeBatchEffect(y, batch, design=design)
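
To make the three-batch case concrete, here is a base-R sketch of what the correction amounts to for a single batch factor: code the factor with sum-to-zero contrasts, fit a linear model per gene, and subtract the fitted batch terms. This is an illustration under assumptions (simulated data, an intercept-only design, made-up shift values), not limma's actual source code.

```r
set.seed(1)

# Nine samples in three batches, as in the example above
batch <- factor(c("A","A","A","B","B","B","C","C","C"))

# Simulated expression matrix (5 genes x 9 samples) with made-up batch shifts
y <- matrix(rnorm(5 * 9), nrow = 5)
shift <- c(A = 0, B = 2, C = -1)
y <- sweep(y, 2, shift[as.character(batch)], "+")

# One factor with three levels becomes two sum-to-zero columns
contrasts(batch) <- contr.sum(levels(batch))
X.batch <- model.matrix(~batch)[, -1, drop = FALSE]

# Intercept-only design (assumption: no treatment effect to protect)
design <- matrix(1, nrow = ncol(y), ncol = 1)

# Fit all genes at once, then subtract the estimated batch terms
fit <- lm.fit(cbind(design, X.batch), t(y))
beta <- t(fit$coefficients[-1, , drop = FALSE])  # genes x batch columns
corrected <- y - beta %*% t(X.batch)

# Per-gene batch means after correction (rows = genes, columns = batches)
batch_means <- t(apply(corrected, 1, function(g) tapply(g, batch, mean)))
```

In this balanced toy setup, each row of batch_means should hold three equal values, which is the sense in which the batch effect has been removed. Nothing in the sketch depends on there being exactly three levels.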

Strangely, the same question has been asked before: Batch effect removal

Edit 24 hours later:

If your need is actually to handle more than 2 batch factors, rather than just more than 2 batches, then (as suggested by Steve) this can be achieved using the covariates argument to removeBatchEffect. Suppose you have three batch factors (batch1, batch2 and batch3, each stored as an R factor):

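# Sum-to-zero coding, so the batch effects within each factor average to zero
contrasts(batch1) <- contr.sum(levels(batch1))
contrasts(batch2) <- contr.sum(levels(batch2))
contrasts(batch3) <- contr.sum(levels(batch3))
# Build the batch columns, then drop the intercept column
covariates <- model.matrix(~batch1+batch2+batch3)
covariates <- covariates[,-1]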

Then you can correct by

removeBatchEffect(y, covariates=covariates, design=design)
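
For anyone who wants to try this end to end, the following is a self-contained toy version. Everything in it is hypothetical (the factor levels, the sample layout, the random expression matrix y and the treatment factor group); it only shows how the pieces fit together. The call to removeBatchEffect is guarded so that the rest of the script still runs if limma is not installed.

```r
# Toy layout: 12 samples, three batch factors and one treatment factor (all made up)
batch1 <- factor(rep(c("a", "b"), each = 6))
batch2 <- factor(rep(c("x", "y", "z"), times = 4))
batch3 <- factor(rep(c("p", "q"), times = 6))
group  <- factor(rep(rep(c("ctl", "trt"), each = 3), times = 2))

# Random expression values: 20 genes x 12 samples
set.seed(42)
y <- matrix(rnorm(20 * 12), nrow = 20)

# Sum-to-zero contrasts for every batch factor
contrasts(batch1) <- contr.sum(levels(batch1))
contrasts(batch2) <- contr.sum(levels(batch2))
contrasts(batch3) <- contr.sum(levels(batch3))

# Batch columns without the intercept: (2-1) + (3-1) + (2-1) = 4 columns
covariates <- model.matrix(~ batch1 + batch2 + batch3)[, -1, drop = FALSE]

# Design holding the effects to preserve (here just the treatment group)
design <- model.matrix(~ group)

# The batch columns must not be confounded with the design
full_rank <- qr(cbind(design, covariates))$rank ==
  ncol(design) + ncol(covariates)

if (requireNamespace("limma", quietly = TRUE)) {
  corrected <- limma::removeBatchEffect(y, covariates = covariates,
                                        design = design)
}
```

The rank check is worth keeping with real data: if a batch factor is confounded with the treatment, the batch effect cannot be separated from the effect you want to keep.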
ADD COMMENT

My 2 cents:

The (common) confusion that you're pointing out is likely due to the easy-to-understand description of the batch and batch2 parameters, and the somewhat cryptic description of the covariates parameter.

Presumably you mean that one could control for an arbitrary number of batches by creating a covariates design matrix that encodes these batches, but the average lay person will have no intuition on how to do that.

Furthermore, there are no examples in ?removeBatchEffect that can help shed light on the situation other than the example using the single batch parameter.

ADD REPLY

No, that isn't what I mean at all. I think you are confusing batch-factors with batches, and perhaps that is OP's misunderstanding as well.

removeBatchEffect() can handle only two batch factors, but each factor can have an arbitrary number of levels, just like any factor in any R function. Each level corresponds to a batch. So removeBatchEffect() naturally handles an arbitrary number of batches even without the batch2 or covariates arguments.

The batch2 and covariates arguments are for more complex situations where there is an additive structure of batch effects from multiple sources.

ADD REPLY

There we have it: dollars to donuts it's a terminology thing, then, as you point out.

My bet is that the OP is asking about how to control for > 2 batch factors (since that's how I interpreted the question ;-), since it's pretty straightforward to see (and try to test) how the batch and batch2 parameters can have > 2 unique categorical values (levels) ... but let's see.

ADD REPLY

I bet they only have one factor. (Later: but I was wrong.)

ADD REPLY

In service to my fellow mere mortals who will be trying to grok what contr.sum is doing in Gordon's updated answer, you can start by reading through this tutorial on contrast coding schemes for categorical variables.
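
As a quick illustration in base R (the three-level factor here is made up), contr.sum codes a factor so that its columns sum to zero across levels, whereas R's default treatment coding compares each level to the first one:

```r
# Sum-to-zero coding for three levels: the last level gets -1 in every column
sum_coding <- contr.sum(c("A", "B", "C"))
print(sum_coding)
#   [,1] [,2]
# A    1    0
# B    0    1
# C   -1   -1

# R's default treatment coding, for comparison: the first level is the baseline
treatment_coding <- contr.treatment(c("A", "B", "C"))
print(treatment_coding)

# How a factor expands into model-matrix columns under sum-to-zero coding
f <- factor(c("A", "B", "C", "C"))
contrasts(f) <- contr.sum(levels(f))
mm <- model.matrix(~f)
print(mm)
```

With sum-to-zero coding, the fitted batch terms are deviations around an overall mean rather than differences from a reference batch, so subtracting them does not shift the data towards any one batch.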

ADD REPLY

Hi Gordon and Steve,

Thanks for your replies. Sorry for the poor description of my question.

What I mean is that there are more than two batch "factors".

For example, how can I handle the batch "factors" when the design is ~ batch1+batch2+batch3+factor1+factor2?

removeBatchEffect only goes up to "batch2".

ADD REPLY

I have edited my answer above to address this modified question.

ADD REPLY

Thank you so much! Sorry I didn't notice that there are some updates here. 

ADD REPLY
