Hi all,
I've used the removeBatchEffect function from the limma package for my previous datasets, but it appears to be limited to two batches only.
I'm now working with a dataset that contains more than two batches. How can I handle this?
Many thanks,
Yi
removeBatchEffect() is not limited to two batches. It works just the same with any number of batches. For example, if you have three batches (A, B, C) you might use:
batch <- c("A","A","A","B","B","B","C","C","C")
removeBatchEffect(y, batch)
or
removeBatchEffect(y, batch, design=design)
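For anyone who wants to try this, here is a minimal sketch on simulated data. The matrix y, the per-batch shifts and the helper batch_means are made up for illustration, and the limma call is guarded so the script still runs if limma is not installed.

```r
# Simulated data (hypothetical): 5 genes x 9 samples in three batches.
set.seed(1)
batch <- factor(c("A","A","A","B","B","B","C","C","C"))
y <- matrix(rnorm(5 * 9), nrow = 5,
            dimnames = list(paste0("gene", 1:5), paste0("s", 1:9)))

# Add an artificial shift to every sample of a batch so there is something to remove.
shift <- c(A = 0, B = 2, C = -1)
y <- sweep(y, 2, shift[as.character(batch)], "+")

# Helper (hypothetical name): mean of each gene within each batch.
batch_means <- function(x) t(apply(x, 1, function(g) tapply(g, batch, mean)))

# Before correction the three batch means differ for each gene.
print(round(batch_means(y), 2))

if (requireNamespace("limma", quietly = TRUE)) {
  # A single batch factor with three levels; no batch2 or covariates needed.
  y_corrected <- limma::removeBatchEffect(y, batch = batch)
  # After correction the batch means of each gene should agree.
  print(round(batch_means(y_corrected), 2))
}
```

With no design supplied, the only thing left after correction is a common per-gene level plus residuals, which is why the batch means line up. In a real analysis you would normally pass design as well, so that treatment differences are protected.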
Strangely, the same question has been asked before: Batch effect removal
Edit 24 hours later:
If your need is actually to handle more than 2 batch factors, rather than just more than 2 batches, then (as suggested by Steve) this can be achieved using the covariates argument to removeBatchEffect. Suppose you have three batch factors:
contrasts(batch1) <- contr.sum(levels(batch1))
contrasts(batch2) <- contr.sum(levels(batch2))
contrasts(batch3) <- contr.sum(levels(batch3))
covariates <- model.matrix(~batch1+batch2+batch3)
covariates <- covariates[,-1]
Then you can correct by
removeBatchEffect(y, covariates=covariates, design=design)
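Here is a worked sketch of the three-factor case on a made-up layout. The sample layout, the factor levels and the simulated y are all assumptions for illustration; only contr.sum, model.matrix and removeBatchEffect come from the answer above. The limma call is guarded so the rest runs with base R alone.

```r
# Hypothetical layout: 12 samples, one treatment factor and three batch factors.
group  <- factor(rep(c("ctrl", "trt"), times = 6))
batch1 <- factor(rep(c("a", "b", "c"), each = 4))
batch2 <- factor(rep(rep(c("x", "y"), each = 2), times = 3))
batch3 <- factor(c("p","q","q","p", "q","p","p","q", "p","p","q","q"))

# Sum-to-zero coding for a 3-level factor: the first two levels get an
# indicator column each, and the last level is coded -1 in every column.
print(contr.sum(3))

contrasts(batch1) <- contr.sum(levels(batch1))
contrasts(batch2) <- contr.sum(levels(batch2))
contrasts(batch3) <- contr.sum(levels(batch3))

# Batch covariates without the intercept column, as in the answer above.
covariates <- model.matrix(~batch1 + batch2 + batch3)[, -1]

# Design for the effects to keep (here just the treatment group).
design <- model.matrix(~group)

# The batch columns must not be confounded with the design.
full <- cbind(design, covariates)
stopifnot(qr(full)$rank == ncol(full))

# Simulated expression matrix (hypothetical): 4 genes x 12 samples.
set.seed(2)
y <- matrix(rnorm(4 * 12), nrow = 4)

if (requireNamespace("limma", quietly = TRUE)) {
  y_corrected <- limma::removeBatchEffect(y, covariates = covariates, design = design)
}
```

The rank check is worth keeping in real use: if a batch factor is completely confounded with the design or with another batch factor, its effect cannot be estimated separately.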
My 2 cents:
The (common) confusion that you're pointing out is likely due to the easy-to-understand description of the batch and batch2 parameters, and the (somewhat) cryptic description of the covariates parameter.
Presumably you mean that one could control for an arbitrary number of batches by creating a covariates design matrix that encodes these batches, but the average lay person will have no intuition on how to do that.
Furthermore, there are no examples in ?removeBatchEffect that can help shed light on the situation other than the example using the single batch parameter.
No, that isn't what I mean at all. I think you are confusing batch factors with batches, and perhaps that is the OP's misunderstanding as well.
Through its batch and batch2 arguments, removeBatchEffect() can handle only two batch factors, but each factor can have an arbitrary number of levels, just like any factor in any R function. Each level corresponds to a batch. So removeBatchEffect() naturally handles an arbitrary number of batches even without the batch2 or covariates arguments.
The batch2 and covariates arguments are for more complex situations where there is an additive structure of batch effects from multiple sources.
There we have it: dollars to donuts it's a terminology thing, then, as you point out.
My bet is that the OP is asking about how to control for > 2 batch factors (since that's how I interpreted the question ;-), since it's pretty straightforward to see (and try to test) how the batch and batch2 parameters can have > 2 unique categorical values (levels) ... but let's see.
I bet they only have one factor. (Later: but I was wrong.)
In service to my fellow mere mortals who will be trying to grok what contr.sum is doing in Gordon's updated answer, you can start by reading through this tutorial on contrast coding schemes for categorical variables.
Hi Gordon and Steve,
Thanks for your replies. Sorry for the poor description of my question.
What I mean is that there are more than two batch "factors".
For example, how can I handle the batch "factors" when the design is ~ batch1+batch2+batch3+factor1+factor2?
removeBatchEffect only goes up to "batch2".
I have edited my answer above to address this modified question.
Thank you so much! Sorry, I didn't notice that there were some updates here.