removeBatchEffect options: design and covariates
Rao,Xiayu ▴ 540
@raoxiayu-6003
United States

Hello,

I want to use removeBatchEffect() on the expression data (an EList) before drawing a heatmap of the significantly differentially expressed genes. Those genes were identified by limma linear modelling, with the batch factor already included in the linear model.

I have seen people use both removeBatchEffect(y, batch=batch) and removeBatchEffect(y, batch=batch, design=design). Under what conditions should I include the design matrix, and when should I also include covariates? Any comments would be much appreciated. Thank you in advance!

removeBatchEffect(x, batch=NULL, covariates=NULL,
design=matrix(1,ncol(x),1), ...)


Thanks,
Xiayu

limma removebatcheffect • 5.5k views
@ryan-c-thompson-5618
Scripps Research, La Jolla, CA

Hello,

When calling removeBatchEffect, you should use the same design that you used for limma, but with the batch effect term removed from the design. Then you would pass the batch effect factor as the batch argument instead. So, if the design matrix that you used for limma was constructed as:

model.matrix(~Condition + Batch),

then for removeBatchEffect, you would use design=model.matrix(~Condition), and batch=Batch. In other words, you take the batch effect out of your model design and pass it as the batch argument instead.
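As a rough illustration of the advice above, here is a minimal sketch on simulated data. The object names (y, design_full, design_keep, y_clean) and the simulated values are assumptions made up for this example, not taken from the original question; only lmFit, removeBatchEffect and model.matrix are real functions.

```r
library(limma)

set.seed(1)

# Simulated data for illustration only: 100 genes x 8 samples
Condition <- factor(rep(c("ctrl", "trt"), each = 4))
Batch     <- factor(rep(c("A", "B"), times = 4))
y <- matrix(rnorm(100 * 8), nrow = 100,
            dimnames = list(paste0("gene", 1:100), paste0("s", 1:8)))
y[, Batch == "B"] <- y[, Batch == "B"] + 2   # additive shift in batch B

# Design used for the differential expression analysis (batch included)
design_full <- model.matrix(~ Condition + Batch)
fit_full <- lmFit(y, design_full)

# For removeBatchEffect: the same design with the batch term taken out,
# and the batch factor passed separately through the batch argument
design_keep <- model.matrix(~ Condition)
y_clean <- removeBatchEffect(y, batch = Batch, design = design_keep)

# Refit on the corrected data to inspect what changed:
# the batch coefficient should be ~0 and the condition effect preserved
fit_clean <- lmFit(y_clean, design_full)
```

The corrected matrix y_clean is meant for visualisation such as heatmaps or MDS plots. The differential expression test itself should still be run on the uncorrected data with the batch term in the design.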

-Ryan


Hi, Ryan

That's very clear. Thank you very much for your instructions. When should I include covariates, and what kind of covariates should I adjust for? Covariates in the model should already be in the design matrix, right?

Thanks,
Xiayu


Hi Xiayu,

The covariates argument is just a more general way to specify batch effects to be corrected for. If you had more than two batch factors, or if you had some continuous numerical covariates to correct for, you would manually construct your own "batch design matrix" and pass it as the covariates argument. If all you have is one or two batch factors, then all you need to do is pass them for batch and batch2, and you don't need to worry about the covariates argument.

-Ryan
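A small sketch of how batch, batch2 and covariates might be combined, again on simulated data. The variables Lane and rin are hypothetical nuisance variables invented for this example; they do not come from the original question.

```r
library(limma)

set.seed(2)

# Simulated data for illustration only: 100 genes x 12 samples
Condition <- factor(rep(c("ctrl", "trt"), each = 6))
Batch     <- factor(rep(c("A", "B", "C"), times = 4))  # first batch factor
Lane      <- factor(rep(c("L1", "L2"), times = 6))     # second batch factor (hypothetical)
rin       <- runif(12, min = 6, max = 10)              # continuous covariate (hypothetical)
y <- matrix(rnorm(100 * 12), nrow = 100)

# Effects to keep: condition only
design_keep <- model.matrix(~ Condition)

# Two batch factors go to batch and batch2;
# the numeric nuisance variable goes to covariates
y_clean <- removeBatchEffect(y, batch = Batch, batch2 = Lane,
                             covariates = rin, design = design_keep)

# Refit with every term included: the nuisance coefficients should be ~0
design_all <- model.matrix(~ Condition + Batch + Lane + rin)
fit_clean  <- lmFit(y_clean, design_all)
nuisance   <- fit_clean$coefficients[, c("BatchB", "BatchC", "LaneL2", "rin")]
```

With a third batch factor, one option would be to build the extra columns with model.matrix and bind them to the numeric covariates before passing the result as covariates.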


Hi, Ryan

Thank you for your input! One more quick follow-up question. Considering your example of specifying design=model.matrix(~Condition) and batch=Batch, what if I also have a random effect in my limma model? Do I need to put that variable (subject, as below) anywhere in the removeBatchEffect command, or should I just ignore it?

design <- model.matrix(~Condition + Batch)
duplicateCorrelation(y,design,block=targets$subject)

Thanks,
Xiayu

Well, I'm not as familiar with random effects analysis, but the normal way to use duplicateCorrelation is to pass corfit$consensus.correlation as the correlation argument to lmFit, along with the same block argument. The help page for removeBatchEffect states that any additional arguments are passed to lmFit, so I think I would simply do likewise and pass the same block and correlation arguments to removeBatchEffect.
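A hedged sketch of that suggestion on simulated data. It assumes that the extra arguments are forwarded to lmFit as the help page describes, and that block has to be supplied together with correlation for the correlation to be used. All object names and simulated values are made up for illustration.

```r
library(limma)

set.seed(3)

# Simulated data for illustration only: 200 genes, 6 subjects x 2 samples each
subject   <- factor(rep(1:6, each = 2))
Condition <- factor(rep(c("ctrl", "trt"), times = 6))
Batch     <- factor(rep(c("A", "B"), each = 2, times = 3))

# Add a per-subject random shift so samples from one subject are correlated
subject_effect <- matrix(rnorm(200 * 6), nrow = 200)[, as.integer(subject)]
y <- matrix(rnorm(200 * 12), nrow = 200) + subject_effect

# Full design used for the limma analysis, with subject as a blocking factor
design_full <- model.matrix(~ Condition + Batch)
corfit <- duplicateCorrelation(y, design_full, block = subject)
cc <- corfit$consensus.correlation

# Same block and correlation passed through to the internal lmFit call
y_clean <- removeBatchEffect(y, batch = Batch,
                             design = model.matrix(~ Condition),
                             block = subject, correlation = cc)

# Refit with the same correlation structure: batch coefficient should be ~0
fit_clean <- lmFit(y_clean, design_full, block = subject, correlation = cc)
```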


Thanks a lot! I think you are right, and I will do that. But what if Batch itself was treated as a random effect in the model? Can I still pass it as batch=Batch in the removeBatchEffect command? Sorry, I promise this is the last question.

Thank you so much!

Xiayu


I don't think you can do that. removeBatchEffect works by treating the batches as fixed effects, so I don't think it's valid to specify a random effect there.


Hello,

Would that also be the way to proceed in case the batch effect was estimated via sva?

The reason for my doubt is that sva receives the design matrix as an input (mod1 parameter). Since the batch effect was estimated while taking the design into account, I am not sure it is necessary to "take it into account" again.