Question: How to calculate average log2 CPM with batch effect?
gravatar for b.nota
9 months ago by
b.nota320 wrote:

I have used limma trend to find DE genes between 3 groups (including batch effect in design). Now I would like to visualize the genes, and used removeBatchEffect() to get corrected logCPM values for each sample.

logCPMc <- removeBatchEffect(logCPM, batch=targets$Batch, design=design_4_batch)

Now I want to use the average of each group, and with aveLogCPM() from edgeR, I don't see a way to include this batch effect. What would be the correct way to get these average log2 CPM values with batch effect removed? Averaging the logCPMc (batch corrected) values would not be correct, right?


limma edger cpm • 194 views
ADD COMMENTlink modified 9 months ago by Aaron Lun24k • written 9 months ago by b.nota320
Answer: How to calculate average log2 CPM with batch effect?
gravatar for Aaron Lun
9 months ago by
Aaron Lun24k
Cambridge, United Kingdom
Aaron Lun24k wrote:

Simply use glmFit and take the coefficients:

groups <- gl(4, 2)
batches <- rep(LETTERS[1:2], 4)
design <- model.matrix(~0 + groups + batches)

y <- matrix(rpois(8000, lambda=10), ncol=8) # making up data
y <- DGEList(y)
y <- calcNormFactors(y)
y <- estimateDisp(y, design)

fit <- glmFit(y, design, prior.count=2) # mimic aveLogCPM() prior.count
averages <- fit$coefficients[,1:4]

Here, the first four coefficients represent the log-scaled average expression for each group. Some work is required to convert them into a similar scale as the values reported by aveLogCPM:

averages <- averages + log(1e6) # get per-million
averages <- averages/log(2) # get base 2

... and there you have it.

ADD COMMENTlink written 9 months ago by Aaron Lun24k

That's a great solution! Thanks for explaining it so clearly.

ADD REPLYlink written 9 months ago by b.nota320
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 218 users visited in the last hour