Question: Does order of removeBatchEffect() and DE subsetting matter for clustered heat map visualization?
Ekarl2 (Sweden) wrote 3.7 years ago:

I want to make a clustered heat map of the differentially expressed (DE) genes I identified with edgeR, where the batch effect was modeled as an experimental factor. To do this, I ran removeBatchEffect() from the limma package on the DE subset of the raw expression matrix, after TMM normalization and log2 CPM transformation, and it worked.

Is there any concrete practical benefit, in terms of robustness and accuracy, to running removeBatchEffect() on the entire dataset first (after some independent filtering) and then extracting the rows corresponding to the DE genes for visualization?

In other words, is it better to take out differentially expressed genes from the raw matrix first, and then run the batch correction described above or first run the batch correction procedure on the full raw matrix and only then take out the genes you want to visualize? My DE set is on the order of a few thousand de novo contigs. How much does the size of this DE set matter for the answer?

Answer: Does order of removeBatchEffect() and DE subsetting matter for clustered heat ma
Aaron Lun (Cambridge, United Kingdom) wrote 3.7 years ago:

It doesn't matter. The batch correction is done separately for each gene, so whether the matrix is subsetted beforehand is irrelevant to the function; it will give the same result for each gene either way:

library(limma)

# Simulate 10 genes x 6 samples, with a shift of 10 in batch A.
y <- matrix(rnorm(10*6),10,6)
colnames(y) <- c("A1","A2","A3","B1","B2","B3")
y[,1:3] <- y[,1:3] + 10

# All together:
batch <- c("A","A","A","B","B","B")
out <- removeBatchEffect(y, batch=batch)

# One at a time:
out2 <- removeBatchEffect(y[1,,drop=FALSE], batch=batch)
all.equal(out[1,,drop=FALSE], out2) # should be TRUE

If you subset first, it'll be faster, but that shouldn't be a major issue unless you've got lots of features.
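
To illustrate why the order makes no difference for a larger DE set, here is a minimal base R sketch. It does not call limma: remove_batch_sketch() is a hypothetical stand-in that fits one least-squares model per gene with sum-to-zero batch contrasts and subtracts the fitted batch term, which is assumed here to mirror the per-gene nature of the correction. The simulated matrix, the contig names and the randomly chosen "DE" rows are all made up for illustration.

```r
# Hypothetical helper (an assumption, not limma's code): per-gene batch
# correction via one linear model fit per row of x.
remove_batch_sketch <- function(x, batch) {
  batch <- factor(batch)
  contrasts(batch) <- contr.sum(nlevels(batch))
  X <- model.matrix(~batch)              # intercept + sum-to-zero batch columns
  fit <- lm.fit(X, t(x))                 # each gene is a separate response column
  beta <- matrix(fit$coefficients, nrow = ncol(X))[-1, , drop = FALSE]
  x - t(X[, -1, drop = FALSE] %*% beta)  # subtract only the fitted batch term
}

set.seed(1)
# Simulated log-expression values: 100 contigs x 6 samples, batch A shifted by 10.
y <- matrix(rnorm(100 * 6), 100, 6,
            dimnames = list(paste0("contig", 1:100),
                            c("A1", "A2", "A3", "B1", "B2", "B3")))
y[, 1:3] <- y[, 1:3] + 10
batch <- c("A", "A", "A", "B", "B", "B")

# Made-up stand-in for the DE set.
de <- sample(rownames(y), 20)

correct_then_subset <- remove_batch_sketch(y, batch)[de, ]
subset_then_correct <- remove_batch_sketch(y[de, ], batch)

# Compare the two orders of operation.
same <- isTRUE(all.equal(correct_then_subset, subset_then_correct))
print(same)
```

Because each row is fitted independently, the size of the DE set only affects run time, not the corrected values. Either matrix could then be passed to a clustering heat map function such as heatmap().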
