Question

How to apply limma-trend or voom to batch corrected single cell data? (non-integer and negative values included)

hainct • 0

@8c4ff108

South Korea

We applied batch correction to mouse data using MNNCorrect and scMerge. The corrected matrix contains non-integer and negative values. Given that, how can we apply limma-trend or voom to the batch-corrected single-cell data?

MNNCorrect scMerge limma-trend

updated 3.0 years ago by Gordon Smyth 50k • written 3.0 years ago by hainct • 0

score 3 · Accepted Answer · 2021-04-19

You should really tag your question with the name of the relevant package rather than the function if you want the package maintainers to comment here.

It's not a good idea to apply limma - or really, any DE analysis machinery - to corrected values produced by the non-linear batch correction algorithms typical of scRNA-seq data analysis. You can read my comments about it here; to summarize, the correction has no obligation to preserve the magnitude of differences within batches, and so it's anyone's guess what will crawl out the other side. The closest thing you'll get to a guarantee of behavior is that corresponding populations across two batches will be merged - the algorithm is free to do whatever it wants to the per-gene expression values to get to that point. And in the specific case of MNN correction, the cosine normalization means that the values are not on the log-scale, so none of the reported log-fold changes from limma will be real log-FCs.
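To make the point about cosine normalization concrete, here is a minimal toy sketch. It is not the MNN implementation: the values are invented and `cosine_normalize` is a hypothetical helper that simply scales each cell's log-expression vector to unit length. It only illustrates why differences between cosine-normalized values can no longer be read as log-fold changes.

```python
import math

def cosine_normalize(values):
    """Scale one cell's log-expression vector to unit L2 norm (hypothetical helper)."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]

# Invented log2-expression values for two cells over three genes.
cell_a = [2.0, 4.0, 4.0]   # L2 norm is 6
cell_b = [4.0, 8.0, 8.0]   # L2 norm is 12

# On the log scale, the difference for the first gene is a log2 fold change.
true_lfc = cell_b[0] - cell_a[0]

# After cosine normalization each cell is divided by its own norm,
# so the same difference is rescaled cell by cell.
norm_a = cosine_normalize(cell_a)
norm_b = cosine_normalize(cell_b)
apparent_lfc = norm_b[0] - norm_a[0]

print("difference on the log scale:", true_lfc)
print("difference after cosine normalization:", apparent_lfc)
```

In this deliberately extreme example the two cells have proportional profiles, so a two-unit difference on the log2 scale disappears entirely after normalization. Real data will be less clean, but the rescaling is the same in kind.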

Why not include the batch as a blocking factor in a standard DE analysis on the uncorrected values? You would be on much safer ground if you did so. The corrected values are only used for cell-based analyses, not gene-based ones.
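As a rough illustration of what blocking on batch does, the sketch below fits an ordinary least-squares model for a single gene with an intercept, a batch term and a group term. Everything in it is an assumption made for illustration: the data are invented, `solve` and `fit_ols` are hypothetical helpers, and there is no empirical Bayes moderation. In limma the same idea would roughly correspond to a design such as `model.matrix(~ batch + group)` passed to `lmFit` on the uncorrected log-expression values.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(design, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

# Invented cells: (batch, group). The design is unbalanced across batches.
cells = [(0, 0), (0, 0), (0, 0), (0, 1), (1, 0), (1, 1), (1, 1), (1, 1)]

# Invented uncorrected log-expression for one gene:
# baseline 5, batch effect 3, true group effect 1.
y = [5.0 + 3.0 * batch + 1.0 * group for batch, group in cells]

# Naive comparison that ignores batch.
g1 = [yi for (_, g), yi in zip(cells, y) if g == 1]
g0 = [yi for (_, g), yi in zip(cells, y) if g == 0]
naive_effect = sum(g1) / len(g1) - sum(g0) / len(g0)

# Model with batch as a blocking factor: columns are intercept, batch, group.
design = [[1.0, float(batch), float(group)] for batch, group in cells]
intercept, batch_effect, group_effect = fit_ols(design, y)

print("naive group difference:", naive_effect)
print("group effect with batch in the model:", group_effect)
```

Because the groups are unevenly spread over the batches, the naive difference absorbs part of the batch effect, while the model with the batch term recovers the group effect that was built into the toy data.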

score 2 · Accepted Answer · 2021-04-17

Gordon Smyth 50k

@gordon-smyth

WEHI, Melbourne, Australia

If you have applied mnnCorrect, then the output is already on the log-expression scale so you would directly enter standard limma pipelines as for microarray data. The log-expression values should in principle already be ready for limma without further preprocessing. Certainly you should not use functions like voom that are designed for counts.

Note that I am simply choosing here between the two options you give. Generally we recommend that batch correction be done as part of the linear model rather than a pre-processing step.