Question

Subclustering after MNN batch-correction

heir_of_isildur88 • 0

Hi, I was wondering what the best approach would be for clustering a subset of cells pulled out from an MNN-batch-corrected object. I used fastMNN from SeuratWrappers to perform MNN batch correction and an integrated analysis. I then subsetted a cluster of cells and wanted to re-cluster it. Should I perform another round of MNN batch correction on the subsetted object, or proceed with the usual analytical workflow (e.g. Seurat, scran)? Thank you.

batchelor mnn scrna-seq • 1.7k views

updated 24 months ago by Guandong Shang • written 4.5 years ago by heir_of_isildur88

score 2 · Answer 1 · 2020-01-12

tl;dr I don't think it's necessary but there probably isn't any harm in doing so either.

By default, I would just keep things simple and go ahead with the rest of the analysis without another round of correction. That saves you time on both writing code and computation.

I would only do it again if there was still some batch effect in the subset. This is possible because fastMNN() uses information from all cells to help remove the batch effect; if the "direction" of the subset's batch effect is different from that of the other cells, the removal will be incomplete. Re-correcting might get you a better-looking merge, i.e., improved mixing between batches, assuming that the cells are genuinely of the same type.
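One way to check for a residual batch effect is to cluster the subset on the existing corrected coordinates and tabulate clusters against batch. The following is only a minimal sketch under stated assumptions: the data are simulated toy matrices, `keep` is a hypothetical index vector standing in for the cells of the cluster of interest, and the walktrap clustering is just one possible choice. It uses batchelor::fastMNN() directly rather than the SeuratWrappers interface mentioned in the question.

```r
library(batchelor)  # fastMNN()
library(scran)      # buildSNNGraph()

set.seed(100)

# Simulated log-expression matrices (genes x cells), one per batch.
# These are toy data, not real scRNA-seq values.
B1 <- matrix(rnorm(10000), ncol = 50)
B2 <- matrix(rnorm(10000, mean = 1), ncol = 50)

# First round of correction, using all cells.
mnn.out <- fastMNN(B1, B2, d = 10)

# Hypothetical subset: 30 cells from each batch, standing in for
# the cells of the cluster that is being pulled out.
keep <- c(1:30, 51:80)
sub <- mnn.out[, keep]

# Re-cluster the subset on the existing corrected coordinates,
# without another round of correction.
g <- buildSNNGraph(sub, use.dimred = "corrected", k = 10)
clusters <- igraph::cluster_walktrap(g)$membership

# Clusters made up almost entirely of one batch would suggest
# that some batch effect remains within the subset.
tab <- table(Cluster = clusters, Batch = sub$batch)
tab
```

If every cluster contains a reasonable share of cells from each batch, the subset can probably be analysed as it is.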

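If you must correct again for your subset of cells, make sure you do it from the original log-count values, not from the already-corrected values that are returned by fastMNN(). The latter have already been altered by the first round of correction and are no longer on the original log-expression scale, so they are not a suitable input for a second round.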
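As a rough illustration of re-correcting from the original values, the sketch below subsets the original log-expression matrices and passes those to fastMNN() again. The matrices are simulated, and `keep1` and `keep2` are hypothetical per-batch indices for the subset of interest; with real data they would come from the cluster labels.

```r
library(batchelor)  # fastMNN()

set.seed(100)

# Simulated log-expression matrices (genes x cells), one per batch.
B1 <- matrix(rnorm(10000), ncol = 50)
B2 <- matrix(rnorm(10000, mean = 1), ncol = 50)

# First round of correction on all cells; kept only for comparison.
mnn.all <- fastMNN(B1, B2, d = 10)

# Hypothetical per-batch indices of the cells in the subset.
keep1 <- 1:30
keep2 <- 1:30

# Second round: start again from the ORIGINAL log-expression values,
# not from the "reconstructed" assay or "corrected" reducedDim of mnn.all.
mnn.sub <- fastMNN(B1[, keep1], B2[, keep2], d = 10)

# Corrected low-dimensional coordinates for the subset (cells x dimensions).
sub.coords <- SingleCellExperiment::reducedDim(mnn.sub, "corrected")
dim(sub.coords)
```

The new coordinates in `sub.coords` can then be used for re-clustering in place of the ones from the first round.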

Also consider the discussion here: https://github.com/MarioniLab/FurtherMNN2018/issues/6. This is not particularly relevant if your subset contains a group of similar cells, but if you are subsetting by other data-independent factors (e.g., experimental condition) it may be important to keep in mind.