Hi,
I am trying to use fastMNN to integrate snRNA-seq data from tissue samples taken at progressive stages of development. Some of the samples are biological replicates. Two samples are run together in each sequencing run, but they are not necessarily from the same time point or the same individual. While reading the mnnCorrect paper, I noticed that the authors state that the datasets to be integrated must have a shared population. Does that mean that some cell population must be present in all of the datasets? For example, cell population X should be present in sample A (t = day 1), sample B (t = day 2), sample C (t = day 3) and sample D (t = day 4).
Or does it mean that any two datasets being merged must share some cell type? For example, samples A and B share cell type X, samples B and C share cell type Y, and samples C and D share cell type Z.
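To make the two scenarios above concrete, here is a minimal Python sketch of how I am thinking about them. It is only set logic on made-up labels: the sample names, cell-type labels and both helper functions are hypothetical, and it says nothing about which condition fastMNN actually needs.

```python
from typing import Dict, List, Set

# Hypothetical cell-type content of each sample (illustrative labels only).
samples: Dict[str, Set[str]] = {
    "A": {"X", "P"},
    "B": {"X", "Y"},
    "C": {"Y", "Z"},
    "D": {"Z", "Q"},
}


def shared_by_all(samples: Dict[str, Set[str]]) -> Set[str]:
    """First scenario: cell types present in every sample."""
    return set.intersection(*samples.values())


def chain_is_linked(samples: Dict[str, Set[str]], order: List[str]) -> bool:
    """Second scenario: each consecutive pair in `order` shares a cell type."""
    return all(samples[a] & samples[b] for a, b in zip(order, order[1:]))


order = ["A", "B", "C", "D"]
print("Shared by all samples:", shared_by_all(samples))
print("Consecutive pairs linked:", chain_is_linked(samples, order))
```

In this toy layout no cell type is common to all four samples, yet every consecutive pair in the order A, B, C, D has one in common, which is exactly the case I am unsure about.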
If the first scenario is required, does that mean we can't use fastMNN to integrate tissues unless we spike in a well-defined cell population at the library prep step? Since that cell population would be artificial, we would then have to somehow discard it after integrating the datasets and before performing DEG analysis.
I hope the author of the method can respond to this.
Thanks!
Thanks Aaron! This answers my concern. I am using the Seurat wrapper for fastMNN. With that wrapper, am I right to assume that the order in which I create the merged Seurat object is the default merge order of the datasets?
I wouldn't know. That's a question for the maintainer of the wrapper function.