Dear all, dear Laleh and Aaron,
I have a "best practice" question regarding the mnnCorrect function that you provide in scran. My setup is the following: We sampled single cells from three different time points across two sequencing platforms (3'-end and full-length cDNA). Unfortunately, the sequencing depth is orders of magnitude lower for the 3'-end sequencing, and one time point is missing from the 3'-end data.
All in all, when I run mnnCorrect on both (full) expression matrices (full-length with three time points, 3'-end with two time points), the PCA/t-SNE looks quite OK, but still shows a batch effect that becomes even more pronounced with other downstream dimensionality reduction methods (such as SOMs).
My question is: If I have known subpopulations that are matched in principle (e.g. by capture time point), would you recommend running mnnCorrect for each time point separately? Or would the returned expression values then no longer be comparable (different scales?)?
Or is the remaining batch effect a combined result of the mismatched sequencing depth and the missing subpopulations? That would lead to the question of how to evaluate the mnnCorrect output beyond PCA and t-SNE.
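To make that last question concrete, the kind of quantitative check I have in mind is a nearest-neighbour mixing score: for every cell, the fraction of its k nearest neighbours that come from the other platform. This is only a minimal sketch in Python/NumPy and not something scran provides; the function name `knn_batch_mixing`, the choice of k, and the simulated data are my own assumptions for illustration.

```python
import numpy as np

def knn_batch_mixing(X, batch, k=10):
    """Per-cell fraction of the k nearest neighbours that belong to another batch.

    X     : (cells x features) matrix, e.g. corrected expression or PCs.
    batch : length-cells array of batch labels (here: sequencing platform).
    k     : number of neighbours; must be smaller than the number of cells.
    """
    X = np.asarray(X, dtype=float)
    batch = np.asarray(batch)
    # Brute-force squared Euclidean distances between all pairs of cells.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a cell is not its own neighbour
    # Indices of the k nearest cells per row (unordered within the k).
    nn = np.argpartition(d2, k, axis=1)[:, :k]
    return (batch[nn] != batch[:, None]).mean(axis=1)

if __name__ == "__main__":
    # Simulated stand-in data: two batches of equal size, with and without a shift.
    rng = np.random.default_rng(0)
    a = rng.normal(size=(100, 5))
    b = rng.normal(size=(100, 5))
    labels = np.array([0] * 100 + [1] * 100)
    print("shifted:", knn_batch_mixing(np.vstack([a, b + 50]), labels).mean())
    print("aligned:", knn_batch_mixing(np.vstack([a, b]), labels).mean())
```

With two equally sized, well-mixed batches the mean score should be close to 0.5, and close to 0 if the batches stay separated. Computing it within each time point present on both platforms might also show whether the missing time point or the depth difference is driving the residual effect.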
Best and thanks a lot for the great article on bioRxiv!