Question: Using SC3 with batch corrected MNN values
4 months ago, hamza_karakurt wrote:

Hello, I want to use SC3 on data sets from multiple batches. I use the fastMNN() function from the Scran/Scater packages for batch correction, but it does not affect logcounts; instead it creates a reduced dimension "MNN" that holds the corrected data, which is also used in the clustering step. How can I use SC3 with these values? Can I create a new SingleCellExperiment from the MNN matrix and run SC3 on that matrix? The MNN matrix includes negative values, so I know I should not set the gene_filter parameter to TRUE.

Thank you in advance.

Modified 4 months ago by Vladimir Kiselev • written 4 months ago by hamza_karakurt
Answer: Using SC3 with batch corrected MNN values
4 months ago, Vladimir Kiselev (Sanger Institute, Cambridge, UK) wrote:

Hi, you can always copy your corrected matrix to logcounts, so there is no need to create a new object. Or, if you want to keep the original logcounts, then yes, it would be a good idea to create a separate object.

However, I think SC3 won't work well with negative values (like the majority of other scRNA-seq methods), so I cannot guarantee a good result.

Written 4 months ago by Vladimir Kiselev

Thank you for the answer. As you said, negative values affect the results, as expected. To test this, I ran the sc3_estimate_k function on both the full data set and a reduced-dimension representation (PCA with the first 50 PCs in this case); the estimated k was 27 for the full data and 5 for the reduced dimensions. This is probably not a good way to do it. Since data sets from different batches are really common, what is the optimal way to use SC3 on this kind of data? I looked for a method that corrects all logcounts but could not find one.

Reply written 4 months ago by hamza_karakurt

There are lots of batch correction methods at the moment, but not all of them correct the expression matrix. For those that don't, you could use other clustering methods, such as Louvain clustering on a kNN graph (the default in the scanpy package). Here we cover some of the batch correction methods: R - python -

Reply written 4 months ago by Vladimir Kiselev

Thank you for the answer. I am actually planning to use MNN correction, as it is more suitable for my situation and for the further analyses I am planning. MNN can create a corrected expression matrix, but it also has negative values (due to cosine normalization, I believe). I took the risk and ran SC3 on this corrected matrix, but I got NAs in the clustering results.

Reply written 4 months ago by hamza_karakurt

I'll chip in here and mention that a batch correction method will only be able to preserve zeroes if it is aware that the data are derived from counts. This is not the case for the vast majority of methods, which operate on transformed expression values where the count-based nature of the data is lost. And for good reason: the theory for count-based models is difficult. (See batchelor::rescaleBatches() for a limited exception.) Indeed, there is no philosophical reason that log-expression values should be non-negative. The fact that they often are is simply a matter of practical convenience to avoid loss of sparsity.
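
To make the point about zeroes concrete, here is a minimal sketch in plain Python. The counts are made up for a single gene in two batches, and both "corrections" are deliberately simplistic stand-ins (a mean shift on the log scale versus a rescaling on the count scale); neither is meant to reproduce what fastMNN() or rescaleBatches() actually do.

```python
import math

def log_expr(count, pseudo=1.0):
    """log2-transform a count after adding a pseudo-count."""
    return math.log2(count + pseudo)

def mean(values):
    return sum(values) / len(values)

# Hypothetical counts for one gene in two batches (illustration only).
batch_a = [0, 0, 3, 7]
batch_b = [0, 2, 10, 20]

log_a = [log_expr(c) for c in batch_a]
log_b = [log_expr(c) for c in batch_b]

# (1) Correction on the log scale: shift batch B so that its mean
#     log-expression matches batch A. The method never sees the counts.
shift = mean(log_b) - mean(log_a)
shifted_b = [x - shift for x in log_b]

# (2) Correction on the count scale: scale the counts of batch B so that
#     its mean count matches batch A, then log-transform again.
scale = mean(batch_a) / mean(batch_b)
rescaled_b = [log_expr(c * scale) for c in batch_b]

print("log-scale shift :", [round(x, 3) for x in shifted_b])
print("count rescaling :", [round(x, 3) for x in rescaled_b])
```

In this toy case the log-scale shift pushes the zero count in batch B below zero, whereas rescaling on the count scale leaves it at exactly zero, because zero times any factor is still zero before the pseudo-count is added.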

Now, I can't remember exactly what special stuff SC3 does, but if you just want to do no-frills k-means clustering, you can apply kmeans on the low-dimensional MNN corrected values. Any feature selection should have been done before MNN correction anyway.
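
As a rough sketch of the no-frills route, the following plain-Python k-means (Lloyd's algorithm) runs on a small made-up table of low-dimensional corrected coordinates. The `corrected` values, the helper name `kmeans` and its parameters are all assumptions for illustration; in practice the rows would be the cells of the MNN reduced dimension, and an existing k-means implementation would be the better choice.

```python
import math
import random

def kmeans(points, k, n_iter=100, seed=0):
    """Basic Lloyd's algorithm; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct cells as starting centers
    labels = None
    for _ in range(n_iter):
        # Assign every cell to its nearest center (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Move each center to the mean of its assigned cells.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers

# Hypothetical corrected coordinates (cells x 2 dimensions).
# Negative values are fine here: k-means only uses distances.
corrected = [
    (-2.0, -1.0), (-2.2, -0.8), (-1.8, -1.1),
    (3.0, 2.0), (3.1, 2.2), (2.9, 1.8),
]

labels, centers = kmeans(corrected, k=2)
print(labels)
```

Because k-means depends only on distances between cells, the sign of the corrected values does not matter, which is why this route avoids the problems seen when feeding negative values to SC3.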

Reply written 4 months ago by Aaron Lun