Question

Change the number of clusters with scater

0

shbrief ▴ 20

@shbrief-15679

Last seen 6.4 years ago

USA

Can I pre-set the number of clusters with scater? I'm comparing scRNA-seq analysis tools, and the same dataset that Seurat splits into 8 clusters is split into 20 by scater. For my purposes, I'd prefer to have fewer than 20 clusters, and I can't figure out how to do that with scater. Thanks!

scater scran scrnaseq clustering • 2.6k views

ADD COMMENT • link updated 7.1 years ago by Aaron Lun ★ 29k • written 7.1 years ago by shbrief ▴ 20

score 1 · Answer 1 · 2018-11-20

1

Aaron Lun ★ 29k

@alun

Last seen 3 hours ago

The city by the bay

scater doesn't have any clustering functions, as far as I remember. Perhaps you're referring to scran's buildSNNGraph? There are a bunch of parameters that you can fiddle with, but the most direct approach is to increase k, the number of nearest neighbours used to build the graph. A larger k gives a more interconnected graph and thus fewer, broader clusters. This follows the opposite reasoning described in the comments section of the workflow.
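As a rough sketch of the effect of k (not code from this thread), the snippet below builds the graph at a few values of k on simulated data and counts the resulting clusters. The mock dataset from mockSCE, the choice of igraph's cluster_walktrap, the k values and the helper name count_clusters are all assumptions for illustration; function names follow recent Bioconductor releases and may differ in older ones.

```r
library(scran)   # buildSNNGraph
library(scater)  # runPCA; also attaches scuttle (mockSCE, logNormCounts)

# Simulated data stands in for a real dataset (assumption).
set.seed(100)
sce <- mockSCE(ncells = 200, ngenes = 1000)
sce <- logNormCounts(sce)
sce <- runPCA(sce)

# Hypothetical helper: build the SNN graph with a given k and
# count the communities found by walktrap clustering.
count_clusters <- function(sce, k) {
    g <- buildSNNGraph(sce, k = k, use.dimred = "PCA")
    length(unique(igraph::cluster_walktrap(g)$membership))
}

# Number of clusters at each k; larger k usually gives fewer clusters,
# though this is a tendency rather than a guarantee.
k_values <- c(5, 10, 30)
n_by_k <- sapply(k_values, function(k) count_clusters(sce, k))
names(n_by_k) <- paste0("k=", k_values)
print(n_by_k)
```

On real data you would pick the k that gives a resolution you can interpret, rather than aiming for a particular number of clusters.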

You can also try a variety of different clustering algorithms from igraph. Based on my theoretical understanding, the closest reproduction of Seurat's clustering algorithm would be to set type="number" in buildSNNGraph and then use igraph::cluster_louvain. I don't know how similar this actually is, though, so YMMV.
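A minimal sketch of that combination, assuming simulated data from mockSCE and k = 10 (neither comes from this thread), and making no claim about how closely it matches Seurat's output:

```r
library(scran)   # buildSNNGraph
library(scater)  # runPCA; also attaches scuttle (mockSCE, logNormCounts)

# Simulated data stands in for a real dataset (assumption).
set.seed(100)
sce <- mockSCE(ncells = 200, ngenes = 1000)
sce <- logNormCounts(sce)
sce <- runPCA(sce)

# Weight edges by the number of shared neighbours instead of by rank.
g <- buildSNNGraph(sce, k = 10, type = "number", use.dimred = "PCA")

# Seed set in case the Louvain implementation is randomized,
# so that the assignments are reproducible between runs.
set.seed(1)
clusters <- igraph::cluster_louvain(g)$membership

# Cluster sizes.
print(table(clusters))
```

The other community detection functions in igraph, such as cluster_walktrap, can be swapped in on the same graph object.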

ADD COMMENT • link 7.1 years ago Aaron Lun ★ 29k

0

Thanks, Aaron! Changing k does the job I want. And yes, scran - not scater. Sorry.

ADD REPLY • link 7.1 years ago shbrief ▴ 20

0

Actually, I noticed some odd behaviour after making the above change. Whenever I apply plotTSNE() to my updated object, the positions of the clusters in the plot change depending on the 'colour_by' argument. I have no idea why this is happening. Any advice on how to resolve this?

ADD REPLY • link 7.1 years ago shbrief ▴ 20

1

You're going to have to be more specific about the code that you actually ran. I'm guessing that you re-ran plotTSNE on an sce object that did not already contain precomputed t-SNE coordinates. This recomputes the t-SNE coordinates, so if you don't set the random seed, you'll get different coordinates for all points. So, you can either:

1. Set the seed prior to each plotTSNE call. This is the simplest solution.
2. Precompute t-SNE coordinates and store them in the sce object using runTSNE. Then, all ensuing plotTSNE calls will use the precomputed coordinates rather than recomputing them at every call. This is the preferred solution for experienced users as it avoids wasting time on redundant calculations when all you want to do is change the colour of the points (or various other aesthetics that have nothing to do with the t-SNE itself).
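The second option might look roughly like the sketch below. The simulated data from mockSCE, the seed value and the two colData fields used for colouring are assumptions for illustration; runTSNE also needs the Rtsne package to be installed.

```r
library(scater)  # runTSNE, plotTSNE; also attaches scuttle

# Simulated data stands in for a real dataset (assumption).
set.seed(42)
sce <- mockSCE(ncells = 200, ngenes = 1000)
sce <- logNormCounts(sce)

# Compute the t-SNE once, with a fixed seed, and store it in the object.
set.seed(1000)
sce <- runTSNE(sce)
coords_before <- reducedDim(sce, "TSNE")

# Both plots reuse the stored coordinates; only the colouring differs.
p1 <- plotTSNE(sce, colour_by = "Treatment")
p2 <- plotTSNE(sce, colour_by = "Mutation_Status")

# The stored coordinates are untouched by plotting.
coords_after <- reducedDim(sce, "TSNE")
```

If the object is later subsetted or re-normalized, re-run runTSNE (again after set.seed) so that the stored coordinates match the data being plotted.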

Previous versions of scater set the seed internally for convenience. However, this was probably a Bad Idea because it silently changes the behaviour of all downstream functions that depend on randomization - see the commentary at WARNING: remove set.seed usage in R code. As such, this behaviour was deprecated in the last release and removed in this release.

ADD REPLY • link 7.1 years ago Aaron Lun ★ 29k

0

Thank you so much! I went through my code and found that I had skipped runTSNE in my updated run. Putting it back fixed the problem. Thanks again!

ADD REPLY • link 7.1 years ago shbrief ▴ 20