Can I pre-set the number of clusters with scater? I'm comparing scRNAseq analysis tools, and the same dataset clustered into 8 by Seurat is clustered into 20 by scater. For my purpose, I'd prefer to have fewer clusters than 20 and I can't figure out how to do it with scater. Thanks!
scater doesn't have any clustering functions, as far as I remember. Perhaps you're referring to scran's buildSNNGraph? There are a bunch of parameters that you can fiddle with, but the most direct approach is to increase k: a larger k makes the graph more connected, which generally yields fewer, larger clusters. This is the opposite of the reasoning described in the comments section of the workflow.
You can also try a variety of different clustering algorithms from igraph. Based on my theoretical understanding, the closest reproduction of Seurat's clustering algorithm would be to set type="number" in buildSNNGraph and then use igraph::cluster_louvain. I don't know how similar this actually is, though, so YMMV.
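As a rough sketch of both suggestions (not a tested recipe for your data), something like the following should work on a recent Bioconductor release. The mockSCE() dataset, the k values of 10 and 50, and the use of PCA as input are all placeholders I picked for illustration; substitute your own sce object and settings.

```r
library(scran)    # buildSNNGraph
library(scater)   # runPCA; also attaches scuttle (mockSCE, logNormCounts)
library(igraph)   # community detection algorithms

set.seed(100)
sce <- mockSCE()            # simulated data, standing in for a real dataset
sce <- logNormCounts(sce)
sce <- runPCA(sce)

# A larger k gives a more connected graph and generally fewer, larger clusters.
g.small <- buildSNNGraph(sce, k = 10, use.dimred = "PCA")
g.large <- buildSNNGraph(sce, k = 50, use.dimred = "PCA")
n.small <- length(unique(cluster_walktrap(g.small)$membership))
n.large <- length(unique(cluster_walktrap(g.large)$membership))

# Possibly closer to Seurat: weight edges by the number of shared
# neighbours, then apply Louvain community detection.
g.num <- buildSNNGraph(sce, k = 10, use.dimred = "PCA", type = "number")
clusters <- cluster_louvain(g.num)$membership
table(clusters)
```

Comparing n.small and n.large on your own data is a quick way to see how strongly k affects the cluster count before settling on a value.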
Actually, I found one weird behavior after I made the above change. Whenever I apply plotTSNE() to my updated object, the positions of the clusters in the plot change each time I use a different 'colour_by' argument. I have no idea why this is happening. Any advice to resolve this?
You're going to have to be more specific about the code that you actually ran. I'm guessing that you re-ran plotTSNE on the sce object that did not already contain precomputed t-SNE coordinates. This recomputes the t-SNE coordinates so if you don't set the random seed, you'll get different coordinates for all points. So, you can either:
Set the seed prior to each plotTSNE call. This is the simplest solution.
Precompute t-SNE coordinates and store them in the sce object using runTSNE. Then, all ensuing plotTSNE calls will use the precomputed coordinates rather than recomputing them at every call. This is the preferred solution for experienced users as it avoids wasting time in redundant calculations when all you want to do is to change the colour of the points (or various other aesthetics that have nothing to do with the t-SNE itself).
Previous versions of scater set the seed internally for convenience. However, this was probably a Bad Idea because it silently changes the behaviour of all downstream functions that depend on randomization - see the commentary at WARNING: remove set.seed usage in R code. As such, this behaviour was deprecated in the last release and removed in this release.
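A minimal sketch of the precompute-and-reuse approach, assuming a recent scater; the mockSCE() data, the seed value, and the "Treatment" and "Cell_Cycle" columns are just stand-ins for your own object and annotations:

```r
library(scater)   # also attaches scuttle and SingleCellExperiment

sce <- mockSCE()            # simulated data, standing in for a real dataset
sce <- logNormCounts(sce)
sce <- runPCA(sce)

# Set the seed once, compute the t-SNE once, and store it in the object.
set.seed(1000)
sce <- runTSNE(sce, dimred = "PCA")

# Both plots reuse the stored coordinates, so only the colouring differs.
p1 <- plotTSNE(sce, colour_by = "Treatment")
p2 <- plotTSNE(sce, colour_by = "Cell_Cycle")
```

If you cluster again or otherwise update the object, the stored "TSNE" entry stays in place as long as you keep working with the same sce, so the layout will not move between plots.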
Thanks, Aaron! Changing k does the job I want. And yes, scran - not scater. Sorry.
Thank you so much! I went through my code and found out that I had skipped runTSNE on my updated run. Putting it back fixed the problem. Thanks again!