6 months ago by
Cambridge, United Kingdom
To answer your immediate question: I don't see an inherent problem with using the Dunn index to assess separation of clusters, provided you're willing to do all those distance calculations. But keep in mind that the clustering methods in igraph will attempt to maximize the modularity, not the Dunn index. If a graph-based clustering strategy gives you a higher modularity but a lower Dunn index, you can hardly say that it performs poorly - it's just doing its job.
You also don't mention what flavor of Dunn index is being used. If you're using the one that involves computing the ratio of the minimum inter-cluster distance to the maximum intra-cluster distance, I'd say that this is far too conservative to be useful in single-cell data. A single misassigned cell is enough to make your index very small, even if the rest of the clustering is fine.
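To make the sensitivity concrete, here is a minimal sketch of that flavor of the index (minimum inter-cluster distance over maximum intra-cluster distance) in plain Python. The function name `dunn_index` and the toy coordinates are made up for illustration and do not come from any particular package; real single-cell data would use distances in PCA space or similar.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Minimum inter-cluster distance divided by maximum intra-cluster distance.

    Assumes at least two clusters, and at least one cluster with two or more points.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    # Largest distance between any two points that share a cluster.
    max_intra = max(
        dist(a, b)
        for members in clusters.values()
        for a, b in combinations(members, 2)
    )

    # Smallest distance between any two points in different clusters.
    min_inter = min(
        dist(a, b)
        for m1, m2 in combinations(clusters.values(), 2)
        for a in m1
        for b in m2
    )
    return min_inter / max_intra


# Hypothetical toy data: two tight, well-separated groups of four "cells".
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 0), (10, 1), (11, 0), (11, 1)]
clean = ["A", "A", "A", "A", "B", "B", "B", "B"]

# Same data, but the cell at (10, 0) is assigned to the wrong cluster.
one_wrong = ["A", "A", "A", "A", "A", "B", "B", "B"]

print(dunn_index(points, clean))      # about 6.36
print(dunn_index(points, one_wrong))  # about 0.0995
```

Seven of the eight cells are assigned identically in both cases, yet the index drops by a factor of more than 60, because both the minimum and the maximum are determined by the single worst pair of points.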
> I want to try Dunn Index to validate my clustering results from scRNA-Seq data.
Don't use the word "validation". Validation implies that there is some kind of truth to be found, but this isn't really the purpose of clustering, as we have already discussed. Currently, all that you're doing is evaluating the separation between clusters, which is fine and useful but is a long way from establishing truth. If you want to "validate" something, you should be performing functional experiments to demonstrate that your clusters correspond to cells that have different biological behaviour.
> I tried range of k values and want to score them.
Or you could just pick one and see if it's useful. Clustering doesn't have to be perfect; it just has to be good enough for downstream interpretation.
> I know modularity function can be used in that cases but I saw that modularity decreases with increasing of k in buildSNNGraph and buildKNNGraph functions so I wanted to use a different method.
This is a natural consequence of increasing the number of connections in the graph. I would say that this is a feature rather than a bug, because changing the connectivity lets us control the granularity of the clusters: in general, a larger k gives a more interconnected graph and broader clusters, while a smaller k gives finer ones. In this manner, we can adjust the resolution as desired if there are too few/many clusters for further examination.
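A rough illustration of why more connections pull the modularity down, as a toy sketch only: it builds an unweighted k-nearest-neighbour graph in plain Python and scores a fixed two-cluster partition with the standard modularity formula, Q = sum over clusters of (within-cluster edges / m) - (cluster degree / 2m)^2. It does not call buildKNNGraph, buildSNNGraph or igraph, and it does not re-run community detection for each k; the helper names and coordinates are hypothetical.

```python
from math import dist


def knn_edges(points, k):
    """Undirected edge set linking each point to its k nearest neighbours."""
    edges = set()
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def modularity(edges, labels):
    """Newman modularity of a fixed partition on an unweighted, undirected graph."""
    m = len(edges)
    within = {}  # number of edges with both ends in the cluster
    degree = {}  # total degree of the nodes in the cluster
    for i, j in edges:
        degree[labels[i]] = degree.get(labels[i], 0) + 1
        degree[labels[j]] = degree.get(labels[j], 0) + 1
        if labels[i] == labels[j]:
            within[labels[i]] = within.get(labels[i], 0) + 1
    return sum(
        within.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items()
    )


# Hypothetical toy data: two groups of four "cells" each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 0), (10, 1), (11, 0), (11, 1)]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

# Modularity of the same partition as the graph gets denser.
scores = {k: modularity(knn_edges(points, k), labels) for k in (3, 4, 7)}
for k, q in scores.items():
    print(k, round(q, 4))
```

With k = 3 every edge stays inside a group; from k = 4 onwards each cell is forced to connect across groups, so the same partition scores progressively lower even though the cells have not moved.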
modified 6 months ago
6 months ago by
Aaron Lun • 24k