Question: k argument in buildSNNGraph

0

Angelos Armen • **0** wrote:

In the documentation of `buildSNNGraph()` it says:

"The choice of k can be roughly interpreted as the minimum cluster size."

Could I have an explanation of this, please?

modified 5 weeks ago by Aaron Lun • **24k** • written 5 weeks ago by Angelos Armen • **0**

Answer: k argument in buildSNNGraph

1

Aaron Lun • **24k** wrote:

There's nothing special here. If you have a subpopulation with fewer than `k+1` cells, `buildSNNGraph()` will forcibly construct edges between cells in that subpopulation and cells in other subpopulations. This increases the risk that the subpopulation will not form its own cluster, as it is more interconnected with the rest of the cells in the dataset.

I guess the wording of the documentation is misleading, as `k` should be interpreted as the anticipated size of the smallest subpopulation. It is not a specification of the size of the smallest cluster that you are willing to obtain. The actual minimum cluster size is at the mercy of the community detection algorithm that you choose, if it enforces (explicitly or otherwise) a minimum cluster size at all.
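To make the point about forced edges concrete, here is a minimal sketch in plain Python. It is not scran's implementation: it only runs a brute-force k-nearest-neighbour search on made-up one-dimensional positions, and the names `knn` and `cross_links_from_small` are hypothetical helpers invented for this illustration. It counts how many neighbour links a small, tight subpopulation is forced to send to a distant one.

```python
def knn(points, k):
    """For each point, return the indices of its k nearest other points (brute force)."""
    neighbours = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: abs(points[j] - p),
        )
        neighbours.append(others[:k])
    return neighbours


def cross_links_from_small(small_size, k, large_size=50):
    """Count neighbour links from cells in a small subpopulation to a distant large one."""
    # Toy data (an assumption for illustration): a tight group near 0, another near 100.
    small = [0.01 * i for i in range(small_size)]
    large = [100 + 0.01 * i for i in range(large_size)]
    nn = knn(small + large, k)
    # Indices >= small_size belong to the large subpopulation.
    return sum(1 for i in range(small_size) for j in nn[i] if j >= small_size)


for size in (5, 10, 11):
    print(size, cross_links_from_small(size, k=10))
```

With `k = 10`, a subpopulation of 5 cells has only 4 within-group neighbours per cell, so each cell must take its remaining 6 neighbours from elsewhere. Once the subpopulation reaches `k + 1 = 11` cells, every cell can fill its neighbour list from its own group and the cross links disappear.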

"If you have a subpopulation with fewer than `k+1` cells, `buildSNNGraph()` will forcibly construct edges between cells in that subpopulation and cells in other subpopulations. This increases the risk that the subpopulation will not form its own cluster as it is more interconnected with the rest of the cells in the dataset."

Yes, this is what I was thinking too, but I wouldn't interpret `k` as the minimum cluster size. Suppose that we have a subpopulation S1 with fewer than `k+1` cells. Then the cells in S1 will have a set of cells C from the nearest subpopulation S2 among their nearest neighbours. If S2 is large enough and far enough away from S1, then the cells in C will only have nearest neighbours from S2, so S1 and S2 will not merge.

On the other hand, if S2 is of similar size to S1, then S1 and S2 may indeed merge regardless of their distance. So I would think of `k + 1` as the size of the smallest discoverable isolated subpopulation (where isolated means that the intra-subpopulation distances are smaller than the distances to the other subpopulations).
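The asymmetry in this argument can be checked with a small sketch. It assumes a plain brute-force k-nearest-neighbour search on invented one-dimensional positions and does not reproduce the shared-neighbour weighting that `buildSNNGraph()` applies afterwards. The helper names `knn` and `cross_links` are hypothetical. It counts neighbour links between S1 and S2 that go in one direction only versus links that are mutual.

```python
def knn(points, k):
    """For each point, return the set of indices of its k nearest other points."""
    neighbours = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: abs(points[j] - p),
        )
        neighbours.append(set(others[:k]))
    return neighbours


def cross_links(size1, size2, k, gap=100.0):
    """Return (one_way, mutual) counts of neighbour links between S1 and S2."""
    # Toy data (an assumption for illustration): S1 near 0, S2 near `gap`.
    s1 = [0.01 * i for i in range(size1)]
    s2 = [gap + 0.01 * i for i in range(size2)]
    nn = knn(s1 + s2, k)
    one_way = mutual = 0
    for i in range(size1):
        for j in range(size1, size1 + size2):
            forward = j in nn[i]   # S1 cell lists the S2 cell as a neighbour
            backward = i in nn[j]  # S2 cell lists the S1 cell as a neighbour
            if forward and backward:
                mutual += 1
            elif forward or backward:
                one_way += 1
    return one_way, mutual


print(cross_links(3, 50, k=5))           # small S1 next to a large S2
print(cross_links(3, 3, k=5))            # S1 and S2 both small
print(cross_links(3, 3, k=5, gap=1e6))   # both small, much further apart
```

In this toy setting, a 3-cell S1 next to a 50-cell S2 produces only one-way links from S1 into S2, because the S2 cells fill their neighbour lists from within S2. When both groups have 3 cells and `k = 5`, every cross pair is mutual no matter how large the gap is, which matches the observation that similarly sized small subpopulations may merge regardless of distance.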
