Evaluating HCPC clusters from FactoMineR using cluster.stats from fpc library

I'm trying to get quality measures, for example the silhouette width, for HCPC clusters.

For the k-means algorithm I can get avg.silwidth using cluster.stats from the fpc library:

library(factoextra)  # provides eclust()
library(fpc)         # provides cluster.stats()
km.res <- eclust(dat, "kmeans", k = 20, nstart = 25, graph = FALSE)
# Compute the pairwise distance matrix
dd <- dist(dat, method = "euclidean")
# Statistics for the k-means clustering
km_stats <- cluster.stats(dd, km.res$cluster)

Now I'm trying to use cluster.stats with HCPC. To run cluster.stats I need a distance matrix and the cluster assignments. When I run HCPC with consol=TRUE it does run a k-means consolidation step, so I have been looking for the distance matrix and the clusters in the result, but I could not find them.

names(res.hcpc)
[1] "data.clust" "desc.var" "desc.axes" "call" "desc.ind"

In this assignment https://eric.univ-lyon2.fr/~jahpine/cours/l3_cestat-dm/tp5.pdf

it seems that they get the clusters from data.clust$clust:

clust.eval.hcpc = cluster.stats(cluster::daisy(sub.D.kmodes), as.numeric(res.hcpc.sub.acm$data.clust$clust), res.kmodes.sub.D.kmodes$cluster)

But when I run this:

library(FactoMineR)  # provides PCA() and HCPC()
res <- PCA(dat, quali.sup = 1:3, ncp = 8)
# Hierarchical clustering on principal components
res.hcpc <- HCPC(res, kk = Inf, min = 3, max = 10, consol = TRUE)
km_stats <- cluster.stats(dd, res.hcpc$data.clust$clust)

I get the error:

Error in Summary.factor(c(3L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,  : 
  ‘max’ not meaningful for factors
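
The message suggests that data.clust$clust is a factor, and the assignment's call quoted above wraps it in as.numeric(). Here is a minimal base-R sketch of that conversion, using a made-up factor (clust_factor is a hypothetical stand-in, not real HCPC output). The commented-out last line is an assumed fix for the call above, not a tested one, and it also assumes the rows of data.clust are in the same order as the observations behind dd:

```r
# Made-up cluster labels stored as a factor, mimicking the shape of
# res.hcpc$data.clust$clust (hypothetical values, not real HCPC output)
clust_factor <- factor(c(3, 1, 1, 3, 2, 3))

# max() is not defined for unordered factors, so this call fails;
# tryCatch() captures the error message instead of stopping the script
err <- tryCatch(max(clust_factor), error = function(e) conditionMessage(e))
print(err)

# Converting to integer level codes gives labels 1..k that max() accepts
clust_int <- as.integer(clust_factor)
print(clust_int)
print(max(clust_int))

# Assumed fix for the call above (untested here; assumes matching row order):
# km_stats <- cluster.stats(dd, as.integer(res.hcpc$data.clust$clust))
```

Note that as.integer() on a factor returns level codes rather than the original label values. The two coincide when the levels are "1", "2", ..., "k" in that order, which is the case assumed here.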


Did you ever find a solution for this? I am dealing with the same issue: I want to get some cluster validation statistics for my HCPC result, but I can't find a way to do it.

BR, Daniela

