Prior Control with Mclust package
mabdulsa • 0

I started using R a couple of weeks ago and I'm new to programming.

I'm doing some EM-algorithm clustering with the mclust package in R. It seems to be exactly what I need, but the number of clusters I get is higher than I expect. I assume my problem can be solved using prior control, but whenever I run

mclustBIC(mydata, prior = priorControl())

I get the following error message:

Error in chol.default(priorParams$scale) : 
  the leading minor of order 3 is not positive definite

Here's an example showing how prior control works:

treesBIC <- mclustBIC(trees) # default (no prior)
plot(treesBIC, legendArgs = list(x = "bottom", ncol = 2, cex = .75))

treesBICprior <- mclustBIC(trees, prior = priorControl()) # with prior
plot(treesBICprior, legendArgs = list(x = "bottom", ncol = 2, cex = .75))

 

Aedin Culhane ▴ 510

Hi 

Try specifying G. The default is G = 1:9, i.e. BIC is computed for 1 to 9 clusters.

 

G is an integer vector specifying the numbers of mixture components (clusters) for which the BIC is to be calculated. The default is G=1:9, unless the argument x is specified, in which case the default is taken from the values associated with x.

 

Best
Aedin

Aedin Culhane ▴ 510

Sorry, I meant to give example code.

With up to 5 clusters, G = 1:5:

testdata<-iris[,-5]

tt<- mclustBIC(testdata, G = 1:5)

plot(tt)

 

The default is up to 9 clusters, G = 1:9:

 plot(mclustBIC(testdata))
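
To read the best model off the BIC table rather than only plotting it, something along these lines should work. This is a minimal sketch, assuming the mclust package is installed; the object names `bic` and `fit` are just illustrative.

```r
library(mclust)  # assumes the mclust package is installed

testdata <- iris[, -5]

# BIC for 1 to 9 clusters across the available covariance models
bic <- mclustBIC(testdata, G = 1:9)

# Prints the top-ranked models by BIC
summary(bic)

# Fit the model with the highest BIC, reusing the BIC values computed above
fit <- Mclust(testdata, x = bic)
fit$G          # selected number of clusters
fit$modelName  # selected covariance model
```

The same pattern should apply to a BIC object computed with `prior = priorControl()`.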


Hi Aedin,

Actually I wanted the code to run G from 1 to 9, and give me the optimal G from the highest BIC value.

If I run the code without prior control, it finishes without errors, but it returns more clusters than expected. I was hoping that prior control would regularize the fit and give a lower, more accurate number of clusters.

Thanks, I really appreciate your input.
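
A possible explanation for the error in the question: as far as the mclust documentation for the default prior describes it, the prior scale is derived from the sample covariance of the data, so `chol()` would fail whenever that covariance matrix is not positive definite (for example with a constant column, linearly dependent columns, or fewer observations than variables). The base-R sketch below reproduces that kind of failure on made-up data; `mydata_like` and `is_pd` are hypothetical names, and this is only a diagnostic idea, not a confirmed diagnosis of the original data.

```r
# Hypothetical data: two random columns plus a constant third column
set.seed(1)
x <- matrix(rnorm(40), ncol = 2)
mydata_like <- cbind(x, 5)

# Hypothetical helper: TRUE if the Cholesky factorization succeeds
is_pd <- function(S) !inherits(try(chol(S), silent = TRUE), "try-error")

S <- var(mydata_like)
rank_S <- qr(S)$rank      # rank below ncol(S) signals a singular covariance
pd_full <- is_pd(S)       # chol() fails on the singular matrix

# Dropping the offending column gives a usable covariance matrix
pd_reduced <- is_pd(var(mydata_like[, 1:2]))
```

If `rank_S` is smaller than the number of columns on the real data, removing or combining the redundant variables before calling `mclustBIC(..., prior = priorControl())` may be worth trying.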

 
