What is the role of 'itemLabels' in the 'bhc' function?
sk.rhyeu • 0

Hi! I am Rhyeu.

I have been interested in Bayesian clustering and have been testing several packages, including the "BHC" package.

I have a question about it.

What is the role of 'itemLabels' in the bhc function?

I thought it provided some kind of prior information for the clustering, but when I tested it on Fisher's iris data, it made no difference.

Am I using this function incorrectly, or have I misunderstood something?

I have already tried changing the 'randomised' and 'numReps' options. They did not affect the results as I expected.

I have also read some articles related to this package, and found that my result is the same as the result in 'Lowing and Bomalaski, 2017' that was obtained with the 'Species' labels removed. I am not sure how to apply the 'Species' labels in this function.

My session info and the code I tested are attached below.

Thanks for reading this question.

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)

Matrix products: default

locale:
[1] LC_COLLATE=Korean_Korea.949  LC_CTYPE=Korean_Korea.949   
[3] LC_MONETARY=Korean_Korea.949 LC_NUMERIC=C                
[5] LC_TIME=Korean_Korea.949    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_0.8.3           here_0.1              BHC_1.36.0           
[4] rstan_2.19.2          ggplot2_3.2.0         StanHeaders_2.18.1-10

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.2         pillar_1.4.2       compiler_3.6.1    
 [4] prettyunits_1.0.2  tools_3.6.1        pkgbuild_1.0.3    
 [7] packrat_0.5.0      tibble_2.1.3       gtable_0.3.0      
[10] pkgconfig_2.0.2    rlang_0.4.0        cli_1.1.0         
[13] rstudioapi_0.10    parallel_3.6.1     xfun_0.8          
[16] loo_2.1.0          gridExtra_2.3      withr_2.1.2       
[19] knitr_1.23         rprojroot_1.3-2    stats4_3.6.1      
[22] grid_3.6.1         tidyselect_0.2.5   glue_1.3.1        
[25] inline_0.3.15      R6_2.4.0           processx_3.4.1    
[28] callr_3.3.1        purrr_0.3.2        magrittr_1.5      
[31] backports_1.1.4    scales_1.0.0       ps_1.3.0          
[34] matrixStats_0.54.0 assertthat_0.2.1   colorspace_1.4-1  
[37] lazyeval_0.2.2     munsell_0.5.0      crayon_1.3.4      

library(BHC)
library(dplyr)
data(iris)

itemLabels = as.character(c(rep(1, 50), rep(2, 50), rep(3, 50))) # setosa: 1, versicolor: 2, virginica: 3 (or itemLabels = iris$Species)
itemLabels2 = as.character(1:150)

percentiles = FindOptimalBinning(t(iris[,1:4]), itemLabels, transposeData = TRUE, verbose = TRUE)
percentiles2 = FindOptimalBinning(t(iris[,1:4]), itemLabels2, transposeData = TRUE, verbose = TRUE)

percentiles
percentiles2

discreteData <- DiscretiseData(t(iris[,1:4]), percentiles=percentiles)
discreteData2 <- DiscretiseData(t(iris[,1:4]), percentiles=percentiles2)

discreteData <- t(discreteData)
discreteData2 <- t(discreteData2)
discreteData
hc3 <- bhc(discreteData, 
           itemLabels, 
           verbose=TRUE 

           # randomised = T, 
           # numReps = 50
           )



hc3_2 <- bhc(discreteData2, 
             itemLabels2, 
             verbose=TRUE

             # randomised = T, 
             # numReps = 50
             )


par(mfrow=c(1,2))
plot(hc3, main = "Label 1:3")
plot(hc3_2, main = "Label 1:150")
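
Comparing the two plots by eye is error-prone, so the effect of the labels could also be checked numerically by cutting each tree into groups and comparing the memberships. The following is only a minimal sketch of that comparison: it uses base R's hclust() as a stand-in for bhc() so that it runs without the BHC package, and the helper name cluster_with_labels is hypothetical. Applying the same check to the bhc() results above assumes that their return values can be converted with as.hclust() before calling cutree().

```r
# Stand-in data: the four numeric iris measurements as a matrix.
x <- as.matrix(iris[, 1:4])

# Hypothetical helper: cluster the same data under a given set of item labels
# and return the membership of each item when the tree is cut into k groups.
# hclust() is used here only as a stand-in for bhc().
cluster_with_labels <- function(x, labels, k = 3) {
  rownames(x) <- labels            # labels are attached as item names only
  tree <- hclust(dist(x), method = "average")
  unname(cutree(tree, k = k))      # drop names so only memberships are compared
}

groups_species <- cluster_with_labels(x, as.character(as.integer(iris$Species)))
groups_index   <- cluster_with_labels(x, as.character(1:150))

# TRUE means the labels did not change the grouping.
print(identical(groups_species, groups_index))

# Cross-tabulate the clusters against the known species.
print(table(cluster = groups_species, species = iris$Species))
```

If the two membership vectors are identical, the labels only named the leaves of the tree and did not influence how the items were grouped in that run.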
