Problem in K-means clustering
Dear List,

I am working on Affymetrix's HGU133Plus2 chip data GSE23343 from GEO and want to find the optimal number of clusters.

I am following http://koti.mbnet.fi/tuimala/oppaat/r2.pdf, where the following code is given on pg 113. When I ran it, I did not get the graph but received the error shown below.

Please suggest where I am making a mistake.

    kmax<-c(100)
    if(nrow(dat2)<100) {
      kmax<-nrow(dat2)
    }
    km<-rep(NA,(kmax-1))
    i<c(2)
    while(i<kmax){
      km[i]<-sum(kmeans(dat2,i,iter.max=20000,nstart=10)$withinss)
      if(i>=3 & km[i-1]/km[i]<=1.01){
        i<-kmax
      } else {
        i<-i+1
      }
    }
    plot(2:kmax,km,xlab="K",ylab="sum(withinss)",type="b",pch="+",main="Terminated when change less then 1%")

    Error in plot.window(...) : need finite 'ylim' values
    In addition: Warning messages:
    1: In min(x) : no non-missing arguments to min; returning Inf
    2: In max(x) : no non-missing arguments to max; returning -Inf

Many thanks,

Sonali Arora

Hi Aditya,

You have a typo in your code. It should be

    i <- c(2)

instead of

    i<c(2)

I had no problem running the following:

    rm(list=ls())
    dat2 <- mtcars
    kmax <- c(100)
    # If there are less than 100 genes or arrays
    # make the max. no. of cluster equal to the
    # number of genes or arrays
    if(nrow(dat2)<100) {
      kmax <- nrow(dat2)
    }
    # Create an empty vector for storing the
    # within SS values
    km <- rep(NA, (kmax-1))
    # Minimum number of cluster is 2
    i <- c(2)
    # Test all numbers of clusters between 2
    # max. 100 using the while-loop
    while(i<kmax) {
      km[i] <- sum(kmeans(dat2, i, iter.max=20000,
                          nstart=10)$withinss)
      # Terminate the run if the change in within SS is
      # less than 1%
      if(i>=3 & km[i-1]/km[i]<=1.01) {
        i <- kmax
      } else {
        i <- i+1
      }
    }
    # Plot the number of K against the within SS
    plot(2:kmax, km, xlab="K", ylab="sum(withinss)", type="b",
         pch="+", main="Terminated when change less than 1%")
    sessionInfo()

    R version 3.1.1 (2014-07-10)
    Platform: i386-w64-mingw32/i386 (32-bit)

    locale:
    LC_COLLATE=English_United States.1252
    LC_CTYPE=English_United States.1252
    LC_MONETARY=English_United States.1252
    LC_NUMERIC=C
    LC_TIME=English_United States.1252

    attached base packages:
    stats graphics grDevices utils datasets methods base

    loaded via a namespace (and not attached):
    tools_3.1.1

In future, please attach a script that people can copy-paste into their session, along with the output of sessionInfo(). It will help people replicate your problem exactly and troubleshoot it.

Thanks and Regards,
Sonali.

On 8/22/2014 9:02 AM, Aditya Saxena wrote:
> Dear List,
> [...]
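
For anyone hitting the same error: plot() reports "need finite 'ylim' values" when every value it is asked to draw is NA, which is what happens here if the while-loop never fills km. The sketch below is one possible way to write the same search more defensively. It is not the tutorial's code: scan_withinss is a hypothetical helper name, and storing K next to each within-SS value is my own choice. That choice also avoids what looks like an off-by-one in plot(2:kmax, km), where km[1] is never filled, so km[i] ends up drawn at K = i + 1.

```r
# Hypothetical helper (not from the tutorial): scan K = 2 .. kmax - 1 and
# stop once the within-cluster SS improves by 1% or less.
scan_withinss <- function(dat, kmax = 100, iter.max = 20000, nstart = 10) {
  # Cap kmax at the number of rows, as in the thread
  kmax <- min(kmax, nrow(dat))
  ks <- integer(0)
  wss <- numeric(0)
  k <- 2L  # an assignment with <-, not the comparison k < 2
  while (k < kmax) {
    fit <- kmeans(dat, centers = k, iter.max = iter.max, nstart = nstart)
    ks <- c(ks, k)
    wss <- c(wss, sum(fit$withinss))
    n <- length(wss)
    # && short-circuits, so wss[n - 1] is only read when it exists
    if (n >= 2 && wss[n - 1] / wss[n] <= 1.01) break
    k <- k + 1L
  }
  # Each within-SS value is stored with the K that produced it
  data.frame(K = ks, withinss = wss)
}

set.seed(1)  # kmeans uses random starts; fix the seed for repeatability
res <- scan_withinss(mtcars)

# Only plot when something finite was actually computed
if (nrow(res) > 0 && any(is.finite(res$withinss))) {
  plot(res$K, res$withinss, xlab = "K", ylab = "sum(withinss)", type = "b",
       pch = "+", main = "Terminated when change less than 1%")
}
```

To use it on expression data, pass your own matrix in place of mtcars, for example scan_withinss(dat2). If the returned data frame has no rows, the loop never ran, which is worth checking before plotting.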