flowClust problem
@anja-mirenska-5207
Dear all,

I am trying out different approaches for clustering cytometric data with R. For trying flowClust, I am using a matrix with 7 markers as columns. At the beginning I tried clustering the first two columns of the matrix (say var1 and var2) and this worked perfectly:

    library("flowClust")
    res1 <- flowClust(mydata, varNames=c("var1", "var2"), K=1, B=100)
    mydata2 <- mydata[mydata %in% res1,]
    res2 <- flowClust(mydata2, varNames=c("var1", "var2"), K=1:6, B=100)

Then I went on to cluster var1 vs. var3 with the same settings, and suddenly an error occurred when calculating res2:

    res1 <- flowClust(mydata, varNames=c("var1", "var3"), K=1, B=100)
    mydata2 <- mydata[mydata %in% res1,]
    res2 <- flowClust(mydata2, varNames=c("var1", "var3"), K=1:6, B=100)

    Error in if (M == 0) label else maxLabel[[M]] : argument has length 0

I tried to decrease the number of K, as this error did not appear when calculating res1, and it worked at up to K=1:4. Now I looked at the BIC:

    criterion(res2, "BIC")
    [1] -372.1891 -377.6957 -330.1110 NaN

Where does this NaN come from?

Next, I tried to add var2 as the third parameter:

    res1 <- flowClust(mydata, varNames=c("var1", "var2", "var3"), K=1, B=100)
    mydata2 <- mydata[mydata %in% res1,]
    res2 <- flowClust(mydata2, varNames=c("var1", "var2", "var3"), K=1:4, B=100)

And again the same error as above. This time it only works with up to K=1:3, and again the third BIC is NaN. So next try:

    res2 <- flowClust(mydata2, varNames=c("var1", "var2", "var3"), K=1:3, B=100)  # this works
    plot(res2[[3]], data=cmat2, level=0.8, z.cutoff=0)

And here comes the next error, although it does create the plot:

    Error in eigen(x@sigma[i, subset, subset]): infinite or missing values in 'x'

Adding var4 only allows K=1.

I wonder what's the reason for these errors. The original data is of mode numeric, there are no NA or 0 within it. Maybe someone has an idea what I could do to eliminate this problem?

I am sorry for the long email, but as I can't figure out the problem, I also can't create a short, reproducible example. All of this works fine with three columns from the rituximab data.

Best wishes

Anja
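The statement that the data contain no NA or 0 values can be checked column by column before fitting. The following is a minimal base R sketch, not taken from the thread: the simulated matrix is a stand-in for the poster's `mydata2`, which is not available, and `check_columns` is a hypothetical helper name introduced only for illustration.

```r
# Hypothetical stand-in for the filtered matrix; the real data are not available.
set.seed(1)
mydata2 <- cbind(var1 = rnorm(200, mean = 5, sd = 1),
                 var2 = rnorm(200, mean = 3, sd = 0.5),
                 var3 = rnorm(200, mean = 2, sd = 0.1))

# Hypothetical helper: per-column properties that can upset a mixture-model fit.
check_columns <- function(x) {
  data.frame(
    n_nonfinite = colSums(!is.finite(x)),                # NA, NaN, Inf, -Inf
    n_zero      = colSums(x == 0, na.rm = TRUE),         # exact zeros
    n_unique    = apply(x, 2, function(col) length(unique(col))),
    sd          = apply(x, 2, sd, na.rm = TRUE)          # spread of each marker
  )
}

chk <- check_columns(mydata2)
print(chk)

# Number of events left after filtering; few rows relative to K and the
# number of markers make the larger models hard to estimate.
n_events <- nrow(mydata2)
```

Columns with very few unique values or a very small standard deviation, or a small `n_events`, would be natural things to look at first when larger K values fail.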
@valerie-obenchain-4275
Hi Anja,

The NaN in the result from criterion(res2, "BIC") is telling you that four clusters cannot be estimated given your data. This may have to do with data conditioning, the columns you chose in your dataset, or the number of values per column. It is hard to say without looking at the data. Is your 'mydata' object too large to send?

Have you looked at the results of summary(res1) and summary(res2) for further clues?

I'm also cc'ing the package author/maintainer.

Valerie

On 04/10/2012 05:01 AM, Anja Mirenska wrote:
> [...]
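The data-conditioning explanation in the reply above can be probed without flowClust. The sketch below uses base R only and simulated data (an assumption, since the poster's matrix was not shared): it looks for nearly redundant columns and a near-singular covariance matrix, then picks K from the BIC values quoted in the question while skipping non-finite entries. It assumes that a larger BIC is better, which is how flowClust's criterion is usually read; check the package documentation before relying on that.

```r
# Simulated stand-in: var3 is almost a linear function of var1, so the
# covariance matrix is close to singular.
set.seed(42)
var1 <- rnorm(300)
dat <- cbind(var1 = var1,
             var2 = rnorm(300),
             var3 = 2 * var1 + rnorm(300, sd = 1e-4))

# Pairwise correlations: values near +1 or -1 flag nearly redundant markers.
cors <- cor(dat)
print(round(cors, 4))

# Condition number of the covariance matrix from its eigenvalues;
# a very large ratio indicates an ill-conditioned estimation problem.
ev <- eigen(cov(dat), symmetric = TRUE, only.values = TRUE)$values
cond_number <- max(ev) / min(ev)

# BIC values quoted in the question for K = 1:4.
bic <- c(-372.1891, -377.6957, -330.1110, NaN)

# Keep only K values with a finite BIC, then take the largest
# (assumption: larger BIC is better).
finite_k <- which(is.finite(bic))
best_k <- finite_k[which.max(bic[finite_k])]
```

With the real data, a high pairwise correlation or a large `cond_number` for the chosen markers would support the conditioning explanation; it would not by itself prove that this is the cause of the errors.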