Question: error under "hclust" for microarray clustering
avehna wrote, 6.8 years ago:
Hi All,

I'm trying to cluster 21657 genes that are differentially expressed in my microarray data, but it's not working. After reading the normalized signal and calculating the mean for each treatment, I read in the list of differentially expressed genes (previously calculated using limma). The problem occurs in the "hclust" function (please see my code and the corresponding error below). Could this error be due to the number of genes? When I use the same code for only 1000 genes, it works well.

How could I solve this problem? I need this figure for my paper...

Thank you for your help!

Sincerely,
Avhena

************************************************
> signal<-signal[-grep("AFFX",rownames(signal)), ,drop=FALSE]
> pDatam <- read.AnnotatedDataFrame('pdatam.txt', row.names = 1, header = TRUE, sep = '\t')
> pData <- read.AnnotatedDataFrame('pdata.txt', row.names = 1, header = TRUE, sep = '\t')
> expset <- new("ExpressionSet", exprs = signal, phenoData = pData)
> means1 <- means(pairwise.comparison(expset, "Type", c("Control", "BMP"), method="logged", logged=FALSE))
> means2 <- means(pairwise.comparison(expset, "Type", c("BMP.VPA", "SHH.1D"), method="logged", logged=FALSE))
> means3 <- means(pairwise.comparison(expset, "Type", c("SHH.6H", "SHH.VPA.1D"), method="logged", logged=FALSE))
> all_means<-cbind(means1,means2,means3)
> expmeans <- new("ExpressionSet", exprs = all_means, phenoData = pDatam)
> subset<-get.array.subset(expmeans, "Type", c("Control", "BMP", "SHH.1D", "SHH.VPA.1D"))
> genes<-read.table("affy_ids_diff_exprs05.dat")
> mysubset<-exprs(subset)[match(levels(genes[,]), rownames(exprs(subset))),]
> hr <- hclust(as.dist(1-cor(t(mysubset), method="spearman")), method="complete")
Error in hclust(as.dist(1 - cor(t(mysubset), method = "spearman")), method = "complete") :
  NA/NaN/Inf in foreign function call (arg 11)
Calls: hclust -> .Fortran
In addition: Warning message:
In cor(t(mysubset), method = "spearman") : the standard deviation is zero
Execution halted
Sean Davis (United States) wrote, 6.8 years ago:
On Thu, Jan 27, 2011 at 12:00 AM, avehna wrote:
> Error in hclust(as.dist(1 - cor(t(mysubset), method = "spearman")), method = "complete") :
>   NA/NaN/Inf in foreign function call (arg 11)

Looks like you might have some NAs or Inf in your data. Try summary(mysubset) to see.

Sean
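The check suggested with summary(mysubset) can be complemented by a direct count of non-finite entries, which is easier to scan when the matrix is large. The following is a minimal sketch, assuming the data are a numeric matrix with genes in rows; the toy matrix and the names `toy`, `bad_cells`, and `bad_rows` are illustrative and do not come from the original post.

```r
# Toy stand-in for mysubset: 5 genes x 4 conditions (hypothetical data).
set.seed(1)
toy <- matrix(rnorm(20), nrow = 5,
              dimnames = list(paste0("gene", 1:5), paste0("cond", 1:4)))
toy[2, 3] <- Inf   # e.g. the result of a division by zero
toy[4, 1] <- NA    # e.g. a missing measurement

# is.finite() is FALSE for NA, NaN, Inf and -Inf alike.
bad_cells <- !is.finite(toy)
sum(bad_cells)     # total number of problematic values

# Names of the rows (genes) that contain at least one such value.
bad_rows <- rownames(toy)[rowSums(bad_cells) > 0]
bad_rows
```

Applied to the real matrix, the same two lines would show how many values are affected and which genes they belong to, which may help trace where in the pipeline they were introduced.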
Hi Sean, you were right... there are some Inf values in my data. Thanks a lot for your help!

Reply written 6.8 years ago by avehna
James W. MacDonald (United States) wrote, 6.8 years ago:
Hi Avhena,

On 1/27/2011 12:00 AM, avehna wrote:
> How could I solve this problem? I need this figure for my paper...

You need to remove the rows that have no variability. For example:

> dat <- matrix(rnorm(1000), nc=10)
> dat[3,] <- rep(dat[3,3], 10) ## make row three have var=0
> hclust(as.dist(1-cor(t(dat), method="spearman")), method="complete")
Error in hclust(as.dist(1 - cor(t(dat), method = "spearman")), method = "complete") :
  NA/NaN/Inf in foreign function call (arg 11)
In addition: Warning message:
In cor(t(dat), method = "spearman") : the standard deviation is zero

Now again, without this row:

> hclust(as.dist(1-cor(t(dat[-3,]), method="spearman")), method="complete")

Call:
hclust(d = as.dist(1 - cor(t(dat[-3, ]), method = "spearman")), method = "complete")

Cluster method   : complete
Number of objects: 99

Something like

ind <- apply(mysubset, 1, var) == 0
mysubset <- mysubset[!ind,]

should do the trick.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
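One caveat when the data also contain NA or Inf values: var() returns NA or NaN for such rows, so the comparison with zero yields NA, and indexing a matrix with an NA logical produces rows of NA rather than dropping them. A minimal sketch of a filter that handles both cases at once is shown below, assuming that discarding such genes is acceptable for the analysis. The helper `is_usable` and the objects `keep` and `clean` are hypothetical names, and constancy is tested with max(x) > min(x) instead of the variance.

```r
set.seed(42)
dat <- matrix(rnorm(1000), ncol = 10)   # 100 genes x 10 conditions (toy data)
dat[3, ]  <- dat[3, 3]                  # row 3: constant, so variance is 0
dat[7, 2] <- Inf                        # row 7: contains a non-finite value

# A row is usable only if every value is finite AND the row is not constant.
# && short-circuits, so max()/min() are only evaluated on fully finite rows.
is_usable <- function(x) all(is.finite(x)) && max(x) > min(x)

keep  <- apply(dat, 1, is_usable)       # logical vector, never NA
clean <- dat[keep, , drop = FALSE]

# Same clustering call as in the thread, now on the filtered matrix.
hr <- hclust(as.dist(1 - cor(t(clean), method = "spearman")),
             method = "complete")
nrow(clean)
```

For the original data, the same filter could be applied to mysubset before the hclust() call; counting sum(!keep) first shows how many genes would be dropped.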
Jim, thanks a lot! There were some Inf values in my data... I will try again after removing them.

Avhena

Reply written 6.8 years ago by avehna