Cluster analysis distance measure


Auer Michael ▴ 250

@auer-michael-953

I would like to know whether it is possible to cluster genes non-hierarchically while using correlation as the distance measure. K-means, clara, pam, etc. only seem to work with Euclidean metrics. I ask because the number of genes is often too large for hierarchical clustering, and the distance measure strongly influences how the genes are clustered.

Thanks
(Johan Lindberg) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Tue, 9 Nov 2004 12:59:41 +0100 (CET) > From: kfbargad@lg.ehu.es > Subject: [BioC] error following cluster example > To: bioconductor@stat.math.ethz.ch > Message-ID: <9456297971kfbargad@lg.ehu.es> > Content-Type: text/plain; charset="ISO-8859-1" > > Dear Users, > > I am following the example on Lab 5: Cluster analysis (June 2003) with > my own data. > > I have filtered my expression set as shown on the example and I get > the following > >> sub <- genefilter(X,ffun) >> sum(sub) > [1] 1124 > > I save this subset of genes and then log transform it. But when I type > the next command I get the following error: >> X <- X[sub,] >> X <- log2(X) >> RawDataSub <- Raw.Data[,sub] > Error in Raw.Data[, sub] : (subscript) logical subscript too long > > Why do I get this error?? > Also, if I have stored the subset expression data as X, why is Raw.Data > [,sub] using [,sub] again? I don?t really understand this step, if > anyone could explain its purpose. > > I?m running R 1.9.1 on an XP computer > > Thanks a lot for your help > > David > > > > ------------------------------ > > Message: 2 > Date: Tue, 9 Nov 2004 12:29:19 -0000 > From: "Claire Wilson" <clairewilson@picr.man.ac.uk> > Subject: RE: [BioC] error following cluster example > To: <kfbargad@lg.ehu.es>, <bioconductor@stat.math.ethz.ch> > Message-ID: > <baa35444b19ad940997ed02a6996aae001de15e7@sanmail.picr.man.ac.uk> > Content-Type: text/plain; charset="US-ASCII" > > >> Dear Users, >> >> I am following the example on Lab 5: Cluster analysis (June >> 2003) with >> my own data. >> >> I have filtered my expression set as shown on the example and I get >> the following >> >> > sub <- genefilter(X,ffun) >> > sum(sub) >> [1] 1124 >> >> I save this subset of genes and then log transform it. 
But >> when I type >> the next command I get the following error: >> > X <- X[sub,] >> > X <- log2(X) >> > RawDataSub <- Raw.Data[,sub] >> Error in Raw.Data[, sub] : (subscript) logical subscript too long > > it looks like you are tyring to select columns not rows, > RawDataSub <- Raw.Data[,sub] #subsets on columns > try: > RawDataSub <- Raw.Data[sub,] #subset on rows > > hth > > claire > > -------------------------------------------------------- > > > This email is confidential and intended solely for the use o...{{dropped}} > > > > ------------------------------ > > Message: 3 > Date: Tue, 9 Nov 2004 08:07:59 -0500 > From: Robert Gentleman <rgentlem@jimmy.harvard.edu> > Subject: Re: [BioC] error following cluster example > To: kfbargad@lg.ehu.es > Cc: bioconductor@stat.math.ethz.ch > Message-ID: <20041109080759.E29793@jimmy.harvard.edu> > Content-Type: text/plain; charset=iso-8859-1 > > On Tue, Nov 09, 2004 at 12:59:41PM +0100, kfbargad@lg.ehu.es wrote: >> Dear Users, >> >> I am following the example on Lab 5: Cluster analysis (June 2003) with >> my own data. >> >> I have filtered my expression set as shown on the example and I get >> the following >> >> > sub <- genefilter(X,ffun) >> > sum(sub) >> [1] 1124 >> >> I save this subset of genes and then log transform it. But when I type >> the next command I get the following error: >> > X <- X[sub,] >> > X <- log2(X) >> > RawDataSub <- Raw.Data[,sub] >> Error in Raw.Data[, sub] : (subscript) logical subscript too long >> >> Why do I get this error?? > > Perhaps because the dimensions of X and of Raw.Data are not the > same? If you are not familiar with R you should spend some time with > introductory material to learn about the language as that knowledge > is essential for debugging. > >> Also, if I have stored the subset expression data as X, why is Raw.Data >> [,sub] using [,sub] again? I don?t really understand this step, if >> anyone could explain its purpose. 
>> > > Because X and Raw.Data are not the same object. R basically has pass > by value semantics and so (almost) everything is a copy. > >> I?m running R 1.9.1 on an XP computer >> >> Thanks a lot for your help >> >> David >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor@stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/bioconductor > > -- > +------------------------------------------------------------------- --------+ > | Robert Gentleman phone : (617) 632-5250 > | > | Associate Professor fax: (617) 632-2444 > | > | Department of Biostatistics office: M1B20 > | > | Harvard School of Public Health email: rgentlem@jimmy.harvard.edu > | > +------------------------------------------------------------------- --------+ > > > > ------------------------------ > > Message: 4 > Date: Tue, 9 Nov 2004 08:32:08 -0500 (EST) > From: John Zhang <jzhang@jimmy.harvard.edu> > Subject: Re: [BioC] AnnBuilder bug // R-2.0.0 // getList4GO > To: hathanassiou@automatedcell.com > Cc: bioconductor@stat.math.ethz.ch > Message-ID: <200411091332.IAA26906@blaise.dfci.harvard.edu> > Content-Type: TEXT/plain; charset=us-ascii > > Thanks. I will have a look at the code and fix the problem. 
> >>From: "Harry Athanassiou" <hathanassiou@automatedcell.com> >>To: <bioconductor@stat.math.ethz.ch> >>Date: Tue, 9 Nov 2004 01:22:42 -0500 >>MIME-Version: 1.0 >>X-Priority: 3 (Normal) >>X-MSMail-Priority: Normal >>Importance: Normal >>X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 >>X-ELNK-Trace: > 5cb454646877e76194f5150ab1c16ac08f4233f47979de267864528e82c1f9d01709 5c3ef67204ee > 350badd9bab72f9c350badd9bab72f9c350badd9bab72f9c >>X-Originating-IP: 70.20.82.76 >>Received-SPF: none (hypatia: domain of >> bioconductor-bounces@stat.math.ethz.ch > does not designate permitted sender hosts) >>Received-SPF: none (hypatia: domain of hathanassiou@automatedcell.com >> does not > designate permitted sender hosts) >>X-Virus-Scanned: by amavisd-new at stat.math.ethz.ch >>Content-Transfer-Encoding: 8bit >>X-MIME-Autoconverted: from quoted-printable to 8bit by >> hypatia.math.ethz.ch id > iA96MkAl017284 >>Subject: [BioC] AnnBuilder bug // R-2.0.0 // getList4GO >>X-BeenThere: bioconductor@stat.math.ethz.ch >>X-Mailman-Version: 2.1.5 >>List-Id: The Bioconductor Project Mailing List >> <bioconductor.stat.math.ethz.ch> >>List-Unsubscribe: <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor="">, > <mailto:bioconductor-request@stat.math.ethz.ch?subject=unsubscribe> >>List-Archive: <https: stat.ethz.ch="" pipermail="" bioconductor=""> >>List-Post: <mailto:bioconductor@stat.math.ethz.ch> >>List-Help: <mailto:bioconductor- request@stat.math.ethz.ch?subject="help"> >>List-Subscribe: <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor="">, > <mailto:bioconductor-request@stat.math.ethz.ch?subject=subscribe> >>X-Spam-Checker-Version: SpamAssassin 2.60-rc1 (1.197-2003-08-21-exp) on > blaise.dfci.harvard.edu >>X-Spam-Status: No, hits=0.0 required=5.0 tests=none autolearn=ham > version=2.60-rc1 >>X-Spam-Level: >> >>I'm trying to use AnnBuilder to make some custom annotation files for a >>non-standard microarray chip. 
In running the tests with R-2.0.0, I run >>acroos a problem in the function getList4GO. I'm not sure if this issue >> is >>due to R-2.0.0 or not. >> >>Here's the issue: >>when the sub-function procOne is called by sapply, the names(goids) is >> NULL. >>Thus when procOne calls : >> apply(temp, 1, vect2List, vectNames = c("GOID", "Evidence", >> "Ontology")) >>the number of list-elements to be named is mismatched. >> >>I do not know how to make sapply pass the names() of its first argument >> to >>the FUN() it calls, so I modified procOne->procOne.new to drop the column >>"Evidence". >>And add this column with a trick afterwards. >> >>I'm sure this is not the best solution, just worked for me >> >>>>> >>getList4GO <- function (goNCat, goNEvi) >>{ >> procOne <- function(goids) { >> if (is.null(goids) || is.na(goids)) { >> return(NA) >> } >> else { >> temp <- cbind(goids, names(goids), goNCat[goids]) >> rownames(temp) <- goids >> return(apply(temp, 1, vect2List, vectNames = c("GOID", >>"Evidence", "Ontology"))) >> } >> } >> >> # the names(goids) do not get propagated through the sapply() in >>R-2.0.0! 
>> # remove the column evidence >> procOne.new <- function(goids) { >> if (is.null(goids) || is.na(goids)) { >> return(NA) >> } >> else { >> temp <- cbind(goids, goNCat[goids]) >> rownames(temp) <- goids >> return(apply(temp, 1, vect2List, vectNames = c("GOID", >>"Ontology"))) >> } >> } >> >> temp <- sapply(goNEvi, procOne.new) >> names(temp) <- 1:length(temp) >> >> # add the evidence list-element >> # do not know a better way will do a loop on an index to acc two >> arrays >>at the same time >> for (r in 1:length(goNEvi)) { >> if (!is.na(temp[r])) { >> temp[[r]] <- c(temp[[r]], "Evidence"=names(goNEvi)[r]) >> } >> } >> >> return(temp) >>} >>>>> >> >>Harry Athanassiou >>BioInformatics manager >>Automated Cell, Inc >> >>_______________________________________________ >>Bioconductor mailing list >>Bioconductor@stat.math.ethz.ch >>https://stat.ethz.ch/mailman/listinfo/bioconductor > > Jianhua Zhang > Department of Biostatistics > Dana-Farber Cancer Institute > 44 Binney Street > Boston, MA 02115-6084 > > > > ------------------------------ > > Message: 5 > Date: Tue, 09 Nov 2004 16:30:53 +0100 > From: Julia Engelmann <julia.engelmann@biozentrum.uni-wuerzburg.de> > Subject: [BioC] comparing different experiments > To: bioconductor@stat.math.ethz.ch > Message-ID: <4190E2AD.1060501@biozentrum.uni-wuerzburg.de> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > Hi list, > > I wonder if I can compare Affymetrix arrays of the same type (ATH1) > which were made in different laboratories and with different tissue > types and different references. I have: "tissue1 treated", "tissue1 > untreated" from one lab and "tissue2 treated", "tissue2 untreated" from > the other lab. > The references (untreated) are different because of the different > tissue types. I am interested in the difference between tissue1 treated > and tissue2 treated, so I thought I could use limma to make a contrast: > (tissue1_treated-tissue1_untreated)-(tissue2_treated- tissue2_untreated). 
> I am not sure if this is valid, though? For example, I do not account > for the different labs that way. > Maybe it is just possible to analyse each experiment by itself and > compare the results at a latter stage, say compare lists of > differentially expressed genes? > > Any advice, comments or hints are highly appreciated, > > Julia > > > > ------------------------------ > > Message: 6 > Date: Tue, 09 Nov 2004 15:37:56 +0000 > From: "Giulio Di Giovanni" <perimessaggini@hotmail.com> > Subject: [BioC] Problems with heatmap on genes... > To: bioconductor@stat.math.ethz.ch > Message-ID: <bay10-f38bmhwrhiu340003395b@hotmail.com> > Content-Type: text/plain; charset=iso-8859-1; format=flowed > > > Hi, > > I'm trying to have a clear figure of gene clusters using heatmaps, but > with > more than 100-200 genes it's not possible to do it, with default options > (and I would like to do that with 1500 genes or so...). Gene names (and > branchs too) collapse together... > > I tried, setting new device dimensions (jpeg() or png() height and width), > and modifying par() options (fin, etc..), to have long cluster figures (to > be clear, dChip style). Well, it works for others high-level graphical > functions, but it doesn't work for heatmaps(). I always obtain big > figures, > but with exactely the same squared heatmap inside. > > I spent long time on the documentation and searching the web, and when I > found something, it was always some heatmaps for 50-100 genes at max > > I trust that someone working on gene clustering is confidential on this, > and I will appreciate a lot any suggestion... I almost became crazy on > that > !!! 
> > Thanks in advance, > > Giulio > > > > ------------------------------ > > Message: 7 > Date: Tue, 9 Nov 2004 11:54:55 -0500 > From: "Kimpel, Mark W" <mkimpel@iupui.edu> > Subject: [BioC] help with limma contrast matrix > To: <bioconductor@stat.math.ethz.ch> > Message-ID: > <2E6C5260C7C387449A96DF46EE76313C017D8985@iu-mssg- mbx02.exchange.iu.edu> > > Content-Type: text/plain; charset="us-ascii" > > I would appreciate advice on how to construct a contrast matrix for a > 5X2 ANOVA design. Briefly, I have a genomic experiment to analyze that > compares 5 brain regions in 2 strains of rats. We are interested in > discovering overall differences between strains (collapsing all brain > regions together) but also discovering differences that may only be > expressed in one brain region. > > I have attempted to construct the appropriate matrix with the code > listed below, but it does not work. I seem to get differences between > strains, but all the brain region contrasts give exactly the same > results, so I know something isn't correct. > > contrast <-makeContrasts( > > ( > (NPAccumbens + NPAmygdala + NPHippocampus + NPPrefrontal_Cortex > + NPStriatum) - #all regions of strain "NP" > > (PAccumbens + PAmygdala + PHippocampus + PPrefrontal_Cortex + > PStriatum) #all regions of strain "P" > > ), > > (NPAccumbens - PAccumbens), > #accumbens region of both strains > > (NPAmygdala - PAmygdala), > #amygdala region of both strains > > (NPHippocampus - PHippocampus), > #hippocampus region of both strains > > (NPPrefrontal_Cortex - PPrefrontal_Cortex), > #Prefrontal_Cortex region of both strains > > (NPStriatum - PStriatum), > #striatum region of both strains > > levels=design) > > > Thanks! > > Mark > > Mark W. 
Kimpel MD > > > > (317) 490-5129 Home, Work, & Mobile > > (317) 278-4104 FAX > > > > ------------------------------ > > Message: 8 > Date: Tue, 9 Nov 2004 16:54:09 -0000 > From: "michael watson (IAH-C)" <michael.watson@bbsrc.ac.uk> > Subject: RE: [BioC] Problems with heatmap on genes... > To: "Giulio Di Giovanni" <perimessaggini@hotmail.com>, > <bioconductor@stat.math.ethz.ch> > Message-ID: > <8975119BCD0AC5419D61A9CF1A923E950121B868@iahce2knas1.iah.bbsr c.reserved> > > Content-Type: text/plain; charset="Windows-1252" > > Hi > > There's a function called heatmap.2 in the gregmisc library that will > resize properly when you send it to a long png() or jpg(). > > It's similar to, but not the same as, heatmap() so read the docs! > > Mick > > > -----Original Message----- > From: Giulio Di Giovanni [mailto:perimessaggini@hotmail.com] > Sent: Tue 11/9/2004 3:37 PM > To: bioconductor@stat.math.ethz.ch > Cc: > Subject: [BioC] Problems with heatmap on genes... > > Hi, > > I'm trying to have a clear figure of gene clusters using heatmaps, but > with > more than 100-200 genes it's not possible to do it, with default options > (and I would like to do that with 1500 genes or so...). Gene names (and > branchs too) collapse together... > > I tried, setting new device dimensions (jpeg() or png() height and width), > and modifying par() options (fin, etc..), to have long cluster figures (to > be clear, dChip style). Well, it works for others high-level graphical > functions, but it doesn't work for heatmaps(). I always obtain big > figures, > but with exactely the same squared heatmap inside. > > I spent long time on the documentation and searching the web, and when I > found something, it was always some heatmaps for 50-100 genes at max > > I trust that someone working on gene clustering is confidential on this, > and I will appreciate a lot any suggestion... I almost became crazy on > that > !!! 
> > Thanks in advance, > > Giulio > > _______________________________________________ > Bioconductor mailing list > Bioconductor@stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > > > > ------------------------------ > > Message: 9 > Date: Tue, 9 Nov 2004 09:14:23 -0800 (PST) > From: jeffrey rasmussen <rasmuss@u.washington.edu> > Subject: Re: [BioC] Problems with heatmap on genes... > To: Giulio Di Giovanni <perimessaggini@hotmail.com> > Cc: bioconductor@stat.math.ethz.ch > Message-ID: > <pine.a41.4.61b.0411090909320.309756@homer11.u.washington.edu> > Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed > > Hi Giulio, > > If you have access to Adobe Illustrator, you could write your heatmap to a > postscript file using postscript() and then open and edit the file in > Illustrator. I've found that in many cases this is much easier than > wrangling with the plotting parameters in R, in particular when it comes > to fonts. Otherwise, trying to display > 50 genes on a heatmap becomes > prohibitively difficult. > > Best, > Jeff. > > On Tue, 9 Nov 2004, Giulio Di Giovanni wrote: > >> >> Hi, >> >> I'm trying to have a clear figure of gene clusters using heatmaps, but >> with >> more than 100-200 genes it's not possible to do it, with default options >> (and >> I would like to do that with 1500 genes or so...). Gene names (and >> branchs >> too) collapse together... >> >> I tried, setting new device dimensions (jpeg() or png() height and >> width), >> and modifying par() options (fin, etc..), to have long cluster figures >> (to be >> clear, dChip style). Well, it works for others high-level graphical >> functions, but it doesn't work for heatmaps(). I always obtain big >> figures, >> but with exactely the same squared heatmap inside. 
>> >> I spent long time on the documentation and searching the web, and when I >> found something, it was always some heatmaps for 50-100 genes at max >> >> I trust that someone working on gene clustering is confidential on this, >> and I will appreciate a lot any suggestion... I almost became crazy on >> that >> !!! >> >> Thanks in advance, >> >> Giulio >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor@stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> > > > > ------------------------------ > > Message: 10 > Date: Tue, 9 Nov 2004 13:26:19 -0800 (PST) > From: "Fangxin Hong" <fhong@salk.edu> > Subject: Re: [BioC] comparing different experiments > To: "Julia Engelmann" <julia.engelmann@biozentrum.uni-wuerzburg.de> > Cc: bioconductor@stat.math.ethz.ch > Message-ID: <1630.10.10.200.250.1100035579.squirrel@10.10.200.250> > Content-Type: text/plain;charset=iso-8859-1 > > >> I wonder if I can compare Affymetrix arrays of the same type (ATH1) >> which were made in different laboratories and with different tissue >> types and different references. I have: "tissue1 treated", "tissue1 >> untreated" from one lab and "tissue2 treated", "tissue2 untreated" from >> the other lab. >> The references (untreated) are different because of the different >> tissue types. I am interested in the difference between tissue1 treated >> and tissue2 treated, so I thought I could use limma to make a contrast: >> (tissue1_treated-tissue1_untreated)-(tissue2_treated- tissue2_untreated). >> I am not sure if this is valid, though? For example, I do not account >> for the different labs that way. >> Maybe it is just possible to analyse each experiment by itself and >> compare the results at a latter stage, say compare lists of >> differentially expressed genes? > Based on what I observed when study data generated at different lab, lab > effect can't not be completely removed by normalization step. 
If you do > have some replicates or several data sets from each lab, and you want to > combine data together, I would suggest you to inlcude a fixed effect for > lab factor. > Hopefully this will help. > > Fangxin > > > _______________________________________________ >> Bioconductor mailing list >> Bioconductor@stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> >> > > > -- > Fangxin Hong, Ph.D. > Plant Biology Laboratory > The Salk Institute > 10010 N. Torrey Pines Rd. > La Jolla, CA 92037 > E-mail: fhong@salk.edu > > > > ------------------------------ > > Message: 11 > Date: Tue, 9 Nov 2004 16:46:03 -0500 (EST) > From: "Sucheta Tripathy" <sutripa@vbi.vt.edu> > Subject: [BioC] affy segmentation fault > To: bioconductor@stat.math.ethz.ch > Message-ID: <1815.199.3.136.4.1100036763.squirrel@webmail.vbi.vt.edu> > Content-Type: text/plain;charset=iso-8859-1 > > > I know we have been cluttering this mailing list with this question over > and again. The reason I want to ask again is after seeing the segmentation > fault error, I found it says 340000 KB to be the size it needs. > > What puzzles me is our memory is way beyond that(almost 5 GB with 10 GB > swap memory). > > After trying all the remedies, it still fails. Can anyone suggest if in > the source where the exact memory allocation takes place, how much is > fixed to be the size. Can we not increase it? Or to begin with which > version of affy package has a fix for it. > > Thanks in advance. > > Sucheta > > -- > Sucheta Tripathy > Virginia Bioinformatics Institute Phase-I > Washington street. > Virginia Tech. 
> Blacksburg,VA 24061-0447 > phone:(540)231-8138 > Fax: (540) 231-2606 > > > > ------------------------------ > > Message: 12 > Date: Tue, 09 Nov 2004 22:46:04 +0000 > From: Adaikalavan Ramasamy <ramasamy@cancer.org.uk> > Subject: Re: [BioC] affy segmentation fault > To: Sucheta Tripathy <sutripa@vbi.vt.edu> > Cc: BioConductor mailing list <bioconductor@stat.math.ethz.ch> > Message-ID: <1100040364.3326.10.camel@localhost.localdomain> > Content-Type: text/plain > > I just checked the mailing archives. You sent 2 mails in Novembers > (excluding this) and 2 in October but none of them talk about > segmentation fault error. Perhaps you can explain who "we" are or better > yet state the problem or link to past mail (perhaps from > https://stat.ethz.ch/pipermail/bioconductor/). > > Start from a clean R session and see if you can repeat the problem. > Next, reduce the number of arrays till you find out how many arrays your > machine can handle. Try just.rma or just.gcrma. Also search the mailing > archives. These are all guesses. > > Note that although 5 GB is available to a machine, there might be a > limit to how much each process/user can have access to. Speak to your > system administrator about any such limitation. > > Regards, Adai > > > > On Tue, 2004-11-09 at 21:46, Sucheta Tripathy wrote: >> I know we have been cluttering this mailing list with this question over >> and again. The reason I want to ask again is after seeing the >> segmentation >> fault error, I found it says 340000 KB to be the size it needs. >> >> What puzzles me is our memory is way beyond that(almost 5 GB with 10 GB >> swap memory). >> >> After trying all the remedies, it still fails. Can anyone suggest if in >> the source where the exact memory allocation takes place, how much is >> fixed to be the size. Can we not increase it? Or to begin with which >> version of affy package has a fix for it. >> >> Thanks in advance. 
>> >> Sucheta > > > > ------------------------------ > > Message: 13 > Date: Tue, 09 Nov 2004 14:49:05 -0800 > From: Ben Bolstad <bolstad@stat.berkeley.edu> > Subject: Re: [BioC] affy segmentation fault > To: Sucheta Tripathy <sutripa@vbi.vt.edu> > Cc: bioconductor@stat.math.ethz.ch > Message-ID: <1100040545.2398.70.camel@bmbbox.dyndns.org> > Content-Type: text/plain > > Please wait for the next version of the affy package 1.6.0 which should > appear on the web in a few days. It has the requisite fix to deal with > your soybean arrays. > > Ben > > > > On Tue, 2004-11-09 at 13:46, Sucheta Tripathy wrote: >> I know we have been cluttering this mailing list with this question over >> and again. The reason I want to ask again is after seeing the >> segmentation >> fault error, I found it says 340000 KB to be the size it needs. >> >> What puzzles me is our memory is way beyond that(almost 5 GB with 10 GB >> swap memory). >> >> After trying all the remedies, it still fails. Can anyone suggest if in >> the source where the exact memory allocation takes place, how much is >> fixed to be the size. Can we not increase it? Or to begin with which >> version of affy package has a fix for it. >> >> Thanks in advance. >> >> Sucheta > -- > Ben Bolstad <bolstad@stat.berkeley.edu> > http://www.stat.berkeley.edu/~bolstad > > > > ------------------------------ > > Message: 14 > Date: Tue, 09 Nov 2004 18:51:17 -0500 > From: Sucheta Tripathy <sutripa@vbi.vt.edu> > Subject: Re: [BioC] affy segmentation fault > To: bioconductor@stat.math.ethz.ch > Message-ID: <5.1.0.14.0.20041109183629.01f966c8@mail.vbi.vt.edu> > Content-Type: text/plain; charset="us-ascii"; format=flowed > > At 04:59 PM 11/9/2004 -0500, Robert Gentleman wrote: >>On Tue, Nov 09, 2004 at 04:46:03PM -0500, Sucheta Tripathy wrote: >> > >> > I know we have been cluttering this mailing list with this question >> over >> > and again. 
The reason I want to ask again is after seeing the >> segmentation >> > fault error, I found it says 340000 KB to be the size it needs. >> > >> > What puzzles me is our memory is way beyond that(almost 5 GB with 10 >> GB >> > swap memory). >> >> And as I have said very many times already, it likely has nothing to >> do with that, but rather that you have a corrupted installation. You >> almost surely need to recompile R with the correct set of compiler >> flags for your system and to reinstall the the appropriate >> packages. I am not sure how I can say this more explicitly, but the >> problem does not seem to be affy, it seems to be your installation. > > I guess at this point if any body else who has done installation and > compilation with any other flag, shares the flags they have used, I will > really appreciate that. After digging through the installation > instruction, > I don't find anything other than > > $ ./configure > $ make > > with may be a option to prefix path.(where R binaries and libraries should > go). > > Probably I need help from someone who can point where to find a more > detailed installation help. I have been also looking at file config.site, > and most of the default options look fine to me. > > If it is just the case of R being corrupted,is no big deal provided we > know > what flags we are using to compile next. > > -Sucheta > >> Robert >> >> >> > >> > After trying all the remedies, it still fails. Can anyone suggest if >> in >> > the source where the exact memory allocation takes place, how much is >> > fixed to be the size. Can we not increase it? Or to begin with which >> > version of affy package has a fix for it. >> > >> > Thanks in advance. >> > >> > Sucheta >> > >> > -- >> > Sucheta Tripathy >> > Virginia Bioinformatics Institute Phase-I >> > Washington street. >> > Virginia Tech. 
>> > Blacksburg,VA 24061-0447 >> > phone:(540)231-8138 >> > Fax: (540) 231-2606 >> > >> > _______________________________________________ >> > Bioconductor mailing list >> > Bioconductor@stat.math.ethz.ch >> > https://stat.ethz.ch/mailman/listinfo/bioconductor >> >>-- >>+------------------------------------------------------------------- --------+ >>| Robert Gentleman phone : (617) 632-5250 >> | >>| Associate Professor fax: (617) 632-2444 >> | >>| Department of Biostatistics office: M1B20 >> | >>| Harvard School of Public Health email: rgentlem@jimmy.harvard.edu >> | >>+------------------------------------------------------------------- --------+ > > > > ------------------------------ > > Message: 15 > Date: Wed, 10 Nov 2004 08:16:32 +0000 > From: Adaikalavan Ramasamy <ramasamy@cancer.org.uk> > Subject: Re: [BioC] affy segmentation fault > To: Sucheta Tripathy <sutripa@vbi.vt.edu> > Cc: BioConductor mailing list <bioconductor@stat.math.ethz.ch> > Message-ID: <1100074592.7513.41.camel@localhost.localdomain> > Content-Type: text/plain > > A normal installation procedure for me would be something like : > > make clean # or make distclean if you tried configuring before > ./configure --prefix=/home/adai/R > make > make check > make install > > There are variants of versions of 'make check' such as 'make check- all' > which are more comprehensive testing (see page 3 of R-admin). > > I do not know comprehend the flags and various options. If there is an > error or problem, I usually get my system administrator involved and > failing that I would try R-help mailing which is the more appropriate > place. > > And when you email R-help, please mention some vital information such as > your operating system (and kernel), gcc version, R version. Have you > tried checking R-help or BioC mailing archives ? > > BTW, does Ben Bolstad's reply about affy 1.6.0. answer your question ? 
> > > > On Tue, 2004-11-09 at 23:51, Sucheta Tripathy wrote: >> At 04:59 PM 11/9/2004 -0500, Robert Gentleman wrote: >> >On Tue, Nov 09, 2004 at 04:46:03PM -0500, Sucheta Tripathy wrote: >> > > >> > > I know we have been cluttering this mailing list with this question >> over >> > > and again. The reason I want to ask again is after seeing the >> segmentation >> > > fault error, I found it says 340000 KB to be the size it needs. >> > > >> > > What puzzles me is our memory is way beyond that(almost 5 GB with 10 >> GB >> > > swap memory). >> > >> > And as I have said very many times already, it likely has nothing to >> > do with that, but rather that you have a corrupted installation. You >> > almost surely need to recompile R with the correct set of compiler >> > flags for your system and to reinstall the the appropriate >> > packages. I am not sure how I can say this more explicitly, but the >> > problem does not seem to be affy, it seems to be your installation. >> >> I guess at this point if any body else who has done installation and >> compilation with any other flag, shares the flags they have used, I will >> really appreciate that. After digging through the installation >> instruction, >> I don't find anything other than >> >> $ ./configure >> $ make >> >> with may be a option to prefix path.(where R binaries and libraries >> should go). >> >> Probably I need help from someone who can point where to find a more >> detailed installation help. I have been also looking at file >> config.site, >> and most of the default options look fine to me. >> >> If it is just the case of R being corrupted,is no big deal provided we >> know >> what flags we are using to compile next. >> >> -Sucheta >> >> > Robert >> > >> > >> > > >> > > After trying all the remedies, it still fails. Can anyone suggest if >> in >> > > the source where the exact memory allocation takes place, how much >> is >> > > fixed to be the size. Can we not increase it? 
>> > > Or to begin with which
>> > > version of affy package has a fix for it.
>> > >
>> > > Thanks in advance.
>> > >
>> > > Sucheta
>> > >
>> > > --
>> > > Sucheta Tripathy
>> > > Virginia Bioinformatics Institute Phase-I
>> > > Washington street.
>> > > Virginia Tech.
>> > > Blacksburg,VA 24061-0447
>> > > phone:(540)231-8138
>> > > Fax: (540) 231-2606
>> > >
>> > > _______________________________________________
>> > > Bioconductor mailing list
>> > > Bioconductor@stat.math.ethz.ch
>> > > https://stat.ethz.ch/mailman/listinfo/bioconductor
>> >
>> >--
>> >+---------------------------------------------------------------------------+
>> >| Robert Gentleman                 phone : (617) 632-5250                   |
>> >| Associate Professor              fax:    (617) 632-2444                   |
>> >| Department of Biostatistics      office: M1B20                            |
>> >| Harvard School of Public Health  email:  rgentlem@jimmy.harvard.edu       |
>> >+---------------------------------------------------------------------------+
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor@stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>
>
>
>
> ------------------------------
>
> Message: 16
> Date: Wed, 10 Nov 2004 09:43:43 +0100
> From: "Johan Lindberg" <johanl@biotech.kth.se>
> Subject: RE: [BioC] Problems with heatmap on genes...
> To: "'Giulio Di Giovanni'" <perimessaggini@hotmail.com>,
>     <bioconductor@stat.math.ethz.ch>
> Message-ID: <000b01c4c701$62f059b0$27230a0a@biochem.kth.se>
> Content-Type: text/plain; charset="US-ASCII"
>
> Hi Giulio. Heatmap is as you say a great tool if you have a small number
> of genes but NOT if you have a lot of genes. I was dealing with the same
> thing as you are doing now some 6 months ago and I found no good solution
> using Heatmap. Therefore we use the freeware (note freeware) MeV from
> TIGR at our department to do hierarchical clustering and similar things.
>
> http://www.tigr.org/software/tm4/mev.html
>
> What we have done is to write a script (exportMEV) that takes an
> MA-object (package Aroma in R) and exports that object to MeV format, and
> we use it when doing clustering.
> http://www.biotech.kth.se/molbio/microarray/pages/kthpackagetransfer.html
>
> Best regards
>
> // Johan Lindberg
>
>
>
> -----Original Message-----
> From: bioconductor-bounces@stat.math.ethz.ch
> [mailto:bioconductor-bounces@stat.math.ethz.ch] On Behalf Of Giulio Di Giovanni
> Sent: Tuesday, November 09, 2004 4:38 PM
> To: bioconductor@stat.math.ethz.ch
> Subject: [BioC] Problems with heatmap on genes...
>
>
> Hi,
>
> I'm trying to have a clear figure of gene clusters using heatmaps, but with
> more than 100-200 genes it's not possible to do it, with default options
> (and I would like to do that with 1500 genes or so...). Gene names (and
> branches too) collapse together...
>
> I tried setting new device dimensions (jpeg() or png() height and width),
> and modifying par() options (fin, etc..), to have long cluster figures (to
> be clear, dChip style). Well, it works for other high-level graphical
> functions, but it doesn't work for heatmap(). I always obtain big figures,
> but with exactly the same squared heatmap inside.
>
> I spent a long time on the documentation and searching the web, and when I
> found something, it was always some heatmaps for 50-100 genes at max
>
> I trust that someone working on gene clustering is confident with this,
> and I will appreciate a lot any suggestion... I almost became crazy on that !!!
>
> Thanks in advance,
>
> Giulio
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>
>
>
> ------------------------------
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>
>
> End of Bioconductor Digest, Vol 21, Issue 10
> ********************************************
>

Microarray Annotation Normalization Clustering Cancer affy limma AnnBuilder BRAIN Cancer • 1.6k views

updated 21.0 years ago by Jenny Bryan ▴ 110 • written 21.0 years ago by Auer Michael ▴ 250


Jenny Bryan ▴ 110

@jenny-bryan-949

Last seen 11.2 years ago

> From: "Auer Michael" <michael.auer@meduniwien.ac.at>
>
> I would like to know whether there exists the possibility to cluster genes
> non-hierarchically, but with the correlation as distance measure? K-means,
> clara, pam, etc, only seem to work with euclidean metrics. I ask the

Many clustering algorithms, pam for example, will accept a dissimilarity object as input. The limitation you perceive arises only if you ask the pam function itself to compute the dissimilarity for you. Below is a tiny example of how to use a '1 minus correlation' type of dissimilarity.

############################
library(cluster)
library(MASS)

## simulate two groups of 4 "genes" (rows), each measured on 3 samples (columns)
Sigma.x <- matrix(0.7, nrow = 3, ncol = 3)
diag(Sigma.x) <- 1
x <- mvrnorm(n = 4, mu = c(3, 5, 3), Sigma = Sigma.x)

Sigma.y <- matrix(0.6, nrow = 3, ncol = 3)
diag(Sigma.y) <- 1
y <- mvrnorm(n = 4, mu = rep(1, 3), Sigma = Sigma.y)

z <- rbind(x, y)

## plot each gene's profile across the 3 samples, coloured by group
matplot(1:3, t(z), col = rep(c("red", "green"), each = 4), type = "l", lty = 1)

## cor() works on columns, so transpose to correlate genes with genes;
## abs() treats strongly anti-correlated genes as similar (drop it for plain 1 - correlation)
cor.dist.z <- as.dist(1 - abs(cor(t(z))))

## pam() recognises the "dist" object and clusters on it directly
pamfit <- pam(cor.dist.z, k = 2)
plot(pamfit)

--
Jenny Bryan
*----------------------------------*
* Assistant Professor              *
* Department of Statistics and     *
*   the Michael Smith Laboratories *
* University of British Columbia   *
*----------------------------------*
333-6356 Agricultural Road
Vancouver, BC V6T 1Z2 Canada
tel: 604.822.6422
fax: 604.822.6960
email: jenny@stat.ubc.ca
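A related point for the case where the number of genes is too large to hold a full dissimilarity matrix: if each gene's profile is centred and scaled to unit length, the squared Euclidean distance between two profiles equals 2 * (1 - correlation). Euclidean-only methods such as kmeans (or clara) run on the standardised rows therefore behave much like clustering on the plain, signed '1 minus correlation' dissimilarity. The sketch below is only an illustration of that identity and is not taken from the thread; the data are simulated, and `expr`, `row_standardise` and the choice of 4 clusters are hypothetical names and values. Genes with zero variance would need to be filtered out first, since they cannot be scaled.

```r
set.seed(1)

# Hypothetical toy data: 200 "genes" (rows) measured on 10 samples (columns).
expr <- matrix(rnorm(200 * 10), nrow = 200, ncol = 10)

# Centre each row on its mean and scale it to unit Euclidean length.
# (Illustrative helper, not a function from any package.)
row_standardise <- function(m) {
  centred <- m - rowMeans(m)
  centred / sqrt(rowSums(centred^2))
}
z <- row_standardise(expr)

# Check the identity on the first two genes:
# squared Euclidean distance of standardised rows == 2 * (1 - Pearson correlation)
sq_euclid   <- sum((z[1, ] - z[2, ])^2)
one_minus_r <- 1 - cor(expr[1, ], expr[2, ])
stopifnot(isTRUE(all.equal(sq_euclid, 2 * one_minus_r)))

# k-means on the standardised rows; the 200 x 200 distance matrix is never formed.
km <- kmeans(z, centers = 4, nstart = 10)
table(km$cluster)
```

Because the identity uses signed correlation, this corresponds to `1 - cor(...)` rather than the `1 - abs(cor(...))` variant; anti-correlated genes end up far apart.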

21.0 years ago • Jenny Bryan ▴ 110
