Memory error in Mac OS X Aqua GUI v1.01 with cluster package functions
@betty-gilbert-1120
I'm sorry if the answer to my problem is buried in the archives. I have limited experience with R and I couldn't find a solution to my particular problem. I am running the Mac OS X Aqua GUI v1.01 on a new G5 running OS X 10.3.8 with a 1.8 GHz processor and 1 GB of SDRAM. I downloaded Bioconductor a week ago and I'm trying to cluster a matrix I created with a simulation, with dimensions

> dim(nca35)
[1] 10481 12

and size

> object.size(nca352)
[1] 1426204

I checked my ulimit settings in the Unix terminal and they say unlimited, as does

> mem.limits()
nsize vsize
   NA    NA

But I'm still getting errors like the following with functions in the cluster package:

> daisy(nca352, metric= "euclidean", stand=FALSE)->dnca35
Error: cannot allocate vector of size 858213 Kb
*** malloc: vm_allocate(size=878813184) failed (error code=3)
*** malloc[599]: error: Can't allocate region

If it helps, I also checked

> gc()
         used (Mb) gc trigger   (Mb)
Ncells 448662 12.0     741108   19.8
Vcells 847630  6.5  135357901 1032.7

I tried the suggested Unix command in the memory help doc, but that doesn't work in the Aqua GUI. Can someone tell me how to change the Vcells? Although to the best of my understanding (which is limited) I shouldn't have to do that. Any suggestions would be greatly appreciated.

thanks,
betty

--
Betty Gilbert
lgilbert@berkeley.edu
Taylor Lab
Plant and Microbial Biology
321 Koshland Hall
U.C. Berkeley
Berkeley, Ca 94720
@james-w-macdonald-5106
Betty Gilbert wrote:

> I'm sorry if the answer to my problem is buried in the archives. I have
> limited experience with R and I couldn't find a solution to my
> particular problem. I am running Mac OS X Aqua GUI v1.01 on a new G5
> running os 10.3.8 with a 1.8Ghz processor and 1GB of sdram. I just
> downloaded bioconducter a week ago and I'm trying to cluster a matrix I
> created with a simulation with dimensions
> dim(nca35)
> [1] 10481 12

Making the assumption that you are simulating microarray data, I don't see the purpose of clustering such a large set of data. The usual approach is to whittle the data down to those genes thought to be 'important' based on some metric, and then to cluster this smaller subset. See the genefilter package for some examples.

If you were to post this sort of message on R-help, someone who knows more about the bits and bytes of computers than I would probably calculate how much memory the distance matrix would require for this object. I believe it would end up being something like this:

10481 * 10481 * 8 bytes ≈ 0.82 Gb,

which matches the 858213 Kb in your error message. That is more than 80% of your RAM just to hold your dist matrix in memory, so you need more memory to do this.

> with size
>
>> object.size(nca352)
>
> [1] 1426204
>
> I checked my ulimits variable on the unix terminal and it says it's
> unlimited as does
>
>> mem.limits()
>
> nsize vsize
> NA NA
> But I'm still getting errors like the following with funtions in the
> cluster package
>
>> daisy(nca352, metric= "euclidean", stand=FALSE)->dnca35

daisy() computes dissimilarities and is designed for variables of mixed types. However, your data are all numeric, so this is probably not the method you want. If you really want to use all rows, you may be able to get by with clara(), which is designed for clustering large objects such as this.

Jim

> Error: cannot allocate vector of size 858213 Kb
> *** malloc: vm_allocate(size=878813184) failed (error code=3)
> *** malloc[599]: error: Can't allocate region
> if it helps i also checked
>
>> gc()
>
> used (Mb) gc trigger (Mb)
> Ncells 448662 12.0 741108 19.8
> Vcells 847630 6.5 135357901 1032.7
>
> I tried the suggested unix command in the memory help doc but that
> doesn't work in the Aqua GUI. Can someone tell me how to change the
> Vcells? Although to the best of my understanding (which is limited) I
> shouldn't have to do that. Any suggestions would be greatly appreciated.
> thanks,
> betty

--
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
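For readers following along, the size of the allocation and the clara() suggestion can be checked with a short sketch. This is only an illustration under stated assumptions: the matrix `x` is simulated noise standing in for the poster's data, and `k = 4` and `samples = 20` are arbitrary values chosen for the example, not recommendations from the thread.

```r
# Memory needed for a full n x n matrix of doubles (8 bytes each).
n <- 10481                  # number of rows in the poster's matrix
bytes <- n * n * 8
kb <- bytes / 1024          # compare with "858213 Kb" in the error message
gb <- bytes / 1024^3
floor(kb)
round(gb, 2)

# clara() clusters repeated sub-samples of the rows, so it never builds
# the full n x n dissimilarity matrix.
# ASSUMPTION: simulated stand-in data; k and samples are arbitrary.
library(cluster)
set.seed(1)
x <- matrix(rnorm(n * 12), nrow = n, ncol = 12)
fit <- clara(x, k = 4, metric = "euclidean", stand = FALSE, samples = 20)
table(fit$clustering)       # cluster sizes
```

The first part reproduces the figure in the error message from the matrix dimensions alone; the second runs on all 10481 rows without requesting a single large allocation.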
Pre-filtering the expression data (usually for evidence of differential expression) has pretty dramatic effects on the clustering structure you will find. If your gene features were 2-dimensional, rather than 12, you could imagine scatterplotting the genes in the plane. A typical screen for differential expression would empty certain regions of that scatterplot and leave behind a very different point pattern. (Depending on the filter and the type of data you're simulating, it might empty an area around the origin and/or a corridor along the x = y line.) It will mostly be *that* structure that will (possibly) be recovered by the cluster analysis. The same thing will be operating in your 12-dimensional gene feature space, it's just a lot harder to illustrate.

Another way to work around your RAM constraint and still use the routines in cluster and still retain all your genes, would be to subdivide your genes into smaller groups in an explicit, supervised way and then enact unsupervised clustering on each group. You could then 'manually' merge the results into a global gene clustering.

Jenny

James W. MacDonald writes:

> Making the assumption that you are simulating microarray data, I don't
> see the purpose of clustering such a large set of data. The usual
> approach is to whittle the data down to those genes thought to be
> 'important' based on some metric, and then to cluster this smaller
> subset. See the genefilter package for some examples.
>
> Jim
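The subdivide-then-cluster idea above might look roughly like the following sketch. Everything specific here is an assumption for illustration: the data are simulated, the `group` factor is a hypothetical stand-in for whatever explicit criterion would define the groups, pam() is just one of the cluster routines that could be used, and `k = 3` is arbitrary.

```r
library(cluster)
set.seed(1)

# ASSUMPTION: simulated stand-in data, smaller than 10481 rows so the
# sketch runs quickly.
n <- 2000
x <- matrix(rnorm(n * 12), nrow = n, ncol = 12)

# HYPOTHETICAL supervised grouping; in practice this would come from
# annotation or another explicit criterion.
group <- factor(sample(c("A", "B", "C", "D"), n, replace = TRUE))

labels <- character(n)
for (g in levels(group)) {
  idx <- which(group == g)
  # Only the distances within one group are held in memory at a time.
  d <- dist(x[idx, ], method = "euclidean")
  fit <- pam(d, k = 3)      # k = 3 is arbitrary
  # Merge into a global labelling such as "A.1", "A.2", ..., "D.3".
  labels[idx] <- paste(g, fit$clustering, sep = ".")
}
table(labels)
```

With four groups of roughly equal size, each dist object needs about one thirty-second of the memory of a single distance structure over all rows, at the cost of never comparing genes across groups.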