Question: multtest MTP: Error: cannot allocate vector
11.3 years ago, Timur Shtatland wrote:
Dear all,

I am looking for differentially expressed genes using the MTP procedure in the multtest package. I am computing raw and adjusted t-test P values (group 1 = 7 samples, group 2 = 3 samples) using the bootstrap. R runs out of memory ('Error: cannot allocate vector') when the number of genes (5413) combined with the number of bootstrap iterations (B=10000) produces a large matrix. I am running R on a PowerBook G4, Mac OS 10.4.11. See the code and the output below.

Are there any other possible solutions besides 1-3 listed below? For example, can I arrange for only part of the null matrix, rather than the entire matrix, to be held in RAM at any given time?

The other possible solutions to this problem are:

1. Buy more memory - my choice #1 unless another solution is easily available. I assume that MTP is trying to read the entire null matrix into memory: 10000 iterations * 5413 genes * 10 samples * 8 bytes/(sample*gene*iteration) = 4.3 GB, which obviously does not fit into the current RAM (1.25 GB).

2. Use fewer bootstrap iterations - my choice #2, because with fewer iterations many genes have raw P values equal to exactly 0:
https://stat.ethz.ch/pipermail/bioconductor/2008-March/021396.html
https://stat.ethz.ch/pipermail/bioconductor/2008-March/021436.html

3. Use fewer genes - my choice #3, because it is not clear exactly what effect a more restrictive filter will have on the false negative rate (the rate of truly differentially expressed genes that will be filtered out). Currently I already use genefilter to reduce the number of genes for MTP input from 22283 to 5413:
ffun <- filterfun(pOverA(p = 0.5, A = 100), cv(a = 0.3))

I was not running any process other than R at the time of the error, to maximize available memory. The error always occurs at the end of the bootstrap iterations (which take 18-24 hours). I searched the MTP help page and its vignettes, as well as the Bioconductor mailing list archives.
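The size estimate in item 1 is plain arithmetic and can be reproduced in a few lines of R. This is only a sketch: `array_bytes` is a hypothetical helper, and the 8 bytes per value assumes every entry is stored as a double. It also shows how the same estimate scales with the number of bootstrap iterations B.

```r
# Hypothetical helper: bytes needed for a dense array of doubles
# with the given dimensions (8 bytes per value, object headers ignored).
array_bytes <- function(...) prod(...) * 8

genes   <- 5413
samples <- 10
ram_gb  <- 1.25   # installed RAM reported in the post

# The estimate from the post: iterations x genes x samples x 8 bytes.
for (B in c(1000, 5000, 10000)) {
  gb <- array_bytes(B, genes, samples) / 1e9
  cat(sprintf("B = %5d: %.2f GB (%s installed RAM)\n", B, gb,
              if (gb <= ram_gb) "below" else "above"))
}
```

Being below the installed RAM is a necessary condition only; R and the operating system need memory of their own, and intermediate copies can multiply the requirement.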
The solution 'buy more memory' appears to be the most commonly suggested on this mailing list for other assorted 'out of memory' problems, but I was wondering if there is an easy way around it.

Thank you for your help.

Best regards,
Timur

--
Timur Shtatland, PhD
Center for Molecular Imaging Research
Massachusetts General Hospital
149 13th Street, Room 5408
Charlestown, MA 02129
tshtatland at mgh dot harvard dot edu

############################################################
## read *only* the gcrma-processed dataset (nothing else) into a new R session
## to reduce the number of objects in memory:
> load("esetGcrma.rda")
...
> B=10000
...
> ffun <- filterfun(pOverA(p = 0.5, A = 100), cv(a = 0.3))
> filtered <- genefilter(2^exprs(esetGcrma), ffun)
...
> TTBoot <- MTP(X=esetGcrmaExprsFiltered, Y=TT, test = "t.twosamp.unequalvar",
      alternative = "two.sided", typeone="fdr", method="ss.maxT",
      fdr.method="conservative", keep.nulldist = FALSE, B=B, seed=seed)
running bootstrap...
iteration = 100 200 300 400 500 600 700 800 900 1000
1100 1200 1300 1400 1500 1600 1700 1800 1900 2000
2100 2200 2300 2400 2500 2600 2700 2800 2900 3000
3100 3200 3300 3400 3500 3600 3700 3800 3900 4000
4100 4200 4300 4400 4500 4600 4700 4800 4900 5000
5100 5200 5300 5400 5500 5600 5700 5800 5900 6000
6100 6200 6300 6400 6500 6600 6700 6800 6900 7000
7100 7200 7300 7400 7500 7600 7700 7800 7900 8000
8100 8200 8300 8400 8500 8600 8700 8800 8900 9000
9100 9200 9300 9400 9500 9600 9700 9800 9900 10000
Error: cannot allocate vector of size 413.0 Mb
R(202,0xa000ed88) malloc: *** vm_allocate(size=433041408) failed (error code=3)
R(202,0xa000ed88) malloc: *** error: can't allocate region
R(202,0xa000ed88) malloc: *** set a breakpoint in szone_error to debug
2008-03-08 18:56:10.979 R[202] tossing reply message sequence 2 on thread 0x4d2a110
>
> traceback()
7: apply(null, 2, max)
6: ss.maxT(nulldistn, obs, alternative, get.cutoff, get.cr, pind, alpha)
5: MTP(X = esetGcrmaExprsFiltered, Y = TT, test = "t.twosamp.unequalvar",
       alternative = "two.sided", typeone = "fdr", method = "ss.maxT",
       fdr.method = "conservative", keep.nulldist = FALSE, B = B, seed = seed)
4: multTest(esetGcrma = esetGcrma, B = 10000)
3: eval.with.vis(expr, envir, enclos)
2: eval.with.vis(ei, envir)
1: source("~/bin/computeRestrictedMa2.R")
> sessionInfo()
R version 2.6.0 (2007-10-03)
powerpc-apple-darwin8.10.1

locale:
en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines tools stats graphics grDevices utils datasets methods base

other attached packages:
[1] multtest_1.18.0 genefilter_1.16.0 survival_2.32 Biobase_1.16.1

loaded via a namespace (and not attached):
[1] AnnotationDbi_1.0.6 DBI_0.2-4 RSQLite_0.6-4 annotate_1.16.1
> version
               _
platform       powerpc-apple-darwin8.10.1
arch           powerpc
os             darwin8.10.1
system         powerpc, darwin8.10.1
status
major          2
minor          6.0
year           2007
month          10
day            03
svn rev        43063
language       R
version.string R version 2.6.0 (2007-10-03)
> .Machine$sizeof.pointer == 8
[1] FALSE
> R.version$arch
[1] "powerpc"
> .Platform$r_arch
[1] "ppc"
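The size in the error message can be checked against the dimensions in the post. One matrix of doubles with a row per gene and a column per bootstrap iteration needs 5413 * 10000 * 8 bytes, which comes to about 413 Mb, the amount R failed to allocate inside `apply(null, 2, max)`. That `null` has this shape is an inference from the numbers, not something the transcript states. The sketch below reproduces the arithmetic and shows one generic way to take column maxima block by block so that each temporary object stays small. `col_max_blocked` is a hypothetical helper, not part of multtest, and the test matrix is made up.

```r
# Arithmetic check: genes x iterations matrix of doubles (8 bytes each).
genes <- 5413
B     <- 10000
bytes <- genes * B * 8
cat(sprintf("%.1f Mb\n", bytes / 1024^2))   # prints 413.0 Mb

# Hypothetical helper (not part of multtest): column maxima computed
# over blocks of columns, so temporary objects stay small.
col_max_blocked <- function(m, block = 500) {
  out <- numeric(ncol(m))
  if (ncol(m) == 0) return(out)
  for (start in seq(1, ncol(m), by = block)) {
    idx <- start:min(start + block - 1, ncol(m))
    out[idx] <- apply(m[, idx, drop = FALSE], 2, max)
  }
  out
}

# Small made-up matrix to compare against the one-shot result.
set.seed(1)
m <- matrix(rnorm(20 * 35), nrow = 20)
stopifnot(all.equal(col_max_blocked(m, block = 8), apply(m, 2, max)))
```

Working in blocks only limits the size of temporary objects. The matrix itself still has to fit in memory, so this does not replace reducing B, reducing the number of genes, or adding RAM.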