Retrieve gene names after KEGG hypergeometric test


Mike Walter ▴ 230

@mike-walter-4000

Last seen 10.8 years ago

Germany

Hi Clémentine,

The "db" just adds a ".db" suffix to load the library. Just try paste(yourDb, "db", sep="."). This will give "yourDb.db". Maybe a short example will help. Here is a list of 8 probesets from the Affy Rat 230 2.0 array. I'm checking which of these transcripts are involved in the KEGG pathways 04710 (circadian rhythm) and 00240 (pyrimidine metabolism):

> genelist=c("1368303_at", "1378745_at", "1392640_at", "1369996_at",
+            "1370295_at", "1378180_at", "1388182_at", "1398875_at")
> KEGGID=c("04710", "00240")
> db="rat2302" #which will load rat2302.db annotation package
>
> myKEGG = KEGG2symbol(KEGGID, genelist, db)
> myKEGG
$`04710`
           [,1]
1368303_at "Per2"
1378745_at "Per3"
1392640_at "Cry1"

$`00240`
           [,1]
1369996_at "Polr2f"
1370295_at "Nme1"
1378180_at "Dctd"
1388182_at "Prim1"
1398875_at "Polr3k"

Regards,
Mike

-----Original Message-----
From: "Clémentine Dressaire" <clementinedressaire at itqb.unl.pt>
Sent: 29.10.2010 15:21:33
To: "Mike Walter" <michael_walter at email.de>
Subject: Re: [BioC] retrieve genes names after KEGG hypergeometric test

> Hi Mike,
>
> Could you explain to me the difference between the db and "db" you are using?
> If db is the character vector with the annotation database for your array
> without the .db extension, then what does "db" represent?
>
> Again thanks for your help,
>
> Clémentine
>
> On Fri, 29 Oct 2010 14:23:00 +0200 (CEST), "Mike Walter" <michael_walter at email.de> wrote:
>> Hi Clémentine,
>>
>> I don't know if such a function exists. I use two little helper functions
>> to retrieve probe IDs or gene symbols of genes in a genelist that are
>> associated with a KEGG ID:
>>
>> KEGG2genes = function(KEGGID, genelist, db){
>>   require(paste(db, "db", sep="."), character.only = TRUE)
>>   l = vector("list")
>>   for (i in 1:length(KEGGID)){
>>     kegg = as.matrix(unlist(mget(KEGGID[i], get(paste(db, "PATH2PROBE", sep="")), ifnotfound=NA)))
>>     l[[i]] = genelist[is.element(genelist,kegg[,1])]
>>   }
>>   names(l)=KEGGID
>>   l
>> }
>>
>> KEGG2symbol = function(KEGGID, genelist, db){
>>   l = vector("list")
>>   for (i in 1:length(KEGGID)){
>>     id = unlist(KEGG2genes(KEGGID=KEGGID[i], genelist=genelist, db=db))
>>     l[[i]] = as.matrix(mget(id, get(paste(db, "SYMBOL", sep="")), ifnotfound=NA))
>>   }
>>   names(l)=KEGGID
>>   l
>> }
>>
>> where "KEGGID" is a character vector of the KEGG ID(s) you are interested
>> in, "genelist" is a character vector containing the probe IDs/probeset IDs
>> of the genelist you used to create the KEGGHyperGResult, and "db" is a
>> character vector with the annotation database for your array without the
>> .db extension (e.g. db="hgu133plus2" for the Affy U133+ 2.0 array). As a
>> result you get a matrix containing the probe IDs and gene symbols for each
>> KEGGID, stored in a list. It might not be the most elegant way, but it
>> works.
>>
>> Kind regards,
>>
>> Mike
>>
>> -----Original Message-----
>> From: "Clémentine Dressaire" <clementinedressaire at itqb.unl.pt>
>> Sent: 29.10.2010 13:27:44
>> To: bioconductor at stat.math.ethz.ch
>> Subject: [BioC] retrieve genes names after KEGG hypergeometric test
>>
>>> Dear BioC users,
>>>
>>> I performed different hypergeometric tests on my data regarding GO terms
>>> and KEGG pathways. With the GO result I can use the probeSetSummary function
>>> to retrieve the gene list associated with each significant category.
>>> However, this function does not work if I apply the HG test using
>>> KEGGHyperGParams because the results are not of GOHyperGResult class...
>>> Is there any equivalent KEGG function to get those gene lists?
>>>
>>> With thanks in advance for your help.
>>>
>>> Clémentine
>>>
>>> --
>>> Clémentine Dressaire
>>> Post-doctoral research fellow
>>> Control of gene expression lab
>>> ITQB - Instituto de Tecnologia Química e Biológica
>>> Apartado 127, Av. da República
>>> 2780-157 Oeiras
>>> Portugal
>>> +351 214469562
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor

Annotation Pathways GO rat2302 probe affy Category • 1.9k views

updated 15.2 years ago by Iain Gallagher ▴ 930 • written 15.3 years ago by Mike Walter ▴ 230


Marc Carlson ★ 7.2k

@marc-carlson-2264

Last seen 9.5 years ago

United States

Hi guys,

I don't want to jump in here and tell you how to write your code, but it might simplify your life somewhat to know about a couple of conveniences. One is the getAnnMap function from the annotate package. This is a nice thing for when you want to be able to just load a mapping up. So instead of doing stuff like:

    require(paste(db, "db", sep="."), character.only = TRUE)

You could do something like this:

    library(annotate)
    yourMap <- getAnnMap("PATH2PROBE", "hgu95av2.db")

You can basically get whatever mapping you want in a way that will automatically load the relevant annotation libraries, and append a .db suffix onto the end of the 'chip' argument (in case you forgot it). So this will also work for the SYMBOL mapping, or any other mapping that you need.

And then later, when you want to retrieve something from a mapping, it also pays to know that mget() is vectorized, which means that if you pass in a vector for "x", it will return a list with all the matching results attached for each value in "x". Therefore, instead of using a for loop like this:

    for (i in 1:length(KEGGID)){
      kegg = as.matrix(unlist(mget(KEGGID[i], get(paste(db, "PATH2PROBE", sep="")), ifnotfound=NA)))
      l[[i]] = genelist[is.element(genelist,kegg[,1])]
    }

I think that you should be able to get basically the same kind of result by doing something more like this:

    l = unlist(mget(KEGGID, yourMap, ifnotfound=NA))
    l = l[l %in% genelist]

Hope this helps you,

Marc
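The vectorization point can be checked without any annotation package. The sketch below uses a mock environment with made-up probe names as a stand-in for a PATH2PROBE map (an assumption for illustration) and compares the per-key loop with a single vectorized mget() call. It also shows that flattening with unlist(), as in the two-line version, returns the matching probes but no longer groups them by pathway.

```r
# Mock environment standing in for a PATH2PROBE map (hypothetical data).
path2probe <- list2env(list(
  "04710" = c("p1", "p2", "p3"),
  "00240" = c("p4", "p5")
))
keggid   <- c("04710", "00240", "99999")   # "99999" is deliberately absent
genelist <- c("p2", "p3", "p5", "p9")

# Loop version: one mget() call per pathway.
by_loop <- vector("list", length(keggid))
for (i in seq_along(keggid)) {
  probes <- unlist(mget(keggid[i], envir = path2probe, ifnotfound = NA))
  by_loop[[i]] <- genelist[genelist %in% probes]
}
names(by_loop) <- keggid

# Vectorized version: a single mget() call returns one list element per key.
hits   <- mget(keggid, envir = path2probe, ifnotfound = NA)
by_vec <- lapply(hits, function(probes) genelist[genelist %in% probes])

# Flattened version: the same probes, but the pathway grouping is lost.
flat <- unlist(hits)
flat <- flat[flat %in% genelist]

print(by_vec)
print(flat)
```

If the per-pathway grouping matters, as it does when labelling genes by KEGG term, the lapply() form keeps it while still avoiding the explicit loop.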

15.3 years ago Marc Carlson ★ 7.2k


Iain Gallagher ▴ 930

@iain-gallagher-2532

Last seen 10.5 years ago

United Kingdom

Hi Clementine

Below is the code I use for this (no functions but it works). hgOver is the result of the hyperGTest. hypoResults is the result of a limma test for differential expression.

Basically this gets the significant genes by category, pulls in the logFC (from the limma result) and generates a table so I can see the change for each gene in each category. Not the most elegant code - but then that's not my main profession ;-)

You could stop at keggMapped3 and that would be genes with pathways (if I remember correctly).

    #geneIdsByCategory#####
    sigGenesByCat <- geneIdsByCategory(hgOver, summary(hgOver)[,1])
    sigMap <- stack(sigGenesByCat)
    symsMapped <- mget(as.character(sigMap[,1]), org.Hs.egSYMBOL, ifnotfound=NA)
    symsMapped <- stack(symsMapped)
    keggMapped <- with(sigMap, symsMapped[,1][match(symsMapped[,2], sigMap[,1])])
    keggMapped <- cbind(keggMapped, sigMap)
    keggMapped2 <- unstack(keggMapped, keggMapped~ind)

    #now replace KEGG IDs with Term
    termsInd <- match(names(keggMapped2), summary(hgOver)[,1])
    keggMapped3 <- keggMapped2
    names(keggMapped3) <- summary(hgOver)[,7][termsInd]

    ##add FC info to KEGG categories
    keggMapped4 <- stack(keggMapped3)
    fcInd <- match(keggMapped4[,1], hypoResults[,8])
    keggMapped4$logFC <- hypoResults[,2][fcInd]
    write.table(keggMapped4, 'KEGGAnnotated.txt', sep='\t', quote=FALSE)

cheers

iain
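The reshaping steps in the code above rest on stack(), unstack() and match(). Here is a small base-R sketch of that pattern on made-up data. The list sig_genes_by_cat and the table limma_tab are hypothetical stand-ins for geneIdsByCategory(hgOver) and the limma result, and the gene IDs and logFC values are invented for illustration.

```r
# Hypothetical stand-in for geneIdsByCategory(hgOver): gene IDs per KEGG ID.
sig_genes_by_cat <- list("04710" = c("g1", "g2"), "00240" = c("g3"))

# Hypothetical stand-in for the limma table: an ID column and a logFC column.
limma_tab <- data.frame(id    = c("g3", "g2", "g1"),
                        logFC = c(-0.8, 1.2, 2.1),
                        stringsAsFactors = FALSE)

# stack() turns the named list into a long table: one row per gene,
# with the list names (the KEGG IDs) in the 'ind' column.
long <- stack(sig_genes_by_cat)

# match() gives, for each gene in 'long', its row in the limma table.
long$logFC <- limma_tab$logFC[match(long$values, limma_tab$id)]

# unstack() goes back from the long table to one entry per category.
back <- unstack(long, values ~ ind)

print(long)
print(back)
```

Indexing columns by name, as here, rather than by position (for example hypoResults[,8]) is a design choice that keeps the lookup working if the column order of the limma table changes.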

15.2 years ago Iain Gallagher ▴ 930


Hello,

the problem is as in the subject, and here are the details:

> class(RG)
[1] "RGList"
attr(,"package")
[1] "limma"
> library(arrayQualityMetrics)
> library(convert)
> x=as(RG,"NChannelSet")
> class(x)
[1] "NChannelSet"
attr(,"package")
[1] "Biobase"

> arrayQualityMetrics(x,outdir="qc")
The report will be written in directory 'qc'.
Error: cannot allocate vector of size 74.3 Mb
R(2581,0xa048a500) malloc: *** mmap(size=3899392) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
R(2581,0xa048a500) malloc: *** mmap(size=7794688) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
R(2581,0xa048a500) malloc: *** mmap(size=3899392) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
R(2581,0xa048a500) malloc: *** mmap(size=7794688) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
R(2581,0xa048a500) malloc: *** mmap(size=77922304) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
R(2581,0xa048a500) malloc: *** mmap(size=77922304) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
R(2581,0xa048a500) malloc: *** mmap(size=77922304) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug

> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] convert_1.26.0 marray_1.28.0 limma_3.6.6 arrayQualityMetrics_2.6.0
[5] affyPLM_1.26.0 preprocessCore_1.12.0 gcrma_2.22.0 affy_1.28.0
[9] Biobase_2.10.0

loaded via a namespace (and not attached):
 [1] affyio_1.18.0 annotate_1.28.0 AnnotationDbi_1.12.0 beadarray_2.0.1 Biostrings_2.18.0
 [6] DBI_0.2-5 genefilter_1.32.0 grid_2.12.0 hwriter_1.2 IRanges_1.8.2
[11] lattice_0.19-13 latticeExtra_0.6-14 RColorBrewer_1.0-2 RSQLite_0.9-2 simpleaffy_2.26.0
[16] splines_2.12.0 stats4_2.12.0 survival_2.36-1 vsn_3.18.0 xtable_1.5-6

RG is the result of importing 10 samples into limma from an Agilent scanner. Each sample is a 1M probe mouse custom CGH array. Is it a memory/performance problem? I have 4 GB of RAM and a 2.4 GHz Core 2 Duo processor. I would be very grateful for any clues as to what is wrong and how to fix it...

Regards,
jarek

--
Jarek Bryk | www.evolbio.mpg.de/~bryk
Max Planck Institute for Evolutionary Biology
August Thienemann Str. 2 | 24306 Plön, Germany
tel. +49 4522 763 287 | bryk at evolbio.mpg.de

15.2 years ago Jarek Bryk ▴ 110


Hi Jarek

this is weird, and ugly. Thank you for reporting. Can you:

- make the object 'RG' available on a public fileserver for me & others to try to reproduce the error?
- set options(error=recover) and then send us the output (error message and call stack) that you get? This will help localise where the problem happens.
- try on a different machine (with more RAM, or a different OS)?
- also try with arrayQualityMetrics_3.2.0 *and* SVGAnnotation_0.7-2 [1] (which fixes the problem with libcairo 1.10 that I reported earlier, https://stat.ethz.ch/pipermail/bioconductor/2010-October/035958.html - I planned to announce this fix as soon as SVGAnnotation_0.7-2 is on the build system in Seattle.)

[1] http://www.omegahat.org/SVGAnnotation

Best wishes

Wolfgang
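options(error=recover) is meant for interactive sessions. For a script or batch run, one alternative (not part of the suggestion above, just a possible sketch) is to record the error message and the call stack with withCallingHandlers() and sys.calls(). The functions inner() and outer() below are hypothetical placeholders for the real failing call, and the error is simulated.

```r
# Hypothetical failing functions standing in for the real call.
inner <- function() stop("cannot allocate vector (simulated)")
outer <- function() inner()

captured <- NULL
result <- tryCatch(
  withCallingHandlers(
    outer(),
    error = function(e) {
      # The calling handler runs while the failing frames are still on the
      # stack, so sys.calls() here still includes outer() and inner().
      calls <- sys.calls()
      captured <<- list(
        message = conditionMessage(e),
        calls   = vapply(calls, function(cl) deparse(cl)[1], character(1))
      )
    }
  ),
  error = function(e) "failed"   # afterwards, stop the error from propagating
)

print(captured$message)
print(captured$calls)
```

The printed message and call list are the two pieces of information asked for, and can be pasted into a reply as they are.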

15.2 years ago Wolfgang Huber ★ 13k


Hi Jarek

I could not reproduce this problem on two machines available to me (x86_64-unknown-linux-gnu (64-bit), with 8 GB RAM, and x86_64-apple-darwin10.4.0/x86_64 (64-bit), with 4 GB RAM). You also mentioned (outside this list) that the report was produced without errors on a different machine of yours (Ubuntu 10.10, 64-bit, 8 GB RAM, 3 GHz i7).

So it appears that this problem is quite specific to your system, i386-apple-darwin9.8.0/i386 (32-bit). Note that the amount of RAM that you have there is no less than what I have. Perhaps you're hitting a 2 GB limit (on a 32-bit OS) for your R process (?).

It seems there are three options, which are all on your end:

- update the R on your Mac to 64 bit
- use a bigger machine
- subsample the object 'x', e.g.

    sx = x[ sample(nrow(x), 5e4), ]

  which should give you an almost as useful report.

Also, you probably want to set 'do.logtransform = TRUE'.

(Btw, object.size(x) is 271675096 bytes ~ 260 MB, which is only a fraction of the available RAM. Nevertheless, I watched the R process grow to 2.5-3 GB while munching through 'arrayQualityMetrics()', which is a consequence of that function creating a couple of copies of the data - minimisation of memory use has not been a design goal so far.)

Best wishes

Wolfgang

--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
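The subsampling suggestion and the size arithmetic can be tried on a plain matrix. In the sketch below, x_mock is a hypothetical stand-in for the NChannelSet 'x' (random numbers, probes in rows, one column per array), and size_mb() is a small helper introduced here for illustration. Only the byte count 271675096 comes from the post.

```r
set.seed(1)   # make the random subsample reproducible

# Hypothetical stand-in for 'x': a numeric matrix with probes in rows.
x_mock <- matrix(rnorm(2e5 * 10), nrow = 2e5, ncol = 10)

# Keep a random subset of 5e4 rows (probes), all columns (arrays).
n_keep  <- 5e4
sx_mock <- x_mock[sample(nrow(x_mock), n_keep), ]

# Helper (hypothetical name): object size in MB.
size_mb <- function(obj) as.numeric(object.size(obj)) / 1024^2
full_mb <- size_mb(x_mock)
sub_mb  <- size_mb(sx_mock)

# Size reported in the post for the real object, converted to MB.
reported_mb <- 271675096 / 1024^2

print(dim(sx_mock))
print(c(full = full_mb, subsample = sub_mb, reported = reported_mb))
```

Because sample() draws row indices without replacement, each probe appears at most once in the subsample, and the memory needed for every intermediate copy shrinks roughly in proportion to the number of rows kept.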

15.2 years ago Wolfgang Huber ★ 13k
