Question: About big data sets for the affy R package
刘伟 wrote:
Dear Buddy, I am a user of the affy R package. When I try to process a large number (approx. 300) of microarrays, I always get a memory allocation error from R. I searched the web but did not find any solution for using ReadAffy() with a large dataset. I do not know whether the problem can be fixed in some way. Any suggestions are appreciated. Thanks.

Sincerely,
Wei Liu
affy • 625 views
Modified 7.0 years ago by Stephen Piccolo • written 7.0 years ago by 刘伟
Answer: About big data sets for the affy R package
James W. MacDonald wrote:
Hi Wei Liu,

You can try justRMA(). If that doesn't work, you can try the aroma.affymetrix package. Note that aroma.affymetrix is not part of BioC and has its own user group and repository, so you will need to do a Google search for it.

Best,
Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
Written 7.0 years ago by James W. MacDonald
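A minimal sketch of the justRMA() suggestion, assuming all CEL files sit in one directory. The helper name run_just_rma and its cel_dir argument are hypothetical; justRMA() and its filenames and celfile.path arguments come from the affy package. As far as the affy documentation describes it, justRMA() goes from CEL files to an ExpressionSet without building an intermediate AffyBatch, which is why it may fit in memory where ReadAffy() does not.

```r
# Hypothetical helper: RMA-summarise every CEL file found in `cel_dir`.
run_just_rma <- function(cel_dir) {
  # Bare file names; the directory is passed separately via celfile.path.
  cel_files <- list.files(cel_dir, pattern = "\\.cel$", ignore.case = TRUE)
  if (length(cel_files) == 0L) {
    stop("no CEL files found in ", cel_dir)
  }
  if (!requireNamespace("affy", quietly = TRUE)) {
    stop("the Bioconductor package 'affy' is not installed")
  }
  # Returns an ExpressionSet of RMA expression values.
  affy::justRMA(filenames = cel_files, celfile.path = cel_dir)
}

# Usage (the path is a placeholder):
# eset <- run_just_rma("path/to/cel_files")
# dim(Biobase::exprs(eset))
```

Whether this is enough for roughly 300 arrays depends on the chip type and the available RAM, so treat it as a first thing to try rather than a guaranteed fix.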
Answer: About big data sets for the affy R package
cstrato wrote:
Dear Wei Liu,

You could use the Bioconductor package xps, which can handle a couple of thousand microarrays on computers with only 1-2 GB of RAM.

See also http://www.bioconductor.org/help/workflows/oligo-arrays/#pre-processing-resources for other packages that might be relevant.

Regards,
Christian
Written 7.0 years ago by cstrato
Answer: About big data sets for the affy R package
Rob Dunne wrote:
Hi Wei Liu,

If they are Affymetrix 1.0 ST exon arrays, I can send you a modified version of read.celfiles from the oligo package that should read a 300-microarray data set. I don't know if it will work for other array types; possibly not without some work. It is a modified version of read.celfiles that uses the big.matrix class from the bigmemory package.

    my.data <- read.celfiles(filenames=ff, useAffyio=FALSE)
    my.data
    #assayData: 6553600 features, 335 samples
    #Annotation: pd.huex.1.0.st.v2

Bye,
Rob

--
Rob Dunne
CSIRO Mathematics, Informatics and Statistics
Locked Bag 17, North Ryde, New South Wales, Australia, 1670
http://www.bioinformatics.csiro.au
Written 7.0 years ago by Rob Dunne
Hi Rob,

It looks like you're running an old version of oligo.

Today, our approach is:

    library(ff)
    library(oligo)
    my.data <- read.celfiles(<cel file names>)

HTH,
b
Written 7.0 years ago by Benilton Carvalho
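A hedged sketch of the ff-backed approach from the reply above, filling in the file-name placeholder. The helper name read_cel_on_disk and the cel_dir and ff_dir arguments are hypothetical; ldPath() and list.celfiles() are assumed to be the oligoClasses functions of those names, and read.celfiles() is from oligo. The packages are attached in the same order as in the reply.

```r
# Hypothetical helper: read CEL files with oligo while ff is attached, so that
# the intensity matrix is kept in files on disk instead of in RAM.
read_cel_on_disk <- function(cel_dir,
                             ff_dir = file.path(tempdir(), "ff_files")) {
  for (pkg in c("ff", "oligo")) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      stop("package '", pkg, "' is not installed")
    }
  }
  # Same attach order as in the reply: ff first, then oligo.
  library(ff)
  library(oligo)

  # Directory in which the on-disk ff files are created (assumed behaviour
  # of oligoClasses::ldPath()).
  dir.create(ff_dir, showWarnings = FALSE, recursive = TRUE)
  oligoClasses::ldPath(ff_dir)

  cel_files <- oligoClasses::list.celfiles(cel_dir, full.names = TRUE)
  read.celfiles(cel_files)
}

# Usage (the path is a placeholder):
# raw <- read_cel_on_disk("path/to/cel_files")
```

An ff object is still limited to .Machine$integer.max elements, so this only helps while features times arrays stays below that bound.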
Hi Benilton,

Unless I am missing something, ff won't help in this case. From the ff help page:

"Currently ff objects cannot have length zero and are limited to '.Machine$integer.max' elements"

and .Machine$integer.max is 2^31 - 1. This is exceeded when you try to load 328 Affy exon arrays, hence:

    library(ff)
    library(oligo)
    data <- read.celfiles(filenames=files)
    #Loading required package: pd.huex.1.0.st.v2
    #Loading required package: RSQLite
    #Loading required package: DBI
    #Platform design info loaded.
    #Error in if (length < 0 || length > .Machine$integer.max) stop("length must be between 1 and .Machine$integer.max") :
    #  missing value where TRUE/FALSE needed
    #In addition: Warning message:
    #In ff(initdata = initdata, vmode = vmode, dim = dim, pattern = file.path(ldPath(), :
    #  NAs introduced by coercion

    traceback()
    #4: ff(initdata = initdata, vmode = vmode, dim = dim, pattern = file.path(ldPath(),
    #       basename(name)))
    #3: createFF("intensities-", dim = c(nr, length(filenames)))
    #2: smartReadCEL(filenames, sampleNames, headdetails = headdetails)
    #1: read.celfiles(filenames = ff)

This is why I went down the path of modifying read.celfiles to use big.matrix, which does not have the 2^31 - 1 limit.

Bye,
Rob

    sessionInfo()
    #R version 2.15.0 (2012-03-30)
    #Platform: x86_64-unknown-linux-gnu (64-bit)
    #
    #locale:
    # [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C
    # [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8
    # [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8
    # [7] LC_PAPER=C                 LC_NAME=C
    # [9] LC_ADDRESS=C               LC_TELEPHONE=C
    #[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
    #
    #attached base packages:
    #[1] tools     stats     graphics  grDevices utils     datasets  methods
    #[8] base
    #
    #other attached packages:
    #[1] pd.huex.1.0.st.v2_3.6.0 RSQLite_0.11.2          DBI_0.2-5
    #[4] oligo_1.20.4            oligoClasses_1.18.0     ff_2.2-10
    #[7] bit_1.1-9
    #
    #loaded via a namespace (and not attached):
    # [1] affxparser_1.28.1     affyio_1.24.0         Biobase_2.16.0
    # [4] BiocGenerics_0.2.0    BiocInstaller_1.4.9   Biostrings_2.24.1
    # [7] codetools_0.2-8       compiler_2.15.0       foreach_1.4.0
    #[10] IRanges_1.14.4        iterators_1.0.6       preprocessCore_1.18.0
    #[13] splines_2.15.0        stats4_2.15.0         zlibbioc_1.2.0
Written 7.0 years ago by Rob Dunne
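The size limit described in this reply can be checked with plain arithmetic in base R. The sketch below uses the 6553600 features per array reported earlier in the thread for pd.huex.1.0.st.v2 and the 328 arrays mentioned here; the variable names are illustrative only.

```r
# Figures quoted in the thread.
features_per_array <- 6553600   # features reported for pd.huex.1.0.st.v2
n_arrays <- 328

# Plain numerics are doubles, so this product is computed exactly.
n_cells <- features_per_array * n_arrays
int_max <- .Machine$integer.max   # 2^31 - 1

exceeds_limit <- n_cells > int_max

# Largest number of arrays whose intensity matrix still fits under the limit.
max_arrays <- floor(int_max / features_per_array)

# Coercing the cell count to integer gives NA (with a warning), which would be
# consistent with the "missing value where TRUE/FALSE needed" error in the log.
overflowed <- suppressWarnings(as.integer(n_cells))

cat("cells needed:", format(n_cells, big.mark = ","), "\n")
cat("integer max:", format(int_max, big.mark = ","), "\n")
cat("exceeds limit:", exceeds_limit, "\n")
cat("max arrays under the limit:", max_arrays, "\n")
cat("as.integer(cells) is NA:", is.na(overflowed), "\n")
```

Under these figures the matrix for 327 arrays still fits and the one for 328 arrays does not, which matches the point at which the error is reported.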
Answer: About big data sets for the affy R package
Stephen Piccolo wrote:
Wei,

I'm assuming your end goal is to normalize the files? If so, there are a few other options you could try for a large number of CEL files. You could process the CEL files in smaller groups. Alternatively (and in my opinion, a better approach), you could use our SCAN.UPC package (or the frma package), both of which are designed to normalize one file at a time. That way you only need enough memory to process a single file.

Regards,
-Steve
Written 7.0 years ago by Stephen Piccolo
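The "smaller groups" idea from this answer can be sketched in base R. The helper names split_into_batches and process_in_batches are hypothetical, and process_fun stands for whatever normalization call is applied to each group; nothing here is taken from SCAN.UPC or frma.

```r
# Hypothetical helper: split a vector of file names into consecutive batches.
split_into_batches <- function(files, batch_size) {
  stopifnot(batch_size >= 1)
  if (length(files) == 0L) {
    return(list())
  }
  unname(split(files, ceiling(seq_along(files) / batch_size)))
}

# Hypothetical helper: apply `process_fun` to one batch at a time, so only one
# batch has to be held in memory at once.
process_in_batches <- function(files, batch_size, process_fun) {
  lapply(split_into_batches(files, batch_size), process_fun)
}

# Example with placeholder file names; counting files stands in for real work.
cel_files <- sprintf("sample_%03d.CEL", 1:300)
batch_sizes <- process_in_batches(cel_files, 50, length)
```

Setting batch_size to 1 corresponds to the one-file-at-a-time case. Multi-array methods such as RMA estimate their parameters from the arrays processed together, so values from separately normalized groups may not be directly comparable, which is one reason single-array methods are attractive here.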