Large number of CEL files!!!
@hrishikesh-deshmukh-1008
Last seen 10.2 years ago
Hi All,

I have 200 CEL files and I want to use Bioconductor to read these files and then do simple things like hist() and boxplot(). I think I will run into memory issues. Any suggestions as to how to handle this problem?

Thanks in advance.
Hrishi
@james-w-macdonald-5106
If you want to have an AffyBatch so you can use hist(), etc., you will need lots of RAM (probably 3 or 4 GB). You might also need to use a *nix OS, because Windows doesn't handle memory that efficiently. Luckily, RAM is comparatively cheap these days.

Best,
Jim

--
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
@adaikalavan-ramasamy-675
Last seen 10.2 years ago
Do you want to plot the data before or after preprocessing? The maximum number of values is about 242 million (= 55000 x 200 x 22) at the probe level, before preprocessing, and 11 million (= 55000 x 200) after preprocessing. Also, do you want to investigate the distribution of each array/column, or look at the overall distribution?

With my Pentium 4 at 1.6 GHz and 512 MB of RAM, I can do a hist() or boxplot() on the pre-processed dataset:

    mat <- matrix( rnorm(55000*200), ncol=200 )
    library(fields)
    system.time( bplot(mat) )
    [1] 16.21  1.23 23.83  0.00  0.00

But the real problem is that there are too many data points on the graphs, which makes each array difficult to see. I think it would be better to read in, say, 25-50 arrays at a time and plot their distributions. Besides being less memory intensive, the plots will be better spaced and easier to look at.

Regards,
Adai

On Tue, 2005-03-08 at 11:10 -0800, Hrishikesh Deshmukh wrote:
> Hi All,
>
> I have 200 CEL files and i want to use bioconductor to
> read these files and then do simple things like
> hist(),boxplot()! I think i will run into memory
> issues!
> Any suggestions as to how to handle this problem?
>
> Thanks in advance.
> Hrishi
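The batching suggestion above could be sketched roughly as follows. This is only an illustration: `split_batches` is a hypothetical helper, not part of any package, and the simulated matrix uses fewer rows than a real array so it runs quickly. The last part assumes the affy package (and the matching CDF package) is installed and that the CEL files are in the working directory; it is skipped otherwise.

```r
# Hypothetical helper: split a vector into consecutive chunks of at most `size`.
split_batches <- function(files, size = 50) {
  split(files, ceiling(seq_along(files) / size))
}

# Simulated stand-in for 200 pre-processed arrays (5000 rows to keep it fast).
set.seed(1)
mat <- matrix(rnorm(5000 * 200), ncol = 200)
colnames(mat) <- sprintf("array%03d", seq_len(ncol(mat)))

batches <- split_batches(colnames(mat), size = 50)

# One boxplot panel per batch of 50 arrays instead of 200 boxes in one plot.
op <- par(mfrow = c(2, 2))
for (b in batches) {
  boxplot(mat[, b], outline = FALSE, las = 2, cex.axis = 0.4,
          main = paste(b[1], "to", b[length(b)]))
}
par(op)

# With real data: read one batch of CEL files at a time, so that only
# that batch is held in memory as an AffyBatch.
cel_files <- list.files(pattern = "\\.cel$", ignore.case = TRUE,
                        full.names = TRUE)
if (length(cel_files) > 0 && requireNamespace("affy", quietly = TRUE)) {
  library(affy)
  for (batch in split_batches(cel_files, size = 50)) {
    ab <- ReadAffy(filenames = batch)
    boxplot(ab)   # raw probe-level intensities for this batch
    rm(ab)
    gc()          # release the batch before reading the next one
  }
}
```

A batch size of 25-50 follows the range suggested above; smaller batches use less memory per step at the cost of more plots to look through.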
@steffen-durinck-519
Last seen 10.2 years ago
Hi,

I once did RMA on 237 CEL files and needed about 5 GB of RAM on a Linux machine to do this.

Cheers,
Steffen
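As a rough cross-check on the memory figures in this thread, here is a back-of-the-envelope estimate using Adai's counts (55000 x 200 x 22 values before preprocessing, 55000 x 200 after). It assumes every value is stored as an 8-byte double and counts a single copy only; the variable names are just for illustration.

```r
n_probesets    <- 55000
n_arrays       <- 200
probes_per_set <- 22   # multiplier taken from Adai's 55000 x 200 x 22
bytes_per_dbl  <- 8    # R stores numeric values as 8-byte doubles

raw_values       <- n_probesets * n_arrays * probes_per_set  # 242 million
processed_values <- n_probesets * n_arrays                   # 11 million

raw_gb       <- raw_values * bytes_per_dbl / 1024^3
processed_mb <- processed_values * bytes_per_dbl / 1024^2

cat("Probe-level data: about", round(raw_gb, 1), "GB for one copy\n")
cat("Pre-processed data: about", round(processed_mb), "MB for one copy\n")
```

One copy of the probe-level data comes to roughly 1.8 GB, and R often makes temporary copies while working, so peak usage of several GB is plausible and in line with the 3-5 GB reported here. The pre-processed matrix is only about 84 MB, which is why it fits on a 512 MB machine.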