Hi,
I have two questions regarding the use of the affy package:
- I have a large series of cel-files and am trying to read
them at once. Unfortunately, the ReadAffy-function seems to
use a lot of memory. My workstation has 2 Gig of RAM
installed, but trying to read >100 cel-files (HGU133a) is a
no-go. This is R1.8.1 with Bioconductor 1.3 on a Windows
machine. Reading in 50 cel-files already means a
(peak-)memory usage of 800 meg. Is there any solution to
this, because I would like to read 300 cel-files at the same
time if possible. I have played around with the
memory.limit()-function, but with no success.
- Second, after data import I would like to retrieve all
information on a probe level for several probe sets. I do
this using the pm()-function, for instance
>pm(CelFiles)[1:16, 1:50]. The problem is that I first have to find
out the exact "location" of the probe set of interest. This is not a
big problem, but I thought there might be a more elegant solution to
this.
Thanks in advance,
Roel Verhaak
Simply put, you are not going to be able to read in >300 cel files with
only(!) 2 Gb RAM. You should be able to do justRMA with this much RAM on
100 or so chips, but if you really want to be able to do huge numbers of
chips you are going to have to upgrade to a 64 bit architecture, which
at this point in time also means you have to switch to Linux.
To get the pm probes I think you want to do something like this:
my.pms <- probes(abatch, "pm", "1007_s_at")  # e.g., for the 1007_s_at probes
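As a rough sketch of the same idea (looking probes up by probe set ID instead of by row position), the helpers below wrap calls from the affy package. The helper names pm_for_probesets and pm_rows_for_probesets are made up for illustration and are not part of affy; the code assumes affy is installed and that abatch is an AffyBatch such as the one returned by ReadAffy().

```r
# Sketch only: assumes the affy package is installed and `abatch` is an
# AffyBatch, e.g. abatch <- ReadAffy(). Both helper names are hypothetical.

# pm() accepts probe set IDs directly and returns a matrix with one row
# per PM probe and one column per array.
pm_for_probesets <- function(abatch, ids) {
  affy::pm(abatch, ids)
}

# Alternative: select rows of the full PM matrix by name matching.
# probeNames() gives the probe set ID belonging to each row of pm(abatch).
# This returns the same probes, though possibly in a different row order.
pm_rows_for_probesets <- function(abatch, ids) {
  keep <- affy::probeNames(abatch) %in% ids
  affy::pm(abatch)[keep, , drop = FALSE]
}

# Usage (requires real CEL data, so it is left commented out):
# abatch <- affy::ReadAffy()
# my.pms <- pm_for_probesets(abatch, c("1007_s_at", "1053_at"))
```

Either way there is no need to work out index ranges such as 1:16 by hand.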
Best,
Jim
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
Hi,
I'd be interested in a solution myself ;-) .
Reading in many cel files is actually not the bottleneck! When it comes
to using "expresso" with normalisation across the cel files, the memory
usage increases again, and you may eventually exceed all your 2GB of
memory. I'm normalizing 42 MG-U74Av2 chips and expresso takes about 800mb
for rma + quantiles + pmonly + medianpolish.
It was mentioned previously on this list (I don't remember when exactly)
that the justRMA method performs exactly the above normalization
procedure, but is a lot faster and uses a lot less memory. You feed the
cel file names to the routine, not an AffyBatch object (see help(justRMA)).
Of course, if you'd like to use other normalization or background
correction methods you're pretty much bound to use expresso, which will
probably need too much memory for your 100 cel files ...
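To make the two routes concrete, here is a minimal sketch, assuming the affy package is installed. The wrapper names rma_from_cel_dir and expresso_rma and the directory argument cel_dir are invented for illustration; the affy calls and argument names are the ones I believe the package documents, so please check help(justRMA) and help(expresso) for your version.

```r
# Sketch only: assumes the affy package is installed. The two wrapper
# names and `cel_dir` are hypothetical.

# Route 1: justRMA works from the CEL files on disk and returns
# expression values without first building a full AffyBatch.
# With no file names given, it picks up the CEL files in celfile.path;
# explicit names can be passed via the `filenames` argument instead.
rma_from_cel_dir <- function(cel_dir) {
  affy::justRMA(celfile.path = cel_dir)
}

# Route 2: expresso on an AffyBatch that is already in memory, with the
# same four steps named above (rma + quantiles + pmonly + medianpolish).
expresso_rma <- function(abatch) {
  affy::expresso(abatch,
                 bgcorrect.method = "rma",
                 normalize.method = "quantiles",
                 pmcorrect.method = "pmonly",
                 summary.method   = "medianpolish")
}

# Usage (requires real CEL data, so it is left commented out):
# eset <- rma_from_cel_dir("path/to/celfiles")
```

Route 2 is the one to change if a different background correction or normalization method is wanted, at the cost of holding the AffyBatch in memory.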
Sorry, I have no suggestion for your 2nd problem (probe set location).
regards,
Arne
--
Arne Muller, Ph.D.
Toxicogenomics, Aventis Pharma
arne dot muller domain=aventis com
> -----Original Message-----
> From: bioconductor-bounces@stat.math.ethz.ch
> [mailto:bioconductor-bounces@stat.math.ethz.ch]On Behalf Of
> Roel Verhaak
> Sent: 06 April 2004 14:32
> To: bioconductor@stat.math.ethz.ch
> Subject: [BioC] memory usage ReadAffy() and probe level information
The limit for justRMA that I have encountered is approximately 110-120
HGU133 Plus 2.0 CEL files, with 2 GB RAM, R 1.9.0 beta running on
Windows. I do not encounter a memory error directly - the R GUI
application just gracefully folds up and reminds me to send an error
report to Bill Gates. Anyway, I'm not sure if it directly translates in
terms of number of transcripts, but that limit should allow >200 HGU133A
chips?
I am told that the maximum addressable memory for a 32 bit processor is
4GB - would it make a difference if the actual physical memory is
increased from 2 GB to 4 GB (even though my memory.limit has already
been set to 4095)?
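A back-of-the-envelope way to compare chip types is to count cells per array. The sketch below assumes chip layouts of 712 x 712 cells for HG-U133A and 1164 x 1164 for HG-U133 Plus 2.0 (figures from memory, worth checking against the CDF files) and 8 bytes per stored intensity. It counts a single copy of the raw intensity matrix only, so it is a lower bound: it ignores the temporary copies R makes and the fact that justRMA keeps only the PM probes.

```r
# Assumed chip layouts (cells per side); check against the CDF files.
side  <- c("HG-U133A" = 712, "HG-U133_Plus_2" = 1164)
cells <- side^2

bytes_per_value <- 8  # R stores numeric intensities as 8-byte doubles

# Size in MiB of ONE copy of the intensity matrix for n_arrays chips
# (no overhead, no temporary copies).
intensity_mib <- function(chip, n_arrays) {
  unname(cells[chip]) * n_arrays * bytes_per_value / 1024^2
}

mib_50  <- intensity_mib("HG-U133A", 50)
mib_300 <- intensity_mib("HG-U133A", 300)

# How many times more cells a Plus 2.0 chip has than an HG-U133A chip
ratio <- unname(cells["HG-U133_Plus_2"] / cells["HG-U133A"])

print(round(c(mib_50 = mib_50, mib_300 = mib_300, ratio = ratio), 2))
```

Under these assumptions 50 HG-U133A chips need a bit under 200 MiB per copy and 300 chips about 1.1 GiB per copy, while a Plus 2.0 chip has roughly 2.7 times as many cells as an HG-U133A chip. So a limit of 110-120 Plus 2.0 chips would plausibly correspond to well over 200 HG-U133A chips, but only as a rough guide.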
Min-Han Tan
_______________________________________________
Bioconductor mailing list
Bioconductor@stat.math.ethz.ch
https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor