Hi,
I am doing some analysis on the Affymetrix Exon human microarrays.
Normally I would use RMA to summarize, but in the papers by Affymetrix
they use iterPLIER for the gene-level estimates and PLIER for the
Exon-level estimates.
Is it ok to use RMA for both summary levels for these arrays?
Does anyone have an opinion on what the best approach to use is?
Also does anyone use DABG to filter out badly detected probes?
Thanks
Daniel
--
**************************************************************
Daniel Brewer, Ph.D.
Institute of Cancer Research
Molecular Carcinogenesis
Email: daniel.brewer at icr.ac.uk
**************************************************************
The Institute of Cancer Research: Royal Cancer Hospital, a charitable
Company Limited by Guarantee, Registered in England under Company No.
534147 with its Registered Office at 123 Old Brompton Road, London SW7
3RP.
Hi Daniel,
Daniel Brewer wrote:
> Hi,
>
> I am doing some analysis on the Affymetrix Exon human microarrays.
> Normally I would use RMA to summarize, but in the papers by Affymetrix
> they use iterPLIER for the gene-level estimates and PLIER for the
> Exon-level estimates.
>
> Is it ok to use RMA for both summary levels for these arrays?
> Does anyone have an opinion on what the best approach to use is?
Have you looked at the exonmap package?
IIRC Affy intend iterPLIER to correct for the fact that lots of the
probesets on the exon array don't actually interrogate exons (according
to current knowledge of the genome). In fact Wing Wong has recently
published a paper showing that the correlation between the 'core'
probesets and the 'full' and 'extended' probesets is really bad.
IMO, it is better to remove the probesets that we currently think are
either interrogating multiple transcripts or missing the exons
altogether _before_ computing any expression values, so you don't have
to hope that the statistics are robust enough to ignore spurious
signal.
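
To make the "filter first, summarize second" idea concrete, here is a minimal sketch. It is written in Python purely for illustration and is not the exonmap API. The annotation fields, probeset names and intensities are all made up, and a median of log2 values stands in for a real summary method such as RMA or PLIER.

```python
from math import log2
from statistics import median

# Hypothetical annotation: probeset -> (gene, hits a known exon?, number of targets).
annotation = {
    "ps1": ("GENE_A", True, 1),
    "ps2": ("GENE_A", True, 1),
    "ps3": ("GENE_A", False, 1),  # misses the exons altogether
    "ps4": ("GENE_A", True, 3),   # interrogates multiple transcripts
    "ps5": ("GENE_B", True, 1),
}

# Made-up raw intensities for a single array.
intensity = {"ps1": 256.0, "ps2": 1024.0, "ps3": 4096.0,
             "ps4": 16384.0, "ps5": 64.0}


def is_reliable(probeset):
    """Keep probesets that hit a known exon and have exactly one target."""
    _gene, exonic, n_targets = annotation[probeset]
    return exonic and n_targets == 1


def gene_summary(values, keep):
    """Median of log2 intensities per gene, using only probesets passing `keep`."""
    by_gene = {}
    for probeset, value in values.items():
        if keep(probeset):
            gene = annotation[probeset][0]
            by_gene.setdefault(gene, []).append(log2(value))
    return {gene: median(vals) for gene, vals in by_gene.items()}


filtered = gene_summary(intensity, is_reliable)
unfiltered = gene_summary(intensity, lambda probeset: True)
print(filtered)    # GENE_A is summarized from ps1 and ps2 only
print(unfiltered)  # GENE_A is pulled upwards by ps3 and ps4
```

The point of the comparison is that the unfiltered GENE_A value depends on how well the summary statistic copes with the two questionable probesets, whereas the filtered value never sees them.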
One downside to the exonmap package is the fact that you need a 64-bit
Linux box with lots of RAM (which presumably you have already, else how
are you doing anything with these things?). In addition, you need to
install MySQL and set up the Ensembl core database, as well as the
tables for exonmap. However, Michal and Crispin have given pretty
detailed instructions for how to go about doing that (I was able to get
set up, and my knowledge of DBs wouldn't fill a thimble).
Best,
Jim
--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
**********************************************************
Electronic Mail is not secure, may not be read every day, and should
not be used for urgent or sensitive issues.
Hi Daniel, Jim,
> One downside to the exonmap package is the fact that you need a 64 bit
> linux box with lots of RAM (which presumably you have already, else how
> are you doing anything with these things?). In addition, you need to
> install MySQL and set up the Ensembl core database, as well as the
> tables for exonmap. However, Michal and Crispin have given pretty
> detailed instructions for how to go about doing that (I was able to get
> set up, and my knowledge of DBs wouldn't fill a thimble).
Is this your experience too? We found that we need a big machine mainly
for the normalization/expression summary side of things. I don't think
the package itself should need too much clout to manage the annotation;
it's the arrays themselves that eat up the RAM.
Cheers,
Crispin
Crispin Miller wrote:
> I'm not sure if this is your experience too?: we found we need a big
> machine mainly for the normalization/expression summary side of things -
> I don't think the package itself should need too much clout to manage
> the annotation; it's the arrays themselves that eat up the RAM.
Exactly. Well, the cdf package is pretty big as well, although I don't
know offhand how much RAM it takes. But you are exactly right - the
arrays are the big RAM users here.
There are two possible fixes in the pipeline. For the cdf there will
soon be the ability to create a hybrid cdf/annotation package based on a
SQLite database that will use much less RAM. Because SQLite doesn't have
the capabilities that MySQL does (stored procedures, for instance), I
don't think this will be able to supersede exonmap; rather, it will
complement it.
On the array side, there is the BufferedMatrix package, which puts most
of the data on disk (and buffers a small, user-adjustable amount of data
in RAM). The oligo package already uses BufferedMatrix for the raw data
and I imagine could also use BufferedMatrix for ExpressionSets as well.
So hopefully we will soon get the analysis of these data back onto more
modest computers.
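
For anyone unfamiliar with the disk-backed idea, a toy sketch follows. It is in Python for illustration only and is not how BufferedMatrix is actually implemented. The `DiskMatrix` class and its methods are hypothetical, and it skips the RAM buffering layer entirely: it only shows that a matrix can live in a file while one column (one array) at a time is pulled into memory.

```python
import struct
import tempfile


class DiskMatrix:
    """Toy column-major matrix of doubles kept in a temporary file."""

    ITEM_SIZE = struct.calcsize("<d")  # bytes per stored value

    def __init__(self, n_rows, n_cols):
        self.n_rows = n_rows
        self.n_cols = n_cols
        self._file = tempfile.TemporaryFile()
        # Zero-fill the file; eight zero bytes decode to the double 0.0.
        self._file.write(b"\x00" * (n_rows * n_cols * self.ITEM_SIZE))

    def _offset(self, col):
        if not 0 <= col < self.n_cols:
            raise IndexError("column out of range")
        return col * self.n_rows * self.ITEM_SIZE

    def set_column(self, col, values):
        """Write one full column to disk."""
        if len(values) != self.n_rows:
            raise ValueError("column has the wrong length")
        self._file.seek(self._offset(col))
        self._file.write(struct.pack(f"<{self.n_rows}d", *values))

    def get_column(self, col):
        """Read one column back; only this column is held in memory."""
        self._file.seek(self._offset(col))
        raw = self._file.read(self.n_rows * self.ITEM_SIZE)
        return list(struct.unpack(f"<{self.n_rows}d", raw))


# Three probes by two arrays; the second array gets some made-up values.
m = DiskMatrix(3, 2)
m.set_column(1, [1.0, 2.5, 4.0])
col_means = [sum(m.get_column(j)) / m.n_rows for j in range(m.n_cols)]
print(col_means)
```

Any per-array computation (column means here) can then run over the whole data set with memory use proportional to one column rather than to the full matrix.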
Best,
Jim
Crispin Miller wrote:
> I'm not sure if this is your experience too?: we found we need a big
> machine mainly for the normalization/expression summary side of things -
> I don't think the package itself should need too much clout to manage
> the annotation; it's the arrays themselves that eat up the RAM.
I did look into exonmap, but was scared off by the memory (mainly) and
setup requirements. Is it possible to use APT (Affymetrix Power Tools)
to produce the exon-level summaries and then use the exonmap package for
the annotation and analysis? The advantage of using APT is that it uses
memory-efficient techniques in C++, unlike R.
Many thanks
Dan
Daniel Brewer wrote:
> I did look into exonmap, but was scared off by the memory (mainly) and
> setup requirements. Is it possible to use APT (affymetrix power tools)
> to produce the exon-level summaries and then use the exonmap package for
> the annotation and analysis? The advantages to using APT is that it
> uses memory efficient techniques in C++ unlike R.
I suppose you could, but that sort of misses the whole point of the
exonmap package, which I think is to be able to choose which probesets
you want to use when computing an expression value for a particular
gene.
The annotation will probably not be much different from what you get
using Affy's software (at the level of expression values for a set of
probesets), so in this instance I don't know if you gain much for the
outlay of time.
Best,
Jim
Hi Jim and Daniel,
James W. MacDonald wrote (08 May 2007 17:44):
> I suppose you could, but that sort of misses the whole point of the
> exonmap package, which I think is to be able to choose which probesets
> you want to use when computing an expression value for a particular
> gene.
> The annotation will probably not be much different from what you get
> using Affy's software (at the level of expression values for a set of
> probesets), so in this instance I don't know if you gain much for the
> outlay of time.
Finding the right probesets for a gene-level summary, as Jim wrote, is
one of the possible and important uses of exonmap.
However, it is also possible (as Daniel suggests) to get ready-made
numbers from APT (or its derivatives, ExACT or Expression Console),
import them into R via a flat file, and use exonmap just to process the
gene/transcript/exon/probeset annotations. I have done this myself a
couple of times without any trouble.
As for the annotations in the XMAP database (and so in exonmap), they
are genome-based and independent of Affy, so you can get, for example,
more detailed information on multiple targeting within the genome (i.e.
cross-hybridization).
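
The import-then-annotate route might look roughly like the sketch below. It uses Python only to show the shape of the workflow, not exonmap or APT calls. The file layout (`#` comment lines, a tab-delimited table with a `probeset_id` column), the probeset ids, the values and the `genome_hits` lookup are all assumptions for the example.

```python
import csv
import io

# Stand-in for a flat file of probeset summaries; layout and values are made up.
summary_txt = (
    "#%example-header=ignored\n"
    "probeset_id\tarray1.CEL\tarray2.CEL\n"
    "2315101\t7.25\t7.5\n"
    "2315102\t3.0\t3.5\n"
    "2315103\t9.0\t8.75\n"
)

# Hypothetical genome-based annotation: probeset id -> number of genomic hits.
genome_hits = {"2315101": 1, "2315102": 4, "2315103": 1}


def read_summary(handle):
    """Parse a tab-delimited summary table, skipping '#' comment lines."""
    rows = (line for line in handle if not line.startswith("#"))
    table = {}
    for row in csv.DictReader(rows, delimiter="\t"):
        probeset = row.pop("probeset_id")
        table[probeset] = {array: float(value) for array, value in row.items()}
    return table


expression = read_summary(io.StringIO(summary_txt))

# Keep only probesets that the annotation says hit the genome exactly once.
unique = {ps: vals for ps, vals in expression.items()
          if genome_hits.get(ps) == 1}
print(sorted(unique))
```

With a real file you would pass an open file handle instead of the `io.StringIO` stand-in, and the lookup table would come from the annotation database rather than a hard-coded dictionary.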
Cheers,
Michal