Question

Affy EXON Array: iterPLIER or RMA

Daniel Brewer ★ 1.9k

@daniel-brewer-1791

Hi,

I am doing some analysis on the Affymetrix Human Exon microarrays. Normally I would use RMA to summarize, but in the papers by Affymetrix they use iterPLIER for the gene-level estimates and PLIER for the exon-level estimates.

Is it OK to use RMA for both summary levels for these arrays? Does anyone have an opinion on what the best approach is?

Also, does anyone use DABG to filter out badly detected probes?

Thanks,

Daniel

--
Daniel Brewer, Ph.D.
Institute of Cancer Research
Molecular Carcinogenesis
Email: daniel.brewer at icr.ac.uk

Cancer plier • 1.9k views

ADD COMMENT • link updated 18.8 years ago by James W. MacDonald 68k • written 18.8 years ago by Daniel Brewer ★ 1.9k

score 0 · Answer 1 · 2007-05-04

James W. MacDonald 68k

@james-w-macdonald-5106

Hi Daniel,

Daniel Brewer wrote:
> I am doing some analysis on the Affymetrix Exon human microarrays.
> Normally I would use RMA to summarize, but in the papers by Affymetrix
> they use iterPLIER for the gene-level estimates and PLIER for the
> Exon-level estimates.
>
> Is it ok to use RMA for both summary levels for these arrays?
> Does anyone have an opinion on what the best approach to use is?

Have you looked at the exonmap package?

IIRC, Affy intend iterPLIER to correct for the fact that lots of the probesets on the exon array don't actually interrogate exons (according to current knowledge of the genome). In fact, Wing Wong has recently published a paper showing that the correlation between the 'core' probesets and the 'full' and 'extended' ones is really bad.

IMO, it is better to remove the probesets that we currently think are either interrogating multiple transcripts or missing the exons altogether _before_ computing any expression values, so you don't have to hope that the statistics are robust enough to ignore spurious signal.

One downside to the exonmap package is that you need a 64-bit Linux box with lots of RAM (which presumably you have already, else how are you doing anything with these things?). In addition, you need to install MySQL and set up the Ensembl core database, as well as the tables for exonmap. However, Michal and Crispin have given pretty detailed instructions for how to go about doing that (I was able to get set up, and my knowledge of DBs wouldn't fill a thimble).

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

ADD COMMENT • link 18.8 years ago James W. MacDonald 68k

Hi Daniel, Jim,

> One downside to the exonmap package is the fact that you need a 64 bit
> linux box with lots of RAM (which presumably you have already, else how
> are you doing anything with these things?). In addition, you need to
> install MySQL and set up the Ensembl core database, as well as the
> tables for exonmap. However, Michal and Crispin have given pretty
> detailed instructions for how to go about doing that (I was able to get
> set up, and my knowledge of DBs wouldn't fill a thimble).

Is this your experience too? We found that we need a big machine mainly for the normalization/expression summary side of things. I don't think the package itself should need too much clout to manage the annotation; it's the arrays themselves that eat up the RAM.

Cheers,

Crispin

ADD REPLY • link 18.8 years ago Crispin Miller ★ 1.1k

Crispin Miller wrote:
> I'm not sure if this is your experience too?: we found we need a big
> machine mainly for the normalization/expression summary side of things -
> I don't think the package itself should need too much clout to manage
> the annotation; it's the arrays themselves that eat up the RAM.

Exactly. Well, the cdf package is pretty big as well, although I don't know offhand how much RAM it takes. But you are exactly right: the arrays are the big RAM users here.

There are two possible fixes in the pipeline. For the cdf, there will soon be the ability to create a hybrid cdf/annotation package based on a SQLite database that will use much less RAM. Because SQLite doesn't have the capabilities that MySQL does (stored procedures, for instance), I don't think this will be able to supersede exonmap, but rather just complement it.

On the array side, there is the BufferedMatrix package, which puts most of the data on disk (and buffers a small, user-adjustable amount of data in RAM). The oligo package already uses BufferedMatrix for the raw data, and I imagine it could also use BufferedMatrix for ExpressionSets.

So hopefully we will soon get the analysis of these data back onto more modest computers.

Best,

Jim

ADD REPLY • link 18.8 years ago James W. MacDonald 68k

I did look into exonmap, but was scared off by the memory (mainly) and setup requirements. Is it possible to use APT (Affymetrix Power Tools) to produce the exon-level summaries and then use the exonmap package for the annotation and analysis? The advantage of using APT is that it uses memory-efficient techniques in C++, unlike R.

Many thanks,

Dan

ADD REPLY • link 18.8 years ago Daniel Brewer ★ 1.9k

Daniel Brewer wrote:
> I did look into exonmap, but was scared off by the memory (mainly) and
> setup requirements. Is it possible to use APT (affymetrix power tools)
> to produce the exon-level summaries and then use the exonmap package for
> the annotation and analysis? The advantages to using APT is that it
> uses memory efficient techniques in C++ unlike R.

I suppose you could, but that sort of misses the whole point of the exonmap package, which I think is to be able to choose which probesets you want to use when computing an expression value for a particular gene. The annotation will probably not be much different from what you get using Affy's software (at the level of expression values for a set of probesets), so in this instance I don't know if you gain much for the outlay of time.

Best,

Jim

ADD REPLY • link 18.8 years ago James W. MacDonald 68k

Hi Jim and Daniel,

Finding the right probesets for a gene-level summary, as Jim wrote, is one of the possible and important uses of exonmap. However, it is also possible (as Daniel suggests) to get ready-made numbers from APT (or its derivatives, ExACT or Expression Console), import them into R via a flat file, and use exonmap just to process the gene/transcript/exon/probeset annotations. I have done this myself a couple of times without problems.

As for the annotations in the XMAP database (and so in exonmap): they are genome-based and independent of Affy, so you can get, for example, more detailed information on multiple targeting within the genome (i.e. cross-hybridization).

Cheers,

Michal
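For readers who want to try the flat-file route described above, here is a rough sketch of the idea. It is written in Python rather than R purely for illustration, and it rests on several assumptions that are not from this thread: that the summary file is tab-delimited, that metadata lines start with `#`, and that the first column holds the probeset ID. The example data, the probeset IDs, the function names and the DABG thresholds (`alpha`, `min_fraction`) are all hypothetical; treat it as a starting point, not as the exonmap or APT workflow itself.

```python
import csv

# Made-up example data in the assumed layout: '#' metadata lines,
# then a tab-delimited table with one row per probeset.
SUMMARY_TEXT = (
    "#%chip_type=HuEx-1_0-st-v2\n"
    "probeset_id\tsample1.CEL\tsample2.CEL\n"
    "2315101\t7.25\t7.5\n"
    "2315102\t3.0\t3.5\n"
    "2315103\t9.0\t8.75\n"
)

# Made-up detection p-values in the same assumed layout.
DABG_TEXT = (
    "probeset_id\tsample1.CEL\tsample2.CEL\n"
    "2315101\t0.001\t0.02\n"
    "2315102\t0.4\t0.6\n"
    "2315103\t0.001\t0.2\n"
)


def read_summary(text):
    """Parse a summary table into (sample names, {probeset_id: [values]})."""
    # Drop blank lines and metadata/comment lines before handing rows to csv.
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)
    samples = header[1:]
    table = {row[0]: [float(v) for v in row[1:]] for row in reader}
    return samples, table


def keep_probesets(table, wanted):
    """Keep only the probesets chosen from the annotation (e.g. 'core' ones)."""
    return {pid: vals for pid, vals in table.items() if pid in wanted}


def detected_probesets(pvalues, alpha=0.05, min_fraction=0.5):
    """Return probesets with p < alpha in at least min_fraction of samples.

    Both cut-offs are arbitrary illustration values, not recommendations.
    """
    keep = set()
    for pid, pvals in pvalues.items():
        n_detected = sum(1 for p in pvals if p < alpha)
        if pvals and n_detected / len(pvals) >= min_fraction:
            keep.add(pid)
    return keep


samples, expression = read_summary(SUMMARY_TEXT)
_, dabg = read_summary(DABG_TEXT)
filtered = keep_probesets(expression, detected_probesets(dabg))
```

The selection step is kept separate from the parsing step so that the set of probesets to keep can come from anywhere: an annotation query, a detection filter like the one sketched here, or both combined.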

ADD REPLY • link 18.8 years ago Michal Okoniewski ▴ 120