Mismatch probe handling for exon arrays
@steven-mckinney-1754
Last seen 10.3 years ago
The new Affy Exon chips do not have a mismatch probe for every perfect match probe, but rather have a collection of GC-varied background probes (about 50000 probes out of the 6 million on the chip - the "background probe collection", BGP).

Info is in, amongst others,
http://www.affymetrix.com/support/technical/whitepapers/exon_background_correction_whitepaper.pdf

Affy has modified their PLIER algorithm to use this background probe collection to perform a "PM-GCBG" correction, using the median BGP intensity for probes with the same GC content as the PM probe.

Has any implementation of the PM-GCBG idea been done for justRMA() or justPLIER() in R/BioC?

Can anyone comment on which input parameters or control options of these functions should be specifically set to allow these functions to do a reasonable job of normalizing/correcting exon array data?

For example, justPLIER() has argument usemm=TRUE but no documentation about it - no doubt usemm = FALSE is appropriate for the exon arrays. But will the algorithms still perform all right?

Are there newer versions of the algorithms that handle the exon data configuration?

Any feedback appreciated.

Steven McKinney
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney at bccrc.ca
tel: 604-675-8000 x7561

BCCRC Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. V5Z 1L3
Canada
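To make the PM-GCBG idea described above concrete, here is a minimal sketch in Python. It is an illustration only, not Affymetrix's PLIER code and not anything available in justRMA() or justPLIER(). The function names (`gc_count`, `gc_background`, `pm_gcbg`), the short toy sequences, and the intensities are all made up for this example. The sketch assumes that "same GC content" means the same count of G and C bases in the probe sequence.

```python
from collections import defaultdict
from statistics import median


def gc_count(seq):
    """Number of G or C bases in a probe sequence."""
    return sum(base in "GC" for base in seq.upper())


def gc_background(bg_probes):
    """Median background intensity for each GC count.

    bg_probes: iterable of (sequence, intensity) pairs taken from the
    background probe collection (hypothetical input format).
    """
    bins = defaultdict(list)
    for seq, intensity in bg_probes:
        bins[gc_count(seq)].append(intensity)
    return {gc: median(values) for gc, values in bins.items()}


def pm_gcbg(pm_probes, bg_medians):
    """Subtract the GC-matched background median from each PM intensity."""
    corrected = []
    for seq, intensity in pm_probes:
        gc = gc_count(seq)
        if gc not in bg_medians:
            # Assumption: fail loudly rather than guess a background value.
            raise KeyError(f"no background probes with GC count {gc}")
        corrected.append(intensity - bg_medians[gc])
    return corrected


# Toy data: real probes are 25-mers; short sequences keep the example readable.
background = [
    ("ATATGC", 40.0), ("ATGCAT", 50.0), ("TTGCAA", 60.0),  # GC count 2
    ("GGCCAT", 90.0), ("GCGCTA", 110.0),                   # GC count 4
]
pm = [("AAGCTT", 250.0), ("CCGGAA", 400.0)]

bg_medians = gc_background(background)
corrected = pm_gcbg(pm, bg_medians)
print(bg_medians)
print(corrected)
```

A plain subtraction like this can produce values at or below zero for weakly expressed probes, so a real pipeline would need to decide how to handle them before any log transform.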
Cancer Breast probe affy plier • 965 views
@james-w-macdonald-5106
Last seen 3 hours ago
United States
Hi Steven,

Steven McKinney wrote:
> The new Affy Exon chips do not have a mismatch probe for
> every perfect match probe, but rather have a collection
> of GC-varied background probes (about 50000 probes out
> of the 6 million on the chip - the "background probe
> collection" BGP).
>
> Info is in, amongst others,
> http://www.affymetrix.com/support/technical/whitepapers/exon_background_correction_whitepaper.pdf
>
> Affy has modified their PLIER algorithm to use this
> background probe collection to perform a "PM-GCBG"
> correction, using the median BGP intensity for probes
> with the same GC content as the PM probe.
>
> Has any implementation of the PM-GCBG idea been done for
> justRMA() or justPLIER() in R/BioC?

Certainly not for justRMA(), since it doesn't use MM probe values.

> Can anyone comment on which input parameters or
> control options of these functions should be specifically
> set to allow these functions to do a reasonable job
> of normalizing/correcting exon array data?
>
> For example, justPLIER() has argument usemm=TRUE
> but no documentation about it - no doubt
> usemm = FALSE is appropriate for the exon arrays.
> But, will the algorithms still perform alright?
>
> Are there newer versions of the algorithms that handle
> the exon data configuration?

The problem with the exon arrays right now has to do with the amount of data involved and the current paradigm we use for analyzing these data. Currently we hold all the data in RAM, and given R's pass-by-value semantics, there can be quite a bit of copying. This isn't such a problem with the 3' biased arrays from Affy, especially if you have a reasonable amount of RAM. Unfortunately, the larger genotyping arrays and the exon arrays are so huge that this paradigm is really not working well anymore.

Going forward, the goal is to transition from holding the data in RAM to putting it all in SQLite databases, so one can work with a subset of the data that is appropriate given the amount of RAM available. Since this will involve putting three things in databases (the CDF information, the annotation information, and the data) that will all have to play nice together, it is inherently a slow process.

So, long story short, unless you have a 64-bit operating system and LOTS of RAM, the algorithm used to compute expression values is currently a moot point.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
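The idea of keeping intensities in an SQLite database and working on one RAM-sized subset at a time can be sketched as follows. This is only an illustration in Python with the standard `sqlite3` module, not the planned Bioconductor design. The table name `intensity`, its columns, the batch size, and the helper names are hypothetical. A per-probeset median stands in for a real summarization step such as RMA or PLIER.

```python
import sqlite3
from statistics import median

# In-memory database for the demo; on disk this would be a file path.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE intensity (probeset TEXT, probe_id INTEGER, value REAL)"
)
conn.executemany(
    "INSERT INTO intensity VALUES (?, ?, ?)",
    [
        ("ps1", 1, 100.0), ("ps1", 2, 120.0), ("ps1", 3, 140.0),
        ("ps2", 4, 30.0), ("ps2", 5, 50.0),
        ("ps3", 6, 75.0),
    ],
)


def probeset_batches(conn, batch_size):
    """Yield lists of probeset ids, at most batch_size ids per list."""
    ids = [row[0] for row in conn.execute(
        "SELECT DISTINCT probeset FROM intensity ORDER BY probeset")]
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]


def summarize(conn, batch_size=2):
    """Median intensity per probeset, loading one batch of probes at a time."""
    summary = {}
    for batch in probeset_batches(conn, batch_size):
        placeholders = ",".join("?" * len(batch))
        query = ("SELECT probeset, value FROM intensity "
                 f"WHERE probeset IN ({placeholders})")
        values = {}
        for probeset, value in conn.execute(query, batch):
            values.setdefault(probeset, []).append(value)
        # Only this batch's probe values are held in memory here.
        for probeset, vals in values.items():
            summary[probeset] = median(vals)
    return summary


summary = summarize(conn, batch_size=2)
print(summary)
```

The batch size is the knob that trades memory for the number of queries. Only the list of probeset ids and one batch of probe values need to fit in RAM at once.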