removing outlier/masked probes and gcrma
Andrew Su ▴ 20
@andrew-su-2000
@james-w-macdonald-5106
Hi Andrew,

Andrew Su wrote:
> I am attempting to use gcrma on AffyBatch objects which were read in
> using the "rm.outliers=TRUE" or "rm.mask=TRUE" options (to the ReadAffy
> function). For example, I put two MOE430 CEL files in the working
> directory, and here is what I tried:
>
> > ab<-ReadAffy(filenames=list.celfiles(),rm.outliers=TRUE)
> > ai<-compute.affinities(cdfName(ab))
> > data<-gcrma(ab,ai)
> Adjusting for optical effect..Done.
> Adjusting for non-specific binding.Error in
> gcrma.bg.transformation.fast(pms, bhat, var.y, k = k) :
>         NAs are not allowed in subscripted assignments

As you can see, you cannot have any NAs in your data to use gcrma. An alternative to this is to use the MBNI cdf/probe packages that have the probes with SNPs in the central 15 base pairs removed. Anything in this listing with SNP in the name has these probes removed.

http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/CDF_download_v6.asp

Note that there are some downsides to using these cdfs, mainly that the standard errors of your estimates will be highly variable, since the probesets for these cdfs are quite variable in size (unlike the stock affy chip, where the vast majority have 11 probes).

Best,

Jim

> > sessionInfo()
> Version 2.3.1 (2006-06-01)
> i386-pc-mingw32
>
> attached base packages:
> [1] "splines"   "tools"     "methods"   "stats"     "graphics"  "grDevices"
> [7] "utils"     "datasets"  "base"
>
> other attached packages:
> mouse4302probe   mouse4302cdf          gcrma    matchprobes           affy
>       "1.10.0"       "1.10.0"        "2.6.0"        "1.4.0"       "1.12.2"
>         affyio        Biobase
>        "1.0.0"       "1.10.1"
>
> I have tried using both R versions 2.3.1 and 2.1.0, and gcrma versions
> 1.1.4 and 2.6.0, and affy versions 1.12.2 and 1.10.0. I get a similar
> error when using the rm.mask=TRUE option.
>
> My overall goal is to remove select probes from the analysis (in this
> case, probes that overlap known polymorphisms). Any thoughts on how
> best to do this are most appreciated...
>
> Cheers,
> -andrew
>
> --
> Andrew Su, Ph.D.
> Genomics Institute of the
> Novartis Research Foundation
> asu at gnf.org
> Tel: 858-812-1656
> Fax: 858-812-1630
> http://web.gnf.org
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
James W. MacDonald
University of Michigan
Affymetrix and cDNA Microarray Core
1500 E Medical Center Drive
Ann Arbor MI 48109
734-647-5623

**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues.
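For readers wondering where the message "NAs are not allowed in subscripted assignments" comes from: it is a general R error, raised when a vector of several values is assigned through an index that contains NA. The sketch below reproduces it in base R only. It is not the gcrma source, and the names `pm` and `above` and all numbers are invented for illustration; the last lines show a simple check one might run before handing data to a method that cannot handle NAs.

```r
# Illustration only: base-R reproduction of the error message in this thread.
# This is NOT taken from gcrma; `pm` and `above` are made-up names.
pm <- c(120, 85, NA, 240, 60)   # one probe intensity masked to NA

above <- pm > 100               # logical index; NA wherever pm is NA

# Assigning several values through an index that contains NA fails.
msg <- tryCatch({
  pm[above] <- c(7, 8)
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# A cheap guard before calling a method that cannot handle NAs.
has_na <- any(is.na(pm))
print(has_na)
```

If `has_na` is TRUE for the intensities of an AffyBatch, the options discussed in this thread apply: avoid creating the NAs in the first place, fill them in, or remove the affected probes from the probeset definitions.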
Andrew Su ▴ 20
@andrew-su-2000
Thanks, Jim, for the thoughts below. Unfortunately, we are using a custom Affy chip design here, so those precomputed ones won't work for us. But we could certainly create a custom CDF file for our chip type too...

... but it surprises me somewhat that there isn't an alternate solution. First, what do people do with an AffyBatch object which was read in using the rm.mask option if it can't be used for further analyses? (Or is this a failing in how gcrma specifically deals with NAs?) And second, although custom CDFs would be great for dealing with ChipType-specific effects (e.g., SNPs), how do people deal with chip-specific effects (e.g., scratches and debris)?

Just a couple of thoughts... Any additional ideas are welcome, but we'll be pushing ahead on custom CDFs in the mean time...

Cheers,
-andrew

-----Original Message-----
From: James W. MacDonald [mailto:jmacdon@med.umich.edu]
Sent: Saturday, January 13, 2007 6:41 AM
To: Andrew Su
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] removing outlier/masked probes and gcrma
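The custom CDF idea amounts to deleting the unwanted probes from each probeset's definition so that no downstream method ever sees them. Here is a minimal base-R sketch of that bookkeeping. It assumes a plain list standing in for a CDF environment, with one matrix of cell indices per probeset and `pm`/`mm` columns; the probeset names, the indices, `snp_probes` and `drop_probes()` are all hypothetical, and this is not the API of any CDF package.

```r
# Stand-in for a CDF environment: each probeset maps to a matrix of cell
# indices with "pm" and "mm" columns. All names and numbers are invented.
cdf <- list(
  geneA_at = cbind(pm = c(101, 102, 103, 104), mm = c(201, 202, 203, 204)),
  geneB_at = cbind(pm = c(111, 112, 113, 114), mm = c(211, 212, 213, 214))
)

# Hypothetical PM cell indices that overlap known polymorphisms.
snp_probes <- c(102, 104, 113)

# Drop the flagged probe pairs (rows) from every probeset.
drop_probes <- function(cdf, bad_pm) {
  lapply(cdf, function(m) m[!(m[, "pm"] %in% bad_pm), , drop = FALSE])
}

filtered <- drop_probes(cdf, snp_probes)

# Probeset sizes are no longer uniform after filtering.
sizes <- vapply(filtered, nrow, integer(1))
print(sizes)
```

The uneven `sizes` are the downside Jim describes: probesets left with fewer probes give less precise estimates than those that keep all of theirs.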
> Thanks, Jim, for the thoughts below. Unfortunately, we are using a
> custom Affy chip design here, so those precomputed ones won't work for
> us. But we could certainly create a custom CDF file for our chip type
> too...

The package altcdfenvs will let you do so (almost) painlessly.

> .. but it surprises me somewhat that there isn't an alternate solution.
> First, what do people do with an AffyBatch object which was read in
> using the rm.mask option if it can't be used for further analyses? (Or
> is this a failing in how gcrma specifically deals with NAs?)

Dealing with unknown values (NAs) is quite a general problem: some methods can accommodate them, and others cannot proceed at all. One way is to try to infer what your missing data are. There are several approaches to pick from (ranging from taking the average to more elaborate techniques), keeping in mind that there is probably no magic wand for missing values; otherwise it would have been built into all methods. Once you have guessed what the missing values could be, you can apply your processing method. You may also consider something other than gcrma to process your data.

> And
> second, although custom CDFs would be great for dealing with
> ChipType-specific effects (e.g., SNPs), how do people deal with
> chip-specific effects (e.g., scratches and debris)?

affyPLM lets you fit chip effects at the probe level and the like, I think. That may be something to help you.

Hoping this helps,

Laurent
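The simplest end of the range Laurent mentions can be sketched in a few lines of base R. The example assumes a probes-by-arrays matrix of log2 intensities with NAs where probes were masked, and fills each NA with the median of the same probe on the other arrays (the mean would work the same way). The matrix values and the helper `impute_row_median()` are invented for illustration; this is one possible approach, not a recommendation from the thread.

```r
# Rows are probes, columns are arrays; values are invented log2 intensities.
x <- matrix(c( 8.1,  8.3,   NA,
               6.0,   NA,  6.4,
              10.2, 10.0, 10.1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("probe1", "probe2", "probe3"),
                            c("chipA", "chipB", "chipC")))

# Replace each NA with the median of the same probe on the other arrays.
# Rows that are entirely NA are left untouched.
impute_row_median <- function(x) {
  for (i in seq_len(nrow(x))) {
    miss <- is.na(x[i, ])
    if (any(miss) && !all(miss)) {
      x[i, miss] <- median(x[i, !miss])
    }
  }
  x
}

filled <- impute_row_median(x)
print(filled)
```

With only two arrays, as in Andrew's example, the "median of the other arrays" is just the single remaining value, so this kind of fill-in carries little information; that is one reason removing the probes from the CDF may be the cleaner route.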
Jenny Drnevich ★ 2.2k
@jenny-drnevich-382
Hi Andrew,

>... but it surprises me somewhat that there isn't an alternate solution.
>First, what do people do with an AffyBatch object which was read in
>using the rm.mask option if it can't be used for further analyses? (Or
>is this a failing in how gcrma specifically deals with NAs?) And
>second, although custom CDFs would be great for dealing with
>ChipType-specific effects (e.g., SNPs), how do people deal with
>chip-specific effects (e.g., scratches and debris)?

The short answer is, for Affymetrix expression arrays, we don't worry about scratches and debris. If there are only a few blemishes, the specific probes affected are likely to belong to completely different probesets. Most summarization methods calculate a probeset's value "robustly", meaning they down-weight or ignore an outlier probe, so most scratches and debris shouldn't have much effect on the resulting probeset values. At our facility, if the chip blemishes are > 10% of the array, we rerun the sample on another chip. Affy's reasonably good to us in replacing these 'defective' arrays free of charge.

If you just want to remove specific probes and/or probesets from all the arrays, the easiest way is likely the 'RemoveProbes' function that Amy mentioned in her response to you. However, if you want to consistently make this change and you'll be doing lots of arrays, then it might be better to take the time to make the custom CDF.

Cheers,
Jenny

Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801 USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu
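Jenny's point about robust summarization can be seen with a toy probeset. The sketch below uses invented log2 intensities for 11 probes, corrupts one of them to mimic a scratch, and compares how far a plain mean and a median move. The median is only a simple stand-in for the robust estimators real summarization methods use (median polish in RMA, for example); nothing here is taken from a specific package.

```r
# Invented log2 PM intensities for one probeset on one array (11 probes).
probes <- c(8.0, 8.2, 7.9, 8.1, 8.3, 8.0, 7.8, 8.2, 8.1, 7.9, 8.0)

# Simulate a scratch or piece of debris hitting a single probe.
blemished <- probes
blemished[4] <- 14.0

# Shift in the summary caused by the blemish, for a mean and for a median.
shift_mean   <- mean(blemished)   - mean(probes)
shift_median <- median(blemished) - median(probes)

print(c(mean = shift_mean, median = shift_median))
```

The mean is pulled upward by the single bad probe while the median does not move, which is why an isolated blemish touching one probe per probeset tends not to matter much once a robust summary is used.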
