Can't reproduce assessment data included in affycomp package

0

Entering edit mode

Harris A. Jaffee ▴ 590

@harris-a-jaffee-3972

Last seen 11.2 years ago

United States

Whether or not rma() has changed over the years, it appears that the read.newspikein() / assessSpikeIn2() pipeline has definitely changed. I just re-processed Rafa's original file here, from 2004: http://affycomp.jhsph.edu/AFFY2/rafa at jhu.edu/030429.1332/hgu133.csv The result was at least similar to Philippe's and different from the old one. It didn't matter whether I let the Affycomp tool process the file or if I did it by hand. For reference: http://affycomp.jhsph.edu/AFFY2/rafa at jhu.edu/030429.1332/ http://affycomp.jhsph.edu/AFFY2/rafa at jhu.edu/030429.1332/hgu133.csv http://affycomp.jhsph.edu/AFFY2/rafa at jhu.edu/030429.1332/spike- in-133- assessment.pdf http://affycomp.jhsph.edu/AFFY2/hj at jhu.edu/110419.1757/ http://affycomp.jhsph.edu/AFFY2/hj at jhu.edu/110419.1757/hgu133.csv http://affycomp.jhsph.edu/AFFY2/hj at jhu.edu/110419.1757/spike- in-133- assessment.pdf The two Fig. 5c's are different, but what does this mean? Begin forwarded message: > From: "Harris A. Jaffee" <hj at="" jhu.edu=""> > Date: April 19, 2011 3:53:39 PM EDT > To: Philippe Serhal <philippe.serhal at="" umontreal.ca=""> > Cc: Rafael Irizarry <rafa at="" jhu.edu="">, Kasper Daniel Hansen > <khansen at="" jhsph.edu=""> > Subject: Re: [BioC] Can't reproduce assessment data included in > affycomp package > > So we see differences by making this call: > > > affycomp.compfig5c(list(rma.assessment2.133$MA2, a$MA2)) > > And something is certainly wrong, because these two things are not > even the same size: > > > str(a$MA2$tp.low) > num [1:630] 0.00159 0.00317 0.00476 0.00635 0.00794 ... > > > str(rma.assessment2.133$MA2$tp.low) > num [1:504] 0.00198 0.00397 0.00595 0.00794 0.00992 ... > > If I did it right, your call to assessMA2() got 630 / 504 / 378 : > > $ fp.low : int [1:630] 0 0 0 0 0 0 0 0 0 0 ... > $ tp.low : num [1:630] 0.00159 0.00317 0.00476 0.00635 > 0.00794 ... > $ fp.med : int [1:504] 0 0 0 0 0 0 0 0 0 0 ... > $ tp.med : num [1:504] 0.00198 0.00397 0.00595 0.00794 > 0.00992 ... > $ fp.high : int [1:378] 0 0 0 0 0 0 0 0 0 0 ... > $ tp.high : num [1:378] 0.00265 0.00529 0.00794 0.01058 > 0.01323 ... > $ method.name: chr "RMA_testing" > > whereas Rafa had 504 / 378 / 378 : > > $ fp.low : num [1:504] 0 0 0 0 0 0 0 0 0 0 ... > $ tp.low : num [1:504] 0.00198 0.00397 0.00595 0.00794 > 0.00992 ... > $ fp.med : num [1:378] 0 0 0 0 0 0 0 0 0 0 ... > $ tp.med : num [1:378] 0.00265 0.00529 0.00794 0.01058 > 0.01323 ... > $ fp.high : num [1:378] 0 0 0 0 0 0 0 0 0 0 ... > $ tp.high : num [1:378] 0.00265 0.00529 0.00794 0.01058 > 0.01323 ... > $ method.name: chr "RMA" > > Are you getting what I have above, and does that suggest anything? > > > sessionInfo() > R version 2.13.0 Patched (2011-04-19 r55517) > Platform: x86_64-unknown-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.iso885915 LC_NUMERIC=C > [3] LC_TIME=en_US.iso885915 LC_COLLATE=en_US.iso885915 > [5] LC_MONETARY=C LC_MESSAGES=en_US.iso885915 > [7] LC_PAPER=en_US.iso885915 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.iso885915 LC_IDENTIFICATION=C > > attached base packages: > [1] splines stats graphics grDevices datasets utils > methods > [8] base > > other attached packages: > [1] affycomp_1.27.2 affycompData_0.99.0 hgu133atagcdf_2.8.0 > [4] affy_1.30.0 Biobase_2.12.1 > > loaded via a namespace (and not attached): > [1] affyio_1.20.0 preprocessCore_1.14.0 tools_2.13.0 > > > On Apr 19, 2011, at 11:57 AM, Philippe Serhal wrote: >> I can't quite seem to reproduce the precomputed MAS 5.0 and RMA >> assessments provided in the affycomp package. Although I have >> been unable to find the code that was originally used to generate >> them, I think it is fairly safe to assume it looked a little >> something like this (given a current working directory containing >> the HG-U133A spike-in data [1]): >> >> R> library (affy) >> R> library (affycomp) >> R> data <- ReadAffy () >> R> eset <- rma (data) >> R> write.table (data.frame (2 ^ exprs (eset), check.names = F), >> + file = "tmp.csv", sep = ",", col.names = NA, quote = F) >> R> tmp <- read.newspikein ("tmp.csv") >> R> a <- assessSpikeIn2 (tmp, method.name = "RMA_testing") >> >> I then compare this with the precomputed assessments: >> >> R> data (rma.assessment2.133) >> R> affycompPlot (rma.assessment2.133, a) >> >> Four of the six resulting figures [2] -- 1b, 2b, 4c, and 5e -- >> show identical results for the two assessments, while the other >> two -- 5c and 5d -- show significantly diverging results. >> Thinking perhaps the assessments had been computed using an older >> version of RMA, I tried 'bgversion = 1' in the rma() call, but >> that didn't help. In fact, this doesn't seem to have anything to >> do with the RMA pipeline, because if I repeat the process >> substituting 'mas5' for 'rma' and 'mas5.assessment2.133' for >> 'rma.assessment2.133' (and not exponentiating), I obtain similar >> results: 5c differs, but none of the others. >> >> Any idea what I'm doing wrong here? Thank you in advance for any >> insight you may be able to provide. >> >> R version: 2.10.0 >> affy version: 1.24.1 >> >> [1] HG-U133A spike-in data: http://www.biostat.jhsph.edu/~ririzarr/ >> affycomp/hgu133spikein.tgz >> [2] affycomp figure definitions: http://affycomp.jhsph.edu/AFFY2/ >> comp_form.html#definitions >> >> -- >> Philippe Serhal >> Functional and Structural Bioinformatics Lab >> Institute for Research in Immunology and Cancer (IRIC) >> Universit? de Montr?al >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: http://news.gmane.org/ >> gmane.science.biology.informatics.conductor >

Cancer affy affycomp PROcess safe Cancer affy affycomp PROcess safe • 1.3k views

ADD COMMENT • link 14.8 years ago Harris A. Jaffee ▴ 590

Login before adding your answer.