Question

Summarizing probe intensities before or after normalization: 1. How to do it with RMA? 2. Opinions?


k. brand ▴ 420


Dear All,

I compared two normalization approaches for an experiment using twelve Affy 430-2.0 chips (histogram plot comparing both methods forwarded on request).

#1. RMA

    library(affy)
    data <- ReadAffy()
    datarma <- rma(data)
    exprs2excel(datarma, file="dataRMA.csv")

Plotting histograms of the output shows arrays NOT perfectly aligning at the means and spreads.

I used a custom script to effect a quantile normalization on MAS5 preprocessed but unnormalized data:

#2. MAS5 sans interchip normalization

    library(affy)
    data <- ReadAffy()
    datamas5sannorm <- mas5(data, normalize=FALSE)
    exprs2excel(datamas5sannorm, file="datamas5sannorm.csv")
    f.qnorm <- function(x,qinit=0.75,perc=100) {...

The means and spreads of this normalization approach do align perfectly.

THUS: summarizing probe intensities before or after normalization does appear to make a noticeable difference, as may be expected.

My questions/requests:

1. Help to effect Bolstad normalization of the RMA preprocessed and summarized data. I succeeded in generating unnormalized RMA preprocessed data with

    library(affy)
    data <- ReadAffy()
    datarma <- rma(data, normalize=FALSE)

but, as a result of my limited R experience, I failed to find a method to effect Bolstad (quantile) normalization on this output.

2. Thoughts/comments on the benefits/caveats of normalizing before or after summarizing probe intensities.

I look forward to any thoughts, advice & suggestions from users.

Thanks in advance,

Karl

===========================================

> sessionInfo()
Version 2.3.0 (2006-04-24)
i386-pc-mingw32

attached base packages:
[1] "tools"     "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"  "base"

other attached packages:
    affy   affyio  Biobase
"1.10.0"  "1.0.0" "1.10.0"

--
Karl Brand <k.brand at erasmusmc.nl>
Department of Cell Biology and Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
lab +31 (0)10 408 7409
fax +31 (0)10 408 9468

Normalization probe affy affyio


score 0 · Answer 1 · 2006-09-11


Ben Bolstad ★ 1.2k


> 1. Help to effect Bolstad normalization of the RMA preprocessed and
> summarized data. Whilst I succeed in generating unnormalized RMA
> preprocessed data with-
>
> library(affy)
> data <- ReadAffy()
> datarma <- rma(data, normalize=FALSE)
>
> As a result of my limited R experience, I failed in finding a method to
> effect Bolstad (quantile) normalization on this output.

    library(affyPLM)
    datarma.postqnorm <- normalize(datarma)
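For readers who want to see what quantile normalization does to already summarized values, here is a minimal base R sketch. It is not the affyPLM or preprocessCore implementation: the function name `quantile_normalize` and the simulated matrix `expr` are hypothetical, and ties are handled naively. It only illustrates why the per-array means and spreads line up exactly after this step.

```r
# Toy quantile normalization in base R (illustrative sketch only; the
# function name and the simulated data are assumptions, and ties are
# broken naively by order of appearance).
quantile_normalize <- function(x) {
  stopifnot(is.matrix(x), !anyNA(x))
  # Rank of each value within its own column (one column per array)
  ranks <- apply(x, 2, rank, ties.method = "first")
  # Reference distribution: row-wise mean of the column-sorted matrix
  reference <- rowMeans(apply(x, 2, sort))
  # Replace each value with the reference value at its rank
  normalized <- apply(ranks, 2, function(r) reference[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

set.seed(1)
# Hypothetical summarized log2 expression values: 1000 probesets x 4 arrays,
# each array with its own shift and scale
expr <- sapply(c(0, 0.3, -0.2, 0.5), function(shift) {
  rnorm(1000, mean = 7 + shift, sd = 1.5 + shift / 2)
})
colnames(expr) <- paste0("array", 1:4)

expr_qn <- quantile_normalize(expr)

# Before: column means differ between arrays
print(round(colMeans(expr), 3))
# After: every column holds the same set of values, so the means and
# standard deviations agree (up to floating-point rounding)
print(round(colMeans(expr_qn), 3))
print(round(apply(expr_qn, 2, sd), 3))
```

Because every column ends up as a permutation of the same reference values, perfectly aligned histograms after quantile normalization are expected by construction, whichever summarization method produced the input.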



Ben,

My apologies, I see my error now.

    library(affyPLM)
    dat <- ReadAffy()
    datrma <- rma(dat, normalize=FALSE)
    datrma.postqnorm <- normalize(datrma)
    boxplot(datrma.postqnorm)

The above worked a treat!

I'd still very much like to hear your opinion on whether I might be better off normalizing before or after summarizing, given the variation I have to deal with.

Thanks again,

Karl


score 0 · Answer 2 · 2006-09-11

k. brand wrote:
> 1. Help to effect Bolstad normalization of the RMA preprocessed and
> summarized data. Whilst I succeed in generating unnormalized RMA
> preprocessed data with-
>
> library(affy)
> data <- ReadAffy()
> datarma <- rma(data, normalize=FALSE)

The next step would be

    datarma <- normalize.quantiles(exprs(datarma))

Also note that 'data' is not a very good variable name, as you are masking an existing function. When creating variable names, it is often enlightening to type the name at an R prompt first to see if you get any response.

> As a result of my limited R experience, I failed in finding a method to
> effect Bolstad (quantile) normalization on this output.
>
> 2. Thoughts/comments on the benefits/caveats of normalizing before or
> after summarizing probe intensities.

Normalizing after summarization for something like rma() seems questionable to me. Since the expression values are based on fitting a model to the PM probe values, if you don't normalize first you are ignoring any non-biological variability, which may end up biasing your results. Using median polish for the model fit should help protect against this, but I don't know that I would want to take chances.

As an aside, how far off are the histograms? Are you sure that there is a reasonable difference? Eyeballing a histogram isn't the best way to determine whether the mean and variance are different or not. A quick run through with some data here shows very little difference:

> eset <- justRMA(filenames=list.celfiles()[1:10])
> apply(exprs(eset),2,summary)
        Sample 1 Sample 2 Sample 3 Sample 4 Sample 5 Sample 6 Sample 7
Min.       4.085    4.070    4.091    4.051    4.068    4.090    4.087
1st Qu.    5.835    5.859    5.832    5.812    5.842    5.858    5.852
Median     7.079    7.069    7.048    7.061    7.070    7.077    7.080
Mean       7.225    7.227    7.224    7.227    7.229    7.225    7.232
3rd Qu.    8.352    8.324    8.351    8.363    8.361    8.330    8.347
Max.      14.550   14.440   14.420   14.400   14.490   14.430   14.260

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
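The suggestion above to compare arrays numerically rather than by eye can be sketched as follows. This is only an illustration under stated assumptions: the matrix `expr` is simulated and stands in for `exprs(eset)`, and the object names `per_array` and `across_array_range` are hypothetical.

```r
set.seed(42)
# Hypothetical stand-in for exprs(eset): 5000 probesets x 6 arrays of
# simulated log2 expression values (not real microarray data)
expr <- matrix(rnorm(5000 * 6, mean = 7.2, sd = 1.8), ncol = 6,
               dimnames = list(NULL, paste("Sample", 1:6)))

# Per-array summaries (min, quartiles, median, mean, max), one column
# per array
per_array <- apply(expr, 2, summary)
print(round(per_array, 3))

# For each summary statistic, the gap between the largest and smallest
# value across arrays; small gaps mean the arrays agree closely
across_array_range <- apply(per_array, 1, function(s) diff(range(s)))
print(round(across_array_range, 3))

# Per-array spread, measured two ways
print(round(apply(expr, 2, sd), 3))
print(round(apply(expr, 2, IQR), 3))
```

With real data, replacing `expr` with `exprs(datarma)` (or any other summarized expression matrix) would give the same kind of table, which makes it easier to judge whether the arrays differ by a meaningful amount.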