Question

Normalizing single-channel data [was: is my normalization right?]

Gordon Smyth 50k

@gordon-smyth

WEHI, Melbourne, Australia

Dear Xiaopeng,

You are raising the issue of normalizing single-channel (non-Affy) microarray data. This is not yet documented, but it is not difficult using the between-array normalization methods provided in limma or vsn.

Firstly, let me point out that your text file doesn't contain the "raw data" from Genepix, since it doesn't contain background intensities. Have you already subtracted the background or have you just ignored it? What did you do with the Genepix flags?

1. Given a text file like you describe, you can read it into R using the basic function read.table():

    Data <- read.table("myfile.txt", sep="\t", header=TRUE)  # I assume your file is tab-delimited with a header row
    y <- as.matrix(Data[,-1])
    rownames(y) <- as.character(Data[,1])

Now you have two major normalization choices, quantile or vsn normalization:

    library(limma)
    y2 <- normalizeBetweenArrays(log2(y), method="quantile")

or

    y2 <- normalizeBetweenArrays(y, method="vsn")

Now you are ready to go straight into differential expression analysis using limma, like

    fit <- lmFit(y2, design)

If you use quantile normalization, you must make sure that all your intensities are positive before normalizing (that is, before taking log2), for example by

    y <- pmax(1, y)

2. You never did need to extract the intensity data from the Genepix gpr files in the first place. You could have proceeded in limma as

    targets <- readTargets()  # Always good practice to make a targets file
    RG <- read.maimages(targets$FileName, source="genepix",
        columns=list(Rf="F532 Mean",Gf="F532 Mean",Rb="B532 Median",Gb="B532 Median"))
    y2 <- normalizeBetweenArrays(RG$G, method="quantile")

Or you might choose to apply backgroundCorrect() before normalizeBetweenArrays().

Gordon

>xpzhang xpzhang at genetics.ac.cn
>Sat May 29 09:21:55 CEST 2004
>
>Thank you for your answer!
>
>My raw data was from GenePix. Because I used only Cy3 in my whole
>microarray experiment, I only extracted the data with the software, and I am
>trying to normalize the data with Bioconductor.
>
>I made a .txt file for the raw data; it looks like this:
>
>Gene Name Contrl(intensity) Treat1(intensity) Treat2(intensity) Treat3(intensity)
>1
>2
>3
>4
>5
>...
>
>I want to use multiple-slide, intensity-dependent normalization. Is
>it appropriate? And could you tell me how to do it? I am trying to find
>a way by reading Bioconductor's documentation and help files, but I find it
>really difficult.
>
>Thank you very much!
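As a rough illustration of the text-file route in point 1, the base-R sketch below reads a small tab-delimited file, floors the intensities at 1, takes log2, and quantile-normalizes the columns. The file contents, the column names, and the helper `quantile_normalize` are all invented for this example. The helper only mimics the idea behind `method="quantile"` (every array is forced onto the same sorted set of values) and is not limma's implementation, which also deals with ties and missing values.

```r
# Hypothetical stand-in for "myfile.txt"; all values are invented for illustration.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "Gene\tControl\tTreat1\tTreat2",
  "g1\t120\t300\t90",
  "g2\t40\t-5\t60",
  "g3\t800\t950\t700",
  "g4\t15\t20\t10"
), tmp)

# header = TRUE because this example file starts with a header row (an assumption).
Data <- read.table(tmp, sep = "\t", header = TRUE)
y <- as.matrix(Data[, -1])
rownames(y) <- as.character(Data[, 1])

# Floor at 1 so that log2() is defined for every spot
# (same intent as y <- pmax(1, y) in the post).
y[y < 1] <- 1
ylog <- log2(y)

# Hand-rolled quantile normalization: replace each value by the mean of the
# values holding the same rank across arrays. No tie or NA handling.
quantile_normalize <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")
  target <- rowMeans(apply(x, 2, sort))
  out    <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(x)
  out
}

y2 <- quantile_normalize(ylog)
print(round(y2, 2))
```

After this step every column of `y2` contains the same set of values, only in a different order. In a real analysis the limma call from the post would replace the hand-rolled helper.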

Genetics Normalization GO vsn limma

Updated 19.9 years ago by xpzhang ▴ 90 • written 19.9 years ago by Gordon Smyth 50k

score 0 · Answer 1 · 2004-06-02

Dear Gordon,

Thank you very much! I will try the method you described as soon as possible.

On Wed, 02 Jun 2004 09:44:25 +1000 Gordon Smyth <smyth@wehi.edu.au> wrote:

> Firstly, let me point out that your text file doesn't contain the "raw
> data" from Genepix since it doesn't contain background intensities. Have
> you already subtracted the background or have you just ignored it? What did
> you do with the Genepix flags?

Here is some more detail about the data that I want to normalize. The scanner is an Axon 4000B. I took the raw data from the *.gpr file and chose only two columns: F532 Median and B532 Median. Then I subtracted the background from F532 Median in Excel. Finally, I got a *.txt file as I described last time.

I have another question about the subtraction. After I subtracted the background, some of the values were negative, which seems unreasonable to me. What would happen if I did not subtract the background? Does the data contain more error with background subtraction than without it?

--
Xiaopeng ZHANG <xpzhang@genetics.ac.cn>
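The negative values described in the reply appear whenever a spot's background estimate exceeds its foreground. A minimal base-R sketch of that effect, using invented foreground and background medians for four hypothetical spots, shows why such values matter once log2 is taken and how flooring at 1 (the `pmax` suggestion from the original post) keeps every spot usable. It does not say which treatment of the background is statistically preferable.

```r
# Invented F532 Median / B532 Median values for four hypothetical spots.
fg <- c(spotA = 520, spotB = 95, spotC = 60, spotD = 2300)
bg <- c(spotA = 80,  spotB = 90, spotC = 75, spotD = 85)

# Plain subtraction: spotC goes negative because its background exceeds its foreground.
subtracted <- fg - bg

# log2 of a negative number is NaN, so spotC is lost without further handling.
log_sub <- suppressWarnings(log2(subtracted))

# Flooring at 1 keeps every spot on the log scale.
floored     <- pmax(subtracted, 1)
log_floored <- log2(floored)

# For comparison: log2 of the foreground with the background ignored.
log_raw <- log2(fg)

print(round(cbind(subtracted, log_sub, log_floored, log_raw), 2))
```

Here `spotC` comes out as NaN after plain subtraction and as 0 after flooring, while the foreground-only column stays finite for all spots.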