Question about quantile normalization

qwertyui_period@yahoo.co.jp ▴ 10

@qwertyui_periodyahoocojp-4782

Last seen 9.2 years ago

Dear all,

My environment is limma version 3.2.2, R version 2.10.1, and Windows XP. I am going to normalize microarray data with "normalizeBetweenArrays", which is the quantile normalization function in the "limma" package. I have read "usersguide.pdf" on the Bioconductor website, but I still have two questions.

Question 1: Which scale is appropriate for quantile normalization: raw-scale or log2-scale values? Quantile normalization includes a step that calculates an arithmetic mean, so I suppose raw-scale values should be used, although microarray data are generally log2-scale values.

Question 2: How does "normalizeBetweenArrays" deal with "N/A" in the data?

Example code 1:

    > library(limma)
    > ngenes <- 3
    > narrays <- 2
    > (x <- matrix(c(3,1,5,6,4,2),ngenes,narrays))
         [,1] [,2]
    [1,]    3    6
    [2,]    1    4
    [3,]    5    2
    > (y <- normalizeBetweenArrays(x))
         [,1] [,2]
    [1,]  3.5  5.5
    [2,]  1.5  3.5
    [3,]  5.5  1.5

I understand that the process of "normalizeBetweenArrays" is divided into 4 steps, as follows:

step 1: Sorting the values in each column in descending order

         [,1] [,2]
    [1,]    5    6
    [2,]    3    4
    [3,]    1    2

step 2: Averaging the values at each rank

         [,1] [,2] Average
    [1,]    5    6     5.5
    [2,]    3    4     3.5
    [3,]    1    2     1.5

step 3: Replacing the values in each column with the average

         [,1] [,2] Average
    [1,]  5.5  5.5     5.5
    [2,]  3.5  3.5     3.5
    [3,]  1.5  1.5     1.5

step 4: Returning the values in each column to their original positions

         [,1] [,2]
    [1,]  3.5  5.5
    [2,]  1.5  3.5
    [3,]  5.5  1.5

Then, how does "normalizeBetweenArrays" deal with "N/A" in the data?

Example code 2:

    > (x <- matrix(c(NA,1,5,6,4,2),ngenes,narrays))
         [,1] [,2]
    [1,]   NA    6
    [2,]    1    4
    [3,]    5    2
    > (y <- normalizeBetweenArrays(x))
         [,1] [,2]
    [1,]   NA  5.5
    [2,]  1.5  3.5
    [3,]  5.5  1.5

Thanks in advance!
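The four steps described in the question can be sketched in base R. This is only an illustration of the procedure as the poster describes it, not limma's actual code: `quantile_normalize_sketch` is a hypothetical helper name, and the sketch assumes a numeric matrix with no missing values. Ties are broken by position, which is an arbitrary choice made here.

```r
# Hypothetical base-R helper illustrating the four steps; not a limma function.
# Assumes a numeric matrix without NA values.
quantile_normalize_sketch <- function(x) {
  # Step 1: sort each column (ascending here; the direction does not affect the result)
  sorted <- apply(x, 2, sort)
  # Step 2: average across columns at each rank
  rank_means <- rowMeans(sorted)
  # Steps 3 and 4: give every value the mean for its rank, at its original position
  apply(x, 2, function(col) rank_means[rank(col, ties.method = "first")])
}

x <- matrix(c(3, 1, 5, 6, 4, 2), nrow = 3, ncol = 2)
y <- quantile_normalize_sketch(x)
print(y)
```

For the matrix in Example code 1 this gives the same values as the `normalizeBetweenArrays` output quoted in the question.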

Microarray Normalization limma PROcess • 3.3k views

updated 14.6 years ago by K J ▴ 10 • written 14.6 years ago by qwertyui_period@yahoo.co.jp ▴ 10

Laurent Gautier ★ 2.3k

@laurent-gautier-29

Last seen 11.5 years ago

On 2011-08-01 06:55, qwertyui_period at yahoo.co.jp wrote:

> Question 1: Which is proper to use for quantile normalization: raw-scale or log2-scale values?
> The quantile normalization includes the step of calculating arithmetic mean,
> so I suppose the raw-scale values should be used, though the microarray
> data is generally log2-scale values.

Quantile normalization is usually performed on untransformed data ("raw-scale"). Log2 transformation comes after (before probe summary when using RMA or RMA-like approaches).

> Question 2: How does "normalizeBetweenArrays" deal with "N/A" in data?

Missing values are just ignored and left as such (missing values).

Hoping this helps,

L.
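To see why the order of the two operations matters, here is a small base-R sketch comparing "normalize on the raw scale, then take log2" with "take log2, then normalize". The helper `quantile_normalize_sketch` is a hypothetical stand-in written for this illustration (it is not limma code and assumes no missing values), and the raw values are made up so that their log2 values equal the matrix in Example code 1.

```r
# Hypothetical base-R helper (not part of limma); assumes no NA values.
quantile_normalize_sketch <- function(x) {
  rank_means <- rowMeans(apply(x, 2, sort))
  apply(x, 2, function(col) rank_means[rank(col, ties.method = "first")])
}

# Illustrative raw-scale intensities; log2(raw) is the matrix from Example code 1
raw <- matrix(c(8, 2, 32, 64, 16, 4), nrow = 3, ncol = 2)

raw_then_log  <- log2(quantile_normalize_sketch(raw))  # normalize first, log2 afterwards
log_then_norm <- quantile_normalize_sketch(log2(raw))  # log2 first, normalize afterwards

print(raw_then_log)
print(log_then_norm)
print(raw_then_log - log_then_norm)
```

The two results differ because the log of an arithmetic mean is never smaller than the mean of the logs, so the scale on which the rank averages are taken changes the normalized values.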

written 14.6 years ago by Laurent Gautier ★ 2.3k

K J ▴ 10

@k-j-4788

Last seen 11.5 years ago

Dear Dr. Laurent Gautier,

Thanks for your kind reply.

> Quantile normalization is usually performed on untransformed data ("raw-scale").

I read some journal articles on RMA and RMA++, and confirmed that raw-scale values are usually used for quantile normalization, as you say.

> Missing values are just ignored and left as such (missing values).

I'm afraid I don't quite understand your explanation of how N/A is treated. Based on the code of "normalizeBetweenArrays", it seems that each N/A is first replaced with the column median, and then, after ranking, averaging and re-ordering to the original positions, it is changed back to N/A. Is this correct? I describe an example below.

step 1: Calculating the median of each column

    > ngenes <- 5
    > narrays <- 2
    > x <- matrix(c(1:10),ngenes,narrays)
    > x[2,1] <- x[3,1] <- NA
    > x[4,2] <- x[5,2] <- NA
    > x
         [,1] [,2]
    [1,]    1    6
    [2,]   NA    7
    [3,]   NA    8
    [4,]    4   NA
    [5,]    5   NA
    > (xm <- apply(x,2,median,na.rm=TRUE))
    [1] 4 7

step 2: Replacing N/A in each column with the median

    > x[2,1] <- x[3,1] <- xm[1]
    > x[4,2] <- x[5,2] <- xm[2]
    > x
         [,1] [,2]
    [1,]    1    6
    [2,]    4    7
    [3,]    4    8
    [4,]    4    7
    [5,]    5    7

step 3: Sorting the values in each column in descending order

         [,1] [,2]
    [1,]    5    8
    [2,]    4    7
    [3,]    4    7
    [4,]    4    7
    [5,]    1    6

step 4: Averaging the values at each rank

         [,1] [,2] Average
    [1,]    5    8     6.5
    [2,]    4    7     5.5
    [3,]    4    7     5.5
    [4,]    4    7     5.5
    [5,]    1    6     3.5

step 5: Replacing the values in each column with the average

         [,1] [,2] Average
    [1,]  6.5  6.5     6.5
    [2,]  5.5  5.5     5.5
    [3,]  5.5  5.5     5.5
    [4,]  5.5  5.5     5.5
    [5,]  3.5  3.5     3.5

step 6: Returning the values in each column to their original positions

         [,1] [,2]
    [1,]  3.5  3.5
    [2,]  5.5  5.5
    [3,]  5.5  6.5
    [4,]  5.5  5.5
    [5,]  6.5  5.5

step 7: Replacing the values with N/A at the original N/A positions

         [,1] [,2]
    [1,]  3.5  3.5
    [2,]   NA  5.5
    [3,]   NA  6.5
    [4,]  5.5   NA
    [5,]  6.5   NA

This result agrees with the normalizeBetweenArrays() result:

    > x <- matrix(c(1:10),ngenes,narrays)
    > x[2,1] <- x[3,1] <- NA
    > x[4,2] <- x[5,2] <- NA
    > (y <- normalizeBetweenArrays(x))
         [,1] [,2]
    [1,]  3.5  3.5
    [2,]   NA  5.5
    [3,]   NA  6.5
    [4,]  5.5   NA
    [5,]  6.5   NA

J, K
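The seven steps above can be collected into one base-R function. This is a sketch of the procedure as K J describes it, and it is an assumption that limma works this way internally; `median_fill_sketch` is a hypothetical name, and ties are broken by position, which is an arbitrary choice made here.

```r
# Hypothetical base-R helper following the seven steps described above.
# Not a limma function, and not confirmed to match limma's internals.
median_fill_sketch <- function(x) {
  missing <- is.na(x)  # remember where the NA values were
  # Steps 1-2: replace NA in each column with that column's median
  filled <- apply(x, 2, function(col) {
    col[is.na(col)] <- median(col, na.rm = TRUE)
    col
  })
  # Steps 3-4: sort each column and average across columns at each rank
  rank_means <- rowMeans(apply(filled, 2, sort))
  # Steps 5-6: give every value the mean for its rank, at its original position
  y <- apply(filled, 2, function(col) rank_means[rank(col, ties.method = "first")])
  # Step 7: restore NA at the original positions
  y[missing] <- NA
  y
}

x <- matrix(1:10, nrow = 5, ncol = 2)
x[2, 1] <- x[3, 1] <- NA
x[4, 2] <- x[5, 2] <- NA
y <- median_fill_sketch(x)
print(y)
```

For this matrix the sketch returns the same values as the normalizeBetweenArrays() output shown above. Agreement on one example does not establish that the two procedures are identical in general.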

written 14.6 years ago by K J ▴ 10
