Question about quantile normalization
@qwertyui_periodyahoocojp-4782
Dear all,

My environment is limma version 3.2.2, R version 2.10.1, and Windows XP. I am going to normalize microarray data with "normalizeBetweenArrays", the quantile normalization function in the "limma" package. I have read "usersguide.pdf" on the Bioconductor website, but I still have two questions.

Question 1: Which scale is appropriate for quantile normalization: raw-scale or log2-scale values?
Quantile normalization includes a step that calculates an arithmetic mean, so I suppose raw-scale values should be used, even though microarray data are generally handled as log2-scale values.

Question 2: How does "normalizeBetweenArrays" deal with "N/A" in the data?

Example code 1:

> ngenes <- 3
> narrays <- 2
> x <- matrix(c(3,1,5,6,4,2),ngenes,narrays)
> x
     [,1] [,2]
[1,]    3    6
[2,]    1    4
[3,]    5    2
> (y <- normalizeBetweenArrays(x))
     [,1] [,2]
[1,]  3.5  5.5
[2,]  1.5  3.5
[3,]  5.5  1.5

As I understand it, the process of "normalizeBetweenArrays" is divided into 4 steps, as follows:

step 1: Sorting the values in each column in descending order

     [,1] [,2]
[1,]    5    6
[2,]    3    4
[3,]    1    2

step 2: Averaging the values at each rank

     [,1] [,2] Average
[1,]    5    6     5.5
[2,]    3    4     3.5
[3,]    1    2     1.5

step 3: Replacing the values in each column with the average

     [,1] [,2] Average
[1,]  5.5  5.5     5.5
[2,]  3.5  3.5     3.5
[3,]  1.5  1.5     1.5

step 4: Putting the values in each column back in their original positions

     [,1] [,2]
[1,]  3.5  5.5
[2,]  1.5  3.5
[3,]  5.5  1.5

Then, how does "normalizeBetweenArrays" deal with "N/A" in the data?

Example code 2:

> (x <- matrix(c(NA,1,5,6,4,2),ngenes,narrays))
     [,1] [,2]
[1,]   NA    6
[2,]    1    4
[3,]    5    2
> (y <- normalizeBetweenArrays(x))
     [,1] [,2]
[1,]   NA  5.5
[2,]  1.5  3.5
[3,]  5.5  1.5

Thanks in advance!
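For readers who want to try the four steps by hand, here is a minimal base R sketch of the procedure as described in the post. It is an illustration only, not limma's code: the function name `quantile_normalize_sketch` is hypothetical, it assumes a numeric matrix with at least two rows and no missing values, and it breaks ties by order of appearance, which is a simplification.

```r
# Sketch of the four steps from the post, in base R (not limma's implementation).
# Assumes a numeric matrix with at least two rows and no missing values.
quantile_normalize_sketch <- function(x) {
  # Step 1: sort each column (ascending here; the direction does not affect the result)
  sorted <- apply(x, 2, sort)
  # Step 2: average across columns at each rank
  rank_means <- rowMeans(sorted)
  # Steps 3 and 4: give every value the mean for its rank, kept at its original position
  ranks <- apply(x, 2, rank, ties.method = "first")
  y <- apply(ranks, 2, function(r) rank_means[r])
  dimnames(y) <- dimnames(x)
  y
}

x <- matrix(c(3, 1, 5, 6, 4, 2), nrow = 3, ncol = 2)
y <- quantile_normalize_sketch(x)
print(y)
```

On the matrix from Example code 1 this gives the same values as the output quoted in the post (3.5, 1.5, 5.5 in the first column and 5.5, 3.5, 1.5 in the second).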
Microarray Normalization limma PROcess • 3.1k views
Laurent Gautier ★ 2.3k
@laurent-gautier-29
On 2011-08-01 06:55, qwertyui_period at yahoo.co.jp wrote:
> Question 1: Which is proper to use for quantile normalization: raw-scale or
> log2-scale values ?
> The quantile normalization includes the step of calculating arithmetic mean,
> so I suppose the raw-scale values should be used, though the microarray
> data is generally log2-scale values.

Quantile normalization is usually performed on untransformed ("raw-scale") data. The log2 transformation comes afterwards (before probe summarization when using RMA or RMA-like approaches).

> Question 2: How does "normalizeBetweenArrays" deal “N/A” in data ?

Missing values are simply ignored and left as they are (missing).

Hoping this helps,

L.
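To make the ordering in this reply concrete, the following base R sketch compares "normalize on the raw scale, then take log2" with "take log2, then normalize". Everything in it is an assumption for illustration: the intensities are made-up toy values, and `qn` is a hypothetical helper for complete matrices, not a limma function.

```r
# Toy raw-scale intensities (made-up values, 4 probes x 2 arrays)
raw <- matrix(c(100, 400, 1600, 200,
                250, 1000, 4000, 500), nrow = 4, ncol = 2)

# Minimal quantile normalization for a complete matrix (hypothetical helper)
qn <- function(x) {
  rank_means <- rowMeans(apply(x, 2, sort))
  apply(x, 2, function(col) rank_means[rank(col, ties.method = "first")])
}

# Order described in the reply: normalize on the raw scale, then take log2
norm_then_log <- log2(qn(raw))

# The other order, for comparison: take log2 first, then normalize
log_then_norm <- qn(log2(raw))

print(norm_then_log)
print(log_then_norm)
```

The two results differ because averaging raw values is an arithmetic mean, while averaging log2 values corresponds to a geometric mean on the raw scale, so the choice of scale does affect the normalized values.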
K J ▴ 10
@k-j-4788
Dear Dr. Laurent Gautier,

Thanks for your kind reply.

> Quantile normalization is usually performed on untransformed data ("raw-scale").

I read some journal articles on RMA and RMA++, and confirmed that, as you say, raw-scale values are usually used for quantile normalization.

> Missing values are just ignored and left as such (missing values).

I'm afraid I don't quite understand how the N/A values are treated. Judging from the code of "normalizeBetweenArrays", it seems that each N/A is first replaced with the column median and then, after ranking, averaging, and re-ordering to the original positions, it is turned back into N/A. Is this correct? Here is an example:

step 1: Calculating the median of each column

> ngenes <- 5
> narrays <- 2
> x <- matrix(c(1:10),ngenes,narrays)
> x[2,1] <- x[3,1] <- NA
> x[4,2] <- x[5,2] <- NA
> x
     [,1] [,2]
[1,]    1    6
[2,]   NA    7
[3,]   NA    8
[4,]    4   NA
[5,]    5   NA
> (xm <- apply(x,2,median,na.rm=T))
[1] 4 7

step 2: Replacing the N/A values in each column with the median

> x[2,1] <- x[3,1] <- xm[1]
> x[4,2] <- x[5,2] <- xm[2]
> x
     [,1] [,2]
[1,]    1    6
[2,]    4    7
[3,]    4    8
[4,]    4    7
[5,]    5    7

step 3: Sorting the values in each column in descending order

     [,1] [,2]
[1,]    5    8
[2,]    4    7
[3,]    4    7
[4,]    4    7
[5,]    1    6

step 4: Averaging the values at each rank

     [,1] [,2] Average
[1,]    5    8     6.5
[2,]    4    7     5.5
[3,]    4    7     5.5
[4,]    4    7     5.5
[5,]    1    6     3.5

step 5: Replacing the values in each column with the average

     [,1] [,2] Average
[1,]  6.5  6.5     6.5
[2,]  5.5  5.5     5.5
[3,]  5.5  5.5     5.5
[4,]  5.5  5.5     5.5
[5,]  3.5  3.5     3.5

step 6: Putting the values in each column back in their original positions

     [,1] [,2]
[1,]  3.5  3.5
[2,]  5.5  5.5
[3,]  5.5  6.5
[4,]  5.5  5.5
[5,]  6.5  5.5

step 7: Replacing the values with N/A at the original N/A positions

     [,1] [,2]
[1,]  3.5  3.5
[2,]   NA  5.5
[3,]   NA  6.5
[4,]  5.5   NA
[5,]  6.5   NA

This result matches the output of normalizeBetweenArrays():

> x <- matrix(c(1:10),ngenes,narrays)
> x[2,1] <- x[3,1] <- NA
> x[4,2] <- x[5,2] <- NA
> (y <- normalizeBetweenArrays(x))
     [,1] [,2]
[1,]  3.5  3.5
[2,]   NA  5.5
[3,]   NA  6.5
[4,]  5.5   NA
[5,]  6.5   NA

J, K
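The seven steps described in this post can be sketched in base R as follows. This mirrors the poster's description (median fill-in followed by masking) and is not a statement about what limma does internally; the function name `quantile_normalize_na_sketch` is hypothetical, and ties are broken by order of appearance as a simplification.

```r
# Sketch of the seven steps described in the post (median fill-in, then masking).
# It follows the poster's description, not necessarily limma's internals.
quantile_normalize_na_sketch <- function(x) {
  missing <- is.na(x)
  # Steps 1-2: replace the NAs in each column with that column's median
  filled <- apply(x, 2, function(col) {
    col[is.na(col)] <- median(col, na.rm = TRUE)
    col
  })
  # Steps 3-4: sort each column and average the values at each rank
  rank_means <- rowMeans(apply(filled, 2, sort))
  # Steps 5-6: give each value the mean for its rank, at its original position
  y <- apply(filled, 2, function(col) rank_means[rank(col, ties.method = "first")])
  # Step 7: restore NA at the originally missing positions
  y[missing] <- NA
  y
}

ngenes <- 5
narrays <- 2
x <- matrix(c(1:10), ngenes, narrays)
x[2, 1] <- x[3, 1] <- NA
x[4, 2] <- x[5, 2] <- NA
y <- quantile_normalize_na_sketch(x)
print(y)
```

On this example the sketch gives the same matrix as the step 7 table in the post. Agreement on one small example does not show that the two procedures are identical in general, so the result on real data with missing values is best checked against normalizeBetweenArrays() directly.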
