Question

Limma: four suggestions related to weights and background correction

0

Entering edit mode

Ramon Diaz ★ 1.1k

@ramon-diaz-159

Last seen 9.7 years ago

Dear Gordon and other bioconductor-users, We have encountered a few recurrent problem when using limma. If I may, I'd like to make four suggestions for changes which might benefit other people: 1. When "backgroundCorrect(temp1, method = "none")" is used columns for Rb and Gb are eliminated from the RGList object. We have found this problematic, because, even if one does not want to do background subtraction, one might still want to look at image plots of background, just to check nothing funny has happened with the background. The problem is that the subsequent step of using "normalizeWithinArrays" calls MA.RG, which itself uses the presence or absence of the Rb and Gb to do (or not) the subtraction. Then, if we still want to plot background and do the normalization, we either have to make sure we are done with all background business before we do the normalization (since after that all background info will be lost) or we need to keep the normalized and the unnormalized objects around. None of these seem ideal. We wonder if it would be possible to pass the type of subtraction to "normalizeWithinArrays" and to "MA.RG" as a parameter, so that the original object preservers all the information, but still we can do no background subtraction. (We are hacking the code here, for our internal use). 2. We have found it slightly confussing that the M value of a point with weight 0 for the normalization is not an NA. Sure, the points with weight = 0 do not get used to fit the loess curve, but they still have a residual. But, so far, with the wet lab people we have worked, it has always been the case that if a point was deemed unsuitable for the normalization it was also unsuitable for further analyses (though, occasionally, points suitable for fitting the normalization curve have been deemed unsuitable for further analyses). Are we missing something obvious here? 3. Occasionally, a complet print-tip group might be missing which will make "normalizeWithinArrays" to fail. We have added the following to that function, in the print-tip loess case, right before the line "spots <- spots + nspots" (and substituting the direct call to loessFit): ****************************************** num.fill <- length(object$M[spots, j]) object$M[spots, j] <- rep(NA, num.fill) no.objects <- try(object$M[spots, j] <- loessFit(y, x, w, span = span, iterations = iterations)$residuals) if(class(no.objects) == "try-error") print(paste("WARNING: in print-tip loess normalization", "no samples in array", j, "grid row", gridr, "grid column", gridc)) ******************************************** 4. When all weights are either 0 or 1, there are slight differences in the results of the normalization (of points with weight = 1) depending on whether or not the cases with weight of 0 are passed to subsequent calls. Sure, these differences are irrelevant, but it can be disconcerting. Wouldn't it be appropriate to check for the binary situation of weights either 0 or 1, and if so, exclude points with weight 0, which would allow us to call directly ".C("lowess"", instead of ".vsimpleLoess" (inside "loessFit")? *************** The following is an example. (The data sets are at: http://bioinfo.cnio.es/~rdiaz/00G66.txt) temp01 <- read.maimages(files = "00G66.txt", source = "genepix", wt.fun=wtflags(0)) temp01.b <- temp01 set.to.na <- which(temp01.b$weights == 0) temp01.b$R[set.to.na] <- NA temp01.b$G[set.to.na] <- NA temp01.c <- temp01.b temp01.c$weights <- NULL n1c <- normalizeWithinArrays(temp01.c, layout = temp.layout, iterations = 5, method = "loess") n1b <- normalizeWithinArrays(temp01.b, layout = temp.layout, iterations = 5, method = "loess") n1 <- normalizeWithinArrays(temp01, layout = temp.layout, iterations = 5, method = "loess") pairs(cbind(n1$M, n1b$M, n1c$M)) summary(n1$M - n1b$M) ****************** Best, R. -- Ram?n D?az-Uriarte Bioinformatics Unit Centro Nacional de Investigaciones Oncol?gicas (CNIO) (Spanish National Cancer Center) Melchor Fern?ndez Almagro, 3 28029 Madrid (Spain) Fax: +-34-91-224-6972 Phone: +-34-91-224-6900 http://bioinfo.cnio.es/~rdiaz PGP KeyID: 0xE89B3462 (http://bioinfo.cnio.es/~rdiaz/0xE89B3462.asc)

Normalization Cancer limma Normalization Cancer limma • 940 views

ADD COMMENT • link updated 20.3 years ago by Gordon Smyth 50k • written 20.3 years ago by Ramon Diaz ★ 1.1k

score 0 · Answer 1 · 2004-01-23

Dear Ramon, Thanks for your constructive suggestions. I leave today for holiday and so won't be able to reply substantially for a while. Quick response to 2. below. > Dear Gordon and other bioconductor-users, > > We have encountered a few recurrent problem when using limma. If I may, > I'd like to make four suggestions for changes which might benefit other > people: > > > 1. When "backgroundCorrect(temp1, method = "none")" is used columns for > Rb and Gb are eliminated from the RGList object. We have found this > problematic, because, even if one does not want to do background > subtraction, one might still want to look at image plots of background, > just to check nothing funny has happened with the background. The > problem is that the subsequent step of using "normalizeWithinArrays" > calls MA.RG, which itself uses the presence or absence of the Rb and Gb > to do (or not) the subtraction. Then, if we still want to plot > background and do the normalization, we either have to make sure we are > done with all background business before we do the normalization (since > after that all background info will be lost) or we need to keep the > normalized and the unnormalized objects around. None of these seem > ideal. We wonder if it would be possible to pass the type of > subtraction to "normalizeWithinArrays" and to "MA.RG" as a parameter, > so that the original object preservers all the information, but still > we can do no background subtraction. (We are hacking the code here, for > our internal use). > > > 2. We have found it slightly confussing that the M value of a point with > weight 0 for the normalization is not an NA. Sure, the points with > weight = 0 do not get used to fit the loess curve, but they still have > a residual. But, so far, with the wet lab people we have worked, it has > always been the case that if a point was deemed unsuitable for the > normalization it was also unsuitable for further analyses (though, > occasionally, points suitable for fitting the normalization curve have > been deemed unsuitable for further analyses). I think the current behavior is correct. I often give spike-in ratio controls zero weight in the normalization step, but they're definitely not missing! There are many reasons why zero weights are not the same as missing. > Are we missing something obvious here? > > 3. Occasionally, a complet print-tip group might be missing which will > make "normalizeWithinArrays" to fail. We have added the following to > that function, in the print-tip loess case, right before the line > "spots <- spots + nspots" (and substituting the direct call to > loessFit): > > ****************************************** > num.fill <- length(object$M[spots, j]) > object$M[spots, j] <- rep(NA, num.fill) > no.objects <- > try(object$M[spots, j] > <- loessFit(y, x, w, span = span, > iterations = iterations)$residuals) > if(class(no.objects) == "try-error") > print(paste("WARNING: in print-tip loess normalization", > "no samples in array", j, "grid row", gridr, > "grid column", gridc)) > ******************************************** > > > 4. When all weights are either 0 or 1, there are slight differences in > the results of the normalization (of points with weight = 1) depending > on whether or not the cases with weight of 0 are passed to subsequent > calls. Sure, these differences are irrelevant, but it can be > disconcerting. Wouldn't it be appropriate to check for the binary > situation of weights either 0 or 1, and if so, exclude points with > weight 0, which would allow us to call directly ".C("lowess"", instead > of ".vsimpleLoess" (inside "loessFit")? The problem isn't so much with limma, as with the different interpolation methods used by lowess and loess. The change that you suggest would make 0, 1 identical to no weights which is nice, but would then make weight 1e-6 discontinuous from weight 0 which is undesirable. So there is no perfect solution. Gordon > *************** > The following is an example. > (The data sets are at: > http://bioinfo.cnio.es/~rdiaz/00G66.txt) > > > temp01 <- read.maimages(files = "00G66.txt", > source = "genepix", > wt.fun=wtflags(0)) > > temp01.b <- temp01 > set.to.na <- which(temp01.b$weights == 0) > temp01.b$R[set.to.na] <- NA > temp01.b$G[set.to.na] <- NA > > temp01.c <- temp01.b > temp01.c$weights <- NULL > > n1c <- normalizeWithinArrays(temp01.c, layout = temp.layout, > iterations = 5, > method = "loess") > n1b <- normalizeWithinArrays(temp01.b, layout = temp.layout, > iterations = 5, > method = "loess") > n1 <- normalizeWithinArrays(temp01, layout = temp.layout, > iterations = 5, > method = "loess") > > pairs(cbind(n1$M, n1b$M, n1c$M)) > summary(n1$M - n1b$M) > > > ****************** > > Best, > > R. > > -- > Ram?n D?az-Uriarte > Bioinformatics Unit > Centro Nacional de Investigaciones Oncol?gicas (CNIO) > (Spanish National Cancer Center) > Melchor Fern?ndez Almagro, 3 > 28029 Madrid (Spain) > Fax: +-34-91-224-6972 > Phone: +-34-91-224-6900 > > http://bioinfo.cnio.es/~rdiaz > PGP KeyID: 0xE89B3462 > (http://bioinfo.cnio.es/~rdiaz/0xE89B3462.asc)