16.2 years ago by
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
To follow up on my general remarks on lowess and loess, I should also explain the slight differences between loess normalization in marrayNorm and limma.
I am the one mostly to blame for the fact that marrayNorm and limma are not exactly the same for loess normalization. Jean and I co-ordinated marrayNorm and limma earlier in the year to use:
(See my other answer to this question for the meaning of these parameters.) These parameter settings are conservative choices. They result in a relatively stiff curve, with a high degree of robustness and with exact loess calculations involving no interpolation. The decision to avoid interpolation was motivated more by the desire to avoid confusing warning messages from 'loess' than by any concern that interpolation is inaccurate.
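To make the exact-versus-interpolated distinction concrete, here is a minimal sketch using base R's 'loess' on simulated data. It is not the marrayNorm or limma code itself: the simulated M and A values, the choice of degree=1 and family="symmetric", and the variable names are all assumptions made for illustration. Only span=0.4 and iterations=5 come from the discussion here.

```r
# Simulated two-colour data (hypothetical): A = average log-intensity, M = log-ratio.
set.seed(1)
n <- 1000
A <- runif(n, 6, 16)
M <- 0.5 * sin(A / 2) + rnorm(n, sd = 0.3)  # intensity-dependent dye bias plus noise
M[1:50] <- M[1:50] + 2                      # a few differentially expressed spots

# Exact fit: the local regression is evaluated directly at every point
# (surface = "direct"), so no interpolation is involved.
fit_exact <- loess(M ~ A, span = 0.4, degree = 1, family = "symmetric",
                   control = loess.control(surface = "direct",
                                           trace.hat = "approximate",
                                           iterations = 5))

# Interpolated fit: same settings, but the curve is interpolated from a
# k-d tree of vertices (surface = "interpolate"), which is much faster.
fit_interp <- loess(M ~ A, span = 0.4, degree = 1, family = "symmetric",
                    control = loess.control(surface = "interpolate",
                                            trace.hat = "approximate",
                                            iterations = 5))

# Largest disagreement between the two fitted curves.
max_diff <- max(abs(fitted(fit_exact) - fitted(fit_interp)))
print(max_diff)

# The normalized log-ratios are the residuals from the fitted curve.
M_norm <- residuals(fit_exact)
```

Timing the two calls with system.time() on a realistically sized array should show why the exact calculation becomes impractical for large data sets.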
As some users have noted on this mailing list, the avoidance of interpolation results in very, very slow fits for some data sets. It was much, much too slow for me anyway. So I re-introduced interpolation to limma and have implemented some warning suppression at a lower code level to avoid the confusing warning messages. limma currently uses default values:
interpolation: 'lowess'-style interpolation where possible, otherwise 'loess'-style
The default values in limma agree exactly with the earlier software SMA, i.e., with the software that was used for the original papers on loess normalization. If you want limma to produce the slightly stiffer, slightly more robust curves produced by marrayNorm, you can use
normalizeWithinArrays(RG, span=0.4, iterations=5)
The only remaining difference between limma and marrayNorm will then be the result of the interpolation used by limma.
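The following sketch compares the limma defaults with the marrayNorm-style settings on a small simulated data set. The simulated RGList is hypothetical, and because it carries no printer layout the sketch uses method="loess" with bc.method="none" rather than the default print-tip method. Treat it as an illustration of the call pattern, not as a reproduction of either package's results.

```r
library(limma)

# Hypothetical simulated data: 1000 spots on 2 arrays, no background values.
set.seed(1)
nspots <- 1000
narrays <- 2
A_sim <- matrix(runif(nspots * narrays, 6, 16), nspots, narrays)
M_sim <- 0.5 * sin(A_sim / 2) +
  matrix(rnorm(nspots * narrays, sd = 0.3), nspots, narrays)

# Convert (M, A) back to red and green intensities and wrap them in an RGList.
RG <- new("RGList", list(R = 2^(A_sim + M_sim / 2),
                         G = 2^(A_sim - M_sim / 2)))

# limma defaults for span and iterations (global loess, since there is no layout).
MA_default <- normalizeWithinArrays(RG, method = "loess", bc.method = "none")

# Slightly stiffer, slightly more robust settings, as used by marrayNorm.
MA_stiff <- normalizeWithinArrays(RG, method = "loess", bc.method = "none",
                                  span = 0.4, iterations = 5)

# How much do the normalized log-ratios differ between the two settings?
summary(as.vector(MA_default$M - MA_stiff$M))
```

With real data read in together with its printer layout, the method argument can be dropped so that the default print-tip loess is used, as in the call above.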
Which parameter settings are best? Data analysis is not such a precise science that it is possible to give categorical answers. Either span=0.3 or span=0.4 is acceptable. In general, a higher value for span is appropriate if your data doesn't show much intensity-dependence in dye-bias, and vice versa. Setting iterations=4 produces a reasonably robust fit. If you desperately need a very robust procedure, perhaps because you have a very high proportion of differentially expressed genes, then the most robust possible print-tip intensity-based normalization procedure is available from
normalizeWithinArrays(RG, method="robustspline", robust="MM")