Question

Limma: background correction. Use or ignore?


Gordon Smyth 50k

@gordon-smyth


WEHI, Melbourne, Australia

Dear Jose,

For some brief but relevant comments, see Section 6.1: Background Correction in the Limma User's Guide, and Section 3 of http://www.statsci.org/smyth/pubs/mareview.pdf

Whether background subtraction is a good idea depends entirely on the background estimation used. You do not mention what image analysis program you used or which background estimation method was chosen, but everything depends on this.

Firstly, can you get away with ignoring the background entirely? I agree with Jim and Naomi's general remarks, and I agree with Jim that not background correcting can lead to cleaner results for some data sets, especially for good quality arrays with low background. The UCSF microarray center has made the same argument for their own arrays.

But in my lab, we always background correct. There are a lot of reasons for this. For one thing, foreground-background plots almost always show that background correcting does remove some systematic bias. The most critical reason though is to achieve comparability between experimental conditions. Not background correcting is a lot like adding an offset to your data (see the backgroundCorrect function in limma), and the size of the offset depends on the level of the background. In my lab we see data from lots of different labs, platforms, image analysis programs, species etc, and the background levels can vary wildly. For example, I analysed one important experiment when the scanner changed from Axon to Agilent halfway through, and the overall background levels increased 10-fold. I prefer to background correct and to add the offset explicitly, rather than to allow it to vary with the data in an uncontrolled way. Had I not background corrected the Axon-Agilent experiment, the results would have been far more damped in the second half of the experiment and not comparable to the first.

But background correcting doesn't mean that we simply *subtract* the background. We subtract if we have

1. morph background from SPOT
2. morphological opening background from GenePix 6, or
3. background from AgilentFE

and in no other cases. In most other cases, subtracting is so bad that you would indeed probably be better off ignoring the background entirely.

In most other cases we currently use 'normexp' background correction with an offset. This is an adaptive background method which is a modification of the background correction method used by the RMA algorithm for affy data. It is adaptive in that it adapts to the overall level of background on each array. It avoids the negative intensities which so often arise from naive background subtraction.

The Bioconductor book has an example of a data set which was analysed both with no background correction (Chapter 4) and with normexp background correction (Chapter 23).

Best wishes
Gordon

>Message: 23
>Date: Fri, 31 Mar 2006 16:01:24 -0500
>From: Naomi Altman <naomi at stat.psu.edu>
>Subject: Re: [BioC] Limma: background correction. Use or ignore?
>To: "James W. MacDonald" <jmacdon at med.umich.edu>,
>        J.delasHeras at ed.ac.uk
>Cc: Bioconductor Newsgroup <bioconductor at stat.math.ethz.ch>
>
>I have investigated this (somewhat) experimentally. Background
>correction increases the variability of low-expression genes and
>reduces it for high expression. This corresponds to the RMA noise
>model since background correction would double the additive variance
>but not affect the multiplicative variance (which is the dominant
>source of variance for highly expressing genes.)
>
>--Naomi
>
>
>At 12:43 PM 3/31/2006, James W. MacDonald wrote:
> >Hi Jose,
> >
> >J.delasHeras at ed.ac.uk wrote:
> > > I have been using LimmaGUI for a while to analyse my cDNA microarrays.
> > > I have always used "subtract" as a method for background correction.
> > > Why? Not sure. Intuitively it made sense, and I didn't observe any
> > > obvious problems.
> > > Once I played with the different methods for background correction
> > > available in LimmaGUI, and when looking at the MA plots I decided I
> > > preferred to subtract.
> > >
> > > However, I have recently had problems with the statistics being quite
> > > poor in my analyses (see my post a week ago or so about low B
> > > values)... and whilst checking the data, I noticed that at least in my
> > > current experiments, if I do no background correction at all the stats
> > > look a lot better, the MA plots look better, and everything looks
> > > better in general. The actual list of genes doesn't change a lot, but
> > > the values seem a lot tighter.
> > >
> > > This makes me question whether we should background correct at all. My
> > > slides are pretty clean, low background. Am I not adding more noise to
> > > the data by removing background?
> >
> >I have never been a big fan of subtracting background, especially if the
> >background of the slide is low and relatively consistent. I have two
> >main reasons for this.
> >
> >First, the portion of the slide used to estimate background doesn't have
> >any cDNA bound, so you are estimating the background binding of the spot
> >by using a portion of the slide that might not be very similar. When we
> >were doing more spotted arrays, we would always spot unrelated cDNA on
> >the slides as well (e.g., A.thaliana and salmon sperm DNA). These spots
> >almost always had a negative intensity if you subtracted the local
> >background, which indicates to me that cDNA does a better job of
> >blocking the slide than BSA or other blocking agents.
> >
> >Second, you *are* adding more noise to the data. When you subtract, the
> >variances are additive. However, if you don't subtract then you take the
> >chance that you are biasing your expression values, especially if the
> >background from chip to chip isn't relatively consistent. So the
> >tradeoff is higher variance vs possible bias. If the background was
> >consistent I usually took a chance on the bias in order to reduce the
> >variance. As you note, the data usually look 'cleaner' if you don't
> >adjust the background.
> >
> >Note that these points are directed towards simple subtraction of a
> >local background estimate. Other more sophisticated methods may help
> >address these shortcomings.
> >
> >As for references, have you looked at the references that Gordon gives
> >on the man page for backgroundCorrect()? That would probably be a good
> >place to start.
> >
> >Best,
> >
> >Jim
> >
> > > Can anybody point me to a good reference to learn about the effects of
> > > background correction, pros and cons? I'm just a molecular biologist,
> > > not a statistician, but I need to understand a bit better these issues
> > > or there'll be no molecular biology to work on from my experiments!
> > >
> > > Jose
> >
> >--
> >James W. MacDonald, M.S.
> >Biostatistician
> >Affymetrix and cDNA Microarray Core
> >University of Michigan Cancer Center
> >1500 E. Medical Center Drive
> >7410 CCGC
> >Ann Arbor MI 48109
> >734-647-5623
>
>Naomi S. Altman                                814-865-3791 (voice)
>Associate Professor
>Dept. of Statistics                              814-863-7114 (fax)
>Penn State University                         814-865-1348 (Statistics)
>University Park, PA 16802-2111
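
The remark in the message above, that leaving the background in behaves like adding an offset whose size depends on the background level, can be checked with a little arithmetic. The Python sketch below is illustrative only: the spot intensities and background levels are made-up numbers, not values from the Axon-Agilent experiment, and `log_ratio_with_background` is a hypothetical helper, not a limma function.

```python
import math

def log_ratio_with_background(red, green, background):
    """log2 ratio of the foreground intensities when an unsubtracted
    background of the given size sits on top of both channels."""
    return math.log2((red + background) / (green + background))

# Hypothetical true signals: a 4-fold change, i.e. a log2 ratio of 2.
red_signal, green_signal = 1000.0, 250.0

true_m = log_ratio_with_background(red_signal, green_signal, 0.0)
low_bg_m = log_ratio_with_background(red_signal, green_signal, 100.0)
# Same spot after the overall background rises 10-fold.
high_bg_m = log_ratio_with_background(red_signal, green_signal, 1000.0)

for label, m in [("no background", true_m),
                 ("background 100", low_bg_m),
                 ("background 1000", high_bg_m)]:
    print(f"{label:>16}: log2 ratio = {m:.2f}")
```

With these assumed numbers the same 4-fold change reads as a log2 ratio of 2.00 with no background, about 1.65 with a background of 100, and about 0.68 once the background is ten times higher. That is the kind of damping that would make the two halves of such an experiment hard to compare if the background were simply ignored.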

Microarray Cancer affy limma limmaGUI

updated 18.1 years ago by J.delasHeras@ed.ac.uk • written 18.1 years ago by Gordon Smyth

score 0 · Answer 1 · 2006-04-04

Dear Gordon,

many thanks for your helpful reply and this link:

> For some brief but relevant comments see Section 6.1: Background
> Correction in the Limma User's Guide, and Section 3 of
> http://www.statsci.org/smyth/pubs/mareview.pdf

> Whether background subtraction is a good idea depends entirely on the
> background estimation used. You do not mention what image analysis
> program you used or which background estimation method was chosen,
> but everything depends on this.

It varied. Initially I used TIGR Spotfinder (SF), using its Otsu algorithm (of which I don't know much, but I understand it's a variation of the histogram method that takes into consideration the physical distribution of the pixels, so that only pixels that group into something resembling a spot are considered: proximity to one another, proximity to the centre of the grid, and also estimates of spot sizes given by the user). I used to simply subtract the background estimated by SF, which also turns negative values into zero. This always gave me decent results, although now I realise that not doing background subtraction may improve them.

More recently, coinciding with the use of a different set of arrays, I started using a different scanner (Axon 4200AL; prior to that I used an ArraywoRx one) and GenePix 6.0. I used its default background estimation (local background median, an area 3 feature diameters wide, excluding a 2-pixel-wide area surrounding the spots). The spots are located using the "irregular features" option.

Here's where the trouble started, when I subtracted in Limma the background estimated by GenePix. Simple subtraction. It took me a while to notice there was a problem because I was using different arrays, and the experiments were also different and more prone to variation. But when I re-did some simple experiments whose results I could expect with reasonable certainty, it showed me very poor B values... and my worries started!

I just did not expect that the background correction would have such a profound effect. When I compared quantitation made by SF and GenePix, I only looked at the raw intensities, not the background... the intensities looked reasonably comparable. Right now I am re-quantitating some images using GenePix and Spotfinder to compare how they differ in estimating foreground and background, but I still don't have that data.

> Firstly, can you get away with ignoring the background entirely? I
> agree with Jim and Naomi's general remarks, and I agree with Jim that
> not background correcting can lead to cleaner results for some data
> sets, especially for good quality arrays with low background. The

I think this is where I am leaning at the moment. My slides are quite clean, and the background images as displayed by Limma show the background is quite uniform too. It certainly seems like the simplest option. Although you point at some reasons not to do that...

> UCSF microarray center has made the same argument for their own
> arrays. But in my lab, we always background correct. There are a lot
> of reasons for this. For one thing, foreground-background plots
> almost always show that background correcting does remove some
> systematic bias.

What do you mean exactly by this? What's the source of this bias? Dye-specific bias?

> The most critical reason though is to achieve
> comparability between experimental conditions. Not background
> correcting is a lot like adding an offset to your data (see the
> backgroundCorrect function in limma), and the size of the offset
> depends on the level of the background. In my lab we see data from
> lots of different labs, platforms, image analysis programs, species
> etc, and the background levels can vary wildly. For example, I
> analysed one important experiment when the scanner changed from Axon
> to Agilent halfway through, and the overall background levels
> increased 10-fold. I prefer to background correct and to add the
> offset explicitly, rather than to allow it to vary with the data in
> an uncontrolled way. Had I not background corrected the Axon-Agilent
> experiment, the results would have been far more damped in the second
> half of the experiment and not comparable to the first.

I understand how in a case like this it would be very important to account for the variability of background measurements. Usually my experiments are performed in a relatively short period of time, so the scanner/analysis program is the same for each individual experiment. I do observe that the background levels vary between slides, but usually not a lot (although, what is "a lot"?)... which is the reason I am thinking I will probably end up not background correcting. But since it seems to have such a marked effect, I don't want to make a decision like this without investigating a little more.

> But background correcting doesn't mean that we simply *subtract* the
> background. We subtract if we have
> 1. morph background from SPOT
> 2. morphological opening background from GenePix 6, or
> 3. background from AgilentFE
> and in no other cases. In most other cases, subtracting is so bad
> that you would indeed probably be better off ignoring the background
> entirely.

In what I am testing now, I included background measurements using the morphological option in GenePix, as well as the default local background, and using negative spots as controls, for comparison. I haven't analysed the data yet.

> In most other cases we currently use 'normexp' background correction
> with an offset. This is an adaptive background method which is a
> modification of the background correction method used by the RMA
> algorithm for affy data. It is adaptive in that it adapts to the
> overall level of background on each array. It avoids the negative
> intensities which so often arise from naive background subtraction.

I imagined you'd be a fan of this method ;-)
I must say I haven't tried this one yet, but I will. My "problem" with trying this method was how to determine the offset. I am guessing I just have to try different values and check the effect...

> The Bioconductor book has an example of a data set which was analysed
> both with no background correction (Chapter 4) and with normexp
> background correction (Chapter 23).

I'll be sure to read that.

Many thanks, Gordon. Very helpful.

Jose

--
Dr. Jose I. de las Heras                      Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology    Phone: +44 (0)131 6513374
Institute for Cell & Molecular Biology        Fax:   +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK
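
The idea of trying different offset values and checking the effect can be rehearsed on simulated data before touching real arrays. The Python sketch below is a toy model under stated assumptions, not limma's normexp method: it uses plain subtraction followed by an added offset, and the background level, noise level, spot signals and the helper names (`measure`, `null_log_ratios`, `damped_log_ratio`) are all hypothetical.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

BACKGROUND = 100.0  # hypothetical true background level
NOISE_SD = 20.0     # hypothetical noise on foreground and on the background estimate
N_SPOTS = 2000

def measure(signal):
    """Simulate one channel of one spot and return foreground minus
    the (noisy) local background estimate, i.e. naive subtraction."""
    foreground = signal + BACKGROUND + random.gauss(0.0, NOISE_SD)
    background_estimate = BACKGROUND + random.gauss(0.0, NOISE_SD)
    return foreground - background_estimate

# Weakly expressed, non-differential spots: the true log2 ratio is 0.
spots = [(measure(30.0), measure(30.0)) for _ in range(N_SPOTS)]

def null_log_ratios(offset):
    """log2 ratios after adding the offset to both channels; spots with
    a non-positive channel cannot be logged and are dropped."""
    return [math.log2((r + offset) / (g + offset))
            for r, g in spots if r + offset > 0 and g + offset > 0]

def damped_log_ratio(offset, red=400.0, green=100.0):
    """Noise-free log2 ratio of a true 4-fold change after adding the offset."""
    return math.log2((red + offset) / (green + offset))

results = {}
for offset in (0.0, 50.0, 200.0):
    m = null_log_ratios(offset)
    results[offset] = (N_SPOTS - len(m), statistics.stdev(m), damped_log_ratio(offset))
    lost, spread, fold = results[offset]
    print(f"offset {offset:5.0f}: lost spots = {lost:4d}, "
          f"SD of null log-ratios = {spread:.2f}, 4-fold change reads as {fold:.2f}")
```

In this toy setup a larger offset loses fewer weak spots to non-positive values and tightens the log-ratios of unchanged genes, but it also shrinks real differences: the noise-free 4-fold change reads as a log2 ratio of 2.00, about 1.58 and 1.00 at offsets of 0, 50 and 200. Checking the effect of an offset on real data therefore means looking at both sides of that trade-off, for example on MA plots.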