Question: Harsh results using limma!
15.0 years ago by
michael watson IAH-C3.4k wrote:
Hi

Firstly, I think limma is excellent and use it a lot, but some recent results are a bit, erm, disappointing and I wondered if someone could explain them.

Basic set up was a double dye-swap experiment (4 arrays) involving different animals, one infected with one type of bacterium and the other a different bacterium, compared to one another directly. I used limma to analyse this and got a list of genes differentially regulated - great!

THEN another replicate experiment was performed (so now I have 6 arrays, 3 dye-swaps), and I re-did the analysis and my set of genes was completely different - but that's fine, we can put that down to biological variation. We know limma likes genes which show consistent results across arrays, and when I looked at my data, I found that the genes in my original list were not consistent across all six arrays. So I am reasonably happy about this.

My question comes from looking at the top gene from my old list in the context of all six arrays. Here are the normalised log ratios across all six arrays (ds indicates the dye-swap):

Gene1
Exp1    -5.27
Exp1ds   6.29
Exp2    -4.61
Exp2ds   5.54
Exp3    -0.2
Exp3ds   0.2

Not surprisingly, limma put this as the top gene when looking at the first four arrays. However, when looking across all six arrays, limma places it at 230 in the list with a p-value of 0.11 (previously the p-value was 0.0004).

So finally we get to my point/question - does this gene really "deserve" a p-value of 0.11 (ie not significant)? In every case the dye-flips are the correct way round, it is only the magnitude of the log(ratio) which differs - and as we are talking about BIOLOGICAL variation here, don't we expect the magnitude to change? If we are taking into account biological variation, surely we can't realistically expect consistent ratios across all replicate experiments?? Isn't limma being a little harsh here? After all the average log ratio is -3.7 (taking into account the dye-flips) - and to me, experiment 3's results still support the idea of the gene being differentially expressed, and are even consistent within that biological replicate.

Clearly I am looking at this data from a biologist's point of view and not a statistician's. But we are studying biology, not statistics, and I can't help feeling I am missing out on something important here if I disregard this gene as not significantly differentially expressed (NB this is just the first example, there are many others).

I should also add that there appears to be nothing strange about the arrays for Experiment 3 - the distribution of log(ratio) for those arrays is pretty much the same as the other four, so this is not an array effect, it is an effect due to natural biological variation.

Comments, questions, criticisms all welcome :-)

Mick
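The average of -3.7 quoted in the post can be checked by hand. Below is a minimal sketch in Python (used purely for illustration; it is not limma code), assuming that a dye-swap array reports the same comparison with the sign reversed, so its log-ratio is negated before averaging. The variable names are hypothetical.

```python
from statistics import mean

# Normalised log-ratios for Gene1 as posted; "ds" marks the dye-swap arrays.
log_ratios = {
    "Exp1": -5.27, "Exp1ds": 6.29,
    "Exp2": -4.61, "Exp2ds": 5.54,
    "Exp3": -0.2, "Exp3ds": 0.2,
}

# Assumption: a dye-swap reverses the sign of the log-ratio, so negate those
# values to put every array on the same orientation.
oriented = {
    name: (-value if name.endswith("ds") else value)
    for name, value in log_ratios.items()
}

# Average over all six arrays, and over the first four (Exp1 and Exp2 only).
average_all_six = mean(oriented.values())
average_first_four = mean(
    value for name, value in oriented.items() if not name.startswith("Exp3")
)

print(f"average over all six arrays:    {average_all_six:.2f}")
print(f"average over the first four:    {average_first_four:.2f}")
```

The six-array average rounds to the -3.7 given above; adding Exp3 pulls the average towards zero but, as the replies discuss, it is the increase in spread rather than the change in mean that drives the p-value.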
limma • 670 views
modified 14.9 years ago by A.J. Rossini810 • written 15.0 years ago by michael watson IAH-C3.4k
15.0 years ago by
Gordon Smyth38k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth38k wrote:
At 09:14 PM 13/08/2004, michael watson (IAH-C) wrote:
> My question comes from looking at the top gene from my old list in the
> context of all six arrays. Here are the normalised log ratios across
> all six arrays (ds indicates the dye-swap):
>
> Gene1
> Exp1    -5.27
> Exp1ds   6.29
> Exp2    -4.61
> Exp2ds   5.54
> Exp3    -0.2
> Exp3ds   0.2

Changes of +-0.2 are tiny and look like pure noise. So, you can have a gene for which only 2/3 of your mice show a difference. Statistical methods based on means and standard deviations will always judge this situation harshly. If you try an ordinary t-test rather than the limma method, you'll find that this gene would be judged much more harshly again.

Gordon
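To see how much the third animal costs a statistic built from a mean and a standard deviation, here is a minimal sketch of the ordinary one-sample t-statistic on the posted values. It is written in Python for illustration and does not reproduce limma's moderated statistic. It assumes the dye-swap values have been negated so that all arrays share one orientation, and the per-animal averaging in the last case is one possible way of treating the dye-swap pairs as technical replicates, not something stated in the thread.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values):
    """Ordinary one-sample t-statistic for H0: mean = 0, plus its degrees of freedom."""
    n = len(values)
    return mean(values) / (stdev(values) / sqrt(n)), n - 1


# Sign-corrected log-ratios (dye-swap values negated; an assumption about orientation).
first_four = [-5.27, -6.29, -4.61, -5.54]
all_six = first_four + [-0.2, -0.2]

# One averaged value per animal, treating each dye-swap pair as technical replicates.
per_animal = [mean(all_six[i:i + 2]) for i in range(0, len(all_six), 2)]

t_four, df_four = one_sample_t(first_four)
t_six, df_six = one_sample_t(all_six)
t_animal, df_animal = one_sample_t(per_animal)

print(f"first four arrays: t = {t_four:.2f} on {df_four} df")
print(f"all six arrays:    t = {t_six:.2f} on {df_six} df")
print(f"three animals:     t = {t_animal:.2f} on {df_animal} df")
```

The statistic shrinks sharply once Exp3 is included, because the standard deviation grows much faster than the mean falls, and it shrinks again when only the three animals count as independent observations.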
15.0 years ago by
michael watson IAH-C3.4k wrote:
Hi Gordon

Yes you're right. I didn't really mean to compare limma to a t-test. It's just that the results are very consistent within technical replicates (the dye-swaps), just not consistent between biological replicates. But this is the situation we expect - technical replicates highly correlated and biological replicates much less so.

Clearly differences of 0.2 could be noise, but my dye-swaps BOTH came up with 0.2. If I had ten replicate dye-swaps, all with 0.2 as the log(ratio), would we still call this noise? Given that the other replicate experiments were also highly reproducible, I can't help but think this gene is differentially expressed. I know why limma and the t-test disregard this gene, I just still think it is a little harsh and that I am "throwing the baby away with the bathwater", as it were.

Mick
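The contrast between technical and biological replication described here can be put in numbers. A minimal sketch in Python, for illustration only, assuming the dye-swap values are negated to a common orientation: it compares the spread within each animal's dye-swap pair with the spread of the per-animal averages. The grouping and variable names are hypothetical.

```python
from statistics import mean, variance

# Sign-corrected log-ratios grouped by animal (dye-swap values negated).
animals = {
    "Exp1": [-5.27, -6.29],
    "Exp2": [-4.61, -5.54],
    "Exp3": [-0.2, -0.2],
}

# Within-animal (technical) variance: variance of each dye-swap pair, averaged.
within_animal_variance = mean(variance(pair) for pair in animals.values())

# Between-animal (biological) spread: variance of the per-animal averages.
animal_means = [mean(pair) for pair in animals.values()]
between_animal_variance = variance(animal_means)

print(f"within-animal variance:  {within_animal_variance:.3f}")
print(f"between-animal variance: {between_animal_variance:.3f}")
```

The between-animal figure is far larger than the within-animal one, which matches the description above: the dye-swaps agree closely, and almost all of the variability comes from the animals. A test that asks about the population of animals has only three such values to work with.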
15.0 years ago by
A.J. Rossini810
A.J. Rossini810 wrote:
> I think Mick's experiences point out a fundamental problem with current statistical analysis of
> microarray data. If his data was .2, .2, .2, (dye flips) -.2, -.2, -.2 then Limma would note
> this gene as highly differentially expressed. In contrast when he sees 6.29, 5.54, 0.2, (dye
> flips) -5.27, -4.61, -0.2 Limma did not mark it as differentially expressed.

Actually it is not true that limma will necessarily rank the first gene higher than the second. Obviously t-tests would do so, but limma may well rank the second gene higher depending on the information about variability inferred from the whole data set. Looking at fold change alone ranks the second gene higher while t-tests would rank the first higher. Limma is somewhere in between depending on the dataset. A typical microarray dataset actually would lead to the second gene being ranked higher, i.e., would lead to the ranking that you would prefer.

> As a biologist I would argue the case for the genes actually being differentially expressed
> is much higher in the second case. Yet using modified T-statistic approaches and with the
> limited number of repeats common with current array experiments, I see array experiments
> "missing" these very interesting high variance genes all the time.
> Current analytical techniques put a high premium on consistency of results and a lower premium
> on strength of differential expression which is the parameter that biologists would argue is
> the most significant.
> There are a variety of biological reasons why high variance genes should exist and personally
> I think these genes are likely to be the biologically interesting ones that we should be
> looking for on microarrays.
> I understand why Limma does what it does and it is a fantastically useful program.
> However, I would suggest to the statisticians reading this message that it would be very
> useful to start developing analytical techniques which could better detect high variance
> genes.

I agree with the overall point. Two strategies currently available are:

1. Use spot quality weights. In the example given above it appears that two of the arrays or spots have failed to register any worthwhile fold change for a gene which is differentially expressed on the other arrays. If this can be identified as being due to low quality spots or arrays, then the values may be down-weighted in an analysis and the gene will revert to being highly significant.

2. If small fold changes are not of biological interest to you, then you can require a minimum magnitude for the fold change as well as looking for evidence of differential expression.

Gordon

> David Pritchard
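The remark that limma sits somewhere between fold change and the ordinary t-test can be illustrated with a moderated t-statistic, in which each gene's sample variance s^2 (on d degrees of freedom) is shrunk towards a prior value s0^2 (on d0 degrees of freedom): s_post^2 = (d0*s0^2 + d*s^2) / (d0 + d). The sketch below is plain Python, not limma code. The prior values d0 = 4 and s0^2 = 0.5 are invented for illustration (limma estimates them from the whole data set), and the minimum log-ratio cutoff of 1 used for the second strategy is likewise hypothetical.

```python
from math import inf, sqrt
from statistics import mean, variance

# Sign-corrected log-ratios (dye-flip values negated) for the two genes discussed above.
gene_small = [0.2, 0.2, 0.2, 0.2, 0.2, 0.2]
gene_large = [6.29, 5.54, 0.2, 5.27, 4.61, 0.2]

PRIOR_DF = 4.0            # hypothetical d0; limma estimates this from all genes
PRIOR_VAR = 0.5           # hypothetical s0^2; limma estimates this from all genes
MIN_ABS_LOG_RATIO = 1.0   # hypothetical minimum fold-change requirement


def ordinary_t(values):
    """Ordinary one-sample t-statistic; infinite when the sample variance is zero."""
    n = len(values)
    s2 = variance(values)
    if s2 == 0:
        return inf
    return mean(values) / sqrt(s2 / n)


def moderated_t(values, prior_df=PRIOR_DF, prior_var=PRIOR_VAR):
    """t-statistic using the sample variance shrunk towards the prior variance."""
    n = len(values)
    df = n - 1
    shrunk_var = (prior_df * prior_var + df * variance(values)) / (prior_df + df)
    return mean(values) / sqrt(shrunk_var / n)


for name, values in (("consistent +-0.2 gene", gene_small),
                     ("high-variance gene", gene_large)):
    passes_cutoff = abs(mean(values)) >= MIN_ABS_LOG_RATIO
    print(f"{name}: ordinary t = {ordinary_t(values):.2f}, "
          f"moderated t = {moderated_t(values):.2f}, "
          f"passes fold-change cutoff = {passes_cutoff}")
```

With these invented prior values the ordinary t-statistic favours the perfectly consistent +-0.2 gene, while the moderated statistic favours the high-variance gene, and only the latter passes the fold-change requirement. A different prior would change the numbers, which is the sense in which the ranking depends on the data set.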
14.9 years ago by
michael watson IAH-C3.4k wrote:
14.9 years ago by
A.J. Rossini810
A.J. Rossini810 wrote:
Argh. You can't really draw conclusions (even discovery) with biological variation from three animals without bringing in extra information.

Suppose you don't have extra information. Think about the possibilities for the next 3 animals. Under a conservative assumption that half the population is differentially expressed, the reasonable options are:

1. The next 3 show non-differential results (probability = 1/8th, which is not unreasonable!). So, you've got 1/3 of the population responding (possibly dropping lower...).
2. 2 are non-differential (probability 3/8ths).
3. 2 are differential (probability 3/8ths).
4. 3 are differential (probability 1/8), and you are happier (except for "wasting" $$...).

So unfortunately, your claim of "highly repeatable" sounds more like "wishful thinking", if you look at the possibilities.

(Now, perhaps you are bringing in more biological insight into the problem, and it's not following a discovery paradigm, i.e. the insight is a-priori -- then a Bayesian decision making procedure might be reasonable to look at the strength of evidence; but in this case, you might just make a decision heavily weighting biology data rather than expression data.) That is, use the data to generate hypotheses, and "confirm" using annotation and metadata. I've always found this approach suspect, but it tends to occur in practice and be "believable" to some groups.

best, -tony

"michael watson (IAH-C)" <michael.watson@bbsrc.ac.uk> writes:
> Hi Guys
>
> Well this turned into a very interesting discussion, thank you for your
> inputs. All of the explanations lead to a single conclusion, and that
> is that I (we?) need to find significant differences which are present
> in only subsets of the data.
>
> Let me explain - here I had samples from three animals. Two animals
> showed what looks like highly-repeatable differential expression, and
> the third did not. If we make the assumption that this is down to
> biological variation (ie two of my animals showed an immune response,
> the third did not, simply because they are different animals), then
> standard statistical tests are missing an effect which is present in two
> thirds of my population. If you ask me "are you interested in finding
> effects which are present in only two thirds of your population?" then
> the answer is of course I am!
>
> Over the last 5 years the whole issue of pharmacogenomics became huge,
> the right drug for the right patient etc, and I know I am speculating
> wildly here, but perhaps what my data is showing me is exactly that -
> that two-thirds of my population show a particular immune response but
> the other third does not. And that's very interesting ;-)
>
> Now, to the non-statistician, the "bull in a china shop" approach to
> solving this would appear to be to take all possible subsets of my data
> and run limma on them, to find significant changes in subsets of my
> data. Clearly this becomes problematic for large datasets. Presumably
> there are many more intelligent ways....?
>
> Thanks again
>
> Mick

--
Anthony Rossini    Research Associate Professor
rossini@u.washington.edu    http://www.analytics.washington.edu/
Biomedical and Health Informatics    University of Washington
Biostatistics, SCHARP/HVTN    Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC (M/W): 206-667-7025 FAX=206-667-4812 | use Email

14.9 years ago by
michael watson IAH-C3.4k wrote:
Hi Tony

I take your point, but I am no longer talking about my pitifully replicated three-animal experiment ;-)

I am extending the argument to the larger case where I might have plenty of replicated experiments and still want to find significant genes amongst sub-populations. I've had one e-mail which shows that this is possible, so I am happy.

What everyone needs to understand, though, is that on one hand we have statisticians saying we should have 100s of replicates and on the other hand a legal obligation and an ethics committee saying we have to use as few animals as possible. There's also an element of "don't shoot the messenger"; I didn't design the experiment, in fact the 1st I heard of it was when I was asked to analyse it (hands up on the list who's experienced that....).
I *know* there should be more replicates, but at the end of the day, I need to make as much use of the data I have as is possible, and any number of "Argh"'s and "you need more replicates"'s are not going to help me :-D

Thanks

Mick
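The four outcomes Tony lists for the next three animals are the terms of a binomial distribution. A minimal sketch in Python, for illustration, assuming (as in his message) that each animal independently responds with probability 1/2; the variable names are hypothetical.

```python
from fractions import Fraction
from math import comb

# Assumption from the message: half the population is differentially expressed,
# and the next three animals respond independently of one another.
response_probability = Fraction(1, 2)
next_animals = 3

# Probability that exactly k of the next three animals show differential expression.
outcome_probabilities = {
    k: comb(next_animals, k)
    * response_probability ** k
    * (1 - response_probability) ** (next_animals - k)
    for k in range(next_animals + 1)
}

for responders, probability in outcome_probabilities.items():
    print(f"{responders} of {next_animals} differential: probability {probability}")
```

The case listed as "2 are non-differential" corresponds to one responder out of three. Changing `response_probability` shows how strongly the outlook depends on the assumed response rate, which three animals cannot pin down.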
14.9 years ago by
A.J. Rossini810
A.J. Rossini810 wrote:
"michael watson (IAH-C)" <michael.watson@bbsrc.ac.uk> writes:
> I am extending the argument to the larger case where I might have plenty
> of replicated experiments and still want to find significant genes
> amongst sub-populations. I've had one e-mail which shows that this is
> possible, so I am happy.
>
> What everyone needs to understand, though, is that on one hand we have
> statisticians saying we should have 100s of replicates and on the other
> hand a legal obligation and an ethics committee saying we have to use as
> few animals as possible. There's also an element of "don't shoot the
> messenger"; I didn't design the experiment, in fact the 1st I heard of
> it was when I was asked to analyse it (hands up on the list who's
> experienced that....). I *know* there should be more replicates, but at
> the end of the day, I need to make as much use of the data I have as is
> possible, and any number of "Argh"'s and "you need more replicates"'s
> are not going to help me :-D

We run into the same problem with clinical trials, which is in a sense the framework that you are using (biological replicates). When you throw in the "cost, ethics, legal obligations, IRBs..." issues, having statistical power is more critical. And if you can't find results, one might argue that the experiment shouldn't be done (taking the harsh other view that an experiment with a highly probable inconclusive result, when ethics raise their ugly head, is not worth doing). Statistical designs are always a tradeoff.

best, -tony
