Normalising divergent samples.
@matthew-hannah-621
Robert, Wolfgang,

Thanks for the response, this is exactly what I was looking for. A lot to discuss, so I'll add some comments/questions.

First, minor points - I realise (GC)RMA are exprs estimates and that the discussion applies to all normalisations; I was just referring to them as a common/convenient reference. My comment about people being unaware of, or ignoring, the issue applied to the 90% of users who use something like MAS or RMA and look for changes without exploring the data further; it was not aimed at BioC users or the wider bioinformatics community. Guessing that work is being done on this, my posts were partly an attempt to learn more (publicly or privately) about what is in the pipeline.

The Holstege paper is interesting, but their approach has one obvious (but unmentioned?) drawback. They quantify based on total RNA, and the basic gist seems to be: if you have, say, a 2-fold increase in mRNA, then normalising at the gene level will under-estimate the number of increases, whereas with (total RNA) spike-ins you get a different (true?) picture. I'm not the most knowledgeable in biochemistry, but with mRNA at c. 2% of total RNA, a 2% change in rRNA and tRNA levels seems more likely than a 100% change in mRNA. Why not just measure the mRNA/total RNA ratio in the original sample (I assume there is a method?) - you could even do this afterwards.

This leads on to the question of what you want to standardise to. Total RNA is discussed above, and has the problem that rRNA/tRNA could vary with growth rate or treatment (but maybe I'm wrong). Per cell is obviously good (also used by Holstege) if possible, but I guess that's fairly impracticable for those of us working with tissue, and calculations based on weight have problems with changes in cell volume/water content - and there are cheaper ways to measure water content than with a microarray ;-) I suppose that, at present, northerns have used total RNA while RT- and Q-PCR have standardised to housekeeping genes. Perhaps I'm being naive, but could standardising to total mRNA be as valid as the alternatives (assuming you can't standardise per cell)? It obviously has the limitations we're discussing, but doesn't have the (presumed) troubles discussed above. Obviously, if you get a large number of changes biased in one direction, then there would appear to be a corresponding group of changes in the opposite direction. Anyway, I think I might be getting lost in my thoughts there... One final clarification - with affy chips followed by 'standard' normalisation, are the exprs values on a scale proportional to (cRNA/)mRNA abundance??

Now to the points about big differences. I have multiple genotypes x 3 reps, with treated and control, giving c. 25 treated versus 25 untreated arrays. I've looked at the p-values from a t-test and from limma (eBayes(fit)) as you suggested, and they are fairly flatly distributed, except for the very lowest ones, and completely uniform from around 0.4-0.5 up to 1. Above p = 0.5 I have c. 5000 genes, which would mean c. 13000 changes (c. 55%, or about the fraction of genes thought to be usually expressed!). How accurate is this estimate (I'm not a statistician)? My previous estimate was based on controlling the false positives, but I guess I should also worry about the false negatives (how do you get an accurate measure of these?).

If I look at the p-values for a 3-rep treated versus untreated comparison, they are only uniform above p = 0.75, and I get c. 6500 genes with p > 0.5, i.e. c. 10000 changes. This is lower than with more replicates (I assume because there is less statistical power). A rough sketch of this estimate follows.
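In code, the estimate discussed above might look something like this minimal R sketch - illustrative only, with a hypothetical log2 expression matrix 'exprs.mat' (genes x 50 arrays) standing in for the actual data:

library(limma)

## hypothetical 25-vs-25 design
treatment <- factor(rep(c("treated", "untreated"), each = 25))
design <- model.matrix(~ treatment)

fit <- eBayes(lmFit(exprs.mat, design))
p <- fit$p.value[, 2]     # moderated-t p-values for the treatment term

hist(p, breaks = 50)      # the right-hand end should be roughly flat

## if the histogram is uniform for p > 0.5, the number of
## non-differentially-expressed genes is estimated as twice the
## count of genes with p > 0.5
n.null    <- 2 * sum(p > 0.5)
n.changed <- nrow(exprs.mat) - n.null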
My use of the Pearson correlation was an attempt to get around differences in replication, as many other studies only have 1-3 replicates, and I assumed the treated vs. untreated Pearson correlation would indicate the approximate number and magnitude of changes without being influenced by the number of reps - is that right? If it is, then many studies have smaller Pearson correlations than my 'treatment effect'.

I guess my point about us seeing similar numbers of genes going up and down is flawed by the fact that this is after normalisation. After quantile normalisation, would you always see approximately equal numbers up and down?

Anyway, the discussion of spike-ins is really a side issue for me, as this isn't established (?) with affy chips and we already have the data. In our case we are more interested in interpreting the differences between the genotypes. However, my central interest is that, looking at the genes that change in all genotypes, we see that these change more (increased magnitude of both up AND down) in some genotypes. This makes biological sense, BUT how can we be sure it's not an artefact of normalisation???

Finally, I guess the scaling is a simple use of apply - I know how to apply functions across columns, but what function would mean- or median-scale? Do you just 'shift' the data so the means are equal, or do you scale(?) to take the range/deviation into account?

Sorry for the long reply, it's just that I've been waiting to discuss this for a while. Hopefully the length won't stop further comment from others.

Cheers,
Matt

On Wed, Sep 15, 2004 at 10:20:42AM +0200, Matthew Hannah wrote:

>> I've tried to get a discussion on this several times but have got very
>> few responses.
>>
>> I'm looking at some data where the treatment has a very BIG effect, but
>> I don't think this is unusual, it's just that a lot of people don't
>> realise it or ignore it.

Hi Matthew,

Robert Gentleman and myself just talked about the issue that you brought up, and here are some of our points:

If by big effect you mean that a lot of genes change expression under different conditions, then I think that many people are aware of the issue, but it has not been widely discussed. Frank Holstege and his group in Utrecht have worked quite systematically on this, and I would recommend looking at one of their recent papers:

Monitoring global messenger RNA changes in externally controlled microarray experiments. van de Peppel J, Kemmeren P, van Bakel H, Radonjic M, van Leenen D, Holstege FC. EMBO Rep. 2003 Apr;4(4):387-93. PMID: 12671682

You seemed to indicate that having a large fraction of genes changing is only a problem for RMA and GCRMA, but our understanding is that it is a problem for all "intrinsic" methods of normalizing, i.e. those that do not use external spike-in controls. We believe that the problem is largely to do with normalization, and not with computing expression estimates.

As long as microarray experiments are not trying to measure absolute molecule abundances, but rather just "relative expression", we think that the problem of interpreting situations in which most genes change will always remain hard.

On the other hand, if you use an "extrinsic" method, you need to decide whether you want to measure the number of molecules per cell, or per total RNA, or per what... so that's a conceptual issue that needs to be worked out. This is also known as a research *opportunity*.

>> If we take the average Pearson correlation of treated versus untreated
>> as a crude indication of the number of changes (is this valid?) then in
>> our experiments this is 0.96. Comparing 25 treated versus 25 untreated
>> replicates (GCRMA, LIMMA gene-wise fdr corrected p<0.001) we get c.30%
>> of transcripts on the chip changing!

You are too imprecise in your description for us to comment on whether the method is valid, but why don't you use good old-fashioned statistics: for each gene, calculate a p-value from e.g. a t-test or an appropriate linear-model generalization, and look at the histogram of p-values. The empirical p-values at the right end of the histogram should be approximately uniformly distributed, and the number of non-differentially-expressed genes can be estimated as 2 times the number of genes with p > 0.5.

>> Looking at a couple of public datasets I don't think our treatment
>> effect (as indicated by the Pearson) is that unusual, it's just that we
>> have the statistical power to detect the changes. Also looking at the
>> changes, and considering the biology it seems reasonable to get these
>> changes.

It depends a lot on what the factors are. Robert has some collaborators who use treatments that greatly change things, and others who use treatments so specific that the number of changed genes is under 10. No global statement can be made here. The former need to be warned that their experiment lies outside the currently available technology, and then you do the best you can. With the latter, we should be able to do a good job, although one may never find the signal if p-value corrections are applied in a naive fashion (but that is a different story).

>> In the discussions on RMA/GCRMA there are 2 assumptions discussed
>> 1) few genes changing - obviously not
>> 2) equal # up and down - despite the huge amount of changes there are
>> only 20 more transcripts going up compared to down - so yes.

As I said above, I do not believe this is specific to RMA or GCRMA, but rather a general problem for all normalization methods. Also, these are sufficient conditions, but not necessary ones. I.e. if they are fulfilled, (GC)RMA can be guaranteed to work, but if they aren't, the results from (GC)RMA, or other normalization methods, may still be valid to a sufficient degree!

>> I've also looked at a number of control genes and can't find any real
>> bias, in fact there is quite a bit of (random?) variation, so if you
>> normalised on a few of these then you may get strange results...

I think the problem with using control genes is that there are typically few of them and they do not necessarily span the range of intensities, so they provide a poor basis for normalization. Although, in the present case, they may be better than the alternatives.

>> Finally, I will at some point try separate GCRMAs and then scaling. If
>> anyone has any scripts for mean, robust mean or median scaling a series
>> of separate exprs sets then I'd appreciate it.

I hope that this is a rather simple use of lapply or similar (depending on how you have your exprSets stored); a sketch follows below.

Regards,
Robert, Wolfgang
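On the scaling question, a minimal sketch along the lines Wolfgang suggests - assuming a hypothetical list 'esets' of separately-normalised exprSets (Biobase); the object names are illustrative, not from the thread:

library(Biobase)

## shift each array (column) so its median is 0 on the log2 scale;
## swap median for mean to mean-scale instead
median.scale <- function(eset) {
    e <- exprs(eset)
    exprs(eset) <- sweep(e, 2, apply(e, 2, median))
    eset
}

esets.scaled <- lapply(esets, median.scale)

This is a pure location shift (which on the log scale corresponds to a multiplicative rescaling of the intensities); to also match spread, each centred column could additionally be divided by a robust scale estimate such as its MAD, though whether that is appropriate depends on the data.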
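And the crude Pearson summary discussed earlier might be computed roughly as follows (again only a sketch, reusing the hypothetical 'exprs.mat' and 'treatment' objects from the first snippet):

## correlate the mean treated profile with the mean untreated profile
mean.trt <- rowMeans(exprs.mat[, treatment == "treated"])
mean.ctl <- rowMeans(exprs.mat[, treatment == "untreated"])
cor(mean.trt, mean.ctl)   # Pearson is the default method

Note that averaging over more replicates reduces noise in the two mean profiles, which by itself tends to raise the correlation, so the summary is not entirely independent of the number of replicates.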
Microarray Normalization affy limma gcrma • 1.4k views
@arnemulleraventiscom-466
Hello,

I'm not sure whether my comment fits in precisely. I'm writing up a toxicology study for which gene expression was measured by three independent laboratories for a compound at different doses. The dose effect is rather small (~200 genes affected at p < 0.001), but at least half of the genes on the chip change between the laboratories. However, a gene-wise two-way ANOVA copes well with this issue, revealing the 'true' dose effect (there are hardly any interactions between the laboratory and dose factors); a sketch follows below. Note that the data were normalized: the boxplots for the normalized data look perfect, but I found that none of the tested normalization methods (Rosetta Resolver, RMA+quantile, MAS5+loess or quantile, vsn, and others) helped much in scaling the data to reduce the laboratory effect without dramatically reducing the number of dose-affected genes.

Could it be that, if so many genes change in your study, you may have a hidden (unwanted) factor?

Kind regards,
Arne
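A minimal sketch of such a gene-wise two-way ANOVA - the factors 'lab' and 'dose' (one entry per array) and the log2 matrix 'exprs.mat' are hypothetical names, not objects from the study:

## p-values for the lab, dose and lab:dose terms of a two-way ANOVA,
## fitted separately to each gene (row)
two.way.p <- function(y, lab, dose) {
    fit <- aov(y ~ lab * dose)
    summary(fit)[[1]][["Pr(>F)"]][1:3]
}

pvals <- t(apply(exprs.mat, 1, two.way.p, lab = lab, dose = dose))
colnames(pvals) <- c("lab", "dose", "lab:dose")

A near-uniform lab:dose column would support the "hardly any interactions" observation.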