GCRMA/RMA bimodal distribution
@matthew-hannah-621
Thanks for the replies.

I guess the whole gist of this is: when you have lots of changes and some biologically interesting 'bias', how can you be sure it's not due to the normalisation?

Maybe I'm misunderstanding how these exprs measures work, but my basic worry was that if you have a bimodal distribution, an intensity in the middle of the range (that fluctuates a bit between chips) may be more likely to be shifted towards one of the two peaks. In contrast, a unimodal distribution would tend to stabilise (or even compress?) these fluctuations. Probably I've got this wrong. Also, I've come to realise this is perhaps more about the 'real' expression levels than what an exprs measure does to them (see below).

I initially assumed that other chips should have a similar distribution after gcrma to the U95A chip with which it was optimised. However, after reading about its design (all known, characterised genes) it became obvious that more (the majority?) of the genes are likely to be present in a given sample. The more common design (inc. ATH1) is based on predicted + known genes, so you will have a greater chance of 'not present', perhaps even enough to give the low-intensity peak. Indeed, this is probably supported by the 'wider' distribution at low raw intensities on these arrays.

So, happy(ish) with this point, I'm left with the other concern... As the raw intensity distributions on these arrays look quite different to the U95A, are the gcrma model assumptions still true? Looking at log2 MM vs. PM for U95A (cloud with slope near 1) compared to ATH1 or U133A (Nike "swoosh" overlaid?) just fuelled these concerns.

Any comments? Basically, are GCRMA and other exprs measures developed on the U95A array likely to be valid for all array types?

Cheers,
Matt

> -----Original Message-----
> From: Naomi Altman [mailto:naomi@stat.psu.edu]
> Sent: Tuesday, 31 August 2004 22:14
> To: Matthew Hannah; bioconductor@stat.math.ethz.ch; Rafael A. Irizarry; James MacDonald; Ben Bolstad; zwu@jhsph.edu
> Subject: Re: [BioC] GCRMA/RMA bimodal distribution
>
> I have used RMA and MAS on ATH arrays, and the distributions are bimodal (both probe-wise and probesets). Setting a p-value threshold at about .05 (MAS) removes the lower peak. But, like others on this list, I do not really take the p-values too seriously.
>
> I am not sure why I should care about the bimodality. The methods I use, like t-tests and limma, require normality within genes across arrays, and (possibly) a distribution for the variance of the genes, but say nothing otherwise about the distribution of genes on the same array.
>
> --Naomi
>
> At 06:06 PM 8/31/2004 +0200, Matthew Hannah wrote:
> >Hi,
> >
> >Sorry for including the developers, but I guess you are the only ones that will be able to answer this (and I'm not sure BioC accepts .docs). I saw a comment from Jean addressing the same question but couldn't find the reply he referred to.
> >
> >https://www.stat.math.ethz.ch/pipermail/bioconductor/2004-August/005769.html
> >
> >It seems the mouse chip exprs values have a double peak after gcrma (looking at a density plot).
> >
> >As I'd received no response I've been doing some investigating (see attached). Basically, gcrma gives a single-peaked distribution only for U95 human chips (optimised with these?). Double peaks for exprs estimates appear in the following: U133A (least) - Drosgenome1 - ATH1 (worst).
> >
> >To a lesser extent this also occurs with RMA. U133A has a single wide peak, and then they get worse in the order Dros1 - U95 - ATH1 (the last two have obvious double peaks).
> >
> >From what has been said, this is likely to be a problem of BG correction. I don't know if there are opportunities to change this for RMA, but in GCRMA there are tuning factors, and I don't know if the ad-hoc estimate (rather than the full model) is causing this to happen. Turning off optical correction had no effect.
> >
> >I wanted to play about with GCRMA to see if the distribution changed with the tuning factors, but currently I seem to have an error (see below) with gcrma and justGCRMA not finding gcrma.bg.transformation, and I'm not sure how k should be expressed.
> >
> >I know people should look more at their data, but with the ease of just(GC)RMA and RMAexpress I know a lot of people are just computing expression measures for different chip types without looking at the density of the returned expression. Clearly these people are going to be working with data that may be skewed in some way.
> >
> >I guess that each chip type will need its BG correction optimising for RMA and GCRMA to allow for a better estimate of true expression levels and changes. I really hope this can be fixed, as RMA and GCRMA seem to be really useful expression measures and it would be a shame to have to find alternative methods just because they are not optimised for your chip type.
> >
> >Thanks in advance,
> >Matt
> >
> >R devel 2.0, win2k
> >affy 1.5.2 (I know it's not the latest but getBioC is not working for me at the moment)
> >gcrma 1.1.0
> >
> > <<exprs_meas_comp.doc>>
> >
> > > esetgcrma_slow <- gcrma(raw,fast=FALSE)
> >Computing affinities.Done.
> >Adjusting for optical effect.........Done.
> >Adjusting for non-specific binding.Error in bg.adjust.fullmodel(pms[, i], mms[, i], pm.affinities, mm.affinities, :
> >        couldn't find function "gcrma.bg.transformation"
> >
> > > esetgcrma_slow <- justGCRMA(fast=FALSE)
> >Computing affinities..Done.
> >Adjusting for optical effect..........Done.
> >Adjusting for non-specific binding.Error in bg.adjust.fullmodel(pms[, i], mms[, i], pm.affinities, mm.affinities, :
> >        couldn't find function "gcrma.bg.transformation"
> >
> > > esetgcrma_k4 <- justGCRMA(k=4*fast+0.5*(1-fast))
> >Computing affinities..Done.
> >Adjusting for optical effect..........Done.
> >Adjusting for non-specific binding.Error in gcrma.bg.transformation.fast(pms, bhat, var.y, k = k) :
> >        Object "fast" not found
> >
> >
> >Hi,
> >
> >This has been mentioned before in the context of rma, and that it was an artifact of BG correction.
> >
> >http://files.protsuggest.org/biocond/html/5066.html
> >
> >I was very surprised to see that gcrma also gave a very pronounced bimodal distribution. When comparing samples, obviously the relative positions of the 2 peaks may influence observed expression changes. Would such peak shifts be more likely in divergent samples, and if anyone wants to comment on those.... ;-)
> >
> >This example is using 12 chips (biological reps). But I initially noticed it using 3 and 6 chips in rma.
> >
> >Hope attachment works.
> >
> >Cheers,
> >Matt
> >
> >-------------- next part --------------
> >A non-text attachment was scrubbed...
> >Name: gcrma_dist.png
> >Type: image/png
> >Size: 6633 bytes
> >Desc: gcrma_dist.png
> >Url : https://stat.ethz.ch/pipermail/bioconductor/attachments/20040825/083e56a5/gcrma_dist.png
>
> Naomi S. Altman                                814-865-3791 (voice)
> Associate Professor
> Bioinformatics Consulting Center
> Dept. of Statistics                              814-863-7114 (fax)
> Penn State University                         814-865-1348 (Statistics)
> University Park, PA 16802-2111

-------------- next part --------------
A non-text attachment was scrubbed...
Name: U95mm_pm.png
Type: image/png
Size: 18826 bytes
Desc: U95mm_pm.png
Url : https://stat.ethz.ch/pipermail/bioconductor/attachments/20040901/100a7c50/U95mm_pm.png
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ATH1mm_pm.png
Type: image/png
Size: 19997 bytes
Desc: ATH1mm_pm.png
Url : https://stat.ethz.ch/pipermail/bioconductor/attachments/20040901/100a7c50/ATH1mm_pm.png
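
The idea in the post that a larger share of 'not present' probesets could by itself produce the low-intensity peak can be explored with a small simulation. The following is only a rough sketch in base R on simulated values, not real array data and not anything gcrma does internally. The helper names (simulate_chip, valley_ratio), the peak locations (log2 values of 3 and 8) and the spreads are all assumptions chosen purely for illustration.

```r
# Simulated illustration only: none of these numbers come from real chips.
set.seed(1)
n_probesets <- 10000

# Hypothetical helper: log2 expression values for one chip in which a
# fraction `p_absent` of probesets is not expressed.
simulate_chip <- function(n, p_absent) {
  absent <- runif(n) < p_absent
  ifelse(absent,
         rnorm(n, mean = 3, sd = 0.5),   # assumed background ("absent") level
         rnorm(n, mean = 8, sd = 1.5))   # assumed expressed ("present") level
}

# Hypothetical helper: kernel density at the mid-point between the two
# assumed peaks, divided by the lower of the densities at the two peak
# locations. A value below 1 means there is a valley between two peaks.
valley_ratio <- function(x, low = 3, mid = 5.5, high = 8) {
  d <- density(x)
  y <- approx(d$x, d$y, xout = c(low, mid, high))$y
  y[2] / min(y[1], y[3])
}

few_absent  <- simulate_chip(n_probesets, p_absent = 0.02)  # "known genes" style design
many_absent <- simulate_chip(n_probesets, p_absent = 0.40)  # "predicted + known" style design

ratio_few_absent  <- valley_ratio(few_absent)
ratio_many_absent <- valley_ratio(many_absent)

# Visual check, analogous to looking at a density plot of exprs values.
plot(density(many_absent), main = "Simulated log2 expression", xlab = "log2 value")
lines(density(few_absent), lty = 2)
```

In this toy setting the clear double peak appears only when many probesets are absent, which is in line with the design-based explanation, but it says nothing about whether the background model assumptions hold on a given array type.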
drosgenome1 limma gcrma