In general, the big differences should come from the genes with very
small variance or very large variance, and moderate differential
expression. This is because limma (and SAM) shrinks each gene's
variance estimate towards a common value estimated from all genes,
whereas ANOVA uses the gene's own sample variance. Since the t or F
test uses the variance estimate in the
denominator, genes with very small sample variance will be less
significant with shrinkage, whereas genes with high variance will be
more significant.
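
The effect described above can be illustrated with a minimal sketch. It
assumes the moderated variance is a weighted average of the gene's sample
variance and a prior variance, which is the form used by limma's empirical
Bayes step. The prior variance, prior degrees of freedom, group size and
per-gene values below are made-up numbers for illustration; limma estimates
the prior quantities from the data, and the function names are hypothetical.

```python
import math


def t_statistic(mean_diff, variance, n_per_group):
    """Two-group t statistic for equal group sizes and a given variance estimate."""
    return mean_diff / math.sqrt(variance * 2.0 / n_per_group)


def moderated_variance(sample_var, df, prior_var, prior_df):
    """Weighted average of the gene's sample variance and a prior variance."""
    return (prior_df * prior_var + df * sample_var) / (prior_df + df)


# Assumed (illustrative) settings: 3 arrays per group, so 4 residual df.
N_PER_GROUP = 3
RESIDUAL_DF = 2 * N_PER_GROUP - 2
PRIOR_VAR = 0.10   # assumed typical variance across genes
PRIOR_DF = 4.0     # assumed weight given to the prior
MEAN_DIFF = 0.5    # same moderate log-fold change for every gene


def compare(sample_var):
    """Return (ordinary t, moderated t) for one gene's sample variance."""
    shrunk = moderated_variance(sample_var, RESIDUAL_DF, PRIOR_VAR, PRIOR_DF)
    return (t_statistic(MEAN_DIFF, sample_var, N_PER_GROUP),
            t_statistic(MEAN_DIFF, shrunk, N_PER_GROUP))


for name, s2 in [("low variance", 0.01), ("typical", 0.10), ("high variance", 1.00)]:
    ordinary, moderated = compare(s2)
    print(f"{name:14s} ordinary t = {ordinary:6.2f}   moderated t = {moderated:6.2f}")
```

The sketch compares only the statistics. In limma the moderated t is also
referred to a t distribution with more degrees of freedom (residual plus
prior), which changes the p-values as well.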
--Naomi
At 08:45 PM 4/17/2006, Wettenhall James wrote:
>Hi,
>
>I am now working in a place where GeneSpring GX is the standard
>microarray analysis tool, whereas my only microarray analysis experience
>is using R/BioC, particularly limma. I'm sure this is not an easy
>question, but is there any recommended reading for trying to answer the
>question "Why do I get really different gene lists from the different
>statistical tests?" (In one case where this has arisen, a thorough
>RT-PCR follow-up is being done, so that will be most interesting.)
>
>Here's a GeneSpring tutorial, and a quick discussion of the statistical
>test used (obviously different from limma):
>http://hsc.unm.edu/som/micro/genomics/tutorial.html#OLE_LINK6
>
>1-Way Anova
>Parametric test, don't assume variances equal
>Multiple testing correction: Benjamini & Hochberg is the default, but
>if this gives no statistically significant genes, then users seem to
>turn this off. One thing I don't understand is that when multiple
>testing correction is turned on, the p-value cutoff entry box is
>_replaced_ by a false discovery rate entry box - why can't I have both?
>(Probably a question for GeneSpring rather than BioC.)
>
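
One way to see why a false discovery rate box can stand in for a p-value
box is that, under the Benjamini & Hochberg procedure, the chosen FDR level
itself determines a data-dependent cutoff on the raw p-values. The sketch
below is a plain-Python illustration of that procedure, not GeneSpring's
implementation; the function names and the list of p-values are made up.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def implied_p_cutoff(pvalues, fdr):
    """Largest raw p-value declared significant at the given FDR, or None."""
    adjusted = bh_adjust(pvalues)
    significant = [p for p, a in zip(pvalues, adjusted) if a <= fdr]
    return max(significant) if significant else None


# Made-up raw p-values for ten genes.
raw = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.5]

print("raw p <= 0.05:      ", sum(p <= 0.05 for p in raw), "genes")
print("BH at FDR 0.05:     ", sum(a <= 0.05 for a in bh_adjust(raw)), "genes")
print("implied raw cutoff: ", implied_p_cutoff(raw, 0.05))
```

Because every adjusted p-value is at least as large as the raw one, a raw
p-value cutoff set at the same level as the FDR would never remove a gene
that the FDR threshold has already kept.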
>After the ANOVA test (particularly if there are more than two
>conditions being compared), a post-hoc test (Tukey or
>Student-Newman-Keuls) can be done to determine which pairs of conditions
>the significant genes differ between.
>
>I have been able to get exactly the same results from normalizing /
>probe-level summary between GeneSpring and BioC. (For Affy data,
>GeneSpring has RMA and GCRMA, but it refers to them as "pre-processing".
>"Normalization" is done later - so I suspect that some users
>over-normalize compared with what is done in BioC, not realizing that
>RMA "pre-processing" includes a quantile normalization.)
>
>So if anyone can recommend any reading for comparing the "different
>worlds" of microarray analysis, I would be most interested.
>
>Best wishes,
>James
>
>_______________________________________________
>Bioconductor mailing list
>Bioconductor at stat.math.ethz.ch
>https://stat.ethz.ch/mailman/listinfo/bioconductor
>Search the archives:
>http://news.gmane.org/gmane.science.biology.informatics.conductor
Naomi S. Altman 814-865-3791 (voice)
Associate Professor
Dept. of Statistics 814-863-7114 (fax)
Penn State University 814-865-1348 (Statistics)
University Park, PA 16802-2111
Hi,
Thanks Naomi and thanks in particular to Gordon Smyth for a discussion
off the list. I do understand the value of moderating the t statistic
(shrinking variance etc.) for small numbers of replicate arrays, but I
was initially missing the fact that a 1-way ANOVA test can be equivalent
to a t test. Probably most people who read this list regularly have a
statistics background, so it might be surprising to hear from someone
who knows so little about ANOVA, but I come from a maths/comp sci (not
stats) background, and so I was mistakenly concerned that ANOVA methods
(e.g. Churchill et al.) might be completely different from the t tests
and linear models which I was more familiar with.
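
The equivalence mentioned above can be checked numerically for the
classical case: with exactly two groups and a pooled (equal-variance)
estimate, the one-way ANOVA F statistic equals the square of the two-sample
t statistic. The sketch below uses made-up log-expression values and
hypothetical function names. It does not cover the unequal-variance option
mentioned in the quoted message.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pooled_t(a, b):
    """Two-sample t statistic with a pooled (equal-variance) estimate."""
    ss_a = sum((x - mean(a)) ** 2 for x in a)
    ss_b = sum((x - mean(b)) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))


def one_way_anova_f(groups):
    """Classical one-way ANOVA F statistic for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Made-up log2 expression values for one gene in two conditions.
control = [7.1, 6.8, 7.4, 7.0]
treated = [8.0, 8.3, 7.7, 8.4]

t = pooled_t(control, treated)
f = one_way_anova_f([control, treated])
print(f"t = {t:.4f}   t^2 = {t * t:.4f}   F = {f:.4f}")
```

With three or more groups there is no single t statistic to compare with,
which is where the post-hoc pairwise tests come in.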
Thanks for the helpful information.
Best wishes,
James