Question

ANOVA vs T-TEST vs eBayes

avehna ▴ 240

@avehna-3930

Last seen 9.6 years ago

Dear All,

I'm trying to identify genes that are differentially expressed in 4 different treatments vs. control. First, I applied *pairwise.comparison* (simpleaffy library) to my data, and then, just to compare the results, I tried *lmFit* and *eBayes* (from the limma library). I was wondering which method is best, because although pairwise.comparison applies a t-test, it doesn't include a Bonferroni correction. On the other hand, I'm not sure whether fitting the data to a linear model using lmFit and eBayes is more convenient. I have also found another library (maanova) that uses ANOVA and is also suitable for DNA microarray analyses.

I would appreciate any hint about which method to choose.

Thanks a lot,
Avhena

Microarray limma • 4.4k views

updated 14.2 years ago by Simon Anders ★ 3.7k • written 14.2 years ago by avehna ▴ 240

score 0 · Answer 1 · 2010-02-25

Hi,

avehna wrote:
> I'm trying to identify genes that are differentially expressed in 4
> different treatments vs Control. First, I applied
> *pairwise.comparison* (simpleaffy library) to my data, and then, just
> to compare both results, I tried *lmFit* and *eBayes* (from limma
> library). I was wondering which method is best, because although
> pairwise.comparison applies a t-test, it doesn't include Bonferroni
> correction. On the other hand I'm not sure whether fitting the data
> to a linear model using lmFit and eBayes is more convenient. I have
> also found another library (maanova) that uses ANOVA and it's also
> suitable for DNA microarray analyses.

First of all: if you have the trivial linear model of just comparing two conditions against each other, the F test for the condition coefficient (i.e., the test that ANOVA does) is equivalent to a two-sample t test with pooled variance, because the F statistic is the square of the t statistic. Hence, a t test and an ANOVA should give the same results in the case of just two conditions.

The main issue with the t test is that the denominator of the t statistic is based on the sample variance, as estimated from the values of the gene in the replicates. As you only have four replicates, this estimate may fluctuate a lot. What limma's eBayes does is to "share information across genes", i.e., it finds a compromise between the variance estimate for the gene under consideration and the average variance from all the genes. This gives more reliable results.

The correction for multiple testing is a completely separate issue: all these techniques give you raw p values, which you should correct for multiple testing, either with the standard R function 'p.adjust' or with Storey's 'qvalue' package. Make sure you understand what this correction actually does, i.e., read up on the family-wise error rate (FWER) and especially the false discovery rate (FDR).

Cheers
  Simon

+---
| Dr. Simon Anders, Dipl.-Phys.
| European Molecular Biology Laboratory (EMBL), Heidelberg
| office phone +49-6221-387-8632
| preferred (permanent) e-mail: sanders at fs.tum.de
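
For anyone who wants to see the three points of the answer in numbers, here is a minimal sketch in plain Python (standard library only; it is not R or limma code). The helper names `pooled_t`, `one_way_f`, `moderated_variance` and `bh_adjust` are made up for this illustration. The prior values `s0_2` and `d0` passed to `moderated_variance` are assumptions supplied by hand; eBayes estimates them from all genes. `bh_adjust` is meant to mirror what `p.adjust(p, method = "BH")` computes in R.

```python
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t statistic with pooled (equal) variance."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / (pooled_var * (1 / nx + 1 / ny)) ** 0.5


def one_way_f(groups):
    """One-way ANOVA F statistic for a list of groups."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = mean([v for g in groups for v in g])
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def moderated_variance(s2, d, s0_2, d0):
    """Shrunken variance: weighted average of the gene-wise variance s2
    (d degrees of freedom) and a prior variance s0_2 (d0 degrees of freedom).
    s0_2 and d0 are hypothetical inputs here, not estimated from data."""
    return (d0 * s0_2 + d * s2) / (d0 + d)


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvals)
    # Walk from the largest p value down, keeping a running minimum so the
    # adjusted values stay monotone and never exceed 1.
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = m - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up expression values for one gene in two conditions.
control = [5.1, 4.8, 5.5, 5.0]
treated = [6.2, 6.0, 6.6, 5.9]

t_stat = pooled_t(control, treated)
f_stat = one_way_f([control, treated])  # equals t_stat ** 2 up to rounding
```

With two groups, `f_stat` and `t_stat ** 2` agree up to floating-point rounding, which is the equivalence described above. The shrinkage function shows why a noisy gene-wise variance from few replicates is pulled toward the value shared across genes: the smaller `d` is relative to `d0`, the more weight the prior gets.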