Dear All:
I'm trying to identify genes that are differentially expressed in 4
different treatments vs. Control. First, I applied *pairwise.comparison*
(simpleaffy library) to my data, and then, just to compare the two sets
of results, I tried *lmFit* and *eBayes* (from the limma library). I was
wondering which method is best, because although pairwise.comparison
applies a t-test, it doesn't include a Bonferroni correction. On the
other hand, I'm not sure whether fitting the data to a linear model
using lmFit and eBayes is more convenient. I have also found another
library (maanova) that uses ANOVA and is also suitable for DNA
microarray analyses.
I would appreciate any hint about which method to choose.
Thanks a lot,
Avhena
Hi
avehna wrote:
> I'm trying to identify genes that are differentially expressed in 4
> different treatments vs Control. First, I applied
> *pairwise.comparison*(simpleaffy library) to my data, and then, just
> to compare both results, I tried *lmFit* and *eBayes* (from limma
> library). I was wondering which method is best, because although
> pairwise.comparison applies a t-test, it doesn't include Bonferroni
> correction. On the other hand I'm not sure whether fitting the data
> to a linear model using lmFit and eBayes is more convenient. I have
> also found another library(maanova) that uses Anova and it's also
> suitable for DNA microarray analyzes.
First of all: if you have the trivial linear model of just comparing two
conditions against each other, the F test for the condition coefficient
(i.e., the test that ANOVA does) is the same thing as an equal-variance
t test: F is simply the square of t. Hence, a t test and an ANOVA should
give the same p values in the case of just two conditions.
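
A minimal sketch of this equivalence for a single gene, using base R only. The data are simulated and the variable names are made up for illustration; note that `var.equal = TRUE` is needed, because R's default Welch t test does not match the linear model exactly.

```r
# Simulated expression values for one gene: 4 control and 4 treated arrays
# (made-up numbers, purely for illustration).
set.seed(1)
expr  <- c(rnorm(4, mean = 8), rnorm(4, mean = 9))
group <- factor(rep(c("control", "treated"), each = 4))

# Classical two-sample t test; var.equal = TRUE gives the pooled-variance
# version that corresponds to the two-group linear model.
tt <- t.test(expr ~ group, var.equal = TRUE)

# One-way ANOVA on the same data, via the linear model.
av <- anova(lm(expr ~ group))

t_stat <- unname(tt$statistic)
f_stat <- av[["F value"]][1]
p_t    <- tt$p.value
p_f    <- av[["Pr(>F)"]][1]

# F equals t squared, and the two p values coincide.
print(all.equal(t_stat^2, f_stat))
print(all.equal(p_t, p_f))
```

With more than two conditions the overall ANOVA F test and the individual pairwise t tests answer different questions, so this identity no longer applies.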
The main issue with the t test is that the denominator of the 't' value
is computed from the sample variance, as estimated from the values of
the gene in the replicates. As you only have four replicates, this
estimate may fluctuate a lot. What limma's eBayes does is to "share
information across genes", i.e., it finds a compromise between the
variance estimate for the gene under consideration and the average
variance from all the genes. This gives more reliable results.
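
To illustrate the idea of such a compromise, here is a rough base R sketch on simulated data. It is not limma's implementation: limma estimates the common variance and its weight (the prior degrees of freedom) from the data, whereas `s0_sq` and `d0` below are hypothetical values fixed by hand. Only the weighted-average form of the moderated variance is meant to carry over.

```r
# Simulated log-expression matrix: 1000 genes x 4 replicates of one condition
# (made-up data; every gene has true variance 1).
set.seed(42)
n_genes <- 1000
n_rep   <- 4
x <- matrix(rnorm(n_genes * n_rep), nrow = n_genes)

# Ordinary per-gene sample variance, with d = n_rep - 1 degrees of freedom.
s2 <- apply(x, 1, var)
d  <- n_rep - 1

# ASSUMPTION: limma estimates the prior variance and prior degrees of
# freedom from all genes; here both are simply fixed by hand.
s0_sq <- mean(s2)   # stand-in for the "average variance from all the genes"
d0    <- 4          # hypothetical prior degrees of freedom

# Moderated variance: a weighted compromise between the gene's own estimate
# and the common value.
s2_mod <- (d0 * s0_sq + d * s2) / (d0 + d)

# The moderated estimates are spread over a narrower range than the raw ones.
print(range(s2))
print(range(s2_mod))
```

Genes whose raw variance happens to be tiny by chance no longer produce huge t values, and genes with an inflated variance are not penalised as heavily.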
The correction for multiple testing is a completely separate issue: all
these techniques give you raw p values, which you should correct for
multiple testing, either with the standard R function 'p.adjust' or with
Storey's 'qvalue' package. Make sure you understand what this correction
actually does, i.e., read up on family-wise error rate (FWER) and
especially false discovery rate (FDR).
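
A short sketch of the 'p.adjust' step, assuming you already have a vector of raw p values with one entry per gene. The numbers below are invented for illustration; in practice the vector would come from whichever test you ran.

```r
# Hypothetical raw p values, one per gene (made-up numbers).
p_raw <- c(0.0001, 0.0004, 0.0019, 0.0095, 0.0201,
           0.0278, 0.0298, 0.0344, 0.0459, 0.3240)

# Bonferroni controls the family-wise error rate (FWER).
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Benjamini-Hochberg controls the false discovery rate (FDR).
p_bh <- p.adjust(p_raw, method = "BH")

# Number of genes called significant at 0.05 under each approach.
print(sum(p_raw  < 0.05))
print(sum(p_bonf < 0.05))
print(sum(p_bh   < 0.05))
```

Bonferroni is the stricter of the two, which is why FDR control is usually preferred when thousands of genes are tested at once.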
Cheers
Simon
+---
| Dr. Simon Anders, Dipl.-Phys.
| European Molecular Biology Laboratory (EMBL), Heidelberg
| office phone +49-6221-387-8632
| preferred (permanent) e-mail: sanders at fs.tum.de
Thank you so much Simon, now it's clearer to me.
Best Regards,
Avhena
On Thu, Feb 25, 2010 at 5:05 AM, Simon Anders <anders@embl.de> wrote: