Hi all,
Consider 20 samples at baseline later exposed to treatment. 10 develop a
disease and 10 do not develop a disease. Here we want to make a
longitudinal assessment of gene expression in the diseased vs disease-free.
All done on Affy microarrays.
Are there any obvious reasons why one would consider limma over GEE for
testing for conditional or disease related outcomes?
Cheers,
Michael
Dear Michael,
It would help if you explained what you mean by "GEE" and why you think it
might be relevant for your problem.
Best wishes
Gordon
> Date: Tue, 1 Oct 2013 15:28:36 +0100
> From: Michael Breen <breenbioinformatics at gmail.com>
> To: "bioconductor at r-project.org" <bioconductor at r-project.org>,
> "Bioconductor Mailing List" <bioconductor at stat.math.ethz.ch>
> Subject: [BioC] Limma vs GEE
Hi Gordon,
We are just about finished writing up a manuscript in which we describe
longitudinal within-subject differences between two groups, from baseline
to an outcome.
We used a factorial design in limma and are happy with its results and
robustness.
Recently, a colleague mentioned GEE as a means to test for DE between
groups. I have yet to find any microarray differential expression testing
done with it. GEE stands for generalized estimating equations, used to
estimate the parameters of a glm; it measures population-averaged effects.
Truthfully, I don't know what it is about and was hoping to gain a bit more
insight than Google could offer. Often this mailing list brings me
resolution in a much more explicit and unambiguous manner.
Yours,
Michael
On Wed, Oct 2, 2013 at 2:13 PM, Gordon K Smyth <smyth@wehi.edu.au> wrote:
> Dear Michael,
>
> It would help if you explained what you mean by "GEE" and why you think it
> might be relevant for your problem.
>
> Best wishes
> Gordon
GEE makes sense when you have lots of samples in the population, measured
many times, and wish to know about population-level effects of a small
number of factors; it's an alternative to hierarchical/nested mixed models.
In the limma moderated-ANOVA world, you'd use duplicateCorrelation to
account for the nested correlation structure, while still getting the
(often huge) benefits of the between-gene moderation of variance that limma
provides.
http://www.jstatsoft.org/v15/i02/paper
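To make "between-gene moderation of variance" concrete, here is a minimal Python sketch of the shrinkage idea: each gene's residual variance is pulled toward a prior value shared across genes, weighted by degrees of freedom. The prior variance, prior degrees of freedom, and gene variances below are hypothetical numbers chosen for illustration. limma estimates the prior from all genes, so this shows the idea and is not limma's implementation.

```python
def moderated_variance(s2, df_resid, s2_prior, df_prior):
    """Shrink one gene's residual variance toward a prior value.

    The result is a weighted average of the gene-wise variance and the
    prior variance, with the degrees of freedom acting as weights.
    """
    return (df_prior * s2_prior + df_resid * s2) / (df_prior + df_resid)


# Hypothetical numbers for illustration only (not from any real dataset).
gene_variances = {"geneA": 0.01, "geneB": 0.25, "geneC": 1.00}
DF_RESID = 4     # assumed residual degrees of freedom per gene
S2_PRIOR = 0.25  # assumed prior variance (estimated from all genes in practice)
DF_PRIOR = 4     # assumed prior degrees of freedom (also estimated in practice)

moderated = {
    gene: moderated_variance(s2, DF_RESID, S2_PRIOR, DF_PRIOR)
    for gene, s2 in gene_variances.items()
}

for gene, s2 in gene_variances.items():
    print(f"{gene}: raw={s2:.3f} moderated={moderated[gene]:.3f}")
```

Very small variances are pulled up and very large ones are pulled down, which stabilises the test statistics when there are only a few arrays per gene. In R, the corresponding workflow would go through duplicateCorrelation, lmFit and eBayes.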
If this were a reviewer making the criticism/suggestion, I'd respond with a
sensitivity analysis: to what degree are your inferred results sensitive to
choices in modeling (limma with/without duplicateCorrelation; limma vs.
unmoderated straight-up glm; glm vs. geepack error structures)? Based on
that, I'd argue whether it even matters, and if it does matter, show data
for an example gene or two for which the modeling choice has a strong
effect on estimates of magnitude and significance. There's never any one
single "right" answer -- "all models are wrong; but some are useful"
(George Box).
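One simple way to summarise such a sensitivity analysis is to compare which genes are called significant under two modeling choices. The Python sketch below assumes you have already exported per-gene p-values from each analysis. The gene names, p-values and the 0.05 cutoff are made up for illustration, and in practice you would more likely compare adjusted p-values.

```python
def compare_calls(pvals_a, pvals_b, alpha=0.05):
    """Compare significance calls from two analyses of the same genes.

    Returns the Jaccard overlap of the two significant sets and the
    sorted list of genes called by exactly one of the analyses.
    """
    genes = set(pvals_a) & set(pvals_b)
    sig_a = {g for g in genes if pvals_a[g] < alpha}
    sig_b = {g for g in genes if pvals_b[g] < alpha}
    union = sig_a | sig_b
    jaccard = len(sig_a & sig_b) / len(union) if union else 1.0
    discordant = sorted(sig_a ^ sig_b)
    return jaccard, discordant


# Made-up p-values standing in for two modeling choices.
with_correlation = {"g1": 0.001, "g2": 0.03, "g3": 0.20, "g4": 0.04}
without_correlation = {"g1": 0.002, "g2": 0.08, "g3": 0.30, "g4": 0.01}

jaccard, discordant = compare_calls(with_correlation, without_correlation)
print(f"overlap={jaccard:.2f}, calls that changed: {discordant}")
```

The discordant genes are the natural candidates to plot individually when arguing whether the modeling choice matters.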
-Aaron
Hi Michael,
As Aaron Mackey has said in a separate email, limma has the obvious
advantage of borrowing information between genes.
I have trouble thinking of any possible motivation for using a GEE in your
context, and the fact that you can't find any applications to microarrays
is a sign of this.
GEEs are not actually used to fit generalized linear models (glms). If one
wanted to fit a glm, one would simply do so using the usual likelihood
method. GEEs are actually used to estimate glms with correlation
structures. The reason why a "generalized" (approximate) estimating
equation is needed is that such models don't correspond to any well
defined probability distribution. The GEE equations don't maximize any
optimality criteria such as a likelihood or sum of squares.
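For readers who want to see the estimating equation itself, the standard textbook form is sketched below. The notation is an assumption of this sketch and does not come from the thread: y_i is the vector of repeated measures on subject i, mu_i(beta) its mean under the glm, A_i a diagonal matrix of variance functions, R(alpha) a "working" correlation matrix, and phi a dispersion parameter.

```latex
% GEE estimating equation for subjects i = 1, ..., K (notation assumed here)
\sum_{i=1}^{K} D_i^{\top} V_i^{-1} \bigl( y_i - \mu_i(\beta) \bigr) = 0,
\qquad
D_i = \frac{\partial \mu_i}{\partial \beta^{\top}},
\qquad
V_i = \phi \, A_i^{1/2} R(\alpha) \, A_i^{1/2}
```

The equation is solved for beta directly and is not derived as the gradient of a likelihood. With normal data, an identity link and constant variance, D_i is just the design matrix X_i and V_i = phi R(alpha), so for a fixed alpha the equation is the ordinary generalized least squares normal equation.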
In your case you don't even have glms. You have normal data from
Affymetrix arrays for which likelihood methods are readily available. So
there is no need to use glms or GEEs.
With your data, the potential motivation for fitting a correlation
structure would be to take account of correlation between repeated time
course measurements on the same samples (if that is what you actually
have). limma allows you to fit a constant correlation between the repeated
measures. That should be sufficient unless you have a large number of
longitudinal observations on the same samples. If you did need to go
outside the limma framework to fit a more complex correlation structure
(and forgo the benefits of information borrowing), you would probably want
to use one of the many normal-based mixed model tools rather than GEEs.
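As a rough illustration of what a constant correlation between repeated measures implies, the Python sketch below builds an exchangeable correlation matrix for one subject and computes the variance of two linear combinations of that subject's measurements. The correlation of 0.5, the unit variance and the two time points are assumed values for illustration only; limma estimates a consensus correlation from the data.

```python
def exchangeable_corr(n_times, rho):
    """Correlation matrix with 1 on the diagonal and rho everywhere else."""
    return [[1.0 if i == j else rho for j in range(n_times)]
            for i in range(n_times)]


def contrast_variance(weights, corr, sigma2=1.0):
    """Variance of sum(weights[i] * y[i]) when Var(y) = sigma2 * corr."""
    n = len(weights)
    return sigma2 * sum(weights[i] * corr[i][j] * weights[j]
                        for i in range(n) for j in range(n))


RHO = 0.5  # assumed within-subject correlation (hypothetical value)
correlated = exchangeable_corr(2, RHO)
independent = exchangeable_corr(2, 0.0)

change = [-1.0, 1.0]   # follow-up minus baseline for one subject
average = [0.5, 0.5]   # mean of the two time points for one subject

var_change = contrast_variance(change, correlated)           # 2 * (1 - rho)
var_average = contrast_variance(average, correlated)         # (1 + rho) / 2
var_change_indep = contrast_variance(change, independent)
var_average_indep = contrast_variance(average, independent)

print(f"within-subject change: {var_change:.2f} vs {var_change_indep:.2f} if independent")
print(f"subject average:       {var_average:.2f} vs {var_average_indep:.2f} if independent")
```

Treating the repeated measures as independent overstates the variance of within-subject changes and understates the variance of between-subject comparisons, which is why the correlation needs to be modeled.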
Best wishes
Gordon
Hi Aaron and Gordon,
Thanks for your entirely straightforward replies to our broad question.
In fact this was not criticism from a reviewer, but rather constructive
criticism from a colleague. Now we have a better idea about these types of
tests. I find your summary of GEE rather helpful: the models don't
correspond to any well-defined probability distribution, and the GEE
equations do not maximize any optimality criterion such as a likelihood or
sum of squares.
Thanks again for your time and answers!
Michael