Dear Michal,

No, there is no publication on these or related methods. But have you
read Section 10.3, "Testing across contrasts", in the limma User's Guide,
which discusses these methods? You might also find it helpful to search
for my replies to previous questions about decideTests on this mailing
list.

I recommend "global" whenever you have a series of contrasts that you are
analysing together, especially if you want to compare the number of DE
genes that you find. There is no requirement that the contrasts be of
similar "strength". An advantage of "global" is that the same t-statistic
cutoff is used for all contrasts (in the absence of weights or NAs).
However, you need to be aware that the number of DE genes for any contrast
will depend on what other contrasts it is tested with -- contrasts with
lots of DE genes will pull the others up.

I recommend "separate" if you are analysing different contrasts for
different purposes, so the numbers of DE genes are not comparable.

"nestedF" has a specialist purpose in trying to give more weight to genes
which are simultaneously DE in two or more contrasts. Both this and
"hierarchical" are somewhat experimental in that I have not published
the theory, and therefore I don't strongly recommend them.
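
As a rough illustration of the "separate" versus "global" distinction
above, the following base R sketch applies p.adjust() either to each
contrast on its own or to all contrasts pooled into one vector. It is not
limma's own code: the simulated p-values, the object names (p,
adj_separate, adj_global) and the choice of BH at 0.05 are all assumptions
made for the example.

```r
# Sketch only: imitates "separate" vs "global" multiple testing with base R.
# All data are simulated and all object names are hypothetical.
set.seed(1)
n_genes <- 1000
alpha   <- 0.05

# Raw p-values for two contrasts of very different "strength":
# contrast A has 300 clearly DE genes, contrast B has only 20.
p <- cbind(
  A = c(runif(300, 0, 1e-4), runif(n_genes - 300)),
  B = c(runif(20,  0, 1e-4), runif(n_genes - 20))
)

# "separate"-style: adjust each contrast (column) independently.
adj_separate <- apply(p, 2, p.adjust, method = "BH")

# "global"-style: adjust all p-values together as one long vector,
# so a single p-value cutoff applies to every contrast.
adj_global   <- p
adj_global[] <- p.adjust(as.vector(p), method = "BH")

# Number of genes called DE per contrast under each approach.
n_separate <- colSums(adj_separate < alpha)
n_global   <- colSums(adj_global   < alpha)
print(rbind(separate = n_separate, global = n_global))
```

On a fitted limma object (here called fit, a hypothetical name) the
corresponding calls would be decideTests(fit, method = "separate") and
decideTests(fit, method = "global"). In a simulation like this one,
contrast B would typically gain a few genes under the pooled adjustment,
because the many DE genes in contrast A relax the common cutoff.
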
Best wishes
Gordon
> From: Michal Kolar <kolarmi at img.cas.cz>
> Subject: [BioC] Methods in decideTests (limma)
> To: bioconductor at stat.math.ethz.ch
> Message-ID: <62C854CA-B73C-4B73-A383-51ED398CCCE3 at img.cas.cz>
> Content-Type: text/plain; charset=UTF-8; delsp=yes; format=flowed
>
> Dear list,
>
> I would like to ask about the theory behind the various methods in
> decideTests. The method "separate" is clear to me, giving p-value
> adjustment for each contrast separately. The method "global" may be
> used presumably when there are two or more contrasts of similar
> strength in the experiment. The method "nestedF" seems to me similar
> to the "global", yet it considers for p-value adjustment only those
> genes for which F-test is significant after p-value correction
> (roughly said). Is that correct? And finally, the method
> "hierarchical" considers for the final p-value adjustment only those
> genes/contrasts for which adjusted p-value (of the gene in the
> contrast treated separately) is smaller than some cut-off. Is that
> right?
>
> My next question is whether any of the methods "global",
> "hierarchical" or "nestedF" can be used to correct p-values for
> contrasts that have different strength (different number of expected
> DEGs)? In my case 'different' means hundreds of DEGs for one contrast
> and tens for the other.
>
> Can I find more details on the methods in some reference?
>
> Cheers,
> Michal
>
> --
> -----------------------------------------------------
> Michal Kolar
> Academy of Sciences of the Czech Republic
> Institute of Molecular Genetics
Dear Gordon,

thank you for your answer. I read Section 10.3, yet I wanted to get a
little more detail. I should have searched the list, as I have now been
able to find answers to many of my questions in previous posts (by you
and James W. MacDonald). I do, however, still have one question.

In one of the posts James W. MacDonald writes:
>> From: jmacdon at med.umich.edu (James W. MacDonald)
>> Date: Tue, 10 Oct 2006 15:42:17 -0400
>> Subject: [BioC] Limma nestedF
>>
>> ...
>> I am not sure the nestedF method is appropriate for this situation,
>> because you have interaction terms. When there is an interaction term
>> in the model, the usual thing to do is to check for significance of
>> the interaction term, and if it isn't significant, then you would
>> drop it from the model and check for significance of the main effects
>> terms. The nestedF method won't do this - it will treat all the
>> contrasts as if equal.
>> ...
>
I have a 2*2 factorial design with surgery and treatment factors and
their interaction:

> design <- model.matrix(~surgery*treatment)

The interaction term is potentially of interest. Do I gauge the
importance of the interaction term using decideTests with the method
"global" or "separate"? Or should I estimate the importance directly
by observing the distribution of p-values for the interaction term?
If so, should I remove all contrasts with, say, a flat distribution of
p-values from the decideTests("global") call?

Thank you for your help,
Michal

PS: When trying to understand the functions, I spotted a possible
minor bug in classifyTestsF: A global variable 'contrasts' may be
used in rare cases in the 'Return TestResults matrix' chunk of the
code. (The bug is present in my version limma_2.16.5 as well as in
the development version 2.19.1)
On 5 Jun 2009, at 03:47, Gordon K Smyth wrote:
> ...
Dear Michal,
On Fri, 5 Jun 2009, Michal Kolar wrote:
> Dear Gordon,
>
> thank you for your answer. I read Section 10.3, yet I wanted to get a
> little more detail. I should have searched the list, as I have now been
> able to find answers to many of my questions in previous posts (by you
> and James W. MacDonald). I do, however, still have one question.

...

> I have a 2*2 factorial design with surgery and treatment factors and
> their interaction
>
>> design <- model.matrix(~surgery*treatment)
>
> The interaction term is potentially of interest. Do I gauge the
> importance of the interaction term using decideTests with the method
> "global" or "separate"? Or should I estimate the importance directly by
> observing the distribution of p-values for the interaction term? If so,
> should I remove all contrasts with, say, a flat distribution of p-values
> from the decideTests("global") call?

I cannot tell you how to analyse a specific data set. However,
statistical answers should always be tuned to the question at hand. Your
question, as you state it, concerns only the interaction, so it is
naturally answered by a separate test of the interaction contrast. A
"global" call is always an answer to a question involving more than one
contrast, and you have not asked such a question. What I am saying is
that you have to think carefully about all the scientific questions you
really want to answer; your formulation of the questions then drives the
analysis.
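
A separate test of the interaction might look roughly like the sketch
below. It assumes limma is installed and uses simulated data; the factor
levels ("sham"/"op", "ctrl"/"drug"), the replicate numbers and the effect
size are invented for illustration, and the coefficient name follows from
those assumed levels.

```r
# Sketch only: test the interaction coefficient on its own with limma.
# Data, factor levels and effect sizes are simulated assumptions.
library(limma)

set.seed(1)

# Hypothetical 2x2 layout with 3 replicates per cell (12 arrays).
surgery   <- factor(rep(c("sham", "op"), each = 6),
                    levels = c("sham", "op"))
treatment <- factor(rep(rep(c("ctrl", "drug"), each = 3), times = 2),
                    levels = c("ctrl", "drug"))
design <- model.matrix(~ surgery * treatment)

# Simulated log-expression values: 200 genes x 12 arrays.
y <- matrix(rnorm(200 * 12), nrow = 200,
            dimnames = list(paste0("gene", 1:200), NULL))

# Give the first 10 genes an interaction effect (extra shift in op+drug).
cell <- surgery == "op" & treatment == "drug"
y[1:10, cell] <- y[1:10, cell] + 5

# Fit the linear model and compute moderated t-statistics.
fit <- eBayes(lmFit(y, design))

# Name of the interaction coefficient under the assumed factor levels.
inter <- "surgeryop:treatmentdrug"

# Adjust p-values for this one coefficient only, i.e. a separate test.
tab  <- topTable(fit, coef = inter, number = Inf, adjust.method = "BH")
n_de <- sum(tab$adj.P.Val < 0.05)
print(n_de)
```

Because only one coefficient is passed to topTable(), the adjustment
involves no other contrast, which matches the question being asked about
the interaction alone.
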
Best wishes
Gordon
Dear Gordon,
many thanks for your help :)
Best regards,
Michal
On 6 Jun 2009, at 02:23, Gordon K Smyth wrote:
> ...
Dear Michal,
On Fri, 5 Jun 2009, Michal Kolar wrote:
> PS: When trying to understand the functions, I spotted a possible minor
> bug in classifyTestsF: A global variable 'contrasts' may be used in rare
> cases in the 'Return TestResults matrix' chunk of the code. (The bug is
> present in my version limma_2.16.5 as well as in the development
> version 2.19.1)

Thanks, you're correct. I've now removed the offending line of code.
Best wishes
Gordon