The publication for CAMERA
(http://nar.oxfordjournals.org/content/early/2012/05/24/nar.gks461.full)
mentions using the average absolute logFC as a statistic for
"non-directional" gene sets.
This would seem like the appropriate approach for gene sets from e.g.
the GO database.
However, I could not find in the camera documentation in limma how to
use anything other than logFC or its rank.
Have I missed something?
Thanking you in advance,
Best regards,
Simon.
Hi Simon,
Thank you for your interest in using CAMERA. It has lots of good
features, holding the correct false positive rate and having good power.
It can be used when you have multiple gene sets, e.g., GO as you
mentioned.
Currently, the default test statistic for individual genes is the
moderated t, which is a variant of the ordinary t. See Smyth 2004
(http://www.statsci.org/smyth/pubs/ebayes.pdf).
It is up to the user whether ranks (of the moderated t) should be used
or not.
Of course, it is easy to edit the code to allow log fold change to
represent the change of the individual genes. What other statistics
for individual genes would you be interested in using?
It is also worth noting that, according to other users, it is safe to
set "allow.neg.cor=FALSE", which treats the correlation as zero when
the estimated correlation is negative.
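
To make the effect of that setting concrete, here is a minimal sketch in Python. It is not limma's code. It assumes the variance inflation factor described in the CAMERA paper, VIF = 1 + (m - 1) * rho, for a set of m genes with mean pairwise correlation rho; the function and argument names are made up for this illustration.

```python
def variance_inflation(set_size: int, mean_cor: float,
                       allow_neg_cor: bool = False) -> float:
    """Illustrative VIF for the mean statistic of a gene set.

    Assumes VIF = 1 + (m - 1) * rho. With allow_neg_cor=False, a negative
    correlation estimate is replaced by zero, so the VIF never drops below 1.
    """
    if not allow_neg_cor:
        mean_cor = max(mean_cor, 0.0)
    return 1.0 + (set_size - 1) * mean_cor


# A slightly negative estimate shrinks the variance unless it is truncated.
print(variance_inflation(50, -0.01, allow_neg_cor=True))
print(variance_inflation(50, -0.01, allow_neg_cor=False))
print(variance_inflation(50, 0.02))
```

Truncating at zero keeps the test from becoming more liberal than an independence-based test just because a correlation estimate happened to come out negative.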
Gordon may also have some insight regarding your question.
Enjoy using CAMERA.
Di
----
Di Wu
Postdoctoral fellow
Harvard University, Statistics Department
Harvard Medical School
Science Center, 1 Oxford Street, Cambridge, MA 02138-2901 USA
________________________________________
From: bioconductor-bounces@r-project.org [bioconductor-
bounces@r-project.org] On Behalf Of Simon de Bernard
[simon.debernard@altrabio.com]
Sent: Monday, September 03, 2012 6:58 AM
To: bioconductor at r-project.org
Subject: [BioC] CAMERA for non-directional gene sets
Dear Di,
thanks for your answer.
> Thank you for your interest in using CAMERA. It has lots of good
> features, holding the correct false positive rate and having good power.
That's why it piqued my interest ;-)
> It can be used when you have multiple gene sets, e.g., GO as you
> mentioned.
>
> Currently, the default test statistic for individual genes is the
> moderated t, which is a variant of the ordinary t. See Smyth 2004
> (http://www.statsci.org/smyth/pubs/ebayes.pdf).
>
> It is up to the user whether ranks (of the moderated t) should be
> used or not.
>
> Of course, it is easy to edit the code to allow log fold change to
> represent the change of the individual genes. What other statistics
> for individual genes would you be interested in using?
Sorry for mixing up logFC and moderated t. However, isn't it still an
approach only appropriate for "directional" gene sets?
Suppose I have a gene set for which I know that genes should be
differentially expressed but not necessarily in the same direction. If
half the genes in my set have a statistic of -10 and the other half of
+10, won't the current implementation give me p=1 when I would expect
significance?
Best regards,
Simon.
Dear Simon,
CAMERA reports p-values for the directions up, down and either. In
your example, you are right that CAMERA would give a very large
p-value, indicating non-significance.
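
The cancellation in Simon's example can be checked with a few lines of Python. This is only a toy calculation, not what camera computes: the background values are invented, and the set summary is a plain difference of means with no variance estimate or correlation adjustment.

```python
from statistics import mean

# Hypothetical background: 41 genes with unremarkable statistics in [-2, 2].
background = [x / 10 for x in range(-20, 21)]
# The gene set from the example: half the genes at -10, half at +10.
gene_set = [-10.0] * 5 + [10.0] * 5


def set_vs_rest(set_stats, rest_stats, use_abs=False):
    """Mean statistic in the set minus mean statistic in the rest."""
    if use_abs:
        set_stats = [abs(s) for s in set_stats]
        rest_stats = [abs(s) for s in rest_stats]
    return mean(set_stats) - mean(rest_stats)


directional = set_vs_rest(gene_set, background)          # signs cancel
mixed = set_vs_rest(gene_set, background, use_abs=True)  # signs ignored
print(f"directional: {directional:.3f}, mixed: {mixed:.3f}")
```

The directional summary is zero because the +10 and -10 genes offset each other, while the absolute-value summary is large.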
It doesn't output a non-directional p-value (we also call the
non-directional test the test for the "mixed" direction), as we found
that the correlation effects in the mixed-direction test are quite
complicated.
CAMERA is a competitive gene set test; as you may know, the two major
types of gene set test are competitive and self-contained (see Goeman,
J. J. and Bühlmann, P. (2007), and the CAMERA paper).
Generally, to test a mixed direction under a competitive hypothesis,
the Wilcoxon mean rank gene set test (wilcoxGST in limma) can be used,
although this method ignores gene-gene correlations. In my experience,
the correlation effect in the mixed-direction test is much weaker than
in the directional tests.
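
As a rough illustration of the idea behind a mixed-direction rank test, the sketch below ranks genes by the absolute value of their statistics and applies a Wilcoxon rank-sum test with a normal approximation. It is written in Python and is not limma's implementation: the function names are invented, there is no tie or continuity correction, and, like the rank test mentioned above, it ignores gene-gene correlation. In limma itself the mixed alternative is, as far as I know, requested through the alternative argument of geneSetTest.

```python
import math


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_test(statistics, in_set, use_abs=True):
    """Upper-tail rank-sum test that the set ranks higher than the rest.

    use_abs=True ranks |statistic| (mixed direction); use_abs=False ranks
    the signed statistic (tests for "up" only).
    """
    values = [abs(s) for s in statistics] if use_abs else list(statistics)
    ranks = midranks(values)
    n, m = len(values), sum(in_set)
    rank_sum = sum(r for r, flag in zip(ranks, in_set) if flag)
    expected = m * (n + 1) / 2
    variance = m * (n - m) * (n + 1) / 12  # no tie correction in this sketch
    z = (rank_sum - expected) / math.sqrt(variance)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Invented data: 41 background genes, then 10 set genes at -10 / +10.
stats = [x / 10 for x in range(-20, 21)] + [-10.0] * 5 + [10.0] * 5
flags = [False] * 41 + [True] * 10

z_signed, p_signed = rank_sum_test(stats, flags, use_abs=False)
z_mixed, p_mixed = rank_sum_test(stats, flags, use_abs=True)
print(f"signed: z={z_signed:.2f}, p={p_signed:.3g}")
print(f"mixed:  z={z_mixed:.2f}, p={p_mixed:.3g}")
```

On this toy data the signed version sees nothing, because the set's ranks sit symmetrically at both extremes, while the absolute-value version places every set gene at the top of the ranking.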
ROAST, a self-contained test, can also be used to help understand
non-directional tests.
(http://bioinformatics.oxfordjournals.org/content/early/2010/07/07/bioinformatics.btq401.full.pdf)
Hope this helps.
Di
----
Di Wu
Postdoctoral fellow
Harvard University, Statistics Department
Harvard Medical School
Science Center, 1 Oxford Street, Cambridge, MA 02138-2901 USA
________________________________________
From: Simon de Bernard [simon.debernard@altrabio.com]
Sent: Tuesday, September 04, 2012 12:06 PM
To: Wu, Di
Cc: bioconductor at r-project.org
Subject: Re: [BioC] CAMERA for non-directional gene sets