Dear Gordon,
Is the geneSetTest() fast to calculate? Not sure if you used permutation
test under the hood.
For GSEA and GSA, sometimes we see artifacts when the size of the set is too
small. Is the same true for geneSetTest?
Thanks!
Simon
Date: Sun, 04 Mar 2007 18:51:00 +1100
From: Gordon Smyth <smyth@wehi.edu.au>
Subject: [BioC] GSEA with one class metaanalysis
To: Mark W Kimpel <mwkimpel at gmail.com>
Cc: bioconductor at stat.math.ethz.ch
Message-ID: <6.2.5.6.1.20070304184303.0242d7a0 at wehi.edu.au>
Content-Type: text/plain; charset="us-ascii"; format=flowed
Dear Mark,
If I understand your problem correctly, neither GSEA nor GSA will
accommodate it. The only option I know of is geneSetTest() in the
limma package. This generally works well, although it will give you
somewhat over-optimistic p-values if there are strong positive
correlations between the genes in your gene sets.
Best wishes
Gordon
Dear Simon,
geneSetTest() is very fast if you use the default settings. In that
case it's a closed-form calculation. It's intended for use with
individual gene sets and has no problem with small gene sets. It's
usable down to size=1.
GSEA and especially GSA are very sophisticated methods which use
permutation over arrays as well as standardization over genes to
control for possible dependence between the genes in the test set.
I'm not an expert on either method, but they seem intended for
two-sample situations with at least half a dozen arrays in each
group, many gene sets, and many genes in each set.
geneSetTest() is a far simpler (hence more flexible) approach which
is aimed at a class of problems that we see regularly at the WEHI.
Here the aim is to relate a gene ranking, usually achieved by fitting
a linear model, to a prior set of genes of special interest. It's
based on permuting the genes, not the arrays. The default method is
simply a Wilcoxon test using the ranks of the genes. The caveat of
geneSetTest() is that significance can arise theoretically from high
correlations between genes in the test set rather than a shift in the
mean, so this possibility should ideally be checked or ruled out
separately.
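
As a rough illustration of the default rank-based calculation described above, here is a minimal Python sketch of a Wilcoxon rank-sum test applied to a gene ranking. It is not limma's code: the function names (midranks, rank_sum_set_test), the normal approximation, and the omission of tie and continuity corrections are all assumptions made for the example, and limma's own implementation may differ in those details.

```python
import math


def midranks(values):
    """Return 1-based ranks of values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


def rank_sum_set_test(statistics, set_index, alternative="up"):
    """Hypothetical helper: rank-sum p-value for one gene set.

    statistics: one genewise statistic per gene (e.g. moderated t).
    set_index:  positions of the genes belonging to the test set.
    Uses the normal approximation to the null distribution obtained
    by permuting gene labels (no tie or continuity correction).
    """
    n = len(statistics)
    m = len(set_index)
    if not 0 < m < n:
        raise ValueError("the set must be a non-empty proper subset of the genes")
    ranks = midranks(statistics)
    w = sum(ranks[i] for i in set_index)   # rank sum of the set
    mean_w = m * (n + 1) / 2               # null mean of the rank sum
    var_w = m * (n - m) * (n + 1) / 12     # null variance, ignoring ties
    z = (w - mean_w) / math.sqrt(var_w)
    p_up = 0.5 * math.erfc(z / math.sqrt(2))     # P(Z >= z)
    p_down = 0.5 * math.erfc(-z / math.sqrt(2))  # P(Z <= z)
    if alternative == "up":
        return p_up
    if alternative == "down":
        return p_down
    if alternative == "two.sided":
        return min(1.0, 2 * min(p_up, p_down))
    raise ValueError("alternative must be 'up', 'down' or 'two.sided'")


# Toy example: 100 genes, test set = the 10 highest-ranked genes.
stats = [float(g) for g in range(100)]
top_set = list(range(90, 100))
print(rank_sum_set_test(stats, top_set, alternative="up"))
```

The null distribution here treats the genes as exchangeable, which is why strong positive correlation within the set can make the p-value over-optimistic. For very small sets the normal approximation is crude, and an exact calculation would be preferable.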
Best wishes
Gordon
At 10:00 PM 5/03/2007, bioconductor-request at stat.math.ethz.ch
wrote:
>Date: Sun, 4 Mar 2007 12:46:19 -0600
>From: "Simon Lin" <simonlin at duke.edu>
>Subject: Re: [BioC] geneSetTest() / GESA
>To: <bioconductor at stat.math.ethz.ch>
>
>Dear Gordon,
>
>Is the geneSetTest() fast to calculate? Not sure if you used permutation
>test under the hood.
>
>For GSEA and GSA, sometimes we see artifacts when the size of the set is too
>small. Is the same true for geneSetTest?
>
>Thanks!
>
>Simon
>
>
>Date: Sun, 04 Mar 2007 18:51:00 +1100
>From: Gordon Smyth <smyth at wehi.edu.au>
>Subject: [BioC] GSEA with one class metaanalysis
>To: Mark W Kimpel <mwkimpel at gmail.com>
>Cc: bioconductor at stat.math.ethz.ch
>Message-ID: <6.2.5.6.1.20070304184303.0242d7a0 at wehi.edu.au>
>Content-Type: text/plain; charset="us-ascii"; format=flowed
>
>Dear Mark,
>
>If I understand your problem correctly, neither GSEA nor GSA will
>accommodate it. The only option I know of is geneSetTest() in the
>limma package. This generally works well, although it will give you
>somewhat over-optimistic p-values if there are strong positive
>correlations between the genes in your gene sets.
>
>Best wishes
>Gordon