First, you may want to read a few similar questions on GAGE, which
explain how GAGE works:
http://seqanswers.com/forums/showthread.php?p=148507#4
You can always use the native GAGE/Pathview workflow rather than the
joint workflow (with other tools like DESeq). The former is more
powerful; the latter exists for users' convenience.
Small sample sizes are common in current RNA-seq datasets, which raises
statistical concerns for differential expression analysis in general
under such conditions: http://www.biomedcentral.com/1471-2105/14/91. In
this sense, p-values or test statistics could be less robust than fold
changes as per-gene differential expression scores. That said, you can
always use differential expression statistics other than fold change
(section 5 of the tutorial), and you can compare the effect of
different per-gene scores/statistics as in section 5.
False positives are unlikely whether you use fold change or other test
statistics in GAGE analysis, because GAGE tests the mean of tens or
hundreds of genes in a gene set or pathway against the background of
all genes. In addition, GAGE applies FDR control to limit false
positives.
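To make the set-level reasoning concrete, here is a minimal Python sketch. It is not GAGE's actual procedure: it substitutes a simple one-sided z-test of a gene set's mean score against the background of all genes, followed by Benjamini-Hochberg adjustment. The function names, the simulated log fold changes, and the gene IDs are all hypothetical and serve only as illustration.

```python
import math
import random


def gene_set_pvalue(scores, gene_set):
    """One-sided z-test: is the set's mean score above the background mean?

    A simplified stand-in for a set-vs-background test, not GAGE's own test.
    """
    values = list(scores.values())
    n_all = len(values)
    mean_all = sum(values) / n_all
    var_all = sum((v - mean_all) ** 2 for v in values) / (n_all - 1)
    members = [scores[g] for g in gene_set if g in scores]
    mean_set = sum(members) / len(members)
    # Standard error of a mean over len(members) genes drawn from the background.
    z = (mean_set - mean_all) / math.sqrt(var_all / len(members))
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Simulated per-gene log fold changes (assumption: 2000 genes, pure noise).
random.seed(0)
scores = {f"g{i}": random.gauss(0.0, 1.0) for i in range(2000)}

# A set whose 50 genes are all shifted upward by a modest amount.
up_set = [f"g{i}" for i in range(50)]
for g in up_set:
    scores[g] += 0.8

# A set of 50 unshifted genes, for comparison.
null_set = [f"g{i}" for i in range(1000, 1050)]

pvals = [gene_set_pvalue(scores, s) for s in (up_set, null_set)]
qvals = bh_adjust(pvals)
for name, p, q in zip(("up_set", "null_set"), pvals, qvals):
    print(f"{name}: p={p:.3g} q={q:.3g}")
```

Because the test statistic is a mean over many genes, a coordinated modest shift across the set is what produces a small p-value; the adjusted q-values are then what you would threshold on.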
On 8/28/2014, Chun wrote:
> Hi Dr. Luo,
>
> Hope this email finds you well. Recently I tried to use your RNAseq
> pipeline to analyze our data. Could you please help me clarify two
> questions?
>
> Currently I am trying to use DESeq2 first to get a list of log ratio
> changes and then feed it into gage (as described in your vignette
> "RNA-seq data pathway and gene-set analysis workflows", 6.1). In this
> case, we will lose statistics information for all genes, right? Those
> genes with high fold change but small p-value (from DESeq2) could lead
> to false discoveries of the enriched gene sets. And I am a little bit
> confused about how you calculate p-val and q-val for different sets in
> this case. We will not even have 1-on-1 comparison. How do you
> determine the statistics for each gene?
>
> Chun
>