Question

DESeq2 - GAGE workflow


sup230 ▴ 30


I am confused about the GAGE workflow. As I understand it, GAGE is a functional class scoring method that does not apply a preset cutoff to select significant genes. However, I have seen several workflows and example scripts that pass GAGE only the DESeq2 output that was filtered at an adjusted p-value of 0.05. Shouldn't the input be the entire expression data set? In short, I am confused about exactly which two groups are compared when identifying pathways that are perturbed with statistical significance. If I use only the subset of genes that DESeq2 calls significant and run GAGE with gsets=kegg.sigmet, what comparison is being made?

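library(gage)  # provides kegg.gsets()
kegg_human <- kegg.gsets(species = "hsa", id.type = "kegg")
names(kegg_human)
# Keep only the signaling and metabolic pathway gene sets
kegg.sigmet <- kegg_human$kg.sets[kegg_human$sigmet.idx]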
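
To make the question concrete, here is a minimal sketch of the two inputs I am contrasting. It uses simulated log2 fold changes and adjusted p-values as a stand-in for real DESeq2 output, and a plain two-sample t-test of set genes against the remaining genes as a stand-in for the gene set test. It does not call gage itself, and every name and number in it (lfc, padj, set_genes, set_vs_background, the standard error of 0.4) is made up for illustration.

```r
set.seed(1)

# Simulated stand-in for DESeq2 results: 2000 genes with log2 fold changes
# (hypothetical data, not real output).
n_genes <- 2000
genes <- paste0("g", seq_len(n_genes))
lfc <- rnorm(n_genes, mean = 0, sd = 1)
names(lfc) <- genes

# Hypothetical gene set of 40 genes with a modest coordinated shift.
set_genes <- genes[1:40]
lfc[set_genes] <- lfc[set_genes] + 0.8

# Crude stand-in for padj: treat lfc / 0.4 as a z-score (0.4 is an assumed
# standard error), then apply Benjamini-Hochberg adjustment.
padj <- p.adjust(2 * pnorm(-abs(lfc) / 0.4), method = "BH")

# Input A: every gene, no cutoff.
full <- lfc
# Input B: only the genes passing padj < 0.05.
filtered <- lfc[padj < 0.05]

# Compare the fold changes of set genes with those of all other genes
# present in the supplied input vector.
set_vs_background <- function(fc, set) {
  in_set <- names(fc) %in% set
  t.test(fc[in_set], fc[!in_set])
}

res_full <- set_vs_background(full, set_genes)
res_filtered <- set_vs_background(filtered, set_genes)

# How many genes each input contains, and the resulting p-values.
c(n_full = length(full), n_filtered = length(filtered))
c(p_full = res_full$p.value, p_filtered = res_filtered$p.value)
```

With input B, the background that the set is compared against consists only of genes that already passed the cutoff, which is the part I do not understand.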

Also, what is the key algorithmic difference between GSEA and GAGE? I have read in papers that GSEA uses a Kolmogorov-Smirnov statistic and GAGE uses a Wilcoxon Mann-Whitney test. I take it both are non-parametric, rank-based tests, but is the difference that GAGE applies a two-sample t-test to the ranks, while GSEA tests whether the shapes of the cumulative distribution functions differ?
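
The tests I am asking about can be put side by side on the same data. The sketch below uses simulated gene-level statistics (made up) and base R's ks.test, wilcox.test and t.test on set genes versus the rest. It is only meant to show what each test compares; it is neither the GSEA nor the GAGE implementation, and the variable names are hypothetical.

```r
set.seed(42)

# Simulated per-gene statistics for 1000 genes (hypothetical data).
stat <- rnorm(1000)

# Hypothetical gene set: the first 50 genes, shifted upward by 0.5.
in_set <- seq_along(stat) <= 50
stat[in_set] <- stat[in_set] + 0.5

# Kolmogorov-Smirnov: do the two empirical cumulative distributions differ?
ks <- ks.test(stat[in_set], stat[!in_set])

# Wilcoxon Mann-Whitney: do the ranks of set genes tend to be higher or lower?
wx <- wilcox.test(stat[in_set], stat[!in_set])

# Two-sample (Welch) t-test on the raw statistics: do the means differ?
tt <- t.test(stat[in_set], stat[!in_set])

# Two-sample t-test on the ranks instead of the raw values.
r <- rank(stat)
tt_rank <- t.test(r[in_set], r[!in_set])

p_values <- c(ks = ks$p.value, wilcoxon = wx$p.value,
              t = tt$p.value, t_on_ranks = tt_rank$p.value)
round(p_values, 4)
```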

One last question about the GAGE result: if I look at the output without selecting greater or less, the table shows both greater and less columns, but the q-values listed there do not match the ones I get when I extract separate lists for up- and down-regulated gene sets. Why is this?

Thank you for your help!

gage package deseq2 • 1.6k views

7.4 years ago • sup230