I've recently started using topGO in my expression analysis and like it a lot. The "elimination" paradigm is a very useful one when doing pathway analysis, since many different pathways do in fact overlap in the genes they contain.
In the manual, all the examples use p-values as the statistic, but none of them separates the up- and down-regulated genes. Shouldn't these be considered separately?
Correspondingly, would the way the Kolmogorov-Smirnov statistic and KS-elim are applied need to be modified? In classical GSEA, the statistic carries a plus or minus sign to indicate whether the gene is up- or down-regulated. Since topGO only uses the p-values, I am curious how you can separate the two directions.
Additionally, it would be nice to understand how the p-values are used in the Kolmogorov-Smirnov(-like) methods in topGO, and whether it is possible to use another statistic, such as the t-statistic from limma or the Wald statistic from DESeq2.
Yeah, it's pretty clear how it works for Fisher's test and the like: you just select the genes you consider significant and calculate the p-value for the 2x2 table.
It's much trickier for KS (or KS-like, as in GSEA): the method should add some per-gene statistic (raised to some power) to a running sum, right? So with unsigned p-values there's no way to distinguish up- and down-regulated genes, and you just lump them all together.
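To make the 2x2-table case concrete, here is a minimal sketch of a one-sided over-representation test computed as a hypergeometric tail, which is what a one-sided Fisher's exact test reduces to. The function name and arguments are hypothetical and chosen for illustration; this is not topGO's code, just the textbook calculation.

```python
from math import comb


def fisher_enrichment_p(n_universe, n_significant, n_in_term, n_sig_in_term):
    """One-sided enrichment p-value for a 2x2 table (hypothetical helper).

    n_universe     -- all genes tested
    n_significant  -- genes you called significant
    n_in_term      -- genes annotated to the GO term
    n_sig_in_term  -- genes that are both significant and in the term

    Returns P(X >= n_sig_in_term) where X is hypergeometric.
    """
    max_overlap = min(n_significant, n_in_term)
    total = comb(n_universe, n_significant)
    tail = sum(
        comb(n_in_term, k) * comb(n_universe - n_in_term, n_significant - k)
        for k in range(n_sig_in_term, max_overlap + 1)
    )
    return tail / total


# Toy numbers: 10 genes, 3 significant, 4 in the term, all 3 significant ones in the term.
p_toy = fisher_enrichment_p(10, 3, 4, 3)
```

Because the input is just a list of "significant" genes, separating directions here is easy: run it once with the significant up-regulated genes and once with the significant down-regulated ones.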
I think this might be very misleading to a lot of users and produce meaningless results.
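The concern about lumping directions together can be illustrated with a GSEA-style running sum. The sketch below is an assumption-laden toy, not topGO's KS implementation: the function name, the `power` parameter and the toy t-statistics are all made up for illustration. It walks down a ranked gene list, stepping up by a weighted amount for genes in the set and down by a constant for genes outside it, and reports the largest deviation from zero.

```python
def running_sum_es(ranked_genes, scores, gene_set, power=1.0):
    """GSEA-like enrichment score (hypothetical helper, not topGO's method).

    ranked_genes -- genes in ranked order (best first)
    scores       -- dict gene -> statistic used to weight the hits
    gene_set     -- set of genes annotated to the term
    """
    in_set = [g in gene_set for g in ranked_genes]
    hit_weights = [
        abs(scores[g]) ** power if hit else 0.0
        for g, hit in zip(ranked_genes, in_set)
    ]
    total_hit = sum(hit_weights)
    n_miss = len(ranked_genes) - sum(in_set)
    running, best = 0.0, 0.0
    for weight, hit in zip(hit_weights, in_set):
        running += weight / total_hit if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the signed maximum deviation
    return best


# Toy t-statistics: a, b strongly up; e, f strongly down; c, d unchanged.
t_stat = {"a": 5.0, "b": 4.0, "c": 0.5, "d": -0.5, "e": -4.0, "f": -5.0}

by_sign = sorted(t_stat, key=lambda g: -t_stat[g])           # signed ranking
by_magnitude = sorted(t_stat, key=lambda g: -abs(t_stat[g]))  # like ranking by p-value

es_up = running_sum_es(by_sign, t_stat, {"a", "b"})
es_down = running_sum_es(by_sign, t_stat, {"e", "f"})
es_mixed_signed = running_sum_es(by_sign, t_stat, {"a", "f"})
es_mixed_unsigned = running_sum_es(by_magnitude, t_stat, {"a", "f"})
```

With the signed ranking, the up and down sets get scores of opposite sign and the mixed set `{"a", "f"}` scores weaker. With the magnitude-only ranking, which is what ranking by p-value amounts to, the mixed set scores as strongly as a purely up-regulated set, so direction is lost.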
The bug you are describing is also interesting. Did you try reporting it?
You can distinguish up- and down-regulated genes if you use the t-statistic or the fold change (or log fold change). How the statistic is used in the KS approach, I am not sure.
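One way to act on this while still feeding p-values to a p-value-based method is to build two score vectors from the sign of the t-statistic and run the enrichment twice. The sketch below is only one possible workaround under stated assumptions: the helper name is hypothetical, and setting opposite-direction genes to p = 1.0 (so they stay in the universe but rank last) is an illustrative choice, not something the topGO manual prescribes.

```python
def split_by_direction(results):
    """Split results into up and down score vectors (hypothetical helper).

    results -- dict gene -> (t_statistic, p_value)

    Returns (up, down), each a dict gene -> p-value. Genes regulated in the
    other direction keep their place in the universe but get p = 1.0.
    """
    up, down = {}, {}
    for gene, (t_value, p_value) in results.items():
        up[gene] = p_value if t_value > 0 else 1.0
        down[gene] = p_value if t_value < 0 else 1.0
    return up, down


# Toy input: one up-regulated, one down-regulated, one unchanged gene.
toy_results = {"a": (3.2, 0.001), "b": (-2.5, 0.01), "c": (0.1, 0.9)}
up_scores, down_scores = split_by_direction(toy_results)
```

Each of the two vectors can then be used as the gene score input for a separate enrichment run, one for up-regulation and one for down-regulation.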
Yes, I reported the bug several times.