Question

topGO - treating up- and down-regulated genes separately


predeus • 0

@predeus-9207

Last seen 4.0 years ago

United States

I've recently started using topGO in my expression analysis and like it a lot. The "elimination" paradigm is a very useful one when doing pathway analysis, since many different pathways do in fact overlap in the genes they contain.

In the manual, all the examples use p-values as the statistic and never separate the up- and down-regulated genes. Shouldn't the two directions be considered separately?

Correspondingly, would the application of the Kolmogorov-Smirnov statistic and KS-elim need to be modified? In classical GSEA, the statistic carries a plus or minus sign to indicate whether the gene is up- or down-regulated. Since topGO uses only the p-values, I am curious how the two directions can be separated.

Additionally, it would be nice to understand how the p-values are used in the Kolmogorov-Smirnov (like) methods in topGO, and whether it is possible to use another statistic, such as the t-statistic from limma or the Wald statistic from DESeq2.

topGO GSEA pathway analysis • 2.3k views

updated 7.1 years ago by Lluís Revilla Sancho ▴ 730 • written 7.1 years ago by predeus • 0

score 0 · Answer 1 · 2017-03-10


Lluís Revilla Sancho ▴ 730

@lluis-revilla-sancho

Last seen 33 minutes ago

European Union

You can use any statistic you want. The statistic is used to select the genes through geneSelectionFun. (Be aware that geneSelectionFun doesn't work properly.)

However, I can't shed any light on how the p-values are calculated. (You can get a p-value for a GO term that has no significant gene annotated to it. How? I don't know.)

written 7.1 years ago by Lluís Revilla Sancho ▴ 730


Yeah, it's pretty clear how it works for Fisher's exact test and the like: you select the genes you consider significant and calculate the p-value for the 2x2 table.
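For the Fisher-style case, a minimal sketch of running the 2x2 test once per direction might look like this. It is plain Python, not topGO code: the gene names, fold changes, p-values, the `term_genes` annotation and the 0.05 cutoff are all made-up assumptions, and the one-sided Fisher p-value is computed as a hypergeometric upper tail.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are selected out of N, K of which are annotated to the term."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

# Hypothetical toy data: gene -> (log fold change, adjusted p-value).
genes = {
    "g1": (2.1, 0.001), "g2": (1.8, 0.004), "g3": (-2.5, 0.002),
    "g4": (-1.9, 0.010), "g5": (0.1, 0.800), "g6": (-0.2, 0.650),
    "g7": (1.5, 0.020), "g8": (0.3, 0.500), "g9": (-0.1, 0.900),
    "g10": (0.05, 0.950),
}
term_genes = {"g1", "g2", "g5", "g7"}  # hypothetical annotation of one GO term

alpha = 0.05  # assumed significance cutoff
up = {g for g, (lfc, p) in genes.items() if p < alpha and lfc > 0}
down = {g for g, (lfc, p) in genes.items() if p < alpha and lfc < 0}

def enrichment_p(selected):
    """One-sided enrichment p-value of the term among the selected genes."""
    k = len(selected & term_genes)
    return hypergeom_upper_tail(k, len(term_genes), len(selected), len(genes))

p_up = enrichment_p(up)      # term tested against up-regulated genes only
p_down = enrichment_p(down)  # term tested against down-regulated genes only
print(f"up: {p_up:.4f}, down: {p_down:.4f}")
```

The only thing that changes between the two runs is the selection rule, so in a real analysis the same idea would amount to two separate enrichment runs, one per direction, over the same gene universe.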

It's much trickier for KS (or KS-like, as in GSEA): the method adds each gene's statistic (raised to some power) to a running sum, right? With p-values alone there is no way to distinguish up- and down-regulated genes, so they all get lumped together.
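To make the lumping concern concrete, here is a small sketch of an unweighted KS-like running sum. It is a generic illustration and not topGO's or GSEA's actual implementation; the t-statistics and both gene sets are invented. Ranking by absolute value stands in for ranking by p-value, since neither carries a sign.

```python
def running_sum_es(ranked_genes, term_genes):
    """Unweighted KS-like score: the running-sum value with the largest absolute deviation."""
    n_hit = sum(1 for g in ranked_genes if g in term_genes)
    n_miss = len(ranked_genes) - n_hit
    running = 0.0
    best = 0.0
    for g in ranked_genes:
        # Step up on a term gene, step down otherwise; the sum returns to 0 at the end.
        running += (1.0 / n_hit) if g in term_genes else (-1.0 / n_miss)
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical signed t-statistics (positive = up-regulated).
t_stat = {"a": 5.0, "b": 4.0, "c": 0.5, "d": 0.2,
          "e": -0.3, "f": -0.6, "g": -4.5, "h": -5.5}

mixed_term = {"a", "g"}  # one strongly up and one strongly down gene
up_term = {"a", "b"}     # two strongly up genes

# Unsigned ranking (like ranking by p-value): direction is lost.
by_abs = sorted(t_stat, key=lambda g: abs(t_stat[g]), reverse=True)
# Signed ranking: most up-regulated first, most down-regulated last.
by_sign = sorted(t_stat, key=lambda g: t_stat[g], reverse=True)

es_unsigned_mixed = running_sum_es(by_abs, mixed_term)
es_signed_mixed = running_sum_es(by_sign, mixed_term)
es_signed_up = running_sum_es(by_sign, up_term)
print(es_unsigned_mixed, es_signed_mixed, es_signed_up)
```

In this toy case the mixed term scores higher under the unsigned ranking than under the signed one, while the purely up-regulated term reaches the maximum under the signed ranking.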

I think this might be very misleading to a lot of users and produce meaningless results.

The bug you are describing is also interesting. Did you try reporting it?

written 7.1 years ago by predeus • 0


You can distinguish up- and down-regulated genes if you use the t-statistic or the fold change (or log fold change). How the statistic is used in the KS approach, I am not sure.

Yes, I reported the bug several times.
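One possible way to keep a p-value-like score while still encoding direction is to turn each two-sided p-value into a one-sided one using the sign of the statistic. The sketch below assumes a symmetric test statistic (such as a t or Wald statistic) and uses invented numbers; `one_sided_p` is a hypothetical helper, not a function from topGO, limma or DESeq2.

```python
def one_sided_p(two_sided_p, statistic, direction="up"):
    """Convert a two-sided p-value into a one-sided one for the given direction.

    Assumes a symmetric null distribution, so the two-sided p-value is
    twice the tail probability on the side where the statistic falls.
    """
    half = two_sided_p / 2.0
    if direction == "up":
        agrees = statistic > 0
    else:
        agrees = statistic < 0
    return half if agrees else 1.0 - half

# Hypothetical results: gene -> (signed statistic, two-sided p-value).
results = {"g1": (4.2, 0.0004), "g2": (-3.9, 0.0008), "g3": (0.4, 0.70)}

# Smaller score = stronger evidence for regulation in that direction.
score_up = {g: one_sided_p(p, s, "up") for g, (s, p) in results.items()}
score_down = {g: one_sided_p(p, s, "down") for g, (s, p) in results.items()}
print(score_up)
print(score_down)
```

Each of the two score vectors could then be fed to a rank-based enrichment test in its own run, so that a strongly down-regulated gene sits at the bottom of the "up" ranking instead of near the top.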

written 7.1 years ago by Lluís Revilla Sancho ▴ 730