**Short answer: FELLA's p.scores from diffusion/pagerank are not corrected for multiple testing; they are used as a statistically adjusted prioritiser, not as p-values.**

Long answer:

In general, `hypergeom` uses classical hypothesis testing (its results are therefore adjusted and called significant), whereas `diffusion` and `pagerank` are used as prioritisers. In the latter two, scores are computed (the lower, the better) and sorted in ascending order. A threshold is then applied to extract the top entities, which are returned as a sub-network that typically contains large connected components.
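To make the prioritiser idea concrete, here is a minimal Python sketch of "sort ascending, apply a threshold, keep the top entities". It is not FELLA's code: the function `top_nodes`, its parameter names, the strict `<` comparison and the node ids are all assumptions made for illustration.

```python
def top_nodes(p_scores, threshold=0.05, nlimit=250):
    """Rank nodes by p-score (lower is better) and keep the best ones.

    Hypothetical helper: `threshold` and `nlimit` are illustrative
    parameters, not a reproduction of FELLA's implementation.
    """
    # Sort ascending, so the most relevant nodes come first.
    ranked = sorted(p_scores.items(), key=lambda kv: kv[1])
    # Keep only nodes under the cutoff, then cap the size of the result.
    selected = [node for node, score in ranked if score < threshold]
    return selected[:nlimit]


# Made-up p-scores for five nodes of different categories.
scores = {"cpd:A": 0.001, "rn:B": 0.20, "path:C": 0.03, "md:D": 0.049, "ec:E": 0.70}
print(top_nodes(scores))  # ['cpd:A', 'path:C', 'md:D']
```

Note that nothing here is declared "significant": the cutoff only decides how far down the ranking the sub-network extends.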

For simplicity, both approximations, `normality` and `simulation`, yield p-scores between 0 and 1 that go in the same direction (lower is better). Specifically, `normality` applies the cumulative distribution function of a Gaussian distribution to the exact z-scores, capping at 1e-6, while `simulation` gives an empirical p-value by definition. Because both approaches yield values in `[0, 1]`, in practice a default p-score cutoff will most likely work equally well for either. Besides, by default, at most 250 nodes are reported (even if more nodes have a p-score lower than 0.05), since manual examination of larger networks becomes cumbersome.
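The two approximations can be sketched side by side in Python. This is an illustration under stated assumptions, not FELLA's code: I assume that a higher raw score is better (so the upper tail is used), that the 1e-6 cap acts as a lower bound on the p-score, and that the empirical estimate uses the common `(hits + 1) / (n + 1)` form. The function names are hypothetical.

```python
from statistics import NormalDist


def p_score_normality(observed, null_mean, null_sd, floor=1e-6):
    """Gaussian approximation: upper-tail probability of the z-score.

    Assumptions: higher raw scores are better, and `floor` is a lower
    bound on the returned p-score.
    """
    z = (observed - null_mean) / null_sd
    return max(1.0 - NormalDist().cdf(z), floor)


def p_score_simulation(observed, null_scores):
    """Empirical p-score: share of null scores at least as large as observed.

    Assumption: the +1 correction, which keeps the estimate above zero.
    """
    hits = sum(1 for s in null_scores if s >= observed)
    return (hits + 1) / (len(null_scores) + 1)


# Both results lie in [0, 1] and read the same way: lower is better.
print(p_score_normality(2.0, null_mean=0.0, null_sd=1.0))
print(p_score_simulation(5.0, [1.0, 2.0, 3.0]))
```

Because both functions return values on the same scale, a single cutoff can be applied downstream regardless of which approximation produced the scores.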

Can those p-scores be used as p-values? Yes, especially those obtained by `simulation` (since the null distribution does not need to be Gaussian), but doing so would move away from the current prioritiser paradigm into a hypothesis-testing one. We experimented with that and often reached trivial conclusions, such as only the input nodes being significant. If you want to pursue that idea further, you could try a correction such as FDR, stratified by node category. This would prevent smaller categories (pathways, modules) from being overly penalised by larger ones (compounds, reactions). We did not fully explore this path.
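If you want to try the stratified correction, a minimal Python sketch could look like the following. It applies the Benjamini-Hochberg procedure separately within each node category; this is one possible reading of "FDR stratified by category", not something FELLA provides, and the function names, node ids and p-scores are made up.

```python
from collections import defaultdict


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def stratified_fdr(p_scores, categories):
    """Apply BH within each category (hypothetical helper)."""
    groups = defaultdict(list)
    for node in p_scores:
        groups[categories[node]].append(node)
    adjusted = {}
    for nodes in groups.values():
        adj = benjamini_hochberg([p_scores[n] for n in nodes])
        adjusted.update(zip(nodes, adj))
    return adjusted


# Made-up example: one pathway among several compounds.
p_scores = {"path:1": 0.04, "cpd:1": 0.01, "cpd:2": 0.5, "cpd:3": 0.9}
categories = {"path:1": "pathway", "cpd:1": "compound",
              "cpd:2": "compound", "cpd:3": "compound"}

stratified = stratified_fdr(p_scores, categories)
pooled = dict(zip(p_scores, benjamini_hochberg(list(p_scores.values()))))
print(stratified["path:1"], pooled["path:1"])
```

In this toy example the pathway is penalised less when corrected within its own category than when pooled with the compounds, which is the effect described above.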

As a side note, FELLA also filters out connected components that are small enough that they could have come from a random selection of nodes. If an input list is noisy (meaning the compounds are not really proximal in the network), one would expect smaller connected components, which are often filtered out.
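The component filter can be illustrated with a small Python sketch. It is only my reading of the idea, not FELLA's implementation: the null model (draw the same number of nodes uniformly at random and record the largest component), the `(hits + 1) / (n + 1)` estimate, the function names and the toy ring graph are all assumptions.

```python
import random


def components(nodes, adjacency):
    """Connected components of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    seen, result = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:  # depth-first search restricted to `nodes`
            v = stack.pop()
            comp.add(v)
            for w in adjacency[v]:
                if w in nodes and w not in seen:
                    seen.add(w)
                    stack.append(w)
        result.append(comp)
    return result


def filter_components(selected, adjacency, n_draws=200, alpha=0.05, seed=0):
    """Keep components larger than expected from random node selections."""
    rng = random.Random(seed)
    universe = sorted(adjacency)
    k = len(set(selected))
    # Null distribution: largest component among k randomly drawn nodes.
    null_max = [
        max(len(c) for c in components(rng.sample(universe, k), adjacency))
        for _ in range(n_draws)
    ]
    kept = []
    for comp in components(selected, adjacency):
        hits = sum(1 for m in null_max if m >= len(comp))
        if (hits + 1) / (n_draws + 1) < alpha:
            kept.append(comp)
    return kept


# Toy network: a ring of 100 nodes; five adjacent nodes plus one stray node.
ring = {i: [(i - 1) % 100, (i + 1) % 100] for i in range(100)}
kept = filter_components([0, 1, 2, 3, 4, 50], ring)
print(kept)
```

The chain of five adjacent nodes survives because random draws almost never produce a component that large, while the isolated node is discarded, which mirrors what happens to the fragments produced by a noisy input list.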