topGO weight01 method: potential bug in the code
Vladimir • 0
I'm trying to understand the details of the "weight01" method in topGO, and I'm puzzled by one part of the code. I would appreciate any comments.

Inside the .sigGroups.weight01 function (topGOalgo.R), there is the following code block:

    for(child in termChildren) {
      w[child] <- .sigRatio.01(a = childSig, b = termSig)

    ## if w[child] > 1 than that child is more significant
    sig.termChildren <- names(w[w > 1])

    ## CASE 1:  if we don't have significant children

    ## CASE 2:   'child' is more significant that 'u'

At the same time, the function .sigRatio.01 (topGOfunctions.R) always returns a value greater than 1:

    .sigRatio.01 <- function(a, b, tolerance = 1e-50) {

      ## if a and b are almost equal we return 2
      if(identical(all.equal(a, b, tolerance = tolerance), TRUE))

      if(a < b)


As I understand it, this has the following effects:

  • All child nodes (terms) are always treated as more significant than the parent, irrespective of the actual p-values, so CASE 1 can never occur.
  • Thus (CASE 2), the genes associated with all child nodes are always removed from the node (term) being analyzed. As a result, the gene propagation performed under the true path rule is undone. In addition, a gene annotated directly to a term is also removed whenever it happens to be annotated to one of the term's (grand)children.
  • Fisher's test (the default) is then applied to this "cleaned-up" set of genes.
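
To make the first point concrete, here is a minimal sketch of the selection step. It is not the topGO code: `ratio_stub` is a hypothetical stand-in that simply assumes the behaviour described above (a value greater than 1 in every branch), and all p-values are made up.

```r
## Hypothetical stand-in for the ratio function. ASSUMPTION: every branch
## returns a value > 1, as described above. This is NOT the topGO source.
ratio_stub <- function(a, b, tolerance = 1e-50) {
  if (isTRUE(all.equal(a, b, tolerance = tolerance)))
    return(2)
  if (a < b)
    return(2)
  2
}

## Made-up p-values: one child better than the parent, one much worse, one equal.
term_sig  <- 1e-6
child_sig <- c(childA = 1e-9, childB = 0.8, childC = 1e-6)

## Same selection pattern as in the excerpt: keep children with w > 1.
w <- vapply(child_sig, ratio_stub, numeric(1), b = term_sig)
sig_children <- names(w[w > 1])
print(sig_children)
```

Under this assumption every child is selected, including childB, whose p-value is far worse than the parent's, so the "no significant children" branch is never reached.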

In other words, in a "perfect-world" case where no gene is assigned simultaneously to a GO term and to one of its (grand)parent terms, the analysis can be described equivalently as follows:

  • gene propagation based on the true path rule is not performed;
  • classical graph-independent enrichment analysis is applied.
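
A toy illustration of this equivalence, under my interpretation only: the gene names, annotations and "interesting" genes below are all invented, and the code uses only base R rather than any topGO function.

```r
## Toy universe and made-up list of "interesting" genes.
universe  <- paste0("g", 1:20)
sig_genes <- c("g1", "g2", "g3", "g10")

## Made-up direct annotations; no gene is shared between parent and child.
direct <- list(parent = c("g1", "g2"), child = c("g3", "g4", "g5"))

## Propagation: the parent inherits the genes of its child.
propagated_parent <- union(direct$parent, direct$child)

## If the child is always treated as more significant, its genes are removed again.
cleaned_parent <- setdiff(propagated_parent, direct$child)

## Classical Fisher's exact test on the cleaned-up gene set.
in_term <- factor(universe %in% cleaned_parent, levels = c(TRUE, FALSE))
is_sig  <- factor(universe %in% sig_genes, levels = c(TRUE, FALSE))
p_value <- fisher.test(table(in_term, is_sig), alternative = "greater")$p.value
```

Here `cleaned_parent` contains exactly the genes annotated directly to the parent, so the test is the same one that would be run without any propagation.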

Could I please ask:

  • whether this interpretation is correct, or whether the code actually does something different;
  • and, if the interpretation is correct, whether this is a bug or a feature?

Thank you very much!


