Question

What is a "Nontrivial node" in a topGO analysis?

s.apocarpum ▴ 10

There is a code example in the "Performing the test" chapter of the topGO manual:

> test.stat <- new("classicCount", testStatistic = GOFisherTest, name = "Fisher test")
> resultFisher <- getSigGroups(GOdata, test.stat)
> resultFisher

Description: GO analysis of ALL data; B-cell vs T-cell Object modified on: 16 Apr 2015
Ontology: BP
'classic' algorithm with the 'Fisher test' test
5032 GO terms scored: 130 terms with p < 0.01
Annotation data:
    Annotated genes: 3838
    Significant genes: 330
    Min. no. of genes annotated to a GO: 5
    Nontrivial nodes: 3558

I'm just curious: what is a "Nontrivial node"? Should I worry about this value during the topGO analysis? I didn't find any explanation in the manual.

topgo • 2.0k views

updated 8.7 years ago by James W. MacDonald 65k • written 8.7 years ago by s.apocarpum ▴ 10

score 5 · Accepted Answer · 2015-08-24

This is a bit tricky to figure out, as it doesn't seem to be documented anywhere, but we can work it out from the code. Note that the output from getSigGroups() is a topGOresult object, and when you type the name of an object at the R prompt, R dispatches the show() method for that object.

So let's see what the show() method is.

> showMethods(show, class = "topGOresult", includeDefs = TRUE)
Function: show (package methods)
object="topGOresult"
function (object)
.printTopGOresult(x = object)
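
To make the dispatch step concrete, here is a minimal sketch using only the methods package that ships with R. The class toyResult and the helper .printToyResult are hypothetical stand-ins invented for illustration; they are not part of topGO. The sketch shows that typing an object's name ends up calling whatever show() method is registered for its class.

```r
library(methods)

# Hypothetical toy class standing in for topGOresult (not a topGO class)
setClass("toyResult", representation(score = "numeric"))

# Internal-style helper; the leading dot is only a naming convention
.printToyResult <- function(x) {
  cat(length(x@score), "terms scored:",
      sum(x@score <= 0.01), "terms with p <= 0.01\n")
}

# Register a show() method that delegates to the helper
setMethod("show", "toyResult", function(object) .printToyResult(x = object))

res <- new("toyResult", score = c(0.001, 0.2, 0.005, 0.5))

# Typing the object's name autoprints it, which calls show(res)
res

# Capture what show() prints so it can be inspected programmatically
printed <- capture.output(show(res))
```

Here `printed` holds the single line produced by the helper, which is the same text that appears when `res` is typed at the prompt.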

Any time you see a name with a leading dot, like .someFunctionName(), you can be almost certain it is an unexported (internal) function, so we have to use the triple-colon (:::) operator to see the function body.

> topGO:::.printTopGOresult
function (x)
{
    cat("\nDescription:", description(x), "\n")
    cat("'", algorithm(x), "' algorithm with the '", testName(x),
        "' test\n", sep = "")
    cat(length(score(x)), "GO terms scored:", sum(score(x) <=
        0.01), "terms with p < 0.01\n")
    .printGeneData(geneData(x))
}

Getting closer, but still not there.

> topGO:::.printGeneData
function (x)
{
    cat("Annotation data:\n")
    if ("Annotated" %in% names(x))
        cat("    Annotated genes:", x["Annotated"], "\n")
    if ("Significant" %in% names(x))
        cat("    Significant genes:", x["Significant"], "\n")
    cat("    Min. no. of genes annotated to a GO:", x["NodeSize"],
        "\n")
    if ("SigTerms" %in% names(x))
        cat("    Nontrivial nodes:", x["SigTerms"], "\n")
}

So that tells us something. The nontrivial nodes are the 'SigTerms' element of the gene data stored in the topGOresult object (what geneData() returns). But what does that mean? Let's search the vignette for 'SigTerms' and see what we come up with:

Basic information on input data can be accessed using the geneData function. The number of annotated genes, the number of significant genes (if it is the case), the minimal size of a GO category as well as the number of GO categories which have at least one significant gene annotated are listed:

> geneData(resultWeight)
  Annotated Significant    NodeSize    SigTerms
       3838         330           5        3558

So there you have it. The nontrivial nodes are those GO categories which have at least one significant gene annotated.
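
As a rough illustration of that definition, the following sketch counts such nodes on made-up data in base R. The gene IDs, GO labels and the variables go2genes and sig_genes are hypothetical, and the sketch ignores the minimum node size filter; it is not how topGO computes the value internally. On a real result, the number itself is available as geneData(resultFisher)["SigTerms"].

```r
# Hypothetical annotation: each GO term maps to the genes annotated to it
go2genes <- list(
  "GO:A" = c("g1", "g2", "g3"),
  "GO:B" = c("g4", "g5"),
  "GO:C" = c("g2", "g6"),
  "GO:D" = c("g7")
)

# Hypothetical set of significant genes
sig_genes <- c("g2", "g7")

# Number of significant genes annotated to each term
n_sig_per_term <- vapply(
  go2genes,
  function(genes) sum(genes %in% sig_genes),
  integer(1)
)

# A term counts as "nontrivial" if it has at least one significant gene
nontrivial <- names(n_sig_per_term)[n_sig_per_term > 0]
length(nontrivial)
```

In this toy data GO:B has no significant gene, so three of the four terms are nontrivial.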