What is a "Nontrivial node" in topGO analysis?
s.apocarpum ▴ 10
@sapocarpum-8616

There is a code example in the "Performing the test" chapter of the topGO manual:

> test.stat <- new("classicCount", testStatistic = GOFisherTest, name = "Fisher test")
> resultFisher <- getSigGroups(GOdata, test.stat)
> resultFisher

Description: GO analysis of ALL data; B-cell vs T-cell Object modified on: 16 Apr 2015
Ontology: BP
'classic' algorithm with the 'Fisher test' test
5032 GO terms scored: 130 terms with p < 0.01
Annotation data:
    Annotated genes: 3838
    Significant genes: 330
    Min. no. of genes annotated to a GO: 5
    Nontrivial nodes: 3558

I'm just curious: what is a "Nontrivial node"? Should I worry about this value during the topGO analysis? I didn't find any explanation in the manual.

@james-w-macdonald-5106

This is a bit tricky to figure out, as it doesn't seem to be documented anywhere. But we can work it out for ourselves. Note that the output from getSigGroups() is a topGOresult object, and when you type the name of an S4 object at the R prompt, R dispatches the show() method on that object.
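As a minimal sketch of that dispatch, here is a toy S4 class with its own show() method. The class name toyResult and its score slot are made up for illustration and are not part of topGO; the point is only that typing the object's name ends up calling show().

```r
library(methods)

# Hypothetical toy class standing in for topGOresult; NOT part of topGO.
setClass("toyResult", representation(score = "numeric"))

# Define what is printed when an object of this class is auto-printed.
setMethod("show", "toyResult", function(object) {
  cat(length(object@score), "terms scored\n")
})

res <- new("toyResult", score = c(0.001, 0.2, 0.03))

# Typing the name at top level auto-prints, which calls show(res).
res

# Capture the text that show() writes, to confirm it is the same output.
out <- capture.output(show(res))
```

Once you know the class of an object, looking up its show() method tells you exactly which code produced the printed summary.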

So let's see what the show() method is.

> showMethods("show", classes = "topGOresult", includeDefs = TRUE)
Function: show (package methods)
object="topGOresult"
function (object)
.printTopGOresult(x = object)

Anytime you see something like .someFunctionName(), you can almost surely bet that it is an unexported function, so we have to use the triple colon (:::) accessor to see the function body.

> topGO:::.printTopGOresult
function (x)
{
    cat("\nDescription:", description(x), "\n")
    cat("'", algorithm(x), "' algorithm with the '", testName(x),
        "' test\n", sep = "")
    cat(length(score(x)), "GO terms scored:", sum(score(x) <=
        0.01), "terms with p < 0.01\n")
    .printGeneData(geneData(x))
}

Getting closer, but still not there.

> topGO:::.printGeneData
function (x)
{
    cat("Annotation data:\n")
    if ("Annotated" %in% names(x))
        cat("    Annotated genes:", x["Annotated"], "\n")
    if ("Significant" %in% names(x))
        cat("    Significant genes:", x["Significant"], "\n")
    cat("    Min. no. of genes annotated to a GO:", x["NodeSize"],
        "\n")
    if ("SigTerms" %in% names(x))
        cat("    Nontrivial nodes:", x["SigTerms"], "\n")
}

So that tells us something. The number of nontrivial nodes is the 'SigTerms' element of the geneData of the topGOresult object. But what does that mean? Let's search the vignette for 'SigTerms' and see what we come up with:

Basic information on input data can be accessed using the geneData function. The number of annotated genes, the number of significant genes (if it is the case), the minimal size of a GO category as well as the number of GO categories which have at least one significant gene annotated are listed:

> geneData(resultWeight)
  Annotated Significant    NodeSize    SigTerms
       3838         330           5        3558

So there you have it. The nontrivial nodes are those GO categories which have at least one significant gene annotated.
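To make that definition concrete, here is a small base R sketch that counts such categories by hand. The GO term IDs, gene names, and the go2genes and sig_genes objects are all invented for illustration; this is not how topGO computes the value internally, just the same idea on toy data. On a real result, the number should be what geneData(resultFisher)["SigTerms"] reports.

```r
# Hypothetical toy annotation: GO term -> annotated genes (all names made up).
go2genes <- list(
  "GO:A" = c("g1", "g2", "g3"),
  "GO:B" = c("g4", "g5"),
  "GO:C" = c("g2", "g6"),
  "GO:D" = c("g7")
)

# Hypothetical set of significant genes.
sig_genes <- c("g2", "g7")

# Number of significant genes annotated to each term.
n_sig <- vapply(go2genes, function(genes) sum(genes %in% sig_genes), integer(1))

# A term counts as "nontrivial" here if it has at least one significant gene.
nontrivial <- names(n_sig)[n_sig > 0]
length(nontrivial)
```

In this toy data only GO:B has no significant gene, so three of the four terms are nontrivial.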

 


Thank you so much!

It's a really excellent explanation!!!
