Question: topGO - Extra terms appearing in main analysis and the enrichment test
8 days ago by beachboarder900 wrote:


I am using a fairly straightforward analysis:

GOdata <- new("topGOdata",
              description = "Simple session",
              ontology = "BP",   
              allGenes = top.vec,
              nodeSize = 5,      
              annot = annFUN.GO2genes,
              GO2genes = gos.l )

And enrichment test:

weight01_glm <- GenTable(GOdata, elim = weight01_glm, orderBy = "weight01")

Which gives results:

weight01_glm[1:2,]
       GO.ID                                    Term Annotated Significant Expected    elim
1 GO:0060048              cardiac muscle contraction        51          21    10.29 0.00022
2 GO:1902105 regulation of leukocyte differentiation         6           5     1.21 0.00127

But the Annotated and Significant counts reported by GenTable() do not match the proteins I supplied to topGO from our data set for these GOBP terms.

 [1] gene40068:106578961 gene7960:100137048  gene21720:106561017 gene21719:100136569 gene21981:100136562 gene47679:100194662 gene43438:106582275 gene3031:106571564 
 [9] gene27542:106566457 gene47690:106586456 gene43438:106582275 gene43438:106582275 gene47679:100194662 gene6676:106600326  gene40784:106579610 gene9629:106603084 
[17] gene24401:106563560 gene30311:106569202 gene30437:100136504 gene7960:100137048  gene51544:106589989 gene46526:106585259 gene15073:106608253 gene36911:106575818
[25] gene36911:106575818 gene48851:106587620 gene47690:106586456 gene43438:106582275 gene40351:106579178 gene47690:106586456 gene47690:106586456 gene40784:106579610
[33] gene34928:106573547 gene22452:106561611 gene43874:100194559 gene28855:106568026 gene729:106601553   gene476:106587081   gene41441:106580341 gene31944:106570802
[41] gene16298:106609429 gene15876:100194596 gene21719:100136569 gene45809:106584521 gene51844:106590268 gene26252:100196342 gene39081:106577953 gene31333:106570192
[49] gene33361:106572105

countGenesInTerm(GOdata, 'GO:0060048')

The same for the other significant BP term:

countGenesInTerm(GOdata, 'GO:1902105')

My question is: why are GOBP terms that are not included in my term universe or in my custom annotation (gos.l) showing up in the topGO analysis? Is the topGO package assigning BP terms 'up' or 'down', that is, terms that are parents or children of the two terms that appear in my enrichment analysis (weight01_glm)?

Any help or guidance would be greatly appreciated!


I am not sure where these terms come from. You could find the genes present in GOdata for that GO term that weren't originally in your gos.l object, and then work out which terms they belong to. But how are you calculating the weight01_glm object passed to GenTable?
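
The comparison suggested above could be sketched roughly as follows. This is only a minimal illustration in base R on toy data: the IDs "GO:A", "g1" and so on are hypothetical, and the commented lines assume that the GOdata and gos.l objects from the question are available and that genesInTerm() from topGO is used to pull the genes topGO has attached to a term.

```r
# Toy stand-ins with hypothetical IDs. In a real session these would be, e.g.:
#   term        <- "GO:0060048"
#   topgo_genes <- genesInTerm(GOdata, term)[[1]]  # genes topGO counts for the term
#   custom_map  <- gos.l                           # custom GO-to-genes list
term <- "GO:A"
custom_map <- list("GO:A" = c("g1", "g2", "g2"),
                   "GO:B" = c("g3"),
                   "GO:C" = c("g4"))
topgo_genes <- c("g1", "g2", "g3")

# Genes counted for the term that the custom mapping does not list under it
extra <- setdiff(topgo_genes, unique(custom_map[[term]]))

# Other terms in the custom mapping that do list those extra genes
has_extra <- vapply(custom_map, function(g) any(g %in% extra), logical(1))
source_terms <- setdiff(names(custom_map)[has_extra], term)

print(extra)
print(source_terms)
```

If the terms found this way turn out to be child terms of the one being tested, that would point to annotations being carried up the GO hierarchy, but that is something to check against your own data rather than a conclusion reached in this thread.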

As a side note, the topGO version in Bioconductor has many bugs. I have tried to correct them in a repo; if you open an issue there with reproducible data, I can add tests and try to fix this bug.

written 8 days ago by Lluís R