topGO: significant genes in GO enriched categories
Hi,

I used topGO to perform an enrichment analysis, giving it a list of genes with scores and a function that selects a subset of them for the Fisher test:

topClusterGenes <- function(allScore) { return(allScore > 0.7) }
GOdata <- new("topGOdata", ontology = "BP", annot = annFUN.gene2GO,
              gene2GO = ensemblGene2go, allGenes = GeneList,
              nodeSize = 5, geneSel = topClusterGenes)

I ran my test, and now I would like to retrieve, for each significant GO term found, which of the significant genes (score > 0.7) are annotated to that term. But if I use genesInTerm(), it gives me back all the genes from my GeneList that are in this GO term, not just the ones with score > 0.7 ...

Is there a way to do this automatically in topGO, or should I just filter it myself?

Hi!

I don't know if topGO can do it automatically (I don't think so) but I did it "manually" in a single line of code using genesInTerm() and filtering. You need a vector containing your significant genes.

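# Names of the genes with a score higher than 0.7. The IDs must be the same
# type as the gene IDs used to build GOdata (ENTREZ IDs in my case)!
SignificantGenes <- names(GeneList)[GeneList > 0.7]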

goresults$genes<-sapply(goresults$GO.ID, function(x) {genes<-genesInTerm(GOdata, x); genes[[1]][sapply(genes[[1]],function(id) id %in% SignificantGenes)]})

Hope this helps!

Hi Lidia!

Thanks for providing the answer, it solved the problem I was having right now. If I may tweak your proposal a little: instead of the inner sapply I would go with the vectorized form. I know we're talking microseconds here, but I find it easier to read:

goresults$genes <- sapply(goresults$GO.ID, function(x)
{
genes<-genesInTerm(GOdata, x)
genes[[1]][genes[[1]] %in% SignificantGenes]
})
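
In case it helps anyone landing here, this is a minimal sketch of the same filtering on toy data, so it runs without topGO installed. The gene names, scores and GO IDs are made up for illustration; genes_by_term only mimics the named list that genesInTerm() returns. It also collapses each result into one string, which is easier to store in a data frame column than a list:

```r
# Toy stand-ins (hypothetical data, not from a real topGO run).
# Named score vector, like the GeneList passed as allGenes.
gene_scores <- c(g1 = 0.9, g2 = 0.3, g3 = 0.8, g4 = 0.75, g5 = 0.1)

# Same shape as the genesInTerm() result: a list of gene IDs named by GO term.
genes_by_term <- list(
  "GO:0000001" = c("g1", "g2", "g3"),
  "GO:0000002" = c("g2", "g5"),
  "GO:0000003" = c("g4")
)

# Genes passing the same cutoff as the geneSel function (score > 0.7).
significant_genes <- names(gene_scores)[gene_scores > 0.7]

# Keep, for each term, only the genes that are also significant.
sig_by_term <- lapply(genes_by_term, function(g) g[g %in% significant_genes])

# One comma-separated string per term; terms with no significant gene give "".
sig_column <- vapply(sig_by_term, paste, character(1), collapse = ", ")
print(sig_column)
```

With a real topGOdata object, genes_by_term would come from genesInTerm(GOdata, goresults$GO.ID). As far as I can tell from the topGO manual, sigGenes(GOdata) returns the names of the genes selected by geneSel, which would avoid repeating the 0.7 cutoff by hand, but please check that against your installed version.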

Just had the same problem. Your solution fixed it, thanks!