Question

topGO: significant genes in GO enriched categories

samuel collombet ▴ 140

@samuel-collombet-6574

Last seen 6.9 years ago

France

Hi,

I used topGO to perform an enrichment analysis, giving it a list of genes with scores and a function that selects a subset of them for the Fisher test:

topClusterGenes <- function(allScore){ return(allScore>0.7) }
GOdata <- new("topGOdata", ontology="BP", annot=annFUN.gene2GO,
              gene2GO=ensemblGene2go, allGenes=GeneList,
              nodeSize=5, geneSel=topClusterGenes)

I ran my test, and now I would like to retrieve, for each significant GO term found, which of the significant genes (score > 0.7) are annotated to that term. But if I use genesInTerm(), it returns all the genes from my GeneList that are in the GO term, not just the ones with score > 0.7.

Is there a way to do this automatically in topGO, or should I just filter it myself?

topGO • 3.9k views

ADD COMMENT • link updated 9.1 years ago by lidia.mateo ▴ 30 • written 9.1 years ago by samuel collombet ▴ 140

score 3 · Answer 1 · 2015-04-16

lidia.mateo ▴ 30

@lidiamateo-7243

Last seen 3.7 years ago

Spain

Hi!

I don't know if topGO can do it automatically (I don't think so) but I did it "manually" in a single line of code using genesInTerm() and filtering. You need a vector containing your significant genes.

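# Genes with a score higher than 0.7. The IDs must be of the same type as
# the ones used to build GOdata (Entrez IDs in my case).
SignificantGenes <- names(GeneList)[GeneList > 0.7]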

goresults <- GenTable(...your stuff...)

goresults$genes<-sapply(goresults$GO.ID, function(x) {genes<-genesInTerm(GOdata, x); genes[[1]][sapply(genes[[1]],function(id) id %in% SignificantGenes)]})

Hope this helps!

ADD COMMENT • link 9.1 years ago lidia.mateo ▴ 30

Hi Lidia!

Thanks for the answer, it solved the exact problem I was having. If I may tweak your proposal slightly: instead of the inner sapply I would use the vectorized form. I know we're talking microseconds here, but I find it easier to read:

goresults$genes <- sapply(goresults$GO.ID, function(x) {
  genes <- genesInTerm(GOdata, x)
  genes[[1]][genes[[1]] %in% SignificantGenes]
})
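
For anyone who wants to try the filtering step without building a topGOdata object first, here is a minimal sketch in base R. The gene IDs, GO IDs and scores are made up, and the names genes_by_term, gene_scores, significant_genes and sig_column are just placeholders for this illustration. The list stands in for what genesInTerm() returns, and the score vector stands in for GeneList.

```r
# Toy stand-in for the output of genesInTerm(): a named list with one
# character vector of gene IDs per GO term (all IDs here are invented).
genes_by_term <- list(
  "GO:0000001" = c("geneA", "geneB", "geneC"),
  "GO:0000002" = c("geneC", "geneD"),
  "GO:0000003" = c("geneE")
)

# Toy stand-in for the named score vector passed as allGenes (GeneList).
gene_scores <- c(geneA = 0.9, geneB = 0.2, geneC = 0.8, geneD = 0.75, geneE = 0.1)

# Same cutoff as the geneSel function in the question.
significant_genes <- names(gene_scores)[gene_scores > 0.7]

# Keep only the significant genes within each term (vectorized %in%).
sig_by_term <- lapply(genes_by_term, function(g) g[g %in% significant_genes])

# Collapse to one string per term so the result fits in a data frame column.
# Terms without any significant gene end up as an empty string.
sig_column <- vapply(sig_by_term, paste, character(1), collapse = ", ")
print(sig_column)
```

With a real object, genes_by_term would presumably come from genesInTerm(GOdata, goresults$GO.ID). As far as I know, sigGenes(GOdata) also returns the genes picked by geneSel, which could replace the manual cutoff, but I have not checked that against this exact setup.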