HyperGTest Gene Universe problem
Vivek Kaimal ▴ 20
@vivek-kaimal-2087
Hi Seth.

I am using the hyperGTest function for some gene sets I have, and I'm having some problems with the gene universe and gene set being used in the analysis. My original gene universe contains 18382 genes and one of my gene sets contains 597 genes.

> length(GeneUniverse)
[1] 18382
> length(GeneList)
[1] 597

Then I run the following to test for over-representation:

> hgCutoff <- 0.05
> params <- new("GOHyperGParams", geneIds = GeneList,
+               universeGeneIds = GeneUniverse, annotation = "hgu133plus2",
+               ontology = "BP", pvalueCutoff = hgCutoff,
+               conditional = FALSE, testDirection = "over")
> hgOver <- hyperGTest(params)

But when I check the details for "hgOver", the numbers of genes used for the gene universe and the gene set are much lower than in my original sets. The summary is given below:

> hgOver
Gene to GO BP test for over-representation
1101 GO BP ids tested (160 have p < 0.05)
Selected gene set size: 427
Gene universe size: 11292
Annotation package: hgu133plus2

Is it because some of my Entrez IDs are not being found in the annotation package? Do I need to use another annotation package?

Thanks in advance.
Vivek
Seth Falcon ★ 7.4k
@seth-falcon-992
Hi Vivek,

"Vivek Kaimal" <vivek.kaimal at cchmc.org> writes:
> [...]
> Is it because some of my Entrez IDs are not being found in the
> annotation package? Do I need to use another annotation package?

Unfortunately, the documentation is a bit too spread out to be as useful as I would like. If you read the documentation for hyperGTest in the Category package (sorry, not in GOstats), you will see:

    Both the selected genes and the universe are reduced by removing
    identifiers that do not have any annotations in the specified
    category.

So in your case, there are gene IDs in both the selected set and the universe that have no GO BP annotation, and they have been removed. That is why the reported sizes (427 and 11292) are smaller than the sets you supplied (597 and 18382).

We made this choice because inflating the gene universe with IDs that cannot appear in any of the categories will, in general, result in more impressive, but less meaningful, p-values for the over-represented terms.
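To make the effect of universe size concrete, here is a minimal base R sketch. It does not call hyperGTest. It uses the standard one-sided hypergeometric tail from phyper, which is assumed here to be a reasonable stand-in for the over-representation test. The set sizes are the ones from this thread; the GO term (100 annotated genes, 10 of them selected) and the helper name over_rep_p are hypothetical and chosen only for illustration.

```r
# Toy illustration in base R; no Bioconductor packages are needed.

# Sizes reported in the thread.
universe_full    <- 18382  # length(GeneUniverse)
selected_full    <- 597    # length(GeneList)
universe_reduced <- 11292  # "Gene universe size" reported by hyperGTest
selected_reduced <- 427    # "Selected gene set size" reported by hyperGTest

# Hypothetical GO BP term (assumed numbers, not from the thread):
# 100 annotated genes in the universe, 10 of them in the selected set.
# Genes without any GO BP annotation cannot belong to the term, so these
# two counts are the same whichever universe is used.
term_size <- 100
overlap   <- 10

# Upper-tail hypergeometric probability P(X >= overlap).
over_rep_p <- function(overlap, term_size, universe, selected) {
  phyper(overlap - 1, term_size, universe - term_size, selected,
         lower.tail = FALSE)
}

p_reduced  <- over_rep_p(overlap, term_size, universe_reduced, selected_reduced)
p_inflated <- over_rep_p(overlap, term_size, universe_full, selected_full)

print(c(reduced = p_reduced, inflated = p_inflated))
```

With these assumed counts, the p-value computed against the full 18382-gene universe comes out smaller than the one computed against the reduced 11292-gene universe, even though the overlap with the term is identical. The extra significance comes only from IDs that could never have been annotated to the term.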
+ seth

--
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org