extracting all GO terms from GOHyperGResult object?
Jenny Drnevich ★ 2.0k
Hi all,

I've been successfully using the GOstats package for a while now to test for over-representation of GO terms. I'd also like to use it as a quick way to output all the GO terms that get tested. However, I can't get the GOHyperGResult object to output all the GO terms that it says it tested; it will only output those that are below the pvalueCutoff specified. Even when I raise the pvalueCutoff to 1 (the maximum allowed value), I still can't get all the terms. Here's a reproducible example:

    > library(ALL)
    Loading required package: Biobase

    Welcome to Bioconductor

      Vignettes contain introductory material. To view, type
      'openVignette()'. To cite Bioconductor, see
      'citation("Biobase")' and for packages 'citation(pkgname)'.

    > library(GOstats)
    Loading required package: Category
    Loading required package: AnnotationDbi
    Loading required package: graph
    Loading required package: DBI
    > library(hgu95av2.db)
    Loading required package: org.Hs.eg.db
    >
    > data(ALL, package = "ALL")
    >
    > sel.IDs <- unique(unlist(mget(featureNames(ALL)[1:20],hgu95av2ENTREZID)))
    > uni.IDs <- unique(unlist(mget(featureNames(ALL),hgu95av2ENTREZID)))
    >
    > params <- new("GOHyperGParams", geneIds=sel.IDs, universeGeneIds=uni.IDs,
    + annotation="hgu95av2.db",ontology="BP",pvalueCutoff=1, conditional=T,
    + testDirection="over")
    >
    > hgOver <- hyperGTest(params)
    > hgOver
    Gene to GO BP Conditional test for over-representation
    356 GO BP ids tested (184 have p < 1)
    Selected gene set size: 18
    Gene universe size: 7685
    Annotation package: hgu95av2
    >
    > dim(summary(hgOver))
    [1] 184 7

As you can see, the hgOver object says that it tested 356 GO BP ids, but only 184 have p < 1, so summary(hgOver) only has 184 rows. Is there an easy way to get all 356 GO terms out, along with their ExpCount, Count, Size, etc.?

Thanks,
Jenny

    > sessionInfo()
    R version 2.10.1 (2009-12-14)
    i386-pc-mingw32

    locale:
    [1] LC_COLLATE=English_United States.1252
    [2] LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] GO.db_2.3.5 hgu95av2.db_2.3.5 org.Hs.eg.db_2.3.6
    [4] GOstats_2.12.0 RSQLite_0.8-0 DBI_0.2-5
    [7] graph_1.24.1 Category_2.12.0 AnnotationDbi_1.8.1
    [10] ALL_1.4.7 Biobase_2.6.1

    loaded via a namespace (and not attached):
    [1] annotate_1.24.0 genefilter_1.28.2 GSEABase_1.8.0 RBGL_1.22.0
    [5] splines_2.10.1 survival_2.35-7 tools_2.10.1 XML_2.6-0
    [9] xtable_1.5-6

Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801 USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at illinois.edu
Annotation GO hgu95av2 GOstats
James F. Reid ▴ 610
Dear Jenny,

It looks like the pvalue cutoff in the summary of a hyperGTest result is applied as a strict inequality, so the 184 terms with p < 1 are reported while the rest, whose p-values are exactly 1, are dropped. A quick, brute-force way to overcome this is to call summary with a pvalue greater than one:

    > dim(summary(hgOver, pvalue=1.1))
    [1] 356 7

A more elegant way is to inspect the content of the result object:

    > slotNames(hgOver)
    [1] "goDag" "pvalue.order" "conditional" "annotation"
    [5] "geneIds" "testName" "pvalueCutoff" "testDirection"

The goDag slot holds a graph whose nodes are the tested GO ids, so:

    > length(nodes(goDag(hgOver)))
    [1] 356
    > nodes(goDag(hgOver))[1:3]
      GO:0000002   GO:0000018   GO:0000279
    "GO:0000002" "GO:0000018" "GO:0000279"

HTH.
J.

(In reply to Jenny Drnevich's message of 21/01/2010 16:33.)
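If you would rather not rely on a cutoff above 1, the per-term statistics can also be pulled out with the result accessors and assembled by hand. The following is only a sketch under some assumptions: the helper name all_tested_terms is made up for illustration, the accessors (pvalues, oddsRatios, expectedCounts, geneCounts, universeCounts) are assumed to return vectors named by GO id for a GOHyperGResult, and the column names simply mirror those printed by summary() (the Term column is left out).

```r
# Sketch only: `all_tested_terms` is a hypothetical helper, not part of GOstats.
# It needs library(GOstats) attached, as in the session from the question.
all_tested_terms <- function(res) {
  p   <- pvalues(res)   # assumed: one p-value per tested GO id, named by id
  ids <- names(p)
  # Index every other statistic by GO id so the rows stay aligned.
  data.frame(
    GOID      = ids,
    Pvalue    = as.numeric(p),
    OddsRatio = as.numeric(oddsRatios(res)[ids]),
    ExpCount  = as.numeric(expectedCounts(res)[ids]),
    Count     = as.integer(geneCounts(res)[ids]),
    Size      = as.integer(universeCounts(res)[ids]),
    stringsAsFactors = FALSE
  )
}

# Usage with the result object from the question, if it is in the workspace.
if (exists("hgOver")) {
  tab <- all_tested_terms(hgOver)
  print(dim(tab))   # row count should match length(nodes(goDag(hgOver)))
}
```

Because no cutoff is applied here, terms whose p-value is exactly 1 are kept alongside the rest.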