hypergeometric enrichment of GO terms
Axel Rasche ▴ 30
@axel-rasche-1275
Hello,

For a set of marker genes drawn from a longer list of evaluated genes, I would like to test for enrichment of GO terms using the hypergeometric distribution. The tree structure of GO makes this a bit intricate. Can anyone point me to useful functions?

So far I have lists of Ensembl gene identifiers, and the GO annotation can be obtained with the package biomaRt.

In the package GOstats I found the function GOhyperG, which uses the hypergeometric distribution to compare a set of differentially expressed genes against all the genes on an Affymetrix chip. Unfortunately, the package GOstats is restricted to microarray datasets.

Many thanks in advance for your answer,
Axel Rasche

--
Dipl. Math. ETH Axel Rasche
Max-Planck-Institute for Molecular Genetics
Department Lehrach (Vertebrate Genomics)
Ihnestrasse 63-73
D-14195 Berlin-Dahlem
GERMANY
Tel. ++49-30-8413-1289
Fax ++49-30-8413-1128
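For reference, the hypergeometric test asked about here can be written down directly. The sketch below is in Python rather than R, purely so that it is self-contained. The function name `enrichment_p_value` and the toy numbers are illustrative assumptions and are not taken from GOstats or any other package mentioned in this thread. For a universe of N evaluated genes of which K carry a given GO term, and a marker set of n genes of which x carry the term, the enrichment p-value is the upper tail P(X >= x).

```python
from math import comb


def enrichment_p_value(universe_size, annotated_in_universe,
                       selected_size, annotated_in_selected):
    """Upper-tail hypergeometric probability P(X >= x).

    universe_size         N: all evaluated genes
    annotated_in_universe K: evaluated genes carrying the GO term
    selected_size         n: marker genes
    annotated_in_selected x: marker genes carrying the GO term
    """
    N, K = universe_size, annotated_in_universe
    n, x = selected_size, annotated_in_selected
    if not (0 <= K <= N and 0 <= n <= N and x >= 0):
        raise ValueError("counts are inconsistent with the universe size")
    # X cannot exceed either the number of annotated genes or the sample size.
    upper = min(K, n)
    # Count the samples of size n containing i annotated genes, for i = x..upper.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(x, upper + 1))
    return tail / comb(N, n)


# Toy numbers (hypothetical): 20 evaluated genes, 5 annotated to the term,
# 4 marker genes, 2 of which are annotated.
p = enrichment_p_value(20, 5, 4, 2)
print(round(p, 4))  # 1205 / 4845, about 0.2487
```

In R the same quantity is `phyper(x - 1, K, N - K, n, lower.tail = FALSE)`. One p-value is computed per GO term, so some multiple-testing adjustment is normally applied afterwards.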
Microarray GO GOstats biomaRt
@wolfgang-huber-3550
EMBL European Molecular Biology Laborat…
Hi Axel,

I'd recommend using "gene set enrichment" analysis, as implemented for example in Robert Gentleman's Category package. Chapter 8 of the vignette of the cellHTS package

http://www.bioconductor.org/packages/1.9/bioc/vignettes/cellHTS/inst/doc/cellhts.pdf

gives an example of how to do this on a scored set of genes whose GO annotation is obtained via biomaRt (the full code of the vignette is provided in the inst/doc directory of that package).

Best wishes
Wolfgang

PS: GO is not really a tree, it is a directed graph.
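The point that GO is a directed graph matters in practice: a term can have more than one parent, and a gene annotated to a term is usually also counted for every ancestor of that term. A minimal sketch of that propagation step follows, in Python and with made-up term names (`root`, `A`, `B`, `C`) and gene names. It assumes the annotation table lists only the most specific terms per gene, which may or may not be true of a given biomaRt query.

```python
def ancestors(term, parents):
    """Return every ancestor of `term`, given a child -> set-of-parents map."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def propagate_annotations(gene_to_terms, parents):
    """Build term -> set-of-genes, counting each gene for all ancestor terms."""
    term_to_genes = {}
    for gene, terms in gene_to_terms.items():
        closed = set(terms)
        for term in terms:
            closed |= ancestors(term, parents)
        for term in closed:
            term_to_genes.setdefault(term, set()).add(gene)
    return term_to_genes


# Toy graph (hypothetical): C has two parents, so this is not a tree.
parents = {"A": {"root"}, "B": {"root"}, "C": {"A", "B"}}
gene_to_terms = {"g1": {"C"}, "g2": {"A"}}

term_to_genes = propagate_annotations(gene_to_terms, parents)
for term in sorted(term_to_genes):
    print(term, sorted(term_to_genes[term]))
```

Here `g1` is counted once for `root` even though two paths lead there from `C`, which is why sets are used rather than per-path counts.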
Hi again,

I should have added that the example I mentioned (Chapter 8 of the cellHTS vignette) does not tackle the problem of selecting the 'most meaningful' category out of a set of nested categories that are all enriched. That is a hard problem, and I am not sure it has an objective solution. I have been doing it manually, and for that I recommend looking at a plot showing the GO graph together with the enrichment p-values [like, for example, in Fig. 4 of the vignette of the GOstats package "Using GO for Statistical Analyses".]

Best wishes
Wolfgang
Axel,

Here is an R/BioC script that performs the phyper test on your own data sets:

http://faculty.ucr.edu/~tgirke/Documents/R_BioCond/R_BioCondManual.html#GOHyperGAll

The last function, "simplifyDF" (4.2), is an attempt to address the problem with nested categories. As Wolfgang pointed out, there doesn't appear to be an efficient solution to this problem.

Thomas

--
Thomas Girke, Ph.D.
1008 Noel T. Keen Hall
Center for Plant Cell Biology (CEPCEB)
University of California
Riverside, CA 92521
E-mail: thomas.girke at ucr.edu
Website: http://faculty.ucr.edu/~tgirke
Ph: 951-827-2469
Fax: 951-827-4437
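To make the nested-category problem concrete, here is one simple heuristic, sketched in Python: among the significant terms, keep only those that have no significant descendant. This is an illustrative assumption and is not a description of how simplifyDF works. The term names, p-values, and the cutoff `alpha` are all made up.

```python
def ancestors(term, parents):
    """Return every ancestor of `term`, given a child -> set-of-parents map."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def most_specific_significant(p_values, parents, alpha=0.05):
    """Keep significant terms that are not ancestors of another significant term."""
    significant = {term for term, p in p_values.items() if p <= alpha}
    shadowed = set()
    for term in significant:
        # Every ancestor of a significant term is treated as redundant.
        shadowed |= ancestors(term, parents)
    return significant - shadowed


# Toy graph and p-values (hypothetical).
parents = {"A": {"root"}, "B": {"root"}, "C": {"A", "B"}}
p_values = {"root": 0.01, "A": 0.001, "B": 0.2, "C": 0.0005}

kept = most_specific_significant(p_values, parents)
print(sorted(kept))  # ['C']
```

The weakness is visible even in the toy case: `A` is dropped although it might be enriched for reasons unrelated to `C`. That is the kind of judgment the graph plot with p-values is meant to support.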