Question: Query about the conditional hypergeometric test in GOstats package
11.6 years ago by
Chanchal Kumar wrote:
Dear Bioconductor developers and users,

I am using the "GOstats" package to find over- and under-represented GO terms in my dataset, and it offers an option to run a "conditional hypergeometric test". I am familiar with the conventional hypergeometric test, but I am not sure what the conditional version implies. It would be very helpful if someone could explain the concept and point me to relevant references.

Thanks in advance!

Best Regards,
Chanchal

===============================
Chanchal Kumar, Ph.D. Candidate
Dept. of Proteomics and Signal Transduction
Max Planck Institute of Biochemistry
Am Klopferspitz 18
82152 D-Martinsried (near Munich)
Germany
e-mail: chanchal at biochem.mpg.de
Phone: (Office) +49 (0) 89 8578 2296
Fax:(Office) +49 (0) 89 8578 2219
http://www.biochem.mpg.de/mann/
===============================
proteomics • 555 views
Answer: Query about the conditional hypergeometric test in GOstats package
11.6 years ago by
United States
James W. MacDonald wrote:
Hi Chanchal,

The GO ontology is structured as a directed acyclic graph in which the genes annotated to a parent term include all the genes annotated to its child terms. If you do a standard hypergeometric test, you might, e.g., find 'positive regulation of kinase activity' to be significant.

If you then test 'positive regulation of catalytic activity', which is a parent term, it might be significant as well, but only because of the genes inherited from 'positive regulation of kinase activity'.

The conditional hypergeometric test takes this into account: when testing a parent term, it leaves out the genes annotated to child terms that were already found significant, so the parent is only called significant if there is evidence beyond what its children provide.

For a reference, see the paper by Adrian Alexa. You can find the citation on the last page of the 'How to use GOstats' vignette.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
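To make the parent/child effect concrete, here is a minimal Python sketch on made-up data. It is not GOstats code and does not claim to mirror how GOstats implements the test internally. It assumes a toy universe of 100 gene IDs, a child term with 10 genes, and a parent term with 30 genes that include the child's. All names (`hypergeom_upper_tail`, `enrichment_p`, the gene sets) are hypothetical, and leaving the universe and the selected gene list unchanged in the conditional step is a simplifying assumption.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which are annotated to the term being tested."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total


def enrichment_p(term_genes, selected, universe):
    """Over-representation p-value of one term for a selected gene list."""
    term_genes = term_genes & universe
    k = len(term_genes & selected)
    return hypergeom_upper_tail(k, len(universe), len(term_genes), len(selected))


# Toy data (all invented): gene IDs are plain integers.
universe = set(range(100))
child = set(range(10))    # stands in for 'positive regulation of kinase activity'
parent = set(range(30))   # stands in for the parent term; contains all child genes
# 9 child genes, 2 parent-only genes, 9 unrelated genes were "selected".
selected = set(range(9)) | {10, 11} | set(range(30, 39))

cutoff = 0.05

# Standard test: each term is tested on its full gene set.
p_child = enrichment_p(child, selected, universe)
p_parent_std = enrichment_p(parent, selected, universe)

# Conditional test (simplified): test the child first; if it is significant,
# set its genes aside before testing the parent.
remaining = parent - child if p_child < cutoff else parent
p_parent_cond = enrichment_p(remaining, selected, universe)

print(f"child:                {p_child:.3g}")
print(f"parent (standard):    {p_parent_std:.3g}")
print(f"parent (conditional): {p_parent_cond:.3g}")
```

On this toy data the parent term passes the 0.05 cutoff under the standard test but not once the child's genes are set aside, which is the situation described above. In GOstats itself this behaviour is controlled by the "conditional" argument mentioned in this thread.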
A bit more info:

1) There are two packages, GOstats (Falcon and Gentleman) and topGO (Alexa), that do conditional analyses that are very similar in spirit. They were developed independently, but at about the same time.

2) Each has its own reference. Ours is S. Falcon and R. Gentleman, Using GOstats to test gene lists for GO term association, Bioinformatics, 2007, 23, 257-258. Jim already provided the other one, and the vignette for the package should also provide some reasonable explanation and examples.

Best wishes,
Robert

--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org
written 11.6 years ago by rgentleman
Dear Dr. Gentleman,

Thank you for the input. It was indeed helpful, and after reading the references the concept is now pretty clear.

Best Regards,
Chanchal

-----Original Message-----
From: Robert Gentleman [mailto:rgentlem@fhcrc.org]
Sent: Monday, November 26, 2007 5:36 PM
To: James W. MacDonald
Cc: Chanchal Kumar; bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] Query about the conditional hypergeometric test in GOstats package
written 11.6 years ago by Chanchal Kumar
Dear Jim,

Thank you very much for the information. It was indeed very helpful. I have now run my analysis with the "conditional" argument set to TRUE, and the results are easier to interpret. I am thinking of using the conditional test in the rest of my analysis. I will go through the reference as well.

Best Regards,
Chanchal

-----Original Message-----
From: James W. MacDonald [mailto:jmacdon@med.umich.edu]
Sent: Monday, November 26, 2007 3:57 PM
To: Chanchal Kumar
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] Query about the conditional hypergeometric test in GOstats package
written 11.6 years ago by Chanchal Kumar