Query about the conditional hypergeometric test in GOstats package
@chanchal-kumar-2465
Last seen 9.6 years ago
Dear Bioconductor developers and users,

I am using the "GOstats" package to find over- or under-represented GO terms in my dataset. It offers an option to run a "conditional hypergeometric test". I am familiar with the conventional hypergeometric test, but I am not sure what the conditional version implies. It would therefore be very helpful if someone could explain the concept and point me to relevant references.

Thanks in advance!

Best Regards,
Chanchal

===============================
Chanchal Kumar, Ph.D. Candidate
Dept. of Proteomics and Signal Transduction
Max Planck Institute of Biochemistry
Am Klopferspitz 18
82152 D-Martinsried (near Munich)
Germany
e-mail: chanchal at biochem.mpg.de
Phone: (Office) +49 (0) 89 8578 2296
Fax: (Office) +49 (0) 89 8578 2219
http://www.biochem.mpg.de/mann/
===============================
Proteomics • 1.2k views
@james-w-macdonald-5106
Last seen 17 hours ago
United States
Hi Chanchal,

The GO ontology is set up as a directed acyclic graph in which every gene annotated to a child term is also annotated to that term's parents. If you do a standard hypergeometric test, you might, e.g., find 'positive regulation of kinase activity' to be significant.

If you then test 'positive regulation of catalytic activity', which is a parent term, it might be significant as well, but only because of the genes that come from 'positive regulation of kinase activity'.

The conditional hypergeometric test takes this into account: when it tests a parent term, it uses only those genes that are not already annotated to a significant lower-level (child) term.

For a reference, see the paper by Adrian Alexa. You can find the citation on the last page of the 'How to use GOstats' vignette.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
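The parent/child effect described in this answer can be illustrated with a small toy calculation. The sketch below is not GOstats code: it is a language-neutral illustration in Python using only the standard library, and the gene sets, the 0.05 cutoff, and the helper names `hyper_upper_tail` and `over_rep_pvalue` are all made up for the example. It also assumes one simple way of conditioning (dropping the significant child's genes from the universe, the selected list, and the parent's annotation); the exact conditioning performed by GOstats may differ, so treat the vignette and the cited papers as the authority.

```python
from math import comb


def hyper_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from a universe
    of N genes, K of which are annotated to the term being tested."""
    total = comb(N, n)
    upper = min(K, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def over_rep_pvalue(term_genes, selected, universe):
    """Over-representation p-value for one term, restricted to `universe`."""
    term = term_genes & universe
    sel = selected & universe
    return hyper_upper_tail(len(term & sel), len(universe), len(term), len(sel))


# Hypothetical toy data: genes are just integers.
universe = set(range(100))
child = set(range(10))    # e.g. 'positive regulation of kinase activity'
parent = set(range(30))   # e.g. 'positive regulation of catalytic activity'

# 20 selected genes: 9 from the child term, 2 annotated only to the parent,
# and 9 annotated to neither term.
selected = set(range(9)) | {10, 11} | set(range(50, 59))

CUTOFF = 0.05  # assumed significance cutoff for the example

# Standard test: each term is tested on its own.
p_child = over_rep_pvalue(child, selected, universe)
p_parent_standard = over_rep_pvalue(parent, selected, universe)

# Conditional test (simplified): the child is tested first; if it is
# significant, its genes are removed before the parent is tested.
reduced_universe = universe - child if p_child < CUTOFF else universe
p_parent_conditional = over_rep_pvalue(parent, selected, reduced_universe)

print(f"child, standard:     {p_child:.3g}")
print(f"parent, standard:    {p_parent_standard:.3g}")
print(f"parent, conditional: {p_parent_conditional:.3g}")
```

With these toy numbers the parent term falls below the cutoff in the standard test but not in the conditional one, because almost all of its selected genes are inherited from the significant child term.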
A bit more info:

1) There are two packages, GOstats (Falcon and Gentleman) and topGO (Alexa), that do conditional analyses and are very similar in spirit. They were developed independently, but at about the same time.

2) Each has its own reference. Ours is S. Falcon and R. Gentleman, Using GOstats to test gene lists for GO term association, Bioinformatics, 2007, 23, 257-258. Jim already provided the other one.

The vignette for the package should also provide some reasonable explanation and examples.

Best wishes,
Robert

--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org
Dear Dr. Gentleman,

Thanks for the input. It was indeed helpful, and after reading the references the concept is now quite clear.

Best Regards,
Chanchal

-----Original Message-----
From: Robert Gentleman [mailto:rgentlem@fhcrc.org]
Sent: Monday, November 26, 2007 5:36 PM
To: James W. MacDonald
Cc: Chanchal Kumar; bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] Query about the conditional hypergeometric test in GOstats package
Dear Jim,

Thank you very much for the information. It was indeed very helpful. I have now rerun my analysis with the "conditional" argument set to TRUE, and the results are easier to interpret. I am thinking of using the conditional test in the rest of my analysis. I will go through the reference as well.

Best Regards,
Chanchal

-----Original Message-----
From: James W. MacDonald [mailto:jmacdon@med.umich.edu]
Sent: Monday, November 26, 2007 3:57 PM
To: Chanchal Kumar
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] Query about the conditional hypergeometric test in GOstats package