Problem with the function hyperGtest from GOstats package
Seth Falcon ★ 7.4k
@seth-falcon-992
"James W. MacDonald" <jmacdon at med.umich.edu> writes:
> As you already noted, the man page states
>
>   'cateogrySubsetIds': Object of class '"ANY"': If the test method
>   supports it, can be used to specify a subset of category ids
>   to include in the test instead of all possible category ids.
>
> I don't know which test method supports this argument, but apparently
> hyperGTest() doesn't.

Unfortunately, "cateogrySubsetIds" is a half-implemented feature and hyperGTest ignores it. I will add it to my list, just after the "spell check code" item for the next release ;-)

The reason that you can't simply test all of the GO IDs and then subset after testing is that in the current implementation, the universe of gene IDs is determined in part by requiring that each gene have at least one annotation in the set of GO IDs. Hence, reducing the set of GO IDs tested could remove some gene IDs from the universe, and that will change the results for all tests.

Now, whether removing gene IDs that have no GO annotation from the universe is the right thing to do could be up for discussion. My argument is that removal is good because it makes the test more conservative. If you leave them in, all you do is increase the size of the gene universe, and this tends to make any over-represented GO IDs look all the more impressive.

So, sorry for the teaser w.r.t. a method for subsetting the category. I hope to have code that can handle that for the next release.

Best,

+ seth

--
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org
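To make the point about the universe concrete, here is a minimal sketch in Python (not GOstats code) of a universe built by keeping only genes with at least one annotation among the tested terms. The function name `build_universe`, the gene names, and the term labels `T1` to `T3` are all hypothetical and chosen only for illustration.

```python
def build_universe(gene_to_terms, tested_terms):
    """Return the genes that have at least one annotation among tested_terms."""
    tested = set(tested_terms)
    return {gene for gene, terms in gene_to_terms.items() if terms & tested}


# Hypothetical toy annotation table: gene -> set of term labels.
gene_to_terms = {
    "geneA": {"T1", "T2"},
    "geneB": {"T2"},
    "geneC": {"T3"},
    "geneD": set(),  # no annotation at all
}

# Universe when every term is tested: geneD drops out.
full_universe = build_universe(gene_to_terms, ["T1", "T2", "T3"])

# Universe when only a subset of terms is tested: geneB and geneC drop out too.
subset_universe = build_universe(gene_to_terms, ["T1"])
```

Under this rule, restricting the tested terms shrinks the universe, so the counts that go into each test are no longer the ones used when all terms are tested.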
Annotation GO Cancer • 841 views
@arnemullersanofi-aventiscom-1086
Hello,

Whether genes that are not annotated to the subset of GO terms should be removed from the universe depends on the question one asks (the hypothesis). If the subset of GO terms represents what you're interested in, but you want to know the chance of observing these terms in the context of the entire GO BP tree, you need to leave the un-annotated genes in the universe. This would be the same as testing all GO BP terms and extracting the subset of terms afterwards (though that is less elegant, I think ;-). I suggest making this an option in "cateogrySubsetIds".

Kind regards,
Arne

>-----Original Message-----
>From: bioconductor-bounces at stat.math.ethz.ch On Behalf Of Seth Falcon
>Sent: Thursday, March 15, 2007 5:30 PM
>To: James W. MacDonald
>Cc: Biton, Anne PH/FR; bioconductor at stat.math.ethz.ch
>Subject: Re: [BioC] Problem with the function hyperGtest from GOstats package
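The effect of universe size discussed in this thread can be checked numerically. The following is a rough sketch in Python using only the standard library, not the GOstats implementation: it computes the upper-tail hypergeometric probability for one term under two universe sizes. The function name `hyperg_upper_tail` and all counts are hypothetical, and the sketch assumes that the number of selected genes and the number of genes in the category stay the same while only the universe grows.

```python
from math import comb


def hyperg_upper_tail(universe, in_category, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `universe` genes, of which `in_category` carry the term."""
    total = comb(universe, selected)
    upper = min(in_category, selected)
    favourable = sum(
        comb(in_category, k) * comb(universe - in_category, selected - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / total


# Hypothetical counts: 40 genes carry the term, 100 genes were selected,
# and 10 of the selected genes carry the term.
p_small_universe = hyperg_upper_tail(universe=1000, in_category=40, selected=100, overlap=10)

# Same counts, but the universe is padded with 1000 extra genes that do not carry the term.
p_large_universe = hyperg_upper_tail(universe=2000, in_category=40, selected=100, overlap=10)

print(p_small_universe, p_large_universe)
```

With these made-up counts, the padded universe gives the smaller p-value, which matches the argument that leaving un-annotated genes in makes over-represented terms look more significant. Which universe is appropriate still depends on the hypothesis being tested.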
