GOstats with yeast annotation
@alex-gutteridge-2935
Hi,

I've been trying to use the hyperGTest method from the GOstats package with some yeast ORF data. I notice in this thread from a month or so ago that there are currently problems with using any of the yeast annotation sets apart from 'YEAST' (which is deprecated) due to missing ID2EntrezID methods:

https://stat.ethz.ch/pipermail/bioconductor/2008-June/022697.html

I just wanted to check that this is still the case, and to ask whether there is an ETA for when the org.Sc.sgd.db annotations (which are replacing YEAST, as I understand it) will be compatible with hyperGTest.

Also, is the exact source of the GO annotations used in these packages documented anywhere? Looking in the DESCRIPTION file I see 'primarily based on mapping using ORF identifiers from SGD' for org.Sc.sgd.db and 'assembled using data from public data repositories' for YEAST. Should I just take it that these are based on the SGD GO annotation file from the date given in the Packaged field of the DESCRIPTION file? For YEAST there is also a Created field, which is approx. 1 month prior to the Packaged date, so I'm guessing that one is the real age of the data? The yeast annotations change so quickly that it's useful to be able to pin this down as accurately as possible.

Thanks in advance for any help with these questions.

Alex Gutteridge
Department of Biochemistry
University of Cambridge
Annotation GO Yeast GOstats
rgentleman
Hi Alex,

If you are willing to use R-devel and Bioc-devel, the issue should be fixed there. I would be interested in hearing of any problems you might have (or successes) using that version. I am waiting for some reports of success before I port this to release.

Alex Gutteridge wrote:
> Also, is the exact source of GO annotations used in these packages
> documented anywhere? [...] Should I just take it these are based on
> the SGD GO annotation file from the date given in the Packaged field
> of the DESCRIPTION file?

The man page is pretty explicit (?org.Sc.sgdGO):

Mappings were based on data provided by: Yeast Genome ( ftp://genome-ftp.stanford.edu/pub/yeast/data_download ) on 2008-Mar29

I am not sure what more we could put there.

Best wishes,
Robert

--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org
Hi,

Just to confirm: the org.Sc.sgd.db package and GOstats seem to work fine together for me in Bioc-devel (sample session pasted below).

R version 2.8.0 Under development (unstable) (2008-07-22 r46103)
[Previously saved workspace restored]

> library(Category)
Loading required package: Biobase
Loading required package: tools
Loading required package: graph
Loading required package: annotate
Loading required package: AnnotationDbi
Loading required package: DBI
Loading required package: RSQLite
Loading required package: xtable
Loading required package: genefilter
Loading required package: survival
Loading required package: splines
> library(GOstats)
Loading required package: GO.db
Loading required package: RBGL
> sel = readLines("Turbidostat.genes")
> uni = readLines("all.genes")
> params = new("GOHyperGParams", geneIds = sel, universeGeneIds = uni, annotation = "org.Sc.sgd.db", ontology = "BP", pvalueCutoff = 0.1, conditional = FALSE, testDirection = "over")
> over = hyperGTest(params)
> summary(over)
               GOBPID       Pvalue OddsRatio    ExpCount Count Size
GO:0006412 GO:0006412 1.109223e-16  2.755286  62.1352567   125  383
GO:0010467 GO:0010467 4.482367e-14  1.824627 225.8284001   317 1392
GO:0009059 GO:0009059 8.256804e-14  2.232463  89.0659423   154  549
GO:0043170 GO:0043170 8.691113e-13  1.698656 414.6676658   510 2556
GO:0044267 GO:0044267 5.484817e-12  1.746362 214.3098539   296 1321
GO:0019538 GO:0019538 1.268805e-11  1.718287 224.6927688   306 1385
[..snip..]
> q()
Save workspace image? [y/n/c]: n

ag357 at ag357-pc2102:~/Desktop/study> head Turbidostat.genes
YAL001C
YAL002W
YAL003W
YAL005C
YAL008W
YAL009W
YAL010C
YAL011W
YAL019W
ag357 at ag357-pc2102:~/Desktop/study> head all.genes
YHR047C
YHR051W
YHR066W
YHR068W
YHR075C
YHR076W
YHR080C
YHR083W
YHR143W-A
YKL137W

AlexG

Alex Gutteridge
Department of Biochemistry
University of Cambridge
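For readers wondering how the ExpCount and Pvalue columns in such a summary relate to the counts: with conditional=FALSE and testDirection="over", the test is a one-sided hypergeometric test per GO term. The following is a minimal base-R sketch of that calculation for a single term. All four numbers are hypothetical placeholders (the universe and selection sizes of the session above are not shown in the thread), so the result is not expected to match any row of the table.

```r
# Hypothetical counts for one GO term; NOT taken from the session in this thread.
universe_size <- 6000  # universe genes that carry at least one GO annotation
selected_size <- 1000  # selected genes that fall inside that universe
term_size     <- 400   # universe genes annotated to the term (the "Size" column)
term_count    <- 120   # selected genes annotated to the term (the "Count" column)

# Expected number of selected genes with the term under the null ("ExpCount").
exp_count <- selected_size * term_size / universe_size

# Over-representation p-value: P(X >= term_count) for a hypergeometric X.
# phyper() gives P(X <= q), so the upper tail starts at term_count - 1.
p_over <- phyper(term_count - 1,
                 term_size,                  # "white balls": genes with the term
                 universe_size - term_size,  # "black balls": genes without it
                 selected_size,              # number of genes drawn
                 lower.tail = FALSE)

print(c(ExpCount = exp_count, Pvalue = p_over))
```

Note that GOstats restricts both gene lists to genes with annotation in the chosen ontology before counting, so the effective universe is usually smaller than the raw length of the input file.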
Marc Carlson
Hi Alex,

You can see the sources used to make the org.Sc.sgd.db package by calling its dbInfo() function like this:

org.Sc.sgd_dbInfo()

You can see that there are 3 sources of information listed for this package. The GO to yeast mappings themselves come from SGD, as listed in the help pages.

Hope this helps,

Marc
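To put the DESCRIPTION dates and the package's own source listing side by side, something along the following lines may help. This is a minimal sketch: `annotation_provenance` is a hypothetical helper name, the field list is an assumption about what is worth inspecting, and the `org.Sc.sgd_dbInfo()` call only runs if org.Sc.sgd.db is installed. The base package `stats` is used for the unconditional demonstration simply because it is always available.

```r
# Hypothetical helper: pull version and date fields from an installed
# package's DESCRIPTION file. Fields that are absent come back as NA.
annotation_provenance <- function(pkg) {
  desc <- utils::packageDescription(pkg)
  fields <- c("Package", "Version", "Packaged", "Built")
  vapply(fields, function(f) {
    value <- desc[[f]]
    if (is.null(value)) NA_character_ else as.character(value)
  }, character(1))
}

# Works for any installed package; "stats" is only a stand-in here.
print(annotation_provenance("stats"))

# For the yeast annotation package, compare the DESCRIPTION dates with the
# source dates recorded inside the package itself.
if (requireNamespace("org.Sc.sgd.db", quietly = TRUE)) {
  print(annotation_provenance("org.Sc.sgd.db"))
  org.Sc.sgd.db::org.Sc.sgd_dbInfo()
}
```

The Packaged date only says when the tarball was built; the source dates reported by the package itself are the better guide to the age of the annotation data.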
@joern-toedling-1244
Hi Alex,

I'm afraid I cannot answer your questions about the yeast annotation package, but I would like to mention an alternative: you could create a simple list mapping gene identifiers to GO identifiers, using the biomaRt package to query the up-to-date annotation in the Ensembl database, and then use this list to do the same analysis with the topGO package. The topGO package has a slightly different interface for setting up the test, but the vignette provides a good working example.

Hope this helps.

Regards,
Joern

--
Joern Toedling
EMBL - European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton, Cambridge CB10 1SD
United Kingdom
Phone +44(0)1223 492566
Email toedling at ebi.ac.uk
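The "simple list" mentioned above can be built in base R from any two-column table of gene and GO identifiers. The sketch below assumes such a table is already in hand; the `annot` data frame, its column names, and the gene-to-term pairings in it are hypothetical placeholders, not real SGD or Ensembl annotations, and the biomaRt query that would normally produce the table is left out.

```r
# Hypothetical annotation table: one row per (gene, GO term) pair, shaped
# like the result of an annotation query. The pairings are placeholders.
annot <- data.frame(
  gene  = c("YAL001C", "YAL001C", "YAL002W", "YAL003W"),
  go_id = c("GO:0006412", "GO:0010467", "GO:0006412", ""),
  stringsAsFactors = FALSE
)

# Drop genes that came back without a GO identifier.
annot <- annot[annot$go_id != "", ]

# Named list: one element per gene, holding its unique GO identifiers.
gene2GO <- lapply(split(annot$go_id, annot$gene), unique)

str(gene2GO)
```

A list of this shape is what topGO's `annFUN.gene2GO` annotation function takes through its `gene2GO` argument when a topGOdata object is constructed; the vignette covers the remaining setup.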