GO.db and Gostats question
2
0
Entering edit mode
Tarca, Adi ▴ 570
@tarca-adi-1500
Last seen 5 months ago
United States
Hi all, I have a question regarding the way in which GOstats scans the GO tree and counts how many genes in a given list belong to those go terms. Here is how I would look for all BP GO terms that have one or more genes in a given list called ALL: library(org.Hs.eg.db) library(GO.db) ALL=c("6462","54784","386678","1572","57830","51764","85443","387644", "54807","401106","11012","26529") xyy<- as.list(org.Hs.egGO2ALLEGS) x1 <- as.list(GOTERM) alGO=names(xyy) out<-list() for (i in 1:length(alGO)){ mygo=alGO[i] if(Ontology(x1[[mygo]])=="BP"){ genesgo=intersect(as.vector(xyy[[mygo]]),ALL) out[[i]]<-genesgo } } table(unlist(lapply(out,length))) I am not concerned here with the speed of the process by merely finding the same BP terms that GOstats would find. I see that hyperGTest uses both an universeGeneIds argument but also an annotation argument, and I am not sure why the annotation argument is needed given that the universeGeneIds is provided. Note that I can not get the list of all terms tested by GOstats using something like: library(GOstats) params <- new("GOHyperGParams", geneIds = ALL, universeGeneIds = ALL, annotation = "hgu133plus2.db", ontology = "BP", pvalueCutoff = 1, conditional = FALSE, testDirection = "over") hgCondOver <- hyperGTest(params) Because all p values will be exactly 1, hyperGTest will return no entries since none are less than the pvalueCutoff. Thanks a lot, Adi
Annotation GO PROcess GOstats Annotation GO PROcess GOstats • 1.2k views
ADD COMMENT
0
Entering edit mode
Marc Carlson ★ 7.2k
@marc-carlson-2264
Last seen 7.7 years ago
United States
Hi Adi, In addition to the gene list you provide, and the GO DAG, there is also information (from the annotation package that you specify) to indicate which GO terms should be associated with which gene IDs. Marc Tarca, Adi wrote: > Hi all, > > I have a question regarding the way in which GOstats scans the GO tree and counts how many genes in a given list belong to those go terms. > Here is how I would look for all BP GO terms that have one or more genes in a given list called ALL: > > library(org.Hs.eg.db) > library(GO.db) > ALL=c("6462","54784","386678","1572","57830","51764","85443","387644 ","54807","401106","11012","26529") > xyy<- as.list(org.Hs.egGO2ALLEGS) > x1 <- as.list(GOTERM) > alGO=names(xyy) > out<-list() > for (i in 1:length(alGO)){ > mygo=alGO[i] > if(Ontology(x1[[mygo]])=="BP"){ > genesgo=intersect(as.vector(xyy[[mygo]]),ALL) > out[[i]]<-genesgo > } > } > table(unlist(lapply(out,length))) > > > I am not concerned here with the speed of the process by merely finding the same BP terms that GOstats would find. I see that hyperGTest uses both an universeGeneIds argument but also an annotation argument, and I am not sure why the annotation argument is needed given that the universeGeneIds is provided. > Note that I can not get the list of all terms tested by GOstats using something like: > > library(GOstats) > params <- new("GOHyperGParams", geneIds = ALL, > universeGeneIds = ALL, annotation = "hgu133plus2.db", > ontology = "BP", pvalueCutoff = 1, conditional = FALSE, > testDirection = "over") > hgCondOver <- hyperGTest(params) > > Because all p values will be exactly 1, hyperGTest will return no entries since none are less than the pvalueCutoff. > > > > > Thanks a lot, > > Adi > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > >
ADD COMMENT
0
Entering edit mode
Marc Carlson ★ 7.2k
@marc-carlson-2264
Last seen 7.7 years ago
United States
Hi Adi, I am reposting this reply to the list so that others may benefit from any clarifications that we make here. It's a good idea to keep things on list in the future so that others can learn from your questions as well. To answer your question: I am afraid that was not what I was trying to say. Sorry if I confused you. The org.Hs.eg.db package can be used instead of a human based array package. But in that case, "org.Hs.eg.db" would be your annotation argument when making your GOHyperGParams object. The point I was trying to make before was that either an org package or a chip based annotation package is needed to supply the GO to gene mapping. But you need to supply one of those if you are going to do this, because the GO.db package only contains the Gene Ontology itself and not information about which GO IDs map to particular genes. Hope this clarifies things, Marc Tarca, Adi wrote: > > Thanks Marc, > So what you are saying is that the GO.db and org.Hs.eg.db does not contain all info required to map any human entrez id to all GO terms, and you really need the array package as well? > > Thanks, > Adi > > > -----Original Message----- > From: Marc Carlson [mailto:mcarlson at fhcrc.org] > Sent: Thursday, September 03, 2009 6:01 PM > To: Tarca, Adi > Cc: 'bioconductor at stat.math.ethz.ch' > Subject: Re: [BioC] GO.db and Gostats question > > Hi Adi, > > In addition to the gene list you provide, and the GO DAG, there is also information (from the annotation package that you specify) to indicate which GO terms should be associated with which gene IDs. > > Marc > > > > > Tarca, Adi wrote: > >> Hi all, >> >> I have a question regarding the way in which GOstats scans the GO tree and counts how many genes in a given list belong to those go terms. >> Here is how I would look for all BP GO terms that have one or more genes in a given list called ALL: >> >> library(org.Hs.eg.db) >> library(GO.db) >> ALL=c("6462","54784","386678","1572","57830","51764","85443","387644", >> "54807","401106","11012","26529") >> xyy<- as.list(org.Hs.egGO2ALLEGS) >> x1 <- as.list(GOTERM) >> alGO=names(xyy) >> out<-list() >> for (i in 1:length(alGO)){ >> mygo=alGO[i] >> if(Ontology(x1[[mygo]])=="BP"){ >> genesgo=intersect(as.vector(xyy[[mygo]]),ALL) >> out[[i]]<-genesgo >> } >> } >> table(unlist(lapply(out,length))) >> >> >> I am not concerned here with the speed of the process by merely finding the same BP terms that GOstats would find. I see that hyperGTest uses both an universeGeneIds argument but also an annotation argument, and I am not sure why the annotation argument is needed given that the universeGeneIds is provided. >> Note that I can not get the list of all terms tested by GOstats using something like: >> >> library(GOstats) >> params <- new("GOHyperGParams", geneIds = ALL, universeGeneIds = ALL, >> annotation = "hgu133plus2.db", ontology = "BP", pvalueCutoff = 1, >> conditional = FALSE, testDirection = "over") hgCondOver <- >> hyperGTest(params) >> >> Because all p values will be exactly 1, hyperGTest will return no entries since none are less than the pvalueCutoff. >> >> >> >> >> Thanks a lot, >> >> Adi >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> >> >> > > > >
ADD COMMENT

Login before adding your answer.

Traffic: 674 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6