Hi all,
I have a question regarding the way in which GOstats scans the GO tree
and counts how many genes in a given list belong to those go terms.
Here is how I would look for all BP GO terms that have one or more
genes in a given list called ALL:
library(org.Hs.eg.db)
library(GO.db)
ALL=c("6462","54784","386678","1572","57830","51764","85443","387644",
"54807","401106","11012","26529")
xyy<- as.list(org.Hs.egGO2ALLEGS)
x1 <- as.list(GOTERM)
alGO=names(xyy)
out<-list()
for (i in 1:length(alGO)){
mygo=alGO[i]
if(Ontology(x1[[mygo]])=="BP"){
genesgo=intersect(as.vector(xyy[[mygo]]),ALL)
out[[i]]<-genesgo
}
}
table(unlist(lapply(out,length)))
I am not concerned here with the speed of the process by merely
finding the same BP terms that GOstats would find. I see that
hyperGTest uses both an universeGeneIds argument but also an
annotation argument, and I am not sure why the annotation argument is
needed given that the universeGeneIds is provided.
Note that I can not get the list of all terms tested by GOstats using
something like:
library(GOstats)
params <- new("GOHyperGParams", geneIds = ALL,
universeGeneIds = ALL, annotation = "hgu133plus2.db",
ontology = "BP", pvalueCutoff = 1, conditional = FALSE,
testDirection = "over")
hgCondOver <- hyperGTest(params)
Because all p values will be exactly 1, hyperGTest will return no
entries since none are less than the pvalueCutoff.
Thanks a lot,
Adi
Hi Adi,
In addition to the gene list you provide, and the GO DAG, there is
also
information (from the annotation package that you specify) to indicate
which GO terms should be associated with which gene IDs.
Marc
Tarca, Adi wrote:
> Hi all,
>
> I have a question regarding the way in which GOstats scans the GO
tree and counts how many genes in a given list belong to those go
terms.
> Here is how I would look for all BP GO terms that have one or more
genes in a given list called ALL:
>
> library(org.Hs.eg.db)
> library(GO.db)
> ALL=c("6462","54784","386678","1572","57830","51764","85443","387644
","54807","401106","11012","26529")
> xyy<- as.list(org.Hs.egGO2ALLEGS)
> x1 <- as.list(GOTERM)
> alGO=names(xyy)
> out<-list()
> for (i in 1:length(alGO)){
> mygo=alGO[i]
> if(Ontology(x1[[mygo]])=="BP"){
> genesgo=intersect(as.vector(xyy[[mygo]]),ALL)
> out[[i]]<-genesgo
> }
> }
> table(unlist(lapply(out,length)))
>
>
> I am not concerned here with the speed of the process by merely
finding the same BP terms that GOstats would find. I see that
hyperGTest uses both an universeGeneIds argument but also an
annotation argument, and I am not sure why the annotation argument is
needed given that the universeGeneIds is provided.
> Note that I can not get the list of all terms tested by GOstats
using something like:
>
> library(GOstats)
> params <- new("GOHyperGParams", geneIds = ALL,
> universeGeneIds = ALL, annotation = "hgu133plus2.db",
> ontology = "BP", pvalueCutoff = 1, conditional = FALSE,
> testDirection = "over")
> hgCondOver <- hyperGTest(params)
>
> Because all p values will be exactly 1, hyperGTest will return no
entries since none are less than the pvalueCutoff.
>
>
>
>
> Thanks a lot,
>
> Adi
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
Hi Adi,
I am reposting this reply to the list so that others may benefit from
any clarifications that we make here. It's a good idea to keep things
on list in the future so that others can learn from your questions as
well.
To answer your question: I am afraid that was not what I was trying to
say. Sorry if I confused you. The org.Hs.eg.db package can be used
instead of a human based array package. But in that case,
"org.Hs.eg.db" would be your annotation argument when making your
GOHyperGParams object. The point I was trying to make before was that
either an org package or a chip based annotation package is needed to
supply the GO to gene mapping. But you need to supply one of those if
you are going to do this, because the GO.db package only contains the
Gene Ontology itself and not information about which GO IDs map to
particular genes.
Hope this clarifies things,
Marc
Tarca, Adi wrote:
>
> Thanks Marc,
> So what you are saying is that the GO.db and org.Hs.eg.db does not
contain all info required to map any human entrez id to all GO terms,
and you really need the array package as well?
>
> Thanks,
> Adi
>
>
> -----Original Message-----
> From: Marc Carlson [mailto:mcarlson at fhcrc.org]
> Sent: Thursday, September 03, 2009 6:01 PM
> To: Tarca, Adi
> Cc: 'bioconductor at stat.math.ethz.ch'
> Subject: Re: [BioC] GO.db and Gostats question
>
> Hi Adi,
>
> In addition to the gene list you provide, and the GO DAG, there is
also information (from the annotation package that you specify) to
indicate which GO terms should be associated with which gene IDs.
>
> Marc
>
>
>
>
> Tarca, Adi wrote:
>
>> Hi all,
>>
>> I have a question regarding the way in which GOstats scans the GO
tree and counts how many genes in a given list belong to those go
terms.
>> Here is how I would look for all BP GO terms that have one or more
genes in a given list called ALL:
>>
>> library(org.Hs.eg.db)
>> library(GO.db)
>>
ALL=c("6462","54784","386678","1572","57830","51764","85443","387644",
>> "54807","401106","11012","26529")
>> xyy<- as.list(org.Hs.egGO2ALLEGS)
>> x1 <- as.list(GOTERM)
>> alGO=names(xyy)
>> out<-list()
>> for (i in 1:length(alGO)){
>> mygo=alGO[i]
>> if(Ontology(x1[[mygo]])=="BP"){
>> genesgo=intersect(as.vector(xyy[[mygo]]),ALL)
>> out[[i]]<-genesgo
>> }
>> }
>> table(unlist(lapply(out,length)))
>>
>>
>> I am not concerned here with the speed of the process by merely
finding the same BP terms that GOstats would find. I see that
hyperGTest uses both an universeGeneIds argument but also an
annotation argument, and I am not sure why the annotation argument is
needed given that the universeGeneIds is provided.
>> Note that I can not get the list of all terms tested by GOstats
using something like:
>>
>> library(GOstats)
>> params <- new("GOHyperGParams", geneIds = ALL, universeGeneIds =
ALL,
>> annotation = "hgu133plus2.db", ontology = "BP", pvalueCutoff = 1,
>> conditional = FALSE, testDirection = "over") hgCondOver <-
>> hyperGTest(params)
>>
>> Because all p values will be exactly 1, hyperGTest will return no
entries since none are less than the pvalueCutoff.
>>
>>
>>
>>
>> Thanks a lot,
>>
>> Adi
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>>
>
>
>
>