hyperGTest(GOstats) result question
burak kutlu ▴ 200
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070630/4831de78/attachment.pl
Seth Falcon ★ 7.4k
Hi Burak,

burak kutlu <burak_kutlu at yahoo.com> writes:
> Hello
> I am using hyperGTest to look at the GO enrichments. I am confused
> with the output of the test. For a test, I took the genes in my
> matrix, and found those that map to GO:0009790 (there are 8 such
> genes). Then I ran hyperGTest: using the 8 genes as the input, the
> genes in my matrix as the universe genes (it is a subset of the
> genes represented in the hgu133plus2).

First, I'm a bit confused about what you are trying to accomplish.
Usually, hyperGTest is used to find potentially interesting GO terms.
Here it seems you are starting with a GO term of interest -- or are
you just trying to understand/test hyperGTest?

> When I look at the results, the count field is = 3. I would expect
> it to be 8. This probably changes the outcome. Am I missing
> something?
> I include the code below as well as the sessionInfo() output.
> Any help is appreciated.

I wonder if you could provide a reproducible example? If you can send
me off list the myGenes and universeGenes, perhaps I can see what you
are seeing.

A few comments on your code below...

> universeGeneIds = rownames(myMatrix)
> genesWithMyGO = as.list(GOALLENTREZID)$'GO:0009790'

How about:

    genesWithMyGO = GOALLENTREZID[["GO:0009790"]]

But the GO::GOALLENTREZID map is odd in that it contains data across
different organisms. Why not instead:

    affyWithMyGO = hgu133plus2GO2ALLPROBES[["GO:0009790"]]
    egWithMyGO = unique(unlist(mget(affyWithMyGO, hgu133plus2ENTREZID)))

> geneIds = intersect(genesWithMyGO, universeGeneIds) # there are 8 genes that are in my matrix with this GO id
>
>> geneIds
> [1] "3084" "4683" "5801" "650" "9314" "89870" "2139" "3216"
>
> params = new("GOHyperGParams", geneIds = mygenes, universeGeneIds =

Did you intend to use geneIds instead of mygenes here?

> hyp = hyperGTest(params)
> t = summary(hyp)
> tRep = cbind(t$GOBPID,t$Pvalue,t$OddsRatio,t$ExpCount,t$Count,t$Size,t$Term)
> test = tRep[which(tRep[,1]=='GO:0009790'),] # enrichment result with this term
> names(test) = c(
> "GOBPID","Pvalue","OddsRatio","ExpCount","Count","Size","Term")

Why not:

    t["GO:0009790", ]

+ seth

--
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org
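As a rough illustration of the row-name lookup suggested above: the summary is a data frame, so a term can be pulled out by row name without rebuilding the table with cbind(). The sketch below uses only base R. The data frame `res` is a hypothetical stand-in for summary(hyp), and every value in it is invented for illustration, not GOstats output.

```r
# Toy stand-in for summary(hyp); all values below are invented for illustration.
res <- data.frame(
  GOBPID = c("GO:0000001", "GO:0009790"),
  Pvalue = c(0.04, 0.001),
  Count  = c(2L, 3L),
  Size   = c(10L, 8L),
  stringsAsFactors = FALSE
)
rownames(res) <- res$GOBPID

# Direct lookup by row name keeps the column names and the column types.
hit <- res["GO:0009790", ]

# cbind() on the individual columns builds a character matrix instead,
# so the numeric columns are coerced to strings and the names are lost.
mat <- cbind(res$GOBPID, res$Pvalue, res$Count, res$Size)

print(hit)
print(is.numeric(hit$Pvalue))  # TRUE
print(is.character(mat))       # TRUE
```

The lookup assumes the row names are the GO IDs, as they appear to be in the output posted later in this thread. If they are not, `res[res$GOBPID == "GO:0009790", ]` gives the same row.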
burak kutlu ▴ 200
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070710/ec44dc7c/attachment.pl
Seth Falcon ★ 7.4k
burak kutlu <burak_kutlu at yahoo.com> writes:
>> But the GO::GOALLENTREZID map is odd in that it contains data across
>> different organisms. Why not instead:
>
>> affyWithMyGO = hgu133plus2GO2ALLPROBES[["GO:0009790"]]
>> egWithMyGO = unique(unlist(mget(affyWithMyGO, hgu133plus2ENTREZID)))
>
> Intersection of 'egWithMyGO' with 'universeGenes' contains 31 genes.
>
>> mygenes = intersect(egWithMyGO, universeGeneIds)
>
> When I run the hyperGTest...
>
>> params = new("GOHyperGParams", geneIds = mygenes, universeGeneIds = universeGeneIds,
> annotation = "hgu133plus2",ontology = "BP", pvalueCutoff = 0.05,conditional = FALSE,testDirection = "over")
>
>> hyp = hyperGTest(params)
>> t = summary(hyp)
>
> I get:
>
>> t["GO:0009790",]
>                GOBPID       Pvalue OddsRatio  ExpCount Count Size
> GO:0009790 GO:0009790 4.455886e-60       Inf 0.9285024    31   31
>                             Term
> GO:0009790 embryonic development
>
> This result does make sense, note that the Count and Size are the
> same, 31. We used 31 genes as input, got 31 back. So my question
> about hyperGTest is answered, I was just confused with the different
> number of genes.
>
> Then my new question is why does 'GOALLENTREZID' contain only 8
> genes associated with GO:0009790 while 'egWithMyGO' obtained from
> 'hgu133plus2GO2ALLPROBES' contains more, i.e. 31 genes. One would
> expect at least the same amount of genes in both environments or
> maybe more in 'GOALLENTREZID' because affy chip does not represent
> all the genes?

I'm confused. Here's what I see with egWithMyGO computed as above:

    > egFromGO = unique(GOALLENTREZID[["GO:0009790"]])
    > length(egFromGO)
    [1] 3963
    > all(egWithMyGO %in% egFromGO)
    [1] TRUE

I'm using devel versions here, so there could be differences in the
counts, but I really expect the egWithMyGO to be a subset of egFromGO.

--
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org
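For readers checking what Count, Size and ExpCount mean in the result row quoted above, the numbers can be approximately reproduced by hand with base R's phyper(). This is only a sketch under one assumption: the universe size of 1035 genes is never stated in the thread and is back-calculated here from ExpCount = 31 * 31 / N, which is about 0.9285 when N is 1035. The variable names are illustrative.

```r
# Assumed universe size: not stated in the thread, back-calculated from ExpCount.
n_universe <- 1035
n_selected <- 31   # genes passed as geneIds
n_term     <- 31   # universe genes annotated at GO:0009790 (Size)
n_overlap  <- 31   # selected genes annotated at the term (Count)

# Expected overlap if the selected genes were drawn at random from the universe.
exp_count <- n_selected * n_term / n_universe

# Over-representation P-value: P(X >= n_overlap) for a hypergeometric X.
p_value <- phyper(n_overlap - 1, n_term, n_universe - n_term, n_selected,
                  lower.tail = FALSE)

print(exp_count)
print(p_value)
```

Under that assumption the two printed values should come out close to the ExpCount and Pvalue in the quoted row. Every selected gene is annotated at the term and no other universe gene is, so the 2x2 table has zero off-diagonal cells, which is consistent with the reported OddsRatio of Inf.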