Hi Tao,
I tried out your KEGGHyperG function.  Seems to work just great.
Thanks very much,
Dick
*******************************************************************************
Richard P. Beyer, Ph.D. University of Washington
Tel.:(206) 616 7378     Env. & Occ. Health Sci. , Box 354695
Fax: (206) 685 4696     4225 Roosevelt Way NE, # 100
                        Seattle, WA 98105-6099
http://depts.washington.edu/ceeh/ServiceCores/FC5/FC5.html
http://staff.washington.edu/~dbeyer
*******************************************************************************
--- "Shi, Tao" <shidaxia at yahoo.com> wrote:
Message: 8
Date: Tue, 6 Sep 2005 10:27:05 -0700 (PDT)
From: "Shi, Tao" <shidaxia@yahoo.com>
Subject: Re: [BioC] GOHyperG for KEGG
To: Gunnar Wrobel <bioc at gunnarwrobel.de>
Cc: bioconductor at stat.math.ethz.ch
Message-ID: <20050906172705.66173.qmail at web52703.mail.yahoo.com>
Content-Type: text/plain; charset=iso-8859-1
Thank you very much, Gunnar.  I'll try that.
At the same time, I wrote a function myself, which I adapted almost
entirely from GOHyperG.  I just want to share it with everybody.
Please let me know if there are any bugs!
...Tao
=============================================================================
KEGGHyperG <-
function (geneIDs, lib = "hgu95av2") {
     ## look up an annotation environment, e.g. hgu95av2PATH2PROBE
     getDataEnv <- function(name, lib) {
         get(paste(lib, name, sep = ""), mode = "environment")
     }
     require(lib, character.only = TRUE) ||
         stop("need data package ", lib)
     if (any(duplicated(geneIDs)))  stop("input IDs must be unique")
     keggV <- as.list(getDataEnv("PATH2PROBE", lib))
     ## keep only pathways that contain at least one of the input IDs
     whWeHave <- sapply(keggV, function(y) {
         if (length(y) == 0 || all(is.na(y)))
             return(FALSE)
         ids <- unique(unlist(y))
         any(geneIDs %in% ids)
     })
     keggV <- keggV[whWeHave]
     ## get rid of control probes (lapply so the result stays a list)
     keggV <- lapply(keggV, function(x) {
         if (any(grep("AFFX", x))) {
             return(x[-grep("AFFX", x)])
         } else {
             return(x)
         }
     })
     bad <- sapply(keggV, function(x) (length(x) == 1 && is.na(x)))
     keggV <- keggV[!bad]
     ## universe: all probes annotated to at least one retained pathway
     cIDs <- unique(unlist(keggV))
     nIDs <- length(cIDs)
     keggCounts <- sapply(keggV, length)
     ourIDs <- unique(geneIDs[!is.na(geneIDs)])
     ours <- ourIDs[!duplicated(ourIDs)]
     whGood <- ours[ours %in% cIDs]
     nInt <- length(whGood)
     if (nInt == 0)  { warning("no interesting genes found") }
     ## number of interesting probes in each pathway
     useCts <- sapply(keggV, function(x) sum(whGood %in% x))
     ## P(X >= useCts) under the hypergeometric distribution
     pvs <- phyper(useCts - 1, nInt, nIDs - nInt, keggCounts,
                   lower.tail = FALSE)
     ord <- order(pvs)
     return(list(pvalues = pvs[ord], keggCounts = keggCounts[ord],
         chip = lib, kegg2Affy = keggV, intCounts = useCts[ord],
         numIDs = nIDs, numInt = nInt, intIDs = geneIDs))
}
=============================================================================
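
For anyone who wants to check the arithmetic without installing an
annotation package, here is a minimal sketch of the same hypergeometric
step on made-up data.  The pathway IDs, probe names and the toy list
keggV below are all hypothetical stand-ins for what PATH2PROBE would
return; only the phyper() call mirrors the function above.

```r
## Toy stand-in for an annotation map: pathway ID -> probe IDs (all made up)
keggV <- list(
    "00010" = c("p1", "p2", "p3", "p4"),
    "00020" = c("p3", "p5", "p6"),
    "00030" = c("p7", "p8", "p9", "p10")
)
geneIDs <- c("p1", "p2", "p3", "p7")   # hypothetical "interesting" probes

cIDs <- unique(unlist(keggV))          # universe: probes in any pathway
nIDs <- length(cIDs)                   # size of the universe
whGood <- geneIDs[geneIDs %in% cIDs]   # interesting probes in the universe
nInt <- length(whGood)
keggCounts <- sapply(keggV, length)    # probes per pathway
useCts <- sapply(keggV, function(x) sum(whGood %in% x))  # hits per pathway

## P(X >= useCts): chance of at least this many interesting probes
## in a pathway of this size, same argument order as in KEGGHyperG
pvs <- phyper(useCts - 1, nInt, nIDs - nInt, keggCounts,
              lower.tail = FALSE)

## The hypergeometric distribution is symmetric in "pathway size" and
## "number of interesting probes", so this gives the same p-values
pvsAlt <- phyper(useCts - 1, keggCounts, nIDs - keggCounts, nInt,
                 lower.tail = FALSE)

print(round(pvs, 4))
```

With these toy inputs the first pathway has 3 of the 4 interesting
probes among its 4 members, so its p-value is 25/210, about 0.119.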
--- Gunnar Wrobel <bioc at gunnarwrobel.de> wrote:
> > Is there a similar function like GOHyperG that works on KEGG?  It seems
> > there was no such thing back in Feb. 05
> > (https://stat.ethz.ch/pipermail/bioconductor/2005-February/007532.html).
> > Any updates?
> Hi Tao,
>
> you might try to do this with goCluster. It does the same kind of
> calculation as GOHyperG but can use any kind of annotation.
>
> Cheers
>
> Gunnar
>