Question

GOHyperG for KEGG

0

Entering edit mode

Shi, Tao ▴ 720

@shi-tao-199

Last seen 10.4 years ago

Hi List, Is there a similar function like GOHyperG that works on KEGG? It seems there is no such thing back in Feb. 05 (https://stat.ethz.ch/pipermail/bioconductor/2005-Febr uary/007532.html). Any updates? Thanks, ...Tao

• 1.4k views

ADD COMMENT • link updated 20.2 years ago by Dick Beyer ★ 1.4k • written 20.2 years ago by Shi, Tao ▴ 720

score 0 · Answer 1 · 2005-09-03

Shi, Tao wrote: > Hi List, > > Is there a similar function like GOHyperG that works on KEGG? It seems there is no such thing > back in Feb. 05 (https://stat.ethz.ch/pipermail/bioconductor/2005-Fe bruary/007532.html). Any > updates? Hi Tao, you might try to do this with goCluster. It does the same kind of calculation as GOHyperG but can use any kind of annotation. Cheers Gunnar

score 0 · Answer 2 · 2005-09-06

Thank you very much, Gunnar. I'll try that. At same time, I wrote a function by myself, which I totally stole from GOHyperG. Just want to share it with everybody. Please let me know if there are any bugs! ...Tao ====================================================================== ======= KEGGHyperG <- function (geneIDs, lib = "hgu95av2") { getDataEnv <- function(name, lib) { get(paste(lib, name, sep = ""), mode = "environment") } require(lib, character.only = TRUE) || stop("need data package", lib) if (any(duplicated(geneIDs))) stop("input IDs must be unique") keggV <- as.list(getDataEnv("PATH2PROBE", lib)) whWeHave <- sapply(keggV, function(y) { if is.na(y) || length(y) == 0) return(FALSE) ids = unique(unlist(y)) any(geneIDs %in% ids) }) keggV <- keggV[whWeHave] keggV <- sapply(keggV, function(x) { if(any(grep("AFFX",x))) { return(x[-grep("AFFX",x)]) } else { return(x) } } ) ## get rid of control probes bad <- sapply(keggV, function(x) (length(x) == 1 && is.na(x))) keggV <- keggV[!bad] cIDs <- unique(unlist(keggV)) nIDs <- length(cIDs) keggCounts <- sapply(keggV, length) ourIDs <- unique(geneIDs[!is.na(geneIDs)]) ours <- ourIDs[!duplicated(ourIDs)] whGood <- ours[ours %in% cIDs] nInt = length(whGood) if (nInt == 0) { warning("no interesting genes found") } useCts <- sapply(keggV, function(x) sum(whGood %in% x)) pvs <- phyper(useCts - 1, nInt, nIDs - nInt, keggCounts, lower.tail = FALSE) ord <- order(pvs) return(list(pvalues = pvs[ord], keggCounts = keggCounts[ord], chip = lib, kegg2Affy = keggV, intCounts = useCts[ord], numIDs = nIDs, numInt = nInt, intIDs = geneIDs)) } ====================================================================== ============ --- Gunnar Wrobel <bioc at="" gunnarwrobel.de=""> wrote: > > Is there a similar function like GOHyperG that works on KEGG? It seems there is no such thing > > back in Feb. 05 (https://stat.ethz.ch/pipermail/bioconductor/2005- February/007532.html). Any > > updates? > Hi Tao, > > you might try to do this with goCluster. It does the same kind of > calculation as GOHyperG but can use any kind of annotation. > > Cheers > > Gunnar >

score 0 · Answer 3 · 2005-09-14

0

Entering edit mode

Dick Beyer ★ 1.4k

@dick-beyer-26

Last seen 11.2 years ago

Hi Tao, I tried out your KEGGHyperG function. Seems to work just great. Thanks very much, Dick ********************************************************************** ********* Richard P. Beyer, Ph.D. University of Washington Tel.:(206) 616 7378 Env. & Occ. Health Sci. , Box 354695 Fax: (206) 685 4696 4225 Roosevelt Way NE, # 100 Seattle, WA 98105-6099 http://depts.washington.edu/ceeh/ServiceCores/FC5/FC5.html http://staff.washington.edu/~dbeyer ********************************************************************** ********* --- "Shi, Tao" <shidaxia at="" yahoo.com=""> wrote: Message: 8 Date: Tue, 6 Sep 2005 10:27:05 -0700 (PDT) From: "Shi, Tao" <shidaxia@yahoo.com> Subject: Re: [BioC] GOHyperG for KEGG To: Gunnar Wrobel <bioc at="" gunnarwrobel.de=""> Cc: bioconductor at stat.math.ethz.ch Message-ID: <20050906172705.66173.qmail at web52703.mail.yahoo.com> Content-Type: text/plain; charset=iso-8859-1 Thank you very much, Gunnar. I'll try that. At same time, I wrote a function by myself, which I totally stole from GOHyperG. Just want to share it with everybody. Please let me know if there are any bugs! ...Tao ====================================================================== ======= KEGGHyperG <- function (geneIDs, lib = "hgu95av2") { getDataEnv <- function(name, lib) { get(paste(lib, name, sep = ""), mode = "environment") } require(lib, character.only = TRUE) || stop("need data package", lib) if (any(duplicated(geneIDs))) stop("input IDs must be unique") keggV <- as.list(getDataEnv("PATH2PROBE", lib)) whWeHave <- sapply(keggV, function(y) { if is.na(y) || length(y) == 0) return(FALSE) ids = unique(unlist(y)) any(geneIDs %in% ids) }) keggV <- keggV[whWeHave] keggV <- sapply(keggV, function(x) { if(any(grep("AFFX",x))) { return(x[-grep("AFFX",x)]) } else { return(x) } } ) ## get rid of control probes bad <- sapply(keggV, function(x) (length(x) == 1 && is.na(x))) keggV <- keggV[!bad] cIDs <- unique(unlist(keggV)) nIDs <- length(cIDs) keggCounts <- sapply(keggV, length) ourIDs <- unique(geneIDs[!is.na(geneIDs)]) ours <- ourIDs[!duplicated(ourIDs)] whGood <- ours[ours %in% cIDs] nInt = length(whGood) if (nInt == 0) { warning("no interesting genes found") } useCts <- sapply(keggV, function(x) sum(whGood %in% x)) pvs <- phyper(useCts - 1, nInt, nIDs - nInt, keggCounts, lower.tail = FALSE) ord <- order(pvs) return(list(pvalues = pvs[ord], keggCounts = keggCounts[ord], chip = lib, kegg2Affy = keggV, intCounts = useCts[ord], numIDs = nIDs, numInt = nInt, intIDs = geneIDs)) } ====================================================================== ========== == --- Gunnar Wrobel <bioc at="" gunnarwrobel.de=""> wrote: > > Is there a similar function like GOHyperG that works on KEGG? It seems there is no such thing > > back in Feb. 05 (https://stat.ethz.ch/pipermail/bioconductor/2005-February/007532.html ). Any > > updates? > Hi Tao, > > you might try to do this with goCluster. It does the same kind of > calculation as GOHyperG but can use any kind of annotation. > > Cheers > > Gunnar >

ADD COMMENT • link 20.2 years ago Dick Beyer ★ 1.4k

0

Entering edit mode

Hi Dick, Glad to help! I have a newer version of the function, since the old one uses probe set names as unique identifiers. Dr. Gentleman pointed out to me that this is not appropriate since you end up double counting some. The function below takes care of that. Again, it was modified from GOHyperG and I've added a new "KEGG" option. The input 'x' is a set of Locus Link ID numbers. Please let me know if any problems. Thanks, ...Tao ##================================================================= GOKEGGHyperG <- function (x, lib = "hgu95av2", what = "MF") { getDataEnv <- function(name, lib) { get(paste(lib, name, sep = ""), mode = "environment") } require(lib, character.only = TRUE) || stop("need data package", lib) if (any(duplicated(x))) stop("input IDs must be unique") ################ MODIFIED ################# match.arg(what, c("MF", "BP", "CC", "KEGG")) ############################################ cLLs <- unlist(as.list(getDataEnv("LOCUSID", lib))) ourLLs <- cLLs[match(x, cLLs)] ################## MODIFIED ########################### if (what=="KEGG") { goV <- as.list(getDataEnv("PATH2PROBE", lib)) } else { goV <- as.list(getDataEnv("GO2ALLPROBES", lib)) } ###################################################### whWeHave <- sapply(goV, function(y) { if is.na(y) || length(y) == 0) return(FALSE) lls = unique(unlist(mget(y, getDataEnv("LOCUSID", lib), ifnotfound = NA))) any(x %in% lls) }) goV <- goV[whWeHave] ################## MODIFIED ######################### if (what!="KEGG") { goCat <- unlist(getGOOntology(names(goV))) goodGO <- goCat == what mm <- match(names(goCat), names(goV)) mm <- mm[goodGO] goV <- goV[mm] } ###################################################### goVcts = sapply(goV, function(x) { if (length(x) == 0 || is.na(x)) return(NA) lls <- unique(unlist(mget(x, getDataEnv("LOCUSID", lib)))) lls <- lls[!is.na(lls)] lls }) bad <- sapply(goVcts, function(x) (length(x) == 1 && is.na(x))) goVcts = goVcts[!bad] goV = goV[!bad] cLLs <- unique(unlist(goVcts)) nLL <- length(cLLs) goCounts <- sapply(goVcts, length) ourLLs <- unique(ourLLs[!is.na(ourLLs)]) ours <- ourLLs[!duplicated(ourLLs)] whGood <- ours[ours %in% cLLs] nInt = length(whGood) if (nInt == 0) warning("no interesting genes found") useCts <- sapply(goVcts, function(x) sum(whGood %in% x)) pvs <- phyper(useCts - 1, nInt, nLL - nInt, goCounts, lower.tail = FALSE) ord <- order(pvs) ################################ MODIFIED ################################################## if (what!="KEGG") { return(list(pvalues = pvs[ord], goCounts = goCounts[ord], chip = lib, go2Affy = goV, intCounts = useCts[ord], numLL = nLL, numInt = nInt, intLLs = x)) } else { return(list(pvalues = pvs[ord], keggCounts = goCounts[ord], chip = lib, kegg2Affy = goV, intCounts = useCts[ord], numLL = nLL, numInt = nInt, intLLs = x)) } ################################ MODIFIED ################################################## } #===================================================================== --- Dick Beyer <dbeyer at="" u.washington.edu=""> wrote: > Hi Tao, > > I tried out your KEGGHyperG function. Seems to work just great. Thanks very much, > Dick > > ******************************************************************** *********** > Richard P. Beyer, Ph.D. University of Washington > Tel.:(206) 616 7378 Env. & Occ. Health Sci. , Box 354695 > Fax: (206) 685 4696 4225 Roosevelt Way NE, # 100 > Seattle, WA 98105-6099 > http://depts.washington.edu/ceeh/ServiceCores/FC5/FC5.html > http://staff.washington.edu/~dbeyer > ******************************************************************** *********** > > --- "Shi, Tao" <shidaxia at="" yahoo.com=""> wrote: > > Message: 8 > Date: Tue, 6 Sep 2005 10:27:05 -0700 (PDT) > From: "Shi, Tao" <shidaxia at="" yahoo.com=""> > Subject: Re: [BioC] GOHyperG for KEGG > To: Gunnar Wrobel <bioc at="" gunnarwrobel.de=""> > Cc: bioconductor at stat.math.ethz.ch > Message-ID: <20050906172705.66173.qmail at web52703.mail.yahoo.com> > Content-Type: text/plain; charset=iso-8859-1 > > Thank you very much, Gunnar. I'll try that. > > At same time, I wrote a function by myself, which I totally stole from GOHyperG. > Just want to > share it with everybody. Please let me know if there are any bugs! > > ...Tao > > ==================================================================== ========= > > KEGGHyperG <- > function (geneIDs, lib = "hgu95av2") { > getDataEnv <- function(name, lib) { > get(paste(lib, name, sep = ""), mode = "environment") > } > require(lib, character.only = TRUE) || stop("need data package", lib) > if (any(duplicated(geneIDs))) stop("input IDs must be unique") > keggV <- as.list(getDataEnv("PATH2PROBE", lib)) > > whWeHave <- sapply(keggV, function(y) { > if is.na(y) || length(y) == 0) > return(FALSE) > ids = unique(unlist(y)) > any(geneIDs %in% ids) > }) > > keggV <- keggV[whWeHave] > keggV <- sapply(keggV, function(x) { > if(any(grep("AFFX",x))) { > return(x[-grep("AFFX",x)]) > } else { > return(x) > } > } ) ## get rid of control probes > > bad <- sapply(keggV, function(x) (length(x) == 1 && is.na(x))) > keggV <- keggV[!bad] > cIDs <- unique(unlist(keggV)) > nIDs <- length(cIDs) > keggCounts <- sapply(keggV, length) > ourIDs <- unique(geneIDs[!is.na(geneIDs)]) > ours <- ourIDs[!duplicated(ourIDs)] > whGood <- ours[ours %in% cIDs] > > nInt = length(whGood) > if (nInt == 0) { warning("no interesting genes found") } > useCts <- sapply(keggV, function(x) sum(whGood %in% x)) > > pvs <- phyper(useCts - 1, nInt, nIDs - nInt, keggCounts, lower.tail = FALSE) > ord <- order(pvs) > return(list(pvalues = pvs[ord], keggCounts = keggCounts[ord], > chip = lib, kegg2Affy = keggV, intCounts = useCts[ord], numIDs = nIDs, > numInt = nInt, intIDs = geneIDs)) > } > ==================================================================== ============ > == > > > > > --- Gunnar Wrobel <bioc at="" gunnarwrobel.de=""> wrote: > > > > Is there a similar function like GOHyperG that works on KEGG? It seems > there is no such thing > > > back in Feb. 05 > (https://stat.ethz.ch/pipermail/bioconductor/2005-February/007532.ht ml). Any > > > updates? > > Hi Tao, > > > > you might try to do this with goCluster. It does the same kind of > > calculation as GOHyperG but can use any kind of annotation. > > > > Cheers > > > > Gunnar > > > > > > > > >

ADD REPLY • link 20.2 years ago Shi, Tao ▴ 720