ENTREZ identifiers for PGSEA

Sebastien Gerega ▴ 370

@sebastien-gerega-2229

Last seen 11.4 years ago

Hi, I would like to use a lumi expression set with the PGSEA package and GO to generate genesets. I understand that the identifiers used in the data matrix need to be the same as those used in the genesets. Therefore, since I am using the GO database I need to use the ENTREZ identifiers for my matrix. How can I go about doing this? Obviously I cannot have multiple rows with the same name so I will have to remove duplicates and should probably do so based on variance or something. Is there an elegant way to do what I need? thanks, Sebastien

GO lumi PGSEA • 1.6k views

updated 17.6 years ago by Dykema, Karl • written 17.6 years ago by Sebastien Gerega

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 19 hours ago

United States

You could use the findLargest() function in the genefilter package to select the identifiers that have the largest differences between your samples.

Best,
Jim

--
James W. MacDonald, MS
Biostatistician
UMCCC cDNA and Affymetrix Core
University of Michigan
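As a rough illustration of the "keep one probe per gene" idea discussed here, the following base-R sketch keeps, for each Entrez ID, the probe with the highest variance across samples. It does not call findLargest() itself. The toy matrix, the probe-to-Entrez vector, and the helper name keepMostVariable are all made up for the example; in practice the mapping would come from the annotation package for the array.

```r
# Toy expression matrix: 5 probes x 4 samples (values are invented)
exprsMat <- matrix(
  c(1, 2, 3, 4,     # probe_a
    5, 5, 5, 5,     # probe_b
    2, 8, 2, 8,     # probe_c
    7, 7, 8, 7,     # probe_d
    3, 3, 3, 3),    # probe_e (no Entrez ID)
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("probe_", letters[1:5]), paste0("s", 1:4))
)

# Hypothetical probe -> Entrez ID mapping; NA means unannotated
entrez <- c(probe_a = "1001", probe_b = "1001", probe_c = "2002",
            probe_d = "2002", probe_e = NA)

# Hypothetical helper: keep the most variable probe for each Entrez ID
keepMostVariable <- function(x, ids) {
  stopifnot(is.matrix(x), identical(rownames(x), names(ids)))
  v <- apply(x, 1, var)              # per-probe variance across samples
  ok <- !is.na(ids)                  # drop probes without an Entrez ID
  # for each Entrez ID, the name of the probe with the largest variance
  best <- vapply(split(names(v)[ok], ids[ok]),
                 function(p) p[which.max(v[p])],
                 character(1))
  out <- x[best, , drop = FALSE]
  rownames(out) <- names(best)       # rows are now keyed by Entrez ID
  out
}

collapsed <- keepMostVariable(exprsMat, entrez)
```

The result has one row per Entrez ID, so its row names can be matched directly against gene sets defined on Entrez IDs. Variance is only one possible criterion; any per-probe statistic could be substituted for `v`.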

written 17.6 years ago by James W. MacDonald

Dykema, Karl ▴ 90

@dykema-karl-1116

Last seen 11.4 years ago

Hi all,

We have been having an offline discussion of this issue. I am posting the correspondence so that it can be referenced in the future. Thanks.

-karl

Sebastien,

Nope, you are doing it exactly right. I fixed that problem a while ago but apparently neglected to upload my code changes to Bioconductor... In the meantime, here is the fixed function for you to use:

    aggregateExprs <- function(x, package = "hgu133plus2", using = "ENTREZID", FUN, ...) {
      if (!is.matrix(x) && !is(x, "ExpressionSet"))
        stop("need matrix or ExpressionSet")
      if (is.null(package) && is(x, "ExpressionSet"))
        package <- annotation(x)
      if (is.null(package))
        stop("annotation package name is required")
      if (!require(package, character.only = TRUE))
        stop(package, " is not available")
      pPos <- paste("package", package, sep = ":")
      # ".db" packages name their maps without the suffix,
      # e.g. lumiHumanAllENTREZID inside lumiHumanAll.db
      if (grepl("\\.db$", package))
        package <- sub("\\.db$", "", package)
      nEnv <- paste(package, using, sep = "")
      Env <- get(nEnv, pos = pPos)
      if (is(x, "ExpressionSet")) {
        ids <- featureNames(x)
        x <- exprs(x)
      } else {
        ids <- rownames(x)
      }
      lls <- mget(ids, envir = Env, ifnotfound = NA)
      # when a probe maps to several IDs, keep only the first one
      lls <- vapply(lls, function(id) as.character(id[1]), character(1))
      f <- factor(lls)
      # combine all rows that share an ID using FUN (e.g. mean)
      undupx <- aggregate(x, by = list(f), FUN, ...)
      rownames(undupx) <- as.character(undupx[, 1])
      undupx <- as.matrix(undupx[, -1])
      return(undupx)
    }

-karl

-----Original Message-----
From: Sebastien Gerega
Sent: Wednesday, July 09, 2008 7:17 PM
To: Dykema, Karl
Subject: Re: [BioC] ENTREZ identifiers for PGSEA

Hi,
thanks for your reply. I have just had a quick go using the following command:

    aggregateExprs(exprs(eset), "lumiHumanAll", FUN=mean, na.rm=TRUE)

but I get the error:

    There is no package called "lumiHumanAll"

If I try specifying the library as "lumiHumanAll.db" then I get the error:

    variable "lumiHumanAll.dbENTREZID" was not found...

Am I doing something wrong?
thanks,
Sebastien

Dykema, Karl wrote:
> Sebastien,
>
> You should take a look at our function "aggregateExprs". As long as you have an annotation environment for your lumi microarray, it is quite simple. Let me know if you run into any problems.
>
> -karl
>
> ------ Forwarded Message
> From: Sebastien Gerega
> Date: Wed, 09 Jul 2008 16:31:29 +1000
> To: "bioconductor at stat.math.ethz.ch"
> Subject: [BioC] ENTREZ identifiers for PGSEA
>
> (Sebastien's original question, as given in the first post of this thread)
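The core aggregation step that aggregateExprs relies on can be sketched in plain base R without any annotation package. This is a minimal illustration under stated assumptions, not the PGSEA implementation: the toy matrix and the hand-written probe-to-Entrez vector `entrez` are invented, and in real use the lookup would come from the annotation package (for example the `lumiHumanAllENTREZID` map in `lumiHumanAll.db`).

```r
# Toy expression matrix: 4 probes x 3 samples (values are invented)
exprsMat <- matrix(
  c( 1,  2,  3,
     3,  4, NA,
    10, 20, 30,
     5,  5,  5),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("p1", "p2", "p3", "p4"), c("s1", "s2", "s3"))
)

# Hypothetical probe -> Entrez ID lookup; p4 has no annotation
entrez <- c(p1 = "1001", p2 = "1001", p3 = "2002", p4 = NA)

# Group rows by Entrez ID and average them.
# Rows whose ID is NA are dropped by aggregate().
f <- factor(entrez[rownames(exprsMat)])
agg <- aggregate(exprsMat, by = list(f), FUN = mean, na.rm = TRUE)

# Move the grouping column into the row names and return a plain matrix
collapsed <- as.matrix(agg[, -1])
rownames(collapsed) <- as.character(agg[, 1])
```

Here p1 and p2 are averaged into a single row named "1001", and the unannotated probe p4 disappears. Averaging is one choice for `FUN`; `median` or `max` would slot in the same way. The second error reported in the thread is consistent with the map being looked up under the full package name: the package is attached as `lumiHumanAll.db`, while the map object it provides is named without the `.db` suffix.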

written 17.6 years ago by Dykema, Karl
