Pipeline for listing phenoData in exprSet


Sergii Ivakhno ▴ 40

@sergii-ivakhno-2203

Last seen 11.3 years ago

Hello all,

I was wondering if you could give me some suggestions on the following problem. I would like to build a pipeline that opens exprSet files (imported from GEO) one after another and extracts and evaluates the phenoData labels (except the fields "sample" and "description").

The program is below. The basic problem is that names(pData(eset)) returns a character vector, so after phenonames <- names(pData(eset)) you cannot use eset$phenonames[2] or paste("eset", phenonames[2], sep="$") to reach the column. (Remember that I need the vector in the first place, to remove the phenoData labels "sample" and "description".)

Thanks a lot for any advice!

Best,
Sergii

Wellcome Trust Genome Campus
Hinxton, Cambridge, CB10 1SA, UK

    dfg <- c("sample", "description")
    files <- dir(getwd(), ".RData")
    for (k in 1:length(files)) {
      load(files[k])
      pdateset <- names(pData(eset))
      labels <- pdateset[-which(pdateset %in% dfg)]
      for (m in 1:length(labels)) {
        teamp2 <- unique(paste("eset", labels[m], sep="$"))
        teamp <- as.vector(teamp2)
        for (i in 1:length(teamp)) {
          for (j in 2:length(teamp)) {
            if (i != j) {
              teamp1 <- paste(teamp[i], teamp[j], sep="_")
              teamp1 <- paste(teamp1, files[k], sep="_")
              temp <- (as.character(eset$agent) == teamp[i]) |
                      (as.character(eset$agent) == teamp[j])
              tempeset <- eset[, temp]
              design <- model.matrix(~factor(tempeset$agent))
              fit <- lmFit(tempeset, design)
              ebayes <- eBayes(fit)
              sortebays <- sort.int(ebayes$t[,2], decreasing = TRUE, index.return = TRUE)
              sortebays1 <- ebayes[sortebays$ix,]
              save(sortebays1, file = paste(teamp1, c(".RData"), sep=""))
              rm(sortebays, ebayes, temp, teamp1, tempeset, design, sortebays1, pdateset, labels)
            }
          }
        }
      }
    }

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.


Written 18.5 years ago by Sergii Ivakhno • updated 18.5 years ago by Martin Morgan


Martin Morgan 25k

@martin-morgan-1513

Last seen 11 months ago

United States

Sergii,

Are you trying to use phenotypic data to subset your expression set? If so...

By way of a reproducible example, here's an expression set we all have access to:

    > library(Biobase)
    > data(sample.ExpressionSet)
    > obj <- sample.ExpressionSet

pData(obj) is a data frame. It sounds like you want to use column names that are held in a character vector. In this case, use '[[' to access columns of the data frame:

    > df <- pData(obj)
    > nm <- names(df)
    > okNms <- nm[!nm %in% "type"]
    > okVals <- df[[ okNms[2] ]] > 0.9

Use '[' to subset the samples present in the expression set based on their phenotypic values:

    > obj1 <- obj[,okVals]

Use ';' to wink in text messages.

Is this (other than ;) helpful?

Martin

--
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org
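A minimal sketch of how the '[[' suggestion might slot into the loop from the question. It uses only base R so that it runs without any files on disk: the data frame `pheno` and its columns are hypothetical stand-ins for what pData(eset) would return, and the use of setdiff() and combn() is an assumption about what the loop intends (each pair of distinct values compared once), not something stated in the thread.

```r
# Hypothetical stand-in for pData(eset); in the real pipeline this would be
# the data frame returned by pData() for each loaded expression set.
pheno <- data.frame(
  sample      = paste0("GSM", 1:6),
  agent       = c("control", "drugA", "drugB", "control", "drugA", "drugB"),
  time        = c("0h", "0h", "0h", "24h", "24h", "24h"),
  description = "placeholder",
  stringsAsFactors = FALSE
)

skip   <- c("sample", "description")
labels <- setdiff(names(pheno), skip)     # column names held as a character vector

for (lab in labels) {
  values <- as.character(pheno[[lab]])    # '[[' accepts the column name as a string
  levs   <- unique(values)
  if (length(levs) < 2) next              # nothing to compare for this label
  pairs  <- combn(levs, 2)                # one column per unordered pair of values
  for (p in seq_len(ncol(pairs))) {
    keep <- values %in% pairs[, p]        # logical index over samples
    cat(lab, ":", pairs[1, p], "vs", pairs[2, p], "->", sum(keep), "samples\n")
  }
}
```

In the actual pipeline, `keep` would presumably be used as eset[, keep], with the model built from the selected column instead of the hard-coded eset$agent. Note also that x[-which(cond)] returns an empty vector when nothing matches cond, which is why the sketch uses setdiff() for dropping the unwanted labels.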

Written 18.5 years ago by Martin Morgan
