package xps: export.filter issues
steven wink ▴ 90
@steven-wink-5440
Last seen 6.4 years ago
Dear xps user/developer,

My goal is to obtain a dataframe with all the statistics (fold changes, p-values etc.) and annotation data (gene symbols etc.). I need this dataframe to select a subset of 311 genes based on gene symbols.

The issue I have is as follows (I use the xps.pdf vignette variable names to make it easier to follow):

1) root scheme loaded for hgu133plus2
2) cel files imported (just 4 of them: 2 vehicle, 2 treated)
3) rma probe set normalized (data.rma)
4) filter operation: unifltr<-UniFilter and: uniTest(unifltr) <- c("t.test","two.sided","BH",0,0.0,FALSE,0.95,TRUE) for multiple testing correction (adjp)
5) this dataframe is fine: tmp <- validData(rma.ufr)

adding phenotype data is the problem:
6)
tmp <- export.filter(rma.ufr, treetype="stt",
                     varlist="fUnitName:fName:fSymbol:fc:pval:flag",
                     as.dataframe=TRUE, verbose=FALSE)

The resulting data.frame, both in R memory and when I write it to disk, is messed up in certain rows (it clumps a lot of rows into some rows in the 4th column) and then clumps columns into 1 column at the end of the file. It is a bit hard to explain exactly, as I don't see the logic at the moment.

So then I tried removing "fUnitName" from the varlist, because it seems like unit_ID and unit_name are kind of redundant, and I get a crash (reproduced 3 times with the code below). I don't know if this is relevant to the messed up data.frame, but I added it just in case.

tmp<-export.filter(rma.ufr,treetype="stt",varlist="fName:fSymbol:fc:pval:flag",as.dataframe=TRUE,verbose=FALSE)

 *** Break *** segmentation violation

===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x00007f2bd8a26c3e in waitpid () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f2bd89acf5e in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f2bd2ef6227 in TUnixSystem::StackTrace() () from /home/winks/ROOT/root/lib/libCore.so
#3  0x00007f2bd2ef8aa3 in TUnixSystem::DispatchSignals(ESignals) () from /home/winks/ROOT/root/lib/libCore.so
#4  <signal handler called>
#5  0x00007f2bd3852e92 in XUnivarSet::ExportUnivarTrees(int, TString*, char const*, std::basic_ofstream<char, std::char_traits<char> >&, char const*) () from /home/winks/R_HOME/library/xps/libs/xps.so
#6  0x00007f2bd377828d in XTreeSet::ExportTree(char const*, int, TString*, char const*, std::basic_ofstream<char, std::char_traits<char> >&, char const*) () from /home/winks/R_HOME/library/xps/libs/xps.so
#7  0x00007f2bd377cbe8 in XTreeSet::ExportTrees(char const*, char const*, std::basic_ofstream<char, std::char_traits<char> >&, char const*) () from /home/winks/R_HOME/library/xps/libs/xps.so
#8  0x00007f2bd377bcbc in XManager::Export(char const*, char const*, char const*, char const*) () from /home/winks/R_HOME/library/xps/libs/xps.so
#9  0x00007f2bd3859034 in ExportData () from /home/winks/R_HOME/library/xps/libs/xps.so
#10 0x00007f2bd8ffe7ba in ?? () from /usr/lib/R/lib/libR.so
#11 0x00007f2bd903668d in Rf_eval () from /usr/lib/R/lib/libR.so
#12 0x00007f2bd903bc03 in ?? () from /usr/lib/R/lib/libR.so
#13 0x00007f2bd911a0cc in ?? () from /usr/lib/R/lib/libR.so
#14 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#15 0x00007f2bd9038220 in ?? () from /usr/lib/R/lib/libR.so
#16 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#17 0x00007f2bd90383a0 in ?? () from /usr/lib/R/lib/libR.so
#18 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#19 0x00007f2bd903988d in Rf_applyClosure () from /usr/lib/R/lib/libR.so
#20 0x00007f2bd9036350 in Rf_eval () from /usr/lib/R/lib/libR.so
#21 0x00007f2bd90383a0 in ?? () from /usr/lib/R/lib/libR.so
#22 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#23 0x00007f2bd9039157 in ?? () from /usr/lib/R/lib/libR.so
#24 0x00007f2bd90394a5 in R_execMethod () from /usr/lib/R/lib/libR.so
#25 0x00007f2bd6399047 in ?? () from /usr/lib/R/library/methods/libs/methods.so
#26 0x00007f2bd908b1e3 in ?? () from /usr/lib/R/lib/libR.so
#27 0x00007f2bd90365c9 in Rf_eval () from /usr/lib/R/lib/libR.so
#28 0x00007f2bd903988d in Rf_applyClosure () from /usr/lib/R/lib/libR.so
#29 0x00007f2bd9036350 in Rf_eval () from /usr/lib/R/lib/libR.so
#30 0x00007f2bd9038220 in ?? () from /usr/lib/R/lib/libR.so
#31 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#32 0x00007f2bd90383a0 in ?? () from /usr/lib/R/lib/libR.so
#33 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#34 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#35 0x00007f2bd90383a0 in ?? () from /usr/lib/R/lib/libR.so
#36 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#37 0x00007f2bd903988d in Rf_applyClosure () from /usr/lib/R/lib/libR.so
#38 0x00007f2bd9036350 in Rf_eval () from /usr/lib/R/lib/libR.so
#39 0x00007f2bd9038220 in ?? () from /usr/lib/R/lib/libR.so
#40 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#41 0x00007f2bd90730a3 in Rf_ReplIteration () from /usr/lib/R/lib/libR.so
#42 0x00007f2bd9073330 in ?? () from /usr/lib/R/lib/libR.so
#43 0x00007f2bd90733c0 in run_Rmainloop () from /usr/lib/R/lib/libR.so
#44 0x000000000040078b in main ()
#45 0x00007f2bd898a76d in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#46 0x00000000004007bd in _start ()
===========================================================

The lines below might hint at the cause of the crash.
If they do not help you then please submit a bug report at
http://root.cern.ch/bugs. Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#5  0x00007f2bd3852e92 in XUnivarSet::ExportUnivarTrees(int, TString*, char const*, std::basic_ofstream<char, std::char_traits<char> >&, char const*) () from /home/winks/R_HOME/library/xps/libs/xps.so
#6  0x00007f2bd377828d in XTreeSet::ExportTree(char const*, int, TString*, char const*, std::basic_ofstream<char, std::char_traits<char> >&, char const*) () from /home/winks/R_HOME/library/xps/libs/xps.so
#7  0x00007f2bd377cbe8 in XTreeSet::ExportTrees(char const*, char const*, std::basic_ofstream<char, std::char_traits<char> >&, char const*) () from /home/winks/R_HOME/library/xps/libs/xps.so
#8  0x00007f2bd377bcbc in XManager::Export(char const*, char const*, char const*, char const*) () from /home/winks/R_HOME/library/xps/libs/xps.so
#9  0x00007f2bd3859034 in ExportData () from /home/winks/R_HOME/library/xps/libs/xps.so
#10 0x00007f2bd8ffe7ba in ?? () from /usr/lib/R/lib/libR.so
#11 0x00007f2bd903668d in Rf_eval () from /usr/lib/R/lib/libR.so
#12 0x00007f2bd903bc03 in ?? () from /usr/lib/R/lib/libR.so
#13 0x00007f2bd911a0cc in ?? () from /usr/lib/R/lib/libR.so
#14 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#15 0x00007f2bd9038220 in ?? () from /usr/lib/R/lib/libR.so
#16 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#17 0x00007f2bd90383a0 in ?? () from /usr/lib/R/lib/libR.so
#18 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#19 0x00007f2bd903988d in Rf_applyClosure () from /usr/lib/R/lib/libR.so
#20 0x00007f2bd9036350 in Rf_eval () from /usr/lib/R/lib/libR.so
#21 0x00007f2bd90383a0 in ?? () from /usr/lib/R/lib/libR.so
#22 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#23 0x00007f2bd9039157 in ?? () from /usr/lib/R/lib/libR.so
#24 0x00007f2bd90394a5 in R_execMethod () from /usr/lib/R/lib/libR.so
#25 0x00007f2bd6399047 in ?? () from /usr/lib/R/library/methods/libs/methods.so
#26 0x00007f2bd908b1e3 in ?? () from /usr/lib/R/lib/libR.so
#27 0x00007f2bd90365c9 in Rf_eval () from /usr/lib/R/lib/libR.so
#28 0x00007f2bd903988d in Rf_applyClosure () from /usr/lib/R/lib/libR.so
#29 0x00007f2bd9036350 in Rf_eval () from /usr/lib/R/lib/libR.so
#30 0x00007f2bd9038220 in ?? () from /usr/lib/R/lib/libR.so
#31 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#32 0x00007f2bd90383a0 in ?? () from /usr/lib/R/lib/libR.so
#33 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#34 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#35 0x00007f2bd90383a0 in ?? () from /usr/lib/R/lib/libR.so
#36 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#37 0x00007f2bd903988d in Rf_applyClosure () from /usr/lib/R/lib/libR.so
#38 0x00007f2bd9036350 in Rf_eval () from /usr/lib/R/lib/libR.so
#39 0x00007f2bd9038220 in ?? () from /usr/lib/R/lib/libR.so
#40 0x00007f2bd903646f in Rf_eval () from /usr/lib/R/lib/libR.so
#41 0x00007f2bd90730a3 in Rf_ReplIteration () from /usr/lib/R/lib/libR.so
#42 0x00007f2bd9073330 in ?? () from /usr/lib/R/lib/libR.so
#43 0x00007f2bd90733c0 in run_Rmainloop () from /usr/lib/R/lib/libR.so
#44 0x000000000040078b in main ()
#45 0x00007f2bd898a76d in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#46 0x00000000004007bd in _start ()
===========================================================

winks@ubuntu:/media/Working Drive/Promotie/mArray biomarker study/Lisa Paper/selection of genes$

Thanks in advance,
Steven Wink
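The selection step described in this post (picking rows for a list of gene symbols out of the full results table) can be sketched in base R as follows. This is only a minimal illustration: the data frame is a made-up stand-in for the export.filter output, `PValue` is a simplified column name, and the symbols in `genes_of_interest` are hypothetical placeholders for the 311 genes.

```r
# Toy stand-in for the exported results table (values and symbols are made up).
res <- data.frame(
  UnitName   = c("ps_1", "ps_2", "ps_3", "ps_4"),
  GeneSymbol = c("GENE_A", "GENE_B", "GENE_A", "GENE_C"),
  FoldChange = c(1.10, 0.95, 1.30, 1.02),
  PValue     = c(0.20, 0.70, 0.04, 0.90),
  stringsAsFactors = FALSE
)

# Hypothetical list of gene symbols to keep.
genes_of_interest <- c("GENE_A", "GENE_C")

# Keep every probeset whose symbol is in the list; one symbol can match
# several probesets, so the result may have more rows than symbols.
subset_res <- res[res$GeneSymbol %in% genes_of_interest, ]
print(subset_res)
```

Matching with `%in%` keeps all probesets that map to a listed symbol, which seems to be what is wanted here since several probesets on the array share one gene symbol.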
Annotation probe xps • 1.6k views
cstrato ★ 3.9k
@cstrato-908
Last seen 7.1 years ago
Austria
Dear Steven,

Since I cannot reproduce your problem (see below), could you please supply:
- sessionInfo()
- version of ROOT
- version of Affymetrix annotation file
- your complete code

Here is what I have just done without experiencing any problems:

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Tissues from Affymetrix Exon Array Dataset for HG-U133_Plus_2
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

### create ROOT scheme files for ivt expression arrays ###

### new R session: load library xps
library(xps)

### define directories:
# directory containing Affymetrix library files
libdir <- "/Volumes/GigaDrive/Affy/libraryfiles"
# directory containing Affymetrix annotation files
anndir <- "/Volumes/GigaDrive/Affy/Annotation"
# directory to store ROOT scheme files
scmdir <- "/Volumes/GigaDrive/CRAN/Workspaces/Schemes"

# HG-U133_Plus_2:
scheme.hgu133plus2.na32 <- import.expr.scheme("hgu133plus2",
    filedir = file.path(scmdir, "na32"),
    schemefile = file.path(libdir, "HG-U133_Plus_2.CDF"),
    probefile = file.path(libdir, "HG-U133-PLUS_probe.tab"),
    annotfile = file.path(anndir, "Version11Jul", "HG-U133_Plus_2.na32.annot.csv"))

### HG-U133_Plus_2 data: import raw data ###

### new R session: load library xps
library(xps)

### define directories:
# directory of ROOT scheme files
scmdir <- "/Volumes/GigaDrive/CRAN/Workspaces/Schemes/na32"
# directory containing Tissues CEL files
celdir <- "/Volumes/GigaDrive/ChipData/Exon/HuMixture"
# directory to store ROOT raw data files
datdir <- "/Volumes/GigaDrive/CRAN/Workspaces/BreastProstate"

# first, import ROOT scheme file
scheme.u133p2 <- root.scheme(paste(scmdir,"hgu133plus2.root",sep="/"))

# subset of CEL files to import
celfiles <- c("u1332plus_ivt_breast_A.CEL","u1332plus_ivt_breast_B.CEL","u1332plus_ivt_breast_C.CEL",
              "u1332plus_ivt_prostate_A.CEL","u1332plus_ivt_prostate_B.CEL","u1332plus_ivt_prostate_C.CEL")
# rename CEL files
celnames <- c("BreastA","BreastB","BreastC","ProstateA","ProstateB","ProstateC")
# import CEL files
data.mix.u133p2 <- import.data(scheme.u133p2, "BrPrU133P2",
    filedir=datdir,celdir=celdir,celfiles=celfiles,celnames=celnames)

### preprocess raw data ###

### new R session: load library xps
library(xps)

### first, load ROOT scheme file and ROOT data file
scmdir <- "/Volumes/GigaDrive/CRAN/Workspaces/Schemes/na32"
scheme.u133p2 <- root.scheme(paste(scmdir,"hgu133plus2.root",sep="/"))
datdir <- "/Volumes/MitziData/CRAN/Workspaces/BreastProstate"
data.u133p2 <- root.data(scheme.u133p2, paste(datdir,"BrPrU133P2_cel.root",sep="/"))

### RMA
data.rma <- rma(data.u133p2,"BrPrU133P2RMA",tmpdir="",background="pmonly",normalize=TRUE)

# get data.frames
expr.rma <- validData(data.rma)

# export expression data
export.expr(data.rma, treename = "*", treetype = "mdp", varlist = "*",
    outfile = "BreastProstateRMAU133P2.txt", sep = "\t", as.dataframe = FALSE, verbose = TRUE)

### apply univariate filters ###

### new R session: load library xps
library(xps)

# create UniFilter
unifltr <- UniFilter(unitest=c("t.test", "two.sided", "BH", 0, 0.0, FALSE, 0.95, TRUE))

# apply unifilter
rma.ufr <- unifilter(data.rma, "BrPrU133P2Unifilter", getwd(), unifltr,
    group=c("GrpA","GrpA","GrpA", "GrpB","GrpB","GrpB"))

export.filter(rma.ufr, treename = "*", treetype = "stt",
    varlist = "fUnitName:fName:fSymbol:fc:pval:flag",
    outfile = "UniFltr.txt", sep = "\t", as.dataframe = FALSE, verbose = TRUE)

tmp <- validData(rma.ufr, which="UnitName")

tmp <- export.filter(rma.ufr, treename = "*", treetype = "stt",
    varlist = "fUnitName:fName:fSymbol:fc:pval:flag",
    as.dataframe = TRUE, verbose = TRUE)

Best regards,
Christian
_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a A.u.s.t.r.i.a
e.m.a.i.l: cstrato at aon.at
_._._._._._._._._._._._._._._._._._

On 8/16/12 12:25 PM, steven wink wrote:
> Dear xps user/developer,
>
> My goal is to obtain a dataframe with all the statistics (fold changes,
> p-values etc.) and annotation data (gene symbols etc.). I need this dataframe
> to select a subset of 311 genes based on gene symbols.
>
> The issue I have is as follows (I use the xps.pdf vignette variable names to
> make it easier to follow):
>
> 1) root scheme loaded for hgu133plus2
> 2) cel files imported (just 4 of them: 2 vehicle, 2 treated)
> 3) rma probe set normalized (data.rma)
> 4) filter operation: unifltr<-UniFilter and: uniTest(unifltr) <-
> c("t.test","two.sided","BH",0,0.0,FALSE,0.95,TRUE) for multiple testing
> correction (adjp)
> 5) this dataframe is fine: tmp <- validData(rma.ufr)
>
> adding phenotype data is the problem:
> 6)
>
> tmp <- export.filter(rma.ufr, treetype="stt",
>                      varlist="fUnitName:fName:fSymbol:fc:pval:flag",
>                      as.dataframe=TRUE, verbose=FALSE)
>
> The resulting data.frame, both in R memory and when I write it to disk, is
> messed up in certain rows (it clumps a lot of rows into some rows in the 4th
> column) and then clumps columns into 1 column at the end of the file.
> It is a bit hard to explain exactly, as I don't see the logic at the moment.
>
> So then I tried removing "fUnitName" from the varlist, because it seems
> like unit_ID and unit_name are kind of redundant, and I get a crash
> (reproduced 3 times with the code below). I don't know if this is
> relevant to the messed up data.frame, but I added it just in case.
>
> tmp<-export.filter(rma.ufr,treetype="stt",varlist="fName:fSymbol:fc:pval:flag",as.dataframe=TRUE,verbose=FALSE)
>
> *** Break *** segmentation violation
>
> Thanks in advance,
> Steven Wink
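For readers unfamiliar with what the unitest settings c("t.test", "two.sided", "BH", ..., 0.95, TRUE) ask for, the following base-R sketch computes a per-probeset two-sided, equal-variance t-test with Benjamini-Hochberg adjustment on a toy matrix. It is not the xps implementation: the data are random numbers, and the fold-change definition used here (ratio of group means on the linear scale) is an assumption made only for illustration.

```r
# Toy log2 expression matrix: 5 probesets x 6 samples (random, not real data).
set.seed(1)
expr <- matrix(rnorm(30, mean = 8), nrow = 5,
               dimnames = list(paste0("probeset_", 1:5), NULL))
group <- factor(c("GrpA", "GrpA", "GrpA", "GrpB", "GrpB", "GrpB"))

# Two-sided t-test per row, assuming equal variances in both groups.
pval <- apply(expr, 1, function(x) {
  t.test(x[group == "GrpA"], x[group == "GrpB"],
         alternative = "two.sided", var.equal = TRUE,
         conf.level = 0.95)$p.value
})

# Benjamini-Hochberg adjustment of the raw p-values.
padj <- p.adjust(pval, method = "BH")

# Assumed fold-change definition: GrpB mean over GrpA mean, linear scale.
fc <- rowMeans(2^expr[, group == "GrpB"]) / rowMeans(2^expr[, group == "GrpA"])

res <- data.frame(UnitName = rownames(expr), fc = fc, pval = pval, padj = padj)
print(res)
```

With only two or three arrays per group, as in this thread, such tests have very little power, so the adjusted p-values are best read as a rough ranking.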
cstrato ★ 3.9k
@cstrato-908
Last seen 7.1 years ago
Austria
Dear Steven,

It seems that adding fName does indeed cause problems, probably because the Affymetrix gene annotation contains characters which interfere with reading the table into an R data.frame. However, adding fSymbol only is fine. Here is my example:

> tmp <- validData(rma.ufr, which="UnitName")
> dim(tmp)
[1] 54675 9

> tmp <- export.filter(rma.ufr, treename = "*", treetype = "stt", varlist = "fUnitName:fc:pval:flag", as.dataframe = TRUE, verbose = TRUE)
Warning: tree <unitest.ufr> does not exist or has not <54675> entries.
> dim(tmp)
[1] 54675 5

> tmp <- export.filter(rma.ufr, treename = "*", treetype = "stt", varlist = "fUnitName:fSymbol:fc:pval:flag", as.dataframe = TRUE, verbose = TRUE)
Warning: tree <unitest.ufr> does not exist or has not <54675> entries.
Reading entries from <hg-u133_plus_2.ann> ...Finished
> dim(tmp)
[1] 54675 6

> tmp <- export.filter(rma.ufr, treename = "*", treetype = "stt", varlist = "fUnitName:fName:fSymbol:fc:pval:flag", as.dataframe = TRUE, verbose = TRUE)
Warning: tree <unitest.ufr> does not exist or has not <54675> entries.
Reading entries from <hg-u133_plus_2.ann> ...Finished
> dim(tmp)
[1] 28298 7

As you see, everything is fine as long as fName is not included. (You can ignore the warning.)

As you have already seen, when you simply export the results, you will get the complete output:

> export.filter(rma.ufr, treename = "*", treetype = "stt", varlist = "fUnitName:fName:fSymbol:fc:pval:flag", outfile = "UniFltr.txt", sep = "\t", as.dataframe = FALSE, verbose = TRUE)

If you try to import the table "UniFltr.txt" as a data.frame, you get the same problem:

> tmp <- read.table("UniFltr.txt", header=TRUE, row.names=NULL, sep="\t", check.names=FALSE, stringsAsFactors=FALSE, comment.char="")
> dim(tmp)
[1] 28298 7

You see that even setting read.table(...,comment.char="",...) does not help. Thus the only option is to use fSymbol only, but not fName.

Since you only need the gene symbols, there is no need to use hgu133plus2.db, as my example above has shown.

I hope this helps.

Best regards,
Christian

On 8/17/12 3:05 PM, steven wink wrote:
> Dear Christian,
>
> Thank you for your reply. I noticed the problem occurs when loading the
> data in R memory as a dataframe. Writing to a file directly: no problem.
> Loading from file to R memory it gets messed up again. The dataframe
> without the gene annotation is fine.
> I wanted the full set in a data.frame to select a set of genes with
> foldchanges etc. For now I used the bioc package hgu133plus2.db instead
> of using xps: export.filter to attach gene symbols.
>
> I saw one warning:
> Warning: tree <unitest.ufr> does not exist or has not <54675>
> entries.Reading entries from <hg-u133_plus_2.ann> ...Finished
>
> I also followed the code in your email. Same effect.
>
> root version
> Version 5.32/04 13 July 2012
>
> xps 1.16.0
> =
> netaffx-annotation-date=2011-06-22
> =
> sessionInfo()
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=nl_NL.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=nl_NL.UTF-8 LC_COLLATE=nl_NL.UTF-8
> [5] LC_MONETARY=nl_NL.UTF-8 LC_MESSAGES=nl_NL.UTF-8
> [7] LC_PAPER=C LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] rjson_0.2.9 hgu133plus2.db_2.7.1 org.Hs.eg.db_2.7.1
> [4] RSQLite_0.11.1 DBI_0.2-5 AnnotationDbi_1.18.1
> [7] Biobase_2.16.0 BiocGenerics_0.2.0 xps_1.16.0
>
> loaded via a namespace (and not attached):
> [1] IRanges_1.14.4 probemapper_1.0.0 stats4_2.15.1 tools_2.15.1
> =
>
> importing the previously created scheme file:
>
> scmdir<-"/media/Working Drive/Promotie/mArray biomarker study/Schemes/na32"
> scheme<-root.scheme(file.path(scmdir,"hgu133plus2.root"))
> class(scheme)
> [1] "SchemeTreeSet"
> attr(,"package")
> [1] "xps"
>
> library(xps)
> celdir<-file.path("/media/Working Drive/Promotie/mArray biomarker
> study/Lisa Paper/carbamazepine.Human.in_vitro.Liver/24 low")
> celfiles<-c("carbamazepine_ctrl_24h_1.CEL","carbamazepine_ctrl_24h_2.CEL","carbamazepine_l_24h_1.CEL","carbamazepine_l_24h_2.CEL")
> data.carb<-import.data(scheme,"tmpdt_dataCarb",celdir=celdir,celfiles=celfiles,verbose=FALSE)
> data.rma<-rma(data.carb,"tmpdt_rma",verbose=FALSE)
> png("pca_carb_24h_low_PHH.png",width=800,height=800)
>
> pcaplot(data.rma,group=c("ctrl","ctrl","l24h","l24h"),add.labels=TRUE,add.legend=TRUE)
>
> dev.off()
> null device
>
> unifltr<-UniFilter(foldchange=c(1.2,"both"),unifilter=c(0.1,"pval"))
> uniTest(unifltr) <- c("t.test","two.sided","BH",0,0.0,FALSE,0.95,TRUE)
>
> str(unifltr)
> Formal class 'UniFilter' [package "xps"] with 5 slots
> ..@ foldchange:List of 2
> .. ..$ cutoff : num 1.2
> .. ..$ direction: chr "both"
> ..@ prescall : list()
> ..@ unifilter :List of 2
> .. ..$ cutoff : num 0.1
> .. ..$ variable: chr "pval"
> ..@ unitest :List of 8
> .. ..$ type : chr "t.test"
> .. ..$ alternative: chr "two.sided"
> .. ..$ correction : chr "BH"
> .. ..$ numperm : int 0
> .. ..$ mu : num 0
> .. ..$ paired : logi FALSE
> .. ..$ conflevel : num 0.95
> .. ..$ varequ : logi TRUE
> ..@ numfilters: num 2
>
> rma.ufr<-unifilter(data.rma,"tmpdt_rmaufr5",getwd(),unifltr,
> group=c("ctrl","ctrl","l24h","l24h"),verbose=FALSE)
>
> tmp<-export.filter(rma.ufr,treetype="stt",varlist="fUnitName:fName:fSymbol:fc:pval:flag",
> as.dataframe=TRUE,verbose=FALSE)
>
> tmp[3670,]
> UNIT_ID UnitName GeneName GeneSymbol P-Value FoldChange Flag
> 3670 5641 206054_at kininogen 1 KNG1 0.369156 1.05887 0
>
> Below I include a small portion of the output from row 3671. (In my
> text file some blocks/multiple rows of character strings are quoted,
> while other blocks contain strings which are not.)
>
> tmp[3671, ]
> 72\t0\n8438\t208882_s_at\tubiquitin protein ligase E3 component
> n-recognin 5\tUBR5\t0.874644\t1.00482\t0\n8439\t208883_at\tubiquitin
> protein ligase E3 component n-recognin
> 5\tUBR5\t0.770298\t0.967776\t0\n8440\t208884_s_at\tubiquitin protein
> ligase E3 component n-recognin
> 5\tUBR5\t0.253671\t1.04481\t0\n8441\t208885_at\tlymphocyte cytosolic
> protein 1 (L-plastin)\tLCP1\t0.977804\t1.00235\t0\n8442\t208886_at\tH1
> histone family, member
> 0\tH1F0\t0.500673\t0.955601\t0\n8443\t208887_at\teukaryotic translation
> initiation factor 3, subunit
> G\tEIF3G\t0.0389222\t1.05615\t0\n8444\t208888_s_at\tnuclear receptor
> corepressor 2\tNCOR2\t0.372557\t1.22396\t0\n8445\t208889_s_at\tnuclear
> receptor corepressor
> 2\tNCOR2\t0.272628\t1.0801\t0\n8446\t208890_s_at\tplexin
> B2\tPLXNB2\t0.0275446\t1.0633\t0\n8447\t208891_at\tdual specificity
> phosphatase 6\tDUSP6\t0.837928\t0.981868\t0\n8448\t208892_s_at\tdual
> specificity phosphatase
> 6\tDUSP6\t0.491362\t0.932852\t0\n8449\t208893_s_at\tdual specificity
> phosphatase 6\tDUSP6\t0.359966\t0.964169\t0\n8450\t208894_at\tmajor
> histocompatibility complex, class II, DR
> alpha\tHLA-DRA\t0.9963\t0.998907\t0\n8451\t208895_s_at\tDEAD
> (Asp-Glu-Ala-Asp) box polypeptide
> 18\tDDX18\t0.265118\t1.03538\t0\n8452\t208896_at\tDEAD (Asp-Glu-Ala-Asp)
> box polypeptide
> 18\tDDX18\t0.124475\t0.955801\t0\n8453\t208897_s_at\tDEAD
> (Asp-Glu-Ala-Asp) box polypeptide
> 18\tDDX18\t0.11651\t1.05331\t0\n8454\t208898_at\tATPase, H+
> transporting, lysosomal 34kDa, V1 subunit
> D\tATP6V1D\t0.118903\t0.970489\t0\n8455\t208899_x_at\tATPase, H+
> transporting, lysosomal 34kDa, V1 subunit
> D\tATP6V1D\t0.738698\t0.995617\t0\n8456\t208900_s_at\ttopoisomerase
> (DNA) I\tTOP1\t0.660803\t0.965493\t0\n8457\t208901_s_at\ttopoisomerase
> (DNA) I\tTOP1\t0.140795\t0.963365\t0\n8458\t208902_s_at\tribosomal
> protein S28\tRPS28\t0.939403\t1.00859\t0\n8459\t208903_at\tribosomal
> protein S28\tRPS28\t0.720425\t0.953657\t0\n8460\t208904_s_at\tribosomal
> protein S28\tRPS28\t0.118968\t0.958277\t0\n8461\t208905_at\tcytochrome
> c,
> somatic\tCYCS\t0.801304\t0.993872\t0\n8462\t208906_at\tBerardinelli-Seip
> congenital lipodystrophy 2
> (seipin)\tBSCL2\t0.156978\t1.05388\t0\n8463\t208907_s_at\tmitochondrial
> ribosomal protein
> S18B\tMRPS18B\t0.113534\t1.02395\t0\n8464\t208908_s_at\tcalpastatin\tCAST\t0.067195\t1.03521\t0\n8465\t208909_at\tubiquinol-cytochrome
> c reductase, Rieske iron-sulfur polypeptide
> 1\tUQCRFS1\t0.570856\t1.0093\t0\n8466\t208910_s_at\tcomplement component
> 1, q subcomponent binding
> protein\tC1QBP\t0.0812231\t0.958867\t0\n8467\t208911_s_at\tpyruvate
> dehydrogenase (lipoamide)
> beta\tPDHB\t0.351079\t1.01438\t0\n8468\t208912_s_at\t2,3-cyclic
> nucleotide 3 phosphodiesterase
> GeneSymbol P-Value FoldChange Flag
> 3671 CNP 0.886081 0.991868 0
>
> ================================
> following Christian's mail
> ================================
> datdir<-"/media/Working Drive/Promotie/mArray biomarker study/Lisa
> Paper/selection of genes/carb data"
>
> data.carb<-import.data(scheme,"dataCarbamazepine",filedir=datdir,celdir=celdir,
> celfiles=celfiles)
>
> data.rma<-rma(data.carb,"dataCarbamazepineRMA",tmpdir="",background="pmonly",normalize
> = TRUE)
> expr.rma<-validData(data.rma)
> export.expr(data.rma, treename="*", treetype="mdp",varlist ="*",
> outfile="24hoursLow_carb_PHH.txt", sep="\t", as.dataframe=FALSE, verbose
> = TRUE)
> unifltr <- UniFilter(unitest=c("t.test", "two.sided", "BH", 0, 0.0,
> FALSE, 0.95, TRUE))
> rma.ufr<-unifilter(data.rma,"dataCarbUnifilter",getwd(),unifltr,
> group=c("ctrl","ctrl","l24h","l24h"))
> export.filter(rma.ufr, treename="*",treetype="stt", varlist =
> "fUnitName:fName:fSymbol:fc:pval:flag", outfile = "UniFltr.txt", sep =
> "\t", as.dataframe = FALSE, verbose = TRUE)
> =verbose=
> Opening file </media> study/Schemes/na32/hgu133plus2.root> in <read> mode...
> Opening file </media> Paper/selection of genes/dataCarbUnifilter_ufr.root> in <read> mode...
> Opening file </media> Paper/selection of genes/dataCarbUnifilter_ufr.root> in <read> mode...
> Exporting data from tree <*> to file <unifltr.txt>...
> Warning: tree <unitest.ufr> does not exist or has not <54675>
> entries.Reading entries from <hg-u133_plus_2.ann> ...Finished
> NULL
>
> ==
>
> tmp<-validData(rma.ufr, which="UnitName")
> tmp <- export.filter(rma.ufr, treename = "*", treetype = "stt",
> varlist = "fUnitName:fName:fSymbol:fc:pval:flag", as.dataframe = TRUE,
> verbose = TRUE)
>
> Having done this the text file "UniFltr.txt" looks fine. However:
> test<-read.table("UniFltr.txt",sep="\t",header=TRUE)
>
> test[3671,2]
> [1] 206055_s_at
> 28298 Levels: 1007_s_at 1053_at 117_at 1553075_a_at 1553995_a_at ...
> AFFX-TrpnX-M_at
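One plausible cause of the clumped rows, not confirmed anywhere in this thread, is R's default quoting: read.table treats both " and ' as quote characters, and gene names such as 2',3'-cyclic nucleotide 3' phosphodiesterase (the CNP entry where the dump above stops) contain apostrophes. The sketch below reproduces that effect on a small made-up file and shows that quote = "" reads it row by row. Apart from the CNP name, every row is hypothetical, and this has not been tested against real xps output.

```r
# Toy tab-separated file; two gene names contain apostrophes.
tmpfile <- tempfile(fileext = ".txt")
writeLines(c(
  "UNIT_ID\tUnitName\tGeneName\tGeneSymbol",
  "1\tps_1\t2',3'-cyclic nucleotide 3' phosphodiesterase\tCNP",
  "2\tps_2\thypothetical protein B\tGENE_B",
  "3\tps_3\thypothetical 5'-end binding protein\tGENE_C",
  "4\tps_4\thypothetical protein D\tGENE_D"
), tmpfile)

# Default quoting: an apostrophe opens a quoted string that can run across
# tabs and newlines until the next apostrophe, so rows get merged.
bad <- suppressWarnings(
  read.table(tmpfile, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
)

# Quoting disabled: every line of the file becomes one row.
good <- read.table(tmpfile, header = TRUE, sep = "\t", quote = "",
                   comment.char = "", stringsAsFactors = FALSE)

cat("rows with default quote:", nrow(bad), "\n")
cat("rows with quote = \"\":  ", nrow(good), "\n")
```

If this is what happens with the fName column, reading the exported "UniFltr.txt" with quote = "" might be an alternative to dropping fName, although the as.dataframe = TRUE path inside export.filter would still be affected.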
cstrato ★ 3.9k
@cstrato-908
Last seen 7.1 years ago
Austria
Dear Steven,

Good to hear that my workaround could solve your problem.

RMA normalization with about 2000 microarrays should be no problem, since some years ago one user did RMA with all 23000 HGU133_Plus2 arrays from GEO. It took about one week and used about 2.5-3 GB RAM.

However, I can give you the same suggestion as I gave the former user, i.e. do RMA stepwise, as shown in my example script "script4xps.R":

# first, load ROOT scheme file and ROOT data file
scheme.test3 <- root.scheme(paste(.path.package("xps"),"schemes/SchemeTest3.root",sep="/"))
data.test3 <- root.data(scheme.test3, paste(.path.package("xps"),"rootdata/DataTest3_cel.root",sep="/"))

# 1.step: background - rma
data.bg.rma <- bgcorrect.rma(data.test3,"Test3RMABgrd",filedir=datdir)
# 2.step: normalization - quantile
data.qu.rma <- normalize.quantiles(data.bg.rma,"Test3RMANorm",filedir=datdir)
# 3.step: summarization - medpol
data.mp.rma <- summarize.rma(data.qu.rma,"Test3RMAExpr",filedir=datdir,tmpdir="")

If one step fails, you do not need to start from the beginning. This code from my script is for the Test3 array, so you have to modify it for your HG-U133_Plus_2 arrays.

Please note that if you do stepwise computation you must not define a "tmpdir" for the background step and the normalization step, since the trees would then be saved in a temporary file and the resulting root files would be empty. Only for the summarization step is it allowed to define a "tmpdir".

Furthermore, in the normalization step I would do:
normalize.quantiles(..., add.data = FALSE)
(This may also be necessary in the summarization step, but hopefully not.)

Finally, I would suggest testing your code first with 6 CEL files only.

Please let me know of your further progress.

Best regards,
Christian

On 8/21/12 9:35 AM, steven wink wrote:
> Dear Christian,
>
> That does indeed solve it for me, thank you again for your help.
>
> In the next few days I plan to use the xps rma function on about 2000
> microarrays. Do you foresee any problems or have any advice on this? I
> assume it will take several days? What would be the bottleneck when
> using xps? If it is processor speed, is there a user-friendly way to use
> all 4 of my processors in parallel?
>
> Kind regards
> Steven
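The point that a failed step should not force a restart from the beginning can be illustrated with a generic checkpointing pattern in base R. This is only a sketch under stated assumptions: `run_step` is a hypothetical helper, and plain R objects stored with saveRDS stand in for the xps steps, which keep their results in ROOT files instead.

```r
# Hypothetical helper: run `fun` only if its result was not saved before.
run_step <- function(name, fun, outdir) {
  outfile <- file.path(outdir, paste0(name, ".rds"))
  if (file.exists(outfile)) {
    return(readRDS(outfile))   # reuse the result of an earlier run
  }
  result <- fun()
  saveRDS(result, outfile)     # checkpoint for later runs
  result
}

outdir <- tempfile("steps_")
dir.create(outdir)

# Count how often the (pretend) expensive step is really computed.
calls <- 0
background_step <- function() {
  calls <<- calls + 1
  seq_len(10)                  # placeholder for a long computation
}

bg  <- run_step("background", background_step, outdir)
bg2 <- run_step("background", background_step, outdir)  # served from disk
cat("background step computed", calls, "time(s)\n")
```

With three stages (background, normalization, summarization) wrapped this way, rerunning the script after a failure in the last stage would only repeat that stage.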