xps: root.profile on a subset of the data
Daniel Brewer ★ 1.9k
@daniel-brewer-1791
Last seen 7.1 years ago
Hello,

I am using xps to do some quality control on an Affymetrix exon array experiment. I am trying to use the ROOT graphics to plot density boxplots of the raw intensities (using root.profile). The problem is that there are too many arrays to look reasonable on one plot. Is there a way to split up the dataset into smaller pieces and plot them?

Thanks

Dan

--
Daniel Brewer, Ph.D.
Institute of Cancer Research
Molecular Carcinogenesis
Email: daniel.brewer at icr.ac.uk
Cancer xps • 735 views
cstrato ★ 3.9k
@cstrato-908
Last seen 3.0 years ago
Austria
Dear Daniel,

You can simply use parameter "treename" to plot only a subset of trees, see "?root.profile".

Best regards
Christian
_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a A.u.s.t.r.i.a
e.m.a.i.l: cstrato at aon.at
_._._._._._._._._._._._._._._._._._

On 8/20/10 4:47 PM, Daniel Brewer wrote:
> Is there a way to split up the dataset into smaller pieces and plot them?
Hi Christian,

I tried that, but it kicked up an error and only plotted one boxplot. It was as if "treename" could only take one value. Maybe I was doing something wrong. I will have another go.

Dan

On 20/08/2010 4:06 PM, cstrato wrote:
> You can simply use parameter "treename" to plot only a subset of trees,
> see "?root.profile".
Dear Daniel,

Sorry, my mistake! As the help file says, you can use parameter "treename" only to draw different leaves for one tree. In your case you need to define a subset of your dataset, e.g.:

    # get scheme
    scheme.test3 <- root.scheme(paste(.path.package("xps"), "schemes/SchemeTest3.root", sep="/"))

    # import all CEL-files
    celdir <- "/Volumes/CoreData/ROOT/rootdata/testAB/raw"
    data.test3 <- import.data(scheme.test3, "tmp_test3", celdir=celdir)

    # define subset of CEL-files
    subdata.test3 <- root.data(scheme.test3, rootFile(data.test3), c("TestA1","TestA2"))

    # apply root.profile to subset
    root.profile(subdata.test3)

Please let me know if this works for you.

Best regards
Christian

On 8/20/10 5:35 PM, Daniel Brewer wrote:
> I tried that, but it kicked up an error and only plotted one boxplot.
> It was like "treename" could only take one parameter.
Dear Daniel,

Sorry, my mistake again! After looking at my source code I realized that currently it is not possible to use a subset of trees only. Thus I have just uploaded to Bioconductor a new version "xps_1.8.3", which should solve the problem and will be available within the next 1-2 days. With that version you should be able to use parameter "treename" to plot only a subset of trees.

Please let me know if the new version solves your problem. I am especially interested to know how many treenames you can pass to function root.profile(), since there could be a limit on the number of characters you can pass to the ROOT macro.

Since you mention that there are too many arrays to look reasonable on one plot, you could also change parameter "w" from the default "w=800" to e.g. "w=4000", especially if you save the plot by setting e.g. "save.as='png'".

Best regards
Christian

On 8/20/10 5:35 PM, Daniel Brewer wrote:
> I tried that, but it kicked up an error and only plotted one boxplot.
> It was like "treename" could only take one parameter.
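As a rough illustration of the "treename" suggestion above, the sketch below splits the tree names into small groups and draws one profile plot per group. It assumes xps 1.8.3 or later, where "treename" is said to accept several names. The helpers `chunk_names`, `joined_nchar` and `profile_in_chunks` are hypothetical names, not part of xps, and the default group size of 10 is arbitrary. The joined length is reported only because the tree names appear to reach the ROOT macro as one ":"-separated string, which is an assumption based on the macro call printed in this thread.

```r
# Split a character vector into consecutive groups of at most `size` names.
chunk_names <- function(names, size = 10) {
  stopifnot(is.character(names), size >= 1)
  if (length(names) == 0) return(list())
  unname(split(names, ceiling(seq_along(names) / size)))
}

# Number of characters when the names are joined with ":"
# (assumed to be how they are handed to the ROOT macro).
joined_nchar <- function(names) nchar(paste(names, collapse = ":"))

# Hypothetical wrapper: one root.profile() call per group of trees.
# Extra arguments such as w = 4000 are passed on unchanged.
profile_in_chunks <- function(data, size = 10, ...) {
  samples <- unlist(xps::treeNames(data))
  for (group in chunk_names(samples, size)) {
    message("Plotting ", length(group), " trees (",
            joined_nchar(group), " characters when joined)")
    xps::root.profile(data, treename = group, ...)
  }
  invisible(NULL)
}

# The grouping step alone can be tried without any ROOT file:
groups <- chunk_names(c("A1.cel", "A2.cel", "B1.cel", "B2.cel", "C1.cel"), size = 2)
print(groups)
```

With a real dataset this might be called as `profile_in_chunks(rootData, size = 8, w = 4000)`. The messages give a first idea of how long the joined name string gets before a plot fails.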
Dear Christian,

Many thanks for making some code changes, that's great. Unfortunately I have tried it and it doesn't seem to work.

I create a list of the tree nodes:

    > samples <- unlist(treeNames(rootData))

Then check to see if I can use the root graphics:

    > root.image(rootData, treename=samples[1])
    > root.profile(rootData, treename=samples[1])

But when I try to do a root.profile with more in it I get an error:

    > root.profile(rootData, treename=samples[1:2])
    root [0]
    Processing /Users/dbrewer/Library/R/2.11/library/xps/rootsrc/macroDrawProfilePlot.C("/Users/dbrewer/Library/R/2.11/library/xps/libs/i386/xps.so","/Volumes/Datastore/ProstateCancerMap/QCFINAL/cancermapQC_cel.root","ProfilePlot","DataSet","0309_CoC(3)41_ExH_PRC133:0309_CoC(3)42_ExH_PRC134","cel","fInten","",0,0,1,1,1,800,600)...
    Warning in <TParallelCoord::TParallelCoord>: Call tree->SetEstimate(tree->GetEntries()) to display all the tree variables
    Error in <TTreeFormula::AnalyzeFunction>: We thought we had a function but we dont (in 0309_CoC(3)41_ExH_PRC133.fInten)
    Error in <TTreeFormula::Compile>: Bad numerical expression : "0309_CoC(3)41_ExH_PRC133.fInten"
    Warning in <TParallelCoord::AddVariable>: log(0309_CoC(3)41_ExH_PRC133.fInten) could not be evaluated
    Error in <TTreeFormula::AnalyzeFunction>: We thought we had a function but we dont (in 0309_CoC(3)42_ExH_PRC134.cel.fInten)
    Error in <TTreeFormula::Compile>: Bad numerical expression : "0309_CoC(3)42_ExH_PRC134.cel.fInten"
    Warning in <TParallelCoord::AddVariable>: log(0309_CoC(3)42_ExH_PRC134.cel.fInten) could not be evaluated

    *** Break *** bus error
    /Volumes/Datastore/ProstateCancerMap/QCFINAL/2278: No such file or directory.
    Attaching to process 2278.
    Reading symbols for shared libraries . done
    Reading symbols for shared libraries .......................................... done
    0x94bc7189 in wait4 ()

    ========== STACKS OF ALL THREADS ==========

    Thread 1 (process 2278 thread 0x10b):
    #0  0x94bc7189 in wait4 ()
    #1  0x94bc4cd4 in system$UNIX2003 ()
    #2  0x00906141 in TUnixSystem::StackTrace ()
    #3  0x00909ac5 in TUnixSystem::DispatchSignals ()
    #4  0x00909c38 in SigHandler ()
    #5  <signal handler called>
    #6  0x048bd989 in TParallelCoordEditor::SetModel ()
    #7  0x044ecf41 in TGedEditor::ConfigureGedFrames ()
    #8  0x044eda3a in TGedEditor::SetModel ()
    #9  0x04556eee in G__G__Ged_221_0_28 ()
    #10 0x01077e27 in Cint::G__CallFunc::Execute ()
    #11 0x008eec7f in TCint::CallFunc_Exec ()
    #12 0x0086b002 in TQConnection::ExecuteMethod ()
    #13 0x0086fd45 in TQObject::Emit ()
    #14 0x00429bce in TCanvas::Selected ()
    #15 0x048b1c62 in TParallelCoord::Draw ()
    #16 0x03b44fd9 in XPlot::DrawParallelCoord ()
    #17 0x03c9128f in G__xpsDict_564_0_22 ()
    #18 0x010750c2 in Cint::G__ExceptionWrapper ()
    #19 0x01148021 in G__execute_call ()
    #20 0x011484ed in G__call_cppfunc ()
    #21 0x0111bd4e in G__interpret_func ()
    #22 0x01106d7b in G__getfunction ()
    #23 0x0121b96b in G__getstructmem ()
    #24 0x0121161b in G__getvariable ()
    #25 0x010d3931 in G__getitem ()
    #26 0x010d6839 in G__getexpr ()
    #27 0x0117fc11 in G__exec_statement ()
    #28 0x0111df95 in G__interpret_func ()
    #29 0x011072f5 in G__getfunction ()
    #30 0x010d3a74 in G__getitem ()
    #31 0x010d6839 in G__getexpr ()
    #32 0x010e9c79 in G__calc_internal ()
    #33 0x0118e40c in G__process_cmd ()
    #34 0x008f21c4 in TCint::ProcessLine ()
    #35 0x008f0f1f in TCint::ProcessLineSynch ()
    #36 0x008388b0 in TApplication::ExecuteFile ()
    #37 0x0083759d in TApplication::ProcessLine ()
    #38 0x00031799 in TRint::Run ()
    #39 0x00001bae in main ()
    Root > Function macroDrawProfilePlot() busy flag cleared

installed.packages() indicates xps is version 1.8.3

Thanks

Dan

On 22/08/2010 6:49 PM, cstrato wrote:
> Thus I have just uploaded to Bioconductor a new version "xps_1.8.3"
> which should solve the problem, and will be available within the next
> 1-2 days. You should now be able to use parameter "treename" to plot
> only a subset of trees.

Dear Daniel,

The reason for the error messages you get is that the names of your CEL-files contain parentheses, which cause the ROOT C++ class TParallelCoord to interpret the names as formulas, since TParallelCoord allows you to pass an expression. In my C++ code I even take advantage of this possibility and pass "varexpr=log(celname)" if parameter "as.log=TRUE".

I am afraid that the only way to solve this problem is to change the names of the CEL-files when importing them as trees into the ROOT file; however, you do NOT need to change the names of the original CEL-files. Since I know that many CEL-files have strange names with many problematic characters (e.g. "42A#0214(12);06/23/99;MES-SA/Dx-batch#21089.CEL"), I have implemented the possibility to use more informative names as aliases when importing the CEL-files.
For example, to confirm the error messages you get, I have renamed two of the four test CEL-files (TestA1.CEL, TestA2.CEL, TestB1.CEL, TestB2.CEL) as follows:

    # first, import ROOT scheme file
    > scheme.test3 <- root.scheme(paste(.path.package("xps"), "schemes/SchemeTest3.root", sep="/"))

    # import CEL files with new names
    > celdir <- "/Volumes/CoreData/ROOT/rootdata/testAB/raw"
    > celnames <- c("TestA1", "TestA2", "0309_CoC(3)41_ExH_PRC133", "0309_CoC(3)42_ExH_PRC134")
    > data.test3 <- import.data(scheme.test3, "tmp_Test3", celdir=celdir, celnames=celnames)

    # this profile plot is ok
    > root.profile(data.test3, treename=c("TestA1.cel", "TestA2.cel"))

    # this profile plot reproduces the error messages you get
    > root.profile(data.test3, treename=c("0309_CoC(3)41_ExH_PRC133.cel", "0309_CoC(3)42_ExH_PRC134.cel"))

    # you can always extract the names of the original CEL-files
    > rawCELName(data.test3, fullpath=FALSE)
    [1] "TestA1.CEL" "TestA2.CEL" "TestB1.CEL" "TestB2.CEL"

I hope this helps you to understand the reason for the problem. My only advice is to change the names of the CEL-files during import (it is not possible to change the tree names once they are imported into a ROOT file).

Best regards
Christian

On 8/23/10 11:32 AM, Daniel Brewer wrote:
> Many thanks for making some code changes, thats great. Unfortunately I
> have tried it and it doesn't seem to work.
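The renaming advice above could be automated along the following lines. This is only a sketch: `safe_celnames` is a hypothetical helper (not part of xps) that replaces every run of characters other than letters, digits and underscores with a single underscore, on the assumption that such names are no longer read as formulas by TParallelCoord. The thread only establishes that parentheses are a problem, so whether the other characters (or a leading digit) also need changing is untested here.

```r
# Build alias names that contain only letters, digits and underscores.
# Hypothetical helper; the choice of "_" as replacement is arbitrary.
safe_celnames <- function(files) {
  stopifnot(is.character(files))
  base <- sub("\\.cel$", "", files, ignore.case = TRUE)  # drop the extension
  safe <- gsub("[^A-Za-z0-9_]+", "_", base)              # collapse unsafe runs
  safe <- gsub("^_+|_+$", "", safe)                      # trim stray underscores
  make.unique(safe, sep = "_")                           # keep aliases distinct
}

celfiles <- c("0309_CoC(3)41_ExH_PRC133.CEL", "0309_CoC(3)42_ExH_PRC134.CEL")
celnames <- safe_celnames(celfiles)
print(celnames)

# The aliases would then be supplied at import time, as in the example above.
# Not run here: `scheme` and `celdir` are placeholders, and the aliases are
# assumed to be listed in the same order in which the CEL-files are imported.
# data <- import.data(scheme, "tmp_data", celdir = celdir, celnames = celnames)
```

Because the original file names remain available through rawCELName(), the aliases only affect the tree names inside the ROOT file.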