Question: plotting a CA
Guest User wrote (7.4 years ago):
Hey guys, can anyone help? I did a correspondence analysis and made a plot. I also have a specific list of nodes that I want to find in my plot, and I want to either color the nodes that appear in my list differently or put some kind of border around that group of nodes. Would anyone know how to do this?
Susan Holmes wrote (7.4 years ago):
Your best bet is to use the ade4 package: run res = dudi.coa(data), then s.class(res$li, group), where group is the grouping variable you want to highlight.

Best,
Susan Holmes
Statistics, Stanford

StephK wrote (7.4 years ago):

Many thanks. I tried this:

    table <- structure(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11, 8, 8, 10, 7),
                       .Dim = c(5L, 3L),
                       .Dimnames = list(c("gene1", "gene2", "gene3", "gene4", "gene5"),
                                        c("codon1", "codon2", "codon3")))
    library(ca)
    plot(ca(table, suprow = c(4, 5)))

This gives me a CA plot in which the nodes of interest, 4 and 5, are open circles. However, I have two questions.

1. Instead of manually typing in 4 and 5, is it possible to get R to read in a list of nodes of interest? Basically, is it possible to change c(4,5) to c(all the nodes that are in a file)?

2. Instead of the individual nodes of interest being open circles, could the area encompassing all the nodes of interest be shaded differently or highlighted?

I THINK this is where your suggestion of using ade4 with res=dudi.coa(data) and then s.class(res$li,group) comes in, but I am completely new to R. I have genuinely tried to understand the packages from the manual, but I am still confused.

Aoife

Tim Triche wrote (7.4 years ago):

In R, to get and use a package (e.g. ade4), then find out about a function, try this:

    install.packages('ade4')
    # ...time passes...
    library(ade4)
    ?dudi.coa

You might want to read the following paper, too: "Use and misuse of correspondence analysis in codon usage studies" (http://nar.oxfordjournals.org/content/30/20/4548.full).

Hope this helps... The 'made4' package would seem to be useful (it was created for similar purposes), but Aedin Culhane, one of the authors, recently suggested *not* using it, and I assume that she knows a great deal more about the situation than I do. If you do use it, perhaps you might contact her.

Tim Triche wrote (7.4 years ago):

Disregard what I said, although the paper may still be of interest. I misunderstood what you wrote. Apologies to all involved.

--t

StephK wrote (7.4 years ago):

I still appreciate that you took the time to read. Many thanks!
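A minimal sketch of how Aoife's two questions might be approached with Susan's suggestion, under stated assumptions. It assumes the nodes of interest are stored one row name per line in a plain text file; the file itself and the names nodes, idx, unmatched and group are made up for illustration. The plotting step uses dudi.coa and s.class from ade4 and runs only if ade4 is installed.

```r
# The data from the post above, as a data frame (ade4 works on data frames).
codonData <- data.frame(
  codon1 = c(4, 7, 0.2, 3, 0.1),
  codon2 = c(7, 222, 3, 10, 5),
  codon3 = c(11, 8, 8, 10, 7),
  row.names = paste0("gene", 1:5)
)

# Assumption: the real file lists one row name per line. A temporary file
# stands in for it here so that the example is self-contained.
node_file <- tempfile(fileext = ".txt")
writeLines(c("gene4", "gene5"), node_file)

# Question 1: read the nodes of interest instead of typing c(4, 5).
nodes <- trimws(readLines(node_file))
nodes <- nodes[nzchar(nodes)]                      # ignore blank lines
idx <- which(rownames(codonData) %in% nodes)       # row indices of listed nodes
unmatched <- setdiff(nodes, rownames(codonData))   # names with no matching row

# Question 2: a grouping factor with one entry per row.
group <- factor(ifelse(rownames(codonData) %in% nodes, "listed", "other"),
                levels = c("other", "listed"))

# Susan's suggestion; this part runs only if ade4 is available.
if (requireNamespace("ade4", quietly = TRUE)) {
  res <- ade4::dudi.coa(codonData, scannf = FALSE, nf = 2)
  # One colour per factor level; s.class draws an ellipse around each group.
  ade4::s.class(res$li, fac = group, col = c("grey40", "red"))
}
```

Here idx holds the row indices 4 and 5, so it could stand in for the hand-typed c(4, 5). Checking unmatched before plotting is a cheap way to catch misspelled names in the file.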
Answer: plotting a CA

Aedin Culhane (United States) wrote (7.4 years ago):

Hi Tim, Aoife and Susan,

Sorry Tim, I didn't know that I said not to use made4. When did I say this? I may have said that I need to update some of the functions, as I wrote the made4 package many years ago.

Susan, made4 calls ade4 but is designed to convert microarray and other Bioconductor data classes into formats that can be input into ade4. It calls ade4 (and other) plot functions, but with more sensible defaults for genomics data (i.e. it doesn't label all of the objects!). When I implemented the package I did it with Guy and Jean, who wrote the paper you cited, and I wholeheartedly agree with all you say ;-)

However, Aoife, your code plot(ca(table,suprow=c(4,5))) can't be used for what you want. This will plot rows 4 and 5 as supplementary points on the plot. These points won't be used in the computation of the analysis, and thus this would not provide what you want. Have a look at these plots:

    ### --------------------------------------------
    ## From here, you can copy/paste everything to R
    ##------------------------------------------------

    ## Your data... I renamed it, as table is a function in R
    codonData <- matrix(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11, 8, 8, 10, 7),
                        ncol = 3,
                        dimnames = list(c("gene1", "gene2", "gene3", "gene4", "gene5"),
                                        c("codon1", "codon2", "codon3")))

    library(ca)
    codonCA <- ca(codonData)

    ## Draw 2 plots, one with results of analysis of all the data,
    ## the other as you described
    par(mfrow = c(1, 2))
    plot(ca(codonData, suprow = c(4, 5)))
    plot(codonCA)
    par(mfrow = c(1, 1))  # back to one plot per page

    ## You will notice that the 2 plots are very different:
    ## one analysis is a CA of all 5 rows, the other is only 3 rows.

    ## To run a CA on a dataset using made4 or ade4, use the following code

    ## install made4
    ## source("http://bioconductor.org/biocLite.R")
    ## biocLite("made4")
    ## (on current Bioconductor releases: BiocManager::install("made4"))

    library(made4)

    ## example dataset
    data(khan)
    df <- khan$train

    ## The function ord will run PCA, CA or NSC,
    ## by default it runs CA (by calling dudi.coa from ade4)
    myCA <- ord(df)
    plot(myCA)
    plotgenes(myCA)
    plotarrays(myCA)

    ## using the ade4 library
    library(ade4)
    codonCA <- dudi.coa(codonData, scannf = FALSE)
    scatter(codonCA)

    ## However neither of these will do exactly as you wish
    ## made4 expects groups in the columns, not the rows (genes x samples)
    codonCA <- ord(t(codonData))

    ## Create a factor which lists the groups of "nodes" of interest
    fac <- factor(c(rep("Node1", 3), rep("Node2", 2)))
    fac
    plot(codonCA, classvec = fac)

    ## but the function below will do what you need.
    plotCA <- function(dudi, rowFac, cols, plotgroups = FALSE,
                       plotrowLabels = FALSE,
                       pch = seq_along(levels(rowFac)) + 10,
                       xax = 1, yax = 2, ...) {
      require(made4)

      ## map each factor level to its label (plot symbol or colour)
      fac2char <- function(fac, newLabels) {
        if (length(levels(fac)) != length(newLabels))
          stop("Number of labels does not equal number of factor levels")
        vec <- as.character(factor(fac, labels = newLabels))
        if (is.numeric(newLabels)) vec <- as.numeric(vec)
        vec
      }

      if (plotgroups) {
        ## use the rowFac argument, not a global variable called fac
        s.groups(dudi$li, rowFac, col = cols, xax = xax, yax = yax)
      } else {
        pchs <- fac2char(rowFac, pch)
        cols <- fac2char(rowFac, cols)
        if (plotrowLabels) {
          s.var(dudi$li, boxes = FALSE, col = cols, xax = xax, yax = yax, ...)
        } else {
          s.var(dudi$li, boxes = FALSE, pch = pchs, col = cols, cpoint = 2,
                clabel = 0, xax = xax, yax = yax, ...)
        }
      }

      ## overlay the columns (codons) as black points
      s.var(dudi$co, boxes = FALSE, pch = 19, col = "black", add.plot = TRUE,
            xax = xax, yax = yax, ...)
    }

    ##--------------------------------------------
    ## Examples: Function has 3 different options
    ##-------------------------------------------

    library(ade4)
    codonCA <- dudi.coa(codonData, scannf = FALSE)

    ## Option 1: plot a biplot (cases and samples) with points
    ## colored by rowFac
    plotCA(codonCA, rowFac = fac, pch = c(18, 20), cols = c("red", "blue"))

    ## Option 2: same plot as above, but with labels rather than points
    plotCA(codonCA, rowFac = fac, pch = c(18, 20), cols = c("red", "blue"),
           plotrowLabels = TRUE)

    ## Option 3: same plot, but put a circle around the groups.
    ## If you look at the help page for s.groups (in made4),
    ## which calls s.class (in ade4), you will see you can also
    ## change the size and other details of the
    ## ellipse (or circle) drawn around the groups
    plotCA(codonCA, rowFac = fac, plotgroups = TRUE, cols = c("red", "blue"))
Aedin Culhane
Computational Biology and Functional Genomics Laboratory
Harvard School of Public Health, Dana-Farber Cancer Institute
web: http://www.hsph.harvard.edu/research/aedin-culhane/

StephK wrote (7.4 years ago):

O wow, I was way off. Many thanks.

May I ask one question (I'm a total newbie)? I was trying out the different pieces of (much appreciated) code because I want to play around with them and make sure I understand them, but I have never used a function in R.

For this section:

    xax =1, yax = 2, ...

at the end of the plotCA<-function(...) line, may I just ask what they represent?

I am trying to work out how everything works by copying and pasting each line into R and seeing what happens, but when I run the example calls I keep getting:

    > plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"))
    Error in scatterutil.base(dfxy = dfxy, xax = xax, yax = yax, xlim = xlim, :
      Non convenient selection for xax
    > plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"), plotrowLabels=TRUE)
    Error in scatterutil.base(dfxy = dfxy, xax = xax, yax = yax, xlim = xlim, :
      Non convenient selection for xax
    > plotCA(codonCA, rowFac=fac, plotgroups=TRUE, cols=c("red", "blue"))
    Error in `[.data.frame`(dfxy, , xax) : undefined columns selected

Deeply indebted.
Aoife
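The thread does not say what caused these errors. One plausible explanation, offered here as an assumption rather than a confirmed diagnosis: the messages are consistent with the coordinates passed to the plotting functions having no columns at all, which is what dudi$li would give if codonCA still held the earlier ca or ord result (neither appears to carry a $li element) instead of the dudi.coa result. A small base R check along those lines, where has_row_coords is a hypothetical helper and the two objects are stand-ins, not real ade4 or ca output:

```r
# Hypothetical helper (not part of ade4, made4 or ca): does the object carry
# row coordinates in $li with enough columns for the requested axes?
has_row_coords <- function(x, xax = 1, yax = 2) {
  li <- if (is.list(x)) x[["li"]] else NULL
  is.data.frame(li) && ncol(li) >= max(xax, yax)
}

# Stand-ins for illustration only.
dudi_like <- list(li = data.frame(Axis1 = c(0.1, -0.2), Axis2 = c(0.3, 0.0)))
not_dudi  <- list(rowcoord = matrix(0, nrow = 2, ncol = 2))

has_row_coords(dudi_like)  # TRUE
has_row_coords(not_dudi)   # FALSE

# A missing $li element is NULL, and a data frame built from NULL has no
# columns, so no axis index can be valid for it.
ncol(data.frame(NULL))     # 0
```

If this is the cause, rerunning the dudi.coa assignment to codonCA immediately before the plotCA calls would be the first thing to try.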
Aedin Culhane wrote (7.4 years ago):

Hi Aoife,

xax and yax are the axes, so xax = 1 and yax = 2 plots the first 2 components (or axes).

Aedin
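To make Aedin's explanation concrete, here is a small base R sketch. pick_axes is a hypothetical helper (not an ade4 or made4 function) and the coordinates are made up. It only illustrates that xax and yax act as column indices into a table of coordinates, and that the values written in the function definition (xax = 1, yax = 2) are defaults used when the caller does not supply them.

```r
# Made-up coordinates for illustration: 4 points on 3 axes.
coords <- data.frame(Axis1 = c(-1.0, 0.5, 0.2, 0.3),
                     Axis2 = c(0.1, -0.4, 0.6, -0.3),
                     Axis3 = c(0.05, 0.02, -0.04, -0.03))

# Hypothetical helper: returns the two columns that would be plotted.
# xax and yax default to 1 and 2 unless the caller overrides them.
pick_axes <- function(df, xax = 1, yax = 2) {
  df[, c(xax, yax)]
}

names(pick_axes(coords))                    # "Axis1" "Axis2"
names(pick_axes(coords, xax = 2, yax = 3))  # "Axis2" "Axis3"
```

The trailing ... in an R function definition collects any further arguments so that they can be handed on to another function, which is how plotCA passes extra options through to s.var.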
> May i ask one question (I'm a total newbie), i was trying out the > different pieces of (much appreciated) code because i want to play > around with them and make sure i understand them. > > But i have never used a function in R. > > For this section: > > xax =1, yax = 2, .. of this line > plotCA<-function(dudi, rowFac, cols, plotgroups=FALSE, > plotrowLabels=FALSE, pch=c(1:levels(rowFac))+10, xax =1, yax = 2, ...) { > > may i just ask what they represent? > > I am trying to work out how everything works by copy and pasting each > line into R, and then seeing what happens, but for that line i keep getting: > > > plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue")) > Error in scatterutil.base(dfxy = dfxy, xax = xax, yax = yax, xlim = > xlim, : > Non convenient selection for xax > > plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"), > plotrowLabels=TRUE) > Error in scatterutil.base(dfxy = dfxy, xax = xax, yax = yax, xlim = > xlim, : > Non convenient selection for xax > > plotCA(codonCA, rowFac=fac, plotgroups=TRUE, cols=c("red", "blue")) > Error in [.data.frame(dfxy, , xax) : undefined columns selected > > Deeply indebted. > Aoife > > On Fri, Mar 9, 2012 at 5:49 PM, aedin culhane <aedin at="" jimmy.harvard.edu=""> <mailto:aedin at="" jimmy.harvard.edu="">> wrote: > > Hi Tim, Aoife and Susan > > Sorry Tim, I didn't know that I said not to use made4. When did I > say this? I may have said I need to update some of the functions as > I wrote the made4 package many years ago. > > Susan, made4 calls ade4 but is designed to convert microarray and > other Bioconductor data classes into formats that can be input into > ade4. It calls ade4 (and other) plot functions but with more > sensible defaults for genomics data (ie it doesn't label all of the > objects!). 
When I implemented the package I did it with Guy and > Jean who wrote the paper you cited and I wholeheartedly agree with > all you say ;-) > > > However Aoife your code plot(ca(table,suprow=c(4,5))) can't be used > for what you want. This will plot rows 4 and 5 as supplementary > plots onto the plot. These points won't be used in the computation > of the analysis and thus would provide what you want. Have a look > at these plots > > ### ------------------------------__-------------- > ## From here, you can copy/paste everything to R > ##----------------------------__-------------------- > > > ## Your data... I renamed it, as table is a function in R > > codonData <- matrix(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11, 8, 8, > 10, 7), ncol=3, dimnames = list(c("gene1","gene2", "gene3", > "gene4", "gene5"), c("codon1", "codon2","codon3"))) > > library(ca) > codonCA<-ca(codonData) > > ## Draw 2 plots, one with results of analysis of all the data, > # the other as you described > > par(mfrow=c(1,2)) > plot(ca(codonData,suprow=c(4,__5))) > plot(codonCA) > > ## You will notice that the 2 plots are very different, > ## one analysis is a CA of all 5 rows, the other is only 3 rows. 
>
> ## To run a CA on a dataset using made4 or ade4, use the following code
>
> ## install made4
> ## source("http://bioconductor.org/biocLite.R")
> ## biocLite("made4")
>
> library(made4)
>
> ## example dataset
> data(khan)
> df<-khan$train
>
> ## The function ord will run PCA, CA or NSC,
> ## by default it runs CA (by calling dudi.coa from ade4)
>
> myCA<- ord(df)
> plot(myCA)
> plotgenes(myCA)
> plotarrays(myCA)
>
> ## using the ade4 library
> library(ade4)
> codonCA<-dudi.coa(codonData, scannf=FALSE)
> scatter(codonCA)
>
> ## However neither of these will do exactly as you wish
> ## made4 expects groups in the columns not the rows (genes x samples)
>
> library(made4)
> codonCA<-ord(t(codonData))
>
> ## Create a factor which lists the groups of "nodes" of interest
> fac<-factor(c(rep("Node1",3), rep("Node2", 2)))
> fac
> plot(codonCA, classvec=fac)
>
> ## but the function below will do what you need.
>
> ## default pch: one plotting symbol per factor level (11, 12, ...)
> plotCA<-function(dudi, rowFac, cols, plotgroups=FALSE,
> plotrowLabels=FALSE, pch=seq_along(levels(rowFac))+10, xax =1, yax = 2,
> ...) {
>
> require(made4)
>
> ## map each factor level to the matching entry of newLabels
> fac2char<-function(fac, newLabels) {
> if (!length(levels(fac))==length(newLabels)) stop("Number of labels
> does not equal the number of factor levels")
> vec<-as.character(factor(fac, labels=newLabels))
> if(inherits(newLabels, "numeric")) vec<-as.numeric(vec)
> return(vec)
> }
>
> ## rows: either group ellipses, or points/labels colored by rowFac
> if (plotgroups) s.groups(dudi$li, rowFac, col=cols)
> if (!plotgroups) {
> pchs<-fac2char(rowFac, pch)
> cols<-fac2char(rowFac, cols)
>
> if (!plotrowLabels) s.var(dudi$li, boxes=FALSE, pch=pchs,
> col=cols, cpoint=2, clabel=0, xax=xax, yax=yax, ...)
> if (plotrowLabels) s.var(dudi$li, boxes=FALSE, col=cols,
> xax=xax, yax=yax, ...)
> }
>
> ## columns: added to the same plot in black
> s.var(dudi$co, boxes=FALSE, pch=19, col="black", add.plot = TRUE,
> xax=xax, yax=yax, ...)
> }
>
> ## ----------------------------------------------
> ## Examples: Function has 3 different options
> ## ----------------------------------------------
>
> library(ade4)
> codonCA<-dudi.coa(codonData, scannf=FALSE)
>
> ## Option 1, plot a biplot (cases and samples) with points
> ## colored by rowFac
>
> plotCA(codonCA, rowFac=fac, pch=c(18,20), cols=c("red", "blue"))
>
> ## Option 2. Same plot as above, but with labels rather than points
>
> plotCA(codonCA, rowFac=fac, pch=c(18,20), cols=c("red", "blue"),
> plotrowLabels=TRUE)
>
> ## Option 3, Same plot but put a circle around the groups
> ## If you look at the help page for s.groups (in made4)
> ## which calls s.class (in ade4) you will see you can also
> ## change the size and other details about the
> ## ellipse (or circle) drawn around the groups
>
> plotCA(codonCA, rowFac=fac, plotgroups=TRUE, cols=c("red", "blue"))

--
Aedin Culhane
Computational Biology and Functional Genomics Laboratory
Harvard School of Public Health,
Dana-Farber Cancer Institute

web: http://www.hsph.harvard.edu/research/aedin-culhane/
email: aedin at jimmy.harvard.edu
phone: +1 617 632 2468
Fax: +1 617 582 7760

Mailing Address:
Attn: Aedin Culhane, SM822C
450 Brookline Ave.
Boston, MA 02215

written 7.4 years ago by Aedin Culhane

Thank you. I tried to modify it slightly. It did not go well.

At this step, creating a group of interesting nodes:

fac<-factor(c(rep("Node1",3), rep("Node2", 2)))

In reality there can be hundreds of these nodes. So instead of typing the nodes in manually, I wanted to read them in from a file. Eg my nodes of interest (say gene1 and gene2) would be in a separate file "list1".
Eg:

list1:
gene1
gene2

The aim is to change this:

fac<-factor(c(rep("Node1",3), rep("Node2", 2)))

to this:

fac<-factor(read in the genes in the file labelled list1)

I almost got it started with:

list1 <- scan("filename",what=list(item="gene"))

which will read in the list, but when I make various attempts at working that into this line:

fac<-factor(c(rep("Node1",3), rep("Node2", 2)))

it goes drastically wrong.

Is there any possibility you would have an alternative solution?

Aoife

On Fri, Mar 9, 2012 at 10:20 PM, aedin culhane <aedin@jimmy.harvard.edu> wrote:

> Hi Aoife
> xax and yax are the axes, so xax=1 and yax=2 plots the first 2
> components (or axes).
> Aedin

written 7.4 years ago by StephK

Hi Aoife

Welcome to R. I understand it can seem tough at first.
But you are learning a language that will be wonderful once you do. I find new users like the Rstudio.org interface to R.

scan works easily for a one-column file. It will read it in as a vector that you then have to convert to a factor. It's more complicated when you have >1 column.

In that case you might find it easier to create an annotation file in Excel. Save it as a csv (comma delimited) file. It can contain >1 column, eg a column of gene names and a second of categories etc. Then use read.csv to read it into R, and select the column you want, eg if it's the second column:

annot<-read.csv("file.csv", header=TRUE)
myFac<-factor(annot[,2])
?scan

read.csv or read.table will read any categorical column as a factor by default in R versions before 4.0.0; wrapping the column in factor() works in any version. Use ? to get help on a function, or use help.search to search for a function. The website rseek.org is a good Google-like search engine for all things R.

There are several intro to R textbooks which might be useful. I can send you a list if you wish. Or look at Tom Girke's (UC Riverside) online class, or the course notes I link on my website for an intro to R class I teach.

Good luck
Aedin

written 7.4 years ago by Aedin Culhane

May I please pick your brain once more! I used the code you sent and modified it slightly, but I just need help with one part. Also, I appreciate the reading you sent (to Ms. Culhane).

So I've highlighted the awkward part in yellow, but I've sent the full code so as to provide context:

#### Increase max print, load libraries
options(max.print=10000000)
library(ca)
library(made4)

#### read in a test matrix
codonData <- matrix(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11, 8, 8, 10, 7), ncol=3,
    dimnames = list(c("gene1","gene2", "gene3", "gene4", "gene5"),
    c("codon1", "codon2","codon3")))

#### make function
## default pch: one plotting symbol per factor level (11, 12, ...)
plotCA<-function(dudi, rowFac, cols, plotgroups=FALSE, plotrowLabels=FALSE,
    pch=seq_along(levels(rowFac))+10, xax =1, yax = 2, ...) {

  require(made4)

  ## map each factor level to the matching entry of newLabels
  fac2char<-function(fac, newLabels) {
    if (!length(levels(fac))==length(newLabels))
      stop("Number of labels does not equal the number of factor levels")
    vec<-as.character(factor(fac, labels=newLabels))
    if(inherits(newLabels, "numeric")) vec<-as.numeric(vec)
    return(vec)
  }

  ## rows: either group ellipses, or points/labels colored by rowFac
  if (plotgroups) s.groups(dudi$li, rowFac, col=cols)
  if (!plotgroups) {
    pchs<-fac2char(rowFac, pch)
    cols<-fac2char(rowFac, cols)

    if (!plotrowLabels) s.var(dudi$li, boxes=FALSE, pch=pchs, col=cols,
        cpoint=2, clabel=0, xax=xax, yax=yax, ...)
    if (plotrowLabels) s.var(dudi$li, boxes=FALSE, col=cols, xax=xax, yax=yax, ...)
  }

  ## columns: added to the same plot in black
  s.var(dudi$co, boxes=FALSE, pch=19, col="black", add.plot = TRUE,
      xax=xax, yax=yax, ...)
}

## run CA analysis
codonCA<-ord(t(codonData))

> codonCA
$ord
Duality diagramm
class: coa dudi
$call: dudi.coa(df = data.tr, scannf = FALSE, nf = ord.nf)

$nf: 2 axis-components saved
$rank: 2
eigen values: 0.3946 0.03043
  vector length mode    content
1 $cw    5      numeric column weights
2 $lw    3      numeric row weights
3 $eig   2      numeric eigen values

  data.frame nrow ncol content
1 $tab       3    5    modified array
2 $li        3    2    row coordinates
3 $l1        3    2    row normed scores
4 $co        5    2    column coordinates
5 $c1        5    2    column normed scores
other elements: N

$fac
NULL

attr(,"class")
[1] "coa" "ord"

## Create a factor which lists the groups of "nodes" of interest

### This next section is the section I was trying to change.

My aim is that if codonData is as described above:

> codonData
      codon1 codon2 codon3
gene1    4.0      7     11
gene2    7.0    222      8
gene3    0.2      3      8
gene4    3.0     10     10
gene5    0.1      5      7

for example I find gene2 and gene5 interesting, I want all the nodes in the plot to be black, *except for gene 2 and gene 5, which I want to both be red* (or whatever).

So I understand I need to make a factor to group these variables. I think this command:

fac<-factor(c(rep("Node1",3), rep("Node2", 2)))
fac

took my data and said that the first 3 rows (ie. gene 1, 2 and 3) were level one (and labelled Node1) and gene4 and gene5 were level two (and labelled Node2).

So, similar to this, I also want to set two levels: one for the nodes of interest and one for everything else.

So I did:

list1 <-c("gene2", "gene4")

to read my interesting rows as a vector.

Then I wanted to say: look at the cells in my vector list1 and in the table; if a cell is in both, change it to a factor (I should acknowledge that I robbed this from the R forum...!)

> codonData$mycells <-factor(codonCA$genes %in% list1,c("Special","NotSpecial"))
Error in codonData$cells : $ operator is invalid for atomic vectors

and then, once I had found the interesting nodes in my CA analysis, just do as before:

codonCA<-dudi.coa(codonData, scan=FALSE)
plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"))

Miss Singh, I know you were asking something similar about factors and levels and stuff. I've read a bit about them: basically a factor is a type of variable, and as an optional argument you can add levels, which determine the categories of the factor variable, and labels, a vector of values that will be used as the names of those categories. I don't know if that can help your error...

Also, may I point out that I didn't try just this one way before asking for help. I've been stuck on this for over a day; it's so much to take in at once!

Aoife
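Aoife's reading of the fac command above can be checked directly in R by lining the factor up against the row names. This is only a small sketch using the test matrix from the thread; the name `check` is made up for illustration.

```r
## Test matrix from the thread
codonData <- matrix(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11, 8, 8, 10, 7),
                    ncol = 3,
                    dimnames = list(c("gene1", "gene2", "gene3", "gene4", "gene5"),
                                    c("codon1", "codon2", "codon3")))

## Three "Node1" entries followed by two "Node2" entries
fac <- factor(c(rep("Node1", 3), rep("Node2", 2)))

## Put each gene next to its group (check is a hypothetical helper name)
check <- data.frame(gene = rownames(codonData), group = fac)
print(check)

## The two categories, and how many rows fall in each
print(levels(fac))
print(table(fac))
```

The factor is matched to the rows purely by position, so it needs one entry per row of codonData, in row order; it knows nothing about the gene names themselves.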
Hi Aoife

Regarding your selection:

1) codonCA is the result of a CA on your matrix codonData. codonCA$genes doesn't exist. Are you trying to add an element to the list codonCA or to your data matrix codonData?

2) list1<-c("gene2", "gene4")
Calling a vector "list1" might be confusing, as it does not have class list.

3) The following options will work:

> codonData<- matrix(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11, 8, 8, 10, 7), ncol=3,
    dimnames = list(c("gene1","gene2", "gene3", "gene4", "gene5"),
    c("codon1", "codon2","codon3")))
> codonData
      codon1 codon2 codon3
gene1    4.0      7     11
gene2    7.0    222      8
gene3    0.2      3      8
gene4    3.0     10     10
gene5    0.1      5      7
> class(codonData)
[1] "matrix"
>
> rownames(codonData)
[1] "gene1" "gene2" "gene3" "gene4" "gene5"
> rownames(codonData)%in%list1
[1] FALSE  TRUE FALSE  TRUE FALSE
> factor(rownames(codonData)%in%list1, ,c("NotSpecial","Special"))
[1] NotSpecial Special    NotSpecial Special    NotSpecial
Levels: NotSpecial Special

## Can't add a character vector to a numerical matrix, so need to convert to a data.frame

> codonData<-as.data.frame(codonData)
> codonData
      codon1 codon2 codon3
gene1    4.0      7     11
gene2    7.0    222      8
gene3    0.2      3      8
gene4    3.0     10     10
gene5    0.1      5      7
> class(codonData)
[1] "data.frame"
>
> codonData$mycells<-factor(rownames(codonData)%in%list1, ,c("NotSpecial","Special"))
> codonData
      codon1 codon2 codon3    mycells
gene1    4.0      7     11 NotSpecial
gene2    7.0    222      8    Special
gene3    0.2      3      8 NotSpecial
gene4    3.0     10     10    Special
gene5    0.1      5      7 NotSpecial

Or

> codonCA$mycells<-factor(rownames(codonData)%in%list1, ,c("NotSpecial","Special"))

***** NOTE *****
There are two commas in the call to factor. Its signature is

factor(x = character(), levels, labels = levels, exclude = NA, ordered = is.ordered(x))

so the empty second argument skips levels, and c("NotSpecial","Special") is passed as the labels parameter. The labels are matched to the sorted levels FALSE, TRUE, in that order, which is why "NotSpecial" comes first. If you leave out the second comma, c("NotSpecial","Special") is sent to levels instead, which fails (every value becomes NA) because these are not the levels of the factor:

> levels(factor(rownames(codonData)%in%list1))
[1] "FALSE" "TRUE"
> codonCA$mycells<-factor(rownames(codonData)%in%list1 ,c("NotSpecial","Special"))
> codonCA$mycells
[1] <NA> <NA> <NA> <NA> <NA>
Levels: NotSpecial Special
> codonCA$mycells<-factor(rownames(codonData)%in%list1, ,c("NotSpecial","Special"))
> codonCA$mycells
[1] NotSpecial Special    NotSpecial Special    NotSpecial
Levels: NotSpecial Special
> codonCA$mycells<-factor(rownames(codonData)%in%list1,labels=c("NotSpecial","Special"))
> codonCA$mycells
[1] NotSpecial Special    NotSpecial Special    NotSpecial
Levels: NotSpecial Special

--
Aedin Culhane
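Putting the pieces of this thread together, here is a minimal sketch rather than the made4 or ade4 authors' own code. It assumes the genes of interest sit one per line in a plain text file (a temporary file stands in for the hypothetical "list1" here), builds the grouping factor with explicit levels and labels, and derives one color per row. The names `geneFile`, `interesting`, `grp` and `rowCols` are made up for illustration, and the ade4 plot at the end only runs if ade4 happens to be installed.

```r
## Test matrix from the thread
codonData <- matrix(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11, 8, 8, 10, 7),
                    ncol = 3,
                    dimnames = list(c("gene1", "gene2", "gene3", "gene4", "gene5"),
                                    c("codon1", "codon2", "codon3")))

## Stand-in for the file "list1": one gene name per line (hypothetical file)
geneFile <- tempfile(fileext = ".txt")
writeLines(c("gene2", "gene4"), geneFile)

## Read the names as a plain character vector
interesting <- scan(geneFile, what = character(), quiet = TRUE)

## TRUE for rows named in the file; naming the levels explicitly ties
## TRUE to "Special" and FALSE to "NotSpecial"
grp <- factor(rownames(codonData) %in% interesting,
              levels = c(TRUE, FALSE),
              labels = c("Special", "NotSpecial"))
print(data.frame(gene = rownames(codonData), group = grp))

## One color per row: red for the genes of interest, black otherwise
rowCols <- unname(c(Special = "red", NotSpecial = "black")[as.character(grp)])
print(rowCols)

## Optional: CA with ade4 and one ellipse per group, only if ade4 is installed
if (requireNamespace("ade4", quietly = TRUE)) {
  res <- ade4::dudi.coa(as.data.frame(codonData), scannf = FALSE, nf = 2)
  ade4::s.class(res$li, grp, col = c("red", "black"))
}
```

Passing both levels and labels by name avoids relying on the position of the arguments, so the empty-argument trick with two commas is not needed. For a real analysis, geneFile would simply be replaced by the path to your own file.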