How to sort a matrix by its column names while preserving identical column names
carol white ▴ 680
@carol-white-2174
Last seen 7.1 years ago
European Union
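Hello,

How to sort a matrix based on its column names while preserving the identical column names?

When I use mat[, sort(colnames(mat))], sort changes all column names to unique ones. For example, if the name of 2 columns is col, the 2nd will be changed to col.1, whereas I want to keep the col name for the two columns:

col col -> col col.1

Thanks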
@james-w-macdonald-5106
Last seen 4 hours ago
United States
How about mat[,order(colnames(mat))]?

Best,
Jim

carol white wrote:
> Hello,
> How to sort a matrix based on its column names and preserving the identical column names.
>
> when I use mat [, sort(colnames(mat))], sort changes all column names to unique ones. for ex, if the name of 2 columns is col, the 2nd will be changed to col.1 whereas I want to keep the col name for the two columns
>
> col col -> col col.1
>
> thanks
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
@james-w-macdonald-5106
Last seen 4 hours ago
United States
Oh, wait. I didn't read the part about the repeated column names. I don't think you can re-sort without ending up with unique column names. Sorry for the noise...
@joern-toedling-1244
Last seen 7.8 years ago
Hi Carol,

although it is certainly preferable to have unique column names, since this prevents all sorts of hard-to-debug confusion later on, I think there is something missing from your example code. If you really only have a matrix and only re-sort its columns, you will definitely keep the original column names. Check this example:

> A = matrix(rnorm(6),nrow=2)
> colnames(A)<-c("bla","bal","bla")
> A
            bla        bal       bla
[1,] -1.1283559 -0.5672175 -1.135924
[2,] -0.8160838 -0.7441979  1.395936
> (A[,sort(colnames(A))])
            bal        bla        bla
[1,] -0.5672175 -1.1283559 -1.1283559
[2,] -0.7441979 -0.8160838 -0.8160838

Actually, you will notice that the second and third columns are identical, since after the sorting the first occurrence of the column name is taken from A both times. To prevent that, you should not index the matrix with a character vector but with the numeric indices (or use unique column names):

> (A[,order(colnames(A))])
            bal        bla       bla
[1,] -0.5672175 -1.1283559 -1.135924
[2,] -0.7441979 -0.8160838  1.395936

Renaming of column names, however, only happens when you convert the matrix into a data.frame:

> data.frame(A)
         bla        bal     bla.1
1 -1.1283559 -0.5672175 -1.135924
2 -0.8160838 -0.7441979  1.395936

and this renaming can be prevented by setting the argument check.names=FALSE:

> data.frame(A, check.names=FALSE)
         bla        bal       bla
1 -1.1283559 -0.5672175 -1.135924
2 -0.8160838 -0.7441979  1.395936

Best regards,
Joern
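The difference between the two indexing styles in the answer above can be reproduced with fixed values instead of random ones. The following is only a minimal sketch in base R; the matrix `A` and its values are made up for illustration, and the `match()` line is an assumption about how name lookup can be spelled out by hand, not a quote from the thread.

```r
# Toy matrix with a duplicated column name (values chosen for illustration).
A <- matrix(1:6, nrow = 2)
colnames(A) <- c("bla", "bal", "bla")

# Character indexing resolves each name to its FIRST match,
# so the second "bla" column is never selected.
by_name <- A[, sort(colnames(A))]

# The same selection written out explicitly with match().
by_match <- A[, match(sort(colnames(A)), colnames(A))]

# Integer indexing via order() picks every column exactly once.
by_index <- A[, order(colnames(A))]
```

Here `by_name` repeats the first "bla" column, while `by_index` contains all three original columns.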
Oleg Sklyar ▴ 260
@oleg-sklyar-1882
Last seen 7.8 years ago
mat[, order(colnames(mat))]

order provides the indexes of the sorted elements without affecting their names. sort only half works in your case: it works at all because you are sorting character strings and columns can be selected by name, but with duplicated names (or with anything other than names) it gives the wrong result.

Here is a working example:

> x<-c("col","abc","def","abc","col")
> a<-matrix(runif(25),ncol=5,nrow=5)
> colnames(a) <- x
> a
           col        abc       def        abc       col
[1,] 0.9815985 0.65855865 0.9982046 0.07781167 0.3228944
[2,] 0.5970836 0.19195563 0.9082061 0.80489513 0.9190933
[3,] 0.8147790 0.13499074 0.9431437 0.41154237 0.6487952
[4,] 0.7661668 0.26216671 0.6694043 0.66462428 0.2177653
[5,] 0.5604505 0.04371932 0.7873665 0.44849293 0.1700327
> a[,order(x)]
            abc        abc       col       col       def
[1,] 0.65855865 0.07781167 0.9815985 0.3228944 0.9982046
[2,] 0.19195563 0.80489513 0.5970836 0.9190933 0.9082061
[3,] 0.13499074 0.41154237 0.8147790 0.6487952 0.9431437
[4,] 0.26216671 0.66462428 0.7661668 0.2177653 0.6694043
[5,] 0.04371932 0.44849293 0.5604505 0.1700327 0.7873665
> order(x)
[1] 2 4 1 5 3

--
Dr. Oleg Sklyar * EBI-EMBL, Cambridge CB10 1SD, England * +44-1223-494466
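The order()-based approach can be wrapped in a small helper. This is a sketch only: the function name `sort_by_colnames` and the toy values are hypothetical and not part of the thread; `drop = FALSE` is added on the assumption that a one-column input should stay a matrix.

```r
# Hypothetical helper: reorder columns by name, keeping duplicated names.
sort_by_colnames <- function(mat) {
  stopifnot(is.matrix(mat), !is.null(colnames(mat)))
  idx <- order(colnames(mat))
  # drop = FALSE keeps the result a matrix even if there is only one column.
  mat[, idx, drop = FALSE]
}

# Same column names as in the example above, with fixed values.
a <- matrix(1:10, nrow = 2)
colnames(a) <- c("col", "abc", "def", "abc", "col")
sorted <- sort_by_colnames(a)
colnames(sorted)
```

Because order() leaves ties in their original order, the two "abc" columns (and the two "col" columns) keep their relative positions.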
Hi,

I think that there is some confusion here. First, sort does not change any values; it sorts. The changing of column and/or row names is "a feature" that comes with the [ operator. Details are given on the man page for [.data.frame.

I do not believe that there is any such behavior (currently) for matrices. And I could not replicate the behavior described by Carol except with data.frames, where it is the documented (if peculiar) behavior.

best wishes
Robert

m=matrix(rnorm(25), ncol=5)
colnames(m) = rep("A", 5)
m[,2]
m[1,]
m[,sort(colnames(m))]

but,
y = data.frame(m)
y
#changes the colnames
y = data.frame(m, check.names=FALSE)

# but then do change the names
y[1,]
y[,sort(colnames(y))]

--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org
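For the data.frame case discussed above, one possible workaround is to reorder by integer index and then put the names back by hand. This is a sketch under assumptions: the toy data and the variable names `y_sorted` and `idx` are invented here, and the explicit `names<-` step is a defensive choice that does not rely on whether the subsetting step renamed anything.

```r
m <- matrix(1:8, nrow = 2)
colnames(m) <- c("c", "b", "a", "c")

# check.names = FALSE keeps the duplicated "c" when building the data frame.
y <- data.frame(m, check.names = FALSE)

# Reorder by integer index, then restore the names explicitly in case
# the subsetting step made them unique.
idx <- order(names(y))
y_sorted <- y[, idx, drop = FALSE]
names(y_sorted) <- names(y)[idx]
```

If the duplicated names are not actually needed downstream, keeping the data as a matrix avoids the issue altogether.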
Yes, I know sort does not change values. In contrast to order, it returns sorted values and not indexes, and I pointed out the correct solution. What I meant by working halfway is this: colnames() returns a character vector, and sort rearranges the elements of that vector. A matrix allows columns to be accessed by name, so indexing with the sorted names should be equivalent to using order and accessing columns by index (but it is wrong if the names are not unique, as the example shows). If any values other than the colnames were used with sort (say, the values in the first row), the result would be a total mess, because the sorted values would not identify columns:

R version 2.6.0 Under development (unstable) (2007-07-30 r42359)
> a<-matrix(runif(16),4,4)
> colnames(a)<-c("c","b","a","c")
> a[,colnames(a)] ## correct if unique names, here wrong
             c         b         a         c
[1,] 0.6674110 0.1693423 0.5741207 0.6674110
[2,] 0.4479471 0.1374272 0.1149747 0.4479471
[3,] 0.4328296 0.4990545 0.2777478 0.4328296
[4,] 0.8944030 0.1354652 0.4950811 0.8944030
> a[,sort(colnames(a))] ## correct if unique names
             a         b         c         c
[1,] 0.5741207 0.1693423 0.6674110 0.6674110
[2,] 0.1149747 0.1374272 0.4479471 0.4479471
[3,] 0.2777478 0.4990545 0.4328296 0.4328296
[4,] 0.4950811 0.1354652 0.8944030 0.8944030
> a[,order(colnames(a))] ## correct
             a         b         c          c
[1,] 0.5741207 0.1693423 0.6674110 0.43714271
[2,] 0.1149747 0.1374272 0.4479471 0.79047094
[3,] 0.2777478 0.4990545 0.4328296 0.02128344
[4,] 0.4950811 0.1354652 0.8944030 0.93321638

--
Dr. Oleg Sklyar
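Since name-based selection is only reliable when the requested names are unique, a guard can make the ambiguous case fail loudly instead of silently repeating a column. The sketch below is illustrative; `select_by_name` and its error message are hypothetical and not taken from the thread.

```r
# Hypothetical guard: refuse name-based selection when names are ambiguous.
select_by_name <- function(mat, nms) {
  dup <- unique(colnames(mat)[duplicated(colnames(mat))])
  ambiguous <- intersect(nms, dup)
  if (length(ambiguous) > 0) {
    stop("ambiguous column name(s): ", paste(ambiguous, collapse = ", "))
  }
  mat[, nms, drop = FALSE]
}

a <- matrix(1:8, nrow = 2)
colnames(a) <- c("c", "b", "a", "c")

# Unique names: selection proceeds as usual.
ok <- select_by_name(a, c("a", "b"))

# Duplicated name: the call stops; the message is captured for inspection.
res <- tryCatch(select_by_name(a, "c"),
                error = function(e) conditionMessage(e))
```

For a full reordering with duplicated names, order() with integer indexing remains the simpler route.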
@oosting-j-path-412
Last seen 7.8 years ago
mat<-mat[,order(colnames(mat))]

Jan
alex lam RI ▴ 310
@alex-lam-ri-1491
Last seen 7.8 years ago
How about this:

> m<-matrix(runif(30), nrow=3)
> colnames(m)<-rep(c("A","B","C","D","E"), 2)
> m
             A         B         C         D         E         A         B
[1,] 0.6603121 0.7788057 0.2465374 0.8624474 0.2542933 0.1170260 0.0810228
[2,] 0.1191780 0.9388283 0.8776471 0.3523432 0.6149786 0.7219005 0.4743703
[3,] 0.3370735 0.6049742 0.9344462 0.7492387 0.2487072 0.5296768 0.7859081
             C          D          E
[1,] 0.5279424 0.30166929 0.69852792
[2,] 0.6352592 0.65057813 0.68920975
[3,] 0.6962578 0.06894764 0.02685270
> m[, order(colnames(m))]
             A         A         B         B         C         C         D
[1,] 0.6603121 0.1170260 0.7788057 0.0810228 0.2465374 0.5279424 0.8624474
[2,] 0.1191780 0.7219005 0.9388283 0.4743703 0.8776471 0.6352592 0.3523432
[3,] 0.3370735 0.5296768 0.6049742 0.7859081 0.9344462 0.6962578 0.7492387
              D         E          E
[1,] 0.30166929 0.2542933 0.69852792
[2,] 0.65057813 0.6149786 0.68920975
[3,] 0.06894764 0.2487072 0.02685270

Cheers,
Alex

------------------------------------
Alex Lam
Roslin Institute (Edinburgh)
Roslin
Midlothian EH25 9PS
Great Britain