Sorting matrix by column
Guest User ★ 12k
@guest-user-4897
Hi,

I would like to sort a matrix by a specific column (column 2). I tried the order() function, but I get an error. I think it is because the values in column 2 are not numeric; they are gene symbols. This may be a general R question, but I thought I would post it here since it is microarray data analysis.

I have matrix x:

> x
         ID         Gene Symbol     logFC      Adj.PVal
10344624 "10371400" "Lypla1"        0.3592492  0.9999522
10344633 "10453900" "Tcea1"         0.1886117  0.9999522
10344637 "10375051" "Atp6v1h"       0.6713107  0.9999522
10344653 "10575211" "Oprk1"         -0.2342731 0.9999522
10344658 "10566254" "Rb1cc1"        1.790676   0.9999522
10344674 "10602372" "Fam150a"       1.397496   0.9999522
10344679 "10398428" "St18"          -0.3278807 0.9999522
10344707 "10383518" "Pcmtd1"        -0.2231074 0.9999522
10344713 "10397054" "Ahcy"          -0.1844897 0.9999522
10344723 "10384020" "Rrs1"          -0.2322781 0.9999522
10344725 "10608710" "Adhfe1"        0.5993566  0.9999522
10344741 "10363762" "Hnrnpa3"       -0.2660978 0.9999522
10344743 "10375058" "3110035E14Rik" 0.9178868  0.9999522
10344750 "10381603" "Sgk3"          -0.2961638 0.9999522
10344772 "10442373" "6030422M02Rik" -0.1653454 0.9999522
10344789 "10421227" "Cspp1"         -0.1480766 0.9999522
10344799 "10534966" "Cspp1"         -0.2436361 0.9999522
10344801 "10398408" "Cspp1"         -0.4040665 0.9999522
10344803 "10398418" "Cspp1"         -0.2556627 0.9999522
10344805 "10572772" "Cspp1"         -0.1864641 0.9999522

I want to sort on the "Gene Symbol" column so that I can remove the duplicates and keep the one with the highest log fold change.

I tried the following and received an error.

> x[order(x[,2]),]
Error in order(x[, 2]) : unimplemented type 'list' in 'orderVector1'

If anyone has any suggestions for an easy way to sort a significant gene list, remove duplicated values, and keep the value with the highest fold change, that would be helpful!

I've posted my session info below.

Thanks!

Guest

-- output of sessionInfo():

> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.15.1

--
Sent via the guest posting facility at bioconductor.org.
@james-w-macdonald-5106
On 10/23/2012 11:15 AM, Guest [guest] wrote:
> I tried the following and received an error.
>> x[order(x[,2]),]
> Error in order(x[, 2]) : unimplemented type 'list' in 'orderVector1'

I am not sure the sessionInfo() you posted corresponds to the session above. I get:

> x <- data.frame(ID = 12345:12354, Gene = Rkeys(mogene10sttranscriptclusterSYMBOL)[5001:5010], logFC = rnorm(10), pval = runif(10))
> x
      ID   Gene       logFC      pval
1  12345  Sepw1  0.56914952 0.4916910
2  12346  Serf1  0.83929962 0.4816986
3  12347 Gm4748  0.12462117 0.9372249
4  12348   Sez6 -0.21468480 0.4921201
5  12349  Foxp3 -1.36283694 0.4575675
6  12350  Sfpi1  1.03632565 0.5251826
7  12351  Sfrp1  0.04689108 0.3068112
8  12352   Frzb  0.08379607 0.1509499
9  12353  Sfrp4 -1.61513620 0.9336235
10 12354  Srsf2  1.56222316 0.2571122
> x[order(x[,2]),]
      ID   Gene       logFC      pval
5  12349  Foxp3 -1.36283694 0.4575675
8  12352   Frzb  0.08379607 0.1509499
3  12347 Gm4748  0.12462117 0.9372249
1  12345  Sepw1  0.56914952 0.4916910
2  12346  Serf1  0.83929962 0.4816986
4  12348   Sez6 -0.21468480 0.4921201
6  12350  Sfpi1  1.03632565 0.5251826
7  12351  Sfrp1  0.04689108 0.3068112
9  12353  Sfrp4 -1.61513620 0.9336235
10 12354  Srsf2  1.56222316 0.2571122

It appears you have something loaded that thinks you want to use the orderVector1() function. You can always specify the function you are intending with the :: operator (in this case, you want base::order()).

> x[base::order(x[,2]),]
      ID   Gene       logFC      pval
5  12349  Foxp3 -1.36283694 0.4575675
8  12352   Frzb  0.08379607 0.1509499
3  12347 Gm4748  0.12462117 0.9372249
1  12345  Sepw1  0.56914952 0.4916910
2  12346  Serf1  0.83929962 0.4816986
4  12348   Sez6 -0.21468480 0.4921201
6  12350  Sfpi1  1.03632565 0.5251826
7  12351  Sfrp1  0.04689108 0.3068112
9  12353  Sfrp4 -1.61513620 0.9336235
10 12354  Srsf2  1.56222316 0.2571122

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
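The second half of the question (remove duplicated gene symbols and keep the row with the highest log fold change) is not answered directly in this reply. A minimal sketch of one way to do it is below. It assumes the table is an ordinary data frame with atomic columns, and it reads "highest" as the largest signed logFC; the object names x_sorted and x_unique are made up for illustration, and the rows are a small subset copied from the question.

```r
# Toy data frame with a few rows copied from the question.
x <- data.frame(
  ID    = c("10421227", "10534966", "10398408", "10371400"),
  Gene  = c("Cspp1", "Cspp1", "Cspp1", "Lypla1"),
  logFC = c(-0.1480766, -0.2436361, -0.4040665, 0.3592492),
  stringsAsFactors = FALSE
)

# Sort by gene symbol, then by decreasing logFC within each symbol.
x_sorted <- x[order(x$Gene, -x$logFC), ]

# duplicated() flags every occurrence after the first, so dropping the
# flagged rows keeps the highest-logFC row for each gene symbol.
x_unique <- x_sorted[!duplicated(x_sorted$Gene), ]
print(x_unique)
```

If the largest change in either direction is wanted instead, sorting on -abs(x$logFC) in place of -x$logFC would be the corresponding adjustment.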
Hi Jim,

The R session info I posted does correspond to the session I pasted. When I tried your suggestion, I still get an error:

> x[base::order(x[,2]),]
Error in base::order(x[, 2]) : unimplemented type 'list' in 'orderVector1'

I see that you don't have quotes around the ID and Gene Symbol names in your matrix. Is there a way to remove the quotes?

Thanks!
@james-w-macdonald-5106
What do you get from

class(x)
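As a hedged aside, class() on its own may not be enough to diagnose this kind of object, because a matrix can be stored as a list as well as an atomic vector. The sketch below builds a small, hypothetical list-mode matrix (two rows borrowed from the question; how the original x was actually created is not known) and shows the checks that distinguish it from an ordinary character or numeric matrix.

```r
# Hypothetical stand-in for x: a matrix whose cells are list elements.
# matrix() fills column by column, so the six values form three columns.
x <- matrix(
  list("10371400", "10453900", "Lypla1", "Tcea1", 0.3592492, 0.1886117),
  nrow = 2,
  dimnames = list(c("10344624", "10344633"), c("ID", "Gene Symbol", "logFC"))
)

print(class(x))        # includes "matrix", as in the thread
print(typeof(x))       # storage type: "list" rather than "character" or "double"
print(is.list(x[, 2])) # a single column comes back as a list, not a vector
str(x[, 2])
```

A matrix like this prints character cells with quotes and numeric cells without them, which is at least consistent with the output shown in the question.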
I get:

> class(x)
[1] "matrix"
@james-w-macdonald-5106
Also, what do you get from

orderVector1
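For context, and as far as I can tell, orderVector1 is the name of a routine inside R's compiled sorting code rather than an R-level function, so it would not be expected to show up as an object in a plain session. The sketch below checks both halves of that under this assumption: the name is not found, yet calling order() on a list still produces an error. The variable names found and msg are only for illustration.

```r
# In a plain session there is no R object with this name.
found <- exists("orderVector1")

# order() on a list fails inside the sorting code; tryCatch() captures
# the error message instead of stopping the script.
msg <- tryCatch(
  {
    order(list(3, 1, 2))
    NA_character_  # only reached if order() unexpectedly succeeds
  },
  error = function(e) conditionMessage(e)
)

print(found)
print(msg)
```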
I do not get anything:

> orderVector1
Error: object 'orderVector1' not found
@james-w-macdonald-5106
Weird. I don't see how that is possible, as it appears you have both character and numeric values in that matrix, which is not allowed in an ordinary (atomic) matrix. You are getting the error that you should get if x[,2] is a list:

    > x2 <- list(rnorm(100), letters)
    > x2[order(x2)]
    Error in .Method(..., na.last = na.last, decreasing = decreasing) :
      unimplemented type 'list' in 'orderVector1'

I would recommend starting again - your x object seems to be busted somehow.

Best,

Jim

On 10/23/2012 11:42 AM, Kasoji, Manjula (NIH/NCI) [C] wrote:
> I get:
>
>     > class(x)
>     [1] "matrix"
>
> On 10/23/12 11:41AM, "James W. MacDonald" <jmacdon at uw.edu> wrote:
>
>> What do you get from
>>
>>     class(x)
>>
>> On 10/23/2012 11:38 AM, Kasoji, Manjula (NIH/NCI) [C] wrote:
>>> Hi Jim,
>>>
>>> The R session info below does correspond to the session I pasted. When I
>>> tried your suggestion, I still get an error:
>>>
>>>     > x[base::order(x[,2]),]
>>>     Error in base::order(x[, 2]) :
>>>       unimplemented type 'list' in 'orderVector1'
>>>
>>> I see that you don't have quotes around the ID and Gene Symbol names in
>>> your matrix. Is there a way to remove the quotes?
>>>
>>> Thanks!

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
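One way an object like this can arise is sketched below. It is only an illustration under assumed inputs: the names `ids`, `syms` and `logFC` are hypothetical stand-ins, not the poster's actual objects. The point it shows is that cbind() of lists with a numeric vector gives a matrix whose storage type is list, which prints much like the matrix in the question and makes order() fail with or without the base:: prefix.

```r
# Hypothetical stand-ins: annotation values held in lists (as a lookup such
# as mget() returns them) and an ordinary numeric vector of statistics.
ids   <- list("10371400", "10453900", "10375051")
syms  <- list("Lypla1", "Tcea1", "Atp6v1h")
logFC <- c(0.3592492, 0.1886117, 0.6713107)

# Combining lists with an atomic vector yields a matrix of mode "list".
x <- cbind(ID = ids, "Gene Symbol" = syms, logFC = logFC)

is.matrix(x)    # still a matrix as far as class() is concerned
typeof(x)       # but the storage type is "list"
class(x[, 2])   # so a single column is a list, not a character vector

# order() has no method for lists; capture the error message instead of stopping.
res <- tryCatch(order(x[, 2]), error = function(e) conditionMessage(e))
res
```

Checking typeof(x) in addition to class(x) is what distinguishes this case, since class(x) alone reports a matrix.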
@james-w-macdonald-5106
If you want to annotate data, an easier way to do it is to use the annaffy package - you can output either text or HTML tables. I have some functions in affycoretools to automate going from an MArrayLM object to the HTML or text tables if you are interested.

Best,

Jim

On 10/23/2012 2:32 PM, Kasoji, Manjula (NIH/NCI) [C] wrote:
> Thanks, guys. I think I got that because I did a cbind() with my ebayes()
> results and my annotation results from mget() that I used to annotate my
> significant genes from the mogene10sttranscriptcluster db.
>
> I'll try out a few things. If you guys have any further suggestions or
> recommendations I will certainly appreciate them.
>
> Thanks!
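If the annotation is kept in base R rather than handed to annaffy, a sketch along the following lines avoids the list matrix. It assumes the lookup result is a named list with one element per probe ID, which is how mget() returns its values. The objects `sym_list` and `stats` are hypothetical stand-ins built from values in this thread, not output from the real annotation package or model fit.

```r
# Stand-in for an mget() result: a named list, one element per probe ID.
sym_list <- list("10344624" = "Lypla1", "10344633" = "Tcea1",
                 "10344789" = "Cspp1",  "10344799" = "Cspp1")

# Stand-in for the statistics table, keyed by the same probe IDs.
stats <- data.frame(logFC    = c(0.3592492, 0.1886117, -0.1480766, -0.2436361),
                    Adj.PVal = 0.9999522,
                    row.names = names(sym_list))

# Flatten the list to a character vector, taking the first symbol per probe.
symbols <- vapply(sym_list, function(s) as.character(s[1]), character(1))

# A data.frame keeps character and numeric columns side by side.
annotated <- data.frame(ID = names(symbols), Symbol = symbols, stats,
                        stringsAsFactors = FALSE)

annotated[order(annotated$Symbol), ]
```

Flattening before combining is the design choice here: cbind() on the raw list is what produces a matrix that order() cannot handle.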
Well, the fix was easy. I just called noquote() on my matrix, and now I can order the rows and simply use the duplicated() function to remove the duplicates. In the example below this keeps the Tgtp1 row with the higher FC, because duplicated() retains the first occurrence of each symbol and that row happens to come first; to guarantee this in general, the rows also need to be sorted by decreasing logFC within each gene symbol. Pasting an example below in case others want to view. :-)

Thanks!

    > zz=z[order(z[,2]),]
    > zz
             ID       Gene Symbol logFC            Adj.PVal
    10496580 10496580 Gbp3        1.00088125196237 0.044611409531886
    10496539 10496539 Gbp5        1.30128040497582 0.0319569661457467
    10531994 10531994 Gbp6        1.19085298275753 0.0490943973094095
    10496569 10496569 Gbp7        1.0421928217272  0.0490943973094095
    10376324 10376324 Gm12250     1.60937590067288 0.030458897264666
    10490273 10490273 Gm14305     1.01718341526644 0.0368613093977217
    10455961 10455961 Iigp1       1.0315422556842  0.044611409531886
    10376326 10376326 Irgm2       1.36277705961511 0.0323289276651196
    10398039 10398039 Serpina3f   1.0686870563162  0.044611409531886
    10385518 10385518 Tgtp1       1.64120481997653 0.0384608883577761
    10385533 10385533 Tgtp1       1.37274810522256 0.044611409531886
    > zz[!duplicated(zz[,2]),]
             ID       Gene Symbol logFC            Adj.PVal
    10496580 10496580 Gbp3        1.00088125196237 0.044611409531886
    10496539 10496539 Gbp5        1.30128040497582 0.0319569661457467
    10531994 10531994 Gbp6        1.19085298275753 0.0490943973094095
    10496569 10496569 Gbp7        1.0421928217272  0.0490943973094095
    10376324 10376324 Gm12250     1.60937590067288 0.030458897264666
    10490273 10490273 Gm14305     1.01718341526644 0.0368613093977217
    10455961 10455961 Iigp1       1.0315422556842  0.044611409531886
    10376326 10376326 Irgm2       1.36277705961511 0.0323289276651196
    10398039 10398039 Serpina3f   1.0686870563162  0.044611409531886
    10385518 10385518 Tgtp1       1.64120481997653 0.0384608883577761
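To keep the row with the highest fold change for each symbol regardless of the input order, the sort can include logFC as a second key. The sketch below is a minimal illustration, assuming the data sit in a data.frame with a numeric logFC column; the data.frame `df` and its column names are hypothetical, with values copied from the example above and the two Tgtp1 rows deliberately placed in the "wrong" order.

```r
# Hypothetical data.frame; the lower-FC Tgtp1 row comes first on purpose.
df <- data.frame(
  ID     = c("10385533", "10385518", "10496580", "10496539"),
  Symbol = c("Tgtp1", "Tgtp1", "Gbp3", "Gbp5"),
  logFC  = c(1.37274810522256, 1.64120481997653,
             1.00088125196237, 1.30128040497582),
  stringsAsFactors = FALSE
)

# Sort by symbol, then by decreasing logFC within each symbol.
sorted <- df[order(df$Symbol, -df$logFC), ]

# duplicated() flags every occurrence after the first, so the first
# (highest-logFC) row of each symbol is the one that survives.
best <- sorted[!duplicated(sorted$Symbol), ]
best
```

If "highest" is meant in absolute terms, so that strongly down-regulated probes also count, `-abs(df$logFC)` can replace `-df$logFC` as the second sort key.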
Axel Klenk ▴ 940
@axel-klenk-3224
Dear Guest,

I think your approach is valid in general and it is your x that is causing the problem; column 'Gene Symbol' appears to contain two values. What is the result of

    str(x)

and/or

    class(x[,2])

?

Cheers,

 - axel

Axel Klenk
Research Informatician
Information Management Drug Discovery
Actelion Pharmaceuticals Ltd., Gewerbestrasse 16, CH-4123 Allschwil, Switzerland
G12.O1.R10
axel.klenk at actelion.com, www.actelion.com
Address for visitors: Hegenheimermattweg 92
Hi Alex,

Please see the output below:

    > str(x)
    List of 80
     $ : chr "10371400"
     $ : chr "10453900"
     $ : chr "10375051"
     $ : chr "10575211"
     $ : chr "10566254"
     $ : chr "10602372"
     $ : chr "10398428"
     $ : chr "10383518"
     $ : chr "10397054"
     $ : chr "10384020"
     $ : chr "10608710"
     $ : chr "10363762"
     $ : chr "10375058"
     $ : chr "10381603"
     $ : chr "10442373"
     $ : chr "10421227"
     $ : chr "10534966"
     $ : chr "10398408"
     $ : chr "10398418"
     $ : chr "10572772"
     $ : chr "Lypla1"
     $ : chr "Tcea1"
     $ : chr "Atp6v1h"
     $ : chr "Oprk1"

    > class(x[,2])
    [1] "list"
Axel Klenk ▴ 940
@axel-klenk-3224
Last seen 3 hours ago
Switzerland
Dear Manjula,

Wow. How did you create that? :-)

order() doesn't like lists:

> order(list(1:3))
Error in order(list(1:3)) : unimplemented type 'list' in 'orderVector1'

and I think you should try to make your x look something like the data.frame that Jim has used in his example and it will work.

Cheers,

Axel (not Alex!!) Klenk
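For anyone landing here with the same error, here is a minimal sketch of what Axel describes. The toy objects `ids` and `symbols` are hypothetical stand-ins built from a few values in the thread, not the poster's real data, and the data.frame is only one plausible shape for a fixed `x`. It assumes every cell holds exactly one value; if a cell can hold several symbols, `unlist()` would change the number of rows and each cell would need to be reduced to a single value first.

```r
# Hypothetical toy data: cbind() on lists gives a matrix of mode "list",
# which matches what str(x) and class(x[,2]) reported in the thread.
ids     <- list("10371400", "10453900", "10375051")
symbols <- list("Lypla1", "Tcea1", "Atp6v1h")
x <- cbind(ID = ids, Symbol = symbols)

col_class <- class(x[, "Symbol"])   # a list, not a character vector

# order() fails on a list column; capture the error instead of stopping.
err <- tryCatch(order(x[, "Symbol"]), error = function(e) e)

# Flatten each column to an atomic vector and build a data.frame.
# Assumption: each cell has length 1, so unlist() keeps one value per row.
df <- data.frame(
  ID     = unlist(x[, "ID"]),
  Symbol = unlist(x[, "Symbol"]),
  stringsAsFactors = FALSE
)

# Sorting by gene symbol now works as in the original attempt.
sorted <- df[order(df$Symbol), ]
print(sorted)
```

With atomic columns, the original call `x[order(x[,2]),]` should behave the same way on the data.frame.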
Thanks, guys. I think I got that because I did a cbind() of my ebayes() results with the annotation results from mget(), which I used to annotate my significant genes from the mogene10sttranscriptcluster.db package. I'll try out a few things. If you have any further suggestions or recommendations, I would certainly appreciate them.

Thanks!
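The cause described above, and the original goal of keeping one row per gene, can be sketched as follows. Everything here is an assumption for illustration: `fit_tab` is a hypothetical stand-in for a table of statistics, `sym_list` imitates the named list that mget() returns, and the values are copied from the rows shown in the question. "Highest log fold change" is read as the largest signed logFC; ordering on `-abs(logFC)` instead would keep the largest change in either direction.

```r
# Hypothetical stand-ins for the statistics table and the mget() result.
fit_tab <- data.frame(
  ID    = c("10421227", "10534966", "10398408", "10566254"),
  logFC = c(-0.1480766, -0.2436361, -0.4040665, 1.790676),
  stringsAsFactors = FALSE
)
sym_list <- list("10421227" = "Cspp1", "10534966" = "Cspp1",
                 "10398408" = "Cspp1", "10566254" = "Rb1cc1")

# Binding a vector to a list with cbind() produces a list matrix,
# the kind of object that order() cannot handle.
bad <- cbind(ID = fit_tab$ID, Symbol = sym_list)

# Instead, reduce the list to one symbol per probe (first entry if several)
# and store it as an ordinary character column.
fit_tab$Symbol <- unname(
  vapply(sym_list[fit_tab$ID], function(s) s[1], character(1))
)

# Sort by symbol, then by decreasing logFC within each symbol,
# and keep the first row of each symbol.
srt  <- fit_tab[order(fit_tab$Symbol, -fit_tab$logFC), ]
best <- srt[!duplicated(srt$Symbol), ]
print(best)
```

Because duplicated() flags every occurrence after the first, sorting beforehand decides which probe survives for each gene.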