Merge dataframes
Aleš Maver ▴ 80
@ales-maver-3556
Dear Joao,

you have to supply merge with two data.frame objects (you supplied it with a data.frame and a vector instead). If the two data frames have matching column names, the merge function will match the rows on those columns by itself. So a simple solution to your problem would be to use:

merge(data1, data2, all.x=TRUE)  # I've added all.x so that NAs are produced when there is no matching value in data2

Hope this is of use,
Ales

2011/10/5 João Daniel Nunes Duarte <jdanielnd@gmail.com>

> Hello,
>
> I am having some problems using the 'merge' function. I'm not sure I have
> understood how it works.
>
> What I want to do is:
>
> 1) Suppose I have a dataframe like:
>
>   height width
> 1    1.1   2.3
> 2    2.1   2.5
> 3    1.8   1.9
> 4    1.6   2.1
> 5    1.8   2.4
>
> 2) And I generate a second dataframe sampled from this one, like:
>
>   height width
> 1    1.1   2.3
> 3    1.8   1.9
> 5    1.8   2.4
>
> 3) Next, I add a new variable to this dataframe:
>
>   height width color
> 1    1.1   2.3   red
> 3    1.8   1.9   red
> 5    1.8   2.4  blue
>
> 4) So, I want to merge those dataframes, so that the new variable, color,
> is bound to the first dataframe. Of course some cases won't have a value
> for it, since I generated this variable in a smaller dataframe. In those
> cases I want the value to be NA. The resulting dataframe should be:
>
>   height width color
> 1    1.1   2.3   red
> 2    2.1   2.5    NA
> 3    1.8   1.9   red
> 4    1.6   2.1    NA
> 5    1.8   2.4  blue
>
> I have written some code, but it is not working properly. The new
> variable has its values mixed up, and they do not correspond to its
> row.names.
>
> # Generate the first dataframe
> data1 <- data.frame(height=rnorm(20,3,0.2),width=rnorm(20,2,0.5))
> # Sample a smaller dataframe from data1
> data2 <- data1[sample(1:20,15,replace=F),]
> # Generate the new variable
> color <- sample(c("red","blue"),15,replace=T)
> # Bind the new variable to data2
> data2 <- cbind(data2, color)
> # Merge the data1 and data2$color by row.names, and force it to has the same
> # values that data1. Next it generates a new dataframe where column 1 is the
> # row.name, and then sort it by the row.name from data1.
> data.frame(merge(data1,data2$color, by=0, all.x=T),row.names=1)[row.names(data1),]
>
> I'm not sure what am I doing wrong. Can anyone see where the mistake is?
>
> Thank you!
>
> Cheers,
>
> Joao D.

--
Ales Maver, MD
Institute of Medical Genetics, Department of Obstetrics and Gynaecology
UMC Ljubljana
Šlajmerjeva 3
SI-1000 Ljubljana
Slovenia
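
For readers who want to try this out, here is a minimal sketch of both approaches, using the data1/data2 setup from the question. The set.seed() call and the names by_cols and by_rows are additions for illustration only and are not part of the original thread. The likely cause of the mixed-up values in the question is that data2$color is a bare vector: merge() coerces it to a data frame with fresh row names 1 to 15, so the colors line up with rows 1 to 15 of data1 rather than with the sampled rows. Passing the one-column data frame data2["color"] keeps the sampled row names.

```r
# Reproducible version of the setup from the question.
set.seed(1)  # assumption: seed added here only so the example is repeatable
data1 <- data.frame(height = rnorm(20, 3, 0.2), width = rnorm(20, 2, 0.5))
data2 <- data1[sample(1:20, 15, replace = FALSE), ]
data2$color <- sample(c("red", "blue"), 15, replace = TRUE)

# Option 1 (suggested in the reply): merge on the shared columns
# height and width. merge() sorts the result by the key columns and
# resets the row names, so the row order of data1 is not preserved.
by_cols <- merge(data1, data2, all.x = TRUE)

# Option 2: merge on row names (by = 0), passing a one-column data
# frame instead of the vector data2$color so the row names survive.
by_rows <- merge(data1, data2["color"], by = 0, all.x = TRUE)
rownames(by_rows) <- by_rows$Row.names        # restore original row names
by_rows <- by_rows[rownames(data1), c("height", "width", "color")]

print(by_rows)
```

Option 1 relies on the height/width pairs being unique, which is a reasonable assumption for values drawn with rnorm() but not for real data with repeated measurements. Option 2 does not depend on the values at all, only on the row names.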