Merge dataframes

Aleš Maver ▴ 80

@ales-maver-3556

Dear Joao,

You have to supply merge() with two data.frame objects (you supplied it with a data.frame and a vector instead). If the two data frames have matching column names, merge() will match the rows on those columns by itself.

So, a simple solution to your problem would be:

    merge(data1, data2, all.x = TRUE)  # all.x = TRUE produces NAs where data2 has no matching row

Hope this is of use,
Ales

2011/10/5 João Daniel Nunes Duarte <jdanielnd@gmail.com>

> Hello,
>
> I am having some problems using the 'merge' function. I'm not sure I
> understand how it works.
>
> What I want to do is:
>
> 1) Suppose I have a dataframe like:
>
>   height width
> 1    1.1   2.3
> 2    2.1   2.5
> 3    1.8   1.9
> 4    1.6   2.1
> 5    1.8   2.4
>
> 2) And I generate a second dataframe sampled from this one, like:
>
>   height width
> 1    1.1   2.3
> 3    1.8   1.9
> 5    1.8   2.4
>
> 3) Next, I add a new variable to this dataframe:
>
>   height width color
> 1    1.1   2.3   red
> 3    1.8   1.9   red
> 5    1.8   2.4  blue
>
> 4) So, I want to merge those dataframes, so that the new variable, color,
> is bound to the first dataframe. Of course some cases won't have a value
> for it, since I generated this variable in a smaller dataframe. In those
> cases I want the value to be NA. The resulting dataframe should be:
>
>   height width color
> 1    1.1   2.3   red
> 2    2.1   2.5    NA
> 3    1.8   1.9   red
> 4    1.6   2.1    NA
> 5    1.8   2.4  blue
>
> I have written some code, but it is not working properly. The new
> variable has its values mixed up, and they do not correspond to its
> row.names.
>
> # Generate the first dataframe
> data1 <- data.frame(height=rnorm(20,3,0.2),width=rnorm(20,2,0.5))
> # Sample a smaller dataframe from data1
> data2 <- data1[sample(1:20,15,replace=F),]
> # Generate the new variable
> color <- sample(c("red","blue"),15,replace=T)
> # Bind the new variable to data2
> data2 <- cbind(data2, color)
> # Merge data1 and data2$color by row.names, and force the result to have
> # the same values as data1. Next, generate a new dataframe where column 1
> # is the row.name, and then sort it by the row.names of data1.
> data.frame(merge(data1,data2$color, by=0,
> all.x=T),row.names=1)[row.names(data1),]
>
> I'm not sure what I am doing wrong. Can anyone see where the mistake is?
>
> Thank you!
>
> Cheers,
>
> Joao D.

--
Ales Maver, MD
Institute of Medical Genetics, Department of Obstetrics and Gynaecology
UMC Ljubljana
Šlajmerjeva 3
SI-1000 Ljubljana
Slovenia
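For anyone landing on this thread later, here is a minimal sketch of the two ways to get the result described in the question, using the five-row toy data from the question rather than the random data. The object names by_columns and by_rownames are made up for illustration. As far as I can tell, the original attempt fails because data2$color is a bare vector: merge() converts it to a data frame with fresh row names 1 to 15, so the row names of data2 are lost before the match happens.

```r
# Toy data from the question; row names 1..5 are the default ones.
data1 <- data.frame(height = c(1.1, 2.1, 1.8, 1.6, 1.8),
                    width  = c(2.3, 2.5, 1.9, 2.1, 2.4))

# Subset of rows 1, 3 and 5, with the new variable added.
data2 <- data1[c(1, 3, 5), ]
data2$color <- c("red", "red", "blue")

# Option 1: merge on the shared columns (height, width).
# all.x = TRUE keeps every row of data1 and fills color with NA where
# data2 has no match. merge() sorts the result by the key columns by
# default, so the original row order and row names are not kept.
by_columns <- merge(data1, data2, all.x = TRUE)

# Option 2: merge on row names, passing a one-column data frame
# (data2["color"], not the vector data2$color) so that the row names
# of data2 are carried along.
by_rownames <- merge(data1, data2["color"], by = "row.names", all.x = TRUE)

# merge() returns the row names in a column called "Row.names";
# restore them and put the rows back in data1's order.
rownames(by_rownames) <- as.character(by_rownames$Row.names)
by_rownames <- by_rownames[rownames(data1), c("height", "width", "color")]

print(by_columns)
print(by_rownames)
```

Option 1 relies on each (height, width) pair being unique, which is a safe bet for rnorm() output but not for data with repeated measurements. Option 2 does not depend on the values at all, only on the row names, so it is probably the closer match to what the question was trying to do.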

Genetics • 857 views

13.5 years ago • Aleš Maver ▴ 80
