Dear All,

I have a dataframe from which I want to plot two variables against each other, one on the x-axis and one on the y-axis.

A sample of my dataframe looks like this (the real one has many more columns):

Groups   X-axis   Y-axis
Group1   10       20
Group1   15       25
Group1   20       25
Group2   5        10
Group2   10       15
......

When I try to plot all groups using the code below:

c <- ggplot(data = plotMatrix, aes(x = `X-axis`, y = `Y-axis`, group = Groups, colour = Groups))

c + stat_smooth(se=FALSE)
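For anyone who wants to reproduce the setup, here is a minimal sketch with simulated data. The values, the number of groups and the object name p are made up for illustration; only the column layout and the plotting call follow the post. Note that a column name containing a hyphen has to be wrapped in backticks inside aes(), otherwise R reads it as a subtraction.

```r
library(ggplot2)

# Simulated stand-in for plotMatrix (NOT the real data): 3 groups, 40 rows each.
set.seed(1)
groups <- rep(paste0("Group", 1:3), each = 40)
x <- rep(seq(5, 44, length.out = 40), times = 3)

plotMatrix <- data.frame(
  Groups   = groups,
  `X-axis` = x,
  `Y-axis` = 10 + 0.5 * x + rnorm(length(x), sd = 2),
  check.names = FALSE  # keep the hyphenated names instead of X.axis / Y.axis
)

# Same call as in the post, with the non-syntactic column names quoted.
# The smoothing method is left at its default, as in the post.
p <- ggplot(plotMatrix,
            aes(x = `X-axis`, y = `Y-axis`, group = Groups, colour = Groups)) +
  stat_smooth(se = FALSE)
```

Printing p draws one smoothed line per group. When no method is given, ggplot2 prints a message saying which smoothing method it picked.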

I get a plot in which one of the lines looks like an outlier. Because I have many groups, the colours are too similar to tell which group it is, so I wanted to split the dataset into smaller subsets and plot those.

This is where I realized that I get a different plot depending on which groups I put into a subset. I am not making a mistake in subsetting: the number of rows per group is the same in every subset (I double-checked in R and Excel). Yet some groups (for example Group1) are drawn as one of two different lines depending on what else is plotted with them:

Group1 only: representation #1
Group1, Group2 and Group4 together: representation #2
Group1, Group5 and Group17 together: representation #2
Group1 and Group3 together: representation #1 again

Does anyone know what causes this? Thanks for your input.
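One thing that may be worth ruling out (an assumption, the post does not confirm it): when stat_smooth() is called without a method, ggplot2 chooses the smoother from the size of the largest group in the plot, using loess below 1,000 observations and a GAM otherwise. A small group could then be fitted differently depending on which other groups share the plot. The sketch below uses simulated data; the group sizes, the column names x_val and y_val, and the helpers makeGroup, subsetGroups and plotSmooth are all hypothetical. It pins the method so the fit for a group should not depend on its companions.

```r
library(ggplot2)

# Simulated stand-in for plotMatrix (NOT the real data): one small group
# and one group with more than 1,000 rows.
set.seed(42)
makeGroup <- function(name, n) {
  x <- runif(n, 0, 30)
  data.frame(Groups = name, x_val = x, y_val = 10 + 0.8 * x + rnorm(n, sd = 3))
}
plotMatrix <- rbind(makeGroup("Group1", 200), makeGroup("Group2", 1500))

# Keep only the rows that belong to the requested groups.
subsetGroups <- function(df, keep) df[df$Groups %in% keep, , drop = FALSE]

only1 <- subsetGroups(plotMatrix, "Group1")
both  <- subsetGroups(plotMatrix, c("Group1", "Group2"))

# Sanity check: the Group1 rows are the same in both subsets.
stopifnot(identical(only1$y_val, both$y_val[both$Groups == "Group1"]))

# Fix the smoother explicitly instead of relying on the default choice.
plotSmooth <- function(df) {
  ggplot(df, aes(x = x_val, y = y_val, colour = Groups)) +
    stat_smooth(method = "loess", formula = y ~ x, se = FALSE)
}

p1 <- plotSmooth(only1)  # Group1 alone
p2 <- plotSmooth(both)   # Group1 together with the large group
```

If the Group1 line still differs between p1 and p2 with the method fixed, the cause lies elsewhere. To find the outlying group without subsetting at all, adding facet_wrap(~ Groups) to the plot draws each group in its own panel.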

written 3.3 years ago by alptaciroglu

