Dear All,
I have a data frame from which I want to plot two variables, one on the x-axis and one on the y-axis.
A sample of my data frame looks like this, but imagine it with a lot more columns:
Groups   X-axis   Y-axis
Group1   10       20
Group1   15       25
Group1   20       25
Group2   5        10
Group2   10       15
......
When I try to plot all groups using the code below:
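library(ggplot2)
# Column names containing a hyphen must be quoted with backticks inside aes()
c <- ggplot(data = plotMatrix, aes(x = `X-axis`, y = `Y-axis`, group = Groups, colour = Groups))
c + stat_smooth(se = FALSE)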
I get a plot in which one of the lines looks like an outlier. Because I have many groups, and the colours are very similar when there are that many, I cannot tell which group the outlier belongs to, so I wanted to split the dataset into smaller subsets.

This is where I noticed that I get a different plot depending on which groups I include in the subset. I do not think I am making a mistake in the subsetting, as the number of rows for each group is the same across subsets (I double-checked using R and Excel). It is just that some groups (e.g. Group1) are drawn as one of two different lines depending on what else is in the subset:

Group1 only: representation #1
Group1, Group2 and Group4: representation #2
Group1, Group5 and Group17: representation #2
Group1 and Group3: representation #1 again

Does anyone know what might cause this? Thanks for your input.
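
To make the subsetting step concrete, here is a minimal sketch with made-up data. The data, the column names X and Y, and the names keep and subsetMatrix are all assumptions for illustration, not the real plotMatrix. One thing that may be worth checking: as far as I understand the ggplot2 documentation, when no method is given, stat_smooth chooses the smoother from the size of the largest group in the data being plotted (loess below 1,000 observations, a GAM otherwise), so the same group could be fitted differently depending on which other groups are in the subset. The sketch sets the method explicitly to rule that out.

```r
library(ggplot2)

# Made-up data standing in for plotMatrix; column names are hypothetical
set.seed(1)
plotMatrix <- data.frame(
  Groups = rep(c("Group1", "Group2", "Group3"), each = 20),
  X = rep(seq(5, 100, by = 5), times = 3)
)
plotMatrix$Y <- plotMatrix$X + rnorm(nrow(plotMatrix), sd = 5)

# Keep only the groups of interest
keep <- c("Group1", "Group3")
subsetMatrix <- subset(plotMatrix, Groups %in% keep)

# Check that the row count per group survives the subsetting
print(table(subsetMatrix$Groups)[keep])

# Set the smoother explicitly so it does not depend on the subset's size
p <- ggplot(subsetMatrix, aes(x = X, y = Y, group = Groups, colour = Groups)) +
  stat_smooth(method = "loess", formula = y ~ x, se = FALSE)
print(p)
```

If the Group1 line stays the same across subsets once the method is fixed, the default method selection would be the likely explanation; if it still changes, the cause is somewhere else.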