How can I create a scatter plot in R so that all points are shown, even when several observations in a category share the same value? Besides the data points, I would also like to show the average value for each category.
For example, suppose I have a data set of two variables, where one of them (Cotton Weight Percentage) is a factor:
dat <- structure(list(`Tensile Strength` = c(12L, 19L, 17L, 7L, 25L,
7L, 14L, 12L, 18L, 22L, 18L, 7L, 18L, 18L, 15L, 10L, 11L, 19L,
11L, 19L, 15L, 19L, 11L, 23L, 9L), `Cotton weight percent` = c(20L,
30L, 20L, 35L, 30L, 15L, 25L, 20L, 25L, 30L, 20L, 15L, 25L, 20L,
15L, 35L, 35L, 25L, 15L, 25L, 35L, 30L, 35L, 30L, 15L)), .Names = c("Tensile Strength",
"Cotton weight percent"), class = "data.frame", row.names = c(NA,
-25L))
How can I make a scatter plot like this one:
Here, solid dots are the individual observations and the open circles are the average observed tensile strengths.
This can be done in ggplot2 with geom_jitter and stat_summary. Specifically, the geom_jitter layer gives you the black points on your graph. (The "jitter" adds some noise along the x-axis, as occurs in your example.)
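A minimal sketch of the jitter layer, assuming ggplot2 is installed and using the data from the question. Wrapping the x variable in factor() and the specific width value are my own assumptions for illustration; height = 0 keeps the tensile strength values exact so only the horizontal position is perturbed.

```r
library(ggplot2)

# Data from the question; check.names = FALSE keeps the spaces in the column names
dat <- data.frame(
  `Tensile Strength` = c(12, 19, 17, 7, 25, 7, 14, 12, 18, 22, 18, 7, 18,
                         18, 15, 10, 11, 19, 11, 19, 15, 19, 11, 23, 9),
  `Cotton weight percent` = c(20, 30, 20, 35, 30, 15, 25, 20, 25, 30, 20, 15, 25,
                              20, 15, 35, 35, 25, 15, 25, 35, 30, 35, 30, 15),
  check.names = FALSE
)

set.seed(1)  # jitter is random; fix the seed so the plot is reproducible

# Column names with spaces need backticks inside aes().
# factor() treats the cotton percentage as a category (an assumption here).
p <- ggplot(dat, aes(x = factor(`Cotton weight percent`),
                     y = `Tensile Strength`)) +
  geom_jitter(width = 0.1, height = 0) +  # horizontal noise only
  labs(x = "Cotton weight percent", y = "Tensile Strength")

print(p)
```

A smaller width keeps the points close to their category; increase it if identical values still overlap.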
Then the stat_summary layer lets you add a point for the average of each x value (which I've made large and red).
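Putting the two layers together might look like the sketch below. It assumes a ggplot2 version that accepts the fun argument in stat_summary (older releases used fun.y instead); the size, colour, and jitter width are illustrative choices rather than fixed requirements.

```r
library(ggplot2)

# Data from the question; check.names = FALSE keeps the spaces in the column names
dat <- data.frame(
  `Tensile Strength` = c(12, 19, 17, 7, 25, 7, 14, 12, 18, 22, 18, 7, 18,
                         18, 15, 10, 11, 19, 11, 19, 15, 19, 11, 23, 9),
  `Cotton weight percent` = c(20, 30, 20, 35, 30, 15, 25, 20, 25, 30, 20, 15, 25,
                              20, 15, 35, 35, 25, 15, 25, 35, 30, 35, 30, 15),
  check.names = FALSE
)

set.seed(1)  # make the random jitter reproducible

p <- ggplot(dat, aes(x = factor(`Cotton weight percent`),
                     y = `Tensile Strength`)) +
  # individual observations, spread slightly along x so ties stay visible
  geom_jitter(width = 0.1, height = 0) +
  # one point per category at the mean of y (large and red)
  stat_summary(fun = mean, geom = "point", colour = "red", size = 4) +
  labs(x = "Cotton weight percent", y = "Tensile Strength")

print(p)
```

To get open circles for the averages, as in the plot from the question, adding shape = 1 to the stat_summary call should work.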