I have run a model:
mymodel <- glm(averagetime ~ group, family = Gamma, data = mydata, weights = myweights)
I used the ggeffects package to create an output data frame that I can graph:
library(ggeffects)
library(ggplot2)
Out_mymodel <- ggpredict(mymodel, terms = c("group"), ci.lvl = 0.95)
Structure of the resulting object Out_mymodel (dput output):
structure(list(x = structure(1:2, levels = c("Group2", "Group1"
), class = "factor"), predicted = c(705.927485380117, 588.924355777224
), std.error = c(0.000149546858820898, 0.000151122146556574),
conf.low = c(908.215882881075, 725.073206975915), conf.high = c(577.336409873142,
495.822496946592), group = structure(c(1L, 1L), levels = "1", class = "factor")), row.names = c(NA,
-2L), class = c("ggeffects", "data.frame"), legend.labels = "1", x.is.factor = "1", continuous.group = FALSE, rawdata = structure(list(
response = c(1015L, 494L, 648L, 666L, 550L, 414L, 705L, 760L,
737L, 674L, 587L, 855L, 597L, 317L, 875L, 498L, 561L, 318L,
629L), x = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2), group = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), class = "factor", levels = "1"),
facet = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), class = "factor", levels = "1")), class = "data.frame", row.names = c(NA,
-19L)), title = "Predicted values of averagetime", x.title = "group", y.title = "averagetime", legend.title = NA_character_, x.axis.labels = c("Group2",
"Group1"), constant.values = structure(list(), names = character(0)), terms = "group", original.terms = "group", at.list = list(
group = c("Group2", "Group1")), ci.lvl = 0.95, type = "fe", response.name = "averagetime", back.transform = TRUE, response.transform = "averagetime", untransformed.predictions = c(705.927485380117,
588.924355777224), family = "Gamma", link = "inverse", logistic = "0", link_inverse = function (eta)
1/eta, link_function = function (mu)
1/mu, is.trial = "0", fitfun = "glm", model.name = "mymodel")
I wanted to obtain the raw data values from the model so that I could include them on my graph in addition to the marginal means and error bars. So, I located this suggestion:
raw <- attr(Out_mymodel, "rawdata")
Structure of the resulting data frame raw (dput output):
structure(list(response = c(1015L, 494L, 648L, 666L, 550L, 414L,
705L, 760L, 737L, 674L, 587L, 855L, 597L, 317L, 875L, 498L, 561L,
318L, 629L), x = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2), group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), class = "factor", levels = "1"),
facet = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), class = "factor", levels = "1")), class = "data.frame", row.names = c(NA,
-19L))
I then created the graph:
ggplot(Out_mymodel, aes(x, predicted)) +
  geom_jitter(data = raw, aes(x = factor(x), y = response), width = 0.05, height = 0, size = 2, alpha = 0.4) +
  geom_point(size = 3) +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), width = 0, size = 0.75) +
  theme(axis.text.x = element_blank())
The problem is that, once x is converted to a factor in the geom_jitter call, the jittered points and the point estimates end up on opposite sides of the x-axis (jittered points on the left, point estimates with their error bars on the right). I know that leaving x numeric lets them overlap, but then I can't draw the point and error bar on top of the jittered points, which I need in order to see them clearly when the number of jittered points gets large.
How can I fix this in ggplot so that the point estimates and their error bars overlay their corresponding jittered data points? Thank you for any assistance.
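A minimal sketch of one possible cause and workaround, offered as an assumption rather than a confirmed diagnosis: in the structures above, Out_mymodel$x is a factor with levels "Group2" and "Group1", while raw$x is numeric (1 and 2), so factor(x) yields the levels "1" and "2". If that mismatch is the cause, the discrete axis ends up with four positions instead of two. The sketch rebuilds both data frames from the values shown above (rounded), under the hypothetical name est for the estimates, and relabels raw$x with the model's factor levels. It assumes that code 1 corresponds to the first level ("Group2"), which is consistent with the sample data, and that ggplot2 is version 3.4.0 or later (for linewidth).

```r
library(ggplot2)

# Estimates copied (rounded) from Out_mymodel; 'est' is a hypothetical name
est <- data.frame(
  x         = factor(c("Group2", "Group1"), levels = c("Group2", "Group1")),
  predicted = c(705.93, 588.92),
  conf.low  = c(908.22, 725.07),
  conf.high = c(577.34, 495.82)
)

# Raw data as returned by attr(Out_mymodel, "rawdata"): x is numeric (1 or 2)
raw <- data.frame(
  response = c(1015, 494, 648, 666, 550, 414, 705, 760, 737,
               674, 587, 855, 597, 317, 875, 498, 561, 318, 629),
  x        = rep(c(1, 2), times = c(9, 10))
)

# Relabel the numeric codes with the same levels as est$x.
# Assumption: code 1 is the first level, code 2 the second.
raw$x <- factor(raw$x,
                levels = seq_along(levels(est$x)),
                labels = levels(est$x))

# Layers are drawn in order, so the jittered points go first and the
# estimates and error bars are drawn on top of them.
p <- ggplot(est, aes(x, predicted)) +
  geom_jitter(data = raw, aes(x = x, y = response),
              width = 0.05, height = 0, size = 2, alpha = 0.4) +
  geom_point(size = 3) +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high),
                width = 0, linewidth = 0.75)  # use size = 0.75 on ggplot2 < 3.4.0

print(p)
```

With the real objects, the same relabelling would be applied to raw before plotting, and aes(x = x, ...) would replace aes(x = factor(x), ...) in the geom_jitter call.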
This is a sample of the data in mydata (dput output):
structure(list(id = c("A92", "A1", "A61", "A107", "A119", "A56",
"A73", "A16", "A87", "A93", "A31", "A66", "A83", "A53", "A74",
"A101", "A120", "A132", "A42"), group = c("Group2", "Group2",
"Group2", "Group2", "Group2", "Group2", "Group2", "Group2", "Group2",
"Group1", "Group1", "Group1", "Group1", "Group1", "Group1", "Group1",
"Group1", "Group1", "Group1"), averagetime = c(1015L, 494L, 648L,
666L, 550L, 414L, 705L, 760L, 737L, 674L, 587L, 855L, 597L, 317L,
875L, 498L, 561L, 318L, 629L), myweights = c(151L, 50L, 168L,
132L, 51L, 66L, 61L, 32L, 144L, 150L, 171L, 161L, 144L, 131L,
80L, 54L, 146L, 133L, 33L)), row.names = c(NA, -19L), class = c("data.table",
"data.frame"))