I'd like to illustrate ICC. I'd love to create a plot where one can see, that with a higher ICC more of the variability in a dependent variable can be attributed to the groups.
I've stumble across this depiction (https://bookdown.org/marklhc/notes_bookdown/hierarchical-multilevel-models.html#multilevel-modeling-mlm):
And I think it does a great job explaining this fact.
Now I've tried to recreate the spirit of this plot with the following code:
ni <- 30
#numbers of student per class
nj <- 15
#numbers of clusters/classes
ICC <- seq(.1,.9,.1)
#starting with 0, then going up .1 stepwise till .9, thus generating 10 values
# what is the random intercept standard deviation given icc
RI_sd <- sqrt(1 / (1-ICC) - 1)
# RI_sd
# -------------------------------------------------------------------------
dgp_grid <- expand.grid(
ni = 1:ni,
nj = 1:nj,
x = NA,
y = NA, # with two predictor variables
RI_sd = 0
)
dgp_grid$U0j <- rep(rnorm(nj, mean = 0, sd = RI_sd), each = ni)
#create level 2 residual (ICC)
dgp_grid$x <- rnorm(ni*nj,mean = 0, sd = 1)
# create level 1 x explanatory/predictor variable (draw from standard normal)
dgp_grid$y <-
dgp_grid$x +
dgp_grid$U0j
# -------------------------------------------------------------------------
library(ggplot2)
# Define the desired range for the y-axis
x_axis_limit <- c(-6, 6)
# Plot the data with fixed y-axis range
ggplot(dgp_grid, aes(x = y, y = nj)) +
geom_jitter(aes(color = factor(nj)), size = 3, width = 0.2) +
geom_boxplot(aes(group = nj), alpha = 0.5, color = "black", width = 0.2) +
theme_minimal() +
coord_flip() +
xlim(x_axis_limit)
I basically create some clusters nj (one could think of them as classes in, e.g., a school) and then try to weave in a residual term for the ICC. No If I want to have no ICC I would set RI_sd = 0, and if I want a very big ICC I would set RI_sd = 3.
This is what the plot looks like with RI_sd = 0:

And this is waht it looks like with RI_sd = 3:

I have two questions.
- Is my implementation of ICC correct? and
- can I make the effect of the ICC more profound so that is more like to the first picture?

I think it would be best to write a function that takes ICC and outputs a plot similar in style to the linked example. We can even get it to take a vector of ICC values and output a panel for each value.
This allows