I am working with survey data with 250 columns. A sample of my data looks like this:
q1 <- factor(c("yes",NA,"no","yes",NA,"yes","no","yes"))
q2 <- factor(c("Albania","USA","Albania","Albania","UK",NA,"UK","Albania"))
q3 <- factor(c(0,1,NA,0,1,1,NA,0))
q4 <- factor(c(0,NA,NA,NA,1,NA,0,0))
q5 <- factor(c("Dont know","Prefer not to answer","Agree","Disagree",NA,"Agree","Agree",NA))
q6 <- factor(c(1,NA,3,5,800,NA,900,2))
sector <- factor(c("Energy","Water","Energy","Other","Other","Water","Transportation","Energy"))
weights <- factor(c(0.13,0.25,0.13,0.22,0.22,0.25,0.4,0.13))
data <- data.frame(q1,q2,q3,q4,q5,q6,sector,weights)
With help from Stack Overflow, I have created the following function to loop through the columns and create bar charts where the x axis shows the percentage of responses, the y axis shows the underlying column, and the fill is the sector.
library(dplyr)
library(ggplot2)

plot_fun <- function(variable) {
  # Number of non-missing responses, shown in the caption
  total <- sum(!is.na(data[[variable]]))

  # Share of each response within each sector
  data <- data |>
    filter(!is.na(.data[[variable]])) |>
    group_by(across(all_of(c("sector", variable)))) |>
    summarise(n = n(), .groups = "drop_last") |>
    mutate(pct = n / sum(n)) |>
    ungroup()

  ggplot(
    data = data,
    mapping = aes(fill = sector, x = pct, y = .data[[variable]])
  ) +
    geom_col(position = "dodge") +
    labs(
      y = variable, x = "Percentage of responses", fill = "Sector legend",
      caption = paste("Total =", total)
    ) +
    geom_text(
      aes(
        label = scales::percent(pct, accuracy = 0.1)
      ),
      position = position_dodge(.9), vjust = 0.5
    ) +
    scale_x_continuous(labels = function(x) paste0(x * 100)) +
    scale_fill_brewer(palette = "Accent") +
    theme_bw() +
    theme(panel.grid.major.y = element_blank())
}
Now I want to apply survey weights so that the bar charts show weighted response percentages. I tried adding weight = data$weights to the aes() mapping, but it didn't work. I also tried applying the weights when calculating the percentages with summarise(n = sum(weights)), but that didn't work either.
Is there a way to modify my code so that the weights are applied? Thank you in advance.
It's still not clear how you want to apply the weights. I've assumed here that you want to multiply the percentage by the weight. Note that you need to fix your data first: weights should not be a factor if you want to use it as a numeric value in calculations. I've included weights in the group_by() so that it carries through the summary, and then used it in mutate() to create a weighted percentage.
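As a minimal sketch of one possible reading of that approach (not necessarily the exact code intended), the snippet below first converts weights from a factor to numeric and then multiplies the within-sector percentage by the weight. It uses only q1 from the sample data, and it assumes, as in the sample, that every row in a sector has the same weight. The names weighted and weighted_pct are made up for illustration.

```r
library(dplyr)

# Sample columns from the question; weights is deliberately built as a factor
q1 <- factor(c("yes", NA, "no", "yes", NA, "yes", "no", "yes"))
sector <- factor(c("Energy", "Water", "Energy", "Other", "Other", "Water",
                   "Transportation", "Energy"))
weights <- factor(c(0.13, 0.25, 0.13, 0.22, 0.22, 0.25, 0.4, 0.13))
data <- data.frame(q1, sector, weights)

# as.numeric() on a factor returns the level codes (1, 2, 3, ...),
# so convert through as.character() to recover the actual values
data$weights <- as.numeric(as.character(data$weights))

variable <- "q1"

weighted <- data |>
  filter(!is.na(.data[[variable]])) |>
  # keep weights in the grouping so the column survives summarise()
  group_by(across(all_of(c("sector", "weights", variable)))) |>
  summarise(n = n(), .groups = "drop") |>
  # percentages are computed within each sector, as in plot_fun()
  group_by(sector) |>
  mutate(
    pct = n / sum(n),
    weighted_pct = pct * weights  # assumption: weight scales the percentage
  ) |>
  ungroup()

print(weighted)
```

Inside plot_fun(), the same summary could replace the existing one, with x = weighted_pct in aes() and in the geom_text() label.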
If this doesn't do the trick, please clarify how you want to use the weights and what the final values should be.
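For comparison, here is a hedged sketch of a different interpretation: treating each row's weight as its contribution to the count, so that the percentage is the sum of weights for a response divided by the sum of weights in the sector. This is the summarise(n = sum(weights)) idea from the question; it most likely failed there because sum() is not defined for factors. The sketch assumes weights is stored as numeric and again uses only q1.

```r
library(dplyr)

# Sample columns from the question, with weights stored as numeric
q1 <- factor(c("yes", NA, "no", "yes", NA, "yes", "no", "yes"))
sector <- factor(c("Energy", "Water", "Energy", "Other", "Other", "Water",
                   "Transportation", "Energy"))
weights <- c(0.13, 0.25, 0.13, 0.22, 0.22, 0.25, 0.4, 0.13)
data <- data.frame(q1, sector, weights)

variable <- "q1"

weighted <- data |>
  filter(!is.na(.data[[variable]])) |>
  group_by(across(all_of(c("sector", variable)))) |>
  # weighted count: each row contributes its weight instead of 1
  summarise(n = sum(weights), .groups = "drop_last") |>
  # weighted share of each response within its sector
  mutate(pct = n / sum(n)) |>
  ungroup()

print(weighted)
```

In the sample data every row in a sector carries the same weight, so these within-sector percentages come out identical to the unweighted ones. The weights would only change the picture if they varied within a sector or if the percentages were computed across sectors.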