Title: Issue with Stacking Bar Chart in Highcharter: Incorrect Series Mapping
Body: I'm working on a stacked bar chart using the highcharter package in R and encountering an issue where the series data is not correctly mapping to the specified categories on the x-axis. My dataset includes capacity, operator, and count columns, and I'm trying to display the percentage of counts per operator within each capacity.
Problem: Each bar in the chart should represent a capacity, and within each bar, there should be segments (stacks) for each operator, representing their percentage share. However, the chart incorrectly maps all operators to a single capacity or overlaps them, instead of distributing them across the different capacities.
Data Structure: The data is structured as follows (example):
capacity,operator,count
3,Operator1,39
3,Operator2,6916
...
7,Operator1,2729
7,Operator2,23504
...
Code: Here's the code snippet I'm using:
library(highcharter)
library(dplyr)
# Data loading and preparation
data <- read.csv("path_to_data.csv", fileEncoding = "UTF-8")
total_counts_by_capacity <- data %>%
  group_by(capacity) %>%
  summarise(TotalCount = sum(count))
data_with_total <- merge(data, total_counts_by_capacity, by = "capacity")
data_with_percentage <- data %>%
  left_join(total_counts_by_capacity, by = "capacity") %>%
  mutate(Percentage = round((count / TotalCount) * 100, 2))
hc <- highchart() %>%
  hc_chart(type = "bar") %>%
  hc_title(text = "Capacity by Operator Percentage") %>%
  hc_xAxis(categories = unique(data_with_percentage$capacity)) %>%
  hc_yAxis(title = list(text = "Percentage")) %>%
  hc_tooltip(shared = TRUE, pointFormat = "<span style='color:{series.color}'>{series.name}</span>: <b>{point.y}</b><br/>") %>%
  hc_plotOptions(series = list(stacking = "normal"))
unique_operators <- unique(data_with_percentage$operator)
for (op in unique_operators) {
  op_data <- filter(data_with_percentage, operator == op)
  hc <- hc %>% hc_add_series(name = op, data = op_data$Percentage)
}
hc
Issue: While the code runs without errors, the resulting chart does not correctly display the data as intended. All operators are mapped to a single capacity bar, instead of being distributed according to their respective capacities.
I'm not sure if the issue is with the way I'm filtering the data or adding the series to the chart. Any insights or suggestions on how to correctly map the series data to the respective capacities would be greatly appreciated.
As shown in the image below, the percentage should not exceed 100%. However, there is an issue where operators are being incorrectly linked to capacities they should not be associated with, resulting in the total percentage exceeding 100%. Additionally, it would be beneficial if the percentage could be displayed on the bars themselves.
Unfortunately, the issue can't be reproduced with your example data; running your code on it gives your desired outcome. Also, at first glance I don't see any issues with your code.
Hence, my guess is that the issue is your data. The first possible issue is that your data is not in the right order: the categories you map on the x axis in xAxis are linked to the percentage values you add as data in hc_add_series only by their order, so you have to ensure that the data is properly ordered, otherwise values are assigned to the wrong categories. Second, the same issue arises if your data isn't complete, i.e. if you don't have observations on all capacities for each operator. To fix that, you have to complete your data using e.g. tidyr::complete.
First, I use a more general example dataset where the data is not properly ordered and to which I added a third operator:
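The dataset used in the answer did not survive in this copy, so the following is only a hypothetical stand-in that matches the description: the rows are out of order and a third operator appears for a single capacity. The count for Operator3 and the helper column plotted_at are assumptions made for illustration. The sketch mimics how the loop in the question hands unnamed values to the chart by position, and lists the rows that would land on the wrong category.

```r
library(dplyr)

# Hypothetical data (not the original poster's file): capacity 10 appears
# before capacity 7, and Operator3 only has a row for capacity 10.
data <- data.frame(
  capacity = c(3, 10, 3, 7, 7),
  operator = c("Operator1", "Operator3", "Operator2", "Operator1", "Operator2"),
  count    = c(39, 500, 6916, 2729, 23504)
)

# Categories in the order the question's code passes them to hc_xAxis()
categories <- unique(data$capacity)

# The n-th value of each operator's series is drawn at the n-th category,
# regardless of which capacity the value actually belongs to.
mismatch <- data %>%
  group_by(operator) %>%
  mutate(plotted_at = categories[row_number()]) %>%
  ungroup() %>%
  filter(plotted_at != capacity)

mismatch
```

Any row returned in mismatch is a value that would be drawn in the wrong bar, which is a quick way to check your own data before plotting.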
Running your code on this dataset reproduces some of the issues in the image in your post: for the newly added Operator3 the value is assigned to capacity level 3 instead of 10, and for Operator1 the value for capacity level 7 is now falsely assigned to the new capacity level 10. In the latter case the issue is that level 10 is now the second category and hence gets assigned the second value for Operator1.
To fix both issues I first complete the dataset using tidyr::complete and finally arrange the dataset by capacity and operator. Also note that I slightly refactored and simplified your data wrangling code, but that has no effect on the issue.
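The code that followed here is missing from this copy, so below is a minimal sketch of the described fix rather than the answer's exact code. It assumes a hypothetical dataset (the Operator3 count is made up), fills missing capacity/operator combinations with a count of 0 via tidyr::complete, computes the percentage per capacity, and sorts by capacity and operator before adding the series. The dataLabels option is an addition aimed at the request to show percentages on the bars.

```r
library(dplyr)
library(tidyr)
library(highcharter)

# Hypothetical, deliberately unordered and incomplete data
data <- data.frame(
  capacity = c(3, 10, 3, 7, 7),
  operator = c("Operator1", "Operator3", "Operator2", "Operator1", "Operator2"),
  count    = c(39, 500, 6916, 2729, 23504)
)

data_with_percentage <- data %>%
  # Add every missing capacity/operator combination with a count of 0
  complete(capacity, operator, fill = list(count = 0)) %>%
  group_by(capacity) %>%
  mutate(Percentage = round(count / sum(count) * 100, 2)) %>%
  ungroup() %>%
  # Each operator's values now follow the same capacity order as the categories
  arrange(capacity, operator)

hc <- highchart() %>%
  hc_chart(type = "bar") %>%
  hc_title(text = "Capacity by Operator Percentage") %>%
  hc_xAxis(categories = unique(data_with_percentage$capacity)) %>%
  hc_yAxis(title = list(text = "Percentage")) %>%
  hc_plotOptions(series = list(
    stacking = "normal",
    # Print the percentage on each bar segment
    dataLabels = list(enabled = TRUE, format = "{point.y}%")
  ))

for (op in unique(data_with_percentage$operator)) {
  op_data <- filter(data_with_percentage, operator == op)
  hc <- hc %>% hc_add_series(name = op, data = op_data$Percentage)
}

hc
```

Because every operator now has exactly one value per capacity, in the same order as the categories, positional matching is safe and each bar should sum to roughly 100% (up to rounding).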