Issue with Stacking Bar Chart in Highcharter: Incorrect Series Mapping


Body: I'm working on a stacked bar chart using the highcharter package in R and encountering an issue where the series data is not correctly mapping to the specified categories on the x-axis. My dataset includes capacity, operator, and count columns, and I'm trying to display the percentage of counts per operator within each capacity.

Problem: Each bar in the chart should represent a capacity, and within each bar, there should be segments (stacks) for each operator, representing their percentage share. However, the chart incorrectly maps all operators to a single capacity or overlaps them, instead of distributing them across the different capacities.

Data Structure: The data is structured as follows (example):

capacity,operator,count
3,Operator1,39
3,Operator2,6916
...
7,Operator1,2729
7,Operator2,23504
...

Code: Here's the code snippet I'm using:

library(highcharter)
library(dplyr)

# Data loading and preparation
data <- read.csv("path_to_data.csv", fileEncoding = "UTF-8")

total_counts_by_capacity <- data %>%
  group_by(capacity) %>%
  summarise(TotalCount = sum(count))

data_with_total <- merge(data, total_counts_by_capacity, by = "capacity")

data_with_percentage <- data %>%
  left_join(total_counts_by_capacity, by = "capacity") %>%
  mutate(Percentage = round((count / TotalCount) * 100, 2))

hc <- highchart() %>%
  hc_chart(type = "bar") %>%
  hc_title(text = "Capacity by Operator Percentage") %>%
  hc_xAxis(categories = unique(data_with_percentage$capacity)) %>%
  hc_yAxis(title = list(text = "Percentage")) %>%
  hc_tooltip(shared = TRUE, pointFormat = "<span style='color:{series.color}'>{series.name}</span>: <b>{point.y}</b><br/>") %>%
  hc_plotOptions(series = list(stacking = "normal"))

unique_operators <- unique(data_with_percentage$operator)
for (op in unique_operators) {
  op_data <- filter(data_with_percentage, operator == op)
  hc <- hc %>% hc_add_series(name = op, data = op_data$Percentage)
}
hc


Issue: While the code runs without errors, the resulting chart does not correctly display the data as intended. All operators are mapped to a single capacity bar, instead of being distributed according to their respective capacities.

I'm not sure if the issue is with the way I'm filtering the data or adding the series to the chart. Any insights or suggestions on how to correctly map the series data to the respective capacities would be greatly appreciated.

As shown in the image below, the percentage should not exceed 100%. However, there is an issue where operators are being incorrectly linked to capacities they should not be associated with, resulting in the total percentage exceeding 100%. Additionally, it would be beneficial if the percentage could be displayed on the bars themselves.

[Image: screenshot of the resulting chart]

There is 1 solution below

Unfortunately, the issue can't be reproduced with your example data; running your code on it gives your desired outcome. Also, at first glance I don't see any issues with your code.

Hence, my guess is that the issue is your data. The first possible issue is that your data is not in the right order. The values you pass as data in hc_add_series are matched to the categories set in hc_xAxis purely by position, so if the order of the percentage values differs from the order of the categories, values are assigned to the wrong categories. Second, the same problem arises if your data isn't complete, i.e. if you don't have an observation for every capacity for each operator. To fix that, you have to complete your data, e.g. using tidyr::complete.
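A quick way to check for both problems before plotting is to cross-tabulate operators against capacities and to compare the row order with a sorted order. The following is a minimal base R sketch on a made-up data frame (the name `toy` and its values are illustrative assumptions, not your data):

```r
# Hypothetical toy data in the same shape as the question's data
toy <- data.frame(
  capacity = c(3, 3, 7, 10),
  operator = c("A", "B", "B", "A"),
  count    = c(10, 20, 30, 40)
)

# One cell per operator/capacity combination; a 0 marks a missing combination
combos <- table(toy$operator, toy$capacity)
combos

# TRUE only if every operator has exactly one row for every capacity
is_complete <- all(combos == 1)

# TRUE only if the rows are already sorted by operator, then capacity
is_sorted <- identical(order(toy$operator, toy$capacity), seq_len(nrow(toy)))

is_complete
is_sorted
```

If either check comes back FALSE, series data passed by position may end up under the wrong category.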

First, I use a more general example dataset in which the rows are not properly ordered and to which I added a third operator:

data <- data.frame(
  capacity = c(3L, 3L, 10L, 7L, 7L, 10L),
  operator = c(
    "Operator1", "Operator2", "Operator2",
    "Operator1", "Operator2", "Operator3"
  ),
  count = c(39L, 6916L, 23504L, 2729L, 23504L, 23504L),
  stringsAsFactors = FALSE
)
data
#>   capacity  operator count
#> 1        3 Operator1    39
#> 2        3 Operator2  6916
#> 3       10 Operator2 23504
#> 4        7 Operator1  2729
#> 5        7 Operator2 23504
#> 6       10 Operator3 23504

Running your code on this dataset reproduces some of the issues in the image in your post. For the newly added Operator3, the value is assigned to capacity level 3 instead of 10, and for Operator1 the value for capacity level 7 is now falsely assigned to the new capacity level 10. In the latter case the issue is that level 10 is now the second category and hence gets assigned the second value for Operator1.
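The positional mismatch can be made visible without drawing the chart. The sketch below recreates the example data and, for each operator, pairs the i-th row with the i-th category, which mimics how an unnamed vector of values is matched to the x-axis categories. The helper names (`categories`, `mapping`, `mismatched`) are only illustrative:

```r
# Same example data as above, recreated so the snippet runs on its own
data <- data.frame(
  capacity = c(3L, 3L, 10L, 7L, 7L, 10L),
  operator = c("Operator1", "Operator2", "Operator2",
               "Operator1", "Operator2", "Operator3"),
  count = c(39L, 6916L, 23504L, 2729L, 23504L, 23504L),
  stringsAsFactors = FALSE
)

# Categories in order of first appearance, as unique() returns them
categories <- unique(data$capacity)

# For each operator, the i-th value is drawn at the i-th category
mapping <- do.call(rbind, lapply(split(data, data$operator), function(d) {
  data.frame(
    operator = d$operator,
    actual_capacity = d$capacity,
    plotted_capacity = categories[seq_len(nrow(d))]
  )
}))

# Rows whose value would be drawn under the wrong capacity
mismatched <- mapping[mapping$actual_capacity != mapping$plotted_capacity, ]
mismatched
```

The rows left in `mismatched` are exactly the two cases described above: one for Operator1 and one for Operator3.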

To fix both issues, I first complete the dataset using tidyr::complete and finally arrange it by operator and capacity. Also note that I slightly refactored and simplified your data wrangling code, but that has no effect on the issue.


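data_with_percentage <- data %>%
  # Add the missing capacity/operator combinations with a count of 0
  tidyr::complete(capacity, operator, fill = list(count = 0)) %>%
  group_by(capacity) %>%
  mutate(
    TotalCount = sum(count),
    Percentage = round(count / TotalCount * 100, 2)
  ) %>%
  ungroup() %>%
  mutate(capacity = factor(capacity)) %>%
  # Sort so each operator's values follow the category order
  arrange(operator, capacity)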

hc <- highchart() %>%
  hc_chart(type = "bar") %>%
  hc_title(text = "Capacity by Operator Percentage") %>%
  hc_xAxis(categories = unique(data_with_percentage$capacity)) %>%
  hc_yAxis(title = list(text = "Percentage")) %>%
  hc_tooltip(shared = TRUE, pointFormat = "<span style='color:{series.color}'>{series.name}</span>: <b>{point.y}</b><br/>") %>%
  hc_plotOptions(series = list(stacking = "normal"))

unique_operators <- unique(data_with_percentage$operator)
for (op in unique_operators) {
  op_data <- filter(data_with_percentage, operator == op)
  hc <- hc %>%
    hc_add_series(name = op, data = op_data$Percentage)
}
hc
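As for showing the percentages on the bars themselves, one option is to enable Highcharts data labels via hc_plotOptions. The following is a minimal sketch rather than a tested drop-in for your data: the categories and percentage values are made up, and the object name `hc_labeled` is only illustrative.

```r
library(highcharter)

# Hypothetical percentages, already complete and ordered by capacity
hc_labeled <- highchart() %>%
  hc_chart(type = "bar") %>%
  hc_xAxis(categories = c("3", "7")) %>%
  hc_yAxis(title = list(text = "Percentage")) %>%
  hc_plotOptions(
    series = list(
      stacking = "normal",
      # Print each segment's value followed by a percent sign
      dataLabels = list(enabled = TRUE, format = "{point.y}%")
    )
  ) %>%
  hc_add_series(name = "Operator1", data = c(0.56, 10.4)) %>%
  hc_add_series(name = "Operator2", data = c(99.44, 89.6))

hc_labeled
```

In the chart above, the same effect should be achievable by adding the dataLabels list to the existing hc_plotOptions call.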