Package for category overlines on scatterplot in ggplot


I have data organised as in this example:

library(tibble)

data1 <- tibble(
  seq = factor(1:20),
  value = rnorm(20, 10, 2),
  par_a = c(rep("S1", 6), rep("S2", 14)),
  par_b = c(rep("B1", 18), rep("B2", 2))
)

The x-axis variable (seq) and both parameters used for the category overlines (par_a, par_b) are categorical. In the real data, seq will have about 50 unique values and each parameter will have 2, 3 or 4 possible values. The y-axis variable (value) is continuous.

I'm looking for a package that will allow me to make a plot like this: [desired plot]

I saw it done with ggplot once, so I assume there is a package that can do it. Unfortunately, I haven't been able to find out which one.

Radek

stefan (best answer)

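There may be a package for this, but with some data wrangling you can get the desired result with ggplot2 alone: work out where each category starts and ends, then draw the overlines with geom_line() and their labels with geom_label(), like so: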

library(tidyverse)

set.seed(123)

data1 <- tibble(
  seq = factor(1:20),
  value = rnorm(20, 10, 2),
  par_a = c(rep("S1", 6), rep("S2", 14)),
  par_b = c(rep("B1", 18), rep("B2", 2))
)

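dat_segment <- data1 |>
  select(-value) |>
  # one row per seq and parameter: columns seq, name (par_a / par_b), category
  pivot_longer(-seq, values_to = "category") |>
  group_by(name, category) |>
  # keep only the first and last seq of each category
  filter(seq %in% c(first(seq), last(seq))) |>
  mutate(
    seq = as.numeric(seq),
    # extend each line by 0.4 past its end points (seq = 1 is left unshifted)
    seq = case_when(
      seq > 1 & seq == last(seq) ~ seq + .4,
      seq > 1 & seq == first(seq) ~ seq - .4,
      .default = seq
    )
  ) |>
  ungroup() |>
  # y position of each overline, hard-coded for this sample data
  mutate(value = if_else(name == "par_a", 20, 18))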

# label position: horizontal midpoint of each overline, at the overline's y value
dat_label <- dat_segment |>
  summarise(
    seq = mean(seq), value = unique(value),
    .by = c(name, category)
  )

library(ggplot2)

ggplot(data1, aes(seq, value)) +
  geom_point() +
  geom_label(
    data = dat_label,
    aes(label = category, color = name),
    vjust = 0,
    fill = NA,
    label.size = 0,
    show.legend = FALSE
  ) +
  geom_line(
    data = dat_segment,
    aes(
      color = name, group = interaction(category, name)
    ),
    linewidth = 1
  )
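
The code above takes the first and last seq per category, so it assumes each category forms a single unbroken run, and it hard-codes the overline heights (20 and 18). With about 50 seq values a category might come back later in the sequence. The following is only a sketch of one way to handle that, not a tested general solution: it uses dplyr's consecutive_id() (dplyr 1.1.0 or later, which the .by and .default arguments above require anyway) to give every run its own line, and derives the heights from the data. The names data2, run_segments, y_top and y_step are made up for this example, and the repeated "S1" run is an assumption about what the real data might look like. Unlike the code above, it extends the line by 0.4 at seq = 1 as well.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(123)

# Hypothetical sample in which "S1" occurs in two separate runs
data2 <- tibble(
  seq = factor(1:20),
  value = rnorm(20, 10, 2),
  par_a = c(rep("S1", 6), rep("S2", 8), rep("S1", 6)),
  par_b = c(rep("B1", 18), rep("B2", 2))
)

# Put the overlines above the highest point instead of hard-coding 20 and 18
y_top <- max(data2$value)
y_step <- 0.1 * diff(range(data2$value))

run_segments <- data2 |>
  select(-value) |>
  pivot_longer(-seq, values_to = "category") |>
  mutate(x = as.numeric(seq)) |>
  arrange(name, x) |>
  # consecutive_id() starts a new id every time the category changes
  mutate(run = consecutive_id(category), .by = name) |>
  # one row per run: its label and its left and right ends
  summarise(
    category = first(category),
    xmin = min(x) - 0.4,
    xmax = max(x) + 0.4,
    .by = c(name, run)
  ) |>
  # stack the parameters: the first one (par_a) ends up on top
  mutate(
    row = as.integer(factor(name, levels = rev(unique(name)))),
    y = y_top + row * y_step
  )

ggplot(data2, aes(seq, value)) +
  geom_point() +
  geom_segment(
    data = run_segments,
    aes(x = xmin, xend = xmax, y = y, yend = y, color = name),
    linewidth = 1
  ) +
  geom_text(
    data = run_segments,
    aes(x = (xmin + xmax) / 2, y = y, label = category, color = name),
    vjust = -0.5,
    show.legend = FALSE
  )
```

Because each run is already one row with xmin and xmax, geom_segment() can draw it directly and no separate label data frame is needed. If every category in your data really is contiguous, this should give the same kind of picture as the code above.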