I have a data on individuals with their titre levels. The titre levels are in ordered categories. Instead of using the labelled values, I have used the numeric values - this would allow me to see the trend. My logic as follows:
- If the values are monotonically increasing, i.e. either increases or remains the same, assign "uptrend"
- If the values are monotonically decreasing, i.e. either decreases or remains the same, assign "downtrend"
- If there's no difference, i.e. all values remain the same, assign "stable"
- Everything else would be considered "fluctuate"
However, I want to relax the criteria to allow for 1 "deviation" or "difference". Example (picture below), id==9
would be considered "fluctuate" in the original logic, but I want this to be "stable" since there's only a single value that's difference from the rest. Similarly for id %in% 10:12
.
This is what I have done previously, using dplyr
and tidyr
, but I am pretty much stuck.
library(tidyverse)
df %>%
# Create a duplicate column to convert into long format
mutate(titre_level_sep = titre_level) %>%
separate_rows(titre_level_sep, sep = ",") %>%
# Calculate for each id
group_by(id) %>%
mutate(
# Trends and diff in titre levels
diff = as.numeric(titre_level_sep) - lag(as.numeric(titre_level_sep)),
increasing = all(diff >= 0, na.rm = TRUE),
decreasing = all(diff <= 0, na.rm = TRUE),
stable = all(diff == 0, na.rm = TRUE)) %>%
ungroup() %>%
select(everything(), -c(diff, titre_level_sep)) %>%
distinct() %>%
mutate(
# Trends final classification
trend2 = case_when(increasing & !decreasing & !stable ~ "uptrend",
!increasing & decreasing & !stable ~ "downtrend",
increasing & decreasing & stable ~ "stable",
!increasing & !decreasing & !stable ~ "fluctuate")) %>%
select(id, trend, trend2)
Here's the data I have used:
df <- structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12),
titre_level = c("2,1,1,1,1", "4,4,4,4,4,1,1,1,1,1", "8,6,6,6,6,4",
"1,1,1,1,1,1,1,1,1,3,3", "4,4,7,10", "8,11", "1,1", "6,6,6,6,6,6,6,6,6",
"1,2,1,1,1,1,1,1,1", "8,8,6,8", "4,2,2,2,4,2,2,2,1", "1,1,2,4,4,1"
), trend = c("downtrend", "downtrend", "downtrend", "uptrend",
"uptrend", "uptrend", "stable", "stable", "stable", "stable",
"downtrend", "uptrend")), class = "data.frame", row.names = c(NA,
12L))