Min-Max Normalization by group across multiple columns


Below is an example of code I use to normalize by group using dplyr:

mtcars %>%
  group_by(carb) %>%
  mutate(norm = (hp - min(hp)) / (max(hp) - min(hp))) %>%
  ungroup()

I would like to modify this code so that the normalization is applied to several specific columns at once, perhaps using "across" from dplyr, something from "purrr", or anything else. Can anyone help?

Friede On

Edited, since you don't want base R.

dplyr

> min_max_norm = \(x) { (x - min(x)) / (max(x) - min(x)) }
> library(dplyr)
> mtcars |> 
+   group_by(carb) |>
+   mutate(across(where(is.numeric) & !c(cyl, vs, am, gear), ~ min_max_norm(.x))) |>
+   head()
# A tibble: 6 × 11
# Groups:   carb [3]
    mpg   cyl  disp    hp  drat     wt   qsec    vs    am  gear  carb
  <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
1 1         6 0     0     0.752 0      0.445      0     1     4     4
2 1         6 0     0     0.752 0.0909 0.573      0     1     4     4
3 0.297     4 0.197 0.622 0.747 0.298  0          1     1     4     1
4 0.209     6 1     1     0.219 0.849  0.516      1     0     3     1
5 0.230     8 0.877 1     0.180 0.826  0.0516     0     0     3     2
6 0         6 0.823 0.889 0     1      1          1     0     3     1

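Have a look at ?dplyr::across. Note that across() never selects grouping columns, so carb is left untouched without having to exclude it.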
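If you only want a few named columns and would rather keep the originals, something along these lines might work. This is just a sketch: min_max_safe is a hypothetical helper name, the choice of mpg, hp and wt is only for illustration, and returning 0 for a group with no spread is an arbitrary choice on my part. The guard matters here because carb 6 and carb 8 each have a single row in mtcars, so the plain formula would give 0/0 = NaN for them.

```r
library(dplyr)

# Hypothetical helper: min-max scaling that yields 0 (an arbitrary choice)
# when all values in a group are equal, instead of NaN from 0/0.
min_max_safe <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) return(x - rng[1])  # zero spread: 0 for every non-NA value
  (x - rng[1]) / (rng[2] - rng[1])
}

result <- mtcars |>
  group_by(carb) |>
  # .names keeps the original columns and adds mpg_norm, hp_norm, wt_norm
  mutate(across(c(mpg, hp, wt), min_max_safe, .names = "{.col}_norm")) |>
  ungroup()

head(result)
```

Dropping the .names argument overwrites the selected columns in place, as in the answer above.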