Wide to long format dataframe, where the wide dataframe has paired columns (AAA1 with value11, BBB1 with value22, CCC1 with value33 etc)

Question

Wide to long format dataframe, where the wide dataframe has paired columns (AAA1 with value11, BBB1 with value22, CCC1 with value33 etc)

53 Views Asked by Zach At 27 August 2023 at 15:46

I am trying to convert a wide dataframe that contains the following columns:

id, age, gender, AAA1, value11, AAA2, value12, ..., BBB1, value21, BBB2, value22, ..., CCC1, value31, CCC2, value32,...

dput

   df2.id df2.dummy_age df2.dummy_gender df2.SVO1 df2.trip11 df2.SVO2 df2.trip12 df2.SVA1 df2.trip21 df2.SVA2 df2.trip22 df2.OVA1 df2.trip31 df2.OVA2 df2.trip32
1       1             0                0        2        123        1        146        2        134        2        143        2        143        2        128
2       2             0                0        2        128        2        155        2        134        2        143        2        143        2        158
3       3             0                0        1        152        1        161        2        134        2        143        2        143        2        128
4       4             0                0        2        128        2        143        2        149        2        149        2        128        2        128
5       5             0                0        2        134        1        120        2        134        2        143        2        155        1        149
6       6             0                1        1        146        2        155        2        134        1        112        2        143        1        155
7       7             0                0        2        149        2        123        2        143        2        143        1        123        2        134
8       8             0                1        1        128        1        149        1        140        2        143        2        134        1        155
9       9             1                1        2        129        2        113        2        149        2        128        2        134        2        134
10     10             0                0        2        128        1        137        2        149        2        143        2        143        2        134

to

id, age, gender,
type = AAA, BBB, CCC, ...
stage = * where AAA*, BBB*, CCC*, ...
period = ** where value**
value = actual values under value11, value12, ...., value21, value22, ...

Example for id = 1

  id dummy_age dummy_gender type stage period value
1  1         0            0  SVO     1     11   123
2  1         0            0  SVO     2     12   146
3  1         0            0  SVA     1     21   134
4  1         0            0  SVA     2     22   143
5  1         0            0  OVA     1     31   143
6  1         0            0  OVA     2     32   128

I can't seem to find a good way to use dplyr to do this. Appreciate any help on this.

Original Q&A

There are 1 best solutions below

**AndS.** · Answer 1 · 2023-08-27T17:46:48.947000

Without any data, its hard to know what the format of the dataframe really is. With that being said, I imagen that XXX## is just a string of that name? Given your desired output, it looks like all the type info is held within the value## column names, so we can probably just drop the XXX## columns, pivot_longer and derive the type data after.

library(tidyverse)

#test data
set.seed(25)
example_data <- tibble(id = letters[1:5],
       age = 21:25,
       gender = "m",
       AAA1 = "AAA1",
       value11 = runif(5),
       AAA2 = "AAA2",
       value12 = runif(5),
       BBB1 = "BBB1",
       value21 = runif(5),
       BBB2 = "BBB2",
       value22 = runif(5),
       )

example_data
#> # A tibble: 5 x 11
#>   id      age gender AAA1  value11 AAA2  value12 BBB1  value21 BBB2  value22
#>   <chr> <int> <chr>  <chr>   <dbl> <chr>   <dbl> <chr>   <dbl> <chr>   <dbl>
#> 1 a        21 m      AAA1    0.416 AAA2   0.985  BBB1    0.328 BBB2    0.148
#> 2 b        22 m      AAA1    0.695 AAA2   0.626  BBB1    0.364 BBB2    0.540
#> 3 c        23 m      AAA1    0.149 AAA2   0.338  BBB1    0.959 BBB2    0.728
#> 4 d        24 m      AAA1    0.897 AAA2   0.0668 BBB1    0.589 BBB2    0.477
#> 5 e        25 m      AAA1    0.124 AAA2   0.282  BBB1    0.695 BBB2    0.729


#solution
example_data |>
  select(id, age, gender, contains("value")) |>
  pivot_longer(cols = contains("value"), 
               names_to = c("stage", "period"), 
               values_to = "value", 
               names_pattern = "value(.)(.)", 
               names_transform = as.numeric) |>
  mutate(type = map_chr(stage, \(x) paste(rep(LETTERS[x], 3), collapse = "" )), 
         .after = gender)
#> # A tibble: 20 x 7
#>    id      age gender type  stage period  value
#>    <chr> <int> <chr>  <chr> <dbl>  <dbl>  <dbl>
#>  1 a        21 m      AAA       1      1 0.416 
#>  2 a        21 m      AAA       1      2 0.985 
#>  3 a        21 m      BBB       2      1 0.328 
#>  4 a        21 m      BBB       2      2 0.148 
#>  5 b        22 m      AAA       1      1 0.695 
#>  6 b        22 m      AAA       1      2 0.626 
#>  7 b        22 m      BBB       2      1 0.364 
#>  8 b        22 m      BBB       2      2 0.540 
#>  9 c        23 m      AAA       1      1 0.149 
#> 10 c        23 m      AAA       1      2 0.338 
#> 11 c        23 m      BBB       2      1 0.959 
#> 12 c        23 m      BBB       2      2 0.728 
#> 13 d        24 m      AAA       1      1 0.897 
#> 14 d        24 m      AAA       1      2 0.0668
#> 15 d        24 m      BBB       2      1 0.589 
#> 16 d        24 m      BBB       2      2 0.477 
#> 17 e        25 m      AAA       1      1 0.124 
#> 18 e        25 m      AAA       1      2 0.282 
#> 19 e        25 m      BBB       2      1 0.695 
#> 20 e        25 m      BBB       2      2 0.729

Wide to long format dataframe, where the wide dataframe has paired columns (AAA1 with value11, BBB1 with value22, CCC1 with value33 etc)

There are 1 best solutions below

Related Questions in R

Related Questions in DPLYR

Related Questions in RESHAPE2

Trending Questions

Popular # Hahtags

Popular Questions