Let's say I have a data frame with 3 ID columns and one column of interest. Each row represents one observation. Some IDs have multiple observations, i.e., multiple rows.

df <- data.frame(id1 = c(  1,   2,   3,   4,   4), 
                 id2 = c( 11,  12,  13,  14,  14), 
                 id3 = c(111, 112, 113, 114, 114), 
                 variable_of_interest = c(13, 24, 35, 31, 12))

  id1 id2 id3 variable_of_interest
1   1  11 111                   13
2   2  12 112                   24
3   3  13 113                   35
4   4  14 114                   31
5   4  14 114                   12

My goal is to restructure it in order to have one row per ID, keep the 3 ID columns, and name the new columns "variable_of_interest1", "variable_of_interest2":

  id1 id2 id3 variable_of_interest1 variable_of_interest2
1   1  11 111                    13                    NA
2   2  12 112                    24                    NA
3   3  13 113                    35                    NA
4   4  14 114                    31                    12

The solution might need reshape2 and the dcast function, but so far I have not been able to work it out.

Best answer:

We can create a sequence index grouped by the ID columns and then reshape to wide format with pivot_wider:

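library(dplyr)
library(stringr)
library(tidyr)
library(data.table)  # provides rowid()

df %>% 
  # number the observations within each id1/id2/id3 combination
  mutate(ind = str_c('variable_of_interest', rowid(id1, id2, id3))) %>% 
  # spread the numbered observations into one column each
  pivot_wider(names_from = ind, values_from = variable_of_interest)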

-output

# A tibble: 4 x 5
#    id1   id2   id3 variable_of_interest1 variable_of_interest2
#  <dbl> <dbl> <dbl>                 <dbl>                 <dbl>
#1     1    11   111                    13                    NA
#2     2    12   112                    24                    NA
#3     3    13   113                    35                    NA
#4     4    14   114                    31                    12
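
To make the "sequence grouped by the ID columns" step more concrete, here is a minimal sketch that builds the same per-group counter in base R with ave(), so it runs without loading any packages. This is only an illustration of the intermediate values, not part of the answer's code; the names idx and ind are chosen here for the example.

```r
# Example data from the question
df <- data.frame(id1 = c(  1,   2,   3,   4,   4),
                 id2 = c( 11,  12,  13,  14,  14),
                 id3 = c(111, 112, 113, 114, 114),
                 variable_of_interest = c(13, 24, 35, 31, 12))

# Count the rows within each id1/id2/id3 combination:
# the first observation of a group gets 1, the second gets 2, and so on
idx <- ave(seq_len(nrow(df)), df$id1, df$id2, df$id3, FUN = seq_along)

# Turn the counter into the column names used for the wide format
ind <- paste0('variable_of_interest', idx)

idx
# [1] 1 1 1 1 2
```

Only the last row is a second observation for its ID combination, so it is the only one that ends up in the variable_of_interest2 column after reshaping.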

Another option is dcast from data.table:

library(data.table)

# setDT() converts df to a data.table by reference
dcast(setDT(df),
      id1 + id2 + id3 ~ paste0('variable_of_interest', rowid(id1, id2, id3)),
      value.var = 'variable_of_interest')

-output

#    id1 id2 id3 variable_of_interest1 variable_of_interest2
#1:   1  11 111                    13                    NA
#2:   2  12 112                    24                    NA
#3:   3  13 113                    35                    NA
#4:   4  14 114                    31                    12