Reshape dataframe without “timevar” and multiple value columns from long to wide format

245 Views Asked by At

I stumble over the problem that I want to convert two columns into one line, and I do this with a key. I have a table which consists of the keys, activities and the corresponding intervals of the activity.

set.seed(2)
(data <- data.frame(key=rep(LETTERS, each=4)[1:8], 
                   acitity=c("watering", "remove weeds", "cut", "remove leaf", "watering", "remove weeds", "cut", "fertilize"), 
                   intervall= sample(1:8)))
#  key      acitity intervall
#1   A     watering         2
#2   A remove weeds         3
#3   A          cut         1
#4   A  remove leaf         6
#5   B     watering         4
#6   B remove weeds         7
#7   B          cut         8
#8   B    fertilize         5

My goal is to get a row for each key, where the activities and intervals are written one after the other.

Output:

key activity    intervall   acticity_1    intervall_1   acticity_2  intervall_2  acticity_3   intervall_3
A   watering    5           remove weeds  7             cut         6            remove leaf  1
B   watering    8           remove weeds  4             cut         2            fertilize    3

I have tried variants with spread() and transpose(). But since my skills are not that far advanced, I didn't really get anywhere. With spread and transpose, I didn't get any further.

Thank you very much for your help!!!

3

There are 3 best solutions below

0
On BEST ANSWER

Third option using dcast from data.table. We create the missing 'time variable' with rowid(key):

library(data.table)
# convert data to a data.table object
setDT(data)
# reshape
dcast(data, key  ~ rowid(key), value.var = c("acitity", "intervall"))

Result

#    key acitity_1    acitity_2 acitity_3   acitity_4 intervall_1 intervall_2 intervall_3 intervall_4
#1:   A  watering remove weeds       cut remove leaf           5           7           6           1
#2:   B  watering remove weeds       cut   fertilize           8           4           2           3
0
On

Does this work:

library(dplyr)
library(tidyr)
data %>% 
  group_by(key) %>% 
  mutate(activity_count = row_number(),
         interval_count = row_number()) %>% 
  pivot_wider(id_cols = key,
              names_from = c(activity_count, interval_count),
              values_from = c(activity,intervall))
# A tibble: 2 x 9
# Groups:   key [2]
#  key   activity_1_1 activity_2_2 activity_3_3 activity_4_4 intervall_1_1 intervall_2_2 intervall_3_3 intervall_4_4
#  <chr> <chr>        <chr>        <chr>        <chr>                <int>         <int>         <int>         <int>
#1 A     watering     remove weeds cut          remove leaf              5             7             6             1
#2 B     watering     remove weeds cut          fertilize                8             4             2             3
0
On

Here is a base R option using reshape

reshape(
  within(data, q <- ave(seq_along(key), key, FUN = seq_along)),
  direction = "wide",
  idvar = "key",
  timevar = "q"
)

which gives

  key acitity.1 intervall.1    acitity.2 intervall.2 acitity.3 intervall.3
1   A  watering           5 remove weeds           7       cut           6
5   B  watering           8 remove weeds           4       cut           2
    acitity.4 intervall.4
1 remove leaf           1
5   fertilize           3