How can I create a data frame when I only have the values for each cell (possibly more than one string per cell) and their column and row index?

Question

How can I create a data frame when I only have the values for each cell (possibly more than one string per cell) and their column and row index?

129 Views Asked by Michael Altorfer At 17 August 2025 at 20:59

I have a dataframe including the values for each cell (possibly more than one value) and the row and the column index.


df = data.frame(values = c(1,"Sven", 20,"Mueller","sept",2,30,"John","Mar","Hynes","Marc"), 
                colI = c(1,2,3,2,4,1,3,2,4,2,2), rowI = c(1,1,1,1,1,2,2,2,2,2,2))

And I want to get something similar to the following data.frame:

df_final= data.frame(Index = c(1,2), name = c("Sven, Mueller", "John, Hynes, Marc"), age = c(20,30), 
                     month = c("sept","Mar"))

However, I haven't been able to do anything and I have also not found a solution online. I am not able to find a solution to bring the values to their corresponding place in the dataframe and I think that it is even a bigger challenge that the cells can contain a different amount of values.

Thanks for your help.

Original Q&A

There are 3 best solutions below

**Martin Gal** · Answer 1

You could use dplyr, tidyr and stringr, all included in tidyverse:

df %>%
# bring your data into a wider format
  pivot_wider(id_cols=rowI, names_from=colI, values_from=values, values_fn=list) %>% 
# remove the nested listing
  unnest(everything()) %>%
# rename the columns
  select(Index = rowI, name=`2`, age=`3`, month=`4`) %>%
# group all rows based on the index
  group_by(Index) %>%
# concatenate the name column
  mutate(name=str_c(name, collapse=", ")) %>%
# remove duplicates
  distinct()

returns

# A tibble: 2 x 4
# Groups:   Index [2]
  Index name              age   month
  <dbl> <chr>             <chr> <chr>
1     1 Sven, Mueller     20    sept 
2     2 John, Hynes, Marc 30    Mar

Note: I changed your input data a little bit and added a single 2 to your rowI column (see Maurits Evers' comment).

**Ronak Shah** · Answer 2

Using base R, you can first aggregate your data to one comma separated string for each row and column index and then use unstack.

temp <- aggregate(values~colI + rowI, df, toString)
unstack(temp, values~colI)

#  X1                X2 X3   X4
#1  1     Sven, Mueller 20 sept
#2  2 John, Hynes, Marc 30  Mar

data

df <- structure(list(values = c("1", "Sven", "20", "Mueller", "sept", 
"2", "30", "John", "Mar", "Hynes", "Marc"), colI = c(1, 2, 3, 
2, 4, 1, 3, 2, 4, 2, 2), rowI = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 
2, 2)), class = "data.frame", row.names = c(NA, -11L))

**Yuriy Saraykin** · Answer 3

Yuriy Saraykin On 04 August 2020 at 13:58

Another solution

df %>% 
  pivot_wider(rowI, names_from = colI, values_from = values, values_fn = toString) %>% 
  select(-rowI) %>% 
  purrr::set_names(c("ID", "name", "age", "month"))

How can I create a data frame when I only have the values for each cell (possibly more than one string per cell) and their column and row index?

There are 3 best solutions below

Related Questions in R

Related Questions in DATAFRAME

Related Questions in INDICES

Related Questions in DATA-HANDLING

Trending Questions

Popular # Hahtags

Popular Questions