Unify multiple rows into one row in R


I have a dataset like this:

a1 <- c("a","a","a", "b", "c", "c", "d", "d", "d", "d")
b1 <- c(7, 7, 7,5, 4, 4, 3, 3, 3, 3)
c1 <- c("A","B", "C", "D", "E", "F", "B", "C", "EE", "F")
m1 <- data.frame(a1, b1, c1)

My expected result is a dataset like this:

a <- c("a","b", "c", "d")
b <- c(7, 5, 4, 3)
c <- c("ABC","D", "EF", "BCEEF")
m <- data.frame(a, b, c)

I tried this code, but it doesn't work:

m1 <- m1 %>%
 group_by(a1)

How can I fix it?


TarJae On BEST ANSWER

I personally like @Maël's answer best, but I want to share this solution using toString(). It is more or less the same, except that toString() separates the elements with ", " by default, so we remove that separator afterwards with gsub():

library(dplyr) # >= 1.1.0 (required for the .by argument)

m1 |> 
  summarise(c = gsub(", ", "", toString(c1)), .by=c(a1,b1)) 

  a1 b1     c
1  a  7   ABC
2  b  5     D
3  c  4    EF
4  d  3 BCEEF
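
To see what the post-processing step does in isolation, here is a minimal base R sketch on a standalone vector. The vector `x` and the names `joined` and `collapsed` are illustrative only; `x` stands in for the `c1` values of a single group.

```r
# Illustrative vector: stands in for the c1 values of one group
x <- c("B", "C", "EE", "F")

# toString() joins the elements with ", " by default
joined <- toString(x)

# Remove the ", " separator; fixed = TRUE matches it as a literal string
collapsed <- gsub(", ", "", joined, fixed = TRUE)

print(joined)
print(collapsed)
```

One caveat with this approach: if a value itself contains ", ", gsub() removes that too, which paste0(..., collapse = "") would not.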
Jilber Urbina On

In base R

> aggregate(c1~a1+b1, paste0, collapse="", data=m1)[4:1, ]
  a1 b1    c1
4  a  7   ABC
3  b  5     D
2  c  4    EF
1  d  3 BCEEF
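
In the call above, aggregate() forwards collapse = "" to paste0() for each group. As a rough illustration of that per-group step (not a replacement for the answer), the same collapsing can be sketched with tapply() on the question's a1 and c1 vectors; the name `res` is made up for this example.

```r
# Data from the question
a1 <- c("a", "a", "a", "b", "c", "c", "d", "d", "d", "d")
c1 <- c("A", "B", "C", "D", "E", "F", "B", "C", "EE", "F")

# Apply paste0(..., collapse = "") to the c1 values within each a1 group;
# extra arguments after FUN are passed on to it, as in aggregate()
res <- tapply(c1, a1, paste0, collapse = "")

print(res)
```

Unlike aggregate(), this returns a named vector rather than a data frame, so it is mainly useful for checking what each group collapses to.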
jkatam On

Alternatively with paste0

m1 %>% mutate(c=paste0(c1,collapse = ''), .by=c(a1,b1)) %>% 
  group_by(a1,b1) %>% 
  slice_head(n=1) %>% 
  select(-c1)

Created on 2023-09-08 with reprex v2.0.2

# A tibble: 4 × 3
# Groups:   a1, b1 [4]
  a1       b1 c    
  <chr> <dbl> <chr>
1 a         7 ABC  
2 b         5 D    
3 c         4 EF   
4 d         3 BCEEF