separate only specific columns into multiple across a table

50 Views Asked by At

I have this small table called table_8[1:2,]:

structure(list(X1 = c("", ""), X2 = c("Wealth", "Ratio (75/25)"
), X3 = c("", "Gini"), X4 = c("Non-Land Wealth", "Ratio (75/25) Gini"
), X5 = c("Income", "Ratio (75/25)"), X6 = c("", "Gini"), X7 = c("Consumption", 
"Ratio (75/25) Gini")), row.names = 1:2, class = "data.frame")

looks like:

enter image description here

I need the second row of X4 and X7 to each be split into two columns so they look like X2 and X3.

I've tried:

table_8_n <- table_8[1:2,] |>
mutate(across(X4:X7, ~str_replace_all(.x, ") G", ")_G")))|>
separate_wider_delim(cols = c(X4,X7), names=LETTERS[1:2], names_sep = "",
                       delim="_", too_few = "align_start")

but it returns this error:

Error in `mutate()`:
ℹ In argument: `across(X4:X7, ~str_replace_all(.x, ") G", ")_G"))`.
Caused by error in `across()`:
! Can't compute column `X4`.
Caused by error in `stri_replace_all_regex()`:
! Incorrectly nested parentheses in regex pattern. (U_REGEX_MISMATCHED_PAREN, context=`) G`)

and I haven't been able to figure out what to put instead of "across".

2

There are 2 best solutions below

0
Mark On BEST ANSWER

The issue is that in regular expressions, (a.k.a. regex), brackets have to be escaped (in other words, they have to have a backslash before them - or, in the case of R, two backslashes before them).

One that is done, your code works:

table_8 |> 
  mutate(across(X4:X7, ~str_replace_all(.x, "\\) G", "\\)_G"))) |>
  separate_wider_delim(cols = c(X4,X7), names=LETTERS[1:2], names_sep = "",
                       delim="_", too_few = "align_start") |>
  mutate(across(everything(), ~ ifelse(is.na(.), "", .))) # I added this so it appears as it should in your screenshot, without NAs in the empty cells

Output:

# A tibble: 2 × 9
  X1    X2            X3     X4A             X4B    X5         X6    X7A   X7B  
  <chr> <chr>         <chr>  <chr>           <chr>  <chr>      <chr> <chr> <chr>
1 ""    Wealth        ""     Non-Land Wealth ""     Income     ""    Cons… ""   
2 ""    Ratio (75/25) "Gini" Ratio (75/25)   "Gini" Ratio (75… "Gin… Rati… "Gin…
0
hks On
table_8_n <- data.frame(lapply(table_8[1:2,], function(x) {
                      gsub(") ",")_",x)  
})) |>
  separate_wider_delim(cols = c(X4,X7), names=LETTERS[1:2], names_sep = "",
                       delim="_", too_few = "align_start") 

enter image description here