The data is as follows:
library(fuzzyjoin)
nr <- c(1,2)
col2 <- c("b","a")
dat <- cbind.data.frame(
  nr, col2
)
thelist <- list(
  aa = c(1, 2, 3),
  bb = c(1, 2, 3)
)
I would like to do the following:
stringdist_left_join(dat, thelist, by = "col2", method = "lcs", max_dist = 1)
But this (unsurprisingly) gives an error:
Error in `group_by_prepare()`:
! Must group by variables found in `.data`.
* Column `col` is not found.
Run `rlang::last_error()` to see where the error occurred.
What would be the best way to do this?
Desired output:
nr col2 thelist list_col
1  b    bb      c(1,2,3)
2  a    aa      c(1,2,3)
This is a bit of a hack. Not sure if there is a more elegant solution.
Create a data.frame of the transposed list and pivot this into a data.frame with all the names of the list in a column named "col2". Then use fuzzy join to merge the data. With the resulting `out` data.frame, you can drop the columns you don't need.
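A minimal sketch of one way this could look, under some assumptions. Rather than transposing and pivoting, this variant builds the lookup table directly from `names(thelist)`, puts the names in a column called `thelist` (instead of "col2") to avoid suffixed column names after the join, and attaches the list elements afterwards as a list-column. The `lookup` name and the `by = c(col2 = "thelist")` mapping are illustrative choices, not part of the question.

```r
library(fuzzyjoin)

# Data from the question
nr <- c(1, 2)
col2 <- c("b", "a")
dat <- cbind.data.frame(nr, col2)
thelist <- list(
  aa = c(1, 2, 3),
  bb = c(1, 2, 3)
)

# Hypothetical helper table: one row per list element, keyed by its name.
# stringdist_left_join() needs a data.frame on both sides, not a bare list.
lookup <- data.frame(thelist = names(thelist), stringsAsFactors = FALSE)

# Fuzzy-match dat$col2 against the list names (LCS distance of at most 1).
out <- stringdist_left_join(
  dat, lookup,
  by = c(col2 = "thelist"),
  method = "lcs", max_dist = 1
)

# Attach the matched list elements as a list-column.
# Rows without a match (NA in out$thelist) end up with NULL here.
out$list_col <- unname(thelist[out$thelist])

out
```

Joining on the names only and indexing into `thelist` afterwards keeps the list elements intact, so they can have different lengths without breaking the join.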