Currently I'm using nested ifelse functions with grepl to check for matches to a vector of strings in a data frame, for example:
# vector of possible words to match
x <- c("Action", "Adventure", "Animation")
# data
my_text <- c("This one has Animation.", "This has none.", "Here is Adventure.")
my_text <- as.data.frame(my_text)
my_text$new_column <- ifelse(
  grepl("Action", my_text$my_text),
  "Action",
  ifelse(
    grepl("Adventure", my_text$my_text),
    "Adventure",
    ifelse(
      grepl("Animation", my_text$my_text),
      "Animation", NA)))
> my_text$new_column
[1] "Animation" NA "Adventure"
This is fine for just a few elements (e.g., the three here), but how do I do this when the number of possible matches is much larger (e.g., 150)? Nested ifelse seems crazy. I know I can grepl multiple things at once, as in the code below, but this returns a logical telling me only whether a string was matched, not which one was matched. I'd like to know what was matched (in the case of multiple matches, any of them is fine).
x <- c("Action", "Adventure", "Animation")
my_text <- c("This one has Animation.", "This has none.", "Here is Adventure.")
grepl(paste(x, collapse = "|"), my_text)
Returns: [1] TRUE FALSE TRUE
What I'd like it to return: "Animation" "" (or FALSE) "Adventure"
Following the pattern here, a base solution.
stringr has an easier way to do this.
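A minimal sketch of both routes, assuming the goal is the first match in each string and NA where nothing matches. It is not necessarily the code originally posted. The base version relies on regexpr() and regmatches(); the stringr version uses str_extract() and only runs if the package is installed. The names pattern, m, matched and matched_stringr are illustrative.

```r
# Words to look for and the text to search (same data as in the question)
x <- c("Action", "Adventure", "Animation")
my_text <- c("This one has Animation.", "This has none.", "Here is Adventure.")

# One alternation pattern covering every word: "Action|Adventure|Animation"
pattern <- paste(x, collapse = "|")

# Base R: regexpr() gives the position of the first match in each string
# (-1 where there is none); regmatches() returns only the matched substrings,
# so they are written back into an NA-filled vector of the full length.
m <- regexpr(pattern, my_text)
matched <- rep(NA_character_, length(my_text))
matched[m != -1] <- regmatches(my_text, m)
print(matched)

# stringr: str_extract() returns the first match per string, NA otherwise.
# Guarded so the sketch still runs when stringr is not installed.
if (requireNamespace("stringr", quietly = TRUE)) {
  matched_stringr <- stringr::str_extract(my_text, pattern)
  print(matched_stringr)
}
```

With the data frame from the question, the result could be stored with something like my_text$new_column <- matched. If any of the words can contain regex metacharacters (such as . or +), they would need escaping before being pasted into the pattern.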