The pattern list looks like:
pattern <- c('aaa','bbb','ccc','ddd')
X is a column of df and looks like:
df$X <- c('aaa-053','aaa-001','aab','bbb')
What I tried to do: use agrep to find, for each value of df$X, the matching name in pattern, then assign that name to an existing column 'column2'. For example, if 'aaa-053' matches 'aaa', then 'aaa' should be the value in 'column2'; if nothing matches, the value should be NA.
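df$column2 <- NA_character_  # start with no match for every row
for (i in seq_along(pattern)) {
  # agrepl() returns one TRUE/FALSE per element of df$X for this pattern
  matched <- agrepl(pattern[i], df$X, ignore.case = TRUE, max.distance = 0)
  df$column2[matched] <- pattern[i]  # later patterns overwrite earlier ones
}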
The ideal column2 in df looks like:
'aaa','aaa',NA,'bbb'
agrep by itself isn't going to give you much to decide which pattern to use when several match. For instance, the pattern 'aaa' can match the first two strings, which makes sense, but it can also match the third ('aab'), which is not among your expected values. Similarly, it's feasible that it might select multiple patterns for a given string.
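One way to see this ambiguity for yourself is to tabulate every string against every pattern. This is only a rough sketch: the names `hits` and `n_matches` are made up for illustration, `X` stands in for `df$X`, and it uses agrepl with its default max.distance, so the exact cells that come back TRUE depend on that tolerance.

```r
pattern <- c('aaa', 'bbb', 'ccc', 'ddd')
X <- c('aaa-053', 'aaa-001', 'aab', 'bbb')  # stand-in for df$X

# One column per pattern, one row per string;
# TRUE where agrepl() reports an approximate match.
hits <- sapply(pattern, function(p) agrepl(p, X, ignore.case = TRUE))
rownames(hits) <- X

# Number of patterns each string matched: 0 means unmatched,
# more than 1 means agrepl() alone cannot pick a single pattern.
n_matches <- rowSums(hits)

print(hits)
print(n_matches)
```

Any row with a count other than 1 needs an extra rule (for example, an exact-prefix check or a distance comparison) before a single pattern can be assigned.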
Here's an alternative: