Properties of pmatch function


I don't understand the behavior of the built-in function pmatch (partial string matching).

The description provides the following example:

pmatch("m",   c("mean", "median", "mode")) # returns NA instead of 1,2,3

but using:

pmatch("m", "mean") # returns 1, as I would have expected. 

Could anybody explain this behavior to me?


Best answer

As per the documentation:

nomatch: the value to be returned at non-matching or multiply partially matching positions. Note that it is coerced to integer.

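nomatch defaults to NA, so when the input partially matches more than one element of the table, the match is ambiguous and NA is returned. With pmatch("m", "mean") there is only one candidate, so the partial match is unique and the index 1 is returned.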

pmatch("me",   c("mean", "median", "mode")) 
[1] NA  # returns NA instead of 1,2 since multiple partial matches

pmatch("mo",   c("mean", "median", "mode")) 
[1] 3   # since single partial match
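
To make the rule above concrete, here is a small sketch covering the main cases. The variable names (tbl, ambiguous, and so on) and the "meanwhile" entry are just illustrative and not from the documentation:

```r
tbl <- c("mean", "median", "mode")

# "m" is a prefix of all three entries, so the match is ambiguous
ambiguous <- pmatch("m", tbl)

# "med" is a prefix of "median" only, so the match is unique
unique_hit <- pmatch("med", tbl)

# With a one-element table, "m" has a single candidate
single <- pmatch("m", "mean")

# An exact match is tried before partial matches
exact <- pmatch("mean", c("mean", "meanwhile"))

# nomatch replaces the NA returned for ambiguous or missing matches
flagged <- pmatch("m", tbl, nomatch = 0L)

print(c(ambiguous, unique_hit, single, exact, flagged))
```

This should print NA 2 1 1 0: only the first call is ambiguous, and nomatch = 0L turns that same ambiguity into a 0.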

Second answer

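Use grep instead. The way pmatch returns NA whenever more than one element partially matches is incredibly annoying. Note that grep takes a regular expression, so ^ is needed to anchor the pattern to the start of the string: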

grep("^m",   c("mean", "median", "mode"))
[1] 1 2 3

grep("ed",   c("mean", "median", "mode"))
[1] 2
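
One caveat with the grep approach is that the prefix is interpreted as a regular expression. If you want a literal prefix test, a possible alternative (not something the original answer uses) is base R's startsWith, which as far as I know has been available since R 3.3.0. A minimal sketch, with illustrative variable names:

```r
tbl <- c("mean", "median", "mode")

# Indices of all entries starting with the literal string "m"
all_m <- which(startsWith(tbl, "m"))

# Indices of all entries starting with "me"
all_me <- which(startsWith(tbl, "me"))

# The dot is taken literally here, whereas grep("^m.", tbl) treats it
# as "any character" and matches every entry
literal_dot <- which(startsWith(tbl, "m."))

print(all_m)
print(all_me)
print(length(literal_dot))
```

Unlike pmatch, this returns every matching index rather than NA when several entries share the prefix.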

The only downside is that pmatch(x, table, ...) is vectorized over both arguments, whereas grep is vectorized only over its second argument, so grep can't take a vector of patterns. You can work around that with the stringi package, or else with sapply.
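
A minimal sketch of the sapply route, assuming you want one set of indices per pattern. The patterns vector and the name hits are made up for illustration, and simplify = FALSE is used so the result is always a list, since each pattern can match a different number of entries:

```r
tbl <- c("mean", "median", "mode")
patterns <- c("^me", "^mo", "ed")

# Call grep once per pattern; each pattern is passed as grep's first
# argument (pattern), and tbl is passed by name as x.
# With a character input, sapply names the result by the patterns.
hits <- sapply(patterns, grep, x = tbl, simplify = FALSE)

print(hits)
```

Here hits[["^me"]] should be 1 2, hits[["^mo"]] should be 3, and hits[["ed"]] should be 2.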