I wonder if there is a way to check the ambiguity of a regular expression automatically. A regex is considered ambiguous if there is an string which can be matched by more that one ways from the regex. For example, given a regex R = (ab)*(a|b)*
, we can detect that R
is an ambiguous regex since there are two ways to match string ab
from R.
UPDATE
The question is about how to check if the regex is ambiguous by definition. I know in practical implementation of regex mechanism, there is always one way to match a regex, but please read and think about this question in academic way.
You are forgetting greed. Usually one section gets first dibs because it is a greedy match, and so there is no ambiguity.
If instead you are talking about a mythical pattern matching engine without the practical details like greed; then the answer is yes you can.
Take every element of the pattern. And try every possible subset against every possible string. If more than one subset matches the same pattern then there's an ambiguity. Optimizing this to take less than infinite time is left as an exercise for the reader.