find a regular expression where a is never immediately followed by b (Theory of formal languages)

701 Views Asked by At

I need to find a simplified regular expression for the language of all strings of a's, b's, and c's where a is never immediately followed by b.

I tried something and reached till (a+c)*c(b+c)* + (b+c)*(a+c)*

Is this fine and if so can this be simplified?

Thanks in advance.

2

There are 2 best solutions below

0
On

You are looking for a negative lookbehind:

(?<!a)b

This will find you all the b instances that are not immediately following a

Or a negative lookahead:

a(?!b)

This will find you all the a instances that are not immediately followed by b

Here is a regex101 example for the lookbehind:
https://regex101.com/r/RsqXbW/1

Here is a regex101 example for the lookahead:
https://regex101.com/r/qiDIZU/1

0
On

You solution contains only strings from the desired language. However, it does not contain all of them. For example acbac is not contained. Your basic idea is fine, but you need to be able to iterate the possible factors. In:

(b+c)*(a (a)*(c(b+c)*)*)*

the first part generates all strings withhout a.

After the first a there come either nothing, another a or c. Another a leaves us with the same three options. c basically starts the game again. This is what the part after the first a formalizes. The many * are needed to possibly generate the empty string in all of the different options.