Example of (a|b)* about Alternation , I am confused , Why are ab, ba in the result set?

278 Views Asked by At

Alternation is about Union, then if We have R={"a"} , S={"b"} , then R|S will be {"a", "b"}.

Why ab, ba are included there in (a|b)*?

I don't understand why

enter image description here enter image description here

Regular Expresion wikipedia

I think the result set should be
(a|b)* = {Ɛ,"a", "b", "aa","bb", "aaa", "bbb", ...}

4

There are 4 best solutions below

1
Roland Illig On BEST ANSWER

The expression a|b matches either a or b.

The expression (a|b)* matches for example (a|b) (a|b) (a|b). In each of these alternatives, you can choose individually whether to match a or b. You don't have to choose the same for all of them.

The variant "either a* or b*" is written exactly as pronounced: a*|b*.

0
laenkeio On

Essentially, you can think of (a|b)* as

  • (empty set) or
  • (a|b) or
  • (a|b)(a|b) or
  • (a|b)(a|b)(a|b) ...

It's clear from this that the choice of a or b can alternate in the sequence. Hope this helps.

0
Michał Turczyn On

* means zero or more, it's almost like you'd write:

(a|b)(a|b)(a|b)(a|b)(a|b)...

meaning it repeats pattern, not the text matched, so if it once matches a it doesn't have to match a again, because it repeats pattern (a|b), so again, it wil match a or b.

That's why it can match any combination of as and bs

0
The fourth bird On

The pattern (a|b)* uses an alternation to match either a OR b and repeat that 0+ times due to the quantifier *

There is an example to write (a|b)* without the alternation which might be helpfull to reason about it why you get those matches:

a*(?:b+a*)*

To only get consecutive matches instead of mixed ones you might use a backreference \b(a|b)\1*\b to repeat what has exactly being captured in the group. See a demo.