I have to write a function that checks whether input strings are valid for a given language specification. I thought that this would be a standard CFG -> Chomsky Normal Form, then CYK parsing, but one of the rules in the language is preventing this from happening.
Some of the rules are straightforward, if we define terminals {a,b,c,d,e,f,P,Q,R,S}
, then valid strings are
1) Any of the lowercase terminals in isolation
2) If 'x' is a valid string, then so is Sx
But the third rule is
3) If X and Y are valid input strings, then so are PXY, QXY, RXY
where {P,Q,R}
are the remaining uppercase terminals and X and Y are nonterminals.
What would the production rule for this look like? I thought it would be something like
XY -> PXY (and QXY, RXY)
but there are two problems with this. The first is that this is not a CFG rule -- does that mean this language defines a CSG instead?
And the second is that this doesn't work, because
XY -> PXY -> PPXY
would not be a valid message in all cases.
I think that this grammar is context-free, unless I'm misinterpreting what you're saying.
First, let's let A be the nonterminal that expands out to some valid string made just using the first two rules, we get
Now, your second rule says that if you can build the string ω then you can build Sω. We could encode that as
Finally, you've said that if you have two strings X and Y, then you can combine them together as
One way to think about this would be to generate the string P, followed by any two valid strings (same for Q or R). Thus you could add the rules
This gives the final grammar
Hope this helps!