How to resolve ambiguity with keywords as identifiers in Grammar-Kit


I've been trying to write the GraphQL language grammar for Grammar-Kit and have been stuck on an ambiguity issue for quite some time. Keywords in GraphQL (such as type, implements, scalar) can also be names of types or fields. For example:

type type implements type {}

At first I defined these keywords as tokens in the BNF, but then the case above is rejected as invalid. If I instead write the keywords as string literals directly in the rules, the grammar becomes ambiguous. For example, with the grammar below, consider this input:

directive @foo on Baz | Bar
scalar Foobar @cool

The PSI viewer tells me that at the position of @cool it's expecting a DirectiveAddtlLocation, which is a rule I don't even reference in the scalar rule. Has anyone familiar with Grammar-Kit encountered something like this? I'd really appreciate some insight. Thank you.

Here's an excerpt of the grammar that reproduces the error described above.

{
    tokens=[
            LEFT_PAREN='('
            RIGHT_PAREN=')'
            PIPE='|'
            AT='@'
            IDENTIFIER="regexp:[_A-Za-z][_0-9A-Za-z]*"
            WHITE_SPACE = 'regexp:\s+'
    ]
}

Document ::= Definition*
Definition ::=  DirectiveTypeDef | ScalarTypeDef
NamedTypeDef ::= IDENTIFIER

// I.E. @foo @bar(a: 10) @baz
DirectivesDeclSet ::= DirectiveDecl+
DirectiveDecl ::= AT TypeName

// I.E. directive @example on FIELD_DEFINITION | ARGUMENT_DEFINITION
DirectiveTypeDef ::= 'directive' AT NamedTypeDef DirectiveLocationsConditionDef
DirectiveLocationsConditionDef ::= 'on' DirectiveLocation DirectiveAddtlLocation*
DirectiveLocation ::= IDENTIFIER
DirectiveAddtlLocation ::= PIPE? DirectiveLocation

TypeName ::= IDENTIFIER

// I.E. scalar DateTime @foo
ScalarTypeDef ::= 'scalar' NamedTypeDef DirectivesDeclSet?

There is 1 answer below.


Once your grammar sees directive @TOKEN on IDENTIFIER, it consumes a sequence of DirectiveAddtlLocation. Each of those consists of an optional PIPE followed by an IDENTIFIER. As you note in your question, the GraphQL "keywords" are really just special cases of identifiers. So what's probably happening here is that, because any identifier is accepted as a directive location and the PIPE is optional, scalar and Foobar are both being consumed as DirectiveAddtlLocation, and the parser never gets a chance to try ScalarTypeDef.

# Parses the same as:
directive @foo on Baz | Bar | scalar | Foobar
@cool  # <-- ?????
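
To make the greedy consumption concrete, here is a small Python simulation of just the location-list part of the rule. It is only a sketch: the tokenizer and the function names are hypothetical stand-ins for what the generated Grammar-Kit parser does, and it assumes the repetition is matched greedily, which is what the PSI viewer's message suggests.

```python
import re

# Hypothetical tokenizer mirroring the question's tokens: '@', '|', IDENTIFIER.
TOKEN_RE = re.compile(r"\s*(@|\||[_A-Za-z][_0-9A-Za-z]*)")
IDENT_RE = re.compile(r"[_A-Za-z][_0-9A-Za-z]*\Z")


def tokenize(text):
    """Split the input into tokens, skipping whitespace."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_identifier(token):
    return IDENT_RE.match(token) is not None


def parse_directive_locations(tokens, start):
    """Simulate `DirectiveLocation DirectiveAddtlLocation*`, where
    DirectiveAddtlLocation ::= PIPE? IDENTIFIER (as in the question).
    Returns the consumed locations and the index of the next token."""
    if start >= len(tokens) or not is_identifier(tokens[start]):
        raise ValueError("expected a directive location")
    locations = [tokens[start]]
    i = start + 1
    while True:
        j = i + 1 if i < len(tokens) and tokens[i] == "|" else i  # PIPE?
        if j < len(tokens) and is_identifier(tokens[j]):
            locations.append(tokens[j])  # any identifier is accepted
            i = j + 1
        else:
            break
    return locations, i


source = "directive @foo on Baz | Bar\nscalar Foobar @cool"
tokens = tokenize(source)
locations, nxt = parse_directive_locations(tokens, tokens.index("on") + 1)
print(locations)     # ['Baz', 'Bar', 'scalar', 'Foobar']
print(tokens[nxt:])  # ['@', 'cool']
```

The loop only stops at @, which is neither a PIPE nor an identifier. That matches the error position the PSI viewer reports.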

You can get around this by listing out the explicit set of allowed directive locations in your grammar. (You might even be able to get pretty far by just copying the grammar in Appendix B of the GraphQL spec and changing its syntax.)

DirectiveLocation ::= ExecutableDirectiveLocation | TypeSystemDirectiveLocation
ExecutableDirectiveLocation ::= 'QUERY' | 'MUTATION' | ...
TypeSystemDirectiveLocation ::= 'SCHEMA' | 'SCALAR' | ...

Now when you go to parse:

directive @foo on QUERY | MUTATION
# "scalar" is not a directive location, so the DirectiveTypeDef must end
scalar Foobar @cool
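
The same idea as a small Python simulation, restricted to the location list. This is a sketch, not the generated parser: the function name is made up, the input is pre-split into tokens for brevity, and ALLOWED_LOCATIONS holds only the four names shown above rather than the full list from the spec.

```python
# Partial, illustrative subset; the GraphQL spec lists more locations.
ALLOWED_LOCATIONS = {"QUERY", "MUTATION", "SCHEMA", "SCALAR"}


def parse_directive_locations(tokens, start):
    """Consume a location, then any number of `PIPE? location` items,
    accepting only names from ALLOWED_LOCATIONS (case-sensitive).
    Returns the consumed locations and the index of the next token."""
    if start >= len(tokens) or tokens[start] not in ALLOWED_LOCATIONS:
        raise ValueError("expected a directive location")
    locations = [tokens[start]]
    i = start + 1
    while True:
        j = i + 1 if i < len(tokens) and tokens[i] == "|" else i  # PIPE?
        if j < len(tokens) and tokens[j] in ALLOWED_LOCATIONS:
            locations.append(tokens[j])
            i = j + 1
        else:
            break  # e.g. lowercase "scalar" is not a location
    return locations, i


# Tokens written out by hand; a real lexer would produce these.
tokens = "directive @ foo on QUERY | MUTATION scalar Foobar @ cool".split()
locations, nxt = parse_directive_locations(tokens, tokens.index("on") + 1)
print(locations)     # ['QUERY', 'MUTATION']
print(tokens[nxt:])  # ['scalar', 'Foobar', '@', 'cool']
```

Because the match is case-sensitive, lowercase scalar ends the list even though SCALAR is a valid location, so the remaining tokens are left for ScalarTypeDef.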

(Although the "identifier" vs. "keyword" distinction is a little weird, I'm pretty sure the GraphQL grammar isn't actually ambiguous: in every context where a free-form identifier is allowed, there's punctuation before a "keyword" could appear again, and in cases like this one there are unambiguous lists of not-quite-keywords that don't overlap.)