Methods of creating syntax highlighting in textX?

As I cannot find any guidelines on syntax highlighting, I decided to build a simple write-as-plain-text-and-then-highlight-everything-in-an-HTML-preview tool, which is enough for my needs at the moment.

I override many custom meta-model classes and give each one a to_source method, which effectively reimplements the whole syntax in reverse, since reverse parsing is not yet available. This works, but it ignores the user's formatting.

To retain the user's formatting, the only things available are _tx_position and _tx_position_end. Descending from the main textX rule to its children through the attributes stored on the custom meta-model classes works in most cases, but it fails for primitives.
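A minimal sketch of the slicing idea, and of where it stops working. The Foo object below is a hand-built stand-in (a real one would come from the textX model), the helper name source_of is made up for illustration, and the sketch assumes _tx_position_end is an exclusive end offset.

```python
from types import SimpleNamespace


def source_of(obj, text):
    """Return the exact user-written text that produced `obj`."""
    # Assumption: _tx_position_end points one past the last character.
    return text[obj._tx_position:obj._tx_position_end]


# Note the irregular spacing typed by the user.
text = "begin  fancy x separator   y finished, 42 end"

# Hand-built stand-in for a parsed Foo object (hypothetical); a real textX
# object would already carry these two position attributes.
start = text.index("fancy")
end = text.index("finished") + len("finished")
foo = SimpleNamespace(a="x", b="y", _tx_position=start, _tx_position_end=end)

print(source_of(foo, text))            # fancy x separator   y finished
print(hasattr(foo.a, "_tx_position"))  # False: foo.a is a plain str
```

The slice keeps the user's spacing intact, but foo.a is a plain str, so there is nothing to slice by for the primitive itself.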

# textX meta-model file
NonsenseProgram:
    "begin" foo=Foo "," count=INT "end"
;

Foo:
    "fancy" a=ID "separator" b=ID "finished"
;

# textX custom meta-model classes
class NonsenseProgram:
    def __init__(self, foo, count):
        self.foo = foo
        self.count = count

    def to_source(self):
        pass  # some recursive magic that uses _tx_position and _tx_position_end

class Foo:
    def __init__(self, parent, a, b):
        self.parent = parent
        self.a = a
        self.b = b

    def to_source(self):
        pass  # some recursive magic that uses _tx_position and _tx_position_end

Let's consider the given example. Since we can override the NonsenseProgram and Foo classes, we control the source each of them returns as a whole. By accessing their _tx_* attributes we can modify the code generated for NonsenseProgram and for the NonsenseProgram.foo fragment (by overriding Foo). We can't do the same for NonsenseProgram.count, Foo.a and Foo.b, because those are primitive string or int values.

Depending on how primitives are used in our grammar, we have the following options:

  • Wrap every primitive in a rule that contains only that primitive and nothing else.
    Pros: It just works right now!
    Cons: Produces a massive overhead of nested values that our grammar toolchain needs to handle. It's really just messing with the grammar for the sake of being pretty...
  • Ignore the user's syntax and use only our reverse parsing rules.
    Pros: It just works too!
    Cons: You need to reimplement your syntax for nearly every grammar element. It forces a code reformat on every highlight attempt.
  • Use some external highlighting rules.
    Pros: It would work...
    Cons: Again, grammar reimplementation.
  • Use a language server.
    Pros: Would be the best option in the long run.
    Cons: It's only mentioned once, without any in-depth docs.

Any suggestions for other options?

There is 1 best solution below

You are right. There is no position information for primitive types. It seems that you have covered the options available at the moment.

An easy option to implement would be to add position bookkeeping for all attributes directly to textX, as a special structure on each created object (e.g. a dict keyed by attribute name). It should be straightforward to implement, so you can register a feature request in the issue tracker if you wish.
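To make the proposal concrete, here is a sketch of how such a structure might be consumed. Nothing in it is existing textX API: the attribute name _tx_attr_positions, the (start, end) tuple shape and the helper attr_source are all invented for illustration, and the object is built by hand.

```python
from types import SimpleNamespace

text = "begin fancy a separator b finished, 7 end"
count_start = text.index("7")

# Hypothetical: a per-object dict keyed by attribute name, as the proposed
# feature might provide it. textX does not create this today.
program = SimpleNamespace(
    count=7,
    _tx_attr_positions={"count": (count_start, count_start + 1)},
)


def attr_source(obj, name, text):
    """Return the user-written text behind a primitive attribute."""
    start, end = obj._tx_attr_positions[name]
    return text[start:end]


print(attr_source(program, "count", text))  # 7
```

With something like this, primitives would stay plain int and str values in the model, and no wrapper rules would be needed in the grammar.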

There was some work in the past to support full language services for textX-based languages. The idea is to get all the features you would expect from a decent code editor/IDE for any language specified using textX. The work stalled for a while but resumed recently as a full rewrite. It should be officially supported by the textX team. You can follow the progress here. Although the project doesn't mention syntax highlighting at the moment, it is on our agenda.