How does flexibility affect a language's syntax?


I am currently writing my own language (shameless plug), which is centered around flexibility. I am trying to make almost any part of the language's syntax exchangeable through things like extensions/plugins. Working on it has got me wondering how that sort of flexibility could affect the language.

I know that Lisp is often referred to as one of the most extensible languages due to its extensive macro system. I do understand the concept of macros, but I have yet to find a language that allows someone to change the way it is parsed. To my knowledge, almost every language has an extremely concrete syntax as defined by some long specification.

My question is how could having a flexible syntax affect the intuitiveness and usability of the language? I know the basic "people might be confused when the syntax changes" and "semantic analysis will be hard". Those are things that I am already starting to compensate for. I am looking for a more conceptual answer on the pros and cons of having a flexible syntax.

The topic of language design is still quite foreign to me, so I apologize if I am asking an obvious or otherwise stupid question!

Edit: I want to clarify the question I am asking. Where exactly does flexibility in a language's syntax stand, in terms of language theory? I don't really need examples or projects/languages with flexibility; I want to understand how it can affect the language's readability, functionality, and other things like that.


0
On BEST ANSWER

Thanks to SK-logic's answer for pointing me in the direction of Alan Blackwell. I sent him an email asking his stance on the matter, and he responded with an absolutely wonderful explanation. Here it is:

So the person who responded to your StackOverflow question, saying that flexible syntax could be useful for DSLs, is certainly correct. It actually used to be fairly common to use the C preprocessor to create alternative syntax (that would be turned into regular syntax in an initial compile phase). A lot of the early esolangs were built this way.

In practice, I think we would have to say that a lot of DSLs are implemented as libraries within regular programming languages, and that the library design is far more significant than the syntax. There may be more purpose for having variety in visual languages, but making customisable general purpose compilers for arbitrary graphical syntax is really hard - much worse than changing text syntax features.

There may well be interesting things that your design could enable, so I wouldn’t discourage experimentation. However, I think there is one reason why customisable syntax is not so common. This is related to the famous programmer’s editor EMACS. In EMACS, everything is customisable - all key bindings, and all editor functions. It’s fun to play with, and back in the day, many of us made our own personalised version that only we knew how to operate. But it turned out that it was a real hassle that everyone’s editor worked completely differently. You could never lean over and make suggestions on another person’s session, and teams always had to know who was logged in, in order to know whether the editor would work. So it turned out that, over the years, we all just started to use the default distribution and key bindings, which made things easier for everyone.

At this point, that is just about the explanation I was looking for. If anyone feels they have a better explanation or something to add, feel free to contact me.

1
On

Perl is the most flexible language I know. Take a look at Moose, a postmodern object system for Perl 5. Its syntax is very different from Perl's, but it is still very Perl-ish.

IMO, the biggest problem with flexibility is precedence in infix notation. No language I know of allows a datatype to have its own infix syntax. For example, take sets. It would be nice to use the usual mathematical set symbols in their syntax. But not only would a compiler have to recognize these symbols, it would also have to be told their order of precedence.
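To make the precedence point concrete, here is a minimal sketch of a precedence-climbing parser whose operator table can be extended at run time. Everything in it is an assumption made for illustration: the names `*infix-precedence*`, `define-infix` and `parse-infix` are hypothetical, the input is a flat list of symbols rather than real source text, and the operator names and precedence numbers are arbitrary.

```lisp
;; Hypothetical operator table: symbol -> precedence (higher binds tighter).
(defvar *infix-precedence* (make-hash-table :test #'eq))

(defun define-infix (operator precedence)
  "Tell the parser about a new left-associative infix OPERATOR."
  (setf (gethash operator *infix-precedence*) precedence))

(defun parse-infix (tokens &optional (min-precedence 0))
  "Precedence-climbing parse of a flat token list.
Returns the prefix form and the unconsumed tokens as two values."
  (let ((left (pop tokens)))
    (loop
      (let* ((operator (first tokens))
             (precedence (and tokens
                              (gethash operator *infix-precedence*))))
        ;; Stop at the end of input, at an unknown token, or at an
        ;; operator that binds too loosely for the current level.
        (when (or (null precedence) (< precedence min-precedence))
          (return (values left tokens)))
        (pop tokens)
        ;; Only strictly tighter operators may appear inside the right
        ;; operand, which makes every operator left-associative.
        (multiple-value-bind (right remaining)
            (parse-infix tokens (1+ precedence))
          (setf left (list operator left right)
                tokens remaining))))))

;; Usage: the precedence values below are assumed, not standard.
(define-infix 'union 10)
(define-infix 'intersection 20)
(parse-infix '(a union b intersection c))
;; first value: (UNION A (INTERSECTION B C))
```

Changing a single number in the table changes how the same token sequence groups, which is exactly the information a user-defined operator would have to supply.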

0
On

Common Lisp lets you change the way it is parsed: see reader macros. Racket lets you modify its parser: see Racket languages.
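As a rough illustration of what a reader macro does, the sketch below teaches the reader that `[a b c]` means `(list a b c)`. The bracket syntax and the names `read-bracket-list`, `*bracket-readtable*` and `read-with-brackets` are made up for this example; only the readtable functions themselves are standard Common Lisp. The change is kept in a private copy of the readtable so that the global syntax is left alone.

```lisp
;; Illustrative only: make [a b c] read as (list a b c).
(defun read-bracket-list (stream char)
  (declare (ignore char))
  ;; READ-DELIMITED-LIST reads forms until it meets the closing bracket.
  (cons 'list (read-delimited-list #\] stream t)))

(defparameter *bracket-readtable*
  (let ((rt (copy-readtable nil)))      ; fresh copy of the standard syntax
    (set-macro-character #\[ #'read-bracket-list nil rt)
    ;; Give ] the same terminating behaviour as ) so that "3]" is not
    ;; read as a single token.
    (set-macro-character #\] (get-macro-character #\) nil) nil rt)
    rt))

(defun read-with-brackets (string)
  "Read one form from STRING using the extended syntax."
  (let ((*readtable* *bracket-readtable*))
    (read-from-string string)))

;; (read-with-brackets "[1 2 (+ 1 2)]") reads as (LIST 1 2 (+ 1 2))
```

Unlike an ordinary macro, this runs while the text is being read, before any form exists for the macroexpander to work on.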

And of course you can have flexible, dynamically extensible parsing alongside powerful macros if you use the right parsing techniques (e.g., PEG). Have a look at an example here - mostly a C syntax, but extensible with both syntax and semantic macros.

As for precedence, PEG goes really well together with Pratt.

To answer your updated question - there is surprisingly little research done on programming language readability anyway. You may want to have a look at what Dr. Blackwell's group was up to, but it's still far from conclusive.

So I can only share my hand-wavy anecdotes - languages with flexible syntax facilitate eDSL construction, and, in my opinion, eDSLs are the only way to eliminate unnecessary complexity from code and to make code actually maintainable in the long term. I believe that non-flexible languages are one of the biggest mistakes made by this industry, and it must be corrected at all costs, ASAP.

0
On

Flexibility allows you to manipulate the syntax of the language. For example, Lisp macros let you write programs that write programs, transforming your own syntax into valid Lisp expressions at compile time. One example is the LOOP macro:

(loop for x from 1 to 5
      do (format t "~A~%" x))

1
2
3
4
5
NIL

And we can see how the code was translated with macroexpand-1:

(pprint (macroexpand-1 '(loop for x from 1 to 5
                              do (format t "~a~%" x))))

The expansion (from SBCL, as the SB-LOOP package prefix shows) is:

(BLOCK NIL
  (LET ((X 1))
    (DECLARE (TYPE (AND REAL NUMBER) X))
    (TAGBODY
     SB-LOOP::NEXT-LOOP
      (WHEN (> X '5) (GO SB-LOOP::END-LOOP))
      (FORMAT T "~a~%" X)
      (SB-LOOP::LOOP-DESETQ X (1+ X))
      (GO SB-LOOP::NEXT-LOOP)
     SB-LOOP::END-LOOP)))

This kind of flexibility lets you create your own embedded language within a language and shorten your programs. In theory, it can also make code very hard to read, because the syntax itself can be manipulated. For example, we can write code that is invalid on its own but is translated into valid code:

CL-USER> (defmacro backwards (expr)
           (reverse expr))
BACKWARDS
CL-USER> (backwards ("hello world" nil format))
"hello world"

Code like this can quickly become confusing, since:

 ("hello world" nil format)

is not a valid Lisp expression.