Tiny DSL implementation in Python

Question

Tiny DSL implementation in Python

2.8k Views Asked by mitghi At 17 August 2025 at 14:15

I am researching to implement a DSL in Python and i am looking for a small DSL language that is friendly for someone who had no experience with designing and implementing languages. So far, i reviewed two implementations which are Hy and Mochi. Hy is actually a dialect of lisp and Mochi seems to be very similar to Elixir. Both are complex for me, right now as my aim is to prototype the language and play around with in in order to find if it really helps in solving the problem and fits to the style that problem requires or not. I am aware that Python has good support via language tools provided in Standard library. So far i implemented a dialect of lisp which is very simple indeed, i did not used python AST whatsoever and it is purely implemented via string processing which is absolutely not flexible for what i am looking for.

Are there any implementations rather than two languages mentioned above, small enough to be studied ?

What are some good books ( practical in a sense that does not only sticks to theoritical and academic aspect ) on this subject ?

What would be a good way into studying Python AST and using it ?

Are there any significant performance issues related to languages built upon Python ( like Hy ) in terms of being overhead on the actual produced bytecode ?

Thanks

Original Q&A

There are 2 best solutions below

Erez On 16 March 2017 at 17:28

You don't really need to know a lot about parsing to write your own language.

I wrote a library that lets you do just that very easily: https://github.com/erezsh/lark

Here's a blog post by me explaining how to use it to write your own language: http://blog.erezsh.com/how-to-write-a-dsl-in-python-with-lark/

I hope you don't mind my shameless plug, but it seems very relevant to your question.

**fferri** · Accepted Answer

You can split the task of creating a (yet another!) new language in at least two big steps:

Syntax
Semantics & Interpretation

Syntax

You need to define a grammar for your language, with production rules that specify how to create complex expressions from simple ones.

Example: syntax for LISP:

expression ::= atom   | list
atom       ::= number | symbol    
number     ::= [+-]?['0'-'9']+
symbol     ::= ['A'-'Z''a'-'z'].*
list       ::= '(' expression* ')'

How to read it: an expression is either an atom or a list; an atom is a number or a symbol; a number is... and so on.

Often you will define also some tokenization rules, because most grammars work at token level, and not at characters level.

Once you defined your grammar, you want a parser that, given a sentence (a program) is able to build the derivation tree, or the abstract syntax tree.

For example, for the expression x=f(y+1)+2, you want to obtain the tree:

abstract syntax tree for <code>x=f(y+1)+2</code>

There are several parsers (LL, LR, recursive descent, ...). You don't necessarily need to write your language parser by yourself, as there are tools that generate the parser from the grammar specification (LEX & YACC, Flex & Bison, JavaCC, ANTLR; also check this list of parsers available for Python).

If you want to skip the step of designing a new grammar, you may want to start from a simple one, like the grammar of LISP. There is even a LISP parser written in Python in the Pyperplan project. They use it for parsing PDDL, which is a domain specific language for planning that is based on LISP.

Useful readings:

Book: Compilers: Principles, Techniques, and Tools by by Alfred Aho, Jeffrey Ullman, Monica S. Lam, and Ravi Sethi, also known as The Dragon Book (because of the dragon pictured in the cover)
https://en.wikipedia.org/?title=Syntax_(programming_languages)
https://en.wikibooks.org/wiki/Introduction_to_Programming_Languages/Grammars
How to define a grammar for a programming language
https://en.wikipedia.org/wiki/LL_parser
https://en.wikipedia.org/wiki/Recursive_descent_parser
https://en.wikipedia.org/wiki/LALR_parser
https://en.wikipedia.org/wiki/LR_parser

Semantics & Interpretation

Once you have the abstract syntax tree of your program, you want to execute your program. There are several formalisms for specifying the "rules" to execute (pieces of) programs:

Operational semantics: a very popular one. It is classified in two categories:
- Small Step Semantics: describe individual steps of computation
- Big Step Semantics: describe the overall results of computation
Reduction semantics: a formalism based on lambda calculus
Transition semantics: if you look at your interpreter like a transition system, you can specify its semantics using transition semantics. This is especially useful for programs that do not terminate (i.e. run continuously), like controllers.

Useful readings:

Book: A Structural Approach to Operational Semantics [pdf link] by Gordon D. Plotkin
Book: Structure and Interpretation of Computer Programs by Gerald Jay Sussman and Hal Abelson
Book: Semantics Engineering with PLT Redex (SEwPR) by Matthias Felleisen, Robert Bruce Findler, and Matthew Flatt
https://en.wikipedia.org/wiki/Semantics_(computer_science)
https://en.wikipedia.org/wiki/Operational_semantics
https://en.wikipedia.org/wiki/Transition_system
https://en.wikipedia.org/wiki/Kripke_structure_(model_checking)
https://en.wikipedia.org/wiki/Hoare_logic
https://en.wikipedia.org/wiki/Lambda_calculus

Tiny DSL implementation in Python

There are 2 best solutions below

Syntax

Semantics & Interpretation

Related Questions in PYTHON

Related Questions in DSL

Trending Questions

Popular # Hahtags

Popular Questions