Using the De Bruijn notation, it is possible to define lambda terms as:
data BTerm = BVar Int | BLam BTerm | BApp BTerm BTerm
Or using the usual notation,
data Term = Var String | Lam String Term | App Term Term
These two datatypes allow construction of both closed terms and terms containing free variables.
Is it possible to define a datatype that allows construction of only closed terms. ie only the terms such as: \x.x, \x. x x, \x.\y. x y, \x.\y. y, \x.\y.\z.z(x y)
Sure, this is even implemented as a library in bound, which is a generalisation of a representation of De Bruijn indices using a (polymorphically) recursive data type:
The type of terms is indexed by the type of variables in those terms, both bound and free. When defining a new nested scope in a lambda term, we change that type: for bound variables, we just add another level of nesting, here annotated with
()
to give plain De Bruijn indices; for free variables, we just pass the type along, but also add a level of nesting withTerm
. The nesting at the type level reflects the number of De Bruijn levels that indices may refer to.Now
Term Void
is the type of a term with no free variables; it may still have bound variables, by virtue of the fact that the recursion is polymorphic:Lam (Scope (Var (Bound ()))) :: Term Void
represents (λx. x).This method works in plain Haskell 98 (i.e. without GADTs), although there are advantages to adding some fancy types, e.g. I regularly use them for typed ASTs and statically typed typecheckers. Ed Kmett gave a nice overview of the design of
bound
at School of Haskell. There are related libraries in this space such as unbound and unbound-generics.