This is more of a soft question about static type systems in functional languages like those of the ML family. I understand why you need datatypes to describe data structures like lists and trees but defining "expressions" like those of propositional logic within datatypes seems to bring just some convenience and is not necessary. For example
datatype arithmetic_exp = Constant of int
| Neg of arithmetic_exp
| Add of (arithmetic_exp * arithmetic_exp)
| Mult of (arithmetic_exp * arithmetic_exp)
defines a set of values, on which you can write an eval
function which would give you the result. You could just as well define 4 functions: const: int -> int
, neg: int -> int
, add: int * int -> int
and mult: int * int -> int
and then an expression of the sort add (mult (const 3, neg 2), neg 4)
would give you the same thing without any loss of static security. The only complication is that you have to do four things instead of two. While learning SML and Haskell I've been trying to think about which features give you something necessary and which are just a convenience, so this is the reason why I'm asking. I guess this would matter if you want to decouple the process of evaluating a value from the value itself but I'm not sure where that would be useful.
Thanks a lot.
There is a duality between initial / first-order / datatype-based encodings (aka deep embeddings) and final / higher-order / evaluator-based encodings (aka shallow embeddings). You can indeed typically use a typeclass of combinators instead of a datatype (and convert back and forth between the two).
Here is a module showing the two approaches:
You can see that the two definitions look eerily similar.
Expr' a
is basically describing an algebra onExpr
which means that you can get ana
out of anExpr
if you have such anExpr' a
. Similarly, because you can write an instanceExpr' Expr
, you're able to reify a term of typeforall a. Expr' a => a
into a syntactic value of typeExpr
:In the end, picking a representation over another really depends on what your main focus is: if you want to inspect the structure of the expression (e.g. if you want to optimise / compile it), it's easier if you have access to an AST. If, on the other hand, you're only interested in computing an invariant using a fold (e.g. the depth of the expression or its evaluation), a higher order encoding will do.