I'm trying to model a "heterogeneous tree", ie. a tree where the nodes have different "kinds" and each "kind" are restricted in the "kind" of children they may contain:
type id = string
type block
type inline
type _ node =
| Paragraph : id * inline node list -> block node
| Strong : id * inline node list -> inline node
| Text : id * string -> inline node
A tree can then be defined like this:
let document =
Paragraph ("p1", [
Text ("text1", "Hello ");
Strong ("strong1", [
Text ("text2", "Glorious")
]);
Text ("text3", " World!")
])
Usually this would be done using separate variants for each "kind" of node, but I'm trying to define it as a GADT in order to be able to manipulate the tree using higher-order functions that pattern match on every kind of node:
function
| Text ("text2", _) ->
Some (Text ("text2", "Dreadful"))
| _ ->
None
The problem I have is in defining the function that accepts the above higher-order function and apply it to every node. So far I have this:
let rec replaceNode (type a) (f: a node -> a node option) (node: a node): a node = match f node with | Some otherNode -> otherNode | None -> match node with | Paragraph (id, children) -> Paragraph (id, (List.map (replaceNode f) children)) | Strong (id, children) -> Strong (id, (List.map (replaceNode f) children)) | Text (_, _) -> node
But the compiler gives me the following error on the highlighted line
This expression has type block node -> a node option but an expression was expected of type block node -> a node option This instance of block is ambiguous: it would escape the scope of its equation
Or if I change the type of f
to 'a node -> 'a node option
I get this error instead
This expression has type a node but an expression was expected of type a node The type constructor a would escape its scope
Clearly I don't fully understand locally abstract types (or GADTs really, for that matter), but from what I do understand these errors seem to arise because the type is, as the name suggests, "local", and while polymorphic on the outside, passing it on would "leak" it, I guess?
So my question is, first and foremost: Is this even possible to do (and by "this" I think I mean to pattern match on a GADT in a higher-order function, but I'm not even sure that's the actual problem)?
There are two root problems here ( which are a bit muddled by the presence of GADTs). The first issue is that
replaceNode
is a second rank polymorphic function. Indeed, in the first match,f
is applied to a node of typea node
, but inside theParagraph
branch, it is applied to a node of typeinline node
. The type-checker error here is a bit complicated by theList.map
call, but rewriting the function asyields a more straightforward error:
The problem is thus that we need to reassure the type-checker that
f
works for any typea
and not only the original typea
. In other words, the type off
should be'a. 'a node -> 'a node option
(akaforall 'a. 'a -> 'a node option
). Unfortunalety, explicit polymorphic annotation are only possible in first position (prenex) in OCaml, thus we cannot just change the type off
inreplaceNode
. However, it is possible to works around this issue by using polymorphic record field or method.For instance, using the record path, we can define a record type
mapper
:of which the field
f
has the right explicit polymorphic notation (aka universal quantification) and then use it inreplaceNode
:But then the second issue pops up: this
replaceNode
function has for typemapper -> inline node -> inline node
. Where does the inline type comes from? Ths time the issue is polymorhpic recursion. Without an explicit polymorphic annotation, the type ofreplaceNode
is considered constant inside its recursive definition. In other words, the type-checker considers thatreplaceNode
has for typemapper -> 'elt node -> 'elt node
for a given'elt
. And in theparagraph
andstrong
branchs, thechildren
list is a list ofinline node
. ThusList.map (replaceNode {f}) children
implies for the type-checker that'elt
=inline
and thus the type ofreplaceNode
becomesmapper -> inline node -> inline node
.To fix this issue, we need to add another polymorphic annotation. Fortunately, this time, we can add it directly:
At last, we obtain a function of type
mapper -> 'a node -> 'a node
. Note thatlet f: type a.…
is a shortcut notation for combining locally abstract type and explicit polymorphic annotation.Completing the explanation, the locally abstract
(type a)
is needed here because only abstract types can be refined when pattern matching over GADTs. In other words, we need it to precise that the typea
in theParagraph
,Strong
andText
obeys different equalities:a
=block
in the Paragraph branch,a
=inline
in theStrong
andText
branch.EDIT: How to define a mapper?
This locally abstract type bit is in fact important when defining a mapper. For instance, defining
yields a type
inline node -> inline node option
forf
, since matching over the constructorText
yields the equality'type_of_scrutinee=inline
.To correct this point, one need to add a locally abstract type annotation to make the type-checker able to refine the type of the scrutinee branch-by-branch:
Then this f has the right type and can be wrapped inside a mapper record:
Advertisement: All this is detailed in the OCaml manual starting at version 4.06 .