I am working on a symbolic algebra system. I'm currently looking at how to carry out polynomial addition/multiplication from a binary parse tree. I will later consider this to be a general ring.
Parsing is not relevant here -- this intended to be the output of parsing. The parse tree that is formed. If something could be improved here, I'm certainly happy to learn about that too.
I 'feel' that this tree structure could be folded/crushed, yet I'm not too clear on how to go about it. I believe I can hack something together, but as I'm still learning Haskell, I wanted to learn what the more advanced users would do.
Below is relevant code background.
My two relevant data declarations are:
data Op = Add | Mul
data Tree a = Leaf a | Node Op [Tree a] [Tree a]
Here's one of my test examples:
-- 3*(x+2)*(x+(5*4))
test = Node Mul [
Node Mul
[Leaf "3"]
[Node Add
[Leaf "x"]
[Leaf "2"]
]
]
[Node Add
[Leaf "x"]
[Node Mul
[Leaf "5"]
[Leaf "4"]
]
]
This is a typical recursive tree type, with a node holding an operation as well as lists of left and right trees. Computation proceeds by the below operations. Note: they are all string operations for now. I need them to be as general as possible (but will later add further structure, like multiplicative commutativity).
prod :: [Tree [a]] -> [Tree [a]] -> [Tree [a]]
prod ls rs = [Leaf (l ++ r) | (Leaf l) <- ls, (Leaf r) <- rs]
add :: [Tree [a]] -> [Tree [a]] -> [Tree [a]]
add l r = l ++ r
opOnLeaf :: Op -> [Tree [a]] -> [Tree [a]] -> [Tree [a]]
opOnLeaf op l r
| op == Add = add l r
| op == Mul = prod l r
operate :: Tree [a] -> [Tree [a]]
operate (Node op [Node op2 l2 r2] [Node op3 l3 r3]) = operate (Node op (operate (Node op2 l2 r2)) (operate (Node op3 l3 r3)))
operate (Node op [Node op2 l2 r2] r) = operate (Node op (operate (Node op2 l2 r2)) r)
operate (Node op l [Node op2 l2 r2]) = operate (Node op l (operate (Node op2 l2 r2)))
operate (Node op l r) = opOnLeaf op l r
I think my variable names in operate
are hideous, but I wanted to ensure it functioned first. Making it all hideous makes it stands out more too. It's practically yelling at me to find a better way.
Now, I'd like to use fold or foldMap, but I run into the issue that I have distinct monoids at each operation (namely (*)
and (+)
. And that's just the beginning -- I will be adding several other operations. So I'd need something like a 'variable monoid', or perhaps some other map whose operation was to first acquire the monoid. I'm not sure.
The query: given this binary tree, and multiple operations at each node, how would one proceed to fold up the structure? Is there a better way to write this as an instance of Foldable, or some other structure? I also aim to add several other binary operations. Ideally, I'd like a structure that'd be indefinitely extensible. I assume once it's solved for 2, we could solve it effectively for n.
"Fold" has two related but distinct meanings in common parlance.
Fold
as inFoldable
. Viewing the datatype as a container,Foldable
gives you access to the container's elements while throwing away any information about the shape of the container itself. (This is also the sense in whichlens
'sFold
uses the term.) Not every datatype isFoldable
— only those which can be viewed as a container.The two meanings of "fold" happen to coincide when the datatype you're talking about is
[]
(which partly explains the confusion over the two meanings), but in general they don't. YourTree
happens to be a suitable candidate for either type of fold, and from your question I can't quite tell which one you're after, so I'll show you how to do both.The easiest way to write an instance of
Foldable
is to turn onDeriveFoldable
.But for the sake of understanding I'll write out the instance here:
The idea is to traverse the tree to the bottom, applying
f
to the values inside eachLeaf
. Couple of things to note about this code:foldr
is an overloaded function (it's a method ofFoldable
), and in the second clause I'm using it at two different types. Thefoldr
infoldTree
is a recursive call toTree
's definition offoldr :: (a -> b -> b) -> b -> Tree a -> b
. Thefoldr
in the line above it is a call to the usual listfoldr :: (a -> b -> b) -> b -> [a] -> b
.foldr
discards information! In theNode
case, you can see from the_
that theOp
is dropped on the floor. We also smashls
andrs
together into a single list, forgetting that they used to be in two lists.So
Foldable
is all about finding the items inside a container, while discarding any information about the container itself. To put it bluntly,Foldable
is "toList
-as-a-class".Foldable
is a great fit for container types and other abstract data types, where you want to expose a collection-like interface while hiding the internal representation of the data structure. In your case, you mentioned that you wanted to apply an operation+
or*
depending on theOp
inside the tree, but theOp
is discarded byFoldable
. So it seems likeFoldable
is not the right tool for what you're trying to do.If you want to reduce your datatype to a summary value without throwing away any information about the structure, you need a catamorphism.
To derive the signature for a catamorphism, follow this recipe.
cata
takes a value of your datatype and returns a summary valueb
.cata
. Each of those arguments will be a function which returns ab
.b
.Let's follow these steps for your
Tree
. First of all,cataTree
takes aTree
and returns a summary valueb
.Looking at
Tree
's definition, we see two constructors. SocataTree
will have two function arguments, one forLeaf
and one forNode
.Looking at the
Leaf
constructor, we see one field. So theLeaf
function has a single argument.Now let's look at
Node
.Node
has three arguments, but two of them are lists ofTree
s. We want to replace each of thoseTree
s with its corresponding summary value. So the final signature forcataTree
isImplementing
cataTree
is a matter of following the types.