I was trying to calculate the average of a ternary tree. It seems not possible to finish it inside one function. Is there any way to solve this question, or it's necessary to use two functions? Thanks.
-- define a tree
data Ttree t = Nil | Node3 t (Ttree t) (Ttree t) (Ttree t)
-- get the Ternary tree and return the average
treeAverage :: Ttree Double -> Double
treeAverage Nil = 0.0 -- empty tree
treeAverage tree = treeAverage' tree (0.0)
-- try to use accumulator and another function
where
treeAverage' Nil _ = 0.0 -- empty tree
treeAverage' (Node3 n left mid right) (sum/count) =
((n+sumL+sumM+sumR) / (1+countL+countM+countR)) -- average
where
(sumL,countL) = treeAverage' left
-- calculate left subtree with sum and count
(sumM,countM) = treeAverage' mid
(sumR,countR) = treeAverage' right
In order to compute an average value, you have to perform a single division at the very end of the process, something like
(allSum / allCount)
. As the division cannot be part of the recursive tree traversal process, it seems difficult to achieve what you want within a single function.Let's start by providing a little fix for your code. It is unclear whether your auxiliary
treeAverage'
function returns a pair or a single numeric value. We can rewrite the whole thing like this, where unambiguously it returns a pair:and that code appears to work:
However, in this code, tree traversal is inextricably intertwined with arithmetics.
Decoupling ...
The common Haskell practice would be to improve matters by subcontracting tree traversal to general purpose functions, that is, functions we (or the language library) would provide anyway in order to support our tree structure, regardless of any numeric concerns.
About plain lists ...
At that point, we can look at a simpler problem: how do we compute an average for a plain list of numbers ?
As mentioned in a comment by chepner, you can use:
We could adapt this approach to
Ttree
objects, coming up withtreeSum
andtreeLeafCount
functions. But that would be suboptimal. In modern hardware, memory access is way more expensive than arithmetics, andlistAverage
needlessly traverses the list twice.How do we get to traverse the list just once ? Well, computing an average is obviously a fold operation, that is you traverse a complex structure in order to produce a single value. See the classic paper by Graham Hutton about the merits of fold operations.
Lists have an instance of the
Foldable
class mentioned in the comment by chepner. So the library provides, among other things, afoldr
function for lists:The first argument of
foldr
is a combining function, which takes an accumulator value and a scalar value from the input list, and returns an updated accumulator value. The second argument is just an initial value for the accumulator.So we can write a single-traversal list average like this:
This works fine:
Now, can we adapt this approach to trees ?
Tree traversal
So we need to somehow get a version of
foldr
for our trees.We can write it manually, working our way thru the structure from right to left:
Note that it is critical here to be able to specify the initial accumulator value.
So we now have a tree traversal mechanism that is fully general purpose and detached from any numeric concerns.
For example, we can use it to flatten any sort of (possibly non-numeric) tree:
This can be further simplified:
Testing:
At that point, we can define the
Foldable
instance for trees:and the pretty short code of the list averager above can now be used unmodified to average trees, essentially by adapting its type signature.
Now, we can do something easier. The GHC compiler happens to provide an extension,
DeriveFoldable
, that allows us to ask the compiler to writetreeFoldr
automagically. This leads directly to our:Shortest solution:
And I think most Haskell programmers would agree that this counts as a single function :-)
Note that it is also possible to provide
Functor
instances, hence anfmap
function, using theDeriveFunctor
GHC extension.