# Full binary tree catamorphism by Mark Seemann

*The catamorphism for a full binary tree is a pair of functions.*

This article is part of an article series about catamorphisms. A catamorphism is a universal abstraction that describes how to digest a data structure into a potentially more compact value.

This article presents the catamorphism for a full binary tree, as well as how to identify it. The beginning of this article presents the catamorphism in C#, with examples. The rest of the article describes how to deduce the catamorphism. This part of the article presents my work in Haskell. Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix.

A *full binary tree* (also known as a *proper* or *plane* binary tree) is a tree in which each node has either two or no branches.

The diagram shows an example of a tree of integers. The left branch contains two children, of which the right branch again contains two sub-branches. The rest of the nodes are leaf-nodes with no sub-branches.

### C# catamorphism #

As a C# representation of a full binary tree, I'll start with the `IBinaryTree<T>`

API from A Visitor functor. The catamorphism is the `Accept`

method:

TResult Accept<TResult>(IBinaryTreeVisitor<T, TResult> visitor);

So far in this article series, you've mostly seen Church-encoded catamorphisms, so a catamorphism represented as a Visitor may be too big of a cognitive leap. We know, however, from Visitor as a sum type that a Visitor representation is isomorphic to a Church encoding. Since these are isomorphic, it's possible to refactor `IBinaryTree<T>`

to a Church encoding. The GitHub repository contains a series of commits that demonstrates how that refactoring works. Once you're done, you arrive at this `Match`

method, which is the refactored `Accept`

method:

TResult Match<TResult>(Func<TResult, T, TResult, TResult> node, Func<T, TResult> leaf);

This method takes a pair of functions as arguments. The `node`

function deals with an internal node in the tree (the blue nodes in the above diagram), whereas the `leaf`

function deals with the leaf nodes (the green nodes in the diagram).

The `leaf`

function may be the easiest one to understand. A leaf node only contains a value of the type `T`

, so the only operation the function has to support is translating the `T`

value to a `TResult`

value. This is also the premise of the `Leaf`

class' implementation of the method:

private readonly T item; public TResult Match<TResult>(Func<TResult, T, TResult, TResult> node, Func<T, TResult> leaf) { return leaf(item); }

The `node`

function is more tricky. It takes three input arguments, of the types `TResult`

, `T`

, and `TResult`

. The roles of these are respectively *left*, *item*, and *right*. This is a typical representation of a binary node. Since there's always a left and a right branch, you put the node's value in the middle. As was the case with the tree catamorphism, the catamorphism function receives the branches as already-translated values; that is, both the left and right branch have already been translated to `TResult`

when `node`

is called. While it looks like magic, as always it's just the result of recursion:

private readonly IBinaryTree<T> left; private readonly T item; private readonly IBinaryTree<T> right; public TResult Match<TResult>(Func<TResult, T, TResult, TResult> node, Func<T, TResult> leaf) { return node(left.Match(node, leaf), item, right.Match(node, leaf)); }

This is the `Node<T>`

class implementation of the `Match`

method. It calls `node`

and returns whatever it returns, but notice that as the `left`

and `right`

arguments, if first, recursively, calls `left.Match`

and `right.Match`

. This is how it can call `node`

with the translated branches, as well as with the basic `item`

.

The recursion stops and unwinds on `left`

and `right`

whenever one of those are `Leaf`

instances.

### Examples #

You can use `Match`

to implement most other behaviour you'd like `IBinaryTree<T>`

to have. In the original article on the full binary tree functor you saw how to implement `Select`

with a Visitor, but now that the API is Church-encoded, you can derive `Select`

from `Match`

:

public static IBinaryTree<TResult> Select<TResult, T>( this IBinaryTree<T> tree, Func<T, TResult> selector) { if (tree == null) throw new ArgumentNullException(nameof(tree)); if (selector == null) throw new ArgumentNullException(nameof(selector)); return tree.Match( node: (l, x, r) => Create(l, selector(x), r), leaf: x => Leaf(selector(x))); }

In the `leaf`

case, the `Select`

method simply calls `selector`

with the `x`

value it receives, and puts the resulting `TResult`

object into a new `Leaf`

object.

In the `node`

case, the lambda expression receives three arguments: `l`

and `r`

are the *already-translated* left and right branches, so you only need to call `selector`

on `x`

and call the `Create`

helper method to produce a new `Node`

object.

You can also implement more specialised functionality, like calculating the sum of nodes, measuring the depth of the tree, and similar functions. You saw equivalent examples in the previous article.

For the examples in this article, I'll use the tree shown in the above diagram. Using static helper methods, you can write it like this:

var tree = BinaryTree.Create( BinaryTree.Create( BinaryTree.Leaf(42), 1337, BinaryTree.Create( BinaryTree.Leaf(2112), 5040, BinaryTree.Leaf(1984))), 2, BinaryTree.Leaf(90125));

To calculate the sum of all nodes, you can write a function like this:

public static int Sum(this IBinaryTree<int> tree) { return tree.Match((l, x, r) => l + x + r, x => x); }

The `leaf`

function just returns the value of the node, while the `node`

function adds the numbers together. It works for the above `tree`

:

> tree.Sum() 100642

To find the maximum value, you can write another extension method:

public static int Max(this IBinaryTree<int> tree) { return tree.Match((l, x, r) => Math.Max(Math.Max(l, r), x), x => x); }

Again, the `leaf`

function just returns the value of the node. The `node`

function receives the value of the current node `x`

, as well as the already-found maximum value of the left branch and the right branch; it then returns the maximum of these three values:

> tree.Max() 90125

As was also the case for trees, both of these operations are part of the standard repertoire available via a data structure's *fold*. That's not the case for the next two functions, which can't be implemented using a fold, but which can be defined with the catamorphism. The first is a function to count the leaves of a tree:

public static int CountLeaves<T>(this IBinaryTree<T> tree) { return tree.Match((l, _, r) => l + r, _ => 1); }

Since the `leaf`

function handles a leaf node, the number of leaf nodes in a leaf node is, by definition, *one*. Thus, that function can ignore the value of the node and always return `1`

. The `node`

function, on the other hand, receives the number of leaf nodes on the left-hand side (`l`

), the value of the current node, and the number of leaf nodes on the right-hand side (`r`

). Notice that since an internal node is never a leaf node, it doesn't count; instead, just add `l`

and `r`

together. Notice that, again, the value of the node itself is irrelevant.

How many leaf nodes does the above tree have?

> tree.CountLeaves() 4

You can also measure the maximum depth of a tree:

public static int MeasureDepth<T>(this IBinaryTree<T> tree) { return tree.Match((l, _, r) => 1 + Math.Max(l, r), _ => 0); }

Like in the previous article, I've arbitrarily decided that the depth of a leaf node is *zero*; therefore, the `leaf`

function always returns `0`

. The `node`

function receives the depth of the left and right branches, and returns the maximum of those two values, plus one, since the current node adds one level of depth.

> tree.MeasureDepth() 3

You may not have much need for working with full binary trees in your normal, day-to-day C# work, but I found it worthwhile to include this example for a couple of reasons. First, because the original of the API shows that a catamorphism may be hiding in a Visitor. Second, because binary trees are interesting, in that they're foldable functors, but not monads.

Where does the catamorphism come from, though? How can you trust that the `Match`

method is the catamorphism?

### Binary tree F-Algebra #

As in the previous article, I'll use `Fix`

and `cata`

as explained in Bartosz Milewski's excellent article on F-Algebras.

As always, start with the underlying endofunctor. You can think of this one as a specialisation of the rose tree from the previous article:

data FullBinaryTreeF a c = LeafF a | NodeF c a c deriving (Show, Eq, Read) instance Functor (FullBinaryTreeF a) where fmap _ (LeafF x) = LeafF x fmap f (NodeF l x r) = NodeF (f l) x (f r)

As usual, I've called the 'data' type `a`

and the carrier type `c`

(for *carrier*). The `Functor`

instance as usual translates the carrier type; the `fmap`

function has the type `(c -> c1) -> FullBinaryTreeF a c -> FullBinaryTreeF a c1`

.

As was the case when deducing the recent catamorphisms, Haskell isn't too happy about defining instances for a type like `Fix (FullBinaryTreeF a)`

. To address that problem, you can introduce a `newtype`

wrapper:

newtype FullBinaryTreeFix a = FullBinaryTreeFix { unFullBinaryTreeFix :: Fix (FullBinaryTreeF a) } deriving (Show, Eq, Read)

You can define `Functor`

, `Foldable`

, and `Traversable`

instances (but not `Monad`

) for this type without resorting to any funky GHC extensions. Keep in mind that ultimately, the purpose of all this code is just to figure out what the catamorphism looks like. This code isn't intended for actual use.

A pair of helper functions make it easier to define `FullBinaryTreeFix`

values:

fbtLeafF :: a -> FullBinaryTreeFix a fbtLeafF = FullBinaryTreeFix . Fix . LeafF fbtNodeF :: FullBinaryTreeFix a -> a -> FullBinaryTreeFix a -> FullBinaryTreeFix a fbtNodeF (FullBinaryTreeFix l) x (FullBinaryTreeFix r) = FullBinaryTreeFix $ Fix $ NodeF l x r

In order to distinguish these helper functions from the ones that create `TreeFix a`

values, I prefixed them with `fbt`

(for *Full Binary Tree*). `fbtLeafF`

creates a leaf node:

Prelude Fix FullBinaryTree> fbtLeafF "fnaah" FullBinaryTreeFix {unFullBinaryTreeFix = Fix (LeafF "fnaah")}

`fbtNodeF`

is a helper function to create an internal node:

Prelude Fix FullBinaryTree> fbtNodeF (fbtLeafF 1337) 42 (fbtLeafF 2112) FullBinaryTreeFix {unFullBinaryTreeFix = Fix (NodeF (Fix (LeafF 1337)) 42 (Fix (LeafF 2112)))}

The `FullBinaryTreeFix`

type, or rather the underlying `FullBinaryTreeF a`

functor, is all you need to identify the catamorphism.

### Haskell catamorphism #

At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (`FullBinaryTreeF a`

), and an object `c`

, but you still need to find a morphism `FullBinaryTreeF a c -> c`

. Notice that the algebra you have to find is the function that reduces the functor to its *carrier type* `c`

, not the 'data type' `a`

. This takes some time to get used to, but that's how catamorphisms work. This doesn't mean, however, that you get to ignore `a`

, as you'll see.

As in the previous articles, start by writing a function that will become the catamorphism, based on `cata`

:

fullBinaryTreeF = cata alg . unFullBinaryTreeFix where alg (LeafF x) = undefined alg (NodeF l x r) = undefined

While this compiles, with its `undefined`

implementation of `alg`

, it obviously doesn't do anything useful. I find, however, that it helps me think. How can you return a value of the type `c`

from `alg`

? You could pass a function argument to the `fullBinaryTreeF`

function and use it with `x`

:

fullBinaryTreeF fl = cata alg . unFullBinaryTreeFix where alg (LeafF x) = fl x alg (NodeF l x r) = undefined

I called the function `fl`

for *function, leaf*, because we're also going to need a function for the `NodeF`

case:

fullBinaryTreeF :: (c -> a -> c -> c) -> (a -> c) -> FullBinaryTreeFix a -> c fullBinaryTreeF fn fl = cata alg . unFullBinaryTreeFix where alg (LeafF x) = fl x alg (NodeF l x r) = fn l x r

This works. Since `cata`

has the type `Functor f => (f a -> a) -> Fix f -> a`

, that means that `alg`

has the type `f a -> a`

. In the case of `FullBinaryTreeF`

, the compiler infers that the `alg`

function has the type `FullBinaryTreeF a c -> c`

, which is just what you need!

You can now see what the carrier type `c`

is for. It's the type that the algebra extracts, and thus the type that the catamorphism returns.

This, then, is the catamorphism for a full binary tree. As always, it's not the only possible catamorphism, since you can easily reorder the arguments to both `fullBinaryTreeF`

, `fn`

, and `fl`

. These would all be isomorphic, though.

### Basis #

You can implement most other useful functionality with `treeF`

. Here's the `Functor`

instance:

instance Functor FullBinaryTreeFix where fmap f = fullBinaryTreeF (\l x r -> fbtNodeF l (f x) r) (fbtLeafF . f)

The `fl`

function first invokes `f`

, followed by `fbtLeafF`

. The `fn`

function uses the `fbtNodeF`

helper function to create a new internal node. `l`

and `r`

are already-translated branches, so you just need to call `f`

with the node value `x`

.

There's no `Monad`

instance for binary trees, because you can't flatten a binary tree of binary trees. You can, on the other hand, define a `Foldable`

instance:

instance Foldable FullBinaryTreeFix where foldMap f = fullBinaryTreeF (\l x r -> l <> f x <> r) f

The `f`

function passed to `foldMap`

has the type `Monoid m => (a -> m)`

, so the `fl`

function that handles leaf nodes simply calls `f`

with the contents of the node. The `fn`

function receives two branches already translated to `m`

, so it just has to call `f`

with `x`

and combine all the `m`

values using the `<>`

operator.

The `Traversable`

instance follows right on the heels of `Foldable`

:

instance Traversable FullBinaryTreeFix where sequenceA = fullBinaryTreeF (liftA3 fbtNodeF) (fmap fbtLeafF)

There are operations on binary trees that you can implement with a fold, but some that you can't. Consider the tree shown in the diagram at the beginning of the article. This is also the tree that the above C# examples use. In Haskell, using `FullBinaryTreeFix`

, you can define that tree like this:

tree = fbtNodeF (fbtNodeF (fbtLeafF 42) 1337 (fbtNodeF (fbtLeafF 2112) 5040 (fbtLeafF 1984))) 2 (fbtLeafF 90125)

Since `FullBinaryTreeFix`

is `Foldable`

, and that type class already comes with `sum`

and `maximum`

functions, no further work is required to repeat the first two of the above C# examples:

Prelude Fix FullBinaryTree> sum tree 100642 Prelude Fix FullBinaryTree> maximum tree 90125

Counting leaves, or measuring the depth of a tree, on the other hand, is impossible with the `Foldable`

instance, but can be implemented using the catamorphism:

countLeaves :: Num n => FullBinaryTreeFix a -> n countLeaves = fullBinaryTreeF (\l _ r -> l + r) (const 1) treeDepth :: (Ord n, Num n) => FullBinaryTreeFix a -> n treeDepth = fullBinaryTreeF (\l _ r -> 1 + max l r) (const 0)

The reasoning is the same as already explained in the above C# examples. The functions also produce the same results:

Prelude Fix FullBinaryTree> countLeaves tree 4 Prelude Fix FullBinaryTree> treeDepth tree 3

This, hopefully, illustrates that the catamorphism is more capable, and that the fold is just a (list-biased) specialisation.

### Summary #

The catamorphism for a full binary tree is a pair of functions. One function handles internal nodes, while the other function handles leaf nodes.

I thought it was interesting to show this example for two reasons: First, the original example was a Visitor implementation, and I think it's worth realising that a Visitor's `Accept`

method can also be viewed as a catamorphism. Second, a binary tree is an example of a data structure that has a fold, but isn't a monad.

All articles in the article series have, so far, covered data structures well-known from computer science. The next example will, on the other hand, demonstrate that even completely ad-hoc domain-specific data structures have catamorphisms.

**Next:** Payment types catamorphism.