A Visitor functor

Some Visitors can be functors. Another functor example for object-oriented programmers.

This article is an instalment in an article series about functors. In the previous article, you saw how to implement a generalised tree as a functor. In this article, you'll see another functor example, which will also be an application of the Visitor design pattern.

The Visitor design pattern is often described in such a way that it's based on mutation; the Visit and Accept methods in those descriptions typically return void. You can, however, also implement immutable variations. This blog already contains an older example of this.

Visitor #

In this article, you'll see how to implement a full binary tree using the Visitor design pattern. You can make the tree generic, so that each node can contain values of any generic type.

public interface IBinaryTree<T>
{
    TResult Accept<TResult>(IBinaryTreeVisitor<T, TResult> visitor);
}

As promised, this interface implies an immutable variant where the Accept method doesn't mutate the input Visitor, but rather returns a new value. You can learn how you arrive at this particular generic method signature in my article Visitor as a sum type.

A full binary tree is a tree where each node has either zero or two children. This means that a Visitor must have two methods, one for each case:

public interface IBinaryTreeVisitor<T, TResult>
{
    TResult Visit(Node<T> node);
 
    TResult Visit(Leaf<T> leaf);
}

The IBinaryTreeVisitor<T, TResult> interface introduces two new concrete classes: Node<T> and Leaf<T>. A leaf node is a node with zero children:

public sealed class Leaf<T> : IBinaryTree<T>
{
    public T Item { get; }
 
    public Leaf(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));
 
        Item = item;
    }
 
    public TResult Accept<TResult>(IBinaryTreeVisitor<T, TResult> visitor)
    {
        if (visitor == null)
            throw new ArgumentNullException(nameof(visitor));
 
        return visitor.Visit(this);
    }
 
    public override bool Equals(object obj)
    {
        if (!(obj is Leaf<T> other))
            return false;
 
        return Equals(Item, other.Item);
    }
 
    public override int GetHashCode()
    {
        return Item.GetHashCode();
    }
}

While a leaf node has no children, it still contains an Item of the generic type T. A leaf node still counts as a binary tree, so it implements the IBinaryTree<T> interface. Complying with the Visitor design pattern, its Accept method is implemented using double dispatch. Thereby, any visitor knows that it's now visiting a concrete Leaf<T> object.

Likewise, a node is a (sub)tree:

public sealed class Node<T> : IBinaryTree<T>
{
    public T Item { get; }
    public IBinaryTree<T> Left { get; }
    public IBinaryTree<T> Right { get; }
 
    public Node(T item, IBinaryTree<T> left, IBinaryTree<T> right)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));
        if (left == null)
            throw new ArgumentNullException(nameof(left));
        if (right == null)
            throw new ArgumentNullException(nameof(right));
 
        Item = item;
        Left = left;
        Right = right;
    }
 
    public TResult Accept<TResult>(IBinaryTreeVisitor<T, TResult> visitor)
    {
        if (visitor == null)
            throw new ArgumentNullException(nameof(visitor));
 
        return visitor.Visit(this);
    }
 
    public override bool Equals(object obj)
    {
        if (!(obj is Node<T> other))
            return false;
 
        return Equals(Item, other.Item)
            && Equals(Left, other.Left)
            && Equals(Right, other.Right);
    }
 
    public override int GetHashCode()
    {
        return Item.GetHashCode() ^ Left.GetHashCode() ^ Right.GetHashCode();
    }
}

In addition to an Item, a Node<T> object also contains a Left and a Right sub-tree. Notice that the Accept method is literally identical to Leaf<T>.Accept. Its behaviour differs, though, because this has a different type.

A couple of static helper methods makes it a bit easier to create binary tree objects:

public static class BinaryTree
{
    public static IBinaryTree<T> Leaf<T>(T item)
    {
        return new Leaf<T>(item);
    }
 
    public static IBinaryTree<T> Create<T>(
        T item,
        IBinaryTree<T> left,
        IBinaryTree<T> right)
    {
        return new Node<T>(item, left, right);
    }
}

The main convenience of these two methods is that C# (limited) type inference enables you to create tree objects without explicitly typing out the generic type argument every time. You'll soon see an example of creating a binary tree of integers.

Functor #

Since IBinaryTree<T> is a generic type, you should consider whether it's a functor. Given the overall topic of this article, you'd hardly be surprised that it is.

In the previous two functor examples (Maybe and Tree), the Select methods were instance methods. On the other hand, the .NET Base Class Library implements IEnumerable<T>.Select as an extension method. You can do the same with this binary tree Visitor:

public static IBinaryTree<TResult> Select<TResult, T>(
    this IBinaryTree<T> tree,
    Func<T, TResult> selector)
{
    if (tree == null)
        throw new ArgumentNullException(nameof(tree));
    if (selector == null)
        throw new ArgumentNullException(nameof(selector));
 
    var visitor = new SelectBinaryTreeVisitor<T, TResult>(selector);
    return tree.Accept(visitor);
}

This Select method has the right signature for turning IBinaryTree<T> into a functor. It starts by creating a new instance of a private helper class called SelectBinaryTreeVisitor<T, TResult>. Notice that this class has two generic type arguments: the source type T and the destination type TResult. It also contains selector, so that it knows what to do with each Item it encounters.

SelectBinaryTreeVisitor<T, TResult> is a Visitor, so you pass it to the tree object's Accept method. The Accept method returns a variable that you can directly return, because, as you'll see below, the return type of SelectBinaryTreeVisitor<T, TResult>'s Visit methods is IBinaryTree<TResult>.

SelectBinaryTreeVisitor<T, TResult> is a private helper class, and is the most complex functor implementation you've seen so far. The Visitor design pattern solves a specific problem, but it was never the simplest of design patterns.

private class SelectBinaryTreeVisitor<T, TResult> :
    IBinaryTreeVisitor<T, IBinaryTree<TResult>>
{
    private readonly Func<T, TResult> selector;
 
    public SelectBinaryTreeVisitor(Func<T, TResult> selector)
    {
        if (selector == null)
            throw new ArgumentNullException(nameof(selector));
 
        this.selector = selector;
    }
 
    public IBinaryTree<TResult> Visit(Leaf<T> leaf)
    {
        var mappedItem = selector(leaf.Item);
        return Leaf(mappedItem);
    }
 
    public IBinaryTree<TResult> Visit(Node<T> node)
    {
        var mappedItem = selector(node.Item);
        var mappedLeft = node.Left.Accept(this);
        var mappedRight = node.Right.Accept(this);
        return Create(mappedItem, mappedLeft, mappedRight);
    }
}

Since the class implements IBinaryTreeVisitor<T, IBinaryTree<TResult>>, it must implement the two Visit overloads. The overload for Leaf<T> is simple: use the selector to map the Item, and use the Leaf convenience method to return a new Leaf<TResult> containing the mapped item. Notice that while SelectBinaryTreeVisitor<T, TResult> looks like it has a generic 'return' type argument of TResult, it implements IBinaryTreeVisitor<T, IBinaryTree<TResult>>, which means that the return type of each Visit method must be IBinaryTree<TResult>, and that matches Leaf<TResult>.

The overload for a Node<T> object looks twice as big, but it's still simple. Like the leaf overload, it uses selector to map the Item, but in addition to that, it must also recursively map the Left and Right sub-trees. It does this by passing itself, in its role as a Visitor, to the left and right nodes' Accept methods. This returns mapped sub-trees that can be used to create a new mapped tree, using the Create convenience method.

Usage #

While the implementation of such a Visitor is cumbersome, it's easy enough to use.

var source =
    BinaryTree.Create(42,
        BinaryTree.Create(1337,
            BinaryTree.Leaf(0),
            BinaryTree.Leaf(-22)),
        BinaryTree.Leaf(100));

You can translate this binary tree of integers to a tree of strings using method call syntax:

IBinaryTree<string> dest = source.Select(i => i.ToString());

or by using query syntax:

IBinaryTree<string> dest = from i in source
                           select i.ToString();

In both of these examples, I've explicitly declared the type of dest instead of using the var keyword. There's no practical reason to do this; I only did it to make the type clear to you.

Haskell #

Why would anyone ever do something so complicated as this?

The answer to such a question is, I believe, that it's only complicated in some programming languages. In Haskell, all of the above can be reduce to a single type declaration:

data BinaryTree a = Node a (BinaryTree a) (BinaryTree a) | Leaf a
  deriving (Show, Eq, Functor)

Notice that the Haskell compiler can automatically derive an implementation of the Functor typeclass, although that does require the DeriveFunctor language extension.

This may explain why binary trees aren't part of object-oriented programmers' normal tool box, whereas they are more commonplace in functional programming.

While not strictly required, in order to keep the examples equivalent, you can define these two aliases:

leaf :: a -> BinaryTree a
leaf = Leaf
 
create :: a -> BinaryTree a -> BinaryTree a -> BinaryTree a
create = Node

This enables you to create a binary tree like this:

source :: BinaryTree Int
source =
    create 42 (
        create 1337 (
            leaf 0) (
            leaf (-22))) (
        leaf 100)

As usual you can map the tree using the fmap function:

dest :: BinaryTree String
dest = fmap show source

or by using infix notation:

dest :: BinaryTree String
dest = show <$> source

The <$> operator is an alias for fmap.

F# #

As usual, F# lies somewhere between the extremes of C# and Haskell, although it's closer to Haskell in simplicity. The type declaration is similar:

type BinaryTree<'a> =
| Node of ('a * BinaryTree<'a> * BinaryTree<'a>)
| Leaf of 'a

Unlike Haskell, however, F# doesn't have any built-in functor awareness, so you'll have to implement the map function yourself:

// ('a -> 'b) -> BinaryTree<'a> -> BinaryTree<'b>
let rec map f = function
    | Node (x, left, right) -> Node (f x, map f left, map f right)
    | Leaf x -> Leaf (f x)

Notice that you have to use the rec keyword in order to make map recursive. Instead of having to create a new helper class, and all the byzantine interactions required by the Visitor design pattern, the implementation uses simple pattern matching to achieve the same goal. In the Node case, it uses f to translate x, and recursively calls itself on left and right. In the Leaf case, it simply returns a new Leaf value with x translated by f.

Create helper functions to keep all three examples aligned:

// 'a -> BinaryTree<'a>
let leaf = Leaf
 
// 'a -> BinaryTree<'a> -> BinaryTree<'a> -> BinaryTree<'a>
let create x left right = Node (x, left, right)

You can now create a binary tree of integers:

// BinaryTree<int>
let source =
    BinaryTree.create 42 (
        BinaryTree.create 1337 (
            BinaryTree.leaf 0) (
            BinaryTree.leaf -22)) (
        BinaryTree.leaf 100)

which you can translate like this:

// BinaryTree<string>
let dest = source |> BinaryTree.map string

Here, all of the above functions are defined in a module named BinaryTree.

First functor law #

The Select method obeys the first functor law. As usual, it's proper computer-science work to actually prove that, but you can write some tests to demonstrate the first functor law for the IBinaryTree<T> interface. In this article, you'll see a few parametrised tests written with xUnit.net. First, you can define some reusable trees as test input:

public static IEnumerable<object[]> Trees
{
    get
    {
        yield return new[] { BinaryTree.Leaf(0) };
        yield return new[] {
            BinaryTree.Create(-3,
                BinaryTree.Leaf(2),
                BinaryTree.Leaf(99)) };
        yield return new[] {
            BinaryTree.Create(42,
                BinaryTree.Create(1337,
                    BinaryTree.Leaf(0),
                    BinaryTree.Leaf(-22)),
                BinaryTree.Leaf(100)) };
        yield return new[] {
            BinaryTree.Create(-927,
                BinaryTree.Leaf(2),
                BinaryTree.Create(211,
                    BinaryTree.Leaf(88),
                    BinaryTree.Leaf(132))) };
        yield return new[] {
            BinaryTree.Create(111,
                BinaryTree.Create(-336,
                    BinaryTree.Leaf(113),
                    BinaryTree.Leaf(-432)),
                BinaryTree.Create(1299,
                    BinaryTree.Leaf(-32),
                    BinaryTree.Leaf(773))) };
    }
}

This is just a collection of five small binary trees that can be used as input for parametrised tests. The first tree is only a single node - the simplest tree you can make with the IBinaryTree<T> API.

You can use this static property as a source of input for parametrised tests. Here's one that demonstrates that the first functor law holds:

[Theory, MemberData(nameof(Trees))]
public void FirstFunctorLaw(IBinaryTree<int> tree)
{
    Assert.Equal(tree, tree.Select(x => x));
}

Here, I chose to implement the identity function as an anonymous lambda expression. In contrast, in a previous article, I explicitly declared a function variable and called it id. Those two ways to express the identity function are equivalent.

As always, I'd like to emphasise that this test doesn't prove that IBinaryTree<T> obeys the first functor law. It only demonstrates that the law holds for those five examples.

Second functor law #

Like the above example, you can also write a parametrised test that demonstrates that IBinaryTree<T> obeys the second functor law. You can reuse the Trees test case source for that test:

[Theory, MemberData(nameof(Trees))]
public void SecondFunctorLaw(IBinaryTree<int> tree)
{
    string g(int i) => i.ToString();
    bool f(string s) => s.Length % 2 == 0;
 
    Assert.Equal(tree.Select(g).Select(f), tree.Select(i => f(g(i))));
}

This test defines two local functions, f and g. Instead of explicitly declaring the functions as Func variables, this test uses a (relatively) new C# feature called local functions.

Again, while the test doesn't prove anything, it demonstrates that for the five test cases, it doesn't matter if you project the tree in one or two steps.

Summary #

Statically typed functional languages like F# and Haskell enable you to define sum types: types that encode a selection of mutually exclusive cases. Combined with pattern matching, it's easy to deal with values that can be one of several non-polymorphic cases. Object-oriented languages like C# or Java don't have good support for this type of data structure. Object-oriented programmers often resort to using type hierarchies, but this requires down-casting in order to work. It also comes with the disadvantage that with type hierarchies, the hierarchy is extensible, which means that as an implementer, you never know if you've handled all sub-types. The Visitor design pattern is a way to model sum types in object-oriented programming, although it tends to be verbose.

Nevertheless, if you have a generic type that models a set of mutually exclusive cases, it just may be a functor. In Haskell, you can make such a type a Functor with a mere declaration. In C#, you have to write considerable amounts of code.

Next: Reactive functor.

Published: Monday, 13 August 2018 06:56:00 UTC

A Visitor functor by Mark Seemann