Functor sums

A choice of two or more functors gives rise to a functor. An article for object-oriented programmers.

This article is part of a series of articles about functor relationships. In this one you'll learn about a universal composition of functors. In short, if you have a sum type of functors, that data structure itself gives rise to a functor.

Together with other articles in this series, this result can help you answer questions such as: Does this data structure form a functor?

Since functors tend to be quite common, and since they're useful enough that many programming languages have special support or syntax for them, the ability to recognize a potential functor can be useful. Given a type like Foo<T> (C# syntax) or Bar<T1, T2>, being able to recognize it as a functor can come in handy. One scenario is if you yourself have just defined this data type. Recognizing that it's a functor strongly suggests that you should give it a Select method in C#, a map function in F#, and so on.

Not all generic types give rise to a (covariant) functor. Some are rather contravariant functors, and some are invariant.

If, on the other hand, you have a data type which is a sum of two or more (covariant) functors with the same type parameter, then the data type itself gives rise to a functor. You'll see some examples in this article.

Abstract shape in F# #

Before we look at some examples found in other code, it helps if we know what we're looking for. You'll see a C# example in a minute, but since sum types require so much ceremony in C#, we'll make a brief detour around F#.

Imagine that you have two lawful functors, F and G. Also imagine that you have a data structure that holds either an F<'a> value or a G<'a> value:

type FOrG<'a> = FOrGF of F<'a> | FOrGG of G<'a>

The name of the type is FOrG. In the FOrGF case, it holds an F<'a> value, and in the FOrGG case it holds a G<'a> value.

The point of this article is that since both F and G are (lawful) functors, then FOrG also gives rise to a functor. The composed map function can pattern-match on each case and call the respective map function that belongs to each of the two functors.

let map f forg =
    match forg with
    | FOrGF fa -> FOrGF (F.map f fa)
    | FOrGG ga -> FOrGG (G.map f ga)

For clarity I've named the values fa indicating f of a and ga indicating g of a.

Notice that it's an essential requirement that the individual functors (here F and G) are parametrized by the same type parameter (here 'a). If your data structure contains F<'a> and G<'b>, the 'theorem' doesn't apply.

Abstract shape in C# #

The same kind of abstract shape requires much more boilerplate in C#. When defining a sum type in a language that doesn't support them, we may instead either turn to the Visitor design pattern or alternatively use Church encoding. While the two are isomorphic, Church encoding is a bit simpler while the Visitor pattern seems more object-oriented. In this example I've chosen the simplicity of Church encoding.

Like in the above F# code, I've named the data structure the same, but it's now a class:

public sealed class FOrG<T>

Two constructors enable you to initialize it with either an F<T> or a G<T> value.

public FOrG(F<T> f)

public FOrG(G<T> g)

Notice that F<T> and G<T> share the same type parameter T. If a class had, instead, composed either F<T1> or G<T2>, the 'theorem' doesn't apply.

Finally, a Match method completes the Church encoding.

public TResult Match<TResult>(
    Func<F<T>, TResult> whenF,
    Func<G<T>, TResult> whenG)

Regardless of exactly what F and G are, you can add a Select method to FOrG<T> like this:

public FOrG<TResult> Select<TResult>(Func<T, TResult> selector)
{
    return Match(
        whenF: f => new FOrG<TResult>(f.Select(selector)),
        whenG: g => new FOrG<TResult>(g.Select(selector)));
}

Since we assume that F and G are functors, which in C# idiomatically have a Select method, we pass the selector to their respective Select methods. f.Select returns a new F value, while g.Select returns a new G value, but there's a constructor for each case, so the composed Select method repackages those return values in new FOrG<TResult> objects.

I'll have more to say about how this generalizes to a sum of more than two alternatives, but first, let's consider some examples.

Open or closed endpoints #

The simplest example that I can think of is that of range endpoints. A range may be open, closed, or a mix thereof. Some mathematical notations use (1, 6] to indicate the range between 1 and 6, where 1 is excluded from the range, but 6 is included. An alternative notation is ]1, 6].

A given endpoint (1 and 6, above) is either open or closed, which implies a sum type. In F# I defined it like this:

type Endpoint<'a> = Open of 'a | Closed of 'a

If you're at all familiar with F#, this is clearly a discriminated union, which is just what the F# documentation calls sum types.

The article Range as a functor goes through examples in both Haskell, F#, and C#, demonstrating, among other points, how an endpoint sum type forms a functor.

Binary tree #

The next example we'll consider is the binary tree from A Binary Tree Zipper in C#. In the original Haskell Zippers article, the data type is defined like this:

data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show)

Even if you're not familiar with Haskell syntax, the vertical bar (|) indicates a choice between the left-hand side and the right-hand side. Many programming languages use the | character for Boolean disjunction (or), so the syntax should be intuitive. In this definition, a binary tree is either empty or a node with a value and two subtrees. What interests us here is that it's a sum type.

One way this manifests in C# is in the choice of two alternative constructors:

public BinaryTree() : this(Empty.Instance)
{
}
 
public BinaryTree(T value, BinaryTree<T> left, BinaryTree<T> right)
    : this(new Node(value, left.root, right.root))
{
}

BinaryTree<T> clearly has a generic type parameter. Does the class give rise to a functor?

It does if it's composed from a sum of two functors. Is that the case?

On the 'left' side, it seems that we have nothing. In the Haskell code, it's called Empty. In the C# code, this case is represented by the parameterless constructor (also known as the default constructor). There's no T there, so that doesn't look much like a functor.

All is, however, not lost. We may view this lack of data as a particular value ('nothing') wrapped in the Const functor. In Haskell and F# a value without data is called unit and written (). In C# or Java you may think of it as void, although unit is a value that you can pass around, which isn't the case for void.

In Haskell, we could instead represent Empty as Const (), which is a bona-fide Functor instance that you can fmap:

ghci> emptyNode = Const ()
ghci> fmap (+1) emptyNode
Const ()

This examples pretends to 'increment' a number that isn't there. Not that you'd need to do this. I'm only showing you this to make the argument that the empty node forms a functor.

The 'right' side of the sum type is most succinctly summarized by the Haskell code:

Node a (Tree a) (Tree a)

It's a 'naked' generic value and two generic trees. In C# it's the parameter list

(T value, BinaryTree<T> left, BinaryTree<T> right)

Does that make a functor? Yes, it's a triple of a 'naked' generic value and two recursive subtrees, all sharing the same T. Just like in the previous article we can view a 'naked' generic value as equivalent to the Identity functor, so that parameter is a functor. The other ones are recursive types: They are of the same type as the type we're trying to evaluate, BinaryTree<T>. If we assume that that forms a functor, that triple is a product type of functors. From the previous article, we know that that gives rise to a functor.

This means that in C#, for example, you can add the idiomatic Select method:

public BinaryTree<TResult> Select<TResult>(Func<T, TResult> selector)
{
    return Aggregate(
        whenEmpty: () => new BinaryTree<TResult>(),
        whenNode: (value, left, right) =>
            new BinaryTree<TResult>(selector(value), left, right));
}

In languages that support pattern-matching on sum types (such as F#), you'd have to match on each case and explicitly deal with the recursive mapping. Notice, however, that here I've used the Aggregate method to implement Select. The Aggregate method is the BinaryTree<T> class' catamorphism, and it already handles the recursion for us. In other words, left and right are already BinaryTree<TResult> objects.

What remains is only to tell Aggregate what to do when the tree is empty, and how to transform the 'naked' node value. The Select implementation handles the former by returning a new empty tree, and the latter by invoking selector(value).

Not only does the binary tree form a functor, but it turns out that the Zipper does as well, because the breadcrumbs also give rise to a functor.

Breadcrumbs #

The original Haskell Zippers article defines a breadcrumb for the binary tree Zipper like this:

data Crumb a = LeftCrumb a (Tree a) | RightCrumb a (Tree a) deriving (Show)

That's another sum type with generics on the left as well as the right. In C# the two options may be best illustrated by these two creation methods:

public static Crumb<T> Left<T>(T value, BinaryTree<T> right)
{
    return Crumb<T>.Left(value, right);
}
 
public static Crumb<T> Right<T>(T value, BinaryTree<T> left)
{
    return Crumb<T>.Right(value, left);
}

Notice that the Left and Right choices have the same structure: A 'naked' generic T value, and a BinaryTree<T> object. Only the names differ. This suggests that we only need to think about one of them, and then we can reuse our conclusion for the other.

As we've already done once, we consider a T value equivalent with Identity<T>, which is a functor. We've also, just above, established that BinaryTree<T> forms a functor. We have a product (argument list, or tuple) of functors, so that combination forms a functor.

Since this is true for both alternatives, this sum type, too, gives rise to a functor. This enables you to implement a Select method:

public Crumb<TResult> Select<TResult>(Func<T, TResult> selector)
{
    return Match(
        (v, r) => Crumb.Left(selector(v), r.Select(selector)),
        (v, l) => Crumb.Right(selector(v), l.Select(selector)));
}

By now the pattern should be familiar. Call selector(v) directly on the 'naked' values, and pass selector to any other functors' Select method.

That's almost all the building blocks we have to declare BinaryTreeZipper<T> a functor as well, but we need one last theorem before we can do that. We'll conclude this work in the next article.

Higher arities #

Although we finally saw a 'real' triple product, all the sum types have involved binary choices between a 'left side' and a 'right side'. As was the case with functor products, the result generalizes to higher arities. A sum type with any number of cases forms a functor if all the cases give rise to a functor.

We can, again, use canonicalized forms to argue the case. (See Thinking with Types for a clear explanation of canonicalization of types.) A two-way choice is isomorphic to Either, and a three-way choice is isomorphic to Either a (Either b c). Just like it's possible to build triples, quadruples, etc. by nesting pairs, we can construct n-ary choices by nesting Eithers. It's the same kind of inductive reasoning.

This is relevant because just as Haskell's base library provides Data.Functor.Product for composing two (and thereby any number of) functors, it also provides Data.Functor.Sum for composing functor sums.

The Sum type defines two case constructors: InL and InR, but it's isomorphic with Either:

canonizeSum :: Sum f g a -> Either (f a) (g a)
canonizeSum (InL x) = Left x
canonizeSum (InR y) = Right y
 
summarizeEither :: Either (f a) (g a) -> Sum f g a
summarizeEither (Left x) = InL x
summarizeEither (Right y) = InR y

The point is that we can compose not only a choice of two, but of any number of functors, to a single functor type. A simple example is this choice between Maybe, list, or Tree:

maybeOrListOrTree :: Sum (Sum Maybe []) Tree String
maybeOrListOrTree = InL (InL (Just "foo"))

If we rather wanted to embed a list in that type, we can do that as well:

maybeOrListOrTree' :: Sum (Sum Maybe []) Tree String
maybeOrListOrTree' = InL (InR ["bar", "baz"])

Both values have the same type, and since it's a Functor instance, you can fmap over it:

ghci> fmap (elem 'r') maybeOrListOrTree
InL (InL (Just False))
ghci> fmap (elem 'r') maybeOrListOrTree'
InL (InR [True,False])

These queries examine each String to determine whether or not they contain the letter 'r', which only "bar" does.

The point, anyway, is that sum types of any arity form a functor if all the cases do.

Conclusion #

In the previous article, you learned that a functor product gives rise to a functor. In this article, you learned that a functor sum does, too. If a data structure contains a choice of two or more functors, then that data type itself forms a functor.

As the previous article argues, this is useful to know, particularly if you're working in a language with only partial support for functors. Mainstream languages aren't going to automatically turn such sums into functors, in the way that Haskell's Sum container almost does. Thus, knowing when you can safely give your generic types a Select method or map function may come in handy.

There's one more rule like this one.

Next: Functor compositions.

Published: Monday, 14 October 2024 18:26:00 UTC

Functor sums by Mark Seemann