Functional file system

Monday, 26 August 2019 06:00:00 UTC

How do you model file systems in a functional manner, so that unit testing is enabled? An overview.

One of the many reasons that I like functional programming is that it's intrinsically testable. In object-oriented programming, you often have to jump through hoops to enable testing. This is also the case whenever you need to interact with the computer's file system. Just try to search the web for file system interface, or mock file system. I'm not going to give you any links, because I think such questions are XY problems. I don't think that the most common suggestions are proper solutions.

In functional programming, however, Dependency Injection isn't functional, because it makes everything impure. How, then, do you model the file system in such a way that it's pure, decoupled from the logic you'd like to add on top of it, and still has enough fidelity that you can perform most tasks?

You model the file system as a tree, or a forest.

File systems are hierarchies #

It should come as no surprise that file systems are hierarchies, or trees. Each logical drive is the root of a tree. Files are leaves, and directories are internal nodes. Does that sound familiar? That sounds like a rose tree.

Rose trees are immutable data structures. It doesn't get much more functional than that. Why not use a rose tree (or a forest) to model the file system?
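
To make the idea concrete, here's a minimal sketch that uses the Church-encoded IRoseTree<N, L> type developed in the articles below. Directory names become internal nodes, and file names become leaves; the specific names are only invented for the example:

IRoseTree<string, string> fileSystem =
    RoseTree.Node("C:",
        RoseTree.Node("Pictures",
            new RoseLeaf<string, string>("2018-03-04 12.01.54.jpg"),
            new RoseLeaf<string, string>("2018-03-12 18.28.04.jpg")),
        RoseTree.Node("Documents",
            new RoseLeaf<string, string>("notes.txt")));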

What about interaction with the actual file system? Usually, when you encounter object-oriented attempts at decoupling an abstraction from the actual file system, you'll find polymorphic operations such as WriteAllText, GetFileSystemEntries, CreateDirectory, and so on. These would be the (mockable) methods that you have to implement, usually as Humble Objects.

If you, instead of a set of interfaces, model the file system as a forest, interacting with the actual file system is not even part of the abstraction. That's a typical shift of perspective from object-oriented design to functional programming.

Object-oriented and functional ways to abstractly model file systems.

In object-oriented design, you typically attempt to model data with behaviour. Sometimes that fits the underlying reality well, but in this case it doesn't. While you have file and directory objects with behaviour, the actual structure of a file system is implicit. It's hidden in the interactions between the objects.

By modelling the file system as a tree, you explicitly use the structure of the data. How you load a tree into program memory, or how you imprint a tree onto the file system, isn't part of the abstraction. When it comes to input and output, you're free to do what you want.

Once you have a model of a directory structure in memory, you can manipulate it to your heart's content. Since rose trees are functors, you know that all transformations are structure-preserving. That means that you don't even need to write tests for those parts of your application.
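
For instance, with the Select method from the rose tree functor article below, a leaf-level rename is a one-liner. This sketch assumes the fileSystem tree shown above; only the leaves (the file names) are touched, and the tree's shape is guaranteed to be preserved:

IRoseTree<string, string> upperCased =
    from fileName in fileSystem
    select fileName.ToUpperInvariant();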

You'll appreciate an example, I'm sure.

Picture archivist example #

As an example, I'll attempt to answer an old Code Review question. I already gave an answer in 2015, but I'm not so happy with it today as I was back then. The question is great, though, because it explicitly demonstrates how people have a hard time escaping the notion that abstraction is only available via interfaces or abstract base classes. In 2015, I had long since figured out that delegates (and thus functions) are anonymous interfaces, but I still hadn't figured out how to separate pure from impure behaviour.

The question's scenario is how to implement a small program that can inspect a collection of image files, extract the date-taken metadata from each file, and move the files to a new directory structure based on that information.

For example, you could have files organised in various directories according to motive.

Three example directories with pictures.

You soon realise, however, that that archiving strategy is untenable, because what do you do if there's more than one type of motive in a picture? Instead, you decide to organise the files according to month and year.

Seven example directories with pictures.

Clearly, there's some input and output involved in this application, but there's also some logic that you'd like to unit test. You need to parse the metadata, figure out where to move each image file, filter out files that are not images, and so on.
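
As a sketch of one such pure, easily testable decision, a function could map a photograph's date-taken metadata to the name of its destination directory. The name and format here are assumptions for illustration, not code from the articles that follow:

// Hypothetical example: compute a destination directory name from the
// date a photo was taken, e.g. 4 March 2018 becomes "2018-03".
public static string MoveTo(DateTime takenOn)
{
    return takenOn.ToString("yyyy-MM");
}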

Object-oriented picture archivist #

If you were to implement such a picture archivist program with an object-oriented design, you may use Dependency Injection so that you can 'mock' the file system during unit testing. A typical program might then work like this at run time:

An object-oriented program typically has busy interaction with the file system.

The program has fine-grained, busy interaction with the file system (through a polymorphic interface). It'll typically read one file, load its metadata, decide where to put the file, and copy it there. Then it'll move on to the next file, although it might also do this in parallel. Throughout the program execution, there's input and output going on, which makes it difficult to isolate the pure from the impure code.

Even if you write a program like that in F#, it's hardly a functional architecture.

Such an architecture is, in theory, testable, but my experience is that if you attempt to reproduce such busy, fine-grained interaction with mocks and stubs, you're likely to end up with brittle tests.

Functional picture archivist #

In functional programming, you'll have to reject the notion of dependencies. Instead, you can often resort to the simple architecture I call an impure-pure-impure sandwich; here, specifically:

  1. Load data from disk (impure)
  2. Transform the data (pure)
  3. Write data to disk (impure)

A typical program might then work like this at run time:

A functional program typically loads data, transforms it, and stores it again.

When the program starts, it loads data from disk into a tree. It then manipulates the in-memory model of the files in question, and once it's done, it traverses the entire tree and applies the changes.

This gives you a much clearer separation between the pure and impure parts of the code base. The pure part is bigger, and easier to unit test.
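
In outline, the sandwich could look like this. The function names are hypothetical placeholders, not code from the articles that follow:

// Impure: read the source directory into a rose tree.
IRoseTree<string, string> source = ReadTree(sourceDirectory);
// Pure: calculate the desired destination tree in memory.
IRoseTree<string, string> destination = CalculateMoves(source);
// Impure: apply the in-memory tree to the actual file system.
WriteTree(destination);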

Example code #

This article gave you an overview of the functional architecture. In the next two articles, you'll see how to do this in practice. First, I'll implement the above architecture in Haskell, so that we know that if it works there, the architecture does, indeed, respect the functional interaction law.

Based on the Haskell implementation, you'll then see a port to F#.

These two articles share the same architecture. You can read both, or one of them, as you like. The source code is available on GitHub.

Summary #

One of the hardest problems in transitioning from object-oriented programming to functional programming is that the design approach is so different. Many well-understood design patterns and principles don't translate easily. Dependency Injection is one of those. Often, you'll have to flip the model on its head, so to speak, before you can take it on in a functional manner.

While most object-oriented programmers would say that object-oriented design involves focusing on 'the nouns', in practice, it often revolves around interactions and behaviour. Sometimes, that's appropriate, but often, it's not.

Functional programming, in contrast, tends to take a more data-oriented perspective. Load some data, manipulate it, and publish it. If you can come up with an appropriate data structure for the data, you're probably on your way to implementing a functional architecture.

Next: Picture archivist in Haskell.


A rose tree functor

Monday, 19 August 2019 08:08:00 UTC

Rose trees form normal functors. A place-holder article for object-oriented programmers.

This article is an instalment in an article series about functors. As another article explains, a rose tree is a bifunctor. This makes it trivially a functor. As such, this article is mostly a place-holder to fit the spot in the functor table of contents, thereby indicating that rose trees are functors.

Since a rose tree is a bifunctor, it's actually not one, but two, functors. Many languages, C# included, are best equipped to deal with unambiguous functors. This is also true in Haskell, where you'd usually define the Functor instance over a bifunctor's right, or second, side. Likewise, in C#, you can make IRoseTree<N, L> a functor by implementing Select:

public static IRoseTree<N, L1> Select<N, L, L1>(
    this IRoseTree<N, L> source,
    Func<L, L1> selector)
{
    return source.SelectLeaf(selector);
}

This method simply delegates all implementation to the SelectLeaf method; it's just SelectLeaf by another name. It obeys the functor laws, since these are just specializations of the bifunctor laws, and we know that a rose tree is a proper bifunctor.

It would have been technically possible to instead implement a Select method by calling SelectNode, but it seems marginally more useful to enable syntactic sugar for mapping over the leaves.
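
For comparison, the road not taken would have had the same one-liner shape, only delegating to SelectNode instead. This is just a sketch of the alternative, not code to add alongside the above method:

// The alternative: query syntax would then map internal nodes, not leaves.
public static IRoseTree<N1, L> Select<N, N1, L>(
    this IRoseTree<N, L> source,
    Func<N, N1> selector)
{
    return source.SelectNode(selector);
}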

Menu example #

As an example, imagine that you're defining part of a menu bar for an old-fashioned desktop application. Perhaps you're even loading the structure of the menu from a text file. Doing so, you could create a simple tree that represents the edit menu:

IRoseTree<string, string> editMenuTemplate =
    RoseTree.Node("Edit",
        RoseTree.Node("Find and Replace",
            new RoseLeaf<string, string>("Find"),
            new RoseLeaf<string, string>("Replace")),
        RoseTree.Node("Case",
            new RoseLeaf<string, string>("Upper"),
            new RoseLeaf<string, string>("Lower")),
        new RoseLeaf<string, string>("Cut"),
        new RoseLeaf<string, string>("Copy"),
        new RoseLeaf<string, string>("Paste"));

At this point, you have an IRoseTree<string, string>, so you might as well have used a 'normal' tree instead of a rose tree. The above template, however, is only a first step, because you have this Command class:

public class Command
{
    public Command(string name)
    {
        Name = name;
    }
 
    public string Name { get; }
 
    public virtual void Execute()
    {
    }
}

Apart from this base class, you also have classes that derive from it: FindCommand, ReplaceCommand, and so on. These classes override the Execute method by implementing find, replace, etc. functionality. Imagine that you also have a store or dictionary of these derived objects. This enables you to transform the template tree into a useful user menu:

IRoseTree<string, Command> editMenu =
    from name in editMenuTemplate
    select commandStore.Lookup(name);

Notice how this transforms only the leaves, using the command store's Lookup method. This example uses C# query syntax, because this is what the Select method enables, but you could also have written the translation by just calling the Select method.
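
For the sake of completeness, the same translation as an explicit method call looks like this:

IRoseTree<string, Command> editMenu =
    editMenuTemplate.Select(name => commandStore.Lookup(name));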

The internal nodes in a menu have no behaviour, so it makes little sense to attempt to turn them into Command objects as well. They're only there to provide structure to the menu. With a 'normal' tree, you wouldn't have been able to enrich only the leaves, while leaving the internal nodes untouched, but with a rose tree you can.

The above example uses the Select method (via query syntax) to translate the nodes, thereby providing a demonstration of how to use the rose tree as the functor it is.

Summary #

The Select method doesn't implement any behaviour not already provided by SelectLeaf, but it enables C# query syntax. The C# compiler understands functors, but not bifunctors, so when you have a bifunctor, you might as well light up that language feature as well by adding a Select method.

Next: A Visitor functor.


Rose tree bifunctor

Monday, 12 August 2019 10:33:00 UTC

A rose tree forms a bifunctor. An article for object-oriented developers.

This article is an instalment in an article series about bifunctors. While the overview article explains that there's essentially two practically useful bifunctors, here's a third one: rose trees.

Mapping both dimensions #

Like in the previous article on the Either bifunctor, I'll start by implementing the simultaneous two-dimensional translation SelectBoth:

public static IRoseTree<N1, L1> SelectBoth<N, N1, L, L1>(
    this IRoseTree<N, L> source,
    Func<N, N1> selectNode,
    Func<L, L1> selectLeaf)
{
    return source.Cata(
        node: (n, branches) => new RoseNode<N1, L1>(selectNode(n), branches),
        leaf: l => (IRoseTree<N1, L1>)new RoseLeaf<N1, L1>(selectLeaf(l)));
}

This article uses the previously shown Church-encoded rose tree and its catamorphism Cata.

In the leaf case, the l argument received by the lambda expression is an object of the type L, since the source tree is an IRoseTree<N, L> object; i.e. a tree with leaves of the type L and nodes of the type N. The selectLeaf argument is a function that converts an L object to an L1 object. Since l is an L object, you can call selectLeaf with it to produce an L1 object. You can use this resulting object to create a new RoseLeaf<N1, L1>. Keep in mind that while the RoseLeaf class requires two type arguments, it never requires an object of its N type argument, which means that you can create an object with any node type argument, including N1, even if you don't have an object of that type.

In the node case, the lambda expression receives two objects: n and branches. The n object has the type N, while the branches object has the type IEnumerable<IRoseTree<N1, L1>>. In other words, the branches have already been translated to the desired result type. That's how the catamorphism works. This means that you only have to figure out how to translate the N object n to an N1 object. The selectNode function argument can do that, so you can then create a new RoseNode<N1, L1> and return it.

This works as expected:

> var tree = RoseTree.Node("foo"new RoseLeaf<stringint>(42), new RoseLeaf<stringint>(1337));
> tree
RoseNode<string, int>("foo", IRoseTree<string, int>[2] { 42, 1337 })
> tree.SelectBoth(s => s.Length, i => i.ToString())
RoseNode<int, string>(3, IRoseTree<int, string>[2] { "42", "1337" })

This C# Interactive example shows how to convert a tree with internal string nodes and integer leaves to a tree of internal integer nodes and string leaves. The node strings are converted to integers by measuring their Length, while the leaf integers are turned into strings using the standard ToString method available on all objects.

Mapping nodes #

When you have SelectBoth, you can trivially implement the translations for each dimension in isolation. For tuple bifunctors, I called these methods SelectFirst and SelectSecond, while for Either bifunctors, I chose to name them SelectLeft and SelectRight. Continuing the trend of naming the translations after what they translate, instead of their positions, I'll name the corresponding methods here SelectNode and SelectLeaf. In Haskell, the functions associated with Data.Bifunctor are always called first and second, but there they're part of the Bifunctor type class, so the abstract names serve an actual purpose. That's not the case in C#, so you might as well use names that communicate intent, which is what I've tried to do here.

If you want to map only the internal nodes, you can implement a SelectNode method based on SelectBoth:

public static IRoseTree<N1, L> SelectNode<N, N1, L>(
    this IRoseTree<N, L> source,
    Func<N, N1> selector)
{
    return source.SelectBoth(selector, l => l);
}

This simply uses the l => l lambda expression as an ad-hoc identity function, while passing selector as the selectNode argument to the SelectBoth method.

You can use this to map the above tree to a tree made entirely of numbers:

> var tree = RoseTree.Node("foo"new RoseLeaf<stringint>(42), new RoseLeaf<stringint>(1337));
> tree.SelectNode(s => s.Length)
RoseNode<int, int>(3, IRoseTree<int, int>[2] { 42, 1337 })

Such a tree is, incidentally, isomorphic to a 'normal' tree. It might be a good exercise, if you need one, to demonstrate the isomorphism by writing functions that convert a Tree<T> into an IRoseTree<T, T>, and vice versa.

Mapping leaves #

Similar to SelectNode, you can also trivially implement SelectLeaf:

public static IRoseTree<N, L1> SelectLeaf<N, L, L1>(
    this IRoseTree<N, L> source,
    Func<L, L1> selector)
{
    return source.SelectBoth(n => n, selector);
}

This is another one-liner calling SelectBoth, with the difference that the identity function n => n is passed as the first argument, instead of as the last. This ensures that only RoseLeaf values are mapped:

> var tree = RoseTree.Node("foo"new RoseLeaf<stringint>(42), new RoseLeaf<stringint>(1337));
> tree.SelectLeaf(i => i % 2 == 0)
RoseNode<string, bool>("foo", IRoseTree<string, bool>[2] { true, false })

In the above C# Interactive session, the leaves are mapped to Boolean values, indicating whether they're even or odd.

Identity laws #

Rose trees obey all the bifunctor laws. While it's formal work to prove that this is the case, you can get an intuition for it via examples. Often, I use a property-based testing library like FsCheck or Hedgehog to demonstrate (not prove) that laws hold, but in this article, I'll keep it simple and only cover each law with a parametrised test.

private static T Id<T>(T x) => x;
 
public static IEnumerable<object[]> BifunctorLawsData
{
    get
    {
        yield return new[] { new RoseLeaf<intstring>("") };
        yield return new[] { new RoseLeaf<intstring>("foo") };
        yield return new[] { RoseTree.Node<intstring>(42) };
        yield return new[] { RoseTree.Node(42, new RoseLeaf<intstring>("bar")) };
        yield return new[] { exampleTree };
    }
}
 
[Theory, MemberData(nameof(BifunctorLawsData))]
public void SelectNodeObeysFirstFunctorLaw(IRoseTree<int, string> t)
{
    Assert.Equal(t, t.SelectNode(Id));
}

This test uses xUnit.net's [Theory] feature to supply a small set of example input values. The input values are defined by the BifunctorLawsData property, since I'll reuse the same values for all the bifunctor law demonstration tests. The exampleTree object is the tree shown in Church-encoded rose tree.

The tests also use the identity function implemented as a private function called Id, since C# doesn't come equipped with such a function in the Base Class Library.

For all the IRoseTree<int, string> objects t, the test simply verifies that the original tree t is equal to the tree projected over the first axis with the Id function.

Likewise, the first functor law applies when translating over the second dimension:

[Theory, MemberData(nameof(BifunctorLawsData))]
public void SelectLeafObeysFirstFunctorLaw(IRoseTree<int, string> t)
{
    Assert.Equal(t, t.SelectLeaf(Id));
}

This is the same test as the previous test, with the only exception that it calls SelectLeaf instead of SelectNode.

Both SelectNode and SelectLeaf are implemented by SelectBoth, so the real test is whether this method obeys the identity law:

[Theory, MemberData(nameof(BifunctorLawsData))]
public void SelectBothObeysIdentityLaw(IRoseTree<int, string> t)
{
    Assert.Equal(t, t.SelectBoth(Id, Id));
}

Projecting over both dimensions with the identity function does, indeed, return an object equal to the input object.

Consistency law #

In general, it shouldn't matter whether you map with SelectBoth or a combination of SelectNode and SelectLeaf:

[Theory, MemberData(nameof(BifunctorLawsData))]
public void ConsistencyLawHolds(IRoseTree<int, string> t)
{
    DateTime f(int i) => new DateTime(i);
    bool g(string s) => string.IsNullOrWhiteSpace(s);
 
    Assert.Equal(t.SelectBoth(f, g), t.SelectLeaf(g).SelectNode(f));
    Assert.Equal(
        t.SelectNode(f).SelectLeaf(g),
        t.SelectLeaf(g).SelectNode(f));
}

This example creates two local functions f and g. The first function, f, creates a new DateTime object from an integer, using one of the DateTime constructor overloads. The second function, g, just delegates to string.IsNullOrWhiteSpace, although I want to stress that this is just an example. The law should hold for any two (pure) functions.

The test then verifies that you get the same result from calling SelectBoth as when you call SelectNode followed by SelectLeaf, or the other way around.

Composition laws #

The composition laws insist that you can compose functions, or translations, and that again, the choice to do one or the other doesn't matter. Along each of the axes, it's just the second functor law applied. This parametrised test demonstrates that the law holds for SelectNode:

[Theory, MemberData(nameof(BifunctorLawsData))]
public void SecondFunctorLawHoldsForSelectNode(IRoseTree<int, string> t)
{
    char f(bool b) => b ? 'T' : 'F';
    bool g(int i) => i % 2 == 0;
 
    Assert.Equal(
        t.SelectNode(x => f(g(x))),
        t.SelectNode(g).SelectNode(f));
}

Here, f is a local function that returns the character 'T' for true, and 'F' for false; g is the even function. The second functor law states that mapping f(g(x)) in a single step is equivalent to first mapping over g and then mapping the result of that using f.

The same law applies if you fix the first dimension and translate over the second:

[Theory, MemberData(nameof(BifunctorLawsData))]
public void SecondFunctorLawHoldsForSelectLeaf(IRoseTree<int, string> t)
{
    bool f(int x) => x % 2 == 0;
    int g(string s) => s.Length;
 
    Assert.Equal(
        t.SelectLeaf(x => f(g(x))),
        t.SelectLeaf(g).SelectLeaf(f));
}

Here, f is the even function, whereas g is a local function that returns the length of a string. Again, the test demonstrates that the output is the same whether you map over an intermediary step, or whether you map using only a single step.

This generalises to the composition law for SelectBoth:

[Theory, MemberData(nameof(BifunctorLawsData))]
public void SelectBothCompositionLawHolds(IRoseTree<int, string> t)
{
    char f(bool b) => b ? 'T' : 'F';
    bool g(int x) => x % 2 == 0;
    bool h(int x) => x % 2 == 0;
    int i(string s) => s.Length;
 
    Assert.Equal(
        t.SelectBoth(x => f(g(x)), y => h(i(y))),
        t.SelectBoth(g, i).SelectBoth(f, h));
}

Again, whether you translate in one or two steps shouldn't affect the outcome.

As all of these tests demonstrate, the bifunctor laws hold for rose trees. The tests only showcase five examples, but I hope it gives you an intuition how any rose tree is a bifunctor. After all, the SelectNode, SelectLeaf, and SelectBoth methods are all generic, and they behave the same for all generic type arguments.

Summary #

Rose trees are bifunctors. You can translate the node and leaf dimension of a rose tree independently of each other, and the bifunctor laws hold for any pure translation, no matter how you compose the projections.

As always, there can be performance differences between the various compositions, but the outputs will be the same regardless of composition.

A functor, and by extension, a bifunctor, is a structure-preserving map. This means that any projection preserves the structure of the underlying container. For rose trees this means that the shape of the tree remains the same. The number of leaves remains the same, as does the number of internal nodes.

Next: Contravariant functors.


Rose tree catamorphism

Monday, 05 August 2019 08:30:00 UTC

The catamorphism for a tree with different types of nodes and leaves is made up from two functions.

This article is part of an article series about catamorphisms. A catamorphism is a universal abstraction that describes how to digest a data structure into a potentially more compact value.

This article presents the catamorphism for a rose tree, as well as how to identify it. The beginning of this article presents the catamorphism in C#, with examples. The rest of the article describes how to deduce the catamorphism. This part of the article presents my work in Haskell. Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix.

A rose tree is a general-purpose data structure where each node in a tree has an associated value. Each node can have an arbitrary number of branches, including none. The feature that distinguishes a rose tree from just any tree is that internal nodes can hold values of a different type than leaf values.

A rose tree example diagram, with internal nodes containing integers, and leaves containing strings.

The diagram shows an example of a tree of internal integers and leaf strings. All internal nodes contain integer values, and all leaves contain strings. Each node can have an arbitrary number of branches.

C# catamorphism #

As a C# representation of a rose tree, I'll use the Church-encoded rose tree I've previously described. The catamorphism is this extension method:

public static TResult Cata<N, L, TResult>(
    this IRoseTree<N, L> tree,
    Func<N, IEnumerable<TResult>, TResult> node,
    Func<L, TResult> leaf)
{
    return tree.Match(
        node: (n, branches) => node(n, branches.Select(t => t.Cata(node, leaf))),
        leaf: leaf);
}

Like most of the other catamorphisms shown in this article series, this one consists of two functions. One that handles the leaf case, and one that handles the partially reduced node case. Compare it with the tree catamorphism: notice that the rose tree catamorphism's node function is identical to the tree catamorphism. The leaf function, however, is new.

In previous articles, you've seen other examples of catamorphisms for Church-encoded types. The most common pattern has been that the Church encoding (the Match method) was also the catamorphism, with the Peano catamorphism being the only exception so far. When it comes to the Peano catamorphism, however, I'm not entirely confident that the difference between Church encoding and catamorphism is real, or whether it's just an artefact of the way I originally designed the Church encoding.

When it comes to the present rose tree, however, notice that the catamorphism is distinctly different from the Church encoding. That's the reason I called the method Cata instead of Match.

The method simply delegates the leaf handler to Match, while it adds behaviour to the node case. It works the same way as for the 'normal' tree catamorphism.

Examples #

You can use Cata to implement most other behaviour you'd like IRoseTree<N, L> to have. In a future article, you'll see how to turn the rose tree into a bifunctor and functor, so here, we'll look at some other, more ad hoc, examples. As is also the case for the 'normal' tree, you can calculate the sum of all nodes, if you can associate a number with each node.

Consider the example tree in the above diagram. You can create it as an IRoseTree<int, string> object like this:

IRoseTree<int, string> exampleTree =
    RoseTree.Node(42,
        RoseTree.Node(1337,
            new RoseLeaf<int, string>("foo"),
            new RoseLeaf<int, string>("bar")),
        RoseTree.Node(2112,
            RoseTree.Node(90125,
                new RoseLeaf<int, string>("baz"),
                new RoseLeaf<int, string>("qux"),
                new RoseLeaf<int, string>("quux")),
            new RoseLeaf<int, string>("quuz")),
        new RoseLeaf<int, string>("corge"));

If you want to calculate a sum for a tree like that, you can use the integers for the internal nodes, and perhaps the length of the strings of the leaves. That hardly makes much sense, but is technically possible:

> exampleTree.Cata((x, xs) => x + xs.Sum(), x => x.Length)
93641

Perhaps slightly more useful is to count the number of leaves:

> exampleTree.Cata((_, xs) => xs.Sum(), _ => 1)
7

A leaf node is, by definition, exactly one leaf node, so the leaf lambda expression always returns 1. In the node case, xs contains the partially summed leaf node counts, so just Sum those together while ignoring the value of the internal node.

You can also measure the maximum depth of the tree:

> exampleTree.Cata((_, xs) => 1 + xs.Max(), _ => 0)
3

Consistent with the example for 'normal' trees, you can arbitrarily decide that the depth of a leaf node is 0, so again, the leaf lambda expression just returns a constant value. The node lambda expression takes the Max of the partially reduced xs and adds 1, since an internal node represents another level of depth in a tree.
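
Notice, by the way, that Max throws on an empty sequence, so this particular expression would crash on an internal node with no branches. A defensive variant could assign an arbitrary depth of 0 to empty nodes, just as for leaves; for the example tree, the result is the same:

> exampleTree.Cata((_, xs) => xs.Any() ? 1 + xs.Max() : 0, _ => 0)
3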

Rose tree F-Algebra #

As in the previous article, I'll use Fix and cata as explained in Bartosz Milewski's excellent article on F-Algebras.

As always, start with the underlying endofunctor:

data RoseTreeF a b c =
    NodeF { nodeValue :: a, nodes :: ListFix c }
  | LeafF { leafValue :: b }
  deriving (Show, Eq, Read)
 
instance Functor (RoseTreeF a b) where
  fmap f (NodeF x ns) = NodeF x $ fmap f ns
  fmap _    (LeafF x) = LeafF x

Instead of using Haskell's standard list ([]) for the nodes, I've used ListFix from the article on list catamorphism. This should, hopefully, demonstrate how you can build on already established definitions derived from first principles.

As usual, I've called the 'data' types a and b, and the carrier type c (for carrier). The Functor instance as usual translates the carrier type; the fmap function has the type (c -> c1) -> RoseTreeF a b c -> RoseTreeF a b c1.

As was the case when deducing the recent catamorphisms, Haskell isn't too happy about defining instances for a type like Fix (RoseTreeF a b). To address that problem, you can introduce a newtype wrapper:

newtype RoseTreeFix a b =
  RoseTreeFix { unRoseTreeFix :: Fix (RoseTreeF a b) } deriving (Show, Eq, Read)

You can define Bifunctor, Bifoldable, Bitraversable, etc. instances for this type without resorting to any funky GHC extensions. Keep in mind that ultimately, the purpose of all this code is just to figure out what the catamorphism looks like. This code isn't intended for actual use.

A pair of helper functions make it easier to define RoseTreeFix values:

roseLeafF :: b -> RoseTreeFix a b
roseLeafF = RoseTreeFix . Fix . LeafF
 
roseNodeF :: a -> ListFix (RoseTreeFix a b) -> RoseTreeFix a b
roseNodeF x = RoseTreeFix . Fix . NodeF x . fmap unRoseTreeFix

roseLeafF creates a leaf node:

Prelude Fix List RoseTree> roseLeafF "ploeh"
RoseTreeFix {unRoseTreeFix = Fix (LeafF "ploeh")}

roseNodeF is a helper function to create internal nodes:

Prelude Fix List RoseTree> roseNodeF 6 (consF (roseLeafF 0) nilF)
RoseTreeFix {unRoseTreeFix = Fix (NodeF 6 (ListFix (Fix (ConsF (Fix (LeafF 0)) (Fix NilF)))))}

Even with helper functions, construction of RoseTreeFix values is cumbersome, but keep in mind that the code shown here isn't meant to be used in practice. The goal is only to deduce catamorphisms from more basic universal abstractions, and you now have all you need to do that.

Haskell catamorphism #

At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (RoseTreeF a b), and an object c, but you still need to find a morphism RoseTreeF a b c -> c. Notice that the algebra you have to find is the function that reduces the functor to its carrier type c, not any of the 'data types' a or b. This takes some time to get used to, but that's how catamorphisms work. This doesn't mean, however, that you get to ignore a or b, as you'll see.

As in the previous articles, start by writing a function that will become the catamorphism, based on cata:

roseTreeF = cata alg . unRoseTreeFix
  where alg (NodeF x ns) = undefined
        alg    (LeafF x) = undefined

While this compiles, with its undefined implementations, it obviously doesn't do anything useful. I find, however, that it helps me think. How can you return a value of the type c from the LeafF case? You could pass a function argument to the roseTreeF function and use it with x:

roseTreeF fl = cata alg . unRoseTreeFix
  where alg (NodeF x ns) = undefined
        alg    (LeafF x) = fl x

While you could, technically, pass an argument of the type c to roseTreeF and then return that value from the LeafF case, that would mean that you would ignore the x value. This would be incorrect, so instead, make the argument a function and call it with x. Likewise, you can deal with the NodeF case in the same way:

roseTreeF :: (a -> ListFix c -> c) -> (b -> c) -> RoseTreeFix a b -> c
roseTreeF fn fl = cata alg . unRoseTreeFix
  where alg (NodeF x ns) = fn x ns
        alg    (LeafF x) = fl x

This works. Since cata has the type Functor f => (f a -> a) -> Fix f -> a, that means that alg has the type f a -> a. In the case of RoseTreeF, the compiler infers that the alg function has the type RoseTreeF a b c -> c, which is just what you need!

You can now see what the carrier type c is for. It's the type that the algebra extracts, and thus the type that the catamorphism returns.

This, then, is the catamorphism for a rose tree. As has been the most common pattern so far, it's a pair, made from two functions. It's still not the only possible catamorphism, since you could trivially flip the arguments to roseTreeF, or the arguments to fn.

I've chosen the representation shown here because it's similar to the catamorphism I've shown for a 'normal' tree, just with the added function for leaves.

Basis #

You can implement most other useful functionality with roseTreeF. Here's the Bifunctor instance:

instance Bifunctor RoseTreeFix where
  bimap f s = roseTreeF (roseNodeF . f) (roseLeafF . s)

Notice how naturally the catamorphism implements bimap.

From that instance, the Functor instance trivially follows:

instance Functor (RoseTreeFix a) where
  fmap = second

You could probably also add Applicative and Monad instances, but I find those hard to grasp, so I'm going to skip them in favour of Bifoldable:

instance Bifoldable RoseTreeFix where
  bifoldMap f = roseTreeF (\x xs -> f x <> fold xs)

The Bifoldable instance enables you to trivially implement the Foldable instance:

instance Foldable (RoseTreeFix a) where
  foldMap = bifoldMap mempty

You may find the presence of mempty puzzling, since bifoldMap takes two functions as arguments. Is mempty a function?

Yes, mempty can be a function. Here, it is. There's a Monoid instance for any function a -> m, where m is a Monoid instance, and mempty is the identity for that monoid. That's the instance in use here.

Just as RoseTreeFix is Bifoldable, it's also Bitraversable:

instance Bitraversable RoseTreeFix where
  bitraverse f s =
    roseTreeF (\x xs -> roseNodeF <$> f x <*> sequenceA xs) (fmap roseLeafF . s)

You can comfortably implement the Traversable instance based on the Bitraversable instance:

instance Traversable (RoseTreeFix a) where
  sequenceA = bisequenceA . first pure

That rose trees are Traversable turns out to be useful, as a future article will show.

Relationships #

As was the case for 'normal' trees, the catamorphism for rose trees is more powerful than the fold. There are operations that you can express with the Foldable instance, but other operations that you can't. Consider the tree shown in the diagram at the beginning of the article. This is also the tree that the above C# examples use. In Haskell, using RoseTreeFix, you can define that tree like this:

exampleTree =
  roseNodeF 42 (
    consF (
      roseNodeF 1337 (
        consF (roseLeafF "foo") $
        consF (roseLeafF "bar") nilF)) $
    consF (
      roseNodeF 2112 (
        consF (
          roseNodeF 90125 (
            consF (roseLeafF "baz") $
            consF (roseLeafF "qux") $
            consF (roseLeafF "quux") nilF)) $
        consF (roseLeafF "quuz") nilF)) $
    consF (
      roseLeafF "corge")
    nilF)

You can trivially calculate the sum of string lengths of all leaves, using only the Foldable instance:

Prelude RoseTree> sum $ length <$> exampleTree
25

You can also fairly easily calculate a sum of all nodes, using the length of the strings as in the above C# example, but that requires the Bifoldable instance:

Prelude Data.Bifoldable Data.Semigroup RoseTree> bifoldMap Sum (Sum . length) exampleTree
Sum {getSum = 93641}

Fortunately, we get the same result as above.

Counting leaves, or measuring the depth of a tree, on the other hand, is impossible with the Foldable instance, but interestingly, it turns out that counting leaves is possible with the Bifoldable instance:

countLeaves :: (Bifoldable p, Num n) => p a b -> n
countLeaves = getSum . bifoldMap (const $ Sum 0) (const $ Sum 1)

This works well with the example tree:

Prelude RoseTree> countLeaves exampleTree
7

Notice, however, that countLeaves works for any Bifoldable instance. Does that mean that you can 'count the leaves' of a tuple? Yes, it does:

Prelude RoseTree> countLeaves ("foo", "bar")
1
Prelude RoseTree> countLeaves (1, 42)
1

Or what about EitherFix:

Prelude RoseTree Either> countLeaves $ leftF "foo"
0
Prelude RoseTree Either> countLeaves $ rightF "bar"
1

Notice that 'counting the leaves' of tuples always returns 1, while 'counting the leaves' of Either always returns 0 for Left values, and 1 for Right values. This is because countLeaves considers the left, or first, data type to represent internal nodes, and the right, or second, data type to indicate leaves.

You can further follow that train of thought to realise that you can convert both tuples and EitherFix values to small rose trees:

fromTuple :: (a, b) -> RoseTreeFix a b
fromTuple (x, y) = roseNodeF x (consF (roseLeafF y) nilF)
 
fromEitherFix :: EitherFix a b -> RoseTreeFix a b
fromEitherFix = eitherF (`roseNodeF` nilF) roseLeafF

The fromTuple function creates a small rose tree with one internal node and one leaf. The label of the internal node is the first value of the tuple, and the label of the leaf is the second value. Here's an example:

Prelude RoseTree> fromTuple ("foo", 42)
RoseTreeFix {unRoseTreeFix = Fix (NodeF "foo" (ListFix (Fix (ConsF (Fix (LeafF 42)) (Fix NilF)))))}

The fromEitherFix function turns a left value into an internal node with no leaves, and a right value into a leaf. Here are some examples:

Prelude RoseTree Either> fromEitherFix $ leftF "foo"
RoseTreeFix {unRoseTreeFix = Fix (NodeF "foo" (ListFix (Fix NilF)))}
Prelude RoseTree Either> fromEitherFix $ rightF 42
RoseTreeFix {unRoseTreeFix = Fix (LeafF 42)}

While counting leaves can be implemented using Bifoldable, that's not the case for measuring the depths of trees (I think; leave a comment if you know of a way to do this with one of the instances shown here). You can, however, measure tree depth with the catamorphism:

treeDepth :: RoseTreeFix a b -> Integer
treeDepth = roseTreeF (\_ xs -> 1 + maximum xs) (const 0)

The implementation is similar to the implementation for 'normal' trees. I've arbitrarily decided that leaves have a depth of zero, so the function that handles leaves always returns 0. The function that handles internal nodes receives xs as a partially reduced list of depths below the node in question. Take the maximum of those and add 1, since each internal node has a depth of one.

Prelude RoseTree> treeDepth exampleTree
3

This, hopefully, illustrates that the catamorphism is more capable, and that the fold is just a (list-biased) specialisation.

Summary #

The catamorphism for rose trees is a pair of functions. One function transforms internal nodes with their partially reduced branches, while the other function transforms leaves.

For a realistic example of using a rose tree in a real program, see Picture archivist in Haskell.

This article series has so far covered progressively more complex data structures. The first examples (Boolean catamorphism and Peano catamorphism) were neither functors, applicatives, nor monads. All subsequent examples, on the other hand, are all of these, and more. The next example presents a functor that's neither applicative nor monad, yet still foldable. Obviously, what functionality it offers is still based on a catamorphism.

Next: Full binary tree catamorphism.


Comments

Each node can have an arbitrary number of branches, including none.

Because of this sentence, in the picture of an example after the containing paragraph, I expected to see a(n) (internal) node with no branches.

You can also measure the maximum depth of the tree:

> exampleTree.Cata((_, xs) => 1 + xs.Max(), _ => 0)
3

Max will throw an exception when given an internal node with no children. What value do you want to return in that case?

2020-08-03 16:49 UTC

Tyson, thank you for writing. You're right that my implementation doesn't properly handle the empty edge case. That's also the case for Haskell's maximum function. I find it annoying that it's a partial function.

One can handle that edge case by assigning an arbitrary depth to an empty node, just as is the case for leaf nodes. If leaf nodes get assigned a depth of 0, wouldn't it be natural to also give empty internal nodes a depth of 0?

2020-08-03 17:29 UTC

Yes, very natural. In particular, such a definition would be consistent with the corresponding definition for Tree<>. More specifically, we want the behaviors to be the same when the two type parameters in IRoseTree<,> are the same (and the function passed in for leaf is the same as the one passed in for node after fixing the second argument to Enumerable.Empty<TResult>()).

I think the smallest change to get the depth to be 0 for an internal node with no children is to replace Max with a slight variant that returns -1 when there are no children. I don't think this is readable though. It is quite the magic number. But it is just the codification of the thought process that led to it.

Each (internal) node can have an arbitrary number of branches, including none.

...

...an internal node represents another level of depth in a tree.

It is because of such edge cases that Jeremy Gibbons in his PhD thesis Algebras for Tree Algorithms says (on page 44) that the internal nodes of his rose tree must include at least one child.

Meertens allows his lists of children to be empty, so permitting parents with no children; to avoid this rather strange concept we use non-empty lists.

I think Jeremy has me convinced. One could say that this heterogenous rose tree was obtained from the homogeneous variant by adding a type for the leaves. The homogeneous variant denoted leaves via an empty list of children. It makes sense then to remove the empty list approach for making a leaf when adding the typed approach. So, I think the best fix then would be to modify your definition of RoseNode<,> in your first rose tree article to be the same as Jeremy's (by requiring that IEnumerable<> of children is non-empty). This change would also better match your example pictures of a rose tree, which do not include an internal node without children.

2020-08-03 18:37 UTC

Yes, it'd definitely be an option to change the definition of an internal node to a NonEmptyCollection.

My underlying motivation for defining the type like I've done in these articles, however, was to provide the underlying abstraction for a functional file system. In order to model a file system, empty nodes should be possible, because they correspond to empty directories.

2020-08-03 19:41 UTC

Church-encoded rose tree

Monday, 29 July 2019 13:14:00 UTC

A rose tree is a tree with leaf nodes of one type, and internal nodes of another.

This article is part of a series of articles about Church encoding. In the previous articles, you've seen how to implement a Maybe container, and how to implement an Either container. Through these examples, you've learned how to model sum types without explicit language support. In this article, you'll see how to model a rose tree.

A rose tree is a general-purpose data structure where each node in a tree has an associated value. Each node can have an arbitrary number of branches, including none. The feature that distinguishes a rose tree from just any tree is that internal nodes can hold values of a different type than leaf values.

A rose tree example diagram, with internal nodes containing integers, and leaves containing strings.

The diagram shows an example of a tree of internal integers and leaf strings. All internal nodes contain integer values, and all leaves contain strings. Each node can have an arbitrary number of branches.

Contract #

In C#, you can represent the fundamental structure of a rose tree with a Church encoding, starting with an interface:

public interface IRoseTree<N, L>
{
    TResult Match<TResult>(
        Func<N, IEnumerable<IRoseTree<N, L>>, TResult> node,
        Func<L, TResult> leaf);
}

The structure of a rose tree includes two mutually exclusive cases: internal nodes and leaf nodes. Since there's two cases, the Match method takes two arguments, one for each case.

The interface is generic, with two type arguments: N (for node) and L (for leaf). Any consumer of an IRoseTree<N, L> object must supply two functions when calling the Match method: a function that turns a node into a TResult value, and a function that turns a leaf into a TResult value.

Both cases must have a corresponding implementation.

Leaves #

The leaf implementation is the simplest:

public sealed class RoseLeaf<N, L> : IRoseTree<N, L>
{
    private readonly L value;
 
    public RoseLeaf(L value)
    {
        this.value = value;
    }
 
    public TResult Match<TResult>(
        Func<N, IEnumerable<IRoseTree<N, L>>, TResult> node,
        Func<L, TResult> leaf)
    {
        return leaf(value);
    }
 
    public override bool Equals(object obj)
    {
        if (!(obj is RoseLeaf<N, L> other))
            return false;
 
        return Equals(value, other.value);
    }
 
    public override int GetHashCode()
    {
        return value.GetHashCode();
    }
}

The RoseLeaf class is an Adapter over a value of the generic type L. As is always the case with Church encoding, it implements the Match method by unconditionally calling one of the arguments, in this case the leaf function, with its adapted value.

While it doesn't have to do this, it also overrides Equals and GetHashCode. This is an immutable class, so it's a great candidate to be a Value Object. Making it a Value Object makes it easier to compare expected and actual values in unit tests, among other benefits.
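
For example, two leaves compare equal exactly when their adapted values do:

> new RoseLeaf<int, string>("foo").Equals(new RoseLeaf<int, string>("foo"))
true
> new RoseLeaf<int, string>("foo").Equals(new RoseLeaf<int, string>("bar"))
false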

Nodes #

The node implementation is slightly more complex:

public sealed class RoseNode<N, L> : IRoseTree<N, L>
{
    private readonly N value;
    private readonly IEnumerable<IRoseTree<N, L>> branches;
 
    public RoseNode(N value, IEnumerable<IRoseTree<N, L>> branches)
    {
        this.value = value;
        this.branches = branches;
    }
 
    public TResult Match<TResult>(
        Func<N, IEnumerable<IRoseTree<N, L>>, TResult> node,
        Func<L, TResult> leaf)
    {
        return node(value, branches);
    }
 
    public override bool Equals(object obj)
    {
        if (!(obj is RoseNode<N, L> other))
            return false;
 
        return Equals(value, other.value)
            && Enumerable.SequenceEqual(branches, other.branches);
    }
 
    public override int GetHashCode()
    {
        return value.GetHashCode() ^ branches.GetHashCode();
    }
}

A node contains both a value (of the type N) and a collection of sub-trees, or branches. The class implements the Match method by unconditionally calling the node function argument with its constituent values.

Again, it overrides Equals and GetHashCode for the same reasons as RoseLeaf. This isn't required to implement Church encoding, but makes comparison and unit testing easier.

Usage #

You can use the RoseLeaf and RoseNode constructors to create new trees, but it sometimes helps to have a static helper method to create values. It turns out that there's little value in a helper method for leaves, but for nodes, it's marginally useful:

public static IRoseTree<N, L> Node<N, L>(N value, params IRoseTree<N, L>[] branches)
{
    return new RoseNode<N, L>(value, branches);
}

This enables you to create tree objects, like this:

IRoseTree<string, int> tree =
    RoseTree.Node("foo", new RoseLeaf<string, int>(42), new RoseLeaf<string, int>(1337));

That's a single node with the label "foo" and two leaves with the values 42 and 1337, respectively. You can create the tree shown in the above diagram like this:

IRoseTree<int, string> exampleTree =
    RoseTree.Node(42,
        RoseTree.Node(1337,
            new RoseLeaf<int, string>("foo"),
            new RoseLeaf<int, string>("bar")),
        RoseTree.Node(2112,
            RoseTree.Node(90125,
                new RoseLeaf<int, string>("baz"),
                new RoseLeaf<int, string>("qux"),
                new RoseLeaf<int, string>("quux")),
            new RoseLeaf<int, string>("quuz")),
        new RoseLeaf<int, string>("corge"));

You can add various extension methods to implement useful functionality. In later articles, you'll see some more compelling examples, so here, I'm only going to show a few basic examples. One of the simplest features you can add is a method that will tell you if an IRoseTree<N, L> object is a node or a leaf:

public static IChurchBoolean IsLeaf<N, L>(this IRoseTree<N, L> source)
{
    return source.Match<IChurchBoolean>(
        node: (_, __) => new ChurchFalse(),
        leaf: _ => new ChurchTrue());
}
 
public static IChurchBoolean IsNode<N, L>(this IRoseTree<N, L> source)
{
    return new ChurchNot(source.IsLeaf());
}

Since this article is part of the overall article series on Church encoding, and the purpose of that article series is also to show how basic language features can be created from Church encodings, these two methods return Church-encoded Boolean values instead of the built-in bool type. I'm sure you can imagine how you could change the type to bool if you'd like.
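
If you do make that change, the Church-encoded Booleans simply disappear. A sketch of the bool-based variant could look like this:

public static bool IsLeaf<N, L>(this IRoseTree<N, L> source)
{
    return source.Match<bool>(
        node: (_, __) => false,
        leaf: _ => true);
}
 
public static bool IsNode<N, L>(this IRoseTree<N, L> source)
{
    return !source.IsLeaf();
}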

You can use these methods like this:

> IRoseTree<Guid, double> tree = new RoseLeaf<Guid, double>(-3.2);
> tree.IsLeaf()
ChurchTrue { }
> tree.IsNode()
ChurchNot(ChurchTrue)
> IRoseTree<long, string> tree = RoseTree.Node<long, string>(42);
> tree.IsLeaf()
ChurchFalse { }
> tree.IsNode()
ChurchNot(ChurchFalse)

In a future article, you'll see some more compelling examples.

Terminology #

It's not entirely clear what to call a tree like the one shown here. The Wikipedia entry doesn't state one way or the other whether internal node types ought to be distinguishable from leaf node types, but there are indications that this could be the case. At least, it seems that the term isn't well-defined, so I took the liberty to retcon the name rose tree to the data structure shown here.

In the paper that introduces the rose tree term, Meertens writes:

"We consider trees whose internal nodes may fork into an arbitrary (natural) number of sub-trees. (If such a node has zero descendants, we still consider it internal.) Each external node carries a data item. No further information is stored in the tree; in particular, internal nodes are unlabelled."

First Steps towards the Theory of Rose Trees, Lambert Meertens, 1988

While the concept is foreign in C#, you can trivially introduce a unit data type:

public class Unit
{
    public readonly static Unit Instance = new Unit();
 
    private Unit() { }
}

This enables you to create a rose tree according to Meertens' definition:

IRoseTree<Unit, int> meertensTree =
    RoseTree.Node(Unit.Instance,
        RoseTree.Node(Unit.Instance,
            RoseTree.Node(Unit.Instance,
                new RoseLeaf<Unit, int>(2112)),
            new RoseLeaf<Unit, int>(42),
            new RoseLeaf<Unit, int>(1337),
            new RoseLeaf<Unit, int>(90125)),
        RoseTree.Node(Unit.Instance,
            new RoseLeaf<Unit, int>(1984)),
        new RoseLeaf<Unit, int>(666));

Visually, you could draw it like this:

A Meertens rose tree example diagram, with leaves containing integers.

Thus, the tree structure shown here seems to be a generalisation of Meertens' original definition.

I'm not a mathematician, so I may have misunderstood some things. If you have a better name than rose tree for the data structure shown here, please leave a comment.

Yeats #

Now that we're on the topic of rose tree as a term, you may, as a bonus, enjoy a similarly-titled poem:

THE ROSE TREE

"O words are lightly spoken"
Said Pearse to Connolly,
"Maybe a breath of politic words
Has withered our Rose Tree;
Or maybe but a wind that blows
Across the bitter sea."

"It needs to be but watered,"
James Connolly replied,
"To make the green come out again
And spread on every side,
And shake the blossom from the bud
To be the garden's pride."

"But where can we draw water"
Said Pearse to Connolly,
"When all the wells are parched away?
O plain as plain can be
There's nothing but our own red blood
Can make a right Rose Tree."

As far as I can tell, though, Yeats' metaphor is dissimilar to Meertens'.

Summary #

You may occasionally find use for a tree that distinguishes between internal and leaf nodes. You can model such a tree with a Church encoding, as shown in this article.

Next: Catamorphisms.


Comments

If you have a better name than rose tree for the data structure shown here, please leave a comment.

I would consider using heterogeneous rose tree.

In your linked Twitter thread, Keith Battocchi shared a link to the thesis of Jeremy Gibbons (which is titled Algebras for Tree Algorithms). In his thesis, he defines rose tree as you have and then derives from that (on page 45) the special cases that he calls unlabelled rose tree, leaf-labeled rose tree, branch-labeled rose tree, and homogeneous rose tree.

The advantage of Jeremy's approach is that the name of each special case is formed from the name of the general case by adding an adjective. The disadvantage is the ambiguity that comes from being inconsistent with the definition of rose tree compared to both previous works and current usage (as shown by your numerous links).

The goal of naming is communication, and the name rose tree risks miscommunication, which is worse than no communication at all. Miscommunication would result by (for example) calling this heterogeneous rose tree a rose tree and someone that knows the rose tree definition in Haskell skims over your definition thinking that they already know it. However, I think you did a good job with this article and made that unlikely.

The advantage of heterogeneous rose tree is that the name is not overloaded and heterogeneous clearly indicates the intended variant. If a reader has heard of a rose tree, then they probably know there are several variants and can infer the correct one from this additional adjective.

In the end though, I think using the name rose tree as you did was a good choice. Your have now written several articles involving rose trees and they all use the same variant. Since you always use the same variant, it would be a bit verbose to always include an additional adjective to specify the variant.

The only thing I would have considered changing is the first mention of rose tree in this article. It is common in academic writing to start with the general definition and then give shorter alternatives for brevity. This is one way it could have been written in this article.

A heterogeneous rose tree is a tree with leaf nodes of one type, and internal nodes of another.

This article is part of a series of articles about Church encoding. In the previous articles, you've seen how to implement a Maybe container, and how to implement an Either container. Through these examples, you've learned how to model sum types without explicit language support. In this article, you'll see how to model a heterogeneous rose tree.

A heterogeneous rose tree is a general-purpose data structure where each node in a tree has an associated value. Each node can have an arbitrary number of branches, including none. The feature that distinguishes a heterogeneous rose tree from just any tree is that internal nodes can hold values of a different type than leaf values. For brevity, we will omit heterogeneous and simply call this data structure a rose tree.

2020-08-03 02:02 UTC

Tyson, thank you for writing. I agree that specificity is desirable. I haven't read the Gibbons paper, so I'll have to reflect on your summary. If I understand you correctly, a type like IRoseTree shown in this article constitutes the general case. In Haskell, I simply named it Tree a b, which is probably too general, but may help to illustrate the following point.

As far as I remember, C# doesn't have type aliases, so Haskell makes the point more succinct. If I understand you correctly, then, you could define a heterogeneous rose tree as:

type HeterogeneousRoseTree = Tree

Furthermore, I suppose that a leaf-labeled rose tree is this:

type LeafLabeledRoseTree b = Tree () b

Would the following be a branch-labeled rose tree?

type BranchLabeledRoseTree a = Tree a ()

And this is, I suppose, a homogeneous rose tree:

type HomogeneousRoseTree a = Tree a a

I can't imagine what an unlabelled rose tree is, unless it's this:

type UnlabelledRoseTree = Tree () ()

I don't see how that'd be of much use, but I suppose that's just my lack of imagination.

2020-08-09 15:23 UTC

Chain of Responsibility as catamorphisms

Monday, 22 July 2019 14:11:00 UTC

The Chain of Responsibility design pattern can be viewed as a list fold over the First monoid, followed by a Maybe fold.

This article is part of a series of articles about specific design patterns and their category theory counterparts. In it, you'll see how the Chain of Responsibility design pattern is equivalent to a succession of catamorphisms. First, you apply the First Maybe monoid over the list catamorphism, and then you conclude the reduction with the Maybe catamorphism.

Pattern #

The Chain of Responsibility design pattern gives you a way to model cascading conditionals with an object structure. It's a chain (or linked list) of objects that all implement the same interface (or base class). Each object (apart from the last) has a reference to the next object in the list.

General diagram of the Chain of Responsibility design pattern.

A client (some other code) calls a method on the first object in the list. If that object can handle the request, it does so, and the interaction ends there. If the method returns a value, the object returns the value.

If the first object determines that it can't handle the method call, it calls the next object in the chain. It only knows the next object as the interface, so the only way it can delegate the call is by calling the same method as the first one. In the above diagram, Imp1 can't handle the method call, so it calls the same method on Imp2, which also can't handle the request and delegates responsibility to Imp3. In the diagram, Imp3 can handle the method call, so it does so and returns a result that propagates back up the chain. In that particular example, Imp4 never gets involved.

You'll see an example below.

One of the advantages of the pattern is that you can rearrange the chain to change its behaviour. You can even do this at run time, if you'd like, since all objects implement the same interface.

User icon example #

Consider an online system that maintains user profiles for users. A user is modelled with the User class:

public User(int id, string name, string email, bool useGravatar, bool useIdenticon)

While I only show the signature of the class' constructor, it should be enough to give you an idea. If you need more details, the entire example code base is available on GitHub.

Apart from an id, a name and email address, a user also has two flags. One flag tracks whether the user wishes to use his or her Gravatar, while another flag tracks if the user would like to use an Identicon. Obviously, both flags could be true, in which case the current business rule states that the Gravatar should take precedence.

If none of the flags are set, users might still have a picture associated with their profile. This could be a picture that they've uploaded to the system, and is being tracked by a database.

If no user icon can be found or generated, ultimately the system should use a fallback, default icon:

Default user icon.

To summarise, the current rules are:

  1. Use Gravatar if flag is set.
  2. Use Identicon if flag is set.
  3. Use uploaded picture if available.
  4. Use default icon.
The order of precedence could change in the future, new image sources could be added, or some of the present sources could be removed. Modelling this set of rules as a Chain of Responsibility makes it easy for you to reorder the rules, should you need to.

To request an icon, a client can use the IIconReader interface:

public interface IIconReader
{
    Icon ReadIcon(User user);
}

The Icon class is just a Value Object wrapper around a URL. The idea is that such a URL can be used in an img tag to show the icon. Again, the full source code is available on GitHub if you'd like to investigate the details.

The various rules for icon retrieval can be implemented using this interface.

Gravatar reader #

Although you don't have to implement the classes in the order in which you are going to compose them, it seems natural to do so, starting with the Gravatar implementation.

public class GravatarReader : IIconReader
{
    private readonly IIconReader next;
 
    public GravatarReader(IIconReader next)
    {
        this.next = next;
    }
 
    public Icon ReadIcon(User user)
    {
        if (user.UseGravatar)
            return new Icon(new Gravatar(user.Email).Url);
 
        return next.ReadIcon(user);
    }
}

The GravatarReader class not only implements the IIconReader interface, but also decorates another object of the same polymorphic type. If user.UseGravatar is true, it generates the appropriate Gravatar URL based on the user's Email address; otherwise, it delegates the work to the next object in the Chain of Responsibility.

The Gravatar class contains the implementation details to generate the Gravatar Url. Again, please refer to the GitHub repository if you're interested in the details.

Identicon reader #

When you compose the chain, according to the above business logic, the next type of icon you should attempt to generate is an Identicon. It's natural to implement the Identicon reader next, then:

public class IdenticonReader : IIconReader
{
    private readonly IIconReader next;
 
    public IdenticonReader(IIconReader next)
    {
        this.next = next;
    }
 
    public Icon ReadIcon(User user)
    {
        if (user.UseIdenticon)
            return new Icon(new Uri(baseUrl, HashUser(user)));
 
        return next.ReadIcon(user);
    }
 
    // Implementation details go here...
}

Again, I'm omitting implementation details in order to focus on the Chain of Responsibility design pattern. If user.UseIdenticon is true, the IdenticonReader generates the appropriate Identicon and returns the URL for it; otherwise, it delegates the work to the next object in the chain.

Database icon reader #

The DBIconReader class attempts to find an icon ID in a database. If it succeeds, it creates a URL corresponding to that ID. The assumption is that that resource exists; either it's a file on disk, or it's an image resource generated on the spot based on binary data stored in the database.

public class DBIconReader : IIconReader
{
    private readonly IUserRepository repository;
    private readonly IIconReader next;
 
    public DBIconReader(IUserRepository repository, IIconReader next)
    {
        this.repository = repository;
        this.next = next;
    }
 
    public Icon ReadIcon(User user)
    {
        if (!repository.TryReadIconId(user.Id, out string iconId))
            return next.ReadIcon(user);
 
        var parameters = new Dictionary<string, string>
        {
            { "iconId", iconId }
        };
        return new Icon(urlTemplate.BindByName(baseUrl, parameters));
    }
 
    private readonly Uri baseUrl = new Uri("https://example.com");
    private readonly UriTemplate urlTemplate = new UriTemplate("users/{iconId}/icon");
}

This class demonstrates some variations in the way you can implement the Chain of Responsibility design pattern. The above GravatarReader and IdenticonReader classes both follow the same implementation pattern of checking a condition, and then performing work if the condition is true. The delegation to the next object in the chain happens, in those two classes, outside of the if statement.

The DBIconReader class, on the other hand, reverses the structure of the code. It uses a Guard Clause to detect whether to exit early, which is done by delegating work to the next object in the chain.

If TryReadIconId returns true, however, the ReadIcon method proceeds to create the appropriate icon URL.

Another variation on the Chain of Responsibility design pattern demonstrated by the DBIconReader class is that it takes a second dependency, apart from next. The repository is the usual misapplication of the Repository design pattern that everyone thinks they use correctly. Here, it's used in the common sense to provide access to a database. The main point, though, is that you can add as many other dependencies to a link in the chain as you'd like. All links, apart from the last, however, must have a reference to the next link in the chain.

Default icon reader #

Like linked lists, a Chain of Responsibility has to ultimately terminate. You can use the following DefaultIconReader for that.

public class DefaultIconReader : IIconReader
{
    public Icon ReadIcon(User user)
    {
        return Icon.Default;
    }
}

This class unconditionally returns the Default icon. Notice that it doesn't have any next object it delegates to. This terminates the chain. If no previous implementation of the IIconReader has returned an Icon for the user, this one does.

Chain composition #

With four implementations of IIconReader, you can now compose the Chain of Responsibility:

IIconReader reader =
    new GravatarReader(
        new IdenticonReader(
            new DBIconReader(repo,
                new DefaultIconReader())));

The first link in the chain is a GravatarReader object that contains an IdenticonReader object as its next link, and so on. Referring back to the source code of GravatarReader, notice that its next dependency is declared as an IIconReader. Since the IdenticonReader class implements that interface, you can compose the chain like this, but if you later decide to change the order of the objects, you can do so simply by changing the composition. You could remove objects altogether, or add new classes, and you could even do this at run time, if required.
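If, for example, you'd like Identicons to take precedence over Gravatars, none of the classes have to change; you only nest the objects differently:

IIconReader reader =
    new IdenticonReader(
        new GravatarReader(
            new DBIconReader(repo,
                new DefaultIconReader())));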

The DBIconReader class requires an extra IUserRepository dependency, here simply an existing object called repo.

The DefaultIconReader takes no other dependencies, so this effectively terminates the chain. If you try to pass another IIconReader to its constructor, the code doesn't compile.

Haskell proof of concept #

When evaluating whether a design is a functional architecture, I often port the relevant parts to Haskell. You can do the same with the above example, and put it in a form where it's clearer that the Chain of Responsibility pattern is equivalent to two well-known catamorphisms.

Readers not comfortable with Haskell can skip the next few sections. The object-oriented example continues below.

The User and Icon types are defined as types equivalent to the C# classes above. There's no explicit interface, however. Creating Gravatars and Identicons are both pure functions with the type User -> Maybe Icon. Here's the Gravatar function, but the Identicon function looks similar:

gravatarUrl :: String -> String
gravatarUrl email =
  "https://www.gravatar.com/avatar/" ++ show (hashString email :: MD5Digest)
 
getGravatar :: User -> Maybe Icon
getGravatar u =
  if useGravatar u
    then Just $ Icon $ gravatarUrl $ userEmail u
    else Nothing

Reading an icon ID from a database, however, is an impure operation, so the function to do this has the type User -> IO (Maybe Icon).

Lazy I/O in Haskell #

Notice that the database icon-querying function has the return type IO (Maybe Icon). In the introduction you read that the Chain of Responsibility design pattern is a sequence of catamorphisms - the first one over a list of First values. While First is, in itself, a Semigroup instance, it gives rise to a Monoid instance when combined with Maybe. Thus, to showcase the abstractions being used, you could create a list of Maybe (First Icon) values. This forms a Monoid, so is easy to fold.

The problem with that, however, is that IO is strict under evaluation, so while it works, it's no longer lazy. You can combine IO (Maybe (First Icon)) values, but it leads to too much I/O activity.

You can solve this problem with a newtype wrapper:

newtype FirstIO a = FirstIO (MaybeT IO a) deriving (Functor, Applicative, Monad, Alternative)
 
firstIO :: IO (Maybe a) -> FirstIO a
firstIO = FirstIO . MaybeT
 
getFirstIO :: FirstIO a -> IO (Maybe a)
getFirstIO (FirstIO (MaybeT x)) = x
 
instance Semigroup (FirstIO a) where
  (<>) = (<|>)
 
instance Monoid (FirstIO a) where
  mempty = empty

This uses the GeneralizedNewtypeDeriving GHC extension to automatically make FirstIO Functor, Applicative, Monad, and Alternative. It also uses the Alternative instance to implement Semigroup and Monoid. You may recall from the documentation that Alternative is already a "monoid on applicative functors."

Alignment #

You now have three functions with different types: two pure functions with the type User -> Maybe Icon and one impure database-bound function with the type User -> IO (Maybe Icon). In order to have a common abstraction, you should align them so that all types match. At first glance, User -> IO (Maybe (First Icon)) seems like a type that fits all implementations, but that causes too much I/O to take place, so instead, use User -> FirstIO Icon. Here's how to lift the pure getGravatar function:

getGravatarIO :: User -> FirstIO Icon
getGravatarIO = firstIO . return . getGravatar

You can lift the other functions in similar fashion, to produce getGravatarIO, getIdenticonIO, and getDBIconIO, all with the mutual type User -> FirstIO Icon.

Haskell composition #

The goal of the Haskell proof of concept is to compose a function that can provide an Icon for any User - just like the above C# composition that uses Chain of Responsibility. There's, however, no way around impurity, because one of the steps involves a database, so the aim is a composition with the type User -> IO Icon.

While a more compact composition is possible, I'll show it in a way that makes the catamorphisms explicit:

getIcon :: User -> IO Icon
getIcon u = do
  let lazyIcons = fmap (\f -> f u) [getGravatarIO, getIdenticonIO, getDBIconIO]
  m <- getFirstIO $ fold lazyIcons
  return $ fromMaybe defaultIcon m

The getIcon function starts with a list of all three functions. For each of them, it calls the function with the User value u. This may seem inefficient and redundant, because all three function calls may not be required, but since the return values are FirstIO values, all three function calls are lazily evaluated - even under IO. The result, lazyIcons, is a [FirstIO Icon] value; i.e. a lazily evaluated list of lazily evaluated values.

This first step is just to put the potential values in a form that's recognisable. You can now fold the lazyIcons to a single FirstIO Icon value, and then use getFirstIO to unwrap it. Due to do notation, m is a Maybe Icon value.

This is the first catamorphism. Granted, the generalisation that fold offers is not really required, since lazyIcons is a list; mconcat would have worked just as well. I did, however, choose to use fold (from Data.Foldable) to emphasise the point. While the fold function itself isn't the catamorphism for lists, we know that it's derived from the list catamorphism.

The final step is to utilise the Maybe catamorphism to reduce the Maybe Icon value to an Icon value. Again, the getIcon function doesn't use the Maybe catamorphism directly, but rather the derived fromMaybe function. The Maybe catamorphism is the maybe function, but you can trivially implement fromMaybe with maybe.

For golfers, it's certainly possible to write this function in a more compact manner. Here's a point-free version:

getIcon :: User -> IO Icon
getIcon =
  fmap (fromMaybe defaultIcon) . getFirstIO . fold [getGravatarIO, getIdenticonIO, getDBIconIO]

This alternative version utilises that a -> m is a Monoid instance when m is a Monoid instance. That's the reason that you can fold a list of functions. The more explicit version above doesn't do that, but the behaviour is the same in both cases.

That's all the Haskell code we need to discern the universal abstractions involved in the Chain of Responsibility design pattern. We can now return to the C# code example.

Chains as lists #

The Chain of Responsibility design pattern is often illustrated like above, in a staircase-like diagram. There's, however, no inherent requirement to do so. You could also flatten the diagram:

Chain of Responsibility illustrated as a linked list.

This looks a lot like a linked list.

The difference is that the terminator of a linked list is usually empty. Here, however, you have two types of objects. All objects apart from the rightmost object represent a potential. Each object may, or may not, handle the method call and produce an outcome; if an object can't handle the method call, it'll delegate to the next object in the chain.

The rightmost object, however, is different. This object can't delegate any further, but must handle the method call. In the icon reader example, this is the DefaultIconReader class.

Once you start to see most of the list as a list of potential values, you may realise that you can collapse it into a single potential value. This is possible because a list of values, where you pick the first non-empty value, forms a monoid. This is sometimes called the First monoid.

In other words, you can reduce, or fold, all of the list, except the rightmost value, to a single potential value:

Chain of Responsibility illustrated as a linked list, with all but the rightmost objects folded to one.

When you do that, however, you're left with a single potential value. The result of folding most of the list is that you get the leftmost non-empty value in the list. There's no guarantee, however, that that value is non-empty. If all the values in the list are empty, the result is also empty. This means that you somehow need to combine a potential value with a value that's guaranteed to be present: the terminator.

You can do that with another fold:

Chain of Responsibility illustrated as a linked list, with two consecutive folds.

This second fold isn't a list fold, but rather a Maybe fold.

Maybe #

The First monoid is a monoid over Maybe, so add a Maybe class to the code base. In Haskell, the catamorphism for Maybe is called maybe, but that's not a good method name in object-oriented design. Another option is some variation of fold, but in C#, this functionality tends to be called Aggregate, at least for IEnumerable<T>, so I'll reuse that terminology:

public TResult Aggregate<TResult>(TResult @default, Func<T, TResult> func)
{
    if (func == null)
        throw new ArgumentNullException(nameof(func));
 
    return hasItem ? func(item) : @default;
}
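As an aside, this Aggregate method is all you need to express the First monoid over Maybe discussed above. Here's a minimal sketch; the FirstNonEmpty helper (that name is mine, not part of the example code base) returns the leftmost populated value, and the empty Maybe is the identity element:

public static Maybe<T> FirstNonEmpty<T>(Maybe<T> x, Maybe<T> y)
{
    // When x is populated, keep it; otherwise, fall back to y.
    return x.Aggregate(y, _ => x);
}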

You can implement another, more list-like Aggregate overload from this one, but for this article, you don't need it.
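For completeness, here's a sketch of what such an overload might look like, mirroring the seed-and-folder shape of Enumerable.Aggregate:

public TResult Aggregate<TResult>(TResult seed, Func<TResult, T, TResult> func)
{
    if (func == null)
        throw new ArgumentNullException(nameof(func));

    // A Maybe contains at most one item, so the folder function is
    // applied at most once.
    return hasItem ? func(seed, item) : seed;
}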

From TryReadIconId to Maybe #

In the above code examples, DBIconReader depends on IUserRepository, which defined this method:

bool TryReadIconId(int userId, out string iconId);

From Tester-Doer isomorphisms we know, however, that such a design is isomorphic to returning a Maybe value, and since that's more composable, do that:

Maybe<string> ReadIconId(int userId);

This requires you to refactor the DBIconReader implementation of the ReadIcon method:

public Icon ReadIcon(User user)
{
    Maybe<string> mid = repository.ReadIconId(user.Id);
    Lazy<Icon> lazyResult = mid.Aggregate(
        @default: new Lazy<Icon>(() => next.ReadIcon(user)),
        func: id => new Lazy<Icon>(() => CreateIcon(id)));
    return lazyResult.Value;
}

A few things are worth a mention. Notice that the above Aggregate method (the Maybe catamorphism) requires you to supply a @default value (to be used if the Maybe object is empty). In the Chain of Responsibility design pattern, however, the fallback value is produced by calling the next object in the chain. If you do this unconditionally, however, you perform too much work. You're only supposed to call next if the current object can't handle the method call.

The solution is to aggregate the mid object to a Lazy<Icon> and then return its Value. The @default value is now a lazy computation that calls next only if its Value is read. When mid is populated, on the other hand, the lazy computation calls the private CreateIcon method when Value is accessed. The private CreateIcon method contains the same logic as before the refactoring.

This change of DBIconReader isn't strictly necessary in order to change the overall Chain of Responsibility to a pair of catamorphisms, but serves, I think, as a nice introduction to the use of the Maybe catamorphism.

Optional icon readers #

Previously, the IIconReader interface required each implementation to return an Icon object:

public interface IIconReader
{
    Icon ReadIcon(User user);
}

When you have an object like GravatarReader that may or may not return an Icon, this requirement leads toward the Chain of Responsibility design pattern. You can, however, shift the responsibility of what to do next by changing the interface:

public interface IIconReader
{
    Maybe<Icon> ReadIcon(User user);
}

An implementation like GravatarReader becomes simpler:

public class GravatarReader : IIconReader
{
    public Maybe<Icon> ReadIcon(User user)
    {
        if (user.UseGravatar)
            return new Maybe<Icon>(new Icon(new Gravatar(user.Email).Url));
 
        return new Maybe<Icon>();
    }
}

No longer do you have to pass in a next dependency. Instead, you just return an empty Maybe<Icon> if you can't handle the method call. The same change applies to the IdenticonReader class.
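For completeness, here's a sketch of how the refactored IdenticonReader might look, reusing the baseUrl and HashUser implementation details from the earlier version:

public class IdenticonReader : IIconReader
{
    public Maybe<Icon> ReadIcon(User user)
    {
        if (user.UseIdenticon)
            return new Maybe<Icon>(new Icon(new Uri(baseUrl, HashUser(user))));

        return new Maybe<Icon>();
    }

    // Implementation details (baseUrl, HashUser) go here, as before...
}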

Since Maybe is a functor, and the DBIconReader already works on a Maybe<string> value, its implementation is greatly simplified:

public Maybe<Icon> ReadIcon(User user)
{
    return repository.ReadIconId(user.Id).Select(CreateIcon);
}

Since ReadIconId returns a Maybe<string>, you can simply use Select to transform the icon ID to an Icon object if the Maybe is populated.

Coalescing Composite #

As an intermediate step, you can compose the various readers using a Coalescing Composite:

public class CompositeIconReader : IIconReader
{
    private readonly IIconReader[] iconReaders;
 
    public CompositeIconReader(params IIconReader[] iconReaders)
    {
        this.iconReaders = iconReaders;
    }
 
    public Maybe<Icon> ReadIcon(User user)
    {
        foreach (var iconReader in iconReaders)
        {
            var mIcon = iconReader.ReadIcon(user);
            if (IsPopulated(mIcon))
                return mIcon;
        }
 
        return new Maybe<Icon>();
    }
 
    private static bool IsPopulated<T>(Maybe<T> m)
    {
        return m.Aggregate(false, _ => true);
    }
}

I prefer a more explicit design over this one, so this is just an intermediate step. This IIconReader implementation composes an array of other IIconReader objects and queries each in order to return the first populated Maybe value it finds. If it doesn't find any populated value, it returns an empty Maybe object.

You can now compose your IIconReader objects into a Composite:

IIconReader reader = new CompositeIconReader(
    new GravatarReader(),
    new IdenticonReader(),
    new DBIconReader(repo));

While this gives you a single object on which you can call ReadIcon, the return value of that method is still a Maybe<Icon> object. You still need to reduce the Maybe<Icon> object to an Icon object. You can do this with a Maybe helper method:

public T GetValueOrDefault(T @default)
{
    return Aggregate(@default, x => x);
}

Given a User object named user, you can now use the composition and the GetValueOrDefault method to get an Icon object:

Icon icon = reader.ReadIcon(user).GetValueOrDefault(Icon.Default);

First you use the composed reader to produce a Maybe<Icon> object, and then you use the GetValueOrDefault method to reduce the Maybe<Icon> object to an Icon object.

The latter of these two steps, GetValueOrDefault, is already based on the Maybe catamorphism, but the first step is still too implicit to clearly show the nature of what's actually going on. The next step is to refactor the Coalescing Composite to a list of monoidal values.

First #

While not strictly necessary, you can introduce a First<T> wrapper:

public sealed class First<T>
{
    public First(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));
 
        Item = item;
    }
 
    public T Item { get; }
 
    public override bool Equals(object obj)
    {
        if (!(obj is First<T> other))
            return false;
 
        return Equals(Item, other.Item);
    }
 
    public override int GetHashCode()
    {
        return Item.GetHashCode();
    }
}

In this particular example, the First<T> class adds no new capabilities, so it's technically redundant. You could add to it methods to combine two First<T> objects into one (since First forms a semigroup), and perhaps a method or two to accumulate multiple values, but in this article, none of those are required.

While the class as shown above doesn't add any behaviour, I like that it signals intent, so I'll use it in that role.
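If you did need the semigroup operation, it'd be trivial, because combining two First objects always prefers the leftmost one. A sketch, with a hypothetical Combine method name:

public First<T> Combine(First<T> other)
{
    // A First<T> object is always populated, and First prefers the
    // leftmost value, so the right operand is ignored.
    return this;
}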

Lazy I/O in C# #

Like in the above Haskell code, you'll need to be able to combine two First<T> objects in a lazy fashion, in such a way that if the first object is populated, the I/O associated with producing the second value never happens. In Haskell I addressed that concern with a newtype that, among other abstractions, is a monoid. You can do the same in C# with an extension method:

public static Lazy<Maybe<First<T>>> FindFirst<T>(
    this Lazy<Maybe<First<T>>> m,
    Lazy<Maybe<First<T>>> other)
{
    if (m.Value.IsPopulated())
        return m;
 
    return other;
}
 
private static bool IsPopulated<T>(this Maybe<T> m)
{
    return m.Aggregate(false, _ => true);
}

The FindFirst method returns the first (leftmost) non-empty object of two options. It's a lazy version of the First monoid, and that's still a monoid. It's truly lazy because it never accesses the Value property on other. While it has to force evaluation of the first lazy computation, m, it doesn't have to evaluate other. Thus, whenever m is populated, other can remain non-evaluated.

Since monoids accumulate, you can also write an extension method to implement that functionality:

public static Lazy<Maybe<First<T>>> FindFirst<T>(this IEnumerable<Lazy<Maybe<First<T>>>> source)
{
    var identity = new Lazy<Maybe<First<T>>>(() => new Maybe<First<T>>());
    return source.Aggregate(identity, (acc, x) => acc.FindFirst(x));
}

This overload just uses the earlier FindFirst extension method to fold an arbitrary number of lazy First<T> objects into one. Notice that Aggregate is the C# name for the list catamorphism.

You can now compose the desired functionality using the basic building blocks of monoids, functors, and catamorphisms.

Composition from universal abstractions #

The goal is still a function that takes a User object as input and produces an Icon object as output. While you could compose that functionality directly in-line where you need it, I think it may be helpful to package the composition in a Facade object.

public class IconReaderFacade
{
    private readonly IReadOnlyCollection<IIconReader> readers;
 
    public IconReaderFacade(IUserRepository repository)
    {
        readers = new IIconReader[]
            {
                new GravatarReader(),
                new IdenticonReader(),
                new DBIconReader(repository)
            };
    }
 
    public Icon ReadIcon(User user)
    {
        IEnumerable<Lazy<Maybe<First<Icon>>>> lazyIcons = readers
            .Select(r =>
                new Lazy<Maybe<First<Icon>>>(() =>
                    r.ReadIcon(user).Select(i => new First<Icon>(i))));
        Lazy<Maybe<First<Icon>>> m = lazyIcons.FindFirst();
        return m.Value.Aggregate(Icon.Default, fi => fi.Item);
    }
}

When you initialise an IconReaderFacade object, it creates an array of the desired readers. Whenever ReadIcon is invoked, it first transforms all those readers to a sequence of potential icons. All the values in the sequence are lazily evaluated, so in this step, nothing actually happens, even though it looks as though all the readers' ReadIcon methods get called. The Select method is a structure-preserving map, so all readers are still potential producers of Icon objects.

You now have an IEnumerable<Lazy<Maybe<First<Icon>>>>, which must be a good candidate for the prize for the most nested generic .NET type of 2019. It fits, though, the input type for the above FindFirst overload, so you can call that. The result is a single potential value m. That's the list catamorphism applied.

Finally, you force evaluation of the lazy computation and apply the Maybe catamorphism (Aggregate). The @default value is Icon.Default, which gets returned if m turns out to be empty. When m is populated, you pull the Item out of the First object. In either case, you now have an Icon object to return.

This composition has exactly the same behaviour as the initial Chain of Responsibility implementation, but is now composed from universal abstractions.
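Client code now only deals with the Facade. A usage sketch, assuming existing repository and user objects:

var facade = new IconReaderFacade(repository);
Icon icon = facade.ReadIcon(user);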

Summary #

The Chain of Responsibility design pattern describes a flexible way to implement conditional logic. Instead of relying on keywords like if or switch, you can compose the conditional logic from polymorphic objects. This gives you several advantages. One is that you get better separations of concerns, which will tend to make it easier to refactor the code. Another is that it's possible to change the behaviour at run time, by moving the objects around.

You can achieve a similar design, with equivalent advantages, by composing polymorphically similar functions in a list, map the functions to a list of potential values, and then use the list catamorphism to reduce many potential values to one. Finally, you apply the Maybe catamorphism to produce a value, even if the potential value is empty.

Next: The State pattern and the State monad.


Tester-Doer isomorphisms

Monday, 15 July 2019 07:35:00 UTC

The Tester-Doer pattern is equivalent to the Try-Parse idiom; both are equivalent to Maybe.

This article is part of a series of articles about software design isomorphisms. An isomorphism is when a bi-directional lossless translation exists between two representations. Such translations exist between the Tester-Doer pattern and the Try-Parse idiom. Both can also be translated into operations that return Maybe.

Isomorphisms between Tester-Doer, Try-Parse, and Maybe.

Given an implementation that uses one of those three idioms or abstractions, you can translate your design into one of the other options. This doesn't imply that each is of equal value. When it comes to composability, Maybe is superior to the two other alternatives, and Tester-Doer isn't thread-safe.

Tester-Doer #

The first time I explicitly encountered the Tester-Doer pattern was in the Framework Design Guidelines, which is from where I've taken the name. The pattern is, however, older. The idea that you can query an object about whether a given operation would be possible, and then you only perform it if the answer is affirmative, is almost a leitmotif in Object-Oriented Software Construction. Bertrand Meyer often uses linked lists and stacks as examples, but I'll instead use the example that Krzysztof Cwalina and Brad Abrams use:

ICollection<int> numbers = // ...
if (!numbers.IsReadOnly)
    numbers.Add(1);

The idea with the Tester-Doer pattern is that you test whether an intended operation is legal, and only perform it if the answer is affirmative. In the example, you only add to the numbers collection if IsReadOnly is false. Here, IsReadOnly is the Tester, and Add is the Doer.

As Jeffrey Richter points out in the book, this is a dangerous pattern:

"The potential problem occurs when you have multiple threads accessing the object at the same time. For example, one thread could execute the test method, which reports that all is OK, and before the doer method executes, another thread could change the object, causing the doer to fail."
In other words, the pattern isn't thread-safe. While multi-threaded programming was always supported in .NET, this was less of a concern when the guidelines were first published (2006) than it is today. The guidelines were in internal use in Microsoft years before they were published, and there weren't many multi-core processors in use back then.

Another problem with the Tester-Doer pattern is with discoverability. If you're looking for a way to add an element to a collection, you'd usually consider your search over once you find the Add method. Even if you wonder Is this operation safe? Can I always add an element to a collection? you might consider looking for a CanAdd method, but not an IsReadOnly property. Most people don't even ask the question in the first place, though.

From Tester-Doer to Try-Parse #

You could refactor such a Tester-Doer API to a single method, which is both thread-safe and discoverable. One option is a variation of the Try-Parse idiom (discussed in detail below). Using it could look like this:

ICollection<int> numbers = // ...
bool wasAdded = numbers.TryAdd(1);

In this special case, you may not need the wasAdded variable, because the original Add operation never returned a value. If, on the other hand, you do care whether or not the element was added to the collection, you'd have to figure out what to do in the case where the return value is true and false, respectively.

Compared to the more idiomatic example of the Try-Parse idiom below, you may have noticed that the TryAdd method shown here takes no out parameter. This is because the original Add method returns void; there's nothing to return. From unit isomorphisms, however, we know that unit is isomorphic to void, so we could, more explicitly, have defined a TryAdd method with this signature:

public bool TryAdd(T item, out Unit unit)

There's no point in doing this, however, apart from demonstrating that the isomorphism holds.

From Tester-Doer to Maybe #

You can also refactor the add-to-collection example to return a Maybe value, although in this degenerate case, it makes little sense. If you automate the refactoring process, you'd arrive at an API like this:

public Maybe<Unit> TryAdd(T item)

Using it would look like this:

ICollection<int> numbers = // ...
Maybe<Unit> m = numbers.TryAdd(1);

The contract is consistent with what Maybe implies: You'd get an empty Maybe<Unit> object if the add operation 'failed', and a populated Maybe<Unit> object if the add operation succeeded. Even in the populated case, though, the value contained in the Maybe object would be unit, which carries no further information than its existence.
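For illustration only, a mechanical translation could look like the following sketch, assuming the method is added to a collection class that has IsReadOnly and Add members. Note that a real implementation would still have to address thread safety:

public Maybe<Unit> TryAdd(T item)
{
    // Test, then do, as in the original Tester-Doer example. The return
    // value only indicates whether the side effect took place.
    if (IsReadOnly)
        return new Maybe<Unit>();

    Add(item);
    return new Maybe<Unit>(Unit.Value);
}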

To be clear, this isn't close to a proper functional design because all the interesting action happens as a side effect. Does the design have to be functional? No, it clearly isn't in this case, but Maybe is a concept that originated in functional programming, so you could be misled to believe that I'm trying to pass this particular design off as functional. It's not.

A functional version of this API could look like this:

public Maybe<ICollection<T>> TryAdd(T item)

An implementation wouldn't mutate the object itself, but rather return a new collection with the added item, in case that was possible. This is, however, always possible, because you can always concatenate item to the front of the collection. In other words, this particular line of inquiry is increasingly veering into the territory of the absurd. This isn't, however, a counter-example of my proposition that the isomorphism exists; it's just a result of the initial example being degenerate.

Try-Parse #

Another idiom described in the Framework Design Guidelines is the Try-Parse idiom. This seems to be a coding idiom more specific to the .NET framework, which is the reason I call it an idiom instead of a pattern. (Perhaps it is, after all, a pattern... I'm sure many of my readers are better informed about how problems like these are solved in other languages, and can enlighten me.)

A better name might be Try-Do, since the idiom doesn't have to be constrained to parsing. The example that Cwalina and Abrams supply, however, relates to parsing a string into a DateTime value. Such an API is already available in the base class library. Using it looks like this:

bool couldParse = DateTime.TryParse(candidate, out DateTime dateTime);

Since DateTime is a value type, the out parameter will never be null, even if parsing fails. You can, however, examine the return value couldParse to determine whether the candidate could be parsed.

In the running commentary in the book, Jeffrey Richter likes this much better:

"I like this guideline a lot. It solves the race-condition problem and the performance problem."
I agree that it's better than Tester-Doer, but that doesn't mean that you can't refactor such a design to that pattern.

From Try-Parse to Tester-Doer #

While I see no compelling reason to design parsing attempts with the Tester-Doer pattern, it's possible. You could create an API that enables interaction like this:

DateTime dateTime = default(DateTime);
bool canParse = DateTimeEnvy.CanParse(candidate);
if (canParse)
    dateTime = DateTime.Parse(candidate);

You'd need to add a new CanParse method with this signature:

public static bool CanParse(string candidate)

In this particular example, you don't have to add a Parse method, because it already exists in the base class library, but in other examples, you'd have to add such a method as well.
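A sketch of CanParse could simply delegate to the Try-Parse API that already exists, discarding the parsed value:

public static bool CanParse(string candidate)
{
    return DateTime.TryParse(candidate, out _);
}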

This example doesn't suffer from issues with thread safety, since strings are immutable, but in general, that problem is always a concern with the Tester-Doer anti-pattern. Discoverability still suffers in this example.

From Try-Parse to Maybe #

While the Try-Parse idiom is thread-safe, it isn't composable. Every time you run into an API modelled over this template, you have to stop what you're doing and check the return value. Did the operation succeed? What should the code do if it didn't?

Maybe, on the other hand, is composable, so it's a much better way to model problems such as parsing. Typically, methods or functions that return Maybe values are still prefixed with Try, but there's no longer any out parameter. A Maybe-based TryParse function could look like this:

public static Maybe<DateTime> TryParse(string candidate)

You could use it like this:

Maybe<DateTime> m = DateTimeEnvy.TryParse(candidate);

If the candidate was successfully parsed, you get a populated Maybe<DateTime>; if the string was invalid, you get an empty Maybe<DateTime>.

A Maybe object composes much better with other computations. Contrary to the Try-Parse idiom, you don't have to stop and examine a Boolean return value. You don't even have to deal with empty cases at the point where you parse. Instead, you can defer the decision about what to do in case of failure until a later time, where it may be more obvious what to do in that case.
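As a sketch, such a TryParse function could adapt the base class library's Try-Parse method, using a Maybe<T> class with a populated and an empty constructor:

public static Maybe<DateTime> TryParse(string candidate)
{
    if (DateTime.TryParse(candidate, out DateTime dateTime))
        return new Maybe<DateTime>(dateTime);

    return new Maybe<DateTime>();
}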

Maybe #

In my Encapsulation and SOLID Pluralsight course, you get a walk-through of all three options for dealing with an operation that could potentially fail. Like in this article, the course starts with Tester-Doer, progresses over Try-Parse, and arrives at a Maybe-based implementation. In that course, the example involves reading a (previously stored) message from a text file. The final API looks like this:

public Maybe<string> Read(int id)

The protocol implied by such a signature is that you supply an ID, and if a message with that ID exists on disc, you receive a populated Maybe<string>; otherwise, an empty object. This is not only composable, but also thread-safe. For anyone who understands the universal abstraction of Maybe, it's clear that this is an operation that could fail. Ultimately, client code will have to deal with empty Maybe values, but this doesn't have to happen immediately. Such a decision can be deferred until a proper context exists for that purpose.

From Maybe to Tester-Doer #

Since Tester-Doer is the least useful of the patterns discussed in this article, it makes little sense to refactor a Maybe-based API to a Tester-Doer implementation. Nonetheless, it's still possible. The API could look like this:

public bool Exists(int id)

public string Read(int id)

Not only is this design not thread-safe, but it's another example of poor discoverability. While the doer is called Read, the tester isn't called CanRead, but rather Exists. If the class has other members, these could be listed interleaved between Exists and Read. It wouldn't be obvious that these two members were designed to be used together.

Again, the intended usage is code like this:

string message;
if (fileStore.Exists(49))
    message = fileStore.Read(49);

This is still problematic, because you need to decide what to do in the else case as well, although you don't see that case here.

The point is, still, that you can translate from one representation to another without loss of information; not that you should.

From Maybe to Try-Parse #

Of the three representations discussed in this article, I firmly believe that a Maybe-based API is superior. Unfortunately, the .NET base class library doesn't (yet) come with a built-in Maybe object, so if you're developing an API as part of a reusable library, you have two options:

  • Export the library's Maybe<T> type together with the methods that return it.
  • Use Try-Parse for interoperability reasons.
This is the only reason I can think of to use the Try-Parse idiom. For the FileStore example from my Pluralsight course, this would imply not a TryParse method, but a TryRead method:

public bool TryRead(int id, out string message)

This would enable you to expose the method in a reusable library. Client code could interact with it like this:

string message;
if (!fileStore.TryRead(50, out message))
    message = "";

This has all the problems associated with the Try-Parse idiom already discussed in this article, but it does, at least, have a basic use case.
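As a sketch, such a TryRead method could be implemented as an Adapter over the Maybe-based Read method, using Maybe helper methods like GetValueOrDefault and Aggregate:

public bool TryRead(int id, out string message)
{
    Maybe<string> m = Read(id);
    // The out parameter is only valid when the method returns true.
    message = m.GetValueOrDefault(null);
    return m.Aggregate(false, _ => true);
}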

Isomorphism with Either #

At this point, I hope that you find it reasonable to believe that the three representations, Tester-Doer, Try-Parse, and Maybe, are isomorphic. You can translate between any of these representations to any other of these without loss of information. This also means that you can translate back again.

While I've only argued with a series of examples, it's my experience that these three representations are truly isomorphic. You can always translate any of these representations into another. Mostly, though, I translate into Maybe. If you disagree with my proposition, all you have to do is to provide a counter-example.

There's a fourth isomorphism that's already well-known, and that's between Maybe and Either. Specifically, Maybe<T> is isomorphic to Either<Unit, T>. In Haskell, this is easily demonstrated with this set of functions:

toMaybe :: Either () a -> Maybe a
toMaybe (Left ()) = Nothing
toMaybe (Right x) = Just x
 
fromMaybe :: Maybe a -> Either () a
fromMaybe Nothing = Left ()
fromMaybe (Just x) = Right x

Translated to C#, using the Church-encoded Maybe together with the Church-encoded Either, these two functions could look like the following, starting with the conversion from Maybe to Either:

// On Maybe:
public static IEither<Unit, T> ToEither<T>(this IMaybe<T> source)
{
    return source.Match<IEither<Unit, T>>(
        nothing: new Left<Unit, T>(Unit.Value),
        just: x => new Right<Unit, T>(x));
}

Likewise, the conversion from Either to Maybe:

// On Either:
public static IMaybe<T> ToMaybe<T>(this IEither<Unit, T> source)
{
    return source.Match<IMaybe<T>>(
        onLeft: _ => new Nothing<T>(),
        onRight: x => new Just<T>(x));
}

You can convert back and forth to your heart's content, as this parametrised xUnit.net 2.3.1 test shows:

[Theory]
[InlineData(42)]
[InlineData(1337)]
[InlineData(2112)]
[InlineData(90125)]
public void IsomorphicWithPopulatedMaybe(int i)
{
    var expected = new Right<Unit, int>(i);
    var actual = expected.ToMaybe().ToEither();
    Assert.Equal(expected, actual);
}

I decided to exclude IEither<Unit, T> from the overall theme of this article in order to better contrast three alternatives that may not otherwise look equivalent. That IEither<Unit, T> is isomorphic to IMaybe<T> is a well-known result. Besides, I think that both of these two representations already inhabit the same conceptual space. Either and Maybe are both well-known in statically typed functional programming.

Summary #

The Tester-Doer pattern is a decades-old design pattern that attempts to model how to perform operations that can potentially fail, without relying on exceptions for flow control. It predates mainstream multi-core processors by decades, which can explain why it even exists as a pattern in the first place. At the time people arrived at the pattern, thread-safety wasn't a big concern.

The Try-Parse idiom is a thread-safe alternative to the Tester-Doer pattern. It combines the tester and doer methods into a single method with an out parameter. While thread-safe, it's not composable.

Maybe offers the best of both worlds. It's both thread-safe and composable. It's also as discoverable as any Try-Parse method.

These three alternatives are all, however, isomorphic. This means that you can refactor any of the three designs into one of the other designs, without loss of information. It also means that you can implement Adapters between particular implementations, should you so desire. You see this frequently in F# code, where functions that return 'a option adapt Try-Parse methods from the .NET base class library.

While all three designs are equivalent in the sense that you can translate one into another, it doesn't imply that they're equally useful. Maybe is the superior design, and Tester-Doer clearly inferior.

Next: Builder isomorphisms.


Payment types catamorphism

Monday, 08 July 2019 06:08:00 UTC

You can find the catamorphism for a custom sum type. Here's an example.

This article is part of an article series about catamorphisms. A catamorphism is a universal abstraction that describes how to digest a data structure into a potentially more compact value.

This article presents the catamorphism for a domain-specific sum type, as well as how to identify it. The beginning of this article presents the catamorphism in C#, with a few examples. The rest of the article describes how to deduce the catamorphism. This part of the article presents my work in Haskell. Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix.

In all previous articles in the series, you've seen catamorphisms for well-known data structures: Boolean values, Peano numbers, Maybe, trees, and so on. These are all general-purpose data structures, so you might be left with the impression that catamorphisms are only related to such general types. That's not the case. The point of this article is to demonstrate that you can find the catamorphism for your own custom, domain-specific sum type as well.

C# catamorphism #

The custom type we'll examine in this article is the Church-encoded payment types I've previously written about. It's just an example of a custom data type, but it serves the purpose of illustration because I've already shown it as a Church encoding in C#, as a Visitor in C#, and as a discriminated union in F#.

The catamorphism for the IPaymentType interface is the Match method:

T Match<T>(
    Func<PaymentService, T> individual,
    Func<PaymentService, T> parent,
    Func<ChildPaymentService, T> child);

As has turned out to be a common trait, the catamorphism is identical to the Church encoding.

I'm not going to show more than a few examples of using the Match method, because you can find other examples in the previous articles:

> IPaymentType p = new Individual(new PaymentService("Visa", "Pay"));
> p.Match(ps => ps.Name, ps => ps.Name, cps => cps.PaymentService.Name)
"Visa"
> IPaymentType p = new Parent(new PaymentService("Visa", "Pay"));
> p.Match(ps => ps.Name, ps => ps.Name, cps => cps.PaymentService.Name)
"Visa"
> IPaymentType p = new Child(new ChildPaymentService("1234", new PaymentService("Visa", "Pay")));
> p.Match(ps => ps.Name, ps => ps.Name, cps => cps.PaymentService.Name)
"Visa"

These three examples from a C# Interactive session demonstrate that no matter which payment method you use, you can use the same Match method call to extract the payment name from the p object.
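As a reminder of how the Church encoding works, here's a sketch of how the Individual class might implement Match; the details may differ from the earlier article:

public class Individual : IPaymentType
{
    private readonly PaymentService paymentService;

    public Individual(PaymentService paymentService)
    {
        this.paymentService = paymentService;
    }

    public T Match<T>(
        Func<PaymentService, T> individual,
        Func<PaymentService, T> parent,
        Func<ChildPaymentService, T> child)
    {
        // The Individual case invokes only the first continuation,
        // ignoring the other two.
        return individual(paymentService);
    }
}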

Payment types F-Algebra #

As in the previous article, I'll use Fix and cata as explained in Bartosz Milewski's excellent article on F-Algebras.

First, you'll have to define the auxiliary types involved in this API:

data PaymentService = PaymentService {
    paymentServiceName :: String
  , paymentServiceAction :: String
  } deriving (Show, Eq, Read)
 
data ChildPaymentService = ChildPaymentService {
    originalTransactionKey :: String
  , parentPaymentService :: PaymentService
  } deriving (Show, Eq, Read)

While F-Algebras and fixed points are mostly used for recursive data structures, you can also define an F-Algebra for a non-recursive data structure. You already saw examples of that in the articles about Boolean catamorphism, Maybe catamorphism, and Either catamorphism. While each of the three payment types has associated data, none of it is parametrically polymorphic, so a single type argument for the carrier type suffices:

data PaymentTypeF c =
    IndividualF PaymentService
  | ParentF PaymentService
  | ChildF ChildPaymentService
  deriving (Show, Eq, Read)
 
instance Functor PaymentTypeF where
  fmap _ (IndividualF ps) = IndividualF ps
  fmap _     (ParentF ps) = ParentF ps
  fmap _     (ChildF cps) = ChildF cps

I chose to call the carrier type c (for carrier). As was also the case with BoolF, MaybeF, and EitherF, the Functor instance ignores the map function because the carrier type is missing from all three cases. Like the Functor instances for BoolF, MaybeF, and EitherF, it'd seem that nothing happens, but at the type level, this is still a translation from PaymentTypeF c to PaymentTypeF c1. Not much of a function, perhaps, but definitely an endofunctor.

Some helper functions make it a little easier to create Fix PaymentTypeF values, but there's really not much to them:

individualF :: PaymentService -> Fix PaymentTypeF
individualF = Fix . IndividualF
 
parentF :: PaymentService -> Fix PaymentTypeF
parentF = Fix . ParentF
 
childF :: ChildPaymentService -> Fix PaymentTypeF
childF = Fix . ChildF

That's all you need to identify the catamorphism.

Haskell catamorphism #

At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (PaymentTypeF), and an object c, but you still need to find a morphism PaymentTypeF c -> c.

As in the previous articles, start by writing a function that will become the catamorphism, based on cata:

paymentF = cata alg
  where alg (IndividualF ps) = undefined
        alg     (ParentF ps) = undefined
        alg     (ChildF cps) = undefined

While this compiles, with its undefined implementations, it obviously doesn't do anything useful. I find, however, that it helps me think. How can you return a value of the type c from the IndividualF case? You could pass an argument to the paymentF function, but you shouldn't ignore the data ps contained in the case, so it has to be a function:

paymentF fi = cata alg
  where alg (IndividualF ps) = fi ps
        alg     (ParentF ps) = undefined
        alg     (ChildF cps) = undefined

I chose to call the argument fi, for function, individual. You can pass a similar argument to deal with the ParentF case:

paymentF fi fp = cata alg
  where alg (IndividualF ps) = fi ps
        alg     (ParentF ps) = fp ps
        alg     (ChildF cps) = undefined

And of course with the remaining ChildF case as well:

paymentF :: (PaymentService -> c) ->
            (PaymentService -> c) ->
            (ChildPaymentService -> c) ->
            Fix PaymentTypeF -> c
paymentF fi fp fc = cata alg
  where alg (IndividualF ps) = fi ps
        alg     (ParentF ps) = fp ps
        alg     (ChildF cps) = fc cps

This works. Since cata has the type Functor f => (f a -> a) -> Fix f -> a, that means that alg has the type f a -> a. In the case of PaymentTypeF, the compiler infers that the alg function has the type PaymentTypeF c -> c, which is just what you need!

You can now see what the carrier type c is for. It's the type that the algebra extracts, and thus the type that the catamorphism returns.

This, then, is the catamorphism for the payment types. Except for the tree catamorphism, all catamorphisms so far have been pairs, but this one is a triplet of functions. This is because the sum type has three cases instead of two.

As you've seen repeatedly, this isn't the only possible catamorphism, since you can, for example, trivially reorder the arguments to paymentF. The version shown here is, however, equivalent to the above C# Match method.

Usage #

You can use the catamorphism as a basis for other functionality. If, for example, you want to convert a Fix PaymentTypeF value to JSON, you can first define an Aeson record type for that purpose:

data PaymentJson = PaymentJson {
    name :: String
  , action :: String
  , startRecurrent :: Bool
  , transactionKey :: Maybe String
  } deriving (Show, Eq, Generic)
 
instance ToJSON PaymentJson

Subsequently, you can use paymentF to implement a conversion from Fix PaymentTypeF to PaymentJson, as in the previous articles:

toJson :: Fix PaymentTypeF -> PaymentJson
toJson =
  paymentF
    (\(PaymentService n a)                         -> PaymentJson n a False Nothing)
    (\(PaymentService n a)                         -> PaymentJson n a True Nothing)
    (\(ChildPaymentService k (PaymentService n a)) -> PaymentJson n a False $ Just k)

Testing it in GHCi, it works as it's supposed to:

Prelude Data.Aeson B Payment> B.putStrLn $ encode $ toJson $ parentF $ PaymentService "Visa" "Pay"
{"transactionKey":null,"startRecurrent":true,"action":"Pay","name":"Visa"}

Clearly, it would have been easier to define the payment types shown here as a regular Haskell sum type and just use standard pattern matching, but the purpose of this article isn't to present useful code; the only purpose of the code here is to demonstrate how to identify the catamorphism for a custom domain-specific sum type.

Summary #

Even custom, domain-specific sum types have catamorphisms. This article presented the catamorphism for a custom payment sum type. Because this particular sum type has three cases, the catamorphism is a triplet, instead of a pair, which has otherwise been the most common shape of catamorphisms in previous articles.

Next: Some design patterns as universal abstractions.


Yes silver bullet

Monday, 01 July 2019 07:38:00 UTC

Since Fred Brooks published his essay, I believe that we, contrary to his prediction, have witnessed several silver bullets.

I've been rereading Fred Brooks's 1986 essay No Silver Bullet because I've become increasingly concerned that people seem to draw the wrong conclusions from it. Semantic diffusion seems to have set in. These days, when people state something along the lines that there's no silver bullet in software development, I often get the impression that they mean that there's no panacea.

Indeed; I agree. There's no miracle cure that will magically make all problems in software development go away. That's not what the essay states, however. It is, fortunately, more subtle than that.

No silver bullet reread #

It's a great essay. It's not my intent to dispute the central argument of the essay, but I think that Brooks made one particular assumption that I disagree with. That doesn't make me smarter in any way. He wrote the essay in 1986. I'm writing this in 2019, with the benefit of the experience of all the years in-between. Hindsight is 20-20, so anyone could make the observations that I do here.

Before we get to that, though, a brief summary of the essence of the essay is in order. In short, the conclusion is this:

"There is no single development, in either technology or management technique, which by itself promises even one order-of-magnitude improvement within a decade in productivity, in reliability, in simplicity."

Fred Brooks, No Silver Bullet, 1986
The beginning of the essay is a brilliant analysis of the reasons why software development is inherently difficult. If you read this together with Jack Reeves What Is Software Design? (available various places on the internet, or as an appendix in APPP), you'll probably agree that there's an inherent complexity to software development that no invention is likely to dispel.

Ostensibly in the tradition of Aristotle, Brooks distinguishes between essential and accidental complexity. This distinction is central to his argument, so it's worth discussing for a minute.

Software development problems are complex, i.e. made up of many interacting sub-problems. Some of that complexity is accidental. This doesn't imply randomness or sloppiness, but only that the complexity isn't inherent to the problem; that it's only the result of our (human) failure to achieve perfection.

If you imagine that you could whittle away all the accidental complexity, you'd ultimately reach a point where, in the words of Saint Exupéry, there is nothing more to remove. What's left is the essential complexity.

Brooks' conjecture is that a typical software development project comes with both essential and accidental complexity. In his 1995 reflections "No Silver Bullet" Refired (available in The Mythical Man-Month), he clarifies what he already implied in 1986:

"It is my opinion, and that is all, that the accidental or representational part of the work is now down to about half or less of the total."

Fred Brooks, "No Silver Bullet" Refired, 1995
This I fundamentally disagree with, but more on that later. It makes sense to me to graphically represent the argument like this:

Some, but not much, accidental complexity as a shell around essential complexity.

The way that I think of Brooks' argument is that any software project contains some essential and some accidental complexity. For a given project, the size of the essential complexity is fixed.

Brooks believes that less than half of the overall complexity is accidental:

Essential and accidental complexity pie chart.

While a pie chart better illustrates the supposed ratio between the two types of complexity, I prefer to view Brooks' arguments as the first diagram, above. In that visualisation, the essential complexity is a core of fixed size, while accidental complexity is something you can work at removing. If you keep improving your process and technology, you may, conceptually, be able to remove (almost) all of it.

Essential complexity with a very thin shell of accidental complexity.

Brooks' point, with which I agree, is that if the essential complexity is inherent, then you can't reduce the size of it. The only way to decrease the overall complexity is to reduce the accidental complexity.

If you agree with the assessment that less than half of the overall complexity in modern software development is accidental, then it follows that no dramatic improvements are available. Even if you remove all accidental complexity, you've only reduced overall complexity by, say, forty percent.

Accidental complexity abounds #

I find Brooks' arguments compelling. I do not, however, accept the premise that only a little accidental complexity is left. Instead of the above diagrams, I believe that the situation looks more like this (not to scale):

Accidental complexity with a tiny core of essential complexity.

I think that most of the complexity in software development is accidental. I'm not sure about today, but I believe that I have compelling evidence that this was the case in 1986, so I don't see why it shouldn't still be the case.

To be clear, this is all anecdotal, since I don't believe that software development is quantifiable. In the essay, Brooks explicitly talks about the invisibility of software. Software is pure thought stuff; you can't measure it. I discuss this in my Humane Code video, but I also recommend that you read The Leprechauns of Software Engineering if you have any illusions that we, as an industry, have any reliable measurements of productivity.

Brooks predicts that, within the decade (from 1986 to 1996), there would be no single development that would increase productivity by an order of magnitude, i.e. by a factor of at least ten. Ironically, when he wrote "No Silver Bullet" Refired in 1995, at least two such developments were already in motion.

We can't blame Brooks for not identifying those developments, because in 1995, their impact was not yet apparent. Again, hindsight is 20-20.

Neither of these two developments is purely technological, although technology plays a role. Notice, though, that Brooks' prediction included technology or management technique. It's in the interaction between technology and the human that the order-of-magnitude developments emerged.

World Wide Web #

I have a dirty little secret. In the beginning of my programming career, I became quite the expert on a programming framework called Microsoft Commerce Server. In fact, I co-authored a chapter of Professional Commerce Server 2000 Programming, and in 2003 I received an MVP award as an acknowledgement of my work in the Commerce Server community (such as it was; it mostly took place on Usenet).

The Commerce Server framework was a black box. This was long before Microsoft embraced open source, and while there was a bit of official documentation, it was superficial; it was mostly of the getting-started kind.

Over several years, I managed to figure out how the framework really worked, and thus, how one could extend it. This was a painstaking process. Since it was a black box, I couldn't just go and read the code to figure out how it worked. The framework was written in C++ and Visual Basic, so there wasn't even IL code to decompile.

I had one window into the framework. It relied on SQL Server, and I could attach the profiler tool to spy on its interaction with the database. Slowly, over several years, I managed to wrest the framework's secrets from it.

I wasted much time doing detective work like that.

In general, programming in the late nineties and early two-thousands was less productive, not because the languages or tools were orders-of-magnitude worse than today, but because when you hit a snag, you were in trouble.

These days, if you run into a problem beyond your abilities, you can ask for help on the World Wide Web. Usually, you'll find an existing answer on Stack Overflow, and you'll be able to proceed without too much delay.

Compared to twenty years ago, I believe that the World Wide Web has increased my productivity more than ten-fold. While it also existed in 1995, there wasn't much content. It's not the technology itself that provides the productivity increase, but rather the synergy of technology and human knowledge.

I think that Brooks vastly underestimated how much time one can waste when one is stuck. That's a sort of accidental complexity, although in the development process rather than in the technology itself.

Automated testing #

In the late nineties, I was developing web sites (with Commerce Server). When I wanted to run my code to see if it worked, I'd launch the web site on my laptop, log in, click around and enter data until I was convinced that the functionality was working as it should. Most of the time, however, it wasn't, so I'd change a bit of the code, and go through the same process again.

I think that's a common way to 'test' software; at least, it was back then.

While you could get good at going through these motions quickly, verifying a single functionality, or a handful of related ones, could easily take at least a couple of seconds, and usually more like half a minute.

If you had dozens, or even hundreds, of different scenarios to address, you obviously wouldn't run through them all every time you changed the code. At the very best, you'd click your way through the three or four usage scenarios that you thought were relevant to the change you'd made. Other functionality, declared done earlier, you'd simply consider unaffected.

Needless to say, regressions were regular occurrences.

In 2003 I discovered test-driven development, and through that, automated testing. While you can't directly compare unit tests with whole usage scenarios, I think it's fair to compare something like automated integration tests or user-scenario tests (whatever you want to call them) with manually clicking through an application.

Even an integration test, if written properly, can verify a scenario at least ten times faster than you can do it by hand. A more realistic estimate is probably a hundred times faster, or more.

Granted, you have to write the automated test as well, and I know that it's not always trivial. Still, once you have an automated test suite in place, you can run it all the time.

I never ran through all usage scenarios when I manually 'tested' my software. With automated tests, I do. This saves me from most regressions.

This improvement is, in my opinion, a no-brainer. It's easily a factor ten improvement. All the time wasted manually 'testing' the software, plus the time wasted fixing regressions, can be put to better use.

At the time Brooks was writing his own retrospective (in 1995), Kent Beck was beginning to talk to other people about test-driven development. As is a common theme in this article, hindsight is 20-20.

Honourable mentions #

There have been other improvements in software development since 1986. I considered including several of them as bona fide orders-of-magnitude improvements, but I think that's probably going too far. Each of the following developments has, however, offered significant gains:

  • Git. It's surprising how much more productive Git can make you. While it's somewhat better than centralised source control systems at the functionality they have in common, the productivity increase comes from all the new, unanticipated workflows it enables. Before I started using DVCS, I'd have lots of code that was commented out, so that I could experiment with various alternatives. With Git, I just create a new branch, or stash my changes, and experiment with abandon. While it's probably not a ten-fold increase in productivity, I believe it's the simplest technology change you can make to dramatically increase your productivity.
  • Garbage collection. Since I've admitted that I worked with Microsoft Commerce Server, I've probably lost all credibility with my reader already, but let's see if I can win back a little. While Commerce Server programming involved VBScript programming, it also often involved COM programming, and I did quite a bit of that in C++. Having to make sure that you've cleaned up all memory after use is a bother. Garbage collection just makes this work go away. It's hardly a ten-fold improvement in productivity, but I do find it significant.
  • Agile software development. The methodology of decreasing the feedback time between implementation and deployment has made me much more productive. I'm not interested in peddling any particular methodology like Scrum as much as just the general concept of getting rapid feedback. Particularly if you combine continuous delivery with Git, you have a powerful combination. Brooks already talked about incremental software development, and had some hopes attached to this as well. My personal experience can only agree with his sentiment. Again, probably not in itself a ten-fold increase in productivity, but enough that I wouldn't want to work on a project where rapid feedback and incremental development wasn't valued.
I'm probably forgetting lots of other improvements that have happened in the last decades. That's fine. The purpose of this article isn't to produce an exhaustive list, but rather to make the argument that significant improvements have been made since Brooks wrote his essay. I think it'd be folly, then, to believe that we've seen the last of such improvements.

Personally, I'm inclined to believe another order-of-magnitude improvement is right at our feet.

Statically typed functional programming #

This section is conjecture on my part. The improvements I've so far covered are already realised (at least for those who choose to take advantage of them). The improvement I'll cover here is more speculative.

I believe that statically typed functional programming offers another order-of-magnitude improvement over existing software development. Twenty years ago, I believed that object-oriented programming was a good idea. I now believe that I was wrong about that, so it's possible that in another twenty years, I'll also believe that I was wrong about functional programming. Take the following for what it is.

When I carefully reread No Silver Bullet, I got the distinct impression that Brooks considered low-level details of programming part of its essential complexity:

"Much of the complexity in a software construct is, however, not due to conformity to the external world but rather to the implementation itself - its data structures, its algorithms, its connectivity."

Fred Brooks, "No Silver Bullet" Refired, 1995
It's unreasonable to blame anyone writing in 1986, or 1995 for that matter, to think that for loops, variables, program state, and such other programming stables were anything but essential parts of the complexity of developing software.

Someone, unfortunately I forget who, once made the point that all mainstream programming languages are layers of abstractions of how a CPU works. Assembly language is basically just mnemonics on top of a CPU instruction set, then C can be thought of as an abstraction over assembly language, C++ as the next step in abstraction, Java and C# as sort of abstractions of C++, and so on. The origin of the design is the physical CPU. You could say that these languages are designed in a bottom-up fashion.

Imperative languages depicted as designed bottom-up, and functional languages as designed top-down.

Some functional languages (perhaps most famously Haskell, but also APL, and, possibly, Lisp) are designed in a much more top-down fashion. You start with mathematical abstractions like category theory and then figure out how to crystallise the theory into a programming language, and then again, via more layers of abstractions, how to turn the abstract language into machine code.

The more you learn about the pure functional alternative to programming, the more you begin to see mutable program state, variables, for loops, and similar language constructs merely as artefacts of the underlying model. Brooks, I think, thought of these as part of the essential complexity of programming. I don't think that that's the case. You can get by just fine with other abstractions instead.
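
To illustrate with a deliberately trivial sketch (my example, not Brooks'): where an imperative program sums a collection with a loop, an index variable, and a mutable accumulator, a functional program expresses the same computation as a fold, and the loop machinery simply disappears:

-- The index and the mutable accumulator were artefacts of the machine
-- model; a fold expresses the intent directly.
total :: [Int] -> Int
total = foldr (+) 0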

Besides, Brooks writes, under the heading of Complexity:

"From the complexity comes the difficulty of enumerating, much less understanding, all the possible states of the program, and from that comes the unreliability. From the complexity of the functions comes the difficulty of invoking those functions, which makes programs hard to use."

Fred Brooks, No Silver Bullet, 1986

When he writes functions, I don't think that he means functions in the Haskell sense. I think that he means operations, procedures, or methods.

Indeed, when you look at a C# method signature like the following, it's hard to enumerate, understand, or remember, all that it does:

int? TryAccept(Reservation reservation);

If this is a high-level function, many things could happen when you call that method. It could change the state of a database. It could send an email. It could mutate a variable. Not only that, but the behaviour could depend on non-deterministic factors, such as the date, time of day, or just raw randomness. Finally, how should you handle the return value? What does it mean if the return value is null? What if it's not? Is 0 a valid value? Are negative numbers valid? Are they different from positive values?

It is, indeed, difficult to enumerate all the possible states of such a function.

Consider, instead, a Haskell function with a type like this:

tryAccept :: Int -> Reservation -> MaybeT ReservationsProgram Int

What happens if you invoke this function? It returns a value. Does it send any emails? Does it mutate any state? No, it can't, because the static type informs us that this is a pure function. If any programmer, anywhere inside of the function, or the functions it calls, or functions they call, etc. tried to do something impure, it wouldn't have compiled.

Can we enumerate the states of the program? Certainly. We just have to figure out what ReservationsProgram is. After following a few types, we find this statically typed enumeration:

data ReservationsInstruction next =
    IsReservationInFuture Reservation (Bool -> next)
  | ReadReservations UTCTime ([Reservation] -> next)
  | Create Reservation (Int -> next)
  deriving Functor

Essentially, there are three 'actions' that this type enables. The tryAccept function returns the ReservationsProgram inside of a MaybeT container, so there's a fourth option: that something short-circuits along the way.

You don't even have to keep track of this yourself. The compiler keeps you honest. Whenever you invoke the tryAccept function, the compiler will insist that you write code that can handle all possible outcomes. If you turn on the right compiler flags, the code is not going to compile if you don't.

(Both code examples are taken from the same repository.)
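
As a sketch of what that enforcement can look like (this function is mine, not from that repository): with GHC's incomplete-pattern warning promoted to an error, omitting any one of the three cases below makes the module fail to compile.

{-# OPTIONS_GHC -Werror=incomplete-patterns #-}
-- Handle every instruction; delete a case and GHC rejects the module.
describe :: ReservationsInstruction next -> String
describe (IsReservationInFuture _ _) = "Is the reservation in the future?"
describe (ReadReservations _ _)      = "Read reservations around a time."
describe (Create _ _)                = "Create a reservation."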

Haskellers jokingly declare that if Haskell code compiles, it works. While humorous, there's a kernel of truth in that. An advanced type system can carry much information about the behaviour of a program. Some people, particularly programmers who come from a dynamically typed background, find Haskell's type system rigid. That's not an unreasonable criticism, but often, in dynamically typed languages, you have to write many automated tests to ensure that your program behaves as desired, and that it correctly handles various edge cases. A type system like Haskell's, on the other hand, embeds those rules in types instead of in tests.
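
As a small illustration (mine, not from the essay): in a dynamically typed language, you might write a test asserting that an average function is never called with an empty collection; in Haskell, a type can make that rule unbreakable, so the test becomes redundant.

import Data.List.NonEmpty (NonEmpty)
import qualified Data.List.NonEmpty as NE

-- The type guarantees at least one element, so division by zero can't
-- happen, and no test for the empty case is needed.
averageRating :: NonEmpty Double -> Double
averageRating rs = sum rs / fromIntegral (NE.length rs)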

While you should still write automated tests for Haskell programs, fewer are needed. How many fewer? Compared to C-based languages, a factor ten isn't an unreasonable guess.

After a few false starts, in 2014 I finally decided that F# would be my default choice of language on .NET. The reason for that decision was that I felt so much more productive in F# compared to C#. While F#'s type system doesn't embed information about pure versus impure functions, it does support sum types, which is what enables the sort of compile-time enumeration that Brooks discusses.

F# is still my .NET language of choice, but I find that I mostly 'think in' Haskell these days. My conjecture is that a sufficiently advanced type system (like Haskell's) could easily represent another order-of-magnitude improvement over mainstream imperative languages.

Improvements for those who want them #

The essay No Silver Bullet is a perspicacious work. I think more people should read at least the first part, where Brooks explains why software development is hard. I find that analysis brilliant, and I agree: software development involves essential complexity. It's inherently hard.

There's no reason to make it harder than it has to be, though.

More than once, I've discussed productivity improvements with people, only to be met with the dismissal that 'there's no silver bullet'.

Granted, there's no magical solution that will solve all problems with software development, but that doesn't mean that improvements can't be had.

Consider the improvements I've argued for here. Everyone now uses the World Wide Web and sites like Stack Overflow for research; that particular improvement is firmly embedded in all organisations. On the other hand, I still regularly talk to organisations that don't routinely use automated testing.

People still use centralised version control (like TFS or SVN). If ever there was a low-hanging fruit, changing to Git is it. Git is free, and there are plenty of tools you can use to migrate your version history to it. There's also plenty of training and help to be had. Yes, it'll require a small investment to make the change, but the productivity increase is significant.

"The future is already here — it's just not very evenly distributed."

William Gibson

So it is with technology improvements. Automated testing is available, but not ubiquitous. Git is free, but still organisations stick to suboptimal version control. Haskell and F# are mature languages, yet programmers still program in C# or Java.

Summary #

The essay No Silver Bullet was written in 1986, but seems to me to be increasingly misunderstood. When people today talk about it at all, it's mostly as an excuse to stay where they are. "There's no silver bullet," they'll say.

The essay, however, doesn't argue that no improvements can be had. It only argues that no single development will, by itself, yield an order-of-magnitude improvement within a decade.

In the present essay I argue that, since Brooks wrote No Silver Bullet, more than one such improvement has happened. Once the World Wide Web truly began furnishing information at your fingertips, you could be more productive because you wouldn't be stuck for days or weeks. Automated testing reduces the work that manual testers used to perform, as well as limiting regressions.

If you accept my argument, that order-of-magnitude improvements appeared after 1986, this implies that Brooks' premise was wrong. In that case, there's no reason to believe that we've seen the last significant improvement to software development.

I think that more such improvements await us. I suggest that statically typed functional programming offers such an advance, but if history teaches us anything, it seems that breakthroughs tend to be unpredictable.


Comments

As always I enjoy reading your blog, even though I don't understand half of it most of the time. Or is that most of it half of the time? Allow me to put a few observations forward.

First I should confess that I have actually not read the whole of Brooks's essay. When I initially tried, I got about half way through; it sounds like I should make another go at it. That, of course, will not stop me from commenting on the above.

Brooks talks about complexity. To me, designing and implementing a software system is not complex. Quantum physics is complex. Flying an airplane is difficult. Software development may be difficult depending on the task at hand (and, unfortunately, the qualifications of the team), but I would argue that it at most falls into the same category as flying an airplane.

I would probably also state that there are no silver bullets. But like you, I feel that people understand it incorrectly, and there is definitely no reason for making things harder than they are. I think the examples of helpful technology are excellent and show exactly that things do move forward.

That being said, it does not take away the creativity of the right decomposition, the responsibility for getting the use cases right, and especially the liability for getting it wrong. Sadly, especially the last is overlooked. People should be reminded of where the phrase 'live under the bridge' comes from.

To end my ramblings, I would also look a little into the future. As you know, I am somewhat sceptical about machine learning and AI. However, looking at the recent breakthroughs and use cases in these areas, I would not be surprised by a future where software development is done by 'an AI' assembling pre-defined 'entities' to create the software we need. Just as an F16 cannot be flown without a computer, future software cannot be created by a human alone.

2019-07-04 18:29:00 UTC

Karsten, thank you for writing. I'm not inclined to agree that software development falls into the same category of complexity as flying a plane. It seems to me to be orders of magnitude more complex.

Just look at error rates.

Would you ever board an airplane if flying had error rates similar to those observed in software development? Would you fly even if only one percent of all flights ended with a plane crash?

In reality, flying is extremely safe. Would you claim that software development is as safe, predictable, and manageable as flying?

I see no evidence of that.

Are pilots significantly more capable human beings than software developers, or does something else explain the discrepancy in failure rates?

2019-07-05 15:47 UTC

Hi Mark. The fact that error rates are higher in software development is more a statement about the bad state our industry is in, and has been in for a millennium or more.

Why do we accept that we produce crappy systems, or, in your words, software that is not safe, predictable, and manageable? The list of excuses is very long, and the list of results is very short. We as an industry are simply doing it wrong, but most people prefer hand waving and marketing to simple and plausible heuristics.

To use your analogy about planes, I could ask whether you would fly with a plane that had (only) been unit tested? Probably not, as it is never the unit that fails, but always the integration. Should we test all integrations then? Yes, why not?

The use of planes or pilots (or whatever) may have been a bad analogy. My point was that I do not see software development as complex.

2019-07-05 20:12 UTC

Karsten, if we, as an industry, are doing it wrong, then why are we doing that?

And what should we be doing instead?

2019-07-06 16:00 UTC

Full binary tree catamorphism

Monday, 24 June 2019 06:00:00 UTC

The catamorphism for a full binary tree is a pair of functions.

This article is part of an article series about catamorphisms. A catamorphism is a universal abstraction that describes how to digest a data structure into a potentially more compact value.

This article presents the catamorphism for a full binary tree, as well as how to identify it. The beginning of this article presents the catamorphism in C#, with examples. The rest of the article describes how to deduce the catamorphism. This part of the article presents my work in Haskell. Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix.

A full binary tree (also known as a proper or plane binary tree) is a tree in which each node has either two or no branches.

A full binary tree example diagram, with each node containing integers.

The diagram shows an example of a tree of integers. The root node's left branch contains two children, of which the right one again contains two sub-branches. The rest of the nodes are leaf nodes with no sub-branches.

C# catamorphism #

As a C# representation of a full binary tree, I'll start with the IBinaryTree<T> API from A Visitor functor. The catamorphism is the Accept method:

TResult Accept<TResult>(IBinaryTreeVisitor<T, TResult> visitor);

So far in this article series, you've mostly seen Church-encoded catamorphisms, so a catamorphism represented as a Visitor may be too big of a cognitive leap. We know, however, from Visitor as a sum type that a Visitor representation is isomorphic to a Church encoding. Since these are isomorphic, it's possible to refactor IBinaryTree<T> to a Church encoding. The GitHub repository contains a series of commits that demonstrates how that refactoring works. Once you're done, you arrive at this Match method, which is the refactored Accept method:

TResult Match<TResult>(Func<TResult, T, TResult, TResult> node, Func<T, TResult> leaf);

This method takes a pair of functions as arguments. The node function deals with an internal node in the tree (the blue nodes in the above diagram), whereas the leaf function deals with the leaf nodes (the green nodes in the diagram).

The leaf function may be the easiest one to understand. A leaf node only contains a value of the type T, so the only operation the function has to support is translating the T value to a TResult value. This is also the premise of the Leaf class' implementation of the method:

private readonly T item;
 
public TResult Match<TResult>(Func<TResult, T, TResult, TResult> node, Func<T, TResult> leaf)
{
    return leaf(item);
}

The node function is more tricky. It takes three input arguments, of the types TResult, T, and TResult. The roles of these are respectively left, item, and right. This is a typical representation of a binary node. Since there's always a left and a right branch, you put the node's value in the middle. As was the case with the tree catamorphism, the catamorphism function receives the branches as already-translated values; that is, both the left and right branch have already been translated to TResult when node is called. While it looks like magic, as always it's just the result of recursion:

private readonly IBinaryTree<T> left;
private readonly T item;
private readonly IBinaryTree<T> right;
 
public TResult Match<TResult>(Func<TResult, T, TResult, TResult> node, Func<T, TResult> leaf)
{
    return node(left.Match(node, leaf), item, right.Match(node, leaf));
}

This is the Node<T> class implementation of the Match method. It calls node and returns whatever node returns, but notice that, as the left and right arguments, it first, recursively, calls left.Match and right.Match. This is how it can call node with the translated branches, as well as with the basic item.

The recursion stops and unwinds on left and right whenever one of those is a Leaf instance.

Examples #

You can use Match to implement most other behaviour you'd like IBinaryTree<T> to have. In the original article on the full binary tree functor you saw how to implement Select with a Visitor, but now that the API is Church-encoded, you can derive Select from Match:

public static IBinaryTree<TResult> Select<TResult, T>(
    this IBinaryTree<T> tree,
    Func<T, TResult> selector)
{
    if (tree == null)
        throw new ArgumentNullException(nameof(tree));
    if (selector == null)
        throw new ArgumentNullException(nameof(selector));
 
    return tree.Match(
        node: (l, x, r) => Create(l, selector(x), r),
        leaf: x => Leaf(selector(x)));
}

In the leaf case, the Select method simply calls selector with the x value it receives, and puts the resulting TResult object into a new Leaf object.

In the node case, the lambda expression receives three arguments: l and r are the already-translated left and right branches, so you only need to call selector on x and call the Create helper method to produce a new Node object.

You can also implement more specialised functionality, like calculating the sum of nodes, measuring the depth of the tree, and similar functions. You saw equivalent examples in the previous article.

For the examples in this article, I'll use the tree shown in the above diagram. Using static helper methods, you can write it like this:

var tree =
    BinaryTree.Create(
        BinaryTree.Create(
            BinaryTree.Leaf(42),
            1337,
            BinaryTree.Create(
                BinaryTree.Leaf(2112),
                5040,
                BinaryTree.Leaf(1984))),
        2,
        BinaryTree.Leaf(90125));

To calculate the sum of all nodes, you can write a function like this:

public static int Sum(this IBinaryTree<int> tree)
{
    return tree.Match((l, x, r) => l + x + r, x => x);
}

The leaf function just returns the value of the node, while the node function adds the numbers together. It works for the above tree:

> tree.Sum()
100642

To find the maximum value, you can write another extension method:

public static int Max(this IBinaryTree<int> tree)
{
    return tree.Match((l, x, r) => Math.Max(Math.Max(l, r), x), x => x);
}

Again, the leaf function just returns the value of the node. The node function receives the value of the current node x, as well as the already-found maximum value of the left branch and the right branch; it then returns the maximum of these three values:

> tree.Max()
90125

As was also the case for trees, both of these operations are part of the standard repertoire available via a data structure's fold. That's not the case for the next two functions, which can't be implemented using a fold, but which can be defined with the catamorphism. The first is a function to count the leaves of a tree:

public static int CountLeaves<T>(this IBinaryTree<T> tree)
{
    return tree.Match((l, _, r) => l + r, _ => 1);
}

Since the leaf function handles a leaf node, the number of leaf nodes in a leaf node is, by definition, one. Thus, that function can ignore the value of the node and always return 1. The node function, on the other hand, receives the number of leaf nodes on the left-hand side (l), the value of the current node, and the number of leaf nodes on the right-hand side (r). Since an internal node is never a leaf node, it doesn't count; instead, just add l and r together. Notice that, again, the value of the node itself is irrelevant.

How many leaf nodes does the above tree have?

> tree.CountLeaves()
4

You can also measure the maximum depth of a tree:

public static int MeasureDepth<T>(this IBinaryTree<T> tree)
{
    return tree.Match((l, _, r) => 1 + Math.Max(l, r), _ => 0);
}

Like in the previous article, I've arbitrarily decided that the depth of a leaf node is zero; therefore, the leaf function always returns 0. The node function receives the depth of the left and right branches, and returns the maximum of those two values, plus one, since the current node adds one level of depth.

> tree.MeasureDepth()
3

You may not have much need for working with full binary trees in your normal, day-to-day C# work, but I found it worthwhile to include this example for a couple of reasons. First, because the original version of the API shows that a catamorphism may be hiding in a Visitor. Second, because binary trees are interesting, in that they're foldable functors, but not monads.

Where does the catamorphism come from, though? How can you trust that the Match method is the catamorphism?

Binary tree F-Algebra #

As in the previous article, I'll use Fix and cata as explained in Bartosz Milewski's excellent article on F-Algebras.

As always, start with the underlying endofunctor. You can think of this one as a specialisation of the rose tree from the previous article:

data FullBinaryTreeF a c = LeafF a | NodeF c a c deriving (Show, Eq, Read)
 
instance Functor (FullBinaryTreeF a) where
  fmap _     (LeafF x) = LeafF x
  fmap f (NodeF l x r) = NodeF (f l) x (f r)

As usual, I've called the 'data' type a and the carrier type c (for carrier). The Functor instance as usual translates the carrier type; the fmap function has the type (c -> c1) -> FullBinaryTreeF a c -> FullBinaryTreeF a c1.

As was the case when deducing the recent catamorphisms, Haskell isn't too happy about defining instances for a type like Fix (FullBinaryTreeF a). To address that problem, you can introduce a newtype wrapper:

newtype FullBinaryTreeFix a =
  FullBinaryTreeFix { unFullBinaryTreeFix :: Fix (FullBinaryTreeF a) }
  deriving (Show, Eq, Read)

You can define Functor, Foldable, and Traversable instances (but not Monad) for this type without resorting to any funky GHC extensions. Keep in mind that ultimately, the purpose of all this code is just to figure out what the catamorphism looks like. This code isn't intended for actual use.

A pair of helper functions make it easier to define FullBinaryTreeFix values:

fbtLeafF :: a -> FullBinaryTreeFix a
fbtLeafF = FullBinaryTreeFix . Fix . LeafF
 
fbtNodeF :: FullBinaryTreeFix a -> a -> FullBinaryTreeFix a -> FullBinaryTreeFix a
fbtNodeF (FullBinaryTreeFix l) x (FullBinaryTreeFix r) = FullBinaryTreeFix $ Fix $ NodeF l x r

In order to distinguish these helper functions from the ones that create TreeFix a values, I prefixed them with fbt (for Full Binary Tree). fbtLeafF creates a leaf node:

Prelude Fix FullBinaryTree> fbtLeafF "fnaah"
FullBinaryTreeFix {unFullBinaryTreeFix = Fix (LeafF "fnaah")}

fbtNodeF is a helper function to create an internal node:

Prelude Fix FullBinaryTree> fbtNodeF (fbtLeafF 1337) 42 (fbtLeafF 2112)
FullBinaryTreeFix {unFullBinaryTreeFix = Fix (NodeF (Fix (LeafF 1337)) 42 (Fix (LeafF 2112)))}

The FullBinaryTreeFix type, or rather the underlying FullBinaryTreeF a functor, is all you need to identify the catamorphism.

Haskell catamorphism #

At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (FullBinaryTreeF a), and an object c, but you still need to find a morphism FullBinaryTreeF a c -> c. Notice that the algebra you have to find is the function that reduces the functor to its carrier type c, not the 'data type' a. This takes some time to get used to, but that's how catamorphisms work. This doesn't mean, however, that you get to ignore a, as you'll see.

As in the previous articles, start by writing a function that will become the catamorphism, based on cata:

fullBinaryTreeF = cata alg . unFullBinaryTreeFix
  where alg     (LeafF x) = undefined
        alg (NodeF l x r) = undefined

While this compiles, with its undefined implementation of alg, it obviously doesn't do anything useful. I find, however, that it helps me think. How can you return a value of the type c from alg? You could pass a function argument to the fullBinaryTreeF function and use it with x:

fullBinaryTreeF fl = cata alg . unFullBinaryTreeFix
  where alg     (LeafF x) = fl x
        alg (NodeF l x r) = undefined

I called the function fl for function, leaf, because we're also going to need a function for the NodeF case:

fullBinaryTreeF :: (c -> a -> c -> c) -> (a -> c) -> FullBinaryTreeFix a -> c
fullBinaryTreeF fn fl = cata alg . unFullBinaryTreeFix
  where alg     (LeafF x) = fl x
        alg (NodeF l x r) = fn l x r

This works. Since cata has the type Functor f => (f a -> a) -> Fix f -> a, that means that alg has the type f a -> a. In the case of FullBinaryTreeF, the compiler infers that the alg function has the type FullBinaryTreeF a c -> c, which is just what you need!

You can now see what the carrier type c is for. It's the type that the algebra extracts, and thus the type that the catamorphism returns.
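
A quick sanity check in GHCi (my example): here the carrier type is String, and the catamorphism flattens a small tree from left to right:

Prelude Fix FullBinaryTree> fullBinaryTreeF (\l x r -> l ++ show x ++ r) show (fbtNodeF (fbtLeafF 1337) 42 (fbtLeafF 2112))
"1337422112"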

This, then, is the catamorphism for a full binary tree. As always, it's not the only possible catamorphism, since you can easily reorder the arguments to fullBinaryTreeF, fn, and fl. These would all be isomorphic, though.

Basis #

You can implement most other useful functionality with fullBinaryTreeF. Here's the Functor instance:

instance Functor FullBinaryTreeFix where
  fmap f = fullBinaryTreeF (\l x r -> fbtNodeF l (f x) r) (fbtLeafF . f)

The fl function first invokes f, followed by fbtLeafF. The fn function uses the fbtNodeF helper function to create a new internal node. l and r are already-translated branches, so you just need to call f with the node value x.
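
A quick GHCi interaction (my example) shows the instance at work:

Prelude Fix FullBinaryTree> fmap (* 2) (fbtNodeF (fbtLeafF 1337) 42 (fbtLeafF 2112))
FullBinaryTreeFix {unFullBinaryTreeFix = Fix (NodeF (Fix (LeafF 2674)) 84 (Fix (LeafF 4224)))}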

There's no Monad instance for binary trees, because you can't flatten a binary tree of binary trees. You can, on the other hand, define a Foldable instance:

instance Foldable FullBinaryTreeFix where
  foldMap f = fullBinaryTreeF (\l x r -> l <> f x <> r) f

The f function passed to foldMap has the type Monoid m => (a -> m), so the fl function that handles leaf nodes simply calls f with the contents of the node. The fn function receives two branches already translated to m, so it just has to call f with x and combine all the m values using the <> operator.

The Traversable instance follows right on the heels of Foldable:

instance Traversable FullBinaryTreeFix where
  sequenceA = fullBinaryTreeF (liftA3 fbtNodeF) (fmap fbtLeafF)
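
To see the Traversable instance in action (my example), you can traverse a tree with a function that returns Maybe values; a single Nothing short-circuits the whole traversal:

Prelude Fix FullBinaryTree> traverse (\x -> if x > 0 then Just x else Nothing) (fbtNodeF (fbtLeafF 42) 1337 (fbtLeafF 2112))
Just (FullBinaryTreeFix {unFullBinaryTreeFix = Fix (NodeF (Fix (LeafF 42)) 1337 (Fix (LeafF 2112)))})
Prelude Fix FullBinaryTree> traverse (\x -> if x > 0 then Just x else Nothing) (fbtNodeF (fbtLeafF 42) (-1) (fbtLeafF 2112))
Nothing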

There are operations on binary trees that you can implement with a fold, but some that you can't. Consider the tree shown in the diagram at the beginning of the article. This is also the tree that the above C# examples use. In Haskell, using FullBinaryTreeFix, you can define that tree like this:

tree = 
  fbtNodeF
    (fbtNodeF
      (fbtLeafF 42)
      1337
      (fbtNodeF
        (fbtLeafF 2112)
        5040
        (fbtLeafF 1984)))
    2
    (fbtLeafF 90125)

Since FullBinaryTreeFix is Foldable, and that type class already comes with sum and maximum functions, no further work is required to repeat the first two of the above C# examples:

Prelude Fix FullBinaryTree> sum tree
100642
Prelude Fix FullBinaryTree> maximum tree
90125

Counting leaves, or measuring the depth of a tree, on the other hand, is impossible with the Foldable instance, but can be implemented using the catamorphism:

countLeaves :: Num n => FullBinaryTreeFix a -> n
countLeaves = fullBinaryTreeF (\l _ r -> l + r) (const 1)
 
treeDepth :: (Ord n, Num n) => FullBinaryTreeFix a -> n
treeDepth = fullBinaryTreeF (\l _ r -> 1 + max l r) (const 0)

The reasoning is the same as already explained in the above C# examples. The functions also produce the same results:

Prelude Fix FullBinaryTree> countLeaves tree
4
Prelude Fix FullBinaryTree> treeDepth tree
3

This, hopefully, illustrates that the catamorphism is more capable, and that the fold is just a (list-biased) specialisation.
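
One more way to see the limitation (my example): the Foldable instance only exposes the values in the tree, not its structure, so the generic length function can only count all seven values; it has no way to tell the four leaves apart from the three internal nodes:

Prelude Fix FullBinaryTree> length tree
7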

Summary #

The catamorphism for a full binary tree is a pair of functions. One function handles internal nodes, while the other function handles leaf nodes.

I thought it was interesting to show this example for two reasons: First, the original example was a Visitor implementation, and I think it's worth realising that a Visitor's Accept method can also be viewed as a catamorphism. Second, a binary tree is an example of a data structure that has a fold, but isn't a monad.

All articles in the article series have, so far, covered data structures well-known from computer science. The next example will, on the other hand, demonstrate that even completely ad-hoc domain-specific data structures have catamorphisms.

Next: Payment types catamorphism.

