Rose tree bifunctor

A rose tree forms a bifunctor. An article for object-oriented developers.

This article is an instalment in an article series about bifunctors. While the overview article explains that there's essentially two practically useful bifunctors, here's a third one. rose trees.

Mapping both dimensions #

Like in the previous article on the Either bifunctor, I'll start by implementing the simultaneous two-dimensional translation SelectBoth:

public static IRoseTree<N1, L1> SelectBoth<N, N1, L, L1>(
    this IRoseTree<N, L> source,
    Func<N, N1> selectNode,
    Func<L, L1> selectLeaf)
{
    return source.Cata(
        node: (n, branches) => new RoseNode<N1, L1>(selectNode(n), branches),
        leaf: l => (IRoseTree<N1, L1>)new RoseLeaf<N1, L1>(selectLeaf(l)));
}

This article uses the previously shown Church-encoded rose tree and its catamorphism Cata.

In the leaf case, the l argument received by the lambda expression is an object of the type L, since the source tree is an IRoseTree<N, L> object; i.e. a tree with leaves of the type L and nodes of the type N. The selectLeaf argument is a function that converts an L object to an L1 object. Since l is an L object, you can call selectLeaf with it to produce an L1 object. You can use this resulting object to create a new RoseLeaf<N1, L1>. Keep in mind that while the RoseLeaf class requires two type arguments, it never requires an object of its N type argument, which means that you can create an object with any node type argument, including N1, even if you don't have an object of that type.

In the node case, the lambda expression receives two objects: n and branches. The n object has the type N, while the branches object has the type IEnumerable<IRoseTree<N1, L1>>. In other words, the branches have already been translated to the desired result type. That's how the catamorphism works. This means that you only have to figure out how to translate the N object n to an N1 object. The selectNode function argument can do that, so you can then create a new RoseNode<N1, L1> and return it.

This works as expected:

> var tree = RoseTree.Node("foo", new RoseLeaf<string, int>(42), new RoseLeaf<string, int>(1337));
> tree
RoseNode<string, int>("foo", IRoseTree<string, int>[2] { 42, 1337 })
> tree.SelectBoth(s => s.Length, i => i.ToString())
RoseNode<int, string>(3, IRoseTree<int, string>[2] { "42", "1337" })

This C# Interactive example shows how to convert a tree with internal string nodes and integer leaves to a tree of internal integer nodes and string leaves. The strings are converted to strings by counting their Length, while the integers are turned into strings using the standard ToString method available on all objects.

Mapping nodes #

When you have SelectBoth, you can trivially implement the translations for each dimension in isolation. For tuple bifunctors, I called these methods SelectFirst and SelectSecond, while for Either bifunctors, I chose to name them SelectLeft and SelectRight. Continuing the trend of naming the translations after what they translate, instead of their positions, I'll name the corresponding methods here SelectNode and SelectLeaf. In Haskell, the functions associated with Data.Bifunctor are always called first and second, but I see no reason to preserve such abstract naming in C#. In Haskell, these functions are part of the Bifunctor type class; the abstract names serve an actual purpose. This isn't the case in C#, so there's no reason to retain the abstract names. You might as well use names that communicate intent, which is what I've tried to do here.

If you want to map only the internal nodes, you can implement a SelectNode method based on SelectBoth:

public static IRoseTree<N1, L> SelectNode<N, N1, L>(
    this IRoseTree<N, L> source,
    Func<N, N1> selector)
{
    return source.SelectBoth(selector, l => l);
}

This simply uses the l => l lambda expression as an ad-hoc identity function, while passing selector as the selectNode argument to the SelectBoth method.

You can use this to map the above tree to a tree made entirely of numbers:

> var tree = RoseTree.Node("foo", new RoseLeaf<string, int>(42), new RoseLeaf<string, int>(1337));
> tree.SelectNode(s => s.Length)
RoseNode<int, int>(3, IRoseTree<int, int>[2] { 42, 1337 })

Such a tree is, incidentally, isomorphic to a 'normal' tree. It might be a good exercise, if you need one, to demonstrate the isormorphism by writing functions that convert a Tree<T> into an IRoseTree<T, T>, and vice versa.

Mapping leaves #

Similar to SelectNode, you can also trivially implement SelectLeaf:

public static IRoseTree<N, L1> SelectLeaf<N, L, L1>(
    this IRoseTree<N, L> source,
    Func<L, L1> selector)
{
    return source.SelectBoth(n => n, selector);
}

This is another one-liner calling SelectBoth, with the difference that the identity function n => n is passed as the first argument, instead of as the last. This ensures that only RoseLeaf values are mapped:

> var tree = RoseTree.Node("foo", new RoseLeaf<string, int>(42), new RoseLeaf<string, int>(1337));
> tree.SelectLeaf(i => i % 2 == 0)
RoseNode<string, bool>("foo", IRoseTree<string, bool>[2] { true, false })

In the above C# Interactive session, the leaves are mapped to Boolean values, indicating whether they're even or odd.

Identity laws #

Rose trees obey all the bifunctor laws. While it's formal work to prove that this is the case, you can get an intuition for it via examples. Often, I use a property-based testing library like FsCheck or Hedgehog to demonstrate (not prove) that laws hold, but in this article, I'll keep it simple and only cover each law with a parametrised test.

private static T Id<T>(T x) => x;
 
public static IEnumerable<object[]> BifunctorLawsData
{
    get
    {
        yield return new[] { new RoseLeaf<int, string>("") };
        yield return new[] { new RoseLeaf<int, string>("foo") };
        yield return new[] { RoseTree.Node<int, string>(42) };
        yield return new[] { RoseTree.Node(42, new RoseLeaf<int, string>("bar")) };
        yield return new[] { exampleTree };
    }
}
 
[Theory, MemberData(nameof(BifunctorLawsData))]
public void SelectNodeObeysFirstFunctorLaw(IRoseTree<int, string> t)
{
    Assert.Equal(t, t.SelectNode(Id));
}

This test uses xUnit.net's [Theory] feature to supply a small set of example input values. The input values are defined by the BifunctorLawsData property, since I'll reuse the same values for all the bifunctor law demonstration tests. The exampleTree object is the tree shown in Church-encoded rose tree.

The tests also use the identity function implemented as a private function called Id, since C# doesn't come equipped with such a function in the Base Class Library.

For all the IRoseTree<int, string> objects t, the test simply verifies that the original tree t is equal to the tree projected over the first axis with the Id function.

Likewise, the first functor law applies when translating over the second dimension:

[Theory, MemberData(nameof(BifunctorLawsData))]
public void SelectLeafObeysFirstFunctorLaw(IRoseTree<int, string> t)
{
    Assert.Equal(t, t.SelectLeaf(Id));
}

This is the same test as the previous test, with the only exception that it calls SelectLeaf instead of SelectNode.

Both SelectNode and SelectLeaf are implemented by SelectBoth, so the real test is whether this method obeys the identity law:

[Theory, MemberData(nameof(BifunctorLawsData))]
public void SelectBothObeysIdentityLaw(IRoseTree<int, string> t)
{
    Assert.Equal(t, t.SelectBoth(Id, Id));
}

Projecting over both dimensions with the identity function does, indeed, return an object equal to the input object.

Consistency law #

In general, it shouldn't matter whether you map with SelectBoth or a combination of SelectNode and SelectLeaf:

[Theory, MemberData(nameof(BifunctorLawsData))]
public void ConsistencyLawHolds(IRoseTree<int, string> t)
{
    DateTime f(int i) => new DateTime(i);
    bool g(string s) => string.IsNullOrWhiteSpace(s);
 
    Assert.Equal(t.SelectBoth(f, g), t.SelectLeaf(g).SelectNode(f));
    Assert.Equal(
        t.SelectNode(f).SelectLeaf(g),
        t.SelectLeaf(g).SelectNode(f));
}

This example creates two local functions f and g. The first function, f, creates a new DateTime object from an integer, using one of the DateTime constructor overloads. The second function, g, just delegates to string.IsNullOrWhiteSpace, although I want to stress that this is just an example. The law should hold for any two (pure) functions.

The test then verifies that you get the same result from calling SelectBoth as when you call SelectNode followed by SelectLeaf, or the other way around.

Composition laws #

The composition laws insist that you can compose functions, or translations, and that again, the choice to do one or the other doesn't matter. Along each of the axes, it's just the second functor law applied. This parametrised test demonstrates that the law holds for SelectNode:

[Theory, MemberData(nameof(BifunctorLawsData))]
public void SecondFunctorLawHoldsForSelectNode(IRoseTree<int, string> t)
{
    char f(bool b) => b ? 'T' : 'F';
    bool g(int i) => i % 2 == 0;
 
    Assert.Equal(
        t.SelectNode(x => f(g(x))),
        t.SelectNode(g).SelectNode(f));
}

Here, f is a local function that returns the the character 'T' for true, and 'F' for false; g is the even function. The second functor law states that mapping f(g(x)) in a single step is equivalent to first mapping over g and then map the result of that using f.

The same law applies if you fix the first dimension and translate over the second:

[Theory, MemberData(nameof(BifunctorLawsData))]
public void SecondFunctorLawHoldsForSelectLeaf(IRoseTree<int, string> t)
{
    bool f(int x) => x % 2 == 0;
    int g(string s) => s.Length;
 
    Assert.Equal(
        t.SelectLeaf(x => f(g(x))),
        t.SelectLeaf(g).SelectLeaf(f));
}

Here, f is the even function, whereas g is a local function that returns the length of a string. Again, the test demonstrates that the output is the same whether you map over an intermediary step, or whether you map using only a single step.

This generalises to the composition law for SelectBoth:

[Theory, MemberData(nameof(BifunctorLawsData))]
public void SelectBothCompositionLawHolds(IRoseTree<int, string> t)
{
    char f(bool b) => b ? 'T' : 'F';
    bool g(int x) => x % 2 == 0;
    bool h(int x) => x % 2 == 0;
    int i(string s) => s.Length;
 
    Assert.Equal(
        t.SelectBoth(x => f(g(x)), y => h(i(y))),
        t.SelectBoth(g, i).SelectBoth(f, h));
}

Again, whether you translate in one or two steps shouldn't affect the outcome.

As all of these tests demonstrate, the bifunctor laws hold for rose trees. The tests only showcase five examples, but I hope it gives you an intuition how any rose tree is a bifunctor. After all, the SelectNode, SelectLeaf, and SelectBoth methods are all generic, and they behave the same for all generic type arguments.

Summary #

Rose trees are bifunctors. You can translate the node and leaf dimension of a rose tree independently of each other, and the bifunctor laws hold for any pure translation, no matter how you compose the projections.

As always, there can be performance differences between the various compositions, but the outputs will be the same regardless of composition.

A functor, and by extension, a bifunctor, is a structure-preserving map. This means that any projection preserves the structure of the underlying container. For rose trees this means that the shape of the tree remains the same. The number of leaves remain the same, as does the number of internal nodes.

Next: Contravariant functors.

Published: Monday, 12 August 2019 10:33:00 UTC

Rose tree bifunctor by Mark Seemann