A rose tree is a tree with leaf nodes of one type, and internal nodes of another.

This article is part of a series of articles about Church encoding. In the previous articles, you've seen how to implement a Maybe container, and how to implement an Either container. Through these examples, you've learned how to model sum types without explicit language support. In this article, you'll see how to model a rose tree.

A rose tree is a general-purpose data structure where each node in a tree has an associated value. Each node can have an arbitrary number of branches, including none. The distinguishing feature from a rose tree and just any tree is that internal nodes can hold values of a different type than leaf values.

A rose tree example diagram, with internal nodes containing integers, and leaves containing strings.

The diagram shows an example of a tree of internal integers and leaf strings. All internal nodes contain integer values, and all leaves contain strings. Each node can have an arbitrary number of branches.

Contract #

In C#, you can represent the fundamental structure of a rose tree with a Church encoding, starting with an interface:

public interface IRoseTree<NL>
{
    TResult Match<TResult>(
        Func<NIEnumerable<IRoseTree<NL>>, TResult> node,
        Func<LTResult> leaf);
}

The structure of a rose tree includes two mutually exclusive cases: internal nodes and leaf nodes. Since there's two cases, the Match method takes two arguments, one for each case.

The interface is generic, with two type arguments: N (for Node) and L (for leaf). Any consumer of an IRoseTree<N, L> object must supply two functions when calling the Match method: a function that turns a node into a TResult value, and a function that turns a leaf into a TResult value.

Both cases must have a corresponding implementation.

Leaves #

The leaf implementation is the simplest:

public sealed class RoseLeaf<NL> : IRoseTree<NL>
{
    private readonly L value;
 
    public RoseLeaf(L value)
    {
        this.value = value;
    }
 
    public TResult Match<TResult>(
        Func<NIEnumerable<IRoseTree<NL>>, TResult> node,
        Func<LTResult> leaf)
    {
        return leaf(value);
    }
 
    public override bool Equals(object obj)
    {
        if (!(obj is RoseLeaf<NL> other))
            return false;
 
        return Equals(value, other.value);
    }
 
    public override int GetHashCode()
    {
        return value.GetHashCode();
    }
}

The RoseLeaf class is an Adapter over a value of the generic type L. As is always the case with Church encoding, it implements the Match method by unconditionally calling one of the arguments, in this case the leaf function, with its adapted value.

While it doesn't have to do this, it also overrides Equals and GetHashCode. This is an immutable class, so it's a great candidate to be a Value Object. Making it a Value Object makes it easier to compare expected and actual values in unit tests, among other benefits.

Nodes #

The node implementation is slightly more complex:

public sealed class RoseNode<NL> : IRoseTree<NL>
{
    private readonly N value;
    private readonly IEnumerable<IRoseTree<NL>> branches;
 
    public RoseNode(N value, IEnumerable<IRoseTree<NL>> branches)
    {
        this.value = value;
        this.branches = branches;
    }
 
    public TResult Match<TResult>(
        Func<NIEnumerable<IRoseTree<NL>>, TResult> node,
        Func<LTResult> leaf)
    {
        return node(value, branches);
    }
 
    public override bool Equals(object obj)
    {
        if (!(obj is RoseNode<NL> other))
            return false;
 
        return Equals(value, other.value)
            && Enumerable.SequenceEqual(branches, other.branches);
    }
 
    public override int GetHashCode()
    {
        return value.GetHashCode() ^ branches.GetHashCode();
    }
}

A node contains both a value (of the type N) and a collection of sub-trees, or branches. The class implements the Match method by unconditionally calling the node function argument with its constituent values.

Again, it overrides Equals and GetHashCode for the same reasons as RoseLeaf. This isn't required to implement Church encoding, but makes comparison and unit testing easier.

Usage #

You can use the RoseLeaf and RoseNode constructors to create new trees, but it sometimes helps to have a static helper method to create values. It turns out that there's little value in a helper method for leaves, but for nodes, it's marginally useful:

public static IRoseTree<NL> Node<NL>(N value, params IRoseTree<NL>[] branches)
{
    return new RoseNode<NL>(value, branches);
}

This enables you to create tree objects, like this:

IRoseTree<stringint> tree =
    RoseTree.Node("foo"new RoseLeaf<stringint>(42), new RoseLeaf<stringint>(1337));

That's a single node with the label "foo" and two leaves with the values 42 and 1337, respectively. You can create the tree shown in the above diagram like this:

IRoseTree<intstring> exampleTree =
    RoseTree.Node(42,
        RoseTree.Node(1337,
            new RoseLeaf<intstring>("foo"),
            new RoseLeaf<intstring>("bar")),
        RoseTree.Node(2112,
            RoseTree.Node(90125,
                new RoseLeaf<intstring>("baz"),
                new RoseLeaf<intstring>("qux"),
                new RoseLeaf<intstring>("quux")),
            new RoseLeaf<intstring>("quuz")),
        new RoseLeaf<intstring>("corge"));

You can add various extension methods to implement useful functionality. In later articles, you'll see some more compelling examples, so here, I'm only going to show a few basic examples. One of the simplest features you can add is a method that will tell you if an IRoseTree<N, L> object is a node or a leaf:

public static IChurchBoolean IsLeaf<NL>(this IRoseTree<NL> source)
{
    return source.Match<IChurchBoolean>(
        node: (_, __) => new ChurchFalse(),
        leaf: _ => new ChurchTrue());
}
 
public static IChurchBoolean IsNode<NL>(this IRoseTree<NL> source)
{
    return new ChurchNot(source.IsLeaf());
}

Since this article is part of the overall article series on Church encoding, and the purpose of that article series is also to show how basic language features can be created from Church encodings, these two methods return Church-encoded Boolean values instead of the built-in bool type. I'm sure you can imagine how you could change the type to bool if you'd like.

You can use these methods like this:

> IRoseTree<Guiddouble> tree = new RoseLeaf<Guiddouble>(-3.2);
> tree.IsLeaf()
ChurchTrue { }
> tree.IsNode()
ChurchNot(ChurchTrue)
> IRoseTree<longstring> tree = RoseTree.Node<longstring>(42);
> tree.IsLeaf()
ChurchFalse { }
> tree.IsNode()
ChurchNot(ChurchFalse)

In a future article, you'll see some more compelling examples.

Terminology #

It's not entirely clear what to call a tree like the one shown here. The Wikipedia entry doesn't state one way or the other whether internal node types ought to be distinguishable from leaf node types, but there are indications that this could be the case. At least, it seems that the term isn't well-defined, so I took the liberty to retcon the name rose tree to the data structure shown here.

In the paper that introduces the rose tree term, Meertens writes:

"We consider trees whose internal nodes may fork into an arbitrary (natural) number of sub-trees. (If such a node has zero descendants, we still consider it internal.) Each external node carries a data item. No further information is stored in the tree; in particular, internal nodes are unlabelled."

First Steps towards the Theory of Rose Trees, Lambert Meertens, 1988
While the concept is foreign in C#, you can trivially introduce a unit data type:

public class Unit
{
    public readonly static Unit Instance = new Unit();
 
    private Unit() { }
}

This enables you to create a rose tree according to Meertens' definition:

IRoseTree<Unitint> meertensTree =
    RoseTree.Node(Unit.Instance,
        RoseTree.Node(Unit.Instance,
            RoseTree.Node(Unit.Instance,
                new RoseLeaf<Unitint>(2112)),
            new RoseLeaf<Unitint>(42),
            new RoseLeaf<Unitint>(1337),
            new RoseLeaf<Unitint>(90125)),
        RoseTree.Node(Unit.Instance,
            new RoseLeaf<Unitint>(1984)),
        new RoseLeaf<Unitint>(666));

Visually, you could draw it like this:

A Meertens rose tree example diagram, with leaves containing integers.

Thus, the tree structure shown here seems to be a generalisation of Meertens' original definition.

I'm not a mathematician, so I may have misunderstood some things. If you have a better name than rose tree for the data structure shown here, please leave a comment.

Yeats #

Now that we're on the topic of rose tree as a term, you may, as a bonus, enjoy a similarly-titled poem:

THE ROSE TREE

"O words are lightly spoken"
Said Pearse to Connolly,
"Maybe a breath of politic words
Has withered our Rose Tree;
Or maybe but a wind that blows
Across the bitter sea."

"It needs to be but watered,"
James Connolly replied,
"To make the green come out again
And spread on every side,
And shake the blossom from the bud
To be the garden's pride."

"But where can we draw water"
Said Pearse to Connolly,
"When all the wells are parched away?
O plain as plain can be
There's nothing but our own red blood
Can make a right Rose Tree."

As far as I can tell, though, Yeats' metaphor is dissimilar to Meertens'.

Summary #

You may occasionally find use for a tree that distinguishes between internal and leaf nodes. You can model such a tree with a Church encoding, as shown in this article.

Next: Catamorphisms.


Comments

If you have a better name than rose tree for the data structure shown here, please leave a comment.

I would consider using heterogeneous rose tree.

In your linked Twitter thread, Keith Battocchi shared a link to the thesis of Jeremy Gibbons (which is titled Algebras for Tree Algorithms). In his thesis, he defines rose tree as you have and then derives from that (on page 45) the special cases that he calls unlabelled rose tree, leaf-labeled rose tree, branch-labeled rose tree, and homogeneous rose tree.

The advantage of Jeremy's approach is that the name of each special case is formed from the named of the general case by adding an adjective. The disadvantage is the ambiguity that comes from being inconsistent with the definition of rose tree compared to both previous works and current usage (as shown by your numerous links).

The goal of naming is communication, and the name rose tree risks miscommunication, which is worse than no communication at all. Miscommunication would result by (for example) calling this heterogeneous rose tree a rose tree and someone that knows the rose tree definition in Haskell skims over your definition thinking that they already know it. However, I think you did a good job with this article and made that unlikely.

The advantage of heterogeneous rose tree is that the name is not overloaded and heterogeneous clearly indicates the intended variant. If a reader has heard of a rose tree, then they probably know there are several variants and can infer the correct one from this additional adjective.

In the end though, I think using the name rose tree as you did was a good choice. Your have now written several articles involving rose trees and they all use the same variant. Since you always use the same variant, it would be a bit verbose to always include an additional adjective to specify the variant.

The only thing I would have considered changing is the first mention of rose tree in this article. It is common in academic writing to start with the general definition and then give shorter alternatives for brevity. This is one way it could have been written in this article.

A heterogeneous rose tree is a tree with leaf nodes of one type, and internal nodes of another.

This article is part of a series of articles about Church encoding. In the previous articles, you've seen how to implement a Maybe container, and how to implement an Either container. Through these examples, you've learned how to model sum types without explicit language support. In this article, you'll see how to model a heterogeneous rose tree.

A heterogeneous rose tree is a general-purpose data structure where each node in a tree has an associated value. Each node can have an arbitrary number of branches, including none. The distinguishing feature from a heterogeneous rose tree and just any tree is that internal nodes can hold values of a different type than leaf values. For brevity, we will omit heterogeneous and simply call this data structure a rose tree.

2020-08-03 02:02 UTC

Tyson, thank you for writing. I agree that specificity is desirable. I haven't read the Gibbons paper, so I'll have to reflect on your summary. If I understand you correctly, a type like IRoseTree shown in this article constitutes the general case. In Haskell, I simply named it Tree a b, which is probably too general, but may help to illustrate the following point.

As far as I remember, C# doesn't have type aliases, so Haskell makes the point more succinct. If I understand you correctly, then, you could define a heterogeneous rose tree as:

type HeterogeneousRoseTree = Tree

Furthermore, I suppose that a leaf-labeled rose tree is this:

type LeafLabeledRoseTree b = Tree () b

Would the following be a branch-labeled rose tree?

type BranchLabeledRoseTree a = Tree a ()

And this is, I suppose, a homogeneous rose tree:

type HomogeneousRoseTree a = Tree a a

I can't imagine what an unlabelled rose tree is, unless it's this:

type UnlabelledRoseTree = Tree () ()

I don't see how that'd be of much use, but I suppose that's just my lack of imagination.

2020-08-09 15:23 UTC


Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Monday, 29 July 2019 13:14:00 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!
Published: Monday, 29 July 2019 13:14:00 UTC