# NonEmpty catamorphism by Mark Seemann

*The universal API for generic non-empty collections, with examples in C# and Haskell.*

This article is part of an article series about catamorphisms. A catamorphism is a universal abstraction that describes how to digest a data structure into a potentially more compact value.

I was recently doing some work that required a data structure like a collection, but with the additional constraint that it should be guaranteed to have at least one element. I've known about Haskell's NonEmpty type, and how to port it to C# for years. This time I needed to implement it in a third language, and since I had a little extra time available, I thought it'd be interesting to pursue a conjecture of mine: It seems as though you can implement most (all?) of a generic data structure's API based on its catamorphism.

While I could make a guess as to how a catamorphism might look for a non-empty collection, I wasn't sure. A quick web search revealed nothing conclusive, so I decided to deduce it from first principles. As this article series demonstrates, you can derive the catamorphism from a type's isomorphic F-algebra.

The beginning of this article presents the catamorphism in C#, with an example. The rest of the article describes how to deduce the catamorphism. This part of the article presents my work in Haskell. Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix.

### C# catamorphism #

This article will use a custom C# class called `NonEmptyCollection<T>`

, which is near-identical to the `NotEmptyCollection<T>`

originally introduced in the article Semigroups accumulate.

I don't know why I originally chose to name the class `NotEmptyCollection`

instead of `NonEmptyCollection`

, but it's annoyed me ever since. I've finally decided to rectify that mistake, so from now on, the name is `NonEmptyCollection`

.

The catamorphism for `NonEmptyCollection`

is this instance method:

public TResult Aggregate<TResult>(Func<T, IReadOnlyCollection<T>, TResult> algebra) { return algebra(Head, Tail); }

Because the `NonEmptyCollection`

class is really just a glorified tuple, the `algebra`

is any function which produces a single value from the two constituent values.

It's easy to fall into the trap of thinking of the catamorphism as 'reducing' the data structure to a more compact form. While this is a common kind of operation, loss of data is not inevitable. You can, for example, return a new collection, essentially doing nothing:

var nec = new NonEmptyCollection<int>(42, 1337, 2112, 666); var same = nec.Aggregate((x, xs) => new NonEmptyCollection<int>(x, xs.ToArray()));

This `Aggregate`

method enables you to safely find a maximum value:

var nec = new NonEmptyCollection<int>(42, 1337, 2112, 666); var max = nec.Aggregate((x, xs) => xs.Aggregate(x, Math.Max));

or to safely calculate an average:

var nec = new NonEmptyCollection<int>(42, 1337, 2112, 666); var average = nec.Aggregate((x, xs) => xs.Aggregate(x, (a, b) => a + b) / (xs.Count + 1.0));

Both of these two last examples use the built-in Aggregate function to accumulate the `xs`

. It uses the overload that takes a seed, for which it supplies `x`

. This means that there's guaranteed to be at least that one value.

The catamorphism given here is not unique. You can create a trivial variation by swapping the two function arguments, so that `x`

comes after `xs`

.

### NonEmpty F-algebra #

As in the previous article, I'll use `Fix`

and `cata`

as explained in Bartosz Milewski's excellent article on F-algebras.

As always, start with the underlying endofunctor:

data NonEmptyF a c = NonEmptyF { head :: a, tail :: ListFix a } deriving (Eq, Show, Read) instance Functor (NonEmptyF a) where fmap _ (NonEmptyF x xs) = NonEmptyF x xs

Instead of using Haskell's standard list (`[]`

) for the tail, I've used `ListFix`

from the article on list catamorphism. This should, hopefully, demonstrate how you can build on already established definitions derived from first principles.

Since a non-empty collection is really just a glorified tuple of *head* and *tail*, there's no recursion, and thus, the carrier type `c`

is not used. You could argue that going through all of these motions is overkill, but it still provides some insights. This is similar to the Boolean catamorphism and Maybe catamorphism.

The `fmap`

function ignores the mapping argument (often called `f`

), since the `Functor`

instance maps `NonEmptyF a c`

to `NonEmptyF a c1`

, but the `c`

or `c1`

type is not used.

As was the case when deducing the recent catamorphisms, Haskell isn't too happy about defining instances for a type like `Fix (NonEmptyF a)`

. To address that problem, you can introduce a `newtype`

wrapper:

newtype NonEmptyFix a = NonEmptyFix { unNonEmptyFix :: Fix (NonEmptyF a) } deriving (Eq, Show, Read)

You can define `Functor`

, `Applicative`

, `Monad`

, etc. instances for this type without resorting to any funky GHC extensions. Keep in mind that, ultimately, the purpose of all this code is just to figure out what the catamorphism looks like. This code isn't intended for actual use.

A helper function makes it easier to define `NonEmptyFix`

values:

createNonEmptyF :: a -> ListFix a -> NonEmptyFix a createNonEmptyF x xs = NonEmptyFix $ Fix $ NonEmptyF x xs

Here's how to use it:

ghci> createNonEmptyF 42 $ consF 1337 $ consF 2112 nilF NonEmptyFix { unNonEmptyFix = Fix (NonEmptyF 42 (ListFix (Fix (ConsF 1337 (Fix (ConsF 2112 (Fix NilF)))))))}

While this is quite verbose, keep in mind that the code shown here isn't meant to be used in practice. The goal is only to deduce catamorphisms from more basic universal abstractions, and you now have all you need to do that.

### Haskell catamorphism #

At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (`NonEmptyF a`

), and an object `c`

, but you still need to find a morphism `NonEmptyF a c -> c`

. Notice that the algebra you have to find is the function that reduces the functor to its *carrier type* `c`

, not the 'data type' `a`

. This takes some time to get used to, but that's how catamorphisms work. This doesn't mean, however, that you get to ignore `a`

, as you'll see.

As in the previous articles, start by writing a function that will become the catamorphism, based on `cata`

:

nonEmptyF = cata alg . unNonEmptyFix where alg (NonEmptyF x xs) = undefined

While this compiles, with its `undefined`

implementation of `alg`

, it obviously doesn't do anything useful. I find, however, that it helps me think. How can you return a value of the type `c`

from `alg`

? You could pass a function argument to the `nonEmptyF`

function and use it with `x`

and `xs`

:

nonEmptyF :: (a -> ListFix a -> c) -> NonEmptyFix a -> c nonEmptyF f = cata alg . unNonEmptyFix where alg (NonEmptyF x xs) = f x xs

This works. Since `cata`

has the type `Functor f => (f a -> a) -> Fix f -> a`

, that means that `alg`

has the type `f a -> a`

. In the case of `NonEmptyF`

, the compiler infers that the `alg`

function has the type `NonEmptyF a c -> c1`

, which fits the bill, since `c`

may be the same type as `c1`

.

This, then, is the catamorphism for a non-empty collection. This one is just a single function. It's still not the only possible catamorphism, since you could trivially flip the arguments to `f`

.

I've chosen this representation because the arguments `x`

and `xs`

are defined in the same order as the order of `head`

before `tail`

. Notice how this is the same order as the above C# `Aggregate`

method.

### Basis #

You can implement most other useful functionality with `nonEmptyF`

. Here's the `Semigroup`

instance and a useful helper function:

toListFix :: NonEmptyFix a -> ListFix a toListFix = nonEmptyF consF instance Semigroup (NonEmptyFix a) where xs <> ys = nonEmptyF (\x xs' -> createNonEmptyF x $ xs' <> toListFix ys) xs

The implementation uses `nonEmptyF`

to operate on `xs`

. Inside the lambda expression, it converts `ys`

to a list, and uses the `ListFix`

`Semigroup`

instance to concatenate `xs`

with it.

Here's the `Functor`

instance:

instance Functor NonEmptyFix where fmap f = nonEmptyF (\x xs -> createNonEmptyF (f x) $ fmap f xs)

Like the `Semigroup`

instance, this `fmap`

implementation uses `fmap`

on `xs`

, which is the `ListFix`

`Functor`

instance.

The `Applicative`

instance is much harder to write from scratch (or, at least, I couldn't come up with a simpler way):

instance Applicative NonEmptyFix where pure x = createNonEmptyF x nilF liftA2 f xs ys = nonEmptyF (\x xs' -> nonEmptyF (\y ys' -> createNonEmptyF (f x y) (liftA2 f (consF x nilF) ys' <> liftA2 f xs' (consF y ys'))) ys) xs

While that looks complicated, it's not *that* bad. It uses `nonEmptyF`

to 'loop' over the `xs`

, and then a nested call to `nonEmptyF`

to 'loop' over the `ys`

. The inner lambda expression uses `f x y`

to calculate the head, but it also needs to calculate all other combinations of values in `xs`

and `ys`

.

First, it keeps `x`

fixed and 'loops' through all the remaining `ys'`

; that's the `liftA2 f (consF x nilF) ys'`

part:

Then it 'loops' over all the remaining `xs'`

and all the `ys`

; that is, `liftA2 f xs' (consF y ys')`

.

The two `liftA2`

functions apply to the `ListFix`

`Applicative`

instance.

You'll be happy to see, I think, that the `Monad`

instance is simpler:

instance Monad NonEmptyFix where xs >>= f = nonEmptyF (\x xs' -> nonEmptyF (\y ys -> createNonEmptyF y $ ys <> (xs' >>= toListFix . f)) (f x)) xs

And fortunately, `Foldable`

and `Traversable`

are even simpler:

instance Foldable NonEmptyFix where foldr f seed = nonEmptyF (\x xs -> f x $ foldr f seed xs) instance Traversable NonEmptyFix where traverse f = nonEmptyF (\x xs -> liftA2 createNonEmptyF (f x) (traverse f xs))

Finally, you can implement conversions to and from the `NonEmpty`

type from `Data.List.NonEmpty`

:

toNonEmpty :: NonEmptyFix a -> NonEmpty a toNonEmpty = nonEmptyF (\x xs -> x :| toList xs) fromNonEmpty :: NonEmpty a -> NonEmptyFix a fromNonEmpty (x :| xs) = createNonEmptyF x $ fromList xs

This demonstrates that `NonEmptyFix`

is isomorphic to `NonEmpty`

.

### Conclusion #

The catamorphism for a non-empty collection is a single function that produces a single value from the head and the tail of the collection. While it's possible to implement a 'standard fold' (`foldr`

in Haskell), the non-empty catamorphism doesn't require a seed to get started. The data structure guarantees that there's always at least one value available, and this value can then be use to 'kick off' a fold.

In C# one can define the catamorphism as the above `Aggregate`

method. You could then define all other instance functions based on `Aggregate`

.

**Next:** Either catamorphism.