Semigroups accumulate by Mark Seemann
You can accumulate an arbitrary, non-zero number of semigroup values to a single value. An article for object-oriented programmers.
This article is part of a series about semigroups. In short, a semigroup is an associative binary operation.
As you've learned in a previous article, you can accumulate an arbitrary number of monoidal values to a single value. A corresponding property holds for semigroups.
Monoid accumulation #
When an instance method Op
forms a monoid, you can easily write a function that accumulates an arbitrary number of Foo
values:
public static Foo Accumulate(IReadOnlyCollection<Foo> foos) { var acc = Identity; foreach (var f in foos) acc = acc.Op(f); return acc; }
Notice how this generally applicable algorithm starts with the Identity
value. One implication of this is that when foos
is empty, the return value will be Identity
. When Op
is a semigroup, however, there's no identity, so this doesn't quite work. You need a value to start the accumulation; something you can return if the collection is empty.
Semigroup accumulation #
From Haskell you can learn that if you have a semigroup, you can accumulate any non-empty collection:
sconcat :: Semigroup a => NonEmpty a -> a
You can read this as: for any generic type a
, when a
forms a Semigroup
, the sconcat
function takes a non-empty list of a
values, and reduces them to a single a
value. NonEmpty a
is a list with at least one element.
NotEmptyCollection #
You can also define a NotEmptyCollection<T>
class in C#:
public class NotEmptyCollection<T> : IReadOnlyCollection<T> { public NotEmptyCollection(T head, params T[] tail) { if (head == null) throw new ArgumentNullException(nameof(head)); this.Head = head; this.Tail = tail; } public T Head { get; } public IReadOnlyCollection<T> Tail { get; } public int Count { get => this.Tail.Count + 1; } public IEnumerator<T> GetEnumerator() { yield return this.Head; foreach (var item in this.Tail) yield return item; } IEnumerator IEnumerable.GetEnumerator() { return this.GetEnumerator(); } }
Because of the way the constructor is defined, you must supply at least one element in order to create an instance. You can provide any number of extra elements via the tail
array, but one is minimum.
The Count
property returns the number of elements in Tail
, plus one, because there's always a Head
value.
The GetEnumerator
method returns an iterator that always starts with the Head
value, and proceeds with all the Tail
values, if there are any.
Finding the maximum of a non-empty collection #
As you've already learned, Math.Max
is a semigroup. Although the .NET Base Class Library has built-in methods for this, you can use a generally applicable algorithm to find the maximum value in a non-empty list of integers.
private static int Accumulate(NotEmptyCollection<int> numbers) { var acc = numbers.Head; foreach (var n in numbers.Tail) acc = Math.Max(acc, n); return acc; }
Notice how similar this algorithm is to monoid accumulation! The only difference is that instead of using Identity
to get started, you can use Head
, and then loop over all elements of Tail
.
You can use it like this:
var nec = new NotEmptyCollection<int>(42, 1337, 123); var max = Accumulate(nec);
Here, max
is obviously 1337
.
As usual, however, this is much nicer, and more succinct in Haskell:
Prelude Data.Semigroup Data.List.NonEmpty> getMax $ sconcat $ fmap Max $ 42 :| [1337, 123] 1337
That's hardly the idiomatic way of getting a maximum element in Haskell, but it does show how you can 'click together' concepts in order to achieve a goal.
Aggregate #
Perhaps the observant reader will, at this point, have recalled to him- or herself that the .NET Base Class Library already includes an Aggregate
extension method, with an overload that takes a seed. In general, the simpliest Aggregate
method doesn't gracefully handle empty collections, but using the overload that takes a seed is more robust. You can rewrite the above Accumulate
method using Aggregate
:
private static int Accumulate(NotEmptyCollection<int> numbers) { return numbers.Tail.Aggregate( numbers.Head, (x, y) => Math.Max(x, y)); }
Notice how you can pass Head
as the seed, and accumulate the Tail
using that starting point. The Aggregate
method is more like Haskell's sconcat
for semigroups than mconcat
for monoids.
Summary #
A semigroup operation can be used to reduce values to a single value, just like a monoid can. The only difference is that a semigroup operation can't reduce an empty collection, whereas a monoid can.
Next: Quasigroups