Coalescing Composite as a monoid by Mark Seemann
A variation of the Composite design pattern uses coalescing behaviour to return non-composable values. That's still a monoid.
This article is part of a series of articles about design patterns and their category theory counterparts. In a previous article, you learned that the Composite design pattern is simply a monoid.
Monoidal return types #
When all methods of an interface return monoids, you can create a Composite. This is fairly intuitive once you understand what a monoid is. Consider this example interface:
public interface ICustomerRepository { void Create(Customer customer); Customer Read(Guid id); void Update(Customer customer); void Delete(Guid id); }
While this interface is, in fact, not readily composable, most of the methods are. It's easy to compose the three void
methods. Here's a composition of the Create
method:
public void Create(Customer customer) { foreach (var repository in this.repositories) repository.Create(customer); }
In this case it's easy to compose multiple repositories, because void
(or, rather, unit) forms a monoid. If you have methods that return numbers, you can add the numbers together (a monoid). If you have methods that return strings, you can concatenate the strings (a monoid). If you have methods that return Boolean values, you can or or and them together (more monoids).
What about the above Read
method, though?
Picking the first Repository #
Why would you even want to compose two repositories? One scenario is where you have an old data store, and you want to move to a new data store. For a while, you wish to write to both data stores, but one of them stays the 'primary' data store, so this is the one from which you read.
Imagine that the old repository saves customer information as JSON files on disk. The new data store, on the other hand, saves customer data as JSON documents in Azure Blob Storage. You've already written two implementations of ICustomerRepository
: FileCustomerRepository
and AzureCustomerRepository
. How do you compose them?
The three methods that return void
are easy, as the above Create
implementation demonstrates. The Read
method, however, is more tricky.
One option is to only query the first repository, and return its return value:
public Customer Read(Guid id) { return this.repositories.First().Read(id); }
This works, but doesn't generalise. It works if you know that you have a non-empty collection of repositories, but if you want to adhere to the Liskov Substitution Principle, you should be able to handle the case where there's no repositories.
A Composite should be able to compose an arbitrary number of other objects. This includes a collection of no objects. The CompositeCustomerRepository
class has this constructor:
private readonly IReadOnlyCollection<ICustomerRepository> repositories; public CompositeCustomerRepository( IReadOnlyCollection<ICustomerRepository> repositories) { if (repositories == null) throw new ArgumentNullException(nameof(repositories)); this.repositories = repositories; }
It uses standard Constructor Injection to inject an IReadOnlyCollection<ICustomerRepository>
. Such a collection is finite, but can be empty.
Another problem with blindly returning the value from the first repository is that the return value may be empty.
In C#, people often use null
to indicate a missing value, and while I find such practice unwise, I'll pursue this option for a bit.
A more robust Composite would return the first non-null value it gets:
public Customer Read(Guid id) { foreach (var repository in this.repositories) { var customer = repository.Read(id); if (customer != null) return customer; } return null; }
This implementation loops through all the injected repositories
and calls Read
on each until it gets a result that is not null
. This will often be the first value, but doesn't have to be. If all repositories return null
, then the Composite also returns null
. To emphasise my position, I would never design C# code like this, but at least it's consistent.
If you've ever worked with relational databases, you may have had an opportunity to use the COALESCE
function, which works in exactly the same way. This is the reason I call such an implementation a coalescing Composite.
The First monoid #
The T-SQL documentation for COALESCE describes the operation like this:
"Evaluates the arguments in order and returns the current value of the first expression that initially does not evaluate to NULL
."
The Oracle documentation expresses it as:
"This may not be apparent, but that's a monoid.COALESCE
returns the first non-nullexpr
in the expression list."
Haskell's base library comes with a monoidal type called First
, which is a
"Maybe monoid returning the leftmost non-Nothing value."Sounds familiar?
Here's how you can use it in GHCi:
λ> First (Just (Customer id1 "Joan")) <> First (Just (Customer id2 "Nigel")) First {getFirst = Just (Customer {customerId = 1243, customerName = "Joan"})} λ> First (Just (Customer id1 "Joan")) <> First Nothing First {getFirst = Just (Customer {customerId = 1243, customerName = "Joan"})} λ> First Nothing <> First (Just (Customer id2 "Nigel")) First {getFirst = Just (Customer {customerId = 5cd5, customerName = "Nigel"})} λ> First Nothing <> First Nothing First {getFirst = Nothing}
(To be clear, the above examples uses First
from Data.Monoid
, not First
from Data.Semigroup
.)
The operator <>
is an infix alias for mappend
- Haskell's polymorphic binary operation.
As long as the left-most value is present, that's the return value, regardless of whether the right value is Just
or Nothing
. Only when the left value is Nothing
is the right value returned. Notice that this value may also be Nothing
, causing the entire expression to be Nothing
.
That's exactly the same behaviour as the above implementation of the Read
method.
First in C# #
It's easy to port Haskell's First
type to C#:
public class First<T> { private readonly T item; private readonly bool hasItem; public First() { this.hasItem = false; } public First(T item) { if (item == null) throw new ArgumentNullException(nameof(item)); this.item = item; this.hasItem = true; } public First<T> FindFirst(First<T> other) { if (this.hasItem) return this; else return other; } }
Instead of nesting Maybe
inside of First
, as Haskell does, I simplified a bit and gave First<T>
two constructor overloads: one that takes a value, and one that doesn't. The FindFirst
method is the binary operation that corresponds to Haskell's <>
or mappend
.
This is only one of several alternative implementations of the first monoid.
In order to make First<T>
a monoid, it must also have an identity, which is just an empty value:
public static First<T> Identity<T>() { return new First<T>(); }
This enables you to accumulate an arbitrary number of First<T>
values to a single value:
public static First<T> Accumulate<T>(IReadOnlyList<First<T>> firsts) { var acc = Identity<T>(); foreach (var first in firsts) acc = acc.FindFirst(first); return acc; }
You start with the identity, which is also the return value if firsts
is empty. If that's not the case, you loop through all firsts
and update acc
by calling FindFirst
.
A composable Repository #
You can formalise such a design by changing the ICustomerRepository
interface:
public interface ICustomerRepository { void Create(Customer customer); First<Customer> Read(Guid id); void Update(Customer customer); void Delete(Guid id); }
In this modified version, Read
explicitly returns First<Customer>
. The rest of the methods remain as before.
The reusable API of First
makes it easy to implement a Composite version of Read
:
public First<Customer> Read(Guid id) { var candidates = new List<First<Customer>>(); foreach (var repository in this.repositories) candidates.Add(repository.Read(id)); return First.Accumulate(candidates); }
You could argue that this seems to be wasteful, because it calls Read
on all repositories. If the first Repository returns a value, all remaining queries are wasted. You can address that issue with lazy evaluation.
You can see (a recording of) a live demo of the example in this article in my Clean Coders video Composite as Universal Abstraction.
Summary #
While the typical Composite is implemented by directly aggregating the return values from the composed objects, variations exist. One variation picks the first non-empty value from a collection of candidates, reminiscent of the SQL COALESCE
function. This is, however, still a monoid, so the overall conjecture that Composites are monoids still holds.
Another Composite variation exists, but that one turns out to be a monoid as well. Read on!
Comments
I've been struggling a bit with the terminology and I could use a bit of help.
I understand a
Monoid<a>
to be of typea -> a -> a
. I further understand that it can be said thata
"forms a monoid over" the specifica -> a -> a
implementation. We can then say that an int forms two different monoids over addition and multiplication.Is my understanding above accurate and if so wouldn't it be more accurate to say that you can create a composite from an interface only when all of its methods return values that form monoids? Saying that they have to return monoids (instead of values that form them) goes against my understanding that a monoid is a binary operation.
Thanks a lot for an amazing article series, and for this blog overall. Keep up the good work!
Nikola, thank you for writing. You're correct: it would be more correct to say that you can create a Composite from an interface when all of its methods return types that form monoids. Throughout this article series, I've been struggling to keep my language as correct and specific as possible, but I sometimes slip up.
This has come up before, so perhaps you'll find this answer helpful.
By the way, there's one exception to the rule that in order to be able to create a Composite, all methods must return types that form monoids. This is when the return type is the same as the input type. The resulting Composite is still a monoid, so the overall conclusion holds.