Monoids by Mark Seemann
Introduction to monoids for object-oriented programmers.
This article is part of a larger series about monoids, semigroups, and related concepts. In this article, you'll learn what a monoid is, and what distinguishes it from a semigroup.
Monoids form a subset of semigroups. The rules that govern monoids are stricter than those for semigroups, so you'd be forgiven for thinking that it would make sense to start with semigroups, and then build upon that definition to learn about monoids. From a strictly hierarchical perspective, that would make sense, but I think that monoids are more intuitive. When you see the most obvious monoid example, you'll see that they cover operations from everyday life. It's easy to think of examples of monoids, while you have to think harder to find some good semigroup examples. That's the reason I think that you should start with monoids.
Monoid laws #
What do addition (40 + 2
) and multiplication (6 * 7
) have in common?
They're both
- associative
- binary operations
- with a neutral element.
Binary operation #
Let's start with the most basic property. That an operation is binary means that it works on two values. Perhaps you mostly associate the word binary with binary numbers, such as 101010, but the word originates from Latin and means something like of two. Astronomers talk about binary stars, but the word is dominantly used in computing context: apart from binary numbers, you may also have heard about binary trees. When talking about binary operations, it's implied that both input values are of the same type, and that the return type is the same as the input type. In other words, a C# method like this is a proper binary operation:
public static Foo Op(Foo x, Foo y)
Sometimes, if Op
is an instance method on the Foo
class, it can also look like this:
public Foo Op (Foo foo)
On the other hand, this isn't a binary operation:
public static Baz Op(Foo f, Bar b)
Although it takes two input arguments, they're of different types, and the return type is a third type.
Since all involved arguments and return values are of the same type, a binary operation exhibits what Eric Evans in Domain-Driven Design calls Closure of Operations.
Associative #
In order to form a monoid, the binary operation must be associative. This simply means that the order of evaluation doesn't matter. For example, for addition, it means that
(2 + 3) + 4 = 2 + (3 + 4) = 2 + 3 + 4 = 9
Likewise, for multiplication
(2 * 3) * 4 = 2 * (3 * 4) = 2 * 3 * 4 = 24
Expressed as the above Op
instance method, associativity would require that areEqual
is true
in the following code:
var areEqual = foo1.Op(foo2).Op(foo3) == foo1.Op(foo2.Op(foo3));
On the left-hand side, foo1.Op(foo2)
is evaluated first, and the result then evaluated with foo3
. On the right-hand side, foo2.Op(foo3)
is evaluated first, and then used as an input argument to foo1.Op
. Since the left-hand side and the right-hand side are compared with the ==
operator, associativity requires that areEqual
is true
.
In C#, if you have a custom monoid like Foo
, you'll have to override Equals
and implement the ==
operator in order to make all of this work.
Neutral element #
The third rule for monoids is that there must exist a neutral value. In the normal jargon, this is called the identity element, and this is what I'm going to be calling it from now on. I only wanted to introduce the concept using a friendlier name.
The identity element is a value that doesn't 'do' anything. For addition, for example, it's zero, because adding zero to a value doesn't change the value:
0 + 42 = 42 + 0 = 42
As an easy exercise, see if you can figure out the identity value for multiplication.
As implied by the above sum, the identity element must act neutrally both when applied to the left-hand side and the right-hand side of another value. For our Foo
objects, it could look like this:
var hasIdentity = Foo.Identity.Op(foo) == foo.Op(Foo.Identity) && foo.Op(Foo.Identity) == foo;
Here, Foo.Identity
is a static read-only field of the type Foo
.
Examples #
There are plenty of examples of monoids. The most obvious examples are addition and multiplication, but there are more. Depending on your perspective, you could even say that there's more than one addition monoid, because there's one for integers, one for real numbers, and so on. The same can be said for multiplication.
There are also two monoids over boolean values called all and any. If you have a binary operation over boolean values called all, how do you think it works? What would be the identity value? What about any?
I'll leave you to ponder (or look up) all and any, and instead, in the next articles, show you some slightly more interesting monoids.
- Angular addition monoid
- Strings, lists, and sequences as a monoid
- Money monoid
- Convex hull monoid
- Tuple monoids
- Function monoids
- Endomorphism monoid
- Maybe monoids
- Lazy monoids
- Monoids accumulate
==
operator. On the other hand, there's no Multiply
method for TimeSpan
, because what does it mean to multiply two durations? What would the dimension be? Time squared?
Summary #
A monoid (not to be confused with a monad) is a set (a type) equipped with a binary operation that satisfies the two monoid laws: that the operation is associative, and that an identity element exists. Addition and multiplication are prime examples, but several others exist.
(By the way, the identity element for multiplication is one (1), the all monoid is boolean and, and the any monoid is boolean or.)
Next: Angular addition monoid
Comments
Great series! I'm a big fan of intuitive abstractions and composition. Can't wait for the remaining parts.
I first heard of the closure property in SICP, where it's mentioned that:
Also, a reference to the algebraic origin of this concept is made in the foot note for this sentence:It's interesting to see this concept come up over and over, although it hasn't been widely socialized as a formal construct to software composition.
This looks like it's going to be a fantastic series - I'm really looking forwards to reading the rest!
So, as we are talking about forming a vocabulary and reducing ambiguity, I have a question about the use of the word closure, which I think has more than one common meaning in this context.
In Eric Evans' "Closure of Operations", closure refers to the fact that the operation is "closed" over it's set of possible values - in other words, the set is closed under the operation.
Closure is also used to describe a function with a bound value (as in the poor man's object").
These are two separate concepts as far as I am aware. Also, I suspect that the latter meaning is likely more well known to C# devs reading this series, especially ReSharper users who have come across it's "implicitly captured closure" detection. So, if I am correct, do you think it is worth making this distinction clear to avoid potential confusion?
Sean, thank you for writing. That's a great observation, and one that I frankly admit that I hadn't made myself. In an ideal world, one of those concepts would have a different name, so that we'd be able to distinguish them from each other.
In my experience, I find that the context in which I'm using those words tend to make the usage unambiguous, but I think that you have a good point that some readers may be more familiar with closure as a captured outer value, rather than the concept of an operation where the domain and the codomain is the same. I'll see if I can make this clearer when I revisit Evans' example.
I'm recently learning category theory, and happened to read this blog. Great post! I'll follow up the series.
I find it a little confusing:
Identity element should be the element of the collection rather than operation, right? So, the id for all should be True, and that of any should be False.
Vitrun, thank you for writing. Yes, the identity for any is false, and for all it's true. There are two other monoids over Boolean values. Can you figure out what they are?
I don't understand this:
Can you elaborate what you mean by that?A monoid is a sequence (M, e, ⋆), where M is a set, e ∈ M is the identity, and ⋆ is the function/operator.
To be clear. I mean, the identity should be the element of the set, rather than the operator
Are the other two and and or?
I found you good at bridging the gap between programming practice and ivory-tower concepts. How do you do that?
Vitrun, thank you for your kind words. I don't know if I have a particular way of 'bridging the gap'; I try to identify patterns in the problems I run into, and then communicate those patterns in as simple a language as I can, with as helpful examples as I can think of...
Regarding monoids over Boolean values, any is another name for Boolean or, and all is another name for Boolean and. That's two monoids (any and all); in addition to those, there are two more monoids over Booleans. I could tell you what they are, but it's a good exercise if you can figure them out by yourself. If not, you can easily Google them.
Hi Mark. Thank you for these articles.
Are the other two boolean monoids not and xor? ... And the identity value for not is the input value. And the identity value for xor is any of the two input values. I did not google for them. I will just wait for your answer so that there will be thrill, and so I remember what the answer is :)
I just realized that not is not a monoid because it does not operate on two values hehe. Sorry about that.
I googled it already :)
I gave answers too soon. I just realized that I was confused about the definition of an identity value.
This is another lesson for me to read a technical writing at least two or three times before thinking that I already understood it.
Jeremiah, thank you for writing, and please accept my apologies that I didn't respond right away. Not only do I write technical content, but I also read a fair bit of it, and my experience is that I often have to work with the topic in question in order to fully grasp it. Reading a text more than once is one way of doing it. When it comes to Boolean monoids, another way is to draw up some truth tables. A third way would be to simply play with Boolean expressions in your programming language of choice. Whatever it takes; if you learned something, then I'm happy.
Thanks for this great series. I know you've specified twice that a monoid is a set equipped with a binary operation, and that's consistent with other sources. However, I'm confused when you say addition and multiplication are monoids. Is it more technically correct to say "the set of integers under addition is a monoid"?
Mark, thank you for writing. It's almost impossible to be explicitly correct all the time when discussing matters like these. It'd tend to make the text verbose, bordering on unreadable. I'm no mathematician, but I think that you're right that it's technically more correct to say that the set of integers under addition forms a monoid, or gives rise to a monoid.
If you were to insist on precision, however, then I believe that your formulation is also incorrect. Mathematically, monoids are triples consisting of 1. a set, 2. a binary, associative operation, and 3. an identity. You didn't mention the identity, which under addition is zero.
I'm not writing this to insist that 'you're wrong, and therefore I'm right'. My language is imprecise, too. I do point this out, however, to highlight just how difficult it is to be absolutely precise when discussing such concepts in prose.
Other aspects to be aware of are these:
Being precise in language can be useful when trying to learn an unfamiliar concept, but my experience with writing this series of articles is that I've been unable to keep up the rigour in my prose at all times.