Introduction to monoids for object-oriented programmers.

This article is part of a larger series about monoids, semigroups, and related concepts. In this article, you'll learn what a monoid is, and what distinguishes it from a semigroup.

Monoids are a subset of semigroups.

Monoids form a subset of semigroups. The rules that govern monoids are stricter than those for semigroups, so you'd be forgiven for thinking that it would make sense to start with semigroups, and then build upon that definition to learn about monoids. From a strictly hierarchical perspective, that would make sense, but I think that monoids are more intuitive. When you see the most obvious monoid example, you'll see that they cover operations from everyday life. It's easy to think of examples of monoids, while you have to think harder to find some good semigroup examples. That's the reason I think that you should start with monoids.

Monoid laws #

What do addition (40 + 2) and multiplication (6 * 7) have in common?

They're both

  • associative
  • binary operations
  • with a neutral element.
That's all it takes to form a monoid. Associativity and the existence of a neutral element is sometimes referred to as the monoid laws. It's worth noting that a monoid is a combination of a data type (or set) and an operation. It's not a data type in itself, but rather a function (or method) that operates on that data type. For example, addition and multiplication are two different monoids that both work on numbers.

Binary operation #

Let's start with the most basic property. That an operation is binary means that it works on two values. Perhaps you mostly associate the word binary with binary numbers, such as 101010, but the word originates from Latin and means something like of two. Astronomers talk about binary stars, but the word is dominantly used in computing context: apart from binary numbers, you may also have heard about binary trees. When talking about binary operations, it's implied that both input values are of the same type, and that the return type is the same as the input type. In other words, a C# method like this is a proper binary operation:

public static Foo Op(Foo x, Foo y)

Sometimes, if Op is an instance method on the Foo class, it can also look like this:

public Foo Op (Foo foo)

On the other hand, this isn't a binary operation:

public static Baz Op(Foo f, Bar b)

Although it takes two input arguments, they're of different types, and the return type is a third type.

Since all involved arguments and return values are of the same type, a binary operation exhibits what Eric Evans in Domain-Driven Design calls Closure of Operations.

Associative #

In order to form a monoid, the binary operation must be associative. This simply means that the order of evaluation doesn't matter. For example, for addition, it means that

(2 + 3) + 4 = 2 + (3 + 4) = 2 + 3 + 4 = 9

Likewise, for multiplication

(2 * 3) * 4 = 2 * (3 * 4) = 2 * 3 * 4 = 24

Expressed as the above Op instance method, associativity would require that areEqual is true in the following code:

var areEqual = foo1.Op(foo2).Op(foo3) == foo1.Op(foo2.Op(foo3));

On the left-hand side, foo1.Op(foo2) is evaluated first, and the result then evaluated with foo3. On the right-hand side, foo2.Op(foo3) is evaluated first, and then used as an input argument to foo1.Op. Since the left-hand side and the right-hand side are compared with the == operator, associativity requires that areEqual is true.

In C#, if you have a custom monoid like Foo, you'll have to override Equals and implement the == operator in order to make all of this work.

Neutral element #

The third rule for monoids is that there must exist a neutral value. In the normal jargon, this is called the identity element, and this is what I'm going to be calling it from now on. I only wanted to introduce the concept using a friendlier name.

The identity element is a value that doesn't 'do' anything. For addition, for example, it's zero, because adding zero to a value doesn't change the value:

0 + 42 = 42 + 0 = 42

As an easy exercise, see if you can figure out the identity value for multiplication.

As implied by the above sum, the identity element must act neutrally both when applied to the left-hand side and the right-hand side of another value. For our Foo objects, it could look like this:

var hasIdentity =
    Foo.Identity.Op(foo) == foo.Op(Foo.Identity) &&
    foo.Op(Foo.Identity) == foo;

Here, Foo.Identity is a static read-only field of the type Foo.

Examples #

There are plenty of examples of monoids. The most obvious examples are addition and multiplication, but there are more. Depending on your perspective, you could even say that there's more than one addition monoid, because there's one for integers, one for real numbers, and so on. The same can be said for multiplication.

There are also two monoids over boolean values called all and any. If you have a binary operation over boolean values called all, how do you think it works? What would be the identity value? What about any?

I'll leave you to ponder (or look up) all and any, and instead, in the next articles, show you some slightly more interesting monoids.

In essence, if you have a data type that 'behaves like a number', you can probably make it a monoid. Addition is often the best candidate, because it doesn't mess with the dimensions of what you're keeping track of. As an example, the .NET Base Class Library defines a TimeSpan structure with an Add method. It also comes with a == operator. On the other hand, there's no Multiply method for TimeSpan, because what does it mean to multiply two durations? What would the dimension be? Time squared?

Summary #

A monoid (not to be confused with a monad) is a set (a type) equipped with a binary operation that satisfies the two monoid laws: that the operation is associative, and that an identity element exists. Addition and multiplication are prime examples, but several others exist.

(By the way, the identity element for multiplication is one (1), the all monoid is boolean and, and the any monoid is boolean or.)

Next: Angular addition monoid


Comments

Great series! I'm a big fan of intuitive abstractions and composition. Can't wait for the remaining parts.

I first heard of the closure property in SICP, where it's mentioned that:

In general, an operation for combining data objects satisfies the closure property if the results of combining things with that operation can themselves be combined using the same operation.
Also, a reference to the algebraic origin of this concept is made in the foot note for this sentence:
The use of the word "closure" here comes from abstract algebra, where a set of elements is said to be closed under an operation if applying the operation to elements in the set produces an element that is again an element of the set.

It's interesting to see this concept come up over and over, although it hasn't been widely socialized as a formal construct to software composition.

2017-10-06 15:38 UTC

This looks like it's going to be a fantastic series - I'm really looking forwards to reading the rest!

So, as we are talking about forming a vocabulary and reducing ambiguity, I have a question about the use of the word closure, which I think has more than one common meaning in this context.

In Eric Evans' "Closure of Operations", closure refers to the fact that the operation is "closed" over it's set of possible values - in other words, the set is closed under the operation.

Closure is also used to describe a function with a bound value (as in the poor man's object").

These are two separate concepts as far as I am aware. Also, I suspect that the latter meaning is likely more well known to C# devs reading this series, especially ReSharper users who have come across it's "implicitly captured closure" detection. So, if I am correct, do you think it is worth making this distinction clear to avoid potential confusion?

2017-10-18 07:30 UTC

Sean, thank you for writing. That's a great observation, and one that I frankly admit that I hadn't made myself. In an ideal world, one of those concepts would have a different name, so that we'd be able to distinguish them from each other.

In my experience, I find that the context in which I'm using those words tend to make the usage unambiguous, but I think that you have a good point that some readers may be more familiar with closure as a captured outer value, rather than the concept of an operation where the domain and the codomain is the same. I'll see if I can make this clearer when I revisit Evans' example.

2017-10-18 12:02 UTC

I'm recently learning category theory, and happened to read this blog. Great post! I'll follow up the series.

I find it a little confusing:

(By the way, the identity element for multiplication is one (1), all is boolean and, and any is boolean or.)

Identity element should be the element of the collection rather than operation, right? So, the id for all should be True, and that of any should be False.

2017-12-03 03:36 UTC

Vitrun, thank you for writing. Yes, the identity for any is false, and for all it's true. There are two other monoids over Boolean values. Can you figure out what they are?

I don't understand this:

"Identity element should be the element of the collection rather than operation"
Can you elaborate what you mean by that?

2017-12-03 18:56 UTC

A monoid is a sequence (M, e, ⋆), where M is a set, e ∈ M is the identity, and ⋆ is the function/operator.

To be clear. I mean, the identity should be the element of the set, rather than the operator

Are the other two and and or?

I found you good at bridging the gap between programming practice and ivory-tower concepts. How do you do that?

2017-12-04 04:25 UTC

Vitrun, thank you for your kind words. I don't know if I have a particular way of 'bridging the gap'; I try to identify patterns in the problems I run into, and then communicate those patterns in as simple a language as I can, with as helpful examples as I can think of...

the identity should be the element of the set
Yes.

Regarding monoids over Boolean values, any is another name for Boolean or, and all is another name for Boolean and. That's two monoids (any and all); in addition to those, there are two more monoids over Booleans. I could tell you what they are, but it's a good exercise if you can figure them out by yourself. If not, you can easily Google them.

2017-12-05 08:27 UTC

Hi Mark. Thank you for these articles.

Are the other two boolean monoids not and xor? ... And the identity value for not is the input value. And the identity value for xor is any of the two input values. I did not google for them. I will just wait for your answer so that there will be thrill, and so I remember what the answer is :)

2018-03-06 04:28 UTC

I just realized that not is not a monoid because it does not operate on two values hehe. Sorry about that.

2018-03-06 07:18 UTC

I googled it already :)

I gave answers too soon. I just realized that I was confused about the definition of an identity value.

This is another lesson for me to read a technical writing at least two or three times before thinking that I already understood it.

2018-03-07 06:17 UTC

Jeremiah, thank you for writing, and please accept my apologies that I didn't respond right away. Not only do I write technical content, but I also read a fair bit of it, and my experience is that I often have to work with the topic in question in order to fully grasp it. Reading a text more than once is one way of doing it. When it comes to Boolean monoids, another way is to draw up some truth tables. A third way would be to simply play with Boolean expressions in your programming language of choice. Whatever it takes; if you learned something, then I'm happy.

2018-03-07 08:15 UTC

Thanks for this great series. I know you've specified twice that a monoid is a set equipped with a binary operation, and that's consistent with other sources. However, I'm confused when you say addition and multiplication are monoids. Is it more technically correct to say "the set of integers under addition is a monoid"?

2022-02-03 04:10 UTC

Mark, thank you for writing. It's almost impossible to be explicitly correct all the time when discussing matters like these. It'd tend to make the text verbose, bordering on unreadable. I'm no mathematician, but I think that you're right that it's technically more correct to say that the set of integers under addition forms a monoid, or gives rise to a monoid.

If you were to insist on precision, however, then I believe that your formulation is also incorrect. Mathematically, monoids are triples consisting of 1. a set, 2. a binary, associative operation, and 3. an identity. You didn't mention the identity, which under addition is zero.

I'm not writing this to insist that 'you're wrong, and therefore I'm right'. My language is imprecise, too. I do point this out, however, to highlight just how difficult it is to be absolutely precise when discussing such concepts in prose.

Other aspects to be aware of are these:

  • Addition also gives rise to a monoid on the set of rational numbers and real numbers. On complex numbers, too, I think...
  • In programming, addition isn't really addition because of overflow/underflow. Exceptions to that rule are unbounded numbers like .NET's BigInteger or Haskell's Integer and Rational. (Interestingly, even in the face of overflow, 'addition' may still form a monoid, but I'll leave that as an exercise.)
  • Floating-point arithmetic, as yet another consideration, in general requires some hand-waving. As far as I recall, floating-point addition isn't associative. Is it really addition, then? In practice, however, we still call it addition. Would we say that it forms a monoid?
The point that I'm trying to make is that some abstract concepts are useful in programming, even though, mathematically speaking, they may not quite fit the bill.

Being precise in language can be useful when trying to learn an unfamiliar concept, but my experience with writing this series of articles is that I've been unable to keep up the rigour in my prose at all times.

2022-02-03 07:15 UTC


Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Friday, 06 October 2017 07:38:00 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!
Published: Friday, 06 October 2017 07:38:00 UTC