From design patterns to category theory by Mark Seemann
How do you design good abstractions? By using abstractions that already exist.
When I was a boy, I had a cassette tape player. It came with playback controls like these:
Soon after cassette players had become widely adopted, VCR manufacturers figured out that they could reuse those symbols to make their machines easier to use. Everyone could play a video tape, but 'no one' could 'program' them, because, while playback controls were already universally understood by consumers, each VCR came with its own proprietary interface for 'programming'.
Then came CD players. Same controls.
MP3 players. Same controls.
Streaming audio and video players. Same controls.
If you download an app that plays music, odds are that you'll find it easy to get started playing music. One reason is that all playback apps seem to have the same common set of controls. It's an abstraction that you already know.
Understanding source code #
As I explain in my Humane Code video, you can't program without abstractions. To summarise, in the words of Robert C. Martin
"Abstraction is the elimination of the irrelevant and the amplification of the essential"With such abstractions, source code becomes easier to understand. Like everything else, there's no silver bullet, but good coding abstractions can save you much grief, and make it easier to understand big and complex code bases.
Not only can a good abstraction shield you from having to understand all the details in a big system, but if you're familiar with the abstraction, you may be able to quickly get up to speed.
While the above definition is great for identifying a good abstraction, it doesn't tell you how to create one.
Design patterns #
Design Patterns explains that a design pattern is a general reusable solution to a commonly occurring problem. As I interpret the original intent of the Gang of Four, the book was an attempt to collect and abstract solutions that were repeatedly observed 'in the wild'. The design patterns in the book are descriptive, not prescriptive.
Design patterns are useful in two ways:
- They offer solutions
- They form a vocabulary
I have no problems with ready-made solutions, but I think that the other advantage may be even bigger. When you're looking at unfamiliar source code, you struggle to understand how it's structured, and what it does. If, hypothetically, you discover that pieces of that unfamiliar source code follows a design pattern that you know, then understanding the code becomes much easier.
There are two criteria for this to happen:
- The reader (you) must already know the pattern
- The original author (also you?) must have implemented the pattern without any surprising deviations
Ambiguous specification #
Programming to a well-known abstraction is a force multiplier, but it does require that those two conditions are satisfied: prior knowledge, and correct implementation.
I don't know how to solve the prior knowledge requirement, other than to tell you to study. I do, however, think that it's possible to formalise some of the known design patterns.
Most design patterns are described in some depth. They come with sections on motivation, when to use and not to use, diagrams, and example code. Furthermore, they also come with an overview of variations.
Picture this: as a reader, you've just identified that the code you're looking at is an implementation of a design pattern. Then you realise that it isn't structured like you'd expect, or that its behaviour surprises you. Was the author incompetent, after all?
While you're inclined to believe the worst about your fellow (wo)man, you look up the original pattern, and there it is: the author is using a variation of the pattern.
Design patterns are ambiguous.
Universal abstractions #
Design Patterns was a great effort in 1994, and I've personally benefited from it. The catalogue was an attempt to discover good abstractions.
What's a good abstraction? As already quoted, it's a model that amplifies the essentials, etcetera. I think a good abstraction should also be intuitive.
What's the most intuitive abstractions ever?
Mathematics.
Stay with me, please. If you're a normal reader of my blog, you're most likely an 'industry programmer' or enterprise developer. You're not interested in mathematics. Perhaps mathematics even turns you off, and at the very least, you never had use for mathematics in programming.
You may not find n-dimensional differential topology, or stochastic calculus, intuitive, but that's not the kind of mathematics I have in mind.
Basic arithmetic is intuitive. You know: 1 + 3 = 4, or 3 * 4 = 12. In fact, it's so intuitive that you can't formally prove it -without axioms, that is. These axioms are unprovable; you must take them at face value, but you'll readily do that because they're so intuitive.
Mathematics is a big structure, but it's all based on intuitive axioms. Mathematics is intuitive.
Writers before me have celebrated the power of mathematical abstraction in programming. For instance, in Domain-Driven Design Eric Evans discusses how Closure of Operations leads to object models reminiscent of arithmetic. If you can design Value Objects in such a way that you can somehow 'add' them together, you have an intuitive and powerful abstraction.
Notice that there's more than one way to combine numbers. You can add them together, but you can also multiply them. Could there be a common abstraction for that? What about objects that can somehow be combined, even if they aren't 'number-like'? The generalisation of such operations is a branch of mathematics called category theory, and it has turned out to be productive when applied to functional programming. Haskell is the most prominent example.
By an interesting coincidence, the 'things' in category theory are called objects, and while they aren't objects in the sense that we think of in object-oriented design, there is some equivalence. Category theory concerns itself with how objects map to other objects. A functional programmer would interpret such morphisms as functions, but in a sense, you can also think of them as well-defined behaviour that's associated with data.
The objects of category theory are universal abstractions. Some of them, it turns out, coincide with known design patterns. The difference is, however, that category theory concepts are governed by specific laws. In order to be a functor, for example, an object must obey certain simple and intuitive laws. This makes the category theory concepts more specific, and less ambiguous, than design patterns.
The coming article series is an exploration of this space:
- Monoids, semigroups, and friends
- Functors, applicatives, and friends
- Software design isomorphisms
- Church encoding
- Catamorphisms
- Some design patterns as universal abstractions
Motivation #
The purpose of this article series is two-fold. Depending on your needs and interests, you can use it to
- learn better abstractions
- learn how functional programming is a real alternative to object-oriented programming
The other goal of these articles may be less clear. Object-oriented programming (OOP) is the dominant software design paradigm. It wasn't always so. When OOP was new, many veteran programmers couldn't see how it could be useful. They were schooled in one paradigm, and it was difficult for them to shift to the new paradigm. They were used to do things in one way (typically, procedural), and it wasn't clear how to achieve the same goals with idiomatic object-oriented design.
The same sort of resistance applies to functional programming. Tasks that are easy in OOP seem impossible in functional programming. How do you make a for loop? How do you change state? How do you break out of a routine?
This leads to both frustration, and dismissal of functional programming, which is still seen as either academic, or something only interesting in computation-heavy domains like science or finance.
It's my secondary goal with these articles to show that:
- There are clear equivalences between known design patterns and concepts from category theory
- Thus, functional programming is as universally useful as OOP
- Since equivalences exist, there's a learning path
Work in progress #
I've been thinking about these topics for years. What's a good abstraction? When do abstractions compose?
My first attempt at answering these questions was in 2010, but while I had the experience that certain abstractions composed better than others, I lacked the vocabulary. I've been wanting to write a better treatment of the topic ever since, but I've been constantly learning as I've grappled with the concepts.
I believe that I now have the vocabulary to take a stab at this again. This is hardly the ultimate treatment. A year from now, I hope to have learned even more, and perhaps that'll lead to further insights or refinement. Still, I can't postpone writing this article until I've stopped learning, because at that time I'll either be dead or senile.
I'll write these articles in an authoritative voice, because a text that constantly moderates and qualifies its assertions easily becomes unreadable. Don't consider the tone an indication that I'm certain that I'm right. I've tried to be as rigorous in my arguments as I could, but I don't have a formal education in computer science. I welcome feedback on any article, both if it's to corroborate my findings, or if it's to refute them. If you have any sort of feedback, then please leave a comment.
I consider the publication of these articles as though I submit them to peer review. If you can refute them, they deserve to be refuted. If not, they just may be valuable to other people.
Summary #
Category theory generalises some intuitive relations, such as how numbers combine (e.g. via addition or multiplication). Instead of discussing numbers, however, category theory considers abstract 'objects'. This field of mathematics explore how object relate and compose.
Some category theory concepts can be translated to code. These universal abstractions can form the basis of a powerful and concise software design vocabulary.
The design patterns movement was an early attempt to create such a vocabulary. I think using category theory offers the chance of a better vocabulary, but fortunately, all the work that went into design patterns isn't wasted. It seems to me that some design patterns are essentially ad-hoc, informally specified, specialised instances of basic category theory concepts. There's quite a bit of overlap. This should further strengthen the argument that category theory is valuable in programming, because some of the concepts are equivalent to design patterns that have already proven useful.
Comments
What a perfect introduction !
I heard about category theory more than one year ago. But it was from a PhD who code in 'haskell' and I thought it was too hard for me to understand.
And then, this post.
Thank you a lot! (you aleardy published the follow up ! yeah)
Thanks for writing these articles, it's nice to have some reference material that is approachable for us as dotnet programmers.
One thing I was kind of expecting to find here was something about the two building blocks of combining types: products and coproducts. Is this something you have written about, or are considering writing about? It gets mentioned in the Church encoding series and obviously those about visitors, but not really as a concept on its own.
What triggered me to come here this time, was reading about the much requested Champion "Discriminated Unions". Not only in those comments, but also when looking at other C# code, lots of people seem to not realize how fundamental of a concept sum types are. IF they are, I could be wrong ofcourse.
I liked the way bartosz milewski explained this by visualizing them as graphs. Or how Scott Wlaschin relates it back to other concepts we also take for granted:
Anyway, I don't want to ramble on too much. Just curious if it's something you think fits the list of universal abstractions.
Rutger, thank you for writing. I agree that the notion of algebraic data types are, in some sense, quite fundamental. Despite that, I was never planning on covering that topic in this series. The main reason is that I think that other people have already done a great job of it. The first time I encountered the concept was in Tomas Petricek's exemplarily well-written article Power of mathematics Reasoning about functional types, but, as you demonstrate, there are plenty of other good resources. Another favourite of mine is Thinking with Types.
That's not to say that this is the right decision, or that I might not write such an article. When I started this massive article series, I had a general idea about the direction I'd like to go, but I learned a lot along the way that slightly changed the plans. For example, despite the title, there's not that much category theory here. The reason for that is that I found that most of the concepts didn't really require category theory. For example, monoids originate (as far as I can tell) from abstract algebra, and you don't need more than that to explain the concept.
So, to answer your direct question: No, this isn't something that I've given an explicit treatment. On one hand, I think there's already enough good material on the topic that the world doesn't need my contribution. On the other hand, perhaps there's a dearth of treatment that puts this in a C# context.
I'm not adverse to writing such an article, but I have so many other topics I'd also like to cover.