Referential transparency fits in your head by Mark Seemann
Why functional programming matters.
This article is mostly excerpts from my book Code That Fits in Your Head. The overall message is too important to exclusively hide away in a book, though, which is the reason I also publish it here.
The illustrations are preliminary. While writing the manuscript, I experimented with hand-drawn figures, but Addison-Wesley prefers 'professional' illustrations. In the published book, the illustrations shown here will be replaced by cleaner, more readable, but also more boring figures.
Nested composition #
Ultimately, software interacts with the real world. It paints pixels on the screen, saves data in databases, sends emails, posts on social media, controls industrial robots, etcetera. All of these are what we in the context of Command Query Separation call side effects.
Since side effects are software's raison d'ĂȘtre, it seems only natural to model composition around them. This is how most people tend to approach object-oriented design. You model actions.
Object-oriented composition tends to focus on composing side effects together. The Composite design pattern may be the paragon of this style of composition, but most of the patterns in Design Patterns rely heavily on composition of side effects.
As illustrated in [the following figure] this style of composition relies on nesting objects in other objects, or side effects in other side effects. Since your goal should be code that fits in your head, this is a problem.
[Figure caption:] The typical composition of objects (or, rather, methods on objects) is nesting. The more you compose, the less the composition fits in your brain. In this figure, each star indicates a side effect that you care about. Object A encapsulates one side effect, and object B two. Object C composes A and B, but also adds a fourth side effect. That's already four side effects that you need to keep in mind when trying to understand the code. This easily gets out of hand: object E composes a total of eight side effects, and F nine. Those don't fit well in your brain.
I should add here that one of the book's central tenets is that the human short-term memory can only hold a limited amount of information. Code that fits in your head is code that respects that limitation. This is a topic I've already addressed in my Humane Code video.
In the book, I use the number seven as a symbol of the this cognitive limit. Nothing I argue, however, relies on this exact number. The point is only that short-term memory is quite limited. Seven is only a shorthand for that.
The book proceeds to provide a code example that illustrates how fast nested composition accumulates complexity that exceeds the capacity of your short-term memory. You can see the code example in the book, or in the article Good names are skin-deep, which makes a slightly different criticism than the one argued in the book.
The section on nested composition goes on:
"Abstraction is the elimination of the irrelevant and the amplification of the essential"
By hiding a side effect in a Query, I've eliminated something essential. In other words, more is going on in [the book's code listing] than meets the eye. The cyclomatic complexity may be as low as 4, but there's a hidden fifth action that you ought to be aware of.
Granted, five chunks still fit in your brain, but that single hidden interaction is an extra 14% towards the budget of seven. It doesn't take many hidden side effects before the code no longer fits in your head
Sequential composition #
While nested composition is problematic, it isn't the only way to compose things. You can also compose behaviour by chaining it together, as illustrated in [the following figure].
[Figure caption:] Sequential composition of two functions. The output from
Where
becomes the input toAllocate
.In the terminology of Command Query Separation, Commands cause trouble. Queries, on the other hand, tend to cause little trouble. They return data which you can use as input for other Queries.
Again, the book proceeds to show code examples. You can, of course, see the code in the book, but the methods discussed are the WillAccept
function shown here, the Overlaps
function shown here, as well as a few other functions that I don't think that I've shown on the blog.
The entire restaurant example code base is written with that principle in mind. Consider the
WillAccept
method [...]. After all the Guard Clauses it first creates a new instance of theSeating
class. You can think of a constructor as a Query under the condition that it has no side effects.The next line of code filters the
existingReservations
using theOverlaps
method [...] as a predicate. The built-inWhere
method is a Query.Finally, the
WillAccept
method returns whether there'sAny
table among theavailableTables
thatFits
thecandidate.Quantity
. TheAny
method is another built-in Query, andFits
is a predicate.Compared to [the sequential composition figure], you can say that the
Seating
constructor,seating.Overlaps
,Allocate
, andFits
are sequentially composed.None of these methods have side effects, which means that once
WillAccept
returns its Boolean value, you can forget about how it reached that result. It truly eliminates the irrelevant and amplifies the essentialReferential transparency #
There's a remaining issue that Command Query Separation fails to address: predictability. While a Query has no side effects that your brain has to keep track of, it could still surprise you if you get a new return value every time you call it - even with the same input.
This may not be quite as bad as side effects, but it'll still tax your brain. What happens if we institute an extra rule on top of Command Query Separation: that Queries must be deterministic?
This would mean that a Query can't rely on random number generators, GUID creation, the time of day, day of the month, or any other data from the environment. That would include the contents of files and databases. That sounds restrictive, so what's the benefit?
A deterministic method without side effects is referentially transparent. It's also known as a pure function. Such functions have some very desirable qualities.
One of these qualities is that pure functions readily compose. If the output of one function fits as the input for another, you can sequentially compose them. Always. There are deep mathematical reasons for that, but suffice it to say that composition is ingrained into the fabric that pure functions are made of.
Another quality is that you can replace a pure function call with its result. The function call is equal to the output. The only difference between the result and the function call is the time it takes to get it.
Think about that in terms of Robert C. Martin's definition of abstraction. Once a pure function returns, the result is all you have to care about. How the function arrived at the result is an implementation detail. Referentially transparent functions eliminate the irrelevant and amplify the essential. As [the following figure] illustrates, they collapse arbitrary complexity to a single result; a single chunk that fits in your brain.
[Figure caption:] Pure function (left) collapsing to its result (right). Regardless of complexity, a referentially transparent function call can be replaced by its output. Thus, once you know what the output is, that's the only thing you need to keep track of as you read and interpret the calling code.
On the other hand, if you want to know how the function works, you zoom in on its implementation, in the spirit of fractal architecture. That might be the
WillAccept
method [...]. This method is, in fact, not just Query, it's a pure function. When you look at the source code of that function, you've zoomed in on it, and the surrounding context is irrelevant. It operates exclusively on its input arguments and immutable class fields.When you zoom out again, the entire function collapses into its result. It's the only thing your brain needs to keep track of.
Regardless of complexity, a referentially transparent function reduces to a single chunk: the result that it produces. Thus, referentially transparent code is code that fits in your head.
Conclusion #
Code That Fits in Your Head isn't a book about functional programming, but it does advocate for the functional core, imperative shell (also known as an impureim sandwich) architecture, among many other techniques, practices, and heuristics that it presents.
I hope that you found the excerpt so inspiring that you consider buying the book.
Comments
Absolutely agree. The problem of "Code that fits in your head" is not about code either, it's about all human reasoning. It particularly happens in math. In math, why do we prove "theorems", and then prove new theorems by referencing those previous theorems or lemmas? Why can't mathematicians just prove everything every single time? Because math can't fit in someone's head, so we also need these tools.
Or rather, math is the tool for "X that fits in your head". Math is the language of human reasoning, anything that can be reasoned about can be modelled in math and talked about in math, which is a language we all know about and have used for thousands of years.
It can help us developers a lot to make use of this shared language. There already exists a language to help us figure out how to Fit Code In Our Heads, it would be detrimental to ourselves to ignore it. You link to category theory, and there's also algebra. These are tools for such endeavor, I think developers should be more open towards them and not just deride them as "hard" or "ivory-tower-esque", they should be the opposite of that. And in fact, whne you properly understand algebra, there is nothing really hard about it, it's actually so simple and easy you can't understand why you didn't pay attention to it sooner.