Software design isomorphisms

When programming, you can often express the same concept in multiple ways. If you can losslessly translate between two alternatives, it's an isomorphism. An introduction for object-oriented programmers.

This article series is part of an even larger series of articles about the relationship between design patterns and category theory.

There's a school of functional programming that looks to category theory for inspiration, verification, abstraction, and cross-pollination of ideas. Perhaps you're put off by terms like zygohistomorphic prepromorphism (a joke), but you shouldn't be. There are often good reasons for using abstract naming. In any case, one term from category theory that occasionally makes the rounds is isomorphism.

Equivalence #

Don't let the terminology daunt you. An isomorphism is an easy enough concept to grasp. In essence, two things are isomorphic if you can translate losslessly back and forth between them. It's a formalisation of equivalence.

Many programming languages, like C# and Java, offer a multitude of alternative ways to do things. Just consider this C# example from my Humane Code video:

public bool IsSatisfiedBy(Customer candidate)
{
    bool retVal;
    if (candidate.TotalPurchases >= 10000)
        retVal = true;
    else
        retVal = false;
 
    return retVal;
}

which is equivalent to this:

public bool IsSatisfiedBy(Customer candidate)
{
    return candidate.TotalPurchases >= 10000;
}

An outside observer can't tell the difference between these two implementations, because they have exactly the same externally visible behaviour. You can always refactor from one implementation to the other without loss of information. Thus, we could claim that they're isomorphic.

Terminology #

If you're an object-oriented programmer, then you already know the word polymorphism, which sounds similar to isomorphism. Perhaps you've also heard the word xenomorph. It's all Greek. Morph means form or shape, and while poly means many, iso means equal. So isomorphism means 'being of equal shape'.

This is most likely the explanation for the term Isomorphic JavaScript. The people who came up with that term knew (enough) Greek, but apparently not mathematics. In mathematics, and particularly category theory, an isomorphism is a translation with an inverse. That's still not a formal definition, but just my attempt at presenting it without too much jargon.

Category theory uses the word object to describe a member of a category. I'm going to use that terminology here as well, but you should be aware that object doesn't imply object-oriented programming. It just means 'thing', 'item', 'element', 'entity', etcetera.

In category theory, a morphism is a mapping or translation of an object to another object. If, for all objects, there's an inverse morphism that leads back to the origin, then it's an isomorphism.

Isomorphism diagram.

In this illustration, the blue arrows going from left to right indicate a single morphism. It's a mapping of objects on the blue left side to objects on the green right side. The green arrows going from right to left is another morphism. In this case, the green right-to-left morphism is an inverse of the blue left-to-right morphism, because by applying both morphisms, you end where you started. It doesn't matter if you start at the blue left side or the green right side.

Another way to view this is to say that a lossless translation exists. When a translation is lossless, it means that you don't lose information by performing the translation. Since all information is still present after a translation, you can go back to the original representation.

Software design isomorphisms #

When programming, you can often solve the same problem in different ways. Sometimes, the alternatives are isomorphic: you can go back and forth between two alternatives without loss of information.

Martin Fowler's book Refactoring contains several examples. For instance, you can apply Extract Method followed by Inline Method and be back where you started.

There are many other isomorphisms in programming. Some are morphisms in the same language, as is the case with the above C# example. This is also the case with the isomorphisms in Refactoring, because a refactoring, by definition, is a change applied to a particular code base, be it C#, Java, Ruby, or Python.

Other programming isomorphisms go between languages, where a concept can be modelled in one way in, say, C++, but in another way in Clojure. The present blog, for instance, contains several examples of translating between C# and F#, and between F# and Haskell.

Being aware of software design isomorphisms can make you a better programmer. It'll enable you to select the best alternative for solving a particular problem. Identifying programming isomorphisms is also important because it'll enable us to formally think about code structure by reducing many alternative representations to a canonical representation. For these reasons, this article presents a catalogue of software design isomorphisms:

In general, I've tried to name each isomorphism after its canonical representation. For instance, by unit isomorphisms, I mean isomorphisms to the unit value. It is, however, not an entirely consistent naming strategy.

Many more software design isomorphisms exist, so if you revisit this article in the future, I may have added more items to this catalogue. In no way should you consider this catalogue exhaustive.

Summary #

An isomorphism is a mapping for which an inverse mapping also exists. It's a way to describe equivalence.

In programming, you often have the choice to implement a particular feature in more than one way. These alternatives may be equivalent, in which case they're isomorphic. That's one of the reasons that many code bases come with a style guide.

Understanding how code is isomorphic to other code enables us to reduce many alternatives to a canonical representation. This makes analysis easier, because we can narrow our analysis to the canonical form, and generalise from there.

Next: Unit isomorphisms.

Comments

Sergey Rogovtsev #

Funny thing, I had a discussion about refactoring as an isomorphism a bit ago. While I like the idea of using isomorphisms to reason about code, I still stand by the point that refactoring is an isomorphism limited to functionality; i.e. refactoring the code may change its other aspects (readability is the first one to come to mind, performance is the second), so two different representations are no longer totally equal. Or, as another argument, using Inline Method in fact loses some information (the descriptive method name, limited variable scopes), so the translation is not (sorry) lossless.

2018-01-13 08:39 UTC

Mark Seemann #

Sergey, thank you for writing. Good point, you're right that viewed as a general-purpose translation, Inline Method is indeed lossy. When you look at the purpose of refactoring code, the motivation is mostly (if not always) to make the code better in some way. Particularly when the purpose is make the code more readable, a refactoring introduces clarity. Thus, going the opposite way would remove that clarity, so I think it's fair to argue that such a change would be lossy.

It's perfectly reasonable to view a refactoring like Inline Method as a general-purpose algorithm, in which case you're right that it's lossy. I don't dispute that.

My agenda with this article series, however, is different. I'm not trying to make multiple angels dance on a pinhead; it's not my intent to try to redefine the word refactoring. The purpose with this series of articles is to describe how the same behaviour can be implemented in many different ways. The book Refactoring is one source of such equivalent representations.

One quality of morphisms is that there can be several translations between two objects. One such translation could be the general-purpose refactoring that you so rightly point out is lossy. Another translation could be one that 'remembers' the name of the original method.

Take, as an example, the isomorphism described under the heading Roster isomorphism in my article Tuple monoids. When you consider the method ToTriple, you could, indeed, argue that it's lossy, because it 'forgets' that the label associated with the first integer is Girls, that the label associated with the second integer is Boys, and so on. The reverse translation, however, 'remembers' that information, as you can see in the implementation of FromTriple.

This isn't necessarily a 'safe' translation. You could easily write a method like this:

public static Roster FromTriple(Tuple<int, int, string[]> triple)
{
    return new Roster(triple.Item2, triple.Item1, triple.Item3);
}

This compiles, but it translates triples created by ToTriple the wrong way.

On the other hand, the pair of translations that I do show in the article is an isomorphism. The point is that an isomorphism exists; not that it's the only possible set of morphisms.

The same argument can be applied to specific pairs of Extract Method and Inline Method. As a general-purpose algorithm, I still agree that Inline Method is lossy. That doesn't preclude that specific pairs of translations exist. For instance, in an article, I discuss how some people refactor Guard Clauses like this:

if (subject == null)
    throw new ArgumentNullException("subject");

to something like this:

Guard.AgainstNull(subject, "subject");

Again, an isomorphism exists: a translation from a Null Guard to Guard.AgainstNull, and another from Guard.AgainstNull to a Null Guard. Those are specific incarnations of Extract Method and Inline Method.

This may not be particularly useful as a refactoring, I admit, but that's also not the agenda of these articles. The programme is to show how particular software behaviour can be expressed in various different ways that are equivalent to each other.

2018-01-14 14:26 UTC

Published: Monday, 08 January 2018 08:35:00 UTC

Software design isomorphisms by Mark Seemann

Equivalence #

Terminology #

Software design isomorphisms #

Summary #

Comments

Wish to comment?

Published

Tags