The Liskov Substitution Principle as a profunctor

With a realistic example in C#.

Like the previous article on Postel's law as a profunctor, this article is part of a series titled Some design patterns as universal abstractions. And like the previous article, it's a bit of a stretch to include the present article in that series, since the Liskov Substitution Principle (LSP) isn't a design pattern, but rather a software design principle or heuristic. I still think, however, that the article fits the spirit of the article series, if not the letter.

As was also the case for the previous article, I don't claim that any of this is new. Michael Feathers and Rúnar Bjarnason blazed that trail long before me.

LSP distilled #

In more or less natural language, the LSP states that subtypes must preserve correctness. A subtype isn't allowed to change behaviour in such a way that client code breaks.

Note that subtypes are allowed to change behaviour. That's often the point of subtyping. By providing a subtype, you can change the behaviour of a system. You can, for example, override how messages are sent, so that an SMS becomes a Slack notification or a Tweet.

If client code 'originally' supplied correct input for sending an SMS, this input should also be valid for posting a Tweet.

Specifically (paraphrasing the Wikipedia entry as of early November 2021):

Subtypes must be contravariant in input
Subtypes must be covariant in output
Preconditions must not be strengthened in the subtype
Postconditions must not be weakened in the subtype

There's a bit more, but in this article, I'll focus on those rules. The first two we already know. Since any function is already a profunctor, we know that functions are contravariant in input and covariant in output.

The LSP, however, isn't a rule about a single function. Rather, it's a rule about a family of functions. Think about a function a -> b as a pipe. You can replace the pipe segment with another pipe segment that has exactly the same shape, but replacing it with a flanged pipe also works, as long as the input flange is wider, and the nozzle narrower than the original pipe shape.

The Liskov Substitution Principle illustrated with flanged pipes.

On the other hand, if you narrow the pipe at the input and widen it at the output, spillage will happen. That's essentially what the LSP states: The upper, green, flanged pipe is a good replacement for the supertype (the blue pipe in the middle), while the lower, orange, flanged pipe is not useful.

The previous article already described that visual metaphor when it comes to co- and contravariance, so this article will focus on pre- and postconditions. My conjecture is that this is just another take on co- and contravariance.

Supertype example #

When encountering statements about subtypes and supertypes, most people tend to think about object inheritance, but that's just one example. As I've previously outlined, anything that can 'act as' something else is a subtype of that something else. Specifically, an interface implementation is a subtype of the interface, and the interface itself the supertype.

Consider, as a supertype example, this interface from my book Code That Fits in Your Head:

public interface IReservationsRepository
{
    Task Create(int restaurantId, Reservation reservation);
 
    Task<IReadOnlyCollection<Reservation>> ReadReservations(
        int restaurantId, DateTime min, DateTime max);
 
    Task<Reservation?> ReadReservation(int restaurantId, Guid id);
 
    Task Update(int restaurantId, Reservation reservation);
 
    Task Delete(int restaurantId, Guid id);
}

Specifically, in this article, I'll focus exclusively on the ReadReservations method. You can imagine that there's an interface with only that method, or that when subtyping the interface in the following, we vary only that method and keep everything else fixed.

What might be the pre- and postconditions of the ReadReservations method?

ReadReservations preconditions #

The most basic kind of precondition is captured by the parameter types. In order to be able to call the method, you must supply an int and two DateTime instances. You can't omit any of them or replace one of the DateTime values with a Guid. In a statically typed language, this is obvious, and the compiler will take care of that.

Both int and DateTime are value types (structs), so they can't be null. Had one of the parameters been a reference type, it'd be appropriate to consider whether or not null constitutes valid input.

So far, we've only discussed static types. Of course a subtype must satisfy the compiler, but what other pre-conditions might be implied by ReadReservations?

The purpose of the method is to enable client code to query a data store and retrieve the reservations for a particular restaurant and in a particular time interval.

Is any restaurantId acceptable? 1? 0? -235?

It's probably a distraction that the restaurant ID is even an int. After all, you don't add IDs together, or multiply one with another one. That an ID is an integer is really just a leaky implementation detail - databases like it best when IDs are integers. I should actually have defined the restaurant ID as an opaque object with Value Object semantics, but I didn't (like other humans, I'm imperfect and lazy). The bottom line is that any number is as good as any other number. No precondition there.

What about the two DateTime parameters? Are DateTime.MinValue or DateTime.MaxValue valid values? Probably: If you'd like to retrieve all reservations in the past, you could ask for DateTime.MinValue as min and DateTime.Now as max. On the other hand, it'd be reasonable to require that min should be less than (or equal to?) max. That sounds like a proper precondition, and one that's difficult to express as a type.

We may also consider it a precondition that the object that implements the ReadReservations method is properly initialised, but I'll take that as a given. Making sure of that is the responsibility of the constructor, not the ReadReservations method.

To summarise, apart from the types and other things handled by the compiler, there's only one additional pre-condition: that min must be less than max.

Are there any postconditions?

ReadReservations postconditions #

The code base for the book obeys Command-Query Separation. Since ReadReservations returns data, it must be a Query. Thus, we can assume that calling it isn't supposed to change state. Thus, postconditions can only be statements about the return value.

Again, static typing takes care of the most fundamental postconditions. An implementation can't return a double or an UriBuilder. It must be a Task, and that task must compute an IReadOnlyCollection<Reservation>.

Why IReadOnlyCollection, by the way? That's my attempt at describing a particular postcondition as a type.

The IReadOnlyCollection<T> interface is a restriction of IEnumerable<T> that adds a Count. By adding the Count property, the interface strongly suggests that the collection is finite.

IEnumerable<T> implementations can be infinite sequences. These can be useful as functional alternatives to infinite loops, but are clearly not appropriate when retrieving reservations from a database.

The use of IReadOnlyCollection tells us about a postcondition: The collection of reservations is finite. This is, however, captured by the type. Any valid implementation of the interface ought to make that guarantee.

Is there anything else? Is it okay if the collection is empty? Yes, that could easily happen, if you have no reservations in the requested interval.

What else? Not much comes to mind, only that we'd expect the collection to be 'stable'. Technically, you could implement the GetEnumerator method so that it generates Count random Reservation objects every time you enumerate it, but none of the built-in implementations do that; that's quite a low bar, as postconditions go.

To summarise the postconditions: None, apart from a well-behaved implementation of IReadOnlyCollection<Reservation>.

SQL implementation #

According to the LSP, a subtype should be allowed to weaken preconditions. Keep in mind that I consider an interface implementation a subtype, so every implementation of ReadReservations constitutes a subtype. Consider the SQL Server-based implementation:

public async Task<IReadOnlyCollection<Reservation>> ReadReservations(
    int restaurantId,
    DateTime min,
    DateTime max)
{
    var result = new List<Reservation>();
 
    using var conn = new SqlConnection(ConnectionString);
    using var cmd = new SqlCommand(readByRangeSql, conn);
    cmd.Parameters.AddWithValue("@RestaurantId", restaurantId);
    cmd.Parameters.AddWithValue("@Min", min);
    cmd.Parameters.AddWithValue("@Max", max);
 
    await conn.OpenAsync().ConfigureAwait(false);
    using var rdr =
        await cmd.ExecuteReaderAsync().ConfigureAwait(false);
    while (await rdr.ReadAsync().ConfigureAwait(false))
        result.Add(ReadReservationRow(rdr));
 
    return result.AsReadOnly();
}
 
private const string readByRangeSql = @"
    SELECT [PublicId], [At], [Name], [Email], [Quantity]
    FROM [dbo].[Reservations]
    WHERE [RestaurantId] = @RestaurantId AND
          @Min <= [At] AND [At] <= @Max";

This implementation actually doesn't enforce the precondition that min ought to be less than max. It doesn't have to, since the code will run even if that's not the case - the result set, if min is greater than max, will always be empty.

While perhaps not useful, weakening this precondition doesn't adversely affect well-behaved clients, and buggy clients are always going to receive empty results. If this implementation also fulfils all postconditions, it's already LSP-compliant.

Still, could it weaken preconditions even more, or in a different way?

Weakening of preconditions #

As Postel's law suggests, a method should be liberal in what it accepts. If it understands 'what the caller meant', it should perform the desired operation instead of insisting on the letter of the law.

Imagine that you receive a call where min is midnight June 6 and max is midnight June 5. While wrong, what do you think that the caller 'meant'?

The caller probably wanted to retrieve the reservations for June 5.

You could weaken that precondition by swapping back min and max if you detect that they've been swapped.

Let's assume, for the sake of argument, that we make the above ReadReservations implementation virtual. This enables us to inherit from SqlReservationsRepository and override the method:

public override Task<IReadOnlyCollection<Reservation>> ReadReservations(
    int restaurantId,
    DateTime min,
    DateTime max)
{
    if (max < min)
        return base.ReadReservations(restaurantId, max, min);
 
    return base.ReadReservations(restaurantId, min, max);
}

While this weakens preconditions, it breaks no existing clients because they all 'know' that they must pass the lesser value before the greater value.

Strengthening of postconditions #

Postel's law also suggests that a method should be conservative in what it sends. What could that mean?

In the case of ReadReservations, you may notice that the result set isn't explicitly sorted. Perhaps we'd like to also sort it on the date and time:

public override async Task<IReadOnlyCollection<Reservation>> ReadReservations(
    int restaurantId,
    DateTime min,
    DateTime max)
{
    var query =
        min < max ?
            base.ReadReservations(restaurantId, min, max) :
            base.ReadReservations(restaurantId, max, min);
    var reservations = await query.ConfigureAwait(false);
    return reservations.OrderBy(r => r.At).ToList();
}

This implementation retains the weakened precondition from before, but now it also explicitly sorts the reservations on At.

Since no client code relies on sorting, this breaks no existing clients.

While the behaviour changes, it does so in a way that doesn't violate the original contract.

Profunctor #

While we've used terms such as weaken preconditions and strengthen postconditions, doesn't this look an awful lot like co- and contravariance?

I think it does, so let's rewrite the above implementation using the Reader profunctor.

First, we'll need to express the original method in the shape of a function like a -> b - that is: a function that takes a single input and returns a single output. While ReadReservations return a single value (a Task), it takes three input arguments. To make it fit the a -> b mould, we have to convert those three parameters to a Parameter Object.

This enables us to write the original implementation as a function:

Func<QueryParameters, Task<IReadOnlyCollection<Reservation>>> imp =
    q => base.ReadReservations(q.RestaurantId, q.Min, q.Max);

If we didn't want to weaken any preconditions or strengthen any postconditions, we could simply call imp and return its output.

The above weakened precondition can be expressed like this:

Func<QueryParameters, QueryParameters> pre =
    q => q.Min < q.Max ?
        new QueryParameters(q.RestaurantId, q.Min, q.Max) :
        new QueryParameters(q.RestaurantId, q.Max, q.Min);

Notice that this is a function from QueryParameters to QueryParameters. As above, it simply swaps Min and Max if required.

Likewise, we can express the strengthened postcondition as a function:

Func<Task<IReadOnlyCollection<Reservation>>, Task<IReadOnlyCollection<Reservation>>> post =
    t => t.Select(rs => rs.OrderBy(r => r.At).ToReadOnly());

The Select method exists because Task forms an asynchronous functor.

It's now possible to compose imp with pre and post using DiMap:

Func<QueryParameters, Task<IReadOnlyCollection<Reservation>>> composition = imp.DiMap(pre, post);

You can now call composition with the original arguments:

composition(new QueryParameters(restaurantId, min, max))

The output of such a function call is entirely equivalent to the above, subtyped ReadReservations implementation.

In case you've forgotten, the presence of a lawful DiMap function is what makes something a profunctor. We already knew that functions are profunctors, but now we've also seen that we can use this knowledge to weaken preconditions and strengthen postconditions.

It seems reasonable to conjecture that the LSP actually describes a profunctor.

It seems to me that the profunctor composition involved with the LSP always takes the specialised form where, for a function a -> b, the preprocessor (the contravariant mapping) always takes the form a -> a, and the postprocessor (the covariant mapping) always takes the form b -> b. This is because polymorphism must preserve the shape of the original function (a -> b).

Conclusion #

We already know that something contravariant in input and covariant in output is a profunctor candidate, but the Liskov Substitution Principle is usually expressed in terms of pre- and postconditions. Subtypes may weaken preconditions and strengthen postconditions, but not the other way around.

Evidence suggests that you can phrase these rules as a profunctor specialisation.

Next: Backwards compatibility as a profunctor.

Comments

Tyson Williams #

It seems to me that the profunctor composition involved with the LSP always takes the specialised form where, for a function a -> b, the preprocessor (the contravariant mapping) always takes the form a -> a, and the postprocessor (the covariant mapping) always takes the form b -> b. This is because polymorphism must preserve the shape of the original function (a -> b).

I think this depends on the language and your perspective on subtype polymorphism. In Java, a subclass Y of a superclass X can override a method m on X that has return type b with a method on Y that has return type c provided c is a subtype of b. To be clear, I am not saying that the instance returned from Y::m can have runtime-type c, though that is also true. I am saying that the compile-time return type of Y::m can be c.

With this in mind, I am willing to argue that the postprocessor can take the form b -> c provided c is a subtype of b. As the programmer, that is how I think of Y::m. And yet, if someone has a instance of Y with compile-time type X, then calling m returns something with compile-time type b by composing my b -> c function with the natural c -> b function from subtype polymorphism to get the b -> b function that you suggested is required.

2021-12-10 05:28 UTC

Mark Seemann #

Tyson, thank you for writing. Yes, true, that nuance may exist. This implies that the LSP is recursive, which really isn't surprising.

A function a -> b may be defined for types a and b, where both are, themselves, polymorphic. So, if a is HttpStyleUriParser and b is MemberInfo, we'd have a function Func<HttpStyleUriParser, MemberInfo> (or HttpStyleUriParser -> MemberInfo if we want to stick with an ML-style type signature - we can imagine that it's an F# function).

If Func<HttpStyleUriParser, MemberInfo> is our polymorphic target signature, according to C# co- and contravariance, we're allowed to vary both input and output. We can, for example, widen the input type to UriParser and restrict the output type to Type. Thus, a specific function of the type Func<UriParser, Type> is compatible with a general function of the type Func<HttpStyleUriParser, MemberInfo>:

Func<HttpStyleUriParser, MemberInfo> general = specific;

Thus, you're correct that both pre- and postprocessor may take the form a' -> a and b -> b', where a' is a supertype of a, and b is a supertype of b'. This is true for C# function delegates, since they're defined with the in and out keywords. You can put in and out on your own interface definitions as well, but most people don't bother (I rarely do, either).

As with all practical recursion you need base cases to stop the recursion. While you can define polymorphic functions (or classes) where input parameters and return types are themselves polymorphic, etc., these should still obey the LSP.

2021-12-14 07:12 UTC

Published: Monday, 06 December 2021 06:52:00 UTC

The Liskov Substitution Principle as a profunctor by Mark Seemann