ploeh blog

Error-accumulating composable assertions in C#

Monday, 19 December 2022 08:39:00 UTC

Perhaps the list monoid is all you need for non-short-circuiting assertions.

This article is the second instalment in a small article series about applicative assertions. It explores a way to compose assertions in such a way that failure messages accumulate rather than short-circuit. It assumes that you've read the article series introduction and the previous article.

Unsurprisingly, the previous article showed that you can use an applicative functor to create composable assertions that don't short-circuit. It also concluded that, in C# at least, the API is awkward.

This article explores a simpler API.

A clue left by the proof of concept #

The previous article's proof of concept left a clue suggesting a simpler API. Consider, again, how the rather horrible RunAssertions method decides whether or not to throw an exception:

string errors = composition.Match(
    onFailure: f => string.Join(Environment.NewLine, f),
    onSuccess: _ => string.Empty);
 
if (!string.IsNullOrEmpty(errors))
    throw new Exception(errors);

Even though Validated<F, S> is a sum type, the RunAssertions method declines to take advantage of that. Instead, it reduces composition to a simple type: A string. It then decides to throw an exception if the errors value is not null or empty.

This suggests that using a sum type may not be necessary to distinguish between the success and the failure case. Rather, an empty error string is all it takes to indicate success.

Non-empty errors #

The proof-of-concept assertion type is currently defined as Validated with a particular combination of type arguments: Validated<IReadOnlyCollection<string>, Unit>. Consider, again, this Match expression:

string errors = composition.Match(
    onFailure: f => string.Join(Environment.NewLine, f),
    onSuccess: _ => string.Empty);

Does an empty string unambiguously indicate success? Or is it possible to arrive at an empty string even if composition actually represents a failure case?

You can arrive at an empty string from a failure case if the collection of error messages is empty. Consider the type argument that takes the place of the F generic type: IReadOnlyCollection<string>. A collection of this type can be empty, which would also cause the above Match to produce an empty string.
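To make the ambiguity concrete, here's a minimal sketch. It assumes a hypothetical Render helper that formats messages the same way as the onFailure branch of the Match expression; it isn't part of the proof-of-concept code.

```csharp
using System;
using System.Collections.Generic;

public static class EmptyFailureDemo
{
    // Hypothetical helper: joins failure messages exactly like the
    // onFailure branch of the Match expression does.
    public static string Render(IReadOnlyCollection<string> failures)
    {
        return string.Join(Environment.NewLine, failures);
    }
}
```

Calling Render with an empty collection returns an empty string, which is indistinguishable from what the onSuccess branch produces.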

Even so, the proof-of-concept works in practice. The reason it works is that failure cases will never have empty assertion messages. We know this because (in the proof-of-concept code) only two functions produce assertions, and they each populate the error message collection with a string. You may want to revisit the AssertTrue and AssertEqual functions in the previous article to convince yourself that this is true.

This is a good example of knowledge that 'we' as developers know, but the code currently doesn't capture. Having to deal with such knowledge taxes your working memory, so why not encapsulate such information in the type itself?

How do you encapsulate the knowledge that a collection is never empty? Introduce a NotEmptyCollection collection. I'll reuse the class from the article Semigroups accumulate and add a Concat instance method:

public NotEmptyCollection<T> Concat(NotEmptyCollection<T> other)
{
    return new NotEmptyCollection<T>(Head, Tail.Concat(other).ToArray());
}
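If you don't have the other article at hand, the following is a rough sketch of what such a class might look like. I'm assuming a Head element plus a Tail collection, and that the class implements IReadOnlyCollection<T>; treat the details as assumptions rather than as the exact code from that article.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Sketch of a collection that, by construction, holds at least one element.
public sealed class NotEmptyCollection<T> : IReadOnlyCollection<T>
{
    public NotEmptyCollection(T head, params T[] tail)
    {
        if (head == null)
            throw new ArgumentNullException(nameof(head));

        Head = head;
        Tail = tail ?? throw new ArgumentNullException(nameof(tail));
    }

    public T Head { get; }

    public IReadOnlyCollection<T> Tail { get; }

    // Head is always present, so Count is never less than one.
    public int Count => Tail.Count + 1;

    public NotEmptyCollection<T> Concat(NotEmptyCollection<T> other)
    {
        // Keep this object's Head; everything else goes into the new Tail.
        return new NotEmptyCollection<T>(Head, Tail.Concat(other).ToArray());
    }

    public IEnumerator<T> GetEnumerator()
    {
        yield return Head;
        foreach (var item in Tail)
            yield return item;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Because the constructor demands a head element, there's no way to create an empty instance, and Concat preserves that guarantee.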

Since the two assertion-producing functions both supply an error message in the failure case, it's trivial to change them to return Validated<NotEmptyCollection<string>, Unit> - just change the types used:

public static Validated<NotEmptyCollection<string>, Unit> AssertTrue(
    this bool condition,
    string message)
{
    return condition
        ? Succeed<NotEmptyCollection<string>, Unit>(Unit.Value)
        : Fail<NotEmptyCollection<string>, Unit>(new NotEmptyCollection<string>(message));
}
 
public static Validated<NotEmptyCollection<string>, Unit> AssertEqual<T>(
    T expected,
    T actual)
{
    return Equals(expected, actual)
        ? Succeed<NotEmptyCollection<string>, Unit>(Unit.Value)
        : Fail<NotEmptyCollection<string>, Unit>(
            new NotEmptyCollection<string>($"Expected {expected}, but got {actual}."));
}

This change guarantees that the RunAssertions method only produces an empty errors string in success cases.

Error collection isomorphism #

Assertions are still defined by the Validated sum type, but the success case carries no information: Validated<NotEmptyCollection<T>, Unit>, and the failure case is always guaranteed to contain at least one error message.

This suggests that a simpler representation is possible: One that uses a normal collection of errors, and where an empty collection indicates an absence of errors:

public class Asserted<T>
{
    public Asserted() : this(Array.Empty<T>())
    {
    }
 
    public Asserted(T error) : this(new[] { error })
    {
    }
 
    public Asserted(IReadOnlyCollection<T> errors)
    {
        Errors = errors;
    }
 
    public Asserted<T> And(Asserted<T> other)
    {
        if (other is null)
            throw new ArgumentNullException(nameof(other));
 
        return new Asserted<T>(Errors.Concat(other.Errors).ToList());
    }
 
    public IReadOnlyCollection<T> Errors { get; }
}

The Asserted<T> class is scarcely more than a glorified wrapper around a normal collection, but it's isomorphic to Validated<NotEmptyCollection<T>, Unit>, which the following two functions prove:

public static Asserted<T> FromValidated<T>(this Validated<NotEmptyCollection<T>, Unit> v)
{
    return v.Match(
        failures => new Asserted<T>(failures),
        _ => new Asserted<T>());
}
 
public static Validated<NotEmptyCollection<T>, Unit> ToValidated<T>(this Asserted<T> a)
{
    if (a.Errors.Any())
    {
        var errors = new NotEmptyCollection<T>(
            a.Errors.First(),
            a.Errors.Skip(1).ToArray());
        return Validated.Fail<NotEmptyCollection<T>, Unit>(errors);
    }
    else
        return Validated.Succeed<NotEmptyCollection<T>, Unit>(Unit.Value);
}

You can translate back and forth between Validated<NotEmptyCollection<T>, Unit> and Asserted<T> without loss of information.

A collection, however, gives rise to a monoid, which suggests a much simpler way to compose assertions than using an applicative functor.

Asserted truth #

You can now rewrite the assertion-producing functions to return Asserted<string> instead of Validated<NotEmptyCollection<string>, Unit>.

public static Asserted<string> True(bool condition, string message)
{
    return condition ? new Asserted<string>() : new Asserted<string>(message);
}

This Asserted.True function returns no error messages when condition is true, but a collection with the single element message when it's false.

You can use it in a unit test like this:

var assertResponse = Asserted.True(
    deleteResp.IsSuccessStatusCode,
    $"Actual status code: {deleteResp.StatusCode}.");

You'll see how assertResponse composes with another assertion later in this article. The example continues from the previous article. It's the same test from the same code base.

Asserted equality #

You can also rewrite the other assertion-producing function in the same way:

public static Asserted<string> Equal(object expected, object actual)
{
    if (Equals(expected, actual))
        return new Asserted<string>();
 
    return new Asserted<string>($"Expected {expected}, but got {actual}.");
}

Again, when the assertion passes, it returns no errors; otherwise, it returns a collection with a single error message.

Using it may look like this:

var getResp = await api.CreateClient().GetAsync(address);
var assertState = Asserted.Equal(HttpStatusCode.NotFound, getResp.StatusCode);

At this point, each of the assertions is an object that represents a verification step. By themselves, they neither pass nor fail the test. You have to execute them to reach a verdict.

Evaluating assertions #

The above code listing of the Asserted<T> class already shows how to combine two Asserted<T> objects into one. The And instance method is a binary operation that, together with the parameterless constructor, makes Asserted<T> a monoid.
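If you want to convince yourself of the monoid claim, you could spot-check the laws with a sketch like the following. The Asserted<T> class is repeated so that the example compiles on its own, while AssertedMonoid and its members are hypothetical names that I've made up for illustration. Checking a few examples is, of course, not a proof.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as the Asserted<T> class shown earlier.
public class Asserted<T>
{
    public Asserted() : this(Array.Empty<T>()) { }

    public Asserted(T error) : this(new[] { error }) { }

    public Asserted(IReadOnlyCollection<T> errors) { Errors = errors; }

    public Asserted<T> And(Asserted<T> other)
    {
        if (other is null)
            throw new ArgumentNullException(nameof(other));

        return new Asserted<T>(Errors.Concat(other.Errors).ToList());
    }

    public IReadOnlyCollection<T> Errors { get; }
}

// Hypothetical helpers for spot-checking the monoid laws.
public static class AssertedMonoid
{
    private static bool Same<T>(Asserted<T> x, Asserted<T> y) =>
        x.Errors.SequenceEqual(y.Errors);

    // The parameterless constructor is the identity for And, on both sides.
    public static bool HasIdentity<T>(Asserted<T> x) =>
        Same(new Asserted<T>().And(x), x) && Same(x.And(new Asserted<T>()), x);

    // Grouping doesn't matter: (x And y) And z equals x And (y And z).
    public static bool IsAssociative<T>(Asserted<T> x, Asserted<T> y, Asserted<T> z) =>
        Same(x.And(y).And(z), x.And(y.And(z)));

    // A monoid lets you fold any number of values, including none at all.
    public static Asserted<T> Combine<T>(IEnumerable<Asserted<T>> assertions) =>
        assertions.Aggregate(new Asserted<T>(), (acc, a) => acc.And(a));
}
```

The Combine function illustrates a practical benefit: because the identity exists, combining zero assertions is well-defined and simply yields an object with no errors.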

Once you've combined all assertions into a single Asserted<T> object, you need to Run it to produce a test outcome:

public static void Run(this Asserted<string> assertions)
{
    if (assertions?.Errors.Any() ?? false)
    {
        var messages = string.Join(Environment.NewLine, assertions.Errors);
        throw new Exception(messages);
    }
}

If there are no errors, Run does nothing; otherwise it combines all the error messages together and throws an exception. As was also the case in the previous article, I've allowed myself a few proof-of-concept shortcuts. The framework design guidelines admonish against throwing System.Exception. It might be more appropriate to introduce a new Exception type that also allows enumerating the error messages.
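Such an exception type could look something like the following sketch. Both AssertionsFailedException and AssertedRunner are hypothetical names, and to keep the example self-contained the Run method here takes the error collection directly instead of an Asserted<string> object.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical exception that keeps the individual error messages around.
public sealed class AssertionsFailedException : Exception
{
    public AssertionsFailedException(IReadOnlyCollection<string> errors)
        : base(string.Join(
            Environment.NewLine,
            errors ?? throw new ArgumentNullException(nameof(errors))))
    {
        Errors = errors;
    }

    // Callers can enumerate the messages instead of parsing Message.
    public IReadOnlyCollection<string> Errors { get; }
}

public static class AssertedRunner
{
    // Does nothing when there are no errors; otherwise throws.
    public static void Run(IReadOnlyCollection<string> errors)
    {
        if (errors != null && errors.Any())
            throw new AssertionsFailedException(errors);
    }
}
```

The Message property still contains all messages joined by line breaks, so test runners that only print the message keep working.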

The entire assertion phase of the test looks like this:

var assertResponse = Asserted.True(
    deleteResp.IsSuccessStatusCode,
    $"Actual status code: {deleteResp.StatusCode}.");
var getResp = await api.CreateClient().GetAsync(address);
var assertState = Asserted.Equal(HttpStatusCode.NotFound, getResp.StatusCode);
assertResponse.And(assertState).Run();

You can see the entire test in the previous article. Notice how the two assertion objects are first combined into one with the And binary operation. The result is a single Asserted<string> object on which you can call Run.

Like the previous proof of concept, this assertion passes and fails in the same way. It's possible to compose assertions and collect error messages, instead of short-circuiting on the first failure, even without an applicative functor.

Method chaining #

If you don't like to come up with variable names just to make assertions, it's also possible to use the Asserted API's fluent interface:

var getResp = await api.CreateClient().GetAsync(address);
Asserted
    .True(
        deleteResp.IsSuccessStatusCode,
        $"Actual status code: {deleteResp.StatusCode}.")
    .And(Asserted.Equal(HttpStatusCode.NotFound, getResp.StatusCode))
    .Run();

This isn't necessarily better, but it's an option.

Conclusion #

While it's possible to design non-short-circuiting composable assertions using an applicative functor, it looks as though a simpler solution might solve the same problem. Collect error messages. If none were collected, interpret that as a success.

As I wrote in the introduction article, however, this may not be the last word. Some assertions return values that can be used for other assertions. That's a scenario that I have not yet investigated in this light, and it may change the conclusion. If so, I'll add more articles to this small article series. As I'm writing this, though, I have no such plans.

Did I just, in a roundabout way, write that more research is needed?

Next: Built-in alternatives to applicative assertions.

Comments

Pavel Tupitsyn #

I think NUnit's Assert.Multiple is worth mentioning in this series. It does not require any complicated APIs, just wrap your existing test with multiple asserts into a delegate.

2022-12-20 08:02 UTC

Mark Seemann #

Pavel, thank you for writing. I'm aware of both that API and similar ones for other testing frameworks. As is usually the case, there are trade-offs to consider. I'm currently working on some material that may turn into another article about that.

2022-12-21 20:17 UTC

Mark Seemann #

A new article is now available: Built-in alternatives to applicative assertions.

2023-01-30 12:31 UTC

When do tests fail?

Monday, 12 December 2022 08:33:00 UTC

Optimise for the common scenario.

Unit tests occasionally fail. When does that happen? How often? What triggers it? What information is important when tests fail?

Regularly I encounter the viewpoint that it should be easy to understand the purpose of a test when it fails. Some people consider test names important, a topic that I've previously discussed. Recently I discussed the Assertion Roulette test smell on Twitter, and again I learned some surprising things about what people value in unit tests.

The importance of clear assertion messages #

The Assertion Roulette test smell is often simplified to degeneracy, but it really describes situations where it may be a problem if you can't tell which of several assertions actually caused a test to fail.

Josh McKinney gave a more detailed example than Gerard Meszaros does in the book:

"Background. In a legacy product, we saw some tests start failing intermittently. They weren’t just flakey, but also failed without providing enough info to fix. One of things which caused time to fix to increase was multiple ways of a single test to fail."

Josh McK

He goes on:

"I.e. if you fix the first assertion and you know there still could be flakiness, or long cycle times to see the failure. Multiple assertions makes any test problem worse. In an ideal state, they are fine, but every assertion doubles the amount of failures a test catches."

Josh McK

and concludes:

"the other main way (unrelated) was things like:

assertTrue(someListResult.isRmpty())

Which tells you what failed, but nothing about how.

But the following is worse. You must run the test twice to fix:

assertFalse(someList.isEmpty());
assertEqual(expected, list.get(0));"

Josh McK

The final point is due to the short-circuiting nature of most assertion libraries. That, however, is a solvable problem.

I find the above a compelling example of why Assertion Roulette may be problematic.

It did give me pause, though. How common is this scenario?

Out of the blue #

The situation described by Josh McKinney comes with more than a single warning flag. I hope that it's okay to point some of them out. I didn't get the impression from my interaction with Josh McKinney that he considered the situation ideal in any way.

First, of course, there's the lack of information about the problem. Here, that's a real problem. As I understand it, it makes it harder to reproduce the problem in a development environment.

Next, there are long cycle times, which I interpret to mean that significant time may pass from when you attempt a fix until you can actually observe whether or not it worked. Josh McKinney doesn't say how long, but I wouldn't be surprised if it was measured in days. At least, if the cycle time is measured in days, I can see how this is a problem.

Finally, there's the observation that "some tests start failing intermittently". This was the remark that caught my attention. How often does that happen?

Tests shouldn't do that. Tests should be deterministic. If they're not, you should work to eradicate non-determinism in tests.

I'll be the first to admit that I also write non-deterministic tests. Not by design, but because I make mistakes. I've written many Erratic Tests in my career, and I've documented a few of them on this blog.

While it can happen, it shouldn't be the norm. When it nonetheless happens, eradicating that source of non-determinism should be top priority. Pull the andon cord.

When tests fail #

Ideally, tests should rarely fail. As examined above, you may have Erratic Tests in your test suite, and if you do, these tests will occasionally (or often) fail. As Martin Fowler writes, this is a problem and you should do something about it. He also outlines strategies for it.

Once you've eradicated non-determinism in unit tests, then when do tests fail?

I can think of a couple of situations.

Tests routinely fail as part of the red-green-refactor cycle. This is by design. If no test is failing in the red phase, you probably made a mistake (which also regularly happens to me), or you may not really be doing test-driven development (TDD).

Another situation that may cause a test to fail is if you changed some code and triggered a regression test.

In both cases, tests don't just fail out of the blue. They fail as an immediate consequence of something you did.

Optimise for the common scenario #

In both cases you're (hopefully) in a tight feedback loop. If you're in a tight feedback loop, then how important is the assertion message really? How important is the test name?

You work on the code base, make some changes, run the tests. If one or more tests fail, it's correlated to the change you just made. You should have a good idea of what went wrong. Are code forensics and elaborate documentation really necessary to understand a test that failed because you just did something a few minutes before?

The reason I don't care much about test names or whether there's one or more assertions in a unit test is exactly that: When tests fail, it's usually because of something I just did. I don't need diagnostic tools to find the root cause. The root cause is the change that I just made.

That's my common scenario, and I try to optimise my processes for the common scenarios.

Fast feedback #

There's an implied way of working that affects such attitudes. Since I learned about TDD in 2003 I've always relished the fast feedback I get from a test suite. Since I tried continuous deployment around 2014, I consider it central to modern software engineering (and Accelerate strongly suggests so, too).

The modus operandi I outline above is one of fast feedback. If you're sitting on a feature branch for weeks before integrating into master, or if you can only deploy two times a year, this influences what works and what doesn't.

Both Modern Software Engineering and Accelerate make a strong case that short feedback cycles are pivotal for successful software development organisations.

I also understand that that's not the reality for everyone. When faced with long cycle times, a multitude of Erratic Tests, a legacy code base, and so on, other things become important. In those circumstances, tests may fail for different reasons.

When you work with TDD, continuous integration (CI), and continuous deployment (CD), then when do tests fail? They fail because you made them fail, only minutes earlier. Fix your code and move forward.

Conclusion #

When discussing test names and assertion messages, I've been surprised by the emphasis some people put on what I consider to be of secondary importance. I think the explanation is that circumstances differ.

With TDD and CI/CD you mostly look at a unit test when you write it, or if some regression test fails because you changed some code (perhaps in response to a test you just wrote). Your test suite may have hundreds or thousands of tests. Most of these pass every time you run the test suite. That's the normal state of affairs.

In other circumstances, you may have Erratic Tests that fail unpredictably. You should make it a priority to stop that, but as part of that process, you may need good assertion messages and good test names.

Different circumstances call for different reactions, so what works well in one situation may be a liability in other situations. I hope that this article has shed a little light on the forces you may want to consider.

GitHub Copilot preliminary experience report

Monday, 05 December 2022 08:37:00 UTC

Based on a few months of use.

I've been evaluating GitHub Copilot since August 2022. Perhaps it's time to collect my thoughts so far.

In short, it's surprisingly good, but also gets a lot of things wrong. It does seem helpful to the experienced programmer, but I don't see it replacing all programmers yet.

Not only for boilerplate code #

I was initially doubtful. I'd seen some demos where Copilot created fifteen to twenty lines of code to, say, make a REST API request. These examples mostly struck me as auto-generation of something that ought to be a proper abstraction: A method in a reusable library.

In general I don't think that typing is a productivity bottleneck, and I'm sceptical of productivity tools, and particularly code generators. The more code a code base contains, the more code there is to read. Accelerating code production doesn't strike me as a goal in itself.

On the other hand, I'm past fifty and aware that at my age, one often tends to resist new ideas. Perhaps the above justifications are nothing but motivated reasoning. I decided to install the plug-ins and go about my business to see what would happen.

It has surprised and delighted me more than once. It's not only for boilerplate code. Here's a recent example.

I was doing an exercise in F# and wrote this type definition:

type Instruction = Enqueue of int | Dequeue

I wanted to write a property-based test with this type, and since I was using Hedgehog I needed a generator for the type. While FsCheck can automatically generate values of custom discriminated unions, Hedgehog has a more explicit-is-better-than-implicit design philosophy. In other words, if there's a way to make Hedgehog automatically generate values of a type like the above Instruction, it has so far escaped me.

So I started typing let private genInstruction = after which Copilot suggested the following:

let private genInstruction =
    Gen.frequency
        [ 1, Gen.constant Dequeue
          1, Gen.int32 (Range.constant 0 10_000) |> Gen.map Enqueue ]

I was, frankly, stunned. While I do understand that Copilot doesn't actually understand the code it suggests, Hedgehog is sufficiently esoteric that I didn't expect Copilot to have enough training data to enable it to make a useful contribution in this niche. I was wrong. I've repeatedly seen Copilot make suggestions to my F# and Haskell code. It's not just for C#, JavaScript, or Python code.

The above suggestion was, to be clear, absolutely appropriate and compiled right away. The only detail I decided to change was the Range, which I decided to change to Range.linear. That's not, however, a significant change.

Perhaps you're not impressed by three lines of auto-generated code. How much of a productivity improvement is that? Quite a bit, in my case.

It wouldn't have taken me long to type those three lines of code, but as I already mentioned, typing isn't a bottleneck. On the other hand, looking up an unfamiliar API can take some time. The Programmer's Brain discusses this kind of problem and suggests exercises to address it. Does Copilot offer a shortcut?

While I couldn't remember the details of Hedgehog's API, once I saw the suggestion, I recognised Gen.frequency, so I understood it as an appropriate code suggestion. The productivity gain, if there is one, may come from saving you the effort of looking up unfamiliar APIs, rather than saving you some keystrokes.

In this example, I already knew of the Gen.frequency function - I just couldn't recall the exact name and type. This enabled me to evaluate Copilot's suggestion and deem it correct. If I hadn't known that API already, how could I have known whether to trust Copilot?

Detectably wrong suggestions #

As amazing as Copilot can be, it's hardly faultless. It makes many erroneous suggestions. Sometimes the suggestion is obviously wrong. If you accept it, it doesn't compile. Sometimes, the compilation error is only a little edit from being correct, but at least in such situations you'll be explicitly aware that the suggestion couldn't be used verbatim.

Other suggestions are wrong, but less conspicuously so. Here's an example.

I was recently subjecting the code base that accompanies Code That Fits in Your Head to the mutation testing tool Stryker. Since it did point out a few possible mutations, I decided to add a few tests. One was of a wrapper class called TimeOfDay. Because of static code analysis rules, it came with conversions to and from TimeSpan, but these methods weren't covered by any tests.

In order to remedy that situation, I started writing an FsCheck property and came as far as:

[Property]
public void ConversionsRoundTrip(TimeSpan timeSpan)

At that point Copilot suggested the following, which I accepted:

[Property]
public void ConversionsRoundTrip(TimeSpan timeSpan)
{
    var timeOfDay = new TimeOfDay(timeSpan);
    var actual = (TimeSpan)timeOfDay;
    Assert.Equal(timeSpan, actual);
}

Looks good, doesn't it? Again, I was impressed. It compiled, and it even looks as though Copilot had picked up one of my naming conventions: naming variables by role, in this case actual.

While I tend to be on guard, I immediately ran the test suite instead of thinking it through. It failed. Keep in mind that this is a characterisation test, so it was supposed to pass.

The TimeOfDay constructor reveals why:

public TimeOfDay(TimeSpan durationSinceMidnight)
{
    if (durationSinceMidnight < TimeSpan.Zero ||
        TimeSpan.FromHours(24) < durationSinceMidnight)
        throw new ArgumentOutOfRangeException(
            nameof(durationSinceMidnight),
            "Please supply a TimeSpan between 0 and 24 hours.");
 
    this.durationSinceMidnight = durationSinceMidnight;
}

While FsCheck knows how to generate TimeSpan values, it'll generate arbitrary durations, including negative values and spans much longer than 24 hours. That explains why the test fails.
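To see how little of the TimeSpan range the constructor accepts, consider this small sketch. IsValidTimeOfDay is a hypothetical helper that restates the constructor's guard clause as a predicate; it isn't part of the code base.

```csharp
using System;

public static class TimeOfDayRangeDemo
{
    // Hypothetical predicate mirroring the guard clause in the TimeOfDay
    // constructor: only durations from 0 to 24 hours (inclusive) pass.
    public static bool IsValidTimeOfDay(TimeSpan durationSinceMidnight)
    {
        return TimeSpan.Zero <= durationSinceMidnight
            && durationSinceMidnight <= TimeSpan.FromHours(24);
    }
}
```

Values such as TimeSpan.FromHours(-1), TimeSpan.FromDays(2), TimeSpan.MinValue, and TimeSpan.MaxValue all fall outside that window, and an unconstrained generator is free to produce any of them.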

Granted, this is hardly a searing indictment against Copilot. After all, I could have made this mistake myself.

Still, that prompted me to look for more issues with the code that Copilot had suggested. Another problem with the code is that it tests the wrong API. The suggested test tries to round-trip via the TimeOfDay class' explicit cast operators, which were already covered by tests. Well, I might eventually have discovered that, too. Keep in mind that I was adding this test to improve the code base's Stryker score. After running the tool again, I would probably eventually have discovered that the score didn't improve. It takes Stryker around 25 minutes to test this code base, though, so it wouldn't have been rapid feedback.

Since, however, I examined the code with a critical eye, I noticed this by myself. This would clearly require changing the test code as well.

In the end, I wrote this test:

[Property]
public void ConversionsRoundTrip(TimeSpan timeSpan)
{
    var expected = ScaleToTimeOfDay(timeSpan);
    var sut = TimeOfDay.ToTimeOfDay(expected);
 
    var actual = TimeOfDay.ToTimeSpan(sut);
 
    Assert.Equal(expected, actual);
}
 
private static TimeSpan ScaleToTimeOfDay(TimeSpan timeSpan)
{
    // Convert an arbitrary TimeSpan to a 24-hour TimeSpan.
    // The data structure that underlies TimeSpan is a 64-bit integer,
    // so first we need to identify the range of possible TimeSpan
    // values. It might be easier to understand to calculate
    // TimeSpan.MaxValue - TimeSpan.MinValue, but that underflows.
    // Instead, the number of possible 64-bit integer values is the same
    // as the number of possible unsigned 64-bit integer values.
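    var range = (double)ulong.MaxValue;
    var domain = TimeSpan.FromHours(24).Ticks;
    // Shift so that TimeSpan.MinValue maps to zero, then scale using
    // floating-point arithmetic. Dividing one ulong by another would
    // truncate the scale to zero and turn every input into TimeSpan.Zero.
    var offset = (double)timeSpan.Ticks - long.MinValue;
    var expected = TimeSpan.FromTicks((long)(offset / range * domain));
    return expected;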
}

In this case, Copilot didn't improve my productivity. It may actually have slowed me down a bit.

This time, it wasn't too difficult to spot issues with the suggested code. What if the problems are more subtle?

Errors that are difficult to detect #

How do bugs appear? We write them, thinking that our code is going to do one thing, while the compiler decides otherwise. Even when we actively engage with the code, we may be looking at a bug and still fail to see it.

Why should we trust Copilot to produce bug-free code?

Here's another example. I was going through the Range kata in F# and made my Endpoint discriminated union a functor:

type Endpoint<'a> = Open of 'a | Closed of 'a
 
module Endpoint =
    let map f = function
        | Open   x -> Open   (f x)
        | Closed x -> Closed (f x)

Making a type a functor is such a well-understood process that Haskell can automate it. I knew how to write the above map function as well, so I wrote it without testing it.

After I'd written the map function, I decided to add tests of the two functor laws. Not because I doubted the implementation, but because I might need that as example code later.

Copilot suggested this test after I named it:

open Xunit
open Hedgehog
 
[<Fact>]
let ``First functor law`` () = Property.check <| property {
    let! x = Gen.int32 (Range.linearBounded ())
    let f = (fun x -> x + 1)
    let g = (fun x -> x * 2)
 
    let actual = Endpoint.map f (Endpoint.map g (Open x))
    let expected = Endpoint.map (f >> g) (Open x)
 
    Assert.Equal(expected, actual) }

It compiles. It also fails.

What's wrong with it?

If you know the functor laws by heart, you'll be able to spot that this is clearly not the first functor law. On the other hand, it looks convincingly like the second functor law. Should I just change the name and move on?

I can't, though, since the test fails. Could there be a bug in my map function, after all?

No, there's an error in the test. I invite you to spot it.

In terms of keystrokes, it's easy to fix the problem:

open Xunit
open Hedgehog
 
[<Fact>]
let ``First functor law`` () = Property.check <| property {
    let! x = Gen.int32 (Range.linearBounded ())
    let f = (fun x -> x + 1)
    let g = (fun x -> x * 2)
 
    let actual = Endpoint.map f (Endpoint.map g (Open x))
    let expected = Endpoint.map (f << g) (Open x)
 
    Assert.Equal(expected, actual) }

Spot the edit. I bet it'll take you longer to find it than it took me to type it.

The test now passes, but for one who has spent less time worrying over functor laws than I have, troubleshooting this could have taken a long time.
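For readers more at home in C#, the difference between F#'s two composition operators can be sketched as follows. The Forward and Backward names are mine, introduced only for illustration: f >> g applies f first and then g, while f << g applies g first and then f.

```csharp
using System;

public static class CompositionOrderDemo
{
    // Corresponds to F#'s f >> g: apply f first, then g.
    public static Func<A, C> Forward<A, B, C>(Func<A, B> f, Func<B, C> g)
    {
        return x => g(f(x));
    }

    // Corresponds to F#'s f << g: apply g first, then f.
    public static Func<A, C> Backward<A, B, C>(Func<B, C> f, Func<A, B> g)
    {
        return x => f(g(x));
    }
}
```

With f adding one and g doubling, Backward(f, g)(3) is f(g(3)), or 7, whereas Forward(f, g)(3) is g(f(3)), or 8. Mapping g first and then f therefore corresponds to a single map of f << g, not of f >> g.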

These almost-right suggestions from Copilot both worry me and give me hope.

Copilot for experienced programmers #

When a new technology like Copilot appears, it's natural to speculate on the consequences. Does this mean that programmers will lose their jobs?

This is just a preliminary evaluation after a few months, so I could be wrong, but I think we programmers are safe. If you're experienced, you'll be able to tell most of Copilot's hits from its misses. Perhaps you'll get a productivity improvement out of it, but it could also slow you down.

The tool is likely to improve over time, so I'm hopeful that this could become a net productivity gain. Still, with this high an error rate, I'm not too worried yet.

The Pragmatic Programmer describes a programming style named Programming by Coincidence. People who develop software this way have only a partial understanding of the code they write.

"Fred doesn't know why the code is failing because he didn't know why it worked in the first place."

Andy Hunt and Dave Thomas, The Pragmatic Programmer

I've encountered my fair share of these people. When editing code, they make small adjustments and do cursory manual testing until 'it looks like it works'. If they have to start a new feature or are otherwise faced with a metaphorical blank page, they'll copy some code from somewhere else and use that as a starting point.

You'd think that Copilot could enhance the productivity of such people, but I'm not sure. It might actually slow them down. These people don't fully understand the code they themselves 'write', so why should we expect them to understand the code that Copilot suggests?

If faced with a Copilot suggestion that 'almost works', will they be able to spot if it's a genuinely good suggestion, or whether it's off, like I've described above? If the Copilot code doesn't work, how much time will they waste thrashing?

Conclusion #

GitHub Copilot has the potential to be a revolutionary technology, but it's not, yet. So far, I'm not too worried. It's an assistant, like a pairing partner, but it's up to you to evaluate whether the code that Copilot suggests is useful, correct, and safe. How can you do that unless you already know what you're doing?

If you don't have the qualifications to evaluate the suggested code, I fail to see how it's going to help you. Granted, it does have potential to help you move on in less time than you would otherwise have spent. In this article, I showed one example where I would have had to spend significant time looking up API documentation. Instead, Copilot suggested the correct code to use.

Pulling in the other direction are the many false positives. Copilot makes many suggestions, and many of them are poor. The ones that are recognisably bad are unlikely to slow you down. I'm more concerned with those that are subtly wrong. They have the potential to waste much time.

Which of these forces is strongest? The potential for wasting time is infinite, while the maximum productivity gain you can achieve is 100 percent. That's an asymmetric distribution. There's a long tail of time wasters, but there's no equivalent long tail of improvement.

I'm not, however, trying to be pessimistic. I expect to keep Copilot around for the time being. It could very well be here to stay. Used correctly, it seems useful.

Is it going to replace programmers? Hardly. Rather, it may enable poor developers to make such a mess of things that you need even more good programmers to subsequently fix things.

An initial proof of concept of applicative assertions in C#

Monday, 28 November 2022 06:47:00 UTC

Worthwhile? Not obviously.

This article is the first instalment in a small article series about applicative assertions. It explores a way to compose assertions in such a way that failure messages accumulate rather than short-circuit. It assumes that you've read the article series introduction.

Assertions are typically based on throwing exceptions. As soon as one assertion fails, an exception is thrown and no further assertions are evaluated. This is normal short-circuiting behaviour of exceptions. In some cases, however, it'd be useful to keep evaluating other assertions and collect error messages.

This article series explores an intriguing idea to address such issues: Use an applicative functor to collect multiple assertion messages. I started experimenting with the idea to see where it would lead. The article series serves as a report of what I found. It is neither a recommendation nor a caution. I still find the idea interesting, but I'm not sure whether the complexity is warranted.

Example scenario #

A realistic example is often illustrative, although there's a risk that the realism carries with it some noise that detracts from the core of the matter. I'll reuse an example that I've already discussed and explained in greater detail. The code is from the code base that accompanies my book Code That Fits in Your Head.

This test has two independent assertions:

[Theory]
[InlineData(884, 18, 47, "c@example.net", "Nick Klimenko", 2)]
[InlineData(902, 18, 50, "emot@example.gov", "Emma Otting", 5)]
public async Task DeleteReservation(
    int days, int hours, int minutes, string email, string name, int quantity)
{
    using var api = new LegacyApi();
    var at = DateTime.Today.AddDays(days).At(hours, minutes)
        .ToIso8601DateTimeString();
    var dto = Create.ReservationDto(at, email, name, quantity);
    var postResp = await api.PostReservation(dto);
    Uri address = FindReservationAddress(postResp);
 
    var deleteResp = await api.CreateClient().DeleteAsync(address);
 
    Assert.True(
        deleteResp.IsSuccessStatusCode,
        $"Actual status code: {deleteResp.StatusCode}.");
    var getResp = await api.CreateClient().GetAsync(address);
    Assert.Equal(HttpStatusCode.NotFound, getResp.StatusCode);
}

The test exercises the REST API to first create a reservation, then delete it, and finally check that the reservation no longer exists. Two independent postconditions must be true for the test to pass:

The DELETE request must result in a status code that indicates success.
The resource must no longer exist.

It's conceivable that a bug might fail one of these without invalidating the other.

As the test is currently written, it uses xUnit.net's standard assertion library. If the Assert.True verification fails, the Assert.Equal statement isn't evaluated.

Assertions as validations #

Is it possible to evaluate the Assert.Equal postcondition even if the first assertion fails? You could use a try/catch block, but is there a more composable and elegant option? How about an applicative functor?
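For comparison, the try/catch option could be sketched roughly as follows. This is only an illustration of that alternative; the TryCatchAssertions class and its Collect method are hypothetical names and not part of xUnit.net or the book's code base:

```csharp
using System;
using System.Collections.Generic;

public static class TryCatchAssertions
{
    // Hypothetical helper: runs every check, catches whatever each one
    // throws, and returns the collected messages instead of stopping at
    // the first failure.
    public static IReadOnlyCollection<string> Collect(params Action[] checks)
    {
        var errors = new List<string>();
        foreach (var check in checks)
        {
            try
            {
                check();
            }
            catch (Exception e)
            {
                errors.Add(e.Message);
            }
        }
        return errors;
    }
}
```

Each assertion would have to be wrapped in a lambda expression, for example Collect(() => Assert.True(...), () => Assert.Equal(...)). It collects all messages, but the checks don't compose as values.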

Since I was interested in exploring this question as a proof of concept, I decided to reuse the machinery that I'd already put in place for the article An applicative reservation validation example in C#: The Validated class and its associated functions. In a sense, you can think of an assertion as a validation of a postcondition.

This is not a resemblance I intend to carry too far. What I learn by experimenting with Validated I can apply to a more appropriately-named class like Asserted.

Neither of the two above assertions returns a value; they are one-stop assertions. If they succeed, they return nothing; if they fail, they produce an error.

It's possible to model this kind of behaviour with Validated. You can model a collection of errors with, well, a collection. To keep the proof of concept simple, I decided to use a collection of strings: IReadOnlyCollection<string>. To model 'nothing' I had to add a unit type:

public sealed class Unit
{
    private Unit() { }
 
    public readonly static Unit Value = new Unit();
}

This enabled me to define assertions as Validated<IReadOnlyCollection<string>, Unit> values: Either a collection of error messages, or nothing.

Asserting truth #

Instead of xUnit.net's Assert.True, you can now define an equivalent function:

public static Validated<IReadOnlyCollection<string>, Unit> AssertTrue(
    this bool condition,
    string message)
{
    return condition
        ? Succeed<IReadOnlyCollection<string>, Unit>(Unit.Value)
        : Fail<IReadOnlyCollection<string>, Unit>(new[] { message });
}

It simply returns a Success value containing nothing when condition is true, and otherwise a Failure value containing the error message.

You can use it like this:

var assertResponse = Validated.AssertTrue(
    deleteResp.IsSuccessStatusCode,
    $"Actual status code: {deleteResp.StatusCode}.");

Later in the article you'll see how this assertion combines with another assertion.

Asserting equality #

Instead of xUnit.net's Assert.Equal, you can also define a function that works the same way but returns a Validated value:

public static Validated<IReadOnlyCollection<string>, Unit> AssertEqual<T>(
    T expected,
    T actual)
{
    return Equals(expected, actual)
        ? Succeed<IReadOnlyCollection<string>, Unit>(Unit.Value)
        : Fail<IReadOnlyCollection<string>, Unit>(new[]
            { $"Expected {expected}, but got {actual}." });
}

The AssertEqual function first uses Equals to compare expected with actual. If the result is true, the function returns a Success value containing nothing; otherwise, it returns a Failure value containing a failure message. Since this is only a proof of concept, the failure message is useful, but minimal.

Notice that this function returns a value of the same type (Validated<IReadOnlyCollection<string>, Unit>) as AssertTrue.

You can use the function like this:

var assertState = Validated.AssertEqual(HttpStatusCode.NotFound, getResp.StatusCode);

Again, you'll see how to combine this assertion with the above assertResponse value later in this article.

Evaluating assertions #

The DeleteReservation test only has two independent assertions, so in my proof of concept, all I needed to do was to figure out a way to combine two applicative assertions into one, and then evaluate it. This rather horrible method does that:

public static void RunAssertions(
    Validated<IReadOnlyCollection<string>, Unit> assertion1,
    Validated<IReadOnlyCollection<string>, Unit> assertion2)
{
    var f = Succeed<IReadOnlyCollection<string>, Func<Unit, Unit, Unit>>((_, __) => Unit.Value);
    Func<IReadOnlyCollection<string>, IReadOnlyCollection<string>, IReadOnlyCollection<string>>
        combine = (x, y) => x.Concat(y).ToArray();
 
    Validated<IReadOnlyCollection<string>, Unit> composition = f
        .Apply(assertion1, combine)
        .Apply(assertion2, combine);
    string errors = composition.Match(
        onFailure: f => string.Join(Environment.NewLine, f),
        onSuccess: _ => string.Empty);
 
    if (!string.IsNullOrEmpty(errors))
        throw new Exception(errors);
}

C# doesn't have good language features for applicative functors the same way that F# and Haskell do, and although you can use various tricks to make the programming experience better than what is on display here, I was still doing a proof of concept. If it turns out that this approach is useful and warranted, we can introduce some of the facilities to make the API more palatable. For now, though, we're dealing with all the rough edges.

The way that applicative functors work, you typically use a 'lifted' function to combine two (or more) 'lifted' values. Here, 'lifted' means 'being inside the Validated container'.

Each of the assertions that I want to combine has the same type: Validated<IReadOnlyCollection<string>, Unit>. Notice that the S (success) generic type argument is Unit in both cases. While it seems redundant, formally I needed a 'lifted' function to combine two Unit values into a single value. This single value can (in principle) have any type I'd like it to have, but since you can't extract any information out of a Unit value, it makes sense to use the monoidal nature of unit to combine two into one.

Basically, you just ignore the Unit input values because they carry no information. Also, they're all the same value anyway, since the type is a Singleton. In its 'naked' form, the function might be implemented like this: (_, __) => Unit.Value. Due to the ceremony required by the combination of C# and applicative functors, however, this monoidal binary operation has to be 'lifted' to a Validated value. That's the f value in the RunAssertions function body.

The Validated.Apply function requires as an argument a function that combines the generic F (failure) values into one, in order to deal with the case where there are multiple failures. In this case F is IReadOnlyCollection<string>. Since declarations of Func values in C# require explicit type declarations, that's a bit of a mouthful, but the combine function just concatenates two collections into one.
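To see the combine function in isolation, here's a small sketch. The CombineDemo class is a hypothetical wrapper added for illustration; the lambda expression itself is the same as in RunAssertions:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CombineDemo
{
    // Same lambda expression as the combine function in RunAssertions:
    // concatenates two error collections, preserving order.
    public static readonly
        Func<IReadOnlyCollection<string>, IReadOnlyCollection<string>, IReadOnlyCollection<string>>
        Combine = (x, y) => x.Concat(y).ToArray();
}
```

Calling CombineDemo.Combine(new[] { "foo" }, new[] { "bar", "baz" }) produces a collection with the three messages in that order, and combining with an empty collection leaves the other collection's messages unchanged.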

The RunAssertions method can now Apply both assertion1 and assertion2 to f, which produces a combined Validated value, composition. It then matches on the combined value to produce a string value. If there are no assertion messages, the result is the empty string; otherwise, the function combines the assertion messages with a NewLine between each. Again, this is proof-of-concept code. A more robust and flexible API (if warranted) might keep the errors around as a collection of strongly typed Value Objects.

Finally, if the resulting errors string is not null or empty, the RunAssertions method throws an exception with the combined error message(s). Here I once more invoked my proof-of-concept privilege to throw an Exception, even though the framework design guidelines admonish against doing so.

Ultimately, then, the assert phase of the test looks like this:

var assertResponse = Validated.AssertTrue(
    deleteResp.IsSuccessStatusCode,
    $"Actual status code: {deleteResp.StatusCode}.");
var getResp = await api.CreateClient().GetAsync(address);
var assertState =
    Validated.AssertEqual(HttpStatusCode.NotFound, getResp.StatusCode);
Validated.RunAssertions(assertResponse, assertState);

The rest of the test hasn't changed.

Outcomes #

Running the test with the applicative assertions passes, as expected. In order to verify that it works as it's supposed to, I tried to sabotage the System Under Test (SUT) in various ways. First, I made the Delete method that handles DELETE requests a no-op, while still returning 200 OK. As you'd expect, the result is a test failure with this message:

Message:
System.Exception : Expected NotFound, but got OK.

This is the assertion that verifies that getResp.StatusCode is 404 Not Found. It fails because the sabotaged Delete method doesn't delete the reservation.

Then I further sabotaged the SUT to also return an incorrect status code (400 Bad Request), which produced this failure message:

Message:
System.Exception : Actual status code: BadRequest.
Expected NotFound, but got OK.

Notice that the message contains information about both failure conditions.

Finally, I re-enabled the correct behaviour (deleting the reservation from the data store) while still returning 400 Bad Request:

Message:
System.Exception : Actual status code: BadRequest.

As desired, the assertions collect all relevant failure messages.

Conclusion #

Not surprisingly, it's possible to design a composable assertion API that collects multiple failure messages using an applicative functor. Anyone who knows how applicative validation works would have been able to predict that outcome. That's not what the above proof of concept was about. What I wanted to see was rather how it would play out in a realistic scenario, and whether using an applicative functor is warranted.

Applicative functors don't gel well with C#, so unsurprisingly the API is awkward. It's likely possible to smooth much of the friction, but without good language support and syntactic sugar, it's unlikely to become idiomatic C#.

Rather than taking the edge off the unwieldy API, the implementation of RunAssertions suggests another alternative.

Next: Error-accumulating composable assertions in C#.

Decouple to delete

Monday, 21 November 2022 08:46:00 UTC

Don't try to predict the future.

Do you know why it's called spaghetti code? It's a palatable metaphor. You may start with a single spaghetto, but usually, as you wind your fork around it, the whole dish follows along. Unless you're careful, eating spaghetti can be a mess.

A small spaghetti serving.

Spaghetti code is tangled and everything is directly or transitively connected to everything else. As you try to edit the code, every change you make affects other code. Fix one thing and another thing breaks, cascading through the code base.

I was recently reading Clean Architecture, and as Robert C. Martin was explaining the Dependency Inversion Principle for the umpteenth time, my brain made a new connection. To be clear: Connecting (coupling) code is bad, but connecting ideas is good.

What a tangled web we weave #

It's impractical to write code that depends on nothing else. Most code will call other code, which again calls other code. It behoves us, though, to be careful that the web of dependencies doesn't get too tangled.

Imagine a code base where the dependency graph looks like this:

A connected graph.

Think of each node as a unit of code; a class or a module. While a dependency graph is a directed graph, I didn't indicate the directions. Imagine that most edges point both ways, so that the nodes are interdependent. In other words, the graph has cycles. This is not uncommon in C# code.

Pick any node in such a graph, and chances are that other nodes depend on it. This makes it hard to make changes to the code in that node, because a change may affect the code that depends on it. As you try to fix the depending code, that change, too, ripples through the network.

This already explains why tight coupling is problematic.

It is difficult to make predictions, especially about the future #

When you write source code, you might be tempted to try to take into account future needs and requirements. There may be a historical explanation for that tendency.

"That is, once it was a sign of failure to change product code. You should have gotten it right the first time."

Brian Marick

In the days of punchcards, you had to schedule time to use a computer. If you made a mistake in your program, you typically didn't have time to fix it during your timeslot. A mistake could easily cost you days as you scrambled to schedule a new time. Not surprisingly, emphasis was on correctness.

With this mindset, it's natural to attempt to future-proof code.

YAGNI #

With interactive development environments you can get rapid feedback. If you make a mistake, change the code and observe the outcome. Don't add code because you think that you might need it later. You probably will not.

While you should avoid speculative generality, that alone is no guarantee of clean code. Unless you're careful, you can easily make a mess by tightly coupling different parts of your code base.

How do you produce a code base that is as easy to change as possible?

Write code that is easy to delete #

Write code that is easy to change. The ultimate change you can make is to delete code. After that, you can write something else that better does what you need.

"A system where you can delete parts without rewriting others is often called loosely coupled"

tef

I don't mean that you should always delete code in order to make changes, but often, looking at extremes can provide insights into less extreme cases.

When you have a tangled web as shown above, most of the code is coupled to other parts. If you delete a node, then you break something else. You'd think that deleting code is the easiest thing in the world, but it's not.

What if, on the other hand, you have smaller clusters of nodes that are independent?

A disconnected graph with small islands of connected graphs.

If your dependency graph looks like this, you can at least delete each of the 'islands' without impacting the other sub-graphs.
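One way to check whether a code base has such islands is to compute the connected components of its dependency graph. The following is a minimal sketch, not a description of any existing tool. The DependencyIslands class and its Find method are hypothetical, and the code assumes that every dependency refers to nodes present in the nodes collection:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DependencyIslands
{
    // Ignores edge direction and groups the nodes into connected
    // components ('islands'). Assumes every dependency mentions only
    // nodes that are present in the nodes collection.
    public static IReadOnlyList<ISet<string>> Find(
        IEnumerable<string> nodes,
        IEnumerable<(string Source, string Target)> dependencies)
    {
        var neighbours = nodes
            .Distinct()
            .ToDictionary(n => n, n => new HashSet<string>());
        foreach (var (source, target) in dependencies)
        {
            neighbours[source].Add(target);
            neighbours[target].Add(source);
        }

        var visited = new HashSet<string>();
        var islands = new List<ISet<string>>();
        foreach (var start in neighbours.Keys)
        {
            if (!visited.Add(start))
                continue; // Already part of a previously found island.

            // Depth-first traversal from the unvisited start node.
            var island = new HashSet<string> { start };
            var pending = new Stack<string>();
            pending.Push(start);
            while (pending.Count > 0)
            {
                foreach (var next in neighbours[pending.Pop()])
                {
                    if (visited.Add(next))
                    {
                        island.Add(next);
                        pending.Push(next);
                    }
                }
            }
            islands.Add(island);
        }
        return islands;
    }
}
```

With nodes A to E and the dependencies A-B, B-C, and D-E, Find returns two islands: one with A, B, and C, and one with D and E. Either could be deleted without touching the other.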

The graph from the previous figure, less one sub-graph.

Writing code that is easy to delete may be a good idea, but even that is easier said than done. Loose coupling is, once more, key to good architecture.

Add something better #

Once you've deleted a cluster of code, you have the opportunity to add something that is even less coupled than the island you deleted.

The graph from the previous figure, with new small graphs added.

If you add new code that is less coupled than the code you deleted, it's even easier to delete again.

Conclusion #

Coupling is a key factor in code organisation. Tightly coupled code is difficult to change. Loosely coupled code is easier to change. As a thought experiment, consider how difficult it would be to delete a particular piece of code. The easier it is to delete the code, the less coupled it is.

Deleting a small piece of code to add new code in its stead is the ultimate change. You can often get by with a less radical edit, but if all else fails, delete part of your code base and start over. The less coupled the code is, the easier it is to change.

The Reader monad

Monday, 14 November 2022 06:50:00 UTC

Normal functions form monads. An article for object-oriented programmers.

This article is an instalment in an article series about monads. A previous article described the Reader functor. As is the case with many (but not all) functors, Readers also form monads.

This article continues where the Reader functor article stopped. It uses the same code base.

Flatten #

A monad must define either a bind or join function, although you can use other names for both of these functions. Flatten is in my opinion a more intuitive name than join, since a monad is really just a functor that you can flatten. Flattening is relevant if you have a nested functor; in this case a Reader within a Reader. You can flatten such a nested Reader with a Flatten function:

public static IReader<R, A> Flatten<R, A>(
    this IReader<R, IReader<R, A>> source)
{
    return new FlattenReader<R, A>(source);
}
 
private class FlattenReader<R, A> : IReader<R, A>
{
    private readonly IReader<R, IReader<R, A>> source;
 
    public FlattenReader(IReader<R, IReader<R, A>> source)
    {
        this.source = source;
    }
 
    public A Run(R environment)
    {
        IReader<R, A> newReader = source.Run(environment);
        return newReader.Run(environment);
    }
}

Since the source Reader is nested, calling its Run method once returns a newReader. You can Run that newReader one more time to get an A value to return.

You could easily chain the two calls to Run together, one after the other. That would make the code terser, but here I chose to do it in two explicit steps in order to show what's going on.

Like the previous article about the State monad, a lot of ceremony is required because this variation of the Reader monad is defined with an interface. You could also define the Reader monad on a 'raw' function of the type Func<R, A>, in which case Flatten would be simpler:

public static Func<R, A> Flatten<R, A>(this Func<R, Func<R, A>> source)
{
    return environment => source(environment)(environment);
}

In this variation source is a function, so you can call it with environment, which returns another function that you can again call with environment. This produces an A value for the function to return.
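As a small usage sketch of the function-based Flatten, consider a made-up nested function where both layers read the same environment. The FuncFlatten and FlattenExample class names, as well as the padding example, are hypothetical and only serve to illustrate the mechanics:

```csharp
using System;

public static class FuncFlatten
{
    // The function-based Flatten from above.
    public static Func<R, A> Flatten<R, A>(this Func<R, Func<R, A>> source)
    {
        return environment => source(environment)(environment);
    }
}

public static class FlattenExample
{
    public static string Run(int environment)
    {
        // Hypothetical nested Reader: the outer function uses the
        // environment as a width, and the inner function uses it as the
        // number to pad.
        Func<int, Func<int, string>> nested =
            width => n => n.ToString().PadLeft(width, '0');

        Func<int, string> flat = nested.Flatten();
        return flat(environment);
    }
}
```

Here, FlattenExample.Run(3) returns "003", because the same environment value (3) is passed to both the outer and the inner function.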

SelectMany #

When you have Flatten you can always define SelectMany (monadic bind) like this:

public static IReader<R, B> SelectMany<R, A, B>(
    this IReader<R, A> source,
    Func<A, IReader<R, B>> selector)
{
    return source.Select(selector).Flatten();
}

First use functor-based mapping. Since the selector returns a Reader, this mapping produces a Reader within a Reader. That's exactly the situation that Flatten addresses.

The above SelectMany example works with the IReader<R, A> interface, but the 'raw' function version has the exact same implementation:

public static Func<R, B> SelectMany<R, A, B>(
    this Func<R, A> source,
    Func<A, Func<R, B>> selector)
{
    return source.Select(selector).Flatten();
}

Only the method declaration differs.
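The function-based SelectMany relies on a Select method for Func<R, A>, which belongs to the Reader functor article and isn't repeated here. The following self-contained sketch assumes that Select has the usual shape of function composition. The FuncReader and SelectManyExample class names are hypothetical, and F and G are function versions of the F and G classes shown in the Associativity section:

```csharp
using System;

public static class FuncReader
{
    // Assumed shape of the functor mapping for 'raw' functions.
    public static Func<R, B> Select<R, A, B>(
        this Func<R, A> source,
        Func<A, B> selector)
    {
        return environment => selector(source(environment));
    }

    public static Func<R, A> Flatten<R, A>(this Func<R, Func<R, A>> source)
    {
        return environment => source(environment)(environment);
    }

    public static Func<R, B> SelectMany<R, A, B>(
        this Func<R, A> source,
        Func<A, Func<R, B>> selector)
    {
        return source.Select(selector).Flatten();
    }
}

public static class SelectManyExample
{
    // Function versions of the F and G classes from the Associativity section.
    public static Func<int, string> F(char c) => env => new string(c, env);
    public static Func<int, bool> G(string s) => env => env < 42 || s.Contains("a");

    public static bool Run(char c, int environment)
    {
        Func<int, bool> composed = F(c).SelectMany(s => G(s));
        return composed(environment);
    }
}
```

SelectManyExample.Run('b', 1) is true because the environment is less than 42, Run('b', 50) is false, and Run('a', 50) is true because the generated string contains an 'a'.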

Query syntax #

Monads also enable query syntax in C# (just like they enable other kinds of syntactic sugar in languages like F# and Haskell). As outlined in the monad introduction, however, you must add a special SelectMany overload:

public static IReader<R, T1> SelectMany<R, T, U, T1>(
    this IReader<R, T> source,
    Func<T, IReader<R, U>> k,
    Func<T, U, T1> s)
{
    return source.SelectMany(x => k(x).Select(y => s(x, y)));
}

As already predicted in the monad introduction, this boilerplate overload is always implemented in the same way. Only the signature changes. With it, you could write an expression like this nonsense:

IReader<int, bool> r =
    from dur in new MinutesReader()
    from b in new Thingy(dur)
    select b;

Where MinutesReader was already shown in the article Reader as a contravariant functor. I couldn't come up with a good name for another reader, so I went with Dan North's naming convention that if you don't yet know what to call a class, method, or function, don't pretend that you know. Be explicit that you don't know.

Here it is, for the sake of completion:

public sealed class Thingy : IReader<int, bool>
{
    private readonly TimeSpan timeSpan;
 
    public Thingy(TimeSpan timeSpan)
    {
        this.timeSpan = timeSpan;
    }
 
    public bool Run(int environment)
    {
        return new TimeSpan(timeSpan.Ticks * environment).TotalDays < 1;
    }
}

I'm not claiming that this class makes sense. These articles are deliberately kept abstract in order to focus on structure and behaviour, rather than on practical application.
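It may help to see what the C# compiler does with such a query expression: it translates the two from clauses into a single call to the three-parameter SelectMany overload. The following sketch shows this with 'raw' functions instead of the IReader interface, in order to stay self-contained. The FuncQuery and QueryExample class names are hypothetical, and the minutes function assumes that MinutesReader converts its environment with TimeSpan.FromMinutes:

```csharp
using System;

public static class FuncQuery
{
    public static Func<R, B> Select<R, A, B>(
        this Func<R, A> source, Func<A, B> selector)
        => environment => selector(source(environment));

    public static Func<R, B> SelectMany<R, A, B>(
        this Func<R, A> source, Func<A, Func<R, B>> selector)
        => environment => selector(source(environment))(environment);

    // The boilerplate overload that enables query syntax.
    public static Func<R, T1> SelectMany<R, T, U, T1>(
        this Func<R, T> source, Func<T, Func<R, U>> k, Func<T, U, T1> s)
        => source.SelectMany(x => k(x).Select(y => s(x, y)));
}

public static class QueryExample
{
    // Assumption: stands in for MinutesReader.
    private static readonly Func<int, TimeSpan> minutes =
        env => TimeSpan.FromMinutes(env);

    // Function version of the Thingy class.
    private static Func<int, bool> Thingy(TimeSpan timeSpan) =>
        env => new TimeSpan(timeSpan.Ticks * env).TotalDays < 1;

    public static Func<int, bool> WithQuerySyntax() =>
        from dur in minutes
        from b in Thingy(dur)
        select b;

    // What the compiler translates the above query expression into.
    public static Func<int, bool> WithMethodCalls() =>
        minutes.SelectMany(dur => Thingy(dur), (dur, b) => b);
}
```

Both functions behave identically. With an environment of 37, for example, both return true (37 times 37 minutes is less than a day), while 38 makes both return false.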

Return #

Apart from flattening or monadic bind, a monad must also define a way to put a normal value into the monad. Conceptually, I call this function return (because that's the name that Haskell uses):

public static IReader<R, A> Return<R, A>(A a)
{
    return new ReturnReader<R, A>(a);
}
 
private class ReturnReader<R, A> : IReader<R, A>
{
    private readonly A a;
 
    public ReturnReader(A a)
    {
        this.a = a;
    }
 
    public A Run(R environment)
    {
        return a;
    }
}

This implementation returns the a value and completely ignores the environment. You can do the same with a 'naked' function.
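A function-based version could look like the following sketch, where the FuncReturn class name is a hypothetical placeholder:

```csharp
using System;

public static class FuncReturn
{
    // Wraps a value in a function that ignores its environment.
    public static Func<R, A> Return<R, A>(A a)
    {
        return _ => a;
    }
}
```

For example, FuncReturn.Return<Uri, string>("foo") is a function that returns "foo" regardless of which Uri you call it with.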

Left identity #

We need to identify the return function in order to examine the monad laws. Now that this is accomplished, let's see what the laws look like for the Reader monad, starting with the left identity law.

[Theory]
[InlineData(UriPartial.Authority, "https://example.com/f?o=o")]
[InlineData(UriPartial.Path, "https://example.net/b?a=r")]
[InlineData(UriPartial.Query, "https://example.org/b?a=z")]
[InlineData(UriPartial.Scheme, "https://example.gov/q?u=x")]
public void LeftIdentity(UriPartial a, string u)
{
    Func<UriPartial, IReader<Uri, UriPartial>> @return =
        up => Reader.Return<Uri, UriPartial>(up);
    Func<UriPartial, IReader<Uri, string>> h =
        up => new UriPartReader(up);
 
    Assert.Equal(
        @return(a).SelectMany(h).Run(new Uri(u)),
        h(a).Run(new Uri(u)));
}

In order to compare the two Reader values, the test has to Run them and then compare the return values.

This test and the next use a Reader implementation called UriPartReader, which almost makes sense:

public sealed class UriPartReader : IReader<Uri, string>
{
    private readonly UriPartial part;
 
    public UriPartReader(UriPartial part)
    {
        this.part = part;
    }
 
    public string Run(Uri environment)
    {
        return environment.GetLeftPart(part);
    }
}

Almost.

Right identity #

In a similar manner, we can showcase the right identity law as a test.

[Theory]
[InlineData(UriPartial.Authority, "https://example.com/q?u=ux")]
[InlineData(UriPartial.Path, "https://example.net/q?u=uuz")]
[InlineData(UriPartial.Query, "https://example.org/c?o=rge")]
[InlineData(UriPartial.Scheme, "https://example.gov/g?a=rply")]
public void RightIdentity(UriPartial a, string u)
{
    Func<UriPartial, IReader<Uri, string>> f =
        up => new UriPartReader(up);
    Func<string, IReader<Uri, string>> @return =
        s => Reader.Return<Uri, string>(s);
 
    IReader<Uri, string> m = f(a);
 
    Assert.Equal(
        m.SelectMany(@return).Run(new Uri(u)),
        m.Run(new Uri(u)));
}

As always, even a parametrised test constitutes no proof that the law holds. I show the tests to illustrate what the laws look like in 'real' code.

Associativity #

The last monad law is the associativity law that describes how (at least) three functions compose. We're going to need three functions. For the purpose of demonstrating the law, any three pure functions will do. While the following functions are silly and not at all 'realistic', they have the virtue of being as simple as they can be (while still providing a bit of variety). They don't 'mean' anything, so don't worry too much about their behaviour. It is, as far as I can tell, nonsensical.

public sealed class F : IReader<int, string>
{
    private readonly char c;
 
    public F(char c)
    {
        this.c = c;
    }
 
    public string Run(int environment)
    {
        return new string(c, environment);
    }
}
 
public sealed class G : IReader<int, bool>
{
    private readonly string s;
 
    public G(string s)
    {
        this.s = s;
    }
 
    public bool Run(int environment)
    {
        return environment < 42 || s.Contains("a");
    }
}
 
public sealed class H : IReader<int, TimeSpan>
{
    private readonly bool b;
 
    public H(bool b)
    {
        this.b = b;
    }
 
    public TimeSpan Run(int environment)
    {
        return b ?
            TimeSpan.FromMinutes(environment) :
            TimeSpan.FromSeconds(environment);
    }
}

Armed with these three classes, we can now demonstrate the Associativity law:

[Theory]
[InlineData('a', 0)]
[InlineData('b', 1)]
[InlineData('c', 42)]
[InlineData('d', 2112)]
public void Associativity(char a, int i)
{
    Func<char, IReader<int, string>> f = c => new F(c);
    Func<string, IReader<int, bool>> g = s => new G(s);
    Func<bool, IReader<int, TimeSpan>> h = b => new H(b);
 
    IReader<int, string> m = f(a);
 
    Assert.Equal(
        m.SelectMany(g).SelectMany(h).Run(i),
        m.SelectMany(x => g(x).SelectMany(h)).Run(i));
}

In case you're wondering, the four test cases produce the outputs 00:00:00, 00:01:00, 00:00:42, and 00:35:12. You can see that reproduced below:

Haskell #

In Haskell, normal functions a -> b are already Monad instances, which means that you can easily replicate the functions from the Associativity test:

> f c = \env -> replicate env c
> g s = \env -> env < 42 || 'a' `elem` s
> h b = \env -> if b then secondsToDiffTime (toEnum env * 60) else secondsToDiffTime (toEnum env)

I've chosen to write f, g, and h as functions that return lambda expressions in order to emphasise that each of these functions returns a Reader. Since Haskell functions are already curried, I could also have written them in the more normal function style with two normal parameters, but that might have obscured the Reader aspect of each.

Here's the composition in action:

> f 'a' >>= g >>= h $ 0
0s
> f 'b' >>= g >>= h $ 1
60s
> f 'c' >>= g >>= h $ 42
42s
> f 'd' >>= g >>= h $ 2112
2112s

In case you are wondering, 2,112 seconds is 35 minutes and 12 seconds, so all outputs fit with the results reported for the C# example.

What the above Haskell GHCi (REPL) session demonstrates is that it's possible to compose functions with Haskell's monadic bind operator >>= exactly because all functions are (Reader) monads.

Conclusion #

In Haskell, it can occasionally be useful that a function can be used when a Monad is required. Some Haskell libraries are defined in very general terms. Their APIs may enable you to call functions with any monadic input value. You can, say, pass a Maybe, a List, an Either, a State, but you can also pass a function.

C# and most other languages (F# included) don't come with that level of abstraction, so the fact that a function forms a monad is less useful there. In fact, I can't recall having made explicit use of this knowledge in C#, but one never knows if that day arrives.

In a similar vein, knowing that endomorphisms form monoids (and thereby also semigroups) enabled me to quickly identify the correct design for a validation problem.

Who knows? One day the knowledge that functions are monads may come in handy.

Next: The IO monad.

Applicative assertions

Monday, 07 November 2022 06:56:00 UTC

An exploration.

In a recent Twitter exchange, Lucas DiCioccio made an interesting observation:

"Imho the properties you want of an assertion-framework are really close (the same as?) applicative-validation: one assertion failure with multiple bullet points composed mainly from combinators."

Lucas DiCioccio

In another branch off my initial tweet Josh McKinney pointed out the short-circuiting nature of standard assertions:

"short circuiting often causes weaker error messages in failing tests than running compound assertions. E.g.
TransferTest {
  a.transfer(b,50);
  a.shouldEqual(50);
  b.shouldEqual(150); // never reached?
}"
Josh McK

Most standard assertion libraries work by throwing exceptions when an assertion fails. Once you throw an exception, remaining code doesn't execute. This means that you only get the first assertion message. Further assertions are not evaluated.

Josh McKinney later gave more details about a particular scenario. Although in the general case I don't consider the short-circuiting nature of assertions to be a problem, I grant that there are cases where proper assertion composition would be useful.

Lucas DiCioccio's suggestion seems worthy of investigation.

Ongoing exploration #

I asked Lucas DiCioccio whether he'd done any work with his idea, and the day after he replied with a Haskell proof of concept.

I found the idea so interesting that I also wanted to carry out a few proofs of concept myself, perhaps within a more realistic setting.

As I'm writing this, I've reached some preliminary conclusions, but I'm also aware that they may not hold in more general cases. I'm posting what I have so far, but you should expect this exploration to evolve over time. If I find out more, I'll update this post with more articles.

A preliminary summary is in order. Based on the first two articles, applicative assertions look like overkill. I think, however, that it's because of the degenerate nature of the example. Some assertions are essentially one-stop verifications: Evaluate a predicate, and throw an exception if the result is false. These assertions return unit or void. Examples from xUnit include Assert.Equal, Assert.True, Assert.False, Assert.All, and Assert.DoesNotContain.

These are the kinds of assertions that the initial two articles explore.

There are other kinds of assertions that return a value in case of success. xUnit.net examples include Assert.Throws, Assert.Single, Assert.IsAssignableFrom, and some overloads of Assert.Contains. Assert.Single, for example, verifies that a collection contains only a single element. While it throws an exception if the collection is either empty or has more than one element, in the success case it returns the single value. This can be useful if you want to add more assertions based on that value.
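As a rough illustration of that shape, consider the following sketch. SingleOrThrow is a hypothetical helper that only mimics the behaviour described above; it is not xUnit.net's implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ValueReturningAssertions
{
    // Hypothetical helper: throws unless the collection has exactly one
    // element, and returns that element in the success case.
    public static T SingleOrThrow<T>(IEnumerable<T> source)
    {
        var items = source.ToList();
        if (items.Count != 1)
            throw new Exception(
                $"Expected a single element, but found {items.Count}.");
        return items[0];
    }

    public static int Example()
    {
        var quantities = new[] { 4 };

        // The assertion returns the element...
        var quantity = SingleOrThrow(quantities);

        // ...so that a follow-up assertion can inspect it.
        if (quantity <= 0)
            throw new Exception(
                $"Expected a positive quantity, but was {quantity}.");

        return quantity;
    }
}
```

The second check depends on the value returned by the first. That dependency is what makes this kind of assertion harder to compose applicatively.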

I haven't experimented with this yet, but as far as I can tell, you'll run into the following problem: If you make such an assertion return an applicative functor, you'll need some way to handle the success case. Combining it with another assertion-producing function, such as a -> Asserted e b (pseudocode) is possible with functor mapping, but will leave you with a nested functor.

You'll probably want to flatten the nested functor, which is exactly what monads do. Monads, on the other hand, short circuit, so you don't want to make your applicative assertion type a monad. Instead, you'll need to use an isomorphic monad container (Either should do) to move in and out of. Doable, but is the complexity warranted?

I realise that the above musings are abstract, and that I really should show rather than tell. I'll add some more content here if I ever collect something worthy of an article. If you ask me now, though, I consider that a bit of a toss-up.

The first two examples also suffer from being written in C#, which doesn't have good syntactic support for applicative functors. Perhaps I'll add some articles that use F# or Haskell.

Conclusion #

There's the occasional need for composable assertions. You can achieve that with an applicative functor, but the question is whether it's warranted. Could you make something simpler based on the list monoid?

As I'm writing this, I don't consider that question settled. Even so, you may want to read on.

Next: An initial proof of concept of applicative assertions in C#.

Comments

Tyson Williams #

Monads, on the other hand, short circuit, so you don't want to make your applicative assertion type a monad.

I want my assertion type to be both applicative and monadic. So does Paul Louth, the creator of Language Ext, which is most clearly seen via this Validation test code. So does Alexis King (as you pointed out to me) in her Haskell Validation package, which violates Haskell's monad type class, and which she defends here.

When I want (or typically need) short-circuiting behavior, then I use the type's monadic API. When I want "error-collecting behavior", then I use the type's applicative API.

The first two examples also suffer from being written in C#, which doesn't have good syntactic support for applicative functors.

The best syntactic support for applicative functors in C# that I have seen is in Language Ext. A comment explains in that same Validation test how it works, and the line after the comment shows it in action.

2023-01-16 21:13 UTC

Mark Seemann #

Tyson, thank you for writing. Whether or not you want to enable monadic short-circuiting for assertions or validations depends, I think, on 'developer ergonomics'. It's a trade-off mainly between ease and simplicity as outlined by Rich Hickey. Enabling a monadic API for something that isn't naturally monadic does indeed provide ease of use, in that the compositional capabilities of a monad are readily 'at hand'.

If you don't have that capability you'll have to map back and forth between, say, Validation and Either (if using the validation package). This is tedious, but explicit.

Making validation or assertions monadic makes it easier to compose nested values, but also (in my experience) makes it easier to make mistakes, in the sense that you (or a colleague) may think that the behaviour is error-collecting, whereas in reality it's short-circuiting.

In the end, the trade-off may reduce to how much you trust yourself (and co-workers) to steer clear of mistakes, and how important it is to avoid errors. In this case, how important is it to collect the errors, rather than short-circuiting?

You can choose one alternative or the other by weighing such concerns.

2023-01-19 8:30 UTC

A regular grid emerges

Monday, 31 October 2022 06:44:00 UTC

The code behind a lecture animation.

If you've seen my presentation Fractal Architecture, you may have wondered how I made the animation where a regular(ish) hexagonal grid emerges from adding more and more blobs to an area.

A grid-like structure starting to emerge from tightly packing blobs.

Like a few previous blog posts, today's article appears on Observable, which is where the animation and the code that creates it live. Go there to read it.

If you have time, watch the animation evolve. Personally I find it quite mesmerising.

Encapsulation in Functional Programming

Monday, 24 October 2022 05:54:00 UTC

Encapsulation is only relevant for object-oriented programming, right?

The concept of encapsulation is closely related to object-oriented programming (OOP), and you rarely hear the word in discussions about (statically-typed) functional programming (FP). I will argue, however, that the notion is relevant in FP as well. Typically, it just appears with a different catchphrase.

Contracts #

I base my understanding of encapsulation on Object-Oriented Software Construction. I've tried to distil it in my Pluralsight course Encapsulation and SOLID.

In short, encapsulation denotes the distinction between an object's contract and its implementation. An object should fulfil its contract in such a way that client code doesn't need to know about its implementation.

Contracts, according to Bertrand Meyer, describe three properties of objects:

Preconditions: What client code must fulfil in order to successfully interact with the object.
Invariants: Statements about the object that are always true.
Postconditions: Statements that are guaranteed to be true after a successful interaction between client code and object.

You can replace object with value and I'd argue that the same concerns are relevant in FP.

In OOP invariants often point to the properties of an object that are guaranteed to remain even in the face of state mutation. As you change the state of an object, the object should guarantee that its state remains valid. These are the properties (i.e. qualities, traits, attributes) that don't vary - i.e. are invariant.

An example would be helpful around here.

Table mutation #

Consider an object that models a table in a restaurant. You may, for example, be working on the Maître d' kata. In short, you may decide to model a table as being one of two kinds: Standard tables and communal tables. You can reserve seats at communal tables, but you still share the table with other people.

You may decide to model the problem in such a way that when you reserve the table, you change the state of the object. You may decide to describe the contract of Table objects like this:

Preconditions
- To create a Table object, you must supply a type (standard or communal).
- To create a Table object, you must supply the size of the table, which is a measure of its capacity; i.e. how many people can sit at it.
- The capacity must be a natural number. One (1) is the smallest valid capacity.
- When reserving a table, you must supply a valid reservation.
- When reserving a table, the reservation quantity must be less than or equal to the table's remaining capacity.
Invariants
- The table capacity doesn't change.
- The table type doesn't change.
- The number of remaining seats is never negative.
- The number of remaining seats is never greater than the table's capacity.
Postconditions
- After reserving a table, the number of remaining seats can't be greater than the previous number of remaining seats minus the reservation quantity.

This list may be incomplete, and if you add more operations, you may have to elaborate on what that means to the contract.

In C# you may implement a Table class like this:

public sealed class Table
{
    private readonly List<Reservation> reservations;
 
    public Table(int capacity, TableType type)
    {
        if (capacity < 1)
            throw new ArgumentOutOfRangeException(
                nameof(capacity),
                $"Capacity must be greater than zero, but was: {capacity}.");
 
        reservations = new List<Reservation>();
        Capacity = capacity;
        Type = type;
        RemainingSeats = capacity;
    }
 
    public int Capacity { get; }
    public TableType Type { get; }
    public int RemainingSeats { get; private set; }
 
    public void Reserve(Reservation reservation)
    {
        if (RemainingSeats < reservation.Quantity)
            throw new InvalidOperationException(
                "The table has no remaining seats.");
 
        if (Type == TableType.Communal)
            RemainingSeats -= reservation.Quantity;
        else
            RemainingSeats = 0;
 
        reservations.Add(reservation);
    }
}

This class has good encapsulation because it makes sure to fulfil the contract. You can't put it in an invalid state.

Immutable Table #

Notice that two of the invariants for the above Table class are that the table can't change type or capacity. While OOP often revolves around state mutation, it seems reasonable that some data is immutable. A table doesn't all of a sudden change size.

In FP data is immutable. Data doesn't change. Thus, data has that invariant property.

If you consider the above contract, it still applies to FP. The specifics change, though. You'll no longer be dealing with Table objects, but rather Table data, and to make reservations, you call a function that returns a new Table value.

In F# you could model a Table like this:

type Table = private Standard of int * Reservation list | Communal of int * Reservation list
 
module Table =
    let standard capacity =
        if 0 < capacity
        then Some (Standard (capacity, []))
        else None
 
    let communal capacity =
        if 0 < capacity
        then Some (Communal (capacity, []))
        else None
        
    let remainingSeats = function
        | Standard (capacity, []) -> capacity
        | Standard _ -> 0
        | Communal (capacity, rs) -> capacity - List.sumBy (fun r -> r.Quantity) rs
 
    let reserve r t =
        match t with
        | Standard (capacity, []) when r.Quantity <= remainingSeats t ->
            Some (Standard (capacity, [r]))
        | Communal (capacity, rs) when r.Quantity <= remainingSeats t ->
            Some (Communal (capacity, r :: rs))
        | _ -> None

While you'll often hear fsharpers say that one should make illegal states unrepresentable, in practice you often have to rely on predicative data to enforce contracts. I've done this here by making the Table cases private. Code outside the module can't directly create Table data. Instead, it'll have to use one of two functions: Table.standard or Table.communal. These are functions that return Table option values.

That's the idiomatic way to model predicative data in statically typed FP. In Haskell such functions are called smart constructors.

Statically typed FP typically uses Maybe (Option) or Either (Result) values to communicate failure, rather than throwing exceptions, but apart from that a smart constructor is just an object constructor.

The above F# Table API implements the same contract as the OOP version.

If you want to see a more elaborate example of modelling table and reservations in F#, see An F# implementation of the Maître d' kata.

Functional contracts in OOP languages #

You can adopt many FP concepts in OOP languages. My book Code That Fits in Your Head contains sample code in C# that implements an online restaurant reservation system. It includes a Table class that, at first glance, looks like the above C# class.

While it has the same contract, the book's Table class is implemented with the FP design principles in mind. Thus, it's an immutable class with this API:

public sealed class Table
{
    public static Table Standard(int seats)
 
    public static Table Communal(int seats)
 
    public int Capacity { get; }
 
    public int RemainingSeats { get; }
  
    public Table Reserve(Reservation reservation)
 
    public T Accept<T>(ITableVisitor<T> visitor)
 
    public override bool Equals(object? obj)
 
    public override int GetHashCode()
}

Notice that the Reserve method returns a Table object. That's the table with the reservation associated. The original Table instance remains unchanged.
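To show what that can look like, here's a minimal sketch of an immutable table. It is not the book's implementation: TableSketch and the one-property Reservation record are hypothetical, simplified stand-ins, and I assume that an invalid reservation throws, as in the mutable class above.

```csharp
using System;

// Simplified stand-in; the real Reservation carries more data.
public sealed record Reservation(int Quantity);

// Hypothetical sketch of an immutable table; not the book's Table class.
public sealed class TableSketch
{
    private readonly bool isCommunal;

    private TableSketch(bool isCommunal, int capacity, int remainingSeats)
    {
        this.isCommunal = isCommunal;
        Capacity = capacity;
        RemainingSeats = remainingSeats;
    }

    public static TableSketch Standard(int seats) => Create(false, seats);

    public static TableSketch Communal(int seats) => Create(true, seats);

    // Precondition: the capacity must be a natural number.
    private static TableSketch Create(bool isCommunal, int seats)
    {
        if (seats < 1)
            throw new ArgumentOutOfRangeException(
                nameof(seats),
                $"Capacity must be greater than zero, but was: {seats}.");
        return new TableSketch(isCommunal, seats, seats);
    }

    public int Capacity { get; }

    public int RemainingSeats { get; }

    // Returns a new value; the instance it's called on is never modified.
    public TableSketch Reserve(Reservation reservation)
    {
        if (reservation is null)
            throw new ArgumentNullException(nameof(reservation));
        if (RemainingSeats < reservation.Quantity)
            throw new InvalidOperationException(
                "The table has too few remaining seats.");

        // A standard table is fully taken by any reservation.
        var remaining = isCommunal
            ? RemainingSeats - reservation.Quantity
            : 0;
        return new TableSketch(isCommunal, Capacity, remaining);
    }
}
```

If you call TableSketch.Communal(10).Reserve(new Reservation(4)), the returned value has six remaining seats, while the table you started with still has ten.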

The entire book is written in the Functional Core, Imperative Shell architecture, so all domain models are immutable objects with pure functions as methods.

The objects still have contracts. They have proper encapsulation.

Conclusion #

Functional programmers may not use the term encapsulation much, but that doesn't mean that they don't share that kind of concern. They often throw around the phrase make illegal states unrepresentable or talk about smart constructors or partial versus total functions. It's clear that they care about data modelling that prevents mistakes.

The object-oriented notion of encapsulation is ultimately about separating the affordances of an API from its implementation details. An object's contract is an abstract description of the properties (i.e. qualities, traits, or attributes) of the object.

Functional programmers care so much about the properties of data and functions that property-based testing is often the preferred way to perform automated testing.

Perhaps you can find a functional programmer who might be slightly offended if you suggest that he or she should consider encapsulation. If so, suggest instead that he or she considers the properties of functions and data.

Comments

Atif Aziz #

I wonder what's the goal of illustrating OOP-ish examples exclusively in C# and FP-ish ones in F# when you could stick to just one language for the reader. It might not always be as effective depending on the topic, but for encapsulation and the examples shown in this article, a C# version would read just as effectively as an F# one. I mean when you get round to making your points in the Immutable Table section of your article, you could demonstrate the ideas with a C# version that's nearly identical to and reads as succinctly as the F# version:

#nullable enable
 
readonly record struct Reservation(int Quantity);
abstract record Table;
record StandardTable(int Capacity, Reservation? Reservation): Table;
record CommunalTable(int Capacity, ImmutableArray<Reservation> Reservations): Table;
 
static class TableModule
{
    public static StandardTable? Standard(int capacity) =>
        0 < capacity ? new StandardTable(capacity, null) : null;
 
    public static CommunalTable? Communal(int capacity) =>
        0 < capacity ? new CommunalTable(capacity, ImmutableArray<Reservation>.Empty) : null;
 
    public static int RemainingSeats(this Table table) => table switch
    {
        StandardTable { Reservation: null } t => t.Capacity,
        StandardTable => 0,
        CommunalTable t => t.Capacity - t.Reservations.Sum(r => r.Quantity)
    };
 
    public static Table? Reserve(this Table table, Reservation r) => table switch
    {
        StandardTable t when r.Quantity <= t.RemainingSeats() => t with { Reservation = r },
        CommunalTable t when r.Quantity <= t.RemainingSeats() => t with { Reservations = t.Reservations.Add(r) },
        _ => null,
    };
}

This way, I can just point someone to your article for enlightenment, 😉 but not leave them feeling frustrated that they need F# to (practice and) model around data instead of state mutating objects. It might still be worthwhile to show an F# version to draw the similarities and also call out some differences; like Table being a true discriminated union in F#, and while it appears to be emulated in C#, they desugar to the same thing in terms of CLR types and hierarchies.

By the way, in the C# example above, I modeled the standard table variant differently because if it can hold only one reservation at a time then the model should reflect that.

2022-10-27 16:09 UTC

Mark Seemann #

Atif, thank you for supplying an example of an immutable C# implementation.

I already have an example of an immutable, functional C# implementation in Code That Fits in Your Head, so I wanted to supply something else here. I also tend to find it interesting to compare how to model similar ideas in different languages, and it felt natural to supply an F# example to show how a 'natural' FP implementation might look.

Your point is valid, though, so I'm not insisting that this was the right decision.

2022-10-28 8:50 UTC

Sebastian Frelle Koch #

I took your idea, Atif, and wrote something that I think is more congruent with the example here. In short, I’m

using polymorphism to avoid having to switch over the Table type
hiding subtypes of Table to simplify the interface.

Here's the code:

#nullable enable

using System.Collections.Immutable;

readonly record struct Reservation(int Quantity);
abstract record Table
{
    public abstract Table? Reserve(Reservation r);
    public abstract int RemainingSeats();

    public static Table? Standard(int capacity) => 
        capacity > 0 ? new StandardTable(capacity, null) : null;

    public static Table? Communal(int capacity) => 
        capacity > 0 ? new CommunalTable(
            capacity,
            ImmutableArray<Reservation>.Empty) : null;

    private record StandardTable(int Capacity, Reservation? Reservation) : Table
    {
        public override Table? Reserve(Reservation r) => RemainingSeats() switch
        {
            var seats when seats >= r.Quantity => this with { Reservation = r },
            _ => null,
        };

        public override int RemainingSeats() => Reservation switch
        {
            null => Capacity,
            _ => 0,
        };
    }

    private record CommunalTable(
        int Capacity, 
        ImmutableArray<Reservation> Reservations) : Table
    {
        public override Table? Reserve(Reservation r) => RemainingSeats() switch
        {
            var seats when seats >= r.Quantity =>
                this with { Reservations = Reservations.Add(r) },
            _ => null,
        };

        public override int RemainingSeats() => 
            Capacity - Reservations.Sum(r => r.Quantity);
    }
}
    

I’d love to hear your thoughts on this approach. I think that one of its weaknesses is that calls to Table.Standard() and Table.Communal() will yield two instances of Table that can never be equal. For instance, Table.Standard(4) != Table.Communal(4), even though they’re both of type Table? and have the same number of seats.

Calling GetType() on each of the instances reveals that their types are actually Table+StandardTable and Table+CommunalTable respectively; however, this isn't transparent to callers. Another solution might be to expose the Table subtypes and give them private constructors – I just like the simplicity of not exposing the individual types of tables the same way you’re doing here, Mark.

2022-11-29 11:28 UTC

Alexandre Murari Jr #

Mark,

How do you differentiate encapsulation from abstraction?

Here's an excerpt from your book Dependency Injection: Principles, Practices, and Patterns.

Section: 1.3 - What to inject and what not to inject Subsection: 1.3.1 - Stable Dependencies

"Other examples [of libraries that do not require to be injected] may include specialized libraries that encapsulate algorithms relevant to your application".

In that section, you and Steven were giving examples of stable dependencies that do not require to be injected to keep modularity. You define a library that "encapsulates an algorithm" as an example.

Now, to me, encapsulation is "protecting data integrity", plain and simple. A class is encapsulated as long as it's impossible or nearly impossible to bring it to an invalid or inconsistent state.

Protection of invariants, implementation hiding, bundling data and operations together, pre- and postconditions, Postel's Law all come into play to achieve this goal.

Thus, a class, to be "encapsulatable", has to have a state that can be initialized and/or modified by the client code.

Now I ask: most of the time when we say that something is encapsulating another, don't we really mean abstracting?

Why is it relevant to know that the hypothetical algorithm library protects its invariants by using the term "encapsulate"?

Abstraction, under the light of Robert C. Martin's definition of it, makes much more sense in that context: "a specialized library that abstracts algorithms relevant to your application". It amplifies the essential (by providing a clear API), but eliminates the irrelevant (by hiding the algorithm's implementation details).

Granted, there is some overlap between encapsulation and abstraction, especially when you bundle data and operations together (rich domain models), but they are not the same thing; you just use one to achieve the other sometimes.

Would it be correct to say that the .NET Framework encapsulates math algorithms in the System.Math class? Is there any state there to be preserved? They're all static methods and constants. On the other hand, they're surely eliminating some pretty irrelevant (from a consumer POV) trigonometric algorithms.

Thanks.

2022-12-04 02:35 UTC

Mark Seemann #

Alexandre, thank you for writing. How do I distinguish between abstraction and encapsulation?

There's much overlap, to be sure.

As I write, my view on encapsulation is influenced by Bertrand Meyer's notion of contract. Likewise, I do use Robert C. Martin's notion of amplifying the essentials while hiding the irrelevant details as a guiding light when discussing abstraction.

While these concepts may seem synonymous, they're not quite the same. I can't say that I've spent too much time considering how these two words relate, but shooting from the hip I think that abstraction is a wider concept.

You don't need to read much of Robert C. Martin before he'll tell you that the Dependency Inversion Principle is an important part of abstraction:

"Abstractions should not depend on details. Details should depend on abstractions."

Robert C. Martin, Agile Principles, Patterns, and Practices in C#

It's possible to implement a code base where this isn't true, even if classes have good encapsulation. You could imagine a domain model that depends on database details like a particular ORM. I've seen plenty of those in my career, although I grant that most of them have had poor encapsulation as well. It is not, however, impossible to imagine such a system with good encapsulation, but suboptimal abstraction.

Does it go the other way as well? Can we have good abstraction, but poor encapsulation?

An example doesn't come immediately to mind, but as I wrote, it's not an ontology that I've given much thought.

2022-12-06 22:11 UTC

Stubs and mocks break encapsulation

Monday, 17 October 2022 08:47:00 UTC

Favour Fakes over dynamic mocks.

For a while now, I've favoured Fakes over Stubs and Mocks. Using Fake Objects over other Test Doubles makes test suites more robust. I wrote the code base for my book Code That Fits in Your Head entirely with Fakes and the occasional Test Spy, and I rarely had to fix broken tests. No Moq, FakeItEasy, NSubstitute, nor Rhino Mocks. Just hand-written Test Doubles.

It recently occurred to me that a way to explain the problem with Mocks and Stubs is that they break encapsulation.

You'll see some examples soon, but first it's important to be explicit about terminology.

Terminology #

Words like Mocks, Stubs, as well as encapsulation, have different meanings to different people. They've fallen victim to semantic diffusion, if ever they were well-defined to begin with.

When I use the words Test Double, Fake, Mock, and Stub, I use them as they are defined in xUnit Test Patterns. I usually try to avoid the terms Mock and Stub since people use them vaguely and inconsistently. The terms Test Double and Fake fare better.

We do need, however, a name for those libraries that generate Test Doubles on the fly. In .NET, they are libraries like Moq, FakeItEasy, and so on, as listed above. Java has Mockito, EasyMock, JMockit, and possibly more like that.

What do we call such libraries? Most people call them mock libraries or dynamic mock libraries. Perhaps dynamic Test Double library would be more consistent with the xUnit Test Patterns vocabulary, but nobody calls them that. I'll call them dynamic mock libraries to at least emphasise the dynamic, on-the-fly object generation these libraries typically use.

Finally, it's important to define encapsulation. This is another concept where people may use the same word and yet mean different things.

I base my understanding of encapsulation on Object-Oriented Software Construction. I've tried to distil it in my Pluralsight course Encapsulation and SOLID.

Contracts, according to Meyer, describe three properties of objects:

Preconditions: What client code must fulfil in order to successfully interact with the object.
Invariants: Statements about the object that are always true.
Postconditions: Statements that are guaranteed to be true after a successful interaction between client code and object.

As I'll demonstrate in this article, objects generated by dynamic mock libraries often break their contracts.

Create-and-read round-trip #

Consider the IReservationsRepository interface from Code That Fits in Your Head:

public interface IReservationsRepository
{
    Task Create(int restaurantId, Reservation reservation);
 
    Task<IReadOnlyCollection<Reservation>> ReadReservations(
        int restaurantId, DateTime min, DateTime max);
 
    Task<Reservation?> ReadReservation(int restaurantId, Guid id);
 
    Task Update(int restaurantId, Reservation reservation);
 
    Task Delete(int restaurantId, Guid id);
}

I already discussed some of the contract properties of this interface in an earlier article. Here, I want to highlight a certain interaction.

What is the contract of the Create method?

There are a few preconditions:

The client must have a properly initialised IReservationsRepository object.
The client must have a valid restaurantId.
The client must have a valid reservation.

A client that fulfils these preconditions can successfully call and await the Create method. What are the invariants and postconditions?

I'll skip the invariants because they aren't relevant to the line of reasoning that I'm pursuing. One postcondition, however, is that the reservation passed to Create must now be 'in' the repository.

How does that manifest as part of the object's contract?

This implies that a client should be able to retrieve the reservation, either with ReadReservation or ReadReservations. This suggests a kind of property that Scott Wlaschin calls There and back again.

Picking ReadReservation for the verification step we now have a property: If client code successfully calls and awaits Create it should be able to use ReadReservation to retrieve the reservation it just saved. That's implied by the IReservationsRepository contract.

SQL implementation #

The 'real' implementation of IReservationsRepository used in production is an implementation that stores reservations in SQL Server. This class should obey the contract.

While it might be possible to write a true property-based test, running hundreds of randomly generated test cases against a real database is going to take time. Instead, I chose to only write a parametrised test:

[Theory]
[InlineData(Grandfather.Id, "2022-06-29 12:00", "e@example.gov", "Enigma", 1)]
[InlineData(Grandfather.Id, "2022-07-27 11:40", "c@example.com", "Carlie", 2)]
[InlineData(2, "2021-09-03 14:32", "bon@example.edu", "Jovi", 4)]
public async Task CreateAndReadRoundTrip(
    int restaurantId,
    string at,
    string email,
    string name,
    int quantity)
{
    var expected = new Reservation(
        Guid.NewGuid(),
        DateTime.Parse(at, CultureInfo.InvariantCulture),
        new Email(email),
        new Name(name),
        quantity);
    var connectionString = ConnectionStrings.Reservations;
    var sut = new SqlReservationsRepository(connectionString);
 
    await sut.Create(restaurantId, expected);
    var actual = await sut.ReadReservation(restaurantId, expected.Id);
 
    Assert.Equal(expected, actual);
}

The part that we care about is the last three lines:

await sut.Create(restaurantId, expected);
var actual = await sut.ReadReservation(restaurantId, expected.Id);
 
Assert.Equal(expected, actual);

First call Create and subsequently ReadReservation. The value created should equal the value retrieved, which is also the case. All tests pass.

Fake #

The Fake implementation is effectively an in-memory database, so we expect it to also fulfil the same contract. We can test it with an almost identical test:

[Theory]
[InlineData(RestApi.Grandfather.Id, "2022-06-29 12:00", "e@example.gov", "Enigma", 1)]
[InlineData(RestApi.Grandfather.Id, "2022-07-27 11:40", "c@example.com", "Carlie", 2)]
[InlineData(2, "2021-09-03 14:32", "bon@example.edu", "Jovi", 4)]
public async Task CreateAndReadRoundTrip(
    int restaurantId,
    string at,
    string email,
    string name,
    int quantity)
{
    var expected = new Reservation(
        Guid.NewGuid(),
        DateTime.Parse(at, CultureInfo.InvariantCulture),
        new Email(email),
        new Name(name),
        quantity);
    var sut = new FakeDatabase();
 
    await sut.Create(restaurantId, expected);
    var actual = await sut.ReadReservation(restaurantId, expected.Id);
 
    Assert.Equal(expected, actual);
}

The only difference is that the sut is a different class instance. These test cases also all pass.

How is FakeDatabase implemented? That's not important, because it obeys the contract. FakeDatabase has good encapsulation, which makes it possible to use it without knowing anything about its internal implementation details. That, after all, is the point of encapsulation.
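Even so, if you'd like a feel for how little such a Fake requires, here's a minimal sketch. It is not the book's FakeDatabase: InMemoryReservationsRepository is a hypothetical name, and the Reservation record is a simplified stand-in that uses plain strings where the tests above use Email and Name objects.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Simplified stand-in for the Reservation type used in the tests above.
public sealed record Reservation(
    Guid Id, DateTime At, string Email, string Name, int Quantity);

public interface IReservationsRepository
{
    Task Create(int restaurantId, Reservation reservation);

    Task<IReadOnlyCollection<Reservation>> ReadReservations(
        int restaurantId, DateTime min, DateTime max);

    Task<Reservation?> ReadReservation(int restaurantId, Guid id);

    Task Update(int restaurantId, Reservation reservation);

    Task Delete(int restaurantId, Guid id);
}

// Hypothetical in-memory Fake; not the book's FakeDatabase.
public sealed class InMemoryReservationsRepository : IReservationsRepository
{
    private readonly Dictionary<(int RestaurantId, Guid Id), Reservation> store =
        new();

    public Task Create(int restaurantId, Reservation reservation)
    {
        store[(restaurantId, reservation.Id)] = reservation;
        return Task.CompletedTask;
    }

    public Task<IReadOnlyCollection<Reservation>> ReadReservations(
        int restaurantId, DateTime min, DateTime max)
    {
        IReadOnlyCollection<Reservation> result = store
            .Where(kvp => kvp.Key.RestaurantId == restaurantId)
            .Select(kvp => kvp.Value)
            .Where(r => min <= r.At && r.At <= max)
            .ToList();
        return Task.FromResult(result);
    }

    // Returns null unless the reservation was previously created.
    public Task<Reservation?> ReadReservation(int restaurantId, Guid id)
    {
        store.TryGetValue((restaurantId, id), out var reservation);
        return Task.FromResult<Reservation?>(reservation);
    }

    public Task Update(int restaurantId, Reservation reservation)
    {
        store[(restaurantId, reservation.Id)] = reservation;
        return Task.CompletedTask;
    }

    public Task Delete(int restaurantId, Guid id)
    {
        store.Remove((restaurantId, id));
        return Task.CompletedTask;
    }
}
```

Because the sketch actually stores what Create receives, ReadReservation returns that reservation afterwards, and returns null for an ID that was never created.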

Dynamic mock #

How does a dynamic mock fare if subjected to the same test? Let's try with Moq 4.18.2 (and I'm not choosing Moq to single it out - I chose Moq because it's the dynamic mock library I used to love the most):

[Theory]
[InlineData(RestApi.Grandfather.Id, "2022-06-29 12:00", "e@example.gov", "Enigma", 1)]
[InlineData(RestApi.Grandfather.Id, "2022-07-27 11:40", "c@example.com", "Carlie", 2)]
[InlineData(2, "2021-09-03 14:32", "bon@example.edu", "Jovi", 4)]
public async Task CreateAndReadRoundTrip(
    int restaurantId,
    string at,
    string email,
    string name,
    int quantity)
{
    var expected = new Reservation(
        Guid.NewGuid(),
        DateTime.Parse(at, CultureInfo.InvariantCulture),
        new Email(email),
        new Name(name),
        quantity);
    var sut = new Mock<IReservationsRepository>().Object;
 
    await sut.Create(restaurantId, expected);
    var actual = await sut.ReadReservation(restaurantId, expected.Id);
 
    Assert.Equal(expected, actual);
}

If you've worked a little with dynamic mock libraries, you will not be surprised to learn that all three tests fail. Here's one of the failure messages:

Ploeh.Samples.Restaurants.RestApi.Tests.MoqRepositoryTests.CreateAndReadRoundTrip(↩
    restaurantId: 1, at: "2022-06-29 12:00", email: "e@example.gov", name: "Enigma", quantity: 1)
 Source: MoqRepositoryTests.cs line 17
 Duration: 1 ms
 
Message: 
  Assert.Equal() Failure
  Expected: Reservation↩
            {↩
              At = 2022-06-29T12:00:00.0000000,↩
              Email = e@example.gov,↩
              Id = c9de4f95-3255-4e1f-a1d6-63591b58ff0c,↩
              Name = Enigma,↩
              Quantity = 1↩
            }
  Actual:   (null)
 
Stack Trace: 
  MoqRepositoryTests.CreateAndReadRoundTrip(↩
    Int32 restaurantId, String at, String email, String name, Int32 quantity) line 35
  --- End of stack trace from previous location where exception was thrown ---

(I've introduced line breaks and indicated them with the ↩ symbol to make the output more readable. I'll do that again later in the article.)

Not surprisingly, the return value of ReadReservation is null. You typically have to configure a dynamic mock in order to give it any sort of behaviour, and I didn't do that here. In that case, the dynamic mock returns the default value for the return type, which in this case is, correctly, null.

You may object that the above example is unfair. How can a dynamic mock know what to do? You have to configure it. That's the whole point of it.

Retrieval without creation #

Okay, let's set up the dynamic mock:

var dm = new Mock<IReservationsRepository>();
dm.Setup(r => r.ReadReservation(restaurantId, expected.Id)).ReturnsAsync(expected);
var sut = dm.Object;

These are the only lines I've changed from the previous listing of the test, which now passes.

A common criticism of dynamic-mock-heavy tests is that they mostly 'just test the mocks', and this is exactly what happens here.

You can make that more explicit by deleting the Create method call:

var dm = new Mock<IReservationsRepository>();
dm.Setup(r => r.ReadReservation(restaurantId, expected.Id)).ReturnsAsync(expected);
var sut = dm.Object;
 
var actual = await sut.ReadReservation(restaurantId, expected.Id);
 
Assert.Equal(expected, actual);

The test still passes. Clearly it only tests the dynamic mock.

You may, again, demur that this is expected, and it doesn't demonstrate that dynamic mocks break encapsulation. Keep in mind, however, the nature of the contract: Upon successful completion of Create, the reservation is 'in' the repository and can later be retrieved, either with ReadReservation or ReadReservations.

This variation of the test no longer calls Create, yet ReadReservation still returns the expected value.

Do SqlReservationsRepository or FakeDatabase behave like that? No, they don't.

Try to delete the Create call from the test that exercises SqlReservationsRepository:

var sut = new SqlReservationsRepository(connectionString);
 
var actual = await sut.ReadReservation(restaurantId, expected.Id);
 
Assert.Equal(expected, actual);

Unsurprisingly, the test now fails because actual is null. The same happens if you delete the Create call from the test that exercises FakeDatabase:

var sut = new FakeDatabase();
 
var actual = await sut.ReadReservation(restaurantId, expected.Id);
 
Assert.Equal(expected, actual);

Again, the assertion fails because actual is null.

The classes SqlReservationsRepository and FakeDatabase behave according to contract, while the dynamic mock doesn't.
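
To make the comparison concrete, here's a minimal sketch of the round-trip contract expressed as a check that works against any implementation. All types are simplified, hypothetical stand-ins rather than the ones from the book's code base: Reservation is reduced to three fields, InMemoryRepository plays the role of a Fake, and DefaultValueRepository is a hand-written imitation of an unconfigured dynamic mock (it doesn't use Moq).

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Simplified, hypothetical stand-ins for the article's types.
public sealed record Reservation(Guid Id, DateTime At, int Quantity);

public interface IReservationsRepository
{
    Task Create(int restaurantId, Reservation reservation);
    Task<Reservation?> ReadReservation(int restaurantId, Guid id);
}

// A Fake: simplified behaviour (nothing is persisted), but it honours the contract.
public sealed class InMemoryRepository : IReservationsRepository
{
    private readonly Dictionary<(int, Guid), Reservation> store = new();

    public Task Create(int restaurantId, Reservation reservation)
    {
        store[(restaurantId, reservation.Id)] = reservation;
        return Task.CompletedTask;
    }

    public Task<Reservation?> ReadReservation(int restaurantId, Guid id)
    {
        // found is null when the key is absent.
        store.TryGetValue((restaurantId, id), out var found);
        return Task.FromResult(found);
    }
}

// Imitates an unconfigured dynamic mock: every member returns a default
// value, regardless of what was called earlier.
public sealed class DefaultValueRepository : IReservationsRepository
{
    public Task Create(int restaurantId, Reservation reservation) =>
        Task.CompletedTask;

    public Task<Reservation?> ReadReservation(int restaurantId, Guid id) =>
        Task.FromResult<Reservation?>(null);
}

public static class RepositoryContract
{
    // True if a reservation that was created can subsequently be read back.
    public static async Task<bool> RoundTripHolds(IReservationsRepository sut)
    {
        var expected = new Reservation(
            Guid.NewGuid(), new DateTime(2022, 6, 29, 12, 0, 0), 1);
        await sut.Create(1, expected);
        var actual = await sut.ReadReservation(1, expected.Id);
        return expected.Equals(actual);
    }
}
```

In this sketch, RoundTripHolds completes with true for InMemoryRepository and with false for DefaultValueRepository. The check itself never changes; only the object passed as sut does.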

Alternative retrieval #

There's another way in which the dynamic mock breaks encapsulation. Recall what the contract states: Upon successful completion of Create, the reservation is 'in' the repository and can later be retrieved, either with ReadReservation or ReadReservations.

In other words, it should be possible to change the interaction from Create followed by ReadReservation to Create followed by ReadReservations.

First, try it with SqlReservationsRepository:

await sut.Create(restaurantId, expected);
var min = expected.At.Date;
var max = min.AddDays(1);
var actual = await sut.ReadReservations(restaurantId, min, max);
 
Assert.Contains(expected, actual);

The test still passes, as expected.

Second, try the same change with FakeDatabase:

await sut.Create(restaurantId, expected);
var min = expected.At.Date;
var max = min.AddDays(1);
var actual = await sut.ReadReservations(restaurantId, min, max);
 
Assert.Contains(expected, actual);

Notice that this is the exact same code as in the SqlReservationsRepository test. That test also passes, as expected.

Third, try it with the dynamic mock:

await sut.Create(restaurantId, expected);
var min = expected.At.Date;
var max = min.AddDays(1);
var actual = await sut.ReadReservations(restaurantId, min, max);
 
Assert.Contains(expected, actual);

Same code, different sut, and the test fails. The dynamic mock breaks encapsulation. You'll have to go and fix the Setup of it to make the test pass again. That's not the case with SqlReservationsRepository or FakeDatabase.

Dynamic mocks break the SUT, not the tests #

Perhaps you're still not convinced that this is of practical interest. After all, Bertrand Meyer had limited success getting mainstream adoption of his thought on contract-based programming.

That dynamic mocks break encapsulation does, however, have real implications.

What if, instead of using FakeDatabase, I'd used dynamic mocks when testing my online restaurant reservation system? A test might have looked like this:

[Theory]
[InlineData(1049, 19, 00, "juliad@example.net", "Julia Domna", 5)]
[InlineData(1130, 18, 15, "x@example.com", "Xenia Ng", 9)]
[InlineData( 956, 16, 55, "kite@example.edu", null, 2)]
[InlineData( 433, 17, 30, "shli@example.org", "Shanghai Li", 5)]
public async Task PostValidReservationWhenDatabaseIsEmpty(
    int days,
    int hours,
    int minutes,
    string email,
    string name,
    int quantity)
{
    var at = DateTime.Now.Date + new TimeSpan(days, hours, minutes, 0);
    var dm = new Mock<IReservationsRepository>();
    dm.Setup(r => r.ReadReservations(Grandfather.Id, at.Date, at.Date.AddDays(1).AddTicks(-1)))
        .ReturnsAsync(Array.Empty<Reservation>());
    var sut = new ReservationsController(
        new SystemClock(),
        new InMemoryRestaurantDatabase(Grandfather.Restaurant),
        dm.Object);
    var expected = new Reservation(
        new Guid("B50DF5B1-F484-4D99-88F9-1915087AF568"),
        at,
        new Email(email),
        new Name(name ?? ""),
        quantity);
 
    await sut.Post(expected.ToDto());
 
    dm.Verify(r => r.Create(Grandfather.Id, expected));
}

This is yet another riff on the PostValidReservationWhenDatabaseIsEmpty test - the gift that keeps giving. I've previously discussed this test in other articles.

Here I've replaced the FakeDatabase Test Double with a dynamic mock. (I am, again, using Moq, but keep in mind that the fallout of using a dynamic mock is unrelated to specific libraries.)

To go 'full dynamic mock' I should also have replaced SystemClock and InMemoryRestaurantDatabase with dynamic mocks, but that's not necessary to illustrate the point I wish to make.

This, and other tests, describe the desired outcome of making a reservation against the REST API. It's an interaction that looks like this:

POST /restaurants/90125/reservations?sig=aco7VV%2Bh5sA3RBtrN8zI8Y9kLKGC60Gm3SioZGosXVE%3D HTTP/1.1
content-type: application/json
{
  "at": "2022-12-12T20:00",
  "name": "Pearl Yvonne Gates",
  "email": "pearlygates@example.net",
  "quantity": 4
}

HTTP/1.1 201 Created
Content-Length: 151
Content-Type: application/json; charset=utf-8
Location: [...]/restaurants/90125/reservations/82e550b1690742368ea62d76e103b232?sig=fPY1fSr[...]
{
  "id": "82e550b1690742368ea62d76e103b232",
  "at": "2022-12-12T20:00:00.0000000",
  "email": "pearlygates@example.net",
  "name": "Pearl Yvonne Gates",
  "quantity": 4
}

What's of interest here is that the response includes the JSON representation of the resource that the interaction created. It's mostly a copy of the posted data, but enriched with a server-generated ID.

The code responsible for the database interaction looks like this:

private async Task<ActionResult> TryCreate(Restaurant restaurant, Reservation reservation)
{
    using var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled);
 
    var reservations = await Repository
        .ReadReservations(restaurant.Id, reservation.At)
        .ConfigureAwait(false);
    var now = Clock.GetCurrentDateTime();
    if (!restaurant.MaitreD.WillAccept(now, reservations, reservation))
        return NoTables500InternalServerError();
 
    await Repository.Create(restaurant.Id, reservation).ConfigureAwait(false);
 
    scope.Complete();
 
    return Reservation201Created(restaurant.Id, reservation);
}

The last line of code creates a 201 Created response with the reservation as content. Not shown in this snippet is the origin of the reservation parameter, but it's the input JSON document parsed to a Reservation object. Each Reservation object has an ID that the server creates when it's not supplied by the client.

The above TryCreate helper method contains all the database interaction code related to creating a new reservation. It first calls ReadReservations to retrieve the existing reservations. Subsequently, it calls Create if it decides to accept the reservation. The ReadReservations method is actually an internal extension method:

internal static Task<IReadOnlyCollection<Reservation>> ReadReservations(
    this IReservationsRepository repository,
    int restaurantId,
    DateTime date)
{
    var min = date.Date;
    var max = min.AddDays(1).AddTicks(-1);
    return repository.ReadReservations(restaurantId, min, max);
}

Notice how the dynamic-mock-based test has to replicate this internal implementation detail to the tick. If I ever decide to change this just one tick, the test is going to fail. That's already bad enough (and something that FakeDatabase gracefully handles), but not what I'm driving towards.
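
As an illustration of why a Fake is indifferent to such details, consider this sketch of a range query over an in-memory collection. It's not the FakeDatabase from the book; the InMemoryReservations class, the three-field Reservation record, the synchronous members, and the inclusive bounds are all assumptions made to keep the example small.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified, hypothetical stand-in for the article's Reservation class.
public sealed record Reservation(Guid Id, DateTime At, int Quantity);

// Hypothetical in-memory Fake. It filters stored reservations by the range
// it's given instead of matching on exact argument values.
public sealed class InMemoryReservations
{
    private readonly List<Reservation> store = new();

    public void Create(Reservation reservation) => store.Add(reservation);

    // Assumption: both bounds are inclusive.
    public IReadOnlyCollection<Reservation> ReadReservations(
        DateTime min,
        DateTime max) =>
        store.Where(r => min <= r.At && r.At <= max).ToList();
}
```

A reservation at 20:00 on a given date is returned whether the caller passes the end of the day as min.AddDays(1).AddTicks(-1) or as min.AddDays(1). A test that uses such a Fake doesn't have to know which of the two the SUT chooses.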

At the moment the TryCreate method echoes back the reservation. What if, however, you instead want to query the database and return the record that you got from the database? In this particular case, there's no reason to do that, but perhaps in other cases, something happens in the data layer that either enriches or normalises the data. So you make an innocuous change:

private async Task<ActionResult> TryCreate(Restaurant restaurant, Reservation reservation)
{
    using var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled);
 
    var reservations = await Repository
        .ReadReservations(restaurant.Id, reservation.At)
        .ConfigureAwait(false);
    var now = Clock.GetCurrentDateTime();
    if (!restaurant.MaitreD.WillAccept(now, reservations, reservation))
        return NoTables500InternalServerError();
 
    await Repository.Create(restaurant.Id, reservation).ConfigureAwait(false);
    var storedReservation = await Repository
        .ReadReservation(restaurant.Id, reservation.Id)
        .ConfigureAwait(false);
 
    scope.Complete();
 
    return Reservation201Created(restaurant.Id, storedReservation!);
}

Now, instead of echoing back reservation, the method calls ReadReservation to retrieve the (possibly enriched or normalised) storedReservation and returns that value. Since this value could, conceivably, be null, for now the method uses the ! operator to insist that this is not the case. A new test case might be warranted to cover the scenario where the query returns null.

This is perhaps a little less efficient because it implies an extra round-trip to the database, but it shouldn't change the behaviour of the system!

But when you run the test suite, that PostValidReservationWhenDatabaseIsEmpty test fails:

Ploeh.Samples.Restaurants.RestApi.Tests.ReservationsTests.PostValidReservationWhenDatabaseIsEmpty(↩
    days: 433, hours: 17, minutes: 30, email: "shli@example.org", name: "Shanghai Li", quantity: 5)↩
    [FAIL]
  System.NullReferenceException : Object reference not set to an instance of an object.
  Stack Trace:
    [...]\Restaurant.RestApi\ReservationsController.cs(94,0): at↩
      [...].RestApi.ReservationsController.Reservation201Created↩
      (Int32 restaurantId, Reservation r)
    [...]\Restaurant.RestApi\ReservationsController.cs(79,0): at↩
      [...].RestApi.ReservationsController.TryCreate↩
      (Restaurant restaurant, Reservation reservation)
    [...]\Restaurant.RestApi\ReservationsController.cs(57,0): at↩
      [...].RestApi.ReservationsController.Post↩
      (Int32 restaurantId, ReservationDto dto)
    [...]\Restaurant.RestApi.Tests\ReservationsTests.cs(73,0): at↩
      [...].RestApi.Tests.ReservationsTests.PostValidReservationWhenDatabaseIsEmpty↩
      (Int32 days, Int32 hours, Int32 minutes, String email, String name, Int32 quantity)
    --- End of stack trace from previous location where exception was thrown ---

Oh, the dreaded NullReferenceException! This happens because ReadReservation returns null, since the dynamic mock isn't configured.

The typical reaction that most people have is: Oh no, the tests broke!

I think, though, that this is the wrong perspective. The dynamic mock broke the System Under Test (SUT) because it passed an implementation of IReservationsRepository that breaks the contract. The test didn't 'break', because it was never correct from the outset.

Shotgun surgery #

When a test code base uses dynamic mocks, it tends to do so pervasively. Most tests create one or more dynamic mocks that they pass to their SUT. Most of these dynamic mocks break encapsulation, so when you refactor, the dynamic mocks break the SUT.

You'll typically need to revisit and 'fix' all the failing tests to accommodate the refactoring:

[Theory]
[InlineData(1049, 19, 00, "juliad@example.net", "Julia Domna", 5)]
[InlineData(1130, 18, 15, "x@example.com", "Xenia Ng", 9)]
[InlineData( 956, 16, 55, "kite@example.edu", null, 2)]
[InlineData( 433, 17, 30, "shli@example.org", "Shanghai Li", 5)]
public async Task PostValidReservationWhenDatabaseIsEmpty(
    int days,
    int hours,
    int minutes,
    string email,
    string name,
    int quantity)
{
    var at = DateTime.Now.Date + new TimeSpan(days, hours, minutes, 0);
    var expected = new Reservation(
        new Guid("B50DF5B1-F484-4D99-88F9-1915087AF568"),
        at,
        new Email(email),
        new Name(name ?? ""),
        quantity);
    var dm = new Mock<IReservationsRepository>();
    dm.Setup(r => r.ReadReservations(Grandfather.Id, at.Date, at.Date.AddDays(1).AddTicks(-1)))
        .ReturnsAsync(Array.Empty<Reservation>());
    dm.Setup(r => r.ReadReservation(Grandfather.Id, expected.Id)).ReturnsAsync(expected);
    var sut = new ReservationsController(
        new SystemClock(),
        new InMemoryRestaurantDatabase(Grandfather.Restaurant),
        dm.Object);
 
    await sut.Post(expected.ToDto());
 
    dm.Verify(r => r.Create(Grandfather.Id, expected));
}

The test now passes (until the next change in the SUT), but notice how top-heavy it becomes. That's a test code smell when using dynamic mocks. Everything has to happen in the Arrange phase.

You typically have many such tests that you need to edit. The name of this antipattern is Shotgun Surgery.

The implication is that refactoring by definition is impossible:

"to refactor, the essential precondition is [...] solid tests"

Martin Fowler, Refactoring

You need tests that don't break when you refactor. When you use dynamic mocks, tests tend to fail whenever you make changes in SUTs. Even though you have tests, they don't enable refactoring.

To add insult to injury, every time you edit existing tests, they become less trustworthy.

To address these problems, use Fakes instead of Mocks and Stubs. With the FakeDatabase the entire sample test suite for the online restaurant reservation system gracefully handles the change described above. No tests fail.

Spies #

If you spelunk the test code base for the book, you may also find this Test Double:

internal sealed class SpyPostOffice :
    Collection<SpyPostOffice.Observation>, IPostOffice
{
    public Task EmailReservationCreated(
        int restaurantId,
        Reservation reservation)
    {
        Add(new Observation(Event.Created, restaurantId, reservation));
        return Task.CompletedTask;
    }
 
    public Task EmailReservationDeleted(
        int restaurantId,
        Reservation reservation)
    {
        Add(new Observation(Event.Deleted, restaurantId, reservation));
        return Task.CompletedTask;
    }
 
    public Task EmailReservationUpdating(
        int restaurantId,
        Reservation reservation)
    {
        Add(new Observation(Event.Updating, restaurantId, reservation));
        return Task.CompletedTask;
    }
 
    public Task EmailReservationUpdated(
        int restaurantId,
        Reservation reservation)
    {
        Add(new Observation(Event.Updated, restaurantId, reservation));
        return Task.CompletedTask;
    }
 
    internal enum Event
    {
        Created = 0,
        Updating,
        Updated,
        Deleted
    }
 
    internal sealed class Observation
    {
        public Observation(
            Event @event,
            int restaurantId,
            Reservation reservation)
        {
            Event = @event;
            RestaurantId = restaurantId;
            Reservation = reservation;
        }
 
        public Event Event { get; }
        public int RestaurantId { get; }
        public Reservation Reservation { get; }
 
        public override bool Equals(object? obj)
        {
            return obj is Observation observation &&
                   Event == observation.Event &&
                   RestaurantId == observation.RestaurantId &&
                   EqualityComparer<Reservation>.Default.Equals(Reservation, observation.Reservation);
        }
 
        public override int GetHashCode()
        {
            return HashCode.Combine(Event, RestaurantId, Reservation);
        }
    }
}

As you can see, I've chosen to name this class with the Spy prefix, indicating that this is a Test Spy rather than a Fake Object. A Spy is a Test Double whose main purpose is to observe and record interactions. Does that break or realise encapsulation?

While I favour Fakes whenever possible, consider the interface that SpyPostOffice implements:

public interface IPostOffice
{
    Task EmailReservationCreated(int restaurantId, Reservation reservation);
 
    Task EmailReservationDeleted(int restaurantId, Reservation reservation);
 
    Task EmailReservationUpdating(int restaurantId, Reservation reservation);
 
    Task EmailReservationUpdated(int restaurantId, Reservation reservation);
}

This interface consists entirely of Commands. There's no way to query the interface to examine the state of the object. Thus, you can't check that postconditions hold exclusively via the interface. Instead, you need an additional retrieval interface to examine the posterior state of the object. The SpyPostOffice concrete class exposes such an interface.

In a sense, you can view SpyPostOffice as an in-memory message sink. It fulfils the contract.
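
Here's a sketch of how a test might use such a retrieval interface. The types are pared-down, hypothetical versions: IPostOffice has a single Command, the Spy records tuples instead of Observation objects, and Receptionist is an invented SUT that exists only for this example.

```csharp
using System;
using System.Collections.ObjectModel;
using System.Threading.Tasks;

// Hypothetical, pared-down Command-only interface.
public interface IPostOffice
{
    Task EmailReservationCreated(int restaurantId, string email);
}

// The Spy records each Command it receives. Because it derives from
// Collection<T>, the inherited collection API is its retrieval interface.
public sealed class SpyPostOffice :
    Collection<(int RestaurantId, string Email)>, IPostOffice
{
    public Task EmailReservationCreated(int restaurantId, string email)
    {
        Add((restaurantId, email));
        return Task.CompletedTask;
    }
}

// Invented SUT: it only knows about IPostOffice.
public sealed class Receptionist
{
    private readonly IPostOffice postOffice;

    public Receptionist(IPostOffice postOffice)
    {
        this.postOffice = postOffice;
    }

    public Task Accept(int restaurantId, string email) =>
        postOffice.EmailReservationCreated(restaurantId, email);
}
```

After awaiting new Receptionist(spy).Accept(1, "x@example.com"), a test can verify the postcondition with spy.Contains((1, "x@example.com")). The SUT sees only the Commands, while the test examines the recorded state.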

Concurrency #

Perhaps you're still not convinced. You may argue, for example, that the (partial) contract that I stated is naive. Consider, again, the implications expressed as code:

await sut.Create(restaurantId, expected);
var actual = await sut.ReadReservation(restaurantId, expected.Id);
 
Assert.Equal(expected, actual);

You may argue that in the face of concurrency, another thread or process could be making changes to the reservation after Create, but before ReadReservation. Thus, you may argue, the contract I've stipulated is false. In a real system, we can't expect that to be the case.

I agree.

Concurrency makes things much harder. Even in that light, I think the above line of reasoning is appropriate, for two reasons.

First, I chose to model IReservationsRepository like I did because I didn't expect high contention on individual reservations. In other words, I don't expect two or more concurrent processes to attempt to modify the same reservation at the same time. Thus, I found it appropriate to model the Repository as

"a collection-like interface for accessing domain objects."

Edward Hieatt and Rob Mee, in Martin Fowler, Patterns of Enterprise Application Architecture, Repository pattern

A collection-like interface implies both data retrieval and collection manipulation members. In low-contention scenarios like the reservation system, this turns out to be a useful model. As the aphorism goes, all models are wrong, but some models are useful. Treating IReservationsRepository as a collection accessed in a non-concurrent manner turned out to be useful in this code base.

Had I been more worried about data contention, a move towards CQRS seems promising. This leads to another object model, with different contracts.

Second, even in the face of concurrency, most unit test cases are implicitly running on a single thread. While they may run in parallel, each unit test exercises the SUT on a single thread. This implies that reads and writes against Test Doubles are serialised.

Even if concurrency is a real concern, you'd still expect that if only one thread is manipulating the Repository object, then what you Create you should be able to retrieve. The contract may be a little looser, but it'd still be a violation of the principle of least surprise if it was any different.

Conclusion #

In object-oriented programming, encapsulation is the notion of separating the affordances of an object from its implementation details. I find it most practical to think about this in terms of contracts, which again can be subdivided into sets of preconditions, invariants, and postconditions.

Polymorphic objects (like interfaces and base classes) come with contracts as well. When you replace 'real' implementations with Test Doubles, the Test Doubles should also fulfil the contracts. Fake objects do that; Test Spies may also fit that description.

When Test Doubles obey their contracts, you can refactor your SUT without breaking your test suite.

By default, however, dynamic mocks break encapsulation because they don't fulfil the objects' contracts. This leads to fragile tests.

Favour Fakes over dynamic mocks. You can read more about this way to write tests by following many of the links in this article, or by reading my book Code That Fits in Your Head.

Comments

Matthew Wiemer #

Excellent article exploring the nuances of encapsulation as it relates to testing. That said, the examples here left me with one big question: what exactly is covered by the tests using `FakeDatabase`?

This line in particular is confusing me (as to its practical use in a "real-world" setting): `var sut = new FakeDatabase();`

How can I claim to have tested the real system's implementation when the "system under test" is, in this approach, explicitly _not_ my real system? It appears the same criticism of dynamic mocks surfaces: "you're only testing the fake database". Does this approach align with any claim you are testing the "real database"?

When testing the data-layer, I have historically written (heavier) tests that integrate with a real database to exercise a system's data-layer (as you describe with `SqlReservationsRepository`). I find myself reaching for dynamic mocks in the context of exercising an application's domain layer -- where the data-layer is a dependency providing indirect input/output. Does this use of mocks violate encapsulation in the way this article describes? I _think_ not, because in that case a dynamic mock is used to represent states that are valid "according to the contract", but I'm hoping you could shed a bit more light on the topic. Am I putting the pieces together correctly?

Rephrasing the question using your Reservations example code, I would typically inject `IReservationsRepository` into `MaitreD` (which you opt not to do) and outline the possible database return values (or commands) using dynamic mocks in a test suite of `MaitreD`. What drawbacks, if any, would that approach lead to with respect to encapsulation and test fragility?

2022-11-02 20:11 UTC

Mark Seemann #

Matthew, thank you for writing. I apologise if the article is unclear about this, but nowhere in the real code base do I have a test of FakeDatabase. I only wrote the tests that exercise the Test Doubles to illustrate the point I was trying to make. These tests only exist for the benefit of this article.

The first CreateAndReadRoundTrip test in the article shows a real integration test. The System Under Test (SUT) is the SqlReservationsRepository class, which is part of the production code - not a Test Double.

That class implements the IReservationsRepository interface. The point I was trying to make is that the CreateAndReadRoundTrip test already exercises a particular subset of the contract of the interface. Thus, if one replaces one implementation of the interface with another implementation, according to the Liskov Substitution Principle (LSP) the test should still pass.

This is true for FakeDatabase. While the behaviour is different (it doesn't persist data), it still fulfils the contract. Dynamic mocks, on the other hand, don't automatically follow the LSP. Unless one is careful and explicit, dynamic mocks tend to weaken postconditions. For example, a dynamic mock doesn't automatically return the added reservation when you call ReadReservation.

This is an essential flaw of dynamic mock objects that is independent of where you use them. My article already describes how a fairly innocuous change in the production code will cause a dynamic mock to break the test.

I no longer inject dependencies into domain models, since doing so makes the domain model impure. Even if I did, however, I'd still have the same problem with dynamic mocks breaking encapsulation.

2022-11-04 7:06 UTC
