IO container in a parallel C# universe

Monday, 15 June 2020 05:55:00 UTC

A C# model of IO at the type level.

This article is part of an article series about the IO container in C#. The previous article provided a conceptual overview. In this article you'll see C# code examples.

In a world... #

Imagine a parallel universe where a C# entry point is supposed to look like this:

static IO<Unit> Main(string[] args)

Like another Neo, you notice that something in your reality is odd, so you decide to follow the white rabbit. Unit? What's that? Navigating to its definition, you see this:

public sealed class Unit
{
    public static readonly Unit Instance;
}

There's not much to see here. Unit is a type that serves the same role as the void keyword. They're isomorphic, but Unit has the advantage that it's a proper type. This means that it can be a type parameter in a generically typed container like IO.

So what's IO? When you view its definition, this is what you see:

public sealed class IO<T>
{ 
    public IO(Func<T> item) 
    public IO<TResult> SelectMany<TResult>(Func<T, IO<TResult>> selector)
}

There's a constructor you can initialise with a lazily evaluated value, and a SelectMany method that looks strikingly like something out of LINQ.

You'll probably notice right away that while you can put a value into the IO container, you can't get it back. As the introductory article explained, this is by design. Still, you may think: What's the point? Why would I ever want to use this class?
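
If you're curious how such a class could be implemented, here's a minimal sketch consistent with the above API. This is only my guess at what might hide behind that definition, not code from this parallel universe:

public sealed class IO<T>
{
    private readonly Func<T> item;

    public IO(Func<T> item)
    {
        this.item = item;
    }

    public IO<TResult> SelectMany<TResult>(Func<T, IO<TResult>> selector)
    {
        // Defer everything: the selector only runs when the resulting
        // computation is eventually evaluated, e.g. by the hypothetical
        // runtime that invokes Main.
        return new IO<TResult>(() => selector(item()).item());
    }
}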

You get part of the answer when you try to implement your program's entry point. In order for it to compile, the Main method must return an IO<Unit> object. Thus, the simplest Main method that compiles is this:

static IO<Unit> Main(string[] args)
{
    return new IO<Unit>(() => Unit.Instance);
}

That's only a roundabout no-op. What if you want to write real code? Like hello world?

Impure actions #

You'd like to write a program that asks the user about his or her name. Based on the answer, and the time of day, it'll write Hello, Nearth, or Good evening, Kate. You'd like to know how to take user input and write to the standard output stream. In this parallel world, the Console API looks like this:

public static class Console
{ 
    public static IO<string> ReadLine(); 
    public static IO<Unit> WriteLine(string value);
    // More members here...
}

You notice that both methods return IO values. This immediately tells you that they're impure. This is hardly surprising, since ReadLine is non-deterministic and WriteLine has a side effect.

You'll also need the current time of day. How do you get that?

public static class Clock
{
    public static IO<DateTime> GetLocalTime();
}

Again, IO signifies that the returned DateTime value is impure; it's non-deterministic.

Pure logic #

A major benefit of functional programming is the natural separation of concerns; separation of business logic from implementation details (a.k.a. the Dependency Inversion Principle).

Write the logic of the program as a pure function:

public static string Greet(DateTime now, string name)
{
    var greeting = "Hello";
    if (IsMorning(now))
        greeting = "Good morning";
    if (IsAfternoon(now))
        greeting = "Good afternoon";
    if (IsEvening(now))
        greeting = "Good evening";
 
    if (string.IsNullOrWhiteSpace(name))
        return $"{greeting}.";
 
    return $"{greeting}{name.Trim()}.";
}

You can tell that this is a pure function from its return type. In this parallel universe, all impure library methods look like the above Console and Clock methods. Thus, by elimination, a method that doesn't return IO is pure.

Composition #

You have impure actions you can invoke, and a pure piece of logic. You can use ReadLine to get the user's name, and GetLocalTime to get the local time. When you have those two pieces of data, you can call the Greet method.

This is where most people run aground. "Yes, but Greet needs a string, and I have an IO<string>. How do I get the string out of the IO?"

If you've been following the plot so far, you know the answer: mu. You don't. You compose all the things with SelectMany:

static IO<Unit> Main(string[] args)
{
    return Console.WriteLine("What's your name?").SelectMany(_ =>
        Console.ReadLine().SelectMany(name =>
        Clock.GetLocalTime().SelectMany(now =>
    {
        var greeting = Greeter.Greet(now, name);
        return Console.WriteLine(greeting);
    })));
}

I'm not going to walk through all the details of how this works. There are plenty of monad tutorials out there, but take a moment to contemplate the SelectMany method's selector argument. It takes a plain T value as input, but must return an IO<TResult> object. That's what each of the above lambda expressions does, but that means that name and now are unwrapped values; i.e. string and DateTime.

That means that you can call Greet with name and now, which is exactly what happens.
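
To make the types explicit, consider the shape of the innermost selector in isolation. This is a paraphrase for illustration only; the hard-coded "Kate" stands in for the name value that the enclosing lambda supplies:

// The innermost selector receives an unwrapped DateTime and must return an
// IO<TResult> - here, the IO<Unit> produced by the impure WriteLine action.
Func<DateTime, IO<Unit>> selector = now =>
    Console.WriteLine(Greeter.Greet(now, "Kate"));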

Notice that the above lambda expressions are nested. With idiomatic formatting, they'd exhibit the dreaded arrow shape, but with this formatting, it looks more like the sequential composition that it actually is.

Conclusion #

The code you've seen here all works. The only difference between this hypothetical C# and the real C# is that your Main method can't look like that, and impure library methods don't return IO values.

The point of the article isn't to recommend this style of programming. You can't, really, since it relies on the counter-factual assumption that all impure library methods return IO. The point of the article is to explain how a language like Haskell uses the type system to distinguish between pure functions and impure actions.

Perhaps the code in that Main method isn't the prettiest code you've ever seen, but we can make it a little nicer. That's the topic of the next article in the series.

Next: Syntactic sugar for IO.


Comments

Why is the input of the IO constructor of type Lazy<T> instead of just type T?

2020-06-22 12:16 UTC

Tyson, thank you for writing. A future article in this article series will answer that question 😀

2020-06-22 12:25 UTC

FWIW, the promised article is now available.

2020-07-06 6:01 UTC

Ah, sure. I was thinking that IO<T> is a monad (in T), so there should be a constructor with argument T. However, the function doesn't need to be a constructor. The lambda expression t => new IO<T>(new Lazy<T>(() => t)) satisfies the requirement.

2020-07-13 19:35 UTC

Yes 👍

2020-07-14 6:19 UTC

The IO Container

Monday, 08 June 2020 05:53:00 UTC

How a type system can distinguish between pure and impure code.

Referential transparency is the foundation of functional architecture. If you categorise all operations into pure functions and impure actions, then most other traits associated with functional programming follow.

Unfortunately, mainstream programming languages don't distinguish between pure functions and impure actions. Identifying pure functions is tricky, and the knowledge is fleeting. What was a pure function today may become impure next time someone changes the code.

Separating pure and impure code is important. It'd be nice if you could automate the process. Perhaps you could run some tests, or, even better, make the compiler do the work. That's what Haskell and a few other languages do.

In Haskell, the distinction is made with a container called IO. This static type enforces the functional interaction law at compile time: pure functions can't invoke impure actions.

Opaque container #

Regular readers of this blog know that I often use Haskell to demonstrate principles of functional programming. If you don't know Haskell, however, its ability to guarantee the functional interaction law at compile time may seem magical. It's not.

Fortunately, the design is so simple that it's easy to explain the fundamental concept: Results of impure actions are always enclosed in an opaque container called IO. You can think of it as a box with a label.

A cardboard box with the label 'int inside' on the side.

The label only tells you about the static type of the value inside the box. It could be an int, a DateTime, or your own custom type, say Reservation. While you know what type of value is inside the box, you can't see what's in it, and you can't open it.

Name #

The container itself is called IO, but don't take the word too literally. While all I/O (input/output) is inherently impure, other operations that you don't typically think of as I/O are impure as well. Generation of random numbers (including GUIDs) is the most prominent example. Random number generators rely on the system clock, which you can think of as an input device, although I think many programmers don't.

I could have called the container Impure instead, but I chose to go with IO, since this is the word used in Haskell. It also has the advantage of being short.

What's in the boooox? #

A question frequently comes up: How do I get the value out of my IO? As always, the answer is mu. You don't. You inject the desired behaviour into the container. This goes for all monads, including IO.

But naturally you wonder: If you can't see the value inside the IO box then what's the point?

The point is to enforce the functional interaction law at the type level. A pure function that calls an impure action will receive a sealed, opaque IO box. There's no API that enables a pure function to extract the contents of the container, so this effectively enforces the rule that pure functions can't call impure actions.

The other three types of interactions are still possible.

  • Pure functions should be able to call pure functions. Pure functions return 'normal' values (i.e. values not hidden in IO boxes), so they can call each other as usual.
  • Impure actions should be able to call pure functions. This becomes possible because you can inject pure behaviour into any monad. You'll see examples of that in later articles in this series.
  • Impure actions should be able to call other impure actions. Likewise, you can compose many IO actions into one IO action via the IO API.

When you're outside of all IO boxes, there's no way to open the box, and you can't see its contents.

On the other hand, if you're already inside the box, you can see the contents. And there's one additional rule: If you're already inside an IO box, you can open other IO boxes and see their contents!

In subsequent articles in this article series, you'll see how all of this manifests as C# code. This article gives a high-level view of the concept. I suggest that you go back and re-read it once you've seen the code.

The many-worlds interpretation #

If you're looking for metaphors or other ways to understand what's going on, there are two perspectives I find useful. Neither offers the full picture, but together, I find that they help.

A common interpretation of IO is that it's like the box in which you put Schrödinger's cat. IO<Cat> can be viewed as the superposition of the two states of the cat (assuming that Cat is basically a sum type with the cases Alive and Dead). Likewise, IO<int> represents the superposition of all 4,294,967,296 32-bit integers, IO<string> the superposition of infinitely many strings, etcetera.

Only when you observe the contents of the box does the superposition collapse to a single value.

But... you can't observe the contents of an IO box, can you?

The black hole interpretation #

The IO container represents an impenetrable barrier between the outside and the inside. It's like a black hole. Matter can fall into a black hole, but no information can escape its event horizon.

In high school I took cosmology, among many other things. I don't know if the following is still current, but we learned a formula for calculating the density of a black hole, based on its mass. When you input the estimated mass of the universe, the formula suggests a density near vacuum. Wait, what?! Are we actually living inside a black hole? Perhaps. Could there be other universes 'inside' black holes?

The analogy to the IO container seems apt. You can't see into a black hole from the outside, but once beyond the blue event horizon, you can observe everything that goes on in that interior universe. You can't escape to the original universe, though.

As with all metaphors, this one breaks down if you think too much about it. Code running in IO can unpack other IO boxes, even nested boxes. There's no reason to believe that if you're inside a black hole that you can then gaze beyond the event horizon of nested black holes.

Code examples #

In the next articles in this series, you'll see C# code examples that illustrate how this concept might be implemented. The purpose of these code examples is to give you a sense for how IO works in Haskell, but with more familiar syntax.

These code examples are illustrations, not recommendations.

Conclusion #

When you saw the title, did you think that this would be an article about IoC Containers? It's not. The title isn't a typo, and I never use the term IoC Container. As Steven and I explain in our book, Inversion of Control (IoC) is a broader concept than Dependency Injection (DI); such a library is more accurately called a DI Container.

IO, on the other hand, is a container of impure values. Its API enables you to 'build' bigger structures (programs) from smaller IO boxes. You can compose IO actions together and inject pure functions into them. The boxes, however, are opaque. Pure functions can't see their contents. This effectively enforces the functional interaction law at the type level.

Next: IO container in a parallel C# universe.


Comments

Come on, you know there is a perfect metaphor. Monads are like burritos.

2020-06-08 11:08 UTC

Christer, I appreciate that this is all in good fun 🤓

For the benefit of readers who may still be trying to learn these concepts, I'll point out that just as this isn't an article about IoC containers, it's not a monad tutorial. It's an explanation of a particular API called IO, which, among other traits, also forms a monad. I'm trying to downplay that aspect here, because I hope that you can understand most of this and the next articles without knowing what a monad is.

2020-06-08 11:27 UTC

"While you know what type of value is inside the box, you can't see what's in it, and you can't open it."

Well you technically can, with unsafePerformIO ;) although it defeats the whole purpose.

2020-06-10 02:31 UTC

Retiring old service versions

Monday, 01 June 2020 09:36:00 UTC

A few ideas on how to retire old versions of a web service.

I was recently listening to a .NET Rocks! episode on web APIs, and one of the topics that came up was how to retire old versions of a web service. It's not easy to do, but not necessarily impossible.

The best approach to API versioning is to never break compatibility. As long as there are no breaking changes, you don't have to version your API. It's rarely possible to completely avoid breaking changes, but the fewer of those that you introduce, the fewer API versions you have to maintain. I've previously described how to version REST APIs, but this article doesn't assume any particular versioning strategy.

Sooner or later you'll have an old version that you'd like to retire. How do you do that?

Incentives #

First, take a minute to understand why some clients keep using old versions of your API. It may be obvious, but I meet enough programmers who never give incentives a thought that I find it worthwhile to point out.

When you introduce a breaking change, by definition this is a change that breaks clients. Thus, upgrading from an old version to a newer version of your API is likely to give client developers extra work. If what they have already works to their satisfaction, why should they upgrade their clients?

You might argue that your new version is 'better' because it has more features, or is faster, or whatever it is that makes it better. Still, if the old version is good enough for a particular purpose, some clients are going to stay there. The client maintainers have no incentives to upgrade. There's only cost, and no benefit, to upgrading.

Even if the developers who maintain those clients would like to upgrade, they may be prohibited from doing so by their managers. If there's no business reason to upgrade, efforts are better used somewhere else.

Advance warning #

Web services exist in various contexts. Some are only available on an internal network, while others are publicly available on the internet. Some require an authentication token or API key, while others allow anonymous client access.

With some services, you have a way to contact every single client developer. With other services, you don't know who uses your service.

Regardless of the situation, when you wish to retire a version, you should first try to give clients advance warning. If you have an address list of all client developers, you can simply write them all, but you can't expect that everyone reads their mail. If you don't know the clients, you can publish the warning. If you have a blog or a marketing site, you can publish the warning there. If you run a mailing list, you can write about the upcoming change there. You can tweet it, post it on Facebook, or dance it on TikTok.

Depending on SLAs and contracts, there may even be a legally valid communications channel that you can use.

Give advance warning. That's the decent thing to do.

Slow it down #

Even with advance warning, not everyone gets the message. Or, even if everyone got the message, some people deliberately decide to ignore it. Consider their incentives. They may gamble that as long as your logs show that they use the old version, you'll keep it online. What do you do then?

You can, of course, just take the version offline. That's going to break clients that still depend on that version. That's rarely the best course of action.

Another option is to degrade the performance of that version. Make it slower. You can simply add a constant delay when clients access that service, or you can gradually degrade performance.

Many HTTP client libraries have long timeouts. For example, the default HttpClient timeout is 100 seconds. Imagine that you want to introduce a gradual performance degradation that starts at no delay on June 1, 2020 and reaches 100 seconds after one year. You can use the formula d = 100 s * (t - t0) / 1 year, where d is the delay, t is the current time, and t0 is the start time (e.g. June 1, 2020). This'll cause requests for resources to gradually slow down. After a year, clients still talking to the retiring version will begin to time out.
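
If you want to sketch that calculation in code, it could look like the following. The method name and types are mine, purely for illustration; adjust the constants to your own timeline:

// A sketch of the gradual delay: zero on 1 June 2020, 100 seconds one year later.
public static TimeSpan CalculateDelay(DateTime now)
{
    var start = new DateTime(2020, 6, 1);        // t0
    var period = TimeSpan.FromDays(365);         // one year
    var maxDelay = TimeSpan.FromSeconds(100);    // matches the default HttpClient timeout

    var fraction = (now - start).TotalDays / period.TotalDays;
    fraction = Math.Max(0.0, Math.Min(1.0, fraction));
    return TimeSpan.FromSeconds(maxDelay.TotalSeconds * fraction);
}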

You can think of this as another way to give advance warning. With the gradual deterioration of performance, users will likely notice the long wait times well before calls actually time out.

When client developers contact you about the bad performance, you can tell them that the issue is fixed in more recent versions. You've just given the client organisation an incentive to upgrade.

Failure injection #

Another option is to deliberately make the service err now and then. Randomly return a 500 Internal Server Error from time to time, even if the service can handle the request.

Like deliberate performance degradation, you can gradually make the deprecated version more and more unreliable. Again, end users are likely to start complaining about the unreliability of the system long before it becomes entirely useless.
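
As a sketch, and assuming the same one-year timeline as above, the decision to inject a failure could be based on a probability that grows over time. Again, the names are mine and only serve as an illustration:

// Gradually increasing failure injection: the probability of returning an
// error grows linearly from 0 to 1 over one year.
public static bool ShouldInjectFailure(DateTime now, Random rng)
{
    var start = new DateTime(2020, 6, 1);
    var period = TimeSpan.FromDays(365);

    var probability = (now - start).TotalDays / period.TotalDays;
    probability = Math.Max(0.0, Math.Min(1.0, probability));
    return rng.NextDouble() < probability;
}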

Reverse proxy #

One of the many benefits of HTTP-based services is that you can put a reverse proxy in front of your application servers. I've no idea how to configure or operate NGINX or Varnish, but from talking to people who do know, I get the impression that they're quite scriptable.

Since the above ideas are independent of actual service implementation or behaviour, it's a generic problem that you should seek to address with general-purpose software.

Sequence diagram of a client, reverse proxy, and application server.

Imagine having a general-purpose reverse proxy that detects whether incoming HTTP requests are for the version you'd like to retire (version 1 in the diagram) or another version. If the proxy detects that the request is for a retiring version, it inserts a delay before it forwards the request to the application server. For all requests for current versions, it just forwards the request.

I could imagine doing something similar with failure injections.

Legal ramifications #

All of the above are only ideas. If you can use them, great. Consider the implications, though. You may be legally obliged to keep an SLA. In that case, you can't degrade the performance or reliability below the SLA level.

In any case, I don't think you should do any of these things lightly. Involve relevant stakeholders before you engage in something like the above.

Legal specialists are as careful and conservative as traditional operations teams. Their default reaction to any change proposal is to say no. That's not a judgement on their character or morals, but simply another observation based on incentives. As long as everything works as it's supposed to work, any change represents a risk. Legal specialists, like operations teams, are typically embedded in incentive systems that punish risk-taking.

To counter other stakeholders' reluctance to engage in unorthodox behaviour, you'll have to explain why retiring an old version of the service is important. It works best if you can quantify the reason. If you can, measure how much extra time you waste on maintaining the old version. If the old version runs on separate hardware, or a separate cloud service, quantify the cost overhead.

If you can't produce a compelling argument to retire an old version, then perhaps it isn't that important after all.

Logs #

Server logs can be useful. They can tell you how many requests the old version serves, which IP addresses they come from, at which times or dates you have most traffic, and whether the usage trend is increasing or decreasing.

These measures can be used to support your argument that a particular version should be retired.

Conclusion #

Versioning web services is already a complex subject, but once you've figured it out, you'll sooner or later want to retire an old version of your API. If some clients still make good use of that version, you'll have to give them incentive to upgrade to a newer version.

It's best if you can proactively make clients migrate, so prioritise amiable solutions. Sometimes, however, you have no way to reach all client developers, or no obvious way to motivate them to upgrade. In those cases, gentle and gradual performance or reliability degradation of deprecated versions could be a way.

I present these ideas as nothing more than that: ideas. Use them if they make sense in your context, but think things through. The responsibility is yours.


Comments

Hi Mark. As always an excellent piece. A few comments if I may.

An assumption seems to be, that the client is able to update to a new version of the API, but is not inclined to do so for various reasons. I work with organisations where updating a client is nearly impossible. Not because of lack of will, but due to other factors such as government regulations, physical location of client, hardware version or capabilities of said client to name just a few.

We have a tendency in the software industry to see updates as a question of 'running Windows update' and then all is good. Most likely because that is the world we live in. If we wish to update a program or even an OS, it is fairly easy, even with your above considerations taken into account.

In the 'physical' world of manufacturing (or pharmacy or mining or ...) the situation is different. The lifetime of the 'thing' running the client is regularly measured in years or even decades and not weeks or months as it is for a piece of software.

Updates are often equal to bloating of resource requirements meaning you have to replace the hardware. This might not always be possible again for various reasons. Cost (company is pressed on margin or client is located in Outer Mongolia) or risk (client is located in Syria or some other hotspot) are some I've personally come across.

REST/HTTP is not used. I acknowledge that the original podcast from .NET Rocks! was about updating a web service. This does not change the premises of your arguments, but it potentially adds a layer of complication.

2020-06-02 14:47 UTC

Karsten, thank you for writing. You are correct that the underlying assumption is that you can retire old versions.

I, too, have written REST APIs where retiring service versions weren't an option. These were APIs that consumer-grade hardware would communicate with. We had no way to assure that consumers would upgrade their hardware. Those boxes wouldn't have much of a user interface. Owners might not realise that firmware updates were available, even if they were.

This article does, indeed, assume that the reader has made an informed decision that it's fundamentally acceptable to retire a service version. I should have made that more explicit.

2020-06-04 5:32 UTC

Where's the science?

Monday, 25 May 2020 05:50:00 UTC

Is a scientific discussion about software development possible?

Have you ever found yourself in a heated discussion about a software development topic? Which is best? Tabs or spaces? Where do you put the curly brackets? Is significant whitespace a good idea? Is Python better than Go? Does test-driven development yield an advantage? Is there a silver bullet? Can you measure software development productivity?

I've had plenty of such discussions, and I'll have them again in the future.

While some of these discussions may resemble debates on how many angels can dance on the head of a pin, other discussions might be important. Ex ante, it's hard to tell which is which.

Why don't we settle these discussions with science?

A notion of science #

I love science. Why don't I apply scientific knowledge instead of arguments based on anecdotal evidence?

To answer such questions, we must first agree on a definition of science. I favour Karl Popper's description of empirical falsifiability. A hypothesis that makes successful falsifiable predictions of the future is a good scientific theory. Such a theory has predictive power.

Newton's theory of gravity had ample predictive power, but Einstein's theory of general relativity supplanted it because its predictive power was even better.

Mendel's theory of inheritance had predictive power, but was refined into modern-day genetics, which yields much greater predictive power.

Is predictive power the only distinguishing trait of good science? I'm already venturing into controversial territory by taking this position. I've met people in the programming community who consider my position naive or reductionist.

What about explanatory power? If a theory satisfactorily explains observed phenomena, doesn't that count as a proper scientific theory?

Controversy #

I don't believe in explanatory power as a sufficient criterion for science. Without predictive power, we have little evidence that an explanation is correct. An explanatory theory can even be internally consistent, and yet we may not know if it describes reality.

Theories with explanatory power are the realm of politics or religion. Consider the observation that some people are rich and some are poor. You can believe in a theory that explains this by claiming structural societal oppression. You can believe in another theory that views poor people as fundamentally lazy. Both are (somewhat internally consistent) political theories, but they have yet to demonstrate much predictive power.

Likewise, you may believe that some deity created the universe, but that belief produces no predictions. You can apply Occam's razor and explain the same phenomena without a god. A belief in one or more gods is a religious theory, not a scientific theory.

It seems to me that there's a correlation between explanatory power and controversy. Over time, theories with predictive power become uncontroversial. Even if they start out controversial (such as Einstein's theory of general relativity), the dust soon settles because it's hard to argue with results.

Theories with mere explanatory power, on the other hand, can fuel controversy forever. Explanations can be compelling, and without evidence to refute them, the theories linger.

Ironically, you might argue that Popper's theory of scientific discovery itself is controversial. It's a great explanation, but does it have predictive power? Not much, I admit, but I'm also not aware of competing views on science with better predictive power. Thus, you're free to disagree with everything in this article. I admit that it's a piece of philosophy, not of science.

The practicality of experimental verification #

We typically see our field of software development as one of the pillars of STEM. Many of us have STEM educations (I don't; I'm an economist). Yet, we're struggling to come to grips with the lack of scientific methodology in our field. It seems to me that we suffer from physics envy.

It's really hard to compete with physics when it comes to predictive power, but even modern physics struggles with experimental verification. Consider an establishment like CERN. It takes billions of euros of investment to make today's physics experiments possible. The only reason we make such investments, I think, is that physics so far has had a good track record.

What about another fairly 'hard' science like medicine? In order to produce proper falsifiable predictions, medical science has evolved the process of the randomised controlled trial. It works well when you're studying short-term effects. Does this medicine cure this disease? Does this surgical procedure improve a patient's condition? How does lack of REM sleep for three days affect your ability to remember strings of numbers?

When a doctor tells you that a particular medicine helps, or that surgery might be warranted, he or she is typically on solid scientific grounds.

Here's where things start to become less clear, though. What if a doctor tells you that a particular diet will improve your expected life span? Is he or she on solid scientific ground?

That's rarely the case, because you can't make randomised controlled trials about life styles. Or, rather, a totalitarian society might be able to do that, but we'd consider it unethical. Consider what it would involve: You'd have to randomly select a significant number of babies and group them into those who must follow a particular life style, and those who must not. Then you'll somehow have to force those people to stick to their randomly assigned life style for the entirety of their lives. This is not only unethical, but the experiment also takes most of a century to perform.

What life-style scientists instead do is resort to demographic studies, with all the problems of statistics they involve. Again, the question is whether scientific theories in this field offer predictive power. Perhaps they do, but it takes decades to evaluate the results.

My point is that medicine isn't exclusively a hard science. Some medicine is, and some is closer to social sciences.

I'm an economist by education. Economics is generally not considered a hard science, although it's a field where it's trivial to make falsifiable predictions. Almost all of economics is about numbers, so making a falsifiable prediction is trivial: The MSFT stock will be at 200 by January 1, 2021. The unemployment rate in Denmark will be below 5% in the third quarter of 2020. The problem with economics is that most of such predictions turn out to be no better than the toss of a coin - even when made by economists. You can make falsifiable predictions in economics, but most of them do, in fact, get falsified.

On the other hand, with the advances in such disparate fields as DNA forensics, satellite surveys, and computer-aided image processing, a venerable 'art' like archaeology is gaining predictive power. We predict that if we dig here, we'll find artefacts from the iron age. We predict that if we make a DNA test of these skeletal remains, they'll show that the person buried was a middle-aged woman. And so on.

One thing is the ability to produce falsifiable predictions. Another thing is whether or not the associated experiment is practically possible.

The science of software development #

Do we have a science of software development? I don't think that we have.

There's computer science, but that's not quite the same. That field of study has produced many predictions that hold. In general, quicksort will be faster than bubble sort. There's an algorithm for finding the shortest way through a network. That sort of thing.

You will notice that these results are hardly controversial. It's not those topics that we endlessly debate.

We debate whether certain ways to organise work are more 'productive'. The entire productivity debate revolves around an often implicit context: that what we discuss is long-term productivity. We don't much argue about how to throw something together during a weekend hackathon. We argue whether we can complete a nine-month software project safer with test-driven development. We argue whether a code base can better sustain its organisation year after year if it's written in F# or JavaScript.

There's little scientific evidence on those questions.

The main problem, as I see it, is that it's impractical to perform experiments. Coming up with falsifiable predictions is easy.

Let's consider an example. Imagine that your hypothesis is that test-driven development makes you more productive in the middle and long run. You'll have to turn that into a falsifiable claim, so first, pick a software development project of sufficient complexity. Imagine that you can find a project that someone estimates will take around ten months to complete for a team of five people. This has to be a real project that someone needs done, complete with vague, contradictory, and changing requirements. Now you formulate your falsifiable prediction, for example: "This project will be delivered one month earlier with test-driven development."

Next, you form teams to undertake the work. One team to perform the work with test-driven development, and one team to do the work without it. Then you measure when they're done.

This is already impractical, because who's going to pay for two teams when one would suffice?

Perhaps, if you're an exceptional proposal writer, you could get a research grant for that, but alas, that wouldn't be good enough.

With two competing teams of five people each, it might happen that one team member exhibits productivity orders of magnitudes different from the others. That could skew the experimental outcome, so you'd have to go for a proper randomised controlled trial. This would involve picking numerous teams and assigning a methodology at random: either they do test-driven development, or they don't. Nothing else should vary. They should all use the same programming language, the same libraries, the same development tools, and work the same hours. Furthermore, no-one outside the team should know which teams follow which method.

Theoretically possible, but impractical. It would require finding and paying many software teams for most of a year. One such experiment would cost millions of euros.

If you did such an experiment, it would tell you something, but it'd still be open to interpretation. You might argue that the programming language used caused the outcome, but that one can't extrapolate from that result to other languages. Or perhaps there was something special about the project that you think doesn't generalise. Or perhaps you take issue with the pool from which the team members were drawn. You'd have to repeat the experiment while varying one of the other dimensions. That'll cost millions more, and take another year.

Considering the billions of euros/dollars/pounds the world's governments pour into research, you'd think that we could divert a few hundred millions to do proper research in software development, but it's unlikely to happen. That's the reason we have to content ourselves with arguing from anecdotal evidence.

Conclusion #

I can imagine how scientific inquiry into software engineering could work. It'd involve making a falsifiable prediction, and then setting up an experiment to prove it wrong. Unfortunately, to be on a scientifically sound footing, experiments should be performed as randomised controlled trials, with a statistically significant number of participants. It's not too hard to conceive of such experiments, but they'd be prohibitively expensive.

In the meantime, the software development industry moves forward. We share ideas and copy each other. Some of us are successful, and some of us fail. Slowly, this might lead to improvements.

That process, however, looks more like evolution than scientific progress. The fittest software development organisations survive. They need not be the best, as they could be stuck in local maxima.

When we argue, when we write blog posts, when we speak at conferences, when we appear on podcasts, we exchange ideas and experiences. Not genes, but memes.


Comments

Sergey Petrov #

That topic is something many software developers think about, at least I do from time to time.

Your post reminded me of that conference talk Intro to Empirical Software Engineering: What We Know We Don't Know by Hillel Wayne. Just curious - have you seen the talk and if so - what do you think? The research mentioned in the talk doesn't consist of proper scientific experiments as you describe, but it looked really interesting to me anyway.

2020-05-28 12:40 UTC

Sergey, thank you for writing. I didn't know about that talk, but Hillel Wayne regularly makes an appearance in my Twitter feed. I've now seen the talk, and I think it offers a perspective close to mine.

I've already read The Leprechauns of Software Engineering (above, I linked to my review), but while I was aware of Making Software, I've yet to read it. Several people have reacted to my article by recommending that book, so it's now on its way to me in the mail.

2020-06-02 10:26 UTC

Modelling versus shaping reality

Monday, 18 May 2020 07:08:00 UTC

How does software development relate to reality?

I recently appeared as a guest on the .NET Rocks! podcast where we discussed Fred Brooks' 1986 essay No Silver Bullet. As a reaction, Jon Suda wrote a thoughtful piece of his own. That made me think some more.

Beware of Greeks... #

Brooks' central premise is Aristotle's distinction between essential and accidental complexity. I've already examined the argument in my previous article on the topic, but in summary, Brooks posits that complexity in software development can be separated into those two categories:

c = E + a

Here, c is the total complexity, E is the essential complexity, and a the accidental complexity. I've deliberately represented c and a with lower-case letters to suggest that they represent variables, whereas I mean to suggest with the upper-case E that the essential complexity is constant. That's Brooks' argument: Every problem has an essential complexity that can't be changed. Thus, your only chance to reduce total complexity c is to reduce the accidental complexity a.

Jon Suda writes that

"Mark doesn’t disagree with the classification of problems"

That got me thinking, because actually I do. When I wrote Yes silver bullet I wanted to engage with Brooks' essay on its own terms.

I do think, however, that one should be sceptical of any argument originating from the ancient Greeks. These people believed that all matter was composed of earth, water, air, and fire. They believed in the extramission theory of vision. They practised medicine by trying to balance blood, phlegm, yellow bile, and black bile. Aristotle believed in spontaneous generation. He also believed that the brain was a radiator while the heart was the seat of intelligence.

I think that Aristotle's distinction between essential and accidental complexity is a false dichotomy, at least in the realm of software development.

The problem is the assumption that there's a single, immutable underlying reality to be modelled.

Modelling reality #

Jon Suda touches on this as well:

"Conceptual (or business) problems are rooted in the problem domain (i.e., the real world)[...]"

"Dealing with the "messy real world" is what makes software development hard these days"

I agree that the 'real world' (whatever it is) is often messy, and our ability to deal with the mess is how we earn our keep. It seems to me, though, that there's an underlying assumption that there's a single real world to be modelled.

I think that a more fruitful perspective is to question that assumption. Don’t try to model the real world, it doesn’t exist.

I've mostly worked with business software. Web shops, BLOBAs, and other software to support business endeavours. This is the realm implicitly addressed by Domain-Driven Design and a technique like behaviour-driven development. The assumption is that there's one or more domain experts who know how the business operates, and the job of the software development team is to translate that understanding into working software.

This is often the way it happens. Lord knows that I've been involved in enough fixed-requirements contracts to know that sometimes, all you can do is try to implement the requirements as best you can, regardless of how messy they are. In other words, I agree with Jon Suda that messy problems are a reality.

Where I disagree with Brooks and Aristotle is with the assumption that business processes contain essential complexity. In the old days, before computers, businesses ran according to a set of written and unwritten rules. Routine operations might be fairly consistent, but occasionally, something out of the ordinary would happen. Organisations don't have a script for every eventuality. When the unexpected happens, people wing it.

A business may have a way it sells its goods. It may have a standard price list, and a set of discounts that salespeople are authorised to offer. It may have standard payment terms that customers are expected to comply with. It may even have standard operating procedures for dealing with missing payments.

Then one day, say, the German government expresses interest in placing an order greater than the business has ever handled before. The Germans, however, want a special discount, and special terms of payment. What happens? Do the salespeople refuse because those requests don't comply with the way the organisation does business? Of course not. The case is escalated to people with the authority to change the rules, and a deal is made.

Later, the French government comes by, and a similar situation unfolds, but with the difference that someone else handles the French, and the deal struck is different.

The way these two cases are handled could be internally inconsistent. Decisions are made based on concrete contexts, but with incomplete information and based on gut feelings.

While there may be a system to how an organisation does routine business, there's no uniform reality to be modelled.

You can model standard operating procedures in software, but I think it's a mistake to think that it's a model of reality. It's just an interpretation on how routine business is conducted.

There's no single, unyielding essential complexity, because the essence isn't there.

Shaping reality #

Dan North tells a story of a project where a core business requirement was the ability to print out data. When investigating the issue, it turned out that users needed to have the printout next to their computers so that they could type the data into another system. When asked whether they wouldn't rather prefer to have the data just appear in the other system, they incredulously replied, "You can do that?!"

This turns out to be a common experience. Someone may tell you about an essential requirement, but when you investigate, it turns out to be not at all essential. There may not be any essential complexity.

There's likely to be complexity, but the distinction between essential and accidental complexity seems artificial. While software can model business processes, it can also shape reality. Consider a company like Amazon. The software that supports Amazon wasn't developed after the company was already established. Amazon developed it concurrently with setting up business.

Consider companies like Google, Facebook, Uber, or Airbnb. Such software doesn't model reality; it shapes reality.

In the beginning of the IT revolution, the driving force behind business software development was to automate existing real-world processes, but this is becoming increasingly rare. New companies enter the markets digitally born. Older organisations may be on their second or third system since they went digital. Software not only models 'messy' reality - it shapes 'messy' reality.

Conclusion #

It may look as though I fundamentally disagree with Jon Suda's blog post. I don't. I agree with almost all of it. It did, however, inspire me to put my thoughts into writing.

My position is that I find the situation more nuanced than Fred Brooks suggests by setting off from Aristotle. I don't think that the distinction between essential and accidental complexity is the whole story. Granted, it provides a fruitful and inspiring perspective, but while we find it insightful, we shouldn't accept it as gospel.


Comments

I think I agree with virtually everything said here – if not actually everything 😊

As (virtually, if not actually) always, though, there are a few things I’d like to add, clarify, and elaborate on 😊

I fully share your reluctance to accept ancient Greek philosophy. I once discussed No Silver Bullet (and drank Scotch) with a (non-developer) “philosopher” friend of mine. (Understand: He has a BA in philosophy 😊) He said something to the effect of do you understand this essence/accident theory is considered obsolete? I answered that it was beside the point here – at least as far as I was concerned. For me, the dichotomy serves as an inspiration, a metaphor perhaps, not a rigorous theoretical framework – that’s one of the reasons why I adapted it into the (informal) conceptual/technical distinction.

I also 100% agree with the notion that software shouldn’t merely “model” or “automate” reality. In fact, I now remember that back in university, this was one of the central themes of my “thesis” (or whatever it was). I pondered the difference/boundary between analysis and design activities and concluded that the idea of creating a model of the “real world” during analysis and then building a technical representation of this model as part of design/implementation couldn’t adequately describe many of the more “inventive” software projects and products.

I don’t, however, believe this makes the conceptual/technical (essential/accidental) distinction go away. Even though the point of a project may not (and should not) be a mere automation of a preexisting reality, you still need to know what you want to achieve (i.e., “invent”), conceptually, to be able to fashion a technical implementation of it. And yes, your conceptual model should be based on what’s possible with all available technology – which is why you’ll hopefully end up with something way better than the old solution. (Note that in my post, I never talk about “modeling” the “messy real world” but rather “dealing with it”; even your revolutionary new solution will have to coexist with the messy outside world.)

For me, one of the main lessons of No Silver Bullet, the moral of the story, is this: Developers tend to spend inordinate amounts of time discussing and brooding over geeky technical stuff; perhaps they should, instead, talk to their users a bit more and learn something about the problem domain; that’s where the biggest room for improvement is, in my opinion.

2020-05-20 16:40 UTC

FWIW

https://jonsmusings.com/Transmutation-of-Reality

Let the inspiration feedback loop continue 😊 I agree with more-or-less everything you say; I just don’t necessarily think that your ideas are incompatible with those expressed in No Silver Bullet. That, to a significant extent (but not exclusively), is what my new post is about. As I said in my previous comment, your article reminded me of my university “thesis,” and I felt that the paper I used as its “conceptual framework” was worth presenting in a “non-academic” form. It can mostly be used to support your arguments – I certainly use it that way.

2020-05-29 17:28 UTC

AFK

Monday, 11 May 2020 07:04:00 UTC

Software development productivity tip: regularly step away from the computer.

In these days of covid-19, people are learning that productivity is imperfectly correlated to the amount of time one is physically present in an office. Indeed, the time you spend in front of your computer is hardly correlated to your productivity. After all, programming productivity isn't measured by typing speed. Additionally, you can waste much time at the computer, checking Twitter, watching cat videos on YouTube, etc.

Pomodorobut #

I've worked from home for years, so I thought I'd share some productivity tips. I only report what works for me. If you can use some of the tips, then great. If they're not for you, that's okay too. I don't pretend that I've found any secret sauce or universal truth.

A major problem with productivity is finding the discipline to work. I use Pomodorobut. It's like Scrumbut, but for the Pomodoro technique. For years I thought I was using the Pomodoro technique, but Arialdo Martini makes a compelling case that I'm not. He suggests just calling it timeboxing.

I think it's a little more than that, though, because the way I use Pomodorobut has two equally important components:

  • The 25-minute work period
  • The 5-minute break

The 25 minutes is a sufficiently short period that even if you loathe the task ahead of you, you can summon the energy to do it for 25 minutes. Once you get started, it's usually not that bad.

When you program, you often run into problems:

  • There's a bug that you don't understand.
  • You can't figure out the correct algorithm to implement a particular behaviour.
  • Your code doesn't compile, and you don't understand why.
  • A framework behaves inexplicably.

When you're in front of your computer, you can be stuck at such problems for hours on end.

The solution: take a break.

Breaks give a fresh perspective #

I take my Pomodorobut breaks seriously. My rule is that I must leave my office during the break. I usually go get a glass of water or the like. The point is to get out of the chair and afk (away from keyboard).

While I'm out of the room, it often happens that I get an insight. If I'm stuck on something, I may think of a potential solution, or I may realise that the problem is irrelevant, because of a wider context I forgot about when I sat in front of the computer.

You may have heard about rubber ducking. Ostensibly

"By having to verbalize [...], you may suddenly gain new insight into the problem."

I've tried it enough times: you ask a colleague if he or she has a minute, launch into an explanation of your problem, only to break halfway through and say: "Never mind! I suddenly realised what the problem is. Thank you for your help."

Working from home, I haven't had a colleague I could disturb like that for years, and I don't actually use a rubber duck. In my experience, getting out of my chair works equally well.

The Pomodorobut technique makes me productive because the breaks, and the insights they spark, reduce the time I waste on knotty problems. When I'm in danger of becoming stuck, I often find a way forward in less than 30 minutes: at most 25 minutes being stuck, and a 5-minute break to get unstuck.

Hammock-driven development #

Working from home gives you extra flexibility. I have a regular routine where I go for a run around 10 in the morning. I also routinely go grocery shopping around 14 in the afternoon. Years ago, when I still worked in an office, I'd ride my bicycle to and from work every day. I've had my good ideas during those activities.

In fact, I can't recall ever having had a profound insight in front of the computer. They always come to me when I'm away from it. For instance, I distinctly remember walking around in my apartment doing other things when I realised that the Visitor design pattern is just another encoding of a sum type.

Insights don't come for free. As Rich Hickey points out in his talk about hammock-driven development, you must feed your 'background mind'. That involves deliberate focus on the problem.

Good ideas don't come if you're permanently away from the computer, but neither do they arrive if all you do is stare at the screen. It's the variety that makes you productive.

Conclusion #

Software development productivity is weakly correlated with the time you spend in front of the computer. I find that I'm most productive when I can vary my activities during the day. Do a few 25-minute sessions, rigidly interrupted by breaks. Go for a run. Work a bit more on the computer. Have lunch. Do one or two more time-boxed sessions. Go grocery shopping. Conclude with a final pair of work sessions.


Significant whitespace is DRY

Monday, 04 May 2020 10:02:00 UTC

Languages with explicit scoping encourage you to repeat yourself.

When the talk falls on significant whitespace, the most common reaction I hear seems to be one of nervous apotropaic deflection.

"Indentation matters? Oh, no! I'm horrified."

I've always wondered why people react like that. What's the problem with significant whitespace?

If given a choice, I'd prefer indentation to matter. In that way, I don't have to declare scope more than once.

Explicit scope #

If you had to choose between the following three C# implementations of the FizzBuzz kata, which one would you choose?

a:

class Program
{
static void Main()
{
for (int i = 1; i <= 100; i++)
{
if (i % 15 == 0)
{
Console.WriteLine("FizzBuzz");
}
else if (i % 3 == 0)
{
Console.WriteLine("Fizz");
}
else if (i % 5 == 0)
{
Console.WriteLine("Buzz");
}
else
{
Console.WriteLine(i);
}
}
}
}

b:

class Program
{
    static void Main()
    {
        for (int i = 1; i <= 100; i++)
        {
            if (i % 15 == 0)
            {
                Console.WriteLine("FizzBuzz");
            }
            else if (i % 3 == 0)
            {
                Console.WriteLine("Fizz");
            }
            else if (i % 5 == 0)
            {
                Console.WriteLine("Buzz");
            }
            else
            {
                Console.WriteLine(i);
            }
        }
    }
}

c:

class Program { static void Main() { for (int i = 1; i <= 100; i++) { if (i % 15 == 0) { Console.WriteLine("FizzBuzz"); } else if (i % 3 == 0) { Console.WriteLine("Fizz"); } else if (i % 5 == 0) { Console.WriteLine("Buzz"); } else { Console.WriteLine(i); } } } }

Which of these do you prefer? a, b, or c?

You prefer b. Everyone does.

Yet, those three examples are equivalent. Not only do they behave the same - except for whitespace, they're identical. They produce the same effective abstract syntax tree.

Even though a language like C# has explicit scoping and statement termination with its curly brackets and semicolons, indentation still matters. It doesn't matter to the compiler, but it matters to humans.

When you format code like option b, you express scope twice. Once for the compiler, and once for human readers. You're repeating yourself.

Significant whitespace #

Some languages dispense with the ceremony and let indentation indicate scope. The most prominent is Python, but I've more experience with F# and Haskell. In F#, you could write FizzBuzz like this:

[<EntryPoint>]
let main argv =
    for i in [1..100] do
        if i % 15 = 0 then
            printfn "FizzBuzz"
        else if i % 3 = 0 then
            printfn "Fizz"
        else if i % 5 = 0 then
            printfn "Buzz"
        else
            printfn "%i" i
    0

You don't have to explicitly scope variables or expressions. The scope is automatically indicated by the indentation. You don't repeat yourself. Scope is expressed once, and both compiler and human understand it.

I've programmed in F# for a decade now, and I don't find its use of significant whitespace to be a problem. I'd recommend, however, that you turn on the feature in your IDE of choice that shows whitespace characters.

Summary #

Significant whitespace is a good language feature. You're going to indent your code anyway, so why not let the indentation carry meaning? In that way, you don't repeat yourself.


An F# implementation of the Maître d' kata

Monday, 27 April 2020 14:41:00 UTC

This article walks you through the Maître d' kata done in F#.

In a previous article, I presented the Maître d' kata and promised to publish a walkthrough. Here it is.

Preparation #

I used test-driven development and F# for both unit tests and implementation. As usual, my test framework was xUnit.net (2.4.0) with Unquote (5.0.0) as the assertion library.

I could have done the exercise with a property-based testing framework like FsCheck or Hedgehog, but I chose instead to take my own medicine. In the kata description, I suggested some test cases, so I wanted to try and see if they made sense.

The entire code base is available on GitHub.

Boutique restaurant #

I wrote the first suggested test case this way:

[<Fact>]
let ``Boutique restaurant`` () =
    test <@ canAccept 12 [] { Quantity = 1 } @>

This uses Unquote's test function to verify that a Boolean expression is true. The expression is a function call to canAccept with the capacity 12, no existing reservations, and a reservation with Quantity = 1.

The simplest thing that could possibly work was this:

type Reservation = { Quantity : int }
 
let canAccept _ _ _ = true

The Reservation type was required to make the test compile, but the canAccept function didn't have to consider its arguments. It could simply return true.

Parametrisation #

The next test case made me turn the test function into a parametrised test:

[<Theory>]
[<InlineData( 1,  true)>]
[<InlineData(13, false)>]
let ``Boutique restaurant`` quantity expected =
    expected =! canAccept 12 [] { Quantity = quantity }

So far, the only test parameters were quantity and the expected result. I could no longer use test to verify the result of calling canAccept, since I added variation to the expected result. I changed test into Unquote's =! (must equal) operator.

The simplest passing implementation I could think of was:

let canAccept _ _ { Quantity = q } = q = 1

It ignored the capacity and instead checked whether q is 1. That passed both tests.

Test data API #

Before adding another test case, I decided to refactor my test code a bit. When working with a real domain model, you often have to furnish test data in order to make code compile - even if that data isn't relevant to the test. I wanted to demonstrate how to deal with this issue. My first step was to introduce an 'arbitrary' Reservation value in the spirit of Test data without Builders.

let aReservation = { Quantity = 1 }

This enabled me to rewrite the test:

[<Theory>]
[<InlineData( 1,  true)>]
[<InlineData(13, false)>]
let ``Boutique restaurant`` quantity expected =
    expected =! canAccept 12 [] { aReservation with Quantity = quantity }

This doesn't look like an immediate improvement, but it made it possible to make the Reservation record type more realistic without damage to the test:

type Reservation = {
    Date : DateTime
    Name : string
    Email : string
    Quantity : int }

I added some fields that a real-world reservation would have. The Quantity field will be useful later on, but the Name and Email fields are irrelevant in the context of the kata.

This is the type of API change that often gives people grief. To create a Reservation value, you must supply all four fields. This often adds noise to tests.

Not here, though, because the only concession I had to make was to change aReservation:

let aReservation =
    {
        Date = DateTime (2019, 11, 29, 12, 0, 0)
        Name = ""
        Email = ""
        Quantity = 1
    }

The test code remained unaltered.

With that in place, I could add the third test case:

[<Theory>]
[<InlineData( 1,  true)>]
[<InlineData(13, false)>]
[<InlineData(12,  true)>]
let ``Boutique restaurant`` quantity expected =
    expected =! canAccept 12 [] { aReservation with Quantity = quantity }

The simplest passing implementation I could think of was:

let canAccept _ _ { Quantity = q } = q <> 13

This implementation still ignored the restaurant's capacity and simply checked that q was different from 13. That was enough to pass all three tests.

Refactor test case code #

Adding the next suggested test case proved to be a problem. I wanted to write a single [<Theory>]-driven test function fed by all the Boutique restaurant test data. To do that, I'd have to supply arrays of test input, but unfortunately, that wasn't possible in F#.

Instead I decided to refactor the test case code to use ClassData-driven test cases.

type BoutiqueTestCases () as this =
    inherit TheoryData<int, bool> ()
    do this.Add ( 1,  true)
       this.Add (13, false)
       this.Add (12,  true)
 
[<Theory; ClassData(typeof<BoutiqueTestCases>)>]
let ``Boutique restaurant`` quantity expected =
    expected =! canAccept 12 [] { aReservation with Quantity = quantity }

These are the same test cases as before, but now expressed by a class inheriting from TheoryData<int, bool>. The implementing code remains the same.

Existing reservation #

The next suggested test case includes an existing reservation. To support that, I changed the test case base class to TheoryData<int, int list, int, bool>, and passed empty lists for the first three test cases. For the new, fourth test case, I supplied a number of seats.

type BoutiqueTestCases () as this =
    inherit TheoryData<int, int list, int, bool> ()
    do this.Add (12,  [],  1,  true)
       this.Add (12,  [], 13, false)
       this.Add (12,  [], 12,  true)
       this.Add ( 4, [2],  3, false)
 
[<Theory; ClassData(typeof<BoutiqueTestCases>)>]
let ``Boutique restaurant`` capacity reservatedSeats quantity expected =
    let rs = List.map (fun s -> { aReservation with Quantity = s }) reservatedSeats
    let actual = canAccept capacity rs { aReservation with Quantity = quantity }
    expected =! actual

This also forced me to change the body of the test function. At this stage it could have been prettier, but it got the job done. I soon improved it.

My implementation, as usual, was the simplest thing that could possibly work.

let canAccept _ reservations { Quantity = q } =
    q <> 13 && Seq.isEmpty reservations

Notice that although the fourth test case varied the capacity, I still managed to pass all tests without looking at it.

Accept despite existing reservation #

The next test case introduced another existing reservation, but this time with enough capacity to accept a new reservation.

type BoutiqueTestCases () as this =
    inherit TheoryData<int, int list, int, bool> ()
    do this.Add (12,  [],  1,  true)
       this.Add (12,  [], 13, false)
       this.Add (12,  [], 12,  true)
       this.Add ( 4, [2],  3, false)
       this.Add (10, [2],  3,  true)

The test function remained unchanged.

In the spirit of the Devil's advocate technique, I actively sought to avoid a correct implementation. I came up with this:

let canAccept capacity reservations { Quantity = q } =
    let reservedSeats =
        match Seq.tryHead reservations with
        | Some r -> r.Quantity
        | None -> 0
    reservedSeats + q <= capacity

Since all test cases supplied at most one existing reservation, it was enough to consider the first reservation, if present.

To many people, it may seem strange to actively seek out incorrect implementations like this. An incorrect implementation that passes all tests does, however, demonstrate the need for more tests.

The sum of all reservations #

I then added another test case, this time with three existing reservations:

type BoutiqueTestCases () as this =
    inherit TheoryData<int, int list, int, bool> ()
    do this.Add (12,      [],  1,  true)
       this.Add (12,      [], 13, false)
       this.Add (12,      [], 12,  true)
       this.Add ( 4,     [2],  3, false)
       this.Add (10,     [2],  3,  true)
       this.Add (10, [3;2;3],  3, false)

Again, I left the test function untouched.

On the side of the implementation, I couldn't think of more hoops to jump through, so I finally gave in and provided a 'proper' implementation:

let canAccept capacity reservations { Quantity = q } =
    let reservedSeats = Seq.sumBy (fun r -> r.Quantity) reservations
    reservedSeats + q <= capacity

Not only does it look simpler than before, but I also felt that the implementation was warranted.

“As the tests get more specific, the code gets more generic.”

Although I'd only tested canAccept with lists, I decided to implement it with Seq. This was a decision I later regretted.

Another date #

The last Boutique restaurant test case was to supply an existing reservation on another date. The canAccept function should only consider existing reservations on the date in question.

First, I decided to model the two separate dates as two values:

let d1 = DateTime (2023, 9, 14)
let d2 = DateTime (2023, 9, 15)

I hoped that it would make my test cases more readable, because the dates would have a denser representation.

type BoutiqueTestCases () as this =
    inherit TheoryData<int, (int * DateTime) list, (int * DateTime), bool> ()
    do this.Add (12,                        [], ( 1, d1),  true)
       this.Add (12,                        [], (13, d1), false)
       this.Add (12,                        [], (12, d1),  true)
       this.Add ( 4,                 [(2, d1)], ( 3, d1), false)
       this.Add (10,                 [(2, d1)], ( 3, d1),  true)
       this.Add (10, [(3, d1);(2, d1);(3, d1)], ( 3, d1), false)
       this.Add ( 4,                 [(2, d2)], ( 3, d1),  true)

I changed the representation of a reservation from just an int to a tuple of a number and a date. I also got tired of looking at that noisy unit test, so I introduced a test-specific helper function:

let reserve (q, d) = { aReservation with Quantity = q; Date = d }

Since it takes a tuple of a number and a date, I could use it to simplify the test function:

[<Theory; ClassData(typeof<BoutiqueTestCases>)>]
let ``Boutique restaurant`` (capacity, rs, r, expected) =
    let reservations = List.map reserve rs
    let actual = canAccept capacity reservations (reserve r)
    expected =! actual

The canAccept function now had to filter the reservations on Date:

let canAccept capacity reservations { Quantity = q; Date = d } =
    let relevantReservations = Seq.filter (fun r -> r.Date = d) reservations
    let reservedSeats = Seq.sumBy (fun r -> r.Quantity) relevantReservations
    reservedSeats + q <= capacity

This implementation compared the full DateTime values, though, so while it passed all tests, it'd behave incorrectly if two reservations fell on the same date but their times were as much as a nanosecond apart. That implied that another test case was required.

Same date, different time #

The final test case for the Boutique restaurant, then, was to use two DateTime values on the same date, but with different times.

type BoutiqueTestCases () as this =
    inherit TheoryData<int, (int * DateTime) list, (int * DateTime), bool> ()
    do this.Add (12,                        [], ( 1, d1            ),  true)
       this.Add (12,                        [], (13, d1            ), false)
       this.Add (12,                        [], (12, d1            ),  true)
       this.Add ( 4,                 [(2, d1)], ( 3, d1            ), false)
       this.Add (10,                 [(2, d1)], ( 3, d1            ),  true)
       this.Add (10, [(3, d1);(2, d1);(3, d1)], ( 3, d1            ), false)
       this.Add ( 4,                 [(2, d2)], ( 3, d1            ),  true)
       this.Add ( 4,                 [(2, d1)], ( 3, d1.AddHours 1.), false)

I just added a new test case as a new line and lined up the data. The test function, again, didn't change.

To address the new test case, I generalised the first filter.

let canAccept capacity reservations { Quantity = q; Date = d } =
    let relevantReservations = Seq.filter (fun r -> r.Date.Date = d.Date) reservations
    let reservedSeats = Seq.sumBy (fun r -> r.Quantity) relevantReservations
    reservedSeats + q <= capacity

An expression like r.Date.Date looks a little odd. DateTime values have a Date property that represents their date part. The first Date is the Reservation field, and the second is the date part.

I was now content with the Boutique restaurant implementation.

Haute cuisine #

In the next phase of the kata, I now had to deal with a configuration of more than one table, so I introduced a type:

type Table = { Seats : int }

It's really only a glorified wrapper around an int, but with a real domain model in place, I could make its constructor private and instead afford a smart constructor that only accepts positive integers.
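For illustration, such a smart constructor could look like the sketch below. The kata code doesn't actually contain it, so the module layout and the tryCreate name are hypothetical.

// A hypothetical smart constructor; not part of the kata code base.
type Table = private { Seats : int }
 
module Table =
    // Only positive integers yield a table; everything else gives None.
    let tryCreate seats =
        if 0 < seats
        then Some { Seats = seats }
        else None

With such a definition, calling code would have to go through Table.tryCreate and handle the None case for zero or negative seat counts.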

I changed the canAccept function to take a list of tables, instead of capacity. This also required me to change the existing test function to take a singleton list of tables:

let actual = canAccept [table capacity] reservations (reserve r)

where table is a test-specific helper function:

let table s = { Seats = s }

I also added a new test function and a single test case:

let d3 = DateTime (2024, 6,  7)
 
type HauteCuisineTestCases () as this =
    inherit TheoryData<int list, (int * DateTime) list, (int * DateTime), bool> ()
    do this.Add ([2;2;4;4], [], (4, d3), true)
 
[<Theory; ClassData(typeof<HauteCuisineTestCases>)>]
let ``Haute cuisine`` (tableSeats, rs, r, expected) =
    let tables = List.map table tableSeats
    let reservations = List.map reserve rs
 
    let actual = canAccept tables reservations (reserve r)
 
    expected =! actual

The change to canAccept is modest:

let canAccept tables reservations { Quantity = q; Date = d } =
    let capacity = Seq.sumBy (fun t -> t.Seats) tables
    let relevantReservations =
        Seq.filter (fun r -> r.Date.Date = d.Date) reservations
    let reservedSeats = Seq.sumBy (fun r -> r.Quantity) relevantReservations
    reservedSeats + q <= capacity

It still works by looking at a total capacity as if there was just a single communal table. Now it just calculates capacity from the sequence of tables.

Reject reservation that doesn't fit largest table #

Then I added the next test case to the new test function:

type HauteCuisineTestCases () as this =
    inherit TheoryData<int list, (int * DateTime) list, (int * DateTime), bool> ()
    do this.Add ([2;2;4;4], [], (4, d3),  true)
       this.Add ([2;2;4;4], [], (5, d3), false)

This one attempts to make a reservation for five people. The largest table only fits four people, so this reservation should be rejected. The current implementation just considered the total capacity of all tables, so it accepted the reservation and thereby failed the test.

This change to canAccept passes all tests:

let canAccept tables reservations { Quantity = q; Date = d } =
    let capacity = tables |> Seq.map (fun t -> t.Seats) |> Seq.max
    let relevantReservations =
        Seq.filter (fun r -> r.Date.Date = d.Date) reservations
    let reservedSeats = Seq.sumBy (fun r -> r.Quantity) relevantReservations
    reservedSeats + q <= capacity

The function now only considered the largest table in the restaurant. While it's incorrect to ignore all other tables, all tests passed.

Accept when there's still a remaining table #

Only considering the largest table is obviously wrong, so I added another test case where there's an existing reservation.

type HauteCuisineTestCases () as this =
    inherit TheoryData<int list, (int * DateTime) list, (int * DateTime), bool> ()
    do this.Add ([2;2;4;4],        [], (4, d3),  true)
       this.Add ([2;2;4;4],        [], (5, d3), false)
       this.Add (  [2;2;4], [(2, d3)], (4, d3),  true)

While canAccept should accept the reservation, it didn't when I added the test case. In a variation of the Devil's Advocate technique, I came up with this implementation to pass all tests:

let canAccept tables reservations { Quantity = q; Date = d } =
    let largestTable = tables |> Seq.map (fun t -> t.Seats) |> Seq.max
    let capacity = tables |> Seq.sumBy (fun t -> t.Seats)
    let relevantReservations =
        Seq.filter (fun r -> r.Date.Date = d.Date) reservations
    let reservedSeats = Seq.sumBy (fun r -> r.Quantity) relevantReservations
    q <= largestTable && reservedSeats + q <= capacity

This still wasn't the correct implementation. It represented a return to looking at the total capacity of all tables, with the extra rule that you couldn't make a reservation larger than the largest table. At least one more test case was needed.

Accept when remaining table is available #

I added another test case to the haute cuisine test cases. This one came with one existing reservation for three people, effectively reserving the four-person table. While the remaining tables have an aggregate capacity of four, they're two separate tables. Therefore, a reservation for four people should be rejected.

type HauteCuisineTestCases () as this =
    inherit TheoryData<int list, (int * DateTime) list, (int * DateTime), bool> ()
    do this.Add ([2;2;4;4],        [], (4, d3),  true)
       this.Add ([2;2;4;4],        [], (5, d3), false)
       this.Add (  [2;2;4], [(2, d3)], (4, d3),  true)
       this.Add (  [2;2;4], [(3, d3)], (4, d3), false)

It then dawned on me that I had to explicitly distinguish between a communal table configuration, and individual tables that aren't communal, regardless of size. This triggered quite a refactoring.

I defined a new type to distinguish between these two types of table layout:

type TableConfiguration = Communal of int | Tables of Table list

I also had to change the existing test functions, including the boutique restaurant test

[<Theory; ClassData(typeof<BoutiqueTestCases>)>]
let ``Boutique restaurant`` (capacity, rs, r, expected) =
    let reservations = List.map reserve rs
    let actual = canAccept (Communal capacity) reservations (reserve r)
    expected =! actual

and the haute cuisine test

[<Theory; ClassData(typeof<HauteCuisineTestCases>)>]
let ``Haute cuisine`` (tableSeats, rs, r, expected) =
    let tables = List.map table tableSeats
    let reservations = List.map reserve rs
 
    let actual = canAccept (Tables tables) reservations (reserve r)
 
    expected =! actual

In both cases I had to change the call to canAccept to pass either a Communal or a Tables value.

Delete first #

I'd previously done the kata in Haskell and was able to solve this phase of the kata using the built-in deleteFirstsBy function. This function doesn't exist in the F# core library, so I decided to add it. I created a new module named Seq and first defined a function that deletes the first element that satisfies a predicate:

// ('a -> bool) -> seq<'a> -> seq<'a>
let deleteFirstBy pred (xs : _ seq) = seq {
    let mutable found = false
    use e = xs.GetEnumerator ()
    while e.MoveNext () do
        if found
        then yield e.Current
        else if pred e.Current
            then found <- true
            else yield e.Current
    }

It moves over a sequence of elements and looks for an element that satisfies pred. If such an element is found, it's omitted from the output sequence. The function only deletes the first occurrence from the sequence, so any other elements that satisfy the predicate are still included.

This function corresponds to Haskell's deleteBy function and can be used to implement deleteFirstsBy:

// ('a -> 'b -> bool) -> seq<'b> -> seq<'a> -> seq<'b>
let deleteFirstsBy pred = Seq.fold (fun xs x -> deleteFirstBy (pred x) xs)

As the Haskell documentation explains, the "deleteFirstsBy function takes a predicate and two lists and returns the first list with the first occurrence of each element of the second list removed." My F# function does the same, but works on sequences instead of linked lists.
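To get a sense of how these two functions behave, here's what a hypothetical F# Interactive session could look like, using plain integers and equality as the predicate (this isn't part of the kata code):

> Seq.deleteFirstBy (fun i -> i % 2 = 0) [1;2;3;4];;
val it : seq<int> = seq [1; 3; 4]

> Seq.deleteFirstsBy (=) [1;2;3;2;1] [2;1];;
val it : seq<int> = seq [3; 2; 1]

The first call deletes only the first even number. The second call removes one 2 and one 1 from the first list, but leaves the remaining occurrences alone.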

I could use it to find and remove tables that were already reserved.

Find remaining tables #

I first defined a little helper function to determine whether a table can accommodate a reservation:

// Reservation -> Table -> bool
let private fits r t = r.Quantity <= t.Seats

The rule is simply that the table's number of Seats must be greater than or equal to the reservation's Quantity. I could use this function as the predicate for Seq.deleteFirstsBy:

let canAccept config reservations ({ Quantity = q; Date = d } as r) =
    let contemporaneousReservations =
        Seq.filter (fun r -> r.Date.Date = d.Date) reservations
    match config with
    | Communal capacity ->
        let reservedSeats =
            Seq.sumBy (fun r -> r.Quantity) contemporaneousReservations
        reservedSeats + q <= capacity
    | Tables tables ->
        let rs = Seq.sort contemporaneousReservations
        let remainingTables = Seq.deleteFirstsBy fits (Seq.sort tables) rs
        Seq.exists (fits r) remainingTables

The canAccept function now branched on Communal versus Tables configurations. In the Communal configuration, it simply compared the reservedSeats and reservation quantity to the communal table's capacity.

In the Tables case, the function used Seq.deleteFirstsBy fits to remove all the tables that are already reserved. The result is the remainingTables. If there exists a remaining table that fits the reservation, then the function accepts the reservation.

This seemed to me an appropriate implementation of the haute cuisine phase of the kata.

Second seatings #

Now it was time to take seating duration into account. While I could have written my test cases directly against the TimeSpan API, I didn't want to write TimeSpan.FromHours 2.5, TimeSpan.FromDays 1., and so on. I found that it made my test cases harder to read, so I added some literal extensions:

type Int32 with
    member x.hours = TimeSpan.FromHours (float x)
    member x.days = TimeSpan.FromDays (float x)

This enabled me to write expressions like 1 .days and 2 .hours, as shown in the first test case:

let d4 = DateTime (2023, 10, 22, 18, 0, 0)
 
type SecondSeatingsTestCases () as this =
    inherit TheoryData<TimeSpan, int list, (int * DateTime) list, (int * DateTime), bool> ()
    do this.Add (2 .hours, [2;2;4], [(4, d4)], (3, d4.Add (2 .hours)), true)

I used this initial parametrised test case for a new test function:

[<Theory; ClassData(typeof<SecondSeatingsTestCases>)>]
let ``Second seatings`` (dur, tableSeats, rs, r, expected) =
    let tables = List.map table tableSeats
    let reservations = List.map reserve rs
 
    let actual = canAccept dur (Tables tables) reservations (reserve r)
 
    expected =! actual

My motivation for this test case was mostly to introduce an API change to canAccept. I didn't want to rock the boat too much, so I picked a test case that wouldn't trigger a big change to the implementation. I prefer incremental changes. The only change is the introduction of the seatingDur argument:

let canAccept (seatingDur : TimeSpan) config reservations ({ Date = d } as r) =
    let contemporaneousReservations =
        Seq.filter (fun x -> x.Date.Subtract seatingDur < d.Date) reservations
    match config with
    | Communal capacity ->
        let reservedSeats =
            Seq.sumBy (fun r -> r.Quantity) contemporaneousReservations
        reservedSeats + r.Quantity <= capacity
    | Tables tables ->
        let rs = Seq.sort contemporaneousReservations
        let remainingTables = Seq.deleteFirstsBy fits (Seq.sort tables) rs
        Seq.exists (fits r) remainingTables

While the function already considered seatingDur, the way it filtered reservations wasn't entirely correct. It passed all tests, though.

Filter reservations based on seating duration #

The next test case I added made me write what I consider the right implementation, but I subsequently decided to add two more test cases just for confidence. Here's all of them:

type SecondSeatingsTestCases () as this =
    inherit TheoryData<TimeSpan, int list, (int * DateTime) list, (int * DateTime), bool> ()
    do this.Add (
        2 .hours,
        [2;2;4],
        [(4, d4)],
        (3, d4.Add (2 .hours)),
        true)
       this.Add (
        2.5 .hours,
        [2;4;4],
        [(2, d4);(1, d4.AddMinutes 15.);(2, d4.Subtract (15 .minutes))],
        (3, d4.AddHours 2.),
        false)
       this.Add (
        2.5 .hours,
        [2;4;4],
        [(2, d4);(2, d4.Subtract (15 .minutes))],
        (3, d4.AddHours 2.),
        true)
       this.Add (
        2.5 .hours,
        [2;4;4],
        [(2, d4);(1, d4.AddMinutes 15.);(2, d4.Subtract (15 .minutes))],
        (3, d4.AddHours 2.25),
        true)

The new test cases use some more literal extensions:

type Int32 with
    member x.minutes = TimeSpan.FromMinutes (float x)
    member x.hours = TimeSpan.FromHours (float x)
    member x.days = TimeSpan.FromDays (float x)
 
type Double with
    member x.hours = TimeSpan.FromHours x

I added a private isContemporaneous function to the code base and used it to filter the reservations to pass the tests:

let private isContemporaneous
    (seatingDur : TimeSpan)
    (candidate : Reservation)
    (existing : Reservation) =
    let aSeatingBefore = candidate.Date.Subtract seatingDur
    let aSeatingAfter = candidate.Date.Add seatingDur
    aSeatingBefore < existing.Date && existing.Date < aSeatingAfter
 
let canAccept (seatingDur : TimeSpan) config reservations r =
    let contemporaneousReservations =
        Seq.filter (isContemporaneous seatingDur r) reservations
    match config with
    | Communal capacity ->
        let reservedSeats =
            Seq.sumBy (fun r -> r.Quantity) contemporaneousReservations
        reservedSeats + r.Quantity <= capacity
    | Tables tables ->
        let rs = Seq.sort contemporaneousReservations
        let remainingTables = Seq.deleteFirstsBy fits (Seq.sort tables) rs
        Seq.exists (fits r) remainingTables

I could have left the functionality of isContemporaneous inside of canAccept, but I found it just hard enough to get my head around that I preferred to put it in a named helper function. Checking that a value is in a range is in itself trivial, but for some reason, figuring out the limits of the range didn't come naturally to me.

This version of canAccept only considered existing reservations if they in any way overlapped with the reservation in question. It passed all tests. It also seemed to me to be a satisfactory implementation of the second seatings scenario.

Alternative table configurations #

This stage of the kata introduces groups of tables that can be reserved individually, or combined. To support that, I changed the definition of Table:

type Table = Discrete of int | Group of int list

A Table is now either a Discrete table that can't be combined, or a Group of tables that can either be reserved individually, or combined.

I had to change the test-specific table function to behave like before.

let table s = Discrete s

Before this change to the Table type, all tables were implicitly Discrete tables.

This enabled me to add the first test case:

type AlternativeTableConfigurationTestCases () as this =
    inherit TheoryData<Table list, int list, int, bool> ()
    do this.Add (
        [Discrete 4; Discrete 1; Discrete 2; Group [2;2;2]],
        [3;1;2],
        2,
        true)
 
[<Theory; ClassData(typeof<AlternativeTableConfigurationTestCases>)>]
let ``Alternative table configurations`` (tables, rs, r, expected) =
    let res i = reserve (i, d4)
    let reservations = List.map res rs
 
    let actual = canAccept (1 .days) (Tables tables) reservations (res r)
 
    expected =! actual

Like when I introduced the seatingDur argument, I deliberately chose a test case that didn't rock the boat too much. The only other change I had to make to pass all tests was to adjust the fits function:

// Reservation -> Table -> bool
let private fits r = function
    | Discrete seats -> r.Quantity <= seats
    | Group _ -> true

It's clearly not correct to return true for any Group, but it passed all tests.

Accept based on sum of table group #

I wanted to edge a little closer to correctly handling the Group case, so I added a test case:

type AlternativeTableConfigurationTestCases () as this =
    inherit TheoryData<Table list, int list, int, bool> ()
    do this.Add (
        [Discrete 4; Discrete 1; Discrete 2; Group [2;2;2]],
        [3;1;2],
        2,
        true)
       this.Add (
        [Discrete 4; Discrete 1; Discrete 2; Group [2;2;2]],
        [3;1;2],
        7,
        false)

A restaurant with this table configuration can't accept a reservation for seven people, but because fits returned true for any Group, canAccept would return true. Since the test expected the result to be false, this caused the test to fail.

Edging closer to correct behaviour, I adjusted fits again:

// Reservation -> Table -> bool
let private fits r = function
    | Discrete seats -> r.Quantity <= seats
    | Group tables -> r.Quantity <= List.sum tables

This was still not correct, because it removed an entire group of tables when fits returned true, but it passed all tests so far.

Accept reservation by combining two tables #

I added another failing test:

type AlternativeTableConfigurationTestCases () as this =
    inherit TheoryData<Table list, int list, int, bool> ()
    do this.Add (
        [Discrete 4; Discrete 1; Discrete 2; Group [2;2;2]],
        [3;1;2],
        2,
        true)
       this.Add (
        [Discrete 4; Discrete 1; Discrete 2; Group [2;2;2]],
        [3;1;2],
        7,
        false)
       this.Add (
        [Discrete 4; Discrete 1; Discrete 2; Group [2;2;2]],
        [3;1;2;1],
        4,
        true)

The last test case failed because the existing reservations should only have reserved one of the tables in the group, but because of the way fits worked, the entire group was deleted by Seq.deleteFirstsBy fits. This made canAccept reject the four-person reservation.

To be honest, this step was difficult for me. I should probably have found out how to make a smaller step.

I wanted a function that would compare a Reservation to a Table but, unlike fits, return None if it decided to 'use' the table, or a Some value if it decided that it didn't need to use the entire table. This would enable me to pick only some of the tables from a Group, but still return a Some value with the rest of the tables.

I couldn't figure out an elegant way to do this with the existing Seq functionality, so I started to play around with something more specific. The implementation came accidentally as I was struggling to come up with something more general. As I was experimenting, all of a sudden, all tests passed!

// Reservation -> Table -> Table option
let private allot r = function
    | Discrete seats ->
        if r.Quantity <= seats
        then None
        else Some (Discrete seats)
    | Group tables -> Some (Group tables)
 
// seq<Table> -> Reservation -> seq<Table>
let private allocate (tables : Table seq) r = seq {
    let mutable found = false
    use e = tables.GetEnumerator ()
    while e.MoveNext () do
        if found
        then yield e.Current
        else
            match allot r e.Current with
            | None ->
                found <- true
            | Some t -> yield t
    }
 
let canAccept (seatingDur : TimeSpan) config reservations r =
    let contemporaneousReservations =
        Seq.filter (isContemporaneous seatingDur r) reservations
    match config with
    | Communal capacity ->
        let reservedSeats =
            Seq.sumBy (fun r -> r.Quantity) contemporaneousReservations
        reservedSeats + r.Quantity <= capacity
    | Tables tables ->
        let rs = Seq.sort contemporaneousReservations
        let remainingTables = Seq.fold allocate (Seq.sort tables) rs
        Seq.exists (fits r) remainingTables

I wasn't too happy with the implementation, which I found (and still find) too complicated. This was, however, the first time I've done this part of the kata (in any language), so I wasn't sure where this was going.

The allocate function finds and allocates one of its input tables to a reservation. It does that by not yielding the first table it finds that can accommodate the reservation. Don't hurt your head too much with the code in this version, because there are plenty of cases that it handles incorrectly. It's full of bugs. Still, it passed all tests.

Reject when group has been reduced #

The implementation was wrong because the allot function would just keep returning a Group without consuming it. This would imply that canAccept would use it more than once, which was wrong, so I added a test case:

type AlternativeTableConfigurationTestCases () as this =
    inherit TheoryData<Table list, int list, int, bool> ()
    do this.Add (
        [Discrete 4; Discrete 1; Discrete 2; Group [2;2;2]],
        [3;1;2],
        2,
        true)
       this.Add (
        [Discrete 4; Discrete 1; Discrete 2; Group [2;2;2]],
        [3;1;2],
        7,
        false)
       this.Add (
        [Discrete 4; Discrete 1; Discrete 2; Group [2;2;2]],
        [3;1;2;1],
        4,
        true)
       this.Add (
        [Discrete 4; Discrete 1; Discrete 2; Group [2;2;2]],
        [3;1;2;1;4],
        3,
        false)

Given the existing reservations, this restaurant is effectively sold out that day. All the Discrete tables are reserved, and the last two reservations for one and four effectively consume the Group. The latest test case expected canAccept to return false, but it returned true. Since I was following test-driven development, I expected that.

Consume #

I needed a function that would consume from a Group of tables and return the remaining tables from that group; that is, the tables not consumed. I've already discussed this function in a different context.

// int -> seq<int> -> seq<int>
let consume quantity =
    let go (acc, xs) x =
        if quantity <= acc
        then (acc, Seq.append xs (Seq.singleton x))
        else (acc + x, xs)
    Seq.fold go (0, Seq.empty) >> snd

I put this function in my own Seq module. It consumes values from the left until the sum is greater than or equal to the desired quantity. It then returns the rest of the sequence:

> consume 1 [1;2;3];;
val it : seq<int> = seq [2; 3]

> consume 2 [1;2;3];;
val it : seq<int> = seq [3]

> consume 3 [1;2;3];;
val it : seq<int> = seq [3]

> consume 4 [1;2;3];;
val it : seq<int> = seq []

The first example consumes only the leading 1, while the second and third examples consume both 1 and 2, because the sum of those values is 3, and the requested quantities are 2 and 3, respectively. The fourth example consumes all elements, because the requested quantity is 4, and you need 1, 2, and 3 before the sum is large enough. You have to pick strictly from the left, so you can't decide to just take the elements 1 and 3.

Consuming tables from a group #

I could now use my Seq.consume function to improve allot:

// Reservation -> Table -> Table option
let private allot r = function
    | Discrete seats ->
        if r.Quantity <= seats
        then None
        else Some (Discrete seats)
    | Group tables ->
        tables |> Seq.consume r.Quantity |> Seq.toList |> Group |> Some

It handles the Group case by consuming the reservation Quantity and then returning a Some Group with the remaining tables.

It also turned out that sorting the reservations and the tables wasn't appropriate, mainly because, with Table now being a discriminated union, it's not entirely clear how to sort a list of tables in a meaningful way. My final implementation of canAccept was this:

// TimeSpan -> TableConfiguration -> seq<Reservation> -> Reservation -> bool
let canAccept seatingDur config reservations r =
    let contemporaneousReservations =
        Seq.filter (isContemporaneous seatingDur r) reservations
    match config with
    | Communal capacity ->
        let reservedSeats =
            Seq.sumBy (fun r -> r.Quantity) contemporaneousReservations
        reservedSeats + r.Quantity <= capacity
    | Tables tables ->
        let remainingTables =
            Seq.fold allocate (Seq.ofList tables) contemporaneousReservations
        Seq.exists (fits r) remainingTables

Nothing much has changed - only that neither reservations nor tables are now sorted. It passes all tests.

Retrospective #

I must admit that I ran out of steam towards the end. It's possible that there are some edge cases I didn't think of. I'd probably feel more confident of the final implementation if I'd used property-based testing instead of Example-Guided Development.

I also took some unfortunate turns along the way. Early in the kata, I could easily implement canAccept with functionality from the Seq module. This meant that the function could take a seq<Reservation> as an input argument. I'm always inclined to follow Postel's law and be liberal with what I accept. I thought that being able to accept any seq<Reservation> was a good design decision. It might have been if I'd been publishing a reusable library, but it made things more awkward.

I'm also not sure that I chose to model the table layouts in the best way. For example, I currently can't handle a scenario with both bar seating and individual tables. I think I should have made Communal a case of Table. This would have enabled me to model layouts with several communal tables combined with discrete tables, and even groups of tables.

In general, my solution seems too complicated, but I don't see an easy fix. Often, if I work some more with the problem, insight arrives. It usually arrives when you least need it, so I thought it better to let the problem rest for a while. I can always return to it when I feel that I have a fresh approach.

Summary #

This article walks you through my first F# attempt at the Maître d' kata. The repository is available on GitHub.

I'm not entirely happy with how it turned out, but I don't consider it an utter failure either.


Comments

Ghillie Dhu #

You could leverage library functions and avoid mutability like so:

// ('a -> bool) -> seq<'a> -> seq<'a>
let deleteFirstBy pred (xs : _ seq) =
    match Seq.tryFindIndex pred xs with
    | None -> xs
    | Some n -> Seq.concat (Seq.take (n-1) xs) (Seq.skip n xs)

2020-04-28 19:02 UTC

Ghillie, thank you for writing. It looks like you're on the right track, but I think you have to write the function like this:

let deleteFirstBy pred (xs : _ seq) =
    match Seq.tryFindIndex pred xs with
    | None -> xs
    | Some n -> Seq.append (Seq.take n xs) (Seq.skip (n+1) xs)

2020-04-28 19:21 UTC

Unit bias against collections

Monday, 20 April 2020 06:27:00 UTC

How do you get the value out of a collection? Mu. Which value?

The other day I was looking for documentation on how to write automated tests with a self-hosted ASP.NET Core 3 web application. I've done this numerous times with previous versions of the framework, but ASP.NET Core 3 is new, so I wanted to learn how I'm supposed to do it this year. I found official documentation that helped me figure it out.

One of the code examples in that article displays a motif that I keep encountering. It displays behaviour close enough to unit bias that I consider it reasonable to use that label. Unit bias is the cognitive tendency to consider a unit of something the preferred amount. Our brains don't like fractions, and they don't like multiples either.

Unit bias in action #

The sample code in the ASP.NET Core documentation differs in the type of dependency it looks for, but is otherwise identical to this:

var descriptor = services.SingleOrDefault(
    d => d.ServiceType ==
        typeof(IReservationsRepository));
 
if (descriptor != null)
{
    services.Remove(descriptor);
}

The goal is to enable an automated integration test to run against a Fake database instead of an actual relational database. My production Composition Root registers an implementation of IReservationsRepository that communicates with an actual database. In an automated integration test, I'd like to unregister the existing dependency and replace it with a Fake. Here's the code in context:

public class RestaurantApiFactory : WebApplicationFactory<Startup>
{
    protected override void ConfigureWebHost(IWebHostBuilder builder)
    {
        if (builder is null)
            throw new ArgumentNullException(nameof(builder));
 
        builder.ConfigureServices(services =>
        {
            var descriptor = services.SingleOrDefault(
                d => d.ServiceType ==
                    typeof(IReservationsRepository));
 
            if (descriptor != null)
            {
                services.Remove(descriptor);
            }
 
            services.AddSingleton<IReservationsRepository>(
                new FakeDatabase());
        });
    }
}

It works as intended, so what's the problem?

How do I get the value out of my collection? #

The problem is that it's fragile. What happens if there's more than one registration of IReservationsRepository?

This happens:

System.InvalidOperationException : Sequence contains more than one matching element

This is a completely avoidable error, stemming from unit bias.

A large proportion of programmers I meet seem to be fundamentally uncomfortable with thinking in multitudes. They subconsciously prefer thinking in procedures and algorithms that work on a single object. The programmer who wrote the above call to SingleOrDefault exhibits behaviour putting him or her in that category.

This is nothing but a specific instantiation of a more general programmer bias: How do I get the value out of the monad?

As usual, the answer is mu. You don't. The question borders on the nonsensical. How do I get the value out of my collection? Which value? The first? The last? Some arbitrary value at an unknown position? Which value do you want if the collection is empty? Which value do you want if there's more than one that fits a predicate?

If you can answer such questions, you can get 'the' value out of a collection, but often, you can't. In the current example, the code doesn't handle multiple IReservationsRepository registrations.

It easily could, though.

Inject the behaviour into the collection #

The best answer to the question of how to get the value out of the monad (in this case, the collection) is that you don't. Instead, you inject the desired behaviour into it.

In this case, the desired behaviour is to remove a descriptor. The monad in question is the collection of services. What does that mean in practice?

A first attempt might be something like this:

var descriptors = services
    .Where(d => d.ServiceType == typeof(IReservationsRepository));
foreach (var descriptor in descriptors)
    services.Remove(descriptor);

Unfortunately, this doesn't quite work:

System.InvalidOperationException : Collection was modified; enumeration operation may not execute.

This happens because descriptors is a lazily evaluated Iterator over services, and you're not allowed to remove elements from a collection while you enumerate it. It could lead to bugs if you could.

That problem is easily solved. Just copy the selected descriptors to an array or list:

var descriptors = services
    .Where(d => d.ServiceType == typeof(IReservationsRepository))
    .ToList();
foreach (var descriptor in descriptors)
    services.Remove(descriptor);

This achieves the desired outcome regardless of the number of matches to the predicate. This is a more robust solution, and it requires the same amount of code.

You can stop there, since the code now works, but if you truly want to inject the behaviour into the collection, you're not quite done yet.

But you're close. All you have to do is this:

services
    .Where(d => d.ServiceType == typeof(IReservationsRepository))
    .ToList()
    .ForEach(d => services.Remove(d));

Notice how this statement never produces an output. Instead, you 'inject' the call to services.Remove into the list, using the ForEach method, which then mutates the services collection.

Whether you prefer the version that uses the foreach keyword or the version that uses List<T>.ForEach doesn't matter. What matters is that you don't use the partial SingleOrDefault function.

Conclusion #

It's a common code smell when programmers try to extract a single value from a collection. Sometimes it's appropriate, but there are several edge cases you should be prepared to address. What should happen if the collection is empty? What should happen if the collection contains many elements? What should happen if the collection is infinite? (I didn't address that in this article.)

You can often both simplify your code and make it more robust by staying 'in' the collection, so to speak. Let the desired behaviour apply to all appropriate elements of the collection.

Don't be biased against collections.


Comments

Julius H #

I concur that often the first element of a collection is picked without thinking. Anecdotally, I experienced a system that would not work if set up freshly because in some places there was no consideration for empty collections. (The testing environment always contained some data)

Yet I would reverse the last change (towards .ForEach). For one, because (to my biased eye) it looks side effect free but isn't. And then it doesn't add value compared to a foreach loop; also, both solutions are needlessly inefficient. If you want to avoid the foreach, go for the RemoveAll() method (also present on List<T>):

services.RemoveAll<IReservationsRepository>();

2020-04-29 8:52 UTC

Julius, thank you for writing. Yes, I agree that in C# it's more idiomatic to use foreach.

How would using RemoveAll work? Isn't that going to remove the entries from the List instead of from services?

2020-04-29 10:49 UTC
Julius H #

Hi Mark,
As you're calling IServiceCollection.RemoveAll(), it will remove it from the collection. I tried it, and to me it seems to be working. (As long as you are not copying the services into a list beforehand)

But to your main point, I remember when I wrote .Single() once, and years later I got a bug report because of it. I see two approaches there: fail as fast and hard as possible, or check just as much as needed for the moment. Considering TDD in the former approach, one would need to write a lot of test code for scenarios that should never happen, just to verify that the exceptions happen. Still, in the latter approach, a subtle bug could stay in the system for quite some time... What do you prefer?

2020-05-01 18:07 UTC

Julius, thank you for writing. It turns out that RemoveAll is a pair of extension methods on IServiceCollection. One has to import Microsoft.Extensions.DependencyInjection.Extensions with a using directive before one can use them. I didn't know about these methods, but I agree that they seem to do their job. Thank you for the tip.

As for your question, my preference is for robustness. In my experience, there's rarely such a thing as a scenario that never happens. If the code allows something to happen, it'll likely happen sooner or later. Code changes, so even if you've analysed that some combination of input isn't possible today, a colleague may change the situation tomorrow. It pays to write that extra unit test or two to make sure that encapsulation isn't broken.

This is also one of the reasons I'm fond of property-based testing. You automatically get coverage of all sorts of boundary conditions you'd normally not think about testing.

2020-05-02 9:12 UTC

Curb code rot with thresholds

Monday, 13 April 2020 08:43:00 UTC

Code bases deteriorate unless you actively prevent it. Institute some limits that encourage developers to clean up.

From time to time I manage to draw the ire of people, with articles such as The 80/24 rule or Put cyclomatic complexity to good use. I can understand why. These articles suggest specific constraints to which people should consider consenting. Don't write code wider than 80 characters. Don't write code with a cyclomatic complexity higher than 7.

It makes people uncomfortable.

Sophistication #

I hope that regular readers understand that I'm a more sophisticated thinker than some of my texts may suggest. I deliberately simplify my points.

I do this to make the text more readable. I also aspire to present sufficient arguments, and enough context, that a charitable reader will understand that everything I write should be taken as food for thought rather than gospel.

Consider a sentence like the above: I deliberately simplify my points. That sentence, in itself, is an example of deliberate simplification. In reality, I don't always simplify my points. Perhaps sometimes I simplify, but it's not deliberate. I could have written: I often deliberately simplify some of my points. Notice the extra hedge words. Imagine an entire text written like that. It would be less readable.

I could hedge my words when I write articles, but I don't. I believe that a text that states its points as clearly as possible is easier to understand for any careful reader. I also believe that hedging my language will not prevent casual readers from misunderstanding what I had in mind.

Archetypes #

Why do I suggest hard limits on line width, cyclomatic complexity, and so on?

In light of the above, realise that the limits I offer are suggestions. A number like 80 characters isn't a hard limit. It's a representation of an idea; a token. The same is true for the magic number seven, plus or minus two. That too, represents an idea - the idea that human short-term memory is limited, and that this impacts our ability to read and understand code.

The number seven serves as an archetype. It's a proxy for a more complex idea. It's a simplification that, hopefully, makes it easier to follow the plot.

Each method should have a maximum cyclomatic complexity of seven. That's easier to understand than each method should have a maximum cyclomatic complexity small enough that it fits within the cognitive limits of the human brain's short-term memory.

I've noticed that a subset of the developer population is quite literal-minded. If I declare: don't write code wider than 80 characters, they're happy if they agree, and infuriated if they don't.

If you've been paying attention, you now understand that this isn't about the number 80, or 24, or 7. It's about instituting useful quantitative guidance. The actual number is less important.

I have reasons to prefer those specific values. I've already motivated them in previous articles. I'm not, though, obdurately attached to those particular numbers. I'd rather work with a team that has agreed to a 120-character maximum width than with a team that follows no policy.

How code rots #

No-one deliberately decides to write legacy code. Code bases gradually deteriorate.

Here's another deliberate simplification: code gradually becomes more complicated because each change seems small, and no-one pays attention to the overall quality. It doesn't happen overnight, but one day you realise that you've developed a legacy code base. When that happens, it's too late to do anything about it.

Line chart showing increasing complexity as time passes.

At the beginning, a method has low complexity, but as you fix defects and add features, the complexity increases. If you don't pay attention to cyclomatic complexity, you pass 7 without noticing it. You pass 10 without noticing it. You pass 15 and 20 without noticing it.

One day you discover that you have a problem - not because you finally look at a metric, but because the code has now become so complicated that everyone notices. Alas, now it's too late to do anything about it.

Code rot sets in a little at a time; it works like boiling the proverbial frog.

Thresholds #

Agreeing on a threshold can help curb code rot. Institute a rule and monitor a metric. For example, you could agree to keep an eye on cyclomatic complexity. If it exceeds 7, you reject the change.

Line chart showing how complexity is curbed by a threshold of 7.

Such rules work because they can be used to counteract gradual decay. It's not the specific value 7 that contributes to better code quality; it's the automatic activation of a rule based on a threshold. If you decide that the threshold should be 10 instead, that'll also make a difference.

Notice that the above diagram suggests that exceeding the threshold is still possible. Rules are in the way if you must rigidly obey them. Situations arise where breaking a rule is the best response. Once you've responded to the situation, however, find a way to bring the offending code back in line. Once a threshold is exceeded, you don't get any further warnings, and there's a risk that that particular code will gradually decay.

What you measure is what you get #

You could automate the process. Imagine running cyclomatic complexity analysis as part of a Continuous Integration build and rejecting changes that exceed a threshold. This is, in a way, a deliberate attempt to hack the management effect where you get what you measure. With emphasis on a metric like cyclomatic complexity, developers will pay attention to it.
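A gate like that could be as simple as a small script that parses the output of whatever analysis tool you use and fails the build when the threshold is exceeded. Here's a minimal F# sketch; the metrics.csv file name and its member,complexity line format are assumptions made up for the example, not the output of any particular tool.

open System.IO
 
[<EntryPoint>]
let main _ =
    let threshold = 7
    // Each line in the hypothetical report reads "MemberName,complexity".
    let offenders =
        File.ReadLines "metrics.csv"
        |> Seq.map (fun line -> line.Split [| ',' |])
        |> Seq.filter (fun fields -> threshold < int fields.[1])
        |> Seq.toList
    for fields in offenders do
        printfn "%s has cyclomatic complexity %s" fields.[0] fields.[1]
    // A non-zero exit code makes the Continuous Integration build fail.
    if List.isEmpty offenders then 0 else 1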

Be aware, however, of Goodhart's law and the law of unintended consequences. Just as code coverage is a useless target measure, you have to be careful when you institute hard rules.

I've had success with introducing threshold rules because they increase awareness. It can help a technical leader shift emphasis to the qualities that he or she wishes to improve. Once the team's mindset has changed, the rule itself becomes redundant.

I'm reminded of the Dreyfus model of skill acquisition. Rules make great training wheels. Once you become proficient, the rules are no longer required. They may even be in your way. When that happens, get rid of them.

Conclusion #

Code deteriorates gradually, when you aren't looking. Instituting rules that make you pay attention can combat code rot. Using thresholds to activate your attention can be an effective countermeasure. The specific value of the threshold is less important.

In this article, I've mostly used cyclomatic complexity as an example of a metric where a threshold could be useful. Another example is line width; don't exceed 80 characters. Or line height: methods shouldn't exceed 24 lines of code. Those are examples. If you agree that keeping an eye on a metric would be useful, but you disagree with the threshold I suggest, pick a value that suits you better.

It's not the specific threshold value that improves your code; paying attention does.


Comments

In F#, it's not uncommon to have inner functions (local functions defined inside other functions). How would you calculate the cyclomatic complexity of a function that contains inner functions?

To be specific, I'm actually wondering about how to count the number of activated objects in a function, which you talk about in your book, Code That Fits in Your Head. I have been wanting to ask you this for some time, but haven't been able to find a good article to comment on. I think this is the closest I can get.

In terms of activated objects: Would you count all activated objects in all sub-functions as counting towards the top-level function? Or would you count the inner functions separately, and have calls to the inner function contribute only "one point" to the top-level functions? I think the latter makes most sense, but I'm not sure. I base my reasoning on the fact that an inner function, being a closure, is similar to a class with fields and a single method (closures are a poor man's objects and vice versa). Another way to view it is that you could refactor by extracting the function and adding any necessary parameters.

PS, this is not just theoretical. I am toying with a linter for F# and want "number of activated objects" as one of the rules.

2023-04-20 14:00 UTC

Christer, thank you for writing. For the purposes of calculating cyclomatic complexity of inner functions, aren't they equivalent to (private) helper methods?

If so, they don't count towards the cyclomatic complexity of the containing function.

As for the other question, I don't count functions as activated objects, but I do count the values they return. When the function is referentially transparent, however, they're equal. I do talk more about this in my talk Fractal Architecture, of which you can find several recordings on the internet; here's one.

The book also discusses this, and that part is also freely available here on the blog.

The short answer is that it's essential that you can prevent 'inner' objects from leaking out from method calls. Per definition, functions that are referentially transparent do have that property. For methods that aren't referentially transparent, encapsulation may still achieve the same effect, if done right. Usually, however, it isn't.

2023-04-21 16:26 UTC

Mark, thank you for the response. What you say about cyclomatic complexity makes sense, especially given your previous writings on the subject. I am still a bit fuzzy on how to count activated objects, though.

If a function returns a tuple of which one item is ignored, would that ignored object count as an activated object? (Does the fact that a tuple could technically be considered as a single object with .Item1 and .Item2 properties change the answer?) And what about piping? Eta reduction?

An example would be handy right about now, so what would you say are the activated object counts of the following functionally identical functions, and why?

  1. let f (a, b) = let c, _ = getCD (a, b) in c
  2. let f (a, b) = (a, b) |> getCD |> fst
  3. let f = getCD >> fst
  4. let f = getCDFst

Note the "trick" of number 3: By removing the explicit parameter, it is now impossible to tell just from looking at the code how many tupled items f accepts, if any. And in number 4, even the knowledge that an intermediate function in the composition returns a tuple of 2 values is removed.

Additionally: I assume that returning a value counts as "activating" an object? So let f x = x has 1 activated object? What about the functionally identical let f = id? Would that be 1 or 0?

I guess what I'm after is a more fully developed concept of "the number of activated objects" of a function/method, to the point where a useful linter rule could be implemented based on it; something similar to your previous writings on how method calls do not increase the cyclomatic complexity, which was a very useful clarification that I have seen you repeat several times. I have given the matter some thought myself, but as you can see, I haven't been able to come up with a good answer.

2023-04-22 06:03 UTC

It seems that I should have been more explicit about the terminology related to the adjective activated. I was inspired by the notion of object activation described in Thinking, Fast and Slow. The idea is that certain pieces of information move to the forefront in the mind, and that these 'objects' impact decision-making. Kahneman labels such information as activated when it impacts the decision process.

The heuristic I had in mind was to port that idea to an (informal) readability analysis. Given that human short-term memory is quite limited, I find it useful to gauge the mental load of a given piece of code.

The point, then, is not to count all objects or values in scope, but rather those that are required to understand what a piece of code does. For example, if you look at an instance method on a class, the class could have four fields, but if only one of those fields is used in the method, only that one is activated, even though the other three are also in scope.
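A minimal sketch (the class and its members are made up for the example):

// Four values are in scope in this class, but reading Describe only requires
// keeping 'name' in mind; id, address, and phone aren't activated.
type Customer (id : int, name : string, address : string, phone : string) =
    member this.Describe () = sprintf "Customer: %s" name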

With that in mind, let's try to look at your four examples.

  1. This example activates a, b, and c: 3 objects.
  2. This example activates a and b: 2 objects.
  3. No objects, unless you now want to count getCD as an object.
  4. Again, probably no objects.

Note that I've employed qualifying words. The point of the analysis is to highlight objects that might stress our short-term memory. It's not an exact science, and I never intended it to be. Rather, I see it as a possible springboard for having a discussion about relative readability of code. A team can use the heuristic to compare alternatives.

With your examples in mind, you'd be likely to run into programmers who find the first two examples more readable than the third. They're certainly more 'detailed', so, in a sense, it's easier to understand what's going on. That works as long as you only have a few values in play, but cognitively, it doesn't scale.

I do tend to prefer eta reductions and point-free notation exactly because they tend to reduce the number of activated objects, but these techniques certainly also raise the abstraction level. On the other hand, once someone understands something like function composition (>>) or point-free notation, they can now leverage long-term memory for that, instead of having to rely on limited short-term memory in order to understand a piece of code. By moving more information to long-term memory, we can reduce the load on short-term memory, thereby freeing it up for other information.

Perhaps that's a bit of a digression, but I always intended the notion of object activation to be a heuristic rather than an algorithm.

2023-04-23 20:01 UTC

Mark, thank you for the excellent clarification. It gave me one of those "a-ha" moments that accompany a sudden jump in understanding. In hindsight, of course this is about the cognitive load of a piece of code, and of course that will be different for different people, based, for example, on which abstractions they are used to.

In terms of the code examples, I think we both agree that let f = a >> b requires less mental load than let f = a >> b >> c >> d >> e. In other words, I would argue that functions in a composition do contribute to cognitive load. This may, however, also depend on the actual functions that are composed.

In any case, I am now less certain than before that a simple linter rule (i.e., an algorithm) can capture cognitive load in a way that is generally useful. I will have to think about this some more.

2023-04-24 13:47 UTC
