The Test Data Generator functor

Monday, 18 September 2017 07:55:00 UTC

A Test Data Generator modelled as a functor.

In a previous article series, you learned that while it's possible to model Test Data Builders as a functor, it adds little value. You shouldn't, however, dismiss the value of functors. It's an abstraction that applies broadly.

Closely related to Test Data Builders is the concept of a generator of random test data. You could call it a Test Data Generator instead. Such a generator can be modelled as a functor.

A C# Generator #

At its core, the idea behind a Test Data Generator is to create random test data. Still, you'd like to be able to control various parts of the process, because you often need to pin parts of the generated data to deterministic values, while allowing other parts to vary randomly.

In C#, you can write a generic Generator like this:

public class Generator<T>
{
    private readonly Func<Random, T> generate;
 
    public Generator(Func<Random, T> generate)
    {
        if (generate == null)
            throw new ArgumentNullException(nameof(generate));
 
        this.generate = generate;
    }
 
    public Generator<T1> Select<T1>(Func<T, T1> f)
    {
        if (f == null)
            throw new ArgumentNullException(nameof(f));
 
        Func<Random, T1> newGenerator = r => f(this.generate(r));
        return new Generator<T1>(newGenerator);
    }
 
    public T Generate(Random random)
    {
        if (random == null)
            throw new ArgumentNullException(nameof(random));
 
        return this.generate(random);
    }
}

The Generate method takes a Random object as input, and produces a value of the generic type T as output. This enables you to deterministically reproduce a particular randomly generated value, if you know the seed of the Random object.

Notice how Generator<T> is a simple Adapter over a (lazily evaluated) function. This function also takes a Random object as input, and produces a T value as output. (For the FP enthusiasts, this is simply the Reader functor in disguise.)

The Select method makes Generator<T> a functor. It takes a map function f as input, and uses it to define a new generate function. The return value is a Generator<T1>.
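
As a hedged illustration of what being a functor means here, the following sketch checks the two functor laws against a trimmed-down copy of Generator<T>. It relies on two Random objects created with the same seed producing the same sequence. The Digit generator, the FunctorLaws class, and the choice of mapping functions are assumptions made for the example; they aren't part of the code above.

```csharp
using System;

// Trimmed-down copy of the Generator<T> class shown above (null checks omitted).
public class Generator<T>
{
    private readonly Func<Random, T> generate;

    public Generator(Func<Random, T> generate)
    {
        this.generate = generate;
    }

    public Generator<T1> Select<T1>(Func<T, T1> f)
    {
        return new Generator<T1>(r => f(this.generate(r)));
    }

    public T Generate(Random random)
    {
        return this.generate(random);
    }
}

public static class FunctorLaws
{
    // Hypothetical example generator: a single digit between 0 and 9.
    public static readonly Generator<int> Digit =
        new Generator<int>(r => r.Next(10));

    // Identity law: mapping with the identity function changes nothing.
    public static bool IdentityHolds(int seed)
    {
        var expected = Digit.Generate(new Random(seed));
        var actual = Digit.Select(x => x).Generate(new Random(seed));
        return expected == actual;
    }

    // Composition law: mapping twice equals mapping once with the composition.
    public static bool CompositionHolds(int seed)
    {
        Func<int, int> f = x => x + 1;
        Func<int, string> g = x => x.ToString();

        var twice = Digit.Select(f).Select(g).Generate(new Random(seed));
        var once = Digit.Select(x => g(f(x))).Generate(new Random(seed));
        return twice == once;
    }
}
```

Checking a handful of seeds is no proof, but it gives a sense of why Select preserves the structure of the generator: it only transforms the generated value and never touches the Random object.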

General-purpose building blocks #

Functors are immanently composable. You can compose complex Test Data Generators from simpler building blocks, like the following.

For instance, you may need a generator of alphanumeric strings. You can write it like this:

private const string alphaNumericCharacters =
    "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
 
public static Generator<string> AlphaNumericString =
    new Generator<string>(r =>
    {
        var length = r.Next(25); // Arbitrarily chosen max length
        var chars = new char[length];
        for (int i = 0; i < length; i++)
        {
            var idx = r.Next(alphaNumericCharacters.Length);
            chars[i] = alphaNumericCharacters[idx];
        }
        return new string(chars);
    });

This Generator<string> can generate a random string with alphanumeric characters. It randomly picks a length between 0 and 24, and fills it with randomly selected alphanumeric characters. The maximum length of 24 is arbitrarily chosen. The generated string may be empty.

Notice that the argument passed to the constructor is a function. It's not evaluated at initialisation, but only if Generate is called.

The r argument is the Random object passed to Generate.

Another useful general-purpose building block is a generator that can use a single-object generator to create many objects:

public static Generator<IEnumerable<T>> Many<T>(Generator<T> generator)
{
    return new Generator<IEnumerable<T>>(r =>
    {
        var length = r.Next(25); // Arbitrarily chosen max length
        var elements = new List<T>();
        for (int i = 0; i < length; i++)
            elements.Add(generator.Generate(r));
        return elements;
    });
}

This method takes a Generator<T> as input, and uses it to generate zero or more T objects. Again, the maximum length of 24 is arbitrarily chosen. It could have been a method argument, but in order to keep the example simple, I hard-coded it.
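
If you'd rather not hard-code the length, a variation could take it as a method argument. The following is only a sketch of that idea; the exclusiveMaxLength parameter name and the GenSketch class are assumptions, and Generator<T> is repeated in trimmed form only to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

// Trimmed-down copy of the Generator<T> class shown above (null checks omitted).
public class Generator<T>
{
    private readonly Func<Random, T> generate;

    public Generator(Func<Random, T> generate)
    {
        this.generate = generate;
    }

    public T Generate(Random random)
    {
        return this.generate(random);
    }
}

public static class GenSketch
{
    // Hypothetical overload of Many: the caller chooses the exclusive upper
    // bound on the number of elements instead of the hard-coded 25.
    public static Generator<IEnumerable<T>> Many<T>(
        Generator<T> generator,
        int exclusiveMaxLength)
    {
        if (generator == null)
            throw new ArgumentNullException(nameof(generator));
        if (exclusiveMaxLength < 0)
            throw new ArgumentOutOfRangeException(nameof(exclusiveMaxLength));

        return new Generator<IEnumerable<T>>(r =>
        {
            var length = r.Next(exclusiveMaxLength);
            var elements = new List<T>();
            for (int i = 0; i < length; i++)
                elements.Add(generator.Generate(r));
            return elements;
        });
    }
}
```

With this overload, GenSketch.Many(generator, 5) would produce between zero and four elements.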

Domain-specific generators #

From such general-purpose building blocks, you can define custom generators for your domain model. This enables you to use such generators in your unit tests.

In order to generate post codes, you can combine the AlphaNumericString and the Many generators:

public static Generator<PostCode> PostCode =
    new Generator<PostCode>(r => 
    {
        var postCodes = Many(AlphaNumericString).Generate(r);
        return new PostCode(postCodes.ToArray());
    });

The PostCode class is part of your domain model; it takes an array of strings as input to its constructor. The PostCode generator uses the AlphaNumericString generator as input to the Many method. This generates zero or many alphanumeric strings, which you can pass to the PostCode constructor.

This, in turn, gives you all the building blocks you need to generate Address objects:

public static Generator<Address> Address =
    new Generator<Address>(r =>
    {
        var street = AlphaNumericString.Generate(r);
        var city = AlphaNumericString.Generate(r);
        var postCode = PostCode.Generate(r);
        return new Address(street, city, postCode);
    });

This Generator<Address> uses the AlphaNumericString generator to generate street and city strings. It uses the PostCode generator to generate a PostCode object. All these objects are passed to the Address constructor.

Keep in mind that all of this logic is defined in lazily evaluated functions. Only when you invoke the Generate method on a generator does the code execute.
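
To make the lazy evaluation visible, here's a small sketch that counts how often the underlying function runs. The LazinessSketch class and its counter are assumptions introduced for the illustration; Generator<T> is again a trimmed copy of the class shown earlier.

```csharp
using System;

// Trimmed-down copy of the Generator<T> class shown above (null checks omitted).
public class Generator<T>
{
    private readonly Func<Random, T> generate;

    public Generator(Func<Random, T> generate)
    {
        this.generate = generate;
    }

    public Generator<T1> Select<T1>(Func<T, T1> f)
    {
        return new Generator<T1>(r => f(this.generate(r)));
    }

    public T Generate(Random random)
    {
        return this.generate(random);
    }
}

public static class LazinessSketch
{
    // Returns the invocation count before and after calling Generate twice.
    public static int[] CountInvocations()
    {
        var invocations = 0;
        var counting = new Generator<int>(r =>
        {
            invocations++;
            return r.Next(100);
        });

        // Defining and mapping the generator runs none of the generation code.
        var mapped = counting.Select(x => x * 2);
        var beforeGenerate = invocations;

        // Only Generate executes the underlying function, once per call.
        mapped.Generate(new Random(1));
        mapped.Generate(new Random(2));

        return new[] { beforeGenerate, invocations };
    }
}
```

The first number stays at zero because neither the constructor nor Select invokes the function; each Generate call then increments the counter once.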

Generating values #

You can now write tests similar to the tests shown in the article series about Test Data Builders. If, for example, you need an address in Paris, you can generate it like this:

var rnd = new Random();
var address = Gen.Address.Select(a => a.WithCity("Paris")).Generate(rnd);

Gen.Address is the Address generator shown above; I put all those generators in a static class called Gen. If you don't modify it, Gen.Address will generate a random Address object, but by using Select, you can pin the city to Paris.

You can also start with one type of generator and use Select to map to another type of generator, like this:

var rnd = new Random();
var address = Gen.PostCode
    .Select(pc => new Address("Rue Morgue", "Paris", pc))
    .Generate(rnd);

You use Gen.PostCode as the initial generator, and then Select a new Address in Rue Morgue, Paris, with a randomly generated post code.

Functor #

Such a Test Data Generator is a functor. One way to see that is to use query syntax instead of the fluent API:

var rnd = new Random();
var address =
    (from a in Gen.Address select a.WithCity("Paris")).Generate(rnd);

Likewise, you can also translate the Rue Morgue generator to query syntax:

var address = (
    from pc in Gen.PostCode
    select new Address("Rue Morgue", "Paris", pc)).Generate(rnd);

This is, however, awkward, because you have to enclose the query expression in brackets in order to be able to invoke the Generate method. Alternatively, you can separate the query from the generation, like this:

var g = from a in Gen.Address select a.WithCity("Paris");
var rnd = new Random();
var address = g.Generate(rnd);

Or this:

var g =
    from pc in Gen.PostCode
    select new Address("Rue Morgue", "Paris", pc);
var rnd = new Random();
var address = g.Generate(rnd);

You'd probably still prefer the fluent API over this syntax. The reason I show this alternative is to demonstrate that the functor gives you the ability to separate the definition of data generation from the actual generation. In order to emphasise this point, I defined the g variables before creating the Random object rnd.

Property-based testing #

The above Generator<T> is only a crude example of a Test Data Generator. In order to demonstrate how such a generator is a functor, I left out several useful features. Still, this should have given you a sense for how the Generator<T> class itself, as well as such general-purpose building blocks as Many and AlphaNumericString, could be packaged in a reusable library.

The examples above show how to use a generator to create a single random object. You could, however, easily generate many (say, 100) random objects, and run unit tests for each object created. This is the idea behind property-based testing.
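
As a rough sketch of that idea, and not of how any particular library implements it, you could loop over generated values and evaluate a predicate for each. The HoldsForAll helper and the PropertySketch class are hypothetical names; AlphaNumericString is a copy of the generator shown earlier.

```csharp
using System;

// Trimmed-down copy of the Generator<T> class shown above (null checks omitted).
public class Generator<T>
{
    private readonly Func<Random, T> generate;

    public Generator(Func<Random, T> generate)
    {
        this.generate = generate;
    }

    public T Generate(Random random)
    {
        return this.generate(random);
    }
}

public static class PropertySketch
{
    private const string alphaNumericCharacters =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Copy of the AlphaNumericString generator shown earlier.
    public static readonly Generator<string> AlphaNumericString =
        new Generator<string>(r =>
        {
            var length = r.Next(25);
            var chars = new char[length];
            for (int i = 0; i < length; i++)
                chars[i] = alphaNumericCharacters[
                    r.Next(alphaNumericCharacters.Length)];
            return new string(chars);
        });

    // Hypothetical helper: generates 'count' values and reports whether the
    // predicate held for all of them. It does no shrinking and doesn't
    // report the counter-example.
    public static bool HoldsForAll<T>(
        Generator<T> generator,
        Func<T, bool> property,
        Random random,
        int count = 100)
    {
        for (int i = 0; i < count; i++)
        {
            var value = generator.Generate(random);
            if (!property(value))
                return false;
        }
        return true;
    }
}
```

For example, PropertySketch.HoldsForAll(PropertySketch.AlphaNumericString, s => s.Length < 25, new Random()) exercises the length invariant of the string generator against 100 random strings.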

There's more to property-based testing than generation of random values, but the implementations I've seen are all based on Test Data Generators as functors (and monads).

FsCheck #

FsCheck is an open source F# library for property-based testing. It defines a Gen functor (and monad) that you can use to generate Address values, just like the above examples:

let! address = Gen.address |> Gen.map (fun a -> { a with City = "Paris"} )

Here, Gen.address is a Gen<Address> value. By itself, it'll generate random Address values, but by using Gen.map, you can pin the city to Paris.

The map function corresponds to the C# Select method. In functional programming, map is the most common name, although Haskell calls the function fmap; the Select name is, in fact, the odd man out.

Likewise, you can map from one generator type to another:

let! address =
    Gen.postCode
    |> Gen.map (fun pc ->
        { Street = "Rue Morgue"; City = "Paris"; PostCode = pc })

This example uses Gen.postCode as the initial generator. This is, as the name implies, a Gen<PostCode> value. For every random PostCode value generated, map turns it into an address in Rue Morgue, Paris.

There's more going on here than I'd like to cover in this article. The use of let! syntax actually requires Gen<'a> to be a monad (which it is), but that's a topic for another day. Both of these examples are contained in a computation expression, and the implication of that is that the address values represent a multitude of randomly generated Address values.

Hedgehog #

Hedgehog is another open source F# library for property-based testing. With Hedgehog, the Address code examples look like this:

let! address = Gen.address |> Gen.map (fun a -> { a with City = "Paris"} )

And:

let! address =
    Gen.postCode
    |> Gen.map (fun pc ->
        { Street = "Rue Morgue"; City = "Paris"; PostCode = pc })

Did you notice something?

This is literally the same syntax as FsCheck! This isn't because Hedgehog is copying FsCheck, but because both are based on the same underlying abstraction: functor (and monad). There are other parts of the API where Hedgehog differs from FsCheck, but their generators are similar.

This is one of the most important advantages of using well-known abstractions like functors. Once you understand such an abstraction, it's easy to learn a new library. With professional experience with FsCheck, it only took me a few minutes to figure out how to use Hedgehog.

Summary #

Functors are well-defined objects from category theory. It may seem abstract, and far removed from 'real' programming, but it's extraordinarily useful. Many category theory abstractions can be applied to a host of different situations. Once you've learned what a functor is, you'll find it easy to learn to use new libraries that build on that abstraction.

In this article you saw a sketch of how the functor abstraction can be used to model Test Data Generators. Contrary to Test Data Builders, which turned out to be a redundant abstraction, a Test Data Generator is truly useful.

Many years ago, I had the idea to create a Test Data Generator for unit testing purposes. I called it AutoFixture, and although it's had some success, the API isn't as clean as it could be. Back then, I didn't know about functors, so I had to invent an API for AutoFixture. This API is proprietary to AutoFixture, so anyone learning AutoFixture must learn this particular API, and its abstractions. It would have been so much easier for all involved if I had designed AutoFixture as a functor instead.


Comments

I'm curious as to what the "useful features" are that you left out of the Test Data Generator?

2017-10-23 16:36 UTC

Stuart, thank you for writing. Test Data Generators like the one described here are rich data structures that you can do a lot of interesting things with. As described here, the generator only generates a single value every time you invoke its Generate method. What property-based testing libraries like QuickCheck, FsCheck, and Hedgehog do is that instead of a single random value, they generate many values (the default number seems to be 100).

These property-based testing libraries tend to then 'elevate' their generators into another type of data structure called Arbitraries, and these again into Properties. What typically happens is that they use the Generators to generate values, but for each generated value, they evaluate the associated Property. If all Properties succeed, nothing more happens, but in the case of a test failure, no more values are generated. Instead, the libraries switch to a state where they attempt to shrink the counter-example to a simpler counter-example. It uses a Shrinker associated with the Arbitrary to do this. The end result is that if your test doesn't hold, you'll get an easy-to-understand example of the input that caused the test to fail.

Apart from that, there are many other features of Test Data Generators that I left out. Some of these include ways to combine several Generators to a single Generator. It turns out that Test Data Generators are also Applicative Functors and Monads, and you can use these traits to define powerful combinators. In the future, I'll publish more articles on this topic, but it'll take months, because my article queue has quite a few other articles in front of those.
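
To give a rough idea of what combining generators could look like in C#, the sketch below adds a SelectMany method to a trimmed copy of Generator<T>. That's the method the C# compiler looks for when a query expression has more than one from clause. This is an assumption about how one might do it, not code from the article; the Digit and DigitPair generators and the CombineSketch class are invented for the example.

```csharp
using System;

// Trimmed-down copy of Generator<T> (null checks omitted), extended with
// a hypothetical SelectMany method.
public class Generator<T>
{
    private readonly Func<Random, T> generate;

    public Generator(Func<Random, T> generate)
    {
        this.generate = generate;
    }

    public Generator<T1> Select<T1>(Func<T, T1> f)
    {
        return new Generator<T1>(r => f(this.generate(r)));
    }

    // Monadic bind in the shape that C# query syntax expects.
    public Generator<TResult> SelectMany<U, TResult>(
        Func<T, Generator<U>> k,
        Func<T, U, TResult> s)
    {
        return new Generator<TResult>(r =>
        {
            var x = this.generate(r);   // first value
            var y = k(x).Generate(r);   // second value, same Random object
            return s(x, y);
        });
    }

    public T Generate(Random random)
    {
        return this.generate(random);
    }
}

public static class CombineSketch
{
    // Hypothetical example generator: a single digit between 0 and 9.
    public static readonly Generator<int> Digit =
        new Generator<int>(r => r.Next(10));

    // Two from clauses compile to a call to SelectMany.
    public static readonly Generator<Tuple<int, int>> DigitPair =
        from x in Digit
        from y in Digit
        select Tuple.Create(x, y);
}
```

Both digits are drawn from the same Random object, one after the other, so a given seed still reproduces the same pair.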

If you want to explore this topic, I'd recommend playing with FsCheck. While it's written in F#, it also works from C#, and its documentation includes C# examples as well. Hedgehog may also work from C#, but being a newer, more experimental library, its documentation is still sparse.

2017-10-24 7:53 UTC

"Hedgehog may also work from C#"

That's right. Hedgehog may be used from C# as well.

2018-11-13 09:53 UTC

Test data without Builders

Monday, 11 September 2017 07:28:00 UTC

We don't need no steenkin' Test Data Builders!

This is the fifth and final in a series of articles about the relationship between the Test Data Builder design pattern, and the identity functor. In the previous article, you learned why a Builder functor adds little value. In this article, you'll see what to do instead.

From Identity to naked values #

While you can define Test Data Builders with Haskell's Identity functor, it adds little value:

Identity address = fmap (\a -> a { city = "Paris" }) addressBuilder

That's nothing but an overly complicated way to create a data value from another data value. You can simplify the code from the previous article. First, instead of calling them 'Builders', we should be honest and name them as the default values they are:

defaultPostCode :: PostCode
defaultPostCode = PostCode []
 
defaultAddress :: Address
defaultAddress  = Address { street = "", city = "", postCode = defaultPostCode }

defaultPostCode is nothing but an empty PostCode value, and defaultAddress is an Address value with empty constituent values. Notice that defaultAddress uses defaultPostCode for the postCode value.

If you need a value in Paris, you can simply write it like this:

address = defaultAddress { city = "Paris" }

Likewise, if you need a more specific address, but you don't care about the post code, you can write it like this:

address' =
  Address { street = "Rue Morgue", city = "Paris", postCode = defaultPostCode }

Notice how much simpler this is. There's no need to call fmap in order to pull the 'underlying value' out of the functor, transform it, and put it back in the functor. Haskell's 'copy and update' syntax gives you this ability for free. It's built into the language.

Building F# values #

Haskell isn't the only language with 'copy and update' syntax. F# has it as well, and in fact, it's from the F# documentation that I've taken the 'copy and update' term.

The code corresponding to the above Haskell code looks like this in F#:

let defaultPostCode = PostCode []
let defaultAddress = { Street = ""; City = ""; PostCode = defaultPostCode }
 
let address = { defaultAddress with City = "Paris" }
let address' =
    { Street = "Rue Morgue"; City = "Paris"; PostCode = defaultPostCode }

The syntax is a little different, but the concepts are the same. F# adds the keyword with to 'copy and update' expressions, which translates easily back to C# fluent interfaces.

Building C# objects #

In a previous article, you saw how to refactor your domain model to a model of Value Objects with fluent interfaces.

In your unit tests, you can define natural default values for testing purposes:

public static class Natural
{
    public static PostCode PostCode = new PostCode();
    public static Address Address = new Address("", "", PostCode);
    public static InvoiceLine InvoiceLine =
        new InvoiceLine("", PoundsShillingsPence.Zero);
    public static Recipient Recipient = new Recipient("", Address);
    public static Invoice Invoice = new Invoice(Recipient, new InvoiceLine[0]);
}

This static Natural class is a test-specific container of 'good' default values. Notice how, once more, the Address value uses the PostCode value to fill in the PostCode property of the default Address value.

With these default test values, and the fluent interface of your domain model, you can easily build a test address in Paris:

var address = Natural.Address.WithCity("Paris");

Because Natural.Address is an Address object, you can use its WithCity method to build a test address in Paris, and where all other constituent values remain the default values.

Likewise, you can create an address on Rue Morgue, but with a default post code:

var address = new Address("Rue Morgue", "Paris", Natural.PostCode);

Here, you can simply create a new Address object, but with Natural.PostCode as the post code value.

Conclusion #

Using a fluent domain model obviates the need for Test Data Builders. There's a tendency among functional programmers to overbearingly state that design patterns are nothing but recipes to overcome deficiencies in particular programming languages or paradigms. If you believe such a claim, at least it ought to go both ways, but at the conclusion of this article series, I hope I've been able to demonstrate that this is true for the Test Data Builder pattern. You only need it for 'classic', mutable, object-oriented domain models.

  1. For mutable object models, use Test Data Builders.
  2. Consider, however, modelling your domain with Value Objects and 'copy and update' instance methods.
  3. Even better, consider using a programming language with built-in 'copy and update' expressions.

If you're stuck with a language like C# or Java, you don't get language-level support for 'copy and update' expressions. This means that you'll still need to incur the cost of adding and maintaining all those With[...] methods:

public class Invoice
{
    public Recipient Recipient { get; }
    public IReadOnlyCollection<InvoiceLine> Lines { get; }
 
    public Invoice(
        Recipient recipient,
        IReadOnlyCollection<InvoiceLine> lines)
    {
        if (recipient == null)
            throw new ArgumentNullException(nameof(recipient));
        if (lines == null)
            throw new ArgumentNullException(nameof(lines));
 
        this.Recipient = recipient;
        this.Lines = lines;
    }
 
    public Invoice WithRecipient(Recipient newRecipient)
    {
        return new Invoice(newRecipient, this.Lines);
    }
 
    public Invoice WithLines(IReadOnlyCollection<InvoiceLine> newLines)
    {
        return new Invoice(this.Recipient, newLines);
    }
 
    public override bool Equals(object obj)
    {
        var other = obj as Invoice;
        if (other == null)
            return base.Equals(obj);
 
        return object.Equals(this.Recipient, other.Recipient)
            && Enumerable.SequenceEqual(
                this.Lines.OrderBy(l => l.Name),
                other.Lines.OrderBy(l => l.Name));
    }
 
    public override int GetHashCode()
    {
        return
            this.Recipient.GetHashCode() ^
            this.Lines.GetHashCode();
    }
}

That may seem like quite a maintenance burden (and it is), but consider that it has the same degree of complexity and overhead as defining a Test Data Builder for each domain object. At least, by putting this extra code in your domain model, you make all of that API (all the With[...] methods, and the structural equality) available to other production code. In my experience, that's a better return on investment than isolating such useful features only to test code.

Still, once you've tried using a language like F# or Haskell, where 'copy and update' expressions come with the language, you realise how much redundant code you're writing in C# or Java. The Test Data Builder design pattern truly is a recipe that addresses deficiencies in particular languages.

Next: The Test Data Generator functor.


Comments

Hi Mark, thanks for the whole series. I personally tend to split my class into two: the 'core' feature and a syntactic sugar one.
Leveraging extension methods to implement the 'With' API is relatively straightforward, and you get both a developer-friendly API and a good separation of concerns, namely definition and usage.
If you choose to implement the extensions in another assembly, you can manage who has access to them: unit tests only, another assembly, or the whole project.
You can split the API according to context/user too. It can also be useful to enforce some guidelines.

2017-09-12 09:20 UTC

Hi Mark, what do you think about using Roslyn to generate builders? Using helpers like this CodeGeneration.Roslyn you can generate all the (With*) methods at compile time, so there is no IL injection magic.
I have some ugly POC code in my branch Roslyn builder generator - it is only a starting point, but I think it has some potential.

2019-03-19 18:18 UTC

Dominik, thank you for writing. I admit that I haven't given this much thought, but it strikes me as one of those 'interesting problems' that programmers are keen to solve. It looks to me like a bit of a red herring, as I tend to be sceptical of schemes to generate code. What problem does it address? That one has to type? That's rarely the bottleneck in software development.

Granted, it gets tedious to manually add all those With[...] methods, but there's a lot of things about C# that's tedious. There's a reason I prefer F# instead.

2019-03-20 7:16 UTC

Thanks for responding - I think that for each comment you now have 1+ blog post to respond with ;). Despite the fact that I should consider learning a new language like F# to open my mind, I will focus on the C# aspect.

I understand your reservations about code generation, but I think that when we repeat some actions over and over, we automatically think about automation - this is the origin of computers, I think. Currently I'm working on a project where we use the Test Builder Pattern heavily, and every time I think about writing another builder my motivation decreases, because psychologically it's not interesting anymore and I would be happy to give that to someone else or to a machine.

When I started to understand what Roslyn is and what it can do, it just opened my eyes to new opportunities. Generating some simple but frequently repeated code gives me more time to focus on real domain problems and keeps my frustration level low :)

Of course this is not a BIG problem solver, but only a new approach to simplifying daily tasks - another advantage is that Roslyn creates a normal C# code file that can be navigated from code and seen in the debugger (in contrast to IL injectors), so there are no magical black boxes. The disadvantage is that generating code currently isn't very simple - it involves some external NuGet packages, and I feel that writing a generator in Roslyn could be simplified.

ps. Commenting via pull request is interesting experience - feels like pro ;)

2019-03-20 08:48 UTC

Dominik, while it isn't based on Roslyn, are you aware of AutoFixture?

2019-03-21 7:01 UTC

Yes, I discovered this tool together with your blog ;) I think it is good enough - the Roslyn approach is only an alternative that isn't based on reflection or IL injection.

I will try to use AutoFixture in my next project, so I will see if it survives my requirements.

2019-03-26 21:59 UTC

If I understand correctly, one of your claims is that a fluent C# syntax for expressing change (i.e. "with" methods for an immutable value object) is equivalent to F#'s copy and update syntax for records in the sense that any code written with one can be written with the other. I agree with that. Then you pointed out some advantages with the F# syntax. Among the advantages of F#'s syntax is that there is less code to write in the first place and less code to maintain.

I see an advantage with C#'s syntax. Suppose the only constructor of the value object is internal but all its properties and "with" methods are public. Then adding a new (public) property and corresponding (public) "with" method is not a breaking change. As far as I know, this is not possible with F#.* Either the record constructor is public or it is not public. If the record's constructor is public, then the copy and update syntax is also public but adding a property to the record is a breaking change. Otherwise, the record's constructor is not public, so the copy and update syntax is not available.

I have an extremely short list of advantages of C# over F#, and this is one of them.

*It is possible to put an access modifier immediately after the equals sign when defining a record. However, the documentation for record syntax is missing this information. When I try to put an access modifier before a field identifier, I get a compiler error that says

FS0575 Accessibility modifiers are not permitted on record fields. Use 'type R = internal ...' or 'type R = private ...' to give an accessibility to the whole representation.

P.S. For those that want to write functionally in C#, I recommend using Language Ext. In particular, a somewhat recently added feature is auto-generated "with" methods.

2019-09-21 01:08 UTC

Tyson, thank you for writing. Let's get the uncontroversial part of this discussion out of the way first: F# record types compile to IL that's equivalent to what a properly-written C# Value Object compiles to. At the IL level, there's no difference.

At the language level, it's true that F# records is a specialised syntax that enables you to succinctly define static types to model data. It's not a general-purpose syntax, so there's definitely things it doesn't allow you to express. F# has normal class syntax for those needs.

That record types aren't refactoring-safe is a known issue. This is true not only for F# records, but for Haskell data types as well. In Haskell public APIs, you sometimes see that combination that you describe. The type has a private constructor, but the library then provides functions to manipulate it (essentially copy-and-update functions). You sometimes see that in F# as well, but here a class would often have been a better choice. Haskell doesn't have object-oriented classes, so it has to resort to that sort of hack to keep APIs backwards compatible.

When you write a public API in F#, choosing between a record and a class as a data carrier is an important choice. When APIs are published (e.g. on nuget.org), you'll have little success with your library if you regularly introduce breaking changes.

For internal use, the story is different. You can use F# records to express domain models with a few lines of code. If you later find out that you have to change the model, then you do that, and fix the ensuing compilation errors.

Public APIs represent more work, regardless of the language in which they're written. Yes, you need to carefully and deliberately design a public library's API and data structures. I don't think, however, that that should detract us from using productive language features for application-specific use.

2019-09-21 18:28 UTC

Let's get the uncontroversial part of this discussion out of the way first...
I am right with you. Your entire comment was uncontroversial to me :)

When you write a public API in F#, choosing between a record and a class as a data carrier is an important choice. ... here a class would often have been a better choice.
(I quoted you out of order there. I hope this doesn't misrepresent what you were saying. I don't think it does.) I am really interested to learn more about that.

I found the series that includes this blog post when I searched on Google for "builder pattern F#". This series is primarily about the test data builder design pattern. As I understand it, I would describe this pattern as a special case of the (general case) builder design pattern in which all arguments have reasonable defaults.

Have you ever written a builder that accepted multiple arguments one at a time, none of which have reasonable defaults? Have you ever blogged about this (more general) builder design pattern?

As a good student of yours ;) I wonder if the builder design pattern corresponds to some universal abstraction. Among the fluent interfaces that I am most impressed with are configuration in Entity Framework and Fluent Assertions. Of course I could try to make my own fluent interface by copying them, and that would probably work out reasonably well. At the same time, I would like to learn from you and your frustration (if that description is accurate) that you expressed (at the end of the next and last post in this series) with the API of your AutoFixture project failing to use a potential universal abstraction (namely functors).

2019-09-22 01:50 UTC

Tyson, thank you for bringing the Builder pattern to my attention. I haven't written much about it yet, but I believe that it'd be a perfect fit for my article series on how certain design patterns relate to universal abstractions. When I get some time, I'll have to write one or more articles about that topic.

In short, though, I think that the Builder pattern as described in Design Patterns is isomorphic to the Fluent Builder pattern, as you also imply. It remains for me to more formally argue that case, but in short, the Builder pattern is described as a set of virtual methods that return void. Since all these methods return void, each method could, instead, return the object to which it belongs, and that's what a Fluent Builder does.

Once you return the Builder object, you could, instead of mutating and returning the instance, return a new object. That new object is a near-copy of the previous Builder, with only one change applied to it. Now you have a function that essentially takes a Builder as input, plus some other input, and returns a Builder. That's just a curried endomorphism.

Once again, every time we run into a composable design pattern, it turns out to be a monoid. It shouldn't surprise us much, though, since the original Builder pattern as described in Design Patterns has void methods, and such methods compose.

2019-09-29 20:31 UTC

The most formal treatment I have seen about fluent APIs was in this blog post. The context is that we are trying to create a word in some language specified by a grammar, and the methods in the fluent API correspond to production rules in the grammar. The company behind that blog post seems to be able to generate a fluent API (in Java) given as input the production rules of a grammar. Their main use case appears to be creating a fluent API for constructing SQL queries against a database (presumably by first converting a database schema into corresponding grammar production rules). The end result reminds me of F#'s SQL type provider.

2019-09-30 13:22 UTC

Tyson, I've now published an article that hopefully answers some of your questions. I must admit that I'm still puzzled by this question:

"Have you ever written a builder that accepted multiple arguments one at a time none of which have reasonable defaults?"

If I left that unanswered, then at least I hope that I've managed to put enough building blocks into position to be able to address it. Can you elaborate?

2020-02-17 7:34 UTC

I have now elaborated in this comment. Thanks for waiting :)

2020-03-11 18:31 UTC

Builder as Identity

Monday, 04 September 2017 07:41:00 UTC

In which the Builder functor turns out to be nothing but the Identity functor in disguise.

This is the fourth in a series of articles about the relationship between the Test Data Builder design pattern, and the identity functor. In the previous article, you saw how a generic Test Data Builder can be modelled as a functor.

You may, however, be excused if you're slightly underwhelmed. Modelling a Test Data Builder as a functor doesn't seem to add much value.

Haskell's Identity functor #

In the previous article, you saw the Builder functor implemented in various languages, including Haskell:

newtype Builder a = Builder a deriving (Show, Eq)
 
instance Functor Builder where
  fmap f (Builder a) = Builder $ f a

The fmap implementation is literally a one-liner: pattern match the value a out of the Builder, call f with a, and package the result in a new Builder value.

For many trivial functors, it turns out that the Glasgow Haskell Compiler (GHC) can automatically implement fmap with a language extension:

{-# LANGUAGE DeriveFunctor #-}
module Builder where
 
newtype Builder a = Builder a deriving (Show, Eq, Functor)

Notice the DeriveFunctor language extension. This enables the compiler to automatically implement fmap by adding Functor to the deriving list.

Perhaps we should take this as a hint. If the compiler can automatically make Builder a Functor, perhaps it doesn't add that much value.

This particular Builder is equivalent to Haskell's built-in Identity functor. Identity is a 'no-op' functor, if you will. While it's a functor, it doesn't 'do' anything. It's similar to the Null Object design pattern, in the sense that the only value it adds is that it enables you to turn any naked value into a functor. This can occasionally be useful if you need to pass a functor to an API.

PostCode and Address builders #

You can rewrite the previous PostCode and Address Test Data Builders as Identity values:

postCodeBuilder :: Identity PostCode
postCodeBuilder = Identity $ PostCode []
 
addressBuilder :: Identity Address
addressBuilder  =
  Identity Address { street = "", city = "", postCode = pc }
  where Identity pc = postCodeBuilder

As in the previous examples, postCodeBuilder is nothing but a 'good' default PostCode value. This time, it's turned into an Identity value, instead of a Builder value. The same is true for addressBuilder - notice that it uses postCodeBuilder for the postCode value.

This enables you to build an address in Paris, like previous examples:

Identity address = fmap (\a -> a { city = "Paris" }) addressBuilder

This builds an address with city bound to "Paris", but with all other values still at their default values:

Address {street = "", city = "Paris", postCode = PostCode []}

You can also build an address from an Identity of a different generic type:

Identity address' = fmap newAddress postCodeBuilder
  where newAddress pc =
          Address { street = "Rue Morgue", city = "Paris", postCode = pc }

Notice that this example uses postCodeBuilder as an origin, but creates a new Address value. In this expression, newAddress is a local function that takes a PostCode value as input, and returns an Address value as output.

Summary #

Neither F# nor C# comes with a built-in identity functor, but it'd be as trivial to create them as the code you've already seen. In the previous article, you saw how to define a Builder<'a> type in F#. All you have to do is to change its name to Identity<'a>, and you have the identity functor. You can perform a similar rename for the C# code in the previous articles.
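
As a sketch of what that rename could look like in C#, here's a trimmed-down Identity<T>. The Item property and the IdentityDemo class are assumptions made for this illustration; the equality members of the original Builder<T> are left out for brevity:

```csharp
using System;

// Builder<T> from the previous articles, renamed to Identity<T>.
public sealed class Identity<T>
{
    private readonly T item;

    public Identity(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));

        this.item = item;
    }

    // The functor mapping. The name Select enables C# query syntax.
    public Identity<T1> Select<T1>(Func<T, T1> f)
    {
        if (f == null)
            throw new ArgumentNullException(nameof(f));

        return new Identity<T1>(f(this.item));
    }

    // Hypothetical accessor; plays the role of the Build method.
    public T Item => this.item;
}

public static class IdentityDemo
{
    // Maps an Identity<string> to an Identity<int> and unwraps the result.
    public static int LengthOfParis()
    {
        Identity<int> length =
            from s in new Identity<string>("Paris")
            select s.Length;
        return length.Item;
    }
}
```

Apart from the name, nothing here is specific to identity; the class just wraps a value and lets you map it.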

Since the Identity functor doesn't really 'do' anything, there's no reason to use it for building test values. In the next article, you'll see how to discard the functor and in the process make your code simpler.

Next: Test data without Builders.


The Builder functor

Monday, 28 August 2017 11:19:00 UTC

The Test Data Builder design pattern as a functor.

This is the third in a series of articles about the relationship between the Test Data Builder design pattern, and the identity functor. The previous article introduced this generic Builder class:

public class Builder<T>
{
    private readonly T item;
 
    public Builder(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));
 
        this.item = item;
    }
 
    public Builder<T1> Select<T1>(Func<T, T1> f)
    {
        var newItem = f(this.item);
        return new Builder<T1>(newItem);
    }
 
    public T Build()
    {
        return this.item;
    }
 
    public override bool Equals(object obj)
    {
        var other = obj as Builder<T>;
        if (other == null)
            return base.Equals(obj);
 
        return object.Equals(this.item, other.item);
    }
 
    public override int GetHashCode()
    {
        return this.item.GetHashCode();
    }
}

The workhorse is the Select method. As I previously promised to explain, there's a reason I chose that particular name.

Query syntax #

C# comes with a small DSL normally known as query syntax. People mostly think of it in relation to ORMs such as Entity Framework, but it's a general-purpose language feature. Still, most developers probably associate it with the IEnumerable<T> interface, but it's more general than that. In fact, any type that comes with a Select method with a compatible signature supports query syntax.

I deliberately designed the Builder's Select method to support query syntax:

Address address = from a in Builder.Address
                  select a.WithCity("Paris");

Builder.Address is a Builder<Address> object that contains a 'good' default Address value. Since Builder<T> has a compatible Select method, you can 'query it'. In this example, you use the WithCity method to explicitly pin the Address object's City property, while all the other Address values remain the default values.

There's an extra bit (pun intended) of compiler magic at work. Did you wonder how a Builder<Address> automatically turns into an Address value? After all, address is of the type Address, not Builder<Address>.

I specifically added an implicit conversion so that I didn't have to surround the query expression with brackets in order to call the Build method:

public static implicit operator T(Builder<T> b)
{
    return b.item;
}

This conversion is defined on Builder<T>. It's the reason I explicitly use the type name when I declare the address variable above, instead of using the var keyword. Declaring the type forces the implicit conversion.
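
To make the difference between an explicit type declaration and var concrete, here's a small, self-contained sketch. It repeats a trimmed copy of Builder<T> (without the equality members) and uses a hypothetical Builder<string> default instead of Builder.Address, so that it doesn't depend on the domain model. ConversionDemo and its members are names invented for this example:

```csharp
using System;

// Trimmed copy of Builder<T>; Equals and GetHashCode omitted for brevity.
public class Builder<T>
{
    private readonly T item;

    public Builder(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));

        this.item = item;
    }

    public Builder<T1> Select<T1>(Func<T, T1> f)
    {
        return new Builder<T1>(f(this.item));
    }

    public T Build()
    {
        return this.item;
    }

    public static implicit operator T(Builder<T> b)
    {
        return b.item;
    }
}

public static class ConversionDemo
{
    // Hypothetical default value, standing in for Builder.Address.
    private static readonly Builder<string> City = new Builder<string>("paris");

    public static string ViaExplicitType()
    {
        // Declaring the type as string forces the implicit conversion.
        string city = from c in City select c.ToUpperInvariant();
        return city;
    }

    public static string ViaVar()
    {
        // With var, the query remains a Builder<string>; Build() is required.
        var builder = from c in City select c.ToUpperInvariant();
        return builder.Build();
    }

    public static string ViaMethodCall()
    {
        // The method-call form that the query expression is translated into.
        return City.Select(c => c.ToUpperInvariant()).Build();
    }
}
```

All three methods should produce the same string; the only difference is where the unwrapping happens.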

You can also use query syntax to map one constructed Builder type into another (and ultimately to the value it contains):

Address address =
    from pc in Builder.PostCode
    select new Address("Rue Morgue", "Paris", pc);

This expression starts with a Builder<PostCode> object, transforms it into a Builder<Address> object, and then finally uses the implicit conversion to turn the Builder<Address> into an Address object.

Even a more complex 'query' looks almost palatable:

Invoice invoice =
    from i in Builder.Invoice
    select i.WithRecipient(
        from r in Builder.Recipient
        select r.WithAddress(Builder.Address.WithNoPostCode()));

Again, the implicit type conversion makes the syntax much cleaner.

Functor #

Isn't it amazing that the C# designers were able to come up with such a generally useful language feature? It certainly is a nice piece of work, but it's based on an existing body of knowledge.

A type like Builder<T> with a suitable Select method is a functor. This is a term from category theory, but I'll try to avoid turning this article into a category theory lecture. Likewise, I'm not going to talk specifically about monads here, although it's a closely related topic. A functor is a mapping between categories; it maps an object from one category into an object of another category.

Although I've never seen Microsoft explicitly acknowledge the connection to functors and monads, it's clear that it's there. One of the principal designers of LINQ was Erik Meijer, who definitely knows his way around category theory and functional programming. A functor is a simple, but widely applicable abstraction.

In order to be a functor, a type must have an associated mapping. In C#'s query syntax, this is a method named Select, but more often it's called map.

Haskell Builder #

In Haskell, the mapping is called fmap, and you can define the Builder functor like this:

newtype Builder a = Builder a deriving (Show, Eq)
 
instance Functor Builder where
  fmap f (Builder a) = Builder $ f a

Notice how terse the definition is, compared to the C# version. Despite the difference in size, they accomplish the same goal. The first line of code defines the Builder type, complete with structural equality (Eq) and the ability to convert a Builder value to a string (Show).

This Builder type is explicitly defined as a Functor in the second expression, where the fmap function is implemented. The code is similar to the Select method in the above C# example: f is a function that takes the generic type a (corresponding to T in the C# example) as input, and returns a value of the generic type b (corresponding to T1 in the C# example). The mapping pulls the underlying value out of the input Builder, calls f with that value, and puts the return value into a new Builder.

In Haskell, a functor is part of the language itself, so Builder is explicitly declared to be a Functor instance.

If you define some default Builder values, corresponding to the above Builder.Address, you can use them to build addresses in the same way:

Builder address = fmap (\a -> a { city = "Paris" }) addressBuilder

Here, addressBuilder is a Builder Address value, corresponding to the C# Builder.Address value. \a -> a { city = "Paris" } is a lambda expression that takes an Address value as input, and returns a similar value as output, only with city explicitly bound to "Paris".

F# example #

Unlike Haskell, F# doesn't treat functors as an explicit construct, but you can still define a Builder functor:

type Builder<'a> = Builder of 'a
 
module Builder =
    // ('a -> 'b) -> Builder<'a> -> Builder<'b>
    let map f (Builder x) = Builder (f x)

You can see how similar this is to the Haskell example. In F#, it's common to define a module with the same name as a generic type. This example defines a generic Builder<'a> type and a supporting Builder module. Normally, a module would contain other functions, in addition to map.

Just like in C# and Haskell, you can build an address in Paris with a predefined Builder value as a start:

let (Builder address) =
    addressBuilder |> Builder.map (fun a -> { a with City = "Paris" })

Again, addressBuilder is a Builder<Address> that contains a 'default' Address (test) value. You use Builder.map with a lambda expression to map the default value into a new Address value where City is bound to "Paris".

Functor laws #

In order to be a proper functor, an object must obey two simple laws. It's not enough that a mapping function exists, it must also obey the laws. While that sounds uncomfortably like mathematics, the laws are simple and intuitive.

The first law is that when the mapping returns the input, the functor returned is also the input functor. There's only one (generic) function that returns its input unmodified. It's called the identity function (often abbreviated id).

Here's an example test case that illustrates the first functor law for the C# Builder<T>:

[Fact]
public void BuilderObeysFirstFunctorLaw()
{
    Func<int, int> id = x => x;
    var sut = new Builder<int>(42);
 
    var actual = sut.Select(id);
 
    Assert.Equal(sut, actual);
}

The .NET Base Class Library doesn't come with a built-in identity function, so the test case first defines it as id. Normally, the identity function would be defined as a function that takes a value of the generic type T as input, and returns the same value (still of type T) as output. This test is only an example for the type int, so it also defines the identity function as constrained to int.

The test creates a new Builder<int> with the value 42, and calls Select with id. Since the first functor law says that mapping with the identity function must return the input functor, the expected value is the sut variable.

This test is only an example of the first functor law. It doesn't prove that Builder<T> obeys the law for all generic types (T) and for all values. It only proves that it holds for the integer 42. You get the idea, though, I'm sure.

The second functor law says that if you chain two functions to make a third function, and map your functor using that third function, the result should be equal to the result you get if you chain two mappings made out of those two functions. Here's an example:

[Fact]
public void BuilderObeysSecondFunctorLaw()
{
    Func<int, string> g = i => i.ToString();
    Func<string, string> f = s => new string(s.Reverse().ToArray());
    var sut = new Builder<int>(1337);
 
    var actual = sut.Select(i => f(g(i)));
 
    var expected = sut.Select(g).Select(f);
    Assert.Equal(expected, actual);
}

This test case (which is, again, only an example) first defines two functions, f and g. It then creates a new Builder<int> and calls Select with the combined function f(g). This returns the actual result, which is a Builder<string>.

This result should be equal to first calling Select with g (which returns a Builder<string>), and then calling Select with f (which returns another Builder<string>). These two Builder objects should be equal to each other, which they are.

Both these tests compare an expected Builder to an actual Builder, which is the reason that Builder<T> overrides Equals in order to have structural equality. In Haskell, the above Builder type has structural equality because it uses the default instance of Eq, and in F#, Builder<'a> has structural equality because that's the default equality for the immutable F# data types.

We can't easily talk about the functor laws without being able to talk about functor values being (or not being) equal to each other, so structural equality is an important element in the discussion.
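
If you want slightly more evidence than a single example per law, you can run the same checks over a handful of sample values. The following sketch repeats Builder<T> so that it's self-contained, and relies on the structural equality discussed above. The FunctorLaws class and the sample values are assumptions made for this illustration. It's still only a collection of examples, not a proof:

```csharp
using System;
using System.Linq;

public class Builder<T>
{
    private readonly T item;

    public Builder(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));

        this.item = item;
    }

    public Builder<T1> Select<T1>(Func<T, T1> f)
    {
        return new Builder<T1>(f(this.item));
    }

    // Structural equality: two Builders are equal if their items are equal.
    public override bool Equals(object obj)
    {
        var other = obj as Builder<T>;
        if (other == null)
            return base.Equals(obj);

        return object.Equals(this.item, other.item);
    }

    public override int GetHashCode()
    {
        return this.item.GetHashCode();
    }
}

public static class FunctorLaws
{
    // First law: mapping with the identity function changes nothing.
    public static bool FirstLawHoldsFor<T>(T value)
    {
        Func<T, T> id = x => x;
        var sut = new Builder<T>(value);
        return sut.Equals(sut.Select(id));
    }

    // Second law: mapping with a composed function equals chaining two mappings.
    public static bool SecondLawHoldsFor(int value)
    {
        Func<int, string> g = i => i.ToString();
        Func<string, string> f = s => new string(s.Reverse().ToArray());
        var sut = new Builder<int>(value);

        var actual = sut.Select(i => f(g(i)));
        var expected = sut.Select(g).Select(f);
        return expected.Equals(actual);
    }

    // Runs both laws over a few arbitrarily chosen sample values.
    public static bool HoldForSamples()
    {
        var ints = new[] { -1, 0, 42, 1337 };
        var strings = new[] { "", "foo", "Paris" };
        return ints.All(i => FirstLawHoldsFor(i))
            && strings.All(s => FirstLawHoldsFor(s))
            && ints.All(i => SecondLawHoldsFor(i));
    }
}
```

Without the Equals override, both comparisons would fall back to reference equality and fail, because Select always returns a new Builder object.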

Summary #

You can define a Test Data Builder as a functor by defining a generic Builder type with a Select method. In order to be a proper functor, it must also obey the functor laws, but these laws are quite natural; you almost have to go out of your way in order to violate them.

A functor is a well-known abstraction. Instead of trying to come up with a half-baked, ad-hoc abstraction, modelling an API based on already known and understood abstractions such as functors will make the API easier to learn. Everyone who knows what a functor is, will automatically have a good understanding of the API. Even if you didn't know about functors until now, you only have to learn about them once.

This can often be beneficial, but for Test Data Builders, it turns out to be a red herring. The Builder functor is nothing but the Identity functor in disguise.

Next: Builder as Identity.


Comments

Maybe I am missing something, so could you explain with few words what is the advantage of having this _generic builder_?

I mean, immutable entities and _with methods_ seems to be enough to easily create test data without builders, for example:


var invoice = DefaultTestObjects.Invoice
    .WithRecipient(DefaultTestObjects.Recipient
        .WithAddress(DefaultTestObjects.Address
            .WithNoPostCode()
            .WithCity("Paris"))
    .WithDate(new DateTimeOffset(2017, 8, 29)));

2017-08-29 12:50 UTC

Andrés, thank you for writing. I hope that the next two articles in this article series will answer your question. It seems, however, that you've already predicted where this is headed. A fine display of critical thinking!

2017-08-29 17:48 UTC

Just for the DSL and implicit conversion laius it's worth reading.
_Implicit/explicit_ is a must-have to lighten _value object_ usage and contributes to deliver proper API.
Thank you Mark.

2017-08-31 12:44 UTC

Generalised Test Data Builder

Monday, 21 August 2017 06:09:00 UTC

This article presents a generalised Test Data Builder.

This is the second in a series of articles about the relationship between the Test Data Builder design pattern, and the identity functor. The previous article was a review of the Test Data Builder pattern.

Boilerplate #

While the Test Data Builder is an incredibly versatile and useful design pattern, it has a problem. In languages like C# and Java, it's difficult to generalise. This leads to an excess of boilerplate code.

Expanding on Nat Pryce's original example, an InvoiceBuilder is composed of other builders:

public class InvoiceBuilder
{
    private Recipient recipient;
    private IReadOnlyCollection<InvoiceLine> lines;
 
    public InvoiceBuilder()
    {
        this.recipient = new RecipientBuilder().Build();
        this.lines = new List<InvoiceLine> { new InvoiceLineBuilder().Build() };
    }
 
    public InvoiceBuilder WithRecipient(Recipient newRecipient)
    {
        this.recipient = newRecipient;
        return this;
    }
 
    public InvoiceBuilder WithInvoiceLines(
        IReadOnlyCollection<InvoiceLine> newLines)
    {
        this.lines = newLines;
        return this;
    }
 
    public Invoice Build()
    {
        return new Invoice(recipient, lines);
    }
}

In order to create a Recipient, a RecipientBuilder is used. Likewise, in order to create a single InvoiceLine, an InvoiceLineBuilder is used. This pattern repeats in the RecipientBuilder:

public class RecipientBuilder
{
    private string name;
    private Address address;
 
    public RecipientBuilder()
    {
        this.name = "";
        this.address = new AddressBuilder().Build();
    }
 
    public RecipientBuilder WithName(string newName)
    {
        this.name = newName;
        return this;
    }
 
    public RecipientBuilder WithAddress(Address newAddress)
    {
        this.address = newAddress;
        return this;
    }
 
    public Recipient Build()
    {
        return new Recipient(this.name, this.address);
    }
}

In order to create an Address object, an AddressBuilder is used.

Generalisation attempts #

You can describe the pattern in a completely automatable manner:

  1. For each domain class, create a corresponding Builder class.
  2. For each class field or property in the domain class, define a corresponding field or property in the Builder.
  3. In the Builder's constructor, initialise each field or property with a 'good' default value.
    • If the field is a primitive value, such as a string or integer, hard-code an appropriate value.
    • If the field is a complex domain type, use that type's corresponding Builder to create the default value.
  4. For each class field or property, add a With[...] method that changes the field and returns the Builder itself.
  5. Add a Build method that returns a new instance of the domain class with the constituent values collected so far.

When you can deterministically describe an automatable process, you can write code to automate it.

People have already done that. After having written individual Test Data Builders for a couple of months, I got tired of it and wrote AutoFixture. It uses Reflection to build objects at run-time, but I've also witnessed attempts to automate Test Data Builders via automated code generation.

AutoFixture has been moderately successful, but some people find its API difficult to learn. Correspondingly, code generation comes with its own issues.

In languages like C# or Java, it's difficult to identify a better generalisation.

Generic Builder #

Instead of trying to automate the Test Data Builder pattern, you can pursue a different strategy. At first, it doesn't look all that promising, but if you soldier on, it'll reveal meaningful insights.

As an alternative to replicating the Test Data Builder pattern exactly, you can define a single generically typed Builder class:

public class Builder<T>
{
    private readonly T item;
 
    public Builder(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));
 
        this.item = item;
    }
 
    public Builder<T1> Select<T1>(Func<T, T1> f)
    {
        var newItem = f(this.item);
        return new Builder<T1>(newItem);
    }
 
    public T Build()
    {
        return this.item;
    }
 
    public override bool Equals(object obj)
    {
        var other = obj as Builder<T>;
        if (other == null)
            return base.Equals(obj);
 
        return object.Equals(this.item, other.item);
    }
 
    public override int GetHashCode()
    {
        return this.item.GetHashCode();
    }
}

The Builder<T> class reduces the Test Data Builder design pattern to the essentials:

  • A constructor that initialises the Builder with default data.
  • A single fluent interface Select method, which returns a new Builder object.
  • A Build method, which returns the built object.

Perhaps you wonder about the name of the Select method, but there's a good reason for that; you'll learn about it later.

This example of a generic Builder class overrides Equals (and, therefore, also GetHashCode). It doesn't have to do that, but there's a good reason to do this that we'll also come back to later.

It doesn't seem particularly useful, and a first attempt at using it seems to confirm such scepticism:

var address = Build.Address().Select(a =>
{
    a.City = "Paris";
    return a;
}).Build();

This example first uses Build.Address() to create an initial Builder object with appropriate defaults. This static method is defined on the static Build class:

public static Builder<Address> Address()
{
    return new Builder<Address>(new Address("", "", PostCode().Build()));
}

Contrary to Builder<T>, which is a reusable, general-purpose class, the static Build class is an example of a collection of Test Utility Methods specific to the domain model you're testing. Notice how the Build.Address() method uses Build.PostCode().Build() to create a default value for the initial Address object's post code.

The above example passes a C# code block to the Select method. It takes the a (Address) object as input, specifically mutates its City property, and returns it. This syntax is crude, but works. It may look acceptable when pinning a single City property, but it quickly becomes awkward:

var invoice = Build.Invoice().Select(i =>
    {
        i.Recipient = Build.Recipient().Select(r =>
        {
            r.Address = Build.Address().WithNoPostCode().Build();
            return r;
        }).Build();
        return i;
    }).Build();

Not only is it difficult to get right when writing such nested statements, it's also hard to read. You can, however, correct that problem, as you'll see in a little while.

Before we commence on making the code prettier, you may have noticed that the Select method returns a Builder with a different generic type argument than it contains. The Select method on a Builder<T> object has the signature public Builder<T1> Select<T1>(Func<T, T1> f). Until now, however, all the examples you've seen return the input object. In those examples, T is the same as T1. For completeness' sake, here's an example of a proper change of type:

var address = Build.PostCode()
    .Select(pc => new Address("Rue Morgue", "Paris", pc))
    .Build();

This example uses a Builder<PostCode> to create a new Address object. Plugging in the types, T becomes PostCode, and T1 becomes Address.

Perhaps you noticed that this example looks a little better than the previous examples. Instead of having to supply a C# code block, with return statement and all, this call to Select passes a proper (lambda) expression.

Expressions from extensions #

It'd be nice if you could use expressions, instead of full code blocks, with the Select method. As a first step, you could write some test-specific extension methods for your domain model, like this:

public static Address WithCity(this Address address, string newCity)
{
    address.City = newCity;
    return address;
}

This is the same code as one of the code blocks above, only refactored to a named extension method. It simplifies use of the generic Builder, though:

var address = Build.Address().Select(a => a.WithCity("Paris")).Build();

That looks good in such a simple example, but unfortunately isn't much of an improvement when it comes to a more complex case:

var invoice =
    Build.Invoice()
        .Select(i => i
            .WithRecipient(Build.Recipient()
                .Select(r => r
                    .WithAddress(Build.Address()
                        .WithNoPostCode()
                        .Build()))
                .Build()))
        .Build();

If, at this point, you're tempted to give up on the overall strategy with a single generic Builder, you'd be excused. It will, however, turn out to be beneficial to carry on. There are more obstacles, but eventually, things will start to fall into place.

Copy and update #

The above WithCity extension method mutates the input object, which can lead to surprising behaviour. While it's a common way to implement fluent interfaces in object-oriented languages, nothing prevents you from making the code saner. Instead of mutating the input object, create a new object with the single value changed:

public static Address WithCity(this Address address, string newCity)
{
    return new Address(address.Street, newCity, address.PostCode);
}

Some people will immediately be concerned about the performance implications of doing this, but you're not one of those people, are you?

Granted, there's allocation and garbage collection overhead by creating new objects like this, but I'd digress if I started to discuss this here. In most cases, the impact is insignificant.
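
To see the kind of surprise that the mutating version invites, consider this small sketch. MutableAddress and ImmutableAddress are hypothetical, pared-down stand-ins for the Address class (only City is included), and MutationDemo is a name invented for this example:

```csharp
using System;

// Hypothetical, deliberately mutable stand-in for Address.
public class MutableAddress
{
    public string City { get; set; }
}

// Hypothetical immutable counterpart with a copy-and-update method.
public class ImmutableAddress
{
    public string City { get; }

    public ImmutableAddress(string city)
    {
        City = city;
    }

    public ImmutableAddress WithCity(string newCity) =>
        new ImmutableAddress(newCity);
}

public static class MutationDemo
{
    // Mutating extension method, like the first WithCity version above.
    public static MutableAddress WithCity(
        this MutableAddress address,
        string newCity)
    {
        address.City = newCity;
        return address;
    }

    // Returns "shared city/derived city" after a mutating update.
    public static string AfterMutation()
    {
        var shared = new MutableAddress { City = "" };
        var paris = shared.WithCity("Paris");
        return shared.City + "/" + paris.City;
    }

    // Returns "shared city/derived city" after a copy-and-update.
    public static string AfterCopy()
    {
        var shared = new ImmutableAddress("");
        var paris = shared.WithCity("Paris");
        return shared.City + "/" + paris.City;
    }
}
```

In the mutating case, shared and paris are the same object, so the supposedly default value has silently changed as well. In the copy-and-update case, the original value keeps its empty City.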

Fluent domain model #

Using extension methods enables you to use a more elegant syntax with the Select method, but there's still some maintenance overhead. If, for now, we accept such maintenance overhead, you could ask: given that we have to define and maintain all those With[...] methods, why limit them to your test code?

Would there be any harm in defining them as proper methods on your domain model?

public Address WithCity(string newCity)
{
    return new Address(this.Street, newCity, this.PostCode);
}

The above example shows the WithCity method as an instance method on the Address class. Here's the entire Address class, refactored to an immutable class:

public class Address
{
    public string Street { get; }
    public string City { get; }
    public PostCode PostCode { get; }
 
    public Address(string street, string city, PostCode postCode)
    {
        if (street == null)
            throw new ArgumentNullException(nameof(street));
        if (city == null)
            throw new ArgumentNullException(nameof(city));
        if (postCode == null)
            throw new ArgumentNullException(nameof(postCode));
 
        this.Street = street;
        this.City = city;
        this.PostCode = postCode;
    }
 
    public Address WithStreet(string newStreet)
    {
        return new Address(newStreet, this.City, this.PostCode);
    }
 
    public Address WithCity(string newCity)
    {
        return new Address(this.Street, newCity, this.PostCode);
    }
 
    public Address WithPostCode(PostCode newPostCode)
    {
        return new Address(this.Street, this.City, newPostCode);
    }
 
    public override bool Equals(object obj)
    {
        var other = obj as Address;
        if (other == null)
            return base.Equals(obj);
 
        return object.Equals(this.Street, other.Street)
            && object.Equals(this.City, other.City)
            && object.Equals(this.PostCode, other.PostCode);
    }
 
    public override int GetHashCode()
    {
        return
            this.Street.GetHashCode() ^
            this.City.GetHashCode() ^
            this.PostCode.GetHashCode();
    }
}

Technically, you could introduce instance methods like WithCity even if you kept the class itself mutable, but once you start down that path, it makes sense to make the class immutable. As Eric Evans recommends in Domain-Driven Design, modelling your domain with (immutable) Value Objects has many benefits. Such objects should also have structural equality, which is the reason that this version of Address also overrides Equals and GetHashCode.

While it looks like more work in a language like C# or Java, there are many benefits to be derived from modelling your domain with Value Objects. As an interim result, then, observe that working with unit testing (in this case a general-purpose Test Data Builder) has prompted a better design of the System Under Test.

You may still think that this seems unnecessarily verbose, and I'd agree. This is one of the many reasons I prefer languages like F# and Haskell over C# or Java. The former have such a copy and update feature built-in. Here's an F# example of updating an Address record with a specific city:

let address = { a with City = "Paris" }

This capability is built into the language. You don't have to add or maintain any code in order to be able to write code like that. Notice, even, how with is a keyword. I'm not sure about the etymology of the word with used in this context, but I find the similarity compelling.

In Haskell, it looks similar:

address = a { city = "Paris" }

In other words, domain models created from immutable Value Objects are laborious in some languages, but that only suggests a deficiency in such a language.

Default Builders as values #

Now that the domain model is immutable, you can define default builders as values. Previously, to start building e.g. an Address value, you had to call the Build.Address() method. When the domain model was mutable, containing a single default value inside of a Builder would enable tests to mutate that default value. Now that domain classes are immutable, this is no longer a concern, and you can instead define test-specific default builders as values:

public static class Builder
{
    public readonly static Builder<Address> Address;
    public readonly static Builder<Invoice> Invoice;
    public readonly static Builder<InvoiceLine> InvoiceLine;
    public readonly static Builder<PostCode> PostCode;
    public readonly static Builder<PoundsShillingsPence> PoundsShillingsPence;
    public readonly static Builder<Recipient> Recipient;
 
    static Builder()
    {
        PoundsShillingsPence = new Builder<PoundsShillingsPence>(
            DomainModel.PoundsShillingsPence.Zero);
        PostCode = new Builder<PostCode>(new PostCode());
        Address =
            new Builder<Address>(new Address("", "", PostCode.Build()));
        Recipient =
            new Builder<Recipient>(new Recipient("", Address.Build()));
        Invoice = new Builder<Invoice>(
            new Invoice(Recipient.Build(), new List<InvoiceLine>()));
        InvoiceLine = new Builder<InvoiceLine>(
            new InvoiceLine("", PoundsShillingsPence.Build()));
    }
 
    public static Builder<Address> WithNoPostCode(this Builder<Address> b)
    {
        return b.Select(a => a.WithPostCode(new PostCode()));
    }
}

This enables you to write expressions like this:

var address = Builder.Address.Select(a => a.WithCity("Paris")).Build();

To be clear: such a static Builder class is a Test Utility API specific to your unit tests. It would often be defined in a completely different file than the Builder<T> class, perhaps even in separate libraries.

Summary #

Instead of trying to automate Test Data Builders to the letter of the original design pattern description, you can define a single, reusable, generic Builder<T> class. It enables you to achieve some of the expressivity of Test Data Builders.

If you still don't find this strategy's prospects fertile, I understand. We're not done, though. In the next article, you'll see why Select is an appropriate name for the Builder's most important method, and how it relates to good abstractions.

Next: The Builder functor.


Comments

When I found myself writing too many With() methods, I created an extension to Fody code weaving tool: Fody.With.

Basically I declare the With() methods without body implementation, and then Fody does the implementation for me. It can also convert a generic version to N overloads with an implementation per each public property.

The link above has some usage examples that hopefully make the idea clear.

2017-08-21 12:32 UTC

C# does have Object Initializer to build "address" with specified "city", similar to F# and Haskell.

2017-08-22 12:40 UTC

Harshdeep, thank you for writing. C# object initialisers aren't the same as F# Copy and Update Record Expressions. Unless I misunderstand what you mean, when you write

var address = new Address { City = "Paris" };

address will have "Paris" as City, but all other properties, such as Street and PostCode will be null. That's not what I want. That's the problem the Test Data Builder pattern attempts to address. Test values should be populated with 'good' values, not null.

I admit that I'm not keeping up with the latest developments in C#, but if I try to use the C# object initializer syntax with an existing value, like this:

var defaultAddress =
    new Address { Street = "", PostCode = new DomainModel.PostCode(), City = "" };
var address = defaultAddress { City = "Paris" };

it doesn't compile.

I'm still on Visual Studio 2015, though, so that may be it...

2017-08-22 13:27 UTC

Aah. Now I get it. Thanks for explaining. I am from the C# world and certainly not into F# yet, so I confused "Copy & Update Expression" with "Object Initializer".

2017-08-23 5:22 UTC

Test Data Builders in C#

Tuesday, 15 August 2017 06:20:00 UTC

A brief recap of the Test Data Builder design pattern with examples in C#.

This is the first in a series of articles about the relationship between the Test Data Builder design pattern, and the identity functor.

In 2007 Nat Pryce described the Test Data Builder design pattern. The original article is easy to read, but in case you don't want to read it, here's a quick summary, with some of Nat Pryce's examples translated to C#.

The purpose of a Test Data Builder is to make it easy to create input data (or objects) for unit tests. Imagine, for example, that for a particular test case, you need an address in Paris; no other values matter. With a Test Data Builder, you can write an expression that gives you such a value:

var address = new AddressBuilder().WithCity("Paris").Build();

The address object explicitly has a City value of "Paris". Any other values are default values defined by AddressBuilder. The values are there, but when they're unimportant to a particular test case, you don't have to specify them. To paraphrase Robert C. Martin, this eliminates the irrelevant, and amplifies the essentials of the test.

Address Builder #

An AddressBuilder could look like this:

public class AddressBuilder
{
    private string street;
    private string city;
    private PostCode postCode;
 
    public AddressBuilder()
    {
        this.street = "";
        this.city = "";
        this.postCode = new PostCodeBuilder().Build();
    }
 
    public AddressBuilder WithStreet(string newStreet)
    {
        this.street = newStreet;
        return this;
    }
 
    public AddressBuilder WithCity(string newCity)
    {
        this.city = newCity;
        return this;
    }
 
    public AddressBuilder WithPostCode(PostCode newPostCode)
    {
        this.postCode = newPostCode;
        return this;
    }
 
    public AddressBuilder WithNoPostcode()
    {
        this.postCode = new PostCode();
        return this;
    }
 
    public Address Build()
    {
        return new Address(this.street, this.city, this.postCode);
    }
}

The Address class is simpler than the Builder:

public class Address
{
    public string Street { get; set; }
    public string City { get; set; }
    public PostCode PostCode { get; set; }
 
    public Address(string street, string city, PostCode postCode)
    {
        this.Street = street;
        this.City = city;
        this.PostCode = postCode;
    }
}

Clearly, this class could contain some behaviour, but in order to keep the example as simple as possible, it's only a simple Data Transfer Object.

Composition #

Given that AddressBuilder is more complicated than Address itself, the benefit of the pattern may seem obscure, but one of the benefits is that Test Data Builders easily compose:

var invoice = new InvoiceBuilder()
    .WithRecipient(new RecipientBuilder()
        .WithAddress(new AddressBuilder()
            .WithNoPostcode()
            .Build())
        .Build())
    .Build();

Perhaps that looks verbose, but in general, the alternative is worse. If you didn't have a Test Utility Method, you'd have to fill in all the required data for the object:

var invoice = new Invoice(
    new Recipient("Sherlock Holmes",
        new Address("221b Baker Street",
                    "London",
                    new PostCode())),
    new List<InvoiceLine> {
        new InvoiceLine("Deerstalker Hat",
            new PoundsShillingsPence(0, 3, 10)),
        new InvoiceLine("Tweed Cape",
            new PoundsShillingsPence(0, 4, 12))});

Here, the important detail drowns in data. The post code is empty because the PostCode constructor is called without arguments. This hardly jumps out when you see it. Such code neither eliminates the irrelevant, nor amplifies the essential.

Summary #

Test Data Builders are useful because they are good abstractions. They enable you to write unit tests that you can trust.

The disadvantage, as you shall see, is that in languages like C# and Java, much boilerplate code is required.

Next: Generalised Test Data Builder.


Comments

You got me to finally figure out how to post comments. :) Hope everything looks alright.

So first off, great article as always. You totally hit a subject which has been driving me nuts, personally and lately. I have been developing my first FluentAPI and have been running up against both aspects of immutability and query/command separation that you have done an excellent job of presenting here on your blog. It does seem that FluentAPI design and the builder pattern you present above deviate from these principles, so it would be great to hear a little more context and valuable insight from you on how you reconcile this. Is this perhaps a C# issue that is easily remedied in F#? Thank you in advance for any assistance and for providing such a valuable technical resource here. It's been my favorite for many years now.
2017-08-15 06:52 UTC

Mike, thank you for writing. The fluent interface that I show in this article is the most common form you see in C#. While it's not my preferred variation, I use it in this article because it's a direct translation of the style used in Nat Pryce's Java code.

Ordinarily, I prefer an immutable variant, but in C# this leads to even more boilerplate code, and I didn't want to sidetrack the topic by making this recap article more complicated than absolutely necessary.
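To illustrate what such an immutable variant could look like, here's a minimal sketch. It isn't code from the article series; ImmutableAddressBuilder is a hypothetical name, and Address is reduced to two properties to keep the example short. Each With method returns a new builder instead of mutating this:

```csharp
using System;

// Simplified stand-in for Address (PostCode omitted for brevity).
public sealed class Address
{
    public string Street { get; }
    public string City { get; }

    public Address(string street, string city)
    {
        Street = street;
        City = city;
    }
}

// Hypothetical immutable builder: all fields are readonly.
public sealed class ImmutableAddressBuilder
{
    private readonly string street;
    private readonly string city;

    public ImmutableAddressBuilder() : this("", "")
    {
    }

    private ImmutableAddressBuilder(string street, string city)
    {
        this.street = street;
        this.city = city;
    }

    public ImmutableAddressBuilder WithStreet(string newStreet)
    {
        return new ImmutableAddressBuilder(newStreet, this.city);
    }

    public ImmutableAddressBuilder WithCity(string newCity)
    {
        return new ImmutableAddressBuilder(this.street, newCity);
    }

    public Address Build()
    {
        return new Address(this.street, this.city);
    }
}

public static class Program
{
    public static void Main()
    {
        var template = new ImmutableAddressBuilder().WithStreet("Main Street");
        var paris = template.WithCity("Paris").Build();

        Console.WriteLine(paris.City);                   // Paris
        Console.WriteLine(paris.Street);                 // Main Street
        // The template is unaffected by the WithCity call:
        Console.WriteLine(template.Build().City.Length); // 0
    }
}
```

With the mutable AddressBuilder shown in the article, the same WithCity call would have changed template as well. The price of the immutable version is the extra private constructor and a new allocation per With call, which is the additional boilerplate referred to above.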

You may be pleased to learn that future articles in this very article series will show alternatives in C#, F#, and Haskell.

2017-08-15 07:30 UTC

Hi Mark, A while ago I made a generic builder for this exact purpose. I also made some helper extension methods that could act as a sort of Object Mother. I quite like how it works and I have used it quite a few times. So, reading this post, I thought I'd put in a link to it, as it might be useful to other readers.

Generic Builder with Object Mother Gist

It's all in one big gist and probably not very well structured, but if you look at the class GenericBuilder it should be quite easily understood. The examples of extension methods can be seen towards the end of the file.

2017-08-15 07:42 UTC

I've used test data builders in C# just like this in the past, and couldn't decide whether I liked them or not, due to all the boilerplate.

I'm looking forward to the next few posts, thanks for doing this.

2017-08-18 11:43 UTC

Hi, Mark

In C# I started to prefer to use a "parameterized object mother". Please take a look and tell me what you think about it: Address Object Mother Gist.

From my experience it is less and simpler code. It is also a bit easier to debug. Personally, the Object Mother is the first pattern I reach for when refactoring test data creation, and I use Fluent Test Data Builder only in more complex scenarios.

@JanD: Unfortunately, your solution would not work for immutable data structures (which I prefer).

2017-08-19 19:25 UTC

Robert, thank you for writing. I haven't seen that particular C# variation before, but it looks useful. I hope that as this article series progresses, it should become increasingly clear to the reader that the Test Data Builder pattern addresses various language deficiencies. (It has, by the way, for some time been a common criticism of design patterns in general that they are nothing but patches on language deficiencies. I don't think that I agree with that 100 percent, but I certainly understand the argument.)

Nat Pryce's original article about the Test Data Builder pattern is from 2007 with example code in Java. I don't know that much about Java, but back then, I don't think C# had optional arguments (as far as I can tell, that language feature was added in 2010). My point is that the pattern described a good way to model code given the language features that were available at the time.

As a general rule, I'm not a fan of C#'s optional argument feature (because I'm concerned what it does to forwards and backwards compatibility of my APIs), but used in the way you suggest it does look useful. Perhaps it does, indeed, address all the concerns that the Test Data Builder pattern addresses. I haven't tried it, so I can't really evaluate it (yet), but it looks like it'd be worth trying out.
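For readers who haven't followed the link, a parameterised Object Mother based on optional arguments might look roughly like the following sketch. The linked gist isn't reproduced here, so the parameter names, the defaults, and the simplified Address and PostCode classes are all assumptions made for illustration; only the name AddressObjectMother.Create comes from this discussion.

```csharp
using System;

// Simplified, hypothetical domain classes.
public sealed class PostCode
{
    public string Value { get; }

    public PostCode() : this("")
    {
    }

    public PostCode(string value)
    {
        Value = value;
    }
}

public sealed class Address
{
    public string Street { get; }
    public string City { get; }
    public PostCode PostCode { get; }

    public Address(string street, string city, PostCode postCode)
    {
        Street = street;
        City = city;
        PostCode = postCode;
    }
}

public static class AddressObjectMother
{
    // Every parameter is optional; callers name only what matters to the test.
    public static Address Create(
        string street = "",
        string city = "",
        PostCode postCode = null)
    {
        // Optional-argument defaults must be compile-time constants,
        // so the 'good' PostCode default is supplied here instead.
        return new Address(street, city, postCode ?? new PostCode());
    }
}

public static class Program
{
    public static void Main()
    {
        var address = AddressObjectMother.Create(city: "Paris");

        Console.WriteLine(address.City);             // Paris
        Console.WriteLine(address.PostCode != null); // True
    }
}
```

Since the method constructs a fresh Address on every call, this style also works with immutable domain classes.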

My overall goal with this article series is, however, slightly different. In fact, I'm not trying to sell the Test Data Builder pattern to anyone. Rather, the point is that with better API design, and with better languages, it'd be largely redundant.

2017-08-21 06:33 UTC

Hi, Mark
Thank you for this post

I personally leverage Impromptu Interface. It can also be verbose, but as you only provide meaningful data it fits Robert C. Martin's credo. And it avoids creating a lot of one-shot boilerplate code and/or cluttering existing classes with UT specific stuff.

2017-08-21 09:12 UTC

Romain, do you have an example of that, that you could share?

2017-08-21 09:49 UTC

Partial IAddress with City value only:

var address = new {
    City = "Paris"
}.ActLike<IAddress>();

Partial IAddress with City value and partial IPostCode with ISO value only:

var address = new {
    City = "Paris",
    PostCode = new {
        ISO = "FR"
    }.ActLike<IPostCode>()
}.ActLike<IAddress>();

The main drawback is verbosity, but the intent is pretty clear.
We could reduce nesting by splitting the IAddress and IPostCode declarations, but that also obscures intent: we do not care about IPostCode, we care about IAddress, and IPostCode is only an implementation detail.

I heavily leverage regions to cope with C# verbosity and to highlight common patterns - AAA in this case - so all this code is usually hidden in one ARRANGE region.
When I need multiple declarations I use a sut (System Under Test) marker to highlight the main actor.

2017-08-21 21:09 UTC

Do I understand it correctly that you'd have an interface like the following, then?

public interface IAddress
{
    string City { get; set; }
}

I'm not sure that I quite follow...

2017-08-22 11:43 UTC

Mark, I tend to avoid setters in my interfaces, so my domain objects are usually immutable and only expose getters.
My implementations are mainly internal, which prevents them from being used directly from within the UT assembly (without using the InternalsVisibleTo attribute).
I have factories - whose implementations are also internal - to build my objects.
I then use an IoC container to access the factories and create my objects.

public interface IAddress
{
    string City { get; }
    string Street { get; }
    IPostCode PostCode { get; }
}

AddressBuilder lives in UT world so must be in another assembly to avoid noising my model.
To cope with my internal visibility constraint I have at least 2 options I can live with:

  1. Using InternalsVisibleTo attribute for my UT assembly to be able to seamlessly use my types
  2. Leveraging a test container to resolve my factory and then create my objects.
To deal with the immutable constraint I can create new ones within With methods. I can live with this too.

The main drawback remains the verbosity/burden of those methods.
Using Impromptu Interface to generate partial test data spares builder classes creation while keeping verbosity acceptable and intent clear.
Does it make sense?

2017-08-22 13:24 UTC

That helps clarify things, thank you.

I know that obviously, I could try for myself, but when you write

var address = new {
    City = "Paris"
}.ActLike<IAddress>();

then what will be the value of address.PostCode?

2017-08-22 13:40 UTC
It throws an exception if accessed, but lives peacefully otherwise. That is why I talked about partial data.
You have to be aware of this. When your test focuses on a single aspect of your class, you can safely use it.
Imagine you are testing a City-centric algorithm: you do not care about Street, Street number, Floor, and so on.
There's no need to create heavy/costly objects; you can safely use a partial object that only complies with part of the original interface.
It's the same situation you would have if you had split the IAddress interface into several parts, namely IHaveACity, IHaveAStreet, ...
As it only declares what it needs to work, the UT intent is pretty clear. Like a test builder, it removes the noisy stuff.
2017-08-22 14:22 UTC

Now I think I get it! Thank you for taking the time to explain.

2017-08-22 14:50 UTC

A slight variation on Robert Pajak's approach that allows writing an.Address() instead of unwieldy AddressObjectMother.Create(): Mother Factory.

Another usage sample: gist.

2017-09-12 9:16 UTC

From Test Data Builders to the identity functor

Monday, 14 August 2017 11:34:00 UTC

The Test Data Builder unit testing design pattern is closely related to the identity functor.

The Test Data Builder design pattern is a valuable technique for managing data for unit testing. It enables you to express test cases in such a way that the important parts of the test case stand out in your code, while the unimportant parts disappear. It perfectly fits Robert C. Martin's definition of an abstraction:

"Abstraction is the elimination of the irrelevant and the amplification of the essential"
Not only are Test Data Builders great abstractions, but they're also eminently composable. You can use fine-grained Test Data Builders as building blocks for more complex Test Data Builders. This turns out to be more than a coincidence. In this series of articles, you'll learn how Test Data Builders are closely related to the identity functor. If you don't know what a functor is, then keep reading; you'll learn about functors as well.
  1. Test Data Builders in C#
  2. Generalised Test Data Builder
  3. The Builder functor
  4. Builder as Identity
  5. Test data without Builders
  6. (The Test Data Generator functor)
By reading these articles, you'll learn the following:
  • How to make your code easier to use in unit tests.
  • What a functor is.
  • How Test Data Builders generalise.
  • Why Test Data Builders are composable.
If you've ever struggled with defining good abstractions, learning about functors (and some related concepts) will help.

For readers wondering if this is 'yet another monad tutorial', it's not; it's a functor tutorial.

Next: Test Data Builders in C#.


F# free monad recipe

Monday, 07 August 2017 08:11:00 UTC

How to create free monads in F#.

This is not a design pattern, but it's something related. Let's call it a recipe. A design pattern should, in my opinion, be fairly language-agnostic (although hardly universally applicable). This article, on the contrary, specifically addresses a problem in F#:

How do you create a free monad in F#?

By following the present recipe.

The recipe here is a step-by-step process, but be sure to first read the sections on motivation and when to use it. A free monad isn't a goal in itself.

This article doesn't attempt to explain the details of free monads, but instead serves as a reference. For an introduction to free monads, I think my article Pure times is a good place to start. See also the Motivating examples section, below.

Motivation #

A frequently asked question about F# is: what's the F# equivalent to an interface? There's no single answer to this question, because, as always, It Depends™. Why do you need an interface in the first place? What is its intended use?

Sometimes, in OOP, an interface can be used for a Strategy. This enables you to dynamically replace or select between different (sub)algorithms at run-time. If the algorithm is pure, then an idiomatic F# equivalent would be a function.

At other times, though, the person asking the question has Dependency Injection in mind. In OOP, dependencies are often modelled as interfaces with several members. Such dependencies are systematically impure, and thereby not part of functional design. If at all possible, prefer impure/pure/impure sandwiches over interactions. Sometimes, however, you'll need something that works like an interface or abstract base class. Free monads can address such situations.

In general, a free monad allows you to build a monad from any functor, but why would you want to do that? The most common reason I've encountered is exactly in order to model impure interactions in a pure manner; in other words: Dependency Injection.

Refactor interface to functor #

This recipe comes in three parts:

  1. A recipe for refactoring interfaces to a functor.
  2. The core recipe for creating a monad from any functor.
  3. A recipe for adding an interpreter.
The universal recipe for creating a monad from any functor follows in a later section. In this section, you'll see how to refactor an interface to a functor.

Imagine that you have an interface that you'd like to refactor. In C# it might look like this:

public interface IFace
{
    Out1 Member1(In1 input);
    Out2 Member2(In2 input);
}

In F#, it'd look like this:

type IFace =
    abstract member Member1 : input:In1 -> Out1
    abstract member Member2 : input:In2 -> Out2

I've deliberately kept the interface vague and abstract in order to showcase the recipe instead of a particular example. For realistic examples, refer to the examples section, further down.

To refactor such an interface to a functor, do the following:

  1. Create a discriminated union. Name it after the interface name, but append the word instruction as a suffix.
  2. Make the union type generic.
  3. For each member in the interface, add a case.
    1. Name the case after the name of the member.
    2. Declare the type of data contained in the case as a pair (a two-element tuple).
    3. Declare the type of the first element in that tuple as the type of the input argument(s) to the interface member. If the member has more than one input argument, declare it as a (nested) tuple.
    4. Declare the type of the second element in the tuple as a function. The input type of that function should be the output type of the original interface member, and the output type of the function should be the generic type argument for the union type.
  4. Add a map function for the union type. I'd recommend making this function private and avoid naming it map in order to prevent naming conflicts. I usually name this function mapI, where the I stands for instruction.
  5. The map function should take a function of the type 'a -> 'b as its first (curried) argument, and a value of the union type as its second argument. It should return a value of the union type, but with the generic type argument changed from 'a to 'b.
  6. For each case in the union type, map it to a value of the same case. Copy the (non-generic) first element of the pair over without modification, but compose the function in the second element with the input function to the map function.
Following that recipe, the above interface becomes this union type:

type FaceInstruction<'a> =
| Member1 of (In1 * (Out1 -> 'a))
| Member2 of (In2 * (Out2 -> 'a))

The map function becomes:

// ('a -> 'b) -> FaceInstruction<'a> -> FaceInstruction<'b>
let private mapI f = function
    | Member1 (x, next) -> Member1 (x, next >> f)
    | Member2 (x, next) -> Member2 (x, next >> f)

Such a combination of union type and map function satisfies the functor laws, so that's how you refactor an interface to a functor.

Free monad recipe #

Given any functor, you can create a monad. The monad will be a new type that contains the functor; you will not be turning the functor itself into a monad. (Some functors can be turned into monads themselves, but if that's the case, you don't need to create a free monad.)

The recipe for turning any functor into a monad is as follows:

  1. Create a generic discriminated union. You can name it after the underlying functor, but append a suffix such as Program. In the following, this is called the 'program' union type.
  2. Add two cases to the union: Free and Pure.
  3. The Free case should contain a single value of the contained functor, generically typed to the 'program' union type itself. This is a recursive type definition.
  4. The Pure case should contain a single value of the union's generic type.
  5. Add a bind function for the new union type. The function should take two arguments:
    1. The first argument to the bind function should be a function that takes the generic type argument as input, and returns a value of the 'program' union type as output. In the rest of this recipe, this function is called f.
    2. The second argument to the bind function should be a 'program' union type value.
    3. The return type of the bind function should be a 'program' union type value, with the same generic type as the return type of the first argument (f).
  6. Declare the bind function as recursive by adding the rec keyword.
  7. Implement the bind function by pattern-matching on the Free and Pure cases:
    1. In the Free case, pipe the contained functor value to the functor's map function, using bind f as the mapper function; then pipe the result of that to Free.
    2. In the Pure case, return f x, where x is the value contained in the Pure case.
  8. Add a computation expression builder, using bind for Bind and Pure for Return.
Continuing the above example, the 'program' union type becomes:

type FaceProgram<'a> =
| Free of FaceInstruction<FaceProgram<'a>>
| Pure of 'a

It's worth noting that the Pure case always looks like that. While it doesn't take much effort to write it, you could copy and paste it from another free monad, and no changes would be required.

According to the recipe, the bind function should be implemented like this:

// ('a -> FaceProgram<'b>) -> FaceProgram<'a> -> FaceProgram<'b>
let rec bind f = function
    | Free x -> x |> mapI (bind f) |> Free
    | Pure x -> f x

Apart from one small detail, the bind function always looks like that, so you can often copy and paste it from here and use it in your code, if you will. The only variation is that the underlying functor's map function isn't guaranteed to be called mapI - but if it is, you can use the above bind function as is. No modifications will be necessary.

In F#, a monad is rarely a goal in itself, but once you have a monad, you can add a computation expression builder:

type FaceBuilder () =
    member this.Bind (x, f) = bind f x
    member this.Return x = Pure x
    member this.ReturnFrom x = x
    member this.Zero () = Pure ()

While you could add more members (such as Combine, For, TryFinally, and so on), I find that usually, those four methods are all I need.

Create an instance of the builder object, and you can start writing computation expressions:

let face = FaceBuilder ()

Finally, as an optional step, if you've refactored an interface to an instruction set, you can add convenience functions that lift each instruction case to the free monad type:

  1. For each case, add a function of the same name, but camelCased instead of PascalCased.
  2. Each function should have input arguments that correspond to the first element of the case's contained tuple (i.e. the input argument for the original interface). I usually prefer the arguments in curried form, but that's not a requirement.
  3. Each function should return the corresponding instruction union case inside of the Free case. The case constructor must be invoked with the pair of data it requires. Populate the first element with values from the input arguments to the convenience function. The second element should be the Pure case constructor, passed as a function.
In the current example, that would be two functions, one for each case of FaceInstruction<'a>:

// In1 -> FaceProgram<Out1>
let member1 in1 = Free (Member1 (in1, Pure))
 
// In2 -> FaceProgram<Out2>
let member2 in2 = Free (Member2 (in2, Pure))

Such functions are conveniences that make it easier to express what the underlying functor expresses, but in the context of the free monad.

Interpreters #

A free monad is a recursive type, and values are trees. The leaves are the Pure values. Often (if not always), the point of a free monad is to evaluate the tree in order to pull the leaf values out of it. In order to do that, you must add an interpreter. This is a function that recursively pattern-matches over the free monad value until it encounters a Pure case.

At least in the case where you've refactored an interface to a functor, writing an interpreter also follows a recipe. This is equivalent to writing a concrete class that implements an interface.

  1. For each case in the instruction-set functor, write an implementation function that takes the case's 'input' tuple element type as input, and returns a value of the type used in the case's second tuple element. Recall that the second element in the pair is a function; the output type of the implementation function should be the input type for that function.
  2. Add a function to implement the interpreter; I often call it interpret. Make it recursive by adding the rec keyword.
  3. Pattern-match on Pure and each case contained in Free.
  4. In the Pure case, simply return the value contained in the case.
  5. In the Free case, pattern-match the underlying pair out of each of the instruction-set functor's cases. The first element of that tuple is the 'input value'. Pipe that value to the corresponding implementation function, pipe the return value of that to the function contained in the second element of the tuple, and pipe the result of that recursively to the interpreter function.
Assume that two implementation functions imp1 and imp2 exist. According to the recipe, imp1 has the type In1 -> Out1, and imp2 has the type In2 -> Out2. Given these functions, the running example becomes:

// FaceProgram<'a> -> 'a
let rec interpret = function
    | Pure x -> x
    | Free (Member1 (x, next)) -> x |> imp1 |> next |> interpret
    | Free (Member2 (x, next)) -> x |> imp2 |> next |> interpret

The Pure case always looks like that. Each of the Free cases uses a different implementation function, but apart from that, they are, as you can tell, the spitting image of each other.

Interpreters like this are often impure because the implementation functions are impure. Nothing prevents you from defining pure interpreters, although they often have limited use. They do have their place in unit testing, though.

// Out1 -> Out2 -> FaceProgram<'a> -> 'a
let rec interpretStub out1 out2 = function
    | Pure x -> x
    | Free (Member1 (_, next)) -> out1 |> next |> interpretStub out1 out2
    | Free (Member2 (_, next)) -> out2 |> next |> interpretStub out1 out2

This interpreter effectively ignores the input value contained within each Free case, and instead uses the pure values out1 and out2. This is essentially a Stub - an 'implementation' that always returns pre-defined values.

The point is that you can have more than a single interpreter, pure or impure, just like you can have more than one implementation of an interface.

When to use it #

Free monads are often used instead of Dependency Injection. Note, however, that while the free monad values themselves are pure, they imply impure behaviour. In my opinion, the main benefit of pure code is that, as a code reader and maintainer, I don't have to worry about side-effects if I know that the code is pure. With a free monad, I do have to worry about side-effects, because, although the ASTs are pure, an impure interpreter will cause side-effects to happen. At least, however, the side-effects are known; they're restricted to a small subset of operations. Haskell enforces this distinction, but F# doesn't. The question, then, is how valuable you find this sort of design.

I think it still has some value, because a free monad explicitly communicates an intent of doing something impure. This intent becomes encoded in the types in your code base, there for all to see. Just as I prefer that functions return 'a option values if they may fail to produce a value, I like that I can tell from a function's return type that a delimited set of impure operations may result.

Clearly, creating free monads in F# requires some boilerplate code. I hope that this article has demonstrated that writing that boilerplate code isn't difficult - just follow the recipe. You almost don't have to think. Since a monad is a universal abstraction, once you've written the code, it's unlikely that you'll need to deal with it much in the future. After all, mathematical abstractions don't change.

Perhaps a more significant concern is how familiar free monads are to developers of a particular code base. Depending on your position, you could argue that free monads come with high cognitive overhead, or that they specifically lower the cognitive overhead.

Insights are obscure until you grasp them; after that, they become clear.

This applies to free monads as well. You have to put effort into understanding them, but once you do, you realise that they are more than a pattern. They are universal abstractions, governed by laws. Once you grok free monads, their cognitive load wanes.

Consider, then, the developers who will be interacting with the free monad. If they already know free monads, or have enough of a grasp of monads that this might be their next step, then using free monads could be beneficial. On the other hand, if most developers are new to F# or functional programming, free monads should probably be avoided for the time being.

This flowchart summarises the above reflections:

Decision flowchart for whether or not to choose free monads as a design principle.

Your first consideration should be whether your context enables an impure/pure/impure sandwich. If so, there's no reason to make things more complicated than they have to be. To use Fred Brooks' terminology, this should go a long way to avoid accidental complexity.

If you can't avoid long-running, impure interactions, then consider whether purity, or strictly functional design, is important to you. F# is a multi-paradigmatic language, and it's perfectly possible to write code that's impure, yet still well-structured. You can use partial application as an idiomatic alternative to Dependency Injection.

If you prefer to keep your code functional and explicit, you may consider using free monads. In this case, I still think you should consider the maintainers of the code base in question. If everyone involved is comfortable with free monads, or willing to learn, then I believe it's a viable option. Otherwise, I'd recommend falling back to partial application, even though Dependency Injection makes everything impure.

Motivating examples #

The strongest motivation, I believe, for introducing free monads into a code base is to model long-running, impure interactions in a functional style.

Like most other software design considerations, the overall purpose of application architecture is to deal with (essential) complexity. Thus, any example must be complex enough to warrant the design. There's little point in a Dependency Injection hello world example in C#. Likewise, a hello world example using free monads hardly seems justified. For that reason, examples are provided in separate articles.

A good place to start, I believe, is with the small Pure times article series. These articles show how to address a particular, authentic problem using strictly functional programming. The focus of these articles is on problem-solving, so they sometimes omit detailed explanations in order to keep the narrative moving.

If you need detailed explanations about all elements of free monads in F#, the present article series offers just that, particularly the Hello, pure command-line interaction article.

Variations #

The above recipes describe the regular scenario. Variations are possible. Obviously, you can choose different naming strategies and so on, but I'm not going to cover this in greater detail.

There are, however, various degenerate cases that deserve a few words. An interaction may return no data, or take no input. In F#, you can always model the lack of data as unit (()), so it's definitely possible to define an instruction case like Foo of (unit * (Out1 -> 'a)), or Bar of (In2 * (unit -> 'a)), but since unit doesn't contain any data, you can remove it without changing the abstraction.

The Hello, pure command-line interaction article contains a single type that exemplifies both degenerate cases. It defines this instruction set:

type CommandLineInstruction<'a> =
| ReadLine of (string -> 'a)
| WriteLine of string * 'a

The ReadLine case takes no input, so instead of containing a pair of input and continuation, this case contains only the continuation function. Likewise, the WriteLine case is also degenerate, but here, there's no output. This case does contain a pair, but the second element isn't a function, but a value.
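To make the claim about removing unit more tangible, here's a minimal sketch. It assumes a hypothetical VerboseInstruction type that keeps the unit values, and shows that you can convert back and forth between that and the compact CommandLineInstruction without losing information. The names VerboseInstruction, ReadLineV, WriteLineV, toCompact, and toVerbose are invented for this illustration; they aren't part of the CommandLine API.

```fsharp
// Hypothetical 'verbose' instruction set that keeps the unit values.
type VerboseInstruction<'a> =
| ReadLineV of (unit * (string -> 'a))
| WriteLineV of (string * (unit -> 'a))

// The compact instruction set from the article.
type CommandLineInstruction<'a> =
| ReadLine of (string -> 'a)
| WriteLine of string * 'a

// VerboseInstruction<'a> -> CommandLineInstruction<'a>
let toCompact = function
    | ReadLineV ((), next) -> ReadLine next
    // Calling next with () is safe as long as next is pure.
    | WriteLineV (s, next) -> WriteLine (s, next ())

// CommandLineInstruction<'a> -> VerboseInstruction<'a>
let toVerbose = function
    | ReadLine next -> ReadLineV ((), next)
    | WriteLine (s, x) -> WriteLineV (s, fun () -> x)

// Round-trip a sample value and inspect it with pattern matching.
let roundTripped =
    match WriteLine ("foo", 42) |> toVerbose |> toCompact with
    | WriteLine (s, x) -> Some (s, x)
    | ReadLine _ -> None
```

Here, roundTripped is Some ("foo", 42). Since both conversions exist, the two representations carry the same information, which is the sense in which the unit values are redundant.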

This has some superficial consequences for the implementation of functor and monad functions. For example, the mapI function becomes:

// ('a -> 'b) -> CommandLineInstruction<'a> -> CommandLineInstruction<'b>
let private mapI f = function
    | ReadLine next -> ReadLine (next >> f)
    | WriteLine (x, next) -> WriteLine (x, next |> f)

Notice that in the ReadLine case, there's no tuple on which to pattern-match. Instead, you can directly access next.

In the WriteLine case, the return value changes from function composition (next >> f) to a regular function call (next |> f, which is equivalent to f next).
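As a small sketch of how the two cases behave, you can map a couple of sample instructions and inspect the results. The values r and w, and the doubling function, are hypothetical examples, and the sketch omits the private modifier so that it can run as a stand-alone script.

```fsharp
type CommandLineInstruction<'a> =
| ReadLine of (string -> 'a)
| WriteLine of string * 'a

// ('a -> 'b) -> CommandLineInstruction<'a> -> CommandLineInstruction<'b>
let mapI f = function
    | ReadLine next -> ReadLine (next >> f)
    | WriteLine (x, next) -> WriteLine (x, next |> f)

// Hypothetical sample values, only for illustration.
let r = ReadLine String.length |> mapI (fun n -> n * 2)
let w = WriteLine ("foo", 21) |> mapI (fun n -> n * 2)

// For ReadLine, the mapping is deferred until the continuation runs.
let fromRead =
    match r with
    | ReadLine next -> next "abc"
    | WriteLine (_, x) -> x

// For WriteLine, the contained value has already been mapped.
let fromWrite =
    match w with
    | WriteLine (_, x) -> x
    | ReadLine next -> next ""
```

In this sketch, fromRead is 6 (the length of "abc", doubled), and fromWrite is 42.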

The lift functions also change:

// CommandLineProgram<string>
let readLine = Free (ReadLine Pure)
 
// string -> CommandLineProgram<unit>
let writeLine s = Free (WriteLine (s, Pure ()))

Since there's no input, readLine degenerates to a value, instead of a function. On the other hand, while writeLine remains a function, you'll have to pass a value (Pure ()) as the second element of the pair, instead of the regular function (Pure).

Apart from such minor changes, the omission of unit values for input or output has little significance.
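The following sketch shows the lifted value and function in use. It assumes that CommandLineProgram and bind are defined according to the recipe; the echo program and the describe function are hypothetical helpers added only to illustrate the point.

```fsharp
type CommandLineInstruction<'a> =
| ReadLine of (string -> 'a)
| WriteLine of string * 'a

let mapI f = function
    | ReadLine next -> ReadLine (next >> f)
    | WriteLine (x, next) -> WriteLine (x, next |> f)

type CommandLineProgram<'a> =
| Free of CommandLineInstruction<CommandLineProgram<'a>>
| Pure of 'a

let rec bind f = function
    | Free instruction -> instruction |> mapI (bind f) |> Free
    | Pure x -> f x

// CommandLineProgram<string>
let readLine = Free (ReadLine Pure)

// string -> CommandLineProgram<unit>
let writeLine s = Free (WriteLine (s, Pure ()))

// Hypothetical program: read a line, then write it back.
// CommandLineProgram<unit>
let echo = readLine |> bind (fun s -> writeLine ("You said: " + s))

// Hypothetical helper that names the first instruction of a program.
// CommandLineProgram<'a> -> string
let describe program =
    match program with
    | Pure _ -> "done"
    | Free (WriteLine (s, _)) -> "write " + s
    | Free (ReadLine _) -> "read"
```

Here, describe readLine and describe echo both return "read", while describe (writeLine "foo") returns "write foo". Nothing is read or written; the programs are only data until an interpreter runs them.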

Another variation from the above recipe that you may see relates to interpreters. In the above recipe, I described how, for each instruction, you should create an implementation function. Sometimes, however, that function is only a few lines of code. When that happens, I occasionally inline the function directly in the interpreter. Once more, the CommandLineProgram API provides an example:

// CommandLineProgram<'a> -> 'a
let rec interpret = function
    | Pure x -> x
    | Free (ReadLine next) -> Console.ReadLine () |> next |> interpret
    | Free (WriteLine (s, next)) ->
        Console.WriteLine s
        next |> interpret

Here, no custom implementation functions are required, because Console.ReadLine and Console.WriteLine already exist and serve the desired purpose.
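Because a program is only data, nothing prevents you from writing other interpreters. As a minimal sketch, assuming you'd like to exercise a program without touching the console, a pure interpreter could take the input lines as a list and collect the output lines. The interpretPure function and the greet program are hypothetical; they aren't part of the CommandLine module, and treating exhausted input as an empty string is an arbitrary choice made for this example.

```fsharp
type CommandLineInstruction<'a> =
| ReadLine of (string -> 'a)
| WriteLine of string * 'a

let mapI f = function
    | ReadLine next -> ReadLine (next >> f)
    | WriteLine (x, next) -> WriteLine (x, next |> f)

type CommandLineProgram<'a> =
| Free of CommandLineInstruction<CommandLineProgram<'a>>
| Pure of 'a

let rec bind f = function
    | Free instruction -> instruction |> mapI (bind f) |> Free
    | Pure x -> f x

let readLine = Free (ReadLine Pure)
let writeLine s = Free (WriteLine (s, Pure ()))

// Hypothetical pure interpreter: feeds inputs, accumulates outputs.
// string list -> string list -> CommandLineProgram<'a> -> 'a * string list
let rec interpretPure inputs outputs = function
    | Pure x -> x, List.rev outputs
    | Free (WriteLine (s, next)) ->
        next |> interpretPure inputs (s :: outputs)
    | Free (ReadLine next) ->
        match inputs with
        | line :: rest -> next line |> interpretPure rest outputs
        // Assumption: exhausted input reads as an empty line.
        | [] -> next "" |> interpretPure [] outputs

// Hypothetical program: prompt, read a name, greet.
let greet =
    writeLine "Name?"
    |> bind (fun () -> readLine)
    |> bind (fun name -> writeLine ("Hello, " + name))

let result, written = interpretPure ["Mark"] [] greet
```

With the input ["Mark"], written is ["Name?"; "Hello, Mark"], and result is (). The same greet value could be passed to the impure interpret function to run it against the real console.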

Summary #

This article describes a repeatable, and automatable, process for refactoring an interface to a free monad. I've done this enough times now that I believe that this process is always possible, but I have no formal proof for this.

I also strongly suspect that the reverse process is possible. For any instruction set elevated to a free monad, I think you should be able to define an object-oriented interface. If this is true, then object-oriented interfaces and AST-based free monads are isomorphic.


Comments

Hello Mark. I am trying to understand what is going on.
So basically the Free Monad allows us to separate pure code from impure code even when the impure/pure/impure sandwich idea is not possible to implement. Right?
We want to separate pure and impure code for these reasons: (1) Easier testing (2) Reasoning about pure code is easier than impure code (3) making impure code explicit makes it easier to understand programs. Is this correct?
What I am still trying to figure out is why we can't simply do this with Dependency Injection?
We can separate all units of behavior into pure ones and impure ones (e.g. functions), and then compose them all in the Composition Root. Pure units take no dependencies, they take in "direct input" and give back "direct output" as you describe in one of your blog posts.
To make the impure code explicit and clear, we can make the root method in the Composition Root construct all impure units of behavior first (e.g. adapters to the external world) and then inject them into a method that bakes these dependencies with the rest of pure code. E.g.: public static IApplication CreateApplication(IImpureDependency1 dependency1, IImpureDependency2 dependency2) => { //compose graph here}
If you have sub methods that the CreateApplication method uses for modularizing the Composition Root, they will also take any impurities they need as parameters.
So in summary, only the Composition Root knows about the impure parts of the application and they are explicitly stated as parameters in the Composition Root methods.
Doesn't this solve the impure/pure separation issue?
For example, to test, you can easily call the CreateApplication method and pass the fake (pure) dependencies. This will make the whole graph pure in the test.
Also, the Composition Root would make it clear which impure dependencies each component in the system depends on.
Am I missing something?

2018-05-03 21:42 UTC

Hello Yacoub, thank you for writing. Your summary of the motivations covers most of them. The reason that purity interests me is that it forces me (and everyone else) to consider decoupling. One day, I should write a more explicit article about this, but I believe that the general problem with programming today has little to do with writing code, but with reading it. Until I get such an article written, I can only refer to my Humane Code video, and perhaps my recent appearance on .NET Rocks!. What fundamentally interests me is how to break down code into small enough chunks that they fit in our brains at all levels of abstraction. Purity, and functional programming in general, attracts me because it offers a principled way of doing that.

If we forget about functional programming and free monads for a while, we could ask a question similar to yours about Dependency Injection (DI). Why should we use Dependency Injection? Can't we just, say, call a database when we need some data? Technically, we can, but we deliberately invert the control of our code so that it becomes easier to break apart into smaller chunks. You may find this observation trivial, but it wasn't ten years ago, and I made much effort in my book to explain the benefits of DI.

The problem with DI is that at detailed levels of abstraction, DI-based code may fit in our brains, but at higher levels of abstraction the complexity still increases. Put another way, understanding a single class that receives a few dependencies is easy. Getting a high-level, big-picture understanding of a DI-based code base can still be quite the challenge. At a high level of abstraction, the moving parts in underlying components are still too visible, you could say.

Strictly functional programming interests me because, by pushing impure behaviours to the boundaries of the application, the pure core of an application becomes easier to treat as a hierarchy of abstractions. (I really need to write an article with diagrams about this some day.)

What's strictly functional programming? It's code that obeys the rule that pure code can't call impure code. The reason I find Haskell so interesting is that the compiler enforces that rule. Code isn't pure if it calls impure functions, and in Haskell, the code simply will not compile if you attempt to do that.

F#, on the other hand, doesn't work like that. There's no compile-time check of whether the code is pure or impure. Thus, when you pass functions to other functions, your higher-order function could look pure, but since you don't know what an 'injected' function does, you really don't know if it's pure or not. In F#, all it takes is a single call to, say, DateTime.Now, Guid.NewGuid(), or similar, deep in your system, and that makes the entire code base impure!

The only way to prevent that in F# is by diligence.

That's a roundabout answer to your question. The gist of it, though, is that in F#, you rarely need free monads. If you find yourself in the situation where a free monad would be required in Haskell, you could just as well use DI, or rather, partial application. My article on that approach explains how this works in F#, but also why it doesn't work in Haskell. When you inject impure behaviour into an 'otherwise' pure function, then everything becomes impure.

This is where F# differs from Haskell. In Haskell, such an attempt simply doesn't compile. In F#, an otherwise pure function suddenly becomes impure. If you mostly care about that distinction because of, say, testability, then that's not a problem, because when you 'inject' pure behaviour, then the composed function is still pure, and thus trivial to unit test.

The entire system is still impure with that design, though, and that can make it difficult to fit the entire application behaviour in our brains.

I'm afraid this answer doesn't help. I'll have to write a more coherent article on this some day, but I wanted to leave this here because, realistically, a more coherent article isn't part of my immediate plans.

2018-05-06 12:41 UTC

Hello Mark. Thanks for the reply and for providing the links. I have already watched your Humane Code videos at clean coders before. Will listen to the podcast too.

I understand that with the free monad, you can maintain the rule that pure code will never call impure code.

This is one goal.

However, as you describe, this by itself is not the final goal. We want to achieve this goal as a means to achieve other goals. For example, we want our code to be easier to reason about.

As you describe, we cannot achieve the first goal using DI (or partial application). And in Haskell, the compiler will prevent us from even trying.

However, I think you agree with me that there is still some great value in separating "pure" and impure code in different functions or classes, and then combining them in the Composition Root. This is basically Command Query Separation + DI. Although the graph as a whole is impure, some benefit (e.g. easier to reason about code) is still there as a result of the separation.

What I am trying to argue (or let me say think about and discuss) is that if one does the following:

  1. Separate impure and pure behavior at the level of individual units of behavior (e.g. functions or classes).
  2. Compose these units at the Composition Root (only the Composition Root knows about the impure units).
  3. In the Composition Root, first all impure units are created/prepared, and then injected into "pure" (now not pure) units.
  4. Make impure dependencies explicit as parameters in the "Create" methods of the Composition Root. (Basically "Create" methods are a way to modularize the Composition Root. I describe what I mean in more detail here.)
then there is not much value in moving from what I just described to using Free monads just to make the Haskell compiler happy :).

Or is there something that I am missing?

Basically, if we forget for a moment about the first goal (since it is only a means to other goals), what goals will we not be achieving?

In your reply, I can find the following that might answer these questions:
"Getting a high-level, big-picture understanding of a DI-based code base can still be quite the challenge. At a high level of abstraction, the moving parts in underlying components are still too visible"

But I can't understand what you mean here. What is the problem here? and how does the Free monad fix it?

I hope I was able to explain my ideas correctly.

2018-05-06 21:19 UTC

Reading my comment again, I would like to add/update a few things.

Regarding CQS, this is not exactly the same as separating impure and pure code. Still, a query can be impure (like one that reads from the database). Such query can be separated into a set of pure and impure queries. Also, a command can have some pure logic in it that can be extracted into a separate pure query (or queries). But, CQS is a step in the right direction towards this and it is a good example of how separation at some level has benefits of its own.

I would like to explain also that the steps I describe in my comment aim basically to delay the composition of pure and impure code to the last possible moment. So basically, all pure logic is composed first (parameterized with functions/delegates/interfaces representing possibly impure code). After that, impure code will be injected into such pure graph rendering it impure of course.

So basically, imagine an imaginary version of Haskell that would allow the root method of an application to allow “pure” code to call impure code.

2018-05-10 19:59 UTC

Here is a concrete example. Imagine these three pure functions:

(A, Func<C,D> dep1, Func<E,F> dep2) => B (1)

(C, Func<G,H> dep3) => D (2)

(G, Func<I,J> dep4) => H (3)

Now, in the Composition Root, we "compose" these together to get the following:

(A, Func<I,J> dep4, Func<E,F> dep2) => B

So far, this is a pure function; we haven't injected any impurities in it. Thinking about this, this might be a special case of dependency injection. We might call it dependency replacing or something like that.
What I have done is "inject" function #2 as dep1 in function #1. But this is not fully injected. I replaced "dep1" with "dep3".
Then, I "inject" function #3 as dep3. Again, this is not full injection as I replace it with "dep4".

Now, after all "pure" functions have been baked together, I inject the impure "dep4" and "dep2" to get this:

A => B

I hope the code gets displayed correctly in the comment.

2018-05-14 07:30 UTC

Yacoub, thank you for the pseudo-code. That makes it easier to discuss things.

Your premise is that functions 1, 2, and 3 are pure. The rest of the argument rests on whether or not they are. Just to be sure that we share the same terminology, I take pure to mean referentially transparent. Nothing you've written gives me any indication that this isn't your interpretation as well, so I mostly include this as an explicit definition for the benefit of other readers who may happen upon this discussion in the future.

It's clear that a function (or method) that adds two numbers together is pure. This also applies to any other first-order function with isolation. I use the word isolation as described by Jessica Kerr: A function has the property of isolation when the only information it has about the external world is passed into it via arguments.

You can write arbitrarily complex isolated functions in, say, C#:

public static DateTime Foo(int year, string month)
{
    if (year < 1)
        return DateTime.MinValue;
    if (9999 < year)
        return DateTime.MaxValue;
 
    if (!int.TryParse(month, out int imonth))
        imonth = 7;
    if (imonth < 1)
        imonth = 1;
    if (12 < imonth)
        imonth = 12;
 
    var day = month.Length;
    if (day < 1)
        day = 10;
    if (28 < day)
        day = 20;
 
    return new DateTime(year, imonth, day);
}

To be clear, this Foo method makes no sense, but it is, as far as I can tell, pure; it operates entirely on its input.

Consider, however, this variation:

public static DateTime Foo(int year, string month)
{
    if (year < 1)
        return DateTime.MinValue;
    if (9999 < year)
        return DateTime.MaxValue;
 
    if (!int.TryParse(month, out int imonth))
        imonth = 7;
    if (imonth < 1)
        imonth = 1;
    if (12 < imonth)
        imonth = 12;
 
    var day = month.Length;
    if (day < 1)
        day = 10;
    if (DateTime.DaysInMonth(year, imonth) < day)
        day = 20;
 
    return new DateTime(year, imonth, day);
}

Notice that DateTime.DaysInMonth(year, imonth) replaces the hard-coded value 28. Is this variation pure?

I don't know. In order to figure that out, we'd need to understand if DateTime.DaysInMonth is pure. Does it use a hard-coded table or algorithm of leap years, or does it use a call to the operating system (OS)? If the latter, does the OS base its functionality on a pure implementation, or does it look up the information in some resource (like the Windows Registry)?

With leap years, and for the Gregorian calendar, a pure algorithm exists, but imagine that we create a similar nonsense function that creates DateTimeOffset values, including time and time-zone offsets. In this case, figuring out if a value is valid relies on external data, since rules about daylight saving time are political and subject to change.

My point is that without a machine tool (such as a type system) to guide us, it's practically impossible to reason about the purity of code.

To make matters worse, as soon as you pass a function as an argument to another function, all bets are off. Even if you've diligently reviewed functions like 1, 2, and 3 above for purity, they're only pure if dep2 and dep4 are pure as well.

Haskell takes away all that angst related to purity by enforcing it via its type system. This liberates us to worry about other things, because the compiler has our backs regarding purity.

In C#, F#, Java, and most other languages, we get no such guarantees. As I've tried to demonstrate above, I'd regard all non-trivial code to be impure. All it takes is one system call, Guid.NewGuid(), random.Next(), DateTime.Now, log.Warning("foo"), etc. to make all code transitively calling such a statement impure. This is, realistically, impossible to prevent.

Do we care, then? What if the functions 1, 2, and 3 are 'pure enough'?

In an analogy to this discussion, in RESTful design, GET requests should be side-effect free. Almost all web servers, however, log HTTP requests, so GET requests are never side-effect free. The interpretation used in that context, therefore, is that GET requests should be free of side effects for which the client is responsible.

You can have a similar discussion about functional programming. What if a function logs debug information? Does that change the observable state of the system?

In any case, before even beginning to discuss whether dependency injection or partial application is functional, we need to make it clear why we care about purity.

I care about purity because it eliminates entire classes of bugs. It also means that I don't have to log what happens inside my pure code; as long as I log what happens at the impure boundary, I can always reproduce the result of a pure computation. All this makes the overall code simpler. Logging, caching, instrumentation. Many cross-cutting concerns either disappear or greatly simplify.

Returning to the overall discussion related to this article, free monads are one way to separate pure code from impure code. What you suggest, though, isn't pure, because all it takes to make the entire composition impure is that dep2 or dep4 are impure (or one of the 'pure' functions turning out to be impure after all). It's Dependency Injection, only you replace interfaces with delegates.

Does it matter? Probably not. Trying to keep things 'as pure as possible' in C# and similar languages could still provide benefits. That's how I approach F#. Ultimately, the goal is to make the code sustainable. If you can do that with Dependency Injection or partial application, then the mission is accomplished.

In Haskell, free monads are sometimes required, but in F#, it's a specialised design I'd only reach for in niche situations.

2018-05-15 8:36 UTC
Nikolay Terletskyi #

Hello! I just want to add my humble opinion to Mark and Yacoub's discussion. There is something that you can't achieve with partial application.

Imagine that you have a pipeline that processes some entity, and that if some conditions are met, you need another one. The ID of the second entity is a field of the first.

So you can't just pass the second entity as a parameter, because you aren't sure that it's needed. You can pass a function that gives you the entity.

But what is the return type of this function? SecondEntetyType, Async<SecondEntetyType>, or Task<SecondEntetyType>? What if you use a library with a callback interface to load this entity?

Should you have to care about that in order to declare the relation between the first and second entities?

Without a free monad, the answer is yes!

That is the main benefit of free monads for me.

2019-06-06 5:30 UTC
Nick Dunets #

Hi, I had the same questions as Yacoub i.e. how is Free any better than raw Dependency Injection?

After some research I can see at least a couple of advantages. Even if the code is messy, with pure and impure parts interleaved chaotically, and a function doesn't reduce to a simple tree and therefore can't serve as a convincing test case without being further interpreted, there are still at least two advantages over DI:

1. No need to pass extra parameters representing the abstraction of impure code all over the place

2. Async aspect doesn't leak: e.g. WriteLine case from the article's example could have been interpreted as Console.Out.WriteLineAsync() - why not? But the "pure" core would still be decoupled from the async aspect.

2019-07-09 20:15 UTC
Romain Deneau @DeneauRomain #

Thank you Mark for these high quality articles. I was wondering if it wouldn't be more relevant to talk about Operations rather than Members in the interface:

public interface IFace
{
    Out1 Operation1(In1 input);
    Out2 Operation2(In2 input);
}

Indeed, a dependency is needed in order to perform some (impure?) operations to be delegated to another object, in another layer or to follow the Single Responsibility Principle. Also it makes more sense to have operations rather than "members" in an instruction:

type FaceInstruction<'a> =
| Operation1 of (In1 * (Out1 -> 'a))
| Operation2 of (In2 * (Out2 -> 'a))

On the other hand, taking another SOLID principle, the Interface Segregation Principle, to the extreme, each operation may be split into its own interface to be injected into the object. I think it doesn't change your recipe: putting all operations in the same instruction set / union type. What do you think about it?

2019-10-11 20:48 UTC

Romain, thank you for writing. In addition to members, we could call them operations, or actions. I chose member because it's established C# terminology when you're talking about the united set of methods, properties, and events defined by a type such as an interface.

If you approach free monads from functional programming, we wouldn't call them members, but rather functions.

I chose to start with the term member because I surmised that this would be the term with which most readers would be familiar. Since the article starts with those names, I chose to keep the same terms all the way through so that the reader would be able to follow the various steps in the recipe.

With regards to the SOLID principles, the logical conclusion is to have lots of one-method interfaces. You can have one-function free monads as well, but combining them involves much plumbing work in F#. This is much easier in Haskell.

2019-10-12 23:20 UTC

Combining free monads in F#

Monday, 31 July 2017 12:30:00 UTC

An example of how to compose free monads in F#.

This article is an instalment in a series of articles about modelling long-running interactions with pure, functional code. In the previous article, you saw how to combine a pure command-line API with an HTTP-client API in Haskell. In this article, you'll see how to translate the Haskell proof of concept to F#.

HTTP API client module #

You've already seen how to model command-line interactions as pure code in a previous article. You can define interactions with the online restaurant reservation HTTP API in the same way. First, define some types required for input and output to the API:

type Slot = { Date : DateTimeOffset; SeatsLeft : int }
 
type Reservation = {
    Date : DateTimeOffset
    Name : string
    Email : string
    Quantity : int }

The Slot type contains information about how many available seats are left on a particular date. The Reservation type contains the information required in order to make a reservation. It's the same Reservation F# record type you saw in a previous article, but now it's moved here.

The online restaurant reservation HTTP API may afford more functionality than you need, but there's no reason to model more instructions than required:

type ReservationsApiInstruction<'a> =
| GetSlots of (DateTimeOffset * (Slot list -> 'a))
| PostReservation of Reservation * 'a

This instruction set models two interactions. The GetSlots case models an instruction to request, from the HTTP API, the slots for a particular date. The PostReservation case models an instruction to make a POST HTTP request with a Reservation, thereby making a reservation.

While Haskell can automatically make this type a Functor, in F# you have to write the code yourself:

// ('a -> 'b) -> ReservationsApiInstruction<'a>
// -> ReservationsApiInstruction<'b>
let private mapI f = function
    | GetSlots (x, next) -> GetSlots (x, next >> f)
    | PostReservation (x, next) -> PostReservation (x, next |> f)

This turns ReservationsApiInstruction<'a> into a functor, which is, however, not the ultimate goal. The final objective is to enable syntactic sugar, so that you can write pure ReservationsApiInstruction<'a> Abstract Syntax Trees (ASTs) in standard F# syntax. In order to fulfil that ambition, you need a computation expression builder, and to create one of those, you need a monad.

You can turn ReservationsApiInstruction<'a> into a monad using the free monad recipe that you've already seen. Creating a free monad, however, involves adding another type that will become both monad and functor, so I deliberately make mapI private in order to prevent confusion. This is also the reason I didn't name the function map: you'll need that name for a different type. The I in mapI stands for instruction.

The mapI function pattern-matches on the (implicit) ReservationsApiInstruction argument. In the GetSlots case, it returns a new GetSlots value, but composes the next continuation with f. In the PostReservation case, it returns a new PostReservation value, but pipes next to f. The reason for the difference is that PostReservation is degenerate: next isn't a function, but a value.

Now that ReservationsApiInstruction<'a> is a functor, you can create a free monad from it. The first step is to introduce a new type for the monad:

type ReservationsApiProgram<'a> =
| Free of ReservationsApiInstruction<ReservationsApiProgram<'a>>
| Pure of 'a

This is a recursive type that enables you to assemble ASTs that ultimately can return a value. The Pure case enables you to return a value, while the Free case lets you describe what should happen next.

Using mapI, you can make a monad out of ReservationsApiProgram<'a> by adding a bind function:

// ('a -> ReservationsApiProgram<'b>) -> ReservationsApiProgram<'a>
// -> ReservationsApiProgram<'b>
let rec bind f = function
    | Free instruction -> instruction |> mapI (bind f) |> Free
    | Pure x -> f x

If you refer back to the bind implementation for CommandLineProgram<'a>, you'll see that it's the exact same code. In Haskell, creating a free monad from a functor is automatic. In F#, it's boilerplate.

Likewise, you can make ReservationsApiProgram<'a> a functor:

// ('a -> 'b) -> ReservationsApiProgram<'a> -> ReservationsApiProgram<'b>
let map f = bind (f >> Pure)

Again, this is the same code as in the CommandLine module. You can copy and paste it. It is, however, not the same function, because the types are different.

Finally, to round off the reservations HTTP client API, you can supply functions that lift instructions to programs:

// DateTimeOffset -> ReservationsApiProgram<Slot list>
let getSlots date = Free (GetSlots (date, Pure))
 
// Reservation -> ReservationsApiProgram<unit>
let postReservation r = Free (PostReservation (r, Pure ()))

That's everything you need to create a small computation expression builder:

type ReservationsApiBuilder () =
    member this.Bind (x, f) = ReservationsApi.bind f x
    member this.Return x = Pure x
    member this.ReturnFrom x = x
    member this.Zero () = Pure ()

Create an instance of the ReservationsApiBuilder class in order to use reservationsApi computation expressions:

let reservationsApi = ReservationsApiBuilder ()

This, in total, defines a pure API for interacting with the online restaurant reservation system, including all the syntactic sugar you'll need to stay sane. As usual, some boilerplate code is required, but I'm not too worried about its maintenance overhead, as it's unlikely to change much, once you've added it. If you've followed the recipe, the API obeys the category, functor, and monad laws, so it's not something you've invented; it's an instance of a universal abstraction.
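To give an impression of what the syntactic sugar buys you, here's a minimal sketch of a program written with the reservationsApi computation expression, together with a stub interpreter. It repeats the definitions as one stand-alone script, so bind is called without the ReservationsApi module prefix and mapI isn't private. The tryReserve program and the interpretStub function are hypothetical; they only illustrate that a program is a pure value that you can evaluate without an HTTP client.

```fsharp
open System

type Slot = { Date : DateTimeOffset; SeatsLeft : int }

type Reservation = {
    Date : DateTimeOffset
    Name : string
    Email : string
    Quantity : int }

type ReservationsApiInstruction<'a> =
| GetSlots of (DateTimeOffset * (Slot list -> 'a))
| PostReservation of Reservation * 'a

let mapI f = function
    | GetSlots (x, next) -> GetSlots (x, next >> f)
    | PostReservation (x, next) -> PostReservation (x, next |> f)

type ReservationsApiProgram<'a> =
| Free of ReservationsApiInstruction<ReservationsApiProgram<'a>>
| Pure of 'a

let rec bind f = function
    | Free instruction -> instruction |> mapI (bind f) |> Free
    | Pure x -> f x

let getSlots date = Free (GetSlots (date, Pure))
let postReservation r = Free (PostReservation (r, Pure ()))

type ReservationsApiBuilder () =
    member this.Bind (x, f) = bind f x
    member this.Return x = Pure x

let reservationsApi = ReservationsApiBuilder ()

// Hypothetical program: reserve only if enough seats are left.
// DateTimeOffset -> string -> int -> ReservationsApiProgram<bool>
let tryReserve date name quantity = reservationsApi {
    let! slots = getSlots date
    let seats = slots |> List.sumBy (fun (s : Slot) -> s.SeatsLeft)
    if quantity <= seats then
        let r = { Date = date; Name = name; Email = ""; Quantity = quantity }
        do! postReservation r
        return true
    else
        return false }

// Hypothetical stub interpreter: answers GetSlots with canned slots
// and records every posted reservation instead of sending it.
let rec interpretStub slots posted = function
    | Pure x -> x, List.rev posted
    | Free (GetSlots (_, next)) -> next slots |> interpretStub slots posted
    | Free (PostReservation (r, next)) ->
        next |> interpretStub slots (r :: posted)

let date = DateTimeOffset (2017, 7, 31, 0, 0, 0, TimeSpan.Zero)
let slot = { Slot.Date = date; SeatsLeft = 2 }
let accepted, posted = tryReserve date "Foo" 2 |> interpretStub [slot] []
let rejected, notPosted = tryReserve date "Foo" 3 |> interpretStub [slot] []
```

With two seats left, accepted is true and posted contains a single reservation, whereas a request for three seats yields false and an empty list.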

Monad stack #

The addition of the above ReservationsApi module is only a step towards the overall goal, which is to write a command-line wizard you can use to make reservations against the online API. In order to do so, you must combine the two monads CommandLineProgram<'a> and ReservationsApiProgram<'a>. In Haskell, you get that combination for free via the built-in generic FreeT type, which enables you to stack monads. In F#, you have to explicitly declare the type:

type CommandLineReservationsApiT<'a> =
| Run of CommandLineProgram<ReservationsApiProgram<'a>>

This is a single-case discriminated union that stacks ReservationsApiProgram and CommandLineProgram. In this incarnation, it defines a single case called Run. The reason for this is that it enables you to follow the free monad recipe without having to do much thinking. Later, you'll see that it's possible to simplify the type.

The naming is inspired by Haskell. This type is a piece of the puzzle corresponding to Haskell's FreeT type. The T in FreeT stands for transformer, because FreeT is actually something called a monad transformer. That's not terribly important in an F# context, but that's the reason I also tagged on the T in CommandLineReservationsApiT<'a>.

FreeT is actually only a 'wrapper' around another monad. In order to extract the contained monad, you can use a function called runFreeT. That's the reason I called the F# case Run.

You can easily make your stack of monads a functor:

// ('a -> 'b) -> CommandLineProgram<ReservationsApiProgram<'a>>
// -> CommandLineProgram<ReservationsApiProgram<'b>>
let private mapStack f x = commandLine {
    let! x' = x
    return ReservationsApi.map f x' }

The mapStack function uses the commandLine computation expression to access the ReservationsApiProgram contained within the CommandLineProgram. Thanks to the let! binding, x' is a ReservationsApiProgram<'a> value. You can use ReservationsApi.map to map x' with f.

It's now trivial to make CommandLineReservationsApiT<'a> a functor as well:

// ('a -> 'b) -> CommandLineReservationsApiT<'a>
// -> CommandLineReservationsApiT<'b>
let private mapT f (Run p) = mapStack f p |> Run

The mapT function simply pattern-matches the monad stack out of the Run case, calls mapStack, and pipes the return value into another Run case.
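Seen from Haskell, these two map functions amount to mapping the inner functor inside the outer one. The following is only a sketch of that idea, not a translation of the F# code: Maybe and list are stand-ins for CommandLineProgram and ReservationsApiProgram, and StackT and unRun are hypothetical names introduced for illustration.

```haskell
-- A stack of two functors is itself a functor:
-- map the inner functor inside the outer one.
mapStack :: (Functor f, Functor g) => (a -> b) -> f (g a) -> f (g b)
mapStack f = fmap (fmap f)

-- Hypothetical single-case wrapper, mirroring the Run case of the F# type.
newtype StackT f g a = Run (f (g a))

-- Unwrap, map the stack, and wrap again.
mapT :: (Functor f, Functor g) => (a -> b) -> StackT f g a -> StackT f g b
mapT f (Run p) = Run (mapStack f p)

-- Extract the contained stack (hypothetical helper).
unRun :: StackT f g a -> f (g a)
unRun (Run p) = p

main :: IO ()
main = do
  print (mapStack (+ 1) (Just [1, 2, 3 :: Int]))        -- prints Just [2,3,4]
  print (unRun (mapT show (Run (Just [1, 2 :: Int]))))  -- prints Just ["1","2"]
```

Nothing in mapStack depends on the two particular functors, which is one way to see why the F# versions are so mechanical to write.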

By now, it should be fairly clear that we're following the same recipe as before. You have a functor; make a monad out of it. First, define a type for the monad:

type CommandLineReservationsApiProgram<'a> =
| Free of CommandLineReservationsApiT<CommandLineReservationsApiProgram<'a>>
| Pure of 'a

Then add a bind function:

// ('a -> CommandLineReservationsApiProgram<'b>)
// -> CommandLineReservationsApiProgram<'a>
// -> CommandLineReservationsApiProgram<'b>
let rec bind f = function
    | Free instruction -> instruction |> mapT (bind f) |> Free
    | Pure x -> f x

This is almost the same code as the above bind function for ReservationsApi. The only difference is that the underlying map function is named mapT instead of mapI. The types involved, however, are different.

You can also add a map function:

// ('a -> 'b) -> (CommandLineReservationsApiProgram<'a>
// -> CommandLineReservationsApiProgram<'b>)
let map f = bind (f >> Pure)

This is another copy-and-paste job. Such repeatable. Wow.

When you create a monad stack, you need a way to lift values from each of the constituent monads up to the combined monad. In Haskell, this is done with the lift and liftF functions, but in F#, you must explicitly add such functions:

// CommandLineProgram<ReservationsApiProgram<'a>>
// -> CommandLineReservationsApiProgram<'a>
let private wrap x = x |> Run |> mapT Pure |> Free
// CommandLineProgram<'a> -> CommandLineReservationsApiProgram<'a>
let liftCL x = wrap <| CommandLine.map ReservationsApiProgram.Pure x
// ReservationsApiProgram<'a> -> CommandLineReservationsApiProgram<'a>
let liftRA x = wrap <| CommandLineProgram.Pure x

The private wrap function takes the underlying 'naked' monad stack (CommandLineProgram<ReservationsApiProgram<'a>>) and turns it into a CommandLineReservationsApiProgram<'a> value. It first wraps x in Run, which turns x into a CommandLineReservationsApiT<'a> value. By piping that value into mapT Pure, you get a CommandLineReservationsApiT<CommandLineReservationsApiProgram<'a>> value that you can finally pipe into Free in order to produce a CommandLineReservationsApiProgram<'a> value. Phew!

The liftCL function lifts a CommandLineProgram (CL) to CommandLineReservationsApiProgram by first using CommandLine.map to lift x to a CommandLineProgram<ReservationsApiProgram<'a>> value. It then pipes that value to wrap.

Likewise, the liftRA function lifts a ReservationsApiProgram (RA) to CommandLineReservationsApiProgram. It simply elevates x to a CommandLineProgram value by using CommandLineProgram.Pure. Subsequently, it pipes that value to wrap.

In both of these functions, I used the slightly unusual backwards pipe operator <|. The reason for that is that it emphasises the similarity between liftCL and liftRA. This is easier to see if you remove the type comments:

let liftCL x = wrap <| CommandLine.map ReservationsApiProgram.Pure x
let liftRA x = wrap <| CommandLineProgram.Pure x

This is how I normally write my F# code. I only add the type comments for the benefit of you, dear reader. Normally, when you have an IDE, you can always inspect the types using the built-in tools.

Using the backwards pipe operator makes it immediately clear that both functions depend on the wrap function. This would have been muddied by use of the normal forward pipe operator:

let liftCL x = CommandLine.map ReservationsApiProgram.Pure x |> wrap
let liftRA x = CommandLineProgram.Pure x |> wrap

The behaviour is the same, but now wrap doesn't align, making it harder to discover the kinship between the two functions. My use of the backward pipe operator is motivated by readability concerns.

Following the free monad recipe, now create a computation expression builder:

type CommandLineReservationsApiBuilder () =
    member this.Bind (x, f) = CommandLineReservationsApi.bind f x
    member this.Return x = Pure x
    member this.ReturnFrom x = x
    member this.Zero () = Pure ()

Finally, create an instance of the class:

let commandLineReservationsApi = CommandLineReservationsApiBuilder ()

Putting the commandLineReservationsApi value in a module will enable you to use it for computation expressions whenever you open that module. I normally put it in a module with the [<AutoOpen>] attribute so that it automatically becomes available as soon as I open the containing namespace.

Simplification #

While there can be good reasons to introduce single-case discriminated unions in your F# code, they're isomorphic with their contained type. (This means that there's a lossless conversion between the union type and the contained type, in both directions.) Following the free monad recipe, I introduced CommandLineReservationsApiT as a discriminated union, but since it's a single-case union, you can refactor it to its contained type.

If you delete the CommandLineReservationsApiT type, you'll first have to change the definition of the program type to this:

type CommandLineReservationsApiProgram<'a> =
| Free of CommandLineProgram<ReservationsApiProgram<CommandLineReservationsApiProgram<'a>>>
| Pure of 'a

You simply replace CommandLineReservationsApiT<_> with CommandLineProgram<ReservationsApiProgram<_>>, effectively promoting the type contained in the Run case to be the container in the Free case.

Once CommandLineReservationsApiT is gone, you'll also need to delete the mapT function, and amend bind:

// ('a -> CommandLineReservationsApiProgram<'b>)
// -> CommandLineReservationsApiProgram<'a>
// -> CommandLineReservationsApiProgram<'b>
let rec bind f = function
    | Free instruction -> instruction |> mapStack (bind f) |> Free
    | Pure x -> f x

Likewise, you must also adjust the wrap function:

let private wrap x = x |> mapStack Pure |> Free

The rest of the above code stays the same.

Wizard #

In Haskell, you get combinations of monads for free via the FreeT type, whereas in F#, you have to work for it. Once you have the combination in monadic form as well, you can write programs with that combination. Here's the wizard that collects your data and attempts to make a restaurant reservation on your behalf:

// CommandLineReservationsApiProgram<unit>
let tryReserve = commandLineReservationsApi {
    let! count = liftCL readQuantity
    let! date  = liftCL readDate
    let! availableSeats =
        ReservationsApi.getSlots date
        |> ReservationsApi.map (List.sumBy (fun slot -> slot.SeatsLeft))
        |> liftRA
    if availableSeats < count
    then do!
        sprintf "Only %i remaining seats." availableSeats
        |> CommandLine.writeLine
        |> liftCL
    else
        let! name  = liftCL readName
        let! email = liftCL readEmail
        do! { Date = date; Name = name; Email = email; Quantity = count }
            |> ReservationsApi.postReservation 
            |> liftRA
    }

Notice that tryReserve is a value, and not a function. It's a pure value that contains an AST - a small program that describes the impure interactions that you'd like to take place. It's defined entirely within a commandLineReservationsApi computation expression.

It starts by using the readQuantity and readDate program values you saw in the previous F# article. Both of these values are CommandLineProgram values, so you have to use liftCL to lift them to CommandLineReservationsApiProgram values - only then can you let! bind them to an int and a DateTimeOffset, respectively. This is just like the use of lift in the previous article's Haskell example.

Once the program has collected the desired date from the user, it calls ReservationsApi.getSlots and calculates the sum over all the returned SeatsLeft labels. The ReservationsApi.getSlots function returns a ReservationsApiProgram<Slot list>, which ReservationsApi.map turns into a ReservationsApiProgram<int> value that you must liftRA in order to be able to let! bind it to an int value. Let me stress once again: the program actually doesn't do any of that; it constructs an AST with instructions to that effect.

If it turns out that there are too few seats left, the program writes that on the command line and exits. Otherwise, it continues to collect the user's name and email address. That's all the data required to create a Reservation record and pipe it to ReservationsApi.postReservation.

Interpreters #

The tryReserve wizard is a pure value. It contains an AST that can be interpreted in such a way that impure operations happen. You've already seen the CommandLineProgram interpreter in a previous article, so I'm not going to repeat it here. I'll only note that I renamed it to interpretCommandLine because I want to use the name interpret for the combined interpreter.

The interpreter for ReservationsApiProgram values is similar to the CommandLineProgram interpreter:

// ReservationsApiProgram<'a> -> 'a
let rec interpretReservationsApi = function
    | ReservationsApiProgram.Pure x -> x
    | ReservationsApiProgram.Free (GetSlots (d, next)) ->
        ReservationHttpClient.getSlots d
        |> Async.RunSynchronously        
        |> next
        |> interpretReservationsApi
    | ReservationsApiProgram.Free (PostReservation (r, next)) ->
        ReservationHttpClient.postReservation r |> Async.RunSynchronously
        next |> interpretReservationsApi

The interpretReservationsApi function pattern-matches on its (implicit) ReservationsApiProgram argument, and performs the appropriate actions according to each instruction. In all Free cases, it delegates to implementations defined in a ReservationHttpClient module. The code in that module isn't shown here, but you can see it in the GitHub repository that accompanies this article.

You can combine the two 'leaf' interpreters in an interpreter of CommandLineReservationsApiProgram values:

// CommandLineReservationsApiProgram<'a> -> 'a
let rec interpret = function
    | CommandLineReservationsApiProgram.Pure x -> x
    | CommandLineReservationsApiProgram.Free p ->
        p |> interpretCommandLine |> interpretReservationsApi |> interpret

As usual, in the Pure case, it simply returns the contained value. In the Free case, p is a CommandLineProgram<ReservationsApiProgram<CommandLineReservationsApiProgram<'a>>>. Since it's a CommandLineProgram value, you can interpret it with interpretCommandLine, which returns a ReservationsApiProgram<CommandLineReservationsApiProgram<'a>>. Since that's a ReservationsApiProgram, you can pipe it to interpretReservationsApi, which then returns a CommandLineReservationsApiProgram<'a>. An interpreter exists for that type as well, namely the interpret function itself, so recursively invoke it again. In other words, interpret will keep recursing until it hits a Pure case.

Execution #

Everything is now in place so that you can execute your program. This is the program's entry point:

[<EntryPoint>]
let main _ =
    interpret Wizard.tryReserve
    0 // return an integer exit code

When you run it, you'll be able to have an interaction like this:

Please enter number of diners:
4
Please enter your desired date:
2017-11-25
Please enter your name:
Mark Seemann
Please enter your email address:
mark@example.net
OK

If you want to run this code sample yourself, you're going to need an appropriate HTTP API with which you can interact. I hosted the API on my local machine, and afterwards verified that the record was, indeed, written in the reservations database.

Summary #

As expected, you can combine free monads in F#, although it requires more boilerplate code than in Haskell.

Next: F# free monad recipe.


Combining free monads in Haskell

Monday, 24 July 2017 15:33:00 UTC

An example on how to compose free monads in Haskell.

In the previous article in this series on pure interactions, you saw how to write a command-line wizard in F#, using a free monad to build an Abstract Syntax Tree (AST). The example collects information about a potential restaurant reservation you'd like to make. That example, however, didn't do more than that.

For a more complete experience, you'd like your command-line interface (CLI) to not only collect data about a reservation, but actually make the reservation, using the available HTTP API. This means that you'll also need to model interaction with the HTTP API as an AST, but a different AST. Then, you'll have to figure out how to compose these two APIs into a combined API.

In order to figure out how to do this in F#, I first had to do it in Haskell. In this article, you'll see how to do it in Haskell, and in the next article, you'll see how to translate this Haskell prototype to F#. This should ensure that you get a functional F# code base as well.

Command line API #

Let's make an easy start of it. In a previous article, you saw how to model command-line interactions as ASTs, complete with syntactic sugar provided by a computation expression. That took a fair amount of boilerplate code in F#, but in Haskell, it's declarative:

import Control.Monad.Trans.Free (Free, liftF)
 
data CommandLineInstruction next =
    ReadLine (String -> next)
  | WriteLine String next
  deriving (Functor)
 
type CommandLineProgram = Free CommandLineInstruction
 
readLine :: CommandLineProgram String
readLine = liftF (ReadLine id)
 
writeLine :: String -> CommandLineProgram ()
writeLine s = liftF (WriteLine s ())

This is all the code required to define your AST and make it a monad in Haskell. Contrast that with all the code you have to write in F#!

The CommandLineInstruction type defines the instruction set, and makes use of a language extension called DeriveFunctor, which enables Haskell to automatically create a Functor instance from the type.

The type alias type CommandLineProgram = Free CommandLineInstruction creates a monad from CommandLineInstruction, since Free is a Monad when the underlying type is a Functor.

The readLine value and writeLine function are conveniences that lift the instructions from CommandLineInstruction into CommandLineProgram values. These were also one-liners in F#.
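To see why a Functor instance is all that's needed, here's a minimal hand-rolled sketch of a free monad. It isn't the definition used by the free package (the primed names Free', Pure', and liftF' are hypothetical, chosen to avoid suggesting otherwise), and the size function is only an illustrative helper.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- A hand-rolled free monad. The primed names are hypothetical,
-- not the ones exported by the free package.
data Free' f a = Pure' a | Free' (f (Free' f a))

instance Functor f => Functor (Free' f) where
  fmap g (Pure' a) = Pure' (g a)
  fmap g (Free' x) = Free' (fmap (fmap g) x)

instance Functor f => Applicative (Free' f) where
  pure = Pure'
  Pure' g <*> x = fmap g x
  Free' g <*> x = Free' (fmap (<*> x) g)

instance Functor f => Monad (Free' f) where
  Pure' a >>= k = k a
  -- Push the continuation one layer down via the functor's fmap;
  -- this has the same shape as the F# bind functions in this series.
  Free' x >>= k = Free' (fmap (>>= k) x)

-- Lift a single instruction into a one-instruction program.
liftF' :: Functor f => f a -> Free' f a
liftF' = Free' . fmap Pure'

data CommandLineInstruction next
  = ReadLine (String -> next)
  | WriteLine String next
  deriving (Functor)

type CommandLineProgram = Free' CommandLineInstruction

readLine :: CommandLineProgram String
readLine = liftF' (ReadLine id)

writeLine :: String -> CommandLineProgram ()
writeLine s = liftF' (WriteLine s ())

-- Count the instructions on the path taken when every ReadLine receives "".
size :: CommandLineProgram a -> Int
size (Pure' _) = 0
size (Free' (ReadLine next)) = 1 + size (next "")
size (Free' (WriteLine _ next)) = 1 + size next

main :: IO ()
main = print (size (writeLine "Name?" >> readLine >>= writeLine)) -- prints 3
```

The only thing the Monad instance asks of the instruction set is fmap, which is what deriving (Functor) supplies.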

HTTP client API #

You can write a small wizard to collect restaurant reservation data with the CommandLineProgram API, but the new requirement is to make HTTP calls so that the CLI program actually makes the reservation against the back-end system. You could extend CommandLineProgram with more instructions, but that would be to mix concerns. It'd be more appropriate to define a new instruction set for making the required HTTP requests.

This API will send and receive more complex values than simple String values, so you can start by defining their types:

data Slot = Slot { slotDate :: ZonedTime, seatsLeft :: Int } deriving (Show)
 
data Reservation =
  Reservation { reservationDate :: ZonedTime
              , reservationName :: String
              , reservationEmail :: String
              , reservationQuantity :: Int }
              deriving (Show)

The Slot type contains information about how many available seats are left on a particular date. The Reservation type contains the information required in order to make a reservation. It's similar to the Reservation F# record type you saw in the previous article.

The online restaurant reservation HTTP API may afford more functionality than you need, but there's no reason to model more instructions than required:

data ReservationsApiInstruction next =
    GetSlots ZonedTime ([Slot] -> next)
  | PostReservation Reservation next
  deriving (Functor)

This instruction set models two interactions. The GetSlots case models an instruction to request, from the HTTP API, the slots for a particular date. The PostReservation case models an instruction to make a POST HTTP request with a Reservation, thereby making a reservation.

Like the above CommandLineInstruction, this type is (automatically) a Functor, which means that we can create a Monad from it:

type ReservationsApiProgram = Free ReservationsApiInstruction

Once again, the monad is nothing but a type alias.

Finally, you're going to need the usual lifts:

getSlots :: ZonedTime -> ReservationsApiProgram [Slot]
getSlots d = liftF (GetSlots d id)
 
postReservation :: Reservation -> ReservationsApiProgram ()
postReservation r = liftF (PostReservation r ())

This is all you need to write a wizard that interleaves CommandLineProgram and ReservationsApiProgram instructions in order to create a more complex AST.

Wizard #

The wizard should do the following:

  • Collect the number of diners, and the date for the reservation.
  • Query the HTTP API about availability for the requested date. If insufficient seats are available, it should exit.
  • If sufficient capacity remains, collect name and email.
  • Make the reservation against the HTTP API.

Like in the previous F# examples, you can factor some of the work that the wizard performs into helper functions. The first is one that prompts the user for a value and tries to parse it:

readParse :: Read a => String -> String -> CommandLineProgram a
readParse prompt errorMessage = do
  writeLine prompt
  l <- readLine
  case readMaybe l of
    Just dt -> return dt
    Nothing -> do
      writeLine errorMessage
      readParse prompt errorMessage

It first uses writeLine to write prompt to the command line - or rather, it creates an instruction to do so. The instruction is a pure value. No side-effects are involved until an interpreter evaluates the AST.

The next line uses readLine to read the user's input. While readLine is a CommandLineProgram String value, due to Haskell's do notation, l is a String value. You can now attempt to parse that String value with readMaybe, which returns a Maybe a value that you can handle with pattern matching. If readMaybe returns a Just value, then return the contained value; otherwise, write errorMessage and recursively call readParse again.

Like in the previous F# example, the only way to continue is to write something that readMaybe can parse. There's no other way to exit; there probably should be an option to quit, but it's not important for this demo purpose.
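Because readParse only builds an AST, you can also evaluate it without any I/O at all. The following sketch assumes a hand-rolled Free type instead of the one from the free package, and a hypothetical runPure interpreter that feeds canned input lines to the program and collects what it writes.

```haskell
{-# LANGUAGE DeriveFunctor #-}
import Text.Read (readMaybe)

-- Hand-rolled free monad (an assumption; the article uses the free package).
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a) = Pure (g a)
  fmap g (Free x) = Free (fmap (fmap g) x)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g <*> x = fmap g x
  Free g <*> x = Free (fmap (<*> x) g)

instance Functor f => Monad (Free f) where
  Pure a >>= k = k a
  Free x >>= k = Free (fmap (>>= k) x)

data CommandLineInstruction next
  = ReadLine (String -> next)
  | WriteLine String next
  deriving (Functor)

type CommandLineProgram = Free CommandLineInstruction

readLine :: CommandLineProgram String
readLine = Free (ReadLine Pure)

writeLine :: String -> CommandLineProgram ()
writeLine s = Free (WriteLine s (Pure ()))

readParse :: Read a => String -> String -> CommandLineProgram a
readParse prompt errorMessage = do
  writeLine prompt
  l <- readLine
  case readMaybe l of
    Just dt -> return dt
    Nothing -> do
      writeLine errorMessage
      readParse prompt errorMessage

-- Hypothetical pure interpreter: feeds canned input lines and collects the
-- written lines. Returns Nothing if the program wants more input than given.
runPure :: [String] -> CommandLineProgram a -> Maybe (a, [String])
runPure _ (Pure x) = Just (x, [])
runPure [] (Free (ReadLine _)) = Nothing
runPure (i : is) (Free (ReadLine next)) = runPure is (next i)
runPure is (Free (WriteLine s next)) = fmap (fmap (s :)) (runPure is next)

main :: IO ()
main =
  -- prints Just (4,["Diners?","Not an Integer.","Diners?"])
  print (runPure ["four", "4"] (readParse "Diners?" "Not an Integer." :: CommandLineProgram Int))
```

The first canned line fails to parse, so the program writes the error message and prompts again. When the canned input runs out before a parsable value arrives, runPure returns Nothing.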

You may also have noticed that, contrary to the previous F# example, I here succumbed to the temptation to break the rule of three. It's easier to define a reusable function in Haskell, because you can leave it generic, with the proviso that the generic value must be an instance of the Read typeclass.

The readParse function returns a CommandLineProgram a value. It doesn't combine CommandLineProgram with ReservationsApiProgram. That's going to happen in another function, but before we look at that, you're also going to need another little helper:

readAnything :: String -> CommandLineProgram String
readAnything prompt = do
  writeLine prompt
  readLine

The readAnything function simply writes a prompt, reads the user's input, and unconditionally returns it. You could also have written it as a one-liner like readAnything prompt = writeLine prompt >> readLine, but I find the above code more readable, even though it's slightly more verbose.

That's all you need to write the wizard:

tryReserve :: FreeT ReservationsApiProgram CommandLineProgram ()
tryReserve = do
  q <- lift $ readParse "Please enter number of diners:" "Not an Integer."
  d <- lift $ readParse "Please enter your desired date:" "Not a date."
  availableSeats <- liftF $ (sum . fmap seatsLeft) <$> getSlots d
  if availableSeats < q
    then lift $ writeLine $ "Only " ++ show availableSeats ++ " remaining seats."
    else do
      n <- lift $ readAnything "Please enter your name:"
      e <- lift $ readAnything "Please enter your email address:"
      liftF $ postReservation Reservation
        { reservationDate = d
        , reservationName = n
        , reservationEmail = e
        , reservationQuantity = q }

The tryReserve program first prompts the user for a number of diners and a date. Once it has the date d, it calls getSlots and calculates the sum of the remaining seats. availableSeats is an Int value like q, so you can compare those two values with each other. If the number of available seats is less than the desired quantity, the program writes that and exits.

This interaction demonstrates how to interleave CommandLineProgram and ReservationsApiProgram instructions. It would be a bad user experience if the program asked the user to input all information, and only then discovered that there's insufficient capacity.

If, on the other hand, there's enough remaining capacity, the program continues collecting information from the user, by prompting for the user's name and email address. Once all data is collected, it creates a new Reservation value and invokes postReservation.

Consider the type of tryReserve. It's a combination of CommandLineProgram and ReservationsApiProgram, contained within a type called FreeT. This type is also a Monad, which is the reason the do notation still works. This also begins to explain the various lift and liftF calls sprinkled over the code.

Whenever you use a <- arrow to 'pull the value out of the monad' within a do block, the right-hand side of the arrow must be a value in the same monad as the return type of the overall function (or value). In this case, the return type is FreeT ReservationsApiProgram CommandLineProgram (), whereas readParse returns a CommandLineProgram a value. As an example, lift turns CommandLineProgram Int into FreeT ReservationsApiProgram CommandLineProgram Int.

The way the type of tryReserve is declared, when you have a CommandLineProgram a value, you use lift, but when you have a ReservationsApiProgram a, you use liftF. This depends on the order of the monads contained within FreeT. If you swap CommandLineProgram and ReservationsApiProgram, you'll also need to use lift instead of liftF, and vice versa.
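The difference between lift and liftF is easier to see in a stripped-down setting. The following is a hand-rolled miniature of FreeT, not the implementation from the free package. Api, Console, say, and program are hypothetical stand-ins: Console (a simple log-accumulating monad) plays the role of CommandLineProgram, and Api plays the role of ReservationsApiProgram.

```haskell
{-# LANGUAGE DeriveFunctor #-}
import Control.Monad (ap)

-- Miniature FreeT: a base monad m whose results are either finished (Pure)
-- or one layer of the functor f wrapping the rest of the program (Free).
data FreeF f a b = Pure a | Free (f b)
newtype FreeT f m a = FreeT { runFreeT :: m (FreeF f a (FreeT f m a)) }

instance (Functor f, Monad m) => Functor (FreeT f m) where
  fmap g (FreeT m) = FreeT (fmap step m)
    where
      step (Pure a) = Pure (g a)
      step (Free x) = Free (fmap (fmap g) x)

instance (Functor f, Monad m) => Applicative (FreeT f m) where
  pure = FreeT . return . Pure
  (<*>) = ap

instance (Functor f, Monad m) => Monad (FreeT f m) where
  FreeT m >>= k = FreeT $ do
    v <- m
    case v of
      Pure a -> runFreeT (k a)
      Free x -> return (Free (fmap (>>= k) x))

-- lift embeds a value of the base monad m (the second type argument).
lift :: Monad m => m a -> FreeT f m a
lift = FreeT . fmap Pure

-- liftF embeds a value of the functor f (the first type argument).
liftF :: (Functor f, Monad m) => f a -> FreeT f m a
liftF = FreeT . return . Free . fmap return

-- Stand-in for the HTTP API: a single instruction.
data Api next = GetSeats (Int -> next) deriving (Functor)

-- Stand-in for the command line: a monad that accumulates written lines.
type Console = (,) [String]

say :: String -> Console ()
say s = ([s], ())

program :: FreeT Api Console Int
program = do
  lift (say "Checking availability...")  -- Console is the base monad: lift
  seats <- liftF (GetSeats id)           -- Api is the functor: liftF
  lift (say ("Seats: " ++ show seats))
  return (seats * 2)

-- Run the Console layer, answer every GetSeats with n, and recurse.
interpret :: Int -> FreeT Api Console a -> Console a
interpret n p = do
  r <- runFreeT p
  case r of
    Pure x -> return x
    Free (GetSeats next) -> interpret n (next n)

main :: IO ()
main = print (interpret 10 program) -- prints (["Checking availability...","Seats: 10"],20)
```

If you declared program as FreeT Console Api Int instead, the roles would swap: say would need liftF, and GetSeats would need lift (and Api would then have to be a monad).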

Interpreters #

tryReserve is a pure value. It's an Abstract Syntax Tree that combines two separate instruction sets to describe a complex interaction between user, command line, and an HTTP client. The program doesn't do anything until interpreted.

You can write an impure interpreter for each of the APIs, and a third one that uses the other two to interpret tryReserve.

Interpreting CommandLineProgram values is similar to the previous F# example:

interpretCommandLine :: CommandLineProgram a -> IO a
interpretCommandLine program =
  case runFree program of
    Pure r -> return r
    Free (ReadLine next) -> do
      line <- getLine
      interpretCommandLine $ next line
    Free (WriteLine line next) -> do
      putStrLn line
      interpretCommandLine next

This interpreter is a recursive function that pattern-matches all the cases in any CommandLineProgram a. When it encounters a Pure case, it simply returns the contained value.

When it encounters a ReadLine value, it calls getLine, which returns an IO String value read from the command line, but thanks to the do block, line is a String value. The interpreter then calls next with line, and passes the return value of that recursively to itself.

A similar treatment is given to the WriteLine case. putStrLn line writes line to the command line, after which next is used as an input argument to interpretCommandLine.

Thanks to Haskell's type system, you can easily tell that interpretCommandLine is impure, because for every CommandLineProgram a it returns IO a. That was the intent all along.

Likewise, you can write an interpreter for ReservationsApiProgram values:

interpretReservationsApi :: ReservationsApiProgram a -> IO a
interpretReservationsApi program =
  case runFree program of
    Pure x -> return x
    Free (GetSlots zt next) -> do
      slots <- HttpClient.getSlots zt
      interpretReservationsApi $ next slots
    Free (PostReservation r next) -> do
      HttpClient.postReservation r
      interpretReservationsApi next

The structure of interpretReservationsApi is similar to interpretCommandLine. It delegates its implementation to an HttpClient module that contains the impure interactions with the HTTP API. This module isn't shown in this article, but you can see it in the GitHub repository that accompanies this article.

From these two interpreters, you can create a combined interpreter:

interpret :: FreeT ReservationsApiProgram CommandLineProgram a -> IO a
interpret program = do
  r <- interpretCommandLine $ runFreeT program
  case r of
    Pure x -> return x
    Free p -> do
      y <- interpretReservationsApi p
      interpret y

This function has the required type: it evaluates any FreeT ReservationsApiProgram CommandLineProgram a and returns an IO a. runFreeT returns the CommandLineProgram part of the combined program. Passing this value to interpretCommandLine, you get the underlying type - the a in CommandLineProgram a, if you will. In this case, however, the a is quite a complex type that I'm not going to write out here. Suffice it to say that, at the container level, it's a FreeF value, which can be either a Pure or a Free case that you can use for pattern matching.

In the Pure case, you're done, so you can simply return the underlying value.

In the Free case, the p contained inside is a ReservationsApiProgram value, which you can interpret with interpretReservationsApi. That returns an IO a value, and due to the do block, y is the a. In this case, however, a is FreeT ReservationsApiProgram CommandLineProgram a, but that means that the function can now recursively call itself with y in order to interpret the next instruction.

Execution #

Armed with both an AST and an interpreter, executing the program is trivial:

main :: IO ()
main = interpret tryReserve

When you run the program, you could produce an interaction like this:

Please enter number of diners:
4
Please enter your desired date:
2017-11-25 18-30-00Z
Not a date.
Please enter your desired date:
2017-11-25 18:30:00Z
Please enter your name:
Mark Seemann
Please enter your email address:
mark@example.org
Status {statusCode = 200, statusMessage = "OK"}

You'll notice that I initially made a mistake on the date format, which caused readParse to prompt me again.

If you want to run this code sample yourself, you're going to need an appropriate HTTP API with which you can interact. I hosted the API on my local machine, and afterwards verified that the record was, indeed, written in the reservations database.

Summary #

This proof of concept demonstrates that it's possible to combine separate free monads. Now that we know that it works, and have the overall outline, it should be possible to translate it to F#. You should, however, expect more boilerplate code.

Next: Combining free monads in F#.


Comments

Here's an additional simplification. Rather than writing FreeT ReservationsApiProgram CommandLineProgram which requires you to lift, you can instead form the sum (coproduct) of both functors:

import Data.Functor.Sum

type Program = Free (Sum CommandLineInstruction ReservationsApiInstruction)

liftCommandLine :: CommandLineInstruction a -> Program a
liftCommandLine = liftF . InL

liftReservation :: ReservationsApiInstruction a -> Program a
liftReservation = liftF . InR

Now you can lift the helpers directly to Program, like so:

readLine :: Program String
readLine = liftCommandLine (ReadLine id)
 
writeLine :: String -> Program ()
writeLine s = liftCommandLine (WriteLine s ())

getSlots :: ZonedTime -> Program [Slot]
getSlots d = liftReservation (GetSlots d id)
 
postReservation :: Reservation -> Program ()
postReservation r = liftReservation (PostReservation r ())

Then (after you change the types of the read* helpers), you can drop all lifts from tryReserve:

tryReserve :: Program ()
tryReserve = do
  q <- readParse "Please enter number of diners:" "Not an Integer."
  d <- readParse "Please enter your desired date:" "Not a date."
  availableSeats <- (sum . fmap seatsLeft) <$> getSlots d
  if availableSeats < q
    then writeLine $ "Only " ++ show availableSeats ++ " remaining seats."
    else do
      n <- readAnything "Please enter your name:"
      e <- readAnything "Please enter your email address:"
      postReservation Reservation
        { reservationDate = d
        , reservationName = n
        , reservationEmail = e
        , reservationQuantity = q }

And finally your interpreter needs to dispatch over InL/InR (this is using functions from Control.Monad.Free, you can actually drop the Trans import at this point):

interpretCommandLine :: CommandLineInstruction (IO a) -> IO a
interpretCommandLine (ReadLine next) = getLine >>= next
interpretCommandLine (WriteLine line next) = putStrLn line >> next

interpretReservationsApi :: ReservationsApiInstruction (IO a) -> IO a
interpretReservationsApi (GetSlots zt next) = HttpClient.getSlots zt >>= next
interpretReservationsApi (PostReservation r next) = HttpClient.postReservation r >> next

interpret :: Program a -> IO a
interpret program =
  iterM go program
  where
    go (InL cmd) = interpretCommandLine cmd
    go (InR res) = interpretReservationsApi res

I find this to be quite clean!

2017-07-27 3:58 UTC

George, thank you for writing. That alternative does, indeed, look simpler and cleaner than mine. Thank you for sharing.

FWIW, one reason I write articles on this blog is to learn and become better. I publish what I know and have learned so far, and sometimes, people tell me that there's a better way. That's great, because it makes me a better programmer, and hopefully, it may make other readers better as well.

In case you'll be puzzling over my next blog post, however, I'm going to share a little secret (which is not a secret if you look at the blog's commit history): I wrote this article series more than a month ago, which means that all the remaining articles are already written. While I agree that using the sum of functors instead of FreeT simplifies the Haskell code, I don't think it makes that much of a difference when translating to F#. I may be wrong, but I haven't tried yet. My point, though, is that the next article in the series is going to ignore this better alternative, because, when it was written, I didn't know about it. I invite any interested reader to post, as a comment to that future article, their better alternatives :)

2017-07-27 7:31 UTC

Hi Mark,

I think you'll enjoy Data Types a la Carte. It's the definitive introduction to the style that George Pollard demonstrates above. Swierstra covers how to build datatypes with initial algebras over coproducts, compose them abstracting over the concrete functor, and tear them down generically. It's well written, too 😉

Benjamin

2017-07-23 28:40 UTC
