The Builder functor

The Test Data Builder design pattern as a functor.

This is the third in a series of articles about the relationship between the Test Data Builder design pattern, and the identity functor. The previous article introduced this generic Builder class:

public class Builder<T>
{
    private readonly T item;
 
    public Builder(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));
 
        this.item = item;
    }
 
    public Builder<T1> Select<T1>(Func<T, T1> f)
    {
        var newItem = f(this.item);
        return new Builder<T1>(newItem);
    }
 
    public T Build()
    {
        return this.item;
    }
 
    public override bool Equals(object obj)
    {
        var other = obj as Builder<T>;
        if (other == null)
            return base.Equals(obj);
 
        return object.Equals(this.item, other.item);
    }
 
    public override int GetHashCode()
    {
        return this.item.GetHashCode();
    }
}

The workhorse is the Select method. As I previously promised to explain, there's a reason I chose that particular name.

Query syntax #

C# comes with a small DSL normally known as query syntax. People mostly think of it in relation to ORMs such as Entity Framework, but it's a general-purpose language feature. Still, most developers probably associate it with the IEnumerable<T> interface, but it's more general than that. In fact, any type that comes with a Select method with a compatible signature supports query syntax.

I deliberately designed the Builder's Select method to support query syntax:

Address address = from a in Builder.Address
                  select a.WithCity("Paris");

Builder.Address is a Builder<Address> object that contains a 'good' default Address value. Since Builder<T> has a compatible Select method, you can 'query it'. In this example, you use the WithCity method to explicitly pin the Address object's City property, while all the other Address values remain the default values.

There's an extra bit (pun intended) of compiler magic at work. Did you wonder how a Builder<Address> automatically turns into an Address value? After all, address is of the type Address, not Builder<Address>.

I specifically added an implicit conversion so that I didn't have to surround the query expression with brackets in order to call the Build method:

public static implicit operator T(Builder<T> b)
{
    return b.item;
}

This conversion is defined on Builder<T>. It's the reason I explicitly use the type name when I declare the address variable above, instead of using the var keyword. Declaring the type forces the implicit conversion.

You can also use query syntax to map one constructed Builder type into another (and ultimately to the value it contains):

Address address =
    from pc in Builder.PostCode
    select new Address("Rue Morgue", "Paris", pc);

This expression starts with a Builder<PostCode> object, transforms it into a Builder<Address> object, and then finally uses the implicit conversion to turn the Builder<Address> into an Address object.

Even a more complex 'query' looks almost palatable:

Invoice invoice =
    from i in Builder.Invoice
    select i.WithRecipient(
        from r in Builder.Recipient
        select r.WithAddress(Builder.Address.WithNoPostCode()));

Again, the implicit type conversion makes the syntax much cleaner.

Functor #

Isn't it amazing that the C# designers were able to come up with such a generally useful language feature? It certainly is a nice piece of work, but it's based on a an existing body of knowledge.

A type like Builder<T> with a suitable Select method is a functor. This is a term from category theory, but I'll try to avoid turning this article into a category theory lecture. Likewise, I'm not going to talk specifically about monads here, although it's a closely related topic. A functor is a mapping between categories; it maps an object from one category into an object of another category.

Although I've never seen Microsoft explicitly acknowledge the connection to functors and monads, it's clear that it's there. One of the principal designers of LINQ was Erik Meijer, who definitely knows his way around category theory and functional programming. A functor is a simple, but widely applicable abstraction.

In order to be a functor, a type must have an associated mapping. In C#'s query syntax, this is a method named Select, but more often it's called map.

Haskell Builder #

In Haskell, the mapping is called fmap, and you can define the Builder functor like this:

newtype Builder a = Builder a deriving (Show, Eq)
 
instance Functor Builder where
  fmap f (Builder a) = Builder $ f a

Notice how terse the definition is, compared to the C# version. Despite the difference in size, they accomplish the same goal. The first line of code defines the Builder type, complete with structural equality (Eq) and the ability to convert a Builder value to a string (Show).

This Builder type is explicitly defined as a Functor in the second expression, where the fmap function is implemented. The code is similar to the Select method in the above C# example: f is a function that takes the generic type a (corresponding to T in the C# example) as input, and returns a value of the generic type b (corresponding to T1 in the C# example). The mapping pulls the underlying value out of the input Builder, calls f with that value, and puts the return value into a new Builder.

In Haskell, a functor is part of the language itself, so Builder is explicitly declared to be a Functor instance.

If you define some default Builder values, corresponding to the above Builder.Address, you can use them to build addresses in the same way:

Builder address = fmap (\a -> a { city = "Paris" }) addressBuilder

Here, addressBuilder is a Builder Address value, corresponding to the C# Builder.Address value. \a -> a { city = "Paris" } is a lambda expression that takes an Address value as input, and return a similar value as output, only with city explicitly bound to "Paris".

F# example #

Unlike Haskell, F# doesn't treat functors as an explicit construct, but you can still define a Builder functor:

type Builder<'a> = Builder of 'a
 
module Builder =
    // ('a -> 'b) -> Builder<'a> -> Builder<'b>
    let map f (Builder x) = Builder (f x)

You can see how similar this is to the Haskell example. In F#, it's common to define a module with the same name as a generic type. This example defines a generic Builder<'a> type and a supporting Builder module. Normally, a module would contain other functions, in addition to map.

Just like in C# and Haskell, you can build an address in Paris with a predefined Builder value as a start:

let (Builder address) =
    addressBuilder |> Builder.map (fun a -> { a with City = "Paris" })

Again, addressBuilder is a Builder<Address> that contains a 'default' Address (test) value. You use Builder.map with a lambda expression to map the default value into a new Address value where City is bound to "Paris".

Functor laws #

In order to be a proper functor, an object must obey two simple laws. It's not enough that a mapping function exists, it must also obey the laws. While that sounds uncomfortably like mathematics, the laws are simple and intuitive.

The first law is that when the mapping returns the input, the functor returned is also the input functor. There's only one (generic) function that returns its input unmodified. It's called the identity function (often abbreviated id).

Here's an example test case that illustrates the first functor law for the C# Builder<T>:

[Fact]
public void BuilderObeysFirstFunctorLaw()
{
    Func<int, int> id = x => x;
    var sut = new Builder<int>(42);
 
    var actual = sut.Select(id);
 
    Assert.Equal(sut, actual);
}

The .NET Base Class Library doesn't come with a built-in identity function, so the test case first defines it as id. Normally, the identity function would be defined as a function that takes a value of the generic type T as input, and returns the same value (still of type T) as output. This test is only an example for the type int, so it also defines the identity function as constrained to int.

The test creates a new Builder<int> with the value 42, and calls Select with id. Since the first functor law says that mapping with the identity function must return the input functor, the expected value is the sut variable.

This test is only an example of the first functor law. It doesn't prove that Builder<T> obeys the law for all generic types (T) and for all values. It only proves that it holds for the integer 42. You get the idea, though, I'm sure.

The second functor law says that if you chain two functions to make a third function, and map your functor using that third function, the result should be equal to the result you get if you chain two mappings made out of those two functions. Here's an example:

[Fact]
public void BuilderObeysSecondFunctorLaw()
{
    Func<int, string> g = i => i.ToString();
    Func<string, string> f = s => new string(s.Reverse().ToArray());
    var sut = new Builder<int>(1337);
 
    var actual = sut.Select(i => f(g(i)));
 
    var expected = sut.Select(g).Select(f);
    Assert.Equal(expected, actual);
}

This test case (which is, again, only an example) first defines two functions, f and g. It then creates a new Builder<int> and calls Select with the combined function f(g). This returns the actual result, which is a Builder<string>.

This result should be equal to first calling Select with g (which returns a Builder<string>), and then calling Select with f (which returns another Builder<string>). These two Builder objects should be equal to each other, which they are.

Both these tests compare an expected Builder to an actual Builder, which is the reason that Builder<T> overrides Equals in order to have structural equality. In Haskell, the above Builder type has structural equality because it uses the default instance of Eq, and in F#, Builder<'a> has structural equality because that's the default equality for the immutable F# data types.

We can't easily talk about the functor laws without being able to talk about functor values being (or not being) equal to each other, so structural equality is an important element in the discussion.

Summary #

You can define a Test Data Builder as a functor by defining a generic Builder type with a Select method. In order to be a proper functor, it must also obey the functor laws, but these laws are quite natural; you almost have to go out of your way in order to violate them.

A functor is a well-known abstraction. Instead of trying to come up with a half-baked, ad-hoc abstraction, modelling an API based on already known and understood abstractions such as functors will make the API easier to learn. Everyone who knows what a functor is, will automatically have a good understanding of the API. Even if you didn't know about functors until now, you only have to learn about them once.

This can often be beneficial, but for Test Data Builders, it turns out to be a red herring. The Builder functor is nothing but the Identity functor in disguise.

Next: Builder as Identity.

Comments

Andrés Moschini #

Maybe I am missing something, so could you explain with few words what is the advantage of having this _generic builder_?

I mean, inmutable entities and _with methods_ seems to be enough to easily create test data without builders, for example:


var invoice = DefaultTestObjects.Invoice
    .WithRecipient(DefaultTestObjects.Recipient
        .WithAddress(DefaultTestObjects.Address
            .WithNoPostCode()
            .WithCity("Paris"))
    .WithDate(new DateTimeOffset(2017, 8, 29)));

2017-08-29 12:50 UTC

Mark Seemann #

Andrés, thank you for writing. I hope that the next two articles in this article series will answer your question. It seems, however, that you've already predicted where this is headed. A fine display of critical thinking!

2017-08-29 17:48 UTC

Romain Vasseur #

Just for the DSL and implicit conversion laius it's worth reading.
_Implicit/explicit_ is a must-have to lighten _value object_ usage and contributes to deliver proper API.
Thank you Mark.

2017-08-31 12:44 UTC

Published: Monday, 28 August 2017 11:19:00 UTC

The Builder functor by Mark Seemann