Test data without Builders by Mark Seemann
We don't need no steenkin' Test Data Builders!
This is the fifth and final in a series of articles about the relationship between the Test Data Builder design pattern, and the identity functor. In the previous article, you learned why a Builder functor adds little value. In this article, you'll see what to do instead.
From Identity to naked values #
While you can define Test Data Builders with Haskell's Identity functor, it adds little value:
Identity address = fmap (\a -> a { city = "Paris" }) addressBuilder
That's nothing but an overly complicated way to create a data value from another data value. You can simplify the code from the previous article. First, instead of calling them 'Builders', we should be honest and name them as the default values they are:
defaultPostCode :: PostCode
defaultPostCode = PostCode []

defaultAddress :: Address
defaultAddress = Address { street = "", city = "", postCode = defaultPostCode }
defaultPostCode is nothing but an empty PostCode value, and defaultAddress is an Address value with empty constituent values. Notice that defaultAddress uses defaultPostCode for the postCode value.
If you need a value in Paris, you can simply write it like this:
address = defaultAddress { city = "Paris" }
Likewise, if you need a more specific address, but you don't care about the post code, you can write it like this:
address' = Address { street = "Rue Morgue", city = "Paris", postCode = defaultPostCode }
Notice how much simpler this is. There's no need to call fmap in order to pull the 'underlying value' out of the functor, transform it, and put it back in the functor. Haskell's 'copy and update' syntax gives you this ability for free. It's built into the language.
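To make the comparison concrete, here's a minimal, self-contained sketch. The PostCode and Address definitions are assumptions on my part (the snippets above take them from the previous article), and the names viaFunctor, viaRecordUpdate, and rueMorgue are made up for illustration. It shows that mapping over Identity and using plain record update arrive at the same value:

```haskell
import Data.Functor.Identity (Identity (..))

-- Assumed definitions: the snippets above rely on types from the previous article.
newtype PostCode = PostCode [String] deriving (Eq, Show)

data Address = Address
  { street   :: String
  , city     :: String
  , postCode :: PostCode
  } deriving (Eq, Show)

defaultPostCode :: PostCode
defaultPostCode = PostCode []

defaultAddress :: Address
defaultAddress = Address { street = "", city = "", postCode = defaultPostCode }

-- The 'Builder' is nothing but the default value wrapped in Identity.
addressBuilder :: Identity Address
addressBuilder = Identity defaultAddress

-- Functor version: map the update over the wrapper, then unwrap the result.
viaFunctor :: Address
viaFunctor = runIdentity (fmap (\a -> a { city = "Paris" }) addressBuilder)

-- Naked version: copy-and-update directly on the default value.
viaRecordUpdate :: Address
viaRecordUpdate = defaultAddress { city = "Paris" }

-- Updating two fields at once is the same as spelling out the whole record
-- with defaultPostCode as the post code.
rueMorgue :: Address
rueMorgue = defaultAddress { street = "Rue Morgue", city = "Paris" }
```

Since the types derive Eq, a test can compare such values directly, for example viaFunctor == viaRecordUpdate.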
Building F# values #
Haskell isn't the only language with 'copy and update' syntax. F# has it as well, and in fact, it's from the F# documentation that I've taken the 'copy and update' term.
The code corresponding to the above Haskell code looks like this in F#:
let defaultPostCode = PostCode []
let defaultAddress = { Street = ""; City = ""; PostCode = defaultPostCode }
let address = { defaultAddress with City = "Paris" }
let address' = { Street = "Rue Morgue"; City = "Paris"; PostCode = defaultPostCode }
The syntax is a little different, but the concepts are the same. F# adds the keyword with to 'copy and update' expressions, which translates easily back to C# fluent interfaces.
Building C# objects #
In a previous article, you saw how to refactor your domain model to a model of Value Objects with fluent interfaces.
In your unit tests, you can define natural default values for testing purposes:
public static class Natural
{
    public static PostCode PostCode = new PostCode();
    public static Address Address = new Address("", "", PostCode);
    public static InvoiceLine InvoiceLine =
        new InvoiceLine("", PoundsShillingsPence.Zero);
    public static Recipient Recipient = new Recipient("", Address);
    public static Invoice Invoice = new Invoice(Recipient, new InvoiceLine[0]);
}
This static Natural class is a test-specific container of 'good' default values. Notice how, once more, the Address value uses the PostCode value to fill in the PostCode property of the default Address value.
With these default test values, and the fluent interface of your domain model, you can easily build a test address in Paris:
var address = Natural.Address.WithCity("Paris");
Because Natural.Address is an Address object, you can use its WithCity method to build a test address in Paris in which all other constituent values remain the defaults.
Likewise, you can create an address on Rue Morgue, but with a default post code:
var address = new Address("Rue Morgue", "Paris", Natural.PostCode);
Here, you can simply create a new Address object, but with Natural.PostCode as the post code value.
Conclusion #
Using a fluent domain model obviates the need for Test Data Builders. There's a tendency among functional programmers to overbearingly state that design patterns are nothing but recipes to overcome deficiencies in particular programming languages or paradigms. If you believe such a claim, at least it ought to go both ways, but at the conclusion of this article series, I hope I've been able to demonstrate that this is true for the Test Data Builder pattern. You only need it for 'classic', mutable, object-oriented domain models.
- For mutable object models, use Test Data Builders.
- Consider, however, modelling your domain with Value Objects and 'copy and update' instance methods.
- Even better, consider using a programming language with built-in 'copy and update' expressions.
With[...] methods:
public class Invoice
{
    public Recipient Recipient { get; }
    public IReadOnlyCollection<InvoiceLine> Lines { get; }

    public Invoice(
        Recipient recipient,
        IReadOnlyCollection<InvoiceLine> lines)
    {
        if (recipient == null)
            throw new ArgumentNullException(nameof(recipient));
        if (lines == null)
            throw new ArgumentNullException(nameof(lines));

        this.Recipient = recipient;
        this.Lines = lines;
    }

    public Invoice WithRecipient(Recipient newRecipient)
    {
        return new Invoice(newRecipient, this.Lines);
    }

    public Invoice WithLines(IReadOnlyCollection<InvoiceLine> newLines)
    {
        return new Invoice(this.Recipient, newLines);
    }

    public override bool Equals(object obj)
    {
        var other = obj as Invoice;
        if (other == null)
            return base.Equals(obj);

        return object.Equals(this.Recipient, other.Recipient)
            && Enumerable.SequenceEqual(
                this.Lines.OrderBy(l => l.Name),
                other.Lines.OrderBy(l => l.Name));
    }

    public override int GetHashCode()
    {
        return this.Recipient.GetHashCode() ^ this.Lines.GetHashCode();
    }
}
That may seem like quite a maintenance burden (and it is), but consider that it has the same degree of complexity and overhead as defining a Test Data Builder for each domain object. At least, by putting this extra code in your domain model, you make all of that API (all the With[...] methods, and the structural equality) available to other production code. In my experience, that's a better return on investment than isolating such useful features only to test code.
Still, once you've tried using a language like F# or Haskell, where 'copy and update' expressions come with the language, you realise how much redundant code you're writing in C# or Java. The Test Data Builder design pattern truly is a recipe that addresses deficiencies in particular languages.
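For comparison, here's a rough Haskell counterpart of the invoice model, offered only as a sketch. All type and field names are my own assumptions, and the line amount is simplified to an Integer rather than a pounds, shillings, and pence type. Structural equality comes from deriving Eq, and 'copy and update' needs no extra members:

```haskell
-- Hypothetical Haskell counterpart of the C# model; all names are assumptions.
newtype PostCode = PostCode [String] deriving (Eq, Show)

data Address = Address
  { street   :: String
  , city     :: String
  , postCode :: PostCode
  } deriving (Eq, Show)

data Recipient = Recipient
  { recipientName    :: String
  , recipientAddress :: Address
  } deriving (Eq, Show)

-- Simplification: the amount is a plain Integer (say, pence).
data InvoiceLine = InvoiceLine
  { lineName   :: String
  , linePence  :: Integer
  } deriving (Eq, Show)

data Invoice = Invoice
  { recipient    :: Recipient
  , invoiceLines :: [InvoiceLine]
  } deriving (Eq, Show)

-- Default test values, each built from the defaults beneath it.
defaultAddress :: Address
defaultAddress = Address { street = "", city = "", postCode = PostCode [] }

defaultRecipient :: Recipient
defaultRecipient = Recipient { recipientName = "", recipientAddress = defaultAddress }

defaultInvoice :: Invoice
defaultInvoice = Invoice { recipient = defaultRecipient, invoiceLines = [] }

-- An invoice whose recipient lives in Paris; everything else stays default.
parisInvoice :: Invoice
parisInvoice =
  defaultInvoice
    { recipient = defaultRecipient
        { recipientAddress = defaultAddress { city = "Paris" } }
    }
```

One difference worth noting: the derived Eq instance compares the list of lines in order, whereas the C# Equals method above sorts the lines by name before comparing them.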
Comments
Leveraging extension methods to implement a 'With' API is relatively straightforward, and you get both a developer-friendly API and a good separation of concerns, namely definition and usage.
If you choose to implement the extensions in another assembly, you can manage who has access to them: unit tests only, another assembly, or the whole project.
You can also split the API according to context or user. It can be useful for enforcing some guidelines, too.
I have some ugly POC code in my branch Roslyn builder generator - it is only a starting point but I think it has some potential.
Dominik, thank you for writing. I admit that I haven't given this much thought, but it strikes me as one of those 'interesting problems' that programmers are keen to solve. It looks to me like a bit of a red herring, as I tend to be sceptical of schemes to generate code. What problem does it address? That one has to type? That's rarely the bottleneck in software development.
Granted, it gets tedious to manually add all those With[...] methods, but there are a lot of things about C# that are tedious. There's a reason I prefer F# instead.
Thanks for responding - I think that for each comment you now have 1+ blog posts to respond to ;). Although I should consider learning a new language like F# to open my mind, I will focus on the C# aspect.
I understand your concerns about code generation, but I think that when we repeat some actions over and over, we automatically think about automating them - this is where computers come from, I think. Currently I'm working on a project where we use the Test Builder Pattern heavily, and every time I think about writing another builder my motivation decreases, because psychologically it is not interesting anymore and I would be happy to hand it over to someone else or to a machine.
When I started to understand what Roslyn is and what it can do, it opened my eyes to new opportunities. Generating simple but frequently repeated code gives me more time to focus on real domain problems and keeps my frustration level low :)
Of course this is not a BIG problem solver, only a new approach to simplifying daily tasks. Another advantage is that Roslyn creates normal C# code files that can be navigated from code and seen in the debugger (in contrast to IL injectors), so there are no magical black boxes. The disadvantage is that the code generation is currently very simple - it involves some external NuGet packages, and I feel that writing a generator in Roslyn could be simplified.
P.S. Commenting via pull request is an interesting experience - it feels pro ;)
Dominik, while it isn't based on Roslyn, are you aware of AutoFixture?
Yes, I discovered this tool together with your blog ;) I think it is good enough - the Roslyn approach is only an alternative that isn't based on reflection or IL injection.
I will try to use AutoFixture in my next project, so I will see whether it survives my requirements.
If I understand correctly, one of your claims is that a fluent C# syntax for expressing change (i.e. "with" methods for an immutable value object) is equivalent to F#'s copy and update syntax for records in the sense that any code written with one can be written with the other. I agree with that. Then you pointed out some advantages with the F# syntax. Among the advantages of F#'s syntax is that there is less code to write in the first place and less code to maintain.
I see an advantage with C#'s syntax. Suppose the only constructor of the value object is internal but all its properties and "with" methods are public. Then adding a new (public) property and corresponding (public) "with" method is not a breaking change. As far as I know, this is not possible with F#.* Either the record constructor is public or it is not public. If the record's constructor is public, then the copy and update syntax is also public but adding a property to the record is a breaking change. Otherwise, the record's constructor is not public, so the copy and update syntax is not available.
I have an extremely short list of advantages of C# over F#, and this is one of them.
*It is possible to put an access modifier immediately after the equals sign when defining a record. However, the documentation for record syntax is missing this information. When I try to put an access modifier before a field identifier, I get a compiler error that says
P.S. For those that want to write functionally in C#, I recommend using Language Ext. In particular, a somewhat recently added feature is auto-generated "with" methods.
Tyson, thank you for writing. Let's get the uncontroversial part of this discussion out of the way first: F# record types compile to IL that's equivalent to what a properly-written C# Value Object compiles to. At the IL level, there's no difference.
At the language level, it's true that F# records are a specialised syntax that enables you to succinctly define static types to model data. It's not a general-purpose syntax, so there are definitely things it doesn't allow you to express. F# has normal class syntax for those needs.
That record types aren't refactoring-safe is a known issue. This is true not only for F# records, but for Haskell data types as well. In Haskell public APIs, you sometimes see the combination that you describe. The type has a private constructor, but the library then provides functions to manipulate it (essentially copy-and-update functions). You sometimes see that in F# as well, but here a class would often have been a better choice. Haskell doesn't have object-oriented classes, so it has to resort to that sort of hack to keep APIs backwards compatible.
When you write a public API in F#, choosing between a record and a class as a data carrier is an important choice. When APIs are published (e.g. on nuget.org), you'll have little success with your library if you regularly introduce breaking changes.
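As an illustration of that Haskell idiom, here's a minimal sketch. Everything in it is hypothetical: the underscore-prefixed field names, the accessor and with-functions, and the export list, which appears only as a comment so that the snippet loads as a single file:

```haskell
-- In a library, this file would start with a module header whose export list
-- leaves out the Address constructor and its record fields, for example:
--
--   module Address
--     ( Address, defaultAddress, street, city, withStreet, withCity ) where
--
-- Clients could then neither construct an Address directly nor use record
-- update syntax on it; they would have to go through the functions below.

data Address = Address
  { _street :: String
  , _city   :: String
  } deriving (Eq, Show)

-- The only way for clients to obtain a value in the first place.
defaultAddress :: Address
defaultAddress = Address { _street = "", _city = "" }

-- Read access goes through ordinary functions rather than exported fields,
-- because an exported field name would also permit record update syntax.
street :: Address -> String
street = _street

city :: Address -> String
city = _city

-- Hand-written copy-and-update functions.
withStreet :: String -> Address -> Address
withStreet s a = a { _street = s }

withCity :: String -> Address -> Address
withCity c a = a { _city = c }
```

Under these assumptions, adding a third field later, together with its accessor and with-function, wouldn't break callers, since none of them can mention the constructor. The price is that you write by hand what record syntax would otherwise give you.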
For internal use, the story is different. You can use F# records to express domain models with a few lines of code. If you later find out that you have to change the model, then you do that, and fix the ensuing compilation errors.
Public APIs represent more work, regardless of the language in which they're written. Yes, you need to carefully and deliberately design a public library's API and data structures. I don't think, however, that that should deter us from using productive language features for application-specific use.
I found the series that includes this blog post when I searched on Google for "builder pattern F#". This series is primarily about the test data builder design pattern. As I understand it, I would describe this pattern as a special case of the (general case) builder design pattern in which all arguments have reasonable defaults.
Have you ever written a builder that accepted multiple arguments one at a time, none of which have reasonable defaults? Have you ever blogged about this (more general) builder design pattern?
As a good student of yours ;) I wonder if the builder design pattern corresponds to some universal abstraction. Among the fluent interfaces that I am most impressed with are configuration in Entity Framework and Fluent Assertions. Of course I could try to make my own fluent interface by copying them, and that would probably work out reasonably well. At the same time, I would like to learn from you and your frustration (if that description is accurate) that you expressed (at the end of the next and last post in this series) with the API of your AutoFixture project failing to use a potential universal abstraction (namely functors).
Tyson, thank you for bringing the Builder pattern to my attention. I haven't written much about it yet, but I believe that it'd be a perfect fit for my article series on how certain design patterns relate to universal abstractions. When I get some time, I'll have to write one or more articles about that topic.
In short, though, I think that the Builder pattern as described in Design Patterns is isomorphic to the Fluent Builder pattern, as you also imply. It remains for me to more formally argue that case, but in short, the Builder pattern is described as a set of virtual methods that return void. Since all these methods return void, each method could, instead, return the object to which it belongs, and that's what a Fluent Builder does.
Once you return the Builder object, you could, instead of mutating and returning the instance, return a new object. That new object is a near-copy of the previous Builder, with only one change applied to it. Now you have a function that essentially takes a Builder as input, plus some other input, and returns a Builder. That's just a curried endomorphism.
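A small sketch may help here. It assumes a two-field Address that stands in for the Builder object, and the names withStreet, withCity, rueMorgueInParis, and build are made up for the occasion. Each function takes some input plus an Address and returns an Address, and the Endo wrapper from Data.Monoid lets such steps compose:

```haskell
import Data.Monoid (Endo (..))

-- A simplified stand-in for the object being built.
data Address = Address
  { street :: String
  , city   :: String
  } deriving (Eq, Show)

defaultAddress :: Address
defaultAddress = Address { street = "", city = "" }

-- Each 'Builder method' takes some input plus an Address and returns an Address.
-- Partially applied, it's an endomorphism: Address -> Address.
withStreet :: String -> Address -> Address
withStreet s a = a { street = s }

withCity :: String -> Address -> Address
withCity c a = a { city = c }

-- Endo wraps functions of type a -> a; its (<>) is function composition
-- and its mempty is the identity function.
rueMorgueInParis :: Endo Address
rueMorgueInParis = Endo (withStreet "Rue Morgue") <> Endo (withCity "Paris")

-- Running the composed steps against the default value yields the result.
build :: Endo Address -> Address
build steps = appEndo steps defaultAddress
```

With these definitions, build rueMorgueInParis yields an address on Rue Morgue in Paris, while build mempty simply returns defaultAddress.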
Once again, every time we run into a composable design pattern, it turns out to be a monoid. It shouldn't surprise us much, though, since the original Builder pattern as described in Design Patterns has void methods, and such methods compose.
The most formal treatment I have seen about fluent APIs was in this blog post. The context is that we are trying to create a word in some language specified by a grammar, and the methods in the fluent API correspond to production rules in the grammar. The company behind that blog post seems to be able to generate a fluent API (in Java) given as input the production rules of a grammar. Their main use case appears to be creating a fluent API for constructing SQL queries against a database (presumably by first converting a database schema into corresponding grammar production rules). The end result reminds me of F#'s SQL type provider.
Tyson, I've now published an article that hopefully answers some of your questions. I must admit that I'm still puzzled by this question:
If I left that unanswered, then at least I hope that I've managed to put enough building blocks into position to be able to address it. Can you elaborate?
I have now elaborated in this comment. Thanks for waiting :)