ploeh blog https://blog.ploeh.dk danish software design en-us Mark Seemann Mon, 25 Jan 2021 21:23:51 UTC Mon, 25 Jan 2021 21:23:51 UTC Self-hosted integration tests in ASP.NET https://blog.ploeh.dk/2021/01/25/self-hosted-integration-tests-in-aspnet/ Mon, 25 Jan 2021 07:45:00 UTC <div id="post"> <p> <em>A way to self-host a REST API and test it through HTTP.</em> </p> <p> In 2020 I developed a sizeable code base for an online restaurant REST API. In the spirit of <a href="/outside-in-tdd">outside-in TDD</a>, I found it best to test the HTTP behaviour of the API by actually interacting with it via HTTP. </p> <p> Sometimes ASP.NET offers more than one way to achieve the same end result. For example, to return <code>200 OK</code>, you can use both <code>OkObjectResult</code> and <code>ObjectResult</code>. I don't want my tests to be coupled to such implementation details, so by testing an API via HTTP instead of using the ASP.NET object model, I decouple the two. </p> <p> You can easily self-host an ASP.NET web API and test it using an <a href="https://docs.microsoft.com/dotnet/api/system.net.http.httpclient">HttpClient</a>. In this article, I'll show you how I went about it. </p> <h3 id="a070ceb1f9a84f2988aa7e4c59f38397"> Reserving a table <a href="#a070ceb1f9a84f2988aa7e4c59f38397" title="permalink">#</a> </h3> <p> In true outside-in fashion, I'll first show you the test. Then I'll break it down to show you how it works. 
</p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;<span style="color:#74531f;">ReserveTableAtNono</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">api</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">SelfHostedApi</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">client</span>&nbsp;=&nbsp;<span style="color:#1f377f;">api</span>.<span style="color:#74531f;">CreateClient</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">at</span>&nbsp;=&nbsp;<span style="color:#2b91af;">DateTime</span>.Today.<span style="color:#74531f;">AddDays</span>(434).<span style="color:#74531f;">At</span>(20,&nbsp;15); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">dto</span>&nbsp;=&nbsp;<span style="color:#2b91af;">Some</span>.Reservation.<span style="color:#74531f;">WithDate</span>(<span style="color:#1f377f;">at</span>).<span style="color:#74531f;">WithQuantity</span>(6).<span style="color:#74531f;">ToDto</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">response</span>&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;<span style="color:#1f377f;">client</span>.<span style="color:#74531f;">PostReservation</span>(<span style="color:#a31515;">&quot;Nono&quot;</span>,&nbsp;<span style="color:#1f377f;">dto</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;<span style="color:#74531f;">AssertRemainingCapacity</span>(<span style="color:#1f377f;">client</span>,&nbsp;<span style="color:#1f377f;">at</span>,&nbsp;<span style="color:#a31515;">&quot;Nono&quot;</span>,&nbsp;4); 
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;<span style="color:#74531f;">AssertRemainingCapacity</span>(<span style="color:#1f377f;">client</span>,&nbsp;<span style="color:#1f377f;">at</span>,&nbsp;<span style="color:#a31515;">&quot;Hipgnosta&quot;</span>,&nbsp;10); }</pre> </p> <p> This test uses <a href="https://xunit.net">xUnit.net</a> 2.4.1 to make a reservation at the restaurant named <em>Nono</em>. The first line that creates the <code>api</code> variable spins up a self-hosted instance of the REST API. The next line creates an <code>HttpClient</code> configured to communicate with the self-hosted instance. </p> <p> The test proceeds to create a Data Transfer Object that it posts to the <em>Nono</em> restaurant. It then asserts that the remaining capacities at the <em>Nono</em> and <em>Hipgnosta</em> restaurants are as expected. </p> <p> You'll see the implementation details soon, but I first want to discuss this high-level test. As is the case with most of my code, it's far from perfect. If you're not familiar with this code base, you may have plenty of questions: <ul> <li>Why does it make a reservation 434 days in the future? Why not 433, or 211, or 1?</li> <li>Is there anything significant about the quantity <em>6?</em></li> <li>Why is the expected remaining capacity at <em>Nono 4?</em></li> <li>Why is the expected remaining capacity at <em>Hipgnosta 10?</em> Why does it even verify that?</li> </ul> To answer the easiest question first: There's nothing special about 434 days. The only <a href="/2021/01/11/waiting-to-happen">requirement is that it's a positive number</a>, so that the reservation is in the future. That makes this test a great candidate for a <a href="/property-based-testing-intro">property-based test</a>. </p> <p> The three other questions are all related. A bit of background is in order. I wrote this test during a process where I turned the system into a multi-tenant system. 
Before that change, there was only one restaurant, which was <em>Hipgnosta</em>. I wanted to verify that if you make a reservation at another restaurant (here, <em>Nono</em>) it changes the observable state of that restaurant, and not of <em>Hipgnosta</em>. </p> <p> The way these two restaurants are configured, <em>Hipgnosta</em> has <a href="/2020/01/27/the-maitre-d-kata">a single communal table that seats ten guests</a>. This explains why the expected capacity of <em>Hipgnosta</em> is <em>10</em>. Making a reservation at <em>Nono</em> shouldn't affect <em>Hipgnosta</em>. </p> <p> <em>Nono</em> has a more complex table configuration. It has both standard and communal tables, but the largest table is a six-person communal table. There's only one table of that size. The next-largest tables are four-person tables. Thus, a reservation for six people reserves the largest table that day, after which only four-person and two-person tables are available. Therefore the remaining capacity ought to be <em>4</em>. </p> <p> The above test knows all this. You are welcome to criticise such hard-coded knowledge. There's a real risk that it might make it more difficult to maintain the test suite in the future. </p> <p> Certainly, had this been a unit test, and not an integration test, I wouldn't have accepted so much implicit knowledge - particularly because I mostly apply <a href="/2018/11/19/functional-architecture-a-definition">functional architecture</a>, and <a href="/2015/05/07/functional-design-is-intrinsically-testable">pure functions should have isolation</a>. Functions shouldn't depend on implicit global state; they should return a value based on input arguments. That's a bit of digression, though. </p> <p> These are integration tests, which I mostly use for smoke tests and to verify HTTP-specific behaviour. I have unit tests for fine-grained testing of edge cases and variations of input. 
While I wouldn't accept so much implicit knowledge from a unit test, I find that it so far works well with integration tests. </p> <h3 id="db1185165dc748aba67c8daa6278f523"> Self-hosting <a href="#db1185165dc748aba67c8daa6278f523" title="permalink">#</a> </h3> <p> It only takes a <a href="https://docs.microsoft.com/dotnet/api/microsoft.aspnetcore.mvc.testing.webapplicationfactory-1">WebApplicationFactory</a> to self-host an ASP.NET API. You can use it directly, but if you want to modify the hosted service in some way, you can also inherit from it. </p> <p> I want my self-hosted integration tests to run as <a href="/2019/02/18/from-interaction-based-to-state-based-testing">state-based tests</a> that use an in-memory database instead of SQL Server. I've defined <code>SelfHostedApi</code> for that purpose: </p> <p> <pre><span style="color:blue;">internal</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">SelfHostedApi</span>&nbsp;:&nbsp;<span style="color:#2b91af;">WebApplicationFactory</span>&lt;<span style="color:#2b91af;">Startup</span>&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">protected</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">ConfigureWebHost</span>(<span style="color:#2b91af;">IWebHostBuilder</span>&nbsp;<span style="color:#1f377f;">builder</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">builder</span>.<span style="color:#74531f;">ConfigureServices</span>(<span style="color:#1f377f;">services</span>&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">services</span>.<span style="color:#74531f;">RemoveAll</span>&lt;<span style="color:#2b91af;">IReservationsRepository</span>&gt;(); 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">services</span>.<span style="color:#74531f;">AddSingleton</span>&lt;<span style="color:#2b91af;">IReservationsRepository</span>&gt;(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">FakeDatabase</span>()); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> The way that <code>WebApplicationFactory</code> works, its <code>ConfigureWebHost</code> method runs <em>after</em> the <code>Startup</code> class' <code>ConfigureServices</code> method. Thus, when <code>ConfigureWebHost</code> runs, the <code>services</code> collection is already configured to use SQL Server. As <a href="/2020/04/20/unit-bias-against-collections#e6675033a3a9dc8a21c64650dff91b8432a9a151">Julius H so kindly pointed out to me</a>, the <code>RemoveAll</code> extension method removes all existing registrations of a service. I use it to remove the SQL Server dependency from the system, after which I replace it with a test-specific in-memory implementation. </p> <p> Since the in-memory database is configured with Singleton lifetime, that instance is going to be around for the lifetime of the service. While it's only keeping track of things in memory, it'll keep state until the service shuts down, which happens when the above <code>api</code> variable goes out of scope. </p> <p> Notice that <a href="/2020/11/30/name-by-role">I named the class by the role it plays</a> rather than which base class it derives from. 
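</p> <p> The <code>RemoveAll</code>-then-<code>AddSingleton</code> replacement used above can be tried in isolation. What follows is a minimal sketch, not code from the restaurant code base, assuming only the <code>Microsoft.Extensions.DependencyInjection</code> package. <code>IGreeter</code>, <code>RealGreeter</code>, and <code>FakeGreeter</code> are hypothetical stand-ins for <code>IReservationsRepository</code>, the SQL Server implementation, and <code>FakeDatabase</code>. Note that <code>RemoveAll</code> is an extension method that lives in the <code>Microsoft.Extensions.DependencyInjection.Extensions</code> namespace: </p> <p> <pre>using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions; // RemoveAll lives here

// Hypothetical stand-ins; none of these types exist in the article's code base.
public interface IGreeter { string Greet(); }
public sealed class RealGreeter : IGreeter { public string Greet() =&gt; "real"; }
public sealed class FakeGreeter : IGreeter { public string Greet() =&gt; "fake"; }

public static class Program
{
    public static void Main()
    {
        var services = new ServiceCollection();

        // Plays the role of Startup.ConfigureServices: the production registration.
        services.AddSingleton&lt;IGreeter, RealGreeter&gt;();

        // Plays the role of ConfigureWebHost: remove every existing registration
        // of the service, then register the test-specific replacement.
        services.RemoveAll&lt;IGreeter&gt;();
        services.AddSingleton&lt;IGreeter&gt;(new FakeGreeter());

        using var provider = services.BuildServiceProvider();
        Console.WriteLine(provider.GetRequiredService&lt;IGreeter&gt;().Greet());
    }
}</pre> </p> <p> Resolving <code>IGreeter</code> here yields the <code>FakeGreeter</code> instance, because it's the only registration left once <code>RemoveAll</code> has run.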
</p> <h3 id="af89655ac4d948f8a749aa20fffe2b12"> Posting a reservation <a href="#af89655ac4d948f8a749aa20fffe2b12" title="permalink">#</a> </h3> <p> The <code>PostReservation</code> method is an extension method on <code>HttpClient</code>: </p> <p> <pre><span style="color:blue;">internal</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&lt;<span style="color:#2b91af;">HttpResponseMessage</span>&gt;&nbsp;<span style="color:#74531f;">PostReservation</span>( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">HttpClient</span>&nbsp;<span style="color:#1f377f;">client</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="color:#1f377f;">name</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">object</span>&nbsp;<span style="color:#1f377f;">reservation</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="color:#1f377f;">json</span>&nbsp;=&nbsp;<span style="color:#2b91af;">JsonSerializer</span>.<span style="color:#74531f;">Serialize</span>(<span style="color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">content</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">StringContent</span>(<span style="color:#1f377f;">json</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">content</span>.Headers.ContentType.MediaType&nbsp;=&nbsp;<span style="color:#a31515;">&quot;application/json&quot;</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">resp</span>&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;<span style="color:#1f377f;">client</span>.<span style="color:#74531f;">GetRestaurant</span>(<span style="color:#1f377f;">name</span>); 
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">resp</span>.<span style="color:#74531f;">EnsureSuccessStatusCode</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">rest</span>&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;<span style="color:#1f377f;">resp</span>.<span style="color:#74531f;">ParseJsonContent</span>&lt;<span style="color:#2b91af;">RestaurantDto</span>&gt;(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">address</span>&nbsp;=&nbsp;<span style="color:#1f377f;">rest</span>.Links.<span style="color:#74531f;">FindAddress</span>(<span style="color:#a31515;">&quot;urn:reservations&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">await</span>&nbsp;<span style="color:#1f377f;">client</span>.<span style="color:#74531f;">PostAsync</span>(<span style="color:#1f377f;">address</span>,&nbsp;<span style="color:#1f377f;">content</span>); }</pre> </p> <p> It's part of a larger set of methods that enables an <code>HttpClient</code> to interact with the REST API. Three of those methods are visible here: <code>GetRestaurant</code>, <code>ParseJsonContent</code>, and <code>FindAddress</code>. These, and many other, methods form a client API for interacting with the REST API. While this is currently test code, it's ripe for being extracted to a reusable client SDK library. 
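</p> <p> To give a rough idea of what two of the helpers that aren't shown might look like, here's a minimal sketch built on <code>System.Text.Json</code>. It's an assumption for illustration, not the actual implementation: the <code>LinkDto</code> shape (with <code>Rel</code> and <code>Href</code> properties), the case-insensitive deserialisation, and the <code>ClientSketch</code> class name are all hypothetical: </p> <p> <pre>using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical link representation; the real DTO may look different.
public sealed class LinkDto
{
    public string Rel { get; set; }
    public string Href { get; set; }
}

internal static class ClientSketch
{
    // Reads the response body as a string and deserialises it to T.
    internal static async Task&lt;T&gt; ParseJsonContent&lt;T&gt;(this HttpResponseMessage msg)
    {
        var json = await msg.Content.ReadAsStringAsync();
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        return JsonSerializer.Deserialize&lt;T&gt;(json, options);
    }

    // Finds the single link with the given relation type and returns its address.
    internal static Uri FindAddress(this IEnumerable&lt;LinkDto&gt; links, string rel)
    {
        var href = links.Single(l =&gt; l.Rel == rel).Href;
        if (href is null)
            throw new InvalidOperationException($"Link with rel '{rel}' has no href.");
        return new Uri(href, UriKind.RelativeOrAbsolute);
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up data, only to exercise FindAddress.
        var links = new[]
        {
            new LinkDto { Rel = "urn:reservations", Href = "http://localhost/reservations" }
        };
        Console.WriteLine(links.FindAddress("urn:reservations"));
    }
}</pre> </p> <p> Following links by relation type, rather than building URLs in the client, is what lets <code>PostReservation</code> get by with a single well-known address.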
</p> <p> I'm not going to show all of them, but here's <code>GetRestaurant</code> to give you a sense of what's going on: </p> <p> <pre><span style="color:blue;">internal</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&lt;<span style="color:#2b91af;">HttpResponseMessage</span>&gt;&nbsp;<span style="color:#74531f;">GetRestaurant</span>(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">HttpClient</span>&nbsp;<span style="color:#1f377f;">client</span>,&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="color:#1f377f;">name</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">homeResponse</span>&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;<span style="color:#1f377f;">client</span>.<span style="color:#74531f;">GetAsync</span>(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Uri</span>(<span style="color:#a31515;">&quot;&quot;</span>,&nbsp;<span style="color:#2b91af;">UriKind</span>.Relative)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">homeResponse</span>.<span style="color:#74531f;">EnsureSuccessStatusCode</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">homeRepresentation</span>&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;<span style="color:#1f377f;">homeResponse</span>.<span style="color:#74531f;">ParseJsonContent</span>&lt;<span style="color:#2b91af;">HomeDto</span>&gt;(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">restaurant</span>&nbsp;=&nbsp;<span style="color:#1f377f;">homeRepresentation</span>.Restaurants.<span style="color:#74531f;">First</span>(<span style="color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;<span style="color:#1f377f;">r</span>.Name&nbsp;==&nbsp;<span style="color:#1f377f;">name</span>); 
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">address</span>&nbsp;=&nbsp;<span style="color:#1f377f;">restaurant</span>.Links.<span style="color:#74531f;">FindAddress</span>(<span style="color:#a31515;">&quot;urn:restaurant&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">await</span>&nbsp;<span style="color:#1f377f;">client</span>.<span style="color:#74531f;">GetAsync</span>(<span style="color:#1f377f;">address</span>); }</pre> </p> <p> The REST API has only a single documented address, which is the 'home' resource at the relative URL <code>""</code>; i.e. the root of the API. In this incarnation of the API, the home resource responds with a JSON array of restaurants. The <code>GetRestaurant</code> method finds the restaurant with the desired name and finds its address. It then issues another <code>GET</code> request against that address, and returns the response. </p> <h3 id="00d69af4ad6b43538b86f3e27fb7c4ee"> Verifying state <a href="#00d69af4ad6b43538b86f3e27fb7c4ee" title="permalink">#</a> </h3> <p> The verification phase of the above test calls a private helper method: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;<span style="color:#74531f;">AssertRemainingCapacity</span>( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">HttpClient</span>&nbsp;<span style="color:#1f377f;">client</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">DateTime</span>&nbsp;<span style="color:#1f377f;">date</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="color:#1f377f;">name</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">expected</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span 
style="color:#1f377f;">response</span>&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;<span style="color:#1f377f;">client</span>.<span style="color:#74531f;">GetDay</span>(<span style="color:#1f377f;">name</span>,&nbsp;<span style="color:#1f377f;">date</span>.Year,&nbsp;<span style="color:#1f377f;">date</span>.Month,&nbsp;<span style="color:#1f377f;">date</span>.Day); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">day</span>&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;<span style="color:#1f377f;">response</span>.<span style="color:#74531f;">ParseJsonContent</span>&lt;<span style="color:#2b91af;">CalendarDto</span>&gt;(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">All</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">day</span>.Days.<span style="color:#74531f;">Single</span>().Entries, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">e</span>&nbsp;=&gt;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>(<span style="color:#1f377f;">expected</span>,&nbsp;<span style="color:#1f377f;">e</span>.MaximumPartySize)); }</pre> </p> <p> It uses another of the above-mentioned client API extension methods, <code>GetDay</code>, to inspect the REST API's calendar entry for the restaurant and day in question. Each calendar contains a series of time entries that lists the largest party size the restaurant can accept at that time slot. The two restaurants in question only have single seatings, so once you've booked a six-person table, you have it for the entire evening. </p> <p> Notice that verification is done by interacting with the system itself. No <a href="http://xunitpatterns.com/Back%20Door%20Manipulation.html">Back Door Manipulation</a> is required. 
I favour this if at all possible, since I believe that it offers better confidence that the system behaves as it should. </p> <h3 id="e93c5eb19e1d4f358197b4a1c048f01e"> Conclusion <a href="#e93c5eb19e1d4f358197b4a1c048f01e" title="permalink">#</a> </h3> <p> It's been possible to self-host .NET REST APIs for testing purposes at least since 2012, but it's only become easier over the years. All you need to get started is the <code>WebApplicationFactory&lt;TEntryPoint&gt;</code> class, although you're probably going to need a derived class to override some of the system configuration. </p> <p> From there, you can interact with the self-hosted system using the standard <code>HttpClient</code> class. </p> <p> Since I configure these tests to run on an in-memory database, the execution time is comparable to 'normal' unit tests. I admit that I haven't measured it, but that's because I haven't felt the need to do so. </p> </div> 
Mark Seemann https://blog.ploeh.dk/2021/01/25/self-hosted-integration-tests-in-aspnet Parametrised test primitive obsession code smell https://blog.ploeh.dk/2021/01/18/parametrised-test-primitive-obsession-code-smell/ Mon, 18 Jan 2021 06:30:00 UTC <div id="post"> <p> <em>Watch out for this code smell with some unit testing frameworks.</em> </p> <p> In a <a href="/2021/01/11/waiting-to-happen">previous article</a> you saw this <a href="/2019/04/01/an-example-of-state-based-testing-in-c">state-based integration test</a>: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(1049,&nbsp;19,&nbsp;00,&nbsp;<span style="color:#a31515;">&quot;juliad@example.net&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Julia&nbsp;Domna&quot;</span>,&nbsp;5)] [<span style="color:#2b91af;">InlineData</span>(1130,&nbsp;18,&nbsp;15,&nbsp;<span style="color:#a31515;">&quot;x@example.com&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Xenia&nbsp;Ng&quot;</span>,&nbsp;9)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;956,&nbsp;16,&nbsp;55,&nbsp;<span style="color:#a31515;">&quot;kite@example.edu&quot;</span>,&nbsp;<span style="color:blue;">null</span>,&nbsp;2)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;433,&nbsp;17,&nbsp;30,&nbsp;<span style="color:#a31515;">&quot;shli@example.org&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Shanghai&nbsp;Li&quot;</span>,&nbsp;5)] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;PostValidReservationWhenDatabaseIsEmpty( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;days, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;hours, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;minutes, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">string</span>&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;quantity) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;at&nbsp;=&nbsp;<span style="color:#2b91af;">DateTime</span>.Now.Date&nbsp;+&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">TimeSpan</span>(days,&nbsp;hours,&nbsp;minutes,&nbsp;0); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;db&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">FakeDatabase</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sut&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationsController</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">SystemClock</span>(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">InMemoryRestaurantDatabase</span>(<span style="color:#2b91af;">Grandfather</span>.Restaurant), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;db); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;dto&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationDto</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Id&nbsp;=&nbsp;<span style="color:#a31515;">&quot;B50DF5B1-F484-4D99-88F9-1915087AF568&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;At&nbsp;=&nbsp;at.ToString(<span style="color:#a31515;">&quot;O&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Email&nbsp;=&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Name&nbsp;=&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;quantity &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;sut.Post(dto); 
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;expected&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Guid</span>.Parse(dto.Id), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">DateTime</span>.Parse(dto.At,&nbsp;<span style="color:#2b91af;">CultureInfo</span>.InvariantCulture), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Email</span>(dto.Email), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Name</span>(dto.Name&nbsp;??&nbsp;<span style="color:#a31515;">&quot;&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dto.Quantity); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Contains(expected,&nbsp;db.Grandfather); }</pre> </p> <p> This was the test <em>after</em> I improved it. Still, I wasn't satisfied with it. It has several problems. Take a few moments to consider it. Can you identify any problems? Which ones? </p> <h3 id="6da86d1040e544a3a67f509c0f3aade7"> Size <a href="#6da86d1040e544a3a67f509c0f3aade7" title="permalink">#</a> </h3> <p> I know that you're not familiar with all the moving parts. You don't know how <code>ReservationDto</code> or <code>Reservation</code> are implemented. You don't know what <code>InMemoryRestaurantDatabase</code> is, or how <code>ReservationsController</code> behaves. Still, the issues I have in mind aren't specific to a particular code base. </p> <p> I feel that the method is verging on being too big. Quantifiably, it doesn't fit in an <a href="/2019/11/04/the-80-24-rule">80x24 box</a>, but that's just an arbitrary <a href="/2020/04/13/curb-code-rot-with-thresholds">threshold</a> anyway. Still, I think it's grown to a size that makes me uncomfortable. 
</p> <p> If you aren't convinced, think of this code example as a stand-in for something larger. In the above test, a reservation contains five smaller values (<code>Id</code>, <code>At</code>, <code>Email</code>, <code>Name</code>, and <code>Quantity</code>). How would a similar test look if the object in question contains ten or twenty values? </p> <p> In the decades I've been programming and consulting, I've seen plenty of code bases. Data objects made from twenty fields are hardly unusual. </p> <p> What would a similar test look like if the <code>dto</code> and the <code>expected</code> object required twenty smaller values? </p> <p> The test would be too big. </p> <h3 id="e251bee2df414c5cb0d489f31183c664"> Primitive obsession <a href="#e251bee2df414c5cb0d489f31183c664" title="permalink">#</a> </h3> <p> A test like this one contains a mix of essential behaviour and implementation details. The behaviour that it verifies is that when you <code>Post</code> a valid <code>dto</code>, the data makes it all the way to the database. </p> <p> Exactly how the <code>dto</code> or the <code>expected</code> value is constructed is less relevant for the test. Yet it's intermingled with the test of behaviour. The signal-to-noise ratio in the test isn't all that great. What can you do to improve things? </p> <p> As given, it seems difficult to do much. The problem is <a href="/2011/05/25/DesignSmellPrimitiveObsession">primitive obsession</a>. While this is a <a href="http://xunitpatterns.com/Parameterized%20Test.html">parametrised test</a>, all the method parameters are primitives: integers and strings. That makes it hard to introduce useful abstractions. </p> <p> In C# (and probably other languages as well) parametrised tests often suffer from primitive obsession. The most common data source is an attribute (AKA <em>annotation</em>), like <a href="https://xunit.net">xUnit.net</a>'s <code>[InlineData]</code> attribute. 
This isn't a limitation of xUnit.net, but rather of .NET attributes. You can only create attributes with primitive values and arrays. </p> <p> What <em>is</em> a limitation of xUnit.net (and the other mainstream .NET testing frameworks, as far as I know) is that tests aren't first-class values. In <a href="https://www.haskell.org">Haskell</a>, by contrast, it's <a href="/2018/04/30/parametrised-unit-tests-in-haskell">easy to write parametrised tests using the normal language constructs</a> exactly because tests are first-class values. (I hope that the next version of xUnit.net will support tests as first-class values.) </p> <p> Imagine that instead of only five constituent fields, you'd have to write a parametrised test for objects with twenty primitive values. As long as you stick with attribute-based data sources, you'll be stuck with primitive values. </p> <p> Granted, attributes like <code>[InlineData]</code> are lightweight, but over the years, my patience with them has grown shorter. They lock me into primitive obsession, and I don't appreciate that. </p> <h3 id="9342d4306e174fe79946a131b9c6894c"> Essential test <a href="#9342d4306e174fe79946a131b9c6894c" title="permalink">#</a> </h3> <p> While tests as first-class values aren't an option in xUnit.net, you can provide other data sources for the <code>[Theory]</code> attribute than <code>[InlineData]</code>. It's not as lightweight, but it breaks the primitive obsession and re-enables normal code design techniques. It enables you to reduce the test itself to its essence. You no longer have to think in primitives, but can instead express the test unshackled by constraints. 
As a first pass, I'd like the test to look like this: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">ClassData</span>(<span style="color:blue;">typeof</span>(<span style="color:#2b91af;">PostValidReservationWhenDatabaseIsEmptyTestCases</span>))] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;PostValidReservationWhenDatabaseIsEmpty( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">ReservationDto</span>&nbsp;validDto,&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;expected) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;db&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">FakeDatabase</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sut&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationsController</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">SystemClock</span>(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">InMemoryRestaurantDatabase</span>(<span style="color:#2b91af;">Grandfather</span>.Restaurant), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;db); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;sut.Post(validDto); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Contains(expected,&nbsp;db.Grandfather); }</pre> </p> <p> This version of the test eliminates the noise. How <code>validDto</code> and <code>expected</code> are constructed is an implementation detail that has little bearing on the behaviour being tested. 
</p> <p> For a reader of the code, it should now be clearer what's at stake here: If you <code>Post</code> a <code>validDto</code>, the <code>expected</code> reservation should appear in the database. </p> <p> Reducing the test code to its essentials made me realise something that hitherto had escaped me: that I could <a href="/2020/11/30/name-by-role">name the DTO by role</a>. Instead of just <code>dto</code>, I could call the parameter <code>validDto</code>. </p> <p> Granted, I could also have done that previously, but I didn't think of it. There was so much noise in that test that I didn't stop to consider whether <code>dto</code> sufficiently communicated the role of that variable. </p> <p> The less code, the easier it becomes to think such things through, I find. </p> <p> In any case, the test code now much more succinctly expresses the essence of the desired behaviour. Notice how I started my refactoring by writing the desired test code. I've yet to implement the data source. Now that the data source expresses test data as full objects, I'm not so concerned with whether or not that's going to be possible. Of course it's possible. </p> <h3 id="469ee6ae8f264e21a544f11be2476111"> Object data source <a href="#469ee6ae8f264e21a544f11be2476111" title="permalink">#</a> </h3> <p> You can define data sources for xUnit.net as classes or methods. In C# I usually reach for the <code>[ClassData]</code> option, since an object (in C#, that is) gives me better options for further decomposition. 
For example, I can define a class and delegate the details to helper methods: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">PostValidReservationWhenDatabaseIsEmptyTestCases</span>&nbsp;: &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">TheoryData</span>&lt;<span style="color:#2b91af;">ReservationDto</span>,&nbsp;<span style="color:#2b91af;">Reservation</span>&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">PostValidReservationWhenDatabaseIsEmptyTestCases</span>() &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AddWithName(1049,&nbsp;19,&nbsp;00,&nbsp;<span style="color:#a31515;">&quot;juliad@example.net&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Julia&nbsp;Domna&quot;</span>,&nbsp;5); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AddWithName(1130,&nbsp;18,&nbsp;15,&nbsp;<span style="color:#a31515;">&quot;x@example.com&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Xenia&nbsp;Ng&quot;</span>,&nbsp;9); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AddWithoutName(956,&nbsp;16,&nbsp;55,&nbsp;<span style="color:#a31515;">&quot;kite@example.edu&quot;</span>,&nbsp;2); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AddWithName(433,&nbsp;17,&nbsp;30,&nbsp;<span style="color:#a31515;">&quot;shli@example.org&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Shanghai&nbsp;Li&quot;</span>,&nbsp;5); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;More&nbsp;members&nbsp;here...</span></pre> </p> <p> Here, I'm taking advantage of xUnit.net's built-in <code>TheoryData&lt;T1, T2&gt;</code> base class, but that's just a convenience. All you have to do is to implement <code>IEnumerable&lt;object[]&gt;</code>. </p> <p> As you can see, the constructor adds the four test cases by calling two private helper methods. 
Here's the first of those: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">const</span>&nbsp;<span style="color:blue;">string</span>&nbsp;id&nbsp;=&nbsp;<span style="color:#a31515;">&quot;B50DF5B1-F484-4D99-88F9-1915087AF568&quot;</span>; <span style="color:blue;">private</span>&nbsp;<span style="color:blue;">void</span>&nbsp;AddWithName( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;days, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;hours, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;minutes, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;quantity) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;at&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">DateTime</span>.Now.Date&nbsp;+&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">TimeSpan</span>(days,&nbsp;hours,&nbsp;minutes,&nbsp;0); &nbsp;&nbsp;&nbsp;&nbsp;Add(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationDto</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Id&nbsp;=&nbsp;id, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;At&nbsp;=&nbsp;at.ToString(<span style="color:#a31515;">&quot;O&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Email&nbsp;=&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Name&nbsp;=&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;quantity &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span 
style="color:#2b91af;">Reservation</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Guid</span>.Parse(id), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;at, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Email</span>(email), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Name</span>(name), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;quantity)); }</pre> </p> <p> The other helper method is almost identical, although it has a slight variation when it comes to the reservation name: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">void</span>&nbsp;AddWithoutName( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;days, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;hours, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;minutes, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;quantity) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;at&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">DateTime</span>.Now.Date&nbsp;+&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">TimeSpan</span>(days,&nbsp;hours,&nbsp;minutes,&nbsp;0); &nbsp;&nbsp;&nbsp;&nbsp;Add(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationDto</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Id&nbsp;=&nbsp;id, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;At&nbsp;=&nbsp;at.ToString(<span 
style="color:#a31515;">&quot;O&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Email&nbsp;=&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;quantity &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Guid</span>.Parse(id), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;at, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Email</span>(email), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Name</span>(<span style="color:#a31515;">&quot;&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;quantity)); }</pre> </p> <p> In total, this refactoring results in <em>more</em> code, so how is this an improvement? </p> <h3 id="9722aade54194b1a87c7312d16ed7fc8"> The paradox of decomposition <a href="#9722aade54194b1a87c7312d16ed7fc8" title="permalink">#</a> </h3> <p> In object-oriented design, decomposition tends to lead to more code. If you want to isolate and make reusable a particular piece of behaviour, you'll usually introduce an interface or a base class. Even stateless functions need a static class to define them. (To be fair, functional programming isn't entirely devoid of such overhead associated with decomposition, but the cost tends to be smaller.) This leads to more code, compared with the situation before decomposition. </p> <p> This is a situation you may also encounter if you attempt to refactor to design patterns, or follow the <a href="/encapsulation-and-solid">SOLID principles</a>. 
You'll have more code than when you started. This often leads to resistance to such 'code bloat'. </p> <p> It's fine to resist code bloat. It's also fine to dislike 'complexity for complexity's sake'. Try to evaluate each potential code change based on advantages and disadvantages. I'm not insisting that the above refactoring is objectively better. I did feel, however, that I had a problem that I ought to address, and that this was a viable alternative. The result is more code, but each piece of code is smaller and simpler. </p> <p> You can, conceivably, read the test method itself to get a feel for what it tests, even if you don't know all the implementation details. You can read the four statements in the <code>PostValidReservationWhenDatabaseIsEmptyTestCases</code> constructor without, I hope, understanding all the details about the two helper methods. And you <em>can</em> read <code>AddWithName</code> without understanding how <code>AddWithoutName</code> works, and vice versa, because these two methods don't depend on each other. </p> <h3 id="27b546f17ba44571a11ed44db9d45a09"> Conclusion <a href="#27b546f17ba44571a11ed44db9d45a09" title="permalink">#</a> </h3> <p> In this article, I've described how the use of code annotations for parametrised tests tends to pull in the direction of primitive obsession. This is a force worth keeping an eye on, I think. </p> <p> You saw how to refactor to class-based test data generation. This enables you to use objects instead of primitives, thus opening your design palette. You can now use all your usual object-oriented or functional design skills to factor the code in a way that's satisfactory. </p> <p> Was it worth it in this case? Keep in mind that the original problem was already marginal. While the code didn't fit in an 80x24 box, it was only 33 lines of code (excluding the test data). 
Imagine, however, that instead of a five-field reservation, you'd be dealing with a twenty-field data class, and such a refactoring begins to look more compelling. </p> <p> Is the code now perfect? It still isn't. I'm a little put off by the similarity of <code>AddWithName</code> and <code>AddWithoutName</code>. I'm also aware that there's a trace of production code duplicated in the test case, in the way that the test code duplicates how a valid <code>ReservationDto</code> relates to a <code>Reservation</code>. I'm on the fence whether I should do anything about this. </p> <p> At the moment I'm inclined to heed <a href="https://en.wikipedia.org/wiki/Rule_of_three_(computer_programming)">the rule of three</a>. The duplication is still too insubstantial to warrant refactoring, but it's worth keeping an eye on. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Mark Seemann https://blog.ploeh.dk/2021/01/18/parametrised-test-primitive-obsession-code-smell Waiting to happen https://blog.ploeh.dk/2021/01/11/waiting-to-happen/ Mon, 11 Jan 2021 06:31:00 UTC <div id="post"> <p> <em>A typical future test maintenance problem.</em> </p> <p> In <a href="/2020/12/07/branching-tests">a recent article</a> I showed a unit test and parenthetically mentioned that it might have a future maintenance problem. Here's a more recent version of the same test. Can you tell what the future issue might be? 
</p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;2023-11-24&nbsp;19:00&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;juliad@example.net&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Julia&nbsp;Domna&quot;</span>,&nbsp;5)] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;2024-02-13&nbsp;18:15&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;x@example.com&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Xenia&nbsp;Ng&quot;</span>,&nbsp;9)] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;2023-08-23&nbsp;16:55&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;kite@example.edu&quot;</span>,&nbsp;<span style="color:blue;">null</span>,&nbsp;2)] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;2022-03-18&nbsp;17:30&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;shli@example.org&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Shanghai&nbsp;Li&quot;</span>,&nbsp;5)] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;PostValidReservationWhenDatabaseIsEmpty( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;at,&nbsp;<span style="color:blue;">string</span>&nbsp;email,&nbsp;<span style="color:blue;">string</span>&nbsp;name,&nbsp;<span style="color:blue;">int</span>&nbsp;quantity) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;db&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">FakeDatabase</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sut&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationsController</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span 
style="color:#2b91af;">SystemClock</span>(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">InMemoryRestaurantDatabase</span>(<span style="color:#2b91af;">Grandfather</span>.Restaurant), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;db); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;dto&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationDto</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Id&nbsp;=&nbsp;<span style="color:#a31515;">&quot;B50DF5B1-F484-4D99-88F9-1915087AF568&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;At&nbsp;=&nbsp;at, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Email&nbsp;=&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Name&nbsp;=&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;quantity &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;sut.Post(dto); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;expected&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Guid</span>.Parse(dto.Id), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">DateTime</span>.Parse(dto.At,&nbsp;<span style="color:#2b91af;">CultureInfo</span>.InvariantCulture), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Email</span>(dto.Email), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Name</span>(dto.Name&nbsp;??&nbsp;<span style="color:#a31515;">&quot;&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dto.Quantity); &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:#2b91af;">Assert</span>.Contains(expected,&nbsp;db.Grandfather); }</pre> </p> <p> To be honest, there's more than one problem with this test, but presently I'm going to focus on one of them. </p> <p> Since you don't know the details of the implementation, you may not be able to tell what the problem might be. It's not a trick question. On the other hand, you might still be able to guess, just from the clues available in the above code listing. </p> <h3 id="970cada688cc4a04abbb1c11328931f1"> Sooner or later <a href="#970cada688cc4a04abbb1c11328931f1" title="permalink">#</a> </h3> <p> Here are some clues to consider: I'm writing this article in the beginning of 2021. Consider the dates supplied via the <code>[InlineData]</code> attributes. Seen from 2021, they're all in the future. </p> <p> Notice, as well, that the <code>sut</code> takes a <code>SystemClock</code> dependency. You don't know the <code>SystemClock</code> class (it's a proprietary class in this code base), but from the name I'm sure that you can imagine what it represents. </p> <p> From the perspective of early 2021, all dates are going to be in the future for more than a year. What is going to happen, though, once the test runs after March 18, 2022? </p> <p> That test case is going to fail. </p> <p> You can't tell from the above code listing, but the system under test rejects reservations in the past. Once March 18, 2022 has come and gone, the reservation at <code>"2022-03-18 17:30"</code> is going to be in the past. The <code>sut</code> will reject the reservation, and the assertion will fail. </p> <p> You have to be careful with tests that rely on the system clock. </p> <h3 id="61e0c1225dd34c40a0ceb19270ae9007"> Test Double? <a href="#61e0c1225dd34c40a0ceb19270ae9007" title="permalink">#</a> </h3> <p> The fundamental problem is that the system clock is non-deterministic. 
A typical reaction to non-determinism in unit testing is to introduce a <a href="https://martinfowler.com/bliki/TestDouble.html">Test Double</a> of some sort. Instead of using the system clock, you could use a <a href="/2013/10/23/mocks-for-commands-stubs-for-queries">Stub</a> as a stand-in for the real time. </p> <p> This is possible here as well. The <code>ReservationsController</code> class actually depends on an <code>IClock</code> interface that <code>SystemClock</code> implements. You could define a test-specific <code>ConstantClock</code> implementation that would always return a constant date and time. This would actually work, but would rely on an implementation detail. </p> <p> At the moment, the <code>ReservationsController</code> only calls <code>Clock.GetCurrentDateTime()</code> a <em>single time</em> to get the current time. As soon as it has that value, it passes it to a <a href="https://en.wikipedia.org/wiki/Pure_function">pure function</a>, which implements <a href="/2020/01/27/the-maitre-d-kata">the business logic</a>: </p> <p> <pre><span style="color:blue;">var</span>&nbsp;now&nbsp;=&nbsp;Clock.GetCurrentDateTime(); <span style="color:blue;">if</span>&nbsp;(!restaurant.MaitreD.WillAccept(now,&nbsp;reservations,&nbsp;reservation)) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;NoTables500InternalServerError();</pre> </p> <p> A <code>ConstantClock</code> would work, but only as long as the <code>ReservationsController</code> only calls <code>Clock.GetCurrentDateTime()</code> once. If it ever began to call this method multiple times to detect the passing of time, using a constant time value would most likely break the test again. This seems brittle, so I don't want to go that way. </p> <h3 id="29f9100aa1104cd2a1db18146384bc17"> Relative time <a href="#29f9100aa1104cd2a1db18146384bc17" title="permalink">#</a> </h3> <p> Working with the system clock in automated tests is easier if you deal with relative time. 
Instead of defining the test cases as absolute dates, express them as days into the future. Here's one way to refactor the test: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(1049,&nbsp;19,&nbsp;00,&nbsp;<span style="color:#a31515;">&quot;juliad@example.net&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Julia&nbsp;Domna&quot;</span>,&nbsp;5)] [<span style="color:#2b91af;">InlineData</span>(1130,&nbsp;18,&nbsp;15,&nbsp;<span style="color:#a31515;">&quot;x@example.com&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Xenia&nbsp;Ng&quot;</span>,&nbsp;9)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;956,&nbsp;16,&nbsp;55,&nbsp;<span style="color:#a31515;">&quot;kite@example.edu&quot;</span>,&nbsp;<span style="color:blue;">null</span>,&nbsp;2)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;433,&nbsp;17,&nbsp;30,&nbsp;<span style="color:#a31515;">&quot;shli@example.org&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Shanghai&nbsp;Li&quot;</span>,&nbsp;5)] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;PostValidReservationWhenDatabaseIsEmpty( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;days, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;hours, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;minutes, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;quantity) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;at&nbsp;=&nbsp;<span style="color:#2b91af;">DateTime</span>.Now.Date&nbsp;+&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">TimeSpan</span>(days,&nbsp;hours,&nbsp;minutes,&nbsp;0); &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">var</span>&nbsp;db&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">FakeDatabase</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sut&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationsController</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">SystemClock</span>(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">InMemoryRestaurantDatabase</span>(<span style="color:#2b91af;">Grandfather</span>.Restaurant), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;db); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;dto&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationDto</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Id&nbsp;=&nbsp;<span style="color:#a31515;">&quot;B50DF5B1-F484-4D99-88F9-1915087AF568&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;At&nbsp;=&nbsp;at.ToString(<span style="color:#a31515;">&quot;O&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Email&nbsp;=&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Name&nbsp;=&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;quantity &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;sut.Post(dto); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;expected&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Guid</span>.Parse(dto.Id), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">DateTime</span>.Parse(dto.At,&nbsp;<span 
style="color:#2b91af;">CultureInfo</span>.InvariantCulture), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Email</span>(dto.Email), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Name</span>(dto.Name&nbsp;??&nbsp;<span style="color:#a31515;">&quot;&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dto.Quantity); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Contains(expected,&nbsp;db.Grandfather); }</pre> </p> <p> The absolute dates always were fairly arbitrary, so I just took the current date and converted the dates to a number of days into the future. Now, the first test case will always be a date 1,049 days (not quite three years) into the future, instead of November 24, 2023. </p> <p> The test is no longer a failure waiting to happen. </p> <h3 id="906499c9b3d647d08d2f48eb5991cf43"> Conclusion <a href="#906499c9b3d647d08d2f48eb5991cf43" title="permalink">#</a> </h3> <p> Treating test cases that involve time and date as relative to the current time, instead of as absolute values, is usually a good idea if the system under test depends on the system clock. </p> <p> It's always a good idea to factor as much code as you can as pure functions, like the above <code>WillAccept</code> method. Pure functions don't depend on the system clock, so here you can safely pass absolute time and date values. Pure functions are <a href="/2015/05/07/functional-design-is-intrinsically-testable">intrinsically testable</a>. </p> <p> Still, as the <a href="https://martinfowler.com/bliki/TestPyramid.html">test pyramid</a> suggests, relying exclusively on unit tests isn't a good idea. The test shown in this article isn't really a unit test, but rather a <a href="/2019/04/01/an-example-of-state-based-testing-in-c">state-based integration test</a>. 
It relies on both the system clock and a <a href="http://xunitpatterns.com/Fake%20Object.html">Fake</a> database. Expressing the test cases for this test as relative time values effectively addresses the problem introduced by the system clock. </p> <p> There are plenty of other problems with the above test. One thing that bothers me is that the 'fix' made the line count grow. It didn't quite fit into an <a href="/2019/11/04/the-80-24-rule">80x24 box</a> before, but now it's even worse! I should do something about that, but that's a topic for <a href="/2021/01/18/parametrised-test-primitive-obsession-code-smell">another article</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Mark Seemann https://blog.ploeh.dk/2021/01/11/waiting-to-happen Dynamic test oracles for rho problems https://blog.ploeh.dk/2021/01/04/dynamic-test-oracles-for-rho-problems/ Mon, 04 Jan 2021 06:26:00 UTC <div id="post"> <p> <em>A proof of concept of cross-branch testing for compiled languages.</em> </p> <p> <a href="https://www.hillelwayne.com">Hillel Wayne</a> recently published an article called <a href="https://buttondown.email/hillelwayne/archive/cross-branch-testing/">Cross-Branch Testing</a>. It outlines an approach to a class of problems that are hard to test. He mentions computer vision and simulations, among others. I can add that it's also <a href="/2015/10/19/visual-value-verification">difficult to write intuitive tests of convex hulls</a> and <a href="https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life">Conway's game of life</a>. </p> <p> Hillel Wayne calls these <em>rho problems</em>, 'just because'. I'm totally going to run with that term. </p> <p> In the article, he outlines an approach where you test an iteration of rho code against a 'last known good' snapshot. He uses <code>git worktree</code> to set up a snapshot of the reference implementation. 
He then writes a property that compares the refactored code's behaviour against the reference. </p> <p> The example code is in <a href="https://www.python.org">Python</a>, which is a language that I don't know. As far as I can tell, it works because Python is 'lightweight' enough that you can load and execute source code directly. I found that the approach makes much sense, but I wondered how it would apply for statically typed, compiled languages. I decided to create a proof of concept in <a href="https://fsharp.org">F#</a>. </p> <h3 id="ab0c6139b2c84e148fe1173c2508ec62"> Test cases from Python <a href="#ab0c6139b2c84e148fe1173c2508ec62" title="permalink">#</a> </h3> <p> My first problem was to port Hillel Wayne's example rho problem to F#. The function <code>f</code> doesn't have any immediate mathematical properties; nor is its behaviour intuitive. While I think that I understand what each line of code in <code>f</code> means, I don't really know Python. Since one of the properties of rho problems is that bugs can be subtle, I didn't trust myself to be able to port the Python code to F# without some test cases. </p> <p> To solve that problem, I first found an online Python interpreter and pasted the <code>f</code> function into it. I then wrote code to print the output of a function call: </p> <p> <pre>print(f'1, 2, 3, { f(1, 2, 3) }')</pre> </p> <p> This line of code produces this output: </p> <p> <pre>1, 2, 3, True</pre> </p> <p> In other words, I could produce comma-separated values of input and actual output. </p> <p> Hillel Wayne wrote properties using <a href="https://hypothesis.works">Hypothesis</a>, which, <a href="https://hypothesis.works/articles/how-many-tests">it seems</a>, by default runs each property 200 times. 
</p> <p> In F# I'm going to use <a href="https://fscheck.github.io/FsCheck">FsCheck</a>, so I first used <em>F# Interactive</em> with FsCheck to produce 200 Python <code>print</code> statements like the above: </p> <p> <pre>&gt; Arb.Default.Int32().Generator |&gt; Gen.three |&gt; Gen.map (fun (x, y, z) -&gt; sprintf "print(f'%i, %i, %i, { f(%i, %i, %i) }')" x y z x y z) |&gt; Gen.sample 100 200 |&gt; List.iter (printfn "%s");; print(f'-77, 67, 84, { f(-77, 67, 84) }') print(f'58, -46, 3, { f(58, -46, 3) }') print(f'21, 13, 94, { f(21, 13, 94) }') ... </pre> </p> <p> This is a throwaway data pipeline that starts with an FsCheck integer generator, creates a triple from it, turns that triple into a Python <code>print</code> statement, and finally writes 200 of those to the console. The above code listing only shows the first three lines of output, while the rest are indicated by an ellipsis. </p> <p> I copied those 200 <code>print</code> statements over to the online Python interpreter and ran the code. That produced 200 comma-separated values like these: </p> <p> <pre>-77, 67, 84, False 58, -46, 3, False 21, 13, 94, True ...</pre> </p> <p> These can serve as test cases for porting the Python code to F#. </p> <h3 id="936410215a784d73ae9b5dcba9125a4c"> Port to F# <a href="#936410215a784d73ae9b5dcba9125a4c" title="permalink">#</a> </h3> <p> The next step is to write a parametrised test, using a provisional implementation of <code>f</code>: </p> <p> <pre>[&lt;Theory;&nbsp;MemberData(nameof&nbsp;fTestCases)&gt;] <span style="color:blue;">let</span>&nbsp;``test&nbsp;f``&nbsp;x&nbsp;y&nbsp;z&nbsp;expected&nbsp;=&nbsp;expected&nbsp;=!&nbsp;f&nbsp;x&nbsp;y&nbsp;z</pre> </p> <p> This test uses <a href="https://xunit.net">xUnit.net</a> 2.4.1 and <a href="https://github.com/SwensenSoftware/Unquote">Unquote</a> 5.0.0. As you can tell, apart from the annotations, it's a true one-liner. 
It calls the <code>f</code> function with the three supplied arguments <code>x</code>, <code>y</code>, and <code>z</code> and compares the return value with the <code>expected</code> value. </p> <p> The code uses the new <a href="https://docs.microsoft.com/dotnet/fsharp/language-reference/nameof">nameof</a> feature of F# 5. <code>fTestCases</code> is a function in the same module that holds the test: </p> <p> <pre><span style="color:green;">//&nbsp;unit&nbsp;-&gt;&nbsp;seq&lt;obj&nbsp;[]&gt;</span>
<span style="color:blue;">let</span>&nbsp;fTestCases&nbsp;()&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">use</span>&nbsp;strm&nbsp;=&nbsp;typeof&lt;Anchor&gt;.Assembly.GetManifestResourceStream&nbsp;streamName
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">use</span>&nbsp;rdr&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;StreamReader&nbsp;(strm)
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;s&nbsp;=&nbsp;rdr.ReadToEnd&nbsp;()
&nbsp;&nbsp;&nbsp;&nbsp;s.Split&nbsp;Environment.NewLine&nbsp;|&gt;&nbsp;Seq.map&nbsp;csvToTestCase</pre> </p> <p> It reads an embedded resource stream of test cases, like the above comma-separated values. Even though the values are in a text file, it's easier to embed the file in the test assembly, because it nicely dispenses with the problem of copying a text file to the appropriate output directory when the code compiles. That would, however, be a valid alternative. </p> <p> <code>Anchor</code> is a dummy type to support <code>typeof</code>, and <code>streamName</code> is just a string constant that identifies the name of the stream. 
</p> <p> The <code>csvToTestCase</code> function converts each line of comma-separated values to test cases for the <code>[&lt;Theory&gt;]</code> attribute: </p> <p> <pre><span style="color:green;">//&nbsp;string&nbsp;-&gt;&nbsp;obj&nbsp;[]</span>
<span style="color:blue;">let</span>&nbsp;csvToTestCase&nbsp;(csv&nbsp;:&nbsp;string)&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;values&nbsp;=&nbsp;csv.Split&nbsp;<span style="color:#a31515;">&#39;,&#39;</span>
&nbsp;&nbsp;&nbsp;&nbsp;[|
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;values.[0]&nbsp;|&gt;&nbsp;Convert.ToInt32&nbsp;|&gt;&nbsp;box
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;values.[1]&nbsp;|&gt;&nbsp;Convert.ToInt32&nbsp;|&gt;&nbsp;box
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;values.[2]&nbsp;|&gt;&nbsp;Convert.ToInt32&nbsp;|&gt;&nbsp;box
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;values.[3]&nbsp;|&gt;&nbsp;Convert.ToBoolean&nbsp;|&gt;&nbsp;box
&nbsp;&nbsp;&nbsp;&nbsp;|]</pre> </p> <p> It's not the safest code I could write, but this is, after all, a proof of concept. 
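</p> <p> As a hedged illustration of what 'safer' could look like, the following sketch parses a line with <code>TryParse</code> and returns an option instead of throwing. The name <code>tryCsvToTestCase</code> is hypothetical and not part of the original code base; blank lines (such as a trailing one after splitting the file) and malformed rows become <code>None</code>: </p>

```fsharp
open System

// Hypothetical, more defensive variant of csvToTestCase (an assumption,
// not the article's code). Returns None instead of throwing on bad input.
// string -> obj [] option
let tryCsvToTestCase (csv : string) =
    let values = csv.Split ',' |> Array.map (fun s -> s.Trim ())
    let tryInt (s : string) =
        match Int32.TryParse s with
        | true, i -> Some i
        | _ -> None
    let tryBool (s : string) =
        match Boolean.TryParse s with
        | true, b -> Some b
        | _ -> None
    match values with
    | [| x; y; z; expected |] ->
        match tryInt x, tryInt y, tryInt z, tryBool expected with
        | Some x, Some y, Some z, Some e -> Some [| box x; box y; box z; box e |]
        | _ -> None
    | _ -> None
```

<p> With such a function, <code>fTestCases</code> could use <code>Seq.choose</code> rather than <code>Seq.map</code>, at the cost of silently skipping rows that fail to parse. 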
</p> <p> The most direct port of the Python code I could produce is this: </p> <p> <pre><span style="color:green;">//&nbsp;f&nbsp;:&nbsp;int&nbsp;-&gt;&nbsp;int&nbsp;-&gt;&nbsp;int&nbsp;-&gt;&nbsp;bool</span>
<span style="color:blue;">let</span>&nbsp;f&nbsp;(x&nbsp;:&nbsp;int)&nbsp;(y&nbsp;:&nbsp;int)&nbsp;(z&nbsp;:&nbsp;int)&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;<span style="color:blue;">mutable</span>&nbsp;mx&nbsp;=&nbsp;bigint&nbsp;x
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;<span style="color:blue;">mutable</span>&nbsp;my&nbsp;=&nbsp;bigint&nbsp;y
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;<span style="color:blue;">mutable</span>&nbsp;mz&nbsp;=&nbsp;bigint&nbsp;z
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;<span style="color:blue;">mutable</span>&nbsp;out&nbsp;=&nbsp;0I
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">for</span>&nbsp;i&nbsp;<span style="color:blue;">in</span>&nbsp;[0I..9I]&nbsp;<span style="color:blue;">do</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;out&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;out&nbsp;*&nbsp;mx&nbsp;+&nbsp;abs&nbsp;(my&nbsp;*&nbsp;mz&nbsp;-&nbsp;i&nbsp;*&nbsp;i)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;x&#39;&nbsp;=&nbsp;mx
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;y&#39;&nbsp;=&nbsp;my
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;z&#39;&nbsp;=&nbsp;mz
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;mx&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;y&#39;&nbsp;+&nbsp;1I
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;my&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;z&#39;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;mz&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;x&#39;
&nbsp;&nbsp;&nbsp;&nbsp;abs&nbsp;out&nbsp;%&nbsp;100I&nbsp;&lt;&nbsp;10I</pre> </p> 
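<p> To make the loop a little less opaque, here is a small sketch that records the value of <code>out</code> after each of the ten iterations, using the same update rule as the port above. The <code>trace</code> function is a hypothetical helper added for illustration only; it isn't part of the ported code: </p>

```fsharp
// Hypothetical helper (not part of the port): returns the value of `out`
// after each of the ten iterations, using the same update rule as f.
// int -> int -> int -> bigint list
let trace (x : int) (y : int) (z : int) =
    let step (x : bigint, y : bigint, z : bigint, out : bigint) (i : bigint) =
        let out = out * x + abs (y * z - i * i)
        y + 1I, z, x, out
    [0I..9I]
    |> List.scan step (bigint x, bigint y, bigint z, 0I)
    |> List.tail                              // drop the initial state
    |> List.map (fun (_, _, _, out) -> out)

// For the input 1, 2, 3 the successive values of `out` are
// 6, 20, 81, 165, 668, 3357, 10087, 50469, 302863 and 1211503.
printfn "%A" (trace 1 2 3)
```

<p> The final value, 1211503, leaves a remainder of 3 when divided by 100. Since 3 is less than 10, the function returns <code>true</code>, which agrees with the <code>1, 2, 3, True</code> line that the Python interpreter printed. </p> 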
<p> As F# code goes, it's disagreeable, but it passes all 200 test cases, so this will serve as an initial implementation. The <code>out</code> variable can grow to values that overflow even 64-bit integers, so I had to convert to <a href="https://docs.microsoft.com/dotnet/api/system.numerics.biginteger">bigint</a> to get all test cases to pass. </p> <p> If I make the same mutation to the code that Hillel Wayne did (<code>abs&nbsp;out&nbsp;%&nbsp;100I&nbsp;&lt;&nbsp;9I</code>) two test cases fail. This gives me some confidence that I have a degree of problem coverage comparable to his. </p> <h3 id="289e30b4098c4e54b8390af0b672caf1"> Test oracle <a href="#289e30b4098c4e54b8390af0b672caf1" title="permalink">#</a> </h3> <p> Now that a reference implementation exists, we can use it as a <a href="https://en.wikipedia.org/wiki/Test_oracle">test oracle</a> for refactorings. You can, for example, add a little test-only utility to your program portfolio: </p> <p> <pre><span style="color:blue;">open</span>&nbsp;Prod
<span style="color:blue;">open</span>&nbsp;FsCheck
[&lt;EntryPoint&gt;]
<span style="color:blue;">let</span>&nbsp;main&nbsp;argv&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;Arb.Default.Int32().Generator
&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Gen.three
&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Gen.sample&nbsp;100&nbsp;200
&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;List.iter&nbsp;(<span style="color:blue;">fun</span>&nbsp;(x,&nbsp;y,&nbsp;z)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;printfn&nbsp;<span style="color:#a31515;">&quot;%i,&nbsp;%i,&nbsp;%i,&nbsp;%b&quot;</span>&nbsp;x&nbsp;y&nbsp;z&nbsp;(f&nbsp;x&nbsp;y&nbsp;z))
&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;<span style="color:green;">//&nbsp;return&nbsp;an&nbsp;integer&nbsp;exit&nbsp;code</span></pre> </p> <p> Notice that the last step in the pipeline is to output the values of each <code>x</code>, <code>y</code>, and <code>z</code>, as well as the result of calling <code>f x y z</code>. 
</p> <p> This is a command-line executable that uses FsCheck to produce new test cases by calling the <code>f</code> function. It looks similar to the above one-off script that produced Python code, but this one instead just produces comma-separated values. You can run it from the command line to produce a new sample of test cases: </p> <p> <pre>$ ./foracle 29, -48, -78, false -8, -25, 13, false -74, 34, -68, true ...</pre> </p> <p> As above, I've used an ellipsis to indicate that in reality, 200 lines of comma-separated values scroll by. </p> <p> When you use Bash, you can even pipe the output straight to a file: </p> <p> <pre>$ ./foracle > csv.txt</pre> </p> <p> You can now take the new comma-separated values and update the test values that the above <code>test f</code> test uses. </p> <p> In other words, you use version <em>n</em> of <code>f</code> as a test oracle for version <em>n + 1</em>. When iteration <em>n + 1</em> is a function of iteration <em>n</em>, you have a so-called <em>dynamic system</em>, so I think that we can call this technique <em>dynamic test oracles</em>. </p> <p> The above <code>foracle</code> program is just a proof of concept. You could make it more flexible by making it take command-line arguments that would let you control the sample size and FsCheck's <code>size</code> parameter (the hard-coded <code>100</code> in the above code listing). 
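</p> <p> A minimal sketch of such argument handling might look like the following. The <code>parseArgs</code> function and the argument order (sample size first, then size) are assumptions made for illustration; the original <code>foracle</code> program takes no arguments: </p>

```fsharp
open System

// Hypothetical argument parsing for the foracle utility (an assumption,
// not the article's code). Intended usage: foracle [sampleSize] [size]
// Falls back to the hard-coded values (200 samples, size 100) when an
// argument is missing or is not a positive integer.
// string [] -> int * int
let parseArgs (argv : string []) =
    let tryPositive (s : string) =
        match Int32.TryParse s with
        | true, i when i > 0 -> Some i
        | _ -> None
    let argOr fallback index =
        argv
        |> Array.tryItem index
        |> Option.bind tryPositive
        |> Option.defaultValue fallback
    let sampleSize = argOr 200 0
    let size = argOr 100 1
    sampleSize, size
```

<p> In <code>main</code>, the pipeline could then call <code>Gen.sample size sampleSize</code> instead of <code>Gen.sample 100 200</code>. 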
</p> <h3 id="b7cee6f3ed874b3390276dd3852850a6"> Refactoring <a href="#b7cee6f3ed874b3390276dd3852850a6" title="permalink">#</a> </h3> <p> With the confidence instilled by the test cases, we can now refactor the <code>f</code> function: </p> <p> <pre><span style="color:green;">//&nbsp;f&nbsp;:&nbsp;int&nbsp;-&gt;&nbsp;int&nbsp;-&gt;&nbsp;int&nbsp;-&gt;&nbsp;bool</span>
<span style="color:blue;">let</span>&nbsp;f&nbsp;(x&nbsp;:&nbsp;int)&nbsp;(y&nbsp;:&nbsp;int)&nbsp;(z&nbsp;:&nbsp;int)&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;imp&nbsp;(x,&nbsp;y,&nbsp;z,&nbsp;out)&nbsp;i&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;out&nbsp;=&nbsp;out&nbsp;*&nbsp;x&nbsp;+&nbsp;abs&nbsp;(y&nbsp;*&nbsp;z&nbsp;-&nbsp;i&nbsp;*&nbsp;i)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;y&nbsp;+&nbsp;1I,&nbsp;z,&nbsp;x,&nbsp;out
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;(_,&nbsp;_,&nbsp;_,&nbsp;out)&nbsp;=&nbsp;List.fold&nbsp;imp&nbsp;(bigint&nbsp;x,&nbsp;bigint&nbsp;y,&nbsp;bigint&nbsp;z,&nbsp;0I)&nbsp;[0I..9I]
&nbsp;&nbsp;&nbsp;&nbsp;abs&nbsp;out&nbsp;%&nbsp;100I&nbsp;&lt;&nbsp;10I</pre> </p> <p> Instead of all those mutable variables, the function is, after all, just a left fold. Phew, I feel better now. </p> <h3 id="f849e03594b448299ba4eef5a6a72b4e"> Conclusion <a href="#f849e03594b448299ba4eef5a6a72b4e" title="permalink">#</a> </h3> <p> This article demonstrated a proof of concept where you use a known good version (<em>n</em>) of the code as a test oracle for the next version (<em>n + 1</em>). In interpreted languages, you may be able to load two versions of the code base side by side, but that's rarely practical in a statically typed compiled language like F#. Instead, I used a utility program to generate test cases that can be used as a data source for a parametrised test. 
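</p> <p> For a function as small as <code>f</code>, one can at least illustrate the underlying idea by temporarily keeping both versions in the same file and comparing them directly. The sketch below does that over a small, exhaustive grid of inputs. The names <code>fOld</code>, <code>fNew</code>, and <code>disagreements</code>, as well as the choice of grid, are assumptions made for this illustration: </p>

```fsharp
// Version n: the direct port with mutable variables.
let fOld (x : int) (y : int) (z : int) =
    let mutable mx = bigint x
    let mutable my = bigint y
    let mutable mz = bigint z
    let mutable out = 0I
    for i in [0I..9I] do
        out <- out * mx + abs (my * mz - i * i)
        let x' = mx
        let y' = my
        let z' = mz
        mx <- y' + 1I
        my <- z'
        mz <- x'
    abs out % 100I < 10I

// Version n + 1: the left fold.
let fNew (x : int) (y : int) (z : int) =
    let imp (x : bigint, y : bigint, z : bigint, out : bigint) (i : bigint) =
        let out = out * x + abs (y * z - i * i)
        y + 1I, z, x, out
    let (_, _, _, out) = List.fold imp (bigint x, bigint y, bigint z, 0I) [0I..9I]
    abs out % 100I < 10I

// All inputs in a small grid on which the two versions disagree.
let disagreements =
    [ for x in -10..10 do
        for y in -10..10 do
            for z in -10..10 do
                if fOld x y z <> fNew x y z then yield (x, y, z) ]

printfn "%i disagreements" (List.length disagreements)
```

<p> This only works as long as both versions fit in one compilation unit, which is exactly what tends to be impractical for a real code base; hence the detour via serialised test cases. 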
</p> <p> The example rho problem takes only integers as input, and returns a simple Boolean value, so in this case it's trivial to encode each test case as a line of comma-separated values. For (real) problems that may involve more complex types, it'd be better to use another serialisation format, such as JSON or XML. </p> <p> An outstanding issue is whether it's possible to implement shrinking behaviour when tests fail. Currently, the proof of concept just uses a set of serialised test cases. Normally, when a <a href="/property-based-testing-intro">property-based testing</a> framework like FsCheck detects a counter-example, it'll shrink the counter-example to values that are easier to understand than the original. This proof of concept doesn't do that. I'm not sure if a framework like FsCheck currently contains enough extensibility points to enable that sort of behaviour. I'll leave that question open for any reader interested in taking on that problem. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="bb31c6eea41f4adcacf249d39a3798d2"> <div class="comment-author"><a href="https://github.com/dharmaturtle">Alex Nguyen</a></div> <div class="comment-content"> <p> Hi Mark! Thanks for another thought provoking post. </p> <p> I believe you and Hillel are writing <a href="https://en.wikipedia.org/wiki/Characterization_test">characterization tests</a>, which <a href="https://blog.ploeh.dk/2013/04/02/why-trust-tests/">you've mentioned in the past</a>. Namely, you're both using the behavior of existing code to verify the correctness of a refactor. The novel part to me is that Hillel is using code as the test oracle. Your solution serializes the oracle to a static file. The library I use for characterization tests (<a href="https://www.nuget.org/packages/ApprovalTests">ApprovalTests</a>) does this as well. </p> <p> I believe shrinking is impossible when the oracle is a static file. 
However with Hillel's solution the oracle may be consulted at any time, making shrinking viable. If only there was a practical way to combine the two... </p> </div> <div class="comment-date">2021-01-06 23:01 UTC</div> </div> <div class="comment" id="4aa4188124ca4f4fadf35d6881b9452e"> <div class="comment-author"><a href="https://about.me/tysonwilliams">Tyson Williams</a></div> <div class="comment-content"> <p> A thought provoking post indeed! </p> <blockquote> In F# I'm going to use <a href="https://fscheck.github.io/FsCheck">FsCheck</a>... </blockquote> <p> I think that is a fine choice given the use case laid out in this post. In general though, I think <a href="https://github.com/hedgehogqa/fsharp-hedgehog">Hedgehog</a> is a better property-based testing library. Its killer feature is integrated shrinking, which means that all generators can also shrink and this extra power is essentially free. </p> <p> For the record (because this can be a point of confusion), Haskell has <a href="https://hackage.haskell.org/package/QuickCheck">QuickCheck</a> and <a href="https://hackage.haskell.org/package/hedgehog">(Haskell) Hedgehog</a> while F# has ports from Haskell called <a href="https://fscheck.github.io/FsCheck">FsCheck</a> and <a href="https://github.com/hedgehogqa/fsharp-hedgehog">(FSharp) Hedgehog</a>. </p> <p> <a href="https://twitter.com/jacobstanley">Jacob Stanley</a> gave <a href="https://www.youtube.com/watch?v=AIv_9T0xKEo">this excellent talk at YOW! Lambda Jam 2017</a> that explains the key idea that allows Hedgehog to have integrated shrinking. (Spoiler: A generic type that is invariant in its only type parameter is replaced by a different generic type that is monadic in its only type parameter. API design guided by functional programming for the win!) </p> <blockquote> Normally, when a property-based testing framework like FsCheck detects a counter-example, it'll shrink the counter-example to values that are easier to understand than the original. 
</blockquote> <p> In my experience, property-based tests written with QuickCheck / FsCheck do not normally shrink. I think this is because of the extra work required to enable shrinking. For an anecdotal example, consider <a href="https://frasertweedale.github.io/blog-fp/posts/2020-03-31-quickcheck-hedgehog.html">this post by Fraser Tweedale</a>. He believed that it would be faster to add (Haskell) Hedgehog as a dependency and create a generator for it than to add shrinking to his existing generator in QuickCheck. </p> <blockquote> In other words, you use version <em>n</em> of <code>f</code> as a test oracle for version <em>n + 1</em>. When iteration <em>n + 1</em> is a function of iteration <em>n</em>, you have a so-called <em>dynamic system</em>, so I think that we can call this technique <em>dynamic test oracles</em>. </blockquote> <p> I am confused by this paragraph. I interpret your word "When" at the start of the second sentence as a common-language synonym for the mathematical word "If". Here is roughly how I understand that paragraph, where <code>A</code> stands for "version / iteration <em>n</em> of <code>f</code>" and <code>B</code> stands for "version / iteration <em>n + 1</em> of <code>f</code>". "<code>A</code> depends on <code>B</code>. If <code>B</code> depends on <code>A</code>, then we have a dynamic system. Therefore, we have a dynamic system." I feel like the paragraph assumes (because it is obvious?) that version / iteration <em>n + 1</em> of <code>f</code> depends on version / iteration <em>n</em> of <code>f</code>. In what sense is that the case? </p> <blockquote> An outstanding issue is whether it's possible to implement shrinking behaviour when tests fail. [...] I'll leave that question open for any reader interested in taking on that problem. </blockquote> <p> I am interested! </p> <p> Quoting Mark and then Alex. </p> <blockquote> <p> Hillel Wayne [...] 
outlines an approach where you test an iteration of rho code against a 'last known good' snapshot. He uses <code>git worktree</code> to set up a snapshot of the reference implementation. He then writes a property that compares the refactored code's behaviour against the reference. </p> <p> The example code is in <a href="https://www.python.org">Python</a>, which is a language that I don't know. As far as I can tell, it works because Python is 'lightweight' enough that you can load and execute source code directly. I found that the approach makes much sense, but I wondered how it would apply for statically typed, compiled languages. I decided to create a proof of concept in <a href="https://fsharp.org">F#</a>. </p> </blockquote> <blockquote> I believe shrinking is impossible when the oracle is a static file. However with Hillel's solution the oracle may be consulted at any time, making shrinking viable. </blockquote> <p> I want to start by elaborating on this to make sure we are all on the same page. I think of shrinking as involving two parts. On the one hand, we have the "shrink tree", which contains the values to test during the shrinking process. On the other hand, for each input tested, we need to know if the output should cause the test to pass or fail. </p> <p> With Hedgehog, getting a shrink tree would not be too difficult. For a generator with type parameter <code>'a</code>, the current generator API returns a "random" shrink tree of type <code>'a</code> in which the root is an instance <code>a</code> of the type <code>'a</code> and the tree completely depends on <code>a</code>. It should be easy to expose an additional function that accepts inputs of type <code>'a Gen</code> and <code>'a</code> and returns <em>the</em> tree with the given <code>'a</code> as its root. </p> <p> The difficult part is being able to query the test oracle. As Mark said, this seems easy to do in a dynamically-typed language like Python. 
In contrast, the fundamental issue with a statically-typed language like F# is that the compiled code exists in an assembly and only one assembly of a given name can be loaded in a given process at the same time. </p> <p> This leads me to two ideas for workarounds. First, we could query the test oracle in a different process. I imagine an entry point could be generated that gives direct access to the test oracle. Then the test process could query the test oracle by executing this generated process. Second, we could generate a different assembly that exposes the test oracle. Then the test process could load this generated assembly to query the test oracle. The second approach seems like it would have a faster query time but be harder to implement. The first approach seems easier to implement but would probably have a slower query time. Maybe the query time would be fast enough though, especially if it was only queried when shrinking. </p> <p> But given such a solution, who wants to restrict access to the test oracle only to shrinking? If the test oracle is always available, then there is no need to store input-output pairs. Instead of always checking that the system under test works correctly for a previously selected set of inputs, the property-based test can check that the system under test has the expected behavior for a unique set of inputs each time the property-based test is executed. In my experience, this is the default behavior of a property-based test. </p> <p> One concern that some people might have is the idea of checking into the code repository the binary containing the test oracle. My first thought is that <a href="https://blog.ploeh.dk/2014/01/29/nuget-package-restore-considered-harmful/">the size of this is likely so small that it does not matter</a>. My second thought is that the binary containing the test oracle does not have to be included in the repository. 
Instead, the workflow could be to (1) create the property-based test that uses the compiled test oracle, (2) refactor the system under test, (3) observe that the property-based test still passes, (4) commit the refactored code, and (5) discard the remaining changes, which will delete the property-based test and the compiled test oracle. </p> <p> Instead of completely removing that property-based test, it might be better to leave it there with input-output pairs stored in a file. Then the conversion from that state of the property-based test to the one that uses the compiled test oracle will be much smaller. </p> </div> <div class="comment-date">2021-01-07 19:27 UTC</div> </div> <div class="comment" id="39b559f49ecc426db7b3d9d1518556df"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Alex, thank you for writing. Yes, I think that calling this a Characterisation Test is correct. I wasn't aware of the <em>ApprovalTests</em> library; thank you for mentioning it. </p> <p> When I originally wrote the article, I was under the impression that shrinking might still be possible. I admit, though, that I hadn't thought things through. I think that <a href="#4aa4188124ca4f4fadf35d6881b9452e">Tyson Williams argues convincingly</a> that this isn't possible. </p> </div> <div class="comment-date">2021-01-15 13:42 UTC</div> </div> <div class="comment" id="a6ea5d7cbb75431ba6adf969599c2bd3"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tyson, thank you for writing. I'm <a href="/2017/09/18/the-test-data-generator-functor#5bd990290ff048c2a7b55b740053831d">well aware of Hedgehog</a>, and I'm keen on the way it works. I rarely use it, however, as it so far doesn't quite seem to have the same degree of 'industrial strength' to it that FsCheck has. Additionally, I find that shrinking is less important in practice than it might seem in theory. 
</p> <p> I'm not sure that I understand your confusion about the term <em>dynamic</em>. You write: <blockquote> <p> "<code>A</code> depends on <code>B</code>." </p> </blockquote> Why do you write that? I don't think, in the way you've labelled iterations, that <code>A</code> depends on <code>B</code>. </p> <p> When it comes to shrinking, I think that you convincingly argue that it can't be done unless one is able to query the oracle. As long as all you have is a list of test cases, you can't do that... unless, perhaps, you were to also generate and run all the shrunk test cases when you capture the list of test cases... Again, I haven't thought this through, so there may be some obvious gotcha that I'm missing. </p> <p> I would be wary of trying to host the previous iteration in a different process. This is technically possible, but, in .NET at least, quite cumbersome. You'll have to deal with data marshalling and lifetime management of the second process. It was difficult enough in .NET framework back when <em>remoting</em> was a thing; I'm not even sure how one would go about such a problem in .NET Core - particularly if you want it to work on Windows, Linux, and Mac. HTTP? </p> </div> <div class="comment-date">2021-01-16 13:24 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Mark Seemann https://blog.ploeh.dk/2021/01/04/dynamic-test-oracles-for-rho-problems An F# demo of validation with partial data round trip https://blog.ploeh.dk/2020/12/28/an-f-demo-of-validation-with-partial-data-round-trip/ Mon, 28 Dec 2020 09:22:00 UTC <div id="post"> <p> <em>An F# port of the previous Haskell proof of concept.</em> </p> <p> This article is part of <a href="/2020/12/14/validation-a-solved-problem">a short article series</a> on <a href="/2018/11/05/applicative-validation">applicative validation</a> with a twist. 
The twist is that validation, when it fails, should return not only a list of error messages; it should also retain that part of the input that <em>was</em> valid. </p> <p> In the <a href="/2020/12/21/a-haskell-proof-of-concept-of-validation-with-partial-data-round-trip">previous article</a> you saw a <a href="https://www.haskell.org">Haskell</a> proof of concept that demonstrated how to compose the appropriate <a href="/2018/10/01/applicative-functors">applicative functor</a> with a suitable <a href="/2017/11/27/semigroups">semigroup</a> to make validation work as desired. In this article, you'll see how to port that proof of concept to <a href="https://fsharp.org">F#</a>. </p> <h3 id="b2ea11fdb65343b5b60fcf2cffc62b1a"> Data definitions <a href="#b2ea11fdb65343b5b60fcf2cffc62b1a" title="permalink">#</a> </h3> <p> Like in the previous article, we're going to need some types. These are essentially direct translations of the corresponding Haskell types: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;Input&nbsp;=&nbsp;{&nbsp;Name&nbsp;:&nbsp;string&nbsp;option;&nbsp;DoB&nbsp;:&nbsp;DateTime&nbsp;option;&nbsp;Address&nbsp;:&nbsp;string&nbsp;option}
<span style="color:blue;">type</span>&nbsp;ValidInput&nbsp;=&nbsp;{&nbsp;Name&nbsp;:&nbsp;string;&nbsp;DoB&nbsp;:&nbsp;DateTime;&nbsp;Address&nbsp;:&nbsp;string&nbsp;}</pre> </p> <p> The <code>Input</code> type plays the role of the input we'd like to validate, while <code>ValidInput</code> presents validated data. </p> <p> If you're an F# fan, you can bask in the reality that F# records are terser than Haskell records. I like both languages, so I have mixed feelings about this. </p> <h3 id="2e1b689c57114b259b3b4019f4c3976d"> Computation expression <a href="#2e1b689c57114b259b3b4019f4c3976d" title="permalink">#</a> </h3> <p> Haskell's main workhorse is its type class system. 
F# doesn't have that, but it has <a href="https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/computation-expressions">computation expressions</a>, which in F# 5 got support for applicative functors. That's just what we need, and it turns out that there isn't a lot of code we have to write to make all of this work. </p> <p> To recap from the Haskell proof of concept: We need a <code>Result</code>-like <a href="https://bartoszmilewski.com/2014/01/14/functors-are-containers">container</a> that returns a tuple for errors. One element of the tuple should be an <a href="https://en.wikipedia.org/wiki/Endomorphism">endomorphism</a>, which <a href="/2017/11/13/endomorphism-monoid">forms a monoid</a> (and therefore also a semigroup). The other element should be a list of error messages - <a href="/2017/10/10/strings-lists-and-sequences-as-a-monoid">another monoid</a>. In F# terms we'll write it as <code>(('b -&gt; 'b) * 'c list)</code>. </p> <p> That's a tuple, and since <a href="/2017/10/30/tuple-monoids">tuples form monoids when their elements do</a>, the <code>Error</code> part of <code>Result</code> <a href="/2017/11/20/monoids-accumulate">supports accumulation</a>. </p> <p> To support an applicative computation expression, we're going to need a way to merge two results together. 
This is by far the most complicated piece of code in this article, all six lines of code: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Result&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;Result&lt;&#39;a&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,((&#39;b&nbsp;-&gt;&nbsp;&#39;b)&nbsp;*&nbsp;&#39;c&nbsp;list)&gt;&nbsp;-&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;Result&lt;&#39;d&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,((&#39;b&nbsp;-&gt;&nbsp;&#39;b)&nbsp;*&nbsp;&#39;c&nbsp;list)&gt;&nbsp;-&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;Result&lt;(&#39;a&nbsp;*&nbsp;&#39;d),((&#39;b&nbsp;-&gt;&nbsp;&#39;b)&nbsp;*&nbsp;&#39;c&nbsp;list)&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;merge&nbsp;x&nbsp;y&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;x,&nbsp;y&nbsp;<span style="color:blue;">with</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Ok&nbsp;xres,&nbsp;Ok&nbsp;yres&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Ok&nbsp;(xres,&nbsp;yres)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Error&nbsp;(f,&nbsp;e1s),&nbsp;Error&nbsp;(g,&nbsp;e2s)&nbsp;&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Error&nbsp;(f&nbsp;&gt;&gt;&nbsp;g,&nbsp;e2s&nbsp;@&nbsp;e1s)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Error&nbsp;e,&nbsp;Ok&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Error&nbsp;e
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Ok&nbsp;_,&nbsp;Error&nbsp;e&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Error&nbsp;e</pre> </p> <p> The <code>merge</code> function composes two input results together. The results have <code>Ok</code> types called <code>'a</code> and <code>'d</code>, and if they're both <code>Ok</code> values, the return value is an <code>Ok</code> tuple of <code>'a</code> and <code>'d</code>. 
</p> <p> If one of the results is an <code>Error</code> value, it beats an <code>Ok</code> value. The only moderately complex operation is when both are <code>Error</code> values. </p> <p> Keep in mind that an <code>Error</code> value in this instance contains a tuple of the type <code>(('b -&gt; 'b) * 'c list)</code>. The first element is an endomorphism <code>'b -&gt; 'b</code> and the other element is a list. The <code>merge</code> function composes the endomorphisms <code>f</code> and <code>g</code> by standard function composition (the <code>&gt;&gt;</code> operator), and concatenates the lists with the standard <code>@</code> list concatenation operator. </p> <p> Because I'm emulating how the <a href="https://forums.fsharp.org/t/thoughts-on-input-validation-pattern-from-a-noob/1541">original forum post</a>'s code behaves, I'm concatenating the two lists with the rightmost going before the leftmost. It doesn't make any other difference than determining the order of the error list. </p> <p> With the <code>merge</code> function in place, the computation expression is a simple matter: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;ValidationBuilder&nbsp;()&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">member</span>&nbsp;_.BindReturn&nbsp;(x,&nbsp;f)&nbsp;=&nbsp;Result.map&nbsp;f&nbsp;x
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">member</span>&nbsp;_.MergeSources&nbsp;(x,&nbsp;y)&nbsp;=&nbsp;Result.merge&nbsp;x&nbsp;y</pre> </p> <p> The last piece is a <code>ValidationBuilder</code> value: </p> <p> <pre>[&lt;AutoOpen&gt;]
<span style="color:blue;">module</span>&nbsp;ComputationExpressions&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;validation&nbsp;=&nbsp;ValidationBuilder&nbsp;()</pre> </p> <p> Now, whenever you use the <code>validation</code> computation expression, you get the desired functionality. 
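</p> <p> To see what <code>merge</code> does when both results are errors, consider this small sketch. It repeats the <code>merge</code> logic as a top-level function so that the example stands alone; the two sample endomorphisms and error messages are made up for illustration: </p>

```fsharp
// Same logic as Result.merge above, repeated here (as a top-level function)
// so that the example is self-contained.
let merge x y =
    match x, y with
    | Ok xres, Ok yres -> Ok (xres, yres)
    | Error (f, e1s), Error (g, e2s) -> Error (f >> g, e2s @ e1s)
    | Error e, Ok _ -> Error e
    | Ok _, Error e -> Error e

// Two failed results; the endomorphisms and messages are hypothetical.
let x : Result<int, (int -> int) * string list> = Error ((fun n -> n + 1), ["first"])
let y : Result<int, (int -> int) * string list> = Error ((fun n -> n * 2), ["second"])

match merge x y with
| Error (h, errors) ->
    // f >> g applies f first and then g, so h 10 is (10 + 1) * 2 = 22.
    // The error list is e2s @ e1s, that is ["second"; "first"].
    printfn "%i %A" (h 10) errors
| Ok _ -> printfn "unexpected"
```

<p> Both endomorphisms survive as a single composed function, and both messages survive in the list, with the rightmost result's messages first. 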
</p> <h3 id="948b93fa6a1d47beac150180769731eb"> Validators <a href="#948b93fa6a1d47beac150180769731eb" title="permalink">#</a> </h3> <p> Before we can compose some validation functions, we'll need to have some validators in place. These are straightforward translations of the Haskell validation functions, starting with the name validator: </p> <p> <pre><span style="color:green;">//&nbsp;Input&nbsp;-&gt;&nbsp;Result&lt;string,((Input&nbsp;-&gt;&nbsp;Input)&nbsp;*&nbsp;string&nbsp;list)&gt;</span>
<span style="color:blue;">let</span>&nbsp;validateName&nbsp;({&nbsp;Name&nbsp;=&nbsp;name&nbsp;}&nbsp;:&nbsp;Input)&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;name&nbsp;<span style="color:blue;">with</span>
&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;n&nbsp;<span style="color:blue;">when</span>&nbsp;n.Length&nbsp;&gt;&nbsp;3&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Ok&nbsp;n
&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Error&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">fun</span>&nbsp;(args&nbsp;:&nbsp;Input)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;{&nbsp;args&nbsp;<span style="color:blue;">with</span>&nbsp;Name&nbsp;=&nbsp;None&nbsp;}),
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#a31515;">&quot;no&nbsp;bob&nbsp;and&nbsp;toms&nbsp;allowed&quot;</span>])
&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;None&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Error&nbsp;(id,&nbsp;[<span style="color:#a31515;">&quot;name&nbsp;is&nbsp;required&quot;</span>])</pre> </p> <p> When the name is too short, the endomorphism resets the <code>Name</code> field to <code>None</code>. 
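</p> <p> As a small, hedged illustration of that behaviour, the following sketch applies the returned endomorphism to the original input. It repeats <code>Input</code> and <code>validateName</code> so that it can run on its own; the sample input values are made up: </p>

```fsharp
open System

type Input = { Name : string option; DoB : DateTime option; Address : string option }

// validateName as defined above, repeated to keep the example self-contained.
let validateName ({ Name = name } : Input) =
    match name with
    | Some n when n.Length > 3 -> Ok n
    | Some _ ->
        Error (
            (fun (args : Input) -> { args with Name = None }),
            ["no bob and toms allowed"])
    | None -> Error (id, ["name is required"])

// Hypothetical sample input: the name is too short, but the address is present.
let input = { Name = Some "Bob"; DoB = None; Address = Some "1 Main St" }

match validateName input with
| Error (reset, errors) ->
    let retained = reset input
    // retained.Name is None, while retained.Address is still Some "1 Main St".
    printfn "%A %A" retained errors
| Ok name -> printfn "valid name: %s" name
```

<p> The invalid field is cleared, while the rest of the input is left untouched. That is the 'partial data round trip' part of the exercise. 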
</p> <p> The date-of-birth validation function works the same way: </p> <p> <pre><span style="color:green;">//&nbsp;DateTime&nbsp;-&gt;&nbsp;Input&nbsp;-&gt;&nbsp;Result&lt;DateTime,((Input&nbsp;-&gt;&nbsp;Input)&nbsp;*&nbsp;string&nbsp;list)&gt;</span> <span style="color:blue;">let</span>&nbsp;validateDoB&nbsp;(now&nbsp;:&nbsp;DateTime)&nbsp;({&nbsp;DoB&nbsp;=&nbsp;dob&nbsp;}&nbsp;:&nbsp;Input)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;dob&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;d&nbsp;<span style="color:blue;">when</span>&nbsp;d&nbsp;&gt;&nbsp;now.AddYears&nbsp;-12&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Ok&nbsp;d &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Error&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">fun</span>&nbsp;(args&nbsp;:&nbsp;Input)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;{&nbsp;args&nbsp;<span style="color:blue;">with</span>&nbsp;DoB&nbsp;=&nbsp;None&nbsp;}), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#a31515;">&quot;get&nbsp;off&nbsp;my&nbsp;lawn&quot;</span>]) &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;None&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Error&nbsp;(id,&nbsp;[<span style="color:#a31515;">&quot;dob&nbsp;is&nbsp;required&quot;</span>])</pre> </p> <p> Again, like in the Haskell proof of concept, instead of calling <code>DateTime.Now</code> from within the function, I'm passing <code>now</code> as an argument to keep the function <a href="https://en.wikipedia.org/wiki/Pure_function">pure</a>. 
</p> <p> The address validation concludes the set of validators: </p> <p> <pre><span style="color:green;">//&nbsp;Input&nbsp;-&gt;&nbsp;Result&lt;string,((&#39;a&nbsp;-&gt;&nbsp;&#39;a)&nbsp;*&nbsp;string&nbsp;list)&gt;</span> <span style="color:blue;">let</span>&nbsp;validateAddress&nbsp;({&nbsp;Address&nbsp;=&nbsp;address&nbsp;}:&nbsp;Input)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;address&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Ok&nbsp;a &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;None&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Error&nbsp;(id,&nbsp;[<span style="color:#a31515;">&quot;add1&nbsp;is&nbsp;required&quot;</span>])</pre> </p> <p> The inferred endomorphism type here is the more general <code>'a -&gt; 'a</code>, but it's compatible with <code>Input -&gt; Input</code>. </p> <h3 id="836690a83da24da492f70c8eb756a52d"> Composition <a href="#836690a83da24da492f70c8eb756a52d" title="permalink">#</a> </h3> <p> All three functions have compatible <code>Error</code> types, so they ought to compose with the applicative computation expression to produce the desired behaviour: </p> <p> <pre><span style="color:green;">//&nbsp;DateTime&nbsp;-&gt;&nbsp;Input&nbsp;-&gt;&nbsp;Result&lt;ValidInput,(Input&nbsp;*&nbsp;string&nbsp;list)&gt;</span> <span style="color:blue;">let</span>&nbsp;validateInput&nbsp;now&nbsp;args&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;validation&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;name&nbsp;=&nbsp;validateName&nbsp;args &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">and!</span>&nbsp;dob&nbsp;=&nbsp;validateDoB&nbsp;now&nbsp;args &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">and!</span>&nbsp;address&nbsp;=&nbsp;validateAddress&nbsp;args &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">return</span>&nbsp;{&nbsp;Name&nbsp;=&nbsp;name;&nbsp;DoB&nbsp;=&nbsp;dob;&nbsp;Address&nbsp;=&nbsp;address&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Result.mapError&nbsp;(<span style="color:blue;">fun</span>&nbsp;(f,&nbsp;msgs)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;f&nbsp;args,&nbsp;msgs)</pre> </p> <p> The <code>validation</code> expression alone produces a <code>Result&lt;ValidInput,((Input -&gt; Input) * string list)&gt;</code> value. To get an <code>Input</code> value in the <code>Error</code> tuple, we need to 'run' the <code>Input -&gt; Input</code> endomorphism. The <code>validateInput</code> function does that by applying the endomorphism <code>f</code> to <code>args</code> when mapping the error with <code>Result.mapError</code>. </p> <h3 id="df2c48e822ad48ce916e29da8638563a"> Tests <a href="#df2c48e822ad48ce916e29da8638563a" title="permalink">#</a> </h3> <p> To test that the <code>validateInput</code> function works as intended, I first copied all the code from the original forum post. I then wrote eight <a href="https://en.wikipedia.org/wiki/Characterization_test">characterisation tests</a> against that code to make sure that I could reproduce the desired functionality. 
</p> <p> I then wrote a parametrised test against the new function: </p> <p> <pre>[&lt;Theory;&nbsp;ClassData(typeof&lt;ValidationTestCases&gt;)&gt;] <span style="color:blue;">let</span>&nbsp;``Validation&nbsp;works``&nbsp;input&nbsp;expected&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;now&nbsp;=&nbsp;DateTime.Now &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;validateInput&nbsp;now&nbsp;input &nbsp;&nbsp;&nbsp;&nbsp;expected&nbsp;=!&nbsp;actual</pre> </p> <p> The <code>ValidationTestCases</code> class is defined like this: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;ValidationTestCases&nbsp;()&nbsp;<span style="color:blue;">as</span>&nbsp;this&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">inherit</span>&nbsp;TheoryData&lt;Input,&nbsp;Result&lt;ValidInput,&nbsp;Input&nbsp;*&nbsp;string&nbsp;list&gt;&gt;&nbsp;()</pre> </p> <p> This class produces a set of test cases, where each test case contains an <code>input</code> value and the <code>expected</code> output. To define the test cases, I copied the eight characterisation tests I'd already produced and adjusted them so that they fit the simpler API of the <code>validateInput</code> function. 
Here's a few examples: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;eightYearsAgo&nbsp;=&nbsp;DateTime.Now.AddYears&nbsp;-8 <span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;Name&nbsp;=&nbsp;Some&nbsp;<span style="color:#a31515;">&quot;Alice&quot;</span>;&nbsp;DoB&nbsp;=&nbsp;Some&nbsp;eightYearsAgo;&nbsp;Address&nbsp;=&nbsp;None&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;Error&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;Name&nbsp;=&nbsp;Some&nbsp;<span style="color:#a31515;">&quot;Alice&quot;</span>;&nbsp;DoB&nbsp;=&nbsp;Some&nbsp;eightYearsAgo;&nbsp;Address&nbsp;=&nbsp;None&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#a31515;">&quot;add1&nbsp;is&nbsp;required&quot;</span>])) <span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;Name&nbsp;=&nbsp;Some&nbsp;<span style="color:#a31515;">&quot;Alice&quot;</span>;&nbsp;DoB&nbsp;=&nbsp;Some&nbsp;eightYearsAgo;&nbsp;Address&nbsp;=&nbsp;Some&nbsp;<span style="color:#a31515;">&quot;x&quot;</span>&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;Ok&nbsp;({&nbsp;Name&nbsp;=&nbsp;<span style="color:#a31515;">&quot;Alice&quot;</span>;&nbsp;DoB&nbsp;=&nbsp;eightYearsAgo;&nbsp;Address&nbsp;=&nbsp;<span style="color:#a31515;">&quot;x&quot;</span>&nbsp;}))</pre> </p> <p> The first case expects an <code>Error</code> value because the <code>Input</code> value has no address. The other test case expects an <code>Ok</code> value because all input is fine. </p> <p> I copied all eight characterisation tests over, so now I have those eight tests, as well as the modified eight tests for the applicative-based API shown here. All sixteen tests pass. </p> <h3 id="5a20682ce9494011b78d5a5f9c2c9bad"> Conclusion <a href="#5a20682ce9494011b78d5a5f9c2c9bad" title="permalink">#</a> </h3> <p> I find this solution to the problem elegant. 
It's always satisfying when you can implement what at first glance looks like custom behaviour using <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstractions</a>. </p> <p> Besides the aesthetic value, I also believe that this keeps a team more productive. These concepts of monoids, semigroups, applicative functors, and so on, are concepts that you only have to learn once. Once you know them, you'll recognise them when you run into them. This means that there's less code to understand. </p> <p> An ad-hoc implementation as the original forum post suggested (even though it looked quite decent) always puts the onus on a maintenance developer to read and understand even more one-off infrastructure code. </p> <p> With an architecture based on universal abstractions and well-documented language features, a functional programmer that knows these things will be able to pick up what's going on without much trouble. Specifically, (s)he will recognise that this is 'just' applicative validation with a twist. </p> <p> This article is the December 28 entry in the <a href="https://sergeytihon.com/2020/10/22/f-advent-calendar-in-english-2020">F# Advent Calendar in English 2020</a>. </p> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Mark Seemann https://blog.ploeh.dk/2020/12/28/an-f-demo-of-validation-with-partial-data-round-trip A Haskell proof of concept of validation with partial data round trip https://blog.ploeh.dk/2020/12/21/a-haskell-proof-of-concept-of-validation-with-partial-data-round-trip/ Mon, 21 Dec 2020 06:54:00 UTC <div id="post"> <p> <em>Which Semigroup best addresses the twist in the previous article?</em> </p> <p> This article is part of <a href="/2020/12/14/validation-a-solved-problem">a short article series</a> on applicative validation with a twist. 
The twist is that validation, when it fails, should return not only a list of error messages; it should also retain that part of the input that <em>was</em> valid. </p> <p> In this article, I'll show how I did a quick proof of concept in <a href="https://www.haskell.org">Haskell</a>. </p> <h3 id="9417153ff45d4188a65470fa2d67ea2e"> Data definitions <a href="#9417153ff45d4188a65470fa2d67ea2e" title="permalink">#</a> </h3> <p> You can't use the regular <code>Either</code> instance of <code>Applicative</code> for validation because it short-circuits on the first error. In other words, you can't collect multiple error messages, even if the input has multiple issues. Instead, you need a custom <code>Applicative</code> instance. You can <a href="/2018/11/05/applicative-validation">easily write such an instance</a> yourself, but there are a couple of libraries that already do this. For this prototype, I chose the <a href="https://hackage.haskell.org/package/validation">validation</a> package. </p> <p> <pre><span style="color:blue;">import</span>&nbsp;Data.Bifunctor <span style="color:blue;">import</span>&nbsp;Data.Time <span style="color:blue;">import</span>&nbsp;Data.Semigroup <span style="color:blue;">import</span>&nbsp;Data.Validation </pre> </p> <p> Apart from importing <code>Data.Validation</code>, I also need a few other imports for the proof of concept. All of them are well-known. I used no language extensions. 
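</p> <p> To see the short-circuiting concretely, here's a minimal sketch that needs nothing beyond the <code>Prelude</code>. The two checks, <code>checkName</code> and <code>checkAddress</code>, are hypothetical stand-ins that I've made up for illustration; they aren't part of the proof of concept. Both inputs are invalid, yet only the first error message survives: </p> <p> <pre>-- Hypothetical validators that return Either instead of Validation.
checkName :: String -&gt; Either [String] String
checkName n
  | length n &gt; 3 = Right n
  | otherwise = Left ["no bob and toms allowed"]

checkAddress :: Maybe String -&gt; Either [String] String
checkAddress (Just a) = Right a
checkAddress Nothing = Left ["add1 is required"]

main :: IO ()
main =
  -- Both checks fail, but Either's Applicative instance stops at the first Left.
  print ((,) &lt;$&gt; checkName "Bob" &lt;*&gt; checkAddress Nothing)</pre> </p> <p> Running this prints <code>Left ["no bob and toms allowed"]</code>. The address error is silently dropped, which is why an accumulating <code>Applicative</code> instance is needed instead. 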
</p> <p> For the proof of concept, the input is a triple of a name, a date of birth, and an address: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;Input&nbsp;=&nbsp;Input&nbsp;{ &nbsp;&nbsp;<span style="color:#2b91af;">inputName</span>&nbsp;::&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;<span style="color:#2b91af;">String</span>, &nbsp;&nbsp;<span style="color:#2b91af;">inputDoB</span>&nbsp;::&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;<span style="color:blue;">Day</span>, &nbsp;&nbsp;<span style="color:#2b91af;">inputAddress</span>&nbsp;::&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;<span style="color:#2b91af;">String</span>&nbsp;} &nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>) </pre> </p> <p> The goal is actually to <a href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate">parse (not validate)</a> <code>Input</code> into a safer data type: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;ValidInput&nbsp;=&nbsp;ValidInput&nbsp;{ &nbsp;&nbsp;<span style="color:#2b91af;">validName</span>&nbsp;::&nbsp;<span style="color:#2b91af;">String</span>, &nbsp;&nbsp;<span style="color:#2b91af;">validDoB</span>&nbsp;::&nbsp;<span style="color:blue;">Day</span>, &nbsp;&nbsp;<span style="color:#2b91af;">validAddress</span>&nbsp;::&nbsp;<span style="color:#2b91af;">String</span>&nbsp;} &nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>) </pre> </p> <p> If parsing/validation fails, the output should report a collection of error messages <em>and</em> return the <code>Input</code> value with any valid data retained. 
</p> <h3 id="4ce26b2031084212a5fbab83dd848d4b"> Looking for a Semigroup <a href="#4ce26b2031084212a5fbab83dd848d4b" title="permalink">#</a> </h3> <p> My hypothesis was that validation, even with that twist, can be implemented elegantly with an <code>Applicative</code> instance. The <em>validation</em> package defines its <code>Validation</code> data type such that it's an <code>Applicative</code> instance as long as its error type is a <code>Semigroup</code> instance: </p> <p> <pre><span style="color:blue;">Semigroup</span> err =&gt; <span style="color:blue;">Applicative</span> (<span style="color:blue;">Validation</span> err)</pre> </p> <p> The question is: which <code>Semigroup</code> can we use? </p> <p> Since we need to return <em>both</em> a list of error messages <em>and</em> a modified <code>Input</code> value, it sounds like we'll need a product type of some sort. A tuple will do; something like <code>(Input, [String])</code>. Is that a <code>Semigroup</code> instance, though? </p> <p> Tuples only form semigroups if both elements give rise to a semigroup: </p> <p> <pre>(<span style="color:blue;">Semigroup</span> a, <span style="color:blue;">Semigroup</span> b) =&gt; <span style="color:blue;">Semigroup</span> (a, b)</pre> </p> <p> The second element of my candidate is <code>[String]</code>, which is fine. Lists are <code>Semigroup</code> instances. But what about <code>Input</code>? Can we somehow combine two <code>Input</code> values into one? It's not entirely clear how we should do that, so that doesn't seem too promising. </p> <p> What we need to do, however, is to take the original <code>Input</code> and modify it by (optionally) resetting one or more fields. In other words, a series of functions of the type <code>Input -&gt; Input</code>. Aha! There's the semigroup we need: <a href="https://hackage.haskell.org/package/base/docs/Data-Semigroup.html#t:Endo"><code>Endo Input</code></a>. 
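</p> <p> As a quick illustration of how such values combine, here's a small sketch that only depends on <em>base</em>. It assumes a simplified <code>Input</code> where the date of birth is a <code>String</code> rather than a <code>Day</code>, and the two error values <code>clearName</code> and <code>clearDoB</code> are hypothetical names made up for the example: </p> <p> <pre>import Data.Semigroup (Endo (..))

-- Simplified stand-in for the article's Input type; the date is a String
-- here (an assumption) so that the sketch needs nothing beyond base.
data Input = Input
  { inputName :: Maybe String
  , inputDoB :: Maybe String
  , inputAddress :: Maybe String
  } deriving (Eq, Show)

-- Two hypothetical error values: an endomorphism paired with messages.
clearName :: (Endo Input, [String])
clearName = (Endo $ \x -&gt; x { inputName = Nothing }, ["no bob and toms allowed"])

clearDoB :: (Endo Input, [String])
clearDoB = (Endo $ \x -&gt; x { inputDoB = Nothing }, ["get off my lawn"])

main :: IO ()
main = do
  -- The tuple instance composes the endomorphisms and appends the lists.
  let (reset, msgs) = clearName &lt;&gt; clearDoB
  print (appEndo reset (Input (Just "Bob") (Just "1980-01-01") (Just "x")))
  print msgs</pre> </p> <p> This prints <code>Input {inputName = Nothing, inputDoB = Nothing, inputAddress = Just "x"}</code> followed by <code>["no bob and toms allowed","get off my lawn"]</code>: both resets are applied, both messages are kept, and the valid address is retained. 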
</p> <p> So the <code>Semigroup</code> instance we need is <code>(<span style="color:blue;">Endo Input</span>, [<span style="color:#2b91af;">String</span>])</code>, and the validation output should be of the type <code><span style="color:blue;">Validation</span> (<span style="color:blue;">Endo Input</span>, [<span style="color:#2b91af;">String</span>]) a</code>. </p> <h3 id="5da9d89ac8414ad0bc9ebe322b831390"> Validators <a href="#5da9d89ac8414ad0bc9ebe322b831390" title="permalink">#</a> </h3> <p> Cool, we can now implement the validation logic; a function for each field, starting with the name: </p> <p> <pre><span style="color:#2b91af;">validateName</span>&nbsp;::&nbsp;<span style="color:blue;">Input</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Validation</span>&nbsp;(<span style="color:blue;">Endo</span>&nbsp;<span style="color:blue;">Input</span>,&nbsp;[<span style="color:#2b91af;">String</span>])&nbsp;<span style="color:#2b91af;">String</span> validateName&nbsp;(Input&nbsp;(Just&nbsp;name)&nbsp;_&nbsp;_)&nbsp;|&nbsp;<span style="color:blue;">length</span>&nbsp;name&nbsp;&gt;&nbsp;3&nbsp;=&nbsp;Success&nbsp;name validateName&nbsp;(Input&nbsp;(Just&nbsp;_)&nbsp;_&nbsp;_)&nbsp;= &nbsp;&nbsp;Failure&nbsp;(Endo&nbsp;$&nbsp;\x&nbsp;-&gt;&nbsp;x&nbsp;{&nbsp;inputName&nbsp;=&nbsp;Nothing&nbsp;},&nbsp;[<span style="color:#a31515;">&quot;no&nbsp;bob&nbsp;and&nbsp;toms&nbsp;allowed&quot;</span>]) validateName&nbsp;_&nbsp;=&nbsp;Failure&nbsp;(mempty,&nbsp;[<span style="color:#a31515;">&quot;name&nbsp;is&nbsp;required&quot;</span>]) </pre> </p> <p> This function reproduces the validation logic implied by <a href="https://forums.fsharp.org/t/thoughts-on-input-validation-pattern-from-a-noob/1541">the forum question that started it all</a>. Notice, particularly, that when the name is too short, the endomorphism resets <code>inputName</code> to <code>Nothing</code>. 
</p> <p> The date-of-birth validation function works the same way: </p> <p> <pre><span style="color:#2b91af;">validateDoB</span>&nbsp;::&nbsp;<span style="color:blue;">Day</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Input</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Validation</span>&nbsp;(<span style="color:blue;">Endo</span>&nbsp;<span style="color:blue;">Input</span>,&nbsp;[<span style="color:#2b91af;">String</span>])&nbsp;<span style="color:blue;">Day</span>
validateDoB&nbsp;now&nbsp;(Input&nbsp;_&nbsp;(Just&nbsp;dob)&nbsp;_)&nbsp;|&nbsp;addGregorianYearsRollOver&nbsp;(-12)&nbsp;now&nbsp;&lt;&nbsp;dob&nbsp;=
&nbsp;&nbsp;Success&nbsp;dob
validateDoB&nbsp;_&nbsp;(Input&nbsp;_&nbsp;(Just&nbsp;_)&nbsp;_)&nbsp;=
&nbsp;&nbsp;Failure&nbsp;(Endo&nbsp;$&nbsp;\x&nbsp;-&gt;&nbsp;x&nbsp;{&nbsp;inputDoB&nbsp;=&nbsp;Nothing&nbsp;},&nbsp;[<span style="color:#a31515;">&quot;get&nbsp;off&nbsp;my&nbsp;lawn&quot;</span>])
validateDoB&nbsp;_&nbsp;_&nbsp;=&nbsp;Failure&nbsp;(mempty,&nbsp;[<span style="color:#a31515;">&quot;dob&nbsp;is&nbsp;required&quot;</span>])
</pre> </p> <p> Again, the validation logic is inferred from the forum question, although I found it better to keep the function pure by requiring a <code>now</code> argument. 
</p> <p> The address validation is the simplest of the three validators: </p> <p> <pre><span style="color:#2b91af;">validateAddress</span>&nbsp;::&nbsp;<span style="color:blue;">Monoid</span>&nbsp;a&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:blue;">Input</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Validation</span>&nbsp;(a,&nbsp;[<span style="color:#2b91af;">String</span>])&nbsp;<span style="color:#2b91af;">String</span> validateAddress&nbsp;(Input&nbsp;_&nbsp;_&nbsp;(Just&nbsp;a))&nbsp;=&nbsp;Success&nbsp;a validateAddress&nbsp;_&nbsp;=&nbsp;Failure&nbsp;(mempty,&nbsp;[<span style="color:#a31515;">&quot;add1&nbsp;is&nbsp;required&quot;</span>]) </pre> </p> <p> This one's return type is actually more general than required, since I used <code>mempty</code> instead of <code>Endo id</code>. This means that it actually works for any <code>Monoid a</code>, which also includes <code>Endo Input</code>. </p> <h3 id="6d5502d178f143d58a4d3c5bef7c1f05"> Composition <a href="#6d5502d178f143d58a4d3c5bef7c1f05" title="permalink">#</a> </h3> <p> All three functions return <code><span style="color:blue;">Validation</span> (<span style="color:blue;">Endo Input</span>, [<span style="color:#2b91af;">String</span>])</code>, which has an <code>Applicative</code> instance. 
This means that we should be able to compose them together to get the behaviour we're looking for: </p> <p> <pre><span style="color:#2b91af;">validateInput</span>&nbsp;::&nbsp;<span style="color:blue;">Day</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Input</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Either</span>&nbsp;(<span style="color:blue;">Input</span>,&nbsp;[<span style="color:#2b91af;">String</span>])&nbsp;<span style="color:blue;">ValidInput</span> validateInput&nbsp;now&nbsp;args&nbsp;= &nbsp;&nbsp;toEither&nbsp;$ &nbsp;&nbsp;first&nbsp;(first&nbsp;(`appEndo`&nbsp;args))&nbsp;$ &nbsp;&nbsp;ValidInput&nbsp;&lt;$&gt;&nbsp;validateName&nbsp;args&nbsp;&lt;*&gt;&nbsp;validateDoB&nbsp;now&nbsp;args&nbsp;&lt;*&gt;&nbsp;validateAddress&nbsp;args </pre> </p> <p> That compiles, so it probably works. </p> <h3 id="9ed5a5fe379244a3bd1e9206a79a1ea9"> Sanity check <a href="#9ed5a5fe379244a3bd1e9206a79a1ea9" title="permalink">#</a> </h3> <p> Still, it'd be prudent to check. Since this is only a proof of concept, I'm not going to set up a test suite. 
Instead, I'll just start GHCi for some ad-hoc testing: </p> <p> <pre>λ&gt; now &lt;- localDay &lt;$&gt; zonedTimeToLocalTime &lt;$&gt; getZonedTime
λ&gt; validateInput now $ Input Nothing Nothing Nothing
Left (Input {inputName = Nothing, inputDoB = Nothing, inputAddress = Nothing},
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;["name is required","dob is required","add1 is required"])
λ&gt; validateInput now $ Input (Just "Bob") Nothing Nothing
Left (Input {inputName = Nothing, inputDoB = Nothing, inputAddress = Nothing},
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;["no bob and toms allowed","dob is required","add1 is required"])
λ&gt; validateInput now $ Input (Just "Alice") Nothing Nothing
Left (Input {inputName = Just "Alice", inputDoB = Nothing, inputAddress = Nothing},
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;["dob is required","add1 is required"])
λ&gt; validateInput now $ Input (Just "Alice") (Just $ fromGregorian 2002 10 12) Nothing
Left (Input {inputName = Just "Alice", inputDoB = Nothing, inputAddress = Nothing},
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;["get off my lawn","add1 is required"])
λ&gt; validateInput now $ Input (Just "Alice") (Just $ fromGregorian 2012 4 21) Nothing
Left (Input {inputName = Just "Alice", inputDoB = Just 2012-04-21, inputAddress = Nothing},
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;["add1 is required"])
λ&gt; validateInput now $ Input (Just "Alice") (Just $ fromGregorian 2012 4 21) (Just "x")
Right (ValidInput {validName = "Alice", validDoB = 2012-04-21, validAddress = "x"})</pre> </p> <p> In order to make the output more readable, I've manually edited the GHCi session by adding line breaks to the output. </p> <p> It looks like it's working like it's supposed to. Only the last line successfully parses the input and returns a <code>Right</code> value. 
</p> <h3 id="8bbb1d8ca355495b95e5e5ed85a924f4"> Conclusion <a href="#8bbb1d8ca355495b95e5e5ed85a924f4" title="permalink">#</a> </h3> <p> Before I started this proof of concept, I had an inkling of the way this would go. Instead of making the prototype in <a href="https://fsharp.org">F#</a>, I found it more productive to do it in Haskell, since Haskell enables me to compose things together. I particularly appreciate how a composition of types like <code>(<span style="color:blue;">Endo Input</span>, [<span style="color:#2b91af;">String</span>])</code> is automatically a <code>Semigroup</code> instance. I don't have to do anything. That makes the language great for prototyping things like this. </p> <p> Now that I've found the appropriate semigroup, I know how to convert the code to F#. That's in the next article. </p> <p> <strong>Next:</strong> <a href="/2020/12/28/an-f-demo-of-validation-with-partial-data-round-trip">An F# demo of validation with partial data round trip</a>. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="0ebea2f4d9c54072a5bb0c093a63fe14"> <div class="comment-author"><a href="https://about.me/tysonwilliams">Tyson Williams</a></div> <div class="comment-content"> <p> Great work and excellent post. I just had a few clarification questions. </p> <blockquote> <p> ...But what about <code>Input</code>? Can we somehow combine two <code>Input</code> values into one? It's not entirely clear how we should do that, so that doesn't seem too promising. </p> <p> What we need to do, however, is to take the original <code>Input</code> and modify it by (optionally) resetting one or more fields. In other words, a series of functions of the type <code>Input -&gt; Input</code>. Aha! There's the semigroup we need: <code>Endo Input</code>. </p> </blockquote> <p> How rhetorical are those questions? Whatever the case, I will take the bait. </p> <p> Any product type forms a semigroup if all of its elements do. 
You explicitly stated this for tuples of length 2; it also holds for records such as <code>Input</code>. Each field on that record has type <code>Maybe a</code> for some <code>a</code>, so it suffices to select a semigroup involving <code>Maybe a</code>. There are a few different semigroups involving <code>Maybe</code> that have different functions. </p> <p> I think the most common semigroup for <code>Maybe a</code> has the function that returns the first <code>Just _</code> if one exists or else returns <code>Nothing</code>. Combining that with <code>Nothing</code> as the identity element gives the monoid that is typically associated with <code>Maybe a</code> (which I know by the name monoidal plus). Another monoid, and therefore a semigroup, is to return the last <code>Just _</code> instead of the first. </p> <p> Instead of having a preference for <code>Just _</code>, the function could have a preference for <code>Nothing</code>. As before, when both inputs are <code>Just _</code>, the output could be either of the inputs. </p> <p> I think either of those last two semigroups will achieve the desired behavior in the problem at hand. Your code never replaces an instance of <code>Just a</code> with a different instance, so we don't need a preference for some input when they are both <code>Just _</code>. </p> <p> In the end though, I think the semigroup you derived from <code>Endo</code> leads to simpler code. </p> <p> At the end of the type signature for <code>validateName</code> / <code>validateDoB</code> / <code>validateAddress</code>, what does <code>String</code> / <code>Day</code> / <code>String</code> mean? </p> <p> Why did you pass all three arguments into every parsing/validation function? I think it is a bit simpler to only pass in the needed argument. Maybe you thought this was good enough for prototype code. </p> <p> Why did you use <code>add1</code> in your error message instead of <code>address</code>? 
Was it only for prototype code to make the message a bit shorter? </p> </div> <div class="comment-date">2020-12-21 14:21 UTC</div> </div> <div class="comment" id="510b9be50c1b43c18973008b89d2da38"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tyson, thank you for writing. The semigroup you suggest, I take it, would look something like this: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;Perhaps&nbsp;a&nbsp;=&nbsp;Perhaps&nbsp;{&nbsp;runPerhaps&nbsp;::&nbsp;Maybe&nbsp;&nbsp;a&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Semigroup</span>&nbsp;(<span style="color:blue;">Perhaps</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;Perhaps&nbsp;Nothing&nbsp;&lt;&gt;&nbsp;_&nbsp;=&nbsp;Perhaps&nbsp;Nothing &nbsp;&nbsp;_&nbsp;&lt;&gt;&nbsp;Perhaps&nbsp;Nothing&nbsp;=&nbsp;Perhaps&nbsp;Nothing &nbsp;&nbsp;Perhaps&nbsp;(Just&nbsp;x)&nbsp;&lt;&gt;&nbsp;_&nbsp;=&nbsp;Perhaps&nbsp;(Just&nbsp;x)</pre> </p> <p> That might work, but it's an atypical semigroup. I <em>think</em> that it's lawful - at least, I can't come up with a counterexample against associativity. It seems reminiscent of Boolean <em>and</em> (the <em>All</em> monoid), but it isn't a monoid, as far as I can tell. </p> <p> Granted, a <code>Monoid</code> constraint isn't required to make the validation code work, but following the <a href="https://en.wikipedia.org/wiki/Principle_of_least_astonishment">principle of least surprise</a>, I still think that picking a well-known semigroup such as <code>Endo</code> is preferable. </p> <p> Regarding your second question, the type signature of e.g. 
<code>validateName</code> is: </p> <p> <pre><span style="color:#2b91af;">validateName</span>&nbsp;::&nbsp;<span style="color:blue;">Input</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Validation</span>&nbsp;(<span style="color:blue;">Endo</span>&nbsp;<span style="color:blue;">Input</span>,&nbsp;[<span style="color:#2b91af;">String</span>])&nbsp;<span style="color:#2b91af;">String</span></pre> </p> <p> Like <code>Either</code>, <code>Validation</code> has two type arguments: <code>err</code> and <code>a</code>; it's defined as <code>data Validation err a</code>. In the above function type, the return value is a <code>Validation</code> value where the <code>err</code> type is <code>(Endo Input, [String])</code> and <code>a</code> is <code>String</code>. </p> <p> All three validation functions share a common <code>err</code> type: <code>(Endo Input, [String])</code>. On the other hand, they return various <code>a</code> types: <code>String</code>, <code>Day</code>, and <code>String</code>, respectively. </p> <p> Regarding your third question, I could also have defined the functions so that they would only have taken the values they'd need to validate. That would better fit <a href="https://en.wikipedia.org/wiki/Robustness_principle">Postel's law</a>, so I should probably have done that... </p> <p> As for the last question, I was just following the 'spec' implied by <a href="https://forums.fsharp.org/t/thoughts-on-input-validation-pattern-from-a-noob/1541">the original forum question</a>. </p> </div> <div class="comment-date">2020-12-22 15:05 UTC</div> </div> </div> Mark Seemann https://blog.ploeh.dk/2020/12/21/a-haskell-proof-of-concept-of-validation-with-partial-data-round-trip Validation, a solved problem? 
https://blog.ploeh.dk/2020/12/14/validation-a-solved-problem/ Mon, 14 Dec 2020 08:28:00 UTC <div id="post"> <p> <em>A validation problem with a twist.</em> </p> <p> Until recently, I thought that data validation was a solved problem: <a href="/2018/11/05/applicative-validation">Use an applicative functor</a>. I then encountered <a href="https://forums.fsharp.org/t/thoughts-on-input-validation-pattern-from-a-noob/1541">a forum question</a> that for a few minutes shook my faith. </p> <p> After brief consideration, though, I realised that all is good. Validation, even with a twist, is successfully modelled with an <a href="/2018/10/01/applicative-functors">applicative functor</a>. Faith in computer science restored. </p> <h3 id="891801e3802f49d8a51b56bebaacecb9"> The twist <a href="#891801e3802f49d8a51b56bebaacecb9" title="permalink">#</a> </h3> <p> Usually, when you see a demo of applicative validation, the result of validating is one of two: either <a href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate">a parsed result</a>, or a collection of error messages. </p> <p> <pre>λ&gt; validateReservation $ ReservationJson "2017-06-30 19:00:00+02:00" 4 "Jane Doe" "j@example.com" Validation (Right (Reservation { &nbsp;&nbsp;&nbsp;&nbsp;reservationDate = 2017-06-30 19:00:00 +0200, &nbsp;&nbsp;&nbsp;&nbsp;reservationQuantity = 4, &nbsp;&nbsp;&nbsp;&nbsp;reservationName = "Jane Doe", &nbsp;&nbsp;&nbsp;&nbsp;reservationEmail = "j@example.com"})) λ&gt; validateReservation $ ReservationJson "2017/14/12 6pm" 4.1 "Jane Doe" "jane.example.com" Validation (Left ["Not a date.","Not a positive integer.","Not an email address."]) λ&gt; validateReservation $ ReservationJson "2017-06-30 19:00:00+02:00" (-3) "Jane Doe" "j@example.com" Validation (Left ["Not a positive integer."])</pre> </p> <p> (Example from <a href="/2018/11/05/applicative-validation">Applicative validation</a>.) </p> <p> What if, instead, you're displaying an input form? 
When users enter data, you want to validate it. Imagine, for the rest of this short series of articles that the input form has three fields: <em>name</em>, <em>date of birth</em>, and <em>address</em>. Each piece of data has associated validation rules. </p> <p> If you enter a valid name, but an invalid date of birth, you want to clear the input form's date of birth, but not the name. It's such a bother for a user having to retype valid data just because a single field turned out to be invalid. </p> <p> Imagine, for example, that you want to bind the form to a data model like this <a href="https://fsharp.org">F#</a> record type: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;Input&nbsp;=&nbsp;{&nbsp;Name&nbsp;:&nbsp;string&nbsp;option;&nbsp;DoB&nbsp;:&nbsp;DateTime&nbsp;option;&nbsp;Address&nbsp;:&nbsp;string&nbsp;option}</pre> </p> <p> Each of these three fields is optional. We'd like validation to work in the following way: If validation fails, the function should return <em>both</em> a list of error messages, and <em>also</em> the <code>Input</code> object, with valid data retained, but invalid data cleared. </p> <p> One of the rules implied in the forum question is that names must be more than three characters long. 
Thus, input like this is invalid: </p> <p> <pre>{&nbsp;Name&nbsp;=&nbsp;Some&nbsp;<span style="color:#a31515;">&quot;Tom&quot;</span>;&nbsp;DoB&nbsp;=&nbsp;Some&nbsp;eightYearsAgo;&nbsp;Address&nbsp;=&nbsp;Some&nbsp;<span style="color:#a31515;">&quot;x&quot;</span>&nbsp;}</pre> </p> <p> Both the <code>DoB</code> and <code>Address</code> fields, however, are valid, so, along with error messages, we'd like our validation function to return a partially wiped <code>Input</code> value: </p> <p> <pre>{&nbsp;Name&nbsp;=&nbsp;None;&nbsp;DoB&nbsp;=&nbsp;Some&nbsp;eightYearsAgo;&nbsp;Address&nbsp;=&nbsp;Some&nbsp;<span style="color:#a31515;">&quot;x&quot;</span>&nbsp;}</pre> </p> <p> Notice that both <code>DoB</code> and <code>Address</code> field values are retained, while <code>Name</code> has been reset. </p> <p> A final requirement: If validation succeeds, the return value should be a <a href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate">parsed value that captures that validation took place</a>: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;ValidInput&nbsp;=&nbsp;{&nbsp;Name&nbsp;:&nbsp;string;&nbsp;DoB&nbsp;:&nbsp;DateTime;&nbsp;Address&nbsp;:&nbsp;string&nbsp;}</pre> </p> <p> That requirement is straightforward. That's how you'd usually implement applicative validation. It's the partial data round-trip that seems to throw a spanner in the works. </p> <p> How should we model such validation? </p> <h3 id="534fcc2d66f242a0ba10a9aca7827276"> Theory, applied <a href="#534fcc2d66f242a0ba10a9aca7827276" title="permalink">#</a> </h3> <p> There's a subculture of functional programming that draws heavily on <a href="https://en.wikipedia.org/wiki/Category_theory">category theory</a>. This is most prevalent in <a href="https://www.haskell.org">Haskell</a>. I've been studying category theory in an attempt to understand what it's all about. 
I even wrote <a href="/2017/10/04/from-design-patterns-to-category-theory">a substantial article series</a> about some design patterns and how they relate to theory. </p> <p> One thing I learned <em>after</em> I'd named that article series is that most of the useful theoretical concepts come from <a href="https://en.wikipedia.org/wiki/Abstract_algebra">abstract algebra</a>, with the possible exception of monads. </p> <p> People often ask me: does all that theory have any practical use? </p> <p> Yes, it does, as it turns out. It did, for example, enable me to identify a solution to the above twist in five to ten minutes. </p> <p> It's a discussion that I often have, particularly with the always friendly F# community. <em>Do you have to understand <a href="/2018/03/22/functors">functors</a>, monads, etcetera to be a productive F# developer?</em> </p> <p> To anyone who wants to learn F# I'd respond: Don't worry about that at the gate. Find a good learning resource and dive right in. It's a friendly language that you can learn gradually. </p> <p> Sooner or later, though, you'll run into knotty problems that you may struggle to address. I've seen this enough times that it looks like a pattern. The present forum question is just one example. A beginner or intermediate F# programmer will typically attempt to solve the problem in an ad-hoc manner that may or may not be easy to maintain. (The solution proposed by the author of that forum question doesn't, by the way, look half bad.) </p> <p> To be clear: there's nothing wrong with being a beginner. I was once a beginner programmer, and I'm <em>still</em> a beginner in multiple ways. What I'm trying to argue here is that there <em>is</em> value in knowing theory. With my knowledge of abstract algebra and how it applies to functional programming, it didn't take me long to identify a solution. I'll get to that later. </p> <p> Before I outline a solution, I'd like to round off the discussion of applied theory. 
That question about monads comes up a lot. <em>Do I have to understand functors, monads, etcetera to be a good F# developer?</em> </p> <p> I think it's like asking <em>Do I have to understand polymorphism, design patterns, the <a href="https://en.wikipedia.org/wiki/SOLID">SOLID principles</a>, etcetera to be a good object-oriented programmer?</em> </p> <p> Those are typically not the first topics people are taught about OOD. I would assert, however, that understanding such topics does help. They may not be required to get started with OOP, but knowing them makes you a better programmer. </p> <p> I think the same is true for functional programming. It's just a different skill set that makes you better in that paradigm. </p> <h3 id="78fdc446770a4ec48b556e0826a59ce9"> Solution outline <a href="#78fdc446770a4ec48b556e0826a59ce9" title="permalink">#</a> </h3> <p> When you know a bit of theory, you may know that validation can be implemented with an applicative sum type like <a href="/2019/01/14/an-either-functor">Either</a> (AKA <em>Result</em>), with one extra requirement. </p> <p> Either <a href="/2019/01/07/either-bifunctor">has two dimensions</a>, <em>left</em> or <em>right</em> (<em>success</em> or <em>failure</em>, <em>ok</em> or <em>error</em>, etcetera). The applicative nature of it already supplies a way to compose the successes, but what if there's more than one validation error? </p> <p> In my <a href="/2018/11/05/applicative-validation">article about applicative validation</a> I showed how to collect multiple error messages in a list. Lists, however, <a href="/2017/10/10/strings-lists-and-sequences-as-a-monoid">form a monoid</a>, so I typed the validation API to be that flexible. </p> <p> In fact, all you need is a <a href="/2017/11/27/semigroups">semigroup</a>. 
When I wrote the article on applicative validation, Haskell's <code>Semigroup</code> type class wasn't yet a supertype of <code>Monoid</code>, and I (perhaps without sufficient contemplation) just went with <code>Monoid</code>. </p> <p> What remains is that applicative validation can collect errors for <em>any</em> semigroup of errors. All we need to solve the above validation problem with a twist, then, is to identify a suitable semigroup. </p> <p> I don't want to give away everything in this article, so I'm going to leave you with this cliffhanger. Which semigroup solves the problem? Read on. <ul> <li><a href="/2020/12/21/a-haskell-proof-of-concept-of-validation-with-partial-data-round-trip">A Haskell proof of concept of validation with partial data round trip</a></li> <li><a href="/2020/12/28/an-f-demo-of-validation-with-partial-data-round-trip">An F# demo of validation with partial data round trip</a></li> </ul> As is often my modus operandi, I first did a proof of concept in Haskell. With its type classes and higher-kinded polymorphism, it's much faster to prototype solutions than even in F#. In the next article, I'll describe how that turned out. </p> <p> After the Haskell article, I'll show how it translates to F#. You can skip the Haskell article if you like. </p> <h3 id="048492d078164a648dddf6c57dbaf490"> Conclusion <a href="#048492d078164a648dddf6c57dbaf490" title="permalink">#</a> </h3> <p> I still think that validation is a solved problem. It's always interesting when such a belief for a moment is challenged, and satisfying to discover that it still holds. </p> <p> This is, after all, not proof of anything. Perhaps tomorrow, someone will throw another curve ball that I can't catch. If that happens, I'll have to update my beliefs. Until then, I'll consider validation a solved problem. 
</p> <p> <strong>Next:</strong> <a href="/2020/12/21/a-haskell-proof-of-concept-of-validation-with-partial-data-round-trip">A Haskell proof of concept of validation with partial data round trip</a>. </p> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Mark Seemann https://blog.ploeh.dk/2020/12/14/validation-a-solved-problem Branching tests https://blog.ploeh.dk/2020/12/07/branching-tests/ Mon, 07 Dec 2020 06:25:00 UTC <div id="post"> <p> <em>Is it ever okay to branch and loop in a unit test?</em> </p> <p> When I coach development organisations about unit testing and test-driven development, there's often a sizeable group of developers who don't see the value of unit testing. Some of the arguments they typically use are worth considering. </p> <p> A common complaint is that it's difficult to see the wisdom in writing code to prevent defects in code. That's not an unreasonable objection. </p> <p> <a href="/2020/05/25/wheres-the-science">We have scant scientific knowledge about software engineering</a>, but the little we know suggests that the number of defects is proportional to lines of code. The more lines of code, the more defects. </p> <p> If that's true, adding more code - even when it's test code - seems like a bad idea. </p> <h3 id="88c3fa7503454d9f9e329db299f70881"> Reasons to trust test code <a href="#88c3fa7503454d9f9e329db299f70881" title="permalink">#</a> </h3> <p> First, we should consider the possibility that the correlation between lines of code and defects doesn't mean that defects are <em>evenly</em> distributed. As <a href="https://www.adamtornhill.com">Adam Tornhill</a> argues in <a href="https://amzn.to/36Pd5EE">Your Code as a Crime Scene</a>, defects tend to cluster in hotspots. </p> <p> You can have a large proportion of your code base which is, for all intents and purposes, bug-free, and hotspots where defects keep spawning. 
</p> <p> If this is true, adding test code isn't a problem if you can keep it bug-free. </p> <p> That, however, sounds like a chicken-and-the-egg kind of problem. How can you know that test code is bug-free without tests? </p> <p> I've <a href="/2013/04/02/why-trust-tests">previously answered that question</a>. In short, you can trust a test for two reasons: <ul> <li>You've seen it fail (haven't you?)</li> <li>It's simple</li> </ul> I usually think of the simplicity criterion as a limit on <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">cyclomatic complexity</a>: it should be <em>1</em>. This means no branching and no loops in your tests. </p> <p> That's what this article is actually about. </p> <h3 id="0494e1e3524d46b186ad7fc039dc7c17"> What's in a name? <a href="#0494e1e3524d46b186ad7fc039dc7c17" title="permalink">#</a> </h3> <p> I was working with an online restaurant reservation system (example code), and had written this test: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;2023-11-24&nbsp;19:00&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;juliad@example.net&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Julia&nbsp;Domna&quot;</span>,&nbsp;5)] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;2024-02-13&nbsp;18:15&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;x@example.com&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Xenia&nbsp;Ng&quot;</span>,&nbsp;9)] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;PostValidReservationWhenDatabaseIsEmpty( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;at, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">int</span>&nbsp;quantity) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;db&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">FakeDatabase</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sut&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationsController</span>(db); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;dto&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationDto</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;At&nbsp;=&nbsp;at, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Email&nbsp;=&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Name&nbsp;=&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;quantity &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;sut.Post(dto); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;expected&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">DateTime</span>.Parse(dto.At,&nbsp;<span style="color:#2b91af;">CultureInfo</span>.InvariantCulture), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dto.Email, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dto.Name, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dto.Quantity); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Contains(expected,&nbsp;db); }</pre> </p> <p> This is a <a href="/2019/02/18/from-interaction-based-to-state-based-testing">state-based test</a> that verifies that a valid reservation makes it to the database. The test has a cyclomatic complexity of <em>1</em>, and I've seen it fail, so all is good. 
(It may, in fact, contain a future maintenance problem, but that's a topic for <a href="/2021/01/11/waiting-to-happen">another article</a>.) </p> <p> What constitutes a valid reservation? At the very least, we should demand that <code>At</code> is a valid date and time, and that <code>Quantity</code> is a positive number. The restaurant would like to be able to email a confirmation to the user, so an email address is also required. Email addresses are notoriously difficult to validate, so we'll just require that the string isn't null. </p> <p> What about the <code>Name</code>? I thought about this a bit and decided that, according to <a href="https://en.wikipedia.org/wiki/Robustness_principle">Postel's law</a>, the system should accept null names. The name is only a convenience; the system doesn't need it, it's just there so that when you arrive at the restaurant, you can say <em>"I have a reservation for Julia"</em> instead of giving an email address to the <a href="https://en.wikipedia.org/wiki/Ma%C3%AEtre_d%27h%C3%B4tel">maître d'hôtel</a>. But then, if you didn't supply a name when you made the reservation, you can always state your email address when you arrive. To summarise, the name is just a convenience, not a requirement. </p> <p> This decision meant that I ought to write a test case with a null name. </p> <p> That turned out to present a problem. I'd defined the <code>Reservation</code> class so that it didn't accept <code>null</code> arguments, and I think that's the appropriate design. Null is just evil and has no place in my domain models. </p> <p> That's not a problem in itself. In this case, I think it's acceptable to convert a null name to the empty string. </p> <h3 id="f0caaae1c01d47378b0f23a8d6a95d72"> Copy and paste <a href="#f0caaae1c01d47378b0f23a8d6a95d72" title="permalink">#</a> </h3> <p> Allow me to summarise. If you consider the above unit test, I needed a third test case with a null <code>name</code>. 
In that case, <code>expected</code> should be a <code>Reservation</code> value with the name <code>""</code>. Not <code>null</code>, but <code>""</code>. </p> <p> As far as I can tell, you can't easily express that in <code>PostValidReservationWhenDatabaseIsEmpty</code> without increasing its cyclomatic complexity. Based on the above introduction, that seems like a no-no. </p> <p> What's the alternative? Should I copy the test and adjust the <em>single</em> line of code that differs? If I did, it would look like this: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;2023-08-23&nbsp;16:55&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;kite@example.edu&quot;</span>,&nbsp;<span style="color:blue;">null</span>,&nbsp;2)] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;PostValidReservationWithNullNameWhenDatabaseIsEmpty( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;at, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;quantity) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;db&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">FakeDatabase</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sut&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationsController</span>(db); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;dto&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationDto</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;At&nbsp;=&nbsp;at, 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Email&nbsp;=&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Name&nbsp;=&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;quantity &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;sut.Post(dto); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;expected&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">DateTime</span>.Parse(dto.At,&nbsp;<span style="color:#2b91af;">CultureInfo</span>.InvariantCulture), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dto.Email, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dto.Quantity); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Contains(expected,&nbsp;db); }</pre> </p> <p> Apart from the values in the <code>[InlineData]</code> attribute and the method name, the <em>only</em> difference from <code>PostValidReservationWhenDatabaseIsEmpty</code> is that <code>expected</code> has a hard-coded name of <code>""</code>. </p> <p> This is not acceptable. </p> <p> There's a common misconception that the <a href="https://en.wikipedia.org/wiki/Don%27t_repeat_yourself">DRY</a> principle doesn't apply to unit tests. I don't see why this should be true. The DRY principle exists because copy-and-paste code is difficult to maintain. Unit test code is also code that you have to maintain. All the rules about writing maintainable code also apply to unit test code. </p> <h3 id="1d10bd69a6364e6984cb808ab5d9a8f8"> Branching in test <a href="#1d10bd69a6364e6984cb808ab5d9a8f8" title="permalink">#</a> </h3> <p> What's the alternative? 
One option (that shouldn't be easily dismissed) is to introduce a <a href="http://xunitpatterns.com/Test%20Helper.html">Test Helper</a> to perform the conversion from a nullable name to a non-nullable name. Such a helper would have a cyclomatic complexity of <em>2</em>, but could be unit tested in isolation. It might even turn out that it'd be useful in the production code. </p> <p> Still, that seems like overkill, so I instead made the taboo move and added branching logic to the existing test to see how it'd look: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;2023-11-24&nbsp;19:00&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;juliad@example.net&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Julia&nbsp;Domna&quot;</span>,&nbsp;5)] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;2024-02-13&nbsp;18:15&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;x@example.com&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Xenia&nbsp;Ng&quot;</span>,&nbsp;9)] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;2023-08-23&nbsp;16:55&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;kite@example.edu&quot;</span>,&nbsp;<span style="color:blue;">null</span>,&nbsp;2)] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;PostValidReservationWhenDatabaseIsEmpty( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;at, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;quantity) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;db&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span 
style="color:#2b91af;">FakeDatabase</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sut&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationsController</span>(db); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;dto&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationDto</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;At&nbsp;=&nbsp;at, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Email&nbsp;=&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Name&nbsp;=&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;quantity &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;sut.Post(dto); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;expected&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">DateTime</span>.Parse(dto.At,&nbsp;<span style="color:#2b91af;">CultureInfo</span>.InvariantCulture), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dto.Email, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dto.Name&nbsp;??&nbsp;<span style="color:#a31515;">&quot;&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dto.Quantity); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Contains(expected,&nbsp;db); }</pre> </p> <p> Notice that the <code>expected</code> name is now computed as <code>dto.Name ?? ""</code>. Perhaps you think about branching instructions as relating exclusively to keywords such as <code>if</code> or <code>switch</code>, but the <code>??</code> operator is also a branching instruction. The test now has a cyclomatic complexity of <code>2</code>. </p> <p> Is that okay? 
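</p> <p> If it isn't obvious why <code>??</code> counts as a branch, a minimal sketch may help. <code>NameOrEmpty</code> below is a hypothetical helper that isn't part of the restaurant code base; I only assume that it's meant to behave like <code>name ?? ""</code>, with the two paths written out as an explicit <code>if</code>. </p> <p> <pre>using System;

public static class NullCoalescingSketch
{
    // Hypothetical helper, for illustration only:
    // intended to behave like `name ?? ""`.
    public static string NameOrEmpty(string name)
    {
        if (name == null)
        {
            return "";   // the path taken when the left operand of ?? is null
        }
        return name;     // the path taken when the left operand is not null
    }

    public static void Main()
    {
        string present = "Julia Domna";
        string missing = null;

        // Compare the explicit branch with the ?? operator for both paths.
        Console.WriteLine(NameOrEmpty(present) == (present ?? ""));
        Console.WriteLine(NameOrEmpty(missing) == (missing ?? ""));
    }
}</pre> </p> <p> Two paths through the code means that you need at least two inputs to exercise both: one where the name is present, and one where it's null.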
</p> <h3 id="baf81334614241e7b36bbe7f8f1af821"> To branch or not to branch <a href="#baf81334614241e7b36bbe7f8f1af821" title="permalink">#</a> </h3> <p> I think that in this case, it's okay to slightly increase the cyclomatic complexity of the test. It's not something I just pull out of my hat, though. I think it's possible to adjust the above heuristics to embrace this sort of variation. </p> <p> To be clear, I consider this an <em>advanced</em> practice. If you're just getting started with unit testing, try to keep tests simple. Keep the cyclomatic complexity at <em>1</em>. </p> <p> Had I been in the above situation a couple of years ago, I might not have considered this option. About a year ago, though, I watched <a href="https://en.wikipedia.org/wiki/John_Hughes_(computer_scientist)">John Hughes'</a> presentation <a href="https://youtu.be/NcJOiQlzlXQ">Building on developers' intuitions to create effective property-based tests</a>. When he, about 15 minutes in, wrote a test with a branching instruction, I remember becoming quite uncomfortable. This lasted for a while until I understood where he was going with it. It's truly an inspiring and illuminating talk; I highly recommend it. </p> <p> How it relates to the problem presented here is through <em>coverage</em>. While the <code>PostValidReservationWhenDatabaseIsEmpty</code> test now has a cyclomatic complexity of <em>2</em>, it's a parametrised test with three test cases. Two of these cover one branch, and the third covers the other. </p> <p> What's more important is the process that produced the test. I added one test case at a time, and for each case, <em>I saw the test fail</em>. </p> <p> Specifically, when I added the third test case with the null name, I first added the branching expression <code>dto.Name ?? ""</code> and ran the two existing tests. They still both passed, which bolstered my belief that they both exercised the left branch of that expression. 
I then added the third case and saw that it (and only it) failed. This supported my belief that the third case exercised the right branch of <code>??</code>. </p> <p> Branching in unit tests isn't something I do lightly. I still believe that it could make the test more vulnerable to future changes. I'm particularly worried about making a future change that might shift one or more of these test cases into false negatives in the form of <a href="/2019/10/14/tautological-assertion">tautological assertions</a>. </p> <h3 id="bfa3a3d3f82044f289568efcc675c70f"> Conclusion <a href="#bfa3a3d3f82044f289568efcc675c70f" title="permalink">#</a> </h3> <p> As you can tell, when I feel that I'm moving onto thin ice, I move deliberately. If there's one thing I've learned from decades of professional programming it's that my brain loves jumping to conclusions. Moving slowly and deliberately is my attempt at countering this tendency. I believe that it enables me to go faster in the long run. </p> <p> I don't think that branching in unit tests should be common, but I believe that it may be occasionally valid. The key, I think, is to guarantee that each branch in the test is covered by a test case. The implication is that there must be <em>at least</em> as many test cases as the cyclomatic complexity. In other words, the test <em>must</em> be a parametrised test. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="cbbf719d6e1744c9879edb6242bbdf21"> <div class="comment-author"><a href="http://www.morcs.com">James Morcom</a></div> <div class="comment-content"> <p>Hi Mark, I guess there is implicit cyclomatic complexity in the testing framework itself (For example, it loops through the <code>InlineData</code> records). That feels fine though, does this somehow have less cost than cyclomatic complexity in the test code itself? I guess, as you mentioned, it's acceptable because the alternative is violation of DRY. 
</p> <p> With this in mind, I wonder how you feel about adding an <code>expectedName</code> parameter to the <code>InlineData</code> attributes, instead of the conditional in the test code? Maybe it's harder to read though when the test data includes input and output. </p> </div> <div class="comment-date">2020-12-07 08:36 UTC</div> </div> <div class="comment" id="1f5efe14d22441ffac85f5f5afc9b3b1"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> James, thank you for writing. I consider <a href="/2019/12/09/put-cyclomatic-complexity-to-good-use#de927bfcc95d410bbfcd0adf7a63926b">the cyclomatic complexity of a method call to be <em>1</em></a>, and Visual Studio code metrics agree with me. Whatever happens in a framework should, in my opinion, likewise be considered as encapsulated abstraction that's none of our business. </p> <p> Adding an <code>expectedName</code> parameter to the method is definitely an option. I sometimes do that, and I could have done that here, too. In this situation, I think it's a toss-up. It'd make it harder for a later reader of the code to parse the test cases, but would simplify the test code itself, so that alternative comes with both advantages and disadvantages. </p> </div> <div class="comment-date">2020-12-08 11:02 UTC</div> </div> <div class="comment" id="e5a0f70c39f411ebadc10242ac120002"> <div class="comment-author">Romain Deneau <a href="https://twitter.com/DeneauRomain">@DeneauRomain</a></div> <div class="comment-content"> <p> Hi Mark. To build up on the additional <code>expectedName</code> parameter, instead of keeping a single test with the 3 cases but the last being an edge case, I prefer to introduce a specific test for the last case. 
</p> <p> Then, to remove the duplication, we can extract a common method which will take this additional <code>expectedName</code> parameter: </p> <p> <pre> [<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;2023-11-24&nbsp;19:00&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;juliad@example.net&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Julia&nbsp;Domna&quot;</span>,&nbsp;5)] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;2024-02-13&nbsp;18:15&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;x@example.com&quot;</span>,&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;Xenia&nbsp;Ng&quot;</span>,&nbsp;&nbsp;&nbsp;&nbsp;9)] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;PostValidReservationWithNameWhenDatabaseIsEmpty &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">string</span>&nbsp;at,&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;email,&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;name,&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;quantity)&nbsp;=> <span style="color:blue;">await</span>&nbsp;PostValidReservationWhenDatabaseIsEmpty(at, email, name, expectedName: name, quantity); [<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;PostValidReservationWithoutNameWhenDatabaseIsEmpty()&nbsp;=> <span style="color:blue;">await</span>&nbsp;PostValidReservationWhenDatabaseIsEmpty( at          : <span style="color:#a31515;">&quot;2023-11-24&nbsp;19:00&quot;</span>, email       : <span style="color:#a31515;">&quot;juliad@example.net&quot;</span>, name        : <span style="color:blue;">null</span>, expectedName: <span 
style="color:#a31515;">&quot;&quot;</span>, quantity    : 5); <span style="color:blue;">private</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;PostValidReservationWhenDatabaseIsEmpty( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;at, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;expectedName, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>   &nbsp;quantity) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;db&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">FakeDatabase</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sut&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationsController</span>(db); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;dto&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationDto</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;At&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;=&nbsp;at, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Email&nbsp;&nbsp;&nbsp;&nbsp;=&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Name&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;=&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;quantity, &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;sut.Post(dto); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;expected&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">DateTime</span>.Parse(dto.At,&nbsp;<span 
style="color:#2b91af;">CultureInfo</span>.InvariantCulture), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dto.Email, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;expectedName,&nbsp;<span style="color:green;">// /!\ Not `dto.Name`</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dto.Quantity); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Contains(expected,&nbsp;db); } </pre> </p> </div> <div class="comment-date">2020-12-09 8:44 UTC</div> </div> <div class="comment" id="2779280fced748b0879a4d7267f3d634"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Romain, thank you for writing. There are, indeed, many ways to skin that cat. If you're comfortable with distributing a test over more than one method, I instead prefer to use another data source for the <code>[Theory]</code> attribute: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">PostValidReservationWhenDatabaseIsEmptyTestCases</span>&nbsp;: &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">TheoryData</span>&lt;<span style="color:#2b91af;">ReservationDto</span>,&nbsp;<span style="color:#2b91af;">Reservation</span>&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">PostValidReservationWhenDatabaseIsEmptyTestCases</span>() &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AddWithName(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2023,&nbsp;11,&nbsp;24,&nbsp;19,&nbsp;0,&nbsp;0),&nbsp;<span style="color:#a31515;">&quot;juliad@example.net&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Julia&nbsp;Domna&quot;</span>,&nbsp;5); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AddWithName(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2024,&nbsp;2,&nbsp;13,&nbsp;18,&nbsp;15,&nbsp;0),&nbsp;<span 
style="color:#a31515;">&quot;x@example.com&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Xenia&nbsp;Ng&quot;</span>,&nbsp;9); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AddWithoutName(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2023,&nbsp;8,&nbsp;23,&nbsp;16,&nbsp;55,&nbsp;0),&nbsp;<span style="color:#a31515;">&quot;kite@example.edu&quot;</span>,&nbsp;2); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">void</span>&nbsp;AddWithName(<span style="color:#2b91af;">DateTime</span>&nbsp;at,&nbsp;<span style="color:blue;">string</span>&nbsp;email,&nbsp;<span style="color:blue;">string</span>&nbsp;name,&nbsp;<span style="color:blue;">int</span>&nbsp;quantity) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Add(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationDto</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;At&nbsp;=&nbsp;at.ToString(<span style="color:#a31515;">&quot;O&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Email&nbsp;=&nbsp;email, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Name&nbsp;=&nbsp;name, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;quantity &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>(at,&nbsp;email,&nbsp;name,&nbsp;quantity)); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span 
style="color:blue;">void</span>&nbsp;AddWithoutName(<span style="color:#2b91af;">DateTime</span>&nbsp;at,&nbsp;<span style="color:blue;">string</span>&nbsp;email,&nbsp;<span style="color:blue;">int</span>&nbsp;quantity) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Add(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationDto</span>&nbsp;{&nbsp;At&nbsp;=&nbsp;at.ToString(<span style="color:#a31515;">&quot;O&quot;</span>),&nbsp;Email&nbsp;=&nbsp;email,&nbsp;Quantity&nbsp;=&nbsp;quantity&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>(at,&nbsp;email,&nbsp;<span style="color:#a31515;">&quot;&quot;</span>,&nbsp;quantity)); &nbsp;&nbsp;&nbsp;&nbsp;} } [<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">ClassData</span>(<span style="color:blue;">typeof</span>(<span style="color:#2b91af;">PostValidReservationWhenDatabaseIsEmptyTestCases</span>))] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;PostValidReservationWhenDatabaseIsEmpty( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">ReservationDto</span>&nbsp;dto,&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;expected) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;db&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">FakeDatabase</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sut&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationsController</span>(db); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;sut.Post(dto); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Contains(expected,&nbsp;db); }</pre> </p> <p> Whether you prefer one over the other is, I think, 
subjective. I like my alternative, using a <code>[ClassData]</code> source, better, because I find it a bit more principled and 'pattern-based', if you will. I also like how small the actual test method becomes. </p> <p> Your solution, on the other hand, is more portable, in the sense that you could also apply it in a testing framework that doesn't have the sort of capability that xUnit.net has. That's a definite benefit with your suggestion. </p> </div> <div class="comment-date">2020-12-10 20:05 UTC</div> </div> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Mark Seemann https://blog.ploeh.dk/2020/12/07/branching-tests Name by role https://blog.ploeh.dk/2020/11/30/name-by-role/ Mon, 30 Nov 2020 06:31:00 UTC <div id="post"> <p> <em>Consider naming variables according to their role, instead of their type.</em> </p> <p> My <a href="/2020/11/23/good-names-are-skin-deep">recent article on good names</a> might leave you with the impression that I consider good names unimportant. Not at all. That article was an attempt at delineating the limits of naming. Good names aren't the panacea some people seem to imply, but they're still important. </p> <p> As the cliché goes, naming is one of the hardest problems in software development. Perhaps it's hard because you have to do it so frequently. Every time you create a variable, you have to name it. It's also an opportunity to add clarity to a code base. </p> <p> A common naming strategy is to name objects after their type: </p> <p> <pre><span style="color:#2b91af;">Reservation</span>?&nbsp;reservation&nbsp;=&nbsp;dto.Validate(id);</pre> </p> <p> or: </p> <p> <pre><span style="color:#2b91af;">Restaurant</span>?&nbsp;restaurant&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;RestaurantDatabase.GetRestaurant(restaurantId);</pre> </p> <p> There's nothing inherently wrong with a naming scheme like this. It often makes sense. 
The <code>reservation</code> variable is a <code>Reservation</code> object, and there's not that much more to say about it. The same goes for the <code>restaurant</code> object. </p> <p> In some contexts, however, objects play specific <em>roles</em>. This is particularly prevalent with primitive types, but can happen to any type of object. It may help the reader if you name the variables according to such roles. </p> <p> In this article, I'll show you several examples. I hope these examples are so plentiful and varied that they can inspire you to come up with good names. </p> <h3 id="895798815b414b368058ed8640236ff9"> A variable introduced only to be named <a href="#895798815b414b368058ed8640236ff9" title="permalink">#</a> </h3> <p> In a <a href="/2020/11/09/checking-signed-urls-with-aspnet">recent article</a> I showed this code snippet: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;SignatureIsValid(<span style="color:blue;">string</span>&nbsp;candidate,&nbsp;<span style="color:#2b91af;">ActionExecutingContext</span>&nbsp;context) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sig&nbsp;=&nbsp;context.HttpContext.Request.Query[<span style="color:#a31515;">&quot;sig&quot;</span>]; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;receivedSignature&nbsp;=&nbsp;<span style="color:#2b91af;">Convert</span>.FromBase64String(sig.ToString()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;hmac&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">HMACSHA256</span>(urlSigningKey); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;computedSignature&nbsp;=&nbsp;hmac.ComputeHash(<span style="color:#2b91af;">Encoding</span>.ASCII.GetBytes(candidate)); &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">var</span>&nbsp;signaturesMatch&nbsp;=&nbsp;computedSignature.SequenceEqual(receivedSignature); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;signaturesMatch; }</pre> </p> <p> Did you wonder about the <code>signaturesMatch</code> variable? Why didn't I just return the result of <code>SequenceEqual</code>, like the following? </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;SignatureIsValid(<span style="color:blue;">string</span>&nbsp;candidate,&nbsp;<span style="color:#2b91af;">ActionExecutingContext</span>&nbsp;context) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sig&nbsp;=&nbsp;context.HttpContext.Request.Query[<span style="color:#a31515;">&quot;sig&quot;</span>]; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;receivedSignature&nbsp;=&nbsp;<span style="color:#2b91af;">Convert</span>.FromBase64String(sig.ToString()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;hmac&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">HMACSHA256</span>(urlSigningKey); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;computedSignature&nbsp;=&nbsp;hmac.ComputeHash(<span style="color:#2b91af;">Encoding</span>.ASCII.GetBytes(candidate)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;computedSignature.SequenceEqual(receivedSignature); }</pre> </p> <p> Visual Studio even offers this as a possible refactoring that it'll do for you. </p> <p> The inclusion of the <code>signaturesMatch</code> variable was a conscious decision of mine. I felt that directly returning the result of <code>SequenceEqual</code> was a bit too implicit. 
It forces readers to make the inference themselves: <em>Ah, the two arrays contain the same sequence of bytes; that must mean that the signatures match!</em> </p> <p> Instead of asking readers to do that work themselves, I decided to do it for them. I hope that it improves readability. It doesn't change the behaviour of the code one bit. </p> <h3 id="e5f18d0e31264b29bf11cd817b8c7bfa"> Test roles <a href="#e5f18d0e31264b29bf11cd817b8c7bfa" title="permalink">#</a> </h3> <p> When it comes to unit testing, there's plenty of inconsistent terminology. One man's <em>mock object</em> is another woman's <em>test double</em>. Most of the jargon isn't even internally consistent. Do yourself a favour and adopt a consistent pattern language. I use the one presented in <a href="http://bit.ly/xunitpatterns">xUnit Test Patterns</a>. </p> <p> For instance, the thing that you're testing is the System Under Test (SUT). This can be a <a href="https://en.wikipedia.org/wiki/Pure_function">pure function</a> or a static method, but when it's an object, you're going to create a variable. Consider <a href="https://docs.microsoft.com/en-us/archive/blogs/ploeh/naming-sut-test-variables">naming it <em>sut</em></a>. A typical test also defines other variables. Naming one of them <code>sut</code> clearly identifies which of them is the SUT. It also protects the tests against the class in question being renamed. 
</p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;ScheduleSingleReservationCommunalTable() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;table&nbsp;=&nbsp;<span style="color:#2b91af;">Table</span>.Communal(12); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sut&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">MaitreD</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">TimeSpan</span>.FromHours(18), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">TimeSpan</span>.FromHours(21), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">TimeSpan</span>.FromHours(6), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;table); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;r&nbsp;=&nbsp;<span style="color:#2b91af;">Some</span>.Reservation; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;sut.Schedule(<span style="color:blue;">new</span>[]&nbsp;{&nbsp;r&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;expected&nbsp;=&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">TimeSlot</span>(r.At,&nbsp;table.Reserve(r))&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(expected,&nbsp;actual); }</pre> </p> <p> The above test follows <a href="/2013/06/24/a-heuristic-for-formatting-code-according-to-the-aaa-pattern">my AAA formatting heuristic</a>. In all, it defines five variables, but there can be little doubt about which one is the <code>sut</code>. </p> <p> The <code>table</code> and <code>r</code> variables follow the mainstream practice of naming variables after their type. They play no special role, so that's okay. 
You may balk at such a short variable name as <code>r</code>, and that's okay. In my defence, I follow <a href="http://amzn.to/XCJi9X">Clean Code</a>'s <em>N5</em> heuristic for long and short scopes. A variable name like <code>r</code> is fine when it only spans three lines of code (four, if you also count the blank line). </p> <p> Consider also using the variable names <code>expected</code> and <code>actual</code>, as in the above example. In many unit testing frameworks, those are the argument names for the assertion. For instance, in <a href="https://xunit.net">xUnit.net</a> (which the above test uses) the <code>Assert.Equal</code> overloads are defined as <code>Equal&lt;T&gt;(T expected, T actual)</code>. Using these names for variables makes the roles clearer, I think. </p> <h3 id="bb01b7e1d37f469e8a81f288b513248a"> The other <a href="#bb01b7e1d37f469e8a81f288b513248a" title="permalink">#</a> </h3> <p> The above assertion relies on structural equality. The <code>TimeSlot</code> class is immutable, so it can safely override <code>Equals</code> (and <code>GetHashCode</code>) to implement structural equality: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;Equals(<span style="color:blue;">object</span>?&nbsp;obj) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;obj&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:#2b91af;">TimeSlot</span>&nbsp;other&nbsp;&amp;&amp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;At&nbsp;==&nbsp;other.At&nbsp;&amp;&amp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Tables.SequenceEqual(other.Tables); }</pre> </p> <p> I usually call the downcast variable <code>other</code> because, from the perspective of the instance, it's the other object. I usually use that convention whenever an instance interacts with another object of the same type. 
Among other examples, this happens when you model objects as <a href="/2017/11/27/semigroups">semigroups</a> and <a href="/2017/10/06/monoids">monoids</a>. The <a href="/2018/07/16/angular-addition-monoid">Angle struct, for example, defines this binary operation</a>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Angle</span>&nbsp;Add(<span style="color:#2b91af;">Angle</span>&nbsp;other) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Angle</span>(<span style="color:blue;">this</span>.degrees&nbsp;+&nbsp;other.degrees); }</pre> </p> <p> Again, the method argument is in the role as the other object, so naming it <code>other</code> seems natural. </p> <p> Here's another example from a restaurant reservation code base: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;Overlaps(<span style="color:#2b91af;">Seating</span>&nbsp;other) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(other&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(other)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Start&nbsp;&lt;&nbsp;other.End&nbsp;&amp;&amp;&nbsp;other.Start&nbsp;&lt;&nbsp;End; }</pre> </p> <p> The <code>Overlaps</code> method is an instance method on the <code>Seating</code> class. Again, <code>other</code> seems natural. </p> <h3 id="8cf3583ad0ac446fa4b4dee2272ba054"> Candidates <a href="#8cf3583ad0ac446fa4b4dee2272ba054" title="permalink">#</a> </h3> <p> The <code>Overlaps</code> method looks like a <em>predicate</em>, i.e. a function that returns a Boolean value. 
In the case of that method, <code>other</code> indicates the role of being the other object, but it also plays another role. It makes sense to me to call predicate input <em>candidates</em>. Typically, you have some input that you want to evaluate as either true or false. I think it makes sense to think of such a parameter as a 'truth candidate'. You can see one example of that in the above <code>SignatureIsValid</code> method. </p> <p> There, the <code>string</code> parameter is a <code>candidate</code> for having a valid signature. </p> <p> Here's another restaurant-related example: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;WillAccept( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">DateTime</span>&nbsp;now, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Reservation</span>&gt;&nbsp;existingReservations, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;candidate) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(existingReservations&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(existingReservations)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(candidate&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(candidate)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(candidate.At&nbsp;&lt;&nbsp;now) 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(IsOutsideOfOpeningHours(candidate)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;seating&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Seating</span>(SeatingDuration,&nbsp;candidate.At); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;relevantReservations&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;existingReservations.Where(seating.Overlaps); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;availableTables&nbsp;=&nbsp;Allocate(relevantReservations); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;availableTables.Any(t&nbsp;=&gt;&nbsp;t.Fits(candidate.Quantity)); }</pre> </p> <p> Here, the reservation in question is actually not yet a reservation. It might be rejected, so it's a <code>candidate</code> reservation. </p> <p> You can also use that name in <code>TryParse</code> methods, as shown in <a href="/2019/12/09/put-cyclomatic-complexity-to-good-use">this article</a>. </p> <h3 id="4301289931dc4f94abf46578ddcfd693"> Data Transfer Objects <a href="#4301289931dc4f94abf46578ddcfd693" title="permalink">#</a> </h3> <p> Another name that I like to use is <code>dto</code> for <a href="https://en.wikipedia.org/wiki/Data_transfer_object">Data Transfer Objects</a> (DTOs). 
The benefit here is that as long as <code>dto</code> is unambiguous in context, it makes it easier to distinguish between a DTO and the domain model you might want to turn it into: </p> <p> <pre>[<span style="color:#2b91af;">HttpPost</span>(<span style="color:#a31515;">&quot;restaurants/{restaurantId}/reservations&quot;</span>)] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&lt;<span style="color:#2b91af;">ActionResult</span>&gt;&nbsp;Post(<span style="color:blue;">int</span>&nbsp;restaurantId,&nbsp;<span style="color:#2b91af;">ReservationDto</span>&nbsp;dto) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(dto&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(dto)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;id&nbsp;=&nbsp;dto.ParseId()&nbsp;??&nbsp;<span style="color:#2b91af;">Guid</span>.NewGuid(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Reservation</span>?&nbsp;reservation&nbsp;=&nbsp;dto.Validate(id); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(reservation&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">BadRequestResult</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;restaurant&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;RestaurantDatabase.GetRestaurant(restaurantId); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(restaurant&nbsp;<span style="color:blue;">is</span>&nbsp;<span 
style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">NotFoundResult</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">await</span>&nbsp;TryCreate(restaurant,&nbsp;reservation); }</pre> </p> <p> By naming the input parameter <code>dto</code>, I keep the name <code>reservation</code> free for the domain object, which ought to be the more important object of the two. <blockquote> <p> "A Data Transfer Object is one of those objects our mothers told us never to write." </p> <footer><cite><a href="http://bit.ly/patternsofeaa">Martin Fowler</a></cite></footer> </blockquote> I could have named the input parameter <code>reservationDto</code> instead of <code>dto</code>, but that would diminish the 'mental distance' between <code>reservationDto</code> and <code>reservation</code>. I like to keep that distance, so that the roles are more explicit. </p> <h3 id="b8ece382f7554141aa741fc9ded9e02a"> Time <a href="#b8ece382f7554141aa741fc9ded9e02a" title="permalink">#</a> </h3> <p> You often need to make decisions based on the current time or date. In .NET the return value from <a href="https://docs.microsoft.com/dotnet/api/system.datetime.now">DateTime.Now</a> is a <code>DateTime</code> value. Typical variable names are <code>dateTime</code>, <code>date</code>, <code>time</code>, or <code>dt</code>, but why not call it <code>now</code>? 
</p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&lt;<span style="color:#2b91af;">ActionResult</span>&gt;&nbsp;TryCreate(<span style="color:#2b91af;">Restaurant</span>&nbsp;restaurant,&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;reservation) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;scope&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">TransactionScope</span>(<span style="color:#2b91af;">TransactionScopeAsyncFlowOption</span>.Enabled); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;reservations&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;Repository.ReadReservations(restaurant.Id,&nbsp;reservation.At); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;now&nbsp;=&nbsp;Clock.GetCurrentDateTime(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(!restaurant.MaitreD.WillAccept(now,&nbsp;reservations,&nbsp;reservation)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;NoTables500InternalServerError(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;Repository.Create(restaurant.Id,&nbsp;reservation).ConfigureAwait(<span style="color:blue;">false</span>); &nbsp;&nbsp;&nbsp;&nbsp;scope.Complete(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Reservation201Created(restaurant.Id,&nbsp;reservation); }</pre> </p> <p> This is the <code>TryCreate</code> method called by the above <code>Post</code> method. Here, <code>DateTime.Now</code> is hidden behind <code>Clock.GetCurrentDateTime()</code> in order to make <a href="/2020/03/23/repeatable-execution">execution repeatable</a>, but the idea remains: the variable represents the current time or date, or, with a bit of good will, <code>now</code>. 
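</p> <p> I haven't shown what sits behind <code>Clock</code> here. As a minimal sketch, assuming a hypothetical <code>IClock</code> interface with a single <code>GetCurrentDateTime</code> method, it could look something like the following. The interface name, the body of <code>SystemClock</code>, and the <code>ConstantClock</code> class are illustrative assumptions, not necessarily what the actual code base contains: </p> <p> <pre>using System;

// Hypothetical abstraction over the system clock (name assumed for illustration).
public interface IClock
{
    DateTime GetCurrentDateTime();
}

// Assumed production implementation: delegates to the real system clock.
public sealed class SystemClock : IClock
{
    public DateTime GetCurrentDateTime()
    {
        return DateTime.Now;
    }
}

// Hypothetical test implementation: always returns the value it was created with.
public sealed class ConstantClock : IClock
{
    private readonly DateTime now;

    public ConstantClock(DateTime now)
    {
        this.now = now;
    }

    public DateTime GetCurrentDateTime()
    {
        return now;
    }
}</pre> </p> <p> With something like <code>ConstantClock</code> injected, <code>now</code> has the same value on every run, which keeps a decision like <code>WillAccept</code> deterministic under test. 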
</p> <p> Notice that the <code>WillAccept</code> method (shown above) also uses <code>now</code> as a parameter name. That value's role is to represent <code>now</code> as a concept. </p> <p> When working with time, I also sometimes use the variable names <code>before</code> and <code>after</code>. This is mostly useful in integration tests: </p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;GetCurrentYear() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;api&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">LegacyApi</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;before&nbsp;=&nbsp;<span style="color:#2b91af;">DateTime</span>.Now; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;response&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;api.GetCurrentYear(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;after&nbsp;=&nbsp;<span style="color:#2b91af;">DateTime</span>.Now; &nbsp;&nbsp;&nbsp;&nbsp;response.EnsureSuccessStatusCode(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;response.ParseJsonContent&lt;<span style="color:#2b91af;">CalendarDto</span>&gt;(); &nbsp;&nbsp;&nbsp;&nbsp;AssertOneOf(before.Year,&nbsp;after.Year,&nbsp;actual.Year); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Null(actual.Month); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Null(actual.Day); &nbsp;&nbsp;&nbsp;&nbsp;AssertLinks(actual); }</pre> </p> <p> While you can inject something like a <code>Clock</code> dependency in order to make your SUT deterministic, in integration tests you might want to see behaviour when using the system clock. 
You can often verify such behaviour by surrounding the test's <em>Act</em> phase with two calls to <code>DateTime.Now</code>. This gives you the time <code>before</code> and <code>after</code> the test exercised the SUT. </p> <p> When you do that, however, be careful with the assertions. If such a test runs at midnight, <code>before</code> and <code>after</code> might be two different dates. If it runs at midnight on December 31, it might actually be two different years! That's the reason that the test passes as long as <code>actual.Year</code> is either <code>before.Year</code> or <code>after.Year</code>. </p> <h3 id="8183f4e53fdd4ceca701ef17a6509614"> Invalid values <a href="#8183f4e53fdd4ceca701ef17a6509614" title="permalink">#</a> </h3> <p> While integration tests often test happy paths, unit tests should also exercise error paths. What happens when you supply invalid input to a method? When you write such tests, you can identify the invalid values by naming the variables or parameters accordingly: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(<span style="color:blue;">null</span>)] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;&quot;</span>)] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;bas&quot;</span>)] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;PutInvalidId(<span style="color:blue;">string</span>&nbsp;invalidId) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;db&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">FakeDatabase</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sut&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationsController</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">SystemClock</span>(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">InMemoryRestaurantDatabase</span>(<span style="color:#2b91af;">Some</span>.Restaurant), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;db); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;dummyDto&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationDto</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;At&nbsp;=&nbsp;<span style="color:#a31515;">&quot;2024-06-25&nbsp;18:19&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Email&nbsp;=&nbsp;<span style="color:#a31515;">&quot;colera@example.com&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Name&nbsp;=&nbsp;<span style="color:#a31515;">&quot;Cole&nbsp;Aera&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;2 &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;sut.Put(invalidId,&nbsp;dummyDto); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.IsAssignableFrom&lt;<span style="color:#2b91af;">NotFoundResult</span>&gt;(actual); }</pre> </p> <p> Here, the invalid input represents an ID. To indicate that, I called the parameter <code>invalidId</code>. </p> <p> The system under test is the <code>Put</code> method, which takes two arguments: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Task</span>&lt;<span style="color:#2b91af;">ActionResult</span>&gt;&nbsp;Put(<span style="color:blue;">string</span>&nbsp;id,&nbsp;<span style="color:#2b91af;">ReservationDto</span>&nbsp;dto)</pre> </p> <p> When testing an error path, it's important to keep other arguments well-behaved.
In this example, I want to make sure that it's the <code>invalidId</code> that causes the <code>NotFoundResult</code> result. Thus, the <code>dto</code> argument should be as well-behaved as possible, so that it isn't going to be the source of divergence. </p> <p> Apart from being well-behaved, that object plays no role in the test. It just needs to be there to make the code compile. <em>xUnit Test Patterns</em> calls such an object a <em>Dummy Object</em>, so I named the variable <code>dummyDto</code> as information to any reader familiar with that pattern language. </p> <h3 id="441f47fab8b74787840c0237b3958316"> Derived class names <a href="#441f47fab8b74787840c0237b3958316" title="permalink">#</a> </h3> <p> The thrust of all of these examples is that you don't <em>have</em> to name variables after their types. You can extend this line of reasoning to class inheritance. Just because a base class is called <code>Foo</code>, it doesn't mean that you <em>have</em> to call a derived class <code>SomethingFoo</code>. </p> <p> This is something of which I have to remind myself. For example, to support integration testing with ASP.NET you'll need a <a href="https://docs.microsoft.com/dotnet/api/microsoft.aspnetcore.mvc.testing.webapplicationfactory-1">WebApplicationFactory&lt;TEntryPoint&gt;</a>. To override the default DI Container configuration, you'll have to derive from this class and override its <code>ConfigureWebHost</code> method. In <a href="/2020/04/20/unit-bias-against-collections">an example I've previously published</a> I didn't spend much time thinking about the class name, so <code>RestaurantApiFactory</code> it was. </p> <p> At first, I named the variables of this type <code>factory</code>, or something equally devoid of information. That bothered me, so I instead tried <code>service</code>, which I felt was an improvement, but still too vapid. I then adopted <code>api</code> as a variable name, but then realised that that also suggested a better class name.
So currently, this defines my self-hosting API: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">SelfHostedApi</span>&nbsp;:&nbsp;<span style="color:#2b91af;">WebApplicationFactory</span>&lt;<span style="color:#2b91af;">Startup</span>&gt;</pre> </p> <p> Here's how I use it: </p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;ReserveTableAtNono() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;api&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">SelfHostedApi</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;client&nbsp;=&nbsp;api.CreateClient(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;dto&nbsp;=&nbsp;<span style="color:#2b91af;">Some</span>.Reservation.ToDto(); &nbsp;&nbsp;&nbsp;&nbsp;dto.Quantity&nbsp;=&nbsp;6; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;response&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;client.PostReservation(<span style="color:#a31515;">&quot;Nono&quot;</span>,&nbsp;dto); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;at&nbsp;=&nbsp;<span style="color:#2b91af;">Some</span>.Reservation.At; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;AssertRemainingCapacity(client,&nbsp;at,&nbsp;<span style="color:#a31515;">&quot;Nono&quot;</span>,&nbsp;4); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;AssertRemainingCapacity(client,&nbsp;at,&nbsp;<span style="color:#a31515;">&quot;Hipgnosta&quot;</span>,&nbsp;10); }</pre> </p> <p> The variable is just called <code>api</code>, but the reader can tell from the initialisation that this is an instance of the 
<code>SelfHostedApi</code> class. I like how that communicates that this is an integration test that uses a self-hosted API. It literally says that. </p> <p> This test also uses the <code>dto</code> naming convention. Additionally, you may take note of the variable and property called <code>at</code>. That's another name for a date and time. I struggled with naming this value, until <a href="http://blog.strobaek.org">Karsten Strøbæk</a> suggested that I use the simple word <em>at</em>: <code>reservation.At</code> indicates the date and time of the reservation without being encumbered by awkward details about date and time. Should we call it <code>date</code>? <code>time</code>? <code>dateTime</code>? No, just call it <code>at</code>. I find it elegant. </p> <h3 id="6692436d37dd491fa9920a5f4ac63118"> Conclusion <a href="#6692436d37dd491fa9920a5f4ac63118" title="permalink">#</a> </h3> <p> Sometimes, a <code>Reservation</code> object is just a <code>reservation</code>, and that's okay. At other times, it's the <code>actual</code> value, or the <code>expected</code> value. If it represents an invalid reservation in a test case, it makes sense to call the variable <code>invalidReservation</code>. </p> <p> Giving variables descriptive names improves <a href="/2019/03/04/code-quality-is-not-software-quality">code quality</a>. You don't have to write <a href="http://butunclebob.com/ArticleS.TimOttinger.ApologizeIncode">comments as apologies for poor readability</a> if a better name communicates what the comment would have said. </p> <p> Consider naming variables (and classes) for the <em>roles</em> they play, rather than their types. </p> <p> On the other hand, <a href="/2016/10/25/when-variable-names-are-in-the-way">when variable names are in the way</a>, consider <a href="https://en.wikipedia.org/wiki/Tacit_programming">point-free code</a>.
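</p> <p> In C#, the nearest everyday counterpart to point-free style is arguably the method group: you pass a method by name instead of wrapping it in a lambda expression, so there's one less variable to name. The following is only a minimal sketch of that idea. The <code>PointFreeExample</code> class and its <code>Normalize</code> helper are invented for this illustration and aren't part of the restaurant code base. </p> <p> <pre>using System.Collections.Generic;
using System.Linq;

public static class PointFreeExample
{
    // Hypothetical helper, invented for this illustration.
    public static string Normalize(string email)
    {
        return email.Trim().ToLowerInvariant();
    }

    public static IReadOnlyList&lt;string&gt; WithLambda(IEnumerable&lt;string&gt; emails)
    {
        // The lambda parameter needs a name, even though the name adds no information.
        return emails.Select(e =&gt; Normalize(e)).ToList();
    }

    public static IReadOnlyList&lt;string&gt; WithMethodGroup(IEnumerable&lt;string&gt; emails)
    {
        // Passing the method group means that there's no intermediate value to name.
        return emails.Select(Normalize).ToList();
    }
}</pre> </p> <p> Both methods return the same result; the second one simply leaves you with one name fewer to choose.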
</p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="8d7dc043829244509b1a13c184f3cbbf"> <div class="comment-author"><a href="https://about.me/tysonwilliams">Tyson Williams</a></div> <div class="comment-content"> <p> Excellent name suggestions. Thanks for sharing them :) </p> <blockquote> I usually call the downcast variable <code>other</code> because, from the perspective of the instance, it's the other object. I usually use that convention whenever an instance interacts with another object of the same type. </blockquote> <p> The name <code>other</code> is good, but I prefer <code>that</code> because I think it is a better antonym of <code>this</code> (the keyword for the current instance) and because it has the same number of letters as <code>this</code>. </p> <blockquote> <p> It makes sense to me to call predicate input <em>candidates</em>. Typically, you have some input that you want to evaluate as either true or false. I think it makes sense to think of such a parameter as a 'truth candidate'. You can see one example of that in the above <code>SignatureIsValid</code> method. </p> <p>...</p> <p> Here, the reservation in question is actually not yet a reservation. It might be rejected, so it's a <code>candidate</code> reservation. </p> </blockquote> <p> I typically try to avoid turning some input into either true or false. In particular, I find it confusing for the syntax to say that some instance is a <code>Reservation</code> while the semantics says that it "is actually not yet a reservation". I think of this as an example of <a href="https://blog.ploeh.dk/2015/01/19/from-primitive-obsession-to-domain-modelling/">primitive obsession</a>. 
Strictly speaking, I think <a href="https://medium.com/the-sixt-india-blog/primitive-obsession-code-smell-that-hurt-people-the-most-5cbdd70496e9#5009:~:text=Primitive%20Obsession%20is%20when%20the%20code%20relies%20too%20much%20on%20primitives.">"Primitive Obsession is when the code relies too much on primitives."</a> (aka, on primitive types). In my mind though, I have generalized this to cover any code that relies too much on weaker types. Separate types <code>Reservation</code> and <code>bool</code> are weaker than separate types <code>Reservation</code> and <code>CandidateReservation</code>. I think Alexis King summarized this well with a blog post titled <a href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/">Parse, don’t validate</a>. </p> <p> And yet, my coworkers and I have engaged in friendly but serious debates for years about which of those two approaches is better. My argument, essentially as given above, is for separate types <code>Reservation</code> and <code>CandidateReservation</code>. The main counterargument is that these types are the same except for a database-generated ID, so just represent both using one type with an optional ID. </p> <p> Have you thought about this before? </p> <blockquote> <p> By naming the input parameter <code>dto</code>, I keep the name <code>reservation</code> free for the domain object, which ought to be the more important object of the two. </p> <p>...</p> <p> I could have named the input parameter <code>reservationDto</code> instead of <code>dto</code>, but that would diminish the 'mental distance' between <code>reservationDto</code> and <code>reservation</code>. I like to keep that distance, so that the roles are more explicit. </p> </blockquote> <p> I prefer to emphasize the roles even more by using the names <code>dto</code> and <code>model</code>. 
We are in the implementation of the (Post) route for <code>"restaurants/{restaurantId}/reservations"</code>, so I think it is clear from context that the <code>dto</code> and <code>model</code> are really a reservation DTO and a reservation model. </p> </div> <div class="comment-date">2020-11-30 20:46 UTC</div> </div> <div class="comment" id="931a54d51d3d445c9eaf6a41d8923406"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tyson, thank you for writing. Certainly, I didn't intend my article to dictate names. As you imply, there's room for both creativity and subjectivity, and that's fine. My suggestions were meant only for inspiration. <blockquote> <p> The main counterargument is that these types are the same except for a database-generated ID, so just represent both using one type with an optional ID. </p> <p> Have you thought about this before? </p> </blockquote> Yes; I would <a href="/2014/08/11/cqs-versus-server-generated-ids">think twice before deciding to model a domain type with a database-generated ID</a>. A server-generated ID is an implementation detail that shouldn't escape the data access layer. If it does, you have a leaky abstraction at hand. Sooner or later, it's going to bite you. </p> <p> The <code>Reservation</code> class in the above examples has this sole constructor: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Reservation</span>(<span style="color:#2b91af;">Guid</span>&nbsp;id,&nbsp;<span style="color:#2b91af;">DateTime</span>&nbsp;at,&nbsp;<span style="color:#2b91af;">Email</span>&nbsp;email,&nbsp;<span style="color:#2b91af;">Name</span>&nbsp;name,&nbsp;<span style="color:blue;">int</span>&nbsp;quantity)</pre> </p> <p> You can't create an instance without supplying an ID. On the other hand, any code can conjure up a GUID, so no server is required.
At the type level, there's no compelling reason to distinguish between a reservation and a candidate reservation. </p> <p> Granted, you <em>could</em> define two types, <code>Reservation</code> and <code>CandidateReservation</code>, but they'd be isomorphic. In Haskell, you'd probably use a <code>newtype</code> for one of these types, and then you're <a href="https://lexi-lambda.github.io/blog/2020/11/01/names-are-not-type-safety">back at Alexis King's blog</a>. </p> </div> <div class="comment-date">2020-12-02 7:43 UTC</div> </div> <div class="comment" id="fe3ac3a60c754340801b877666a07b65"> <div class="comment-author"><a href="https://ttulka.com">Tomas Tulka</a></div> <div class="comment-content"> <blockquote>...naming is one of the hardest problems in software development. Perhaps it's hard because you have to do it so frequently.</blockquote> <p>Usually, doing things frequently means mastering them pretty quickly. Not so for naming. I guess, there are multiple issues:</p> <ol> <li>Words are ambiguous. The key is, not to do naming in isolation, the context matters. For example, it's difficult to come up with a good name for a method when we don't have a good name for its class, the whole component, etc. Similar with Clean Code's N5: the meaning of a short variable is clear in a small scope, closed context.</li> <li>Good naming requires deep understanding of the domain. Developers are usually not good at the business they model. Sadly, it often means "necessary evil" for them.</li> </ol> <p>Naming variables by their roles is a great idea!</p> <p>Many thanks for another awesome post, I enjoyed reading it.</p> </div> <div class="comment-date">2020-12-11 09:11 UTC</div> </div> <div class="comment" id="e9299b251c8e4b7f9a168ff571f70950"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tomas, thank you for writing. <blockquote> <p> doing things frequently means mastering them pretty quickly. Not so for naming.
I guess </p> </blockquote> Good point; I hadn't thought about that. I think that the reasons you list are valid. </p> <p> As an additional observation, it may be that there's a connection to the notion of <em>deliberate practice</em>. As the catch-phrase about professional experience puts it, there's a difference between 20 years of experience and one year of experience repeated 20 times. </p> <p> Doing a thing again and again generates little improvement if one does it by rote. One has to deliberately practice. In this case, it implies that a programmer should explicitly reflect on variable names, and consider more than one option. </p> <p> I haven't met many software developers who do that. </p> </div> <div class="comment-date">2020-12-15 9:19 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Mark Seemann https://blog.ploeh.dk/2020/11/30/name-by-role Good names are skin-deep https://blog.ploeh.dk/2020/11/23/good-names-are-skin-deep/ Mon, 23 Nov 2020 06:33:00 UTC <div id="post"> <p> <em>Good names are important, but insufficient, for code maintainability.</em> </p> <p> You should give the building blocks of your code bases descriptive names. It's easier to understand the purpose of a library, module, class, method, function, etcetera if the name contains a clue about the artefact's purpose. This is hardly controversial, and while naming is hard, most teams I visit agree that names are important. </p> <p> Still, despite good intentions and efforts to name things well, code bases deteriorate into unmaintainable clutter. </p> <p> Clearly, good names aren't enough. </p> <h3 id="5a20f600298844e98c89c423f5c66c5f"> Tenuousness of names <a href="#5a20f600298844e98c89c423f5c66c5f" title="permalink">#</a> </h3> <p> A good name is tenuous. First, naming is hard, so while you may have spent some effort coming up with a good name, other people may misinterpret it. 
Because they originate from natural language, names are as ambiguous as language. (<a href="/2018/07/02/terse-operators-make-business-code-more-readable">Terse operators, on the other hand...</a>) </p> <p> Another maintainability problem with names is that implementation may change over time, but the names remain constant. Granted, modern IDEs make it easy to rename methods, but developers rarely adjust names when they adjust behaviour. Even the best names may become misleading over time. </p> <p> These weaknesses aren't the worst, though. In my experience, a more fundamental problem is that all it takes is one badly named 'wrapper object' before the information in a good name is lost. </p> <p> <img src="/content/binary/vague-names-hiding-clear-names.png" alt="Object with clear names enclosed in object with vague names."> </p> <p> In the figure, the inner object is well-named. It has a clear name and descriptive method names. All it takes before this information is lost, however, is another object with vague names to 'encapsulate' it. </p> <h3 id="c0cfcf2d16a94e4c96c33c9d0359846f"> An attempt at a descriptive method name <a href="#c0cfcf2d16a94e4c96c33c9d0359846f" title="permalink">#</a> </h3> <p> Here's an example. Imagine an online restaurant reservation system. One of the features of this system is to take reservations and save them in the database. </p> <p> A restaurant, however, is a finite resource. It can only accommodate a certain number of guests at the same time. Whenever the system receives a reservation request, it'll have to retrieve the existing reservations for that time and make a decision. <a href="/2020/01/27/the-maitre-d-kata">Can it accept the reservation?</a> Only if it can should it save the reservation. </p> <p> How do you model such an interaction? How about a descriptive name? How about <code>TrySave</code>?
Here's a possible implementation: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;TrySave(<span style="color:#2b91af;">Reservation</span>&nbsp;reservation) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(reservation&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(reservation)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;reservations&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;Repository &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.ReadReservations( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;reservation.At, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;reservation.At&nbsp;+&nbsp;SeatingDuration) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.ConfigureAwait(<span style="color:blue;">false</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;availableTables&nbsp;=&nbsp;Allocate(reservations); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(!availableTables.Any(t&nbsp;=&gt;&nbsp;reservation.Quantity&nbsp;&lt;=&nbsp;t.Seats)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;Repository.Create(reservation).ConfigureAwait(<span style="color:blue;">false</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">true</span>; }</pre> </p> <p> There's an implicit naming convention in .NET that methods with the 
<code>Try</code> prefix indicate an operation that may or may not succeed. The return value of such methods is either <code>true</code> or <code>false</code>, and they may also have <code>out</code> parameters if they optionally produce a value. That's not the case here, but I think one could make the case that <code>TrySave</code> succinctly describes what's going on. </p> <p> All is good, then? </p> <h3 id="bbf9fd5509ba4a2c823483acd40fbe22"> A vague wrapper <a href="#bbf9fd5509ba4a2c823483acd40fbe22" title="permalink">#</a> </h3> <p> After our conscientious programmer meticulously designed and named the above <code>TrySave</code> method, it turns out that it doesn't meet all requirements. Users of the system file a bug: the system accepts reservations outside the restaurant's opening hours. </p> <p> The original programmer has moved on to greener pastures, so fixing the bug falls on a poor maintenance developer with too much to do. Having recently learned about the <a href="https://en.wikipedia.org/wiki/Open%E2%80%93closed_principle">open-closed principle</a>, our new protagonist decides to wrap the existing <code>TrySave</code> in a new method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;Check(<span style="color:#2b91af;">Reservation</span>&nbsp;reservation) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(reservation&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(reservation)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(reservation.At&nbsp;&lt;&nbsp;<span style="color:#2b91af;">DateTime</span>.Now) 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(reservation.At.TimeOfDay&nbsp;&lt;&nbsp;OpensAt) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(LastSeating&nbsp;&lt;&nbsp;reservation.At.TimeOfDay) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">await</span>&nbsp;Manager.TrySave(reservation).ConfigureAwait(<span style="color:blue;">false</span>); }</pre> </p> <p> This new method first checks whether the <code>reservation</code> is within opening hours and in the future. If that's not the case, it returns <code>false</code>. Only if these preconditions are fulfilled does it delegate the decision to that <code>TrySave</code> method. </p> <p> Notice, however, the name. The bug was urgent, and our poor programmer didn't have time to think of a good name, so <code>Check</code> it is. </p> <h3 id="6056d14ac9ab4046b4aa417d3902fbf1"> Caller's perspective <a href="#6056d14ac9ab4046b4aa417d3902fbf1" title="permalink">#</a> </h3> <p> How does this look from the perspective of calling code? 
Here's the Controller action that handles the pertinent HTTP request: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&lt;<span style="color:#2b91af;">ActionResult</span>&gt;&nbsp;Post(<span style="color:#2b91af;">ReservationDto</span>&nbsp;dto) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(dto&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(dto)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Reservation</span>?&nbsp;r&nbsp;=&nbsp;dto.Validate(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(r&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">BadRequestResult</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;isOk&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;Manager.Check(r).ConfigureAwait(<span style="color:blue;">false</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(!isOk) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">StatusCodeResult</span>(<span style="color:#2b91af;">StatusCodes</span>.Status500InternalServerError); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">NoContentResult</span>(); }</pre> </p> <p> Try to forget the code you've just seen and imagine that you're looking at this code 
first. You'd be excused if you miss what's going on. It looks as though the method just does a bit of validation and then <em>checks</em> 'something' concerning the reservation. </p> <p> There's no hint that the <code>Check</code> method might perform the significant side effect of saving the reservation in the database. </p> <p> You'll only learn that if you <em>read</em> the implementation details of <code>Check</code>. As I argue in my <a href="https://cleancoders.com/episode/humane-code-real-episode-1">Humane Code video</a>, <em>if you have to read the source code of an object, encapsulation is broken.</em> </p> <p> Such code doesn't fit in your brain. You'll struggle as you try to keep track of all the things that are going on in the code, all the way from the outer boundary of the application to implementation details that relate to databases, third-party services, etcetera. </p> <h3 id="59249ae122b540ca907549ade3eca649"> Straw man? <a href="#59249ae122b540ca907549ade3eca649" title="permalink">#</a> </h3> <p> You may think that this is a straw man argument. After all, wouldn't it be better to edit the original <code>TrySave</code> method? </p> <p> Perhaps, but it would make that class more complex. The <code>TrySave</code> method has a <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">cyclomatic complexity</a> of only <em>3</em>, while the <code>Check</code> method has a complexity of <em>5</em>. Combining them might easily take them over some <a href="/2020/04/13/curb-code-rot-with-thresholds">threshold</a>. </p> <p> Additionally, these two classes have different dependencies. As the <code>TrySave</code> method implies, it relies on both <code>Repository</code> and <code>SeatingDuration</code>, and the <code>Allocate</code> helper method (not shown) uses a third dependency: the restaurant's table configuration. </p> <p> Likewise, the <code>Check</code> method relies on <code>OpensAt</code> and <code>LastSeating</code>.
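</p> <p> To make the two sets of dependencies concrete, here's a rough sketch of how the two classes' constructors might look, assuming plain Constructor Injection. Only <code>RestaurantManager</code>, <code>IReservationsManager</code>, and the member names used in the methods above come from the example; <code>ReservationsManager</code>, <code>IReservationsRepository</code>, <code>Table</code>, and <code>Tables</code> are placeholder names invented for this sketch, and the method bodies are left out. </p> <p> <pre>using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Placeholder types, so that the sketch compiles on its own.
public sealed class Reservation { }
public sealed class Table { public int Seats { get; set; } }
public interface IReservationsRepository { } // hypothetical name

public interface IReservationsManager
{
    Task&lt;bool&gt; TrySave(Reservation reservation);
}

// Hypothetical name for the class that hosts TrySave: three dependencies.
public sealed class ReservationsManager : IReservationsManager
{
    public ReservationsManager(
        IReservationsRepository repository,
        TimeSpan seatingDuration,
        IReadOnlyCollection&lt;Table&gt; tables)
    {
        Repository = repository;
        SeatingDuration = seatingDuration;
        Tables = tables;
    }

    public IReservationsRepository Repository { get; }
    public TimeSpan SeatingDuration { get; }
    public IReadOnlyCollection&lt;Table&gt; Tables { get; }

    public Task&lt;bool&gt; TrySave(Reservation reservation)
    {
        // Body deliberately omitted from this sketch.
        throw new NotImplementedException();
    }
}

// The class that hosts Check: another three dependencies.
public sealed class RestaurantManager
{
    public RestaurantManager(
        TimeSpan opensAt,
        TimeSpan lastSeating,
        IReservationsManager manager)
    {
        OpensAt = opensAt;
        LastSeating = lastSeating;
        Manager = manager;
    }

    public TimeSpan OpensAt { get; }
    public TimeSpan LastSeating { get; }
    public IReservationsManager Manager { get; }
}</pre> </p> <p> Under these assumptions, a single merged class would need a constructor with at least five parameters.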
If you find it better to edit the original <code>TrySave</code> method, you'd have to combine these dependencies as well. Each time you do that, the class grows until it becomes a <a href="https://en.wikipedia.org/wiki/God_object">God object</a>. </p> <p> It's rational to attempt to separate things in multiple classes. It also, on the surface, seems to make unit testing easier. For example, here's a test that verifies that the <code>Check</code> method rejects reservations before the restaurant's opening time: </p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;RejectReservationBeforeOpeningTime() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;r&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">DateTime</span>.Now.AddDays(10).Date.AddHours(17), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;colaera@example.com&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;Cole&nbsp;Aera&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;mgrTD&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Mock</span>&lt;<span style="color:#2b91af;">IReservationsManager</span>&gt;(); &nbsp;&nbsp;&nbsp;&nbsp;mgrTD.Setup(mgr&nbsp;=&gt;&nbsp;mgr.TrySave(r)).ReturnsAsync(<span style="color:blue;">true</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sut&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RestaurantManager</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">TimeSpan</span>.FromHours(18), 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">TimeSpan</span>.FromHours(21), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;mgrTD.Object); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;sut.Check(r); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.False(actual); }</pre> </p> <p> By replacing the <code>TrySave</code> method with a test double, you've ostensibly decoupled the <code>Check</code> method from all the complexity of the <code>TrySave</code> method. </p> <p> To be clear, this style of programming, with lots of nested interfaces and tests with <a href="/2013/10/23/mocks-for-commands-stubs-for-queries">mocks and stubs</a>, is far from ideal, but I still find it better than a <a href="https://en.wikipedia.org/wiki/Big_ball_of_mud">big ball of mud</a>. </p> <h3 id="33cb10390fa4417f96b24e4b9102d4ed"> Alternative <a href="#33cb10390fa4417f96b24e4b9102d4ed" title="permalink">#</a> </h3> <p> A better alternative is <a href="https://www.destroyallsoftware.com/screencasts/catalog/functional-core-imperative-shell">Functional Core, Imperative Shell</a>, AKA <a href="/2020/03/02/impureim-sandwich">impureim sandwich</a>. Move all impure actions to the edge of the system, leaving only <a href="https://en.wikipedia.org/wiki/Referential_transparency">referentially transparent</a> functions as the main implementers of logic.
It could look like this: </p> <p> <pre>[<span style="color:#2b91af;">HttpPost</span>]
<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&lt;<span style="color:#2b91af;">ActionResult</span>&gt;&nbsp;Post(<span style="color:#2b91af;">ReservationDto</span>&nbsp;dto)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(dto&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(dto));
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;id&nbsp;=&nbsp;dto.ParseId()&nbsp;??&nbsp;<span style="color:#2b91af;">Guid</span>.NewGuid();
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Reservation</span>?&nbsp;r&nbsp;=&nbsp;dto.Validate(id);
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(r&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">BadRequestResult</span>();
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;reservations&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;Repository.ReadReservations(r.At).ConfigureAwait(<span style="color:blue;">false</span>);
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(!MaitreD.WillAccept(<span style="color:#2b91af;">DateTime</span>.Now,&nbsp;reservations,&nbsp;r))
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;NoTables500InternalServerError();
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;Repository.Create(r).ConfigureAwait(<span style="color:blue;">false</span>);
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Reservation201Created(r);
}</pre> </p> <p> Nothing is swept under the rug here. <code>WillAccept</code> is a <a href="https://en.wikipedia.org/wiki/Pure_function">pure function</a>, and while it encapsulates significant complexity, the only thing you need to understand when you're trying to understand the above <code>Post</code> code is that it returns either <code>true</code> or <code>false</code>. </p> <p> Another advantage of pure functions is that they are <a href="/2015/05/07/functional-design-is-intrinsically-testable">intrinsically testable</a>. That makes unit testing and test-driven development easier. </p> <p> Even with a functional core, you'll also have an imperative shell. You can still test that, too, such as the <code>Post</code> method. It isn't referentially transparent, so you might be inclined to use mocks and stubs, but I instead recommend <a href="/2019/02/18/from-interaction-based-to-state-based-testing">state-based testing with a Fake database</a>. </p> <h3 id="6758361ac712400aae148a7dcc2a4a70"> Conclusion <a href="#6758361ac712400aae148a7dcc2a4a70" title="permalink">#</a> </h3> <p> Good names are important, but don't let good names, alone, lull you into a false sense of security. All it takes is one vaguely named wrapper object, and all the information in your meticulously named methods is lost. </p> <p> This is one of many reasons I try to design with static types instead of names. Not that I dismiss the value of good names. After all, you'll have to give your types good names as well. </p> <p> Types are more robust in the face of inadvertent changes; or, rather, they tend to resist when we try to do something stupid. I suppose that's what lovers of dynamically typed languages feel as 'friction'. In my mind, it's entirely opposite. Types keep me honest. </p> <p> Unfortunately, most type systems don't offer an adequate degree of safety.
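</p> <p> To make that concrete, here's a small hypothetical C# sketch. The <code>Dates</code> class and both of its methods are illustrative assumptions, not part of the restaurant code base. Both methods return <code>bool</code>, and the compiler treats them alike, even though only the first one is pure: </p> <p> <pre>using System;

public static class Dates
{
    // Pure: the result depends only on the two arguments.
    public static bool IsSameDay(DateTime now, DateTime at)
    {
        return now.Date == at.Date;
    }

    // Impure: reads the system clock, yet the signature looks just as innocent.
    public static bool IsToday(DateTime at)
    {
        return DateTime.Now.Date == at.Date;
    }
}</pre> </p> <p> Nothing in the C# type system prevents a later edit from turning the first method into something like the second. </p> <p>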
Even in <a href="https://fsharp.org">F#</a>, which has a great type system, you can introduce impure actions into what you thought was a pure function, and <a href="/2020/02/24/discerning-and-maintaining-purity">you'd be none the wiser</a>. That's one of the reasons I find <a href="https://www.haskell.org">Haskell</a> so interesting. Because of <a href="/2020/06/08/the-io-container">the way IO works</a>, you can't inadvertently sweep surprises under the rug. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="9ecb24dc5f78413687547c7f74f2d8b9"> <div class="comment-author">Johannes Schmitt</div> <div class="comment-content"> <p> I find the idea of the impure/pure/impure sandwich rather interesting and I agree with the benefits that it yields. However, I was wondering about where to move synchronization logic, i.e. the reservation system should avoid double bookings. With the initial TrySave approach it would be clear to me where to put this logic: the synchronization mechanism should be part of the TrySave method. With the impure/pure/impure sandwich, it will move out to the outermost layer (HTTP Controller) - at least this is how I'd see it. My feeling tells me that this is a bit smelly, but I can't really pinpoint why I think so. Can you give some advice on this? How would you solve that? </p> </div> <div class="comment-date">2020-12-12 19:08 UTC</div> </div> <div class="comment" id="ebce3b718bc84329b6979bcacf6c2573"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Johannes, thank you for writing. There are several ways to address that issue, depending on what sort of trade-off you're looking for. There's always a trade-off. </p> <p> You can address the issue with a lock-free architecture. This typically involves expressing the desired action as a Command and putting it on a durable queue.
If you combine that with a single-threaded, single-instance Actor that pulls Commands off the queue, you need no further transaction processing, because the architecture itself serialises writes. You can find plenty of examples of such an architecture on the internet, including (IIRC) my Pluralsight course <a href="/functional-architecture-with-fsharp">Functional Architecture with F#</a>. </p> <p> Another option is to simply surround the <a href="/2020/03/02/impureim-sandwich">impureim sandwich</a> with a <a href="https://docs.microsoft.com/dotnet/api/system.transactions.transactionscope">TransactionScope</a> (if you're on .NET, that is). </p> </div> <div class="comment-date">2020-12-16 16:59 UTC</div> </div> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Mark Seemann https://blog.ploeh.dk/2020/11/23/good-names-are-skin-deep