A restaurant example of refactoring from example-based to property-based testing

A C# example with xUnit.net and FsCheck.

This is the second comprehensive example that accompanies the article Epistemology of interaction testing. In that article, I argue that in a code base that leans toward functional programming (FP), property-based testing is a better fit than interaction-based testing. In this example, I will show how to refactor realistic state-based tests into (state-based) property-based tests.

The previous article showed a minimal and self-contained example that had the advantage of being simple, but the disadvantage of being perhaps too abstract and unrelatable. In this article, then, I will attempt to show a more realistic and concrete example. It actually doesn't start with interaction-based testing, since it's already written in the style of Functional Core, Imperative Shell. On the other hand, it shows how to refactor from concrete example-based tests to property-based tests.

I'll use the online restaurant reservation code base that accompanies my book Code That Fits in Your Head.

Smoke test #

I'll start with a simple test which was, if I remember correctly, the second test I wrote for this code base. It was a smoke test that I wrote to drive a walking skeleton. It verifies that if you post a valid reservation request to the system, you receive an HTTP response in the 200 range.

[Fact]
public async Task PostValidReservation()
{
    using var api = new LegacyApi();
 
    var expected = new ReservationDto
    {
        At = DateTime.Today.AddDays(778).At(19, 0)
                .ToIso8601DateTimeString(),
        Email = "katinka@example.com",
        Name = "Katinka Ingabogovinanana",
        Quantity = 2
    };
    var response = await api.PostReservation(expected);
 
    response.EnsureSuccessStatusCode();
    var actual = await response.ParseJsonContent<ReservationDto>();
    Assert.Equal(expected, actual, new ReservationDtoComparer());
}

Over the lifetime of the code base, I embellished and edited the test to reflect the evolution of the system as well as my understanding of it. Thus, when I wrote it, it may not have looked exactly like this. Even so, I kept it around even though other, more detailed tests eventually superseded it.

One characteristic of this test is that it's quite concrete. When I originally wrote it, I hard-coded the date and time as well. Later, however, I discovered that I had to make the time relative to the system clock. Thus, as you can see, the At property isn't a literal value, but all other properties (Email, Name, and Quantity) are.

This test is far from abstract or data-driven. Is it possible to turn such a test into a property-based test? Yes, I'll show you how.

A word of warning before we proceed: Tests with concrete, literal, easy-to-understand examples are valuable as programmer documentation. A person new to the code base can peruse such tests and learn about the system. Thus, this test is already quite valuable as it is. In a real, living code base, I'd prefer leaving it as it is, instead of turning it into a property-based test.

Since it's a simple and concrete test, on the other hand, it's easy to understand, and thus also a good place to start. I'm going to refactor it into a property-based test; not because I think that you should (I don't), but because I think it'll be easy for you, the reader, to follow along. In other words, it's a good introduction to the process of turning a concrete test into a property-based test.

Adding parameters #

This code base already uses FsCheck, so it makes sense to stick to that framework for property-based testing. While it's written in F#, you can use it from C# as well. The easiest way to use it is as a parametrised test. This is possible with the FsCheck.Xunit glue library. In fact, as I refactor the PostValidReservation test, it'll look much like the AutoFixture-driven tests from the previous article.

When turning concrete examples into properties, it helps to consider whether literal values are representative of an equivalence class. In other words, is that particular value important, or is there a wider set of values that would be just as good? For example, why is the test making a reservation 778 days in the future? Why not 777 or 779? Is the value 778 important? Not really. What's important is that the reservation is in the future. How far in the future actually isn't important. Thus, we can replace the literal value 778 with a parameter:

[Property]
public async Task PostValidReservation(PositiveInt days)
{
    using var api = new LegacyApi();
 
    var expected = new ReservationDto
    {
        At = DateTime.Today.AddDays((int)days).At(19, 0)
                .ToIso8601DateTimeString(),
        // The rest of the test...

Notice that I've replaced the literal value 778 with the method parameter days. The PositiveInt type is a type from FsCheck. It's a wrapper around int that guarantees that the value is positive. This is important because we don't want to make a reservation in the past. The PositiveInt type is a good choice because it's a type that's already available with FsCheck, and the framework knows how to generate valid values. Since it's a wrapper, though, the test needs to unwrap the value before using it. This is done with the (int)days cast.
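To make the idea of such a wrapper concrete, here's a minimal sketch of a type that only admits positive numbers and is unwrapped with an explicit cast. PositiveNumber is a hypothetical stand-in written for illustration; it isn't how FsCheck implements PositiveInt.

```csharp
using System;

// Hypothetical stand-in for a wrapper such as FsCheck's PositiveInt.
// It only illustrates the concept: the constructor guards the invariant,
// and callers unwrap the value with an explicit cast.
public sealed class PositiveNumber
{
    private readonly int value;

    public PositiveNumber(int value)
    {
        if (value <= 0)
            throw new ArgumentOutOfRangeException(
                nameof(value),
                "The value must be positive.");
        this.value = value;
    }

    // Enables the (int)days syntax.
    public static explicit operator int(PositiveNumber n) => n.value;
}

public static class PositiveNumberUsage
{
    // Mirrors how the test turns a number of days into a future date.
    public static DateTime FutureDate(PositiveNumber days) =>
        DateTime.Today.AddDays((int)days);
}
```

Because the constructor rejects zero and negative numbers, any code that receives a PositiveNumber can rely on the resulting date being in the future.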

Notice, also, that I've replaced the [Fact] attribute with the [Property] attribute that comes with FsCheck.Xunit. This is what enables FsCheck to automatically generate test cases and feed them to the test method. You can't always do this, as you'll see later, but when you can, it's a nice and succinct way to express a property-based test.

Already, the PostValidReservation test method is 100 test cases (the FsCheck default), rather than one.

What about Email and Name? Is it important for the test that these values are exactly katinka@example.com and Katinka Ingabogovinanana or might other values do? The answer is that it's not important. What's important is that the values are valid, and essentially any non-null string is. Thus, we can replace the literal values with parameters:

[Property]
public async Task PostValidReservation(
    PositiveInt days,
    StringNoNulls email,
    StringNoNulls name)
{
    using var api = new LegacyApi();
 
    var expected = new ReservationDto
    {
        At = DateTime.Today.AddDays((int)days).At(19, 0)
                .ToIso8601DateTimeString(),
        Email = email.Item,
        Name = name.Item,
        Quantity = 2
    };
    var response = await api.PostReservation(expected);
 
    response.EnsureSuccessStatusCode();
    var actual = await response.ParseJsonContent<ReservationDto>();
    Assert.Equal(expected, actual, new ReservationDtoComparer());
}

The StringNoNulls type is another FsCheck wrapper, this time around string. It ensures that FsCheck will generate no null strings. This time, however, a cast isn't possible, so instead I had to pull the wrapped string out of the value with the Item property.

That's enough conversion to illustrate the process.

What about the literal values 19, 0, or 2? Shouldn't we parametrise those as well? While we could, that takes a bit more effort. The problem is that with these values, any old positive integer isn't going to work. For example, the number 19 is the hour component of the reservation time; that is, the reservation is for 19:00. Clearly, we can't just let FsCheck generate any positive integer, because most integers aren't going to work. For example, 5 doesn't work because it's in the early morning, and the restaurant isn't open at that time.

Like other property-based testing frameworks, FsCheck has an API that enables you to constrain value generation, but it doesn't work with the type-based approach I've used so far. Unlike PositiveInt, there's no TimeBetween16And21 wrapper type.

You'll see what you can do to control how FsCheck generates values, but I'll use another test for that.

Parametrised unit test #

The PostValidReservation test is a high-level smoke test that gives you an idea about how the system works. It doesn't, however, reveal much about the possible variations in input. To drive such behaviour, I wrote and evolved the following state-based test:

[Theory]
[InlineData(1049, 19, 00, "juliad@example.net", "Julia Domna", 5)]
[InlineData(1130, 18, 15, "x@example.com", "Xenia Ng", 9)]
[InlineData( 956, 16, 55, "kite@example.edu", null, 2)]
[InlineData( 433, 17, 30, "shli@example.org", "Shanghai Li", 5)]
public async Task PostValidReservationWhenDatabaseIsEmpty(
    int days,
    int hours,
    int minutes,
    string email,
    string name,
    int quantity)
{
    var at = DateTime.Now.Date + new TimeSpan(days, hours, minutes, 0);
    var db = new FakeDatabase();
    var sut = new ReservationsController(
        new SystemClock(),
        new InMemoryRestaurantDatabase(Grandfather.Restaurant),
        db);
    var expected = new Reservation(
        new Guid("B50DF5B1-F484-4D99-88F9-1915087AF568"),
        at,
        new Email(email),
        new Name(name ?? ""),
        quantity);
 
    await sut.Post(expected.ToDto());
 
    Assert.Contains(expected, db.Grandfather);
}

This test gives more details, without exercising all possible code paths of the system. It's still a Facade Test that covers 'just enough' of the integration with underlying components to provide confidence that things work as they should. All the business logic is implemented by a class called MaitreD, which is covered by its own set of targeted unit tests.

While parametrised, this is still only four test cases, so perhaps you don't have sufficient confidence that everything works as it should. Perhaps, as I've outlined in the introductory article, it would help if we converted it to an FsCheck property.

Parametrised property #

I find it safest to refactor this parametrised test to a property in a series of small steps. This implies that I need to keep the [InlineData] attributes around for a while longer, removing one or two literal values at a time, turning them into randomly generated values.

From the previous test we know that the Email and Name values are almost unconstrained. This means that it's trivial to have FsCheck generate them. It's good that this change is easy, because combining an [InlineData]-driven [Theory] with an FsCheck property is enough of a mouthful for one refactoring step:

[Theory]
[InlineData(1049, 19, 00, 5)]
[InlineData(1130, 18, 15, 9)]
[InlineData( 956, 16, 55, 2)]
[InlineData( 433, 17, 30, 5)]
public void PostValidReservationWhenDatabaseIsEmpty(
    int days,
    int hours,
    int minutes,
    int quantity)
{
    Prop.ForAll(
        (from r in Gens.Reservation
         select r).ToArbitrary(),
        async r =>
        {
            var at = DateTime.Now.Date + new TimeSpan(days, hours, minutes, 0);
            var db = new FakeDatabase();
            var sut = new ReservationsController(
                new SystemClock(),
                new InMemoryRestaurantDatabase(Grandfather.Restaurant),
                db);
            var expected = r
                .WithQuantity(quantity)
                .WithDate(at);
 
            await sut.Post(expected.ToDto());
 
            Assert.Contains(expected, db.Grandfather);
        }).QuickCheckThrowOnFailure();
}

I've now managed to get rid of the email and name parameters, so I've also removed those values from the [InlineData] attributes. Instead, I've asked FsCheck to generate a valid reservation r, which comes with both valid Email and Name.

It turned out that this code base already had some custom generators in a static class called Gens, so I reused those:

internal static Gen<Email> Email =>
    from s in ArbMap.Default.GeneratorFor<NonWhiteSpaceString>()
    select new Email(s.Item);
 
internal static Gen<Name> Name =>
    from s in ArbMap.Default.GeneratorFor<StringNoNulls>()
    select new Name(s.Item);
 
internal static Gen<Reservation> Reservation =>
    from id in ArbMap.Default.GeneratorFor<Guid>()
    from d in ArbMap.Default.GeneratorFor<DateTime>()
    from e in Email
    from n in Name
    from q in ArbMap.Default.GeneratorFor<PositiveInt>()
    select new Reservation(id, d, e, n, q.Item);

As was also the case with CsCheck, you typically use syntactic sugar for monads (which in C# is query syntax) to compose complex test data generators from simpler generators. This enables me to generate an entire Reservation object with a single expression.
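If it's unclear why query syntax works on generators at all, the following sketch may help. It defines a deliberately tiny, hypothetical generator type (MiniGen, which is not FsCheck's Gen) and gives it the Select and SelectMany methods that the C# compiler looks for when it translates a query expression. All names here are assumptions made for illustration.

```csharp
using System;

// Hypothetical miniature generator: a function from a source of
// randomness to a value. This is not FsCheck's Gen<T>.
public delegate T MiniGen<T>(Random rnd);

public static class MiniGen
{
    // Picks an integer between min and max, both inclusive.
    public static MiniGen<int> Choose(int min, int max) =>
        rnd => rnd.Next(min, max + 1);

    // Used by the compiler for queries with a single from clause.
    public static MiniGen<TResult> Select<T, TResult>(
        this MiniGen<T> source,
        Func<T, TResult> selector) =>
        rnd => selector(source(rnd));

    // Used by the compiler for queries with more than one from clause.
    public static MiniGen<TResult> SelectMany<T, U, TResult>(
        this MiniGen<T> source,
        Func<T, MiniGen<U>> bind,
        Func<T, U, TResult> project) =>
        rnd =>
        {
            var x = source(rnd);
            var y = bind(x)(rnd);
            return project(x, y);
        };
}

public static class MiniGenDemo
{
    // Two simple generators composed into one with query syntax.
    public static MiniGen<(int, TimeSpan)> QuantityAndTime =>
        from q in MiniGen.Choose(1, 10)
        from h in MiniGen.Choose(16, 21)
        select (q, TimeSpan.FromHours(h));
}
```

Calling MiniGenDemo.QuantityAndTime(new Random()) produces one random pair. A real library adds much more on top of this, shrinking in particular, but the composition mechanism is the same.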

Time of day #

Some of the values (such as the reservation's name and email address) that are involved in the PostValidReservationWhenDatabaseIsEmpty test don't really matter. Other values are constrained in some way. Even for the reservation r the above version of the test has to override the arbitrarily generated r value with a specific quantity and a specific at value. This is because you can't just reserve any quantity at any time of day. The restaurant has opening hours and actual tables. Most likely, it doesn't have a table for 100 people at 3 in the morning.

This particular test actually exercises a particular restaurant called Grandfather.Restaurant (because it was the original restaurant that was grandfathered in when the system was expanded to a multi-tenant system). It opens at 16 and has the last seating at 21. This means that the at value has to be between 16 and 21. What's the best way to generate a DateTime value that satisfies this constraint?

You could, naively, ask FsCheck to generate an integer between these two values. You'll see how to do that when we get to the quantity. While that would work for the at value, it would only generate the whole hours 16:00, 17:00, 18:00, etcetera. It would be nice if the test could also exercise times such as 18:30, 20:45, and so on. On the other hand, perhaps we don't want weird reservation times such as 17:09:23.282. How do we tell FsCheck to generate a DateTime value like that?

It's definitely possible to do from scratch, but I chose to do something else. The following shows how test code and production code can co-exist in a symbiotic relationship. The main business logic component that deals with reservations in the system is a class called MaitreD. One of its methods is used to generate a list of time slots for every day. A user interface can use that list to populate a drop-down list of available times. The method is called Segment and can also be used as a data source for an FsCheck test data generator:

internal static Gen<TimeSpan> ReservationTime(
    Restaurant restaurant,
    DateTime date)
{
    var slots = restaurant.MaitreD
        .Segment(date, Enumerable.Empty<Reservation>())
        .Select(ts => ts.At.TimeOfDay);
    return Gen.Elements(slots);
}

The Gen.Elements function is an FsCheck combinator that randomly picks a value from a collection. This one, then, picks one of the time-of-day values projected from the time slots that MaitreD.Segment generates.
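To get a feel for the kind of list such a generator might pick from, here's a rough sketch that enumerates time slots between an opening time and a last seating. The TimeSlots helper and the 15-minute interval are assumptions made for illustration; the real MaitreD.Segment method takes a date and existing reservations, and its implementation isn't shown in this article.

```csharp
using System;
using System.Collections.Generic;

public static class TimeSlots
{
    // Hypothetical helper, not the code base's MaitreD.Segment.
    // Enumerates times of day from opensAt to lastSeating, both
    // inclusive, in steps of the given interval.
    public static IEnumerable<TimeSpan> Between(
        TimeSpan opensAt,
        TimeSpan lastSeating,
        TimeSpan interval)
    {
        if (interval <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(
                nameof(interval),
                "The interval must be positive.");

        return Enumerate(opensAt, lastSeating, interval);
    }

    private static IEnumerable<TimeSpan> Enumerate(
        TimeSpan opensAt,
        TimeSpan lastSeating,
        TimeSpan interval)
    {
        for (var t = opensAt; t <= lastSeating; t += interval)
            yield return t;
    }
}
```

With an opening time of 16, a last seating at 21, and an assumed interval of 15 minutes, TimeSlots.Between yields 16:00, 16:15, 16:30, and so on up to and including 21:00. Picking randomly from such a list gives times like 18:30 or 20:45, but never something like 17:09:23.282.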

The PostValidReservationWhenDatabaseIsEmpty test can now use the ReservationTime generator to produce a time of day:

[Theory]
[InlineData(5)]
[InlineData(9)]
[InlineData(2)]
public void PostValidReservationWhenDatabaseIsEmpty(int quantity)
{
    var today = DateTime.Now.Date;
    Prop.ForAll(
        (from days in ArbMap.Default.GeneratorFor<PositiveInt>()
         from t in Gens.ReservationTime(Grandfather.Restaurant, today)
         let offset = TimeSpan.FromDays((int)days) + t
         from r in Gens.Reservation
         select (r, offset)).ToArbitrary(),
        async t =>
        {
            var at = today + t.offset;
            var db = new FakeDatabase();
            var sut = new ReservationsController(
                new SystemClock(),
                new InMemoryRestaurantDatabase(Grandfather.Restaurant),
                db);
            var expected = t.r
                .WithQuantity(quantity)
                .WithDate(at);
 
            await sut.Post(expected.ToDto());
 
            Assert.Contains(expected, db.Grandfather);
        }).QuickCheckThrowOnFailure();
}

Granted, the test code is getting more and more busy, but there's room for improvement. Before I simplify it, though, I think that it's more prudent to deal with the remaining literal values.

Notice that the InlineData attributes now only supply a single value each: The quantity.

Quantity #

Like the at value, the quantity is constrained. It must be a positive integer, but it can't be larger than the largest table in the restaurant. That number, however, isn't that hard to find:

var maxCapacity = restaurant.MaitreD.Tables.Max(t => t.Capacity);

The FsCheck API includes a function that generates a random number within a given range. It's called Gen.Choose, and now that we know the range, we can use it to generate the quantity value. Here, I'm only showing the test-data-generator part of the test, since the rest doesn't change that much. You'll see the full test again after a few more refactorings.

var today = DateTime.Now.Date;
var restaurant = Grandfather.Restaurant;
var maxCapacity = restaurant.MaitreD.Tables.Max(t => t.Capacity);
Prop.ForAll(
    (from days in ArbMap.Default.GeneratorFor<PositiveInt>()
     from t in Gens.ReservationTime(restaurant, today)
     let offset = TimeSpan.FromDays((int)days) + t
     from quantity in Gen.Choose(1, maxCapacity)
     from r in Gens.Reservation
     select (r.WithQuantity(quantity), offset)).ToArbitrary(),

There are now no more literal values in the test. In a sense, the refactoring from parametrised test to property-based test is complete. It could do with a bit of cleanup, though.

Simplification #

There's no longer any need to pass along the offset variable, and the explicit QuickCheckThrowOnFailure also seems a bit redundant. I can use the [Property] attribute from FsCheck.Xunit instead.

[Property]
public Property PostValidReservationWhenDatabaseIsEmpty()
{
    var today = DateTime.Now.Date;
    var restaurant = Grandfather.Restaurant;
    var maxCapacity = restaurant.MaitreD.Tables.Max(t => t.Capacity);
    return Prop.ForAll(
        (from days in ArbMap.Default.GeneratorFor<PositiveInt>()
         from t in Gens.ReservationTime(restaurant, today)
         let at = today + TimeSpan.FromDays((int)days) + t
         from quantity in Gen.Choose(1, maxCapacity)
         from r in Gens.Reservation
         select r.WithQuantity(quantity).WithDate(at)).ToArbitrary(),
        async expected =>
        {
            var db = new FakeDatabase();
            var sut = new ReservationsController(
                new SystemClock(),
                new InMemoryRestaurantDatabase(restaurant),
                db);
 
            await sut.Post(expected.ToDto());
 
            Assert.Contains(expected, db.Grandfather);
        });
}

Compared to the initial version of the test, it has become more top-heavy. It's about the same size, though. The original version was 30 lines of code. This version is only 26 lines of code, but it is admittedly more information-dense. The original version had more 'noise' interleaved with the 'signal'. The new variation actually has a better separation of data generation and the test itself. Consider the 'actual' test code:

var db = new FakeDatabase();
var sut = new ReservationsController(
    new SystemClock(),
    new InMemoryRestaurantDatabase(restaurant),
    db);
 
await sut.Post(expected.ToDto());
 
Assert.Contains(expected, db.Grandfather);

If we could somehow separate the data generation from the test itself, we might have something that was quite readable.

Extract test data generator #

The above data generation consists of a bit of initialisation and a query expression. Like all pure functions, it's easy to extract:

private static Gen<(Restaurant, Reservation)>
    GenValidReservationForEmptyDatabase()
{
    var today = DateTime.Now.Date;
    var restaurant = Grandfather.Restaurant;
    var capacity = restaurant.MaitreD.Tables.Max(t => t.Capacity);
 
    return from days in ArbMap.Default.GeneratorFor<PositiveInt>()
           from t in Gens.ReservationTime(restaurant, today)
           let at = today + TimeSpan.FromDays((int)days) + t
           from quantity in Gen.Choose(1, capacity)
           from r in Gens.Reservation
           select (restaurant, r.WithQuantity(quantity).WithDate(at));
}

While it's quite specialised, it leaves the test itself small and readable:

[Property]
public Property PostValidReservationWhenDatabaseIsEmpty()
{
    return Prop.ForAll(
        GenValidReservationForEmptyDatabase().ToArbitrary(),
        async t =>
        {
            var (restaurant, expected) = t;
            var db = new FakeDatabase();
            var sut = new ReservationsController(
                new SystemClock(),
                new InMemoryRestaurantDatabase(restaurant),
                db);
 
            await sut.Post(expected.ToDto());
 
            Assert.Contains(expected, db[restaurant.Id]);
        });
}

That's not the only way to separate test and data generation.

Test as implementation detail #

The above separation refactors the data-generating expression to a private helper function. Alternatively, you can keep all that FsCheck infrastructure code in the public test method and extract the test body itself to a private helper method:

[Property]
public Property PostValidReservationWhenDatabaseIsEmpty()
{
    var today = DateTime.Now.Date;
    var restaurant = Grandfather.Restaurant;
    var capacity = restaurant.MaitreD.Tables.Max(t => t.Capacity);
 
    var g = from days in ArbMap.Default.GeneratorFor<PositiveInt>()
            from t in Gens.ReservationTime(restaurant, today)
            let at = today + TimeSpan.FromDays((int)days) + t
            from quantity in Gen.Choose(1, capacity)
            from r in Gens.Reservation
            select (restaurant, r.WithQuantity(quantity).WithDate(at));
 
    return Prop.ForAll(
        g.ToArbitrary(),
        t => PostValidReservationWhenDatabaseIsEmptyImp(
            t.restaurant,
            t.Item2));
}

At first glance, that doesn't look like an improvement, but it has the advantage that the actual test method is now devoid of FsCheck details. If we use that as a yardstick for how decoupled the test is from FsCheck, this seems cleaner.

private static async Task PostValidReservationWhenDatabaseIsEmptyImp(
    Restaurant restaurant, Reservation expected)
{
    var db = new FakeDatabase();
    var sut = new ReservationsController(
        new SystemClock(),
        new InMemoryRestaurantDatabase(restaurant),
        db);
 
    await sut.Post(expected.ToDto());
 
    Assert.Contains(expected, db[restaurant.Id]);
}

Using a property-based testing framework in C# is still more awkward than in a language with better support for monadic composition and pattern matching. That said, more recent versions of C# do have better pattern matching on tuples, but this code base is still on C# 8.

If you still think that this looks more complicated than the initial version of the test, then I agree. Property-based testing isn't free, but you get something in return. We started with four test cases and ended with 100. And that's just the default. If you want to increase the number of test cases, that's just an API call away. You could run 1,000 or 10,000 test cases if you wanted to. The only real downside is that the tests take longer to run.
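As an example of how to ask for more test cases, the [Property] attribute from FsCheck.Xunit has a MaxTest setting. The following sketch uses a hypothetical, deliberately trivial property rather than one of the tests from the code base, and it assumes that the xUnit.net, FsCheck, and FsCheck.Xunit packages are referenced by the test project.

```csharp
using FsCheck;
using FsCheck.Xunit;

public class MoreTestCases
{
    // Hypothetical property, only here to show the setting.
    // MaxTest raises the number of generated test cases from the
    // default of 100 to 1,000.
    [Property(MaxTest = 1000)]
    public bool PositiveIntIsPositive(PositiveInt days)
    {
        return (int)days > 0;
    }
}
```

The same setting applies to properties that return a Property or a Task, such as the ones shown above.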

Unhappy paths #

The tests above all test the happy path. A valid request arrives and the system is in a state where it can accept it. This small article series is, you may recall, a response to an email from Sergei Rogovtsev. In his email, he mentioned the need to test both happy path and various error scenarios. Let's cover a few before wrapping up.

As I was developing the system and fleshing out its behaviour, I evolved this parametrised test:

[Theory]
[InlineData(null, "j@example.net", "Jay Xerxes", 1)]
[InlineData("not a date", "w@example.edu", "Wk Hd", 8)]
[InlineData("2023-11-30 20:01", null, "Thora", 19)]
[InlineData("2022-01-02 12:10", "3@example.org", "3 Beard", 0)]
[InlineData("2045-12-31 11:45", "git@example.com", "Gil Tan", -1)]
public async Task PostInvalidReservation(
    string at,
    string email,
    string name,
    int quantity)
{
    using var api = new LegacyApi();
    var response = await api.PostReservation(
        new { at, email, name, quantity });
    Assert.Equal(HttpStatusCode.BadRequest, response.StatusCode);
}

The test body itself is about as minimal as it can be. There are five test cases that I added one or two at a time.

The first test case covers what happens if the at value is missing (i.e. null).
The next test case covers a malformed at value.
The third test case covers a missing email address.
The last two test cases cover non-positive quantities, both 0 and a negative number.

It's possible to combine FsCheck generators that deal with each of these cases, but here I want to demonstrate how it's still possible to keep each error case separate, if that's what you need. First, separate the test body from its data source, like I did above:

[Theory]
[InlineData(null, "j@example.net", "Jay Xerxes", 1)]
[InlineData("not a date", "w@example.edu", "Wk Hd", 8)]
[InlineData("2023-11-30 20:01", null, "Thora", 19)]
[InlineData("2022-01-02 12:10", "3@example.org", "3 Beard", 0)]
[InlineData("2045-12-31 11:45", "git@example.com", "Gil Tan", -1)]
public async Task PostInvalidReservation(
    string at,
    string email,
    string name,
    int quantity)
{
    await PostInvalidReservationImp(at, email, name, quantity);
}
 
private static async Task PostInvalidReservationImp(
    string at,
    string email,
    string name,
    int quantity)
{
    using var api = new LegacyApi();
    var response = await api.PostReservation(
        new { at, email, name, quantity });
    Assert.Equal(HttpStatusCode.BadRequest, response.StatusCode);
}

If you consider this refactoring in isolation, it seems frivolous, but it's just preparation for further work. In each subsequent refactoring I'll convert each of the above error cases to a property.

Missing date and time #

Starting from the top, convert the reservation-at-null test case to a property:

[Property]
public async Task PostReservationAtNull(string email, string name, PositiveInt quantity)
{
    await PostInvalidReservationImp(null, email, name, (int)quantity);
}

I've left the parametrised PostInvalidReservation test in place, but removed the [InlineData] attribute with the null value for the at parameter:

[Theory]
[InlineData("not a date", "w@example.edu", "Wk Hd", 8)]
[InlineData("2023-11-30 20:01", null, "Thora", 19)]
[InlineData("2022-01-02 12:10", "3@example.org", "3 Beard", 0)]
[InlineData("2045-12-31 11:45", "git@example.com", "Gil Tan", -1)]
public async Task PostInvalidReservation(

The PostReservationAtNull property can use the FsCheck.Xunit [Property] attribute, because any string can be used for email and name.

To be honest, it is, perhaps, cheating a bit to post any positive quantity, because a number like, say, 1837 would be a problem even if the posted representation was well-formed and valid, since no table of the restaurant has that capacity.

Validation does, however, happen before evaluating business rules and application state, so the way the system is currently implemented, the test never fails because of that. The service never gets to that part of handling the request.

One might argue that this is relying on (and thereby coupling to) an implementation detail, but honestly, it seems unlikely that the service would begin processing an invalid request - 'invalid' implying that the request makes no sense. Concretely, if the date and time is missing from a reservation, how can the service begin to process it? On which date? At what time?

Thus, it's not that likely that this behaviour would change in the future, and therefore unlikely that the test would fail because of a policy change. It is, however, worth considering.

Malformed date and time #

The next error case is when the at value is present, but malformed. You can also convert that case to a property:

[Property]
public Property PostMalformedDateAndTime()
{
    var g = from at in ArbMap.Default.GeneratorFor<string>()
                .Where(s => !DateTime.TryParse(s, out _))
            from email in Gens.Email
            from name in Gens.Name
            from quantity in Gen.Choose(1, 10)
            select (at,
                    email: email.ToString(),
                    name: name.ToString(),
                    quantity);
 
    return Prop.ForAll(
        g.ToArbitrary(),
        t => PostInvalidReservationImp(t.at, t.email, t.name, t.quantity));
}

Given how simple PostReservationAtNull turned out to be, you may be surprised that this case takes so much code to express. There's not that much going on, though. I reuse the generators I already have for email and name, and FsCheck's built-in Gen.Choose to pick a quantity between 1 and 10. The only slightly tricky expression is for the at value.

The distinguishing part of this test is that the at value should be malformed. A randomly generated string is a good starting point. After all, most strings aren't well-formed date-and-time values. Still, a random string could be interpreted as a date or time, so it's better to explicitly disallow such values. This is possible with the Where function. It's a filter that only allows values through that are not understandable as dates or times - which is the vast majority of them.
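To see what the filter's predicate does in isolation, here's a small sketch. The MalformedDates class is hypothetical, and unlike the property above it pins the culture to CultureInfo.InvariantCulture; that's an assumption made here so that the sketch behaves the same on every machine.

```csharp
using System;
using System.Globalization;

public static class MalformedDates
{
    // Hypothetical helper. Returns true when the candidate can't be
    // understood as a date or time. Unlike the Where filter in the
    // property, it uses the invariant culture instead of the current one.
    public static bool IsMalformed(string candidate) =>
        !DateTime.TryParse(
            candidate,
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out _);
}
```

MalformedDates.IsMalformed("not a date") is true, so the filter would let that string through, while a well-formed value such as "2023-11-30 20:01" is rejected. Keep in mind that DateTime.TryParse is quite lenient, which is exactly why it's worth filtering instead of assuming that every random string is malformed.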

Null email #

The penultimate error case is when the email address is missing. That one is as easy to express as the missing at value.

[Property]
public async Task PostNullEmail(DateTime at, string name, PositiveInt quantity)
{
    await PostInvalidReservationImp(at.ToIso8601DateTimeString(), null, name, (int)quantity);
}

Again, with the addition of this specific property, I've removed the corresponding [InlineData] attribute from the PostInvalidReservation test. It only has two remaining test cases, both about non-positive quantities.

Non-positive quantity #

Finally, we can add a property that checks what happens if the quantity isn't positive:

[Property]
public async Task PostNonPositiveQuantity(
    DateTime at,
    string email,
    string name,
    NonNegativeInt quantity)
{
    await PostInvalidReservationImp(at.ToIso8601DateTimeString(), email, name, -(int)quantity);
}

FsCheck doesn't have a wrapper for non-positive integers, but I can use NonNegativeInt and negate it. The point is that I want to include 0, which NonNegativeInt does. That wrapper generates integers greater than or equal to zero.

Since I've now modelled each error case as a separate FsCheck property, I can remove the PostInvalidReservation method.

Conclusion #

To be honest, I think that turning these parametrised tests into FsCheck properties is overkill. After all, when I wrote the code base, I found the parametrised tests adequate. I used test-driven development all the way through, and while I also kept the Devil's Advocate in mind, the tests that I wrote gave me sufficient confidence that the system works as it should.

The main point of this article is to show how you can convert example-based tests to property-based tests. After all, just because I felt confident in my test suite, it doesn't follow that a few parametrised tests do it for you. How much testing you need depends on a variety of factors, so you may need the extra confidence that thousands of test cases can give you.

The previous article in this series showed an abstract, but minimal example. This one is more realistic, but also more involved.

Next: Refactoring pure function composition without breaking existing tests.

Comments

Christer van der Meeren #

In the section "Missing date and time", you mention that it could be worth considering the coupling of the test to the implementation details regarding validation order and possible false positive test results. Given that you already have a test data generator that produces valid reservations (GenValidReservationForEmptyDatabase), wouldn't it be more or less trivial to just generate valid test data and modify it to make it invalid in the single specific way you want to test?

2023-04-18 14:00 UTC

Anthony Lloyd #

Am I right in thinking shrinking doesn't work in FsCheck with the query syntax? I've just tried with two ints. How would you make it work?

[Fact]
public void ShrinkingTest()
{
    Prop.ForAll(
        (from a1 in Arb.Default.Int32().Generator
         from a2 in Arb.Default.Int32().Generator
         select (a1, a2)).ToArbitrary(),
        t =>
        {
            if (t.a2 > 10)
                throw new System.Exception();
        })
    .QuickCheckThrowOnFailure();
}

2023-04-18 19:15 UTC

Mark Seemann #

Christer, thank you for writing. It wouldn't be impossible to address that concern, but I haven't found a good way of doing it without introducing other problems. So, it's a trade-off.

What I meant by my remark in the article is that in order to make an (otherwise) valid request, the test needs to know the maximum valid quantity, which varies from restaurant to restaurant. The problem, in a nutshell, is that the test in question operates exclusively against the REST API of the service, and that API doesn't expose any functionality that enables clients to query the configuration of tables for a given restaurant. There's no way to obtain that information.

The only two options I can think of are:

Add such a query API to the REST API. In this case, that seems unwarranted.
Add a backdoor API to the self-host (LegacyApi).

If I had to, I'd prefer the second option, but it would still require me to add more (test) code to the code base. There's a cost to every line of code.

Here, I'm making a bet that the grandfathered restaurant isn't going to change its configuration. The tests are then written with the implicit knowledge that that particular restaurant has a maximum table size of 10, and also particular opening and closing times.

This makes those tests more concrete, which makes them more readable. They serve as easy-to-understand examples of how the system works (once the reader has gained the implicit knowledge I just described).

It's not perfect. The tests are, perhaps, too obscure for that reason, and they are vulnerable to configuration changes. Even so, the remedies I can think of come with their own disadvantages.

So far, I've decided that the best trade-off is to leave things as you see them here. That doesn't mean that I wouldn't change that decision in the future if it turns out that these tests are too brittle.

2023-04-19 8:18 UTC

Mark Seemann #

Anthony, thank you for writing. You're correct that in FsCheck shrinking doesn't work with query syntax; at least in the versions I've used. I'm not sure if that's planned for a future release.

As far as I can tell, this is a consequence of the maturity of the library. You have the same issue with QuickCheck, which also distinguishes between Gen and Arbitrary. While Gen is a monad, Arbitrary's shrink function is invariant, which prevents it from being a functor (and hence, also from being a monad).

FsCheck is a mature port of QuickCheck, so it has the same limitation. No functor, no query syntax.
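To make the variance argument concrete, here is a minimal sketch. It doesn't use FsCheck's actual types; SimpleArbitrary and its two-argument Select are hypothetical names invented for this illustration. The point is only that once a shrinker of the shape T -> IEnumerable<T> is part of the type, a mapping needs a function in each direction, which is more than a functor's Select is allowed to ask for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an Arbitrary: a generator paired with a shrinker.
// This is not FsCheck's real type; it only illustrates the variance problem.
public sealed class SimpleArbitrary<T>
{
    public SimpleArbitrary(
        Func<Random, T> generate,
        Func<T, IEnumerable<T>> shrink)
    {
        Generate = generate;
        Shrink = shrink;
    }

    // T appears only in output position (covariant).
    public Func<Random, T> Generate { get; }

    // T appears in both input and output position (invariant).
    public Func<T, IEnumerable<T>> Shrink { get; }

    // Mapping the generator only needs 'forward'. Mapping the shrinker also
    // needs 'backward', in order to turn a U back into a T before shrinking.
    // A functor's Select would take 'forward' alone, so this isn't one.
    public SimpleArbitrary<U> Select<U>(
        Func<T, U> forward,
        Func<U, T> backward)
    {
        return new SimpleArbitrary<U>(
            rnd => forward(Generate(rnd)),
            u => Shrink(backward(u)).Select(forward));
    }
}
```

With this sketch, mapping integers to strings would require both i => i.ToString() and s => int.Parse(s). C# query syntax only ever supplies the first of those, which is one way to see why it can't carry a shrinker along.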

Later, this limitation was solved by modelling shrinking based on a lazily evaluated shrink tree, which does allow for a monad. The first time I saw that in effect was in Hedgehog.
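As a rough sketch of that idea, and not Hedgehog's actual implementation, a shrink tree can be modelled as a value paired with a lazily evaluated sequence of smaller trees. ShrinkTree, Halving, and the halving strategy are all hypothetical names and choices made for this illustration. Because T now only appears in output position, Select needs just a one-way function, and a monadic bind (here named SelectMany) can be defined as well.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rose tree of shrink candidates: a value together with a
// lazily evaluated sequence of smaller alternatives.
public sealed class ShrinkTree<T>
{
    public ShrinkTree(T value, IEnumerable<ShrinkTree<T>> shrinks)
    {
        Value = value;
        Shrinks = shrinks;
    }

    public T Value { get; }
    public IEnumerable<ShrinkTree<T>> Shrinks { get; }

    // A one-way mapping suffices. Enumerable.Select is deferred, so the
    // child trees are still only built on demand.
    public ShrinkTree<U> Select<U>(Func<T, U> selector)
    {
        return new ShrinkTree<U>(
            selector(Value),
            Shrinks.Select(t => t.Select(selector)));
    }

    // Monadic bind: first try shrinking the outer value, then the inner one.
    public ShrinkTree<U> SelectMany<U>(Func<T, ShrinkTree<U>> selector)
    {
        var inner = selector(Value);
        return new ShrinkTree<U>(
            inner.Value,
            Shrinks.Select(t => t.SelectMany(selector)).Concat(inner.Shrinks));
    }
}

public static class ShrinkTree
{
    // Builds a tree for a non-negative integer by repeated halving towards 0.
    public static ShrinkTree<int> Halving(int value)
    {
        return new ShrinkTree<int>(value, HalvingShrinks(value));
    }

    private static IEnumerable<ShrinkTree<int>> HalvingShrinks(int value)
    {
        if (value > 0)
            yield return Halving(value / 2);
    }
}
```

For instance, ShrinkTree.Halving(8).Select(i => i.ToString()) keeps the shrink candidates attached to the mapped value without ever needing a function from string back to int.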

2023-04-21 6:17 UTC

Anthony Lloyd #

Hedgehog does a little better than FsCheck but it doesn't shrink well when the variables are dependent.

[Fact]
public void ShrinkingTest_Hedgehog()
{
    Property.ForAll(
        from a1 in Gen.Int32(Range.ConstantBoundedInt32())
        from a2 in Gen.Int32(Range.ConstantBoundedInt32())
        where a1 > a2
        select (a1, a2))
    .Select(t =>
    {
        if (t.a2 > 10)
            throw new System.Exception();
    })
    .Check(PropertyConfig.Default.WithTests(1_000_000).WithShrinks(1_000_000));
}

[Fact]
public void ShrinkingTest_Hedgehog2()
{
    Property.ForAll(
        from a1 in Gen.Int32(Range.ConstantBoundedInt32())
        from a2 in Gen.Int32(Range.Constant(0, a1))
        select (a1, a2))
    .Select(t =>
    {
        if (t.a2 > 10)
            throw new System.Exception();
    })
    .Check(PropertyConfig.Default.WithTests(1_000_000).WithShrinks(1_000_000));
}

[Fact]
public void ShrinkingTest_CsCheck()
{
    (from a1 in Gen.Int
     from a2 in Gen.Int
     where a1 > a2
     select (a1, a2))
    .Sample((_, a2) =>
    {
        if (a2 > 10)
            throw new Exception();
    }, iter: 1_000_000);
}

[Fact]
public void ShrinkingTest_CsCheck2()
{
    (from a1 in Gen.Int.Positive
     from a2 in Gen.Int[0, a1]
     select (a1, a2))
    .Sample((_, a2) =>
    {
        if (a2 > 10)
            throw new Exception();
    }, iter: 1_000_000);
}

This and the syntax complexity I mentioned in the previous post were the reasons I developed CsCheck. Random shrinking is the key innovation that makes it simpler.

2023-04-21 16:38 UTC

Published: Monday, 17 April 2023 06:37:00 UTC

A restaurant example of refactoring from example-based to property-based testing by Mark Seemann