Pre-conditions are important in Property-Based Tests.

Last week, I was giving a workshop on Property-Based Testing, and we were looking at this test:

[<Property>]
let ``Validate.reservation returns right result on invalid date``
    (rendition : ReservationRendition) =
    
    let actual = Validate.reservation rendition
 
    let expected : Result<Reservation, Error> =
        Failure(ValidationError("Invalid date."))
    test <@ expected = actual @>

The test case being exercised by this test is to verify what happens when the input (essentially a representation of an incoming JSON document) is invalid.

The System Under Test is this function:

module Validate =
    let reservation (rendition : ReservationRendition) =
        match rendition.Date |> DateTimeOffset.TryParse with
        | (true, date) -> Success {
            Date = date
            Name = rendition.Name
            Email = rendition.Email
            Quantity = rendition.Quantity }
        | _ -> Failure(ValidationError "Invalid date.")

The validation rule is simple: if the rendition.Date string can be parsed into a DateTimeOffset value, then the input is valid; otherwise, it's not.

If you run the test, if passes:

Output from Ploeh.Samples.BookingApi.UnitTests.ValidatorTests.Validate.reservation returns right result on invalid date:
  Ok, passed 100 tests.

Only, this test isn't guaranteed to pass. Can you spot the problem?

There's a tiny, but real, risk that the randomly generated rendition.Date string can be parsed as a DateTimeOffset value. While it's not particularly likely, it will happen once in a blue moon. This can lead to spurious false positives.

In order to see this, you can increase the number of test cases generated by FsCheck:

[<Property(MaxTest = 100000)>]
let ``Validate.reservation returns right result on invalid date``
    (rendition : ReservationRendition) =
    
    let actual = Validate.reservation rendition
 
    let expected : Result<Reservation, Error> =
        Failure(ValidationError("Invalid date."))
    test <@ expected = actual @>

In my experience, when you ask for 100,000 test cases, the property usually fails:

Test 'Ploeh.Samples.BookingApi.UnitTests.ValidatorTests.Validate.reservation returns right result on invalid date' failed:
	FsCheck.Xunit.PropertyFailedException : 
Falsifiable, after 9719 tests (11 shrinks) (StdGen (1054829489,296106988)):
Original:
{Date = "7am";
 Name = "6WUx;";
 Email = "W";
 Quantity = -5;}
Shrunk:
{Date = "7am";
 Name = "";
 Email = "";
 Quantity = 0;}

---- Swensen.Unquote.AssertionFailedException : Test failed:

Failure (ValidationError "Invalid date.") = Success {Date = 15.01.2016 07:00:00 +00:00;
         Name = "";
         Email = "";
         Quantity = 0;}
false

The problem is that the generated date "7am" can be parsed as a DateTimeOffset value:

> open System;;
> DateTimeOffset.Now;;
val it : DateTimeOffset =
  15.01.2016 21:09:25 +00:00
> "7am" |> DateTimeOffset.TryParse;;
val it : bool * DateTimeOffset =
  (true, 15.01.2016 07:00:00 +00:00)

The above test is written with the implicit assumption that the generated value for rendition.Date will always be an invalidly formatted string. As the Zen of Python reminds us, explicit is better than implicit. You should make that assumption explicit:

[<Property>]
let ``Validate.reservation returns right result on invalid date``
    (rendition : ReservationRendition) =
    not(fst(DateTimeOffset.TryParse rendition.Date)) ==> lazy
    
    let actual = Validate.reservation rendition
 
    let expected : Result<Reservation, Error> =
        Failure(ValidationError("Invalid date."))
    test <@ expected = actual @>

This version of the test uses FsCheck's built-in custom operator ==>. This operator will only evaluate the expression on the right-hand side if the boolean expression on the left-hand side evaluates to true. In the unlikely event that the above condition evaluates to false, FsCheck is going to ignore that randomly generated value, and try with a new one.

This version of the test succeeds, even if you set MaxTest to 1,000,000.

The ==> operator effectively discards candidate values that don't match the 'guard' condition. You shouldn't always use this feature to constrain the input, as it can be wasteful. If the condition isn't satisfied, FsCheck will attempt to generate a new value that satisfies the condition. In this particular example, only in extremely rare cases will FsCheck be forced to discard a candidate value, so this use of the feature is appropriate.

In other cases, you should consider other options. Imagine that you want only even numbers. While you could write a 'guard' condition that ensures that only even numbers are used, that would cause FsCheck to throw away half of the generated candidates. In such cases, you should instead consider defining a custom Arbitrary, using FsCheck's API.

Another discussion entirely is whether the current behaviour is a good idea. If we consider tests as a feedback mechanism about software design, then perhaps we should explicitly specify the expected format. At the moment, the implementation is an extremely Tolerant Reader (because DateTimeOffset.TryParse is a Tolerant Reader). Perhaps it'd make sense to instead use TryParseExact. That's an API modelling decision, though, and might have to involve various other stakeholders than only programmers.

If you wish to learn more about Property-Based Testing with FsCheck, consider watching my Pluralsight course.



Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Monday, 18 January 2016 12:00:00 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!
Published: Monday, 18 January 2016 12:00:00 UTC