Writing test cases for all possible input values.

I've noticed that programmers new to automated testing struggle with a fundamental task: how to come up with good test input?

There's plenty of design patterns that address that issue, including Test Data Builders. Still, test-driven development, when done right, gives you good feedback on the design of your API. I don't consider having to resort to Test Data Builder as much of a compromise, but still, it's even better if you can model the API in question so that it requires no redundant data.

Most business processes can be modelled as finite state machines. Once you've understood the problem well enough to enumerate the possible states, you can reverse-engineer an algebraic data type from that enumeration.

While the data carried around in the state machine may be unconstrained, the number of state and state transitions is usually limited. Model the states as enumerable values, and you can cover all input values with simple combinatorics.

Tennis states #

The Tennis kata is one of my favourite exercises. Just like a business process, it turns out that you can model the rules as a finite state machine. There's a state where both players have love, 15, 30, or 40 points (although both can't have 40 points; that state is called deuce); there's a state where a player has the advantage; and so on.

I've previously written about how to design the states of the Tennis kata with types, so here, I'll just limit myself to the few types that turn out to matter as test input: players and points.

Here are both types defined as Haskell sum types:

data Player = PlayerOne | PlayerTwo deriving (EqShowReadEnumBounded)
 
data Point = Love | Fifteen | Thirty deriving (EqShowReadEnumBounded)

Notice that both are instances of the Enum and Bounded type classes. This means that it's possible to enumerate all values that inhabit each type. You can use that to your advantage when writing unit tests.

Properties for every value #

Most people think of property-based testing as something that involves a special framework such as QuickCheck, Hedgehog, or FsCheck. I often use such frameworks, but as I've written about before, sometimes enumerating all possible input values is simpler and faster than relying on randomly generated values. There's no reason to generate 100 random values of a type inhabited by only three values.

Instead, write properties for the entire domain of the function in question.

In Haskell, you can define a generic enumeration like this:

every :: (Enum a, Bounded a) => [a]
every = [minBound .. maxBound]

I don't know why this function doesn't already exist in the standard library, but I suppose that most people just rely on the list comprehension syntax [minBound .. maxBound]...

With it you can write simple properties of the Tennis game. For example:

"Given deuce, when player wins, then that player has advantage" ~: do
  player <- every
  let actual = score Deuce player
  return $ Advantage player ~=? actual

Here I'm using inlined HUnit test lists. With the Tennis kata, it's easiest to start with the deuce case, because regardless of player, the new state is that the winning player has advantage. That's what that property asserts. Because of the do notation, the property produces a list of test cases, one for every player. There's only two Player values, so it's only two test cases.

The beauty of do notation (or the list monad in general) is that you can combine enumerations, for example like this:

"Given forty, when player wins ball, then that player wins the game" ~: do
  player <- every
  opp <- every

  let actual = score (Forty $ FortyData player opp) player

  return $ Game player ~=? actual

While player is a Player value, opp (Other Player's Point) is a Point value. Since there's two possible Player values and three possible Point values, the combination yields six test cases.

I've been doing the Tennis kata a couple of times using this approach. I've also done it in C# with xUnit.net, where the first test of the deuce state looks like this:

[TheoryMemberData(nameof(Player))]
public void TransitionFromDeuce(IPlayer player)
{
    var sut = new Deuce();
 
    var actual = sut.BallTo(player);
 
    var expected = new Advantage(player);
    Assert.Equal(expected, actual);
}

C# takes more ceremony than Haskell, but the idea is the same. The Player data source for the [Theory] is defined as an enumeration of the possible values:

public static IEnumerable<object[]> Player
{
    get
    {
        yield return new object[] { new PlayerOne() };
        yield return new object[] { new PlayerTwo() };
    }
}

All the times I've done the Tennis kata by fully exhausting the domain of the transition functions in question, I've arrived at 40 test cases. I wonder if that's the number of possible state transitions of the game, or if it's just an artefact of the way I've modelled it. I suspect the latter...

Conclusion #

You can sometimes enumerate all possible inputs to an API. Even if a function takes a Boolean value and a byte as input, the enumeration of all possible combinations is only 2 * 256 = 512 values. You computer will tear through all of those combinations faster than you can say random selection. Consider writing APIs that take algebraic data types as input, and writing properties that exhaust the domain of the functions in question.



Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Monday, 31 August 2020 08:39:00 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!
Published: Monday, 31 August 2020 08:39:00 UTC