Typing and testing problem 23

Yet another reflection on the relationship between types and tests, this time with a simple example.

The debate about dynamic typing versus static typing still goes on. If it ever gets resolved, I suppose it'll be in the far future. Until then, one's position is bound to be determined mostly by experience and belief. I openly admit that I prefer statically typed languages like F# and Haskell.

As I've previously touched on, I can't help seeing types as a slider. The more to the right you pull it, the stronger the type system. The more to the left you pull it, the more you'll need automated tests to give you a sense of confidence in your code.

In this article, I'll share a small revelation recently given to me.

Problem 23 #

Somewhere, a Stack Overflow user was going through Ninety-Nine Haskell Problems, and asked how to unit test problem 23.

The problem is elementary:

"Extract a given number of randomly selected elements from a list."

Here's an example of the intended use:

λ> rndSelect "abcdefgh" 3
"fag"

The first argument to rndSelect is the candidates from which to pick elements; in this case the letters a to h. The second argument is the number of values to select; in this case the number 3.

Test plan #

How does one test a function like that? Clearly, when randomness is involved, you'll need some way to regulate the randomness in order to make tests deterministic. With my blinders on, I assumed that this was the main problem, so I answered with the following plan for a few properties:

The length of the returned list should be equal to the requested count.
All elements in the returned list should be elements of the list of candidates.

In addition, I also suggested a way to make tests deterministic, but I'll get back to that later.

In response to this plan, the user chi commented on my second suggestion:

"I think this it is a consequence of the free theorem. If so, no need to test for that!"

Sometimes, I find it difficult to shake my object-oriented TDD-influenced way of thinking, but chi is right. Here's why:

Parametric polymorphism #

.NET, including C# and F#, has a platform feature called generics. Haskell has generics as well, although normally, that language feature is called parametric polymorphism. What I had in mind was a set of parametrically polymorphic functions with these types:

rndGenSelect :: (RandomGen g, Integral i) => g -> [a] -> i -> [a]

rndSelect :: Integral i => [a] -> i -> IO [a]

Notice that both functions return lists of a values, where a is a type variable (in C#, you'd call it a generic type argument). It could be Integer, String, Day, or a custom domain type you'd added to the code base two minutes earlier.

Given a completely unrestricted type variable, Haskell has no way of creating values. How could it, logically?

In C#, you can write default(T), which tends to mostly produce null references. Haskell doesn't have null, so with that option cut off, how would it be able to produce values of arbitrary types? It can't.

When returning a list of a values, the only option open to a parametrically polymorphic function is to pick values from its input arguments. For both rndGenSelect and rndSelect, there's only a single source of a values, so there's no reason to test that the functions return values from those lists of candidates. It's the only thing they can do. That's the free theorem for those functions.
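
To make the free theorem a little more tangible, here's a minimal sketch that uses only base. The function everyOther is a hypothetical stand-in for any function of the type [a] -> [a]; it isn't part of the article's code base. The free theorem for that type says that mapping a function over the list before or after the call gives the same result, and a consequence is that every output element must come from the input.

```haskell
-- Hypothetical stand-in for any function of type [a] -> [a].
-- It cannot inspect or invent elements; it can only keep, drop,
-- duplicate, or rearrange the ones it is given.
everyOther :: [a] -> [a]
everyOther (x : _ : rest) = x : everyOther rest
everyOther xs = xs

-- The free theorem for [a] -> [a]: for any h, mapping before the
-- call gives the same result as mapping after it.
naturality :: Eq b => (a -> b) -> [a] -> Bool
naturality h xs = map h (everyOther xs) == everyOther (map h xs)

-- A consequence: every element of the output is one of the inputs.
onlyFromInput :: Eq a => [a] -> Bool
onlyFromInput xs = all (`elem` xs) (everyOther xs)
```

Both naturality fromEnum "abcde" and onlyFromInput "abcde" evaluate to True, and no implementation of that type that terminates normally could make them False. The same reasoning applies to the list of a values returned by rndGenSelect.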

It would have been an entirely different story if the function had had concrete types. If, for example, the function had had the type RandomGen g => g -> String -> Int -> String, I could have written a function like this one:

rndGenSelect' :: RandomGen g => g -> String -> Int -> String
rndGenSelect' _ _ count = replicate count 's'

Because the type of elements is known at compile-time, we can pick an arbitrary Char value ('s'). This is possible because we know the type, and therefore can come up with a strategy to hard-code known values of that type. When the type argument is unknown, this is no longer possible. To paraphrase Robert C. Martin, as the types get more generic, the tests become more redundant.

Taming randomness #

Before we look at automated testing, let's consider how to turn randomness into deterministic behaviour. This is (seemingly) always a problem with unit testing when the desired behaviour contains randomness, because tests should be deterministic. Once again, however, it turns out that functional design is intrinsically testable. Since Haskell design favours pure functions, the core of System.Random is deterministic.

This is, in fact, not much different from C#, where the Random class encapsulates an algorithm that computes a series of random-looking values based on an initial seed value. If you give it the same seed, it'll produce the same sequence of random-looking numbers. Haskell works the same way.
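
As a small illustration of that repeatability, here's a sketch that assumes the random package is installed. The helper rolls is a hypothetical name introduced for this example; mkStdGen and randomRs are the System.Random functions used elsewhere in this article.

```haskell
import System.Random (mkStdGen, randomRs)

-- Five die rolls drawn from a generator built from the given seed.
-- 'rolls' is a hypothetical helper, not part of System.Random.
rolls :: Int -> [Int]
rolls seed = take 5 (randomRs (1, 6) (mkStdGen seed))
```

Evaluating rolls 42 twice yields the same list both times. The concrete numbers may depend on the version of the random package, so I'm not showing any here.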

This led me to a design with a 'core' function that does all the work, and a 'wrapper' function that only adds one extra feature: randomness.

Starting my design with types, I wanted a function with this type:

rndGenSelect :: (RandomGen g, Integral i) => g -> [a] -> i -> [a]

This is the type that I've already discussed above. Because of the free theorem, we already know that the returned list can only contain values selected from the input list. In other words, there's no need to test for that.

This function takes a RandomGen argument, which is a type class of pure functions. RandomGen itself is pure; the source of randomness comes from how it's produced. More on that later. This, however, should enable me to write deterministic tests.

Properties #

Before we start adding deterministic tests, let's see how far we can get with property-based testing. First, designing with types, I need to implement the function so that it compiles. This is the simplest implementation I could think of:

rndGenSelect :: (RandomGen g, Integral i) => g -> [a] -> i -> [a]
rndGenSelect _ xs _ = [head xs]

This implementation is both incorrect and unsafe, but it compiles. In TDD fashion, then, I found it appropriate to add a test - in this case a QuickCheck property:

lenProp :: Integral i => Int -> [a] -> NonNegative i -> Bool
lenProp seed xs (NonNegative i) =
  i == genericLength (rndGenSelect (mkStdGen seed) xs i)

This little piece of test code is the only surviving property from my original test plan. It states that for any non-negative count, the list returned from rndGenSelect should have the requested length.

Writing this property, however, quickly forced me to deal with the case where the count is negative. It's easy to forget about edge cases when your function is nothing but a pie in the sky, but QuickCheck (and property-based testing in general) is really effective at grounding you in reality. Even with a language like Haskell, I still find the fast feedback loop from tests helpful.

The original exercise specification doesn't mention what should happen if the count is negative, so after a short deliberation, I decided to write another property:

negLenProp :: Integral i => Int -> [a] -> Positive i -> Bool
negLenProp seed xs (Positive i) =
  0 == genericLength (rndGenSelect (mkStdGen seed) xs (-i))

This property simply states that for all negative counts, the returned list should be empty.

Both of these properties obviously fail, because of the incorrect implementation.

The simplest implementation I could think of that passes both properties is this:

rndGenSelect :: (RandomGen g, Integral i) => g -> [a] -> i -> [a]
rndGenSelect _ xs count = genericReplicate count (head xs)
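
Why does an implementation based on genericReplicate satisfy the property for negative counts? As far as I can tell, it's because genericReplicate returns the empty list for any count that is zero or less. The following sketch uses only base; replicatedLength and lengthOverEmptyCandidates are hypothetical names introduced for illustration.

```haskell
import Data.List (genericLength, genericReplicate)

-- Length of the list that genericReplicate builds for a given count.
-- 'replicatedLength' is a hypothetical helper for illustration.
replicatedLength :: Integer -> Integer
replicatedLength count = genericLength (genericReplicate count 'x')

-- replicatedLength 3    evaluates to 3
-- replicatedLength 0    evaluates to 0
-- replicatedLength (-5) evaluates to 0; a non-positive count yields []

-- Computing the length never forces the elements themselves, so the
-- partial 'head []' goes unnoticed as long as only the length is used.
lengthOverEmptyCandidates :: Integer
lengthOverEmptyCandidates =
  genericLength (genericReplicate (2 :: Integer) (head ([] :: [Char])))
-- evaluates to 2
```

The last definition hints at why properties that only measure length can pass even when the list of candidates is empty: the implementation is still partial, but laziness hides it until someone looks at an element.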

At this point, I don't see how TDD or property-based testing can help me move forward. The remaining work required is to add randomness to the mix. In this case, I'll need to use the RandomGen argument to produce random values, but I don't know how its algorithm works. Even if I had a seed value known at compile-time, I wouldn't be able to predict which values it'd produce.

Selecting random indices #

I admit that I don't know how to write the next test a priori. I do know, however, that if I implement what's missing, I have a deterministic function, and I can use it to write a regression test. In other words, I'll reverse direction and write the code first, and then the test. What a novel idea.

rndGenSelect :: (RandomGen g, Integral i) => g -> [a] -> i -> [a]
rndGenSelect rnd xs count =
  let indices = genericTake count $ randomRs (0, length xs - 1) rnd
  in fmap (xs !!) indices

This function first uses randomRs to produce an infinite list of values. These values are indices because they all fall between 0 and length xs - 1. In other words, they are indices into xs.

While the list is infinite, it's lazily evaluated, so infinity itself isn't a problem. We only need count elements, though, so we can simply take the first count elements.

Finally, the function maps over the list of indices, and for each index value, selects the element at that position.
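
The take-then-map steps can be tried out without any randomness. In this sketch, the indices are passed in as an argument instead of being drawn from a generator. That's an assumption made here to keep the result predictable; selectAt and fixedIndices are hypothetical names.

```haskell
import Data.List (genericTake)

-- Same shape as the selection step, but with the indices supplied by
-- the caller instead of being produced by randomRs.
selectAt :: Integral i => [Int] -> [a] -> i -> [a]
selectAt indices xs count = fmap (xs !!) (genericTake count indices)

-- An infinite list of indices; only the first 'count' are evaluated.
fixedIndices :: [Int]
fixedIndices = cycle [5, 0, 6]
```

With these definitions, selectAt fixedIndices "abcdefgh" 3 evaluates to "fag", because the indices 5, 0, and 6 point at f, a, and g. A negative count produces the empty list, since genericTake returns nothing in that case.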

I could inline indices in the return expression, like this:

rndGenSelect :: (RandomGen g, Integral i) => g -> [a] -> i -> [a]
rndGenSelect rnd xs count =
  fmap (xs !!) $ genericTake count $ randomRs (0, length xs - 1) rnd

I find that more obscure than the first alternative, but both versions pass the properties and do what they're supposed to do.

Regression testing #

How do I know that my code works? Well, that's always difficult with code that contains randomness, but you can load the function into GHCi and perform some sanity testing:

λ> rndGenSelect (mkStdGen 42) "foo" 3
"ofo"
λ> rndGenSelect (mkStdGen 1337) "bar" 10
"rabbaarrra"
λ> rndGenSelect (mkStdGen (-197221)) ['a'..'z'] 5
"ntfnc"

That looks, I suppose, random enough... What's more important is that this is completely repeatable. This means that I can write parametrised tests that protect against regressions:

"rndGenSelect of chars returns correct result" ~: do
  (seed, xs, count, expected) <-
    [
      (     42,      "foo",  3,        "ofo"),
      (   1337,      "bar", 10, "rabbaarrra"),
      (-197221, ['a'..'z'],  5,      "ntfnc")
    ]
  let rnd = mkStdGen seed

  let actual = rndGenSelect rnd xs count

  return $ expected ~=? actual

These tests don't drive the design, but they prevent regressions. If, at a later time, I, or someone else, inadvertently revert rndGenSelect to genericReplicate count (head xs), these tests will fail.
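
If the shape of that parametrised test looks unfamiliar, it may help to know that the do block runs in the list monad: it produces one test per tuple, and HUnit treats a list of tests as a test in its own right. Here's the same idea reduced to base, with take as a mere stand-in for the function under test and results as a hypothetical name:

```haskell
-- Each tuple is one test case: input list, count, expected result.
-- 'take' is only a stand-in for the function under test.
results :: [(String, Bool)]
results = do
  (xs, count, expected) <-
    [ ("abc", 2, "ab")
    , ("abc", 0, "")
    , ("abc", 5, "abc")
    ]
  let actual = take count xs
  -- One labelled outcome per test case.
  return (show (xs, count), expected == actual)
```

Here, results contains three pairs, one for each row in the table, and all (snd) results evaluates to True.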

Humble function #

The original problem statement is to write a function without an explicit RandomGen argument. In the spirit of xUnit Test Patterns' Humble Object pattern, we can now click all our pieces together into a function that does what is required:

rndSelect :: Integral i => [a] -> i -> IO [a]
rndSelect xs count = do
  rnd <- newStdGen
  return $ rndGenSelect rnd xs count

The only thing of interest here is that the function is impure, because it uses newStdGen to produce a random RandomGen value. It then delegates all work to rndGenSelect, which is covered by tests.

As you can see, this function does not exhibit repeatable behaviour:

λ> rndSelect "abcdefgh" 3
"add"
λ> rndSelect "abcdefgh" 3
"daf"

This should, I think, address the original problem statement.

All source code for this article is available on GitHub.

Summary #

The first time I encountered parametric polymorphism was when C# got generics in 2005. Back then it was mostly explained as a mechanism to avoid boxing, although it also seriously reduced the amount of boilerplate code you'd have to write in order to have type-safe collections. For many years, I mostly understood C# generics as a language feature aimed at efficiency and programmer productivity.

It wasn't until I started to program in F#, with its stronger type inference, that it began to dawn on me that parametric polymorphism could also be a design tool. Making a function more generic tightens its contract, so to speak. The more generic a function becomes, the less wriggle room it has. This may sound like a disadvantage to a programmer, but it's a boon to a code reader. When you, as a reader, encounter a parametrically polymorphic function, you already know that there are things that function can't do. Such functions come with invariants, or 'theorems', for free.

Comments

Roman Leventov #

"First, designing with types, I need to implement the function so that it compiles. This is the simplest implementation I could think of:" -- Why wouldn't you just use undefined function?

2019-06-11 18:09 UTC

Mark Seemann #

Roman, thank you for writing. For no particularly good reason, I tend to automatically disregard undefined, since it's basically one kind of bottom in Haskell.

But I admit that that's not a bulletproof reason, since the actual function, at that point in the article, is still partial.

So honestly, the best answer is that I wrote "the simplest implementation I could think of" because I didn't think of undefined.

2019-06-11 18:44 UTC

Published: Monday, 09 July 2018 07:03:00 UTC

Typing and testing problem 23 by Mark Seemann