Test Data Generator modelled as an applicative functor, with examples in C# and F#.

This article is an instalment in an article series about applicative functors. It shows yet another example of an applicative functor. I tend to think of applicative functors as an abstraction of 'combinations' of values and functions, but this is most evident for lists and other collections. Even so, I think that the intuition also holds for Maybe as an applicative functor as well as validation as an applicative functor. This is also the case for the Test Data Generator functor.

Applicative generator in C#

In a previous article, you saw how to implement a Test Data Generator as a functor in C#. The core class is this Generator<T> class:

public class Generator<T>
{
    private readonly Func<Random, T> generate;
 
    public Generator(Func<Random, T> generate)
    {
        if (generate == null)
            throw new ArgumentNullException(nameof(generate));
 
        this.generate = generate;
    }
 
    public Generator<T1> Select<T1>(Func<T, T1> f)
    {
        if (f == null)
            throw new ArgumentNullException(nameof(f));
 
        Func<Random, T1> newGenerator = r => f(this.generate(r));
        return new Generator<T1>(newGenerator);
    }
 
    public T Generate(Random random)
    {
        if (random == null)
            throw new ArgumentNullException(nameof(random));
 
        return this.generate(random);
    }
}

This is merely repetition from the earlier article.

Generator<T> is already a functor, but you can make it an applicative functor by adding an extension method like this:

public static Generator<TResult> Apply<T, TResult>(
    this Generator<Func<T, TResult>> selectors,
    Generator<T> generator)
{
    if (selectors == null)
        throw new ArgumentNullException(nameof(selectors));
    if (generator == null)
        throw new ArgumentNullException(nameof(generator));
 
    Func<Random, TResult> newGenerator = r =>
    {
        var f = selectors.Generate(r);
        var x = generator.Generate(r);
        return f(x);
    };
    return new Generator<TResult>(newGenerator);
}

The Apply function combines a generator of functions with a generator of values. Given these two generators, it defines a closure over both, and packages that function in a new generator object.
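Before turning to a realistic example, a minimal sketch may help to show what Apply does. It combines a generator that picks one of two functions (keep a number or negate it) with a generator of the digits 1 to 9. The names SignedDigits, keep, and negate are assumptions made for illustration, as is the choice of Gen as the static class that hosts Apply. Generator<T> is restated compactly, without null checks, only so that the block compiles on its own.

```csharp
using System;

// Compact restatement of Generator<T>; null checks omitted for brevity.
public class Generator<T>
{
    private readonly Func<Random, T> generate;

    public Generator(Func<Random, T> generate)
    {
        this.generate = generate;
    }

    public T Generate(Random random)
    {
        return this.generate(random);
    }
}

public static class Gen
{
    // Same behaviour as the Apply method: draw a function, draw a value,
    // and return the function applied to the value.
    public static Generator<TResult> Apply<T, TResult>(
        this Generator<Func<T, TResult>> selectors,
        Generator<T> generator)
    {
        return new Generator<TResult>(r =>
        {
            var f = selectors.Generate(r);
            var x = generator.Generate(r);
            return f(x);
        });
    }
}

public static class Program
{
    // Hypothetical example: a randomly chosen sign applied to a random digit.
    public static Generator<int> SignedDigits()
    {
        Func<int, int> keep = i => i;
        Func<int, int> negate = i => -i;
        var signs = new Generator<Func<int, int>>(
            r => r.Next(2) == 0 ? keep : negate);
        var digits = new Generator<int>(r => r.Next(1, 10));

        return signs.Apply(digits);
    }

    public static void Main()
    {
        var rnd = new Random();
        var g = SignedDigits();
        // Each value lies between -9 and 9, and is never 0.
        for (var i = 0; i < 5; i++)
            Console.WriteLine(g.Generate(rnd));
    }
}
```

Both the function and the value are drawn from the same Random object, so every call to Generate yields a fresh combination of the two.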

CPR example

When is it interesting to combine a randomly selected function with a randomly generated value? Here's an example.

In Denmark, everyone has a personal identification number, in Danish called CPR-nummer (CPR number). It's somewhat comparable to the U.S. Social Security number, but surely more Orwellian.

CPR numbers have a simple format: DDMMYY-SSSS, where the first six digits indicate a person's birth date, and the last four digits are a sequence number. An example could be 010203-1234, which indicates a woman born February 1, 1903. Assume that you're writing a system that has to accept CPR numbers as input. You've represented CPR number as a class called CprNumber, and you've already written a parser, but now it turns out that sometimes users enter slightly malformed data.

Sometimes users copy and paste CPR numbers from other sources, and occasionally, they inadvertently include a leading or trailing space. Other users forget the dash between the birth date and the sequence number. In other words, sometimes you receive input like "050301-4231 " or "1211993321".

Using your fancy Test Data Generator, you'd like to write a test that verifies that your parser follows Postel's law: Be conservative in what you send, be liberal in what you accept.
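To make 'liberal in what you accept' concrete, here's a rough sketch of the kind of leniency in question. LooksLikeCprNumber is a hypothetical helper, not the CprNumber parser assumed by this article. It only checks the shape of the input and makes no attempt to validate the birth date. It also removes every dash rather than only the one in the expected position, which is more permissive than a real parser probably ought to be.

```csharp
using System;
using System.Linq;

public static class LenientCpr
{
    // Hypothetical shape check: after trimming surrounding white space and
    // removing dashes, exactly ten ASCII digits must remain.
    public static bool LooksLikeCprNumber(string candidate)
    {
        if (candidate == null)
            return false;

        var normalized = candidate.Trim().Replace("-", "");
        return normalized.Length == 10
            && normalized.All(c => c >= '0' && c <= '9');
    }

    public static void Main()
    {
        Console.WriteLine(LooksLikeCprNumber("050301-4231 ")); // True
        Console.WriteLine(LooksLikeCprNumber("1211993321"));   // True
        Console.WriteLine(LooksLikeCprNumber("010203-123"));   // False
    }
}
```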

Assume that you already have a generator that can produce valid CprNumber objects. One test strategy, then, could be to generate a valid CprNumber object, convert it to a string, slightly taint or mangle that string, and then attempt to parse the tainted string.

There's more than one way to taint a valid CPR number string. You could, for example, define three functions like these:

Func<string, string> addLeadingSpace = s => " " + s;
Func<string, string> addTrailingSpace = s => s + " ";
Func<string, string> removeDash = s => s.Replace("-", "");

The first function adds a leading space to a string, the second adds a trailing space, and the third removes all dashes it finds. Can you make a generator out of those functions?

Yes, you can, with this common general-purpose method:

public static Generator<T> Elements<T>(params T[] alternatives)
{
    if (alternatives == null)
        throw new ArgumentNullException(nameof(alternatives));
 
    return new Generator<T>(r =>
    {
        var index = r.Next(alternatives.Length);
        return alternatives[index];
    });
}

This generator randomly selects a value from an array of values, with equal probability. It's a common method, because you'll find equivalent functions in QuickCheck, FsCheck, and Hedgehog (where it's called item).
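As a usage sketch, the following block draws one of the three tainting functions with Elements and applies it to a fixed, made-up CPR string. TaintOnce and the sample value are assumptions for illustration; Generator<T> and Elements are restated compactly so that the block stands alone.

```csharp
using System;

// Compact restatement of Generator<T>; null checks omitted for brevity.
public class Generator<T>
{
    private readonly Func<Random, T> generate;

    public Generator(Func<Random, T> generate)
    {
        this.generate = generate;
    }

    public T Generate(Random random)
    {
        return this.generate(random);
    }
}

public static class Gen
{
    public static Generator<T> Elements<T>(params T[] alternatives)
    {
        return new Generator<T>(r =>
        {
            var index = r.Next(alternatives.Length);
            return alternatives[index];
        });
    }
}

public static class Program
{
    // Hypothetical helper: picks one tainting function at random and
    // applies it to a fixed sample string.
    public static string TaintOnce(Random rnd)
    {
        Func<string, string> addLeadingSpace = s => " " + s;
        Func<string, string> addTrailingSpace = s => s + " ";
        Func<string, string> removeDash = s => s.Replace("-", "");

        // T is inferred as Func<string, string>.
        var tainters =
            Gen.Elements(addLeadingSpace, addTrailingSpace, removeDash);

        var taint = tainters.Generate(rnd);
        return taint("010203-1234");
    }

    public static void Main()
    {
        var rnd = new Random();
        // Each line is the sample with a leading space, with a trailing
        // space, or without the dash.
        for (var i = 0; i < 5; i++)
            Console.WriteLine("[" + TaintOnce(rnd) + "]");
    }
}
```

Note that Elements, as written, doesn't guard against an empty array, so it's best called with at least one alternative.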

You can write the entire test like this:

[Fact]
public void ParseCorrectlyHandlesTaintedInput()
{
    Func<string, string> addLeadingSpace = s => " " + s;
    Func<string, string> addTrailingSpace = s => s + " ";
    Func<string, string> removeDash = s => s.Replace("-", "");
    Generator<Func<string, string>> tainters =
        Gen.Elements(
            addLeadingSpace,
            addTrailingSpace,
            removeDash,
            s => addLeadingSpace(removeDash(s)),
            s => addTrailingSpace(addLeadingSpace(s)));
    var g = tainters.Apply(Gen.CprNumber.Select(n => n.ToString()));
    var rnd = new Random();
    var candidate = g.Generate(rnd);
 
    CprNumber dummy;
    var actual = CprNumber.TryParse(candidate, out dummy);
 
    Assert.True(actual);
}

You first define the three 'tainting' functions, and then create the tainters generator. Notice that this generator not only contains the three functions, but also combinations of these functions. I didn't include all possible combinations of functions, but in an earlier article, you saw how to do just that.

Gen.CprNumber is a Generator<CprNumber>, while you actually need a Generator<string>. Since Generator<T> is a functor, you can easily map Gen.CprNumber with its Select method.

You now have a Generator<Func<string, string>> (tainters) and a Generator<string>, so you can combine them using Apply. The result g is another Generator<string> that generates 'tainted', but still valid, CPR string representations.

In order to keep the example as simple as possible, the Act and Assert phases of the test only check whether CprNumber.TryParse returns true. A better test would also verify that the resulting CprNumber value is correct, but in order to do that, you need to keep the originally generated CprNumber object around so that you can compare this expected value to the actual value. This is possible, but complicates the code, so I left it out of the example.
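One way to keep the original value around is to map each tainting function to a function that returns the original CprNumber together with its tainted string, and only then call Apply. What follows is a sketch under heavy assumptions, not this article's test code: the CprNumber class, its lenient TryParse, and the Gen.CprNumber generator are simplified stand-ins (ten random digits, no date validation), and RoundTrips and pairers are made-up names. The check runs from Main instead of a test framework so that the block is self-contained.

```csharp
using System;
using System.Linq;

// Stand-in for the CprNumber class: stores ten digits, validates no dates.
public sealed class CprNumber
{
    private readonly string digits;

    public CprNumber(string digits) { this.digits = digits; }

    public override string ToString()
    {
        return digits.Substring(0, 6) + "-" + digits.Substring(6);
    }

    public override bool Equals(object obj)
    {
        var other = obj as CprNumber;
        return other != null && other.digits == digits;
    }

    public override int GetHashCode() { return digits.GetHashCode(); }

    // Lenient stand-in parser: ignores surrounding spaces and dashes.
    public static bool TryParse(string candidate, out CprNumber result)
    {
        var s = (candidate ?? "").Trim().Replace("-", "");
        var ok = s.Length == 10 && s.All(c => c >= '0' && c <= '9');
        result = ok ? new CprNumber(s) : null;
        return ok;
    }
}

public class Generator<T>
{
    private readonly Func<Random, T> generate;

    public Generator(Func<Random, T> generate) { this.generate = generate; }

    public Generator<T1> Select<T1>(Func<T, T1> f)
    {
        return new Generator<T1>(r => f(this.generate(r)));
    }

    public T Generate(Random random) { return this.generate(random); }
}

public static class Gen
{
    public static Generator<TResult> Apply<T, TResult>(
        this Generator<Func<T, TResult>> selectors, Generator<T> generator)
    {
        return new Generator<TResult>(
            r => selectors.Generate(r)(generator.Generate(r)));
    }

    public static Generator<T> Elements<T>(params T[] alternatives)
    {
        return new Generator<T>(r => alternatives[r.Next(alternatives.Length)]);
    }

    // Stand-in generator: ten random digits, not necessarily a real date.
    public static readonly Generator<CprNumber> CprNumber =
        new Generator<CprNumber>(r => new CprNumber(
            string.Concat(Enumerable.Range(0, 10).Select(_ => r.Next(10)))));
}

public static class Program
{
    // True when parsing the tainted string yields the original value.
    public static bool RoundTrips(Random rnd)
    {
        Func<string, string> addLeadingSpace = s => " " + s;
        Func<string, string> addTrailingSpace = s => s + " ";
        Func<string, string> removeDash = s => s.Replace("-", "");
        var tainters =
            Gen.Elements(addLeadingSpace, addTrailingSpace, removeDash);

        // Each tainter becomes a function that pairs the original value
        // with its tainted string representation.
        var pairers =
            tainters.Select<Func<CprNumber, Tuple<CprNumber, string>>>(
                taint => n => Tuple.Create(n, taint(n.ToString())));
        var pair = pairers.Apply(Gen.CprNumber).Generate(rnd);

        CprNumber actual;
        var parsed = CprNumber.TryParse(pair.Item2, out actual);
        return parsed && pair.Item1.Equals(actual);
    }

    public static void Main()
    {
        var rnd = new Random();
        // Prints True with these stand-ins.
        Console.WriteLine(Enumerable.Range(0, 100).All(_ => RoundTrips(rnd)));
    }
}
```

In an xUnit.net test, the final comparison would more naturally be expressed with Assert.Equal on the expected and actual values. The extra Select call is the complication alluded to above: the generator of functions has to change type before Apply can pair it with Gen.CprNumber.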

F# Hedgehog

You can translate the above unit test to F#, using Hedgehog as a property-based testing library. Another option would be FsCheck. Without further ado, I present to you the test:

[<Fact>]
let ``Correctly parses tainted text`` () = Property.check <| property {
    let removeDash (s : string) = s.Replace ("-", "")
    let! candidate =
        Gen.item [
            sprintf " %s"
            sprintf "%s "
            removeDash
            removeDash >> sprintf " %s"
            sprintf " %s " ]
        <*> Gen.map string Gen.cprNumber
    
    let actual = Cpr.tryParse candidate
 
    test <@ actual |> Option.isSome @> }

As I already mentioned in passing, Hedgehog's equivalent to the above Elements method is called Gen.item. Here, you see the same five functions as above passed in a list. Thanks to partial application and type inference, an expression like sprintf " %s" is already a function of the type string -> string, as is removeDash. For the last of the five functions, I found it easier to write (and read) sprintf " %s " instead of sprintf " %s" >> sprintf "%s ".

Equivalent to the C# example, Gen.cprNumber is a Gen<CprNumber>, so mapping it with the built-in string function translates it to a Gen<string>.

Hedgehog already includes an <*> operator; Gen is an applicative functor.

Summary

Applicative functors are fairly common. I find it intuitive to think of them as an abstraction of how to make combinations of things. Modelling a Test Data Generator as an applicative functor enables you to create random combinations of functions and values.

While working on the Hedgehog example, I discovered another great use of option as an applicative functor. You can see this in the next article.

Next: Danish CPR numbers in F#.




Published

Monday, 26 November 2018 07:24:00 UTC
