An F# demo of validation with partial data round trip

An F# port of the previous Haskell proof of concept.

This article is part of a short article series on applicative validation with a twist. The twist is that validation, when it fails, should return not only a list of error messages; it should also retain that part of the input that was valid.

In the previous article you saw a Haskell proof of concept that demonstrated how to compose the appropriate applicative functor with a suitable semigroup to make validation work as desired. In this article, you'll see how to port that proof of concept to F#.

Data definitions #

Like in the previous article, we're going to need some types. These are essentially direct translations of the corresponding Haskell types:

type Input = { Name : string option; DoB : DateTime option; Address : string option}
type ValidInput = { Name : string; DoB : DateTime; Address : string }

The Input type plays the role of the input we'd like to validate, while ValidInput presents validated data.

If you're an F# fan, you can bask in the reality that F# records are terser than Haskell records. I like both languages, so I have mixed feelings about this.

Computation expression #

Haskell's main workhorse is its type class system. F# doesn't have that, but it has computation expressions, which in F# 5 got support for applicative functors. That's just what we need, and it turns out that there isn't a lot of code we have to write to make all of this work.

To recap from the Haskell proof of concept: We need a Result-like container that returns a tuple for errors. One element of the tuple should be a an endomorphism, which forms a monoid (and therefore also a semigroup). The other element should be a list of error messages - another monoid. In F# terms we'll write it as (('b -> 'b) * 'c list).

That's a tuple, and since tuples form monoids when their elements do the Error part of Result supports accumulation.

To support an applicative computation expression, we're going to need a a way to merge two results together. This is by far the most complicated piece of code in this article, all six lines of code:

module Result =
    // Result<'a       ,(('b -> 'b) * 'c list)> ->
    // Result<'d       ,(('b -> 'b) * 'c list)> ->
    // Result<('a * 'd),(('b -> 'b) * 'c list)>
    let merge x y =
        match x, y with
        | Ok xres, Ok yres -> Ok (xres, yres)
        | Error (f, e1s), Error (g, e2s)  -> Error (f >> g, e2s @ e1s)
        | Error e, Ok _ -> Error e
        | Ok _, Error e -> Error e

The merge function composes two input results together. The results have Ok types called 'a and 'd, and if they're both Ok values, the return value is an Ok tuple of 'a and 'd.

If one of the results is an Error value, it beats an Ok value. The only moderately complex operations is when both are Error values.

Keep in mind that an Error value in this instance contains a tuple of the type (('b -> 'b) * 'c list). The first element is an endomorphism 'b -> 'b and the other element is a list. The merge function composes the endomorphism f and g by standard function composition (the >> operator), and concatenates the lists with the standard @ list concatenation operator.

Because I'm emulating how the original forum post's code behaves, I'm concatenating the two lists with the rightmost going before the leftmost. It doesn't make any other difference than determining the order of the error list.

With the merge function in place, the computation expression is a simple matter:

type ValidationBuilder () =
    member _.BindReturn (x, f) = Result.map f x
    member _.MergeSources (x, y) = Result.merge x y

The last piece is a ValidationBuilder value:

[<AutoOpen>]
module ComputationExpressions =
    let validation = ValidationBuilder ()

Now, whenever you use the validation computation expression, you get the desired functionality.

Validators #

Before we can compose some validation functions, we'll need to have some validators in place. These are straightforward translations of the Haskell validation functions, starting with the name validator:

// Input -> Result<string,((Input -> Input) * string list)>
let validateName ({ Name = name } : Input) =
    match name with
    | Some n when n.Length > 3 -> Ok n
    | Some _ ->
        Error (
            (fun (args : Input) -> { args with Name = None }),
            ["no bob and toms allowed"])
    | None -> Error (id, ["name is required"])

When the name is too short, the endomorphism resets the Name field to None.

The date-of-birth validation function works the same way:

// DateTime -> Input -> Result<DateTime,((Input -> Input) * string list)>
let validateDoB (now : DateTime) ({ DoB = dob } : Input) =
    match dob with
    | Some d when d > now.AddYears -12 -> Ok d
    | Some _ ->
        Error (
            (fun (args : Input) -> { args with DoB = None }),
            ["get off my lawn"])
    | None -> Error (id, ["dob is required"])

Again, like in the Haskell proof of concept, instead of calling DateTime.Now from within the function, I'm passing now as an argument to keep the function pure.

The address validation concludes the set of validators:

// Input -> Result<string,(('a -> 'a) * string list)>
let validateAddress ({ Address = address }: Input) =
    match address with
    | Some a -> Ok a
    | None -> Error (id, ["add1 is required"])

The inferred endomorphism type here is the more general 'a -> 'a, but it's compatible with Input -> Input.

Composition #

All three functions have compatible Error types, so they ought to compose with the applicative computation expression to produce the desired behaviour:

// DateTime -> Input -> Result<ValidInput,(Input * string list)>
let validateInput now args =
    validation {
        let! name = validateName args
        and! dob = validateDoB now args
        and! address = validateAddress args
        return { Name = name; DoB = dob; Address = address }
    }
    |> Result.mapError (fun (f, msgs) -> f args, msgs)

The validation expression alone produces a Result<ValidInput,((Input -> Input) * string list)> value. To get an Input value in the Error tuple, we need to 'run' the Input -> Input endomorphism. The validateInput function does that by applying the endomorphism f to args when mapping the error with Result.mapError.

Tests #

To test that the validateInput works as intended, I first copied all the code from the original forum post. I then wrote eight characterisation tests against that code to make sure that I could reproduce the desired functionality.

I then wrote a parametrised test against the new function:

[<Theory; ClassData(typeof<ValidationTestCases>)>]
let ``Validation works`` input expected =
    let now = DateTime.Now
    let actual = validateInput now input
    expected =! actual

The ValidationTestCases class is defined like this:

type ValidationTestCases () as this =
    inherit TheoryData<Input, Result<ValidInput, Input * string list>> ()

This class produces a set of test cases, where each test case contains an input value and the expected output. To define the test cases, I copied the eight characterisation tests I'd already produced and adjusted them so that they fit the simpler API of the validateInput function. Here's a few examples:

let eightYearsAgo = DateTime.Now.AddYears -8
do this.Add (
    { Name = Some "Alice"; DoB = Some eightYearsAgo; Address = None },
    Error (
        { Name = Some "Alice"; DoB = Some eightYearsAgo; Address = None },
        ["add1 is required"]))
 
do this.Add (
    { Name = Some "Alice"; DoB = Some eightYearsAgo; Address = Some "x" },
    Ok ({ Name = "Alice"; DoB = eightYearsAgo; Address = "x" }))

The first case expects an Error value because the Input value has no address. The other test case expects an Ok value because all input is fine.

I copied all eight characterisation tests over, so now I have those eight tests, as well as the modified eight tests for the applicative-based API shown here. All sixteen tests pass.

Conclusion #

I find this solution to the problem elegant. It's always satisfying when you can implement what at first glance looks like custom behaviour using universal abstractions.

Besides the aesthetic value, I also believe that this keeps a team more productive. These concepts of monoids, semigroups, applicative functors, and so on, are concepts that you only have to learn once. Once you know them, you'll recognise them when you run into them. This means that there's less code to understand.

An ad-hoc implementation as the original forum post suggested (even though it looked quite decent) always puts the onus on a maintenance developer to read and understand even more one-off infrastructure code.

With an architecture based on universal abstractions and well-documented language features, a functional programmer that knows these things will be able to pick up what's going on without much trouble. Specifically, (s)he will recognise that this is 'just' applicative validation with a twist.

This article is the December 28 entry in the F# Advent Calendar in English 2020.

Next: A C# port of validation with partial round trip.

Published: Monday, 28 December 2020 09:22:00 UTC

An F# demo of validation with partial data round trip by Mark Seemann

Data definitions #

Computation expression #

Validators #

Composition #

Tests #

Conclusion #

Wish to comment?

Published

Tags