Validate input in applicative style for superior readability and composability.

This article is an instalment in an article series about applicative functors. It demonstrates how applicative style can be used to compose small validation functions to a larger validation function in such a way that no validation messages are lost, and the composition remains readable.

All example code in this article is given in Haskell. No F# translation is offered, because Scott Wlaschin has an equivalent example covering input validation in F#.

JSON validation #

In my Pluralsight course about a functional architecture in F#, you can see an example of an on-line restaurant reservation system. I often return to that example scenario, so for regular readers of this blog, it should be known territory. For newcomers, imagine that you've been asked to develop an HTTP-based API that accepts JSON documents containing restaurant reservations. Such a JSON document could look like this:

{
  "date""2017-06-27 18:30:00+02:00",
  "name""Mark Seemann",
  "email""mark@example.com",
  "quantity": 4
}

It contains the date and time of the (requested) reservation, the email address and name of the person making the reservation, as well as the number of people who will be dining. Particularly, notice that the date and time is represented as a string value (specifically, in ISO 8601 format), since JSON has no built-in date and time data type.

In Haskell, you can represent such a JSON document using a type like this:

data ReservationJson = ReservationJson {
  jsonDate :: String,
  jsonQuantity :: Double,
  jsonName :: String,
  jsonEmail :: String }
  deriving (EqShowReadGeneric)

Haskell's strength is in its type system, so you should prefer to model a reservation using a strong type:

data Reservation = Reservation {
  reservationDate :: ZonedTime,
  reservationQuantity :: Int,
  reservationName :: String,
  reservationEmail :: String }
  deriving (ShowRead)

Instead of modelling the date and time as a string, you model it as a ZonedTime value. Additionally, you should model quantity as an integer, since a floating point value doesn't make much sense.

While you can always translate a Reservation value to a ReservationJson value, the converse doesn't hold. There are ReservationJson values that you can't translate to Reservation. Such ReservationJson values are invalid.

You should write code to validate and translate ReservationJson values to Reservation values, if possible.

Specialised validations #

The ReservationJson type is a complex type, because it's composed of multiple (four) elements of different types. You can easily define at least three validation rules that ought to hold:

  1. You should be able to convert the jsonDate value to a ZonedTime value.
  2. jsonQuantity must be a positive integer.
  3. jsonEmail should look believably like an email address.
When you have a complex type where more than one validation rule applies, your code will be most readable and maintainable if you can write each rule as an independent function.

In Haskell, people often use Either for validation, but instead of using Either directly, I'll introduce a specialised Validation type:

newtype Validation e r = Validation (Either e r) deriving (EqShowFunctor)

You'll notice that this is simply a redefinition of Either. Haskell can automatically derive its Functor instance with the DeriveFunctor language extension.

My motivation for introducing a new type is that the way that Either is Applicative is not quite how I'd like it to be. Introducing a newtype enables you to change how a type behaves. More on that later. First, you can implement the three individual validation functions.

Date validation #

If the JSON date value is an ISO 8601-formatted string, then you can parse it as a ZonedTime. In that case, you should return the Right case of Validation. If you can't parse the string into a ZonedTime value, you should return a Left value containing a helpful error message.

validateDate :: String -> Validation [String] ZonedTime
validateDate candidate =
  case readMaybe candidate of
    Just d  -> Validation $ Right d
    Nothing -> Validation $ Left ["Not a date."]

This function uses readMaybe from Text.Read to attempt to parse the candidate String. When readMaybe can read the String value, it returns a Just value with the parsed value inside; otherwise, it returns Nothing. The function pattern-matches on those two cases and returns the appropriate value in each case.

Notice that errors are represented as a list of String values, although this particular function only returns a single message in its list of error messages. The reason for that is that you should be able to collect multiple validation issues for a complex value such as ReservationJson, and keeping track of errors in a list makes that possible.

Haskell golfers may argue that this implementation is overly verbose, and it could, for instance, instead be written as:

validateDate = Validation . maybe (Left ["Not a date."]) Right . readMaybe

which is true, but not as readable. Both versions get the job done, though, as these GCHi-based ad-hoc tests demonstrate:

λ> validateDate "2017/27/06 18:30:00 UTC+2"
Validation (Left ["Not a date."])
λ> validateDate "2017-06-27 18:30:00+02:00"
Validation (Right 2017-06-27 18:30:00 +0200)

That takes care of parsing dates. On to the next validation function.

Quantity validation #

JSON numbers aren't guaranteed to be integers, so it's possible that even a well-formed Reservation JSON document could contain a quantity property of 9.7, -11.9463, or similar. When handling restaurant reservations, however, it only makes sense to handle positive integers. Even 0 is useless in this context. Thus, validation must check for two conditions, so in principle, you could write two separate functions for that. In order to keep the example simple, though, I've included both tests in the same function:

validateQuantity :: Double -> Validation [String] Int
validateQuantity candidate =
  if isInt candidate && candidate > 0
    then Validation $ Right $ round candidate
    else Validation $ Left ["Not a positive integer."]
  where isInt x = x == fromInteger (round x)

If candidate is both an integer, and greater than zero, then validateQuantity returns Right; otherwise, it returns a Left value containing an error message. Like validateDate, you can easily test validateQuantity in GHCi:

λ> validateQuantity 4
Validation (Right 4)
λ> validateQuantity (-1)
Validation (Left ["Not a positive integer."])
λ> validateQuantity 2.32
Validation (Left ["Not a positive integer."])

Perhaps you can think of rules for names, but I can't, so we'll leave the name be and move on to validating email addresses.

Email validation #

It's notoriously difficult to validate SMTP addresses, so you shouldn't even try. It seems fairly safe to assume, however, that an email address must contain at least one @ character, so that's going to be all the validation you have to implement:

validateEmail :: String -> Validation [String] String
validateEmail candidate =
  if '@' `elem` candidate
    then Validation $ Right candidate
    else Validation $ Left ["Not an email address."]

Straightforward. Try it out in GHCI:

λ> validateEmail "foo"
Validation (Left ["Not an email address."])
λ> validateEmail "foo@example.org"
Validation (Right "foo@example.org")

Indeed, that works.

Applicative composition #

What you really should be doing is to validate a ReservationJson value. You have the three validation rules implemented, so now you have to compose them. There is, however, a catch: you must evaluate all rules, and return a list of all the errors you encountered. That's probably going to be a better user experience for a user.

That's the reason you can't use Either. While it's Applicative, it doesn't behave like you'd like it to behave in this scenario. Particularly, the problem is that it throws away all but the first Left value it finds:

λ> Right (,,) <*> Right 42 <*> Left "foo" <*> Left "bar"
Left "foo"

Notice how Left "bar" is ignored.

With the new type Validation based on Either, you can now define how it behaves as an applicative functor:

instance Monoid m => Applicative (Validation m) where
  pure = Validation . pure
  Validation (Left x) <*> Validation (Left y) = Validation (Left (mappend x y))
  Validation f <*> Validation r = Validation (f <*> r)

This instance is restricted to Monoid Left types. It has special behaviour for the case where both expressions passed to <*> are Left values. In that case, it uses mappend (from Monoid) to 'add' the two Left values together in a new Left value.

For all other cases, this instance of Applicative delegates to the behaviour defined for Either. It also uses pure from Either to implement its own pure function.

Lists ([]) form a monoid, and since all the above validation functions return lists of errors, it means that you can compose them using this definition of Applicative:

validateReservation :: ReservationJson -> Validation [String] Reservation
validateReservation candidate =
  pure Reservation <*> vDate <*> vQuantity <*> vName <*> vEmail
  where
    vDate     = validateDate     $ jsonDate     candidate
    vQuantity = validateQuantity $ jsonQuantity candidate
    vName     = pure             $ jsonName     candidate
    vEmail    = validateEmail    $ jsonEmail    candidate

The candidate is a ReservationJson value, but each of the validation functions work on either String or Double, so you'll have to use the ReservationJson type's access functions (jsonDate, jsonQuantity, and so on) to pull the relevant values out of it. Once you have those, you can pass them as arguments to the appropriate validation function.

Since there's no rule for jsonName, you can use pure to create a Validation value. All four resulting values (vDate, vQuantity, vName, and vEmail) are Validation [String] values; only their Right types differ.

The Reservation record constructor is a function of the type ZonedTime -> Int -> String -> String -> Reservation, so when you arrange the four v* values correctly between the <*> operator, you have the desired composition.

Try it in GHCi:

λ> validateReservation $ ReservationJson "2017-06-30 19:00:00+02:00" 4 "Jane Doe" "j@example.com"
Validation (Right (Reservation {
    reservationDate = 2017-06-30 19:00:00 +0200,
    reservationQuantity = 4,
    reservationName = "Jane Doe",
    reservationEmail = "j@example.com"}))

λ> validateReservation $ ReservationJson "2017/14/12 6pm" 4.1 "Jane Doe" "jane.example.com"
Validation (Left ["Not a date.","Not a positive integer.","Not an email address."])

λ> validateReservation $ ReservationJson "2017-06-30 19:00:00+02:00" (-3) "Jane Doe" "j@example.com"
Validation (Left ["Not a positive integer."])

The first ReservationJson value passed to validateReservation is valid, so the return value is a Right value.

The next ReservationJson value is about as wrong as it can be, so three different error messages are returned in a Left value. This demonstrates that Validation doesn't give up the first time it encounters a Left value, but rather collects them all.

The third example demonstrates that even a single invalid value (in this case a negative quantity) is enough to make the entire input invalid, but as expected, there's only a single error message.

Summary #

Validation may be the poster child of applicative functors, but it is a convenient way to solve the problem. In this article you saw how to validate a complex data type, collecting and reporting on all problems, if any.

In order to collect all errors, instead of immediately short-circuiting on the first error, you have to deviate from the standard Either implementation of <*>. If you go back to read Scott Wlaschin's article, you should be aware that it specifically implements its applicative functor in that way, instead of the normal behaviour of Either.

More applicative functors exist. This article series has, I think, room for more examples.

Next: The Test Data Generator applicative functor.



Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Monday, 05 November 2018 07:05:00 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!
Published: Monday, 05 November 2018 07:05:00 UTC