Serializing restaurant tables in Haskell

Using Aeson, with and without generics.

This article is part of a short series of articles about serialization with and without Reflection. In this instalment I'll explore some options for serializing JSON using Aeson.

The source code is available on GitHub.

Natural numbers #

Before we start investigating how to serialize to and from JSON, we must have something to serialize. As described in the introductory article we'd like to parse and write restaurant table configurations like this:

{
  "singleTable": {
    "capacity": 16,
    "minimalReservation": 10
  }
}

On the other hand, I'd like to represent the Domain Model in a way that encapsulates the rules governing the model, making illegal states unrepresentable.

As the first step, we observe that the numbers involved are all natural numbers. While I'm aware that Haskell has built-in Nat type, I choose not to use it here, for a couple of reasons. One is that Nat is intended for type-level programming, and while this might be useful here, I don't want to pull in more exotic language features than are required. Another reason is that, in this domain, I want to model natural numbers as excluding zero (and I honestly don't remember if Nat allows zero, but I think that it does..?).

Another option is to use Peano numbers, but again, for didactic reasons, I'll stick with something a bit more idiomatic.

You can easily introduce a wrapper over, say, Integer, to model natural numbers:

newtype Natural = Natural Integer deriving (Eq, Ord, Show)

This, however, doesn't prevent you from writing Natural (-1), so we need to make this a predicative data type. The first step is to only export the type, but not its data constructor:

module Restaurants (
  Natural,
  -- More exports here...
  ) where

But this makes it impossible for client code to create values of the type, so we need to supply a smart constructor:

tryNatural :: Integer -> Maybe Natural
tryNatural n
  | n < 1 = Nothing
  | otherwise = Just (Natural n)

In this, as well as the other articles in this series, I've chosen to model the potential for errors with Maybe values. I could also have chosen to use Either if I wanted to communicate information along the 'error channel', but sticking with Maybe makes the code a bit simpler. Not so much in Haskell or F#, but once we reach C#, applicative validation becomes complicated.

There's no loss of generality in this decision, since both Maybe and Either are Applicative instances.

With the tryNatural function you can now (attempt to) create Natural values:

ghci> tryNatural (-1)
Nothing
ghci> x = tryNatural 42
ghci> x
Just (Natural 42)

This enables client developers to create Natural values, and due to the type's Ord instance, you can even compare them:

ghci> y = tryNatural 2112
ghci> x < y
True

Even so, there will be cases when you need to extract the underlying Integer from a Natural value. You could supply a normal function for that purpose, but in order to make some of the following code a little more elegant, I chose to do it with pattern synonyms:

{-# COMPLETE N #-}
pattern N :: Integer -> Natural
pattern N i <- Natural i

That needs to be exported as well.

So, eight lines of code to declare a predicative type that models a natural number. Incidentally, this'll be 2-3 lines of code in F#.

Domain Model #

Modelling a restaurant table follows in the same vein. One invariant I would like to enforce is that for a 'single' table, the minimal reservation should be a Natural number less than or equal to the table's capacity. It doesn't make sense to configure a table for four with a minimum reservation of six.

In the same spirit as above, then, define this type:

data SingleTable = SingleTable
  { singleCapacity :: Natural
  , minimalReservation :: Natural
  } deriving (Eq, Ord, Show)

Again, only export the type, but not its data constructor. In order to extract values, then, supply another pattern synonym:

{-# COMPLETE SingleT #-}
pattern SingleT :: Natural -> Natural -> SingleTable
pattern SingleT c m <- SingleTable c m

Finally, define a Table type and two smart constructors:

data Table = Single SingleTable | Communal Natural deriving (Eq, Show)
 
trySingleTable :: Integer -> Integer -> Maybe Table
trySingleTable capacity minimal = do
  c <- tryNatural capacity
  m <- tryNatural minimal
  if c < m then Nothing else Just (Single (SingleTable c m))
 
tryCommunalTable :: Integer -> Maybe Table
tryCommunalTable = fmap Communal . tryNatural

Notice that trySingleTable checks the invariant that the capacity must be greater than or equal to the minimal reservation.

The point of this little exercise, so far, is that it encapsulates the contract implied by the Domain Model. It does this by using the static type system to its advantage.

JSON serialization by hand #

At the boundaries of applications, however, there are no static types. Is the static type system still useful in that situation?

For Haskell, the most common JSON library is Aeson, and I admit that I'm no expert. Thus, it's possible that there's an easier way to serialize to and deserialize from JSON. If so, please leave a comment explaining the alternative.

The original rationale for this article series was to demonstrate how serialization can be done without Reflection, or, in the case of Haskell, Generics (not to be confused with .NET generics, which in Haskell usually is called parametric polymorphism). We'll return to Generics later in this article.

In this article series, I consider the JSON format fixed. A single table should be rendered as shown above, and a communal table should be rendered like this:

{ "communalTable": { "capacity": 42 } }

Often in the real world you'll have to conform to a particular protocol format, or, even if that's not the case, being able to control the shape of the wire format is important to deal with backwards compatibility.

As I outlined in the introduction article you can usually find a more weakly typed API to get the job done. For serializing Table to JSON it looks like this:

newtype JSONTable = JSONTable Table deriving (Eq, Show)
 
instance ToJSON JSONTable where
  toJSON (JSONTable (Single (SingleT (N c) (N m)))) =
    object ["singleTable" .= object [
      "capacity" .= c,
      "minimalReservation" .= m]]
  toJSON (JSONTable (Communal (N c))) =
    object ["communalTable" .= object ["capacity" .= c]]

In order to separate concerns, I've defined this functionality in a new module that references the module that defines the Domain Model. Thus, to avoid orphan instances, I've defined a JSONTable newtype wrapper that I then make a ToJSON instance.

The toJSON function pattern-matches on Single and Communal to write two different Values, using Aeson's underlying Document Object Model (DOM).

JSON deserialization by hand #

You can also go the other way, and when it looks more complicated, it's because it is. When serializing an encapsulated value, not a lot can go wrong because the value is already valid. When deserializing a JSON string, on the other hand, all sorts of things can go wrong: It might not even be a valid string, or the string may not be valid JSON, or the JSON may not be a valid Table representation, or the values may be illegal, etc.

It's no surprise, then, that the FromJSON instance is bigger:

instance FromJSON JSONTable where
  parseJSON (Object v) = do
    single <- v .:? "singleTable"
    communal <- v .:? "communalTable"
    case (single, communal) of
      (Just s, Nothing) -> do
        capacity <- s .: "capacity"
        minimal <- s .: "minimalReservation"
        case trySingleTable capacity minimal of
          Nothing -> fail "Expected natural numbers."
          Just t -> return $ JSONTable t
      (Nothing, Just c) -> do
        capacity <- c .: "capacity"
        case tryCommunalTable capacity of
          Nothing -> fail "Expected a natural number."
          Just t -> return $ JSONTable t
      _ -> fail "Expected exactly one of singleTable or communalTable."
  parseJSON _ = fail "Expected an object."

I could probably have done this more succinctly if I'd spent even more time on it than I already did, but it gets the job done and demonstrates the point. Instead of relying on run-time Reflection, the FromJSON instance is, unsurprisingly, a parser, composed from Aeson's specialised parser combinator API.

Since both serialisation and deserialization is based on string values, you should write automated tests that verify that the code works.

Apart from module declaration and imports etc. this hand-written JSON capability requires 27 lines of code. Can we do better with static types and Generics?

JSON serialisation based on types #

The intent with the Aeson library is that you define a type (a Data Transfer Object (DTO) if you will), and then let 'compiler magic' do the rest. In Haskell, it's not run-time Reflection, but a compilation technology called Generics. As I understand it, it automatically 'writes' the serialization and parsing code and turns it into machine code as part of normal compilation.

You're supposed to first turn on the

{-# LANGUAGE DeriveGeneric #-}

language pragma and then tell the compiler to automatically derive Generic for the DTO in question. You'll see an example of that shortly.

It's a fairly flexible system that you can tweak in various ways, but if it's possible to do it directly with the above Table type, please leave a comment explaining how. I tried, but couldn't make it work. To be clear, I could make it serializable, but not to the above JSON format. After enough Aeson Whac-A-Mole I decided to change tactics.

In Code That Fits in Your Head I explain how you're usually better off separating the role of serialization from the role of Domain Model. The way to do that is exactly by defining a DTO for serialisation, and let the Domain Model remain exclusively to model the rules of the application. The above Table type plays the latter role, so we need new DTO types.

We may start with the building blocks:

newtype CommunalDTO = CommunalDTO
  { communalCapacity :: Integer
  } deriving (Eq, Show, Generic)

Notice how it declaratively derives Generic, which works because of the DeriveGeneric language pragma.

From here, in principle, all that you need is just a single declaration to make it serializable:

instance ToJSON CommunalDTO

While it does serialize to JSON, it doesn't have the right format:

{ "communalCapacity": 42 }

The property name should be capacity, not communalCapacity. Why did I call the record field communalCapacity instead of capacity? Can't I just fix my CommunalDTO record?

Unfortunately, I can't just do that, because I also need a capacity JSON property for the single-table case, and Haskell isn't happy about duplicated field names in the same module. (This language feature truly is one of the weak points of Haskell.)

Instead, I can tweak the Aeson rules by supplying an Options value to the instance definition:

communalJSONOptions :: Options
communalJSONOptions =
  defaultOptions {
    fieldLabelModifier = \s -> case s of
      "communalCapacity" -> "capacity"
      _ -> s }
 
instance ToJSON CommunalDTO where
  toJSON = genericToJSON communalJSONOptions
  toEncoding = genericToEncoding communalJSONOptions

This instructs the compiler to modify how it generates the serialization code, and the generated JSON fragment is now correct.

We can do the same with the single-table case:

data SingleDTO = SingleDTO
  { singleCapacity :: Integer
  , minimalReservation :: Integer
  } deriving (Eq, Show, Generic)
 
singleJSONOptions :: Options
singleJSONOptions =
  defaultOptions {
    fieldLabelModifier = \s -> case s of
      "singleCapacity" -> "capacity"
      "minimalReservation" -> "minimalReservation"
      _ -> s }
 
instance ToJSON SingleDTO where
  toJSON = genericToJSON singleJSONOptions
  toEncoding = genericToEncoding singleJSONOptions

This takes care of that case, but we still need a container type that will hold either one or the other:

data TableDTO = TableDTO
  { singleTable :: Maybe SingleDTO
  , communalTable :: Maybe CommunalDTO
  } deriving (Eq, Show, Generic)
 
tableJSONOptions :: Options
tableJSONOptions =
  defaultOptions { omitNothingFields = True }
 
instance ToJSON TableDTO where
  toJSON = genericToJSON tableJSONOptions
  toEncoding = genericToEncoding tableJSONOptions

One way to model a sum type with a DTO is to declare both cases as Maybe fields. While it does allow illegal states to be representable (i.e. both kinds of tables defined at the same time, or none of them present) this is only par for the course at the application boundary.

That's quite a bit of infrastructure to stand up, but at least most of it can be reused for parsing.

JSON deserialisation based on types #

To allow parsing of JSON into the above DTO we can make them all FromJSON instances, e.g.:

instance FromJSON CommunalDTO where
  parseJSON = genericParseJSON communalJSONOptions

Notice that you can reuse the same communalJSONOptions used for the ToJSON instance. Repeat that exercise for the two other record types.

That's only half the battle, though, since this only gives you a way to parse and serialize the DTO. What you ultimately want is to persist or dehydrate Table data.

Converting DTO to Domain Model, and vice versa #

As usual, converting a nice, encapsulated value to a more relaxed format is safe and trivial:

toTableDTO :: Table -> TableDTO
toTableDTO (Single (SingleT (N c) (N m))) = TableDTO (Just (SingleDTO c m)) Nothing
toTableDTO (Communal (N c)) = TableDTO Nothing (Just (CommunalDTO c))

Going the other way is fundamentally a parsing exercise:

tryParseTable :: TableDTO -> Maybe Table
tryParseTable (TableDTO (Just (SingleDTO c m)) Nothing) = trySingleTable c m
tryParseTable (TableDTO Nothing (Just (CommunalDTO c))) = tryCommunalTable c
tryParseTable _ = Nothing

Such an operation may fail, so the result is a Maybe Table. It could also have been an Either something Table, if you wanted to return information about errors when things go wrong. It makes the code marginally more complex, but doesn't change the overall thrust of this exploration.

Let's take stock of the type-based alternative. It requires 62 lines of code, distributed over three DTO types, their Options, their ToJSON and FromJSON instances, and finally the two conversions tryParseTable and toTableDTO.

Conclusion #

In this article I've explored two alternatives for converting a well-encapsulated Domain Model to and from JSON. One option is to directly manipulate the DOM. Another option is take a more declarative approach and define types that model the shape of the JSON data, and then leverage type-based automation (here, Generics) to automatically produce the code that parses and writes the JSON.

I've deliberately chosen a Domain Model with some constraints, in order to demonstrate how persisting a non-trivial data model might work. With that setup, writing 'loosely coupled' code directly against the DOM requires 27 lines of code, while 'taking advantage' of type-based automation requires 62 lines of code.

To be fair, the dice don't always land that way. You can't infer a general rule from a single example, and it's possible that I could have done something clever with Aeson to reduce the code. Even so, I think that there's a conclusion to be drawn, and it's this:

Type-based automation (Generics, or run-time Reflection) may seem simple at first glance. Just declare a type and let some automation library do the rest. It may happen, however, that you need to tweak the defaults so much that it would be easier skipping the type-based approach and instead directly manipulating the DOM.

I love static type systems, but I'm also watchful of their limitations. There's likely to be an inflection point where, on the one side, a type-based declarative API is best, while on the other side of that point, a more 'lightweight' approach is better.

The position of such an inflection point will vary from context to context. Just be aware of the possibility, and explore alternatives if things begin to feel awkward.

Next: Serializing restaurant tables in F#.

Published: Monday, 11 December 2023 07:35:00 UTC

Serializing restaurant tables in Haskell by Mark Seemann