Serializing restaurant tables in Haskell by Mark Seemann
Using Aeson, with and without generics.
This article is part of a short series of articles about serialization with and without Reflection. In this instalment I'll explore some options for serializing JSON using Aeson.
The source code is available on GitHub.
Natural numbers
Before we start investigating how to serialize to and from JSON, we must have something to serialize. As described in the introductory article, we'd like to parse and write restaurant table configurations like this:
{ "singleTable": { "capacity": 16, "minimalReservation": 10 } }
On the other hand, I'd like to represent the Domain Model in a way that encapsulates the rules governing the model, making illegal states unrepresentable.
As the first step, we observe that the numbers involved are all natural numbers. While I'm aware that Haskell has a built-in Nat type, I choose not to use it here, for a couple of reasons. One is that Nat is intended for type-level programming, and while this might be useful here, I don't want to pull in more exotic language features than are required. Another reason is that, in this domain, I want to model natural numbers as excluding zero (and I honestly don't remember if Nat allows zero, but I think that it does).
Another option is to use Peano numbers, but again, for didactic reasons, I'll stick with something a bit more idiomatic.
You can easily introduce a wrapper over, say, Integer, to model natural numbers:
newtype Natural = Natural Integer deriving (Eq, Ord, Show)
This, however, doesn't prevent you from writing Natural (-1), so we need to make this a predicative data type. The first step is to only export the type, but not its data constructor:
module Restaurants (
  Natural,
  -- More exports here...
  ) where
But this makes it impossible for client code to create values of the type, so we need to supply a smart constructor:
tryNatural :: Integer -> Maybe Natural
tryNatural n
  | n < 1 = Nothing
  | otherwise = Just (Natural n)
In this, as well as the other articles in this series, I've chosen to model the potential for errors with Maybe values. I could also have chosen to use Either if I wanted to communicate information along the 'error channel', but sticking with Maybe makes the code a bit simpler. Not so much in Haskell or F#, but once we reach C#, applicative validation becomes complicated.
There's no loss of generality in this decision, since both Maybe and Either are Applicative instances.
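To make the point about generality concrete, here is a minimal sketch. The names tryNaturalE and bothNatural are hypothetical and not part of the article's code base; the only claim is that a function written once against Applicative works with both the Maybe-based and an Either-based smart constructor.

```haskell
newtype Natural = Natural Integer deriving (Eq, Ord, Show)

-- The Maybe-based smart constructor from the article.
tryNatural :: Integer -> Maybe Natural
tryNatural n
  | n < 1 = Nothing
  | otherwise = Just (Natural n)

-- Hypothetical Either-based variant that also says why it failed.
tryNaturalE :: Integer -> Either String Natural
tryNaturalE n
  | n < 1 = Left ("Expected a positive number, but got " ++ show n ++ ".")
  | otherwise = Right (Natural n)

-- Hypothetical helper, written once against Applicative, so it works
-- with either of the two smart constructors above.
bothNatural
  :: Applicative f
  => (Integer -> f Natural) -> Integer -> Integer -> f (Natural, Natural)
bothNatural parse a b = (,) <$> parse a <*> parse b
```

With these definitions, bothNatural tryNatural 16 0 is Nothing, while bothNatural tryNaturalE 16 0 is a Left value carrying the message for 0. Keep in mind that the Applicative instance of Either stops at the first Left; it doesn't accumulate errors.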
With the tryNatural function you can now (attempt to) create Natural values:
ghci> tryNatural (-1)
Nothing
ghci> x = tryNatural 42
ghci> x
Just (Natural 42)
This enables client developers to create Natural values, and due to the type's Ord instance, you can even compare them:
ghci> y = tryNatural 2112
ghci> x < y
True
Even so, there will be cases when you need to extract the underlying Integer from a Natural value. You could supply a normal function for that purpose, but in order to make some of the following code a little more elegant, I chose to do it with pattern synonyms:
{-# COMPLETE N #-}
pattern N :: Integer -> Natural
pattern N i <- Natural i
That needs to be exported as well.
So, eight lines of code to declare a predicative type that models a natural number. Incidentally, this'll be 2-3 lines of code in F#.
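As a sketch of how client code might consume the N pattern, consider the following single-file example. The helpers naturalToInteger and totalSeats are hypothetical names introduced only for illustration, and the export list is shown as a comment because the sketch is not split into modules.

```haskell
{-# LANGUAGE PatternSynonyms #-}

-- In the real module, the export list would also mention the pattern,
-- for example:
--   module Restaurants (Natural, tryNatural, pattern N) where

newtype Natural = Natural Integer deriving (Eq, Ord, Show)

tryNatural :: Integer -> Maybe Natural
tryNatural n
  | n < 1 = Nothing
  | otherwise = Just (Natural n)

-- Unidirectional pattern synonym: it can be used for matching,
-- but not for constructing values.
{-# COMPLETE N #-}
pattern N :: Integer -> Natural
pattern N i <- Natural i

-- Hypothetical helper: extract the wrapped Integer by matching on N.
naturalToInteger :: Natural -> Integer
naturalToInteger (N i) = i

-- Hypothetical helper: add up the Integers inside a list of Naturals.
totalSeats :: [Natural] -> Integer
totalSeats ns = sum [i | N i <- ns]
```

Because the synonym is unidirectional, an expression such as N (-1) doesn't compile, so the smart constructor remains the only way for client code to create a value.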
Domain Model
Modelling a restaurant table follows in the same vein. One invariant I would like to enforce is that for a 'single' table, the minimal reservation should be a Natural number less than or equal to the table's capacity. It doesn't make sense to configure a table for four with a minimum reservation of six.
In the same spirit as above, then, define this type:
data SingleTable = SingleTable
  { singleCapacity :: Natural
  , minimalReservation :: Natural
  } deriving (Eq, Ord, Show)
Again, only export the type, but not its data constructor. In order to extract values, then, supply another pattern synonym:
{-# COMPLETE SingleT #-}
pattern SingleT :: Natural -> Natural -> SingleTable
pattern SingleT c m <- SingleTable c m
Finally, define a Table type and two smart constructors:
data Table = Single SingleTable | Communal Natural deriving (Eq, Show)

trySingleTable :: Integer -> Integer -> Maybe Table
trySingleTable capacity minimal = do
  c <- tryNatural capacity
  m <- tryNatural minimal
  if c < m then Nothing else Just (Single (SingleTable c m))

tryCommunalTable :: Integer -> Maybe Table
tryCommunalTable = fmap Communal . tryNatural
Notice that trySingleTable checks the invariant that the capacity must be greater than or equal to the minimal reservation.
The point of this little exercise, so far, is that it encapsulates the contract implied by the Domain Model. It does this by using the static type system to its advantage.
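To see the contract in action, the following sketch restates the types and smart constructors in a single file and pairs a few sample inputs with what the constructors return. The examples list is a hypothetical addition for illustration only.

```haskell
newtype Natural = Natural Integer deriving (Eq, Ord, Show)

data SingleTable = SingleTable
  { singleCapacity :: Natural
  , minimalReservation :: Natural
  } deriving (Eq, Ord, Show)

data Table = Single SingleTable | Communal Natural deriving (Eq, Show)

tryNatural :: Integer -> Maybe Natural
tryNatural n
  | n < 1 = Nothing
  | otherwise = Just (Natural n)

trySingleTable :: Integer -> Integer -> Maybe Table
trySingleTable capacity minimal = do
  c <- tryNatural capacity
  m <- tryNatural minimal
  if c < m then Nothing else Just (Single (SingleTable c m))

tryCommunalTable :: Integer -> Maybe Table
tryCommunalTable = fmap Communal . tryNatural

-- Hypothetical sample inputs, labelled for readability.
examples :: [(String, Maybe Table)]
examples =
  [ ("single 16 10", trySingleTable 16 10)   -- Just: 10 <= 16
  , ("single 4 6", trySingleTable 4 6)       -- Nothing: minimum exceeds capacity
  , ("single 4 0", trySingleTable 4 0)       -- Nothing: 0 is not a Natural here
  , ("communal 42", tryCommunalTable 42)     -- Just
  , ("communal -1", tryCommunalTable (-1))   -- Nothing: negative capacity
  ]
```

Only the first and fourth entries produce a Table; every other combination is rejected before a value of the Domain Model can come into existence.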
JSON serialization by hand
At the boundaries of applications, however, there are no static types. Is the static type system still useful in that situation?
For Haskell, the most common JSON library is Aeson, and I admit that I'm no expert. Thus, it's possible that there's an easier way to serialize to and deserialize from JSON. If so, please leave a comment explaining the alternative.
The original rationale for this article series was to demonstrate how serialization can be done without Reflection, or, in the case of Haskell, Generics (not to be confused with .NET generics, which in Haskell usually is called parametric polymorphism). We'll return to Generics later in this article.
In this article series, I consider the JSON format fixed. A single table should be rendered as shown above, and a communal table should be rendered like this:
{ "communalTable": { "capacity": 42 } }
Often in the real world you'll have to conform to a particular protocol format, or, even if that's not the case, being able to control the shape of the wire format is important to deal with backwards compatibility.
As I outlined in the introduction article, you can usually find a more weakly typed API to get the job done. For serializing Table to JSON it looks like this:
newtype JSONTable = JSONTable Table deriving (Eq, Show)

instance ToJSON JSONTable where
  toJSON (JSONTable (Single (SingleT (N c) (N m)))) =
    object ["singleTable" .= object [
      "capacity" .= c, "minimalReservation" .= m]]
  toJSON (JSONTable (Communal (N c))) =
    object ["communalTable" .= object ["capacity" .= c]]
In order to separate concerns, I've defined this functionality in a new module that references the module that defines the Domain Model. Thus, to avoid orphan instances, I've defined a JSONTable newtype wrapper that I then make a ToJSON instance.
The toJSON function pattern-matches on Single and Communal to write two different Values, using Aeson's underlying Document Object Model (DOM).
JSON deserialization by hand
You can also go the other way, and when it looks more complicated, it's because it is. When serializing an encapsulated value, not a lot can go wrong because the value is already valid. When deserializing a JSON string, on the other hand, all sorts of things can go wrong: It might not even be a valid string, or the string may not be valid JSON, or the JSON may not be a valid Table representation, or the values may be illegal, etc.
It's no surprise, then, that the FromJSON instance is bigger:
instance FromJSON JSONTable where
  parseJSON (Object v) = do
    single <- v .:? "singleTable"
    communal <- v .:? "communalTable"
    case (single, communal) of
      (Just s, Nothing) -> do
        capacity <- s .: "capacity"
        minimal <- s .: "minimalReservation"
        case trySingleTable capacity minimal of
          Nothing -> fail "Expected natural numbers."
          Just t -> return $ JSONTable t
      (Nothing, Just c) -> do
        capacity <- c .: "capacity"
        case tryCommunalTable capacity of
          Nothing -> fail "Expected a natural number."
          Just t -> return $ JSONTable t
      _ -> fail "Expected exactly one of singleTable or communalTable."
  parseJSON _ = fail "Expected an object."
I could probably have done this more succinctly if I'd spent even more time on it than I already did, but it gets the job done and demonstrates the point. Instead of relying on run-time Reflection, the FromJSON instance is, unsurprisingly, a parser, composed from Aeson's specialised parser combinator API.
Since both serialization and deserialization are based on string values, you should write automated tests that verify that the code works.
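As a rough sketch of what such tests might check, the following single-file example defines a round-trip property and one negative case. It assumes the aeson package is available. The names roundTrips and rejectsInvalidTable are hypothetical, and the instances match on the data constructors directly (instead of the pattern synonyms) only because everything lives in one module here.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson

-- Domain Model, condensed into a single file for the sake of the sketch.
newtype Natural = Natural Integer deriving (Eq, Ord, Show)

data SingleTable = SingleTable Natural Natural deriving (Eq, Ord, Show)

data Table = Single SingleTable | Communal Natural deriving (Eq, Show)

tryNatural :: Integer -> Maybe Natural
tryNatural n
  | n < 1 = Nothing
  | otherwise = Just (Natural n)

trySingleTable :: Integer -> Integer -> Maybe Table
trySingleTable capacity minimal = do
  c <- tryNatural capacity
  m <- tryNatural minimal
  if c < m then Nothing else Just (Single (SingleTable c m))

tryCommunalTable :: Integer -> Maybe Table
tryCommunalTable = fmap Communal . tryNatural

newtype JSONTable = JSONTable Table deriving (Eq, Show)

instance ToJSON JSONTable where
  toJSON (JSONTable (Single (SingleTable (Natural c) (Natural m)))) =
    object ["singleTable" .= object [
      "capacity" .= c, "minimalReservation" .= m]]
  toJSON (JSONTable (Communal (Natural c))) =
    object ["communalTable" .= object ["capacity" .= c]]

instance FromJSON JSONTable where
  parseJSON (Object v) = do
    single <- v .:? "singleTable"
    communal <- v .:? "communalTable"
    case (single, communal) of
      (Just s, Nothing) -> do
        capacity <- s .: "capacity"
        minimal <- s .: "minimalReservation"
        case trySingleTable capacity minimal of
          Nothing -> fail "Expected natural numbers."
          Just t -> return $ JSONTable t
      (Nothing, Just c) -> do
        capacity <- c .: "capacity"
        case tryCommunalTable capacity of
          Nothing -> fail "Expected a natural number."
          Just t -> return $ JSONTable t
      _ -> fail "Expected exactly one of singleTable or communalTable."
  parseJSON _ = fail "Expected an object."

-- Property: serializing and then parsing gives back the original value.
roundTrips :: Table -> Bool
roundTrips t = decode (encode (JSONTable t)) == Just (JSONTable t)

-- Negative case: well-formed JSON that violates the invariant
-- (minimal reservation greater than capacity) should not parse.
rejectsInvalidTable :: Bool
rejectsInvalidTable = (decode doc :: Maybe JSONTable) == Nothing
  where
    doc = "{\"singleTable\":{\"capacity\":4,\"minimalReservation\":6}}"
```

In a real test suite, roundTrips would be a natural candidate for a property-based test, with generated capacities and minimal reservations fed through the smart constructors.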
Apart from module declaration and imports etc. this hand-written JSON capability requires 27 lines of code. Can we do better with static types and Generics?
JSON serialization based on types
The intent with the Aeson library is that you define a type (a Data Transfer Object (DTO) if you will), and then let 'compiler magic' do the rest. In Haskell, it's not run-time Reflection, but a compilation technology called Generics. As I understand it, it automatically 'writes' the serialization and parsing code and turns it into machine code as part of normal compilation.
You're supposed to first turn on the {-# LANGUAGE DeriveGeneric #-} language pragma and then tell the compiler to automatically derive Generic for the DTO in question. You'll see an example of that shortly.
It's a fairly flexible system that you can tweak in various ways, but if it's possible to do it directly with the above Table type, please leave a comment explaining how. I tried, but couldn't make it work. To be clear, I could make it serializable, but not to the above JSON format. After enough Aeson Whac-A-Mole I decided to change tactics.
In Code That Fits in Your Head I explain how you're usually better off separating the role of serialization from the role of Domain Model. The way to do that is exactly by defining a DTO for serialization, and letting the Domain Model remain exclusively a model of the rules of the application. The above Table type plays the latter role, so we need new DTO types.
We may start with the building blocks:
newtype CommunalDTO = CommunalDTO
  { communalCapacity :: Integer
  } deriving (Eq, Show, Generic)
Notice how it declaratively derives Generic, which works because of the DeriveGeneric language pragma.
From here, in principle, all that you need is just a single declaration to make it serializable:
instance ToJSON CommunalDTO
While it does serialize to JSON, it doesn't have the right format:
{ "communalCapacity": 42 }
The property name should be capacity, not communalCapacity. Why did I call the record field communalCapacity instead of capacity? Can't I just fix my CommunalDTO record?
Unfortunately, I can't just do that, because I also need a capacity JSON property for the single-table case, and Haskell isn't happy about duplicated field names in the same module. (This language feature truly is one of the weak points of Haskell.)
Instead, I can tweak the Aeson rules by supplying an Options value to the instance definition:
communalJSONOptions :: Options
communalJSONOptions =
  defaultOptions {
    fieldLabelModifier = \s -> case s of
      "communalCapacity" -> "capacity"
      _ -> s }

instance ToJSON CommunalDTO where
  toJSON = genericToJSON communalJSONOptions
  toEncoding = genericToEncoding communalJSONOptions
This instructs the compiler to modify how it generates the serialization code, and the generated JSON fragment is now correct.
We can do the same with the single-table case:
data SingleDTO = SingleDTO
  { singleCapacity :: Integer
  , minimalReservation :: Integer
  } deriving (Eq, Show, Generic)

singleJSONOptions :: Options
singleJSONOptions =
  defaultOptions {
    fieldLabelModifier = \s -> case s of
      "singleCapacity" -> "capacity"
      "minimalReservation" -> "minimalReservation"
      _ -> s }

instance ToJSON SingleDTO where
  toJSON = genericToJSON singleJSONOptions
  toEncoding = genericToEncoding singleJSONOptions
This takes care of that case, but we still need a container type that will hold either one or the other:
data TableDTO = TableDTO
  { singleTable :: Maybe SingleDTO
  , communalTable :: Maybe CommunalDTO
  } deriving (Eq, Show, Generic)

tableJSONOptions :: Options
tableJSONOptions =
  defaultOptions { omitNothingFields = True }

instance ToJSON TableDTO where
  toJSON = genericToJSON tableJSONOptions
  toEncoding = genericToEncoding tableJSONOptions
One way to model a sum type with a DTO is to declare both cases as Maybe fields. While it does allow illegal states to be representable (i.e. both kinds of tables defined at the same time, or none of them present) this is only par for the course at the application boundary.
That's quite a bit of infrastructure to stand up, but at least most of it can be reused for parsing.
JSON deserialization based on types
To allow parsing of JSON into the above DTOs we can make them all FromJSON instances, e.g.:
instance FromJSON CommunalDTO where
  parseJSON = genericParseJSON communalJSONOptions
Notice that you can reuse the same communalJSONOptions used for the ToJSON instance. Repeat that exercise for the two other record types.
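Spelled out in full, the repetition might look like the following sketch. It restates the DTOs and Options values from the previous section so that it compiles on its own, assuming the aeson package is available. The roundTrips helper is a hypothetical addition, not part of the article's code.

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Data.Aeson
import GHC.Generics (Generic)

newtype CommunalDTO = CommunalDTO
  { communalCapacity :: Integer
  } deriving (Eq, Show, Generic)

data SingleDTO = SingleDTO
  { singleCapacity :: Integer
  , minimalReservation :: Integer
  } deriving (Eq, Show, Generic)

data TableDTO = TableDTO
  { singleTable :: Maybe SingleDTO
  , communalTable :: Maybe CommunalDTO
  } deriving (Eq, Show, Generic)

communalJSONOptions :: Options
communalJSONOptions =
  defaultOptions {
    fieldLabelModifier = \s -> case s of
      "communalCapacity" -> "capacity"
      _ -> s }

singleJSONOptions :: Options
singleJSONOptions =
  defaultOptions {
    fieldLabelModifier = \s -> case s of
      "singleCapacity" -> "capacity"
      _ -> s }

tableJSONOptions :: Options
tableJSONOptions =
  defaultOptions { omitNothingFields = True }

-- Each ToJSON/FromJSON pair shares one Options value, so the writer
-- and the parser agree on the property names.
instance ToJSON CommunalDTO where
  toJSON = genericToJSON communalJSONOptions
  toEncoding = genericToEncoding communalJSONOptions

instance FromJSON CommunalDTO where
  parseJSON = genericParseJSON communalJSONOptions

instance ToJSON SingleDTO where
  toJSON = genericToJSON singleJSONOptions
  toEncoding = genericToEncoding singleJSONOptions

instance FromJSON SingleDTO where
  parseJSON = genericParseJSON singleJSONOptions

instance ToJSON TableDTO where
  toJSON = genericToJSON tableJSONOptions
  toEncoding = genericToEncoding tableJSONOptions

instance FromJSON TableDTO where
  parseJSON = genericParseJSON tableJSONOptions

-- Hypothetical helper: a DTO should survive a trip through JSON.
roundTrips :: TableDTO -> Bool
roundTrips dto = decode (encode dto) == Just dto
```

Note that roundTrips only exercises the DTO layer. A TableDTO with both fields set, or with neither, passes through JSON unchanged, so rejecting such values is left to the conversion into the Domain Model.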
That's only half the battle, though, since this only gives you a way to parse and serialize the DTO. What you ultimately want is to persist or dehydrate Table data.
Converting DTO to Domain Model, and vice versa
As usual, converting a nice, encapsulated value to a more relaxed format is safe and trivial:
toTableDTO :: Table -> TableDTO
toTableDTO (Single (SingleT (N c) (N m))) =
  TableDTO (Just (SingleDTO c m)) Nothing
toTableDTO (Communal (N c)) = TableDTO Nothing (Just (CommunalDTO c))
Going the other way is fundamentally a parsing exercise:
tryParseTable :: TableDTO -> Maybe Table
tryParseTable (TableDTO (Just (SingleDTO c m)) Nothing) = trySingleTable c m
tryParseTable (TableDTO Nothing (Just (CommunalDTO c))) = tryCommunalTable c
tryParseTable _ = Nothing
Such an operation may fail, so the result is a Maybe Table. It could also have been an Either something Table, if you wanted to return information about errors when things go wrong. It makes the code marginally more complex, but doesn't change the overall thrust of this exploration.
Let's take stock of the type-based alternative. It requires 62 lines of code, distributed over three DTO types, their Options, their ToJSON and FromJSON instances, and finally the two conversions tryParseTable and toTableDTO.
Conclusion
In this article I've explored two alternatives for converting a well-encapsulated Domain Model to and from JSON. One option is to directly manipulate the DOM. Another option is to take a more declarative approach and define types that model the shape of the JSON data, and then leverage type-based automation (here, Generics) to automatically produce the code that parses and writes the JSON.
I've deliberately chosen a Domain Model with some constraints, in order to demonstrate how persisting a non-trivial data model might work. With that setup, writing 'loosely coupled' code directly against the DOM requires 27 lines of code, while 'taking advantage' of type-based automation requires 62 lines of code.
To be fair, the dice don't always land that way. You can't infer a general rule from a single example, and it's possible that I could have done something clever with Aeson to reduce the code. Even so, I think that there's a conclusion to be drawn, and it's this:
Type-based automation (Generics, or run-time Reflection) may seem simple at first glance. Just declare a type and let some automation library do the rest. It may happen, however, that you need to tweak the defaults so much that it would be easier skipping the type-based approach and instead directly manipulating the DOM.
I love static type systems, but I'm also watchful of their limitations. There's likely to be an inflection point where, on the one side, a type-based declarative API is best, while on the other side of that point, a more 'lightweight' approach is better.
The position of such an inflection point will vary from context to context. Just be aware of the possibility, and explore alternatives if things begin to feel awkward.