Using System.Text.Json, with and without Reflection.

This article is part of a short series of articles about serialization with and without Reflection. In this instalment I'll explore some options for serializing JSON with C# using the API built into .NET: System.Text.Json. I'm not going to use Json.NET in this article, but I've done similar things with that library in the past, so what's here is, at least, somewhat generalizable.

Since the API is the same, the only difference from the previous article is the language syntax.

Natural numbers #

Before we start investigating how to serialize to and from JSON, we must have something to serialize. As described in the introductory article we'd like to parse and write restaurant table configurations like this:

{
  "singleTable": {
    "capacity": 16,
    "minimalReservation": 10
  }
}

On the other hand, I'd like to represent the Domain Model in a way that encapsulates the rules governing the model, making illegal states unrepresentable. Even though that's a catchphrase associated with functional programming, it applies equally well to a statically typed object-oriented language like C#.

As the first step, we observe that the numbers involved are all natural numbers. In C# it's rarer to define predicative data types than in a language like F#, but people should do it more.

public readonly struct NaturalNumber : IEquatable<NaturalNumber>
{
    private readonly int value;
 
    public NaturalNumber(int value)
    {
        if (value < 1)
            throw new ArgumentOutOfRangeException(
                nameof(value),
                "Value must be a natural number greater than zero.");
        this.value = value;
    }
 
    public static NaturalNumber? TryCreate(int candidate)
    {
        if (candidate < 1)
            return null;
        return new NaturalNumber(candidate);
    }
 
    public static bool operator <(NaturalNumber left, NaturalNumber right)
    {
        return left.value < right.value;
    }
 
    public static bool operator >(NaturalNumber left, NaturalNumber right)
    {
        return left.value > right.value;
    }
 
    public static bool operator <=(NaturalNumber left, NaturalNumber right)
    {
        return left.value <= right.value;
    }
 
    public static bool operator >=(NaturalNumber left, NaturalNumber right)
    {
        return left.value >= right.value;
    }
 
    public static bool operator ==(NaturalNumber left, NaturalNumber right)
    {
        return left.value == right.value;
    }
 
    public static bool operator !=(NaturalNumber left, NaturalNumber right)
    {
        return left.value != right.value;
    }
 
    public static explicit operator int(NaturalNumber number)
    {
        return number.value;
    }
 
    public override bool Equals(object? obj)
    {
        return obj is NaturalNumber number && Equals(number);
    }
 
    public bool Equals(NaturalNumber other)
    {
        return value == other.value;
    }
 
    public override int GetHashCode()
    {
        return HashCode.Combine(value);
    }
}

When comparing all that boilerplate code to the three lines required to achieve the same result in F#, it seems, at first glance, understandable that C# developers rarely reach for that option. Still, typing is not a programming bottleneck, and most of that code was generated by a combination of Visual Studio and GitHub Copilot.

The TryCreate method may not be strictly necessary, but I consider it good practice to give client code a way to perform a fault-prone operation in a safe manner, without having to resort to a try/catch construct.
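
To make the difference between the two creation paths concrete, here's a hypothetical test (not part of the article's code base) contrasting them:

[Fact]
public void TryCreateContrastedWithConstructor()
{
    // TryCreate signals failure with null rather than throwing.
    Assert.Null(NaturalNumber.TryCreate(0));
    // The constructor throws on the same invalid input.
    Assert.Throws<ArgumentOutOfRangeException>(() => { _ = new NaturalNumber(0); });
    // Valid input succeeds either way.
    Assert.Equal(new NaturalNumber(4), NaturalNumber.TryCreate(4));
}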

That's it for natural numbers. 72 lines of code. Compare that to the F# implementation, which required three lines of code. Syntax does matter.

Domain Model #

Modelling a restaurant table follows in the same vein. One invariant I would like to enforce is that for a 'single' table, the minimal reservation should be a NaturalNumber less than or equal to the table's capacity. It doesn't make sense to configure a table for four with a minimum reservation of six.

In the same spirit as above, then, define this type:

public readonly struct Table
{
    private readonly NaturalNumber capacity;
    private readonly NaturalNumber? minimalReservation;
 
    private Table(NaturalNumber capacity, NaturalNumber? minimalReservation)
    {
        this.capacity = capacity;
        this.minimalReservation = minimalReservation;
    }
 
    public static Table? TryCreateSingle(int capacity, int minimalReservation)
    {
        var cap = NaturalNumber.TryCreate(capacity);
        if (cap is null)
            return null;
        var min = NaturalNumber.TryCreate(minimalReservation);
        if (min is null)
            return null;
 
        if (cap < min)
            return null;
 
        return new Table(cap.Value, min.Value);
    }
 
    public static Table? TryCreateCommunal(int capacity)
    {
        var cap = NaturalNumber.TryCreate(capacity);
        if (cap is null)
            return null;
 
        return new Table(cap.Value, null);
    }
 
    public T Accept<T>(ITableVisitor<T> visitor)
    {
        if (minimalReservation is null)
            return visitor.VisitCommunal(capacity);
        else
            return visitor.VisitSingle(capacity, minimalReservation.Value);
    }
}

Here I've Visitor-encoded the sum type that Table is. It can either be a 'single' table or a communal table.
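
The ITableVisitor<T> interface itself isn't shown here; judging from the Accept method and the visitors that appear later, its shape would presumably be along these lines (a sketch, not the article's listing):

public interface ITableVisitor<T>
{
    T VisitCommunal(NaturalNumber capacity);
    T VisitSingle(NaturalNumber capacity, NaturalNumber minimalReservation);
}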

Notice that TryCreateSingle checks the invariant that the capacity must be greater than or equal to the minimalReservation.
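
A hypothetical test (not from the article's test suite) makes that invariant concrete:

[Fact]
public void TryCreateSingleRejectsMinimumAboveCapacity()
{
    // A table for four with a minimum reservation of six makes no sense.
    Assert.Null(Table.TryCreateSingle(4, 6));
    // A minimum at or below the capacity is fine.
    Assert.NotNull(Table.TryCreateSingle(4, 3));
}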

The point of this little exercise, so far, is that it encapsulates the contract implied by the Domain Model. It does this by using the static type system to its advantage.

JSON serialization by hand #

At the boundaries of applications, however, there are no static types. Is the static type system still useful in that situation?

For a long time, the most popular .NET library for JSON serialization was Json.NET, but these days I find the built-in API offered in the System.Text.Json namespace adequate. This is also the case here.

The original rationale for this article series was to demonstrate how serialization can be done without Reflection, so I'll start there and return to Reflection later.

In this article series, I consider the JSON format fixed. A single table should be rendered as shown above, and a communal table should be rendered like this:

"communalTable": { "capacity": 42 } }

Often, in the real world, you'll have to conform to a particular protocol format, and even if that's not the case, being able to control the shape of the wire format is important for dealing with backwards compatibility.

As I outlined in the introductory article, you can usually find a more weakly typed API to get the job done. For serializing Table to JSON it looks like this:

public static string Serialize(this Table table)
{
    return table.Accept(new TableVisitor());
}
 
private sealed class TableVisitor : ITableVisitor<string>
{
    public string VisitCommunal(NaturalNumber capacity)
    {
        var j = new JsonObject
        {
            ["communalTable"] = new JsonObject
            {
                ["capacity"] = (int)capacity
            }
        };
        return j.ToJsonString();
    }
 
    public string VisitSingle(NaturalNumber capacity, NaturalNumber value)
    {
        var j = new JsonObject
        {
            ["singleTable"] = new JsonObject
            {
                ["capacity"] = (int)capacity,
                ["minimalReservation"] = (int)value
            }
        };
        return j.ToJsonString();
    }
}

In order to separate concerns, I've defined this functionality in a new static class that references the Domain Model. The Serialize extension method uses a private Visitor to write two different JsonObject objects, using the JSON API's underlying Document Object Model (DOM).
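
A hypothetical test (in the style of the deserialization tests shown later, not taken from the article) illustrates the expected output; JsonObject.ToJsonString emits compact JSON by default:

[Fact]
public void SerializeCommunalTableFor42()
{
    var table = Table.TryCreateCommunal(42);
    var actual = table?.Serialize();
    Assert.Equal("""{"communalTable":{"capacity":42}}""", actual);
}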

JSON deserialization by hand #

You can also go the other way, and when it looks more complicated, it's because it is. When serializing an encapsulated value, not a lot can go wrong because the value is already valid. When deserializing a JSON string, on the other hand, all sorts of things can go wrong: It might not even be a valid string, or the string may not be valid JSON, or the JSON may not be a valid Table representation, or the values may be illegal, etc.

Since there are several values that explicitly must be integers, it makes sense to define a helper method to try to parse an integer:

private static int? TryInt(this JsonNode? node)
{
    if (node is null)
        return null;
    
    if (node.GetValueKind() != JsonValueKind.Number)
        return null;
 
    try
    {
        return (int)node;
    }
    catch (FormatException)
    {
        return null;
    }
}

I'm surprised that there's no built-in way to do that, but if there is, I couldn't find it.

With a helper method like that you can now implement the Deserialize method:

public static Table? Deserialize(string json)
{
    try
    {
        var node = JsonNode.Parse(json);
 
        var cnode = node?["communalTable"];
        if (cnode is { })
        {
            var capacity = cnode["capacity"].TryInt();
            if (capacity is null)
                return null;
            return Table.TryCreateCommunal(capacity.Value);
        }
 
        var snode = node?["singleTable"];
        if (snode is { })
        {
            var capacity = snode["capacity"].TryInt();
            var minimalReservation = snode["minimalReservation"].TryInt();
            if (capacity is null || minimalReservation is null)
                return null;
            return Table.TryCreateSingle(
                capacity.Value,
                minimalReservation.Value);
        }
 
        return null;
    }
    catch (JsonException)
    {
        return null;
    }
}

Since both serialisation and deserialization are based on string values, you should write automated tests that verify that the code works, and in fact, I did. Here are a few examples:

[Fact]
public void DeserializeSingleTableFor4()
{
    var json = """{"singleTable":{"capacity":4,"minimalReservation":3}}""";
    var actual = TableJson.Deserialize(json);
    Assert.Equal(Table.TryCreateSingle(4, 3), actual);
}
 
[Fact]
public void DeserializeNonTable()
{
    var json = """{"foo":42}""";
    var actual = TableJson.Deserialize(json);
    Assert.Null(actual);
}

Apart from using directives and the namespace declaration, this hand-written JSON capability requires 87 lines of code, although, to be fair, TryInt is a general-purpose method that ought to be part of the System.Text.Json API. Can we do better with static types and Reflection?

JSON serialisation based on types #

The static JsonSerializer class comes with Serialize<T> and Deserialize<T> methods that use Reflection to convert a statically typed object to and from JSON. You can define a type (a Data Transfer Object (DTO) if you will) and let Reflection do the hard work.

In Code That Fits in Your Head I explain how you're usually better off separating the role of serialization from the role of the Domain Model. One way to do that is exactly by defining a DTO for serialisation, and letting the Domain Model focus exclusively on modelling the rules of the application. The above Table type plays the latter role, so we need new DTO types:

public sealed class TableDto
{
    public CommunalTableDto? CommunalTable { get; set; }
    public SingleTableDto? SingleTable { get; set; }
}

public sealed class CommunalTableDto
{
    public int Capacity { get; set; }
}

public sealed class SingleTableDto
{
    public int Capacity { get; set; }
    public int MinimalReservation { get; set; }
}

One way to model a sum type with a DTO is to declare both cases as nullable fields. While this does make illegal states representable (i.e. both kinds of table defined at the same time, or neither of them present), that's only par for the course at the application boundary.
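
For instance, nothing prevents client code from populating both cases, as this hypothetical snippet (not from the article) shows:

// Both cases populated: compiles fine, but has no Domain Model counterpart.
var bogus = new TableDto
{
    CommunalTable = new CommunalTableDto { Capacity = 12 },
    SingleTable = new SingleTableDto { Capacity = 4, MinimalReservation = 2 }
};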

While you can serialize values of that type, by default the generated JSON doesn't have the right format. Instead, a serialized communal table looks like this:

{
  "CommunalTable": { "Capacity": 42 },
  "SingleTable"null
}

There are two problems with the generated JSON document:

  • The casing is wrong
  • The null value shouldn't be there

Neither of those problems is too hard to address, but it does make the API a bit more awkward to use, as this test demonstrates:

[Fact]
public void SerializeCommunalTableViaReflection()
{
    var dto = new TableDto
    {
        CommunalTable = new CommunalTableDto { Capacity = 42 }
    };
 
    var actual = JsonSerializer.Serialize(
        dto,
        new JsonSerializerOptions
        {
            PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
            DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull
        });
 
    Assert.Equal("""{"communalTable":{"capacity":42}}""", actual);
}

You can, of course, define this particular serialization behaviour as a reusable method, so it's not a problem that you can't address. I just wanted to include this, since it's part of the overall work that you have to do in order to make this work.
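
As a sketch of what such a reusable method might look like (placed in a static class; the names are mine, not the article's):

private static readonly JsonSerializerOptions options = new()
{
    PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
    DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull
};

// Hypothetical convenience method that bakes in the configuration from the test above.
public static string Serialize(this TableDto dto)
{
    return JsonSerializer.Serialize(dto, options);
}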

JSON deserialisation based on types #

To allow parsing of JSON into the above DTO, the Reflection-based Deserialize method pretty much works out of the box, although, again, it needs to be configured. Here's a passing test that demonstrates how that works:

[Fact]
public void DeserializeSingleTableViaReflection()
{
    var json = """{"singleTable":{"capacity":4,"minimalReservation":2}}""";
 
    var actual = JsonSerializer.Deserialize<TableDto>(
        json,
        new JsonSerializerOptions { PropertyNamingPolicy = JsonNamingPolicy.CamelCase });
 
    Assert.Null(actual?.CommunalTable);
    Assert.Equal(4, actual?.SingleTable?.Capacity);
    Assert.Equal(2, actual?.SingleTable?.MinimalReservation);
}

The only difference is the casing, so you'd expect the Deserialize method to be a Tolerant Reader, but no. It's very particular about that, so the JsonNamingPolicy.CamelCase configuration is necessary. Perhaps the API designers found that explicit is better than implicit.

In any case, you could package that in a reusable Deserialize function that has all the options that are appropriate in a particular code context, so not a big deal. That takes care of actually writing and parsing JSON, but that's only half the battle. This only gives you a way to parse and serialize the DTO. What you ultimately want is to persist or dehydrate Table data.

Converting DTO to Domain Model, and vice versa #

As usual, converting a nice, encapsulated value to a more relaxed format is safe and trivial:

public static TableDto ToDto(this Table table)
{
    return table.Accept(new TableDtoVisitor());
}
 
private sealed class TableDtoVisitor : ITableVisitor<TableDto>
{
    public TableDto VisitCommunal(NaturalNumber capacity)
    {
        return new TableDto
        {
            CommunalTable = new CommunalTableDto
            {
                Capacity = (int)capacity
            }
        };
    }
 
    public TableDto VisitSingle(
        NaturalNumber capacity,
        NaturalNumber value)
    {
        return new TableDto
        {
            SingleTable = new SingleTableDto
            {
                Capacity = (int)capacity,
                MinimalReservation = (int)value
            }
        };
    }
}

Going the other way is fundamentally a parsing exercise:

public Table? TryParse()
{
    if (CommunalTable is { })
        return Table.TryCreateCommunal(CommunalTable.Capacity);
    if (SingleTable is { })
        return Table.TryCreateSingle(
            SingleTable.Capacity,
            SingleTable.MinimalReservation);
 
    return null;
}

Here, like in Code That Fits in Your Head, I've made that conversion an instance method on TableDto.

Such an operation may fail, so the result is a nullable Table object.
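
To see how the parts compose, here's a hypothetical round-trip test (not from the article) that goes from JSON via the DTO to the Domain Model:

[Fact]
public void RoundTripSingleTableViaDto()
{
    var json = """{"singleTable":{"capacity":4,"minimalReservation":3}}""";

    // Reflection-based deserialization to the DTO, followed by explicit parsing.
    var dto = JsonSerializer.Deserialize<TableDto>(
        json,
        new JsonSerializerOptions { PropertyNamingPolicy = JsonNamingPolicy.CamelCase });
    var actual = dto?.TryParse();

    Assert.Equal(Table.TryCreateSingle(4, 3), actual);
}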

Let's take stock of the type-based alternative. It requires 58 lines of code, distributed over three DTO types and the two conversions ToDto and TryParse, but here I haven't counted the configuration of Serialize and Deserialize, since I left that to each test case that I wrote. Since all of this code generally stays within 80 characters in line width, that would realistically add another 10 lines of code, for a total of around 68 lines.

This is smaller than the DOM-based code, but not by much.

Conclusion #

In this article I've explored two alternatives for converting a well-encapsulated Domain Model to and from JSON. One option is to directly manipulate the DOM. Another option is to take a more declarative approach and define types that model the shape of the JSON data, and then leverage type-based automation (here, Reflection) to automatically parse and write the JSON.

I've deliberately chosen a Domain Model with some constraints, in order to demonstrate how persisting a non-trivial data model might work. With that setup, writing 'loosely coupled' code directly against the DOM requires 87 lines of code, while taking advantage of type-based automation requires 68 lines of code. Again, Reflection seems 'easier' if you count lines of code, but the difference is marginal.


Comments

Great piece as ever Mark. Always enjoy reading about alternatives to methods that have become unquestioned convention.

I generally try to avoid reflection, especially within business code, and mainly use it for application bootstrapping, such as to discover services for dependency injection by convention. I also don't like attributes muddying model definitions, even on DTOs, so I would happily take an alternative to System.Text.Json. It is however increasingly integrated into other System libraries in ways that make it almost too useful to pass up. For example, the System.Net.Http.HttpContent class has the ReadFromJsonAsync extension method, which makes it trivial to deserialize a response body. Analogous methods exist for BinaryData. I'm not normally a sucker for convenience, but it is difficult to turn down strong integration like this.

2024-01-05 21:13 UTC

Callum, thank you for writing. You are correct that the people who design and develop .NET put a lot of effort into making things convenient. Some of that convenience, however, comes with a price. You have to buy into a certain way of doing things, and that certain way can sometimes be at odds with other good software practices, such as the Dependency Inversion Principle or test-driven development.

My goal with this (and other) article(s) isn't, however, to say that you mustn't take advantage of convenient integrations, but rather to highlight that alternatives exist.

The many 'convenient' ways that a framework gives you to solve various problems come with the risk that you may paint yourself into a corner, if you aren't careful. You've invested heavily in the framework's way of doing things, but there's just this small edge case that you can't get right. So you write a bit of custom code, after having figured out the correct extensibility point to hook into. Until the framework changes 'how things are done' in the next iteration.

This is what I call Framework Whac-A-Mole - a syndrome that I'm becoming increasingly wary of the more experience I gain. Of the examples linked to in that article, ASP.NET validation revisited may be the most relevant to this discussion.

As a final note, I'd be remiss if I entered into a discussion about programmer convenience without drawing on Rich Hickey's excellent presentation Simple Made Easy, where he goes to great lengths to distinguish between what is easy (i.e. close at hand) and what is simple (i.e. not complex). The sweet spot, of course, is the intersection, where things are both simple and easy.

Most 'convenient' framework features do not, in my opinion, check that box.

2024-01-10 13:37 UTC


Published

Monday, 25 December 2023 11:42:00 UTC
