Refactoring a saga from the State pattern to the State monad

A slightly less unrealistic example in C#.

This article is one of the examples that I promised in the earlier article The State pattern and the State monad. That article examines the relationship between the State design pattern and the State monad. It's deliberately abstract, so one or more examples are in order.

In the previous example you saw how to refactor Design Patterns' TCP connection example. That example is, unfortunately, hardly illuminating due to its nature, so a second example is warranted.

This second example shows how to refactor a stateful asynchronous message handler from the State pattern to the State monad.

Shipping policy

Instead of inventing an example from scratch, I decided to use an NServiceBus saga tutorial as a foundation. Read on even if you don't know NServiceBus; you don't have to know anything about it in order to follow along. I just thought that I'd embed the example code in a context that actually executes and does something, instead of faking it with a bunch of unit tests. Hopefully this will help make the example a bit more realistic and relatable.

The example is a simple demo of asynchronous message handling. In a web store shipping department, you should only ship an item once you've received the order and a billing confirmation. When working with asynchronous messaging, you can't, however, rely on message ordering, so perhaps the OrderBilled message arrives before the OrderPlaced message, and sometimes it's the other way around.

Shipping policy state diagram.

Only when you've received both messages may you ship the item.

It's a simple workflow, and you don't really need the State pattern. So much is clear from the sample code implementation:

public class ShippingPolicy : Saga<ShippingPolicyData>,
    IAmStartedByMessages<OrderBilled>,
    IAmStartedByMessages<OrderPlaced>
{
    static ILog log = LogManager.GetLogger<ShippingPolicy>();
 
    protected override void ConfigureHowToFindSaga(SagaPropertyMapper<ShippingPolicyData> mapper)
    {
        mapper.MapSaga(sagaData => sagaData.OrderId)
            .ToMessage<OrderPlaced>(message => message.OrderId)
            .ToMessage<OrderBilled>(message => message.OrderId);
    }
 
    public Task Handle(OrderPlaced message, IMessageHandlerContext context)
    {
        log.Info($"OrderPlaced message received.");
        Data.IsOrderPlaced = true;
        return ProcessOrder(context);
    }
 
    public Task Handle(OrderBilled message, IMessageHandlerContext context)
    {
        log.Info($"OrderBilled message received.");
        Data.IsOrderBilled = true;
        return ProcessOrder(context);
    }
 
    private async Task ProcessOrder(IMessageHandlerContext context)
    {
        if (Data.IsOrderPlaced && Data.IsOrderBilled)
        {
            await context.SendLocal(new ShipOrder() { OrderId = Data.OrderId });
            MarkAsComplete();
        }
    }
}

I don't expect you to be familiar with the NServiceBus API, so don't worry about the base class, the interfaces, or the ConfigureHowToFindSaga method. What you need to know is that this class handles two types of messages: OrderPlaced and OrderBilled. The base class and the framework handle message correlation, hydration and dehydration, and so on.

For the purposes of this demo, all you need to know about the context object is that it enables you to send and publish messages. The code sample uses context.SendLocal to send a new ShipOrder Command.

Messages arrive asynchronously and conceptually with long wait times between them. You can't just rely on in-memory object state because a ShippingPolicy instance may receive one message and then risk that the server it's running on shuts down before the next message arrives. The NServiceBus framework handles message correlation and hydration and dehydration of state data. The latter is modelled by the ShippingPolicyData class:

public class ShippingPolicyData : ContainSagaData
{
    public string OrderId { get; set; }
    public bool IsOrderPlaced { get; set; }
    public bool IsOrderBilled { get; set; }
}

Notice that the above sample code inspects and manipulates the Data property defined by the Saga<ShippingPolicyData> base class.

When the ShippingPolicy methods are called by the NServiceBus framework, the Data is automatically populated. When you modify the Data, the state data is automatically persisted when the message handler shuts down to wait for the next message.

Characterisation tests

While you can draw an explicit state diagram like the one above, the sample code doesn't explicitly model the various states as objects. Instead, it relies on reading and writing two Boolean values.

There's nothing wrong with this implementation. It's the simplest thing that could possibly work, so why make it more complicated?

In this article, I am going to make it more complicated. First, I'm going to refactor the above sample code to use the State design pattern, and then I'm going to refactor that code to use the State monad. From a perspective of maintainability, this isn't warranted, but on the other hand, I hope it's educational. The sample code is just complex enough to showcase the structures of the State pattern and the State monad, yet simple enough that the implementation logic doesn't get in the way.

Simplicity can be deceiving, however, and no refactoring is without risk.

"to refactor, the essential precondition is [...] solid tests"

Martin Fowler, Refactoring

I found it safest to first add a few Characterisation Tests to make sure I didn't introduce any errors as I changed the code. It did catch a few copy-paste goofs that I made, so adding tests turned out to be a good idea.

Testing NServiceBus message handlers isn't too hard. All the tests I wrote look similar, so one should be enough to give you an idea.

[Theory]
[InlineData("1337")]
[InlineData("baz")]
public async Task OrderPlacedAndBilled(string orderId)
{
    var sut = 
        new ShippingPolicy 
        {
            Data = new ShippingPolicyData { OrderId = orderId }
        };
    var ctx = new TestableMessageHandlerContext();
 
    await sut.Handle(new OrderPlaced { OrderId = orderId }, ctx);
    await sut.Handle(new OrderBilled { OrderId = orderId }, ctx);
 
    Assert.True(sut.Completed);
    var msg = Assert.Single(ctx.SentMessages.Containing<ShipOrder>());
    Assert.Equal(orderId, msg.Message.OrderId);
}

The tests use xUnit.net 2.4.2. When I downloaded the NServiceBus saga sample code it targeted .NET Framework 4.8, and I didn't bother to change the version.

While the NServiceBus framework will automatically hydrate and populate Data, in a unit test you have to remember to explicitly populate it. The TestableMessageHandlerContext class is a Test Spy that is part of the NServiceBus testing API.

You'd think I was paid by Particular Software to write this article, but I'm not. All this is really just the introduction. You're excused if you've forgotten the topic of this article, but my goal is to show a State pattern example. Only now can we begin in earnest.

State pattern implementation

Refactoring to the State pattern, I chose to let the ShippingPolicy class fill the role of the pattern's Context. Instead of a base class with virtual methods, I used an interface to define the State object, as that's more idiomatic in C#:

public interface IShippingState
{
    Task OrderPlaced(OrderPlaced message, IMessageHandlerContext context, ShippingPolicy policy);
 
    Task OrderBilled(OrderBilled message, IMessageHandlerContext context, ShippingPolicy policy);
}

The State pattern only shows examples where the State methods take a single argument: the Context. In this case, that's the ShippingPolicy, passed as the policy parameter. Careful! There's also a parameter called context! That's the NServiceBus context, and it's an artefact of the original example. The two other parameters, message and context, are run-time values passed on from the ShippingPolicy's Handle methods:

public IShippingState State { get; internal set; }
 
public async Task Handle(OrderPlaced message, IMessageHandlerContext context)
{
    log.Info($"OrderPlaced message received.");
    Hydrate();
    await State.OrderPlaced(message, context, this);
    Dehydrate();
}
 
public async Task Handle(OrderBilled message, IMessageHandlerContext context)
{
    log.Info($"OrderBilled message received.");
    Hydrate();
    await State.OrderBilled(message, context, this);
    Dehydrate();
}

The Hydrate method isn't part of the State pattern, but finds an appropriate state based on Data:

private void Hydrate()
{
    if (!Data.IsOrderPlaced && !Data.IsOrderBilled)
        State = InitialShippingState.Instance;
    else if (Data.IsOrderPlaced && !Data.IsOrderBilled)
        State = AwaitingBillingState.Instance;
    else if (!Data.IsOrderPlaced && Data.IsOrderBilled)
        State = AwaitingPlacementState.Instance;
    else
        State = CompletedShippingState.Instance;
}

In more recent versions of C# you'd be able to use more succinct pattern matching, but since this code base is on .NET Framework 4.8 I'm constrained to C# 7.3, and this is as good as I cared to make it. It's not important to the topic of the State pattern, but I'm showing it in case you were wondering. It's typical that you need to translate between data that exists in the 'external world' and your object-oriented, polymorphic code, since at the boundaries, applications aren't object-oriented.
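For readers on newer compilers, here's a rough sketch of what the more succinct pattern matching could look like. It assumes C# 8 or later and isn't part of the sample code base. The ShippingStateName enum and the HydrationSketch class are hypothetical stand-ins for the State singletons, so that the example compiles without NServiceBus:

```csharp
// Hypothetical stand-in for the four State singletons, so that the sketch
// compiles without NServiceBus or the article's State classes.
public enum ShippingStateName
{
    Initial,
    AwaitingBilling,
    AwaitingPlacement,
    Completed
}

public static class HydrationSketch
{
    // Same decision table as Hydrate, written as a C# 8 switch expression
    // over a tuple of the two Boolean flags.
    public static ShippingStateName Hydrate(bool isOrderPlaced, bool isOrderBilled)
    {
        return (isOrderPlaced, isOrderBilled) switch
        {
            (false, false) => ShippingStateName.Initial,
            (true, false) => ShippingStateName.AwaitingBilling,
            (false, true) => ShippingStateName.AwaitingPlacement,
            (true, true) => ShippingStateName.Completed
        };
    }
}
```

Each arm corresponds to one branch of the if/else chain, which makes the four combinations of the two flags easy to read as a table.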

Likewise, the Dehydrate method translates the other way:

private void Dehydrate()
{
    if (State is AwaitingBillingState)
    {
        Data.IsOrderPlaced = true;
        Data.IsOrderBilled = false;
        return;
    }
 
    if (State is AwaitingPlacementState)
    {
        Data.IsOrderPlaced = false;
        Data.IsOrderBilled = true;
        return;
    }
 
    if (State is CompletedShippingState)
    {
        Data.IsOrderPlaced = true;
        Data.IsOrderBilled = true;
        return;
    }
 
    Data.IsOrderPlaced = false;
    Data.IsOrderBilled = false;
}

In any case, Hydrate and Dehydrate are distractions. The important part is that the ShippingPolicy (the State Context) now delegates execution to its State, which performs the actual work and updates the State.

Initial state

The first time the saga runs, both Data.IsOrderPlaced and Data.IsOrderBilled are false, which means that the State is InitialShippingState:

public sealed class InitialShippingState : IShippingState
{
    public readonly static InitialShippingState Instance =
        new InitialShippingState();
 
    private InitialShippingState()
    {
    }
 
    public Task OrderPlaced(
        OrderPlaced message,
        IMessageHandlerContext context,
        ShippingPolicy policy)
    {
        policy.State = AwaitingBillingState.Instance;
        return Task.CompletedTask;
    }
 
    public Task OrderBilled(
        OrderBilled message,
        IMessageHandlerContext context,
        ShippingPolicy policy)
    {
        policy.State = AwaitingPlacementState.Instance;
        return Task.CompletedTask;
    }
}

As the above state transition diagram indicates, the only thing each of the methods does is transition to the next appropriate state: AwaitingBillingState if the first event was OrderPlaced, and AwaitingPlacementState if it was OrderBilled.

"State objects are often Singletons"

Design Patterns

Like in the previous example I've made all the State objects Singletons. It's not that important, but since they are all stateless, we might as well. At least, it's in the spirit of the book.

Awaiting billing

AwaitingBillingState is another IShippingState implementation:

public sealed class AwaitingBillingState : IShippingState
{
    public readonly static IShippingState Instance =
        new AwaitingBillingState();
 
    private AwaitingBillingState()
    {
    }
 
    public Task OrderPlaced(
        OrderPlaced message,
        IMessageHandlerContext context,
        ShippingPolicy policy)
    {
        return Task.CompletedTask;
    }
 
    public async Task OrderBilled(
        OrderBilled message,
        IMessageHandlerContext context,
        ShippingPolicy policy)
    {
        await context.SendLocal(
            new ShipOrder() { OrderId = policy.Data.OrderId });
        policy.Complete();
        policy.State = CompletedShippingState.Instance;
    }
}

This State doesn't react to OrderPlaced because it assumes that an order has already been placed. It only reacts to an OrderBilled event. When that happens, all requirements have been fulfilled to ship the item, so it sends a ShipOrder Command, marks the saga as completed, and changes the State to CompletedShippingState.

The Complete method is a little wrapper method I had to add to the ShippingPolicy class, since MarkAsComplete is a protected method:

internal void Complete()
{
    MarkAsComplete();
}

The AwaitingPlacementState class is similar to AwaitingBillingState, except that it reacts to OrderPlaced rather than OrderBilled.

Terminal state

The fourth and final state is the CompletedShippingState:

public sealed class CompletedShippingState : IShippingState
{
    public readonly static IShippingState Instance =
        new CompletedShippingState();
 
    private CompletedShippingState()
    {
    }
 
    public Task OrderPlaced(
        OrderPlaced message,
        IMessageHandlerContext context,
        ShippingPolicy policy)
    {
        return Task.CompletedTask;
    }
 
    public Task OrderBilled(
        OrderBilled message,
        IMessageHandlerContext context,
        ShippingPolicy policy)
    {
        return Task.CompletedTask;
    }
}

In this state, the saga is completed, so it ignores both events.

Move Commands to output

The saga now uses the State pattern to manage state-specific behaviour as well as state transitions. To be clear, this complexity isn't warranted for the simple requirements. This is, after all, an example. All tests still pass, and smoke testing also indicates that everything still works as it's supposed to.

The goal of this article is now to refactor the State pattern implementation to pure functions. When the saga runs it has an observable side effect: It eventually sends a ShipOrder Command. During processing it also updates its internal state. Both of these are sources of impurity that we have to decouple from the decision logic.

I'll do this in several steps. The first impure action I'll address is the externally observable message transmission. A common functional-programming trick is to turn a side effect into a return value. So far, the IShippingState methods don't return anything. (Strictly speaking, this isn't true; they each return Task, but we can regard Task as 'asynchronous void'.) Thus, return values are still available as a communications channel.

Refactor the IShippingState methods to return Commands instead of actually sending them. Each method may send an arbitrary number of Commands, including none, so the return type has to be a collection:

public interface IShippingState
{
    IReadOnlyCollection<ICommand> OrderPlaced(
        OrderPlaced message,
        IMessageHandlerContext context,
        ShippingPolicy policy);
 
    IReadOnlyCollection<ICommand> OrderBilled(
        OrderBilled message,
        IMessageHandlerContext context,
        ShippingPolicy policy);
}

When you change the interface you also have to change all the implementing classes, including AwaitingBillingState:

public sealed class AwaitingBillingState : IShippingState
{
    public readonly static IShippingState Instance = new AwaitingBillingState();
 
    private AwaitingBillingState()
    {
    }
 
    public IReadOnlyCollection<ICommand> OrderPlaced(
        OrderPlaced message,
        IMessageHandlerContext context,
        ShippingPolicy policy)
    {
        return Array.Empty<ICommand>();
    }
 
    public IReadOnlyCollection<ICommand> OrderBilled(
        OrderBilled message,
        IMessageHandlerContext context,
        ShippingPolicy policy)
    {
        policy.Complete();
        policy.State = CompletedShippingState.Instance;
        return new[] { new ShipOrder() { OrderId = policy.Data.OrderId } };
    }
}

In order to do nothing a method like OrderPlaced now has to return an empty collection of Commands. In order to 'send' a Command, OrderBilled now returns it instead of using the context to send it. The context is already redundant, but since I prefer to move in small steps, I'll remove it in a separate step.

It's now the responsibility of the ShippingPolicy class to do something with the Commands returned by the State:

public async Task Handle(OrderBilled message, IMessageHandlerContext context)
{
    log.Info($"OrderBilled message received.");
    Hydrate();
    var result = State.OrderBilled(message, context, this);
    await Interpret(result, context);
    Dehydrate();
}
 
private async Task Interpret(
    IReadOnlyCollection<ICommand> commands,
    IMessageHandlerContext context)
{
    foreach (var cmd in commands)
        await context.SendLocal(cmd);
}

In functional programming, you often run an interpreter over the instructions returned by a pure function. Here the interpreter is just a private helper method.
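To illustrate the division of labour in isolation, here's a minimal sketch that doesn't depend on NServiceBus. All the names (ICommandSketch, ShipOrderSketch, InterpreterSketch, Decide) are hypothetical stand-ins rather than types from the sample code, and the send action plays the role that context.SendLocal plays in the saga:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins; these are not NServiceBus types.
public interface ICommandSketch { }

public sealed class ShipOrderSketch : ICommandSketch
{
    public ShipOrderSketch(string orderId) { OrderId = orderId; }
    public string OrderId { get; }
}

public static class InterpreterSketch
{
    // Pure: decides what should happen, but doesn't make it happen.
    public static IReadOnlyCollection<ICommandSketch> Decide(
        bool isOrderPlaced, bool isOrderBilled, string orderId)
    {
        if (isOrderPlaced && isOrderBilled)
            return new ICommandSketch[] { new ShipOrderSketch(orderId) };
        return Array.Empty<ICommandSketch>();
    }

    // Impure: carries out each instruction via the supplied 'send' action.
    public static void Interpret(
        IReadOnlyCollection<ICommandSketch> commands,
        Action<ICommandSketch> send)
    {
        foreach (var cmd in commands)
            send(cmd);
    }
}
```

A test can call Decide and inspect the returned collection directly, without a Test Spy, while Interpret is the only place where a side effect occurs.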

The IShippingState methods are no longer asynchronous. Now they just return collections. I consider that a simplification.

Remove context parameter

The context parameter is now redundant, so remove it from the IShippingState interface:

public interface IShippingState
{
    IReadOnlyCollection<ICommand> OrderPlaced(OrderPlaced message, ShippingPolicy policy);
 
    IReadOnlyCollection<ICommand> OrderBilled(OrderBilled message, ShippingPolicy policy);
}

I used Visual Studio's built-in refactoring tools to remove the parameter, which automatically removed it from all the call sites and implementations.

This takes us part of the way towards implementing the states as pure functions, but there's still work to be done.

public IReadOnlyCollection<ICommand> OrderBilled(OrderBilled message, ShippingPolicy policy)
{
    policy.Complete();
    policy.State = CompletedShippingState.Instance;
    return new[] { new ShipOrder() { OrderId = policy.Data.OrderId } };
}

The above OrderBilled implementation calls policy.Complete to indicate that the saga has completed. That's another state mutation that must be eliminated to make this a pure function.

Return complex result

How do you refactor from state mutation to pure function? You turn the mutation statement into an instruction, which is a value that you return. In this case you might want to return a Boolean value: True to complete the saga. False otherwise.

There seems to be a problem, though. The IShippingState methods already return data: They return a collection of Commands. How do we get around this conundrum?

Introduce a complex object:

public sealed class ShippingStateResult
{
    public ShippingStateResult(
        IReadOnlyCollection<ICommand> commands,
        bool completeSaga)
    {
        Commands = commands;
        CompleteSaga = completeSaga;
    }
 
    public IReadOnlyCollection<ICommand> Commands { get; }
    public bool CompleteSaga { get; }
 
    public override bool Equals(object obj)
    {
        return obj is ShippingStateResult result &&
               EqualityComparer<IReadOnlyCollection<ICommand>>.Default
                    .Equals(Commands, result.Commands) &&
               CompleteSaga == result.CompleteSaga;
    }
 
    public override int GetHashCode()
    {
        int hashCode = -1668187231;
        hashCode = hashCode * -1521134295 + EqualityComparer<IReadOnlyCollection<ICommand>>
            .Default.GetHashCode(Commands);
        hashCode = hashCode * -1521134295 + CompleteSaga.GetHashCode();
        return hashCode;
    }
}

That looks rather horrible, but most of the code is generated by Visual Studio. The only things I wrote myself were the class declaration and the two read-only properties. I then used Visual Studio's Generate constructor and Generate Equals and GetHashCode Quick Actions to produce the rest of the code.

With more modern versions of C# I could have used a record, but as I've already mentioned, I'm on C# 7.3 here.
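For comparison, this is roughly what the record alternative might look like. It's only a sketch, assuming C# 9 or later on a runtime that supports records; it isn't part of the sample code base. ICommandLike is a hypothetical stand-in for the NServiceBus ICommand interface, and ShippingStateResultRecord is a made-up name to avoid confusion with the class above:

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for NServiceBus' ICommand marker interface.
public interface ICommandLike { }

// A positional record: the compiler generates the constructor, the
// read-only (init-only) properties, Equals, and GetHashCode.
public sealed record ShippingStateResultRecord(
    IReadOnlyCollection<ICommandLike> Commands,
    bool CompleteSaga);
```

As with the Equals method generated by Visual Studio, the record compares Commands with the default equality comparer, so two results are only equal if they hold the same collection instance (or a collection type that overrides Equals).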

The IShippingState interface can now define its methods with this new return type:

public interface IShippingState
{
    ShippingStateResult OrderPlaced(OrderPlaced message, ShippingPolicy policy);
 
    ShippingStateResult OrderBilled(OrderBilled message, ShippingPolicy policy);
}

This change reminds me of the Introduce Parameter Object refactoring, but applied to the return value instead of the input.

Implementers now have to return values of this new type:

public sealed class AwaitingBillingState : IShippingState
{
    public readonly static IShippingState Instance = new AwaitingBillingState();
 
    private AwaitingBillingState()
    {
    }
 
    public ShippingStateResult OrderPlaced(OrderPlaced message, ShippingPolicy policy)
    {
        return new ShippingStateResult(Array.Empty<ICommand>(), false);
    }
 
    public ShippingStateResult OrderBilled(OrderBilled message, ShippingPolicy policy)
    {
        policy.State = CompletedShippingState.Instance;
        return new ShippingStateResult(
            new[] { new ShipOrder() { OrderId = policy.Data.OrderId } },
            true);
    }
}

Moving a statement to an output value implies that the effect must happen somewhere else. It seems natural to put it in the ShippingPolicy class' Interpret method:

public async Task Handle(OrderBilled message, IMessageHandlerContext context)
{
    log.Info($"OrderBilled message received.");
    Hydrate();
    var result = State.OrderBilled(message, this);
    await Interpret(result, context);
    Dehydrate();
}
 
private async Task Interpret(ShippingStateResult result, IMessageHandlerContext context)
{
    foreach (var cmd in result.Commands)
        await context.SendLocal(cmd);
 
    if (result.CompleteSaga)
        MarkAsComplete();
}

Because Interpret is an instance method on the ShippingPolicy class, I can now also delete the internal Complete method; MarkAsComplete is already callable (it's a protected method defined by the Saga base class).

Use message data

Have you noticed an odd thing about the code so far? It doesn't use any of the message data!

This is an artefact of the original code example. Refer back to the original ProcessOrder helper method. It uses neither OrderPlaced nor OrderBilled for anything. Instead, it pulls the OrderId from the saga's Data property. It can do that because NServiceBus makes sure that all OrderId values are correlated. It'll only instantiate a saga for which Data.OrderId matches OrderPlaced.OrderId or OrderBilled.OrderId. Thus, these values are guaranteed to be the same, and that's why ProcessOrder can get away with using Data.OrderId instead of the message data.

So far, through all refactorings, I've retained this detail, but it seems odd. It also couples the implementation methods to the ShippingPolicy class rather than the message classes. For these reasons, refactor the methods to use the message data instead. Here's the AwaitingBillingState implementation:

public ShippingStateResult OrderBilled(OrderBilled message, ShippingPolicy policy)
{
    policy.State = CompletedShippingState.Instance;
    return new ShippingStateResult(
        new[] { new ShipOrder() { OrderId = message.OrderId } },
        true);
}

Compare this version with the previous iteration, where it used policy.Data.OrderId instead of message.OrderId.

Now, the only reason to pass ShippingPolicy as a method parameter is to mutate policy.State. We'll get to that in due time, but first, there's another issue I'd like to address.

Immutable arguments

Keep in mind that the overall goal of the exercise is to refactor the state machine to pure functions. For good measure, method parameters should be immutable as well. Consider a method like OrderBilled shown above in its most recent iteration. It mutates policy by setting policy.State. The long-term goal is to get rid of that statement.

The method doesn't mutate the other argument, message, but the OrderBilled class is actually mutable:

public class OrderBilled : IEvent
{
    public string OrderId { get; set; }
}

The same is true for the other message type, OrderPlaced.

For good measure, pure functions shouldn't take mutable arguments. You could argue that, since none of the implementation methods actually mutate the messages, it doesn't really matter. I am, however, enough of a neat freak that I don't like to leave such a loose strand dangling. I'd like to refactor the IShippingState API so that only immutable message data is passed as arguments.

In a situation like this, there are (at least) three options:

1. Make the message types immutable. This would mean making OrderBilled and OrderPlaced immutable. These message types are by default mutable Data Transfer Objects (DTO), because NServiceBus needs to serialise and deserialise them to transmit them over durable queues. There are ways you can configure NServiceBus to use serialisation mechanisms that enable immutable records as messages, but for an example code base like this, I might be inclined to reach for an easier solution if one presents itself.
2. Add an immutable 'mirror' class. This may often be a good idea if you have a rich domain model that you'd like to represent. You can see an example of that in Code That Fits in Your Head, where there's both a mutable ReservationDto class and an immutable Reservation Value Object. This makes sense if the invariants of the domain model are sufficiently stronger than the DTO. That hardly seems to be the case here, since both messages only contain an OrderId.
3. Dissolve the DTO into its constituents and pass each as an argument. This doesn't work if the DTO is complex and nested, but here there's only a single constituent element, and that's the OrderId property.

The third option seems like the simplest solution, so refactor the IShippingState methods to take an orderId parameter instead of a message:

public interface IShippingState
{
    ShippingStateResult OrderPlaced(string orderId, ShippingPolicy policy);
 
    ShippingStateResult OrderBilled(string orderId, ShippingPolicy policy);
}

While this is the easiest of the three options given above, the refactoring doesn't hinge on this. It would work just as well with one of the two other options.

Implementations now look like this:

public ShippingStateResult OrderBilled(string orderId, ShippingPolicy policy)
{
    policy.State = CompletedShippingState.Instance;
    return new ShippingStateResult(
        new[] { new ShipOrder() { OrderId = orderId } },
        true);
}

The only impure action still lingering is the mutation of policy.State. Once we're rid of that, the API consists of pure functions.

Return state

As outlined by the parent article, instead of mutating the caller's state, you can return the state as part of a tuple. This means that you no longer need to pass ShippingPolicy as a parameter:

public interface IShippingState
{
    Tuple<ShippingStateResult, IShippingState> OrderPlaced(string orderId);
 
    Tuple<ShippingStateResult, IShippingState> OrderBilled(string orderId);
}

Why not expand the ShippingStateResult class, or conversely, dissolve that class and instead return a triple (a three-tuple)? Both are possible alternatives, since they'd be isomorphic to this particular design. The reason I've chosen this particular return type is that it's the idiomatic implementation of the State monad: The result is the first element of a tuple, and the state is the second element. This means that you can use a standard, reusable State monad library to manipulate the values, as you'll see later.

An implementation now looks like this:

public sealed class AwaitingBillingState : IShippingState
{
    public readonly static IShippingState Instance = new AwaitingBillingState();
 
    private AwaitingBillingState()
    {
    }
 
    public Tuple<ShippingStateResult, IShippingState> OrderPlaced(string orderId)
    {
        return Tuple.Create(
            new ShippingStateResult(Array.Empty<ICommand>(), false),
            (IShippingState)this);
    }
 
    public Tuple<ShippingStateResult, IShippingState> OrderBilled(string orderId)
    {
        return Tuple.Create(
            new ShippingStateResult(
                new[] { new ShipOrder() { OrderId = orderId } },
                true),
            CompletedShippingState.Instance);
    }
}

Since the ShippingPolicy class that calls these methods now directly receives the state as part of the output, it no longer needs a mutable State property. Instead, it immediately handles the return value:

public async Task Handle(OrderPlaced message, IMessageHandlerContext context)
{
    log.Info($"OrderPlaced message received.");
    var state = Hydrate();
 
    var result = state.OrderPlaced(message.OrderId);
 
    await Interpret(result.Item1, context);
    Dehydrate(result.Item2);
}
 
public async Task Handle(OrderBilled message, IMessageHandlerContext context)
{
    log.Info($"OrderBilled message received.");
    var state = Hydrate();
 
    var result = state.OrderBilled(message.OrderId);
 
    await Interpret(result.Item1, context);
    Dehydrate(result.Item2);
}

Each Handle method is now an impureim sandwich.

Since the result is now a tuple, the Handle methods now have to pass the first element (result.Item1) to the Interpret helper method, and the second element (result.Item2) - the state - to Dehydrate. It's also possible to pattern match (or destructure) each of the elements directly; you'll see an example of that later.

Since the mutable State property is now gone, the Hydrate method returns the hydrated state:

private IShippingState Hydrate()
{
    if (!Data.IsOrderPlaced && !Data.IsOrderBilled)
        return InitialShippingState.Instance;
    else if (Data.IsOrderPlaced && !Data.IsOrderBilled)
        return AwaitingBillingState.Instance;
    else if (!Data.IsOrderPlaced && Data.IsOrderBilled)
        return AwaitingPlacementState.Instance;
    else
        return CompletedShippingState.Instance;
}

Likewise, the Dehydrate method takes the new state as an input parameter:

private void Dehydrate(IShippingState state)
{
    if (state is AwaitingBillingState)
    {
        Data.IsOrderPlaced = true;
        Data.IsOrderBilled = false;
        return;
    }
 
    if (state is AwaitingPlacementState)
    {
        Data.IsOrderPlaced = false;
        Data.IsOrderBilled = true;
        return;
    }
 
    if (state is CompletedShippingState)
    {
        Data.IsOrderPlaced = true;
        Data.IsOrderBilled = true;
        return;
    }
 
    Data.IsOrderPlaced = false;
    Data.IsOrderBilled = false;
}

Since each Handle method only calls a single State-valued method, it doesn't need the State monad machinery. That machinery only becomes useful when you need to compose multiple State-based operations.
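To get a feel for what composing State-based operations involves, consider this small sketch, where the state is threaded by hand through two calls. It's deliberately unrelated to the saga: the state is just an integer counter, and the names ManualThreadingSketch, Tick, and TickTwice are made up for the illustration:

```csharp
using System;

public static class ManualThreadingSketch
{
    // A State-valued function: takes a state (a counter) and returns a
    // result together with the new state.
    public static Tuple<string, int> Tick(int count)
    {
        return Tuple.Create("tick " + count, count + 1);
    }

    // Composing two calls by hand: the state returned by the first call
    // has to be passed explicitly to the second call.
    public static Tuple<string[], int> TickTwice(int count)
    {
        var first = Tick(count);
        var second = Tick(first.Item2);
        return Tuple.Create(
            new[] { first.Item1, second.Item1 },
            second.Item2);
    }
}
```

Passing first.Item2 along is easy to get wrong once there are more than a couple of steps. The State monad packages that plumbing so that you don't have to write it yourself.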

This might be useful in unit tests, so let's examine that next.

State monad

In previous articles about the State monad you've seen it implemented based on an IState interface. I've also dropped hints here and there that you don't need the interface. Instead, you can implement the monad functions directly on State-valued functions. That's what I'm going to do here:

public static Func<S, Tuple<T1, S>> SelectMany<S, T, T1>(
    this Func<S, Tuple<T, S>> source,
    Func<T, Func<S, Tuple<T1, S>>> selector)
{
    return s =>
    {
        var tuple = source(s);
        var f = selector(tuple.Item1);
        return f(tuple.Item2);
    };
}

This SelectMany implementation works directly on another function, source. This function takes a state of type S as input and returns a tuple as a result. The first element is the result of type T, and the second element is the new state, still of type S. Compare that to the IState interface to convince yourself that these are just two representations of the same idea.

The return value is a new function with the same shape, but where the result type is T1 rather than T.

You can implement the special SelectMany overload that enables query syntax in the standard way.
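As a minimal sketch of what that overload might look like, the following class repeats the SelectMany method from above and adds a Select method plus the three-argument SelectMany overload that the C# compiler looks for when it translates query expressions. The class name StateFunctions and the parameter names k and project are my assumptions for illustration; they aren't taken from the article's code base.

```csharp
using System;

// Hypothetical container class; the name is an assumption.
public static class StateFunctions
{
    // Monadic bind, repeated from the article.
    public static Func<S, Tuple<T1, S>> SelectMany<S, T, T1>(
        this Func<S, Tuple<T, S>> source,
        Func<T, Func<S, Tuple<T1, S>>> selector)
    {
        return s =>
        {
            var tuple = source(s);
            var f = selector(tuple.Item1);
            return f(tuple.Item2);
        };
    }

    // Functor map: transforms the result and leaves the state untouched.
    public static Func<S, Tuple<T1, S>> Select<S, T, T1>(
        this Func<S, Tuple<T, S>> source,
        Func<T, T1> selector)
    {
        return s =>
        {
            var tuple = source(s);
            return Tuple.Create(selector(tuple.Item1), tuple.Item2);
        };
    }

    // The overload that enables 'from x in ... from y in ... select ...'.
    public static Func<S, Tuple<T1, S>> SelectMany<S, T, U, T1>(
        this Func<S, Tuple<T, S>> source,
        Func<T, Func<S, Tuple<U, S>>> k,
        Func<T, U, T1> project)
    {
        return source.SelectMany(x => k(x).Select(y => project(x, y)));
    }
}
```

With these in scope, a State-valued function such as Func<int, Tuple<string, int>> tick = n => Tuple.Create("t" + n, n + 1) can be composed with itself using from x in tick from y in tick select x + y.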

The return function also mirrors the previous interface-based implementation:

public static Func<S, Tuple<T, S>> Return<S, T>(T x)
{
    return s => Tuple.Create(x, s);
}

You can also implement the standard Get, Put, and Modify functions, but we are not going to need them here. Try it as an exercise.
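If you want to check your attempt, here's one possible sketch of those three functions over the same function-based representation. The Unit class is a hypothetical stand-in for 'no interesting result', and the class name StateOperations is likewise an assumption; neither is part of the example code.

```csharp
using System;

// Hypothetical 'no result' type; an assumption, not part of the article's code.
public sealed class Unit
{
    public static readonly Unit Instance = new Unit();

    private Unit() { }
}

// Hypothetical container class; the name is an assumption.
public static class StateOperations
{
    // Get: exposes the current state as the result, leaving it unchanged.
    public static Func<S, Tuple<S, S>> Get<S>()
    {
        return s => Tuple.Create(s, s);
    }

    // Put: ignores the incoming state and replaces it with newState.
    public static Func<S, Tuple<Unit, S>> Put<S>(S newState)
    {
        return _ => Tuple.Create(Unit.Instance, newState);
    }

    // Modify: applies f to the current state to produce the new state.
    public static Func<S, Tuple<Unit, S>> Modify<S>(Func<S, S> f)
    {
        return s => Tuple.Create(Unit.Instance, f(s));
    }
}
```

Each function returns a value with the same shape as the other State values in this article, so it composes with SelectMany like any other.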

State-valued event handlers #

The IShippingState methods almost look like State values, but the arguments are in the wrong order. A State value is a function that takes state as input and returns a tuple. The methods on IShippingState, however, take orderId as input and return a tuple. The state is also present, but as the instance that exposes the methods. We have to flip the arguments:

public static Func<IShippingState, Tuple<ShippingStateResult, IShippingState>> Billed(
    this string orderId)
{
    return s => s.OrderBilled(orderId);
}
 
public static Func<IShippingState, Tuple<ShippingStateResult, IShippingState>> Placed(
    this string orderId)
{
    return s => s.OrderPlaced(orderId);
}

This is a typical example of how you have to turn things on their heads in functional programming, compared to object-oriented programming. These two methods convert OrderBilled and OrderPlaced to State monad values.

Testing state results #

A unit test demonstrates how this enables you to compose multiple stateful operations using query syntax:

[Theory]
[InlineData("90125")]
[InlineData("quux")]
public void StateResultExample(string orderId)
{
    var sf = from x in orderId.Placed()
             from y in orderId.Billed()
             select new[] { x, y };
 
    var (results, finalState) = sf(InitialShippingState.Instance);
 
    Assert.Equal(
        new[] { false, true },
        results.Select(r => r.CompleteSaga));
    Assert.Single(
        results
            .SelectMany(r => r.Commands)
            .OfType<ShipOrder>()
            .Select(msg => msg.OrderId),
        orderId);
    Assert.Equal(CompletedShippingState.Instance, finalState);
}

Keep in mind that a State monad value is a function. That's the reason I called the composition sf, short for State Function. When you execute it with InitialShippingState as input, it returns a tuple that the test immediately deconstructs into its constituent elements.

The test then asserts that the results and finalState are as expected. The assertions against results are a bit awkward, since C# collections don't have structural equality. These assertions would have been simpler in F# or Haskell.

Testing with an interpreter #

While the Arrange and Act phases of the above test are simple, the Assert phase seems awkward. Another testing strategy is to run a test-specific interpreter over the instructions returned as the State computation result:

[Theory]
[InlineData("1984")]
[InlineData("quuz")]
public void StateInterpretationExample(string orderId)
{
    var sf = from x in orderId.Placed()
             from y in orderId.Billed()
             select new[] { x, y };
 
    var (results, finalState) = sf(InitialShippingState.Instance);
 
    Assert.Equal(CompletedShippingState.Instance, finalState);
    var result = Interpret(results);
    Assert.True(result.CompleteSaga);
    Assert.Single(
        result.Commands.OfType<ShipOrder>().Select(msg => msg.OrderId),
        orderId);
}

It helps a little, but the assertions still have to work around the lack of structural equality of result.Commands.

Monoid #

The test-specific Interpret helper method is interesting in its own right, though:

private ShippingStateResult Interpret(IEnumerable<ShippingStateResult> results)
{
    var identity = new ShippingStateResult(Array.Empty<ICommand>(), false);
    ShippingStateResult Combine(ShippingStateResult x, ShippingStateResult y)
    {
        return new ShippingStateResult(
            x.Commands.Concat(y.Commands).ToArray(),
            x.CompleteSaga || y.CompleteSaga);
    }
    return results.Aggregate(identity, Combine);
}

It wasn't until I started implementing this helper method that I realised that ShippingStateResult gives rise to a monoid! Since monoids accumulate, you can start with the identity and use the binary operation (here called Combine) to Aggregate an arbitrary number of ShippingStateResult values into one.

The ShippingStateResult class is composed of two constituent values (a collection and a Boolean value), and since both of these give rise to one or more monoids, a tuple of those monoids itself gives rise to one or more monoids. The ShippingStateResult is isomorphic to a tuple, so this result carries over.

Should you move the Combine method and the identity value to the ShippingStateResult class itself? After all, putting them in a test-specific helper method smells a bit of Feature Envy.

This seems compelling, but it's not clear that arbitrary client code would need this particular monoid. After all, there are four monoids over Boolean values, and at least two over collections. That's eight possible combinations. Which one should ShippingStateResult expose as members?

The monoid used in Interpret combines the normal collection monoid with the any monoid. That seems appropriate in this case, but other clients might rather need the all monoid.
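To make the difference concrete, here's a small sketch that contrasts the two choices. It uses a simplified, hypothetical Result class as a stand-in for ShippingStateResult (commands are plain strings), so treat it as an illustration under those assumptions rather than code from the example repository.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for ShippingStateResult; an assumption for illustration.
public sealed class Result
{
    public Result(IReadOnlyCollection<string> commands, bool completeSaga)
    {
        Commands = commands;
        CompleteSaga = completeSaga;
    }

    public IReadOnlyCollection<string> Commands { get; }
    public bool CompleteSaga { get; }
}

// Hypothetical container class; the name is an assumption.
public static class ResultMonoids
{
    // Concatenation paired with 'any': the identity is (empty, false).
    public static Result CombineAny(IEnumerable<Result> results)
    {
        return results.Aggregate(
            new Result(Array.Empty<string>(), false),
            (x, y) => new Result(
                x.Commands.Concat(y.Commands).ToArray(),
                x.CompleteSaga || y.CompleteSaga));
    }

    // Concatenation paired with 'all': the identity is (empty, true).
    public static Result CombineAll(IEnumerable<Result> results)
    {
        return results.Aggregate(
            new Result(Array.Empty<string>(), true),
            (x, y) => new Result(
                x.Commands.Concat(y.Commands).ToArray(),
                x.CompleteSaga && y.CompleteSaga));
    }
}
```

Both functions concatenate the commands in the same way, but they disagree about the flag whenever the inputs disagree, and even about the empty case, because the identity elements differ.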

Without more usage examples, I decided to leave the code as an Interpret implementation detail for now.

In any case, I find it worth noting that by decoupling the state logic from the NServiceBus framework, it's possible to test it without running asynchronous workflows.

Conclusion #

In this article you saw how to implement an asynchronous messaging saga in three different ways: first as a simple ad-hoc solution, second with the State pattern, and third with the State monad. Both the State pattern and State monad implementations are meant exclusively to showcase these two techniques. The first solution, using two Boolean flags, is by far the simplest, and the one I'd use in a production system.

The point is that you can use the State monad if you need to write stateful computations. This may include finite state machines, as otherwise addressed by the State design pattern, but could also include other algorithms where you need to keep track of state.

Next: Postel's law as a profunctor.

Published: Monday, 10 October 2022 06:27:00 UTC

Refactoring a saga from the State pattern to the State monad by Mark Seemann