Get and Put State

A pair of standard helper functions for the State monad. An article for object-oriented programmers.

The State monad is completely defined by its two defining functions (SelectMany and Return). While you can get by without them, two additional helper functions (get and put) are so convenient that they're typically included. To be clear, they're not part of the State monad - rather, you can consider them part of what we may term a standard State API.

In short, get is a function that, as the name implies, gets the state while inside the State monad, and put replaces the state with a new value.

Later in this article, I'll show how to implement these two functions, as well as a usage example. Before we get to that, however, I want to show a motivating example. In other words, an example that doesn't use get and put.

The code shown in this article uses the C# State implementation from the State monad article.

Aggregator #

Imagine that you have to implement a simple Aggregator.

"How do we combine the results of individual but related messages so that they can be processed as a whole?"

[...] "Use a stateful filter, an Aggregator, to collect and store individual messages until it receives a complete set of related messages. Then, the Aggregator publishes a single message distilled from the individual messages."

Enterprise Integration Patterns

The example that I'll give here is simplified and mostly focuses on how to use the State monad to implement the desired behaviour. The book Enterprise Integration Patterns starts with a simple example where messages arrive with a correlation ID as an integer. The message payload is also a an integer, just to keep things simple. The Aggregator should only publish an aggregated message once it has received three correlated messages.

Using the State monad, you could implement an Aggregator like this:

public sealed class Aggregator :
    IState<IReadOnlyDictionary<int, IReadOnlyCollection<int>>, Maybe<Tuple<int, int, int>>>
{
    private readonly int correlationId;
    private readonly int value;
 
    public Aggregator(int correlationId, int value)
    {
        this.correlationId = correlationId;
        this.value = value;
    }
 
    public Tuple<Maybe<Tuple<int, int, int>>, IReadOnlyDictionary<int, IReadOnlyCollection<int>>> Run(
        IReadOnlyDictionary<int, IReadOnlyCollection<int>> state)
    {
        if (state.TryGetValue(correlationId, out var coll))
        {
            if (coll.Count == 2)
            {
                var retVal =
                    Tuple.Create(coll.ElementAt(0), coll.ElementAt(1), value);
                var newState = state.Remove(correlationId);
                return Tuple.Create(retVal.ToMaybe(), newState);
            }
            else
            {
                var newColl = coll.Append(value);
                var newState = state.Replace(correlationId, newColl);
                return Tuple.Create(new Maybe<Tuple<int, int, int>>(), newState);
            }
        }
        else
        {
            var newColl = new[] { value };
            var newState = state.Add(correlationId, newColl);
            return Tuple.Create(new Maybe<Tuple<int, int, int>>(), newState);
        }
    }
}

The Aggregator class implements the IState<S, T> interface. The full generic type is something of a mouthful, though.

The state type (S) is IReadOnlyDictionary<int, IReadOnlyCollection<int>> - in other words, a dictionary of collections. Each entry in the dictionary is keyed by a correlation ID. Each value is a collection of messages that belong to that ID. Keep in mind that, in order to keep the example simple, each message is just a number (an int).

The value to produce (T) is Maybe<Tuple<int, int, int>>. This code example uses this implementation of the Maybe monad. The value produced may or may not be empty, depending on whether the Aggregator has received all three required messages in order to produce an aggregated message. Again, for simplicity, the aggregated message is just a triple (a three-tuple).

The Run method starts by querying the state dictionary for an entry that corresponds to the correlationId. This entry may or may not be present. If the message is the first in a series of three, there will be no entry, but if it's the second or third message, the entry will be present.

In that case, the Run method checks the Count of the collection. If the Count is 2, it means that two other messages with the same correlationId was already received. This means that the Aggregator is now handling the third and final message. Thus, it creates the retVal tuple, removes the entry from the dictionary to create the newState, and returns both.

If the state contains an entry for the correlationId, but the Count isn't 2, the Run method updates the entry by appending the value, updating the state to newState, and returns that together with an empty Maybe value.

Finally, if there is no entry for the correlationId, the Run method creates a new collection containing only the value, adds it to the state dictionary, and returns the newState together with an empty Maybe value.

Message handler #

A message handler could be a background service that receives messages from a durable queue, a REST endpoint, or based on some other technology.

After it receives a message, a message handler would create a new instance of the Aggregator:

var a = new Aggregator(msg.CorrelationId, msg.Value);

Since Aggregator implements the IState<S, T> interface, the object a represents a stateful computation. A message handler might keep the current state in memory, or rehydrate it from some persistent storage system. Keep in mind that the state must be of the type IReadOnlyDictionary<int, IReadOnlyCollection<int>>. Wherever it comes from, assume that this state is a variable called s (for state).

The message handler can now Run the stateful computation by supplying s:

var t = a.Run(s);

The result is a tuple where the first item is a Maybe value, and the second item is the new state.

The message handler can now publish the triple if the Maybe value is populated. In any case, it can update the 'current' state with the new state. That's a nice little impureim sandwich.

Notice how this design is different from a typical object-oriented solution. In object-oriented programming, you'd typically have an object than contains the state and then receives the run-time value as input to a method that might then mutate the state. Data with behaviour, as it's sometimes characterised.

The State-based computation turns such a design on its head. The computation closes over the run-time values, and the state is supplied as an argument to the Run method. This is an example of the shift of perspective often required to think functionally, rather than object-oriented. That's why it takes time learning Functional Programming (FP); it's not about syntax. It's a different way to think.

An object like the above a seems almost frivolous, since it's going to have a short lifetime. Calling code will create it only to call its Run method and then let it go out of scope to be garbage-collected.

Of course, in a language more attuned to FP like Haskell, it's a different story:

let h = handle (corrId msg) (val msg)

Instead of creating an object using a constructor, you only pass the message values to a function called handle. The return value h is a State value that an overall message handler can then later run with a state s:

let (m, ns) = runState h s

The return value is a tuple where m is the Maybe value that may or may not contain the aggregated message; ns is the new state.

Is this better? #

Is this approach to state mutation better than the default kind of state mutation possible with most languages (including C#)? Why make things so complicated?

There's more than one answer. First, in a language like Haskell, state mutation is in general not possible. While you can do state mutation with the IO container in Haskell, this sets you completely free. You don't want to be free, because with freedom comes innumerable ways to shoot yourself in the foot. Constraints liberate.

While the IO monad allows uncontrolled state mutation (together with all sorts of other impure actions), the State monad constrains itself and callers to only one type of apparent mutation. The type of the state being 'mutated' is visible in the type system, and that's the only type of value you can 'mutate' (in Haskell, that is).

The State monad uses the type system to clearly communicate what the type of state is. Given a language like Haskell, or otherwise given sufficient programming discipline, you can tell from an object's type exactly what to expect.

This also goes a long way to explain why monads are such an important concept in Functional Programming. When discussing FP, a common question is: How do you perform side effects? The answer, as may be already implied by this article, is that you use monads. The State monad for local state mutation, and the IO monad for 'global' side effects.

Get #

Clearly you can write an implementation of IState<S, T> like the above Aggregator class. Must we always write a class that implements the interface in order to work within the State monad?

Monads are all about composition. Usually, you can compose even complex behaviour from smaller building blocks. Just consider the list monad, which in C# is epitomised by the IEnumerable<T> interface. You can write quite complex logic using the building blocks of Where, Select, Aggregate, Zip, etcetera.

Likewise, we should expect that to be the case with the State monad, and it is so. The useful extra combinators are get and put.

The get function enables a composition to retrieve the current state. Given the IState<S, T> interface, you can implement it like this:

public static IState<S, S> Get<S>()
{
    return new GetState<S>();
}
 
private class GetState<S> : IState<S, S>
{
    public Tuple<S, S> Run(S state)
    {
        return Tuple.Create(state, state);
    }
}

The Get function represents a stateful computation that copies the state over to the 'value' dimension, so to speak. Notice that the return type is IState<S, S>. Copying the state over to the position of the T generic type means that it becomes accessible to the expressions that run inside of Select and SelectMany.

You'll see an example once I rewrite the above Aggregator to be entirely based on composition, but in order to do that, I also need the put function.

Put #

The put function enables you to write a new state value to the underlying state dimension. The implementation in the current code base looks like this:

public static IState<S, Unit> Put<S>(S s)
{
    return new PutState<S>(s);
}
 
private class PutState<S> : IState<S, Unit>
{
    private readonly S s;
 
    public PutState(S s)
    {
        this.s = s;
    }
 
    public Tuple<Unit, S> Run(S state)
    {
        return Tuple.Create(Unit.Default, s);
    }
}

This implementation uses a Unit value to represent void. As usual, we have the problem in C-based languages that void isn't a value, but fortunately, unit is isomorphic to void.

Notice that the Run method ignores the current state and instead replaces it with the new state s.

Look, no classes! #

The Get and Put functions are enough that we can now rewrite the functionality currently locked up in the Aggregator class. Instead of having to define a new class for that purpose, it's possible to compose our way to the same functionality by writing a function:

public static IState<IReadOnlyDictionary<int, IReadOnlyCollection<int>>, Maybe<Tuple<int, int, int>>>
    Aggregate(int correlationId, int value)
{
    return
        from state in State.Get<IReadOnlyDictionary<int, IReadOnlyCollection<int>>>()
        let mcoll = state.TryGetValue(correlationId)
        let retVal = from coll in mcoll.Where(c => c.Count == 2)
                     select Tuple.Create(coll.ElementAt(0), coll.ElementAt(1), value)
        let newState = retVal
            .Select(_ => state.Remove(correlationId))
            .GetValueOrFallback(
                state.Replace(
                    correlationId,
                    mcoll
                        .Select(coll => coll.Append(value))
                        .GetValueOrFallback(new[] { value })))
        from _ in State.Put(newState)
        select retVal;
}

Okay, I admit that there's a hint of code golf over this. It's certainly not idiomatic C#. To be clear, I'm not endorsing this style of C#; I'm only showing it to explain the abstraction offered by the State monad. Adopt such code at your own peril.

The first observation to be made about this code example is that it's written entirely in query syntax. There's a good reason for that. Query syntax is syntactic sugar on top of SelectMany, so you could, conceivably, also write the above expression using method call syntax. However, in order to make early values available to later expressions, you'd have to pass a lot of tuples around. For example, the above expression makes repeated use of mcoll, so had you been using method call syntax instead of query syntax, you would have had to pass that value on to subsequent computations as one item in a tuple. Not impossible, but awkward. With query syntax, all values remain in scope so that you can refer to them later.

The expression starts by using Get to get the current state. The state variable is now available in the rest of the expression.

The state is a dictionary, so the next step is to query it for an entry that corresponds to the correlationId. I've used an overload of TryGetValue that returns a Maybe value, which also explains (I hope) the m prefix of mcoll.

Next, the expression filters mcoll and creates a triple if the coll has a Count of two. Notice that the nested query syntax expression (from...select) isn't running in the State monad, but rather in the Maybe monad. The result, retVal, is another Maybe value.

That takes care of the 'return value', but we also need to calculate the new state. This happens in a somewhat roundabout way. The reason that it's not more straightforward is that C# query syntax doesn't allow branching (apart from the ternary ?..: operator) and (this version of the language, at least) has weak pattern-matching abilities.

Instead, it uses retVal and mcoll as indicators of how to update the state. If retVal is populated, it means that the Aggregate computation will return a triple, in which case it must Remove the entry from the state dictionary. On the other hand, if that's not the case, it must update the entry. Again, there are two options. If mcoll was already populated, it should be updated by appending the value. If not, a new entry containing only the value should be added.

Finally, the expression uses Put to save the new state, after which it returns retVal.

While this is far from idiomatic C# code, the point is that you can compose your way to the desired behaviour. You don't have to write a new class. Not that this is necessarily an improvement in C#. I'm mostly stating this to highlight a difference in philosophy.

Of course, this is all much more elegant in Haskell, where the same functionality is as terse as this:

handle :: (Ord k, MonadState (Map k [a]) m) => k -> a -> m (Maybe (a, a, a))
handle correlationId value = do
  m <- get
  let (retVal, newState) =
        case Map.lookup correlationId m of
          Just [x, y] -> (Just (x, y, value), Map.delete correlationId m)
          Just  _ -> (Nothing, Map.adjust (++ [value]) correlationId m)
          Nothing -> (Nothing, Map.insert correlationId [value] m)
  put newState
  return retVal

Notice that this implementation also makes use of get and put.

Modify #

The Get and Put functions are basic functions based on the State monad abstraction. These two functions, again, can be used to define some secondary helper functions, whereof Modify is one:

public static IState<S, Unit> Modify<S>(Func<S, S> modify)
{
    return Get<S>().SelectMany(s => Put(modify(s)));
}

It wasn't required for the above Aggregate function, but here's a basic unit test that demonstrates how it works:

[Fact]
public void ModifyExample()
{
    var x = State.Modify((int i) => i + 1);
    var actual = x.Run(1);
    Assert.Equal(2, actual.Item2);
}

It can be useful if you need to perform an 'atomic' state modification. For a realistic Haskell example, you may want to refer to my article An example of state-based testing in Haskell.

Gets #

Another occasionally useful second-order helper function is Gets:

public static IState<S, T> Gets<S, T>(Func<S, T> selector)
{
    return Get<S>().Select(selector);
}

This function can be useful as a combinator if you need just a projection of the current state, instead of the whole state.

Here's another basic unit test as an example:

[Fact]
public void GetsExample()
{
    IState<string, int> x = State.Gets((string s) => s.Length);
    Tuple<int, string> actual = x.Run("bar");
    Assert.Equal(Tuple.Create(3, "bar"), actual);
}

While the above Aggregator example didn't require Modify or Gets, I wanted to include them here for completeness sake.

F# #

Most of the code shown in this article has been C#, with the occasional Haskell code. You can also implement the State monad, as well as the helper methods, in F#, where it'd feel more natural to dispense with interfaces and instead work directly with functions. To make things a little clearer, you may want to define a type alias:

type State<'a, 's> = ('s -> 'a * 's)

You can now define a State module that works directly with that kind of function:

module State =
    let run state (f : State<_, _>) = f state
 
    let lift x state = x, state
 
    let map f x state =
        let x', newState = run state x
        f x', newState
 
    let bind (f : 'a -> State<'b, 's>) (x : State<'a, 's>) state =
        let x', newState = run state x
        run newState (f x')
 
    let get state = state, state
 
    let put newState _ = (), newState
 
    let modify f = get |> map f |> bind put

This is code I originally wrote for a Code Review answer. You can go there to see all the details, as well as a motivating example.

I see that I never got around to add a gets function... I'll leave that as an exercise.

In C#, I've based the example on an interface (IState<S, T>), but it would also be possible to implement the State monad as extension methods on Func<S, Tuple<T, S>>. Try it! It might be another good exercise.

Conclusion #

The State monad usually comes with a few helper functions: get, put, modify, and gets. They can be useful as combinators you can use to compose a stateful combination from smaller building blocks, just like you can use LINQ to compose complex queries over data.

Published: Monday, 04 July 2022 09:15:00 UTC

Get and Put State by Mark Seemann