Asynchronous message passing combined with eventual consistency makes it possible to build very scalable systems. However, sometimes eventual consistency isn't appropriate in parts of the system, while it's acceptable in other parts. How can a consistent architecture be defined to fit both ACID and eventual consistency? This article provides an answer.

The case of an online game #

Last week I visited Pixel Pandemic, a company that produces browser-based MMORPGs. Since each game world has lots of players who can all potentially interact with each other, scalability is very important.

In traditional line of business applications, eventual consistency is often an excellent fit because the application is a projection of the real world. My favorite example is an inventory system: it models what's going on in one or more physical warehouses, but the real world is the ultimate source of truth. A warehouse worker might accidentally drop and damage some of the goods, in which case the application must adjust after the fact.

In other words, the information contained within line of business applications tend to lag after the real world. It's impossible to guarantee that the application is always consistent with the real world, so eventual consistency is a natural fit.

That's not the case with an online game world. The game world itself is the source of truth, and it must be internally consistent at all times. As an example, in Zombie Pandemic, players fight against zombies and may take damage along the way. Players can heal themselves, but they would prefer (I gather) that the healing action takes place immediately, and not some time in the future where the character might be dead. Similarly, when a player hits a zombie, they'd prefer to apply the damage immediately. (However, I think that even here, eventual consistency might provide some interesting game mechanics, but that's another discussion.)

While discussing these matters with the nice people in Pixel Pandemic, it turned out that while some parts of the game world have to be internally consistent, it's perfectly acceptable to use eventual consistency in other cases. One example is the game's high score table. While a single player should have a consistent view of his or her own score, it's acceptable if the high score table lags a bit.

At this point it seemed clear that this particular online game could use an appropriate combination of ACID and eventual consistency, and I think this conclusion can be generalized. The question now becomes: how can a consistent architecture encompass both types of consistency?

Problem statement #

With the above example scenario in mind the problem statement can be generalized:

Given that an application should apply a mix of ACID and eventual consistency, how can a consistent architecture be defined?

Keep in mind that ACID consistency implies that all writes to a transactional resource must take place as a blocking method call. This seems to be at odds with the concept of asynchronous message passing that works so well with eventual consistency.

However, an application architecture where blocking ACID calls are fundamentally different than asynchronous message passing isn't really an architecture at all. Developers will have to decide up-front whether or not a given operation is or isn't synchronous, so the ‘architecture' offers little implementation guidance. The end result is likely to be a heterogeneous mix of Services, Repositories, Units of Work, Message Channels, etc. A uniform principle will be difficult to distill, and the whole thing threatens to devolve into Spaghetti Code.

The solution turns out to be not at all difficult, but it requires that we invert our thinking a bit. Most of us tend to think about synchronous code first. When we think about code performing synchronous work it seems difficult (perhaps even impossible) to retrofit asynchrony to that model. On the other hand, the converse isn't true.

Given an asynchronous API, it's trivial to provide a synchronous, blocking implementation.

Adopting an architecture based on asynchronous message passing (the Pipes and Filters architecture) enables both scenarios. Eventual consistency can be achieved by passing messages around on persistent queues, while ACID consistency can be achieved by handling a message in a blocking call that enlists a (potentially distributed) transaction.

An example seems to be in order here.

Example: keeping score #

In the online game world, each player accumulates a score based on his or her actions. From the perspective of the player, the score should always be consistent. When you defeat the zombie boss, you want to see the result in your score right away. That sounds an awful lot like the Player is an Aggregate Root and the score is part of that Entity. ACID consistency is warranted whenever the Player is updated.

On the other hand, each time a score changes it may influence the high score table, but this doesn't need to be ACID consistent; eventual consistency is fine in this case.

Once again, polymorphism comes to the rescue.

Imagine that the application has a GameEngine class that handles updates in the game. Using an injected IChannel<PointsChangedEvent> it can update the score for a player as simple as this:

/* Lots of other interesting things happen
    * here, like calculating the new score... */
 
var cmd =
    new ScoreChangedEvent(this.playerId, score);
this.pointsChannel.Send(cmd);

The Send method returns void, so it's a good example of a naturally asynchronous API. However, the implementation must do two things:

  • Update the Player Aggregate Root in a transaction
  • Update the high score table (eventually)

That's two different types of consistency within the same method call.

The first step to enable this is to employ the trusty old Composite design pattern:

public class CompositeChannel<T> : IChannel<T>
{
    private readonly IEnumerable<IChannel<T>> channels;
 
    public CompositeChannel(params IChannel<T>[] channels)
    {
        this.channels = channels;
    }
 
    public void Send(T message)
    {
        foreach (var c in this.channels)
        {
            c.Send(message);
        }
    }
}

With a Composite channel it's possible to compose a polymorphic mix of IChannel<T> implementations, some blocking and some asynchronous.

ACID write #

To update the Player Aggregate Root a simple Adapter writes the event to a persistent data store. This could be a relational database, a document database, a REST resource or something else - it doesn't really matter exactly which technology is used.

public class PlayerStoreChannel : 
    IChannel<ScoreChangedEvent>
{
    private readonly IPlayerStore store;
 
    public PlayerStoreChannel(IPlayerStore store)
    {
        this.store = store;
    }
 
    public void Send(ScoreChangedEvent message)
    {
        this.store.Save(message.PlayerId, message);
    }
}

The important thing to realize is that the IPlayerStore.Save method will be a blocking method call - perhaps wrapped in a distributed transaction. This ensures that updates to the Player Aggregate Root always leave the data store in a consistent state. Either the operation succeeds or it fails during the method call itself.

This takes care of the ACID consistent write, but the application must also update the high score table.

Asynchronous write #

Since eventual consistency is acceptable for the high score table, the message can be transmitted over a persistent queue to be picked up by a background process.

A generic class can server as an Adapter over an IQueue abstraction:

public class QueueChannel<T> : IChannel<T>
{
    private readonly IQueue queue;
    private readonly IMessageSerializer serializer;
 
    public QueueChannel(IQueue queue,
        IMessageSerializer serializer)
    {
        this.queue = queue;
        this.serializer = serializer;
    }
 
    public void Send(T message)
    {
        this.queue.Enqueue(
            this.serializer.Serialize(message));
    }
}

Obvously, the Enqueue method is another void method. In the case of a persistent queue, it'll block while the message is being written to the queue, but that will tend to be a fast operation.

Composing polymorphic consistency #

Now all the building blocks are available to compose both channel implementations into the GameEngine via the CompositeChannel. That might look like this:

var playerConnString = ConfigurationManager
    .ConnectionStrings["player"].ConnectionString;
 
var gameEngine = new GameEngine(
    new CompositeChannel<ScoreChangedEvent>(
        new PlayerStoreChannel(
            new DbPlayerStore(playerConnString)),
        new QueueChannel<ScoreChangedEvent>(
            new PersistentQueue("messageQueue"),                        
            new JsonSerializer())));

When the Send method is invoked on the channel, it'll first invoke a blocking call that ensures ACID consistency for the Player, followed by asynchronous message passing for eventual consistency in other parts of the application.

Conclusion #

Even when parts of an application must be implemented in a synchronous fashion to ensure ACID consistency, an architecture based on asynchronous message passing provides a flexible foundation that enables you to polymorphically mix both kinds of consistency in a single method call. From the perspective of the application layer, this provides a consistent and uniform architecture because all mutating actions are modeled as commands end events encapsulated in messages.


Comments

Thanks a lot for mentioning us here at Pixel Pandemic and for your insights Mark! I very much agree with your conclusion and at this point we're discussing an architectural switch to what you're outlining here (a form of polymorphic consistency) and an event sourcing model for our persistent storage needs.

We're working on ways to make as many aspects of our games as possible fit with an eventual consistency model by, e.g. simply by changing the way we communicate information about the virtual game world state to players (to put them in a frame of mind in which eventual consistency fits naturally with their perception of the game state).

Looking very much forward to meeting with you again soon and discussing more details!
2011-12-07 09:46 UTC
Jake #
Could you also use the async ctp stuff to do it all in a single command, so that you are not blocking while waiting for the persistant store to do I/O, and then when it calls back push the message onto the message queue then return.. If you were using something like async controllers in mvc 4 it would mean you could do something like registering a user which saves them to the database, then pass this event information onto the persistant queue so a backend could pick it up and send emails, and do other tasks that are longer to execute.

await this.store.Save(message.PlayerId, message);
this.queue.Enqueue(this.serializer.Serialize(message));

Keen to hear your thoughts
Jake
2011-12-08 13:38 UTC
Why not? You can combine the async functionality with the approach described above. It could make the application more efficient, since it would free up a thread while the first transaction is being completed.

However, while the async CTP makes it easier to write asynchronous code, it doesn't help with blocking calls. It may be more efficient, but not necessarily faster. You can't know whether or not a transaction has committed until it actually happens, so you still need to wait for that before you decide whether or not to proceed.

BTW, F# has had async support since its inception, so it's interesting to look towards what people are doing with that. Agents (the F# word for Actors) seem to fit into that model pretty well, and as far as I can tell, an Agent is simply an in-memory asynchronous worker process.
2011-12-08 14:07 UTC
Hi Mark, firstly: great post.

I do have a question, though. I can see how this works for all future commands, because they will all need to load the aggregate work to work on it and that will be ACID at all times. What I'm not sure about is how that translates to the query side of the coin - where the query is *not* the high-score table, but must be immediately available on-screen.

Even if hypothetical, imagine the screen has a typical Heads-Up-Display of relevant information - stuff like 'ammo', 'health' and 'current score'. These are view concerns and will go down the query arm of the CQRS implementation. For example, the view-model backing this HUD could be stored in document storage for the player. This is, then, subject to eventual consistency and is not ACID, right?

I'm clearly not 'getting' this bit of the puzzle at the moment, hopefully you can enlighten me.
2011-12-14 21:23 UTC
A HUD is exactly the sort of scenario that a must be implemented by a synchronous write. If you want to be sure that the persisted data is ACID consistent, it must be written as a synchronous, blocking operation. This means that once the query side comes along, it can simply read from the same persistent store because it's always up to date. That sort of persisted data isn't eventually consistent - it's atomically consistent.
2011-12-21 08:08 UTC


Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Wednesday, 07 December 2011 08:40:21 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!
Published: Wednesday, 07 December 2011 08:40:21 UTC