Testing races with a synchronizing Decorator

Synchronized database reads for testing purposes.

In a previous article, you saw how to use a slow Decorator to test for race conditions. Towards the end, I discussed how that solution is only near-deterministic. In this article, I discuss a technique which is, I think, properly deterministic, but unfortunately less elegant.

In short, it works by letting a Decorator synchronize reads.

The problem #

In the previous article, I used words to describe the problem, but really, I should be showing, not telling. Here's a variation on the previous test that exemplifies the problem.

[Fact]
public async Task NoOverbookingRace()
{
    var date = DateTime.Now.Date.AddDays(1).AddHours(18.5);
    using var service = new RestaurantService();
    using var slowService =
        from repo in service
        select new SlowReservationsRepository(TimeSpan.FromMilliseconds(100), repo);

    var task1 = slowService.PostReservation(new ReservationDtoBuilder()
        .WithDate(date)
        .WithQuantity(10)
        .Build());
    await Task.Delay(TimeSpan.FromSeconds(1));
    var task2 = slowService.PostReservation(new ReservationDtoBuilder()
        .WithDate(date)
        .WithQuantity(10)
        .Build());
    var actual = await Task.WhenAll(task1, task2);

    Assert.Single(
        actual,
        msg => msg.StatusCode == HttpStatusCode.InternalServerError);
    var ok = Assert.Single(actual, msg => msg.IsSuccessStatusCode);
    // Check that the reservation was actually created:
    var resp = await service.GetReservation(ok.Headers.Location);
    resp.EnsureSuccessStatusCode();
    var reservation = await resp.ParseJsonContent<ReservationDto>();
    Assert.Equal(10, reservation.Quantity);
}

Apart from a single new line of code, this is identical to the test shown in the previous article. The added line is the Task.Delay between task1 and task2.

What's the point of adding this delay there? Only to demonstrate a problem. That's one of the situations I described in the previous article: Even though the test starts both tasks without awaiting them, they aren't guaranteed to run in parallel. Both start as soon as they're created, so task2 is going to be ever so slightly behind task1. What happens if there's a delay between the creation of these two tasks?

Here I've explicitly introduced such a delay for demonstration purposes, but such a delay could happen on a real system for a number of reasons, including garbage collection, thread starvation, the OS running a higher-priority task, etc.

Why might that pause matter?

Because it may produce false negatives. Imagine a situation where there's no transaction control; where there's no TransactionScope around the database interactions. If the pause is long enough, the tasks effectively run in sequence (instead of in parallel), in which case the system correctly rejects the second attempt.

This can happen even when using the SlowReservationsRepository Decorator.
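To see why sequential execution hides the bug, consider a minimal in-memory sketch. The UnprotectedBookings class and its capacity of 10 are hypothetical stand-ins, not part of the restaurant code base. The method reads the current state and then writes, with nothing making those two steps atomic.

```csharp
// Hypothetical stand-in for a booking system without transaction control.
public sealed class UnprotectedBookings
{
    private const int Capacity = 10;
    private int reserved;

    // Check-then-act: the read and the write are not atomic.
    public bool TryReserve(int quantity)
    {
        if (Capacity < reserved + quantity)
            return false;
        reserved += quantity;
        return true;
    }
}
```

When the two attempts run one after the other, the first call to TryReserve(10) is accepted and the second is rejected, which is exactly what the test asserts. The missing protection only shows when both attempts read before either writes.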

How long does the pause need to be before this happens?

As described in the previous article, with a configured delay of 100 ms for the SlowReservationsRepository, creating a new reservation is delayed by 300 ms. This bears out. Experimenting on my own machine, if I change that explicit, artificial delay to 300 ms, and remove the transaction control, the test sometimes fails, and sometimes passes. With the above one-second delay, the test always passes, even when transaction control is missing.

You could decide that a 300 ms pause at just the worst possible time is so unlikely that you're willing to simply accept those odds. I would probably be, too. Still, what to test, and what not to test is a function of context. You may find yourself in a context where that's not good enough. What other options are there?

Synchronizing Decorator #

What you really need to reproduce the race condition is to synchronize the database reads. If you can make sure that the Repository only returns data when enough reads have been performed, you can deterministically reproduce the problem.

Again, start with a Decorator. This time, build into it a way to synchronize reads.

internal sealed class SynchronizedReaderRepository : IReservationsRepository
{
    private readonly CountdownEvent countdownEvent = new CountdownEvent(2);

    public SynchronizedReaderRepository(IReservationsRepository inner)
    {
        Inner = inner;
    }

    public IReservationsRepository Inner { get; }

Here I've used a CountdownEvent object to ensure that reads only progress when the countdown reaches zero. It's possible that more appropriate threading APIs exist, but this serves well as a proof of concept.

The method you need to synchronize is ReadReservations, so you can leave all the other methods to delegate to Inner. Only ReadReservations is special.

public async Task<IReadOnlyCollection<Reservation>> ReadReservations(
    int restaurantId,
    DateTime min,
    DateTime max)
{
    var result = await Inner.ReadReservations(restaurantId, min, max);
    countdownEvent.Signal();
    countdownEvent.Wait();
    return result;
}

This implementation also starts by delegating to Inner, but before it returns the result, it signals the countdownEvent and blocks the thread by waiting on the countdownEvent. Only when both threads have signalled it does the counter reach zero, and the methods may proceed.

If we assume that, while the test is running, no other calls to ReadReservations are made, this guarantees that both threads receive the same answer. This will make both competing threads come to the answer that they can accept the reservation. If no transaction control is in place, the system will overbook the requested time slot.
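The effect is easy to reproduce outside the restaurant code base. The following is a minimal, self-contained sketch under stated assumptions: ISeatStore, InMemorySeatStore, SynchronizedReaderStore, and RaceDemo are hypothetical names invented for illustration, and the 'database' is just an integer. Only the Signal-then-Wait pattern is taken from the Decorator shown above.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for IReservationsRepository, reduced to two members.
public interface ISeatStore
{
    Task<int> ReadReservedSeats();
    Task Reserve(int quantity);
}

public sealed class InMemorySeatStore : ISeatStore
{
    private int reserved;

    public Task<int> ReadReservedSeats() =>
        Task.FromResult(Volatile.Read(ref reserved));

    public Task Reserve(int quantity)
    {
        Interlocked.Add(ref reserved, quantity);
        return Task.CompletedTask;
    }
}

// Same shape as SynchronizedReaderRepository: only the read is special.
public sealed class SynchronizedReaderStore : ISeatStore
{
    private readonly CountdownEvent countdownEvent = new CountdownEvent(2);
    private readonly ISeatStore inner;

    public SynchronizedReaderStore(ISeatStore inner) { this.inner = inner; }

    public async Task<int> ReadReservedSeats()
    {
        var result = await inner.ReadReservedSeats();
        countdownEvent.Signal(); // announce that this reader has its data
        countdownEvent.Wait();   // block until the other reader has, too
        return result;
    }

    // Everything else just delegates.
    public Task Reserve(int quantity) => inner.Reserve(quantity);
}

public static class RaceDemo
{
    public const int Capacity = 10;

    // Check-then-act without any transaction control.
    public static async Task<bool> TryReserve(ISeatStore store, int quantity)
    {
        var reserved = await store.ReadReservedSeats();
        if (Capacity < reserved + quantity)
            return false;
        await store.Reserve(quantity);
        return true;
    }

    public static async Task<(bool[] Accepted, int ReservedSeats)> Run()
    {
        var inner = new InMemorySeatStore();
        var synced = new SynchronizedReaderStore(inner);
        // Task.Run puts each attempt on a thread-pool thread, so that one
        // attempt blocking in Wait can't prevent the other from starting.
        var accepted = await Task.WhenAll(
            Task.Run(() => TryReserve(synced, 10)),
            Task.Run(() => TryReserve(synced, 10)));
        // Read through the undecorated store; a third synchronized read
        // would signal an event that has already reached zero.
        return (accepted, await inner.ReadReservedSeats());
    }
}
```

Since neither read returns before both have completed, both attempts see zero reserved seats. In this unprotected toy model, Run reports both attempts as accepted and 20 reserved seats against a capacity of 10.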

Testing with the synchronizing Repository #

The test that uses SynchronizedReaderRepository is almost identical to the previous test.

[Fact]
public async Task NoOverbookingRace()
{
    var date = DateTime.Now.Date.AddDays(1).AddHours(18.5);
    using var service = new RestaurantService();
    using var syncedService =
        service.Select(repo => new SynchronizedReaderRepository(repo));

    var task1 = syncedService.PostReservation(new ReservationDtoBuilder()
        .WithDate(date)
        .WithQuantity(10)
        .Build());
    var task2 = syncedService.PostReservation(new ReservationDtoBuilder()
        .WithDate(date)
        .WithQuantity(10)
        .Build());
    var actual = await Task.WhenAll(task1, task2);

    Assert.Single(
        actual,
        msg => msg.StatusCode == HttpStatusCode.InternalServerError);
    var ok = Assert.Single(actual, msg => msg.IsSuccessStatusCode);
    // Check that the reservation was actually created:
    var resp = await service.GetReservation(ok.Headers.Location);
    resp.EnsureSuccessStatusCode();
    var reservation = await resp.ParseJsonContent<ReservationDto>();
    Assert.Equal(10, reservation.Quantity);
}

In contrast to the test that uses the slow Repository, this test doesn't allow false negatives. If transaction control is missing from the System Under Test (SUT), this test fails. And it passes when transaction control is in place.

Disadvantages #

That sounds great, so why not just do this, instead of using a delaying Decorator? Because, as usual, there are trade-offs involved. This kind of solution comes with some disadvantages that are worth taking into account.

In short, this could make the test more fragile. As shown above, SynchronizedReaderRepository makes a specific assumption. It assumes that it needs to synchronize exactly two parallel readers. One problem is that the Decorator may end up coupled to exactly one test. If you had other tests, you'd need to write a new Decorator, or generalize this one in some way.

Another problem is that this makes the test sensitive to changes in the SUT. What if a code change introduces a new call to ReadReservations? If so, the countdownEvent may unblock the threads too soon. One such change may be that the SUT decides to also query for surrounding time slots. You might be able to make SynchronizedReaderRepository robust against such changes by keeping a dictionary of synchronization objects (such as the above CountdownEvent) per argument set, but that clearly complicates the implementation.

And even so, it doesn't protect against identical 'double reads', even though these may be less likely to happen.
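To make the dictionary idea concrete, here's one possible sketch. ReadBarriers is a hypothetical helper, not part of the code base, and it assumes that the three ReadReservations arguments are enough to tell unrelated queries apart.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: one CountdownEvent per distinct argument set.
public sealed class ReadBarriers
{
    private readonly int participants;
    private readonly ConcurrentDictionary<(int, DateTime, DateTime), CountdownEvent> barriers =
        new ConcurrentDictionary<(int, DateTime, DateTime), CountdownEvent>();

    public ReadBarriers(int participants)
    {
        this.participants = participants;
    }

    // Returns the event shared by all reads with these exact arguments.
    // Under contention GetOrAdd may run the factory more than once, but
    // every caller gets the single instance that ends up in the dictionary.
    public CountdownEvent For(int restaurantId, DateTime min, DateTime max) =>
        barriers.GetOrAdd(
            (restaurantId, min, max),
            _ => new CountdownEvent(participants));
}
```

ReadReservations would then call For(restaurantId, min, max) and signal and wait on the returned object instead of on a single shared field. A query for a neighbouring time slot gets its own event and no longer disturbs the count, but two identical reads from the same thread of execution still would.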

This Decorator is also vulnerable to caching. If you have a read-through cache that wraps around SynchronizedReaderRepository, only the first query may get to it, which would then cause it to block forever. Perhaps you could fix this with the Wait overload that takes a timeout value.

That said, if you cache reads, the pessimistic locking that TransactionScope uses isn't going to work. You could, perhaps, address that concern with optimistic concurrency, but that comes with its own problems.
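As for the timeout idea, CountdownEvent has a Wait overload that takes a TimeSpan and returns false if the count didn't reach zero in time. A minimal sketch could look like this; the TimedBarrier helper and its name are hypothetical.

```csharp
using System;
using System.Threading;

// Hypothetical helper around the timeout-taking Wait overload.
public static class TimedBarrier
{
    // Signals the event, then waits at most 'timeout' for the remaining
    // participants. Returns true if the count reached zero, and false if
    // the wait timed out.
    public static bool SignalAndWait(CountdownEvent countdownEvent, TimeSpan timeout)
    {
        countdownEvent.Signal();
        return countdownEvent.Wait(timeout);
    }
}
```

What to do when the result is false is a design decision. Throwing an exception would make the test fail fast instead of hanging, while simply returning the data would let the test proceed without the synchronization guarantee.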

Conclusion #

You can address race conditions in various ways, but synchronization has been around for a long time. Not only can you use synchronization primitives and APIs to make your code thread-safe, you can also use them to deterministically reproduce race conditions, or to test that such a bug is no longer present in the system.

I don't want to claim that this is universally possible, but if you run into such problems, it's at least worth considering if you could take advantage of synchronization to reproduce a problem.

Of course, the implication is that you understand what the problem is. This is often the hardest part of dealing with race conditions, and the ideas described in this article don't help with that.

Comments

Anthony Lloyd #

A more general solution would be to use the parallel random testing described in the talk by John Hughes here.

2025-08-19 22:22 UTC

Mark Seemann #

Thank you for writing. I tried watching the talk, but I don't get how to translate that idea to this example. Additionally, I currently don't have the bandwidth to read a 12-page paper. Would it be possible to sketch out how the concept translates to testing a race condition like the one shown here?

2025-09-07 18:20 UTC

Anthony Lloyd #

You can think of it as a normal property-based test (in fact CsCheck builds on the existing random testing functions) with a generated initial state set of operations plus a small set of operations to be run in parallel (some PostReservation calls). The property to test is that this parallel run is linearizable which means we can find at least one sequential order of the parallel operations that gives the same result (reservation.Quantity) as the parallel run. This does require some housekeeping to get all the possible permutations correct as you know some ran on the same thread in a certain order which reduces the number to test. We get all the goodies of shrinking here (CsCheck has an advantage as shrinking is random) which is why John was able to help solve very hard concurrency issues for companies.



public class ReservationTests(Xunit.Abstractions.ITestOutputHelper output)
{
    [Fact]
    public void ReservationSystem_Parallel_Test()
    {
        Check.SampleParallel(
            // initial state
            Gen.Const(() => new ReservationSystem()),
            // parallel operations
            Gen.Int[1, 10].Operation((rs, q) =>
                rs.PostReservation(new ReservationDtoBuilder().WithQuantity(q))),
            // equality check parallel vs sequential
            equal: (rs1, rs2) => rs1.AllBookings().Equals(rs2.AllBookings()),
            // string representation of the bookings state for when the test fails
            print: rs => rs.AllBookings().ToTableString(),
            // display output in test results
            writeLine: output.WriteLine
        );
    }
}

2025-09-10 7:26 UTC

Mark Seemann #

Thank you for the example code. Obviously, you've made some assumptions about how this particular code base works, but often the devil is in the details, so when I finally had a bit of time on my hands, I decided to try out the feature on the real code base. After a bit of back and forth, I wrote this test:

[Fact]
public void OverbookingAttemptsSerialize()
{
    var now = DateTimeOffset.UtcNow;
    Gen.Const(() => (new RestaurantService(), new ConcurrentBag<bool>()))
        .SampleParallel(
            Gen.Int[1, 10]
                .Operation<(RestaurantService, ConcurrentBag<bool>)>(
                    async (t, q) =>
            {
                var res = await t.Item1.PostReservation(
                    new ReservationDtoBuilder()
                        .WithDate(now.AddDays(1).Date.AddHours(18.5))
                        .WithQuantity(q)
                        .Build());
                t.Item2.Add(res.IsSuccessStatusCode);
            }),
            equal: (t1, t2) => t1.Item2.SequenceEqual(t2.Item2));
}

An important part of the Restaurant REST API design is that clients receive correct responses when things go wrong. That's the motivation for including a ConcurrentBag as part of the state, recording whether the response indicates success or failure.

To my disappointment, even when the race bug is absent (i.e. when transaction control is present in the service), this test fails. Here's a typical output from a failing test:

Message: 
    CsCheck.CsCheckException : Set seed: "0005Pq8wWlm1" or -e CsCheck_Seed=0005Pq8wWlm1 to reproduce (0 shrinks, 72 skipped, 100 total).
    
            Initial state: (Ploeh.Samples.Restaurants.RestApi.SqlIntegrationTests.RestaurantService, {})
    Sequential Operations: []
      Parallel Operations: [Op0 4, Op0 1, Op0 4]
               On Threads: [0, 1, 0]
              Final state: (Ploeh.Samples.Restaurants.RestApi.SqlIntegrationTests.RestaurantService, {False, True, True})
               Linearized: [Op0 4, Op0 1, Op0 4] -> (Ploeh.Samples.Restaurants.RestApi.SqlIntegrationTests.RestaurantService, {False, False, False})
                         : [Op0 1, Op0 4, Op0 4] -> (Ploeh.Samples.Restaurants.RestApi.SqlIntegrationTests.RestaurantService, {False, False, False})
                         : [Op0 4, Op0 4, Op0 1] -> (Ploeh.Samples.Restaurants.RestApi.SqlIntegrationTests.RestaurantService, {False, False, False})

At first this is surprising, but upon further reflection, it may not be that surprising. Still, there are things that I don't understand.

Here's what's initially surprising: Implicit in this test is that the restaurant seats a maximum of 10 people a day. Thus, with 4, 1, and 4 seats being requested, it surprised me that any of these reservations were being declined. Still, all the linearized models have {False, False, False}; in other words, no reservations were accepted.

Then I remembered that this test is actually running on a real SQL Server database, and since I'm assuming that CsCheck has been running quite a few scenarios already, it actually makes sense that that particular day is already completely sold out. Does 72 skipped indicate that there were 72 prior runs before this one? Or does it mean that CsCheck found a counterexample after 28 tries, and then decided to skip the remaining 72 runs?

In any case, even if it only did 27 prior runs, it seems likely that the date is already completely sold out. It then puzzles me that the final state is {False, True, True}. I honestly can't think of a good explanation of that, but perhaps you can?

In any case, I still think that I understand why this can't work as written above. The problem with filling up all reservations for a given date is the reason why I originally performed every attempt to provoke the race on a unique date. Is there a way to do that with CsCheck and SampleParallel?

Alternatively, one would need a way to tear down the persistent fixture (i.e. the database) after each run. I can't identify an API that enables me to do that, but perhaps I'm missing something. Is that option somehow available?

2025-10-09 14:32 UTC

Published: Monday, 04 August 2025 07:24:00 UTC

Testing races with a synchronizing Decorator by Mark Seemann
