A refactoring example.

When I introduce the Fake Object testing pattern to people, a common concern is the maintenance burden of it. The point of the pattern is that you write some 'working' code only for test purposes. At a glance, it seems as though it'd be more work than using a dynamic mock library like Moq or Mockito.

This article isn't really about that, but the benefit of a Fake Object is that it has a lower maintenance footprint because it gives you a single class to maintain when you change interfaces or base classes. Dynamic mock objects, on the contrary, leads to Shotgun surgery because every time you change an interface or base class, you have to revisit multiple tests.

In a recent article I presented a Fake Object that may have looked bigger than most people would find comfortable for test code. In this article I discuss how to trim it via a set of refactorings.

Original Fake read registry #

The article presented this FakeReadRegistry, repeated here for your convenience:

internal sealed class FakeReadRegistry : IReadRegistry
{
    private readonly IReadOnlyCollection<Room> rooms;
    private readonly IDictionary<DateOnly, IReadOnlyCollection<Room>> views;
 
    public FakeReadRegistry(params Room[] rooms)
    {
        this.rooms = rooms;
        views = new Dictionary<DateOnly, IReadOnlyCollection<Room>>();
    }
 
    public IReadOnlyCollection<Room> GetFreeRooms(DateOnly arrival, DateOnly departure)
    {
        return EnumerateDates(arrival, departure)
            .Select(GetView)
            .Aggregate(rooms.AsEnumerable(), Enumerable.Intersect)
            .ToList();
    }
 
    public void RoomBooked(Booking booking)
    {
        foreach (var d in EnumerateDates(booking.Arrival, booking.Departure))
        {
            var view = GetView(d);
            var newView = QueryService.Reserve(booking, view);
            views[d] = newView;
        }
    }
 
    private static IEnumerable<DateOnly> EnumerateDates(DateOnly arrival, DateOnly departure)
    {
        var d = arrival;
        while (d < departure)
        {
            yield return d;
            d = d.AddDays(1);
        }
    }
 
    private IReadOnlyCollection<Room> GetView(DateOnly date)
    {
        if (views.TryGetValue(date, out var view))
            return view;
        else
            return rooms;
    }
}

This is 47 lines of code, spread over five members (including the constructor). Three of the methods have a cyclomatic complexity (CC) of 2, which is the maximum for this class. The remaining two have a CC of 1.

While you can play some CC golf with those CC-2 methods, that tends to pull the code in a direction of being less idiomatic. For that reason, I chose to present the code as above. Perhaps more importantly, it doesn't save that many lines of code.

Had this been a piece of production code, no-one would bat an eye at size or complexity, but this is test code. To add spite to injury, those 47 lines of code implement this two-method interface:

public interface IReadRegistry
{
    IReadOnlyCollection<Room> GetFreeRooms(DateOnly arrival, DateOnly departure);
 
    void RoomBooked(Booking booking);
}

Can we improve the situation?

Root cause analysis #

Before you rush to 'improve' code, it pays to understand why it looks the way it looks.

Code is a wonderfully malleable medium, so you should regard nothing as set in stone. On the other hand, there's often a reason it looks like it does. It may be that the previous programmers were incompetent ogres for hire, but often there's a better explanation.

I've outlined my thinking process in the previous article, and I'm not going to repeat it all here. To summarise, though, I've applied the Dependency Inversion Principle.

"clients [...] own the abstract interfaces"

Robert C. Martin, APPP, chapter 11

In other words, I let the needs of the clients guide the design of the IReadRegistry interface, and then the implementation (FakeReadRegistry) had to conform.

But that's not the whole truth.

I was doing a programming exercise - the CQRS booking kata - and I was following the instructions given in the description. They quite explicitly outline the two dependencies and their methods.

When trying a new exercise, it's a good idea to follow instructions closely, so that's what I did. Once you get a sense of a kata, though, there's no law saying that you have to stick to the original rules. After all, the purpose of an exercise is to train, and in programming, trying new things is training.

Test code that wants to be production code #

A major benefit of test-driven development (TDD) is that it provides feedback. It pays to be tuned in to that channel. The above FakeReadRegistry seems to be trying to tell us something.

Consider the GetFreeRooms method. I'll repeat the single-expression body here for your convenience:

return EnumerateDates(arrival, departure)
    .Select(GetView)
    .Aggregate(rooms.AsEnumerable(), Enumerable.Intersect)
    .ToList();

Why is that the implementation? Why does it need to first enumerate the dates in the requested interval? Why does it need to call GetView for each date?

Why don't I just do the following and be done with it?

internal sealed class FakeStorage : Collection<Booking>, IWriteRegistry, IReadRegistry
{
    private readonly IReadOnlyCollection<Room> rooms;
 
    public FakeStorage(params Room[] rooms)
    {
        this.rooms = rooms;
    }
 
    public IReadOnlyCollection<Room> GetFreeRooms(DateOnly arrival, DateOnly departure)
    {
        var booked = this.Where(b => b.Overlaps(arrival, departure)).ToList();
        return rooms
            .Where(r => !booked.Any(b => b.RoomName == r.Name))
            .ToList();
    }
 
    public void Save(Booking booking)
    {
        Add(booking);
    }
}

To be honest, that's what I did first.

While there are two interfaces, there's only one Fake Object implementing both. That's often an easy way to address the Interface Segregation Principle and still keeping the Fake Object simple.

This is much simpler than FakeReadRegistry, so why didn't I just keep that?

I didn't feel it was an honest attempt at CQRS. In CQRS you typically write the data changes to one system, and then you have another logical process that propagates the information about the data modification to the read subsystem. There's none of that here. Instead of being based on one or more 'materialised views', the query is just that: A query.

That was what I attempted to address with FakeReadRegistry, and I think it's a much more faithful CQRS implementation. It's also more complex, as CQRS tends to be.

In both cases, however, it seems that there's some production logic trapped in the test code. Shouldn't EnumerateDates be production code? And how about the general 'algorithm' of RoomBooked:

  • Enumerate the relevant dates
  • Get the 'materialised' view for each date
  • Calculate the new view for that date
  • Update the collection of views for that date

That seems like just enough code to warrant moving it to the production code.

A word of caution before we proceed. When deciding to pull some of that test code into the production code, I'm making a decision about architecture.

Until now, I'd been following the Dependency Inversion Principle closely. The interfaces exist because the client code needs them. Those interfaces could be implemented in various ways: You could use a relational database, a document database, files, blobs, etc.

Once I decide to pull the above algorithm into the production code, I'm choosing a particular persistent data structure. This now locks the data storage system into a design where there's a persistent view per date, and another database of bookings.

Now that I'd learned some more about the exercise, I felt confident making that decision.

Template Method #

The first move I made was to create a superclass so that I could employ the Template Method pattern:

public abstract class ReadRegistry : IReadRegistry
{
    public IReadOnlyCollection<Room> GetFreeRooms(DateOnly arrival, DateOnly departure)
    {
        return EnumerateDates(arrival, departure)
            .Select(GetView)
            .Aggregate(Rooms.AsEnumerable(), Enumerable.Intersect)
            .ToList();
    }
 
    public void RoomBooked(Booking booking)
    {
        foreach (var d in EnumerateDates(booking.Arrival, booking.Departure))
        {
            var view = GetView(d);
            var newView = QueryService.Reserve(booking, view);
            UpdateView(d, newView);
        }
    }
 
    protected abstract void UpdateView(DateOnly date, IReadOnlyCollection<Room> view);
 
    protected abstract IReadOnlyCollection<Room> Rooms { get; }
 
    protected abstract bool TryGetView(DateOnly dateout IReadOnlyCollection<Room> view);
 
    private static IEnumerable<DateOnly> EnumerateDates(DateOnly arrival, DateOnly departure)
    {
        var d = arrival;
        while (d < departure)
        {
            yield return d;
            d = d.AddDays(1);
        }
    }
 
    private IReadOnlyCollection<Room> GetView(DateOnly date)
    {
        if (TryGetView(date, out var view))
            return view;
        else
            return Rooms;
    }
}

This looks similar to FakeReadRegistry, so how is this an improvement?

The new ReadRegistry class is production code. It can, and should, be tested. (Due to the history of how we got here, it's already covered by tests, so I'm not going to repeat that effort here.)

True to the Template Method pattern, three abstract members await a child class' implementation. These are the UpdateView and TryGetView methods, as well as the Rooms read-only property (glorified getter method).

Imagine that in the production code, these are implemented based on file/document/blob storage - one per date. TryGetView would attempt to read the document from storage, UpdateView would create or modify the document, while Rooms returns a default set of rooms.

A Test Double, however, can still use an in-memory dictionary:

internal sealed class FakeReadRegistry : ReadRegistry
{
    private readonly IReadOnlyCollection<Room> rooms;
    private readonly IDictionary<DateOnly, IReadOnlyCollection<Room>> views;
 
    protected override IReadOnlyCollection<Room> Rooms => rooms;
 
    public FakeReadRegistry(params Room[] rooms)
    {
        this.rooms = rooms;
        views = new Dictionary<DateOnly, IReadOnlyCollection<Room>>();
    }
 
    protected override void UpdateView(DateOnly date, IReadOnlyCollection<Room> view)
    {
        views[date] = view;
    }
 
    protected override bool TryGetView(DateOnly dateout IReadOnlyCollection<Room> view)
    {
        return views.TryGetValue(date, out view);
    }
}

Each override is a one-liner with cyclomatic complexity 1.

First round of clean-up #

An abstract class is already a polymorphic object, so we no longer need the IReadRegistry interface. Delete that, and update all code accordingly. Particularly, the QueryService now depends on ReadRegistry rather than IReadRegistry:

private readonly ReadRegistry readRegistry;
 
public QueryService(ReadRegistry readRegistry)
{
    this.readRegistry = readRegistry;
}

Now move the Reserve function from QueryService to ReadRegistry. Once this is done, the QueryService looks like this:

public sealed class QueryService
{
    private readonly ReadRegistry readRegistry;
 
    public QueryService(ReadRegistry readRegistry)
    {
        this.readRegistry = readRegistry;
    }
 
    public IReadOnlyCollection<Room> GetFreeRooms(DateOnly arrival, DateOnly departure)
    {
        return readRegistry.GetFreeRooms(arrival, departure);
    }
}

That class is only passing method calls along, so clearly no longer serving any purpose. Delete it.

This is a not uncommon in CQRS. One might even argue that if CQRS is done right, there's almost no code on the query side, since all the data view update happens as events propagate.

From abstract class to Dependency Injection #

While the current state of the code is based on an abstract base class, the overall architecture of the system doesn't hinge on inheritance. From Abstract class isomorphism we know that it's possible to refactor an abstract class to Constructor Injection. Let's do that.

First add an IViewStorage interface that mirrors the three abstract methods defined by ReadRegistry:

public interface IViewStorage
{
    IReadOnlyCollection<Room> Rooms { get; }
 
    void UpdateView(DateOnly date, IReadOnlyCollection<Room> view);
 
    bool TryGetView(DateOnly dateout IReadOnlyCollection<Room> view);
}

Then implement it with a Fake Object:

public sealed class FakeViewStorage : IViewStorage
{
    private readonly IDictionary<DateOnly, IReadOnlyCollection<Room>> views;
 
    public IReadOnlyCollection<Room> Rooms { get; }
 
    public FakeViewStorage(params Room[] rooms)
    {
        Rooms = rooms;
        views = new Dictionary<DateOnly, IReadOnlyCollection<Room>>();
    }
 
    public void UpdateView(DateOnly date, IReadOnlyCollection<Room> view)
    {
        views[date] = view;
    }
 
    public bool TryGetView(DateOnly dateout IReadOnlyCollection<Room> view)
    {
        return views.TryGetValue(date, out view);
    }
}

Notice the similarity to FakeReadRegistry, which we'll get rid of shortly.

Now inject IViewStorage into ReadRegistry, and make ReadRegistry a regular (sealed) class:

public sealed class ReadRegistry
{
    private readonly IViewStorage viewStorage;
 
    public ReadRegistry(IViewStorage viewStorage)
    {
        this.viewStorage = viewStorage;
    }
 
    public IReadOnlyCollection<Room> GetFreeRooms(DateOnly arrival, DateOnly departure)
    {
        return EnumerateDates(arrival, departure)
            .Select(GetView)
            .Aggregate(viewStorage.Rooms.AsEnumerable(), Enumerable.Intersect)
            .ToList();
    }
 
    public void RoomBooked(Booking booking)
    {
        foreach (var d in EnumerateDates(booking.Arrival, booking.Departure))
        {
            var view = GetView(d);
            var newView = Reserve(booking, view);
            viewStorage.UpdateView(d, newView);
        }
    }
 
    public static IReadOnlyCollection<Room> Reserve(
        Booking booking,
        IReadOnlyCollection<Room> existingView)
    {
        return existingView
            .Where(r => r.Name != booking.RoomName)
            .ToList();
    }
 
    private static IEnumerable<DateOnly> EnumerateDates(DateOnly arrival, DateOnly departure)
    {
        var d = arrival;
        while (d < departure)
        {
            yield return d;
            d = d.AddDays(1);
        }
    }
 
    private IReadOnlyCollection<Room> GetView(DateOnly date)
    {
        if (viewStorage.TryGetView(date, out var view))
            return view;
        else
            return viewStorage.Rooms;
    }
}

You can now delete the FakeReadRegistry Test Double, since FakeViewStorage has now taken its place.

Finally, we may consider if we can make FakeViewStorage even slimmer. While I usually favour composition over inheritance, I've found that deriving Fake Objects from collection base classes is often an efficient way to get a lot of mileage out of a few lines of code. FakeReadRegistry, however, had to inherit from ReadRegistry, so it couldn't derive from any other class.

FakeViewStorage isn't constrained in that way, so it's free to inherit from Dictionary<DateOnly, IReadOnlyCollection<Room>>:

public sealed class FakeViewStorage : Dictionary<DateOnly, IReadOnlyCollection<Room>>, IViewStorage
{
    public IReadOnlyCollection<Room> Rooms { get; }
 
    public FakeViewStorage(params Room[] rooms)
    {
        Rooms = rooms;
    }
 
    public void UpdateView(DateOnly date, IReadOnlyCollection<Room> view)
    {
        this[date] = view;
    }
 
    public bool TryGetView(DateOnly dateout IReadOnlyCollection<Room> view)
    {
        return TryGetValue(date, out view);
    }
}

This last move isn't strictly necessary, but I found it worth at least mentioning.

I hope you'll agree that this is a Fake Object that looks maintainable.

Conclusion #

Test-driven development is a feedback mechanism. If something is difficult to test, it tells you something about your System Under Test (SUT). If your test code looks bloated, that tells you something too. Perhaps part of the test code really belongs in the production code.

In this article, we started with a Fake Object that looked like it contained too much production code. Via a series of refactorings I moved the relevant parts to the production code, leaving me with a more idiomatic and conforming implementation.



Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Monday, 20 November 2023 06:44:00 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!
Published: Monday, 20 November 2023 06:44:00 UTC