An encapsulated, albeit overly complicated, implementation.

This is the second in a series of articles about encapsulation and immutability. In the next article, you'll see how immutability makes encapsulation easier, but in order to appreciate that, you should see the alternative. This article, then, shows a working, albeit overly complicated, implementation that does maintain its invariants.

In the introductory article, I described the example problem in more details, but in short, the exercise is to develop a class that holds a collection of prioritized items, with the invariant that the priorities must always sum to 100. It should be impossible to leave the object in a state where that's not true. It's quite an illuminating exercise, so if you have the time, you should try it for yourself before reading on.

Initialization #

As the previous article demonstrated, inheriting directly from a base class seems like a dead end. Once you see the direction that I go in this article, you may argue that it'd be possible to also make that design work with an inherited collection. It may be, but I'm not convinced that it would improve anything. Thus, for this iteration, I decided to eschew inheritance.

On the other hand, we need an API to query the object about its state, and I found that it made sense to implement the IReadOnlyDictionary interface.

As before, invariants are statements that are always true about an object, and that includes a newly initialized object. Thus, the PriorityCollection<T> class should require enough information to safely initialize.

public sealed class PriorityCollection<T> : IReadOnlyDictionary<Tbytewhere T : notnull
{
    private readonly Dictionary<Tbyte> dict;
 
    public PriorityCollection(T initial)
    {
        dict = new Dictionary<Tbyte> { { initial, 100 } };
    }
 
    // IReadOnlyDictionary implemented by delegating to dict field...
}

Several design decisions are different from the previous article. This design has no Prioritized<T> class. Instead it treats the item (of type T) as a dictionary key, and the priority as the value. The most important motivation for this design decision was that this enables me to avoid the 'leaf node mutation' problem that I demonstrated in the previous article. Notice how, while the general design in this iteration will be object-oriented and mutable, I already take advantage of a bit of immutability to make the design simpler and safer.

Another difference is that you can't initialize a PriorityCollection<T> object with a list. Instead, you only need to tell the constructor what the initial item is. The constructor will then infer that, since this is the only item so far, its priority must be 100. It can't be anything else, because that would violate the invariant. Thus, no assertion is required in the constructor.

Mutation API #

So far, the code only implements the IReadOnlyDictionary API, so we need to add some methods that will enable us to add new items and so on. As a start, we can add methods to add, remove, or update items:

public void Add(T keybyte value)
{
    AssertInvariants(dict.Append(KeyValuePair.Create(keyvalue)));
    dict.Add(keyvalue);
}
 
public void Remove(T key)
{
    AssertInvariants(dict.Where(kvp => !kvp.Key.Equals(key)));
    dict.Remove(key);
}
 
public byte this[T key]
{
    get { return dict[key]; }
    set
    {
        var l = dict.ToDictionary(kvp => kvp.Key, kvp => kvp.Value);
        l[key] = value;
        AssertInvariants(l);
 
        dict[key] = value;
    }
}

I'm not going to show the AssertInvariants helper method yet, since it's going to change anyway.

At this point, the implementation suffers from the same problem as the example in the previous article. While you can add new items, you can only add an item with priority 0. You can only remove items if they have priority 0. And you can only 'update' an item if you set the priority to the same value as it already had.

We need to be able to add new items, change their priorities, and so on. How do we get around the above problem, without breaking the invariant?

Edit mode #

One way out of this conundrum is introduce a kind of 'edit mode'. The idea is to temporarily turn off the maintenance of the invariant for long enough to allow edits.

Af first glance, such an idea seems to go against the very definition of an invariant. After all, an invariant is a statement about the object that is always true. If you allow a client developer to turn off that guarantee, then, clearly, the guarantee is gone. Guarantees only work if you can trust them, and you can't trust them if they can be cancelled.

That idea in itself doesn't work, but if we can somehow encapsulate such an 'edit action' in an isolated scope that either succeeds or fails in its entirety, we may be getting somewhere. It's an idea similar to Unit of Work, although here we're not involving an actual database. Still, an 'edit action' is a kind of in-memory transaction.

For didactic reasons, I'll move toward that design in a series of step, where the intermediate steps fail to maintain the invariant. We'll get there eventually. The first step is to introduce 'edit mode'.

private bool isEditing;

While I could have made that flag public, I found it more natural to wrap access to it in two methods:

public void BeginEdit()
{
    isEditing = true;
}
 
public void EndEdit()
{
    isEditing = false;
}

This still doesn't accomplishes anything in itself, but the final change in this step is to change the assertion so that it respects the flag:

private void AssertInvariants(IEnumerable<KeyValuePair<Tbyte>> candidate)
{
    if (!isEditing && candidate.Sum(kvp => kvp.Value) != 100)
        throw new InvalidOperationException(
            "The sum of all values must be 100.");
}

Finally, you can add or change priorities, as this little F# example shows:

sut.BeginEdit ()
sut["foo"<- 50uy
sut["bar"<- 50uy
sut.EndEdit ()

Even if you nominally 'don't read F#', this little example is almost like C# without semicolons. The <- arrow is F#'s mutation or assignment operator, which in C# would be =, and the uy suffix is the F# way of stating that the literal is a byte.

The above example is well-behaved because the final state of the object is valid. The priorities sum to 100. Even so, no code in PriorityCollection<T> actually checks that, so we could trivially leave the object in an invalid state.

Assert invariant at end of edit #

The first step toward remedying that problem is to add a check to the EndEdit method:

public void EndEdit()
{
    isEditing = false;
    AssertInvariants(dict);
}

The class is still not effectively protecting its invariants, because a client developer could forget to call EndEdit, or client code might pass around a collection in edit mode. Other code, receiving such an object as an argument, may not know whether or not it's in edit mode, so again, doesn't know if it can trust it.

We'll return to that problem shortly, but first, there's another, perhaps more pressing issue that we should attend to.

Edit dictionary #

The current implementation directly edits the collection, and even if a client developer remembers to call EndEdit, other code, higher up in the call stack could circumvent the check and leave the object in an invalid state. Not that I expect client developers to be deliberately malicious, but the notion that someone might wrap a method call in a try-catch block seems realistic.

The following F# unit test demonstrates the issue:

[<Fact>]
let ``Attempt to circumvent`` () =
    let sut = PriorityCollection<string"foo"
 
    try
        sut.BeginEdit ()
        sut["foo"<- 50uy
        sut["bar"<- 48uy
        sut.EndEdit ()
    with _ -> ()
 
    100uy =! sut["foo"]
    test <@ sut.ContainsKey "bar" |> not @>

Again, let me walk you through it in case you're unfamiliar with F#.

The try-with block works just like C# try-catch blocks. Inside of that try-with block, the test enters edit mode, changes the values in such a way that the sum of them is 98, and then calls EndEdit. While EndEdit throws an exception, those four lines of code are wrapped in a try-with block that suppresses all exceptions.

The test attempts to verify that, since the edit failed, the "foo" value should be 100, and there should be no "bar" value. This turns out not to be the case. The test fails. The edits persist, even though EndEdit throws an exception, because there's no roll-back.

You could probably resolve that defect in various ways, but I chose to address it by introducing two, instead of one, backing dictionaries. One holds the data that always maintains the invariant, and the other is a temporary dictionary for editing.

private Dictionary<Tbyte> current;
private readonly Dictionary<Tbyte> encapsulated;
private readonly Dictionary<Tbyte> editable;
private bool isEditing;
 
public PriorityCollection(T initial)
{
    encapsulated = new Dictionary<Tbyte> { { initial, 100 } };
    editable = [];
    current = encapsulated;
}

There are two dictionaries: encapsulated holds the always-valid list of priorities, while editable is the dictionary that client code will be editing when in edit mode. Finally, current is either of these: editable when the object is in edit mode, and encapsulated when it's not. Most of the existing code shown so far now uses current, which before was called dict. The important changes are in BeginEdit and EndEdit.

public void BeginEdit()
{
    isEditing = true;
 
    editable.Clear();
    foreach (var kvp in current)
        editable.Add(kvp.Key, kvp.Value);
    current = editable;
}

Besides setting the isEditing flag, BeginEdit now copies all data from current to editable, and then sets current to editable. Keep in mind that encapsulated still holds the original, valid values.

Now that I'm writing this, I'm not even sure if this method is re-entrant, in the following sense: What happens if client code calls BeginEdit, makes some changes, and then calls BeginEdit again? It's questions like these that I don't feel intelligent enough to feel safe that I always answer correctly. That's why I like functional programming better. I don't have to think so hard.

Anyway, this will soon become irrelevant, since BeginEdit and EndEdit will eventually become private methods.

The EndEdit method performs the inverse manoeuvre:

public void EndEdit()
{
    isEditing = false;
    try
    {
        AssertInvariants(current);
 
        encapsulated.Clear();
        foreach (var kvp in current)
            encapsulated.Add(kvp.Key, kvp.Value);
        current = encapsulated;
    }
    catch
    {
        current = encapsulated;
        throw;
    }
}

It first checks the invariant, and only copies the edited values to the encapsulated dictionary if the invariant still holds. Otherwise, it restores the original encapsulated values and rethrows the exception.

This helps to make the nature of editing 'transactional' in nature, but it doesn't address the issue that the collection is in an invalid state during editing, or that a client developer may forget to call EndEdit.

Edit action #

As the next step towards addressing that problem, we may now introduce a 'wrapper method' for that little object protocol:

public void Edit(Action<PriorityCollection<T>> editAction)
{
    BeginEdit();
    editAction(this);
    EndEdit();
}

As you can see, it just wraps that little call sequence so that you don't have to remember to call BeginEdit and EndEdit. My F# test code comes with this example:

sut.Edit (fun col ->
    col["bar"<- 55uy
    col["baz"<- 45uy
    col.Remove "foo"
)

The fun col -> part is just F# syntax for a lambda expression. In C#, you'd write it as col =>.

We're close to a solution. What remains is to make BeginEdit and EndEdit private. This means that client code can only edit a PriorityCollection<T> object through the Edit method.

Replace action with interface #

You may complain that this solution isn't properly object-oriented, since it makes use of Action<T> and requires that client code uses lambda expressions.

We can easily fix that.

Instead of the action, you can introduce a Command interface with the same signature:

public interface IPriorityEditor<Twhere T : notnull
{
    void EditPriorities(PriorityCollection<Tpriorities);
}

Next, change the Edit method:

public void Edit(IPriorityEditor<Teditor)
{
    BeginEdit();
    editor.EditPriorities(this);
    EndEdit();
}

Now you have a nice, object-oriented design, with no lambda expressions in sight.

Full code dump #

The final code is complex enough that it's easy to lose track of what it looks like, as I walk through my process. To make it easer, here's the full code for the collection class:

public sealed class PriorityCollection<T> : IReadOnlyDictionary<Tbyte>
    where T : notnull
{
    private Dictionary<Tbyte> current;
    private readonly Dictionary<Tbyte> encapsulated;
    private readonly Dictionary<Tbyte> editable;
    private bool isEditing;
 
    public PriorityCollection(T initial)
    {
        encapsulated = new Dictionary<Tbyte> { { initial, 100 } };
        editable = [];
        current = encapsulated;
    }
 
    public void Add(T keybyte value)
    {
        AssertInvariants(current.Append(KeyValuePair.Create(keyvalue)));
        current.Add(keyvalue);
    }
 
    public void Remove(T key)
    {
        AssertInvariants(current.Where(kvp => !kvp.Key.Equals(key)));
        current.Remove(key);
    }
 
    public byte this[T key]
    {
        get { return current[key]; }
        set
        {
            var l = current.ToDictionary(kvp => kvp.Key, kvp => kvp.Value);
            l[key] = value;
            AssertInvariants(l);
 
            current[key] = value;
        }
    }
 
    public void Edit(IPriorityEditor<Teditor)
    {
        BeginEdit();
        editor.EditPriorities(this);
        EndEdit();
    }
 
    private void BeginEdit()
    {
        isEditing = true;
 
        editable.Clear();
        foreach (var kvp in current)
            editable.Add(kvp.Key, kvp.Value);
        current = editable;
    }
 
    private void EndEdit()
    {
        isEditing = false;
        try
        {
            AssertInvariants(current);
 
            encapsulated.Clear();
            foreach (var kvp in current)
                encapsulated.Add(kvp.Key, kvp.Value);
            current = encapsulated;
        }
        catch
        {
            current = encapsulated;
            throw;
        }
    }
 
    private void AssertInvariants(IEnumerable<KeyValuePair<Tbyte>> candidate)
    {
        if (!isEditing && candidate.Sum(kvp => kvp.Value) != 100)
            throw new InvalidOperationException(
                "The sum of all values must be 100.");
    }
 
    public IEnumerable<T> Keys
    {
        get { return current.Keys; }
    }
 
    public IEnumerable<byte> Values
    {
        get { return current.Values; }
    }
 
    public int Count
    {
        get { return current.Count; }
    }
 
    public bool ContainsKey(T key)
    {
        return current.ContainsKey(key);
    }
 
    public IEnumerator<KeyValuePair<Tbyte>> GetEnumerator()
    {
        return current.GetEnumerator();
    }
 
    public bool TryGetValue(T key, [MaybeNullWhen(false)] out byte value)
    {
        return current.TryGetValue(keyout value);
    }
 
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

The IPriorityEditor<T> interface remains as shown above.

Conclusion #

Given how simple the problem is, this solution is surprisingly complicated, and I'm fairly sure that it's not even thread-safe.

At least it does, as far as I can tell, protect the invariant that the sum of priorities must always be exactly 100. Even so, it's just complicated enough that I wouldn't be surprised if a bug is lurking somewhere. It'd be nice if a simpler design existed.

Next: An immutable priority collection.


Comments

Joker_vD #

Where does the notion come that a data structure invariant has to be true at all times? I am fairly certain that it's only required to be true at "quiescent" points of executions. That is, just as the loop invariant is only required to hold before and after each loop step but not inside the loop step, so is the data structure invariant is only required to hold before and after each invocation of its public methods.

This definition actually has an interesting quirk which is absent in the loop invariant: a data structure's method can't, generally speaking, call other public methods of the very same data structure because the invariant might not hold at this particular point of execution! I've been personally bitten by this a couple of times, and I've seen others tripping over this subtle point as well. You yourself notice it when you muse about the re-entrancy of the BeginEdit method.

Now, this particular problem is quite similar to the problem with inner iteration, and can be solved the same way, with the outer editor, as you've done, although I would have probably provided each editor with its own, separate editable dictionary because right now, the editors cannot nest/compose... but that'd complicate implementation even further.

2024-07-03 22:19 UTC

Thank you for writing. As so many other areas of knowledge, the wider field of software development suffers from the problem of overlapping or overloaded terminology. The word invariant is just one of them. In this context, invariant doesn't refer to loop invariants, or any other kind of invariants used in algorithmic analysis.

As outlined in the introduction article, when discussing encapsulation, I follow Object-Oriented Software Construction (OOSC). In that seminal work, Bertrand Meyer proposes the notion of design-by-contract, and specifically decomposes a contract into three parts: preconditions, invariants, and postconditions.

Having actually read the book, I'm well aware that it uses Eiffel as an exemplar of the concept. This has led many readers to conflate design-by-contract with Eiffel, and (in yet another logical derailment) conclude that it doesn't apply to, say, Java or C# programming.

It turns out, however, to transfer easily to other languages, and it's a concept with much practical potential.

A major problem with object-oriented design is that most ideas about good design are too 'fluffy' to be of immediate use to most developers. Take the Single Responsibility Principle (SRP) as an example. It's seductively easy to grasp the overall idea, but turns out to be hard to apply. Being able to identify reasons to change requires more programming experience than most people have. Or rather, the SRP is mostly useful to programmers who already have that experience. Being too 'fluffy', it's not a good learning tool.

I've spent quite some time with development organizations and individual programmers eager to learn, but struggling to find useful, concrete design rules. The decomposition of encapsulation into preconditions, invariants, and postconditions works well as a concrete, almost quantifiable heuristic.

Does it encompass everything that encapsulation means? Probably not, but it's by far the most effective heuristic that I've found so far.

Since I'm currently travelling, I don't have my copy of OOSC with me, but as far as I remember, the notion that an invariant should be true at all times originates there.

In any case, if an invariant doesn't always hold, then of what value is it? The whole idea behind encapsulation (as I read Meyer) is that client developers should be able to use 'objects' without having intimate knowledge of their implementation details. The use of contracts proposes to achieve that ideal by decoupling affordances from implementation details by condensing the legal protocol between object and client code into a contract. This means that a client developer, when making programming decisions, should be able to trust that certain guarantees stipulated by a contract always hold. If a client developer can't trust those guarantees, they aren't really guarantees.

"the data structure invariant is only required to hold before and after each invocation of its public methods"

I can see how a literal reading of OOSC may leave one with that impression. One must keep in mind, however, that the book was written in the eighties, at a time when multithreading wasn't much of a concern. (Incidentally, this is an omission that also mars a much later book on API design, the first edition of the .NET Framework Design Guidelines.)

In modern code, concurrent execution is a real possibility, so is at least worth keeping in mind. I'm still most familiar with the .NET ecosystem, and in it, there are plenty of classes that are documented as not being thread-safe. You could say that such a statement is part of the contract, in which case what you wrote is true: The invariant is only required to hold before and after each method invocation.

If, on the other hand, you want to make the code thread-safe, you must be more rigorous than that. Then an invariant must truly always hold.

This is, of course, a design decision one may take. Just don't bother with thread-safety if it's not important.

Still, the overall thrust of this article series is that immutability makes encapsulation much simpler. This is also true when it comes to concurrency. Immutable data structures are automatically thread-safe.

2024-07-06 8:07 UTC


Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Monday, 24 June 2024 17:59:00 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!
Published: Monday, 24 June 2024 17:59:00 UTC