# Thursday, February 02, 2012

A common criticism of loosely coupled code is that it’s harder to understand. How do you see the big picture of an application when loose coupling is everywhere? When the entire code base has been programmed against interfaces instead of concrete classes, how do we understand how the objects are wired and how they interact?

In this post, I’ll provide answers on various levels, from high-level architecture through object-oriented principles to more nitty-gritty code. Before I do that, however, I’d like to pose a set of questions you should always be prepared to answer.

Mu

My first reaction to that sort of question is: you say loosely coupled code is harder to understand. Harder than what?

If we are talking about a non-trivial application, odds are that it’s going to take some time to understand the code base – whether or not it’s loosely coupled. Agreed: understanding a loosely coupled code base takes some work, but so does understanding a tightly coupled code base. The real question is whether a loosely coupled code base is harder to understand than a tightly coupled one.

Imagine that I’m having a discussion about this subject with Mary Rowan from my book.

Mary: “Loosely coupled code is harder to understand.”

Me: “Why do you think that is?”

Mary: “It’s very hard to navigate the code base because I always end up at an interface.”

Me: “Why is that a problem?”

Mary: “Because I don’t know what the interface does.”

At this point I’m very tempted to answer Mu. An interface doesn’t do anything – that’s the whole point of it. According to the Liskov Substitution Principle (LSP), a consumer shouldn’t have to care about what happens on the other side of the interface.

However, developers used to tightly coupled code aren’t used to thinking about services in this way. They are used to navigating the code base from consumer to service to understand how the two of them interact, and I will gladly admit this: in principle, that’s impossible to do in a loosely coupled code base. I’ll return to this subject in a little while, but first I want to discuss some strategies for understanding a loosely coupled code base.

Architecture and Documentation

Yes: documentation. Don’t dismiss it. While I agree with Uncle Bob and like-minded spirits that the code is the documentation, a two-page document that outlines the Big Picture might save you from many hours of code archeology.

The typical agile mindset is to minimize documentation because it tends to lose touch with the code base, but even so, it should be possible to maintain a two-page high-level document so that it stays up to date. Consider the alternative: if you have so much architectural churn that even a two-page overview regularly falls behind, then you probably have a bigger problem than understanding your loosely coupled code base.

Maintaining such a document isn’t adverse to the agile spirit. You’ll find the same advice in Lean Architecture (p. 127). Don’t underestimate the value of such a document.

See the Forest Instead of the Trees

Understanding a loosely coupled code base typically requires a shift of mindset.

Recall my discussion with Mary Rowan. The criticism of loose coupling is that it’s difficult to understand which collaborators are being invoked. A developer like Mary Rowan is used to learning a code base by understanding all the myriad concrete details of it. In effect, while there may be ‘classes’ around, there are no abstractions in place. In order to understand the behavior of a user interface element, it’s necessary to also understand what’s happening in the database – and everything in between.

A loosely coupled code base shouldn’t be like that.

The entire purpose of loose coupling is that we should be able to reason about a part (a ‘unit’, if you will) without understanding the whole.

In a tightly coupled code base, it’s often impossible to see the forest for the trees. Although we developers are good at relating to details, a tightly coupled code base requires us to be able to contain the entire code base in our heads in order to understand it. As the size of the code base grows, this becomes increasingly difficult.

In a loosely coupled code base, on the other hand, it should be possible to understand smaller parts in isolation. However, I purposely wrote “should be”, because that’s not always the case. Often, a so-called “loosely coupled” code base violates basic principles of object-oriented design.

RAP

The criticism that it’s hard to see “what’s on the other side of an interface” is, in my opinion, central. It betrays a mindset which is still tightly coupled.

In many code bases there’s often a single implementation of a given interface, so developers can be forgiven if they think about an interface as only a piece of friction that prevents them from reaching the concrete class on the other side. However, if that’s the case with most of the interfaces in a code base, it signifies a violation of the Reused Abstractions Principle (RAP) more than it signifies loose coupling.

Jim Cooper, a reader of my book, put it quite eloquently on the book’s forum:

“So many people think that using an interface magically decouples their code. It doesn't. It only changes the nature of the coupling. If you don't believe that, try changing a method signature in an interface - none of the code containing method references or any of the implementing classes will compile. I call that coupled”

Refactoring tools aside, I completely agree with this statement. The RAP is a test we can use to verify whether or not an interface is truly reusable – what better test is there than to actually reuse your interfaces?

The corollary of this discussion is that if a code base is massively violating the RAP then it’s going to be hard to understand. It has all the disadvantages of loose coupling with few of the benefits. If that’s the case, you would gain more benefit from making it either more tightly coupled or truly loosely coupled.

What does “truly loosely coupled” mean?

LSP

According to the LSP a consumer must not concern itself with “what’s on the other side of the interface”. It should be possible to replace any implementation with any other implementation of the same interface without changing the correctness of the program.

This is why I previously said that in a truly loosely coupled code base, it isn’t ‘hard’ to understand “what’s on the other side of the interface” – it’s impossible. At design-time, there’s nothing ‘behind’ the interface. The interface is what you are programming against. It’s all there is.
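To make the substitution idea a bit more tangible, here’s a minimal sketch. The names IDiscountPolicy, NoDiscount, TenPercentDiscount and PriceCalculator are invented for this illustration and don’t come from any real code base. The point is only that PriceCalculator can be read and reasoned about without knowing which implementation it receives:

```csharp
using System;

// Hypothetical abstraction, invented for this sketch.
public interface IDiscountPolicy
{
    // Query: returns the discount to subtract from the amount.
    decimal GetDiscount(decimal amount);
}

// Two interchangeable implementations of the same interface.
public class NoDiscount : IDiscountPolicy
{
    public decimal GetDiscount(decimal amount)
    {
        return 0m;
    }
}

public class TenPercentDiscount : IDiscountPolicy
{
    public decimal GetDiscount(decimal amount)
    {
        return amount * 0.10m;
    }
}

// The consumer only knows the interface; at design-time
// there is nothing 'behind' it to navigate to.
public class PriceCalculator
{
    private readonly IDiscountPolicy policy;

    public PriceCalculator(IDiscountPolicy policy)
    {
        if (policy == null)
        {
            throw new ArgumentNullException("policy");
        }

        this.policy = policy;
    }

    public decimal GetPrice(decimal amount)
    {
        return amount - this.policy.GetDiscount(amount);
    }
}
```

Swapping NoDiscount for TenPercentDiscount changes the resulting price, but not the correctness of PriceCalculator itself.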

Mary has been listening to all of this, and now she protests:

Mary: “At run-time, there’s going to be a concrete class behind the interface.”

Me (being annoyingly pedantic): “Not quite. There’s going to be an instance of a concrete class which the consumer invokes via the interface it implements.”

Mary: “Yes, but I still need to know which concrete class is going to be invoked.”

Me: “Why?”

Mary: “Because otherwise I don’t know what’s going to happen when I invoke the method.”

This type of answer often betrays a much more fundamental problem in a code base.

CQS

Now we are getting into the nitty-gritty details of class design. What would you expect that the following method does?

public List<Order> GetOpenOrders(Customer customer)

The method name indicates that it gets open orders, and the signature seems to back it up. A single database query might be involved, since this looks a lot like a read-operation. A quick glance at the implementation seems to verify that first impression:

public List<Order> GetOpenOrders(Customer customer)
{
    var orders = GetOrders(customer);
    return (from o in orders
            where o.Status == OrderStatus.Open
            select o).ToList();
}

Is it safe to assume that this is a side-effect-free method call? As it turns out, this is far from the case in this particular code base:

private List<Order> GetOrders(Customer customer)
{
    var gw = new CustomerGateway(this.ConnectionString);
    var orders = gw.GetOrders(customer);
    AuditOrders(orders);
    FixCustomer(gw, orders, customer);
    return orders;
}

The devil is in the details. What does AuditOrders do? And what does FixCustomer do? One method at a time:

private void AuditOrders(List<Order> orders)
{
    var user = Thread.CurrentPrincipal.Identity.ToString();
    var gw = new OrderGateway(this.ConnectionString);
    foreach (var o in orders)
    {
        var clone = o.Clone();
        var ar = new AuditRecord
        {
            Time = DateTime.Now,
            User = user
        };
        clone.AuditTrail.Add(ar);
        gw.Update(clone);
 
        // We don't want the consumer to see the audit trail.
        o.AuditTrail.Clear();
    }
}

OK, it turns out that this method actually makes a copy of each and every order and updates that copy, writing it back to the database in order to leave behind an audit trail. It also mutates each order before returning to the caller. Not only does this method result in an unexpected N+1 problem, it also mutates its input, and perhaps even more surprising, it’s leaving the system in a state where the in-memory object is different from the database. This could lead to some pretty interesting bugs.

Then what happens in the FixCustomer method? This:

// Fixes the customer status field if there were orders
// added directly through the database.
private static void FixCustomer(CustomerGateway gw,
    List<Order> orders, Customer customer)
{
    var total = orders.Sum(o => o.Total);
    if (customer.Status != CustomerStatus.Preferred
        && total > PreferredThreshold)
    {
        customer.Status = CustomerStatus.Preferred;
        gw.Update(customer);
    }
}

Another potential database write operation, as it turns out – complete with an apology. Now that we’ve learned all about the details of the code, even the GetOpenOrders method is beginning to look suspect. The GetOrders method returns all orders, with the side effect that all orders were audited as having been read by the user, but GetOpenOrders then filters the output, so the user never sees some of the orders that the audit trail claims were read. In the end, it turns out that we can’t even trust the audit trail.

While I must apologize for this contrived example of a Transaction Script, it’s clear that when code looks like that, it’s no wonder that developers think that it’s necessary to contain the entire code base in their heads. When this is the case, interfaces are only in the way.

However, this is not the fault of loose coupling, but rather a failure to adhere to the very fundamental principle of Command-Query Separation (CQS). You should be able to tell from the method signature alone whether invoking the method will or will not have side-effects. This is one of the key messages from Clean Code: the method name and signature are an abstraction. You should be able to reason about the behavior of the method from its declaration. You shouldn’t have to read the code to get an idea about what it does.

Abstractions always hide details. Method declarations do too. The point is that you should be able to read just the method declaration in order to gain a basic understanding of what’s going on. You can always return to the method’s code later in order to understand detail, but reading the method declaration alone should provide the Big Picture.

Strictly adhering to CQS goes a long way in enabling you to understand a loosely coupled code base. If you can reason about methods at a high level, you don’t need to see “the other side of the interface” in order to understand the Big Picture.
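As a rough illustration, here’s one way the declarations could be separated so that the signatures alone reveal which operations have side effects. This is only a sketch: IOrderQuery, IOrderAuditor and their implementations are hypothetical names, and Order and OrderStatus are simplified stand-ins for the types in the example above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the types used in the example above.
public enum OrderStatus { Open, Closed }

public class Order
{
    public int Id { get; set; }
    public OrderStatus Status { get; set; }
}

// Query: returns a value and is expected to be free of side effects.
public interface IOrderQuery
{
    IEnumerable<Order> GetOpenOrders(IEnumerable<Order> orders);
}

// Command: returns void, which signals that it exists for its side effect.
public interface IOrderAuditor
{
    void RecordRead(Order order, string user);
}

public class OpenOrderQuery : IOrderQuery
{
    public IEnumerable<Order> GetOpenOrders(IEnumerable<Order> orders)
    {
        // Filters into a new list; neither the input sequence
        // nor the orders themselves are mutated.
        return orders.Where(o => o.Status == OrderStatus.Open).ToList();
    }
}

// In-memory auditor, only here to keep the sketch self-contained.
public class InMemoryOrderAuditor : IOrderAuditor
{
    private readonly List<string> entries = new List<string>();

    public IList<string> Entries
    {
        get { return this.entries; }
    }

    public void RecordRead(Order order, string user)
    {
        this.entries.Add(user + " read order " + order.Id);
    }
}
```

With this split, a caller that wants auditing has to invoke RecordRead explicitly, so the write is visible at the call site instead of hidden inside a method that looks like a read.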

Stack Traces

Still, even in a loosely coupled code base with great test coverage, integration issues arise. While each class works fine in isolation, when you integrate them, sometimes the interactions between them cause errors. This is often because of incorrect assumptions about the collaborators, which often indicates that the LSP was somehow violated.

To understand why such errors occur, we need to understand which concrete classes are interacting. How do we do that in a loosely coupled code base?

That’s actually easy: look at the stack trace from your error report. If your error report doesn’t include a stack trace, make sure that it’s going to do that in the future.

The stack trace is one of the most important troubleshooting tools in a loosely coupled code base, because it’s going to tell you exactly which classes were interacting when an exception was thrown.
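Here’s a small sketch of that point. IMessageSender, SmtpMessageSender and Notifier are hypothetical names made up for the illustration. The consumer only knows the interface, yet the stack trace of the exception names the concrete class that threw:

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical abstraction and implementation, invented for this sketch.
public interface IMessageSender
{
    void Send(string message);
}

public class SmtpMessageSender : IMessageSender
{
    // NoInlining keeps this frame visible in optimized builds.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public void Send(string message)
    {
        throw new InvalidOperationException("Mail server unavailable.");
    }
}

// The consumer is written against the interface only.
public class Notifier
{
    private readonly IMessageSender sender;

    public Notifier(IMessageSender sender)
    {
        this.sender = sender;
    }

    public void Notify(string message)
    {
        this.sender.Send(message);
    }
}

public static class StackTraceDemo
{
    // Returns the stack trace of the exception thrown through the interface.
    public static string Capture()
    {
        try
        {
            new Notifier(new SmtpMessageSender()).Notify("hello");
            return string.Empty;
        }
        catch (InvalidOperationException e)
        {
            return e.StackTrace;
        }
    }
}
```

The string returned by StackTraceDemo.Capture includes a frame for SmtpMessageSender.Send, which is exactly the information Mary was asking for. The exact formatting of the trace, and which intermediate frames survive compiler optimizations, can vary between runtimes.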

Furthermore, if the code base also adheres to the Single Responsibility Principle and the ideals from Clean Code, each method should be very small (under 15 lines of code). If that’s the case, you can often understand the exact nature of the error from the stack trace and the error message alone. It shouldn’t even be necessary to attach a debugger to understand the bug, but in a pinch, you can still do that.

Tooling

Returning to the original question, I often hear people advocating tools such as IDE add-ins which support navigation across interfaces. Such tools might provide a menu option which enables you to “go to implementation”. At this point it should be clear that such a tool is mainly going to be helpful in code bases that violate the RAP.

(I’m not naming any particular tools here because I’m not interested in turning this post into a discussion about the merits of various specific tools.)

Conclusion

It’s the responsibility of the loosely coupled code base to make sure that it’s easy to understand the Big Picture and that it’s easy to work with. In the end, that responsibility falls on the developers who write the code – not the developer who’s trying to understand it.

posted on Thursday, February 02, 2012 9:37:40 PM (Romance Standard Time, UTC+01:00)
# Tuesday, January 03, 2012

SOLID is a set of principles that, if applied consistently, has some surprising effects on code. In a previous post I provided a sketch of what it means to meticulously apply the Single Responsibility Principle. In this article I will describe what happens when you follow the Open/Closed Principle (OCP) to its logical conclusion.

In case a refresher is required, the OCP states that a class should be open for extension, but closed for modification. It seems to me that people often forget the second part. What does it mean?

It means that once implemented, you shouldn’t touch that piece of code ever again (unless you need to correct a bug).

Then how can new functionality be added to a code base? This is still possible through either inheritance or polymorphic recomposition. Since the L in SOLID signifies the Liskov Substitution Principle, SOLID code tends to be based on loosely coupled code composed into an application through copious use of interfaces – basically, Strategies injected into other Strategies and so on (also due to Dependency Inversion Principle). In order to add functionality, you can create new implementations of these interfaces and redefine the application’s Composition Root. Perhaps you’d be wrapping existing functionality in a Decorator or adding it to a Composite.
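As a minimal sketch of what adding functionality by wrapping could look like, consider the following Decorator. The names IMessageWriter, ListMessageWriter and UpperCaseMessageWriter are hypothetical and only serve the illustration:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction, invented for this sketch.
public interface IMessageWriter
{
    void Write(string message);
}

// Existing implementation: it stays exactly as it is.
public class ListMessageWriter : IMessageWriter
{
    private readonly List<string> messages = new List<string>();

    public IList<string> Messages
    {
        get { return this.messages; }
    }

    public void Write(string message)
    {
        this.messages.Add(message);
    }
}

// New behavior is appended as a new class that decorates
// any other IMessageWriter.
public class UpperCaseMessageWriter : IMessageWriter
{
    private readonly IMessageWriter inner;

    public UpperCaseMessageWriter(IMessageWriter inner)
    {
        if (inner == null)
        {
            throw new ArgumentNullException("inner");
        }

        this.inner = inner;
    }

    public void Write(string message)
    {
        this.inner.Write(message.ToUpperInvariant());
    }
}
```

The only place that needs to change is the composition: instead of handing consumers a ListMessageWriter, the Composition Root would hand them new UpperCaseMessageWriter(new ListMessageWriter()).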

Once in a while, you’ll stop using an old implementation of an interface. Should you then delete this implementation? What would be the point? At a certain point in time, this implementation was valuable. Maybe it will become valuable again. Leaving it as a potential building block seems a better choice.

Thus, if we think about working with code as a CRUD endeavor, SOLID code can be Created and Read, but never Updated or Deleted. In other words, true SOLID code is append-only code.

Example: Changing AutoFixture’s Number Generation Algorithm

In early 2011 an issue was reported for AutoFixture: Anonymous numbers were created in monotonically increasing sequences, but with separate sequences for each number type:

integers: 1, 2, 3, 4, 5, …

decimals: 1.0, 2.0, 3.0, 4.0, 5.0, …

and so on. However, the person reporting the issue thought it made more sense if all numbers shared a single sequence. After thinking about it a little while, I agreed.

Because the AutoFixture code base is fairly SOLID we decided to leave the old implementations in place and implement the new behavior in new classes.

The old behavior was composed from a set of ISpecimenBuilders. As an example, integers were generated by this class:

public class Int32SequenceGenerator : ISpecimenBuilder
{
    private int i;
 
    public int CreateAnonymous()
    {
        return Interlocked.Increment(ref this.i);
    }
 
    public object Create(object request,
        ISpecimenContext context)
    {
        if (request != typeof(int))
        {
            return new NoSpecimen(request);
        }
 
        return this.CreateAnonymous();
    }
}

Similar implementations generated decimals, floats, doubles, etc. Instead of modifying any of these classes, we left them in the code base and created a new ISpecimenBuilder that generates all numbers from a single sequence:

public class NumericSequenceGenerator : ISpecimenBuilder
{
    private int value;
 
    public object Create(object request,
        ISpecimenContext context)
    {
        var type = request as Type;
        if (type == null)
            return new NoSpecimen(request);
 
        return this.CreateNumericSpecimen(type);
    }
 
    private object CreateNumericSpecimen(Type request)
    {
        var typeCode = Type.GetTypeCode(request);
 
        switch (typeCode)
        {
            case TypeCode.Byte:
                return (byte)this.GetNextNumber();
            case TypeCode.Decimal:
                return (decimal)this.GetNextNumber();
            case TypeCode.Double:
                return (double)this.GetNextNumber();
            case TypeCode.Int16:
                return (short)this.GetNextNumber();
            case TypeCode.Int32:
                return this.GetNextNumber();
            case TypeCode.Int64:
                return (long)this.GetNextNumber();
            case TypeCode.SByte:
                return (sbyte)this.GetNextNumber();
            case TypeCode.Single:
                return (float)this.GetNextNumber();
            case TypeCode.UInt16:
                return (ushort)this.GetNextNumber();
            case TypeCode.UInt32:
                return (uint)this.GetNextNumber();
            case TypeCode.UInt64:
                return (ulong)this.GetNextNumber();
            default:
                return new NoSpecimen(request);
        }
    }
 
    private int GetNextNumber()
    {
        return Interlocked.Increment(ref this.value);
    }
}

Adding a new class in itself has no effect, so in order to recompose the default behavior of AutoFixture, we changed a class called DefaultPrimitiveBuilders by removing the old ISpecimenBuilders like Int32SequenceGenerator and instead adding NumericSequenceGenerator:

yield return new StringGenerator(() => 
    Guid.NewGuid());
yield return new ConstrainedStringGenerator();
yield return new StringSeedRelay();
yield return new NumericSequenceGenerator();
yield return new CharSequenceGenerator();
yield return new RangedNumberGenerator();
// even more builders...

NumericSequenceGenerator is the fourth class being yielded here. Before we added NumericSequenceGenerator, this class instead yielded Int32SequenceGenerator and similar classes. These were removed.

The DefaultPrimitiveBuilders class is part of AutoFixture’s default Facade and is the closest we get to a Composition Root for the library. Recomposing this Facade enabled us to change the behavior of AutoFixture without modifying (other) existing classes.
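To show the observable difference without pulling in AutoFixture itself, here’s a condensed stand-in. SharedSequenceSketch is a hypothetical class that only mimics the idea behind NumericSequenceGenerator for two numeric types; it doesn’t implement ISpecimenBuilder, and it returns null where the real class returns a NoSpecimen:

```csharp
using System;
using System.Threading;

// Hypothetical, condensed stand-in for the idea of a single shared sequence.
public class SharedSequenceSketch
{
    private int value;

    public object Create(Type request)
    {
        switch (Type.GetTypeCode(request))
        {
            case TypeCode.Int32:
                return this.GetNextNumber();
            case TypeCode.Decimal:
                return (decimal)this.GetNextNumber();
            default:
                // Simplification: the real class returns a NoSpecimen here.
                return null;
        }
    }

    private int GetNextNumber()
    {
        // All numeric types draw from the same counter.
        return Interlocked.Increment(ref this.value);
    }
}
```

Requesting an int, then a decimal, then another int from the same instance produces 1, 2 and 3, whereas the old per-type generators would have produced 1, 1 and 2.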

As Enrico (who implemented this change) points out, the beauty is that the previous behavior is still in the box, and all it takes is a single method call to bring it back:

var fixture = new Fixture().Customize(
    new NumericSequencePerTypeCustomization());

The only class we had to modify was the DefaultPrimitiveBuilders, which is where the object graph is composed. In applications this corresponds to the Composition Root, so even in the face of SOLID code, you still need to modify the Composition Root in order to recompose the application. However, use of a good DI Container and a strong set of conventions can do much to minimize the required editing of such a class.

SOLID versus Refactoring

SOLID is a goal I strive towards in the way I write code and design APIs, but I don’t think I’ve ever written a significant code base which is perfectly SOLID. While I consider AutoFixture a ‘fairly’ SOLID code base, it’s not perfect, and I’m currently performing some design work in order to change some abstractions for version 3.0. This will require changing some of the existing types and thereby violating the OCP.

It’s worth noting that as long as you can stick with the OCP you can avoid introducing breaking changes. A breaking change is also an OCP violation, so adhering to the OCP is more than just an academic exercise – particularly if you write reusable libraries.

Still, while none of my code is perfect and I occasionally have to refactor, I don’t refactor much. By definition, refactoring means violating the OCP, and while I have nothing against refactoring code when it’s required, I much prefer putting myself in a situation where it’s rarely necessary in the first place.

I’ve often been derided for my lack of use of Resharper. When replying that I have little use for Resharper because I write SOLID code and thus don’t do much refactoring, I’ve been ridiculed for being totally clueless. People don’t realize the intimate relationship between SOLID and refactoring. I hope this post has helped highlight that connection.

posted on Tuesday, January 03, 2012 3:43:47 PM (Romance Standard Time, UTC+01:00)
# Monday, December 19, 2011

Recently I received a question from Kelly Sommers about good ways to refactor away from Factory Overload. Basically, she’s working in a code base where there’s an explosion of Abstract Factories which seems to be counter-productive. In this post I’ll take a look at the example problem and propose a set of alternatives.

An Abstract Factory (and its close relative Product Trader) can serve as a solution to various challenges that come up when writing loosely coupled code (chapter 6 of my book describes the most common scenarios). However, introducing an Abstract Factory may be a leaky abstraction, so don’t do it blindly. For example, an Abstract Factory is rarely the best approach to address lifetime management concerns. In other words, the Abstract Factory has to make sense as a pure model element.

That’s not the case in the following example.

Problem Statement

The question centers around a code base that integrates with a database closely guarded by DBA police. Apparently, every single database access must happen through a set of very fine-grained stored procedures.

For example, to update the first name of a user, a set of stored procedures exist to support this scenario, depending on the context of the current application user:

User type    Stored procedure              Parameter name
Admin        update_admin_firstname        adminFirstName
Guest        update_guest_firstname        guestFirstName
Regular      update_regular_firstname      regularFirstName
Restricted   update_restricted_firstname   restrictedFirstName

As this table demonstrates, not only is there a stored procedure for each user context, but the parameter name differs as well. However, in this particular case it seems as though there’s a pattern to the names.

If this pattern is consistent, I think the easiest way to address these variations would be to algorithmically build the strings from a couple of templates.
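A templated approach could be sketched like this, assuming the naming pattern in the table holds for every user type. The FirstNameProcedureNames class is a hypothetical helper, and the UserTypes enum is declared here only to keep the sketch self-contained:

```csharp
using System;

// Declared here for the sketch; the members follow the table above.
public enum UserTypes { Admin, Guest, Regular, Restricted }

// Hypothetical helper that derives the names from templates.
public static class FirstNameProcedureNames
{
    public static string GetProcedureName(UserTypes userType)
    {
        // E.g. UserTypes.Admin becomes "update_admin_firstname".
        return "update_" + userType.ToString().ToLowerInvariant() + "_firstname";
    }

    public static string GetParameterName(UserTypes userType)
    {
        // E.g. UserTypes.Admin becomes "adminFirstName".
        return userType.ToString().ToLowerInvariant() + "FirstName";
    }
}
```

Taking an enum rather than a free-form string means the names can only be built from the known user types.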

However, this is not the route taken by Kelly’s team, so I assume that things are more complicated than that; apparently, a templated approach is not viable, so for the rest of this article I’m going to assume that it’s necessary to write at least some code to address each case individually.

The current solution that Kelly’s team has implemented is to use an Abstract Factory (Product Trader) to translate the user type into an appropriate IUserFirstNameModifier instance. From the consuming code, it looks like this:

var modifier = factory.Create(UserTypes.Admin);
modifier.Commit("first");

where the factory variable is an instance of the IUserFirstNameModifierFactory interface. This is certainly loosely coupled, but looks like a leaky abstraction. Why is a factory needed? It seems that its single responsibility is to translate a UserTypes instance (an enum) into an IUserFirstNameModifier. There’s a code smell buried here – try to spot it before you read on :)

Proposed Solution

Kelly herself suggests an alternative involving a concrete Builder which can create instances of a single concrete UserFirstNameModifier with or without an implicit conversion:

// Implicit conversion.
UserFirstNameModifier modifier1 = 
    builder.WithUserType(UserTypes.Guest);
 
// Without implicit conversion.
var modifier2 = builder
    .WithUserType(UserTypes.Restricted)
    .Create();

While this may seem to reduce the number of classes involved, it has several drawbacks:

  • First of all, the Fluent Builder pattern implies that you can forgo invoking any of the WithXyz methods (WithUserType) and just accept all the default values encapsulated in the builder. This again implies that there’s a default user type, which may or may not make sense in that particular domain. Looking at Kelly’s code, UserTypes is an enum (and thus has a default value), so if WithUserType isn’t invoked, the Create method defaults to UserTypes.Admin. That’s a bit too implicit for my taste.
  • Since all involved classes are now concrete, the proposed solution isn’t extensible (and, as a corollary, it’s hard to unit test).
  • The builder is essentially a big switch statement.
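The first drawback can be illustrated with a small sketch. UserTypeBuilderSketch is a hypothetical stand-in for the builder, reduced to the one field that matters here, and the sketch assumes that Admin is the first member of the enum (and therefore has the value 0):

```csharp
using System;

// Assumption: Admin is declared first, so it is the enum's default value.
public enum UserTypes { Admin, Guest, Regular, Restricted }

// Hypothetical, stripped-down builder.
public class UserTypeBuilderSketch
{
    // Never assigned unless WithUserType is called,
    // so it silently starts out as default(UserTypes).
    private UserTypes userType;

    public UserTypeBuilderSketch WithUserType(UserTypes userType)
    {
        this.userType = userType;
        return this;
    }

    // Returns the user type the builder would create a modifier for.
    public UserTypes Create()
    {
        return this.userType;
    }
}
```

Calling new UserTypeBuilderSketch().Create() yields UserTypes.Admin without the word Admin appearing anywhere at the call site.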

Both the current implementation and the proposed solution involve passing an enum as a method parameter to a different class. If you’ve read and memorized Refactoring you should by now have recognized both a code smell and the remedy.

Alternative 1a: Make UserType Polymorphic

The code smell is Feature Envy and a possible refactoring is to replace the enum with a Strategy. In order to do that, an IUserType interface is introduced:

public interface IUserType
{
    IUserFirstNameModifier CreateUserFirstNameModifier();
}

Usage becomes as simple as this:

var modifier = userType.CreateUserFirstNameModifier();

Obviously, more methods can be added to IUserType to support other update operations, but care should be taken to avoid creating a Header Interface.

While this solution is much more object-oriented, I’m still not quite happy with it, because apparently, the context is a CQRS style architecture. Since an update operation is essentially a Command, then why model the implementation along the lines of a Query? Both Abstract Factory and Factory Method patterns represent Queries, so it seems redundant in this case. It should be possible to apply the Hollywood Principle here.

Alternative 1b: Tell, Don’t Ask

Why have the user type return a modifier? Why can’t it perform the update itself? The IUserType interface should be changed to something like this:

public interface IUserType
{
    void CommitUserFirstName(string firstName);
}

This makes it easier for the consumer to commit the user’s first name because it can be done directly on the IUserType instance instead of first creating the modifier.

It also makes it much easier to unit test the consumer because there’s no longer a mix of Commands and Queries within the same method. From Growing Object-Oriented Software we know that Queries should be modeled with Stubs and Commands with Mocks, and if you’ve ever tried mixing the two you know that it’s a sort of interaction that should be minimized.
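As a sketch of how such a test could look without any mocking library, here’s a hand-rolled recording test double that plays the role of the Mock. RenameUserHandler and RecordingUserType are hypothetical names, and the interface is repeated so that the example is self-contained:

```csharp
using System;
using System.Collections.Generic;

public interface IUserType
{
    void CommitUserFirstName(string firstName);
}

// Hypothetical consumer: it only issues a Command.
public class RenameUserHandler
{
    private readonly IUserType userType;

    public RenameUserHandler(IUserType userType)
    {
        if (userType == null)
        {
            throw new ArgumentNullException("userType");
        }

        this.userType = userType;
    }

    public void Handle(string newFirstName)
    {
        this.userType.CommitUserFirstName(newFirstName);
    }
}

// Hand-rolled test double that records the Commands it receives.
public class RecordingUserType : IUserType
{
    private readonly List<string> committed = new List<string>();

    public IList<string> Committed
    {
        get { return this.committed; }
    }

    public void CommitUserFirstName(string firstName)
    {
        this.committed.Add(firstName);
    }
}
```

A test then only has to verify that the expected Command was issued; there’s no Query result to stub first.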

Alternative 2a: Distinguish by Type

While I personally like alternative 1b best, it may not be practical in all situations, so it’s always valuable to examine other alternatives.

The root cause of the problem is that there are a lot of stored procedures. I want to reiterate that I still think that the absolutely easiest solution would be to generate a SqlCommand from a string template, but given that this article assumes that this isn’t possible or desirable, it follows that code must be written for each stored procedure.

Why not simply define an interface for each one? As an example, to update the user’s first name in the context of being an ‘Admin’ user, this Role Interface can be used:

public interface IUserFirstNameAdminModifier
{
    void Commit(string firstName);
}

Similar interfaces can be defined for the other user types, such as IUserFirstNameRestrictedModifier, IUserFirstNameGuestModifier and so on.

This is a very simple solution; it’s easy to implement, but risks violating the Reused Abstractions Principle (RAP).

Alternative 2b: Distinguish by Generic Type

The problem with introducing interfaces like IUserFirstNameAdminModifier, IUserFirstNameRestrictedModifier, IUserFirstNameGuestModifier etc. is that they differ only by name. The Commit method is the same for all these interfaces, so this seems to violate the RAP. It’d be better to merge all these interfaces into a single interface, which is what Kelly’s team is currently doing. However, the problem with this is that the type carries no information about the role that the modifier is playing.

Another alternative is to turn the modifier interface into a generic interface like this:

public interface IUserFirstNameModifier<T> 
    where T : IUserType
{
    void Commit(string firstName);
}

The IUserType is a Marker Interface, so .NET purists are not going to like this solution, since the .NET Type Design Guidelines recommend against using Marker Interfaces. However, it’s impossible to constrain a generic type argument against an attribute, so the party line solution is out of the question.

This solution ensures that consumers can now have dependencies on IUserFirstNameModifier<AdminUserType>, IUserFirstNameModifier<RestrictedUserType>, etc.
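A consumer taking such a dependency could look like the following sketch. AdminUserType and GuestUserType are empty marker classes, while AdminFirstNameModifier and AdminConsole are hypothetical names; the modifier only records the value instead of calling a stored procedure:

```csharp
using System;

// Marker Interface and marker classes: they carry no members.
public interface IUserType { }

public class AdminUserType : IUserType { }

public class GuestUserType : IUserType { }

public interface IUserFirstNameModifier<T>
    where T : IUserType
{
    void Commit(string firstName);
}

// Hypothetical implementation; a real one would presumably
// invoke the stored procedure for admin users.
public class AdminFirstNameModifier : IUserFirstNameModifier<AdminUserType>
{
    public string LastCommitted { get; private set; }

    public void Commit(string firstName)
    {
        this.LastCommitted = firstName;
    }
}

// Hypothetical consumer: the role is now part of the dependency's type.
public class AdminConsole
{
    private readonly IUserFirstNameModifier<AdminUserType> modifier;

    public AdminConsole(IUserFirstNameModifier<AdminUserType> modifier)
    {
        if (modifier == null)
        {
            throw new ArgumentNullException("modifier");
        }

        this.modifier = modifier;
    }

    public void Rename(string firstName)
    {
        this.modifier.Commit(firstName);
    }
}
```

Passing an IUserFirstNameModifier<GuestUserType> to AdminConsole wouldn’t compile, so the compiler now guards against mixing up the roles.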

However, the need for a marker interface gives off another smell.

Alternative 3: Distinguish by Role

The problem at the heart of alternative 2 is that it attempts to use the type of the interfaces as an indicator of the roles that Services play. It seems that making the type distinct works against the RAP, but when the RAP is applied, the type becomes ambiguous.

However, as Ted Neward points out in his excellent series on Multiparadigmatic .NET, the type is only one axis of variability among many. Perhaps, in this case, it may be much easier to use the name of the dependency to communicate its role instead of the type.

Given a single, ambiguous IUserFirstNameModifier interface (just as in the original problem statement), a consumer can distinguish between the various roles of modifiers by their names:

public partial class SomeConsumer
{
    private readonly IUserFirstNameModifier adminModifier;
    private readonly IUserFirstNameModifier guestModifier;
 
    public SomeConsumer(
        IUserFirstNameModifier adminModifier,
        IUserFirstNameModifier guestModifier)
    {
        this.adminModifier = adminModifier;
        this.guestModifier = guestModifier;
    }
 
    public void DoSomething()
    {
        if (this.UseAdmin)
            this.adminModifier.Commit("first");
        else
            this.guestModifier.Commit("first");
    }
}

Now it’s entirely up to the Composition Root to compose SomeConsumer with the correct modifiers, and while this can be done manually, it’s an excellent case for a DI Container and a bit of Convention over Configuration.
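Composed manually, that could look like the following sketch. To keep it self-contained, SomeConsumer is repeated here as a complete class with a settable UseAdmin property, which is an assumption on my part. LoggingModifier and CompositionRoot are hypothetical names, and the modifier only writes to a log instead of calling the database:

```csharp
using System;
using System.Collections.Generic;

public interface IUserFirstNameModifier
{
    void Commit(string firstName);
}

// Hypothetical implementation: logs which stored procedure
// would have been invoked.
public class LoggingModifier : IUserFirstNameModifier
{
    private readonly string procedureName;
    private readonly ICollection<string> log;

    public LoggingModifier(string procedureName, ICollection<string> log)
    {
        this.procedureName = procedureName;
        this.log = log;
    }

    public void Commit(string firstName)
    {
        this.log.Add(this.procedureName + ": " + firstName);
    }
}

public class SomeConsumer
{
    private readonly IUserFirstNameModifier adminModifier;
    private readonly IUserFirstNameModifier guestModifier;

    public SomeConsumer(
        IUserFirstNameModifier adminModifier,
        IUserFirstNameModifier guestModifier)
    {
        this.adminModifier = adminModifier;
        this.guestModifier = guestModifier;
    }

    // Assumption: a simple settable flag decides which role is used.
    public bool UseAdmin { get; set; }

    public void DoSomething()
    {
        if (this.UseAdmin)
            this.adminModifier.Commit("first");
        else
            this.guestModifier.Commit("first");
    }
}

public static class CompositionRoot
{
    // The only place that knows which implementation plays which role.
    public static SomeConsumer Compose(ICollection<string> log)
    {
        return new SomeConsumer(
            new LoggingModifier("update_admin_firstname", log),
            new LoggingModifier("update_guest_firstname", log));
    }
}
```

The argument order in Compose is the only thing tying each implementation to its role, which is precisely where a naming convention can help a DI Container do the matching.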

Conclusion

I’m sure that if I’d spent more time analyzing the problem I could have come up with more alternatives, but this post is becoming long enough already.

Of the alternatives I’ve suggested here, I prefer 1b or 3, depending on the exact requirements.

posted on Monday, December 19, 2011 2:04:55 PM (Romance Standard Time, UTC+01:00)
# Wednesday, December 07, 2011

Asynchronous message passing combined with eventual consistency makes it possible to build very scalable systems. However, sometimes eventual consistency isn’t appropriate in parts of the system, while it’s acceptable in other parts. How can a consistent architecture be defined to fit both ACID and eventual consistency? This article provides an answer.

The case of an online game

Last week I visited Pixel Pandemic, a company that produces browser-based MMORPGs. Since each game world has lots of players who can all potentially interact with each other, scalability is very important.

In traditional line of business applications, eventual consistency is often an excellent fit because the application is a projection of the real world. My favorite example is an inventory system: it models what’s going on in one or more physical warehouses, but the real world is the ultimate source of truth. A warehouse worker might accidentally drop and damage some of the goods, in which case the application must adjust after the fact.

In other words, the information contained within line of business applications tends to lag behind the real world. It’s impossible to guarantee that the application is always consistent with the real world, so eventual consistency is a natural fit.

That’s not the case with an online game world. The game world itself is the source of truth, and it must be internally consistent at all times. As an example, in Zombie Pandemic, players fight against zombies and may take damage along the way. Players can heal themselves, but they would prefer (I gather) that the healing action takes place immediately, and not some time in the future where the character might be dead. Similarly, when a player hits a zombie, they’d prefer to apply the damage immediately. (However, I think that even here, eventual consistency might provide some interesting game mechanics, but that’s another discussion.)

While discussing these matters with the nice people in Pixel Pandemic, it turned out that while some parts of the game world have to be internally consistent, it’s perfectly acceptable to use eventual consistency in other cases. One example is the game’s high score table. While a single player should have a consistent view of his or her own score, it’s acceptable if the high score table lags a bit.

At this point it seemed clear that this particular online game could use an appropriate combination of ACID and eventual consistency, and I think this conclusion can be generalized. The question now becomes: how can a consistent architecture encompass both types of consistency?

Problem statement

With the above example scenario in mind the problem statement can be generalized:

Given that an application should apply a mix of ACID and eventual consistency, how can a consistent architecture be defined?

Keep in mind that ACID consistency implies that all writes to a transactional resource must take place as a blocking method call. This seems to be at odds with the concept of asynchronous message passing that works so well with eventual consistency.

However, an application architecture where blocking ACID calls are fundamentally different than asynchronous message passing isn’t really an architecture at all. Developers will have to decide up-front whether or not a given operation is or isn’t synchronous, so the ‘architecture’ offers little implementation guidance. The end result is likely to be a heterogeneous mix of Services, Repositories, Units of Work, Message Channels, etc. A uniform principle will be difficult to distill, and the whole thing threatens to devolve into Spaghetti Code.

The solution turns out to be not at all difficult, but it requires that we invert our thinking a bit. Most of us tend to think about synchronous code first. When we think about code performing synchronous work it seems difficult (perhaps even impossible) to retrofit asynchrony to that model. On the other hand, the converse isn’t true.

Given an asynchronous API, it’s trivial to provide a synchronous, blocking implementation.

Adopting an architecture based on asynchronous message passing (the Pipes and Filters architecture) enables both scenarios. Eventual consistency can be achieved by passing messages around on persistent queues, while ACID consistency can be achieved by handling a message in a blocking call that enlists a (potentially distributed) transaction.

An example seems to be in order here.

Example: keeping score

In the online game world, each player accumulates a score based on his or her actions. From the perspective of the player, the score should always be consistent. When you defeat the zombie boss, you want to see the result in your score right away. That sounds an awful lot like the Player is an Aggregate Root and the score is part of that Entity. ACID consistency is warranted whenever the Player is updated.

On the other hand, each time a score changes it may influence the high score table, but this doesn’t need to be ACID consistent; eventual consistency is fine in this case.

Once again, polymorphism comes to the rescue.

Imagine that the application has a GameEngine class that handles updates in the game. Using an injected IChannel<ScoreChangedEvent>, it can update the score for a player as simply as this:

/* Lots of other interesting things happen
    * here, like calculating the new score... */
 
var cmd =
    new ScoreChangedEvent(this.playerId, score);
this.pointsChannel.Send(cmd);

The Send method returns void, so it’s a good example of a naturally asynchronous API. However, the implementation must do two things:

  • Update the Player Aggregate Root in a transaction
  • Update the high score table (eventually)

That’s two different types of consistency within the same method call.
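The post never lists the channel interface or the event class it sends. A minimal sketch of both, assuming a single void Send method and an event with two read-only properties; the Guid and int types are guesses, since only the constructor call and the PlayerId property appear in the post:

```csharp
using System;

// Assumed shape: a single void Send method, as the calling code implies.
public interface IChannel<T>
{
    void Send(T message);
}

// Assumed shape: the post only shows new ScoreChangedEvent(playerId, score)
// and message.PlayerId. The property types are guesses for this sketch.
public class ScoreChangedEvent
{
    private readonly Guid playerId;
    private readonly int score;

    public ScoreChangedEvent(Guid playerId, int score)
    {
        this.playerId = playerId;
        this.score = score;
    }

    public Guid PlayerId
    {
        get { return this.playerId; }
    }

    public int Score
    {
        get { return this.score; }
    }
}
```

Because Send returns nothing, a caller can't tell whether an implementation blocks or defers the work, and that is what leaves room for both kinds of consistency behind the same interface.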

The first step to enable this is to employ the trusty old Composite design pattern:

public class CompositeChannel<T> : IChannel<T>
{
    private readonly IEnumerable<IChannel<T>> channels;
 
    public CompositeChannel(params IChannel<T>[] channels)
    {
        this.channels = channels;
    }
 
    public void Send(T message)
    {
        foreach (var c in this.channels)
        {
            c.Send(message);
        }
    }
}

With a Composite channel it’s possible to compose a polymorphic mix of IChannel<T> implementations, some blocking and some asynchronous.

ACID write

To update the Player Aggregate Root a simple Adapter writes the event to a persistent data store. This could be a relational database, a document database, a REST resource or something else – it doesn’t really matter exactly which technology is used.

public class PlayerStoreChannel : 
    IChannel<ScoreChangedEvent>
{
    private readonly IPlayerStore store;
 
    public PlayerStoreChannel(IPlayerStore store)
    {
        this.store = store;
    }
 
    public void Send(ScoreChangedEvent message)
    {
        this.store.Save(message.PlayerId, message);
    }
}

The important thing to realize is that the IPlayerStore.Save method will be a blocking method call – perhaps wrapped in a distributed transaction. This ensures that updates to the Player Aggregate Root always leave the data store in a consistent state. Either the operation succeeds or it fails during the method call itself.
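One way the "wrapped in a transaction" part could look is a Decorator around IPlayerStore that uses System.Transactions.TransactionScope. This is only a sketch: the IPlayerStore signature is inferred from the Save call above, and ScoreChangedEvent, TransactionalPlayerStore and InMemoryPlayerStore are hypothetical. The in-memory store doesn't enlist in the transaction; a real store backed by a transactional resource would.

```csharp
using System;
using System.Collections.Generic;
using System.Transactions;

// Assumed event shape; only PlayerId appears in the post.
public class ScoreChangedEvent
{
    public ScoreChangedEvent(Guid playerId, int score)
    {
        this.PlayerId = playerId;
        this.Score = score;
    }

    public Guid PlayerId { get; private set; }
    public int Score { get; private set; }
}

// Signature inferred from this.store.Save(message.PlayerId, message).
public interface IPlayerStore
{
    void Save(Guid playerId, ScoreChangedEvent message);
}

// Hypothetical Decorator: the call blocks until the ambient transaction
// has committed, or throws if the inner store fails.
public class TransactionalPlayerStore : IPlayerStore
{
    private readonly IPlayerStore inner;

    public TransactionalPlayerStore(IPlayerStore inner)
    {
        if (inner == null)
            throw new ArgumentNullException("inner");

        this.inner = inner;
    }

    public void Save(Guid playerId, ScoreChangedEvent message)
    {
        using (var scope = new TransactionScope())
        {
            this.inner.Save(playerId, message);
            // Without Complete, disposing the scope rolls back.
            scope.Complete();
        }
    }
}

// Stand-in store so the sketch can run; it ignores the transaction.
public class InMemoryPlayerStore : IPlayerStore
{
    private readonly Dictionary<Guid, int> scores =
        new Dictionary<Guid, int>();

    public void Save(Guid playerId, ScoreChangedEvent message)
    {
        this.scores[playerId] = message.Score;
    }

    public int GetScore(Guid playerId)
    {
        return this.scores[playerId];
    }
}
```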

This takes care of the ACID consistent write, but the application must also update the high score table.

Asynchronous write

Since eventual consistency is acceptable for the high score table, the message can be transmitted over a persistent queue to be picked up by a background process.

A generic class can serve as an Adapter over an IQueue abstraction:

public class QueueChannel<T> : IChannel<T>
{
    private readonly IQueue queue;
    private readonly IMessageSerializer serializer;
 
    public QueueChannel(IQueue queue,
        IMessageSerializer serializer)
    {
        this.queue = queue;
        this.serializer = serializer;
    }
 
    public void Send(T message)
    {
        this.queue.Enqueue(
            this.serializer.Serialize(message));
    }
}

Obviously, the Enqueue method is another void method. In the case of a persistent queue, it’ll block while the message is being written to the queue, but that will tend to be a fast operation.
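Neither IQueue nor IMessageSerializer is listed in the post. Here's a sketch of shapes that QueueChannel<T> would compile against, assuming that serialized messages are strings; InMemoryQueue and TypeNameSerializer are hypothetical stand-ins for a persistent queue and a real serializer:

```csharp
using System;
using System.Collections.Concurrent;

// Assumed shape: the post only shows queue.Enqueue(serialized message).
public interface IQueue
{
    void Enqueue(string message);
}

// Assumed shape: the post only shows serializer.Serialize(message).
public interface IMessageSerializer
{
    string Serialize(object message);
}

// Hypothetical in-memory stand-in for a persistent queue.
public class InMemoryQueue : IQueue
{
    private readonly ConcurrentQueue<string> items =
        new ConcurrentQueue<string>();

    public void Enqueue(string message)
    {
        this.items.Enqueue(message);
    }

    // A background process would poll this to pick up pending messages.
    public bool TryDequeue(out string message)
    {
        return this.items.TryDequeue(out message);
    }
}

// Hypothetical serializer: type name plus ToString, enough for a demo.
public class TypeNameSerializer : IMessageSerializer
{
    public string Serialize(object message)
    {
        if (message == null)
            throw new ArgumentNullException("message");

        return message.GetType().Name + ":" + message;
    }
}
```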

Composing polymorphic consistency

Now all the building blocks are available to compose both channel implementations into the GameEngine via the CompositeChannel. That might look like this:

var playerConnString = ConfigurationManager
    .ConnectionStrings["player"].ConnectionString;
 
var gameEngine = new GameEngine(
    new CompositeChannel<ScoreChangedEvent>(
        new PlayerStoreChannel(
            new DbPlayerStore(playerConnString)),
        new QueueChannel<ScoreChangedEvent>(
            new PersistentQueue("messageQueue"),                        
            new JsonSerializer())));

When the Send method is invoked on the channel, it’ll first invoke a blocking call that ensures ACID consistency for the Player, followed by asynchronous message passing for eventual consistency in other parts of the application.

Conclusion

Even when parts of an application must be implemented in a synchronous fashion to ensure ACID consistency, an architecture based on asynchronous message passing provides a flexible foundation that enables you to polymorphically mix both kinds of consistency in a single method call. From the perspective of the application layer, this provides a consistent and uniform architecture because all mutating actions are modeled as commands and events encapsulated in messages.

posted on Wednesday, December 07, 2011 9:40:21 AM (Romance Standard Time, UTC+01:00)
# Tuesday, October 25, 2011

Greg Young gave a talk at GOTO Aarhus 2011 titled Developers have a mental disorder, which was (semi-)humorously meant, but still addressed some very real concerns about the cognitive biases of software developers as a group. While I have no intention of providing a complete summary of the talk, Greg said one thing that made me think a bit (more) about SOLID code. To paraphrase, it went something like this:

Developers have a tendency to attempt to solve specific problems with general solutions. This leads to coupling and complexity. Instead of being general, code should be specific.

This sounds correct at first glance, but once again I think that SOLID code offers a solution. Due to the Single Responsibility Principle each SOLID concrete (pardon the pun) class will tend to very specifically address a very narrow problem.

Such a class may implement one (or more) general-purpose interface(s), but the concrete type is specific.

The difference between the generality of an interface and the specificity of a concrete type becomes more and more apparent the better a code base applies the Reused Abstractions Principle. This is best done by defining an API in terms of Role Interfaces, which makes it possible to define a few core abstractions that apply very broadly, while implementations are very specific.

As an example, consider AutoFixture’s ISpecimenBuilder interface. This is a very central interface in AutoFixture (in fact, I don’t even know just how many implementations it has, and I’m currently too lazy to count them). As an API, it has proven to be very generally useful, but each concrete implementation is still very specific, like the CurrentDateTimeGenerator shown here:

public class CurrentDateTimeGenerator : ISpecimenBuilder
{
    public object Create(object request, 
        ISpecimenContext context)
    {
        if (request != typeof(DateTime))
        {
            return new NoSpecimen(request);
        }
 
        return DateTime.Now;
    }
}

This is, literally, the entire implementation of the class. I hope we can agree that it’s very specific.

In my opinion, SOLID is a set of principles that can help us keep an API general while each implementation is very specific.

In SOLID code all concrete types are specific.

posted on Tuesday, October 25, 2011 5:01:15 PM (Romance Daylight Time, UTC+02:00)
# Friday, September 23, 2011

Soon after I posted my previous blog post on message dispatching without Service Location I received an email from Jeff Saunders with some great observations. Jeff has been so kind to allow me to quote his email here on the blog, so here it is:

“I enjoyed your latest blog post about message dispatching. I have to ask, though: why do we want weakly-typed messages? Why can't we just inject an appropriate IConsumer<T> into our services - they know which messages they're going to send or receive.

“A really good example of this is ISubject<T> from Rx. It implements both IObserver<T> (a message consumer) and IObservable<T> (a message producer) and the default implementation Subject<T> routes messages directly from its IObserver side to its IObservable side.

“We can use this with DI quite nicely - I have written an example in .NET Pad: http://dotnetpad.net/ViewPaste/woTkGk6_GEq3P9xTVEJYZg#c9,c26,

“The good thing about this is that we now have access to all of the standard LINQ query operators and the new ones added in Rx, so we can use a select query to map messages between layers, for instance.

“This way we get all the benefits of a weakly-typed IChannel interface, with the added advantages of strong typing for our messages and composability using Rx.

“One potential benefit of weak typing that could be raised is that we can have just a single implementation for IChannel, instead of an ISubject<T> for each message type. I don't think this is really a benefit, though, as we may want different propagation behaviour for each message type - there are other implementations of ISubject<T> that call consumers asynchronously, and we could pass any IObservable<T> or IObserver<T> into a service for testing purposes.”

These are great observations and I think that Rx holds much promise in this space. Basically you can say that in CQRS-style architectures we’re already pushing events (and commands) around, so why not build upon what the framework offers?
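To make Jeff's suggestion concrete without pulling in the Rx package, here's a sketch that uses only the IObserver<T> interface from the Base Class Library. UserRegistered, RegistrationService and CollectingObserver<T> are hypothetical names invented for the example; with Rx installed, a Subject<T> could take the observer's place.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message type.
public class UserRegistered
{
    public UserRegistered(string name)
    {
        this.Name = name;
    }

    public string Name { get; private set; }
}

// Hypothetical service: it declares exactly which message it produces
// by depending on IObserver<UserRegistered>.
public class RegistrationService
{
    private readonly IObserver<UserRegistered> events;

    public RegistrationService(IObserver<UserRegistered> events)
    {
        if (events == null)
            throw new ArgumentNullException("events");

        this.events = events;
    }

    public void Register(string name)
    {
        this.events.OnNext(new UserRegistered(name));
    }
}

// Hand-written observer, so the sketch needs no Rx package.
public class CollectingObserver<T> : IObserver<T>
{
    private readonly List<T> received = new List<T>();

    public IList<T> Received
    {
        get { return this.received; }
    }

    public void OnNext(T value)
    {
        this.received.Add(value);
    }

    public void OnError(Exception error)
    {
        // Ignored in this sketch.
    }

    public void OnCompleted()
    {
        // Ignored in this sketch.
    }
}
```

Passing any other message type to OnNext is a compile-time error here, which is the strong typing Jeff asks for.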

Even if you find the IObserver<T> interface a bit too clunky with its OnNext, OnError and OnCompleted methods compared to the strongly typed IConsumer<T> interface, the question still remains: why do we want weakly-typed messages?

We don’t, necessarily. My previous post wasn’t meant as a particular endorsement of a weakly typed messaging channel. It was more an observation that I’ve seen many variations of this IChannel interface:

public interface IChannel
{
    void Send<T>(T message);
}

The most important thing I wanted to point out was that while the generic type argument may create the illusion that this is a strongly typed method, this is all it is: an illusion. IChannel isn’t strongly typed because you can invoke the Send method with any type of message – and the code will still compile. This is no different than the mechanical distinction between a Service Locator and an Abstract Factory.

Thus, when defining a channel interface I normally prefer to make this explicit and instead model it like this:

public interface IChannel
{
    void Send(object message);
}

This achieves exactly the same and is more honest.

Still, this doesn’t really answer Jeff’s question: is this preferable to one or more strongly typed IConsumer<T> dependencies?

Any high-level application entry point that relies on a weakly typed IChannel can get by with a single IChannel dependency. This is flexible, but (just like with Service Locator), it might hide that the client may have (or (d)evolve) too many responsibilities.

If, instead, the client would rely on strongly typed dependencies it becomes much easier to see if/when it violates the Single Responsibility Principle.
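A sketch of what that looks like in practice: AccountService and ListConsumer<T> are hypothetical, and the two command classes are trimmed to a single property. The constructor now lists every message type the client can send, so a growing parameter list becomes a visible warning sign.

```csharp
using System;
using System.Collections.Generic;

public interface IConsumer<T>
{
    void Consume(T message);
}

// Trimmed, hypothetical versions of the commands.
public class RegisterUserCommand
{
    public RegisterUserCommand(Guid userId) { this.UserId = userId; }
    public Guid UserId { get; private set; }
}

public class ResetPasswordCommand
{
    public ResetPasswordCommand(Guid userId) { this.UserId = userId; }
    public Guid UserId { get; private set; }
}

// Hypothetical client: one dependency per message type it sends.
public class AccountService
{
    private readonly IConsumer<RegisterUserCommand> registrations;
    private readonly IConsumer<ResetPasswordCommand> resets;

    public AccountService(
        IConsumer<RegisterUserCommand> registrations,
        IConsumer<ResetPasswordCommand> resets)
    {
        this.registrations = registrations;
        this.resets = resets;
    }

    public Guid SignUp()
    {
        var userId = Guid.NewGuid();
        this.registrations.Consume(new RegisterUserCommand(userId));
        return userId;
    }

    public void ForgotPassword(Guid userId)
    {
        this.resets.Consume(new ResetPasswordCommand(userId));
        // this.resets.Consume(new RegisterUserCommand(userId));
        // The line above would not compile: wrong message type.
    }
}

// Hypothetical consumer that just collects what it receives.
public class ListConsumer<T> : IConsumer<T>
{
    private readonly List<T> messages = new List<T>();

    public IList<T> Messages
    {
        get { return this.messages; }
    }

    public void Consume(T message)
    {
        this.messages.Add(message);
    }
}
```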

In conclusion, I’d tend to prefer strongly typed Datatype Channels instead of a single weakly typed channel, but one shouldn’t underestimate the flexibility of a general-purpose channel either.

posted on Friday, September 23, 2011 11:08:53 AM (Romance Daylight Time, UTC+02:00)
# Monday, September 19, 2011

Once upon a time I wrote a blog post about why Service Locator is an anti-pattern, and ever since then, I occasionally receive rebuffs from people who agree with me in principle, but think that, still: in various special cases (the argument goes), Service Locator does have its uses.

Most of these arguments actually stem from mistaking the mechanics for the role of a Service Locator. Still, once in a while a compelling argument seems to come my way. One of the most insistent arguments concerns message dispatching – a pattern which is currently gaining in prominence due to the increasing popularity of CQRS, Domain Events and kindred architectural styles.

In this article I’ll first provide a quick sketch of the scenario, followed by a typical implementation based on a ‘Service Locator’, and then conclude by demonstrating why a Service Locator isn’t necessary.

Scenario: Message Dispatching

Appropriate use of message dispatching internally in an application can significantly help decouple the code and make roles explicit. A common implementation utilizes a messaging interface like this one:

public interface IChannel
{
    void Send<T>(T message);
}

Personally, I find that the generic typing of the Send method is entirely redundant (not to mention heavily reminiscent of the shape of a Service Locator), but it’s very common and not particularly important right now (but more about that later).

An application might use the IChannel interface like this:

var registerUser = new RegisterUserCommand(
    Guid.NewGuid(),
    "Jane Doe",
    "password",
    "jane@ploeh.dk");
this.channel.Send(registerUser);
 
// ...
 
var changeUserName = new ChangeUserNameCommand(
    registerUser.UserId,
    "Jane Ploeh");
this.channel.Send(changeUserName);
 
// ...
 
var resetPassword = new ResetPasswordCommand(
    registerUser.UserId);
this.channel.Send(resetPassword);

Obviously, in this example, the channel variable is an injected instance of the IChannel interface.

On the receiving end, these messages must be dispatched to appropriate consumers, which must all implement this interface:

public interface IConsumer<T>
{
    void Consume(T message);
}

Thus, each of the command messages in the example has a corresponding consumer:

public class RegisterUserConsumer : IConsumer<RegisterUserCommand>
public class ChangeUserNameConsumer : IConsumer<ChangeUserNameCommand>
public class ResetPasswordConsumer : IConsumer<ResetPasswordCommand>

This certainly is a very powerful pattern, so it’s often used as an argument to prove that Service Locator is, after all, not an anti-pattern.

Message Dispatching using a DI Container

In order to implement IChannel it’s necessary to match messages to their appropriate consumers. One easy way to do this is by employing a DI Container. Here’s an example that uses Autofac to implement IChannel, but any other container would do as well:

private class AutofacChannel : IChannel
{
    private readonly IComponentContext container;
 
    public AutofacChannel(IComponentContext container)
    {
        if (container == null)
            throw new ArgumentNullException("container");
 
        this.container = container;
    }
 
    public void Send<T>(T message)
    {
        var consumer = this.container.Resolve<IConsumer<T>>();
        consumer.Consume(message);
    }
}

This class is an Adapter from Autofac’s IComponentContext interface to the IChannel interface. At this point I can always see the “Q.E.D.” around the corner: “look! Service Locator isn’t an anti-pattern after all! I’d like to see you implement IChannel without a Service Locator.”

While I’ll do the latter in just a moment, I’d like to dwell on the DI Container-based implementation for a moment.

  • Is it simple? Yes.
  • Is it flexible? Yes, although it has shortcomings.
  • Would I use it like this? Perhaps. It depends :)
  • Is it the only way to implement IChannel? No – see the next section.
  • Does it use a Service Locator? No.

While AutofacChannel uses Autofac (a DI Container) to implement the functionality, it’s not (necessarily) a Service Locator in action. This was the point I already tried to get across in my previous post about the subject: just because its mechanics look like Service Locator it doesn’t mean that it is one. In my implementation, the AutofacChannel class is a piece of pure infrastructure code. I even made it a private nested class in my Composition Root to underscore the point. The container is still not available to the application code, so it is never used in the Service Locator role.

One of the shortcomings of the above implementation is that it provides no fallback mechanism. What happens if the container can’t resolve the matching consumer? Perhaps there isn’t a consumer for the message. That’s entirely possible because there are no safeguards in place to ensure that there’s a consumer for every possible message.

The shape of the Send method enables the client to send any conceivable message type, and the code still compiles even if no consumer exists. That may look like a problem, but is actually an important insight into implementing an alternative IChannel class.

Message Dispatching using weakly typed matching

Consider the IChannel.Send method once again:

void Send<T>(T message);

Despite its generic signature it’s important to realize that this is, in fact, a weakly typed method (at least when used with type inference, as in the above example). Equivalently to a bona fide Service Locator, it’s possible for a developer to define a new class (Foo) and send it – and the code still compiles:

this.channel.Send(new Foo());

However, at run-time, this will fail because there’s no matching consumer. Despite the generic signature of the Send method, it contains no type safety. This insight can be used to implement IChannel without a DI Container.

Before I go on I should point out that I don’t consider the following solution intrinsically superior to using a DI Container. However, readers of my book will know that I consider it a very illuminating exercise to try to implement everything with Poor Man’s DI once in a while.

Using Poor Man’s DI often helps unearth some important design elements of DI because it helps to think about solutions in terms of patterns and principles instead of in terms of technology.

However, once I have arrived at an appropriate conclusion while considering Poor Man’s DI, I still tend to prefer mapping it back to an implementation that involves a DI Container.

Thus, the purpose of this section is first and foremost to outline how message dispatching can be implemented without relying on a Service Locator.

While this alternative implementation isn’t allowed to change any of the existing API, it’s a pure implementation detail to encapsulate the insight about the weakly typed nature of IChannel into a similarly weakly typed consumer interface:

private interface IConsumer
{
    void Consume(object message);
}

Notice that this is a private nested interface of my Poor Man’s DI Composition Root – it’s a pure implementation detail. However, given this private interface, it’s now possible to implement IChannel like this:

private class PoorMansChannel : IChannel
{
    private readonly IEnumerable<IConsumer> consumers;
 
    public PoorMansChannel(params IConsumer[] consumers)
    {
        this.consumers = consumers;
    }
 
    public void Send<T>(T message)
    {
        foreach (var c in this.consumers)
            c.Consume(message);
    }
}

Notice that this is another private nested type that belongs to the Composition Root. It loops through all injected consumers, so it’s up to each consumer to decide whether or not to do anything about the message.

A final private nested class bridges the generically typed world with the weakly typed world:

private class Consumer<T> : IConsumer
{
    private readonly IConsumer<T> consumer;
 
    public Consumer(IConsumer<T> consumer)
    {
        this.consumer = consumer;
    }
 
    public void Consume(object message)
    {
        if (message is T)
            this.consumer.Consume((T)message);
    }
}

This generic class is another Adapter – this time adapting the generic IConsumer<T> interface to the weakly typed (private) IConsumer interface. Notice that it only delegates the message to the adapted consumer if the type of the message matches the consumer.
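The missing-consumer shortcoming discussed for the container-based channel applies here too: if no adapter matches, the message is silently dropped. One possible variation, sketched under my own assumptions, lets each adapter report whether it handled the message so the channel can fail fast. IMatchingConsumer, MatchingConsumer<T>, StrictChannel and DelegateConsumer<T> are hypothetical names, and the types are top-level here rather than private nested ones to keep the sketch short.

```csharp
using System;
using System.Collections.Generic;

public interface IChannel
{
    void Send<T>(T message);
}

public interface IConsumer<T>
{
    void Consume(T message);
}

// Hypothetical variation of the weakly typed consumer: reports a match.
public interface IMatchingConsumer
{
    bool TryConsume(object message);
}

public class MatchingConsumer<T> : IMatchingConsumer
{
    private readonly IConsumer<T> consumer;

    public MatchingConsumer(IConsumer<T> consumer)
    {
        this.consumer = consumer;
    }

    public bool TryConsume(object message)
    {
        if (!(message is T))
            return false;

        this.consumer.Consume((T)message);
        return true;
    }
}

// Hypothetical channel that throws when nobody handled the message.
public class StrictChannel : IChannel
{
    private readonly IEnumerable<IMatchingConsumer> consumers;

    public StrictChannel(params IMatchingConsumer[] consumers)
    {
        this.consumers = consumers;
    }

    public void Send<T>(T message)
    {
        var handled = false;
        foreach (var c in this.consumers)
            handled |= c.TryConsume(message); // every consumer gets a chance

        if (!handled)
            throw new InvalidOperationException(
                "No consumer for " + typeof(T).Name + ".");
    }
}

// Small helper so a consumer can be written as a lambda.
public class DelegateConsumer<T> : IConsumer<T>
{
    private readonly Action<T> action;

    public DelegateConsumer(Action<T> action)
    {
        this.action = action;
    }

    public void Consume(T message)
    {
        this.action(message);
    }
}
```

Whether throwing is the right fallback depends on the application; logging or routing to a dead-letter consumer would fit the same shape.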

Each implementer of IConsumer<T> can be wrapped in the (private) Consumer<T> class and injected into the PoorMansChannel class:

var channel = new PoorMansChannel(
    new Consumer<ChangeUserNameCommand>(
        new ChangeUserNameConsumer(store)),
    new Consumer<RegisterUserCommand>(
        new RegisterUserConsumer(store)),
    new Consumer<ResetPasswordCommand>(
        new ResetPasswordConsumer(store)));

So there you have it: type-based message dispatching without a DI Container in sight. It would also be easy to use convention-based configuration to scan an assembly, register all IConsumer<T> implementations, wrap them in Consumer<T> instances, and use this list to compose a PoorMansChannel instance. However, I will leave this as an exercise to the reader (or a later blog post).
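For readers who want a starting point for that exercise, here's one possible sketch based on reflection. ConsumerScanner and EchoConsumer are hypothetical, the factory delegate is an assumption about how consumers get their own dependencies, and the adapter types are top-level here instead of private nested types of the Composition Root.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public interface IConsumer<T>
{
    void Consume(T message);
}

public interface IConsumer
{
    void Consume(object message);
}

public class Consumer<T> : IConsumer
{
    private readonly IConsumer<T> consumer;

    public Consumer(IConsumer<T> consumer)
    {
        this.consumer = consumer;
    }

    public void Consume(object message)
    {
        if (message is T)
            this.consumer.Consume((T)message);
    }
}

public static class ConsumerScanner
{
    // Finds every concrete IConsumer<T> implementation in the assembly
    // and wraps it in the matching Consumer<T> adapter. The create
    // delegate decides how each consumer instance is built.
    public static IConsumer[] Scan(
        Assembly assembly, Func<Type, object> create)
    {
        var adapters = new List<IConsumer>();
        foreach (var type in assembly.GetTypes())
        {
            if (type.IsAbstract || type.IsInterface
                || type.ContainsGenericParameters)
                continue;

            foreach (var service in type.GetInterfaces())
            {
                if (!service.IsGenericType
                    || service.GetGenericTypeDefinition()
                        != typeof(IConsumer<>))
                    continue;

                var messageType = service.GetGenericArguments()[0];
                var adapterType =
                    typeof(Consumer<>).MakeGenericType(messageType);
                adapters.Add((IConsumer)Activator.CreateInstance(
                    adapterType, create(type)));
            }
        }
        return adapters.ToArray();
    }
}

// Hypothetical consumer with a default constructor, for demonstration.
public class EchoConsumer : IConsumer<string>
{
    public static readonly List<string> Received = new List<string>();

    public void Consume(string message)
    {
        Received.Add(message);
    }
}
```

The resulting array can be passed straight to the PoorMansChannel constructor. The real consumers in this post take a store in their constructors, so the create delegate would have to supply it.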

My claim still stands

In conclusion, I find that I can still defend my original claim: Service Locator is an anti-pattern.

That claim, by the way, is falsifiable, so I do appreciate that people take it seriously enough to attempt to disprove it. However, until now, I’ve yet to be presented with a scenario where I couldn’t come up with a better solution that didn’t involve a Service Locator.

Keep in mind that a Service Locator is defined by the role it plays – not the shape of the API.

posted on Monday, September 19, 2011 4:44:47 PM (Romance Daylight Time, UTC+02:00)
# Thursday, August 25, 2011

It’s time to take a step back from the whole debate about whether or not Service Locator is, or isn’t, an anti-pattern. It remains my strong belief that it’s an anti-pattern, while others disagree. Although everyone is welcome to think differently than me, I’ve noticed that some of the arguments being put forth in defense of Service Locator seem very convincing. However, I believe that in those cases we no longer talk about Service Locator, but something that looks an awful lot like it.

Some APIs are easy to confuse with a ‘real’ Service Locator. It probably doesn’t help that last year I published an article on how to tell the difference between a Service Locator and an Abstract Factory. In this article I may have focused too much on the mechanics of Service Locator, but as Derick Bailey was so kind to point out, this hides the role the API might play.

To repeat that earlier post, a Service Locator looks like this:

public interface IServiceLocator
{
    T Create<T>(object context);
}

All Service Locators I’ve seen so far look like that, or some variation thereof, but that doesn’t mean that the relationship works in both directions. Just because an API looks like that doesn’t automatically mean that it’s a Service Locator.

If it were, all DI containers would be Service Locators. As an example, here’s Castle Windsor’s Resolve method:

public T Resolve<T>()

Even AutoFixture has an API like that:

MyClass sut = fixture.CreateAnonymous<MyClass>();

It has never been my intention to denounce every single DI container available, as well as my own open source framework. Service Locator is ultimately not identified by the mechanics of its API, but by the role it plays.

A DI container encapsulated in a Composition Root is not a Service Locator – it’s an infrastructure component.

It becomes a Service Locator if used incorrectly: when application code (as opposed to infrastructure code) actively queries a service in order to be provided with required dependencies, then it has become a Service Locator.

Service Locators are spread thinly and pervasively throughout a code base – that is just as much a defining characteristic.
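A small sketch of the difference in role: IClock, FixedClock and both greeter classes are hypothetical, and IServiceLocator is the interface shown earlier in this post. The first class pulls its dependency from the locator at run-time; the second states it in its constructor.

```csharp
using System;

// Hypothetical dependency.
public interface IClock
{
    DateTime Now { get; }
}

public interface IServiceLocator
{
    T Create<T>(object context);
}

// Application code in the Service Locator role: nothing in the public
// API reveals that an IClock is needed until Greet runs.
public class LocatorBasedGreeter
{
    private readonly IServiceLocator locator;

    public LocatorBasedGreeter(IServiceLocator locator)
    {
        this.locator = locator;
    }

    public string Greet()
    {
        var clock = this.locator.Create<IClock>(null);
        return clock.Now.Hour < 12 ? "Good morning" : "Good day";
    }
}

// Constructor Injection: the dependency is visible, and only the
// Composition Root needs to know where the IClock comes from.
public class Greeter
{
    private readonly IClock clock;

    public Greeter(IClock clock)
    {
        if (clock == null)
            throw new ArgumentNullException("clock");

        this.clock = clock;
    }

    public string Greet()
    {
        return this.clock.Now.Hour < 12 ? "Good morning" : "Good day";
    }
}

// Hypothetical test double.
public class FixedClock : IClock
{
    private readonly DateTime now;

    public FixedClock(DateTime now)
    {
        this.now = now;
    }

    public DateTime Now
    {
        get { return this.now; }
    }
}
```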

posted on Thursday, August 25, 2011 8:55:12 PM (Romance Daylight Time, UTC+02:00)
# Thursday, July 28, 2011

In my book I describe the Composition Root pattern in chapter 3. This post serves as a summary description of the pattern.

The Constructor Injection pattern is easy to understand until a follow-up question comes up:

Where should we compose object graphs?

It’s easy to understand that each class should require its dependencies through its constructor, but this pushes the responsibility of composing the classes with their dependencies to a third party. Where should that be?

It seems to me that most people are eager to compose as early as possible, but the correct answer is:

As close as possible to the application’s entry point.

This place is called the Composition Root of the application and defined like this:

A Composition Root is a (preferably) unique location in an application where modules are composed together.

This means that all the application code relies solely on Constructor Injection (or other injection patterns), but is never composed. Only at the entry point of the application is the entire object graph finally composed.

The appropriate entry point depends on the framework:

  • In console applications it’s the Main method
  • In ASP.NET MVC applications it’s global.asax and a custom IControllerFactory
  • In WPF applications it’s the Application.OnStartup method
  • In WCF it’s a custom ServiceHostFactory
  • etc.
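For the console case, a minimal sketch with Poor Man’s DI could look like this. IGreeter, PoliteGreeter, App and EntryPoint are hypothetical names made up for the example; the point is only that Main is the single place where concrete types are chosen and wired together.

```csharp
using System;
using System.IO;

// Hypothetical application code: relies on Constructor Injection only.
public interface IGreeter
{
    string Greet(string name);
}

public class PoliteGreeter : IGreeter
{
    public string Greet(string name)
    {
        return "Good day, " + name + ".";
    }
}

public class App
{
    private readonly IGreeter greeter;
    private readonly TextWriter output;

    public App(IGreeter greeter, TextWriter output)
    {
        if (greeter == null)
            throw new ArgumentNullException("greeter");
        if (output == null)
            throw new ArgumentNullException("output");

        this.greeter = greeter;
        this.output = output;
    }

    public void Run(string[] args)
    {
        var name = args.Length > 0 ? args[0] : "world";
        this.output.WriteLine(this.greeter.Greet(name));
    }
}

public static class EntryPoint
{
    // Composition Root: the whole object graph is composed here,
    // as close to the entry point as possible, and nowhere else.
    public static void Main(string[] args)
    {
        var app = new App(new PoliteGreeter(), Console.Out);
        app.Run(args);
    }
}
```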

(You can read more about framework-specific Composition Roots in chapter 7 of my book.)

The Composition Root is an application infrastructure component.

Only applications should have Composition Roots. Libraries and frameworks shouldn’t.

The Composition Root can be implemented with Poor Man’s DI, but is also the (only) appropriate place to use a DI Container.

A DI Container should only be referenced from the Composition Root. All other modules should have no reference to the container.

Using a DI Container is often a good choice. In that case it should be applied using the Register Resolve Release pattern entirely from within the Composition Root.

Read more in Dependency Injection in .NET.

posted on Thursday, July 28, 2011 5:22:04 PM (Romance Daylight Time, UTC+02:00)
# Tuesday, June 07, 2011

Recently I had an interesting conversation with a developer at my current client, about how the SOLID principles would impact their code base. The client wants to write SOLID code – who doesn’t? It’s a beautiful acronym that fully demonstrates the power of catchy terminology.

However, when you start to outline what it actually means people become uneasy. At the point where the discussion became interesting, I had already sketched my view on encapsulation. However, the client’s current code base is designed around validation at the perimeter. Most of the classes in the Domain Model are actually internal and implicitly trust input.

We were actually discussing Test-Driven Development, and I had already told them that they should only test against the public API of their code base. The discussion went something like this (I’m hoping I’m not making my ‘opponent’ sound dumb, because the real developer I talked to was anything but):

Client: “That would mean that each and every class we expose must validate input!”

Me: “Yes…?”

Client: “That would be a lot of extra work.”

Me: “Would it? Why is that?”

Client: “The input that we deal with consists of complex data structures, and we must validate that all values are present and correct.”

Me: “Assume that input is SOLID as well. This would mean that each input instance can be assumed to be in a valid state because that would be its own responsibility. Given that, what would validation really mean?”

Client: “I’m not sure I understand what you mean…”

Me: “Assuming that the input instance is a self-validating reference type, what could possibly go wrong?”

Client: “The instance might be null…”

Me: “Yes. Anything else?”

Client: “Not that I can think of…”

Me: “Me neither. This means that while you must add more code to implement proper encapsulation, it’s really trivial code. It’s just some Guard Clauses.”

Client: “But isn’t it still gold plating?”

Me: “Not really, because we are designing for change in the general sense. We know that we can’t predict specific change, but I can guarantee you that change requests will occur. Instead of trying to predict specific changes and design variability in those specific places, we simply put interfaces around everything because the cost of doing so is really low. This means that when change does happen, we already have Seams in the right places.”

Client: “How does SOLID help with that?”

Me: “A result of the Single Responsibility Principle is that each self-encapsulated class becomes really small, and there will be a lot of them.”

Client: “Lots of classes… I’m not sure I’m comfortable with that. Doesn’t it make it much harder to find what you need?”

Me: “I don’t think so. Each class is very small, so although you have many of them, understanding what each one does is easy. In my experience this is a lot easier than trying to figure out what a big class with thousands of lines of code does. When you have few big classes, your object model might look something like this:”

[Figure: Large Grained Objects]

“There are a few objects and they kind of fit together to form the overall picture. However, if you need to change something, you’ll need to substantially change the shape of each of those objects. That’s a lot of work, and this is why such an object design isn’t particularly adaptable to change.

“With SOLID, on the other hand, you have lots of small-grained objects which you can easily re-arrange to match new requirements:”

[Figure: Fine Grained Objects]

And that’s when it hit me: SOLID code isn’t really solid at all. I’m not a materials scientist, but to me a solid indicates a rigid structure: in essence, a structure where the particles are tightly locked to each other and can’t easily move about.

However, when thinking about SOLID code, it actually helps to think about it more like a liquid (although perhaps a rather viscous one). Each class has much more room to maneuver because it is small and fits together with other classes in many different ways. It’s clear that when you push an analogy too far, it breaks apart.

Still, a closing anecdote is appropriate…

My (then) three-year-old son one day handed me a handful of Duplo bricks and asked me to build him a dragon. If you’ve ever tried to build anything out of Duplo you’ll know that the ‘resolution’ of the bricks is rather coarse-grained. Given that ‘a handful’ for a three-year-old isn’t a lot of bricks, this was quite a challenge. Fortunately, I had an appreciative audience with quite a bit of imagination, so I was able to put the few bricks together in a way that satisfied my son.

Still, building a dragon of comparable size out of Lego bricks is much easier because the bricks have a much finer ‘resolution’. SOLID code is more comparable to Lego than Duplo.

posted on Tuesday, June 07, 2011 3:46:07 PM (Romance Daylight Time, UTC+02:00)
# Tuesday, May 31, 2011

My recent series of blog posts about Poka-yoke Design generated a few responses (I would have been disappointed had this not been the case). Quite a few of these reactions relate to various serialization or translation technologies usually employed at application boundaries: Serialization, XML (de)hydration, UI validation, etc. Note that such translation happens not only at the perimeter of the application, but also at the persistence layer. ORMs are also a translation mechanism.

Common to most of the comments is that lots of serialization technologies require the presence of a default constructor. As an example, the XmlSerializer requires a default constructor and public writable properties. Most ORMs I’ve investigated seem to have the same kind of requirements. Windows Forms and WPF Controls (UI is also an application boundary) also must have default constructors. Doesn’t that break encapsulation? Yes and no.

Objects at the Boundary

It certainly would break encapsulation if you were to expose your (domain) objects directly at the boundary. Consider a simple XML document like this one:

<name>
  <firstName>Mark</firstName>
  <lastName>Seemann</lastName>
</name>

Whether or not we have a formal contract (XSD), we might stipulate that both the firstName and lastName elements are required. However, despite such a contract, I can easily create a document that breaks it:

<name>
  <firstName>Mark</firstName>
</name>

We can’t enforce the contract as there’s no compilation step involved. We can validate input (and output), but that’s a different matter. Exactly because there’s no enforcement it’s very easy to create malformed input. The same argument can be made for UI input forms and any sort of serialized byte sequence. This is why we must treat all input as suspect.
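As a minimal sketch of what treating input as suspect could look like for the document above, the following check uses LINQ to XML to test whether both elements are present. The NameDocument class and its IsValid method are hypothetical names introduced only for this illustration, and the rule that an empty element counts as missing is an assumption.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of any framework.
public static class NameDocument
{
    // Returns true only when the root is <name> and both required
    // elements are present and non-empty.
    // Note: XDocument.Parse throws XmlException if the text is not
    // well-formed XML at all.
    public static bool IsValid(string xml)
    {
        XElement root = XDocument.Parse(xml).Root;
        if (root == null || root.Name.LocalName != "name")
        {
            return false;
        }

        return HasValue(root.Element("firstName"))
            && HasValue(root.Element("lastName"));
    }

    // Assumption: an empty or whitespace-only element counts as missing.
    private static bool HasValue(XElement element)
    {
        return element != null && !string.IsNullOrWhiteSpace(element.Value);
    }
}
```

The point is that this check happens at run-time: nothing stops a sender from producing the second document, so the receiving code has to look.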

This isn’t a new observation at all. In Patterns of Enterprise Application Architecture, Martin Fowler described this as a Data Transfer Object (DTO). However, despite the name we should realize that DTOs are not really objects at all. This is nothing new either. Back in 2004 Don Box formulated the Four Tenets of Service Orientation. (Yes, I know that they are not in vogue any more and that people wanted to retire them, but some of them still make tons of sense.) Particularly the third tenet is germane to this particular discussion:

Services share schema and contract, not class.

Yes, and that means they are not objects. A DTO is a representation of such a piece of data mapped into an object-oriented language. That still doesn’t make them objects in the sense of encapsulation. It would be impossible. Since all input is suspect, we can hardly enforce any invariants at all.

Often, as Craig Stuntz points out in a comment to one of my previous posts, even if the input is invalid, we want to capture what we did receive in order to present a proper error message (this argument also applies on machine-to-machine boundaries). This means that any DTO must have very weak invariants (if any at all).

DTOs don’t break encapsulation because they aren’t objects at all.

Don’t be fooled by your tooling. The .NET framework very, very much wants you to treat DTOs as objects. Code generation ensues.

However, the strong typing provided by such auto-generated classes gives a false sense of security. You may think that you get rapid feedback from the compiler, but there are many possible ways you can get run-time errors (most notably when you forget to update the auto-generated code based on new schema versions).

An even more problematic result of representing input and output as objects is that it tricks lots of developers into dealing with them as though they represent the real object model. The result is invariably an anemic domain model.

More and more, this line of reasoning is leading me towards the conclusion that the DTO mental model that we have gotten used to over the last ten years is a dead end.

What Should Happen at the Boundary

Given that we write object-oriented code and that data at the boundary is anything but object-oriented, how do we deal with it?

One option is to stick with what we already have. To bridge the gap we must then develop translation layers that can translate the DTOs to properly encapsulated domain objects. This is the route I take with the samples in my book. However, this is a solution that more and more I’m beginning to think may not be the best. It has issues with maintainability. (Incidentally, that’s the problem with writing a book: at the time you’re done, you know so much more than you did when you started out… Not that I’m denouncing the book – it’s just not perfect…)
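A translation layer of this kind might look roughly like the sketch below. All three types (NameDto, PersonName and NameMapper) are hypothetical and are not taken from the book; they only show a DTO with no invariants being translated into an object that has some.

```csharp
using System;

// Boundary representation: default constructor, writable properties,
// no invariants. Serializers are happy with this shape.
public class NameDto
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Hypothetical domain object that protects its invariants.
public class PersonName
{
    private readonly string firstName;
    private readonly string lastName;

    public PersonName(string firstName, string lastName)
    {
        if (firstName == null)
        {
            throw new ArgumentNullException("firstName");
        }
        if (lastName == null)
        {
            throw new ArgumentNullException("lastName");
        }

        this.firstName = firstName;
        this.lastName = lastName;
    }

    public string FirstName
    {
        get { return this.firstName; }
    }

    public string LastName
    {
        get { return this.lastName; }
    }
}

// Hypothetical translation layer between the two representations.
public static class NameMapper
{
    public static PersonName ToDomain(NameDto dto)
    {
        if (dto == null)
        {
            throw new ArgumentNullException("dto");
        }

        // The domain constructor rejects incomplete input.
        return new PersonName(dto.FirstName, dto.LastName);
    }
}
```

Every new field has to be added in three places, which hints at the maintainability cost of this option.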

Another option is to stop treating data as objects and start treating it as the structured data that it really is. It would be really nice if our programming language had a separate concept of structured data… Interestingly, while C# has nothing of the kind, F# has tons of ways to model data structures without behavior. Perhaps that’s a more honest approach to dealing with data… I will need to experiment more with this…

A third option is to look towards dynamic types. In his article Cutting Edge: Expando Objects in C# 4.0, Dino Esposito outlines a dynamic approach towards consuming structured data that shortcuts auto-generated code and provides a lightweight API to structured data. This also looks like a promising approach… It doesn’t provide compile-time feedback, but that’s only a false sense of security anyway. We must resort to unit tests to get rapid feedback, but we’re all using TDD already, right?
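To give a rough idea of the dynamic option, here is a small sketch built on ExpandoObject from System.Dynamic. It is not taken from the article mentioned above; the DynamicNameReader helper is a hypothetical name, and it relies only on the fact that ExpandoObject implements IDictionary<string, object>.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// Hypothetical helper for illustration only.
public static class DynamicNameReader
{
    // Reads a required member from a piece of dynamic structured data.
    // A missing member is discovered at run-time, not at compile-time.
    public static string GetRequired(ExpandoObject data, string member)
    {
        if (data == null)
        {
            throw new ArgumentNullException("data");
        }

        var members = (IDictionary<string, object>)data;
        object value;
        if (!members.TryGetValue(member, out value) || value == null)
        {
            throw new ArgumentException(
                "Missing required member: " + member, "data");
        }

        return value.ToString();
    }
}
```

A caller could populate the data with `dynamic name = new ExpandoObject(); name.firstName = "Mark";` and then ask for both firstName and lastName; the second request fails at run-time, which is exactly the kind of feedback unit tests would have to provide.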

In summary, my entire series about encapsulation relates to object-oriented programming. Although there are lots of technologies available to represent boundary data as ‘objects’, they are false objects. Even if we use an object-oriented language at the boundary, the code has nothing to do with object orientation. Thus, the Poka-yoke Design rules don’t apply there.

Now go back and reread this post, but replace ‘DTO’ with ‘Entity’ (or whatever your ORM calls its representation of a relational table row) and you should begin to see the contours of why ORMs are problematic.

posted on Tuesday, May 31, 2011 3:27:11 PM (Romance Daylight Time, UTC+02:00)
# Monday, May 30, 2011

This post is the fifth in a series about Poka-yoke Design – also known as encapsulation.

Default constructors are code smells. There you have it. That probably sounds outrageous, but consider this: object-orientation is about encapsulating behavior and data into cohesive pieces of code (classes). Encapsulation means that the class should protect the integrity of the data it encapsulates. When data is required, it must often be supplied through a constructor. Conversely, a default constructor implies that no external data is required. That’s a rather weak statement about the invariants of the class.

Please be aware that this post describes a smell: whenever a certain idiom or pattern (in this case a default constructor) is encountered in code, it should trigger further investigation.

As I will outline below, there are several scenarios where default constructors are perfectly fine, so the purpose of this blog post is not to thunder against default constructors. It’s to provide food for thought.

If you have read my book you will know that Constructor Injection is the dominating DI pattern exactly because it statically advertises dependencies and protects the integrity of those dependencies by guaranteeing that an initialized consumer is always in a consistent state. This is fail-safe design because the compiler can enforce the relationship, thus providing rapid feedback.

This principle extends far beyond DI. In a previous post I described how a constructor with arguments statically advertises that the argument is required:

public class Fragrance : IFragrance
{
    private readonly string name;
 
    public Fragrance(string name)
    {
        if (name == null)
        {
            throw new ArgumentNullException("name");
        }
 
        this.name = name;
    }
 
    public string Spread()
    {
        return this.name;
    }
}

The Fragrance class protects the integrity of the name by requiring it through the constructor. Since this class requires the name to implement its behavior, requesting it through the constructor is the correct thing to do. A default constructor would not have been fail-safe, since it would introduce a temporal coupling.

Consider that objects are supposed to be containers of behavior and data. Whenever an object contains data, the data must be encapsulated. In the (very common) case where no meaningful default value can be defined, the data must be provided via the constructor. Thus, default constructors might indicate that encapsulation is broken.

When are Default Constructors OK?

There are still scenarios where default constructors are in order (I’m sure there are more than those listed here).

  • If a default constructor can assign meaningful default values to all contained fields a default constructor still protects the invariants of the class. As an example, the default constructor of UriBuilder initializes its internal values to a consistent set that will build the Uri http://localhost unless one or more of its properties are subsequently manipulated. You may agree or disagree with this default behavior, but it’s consistent and so encapsulation is preserved.
  • If a class contains no data obviously there is no data to protect. This may be a symptom of the Feature Envy code smell, which is often evidenced by the class in question being a concrete class.
    • If such a class can be turned into a static class it’s a certain sign of Feature Envy.
    • If, on the other hand, the class implements an interface, it might be a sign that it actually represents pure behavior.

A class that represents pure behavior by implementing an interface is not necessarily a bad thing. This can be a very powerful construct.
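As a sketch of what pure behavior behind an interface might look like, consider the hypothetical IDiscountPolicy below. Neither implementation holds any data, so their default constructors have nothing to protect, yet the interface still lets a consumer swap one behavior for another.

```csharp
using System;

// Hypothetical interface, invented for this illustration.
public interface IDiscountPolicy
{
    decimal Apply(decimal price);
}

// No fields, so the implicit default constructor is harmless.
public class NoDiscountPolicy : IDiscountPolicy
{
    public decimal Apply(decimal price)
    {
        return price;
    }
}

// Also stateless: the behavior is the whole point of the class.
public class TenPercentDiscountPolicy : IDiscountPolicy
{
    public decimal Apply(decimal price)
    {
        return price * 0.9m;
    }
}
```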

In summary, a default constructor should be a signal to stop and think about the invariants of the class in question. Does the default constructor sufficiently guarantee the integrity of the encapsulated data? If so, the default constructor is appropriate, but otherwise it’s not. In my experience, default constructors tend to be the exception rather than the rule.

posted on Monday, May 30, 2011 3:02:02 PM (Romance Daylight Time, UTC+02:00)
# Friday, May 27, 2011

This post is the fourth in a series about Poka-yoke Design – also known as encapsulation.

Recently I saw this apparently enthusiastic tweet reporting from some Microsoft technology event:

[Required] attribute in code automatically creates a non-nullable entry in DB and validation in the webpage – nice […]

I imagine that it must look something like this:

public class Smell
{
    [Required]
    public int Id { get; set; }
}

Every time I see something like this I die a little inside. If you already read my previous posts it should by now be painfully clear why this breaks encapsulation. Despite the [Required] attribute there’s no guarantee that the Id property will ever be assigned a value. The attribute is just a piece of garbage making a claim it can’t back up.

Code like that is not fail-safe.

I understand that the attribute mentioned in the above tweet is intended to signal to some tool (probably EF) that the property must be mapped to a database schema as non-nullable, but it’s still redundant. Attributes are not the correct way to make a statement about invariants.

Improved Design

The [Required] attribute is redundant because there’s a much better way to state that a piece of data is required. This has been possible since .NET 1.0. Here’s the Poka-yoke version of that same statement:

public class Fragrance
{
    private readonly int id;
 
    public Fragrance(int id)
    {
        this.id = id;
    }
 
    public int Id
    {
        get { return this.id; }
    }
}

This simple structural design ensures that the ID truly is required (and if the ID can only be positive a Guard Clause can be added). An instance of Fragrance can only be created with an ID. Since this is a structural construction, the compiler can enforce the requirement, giving us rapid feedback.
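If the ID can only be positive, the Guard Clause mentioned above could look like this sketch. The rule that zero and negative numbers are invalid is an assumption made for the example.

```csharp
using System;

public class Fragrance
{
    private readonly int id;

    public Fragrance(int id)
    {
        // Assumption for this sketch: only positive IDs are valid.
        if (id <= 0)
        {
            throw new ArgumentOutOfRangeException(
                "id", "The ID must be a positive number.");
        }

        this.id = id;
    }

    public int Id
    {
        get { return this.id; }
    }
}
```

The compiler still enforces that an ID is supplied; the Guard Clause adds a run-time check for the part of the invariant the type system can’t express.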

I do realize that the [Required] attribute mentioned above is intended to address the challenge of mapping objects to relational data and rendering, but instead of closing the impedance mismatch gap, it widens it. Instead of introducing yet another redundant attribute the team should have made their tool understand simple idioms for encapsulation like the one above.

This isn’t at all hard to do. As an example, DI Containers thrive on structural information encoded into constructors (this is called Auto-wiring). The team behind the [Required] attribute could have done that as well. The [Required] attribute is a primitive and toxic hack.

This is the major reason I never expect to use EF. It forces developers to break encapsulation, which is a principle upon which I refuse to compromise.
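To illustrate that the structural information mentioned above is readily available to tools, here is a small Reflection sketch. It is not how any particular DI Container or ORM works; the RequiredValues helper is a hypothetical name, and picking the constructor with the fewest parameters is an assumption made for the example.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Sample class using the constructor idiom for required data.
public class Fragrance
{
    private readonly int id;

    public Fragrance(int id)
    {
        this.id = id;
    }

    public int Id
    {
        get { return this.id; }
    }
}

// Hypothetical helper for illustration only.
public static class RequiredValues
{
    // Returns the parameter names of the public constructor with the
    // fewest parameters: the values a caller cannot avoid supplying.
    public static string[] For(Type type)
    {
        if (type == null)
        {
            throw new ArgumentNullException("type");
        }

        ConstructorInfo constructor = type.GetConstructors()
            .OrderBy(c => c.GetParameters().Length)
            .FirstOrDefault();
        if (constructor == null)
        {
            return new string[0];
        }

        return constructor.GetParameters().Select(p => p.Name).ToArray();
    }
}
```

Calling RequiredValues.For(typeof(Fragrance)) yields the single name id, with no attribute involved.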

posted on Friday, May 27, 2011 3:21:06 PM (Romance Daylight Time, UTC+02:00)
# Thursday, May 26, 2011

This post is the third in a series about Poka-yoke Design – also known as encapsulation.

Automatic properties are one of the most redundant features of C#. I know that some people really love them, but they address a problem you shouldn’t have in the first place.

I totally agree that code like this looks redundant:

private string name;
public string Name
{
    get { return this.name; }
    set { this.name = value; }
}

However, the solution is not to write this instead:

public string Name { get; set; }

The problem with the first code snippet isn’t that it contains too much ceremony. The problem is that it breaks encapsulation. In fact:

“[…] getters and setters do not achieve encapsulation or information hiding: they are a language-legitimized way to violate them.”

James O. Coplien & Gertrud Bjørnvig. Lean Architecture. Wiley. 2010. p. 134.

While I personally think that properties do have their uses, I very rarely find use for automatic properties. They are never appropriate for reference types, and only rarely for value types.

Code Smell: Automatic Reference Type Property

First of all, let’s consider the very large set of properties that expose a reference type.

In the case of reference types, null is a possible value. However, when we think about Poka-yoke design, null is never an appropriate value because it leads to NullReferenceExceptions. The Null Object pattern provides a better alternative to deal with situations where a value might be undefined.

In other words, an automatic property like the Name property above is never appropriate. The setter must have some kind of Guard Clause to protect it against null (and possibly other invalid values). Here’s the most fundamental example:

private string name;
public string Name
{
    get { return this.name; }
    set 
    {
        if (value == null)
        {
            throw new ArgumentNullException("value");
        }
        this.name = value; 
    }
}

As an alternative, a Guard Clause could also check for null and provide a default Null Object in the cases where the assigned value is null:

private string name;
public string Name
{
    get { return this.name; }
    set 
    {
        if (value == null)
        {
            this.name = "";
            return;
        }
        this.name = value; 
    }
}

However, this implementation contains a POLA violation because the getter sometimes returns a different value than what was assigned. It’s possible to fix this problem by adding an associated boolean field indicating whether the name was assigned null so that null can be returned from the getter in this special case, but that leads to another code smell.

Code Smell: Automatic Value Type Property

If the type of the property is a value type, the case is less clear-cut because value types can’t be null. This means that a Null Guard is never appropriate. However, directly consuming a value type may still be inappropriate. In fact, it’s only appropriate if the class can meaningfully accept and handle any value of that type.

If, for example, the class can really only handle a certain subset of all possible values, a Guard Clause must be introduced. Consider this example:

public int RetryCount { get; set; }

This property might be used to set the appropriate number of retries for a given operation. The problem with using an automatic property is that it’s possible to assign a negative value to it, and that wouldn’t make any sense. One possible remedy is to add a Guard Clause:

private int retryCount;
public int RetryCount
{
    get { return this.retryCount; }
    set
    {
        if (value < 0)
        {
            throw new ArgumentOutOfRangeException();
        }
        this.retryCount = value;
    }
}

However, in many cases, exposing a primitive property is more likely to be a case of Primitive Obsession.

Improved Design: Guard Clause

As I described above, the most immediate fix for automatic properties is to properly implement the property with a Guard Clause. This ensures that the class’ invariants are properly encapsulated.

Improved Design: Value Object Property

When the automatic property is a value type, a Guard Clause may still be in order. However, when the property is really a symptom of Primitive Obsession, a better alternative is to introduce a proper Value Object.

Consider, as an example, this property:

public int Temperature { get; set; }

This is bad design for a number of reasons. It doesn’t communicate the unit of measure and allows unbounded values to be assigned. What happens if –100 is assigned? If the unit of measure is Celsius it should succeed, although in the case when it’s Kelvin, it should fail. No matter the unit of measure, attempting to assign int.MinValue should fail.

A more robust design can be had if we introduce a new Temperature type and change the property to have that type. Apart from protection of invariants it would also encapsulate conversion between different temperature scales.

However, if that Value Object is implemented as a reference type the situation is equivalent to the situation described above, and a Null Guard is necessary. Only in the case where the Value Object is implemented as a value type is an automatic property appropriate.
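A Temperature Value Object along those lines could be sketched as follows. This is one possible design and not a definitive one: storing the value in Kelvin as a double, the factory method names and the Thermostat class are all assumptions made for the illustration.

```csharp
using System;

// Sketch of a Value Object implemented as a value type.
public struct Temperature
{
    private const double AbsoluteZeroInCelsius = -273.15;

    // Assumption: the value is stored in Kelvin, so default(Temperature)
    // is 0 K, which is still a valid temperature.
    private readonly double kelvin;

    private Temperature(double kelvin)
    {
        if (double.IsNaN(kelvin) || double.IsInfinity(kelvin) || kelvin < 0)
        {
            throw new ArgumentOutOfRangeException(
                "kelvin", "A temperature cannot be below absolute zero.");
        }

        this.kelvin = kelvin;
    }

    public static Temperature FromKelvin(double kelvin)
    {
        return new Temperature(kelvin);
    }

    public static Temperature FromCelsius(double celsius)
    {
        return new Temperature(celsius - AbsoluteZeroInCelsius);
    }

    public double Kelvin
    {
        get { return this.kelvin; }
    }

    public double Celsius
    {
        get { return this.kelvin + AbsoluteZeroInCelsius; }
    }
}

// Hypothetical consumer: because Temperature is a value type that
// protects its own invariants, an automatic property is acceptable here.
public class Thermostat
{
    public Temperature Target { get; set; }
}
```

With this design –100 is accepted through FromCelsius but rejected through FromKelvin, and the unit of measure is stated at the call site instead of being implied.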

The bottom line is that automatic properties are rarely appropriate. In fact, they are only appropriate when the type of the property is a value type and all conceivable values are allowed. Since there are a few cases where automatic properties are appropriate their use can’t be entirely dismissed, but it should be treated as warranting further investigation. It’s a code smell, not an anti-pattern.

On a different note properties also violate the Law of Demeter, but that’s the topic of a future blog post…

posted on Thursday, May 26, 2011 3:33:13 PM (Romance Daylight Time, UTC+02:00)
# Wednesday, May 25, 2011

This post is the second in a series about Poka-yoke Design – also known as encapsulation.

Many classes have a tendency to consume or expose primitive values like integers and strings. While such primitive types exist on any platform, they tend to lead to procedural code. Furthermore they often break encapsulation by allowing invalid values to be assigned.

This problem has been addressed many times before. Years ago Jimmy Bogard provided an excellent treatment of the issue, as well as guidance on how to resolve it. In relation to AutoFixture I also touched upon the subject some time ago. As such, the current post is mostly a placeholder.

However, it’s worth noting that both Jimmy’s and my own post address the concern that strings and integers do not sufficiently encapsulate the concepts of Zip codes and phone numbers.

  • When a Zip code is represented as a string it’s possible to assign values such as null, string.Empty, “foo”, very long strings, etc. Jimmy’s ZipCode class encapsulates the concept by guaranteeing that an instance can only be successfully created with a correct value.
  • When a Danish phone number is represented as an integer it’s possible to assign values such as –98, 0, int.MaxValue, etc. Once again the DanishPhoneNumber class from the above example encapsulates the concept by guaranteeing that an instance can only be successfully created with a correct value.

Encapsulation is broken unless the concept represented by a primitive value can truly take any of the possible values of the primitive type. This is rarely the case.

Design Smell:

A class consumes a primitive type. However, further analysis shows that not all possible values of the type are legal values.

Improved Design:

Encapsulate the primitive value in a Value Object that contains appropriate Guard Clauses etc. to guarantee that only valid instances are possible.

Primitives tend to not be fail-safe, but encapsulated Value Objects are.
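Since this post contains no code of its own, here is a minimal sketch of the idea. It is not the DanishPhoneNumber class from the post referred to above, but a hypothetical stand-in, and the rule that a valid number has exactly eight digits is an assumption made for the example.

```csharp
using System;

// Hypothetical Value Object; the validation rule is an assumption.
public sealed class DanishPhoneNumber
{
    private readonly int number;

    public DanishPhoneNumber(int number)
    {
        // Assumption: a valid number has exactly eight digits.
        if (number < 10000000 || number > 99999999)
        {
            throw new ArgumentOutOfRangeException(
                "number", "A Danish phone number must have eight digits.");
        }

        this.number = number;
    }

    public int RawNumber
    {
        get { return this.number; }
    }

    // Value Objects compare by value, not by reference.
    public override bool Equals(object obj)
    {
        var other = obj as DanishPhoneNumber;
        return other != null && other.number == this.number;
    }

    public override int GetHashCode()
    {
        return this.number.GetHashCode();
    }
}
```

A class that consumes DanishPhoneNumber instead of int only needs a Null Guard; values such as –98, 0 and int.MaxValue can no longer reach it.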

posted on Wednesday, May 25, 2011 5:03:31 PM (Romance Daylight Time, UTC+02:00)
# Tuesday, May 24, 2011

This post is the first in a series about Poka-yoke Design – also known as encapsulation.

A common problem in API design is temporal coupling, which occurs when there’s an implicit relationship between two, or more, members of a class requiring clients to invoke one member before the other. This tightly couples the members in the temporal dimension.

The archetypical example is the use of an Initialize method, although copious other examples can be found – even in the BCL. As an example, this usage of EndpointAddressBuilder compiles, but fails at run-time:

var b = new EndpointAddressBuilder();
var e = b.ToEndpointAddress();

It turns out that at least a URI is required before an EndpointAddress can be created. The following code compiles and succeeds at run-time:

var b = new EndpointAddressBuilder();
b.Uri = new UriBuilder().Uri;
var e = b.ToEndpointAddress();

The API provides no hint that this is necessary, but there’s a temporal coupling between the Uri property and the ToEndpointAddress method.

In the rest of the post I will provide a more complete example, as well as a guideline to improve the API towards Poka-yoke Design.

Smell Example

This example describes a more abstract code smell, exhibited by the Smell class. The public API looks like this:

public class Smell
{
    public void Initialize(string name)
 
    public string Spread()
}

Semantically the name of the Initialize method is obviously a clue, but on a structural level this API gives us no indication of temporal coupling. Thus, code like this compiles, but throws an exception at run-time:

var s = new Smell();
var n = s.Spread();

It turns out that the Spread method throws an InvalidOperationException because the Smell has not been initialized with a name. The problem with the Smell class is that it doesn’t properly protect its invariants. In other words, encapsulation is broken.
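The post only shows the public API of Smell, so the following is a guess at what an implementation with this behavior might look like. The field and the exception message are assumptions; only the member signatures and the InvalidOperationException come from the description above.

```csharp
using System;

// A possible implementation of the Smell class (assumed, not the original).
public class Smell
{
    // Stays null until Initialize is called: this is the hidden state
    // that creates the temporal coupling.
    private string name;

    public void Initialize(string name)
    {
        if (name == null)
        {
            throw new ArgumentNullException("name");
        }

        this.name = name;
    }

    public string Spread()
    {
        // A run-time check is the best this design can offer;
        // the compiler cannot see the required call order.
        if (this.name == null)
        {
            throw new InvalidOperationException(
                "Initialize must be called before Spread.");
        }

        return this.name;
    }
}
```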

To fix the issue the Initialize method must be invoked before the Spread method:

var sut = new Smell();
sut.Initialize("Sulphur");
var n = sut.Spread();

While it’s possible to write unit tests that explore the behavior of the Smell class, it would be better if the design was improved to enable the compiler to provide feedback.

Improvement: Constructor Injection

Encapsulation (Poka-yoke style) requires that the class can never be in an inconsistent state. Since the name of the smell is required, a guarantee that it is always available must be built into the class. If no good default value is available, the name must be requested via the constructor:

public class Fragrance : IFragrance
{
    private readonly string name;
 
    public Fragrance(string name)
    {
        if (name == null)
        {
            throw new ArgumentNullException("name");
        }
 
        this.name = name;
    }
 
    public string Spread()
    {
        return this.name;
    }
}

This effectively guarantees that the name is always available in all instances of the class. There are also positive side effects:

  • The cyclomatic complexity of the class has been reduced
  • The class is now immutable, and thereby thread-safe

However, there are times when the original version of the class implements an interface that causes the temporal coupling. It might have looked like this:

public interface ISmell
{
    void Initialize(string name);
 
    string Spread();
}

In many cases the injected value (name) is unknown until run-time, in which case straight use of the constructor seems prohibitive – after all, the constructor is an implementation detail and not part of the loosely coupled API. When programming against an interface it’s not possible to invoke the constructor.

There’s a solution for that as well.

Improvement: Abstract Factory

To decouple the methods in the ISmell (ha ha) interface the Initialize method can be moved to a new interface. Instead of mutating the (inconsistent) state of a class, the Create method (formerly known as Initialize) returns a new instance of the IFragrance interface:

public interface IFragranceFactory
{
    IFragrance Create(string name);
}

The implementation is straightforward:

public class FragranceFactory : IFragranceFactory
{
    public IFragrance Create(string name)
    {
        if (name == null)
        {
            throw new ArgumentNullException("name");
        }
        return new Fragrance(name);
    }
}

This enables encapsulation because both the FragranceFactory and Fragrance classes protect their invariants. They can never be in an inconsistent state. A client previously interacting with the ISmell interface can use the IFragranceFactory/IFragrance combination to achieve the same functionality:

var f = factory.Create(name);
var n = f.Spread();

This is better because improper use of the API can now be detected by the compiler instead of at run-time. An interesting side effect of moving towards a more statically declared interaction structure is that classes tend towards immutability. Immutable classes are automatically thread-safe, which is an increasingly important trait in the (relatively) new multi-core era.
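The client snippet above doesn’t show where the factory comes from. One plausible arrangement, sketched here, is to supply it to the consumer via Constructor Injection. The Perfumery class is hypothetical, and the IFragrance definition is inferred from how the interface is used in this post.

```csharp
using System;

// Inferred from usage in the post: a fragrance can be spread.
public interface IFragrance
{
    string Spread();
}

public interface IFragranceFactory
{
    IFragrance Create(string name);
}

public class Fragrance : IFragrance
{
    private readonly string name;

    public Fragrance(string name)
    {
        if (name == null)
        {
            throw new ArgumentNullException("name");
        }

        this.name = name;
    }

    public string Spread()
    {
        return this.name;
    }
}

public class FragranceFactory : IFragranceFactory
{
    public IFragrance Create(string name)
    {
        return new Fragrance(name);
    }
}

// Hypothetical consumer: the factory is required, so it is requested
// through the constructor, and the name arrives later at run-time.
public class Perfumery
{
    private readonly IFragranceFactory factory;

    public Perfumery(IFragranceFactory factory)
    {
        if (factory == null)
        {
            throw new ArgumentNullException("factory");
        }

        this.factory = factory;
    }

    public string Sample(string name)
    {
        return this.factory.Create(name).Spread();
    }
}
```

Neither Perfumery nor Fragrance can exist in a half-initialized state, so there is no call order for a client to get wrong.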

posted on Tuesday, May 24, 2011 4:00:42 PM (Romance Daylight Time, UTC+02:00)

Encapsulation is one of the most misunderstood aspects of object-oriented programming. Most people seem to think that the related concept of information hiding simply means that private fields should be exposed by public properties (or getter/setter methods in languages that don’t have native properties).

Have you ever wondered what’s the real benefit to be derived from code like the following?

private string name;
public string Name
{
    get { return this.name; }
    set { this.name = value; }
}

This feels awfully much like redundant code to me (and automatic properties are not the answer – it’s just a compiler trick that still creates private backing fields). No information is actually hidden. Derick Bailey has a good piece on why this view of encapsulation is too narrow, so I’m not going to reiterate all his points here.

So then what is encapsulation?

The whole point of object-orientation is to produce cohesive pieces of code (classes) that solve given problems once and for all, so that programmers can use those classes without having to learn about the intricate details of the implementations.

This is what encapsulation is all about: exposing a solution to a problem without requiring the consumer to fully understand the problem domain.

This is what all well-designed classes do.

  • You don’t have to know the intricate details of TDS to use ADO.NET against SQL Server.
  • You don’t have to know the intricate details of painting on the screen to use WPF or Windows Forms.
  • You don’t have to know the intricate details of Reflection to use a DI Container.
  • You don’t have to know how to efficiently sort a list in order to efficiently sort a list in .NET.
  • Etc.

What makes encapsulation so important is exactly this trait. The class must hide the information it encapsulates in order to protect it against ‘naïve’ users. Wikipedia has this to say:

Hiding the internals of the object protects its integrity by preventing users from setting the internal data of the component into an invalid or inconsistent state.

Keep in mind that users are expected to not fully understand the internal implementation of a class. This makes it obvious what encapsulation is really about:

Encapsulation is a fail-safe mechanism.

By corollary, encapsulation does not mean hiding complexity. Whenever complexity is hidden (as is the case for Providers) feedback time increases. Rapid feedback is much preferred, so delaying feedback is not desirable if it can be avoided.

Encapsulation is not about hiding complexity, but conversely exposing complexity in a fail-safe manner.

In Lean this is known as Poka-yoke, so I find it only fitting to think about encapsulation as Poka-yoke Design: APIs that make it as hard as possible to do the wrong thing. Considering that compilation is the cheapest feedback mechanism, it’s preferable to design APIs so that the code can only compile when classes are used correctly.
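To make that concrete, here’s a minimal sketch of the difference. The Customer class is hypothetical and made up for this illustration. Compared to the Name property shown earlier, it demands its required data through the constructor and guards it, so code that forgets the name doesn’t compile, and a null name fails immediately instead of much later.

```csharp
using System;

// Hypothetical example type, made up for illustration.
public sealed class Customer
{
    private readonly string name;

    // The only way to create a Customer is to supply a name, so the
    // compiler rejects 'new Customer()' outright.
    public Customer(string name)
    {
        // Guard Clause: what the compiler can't check fails fast at run-time.
        if (name == null)
        {
            throw new ArgumentNullException("name");
        }

        this.name = name;
    }

    // Read-only: the invariant established by the constructor can't be
    // broken after construction.
    public string Name
    {
        get { return this.name; }
    }
}
```

The design choice is to move as much feedback as possible to compile time, and to let the Guard Clause cover the rest as early as possible.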

In a series of blog posts I will look at various design smells that break encapsulation, as well as provide guidance on how to improve the design to make it safer, thus going from smell to fragrance.

  1. Design Smell: Temporal Coupling
  2. Design Smell: Primitive Obsession
  3. Code Smell: Automatic Property
  4. Design Smell: Redundant Required Attribute
  5. Design Smell: Default Constructor

Postscript: At the Boundaries, Applications are Not Object-Oriented

posted on Tuesday, May 24, 2011 3:57:39 PM (Romance Daylight Time, UTC+02:00)
# Monday, May 16, 2011

Recently I had the inclination to do the Tennis Kata a couple of times. The first time I saw it I thought it wasn’t terribly interesting as an exercise in C# development. It would basically just be an application of the State pattern, so I decided to make it a bit more interesting. More or less by intuition I decided to give myself the following constraints:

  • All types must be immutable
  • All members must have a cyclomatic complexity of 1

Now that’s more interesting :)

Given these constraints, what would be the correct approach? Given that this is a finite state machine with a fixed number of states, the Visitor pattern will be a good match.

Each player’s score can be modeled as a Value Object that can be one of these types:

  • ZeroPoints
  • FifteenPoints
  • ThirtyPoints
  • FortyPoints
  • AdvantagePoint
  • GamePoint

All of these classes implement the IPoints interface:

public interface IPoints
{
    IPoints Accept(IPoints visitor);
 
    IPoints LoseBall();
 
    IPoints WinBall(IPoints opponentPoints);
 
    IPoints WinBall(AdvantagePoint opponentPoints);
 
    IPoints WinBall(FortyPoints opponentPoints);
}

The interesting insight here is that until the opponent's score reaches FortyPoints nothing special happens. Those states can be effectively collapsed into the WinBall(IPoints) method. However, when the opponent either has FortyPoints or AdvantagePoint, special things happen, so IPoints has specialized methods for those cases. All implementations should use double dispatch to invoke the correct overload of WinBall, so the Accept method must be implemented like this:

public IPoints Accept(IPoints visitor)
{
    return visitor.WinBall(this);
}

That’s the core of the Visitor pattern in action. When the implementer of the Accept method is either FortyPoints or AdvantagePoint, the specialized overload will be invoked.

It’s now possible to create a context around a pair of IPoints (called a Game) to implement a method to register that Player 1 won a ball:

public Game PlayerOneWinsBall()
{
    var newPlayerOnePoints = this.PlayerTwoScore
        .Accept(this.PlayerOneScore);
    var newPlayerTwoPoints = 
        this.PlayerTwoScore.LoseBall();
    return new Game(
        newPlayerOnePoints, newPlayerTwoPoints);
}

A similar method for player two simply reverses the roles. (I’m currently reading Lean Architecture, but have yet to reach the chapter on DCI. However, considering what I’ve already read about DCI, this seems to fit the bill pretty well… although I might be wrong on that account.)

The context calculates new scores for both players and returns the result as a new instance of the Game class. This keeps the Game and IPoints implementations immutable.

The new score for the winner depends on the opponent’s score, so the appropriate overload of WinBall should be invoked. The Visitor implementation makes it possible to pick the right overload without resorting to casts and if statements. As an example, the FortyPoints class implements the three WinBall overloads like this:

public IPoints WinBall(IPoints opponentPoints)
{
    return new GamePoint();
}
 
public IPoints WinBall(FortyPoints opponentPoints)
{
    return new AdvantagePoint();
}
 
public IPoints WinBall(AdvantagePoint opponentPoints)
{
    return this;
}

It’s also important to correctly implement the LoseBall method. In most cases, losing a ball doesn’t change the current state of the loser, in which case the implementation looks like this:

public IPoints LoseBall()
{
    return this;
}

However, when the player has advantage and loses the ball, he or she loses the advantage, so for the AdvantagePoint class the implementation looks like this:

public IPoints LoseBall()
{
    return new FortyPoints();
}

To keep things simple I decided to implicitly model deuce as both players having FortyPoints, so there’s no explicit Deuce class. Thus, AdvantagePoint returns FortyPoints when losing the ball.
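To see the pieces working together, here’s a cut-down sketch that only covers the end game (FortyPoints, AdvantagePoint and GamePoint). It’s not the code from the attached download: the members marked as assumptions are my guesses for states the text doesn’t describe, and the Game class only has the two-argument constructor. It does show the PlayerTwoWinsBall method, which reverses the roles.

```csharp
using System;

public interface IPoints
{
    IPoints Accept(IPoints visitor);
    IPoints LoseBall();
    IPoints WinBall(IPoints opponentPoints);
    IPoints WinBall(AdvantagePoint opponentPoints);
    IPoints WinBall(FortyPoints opponentPoints);
}

public sealed class FortyPoints : IPoints
{
    // 'this' is statically typed as FortyPoints, so the compiler binds the
    // call to the WinBall(FortyPoints) overload on the visitor.
    public IPoints Accept(IPoints visitor) { return visitor.WinBall(this); }
    public IPoints LoseBall() { return this; }
    public IPoints WinBall(IPoints opponentPoints) { return new GamePoint(); }
    public IPoints WinBall(FortyPoints opponentPoints) { return new AdvantagePoint(); }
    public IPoints WinBall(AdvantagePoint opponentPoints) { return this; }
}

public sealed class AdvantagePoint : IPoints
{
    public IPoints Accept(IPoints visitor) { return visitor.WinBall(this); }
    public IPoints LoseBall() { return new FortyPoints(); }
    public IPoints WinBall(IPoints opponentPoints) { return new GamePoint(); }
    public IPoints WinBall(FortyPoints opponentPoints) { return new GamePoint(); }
    // Assumption: both players can't have advantage at once, so this
    // overload is never reached in a legal game.
    public IPoints WinBall(AdvantagePoint opponentPoints) { return this; }
}

public sealed class GamePoint : IPoints
{
    // Assumption: once the game is won, this sketch keeps the winner's
    // score unchanged.
    public IPoints Accept(IPoints visitor) { return visitor.WinBall(this); }
    public IPoints LoseBall() { return this; }
    public IPoints WinBall(IPoints opponentPoints) { return this; }
    public IPoints WinBall(FortyPoints opponentPoints) { return this; }
    public IPoints WinBall(AdvantagePoint opponentPoints) { return this; }
}

public sealed class Game
{
    private readonly IPoints playerOneScore;
    private readonly IPoints playerTwoScore;

    public Game(IPoints playerOneScore, IPoints playerTwoScore)
    {
        this.playerOneScore = playerOneScore;
        this.playerTwoScore = playerTwoScore;
    }

    public IPoints PlayerOneScore { get { return this.playerOneScore; } }
    public IPoints PlayerTwoScore { get { return this.playerTwoScore; } }

    public Game PlayerOneWinsBall()
    {
        return new Game(
            this.playerTwoScore.Accept(this.playerOneScore),
            this.playerTwoScore.LoseBall());
    }

    // Same as above, with the roles reversed.
    public Game PlayerTwoWinsBall()
    {
        return new Game(
            this.playerOneScore.LoseBall(),
            this.playerOneScore.Accept(this.playerTwoScore));
    }
}
```

Starting from new Game(new FortyPoints(), new FortyPoints()), a ball won by player one yields AdvantagePoint against FortyPoints. If player two then wins a ball, both are back at FortyPoints, without a single if statement or cast.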

Using the Visitor pattern it’s possible to keep the cyclomatic complexity at 1. The code has no branches or loops. It’s immutable to boot, so a game might look like this:

[Fact]
public void PlayerOneWinsAfterHardFight()
{
    var game = new Game()
        .PlayerOneWinsBall()
        .PlayerOneWinsBall()
        .PlayerOneWinsBall()
        .PlayerTwoWinsBall()
        .PlayerTwoWinsBall()
        .PlayerTwoWinsBall()
        .PlayerTwoWinsBall()
        .PlayerOneWinsBall()
        .PlayerOneWinsBall()
        .PlayerOneWinsBall();
 
    Assert.Equal(new GamePoint(), game.PlayerOneScore);
    Assert.Equal(new FortyPoints(), game.PlayerTwoScore);
}

In case you’d like to take a closer look at the code I’m attaching it to this post. It was driven completely by using the AutoFixture.Xunit extension, so if you are interested in idiomatic AutoFixture code it’s also a good example of that.

TennisKata.zip (3.09 MB)
posted on Monday, May 16, 2011 1:01:00 PM (Romance Daylight Time, UTC+02:00)
# Monday, May 02, 2011

Recently I partook in a Windows Azure migration workshop, helping developers from existing development organizations port their applications to Windows Azure. Once more an old design smell popped up: SQL Server over-utilization. This ought to be old news to anyone with experience designing software on the Wintel stack, but apparently it bears repetition:

Don’t put logic in your database. SQL Server should be used only for persistent storage of data.

(Yes: this post is written in 2011…)

Many years ago I heard that role described as a ‘bit bucket’ – you put in data and pull it out again, and that’s all you do. No fancy stored procedures or functions or triggers.

Why wouldn’t we want to use the database if we have one? Scalability is the answer. SQL Server doesn’t scale horizontally. You can’t add more servers to take the load off a database server (well, some of my old colleagues will argue that this is possible with Oracle, and that may be true, but with SQL Server it’s impossible).

Yes, we can jump through hoops like partitioning and splitting the database up into several smaller databases, but it still doesn’t give us horizontal scalability. SQL Server is a bottleneck in any system in which it takes part.

How is this relevant to Windows Azure? It’s relevant for two important reasons:

  • There’s an upper size limit on SQL Azure. Currently that size limit is 50 GB, and while it’s likely to grow in the future, there’s going to be a ceiling for a long time.
  • You can’t fine tune the hardware for performance. The server runs on virtual hardware.

Development organizations that rely heavily on the database for execution of logic often need expensive hardware and experienced DBAs to squeeze extra performance out of the database servers. Such people know that write-intensive/append-only tables work best with one type of RAID, while read-intensive tables are better hosted on other file groups on different disks with different RAID configurations.

With SQL Azure you can just forget about all that.

The bottom line is that there are fundamental rules for software development that you must follow if you want to be able to successfully migrate to Windows Azure. I previously described an even simpler sanity check you should perform, but after that you should take a good look at your database.

The best solution is if you can completely replace SQL Server with Azure’s very scalable storage services, but those come with their own set of challenges.

posted on Monday, May 02, 2011 2:23:49 PM (Romance Daylight Time, UTC+02:00)
# Wednesday, April 27, 2011

Developers exposed to ASP.NET are likely to be familiar with the so-called Provider pattern. You see it a lot in that part of the BCL: Role Provider, Membership Provider, Profile Provider, etc. Lots of text has already been written about Providers, but the reason I want to add yet another blog post on the topic is that once in a while I get the question of how it relates to Dependency Injection (DI).

Is Provider a proper way to do DI?

No, it has nothing to do with DI, but as it tries to mimic loose coupling I can understand the confusion.

First things first. Let’s start with the name. Is it a pattern at all? Regular readers of this blog may get the impression that I’m fond of calling everything and the kitchen sink an anti-pattern. That’s not true because I only make that claim when I’m certain I can hold that position, so I’m not going to denounce Provider as an anti-pattern. On the contrary I will make the claim that Provider is not a pattern at all.

A design pattern is not invented – it’s discovered as a repeated solution to a commonly recurring problem. Providers, on the other hand, were invented by Microsoft, and I’ve rarely seen them used outside their original scope. Secondly I’d also dispute that they solve anything.

That aside, however, I want to explain why Provider is bad design:

  • It uses the Constrained Construction anti-pattern
  • It hides complexity
  • It prevents proper lifetime management
  • It’s not testable

In the rest of this post I will explain each point in detail, but before I do that we need an example to look at. The old OrderProcessor example suffices, but instead of injecting IOrderValidator, IOrderCollector, and IOrderShipper this variation uses Providers to provide instances of the Services:

public SuccessResult Process(Order order)
{
    IOrderValidator validator = 
        ValidatorProvider.Validator;
    bool isValid = validator.Validate(order);
    if (isValid)
    {
        CollectorProvider.Collector.Collect(order);
        ShipperProvider.Shipper.Ship(order);
    }
 
    return this.CreateStatus(isValid);
}

The ValidatorProvider uses the configuration system to create and return an instance of IOrderValidator:

public static IOrderValidator Validator
{
    get 
    {
        var section = 
            OrderValidationConfigurationSection
                .GetSection();
        var typeName = section.ValidatorTypeName;
        var type = Type.GetType(typeName, true);
        var obj = Activator.CreateInstance(type);
        return (IOrderValidator)obj;
    }
}

There are lots of details I omitted here. I could have saved the reference for later use instead of creating a new instance each time the property is accessed. In that case I would also have had to make the code thread-safe, so I decided to skip that complexity. The code could also be more defensive, but I’m sure you get the picture.

The type name is defined in the app.config file like this:

<orderValidation 
  type="Ploeh.Samples.OrderModel.UnitTest.TrueOrderValidator,
        Ploeh.Samples.OrderModel.UnitTest" />

Obviously, CollectorProvider and ShipperProvider follow the same… blueprint.

This should be well-known to most .NET developers, so what’s wrong with this model?

Constrained Construction

In my book’s chapter on DI anti-patterns I describe the Constrained Construction anti-pattern. Basically it occurs every time there’s an implicit constraint on the constructor of an implementer. In the case of Providers the constraint is that each implementer must have a default constructor. In the example the culprit is this line of code:

var obj = Activator.CreateInstance(type);

This constrains any implementation of IOrderValidator to have a default constructor, which obviously means that the most fundamental DI pattern Constructor Injection is out of the question.

A variation of the Provider idiom is to supply an Initialize method with a context, but this creates a temporal coupling while still not enabling us to inject arbitrary Services into our implementations. I’m not going to repeat six pages of detailed description of Constrained Construction here, but the bottom line is that you can’t fix it – you have to refactor towards true DI, preferably Constructor Injection.

Hidden complexity

Providers hide the complexity of their implementations. This is not the same as encapsulation. Rather it’s a dishonest API and the problem is that it just postpones the moment when you discover how complex the implementation really is.

When you implement a client and use code like the following everything looks deceptively simple:

IOrderValidator validator = 
    ValidatorProvider.Validator;

However, if this is the only line of code you write it will fail, but you will not notice until run-time. Look back at the implementation of the Validator property if you need to refresh your memory. A lot of things can go wrong here:

  • The appropriate configuration section is not available in the app.config file.
  • The ValidatorTypeName is not provided, or is null, or is malformed.
  • The ValidatorTypeName is correctly formed, but the type in question cannot be located by Fusion.
  • The Type doesn’t have a default constructor. This is one of the other problems of Constrained Construction: it can’t be statically enforced because a constructor is not part of an abstraction’s API.
  • The created type doesn’t implement IOrderValidator.

I’m sure I even forgot a thing or two, but the above list is sufficient for me. None of these problems are caught by the compiler, so you don’t discover these issues until you run an integration test. So much for rapid feedback.
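A small sketch shows how three of these failures only surface at run-time. The types NeedsDependencyValidator and NotAValidator and the helper class are hypothetical, and the IOrderValidator stand-in takes an object instead of an Order to keep the sample self-contained. The Create method mirrors the last two lines of the Validator getter.

```csharp
using System;

// Stand-in for the post's interface (the real one validates an Order).
public interface IOrderValidator
{
    bool Validate(object order);
}

// Compiles fine, but has no default constructor.
public class NeedsDependencyValidator : IOrderValidator
{
    private readonly string dependency;

    public NeedsDependencyValidator(string dependency)
    {
        this.dependency = dependency;
    }

    public bool Validate(object order) { return this.dependency != null; }
}

// Compiles fine, but doesn't implement IOrderValidator.
public class NotAValidator
{
}

public static class ProviderFailures
{
    // Mirrors the last two lines of the Validator property getter.
    public static IOrderValidator Create(Type type)
    {
        var obj = Activator.CreateInstance(type);
        return (IOrderValidator)obj;
    }

    // Runs the action and returns the name of the exception type it
    // throws, or "none" if it completes.
    public static string Try(Func<object> action)
    {
        try
        {
            action();
            return "none";
        }
        catch (Exception e)
        {
            return e.GetType().Name;
        }
    }
}
```

With these definitions, Try(() => Type.GetType("No.Such.Validator", true)) reports a TypeLoadException, Try(() => Create(typeof(NeedsDependencyValidator))) a MissingMethodException, and Try(() => Create(typeof(NotAValidator))) an InvalidCastException. All three calls compile without so much as a warning.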

I don’t like APIs that lie about their complexity.

Hiding complexity does not make an API easier to use; it makes it harder.

An API that hides necessary complexity makes it impossible to discover problems at compile time. It simply creates more friction.

Lifetime management issues

A Provider exerts too much control over the instances it creates. This is a variation of the Control Freak anti-pattern (also from my book). In the current implementation the Validator property totally violates the Principle of least surprise since it returns a new instance every time you invoke the getter. I did this to keep the implementation simple (this is, after all, example code), but a more normal implementation would reuse the same instance every time.

However, reusing the same instance every time may be problematic in a multi-threaded context (such as a web application) because you’ll need to make sure that the implementation is thread-safe. Often, we’d much prefer to scope the lifetime of the Service to each HTTP request.

HTTP request scoping can be built into the Provider, but then it would only work in web applications. That’s not very flexible.

What’s even more problematic is that once we move away from the Singleton lifestyle (not to be confused with the Singleton design pattern) we may have a memory leak at hand, since the implementation may implement IDisposable. This can be solved by adding a Release method to each Provider, but now we are moving so far into DI Container territory that I find it far more reasonable to just use proper DI instead of trying to reinvent the wheel.

Furthermore, the fact that each Provider owns the lifetime of the Service it controls makes it impossible to share resources. What if the implementation we want to use implements several Role Interfaces each served up by a different Provider? We might want to use that common implementation to share or coordinate state across different Services, but that’s not possible because we can’t share an instance across multiple providers.

Even if we configure all Providers with the same concrete class, each will instantiate and serve its own separate instance.
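As a rough sketch of that problem, consider two hypothetical Role Interfaces (IReader and IWriter) implemented by a single hypothetical class. To keep the sample self-contained, the two Providers hard-code the type instead of reading it from configuration, but they create instances the same way the Validator property does.

```csharp
using System;

// Hypothetical Role Interfaces, made up for illustration.
public interface IReader { int Read(); }
public interface IWriter { void Write(int value); }

// One class that implements both roles and shares state between them.
public class InMemoryStore : IReader, IWriter
{
    private int value;

    public int Read() { return this.value; }
    public void Write(int value) { this.value = value; }
}

// Simplification: the type is hard-coded instead of configured in XML.
public static class ReaderProvider
{
    public static IReader Reader
    {
        get { return (IReader)Activator.CreateInstance(typeof(InMemoryStore)); }
    }
}

public static class WriterProvider
{
    public static IWriter Writer
    {
        get { return (IWriter)Activator.CreateInstance(typeof(InMemoryStore)); }
    }
}

public static class SharingDemo
{
    // The value is written to one instance and read from another.
    public static int ViaProviders()
    {
        WriterProvider.Writer.Write(42);
        return ReaderProvider.Reader.Read();
    }

    // With plain composition, both roles are served by the same instance.
    public static int ViaComposition()
    {
        var store = new InMemoryStore();
        IWriter writer = store;
        IReader reader = store;
        writer.Write(42);
        return reader.Read();
    }
}
```

ViaProviders returns 0 because the write went to a different instance, while ViaComposition returns 42. Caching the instance inside each Provider wouldn’t change that, since each Provider would still cache its own.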

Testability

The Control Freak also impacts testability. Since a Provider creates instances of interfaces based on XML configuration and Activator.CreateInstance, there’s no way to inject a dynamic mock.

It is possible to use hard-coded Test Doubles such as Stubs or Fakes because we can configure the XML with their type names, but even a Spy is problematic because we’ll rarely have an object reference to the Test Double.

In short, the Provider idiom is not a good approach to loose coupling. Although Microsoft uses it in some of their products, it only leads to problems, so there’s no reason to mimic it. Instead, use Constructor Injection to create loosely coupled components and wire them in the application’s Composition Root using the Register Resolve Release pattern.
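For comparison, here’s a sketch of the same OrderProcessor with Constructor Injection. It’s simplified: Process returns a bool instead of the SuccessResult used above, Order is an empty placeholder, and the Test Doubles at the bottom are hand-rolled and hypothetical.

```csharp
using System;
using System.Collections.Generic;

public class Order { }

public interface IOrderValidator { bool Validate(Order order); }
public interface IOrderCollector { void Collect(Order order); }
public interface IOrderShipper { void Ship(Order order); }

public class OrderProcessor
{
    private readonly IOrderValidator validator;
    private readonly IOrderCollector collector;
    private readonly IOrderShipper shipper;

    // All three dependencies are required, and the compiler enforces it.
    public OrderProcessor(
        IOrderValidator validator,
        IOrderCollector collector,
        IOrderShipper shipper)
    {
        if (validator == null) { throw new ArgumentNullException("validator"); }
        if (collector == null) { throw new ArgumentNullException("collector"); }
        if (shipper == null) { throw new ArgumentNullException("shipper"); }

        this.validator = validator;
        this.collector = collector;
        this.shipper = shipper;
    }

    // Simplification: returns bool rather than a SuccessResult.
    public bool Process(Order order)
    {
        bool isValid = this.validator.Validate(order);
        if (isValid)
        {
            this.collector.Collect(order);
            this.shipper.Ship(order);
        }

        return isValid;
    }
}

// Hand-rolled Test Doubles (hypothetical).
public class TrueOrderValidator : IOrderValidator
{
    public bool Validate(Order order) { return true; }
}

public class NullCollector : IOrderCollector
{
    public void Collect(Order order) { }
}

// A Spy works here because the test holds a reference to it.
public class SpyShipper : IOrderShipper
{
    public readonly List<Order> Shipped = new List<Order>();

    public void Ship(Order order) { this.Shipped.Add(order); }
}
```

A test can now create a SpyShipper, pass it to the OrderProcessor constructor together with the other two doubles, call Process, and inspect Shipped afterwards. No configuration file is involved, and a dynamic mock could be passed in exactly the same way.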

posted on Wednesday, April 27, 2011 2:14:52 PM (Romance Daylight Time, UTC+02:00)
# Tuesday, April 05, 2011

My latest MSDN Magazine article, this time about CQRS on Windows Azure, is now available at the April MSDN Magazine web site.

It’s mostly meant as an introduction to CQRS as well as containing some tips and tricks that are specific to applying CQRS on Windows Azure.

As an added bonus the code sample download contains lots of idiomatic unit tests written with AutoFixture’s xUnit.net extensions, so if you’d like to see the result of my TDD work with AutoFixture, there’s a complete code base to look at there.

posted on Tuesday, April 05, 2011 9:52:57 PM (Romance Daylight Time, UTC+02:00)
# Tuesday, March 22, 2011

A few months back I wrote a (somewhat theoretical) post on composable interfaces. A major point of that post was that a Role Interface with a single Command method (i.e. a method that returns no value) is a very versatile category of abstraction.

Some of my readers asked for examples, so in this post I will provide a few. Consider this interface that fits the above description:

public interface IMessageConsumer<T>
{
    void Consume(T message);
}

This is a very common type of interface you will tend to encounter a lot in distributed, message-based architectures such as CQRS or Udi Dahan’s view of SOA. Some people would call it a message subscriber instead…

In the rest of this post I will examine how we can create compositions out of the IMessageConsumer<T> interface using (in order of significance) Decorator, Null Object, Composite, and other well-known programming constructs.

Decorator

Can we create a meaningful Decorator around the IMessageConsumer<T> interface? Yes, that’s easy – I’ve earlier provided various detailed examples of Decorators, so I’m not going to repeat them here.

I’ve yet to come up with an example of an interface that prevents us from applying a Decorator, so it’s a valid falsifiable claim that we can always Decorate an interface. However, I have yet to prove that this is true, so for now we’ll have to call it a conjecture.

However, since it’s so easy to apply a Decorator to an interface, it’s not a particularly valuable trait when evaluating the composability of an interface.
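For readers who haven’t seen those earlier examples, a minimal sketch of a Decorator around IMessageConsumer<T> could look like this. The LoggingMessageConsumer<T> and RecordingConsumer<T> classes are hypothetical and not taken from the application discussed in this post. The log is a plain Action<string> so that the sample doesn’t depend on any logging library.

```csharp
using System;
using System.Collections.Generic;

public interface IMessageConsumer<T>
{
    void Consume(T message);
}

// Hypothetical Decorator: logs before and after delegating to the
// decorated consumer.
public class LoggingMessageConsumer<T> : IMessageConsumer<T>
{
    private readonly IMessageConsumer<T> inner;
    private readonly Action<string> log;

    public LoggingMessageConsumer(
        IMessageConsumer<T> inner, Action<string> log)
    {
        if (inner == null)
        {
            throw new ArgumentNullException("inner");
        }
        if (log == null)
        {
            throw new ArgumentNullException("log");
        }

        this.inner = inner;
        this.log = log;
    }

    public void Consume(T message)
    {
        this.log("Consuming " + typeof(T).Name);
        this.inner.Consume(message);
        this.log("Consumed " + typeof(T).Name);
    }
}

// Hypothetical consumer that only remembers what it received, so there's
// something to decorate.
public class RecordingConsumer<T> : IMessageConsumer<T>
{
    public readonly List<T> Received = new List<T>();

    public void Consume(T message)
    {
        this.Received.Add(message);
    }
}
```

Since the Decorator is itself an IMessageConsumer<T>, it can be passed to anything that expects a consumer, including another Decorator.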

Null Object

It can be difficult to implement the Null Object pattern when the method(s) in question return a value, but for Commands it’s easy:

public class NullMessageConsumer<T> : IMessageConsumer<T>
{
    public void Consume(T message)
    {
    }
}

The implementation simply ignores the input and does nothing.

Once again my lack of formal CS education prevents me from putting forth a formal proof, but I strongly suspect that it’s always possible to apply the Null Object pattern to a Command (keep in mind that out parameters count as output, so an interface with one or more of these is not a Command interface).

It’s often valuable to be able to use a Null Object, but the real benefit comes when we can compose various implementations together.

Composite

To be truly composable, an interface should make it possible to create various concrete implementations that each adhere to the Single Responsibility Principle and then compose those together in a complex implementation. A Composite is a general-purpose implementation of this concept, and it’s easy to create a Composite out of a Command:

public class CompositeMessageConsumer<T> : 
    IMessageConsumer<T>
{
    private readonly IEnumerable<IMessageConsumer<T>> 
        consumers;
 
    public CompositeMessageConsumer(
        params IMessageConsumer<T>[] consumers)
    {
        if (consumers == null)
        {
            throw new ArgumentNullException("consumers");
        }
 
        this.consumers = consumers;
    }
 
    public IEnumerable<IMessageConsumer<T>> Consumers
    {
        get { return this.consumers; }
    }
 
    #region IMessageConsumer<T> Members
 
    public void Consume(T message)
    {
        foreach (var consumer in this.Consumers)
        {
            consumer.Consume(message);
        }
    }
 
    #endregion
}

The implementation of the Consume method simply loops over each composed IMessageConsumer<T> and invokes its Consume method.

I can, for example, implement a sequence of actions that will take place by composing the individual concrete implementations. First we have a guard that protects against invalid messages, followed by a consumer that writes the message to a persistent store, completed by a consumer that raises a Domain Event that the reservation request was accepted.

var c = new CompositeMessageConsumer<MakeReservationCommand>(
    guard, 
    writer, 
    acceptRaiser);

Consumers following the Liskov Substitution Principle will not notice the difference, as all they will see is an implementation of IMessageConsumer<MakeReservationCommand>.

More advanced programming constructs

The Composite pattern only describes a single, general way to compose implementations, but with a Command interface we can do more. As Domain-Driven Design explains, a successful interface is often characterized by making it possible to apply well-known arithmetic or logical operators. As an example, in the case of the IMessageConsumer<T> interface, we can easily mimic the well-known ?: ternary operator from C#:

public class ConditionalMessageConsumer<T> : 
    IMessageConsumer<T>
{
    private readonly Func<T, bool> condition;
    private readonly IMessageConsumer<T> first;
    private readonly IMessageConsumer<T> second;
 
    public ConditionalMessageConsumer(
        Func<T, bool> condition, 
        IMessageConsumer<T> first, 
        IMessageConsumer<T> second)
    {
        if (condition == null)
        {
            throw new ArgumentNullException("condition");
        }
        if (first == null)
        {
            throw new ArgumentNullException("first");
        }
        if (second == null)
        {
            throw new ArgumentNullException("second");
        }
 
        this.condition = condition;
        this.first = first;
        this.second = second;
    }
 
    public void Consume(T message)
    {
        (this.condition(message) ? this.first : this.second)
            .Consume(message);
    }
}

This is more verbose than the ?: operator because C# doesn’t allow us to define operator overloads for interfaces, but apart from that, it does exactly the same thing. Notice particularly that the Consume method uses the ?: operator to select between the two alternatives, and then invokes the Consume method on the selected consumer.

We can use the ConditionalMessageConsumer to define branches in the consumption of messages. As an example, we can encapsulate the previous CompositeMessageConsumer<MakeReservationCommand> into a conditional branch like this:

var consumer = 
    new ConditionalMessageConsumer<MakeReservationCommand>(
        guard.HasCapacity, c, rejectRaiser);

Notice that I use the method group syntax to supply the condition delegate. If the HasCapacity method returns true, the previous composite (c) is being invoked, but if the result is false we instead use a consumer that raises the Domain Event that the reservation request was rejected.

Concluding thoughts

Apart from the direct purpose of providing examples of the immensely powerful composition options a Command interface provides I want to point out a couple of things:

  • Each of the design pattern implementations (Null Object, Composite – even Conditional) is generic. This is a strong testament to the power of this particular abstraction.
  • The dynamic mocks I’m familiar with (Moq, Rhino Mocks) will by default try to create Null Object implementations for interfaces without explicit setups. Since it’s trivial to implement a Null Command, they just emit them by default. If you use an AutoMocking Container with your unit tests, you can refactor to your heart’s content, adding and removing dependencies as long as they are Command interfaces, and your testing infrastructure will just take care of things for you. It’ll just work.
  • Did you notice that even with these few building blocks we have implemented a large part of a sequential workflow engine? We can execute consumers in sequence as well as branch between different sequences. Obviously, more building blocks are needed to make a full-blown workflow engine, but not that many. I’ll leave the rest as an exercise to the reader :)

As I originally sketched, a Command interface is the ultimate in composability. To illustrate, the application from where I took the above examples is a small application with 48 types (classes and interfaces) in the production code base (that is, excluding unit tests). Of these, 9 are implementations of the IMessageConsumer<T> interface. If we also count the interface itself, it accounts for more than 20 percent of the code base. According to the Reused Abstractions Principle (RAP) I consider this a very successful abstraction.

posted on Tuesday, March 22, 2011 2:09:25 PM (Romance Standard Time, UTC+01:00)
# Friday, March 04, 2011

The main principle behind the Register Resolve Release pattern is that loosely coupled object graphs should be composed as a single action in the entry point of the application (the Composition Root). For request-based applications (web sites and services), we use a variation where we compose once per request.

It seems to me that a lot of people are apprehensive when they first hear about this concept. It may sound reasonable from an architectural point of view, but isn’t it horribly inefficient? A well-known example of such a concern is Jeffrey Palermo’s blog post Constructor over-injection anti-pattern. Is it really a good idea to compose a complete object graph in one go? What if we don’t need part of the graph, or only need it later? Doesn’t it adversely affect response times?

Normally it doesn’t, and if it does, there are elegant ways to address the issue.

In the rest of this blog post I will expand on this topic. To keep the discussion as simple as possible, I’ll restrict my analysis to object trees instead of full graphs. This is quite a reasonable simplification as we should strive to avoid circular dependencies, but even in the case of full graphs the arguments and techniques put forward below hold.

Consider a simple tree composed of classes from three different assemblies:

Tree

All the A classes (blue) are defined in the A assembly, B classes (green) in the B assembly, and the C1 class (red) in the C assembly. In code we create the tree with Constructor Injection like this:

var t =
    new A1(
        new A2(
            new B1(
                new B2()),
            new A3()),
        new C1(
            new B3()));

Given the tree above, we can now address the most common concerns about composing object trees in one go.

Will it be slow?

Most likely not. Keep in mind that Injection Constructors should be very simple, so not a lot of work is going on during composition. Obviously just creating new object instances takes a bit of time in itself, but we create object instances all the time in .NET code, and it’s often a very fast operation.

Even when using DI Containers, which perform a lot of (necessary) extra work when creating objects, we can create tens of thousands of trees per second. Creation of objects simply isn’t that big a deal.

But still: what about assembly loading?

I glossed over an important point in the above argument. While object creation is fast, it sometimes takes a bit of time to load an assembly. The tree above uses classes from three different assemblies, so to create the tree all three assemblies must be loaded.

In many cases that’s a performance hit you’ll have to take because you need those classes anyway, but sometimes you might be concerned with taking this performance hit too early. However, I make the claim that in the vast majority of cases, this concern is irrelevant.

In this particular context there are two different types of applications: Request-based applications (web) and all the rest (desktop apps, daemons, batch-jobs, etc.).

Request-based applications

For request-based applications such as web sites and REST services, an object tree must be composed for each request. However, all requests are served by the same AppDomain, so once an assembly is loaded, it sticks around to be available for all subsequent requests. Thus, the first few requests will suffer a performance penalty from having to load all assemblies, but after that there will be no performance impact.

In short, in request-based applications, you can compose object trees with confidence. In only extremely rare cases should you have performance issues from composing the entire tree in one go.

Long-running applications

For long-running applications the entire object tree must be composed at start-up. For background services such as daemons and batch processes the start-up time probably doesn’t matter much, but for desktop applications it can be of great importance.

In some cases the application requires the entire tree to be immediately available, in which case there’s not a lot you can do. Still, once all assemblies have been loaded, actually creating the tree will be very fast.

In other cases an entire branch of the tree may not be immediately required. As an example, if the C1 node in the above graph isn’t needed right away, we could improve start-up time if we could somehow defer creating that branch, because this would also defer loading of the entire C assembly.

Deferred branches

Since object creation is fast, the only case where it makes sense to defer loading of a branch is when creation of that branch causes an assembly to be loaded. If we can defer creation of such a branch, we can also defer loading of the assembly, thus improving the time it takes to compose the initial tree.

Imagine that we wish to defer creation of the C1 branch of the above tree. It will prevent the C assembly from being loaded because that assembly is not used in any other place in the tree. However, it will not prevent the B assembly from being loaded, since that assembly is also being used by the A2 node.

Still, in those rare situations where it makes sense to defer creation of a branch, we can make that cut into a part of the infrastructure of the tree. I originally described this technique as a reaction to the above mentioned post by Jeffrey Palermo, but here’s a restatement in the current context.

We can defer creating the C1 node by wrapping it in a lazy implementation of the same interface. The C1 node implements an interface called ISolo<IMarker>, so we can wrap it in a Virtual Proxy that defers creation of C1 until it’s needed:

public class LazySoloMarker : ISolo<IMarker>
{
    private readonly Lazy<ISolo<IMarker>> lazy;
 
    public LazySoloMarker(Lazy<ISolo<IMarker>> lazy)
    {
        if (lazy == null)
        {
            throw new ArgumentNullException("lazy");
        }
 
        this.lazy = lazy;
    }
 
    #region ISolo<IMarker> Members
 
    public IMarker Item
    {
        get { return this.lazy.Value.Item; }
    }
 
    #endregion
}

This Virtual Proxy takes a Lazy<ISolo<IMarker>> as input and defers to it to implement the members of the interface. The wrapped C1 instance is only created when the Value property is first accessed – which may be long after the LazySoloMarker instance was created.

The tree can now be composed like this:

var t =
    new A1(
        new A2(
            new B1(
                new B2()),
            new A3()),
        new LazySoloMarker(
            new Lazy<ISolo<IMarker>>(() => new C1(
                new B3()))));

This retains all the behavior of the original tree, but defers creation of the C1 node until it’s needed for the first time.
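To make the deferral visible, here is a minimal sketch. The ISolo<T> and IMarker declarations are stand-ins I assume for illustration (the real ones aren’t shown in this post), and ExpensiveSolo is a hypothetical node that counts how often it gets constructed:

```csharp
using System;

// Stand-in declarations (assumptions): the post does not show the real ones.
public interface IMarker { }
public interface ISolo<T> { T Item { get; } }
public class Marker : IMarker { }

// Hypothetical node that records how many times it has been constructed.
public class ExpensiveSolo : ISolo<IMarker>
{
    public static int Created;
    private readonly IMarker item = new Marker();

    public ExpensiveSolo() { Created++; }

    public IMarker Item { get { return this.item; } }
}

// Same shape as the Virtual Proxy shown above.
public class LazySoloMarker : ISolo<IMarker>
{
    private readonly Lazy<ISolo<IMarker>> lazy;

    public LazySoloMarker(Lazy<ISolo<IMarker>> lazy)
    {
        if (lazy == null)
        {
            throw new ArgumentNullException("lazy");
        }

        this.lazy = lazy;
    }

    public IMarker Item { get { return this.lazy.Value.Item; } }
}

public static class LazyDemo
{
    // Returns the construction count before and after the first use of Item.
    public static Tuple<int, int> Run()
    {
        ExpensiveSolo.Created = 0;
        ISolo<IMarker> proxy = new LazySoloMarker(
            new Lazy<ISolo<IMarker>>(() => new ExpensiveSolo()));

        int before = ExpensiveSolo.Created; // proxy composed, node not created yet
        IMarker marker = proxy.Item;        // first access triggers creation
        int after = ExpensiveSolo.Created;

        return Tuple.Create(before, after);
    }
}
```

Composing the proxy doesn’t touch ExpensiveSolo at all; only the first read of Item runs the factory delegate.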

The bottom line is this: you can compose the entire object graph with confidence. It’s not going to be a performance bottleneck.

posted on Friday, March 04, 2011 12:15:10 PM (Romance Standard Time, UTC+01:00)
# Thursday, March 03, 2011

The Constructor Injection design pattern is an extremely useful way to implement loose coupling. It’s easy to understand and implement, but sometimes perhaps a bit misunderstood.

The pattern itself is easily described through an example:

private readonly ISpecimenBuilder builder;
 
public SpecimenContext(ISpecimenBuilder builder)
{
    if (builder == null)
    {
        throw new ArgumentNullException("builder");
    }
 
    this.builder = builder;
}

The SpecimenContext constructor statically declares that it requires an ISpecimenBuilder instance as an argument. To guarantee that the builder field is an invariant of the class, the constructor contains a Guard Clause before it assigns the builder parameter to the builder field. This pattern can be repeated for each constructor argument.

It’s important to understand that when using Constructor Injection the constructor should contain no additional logic.

An Injection Constructor should do no more than receive the dependencies.

This is simply a rephrasing of Nikola Malovic’s 4th law of IoC. There are several reasons for this rule of thumb:

  • When we compose applications with Constructor Injection we often create substantial object graphs, and we want to be able to create these graphs as efficiently as possible. This is Nikola’s original argument.
  • In the odd (and not recommended) cases where you have circular dependencies, the injected dependencies may not yet be fully initialized, so an attempt to invoke their members at that time may result in an exception. This issue is similar to the issue of invoking virtual members from the constructor. Conceptually, an injected dependency is equivalent to a virtual member.
  • With Constructor Injection, the constructor’s responsibility is to demand and receive the dependencies. Thus, according to the Single Responsibility Principle (SRP), it should not try to do something else as well. Some readers might argue that I’m misusing the SRP here, but I think I’m simply applying the underlying principle in a more granular context.
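As a small illustration of the rule, here is a sketch with hypothetical types (IMessageSource, CountingSource and Greeter are names I made up for this example). The constructor only guards and stores the dependency; the dependency isn’t invoked until a method is called:

```csharp
using System;

// Hypothetical dependency, for illustration only.
public interface IMessageSource
{
    string Read();
}

// Test-style implementation that counts how often it is invoked.
public class CountingSource : IMessageSource
{
    public int Reads;

    public string Read()
    {
        this.Reads++;
        return "hello";
    }
}

public class Greeter
{
    private readonly IMessageSource source;

    public Greeter(IMessageSource source)
    {
        if (source == null)
        {
            throw new ArgumentNullException("source");
        }

        // Receive the dependency and nothing else: no call to source.Read() here.
        this.source = source;
    }

    public string Greet()
    {
        // The dependency is first used when a member is invoked.
        return this.source.Read().ToUpperInvariant();
    }
}
```

Because the constructor does no work, creating a Greeter is cheap and can’t fail for any reason other than a missing dependency.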

There’s no reason to feel constrained by this rule, as in any case the constructor is an implementation detail. In loosely coupled code, the constructor is not part of the overall application API. When we consider the API at that level, we are still free to design the API as we’d like.

Please notice that this rule is contextual: it applies to Services that use Constructor Injection. Entities and Value Objects tend not to use DI, so their constructors are covered by other rules.

posted on Thursday, March 03, 2011 3:18:54 PM (Romance Standard Time, UTC+01:00)
# Monday, February 28, 2011

.NET developers should be familiar with the standard access modifiers (public, protected, internal, private). However, in loosely coupled code we can regard interface implementations as a fifth access modifier. This concept was originally introduced to me by Udi Dahan the only time I’ve had the pleasure of meeting him. That was many years ago and while I didn’t grok it back then, I’ve subsequently come to appreciate it quite a lot.

Although I can’t take credit for the idea, I’ve never seen it described, and it really deserves to be.

The basic idea is simple:

If a consumer respects the Liskov Substitution Principle (LSP), the only visible members are those belonging to the interface. Thus, the interface represents a dimension of visibility.

As an example, consider this simple interface from AutoFixture:

public interface ISpecimenContext
{
    object Resolve(object request);
}

A well-behaved consumer can only invoke the Resolve method even though an implementation may have additional public members:

public class SpecimenContext : ISpecimenContext
{
    private readonly ISpecimenBuilder builder;
 
    public SpecimenContext(ISpecimenBuilder builder)
    {
        if (builder == null)
        {
            throw new ArgumentNullException("builder");
        }
 
        this.builder = builder;
    }
 
    public ISpecimenBuilder Builder
    {
        get { return this.builder; }
    }
 
    #region ISpecimenContext Members
 
    public object Resolve(object request)
    {
        return this.Builder.Create(request, this);
    }
 
    #endregion
}

Even though the SpecimenContext class defines the Builder property, as well as a public constructor, any consumer respecting the LSP will only see the Resolve method.

In fact, the Builder property on the SpecimenContext class mostly exists to support unit testing because I sometimes need to assert that a given instance of SpecimenContext contains the expected ISpecimenBuilder. This doesn’t break encapsulation since the Builder is exposed as a read-only property, and it more importantly doesn’t pollute the API.

To support unit testing (and whichever other clients might be interested in the encapsulated ISpecimenBuilder) we have a public property that follows all framework design guidelines. However, it’s essentially an implementation detail, so it’s not visible via the ISpecimenContext interface.

When writing loosely coupled code, I’ve increasingly begun to see the interfaces as the real API. Most other (even public) members are pure implementation details. If the members are public, I still demand that they follow the framework design guidelines, but I don’t consider them parts of the API. It’s a very important distinction.

The interfaces define the bulk of an application’s API. Most other types and members are implementation details.

An important corollary is that constructors are implementation details too, since they can never be part of any interface.

In that sense we can regard interfaces as a fifth access modifier – perhaps even the most important one.

posted on Monday, February 28, 2011 2:19:04 PM (Romance Standard Time, UTC+01:00)
# Friday, February 04, 2011

During the last couple of weeks I’ve been very interested in using a Maybe monad with AutoFixture’s Kernel code, but although many examples can be found on the internet, they remain samples. Rinat Abdullin and Zack Owens both posted samples, but I particularly like Mike Hadlow’s series about Monads in C# because he also explains how to use LINQ with monads such as the Maybe monad.

As I really wanted a Maybe monad for AutoFixture, I first thought about simply implementing it directly in the AutoFixture source. However, I found it too arbitrary to put such a general purpose programming construct into a specific library such as AutoFixture. My next thought was to create a small open source project just for that single purpose, but then I thought about the problem a bit more…

The BCL sort of already has a Maybe monad – you just need to recognize it as such.

What is a Maybe monad really? If you really distill it, it’s just a type that either contains a value, or doesn’t contain a value. In other words, it’s a type that represents a particular range: a set with either zero or one items. That’s just a special case of a more general range or collection, and we already have LINQ covering those constructs.

Here it is: the Maybe monad from the BCL (encapsulated in a nice extension method):

public static class LightweightMaybe
{
    public static IEnumerable<T> Maybe<T>(this T value)
    {
        return new[] { value };
    }
}

Obviously, this method returns a Maybe with a value, but we can just as easily represent Nothing with an empty array.
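As a sketch of both cases side by side: the Nothing<T> helper and the Describe method below are hypothetical additions of mine, not part of the BCL or AutoFixture. Nothing is just an empty sequence, and DefaultIfEmpty supplies the fallback:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LightweightMaybeSketch
{
    // Same extension method as above: a sequence with exactly one item.
    public static IEnumerable<T> Maybe<T>(this T value)
    {
        return new[] { value };
    }

    // Hypothetical helper: Nothing is represented by an empty sequence.
    public static IEnumerable<T> Nothing<T>()
    {
        return Enumerable.Empty<T>();
    }

    // Projects the value if there is one; otherwise falls back to a default.
    public static string Describe(IEnumerable<int> maybe)
    {
        return maybe
            .Select(i => "value " + i)
            .DefaultIfEmpty("nothing")
            .Single();
    }
}
```

Describe never has to branch on whether a value is present; the standard LINQ operators take care of that.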

With my ‘new’ Maybe monad, I can now write code like this (where request is a System.Object instance):

return (from t in request.Maybe().OfType<Type>()
        let typeArguments = t.GetGenericArguments()
        where typeArguments.Length == 1
        && typeof(IList<>)
            == t.GetGenericTypeDefinition()
        select context.Resolve(typeof(List<>)
            .MakeGenericType(typeArguments)))
        .DefaultIfEmpty(new NoSpecimen(request))
        .SingleOrDefault();

You may think that this looks dense, but before that the code looked like this:

var type = request as Type;
if (type == null)
{
    return new NoSpecimen(request);
}
 
var typeArguments = type.GetGenericArguments();
if (typeArguments.Length != 1)
{
    return new NoSpecimen(request);
}
 
if (typeof(IList<>) != 
    type.GetGenericTypeDefinition())
{
    return new NoSpecimen(request);
}
 
return context.Resolve(typeof(List<>)
    .MakeGenericType(typeArguments));

Notice that in this more traditional approach involving Guard Clauses, I have to construct a new NoSpecimen object in three different places, thus violating the DRY principle. I like not having all those if/return blocks in the code.

posted on Friday, February 04, 2011 2:11:34 PM (Romance Standard Time, UTC+01:00)
# Monday, January 24, 2011

Recently I spent a couple of days with Thomas Jespersen who’s working towards a launch of spiir.dk – on Windows Azure. The reason I got to talk to him was to see if I could help with some performance issues he had with Azure Table Storage.

The scenario is really simple: the application needs to load all of a user’s bank transactions into memory to enable pretty advanced sorting and filtering. That sounds like a lot, but really isn’t more than approximately 200 kB of data retrieved through a single query – so: there are no 1+N problems in play here, but even so it originally took more than two seconds. That’s a bit long to wait before you can even start rendering a web page.

By tweaking his partitioning strategy and using parallel queries, Thomas managed to bring down the data retrieval time to approximately one second. Although stress testing indicated that this duration was very stable, even under load, it is still too slow. So we met to see what could be done.

Thomas had done a great job tweaking the query, so I couldn’t really suggest some sort of secret API that would make it run significantly faster. Basically, we have to deal with Azure storage being based on REST and that there are a lot of things about run-time behavior we cannot control. Apart from designing a proper partitioning strategy, we can’t add indexes to Azure Table Storage.

It was time to take a different approach.

As far as I can tell, Windows Azure is designed to be very scalable. However, just because scalability implies that you can handle an insane amount of work within acceptable time frames, it doesn’t mean that you can extrapolate it to mean that under a light load, everything will be lightning fast. That’s not the case at all.

Scalability means that performance characteristics remain stable from light to heavy load.

Consequently this means that if performance is adequate under heavy load, it will also be adequate under a light load. Azure Storage is first and foremost designed to be scalable, and as a second priority, as fast as possible.

As Thomas discovered, Azure Table Storage isn’t particularly fast.

It may be a masochistic side of me that I’m not otherwise aware of, but I actually appreciate that. It makes us reassess our most basic assumptions.

The data that Thomas needs to read isn’t particularly dynamic, so what if we take a snapshot of it? In short, we loaded all of a user’s data into memory and serialized it to Azure Blob Storage.

Loading the same data from a binary serialized Blob took only 1/6 of the time it did to load it from Table Storage.

As it turns out, Thomas doesn’t even need all the columns from the Table to populate the view, so we could even make the serialized Blob smaller yet.

At this point, however, we now have two representations of the same data: The original data in Table Storage, and a persistent cache in Blob Storage. The remaining challenge is to figure out how to keep these in sync.

This may seem like a hack, but it really represents a paradigm shift. Letting go of ACID opens up a lot of new opportunities.

Actually, I spent most of the next day trying to convince Thomas that CQRS would be the best approach, or that we could at least pick up some of the techniques from asynchronous, messaging based architectures, but that’s another story.

The moral here is that on Azure, things may be slower than you are used to, but storage is (relatively) cheap, so denormalization can save you a lot of execution time.

posted on Monday, January 24, 2011 1:03:16 PM (Romance Standard Time, UTC+01:00)
# Wednesday, December 22, 2010

I’ve been doing Test-Driven Development since 2003. I still do, I still love it, and I still expect to be doing it in the future. Over the years, I’ve repeatedly returned to the discussion of whether TDD should be regarded as Test-Driven Development or Test-Driven Design. For a long time I’ve been of the conviction that TDD is both of those. Not so any longer.

TDD is not a good design methodology.

Over the years I’ve written tons of code with TDD. I’ve written code where tests blindly drove the design, and I’ve written code where the design was the result of a long period of deliberation, and the tests were only the manifestations of already well-formed ideas.

I can safely say that the code where tests alone drove the design never turned out particularly well. Although it was testable and, after a fashion, ‘loosely coupled’, it was still Spaghetti Code in the sense that it lacked overall consistency and good abstractions.

On the other hand, I’m immensely pleased with code like AutoFixture 2.0, which was mostly the result of hours of careful contemplation riding my bike to and from work. It was still written test-first, but the design was well thought out in advance.

This made me think: did I just fail (repeatedly) at Test-Driven Design, or is the overall concept a fallacy?

That’s a pretty hard question to answer; what constitutes good design? In the following, let’s assume that the SOLID principles are a pretty good indicator of good design. If so, does test-first drive us towards SOLID design?

TDD versus the Single Responsibility Principle

Does TDD ensure the application of the Single Responsibility Principle (SRP)? This question is easy to answer and the answer is a resounding NO! Nothing prevents us from test-driving a God Class. I’ve seen many examples, and I’ve been guilty of it myself.

Constructor Injection is a much better help because it makes SRP violations so painful.

The score so far: 0 points to TDD.

TDD versus the Open/Closed Principle

Does TDD ensure that we follow the Open/Closed Principle (OCP)? This is a bit harder to answer. I’ve previously argued that Testability is just another name for OCP, so that would in itself imply that TDD drives OCP. However, the issue is more complex than that, because there are several different ways we can address the OCP:

  • Inheritance
  • Composition

According to Roy Osherove’s book The Art of Unit Testing, the Extract and Override technique is a common unit testing trick. Personally, I rarely use it, but if used it will indirectly drive us a bit towards OCP via inheritance.

However, we all know that we should favor composition over inheritance, so does TDD drive us in that direction? As I alluded to previously, TDD does tend to drive us towards the use of Test Doubles, which we can view as one way to achieve OCP via composition.

However, another favorite composition technique of mine is to add functionality with a Decorator. This is only possible if the original type implements an interface that can be decorated. It’s possible to write a test that forces a SUT to implement an interface, but TDD as a technique in itself does not drive us in that direction.
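To show why the interface is a precondition, here is a minimal Decorator sketch. IMessageWriter, ListWriter and PrefixingWriter are hypothetical names chosen for the example:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface: without it there would be nothing to decorate.
public interface IMessageWriter
{
    void Write(string message);
}

// The original type, here a simple in-memory sink.
public class ListWriter : IMessageWriter
{
    public readonly List<string> Messages = new List<string>();

    public void Write(string message)
    {
        this.Messages.Add(message);
    }
}

// Decorator: implements the same interface and wraps another implementation.
public class PrefixingWriter : IMessageWriter
{
    private readonly string prefix;
    private readonly IMessageWriter inner;

    public PrefixingWriter(string prefix, IMessageWriter inner)
    {
        if (prefix == null)
        {
            throw new ArgumentNullException("prefix");
        }
        if (inner == null)
        {
            throw new ArgumentNullException("inner");
        }

        this.prefix = prefix;
        this.inner = inner;
    }

    public void Write(string message)
    {
        // Add behavior, then delegate to the decorated instance.
        this.inner.Write(this.prefix + message);
    }
}
```

ListWriter stays closed for modification, while the new behavior is added from the outside. Nothing in the test-first cycle itself would have forced IMessageWriter into existence.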

Grudgingly, however, I must admit that TDD still scores half a point against OCP, for a total score so far of ½ point.

TDD versus the Liskov Substitution Principle

Does TDD drive us towards adhering to the Liskov Substitution Principle (LSP)? Perhaps, but probably not.

Black box testing can’t protect us against the SUT attempting to downcast its dependencies, but at least it doesn’t particularly pull us in that direction either. When it comes to the SUT’s treatment of a dependency, TDD pulls in neither direction.

Can we test-drive interface implementations that inadvertently violate the LSP? Yes, easily. As I discussed in a previous post, the use of Header Interfaces pulls us towards LSP violations. The more members an interface has, the more likely are LSP violations.

TDD can definitely drive us towards Header Interfaces (although they tend to hurt in the long run). I've seen this happen numerous times, and I’ve been there myself. TDD doesn’t properly encourage LSP adherence.

The score this round: 0 points for TDD, for a running total of ½ point.

TDD versus the Interface Segregation Principle

Does TDD drive us towards the Interface Segregation Principle (ISP)? No. It’s pretty easy to test-drive a SUT towards a Header Interface, just as we can test-drive towards a God Class.

Another 0 points for TDD. The score is still ½ point to TDD.

TDD versus the Dependency Inversion Principle

Does TDD drive us towards the Dependency Inversion Principle (DIP)? Yes, it does.

The whole drive towards Testability – the ability to replace dependencies with Test Doubles – drives us exactly in the same direction as the DIP.

Since we tend to mistake such mechanistic loose coupling with proper application design, this probably explains why we, for so long, have confused TDD with good design. However, although I view loose coupling as a prerequisite for good design, it is by no means enough.

For those that still keep score, TDD scores 1 point against DIP, for a total of 1½ points.

TDD does not ensure SOLID

With 1½ out of 5 possible points I have stated my case. I am convinced that TDD itself does not drive us towards SOLID design. It’s definitely possible to use test-first techniques to drive towards SOLID designs, but that will always be an extra effort that supplements TDD; it’s not something that is inherently built into TDD.

Obviously you could argue that SOLID in itself is not the end-all, be-all of proper API design. I would agree. However, based on my experience with TDD, I think the conclusion holds. TDD does not drive us towards good design. It is not a design technique.

I still write code test-first because I find it more productive, but I make design decisions out of band. I’m a Test-Driven Design Apostate.

posted on Wednesday, December 22, 2010 2:57:56 PM (Romance Standard Time, UTC+01:00)
# Saturday, December 18, 2010

As a comment to my previous post about interfaces being no guarantee for abstractions, Danny asks some interesting questions. In particular, his questions relate to Udi Dahan’s presentation Intentions & Interfaces: Making patterns concrete (also known as Making Roles Explicit). Danny writes:

it would seem that Udi recommends creating interfaces for each "role" the domain object plays and using a Service Locator to find the concrete implementation ... or in his case the concrete FetchingStrategy used to pull data back from his ORM. This sounds like his application would have many 1:1 abstractions.

Can this be true, or can we consolidate Role Interfaces with the Reused Abstractions Principle (RAP) – preferably without resorting to a Service Locator? Yes, of course we can.

In Udi Dahan’s talks, we see various examples where he queries a Service Locator for a Role Interface. If the Service Locator returns an instance he uses it; otherwise, he falls back to some sort of default behavior. Here is my interpretation of Udi Dahan’s slides:

public void Persist(Customer entity)
{
    var validator = this.serviceLocator
        .Get<IValidator<Customer>>();
    if (validator != null)
    {
        validator.Validate(entity);
    }
 
    // Save entity in actual store
}

This is actually not very pretty object-oriented code, but I suspect Udi Dahan chose this implementation to better communicate the essence of how to use Role Interfaces. However, a more proper implementation would have a default (or Null Object) implementation of the Role Interface, and then the special implementation.

If we assume that a NullValidator exists, we can require that the Service Locator can always serve up a proper instance of IValidator<Customer>. This enables us to simplify the Persist method to something like this:

public void Persist(Customer entity)
{
    var validator = this.serviceLocator
        .Get<IValidator<Customer>>();
    validator.Validate(entity);
 
    // Save entity in actual store
}

Either the Service Locator returns a specialized CustomerValidator, or it returns the NullValidator. In any case, this assumption enables us to leverage the Liskov Substitution Principle and refactor the conditional logic to polymorphism.

In other words: every single time we discover the need to extract a Role Interface, we should end up with at least two implementations: the Null Object and the Special Case. Thus the RAP is satisfied.
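The two implementations could look something like the following sketch. The shape of IValidator<T> (a single void Validate method) is my assumption based on how it’s called above, and the Customer class and the name rule are purely illustrative:

```csharp
using System;

// Illustrative entity; the real Customer class is not shown in this post.
public class Customer
{
    public string Name { get; set; }
}

// Assumed shape of the Role Interface, inferred from validator.Validate(entity).
public interface IValidator<T>
{
    void Validate(T entity);
}

// Null Object: accepts everything by doing nothing.
public class NullValidator<T> : IValidator<T>
{
    public void Validate(T entity)
    {
    }
}

// Special Case: a hypothetical rule that a customer must have a name.
public class CustomerValidator : IValidator<Customer>
{
    public void Validate(Customer entity)
    {
        if (entity == null)
        {
            throw new ArgumentNullException("entity");
        }
        if (string.IsNullOrEmpty(entity.Name))
        {
            throw new ArgumentException(
                "A customer must have a name.", "entity");
        }
    }
}
```

With both in place, the consumer never needs a null check: it always receives some IValidator<Customer> and simply calls Validate.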

As a last refactoring, we can also get rid of the Service Locator. Instead, we can use Constructor Injection to inject IValidator<Customer> directly into the Persistence class:

public class CustomerPersistence 
{
    private readonly IValidator<Customer> validator;
 
    public CustomerPersistence(IValidator<Customer> v)
    {
        if (v == null)
        {
            throw new ArgumentNullException("...");
        }
 
        this.validator = v;
    }
 
    public void Persist(Customer entity)
    {
        this.validator.Validate(entity);
 
        // Save entity in actual store
    }
}

Thus, the use of Role Interfaces in no way hinges on using a Service Locator, and everything is good again :)

posted on Saturday, December 18, 2010 3:21:17 PM (Romance Standard Time, UTC+01:00)
# Friday, December 03, 2010

In my previous post I discussed why the use of interfaces doesn’t guarantee that we work against good abstractions. In this post I will look at some guidelines that might be helpful in defining better abstractions.

One important trait of a useful abstraction is that we can create many different implementations of it. This is the Reused Abstractions Principle (RAP). This is particularly important because composition and separation of concerns often result in such reuse. Every time we use Null Objects, Decorators or Composites, we reuse the same abstraction to compose an application from separate classes that all adhere to the Single Responsibility Principle. For example, Decorators are an excellent way to implement Cross-Cutting Concerns.

The RAP gives us a way to identify good abstractions after the fact, but doesn’t say much about the traits that make up a good, composable interface.

On the other hand, I find that the composability of an interface is a pretty good indicator of its potential for reuse. While we can create Decorators from just about any interface, creating meaningful Null Objects or Composites is much harder. As we previously saw, bad abstractions often prevent us from implementing a meaningful Composite.

Being able to implement a meaningful Composite is a good indication of a sound interface.

This understanding is equivalent to the realization associated with the concept of Closure of Operations from Domain-Driven Design. As soon as we achieve this, a lot of very intuitive, almost arithmetic-like APIs tend to follow. It becomes much easier to compose various instances of the abstraction.

With Composite as an indicator of good abstractions, here are some guidelines that should enable us to define more useful interfaces.

ISP

The more members an interface has, the more difficult it is to create a Composite of it. Thus, the Interface Segregation Principle is a good guide, as it points us towards small interfaces. By extrapolation, the best interface would be an interface with a single member.

That’s a good start, but even such an interface could be problematic if it’s a Leaky Abstraction or a Shallow Interface. Still, let us assume that we aim for such Role Interfaces and move on to see what other guidelines are available to us.

Commands

Commands, and by extension any interface that consists of all void methods, are eminently composable. To implement a Null Object, just ignore the input and do nothing. To implement a Composite, just pass on the input to each contained instance.

A Command is the epitome of the Hollywood Principle because telling is the only thing you can do. There’s no way to ask a Command about anything when the method returns void. Commands also guarantee the Law of Demeter, because there’s no way you can ‘dot’ across a void :)

If a Command takes one or more input parameters, they must all stay clear of Shallow Interfaces and Leaky Abstractions. If these conditions are satisfied, a Command tends to be a very good abstraction. However, sometimes we just need return values.
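Here is a sketch of how little it takes. ICommand<T> is a hypothetical interface declared only for this example (it is not the WPF ICommand), and RecordingCommand<T> exists only to make the effect observable:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Command interface: a single void method.
public interface ICommand<T>
{
    void Execute(T input);
}

// Null Object: ignore the input and do nothing.
public class NullCommand<T> : ICommand<T>
{
    public void Execute(T input)
    {
    }
}

// Composite: pass the input on to each contained instance.
public class CompositeCommand<T> : ICommand<T>
{
    private readonly ICommand<T>[] commands;

    public CompositeCommand(params ICommand<T>[] commands)
    {
        if (commands == null)
        {
            throw new ArgumentNullException("commands");
        }

        this.commands = commands;
    }

    public void Execute(T input)
    {
        foreach (var command in this.commands)
        {
            command.Execute(input);
        }
    }
}

// Leaf used for illustration: remembers what it was told.
public class RecordingCommand<T> : ICommand<T>
{
    public readonly List<T> Received = new List<T>();

    public void Execute(T input)
    {
        this.Received.Add(input);
    }
}
```

Since Execute returns void, the Composite has no results to reconcile, which is exactly why this shape composes so easily.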

Closure of Operations

We already briefly discussed Closure of Operations. In C# we can describe this concept as any method that fits this signature in some way:

T DoIt(T x);

An interface that returns the same type as the input type(s) exhibits Closure of Operations. There may be more than one input parameter as long as they are all of the same type.

The interesting thing about Closure of Operations is that any interface with that quality is easily implemented as a Null Object (just return the input). A sort of Composite is often also possible because we can pass the input to each instance in the Composite and use some sort of aggregation or selection algorithm to return a result.

Even if the return type doesn’t easily lend itself towards aggregation, you can often implement a coalescing behavior with a Composite by returning the first non-null instance returned by the contained instances.
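A sketch of both ideas follows. ITransform<T> is a hypothetical interface with the T DoIt(T x) signature from above, and DelegateTransform<T> is only a convenience for building leaves in the example:

```csharp
using System;
using System.Linq;

// Hypothetical interface exhibiting Closure of Operations.
public interface ITransform<T>
{
    T DoIt(T x);
}

// Null Object: just return the input.
public class NullTransform<T> : ITransform<T>
{
    public T DoIt(T x)
    {
        return x;
    }
}

// Convenience leaf for the example: wraps a delegate.
public class DelegateTransform<T> : ITransform<T>
{
    private readonly Func<T, T> function;

    public DelegateTransform(Func<T, T> function)
    {
        if (function == null)
        {
            throw new ArgumentNullException("function");
        }

        this.function = function;
    }

    public T DoIt(T x)
    {
        return this.function(x);
    }
}

// Coalescing Composite: returns the first non-null result, or null if none.
public class CoalescingTransform<T> : ITransform<T> where T : class
{
    private readonly ITransform<T>[] transforms;

    public CoalescingTransform(params ITransform<T>[] transforms)
    {
        if (transforms == null)
        {
            throw new ArgumentNullException("transforms");
        }

        this.transforms = transforms;
    }

    public T DoIt(T x)
    {
        // Select is lazy, so later instances run only if earlier ones returned null.
        return this.transforms
            .Select(t => t.DoIt(x))
            .FirstOrDefault(result => result != null);
    }
}
```

Coalescing is only one possible selection strategy; for types with a natural aggregation (sums, unions and so on) the Composite could combine all the results instead.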

Interfaces that exhibit Closure of Operations tend to be good abstractions, but it’s not always possible to design APIs like that.

Reduction of Input

Sometimes we can keep some of the benefits from Closure of Operations even though a pure model isn’t possible. Any method that returns a type that is a subset of the input types also tends to be composable.

One variation is something like this:

T1 DoIt(T1 x, T2 y, T3 z);

In this sort of interface, the return type is the same as the first parameter. When creating Null Objects or Composites, we can generally just do as we did with pure Closure of Operations and ignore the other parameters.

Another variation is a method like this:

T1 DoIt(Foo<T1, T2, T3> foo);

where Foo is defined like this:

public class Foo<T1, T2, T3>
{
    public T1 X { get; set; }
    public T2 Y { get; set; }
    public T3 Z { get; set; }
}

In this case we can still reduce the input to create the output by simply selecting and returning foo.X and ignoring the other properties.

Still, we may not always be able to define APIs such as these.

Composable return types

Sometimes (perhaps even most of the time) we can’t mold our APIs into any of the above shapes because we inherently need to map one type into another type:

T2 Map(T1 x);

To keep such a method composable, we must then make sure that the output type itself is composable. This would allow us to implement a Composite by wrapping each return value from the contained instances into a Composite of the return type.

Likewise, we could create a Null Object by returning another Null Object for the return type.

In theory, we could repeat this design process to create a big chain of composable types, as long as the last type terminates the chain by fitting into one of the above shapes. However, this can quickly become unwieldy, so we should go to great efforts to make those chains as short as possible.

It should be noted that every type that implements IEnumerable fits pretty well into this category. A Null Object is simply an empty sequence, and a Composite is simply a sequence with multiple items. Thus, interfaces that return enumerables tend to be good abstractions.
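As an illustration, consider a hypothetical ISuggester interface that maps a string to a sequence of strings (all the names below are made up for this sketch):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical interface that returns an enumerable.
public interface ISuggester
{
    IEnumerable<string> Suggest(string prefix);
}

// Null Object: an empty sequence.
public class NullSuggester : ISuggester
{
    public IEnumerable<string> Suggest(string prefix)
    {
        return Enumerable.Empty<string>();
    }
}

// Leaf for the example: suggests from a fixed list of words.
public class FixedSuggester : ISuggester
{
    private readonly string[] words;

    public FixedSuggester(params string[] words)
    {
        if (words == null)
        {
            throw new ArgumentNullException("words");
        }

        this.words = words;
    }

    public IEnumerable<string> Suggest(string prefix)
    {
        return this.words.Where(
            w => w.StartsWith(prefix, StringComparison.Ordinal));
    }
}

// Composite: concatenates the sequences returned by the contained instances.
public class CompositeSuggester : ISuggester
{
    private readonly ISuggester[] suggesters;

    public CompositeSuggester(params ISuggester[] suggesters)
    {
        if (suggesters == null)
        {
            throw new ArgumentNullException("suggesters");
        }

        this.suggesters = suggesters;
    }

    public IEnumerable<string> Suggest(string prefix)
    {
        return this.suggesters.SelectMany(s => s.Suggest(prefix));
    }
}
```

The return type does the heavy lifting here: because sequences can be empty or concatenated, both the Null Object and the Composite fall out almost for free.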

Conclusion

There are many well-known variations of good interface design. The above guiding principles look only at a small, interrelated set. In fact, we can regard both Commands and Closure of Operations as degenerate cases of Reduction of Input. We should strive to create interfaces that directly fit into one of these categories, and when that isn’t possible, at least interfaces that return types that fit into those categories.

Keeping interfaces small and focused makes this possible in the first place.

posted on Friday, December 03, 2010 2:19:48 PM (Romance Standard Time, UTC+01:00)
# Thursday, December 02, 2010

One of the first sound bites from the beloved book Design Patterns is this:

Program to an interface, not an implementation

It would seem that a corollary is that we can measure the quality of our code on the number of interfaces; the more, the better. However, that’s not how it feels in reality when you are trying to figure out whether to use an IFooFactory, IFooPolicy, IFooPolicyFactory or perhaps even an IFooFactoryFactory.

Do you extract interfaces from your classes to enable loose coupling? If so, you probably have a 1:1 relationship between your interfaces and the concrete classes that implement them. That’s probably not a good sign, and violates the Reused Abstractions Principle (RAP). I’ve been guilty of this and didn’t like the result.

Having only one implementation of a given interface is a code smell.

Programming to an interface does not guarantee that we are coding against an abstraction. Interfaces are not abstractions. Why not?

An interface is just a language construct. In essence, it’s just a shape. It’s like a power plug and socket. In Europe we use one kind, and the US uses another, but it’s only by convention that we transmit 230V through European sockets and 110V through US sockets. Although plugs only fit in their respective sockets, nothing prevents us from sending 230V through a US plug/socket combination.

Krzysztof Cwalina already pointed this out in 2004: interfaces are not contracts. If they aren’t even contracts, then how can they be abstractions?

Interfaces can be used as abstractions, but using an interface is in itself no guarantee that we are dealing with an abstraction. Rather, we have the following relationship between interfaces and abstractions:

Abstractions, interfaces and their intersection

There are basically two sets: a set of abstractions and a set of interfaces. In the following we will discuss the set of interfaces that does not intersect the set of abstractions, saving the intersection for another blog post.

There are many ways an interface can turn out to be a poor abstraction. The following is an incomplete list:

LSP Violations

Violating the Liskov Substitution Principle is a pretty obvious sign that the interface in use is a poor abstraction. This may be most obvious when the consumer of the interface needs to downcast an instance to properly work with it.

However, as Uncle Bob points out, even an interface as simple as this seemingly innocuous rectangle ‘abstraction’ contains potential dangers:

public interface IRectangle
{
    int Width { get; set; }
    int Height { get; set; }
}

The issue becomes apparent when you attempt to let a Square class implement IRectangle. To protect the invariants of Square, you can’t allow the Width and Height properties to differ. You have a couple of options, none of which are very good:

  • Update both Width and Height to the same value when one of them is written.
  • Ignore the write operation when the caller attempts to assign an invalid value.
  • Throw an exception when the caller attempts to assign a Width which is different from the Height (and vice versa).

From the point of view of a consumer of the IRectangle interface, all of these options would at the very least violate the Principle of Least Astonishment, and throwing exceptions would definitely cause the consumer to behave differently when consuming Square instances as opposed to ‘normal’ rectangles.
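To make the surprise concrete, here is a minimal sketch of the first option. The Rectangle and Square classes and the HeightSurvivesWidthChange consumer are hypothetical names introduced only for illustration; they are not taken from Uncle Bob’s example.

```csharp
using System;

public interface IRectangle
{
    int Width { get; set; }
    int Height { get; set; }
}

// A 'normal' rectangle: Width and Height vary independently.
public class Rectangle : IRectangle
{
    public int Width { get; set; }
    public int Height { get; set; }
}

// Hypothetical Square that protects its invariant by updating
// both sides whenever one of them is written.
public class Square : IRectangle
{
    private int side;

    public int Width
    {
        get { return this.side; }
        set { this.side = value; }
    }

    public int Height
    {
        get { return this.side; }
        set { this.side = value; }
    }
}

public static class Consumer
{
    // A consumer that reasonably expects that writing Width
    // leaves Height untouched. Returns whether that held.
    public static bool HeightSurvivesWidthChange(
        IRectangle rectangle, int newWidth)
    {
        var heightBefore = rectangle.Height;
        rectangle.Width = newWidth;
        return rectangle.Height == heightBefore;
    }
}
```

The expectation holds for a Rectangle but not for a Square, even though both satisfy the compiler as IRectangle implementations.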

The problem stems from the fact that the operations have side effects. Invoking one operation changes the state of a seemingly unrelated piece of data. The more members we have, the greater the risk is, so the Interface Segregation Principle can, to a certain extent, help.

Header Interfaces

Since a higher number of members increases the risk of unexpected side effects and temporal coupling, it should come as no surprise that interfaces mechanically extracted from all members of a concrete class are poor abstractions.

As always, Visual Studio makes it very easy to do the wrong thing by offering the Extract Interface refactoring feature.

We call such interfaces Header Interfaces because they resemble C++ header files. They tend to simply state the same thing twice without apparent benefit. This is particularly true when you have only a single implementation, which tends to be very likely for interfaces with many members.

Shallow Interfaces

When you use the Extract Interface refactoring feature in Visual Studio, even if you don’t extract every member, the resulting interface is shallow because it doesn’t recursively extract interfaces from the concrete types exposed by the extracted members.

An example I’ve seen more than once involves extracting an interface from a LINQ to SQL or LINQ to Entities context in order to define a Repository interface. As an example, here’s an interface extracted from a very simple LINQ to Entities context:

public interface IPostingContext
{
    void AddToPostings(Posting posting);
    ObjectSet<Posting> Postings { get; }
}

At first glance this may look useful, but it isn’t. Even though it’s an interface, it’s still tightly coupled to a specific object context. Not only does ObjectSet<T> reference the Entity Framework, but the Posting class is defined by a very specific, auto-generated Entity context.

The interface may give you the impression of working against loosely coupled code, but you can’t easily (if at all) implement a different IPostingContext with a radically different data access technology. You’ll be stuck with this particular PostingContext.

If you must extract an interface, you’ll need to do it recursively.

Leaky Abstractions

Another way we can create problems for ourselves is when our interfaces leak implementation details. A good example can be found in the SystemWrapper project that provides extracted interfaces for various BCL types, such as System.IO.FileInfo. Those interfaces may enable mocking, but we shouldn’t expect to ever be able to create another implementation of SystemWrapper.IO.IFileInfoWrap. In other words, those interfaces aren’t very useful.

Another example is this attempt at defining a Repository interface:

public interface IFooRepository
{
    string ConnectionString { get; set; }
    // ...
}

Exposing a ConnectionString property strongly indicates that the repository is implemented on top of a database; this knowledge leaks through. If we wanted to implement the repository based on a web service, we might be able to repurpose the ConnectionString property to a service URL, but it would be a hack at best – and how would we define security settings in that scenario?

Exposing a FileName property on an interface that represents an abstract resource is another example of a Leaky Abstraction.

Leaky Abstractions like these are often difficult to reuse. As an example, it would be difficult to implement a Composite out of the above IFooRepository – how do you aggregate a ConnectionString?
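By contrast, a Repository interface that exposes only behavior is easy to aggregate. The following is a sketch under assumptions: the Foo class, the GetAll member, InMemoryFooRepository and CompositeFooRepository are all hypothetical, since the real members of IFooRepository are elided above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public string Name { get; set; }
}

// Hypothetical variation of IFooRepository without the
// ConnectionString property.
public interface IFooRepository
{
    IEnumerable<Foo> GetAll();
}

// Hypothetical implementation with no database behind it.
public class InMemoryFooRepository : IFooRepository
{
    private readonly Foo[] foos;

    public InMemoryFooRepository(params Foo[] foos)
    {
        this.foos = foos;
    }

    public IEnumerable<Foo> GetAll()
    {
        return this.foos;
    }
}

// Composite: aggregation is straightforward because there is
// no implementation detail that has to be merged.
public class CompositeFooRepository : IFooRepository
{
    private readonly IFooRepository[] repositories;

    public CompositeFooRepository(
        params IFooRepository[] repositories)
    {
        if (repositories == null)
        {
            throw new ArgumentNullException("repositories");
        }

        this.repositories = repositories;
    }

    public IEnumerable<Foo> GetAll()
    {
        return this.repositories.SelectMany(r => r.GetAll());
    }
}
```

With this shape, a Composite simply concatenates the results of its children; there is no question of how to combine two connection strings.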

Conclusion

In short, using interfaces in no way guarantees that we operate with appropriate abstractions. Thus, the proliferation of interfaces that typically follows from TDD or use of DI may not be the pure goodness we tend to believe.

Creating good abstractions is difficult and requires skill. In a future post, I’ll look at some principles that we can use as guides.

posted on Thursday, December 02, 2010 2:03:04 PM (Romance Standard Time, UTC+01:00)
# Monday, November 01, 2010

Garth Kidd was kind enough to point out to me that I needn’t have stopped where I did in my previous post, and he is, of course, correct. Taking a dependency on an Abstract Factory that doesn’t take any contextual information (i.e. has no method parameters) is often an indication of a Leaky Abstraction. It indicates that the consumer has knowledge about the dependency’s lifetime that it shouldn’t have.

We can remove this flaw by introducing a Decorator of the IRepository<T> interface. Something like this should suffice:

public class FoundRepository<T> : IRepository<T>
{
    private readonly IRepository<T> repository;
 
    public FoundRepository(IRepositoryFinder<T> finder)
    {
        if (finder == null)
        {
            throw new ArgumentNullException("finder");
        }
 
        this.repository = finder.FindRepository();
    }
 
    /* Implement IRepository<T> by delegating to
     * this.repository */
}

This means that we can change the implementation of MyServiceOperation to this:

public void MyServiceOperation(
    IRepository<Customer> repository)
{
    // ...
}

This is much better, but it requires a couple of notes.

First of all, we should keep in mind that since FoundRepository creates and saves an instance of IRepository<T> right away, we should control the lifetime of FoundRepository. In essence, the lifetime should be tied to the specific service operation. Two concurrent invocations of MyServiceOperation should each receive separate instances of FoundRepository.

Many DI containers support Factory methods, so it may not even be necessary to implement FoundRepository explicitly. Rather, it would be possible to register IRepository<T> so that an instance is always created by invoking IRepositoryFinder<T>.FindRepository().
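The lifetime requirement can be sketched with Poor Man’s DI. Everything below is an assumption made so the example can run: the Add member on IRepository<T>, the in-memory stand-ins and the CompositionRoot class are hypothetical, because the real IRepository<T> members are not shown in this post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, minimal shapes for illustration.
public interface IRepository<T>
{
    void Add(T item);
}

public interface IRepositoryFinder<T>
{
    IRepository<T> FindRepository();
}

public class Customer { }

// The Decorator from above, completed against the assumed
// Add member.
public class FoundRepository<T> : IRepository<T>
{
    private readonly IRepository<T> repository;

    public FoundRepository(IRepositoryFinder<T> finder)
    {
        if (finder == null)
        {
            throw new ArgumentNullException("finder");
        }

        this.repository = finder.FindRepository();
    }

    public void Add(T item)
    {
        this.repository.Add(item);
    }
}

// Hypothetical in-memory stand-ins.
public class InMemoryRepository<T> : IRepository<T>
{
    public readonly List<T> Items = new List<T>();

    public void Add(T item)
    {
        this.Items.Add(item);
    }
}

public class InMemoryRepositoryFinder<T> : IRepositoryFinder<T>
{
    public readonly List<InMemoryRepository<T>> Found =
        new List<InMemoryRepository<T>>();

    public IRepository<T> FindRepository()
    {
        var repository = new InMemoryRepository<T>();
        this.Found.Add(repository);
        return repository;
    }
}

public class MyService
{
    public void MyServiceOperation(
        IRepository<Customer> repository)
    {
        repository.Add(new Customer());
    }
}

public static class CompositionRoot
{
    // Invoked once per service operation, so each invocation
    // receives its own FoundRepository instance.
    public static void Invoke(
        MyService service,
        IRepositoryFinder<Customer> finder)
    {
        service.MyServiceOperation(
            new FoundRepository<Customer>(finder));
    }
}
```

A container that supports factory methods would express the body of Invoke as a registration with a per-operation lifetime instead; the exact syntax depends on the container.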

posted on Monday, November 01, 2010 10:19:06 PM (Romance Standard Time, UTC+01:00)

One of the readers of my book recently asked me an interesting question that relates to the disadvantages of the Service Locator anti-pattern. I found both the question and the potential solution so interesting that I would like to share it.

In short, the reader’s organization currently uses Service Locator in its code, but doesn’t really see a way out of it. This post demonstrates how we can refactor from Service Locator to Abstract Factory. Here’s the original question:

“We have been writing a WCF middle tier using DI”

“Our application talks to multiple databases.  There is one Global database which contains Enterprise records, and each Enterprise has the connection string of a corresponding Enterprise database.”

“The trick is when we want to write a service which connects to an Enterprise database.  The context for which enterprise we are dealing with is not available until one of the service methods is called, so what we do is this:”

public void MyServiceOperation(
    EnterpriseContext context)
{
    /* Get a Customer repository operating
     * in the given enterprise’s context
     * (database) */
    var customerRepository =
        context.FindRepository<Customer>(
            context.EnterpriseId);
    // ...
}

“I’m not sure how, in this case, we can turn what we’ve got into a more pure DI system, since we have the dependency on the EnterpriseContext passed in to each service method.  We are mocking and testing just fine, and seem reasonably well decoupled.  Any ideas?”

When we look at the FindRepository method we quickly find that it’s a Service Locator. There are many problems with Service Locator, but the general issue is that the generic argument can be one of an unbounded set of types.

The problem is that seen from the outside, the consuming type (MyService in the example) doesn’t advertise its dependencies. In the example the dependency is a CustomerRepository, but you could later go into the implementation of MyServiceOperation and change the call to context.FindRepository<Qux>(context.EnterpriseId) and everything would still compile. However, at run-time, you’d likely get an exception.

It would be much safer to use an Abstract Factory, but how do we get there from here, and will it be better?

Let’s see how we can do that. First, we’ll have to make some assumptions on how EnterpriseContext works. In the following, I’ll assume that it looks like this – warning: it’s ugly, but that’s the point, so don’t give up reading just yet:

public class EnterpriseContext
{
    private readonly int enterpriseId;
    private readonly IDictionary<int, string>
        connectionStrings;

    public EnterpriseContext(int enterpriseId)
    {
        this.enterpriseId = enterpriseId;

        this.connectionStrings =
            new Dictionary<int, string>();
        this.connectionStrings[1] = "Foo";
        this.connectionStrings[2] = "Bar";
        this.connectionStrings[3] = "Baz";
    }

    public virtual int EnterpriseId
    {
        get { return this.enterpriseId; }
    }

    public virtual IRepository<T> FindRepository<T>(
        int enterpriseId)
    {
        if (typeof(T) == typeof(Customer))
        {
            return (IRepository<T>)this
                .FindCustomerRepository(enterpriseId);
        }
        if (typeof(T) == typeof(Campaign))
        {
            return (IRepository<T>)this
                .FindCampaignRepository(enterpriseId);
        }
        if (typeof(T) == typeof(Product))
        {
            return (IRepository<T>)this
                .FindProductRepository(enterpriseId);
        }

        throw new InvalidOperationException("...");
    }

    private IRepository<Campaign>
        FindCampaignRepository(int enterpriseId)
    {
        var cs = this.connectionStrings[enterpriseId];
        return new CampaignRepository(cs);
    }

    private IRepository<Customer>
        FindCustomerRepository(int enterpriseId)
    {
        var cs = this.connectionStrings[enterpriseId];
        return new CustomerRepository(cs);
    }

    private IRepository<Product>
        FindProductRepository(int enterpriseId)
    {
        var cs = this.connectionStrings[enterpriseId];
        return new ProductRepository(cs);
    }
}

That’s pretty horrible, but that’s exactly the point. Every time we need to add a new type of repository, we’ll need to modify this class, so it’s one big violation of the Open/Closed Principle.

I didn’t implement EnterpriseContext with a DI Container on purpose. Yes: using a DI Container would make it appear less ugly, but it would only hide the design issue – not address it. I chose the above implementation to demonstrate just how ugly this sort of design really is.

So, let’s start refactoring.

Step 1

We change each of the private finder methods to public methods.

In this example, there are only three methods, but I realize that in a real system there might be many more. However, we’ll end up with only a single interface and its implementation, so don’t despair just yet. It’ll turn out just fine.

As a single example the FindCustomerRepository method is shown here:

public IRepository<Customer>
    FindCustomerRepository(int enterpriseId)
{
    var cs = this.connectionStrings[enterpriseId];
    return new CustomerRepository(cs);
}

For each of the methods we extract an interface, like this:

public interface ICustomerRepositoryFinder
{
    int EnterpriseId { get; }

    IRepository<Customer> FindCustomerRepository(
        int enterpriseId);
}

We also include the EnterpriseId property because we’ll need it soon. This is just an intermediary artifact which is not going to survive until the end.

This is very reminiscent of the steps described by Udi Dahan in his excellent talk Intentions & Interfaces: Making patterns concrete. We make the roles of finding repositories explicit.

This leaves us with three distinct interfaces that EnterpriseContext can implement:

public class EnterpriseContext :
    ICampaignRepositoryFinder,
    ICustomerRepositoryFinder,
    IProductRepositoryFinder

Until now, we haven’t touched the service.

Step 2

We can now change the implementation of MyServiceOperation to explicitly require only the role that it needs:

public void MyServiceOperation(
    ICustomerRepositoryFinder finder)
{
    var customerRepository =
        finder.FindCustomerRepository(
            finder.EnterpriseId);
}

Since we now consume only the strongly typed role interfaces, we can delete the original FindRepository<T> method from EnterpriseContext.

Step 3

At this point, we’re actually already done, since ICustomerRepositoryFinder is an Abstract Factory, but we can make the API even better. When we consider the implementation of MyServiceOperation, it should quickly become clear that there’s a sort of local Feature Envy in play. Why do we need to access finder.EnterpriseId to invoke finder.FindCustomerRepository? Shouldn’t it rather be the finder’s own responsibility to figure that out for us?

Instead, let us change the implementation so that the method does not need the enterpriseId parameter:

public IRepository<Customer> FindCustomerRepository()
{
    var cs =
        this.connectionStrings[this.EnterpriseId];
    return new CustomerRepository(cs);
}

Notice that the EnterpriseId can be accessed just as well from the implementation of the method itself. This change requires us to also change the interface:

public interface ICustomerRepositoryFinder
{
    IRepository<Customer> FindCustomerRepository();
}

Notice that we removed the EnterpriseId property, as well as the enterpriseId parameter. The fact that there’s an enterprise ID in play is now an implementation detail.

MyServiceOperation now looks like this:

public void MyServiceOperation(
    ICustomerRepositoryFinder finder)
{
    var customerRepository =
        finder.FindCustomerRepository();
}

This takes care of the Feature Envy smell, but still leaves us with a lot of very similar-looking interfaces: ICampaignRepositoryFinder, ICustomerRepositoryFinder and IProductRepositoryFinder.

Step 4

We can collapse all the very similar interfaces into a single generic interface:

public interface IRepositoryFinder<T>
{
    IRepository<T> FindRepository();
}

With that, MyServiceOperation now becomes:

public void MyServiceOperation(
    IRepositoryFinder<Customer> finder)
{
    var customerRepository =
        finder.FindRepository();
}

Now that we only have a single generic interface (which is still an Abstract Factory), we can seriously consider getting rid of all the very similar-looking implementations in EnterpriseContext and instead just create a single generic class. We now have a more explicit API that better communicates intent.
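Such a single generic class could look something like the following sketch. RepositoryFinder<T> and its createRepository delegate are assumptions of mine, not part of the reader’s code; the delegate stands in for ‘new CustomerRepository(cs)’ and its siblings, and IRepository<T> is left empty because its members aren’t shown here.

```csharp
using System;
using System.Collections.Generic;

// Assumed to be empty for the purpose of this sketch.
public interface IRepository<T> { }

public interface IRepositoryFinder<T>
{
    IRepository<T> FindRepository();
}

// Hypothetical generic replacement for the three
// Find*Repository methods in EnterpriseContext.
public class RepositoryFinder<T> : IRepositoryFinder<T>
{
    private readonly int enterpriseId;
    private readonly IDictionary<int, string> connectionStrings;
    private readonly Func<string, IRepository<T>> createRepository;

    public RepositoryFinder(
        int enterpriseId,
        IDictionary<int, string> connectionStrings,
        Func<string, IRepository<T>> createRepository)
    {
        if (connectionStrings == null)
        {
            throw new ArgumentNullException("connectionStrings");
        }
        if (createRepository == null)
        {
            throw new ArgumentNullException("createRepository");
        }

        this.enterpriseId = enterpriseId;
        this.connectionStrings = connectionStrings;
        this.createRepository = createRepository;
    }

    public IRepository<T> FindRepository()
    {
        // The enterprise ID remains an implementation detail.
        var cs = this.connectionStrings[this.enterpriseId];
        return this.createRepository(cs);
    }
}

public class Customer { }

// Stand-in for the CustomerRepository used in the post.
public class CustomerRepository : IRepository<Customer>
{
    private readonly string connectionString;

    public CustomerRepository(string connectionString)
    {
        this.connectionString = connectionString;
    }

    public string ConnectionString
    {
        get { return this.connectionString; }
    }
}
```

Adding a new type of repository then means composing one more RepositoryFinder<T> with a different delegate, rather than modifying an existing class.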

How is this better? What if a method needs both an IRepository<Customer> and an IRepository<Product>? We’ll now have to pass two parameters instead of one.

Yes, but that’s good because it explicitly calls to your attention exactly which collaborators are involved. With the original Service Locator, you might not notice the responsibility creep as you over time request more and more repositories from the EnterpriseContext. With Abstract Factories in play, violations of the Single Responsibility Principle (SRP) become much more obvious.

Refactoring from Service Locator to Abstract Factories makes it more painful to violate the SRP.

You can always make roles explicit to get rid of Service Locators. This is likely to result in a more explicit design where doing the right thing feels more natural than doing the wrong thing.

posted on Monday, November 01, 2010 8:43:24 PM (Romance Standard Time, UTC+01:00)

It’s easy to confuse the Abstract Factory pattern with the Service Locator anti-pattern – particularly so when generics or contextual information are involved. However, it’s really easy to distinguish between these two, and here’s how!

Here are both (anti-)patterns in condensed form opposite each other:

Abstract Factory:

public interface IFactory<T>
{
    T Create(object context);
}

Service Locator:

public interface IServiceLocator
{
    T Create<T>(object context);
}

For these examples I chose to demonstrate both as generic interfaces that take some kind of contextual information (context) as input.

In this example the context can be any object, but we could also have considered a more strongly typed context parameter. Other variations include more than one method parameter, or, in the degenerate case, no parameters at all.

Both interfaces have a simple Create method that returns the generic type T, so it’s easy to confuse the two. However, even for generic types, it’s easy to tell one from the other:

An Abstract Factory is a generic type, and the return type of the Create method is determined by the type of the factory itself. In other words, a constructed type can only return instances of a single type.

A Service Locator, on the other hand, is a non-generic interface with a generic method. The Create method of a single Service Locator can return instances of an infinite number of types.

Even simpler:

An Abstract Factory is a generic type with a non-generic Create method; a Service Locator is a non-generic type with a generic Create method.

The name of the method, the number of parameters, and other circumstances may vary. The types may not be generic, or may be base classes instead of interfaces, but at the heart of it, the question is whether you can ask for an arbitrary type from the service, or only a single, static type.
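The practical consequence can be sketched as follows. StringFactory and DictionaryServiceLocator are hypothetical implementations written only to illustrate the difference; they aren’t taken from any particular library.

```csharp
using System;
using System.Collections.Generic;

public interface IFactory<T>
{
    T Create(object context);
}

public interface IServiceLocator
{
    T Create<T>(object context);
}

// Hypothetical Abstract Factory: the constructed type
// IFactory<string> can only ever return strings.
public class StringFactory : IFactory<string>
{
    public string Create(object context)
    {
        return "Created for " + context;
    }
}

// Hypothetical Service Locator: callers can ask for any type,
// and the compiler can't tell whether the request will succeed.
public class DictionaryServiceLocator : IServiceLocator
{
    private readonly IDictionary<Type, Func<object, object>> creators;

    public DictionaryServiceLocator(
        IDictionary<Type, Func<object, object>> creators)
    {
        if (creators == null)
        {
            throw new ArgumentNullException("creators");
        }

        this.creators = creators;
    }

    public T Create<T>(object context)
    {
        Func<object, object> creator;
        if (!this.creators.TryGetValue(typeof(T), out creator))
        {
            // This failure only surfaces at run-time.
            throw new InvalidOperationException(
                "No registration for " + typeof(T).Name);
        }

        return (T)creator(context);
    }
}
```

A request such as locator.Create<Guid>(null) compiles regardless of what has been registered, whereas a consumer of IFactory<string> has no way to ask for anything but a string.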

posted on Monday, November 01, 2010 1:31:53 PM (Romance Standard Time, UTC+01:00)
# Monday, September 20, 2010

One of my readers recently asked me an interesting question. It relates to my book’s chapter about Interception (chapter 9) and Decorators and how they can be used for instrumentation-like purposes.

In an earlier blog post we saw how we can use Decorators to implement Cross-Cutting Concerns, but the question relates to how a set of Decorators can be used to log additional information about code execution, such as the time before and after a method is called, the name of the method and so on.

A Decorator can excellently address such a concern as well, as we will see here. Let us first define an IRegistrar interface and create an implementation like this:

public class ConsoleRegistrar : IRegistrar
{
    public void Register(Guid id, string text)
    {
        var now = DateTimeOffset.Now;
        Console.WriteLine("{0}\t{1:s}.{2:000}\t{3}",
            id, now, now.Millisecond, text);
    }
}

Although this implementation ‘logs’ to the Console, I’m sure you can imagine other implementations. The point is that given this interface, we can add all sorts of ambient information such as the thread ID, the name of the current principal, the current culture and whatnot, while the text string variable still gives us an option to log more information. If we want a more detailed API, we can just make it more detailed – after all, the IRegistrar interface is just an example.
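For reference, the IRegistrar interface can be inferred from ConsoleRegistrar. The second type below, InMemoryRegistrar, is a hypothetical alternative of my own that shows how an implementation might add ambient information (here the managed thread ID) without changing the interface.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Inferred from the ConsoleRegistrar implementation.
public interface IRegistrar
{
    void Register(Guid id, string text);
}

// Hypothetical alternative that keeps entries in memory and
// adds the managed thread ID as ambient information.
// Note: this sketch is not thread-safe.
public class InMemoryRegistrar : IRegistrar
{
    private readonly List<string> entries = new List<string>();

    public IEnumerable<string> Entries
    {
        get { return this.entries; }
    }

    public void Register(Guid id, string text)
    {
        var entry = string.Format("{0}\t{1}\t{2}",
            id,
            Thread.CurrentThread.ManagedThreadId,
            text);
        this.entries.Add(entry);
    }
}
```

Consumers only see Register(Guid, string); what each implementation adds around the text is its own business.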

We now know how to register events, but are seemingly no nearer to instrumenting an application. How do we do that? Let us see how we can instrument the OrderProcessor class that I have described several times in past posts.

At the place I left off, the OrderProcessor class uses Constructor Injection all the way down. Although I would normally prefer using a DI Container to auto-wire it, here’s a manual composition using Poor Man’s DI just to remind you of the general structure of the class and its dependencies:

var sut = new OrderProcessor(
    new OrderValidator(), 
    new OrderShipper(),
    new OrderCollector(
        new AccountsReceivable(),
        new RateExchange(),
        new UserContext()));

All the dependencies injected into the OrderProcessor instance implement interfaces on which OrderProcessor relies. This means that we can decorate each concrete dependency with an implementation that instruments it.

Here’s an example that instruments the IOrderProcessor interface itself:

public class InstrumentedOrderProcessor : IOrderProcessor
{
    private readonly IOrderProcessor orderProcessor;
    private readonly IRegistrar registrar;
 
    public InstrumentedOrderProcessor(
        IOrderProcessor processor,
        IRegistrar registrar)
    {
        if (processor == null)
        {
            throw new ArgumentNullException("processor");
        }
        if (registrar == null)
        {
            throw new ArgumentNullException("registrar");
        }
 
        this.orderProcessor = processor;
        this.registrar = registrar;
    }
 
    #region IOrderProcessor Members
 
    public SuccessResult Process(Order order)
    {
        var correlationId = Guid.NewGuid();
        this.registrar.Register(correlationId,
            string.Format("Process begins ({0})",
                this.orderProcessor.GetType().Name));
 
        var result = this.orderProcessor.Process(order);
 
        this.registrar.Register(correlationId,
            string.Format("Process ends   ({0})", 
            this.orderProcessor.GetType().Name));
 
        return result;
    }
 
    #endregion
}

That looks like quite a mouthful, but it’s really quite simple – the cyclomatic complexity of the Process method is as low as it can be: 1. We really just register the Process method call before and after invoking the decorated IOrderProcessor.

Without changing anything else than the composition itself, we can now instrument the IOrderProcessor interface:

var registrar = new ConsoleRegistrar();
var sut = new InstrumentedOrderProcessor(
    new OrderProcessor(
        new OrderValidator(),
        new OrderShipper(),
        new OrderCollector(
            new AccountsReceivable(),
            new RateExchange(),
            new UserContext())),
    registrar);

However, imagine implementing an InstrumentedXyz for every IXyz and composing the application with them. It’s possible, but it’s going to get old really fast – not to mention that it massively violates the DRY principle.

Fortunately we can solve this issue with any DI Container that supports dynamic interception. Castle Windsor does, so let’s see how that could work.

Instead of implementing the same code ‘template’ over and over again to instrument an interface, we can do it once and for all with an interceptor. Imagine that we delete the InstrumentedOrderProcessor; instead, we create this:

public class InstrumentingInterceptor : IInterceptor
{
    private readonly IRegistrar registrar;
 
    public InstrumentingInterceptor(IRegistrar registrar)
    {
        if (registrar == null)
        {
            throw new ArgumentNullException("registrar");
        }
 
        this.registrar = registrar;
    }
 
    #region IInterceptor Members
 
    public void Intercept(IInvocation invocation)
    {
        var correlationId = Guid.NewGuid();
        this.registrar.Register(correlationId, 
            string.Format("{0} begins ({1})", 
                invocation.Method.Name,
                invocation.TargetType.Name));
 
        invocation.Proceed();
 
        this.registrar.Register(correlationId,
            string.Format("{0} ends   ({1})", 
                invocation.Method.Name, 
                invocation.TargetType.Name));
    }
 
    #endregion
}

If you compare this to the Process method of InstrumentedOrderProcessor (that we don’t need anymore), you should be able to see that they are very similar. In this version, we just use the invocation argument to retrieve information about the decorated method.

We can now add InstrumentingInterceptor to a WindsorContainer and enable it for all appropriate components. When we do that and invoke the Process method on the resolved IOrderProcessor, we get a result like this:

  bbb9724e-0fad-4b06-9bb0-b8c1c460cded    2010-09-20T21:01:16.744    Process begins (OrderProcessor)
  43349d42-a463-463b-8ddf-e569e3170c97    2010-09-20T21:01:16.745    Validate begins (TrueOrderValidator)
  43349d42-a463-463b-8ddf-e569e3170c97    2010-09-20T21:01:16.745    Validate ends   (TrueOrderValidator)
  44fdccc8-f12d-4057-ae03-791225686504    2010-09-20T21:01:16.746    Collect begins (OrderCollector)
  8bbb1a0c-6134-4652-a4af-cd8c0c7184a0    2010-09-20T21:01:16.746    GetCurrentUser begins (UserContext)
  8bbb1a0c-6134-4652-a4af-cd8c0c7184a0    2010-09-20T21:01:16.747    GetCurrentUser ends   (UserContext)
  d54359ff-8c32-487f-8728-b19ff0bf4942    2010-09-20T21:01:16.747    GetCurrentUser begins (UserContext)
  d54359ff-8c32-487f-8728-b19ff0bf4942    2010-09-20T21:01:16.747    GetCurrentUser ends   (UserContext)
  c54c4506-23a8-4553-ba9a-066fc64252d2    2010-09-20T21:01:16.748    GetSelectedCurrency begins (UserContext)
  c54c4506-23a8-4553-ba9a-066fc64252d2    2010-09-20T21:01:16.748    GetSelectedCurrency ends   (UserContext)
  b3dba76b-6b4e-44fa-aca5-52b2d8509db3    2010-09-20T21:01:16.750    Convert begins (RateExchange)
  b3dba76b-6b4e-44fa-aca5-52b2d8509db3    2010-09-20T21:01:16.751    Convert ends   (RateExchange)
  e07765bd-fe07-4486-96f1-f74d77241343    2010-09-20T21:01:16.751    Collect begins (AccountsReceivable)
  e07765bd-fe07-4486-96f1-f74d77241343    2010-09-20T21:01:16.752    Collect ends   (AccountsReceivable)
  44fdccc8-f12d-4057-ae03-791225686504    2010-09-20T21:01:16.752    Collect ends   (OrderCollector)
  231055d3-4ebb-425d-8d69-fb9c85d9a860    2010-09-20T21:01:16.752    Ship begins (OrderShipper)
  231055d3-4ebb-425d-8d69-fb9c85d9a860    2010-09-20T21:01:16.753    Ship ends   (OrderShipper)
  bbb9724e-0fad-4b06-9bb0-b8c1c460cded    2010-09-20T21:01:16.753    Process ends   (OrderProcessor)

Notice how we can easily see where and when method calls begin and end using the descriptive text as well as the correlation id. I will leave it as an exercise for the reader to come up with an API that provides better parsing options etc.

As a final note it’s worth pointing out that this way of instrumenting an application (or part of it) can be done following the Open/Closed Principle. I never changed the original implementation of any of the components.

posted on Monday, September 20, 2010 9:18:21 PM (Romance Daylight Time, UTC+02:00)
# Monday, August 30, 2010

There still seems to be some confusion about what is Dependency Injection (DI) and what is a DI Container, so in this post I will try to sort it out as explicitly as possible.

DI is a set of principles and patterns that enable loose coupling.

That’s it; nothing else. Remember that old quote from p. 18 of Design Patterns?

Program to an interface, not an implementation.

This is the concern that DI addresses. The most useful DI pattern is Constructor Injection where we inject dependencies into consumers via their constructors. No container is required to do this.

The easiest way to build a DI-friendly application is to just use Constructor Injection all the way. Conversely, an application does not automatically become loosely coupled when we use a DI Container. Every time application code queries a container we have an instance of the Service Locator anti-pattern. The corollary leads to this variation of the Hollywood Principle:

Don’t call the container; it’ll call you.

A DI Container is a fantastic tool. It’s like a (motorized) mixer: you can whip cream by hand, but it’s easier with a mixer. On the other hand, without the cream the mixer is nothing. The same is true for a DI Container: to really be valuable, your code must employ Constructor Injection so that the container can auto-wire dependencies.

A well-designed application adheres to the Hollywood Principle for DI Containers: it doesn’t call the container. On the other hand, we can use the container to compose the application – or we can do it the hard way; this is called Poor Man’s DI. Here’s an example that uses Poor Man’s DI to compose a complete application graph in a console application:

private static void Main(string[] args)
{
    var msgWriter = new ConsoleMessageWriter();
    new CoalescingParserSelector(
        new IParser[]
        {
            new HelpParser(msgWriter),
            new WineInformationParser(
                new SqlWineRepository(),
                msgWriter)
        })
        .Parse(args)
        .CreateCommand()
        .Execute();
}

Notice how the nested structure of all the dependencies gives you an almost visual idea about the graph. What we have here is Constructor Injection all the way in.

CoalescingParserSelector’s constructor takes an IEnumerable<IParser> as input. Both HelpParser and WineInformationParser require an IMessageWriter, and WineInformationParser also requires an IWineRepository. We even pull in types from different assemblies because SqlWineRepository is defined in the SQL Server-based data access assembly.

Another thing to notice is that the msgWriter variable is shared between two consumers. This is what a DI Container normally addresses with its ability to manage component lifetime. Although there’s not a DI Container in sight, we could certainly benefit from one. Let’s try to wire up the same graph using Unity (just for kicks):

private static void Main(string[] args)
{
    var container = new UnityContainer();
    container.RegisterType<IParser, WineInformationParser>("parser.info");
    container.RegisterType<IParser, HelpParser>("parser.help");
    container.RegisterType<IEnumerable<IParser>, IParser[]>();
 
    container.RegisterType<IParseService, CoalescingParserSelector>();
 
    container.RegisterType<IWineRepository, SqlWineRepository>();
    container.RegisterType<IMessageWriter, ConsoleMessageWriter>(
        new ContainerControlledLifetimeManager());
 
    container.Resolve<IParseService>()
        .Parse(args)
        .CreateCommand()
        .Execute();
    container.Dispose();
}

We are using Constructor Injection throughout, and most DI Containers (even Unity, but not MEF) natively understand that pattern. Consequently, we can mostly just map interfaces to concrete types and the container will figure out the rest for us.

Notice that I’m using the Configure-Resolve-Release pattern described by Krzysztof Koźmic. First I configure the container, then I resolve the entire object graph, and lastly I dispose the container.

The main part of the application’s execution time will be spent within the Execute method, which is where all the real application code runs.

In this example I wire up a console application, but it might just as well be any other type of application. In a web application we just do a resolve per web request instead.

But wait! Does that mean that we have to resolve the entire object graph of the application up front, even if we have dependencies that can only be resolved later, at run-time? No, but that does not mean that you should pull from the container. Pull from an Abstract Factory instead.
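As a rough sketch of what pulling from an Abstract Factory could look like, assume a repository that needs a connection string which is only known at run-time. IWineRepositoryFactory, SqlWineRepositoryFactory, WineReport and the Describe member are hypothetical names invented for this illustration, and the IWineRepository and SqlWineRepository shown here are simplified stand-ins, not the types from the sample application.

```csharp
using System;

// Simplified stand-in for the real interface (Describe is hypothetical).
public interface IWineRepository
{
    string Describe();
}

// Hypothetical Abstract Factory: the consumer asks it for a repository
// once the run-time value is known.
public interface IWineRepositoryFactory
{
    IWineRepository Create(string connectionString);
}

public class SqlWineRepository : IWineRepository
{
    private readonly string connectionString;

    public SqlWineRepository(string connectionString)
    {
        if (connectionString == null)
        {
            throw new ArgumentNullException("connectionString");
        }
        this.connectionString = connectionString;
    }

    public string Describe()
    {
        return "SQL repository for " + this.connectionString;
    }
}

public class SqlWineRepositoryFactory : IWineRepositoryFactory
{
    public IWineRepository Create(string connectionString)
    {
        return new SqlWineRepository(connectionString);
    }
}

// Hypothetical consumer: the factory is injected through the constructor,
// so the class never references a container.
public class WineReport
{
    private readonly IWineRepositoryFactory factory;

    public WineReport(IWineRepositoryFactory factory)
    {
        if (factory == null)
        {
            throw new ArgumentNullException("factory");
        }
        this.factory = factory;
    }

    public string Run(string connectionString)
    {
        return this.factory.Create(connectionString).Describe();
    }
}
```

The factory is just another dependency, so a container can auto-wire it like everything else, while the run-time value stays a plain method argument.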

Another question that is likely to arise is: what if I have dependencies that I rarely use? Must I wire these prematurely, even if they are expensive? No, you don’t have to do that either.

In conclusion: there is never any reason to query the container. Use a container to compose your object graph, but don’t rely on it by querying from it. Constructor Injection all the way enables most containers to auto-wire your application, and an Abstract Factory can be a dependency too.

posted on Monday, August 30, 2010 10:06:58 PM (Romance Daylight Time, UTC+02:00)
# Monday, July 12, 2010

Occasionally I get a question about whether it is reasonable or advisable to let domain objects implement IDataErrorInfo. In summary, my answer is that it’s not so much a question about whether it’s a leaky abstraction or not, but rather whether it makes sense at all. To me, it doesn’t.

Let us first consider the essence of the concept underlying IDataErrorInfo: It provides information about the validity of an object. More specifically, it provides error information when an object is in an invalid state.

This is really the crux of the matter. Domain Objects should be designed so that they cannot be put into invalid states. They should guarantee their invariants.

Let us return to the good old DanishPhoneNumber example. Instead of accepting or representing a Danish phone number as a string or integer, we model it as a Value Object that encapsulates the appropriate domain logic.

More specifically, the class’ constructor guarantees that you can’t create an invalid instance:

private readonly int number;
 
public DanishPhoneNumber(int number)
{
    if ((number < 112) ||
        (number > 99999999))
    {
        throw new ArgumentOutOfRangeException("number");
    }
    this.number = number;
}

Notice that the Guard Clause guarantees that you can’t create an instance with an invalid number, and the readonly keyword guarantees that you can’t change the value afterwards. Immutable types make it easier to protect a type’s invariants, but it is also possible with mutable types – you just need to place proper Guards in public setters and other mutators, as well as in the constructor.
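To illustrate the mutable case, here is a minimal sketch. MutablePhoneNumber is a hypothetical class made up for this example; it reuses the same range check, but places it in the setter and routes the constructor through that setter so there is only one Guard to maintain.

```csharp
using System;

// Hypothetical mutable counterpart to DanishPhoneNumber.
public class MutablePhoneNumber
{
    private int number;

    public MutablePhoneNumber(int number)
    {
        // Go through the property so the constructor is guarded too.
        this.Number = number;
    }

    public int Number
    {
        get { return this.number; }
        set
        {
            if ((value < 112) ||
                (value > 99999999))
            {
                throw new ArgumentOutOfRangeException("value");
            }
            this.number = value;
        }
    }
}
```

An attempt to assign an out-of-range value throws before the field is touched, so the instance keeps its previous, valid state.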

In any case, whenever a Domain Object guarantees its invariants according to the correct domain logic it makes no sense for it to implement IDataErrorInfo; if it did, the implementation would be trivial, because there would never be an error to report.

Does this mean that IDataErrorInfo is a redundant interface? Not at all, but it is important to realize that it’s an Application Boundary concern instead of a Domain concern. At Application Boundaries, data entry errors will happen, and we must be able to cope with them appropriately; we don’t want the application to crash by passing unvalidated data to DanishPhoneNumber’s constructor.

Does this mean that we should duplicate domain logic at the Application Boundary? That should not be necessary. First, we can apply a simple refactoring to the DanishPhoneNumber constructor:

public DanishPhoneNumber(int number)
{
    if (!DanishPhoneNumber.IsValid(number))
    {
        throw new ArgumentOutOfRangeException("number");
    }
    this.number = number;
}
 
public static bool IsValid(int number)
{
    return (112 <= number)
        && (number <= 99999999);
}

We now have a public IsValid method we can use to implement IDataErrorInfo at the Application Boundary. A next step might be to add a TryParse method.
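One possible shape of such a boundary object is sketched below. The TryParse method and the PhoneNumberFormModel class are assumptions made for this illustration (neither is defined in the post); only the constructor and IsValid come from the code above. IDataErrorInfo is the interface from System.ComponentModel, with its Error property and string indexer.

```csharp
using System;
using System.ComponentModel;

public class DanishPhoneNumber
{
    private readonly int number;

    public DanishPhoneNumber(int number)
    {
        if (!DanishPhoneNumber.IsValid(number))
        {
            throw new ArgumentOutOfRangeException("number");
        }
        this.number = number;
    }

    public static bool IsValid(int number)
    {
        return (112 <= number)
            && (number <= 99999999);
    }

    // Hypothetical TryParse: never throws on bad input.
    public static bool TryParse(string candidate,
        out DanishPhoneNumber result)
    {
        int parsed;
        if (int.TryParse(candidate, out parsed)
            && DanishPhoneNumber.IsValid(parsed))
        {
            result = new DanishPhoneNumber(parsed);
            return true;
        }
        result = null;
        return false;
    }
}

// Hypothetical form model living at the Application Boundary.
// It holds raw user input and reports errors; it is not a Domain Object.
public class PhoneNumberFormModel : IDataErrorInfo
{
    public string PhoneNumber { get; set; }

    public string Error
    {
        get { return this["PhoneNumber"]; }
    }

    public string this[string columnName]
    {
        get
        {
            if (columnName != "PhoneNumber")
            {
                return string.Empty;
            }
            DanishPhoneNumber ignored;
            return DanishPhoneNumber.TryParse(
                    this.PhoneNumber, out ignored)
                ? string.Empty
                : "Please enter a valid Danish phone number.";
        }
    }
}
```

The form model can hold invalid input for as long as the user needs, while the validation rule itself still lives in one place on the Domain Object.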

IDataErrorInfo implementations are often related to input forms in user interfaces. Instead of crashing the application or closing the form, we want to provide appropriate error messages to the user. We can use the Domain Object to provide validation logic, but the concern is completely different: we want the form to stay open until valid data has been entered. Not until all data is valid do we allow the creation of a Domain Object from that data.

In short, if you feel tempted to add IDataErrorInfo to a Domain Class, consider whether you aren’t about to violate the Single Responsibility Principle. In my opinion, this is the case, and you would be better off reconsidering the design.

posted on Monday, July 12, 2010 2:58:16 PM (Romance Daylight Time, UTC+02:00)
# Wednesday, April 07, 2010

It seems to me that I’ve lately encountered a particular mindset towards Dependency Injection (DI). People seem to think that it’s only really good for replacing one data access implementation with another. Once you get to that point, you know that the following argument isn’t far behind:

“That’s all well and good, but we know for certain that we will never exchange [insert name of RDBMS here] with anything else in this application.”

Apart from the hubris of making such a bold statement about the future of any software endeavor, such a statement reveals the narrow view on DI that its only purpose is for replacing data access components – and perhaps for unit testing.

Those are relevant reasons for using DI, but they are only some of the reasons. Let’s briefly revisit why we employ DI.

We use DI to enable loose coupling.

DI is only a means to an end. Even if you never intend to replace your database and even if you never want to write a single unit test, DI still offers benefits in the form of a more maintainable code base. The loose coupling gives you better separation of concerns because it allows you to apply the Open/Closed Principle.

Example coming right up:

Imagine that we need to implement a PrécisViewModel class with a TopSellers property that returns an IEnumerable<string>. To implement this class, we have a data access component. Let’s use the ubiquitous Repository pattern and define IProductRepository to see where that leads us:

public interface IProductRepository
{
    IEnumerable<Product> SelectTopSellers();
}

We can now implement PrécisViewModel like this:

public class PrécisViewModel
{
    private readonly IProductRepository repository;
 
    public PrécisViewModel(IProductRepository repository)
    {
        if (repository == null)
        {
            throw new ArgumentNullException("repository");
        }
 
        this.repository = repository;
    }
 
    public IEnumerable<string> TopSellers
    {
        get
        {
            var topSellers = 
                this.repository.SelectTopSellers();
            return from p in topSellers
                   select p.Name;
        }
    }
}

Nothing fancy is going on here. It’s just straight Constructor Injection at work.

Obviously, we can now implement and use a SQL Server-based repository:

var repository = new SqlProductRepository();
var vm = new PrécisViewModel(repository);

So what does all this loose coupling buy us? It doesn’t seem to help us a lot.

The real benefit is not yet apparent, but it should become more obvious when we start adding requirements. Let’s start with some caching. It turns out that the SelectTopSellers implementation is slow, so we would like to add some caching somewhere.

Where should we add this caching functionality? Without loose coupling, we would more or less be constrained to adding it to either PrécisViewModel or SqlProductRepository, but both have issues:

  • First of all we would be violating the Single Responsibility Principle (SRP) in both cases.
  • If we implement caching in PrécisViewModel, other consumers of SelectTopSellers would not benefit from it.
  • If we implement caching in SqlProductRepository, it wouldn’t be available for any other IProductRepository implementations.

Since the premise for this post is that we will never use any other database than SQL Server, implementing caching directly in SqlProductRepository sounds like the correct choice, but we would still be violating the SRP, and thus making our code more difficult to maintain.

A better solution is to introduce a caching Decorator like this one:

public class CachingProductRepository : IProductRepository
{
    private readonly ICache cache;
    private readonly IProductRepository repository;
 
    public CachingProductRepository(
        IProductRepository repository, ICache cache)
    {
        if (repository == null)
        {
            throw new ArgumentNullException("repository");
        }
        if (cache == null)
        {
            throw new ArgumentNullException("cache");
        }
 
        this.cache = cache;
        this.repository = repository;
    }
 
    #region IProductRepository Members
 
    public IEnumerable<Product> SelectTopSellers()
    {
        return this.cache
            .Retrieve<IEnumerable<Product>>("topSellers",
                this.repository.SelectTopSellers);
    }
 
    #endregion
}

For completeness’ sake, here is the definition of ICache:

public interface ICache
{
    T Retrieve<T>(string key, Func<T> readThrough);
}

The point is that CachingProductRepository extends any IProductRepository we provide to it (including SqlProductRepository) without modifying it. Thus, we have satisfied both the OCP and the SRP.
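The post does not show an implementation of ICache. As a minimal sketch, assuming a simple in-memory read-through cache without expiration or eviction, it might look like the following. The class name Cache matches the one used when the graph is wired up further down, but its body here is an assumption made for illustration only.

```csharp
using System;
using System.Collections.Generic;

public interface ICache
{
    T Retrieve<T>(string key, Func<T> readThrough);
}

// Assumed in-memory implementation: entries never expire.
public class Cache : ICache
{
    private readonly Dictionary<string, object> items =
        new Dictionary<string, object>();
    private readonly object syncRoot = new object();

    public T Retrieve<T>(string key, Func<T> readThrough)
    {
        if (key == null)
        {
            throw new ArgumentNullException("key");
        }
        if (readThrough == null)
        {
            throw new ArgumentNullException("readThrough");
        }

        lock (this.syncRoot)
        {
            object cached;
            if (!this.items.TryGetValue(key, out cached))
            {
                // Cache miss: read through to the source and remember the result.
                cached = readThrough();
                this.items[key] = cached;
            }
            return (T)cached;
        }
    }
}
```

With this sketch, the readThrough delegate is only invoked the first time a given key is requested; later calls with the same key return the stored value.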

Just to drive home the point, let us assume that we also wish to record execution times for various methods for purposes of SLA compliance. We can do this by introducing yet another Decorator:

public class PerformanceMeasuringProductRepository : 
    IProductRepository
{
    private readonly IProductRepository repository;
    private readonly IStopwatch stopwatch;
 
    public PerformanceMeasuringProductRepository(
        IProductRepository repository, 
        IStopwatch stopwatch)
    {
        if (repository == null)
        {
            throw new ArgumentNullException("repository");
        }
        if (stopwatch == null)
        {
            throw new ArgumentNullException("stopwatch");
        }
 
        this.repository = repository;
        this.stopwatch = stopwatch;
    }
 
    #region IProductRepository Members
 
    public IEnumerable<Product> SelectTopSellers()
    {
        var timer = this.stopwatch
            .StartMeasuring("SelectTopSellers");
        var topSellers = 
            this.repository.SelectTopSellers();
        timer.StopMeasuring();
        return topSellers;
    }
 
    #endregion
}

Once again, we modified neither SqlProductRepository nor CachingProductRepository to introduce this new feature. We can implement security and auditing features by following the same principle.
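An auditing Decorator could follow the same shape. The sketch below is only an illustration: AuditingProductRepository and InMemoryProductRepository are hypothetical classes, Product is reduced to a Name property, and the audit trail is modeled as a plain Action<string> instead of a dedicated interface.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the real Product class.
public class Product
{
    public string Name { get; set; }
}

public interface IProductRepository
{
    IEnumerable<Product> SelectTopSellers();
}

// Hypothetical Decorator that records each call before delegating.
public class AuditingProductRepository : IProductRepository
{
    private readonly IProductRepository repository;
    private readonly Action<string> audit;

    public AuditingProductRepository(
        IProductRepository repository, Action<string> audit)
    {
        if (repository == null)
        {
            throw new ArgumentNullException("repository");
        }
        if (audit == null)
        {
            throw new ArgumentNullException("audit");
        }

        this.repository = repository;
        this.audit = audit;
    }

    public IEnumerable<Product> SelectTopSellers()
    {
        this.audit("SelectTopSellers was called");
        return this.repository.SelectTopSellers();
    }
}

// Hypothetical in-memory repository, only here to make the sketch runnable.
public class InMemoryProductRepository : IProductRepository
{
    private readonly IEnumerable<Product> products;

    public InMemoryProductRepository(IEnumerable<Product> products)
    {
        this.products = products;
    }

    public IEnumerable<Product> SelectTopSellers()
    {
        return this.products;
    }
}
```

Because the Decorator only depends on IProductRepository, it can wrap the SQL Server-based repository, the caching Decorator, or any other implementation without changing them.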

To me, this is what loose coupling (and DI) is all about. That we can also replace data access components and unit test using dynamic mocks are very fortunate side effects, but the loose coupling is valuable in itself because it enables us to write more maintainable code.

We don’t even need a DI Container to wire up all these repositories (although it sure would be helpful). Here’s how we can do it with Poor Man’s DI:

IProductRepository repository =
    new PerformanceMeasuringProductRepository(
        new CachingProductRepository(
            new SqlProductRepository(), new Cache()
            ),
        new RealStopwatch()
    );
var vm = new PrécisViewModel(repository);

The next time someone on your team claims that you don’t need DI because the choice of RDBMS is fixed, you can tell them that it’s irrelevant. The choice is between DI and Spaghetti Code.

posted on Wednesday, April 07, 2010 9:49:11 PM (Romance Daylight Time, UTC+02:00)
# Monday, January 25, 2010

About a week ago Uncle Bob published a post on Dependency Injection Inversion that caused quite a stir in the tiny part of the .NET community I usually pretend to hang out with. Twitter was alive with much debate, but Ayende seems to sum up the .NET DI community's sentiment pretty well:

if this is a typical example of IoC usage in the Java world, then [Uncle Bob] should peek over the fence to see how IoC is commonly implemented in the .Net space

Despite having initially left a more or less positive note on Uncle Bob's post, after having re-read it carefully, I am beginning to think the same, but instead of just telling everyone how much greener the grass is on the .NET side, let me show you.

First of all, let's translate Uncle Bob's BillingService to C#:

public class BillingService
{
    private readonly CreditCardProcessor processor;
    private readonly TransactionLog transactionLog;
 
    public BillingService(CreditCardProcessor processor,
        TransactionLog transactionLog)
    {
        if (processor == null)
        {
            throw new ArgumentNullException("processor");
        }
        if (transactionLog == null)
        {
            throw new ArgumentNullException("transactionLog");
        }
 
        this.processor = processor;
        this.transactionLog = transactionLog;
    }
 
    public void ProcessCharge(int amount, string id)
    {
        var approval = this.processor.Approve(amount, id);
        this.transactionLog.Log(string.Format(
            "Transaction by {0} for {1} {2}", id, amount,
            this.GetApprovalCode(approval)));
    }
 
    private string GetApprovalCode(bool approval)
    {
        return approval ? "approved" : "denied";
    }
}

It's nice how easy it is to translate Java code to C#, but apart from casing and other minor deviations, let's focus on the main difference. I've added Guard Clauses to protect the injected dependencies against null values as I consider this an essential and required part of Constructor Injection – I think Uncle Bob should have added those as well, but he might have omitted them for brevity.

If you disregard the Guard Clauses, the C# version is a logical line of code shorter than the Java version because it has no DI attribute like Guice's @Inject.

Does this mean that we can't do DI with the C# version of BillingService? Uncle Bob seems to imply that we can do Dependency Inversion, but not Dependency Injection - or is it the other way around? I can't really make heads or tails of that part of the post…

The interesting part is that in .NET, there's no difference! We can use DI Containers with the BillingService without sprinkling DI attributes all over our code base. The BillingService class has no reference to any DI Container.

It does, however, use the central DI pattern Constructor Injection. .NET DI Containers know all about this pattern, and with .NET's static type system they know all they need to know to wire dependencies up correctly. (I thought that Java had a static type system as well, but perhaps I am mistaken.) The .NET DI Containers will figure it out for you – you don't have to explicitly tell them how to invoke a constructor with two parameters.

We can write an entire application by using Constructor Injection and stacking dependencies without ever referencing a container!

Like the Lean concept of the Last Responsible Moment, we can wait until the application's entry point to decide how we will wire up the dependencies.

As Uncle Bob suggests, we can use Poor Man's DI and manually create the dependencies directly in Main, but as Ayende correctly observes, that only looks like an attractive alternative because the example is so simple. For complex dependency graphs, a DI Container is a much better choice.
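For comparison, here is roughly what Poor Man's DI looks like for this example. The two abstractions are inferred from how BillingService uses them, and AlwaysApprovingProcessor and InMemoryTransactionLog are hypothetical stand-ins invented for this sketch, since the post does not show the real implementations.

```csharp
using System;
using System.Collections.Generic;

// Abstractions inferred from BillingService's usage (names kept from the translation).
public interface CreditCardProcessor
{
    bool Approve(int amount, string id);
}

public interface TransactionLog
{
    void Log(string message);
}

// Hypothetical stand-in implementations for the sketch.
public class AlwaysApprovingProcessor : CreditCardProcessor
{
    public bool Approve(int amount, string id)
    {
        return true;
    }
}

public class InMemoryTransactionLog : TransactionLog
{
    public readonly List<string> Entries = new List<string>();

    public void Log(string message)
    {
        this.Entries.Add(message);
    }
}

public class BillingService
{
    private readonly CreditCardProcessor processor;
    private readonly TransactionLog transactionLog;

    public BillingService(CreditCardProcessor processor,
        TransactionLog transactionLog)
    {
        if (processor == null)
        {
            throw new ArgumentNullException("processor");
        }
        if (transactionLog == null)
        {
            throw new ArgumentNullException("transactionLog");
        }

        this.processor = processor;
        this.transactionLog = transactionLog;
    }

    public void ProcessCharge(int amount, string id)
    {
        var approval = this.processor.Approve(amount, id);
        this.transactionLog.Log(string.Format(
            "Transaction by {0} for {1} {2}", id, amount,
            approval ? "approved" : "denied"));
    }
}

public static class PoorMansComposition
{
    // The whole graph is created by hand in one place.
    public static BillingService Compose(TransactionLog log)
    {
        return new BillingService(new AlwaysApprovingProcessor(), log);
    }
}
```

With only two dependencies this is perfectly readable; the manual approach starts to hurt when the graph grows deep or when lifetimes need managing.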

With the C# version of BillingService, which DI Container must we select?

It doesn't matter: we can choose whichever one we would like because we have been following patterns instead of using a framework.

Here's an example of an implementation of Main using Castle Windsor:

public static void Main(string[] args)
{
    var container = new WindsorContainer();
    Program.Configure(container);
 
    var billingService =
        container.Resolve<BillingService>();
    billingService.ProcessCharge(2034, "Bob");
}

This looks a lot like Uncle Bob's first Guice example, but instead of injecting a BillingModule into the container, we can configure it inline or in a helper method:

private static void Configure(WindsorContainer container)
{
    container.Register(Component
        .For<TransactionLog>()
        .ImplementedBy<DatabaseTransactionLog>());
    container.Register(Component
        .For<CreditCardProcessor>()
        .ImplementedBy<MyCreditCardProcessor>());
    container.Register(Component.For<BillingService>());
}

This corresponds more or less to the Guice-specific BillingModule, although Windsor also requires us to register the concrete BillingService as a component (this last step varies a bit from DI Container to DI Container – it is, for example, redundant in Unity).

Imagine that in the future we want to rewire this program to use a different DI Container. The only piece of code we need to change is this Composition Root. We need to change the container declaration and configuration and then we are ready to use a different DI Container.

The bottom line is that Uncle Bob's Dependency Injection Inversion is redundant in .NET. Just use a few well-known design patterns and principles and you can write entire applications with DI-friendly, DI-agnostic code bases.

I recently posted a first take on guidelines for writing DI-agnostic code. I plan to evolve these guiding principles and make them a part of my upcoming book.

posted on Monday, January 25, 2010 9:48:27 PM (Romance Standard Time, UTC+01:00)
# Wednesday, January 20, 2010

My previous post led to this comment by Phil Haack:

Your LazyOrderShipper directly instantiates an OrderShipper. What about the dependencies that OrderShipper might require? What if those dependencies are costly?

I didn't want to make my original example more complex than necessary to get the point across, so I admit that I made it a bit simpler than I might have liked. However, the issue is easily solved by enabling DI for the LazyOrderShipper itself.

As always, when the dependency's lifetime may be shorter than the consumer's, the solution is to inject (via the constructor!) an Abstract Factory, as this modification of LazyOrderShipper shows:

public class LazyOrderShipper2 : IOrderShipper
{
    private readonly IOrderShipperFactory factory;
    private IOrderShipper shipper;
 
    public LazyOrderShipper2(IOrderShipperFactory factory)
    {
        if (factory == null)
        {
            throw new ArgumentNullException("factory");
        }
 
        this.factory = factory;
    }
 
    #region IOrderShipper Members
 
    public void Ship(Order order)
    {
        if (this.shipper == null)
        {
            this.shipper = this.factory.Create();
        }
        this.shipper.Ship(order);
    }
 
    #endregion
}

But, doesn't that reintroduce the OrderShipperFactory that I earlier claimed was a bad design?

No, it doesn't, because this IOrderShipperFactory doesn't rely on static configuration. The other point is that while we do have an IOrderShipperFactory, the original design of OrderProcessor is unchanged (and thus blissfully unaware of the existence of this Abstract Factory).

The lifetime of the various dependencies is completely decoupled from the components themselves, and this is as it should be with DI.

This version of LazyOrderShipper is more reusable because it doesn't rely on any particular implementation of OrderShipper – it can Lazily create any IOrderShipper.
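The IOrderShipperFactory interface itself is not shown above; from the call to Create() it presumably has a single parameterless method returning an IOrderShipper. The sketch below makes that assumption explicit. DelegatingOrderShipperFactory and NullOrderShipper are hypothetical helpers invented for this illustration, and Order is an empty stand-in.

```csharp
using System;

// Empty stand-in for the real Order class.
public class Order
{
}

public interface IOrderShipper
{
    void Ship(Order order);
}

// Assumed shape, inferred from the call to this.factory.Create().
public interface IOrderShipperFactory
{
    IOrderShipper Create();
}

// Hypothetical factory that defers to a delegate, so the decision about
// which IOrderShipper to create is made where the graph is wired up.
public class DelegatingOrderShipperFactory : IOrderShipperFactory
{
    private readonly Func<IOrderShipper> create;

    public DelegatingOrderShipperFactory(Func<IOrderShipper> create)
    {
        if (create == null)
        {
            throw new ArgumentNullException("create");
        }
        this.create = create;
    }

    public IOrderShipper Create()
    {
        return this.create();
    }
}

// Hypothetical do-nothing shipper, only here to make the sketch runnable.
public class NullOrderShipper : IOrderShipper
{
    public void Ship(Order order)
    {
    }
}
```

Nothing in this factory is static, so two consumers can be given two differently configured factories without interfering with each other.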

posted on Wednesday, January 20, 2010 7:08:36 PM (Romance Standard Time, UTC+01:00)

Jeffrey Palermo recently posted a blog post titled Constructor over-injection anti-pattern – go read his post first if you want to be able to follow my arguments.

His point seems to be that Constructor Injection can be an anti-pattern if applied too much, particularly if a consumer doesn't need a particular dependency in the majority of cases.

The problem is illustrated in this little code snippet:

bool isValid = _validator.Validate(order);  
if (isValid) 
{
    _shipper.Ship(order);  
}

If the Validate method often returns false, the shipper dependency often goes unused.

This, he argues, can lead to inefficiencies if the dependency is costly to create. It's not a good thing to require a costly dependency if you are not going to use it in a lot of cases.

That sounds like a reasonable statement, but is it really? And is the proposed solution a good solution?

No, this isn't a reasonable statement, and the proposed solution isn't a good solution.

It would seem like there's a problem with Constructor Injection, but in reality the problem is that it is being used incorrectly and in too constrained a way.

The proposed solution is problematic because it involves tightly coupling the code to OrderShipperFactory. This is more or less a specialized application of the Service Locator anti-pattern.

Consumers of OrderProcessor have no static type information to warn them that they need to configure the OrderShipperFactory.CreationClosure static member - a completely unrelated type. This may technically work, but creates a very developer-unfriendly API. IntelliSense isn't going to be of much help here, because when you want to create an instance of OrderProcessor, it's not going to remind you that you need to statically configure OrderShipperFactory first. Enter lots of run-time exceptions.

Another issue is that he allows a concrete implementation of an interface to change the design of the OrderProcessor class - that's hardly in the spirit of the Liskov Substitution Principle. I consider this a strong design smell.

One of the commenters (Alwin) suggests instead injecting an IOrderShipperFactory. While this is a better option, it still suffers from letting a concrete implementation influence the design, and there's an even better solution.

First of all we should realize that the whole case is a bit contrived because although the IOrderShipper implementation may be expensive to create, there's no need to create a new instance for every OrderProcessor. Instead, we can use the so-called Singleton lifetime style where we share or reuse a single IOrderShipper instance between multiple OrderProcessor instances.

The beauty of this is that we can defer that decision until we wire up the actual dependencies. If we have implementations of IOrderShipper that are inexpensive to create, we may still decide to create a new instance every time.

There may still be a corner case where a shared instance doesn't work for a particular implementation (perhaps because it's not thread-safe). In such cases, we can use Lazy loading to create a LazyOrderShipper like this (for clarity I've omitted making this implementation thread-safe, but that would be trivial to do):

public class LazyOrderShipper : IOrderShipper
{
    private OrderShipper shipper;
 
    #region IOrderShipper Members
 
    public void Ship(Order order)
    {
        if (this.shipper == null)
        {
            this.shipper = new OrderShipper();
        }
        this.shipper.Ship(order);
    }
 
    #endregion
}

Notice that this implementation of IOrderShipper only creates the expensive OrderShipper instance when it needs it.
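If thread safety matters, one option on .NET 4 and later is to lean on System.Lazy<T>, whose constructor taking a Func<T> is thread-safe by default. The following is a sketch under that assumption; ThreadSafeLazyOrderShipper and NullOrderShipper are hypothetical names, Order is an empty stand-in, and the creation delegate is passed in to keep the example self-contained.

```csharp
using System;

// Empty stand-in for the real Order class.
public class Order
{
}

public interface IOrderShipper
{
    void Ship(Order order);
}

// Hypothetical thread-safe variant of the lazy wrapper.
public class ThreadSafeLazyOrderShipper : IOrderShipper
{
    private readonly Lazy<IOrderShipper> shipper;

    public ThreadSafeLazyOrderShipper(Func<IOrderShipper> create)
    {
        if (create == null)
        {
            throw new ArgumentNullException("create");
        }

        // Lazy<T> runs the delegate at most once, even under concurrent access.
        this.shipper = new Lazy<IOrderShipper>(create);
    }

    public bool IsShipperCreated
    {
        get { return this.shipper.IsValueCreated; }
    }

    public void Ship(Order order)
    {
        this.shipper.Value.Ship(order);
    }
}

// Hypothetical do-nothing shipper, only here to make the sketch runnable.
public class NullOrderShipper : IOrderShipper
{
    public void Ship(Order order)
    {
    }
}
```

The expensive instance is still created only on the first call to Ship, and every later call reuses it.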

Instead of injecting the expensive OrderShipper instance directly into OrderProcessor, we wrap it in the LazyOrderShipper class and inject that instead. The following test proves the point:

[TestMethod]
public void OrderProcessorIsFast()
{
    // Fixture setup
    var stopwatch = new Stopwatch();
    stopwatch.Start();
 
    var order = new Order();
 
    var validator = new Mock<IOrderValidator>();
    validator.Setup(v => 
        v.Validate(order)).Returns(false);
 
    var shipper = new LazyOrderShipper();
 
    var sut = new OrderProcessor(validator.Object,
        shipper);
    // Exercise system
    sut.Process(order);
    // Verify outcome
    stopwatch.Stop();
    Assert.IsTrue(stopwatch.Elapsed < 
        TimeSpan.FromMilliseconds(777));
    Console.WriteLine(stopwatch.Elapsed);
    // Teardown
}

This test is significantly faster than 777 milliseconds because the OrderShipper never comes into play. In fact, the stopwatch instance reports that the elapsed time was around 3 ms!

The bottom line is that Constructor Injection is not an anti-pattern. On the contrary, it is the most powerful DI pattern available, and you should think twice before deviating from it.

posted on Wednesday, January 20, 2010 5:28:03 PM (Romance Standard Time, UTC+01:00)
# Tuesday, September 29, 2009

The SOLID principles of OOD as originally put forth by Robert C. Martin make for such a catchy acronym, although they seem to originally have been spelled SOLDI.

In any case, I've lately been thinking a bit about these principles, and it seems to me that the Single Responsibility Principle (SRP) and the Interface Segregation Principle (ISP) are very much related. In essence, you could say that the ISP is simply the SRP applied to interfaces.

The notion underlying both is that a type should deal with only a single concept. Whether that applies to the public API or the internal implementation is less relevant because a corollary to the Liskov Substitution Principle (LSP) and Dependency Inversion Principle (DIP) is that we shouldn't really care about the internals (unless we are actually implementing, that is).

The API is what matters.

Although I do understand the subtle differences between SRP and ISP I think they are so closely related that one of them is really redundant. We can remove the ISP and still have a fairly good acronym: SOLD (although SOLID is still better).

There's one principle that I think is missing from this set: The principle about Command/Query Separation (CQS). In my opinion, this is a very important principle that should be highlighted more than is currently the case.

If we add CQS to SOLD, we are left with some less attractive acronyms:

  • SCOLD
  • COLDS
  • CLODS

Not nearly as confidence-inspiring acronyms as SOLID, but nonetheless, I'm striving to write COLDS code.

posted on Tuesday, September 29, 2009 9:38:42 PM (Romance Daylight Time, UTC+02:00)
# Friday, June 05, 2009

When I talk with people about TDD and unit testing, the discussion often moves into the area of Testability – that is, the software's susceptibility to unit testing. A couple of years back, Roy even discussed the seemingly opposing forces of Object-Oriented Design and Testability.

Lately, it has been occurring to me that there really isn't any conflict. Encapsulation is important because it manifests expert knowledge so that other developers can effectively leverage that knowledge, and it does so in a way that minimizes misuse.

However, too much encapsulation goes against the Open/Closed Principle (which states that objects should be open for extension, but closed for modification). From a Testability perspective, the Open/Closed Principle pulls object-oriented design in the desired direction. Equivalently, done correctly, making your API Testable is simply opening it up for extensibility.

As an example, consider a simple WPF ViewModel class called MainWindowViewModel. This class has an ICommand property that, when invoked, should show a message box. Showing a message box is a good example of breaking testability, because if the SUT were to show a message box, it would be very hard to automatically verify and we wouldn't have fully automated tests.

For this reason, we need to introduce an abstraction that basically models an action with a string as input. Although we could define an interface for that, an Action<string> fits the bill perfectly.

To enable that feature, I decide to use Constructor Injection to inject that abstraction into the MainWindowViewModel class:

public MainWindowViewModel(Action<string> notify)
{
    this.ButtonCommand = new RelayCommand(p => 
    { notify("Button was clicked!"); });
}

When I recently did that at a public talk I gave, one member of the audience initially reacted by assuming that I was now introducing test-specific code into my SUT, but that's not the case.

What I'm really doing here is opening the MainWindowViewModel class for extensibility. It can still be used with message boxes:

var vm = new MainWindowViewModel(s => MessageBox.Show(s));

but now we also have the option of notifying by sending off an email; writing to a database; or whatever else we can think of.

It just so happens that one of the things we can do instead of showing a message box is to unit test the class by passing in a Test Double.

// Fixture setup
var mockNotify = 
    MockRepository.GenerateMock<Action<string>>();
mockNotify.Expect(a => a("Button was clicked!"));
 
var sut = new MainWindowViewModel(mockNotify);
// Exercise system
sut.ButtonCommand.Execute(new object());
// Verify outcome
mockNotify.VerifyAllExpectations();
// Teardown
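Since the abstraction is just an Action<string>, a dynamic mock library is optional. The sketch below uses a hand-rolled Test Double that records notifications in a list. The RelayCommand shown here is a minimal assumed implementation written for this illustration (the post does not show the one it uses), and the view model is reduced to the single command.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Input;

// Minimal assumed RelayCommand: always executable, forwards to a delegate.
public class RelayCommand : ICommand
{
    private readonly Action<object> execute;

    public RelayCommand(Action<object> execute)
    {
        if (execute == null)
        {
            throw new ArgumentNullException("execute");
        }
        this.execute = execute;
    }

    // CanExecute never changes in this sketch, so subscriptions are ignored.
    public event EventHandler CanExecuteChanged
    {
        add { }
        remove { }
    }

    public bool CanExecute(object parameter)
    {
        return true;
    }

    public void Execute(object parameter)
    {
        this.execute(parameter);
    }
}

public class MainWindowViewModel
{
    public MainWindowViewModel(Action<string> notify)
    {
        if (notify == null)
        {
            throw new ArgumentNullException("notify");
        }
        this.ButtonCommand = new RelayCommand(p =>
        { notify("Button was clicked!"); });
    }

    public ICommand ButtonCommand { get; private set; }
}

public static class HandRolledSpyExample
{
    // Returns the notifications captured while exercising the command.
    public static List<string> ClickAndCapture()
    {
        var messages = new List<string>();
        var sut = new MainWindowViewModel(messages.Add);
        sut.ButtonCommand.Execute(new object());
        return messages;
    }
}
```

The list acts as a simple spy: after the command has executed, a test can assert on its contents with whatever assertion library is at hand.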

Once again, TDD has led to better design. In this case it prompted me to open the class for extensibility. There really isn't a need for Testability as a specific concept; the Open/Closed Principle should be enough to drive us in the right direction.

Pragmatically, that's not the case, so we use TDD to drive us towards the Open/Closed Principle, but I think it's important to note that we are not only doing this to enable testing: We are creating a better and more flexible API at the same time.

posted on Friday, June 05, 2009 9:56:19 AM (Romance Daylight Time, UTC+02:00)
# Thursday, May 28, 2009

This is really nothing new, but I don't think I've explicitly stated this before: It makes a lot of sense to view delegates as anonymous one-method interfaces.

Many people liken delegates to function pointers. While that's probably correct (I wouldn't really know), it's not a very object-oriented view to take – at least not when we are dealing with managed code. To me, it makes more sense to view delegates as anonymous one-method interfaces.

Let's consider a simple example. As always, we have the ubiquitous MyClass with its DoStuff method. In this example, DoStuff takes as input an abstraction that takes a string as input and returns an integer – let's imagine that this is some kind of Strategy (notice the capital S – I'm talking about the design pattern, here).

In traditional object-oriented design, we could solve this by introducing the IMyInterface type:

public interface IMyInterface
{
    int DoIt(string message);
}

The implementation of DoStuff is simply:

public string DoStuff(IMyInterface strategy)
{
    return strategy.DoIt("Ploeh").ToString();
}

Hardly rocket science…

However, defining a completely new interface just to do this is not really necessary, since we could just as well have implemented DoStuff with a Func<string, int>:

public string DoStuff(Func<string, int> strategy)
{
    return strategy("Ploeh").ToString();
}

This not only frees us from defining a new interface, but also from implementing that interface to use the DoStuff method. Instead, we can simply pass a lambda expression:

string result = sut.DoStuff(s => s.Count());
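The two views are also interchangeable in practice: an existing implementation of the one-method interface can be handed to the delegate-based DoStuff through a method group conversion. In the sketch below, LengthStrategy is a hypothetical implementation added purely for illustration.

```csharp
using System;

public interface IMyInterface
{
    int DoIt(string message);
}

// Hypothetical implementation of the one-method interface.
public class LengthStrategy : IMyInterface
{
    public int DoIt(string message)
    {
        return message.Length;
    }
}

public class MyClass
{
    public string DoStuff(Func<string, int> strategy)
    {
        return strategy("Ploeh").ToString();
    }
}

public static class MethodGroupExample
{
    public static string Run()
    {
        IMyInterface strategy = new LengthStrategy();
        // The method group strategy.DoIt converts to Func<string, int>.
        return new MyClass().DoStuff(strategy.DoIt);
    }
}
```

Here Run returns "5", the length of "Ploeh", so moving an API from the interface to the delegate does not strand existing implementations.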

What's most amazing is that RhinoMocks understands and treats delegates just like other abstract types, so that we can write the following to treat it as a mock:

// Fixture setup
Func<string, int> mock =
    MockRepository.GenerateMock<Func<string, int>>();
mock.Expect(f => f("Ploeh")).Return(42);
var sut = new MyClass();
// Exercise system
string result = sut.DoStuff(mock);
// Verify outcome
mock.VerifyAllExpectations();
// Teardown

Whenever possible, I prefer to model my APIs with delegates instead of one-method interfaces, since it gives me greater flexibility and less infrastructure code.

Obviously, this technique only works as long as you only need to abstract a single method. As soon as your abstraction needs a second method, you will need to introduce a proper interface or, preferably, an abstract base class.

posted on Thursday, May 28, 2009 10:19:04 PM (Romance Daylight Time, UTC+02:00)
# Tuesday, May 05, 2009

Udi recently posted an article on managing loose coupling in Visual Studio. While I completely agree, this is a topic that deserves more detailed treatment. In particular, I'd like to expand on this statement:

"In fact, each component could theoretically have its own solution"

This is really the crux of the matter, although in practical terms, you'd typically need at least a couple of projects per component. In special cases, a component may truly be a stand-alone component, requiring no other dependencies than what is already in the BCL (in fact, AutoFixture is just such a component), but most components of more complex software have dependencies.

Even when you are programming against interfaces (which you should be), these interfaces will normally be defined in other projects.

[Figure: PragmaticMinimalSolution]

A component may even use multiple interfaces, since it may be implementing some, but consuming others, and these interfaces may be defined in different projects. This is particularly the case with Adapters.

Finally, you should have at least one unit test project that targets your component.

In essence, while the exact number of projects you need will vary, it should stay small. In the figure above, we end up with five projects, but there are also quite a few abstractions being pulled in.

As a rule of thumb, I'd say that if you can't create an .sln file that contains fewer than ten projects to work on any component, you should seriously reconsider your decoupling strategy.

You may choose to work with more than ten projects in a solution, but it should always be possible to create a solution to work with a single component, and it should drag only a few dependencies along.

posted on Tuesday, May 05, 2009 8:54:11 PM (Romance Daylight Time, UTC+02:00)
# Friday, May 01, 2009

As a response to my description of how AutoFixture creates objects, Klaus asked:

“[What] if the constructor of ComplexChild imposes some kind of restriction on its parameter? If, for example, instead of the "name" parameter, it would take a "phoneNumber" parameter (as a string), and do some format checking?”

Now that we have covered some of the basic features of AutoFixture, it’s time to properly answer this excellent question.

For simplicity’s sake, let’s assume that the phone number in question is a Danish phone number: This is pretty good for example code, since a Danish phone number is essentially just an 8-digit number. It can have white space and an optional country code (+45), but strip that away, and it’s just an 8-digit number. However, there are exceptions, since the emergency number is 112 (equivalent to the American 911), and other 3-digit special numbers exist as well.

With that in mind, let’s look at a simple Contact class that contains a contact’s name and Danish phone number. The constructor might look like this:

public Contact(string name, string phoneNumber)
{
    this.Name = name;
    this.PhoneNumber = 
        Contact.ParsePhoneNumber(phoneNumber);
}

The static ParsePhoneNumber method strips away white space and optional country code and parses the normalized string to a number. This fits the scenario laid out in Klaus’ question.
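The post doesn't show the body of ParsePhoneNumber, so the following is only a sketch of what such a method might look like, under stated assumptions: the return type is taken to be int, only 3-digit and 8-digit numbers are accepted, and invalid input results in a FormatException. The containing class name PhoneNumberParsing is hypothetical.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class PhoneNumberParsing
{
    // Hypothetical sketch; the original method body is not shown in the post.
    public static int ParsePhoneNumber(string phoneNumber)
    {
        if (phoneNumber == null)
        {
            throw new ArgumentNullException("phoneNumber");
        }

        // Strip away all white space.
        string normalized = new string(
            phoneNumber.Where(c => !char.IsWhiteSpace(c)).ToArray());

        // Strip away the optional Danish country code.
        if (normalized.StartsWith("+45", StringComparison.Ordinal))
        {
            normalized = normalized.Substring(3);
        }

        // Assumption: accept only 3-digit special numbers and 8-digit numbers.
        bool allDigits = normalized.Length > 0
            && normalized.All(c => c >= '0' && c <= '9');
        if (!allDigits || (normalized.Length != 3 && normalized.Length != 8))
        {
            throw new FormatException("Not a Danish phone number.");
        }

        return int.Parse(normalized, CultureInfo.InvariantCulture);
    }
}
```

With a method along these lines, an input such as "+45 12 34 56 78" is accepted, while an arbitrary string that is not a 3-digit or 8-digit number is rejected.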

So what happens when we ask AutoFixture to create an instance of Contact? It will reflect over Contact’s constructor and create two new anonymous string instances – one for name, and one for phoneNumber. As previously described, each string will be created as a Guid prepended with a named hint – in this case the argument name. Thus, the phoneNumber argument will get a value like "phoneNumberfa432351-1563-4769-842c-7588af32a056", which will cause the ParsePhoneNumber method to throw an exception.

How do we deal with that?

The most obvious fix is to modify AutoFixture’s algorithm for generating strings. Here is an initial attempt:

fixture.Register<string>(() => "112");

This will simply cause all generated strings to be "112", including the Contact instance's Name property. In unit testing, this may not be a problem in itself, since, from an API perspective, the name could in principle be any string.

However, if the Contact class also had an Email property that was parsed and verified from a string argument, we'd be in trouble, since "112" is not a valid email address.

We can't easily modify the string generation algorithm to fit the requirements for both a Danish telephone number and an email address.

Should we then conclude that AutoFixture isn't really useful after all?

On the contrary, this is a hint to us that the Contact class' API could be better. If an automated tool can't figure out how to generate correct input, how can we expect other developers to do it?

Although humans can make leaps of intuition, an API should still go to great lengths to protect its users from making mistakes. Asking for an unbounded string and then expecting it to be in a particular format may not always be the best option available.

In our particular case, the Value Object pattern offers a better alternative. Our first version of the DanishPhoneNumber class simply takes an integer as a constructor argument:

public DanishPhoneNumber(int number)
{
    this.number = number;
}

If we still need to parse strings (e.g. from user input), we could add a static Parse, or even a TryParse, method and test that method in isolation without involving the Contact class.
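As a rough sketch of how Parse and TryParse might sit on the Value Object, consider the following. The method bodies, the Number property and the choice of FormatException are assumptions made for illustration; the post only says that such methods could be added. This version deliberately has no range check, matching the first version of the constructor.

```csharp
using System;
using System.Globalization;
using System.Linq;

public sealed class DanishPhoneNumber
{
    private readonly int number;

    public DanishPhoneNumber(int number)
    {
        this.number = number;
    }

    // Hypothetical read-only accessor, added so the value can be inspected.
    public int Number
    {
        get { return this.number; }
    }

    // Hypothetical TryParse: reports failure instead of throwing.
    public static bool TryParse(string candidate, out DanishPhoneNumber result)
    {
        result = null;
        if (candidate == null)
        {
            return false;
        }

        // Strip white space and the optional country code.
        string digits = new string(
            candidate.Where(c => !char.IsWhiteSpace(c)).ToArray());
        if (digits.StartsWith("+45", StringComparison.Ordinal))
        {
            digits = digits.Substring(3);
        }

        // NumberStyles.None accepts decimal digits only.
        int parsed;
        if (!int.TryParse(digits, NumberStyles.None,
            CultureInfo.InvariantCulture, out parsed))
        {
            return false;
        }

        result = new DanishPhoneNumber(parsed);
        return true;
    }

    // Hypothetical Parse: throws when TryParse fails.
    public static DanishPhoneNumber Parse(string candidate)
    {
        DanishPhoneNumber result;
        if (!TryParse(candidate, out result))
        {
            throw new FormatException("Not a Danish phone number.");
        }
        return result;
    }
}
```

The point of this arrangement is that string handling can be tested against DanishPhoneNumber alone, while Contact only ever sees an already valid value.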

This neatly solves our original issue with AutoFixture, since it will now create a new instance of DanishPhoneNumber as part of the creation process when we ask for an anonymous Contact instance.

The only remaining issue is that by default, the number fed into the DanishPhoneNumber instance is likely to be considerably less than 112 – actually, if no other Int32 instances are created, it will be 1.

This will be a problem if we modify the DanishPhoneNumber constructor to look like this:

public DanishPhoneNumber(int number)
{
    if ((number < 112) ||
        (number > 99999999))
    {
        throw new ArgumentOutOfRangeException("number");
    }
    this.number = number;
}

Unless a unit test has already caused AutoFixture to create 111 other integers (highly unlikely), CreateAnonymous<Contact> is going to throw an exception.

This is easy to fix. Once again, the most obvious fix is to modify the creation algorithm for integers.

fixture.Register<int>(() => 12345678);

However, this will cause that particular instance of Fixture to return 12345678 every time you ask it to create an anonymous integer. Depending on the scenario, this may or may not be a problem.

A more targeted solution is to specifically address the algorithm for generating DanishPhoneNumber instances:

fixture.Register<int, DanishPhoneNumber>(i => 
    new DanishPhoneNumber(i + 112));

Here, I've even used the Register overload that automatically provides an anonymous integer to feed into the DanishPhoneNumber constructor, so all I have to do is ensure that the number falls into the proper range. Adding 112 (the minimum) neatly does the trick.

If you don't like the hard-coded value of 112 in the test, you can use that to further drive the design. In this case, we can add a MinValue to DanishPhoneNumber:

fixture.Register<int, DanishPhoneNumber>(i =>
    new DanishPhoneNumber(i + 
        DanishPhoneNumber.MinValue));

Obviously, MinValue will also be used in DanishPhoneNumber's constructor to define the lower limit of the Guard Clause.
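A minimal sketch of how the constructor could use MinValue in its Guard Clause follows. Declaring MinValue as a const, and adding a matching MaxValue and a Number property, are assumptions for illustration; the post only mentions adding a MinValue.

```csharp
using System;

public sealed class DanishPhoneNumber
{
    // MinValue is named in the text; MaxValue is a hypothetical
    // counterpart added here for symmetry.
    public const int MinValue = 112;
    public const int MaxValue = 99999999;

    private readonly int number;

    public DanishPhoneNumber(int number)
    {
        // Guard Clause: reject values outside the valid range.
        if ((number < MinValue) || (number > MaxValue))
        {
            throw new ArgumentOutOfRangeException("number");
        }
        this.number = number;
    }

    // Hypothetical read-only accessor.
    public int Number
    {
        get { return this.number; }
    }
}
```

With this in place, the test and the production code share a single definition of the lower limit, so the registration shown above keeps working if the limit ever changes.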

In my opinion, a good API should guide the user and make it difficult to make mistakes. In many ways, you can view AutoFixture as an exceptionally dim user of your API. This is the reason I really enjoyed receiving Klaus' original question: Like other TDD practices, AutoFixture drives better design.

posted on Friday, May 01, 2009 5:56:00 AM (Romance Daylight Time, UTC+02:00)
# Sunday, February 22, 2009

When working with the ObjectContext in LINQ To Entities, a lot of operations are easily performed as long as you work with the same ObjectContext instance: You can retrieve entities from storage by selecting them; update or delete these entities and create new entities, and the ObjectContext will keep track of all this for you, so the changes are correctly applied to the store when you call SaveChanges.

This is all well and good, but not particularly useful when you start working with layered applications. In this case, LINQ To Entities is just a persistence technology that you (or someone else) decided to use to implement the Data Access Layer. A few years ago, I tended to implement my Data Access Components in straight ADO.NET; and a lot of people prefer NHibernate or similar tools – but I digress…

When LINQ To Entities is just an implementation detail of a service, lifetime management becomes important, so it is commonly recommended that any ObjectContext instance is instantiated when needed and disposed immediately after use.

This means that you will have a lot of detached entities in your system. Entities are likely to be returned to the calling code as interfaces, and when updating, a client will simply pass a reference to some implementation of that interface.

public void CompleteAtSource(IRecord record)

Since we should always follow the Liskov Substitution Principle, we should not even try to cast the interface to an entity. Instead, we must populate a new instance of the entity in question with the correct data and save it.

That’s not hard, but since we are creating a new instance of an entity that represents data that is already in the database, we must attach it to the ObjectContext so that it can start tracking it again.

Now we are getting to the heart of the matter, because this is done with the AttachTo method, which is woefully inadequately documented.

At first, I couldn’t get it to work, and it wasn’t very apparent to me what I did wrong, so although the answer is very simple, this post might save you a bit of time.

This was my first attempt:

using (MessageEntities store = 
    new MessageEntities(this.connectionString))
{
    Message m = new Message();
    m.Id = record.Id;
    m.InputReference = record.InputReference;
    m.State = 2;
    m.Text = record.Text;
 
    store.AttachTo("Messages", m);
 
    store.SaveChanges();
}

I find this approach very intuitive: Build the entity from the input parameter’s data, attach it to the store and save the changes. Unfortunately, this approach is wrong.

What happens is that when you invoke AttachTo, the state of the entity becomes Unchanged, so SaveChanges has nothing to update.

The solution is so simple that I’m surprised it took me so long to arrive at it: Simply call AttachTo right after setting the Id property:

using (MessageEntities store = 
    new MessageEntities(this.connectionString))
{
    Message m = new Message();
    m.Id = record.Id;
 
    store.AttachTo("Messages", m);
 
    m.InputReference = record.InputReference;
    m.State = 2;
    m.Text = record.Text;
 
    store.SaveChanges();
}

You can’t invoke AttachTo before adding the Id, since this method requires that the entity has a populated EntityKey before it can be attached, but as soon as you begin updating properties after the call to AttachTo, the entity’s state changes to Modified, and SaveChanges now updates the data in the database.

That you have to follow this specific sequence when re-attaching data to the ObjectContext is poorly documented and not enforced by the API, so I thought I’d share this in case it would save someone else a bit of time.

posted on Sunday, February 22, 2009 9:45:36 PM (Romance Standard Time, UTC+01:00)