Compose object graphs with confidence

Friday, 04 March 2011 11:15:10 UTC

The main principle behind the Register Resolve Release pattern is that loosely coupled object graphs should be composed as a single action in the entry point of the application (the Composition Root). For request-based applications (web sites and services), we use a variation where we compose once per request.

It seems to me that a lot of people are apprehensive when they first hear about this concept. It may sound reasonable from an architectural point of view, but isn't it horribly inefficient? A well-known example of such a concern is Jeffrey Palermo's blog post Constructor over-injection anti-pattern. Is it really a good idea to compose a complete object graph in one go? What if we don't need part of the graph, or only need it later? Doesn't it adversely affect response times?

Normally it doesn't, and if it does, there are elegant ways to address the issue.

In the rest of this blog post I will expand on this topic. To keep the discussion as simple as possible, I'll restrict my analysis to object trees instead of full graphs. This is quite a reasonable simplification as we should strive to avoid circular dependencies, but even in the case of full graphs the arguments and techniques put forward below hold.

Consider a simple tree composed of classes from three different assemblies:

[Figure: Tree]

All the A classes (blue) are defined in the A assembly, B classes (green) in the B assembly, and the C1 class (red) in the C assembly. In code we create the tree with Constructor Injection like this:

var t =
    new A1(
        new A2(
            new B1(
                new B2()),
            new A3()),
        new C1(
            new B3()));

Given the tree above, we can now address the most common concerns about composing object trees in one go.

Will it be slow? #

Most likely not. Keep in mind that Injection Constructors should be very simple, so not a lot of work is going on during composition. Obviously just creating new object instances takes a bit of time in itself, but we create object instances all the time in .NET code, and it's often a very fast operation.

Even when using DI Containers, which perform a lot of (necessary) extra work when creating objects, we can create tens of thousands of trees per second. Creation of objects simply isn't that big a deal.
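If you want to convince yourself, a rough micro-benchmark is easy to sketch: compose the sample tree above in a tight loop and time it with a Stopwatch (exact numbers obviously vary by machine):

var stopwatch = System.Diagnostics.Stopwatch.StartNew();
for (int i = 0; i < 100000; i++)
{
    var tree =
        new A1(
            new A2(
                new B1(
                    new B2()),
                new A3()),
            new C1(
                new B3()));
}
stopwatch.Stop();
Console.WriteLine("Composed 100,000 trees in {0} ms",
    stopwatch.ElapsedMilliseconds);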

But still: what about assembly loading? #

I glossed over an important point in the above argument. While object creation is fast, it sometimes takes a bit of time to load an assembly. The tree above uses classes from three different assemblies, so to create the tree all three assemblies must be loaded.

In many cases that's a performance hit you'll have to take because you need those classes anyway, but sometimes you might be concerned with taking this performance hit too early. However, I make the claim that in the vast majority of cases, this concern is irrelevant.

In this particular context there are two different types of applications: Request-based applications (web) and all the rest (desktop apps, daemons, batch-jobs, etc.).

Request-based applications #

For request-based applications such as web sites and REST services, an object tree must be composed for each request. However, all requests are served by the same AppDomain, so once an assembly is loaded, it sticks around to be available for all subsequent requests. Thus, the first few requests will suffer a performance penalty from having to load all assemblies, but after that there will be no performance impact.

In short, in request-based applications, you can compose object trees with confidence. In only extremely rare cases should you have performance issues from composing the entire tree in one go.
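To make 'compose once per request' concrete, here's a minimal sketch of what a Composition Root could look like in ASP.NET MVC (assuming the System.Web.Mvc and System.Web.Routing namespaces, and a hypothetical HomeController that takes the tree's root as a dependency). The factory composes a fresh tree for every request, while the assemblies stay loaded in the AppDomain:

public class PoorMansCompositionRoot : DefaultControllerFactory
{
    protected override IController GetControllerInstance(
        RequestContext requestContext, Type controllerType)
    {
        if (controllerType == typeof(HomeController))
        {
            // Compose the object tree for this particular request.
            return new HomeController(
                new A1(
                    new A2(
                        new B1(
                            new B2()),
                        new A3()),
                    new C1(
                        new B3())));
        }
        return base.GetControllerInstance(requestContext, controllerType);
    }
}

The factory itself is registered once at start-up, e.g. in Global.asax, with ControllerBuilder.Current.SetControllerFactory(new PoorMansCompositionRoot());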

Long-running applications #

For long-running applications the entire object tree must be composed at start-up. For background services such as daemons and batch processes the start-up time probably doesn't matter much, but for desktop applications it can be of great importance.

In some cases the application requires the entire tree to be immediately available, in which case there's not a lot you can do. Still, once all assemblies have been loaded, actually creating the tree will be very fast.

In other cases an entire branch of the tree may not be immediately required. As an example, if the C1 node in the above graph isn't needed right away, we could improve start-up time if we could somehow defer creating that branch, because this would also defer loading of the entire C assembly.

Deferred branches #

Since object creation is fast, the only case where it makes sense to defer loading of a branch is when creation of that branch causes an assembly to be loaded. If we can defer creation of such a branch, we can also defer loading of the assembly, thus improving the time it takes to compose the initial tree.

Imagine that we wish to defer creation of the C1 branch of the above tree. It will prevent the C assembly from being loaded because that assembly is not used in any other place in the tree. However, it will not prevent the B assembly from being loaded, since that assembly is also being used by the A2 node.

Still, in those rare situations where it makes sense to defer creation of a branch, we can turn that cut into a part of the tree's infrastructure. I originally described this technique as a reaction to the above-mentioned post by Jeffrey Palermo, but here's a restatement in the current context.

We can defer creating the C1 node by wrapping it in a lazy implementation of the same interface. The C1 node implements an interface called ISolo<IMarker>, so we can wrap it in a Virtual Proxy that defers creation of C1 until it's needed:

public class LazySoloMarker : ISolo<IMarker>
{
    private readonly Lazy<ISolo<IMarker>> lazy;
 
    public LazySoloMarker(Lazy<ISolo<IMarker>> lazy)
    {
        if (lazy == null)
        {
            throw new ArgumentNullException("lazy");
        }
 
        this.lazy = lazy;
    }
 
    #region ISolo<IMarker> Members
 
    public IMarker Item
    {
        get { return this.lazy.Value.Item; }
    }
 
    #endregion
}

This Virtual Proxy takes a Lazy<ISolo<IMarker>> as input and defers to it to implement the members of the interface. The underlying value is only created when the Value property is first accessed - which may be long after the LazySoloMarker instance was created.

The tree can now be composed like this:

var t =
    new A1(
        new A2(
            new B1(
                new B2()),
            new A3()),
        new LazySoloMarker(
            new Lazy<ISolo<IMarker>>(() => new C1(
                new B3()))));

This retains all the behavior of the original tree, but defers creation of the C1 node until it's needed for the first time.

The bottom line is this: you can compose the entire object graph with confidence. It's not going to be a performance bottleneck.

Update (2013-08-19 08:09 UTC): For a more detailed treatment of this topic, watch my NDC 2013 talk Big Object Graphs Up Front.


Comments

Hi there, great article and way to focus on the inherent differences between request-based and long-running applications.

The viewpoint however seems skewed towards request-based apps and I think really trivializes the innate lazy exploration nature of object graphs in rich-client apps (let's not call them desktop, as mobile is the exact same scenario - only difference is resources and processing power are a lot more limited and thus our architectures have to be better designed to account for it).

The main premise here is that in a rich-client app the trees you mention above should ALWAYS be lazily loaded, since the "branches" of the component tree are usually screens or their various sub-components (a dialog in some sub-screen for example). The branches are also more like "chains" as screens can have multiple follow-on screens that slowly load more content as you dive in deeper (while content you have visited higher up the chain may get unloaded - "lazy unloading" shall we say?). You never want to load screens that are not yet visible at app startup. In a desktop, fully-local app this results in a chuggy-performing or needlessly resource-wasteful app; in mobile it is simply impossible within the memory constraints. And of course... a majority of rich-client screens are hydrated with data pulled from the Internet and that data depends on either user input or server state (think accessing a venue listing on Yelp - you need the user to say which venue they want to view and the venue data is stored and updated server-side in a crowd-sourced manner not locally available to the user). So even if you had unlimited client-side memory you still couldn't pre-load the whole component object graph for the application up front.

I completely agree with your Composition Root article (/2011/07/28/CompositionRoot) and I think it describes things very beautifully. But what it also blissfully points out is that rich-client apps are poorly suited to a Composition Root and thus Dependency Injection approach.

DI does lazy-loading very poorly (just look at the amount of wrapper/plumbing you have to write for one component above). Now picture doing that for every screen in a mobile app (ours has ~30 screens and it's quite a trivial, minimal app for our problem domain). That's not to even mention whether you actually WANT to pull all the lifetime management into a single component - what could easily be seen as a break in Cohesion for logic belonging to the various sub-components. In web this doesn't really matter as the lifetime is usually all or nothing - it's either a singleton/per-request or per-instance. The request provides the full scoping needed for most processing. But in rich-client the lifetimes need to be finely managed based on screen navigation, user action and caching or memory constraints. Sub-components need to get loaded in and out dynamically in a lot more complex a manner than singletons or per-user-session. Again pulling this fine-tuned logic into a shared, centralized component is highly questionable - even if it was easily doable, which it's not.

I won't go into alternatives here (perhaps that's the subject of a post I should write up), but service location and manual instantiation ends up being a much preferable approach in these kinds of long-running application scenarios. Testability can be achieved in other, possibly simpler ways (http://unitbox.codeplex.com) if that's the driving concern.

Thus I think the key driving differentiator comes down to object graph composition: are you able to feasibly and desirably load the whole thing at once (such as in web scenarios) or not? In rich-client apps (desktop and mobile) this is a striking NO. You need components to load sub-components at will and not be bound and dependent on a centralized component (Composition Root) to do so. The alternative is passing a dependency container around to every component in the system so that the components can resolve sub-components using the container - but we all know how major an anti-pattern that is (oh Android!...)

Would love to see the community start differentiating between the scenarios where DI makes sense and where it ends up being an anti-productive burden. And I think rich-client apps are evidently one of those places.
2012-10-18 05:11 UTC
Hi Marcel

Thank you for your insightful comment. When it comes to the analysis of the needs of desktop and mobile applications, I completely agree that many nodes of the object graph would need to be lazy for exactly the reasons you so clearly explain.

However, I don't agree with your conclusions. Yes, there's a bit of plumbing involved with defining a Virtual Proxy over an expensive dependency, but I don't think it's a particularly problematic issue. There are a number of reasons for that:

- First of all, let's consider your example of an application with 30 screens. Obviously, you don't want to load all 30 screens up-front, but that doesn't mean that you'll need to write 30 custom Virtual Proxies. Hopefully, you have a single abstraction that loads screens, so that's only a single Virtual Proxy you'll need to write (see the sketch after this list).

- As you point out, you'll also want to postpone loading of data for each screen. I completely agree, but you don't need a Service Locator for this. The most common approach for this is to inject a strongly typed Query service (think: Repository) into your Controllers (or whatever you use to load data). This would essentially be a stateless service object without much (if any) read-only state, so even in a mobile app, I doubt it would take up much resources. Even so, you can still lazy-load it if you need to - but do measure before jumping to conclusions.

- In the end, you may need to proxy more than a single service, but if you find yourself in a situation where you need to proxy 30+ services, that's more likely to indicate a violation of the Reused Abstractions Principle than a failure of DI and the Composition Root pattern.

- Finally, while it may seem like an overhead to create the plumbing code, it's likely to be very robust. Once you've created a Virtual Proxy for an interface, the only reason it has to change is if you change the interface itself. If you stick to Role Interfaces that shouldn't happen very often. Thus, while you may be concerned that creating Virtual Proxies will require extra effort, it'll abstract away an application concern that will tend to be very robust and have a very low maintenance overhead. I consider this far superior to the brittle approach of a Service Locator. In the end, you'll spend less time maintaining that code than if you go for a Service Locator. It's a classic case of a bigger up-front investment that pays huge dividends over time - just like TDD.
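To illustrate the first point, here's a minimal sketch of what that single Virtual Proxy could look like. IScreenLoader is a made-up abstraction for the sake of the example; the concrete loader (and whatever assembly it lives in) isn't touched until a screen is actually requested:

public interface IScreenLoader
{
    object Load(string screenName);
}

public class LazyScreenLoader : IScreenLoader
{
    private readonly Lazy<IScreenLoader> lazy;

    public LazyScreenLoader(Lazy<IScreenLoader> lazy)
    {
        if (lazy == null)
        {
            throw new ArgumentNullException("lazy");
        }
        this.lazy = lazy;
    }

    public object Load(string screenName)
    {
        // The real loader is only created on first use.
        return this.lazy.Value.Load(screenName);
    }
}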
2012-10-23 00:36 UTC
Hi Mark,
Just found your blog while doing additional research for a conference talk I'm about to give. The content here is pure gold! I'll make sure to read all of it.
Your response to Marcel above is exactly what I was thinking of when I read his comment. I'm a professional Android developer and I confirm that the scenario described in Marcel's comment is best handled in the way you suggested.
I would like to know your expert opinion on this statement of mine: "When using DI, if a requirement for lazy loading of an injected service(s) arises, it should be treated as a 'code smell'. Most probably, this requirement is due to a service(s) that does too much work upon construction. If this service can be refactored, then you should favor refactoring over lazy loading. If the service can't be refactored (e.g. part of the framework), then you should check whether the client can be refactored in a way that eliminates the need for lazy loading. Use lazy loading of injected services only as a last resort in order to compensate for an unfortunate design that you can't change".
Does the above statement make sense in your opinion?
2017-03-19 05:11 UTC

Vasiliy, thank you for writing. That statement sounds reasonable. FWIW, objects that do too much work during construction violate Nikola Malovic's 4th law of IoC. Unfortunately, the original links to Nikola Malovic's laws of IoC no longer point at the original material, but I described the fourth law in an article called Injection Constructors should be simple.

If you want to give some examples of possible refactorings, you may want to take a look at the Decoraptor pattern. Additionally, if you're dealing with a third-party component, you can often create an Adapter that behaves nicely during construction.

2017-03-19 11:53 UTC

Mark, thank you for such a quick response and additional information and references! I would vote for Decoraptor to be included in the next edition of GOF's book.

BTW, I think I found Nikola's article that you mentioned here.

2017-03-19 13:30 UTC

Injection Constructors should be simple

Thursday, 03 March 2011 14:18:54 UTC

The Constructor Injection design pattern is an extremely useful way to implement loose coupling. It's easy to understand and implement, but sometimes perhaps a bit misunderstood.

The pattern itself is easily described through an example:

private readonly ISpecimenBuilder builder;
 
public SpecimenContext(ISpecimenBuilder builder)
{
    if (builder == null)
    {
        throw new ArgumentNullException("builder");
    }
 
    this.builder = builder;
}

The SpecimenContext constructor statically declares that it requires an ISpecimenBuilder instance as an argument. To guarantee that the builder field is an invariant of the class, the constructor contains a Guard Clause before it assigns the builder parameter to the builder field. This pattern can be repeated for each constructor argument.

It's important to understand that when using Constructor Injection the constructor should contain no additional logic.

An Injection Constructor should do no more than receiving the dependencies.

This is simply a rephrasing of Nikola Malovic's 4th law of IoC. There are several reasons for this rule of thumb:

  • When we compose applications with Constructor Injection we often create substantial object graphs, and we want to be able to create these graphs as efficiently as possible. This is Nikola's original argument.
  • In the odd (and not recommended) cases where you have circular dependencies, the injected dependencies may not yet be fully initialized, so an attempt to invoke their members at that time may result in an exception. This issue is similar to the issue of invoking virtual members from the constructor. Conceptually, an injected dependency is equivalent to a virtual member.
  • With Constructor Injection, the constructor's responsibility is to demand and receive the dependencies. Thus, according to the Single Responsibility Principle (SRP), it should not try to do something else as well. Some readers might argue that I'm misusing the SRP here, but I think I'm simply applying the underlying principle in a more granular context.
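To make the rule concrete, here's a small before/after sketch (MessageService and IMessageStore are names made up for the example). The first constructor violates the rule by doing real work up front; the second only demands and receives its dependency:

// Don't: work happens during construction.
public MessageService(IMessageStore store)
{
    if (store == null)
    {
        throw new ArgumentNullException("store");
    }
    this.messages = store.LoadAll(); // potentially slow, and may throw
}

// Do: only demand and receive the dependency.
public MessageService(IMessageStore store)
{
    if (store == null)
    {
        throw new ArgumentNullException("store");
    }
    this.store = store;
}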

There's no reason to feel constrained by this rule, as in any case the constructor is an implementation detail. In loosely coupled code, the constructor is not part of the overall application API. When we consider the API at that level, we are still free to design the API as we'd like.

Please notice that this rule is contextual: it applies to Services that use Constructor Injection. Entities and Value Objects tend not to use DI, so their constructors are covered by other rules.


Comments

Nice post. Sometimes I find it useful to have an IInitializable interface and instruct the container to call the Initialize method after instantiation. What do you think about this?
2011-03-03 15:01 UTC
That's very rarely a good idea. The problem with an Initialize method is the same as with Property Injection (A.K.A. Setter Injection): it creates a temporal coupling between the Initialize method and all other members of the class. Unless you truly can invoke any other member of the class without first invoking the Initialize method, such API design is deceitful and will lead to run-time exceptions. It also becomes much harder to ensure that the object is always in a consistent state.

Constructor Injection is a far superior pattern because it enforces that required dependencies will be present. Property Injection, on the other hand, implies that the dependency is optional, which is rarely the case.
2011-03-03 16:04 UTC
What about wiring events in the constructor? For example:

this.foo = foo;
this.foo.SomeEvent += HandleSomeEvent;
2011-03-03 19:28 UTC
When you look at what happens on the IL level, subscribing to an event is just another method call, so the same arguments as above still apply.

Keep in mind, however, that the above constitutes a guideline. It's not an absolute truth. I rarely use events, but it happens from time to time, and I can think of at least one case where I've done just what you suggest. I also occasionally break the above rule in other ways, but I always pause and consider the implications and whether there's a better alternative - often there is.
2011-03-03 19:50 UTC

Interfaces are access modifiers

Monday, 28 February 2011 13:19:04 UTC

.NET developers should be familiar with the standard access modifiers (public, protected, internal, private). However, in loosely coupled code we can regard interface implementations as a fifth access modifier. This concept was originally introduced to me by Udi Dahan the only time I've had the pleasure of meeting him. That was many years ago and while I didn't grok it back then, I've subsequently come to appreciate it quite a lot.

Although I can't take credit for the idea, I've never seen it described, and it really deserves to be.

The basic idea is simple:

If a consumer respects the Liskov Substitution Principle (LSP), the only visible members are those belonging to the interface. Thus, the interface represents a dimension of visibility.

As an example, consider this simple interface from AutoFixture:

public interface ISpecimenContext
{
    object Resolve(object request);
}

A well-behaved consumer can only invoke the Resolve method even though an implementation may have additional public members:

public class SpecimenContext : ISpecimenContext
{
    private readonly ISpecimenBuilder builder;
 
    public SpecimenContext(ISpecimenBuilder builder)
    {
        if (builder == null)
        {
            throw new ArgumentNullException("builder");
        }
 
        this.builder = builder;
    }
 
    public ISpecimenBuilder Builder
    {
        get { return this.builder; }
    }
 
    #region ISpecimenContext Members
 
    public object Resolve(object request)
    {
        return this.Builder.Create(request, this);
    }
 
    #endregion
}

Even though the SpecimenContext class defines the Builder property, as well as a public constructor, any consumer respecting the LSP will only see the Resolve method.

In fact, the Builder property on the SpecimenContext class mostly exists to support unit testing because I sometimes need to assert that a given instance of SpecimenContext contains the expected ISpecimenBuilder. This doesn't break encapsulation since the Builder is exposed as a read-only property, and it more importantly doesn't pollute the API.

To support unit testing (and whichever other clients might be interested in the encapsulated ISpecimenBuilder) we have a public property that follows all framework design guidelines. However, it's essentially an implementation detail, so it's not visible via the ISpecimenContext interface.
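Such a test might look like this minimal sketch, written in the xUnit.net style; the Moq stub is only an assumption - any ISpecimenBuilder Test Double would do:

[Fact]
public void BuilderIsCorrect()
{
    var expected = new Mock<ISpecimenBuilder>().Object;
    var sut = new SpecimenContext(expected);

    Assert.Same(expected, sut.Builder);
}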

When writing loosely coupled code, I've increasingly begun to see the interfaces as the real API. Most other (even public) members are pure implementation details. If the members are public, I still demand that they follow the framework design guidelines, but I don't consider them parts of the API. It's a very important distinction.

The interfaces define the bulk of an application's API. Most other types and members are implementation details.

An important corollary is that constructors are implementation details too, since they can never be part of any interface.

In that sense we can regard interfaces as a fifth access modifier - perhaps even the most important one.


Comments

Taking this a step further, I wonder if it would make sense to prefer explicit interface implementations, separating the interface from any additional functionality in the type in a way so that you could not invoke Resolve on a SpecimenContext directly, but only on objects that are typed as ISpecimenContext? I am not sure whether I like that idea or not, but it would help enforce the logical separation between the interface and its implementation.
2011-02-28 14:12 UTC
Mark,

Thanks for this post.

I have to disagree.

Any public type or member becomes a part of your API. Consumer code will possibly link to them, they will become a part of your legacy, and you will need to provide backward-compatibility to any part of the public API.

If you don't want a type, constructor or method to be part of the API, you have to make it internal.

Anyone is free to expose its entire API as a set of interfaces (as in COM), but the only thing that makes a semantic part of the API is whether it is public or not. Making a constructor public and then telling "oh you know, you can't use it in your code, it's an implementation detail" breaks the fundamental principles of object-oriented programming.

Interfaces exist to allow for late binding between the consumer and the provider of a set of semantics. Visibility exists to segregate contractual semantics from implementation details. These concepts are related but actually orthogonal: you can have internal and public interfaces too.

Bottom line: what makes your public API is the 'public' keyword.


-gael


2011-02-28 16:32 UTC
I must admit that I haven't fully thought through whether or not interfaces ought to be explicit or implicit, but my gut feeling is that implicit implementations (as shown in the example) are fine.

The thing is: when I unit test the concrete implementations I often tend to declare the System Under Test (SUT) as the concrete type because I want to exercise an interface member, but verify against a concrete member. If I were to use an explicit implementation this would require a cast in each test. If there was an indisputable benefit to be derived from explicit implementations I wouldn't consider this a valid argument, but in the absence of such, I'd tend towards making unit testing as easy as possible.

I think the argument that explicit interfaces help enforce the logical separation is valid, but not by a sufficiently high degree...
2011-02-28 20:49 UTC
Gael

Thank you for your comment.

Please note that the context of the blog post is loosely coupled code. When composing classes using Dependency Injection (DI), consumers will never see anything else than the interface members. Thus, the API from which we compose an application contains mainly the interfaces. These are the moving parts from which we can define interaction.

I agree that if you consider only a single concrete type at a time, all public (and protected) members are part of the API of that type, but that's not what I'm talking about. In DI it's implicitly discouraged to invoke public constructors of any Services because once you do that, you tightly couple a consumer to a specific implementation.
2011-02-28 20:57 UTC
I see from AutoMapper and the FDG references you're using the ILikePrefixes convention. I've been around the houses with it and have read most of the 'debates' on SO about it, but I keep coming back to the Growing Object Oriented Software Guided by Tests sidebar that says I is an antipattern and agreeing with it, even in .NET.

Any comment on the above? Ever tried going I-less?

Does it have any influence/overlap with your thoughts in the [excellent food-for-thought] article?
2011-03-01 01:15 UTC
Ruben

Thank you for writing.

In my opinion, the most important goal for coding conventions is to reduce friction when reading (and writing) code. Thus, I generally try to write Clean Code, but another important guide is the POLA. When it comes to the debate around the Hungarian I in interface names, I think that the POLA weighs heavier than the strictly logical argument against it.

However illogical it is, (close to) 10 years of convention causes us to expect the I to be there; when it's not, it causes unnecessary friction.
2011-03-01 08:42 UTC
Harry Dev #
If the "Builder" property is only used for testing why not use the "internal" access modifier instead, and allow the testing assembly access to this using the "[assembly: InternalsVisibleTo("XXX.Test")]" attribute?

If the "Builder" property really is an implementation detail there should be no reason to expose it. Although, as you say it does not pollute much since it is a readonly property.
2011-03-02 09:51 UTC
First and foremost I consider the InternalsVisibleTo feature an abomination. We should only unit test the public members of our code.

As I wrote in another answer on this page, the Builder property is most certainly part of the public API of the concrete SpecimenContext class. It doesn't pollute the class in any way because it's an integral part of what the SpecimenContext class does.

There's no reason to expect that the Builder property is used only for testing. It's true that it was driven into existence by TDD, but it makes sense as part of the class' API and is available to other potential consumers. In the rare cases that a third-party consumer wants to use the SpecimenContext directly, it can access the Builder property as well. It wouldn't be able to do that if the property was internal.

However, the Builder property in no way belongs on the interface because that would be a leaky abstraction, so while it doesn't pollute the class, it would pollute the interface.
2011-03-02 10:35 UTC
Ruben, I have to agree with Mark. The I prefix is a convention that is expected and I doubt anyone really considers it hungarian notation (even though it fits the definition).

If you're providing an API and there is a method expecting 'SomeType', most people will attempt to instantiate an instance of SomeType, at which point they will receive red squiggles and will then have to do some investigation to determine what's going on. Even if it only takes 5 seconds to figure it out, you've violated the POLA and confused the developer, who now has to solve yet another problem and loses momentum. There are many other potentially confusing scenarios that can arise from not clearly marking an interface as such.

No one expects to see strings prefixed with 'str', but clearly identifying an interface is expected. There are many "rules" that should be followed, but as with all rules, there are exceptions. Most of them are for the sake of developers. It takes me no time to see a type and recognize it as an interface - and therefore know what I can and can't do with it - because it's prefixed with an I. But unless I'm already familiar with a framework/API that does not use the I convention, I would need to spend time learning by trial and error. Wasted time.

If for no other reason, then do it because Microsoft uses the I convention in the .NET framework and that is what .NET developers expect, even if it is "incorrect".
2011-03-02 16:29 UTC

Creating general populated lists with AutoFixture

Tuesday, 08 February 2011 14:53:16 UTC

In my previous post I described how to customize a Fixture instance to populate lists with items instead of returning empty lists. While it's pretty easy to do so, the drawback is that you have to do it explicitly for every type you want to influence. In this post I will follow up by describing how to enable some general conventions that simply populates all collections that the Fixture resolves.

This post describes a feature that will be available in AutoFixture 2.1. It's not available in AutoFixture 2.0, but is already available in the code repository. Thus, if you can't wait for AutoFixture 2.1 you can download the source and build it.

Instead of having to create multiple customizations for IEnumerable<int>, IList<int>, List<int>, IEnumerable<string>, IList<string>, etc. you can simply enable these general conventions as easily as this:

var fixture = new Fixture()
    .Customize(new MultipleCustomization());

Notice that enabling conventions for populating sequences and lists with ‘many' items is an optional customization that you must explicitly add.

This feature must be explicitly enabled. There are several reasons for that:

  • It would be a breaking change if AutoFixture suddenly started to behave like this by default.
  • The MultipleCustomization targets not only concrete types such as List<T> and Collection<T>, but also interfaces such as IEnumerable<T>, IList<T> etc. Thus, if you also use AutoFixture as an Auto-Mocking container, I wanted to provide the ability to define which customization takes precedence.

With that simple customization enabled, all requested IEnumerable<T> are now populated. The following will give us a finite, but populated list of integers:

var integers = 
    fixture.CreateAnonymous<IEnumerable<int>>();

This will give us a populated List<int>:

var list = fixture.CreateAnonymous<List<int>>();

This will give us a populated Collection<int>:

var collection = 
    fixture.CreateAnonymous<Collection<int>>();

As implied above, it also handles common list interfaces, so this gives us a populated IList<T>:

var list = fixture.CreateAnonymous<IList<int>>();

The exact number of ‘many' is as always determined by the Fixture's RepeatCount.
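If you need a different number, you can tweak the fixture before resolving anything - a small sketch:

var fixture = new Fixture();
fixture.RepeatCount = 5;
fixture.Customize(new MultipleCustomization());

var integers =
    fixture.CreateAnonymous<IEnumerable<int>>(); // five items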

As this code is still (at the time of publishing) in preview, I would love to get feedback on this feature.


Creating specific populated lists with AutoFixture

Monday, 07 February 2011 19:49:26 UTC

How do you get AutoFixture to create populated lists or sequences of items? Recently I seem to have been getting this question a lot, and luckily it's quite easy to answer.

Let's first look at the standard AutoFixture behavior and API.

You can ask AutoFixture to create an anonymous List like this:

var list = fixture.CreateAnonymous<List<int>>();

Seen from AutoFixture's point of view, List<int> is just a class like any other. It has a default constructor, so AutoFixture just uses that and returns an instance. You get back an instance, no exceptions are thrown, but the list is empty. What if you'd rather want a populated list?

There are many ways to go about this. A simple, low-level solution is to populate the list after creation:

fixture.AddManyTo(list);

However, you may instead prefer getting a populated list right away. This is also possible, but before we look at how to get there, I'd like to point out a feature that surprisingly few users notice. You can create many anonymous specimens at once:

var integers = fixture.CreateMany<int>();

Armed with this knowledge, as well as the knowledge of how to map types, we can now create this customization to map IEnumerable<int> to CreateMany<int>:

fixture.Register(() => fixture.CreateMany<int>());

The Register method is really a generic method, but since we have type inference, we don't have to write it out. However, since CreateMany<int>() returns IEnumerable<int>, this is the type we register. Thus, every time we subsequently resolve IEnumerable<int>, we will get back a populated sequence.
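Written with the type argument spelled out, the registration above is equivalent to this:

fixture.Register<IEnumerable<int>>(() => fixture.CreateMany<int>());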

Getting back to the original List<int> example, we can now customize it to a populated list like this:

fixture.Register(() =>
    fixture.CreateMany<int>().ToList());

Because the ToList() extension method returns List<T>, this call registers List<int> so that we will get back a populated list of integers every time the fixture resolves List<int>.

What about other collection types that don't have a nice LINQ extension method? Personally, I never use Collection<T>, but if you wanted, you could customize it like this:

fixture.Register(() =>
    new Collection<int>(
        fixture.CreateMany<int>().ToList()));

Since Collection<T> has a constructor overload that takes IList<T>, we can customize the type to use this specific overload and populate it with ‘many' items.

Finally, we can combine all this to map from collection interfaces to populated lists. As an example, we can map from IList<int> to a populated List<int> like this:

fixture.Register<IList<int>>(() => 
    fixture.CreateMany<int>().ToList());

When we use the Register method to map types we can no longer rely on type inference. Instead, we must explicitly register IList<int> against a delegate that creates a populated List<int>. Because List<int> implements IList<int> this compiles. Whenever this fixture instance resolves IList<int> it will create a populated List<int>.
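As a quick usage sketch, resolving the interface now yields a populated list:

var list = fixture.CreateAnonymous<IList<int>>();
// list is a List<int> populated with 'many' (RepeatCount) integers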

All of this describes what you can do with the strongly typed API available in AutoFixture 2.0. It's easy and very flexible, but the only important drawback is that it's not general. All of the customizations in this post specifically address lists and sequences of integers, but not lists of any other type. What if you would like to expand this sort of behavior to any List<T>, IEnumerable<T> etc?

Stay tuned, because in the next post I will describe how to do that.


The BCL already has a Maybe monad

Friday, 04 February 2011 13:11:34 UTC

During the last couple of weeks I've been very interested in using a Maybe monad with AutoFixture's Kernel code, but although many examples can be found on the internet, they remain samples. Rinat Abdullin and Zack Owens both posted samples, but I particularly like Mike Hadlow's series about Monads in C# because he also explains how to use LINQ with monads such as the Maybe monad.

As I really wanted a Maybe monad for AutoFixture, I first thought about simply implementing it directly in the AutoFixture source. However, I found it too arbitrary to put such a general purpose programming construct into a specific library such as AutoFixture. My next thought was to create a small open source project just for that single purpose, but then I thought about the problem a bit more…

The BCL sort of already has a Maybe monad - you just need to recognize it as such.

What is a Maybe monad really? If you really distill it, it's just a type that either contains a value, or doesn't contain a value. In other words, it's a type that represents a particular range: a set with either zero or one items. That's just a special case of a more general range or collection, and we already have LINQ covering those constructs.

Here it is: the Maybe monad from the BCL (encapsulated in a nice extension method):

public static class LightweightMaybe
{
    public static IEnumerable<T> Maybe<T>(this T value)
    {
        return new[] { value };
    }
}

Obviously, this method returns a Maybe with a value, but we can just as easily represent Nothing with an empty array.
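For completeness, here's a sketch of how Nothing might look with the same trick (the helper names are mine, not part of any library):

public static class LightweightMaybeNothing
{
    // Nothing: a sequence with zero elements.
    public static IEnumerable<T> Nothing<T>()
    {
        return Enumerable.Empty<T>();
    }

    // Treats null as Nothing and anything else as a single-element sequence.
    public static IEnumerable<T> NullToMaybe<T>(this T value) where T : class
    {
        if (value == null)
        {
            return Enumerable.Empty<T>();
        }
        return new[] { value };
    }
}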

With my ‘new' Maybe monad, I can now write code like this (where request is a System.Object instance):

return (from t in request.Maybe().OfType<Type>()
        let typeArguments = t.GetGenericArguments()
        where typeArguments.Length == 1
        && typeof(IList<>)
            == t.GetGenericTypeDefinition()
        select context.Resolve(typeof(List<>)
            .MakeGenericType(typeArguments)))
        .DefaultIfEmpty(new NoSpecimen(request))
        .SingleOrDefault();

You may think that this looks dense, but before the refactoring the code looked like this:

var type = request as Type;
if (type == null)
{
    return new NoSpecimen(request);
}
 
var typeArguments = type.GetGenericArguments();
if (typeArguments.Length != 1)
{
    return new NoSpecimen(request);
}
 
if (typeof(IList<>) != 
    type.GetGenericTypeDefinition())
{
    return new NoSpecimen(request);
}
 
return context.Resolve(typeof(List<>)
    .MakeGenericType(typeArguments));

Notice that in this more traditional approach involving Guard Clauses, I have to construct a new NoSpecimen object in three different places, thus violating the DRY principle. I like not having all those if/return blocks in the code.


Comments

That's a very neat idea. Now I think of it, it's a pattern you see used a lot in Haskell too.

In your first code block, rather than returning:

return new[]{value};

It would be nicer to do this:

return Enumerable.Single(value);

:)
2011-02-04 20:38 UTC
Yes, I believe the concept of a Maybe monad originates from Haskell or a similar language (but I can't remember the specific details).

Using Enumerable.Single(value) will not work because it takes an IEnumerable<T> and returns a T. We want the exact opposite: take a T and return IEnumerable<T>.
2011-02-04 21:33 UTC
Can't say I know it inside out - I have never used DefaultIfEmpty IRL - but should the .SingleOrDefault() be a .Single() or a [0]? (I'd much favor a .Single() to be honest)

@other commenter: Enumerable.Repeat(value, 1) does the trick you want. I sometimes cruft up a .One helper method, but I believe there's a more accepted name for it in the excellent RealWorldFunctionalProgramming in C# and F# book I don't have to hand (the one that makes Mark's head hurt :D)
2011-02-04 23:29 UTC
Yes, you are right - Single() is enough. My mistake :)

It's true that there are many ways to create an IEnumerable with a single element.
2011-02-05 08:06 UTC
I agree that this functional approach is DRY, but I still find it a bit hard to follow. Instead, why not simply refactor the imperative code to a DRY, more intent-revealing (but still imperative) version? This is what I propose:

var type = request as Type;

bool requestIsAType = type != null;
bool withOneGenericArgument = requestIsAType && type.GetGenericArguments().Length == 1;
bool isAGenericList = requestIsAType && type.GetGenericTypeDefinition() == typeof(IList<>);

if (requestIsAType && isAGenericList && withOneGenericArgument)
{
    return context.Resolve(typeof(List<>).MakeGenericType(type.GetGenericArguments()));
}
else
{
    return new NoSpecimen(request);
}

Doesn’t this just read like a functional spec? “When the request is a Type of a generic list with one generic argument, then … otherwise …”.

Cheers
2011-02-05 09:07 UTC
Did you test that code? I'm pretty sure it has defects.

If you call GetGenericTypeDefinition() on a type which is not generic, an exception will be thrown. This could happen if, for instance, I were to invoke the method with request = typeof(object).

If you want to play around with this, just pull the AutoFixture source and revert to revision 391 and try it out on the ListRelay class. It has pretty comprehensive test coverage.
2011-02-05 09:31 UTC
Did I test that code? Of course not! ;-) Just trying to prove what a bit of refactoring can do :-)
2011-02-05 22:50 UTC
Yes, but the point is that it's those little things that end up making a more procedural refactoring less than readable. In any case, 'readability' of code is highly subjective so obviously YMMV.
2011-02-06 09:35 UTC
3P #
For me the LINQ version is almost unreadable. If it takes me more than a few seconds to understand the code, I think that code is not finished. Putting something in one line is not "Clean Code", I think.
2011-04-03 19:03 UTC
I came late to comment due to a tweet exchange with the author. But I want to add that treating null as something that should be semantically avoided in code is not only a matter of readability but also a symptom of good design. I totally agree with Mark Seemann, who also kindly supplied this Maybe monad implementation. Excellent work.
2013-03-06 08:18 UTC

Scalable doesn't mean fast

Monday, 24 January 2011 12:03:16 UTC

Recently I spent a couple of days with Thomas Jespersen who's working towards a launch of spiir.dk - on Windows Azure. The reason I got to talk to him was to see if I could help with some performance issues he had with Azure Table Storage.

The scenario is really simple: the application needs to load all of a user's bank transactions into memory to enable pretty advanced sorting and filtering. That sounds like a lot, but really isn't more than approximately 200 kB of data retrieved through a single query - so: there are no 1+N problems in play here, but even so it originally took more than two seconds. That's a bit long to wait before you can even start rendering a web page.

By tweaking his partitioning strategy and using parallel queries, Thomas managed to bring down the data retrieval time to approximately one second. Although stress testing indicated that this duration was very stable, even under load, it is still too slow. So we met to see what could be done.

Thomas had done a great job tweaking the query, so I couldn't really suggest some sort of secret API that would make it run significantly faster. Basically, we have to deal with Azure storage being based on REST and that there are a lot of things about run-time behavior we cannot control. Apart from designing a proper partitioning strategy, we can't add indexes to Azure Table Storage.

It was time to take a different approach.

As far as I can tell, Windows Azure is designed to be very scalable. However, just because scalability implies that you can handle an insane amount of work within acceptable time frames, it doesn't mean that you can extrapolate it to mean that under a light load, everything will be lightning fast. That's not the case at all.

Scalability means that performance characteristics remain stable from light to heavy load.

Consequently this means that if performance is adequate under heavy load, it will also be adequate under a light load. Azure Storage is first and foremost designed to be scalable, and as a second priority, as fast as possible.

As Thomas discovered, Azure Table Storage isn't particularly fast.

It may be a masochistic side of me that I'm not otherwise aware of, but I actually appreciate that. It makes us reassess our most basic assumptions.

The data that Thomas needs to read isn't particularly dynamic, so what if we take a snapshot of it? In short, we loaded all of a user's data into memory and serialized it to Azure Blob Storage.

Loading the same data from a binary serialized Blob took only 1/6 of the time it did to load it from Table Storage.

As it turns out, Thomas doesn't even need all the columns from the Table to populate the view, so we could even make the serialized Blob smaller yet.
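Roughly, the snapshot approach looks like the following sketch. The names (Transaction, IBlobCache) are made up for illustration and aren't from Thomas' code base, and the usual System.IO and System.Runtime.Serialization.Formatters.Binary namespaces are assumed; the point is only that a single binary read from Blob Storage replaces the Table Storage query:

// Hypothetical abstraction over Azure Blob Storage.
public interface IBlobCache
{
    void Save(string blobName, byte[] content);
    byte[] Load(string blobName);
}

public class TransactionSnapshot
{
    private readonly IBlobCache cache;

    public TransactionSnapshot(IBlobCache cache)
    {
        this.cache = cache;
    }

    // Write the snapshot after reading the transactions from Table Storage.
    // Assumes the (hypothetical) Transaction type is marked [Serializable].
    public void Write(string userId, List<Transaction> transactions)
    {
        using (var stream = new MemoryStream())
        {
            new BinaryFormatter().Serialize(stream, transactions);
            this.cache.Save(userId, stream.ToArray());
        }
    }

    // Subsequent page loads read the blob instead of querying Table Storage.
    public List<Transaction> Read(string userId)
    {
        using (var stream = new MemoryStream(this.cache.Load(userId)))
        {
            return (List<Transaction>)new BinaryFormatter().Deserialize(stream);
        }
    }
}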

At this point, however, we now have two representations of the same data: The original data in Table Storage, and a persistent cache in Blob Storage. The remaining challenge is to figure out how to keep these in sync.

This may seem like a hack, but it really represents a paradigm shift. Letting go of ACID opens up a lot of new opportunities.

Actually, I spent most of the next day trying to convince Thomas that CQRS would be the best approach, or that we could at least pick up some of the techniques from asynchronous, messaging-based architectures, but that's another story.

The moral here is that on Azure, things may be slower than you are used to, but storage is (relatively) cheap, so denormalization can save you a lot of execution time.


Comments

I don't see how this is different in Azure vs. other solutions. It is basic knowledge that scalability introduces overhead in most simple cases to ensure more consistent response times under load; often this overhead is caused by distribution in place of centralization, in your case regarding distributed storage. Maybe I'm missing the point, but you seem to be stating the obvious, like "water travels downhill" :-)

Your comment about letting go of ACID strikes me as odd. Basically you are just adding a cache in front of your table storage, like people have done for ages, so wherein lies the paradigm shift in "letting go of ACID"? Are you speaking to people who have never cached data? Even people not using a cache will let go of ACID in most cases, because they keep stale data in objects in their application and in the best case do optimistic concurrency checks and in the worst case just let the last write win.
2011-01-24 20:35 UTC
Jakob, if you already knew all of those things, then this blog post wasn't meant for you. I knew them too, but apparently many people don't, so I wrote for their benefit. Trust me: for many developers, this is far from obvious.

The same goes for the ACID comment. People don't seem to have too big of a problem with in-memory caches because somehow they know that these can't possibly be consistent. However, as soon as you start writing to a persistent store, you encounter knee-jerk reactions that all persisted data must be written and updated within transactions.

I'm not disagreeing with you. This is the basic premise behind CQRS and other scalable architectures. It's just not particularly widely known yet.
2011-01-24 20:46 UTC

My Christmas challenge has a winner

Saturday, 01 January 2011 13:53:33 UTC

A week ago I concluded Microsoft Denmark's 2010 .NET Community Christmas Calendar with a challenge about resolving closed types with MEF. As 2011 came around, the deadline ended, so it's now time to pick the winner.

I didn't get a lot of entries, which can be interpreted in at least one (or more) of the following ways:

  • The challenge was too difficult
  • The challenge wasn't interesting
  • The prize wasn't attractive
  • People had other things to do during the holidays

Whatever the reason, it made my task of picking a winner that much easier. The best Danish entry came from Daniel Volder Guarnieri who cheated a bit by partially hard-coding the composition of Mayonnaise into the ContainerBuilder. As I wrote in the original challenge, there are many ways to tackle the challenge, and one was to take the unit tests very literally :)

However, honorable mention must go to Boyan Mihaylov who participated just for the honor. He took a more general approach similar to Fluent MEF. This involves implementing a completely new ComposablePartCatalog with associated ComposablePartDefinition and ComposablePart implementations - not a trivial undertaking.

Kudos to Boyan and congratulations to Daniel. My thanks for your submissions, and a happy new year to all my readers!


Comments

Could you post the solution?
2011-01-01 21:00 UTC

Challenge: Resolve closed types with MEF

Friday, 24 December 2010 09:29:06 UTC

For my international reader, a bit of context is in order for this post: Microsoft Denmark sponsors a series of Christmas challenges known as the Microsoft Christmas Calendar. Different bloggers alternate hosting a coding challenge for the day, and Microsoft sponsors the prizes.

Today I have the honor of hosting the last challenge for 2010. Contestants from the Danish .NET community have the opportunity to win a Molecular Gastronomy Starter Kit - if any of my international readers feel like joining in, they are welcome, but (probably) not eligible for the prize.

The challenge #

The Managed Extensibility Framework (MEF) enables us to compose applications by annotating types and members with [Import] and [Export] attributes, but what can you do when you have types without these attributes and you can't add the attributes?

For example, how can we compose Mayonnaise from these three classes?

public sealed class EggYolk { }
 
public sealed class OliveOil { }
 
public sealed class Mayonnaise
{
    private readonly EggYolk eggYolk;
    private readonly OliveOil oil;
 
    public Mayonnaise(EggYolk eggYolk, OliveOil oil)
    {
        if (eggYolk == null)
        {
            throw new ArgumentNullException("eggYolk");
        }
        if (oil == null)
        {
            throw new ArgumentNullException("oil");
        }
 
        this.eggYolk = eggYolk;
        this.oil = oil;
    }
 
    public EggYolk EggYolk
    {
        get { return this.eggYolk; }
    }
 
    public OliveOil Oil
    {
        get { return this.oil; }
    }
}

The challenge is to come up with a good solution to that problem. Here are the formal rules:

  • The unit test suite at the end of this post must pass.
  • You are not allowed to edit the unit tests.
  • You are only allowed to add one (1) using directive to the unit test file to reference the namespace of your proposed solution.
  • You must work from the Visual Studio 2010 solution attached to this post. Add a new project that contains your solution. Send me the solution in a .zip file to enter the contest.
  • You are allowed to implement your solution in any language you would like as long as it compiles and runs from Visual Studio 2010 Premium.
  • The winner is chosen by my subjective judgment, but I will emphasize clean code and design. A tip: attempt to get as good scores as possible from Visual Studio's Code Analysis and Code Metrics. Good scores do not guarantee that you win, but bad scores will most likely ensure that you don't.
  • Since many of you are on Christmas vacation, the deadline is this year. As long as you submit a solution in 2010 (Danish time) you're a contestant.

There are lots of different ways to skin this cat, so I'm looking forward to your submissions to see all your creative solutions.

This unit test suite is the specification:

using System.ComponentModel.Composition.Hosting;
using Ploeh.Samples.MeffyXmas.MenuModel;
using Xunit;
 
namespace Ploeh.Samples.MeffyXmas.MefMenu.UnitTest
{
    public class ContainerBuilderFacts
    {
        [Fact]
        public void DefaultContainerCorrectlyResolvesOliveOil()
        {
            CompositionContainer container = new ContainerBuilder()
                .Build();
            var oil = container.GetExportedValue<OliveOil>();
            Assert.NotNull(oil);
        }
 
        [Fact]
        public void DefaultContainerCorrectlyResolvesEggYolk()
        {
            CompositionContainer container = new ContainerBuilder()
                .Build();
            var yolk = container.GetExportedValue<EggYolk>();
            Assert.NotNull(yolk);
        }
 
        [Fact]
        public void DefaultContainerCorrectlyResolvesMayonnaise()
        {
            CompositionContainer container = new ContainerBuilder()
                .Build();
            var mayo = container.GetExportedValue<Mayonnaise>();
            Assert.NotNull(mayo);
        }
 
        [Fact]
        public void DefaultContainerReturnsSingletonMayonnaise()
        {
            CompositionContainer container = new ContainerBuilder()
                .Build();
            var mayo1 = container.GetExportedValue<Mayonnaise>();
            var mayo2 = container.GetExportedValue<Mayonnaise>();
            Assert.Same(mayo1, mayo2);
        }
 
        [Fact]
        public void WithTransientMayonnaiseReturnTransientMayonnaise()
        {
            CompositionContainer container = new ContainerBuilder()
                .WithNonSharedMayonnaise()
                .Build();
            var mayo1 = container.GetExportedValue<Mayonnaise>();
            var mayo2 = container.GetExportedValue<Mayonnaise>();
            Assert.NotSame(mayo1, mayo2);
        }
 
        [Fact]
        public void TransientMayonnaiseByDefaultContainsSingletonEggYolk()
        {
            CompositionContainer container = new ContainerBuilder()
                .WithNonSharedMayonnaise()
                .Build();
            var mayo1 = container.GetExportedValue<Mayonnaise>();
            var mayo2 = container.GetExportedValue<Mayonnaise>();
            Assert.Same(mayo1.EggYolk, mayo2.EggYolk);
        }
 
        [Fact]
        public void TransientMayonnaiseByDefaultContainsSingletonOil()
        {
            CompositionContainer container = new ContainerBuilder()
                .WithNonSharedMayonnaise()
                .Build();
            var mayo1 = container.GetExportedValue<Mayonnaise>();
            var mayo2 = container.GetExportedValue<Mayonnaise>();
            Assert.Same(mayo1.Oil, mayo2.Oil);
        }
 
        [Fact]
        public void TransientMayonnaiseCanHaveTransientEggYolk()
        {
            CompositionContainer container = new ContainerBuilder()
                .WithNonSharedMayonnaise()
                .WithNonSharedEggYolk()
                .Build();
            var mayo1 = container.GetExportedValue<Mayonnaise>();
            var mayo2 = container.GetExportedValue<Mayonnaise>();
            Assert.NotSame(mayo1.EggYolk, mayo2.EggYolk);
        }
 
        [Fact]
        public void TransientMayonnaiseCanHaveSingletonOil()
        {
            CompositionContainer container = new ContainerBuilder()
                .WithNonSharedMayonnaise()
                .WithNonSharedEggYolk()
                .Build();
            var mayo1 = container.GetExportedValue<Mayonnaise>();
            var mayo2 = container.GetExportedValue<Mayonnaise>();
            Assert.Same(mayo1.Oil, mayo2.Oil);
        }
 
        [Fact]
        public void TransientMayonnaiseCanHaveTransientOil()
        {
            CompositionContainer container = new ContainerBuilder()
                .WithNonSharedMayonnaise()
                .WithNonSharedOil()
                .Build();
            var mayo1 = container.GetExportedValue<Mayonnaise>();
            var mayo2 = container.GetExportedValue<Mayonnaise>();
            Assert.NotSame(mayo1.Oil, mayo2.Oil);
        }
 
        [Fact]
        public void TransientMayonnaiseCanHaveSingletonEggYolk()
        {
            CompositionContainer container = new ContainerBuilder()
                .WithNonSharedMayonnaise()
                .WithNonSharedOil()
                .Build();
            var mayo1 = container.GetExportedValue<Mayonnaise>();
            var mayo2 = container.GetExportedValue<Mayonnaise>();
            Assert.Same(mayo1.EggYolk, mayo2.EggYolk);
        }
 
        [Fact]
        public void PureTransientMayonnaiseIsTransient()
        {
            CompositionContainer container = new ContainerBuilder()
                .WithNonSharedMayonnaise()
                .WithNonSharedEggYolk()
                .WithNonSharedOil()
                .Build();
            var mayo1 = container.GetExportedValue<Mayonnaise>();
            var mayo2 = container.GetExportedValue<Mayonnaise>();
            Assert.NotSame(mayo1, mayo2);
        }
 
        [Fact]
        public void PureTransientMayonnaiseHasTransientEggYolk()
        {
            CompositionContainer container = new ContainerBuilder()
                .WithNonSharedMayonnaise()
                .WithNonSharedEggYolk()
                .WithNonSharedOil()
                .Build();
            var mayo1 = container.GetExportedValue<Mayonnaise>();
            var mayo2 = container.GetExportedValue<Mayonnaise>();
            Assert.NotSame(mayo1.EggYolk, mayo2.EggYolk);
        }
 
        [Fact]
        public void PureTransientMayonnaiseHasTransientOil()
        {
            CompositionContainer container = new ContainerBuilder()
                .WithNonSharedMayonnaise()
                .WithNonSharedEggYolk()
                .WithNonSharedOil()
                .Build();
            var mayo1 = container.GetExportedValue<Mayonnaise>();
            var mayo2 = container.GetExportedValue<Mayonnaise>();
            Assert.NotSame(mayo1.Oil, mayo2.Oil);
        }
    }
}

ContainerBuilder is the class you must implement, so the unit tests won't compile until you do. Meffy xmas!
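If you want a hint, here is a minimal sketch of one possible shape of such a ContainerBuilder. It assumes the Mayonnaise, EggYolk and Oil types shown earlier in the post, and it leans on the RegistrationBuilder convention API that later shipped with MEF in .NET 4.5, so it is only one way to approach the problem, not necessarily the intended answer:

// A minimal sketch, assuming the Mayonnaise, EggYolk and Oil types from the
// exercise and the RegistrationBuilder API from .NET 4.5
// (System.ComponentModel.Composition.Registration). Not the official answer.
using System.ComponentModel.Composition;
using System.ComponentModel.Composition.Hosting;
using System.ComponentModel.Composition.Registration;
using System.Linq;

public class ContainerBuilder
{
    private readonly CreationPolicy mayonnaisePolicy;
    private readonly CreationPolicy eggYolkPolicy;
    private readonly CreationPolicy oilPolicy;

    public ContainerBuilder()
        : this(CreationPolicy.Shared, CreationPolicy.Shared, CreationPolicy.Shared)
    {
    }

    private ContainerBuilder(
        CreationPolicy mayonnaisePolicy,
        CreationPolicy eggYolkPolicy,
        CreationPolicy oilPolicy)
    {
        this.mayonnaisePolicy = mayonnaisePolicy;
        this.eggYolkPolicy = eggYolkPolicy;
        this.oilPolicy = oilPolicy;
    }

    // Each With* method returns a new builder, so configuration stays immutable.
    public ContainerBuilder WithNonSharedMayonnaise()
    {
        return new ContainerBuilder(CreationPolicy.NonShared, this.eggYolkPolicy, this.oilPolicy);
    }

    public ContainerBuilder WithNonSharedEggYolk()
    {
        return new ContainerBuilder(this.mayonnaisePolicy, CreationPolicy.NonShared, this.oilPolicy);
    }

    public ContainerBuilder WithNonSharedOil()
    {
        return new ContainerBuilder(this.mayonnaisePolicy, this.eggYolkPolicy, CreationPolicy.NonShared);
    }

    public CompositionContainer Build()
    {
        // Describe the otherwise attribute-free types as MEF parts by convention.
        var conventions = new RegistrationBuilder();
        conventions.ForType<Mayonnaise>()
            .SelectConstructor(ctors =>
                ctors.OrderByDescending(c => c.GetParameters().Length).First())
            .SetCreationPolicy(this.mayonnaisePolicy)
            .Export();
        conventions.ForType<EggYolk>()
            .SetCreationPolicy(this.eggYolkPolicy)
            .Export();
        conventions.ForType<Oil>()
            .SetCreationPolicy(this.oilPolicy)
            .Export();

        var catalog = new TypeCatalog(
            new[] { typeof(Mayonnaise), typeof(EggYolk), typeof(Oil) },
            conventions);
        return new CompositionContainer(catalog);
    }
}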



The TDD Apostate

Wednesday, 22 December 2010 13:57:56 UTC

I've been doing Test-Driven Development since 2003. I still do, I still love it, and I still expect to be doing it in the future. Over the years, I've repeatedly returned to the discussion of whether TDD should be regarded as Test-Driven Development or Test-Driven Design. For a long time I've been of the conviction that TDD is both of those. Not so any longer.

TDD is not a good design methodology.

Over the years I've written tons of code with TDD. I've written code where tests blindly drove the design, and I've written code where the design was the result of a long period of deliberation, and the tests were only the manifestations of already well-formed ideas.

I can safely say that the code where tests alone drove the design never turned out particularly well. Although it was testable and, after a fashion, 'loosely coupled', it was still Spaghetti Code in the sense that it lacked overall consistency and good abstractions.

On the other hand, I'm immensely pleased with code like AutoFixture 2.0, which was mostly the result of hours of careful contemplation riding my bike to and from work. It was still written test-first, but the design was well thought out in advance.

This made me think: did I just fail (repeatedly) at Test-Driven Design, or is the overall concept a fallacy?

That's a pretty hard question to answer; what constitutes good design? In the following, let's assume that the SOLID principles are a pretty good indicator of good design. If so, does test-first drive us towards SOLID design?

TDD versus the Single Responsibility Principle #

Does TDD ensure the application of the Single Responsibility Principle (SRP)? This question is easy to answer and the answer is a resounding NO! Nothing prevents us from test-driving a God Class. I've seen many examples, and I've been guilty of it myself.

Constructor Injection is a much better help because it makes SRP violations so painful.
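To illustrate with a made-up example (none of these types come from the post; they are purely illustrative), every new responsibility tends to show up as yet another constructor parameter, and the bloated parameter list is hard to ignore:

// Hypothetical example: the growing constructor advertises the SRP violation.
public interface IOrderRepository { }
public interface IPaymentGateway { }
public interface IInvoiceFormatter { }
public interface IEmailSender { }
public interface IAuditLogger { }

public class OrderService
{
    private readonly IOrderRepository repository;
    private readonly IPaymentGateway paymentGateway;
    private readonly IInvoiceFormatter invoiceFormatter;
    private readonly IEmailSender emailSender;
    private readonly IAuditLogger auditLogger;

    public OrderService(
        IOrderRepository repository,
        IPaymentGateway paymentGateway,
        IInvoiceFormatter invoiceFormatter,
        IEmailSender emailSender,
        IAuditLogger auditLogger)
    {
        this.repository = repository;
        this.paymentGateway = paymentGateway;
        this.invoiceFormatter = invoiceFormatter;
        this.emailSender = emailSender;
        this.auditLogger = auditLogger;
    }

    // Persisting, charging, formatting, mailing and auditing in one class:
    // the constructor alone hints that this is more than one responsibility.
}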

The score so far: 0 points to TDD.

TDD versus the Open/Closed Principle #

Does TDD ensure that we follow the Open/Closed Principle (OCP)? This is a bit harder to answer. I've previously argued that Testability is just another name for OCP, so that would in itself imply that TDD drives OCP. However, the issue is more complex than that, because there are several different ways we can address the OCP:

  • Inheritance
  • Composition

According to Roy Osherove's book The Art of Unit Testing, the Extract and Override technique is a common unit testing trick. Personally, I rarely use it, but if used it will indirectly drive us a bit towards OCP via inheritance.
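For readers who haven't seen the technique, here is a small, made-up illustration of Extract and Override (the DiscountCalculator types are hypothetical): the volatile behaviour is extracted into a virtual member that a test-specific subclass overrides:

// Hypothetical Extract and Override example.
using System;

public class DiscountCalculator
{
    public decimal CalculateDiscount(decimal amount)
    {
        // Friday offers give a flat 10 % discount.
        return this.IsSpecialOfferDay() ? amount * 0.1m : 0m;
    }

    // Extracted seam: tests override this instead of depending on the clock.
    protected virtual bool IsSpecialOfferDay()
    {
        return DateTime.Today.DayOfWeek == DayOfWeek.Friday;
    }
}

public class TestableDiscountCalculator : DiscountCalculator
{
    public bool SpecialOfferDay { get; set; }

    protected override bool IsSpecialOfferDay()
    {
        return this.SpecialOfferDay;
    }
}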

However, we all know that we should favor composition over inheritance, so does TDD drive us in that direction? As I alluded to previously, TDD does tend to drive us towards the use of Test Doubles, which we can view as one way to achieve OCP via composition.

However, another favorite composition technique of mine is to add functionality with a Decorator. This is only possible if the original type implements an interface that can be decorated. It's possible to write a test that forces a SUT to implement an interface, but TDD as a technique in itself does not drive us in that direction.
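A small, hypothetical sketch of what I mean (the repository types are made up for the occasion): caching is added by wrapping an existing implementation behind the same interface, which is only possible because that interface exists to be decorated:

// Hypothetical Decorator example: new behaviour (caching) added by wrapping.
using System.Collections.Generic;

public class Product { }

public interface IProductRepository
{
    Product GetById(int id);
}

public class CachingProductRepository : IProductRepository
{
    private readonly IProductRepository inner;
    private readonly IDictionary<int, Product> cache = new Dictionary<int, Product>();

    public CachingProductRepository(IProductRepository inner)
    {
        this.inner = inner;
    }

    public Product GetById(int id)
    {
        // Serve from the cache when possible; otherwise delegate and remember.
        Product product;
        if (!this.cache.TryGetValue(id, out product))
        {
            product = this.inner.GetById(id);
            this.cache[id] = product;
        }
        return product;
    }
}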

Grudgingly, however, I must admit that TDD still scores half a point against OCP, for a total score so far of ½ point.

TDD versus the Liskov Substitution Principle #

Does TDD drive us towards adhering to the Liskov Substitution Principle (LSP)? Perhaps, but probably not.

Black box testing can't protect us against the SUT attempting to downcast its dependencies, but at least it doesn't particularly pull us in that direction either. When it comes to the SUT's treatment of a dependency, TDD pulls in neither direction.

Can we test-drive interface implementations that inadvertently violate the LSP? Yes, easily. As I discussed in a previous post, the use of Header Interfaces pulls us towards LSP violations. The more members an interface has, the more likely LSP violations become.

TDD can definitely drive us towards Header Interfaces (although they tend to hurt in the long run). I've seen this happen numerous times, and I've been there myself. TDD doesn't properly encourage LSP adherence.
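A made-up example of how this plays out (the repository interface below is hypothetical): a wide interface forces implementations to expose members they can't honour, and throwing from them breaks the contract clients rely on:

// Hypothetical example: a wide 'header' interface invites LSP violations.
using System;

public interface IRepository<T>
{
    T GetById(int id);
    void Insert(T item);
    void Update(T item);
    void Delete(int id);
}

public class ReadOnlyRepository<T> : IRepository<T>
{
    public T GetById(int id)
    {
        // Imagine a real query here.
        return default(T);
    }

    // The remaining members can't be honoured, so the class breaks the
    // contract its clients rely on - a Liskov Substitution violation.
    public void Insert(T item)
    {
        throw new NotSupportedException();
    }

    public void Update(T item)
    {
        throw new NotSupportedException();
    }

    public void Delete(int id)
    {
        throw new NotSupportedException();
    }
}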

The score this round: 0 points for TDD, for a running total of ½ point.

TDD versus the Interface Segregation Principle #

Does TDD drive us towards the Interface Segregation Principle (ISP)? No. It's pretty easy to test-drive a SUT towards a Header Interface, just as we can test-drive towards a God Class.

Another 0 points for TDD. The score is still ½ point to TDD.

TDD versus the Dependency Inversion Principle #

Does TDD drive us towards the Dependency Inversion Principle (DIP)? Yes, it does.

The whole drive towards Testability - the ability to replace dependencies with Test Doubles - drives us exactly in the same direction as the DIP.

Since we tend to mistake such mechanistic loose coupling for proper application design, this probably explains why we, for so long, have confused TDD with good design. However, although I view loose coupling as a prerequisite for good design, it is by no means enough.

For those that still keep score, TDD scores 1 point against DIP, for a total of 1½ points.

TDD does not ensure SOLID #

With 1½ out of 5 possible points I have stated my case. I am convinced that TDD itself does not drive us towards SOLID design. It's definitely possible to use test-first techniques to drive towards SOLID designs, but that will always be an extra effort that supplements TDD; it's not something that is inherently built into TDD.

Obviously you could argue that SOLID in itself is not the end-all, be-all of proper API design. I would agree. However, based on my experience with TDD, I think the conclusion holds. TDD does not drive us towards good design. It is not a design technique.

I still write code test-first because I find it more productive, but I make design decisions out of band. I'm a Test-Driven Design Apostate.


Comments

I think I agree with you. TDD has lots of benefits. But using TDD and only TDD to drive your design is silly. The application of the SOLID principles, being proactive about finding and correcting code smells, and keeping your "domain logic" explicit are much more effective methods for encouraging good design.

I think some of TDD's primary benefits are:
- it raises the quality of your features (less bugs, simpler, more thought out)
- it helps support you as you refactor to improve your design
- it helps people work on existing code

Thanks for the post, very thought provoking!
Kevin
2010-12-22 22:07 UTC
keith ray #
Refactoring is a necessary step in TDD. Recognizing code smells, which are violations of SOLID principles (and other good design principles), and fixing them while the tests are green.
2010-12-23 00:55 UTC
I do think that TDD usually helps drive toward SRP. I suppose that if you have two completely orthogonal responsibilities in a class then that fact might not show up in your tests. But usually I find that I create classes that have two responsibilities that interact in some way and when I write tests to demonstrate how all of those responsibilities should work together I quickly end up with combinatorial explosions which make the tests extremely painful, i.e. when A is true and B is greater than 10 and C is not null and D contains E, then Foo should happen. When I see tests like that I know immediately that I've committed an SRP violation somewhere and I need to go pull some responsibilities out into separate classes, test those responsibilities in isolation, and then build a coordinator that consumes those abstracted components.

I'd give at least half a point for helping with SRP, and probably a full point. Nevertheless, I'd agree with your larger idea that TDD isn't a license to turn your brain off. The TDD system is red-green-REFACTOR, and the refactor part means you need to engage your brain and apply design principles that will make your code stronger. The tests allow you to do that refactoring in relative safety.
2010-12-23 02:18 UTC
Kelly Anderson #
TDD doesn't mean you can turn your brain off. When you're on your bicycle, you come up with ideas that can lead to refactoring your code towards a better design. Refactoring with tests is far superior to refactoring without tests. The biggest difficulty in my experience is when a major refactoring requires the changing of code and test at the same time. I always try to avoid that, but sometimes it's just inevitable. Lots of revision control submissions when doing that...

The Design part of TDD is, imho all about designing your low level APIs. And they are designed for ease of use automatically because when you are writing your tests, you think, "What's the easiest way to invoke the functionality I'm contemplating?" As you have pointed out, ease of use is only one component of good design, so TDD doesn't design the code for you ALL BY ITSELF.

Knowing the limitations of the methodology you are employing is critical to getting the most out of it. While you can ride your bicycle to New York, there are few situations where that is the most practical way of getting there.

-Kelly
2010-12-23 15:05 UTC
What is your opinion on this article?
http://cleancoder.posterous.com/the-transformation-priority-premise
2010-12-23 15:14 UTC
That article seems reasonable (and interesting), but I don't see it as being adverse to what I wrote. Uncle Bob writes about how you evolve an implementation of an API as an interaction between progressively complex tests and the code itself. As far as I can tell, at no point during that blog post does he change the API. In other words, the API design is given a priori.

What I'm discussing here is whether or not you can use tests to blindly design APIs. That's a different perspective.
2010-12-23 15:28 UTC
"What I'm discussing here is whether or not you can use tests to blindly design APIs"

Are there authoritative sources that assert that you can? Or non-authoritative ones? If so, could you cite them, which would help to put the piece into context. Otherwise, I'm just hearing "red-green on its own doesn't produce good design", which I thought was pretty much obvious, given that the "refactor" step is where I've always expected the design part to happen, and the article boils down to "doing it wrong doesn't work".

;-)
2010-12-23 16:11 UTC
Marty Nelson #
IMO, SOLID is more like good construction (or craft) practices than design (maybe the ambiguity of "design" is part of the problem). TDD is driving the design as architecture: what are the needs, metaphors, principles, etc. driving the software. Just as a house may be brilliantly designed and meet the purpose of its occupants, we still need solid construction of the structure itself or the longevity is at risk.

TDD /drives/ through priority and constraint. Good tests fix intent, not implementation, while only allowing implementations to emerge that meet the intent. By definition then, it should not be coupled to elements of construction (or design as you are calling it).
2010-12-23 18:00 UTC
I found your post interesting and well written. Not sure I entirely agree though - it sounds as though "blindly" allowing TDD to shape your design means that you skip out the "refactor" part of the TDD cycle. This is no different to not doing refactoring whether using TDD or not, and therefore it's most likely that your code will not closely follow SOLID.

I appreciate the point that you're making i.e. TDD != a SOLID design, but then I don't think it ever aspired to - that's why there was the "refactor" bit after getting your tests green.

2010-12-24 14:10 UTC
Kevin Stevens #
I think the post falls for one of the biggest fallacies about TDD: that it makes a programmer a better designer of code. Nothing could be further from the truth. If you suck at design before using TDD, you will still suck once you adopt TDD as a practice.

In short, in order to be good at design, learn to design. Then TDD will help you get to a good design faster.
2010-12-24 23:50 UTC
I agree with you and most of the comments. The problem here is when someone tries to turn a development technique into a whole methodology. TDD is just a technique and I don't think it ever intended to be further than that.

At the same time, I think it is clear that TDD can also drive the design of your class/program. But there is quite a big difference between turning that original "can" into a "should" or a "must". And I don't know why so many people are obsessed with changing the original intention of TDD.

Isaac, for example, in one of the comments makes a great point about refactoring and its impact on TDD. And that's the point of that "can". But this doesn't mean that you should only rely on TDD to design your project. To me it is a great technique that helps 1. force your team to write tests, 2. force your team to think about the stuff they are solving, and 3. surface design flaws that we never thought of.

2010-12-25 08:33 UTC
now we're getting somewhere xD, I don't do test first, my approach is more like "Interface based programming", I like to think the design through first using only interfaces, then start implementing, then write the tests to merely check on the correctness, so far, that works for me really well
2010-12-26 02:17 UTC
You wrote:

"I’ve written code where tests blindly drove the design..."

I don't think this is true because it's not possible. Tests can't "blindly" drive any design, especially considering the fact that tests don't write themselves. It takes a human being, a programmer with ideas and plans for the software, to decide what tests to write and how to implement them.

Now, there is a context in which a phrase like "blindly drive" is valid, and it's the TDD method. No matter how great or valid your ideas for your software might be, TDD demands that you prove the worth of your ideas one tiny step at a time. You write a simple test, then you implement it in the most simple manner. Then you repeat, then you repeat, and eventually you're left with software that may or may not match with what you had in your head.

The method is "blind" to your ideas in that your implementation is focused on one tiny requirement, but it can't be blind to your ideas completely. What gave you the idea to write the test in the first place?

When programmers start TDD for the first time, their software doesn't magically become pure examples of the SOLID principles. It takes a lot of practice to know what tests/questions to write, where to start the TDD practice, and how generally to keep things together. And even with lots of experience, it's still possible to mess things up. If I mess up during my practice of TDD, it's not fair to blame the practice of TDD any more than it's fair to blame an automobile manufacturer if someone drives their car off the road.

I think that's kinda what you're doing when you say things like:

"Nothing prevents us from test-driving a God Class."

Nothing prevents us from test-driving a God class? How about the fact that the tests will be hard to write, be unmaintainable, and will generally smell? Whenever a class takes on additional responsibilities, the tests for those responsibilities have to be "mixed" with the other tests. The fact that the programmer is going test-first will provide him with the earliest clue that the class is taking on too much, and will cause him to *design* a way around the SRP violation by using a separate class.

If TDD helps to provide so much evidence that a SRP violation is occurring, why go after it because it doesn't *force* the programmer to act based on that evidence?

Making up a series of TDD "versus" the SOLID principles seems a little far-fetched to me. TDD isn't meant to be a replacement for the human brain.
2010-12-26 22:50 UTC
Your post is very interesting and so are the comments from your readers. I’ve had a number of friends send me links to this post so I thought I’d address some of your points with more details but my feedback has grown too big to fit in a comment so I wrote a blog post on it. Please see my blog post at http://techniquesofdesign.com/2011/01/12/the-tdd-zealot/ and feel free to comment.

I agree with many of the things you say in your post. Thank you for inspiring such fruitful conversations.
2011-01-13 06:23 UTC
Philip Schwarz #
Great post...what do you make of this http://groups.google.com/group/growing-object-oriented-software/browse_frm/thread/e0a41018c356c221
2011-05-04 21:39 UTC
