Encapsulating AutoFixture Customizations

Friday, 18 March 2011 12:51:08 UTC

AutoFixture is designed around the 80-20 principle. If your code is well-designed I'd expect a default instance of Fixture to be able to create specimens of your classes without too much trouble. There are cases where AutoFixture needs extra help:

  • If a class consumes interfaces a default Fixture will not be able to create it. However, this is easily fixed through one of the AutoMocking Customizations.
  • If an API has circular references, Fixture might enter an infinite recursion, but you can easily customize it to cut off one of the references.
  • Some constructors may only accept arguments that don't fit with the default specimens created by Fixture. There are ways to deal with that as well.

I could keep on listing examples, but let's keep it at that. The key assumption underlying AutoFixture is that these special cases are a relatively small part of the overall API you want to test - preferably much less than 20 percent.

Still, to address those special cases you'll need to customize a Fixture instance in some way, but you also want to keep your test code DRY.

As the popularity of AutoFixture grows, I'm getting more and more glimpses of how people address this challenge: Some derive from Fixture, others create extension methods or other static methods, while others again wrap creation of Fixture instances in Factories or Builders.

There really is no need to do that. AutoFixture has an idiomatic way to encapsulate Customizations. It's called… well… a Customization, which is just another way to say that there's an ICustomization interface that you can implement. This concept corresponds closely to the modularization APIs for several well-known DI Containers. Castle Windsor has Installers, StructureMap has Registries and Autofac has Modules.

The ICustomization interface is simple and has a very gentle learning curve:

public interface ICustomization
{
    void Customize(IFixture fixture);
}

Anything you can do with a Fixture instance you can also do with the IFixture instance passed to the Customize method, so this is the perfect place to encapsulate common Customizations to AutoFixture. Note that the AutoMocking extensions as well as several other optional behaviors for AutoFixture are already defined as Customizations.
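As a minimal sketch of what an implementation might look like, consider a class whose constructor rejects most of the default specimens. Percentage and PercentageCustomization are hypothetical names invented for this illustration, and the using directive assumes AutoFixture 4.x (releases from 2011 used the Ploeh.AutoFixture namespace instead):

```csharp
using System;
using AutoFixture; // assumption: AutoFixture 4.x package and namespace

// Hypothetical domain type: the constructor only accepts 0-100, so a
// default Fixture would frequently fail to create it.
public sealed class Percentage
{
    public Percentage(int value)
    {
        if (value < 0 || value > 100)
        {
            throw new ArgumentOutOfRangeException("value");
        }

        this.Value = value;
    }

    public int Value { get; private set; }
}

// Hypothetical Customization that encapsulates the special case.
public class PercentageCustomization : ICustomization
{
    public void Customize(IFixture fixture)
    {
        if (fixture == null)
        {
            throw new ArgumentNullException("fixture");
        }

        // Tell the fixture how to create Percentage specimens.
        fixture.Register(() => new Percentage(50));
    }
}
```

A fixed value of 50 keeps the sketch short; a real Customization could just as well pick any value in the valid range.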

Using a Customization is also easy:

var fixture = new Fixture()
    .Customize(new DomainCustomization());

You just need to invoke the Customize method on the Fixture. That's no more difficult than calling a custom Factory or extension method - particularly if you also use a Visual Studio Code Snippet.

When I start a new unit testing project, one of the first things I always do is to create a new 'default' customization for that project. It usually doesn't take long before I need to tweak it a bit - if nothing else, then for adding the AutoMoqCustomization. To apply separation of concerns I encapsulate each Customization in its own class and compose them with a CompositeCustomization:

public class DomainCustomization : CompositeCustomization
{
    public DomainCustomization()
        : base(
            new AutoMoqCustomization(),
            new FuncCustomization())
    {
    }
}

Whenever I need to make sweeping changes to my Fixtures I can simply modify DomainCustomization or one of the Customizations it composes.

In fact, these days I rarely explicitly create new Fixture instances, but rather encapsulate them in a custom AutoDataAttribute like this:

public class AutoDomainDataAttribute : AutoDataAttribute
{
    public AutoDomainDataAttribute()
        : base(new Fixture()
            .Customize(new DomainCustomization()))
    {
    }
}

This means that I can reuse the DomainCustomization across normal, imperative unit tests as well as the declarative, xUnit.net-powered data theories I normally prefer:

[Theory, AutoDomainData]
public void CanReserveReturnsTrueWhenQuantityIsEqualToRemaining(
    Capacity sut, Guid id)
{
    var result = sut.CanReserve(sut.Remaining, id);
    Assert.True(result);
}

Using Customizations to encapsulate your own specializations makes it easy to compose and manage them in an object-oriented fashion.


Resolving closed types with MEF

Monday, 14 March 2011 20:49:11 UTC

A while back I posed the challenge of resolving closed types with MEF. I received some responses, but I also wanted to provide an alternative outline for a solution. In case you don't remember the problem statement, it revolved around using the Managed Extensibility Framework (MEF) to compose classes in those cases where it's impossible to annotate those classes with the MEF attributes. In the given example I want to compose the Mayonnaise class from EggYolk and OliveOil, but all three classes are sealed and cannot be recompiled.

As I describe in my book, a general solution to this type of problem is to create a sort of adapter that exports the closed type via a read-only property, like these EggYolkAdapter and MayonnaiseAdapter classes (the OliveOilAdapter looks just like the EggYolkAdapter):

public class EggYolkAdapter
{
    private readonly EggYolk eggYolk;
 
    public EggYolkAdapter()
    {
        this.eggYolk = new EggYolk();
    }
 
    [Export]
    public virtual EggYolk EggYolk
    {
        get { return this.eggYolk; }
    }
}
 
public class MayonnaiseAdapter
{
    private readonly Mayonnaise mayo;
 
    [ImportingConstructor]
    public MayonnaiseAdapter(
        EggYolk yolk, OliveOil oil)
    {
        if (yolk == null)
        {
            throw new ArgumentNullException("yolk");
        }
        if (oil == null)
        {
            throw new ArgumentNullException("oil");
        }
 
        this.mayo = new Mayonnaise(yolk, oil);
    }
 
    [Export]
    public virtual Mayonnaise Mayonnaise
    {
        get { return this.mayo; }
    }
}

Doing it like this is always possible, but if you have a lot of types that you need to compose, it becomes tedious having to define a lot of similar adapters. Fortunately, we can take it a step further and generalize the idea of a MEF adapter to a small set of generic classes.

The EggYolkAdapter can be generalized as follows:

public class MefAdapter<T> where T : new()
{
    private readonly T export;
 
    public MefAdapter()
    {
        this.export = new T();
    }
 
    [Export]
    public virtual T Export
    {
        get { return this.export; }
    }
}

Notice that I've more or less just replaced the EggYolk class with a type argument (T). I also had to add the generic new() constraint, which is often quite restrictive. However, to support a type like Mayonnaise, I can create another, similar generic MEF adapter like this:

public class MefAdapter<T1, T2, TResult>
{
    private readonly static Func<T1, T2, TResult> createExport =
        FuncFactory.Create<T1, T2, TResult>();
    private readonly TResult export;
 
    [ImportingConstructor]
    public MefAdapter(T1 arg1, T2 arg2)
    {
        this.export = createExport(arg1, arg2);
    }
 
    [Export]
    public virtual TResult Export
    {
        get { return this.export; }
    }
}

The major difference from the simple MefAdapter<T> is that we need slightly more complicated code to invoke the constructor of TResult, which is expected to take two constructor arguments of types T1 and T2. This work is delegated to a FuncFactory that builds and compiles the appropriate delegate using an expression tree:

internal static Func<T1, T2, TResult>
    Create<T1, T2, TResult>()
{
    var arg1Exp =
        Expression.Parameter(typeof(T1), "arg1");
    var arg2Exp = 
        Expression.Parameter(typeof(T2), "arg2");
 
    var ctorInfo = 
        typeof(TResult).GetConstructor(new[]
        {
            typeof(T1),
            typeof(T2)
        });
    var ctorExp =
        Expression.New(ctorInfo, arg1Exp, arg2Exp);
 
    return Expression.Lambda<Func<T1, T2, TResult>>(
        ctorExp, arg1Exp, arg2Exp).Compile();
}
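To try the technique in isolation, here is a self-contained sketch. Yolk, Oil and Mayo are hypothetical stand-ins for the sealed classes, CtorFuncFactory is an illustrative name, and the null check on the constructor lookup is an addition that is not part of the listing above:

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-ins for sealed classes that cannot be annotated.
public sealed class Yolk { }
public sealed class Oil { }

public sealed class Mayo
{
    public Mayo(Yolk yolk, Oil oil)
    {
        this.Yolk = yolk;
        this.Oil = oil;
    }

    public Yolk Yolk { get; private set; }
    public Oil Oil { get; private set; }
}

public static class CtorFuncFactory
{
    // Builds and compiles a delegate that calls TResult's (T1, T2) constructor.
    public static Func<T1, T2, TResult> Create<T1, T2, TResult>()
    {
        var arg1Exp = Expression.Parameter(typeof(T1), "arg1");
        var arg2Exp = Expression.Parameter(typeof(T2), "arg2");

        var ctorInfo = typeof(TResult).GetConstructor(
            new[] { typeof(T1), typeof(T2) });
        if (ctorInfo == null)
        {
            // Added guard: fail with a readable message instead of
            // letting Expression.New throw on a null constructor.
            throw new InvalidOperationException(
                typeof(TResult) + " has no matching public constructor.");
        }

        var ctorExp = Expression.New(ctorInfo, arg1Exp, arg2Exp);
        return Expression.Lambda<Func<T1, T2, TResult>>(
            ctorExp, arg1Exp, arg2Exp).Compile();
    }
}
```

Calling CtorFuncFactory.Create<Yolk, Oil, Mayo>() yields a delegate that behaves like (y, o) => new Mayo(y, o). Because compiling an expression tree is comparatively costly, it makes sense to do it once and cache the delegate, which is what the static field in the adapter does.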

With a couple of MEF adapters, I can now compose a MEF Catalog almost like I can with a real DI Container, and resolve the Mayonnaise class:

var catalog = new TypeCatalog(
    typeof(MefAdapter<OliveOil>),
    typeof(MefAdapter<EggYolk>),
    typeof(MefAdapter<EggYolk, OliveOil, Mayonnaise>)
    );
var container = new CompositionContainer(catalog);
 
var mayo = container.GetExportedValue<Mayonnaise>();

If you want to change the Creation Policy to NonShared, you can derive from the MefAdapter classes and annotate them with [PartCreationPolicy] attributes.

I'd never voluntarily choose to use MEF like this, but if I was stuck with MEF in a project and had to use it like a DI Container, I'd do something like this.


Comments

Is there any way we can enhance your MefAdapter to support keys as well, so that when the adapter exports through the Export property we could associate a key with the export? With attributes I know the string has to be constant; I didn't know if we could somehow assign this at runtime, though.
2012-09-19 23:12 UTC

Compose object graphs with confidence

Friday, 04 March 2011 11:15:10 UTC

The main principle behind the Register Resolve Release pattern is that loosely coupled object graphs should be composed as a single action in the entry point of the application (the Composition Root). For request-based applications (web sites and services), we use a variation where we compose once per request.

It seems to me that a lot of people are apprehensive when they first hear about this concept. It may sound reasonable from an architectural point of view, but isn't it horribly inefficient? A well-known example of such a concern is Jeffrey Palermo's blog post Constructor over-injection anti-pattern. Is it really a good idea to compose a complete object graph in one go? What if we don't need part of the graph, or only need it later? Doesn't it adversely affect response times?

Normally it doesn't, and if it does, there are elegant ways to address the issue.

In the rest of this blog post I will expand on this topic. To keep the discussion as simple as possible, I'll restrict my analysis to object trees instead of full graphs. This is quite a reasonable simplification as we should strive to avoid circular dependencies, but even in the case of full graphs the arguments and techniques put forward below hold.

Consider a simple tree composed of classes from three different assemblies:

[Figure: Tree]

All the A classes (blue) are defined in the A assembly, B classes (green) in the B assembly, and the C1 class (red) in the C assembly. In code we create the tree with Constructor Injection like this:

var t =
    new A1(
        new A2(
            new B1(
                new B2()),
            new A3()),
        new C1(
            new B3()));

Given the tree above, we can now address the most common concerns about composing object trees in one go.

Will it be slow?

Most likely not. Keep in mind that Injection Constructors should be very simple, so not a lot of work is going on during composition. Obviously just creating new object instances takes a bit of time in itself, but we create object instances all the time in .NET code, and it's often a very fast operation.

Even when using DI Containers, which perform a lot of (necessary) extra work when creating objects, we can create tens of thousands of trees per second. Creation of objects simply isn't that big a deal.

But still: what about assembly loading?

I glossed over an important point in the above argument. While object creation is fast, it sometimes takes a bit of time to load an assembly. The tree above uses classes from three different assemblies, so to create the tree all three assemblies must be loaded.

In many cases that's a performance hit you'll have to take because you need those classes anyway, but sometimes you might be concerned with taking this performance hit too early. However, I make the claim that in the vast majority of cases, this concern is irrelevant.

In this particular context there are two different types of applications: Request-based applications (web) and all the rest (desktop apps, daemons, batch-jobs, etc.).

Request-based applications

For request-based applications such as web sites and REST services, an object tree must be composed for each request. However, all requests are served by the same AppDomain, so once an assembly is loaded, it sticks around to be available for all subsequent requests. Thus, the first few requests will suffer a performance penalty from having to load all assemblies, but after that there will be no performance impact.

In short, in request-based applications, you can compose object trees with confidence. In only extremely rare cases should you have performance issues from composing the entire tree in one go.

Long-running applications

For long-running applications the entire object tree must be composed at start-up. For background services such as daemons and batch processes the start-up time probably doesn't matter much, but for desktop applications it can be of great importance.

In some cases the application requires the entire tree to be immediately available, in which case there's not a lot you can do. Still, once all assemblies have been loaded, actually creating the tree will be very fast.

In other cases an entire branch of the tree may not be immediately required. As an example, if the C1 node in the above graph isn't needed right away, we could improve start-up time if we could somehow defer creating that branch, because this would also defer loading of the entire C assembly.

Deferred branches

Since object creation is fast, the only case where it makes sense to defer loading of a branch is when creation of that branch causes an assembly to be loaded. If we can defer creation of such a branch, we can also defer loading of the assembly, thus improving the time it takes to compose the initial tree.

Imagine that we wish to defer creation of the C1 branch of the above tree. It will prevent the C assembly from being loaded because that assembly is not used in any other place in the tree. However, it will not prevent the B assembly from being loaded, since that assembly is also being used by the A2 node.

Still, in those rare situations where it makes sense to defer creation of a branch, we can make that cut into a part of the infrastructure of the tree. I originally described this technique as a reaction to the above-mentioned post by Jeffrey Palermo, but here's a restatement in the current context.

We can defer creating the C1 node by wrapping it in a lazy implementation of the same interface. The C1 node implements an interface called ISolo<IMarker>, so we can wrap it in a Virtual Proxy that defers creation of C1 until it's needed:

public class LazySoloMarker : ISolo<IMarker>
{
    private readonly Lazy<ISolo<IMarker>> lazy;
 
    public LazySoloMarker(Lazy<ISolo<IMarker>> lazy)
    {
        if (lazy == null)
        {
            throw new ArgumentNullException("lazy");
        }
 
        this.lazy = lazy;
    }
 
    #region ISolo<IMarker> Members
 
    public IMarker Item
    {
        get { return this.lazy.Value.Item; }
    }
 
    #endregion
}

This Virtual Proxy takes a Lazy<ISolo<IMarker>> as input and defers to it to implement the members of the interface. The wrapped ISolo<IMarker> is only created when the Value property is first accessed - which may be long after the LazySoloMarker instance was created.

The tree can now be composed like this:

var t =
    new A1(
        new A2(
            new B1(
                new B2()),
            new A3()),
        new LazySoloMarker(
            new Lazy<ISolo<IMarker>>(() => new C1(
                new B3()))));

This retains all the behavior of the original tree, but defers creation of the C1 node until it's needed for the first time.
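To observe the deferral, the following sketch counts how often the expensive node is constructed. All the types in it (IMarker, ISolo<T>, Marker, CountingSolo, LazySolo, LazyDemo) are simplified, hypothetical stand-ins written for this illustration, not the types from the sample above:

```csharp
using System;

// Hypothetical stand-ins for the interfaces used in this post.
public interface IMarker { }

public interface ISolo<T>
{
    T Item { get; }
}

public sealed class Marker : IMarker { }

// Stand-in for an expensive node such as C1; it counts constructions.
public sealed class CountingSolo : ISolo<IMarker>
{
    public static int Created;

    private readonly IMarker item = new Marker();

    public CountingSolo()
    {
        Created++;
    }

    public IMarker Item
    {
        get { return this.item; }
    }
}

// Same shape as the LazySoloMarker Virtual Proxy.
public sealed class LazySolo : ISolo<IMarker>
{
    private readonly Lazy<ISolo<IMarker>> lazy;

    public LazySolo(Lazy<ISolo<IMarker>> lazy)
    {
        if (lazy == null)
        {
            throw new ArgumentNullException("lazy");
        }

        this.lazy = lazy;
    }

    public IMarker Item
    {
        get { return this.lazy.Value.Item; }
    }
}

public static class LazyDemo
{
    // Returns the construction count before and after the first access.
    public static Tuple<int, int> Run()
    {
        CountingSolo.Created = 0;
        ISolo<IMarker> proxy = new LazySolo(
            new Lazy<ISolo<IMarker>>(() => new CountingSolo()));

        var before = CountingSolo.Created; // composing the proxy builds nothing
        var first = proxy.Item;            // first access builds the node
        var second = proxy.Item;           // later accesses reuse it
        return Tuple.Create(before, CountingSolo.Created);
    }
}
```

Composing the proxy doesn't touch CountingSolo at all, so in a real application the assembly that defines the wrapped class wouldn't need to be loaded until Item is first read.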

The bottom line is this: you can compose the entire object graph with confidence. It's not going to be a performance bottleneck.

Update (2013-08-19 08:09 UTC): For a more detailed treatment of this topic, watch my NDC 2013 talk Big Object Graphs Up Front.


Comments

Hi there, great article and way to focus on the inherent differences between request-based and long-running applications.

The viewpoint however seems skewed towards request-based apps and I think really trivializes the innate lazy exploration nature of object graphs in rich-client apps (let's not call them desktop, as mobile is the exact same scenario - only difference is resources and processing power are a lot more limited and thus our architectures have to be better designed to account for it).

The main premise here is that in a rich-client app the trees you mention above should ALWAYS be lazily loaded, since the "branches" of the component tree are usually screens or their various sub-components (a dialog in some sub-screen for example). The branches are also more like "chains" as screens can have multiple follow-on screens that slowly load more content as you dive in deeper (while content you have visited higher up the chain may get unloaded - "lazy unloading" shall we say?). You never want to load screens that are not yet visible at app startup. In a desktop, fully-local app this results in a chuggy-performing or needlessly resource-wasteful app; in mobile it is simply impossible within the memory constraints. And of course... a majority of rich-client screens are hydrated with data pulled from the Internet and that data depends on either user input or server state (think accessing a venue listing on Yelp - you need the user to say which venue they want to view and the venue data is stored and updated server-side in a crowd-sourced manner not locally available to the user). So even if you had unlimited client-side memory you still couldn't pre-load the whole component object graph for the application up front.

I completely agree with your Composition Root article (/2011/07/28/CompositionRoot) and I think it describes things very beautifully. But what it also blissfully points out is that rich-client apps are poorly suited to a Composition Root and thus Dependency Injection approach.

DI does lazy-loading very poorly (just look at the amount of wrapper/plumbing you have to write for one component above). Now picture doing that for every screen in a mobile app (ours has ~30 screens and it's quite a trivial, minimal app for our problem domain). That's not to even mention whether you actually WANT to pull all the lifetime management into a single component - what could easily be seen as a break in Cohesion for logic belonging to the various sub-components. In web this doesn't really matter as the lifetime is usually all or nothing - it's either a singleton/per-request or per-instance. The request provides the full scoping needed for most processing. But in rich-client the lifetimes need to be finely managed based on screen navigation, user action and caching or memory constraints. Sub-components need to get loaded in and out dynamically in a lot more complex a manner than singletons or per-user-session. Again pulling this fine-tuned logic into a shared, centralized component is highly questionable - even if it was easily doable, which it's not.

I won't go into alternatives here (perhaps that's the subject of a post I should write up), but service location and manual instantiation ends up being a much preferable approach in these kinds of long-running application scenarios. Testability can be achieved in other, possibly simpler ways (http://unitbox.codeplex.com) if that's the driving concern.

Thus I think the key driving differentiator comes down to object graph composition: are you able to feasibly and desirably load the whole thing at once (such as in web scenarios) or not? In rich-client apps (desktop and mobile) this is a striking NO. You need components to load sub-components at will and not be bound and dependent on a centralized component (Composition Root) to do so. The alternative is passing a dependency container around to every component in the system so that the components can resolve sub-components using the container - but we all know how major an anti-pattern that is (oh Android!...)

Would love to see the community start differentiating between the scenarios where DI makes sense and where it ends up being an anti-productive burden. And I think rich client apps is evidently one of those places.
2012-10-18 05:11 UTC
Hi Marcel

Thank you for your insightful comment. When it comes to the analysis of the needs of desktop and mobile applications, I completely agree that many nodes of the object graph would need to be lazy for exactly the reasons you so clearly explain.

However, I don't agree with your conclusions. Yes, there's a bit of plumbing involved with defining a Virtual Proxy over an expensive dependency, but I don't think it's a particularly problematic issue. There are a number of reasons for that:

- First of all, let's consider your example of an application with 30 screens. Obviously, you don't want to load all 30 screens up-front, but that doesn't mean that you'll need to write 30 custom Virtual Proxies. Hopefully, you have a single abstraction that loads screens, so that's only a single Virtual Proxy you'll need to write.

- As you point out, you'll also want to postpone loading of data for each screen. I completely agree, but you don't need a Service Locator for this. The most common approach for this is to inject a strongly typed Query service (think: Repository) into your Controllers (or whatever you use to load data). This would essentially be a stateless service object without much (if any) read-only state, so even in a mobile app, I doubt it would take up much resources. Even so, you can still lazy-load it if you need to - but do measure before jumping to conclusions.

- In the end, you may need to proxy more than a single service, but if you find yourself in a situation where you need to proxy 30+ services, that's more likely to indicate a violation of the Reused Abstractions Principle than a failure of DI and the Composition Root pattern.

- Finally, while it may seem like an overhead to create the plumbing code, it's likely to be very robust. Once you've created a Virtual Proxy for an interface, the only reason it has to change is if you change the interface itself. If you stick to Role Interfaces that shouldn't happen very often. Thus, while you may be concerned that creating Virtual Proxies will require extra effort, it'll abstract away an application concern that will tend to be very robust and have a very low maintenance overhead. I consider this far superior to the brittle approach of a Service Locator. In the end, you'll spend less time maintaining that code than if you go for a Service Locator. It's a classic case of a bigger up-front investment that pays huge dividends over time - just like TDD.
2012-10-23 00:36 UTC
Hi Mark,
Just found your blog while doing additional research for a conference talk I'm about to give. The content here is pure gold! I'll make sure to read all of it.
Your response to Marcel above is exactly what I was thinking of when I read his comment. I'm a professional Android developer and I confirm that the scenario described in Marcel's comment is best handled in a way you suggested.
I would like to know your expert opinion on this statement of mine: "When using DI, if a requirement for lazy loading of an injected service(s) arises, it should be treated as 'code smell'. Most probably, this requirement is due to a service(s) that does too much work upon construction. If this service can be refactored, then you should favor refactoring over lazy loading. If the service can't be refactored (e.g. part of the framework), then you should check whether the client can be refactored in a way that eliminates a need for lazy loading. Use lazy loading of injected services only as a last resort in order to compensate for an unfortunate design that you can't change".
Does the above statement make sense in your opinion?
2017-03-19 05:11 UTC

Vasiliy, thank you for writing. That statement sounds reasonable. FWIW, objects that do too much work during construction violate Nikola Malovic's 4th law of IoC. Unfortunately, the original links to Nikola Malovic's laws of IoC no longer point at the original material, but I described the fourth law in an article called Injection Constructors should be simple.

If you want to give some examples of possible refactorings, you may want to take a look at the Decoraptor pattern. Additionally, if you're dealing with a third-party component, you can often create an Adapter that behaves nicely during construction.

2017-03-19 11:53 UTC

Mark, thank you for such a quick response and additional information and references! I would vote for Decoraptor to be included in the next edition of GOF's book.

BTW, I think I found Nikola's article that you mentioned here.

2017-03-19 13:30 UTC

Injection Constructors should be simple

Thursday, 03 March 2011 14:18:54 UTC

The Constructor Injection design pattern is an extremely useful way to implement loose coupling. It's easy to understand and implement, but sometimes perhaps a bit misunderstood.

The pattern itself is easily described through an example:

private readonly ISpecimenBuilder builder;
 
public SpecimenContext(ISpecimenBuilder builder)
{
    if (builder == null)
    {
        throw new ArgumentNullException("builder");
    }
 
    this.builder = builder;
}

The SpecimenContext constructor statically declares that it requires an ISpecimenBuilder instance as an argument. To guarantee that the builder field is an invariant of the class, the constructor contains a Guard Clause before it assigns the builder parameter to the builder field. This pattern can be repeated for each constructor argument.

It's important to understand that when using Constructor Injection the constructor should contain no additional logic.

An Injection Constructor should do no more than receive the dependencies.

This is simply a rephrasing of Nikola Malovic's 4th law of IoC. There are several reasons for this rule of thumb:

  • When we compose applications with Constructor Injection we often create substantial object graphs, and we want to be able to create these graphs as efficiently as possible. This is Nikola's original argument.
  • In the odd (and not recommended) cases where you have circular dependencies, the injected dependencies may not yet be fully initialized, so an attempt to invoke their members at that time may result in an exception. This issue is similar to the issue of invoking virtual members from the constructor. Conceptually, an injected dependency is equivalent to a virtual member.
  • With Constructor Injection, the constructor's responsibility is to demand and receive the dependencies. Thus, according to the Single Responsibility Principle (SRP), it should not try to do something else as well. Some readers might argue that I'm misusing the SRP here, but I think I'm simply applying the underlying principle in a more granular context.

There's no reason to feel constrained by this rule, as in any case the constructor is an implementation detail. In loosely coupled code, the constructor is not part of the overall application API. When we consider the API at that level, we are still free to design the API as we'd like.

Please notice that this rule is contextual: it applies to Services that use Constructor Injection. Entities and Value Objects tend not to use DI, so their constructors are covered by other rules.


Comments

Nice post. Sometimes I find it useful to have an IInitializable interface and instruct the container to call the Initialize method after instantiation. What do you think about this?
2011-03-03 15:01 UTC
That's very rarely a good idea. The problem with an Initialize method is the same as with Property Injection (A.K.A. Setter Injection): it creates a temporal coupling between the Initialize method and all other members of the class. Unless you truly can invoke any other member of the class without first invoking the Initialize method, such API design is deceitful and will lead to run-time exceptions. It also becomes much harder to ensure that the object is always in a consistent state.

Constructor Injection is a far superior pattern because it enforces that required dependencies will be present. Property Injection on the other hand implies that the dependency is optional, which is rarely the case.
2011-03-03 16:04 UTC
What about wiring events in the constructor? For example:

this.foo = foo;
this.foo.SomeEvent += HandleSomeEvent;
2011-03-03 19:28 UTC
When you look at what happens on the IL level, subscribing to an event is just another method call, so the same arguments as above still apply.

Keep in mind, however, that the above constitutes a guideline. It's not an absolute truth. I rarely use events, but it happens from time to time, and I can think of at least one case where I've done just what you suggest. I also occasionally break the above rule in other ways, but I always pause and consider the implications and whether there's a better alternative - often there is.
2011-03-03 19:50 UTC

Interfaces are access modifiers

Monday, 28 February 2011 13:19:04 UTC

.NET developers should be familiar with the standard access modifiers (public, protected, internal, private). However, in loosely coupled code we can regard interface implementations as a fifth access modifier. This concept was originally introduced to me by Udi Dahan the only time I've had the pleasure of meeting him. That was many years ago and while I didn't grok it back then, I've subsequently come to appreciate it quite a lot.

Although I can't take credit for the idea, I've never seen it described, and it really deserves to be.

The basic idea is simple:

If a consumer respects the Liskov Substitution Principle (LSP), the only visible members are those belonging to the interface. Thus, the interface represents a dimension of visibility.

As an example, consider this simple interface from AutoFixture:

public interface ISpecimenContext
{
    object Resolve(object request);
}

A well-behaved consumer can only invoke the Resolve method even though an implementation may have additional public members:

public class SpecimenContext : ISpecimenContext
{
    private readonly ISpecimenBuilder builder;
 
    public SpecimenContext(ISpecimenBuilder builder)
    {
        if (builder == null)
        {
            throw new ArgumentNullException("builder");
        }
 
        this.builder = builder;
    }
 
    public ISpecimenBuilder Builder
    {
        get { return this.builder; }
    }
 
    #region ISpecimenContext Members
 
    public object Resolve(object request)
    {
        return this.Builder.Create(request, this);
    }
 
    #endregion
}

Even though the SpecimenContext class defines the Builder property, as well as a public constructor, any consumer respecting the LSP will only see the Resolve method.
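The visibility argument can be sketched with deliberately simplified, hypothetical types (IResolver, EchoResolver and Consumer are made up for this illustration and are not AutoFixture types). The consumer is written against the interface, so the extra public member of the implementation is out of reach unless it breaks the LSP with a downcast:

```csharp
using System;

// Hypothetical, simplified interface: the real API surface.
public interface IResolver
{
    object Resolve(object request);
}

public sealed class EchoResolver : IResolver
{
    public EchoResolver(string prefix)
    {
        if (prefix == null)
        {
            throw new ArgumentNullException("prefix");
        }

        this.Prefix = prefix;
    }

    // Public, but not part of IResolver.
    public string Prefix { get; private set; }

    public object Resolve(object request)
    {
        return this.Prefix + request;
    }
}

public sealed class Consumer
{
    private readonly IResolver resolver;

    public Consumer(IResolver resolver)
    {
        if (resolver == null)
        {
            throw new ArgumentNullException("resolver");
        }

        this.resolver = resolver;
    }

    public object Ask(object request)
    {
        // Only Resolve is visible here; this.resolver.Prefix would not
        // compile without a downcast to EchoResolver.
        return this.resolver.Resolve(request);
    }
}
```

The compiler enforces the narrower view for every consumer that depends on IResolver alone, which is what makes the interface act like an access modifier.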

In fact, the Builder property on the SpecimenContext class mostly exists to support unit testing because I sometimes need to assert that a given instance of SpecimenContext contains the expected ISpecimenBuilder. This doesn't break encapsulation since the Builder is exposed as a read-only property, and it more importantly doesn't pollute the API.

To support unit testing (and whichever other clients might be interested in the encapsulated ISpecimenBuilder) we have a public property that follows all framework design guidelines. However, it's essentially an implementation detail, so it's not visible via the ISpecimenContext interface.

When writing loosely coupled code, I've increasingly begun to see the interfaces as the real API. Most other (even public) members are pure implementation details. If the members are public, I still demand that they follow the framework design guidelines, but I don't consider them parts of the API. It's a very important distinction.

The interfaces define the bulk of an application's API. Most other types and members are implementation details.

An important corollary is that constructors are implementation details too, since they can never be part of any interface.

In that sense we can regard interfaces as a fifth access modifier - perhaps even the most important one.


Comments

Taking this a step further, I wonder if it would make sense to prefer explicit interface implementations, separating the interface from any additional functionality in the type in a way so that you could not invoke Resolve on a SpecimenContext directly, but only on objects that are typed as ISpecimenContext? I am not sure whether I like that idea or not, but it would help enforce the logical separation between the interface and its implementation.
2011-02-28 14:12 UTC
Mark,

Thanks for this post.

I have to disagree.

Any public type or member becomes a part of your API. Consumer code will possibly link to them, they will become a part of your legacy, and you will need to provide backward-compatibility to any part of the public API.

If you don't want a type, constructor or method to be part of the API, you have to make it internal.

Anyone is free to expose their entire API as a set of interfaces (as in COM), but the only thing that makes something a semantic part of the API is whether it is public or not. Making a constructor public and then saying "oh you know, you can't use it in your code, it's an implementation detail" breaks the fundamental principles of object-oriented programming.

Interfaces exist to allow for late binding between the consumer and the provider of a set of semantics. Visibility exists to segregate contractual semantics from implementation details. These concepts are related but actually orthogonal: you can have internal and public interfaces too.

Bottom line: what makes your public API is the 'public' keyword.


-gael


2011-02-28 16:32 UTC
I must admit that I haven't fully thought through whether or not interfaces ought to be explicit or implicit, but my gut feeling is that implicit implementations (as shown in the example) are fine.

The thing is: when I unit test the concrete implementations I often tend to declare the System Under Test (SUT) as the concrete type because I want to exercise an interface member, but verify against a concrete member. If I were to use an explicit implementation this would require a cast in each test. If there was an indisputable benefit to be derived from explicit implementations I wouldn't consider this a valid argument, but in the absence of such, I'd tend towards making unit testing as easy as possible.

I think the argument that explicit interfaces help enforce the logical separation is valid, but not by a sufficiently high degree...
2011-02-28 20:49 UTC
Gael

Thank you for your comment.

Please note that the context of the blog post is loosely coupled code. When composing classes using Dependency Injection (DI), consumers will never see anything else than the interface members. Thus, the API from which we compose an application contains mainly the interfaces. These are the moving parts from which we can define interaction.

I agree that if you consider only a single concrete type at a time, all public (and protected) members are part of the API of that type, but that's not what I'm talking about. In DI it's implicitly discouraged to invoke public constructors of any Services because once you do that, you tightly couple a consumer to a specific implementation.
2011-02-28 20:57 UTC
I see from AutoMapper and the FDG references you're using the ILikePrefixes convention. I've been around the houses with it and have read most of the 'debates' on SO about it, but I keep coming back to the Growing Object Oriented Software Guided by Tests sidebar that says I is an antipattern and agreeing with it, even in .NET.

Any comment on the above? Ever tried going I-less?

Does it have any influence/overlap with your thoughts in the [excellent food-for-thought] article?
2011-03-01 01:15 UTC
Ruben

Thank you for writing.

In my opinion, the most important goal for coding conventions is to reduce friction when reading (and writing) code. Thus, I generally try to write Clean Code, but another important guide is the POLA. When it comes to the debate around the Hungarian I in interface names, I think that the POLA weighs heavier than the strictly logical argument against it.

However illogical it is, (close to) 10 years of convention causes us to expect the I to be there; when it's not, it causes unnecessary friction.
2011-03-01 08:42 UTC
Harry Dev
If the "Builder" property is only used for testing why not use the "internal" access modifier instead, and allow the testing assembly access to this using the "[assembly: InternalsVisibleTo("XXX.Test")]" attribute?

If the "Builder" property really is an implementation detail there should be no reason to expose it. Although, as you say it does not pollute much since it is a readonly property.
2011-03-02 09:51 UTC
First and foremost I consider the InternalsVisibleTo feature an abomination. We should only unit test the public members of our code.

As I wrote in another answer on this page, the Builder property is most certainly part of the public API of the concrete SpecimenContext class. It doesn't pollute the class in any way because it's an integral part of what the SpecimenContext class does.

There's no reason to expect that the Builder property is used only for testing. It's true that it was driven into existence by TDD, but it makes sense as part of the class' API and is available to other potential consumers. In the rare cases that a third-party consumer wants to use the SpecimenContext directly, it can access the Builder property as well. It wouldn't be able to do that if the property was internal.

However, the Builder property in no way belongs on the interface because that would be a leaky abstraction, so while it doesn't pollute the class, it would pollute the interface.
2011-03-02 10:35 UTC
Ruben, I have to agree with Mark. The I prefix is a convention that is expected, and I doubt anyone really considers it Hungarian notation (even though it fits the definition).

If you're providing an API and there is a method expecting 'SomeType', most people will attempt to instantiate an instance of SomeType, at which point they will receive red squiggles and will then have to do some investigation to determine what's going on. Even if it's only a matter of 5 seconds to figure it out, you've violated POLA and caused the developer to become confused; because he now has to solve yet another problem, he loses momentum. There are many other potentially confusing scenarios that can arise from not clearly marking an interface as such.

No one expects to see strings prefixed with 'str', but clearly identifying an interface is expected. There are many "rules" that should be followed, but as with all rules, there are exceptions. Most of them are for the sake of developers. It takes me no time to see any type and recognize it as an interface, with which I already know I can do this or that, because it's prefixed with an I. But unless I'm already familiar with a framework/API that does not use the I convention, I would need to spend time on learning and trial and error. Wasted time.

If for no other reason, then do it because Microsoft uses the I convention in the .NET framework and that is what .NET developers expect, even if it is "incorrect".
2011-03-02 16:29 UTC

Creating general populated lists with AutoFixture

Tuesday, 08 February 2011 14:53:16 UTC

In my previous post I described how to customize a Fixture instance to populate lists with items instead of returning empty lists. While it's pretty easy to do so, the drawback is that you have to do it explicitly for every type you want to influence. In this post I will follow up by describing how to enable some general conventions that simply populate all collections that the Fixture resolves.

This post describes a feature that will be available in AutoFixture 2.1. It's not available in AutoFixture 2.0, but is already available in the code repository. Thus, if you can't wait for AutoFixture 2.1 you can download the source and build it.

Instead of having to create multiple customizations for IEnumerable<int>, IList<int>, List<int>, IEnumerable<string>, IList<string>, etc. you can simply enable these general conventions as easily as this:

var fixture = new Fixture()
    .Customize(new MultipleCustomization());

Notice that enabling conventions for populating sequences and lists with ‘many’ items is an optional customization that you must explicitly add.

This feature must be explicitly enabled. There are several reasons for that:

  • It would be a breaking change if AutoFixture suddenly started to behave like this by default.
  • The MultipleCustomization targets not only concrete types such as List<T> and Collection<T>, but also interfaces such as IEnumerable<T>, IList<T> etc. Thus, if you also use AutoFixture as an Auto-Mocking container, I wanted to provide the ability to define which customization takes precedence.

With that simple customization enabled, all requested IEnumerable<T> are now populated. The following will give us a finite, but populated list of integers:

var integers = 
    fixture.CreateAnonymous<IEnumerable<int>>();

This will give us a populated List<int>:

var list = fixture.CreateAnonymous<List<int>>();

This will give us a populated Collection<int>:

var collection = 
    fixture.CreateAnonymous<Collection<int>>();

As implied above, it also handles common list interfaces, so this gives us a populated IList<T>:

var list = fixture.CreateAnonymous<IList<int>>();

The exact number of ‘many’ is, as always, determined by the Fixture's RepeatCount.

As this code is still (at the time of publishing) in preview, I would love to get feedback on this feature.


Creating specific populated lists with AutoFixture

Monday, 07 February 2011 19:49:26 UTC

How do you get AutoFixture to create populated lists or sequences of items? Recently I seem to have been getting this question a lot, and luckily it's quite easy to answer.

Let's first look at the standard AutoFixture behavior and API.

You can ask AutoFixture to create an anonymous List like this:

var list = fixture.CreateAnonymous<List<int>>();

Seen from AutoFixture's point of view, List<int> is just a class like any other. It has a default constructor, so AutoFixture just uses that and returns an instance. You get back an instance, no exceptions are thrown, but the list is empty. What if you'd rather want a populated list?

There are many ways to go about this. A simple, low-level solution is to populate the list after creation:

fixture.AddManyTo(list);

However, you may instead prefer getting a populated list right away. This is also possible, but before we look at how to get there, I'd like to point out a feature that surprisingly few users notice. You can create many anonymous specimens at once:

var integers = fixture.CreateMany<int>();

Armed with this knowledge, as well as the knowledge of how to map types, we can now create this customization to map IEnumerable<int> to CreateMany<int>:

fixture.Register(() => fixture.CreateMany<int>());

The Register method is really a generic method, but since we have type inference, we don't have to write it out. However, since CreateMany<int>() returns IEnumerable<int>, this is the type we register. Thus, every time we subsequently resolve IEnumerable<int>, we will get back a populated sequence.

Getting back to the original List<int> example, we can now customize it to a populated list like this:

fixture.Register(() =>
    fixture.CreateMany<int>().ToList());

Because the ToList() extension method returns List<T>, this call registers List<int> so that we will get back a populated list of integers every time the fixture resolves List<int>.

What about other collection types that don't have a nice LINQ extension method? Personally, I never use Collection<T>, but if you wanted, you could customize it like this:

fixture.Register(() =>
    new Collection<int>(
        fixture.CreateMany<int>().ToList()));

Since Collection<T> has a constructor overload that takes IList<T>, we can customize the type to use this specific overload and populate it with ‘many’ items.

Finally, we can combine all this to map from collection interfaces to populated lists. As an example, we can map from IList<int> to a populated List<int> like this:

fixture.Register<IList<int>>(() => 
    fixture.CreateMany<int>().ToList());

When we use the Register method to map types we can no longer rely on type inference. Instead, we must explicitly register IList<int> against a delegate that creates a populated List<int>. Because List<int> implements IList<int> this compiles. Whenever this fixture instance resolves IList<int> it will create a populated List<int>.
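The difference between the inferred and the explicit registration can be sketched with a tiny registry. TinyRegistry is a hypothetical, simplified class written for this illustration; it is not how AutoFixture implements Register, but its generic method signature has the same shape, so type inference behaves the same way:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified registry; not AutoFixture's implementation.
public class TinyRegistry
{
    private readonly Dictionary<Type, Func<object>> creators =
        new Dictionary<Type, Func<object>>();

    // T is inferred from the delegate's return type unless
    // the caller states it explicitly.
    public void Register<T>(Func<T> creator)
    {
        if (creator == null)
        {
            throw new ArgumentNullException("creator");
        }

        this.creators[typeof(T)] = () => creator();
    }

    // True if a creator was registered for exactly this type.
    public bool CanResolve(Type type)
    {
        return this.creators.ContainsKey(type);
    }

    // Invokes the creator registered for T.
    public T Resolve<T>()
    {
        return (T)this.creators[typeof(T)]();
    }
}
```

With this sketch, Register(() => Enumerable.Range(1, 3).ToList()) stores the creator under List<int>, because that is what ToList() returns, so a request for IList<int> finds nothing. Writing Register<IList<int>>(...) with the same delegate stores it under IList<int> instead.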

All of this describes what you can do with the strongly typed API available in AutoFixture 2.0. It's easy and very flexible, but the one important drawback is that it's not general. All of the customizations in this post specifically address lists and sequences of integers, but not lists of any other type. What if you would like to expand this sort of behavior to any List<T>, IEnumerable<T> etc.?

Stay tuned, because in the next post I will describe how to do that.


The BCL already has a Maybe monad

Friday, 04 February 2011 13:11:34 UTC

During the last couple of weeks I've been very interested in using a Maybe monad with AutoFixture's Kernel code, but although many examples can be found on the internet, they remain samples. Rinat Abdullin and Zack Owens both posted samples, but I particularly like Mike Hadlow's series about Monads in C# because he also explains how to use LINQ with monads such as the Maybe monad.

As I really wanted a Maybe monad for AutoFixture, I first thought about simply implementing it directly in the AutoFixture source. However, I found it too arbitrary to put such a general purpose programming construct into a specific library such as AutoFixture. My next thought was to create a small open source project just for that single purpose, but then I thought about the problem a bit more…

The BCL sort of already has a Maybe monad - you just need to recognize it as such.

What is a Maybe monad really? If you really distill it, it's just a type that either contains a value, or doesn't contain a value. In other words, it's a type that represents a particular range: a set with either zero or one items. That's just a special case of a more general range or collection, and we already have LINQ covering those constructs.

Here it is: the Maybe monad from the BCL (encapsulated in a nice extension method):

public static class LightweightMaybe
{
    public static IEnumerable<T> Maybe<T>(this T value)
    {
        return new[] { value };
    }
}

Obviously, this method returns a Maybe with a value, but we can just as easily represent Nothing with an empty array.
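As a minimal sketch of both cases, the following uses only LINQ from the BCL. The names Just, Nothing, TryParse and DoubleOrDefault are hypothetical helpers chosen for this illustration, and Enumerable.Empty<T>() stands in for the empty array:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper names, chosen for illustration only.
public static class LightweightMaybeExamples
{
    // A Maybe with a value: a sequence with exactly one item.
    public static IEnumerable<T> Just<T>(T value)
    {
        return new[] { value };
    }

    // Nothing: a sequence with zero items.
    public static IEnumerable<T> Nothing<T>()
    {
        return Enumerable.Empty<T>();
    }

    // Returns the parsed integer as a Maybe, or Nothing
    // if the candidate isn't an integer.
    public static IEnumerable<int> TryParse(string candidate)
    {
        int result;
        return int.TryParse(candidate, out result)
            ? Just(result)
            : Nothing<int>();
    }

    // Doubles the parsed number, falling back to -1 when
    // there is no number to double.
    public static int DoubleOrDefault(string candidate)
    {
        return TryParse(candidate)
            .Select(i => i * 2)
            .DefaultIfEmpty(-1)
            .Single();
    }
}
```

Here Select is simply skipped for Nothing, and DefaultIfEmpty supplies the fallback, so no null checks or branches are needed in DoubleOrDefault.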

With my ‘new’ Maybe monad, I can now write code like this (where request is a System.Object instance):

return (from t in request.Maybe().OfType<Type>()
        let typeArguments = t.GetGenericArguments()
        where typeArguments.Length == 1
        && typeof(IList<>)
            == t.GetGenericTypeDefinition()
        select context.Resolve(typeof(List<>)
            .MakeGenericType(typeArguments)))
        .DefaultIfEmpty(new NoSpecimen(request))
        .SingleOrDefault();

You may think that this looks dense, but before that the code looked like this:

var type = request as Type;
if (type == null)
{
    return new NoSpecimen(request);
}
 
var typeArguments = type.GetGenericArguments();
if (typeArguments.Length != 1)
{
    return new NoSpecimen(request);
}
 
if (typeof(IList<>) != 
    type.GetGenericTypeDefinition())
{
    return new NoSpecimen(request);
}
 
return context.Resolve(typeof(List<>)
    .MakeGenericType(typeArguments));

Notice that in this more traditional approach involving Guard Clauses, I have to construct a new NoSpecimen object in three different places, thus violating the DRY principle. I like not having all those if/return blocks in the code.
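For readers who want to experiment without pulling the AutoFixture source, here is a self-contained sketch of the same query shape. It is an adaptation under stated assumptions, not the ListRelay code: ListTypeMapper and MapToConcreteList are hypothetical names, and instead of calling context.Resolve or returning NoSpecimen, the sketch returns the mapped Type or null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the relay logic; not AutoFixture code.
public static class ListTypeMapper
{
    // Maps a request for IList<T> to typeof(List<T>),
    // and returns null for any other request.
    public static Type MapToConcreteList(object request)
    {
        return (from t in new[] { request }.OfType<Type>()
                let typeArguments = t.GetGenericArguments()
                // The length check runs first, so
                // GetGenericTypeDefinition is never invoked
                // on a non-generic type.
                where typeArguments.Length == 1
                && typeof(IList<>)
                    == t.GetGenericTypeDefinition()
                select typeof(List<>)
                    .MakeGenericType(typeArguments))
                .SingleOrDefault();
    }
}
```

The order of the two conditions in the where clause matters: because && short-circuits, a non-generic request such as typeof(object) is filtered out before GetGenericTypeDefinition could throw.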


Comments

That's a very neat idea. Now I think of it, it's a pattern you see used a lot in Haskell too.

In your first code block, rather than returning:

return new[]{value};

It would be nicer to do this:

return Enumerable.Single(value);

:)
2011-02-04 20:38 UTC
Yes, I believe the concept of a Maybe monad originates from Haskell or a similar language (but I can't remember the specific details).

Using Enumerable.Single(value) will not work because it takes an IEnumerable<T> and returns a T. We want the exact opposite: take a T and return IEnumerable<T>.
2011-02-04 21:33 UTC
Can't say I know it inside out, as I have never used DefaultIfEmpty IRL, but should the .SingleOrDefault be a .Single() or a [0]? (I'd much favor a .Single(), to be honest)

@other commenter: Enumerable.Repeat(value,1) does the trick you want. I sometimes cruft up a .One helper method, but I believe there's a more accepted name for it in the excellent RealWorldFunctionalProgramming in C# and F# book, which I don't have to hand (the one that makes Mark's head hurt :D)
2011-02-04 23:29 UTC
Yes, you are right - Single() is enough. My mistake :)

It's true that there are many ways to create an IEnumerable with a single element.
2011-02-05 08:06 UTC
I agree that this functional approach is DRY, but I still find it a bit hard to follow. Instead, why not simply refactor the imperative code to a DRY, more intent-revealing (but still imperative) version? This is what I propose:

var type = request as Type;

bool requestIsAType = type != null;
bool withOneGenericArgument = requestIsAType && type.GetGenericArguments().Length == 1;
bool isAGenericList = requestIsAType && type.GetGenericTypeDefinition() == typeof(IList<>);

if (requestIsAType && isAGenericList && withOneGenericArgument)
{
    return context.Resolve(typeof(List<>).MakeGenericType(type.GetGenericArguments()));
}
else
{
    return new NoSpecimen(request);
}

Doesn’t this just read like a functional spec? “When the request is a Type of a generic list with one generic argument, then … otherwise …”.

Cheers
2011-02-05 09:07 UTC
Did you test that code? I'm pretty sure it has defects.

If you call GetGenericTypeDefinition() on a type which is not generic, an exception will be thrown. This could happen if, for instance, I were to invoke the method with request = typeof(object).

If you want to play around with this, just pull the AutoFixture source and revert to revision 391 and try it out on the ListRelay class. It has pretty comprehensive test coverage.
2011-02-05 09:31 UTC
Did I test that code? Of course not! ;-) Just trying to prove what a bit of refactoring can do :-)
2011-02-05 22:50 UTC
Yes, but the point is that it's those little things that end up making a more procedural refactoring less than readable. In any case, 'readability' of code is highly subjective so obviously YMMV.
2011-02-06 09:35 UTC
3P
For me the LINQ version is almost unreadable. If it takes me more than a few seconds to understand the code, I think that code is not finished. Putting something in one line is not "Clean Code", I think.
2011-04-03 19:03 UTC
I came late to comment because of a tweet exchange with the author, but I want to add that treating null as something that should be semantically avoided in code is not only a matter of readability but also a symptom of good design. I totally agree with Mark Seemann, who also kindly supplied this Maybe monad implementation. Excellent work.
2013-03-06 08:18 UTC

Scalable doesn't mean fast

Monday, 24 January 2011 12:03:16 UTC

Recently I spent a couple of days with Thomas Jespersen who's working towards a launch of spiir.dk - on Windows Azure. The reason I got to talk to him was to see if I could help with some performance issues he had with Azure Table Storage.

The scenario is really simple: the application needs to load all of a user's bank transactions into memory to enable pretty advanced sorting and filtering. That sounds like a lot, but really isn't more than approximately 200 kB of data retrieved through a single query - so: there are no 1+N problems in play here, but even so it originally took more than two seconds. That's a bit long to wait before you can even start rendering a web page.

By tweaking his partitioning strategy and using parallel queries, Thomas managed to bring down the data retrieval time to approximately one second. Although stress testing indicated that this duration was very stable, even under load, it is still too slow. So we met to see what could be done.

Thomas had done a great job tweaking the query, so I couldn't really suggest some sort of secret API that would make it run significantly faster. Basically, we have to deal with Azure storage being based on REST, and with the fact that there are a lot of things about run-time behavior we cannot control. Apart from designing a proper partitioning strategy, we can't add indexes to Azure Table Storage.

It was time to take a different approach.

As far as I can tell, Windows Azure is designed to be very scalable. However, just because scalability implies that you can handle an insane amount of work within acceptable time frames, it doesn't mean that you can extrapolate it to mean that under a light load, everything will be lightning fast. That's not the case at all.

Scalability means that performance characteristics remain stable from light to heavy load.

Consequently this means that if performance is adequate under heavy load, it will also be adequate under a light load. Azure Storage is first and foremost designed to be scalable, and as a second priority, as fast as possible.

As Thomas discovered, Azure Table Storage isn't particularly fast.

It may be a masochistic side of me that I'm not otherwise aware of, but I actually appreciate that. It makes us reassess our most basic assumptions.

The data that Thomas needs to read isn't particularly dynamic, so what if we take a snapshot of it? In short, we loaded all of a user's data into memory and serialized it to Azure Blob Storage.

Loading the same data from a binary serialized Blob took only one-sixth of the time it took to load it from Table Storage.

As it turns out, Thomas doesn't even need all the columns from the Table to populate the view, so we could even make the serialized Blob smaller yet.

At this point, however, we now have two representations of the same data: The original data in Table Storage, and a persistent cache in Blob Storage. The remaining challenge is to figure out how to keep these in sync.

This may seem like a hack, but it really represents a paradigm shift. Letting go of ACID opens up a lot of new opportunities.

Actually, I spent most of the next day trying to convince Thomas that CQRS would be the best approach, or that we could at least pick up some of the techniques from asynchronous, messaging based architectures, but that's another story.

The moral here is that on Azure, things may be slower than you are used to, but storage is (relatively) cheap, so denormalization can save you a lot of execution time.


Comments

I don't see how this is different in Azure vs. other solutions. It is basic knowledge that scalability introduces overhead in most simple cases to ensure more consistent response times under load; often this overhead is caused by distribution in place of centralization, in your case regarding distributed storage. Maybe I'm missing the point, but you seem to be stating the obvious, like "water travels downhill" :-)

Your comment about letting go of ACID strikes me as odd. Basically you are just adding a cache in front of your table storage like people have done for ages, so wherein lies the paradigm shift in "letting go of ACID"? Are you speaking to people who have never cached data? Even people not using a cache will let go of ACID in most cases because they keep stale data in objects in their application and in the best case do optimistic concurrency checks, and in the worst case just let the last write win.
2011-01-24 20:35 UTC
Jakob, if you already knew all of those things, then this blog post wasn't meant for you. I knew them too, but apparently many people don't, so I wrote for their benefit. Trust me: for many developers, this is far from obvious.

The same goes for the ACID comment. People don't seem to have too big of a problem with in-memory caches because somehow they know that these can't possibly be consistent. However, as soon as you start writing to a persistent store, you encounter knee-jerk reactions that all persisted data must be written and updated within transactions.

I'm not disagreeing with you. This is the basic premise behind CQRS and other scalable architectures. It's just not particularly widely known yet.
2011-01-24 20:46 UTC

My Christmas challenge has a winner

Saturday, 01 January 2011 13:53:33 UTC

A week ago I concluded Microsoft Denmark's 2010 .NET Community Christmas Calendar with a challenge about resolving closed types with MEF. As 2011 came around, the deadline passed, so it's now time to pick the winner.

I didn't get a lot of entries, which can be interpreted in at least one (or more) of the following ways:

  • The challenge was too difficult
  • The challenge wasn't interesting
  • The prize wasn't attractive
  • People had other things to do during the holidays

Whatever the reason, it made my task of picking a winner that much easier. The best Danish entry came from Daniel Volder Guarnieri who cheated a bit by partially hard-coding the composition of Mayonnaise into the ContainerBuilder. As I wrote in the original challenge, there are many ways to tackle the challenge, and one was to take the unit tests very literally :)

However, honorable mention must go to Boyan Mihaylov who participated just for the honor. He took a more general approach similar to Fluent MEF. This involves implementing a completely new ComposablePartCatalog with associated ComposablePartDefinition and ComposablePart implementations - not a trivial undertaking.

Kudos to Boyan and congratulations to Daniel. My thanks for your submissions, and a happy new year to all my readers!


Comments

Could you post the solution?
2011-01-01 21:00 UTC
