Checking for exactly one item in a sequence using C# and F#
Here's a programming issue that comes up from time to time. A method takes a sequence of items as input, like this:
public void Route(IEnumerable<string> args)
While the signature of the method may be given, the implementation may be concerned with finding out whether there is exactly one element in the sequence. (I'd argue that this would be a violation of the Liskov Substitution Principle, but that's another discussion.) By corollary, we might also be interested in the result sets on each side of that single element: no elements and multiple elements.
Let's assume that we're required to raise the appropriate event for each of these three cases.
Naïve approach in C# #
A naïve implementation would be something like this:
public void Route(IEnumerable<string> args)
{
    var countCategory = args.Count();
    switch (countCategory)
    {
        case 0:
            this.RaiseNoArgument();
            return;
        case 1:
            this.RaiseSingleArgument(args.Single());
            return;
        default:
            this.RaiseMultipleArguments(args);
            return;
    }
}
However, the problem with that is that IEnumerable<string> carries no guarantee that the sequence will ever end. In fact, there's a whole category of implementations that keep iterating forever - these are called Generators. If you pass a Generator to the above implementation, it will never return because the Count method will block forever.
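To make that concrete, here's a minimal sketch (my illustration, not from the original post) of such a never-ending sequence; passing it to the implementation above would cause the Count call to loop forever:

// An infinite sequence: enumerating it never terminates.
private static IEnumerable<string> Generate()
{
    var random = new Random();
    while (true)
        yield return random.Next().ToString();
}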
Robust implementation in C# #
A better solution comes from the realization that we're only interested in knowing about which of the three categories the input matches: No elements, a single element, or multiple elements. The last case is covered if we find at least two elements. In other words, we don't have to read more than at most two elements to figure out the category. Here's a more robust solution:
public void Route(IEnumerable<string> args)
{
    var countCategory = args.Take(2).Count();
    switch (countCategory)
    {
        case 0:
            this.RaiseNoArgument();
            return;
        case 1:
            this.RaiseSingleArgument(args.Single());
            return;
        default:
            this.RaiseMultipleArguments(args);
            return;
    }
}
Notice the inclusion of the Take(2) method call, which is the only difference from the first attempt. This will give us at most two elements that we can then count with the Count method.
While this is better, it still annoys me that a secondary LINQ call (to the Single method) is necessary to extract that single element. Not that it's particularly inefficient, but it still feels like I'm repeating myself here.
(We could also have converted the Take(2) iterator into an array, which would have enabled us to query its Length property, as well as index into it to get the single value, but it basically amounts to the same work.)
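For completeness, that array-based variation might look something like this (my sketch, not part of the original post):

public void Route(IEnumerable<string> args)
{
    // Materialize at most two elements so both Length and indexing are available.
    var atMostTwo = args.Take(2).ToArray();
    switch (atMostTwo.Length)
    {
        case 0:
            this.RaiseNoArgument();
            return;
        case 1:
            this.RaiseSingleArgument(atMostTwo[0]);
            return;
        default:
            this.RaiseMultipleArguments(args);
            return;
    }
}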
Implementation in F# #
In F# we can implement the same functionality in a much more compact manner, taking advantage of pattern matching against native F# lists:
member this.Route args =
    let atMostTwo = args |> Seq.truncate 2 |> Seq.toList
    match atMostTwo with
    | [] -> onNoArgument.Trigger(Unit.Default)
    | [arg] -> onSingleArgument.Trigger(arg)
    | _ -> onMultipleArguments.Trigger(args)
The first thing happening here is that the input is piped through a couple of functions. The truncate function does the same thing as LINQ's Take method, and the toList function subsequently converts that sequence of at most two elements into a native F# list.
The beautiful thing about native F# lists is that they support pattern matching, so instead of first figuring out which category the input belongs to, and then subsequently extracting the data in the single-element case, we can match and forward the element in a single expression.
Why is this important? I don't know… it's just satisfying on an aesthetic level :)
Weakly-typed versus Strongly-typed Message Channels
Soon after I posted my previous blog post on message dispatching without Service Location I received an email from Jeff Saunders with some great observations. Jeff has been so kind to allow me to quote his email here on the blog, so here it is:
“I enjoyed your latest blog post about message dispatching. I have to ask, though: why do we want weakly-typed messages? Why can't we just inject an appropriate IConsumer<T> into our services - they know which messages they're going to send or receive.
“A really good example of this is ISubject<T> from Rx. It implements both IObserver<T> (a message consumer) and IObservable<T> (a message producer) and the default implementation Subject<T> routes messages directly from its IObserver side to its IObservable side.
“We can use this with DI quite nicely - I have written an example in .NET Pad: http://dotnetpad.net/ViewPaste/woTkGk6_GEq3P9xTVEJYZg#c9,c26,
“The good thing about this is that we now have access to all of the standard LINQ query operators and the new ones added in Rx, so we can use a select query to map messages between layers, for instance.
“This way we get all the benefits of a weakly-typed IChannel interface, with the added advantages of strong typing for our messages and composability using Rx.
“One potential benefit of weak typing that could be raised is that we can have just a single implementation for IChannel, instead of an ISubject<T> for each message type. I don't think this is really a benefit, though, as we may want different propagation behaviour for each message type - there are other implementations of ISubject<T> that call consumers asynchronously, and we could pass any IObservable<T> or IObserver<T> into a service for testing purposes.”
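As a rough sketch of the Rx-based idea Jeff describes (not his actual .NET Pad example; it assumes the Reactive Extensions (Rx) library and uses a plain string message for brevity):

// A Subject<T> routes messages from its IObserver<T> side to its IObservable<T> side.
var channel = new Subject<string>();

// A consumer subscribes to the IObservable<string> side...
channel.Subscribe(message => Console.WriteLine("Received: " + message));

// ...while a producer only needs the IObserver<string> side.
IObserver<string> producer = channel;
producer.OnNext("Hello");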
These are great observations and I think that Rx holds much promise in this space. Basically you can say that in CQRS-style architectures we're already pushing events (and commands) around, so why not build upon what the framework offers?
Even if you find the IObserver<T> interface a bit too clunky with its OnNext, OnError and OnCompleted methods compared to the strongly typed IConsumer<T> interface, the question still remains: why do we want weakly-typed messages?
We don't, necessarily. My previous post wasn't meant as a particular endorsement of a weakly typed messaging channel. It was more an observation that I've seen many variations of this IChannel interface:
public interface IChannel
{
    void Send<T>(T message);
}
The most important thing I wanted to point out was that while the generic type argument may create the illusion that this is a strongly typed method, this is all it is: an illusion. IChannel isn't strongly typed because you can invoke the Send method with any type of message - and the code will still compile. This is no different than the mechanical distinction between a Service Locator and an Abstract Factory.
Thus, when defining a channel interface I normally prefer to make this explicit and instead model it like this:
public interface IChannel
{
    void Send(object message);
}
This achieves exactly the same and is more honest.
Still, this doesn't really answer Jeff's question: is this preferable to one or more strongly typed IConsumer<T> dependencies?
Any high-level application entry point that relies on a weakly typed IChannel can get by with a single IChannel dependency. This is flexible, but (just like with Service Locator), it might hide that the client may have (or (d)evolve) too many responsibilities.
If, instead, the client would rely on strongly typed dependencies it becomes much easier to see if/when it violates the Single Responsibility Principle.
In conclusion, I'd tend to prefer strongly typed Datatype Channels instead of a single weakly typed channel, but one shouldn't underestimate the flexibility of a general-purpose channel either.
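As a sketch of what such strongly typed dependencies could look like (my example; SignUpController and the command types are hypothetical names used only for illustration):

public class SignUpController
{
    private readonly IConsumer<RegisterUserCommand> registerUser;
    private readonly IConsumer<ResetPasswordCommand> resetPassword;

    // Each outgoing message type is an explicit, strongly typed dependency.
    public SignUpController(
        IConsumer<RegisterUserCommand> registerUser,
        IConsumer<ResetPasswordCommand> resetPassword)
    {
        this.registerUser = registerUser;
        this.resetPassword = resetPassword;
    }
}

Every new message type the client sends shows up as another constructor parameter, so a growing parameter list immediately signals a possible Single Responsibility Principle violation that a single IChannel dependency would hide.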
Comments
Message Dispatching without Service Location
Once upon a time I wrote a blog post about why Service Locator is an anti-pattern, and ever since then, I occasionally receive rebuffs from people who agree with me in principle, but think that, still: in various special cases (the argument goes), Service Locator does have its uses.
Most of these arguments actually stem from mistaking the mechanics for the role of a Service Locator. Still, once in a while a compelling argument seems to come my way. One of the most insistent arguments concerns message dispatching - a pattern which is currently gaining in prominence due to the increasing popularity of CQRS, Domain Events and kindred architectural styles.
In this article I'll first provide a quick sketch of the scenario, followed by a typical implementation based on a ‘Service Locator’, and then conclude by demonstrating why a Service Locator isn't necessary.
Scenario: Message Dispatching #
Appropriate use of message dispatching internally in an application can significantly help decouple the code and make roles explicit. A common implementation utilizes a messaging interface like this one:
public interface IChannel
{
    void Send<T>(T message);
}
Personally, I find that the generic typing of the Send method is entirely redundant (not to mention heavily reminiscent of the shape of a Service Locator), but it's very common and not particularly important right now (but more about that later).
An application might use the IChannel interface like this:
var registerUser = new RegisterUserCommand(
    Guid.NewGuid(),
    "Jane Doe",
    "password",
    "jane@ploeh.dk");
this.channel.Send(registerUser);
// ...
var changeUserName = new ChangeUserNameCommand(
    registerUser.UserId,
    "Jane Ploeh");
this.channel.Send(changeUserName);
// ...
var resetPassword = new ResetPasswordCommand(
    registerUser.UserId);
this.channel.Send(resetPassword);
Obviously, in this example, the channel variable is an injected instance of the IChannel interface.
On the receiving end, these messages must be dispatched to appropriate consumers, which must all implement this interface:
public interface IConsumer<T>
{
    void Consume(T message);
}
Thus, each of the command messages in the example has a corresponding consumer:
public class RegisterUserConsumer : IConsumer<RegisterUserCommand>
public class ChangeUserNameConsumer : IConsumer<ChangeUserNameCommand>
public class ResetPasswordConsumer : IConsumer<ResetPasswordCommand>
This certainly is a very powerful pattern, so it's often used as an argument to prove that Service Locator is, after all, not an anti-pattern.
Message Dispatching using a DI Container #
In order to implement IChannel it's necessary to match messages to their appropriate consumers. One easy way to do this is by employing a DI Container. Here's an example that uses Autofac to implement IChannel, but any other container would do as well:
private class AutofacChannel : IChannel
{
    private readonly IComponentContext container;

    public AutofacChannel(IComponentContext container)
    {
        if (container == null)
            throw new ArgumentNullException("container");

        this.container = container;
    }

    public void Send<T>(T message)
    {
        var consumer = this.container.Resolve<IConsumer<T>>();
        consumer.Consume(message);
    }
}
This class is an Adapter from Autofac's IComponentContext interface to the IChannel interface. At this point I can always see the “Q.E.D.” around the corner: “look! Service Locator isn't an anti-pattern after all! I'd like to see you implement IChannel without a Service Locator.”
While I'll do the latter in just a moment, I'd like to dwell on the DI Container-based implementation for a moment.
- Is it simple? Yes.
- Is it flexible? Yes, although it has shortcomings.
- Would I use it like this? Perhaps. It depends :)
- Is it the only way to implement IChannel? No - see the next section.
- Does it use a Service Locator? No.
While AutofacChannel uses Autofac (a DI Container) to implement the functionality, it's not (necessarily) a Service Locator in action. This was the point I already tried to get across in my previous post about the subject: just because its mechanics look like a Service Locator, it doesn't mean that it is one. In my implementation, the AutofacChannel class is a piece of pure infrastructure code. I even made it a private nested class in my Composition Root to underscore the point. The container is still not available to the application code, so it is never used in the Service Locator role.
One of the shortcomings of the above implementation is that it provides no fallback mechanism. What happens if the container can't resolve the matching consumer? Perhaps there isn't a consumer for the message. That's entirely possible, because there are no safeguards in place to ensure that there's a consumer for every possible message.
The shape of the Send method enables the client to send any conceivable message type, and the code still compiles even if no consumer exists. That may look like a problem, but is actually an important insight into implementing an alternative IChannel class.
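One way to address that shortcoming (a sketch only; it assumes Autofac's ResolveOptional method and isn't part of the implementation above) would be to fall back gracefully inside AutofacChannel when no consumer is registered:

public void Send<T>(T message)
{
    // ResolveOptional returns null instead of throwing when nothing is registered.
    var consumer = this.container.ResolveOptional<IConsumer<T>>();
    if (consumer != null)
        consumer.Consume(message);
    // else: no consumer registered for T; ignore, log, or throw as appropriate.
}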
Message Dispatching using weakly typed matching #
Consider the IChannel.Send method once again:
void Send<T>(T message);
Despite its generic signature, it's important to realize that this is, in fact, a weakly typed method (at least when used with type inference, as in the above example). Just as with a bona fide Service Locator, it's possible for a developer to define a new class (Foo) and send it - and the code still compiles:
this.channel.Send(new Foo());
However, at run-time, this will fail because there's no matching consumer. Despite the generic signature of the Send method, it contains no type safety. This insight can be used to implement IChannel without a DI Container.
Before I go on I should point out that I don't consider the following solution intrinsically superior to using a DI Container. However, readers of my book will know that I consider it a very illuminating exercise to try to implement everything with Poor Man's DI once in a while.
Using Poor Man's DI often helps unearth some important design elements of DI because it helps to think about solutions in terms of patterns and principles instead of in terms of technology.
However, once I have arrived at an appropriate conclusion while considering Poor Man's DI, I still tend to prefer mapping it back to an implementation that involves a DI Container.
Thus, the purpose of this section is first and foremost to outline how message dispatching can be implemented without relying on a Service Locator.
While this alternative implementation isn't allowed to change any of the existing API, it's a pure implementation detail to encapsulate the insight about the weakly typed nature of IChannel into a similarly weakly typed consumer interface:
private interface IConsumer
{
    void Consume(object message);
}
Notice that this is a private nested interface of my Poor Man's DI Composition Root - it's a pure implementation detail. However, given this private interface, it's now possible to implement IChannel like this:
private class PoorMansChannel : IChannel
{
    private readonly IEnumerable<IConsumer> consumers;

    public PoorMansChannel(params IConsumer[] consumers)
    {
        this.consumers = consumers;
    }

    public void Send<T>(T message)
    {
        foreach (var c in this.consumers)
            c.Consume(message);
    }
}
Notice that this is another private nested type that belongs to the Composition Root. It loops through all injected consumers, so it's up to each consumer to decide whether or not to do anything about the message.
A final private nested class bridges the generically typed world with the weakly typed world:
private class Consumer<T> : IConsumer
{
    private readonly IConsumer<T> consumer;

    public Consumer(IConsumer<T> consumer)
    {
        this.consumer = consumer;
    }

    public void Consume(object message)
    {
        if (message is T)
            this.consumer.Consume((T)message);
    }
}
This generic class is another Adapter - this time adapting the generic IConsumer<T> interface to the weakly typed (private) IConsumer interface. Notice that it only delegates the message to the adapted consumer if the type of the message matches the consumer.
Each implementer of IConsumer<T> can be wrapped in the (private) Consumer<T> class and injected into the PoorMansChannel class:
var channel = new PoorMansChannel(
    new Consumer<ChangeUserNameCommand>(
        new ChangeUserNameConsumer(store)),
    new Consumer<RegisterUserCommand>(
        new RegisterUserConsumer(store)),
    new Consumer<ResetPasswordCommand>(
        new ResetPasswordConsumer(store)));
So there you have it: type-based message dispatching without a DI Container in sight. It would be easy to use convention-based configuration to scan an assembly, register all IConsumer<T> implementations, wrap them in Consumer<T> instances, and use that list to compose a PoorMansChannel instance. However, I will leave that as an exercise for the reader (or a later blog post).
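As a starting point for that exercise, here's a rough reflection-based sketch (my illustration, not a definitive solution). It assumes that it lives inside the same Composition Root, so the private IConsumer and Consumer<T> types are in scope, and that every concrete consumer takes the same store dependency from the snippet above:

// Scan an assembly for IConsumer<T> implementations, wrap each one in Consumer<T>,
// and compose a PoorMansChannel from the result.
var consumers =
    from t in typeof(RegisterUserConsumer).Assembly.GetExportedTypes()
    where !t.IsAbstract && !t.IsInterface
    from i in t.GetInterfaces()
    where i.IsGenericType
        && i.GetGenericTypeDefinition() == typeof(IConsumer<>)
    let messageType = i.GetGenericArguments()[0]
    let adapterType = typeof(Consumer<>).MakeGenericType(messageType)
    select (IConsumer)Activator.CreateInstance(
        adapterType,
        Activator.CreateInstance(t, store)); // assumes each consumer takes 'store'

var channel = new PoorMansChannel(consumers.ToArray());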
My claim still stands #
In conclusion, I find that I can still defend my original claim: Service Locator is an anti-pattern.
That claim, by the way, is falsifiable, so I do appreciate that people take it seriously enough by attempting to disprove it. However, until now, I've yet to be presented with a scenario where I couldn't come up with a better solution that didn't involve a Service Locator.
Keep in mind that a Service Locator is defined by the role it plays - not the shape of the API.
Comments
I'm helping a .NET vendor improve their blog by finding respected developers who will contribute guest posts. Each post will include your byline, URL, book link (with your Amazon affiliate link) and a small honorarium. It can either be a new post or one of your popular older posts.
Being an author myself, I know that getting in front of new audiences boosts sales, generates consulting opportunities and in this case, a little cash. Would you be interested? If so, let me know and I'll set you up.
Cheers,
Bob Walsh
Looking forward to getting your book later this month.
I think it comes down to the definition of Service Locator. I'm not sure that the AutofacChannel example is much different than a common example I come up against, which is a ViewModel factory that more or less wraps a call to the container, and then gets injected as a IViewModelFactory into classes that need to create VMs. I don't feel that this is "wrong," as I only allow this kind of thing in cases where more explicit injection is significantly more painful. However I do still think of it as Service Location, and it does violate the Three Calls Pattern. As long as I limit the number of places this is allowed and everyone is aware of them, I see little risk in doing it this way. Some might argue it's a slippery slope . . .
Whether it is a static dependency or an injected instance, it seems unnatural to me to call a service directly from a domain object. I think I've seen you say the same thing, but I'm not sure in what context.
Anyway, was wondering if you had any additional thoughts on the subject. I've been struggling with it for a while, and have settled (temporarily) on firing events outside of the domain object (i.e. call the domain.Method, then fire the event from the command handler).
Thanks for writing.
It's my experience that in MVVM one definitely needs some kind of factory to create ViewModels. However, there's no reason to define it as a Service Locator. A normal (generic) Abstract Factory is preferable, and achieves the same thing.
Regarding the question about whether or not to raise Domain Events from within Entities: it bothered me for a while until I realized that when you move towards CQRS and Event Sourcing, Commands and Events become first-class citizens and this problem tends to go away, because you can keep the logic about which commands raise which events decoupled from the Entities. In this architectural style, Entities are immutable, so nothing ever changes within them.
In CQRS we have consumers that consume Commands and Events, and typically we have a single consumer which is responsible for receiving a Command and (upon validation) convert it to an Event. Such a consumer is a Service which can easily hold other Services, such as a channel upon which Events can be published.
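As a sketch of such a consumer (my example; the event type and its property names are hypothetical):

public class ChangeUserNameCommandConsumer : IConsumer<ChangeUserNameCommand>
{
    private readonly IChannel eventChannel;

    public ChangeUserNameCommandConsumer(IChannel eventChannel)
    {
        this.eventChannel = eventChannel;
    }

    public void Consume(ChangeUserNameCommand command)
    {
        // Validate the command, then publish the corresponding event.
        this.eventChannel.Send(
            new UserNameChangedEvent(command.UserId, command.NewName));
    }
}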
Thanks.
I had that problem previously, but I thought I fixed it months ago. From where I'm sitting, the code looks OK both in Google Reader on the Web, Google Reader on Android as well as FeedDemon. Can you share exactly where and how it looks unreadable?
http://imgur.com/LvCfJ
Thanks.
Thanks.
Why I am posting:
I am currently designing the architecture of a new application and I want to design my domain models to be persistence ignorant, but still use them directly in the NHibernate mapping to benefit from lazy loading and to avoid near-identical entity objects. One part of a PI domain model is that the models might rely on services to do their work, which get injected using constructor injection. Now, NHibernate needs a default constructor by default, but that can be changed (see: http://fabiomaulo.blogspot.com/2008/11/entities-behavior-injection.html). In the middle of this post there is a class implementation called ReflectionOptimizer that is responsible for creating the entities. It uses an injected container to receive an instance of the requested entity type, or falls back to the default implementation of NHibernate if that type is unknown to the container.
Do you think this is using the container in a service locator role?
I think not, because a Poor Man's DI implementation of this class would get a list of factories, one for each supported entity and all of this is pure infrastructure.
The biggest benefit of changing the implementation in a way that it receives factories is that it fails fast: I am constructing all factories along with their dependencies in the composition root.
What is your view on this matter?
It'd be particularly clean if you could inject an Abstract Factory into your NHibernate infrastructure and keep the container itself isolated to the Composition Root. In any case I agree that this sounds like pure infrastructure code, so it doesn't sound like Service Location.
However, I'd think twice about injecting Services into Entities - see also this post from Miško Hevery.
If a Message Bus is used - what about Layers? Should it be some kind of "superlayer", visible to all other layers?
What do you think - in which situations should a Message Bus be involved?
Is it better to implement one yourself, or are there some good products? (C#, not a commercial licence)
It's not too hard to implement a message bus on top of a queue system, but it might be worth taking a look at NServiceBus or Rebus.
I have implemented this pattern before, and everything works well when the return type is void.
So, for dispatching messages, this is a really useful and flexible pattern.
However, I was looking into implementing the same thing for a query dispatcher. The structure is similar, with the difference that your messages are queries and that the consumer actually returns a result.
I do have a working implementation, but I cannot get type inference to work on my query dispatcher. That means that every time I call the query dispatcher I need to specify the return type and the query type.
This may seem a bit abstract, but you can check out this question on Stack Overflow: type inference with interfaces and generic constraints.
I'm aware that the way I'm doing it there is not possible with C#, but I was wondering if you see a pattern that would allow me to do that.
What is your view on this matter?
Many thanks!
Kenneth, thank you for writing. Your question reminded me of my series about Role Hints, particularly Metadata, Role Interface, and Partial Type Name Role Hints. The examples in those posts aren't generically typed, but I wonder if you might still find those patterns useful?
Hi Mark,
I am also using this pattern as an in-process mediator for commands, queries and events. My mediator (channel in your case) now uses a dependency resolver, which is a pure wrapper around a DI container and only contains resolve methods.
I am now trying to refactor the dependency resolver away by creating separate factories for the command, query and event handlers (in your example these are the consumers). My current code, and also yours and dozens of other implementations on the net, don't deal with releasing the handlers. My question is: should this be a responsibility of the mediator (channel)? I think so, because the mediator is the only place that knows about the existence of the handler. The problem I have with my answer is that the release of the handler is always called after a dispatch (send), even though the handler could be used again for sending another command of the same type during the same request (an HTTP request, for example). This implies that the handler's lifetime is per HTTP request.
Maybe I am thinking in the wrong direction, but I would like to hear your opinion about the problem of releasing handlers.
Many thanks in advance,
Martijn Burgers
Martijn, thank you for writing. The golden rule for decommissioning is that the object responsible for composing the object graph should also be responsible for releasing it. That's what some DI Containers (Castle Windsor, Autofac, MEF) do - typically using a Release method. The reason for that is that only the Composer knows if there are any nodes in the object graph that should be disposed of, so only the Composer can properly decommission an object graph.
You can also implement that Resolve/Release pattern using Pure DI. If you're writing a framework, you may need to define an Abstract Factory with an associated Release method, but otherwise, you can just release the graph when you're done with it.
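As a rough sketch of that idea (a hypothetical interface and names, not from the book or the article above):

// An Abstract Factory that pairs creation with decommissioning.
public interface IConsumerFactory<T>
{
    IConsumer<T> Create();
    void Release(IConsumer<T> consumer);
}

// The code that calls Create is also the code that calls Release:
var consumer = factory.Create();
try
{
    consumer.Consume(message);
}
finally
{
    factory.Release(consumer);
}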
In the present article, you're correct that I haven't addressed the lifetime management aspect. The Composer here is the last code snippet that composes the PoorMansChannel object. As shown here, the entire object graph has the same lifetime as the PoorMansChannel object, but I don't describe whether or not it's a Singleton, Per Graph, Transient, or something else. However, the code that creates the PoorMansChannel object should also be the code that releases it, if that's required.
In my book's chapter 8, I cover decommissioning in much more detail, although I don't cover this particular example. I hope this answer helps; otherwise, please write again.
AutoFixture goes Continuous Delivery with Semantic Versioning
For the last couple of months I've been working on setting up AutoFixture for Continuous Delivery (thanks to the nice people at http://teamcity.codebetter.com/ for supplying the CI service) and I think I've finally succeeded. I've just pushed some code from my local Mercurial repository, and 5 minutes later the release is live on both the CodePlex site and the NuGet Gallery.
The plan for AutoFixture going forward is to maintain Continuous Delivery and switch the versioning scheme from ad hoc to Semantic Versioning. This means that you'll obviously see releases much more often, and version numbers are going to be incremented much more often. The current release incidentally ended up at version 2.2.44, but since the versioning scheme has now changed, you can expect to see 2.3, 2.4, etc. in rapid succession.
While I've been mostly focused on setting up Continuous Delivery, Nikos Baxevanis and Enrico Campidoglio have been busy writing new features:
- Support for anonymous delegates
- Heuristics for static factory methods
- Inline AutoData Theories
- More anonymous numbers
Apart from these excellent contributions, other new features are
- Added StableFiniteSequenceCustomization
- Added [FavorArrays], [FavorEnumerables] and [FavorLists] attributes to xUnit.net extension
- Added a Generator<T> class
- Added a completely new project/package called Idioms, which contains convention-based tests (more about this later)
- Probably some other things I've forgotten about…
While you can expect version numbers to increase more rapidly and releases to occur more frequently, I'm also beginning to think about AutoFixture 3.0. This release will streamline some of the API in the root namespace, which, I'll admit, was always a bit haphazard. For those people who care, I have no plans to touch the API in the Ploeh.AutoFixture.Kernel namespace. AutoFixture 3.0 will mainly target the API contained in the Ploeh.AutoFixture namespace itself.
Some of the changes I have in mind will hopefully make the default experience with AutoFixture more pleasant - I'm unofficially thinking about AutoFixture 3.0 as the ‘pit of success’ release. It will also enable some of the various outstanding feature requests.
Feedback is, as usual, appreciated.
Service Locator: roles vs. mechanics
It's time to take a step back from the whole debate about whether or not Service Locator is, or isn't, an anti-pattern. It remains my strong belief that it's an anti-pattern, while others disagree. Although everyone is welcome to think differently than me, I've noticed that some of the arguments being put forth in defense of Service Locator seem very convincing. However, I believe that in those cases we no longer talk about Service Locator, but something that looks an awful lot like it.
Some APIs are easy to confuse with a ‘real’ Service Locator. It probably doesn't help that last year I published an article on how to tell the difference between a Service Locator and an Abstract Factory. In this article I may have focused too much on the mechanics of Service Locator, but as Derick Bailey was so kind to point out, this hides the role the API might play.
To repeat that earlier post, a Service Locator looks like this:
public interface IServiceLocator
{
    T Create<T>(object context);
}
All Service Locators I've seen so far look like that, or some variation thereof, but that doesn't mean that the reverse holds. Just because an API looks like that, it doesn't automatically mean that it's a Service Locator.
If it was, all DI containers would be Service Locators. As an example, here's Castle Windsor's Resolve method:
public T Resolve<T>()
Even AutoFixture has an API like that:
MyClass sut = fixture.CreateAnonymous<MyClass>();
It has never been my intention to denounce every single DI Container available, let alone my own open source framework. Service Locator is ultimately not identified by the mechanics of its API, but by the role it plays.
A DI container encapsulated in a Composition Root is not a Service Locator - it's an infrastructure component.
It becomes a Service Locator if used incorrectly: when application code (as opposed to infrastructure code) actively queries a service in order to be provided with required dependencies, then it has become a Service Locator.
Service Locators are spread thinly and pervasively throughout a code base - that is just as much a defining characteristic.
Comments
The anti-pattern comes from a DI container used as a ServiceLocator?
A DI Container is not, in itself, a Service Locator, but it can be used like one. If you do that, it's an anti-pattern, but that doesn't mean that any use of a container constitutes an anti-pattern. When a container is used according to the Register Resolve Release pattern from within a Composition Root it's all good.
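For illustration, here's a minimal sketch of Register Resolve Release with Autofac (my example; any container with similar lifetime support would do, and the Salutation and IMessageWriter types are the same shape as in the Unity example in a comment further down):

// Register
var builder = new ContainerBuilder();
builder.RegisterType<ConsoleMessageWriter>().As<IMessageWriter>();
builder.RegisterType<Salutation>();

using (var container = builder.Build())
using (var scope = container.BeginLifetimeScope())
{
    // Resolve
    var salutation = scope.Resolve<Salutation>();
    salutation.Exclaim();
}
// Release: disposing the lifetime scope (and the container) releases the graph.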
I pop by your blog intermittently: it's good stuff.
For some reason, I always had a particular idea of the type of person you are; but for some other reason, I thought that this stemmed from something other than your (excellent) writings.
I've just realised what it is.
That photo of you, up there on the top right - your "Contact" photo.
It looks like you have an ear-ring dangling from your left ear.
I have one concern about the Composition Root. For WPF applications, you said that the Composition Root is in OnStartup. So, if I want to compose the main window and all other windows (with their view models) in the app at once, in one place only (I mean in OnStartup), how could I do that? Thanks in advance!
namespace SimpleCSharpApp
{
    class Program
    {
        static void Main(string[] args)
        {
            // resolve object using Unity
            IUnityContainer container = new UnityContainer();
            foreach (var t in typeof(Program).Assembly.GetExportedTypes())
            {
                if (typeof(IMessageWriter).IsAssignableFrom(t))
                {
                    container.RegisterType(typeof(IMessageWriter), t, t.FullName);
                }
            }
            container.Resolve<Salutation>().Exclaim();
            Console.ReadLine();
        }
    }

    public class Salutation
    {
        private readonly IMessageWriter writer;

        public Salutation(IMessageWriter writer)
        {
            this.writer = writer;
        }

        public void Exclaim()
        {
            writer.Write("Hello DI!");
        }
    }

    public interface IMessageWriter
    {
        void Write(string message);
    }

    public class ConsoleMessageWriter : IMessageWriter
    {
        public void Write(string message)
        {
            Console.WriteLine(message);
        }
    }
}
Joining AppHarbor
I'm pleased to announce that I'll be joining AppHarbor as a developer. With my long-standing interest in TDD and OOD as well as my more recent interests in open-source .NET software, distributed source control systems, Continuous Delivery etc. AppHarbor seems like a perfect match for me.
Although AppHarbor is very attractive to me, this has been a difficult decision, as Commentor has been a great employer. However, despite great customers, I just don't feel like consulting at the moment. Since Safewhere went out of business I've been writing much less code than I'd have liked, so when presented with an opportunity to join such a congenial outfit as AppHarbor, I had few doubts.
I'll still be working out of Copenhagen, Denmark, and I also expect to keep up my usual community engagement at home as well as abroad.
Comments
Looking forward to the results of you guys working together! :)
Composition Root
In my book I describe the Composition Root pattern in chapter 3. This post serves as a summary description of the pattern.
The Constructor Injection pattern is easy to understand until a follow-up question comes up:
Where should we compose object graphs?
It's easy to understand that each class should require its dependencies through its constructor, but this pushes the responsibility of composing the classes with their dependencies to a third party. Where should that be?
It seems to me that most people are eager to compose as early as possible, but the correct answer is:
As close as possible to the application's entry point.
This place is called the Composition Root of the application and defined like this:
A Composition Root is a (preferably) unique location in an application where modules are composed together.
This means that all the application code relies solely on Constructor Injection (or other injection patterns), but is never composed. Only at the entry point of the application is the entire object graph finally composed.
The appropriate entry point depends on the framework:
- In console applications it's the Main method
- In ASP.NET MVC applications it's global.asax and a custom IControllerFactory
- In WPF applications it's the Application.OnStartup method
- In WCF it's a custom ServiceHostFactory
- etc.
(you can read more about framework-specific Composition Roots in chapter 7 of my book.)
The Composition Root is an application infrastructure component.
Only applications should have Composition Roots. Libraries and frameworks shouldn't.
The Composition Root can be implemented with Pure DI, but is also the (only) appropriate place to use a DI Container.
A DI Container should only be referenced from the Composition Root. All other modules should have no reference to the container.
Using a DI Container is often a good choice. In that case it should be applied using the Register Resolve Release pattern entirely from within the Composition Root.
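For illustration, here's a minimal sketch of such a Composition Root in a console application, using Pure DI and reusing the Salutation and ConsoleMessageWriter types from the Unity example in an earlier comment:

public static class Program
{
    public static void Main(string[] args)
    {
        // The Composition Root: the only place where the classes are wired together.
        IMessageWriter writer = new ConsoleMessageWriter();
        var salutation = new Salutation(writer);

        salutation.Exclaim();
    }
}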
Read more in Dependency Injection Principles, Practices, and Patterns.
Comments
The earlier you compose, the more you limit your options, and there's simply no reason to do that.
You may or may not find this article helpful - otherwise, please write again :)
My question is quite parallel to David’s. What if we have a complex application that loads its modules dynamically on demand, and these modules can consist of multiple assemblies?
For example, a client application that has a page-based navigation. Different modules can be deployed and undeployed all the time. At design-time we don’t know all the types we wanna compose and we don’t have direct reference to them.
Should we introduce some infrastructure code, like David suggested, to let the modules register their own types (e.g. services) to a child container, then we apply “resolve” on the loaded module and dispose the child container when we are done with the module?
Looking forward to the final release of your book BTW :-)
Thanks
One option is to use whichever DI Container you'd like from the Composition Root, and use its XML capabilities to configure the modules into each particular application that needs them. That's always possible, but adds some overhead, and is rarely required.
A better option is to simply drop all desired modules in an add-in folder and then use convention-based rules to scan each assembly. The most difficult part of that exercise is that you have to think explicitly about cardinality, but that's always an issue with add-in architectures. The MEF chapter of my book discusses cardinality a bit.
I think I'm headed in this direction:
UI (ASP.NET) depends on the Presentation layer and a pure interfaces/DTOs layer (with any serialization attributes/methods needed), so the structure/schema of the 'model' lives in that domain-schema layer. The domain model depends on the domain-schema layer, but is purely behaviors (methods/logic). In this way there's a sharing of schema/DTOs between the service layer and the presentation layer, with the clean separation of all business logic (and only business logic) living in the domain layer. So outside of the composition root, neither the UI, presentation layer, nor resource access layers (public api -> internal adapter -> individual resource) can touch the domain methods or types. Requires we define an interface for the domain model classes, I think? Cross-cutting concerns would live in their own assembly.
Any thoughts?
The catch comes when you look at the behavior, and you realize that there is hardly any behavior on these objects, making them little more than bags of getters and setters. Indeed often these models come with design rules that say that you are not to put any domain logic in the domain objects. (Anemic Domain Model - Martin Fowler)
It becomes a 'real' domain model when it contains all (or most) of the behaviour that makes up the business domain (note I'm emphasising business logic, not UI or other orthogonal concerns). (Anemic Domain Model - Stack Overflow)
I don't feel that this is headed into an Anemic domain model as the domain model assembly would contain all behaviors specific to the domain.
Designs that share very little state or, even better, have no state at all tend to be less prone to hard-to-analyze bugs and easier to repurpose when requirements change. Blog article - makes sense, but I could be missing multiple boats here.
The domain would have zero state, and would depend on interfaces not DTOs. This would make test-ability and analysis simple.
Motivation for DTOs in the interfaces layer: presentation and persistence could all share the DTOs rather than having them duplicated in either place.
Barring that perceived gain, then the shared layer across all would just be interfaces.
The UI could depend on interfaces only, but the same assembly would contain DTOs that the presentation layer and any persistence could share without having to be duplicated/rewritten. So the domain is completely persistence and presentation ignorant.
The purpose of a DTO is "to transfer multiple items of data between two processes in a single method call." That is (I take it) not what you are attempting to do here.
Whether or not you implement the business logic in the same assembly as your DTOs has nothing to with avoiding an Anemic Domain Model. The behavior must be defined by the object that holds the data.
I am writing a WPF system tray application which has a primary constraint to keep lean, especially in terms of performance and memory usage. This constraint, along with the relatively low complexity of the application, means that I cannot justify the overhead of MVVM. In the absence of a prescribed architecture for building the object graph, I could set one up 'manually' in Application.OnStartup. Presumably it would mean instantiating all the Window objects (with dependencies), which are utilised from time to time by the user.
However, I have a problem with these Window instances sitting in memory, doing nothing, 90% of the time. It seems much more sensible to me to instantiate these Window objects in the events where the user asks for them. Yet, I will want to maintain the inversion of control, so how can I avoid accessing the DI container in multiple points in my app?
You can use one of the solutions outlined in my recent series on Role Hints to avoid referencing a container outside of the Composition Root.
I wanted to post the following comment on your linked page, here
/2011/03/04/Composeobjectgraphswithconfidence
but I am encountering a page error on submission.
The above article suggests a great framework for delayed construction of resources, however I have a different reason for wanting this than loading an assembly.
In my scenario I have occasionally-needed Window objects with heavy GUIs, presenting lots of data. The GUI and data only need to be in memory while the Window is displayed. The lazy loading approach seems to apply, except it leaves me with a couple of questions.
1) When the window is closed again, I want to release the resources. How can I unload and reload it again?
2) What would this line look like when using an IoC like Ninject?
> new Lazy<ISolo<IMarker>>(() => new C1(
An alternative I am considering is putting classes defining events in the object graph, instead of the windows themselves. These event classes could be constructed with instances of dependencies which will be used in the event methods to construct Windows when needed. The dependencies I want to inject to maintain inversion of control are lightweight objects like persistence services, which will exist as singletons in memory anyway (due to the way most IoC containers work). Do you see any problem with this?
Many thanks.
I can't help you with Ninject.
You make the point that only the composition root should register dependencies of various layers and components, and that thereby only the project containing the composition root would have references to the DI container being used.
In a couple of projects I've worked on we have multiple applications, and thereby multiple composition roots - websites, service endpoints, console applications, etc. - giving us around 10 application endpoints that need to have their own composition roots. But at the same time we work with a component-based design which nicely encapsulates different aspects of the system.
If I were to only do my registration code within the composition roots, then applications using the same components would have to register exactly the same dependencies per component, leading to severe code duplication.
At this point we've gone with a solution where every component has its own registration class which the composition roots then register, so basically our composition roots only compose another level of composition roots. I hope it makes sense!
What I'm explaining here directly violates the principles you make in this post, which I already had a sense it would. But at the same time I don't know how else to manage this situation effectively. Do you have any ideas for how to handle this type of situation?
Allan, thank you for writing. In the end, you'll need to do what makes sense to you, and based on what you write, it may make sense to do what you do. Still, my first reaction is that I would probably tend to worry less about duplication than you seem to do. However, I don't know your particular project: neither how big your Composition Roots are, or how often they change. In other words, it's easy enough for me to say that a bit of duplication is okay, but that may not fit your reality at all.
Another point is that, in my experience at least, even if you have different applications (web sites, services, console applications, etc.) as part of the same overall system, and they share code, the way they should be composed tend to diverge the longer they live. This is reminiscent of Mathias Verraes' point that duplication may be coincidental, but that the duplicated code may deviate from each other in the future. This could also happen for such applications; e.g. at one point, you may wish to instrument some of your web site's dependencies, but not your batch job.
My best advice is to build smaller systems, in order to keep complexity down, and then build more of them, rather than building big systems. Again, that advice may not be useful in your case...
Another option, since you mention component-based design, is to move towards a convention-based approach, so that you don't have to maintain those Composition Roots at all: just drop in the binaries, and let your DI Container take care of the rest. Take a look at the Managed Extensibility Framework (MEF) for inspiration. Still, while MEF exposes some nice ideas, and is a good source of inspiration, I would personally chose to do something like this with a proper DI Container that also supports run-time Interception and programmable Pointcuts.
In the end, I think I understand your problem, but my overall reaction is that you seem to have a problem you shouldn't have, so my priority would be:
- Remove the problem altogether.
- Live with it.
- If that's not possible, solve it with technology.
It seems that I have a certain inclination for reopening "old" posts. Still, they may be considered evergreens! :) Now back on topic.
When it comes to well-known frameworks such as ASP.NET MVC, Web API, and WCF, it is quite clear where and how to set up our composition root. But what if we do not have a clear "entry point" to our code? Imagine that you are writing an SDK, laying down some classes that will be used by other developers. Now, you have a class that exposes the following constructor.
public MyClass(IWhatEver whatEver)
Consider also that whoever is going to use this class has no idea about IWhatEver, nor should they. To make the usage of MyClass as simple as possible, I should be able to instantiate MyClass via the parameterless constructor. I had the idea of making the constructor that is used for DI internal, and the only publicly available one the parameterless constructor, and then somehow fetching the instances of my dependencies in the parameterless constructor. Now imagine that I have several classes like MyClass, and that I do have a common set of "services" that are injected into them. Now, my questions are:
- Can I still have the single composition root?
- How do I trigger the composition?
- Does recreating the composition root for each "high level" class have a significant performance impact (considering a dependency tree of, let's say, 1000 objects)?
- Is there a better way (pattern) to solve a problem like this?
Mario, thank you for writing. If you're writing an SDK, you are writing either a library or a framework, so the Composition Root pattern isn't appropriate (as stated above). Instead, consider the guidelines for writing DI friendly libraries or DI friendly frameworks.
So, in order to answer your specific questions:
1: Conceptually, there should only be a single Composition Root, but it belongs to the application, so as a library developer, then no: 'you' can't have any Composition Root.
2: A library shouldn't trigger any composition, but as I explain in my guidelines for writing DI friendly libraries, you can provide Facades to make the learning curve gentle. For frameworks, you may need to provide appropriate Abstract Factories.
3: The performance impact of Composition shouldn't be a major concern. However, the real problem of attempting to apply the Composition Root pattern where it doesn't apply is that it increases coupling and takes away options that the client developer may want to utilize; e.g. how can a client developer instrument a sub-graph with Decorators if that sub-graph isn't available?
4: There are better ways to design DI friendly libraries and DI friendly frameworks.
Thank you for your quick and valuable reply.
I read your posts and gave them some thought. Still, I am not convinced that what is described in those posts tackles my problem. Even more, I'm realizing that it is less and less relevant to the topic of the composition root. However, let me try to describe a situation.
I do plan to write an SDK (a library that will allow developers to interact with my application and use some of its functionality). Consider that I'm exposing a class called FancyCalculator. FancyCalculator needs to get some information from the settings repository. Now, I have a dependency on ISettingsRepository implementation and it is injected via a constructor into my FancyCalculator. Nevertheless, who is going to use the FancyCalculator, doesn't know and he shouldn't know anything about ISettingsRepository and it's dependency tree. He is expected only to make an instance of FancyCalculator and call a method on it. I do have a single implementation of ISettingsRepository in the form of a SettingsRepository class. In this case, how do I get to create an instance of SettingsRepository once my FancyCalculator is created?
A Factory? Service locator? Something else?
Composition root in applications like MVC, WebAPI, etc, is a very nice and clean approach, but what about the situations when we do not have a clean single entry point to the application?
Thank you again!
Mario, thank you for writing again. As far as I understand, you basically have this scenario:
public class FancyCalculator
{
    private readonly ISettingsRepository settingsRepository;
}
Then you say: "a dependency on ISettingsRepository implementation and it is injected via a constructor into my FancyCalculator". Okay, that means this:
public class FancyCalculator
{
    private readonly ISettingsRepository settingsRepository;

    public FancyCalculator(ISettingsRepository settingsRepository)
    {
        if (settingsRepository == null)
            throw new ArgumentNullException("settingsRepository");
        this.settingsRepository = settingsRepository;
    }
}
But then you say: "who is going to use the FancyCalculator, doesn't know and he shouldn't know anything about ISettingsRepository and it's dependency tree."
That sounds to me like mutually exclusive constraints. Why are you injecting ISettingsRepository into FancyCalculator if you don't want to enable the user to supply any implementation of ISettingsRepository? If you don't want to allow that, the solution is easy:
public class FancyCalculator
{
    private readonly ISettingsRepository settingsRepository;

    public FancyCalculator()
    {
        this.settingsRepository = new DefaultSettingsRepository();
    }
}
If you want the best of both worlds, the solution is the Facade pattern I already described in my DI Friendly library article.
Hi Mark, you got the example right. The reasons for my choices are the following. I do not want users of my library to know about the dependencies, for a couple of reasons. First of all, it needs to be as simple as possible to use. Second, the dependencies are, let's say, "internal", so that the code is loosely coupled and testable. The user should not even know about them or be bothered at any point. Still, I do not think it's wise to create instances of the dependencies in the parameterless constructor. Why? Well, I am concerned about maintainability. I would like to somehow request the default instance of my dependency in the parameterless constructor and get it, so that I have a single point in my SDK where the default dependency is specified, and also so that I do not need to handle the dependency tree. The fact is that I can't figure out what this something should be. I re-read chapter 5 of your book this weekend to see if I could come up with some valid ideas. What I came up with is that I should use the parameterless constructor of each of my classes to handle its own default dependencies. In this way, resolving the tree should be fine, and it is easy to maintain. Also, to prevent the user from seeing the constructors that require the dependencies to be injected (and being confused by them), I thought of declaring these constructors as internal. These are my words translated into code (easier to understand).
class Program
{
    static void Main(string[] args)
    {
        // Using the SDK by others
        FancyCalculator calculator = new FancyCalculator();
        Console.WriteLine(calculator.DoSomeThingFancy());
    }
}

public interface ISettingsRepository { }

public interface ILogging { }

public class Logging : ILogging { }

public class SettingsRepository : ISettingsRepository
{
    private readonly ILogging logging;

    public SettingsRepository()
    {
        this.logging = new Logging();
    }

    internal SettingsRepository(ILogging logging)
    {
        if (logging == null)
            throw new ArgumentNullException("logging");
        this.logging = logging;
    }
}

public class FancyCalculator
{
    private readonly ISettingsRepository settingsRepository;

    public FancyCalculator()
    {
        this.settingsRepository = new SettingsRepository();
    }

    internal FancyCalculator(ISettingsRepository settingsRepository)
    {
        if (settingsRepository == null)
            throw new ArgumentNullException("settingsRepository");
        this.settingsRepository = settingsRepository;
    }

    public int DoSomeThingFancy()
    {
        return 1;
    }
}
What do you think about a solution like this? What are the bad sides of this approach? Is there any pattern that encapsulates a practice like this?
Mario, you don't want the user of your library to know about the dependencies, in order to make it easy to use the SDK. That's fine: this goal is easily achieved with one of those Facade patterns I've already described. With either Constructor Chaining, or use of the Fluent Builder pattern, you can make it as easy to get started with FancyCalculator as possible: just invoke its parameterless constructor.
What then, is your motivation for wanting to make the other constructors internal? What do you gain from doing that?
Such code isn't loosely coupled, because only classes internal to the library can use those members. Thus, such a design violates the Open/Closed Principle, because these classes aren't open for extension.
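As a sketch of the Constructor Chaining alternative (my illustration, reusing the types from Mario's example, but with both constructors public):

public class FancyCalculator
{
    private readonly ISettingsRepository settingsRepository;

    // Facade: chains to the 'real' constructor with a sensible default.
    public FancyCalculator()
        : this(new SettingsRepository())
    {
    }

    // Still public, so client developers can supply their own implementation.
    public FancyCalculator(ISettingsRepository settingsRepository)
    {
        if (settingsRepository == null)
            throw new ArgumentNullException("settingsRepository");
        this.settingsRepository = settingsRepository;
    }

    public int DoSomeThingFancy()
    {
        return 1;
    }
}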
Dear Mark,
I am pretty sure that Mario uses internal code in order to be able to write unit tests. Probably the interfaces of the dependencies which he is hiding should also be internal. I think it is a good approach because, thanks to it, he can safely refactor those interfaces, because they are not a public API, and he can also have unit tests.
Yes - "Such code isn't loosely coupled, because only classes internal to the library can use those member", however I think that very often stable API is very important for libraries.
Could you give any comments on that?
Robert, thank you for writing. Your guess sounds reasonable. Even assuming that this is the case, I find it important to preface my answer with the caution that since I don't know the entire context, my answer is, at best, based on mainstream scenarios I can think of. There may be contexts where this approach is, indeed, the best solution, but in my experience, this tends not to be the case.
To me, the desire to keep an API stable leads to APIs that are so locked down that they are close to useless unless you happen to be so incredibly lucky that you're right on the path of a 'supported use case'. Over the years, I've worked with many object models in the .NET Base Class Library, where I've wanted it to do something a little out of the ordinary, only to find myself in a cul-de-sac of internal interfaces, sealed classes, or internal virtual methods. Many other people have had the same problems with .NET, which, I believe, has been a contributing factor causing so many of the brightest programmers to leave the platform for other, more open platforms and languages (Ruby, Clojure, JavaScript, Erlang, etc.).
The .NET platform is getting much better in this regard, but it's still not nearly as good as it could be. Additionally, .NET is littered with failed technologies that turned out to be completely useless after all (Windows Workflow Foundation, Commerce Server, (early versions of) Entity Framework, etc.).
.NET has suffered from a combination of fear of breaking backwards compatibility, combined with Big Design Up-Front (BDUF). The fact that (after all) it's still quite useful is a testament to the people who originally designed it; these people were (or are) some of the most skilled people in the industry. Unless you're Anders Hejlsberg, Krzysztof Cwalina, Brad Abrams, or on a similar level, you can't expect to produce a useful and stable API if you take the conservative BDUF approach.
Instead, what you can do is to use TDD to explore the design of your API. This doesn't guarantee that you'll end up with a stable API, but it does help to make the API as useful as possible.
How do you ensure stability, then? One option is to realise that you can use interfaces as access modifiers. Thus, you can publish a library that mostly contains interfaces and a few high-level classes, and then add the public implementations of those interfaces in other libraries, and clearly document that these types are not guaranteed to remain stable. This option may be useful in some contexts, but if you're really exposing an API to the wild, you probably need a more robust strategy.
The best strategy I've been able to identify so far is to realise that you can't design a stable API up front. You will make mistakes, and you will need to deal with these design mistakes. One really effective strategy is to apply the Strangler pattern: leave the design mistakes in, but add new, better APIs as you learn; SOLID is append-only. In my Encapsulation and SOLID Pluralsight course, I discuss this particular problem and approach in the Append-Only section of the Liskov Substitution Principle module.
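As a deliberately simplified sketch of that append-only idea (all type and member names here are invented), a flawed member can be left in place and marked obsolete, while a better alternative is added next to it:
using System;

public class DistanceCalculator
{
    // The original, less-than-ideal member stays in place so existing clients keep compiling...
    [Obsolete("Use GetDistance(UnitOfMeasure) instead.")]
    public double GetDistanceInKilometers()
    {
        return this.GetDistance(UnitOfMeasure.Kilometers);
    }

    // ...while the improved API is appended next to it.
    public double GetDistance(UnitOfMeasure unit)
    {
        // Calculation omitted for brevity.
        return 0;
    }
}

public enum UnitOfMeasure
{
    Kilometers,
    Miles
}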
Good point, Mark! I really like your point of view; in general I agree with you.
I will try to 'defend' internal dependencies, because it may help us explore all the possible reasonable usages of them.
1. Personally, I would hide my dependencies with internal access when I am quite certain that my design is bad, but I don't have time to fix it yet and will do so in the (probably near) future.
2. Moreover, if I had unit tests that don't access internals (so not the case I described in my previous comment), then I should be able to refactor the 'internals' without even touching the unit tests. This is why I see internal dependencies as a refactoring technique - especially when working with legacy code.
These were some cases for a Library/Framework. What about internal access in non-modular Application development that consists of several projects (DLLs)? Personally, I try to keep everything internal by default and only make interfaces, entities, messages, etc. public when they are used between the projects. Then I compose them in a dedicated project (which I name Bootstrapper), which is the Composition Root and has access to all internal types... If your book covers this topic, please just give a reference. I have not read your book so far, but it is in the queue :)
Please note that I'm not saying that internal or private classes are always bad; all I'm saying is that unit testing internal classes (presumably using the [InternalsVisibleTo] attribute) is, in my opinion, not a good idea.
You can always come up with some edge cases against a blanket statement like never unit test internal classes, but in the general case, I believe I've already outlined why I think it's a really poor idea to do so.
Ultimately, I've never understood the need for the [InternalsVisibleTo] attribute. When you're applying it, you're basically making internal types public to select consumers, with all the coupling that implies. Why not make the types truly public then?
As far as I can tell, the main motivation for not making types public is when the creators don't trust their code. If this is the case, making things internal isn't a solution, it's a symptom. Address the problem instead of trying to hide it. Make the types trustworthy by applying encapsulation.
An example from my last days at work: I marked a class as internal that wraps some native library (a driver for some hardware) using [DllImport]. I want clients of my library to use my classes - not the DLL wrapper - which is why I hide it using internal. However, I needed [InternalsVisibleTo] so that I could write integration tests to see whether the wrapper really works.
Why make something public when nobody outside is using it? YAGNI. I only expose something when I know it's needed by the clients. Then, for whatever I have already exposed, I need to maintain backward compatibility. The less I have exposed, the more easily and safely I can refactor my design. And it is not always about bad design: new requirements frequently arrive in parallel with refactoring the design. This is why I like Martin Fowler's idea of a Published Interface, which is also mentioned in his Access Modifier article.
Additionally, I have always had a feeling that making everything public can badly influence the software architecture. The more encapsulated the packages are, the better. And for me, internal access is also a means of encapsulation at the packaging level. Three days ago Simon Brown gave a presentation named "Software Architecture vs Code" at the DevDay conference. When he said something like "do not make classes public by default", people applauded!
My rules of thumb for setting class access modifiers are (see the sketch after this list):
- private - when it is a helper for the class in which it is nested
- internal - when it is used only within the package
- public - when other packages need to use it
- oh, and how I would love to have this Published Interface!
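A tiny, hypothetical C# sketch of those rules of thumb (all names invented):
// public: other packages need to use it.
public interface IMessageSender { }

// internal: used only within this package.
internal class SmtpMessageSender : IMessageSender
{
    // private: a helper nested in the class that uses it.
    private class HeaderBuilder { }
}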
Thanks to Mark Seemann for his very inspiring write-up about the Composition Root design pattern: /2011/07/28/CompositionRoot/
I wrote my own JavaScript Dependency Injection Framework called Di-Ninja with these principles in mind: https://github.com/di-ninja/di-ninja
As far as I know, it is the only one in JavaScript that implements the Composition Root design pattern, and its documentation could be another good example to demonstrate how it works.
It works for both Node.js and the browser (with Webpack).
Thanks for this insightful post!
I was just wondering: could we say, then, that using a Service Locator exclusively at the Composition Root is not something bad - that it's actually perfectly fine?
I'm asking because the Service Locator 'anti-pattern' is widely discussed; however, at the Composition Root its usage is not a problem at all, right?
Thanks for the amazing blog! ;)
Lisber, thank you for writing. Does this answer your question?
Totally!
The sum up at the end of that article is brilliant!
"It becomes a Service Locator if used incorrectly: when application code (as opposed to infrastructure code) actively queries a service in order to be provided with required dependencies, then it has become a Service Locator."
Thank you for clarifying that.
SOLID Code isn't
Recently I had an interesting conversation with a developer at my current client, about how the SOLID principles would impact their code base. The client wants to write SOLID code - who doesn't? It's a beautiful acronym that fully demonstrates the power of catchy terminology.
However, when you start to outline what it actually means people become uneasy. At the point where the discussion became interesting, I had already sketched my view on encapsulation. However, the client's current code base is designed around validation at the perimeter. Most of the classes in the Domain Model are actually internal and implicitly trust input.
We were actually discussing Test-Driven Development, and I had already told them that they should only test against the public API of their code base. The discussion went something like this (I'm hoping I'm not making my ‘opponent' sound dumb, because the real developer I talked to was anything but):
Client: "That would mean that each and every class we expose must validate input!"
Me: "Yes…?"
Client: "That would be a lot of extra work."
Me: "Would it? Why is that?"
Client: "The input that we deal with consist of complex data structures, and we must validate that all values are present and correct."
Me: "Assume that input is SOLID as well. This would mean that each input instance can be assumed to be in a valid state because that would be its own responsibility. Given that, what would validation really mean?"
Client: "I'm not sure I understand what you mean…"
Me: "Assuming that the input instance is a self-validating reference type, what could possibly go wrong?"
Client: "The instance might be null…"
Me: "Yes. Anything else?"
Client: "Not that I can think of…"
Me: "Me neither. This means that while you must add more code to implement proper encapsulation, it's really trivial code. It's just some Guard Clauses."
Client: "But isn't it still gold plating?"
Me: "Not really, because we are designing for change in the general sense. We know that we can't predict specific change, but I can guarantee you that change requests will occur. Instead of trying to predict specific changes and design variability in those specific places, we simply put interfaces around everything because the cost of doing so is really low. This means that when change does happen, we already have Seams in the right places."
Client: "How does SOLID help with that?"
Me: "A result of the Single Responsibility Principle is that each self-encapsulated class becomes really small, and there will be a lot of them."
Client: "Lots of classes… I'm not sure I'm comfortable with that. Doesn't it make it much harder to find what you need?"
Me: "I don't think so. Each class is very small, so although you have many of them, understanding what each one does is easy. In my experience this is a lot easier than trying to figure out what a big class with thousands of lines of code does. When you have few big classes, your object model might look something like this:"
"There's a few objects and they kind of fit together to form the overall picture. However, if you need to change something, you'll need to substantially change the shape of each of those objects. That's a lot of work, and this is why such an object design isn't particularly adaptable to change.
"With SOLID, on the other hand, you have lots of small-grained objects which you can easily re-arrange to match new requirements:"
And that's when it hit me: SOLID code isn't really solid at all. I'm not a material scientist, but to me a solid indicates a rigid structure. In essence a structure where the particles are tightly locked to each other and can't easily move about.
However, when thinking about SOLID code, it actually helps to think about it more like a liquid (although perhaps a rather viscous one). Each class has much more room to maneuver because it is small and fits together with other classes in many different ways. It's clear that when you push an analogy too far, it breaks apart.
Still, a closing anecdote is appropriate...
My (then) three-year old son one day handed me a handful of Duplo bricks and asked me to build him a dragon. If you've ever tried to build anything out of Duplo you'll know that the ‘resolution' of the bricks is rather coarse-grained. Given that ‘a handful' for a three-year old isn't a lot of bricks, this was quite a challenge. Fortunately, I had an appreciative audience with quite a bit of imagination, so I was able to put the few bricks together in a way that satisfied my son.
Still, building a dragon of comparable size out of Lego bricks is much easier because the bricks have a much finer ‘resolution'. SOLID code is more comparable to Lego than Duplo.
Comments
Functional units of small size are key to evolvable code. This makes for a fine grained "code sand" which can be brought into ever changing shapes.
The question however is: What is to become smaller? Classes or methods?
I'd say it's classes that need to be limited in size, maybe 50 to 100 LOC.
Classes are to be kept small because they are the "blueprints" of the smallest stateful runtime units of code which can be recombined: objects.
However this leads to two problems:
1. It's difficult to think of a decomposition of the very common noun-classes into smaller classes. How to break up a PaymentService class into many smaller classes?
2. Small classes would lead to an explosion of dependencies between all of these classes. They would be hard to manage (even with dependency injection).
That's two reasons why most developers will probably resist decreasing the size of their classes. They'll just keep methods small - but the effect of this will be limited. There's no limit to the size of a class; a 1,000 LOC class can consist of 100 methods, each 10 lines long.
But the two problems go away if SOLID is applied in combination with a different view on object orientation.
We just need to switch from nouns to verbs when thinking about domain logic classes (not data classes).
If classes become functions (or maybe better: behaviors) then there is no decomposition problem anymore. Behaviors can be as small as needed, maybe just 1 LOC.
Also, behaviors conceptually don't have any dependencies on each other (though maybe on an environment); sitting is not dependent on running, even though running might "be done" after sitting.
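As a hedged sketch of what that verb-oriented view could look like in C# (all names invented), each behavior becomes a tiny class behind an interface instead of one large PaymentService noun-class:
public interface IAuthorizePayment
{
    void Authorize(decimal amount);
}

public interface ICapturePayment
{
    void Capture(decimal amount);
}

// Each 'verb' becomes a class of its own; the body can be as small as a single line.
public class PaymentAuthorizer : IAuthorizePayment
{
    public void Authorize(decimal amount)
    {
        // Authorization logic omitted for brevity.
    }
}

public class PaymentCapturer : ICapturePayment
{
    public void Capture(decimal amount)
    {
        // Capture logic omitted for brevity.
    }
}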
"I certainly wouldn't want to drive a Lego car or walk across a Lego bridge"
ROTFL
Great quote! I've always liked the principles behind SOLID, but have always been a little turned off by some of the zealotry associated with it. Like most everything in life, it's great "in moderation".
Thanks for your article.
You are describing a philosophy which has been almost standard practice in Smalltalk since the 80's. As you know, albeit desired, it is however not always possible to make small classes.
Recommended reading (also for OO in general): "Smalltalk, Objects and Design" by Chamond Liu, ISBN 1-8847777-27-9, Manning Publications Co, 1996.
Kind regards,
Ted
..btw, I am sure that there must be a photo of yours whereon you look a bit happier?
It's my experience that there is lots of discussion around the concept of abstraction and too little discussion around the use and practice of abstraction.
I interview many software development candidates, and I always explore what I consider foundational concepts - one is abstraction. I find that most developers / programmers / software engineers (/ whatever) cannot adequately communicate HOW they use abstraction to enable their code - not on an architectural level, not on a modular level, not on a class level, and not even on a simple functional level. This is a real indication of the state of software design, and it is alarming to me.
It's my belief that designing abstraction in software engineering is the most critical tool in software construction, but it is the least discussed.
So since there is no real expression of abstraction (except for inheritance) in code, it's neglected or left to personal style. Sad.
I’m not convinced that the cost is indeed "really low", for two reasons.
1) Interfaces, before TDD became popular, imparted information to the maintainer of a type because they were used to indicate, for instance, that a given type was to be treated like a collection that could be enumerated or a type that could be compared with other types for the purposes of, say, sorting. When added reflexively as a means of inserting a level of indirection, the interface no longer imparts any information.
2) When navigating around a large code base where nearly all types are accessed via a level of indirection (courtesy of ubiquitous interfaces) my IDE – Visual Studio 2010 – struggles to answer common questions like "what code will be executed when SomeType.ReadHeader() is called?". Hitting F12 doesn’t take me to the definition of the method but rather takes me to the definition of the interface which, because of the point above, is of no value. ReSharper can sometimes find the method with a more advanced search but not always. The upshot of this is that code making heavy use of interfaces becomes much harder to statically analyse.
Agreed. That point is manifest in @Dave's subsequent post. So, how would the lack of direct language support for abstraction - other than inheritance - be solved? And how would code tracing / debugging interfaces and their implementations be handled?
My gut tells me this is one reason (of several) that scripting-like languages have gained popularity in the recent past, and that we are being pointed in a more real-time, interpreted language direction due to the necessary dynamic nature of these constructs. Hm... {thinking}
Love a blog that makes me think for more than a few seconds.
"So, how would the lack of direct language support for abstraction - other than inheritance - be solved?"
and I don't know if we should hope for languages to offer support soon. We have to live with what we have: Java, C#, C++, Ruby, etc. (Scripting languages, to me, don't offer better support for abstraction.)
Instead we need to figure out how to use whatever features these languages offer in a way to express abstraction - and I don't mean inheritance ;-)
And that leads to a separation of mental model from technical features. We need to think in a certain way about our solutions - and then translate our mental models to language reality.
Object orientation tried to unify mental model and code reality. But as all those maintenance nightmare projects show, this has not really worked out as nicely as expected. The combination of OO method (OOAD) and OO programming has not lived up to our expectations.
Since we're pretty much stuck with OO languages, we need to replace the other part of the pair, the method, I'd say.
Very interesting article...
At the Boundaries, Applications are Not Object-Oriented
My recent series of blog posts about Poka-yoke Design generated a few responses (I would have been disappointed had this not been the case). Quite a few of these reactions relate to various serialization or translation technologies usually employed at application boundaries: Serialization, XML (de)hydration, UI validation, etc. Note that such translation happens not only at the perimeter of the application, but also at the persistence layer. ORMs are also a translation mechanism.
Common to most of the comments is that lots of serialization technologies require the presence of a default constructor. As an example, the XmlSerializer requires a default constructor and public writable properties. Most ORMs I've investigated seem to have the same kind of requirements. Windows Forms and WPF Controls (UI is also an application boundary) also must have default constructors. Doesn't that break encapsulation? Yes and no.
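To make that shape concrete, here is roughly what such a serializable type ends up looking like (NameDto is a made-up name for this sketch): a default constructor (here implicit) and public writable properties, which leaves no room for protecting invariants:
// Roughly the shape XmlSerializer demands; nothing here can enforce that both names are supplied.
public class NameDto
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}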
Objects at the Boundary #
It certainly would break encapsulation if you were to expose your (domain) objects directly at the boundary. Consider a simple XML document like this one:
<name> <firstName>Mark</firstName> <lastName>Seemann</lastName> </name>
Whether or not we have a formal contract (XSD), we might stipulate that both the firstName and lastName elements are required. However, despite such a contract, I can easily create a document that breaks it:
<name> <firstName>Mark</firstName> </name>
We can't enforce the contract as there's no compilation step involved. We can validate input (and output), but that's a different matter. Exactly because there's no enforcement it's very easy to create malformed input. The same argument can be made for UI input forms and any sort of serialized byte sequence. This is why we must treat all input as suspect.
This isn't a new observation at all. In Patterns of Enterprise Application Architecture, Martin Fowler described this as a Data Transfer Object (DTO). However, despite the name we should realize that DTOs are not really objects at all. This is nothing new either. Back in 2004 Don Box formulated the Four Tenets of Service Orientation. (Yes, I know that they are not in vogue any more and that people wanted to retire them, but some of them still make tons of sense.) Particularly the third tenet is germane to this particular discussion:
Services share schema and contract, not class.
Yes, and that means they are not objects. A DTO is a representation of such a piece of data mapped into an object-oriented language. That still doesn't make them objects in the sense of encapsulation. It would be impossible. Since all input is suspect, we can hardly enforce any invariants at all.
Often, as Craig Stuntz points out in a comment to one of my previous posts, even if the input is invalid, we want to capture what we did receive in order to present a proper error message (this argument also applies on machine-to-machine boundaries). This means that any DTO must have very weak invariants (if any at all).
DTOs don't break encapsulation because they aren't objects at all.
Don't be fooled by your tooling. The .NET framework very, very much wants you to treat DTOs as objects. Code generation ensues.
However, the strong typing provided by such auto-generated classes gives a false sense of security. You may think that you get rapid feedback from the compiler, but there are many possible ways you can get run-time errors (most notably when you forget to update the auto-generated code based on new schema versions).
An even more problematic result of representing input and output as objects is that it tricks lots of developers into dealing with them as though they represent the real object model. The result is invariably an anemic domain model.
More and more, this line of reasoning is leading me towards the conclusion that the DTO mental model that we have gotten used to over the last ten years is a dead end.
What Should Happen at the Boundary #
Given that we write object-oriented code and that data at the boundary is anything but object-oriented, how do we deal with it?
One option is to stick with what we already have. To bridge the gap we must then develop translation layers that can translate the DTOs to properly encapsulated domain objects. This is the route I take with the samples in my book. However, this is a solution that more and more I'm beginning to think may not be the best. It has issues with maintainability. (Incidentally, that's the problem with writing a book: at the time you're done, you know so much more than you did when you started out… Not that I'm denouncing the book - it's just not perfect…)
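As a minimal sketch of such a translation layer (reusing the made-up NameDto shape from above and inventing a Name domain class), the translator is the place where suspect input becomes a properly encapsulated object:
using System;

// The NameDto shape from the earlier sketch.
public class NameDto
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// An invented, properly encapsulated domain class.
public class Name
{
    private readonly string firstName;
    private readonly string lastName;

    public Name(string firstName, string lastName)
    {
        if (string.IsNullOrEmpty(firstName))
        {
            throw new ArgumentException("A first name is required.", "firstName");
        }
        if (string.IsNullOrEmpty(lastName))
        {
            throw new ArgumentException("A last name is required.", "lastName");
        }

        this.firstName = firstName;
        this.lastName = lastName;
    }

    public string FirstName { get { return this.firstName; } }
    public string LastName { get { return this.lastName; } }
}

public static class NameTranslator
{
    // The translation layer turns suspect input into a trusted domain object;
    // the domain object's constructor enforces the invariants.
    public static Name ToDomainObject(NameDto dto)
    {
        if (dto == null)
        {
            throw new ArgumentNullException("dto");
        }

        return new Name(dto.FirstName, dto.LastName);
    }
}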
Another option is to stop treating data as objects and start treating it as the structured data that it really is. It would be really nice if our programming language had a separate concept of structured data… Interestingly, while C# has nothing of the kind, F# has tons of ways to model data structures without behavior. Perhaps that's a more honest approach to dealing with data… I will need to experiment more with this…
A third option is to look towards dynamic types. In his article Cutting Edge: Expando Objects in C# 4.0, Dino Esposito outlines a dynamic approach towards consuming structured data that shortcuts auto-generated code and provides a lightweight API to structured data. This also looks like a promising approach… It doesn't provide compile-time feedback, but that's only a false sense of security anyway. We must resort to unit tests to get rapid feedback, but we're all using TDD already, right?
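As a rough sketch of that dynamic approach (not Dino Esposito's actual code), an ExpandoObject can represent the structured data without any auto-generated classes:
using System;
using System.Dynamic;

class Program
{
    static void Main()
    {
        // The 'name' document from above as a dynamic property bag,
        // with no auto-generated, strongly typed class in sight.
        dynamic name = new ExpandoObject();
        name.FirstName = "Mark";
        name.LastName = "Seemann";

        Console.WriteLine("{0} {1}", name.FirstName, name.LastName);
    }
}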
In summary, my entire series about encapsulation relates to object-oriented programming. Although there are lots of technologies available to represent boundary data as ‘objects', they are false objects. Even if we use an object-oriented language at the boundary, the code has nothing to do with object orientation. Thus, the Poka-yoke Design rules don't apply there.
Now go back and reread this post, but replace ‘DTO' with ‘Entity' (or whatever your ORM calls its representation of a relational table row) and you should begin to see the contours of why ORMs are problematic.
P.S. 2022-05-02. See also At the boundaries, applications aren't functional.
Comments
Anyway, the gist of it: great post, I completely agree that data objects aren't domain objects, and thinking of them as structured data is very liberating. Using Entities/DTOs/data objects as a composable part of a domain object is a nice way to create that separation which still allows you to (potentially directly) use ORM entities while avoiding an anemic domain model. So, for example, CustomerDomain would/could contain a CustomerEntity, instead of trying to add behavior to CustomerEntity (which might be a generated class).
Yep, I meant "entity" more in the sense of how it is defined in some of the popular ORMs, like LLBLGen and Entity Framework. Essentially, an instance that corresponds (more or less) directly to a row in a table or view.
For instance, generating DTOs from a database schema provides an entire static structure (relationships, etc.) that tend to constrain us when we subsequently attempt to define an object model. I rather prefer being able to work unconstrained and then subsequently figure out how to persist it.
Keep in mind that in most cases, a DTO is not an end-goal in itself. Rather, the end-goal is often the wire-representation of the DTO (XML, JSON, etc.). Wouldn't it be better if we could just skip the DTO altogether and use a bit of convention-based (and testable) dynamic code to make that translation directly?
It's a difficult problem to solve perfectly - every solution has downsides. I have gone down the road of mapping DTOs (or data entities) via a translation layer, and that can be painful as well. As you said, the tradeoff of using data objects as a composable part of the Domain Object is that you can't model your domain with complete freedom (which DDD purists would likely see as non-negotiable). The upside is that you have the potential for less maintenance.
There is lots to argue about in this article (nothing "wrong", just semantics). The main point I would like to pick up on is about your point "What should happen at domain boundaries".
I would contest that rather than "translation" layers, it's semantically better to think of it as an "interpretation object". When an app accepts input, there are very few assumptions that should be automatically attached to the unit of input. One of the fundamental ones, and often the easiest to break, is the assumption that the input is in any way valid, useful or complete.
The semantic concept (abstraction) that is wrapped around the unit of input (the object that encapsulates the input data) needs to have a rich interface that can convey abstractions such as "Cannot interpret this input", "Can partially interpret the input but it contains some rubbish", "Input contains SQL injection attack", "Input is poorly formed and has a missing .DTD file", etc.
My argument is that the semantic concept of "translation" while obviously closely related to "interpretation" is semantically a "transformation process". A->B kind of idea, while "interpretation", at least in my head, is not so deterministic and seeks to extract domain abstractions from the input that are then exposed to the application as high level abstractions rather than low level "translated data/safe data/sanitized data".
Thanks,
hotsleeper
Dynamic types are also an option -- anyone who has used Rails/ActiveRecord is familiar with their upsides and downsides. In short, maintaining them is free, but using them costs more.
My preferred approach (for now) is to use EF entities as boundary objects for the DB. They cost almost nothing to maintain -- a couple of mouse clicks when we change DB schemata -- since we use "Database First" modeling for internal reasons.
You only get into encapsulation trouble with this if you try to use the EF entities as business objects -- something many people do. My $0.02 is that it's usually wrong to put any kind of behavior on an EF entity.
My personal experience with EF was that even with code generation, it was far from frictionless, but YMMV...
Depends! If we're building a framework that is used by others, then there is a big difference between the application's boundary to the "outside world" (DTOs) and persistence.
A quote taken from NHDay - Loosely Coupled Complexity - CQRS (see minute 2:10): What is wrong with a design with DTOs? "Nothing." End of presentation. But then it continues into CQRS :)
Especially the translation layer can be cumbersome when requests come into our application and we try to rebuild the OO model in order to save new objects with an ORM.
The point is that the database follows from the domain design, not the other way round, as EF or ORM class generators would have you believe.
There may be nothing wrong with that, but if you decide that you need object-orientation to solve a business problem, you must transform DTOs into proper objects, because it's impossible to make OOD with DTOs.
But why not take this further? What's a boundary? Is a boundary where data is exchanged between processes? Or processes on different machines? Processes implemented using different platforms? Or is a boundary where data moves between .NET AppDomains? Or between threads?
The past year I've felt great relief by very strictly applying this rule: data that moves around is, well, just data.
That does not mean I'm back to procedural programming. I appreciate the benefits of object oriented languages.
But just because I can combine data and functions into one "thing", I should not always do it.
If data is state of a behavior, then data+function makes sense.
But if data is pushed around between "behavioral objects", then data should be just data (maybe spiced with some convenience functions).
So what I've come to do is "data flow design" or "behavioral design". And that plays very nicely with async programming and distributing code across processes.
If you do it in a request/response style (even mapped into internal code), I'd say that you'd be doing procedural programming.
However, if you do it as Commands, passing structured data to void methods, you'd basically be heading in the direction of Pipes and Filters architecture. That's actually a very good place to be.
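A hedged sketch of that Command style (all names invented): structured data goes into a void method, which can then act as a single filter in a pipeline:
// Just structured data; no behavior.
public class ShipOrderCommand
{
    public int OrderId { get; set; }
    public string Address { get; set; }
}

public interface ICommandHandler<TCommand>
{
    // A void method that accepts structured data: one filter in a pipeline.
    void Execute(TCommand command);
}

public class ShipOrderHandler : ICommandHandler<ShipOrderCommand>
{
    public void Execute(ShipOrderCommand command)
    {
        // Shipping logic omitted for brevity.
    }
}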
"Wouldn't it be better if we could just skip the DTO altogether and use a bit of convention-based (and testable) dynamic code to make that translation directly?"
I think technology is finally starting to get there. Most serializers still want a default constructor, but some also provide mechanisms for constructor injection - making immutable domain models possible again.
Now we just need to be able to validate the input appropriately; you made a point about validation being "a different matter" above that I didn't quite understand. It seems to me that, as long as the input is valid, we can use it to create a proper model object.
And if we can automatically return abstract errors like @hotsleeper suggested, then I think we have a full solution.
My most recent take on this problem has been to use a combination of Jersey, Jackson, and Hibernate Validator to translate from on-the-wire JSON representation to valid domain object.
Is there something I'm missing with this approach?
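As one example of such constructor-injection support (a sketch, assuming Json.NET and an invented Name class), a serializer can populate an immutable type through its constructor by matching JSON property names to constructor parameter names:
using System;
using Newtonsoft.Json;

public class Name
{
    public Name(string firstName, string lastName)
    {
        if (string.IsNullOrEmpty(firstName))
        {
            throw new ArgumentException("A first name is required.", "firstName");
        }
        if (string.IsNullOrEmpty(lastName))
        {
            throw new ArgumentException("A last name is required.", "lastName");
        }

        this.FirstName = firstName;
        this.LastName = lastName;
    }

    public string FirstName { get; private set; }
    public string LastName { get; private set; }
}

class Program
{
    static void Main()
    {
        // Json.NET matches JSON property names to constructor parameter names,
        // so no default constructor or public setters are needed.
        var name = JsonConvert.DeserializeObject<Name>(
            "{ \"firstName\": \"Mark\", \"lastName\": \"Seemann\" }");
        Console.WriteLine(name.FirstName);
    }
}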
The "different matter" of interpretation is that you need to be able to interpret and validate the incoming data, before you turn it into a Domain Object. An encapsulated Domain Object should throw exceptions if you attempt to create invalid instances, so unless you want to catch exceptions and surface them at the boundary, you'll need to be able to validate the input before you create the Domain Object.
There are various technologies where you can annotate your Domain Objects, in order to add such interpretation logic to them, but you should be aware that if you do that, you'd be conflating the interpretation logic with your Domain Model. Is that a problem? Only you can tell, but at least, you must be aware of this before you can make the decision.
What happens if you need to interpret XML as well as JSON? What happens if you need to interpret two mutually exclusive versions (v1 and v2) of XML into the same object? What happens if you need to interpret YAML in addition to JSON and XML? Etc.
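One possible sketch of keeping that interpretation logic outside the Domain Model (assuming the NameDto and Name types sketched earlier): a boundary-level parser validates first, so only valid input ever reaches the Domain Object's constructor and no exceptions need to be caught at the boundary:
// Assumes the NameDto and Name types from the earlier sketches.
public static class NameParser
{
    // Suspect input is validated here, at the boundary; only valid input ever
    // reaches the Domain Object's constructor.
    public static bool TryParse(NameDto dto, out Name result)
    {
        result = null;
        if (dto == null)
        {
            return false;
        }
        if (string.IsNullOrEmpty(dto.FirstName) || string.IsNullOrEmpty(dto.LastName))
        {
            return false;
        }

        result = new Name(dto.FirstName, dto.LastName);
        return true;
    }
}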
Five years have passed since the original post date, but smart thoughts never get old, I think.
I have a question concerning "Currently I think that using dynamic types looks most promising."
Do you still think like that? I assume yes. But is it even "ethical" to use amorphous, general-purpose objects like that?
It's bad to populate objects with property injection, but it's OK to give them methods they didn't have? When I look at this Expando I just see something like:
- take JSON file with object graph
- deserialize to "DTO" graph
- give arbitrary DTOs methods, so they are now domain objects
- seems legit
Of course, I'm exaggerating. But here's why:
Right now we're starting a project. The only and last thing that bothers me now is exactly this: the client application receives a JSON file with moderately large Scenes. While I'm satisfied with the overall design I've achieved, there is this one "little" thing. As of now, I was planning to take this JSON, check it against a schema (maybe add some validations later), "decompose" it into a number of DTOs and feed them to the container in the process of object tree construction, and for later use with Factories that supply runtime objects with certain types and/or properties.
I don't like it, but I don't see a better way yet.
For example, how do I set up references? Using JSON's method is of course not an option. Make a repository and inject it everywhere? Even though I don't see that as being a Service Locator, I for some reason suspect it's bad. Who will register everything there, for example? Though it could be a nimble visitor in both cases (but it would demand the addition of a specialized interface to all objects just for this concern).
And now I recalled this Expando, and now I feel even worse: earlier I was just seeing not-so-good solutions that were nevertheless acceptable if treated responsibly and carefully. But this Expando... I see how much easier it may make everything, but I don't know how to allow myself to use it (though the problem may disappear by itself, as we may fall under the constraint of using .NET 3.5). But if you say using it is OK, I will surely do so. That's because I know that I can trust your opinion more than my own in this context of software design.
All in all, I will be happy to hear your opinion on the matter of this article, n years later. What is the answer to the questions raised here by the article and the comments, ultimately? Which transition from the boundaries to the inner OO paradise is the best in your opinion (and which is second, just in case)?
Also, slightly off topic: can you give me a couple of pieces of advice on setting up those references (the ones that DI composition will not be able to handle)? I think this will be needed only after the object graph has been built, so maybe do it in the Composition Root right away during construction? I have tried this fairly successfully, but I consider that method a "hack".
Is it bad to have a repository with almost all business objects?
Dmitry, thank you for writing. The purpose of this post is mostly to help people realise that, as the title says, at the boundaries, applications aren't Object-Oriented (and neither are they Functional). Beyond that, the article is hardly prescriptive. While the What Should Happen at the Boundary section muses on ways to deal with input and output, I hope it doesn't suggest that there are any silver bullets that address these problems.
It's a subject not easily addressed here in a comment. A new article wouldn't even do the topic justice; a book would be required to truly cover all the ins and outs.
That said, I don't think I ever used the ExpandoObject-approach in practice, although I've used dynamic typing against JSON a lot with Integration Testing. Mostly, I use the Railway oriented programming approach to input validation, but it doesn't address all the other issues with parsing, implementing Tolerant Readers, etc.
"I know that I can trust your opinion more, that my own in this context of software design."I don't think you should. All advice is given in a specific context. Sometimes, that context is explicit, but usually it's implicit. I don't know your context (and I confess that I don't understand your more specific questions), so I can only give advice from my own (implicit) context. It may or may not fit your context.
This is the reason I take pains to present arguments for my conclusion. While the arguments serve the purpose of trying to convince the reader, they also serve as documentation. From the arguments, the train of thought that leads to a particular conclusion should be clearer to the reader. If a reader disagrees with a premise, he or she will also disagree with the conclusion. That's OK; software development is not a one-size-fits-all activity.
Design Smell: Default Constructor
This post is the fifth in a series about Poka-yoke Design - also known as encapsulation.
Default constructors are code smells. There you have it. That probably sounds outrageous, but consider this: object-orientation is about encapsulating behavior and data into cohesive pieces of code (classes). Encapsulation means that the class should protect the integrity of the data it encapsulates. When data is required, it must often be supplied through a constructor. Conversely, a default constructor implies that no external data is required. That's a rather weak statement about the invariants of the class.
Please be aware that this post describes a code smell. A smell indicates that whenever a certain idiom or pattern (in this case a default constructor) is encountered in code, it should trigger further investigation.
As I will outline below, there are several scenarios where default constructors are perfectly fine, so the purpose of this blog post is not to thunder against default constructors. It's to provide food for thought.
If you have read my book you will know that Constructor Injection is the dominating DI pattern exactly because it statically advertises dependencies and protects the integrity of those dependencies by guaranteeing that an initialized consumer is always in a consistent state. This is fail-safe design because the compiler can enforce the relationship, thus providing rapid feedback.
This principle extends far beyond DI. In a previous post I described how a constructor with arguments statically advertises that the argument is required:
public class Fragrance : IFragrance
{
    private readonly string name;

    public Fragrance(string name)
    {
        if (name == null)
        {
            throw new ArgumentNullException("name");
        }

        this.name = name;
    }

    public string Spread()
    {
        return this.name;
    }
}
The Fragrance class protects the integrity of the name by requiring it through the constructor. Since this class requires the name to implement its behavior, requesting it through the constructor is the correct thing to do. A default constructor would not have been fail-safe, since it would introduce a temporal coupling.
Consider that objects are supposed to be containers of behavior and data. Whenever an object contains data, the data must be encapsulated. In the (very common) case where no meaningful default value can be defined, the data must be provided via the constructor. Thus, default constructors might indicate that encapsulation is broken.
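As a hedged counter-example (not code from the post), here is what Fragrance might have looked like with a default constructor; the temporal coupling is that the caller must remember to assign Name before calling Spread, and the compiler can no longer help:
// A counter-example, assuming the IFragrance interface from the example above.
public class Fragrance : IFragrance
{
    private string name;

    // The default constructor leaves the object without a name...
    public Fragrance()
    {
    }

    // ...so the caller must remember to assign Name before calling Spread().
    public string Name
    {
        get { return this.name; }
        set { this.name = value; }
    }

    public string Spread()
    {
        return this.name; // May silently return null if Name was never assigned.
    }
}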
When are Default Constructors OK? #
There are still scenarios where default constructors are in order (I'm sure there are more than those listed here).
- If a default constructor can assign meaningful default values to all contained fields a default constructor still protects the invariants of the class. As an example, the default constructor of UriBuilder initializes its internal values to a consistent set that will build the Uri http://localhost unless one or more of its properties are subsequently manipulated. You may agree or disagree with this default behavior, but it's consistent and so encapsulation is preserved.
- If a class contains no data, there is obviously no data to protect. This may be a symptom of the Feature Envy code smell, which is often evidenced by the class in question being a concrete class.
  - If such a class can be turned into a static class, it's a certain sign of Feature Envy.
  - If, on the other hand, the class implements an interface, it might be a sign that it actually represents pure behavior.
A class that represents pure behavior by implementing an interface is not necessarily a bad thing. This can be a very powerful construct.
In summary, a default constructor should be a signal to stop and think about the invariants of the class in question. Does the default constructor sufficiently guarantee the integrity of the encapsulated data? If so, the default constructor is appropriate, but otherwise it's not. In my experience, default constructors tend to be the exception rather than the rule.
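For instance, the UriBuilder behaviour mentioned above can be seen in a few lines (a minimal sketch):
using System;

class Program
{
    static void Main()
    {
        // The default constructor assigns a consistent set of default values...
        var builder = new UriBuilder();
        Console.WriteLine(builder.Uri); // http://localhost/

        // ...which can subsequently be manipulated without breaking the invariants.
        builder.Host = "example.org";
        Console.WriteLine(builder.Uri); // http://example.org/
    }
}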
Comments
My most common scenario for the default constructor is when some (probably reflection-based) framework requires it.
Don't like them, at all. There are always some dependencies, and excluding them from the ctor means you are introducing them in an uglier place...
Good post!
public ReturnPolicyManagementView(): base()
{
InitializeComponent();
}
public ReturnPolicyManagementView(ReturnPolicyManagementViewModel model) : this()
{
this.DataContext = model;
}
You can of course get away with this because WPF bindings fail without causing an exception.
Attacking default constructors is either mad or madly genius.
An example where a struct was inappropriately applied would be Guid, whose default value is Guid.Empty. Making Guid a struct adds no value because you'd always conceptually have to check whether you just received Guid.Empty. Had Guid been a reference type, we could just have checked for null, but with this design choice we need special cases for handling exactly Guids. This means that we have to write special code that deals only with Guids, instead of code that just deals with any reference type.
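A small sketch of that special-casing (the Reservation class is invented for this example):
using System;

public class Reservation
{
    private readonly Guid id;

    public Reservation(Guid id)
    {
        // Guid is a struct, so a null check is meaningless; code that wants to
        // protect its invariants must special-case Guid.Empty instead.
        if (id == Guid.Empty)
        {
            throw new ArgumentException("The id must not be empty.", "id");
        }

        this.id = id;
    }

    public Guid Id
    {
        get { return this.id; }
    }
}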
Comments
string item = null;
var count = 0;
foreach (var current in args)
{
    item = current;
    count++;
    if (count == 2)
    {
        this.RaiseMultipleArguments(args);
        return;
    }
}
if (count == 1)
    this.RaiseSingleArgument(item);
else
    this.RaiseNoArgument();