Composition Root

Thursday, 28 July 2011 15:22:04 UTC

In my book I describe the Composition Root pattern in chapter 3. This post serves as a summary description of the pattern.

The Constructor Injection pattern is easy to understand until a follow-up question comes up:

Where should we compose object graphs?

It's easy to understand that each class should require its dependencies through its constructor, but this pushes the responsibility of composing the classes with their dependencies to a third party. Where should that be?

It seems to me that most people are eager to compose as early as possible, but the correct answer is:

As close as possible to the application's entry point.

This place is called the Composition Root of the application and defined like this:

A Composition Root is a (preferably) unique location in an application where modules are composed together.

This means that all the application code relies solely on Constructor Injection (or other injection patterns), but is never composed. Only at the entry point of the application is the entire object graph finally composed.

The appropriate entry point depends on the framework:

  • In console applications it's the Main method
  • In ASP.NET MVC applications it's global.asax and a custom IControllerFactory
  • In WPF applications it's the Application.OnStartup method
  • In WCF it's a custom ServiceHostFactory
  • etc.

(you can read more about framework-specific Composition Roots in chapter 7 of my book.)

The Composition Root is an application infrastructure component.

Only applications should have Composition Roots. Libraries and frameworks shouldn't.

The Composition Root can be implemented with Poor Man's DI Pure DI, but is also the (only) appropriate place to use a DI Container.

A DI Container should only be referenced from the Composition Root. All other modules should have no reference to the container.

Using a DI Container is often a good choice. In that case it should be applied using the Register Resolve Release pattern entirely from within the Composition Root.

Read more in Dependency Injection Principles, Practices, and Patterns.


Comments

David Martin #
Reading the pre-release and it's very good. As for the single composition root: what about complex apps with several deep features? These features could be considered by some to be mini-apps in and of themselves. Would it be appropriate to consider each feature to have its own composition root? I'm thinking of Unity's child containers here. Hopefully the question makes sense. As I use your book as ammo at work I come across the particular question.
2011-08-02 03:24 UTC
Each application/process requires only a single Composition Root. It doesn't matter how complex the application is.

The earlier you compose, the more you limit your options, and there's simply no reason to do that.

You may or may not find this article helpful - otherwise, please write again :)
2011-08-02 07:56 UTC
Erkan Durmaz #
Hi Mark,

My question is quite parallel to David’s. How about we have a complex application that loads its modules dynamically on-demand, and these modules can consist of multiple assemblies.

For example, a client application that has a page-based navigation. Different modules can be deployed and undeployed all the time. At design-time we don’t know all the types we wanna compose and we don’t have direct reference to them.

Should we introduce some infrastructure code, like David suggested, to let the modules register their own types (e.g. services) to a child container, then we apply “resolve” on the loaded module and dispose the child container when we are done with the module?

Looking forward to the final release of your book BTW :-)

Thanks
2011-09-29 11:57 UTC
The problem with letting each module register their own types to any sort of container is that it tightly couples the modules (and any application in which you might want to use them) to a specific container. Even worse, if you ever need to consume a service provided by one module from another module, you'd have to couple these two to each other. Pretty soon, you'll end up with a tightly coupled mess.

One option is to use whichever DI Container you'd like from the Composition Root, and use its XML capabilities to configure the modules into each particular application that needs them. That's always possible, but adds some overhead, and is rarely required.

A better option is to simply drop all desired modules in an add-in folder and then use convention-based rules to scan each assembly. The most difficult part of that exercise is that you have to think explicitly about cardinality, but that's always an issue with add-in architectures. The MEF chapter of my book discusses cardinality a bit.
2011-10-03 06:18 UTC
I left your book at home, never mind it's here on my desk. I was going to ask what your composition root for asp.net webforms looked like, but here it is on pg 227.

I think I'm headed in
UI(Asp.net) depends on Presentation layer and a pure interfaces/DTOs (with any serialization attributes/methods needed). so structure/schema of the 'model' lives in that domain-schema layer. domain model depends on the domain-schema layer, but is purely behaviors (methods/logic). In this way there's a sharing of schema/dto between service layer and presentation layer, with the clean separation of all business logic and only business logic living in the domain layer.So outside of the composition root, neither the UI, presentation layer, or resource access layers (public api -> internal adapter-> individual resource) can touch the domain methods or types. Requires we define an interface for the domain model classes I think? Cross cutting-concerns would live in their own assembly.

Any thoughts?
2011-10-21 16:09 UTC
Why would your UI depend on DTOs? A Data Transfer Object plays the role of a boundary data carrier, just like View Model might do - they fit architecturally at the same place of application architecture: at the boundary. If you base your inner architecture upon DTOs you'll be headed in the direction of an Anemic Domain Model, so that doesn't sound a particularly good idea to me...
2011-10-23 09:05 UTC
I'm only up to reading mid-way through your chapter on doing it right (and jumped to the asp.net webforms composition root after reading this blog post), but here is more of my reasoning.

The catch comes when you look at the behavior, and you realize that there is hardly any behavior on these objects, making them little more than bags of getters and setters. Indeed often these models come with design rules that say that you are not to put any domain logic in the the domain objects. AnemicDomain Models - Martin Fowler


It becomes a 'real' domain model when it contains all (or most) of the behaviour that makes up the business domain (note I'm emphasising business logic, not UI or other orthogonal concerns). Anemic Domain Models - Stack overflow

I don't feel that this is headed into an Anemic domain model as the domain model assembly would contain all behaviors specific to the domain.

Designs that share very little state or, even better, have no state at all tend to be less prone to hard to analyze bugs and easier to repurpose when requirements change. Blog article - makes sense but i could be missing multiple boats here.

The domain would have zero state, and would depend on interfaces not DTOs. This would make test-ability and analysis simple.

Motivation for DTOs in the interfaces layer: presentation and persistence could all share the DTOs rather than having them duplicated in either place.
Barring that perceived gain, then the shared layer across all would just be interfaces.

The UI could depend on interfaces only, but the same assembly would contain DTOs that the presentation layer and any persistence could share without having to be duplicated/rewritten. So the domain is completely persistence and presentation ignorant.
2011-10-25 00:17 UTC
It may just be that we use the terminology different, but according to Patterns of Enterprise Application Architecture "a Data Transfer Object is one of those objects our mothers told us never to write. It's often little more than a bunch of fields and the getters and setters for them."

The purpose of a DTO is "to transfer multiple items of data between two processes in a single method call." That is (I take it) not what you are attempting to do here.

Whether or not you implement the business logic in the same assembly as your DTOs has nothing to with avoiding an Anemic Domain Model. The behavior must be defined by the object that holds the data.
2011-10-25 06:46 UTC
Jordan Morris #
I appreciate your exhortations to good architecture, but I am struggling to apply them in my situation.

I am writing a WPF system tray application which has a primary constraint to keep lean, especially in terms of performance and memory usage. This constraint, along with the relatively low complexity of the application, means that I cannot justify the overhead of MVVM. In the absence of a prescribed architecture for building the object graph, I could set one up 'manually' in Application.OnStartup. Presumably it would mean instantiating all the Window objects (with dependences), which are utilised from time to time by the user.

However, I have a problem with these Window instance sitting in memory, doing nothing, 90% of the time. It seems much more sensible to me to instantiate these Window objects in the events where the user asks for them. Yet, I will want to maintain the inversion of control, so how can I avoid accessing the DI container in multiple points in my app?
2013-01-12 10:09 UTC
Jordan, would the Virtual Proxy pattern be of help here?

You can use one of the solutions outlined in my recent series on Role Hints to avoid referencing a container outside of the Composition Root.
2013-01-12 21:17 UTC
Jordan Morris #
Hi Mark

I wanted to post the following comment on your linked page, here
/2011/03/04/Composeobjectgraphswithconfidence
but I am encountering a page error on submission.

The above article suggests a great framework for delayed construction of resources, however I have a different reason for wanting this than loading an assembly.

In my scenario I have occasionally-needed Window objects with heavy GUIs, presenting lots of data. The GUI and data only need to be in memory while the Window is displayed. The lazy loading approach seems to apply, except it leaves me with a couple of questions.

1) When the window is closed again, I want to release the resources. How can I unload and reload it again?

2) What would this line look like when using an IoC like Ninject?

> new Lazy<ISolo<IMarker>>(() => new C1(

An alternative I am considering is putting classes defining events in the object graph, instead of the windows themselves. These event classes could be constructed with instances of dependencies which will be used in the event methods to construct Windows when needed. The dependancies I am wanting to inject to maintain inversion of control are light-weight objects like persistance services which will exist as singletons in memory anyway (due to the way most IoC's work). Do you see any problem with this?

Many thanks.
2013-01-13 23:07 UTC
I'll refer you to section 6.2 in my book regarding lifetime management and the closing of resources. However, if things aren't disposable, you can also just let them go out of scope again and let the garbage collector take care of cleaning up.

I can't help you with Ninject.

2013-01-18 10:03 UTC
Allan Friis Hansen #
I would like to touch and expand upon the subject David Martin raised regarding complex applications with deep features.
You make the point that only the composition root should register dependencies of various layers and components. And thereby only the project containing the composition root would have references to the DI container being used

In a couple of projects I've worked on we have multiple applications and thereby composition root, being websites, service endpoint, console applications etc. giving us around 10 application endpoints that needs to have their own composition roots. But at the same time we work with a components based design which nicely encapsulates different aspects of the system.

If I were to only do my registration code within the place of the composition roots, then applications using the same components would have to register exactly the same dependencies per compontent, leading to severe code duplication
At this point we've gone with a solution where every component has their own registration class which the composition root's then register, so basically out composition roots only compose another level of composition roots. I hope it makes sense!

What I'm explaining here directly violates the principles you make in this post. Which I already had I sense it would. But at the same time I don't know how else to manage this situation effectively. Do you have any ideas for how to handle this type of situation?
2014-08-02 22:28 UTC

Allan, thank you for writing. In the end, you'll need to do what makes sense to you, and based on what you write, it may make sense to do what you do. Still, my first reaction is that I would probably tend to worry less about duplication than you seem to do. However, I don't know your particular project: neither how big your Composition Roots are, or how often they change. In other words, it's easy enough for me to say that a bit of duplication is okay, but that may not fit your reality at all.

Another point is that, in my experience at least, even if you have different applications (web sites, services, console applications, etc.) as part of the same overall system, and they share code, the way they should be composed tend to diverge the longer they live. This is reminiscent of Mathias Verraes' point that duplication may be coincidental, but that the duplicated code may deviate from each other in the future. This could also happen for such applications; e.g. at one point, you may wish to instrument some of your web site's dependencies, but not your batch job.

My best advice is to build smaller systems, in order to keep complexity down, and then build more of them, rather than building big systems. Again, that advice may not be useful in your case...

Another option, since you mention component-based design, is to move towards a convention-based approach, so that you don't have to maintain those Composition Roots at all: just drop in the binaries, and let your DI Container take care of the rest. Take a look at the Managed Extensibility Framework (MEF) for inspiration. Still, while MEF exposes some nice ideas, and is a good source of inspiration, I would personally chose to do something like this with a proper DI Container that also supports run-time Interception and programmable Pointcuts.

In the end, I think I understand your problem, but my overall reaction is that you seem to have a problem you shouldn't have, so my priority would be:

  1. Remove the problem altogether.
  2. Live with it.
  3. If that's not possible, solve it with technology.

2014-08-08 12:02 UTC

It seems that I have a certan inclination for reopening "old" posts. Still the may be considered evergreens! :) Now back on topic.

When it comes to the well known frameworks as ASP.NET MVC, Web.API, WCF, it is quite clear on where and how to set our composition root. But what if we do not have a clear "entry point" to our code? Imagine that you are writing an SDK, laying down some classes that will be used by other developers. Now, you have a class that exposes the following constructor.

public MyClass(IWhatEver whatEver)

Consider also that who is going to use this classes has no idea about IWhatEver nor it should have. For make the usage of MyClass as simple as possible, we I should be able to instatiate MyClass via the parameterless constructor. I had the idea of making the constructor that is used for DI, internal, and the only publically available one to be the paramtereless constructor. Then fetch somehow the instances of my dependencies in the paramterless constructor. Now imagine that I have several of classes as MyClass and that I do have a common set of "services" that are injected inside of them. Now, my questions are.
  1. Can I still have the single composition root?
  2. How do I trigger the composition?
  3. Does recreating the composition root per each "high level" class has a significant performance impact (considering a dependency tree of let's say 1000 objects).
  4. Is there a better way (pattern) to solve a problem like this?
Some of this questions may be slightly off topic, still the main question is, how to deal with the composition root in cases like this cases. Hoping you will make some clarity, I do thank you in advance.
2014-09-18 07:17 UTC

Mario, thank you for writing. If you're writing an SDK, you are writing either a library or a framework, so the Composition Root pattern isn't appropriate (as stated above). Instead, consider the guidelines for writing DI friendly libraries or DI friendly frameworks.

So, in order to answer your specific questions:

1: Conceptually, there should only be a single Composition Root, but it belongs to the application, so as a library developer, then no: 'you' can't have any Composition Root.

2: A library shouldn't trigger any composition, but as I explain in my guidelines for writing DI friendly libraries, you can provide Facades to make the learning curve gentle. For frameworks, you may need to provide appropriate Abstract Factories.

3: The performance impact of Composition shouldn't be a major concern. However, the real problem of attempting to apply the Composition Root pattern where it doesn't apply is that it increases coupling and takes away options that the client developer may want to utilize; e.g. how can a client developer instrument a sub-graph with Decorators if that sub-graph isn't available?

4: There are better ways to design DI friendly libraries and DI friendly frameworks.

2014-09-18 08:12 UTC

Thank you for your quick and valuable replay.

I read your posts and made some toughs about them. Still I am not convinced that what is described in that posts does tackle my problem. Even more I'm realizing that is less and less relevant to the topic of composition root. However, let me try describe a situation.

I do plan to write an SDK (a library that will allow developers to interact with my application and use some of it's functionality). Consider that I'm exposing a class called FancyCalculator. FancyCalculater needs to get some information from the settings repository. Now, I have a dependency on ISettingsRepository implementation and it is injected via a constructor into my FancyCalculator. Nevertheless, who is going to use the FancyCalculator, doesn't know and he shouldn't know anything about ISettingsRepository and it's dependency tree. He is expected only to make an instance of FancyCalculator and call a method on it. I do have a single implementation of ISettingsRepository in form of SettingsRepository class. In this case, how do I get to create an instance of SettingsRepository once my FancyCalculator is created?

A Factory? Service locator? Something else?

Composition root in applications like MVC, WebAPI, etc, is a very nice and clean approach, but what about the situations when we do not have a clean single entry point to the application?

Thank you again!

2014-09-18 12:57 UTC

Mario, thank you for writing again. As far as I understand, you basically have this scenario:

public class FancyCalculator
{
    private readonly ISettingsRepository settingsRepository;
}

Then you say: "a dependency on ISettingsRepository implementation and it is injected via a constructor into my FancyCalculator". Okay, that means this:

public class FancyCalculator
{
    private readonly ISettingsRepository settingsRepository;
 
    public FancyCalculator(ISettingsRepository settingsRepository)
    {
        if (settingsRepository == null)
            throw new ArgumentNullException("settingsRepository");
 
        this.settingsRepository = settingsRepository;
    }
}

But then you say: "who is going to use the FancyCalculator, doesn't know and he shouldn't know anything about ISettingsRepository and it's dependency tree."

That sounds to me like mutually exclusive constraints. Why are you injecting ISettingsRepository into FancyCalculator if you don't want to enable the user to supply any implementation of ISettingsRepository? If you don't want to allow that, the solution is easy:

public class FancyCalculator
{
    private readonly ISettingsRepository settingsRepository;
 
    public FancyCalculator()
    {
        this.settingsRepository = new DefaultSettingsRepository();
    }
}

If you want the best of both worlds, the solution is the Facade pattern I already described in my DI Friendly library article.

2014-09-19 19:56 UTC

Hi Mark, You got the example right. The reason of my choices are the following. I do not want user of my library to know about the dependencies for a couple of reason. First of all it needs to be as simple as possible to use. Second thing the dependencies are let's say "internal", so that code is loosely coupled and testable. He should not know even about them or get bothered at any point. Still I do not think it's wise to create instances of the dependencies in the parameterless constructor. Why? Well, I am concerned about maintainability. I would like a to somehow request the default instance of my dependency in the parameterless constructor and get it, so that I do have a single point in my sdk where teh default dependency is specified, and also that I do not need to handle the dependency tree. The fact is that I can't get what this something should be. I re-read the chapter 5 of your book this weekend to see if I can come up with something valid ideas. What I came up with is that I should use the parameterless constructor of each of my classes to handle it's own default dependencies. In this way, resolving the tree should be fine, and it is easy to maintain. Also to prevent user to see the constructors that to require the dependencies to be injected (and confuse it) I tough of declaring these constructors as internal. This are my words translated in code (easier to understand).

class Program
			{
			    static void Main(string[] args)
			    {
			        // Using the SDK by others
			        FancyCalculator calculator = new FancyCalculator();
			 
			        Console.WriteLine(calculator.DoSomeThingFancy());
			    }
			}
			 
			public interface ISettingsRepository { }
			public interface ILogging { }
			 
			public class Logging : ILogging { }
			 
			public class SettingsRepository : ISettingsRepository
			{
			    private readonly ILogging logging;
			 
			    public SettingsRepository()
			    {
			        this.logging = new Logging();
			    }
			 
			    internal SettingsRepository(ILogging logging)
			    {
			        if (logging == null)
			            throw new ArgumentNullException("logging");
			 
			        this.logging = logging;
			    }
			}
			 
			public class FancyCalculator
			{
			    private readonly ISettingsRepository settingsRepository;
			 
			    public FancyCalculator()
			    {
			        this.settingsRepository = new SettingsRepository();
			    }
			 
			    internal FancyCalculator(ISettingsRepository settingsRepository)
			    {
			        if (settingsRepository == null)
			            throw new ArgumentNullException("settingsRepository");
			 
			        this.settingsRepository = settingsRepository;
			    }
			 
			    public int DoSomeThingFancy() { return 1; }
			}

What do you think about a solution like this? What are a bad sides of this approach? Is there any pattern that encapsulates a practise like this?

2014-09-22 10:07 UTC

Mario, you don't want the user of your library to know about the dependencies, in order to make it easy to use the SDK. That's fine: this goal is easily achieved with one of those Facade patterns I've already described. With either Constructor Chaining, or use of the Fluent Builder pattern, you can make it as easy to get started with FancyCalculator as possible: just invoke its parameterless constructor.

What then, is your motivation for wanting to make the other constructors internal? What do you gain from doing that?

Such code isn't loosely coupled, because only classes internal to the library can use those members. Thus, such a design violates the Open/Closed Principle, because these classes aren't open for extension.

2014-09-13 6:48 UTC

Dear Mark,

I am pretty sure that Mario uses internal code in order to be able to write unit tests. Probably the interfaces of the dependecies which he is hiding should also be internal. I think it is a good aproach because thanks to it he can safely refactor those interfaces, beacuse it is not a public API and also he can have unit tests.

Yes - "Such code isn't loosely coupled, because only classes internal to the library can use those member", however I think that very often stable API is very important for libraries.

Could you give any comments on that?

2014-09-25 20:10 UTC

Robert, thank you for writing. Your guess sounds reasonable. Even assuming that this is the case, I find it important to preface my answer with the caution that since I don't know the entire context, my answer is, at best, based on mainstream scenarios I can think of. There may be contexts where this approach is, indeed, the best solution, but in my experience, this tends not to be the case.

To me, the desire to keep an API stable leads to APIs that are so locked down that they are close to useless unless you happen to be so incredibly lucky that you're right on the path of a 'supported use case'. Over the years, I've worked with many object models in the .NET Base Class Library, where I've wanted it to do something a little out of the ordinary, only to find myself in a cul-de-sac of internal interfaces, sealed classes, or internal virtual methods. Many other people have had the same problems with .NET, which, I believe, has been a contributing factor causing so many of the brightest programmers to leave the platform for other, more open platforms and languages (Ruby, Clojure, JavaScript, Erlang, etc.).

The .NET platform is getting much better in this regard, but it's still not nearly as good as it could be. Additionally, .NET is littered with failed technologies that turned out to be completely useless after all (Windows Workflow Foundation, Commerce Server, (early versions of) Entity Framework, etc.).

.NET has suffered from a combination of fear of breaking backwards compatibility, combined with Big Design Up-Front (BDUF). The fact that (after all) it's still quite useful is a testament to the people who originally designed it; these people were (or are) some of most skilled people in the industry. Unless you're Anders Hejlsberg, Krzysztof Cwalina, Brad Abrams, or on a similar level, you can't expect to produce a useful and stable API if you take the conservative BDUF approach.

Instead, what you can do is to use TDD to explore the design of your API. This doesn't guarantee that you'll end up with a stable API, but it does help to make the API as useful as possible.

How do you ensure stability, then? One option is to realise that you can use interfaces as access modifiers. Thus, you can publish a library that mostly contains interfaces and a few high-level classes, and then add the public implementations of those interfaces in other libraries, and clearly document that these types are not guaranteed to remain stable. This option may be useful in some contexts, but if you're really exposing an API to the wild, you probably need a more robust strategy.

The best strategy I've been able to identify so far is to realise that you can't design a stable API up front. You will make mistakes, and you will need to deal with these design mistakes. One really effective strategy is to apply the Strangler pattern: leave the design mistakes in, but add new, better APIs as you learn; SOLID is append-only. In my Encapsulation and SOLID Pluralsight course, I discuss this particular problem and approach in the Append-Only section of the Liskov Substitution Principle module.

2014-09-28 13:40 UTC

Good point Mark! I really like your point of view, in general I agree with you.

I will try to 'defend' the internal dependecies, beacuse it may help us explore all the possible reasonable usages of it.

1. Personlly I would hide my dependencies with internal access when I would be quite certain that my design is bad, but I do not have time to fix it, but I will do it for sure in probably near future.

2. Moreover if I would have unit tests without accessing to internal's (so not the case which I was writing in my previous comment), then I should be able to refactor the 'internals' without even touching the unit tests. This is why I see internal depedencies as a refactoring terchnique - especially when working with legacy code.

These were some cases for Library/Framework. What about internal access in non-modular Application development that is consists of several projects (dll's)? Personally I try to keep everthing internal by default and only make public interfaces, entities, messages etc. for things that are used between the projects. Then I compose them in a dedicated project (which I name Bootstrapper) which is a composition root and has access to all internal types... If your books covers this topic - please just give a reference. I have not read your book so far, but it is in the queue :)

2014-09-28 18:12 UTC

Please note that I'm not saying that internal or private classes are always bad; all I'm saying is that unit testing internal classes (presumably using the [InternalsVisibleTo] attribute) is, in my opinion, not a good idea.

You can always come up with some edge cases against a blanket statement like never unit test internal classes, but in the general case, I believe I've already outlined why I think it's a really poor idea to do so.

Ultimately, I've never understood the need for the [InternalsVisibleTo] attribute. When you're applying it, you're basically making internal types public to select consumers, with all the coupling that implies. Why not make the types truly public then?

As far as I can tell, the main motivation for not making types public is when the creators don't trust their code. If this is the case, making things internal isn't a solution, it's a symptom. Address the problem instead of trying to hide it. Make the types trustworthy by applying encapsulation.

2014-09-29 11:59 UTC

An example from my last days at work: I marked a class as internal that is wrapping some native library (driver for some hardware) using [DllImport]. I want clients of my Library use my classes - not the DLL wrapper - this is why I hide it using internal. However I needed InternalVisibleTo so that I could write integration tests to see if the wrapper really works.

Why making public when nobody outside is using it? YAGNI. I only expose when I know when its needed for the clients. Then for those that I have already exposed I need to have backward compability. The less I have exposed the more easily and safely I can refactor my design. And it is not about always bad design. New requirements frequently come in parallel with refactoring the design. This is why I like the Martin Fowler's idea of Published Interface which is also mentioned in his Access Modifier article.

Additionally I always had a feeling that making everything public can affect badly influence the software architecture. The more encaupsulated are the packages the better. And for me internal access is also a mean of encaupsulation at the packaging level. Three days ago Simon Brown had a presentation named "Software Architecture vs Code" on DevDay conference. When he told something like "do not make public classes by default" people were applauding!

My rules of a thumb for seting the class access modifiers are:

  • private - when it is a helper for a class where it is being nested
  • internal - when it used only in the package
  • public - when other packages needs to use it
  • oh and, how I would love to have this Published Interface!
So for each package I can end up with having some public interfaces and classes (like messages and entities) and many internal interfaces and classes which are only seen by the unit tests and a bootstrapping package (if I have such!) It really helps when you have a project that has about 300k lines of code :)

2014-09-29 16:40 UTC

Thank to Mark Seemann for his very inspiring write about Composition Root design pattern, /2011/07/28/CompositionRoot/

I wrote my own JavaScript Dependency Injection Framework called Di-Ninja with these principles in mind https://github.com/di-ninja/di-ninja

As I know, is the only one in javascript that implement the Composition-Root design pattern and it's documentation could be another good example to demonstrate how it works.

It work for both NodeJS and browser (with Webpack)

2018-01-04 08:22 UTC

Thanks for this insightful post!

I was just wondering, could we say then that using a Service Locator exclusively at the Composition root is not something bad, actually is perfectly fine?

I'm asking this, since it's wide-spread the service locator 'anti-pattern', however, at the Composition root its usage is not a problem at all, right ?

Thanks for the amazing blog! ;)

2022-01-24 9:25 UTC

Lisber, thank you for writing. Does this answer your question?

2022-01-24 11:49 UTC

Totally!

The sum up at the end of that article is brilliant!

It becomes a Service Locator if used incorrectly: when application code (as opposed to infrastructure code) actively queries a service in order to be provided with required dependencies, then it has become a Service Locator.

Thank you for clarifying that.

2022-01-24 13:40 UTC

SOLID Code isn't

Tuesday, 07 June 2011 13:46:07 UTC

Recently I had an interesting conversation with a developer at my current client, about how the SOLID principles would impact their code base. The client wants to write SOLID code - who doesn't? It's a beautiful acronym that fully demonstrates the power of catchy terminology.

However, when you start to outline what it actually means people become uneasy. At the point where the discussion became interesting, I had already sketched my view on encapsulation. However, the client's current code base is designed around validation at the perimeter. Most of the classes in the Domain Model are actually internal and implicitly trust input.

We were actually discussing Test-Driven Development, and I had already told them that they should only test against the public API of their code base. The discussion went something like this (I'm hoping I'm not making my ‘opponent' sound dumb, because the real developer I talked to was anything but):

Client: "That would mean that each and every class we expose must validate input!"

Me: "Yes…?"

Client: "That would be a lot of extra work."

Me: "Would it? Why is that?"

Client: "The input that we deal with consist of complex data structures, and we must validate that all values are present and correct."

Me: "Assume that input is SOLID as well. This would mean that each input instance can be assumed to be in a valid state because that would be its own responsibility. Given that, what would validation really mean?"

Client: "I'm not sure I understand what you mean…"

Me: "Assuming that the input instance is a self-validating reference type, what could possibly go wrong?"

Client: "The instance might be null…"

Me: "Yes. Anything else?"

Client: "Not that I can think of…"

Me: "Me neither. This means that while you must add more code to implement proper encapsulation, it's really trivial code. It's just some Guard Clauses."

Client: "But isn't it still gold plating?"

Me: "Not really, because we are designing for change in the general sense. We know that we can't predict specific change, but I can guarantee you that change requests will occur. Instead of trying to predict specific changes and design variability in those specific places, we simply put interfaces around everything because the cost of doing so is really low. This means that when change does happen, we already have Seams in the right places."

Client: "How does SOLID help with that?"

Me: "A result of the Single Responsibility Principle is that each self-encapsulated class becomes really small, and there will be a lot of them."

Client: "Lots of classes… I'm not sure I'm comfortable with that. Doesn't it make it much harder to find what you need?"

Me: "I don't think so. Each class is very small, so although you have many of them, understanding what each one does is easy. In my experience this is a lot easier than trying to figure out what a big class with thousands of lines of code does. When you have few big classes, your object model might look something like this:"

Coarse-grained objects

"There's a few objects and they kind of fit together to form the overall picture. However, if you need to change something, you'll need to substantially change the shape of each of those objects. That's a lot of work, and this is why such an object design isn't particularly adaptable to change.

"With SOLID, on the other hand, you have lots of small-grained objects which you can easily re-arrange to match new requirements:"

Fine-grained objects

And that's when it hit me: SOLID code isn't really solid at all. I'm not a material scientist, but to me a solid indicates a rigid structure. In essence a structure where the particles are tightly locked to each other and can't easily move about.

However, when thinking about SOLID code, it actually helps to think about it more like a liquid (although perhaps a rather viscous one). Each class has much more room to maneuver because it is small and fits together with other classes in many different ways. It's clear that when you push an analogy too far, it breaks apart.

Still, a closing anecdote is appropriate...

My (then) three-year old son one day handed me a handful of Duplo bricks and asked me to build him a dragon. If you've ever tried to build anything out of Duplo you'll know that the ‘resolution' of the bricks is rather coarse-grained. Given that ‘a handful' for a three-year old isn't a lot of bricks, this was quite a challenge. Fortunately, I had an appreciative audience with quite a bit of imagination, so I was able to put the few bricks together in a way that satisfied my son.

Still, building a dragon of comparable size out of Lego bricks is much easier because the bricks have a much finer ‘resolution'. SOLID code is more comparable to Lego than Duplo.


Comments

It's easier to use Lego bricks if your dinosaur is small, but if it's life sized - the cost and time using Lego instead of Duplo would be enormous. The same could be said about getting too granular with classes (IShippingOptionProviderFactory, anyone?) on a large project. Speaking of taking analogies too far... isn't building anything (real) out of Lego bricks a bad idea? I mean, I certainly wouldn't want to drive a Lego car or walk across a Lego bridge.
2011-06-07 14:41 UTC
Brilliant article :)
2011-06-07 14:56 UTC
Looks like your thinking is in the line of the DDJ editorial:
http://bit.ly/ik3dCW

Functional units of small size are key to evolvable code. This makes for a fine grained "code sand" which can be brought into ever changing shapes.

The question however is: What is to become smaller? Classes or methods?

I´d say it´s classes that need to be limited in size, maybe 50 to 100 LOC.
Classes are to be kept small because they are the "blueprints" of the smallest stateful runtime units of code which can be recombined: objects.

However this leads to two problems:

1. It´s difficult to think of a decomposition of the very common noun-classes into smaller classes. How to break up a PaymentService class into many smaller classes?

2. Small classes would lead to an explosion of dependencies between all of these classes. They would be hard to manage (even with dependency injection).

That´s two reasons why most developers will probably resists decreasing the size of their classes. They´ll just keep methods small - but the effect of this will be limited. There´s no limit to the size of a class; 1,000 LOC class can consist of 100 methods each 10 lines long.

But the two problems go away if SOLID is applied in combination with a different view on object orientation.

We just need to switch from nouns to verbs when thinking about domain logic classes (not data classes).

If classes become functions (or maybe better behaviors) then there is no decomposition problem anymore. Behaviors can be as small as needed, maybe just 1 LOC.

Also behaviors conceptually don´t have any dependencies on each other (but maybe to an environment); sitting is not dependend on running even though running might "be done" after sitting.
2011-06-09 10:53 UTC
Darin #
@Alex Papadimoulis

"I certainly wouldn't want to drive a Lego car or walk across a Lego bridge"

ROTFL

Great quote! I've always liked the principles behind solid, but have always been a little turned off by some of the zealotry associated with it. Like most everything in life it's great "in moderation".
2011-06-09 12:15 UTC
Hi Mark.
Thanks for your article.
You are describing a philosophy
which is almost standard practice
in Smalltalk since the 80's..
As you know, albeit desired, it is
however not always possible to
make small classes.
Recommended reading:
(also for OO in general)
"Smalltalk, Objects and Design"
by: Chamond Liu
ISBN: 1-8847777-27-9
Manning Publications Co, 1996.
Kind Regards
Ted
..btw I am sure that there must be
a photo of yours whereon you look a bit happier?


2011-06-09 12:40 UTC
Bill Berger #
Nice blog. Good visual analogy.

It's my experience that there is lots of discussion around the concept of abstraction and too little discussion around the use and practice of abstraction.

I interview many software development candidates, and I always explore what I consider foundational concepts - one is abstraction. I find most developers / programmers / software engineers (/ whatever), cannot adequately communicate HOW they use abstraction to enable their code - not on an architectural level, not on a modular level, not on a class level, and not even on a simple functional level. This is a real indication of the state of software design and it is alarming to me.

It's my belief that designing abstraction in software engineering is the most critical tool in software construction, but it is the least discussed.
2011-06-09 12:49 UTC
@Bill: I agree that abstraction is missing from many discussions (and from languages). Pars pro toto: in UML class diagrams abstraction (in the form of composition) cannot be distinguished from simple usage.

So since there is no real expression of abstraction (except for inheritance) in code, it´s neglected or left to personal style. Sad.
2011-06-09 13:09 UTC
Mark, thanks for this post. I am telling you that I had almost the exact same conversation with client late last year. At some point he says "but it's SOOO many classes.." Today reading your post I just had to laugh. Good times. :)
2011-06-09 14:25 UTC
Dave Boyle #
"…put interfaces around everything because the cost of doing so is really low."

I’m not convinced that the cost is indeed "really low", for two reasons.

1) Interfaces, before TDD became popular, imparted information to the maintainer of a type because they were used to indicate, for instance, that a given type was to be treated like a collection that could be enumerated or a type that could be compared with other types for the purposes of, say, sorting. When added reflexively as a means of inserting a level of indirection, the interface no longer imparts any information.

2) When navigating around a large code base where nearly all types are accessed via a level of indirection (courtesy of ubiquitous interfaces) my IDE – Visual Studio 2010 – struggles to answer common questions like "what code will be executed when SomeType.ReadHeader() is called?". Hitting F12 doesn’t take me to the definition of the method but rather takes me to the definition of the interface which, because of the point above, is of no value. ReSharper can sometimes find the method with a more advanced search but not always. The upshot of this is that code making heavy use of interfaces becomes much harder to statically analyse.
2011-06-09 14:50 UTC
Bill Berger #
@Ralf: "So since there is no real expression of abstraction (except for inheritance) in code"

Agreed. That point is manifest in @Dave's subsequent post. So, how would the lack of direct language support for abstraction - other than inheritance - be solved? And how would code tracing / debugging interfaces and their implementations be handled?

My gut tells me this is one reason (of several) that scripting-like languages have gained popularity in the recent past, and that we are being pointed in a more real-time, interpreted language direction due to the necessary dynamic nature of these constructs. Hm... {thinking}

Love a blog that makes me think for more than a few seconds.
2011-06-09 16:52 UTC
P #
As a client, looking at the second picture with lots of small parts, I see more complicated picture, therefore more maintenance, more support and more money to spend on it. As a consultant for that client, I see more dollar signs in my bank account...
2011-06-09 22:14 UTC
@Bill: You ask

"So, how would the lack of direct language support for abstraction - other than inheritance - be solved?"

and I don´t know if we should hope for languages to offer support soon. We´ve to live with what we have: Java, C#, C++, Ruby etc. (Scripting languages to me don´t offer better support for abstraction.)

Instead we need to figure out how to use whatever features these languages offer in a way to express abstraction - and I don´t mean inheritance ;-)

And that leads to a separation of mental model from technical features. We need to think in a certain way about our solutions - and then translate our mental models to language reality.

Object orientation tried to unify mental model and code reality. But as all those maintenance nightmare projects show, this has not really worked out as nicely as expected. The combination of OO method (OOAD) and OO programming has not lived up to our expectations.

Since we´re pretty much stuck with OO languages we need to replace the other part of the pair, the method, I´d say.
2011-06-10 08:44 UTC
Kenneth Siewers Møller #
You really need a +1 :)
Very interesting article...
2011-08-24 08:31 UTC

At the Boundaries, Applications are Not Object-Oriented

Tuesday, 31 May 2011 13:27:11 UTC

My recent series of blog posts about Poka-yoke Design generated a few responses (I would have been disappointed had this not been the case). Quite a few of these reactions relate to various serialization or translation technologies usually employed at application boundaries: Serialization, XML (de)hydration, UI validation, etc. Note that such translation happens not only at the perimeter of the application, but also at the persistence layer. ORMs are also a translation mechanism.

Common to most of the comments is that lots of serialization technologies require the presence of a default constructor. As an example, the XmlSerializer requires a default constructor and public writable properties. Most ORMs I've investigated seem to have the same kind of requirements. Windows Forms and WPF Controls (UI is also an application boundary) also must have default constructors. Doesn't that break encapsulation? Yes and no.

Objects at the Boundary #

It certainly would break encapsulation if you were to expose your (domain) objects directly at the boundary. Consider a simple XML document like this one:

<name>
  <firstName>Mark</firstName>
  <lastName>Seemann</lastName>
</name>

Whether or not we have formal a contract (XSD), we might stipulate that both the firstName and lastName elements are required. However, despite such a contract, I can easily create a document that breaks it:

<name>
  <firstName>Mark</firstName>
</name>

We can't enforce the contract as there's no compilation step involved. We can validate input (and output), but that's a different matter. Exactly because there's no enforcement it's very easy to create malformed input. The same argument can be made for UI input forms and any sort of serialized byte sequence. This is why we must treat all input as suspect.

This isn't a new observation at all. In Patterns of Enterprise Application Architecture, Martin Fowler described this as a Data Transfer Object (DTO). However, despite the name we should realize that DTOs are not really objects at all. This is nothing new either. Back in 2004 Don Box formulated the Four Tenets of Service Orientation. (Yes, I know that they are not in vogue any more and that people wanted to retire them, but some of them still make tons of sense.) Particularly the third tenet is germane to this particular discussion:

Services share schema and contract, not class.

Yes, and that means they are not objects. A DTO is a representation of such a piece of data mapped into an object-oriented language. That still doesn't make them objects in the sense of encapsulation. It would be impossible. Since all input is suspect, we can hardly enforce any invariants at all.

Often, as Craig Stuntz points out in a comment to one of my previous posts, even if the input is invalid, we want to capture what we did receive in order to present a proper error message (this argument also applies on machine-to-machine boundaries). This means that any DTO must have very weak invariants (if any at all).

DTOs don't break encapsulation because they aren't objects at all.

Don't be fooled by your tooling. The .NET framework very, very much wants you to treat DTOs as objects. Code generation ensues.

However, the strong typing provided by such auto-generated classes gives a false sense of security. You may think that you get rapid feedback from the compiler, but there are many possible ways you can get run-time errors (most notably when you forget to update the auto-generated code based on new schema versions).

An even more problematic result of representing input and output as objects is that it tricks lots of developers into dealing with them as though they represent the real object model. The result is invariably an anemic domain model.

More and more, this line of reasoning is leading me towards the conclusion that the DTO mental model that we have gotten used to over the last ten years is a dead end.

What Should Happen at the Boundary #

Given that we write object-oriented code and that data at the boundary is anything but object-oriented, how do we deal with it?

One option is to stick with what we already have. To bridge the gap we must then develop translation layers that can translate the DTOs to properly encapsulated domain objects. This is the route I take with the samples in my book. However, this is a solution that more and more I'm beginning to think may not be the best. It has issues with maintainability. (Incidentally, that's the problem with writing a book: at the time you're done, you know so much more than you did when you started out… Not that I'm denouncing the book - it's just not perfect…)

Another option is to stop treating data as objects and start treating it as the structured data that it really is. It would be really nice if our programming language had a separate concept of structured data… Interestingly, while C# has nothing of the kind, F# has tons of ways to model data structures without behavior. Perhaps that's a more honest approach to dealing with data… I will need to experiment more with this…

A third option is to look towards dynamic types. In his article Cutting Edge: Expando Objects in C# 4.0, Dino Esposito outlines a dynamic approach towards consuming structured data that shortcuts auto-generated code and provides a lightweight API to structured data. This also looks like a promising approach… It doesn't provide compile-time feedback, but that's only a false sense of security anyway. We must resort to unit tests to get rapid feedback, but we're all using TDD already, right?

In summary, my entire series about encapsulation relates to object-oriented programming. Although there are lots of technologies available to represent boundary data as ‘objects', they are false objects. Even if we use an object-oriented language at the boundary, the code has nothing to do with object orientation. Thus, the Poka-yoke Design rules don't apply there.

Now go back and reread this post, but replace ‘DTO' with ‘Entity' (or whatever your ORM calls its representation of a relational table row) and you should begin to see the contours of why ORMs are problematic.

P.S. 2022-05-02. See also At the boundaries, applications aren't functional.


Comments

Absolutely. I spend a lot of time in this space, an while I enjoy using (and maintaining) tools to make this transition simple, whenever there isthe first hint of a problem I always advise "add a dedicated DTO". A similar question (using interfaces in WCF messages) came up on stackoverflow last week. But ultimately, any boundary talks "data", not behaviour. It also emphasises why BinaryFormatter (implementation-aware) sucks so much for ***any*** kind of data exchange.
2011-05-31 14:53 UTC
Phil Sandler #
Arg, had a longer comment but it got eaten when I clicked Save Comment.

Anyway the gist of it: great post, I commpletely agree that data objects aren't domain objects, and thinking of them as structured data is very liberating. Using Entities/DTOs/data objects as a composible part of a domain object is a nice way to create that separation which still allows you to (potentially directly) use ORM entities while avoiding having an anemic domain model. So, for example, CustomerDomain would/could contain a CustomerEntity, instead of trying to add behavior to CustomerEntity (which might be a generated class).
2011-05-31 16:37 UTC
Phil, when you write 'Entity' I immediately think about the definition provided in "Domain-Driven Design". Those Entities are definitely Domain Objects which should have lots of behavior, but I'm getting the feeling that you're thinking about something else?
2011-05-31 18:22 UTC
Phil Sandler #
Hey Mark,

Yep, I meant "entity" more in the sense of how it is defined in some of the popular ORMs, like LLBLGen and Entity Framework. Essentially, an instance that corresponds (more or less) directly to a row in a table or view.
2011-05-31 22:09 UTC
Yes, we might compose real domain objects from DTOs, but I'm not sure that is the best solution. When we map non-object-oriented structures to objects, they tend to drag along a lot of baggage which hurts OO modeling.

For instance, generating DTOs from a database schema provides an entire static structure (relationships, etc.) that tend to constrain us when we subsequently attempt to define an object model. I rather prefer being able to work unconstrained and then subsequently figure out how to persist it.
2011-06-01 05:44 UTC
I like this article. As far as I undertand you for every entity you have (a domain object) you need a separate DTO for different representations (XML, SQL, etc.). So you need a third object which translates every DTO into a specific entity. Am I correct? If so, isn't that process too complex?
2011-06-01 08:29 UTC
Yes, that's what I've been doing for the last couple of years, but the downside is the maintenance overhead. That's why I, in the last part of the post, discuss alternatives. Currently I think that using dynamic types looks most promising.

Keep in mind that in most cases, a DTO is not an end-goal in itself. Rather, the end-goal is often the wire-representation of the DTO (XML, JSON, etc.). Wouldn't it be better if we could just skip the DTO altogether and use a bit of convention-based (and testable) dynamic code to make that translation directly?
2011-06-01 08:41 UTC
Florian Hötzinger #
I guess in WCF service scenarios there is really no other way than to create a DTO layer and compose our domain objects from those DTOs when passing them from/to the client. Since I'm doing a lot of Silverlight development at the moment, this came up quite early.
2011-06-01 09:13 UTC
@Mark: Yes, that would be quite a better way. Here comes the great article, you've mentioned, about the ExpandoObject and the dynamic way of dealing with json, xml and other representations of data. However, in case of SQL at this moment you still need some DTOs I guess. Maybe if we have some wrapper over the traditional IDataReader or some other mechanism to access the data, it will be possible again.
2011-06-01 10:13 UTC
Phil Sandler #
Hey Mark,

It's a difficult problem to solve perfectly--every solution had downsides. I have gone down the road of mapping DTOs (or data entities) via a translation layer and that can be painful as well. As you said, the tradeoff of using data objects as a composable part of the Domain Object is that you can't model your domain with complete freedom (which DDD purists would likely see as non-negotiable). The upside is that you have the potential for less maintenance.

2011-06-01 14:34 UTC
Florian, in WCF it is possible to drop down to the message-level and write directly to the message. Even so, I'm not sure I'd go that route, but it's good to know that the option exists.
2011-06-01 18:43 UTC
Boyan, I was specifically thinking about dropping down to IDataReader. For one example, please refer to Tomas Petricek's article Dynamic in F#: Reading data from SQL database.
2011-06-01 18:46 UTC
Hi Mark,

There is lots to argue about in this article (nothing "wrong", just semantics). The main point I would like to pick up on is about your point "What should happen at domain boundaries".

I would contest that rather than "translation" layers, its semantically better to think of it as an "interpretation object". When an app accepts input, there are very few assumptions that should be automatically attached to the unit of input. One of the fundamental ones and often the easiest to break is the assumption that the input is in any way valid, useful or complete.

The semantic concept(abstraction) that is wrapped around the unit of input (the object that encapsulates the input data) need to have a rich interface that can convey abstractions such as "Cannot interpret this input", "Can partially interpret the input but it contains some rubbish", "Input contains SQL injection attack", "Input is poorly formed and has a missing .DTD file" etc.

My argument is that the semantic concept of "translation" while obviously closely related to "interpretation" is semantically a "transformation process". A->B kind of idea, while "interpretation", at least in my head, is not so deterministic and seeks to extract domain abstractions from the input that are then exposed to the application as high level abstractions rather than low level "translated data/safe data/sanitized data".

Thanks,

hotsleeper
2011-06-02 02:37 UTC
Makes sense. I mainly used the term 'translation layer' as I assumed that most people would then immediately know what I meant :)
2011-06-02 06:51 UTC
Arved Sandstrom #
What's the processing error that is currently preventing comments? I can't get a usefully long one successfully submitted.
2011-06-02 09:50 UTC
Sorry about that. This blog is running on dasBlog which was last updated two years ago. Migrating the blog to a better engine is on my to-do list, but it'll be a couple of months at least before I get there.
2011-06-02 10:07 UTC
FWIW, the maintenance cost is effectively zero if the DTO types are codegened.

Dynamic types are also an option -- anyone who has used Rails/ActiveRecord is familiar with their upsides and downsides. In short, maintaining them is free, but using them costs more.

My preferred approach (for now) is to use EF entities as boundary objects for the DB. They cost almost nothing to maintain -- a couple of mouse clicks when we change DB schemata -- since we use "Database First" modeling for internal reasons.

You only get into encapsulation trouble with this if you try to use the EF entities as business objects -- something many people do. My $0.02 is that it's usually wrong to put any kind of behavior on an EF entity.
2011-06-04 15:51 UTC
I agree on the rule of never putting any behavior on an EF entity (although, still, I used to put a translation method on most of them: ToDomainObject(), but still...).

My personal experience with EF was that even with code generation, it was far from frictionless, but YMMV...
2011-06-04 21:13 UTC
Hi Mark,

Depends! If we're building a framework that is used by others then there is a big difference between the applications boundary to the "outside world" (DTO's) and persistence.

A quote taken from NHDay - Loosely Coupled Complexity - CQRS (see minute 2:10) What is wrong with a design with DTO's? "Nothing." End of presentation. But then continues into CQRS :)

Especially the translation-layer when requests come into our application, and we try to rebuild the OO model to help save new objects with an ORM can be cumbersome.

The point is that the database follows from the domain design, not other way round like EF or ORM class generators make you try to believe.
2011-06-05 11:18 UTC
The point being made here is that despite the name, DTOs aren't 'objects'. It's certainly possible to create a complex application entirely with DTOs - it's just not object-oriented.

There may be nothing wrong with that, but if you decide that you need object-orientation to solve a business problem, you must transform DTOs into proper objects, because it's impossible to make OOD with DTOs.
2011-06-05 11:45 UTC
The "everything is an ________" mantra gets us in trouble every time, whether it's strings in TCL, objects in Ruby, etc. Objects are a programming construct in practice, and the problem is we're trying to treat data as objects... which it really isn't. I echo your sentiments to "stop treating data as objects and start treating it as the structured data that it really is."
2011-06-05 18:59 UTC
Great article!

But why not think this further? What´s a boundary? Is a boundary where data is exchanged between processes? Or processes on different machines? Processes implemented using different platforms? Or is a boundary where data moves between .NET AppDomains? Or between threads?

The past year I´ve felt great relieve by very strictly appyling this rule: data that moves around is, well, just data.

That does not mean I´m back to procedural programming. I appreciate the benefits of object oriented languages.

But just because I can combine data and functions into one "thing", I should not always do it.

If data is state of a behavior, then data+function makes sense.

But if data is pushed around between "behaviroal objects" then data should be just data (maybe spiced with some convenience functions).

So what I´ve come to doing is "data flow design" or "behaviroal design". And that plays very nicely with async programming and distributing code across processes.
2011-06-08 14:14 UTC
Yes, we can take it further, but I don't believe that one approach necessarily rules out the other. However, it depends on how you decide to pass around structured data.

If you do it in a request/response style (even mapped into internal code), I'd say that you'd be doing procedural programming.

However, if you do it as Commands, passing structured data to void methods, you'd basically be heading in the direction of Pipes and Filters architecture. That's actually a very good place to be.
2011-06-09 09:47 UTC
Nice post (even though I'm a couple years late to it).
Wouldn't it be better if we could just skip the DTO altogether and use a bit of convention-based (and testable) dynamic code to make that translation directly?
I think technology is finally starting to get there. Most serializers still want a default constructor, but some also provide mechanisms for constructor injection — making immutable domain models possible again.
Now we just need to be able to validate the input appropriately; you made a point about validation being "a different matter" above that I didn't quite understand. It seems to me that, as long as the input is valid, we can use it to create a proper model object.
And if we can automatically return abstract errors like @hotsleeper suggested, then I think we have a full solution.
My most recent take on this problem has been to use a combination of Jersey, Jackson, and Hibernate Validator to translate from on-the-wire JSON representation to valid domain object.
Is there something I'm missing with this approach?
2013-05-07 10:45 CDT

The "different matter" of interpretation is that you need to be able to interpret and validate the incoming data, before you turn it into a Domain Object. An encapsulated Domain Object should throw exceptions if you attempt to create invalid instances, so unless you want to catch exceptions and surface them at the boundary, you'll need to be able to validate the input before you create the Domain Object.

There are various technologies where you can annotate your Domain Objects, in order to add such interpretation logic to them, but you should be aware that if you do that, you'd be conflating the interpretation logic with your Domain Model. Is that a problem? Only you can tell, but at least, you must be aware of this before you can make the decision.

What happens if you need to interpret XML as well as JSON? What happens if you need to interpret two mutually exlusive versions (v1 and v2) of XML into the same object? What happens if you need to interpret YAML in addition to JSON and XML? Etc.

2013-05-08 06:15 UTC

5 years have passed after original post date, but smart thoughts never get old i think.

I have a question, concerning "Currently I think that using dynamic types looks most promising."
Are you still think like that? I assume yes. But is it even "ethical" to use amorphous, general-purpose objects, like that?
It's bad to populate objects with property injection, but it's ok to give them method they didn't have? When I'm looking at this Expando I just see something like:
- take JSON file with object graph
- deserialize to "DTO" graph
- give arbitrary DTOs methods, so they now domain objects
- seems legit

Of cource, im exagerrating. But htat's why:
Right now we're starting a project. The only and last thing that bothers me now is exactly this: Client application receives JSON file with moderately large Scenes. While I'm satisfied with overal design I've achieved, there is this one "little" thing. As of now, i was planning to take this JSON, check it against schema (mb add some valitations later), "decompose" it to a number of DTOs and feed them to container in the process of object tree construction and for later use with Factories, that supply runtime objects with certain types and/or properties.
I don't like it, but I don't see a better way yet.

For example, how do I set up references? Using json's method is ofcourse not an option. Make repository and inject it everywhere? Yet, I don't see it is being Service Locator, I for some reason suspect it's bad. Who will register everyone there, for example? Though, it can be nimble visitor in both cases (but it will demand the addition of specialized interface to all objects just for this concern).
And now I recalled about this Expando. And now i fell myself even worse: earlier I was just seeing not-so-good solutions, but in fact acceptable if treated responsibly and carefully. But this expando.. I see how much easier it may make everything, but I don't know how to allow myself to use it (tho the problem may disappear by itself, as we may fall under constrait of useing .NET 3.5). But if you say using it is ok, I will surely do so. That's because I know that I can trust your opinion more, that my own in this context of software design.

All in all, I will be happy to hear your opinion on the matter of this article, but n years later. What is the answer to questions raised here by article and comments, ultimately? What transition from boundaries to inner OO paradise is the best in your opinion (and what is second, just in case)?
Also, concerning offtopic: can you give me a couple advices on setting up those references (that DI composition will not able to handle)? I think this will be needed only after object graph was builded, so may be do it in composition root right away during cunstruction? I was trying to do this pretty successfully, but i consider those method a "hack". Is it bad to have repository with almost all business objects?

2015-12-19 1:40 UTC

Dmitry, thank you for writing. This purpose of this post is mostly to help people realise that, as the title says, at the boundaries, applications aren't Object-Oriented (and neither are they Functional). Beyond that, the article is hardly prescriptive. While the What Should Happen at the Boundary section muses on ways to deal with input and output, I hope it doesn't suggest that there are any silver bullets that address these problems.

It's a subject not easily addressed here in a comment. A new article wouldn't even do the topic justice; a book would be required to truly cover all the ins and outs.

That said, I don't think I ever used the ExpandoObject-approach in practice, although I've used dynamic typing against JSON a lot with Integration Testing. Mostly, I use the Railway oriented programming approach to input validation, but it doesn't address all the other issues with parsing, implementing Tolerant Readers, etc.

"I know that I can trust your opinion more, that my own in this context of software design."
I don't think you should. All advice is given in a specific context. Sometimes, that context is explicit, but usually it's implicit. I don't know your context (and I confess that I don't understand your more specific questions), so I can only give advice from my own (implicit) context. It may or may not fit your context.

This is the reason I take pains to present arguments for my conclusion. While the arguments serve the purpose of trying to convince the reader, they also serve as documentation. From the arguments, the train of thought that leads to a particular conclusion should be clearer to the reader. If a reader disagrees with a premise, he or she will also disagree with the conclusion. That's OK; software development is not a one-size-fits-all activity.

2015-12-19 10:35 UTC

Design Smell: Default Constructor

Monday, 30 May 2011 13:02:02 UTC

This post is the fifth in a series about Poka-yoke Design - also known as encapsulation.

Default constructors are code smells. There you have it. That probably sounds outrageous, but consider this: object-orientation is about encapsulating behavior and data into cohesive pieces of code (classes). Encapsulation means that the class should protect the integrity of the data it encapsulates. When data is required, it must often be supplied through a constructor. Conversely, a default constructor implies that no external data is required. That's a rather weak statement about the invariants of the class.

Please be aware that this post represents a smell. This indicates that whenever a certain idiom or pattern (in this case a default constructor) is encountered in code it should trigger further investigation.

As I will outline below, there are several scenarios where default constructors are perfectly fine, so the purpose of this blog post is not to thunder against default constructors. It's to provide food for thought.

If you have read my book you will know that Constructor Injection is the dominating DI pattern exactly because it statically advertises dependencies and protects the integrity of those dependencies by guaranteeing that an initialized consumer is always in a consistent state. This is fail-safe design because the compiler can enforce the relationship, thus providing rapid feedback.

This principle extends far beyond DI. In a previous post I described how a constructor with arguments statically advertises that the argument is required:

public class Fragrance : IFragrance
{
    private readonly string name;
 
    public Fragrance(string name)
    {
        if (name == null)
        {
            throw new ArgumentNullException("name");
        }
 
        this.name = name;
    }
 
    public string Spread()
    {
        return this.name;
    }
}

The Fragrance class protects the integrity of the name by requiring it through the constructor. Since this class requires the name to implement its behavior, requesting it through the constructor is the correct thing to do. A default constructor would not have been fail-safe, since it would introduce a temporal coupling.

Consider that objects are supposed to be containers of behavior and data. Whenever an object contains data, the data must be encapsulated. In the (very common) case where no meaningful default value can be defined, the data must be provided via the constructor. Thus, default constructors might indicate that encapsulation is broken.

When are Default Constructors OK? #

There are still scenarios where default constructors are in order (I'm sure there are more than those listed here).

  • If a default constructor can assign meaningful default values to all contained fields a default constructor still protects the invariants of the class. As an example, the default constructor of UriBuilder initializes its internal values to a consistent set that will build the Uri http://localhost unless one or more of its properties are subsequently manipulated. You may agree or disagree with this default behavior, but it's consistent and so encapsulation is preserved.
  • If a class contains no data obviously there is no data to protect. This may be a symptom of the Feature Envy code smell, which is often evidenced by the class in question being a concrete class.
    • If such a class can be turned into a static class it's a certain sign of Feature Envy.
    • If, on the other hand, the class implements an interface, it might be a sign that it actually represents pure behavior.

A class that represents pure behavior by implementing an interface is not necessarily a bad thing. This can be a very powerful construct.

In summary, a default constructor should be a signal to stop and think about the invariants of the class in question. Does the default constructor sufficiently guarantee the integrity of the encapsulated data? If so, the default constructor is appropriate, but otherwise it's not. In my experience, default constructors tend to be the exception rather than the rule.


Comments

Agreed.
My most common scenario for the default constructor is when some (probably reflection-based) framework requires it.
Don't like them, at all. There are always some dependencies, and excluding them from the ctor means you are introducing them in an uglier place...

Good post!
2011-05-30 16:24 UTC
martin #
So if I have a DTO class(only properties), all the properties should be provided in the contructor or the class should be contructed by some builder ?
2011-05-30 20:06 UTC
One caveat is the case of de- serialization.
2011-05-31 00:32 UTC
Loving this series Mark. Nice work :)
2011-05-31 01:14 UTC
Ken #
I would like to point out if you are using expression blend you need a default constructor on your view otherwise Blend can't open it. Something like this.

public ReturnPolicyManagementView(): base()
{
InitializeComponent();
}
public ReturnPolicyManagementView(ReturnPolicyManagementViewModel model) : this()
{
this.DataContext = model;
}

You can of course get away with this because WPF bindings fail without causing an exception.
2011-05-31 15:29 UTC
Martin, Bob, Ken, I wrote this postscript that essentially explains why such boundary code isn't object-oriented code. At the boundaries, encapsulation doesn't apply.
2011-05-31 18:17 UTC
You have such a unique way of looking at code & constructs, it's eye opening to say the least. I enjoy how you turn the widely accepted into a beast of evil :)

Attacking default constructors is either mad or madly genius.
2011-05-31 20:08 UTC
Nice series. This post reminded me of a painful realization I made lately, which is that by design, a struct must have a parameterless default constructor. Does this make structs evil in your book, except for cases where the parameterless constructor makes sense?
2011-06-05 07:39 UTC
structs aren't evil - as with most other language constructs, it's a matter of understanding when and how to apply them. Basically, a struct is (or should be) a specialized representation of a number. Int32, Decimal, DateTime, TimeSpan etc. all fall into this category, and if you have a similar domain-specific concept in you code base, a struct could be a good representation as long as 0 is a valid value. A hypothetical Temperature type would be an excellent candidate.

An example where a struct was inappropriately applied would by Guid, whose default value is Guid.Empty. Making Guid a struct adds no value because you'd always conceptually have to check whether you just received Guid.Empty. Had Guid been a reference type we could just have checked for null, but with this design choice we need to have special cases for handling exactly Guids. This means that we have to write special code that deals only with Guids, instead of code that just deals with any reference type.
2011-06-05 08:02 UTC

Design Smell: Redundant Required Attribute

Friday, 27 May 2011 13:21:06 UTC

This post is the fourth in a series about Poka-yoke Design - also known as encapsulation.

Recently I saw this apparently enthusiastic tweet reporting from some Microsoft technology event:

[Required] attribute in code automatically creates a non-nullable entry in DB and validation in the webpage - nice […]

I imagine that it must look something like this:

public class Smell
{
    [Required]
    public int Id { get; set; }
}

Every time I see something like this I die a little inside. If you already read my previous posts it should by now be painfully clear why this breaks encapsulation. Despite the [Required] attribute there's no guarantee that the Id property will ever be assigned a value. The attribute is just a piece of garbage making a claim it can't back up.

Code like that is not fail-safe.

I understand that the attribute mentioned in the above tweet is intended to signal to some tool (probably EF) that the property must be mapped to a database schema as non-nullable, but it's still redundant. Attributes are not the correct way to make a statement about invariants.

Improved Design #

The [Required] attribute is redundant because there's a much better way to state that a piece of data is required. This has been possible since .NET 1.0. Here's the Poka-yoke version of that same statement:

public class Fragrance
{
    private readonly int id;
 
    public Fragrance(int id)
    {
        this.id = id;
    }
 
    public int Id
    {
        get { return this.id; }
    }
}

This simple structural design ensures that the ID truly is required (and if the ID can only be positive a Guard Clause can be added). An instance of Fragrance can only be created with an ID. Since this is a structural construction, the compiler can enforce the requirement, giving us rapid feedback.

I do realize that the [Required] attribute mentioned above is intended to address the challenge of mapping objects to relational data and rendering, but instead of closing the impedance mismatch gap, it widens it. Instead of introducing yet another redundant attribute the team should have made their tool understand simple idioms for encapsulation like the one above.

This isn't at all hard to do. As an example, DI Containers thrive on structural information encoded into constructors (this is called Auto-wiring). The team behind the [Required] attribute could have done that as well. The [Required] attribute is a primitive and toxic hack.

This is the major reason I never expect to use EF. It forces developers to break encapsulation, which is a principle upon which I refuse to compromise.


Comments

Seems like overkill if all you have is a piece of data that you want to display on a web page. Are you against using Entity Framework to fill Data Transfer Objects to be displayed?
2011-05-27 13:59 UTC
Agreed, I'd only use [Required] for validation on a view model
2011-05-27 14:35 UTC
You misstate the intended purpose of the attribute, because you cite a tweet instead of the documentation.

It is true that most code-to-schema-mapping-tools, including EF code-first, Fluent NHibernate, and others, will infer non-nullability from [Required] if it happens to be present on a type which is actually nullable. Why shouldn't they? I realize it's cool for the hip kids to flame EF at every opportunity, but good grief, it didn't cause the tornadoes in Memphis, and it didn't create (nor does it "require") [Required]...

The documentation makes the intended purpose of the attribute clear:

"The RequiredAttribute attribute specifies that when a field on a form is validated, the field must contain a value. A validation exception is raised if the property is null, contains an empty string (""), or contains only white-space characters."

Your "imagined" code example is indeed redundant even in the case where someone is trying to create a non-nullable DB field instead of using the attribute for its intended purpose, because integer properties, being value types, are always mapped to non-nullable fields, with or without [Required]. One must use "int?" to get a nullable DB field; [Required] has no effect on "int" at all. (Indeed, the MVC framework, e.g., treats [Required] and value types exactly the same.)

Now, let's consider the actual purpose of required, per the documentation, and see if it's redundant. Consider a view model with two "required" strings. The business requirement is that if either one or both of them do not contain a non-empty string, the web page should highlight the editors in red and display an appropriate message to the user. One could write a parameterized constructor and throw if they're not present, per your example, but you only get one chance to throw an exception, so you would have to handle three separate cases: The first field missing, the second field missing, or both fields missing. Better to use a validation framework which handles all of these cases for you -- which is exactly what [Required] is intended for.

I would argue that such a parameterized constructor with one or more throws is the wrong design in this scenario, because view models must be able to cope with the task of re-displaying invalid data entered by the user so that the user can correct her mistakes. In contrast to a business object, it must be able to hold and report on invalid states.
2011-05-27 15:19 UTC
Ben, if all you have "is a piece of data that you want to display on a web page" you're most likely in the CRUD application scenario. There are cases where this is indeed the correct solution to a business problem. It's perfectly fine, but it has nothing to do with object-orientation.

This whole series of posts about Poka-yoke Design are about encapsulation, which is an object-oriented principle. And yes: when it comes to OOD I'd never use EF (or any other ORM) if it requires me to break encapsulation.
2011-05-28 21:29 UTC
Craig, I agree that when it comes to user input validation, using an attribute on one or more properties with very loose invariants is the best solution. In general, when at the boundary of an application, it's always best to assume that input can be invalid in all sorts of ways. This is true for UI input as well as automated input (XML, CSV, JSON, etc.). Thus, any 'object' modeling an application boundary data structure is essentially not encapsulated.

However, it would be a grave mistake if we let UI concerns drive the design of our domain models. Thus, as soon as we know that we have a case of valid input data on our hands, we must translate it to a proper encapsulated object. From that point on (including db schema design) an attribute would certainly be redundant.

BTW, don't read to much into the tweet. Perhaps I chose the wrong example, and it was never my intention to thunder particularly at EF. So far, all I've heard about NHibernate indicates to me that it has similar issues with regards to encapsulation.
2011-05-28 21:42 UTC
I like what you said in above: "Thus, any 'object' modeling an application boundary data structure is essentially not encapsulated." That's an insight that many people miss.

My personal take on this is that data service interfaces are boundary structures themselves, and should be BOs in the same way that view models shouldn't be BOs, either.
2011-05-31 14:56 UTC
When you say 'BO', do you mean 'Boundary Object' or 'Business Object'? It seems to me that you might use both abbreviations in the same sentence... or I'm totally missing something :)
2011-05-31 18:19 UTC
Sorry, I meant business object, and "should" should have been "shouldn't." Not the clearest sentence I ever wrote!
2011-05-31 20:39 UTC
Gotcha! Agreed :)
2011-05-31 20:40 UTC

Code Smell: Automatic Property

Thursday, 26 May 2011 13:33:13 UTC

This post is the third in a series about Poka-yoke Design - also known as encapsulation.

Automatic properties are one of the most redundant features of C#. I know that some people really love them, but they address a problem you shouldn't have in the first place.

I totally agree that code like this looks redundant:

private string name;
public string Name
{
    get { return this.name; }
    set { this.name = value; }
}

However, the solution is not to write this instead:

public string Name { get; set; }

The problem with the first code snippet isn't that it contains too much ceremony. The problem is that it breaks encapsulation. In fact

"[…] getters and setters do not achieve encapsulation or information hiding: they are a language-legitimized way to violate them."

James O. Coplien & Gertrud Bjørnvig. Lean Architecture. Wiley. 2010. p. 134.

While I personally think that properties do have their uses, I very rarely find use for automatic properties. They are never appropriate for reference types, and only rarely for value types.

Code Smell: Automatic Reference Type Property #

First of all, let's consider the very large set of properties that expose a reference type.

In the case of reference types, null is a possible value. However, when we think about Poka-yoke design, null is never an appropriate value because it leads to NullReferenceExceptions. The Null Object pattern provides a better alternative to deal with situations where a value might be undefined.

In other words, an automatic property like the Name property above is never appropriate. The setter must have some kind of Guard Clause to protect it against null (and possibly other invalid values). Here's the most fundamental example:

private string name;
public string Name
{
    get { return this.name; }
    set 
    {
        if (value == null)
        {
            throw new ArgumentNullException("value");
        }
        this.name = value; 
    }
}

As an alternative, a Guard Clause could also check for null and provide a default Null Object in the cases where the assigned value is null:

private string name;
public string Name
{
    get { return this.name; }
    set 
    {
        if (value == null)
        {
            this.name = "";
            return;
        }
        this.name = value; 
    }
}

However, this implementation contains a POLA violation because the getter sometimes returns a different value than what was assigned. It's possible to fix this problem by adding an associated boolean field indicating whether the name was assigned null so that null can be returned from the setter in this special case, but that leads to another code smell.

Code Smell: Automatic Value Type Property #

If the type of the property is a value type, the case is less clear-cut because value types can't be null. This means that a Null Guard is never appropriate. However, directly consuming a value type may still be inappropriate. In fact, it's only appropriate if the class can meaningfully accept and handle any value of that type.

If, for example, the class can really only handle a certain subset of all possible values, a Guard Clause must be introduced. Consider this example:

public int RetryCount { get; set; }

This property might be used to set the appropriate number or retries for a given operation. The problem with using an automatic property is that it's possible to assign a negative value to it, and that wouldn't make any sense. One possible remedy is to add a Guard Clause:

private int retryCount;
public int RetryCount
{
    get { return this.retryCount; }
    set
    {
        if (value < 0)
        {
            throw new ArgumentOutOfRangeException();
        }
        this.retryCount = value;
    }
}

However, in many cases, exposing a primitive property is more likely to be a case of Primitive Obsession.

Improved Design: Guard Clause #

As I described above, the most immediate fix for automatic properties is to properly implement the property with a Guard Clause. This ensures that the class' invariants are properly encapsulated.

Improved Design: Value Object Property #

When the automatic property is a value type, a Guard Clause may still be in order. However, when the property is really a symptom of Primitive Obsession, a better alternative is to introduce a proper Value Object.

Consider, as an example, this property:

public int Temperature { get; set; }

This is bad design for a number of reasons. It doesn't communicate the unit of measure and allows unbounded values to be assigned. What happens if - 100 is assigned? If the unit of measure is Celcius it should succeed, although in the case when it's Kelvin, it should fail. No matter the unit of measure, attempting to assign int.MinValue should fail.

A more robust design can be had if we introduce a new Temperature type and change the property to have that type. Apart from protection of invariants it would also encapsulate conversion between different temperature scales.

However, if that Value Object is implemented as a reference type the situation is equivalent to the situation described above, and a Null Guard is necessary. Only in the case where the Value Object is implemented as a value type is an anonymous property appropriate.

The bottom line is that automatic properties are rarely appropriate. In fact, they are only appropriate when the type of the property is a value type and all conceivable values are allowed. Since there are a few cases where automatic properties are appropriate their use can't be entirely dismissed, but it should be treated as warranting further investigation. It's a code smell, not an anti-pattern.

On a different note properties also violate the Law of Demeter, but that's the topic of a future blog post...


Comments

James Nail #
Hi Mark,
I'm enjoying this series, and you can be sure I'll be sharing it with my teammates.
I wanted to point out a typo that might confuse readers, though -- you used the term "anonymous properties" instead of "automatic properties" toward the end of the article, which threw me off at first.
Anyway, keep up the great work!
2011-05-26 14:07 UTC
Omer Katz #
What about NetworkStream's Connected property?
It has a private set and a public get and returns a boolean that indicates if the underlaying socket is connected.
Nothing wrong with that.

By your logic, any .NET ORM is bad because it uses properties and violates the Law of Demeter.
2011-05-26 14:09 UTC
Isn't the code smell "public setters" rather than automatic properties? Making the set protected or private solves this issue. E.g.

public decimal MyValue { get; private set; }

With such usages it is legitimate for the guard code to be in a state mutating method, constructor validation etc
2011-05-26 14:15 UTC
Good points, and I completely agree with the { get; set; } kind of auto-implemented properties - in fact I don't think I've ever checked one in.

However, I'm forever using the {get; private set; } kind of auto-implemented properties - what are your opinions on those? What about a hypothetical and confusingly named { get; readonly set; } auto-implemented property?
2011-05-26 14:25 UTC
James, thanks for pointing that out. I've now corrected the error.
2011-05-26 18:31 UTC
Omer, a boolean property falls into that rare case described during the discussion of value type properties. If the class exposing the property truly can accept any value that the type can possibly take, an automatic property is in order. A boolean value can only take one of two values, and both are legal. Thus, no Guard Clause is required. However, please notice that this scenario is not (or at least should not be) representative.

Regarding ORMs you are now jumping ahead of me :) I'll come to that in a later blog post, but yes, I genuinely believe that the whole concept of an ORM (at least in any incarnations I've seen so far) is a fallacy.
2011-05-26 18:39 UTC
Ben, you are correct that the issue is at least most prevalent when the setter is public. However, keep in mind that we should always treat protected as public, as anyone can always come by and derive from a class.

For private setters it's not so clear-cut. In most cases when I have read-only properties it tends to be because the property is being supplied through the constructor and subsequently treated as immutable. When that is the case, I much prefer a backing field because it can be declared as readonly.

In other words I rarely write code where the class itself can mutate the value of a read-only property, but I know that other people do. When this is the case, I could argue that I'd still prefer the setter to protect the invariants of the class, but that argument tends to become a little silly because with a backing field the class could always just mutate the value directly. In any case, if one has a class that needs that kind of protection from its own internals one probably has more serious problems :)

Keep in mind that this whole discussion about encapsulation relates to the public API of classes. Obviously, once you start poking around in the internals of a class, you're already looking at the source code and thus past the encapsulation perimeter.
2011-05-26 18:50 UTC
Alex, see my previous comment regarding { get; private set; }

Regarding { get; readonly set; } I would probably use something like this assuming this was something that could only be used from the constructor, because I'd still be able to add the appropriate Guard Clause in the constructor before invoking the setter.
2011-05-26 18:54 UTC
I totally agree with your post. However, there are some cases when you do not need to make checks for validity for reference types. If a class with such a property does not provide a default non-null value for it, then any other checks are redundant as the user may set or may not set a value to that property. This way it will still have its default value of NULL.

Using properties with public setters in really tricky. When you have a class with properties and methods, users do not know which properties should be set before a method is called which will result in a run-time exception. That's why required parameters should be provided directly to constructors and methods.
2011-05-27 07:52 UTC
If the value is optional a public setter is in order. However, allowing the field to be null is not, as it would require the rest of the class to have to check for null every time it uses the field. A Null Object is a better alternative, which again leads to the rule that a Null Guard is always appropriate for reference types. However, in certain cases, that Null Guard might not throw an exception, but rather replace the null with a Null Object.
2011-05-29 08:55 UTC
Theo Andersen #
Hi Mark,
I really like your design smell series posts, and i have a few questions.

I get and agree on your points about encapsulation and Automatic Properties, but as i see it a lot of frameworks forces you to break this encapsulation. An example is XML Serialization (ISerializable), which needs an empty constructor and then access to all the public properties to set them.

Couldn't this lead to both the automatic property and temporal coupling smells, as you need to have basic properties so the framework can set them, plus an empty constructor which then breaks your temporal encapsulation?
To adhere to the serialization interface you're forced to open up your code (or create a DTO).

2011-05-30 09:18 UTC
Theo, please see my latest post that touches on exactly that point.
2011-05-31 18:11 UTC
I wonder if there is a more elegant way to do this than creating guard clauses for every reference type property. It would have been nice to do this in a more declerative way. I posted a question on SO as well: http://stackoverflow.com/questions/6773189/auto-implemented-properties-with-non-null-check
2011-07-21 16:38 UTC
Ben Robinson #
Your points are arguments against using automatic properties inappropriately and are far from general cases. There are perfectly valid reasons to allow the setting of properties to null (e.g. using ORM how would you set a property mapped DB field to null) i think your example of a setter that throws a null reference exception is a code smell itself unless you have a specific reason for disallowing it. Your example of the RetryCount is simply an inappropriate use of an automatic property not at all an argument against automatic properties. Your example of temperature is an example of inappropriate type and has nothing to do with automatic properties at all, with the correct type an automatic property would make perfect sense as you could build the validation into the Temperature class. Automatic properties as just shorthand for simple field backed properties and where simple field back properties are correct then so are automatic properties. Simple field backed properties with no validation are sometimes used when they shouldn't be, but that has nothing to do with automatic properties.
2011-09-02 08:38 UTC
I agree with Ben's last point, I believe automatic property(or simple field backed property) is just a way to deal with how C# compiler does assembly linking, and using simple properties allows minor modifications to them without recompiling all dependent assemblies.

I somewhat agree to your Guard Clause argument, it's something I should pay more attention to instead of blindly implementing automatic properties. But I do have some doubts, in many cases, there is no proper default value, like Birth Date, Country, etc. In fact, I think that's the case most of the time, how would you deal with those situations?
2012-01-10 17:04 UTC
2012-01-15 16:40 UTC
Juliano Leal Goncalves #
Mark, have you considered revisiting this topic or adding some more info to this particular post after the introduction of automatic readonly properties? Most of what you discuss here is still applicable, but some of the newer constructs prevents a fair share of the problems associated with properties.
2018-03-20 14:15 UTC

Juliano, thank you for writing. You're right that after the publication of this post, C# got more features, at least one of which seems to relate to this article. When I write C# code today, I often use automatic read-only properties, combined with appropriate Guard Clauses in the constructor. I find that a useful language feature.

That particular later language feature (automatically implemented read-only properties) is, however, only tangentially related to the topic of this post. Did you have some other language construct in mind?

2018-03-20 16:12 UTC

Design Smell: Primitive Obsession

Wednesday, 25 May 2011 15:03:31 UTC

This post is the second in a series about Poka-yoke Design - also known as encapsulation.

Many classes have a tendency to consume or expose primitive values like integers and strings. While such primitive types exist on any platform, they tend to lead to procedural code. Furthermore they often break encapsulation by allowing invalid values to be assigned.

This problem has been addressed many times before. Years ago Jimmy Bogard provided an excellent treatment of the issue, as well as guidance on how to resolve it. In relation to AutoFixture I also touched upon the subject some time ago. As such, the current post is mostly a placeholder.

However, it's worth noting that both Jimmy's and my own post address the concern that strings and integers do not sufficiently encapsulate the concepts of Zip codes and phone numbers.

  • When a Zip code is represented as a string it's possible to assign values such as null, string.Emtpy, “foo”, very long strings, etc. Jimmy's ZipCode class encapsulates the concept by guaranteeing that an instance can only be successfully created with a correct value.
  • When a Danish phone number is represented as an integer it's possible to assign values such as - 98, 0, int.MaxValue, etc. Once again the DanishPhoneNumber class from the above example encapsulates the concept by guaranteeing that an instance can only be successfully created with a correct value.

Encapsulation is broken unless the concept represented by a primitive value can truly take any of the possible values of the primitive type. This is rarely the case.

Design Smell #

A class consumes a primitive type. However, further analysis shows that not all possible values of the type are legal values.

Improved Design #

Encapsulate the primitive value in a Value Object that contains appropriate Guard Clauses etc. to guarantee that only valid instances are possible.

Primitives tend to not be fail-safe, but encapsulated Value Objects are.


Comments

Jon Wingfield #
What do you mean by this statement:

Encapsulation is broken unless the concept represented by a primitive value can truly take any of the possible values of the primitive type. This is rarely the case.

Should a Qty field that only takes positive integers now be represented by a separate class? You seem to be defining encapsulation as compile-safe or something.
2011-05-25 15:52 UTC
I would very much like to be able to represent a quantity as a specialized class. Unfortunately uint is not a CLS-compliant type - otherwise I would use that one a lot.

Keep in mind that I'm talking about a code smell. This means that whenever you encounter the smell it should make you stop and think. I'm not saying that using a primitive type is always wrong.

With a quantity, and given the constraints of C#, unfortunately there isn't a lot we can do. While we could encapsulate it in a PositiveInteger struct, it wouldn't add much value from a feedback perspective since we can't make the compiler choke on a negative value.

However, a hypothetical PositiveInteger struct would still add the value that it encapsulates the invariant in one place instead of spreading it out all over the code base.

Still, a positive integer is such a basic concept that it never changes. This means that even if you spread Guard Clauses that check for negative values all over your code base, you may not have much of a maintenance burden, since the Guard will never change. Still, you might forget it in a couple of places.

In any case I like the Temperature example above much better because it not only provides safety, but also encapsulates the concept as well as provides conversion logic etc.
2011-05-25 18:03 UTC
Thomas #
I try whenever possible to design that way my api especially in the domain layer. However I run into some corner cases with RESTful web services when it's more convenient for the client to provide a simple primitive than the complex object. Then on the server side the check is done to know if the provided value is correct. Would you bother to enforce it that case?
2011-05-31 22:38 UTC
Thomas, did you see my post on application boundaries? I think it ought to answer your question :)
2011-06-01 05:45 UTC
Thomas #
Not yet:) I'm catching up with the old one first. But I go to it:)
2011-06-01 06:29 UTC
Scott Peterson #
Hello Mark:

I am revisiting an old post in hopes you can shed some light on Value objects and their limits. I am currently trying to implement a Password class, which seems like a perfect example of something that should not be handled as a simple string. I modelled my Password object based on examples you've provided, in that I pass in a string to a constructor and then run an IsValid method to see if the string meets our business rules for passwords (length, types of characters, etc.). That's fine as it is, and I have unit tests to make sure all is well. But there are more business rules to a password. I need to have a privately set DateCreated field, and I need to store the number of days the password is valid, while providing a function to see if the password is still valid based on the DateCreated and the number of days the password is valid. However, adding these things to my value object seems like I'm polluting it. Plus, I want to pass in the number of days valid when the object is created, so now I have two parameters, which causes problems if I want to have an explicit operator. I thought about creating a PasswordPrimitive class and then a Password class that inherits the PasswordPrimitive class, but that seems messy.

If you have any thoughts and/or guidance, I'd appreciate the input.

Regards,
Scott
2015-02-26 21:04 UTC

Scott, thank you for writing. A Value Object can contain more than a single primitive value. The canonical Value Object example is Money, and you often see Money examples where a Money value is composed of an amount and a currency; one example in the literature is Kent Beck's implementation in Test Driven Development: By Example, which contains amount and currency.

Thus, I don't see any intrinsic problem with your password class containing both a string (or a byte array?) and an expiration time.

It's true that you can no longer have a lossless conversion from your Value Object to a primitive value, but that's not a requirement for it to be a Value Object.

(BTW, I hope you don't store passwords, but only the hashes!)

2015-02-27 7:29 UTC

Design Smell: Temporal Coupling

Tuesday, 24 May 2011 14:00:42 UTC

This post is the first in a series about Poka-yoke Design - also known as encapsulation.

A common problem in API design is temporal coupling, which occurs when there's an implicit relationship between two, or more, members of a class requiring clients to invoke one member before the other. This tightly couples the members in the temporal dimension.

The archetypical example is the use of an Initialize method, although copious other examples can be found - even in the BCL. As an example, this usage of EndpointAddressBuilder compiles, but fails at run-time:

var b = new EndpointAddressBuilder();
var e = b.ToEndpointAddress();

It turns out that at least an URI is required before an EndpointAddress can be created. The following code compiles and succeeds at run-time:

var b = new EndpointAddressBuilder();
b.Uri = new UriBuilder().Uri;
var e = b.ToEndpointAddress();

The API provides no hint that this is necessary, but there's a temporal coupling between the Uri property and the ToEndpointAddress method.

In the rest of the post I will provide a more complete example, as well as a guideline to improve the API towards Poka-yoke Design.

Smell Example #

This example describes a more abstract code smell, exhibited by the Smell class. The public API looks like this:

public class Smell
{
    public void Initialize(string name)
 
    public string Spread()
}

Semantically the name of the Initialize method is obviously a clue, but on a structural level this API gives us no indication of temporal coupling. Thus, code like this compiles, but throws an exception at run-time:

var s = new Smell();
var n = s.Spread();

It turns out that the Spread method throws an InvalidOperationException because the Smell has not been initialized with a name. The problem with the Smell class is that it doesn't properly protect its invariants. In other words, encapsulation is broken.

To fix the issue the Initialize method must be invoked before the Spread method:

var sut = new Smell();
sut.Initialize("Sulphur");
var n = sut.Spread();

While it's possible to write unit tests that explore the behavior of the Smell class, it would be better if the design was improved to enable the compiler to provide feedback.

Improvement: Constructor Injection #

Encapsulation (Poka-yoke style) requires that the class can never be in an inconsistent state. Since the name of the smell is required, a guarantee that it is always available must be built into the class. If no good default value is available, the name must be requested via the constructor:

public class Fragrance : IFragrance
{
    private readonly string name;
 
    public Fragrance(string name)
    {
        if (name == null)
        {
            throw new ArgumentNullException("name");
        }
 
        this.name = name;
    }
 
    public string Spread()
    {
        return this.name;
    }
}

This effectively guarantees that the name is always available in all instances of the class. There  are also positive side effects:

  • The cyclomatic complexity of the class has been reduced
  • The class is now immutable, and thereby thread-safe

However, there are times when the original version of the class implements an interface that causes the temporal coupling. It might have looked like this:

public interface ISmell
{
    void Initialize(string name);
 
    string Spread();
}

In many cases the injected value (name) is unknown until run-time, in which case straight use of the constructor seems prohibitive - after all, the constructor is an implementation detail and not part of the loosely coupled API. When programming against an interface it's not possible to invoke the constructor.

There's a solution for that as well.

Improvement: Abstract Factory #

To decouple the methods in the ISmell (ha ha) interface the Initialize method can be moved to a new interface. Instead of mutating the (inconsistent) state of a class, the Create method (formerly known as Initialize) returns a new instance of the IFragrance interface:

public interface IFragranceFactory
{
    IFragrance Create(string name);
}

The implementation is straightforward:

public class FragranceFactory : IFragranceFactory
{
    public IFragrance Create(string name)
    {
        if (name == null)
        {
            throw new ArgumentNullException("name");
        }
        return new Fragrance(name);
    }
}

This enables encapsulation because both the FragranceFactory and Fragrance classes protect their invariants. They can never be in an inconsistent state. A client previously interacting with the ISmell interface can use the IFragranceFactory/IFragrance combination to achieve the same funcionality:

var f = factory.Create(name);
var n = f.Spread();

This is better because improper use of the API can now be detected by the compiler instead of at run-time. An interesting side-effect by moving towards a more statically declared interaction structure is that classes tend towards immutability. Immutable classes are automatically thread-safe, which is an increasingly important trait in the (relatively) new multi-core era.


Comments

Scott #
Great insight again in to core principles that often get over looked even by big names in software development.

Looking forward to the rest of the series.
2011-05-24 16:02 UTC
juan agui #
Have you ever considered renaming your blog to programming 101? It's amazing how most my colleagues are still writing procedural code and violate each and every OOP principle. Fortunately there are blogs like yours that explain brilliantly and concisely these topics. Keep up the good job!


2011-05-24 17:28 UTC
Although, I agree with the fact that temporal coupling must be avoided, when possible, I find the solution you propose quite overkill (as always with that dependency injection jazz, IMHO). With the initial design, at least, as a developer, the problem I had were simple to solve (ie: add the Initialize call). Now with DI, as a user of the Small API, I may have to deal with all sort of runtime problems, just because I didn't "read" the API, or the documentation properly :-) Plus, now I have 4 types instead of 1 loaded in my domain. A simple design may not be the most elegant, but it bears simple problems.
2011-05-25 05:50 UTC
So, suppose I want my FragranceFactory to return a Fragrance that implements ISmell (as you have it above) instead of an IFragrance--perhaps because I can't change ISmell). I should just have FragranceFactory return initialized Fragrances wherein the Initialize method does nothing, right? This would accomplish all our goals, although ISmell would still be confusing to the consumer.
2011-05-25 17:27 UTC
Simon, exactly the opposite is true. With the Fragrance API you wouldn't have to deal with any runtime problems, because as soon as you can compile, the design of the code makes it very likely that it's also going to work at run-time (except for null values, which you can't escape no matter what you do).

Getting feedback sooner is betting than getting feedback later. I much prefer getting a compiler error over a run-time exception.

The number of types is hardly a measure of the quality of the code, but if it is, the more the better. The Single Responsibility Principle favors many small classes over a few God Classes.

However, even if you don't agree, then comparing one type against four doesn't make sense. In the absence of interfaces there'd be one class in each alternative (Smell versus Fragrance). When you add interfaces into the mix, you have two types (Smell and ISmell) in the temporally coupled API against four types in the safe API.
2011-05-25 18:19 UTC
Patrick, yes, that might be a possible variation, although perhaps a bit exotic... In which scenario would that be the case?
2011-05-25 18:21 UTC
Not sure to follow you on the number of types. Initially, I had 1 Smell class, and in then end I have 2 classes and 2 interfaces. Or maybe I missed something.
In a real world project, multiply classes by 2, 3, 4 or more (lets add some WCF interfaces to this!) is indeed an issue, unless you generate some or all of it.
I'm not saying what you propose can't be done, or is inelegant. It is exactly the opposite, it's clever and smart, and is the style of code that can be taught in classes or blogged about, but the generalization of these principles can lead to bloated code and huge assemblies if applied massively.

About maintenability, I pretend your code is harder to extend. Initially, I had one class that I could easily extend and the JIT helps me a lot here. I can create a new Spread(arguments) methods easily without breaking any of my existing clients (and without changing the semantics coupling). It's less easy with interfaces, do I need one new ISmellxx interfaces for every new method? New types again? Classes are extensible, interfaces are carved in stone forever - in fact, most people *do* change their interfaces, but that's another story :-)
2011-05-26 07:08 UTC
This blog post discusses two scenarios. The first scenario is that you have only the Smell class. In that scenario, you can replace the Smell class with the Fragrance class (minus the IFragrance interface). No new types are necessary, so you start out with one type and end up with one type.

The second scenarios is when you have a Smell class implementing an ISmell interface. In that case you'd need to go from two to four types.

When it comes to maintainability, I find that the SOLID principles are some of the best guidelines around. The Open/Closed Principle explicitly states that classes should be closed for modification, but open for extensibility. In other words, we extend code only by adding new classes.
2011-05-26 18:24 UTC
_ikke_ #
Just to be sure:

The factory class is allowed to create concrete instances without using a DI container? And any dependencies that the concrete class needs will have to be passed to the factory class?

2011-05-27 10:49 UTC
Use of a DI Container is IMO orthogonal to the issue.

Conceptually it's best to think about any such factories as concrete classes. If they need to pass dependencies to the objects they create, they can do so by requesting them by Constructor Injection. In other words you are correct that "any dependencies that the concrete class needs will have to be passed to the factory class" (excluding those that are passed as parameters to the Create method).

On a practical side, you can always consider implementing the factory as an infrastructure component based on a container. Essentially, the factory would be an Adapter over the container. However, if you do this, you must be sure to place the Adapter in the Composition Root of the application, as otherwise you risk that the container reference leaks into your application code. This means that the concrete class and the factory will end up being implemented in two different assemblies. If you use Castle Windsor, you can go all the way and not even implement the container at all because instead you can leverage the Typed Factory facility.
2011-05-28 21:52 UTC
Mohammed #
All of our code is full of temporal couplings. You can not prevent it because we solve our problems in doing first step1, followed by step2, and so on and so forth. So I don't think you can create code where there is no temporal coupling. What you can do on the other hand, and what I think this blog should have addressed, is to always create valid objects. The design of the class Smell makes it possible creating the object of type Smell without a name. Right after you issued the keyword new() your object is invalid. Any method you call after that will operate on an invalid object (Initialize(...) tries to remedy that). Your solution on the other hand is the right one.
2011-06-17 13:07 UTC
Daniel Hilgarth #
Mark, thanks for this and the other articles in this and other series.
I have one question however:
FragranceFactory.Create checks name for null. Why?
I see the following problems with it:
1) Code duplication:
As Fragrance can be instantiated without the use of the factory, the null check can't be removed from the constructor of Fragrance.
2) Knowledge of implementation details of Fragrance:
The FragranceFactory knows that Fragrance can't work with a name that is null. This is not implicit, this is explicit knowledge. Another implementation of IFragrance could simply assign a default value if null is passed. As FragranceFactory is coupled tightly to Fragrancy, that is not such a big problem. Still I think it is not correct
3) No benefit whatsoever:
Adding the null check in the Create method brings no benefit at all. On the contrary: If we were to change the constructor of Fragrance to allow null and to assign a default value, but would forget to change FragranceFactory.Create accordingly, the semantics of the object creation would differ, depending on how the object is created.

Can you please explain what I have missed and why you decided to put the null check into the Create method?
2011-07-21 12:34 UTC
Daniel, thank you for your insightful comment - you are completely correct.

I just added the Guard Clause out of habit, but your arguments are valid and I will in the future be more careful with my Guard Clauses. It only makes sense that one should only have a Guard Clause if one uses the argument in question. That's not the case when it's just being passed on, so it should not be there.

Thanks for pointing that out.
2011-07-21 15:03 UTC
Mark, thanks for your answer. I hope I didn't sound like I was nitpicking. I simply was a bit puzzled about it.
2011-07-21 15:25 UTC
Not at all - I'm happy that you pointed out something I hadn't properly considered. I learned something from it :)
2011-07-21 15:28 UTC
Hey Mark,
What about fluent interfaces? They completely rely on temporal coupling by design:

var query = new QueryBuilder().WithField("dressing").NotNull().Containing("bacon");

Do you consider this an edge case, or do you actually dislike fluent interfaces in general?
2011-08-02 17:50 UTC
Not at all (to both statements). A Fluent Interface doesn't rely on Temporal Coupling at all - I frequently implement Fluent APIs with immutable types.

See this blog post for a very simple example. For a very complex example, see AutoFixture's Ploeh.AutoFixture.Dsl namespace, which is kicked off by the fixture.Build<T>() method.

In any case, Temporal Coupling doesn't disallow mutability - it just states that if you have mutating methods, the order in which they are invoked (if at all) mustn't matter.
2011-08-02 18:23 UTC
Emanuel Pasat #
Offtopic question: striving for immutability on the domain model is a good practice (or only value objects should be immutable)?
2011-09-16 14:32 UTC
The most common approach is certainly to model Entities as mutable objects while Value Objects are kept immutable.

When you use 'classic' CRUD-style Repositories or Unit of Work, that may also be the best match to that style of architecture. However, my more recent experience with CQRS and Event Sourcing seems to indicate to me that an immutable domain model begins to make a lot more sense. However, I've still to gain more experience with this before I can say anything with confidence.

It has been my general experience over the last couple of years that the more immutable objects I can define, the easier the code is to work with and reason about.
2011-09-16 19:18 UTC
Luis #
Quick noob question, why the IFragranceFactory interface? Your code factory.Create("name") which i imagine would be preceded by "var factory = new FragranceFactory()" does not use the interface, does it?
2012-09-11 14:12 UTC
An IFragranceFactory instance can be injected into any client that needs it, freeing it from having to create the tightly coupled instance itself.

The IFragranceFactory instance can be created in the Composition Root, potentially far removed from any clients. At this point it could be created with the 'new' keyword, or by a DI Container.
2012-09-11 15:43 UTC
Luis #
i came to this blog post with the specific intention of reading about temporal coupling, but after reading your reply to my message and feeling utterly confused i decided to go and read your book, maybe after reading it i'll actually understand what you're saying, and gather some knowledge about DI in the process, obviously this DI is a thing
2012-09-21 12:45 UTC
Sorry, I didn't mean to confuse you...
2012-09-21 13:15 UTC
James Wheeler #
Hi Mark, this is an old post now but it's still very relevant for me. What about the case when initialization is a time consuming process?

For example, at my company we work with a lot of devices and we often end up writing initialization methods for some Type, let's call it 'FooDevice,' that provides functionality for that kind of device.

Let's say our FooDevice implements an interface very similar to ISmell:
IFooDevice
{
    Task Initialize(string comPort);
    int DoSomething();
}
This particular initialization method accepts as input parameter the COM port on which the device is connected to and through which we send serial commands to control the device.

So the initialization method may then create a serial port object for that port, configure the device in various ways (such as disabling physical buttons, since we're now controlling it programatically) and so forth.

The initialization may be a time consuming process that can potentially take in excess of a few seconds or longer.

Following your advice, we should apply a refactoring here. The constructor for FooDevice should accept the COM port instead. Though that seems to imply that initialization should be done in the constructor:
class FooDevice : IFooDevice
{
    public FooDevice(string comPort)
    {
        // Initialize(comPort);
    }
    
    private void Initialize(string comPort) { /* ... */ }
}
Yet that becomes problematic, because now we've lost the ability to return a Task. But of course if we have our IFooDeviceFactory, it can provide the Task:
IFooDeviceFactory
{
    Task<IFooDevice> Create(string comPort);
}
Now, if I understand your example correctly, then the whole purpose of providing an Abstract Factory is to satisify the requirement that the client be in control of when the object is created. Or else the client could just directly accept an IFragrance and no factory would be needed. Another client could feasibily skip the factory and invoke the Fragrance constructor directly.

But if we allow the same behavior with my above example, we're back in the same awkward situation with the constructor.

What can we do then? Should we make the constructor private, and instead provide a static factory method like the following?
class FooDevice : IFooDevice
{
    private FooDevice() { }

    public static Task<FooDevice> Create(string comPort)
    {
        var device = new FooDevice();
        await device.Initialize(string comPort);

        return device;
    }

    private Task Initialize(string comPort) { /* ... */ }
}
2021-06-07 22:22 UTC

James, thank you for writing. Based on what you describe, I find the last option, the one with a static factory method, preferable.

You may want to consider calling it something like Connect instead of Create, but then again, perhaps this isn't much of an improvement. The most important information is in the return type. Since the method returns a task, it's a strong signal that this could take some time.

What about closing the connection? Is that a concern? If so, you may want to make the object implement IDisposable. These days, various code analysers will tell you to dispose of such an object once you're done with it.

You write:

"the whole purpose of providing an Abstract Factory is to satisify the requirement that the client be in control of when the object is created"

Not quite. The main purpose of such a design is encapsulation, guaranteeing that an object can never be in an invalid state.

2021-06-09 12:42 UTC
James Wheeler #

Thanks for the reponse Mark. I too would consider calling the static factory method Connect but I stuck with Create for the sake of consistency with the example.

I'm not sure I follow your comment about encapsulation. With your example, FragranceFactory merely delegates its work to the Fragrance constructor--the Fragrance class already achieves encapsulation by requiring name as a constructor argument.

If you had instead introduced a SmellFactory that behaved like the following, then I would agree it encapsulates the required initialization protocol:

			class SmellFactory : ISmellFactory
			{
				ISmell Create(string name)
				{
					var smell = new Smell();
					smell.Initialize(name);
					return smell;
				}
			}
			

So the only usage I can see for keeping a factory around after improving the design with IFragrance is to allow the client to lazily create an instance of Fragrance without tightly coupling the client via the constructor. Otherwise, an already valid instance of IFragrance could be passed to the client.

Indeed, you preface the introduction of the Abstract Factory improvement with, "In many cases the injected value (name) is unknown until run-time..."

As for disposal, yes, we almost always need to implement IDisposable, since the supplier's managed API for the device usually implements IDisposable as well.

However, wouldn't using an Abstract Factory like this be undesirable? IDevice would need to implement IDisposable so that the client could invoke Dispose on the device instance created by the factory. But as you and Steven point out in your book on dependency injection, an abstraction that implements IDisposable is a leaky abstraction. Or would that not apply in this case?

2021-06-14 19:25 UTC

James, thank you for writing. There are two separate concerns here:

  • Guaranteeing that an object is in a valid state (encapsulation)
  • Enabling polymorphism
What valid state means is specific to an implementation; i.e. a concrete class. Thus, if FooDevice can only be initialised like shown above, that's an implementation detail.

The other concern is polymorphism. Your original definition of the IFooDevice interface had two methods: Initialize and DoSomething. In order to retain both of these capabilities, you're going to need to put both of these functions somewhere. In order to address the sequential coupling implied by that version of the interface, you'll have to distribute those methods over two different polymoprhic types:

IFooDeviceFactory
{
    Task<IFooDevice> Create(string comPort);
}

IFooDevice
{
    int DoSomething();
}

This prevents programmer error. With the original design, a client developer could call DoSomething without first calling Initialize. With the new design, this is no longer possible. You can't call DoSomething unless you have an IFooDevice instance, and you can't get an IFooDevice instance unless you call the Create method.

Is the factory required? Not necessarily. That depends on where the responsibility to create the object lies. If client code is the only code in possession of the comPort value, then you'll need the polymorphic factory in order to intialise a polymorphic object. On the other hand, if the comPort value is available to the application infrastructure (e.g. via a configuration value or environment variable) then you may not the the abstract factory. In that case, any client code that needs an IFooDevice object can receive it via the constructor, and the application's Composition Root can create the object using the concrete Create method.

As to the observation about interfaces inheriting from IDisposable, I still consider that a leaky abstraction. There's usually a way to avoid it, but it tends to depend on the details of the application architecture. Thus, it's difficult to give general advice on that topic without knowing more about the actual circumstances.

2021-06-17 10:50 UTC

Poka-yoke Design: From Smell to Fragrance

Tuesday, 24 May 2011 13:57:39 UTC

Encapsulation is one of the most misunderstood aspects of object-oriented programming. Most people seem to think that the related concept of information hiding simply means that private fields should be exposed by public properties (or getter/setter methods in languages that don't have native properties).

Have you ever wondered what's the real benefit to be derived from code like the following?

private string name;
public string Name
{
    get { return this.name; }
    set { this.name = value; }
}

This feels awfully much like redundant code to me (and automatic properties are not the answer - it's just a compiler trick that still creates private backing fields). No information is actually hidden. Derick Bailey has a good piece on why this view of encapsulation is too narrow, so I'm not going to reiterate all his points here.

So then what is encapsulation?

The whole point of object-orientation is to produce cohesive pieces of code (classes) that solve given problems once and for all, so that programmers can use those classes without having to learn about the intricate details of the implementations.

This is what encapsulation is all about: exposing a solution to a problem without requiring the consumer to fully understand the problem domain.

This is what all well-designed classes do.

  • You don't have to know the intricate details of TDS to use ADO.NET against SQL Server.
  • You don't have to know the intricate details of painting on the screen to use WPF or Windows Forms.
  • You don't have to know the intricate details of Reflection to use a DI Container.
  • You don't have to know how to efficiently sort a list in order to efficiently sort a list in .NET.
  • Etc.

What makes encapsulation so important is exactly this trait. The class must hide the information it encapsulates in order to protect it against ‘naïve' users. Wikipedia has this to say:

Hiding the internals of the object protects its integrity by preventing users from setting the internal data of the component into an invalid or inconsistent state.

Keep in mind that users are expected to not fully understand the internal implementation of a class. This makes it obvious what encapsulation is really about:

Encapsulation is a fail-safe mechanism.

By corollary, encapsulation does not mean hiding complexity. Whenever complexity is hidden (as is the case for Providers) feedback time increases. Rapid feedback is much preferred, so delaying feedback is not desirable if it can be avoided.

Encapsulation is not about hiding complexity, but conversely exposing complexity in a fail-safe manner.

In Lean this is known as Poka-yoke, so I find it only fitting to think about encapsulation as Poka-yoke Design: APIs that make it as hard as possible to do the wrong thing. Considering that compilation is the cheapest feedback mechanism, it's preferable to design APIs so that the code can only compile when classes are used correctly.

In a series of blog posts I will look at various design smells that break encapsulation, as well as provide guidance on how to improve the design to make it safer, thus going from smell to fragrance.

  1. Design Smell: Temporal Coupling
  2. Design Smell: Primitive Obsession
  3. Code Smell: Automatic Property
  4. Design Smell: Redundant Required Attribute
  5. Design Smell: Default Constructor
  6. DI Container smell: Captive Dependency

Postscript: At the Boundaries, Applications are Not Object-Oriented


Comments

Perhaps it would be beneficial to elaborate on your focus here by describing when/where these "smells" are applicable. With the .NET community (as a whole) only recently coming on to topics like anemic models, I fear that they will take this advice and apply it to every scenario.

That's not to say that this is not relevant information but surely you're not implying that's applicable to scenarios like messaging, RESTful APIs, and other circumstances that need easily serializable objecst?
2011-05-30 15:03 UTC
Kurt Guntheroth #
You're still doing it wrong. Encapsulation means that, instead of exposing data, you expose an interface that exports exactly the operations and values that are defined for your class.

For instance, if a string name is used as a sorting key, it isn't appropriate to change the name. You can make the name string public, but that implies that changing the name is a valid operation, and might lead to bugs later. Providing a const getter, but no setter for the name says "You can't change this name".

If you have a setter and a getter for a piece of data, it should be because the class needs to expose that data for the purpose of changing it. That happens a lot, and it shouldn't be viewed as unreasonable that you have a private data member and a setter/getter. It's not a waste of time or code. It's a clear contract with users of your class that you intend to provide these operations no matter how the class evolves.

One important property of good encapsulation is that you are free to change the data representation of your class if the interface remains the same, and your changes will be limited to the methods of the class itself. Want to change from a 1-based count to a zero-based index? If you exposed a member called items, you're screwed. If you exposed a method called CountGet() you're ok. Just change CountGet()'s implementation from returning items to return items+1.
2011-06-01 16:22 UTC
One thing at a time... There's a reason I themed this series of blog posts around fail-safing. I agree that encapsulation involves more than just that, but I don't think I ever said anything else.
2011-06-01 18:52 UTC
Florian Fordermaier #
Interesting post. A few years ago I did some research on the "absence of proper encapsulation language features" in C# and Java. It's an interesting approach to advance from smells to fragrance, pointing out code snippets that potentially violate encapsulation in some way. But - to be honest - this will help only those people that are even aware of these problems.
You mentioned the compiler as the first and cheapest feedback mechanism, so the target should be to achieve automation of the process of enforcing proper encapsulation with (a) a better compiler or (b) static analysis or (c) runtime functionality that can be applied minimally-invasive.
See C++'s const qualifier, an excellent example of a language feature to support proper encapsulation, this would allow for e.g. auto-properties with a getter/setter when making the setter const. Of course this will itself impact your design, but it offers a language integrated fail-safe mechanism for encapsulation. What do you think?
I also may have an additional smell for you, a violation of the law of demeter breaks encapsulation in most cases.
2011-06-16 08:12 UTC
Florian, I agree with your statement about Law of Demeter, although Fowler calls it 'the occasionally useful suggestion of Demeter' (or some such thing).

I don't have any comment on the C++ const qualifier, as I have no idea what it does...
2011-06-18 18:00 UTC
Florian Fordermaier #
Mark,

first of all I have to correct my first comment, auto properties with a const setter is definitely non-sense.
The C++ const qualifier is effectively a statically checked and compiler-enforced construct to syntactically express your objectives regarding "permissions" to change an object's state.
If I e.g. declare a class A with a const method. Every caller calling the const method knows, that wahtever the method itself does, it will definitely not change the state of the object - imagine you have a immutable 'this' in your const method.
The same holds for e.g. a parameter that is passed to a method. If the parameter is declared const, the compiler will enforce that the parameter (be it a value or reference type) will not be changed.
But the real problem with every object that is owned by another object and exposed in some way(e.g. property getter), is, that when I return a reference to it, the caller that received the reference can change this object's state without me knowing it - this breaks encapsulation. The const qualifier comes to the rescue, when I return a const reference, the caller cannot change the returned object (compiler-checked!).

Although the const qualifier does not solve all the problems you mentioned in your blog, it can be of help. I actually only brought your attention to this C++ language construct to have an example in hand (I'm far away from a C++ expert) for what I meant with "automation of the process of enforcing proper encapsulation". I'm still interested in your opinion regarding efforts on automation of these things...


2011-06-21 11:30 UTC
Sounds useful. Currently, Microsoft is making an effort along the same lines with Code Contracts, but although it's actually a released technology it still seems a bit immature to me.

As I've been using TDD since 2003 I usually just codify my assumptions about invariants in the tests I write. A framework like Greg Young's Grensesnitt might also come in handy there.
2011-06-21 19:03 UTC

Tennis Kata with immutable types and a cyclomatic complexity of 1

Monday, 16 May 2011 11:01:00 UTC

Recently I had the inclination to do the Tennis Kata a couple of times. The first time I saw it I thought it wasn't terribly interesting as an exercise in C# development. It would basically just be an application of the State pattern, so I decided to make it a bit more interesting. More or less by intuition I decided to give myself the following constraints:

Now that's more interesting :)

Given these constraints, what would be the correct approach? Given that this is a finite state machine with a fixed number of states, the Visitor pattern will be a good match.

Each player's score can be modeled as a Value Object that can be one of these types:

  • ZeroPoints
  • FifteenPoints
  • ThirtyPoints
  • FortyPoints
  • AdvantagePoint
  • GamePoint

All of these classes implement the IPoints interface:

public interface IPoints
{
    IPoints Accept(IPoints visitor);
 
    IPoints LoseBall();
 
    IPoints WinBall(IPoints opponentPoints);
 
    IPoints WinBall(AdvantagePoint opponentPoints);
 
    IPoints WinBall(FortyPoints opponentPoints);
}

The interesting insight here is that until the opponent's score reaches FortyPoints nothing special happens. Those states can be effectively collapsed into the WinBall(IPoints) method. However, when the opponent either has FortyPoints or AdvantagePoint, special things happen, so IPoints has specialized methods for those cases. All implementations should use double dispatch to invoke the correct overload of WinBall, so the Accept method must be implemented like this:

public IPoints Accept(IPoints visitor)
{
    return visitor.WinBall(this);
}

That's the core of the Visitor pattern in action. When the implementer of the Accept method is either FortyPoints or AdvantagePoint, the specialized overload will be invoked.

It's now possible to create a context around a pair of IPoints (called a Game) to implement a method to register that Player 1 won a ball:

public Game PlayerOneWinsBall()
{
    var newPlayerOnePoints = this.PlayerTwoScore
        .Accept(this.PlayerOneScore);
    var newPlayerTwoPoints = 
        this.PlayerTwoScore.LoseBall();
    return new Game(
        newPlayerOnePoints, newPlayerTwoPoints);
}

A similar method for player two simply reverses the roles. (I'm currently reading Lean Architecture, but have yet to reach the chapter on DCI. However, considering what I've already read about DCI, this seems to fit the bill pretty well… although I might be wrong on that account.)

The context calculates new scores for both players and returns the result as a new instance of the Game class. This keeps the Game and IPoints implementations immutable.

The new score for the winner depends on the opponent's score, so the appropriate overload of WinBall should be invoked. The Visitor implementation makes it possible to pick the right overload without resorting to casts and if statements. As an example, the FortyPoints class implements the three WinBall overloads like this:

public IPoints WinBall(IPoints opponentPoints)
{
    return new GamePoint();
}
 
public IPoints WinBall(FortyPoints opponentPoints)
{
    return new AdvantagePoint();
}
 
public IPoints WinBall(AdvantagePoint opponentPoints)
{
    return this;
}

It's also important to correctly implement the LoseBall method. In most cases, losing a ball doesn't change the current state of the loser, in which case the implementation looks like this:

public IPoints LoseBall()
{
    return this;
}

However, when the player has advantage and loses the ball, he or she loses the advantage, so for the AdvantagePoint class the implementation looks like this:

public IPoints LoseBall()
{
    return new FortyPoints();
}

To keep things simple I decided to implicitly model deuce as both players having FortyPoints, so there's not explicit Deuce class. Thus, AdvantagePoint returns FortyPoints when losing the ball.

Using the Visitor pattern it's possible to keep the cyclomatic complexity at 1. The code has no branches or loops. It's immutable to boot, so a game might look like this:

[Fact]
public void PlayerOneWinsAfterHardFight()
{
    var game = new Game()
        .PlayerOneWinsBall()
        .PlayerOneWinsBall()
        .PlayerOneWinsBall()
        .PlayerTwoWinsBall()
        .PlayerTwoWinsBall()
        .PlayerTwoWinsBall()
        .PlayerTwoWinsBall()
        .PlayerOneWinsBall()
        .PlayerOneWinsBall()
        .PlayerOneWinsBall();
 
    Assert.Equal(new GamePoint(), game.PlayerOneScore);
    Assert.Equal(new FortyPoints(), game.PlayerTwoScore);
}

In case you'd like to take a closer look at the code I'm attaching it to this post. It was driven completely by using the AutoFixture.Xunit extension, so if you are interested in idiomatic AutoFixture code it's also a good example of that.

TennisKata.zip (3.09 MB)

Comments

It's a beautiful thing
2011-05-17 14:59 UTC
Thanks :)
2011-05-17 15:02 UTC
Klaus Hebsgaard #
Funny to see, I believe I created thi kata once upon a time ...
2011-05-19 15:31 UTC

Page 63 of 76

"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!