Zookeepers must become Rangers

Tuesday, 18 December 2012 15:08:31 UTC

For want of a nail the shoe was lost... This post explains why Zookeeper developers must become Ranger developers to escape the vicious cycle of impossible deadlines and low-quality code.

In a previous article I wrote about Ranger and Zookeeper developers. The current article is going to make no sense to you if you haven't read the previous one, but in brief, Rangers explicitly deal with versioning and forwards and backwards compatibility (henceforth called temporal compatibility) in order to produce workable code.

While reading the previous article, you may have thought that you'd rather like to be a Ranger than a Zookeeper, simply because it sounds cooler. Well, I didn't pick those names by accident :) You don't want to be a Zookeeper developer.

Zookeepers may be under the impression that, since they (or their organization) control the entire installation base of their code, they don't need to deal with the versioning aspect of software design. This is an illusion. It may seem like a little thing, but, through a chain reaction, the lack of versioning leads to impossible deadlines, slipping code quality, death marches and many unpleasant aspects of our otherwise wonderful vocation.

In order to escape the vicious cycle of low-quality code, death marches, and firefighting, Zookeepers need to explicitly deal with versioning of the software they produce, essentially turning themselves into Rangers.

In the following, I will make two assumptions about the type of software I discuss here:

  • It's impossible to predict all future feature requirements.
  • Applications don't exist in a vacuum. They depend on other applications, and other applications depend on them.

For the vast majority of Zoo Software, I believe these tenets to be true.

In order to understand why versioning is so important (even for Zoo Software) it's important to first understand why Zookeepers consistently disregard it.

We don't need no steenking design guidelines #

It's sometimes a bit surprising to me that (.NET) programmers are resistant to good design. There's this tome of knowledge originally published as the Framework Design Guidelines and later published online on MSDN as the Design Guidelines for Developing Class Libraries. Even if you don't care to read through it all, tools such as Visual Studio Code Analysis and FxCop (which is free) encapsulate many of those guidelines and help identify when your code diverges.

In my experience, Zookeeper resistance against the Framework Design Guidelines is perfectly exemplified by the 'rules' related to selecting between classes and interfaces:

  • Do favor defining classes over interfaces.
  • Do use abstract (MustInherit in Visual Basic) classes instead of interfaces to decouple the contract from implementations.

(For the record, I think this is horrible advice, but that's a discussion for another day.)

The problem with a guideline like this is that most Zookeepers react to such advice by thinking: "This isn't relevant for the kind of code I write. I control the entire install base of my code, so it's not a problem for me to introduce a breaking change. That entire knowledge base of design guidelines is targeted at another type of developer. I need not care about it."

Zookeepers fail to address versioning and temporal compatibility because they (incorrectly) assume that they can always schedule deployments of services and clients together in such a way that breaking changes never actually break anything. That may work as long as systems are monolithic and exist in a vacuum, but as soon as you start integrating systems, this disregard for versioning leads straight to hell.

Example: an internal music catalog service #

Now that you understand why Zookeepers tend to ignore versioning, it's time to understand where that attitude leads. This is best done with an example.

Imagine a team tasked with building a music catalog service for internal use. In the first release of the service a track looks like this:

<track>
  <name>Recovery</name>
  <artist>Curve</artist>
  <album>Come Clean</album>
  <length>288</length>
</track>

This is obviously a naïve attempt, and a bit of planning could probably have prevented the particular nasty surprise that follows. However, this is just an illustrative example, and I'm sure you've found yourself in a situation where a new requirement took you entirely by surprise - no matter how much you tried to predict the future.

After the team has released the first version of the music catalog service, it turns out that some tracks may be the result of collaborations, and thus have multiple artists. The track may also appear in multiple albums (such as compilations or greatest hits collections), or on no album at all. Thus, version 2 of a track will have to look like this:

<track>
  <name>Under Pressure</name>
  <artists>
    <artist>David Bowie</artist>
    <artist>Queen</artist>
  </artists>
  <albums>
    <album>Hot Space</album>
    <album>Greatest Hits</album>
    <album>The Singles: 1969-1993</album>
  </albums>
  <length>242</length>
</track>
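
A minimal sketch of a version 1 client can illustrate what happens when it receives such a document. The `TrackReaderV1` class and its `ReadArtist` method are hypothetical names introduced only for this illustration; no client code is shown in this post, so treat this as one plausible way a client might have been written against version 1:

```csharp
using System;
using System.Xml.Linq;

// Hypothetical version 1 client; not taken from any real system.
public static class TrackReaderV1
{
    // Assumes exactly one <artist> element directly below <track>,
    // which is what version 1 of the service returns.
    public static string ReadArtist(string xml)
    {
        XElement track = XElement.Parse(xml);
        XElement artist = track.Element("artist");
        if (artist == null)
            throw new InvalidOperationException(
                "Expected an <artist> element directly below <track>.");
        return artist.Value;
    }
}
```

Given the version 1 document, such a client would read the artist without trouble. Given the version 2 document, it would fail, because the `artist` elements have moved below the new `artists` element.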

This is obviously a breaking change, but since the team works in an organization where the entire installation base is under control, they work with the Enterprise Architecture team to schedule a release of this breaking change with all their clients.

However, the service was made to support other systems, so this change must be coordinated with all the known clients. It turns out that three systems depend on the music catalog service.

Each of these systems is being worked on by a separate team, each with individual deadlines. These teams are not planning to deploy new versions of their applications all on the same day. Their immediate schedules are already set, and they aren't aligned.

The Architecture team must negotiate future iterations with teams A, B, and C to line them up so that the music catalog team can release their new version on the same day.

This alignment may be months into the future - perhaps a significant fraction of a year. The music catalog team can't sit still while they wait for this date to arrive, so they work on other new features, most likely introducing other breaking changes along the way.

Meanwhile, team A goes through three iterations. In the first two iterations, they know that they are going to deploy into a production environment with version 1 of the music catalog, so they are going to need a testing environment that looks like that. For the third iteration, they know that they are going to deploy into a production environment with version 2 of the music catalog, so they are going to have to change their testing environment to reflect that fact.

Team B goes through four iterations that don't align with those of Team A. When Team A updates their testing environment to music catalog version 2, Team B is still working on an iteration where they must test against version 1. Thus, they must have their own testing environment. They can't share the testing environment with Team A.

The same is true for Team C: it needs its own private testing environment. Notice that this is a testing environment that not only involves configuration and maintenance of the team's own application, but also of its dependency, the music catalog service. In order to provide realistic data for performance testing, each testing environment must also be maintained with representative sample data, and may have to run on production-like hardware, in production-like network topologies. Add software licences to the mix, and you may start to realize that such a testing environment is expensive.

So far, the analysis has been unrealistically simplified. It turns out that the A, B, and C applications have other dependencies besides the music catalog service.

Each of these dependencies is also moving, producing new features. Some of those new features also involve breaking changes. Once again, the Enterprise Architecture team must negotiate a coordinated deployment date to make sure nothing breaks. However, the organization has more applications than A, B, and C. Application D, for instance, doesn't depend on the music catalog service, but it shares a dependency with application A. Thus, Team D must take part in the coordination effort, making sure that they deploy their new version on the same day as everyone else. Meanwhile, they'll need yet another one of those expensive testing environments.

I'm not making this up. I've had clients (financial institutions) that had hundreds of testing environments and big, coordinated deployments. Their applications didn't have two or three dependencies, but dozens of dependencies each - often on mainframe-based systems (very expensive).

The rot of Big Bang Releases #

All this leads to Big Bang Releases. Once or twice each year, all Zoo Software is simultaneously deployed according to a meticulously laid plan. This date becomes a non-negotiable deadline. No team can miss the deadline, because dozens of other teams directly or indirectly depend on each other.

What happens when a Big Bang deadline grows nearer? What if you aren't ready to release? What if you just discovered a catastrophic bug?

This is what happens:

  • Teams work nights and weekends to meet deadlines. Members burn out, or get divorced, or decide to quit. More work is left for remaining team members. A vicious circle indeed.
  • Software is deployed with known bugs. Often, it's simply not possible to address all bugs in time, because they are discovered too late.
  • The team enters survival mode, building up technical debt. Unit tests are ignored. Dirty hacks are made. Code rots. In the end, the team deploys a version of the software where they don't really know what works and what doesn't work. This makes it harder to work on the next version, because the lack of test coverage means that they don't even know if they'll be introducing breaking changes. In the absence of knowledge it's better to assume the worst. Another vicious cycle is created.

Remember: each team must deploy at the Big Bang Release, no matter how bad the quality is. It's not even possible to elect to skip a release altogether, because the other teams have built their new versions with the assumption that your application's breaking changes will be deployed. In a Big Bang Release, there are only two options: Deploy everything, or deploy nothing. Deciding to deploy nothing may threaten the entire company's existence, so that decision is rarely made.

Do you think I'm exaggerating? I have seen this happen more than once. In fact, I also strongly suspect that this sort of situation recently unfolded itself within a big international ISV - at least, the symptoms are all here, for all to see :)

All this because Zookeepers don't deal with temporal compatibility.

Call to action #

Zookeepers must stop being Zookeepers and become Rangers. There's no excuse to not deal explicitly with versioning. You can't just introduce a breaking change in your code. The cost is often much larger than you think. The repercussions can be immense. Sometimes there's no way around breaking changes, but deal with them in tried-and-true ways. Introduce your new API side-by-side with the old API and deprecate the old API. Give clients a chance to move to the new API at their own pace. Monitor usage of your API to figure out when it's safe to completely remove the deprecated API. There are many well-described ways to deal with versioning and temporal compatibility. This is not an article about how to evolve an API, but rather a call to action: explicitly address versioning, or face the consequences!
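
As a rough illustration of the side-by-side approach, here is a minimal sketch in C#. The `Track` class below is hypothetical and only loosely mirrors the music catalog example; the choice to let the old `Artist` property return the first artist is an assumption made for this sketch, not a recommendation from any guideline. The `[Obsolete]` attribute is the standard .NET way to make the compiler warn callers of a deprecated member:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type, for illustration only.
public class Track
{
    private readonly List<string> artists;

    public Track(string name, IEnumerable<string> artists)
    {
        this.Name = name;
        this.artists = new List<string>(artists);
    }

    public string Name { get; private set; }

    // New API (version 2): a track can have several artists.
    public IReadOnlyList<string> Artists
    {
        get { return this.artists; }
    }

    // Old API (version 1), kept side-by-side so that existing clients
    // keep compiling and running. Callers get a compiler warning.
    // Assumption: returning the first artist is an acceptable fallback.
    [Obsolete("Use Artists instead. Artist will be removed in a later version.")]
    public string Artist
    {
        get { return this.artists.Count > 0 ? this.artists[0] : null; }
    }
}
```

With something like this in place, a client written against version 1 can keep reading `Artist` until its team has time to move to `Artists`, and the old member can be deleted once monitoring shows that nobody uses it anymore.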

Looser coupling can help. What I've described here is simply a form of tight temporal coupling. Still, even with loosely coupled, asynchronous, messaging-based architectures, messages must still be explicitly versioned to avoid problems.
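
One way to make a message version explicit is sketched below. Note the assumptions: the `version` attribute on the `track` element is invented for this sketch (the XML shown earlier in this post carries no version information), and `TrackMessageReader` is a hypothetical name. The idea is only that a reader can inspect the version and handle each known format, instead of guessing:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical reader; the 'version' attribute is an assumption
// made for this sketch and is not part of the XML in the post.
public static class TrackMessageReader
{
    public static IReadOnlyList<string> ReadArtists(string xml)
    {
        XElement track = XElement.Parse(xml);

        // Messages without a version attribute are treated as version 1.
        string version = (string)track.Attribute("version") ?? "1";

        switch (version)
        {
            case "1":
                // Version 1: <artist> directly below <track>.
                return track.Elements("artist")
                    .Select(e => e.Value)
                    .ToList();
            case "2":
                // Version 2: <artist> elements wrapped in <artists>.
                return track.Elements("artists")
                    .Elements("artist")
                    .Select(e => e.Value)
                    .ToList();
            default:
                throw new NotSupportedException(
                    "Unknown track message version: " + version);
        }
    }
}
```

A reader like this lets old and new messages coexist while producers and consumers upgrade at their own pace; an unknown version fails loudly instead of being silently misread.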

Once an organization has moved entirely to a loosely coupled, explicitly versioned architecture where each team can use Continuous Delivery, they may find that they don't need that Enterprise Architecture team at all :)


Comments

Interesting post. I agree with you on the need for new terminology and ways to express the evolution of software development. This post reminds me how important it is to be mindful when we are producing software. I just recently heard a podcast (http://www.thersa.org/__data/assets/file/0006/999816/20121206NassimNicholasTaleb.mp3) that talks about the fragility of systems. I think you might find it interesting, and I believe it complements the idea of making software robust to change. The speaker also has a good, related book: http://www.amazon.com/Antifragile-Things-That-Disorder-ebook/dp/B0083DJWGO/ref=tmm_kin_title_0.

Thanks for sharing,

Jason
2012-12-19 02:08 UTC
Markus Zywitza #
The difference between "Rangers" and "Zookeepers" is the same as between professional and unprofessional developers, independent of the installation base. The Agile movement, which is adopted among a lot of internal development teams, mandates practices like test-first, continuous integration, automated acceptance tests, instant feedback and so on.

In such a case, one had to change an acceptance test if a service could not deal with legacy clients. A developer who does this, or a manager who commands this, acts unprofessionally.

A team that calls itself agile and does not use the tools and techniques necessary for agile development, is just a bunch of cowboy coders, but neither "Zookeepers" nor "Rangers".
2012-12-19 08:43 UTC
Markus #
Markus, I suspect you can adopt TDD+CS+Automation and still be a Zookeeper. All those things are boxes you can tick and still get your company into a mess.

The point of the article (IMO) is that developers need to learn a new tool that you haven't listed - namely good internal version handling (and to start thinking of internal collaborators in the same way as external).

A tricky thing for an organisation is to decide when to make the move from (easy-to-write) temporally coupled code to discrete services (with increased versioning and infrastructure overheads). Most places with this problem have started off small, grown quickly by bolting bits onto the existing architecture, and by the time someone switched on enough to start changing the organisation takes charge, it is too late to make the switch easily. TBH, in most places I've worked where this is the root cause of the problems, the Enterprise Architect team can't see the wood for the trees.
2012-12-19 14:28 UTC
Markus Zywitza, the dictionary definition of "professional" simply means that one gets paid to do a particular task. There are lots of professional programmers who don't use testing, etc. While I agree that it often seems ignorant or short-sighted to neglect to follow what you call "agile practices", you can hardly say that it's unprofessional if the team gets paid to do it. Thus, we need a different terminology.

However, the other Markus is spot on. The message of this article isn't that you should use TDD, Continuous Integration, etc. Lots of people have already said that before me. The message is that even if you are an 'internal' developer, you must think explicitly about versioning and backwards compatibility.
2012-12-26 21:44 UTC
I think the first Markus's definition of a professional developer was more in line with how dear Uncle Bob defines it in The Clean Coder than with the strict dictionary definition: a developer who takes responsibility for the code he writes by having tests, delivering on time, saying and sticking to a no, and not agreeing to impossible schedules.
2012-12-28 17:25 UTC
Harry McIntyre #
I think some copy-and-paste has led to some confusion. I'm actually Harry McIntyre, not another Markus! Apologies.
2013-01-02 10:00 UTC

Rangers and Zookeepers

Tuesday, 18 December 2012 08:10:23 UTC

This article discusses software that runs in the wild, versus software running in potentially controllable environments.

There are many perspectives on software development. One particular perspective that has always interested me is the distinction between software running 'in the wild' versus software running in potentially controllable environments (such as corporate networks, including DMZs). Whether it's one or the other has a substantial impact on how you write, release and maintain software.

Historical perspective #

Software 'in the wild' is simply software where you don't know, or can't control, the install base. In the good old days (a few years ago) that typically meant software produced by ISVs such as Microsoft, Oracle, SAS, SAP, Autodesk, Adobe etc. The software produced by such organizations was/is often purchased by license and deployed by enterprise operations organizations. Even for ISVs targeting end-users directly (such as personal tax software, single player games, etc.), the software was/is typically installed by the individual user, and the ISV had/has zero control of the deployment environment or schedule.

The opposite of ISV Software has until recently been Enterprise Software. The problem with this term is that it has become almost derogatory, but I don't think that's entirely fair, because the forces working on this kind of software are very different from those working on ISV software. In this category I count specialized software made for specialized, internal purposes. It can be developed by in-house developers or a hired team, but the main characteristic is that the software is deployed in a potentially controllable environment (I originally wrote 'controlled environment,' but one reviewer interpreted this to indicate only software explicitly managed by process and tools such as Chef or Puppet). Even if there are several deployment environments such as testing, staging and production, and even if we are talking about Client/Server architectures with desktop clients deployed to enterprise desktops, an operations team should be able to keep track of all installations and schedule updates (if any). Often the original developers can work with operations to troubleshoot problems directly on the installed machine (or at least get logs).

Note that I'm not counting software such as SAP, Microsoft Dynamics or Oracle as Enterprise Software, despite the fact that such software is used by enterprises. Enterprise software is software developed by the enterprise itself, for its own purposes.

Enterprise Software can be a small system that sits in a corner of a corporate network, used by two employees every last weekday of the month. It can also be a massively scalable system built for a special occasion, such as the official site for a big sports tournament like the FIFA World Cup or the Summer Olympics.

Current perspective #

The historical distinction of ISV versus Enterprise Development makes less and less sense today. First of all, with the shift of emphasis towards SaaS, traditional ISVs are moving into an area that looks a lot like Enterprise Development. With SaaS, vendors suddenly find themselves in a situation where they control the entire installation base (at least on the service side). Some of them are now figuring out that this enables them to iterate faster than before. Apparently even such an Enterprisey-sounding service as the Team Foundation Service is now deploying new features several times a year. Expect that cadence to increase.

On the other hand, the rising popularity of Open Source Software (OSS) suddenly puts a lot of OSS developers in the old position of ISV developers. OSS tends to run in the wild, and the developers have no control of the installation base or upgrade schedules.

Oh, and do I even have to say 'Apps'?

Thus, we need a better terminology. Developing, supporting and managing software in 'the wild' sounds a lot like the duties of a Ranger: you can put overall plans into motion to nudge the environment in a certain direction, and sometimes you encounter a particular specimen with which you can interact, but essentially you'll have to understand complex environmental dynamics and plan for the long term.

If traditional ISV developers, as well as OSS programmers, are Rangers, then Enterprise and SaaS developers must be Zookeepers. They deal with a controlled environment, can keep an accurate tally, and execute detailed project plans at pre-planned schedules.

OK, I admit that it sounds cooler to be a Ranger than a Zookeeper, but the metaphor makes sense to me :)

As a corollary, we can talk about Wildlife Software versus Zoo Software.

Forces #

The forces working on Wildlife Software versus Zoo Software are very different:

Wildlife Software

  • Advantages: Since you can't control the installation base, you have to make the software robust, well-tested, secure, easy (enough) to install, and documented. It should be well-instrumented and possible to troubleshoot without being the original developer. Once you release a version into the wild, it must be able to stand on its own, or it will die. This is a rather Darwinian environment, but the advantage is that the software that survives tends to have some attributes we often associate with high 'quality'.
  • Disadvantages: Traditionally, producing software with all these 'quality' attributes has been an expensive and slow endeavor. It also leads to conservatism, because every change must be carefully considered. Once released into the wild, a feature or behavior of a piece of software can't be changed (in that version). To wit: Microsoft has traditionally shipped new versions of Windows, Office, Visual Studio, the BCL etc. years apart. The BCL is peppered with sealed or internal classes, much to the chagrin of programmers everywhere.

Zoo Software

  • Advantages: The Product Owners of Zoo Software will expect their programmers to be able to iterate much faster, since the installation base is much smaller and well-known. There are fewer environment permutations to consider - e.g. you may know up front that the software should only have to be able to run on Windows Server 2008 R2 with a SQL Server 2008 R2 database. The entire deployment environment is also well-known, so there are many assumptions you can trust to hold. This indicates that you should be able to produce software in an 'agile' manner. Because the Zoo is a much less dangerous place, the software can be less robust (at least along some axes), which again indicates that it can be produced by a smaller team than corresponding Wildlife Software. This again helps keep costs down.
  • Disadvantages: There are certain quality shortcuts that can be safely made with Zoo Software - e.g. never testing the software on Windows XP if you know it's never going to run on that OS. However, once a team under deadline pressure starts to make warranted shortcuts, it may begin making unwarranted shortcuts as well. Thus, we often experience Zoo Software that is poorly tested, is extremely difficult to deploy, is poorly documented and hard to operate. This, I believe, is why Enterprise Development today has such a negative ring to it.

Now that former ISVs are moving into Zoo Software via SaaS, it's going to be interesting to see what happens in this space.

Conclusion #

Don't jump to conclusions about the advantages of either approach. This article is meant to be descriptive, first and foremost. This means that I'm describing the characteristics of Wildlife and Zoo Software as I most commonly encounter those types of software. I'm fully aware of initiatives such as DevOps, so I'm not saying that software has to be like I describe it - I'm just describing what I'm currently observing.


Comments

Great article! One of the very interesting things to watch will be scalability. Since with SaaS we are paying for each virtual machine (VM) that is running, the service must scale or else we have to pay for another VM to handle the load.
2012-12-18 10:30 UTC
Great article!

It's an interesting dynamic: while ISVs are moving into creating what can be thought of as Zoo Software due to the SaaS environment, the client in a SaaS environment is most often browser-based. That means they remain in the Ranger role on the client side, since the myriad of different browsers and browser versions makes the new client world as wild as, if not wilder than, a traditional desktop application environment.
2012-12-18 12:34 UTC
Given the lingo you introduce, as a Ranger there are many options that make it easy to deliver Zoo-like software: e.g. creating an installer that has launch conditions requiring a particular OS or a particular database version. It's a bit of a misconception that all Rangers shrink-wrap the entire package (with docs, vids, etc). As a Ranger delivering software for what is basically a Zoo, I can opt to have specially trained cowboys that provide training, implementation guidance, or god forbid special customization services (SAP et al). Selling these extra services/consultancy to make up for poor quality is an entire business model in and by itself. I say this because there's more nuance to a Ranger's (don't work in a Zoo so can't judge that) modus operandi. If I do any of these things, am I still a Ranger?
2012-12-19 08:43 UTC

Encapsulation of properties

Tuesday, 27 November 2012 13:53:59 UTC

This post explains why Information Hiding isn't entirely the same thing as Encapsulation.

Despite being an old concept, Encapsulation is one of the most misunderstood principles of object-oriented programming. Particularly for C# and Visual Basic.NET developers, this concept is confusing because those languages have properties. Other languages, including F#, have properties too, but far be it from me to suggest that F# programmers are confused :)

(As an aside, even in languages without formal properties, such as Java, you often see property-like constructs. In Java, for example, a pair of a getter and a setter method is sometimes informally referred to as a property. In the end, that's also how .NET properties are implemented. Thus, the present article can also be applied to Java and other 'property-less' languages.)

In my experience, the two most important aspects of encapsulation are Protection of Invariants and Information Hiding. Earlier, I have had much to say about Protection of Invariants, including the invariants of properties. In this post, I will instead focus on Information Hiding.

With all that confusion (which I will get back to), you would think that Information Hiding is really hard to grasp. It's not - it's really simple, but I think that the name erects an effective psychological barrier to understanding. If instead we were to call it Implementation Hiding, I think most people would immediately grasp what it's all about.

However, since Information Hiding has this misleading name, it becomes really difficult to understand what it means. Does it mean that all information in an object should be hidden from clients? How can we reconcile such a viewpoint with the fundamental concept that object-orientation is about data and behavior? Some people take the misunderstanding so far that they begin to evangelize against properties as a design principle. Granted, too heavy a reliance on properties leads to violations of the Law of Demeter as well as Feature Envy, but without properties, how can a client ever know the state of a system?

Direct field access isn't the solution, as this discloses data to an even larger degree than properties. Still, in the absence of better guidance, the question of Encapsulation often degenerates to the choice between fields and properties. Perhaps the most widely known and accepted .NET design guideline is that data should be exposed via properties. This again leads to the redundant 'Automatic Property' language feature.

Tell me again: how is this

public string Name { get; set; }

better than this?

public string Name;

The Design Guidelines for Developing Class Libraries isn't wrong. It's just important to understand why properties are preferred over fields, and it only partially has to do with Encapsulation. The real purpose of this guideline is to enable versioning of types. In other words: it's about backwards compatibility.
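
A small sketch may make the versioning argument concrete. The `Customer` class is hypothetical and exists only for this illustration. Assume version 1 shipped `Name` as an automatic property; version 2 can then add a guard behind the same public signature:

```csharp
using System;

// Hypothetical class, for illustration only.
// Assume version 1 declared this member as:
//     public string Name { get; set; }
public class Customer
{
    private string name = "";

    // Version 2: same public signature as before,
    // but the setter now rejects null.
    public string Name
    {
        get { return this.name; }
        set
        {
            if (value == null)
                throw new ArgumentNullException("value");
            this.name = value;
        }
    }
}
```

Clients compiled against version 1 call the property's getter and setter methods, so they should keep binding to version 2 without being recompiled. Had `Name` been a public field, turning it into a property later would be a binary breaking change, because field access and property access compile to different instructions. Keep in mind that the added guard is still a change in behavior: a client that used to assign null will now get an exception.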

An example demonstrating how properties enable code to evolve while maintaining backwards compatibility is in order. This example also demonstrates Implementation Hiding.

Example: a tennis game #

The Tennis kata is one of my favorite TDD katas. Previously, I posted my own, very specific take on it, but I've also, when teaching, asked groups to do it as an exercise. Often participants arrive at a solution not too far removed from this example:

public class Game
{
    private int player1Score;
    private int player2Score;
 
    public void PointTo(Player player)
    {
        if (player == Player.One)
            if (this.player1Score >= 30)
                this.player1Score += 10;
            else
                this.player1Score += 15;
        else
            if (this.player2Score >= 30)
                this.player2Score += 10;
            else
                this.player2Score += 15;
    }
 
    public string Score
    {
        get
        {
            if (this.player1Score == this.player2Score &&
                this.player1Score >= 40)
                return "Deuce";
            if (this.player1Score > 40 &&
                this.player1Score == this.player2Score + 10)
                return "AdvantagePlayerOne";
            if (this.player2Score > 40 &&
                this.player2Score == this.player1Score + 10)
                return "AdvantagePlayerTwo";
            if (this.player1Score > 40 &&
                this.player1Score >= this.player2Score + 20)
                return "GamePlayerOne";
            if (this.player2Score > 40)
                return "GamePlayerTwo";
            var score1Word = ToWord(this.player1Score);
            var score2Word = ToWord(this.player2Score);
            if (score1Word == score2Word)
                return score1Word + "All";
            return score1Word + score2Word;
        }
    }
 
    private string ToWord(int score)
    {
        switch (score)
        {
            case 0:
                return "Love";
            case 15:
                return "Fifteen";
            case 30:
                return "Thirty";
            case 40:
                return "Forty";
            default:
                throw new ArgumentException(
                    "Unexpected score value.",
                    "score");
        }
    }
}

Granted: there's a lot more going on here than just a single property, but I wanted to provide an example of enough complexity to demonstrate why Information Hiding is an important design principle. First of all, this Game class tips its hat to that other principle of Encapsulation by protecting its invariants. Notice that while the Score property is public, it's read-only. It wouldn't make much sense if the Game class allowed an external caller to assign a value to the property.

When it comes to Information Hiding, the Game class hides the implementation details by exposing a single Score property, but internally storing the state as two integers. It doesn't hide the state of the game, which is readily available via the Score property.

The benefit of hiding the data and exposing the state as a property is that it enables you to vary the implementation independently from the public API. As a simple example, you may realize that you don't really need to keep the score in terms of 0, 15, 30, 40, etc. Actually, the scoring rules of a tennis game are very simple: in order to win, a player must win at least four points with at least two more points than the opponent. Once you realize this, you may decide to change the implementation to this:

public class Game
{
    private int player1Score;
    private int player2Score;
 
    public void PointTo(Player player)
    {
        if (player == Player.One)
            this.player1Score += 1;
        else
            this.player2Score += 1;
    }
 
    public string Score
    {
        get
        {
            if (this.player1Score == this.player2Score &&
                this.player1Score >= 3)
                return "Deuce";
            if (this.player1Score > 3 &&
                this.player1Score == this.player2Score + 1)
                return "AdvantagePlayerOne";
            if (this.player2Score > 3 &&
                this.player2Score == this.player1Score + 1)
                return "AdvantagePlayerTwo";
            if (this.player1Score > 3 &&
                this.player1Score >= this.player2Score + 2)
                return "GamePlayerOne";
            if (this.player2Score > 3)
                return "GamePlayerTwo";
            var score1Word = ToWord(this.player1Score);
            var score2Word = ToWord(this.player2Score);
            if (score1Word == score2Word)
                return score1Word + "All";
            return score1Word + score2Word;
        }
    }
 
    private string ToWord(int score)
    {
        switch (score)
        {
            case 0:
                return "Love";
            case 1:
                return "Fifteen";
            case 2:
                return "Thirty";
            case 3:
                return "Forty";
            default:
                throw new ArgumentException(
                    "Unexpected score value.",
                    "score");
        }
    }
}

This is a true Refactoring because it modifies (actually simplifies) the internal implementation without changing the external API one bit. Good thing those integer fields were never publicly exposed.

The tennis game scoring system is actually a finite state machine and from Design Patterns we know that this can be effectively implemented using the State pattern. Thus, a further refactoring could be to implement the Game class with a set of private or internal state classes. However, I will leave this as an exercise for the reader :)

The progress towards the final score often seems to be an important aspect in sports reporting, so another possible future development of the Game class might be to enable it to not only report on its current state (as the Score property does), but also report on how it arrived at that state. This might prompt you to store the sequence of points as they were scored, and calculate past and current state based on the history of points. This would cause you to change the internal storage from two integers to a sequence of points scored.
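As a rough sketch of that idea, the Game class could keep a list of the points scored and derive the score from it. The Player enum and the ScoreHistory property are assumptions made for illustration; they are not part of the original sample. The scoring rules are copied from the refactored Game class above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumption: Player is a simple enum with two members.
public enum Player { One, Two }

public class Game
{
    // Internal storage is now the sequence of points, not two integers.
    private readonly List<Player> points = new List<Player>();

    public void PointTo(Player player)
    {
        this.points.Add(player);
    }

    // Unchanged public API: the current score as a string.
    public string Score
    {
        get { return this.ScoreAfter(this.points.Count); }
    }

    // Hypothetical addition: the score as it stood after each point.
    public IEnumerable<string> ScoreHistory
    {
        get
        {
            return Enumerable
                .Range(1, this.points.Count)
                .Select(n => this.ScoreAfter(n))
                .ToList();
        }
    }

    // Computes the score from the first pointCount points.
    private string ScoreAfter(int pointCount)
    {
        var played = this.points.Take(pointCount).ToList();
        var score1 = played.Count(p => p == Player.One);
        var score2 = played.Count(p => p == Player.Two);

        if (score1 == score2 && score1 >= 3)
            return "Deuce";
        if (score1 > 3 && score1 == score2 + 1)
            return "AdvantagePlayerOne";
        if (score2 > 3 && score2 == score1 + 1)
            return "AdvantagePlayerTwo";
        if (score1 > 3 && score1 >= score2 + 2)
            return "GamePlayerOne";
        if (score2 > 3)
            return "GamePlayerTwo";
        var score1Word = ToWord(score1);
        var score2Word = ToWord(score2);
        if (score1Word == score2Word)
            return score1Word + "All";
        return score1Word + score2Word;
    }

    private static string ToWord(int score)
    {
        switch (score)
        {
            case 0: return "Love";
            case 1: return "Fifteen";
            case 2: return "Thirty";
            default: return "Forty";
        }
    }
}
```

Existing callers that only use PointTo and Score would not notice the change; only the new ScoreHistory member extends the public API.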

In both the above refactoring cases, you would be able to make the desired changes to the Game class because the implementation details are hidden.

Conclusion #

Information Hiding isn't Data Hiding. Implementation Hiding would be a much better word. Properties expose data about the object while hiding the implementation details. This enables you to vary the implementation and the public API independently. Thus, the important benefit from Information Hiding is that it enables you to evolve your API in a backwards compatible fashion. It doesn't mean that you shouldn't expose any data at all, but since properties are just as much members of a class' public API as its methods, you must exercise the same care when designing properties as when designing methods.

Update (2012.12.02): A reader correctly pointed out to me that there was a bug in the Game class. This bug has now been corrected.


Comments

I have read through several of your posts on encapsulation and information hiding and have also noticed throughout several discussions that information hiding is an often misunderstood/underestimated concept. Personally I think a lot of problems arise from ambiguous definitions, rather than the name "information hiding". A particularly enlightening source for me was browsing through all the different definitions which have been formulated over the years, as compiled by Edward V. Berard.

http://www.itmweb.com/essay550.htm

In his conclusion he makes the following uncommon distinction: "Abstraction, information hiding, and encapsulation are very different, but highly-related, concepts. One could argue that abstraction is a technique that helps us identify which specific information should be visible, and which information should be hidden. Encapsulation is then the technique for packaging the information in such a way as to hide what should be hidden, and make visible what is intended to be visible."

Within a theoretical discussion, I agree with his conclusion that "a stronger argument can be made for keeping the concepts, and thus the terms, distinct". A key thing to keep in mind here is that, according to this definition, encapsulating something doesn't necessarily mean it is hidden.

When you refer to "encapsulation" you seem to refer to e.g. Rumbaugh's definition: "Encapsulation (also information hiding) consists of separating the external aspects of an object which are accessible to other objects, from the internal implementation details of the object, which are hidden from other objects." who ironically doesn't seem to make a distinction between information hiding and encapsulation, highlighting the problem even more. This also corresponds to the first listed definition on the wikipedia article on encapsulation you link to, but not the second.

1. A language mechanism for restricting access to some of the object's components.

2. A language construct that facilitates the bundling of data with the methods (or other functions) operating on that data.

When hiding information we almost always (always?) encapsulate something, which is probably why many see them as the same thing.

In a practical setting where we are just quickly trying to convey ideas things get even more problematic. When I refer to "encapsulation" I usually refer to information hiding, because that is generally my intention when encapsulating something. Interpreting it the other way around where encapsulation is seen solely as an OOP principle where state is placed within the context of a class with restricted access is more problematic when no distinction is made with information hiding. The many other ways in which information can be hidden are neglected.

Keeping this warning in mind on the ambiguous definitions of encapsulation and information hiding, I wonder how you feel about my blog posts where I discuss information hiding beyond the scope of the class. To me this is a useful thing to do, for all the same reasons as information hiding in general is a good thing to do.

In "Improved encapsulation using lambdas" I discuss how a piece of reusable code can be encapsulated within the scope of one function by using lambdas.

http://whathecode.wordpress.com/2011/06/05/improved-encapsulation-using-lambdas/

In "Beyond private accessibility" I discuss how variables used only within one function could potentially be hidden from other functions in a class.

http://whathecode.wordpress.com/2011/06/13/beyond-private-accessibility/

I'm glad I subscribed to this blog, you seem to talk about a lot of topics I'm truly interested in. :)
2012-11-27 21:56 UTC
And speaking about encapsulation, I think that the score comparing logic should be ENCAPSULATED into its own PlayerScore class, something like [ https://gist.github.com/4159602 ](no link as the blog errors if i'm using html tags). So you can change anytime how the score is calculated and how many score points a tennis point really is. Also, it's testable and the Game object just manage where the points go, instead of also implementing scoring rules.
2012-11-28 07:33 UTC
Steven, thank you for your thoughtful comment. There are definitely many ways to think about encapsulation, and I don't think I have the only true definition. Basically I'm just trying to make sense of a lot of things that I know, from experience and intuition, are correct.

The main purpose of this post is to drive a big stake through the notion that properties = Encapsulation.

The lambda trick you describe is an additional way to scope a method - another is to realize that interfaces can also be used for scoping.
2012-11-28 07:46 UTC
"The main purpose of this post is to drive a big stake through the notion that properties = Encapsulation."

While I agree 100% with the intent of that statement, I just wanted to warn you that according to some definitions, a property encapsulates a backing field, a getter, and a setter. Whether or not that getter and setter hide anything or are protected, is assumed by some to be an entirely different concept, namely information hiding.

However using the 'mainstream' OOP definition of encapsulation you are absolutely right that a property with a public getter and setter (without any implementation logic inside) doesn't encapsulate anything as no access restrictions are imposed.

And as per my cause, neither does moving something to the scope of the class with access restrictions provide the best possible information hiding in OOP. You are only hiding complexity from fellow programmers who will consume the class, but not for those who have to maintain it (or extend from it in the case of protected members).
2012-11-28 09:39 UTC
@Mike

You are probably right his example can be further encapsulated, but if you are taking the effort of encapsulating something I would at least try to make it reusable instead of making a specific 'PlayerScore' class.

A score is basically a range of values which is a construct which is missing from C# and Java, but e.g. Ruby has. From experience I can tell you implementing it yourself is truly useful as you start seeing opportunities to use it everywhere: https://github.com/Whathecode/Framework-Class-Library-Extension/blob/master/Whathecode.System/Arithmetic/Range/Interval.cs

Probably I would create a generic or abstract 'Score' class which is composed of a range and possibly some additional logic (Reset(), events, ...)

As to moving the logic of the game (the score comparison) to the 'PlayerScore' class, I don't think I entirely agree. This is the concern of the 'Game' class and not the 'Score'. When separating these concerns one could use the 'Score' class in different types of games as well.
2012-11-28 10:00 UTC
@Steven Once in a while I use delegates, mainly Funcs to encapsulate utility functions inside private functions to get that additional encapsulation-level.

However I feel that it's just not a very good fit in C# compared to functional languages like F# or Clojure where functions are truly first class and defining a function inside another function is fully supported with the same syntax as regular functions. What I would really like is if C# would simply let you define functions inside functions with the usual syntax.

Regarding performance, I don't know how C# does it, but I imagine it's similar to F#, where the performance hit of an inner function is negligible: http://stackoverflow.com/questions/7920234/what-are-the-performance-side-effects-of-defining-functions-inside-a-recursive-f

Maybe it's better to encapsulate the private func as a public func in a nested private (static?) class and then have the inner function as a private function of that class.
2012-11-29 05:59 UTC

AppSettings convention for Castle Windsor

Wednesday, 07 November 2012 16:54:43 UTC

This post describes a generalized convention for Castle Windsor that handles AppSettings primitives.

In my previous post I explained how Convention over Configuration is the preferred way to use a DI Container. Some readers asked to see some actual convention implementations (although I actually linked to them in the post). In fact, I've previously showcased some simple conventions expressed with Castle Windsor's API. In this post I'm going to show you another convention, which is completely reusable. Feel free to copy and paste :)

Most conventions are really easy to implement. Actually, sometimes it takes more effort to express the specification than it actually takes to implement it.

This convention deals with Primitive Dependencies. In my original post on the topic I included an AppSettingsConvention class as part of the code listing, but that implementation was hard-coded to only deal with integers. This narrow convention can be generalized:

The AppSettingsConvention should map AppSettings .config values into Primitive Dependencies.

  • If a class has a dependency, the name of the dependency is assumed to be the name of the constructor argument (or property, for that matter). If, for example, the name of a constructor argument is top, this is the name of the dependency.
  • If there's an appSettings key with the same name in the .config, and if there's a known conversion from string to the type of the dependency, the .config value is converted and used.

Example requirement: int top #

Consider this constructor:

public DbChartReader(int top, string chartConnectionString)

In this case the convention should look for an AppSettings key named top as well as check whether there's a known conversion from string to int (there is). Imagine that the .config file contains this XML fragment:

<appSettings>
  <add key="top" value="40" />
</appSettings>

The convention should read "40" from the .config file and convert it to an integer and inject 40 into a DbChartReader instance.

Example requirement: Uri catalogTrackBaseUri #

Consider this constructor:

public CatalogApiTrackLinkFactory(Uri catalogTrackBaseUri)

In this case the convention should look for an AppSettings key named catalogTrackBaseUri and check if there's a known conversion from string to Uri. Imagine that the .config file contains this XML fragment:

<appSettings>
  <add key="catalogTrackBaseUri" value="http://www.ploeh.dk/foo/img/"/>
  <add key="foo" value="bar"/>
  <add key="baz" value="42"/>
</appSettings>

The convention should read "http://www.ploeh.dk/foo/img/" from the .config file and convert it to a Uri instance.
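Both example requirements hinge on a "known conversion from string". As a minimal sketch, assuming the conversion is done with TypeDescriptor (as in the implementation in this post), the two cases can be tried in isolation. The ConversionDemo class and its method names are made up for this illustration.

```csharp
using System;
using System.ComponentModel;

// Hypothetical helper, only meant to show the conversion step on its own.
public static class ConversionDemo
{
    // True when the type's TypeConverter accepts a string as input.
    public static bool CanConvertFromString(Type targetType)
    {
        return TypeDescriptor
            .GetConverter(targetType)
            .CanConvertFrom(typeof(string));
    }

    // Converts the raw .config string into an instance of targetType.
    public static object ConvertFromString(Type targetType, string value)
    {
        return TypeDescriptor
            .GetConverter(targetType)
            .ConvertFrom(value);
    }
}
```

Calling ConversionDemo.ConvertFromString(typeof(int), "40") should produce the integer 40, and passing typeof(Uri) with "http://www.ploeh.dk/foo/img/" should produce a Uri instance, since the framework has built-in converters for both types.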

Implementation #

By now it should be clear what the conventions should do. With Castle Windsor this is easily done by implementing an ISubDependencyResolver. Each method is a one-liner:

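using System.ComponentModel;
using System.Configuration;
using System.Linq;
using Castle.Core;
using Castle.MicroKernel;
using Castle.MicroKernel.Context;

public class AppSettingsConvention : ISubDependencyResolver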
{
    public bool CanResolve(
        CreationContext context,
        ISubDependencyResolver contextHandlerResolver,
        ComponentModel model,
        DependencyModel dependency)
    {
        return ConfigurationManager.AppSettings.AllKeys
                .Contains(dependency.DependencyKey)
            && TypeDescriptor
                .GetConverter(dependency.TargetType)
                .CanConvertFrom(typeof(string));
    }
 
    public object Resolve(
        CreationContext context,
        ISubDependencyResolver contextHandlerResolver,
        ComponentModel model,
        DependencyModel dependency)
    {
        return TypeDescriptor
            .GetConverter(dependency.TargetType)
            .ConvertFrom(
                ConfigurationManager.AppSettings[dependency.DependencyKey]);
    }
}

The ISubDependencyResolver interface is an example of the Tester-Doer pattern. Only if the CanResolve method returns true is the Resolve method invoked.

The CanResolve method performs two checks:

  • Is there an AppSettings key in the configuration which is equal to the name of the dependency?
  • Is there a known conversion from string to the type of the dependency?

If both answers are true, then the CanResolve method returns true.

The Resolve method simply reads the .config value and converts it to the appropriate type and returns it.
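To make the Tester-Doer protocol concrete without a container, here is a small sketch in which a dictionary stands in for the appSettings section. SettingsResolver and Caller are hypothetical names for this illustration; they are not Castle Windsor types, and the way Windsor itself invokes resolvers may differ in detail.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

// Hypothetical stand-in for the convention; a dictionary replaces appSettings.
public class SettingsResolver
{
    private readonly IDictionary<string, string> settings;

    public SettingsResolver(IDictionary<string, string> settings)
    {
        this.settings = settings;
    }

    // Tester: is there a matching key, and a conversion from string?
    public bool CanResolve(string name, Type targetType)
    {
        return this.settings.ContainsKey(name)
            && TypeDescriptor
                .GetConverter(targetType)
                .CanConvertFrom(typeof(string));
    }

    // Doer: only safe to call after CanResolve returned true.
    public object Resolve(string name, Type targetType)
    {
        return TypeDescriptor
            .GetConverter(targetType)
            .ConvertFrom(this.settings[name]);
    }
}

public static class Caller
{
    // The calling side asks first and only then resolves.
    public static bool TryResolve(
        SettingsResolver resolver,
        string name,
        Type targetType,
        out object value)
    {
        if (resolver.CanResolve(name, targetType))
        {
            value = resolver.Resolve(name, targetType);
            return true;
        }
        value = null;
        return false;
    }
}
```

With a dictionary containing the key top and the value "40", TryResolve should succeed for typeof(int) and fail for a name that has no entry, in which case Resolve is never called.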

Adding the convention to an IWindsorContainer instance is easy:

container.Kernel.Resolver.AddSubResolver(
    new AppSettingsConvention());            

Summary #

The AppSettingsConvention is a completely reusable convention for Castle Windsor. With it, Primitive Dependencies are automatically wired with the appropriate .config values if they are defined.


Comments

Actually IComponentModelContributor would be an even better place to put the logic than ISDR.
- it would handle all the type conversion for you
- the approach, since the dependency is set up as part of the ComponentModel is statically analysable, whereas ISDR works dynamically so your components that depend on values from config file would show up as "Potentially misconfigured components".
2012-11-07 21:19 UTC
Krzysztof, if I try to implement an interface called "IComponentModelContributor" IntelliSense gives me nothing. Where is that interface defined? (I'm drawing a blank on Google too...)
2012-11-07 21:29 UTC
This thing: http://docs.castleproject.org/Windsor.ComponentModel-construction-contributors.ashx
2012-11-07 21:31 UTC
How would you implement the above convention with that interface?
2012-11-07 21:49 UTC
I guess that means a blogpost :)
2012-11-07 21:51 UTC

When to use a DI Container

Tuesday, 06 November 2012 11:42:06 UTC

This post explains why a DI Container is useful with Convention over Configuration while Poor Man's DI might be a better fit for a more explicit Composition Root.

Note (2018-07-18): Since I wrote this article, I've retired the term Poor Man's DI in favour of Pure DI.

It seems to me that lately there's been a backlash against DI Containers among alpha geeks. Many of the software leaders that I myself learn from seem to dismiss the entire concept of a DI Container, claiming that it's too complex, too 'magical', that it isn't a good architectural pattern, or that the derived value doesn't warrant the 'cost' (most, if not all, DI Containers are open source, so they are free in a monetary sense, but there's always a cost in learning curve etc.).

This must have caused Krzysztof Koźmic to write a nice article about what sort of problem a DI Container solves. I agree with the article, but want to provide a different perspective here.

In short, it makes sense to me to illustrate the tradeoffs of Poor Man's DI versus DI Containers in a diagram like this:

[Diagram: usefulness vs. sophistication]

The point of the diagram is that Poor Man's DI can be valuable because it's simple, while a DI Container can be either valuable or pointless depending on how it's used. However, when used in a sufficiently sophisticated way I consider a DI Container to offer the best value/cost ratio. When people criticize DI Containers as being pointless I suspect that what really happened was that they gave up before they were out of the Trough of Disillusionment. Had they continued to learn, they might have arrived at a new Plateau of Productivity.

DI style: advantages and disadvantages

Poor Man's DI
  • Advantages: easy to learn, strongly typed
  • Disadvantages: high maintenance
Explicit Register
  • Advantages: none listed
  • Disadvantages: weakly typed
Convention over Configuration
  • Advantages: low maintenance
  • Disadvantages: hard to learn, weakly typed

There are other, less important advantages and disadvantages of each approach, but here I'm focusing on three main axes that I consider important:

  • How easy is it to understand and learn?
  • How soon will you get feedback if something is not right?
  • How easy is it to maintain?

The major advantage of Poor Man's DI is that it's easy to learn. You don't have to learn the API of any DI Container (Unity, Autofac, Ninject, StructureMap, Castle Windsor, etc.) and while individual classes still use DI, once you find the Composition Root it'll be evident what's going on and how object graphs are constructed. No 'magic' is involved.

The second big advantage of Poor Man's DI is often overlooked: it's strongly typed. This is an advantage because it provides the fastest feedback about correctness that you can get. However, strong typing cuts both ways because it also means that every time you refactor a constructor, you will break the Composition Root. If you are sharing a library (Domain Model, Utility, Data Access component, etc.) between more than one application (unit of deployment), you may have more than one Composition Root to maintain. How much of a burden this is depends on how often you refactor constructors, but I've seen projects where this happens several times each day (keep in mind that constructors are implementation details).

If you use a DI Container, but explicitly Register each and every component using the container's API, you lose the rapid feedback from strong typing. On the other hand, the maintenance burden is also likely to drop because of Auto-wiring. Still, you'll need to register each new class or interface when you introduce them, and you (and your team) still have to learn the specific API of that container. In my opinion, you lose more advantages than you gain.

Ultimately, if you can wield a DI Container in a sufficiently sophisticated way, you can use it to define a set of conventions. These conventions define a rule set that your code should adhere to, and as long as you stick to those rules, things just work. The container drops to the background, and you rarely need to touch it. Yes, this is hard to learn, and is still weakly typed, but if done right, it enables you to focus on code that adds value instead of infrastructure. An additional advantage is that it creates a positive feedback mechanism forcing a team to produce code that is consistent with the conventions.

Example: Poor Man's DI #

The following example is part of my Booking sample application. It shows the state of the Ploeh.Samples.Booking.Daemon.Program class as it looks in the git tag total-complexity (git commit ID 64b7b670fff9560d8947dd133ae54779d867a451).

var queueDirectory = 
    new DirectoryInfo(@"..\..\..\BookingWebUI\Queue").CreateIfAbsent();
var singleSourceOfTruthDirectory = 
    new DirectoryInfo(@"..\..\..\BookingWebUI\SSoT").CreateIfAbsent();
var viewStoreDirectory = 
    new DirectoryInfo(@"..\..\..\BookingWebUI\ViewStore").CreateIfAbsent();
 
var extension = "txt";
 
var fileDateStore = new FileDateStore(
    singleSourceOfTruthDirectory,
    extension);
 
var quickenings = new IQuickening[]
{
    new RequestReservationCommand.Quickening(),
    new ReservationAcceptedEvent.Quickening(),
    new ReservationRejectedEvent.Quickening(),
    new CapacityReservedEvent.Quickening(),
    new SoldOutEvent.Quickening()
};
 
var disposable = new CompositeDisposable();
var messageDispatcher = new Subject<object>();
disposable.Add(
    messageDispatcher.Subscribe(
        new Dispatcher<RequestReservationCommand>(
            new CapacityGate(
                new JsonCapacityRepository(
                    fileDateStore,
                    fileDateStore,
                    quickenings),
                new JsonChannel<ReservationAcceptedEvent>(
                    new FileQueueWriter<ReservationAcceptedEvent>(
                        queueDirectory,
                        extension)),
                new JsonChannel<ReservationRejectedEvent>(
                    new FileQueueWriter<ReservationRejectedEvent>(
                        queueDirectory,
                        extension)),
                new JsonChannel<SoldOutEvent>(
                    new FileQueueWriter<SoldOutEvent>(
                        queueDirectory,
                        extension))))));
disposable.Add(
    messageDispatcher.Subscribe(
        new Dispatcher<SoldOutEvent>(
            new MonthViewUpdater(
                new FileMonthViewStore(
                    viewStoreDirectory,
                    extension)))));
 
var q = new QueueConsumer(
    new FileQueue(
        queueDirectory,
        extension),
    new JsonStreamObserver(
        quickenings,
        messageDispatcher));
 
RunUntilStopped(q);

Yes, that's a lot of code. I deliberately chose a non-trivial example to highlight just how much stuff there might be. You don't have to read and understand all of this code to appreciate that it might require a bit of maintenance. It's a big object graph, with some shared subgraphs, and since it uses the new keyword to create all the objects, every time you change a constructor signature, you'll need to update this code, because it's not going to compile until you do.

Still, there's no 'magical' tool (read: DI Container) involved, so it's pretty easy to understand what's going on here. As Dan North put it when I once saw him endorse this technique: 'new' is the new 'new' :) Once you see how Explicit Register looks, you may appreciate why.

Example: Explicit Register #

The following example performs exactly the same work as the previous example, but now in a state (git tag: controllers-by-convention; commit ID: 13fc576b729cdddd5ec53f1db907ec0a7d00836b) where it's being wired by Castle Windsor. The name of this class is DaemonWindsorInstaller, and all components are explicitly registered. Hang on to something.

container.Register(Component
    .For<DirectoryInfo>()
    .UsingFactoryMethod(() =>
        new DirectoryInfo(@"..\..\..\BookingWebUI\Queue").CreateIfAbsent())
    .Named("queueDirectory"));
container.Register(Component
    .For<DirectoryInfo>()
    .UsingFactoryMethod(() =>
        new DirectoryInfo(@"..\..\..\BookingWebUI\SSoT").CreateIfAbsent())
    .Named("ssotDirectory"));
container.Register(Component
    .For<DirectoryInfo>()
    .UsingFactoryMethod(() =>
        new DirectoryInfo(@"..\..\..\BookingWebUI\ViewStore").CreateIfAbsent())
    .Named("viewStoreDirectory"));            
 
container.Register(Component
    .For<IQueue>()
    .ImplementedBy<FileQueue>()
    .DependsOn(
        Dependency.OnComponent("directory", "queueDirectory"),
        Dependency.OnValue("extension", "txt")));
 
container.Register(Component
    .For<IStoreWriter<DateTime>, IStoreReader<DateTime>>()
    .ImplementedBy<FileDateStore>()
    .DependsOn(
        Dependency.OnComponent("directory", "ssotDirectory"),
        Dependency.OnValue("extension", "txt")));
container.Register(Component
    .For<IStoreWriter<ReservationAcceptedEvent>>()
    .ImplementedBy<FileQueueWriter<ReservationAcceptedEvent>>()
    .DependsOn(
        Dependency.OnComponent("directory", "queueDirectory"),
        Dependency.OnValue("extension", "txt")));
container.Register(Component
    .For<IStoreWriter<ReservationRejectedEvent>>()
    .ImplementedBy<FileQueueWriter<ReservationRejectedEvent>>()
    .DependsOn(
        Dependency.OnComponent("directory", "queueDirectory"),
        Dependency.OnValue("extension", "txt")));
container.Register(Component
    .For<IStoreWriter<SoldOutEvent>>()
    .ImplementedBy<FileQueueWriter<SoldOutEvent>>()
    .DependsOn(
        Dependency.OnComponent("directory", "queueDirectory"),
        Dependency.OnValue("extension", "txt")));
 
container.Register(Component
    .For<IChannel<ReservationAcceptedEvent>>()
    .ImplementedBy<JsonChannel<ReservationAcceptedEvent>>());
container.Register(Component
    .For<IChannel<ReservationRejectedEvent>>()
    .ImplementedBy<JsonChannel<ReservationRejectedEvent>>());
container.Register(Component
    .For<IChannel<SoldOutEvent>>()
    .ImplementedBy<JsonChannel<SoldOutEvent>>());
 
container.Register(Component
    .For<ICapacityRepository>()
    .ImplementedBy<JsonCapacityRepository>());
 
container.Register(Component
    .For<IConsumer<RequestReservationCommand>>()
    .ImplementedBy<CapacityGate>());
container.Register(Component
    .For<IConsumer<SoldOutEvent>>()
    .ImplementedBy<MonthViewUpdater>());
 
container.Register(Component
    .For<Dispatcher<RequestReservationCommand>>());
container.Register(Component
    .For<Dispatcher<SoldOutEvent>>());
 
container.Register(Component
    .For<IObserver<Stream>>()
    .ImplementedBy<JsonStreamObserver>());
container.Register(Component
    .For<IObserver<DateTime>>()
    .ImplementedBy<FileMonthViewStore>()
    .DependsOn(
        Dependency.OnComponent("directory", "viewStoreDirectory"),
        Dependency.OnValue("extension", "txt")));
container.Register(Component
    .For<IObserver<object>>()
    .UsingFactoryMethod(k =>
    {
        var messageDispatcher = new Subject<object>();
        messageDispatcher.Subscribe(k.Resolve<Dispatcher<RequestReservationCommand>>());
        messageDispatcher.Subscribe(k.Resolve<Dispatcher<SoldOutEvent>>());
        return messageDispatcher;
    }));
 
container.Register(Component
    .For<IQuickening>()
    .ImplementedBy<RequestReservationCommand.Quickening>());
container.Register(Component
    .For<IQuickening>()
    .ImplementedBy<ReservationAcceptedEvent.Quickening>());
container.Register(Component
    .For<IQuickening>()
    .ImplementedBy<ReservationRejectedEvent.Quickening>());
container.Register(Component
    .For<IQuickening>()
    .ImplementedBy<CapacityReservedEvent.Quickening>());
container.Register(Component
    .For<IQuickening>()
    .ImplementedBy<SoldOutEvent.Quickening>());
 
container.Register(Component
    .For<QueueConsumer>());
 
container.Kernel.Resolver.AddSubResolver(new CollectionResolver(container.Kernel));

This is actually more verbose than before - almost double the size of the Poor Man's DI example. To add spite to injury, this is no longer strongly typed in the sense that you'll no longer get any compiler errors if you change something, but a change to your classes can easily lead to a runtime exception, since something may not be correctly configured.
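The loss of compile-time feedback can be illustrated with a deliberately tiny container. This is a toy written for this explanation, not Castle Windsor's API, and all the type names (ToyContainer, IClock, SystemClock, Greeter) are hypothetical. The point is only that a missing registration compiles fine and surfaces when the graph is resolved.

```csharp
using System;
using System.Collections.Generic;

// Toy container for illustration only; real containers do far more.
public class ToyContainer
{
    private readonly Dictionary<Type, Func<ToyContainer, object>> registrations =
        new Dictionary<Type, Func<ToyContainer, object>>();

    public void Register<TService>(Func<ToyContainer, TService> factory)
        where TService : class
    {
        this.registrations[typeof(TService)] = c => factory(c);
    }

    public TService Resolve<TService>() where TService : class
    {
        Func<ToyContainer, object> factory;
        if (!this.registrations.TryGetValue(typeof(TService), out factory))
            throw new InvalidOperationException(
                "No registration for " + typeof(TService).Name);
        return (TService)factory(this);
    }
}

// Hypothetical services used to build a two-node object graph.
public interface IClock { }

public class SystemClock : IClock { }

public class Greeter
{
    public readonly IClock Clock;

    public Greeter(IClock clock)
    {
        this.Clock = clock;
    }
}
```

If only Greeter is registered, with a factory that asks the container for IClock, the code compiles, but Resolve<Greeter>() throws an InvalidOperationException at run time. With Poor Man's DI the equivalent mistake, new Greeter() without an argument, would not compile at all.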

This example uses the Registration API of Castle Windsor, but imagine the horror if you were to use XML configuration instead.

Other DI Containers have similar Registration APIs (apart from those that only support XML), so this problem isn't isolated to Castle Windsor only. It's inherent in the Explicit Register style.

I can't claim to be an expert in Java, but from all I've ever heard and seen of DI Containers in Java (Spring, Guice, Pico), they don't seem to have Registration APIs much more sophisticated than that. In fact, many of them still seem to be heavily focused on XML Registration. If that's the case, it's no wonder many software thought leaders (like Dan North with his 'new' is the new 'new' line) dismiss DI Containers as being essentially pointless. If there weren't a more sophisticated option, I would tend to agree.

Example: Convention over Configuration #

This is still the same example as before, but now in a state (git tag: services-by-convention-in-daemon; git commit ID: 0a7e6f246cacdbefc8f6933fc84b024774d02038) where almost the entire configuration is done by convention.

container.AddFacility<ConsumerConvention>();
 
container.Register(Component
    .For<IObserver<object>>()
    .ImplementedBy<CompositeObserver<object>>());
 
container.Register(Classes
    .FromAssemblyInDirectory(new AssemblyFilter(".").FilterByName(an => an.Name.StartsWith("Ploeh.Samples.Booking")))
    .Where(t => !(t.IsGenericType && t.GetGenericTypeDefinition() == typeof(Dispatcher<>)))
    .WithServiceAllInterfaces());
 
container.Kernel.Resolver.AddSubResolver(new ExtensionConvention());
container.Kernel.Resolver.AddSubResolver(new DirectoryConvention(container.Kernel));
container.Kernel.Resolver.AddSubResolver(new CollectionResolver(container.Kernel));
 
#region Manual configuration that requires maintenance
container.Register(Component
    .For<DirectoryInfo>()
    .UsingFactoryMethod(() =>
        new DirectoryInfo(@"..\..\..\BookingWebUI\Queue").CreateIfAbsent())
    .Named("queueDirectory"));
container.Register(Component
    .For<DirectoryInfo>()
    .UsingFactoryMethod(() =>
        new DirectoryInfo(@"..\..\..\BookingWebUI\SSoT").CreateIfAbsent())
    .Named("ssotDirectory"));
container.Register(Component
    .For<DirectoryInfo>()
    .UsingFactoryMethod(() =>
        new DirectoryInfo(@"..\..\..\BookingWebUI\ViewStore").CreateIfAbsent())
    .Named("viewStoreDirectory"));
#endregion

It's pretty clear that this is a lot less verbose - and then I even left three explicit Register statements as a deliberate decision. Just because you decide to use Convention over Configuration doesn't mean that you have to stick to this principle 100 %.

Compared to the previous example, this requires a lot less maintenance. While you are working with this code base, most of the time you can concentrate on adding new functionality to the software, and the conventions are just going to pick up your changes and new classes and interfaces. Personally, this is where I find the best tradeoff between the value provided by a DI Container versus the cost of figuring out how to implement the conventions. You should also keep in mind that once you've learned to use a particular DI Container like this, the cost goes down.
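To give a feel for why a convention picks up new classes without further registration code, here is a toy sketch of the general idea: scan a set of types and map every interface to the concrete class that implements it. It is only loosely comparable to the WithServiceAllInterfaces call above, it is not how Castle Windsor is implemented, and all names in it are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ToyConvention
{
    // Maps each interface to a concrete class that implements it.
    // If several classes implement the same interface, the last one wins.
    public static IDictionary<Type, Type> MapInterfacesToClasses(
        IEnumerable<Type> candidates)
    {
        var map = new Dictionary<Type, Type>();
        foreach (var type in candidates.Where(t => t.IsClass && !t.IsAbstract))
        {
            foreach (var service in type.GetInterfaces())
            {
                map[service] = type;
            }
        }
        return map;
    }
}

// Hypothetical application types; adding a new pair needs no new wiring code.
public interface IReservationRepository { }

public class FileReservationRepository : IReservationRepository { }

public interface INotifier { }

public class EmailNotifier : INotifier { }
```

In an application, the candidates would typically come from something like typeof(EmailNotifier).Assembly.GetTypes(), so a class added later is mapped the next time the application starts, as long as it follows the convention.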

Summary #

Using a DI Container to compose object graphs by convention presents an unparalleled opportunity to push infrastructure code to the background. However, if you're not prepared to go all the way, Poor Man's DI may actually be a better option. Don't use a DI Container just to use one. Understand the value and cost associated with it, and always keep in mind that Poor Man's DI is a valid alternative.


Comments

I'd offer a different suggestion: Try writing a test for every line of code you write. Then tell me how Poor Man's DI works out for you.

Testing usually seems to be the pressure release for not using an IoC container in a static language. Even if it's just a matter of not writing the tests for the overridden constructor, testing is usually thrown out. And in a world without tests, Poor Man's DI (or not DI at all) is often the "simpler" solution. Less lines of code, "it just works," etc etc. There are lots of options when you only look at the implementation without concern about how one is to provide automated verification against regressions.

If using TDD or even just "testing," an IoC container is always the simpler solution. Unless, of course -- if you just switch to a language or framework that lets you do both. *cough* ruby *cough* python *cough* dynamic languages *cough*
2012-11-06 17:00 UTC
Daniel Hilgarth #
Very good article, thank you!


Just to add another perspective:

I created some extension methods in one of my core libraries that registers everything that ends with Service, Factory, Provider etc. Additionally, I created some extension methods for special areas like NHibernate or AutoMapping.

With these extension methods and a project that adheres to these conventions, my composition roots are very short and need virtually no maintenance.

I have successfully used this approach in several mid to big sized projects. It just works, I wouldn't want to work without a DI container anymore as it would cost so much more time.
2012-11-06 19:53 UTC
Darren, thank you for your comment. How do you think TDD fits into this discussion? Which overloaded constructors? If you examine the git repo behind these samples, you should find that I didn't change the production code between the three examples. It's the same production code - the same classes, the same constructors, etc. - just wired in three different ways.
2012-11-06 20:52 UTC
Every time I've heard of Poor Man's DI, it's always described the practice of creating two constructors:

1.) A constructor with no arguments. Dependencies are initialized in the constructor. This constructor can be used to instantiate the object like "myThing = new MyThing();" This is not using DI at all.

2.) A constructor with arguments for each dependency. Dependencies are passed in. This constructor is used for testing, since it actually uses DI.

This "Poor Man's DI" is a concept because it's a cheap way to get DI into a class that may not have originally been written to support DI. In a way, it seems to give devs the best of both worlds. Users can still instantiate the class simply, but users can also test it. It sounds fine, but it has some issues because the class is still bound to its dependencies and because the implementation uses different code than the tests.

Looking deeper at your example, I see that's not what your "Poor Man's DI" example is. Your way is fully testable, but I don't think it's deserving of the extra "Poor Man's DI" moniker because it's just hand-rolled class instantiation. Or to put it another way: If your code is an example of "Poor Man's DI," then wouldn't any DI that wasn't handled through an IoC container be one too? You are just creating objects with code -- nothing special. (or wrong, either)

If that's what "Poor Man's DI" means, there should probably be a new phrase to identify the practice that I've seen the phrase tied to -- as it's a "special" and unique practice. (Take that however you will. :) )
2012-11-07 01:02 UTC
Darren, I'll refer you to this answer for further details on the terminology choice. You do have a point, but I'm sticking to the terminology from my book.
2012-11-07 08:17 UTC
I guess people can call things whatever they want. I've seen & heard many references to the two-constructor pattern as Poor Man's DI and for a long time, but this is the first time I've seen the phrase used in your way. It's also the first time I've seen basic class instantiation given a special name.

Now that I think about it, the concepts of "Poor Man's DI" and "Bastard Injection" seem to refer to different things. Given your definition, Poor Man's DI basically seems to mean that I don't use an IoC container. It's a concept defining the method in which the object is created. But Bastard Injection refers to what I think would be the more common use of "Poor Man's DI," the practice of creating two constructors. It's a concept defining the method in which the class itself is written. I guess, then, it's possible for me to use Bastard Injection with Poor Man's DI, so long as I don't call the default constructor?

As one more side note: I really don't like the name "Bastard Injection" due to the coarse language. I know it's an anti-pattern, but "bastard" is a word I'd never ever accept from myself or other developers in a professional setting, especially with a client. I just asked my wife, a public elementary school teacher and librarian, and she said that word would not be accepted in her class or at any school she's been at. I don't think it's helpful to give PG13-level words to programming concepts. :)
2012-11-07 13:04 UTC
Another thoughtful article; thank you!

How do considerations of lifetime management factor in? I may want Singleton here, Transient there, etc. That would seem to favor Explicit Register.

There's also the option of integration-testing the Composition Root to provide some type-checking.
2012-11-09 18:51 UTC
Bill, thanks for your comment.

When it comes to lifetime management, there are answers on more than one level.

On the pragmatic level, I've often found that in reality, most of my graphs tend to need to be Transient (or Per Graph) because some commonly used leaf node must be Transient (or Per Graph) for whatever reason. Once that happens, if most (say: more than 75%) of all objects are already Transient, does it really matter if a few more are also Transient? Yes, it could have been more efficient, but if you profile your app, you're most likely to discover that your bottleneck is somewhere else entirely.

On a more explicit level, it would be possible to define a convention that picks up a hint about the desired lifetime from the type itself. You could for example call a service "ThreadSafeFoo" to hint that Singleton would be appropriate - or perhaps you could adorn it with a [ThreadSafe] attribute...
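As a rough sketch of what such a convention could look like, the following uses only reflection to derive a lifetime hint from either a name prefix or a marker attribute. Lifetime, ThreadSafeAttribute, LifetimeConvention and the three sample classes are hypothetical names invented for this illustration; no DI Container ships them, and you'd still have to feed the result into your container's own registration API.

```csharp
using System;

// Hypothetical lifetime styles; real containers have their own equivalents.
public enum Lifetime { Transient, Singleton }

// Hypothetical marker attribute hinting that one shared instance is safe.
[AttributeUsage(AttributeTargets.Class)]
public sealed class ThreadSafeAttribute : Attribute { }

public static class LifetimeConvention
{
    // Singleton if the type carries the attribute or its name starts with
    // "ThreadSafe"; otherwise the safe default, Transient.
    public static Lifetime For(Type type)
    {
        if (type == null)
            throw new ArgumentNullException(nameof(type));

        bool marked = Attribute.IsDefined(type, typeof(ThreadSafeAttribute));
        bool named = type.Name.StartsWith(
            "ThreadSafe", StringComparison.Ordinal);

        return (marked || named) ? Lifetime.Singleton : Lifetime.Transient;
    }
}

// Hypothetical sample types.
[ThreadSafe]
public sealed class Clock { }

public sealed class ThreadSafeFoo { }

public sealed class ShoppingBasket { }
```

Defaulting to Transient keeps the convention conservative: a type only becomes a Singleton when someone has explicitly vouched for it.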

Testing the container itself I don't find particularly useful, but a set of smoke tests of the entire app can be helpful.
2012-11-11 18:03 UTC
Good points, Mark, original perspective.
I may be wrong, but reading your post I understand that the goal of a DI Container is to compose object graphs. This is undoubtedly true. Yet I think that this is just one of DI Containers' goals, and possibly not even the main one.

I'm sure you already know the amazing post An Autofac Lifetime Primer by Nicholas Blumhardt. It is about Autofac, but it covers principles that are common to all DI Containers.
Reading Nicholas's post, what I get is that a DI Container is a tool whose main goal is to manage resource lifetimes. Nicholas defines a resource as "anything with acquisition and release phases in its lifecycle". IoC Containers "provide a good solution to the resource management problem" and "to do this, they need to take ownership of the disposable components that they create". In other words, not only do DI Containers compose object graphs, but they also take care of the lifecycle of the objects they create. Nicholas's post is very detailed in explaining how and why a DI Container must track resources and guarantee that their disposal is properly managed.
This is an excerpt I find particularly significant:

"[...] you need to find a strategy to ensure resources are disposed when they’re no longer required. The most widely-attempted one is based around the idea that whatever object acquires the resource should also release it. I pejoratively call it “ad-hoc” because it doesn’t work consistently. Eventually you’ll come up against one (and likely more) of the following issues:
Sharing: When multiple independent components share a resource, it is very hard to figure out when none of them requires it any more. Either a third party will have to know about all of the potential users of the resource, or the users will have to collaborate. Either way, things get hard fast.
Cascading Changes: Let’s say we have three components – A uses B which uses C. If no resources are involved, then no thought needs to be given to how resource ownership or release works. But, if the application changes so that C must now own a disposable resource, then both A and B will probably have to change to signal appropriately (via disposal) when that resource is no longer needed. The more components involved, the nastier this one is to unravel."

DI Containers solve these problems.
Poor Man's (or Pure) DI solves the compose phase only. But DI should also take care of resource disposal, or it would not provide any Unit of Work and could lead to memory leaks or NullPointerExceptions at runtime. What a basic Poor Man's implementation provides is just an Instance Per Dependency scope (every request gets a new instance). With a few modifications, it could provide a Single Instance scope (that is, a Singleton). But you might agree that managing nested scopes, shared dependencies, instances per web request, and proper disposal with Poor Man's DI is anything but a simple task.

So, I'm not sure the distinction between Poor Man and DI Containers is only a matter of Convention over Configuration. I got to the conclusion that the main goal of a DI Container is lifecycle management, much more than object graph composition.

What do you think?
2015-08-18 7:06 UTC

Arialdo, thank you for writing.

Like you, I used to think that lifetime management was a strong motivation to use a DI Container; there's an entire chapter about lifetime management in my book.

There may be cases where that's true, but these days I prefer the explicit lifetime matching I get from Pure DI.

While you can make lifetime management quite complicated, I prefer to keep it simple, so in practice, I only use the Singleton and Transient lifetime styles. Additionally, I prefer to design my components so that they aren't disposable. If I must use a disposable third-party object, my next priority would be to use a Decoraptor, and add decommissioning support if necessary. Only if none of that is possible will I begin to look at disposal from the Composition Root.

Usually, when you only use Singleton and Transient, manual disposal from the Composition Root is easy. There's no practical reason to dispose of the Singletons, so you only need to dispose of the Transient objects. How you do that varies from framework to framework, but in ASP.NET Web API, for example, it's easy.

2015-08-18 08:41 UTC

Dependency Injection in ASP.NET Web API with Castle Windsor

Wednesday, 03 October 2012 03:45:22 UTC

This post describes how to compose Controllers with Castle Windsor in the ASP.NET Web API.

In my previous post I described how to use Dependency Injection (DI) in the ASP.NET Web API using Poor Man's DI. It explained the basic building blocks, including the relevant extensibility points in the Web API. Poor Man's DI can be an easy way to get started with DI and may be sufficient for a small code base, but for larger code bases you may want to adopt a more convention-based approach. Some DI Containers provide excellent support for Convention over Configuration. One of these is Castle Windsor.

Composition Root #

Instead of the PoorMansCompositionRoot from the example in the previous post, you can create an implementation of IHttpControllerActivator that acts as an Adapter over Castle Windsor:

public class WindsorCompositionRoot : IHttpControllerActivator
{
    private readonly IWindsorContainer container;
 
    public WindsorCompositionRoot(IWindsorContainer container)
    {
        this.container = container;
    }
 
    public IHttpController Create(
        HttpRequestMessage request,
        HttpControllerDescriptor controllerDescriptor,
        Type controllerType)
    {
        var controller =
            (IHttpController)this.container.Resolve(controllerType);
 
        request.RegisterForDispose(
            new Release(
                () => this.container.Release(controller)));
 
        return controller;
    }
 
    private class Release : IDisposable
    {
        private readonly Action release;
 
        public Release(Action release)
        {
            this.release = release;
        }
 
        public void Dispose()
        {
            this.release();
        }
    }
}

That's pretty much all there is to it, but there are a few points of interest here. First of all, the class implements IHttpControllerActivator just like the previous PoorMansCompositionRoot. That's the extensibility point you need to implement in order to create Controller instances. However, instead of hard-coding knowledge of concrete Controller classes into the Create method, you delegate creation of the instance to an injected IWindsorContainer instance. Before returning the IHttpController instance created by calling container.Resolve, you also register that object graph for disposal.

With Castle Windsor, decommissioning is done by invoking the Release method on IWindsorContainer. The input into the Release method is the object graph originally created by IWindsorContainer.Resolve. That's the rule from the Register Resolve Release pattern: What you Resolve you must also Release. This ensures that if the Resolve method created a disposable instance (even deep in the object graph), the Release method signals to the container that it can now safely dispose of it. You can read more about this subject in my book.

The RegisterForDispose method takes as a parameter an IDisposable instance, and not a Release method, so you must wrap the call to the Release method in an IDisposable implementation. That's the little private Release class in the code example. It adapts an Action delegate into a class which implements IDisposable, invoking the code block when Dispose is invoked. The code block you pass into the constructor of the Release class is a closure around the outer variables this.container and controller, so when the Dispose method is called, the container releases the controller (and the entire object graph beneath it).
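If the closure part seems a bit abstract, here's a minimal sketch of the same adapter idea in isolation, with no dependency on Castle Windsor or the Web API. DisposableAction plays the role of the private Release class, and FakeContainer is a hypothetical stand-in that merely records what it was asked to release; neither type exists in any library.

```csharp
using System;
using System.Collections.Generic;

// Adapts an Action into an IDisposable, like the private Release class.
public sealed class DisposableAction : IDisposable
{
    private readonly Action action;

    public DisposableAction(Action action)
    {
        this.action = action
            ?? throw new ArgumentNullException(nameof(action));
    }

    // Runs the captured code block; nothing happens before this call.
    public void Dispose()
    {
        this.action();
    }
}

// Hypothetical stand-in for a container; it only records Release calls.
public sealed class FakeContainer
{
    public List<object> Released { get; } = new List<object>();

    public void Release(object instance)
    {
        this.Released.Add(instance);
    }
}
```

A lambda such as () => container.Release(controller) captures both variables when the DisposableAction is created, but the Release call itself only runs when Dispose is invoked. That's why it's safe to hand the adapter to the framework at the beginning of the request.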

Configuring the container #

With the WindsorCompositionRoot class in place, all that's left is to set it all up in Global.asax. Since IWindsorContainer itself implements IDisposable, you should create and configure the container in the application's constructor so that you can dispose it when the application exits:

private readonly IWindsorContainer container;
 
public WebApiApplication()
{
    this.container =
        new WindsorContainer().Install(new DependencyConventions());
}
 
public override void Dispose()
{
    this.container.Dispose();
    base.Dispose();
}

Notice that you can configure the container with the Install method directly in the constructor. That's the Register phase of the Register Resolve Release pattern.

In Application_Start you tell the ASP.NET Web API about your WindsorCompositionRoot instead of PoorMansCompositionRoot from the previous example:

GlobalConfiguration.Configuration.Services.Replace(
    typeof(IHttpControllerActivator),
    new WindsorCompositionRoot(this.container));

Notice that the container instance field is passed into the constructor of WindsorCompositionRoot, so that it can use the container instance to Resolve Controller instances.

Summary #

Setting up DI in the ASP.NET Web API with a DI Container is easy, and it would work very similarly with other containers (not just Castle Windsor), although the Release mechanisms tend to be a bit different from container to container. You'll need to create an Adapter from IHttpControllerActivator to your container of choice and set it all up in the Global.asax.


Comments

Mahesh #
Hi,

I am getting an error -

The type or namespace name 'DependencyConventions' could not be found (are you missing a using directive or an assembly reference?)

I added Castle windsor via Nuget in VS 2012 Web Express.

What's the problem?

Thanks,
Mahesh.
2012-10-14 10:49 UTC
Luis #
Hi Mark,

Not even remotely related to Web API, but I was wondering if a blog post about CQRS and DI in general was in the pipeline. Last time I posted I hadn't read your book, now that I have, I'm finding myself reading your blog posts like a book and I can't wait for the next. Great book by the way, can't recommend it enough, unless you're on some sort of diet.

Luis
2012-10-16 08:15 UTC
Thank you. Have you read my CQRS article? For examples on using DI with CQRS, I'd like to suggest the updated example code.
2012-10-16 11:59 UTC
Simon #
Hi Mark,

Does the above implementation also resolve normal MVC4 website controllers? If so is there any extra setup required in the Global.asax file? Prior to MVC4 I was using the ControllerFactory method described in your book but is this still the best way?
2012-10-17 08:12 UTC
IHttpControllerActivator is a special Web API interface, as is IHttpController. MVC Controllers are not resolved with this API, but it's very similar to the approach outlined in my book.
2012-10-17 12:30 UTC
Rema Manual #
Hi Mark,

How about using Windsor/above technique for injecting dependencies into MVC 4 attributes? I am using customized Authorize and ExceptionFilter attributes and so far I have not found a nice, easy and clean way to inject dependencies into them?

2012-10-20 12:58 UTC
You can't use the above technique for injecting anything into MVC 4 attributes, since they aren't Controllers. The only way to inject dependencies into MVC attributes is by Property Injection, and if you read section 4.2 of my book you'll see that there are many issues with this pattern.

A better approach is to use global filters with behaviour, and use only passive attributes.
2012-10-21 13:48 UTC
Daniel Hilgarth #
Thanks for your article. I adapted it to Autofac and I thought I'd share that code.
Autofac has the concept of LifetimeScopes. Using these, the code looks like the following:

public IHttpController Create(HttpRequestMessage request, HttpControllerDescriptor controllerDescriptor, Type controllerType)
{
var scope = _container.BeginLifetimeScope();
var controller = (IHttpController)scope.Resolve(controllerType);
request.RegisterForDispose(scope);
return controller;
}

If you want to register dependencies that are different for every request (like Hyprlinkr's RouteLinker), you can do this when beginning the lifetime scope:

public IHttpController Create(HttpRequestMessage request, HttpControllerDescriptor controllerDescriptor, Type controllerType)
{
var scope = _container.BeginLifetimeScope(x => RegisterRequestDependantResources(x, request));
var controller = (IHttpController)scope.Resolve(controllerType);
request.RegisterForDispose(scope);
return controller;
}

private static void RegisterRequestDependantResources(ContainerBuilder containerBuilder, HttpRequestMessage request)
{
containerBuilder.RegisterInstance(new RouteLinker(request));
containerBuilder.RegisterInstance(new ResourceLinkVerifier(request.GetConfiguration()));
}

Sorry for the formatting, I have no idea how to format code here.
2012-10-26 18:54 UTC
Alexander #
Hi Mark,
Nice article.

As I understand WebApiApplication can be instantiated several times and disposed several times as well. This page(http://msdn.microsoft.com/en-gb/library/ms178473.aspx) says "The Application_Start and Application_End methods are special methods that do not represent HttpApplication events. ASP.NET calls them once for the lifetime of the application domain, not for each HttpApplication instance."
So as I understand your approach we can get into a situation using disposed context.

What do you think about this?
2012-11-20 02:32 UTC
I've never experienced it to be a problem, so my guess is that in reality the documentation is off and there's only one instance of HttpApplication. Otherwise, the container should be disposed, and I've never seen that happen.
2012-11-20 08:07 UTC
Alexander #
I've ended with static context in the HttpApplication. Create context in Application_Start and dispose Application_End. I think it's better for future once documentation become the reality:). For each AppDomain which possibly can be created we will have separate context.
Anyway your example is very useful for me.
2012-11-20 09:38 UTC
Chris #
Is the constructor the correct place to initialise the container?

As there are multiple instances of HttpApplication per application

(If I put a breakpoint in the constructor it gets hit multiple times)

As you can see by these articles, there is not a single instance of HttpApplication, but multiple

http://lowleveldesign.wordpress.com/2011/07/20/global-asax-in-asp-net/
http://msdn.microsoft.com/en-us/library/a0xez8f2(v=vs.71).aspx

wouldn't it be more appropriate to go in Application_Start?


2012-12-07 13:50 UTC
See previous comments.
2012-12-09 19:36 UTC
Thanks for this post. Very useful. Not sure if I am missing something, but I'm having the same issue as the first commenter (Mahesh). What should DependencyConventions() actually do?
2013-04-12 16:01 UTC
Will, DependencyConventions is just a Windsor Installer. I just called it DependencyConventions because I prefer Convention over Configuration when using DI Containers. In your own project, you'll need to define your own Windsor Installer. Alternatively, you can configure the container directly in the WebApiApplication constructor.
2013-04-12 16:00 UTC
Jeff Soper #

I've been studying this article and some of your answers like this one to StackOverflow questions pertaining to DI. It seems that the established community efforts to integrate popular IoC containers such as Ninject are, at their core, implementations of IDependencyResolver, rather than IHttpControllerActivator.

Are these popular community efforts missing the 'access to context' trait of your solution, or are they simply accomplishing it another way? Are there any established projects, open-source or otherwise, that do what you propose, or is this still an untapped 'pull request' opportunity for projects like the Ninject MVC, etc?

2014-03-15 17:40 UTC

Jeff, thank you for writing. You are indeed correct that one of the many problems with the Service Locator anti-pattern (and therefore also IDependencyResolver) is that the overall context is hidden. Glenn Block originally pointed that particular problem out to me.

This is also the case with IDependencyResolver, because when GetService(Type) or GetServices(Type) is invoked, the only information the composition engine knows, is the requested type. Thus, resolving something that requires access to the HttpRequestMessage or one of its properties, is impossible with IDependencyResolver, but perfectly possible with IHttpControllerActivator.

So, yes, I would definitely say that any DI Container that provides 'ASP.NET Web API integration' by implementing IDependencyResolver is missing out. In any case, these days I rarely use a DI Container, so I don't consider it a big deal - and if I need to use a DI Container, I just need to add those few lines of code listed above in this blog post.

2014-03-15 18:28 UTC
Dmitry Goryunov #

Can't figure out, how is it better to organize my solution.

There are, for example, three projects Cars.API, Cars.Core, and Cars.Data. API contains web-interface, Core contains all the abstractions, and Data communicates with DB. Data and API should depend on Core according to Dependency inversion principle. At this point everything seems to be clear, but then we implement our Composition Root in the API project, which makes it dependent on the Data project containing implementations of abstractions that are stored in Core project. Is it violation of Dependency inversion principle?

P.S. thank you for your book and the articles you write.

2015-12-07 16:53 UTC

Dmitry, thank you for writing. Does this or this help?

2015-12-07 17:05 UTC
Andrew G #

In the Configuring the Container section, you are placing the Install inside the constructor. Whenever the application starts up or executes a request, the constructor seems to be called multiple times. In turn, the container will be created multiple times throughout its lifetime. Is that the point? Or should the container be moved into the Application_Start? Although the constructor is called multiple times, application start seems to be called once. The dispose doesn't seem to be called till the end as well. Is there something earlier in the lifecycle that would cause a need for the Register to be done in the constructor?

I very much enjoy your book and your blog btw. great source of solid information!

2017-09-01 10:52 UTC

Andrew, thank you for writing. In general, I don't recall that this has ever been an issue, but see previous threads in the comments for this post. The question's come up before.

I do, however, admit that I've never formally studied the question like I did with WCF, so it's possible that I'm wrong on this particular issue. Also, details of the framework could have changed in the five years that have gone by since I wrote the article.

2017-09-02 15:33 UTC

Dependency Injection and Lifetime Management with ASP.NET Web API

Friday, 28 September 2012 03:56:21 UTC

This post describes how to properly use Dependency Injection in the ASP.NET Web API, including proper Lifetime Management.

The ASP.NET Web API supports Dependency Injection (DI), but the appropriate way to make it work is not the way it's documented by Microsoft. Even though the final version of IDependencyResolver includes the notion of an IDependencyScope, and thus seemingly supports decommissioning (the release of IDisposable dependencies), it's not very useful.

The problem with IDependencyResolver #

The main problem with IDependencyResolver is that it's essentially a Service Locator. There are many problems with the Service Locator anti-pattern, but most of them I've already described elsewhere on this blog (and in my book). One disadvantage of Service Locator that I haven't yet written so much about is that within each call to GetService there's no context at all. This is a general problem with the Service Locator anti-pattern, not just with IDependencyResolver. Glenn Block originally pointed this out to me. The problem is that in an implementation, all you're given is a Type instance and asked to return an object, but you're not informed about the context. You don't know how deep in a dependency graph you are, and if you're being asked to provide an instance of the same service multiple times, you don't know whether it's within the same HTTP request, or whether it's for multiple concurrent HTTP requests.

In the ASP.NET Web API this issue is exacerbated by another design decision that the team made. Contrary to the IDependencyResolver design, I find this other decision highly appropriate. It's how context is modeled. In previous incarnations of web frameworks from Microsoft, we've had such abominations as HttpContext.Current, HttpContextBase and HttpContextWrapper. If you've ever tried to work with these interfaces, and particularly if you've tried to do TDD against any of these types, you know just how painful they are. That they are built around the Singleton pattern certainly doesn't help.

The ASP.NET Web API does it differently, and that's very fortunate. Everything you need to know about the context is accessible through the HttpRequestMessage class. While you could argue that it's a bit of a God Object, it's certainly a step in the right direction because at least it's a class you can instantiate within a unit test. No more nasty Singletons.

This is good, but from the perspective of DI this makes IDependencyResolver close to useless. Imagine a situation where a dependency deep in the dependency graph needs to know something about the context. What was the request URL? What was the base address (host name etc.) requested? How can you share dependency instances within a single request? To answer such questions, you must know about the context, and IDependencyResolver doesn't provide this information. In short, IDependencyResolver isn't the right hook to compose dependency graphs. Fortunately, the ASP.NET Web API has a better extensibility point for this purpose.

Composition within context #

Because HttpRequestMessage provides the context you may need to compose dependency graphs, the best extensibility point is the extensibility point which provides an HttpRequestMessage every time a graph should be composed. This extensibility point is the IHttpControllerActivator interface:

public interface IHttpControllerActivator
{
    IHttpController Create(
        HttpRequestMessage request,
        HttpControllerDescriptor controllerDescriptor,
        Type controllerType);
}

As you can see, each time the Web API framework invokes this method, it will provide an HttpRequestMessage instance. This seems to be the best place to compose the dependency graph for each request.
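To illustrate why having the request at hand matters, here's a small sketch of a leaf dependency that needs to know the base address of the current request. It only relies on HttpRequestMessage from System.Net.Http, so it doesn't need the Web API assemblies. BaseAddressProvider, LinkingController and ContextualComposer are hypothetical names made up for this example, and LinkingController is a plain class standing in for a real Controller.

```csharp
using System;
using System.Net.Http;

// Hypothetical leaf dependency that needs request context.
public sealed class BaseAddressProvider
{
    public BaseAddressProvider(Uri requestUri)
    {
        if (requestUri == null)
            throw new ArgumentNullException(nameof(requestUri));

        // Keep only scheme and authority, dropping path and query.
        this.BaseAddress =
            new Uri(requestUri.GetLeftPart(UriPartial.Authority));
    }

    public Uri BaseAddress { get; }
}

// Hypothetical stand-in for a Controller with one dependency.
public sealed class LinkingController
{
    public LinkingController(BaseAddressProvider provider)
    {
        this.Provider = provider;
    }

    public BaseAddressProvider Provider { get; }
}

public static class ContextualComposer
{
    // The request is available at composition time, so the leaf can be
    // created with exactly the context it needs.
    public static LinkingController Compose(HttpRequestMessage request)
    {
        return new LinkingController(
            new BaseAddressProvider(request.RequestUri));
    }
}
```

With only a Type to go on, as in GetService(Type), there would be no way to supply the request URL to BaseAddressProvider. A method that receives the HttpRequestMessage, like Create, doesn't have that problem.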

Example: Poor Man's DI #

As an example, consider a Controller with this constructor signature:

public RootController(IStatusQuery statusQuery)

If this is the only Controller in your project, you can compose its dependency graph with a custom IHttpControllerActivator. This is easy to do:

public class PoorMansCompositionRoot : IHttpControllerActivator
{
    public IHttpController Create(
        HttpRequestMessage request,
        HttpControllerDescriptor controllerDescriptor,
        Type controllerType)
    {
        if (controllerType == typeof(RootController))
            return new RootController(
                new StatusQuery());
 
        return null;
    }
}

The Create implementation contains a check on the supplied controllerType parameter, creating a RootController instance if the requested Controller type is RootController. It simply creates the (very shallow) dependency graph by injecting a new StatusQuery instance into a new RootController instance. If the requested Controller type is anything other than RootController, the method returns null. It seems to be a convention in the Web API that if you can't supply an instance, you should return null. (This isn't a design I'm fond of, but this time I'm only on the supplying side, and I can only pity the developers on the receiving side (the ASP.NET team) that they'll have to write all those null checks.)

Some readers may think that it would be better to use a DI Container here, and that's certainly possible. In a future post I'll provide an example on how to do that.

The new PoorMansCompositionRoot class must be registered with the Web API framework before it will work. This is done with a single line in Application_Start in Global.asax.cs:

GlobalConfiguration.Configuration.Services.Replace(
    typeof(IHttpControllerActivator),
    new PoorMansCompositionRoot());

This replaces the default implementation that the framework provides with the PoorMansCompositionRoot instance.

Decommissioning #

Implementing IHttpControllerActivator.Create takes care of composing object graphs, but what about decommissioning? What if you have dependencies (deep within the dependency graph) implementing the IDisposable interface? These must be disposed of after the request has ended (unless they are Singletons) - if not, you will have a resource leak at hand. However, there seems to be no Release hook in IHttpControllerActivator. On the other hand, there's a Release hook in IDependencyResolver, so is IDependencyResolver, after all, the right extensibility point? Must you trade off context for decommissioning, or can you have both?

Fortunately you can have both, because there's a RegisterForDispose extension method hanging off HttpRequestMessage. It enables you to register all appropriate disposable instances for disposal after the request has completed.

Example: disposing of a disposable dependency #

Imagine that, instead of the StatusQuery class from the previous example, you need to use a disposable implementation of IStatusQuery. Each instance must be disposed of after each request completes. In order to accomplish this goal, you can modify the PoorMansCompositionRoot implementation to this:

public class PoorMansCompositionRoot : IHttpControllerActivator
{
    public IHttpController Create(
        HttpRequestMessage request,
        HttpControllerDescriptor controllerDescriptor,
        Type controllerType)
    {
        if (controllerType == typeof(RootController))
        {
            var disposableQuery = new DisposableStatusQuery();
            request.RegisterForDispose(disposableQuery);
            return new RootController(disposableQuery);
        }
 
        return null;
    }
}

Notice that the disposableQuery instance is passed to the RegisterForDispose method before it's injected into the RootController instance and the entire graph is returned from the method. After the request completes, DisposableStatusQuery.Dispose will be called.

If you have a dependency which implements IDisposable, but should be shared across all requests, obviously you shouldn't register it for disposal. Such Singletons you can keep around and dispose of properly when the application exits gracefully (if that ever happens).

Summary #

Proper DI and Lifetime Management with the ASP.NET Web API is easy once you know how to do it. It only requires a few lines of code.

Stay away from the IDependencyResolver interface, which is close to useless. Instead, implement IHttpControllerActivator and use the RegisterForDispose method for decommissioning.

In a future post I will demonstrate how to use a DI Container instead of Poor Man's DI.


Comments

IDependencyResolver (WebApi version) do support scoping. Look at the "public IDependencyScope BeginScope()" method.
2012-09-28 12:24 UTC
Yes, that's what I wrote. The problem is that it doesn't provide any context.
2012-09-28 12:30 UTC
So what you're saying is that the DependencyResolver is bad because you *might* need to get more context information for the controller composition? Because I fail to see the advantage of your approach with the examples that you have given.
2012-09-28 18:17 UTC
Dave Bettin #
What changes would have you made to the DependencyResolver/Scope design to provide this context? Or would you ditch the design completely and start fresh?
2012-09-28 20:40 UTC
Dave, the IHttpControllerActivator/RegisterForDispose is a workable combo, but I'd have preferred a release method on it, just like MVC's IControllerFactory.
2012-09-28 21:26 UTC
Jonas, let's assume that you have a deeper dependency graph. Let's say that you have this Controller constructor: public MyController(IService1, IService2). Imagine, moreover, that you have implementations of each of these interfaces, with these constructors: public Service1(IFoo) and public Service2(IFoo).

You have one implementation of IFoo, namely Foo, and for efficiency reasons, or perhaps because Foo is a Mediator, you'd like to share the same instance of Foo between Service1 and Service2. However, Foo isn't thread-safe, so you can only share the Foo instance within a single request. For each request, you want to use a single instance of Foo.

This is the Per Graph lifestyle pattern (p. 259 in my book).

That's not possible with IDependencyResolver, but it is with IHttpControllerActivator.
2012-09-28 21:35 UTC
Hi Mark

First of all I would like to mention that I just read your book and I enjoyed it very much.
Straight to the point with good simple examples and metaphors.
What I have noticed though, from your book and from this blog post, is that you give handling IDisposable objects a great deal of attention.

I have written a lightweight service container over at Github (https://github.com/seesharper/LightInject/wiki/Getting-started) that tries to do the "right thing" with a minimal effort.

Then I started to read about handling disposable services in your book and realized that this is actually quite a complicated thing to deal with.

It also seems to be unclear how a service container actually should do this. The various container implementations each have pretty much their own take on the disposable challenge, whereas Spring.Net, for instance, does not seem to go very far on this topic. Yet it is one of the most popular DI frameworks out there.

The question then remains, is automatic handling of disposable objects a necessity for any container or is a feature?

If it is absolutely necessary, how complex does it need to be. I would rather not implement a whole ref-count system on top of the CLR :)

Regards

Bernhard Richter


2012-09-29 08:39 UTC
Well, whether or not it's a necessity depends on who you ask.

StructureMap, for example, has no decommissioning capability, and when you ask Jeremy Miller, it's by design. The reason for this is that if you don't have any disposable dependencies at all, it's just an overhead keeping track of all the instances created by the container. Garbage collection will ensure that resources are properly reclaimed.

Containers that do keep track of the instances they created will leak unless you explicitly remember to Release what you Resolve. By that argument, Jeremy Miller considers StructureMap's design safer because in the majority case there will be no leaks. IMO, the downside of that is that if you have disposable dependencies, they will leak and there's nothing you can do about it.

On the other hand, with a container like Castle Windsor, it's important to Release what you Resolve, or you might have leaks. The advantage, however, is that you're guaranteed that everything can be properly disposed of.

Thus, in the end, no matter the strategy, it all boils down to that the developers using the container must exercise discipline. However, they are two different kinds of discipline: Either never use disposable dependencies or always Release what you Resolve.
2012-09-30 12:09 UTC
Kirin Yao #
So, it's also applicable to DI in ASP.NET MVC, isn't it? In IControllerActivator.Create method, RequestContext parameter provides the context for creating controller.
2012-10-19 09:02 UTC
Yes, this is described in detail in section 7.2 in my book.
2012-10-19 12:39 UTC
Ali #
Hi Mark,

Great article, thanks a lot! Loved your book as well :)

I just have a little observation that returning null in the Create method causes a 404. Instead we could do the following to call the default implementation for other controllers:

return new DefaultHttpControllerActivator().Create(request, controllerDescriptor, controllerType);
2013-02-22 10:16 UTC
ImaginaryDevelopment #
Hi Mark,

How would you handle db calls that need to be made to validate objects in say [CustomValidationAttribute()] tags
where you need the objects used to validate again in the actual web api action method?

2014-08-28 14:52 UTC

In general, I don't like attributes with behaviour, but prefer passive attributes. Still, that just moves the implementation to the Filter's ExecuteActionFilterAsync method, so that doesn't really address your question.

If you need access to the actual objects created from the incoming request, you probably could pull it out of the HttpActionContext passed into ExecuteActionFilterAsync, but why bother? You can access the object from the Controller.

A Filter attribute in ASP.NET Web API, to be useful, represents a Cross-Cutting Concern, such as security, logging, metering, caching, etc. Cross-Cutting Concerns are cross-cutting exactly because they are independent of the actual values or objects in use. This isn't the case for validation, which tends to be specific to particular types, so these are most likely better addressed by a Controller, or a Service invoked by the Controller, rather than by a custom attribute.

Once you're in the Controller, if you need access to a database, you can inject a data access object (commonly referred to as a Repository) into the Controller.

2014-08-30 11:23 UTC

Concrete Dependencies

Friday, 31 August 2012 20:37:37 UTC

Concrete classes can also be used as dependencies

Usually, when we think about Dependency Injection (DI), we tend to consider that only polymorphic types (interfaces or (abstract) base classes) can act as dependencies. However, in a previous blog post I described how primitives can be used as dependencies. A primitive is about as far removed from polymorphism as it can be, but there's a middle ground too. Sometimes 'normal' concrete classes with non-virtual members can be used as dependencies to great effect.

While the Liskov Substitution Principle is voided by injecting a concrete type, there can be other good reasons to occasionally do something like this. Consider how many times you've written an extension method to perform some recurring task. Sometimes it turns out that an extension method isn't the best way to encapsulate a common algorithm. It might start out simple enough, but then you realize that you need to provide the extension method with a control parameter in order to 'configure' it. This causes you to add more arguments to the extension method, or to add more overloads. Soon, you have something like the Object Mother (anti-)pattern on your hands.

A concrete class can sometimes be a better way to encapsulate common algorithms in a way where the behavior can be tweaked or changed in one go. Sometimes the boundary can become blurred. In the previous post I examined constructor arguments such as strings and integers, but what about an Uri instance? It might act as a base URI for creating absolute links from within a Controller. An Uri instance isn't really a primitive, although it basically just encapsulates something which is a string. It's an excellent example of the Value Object pattern, providing a rich API for manipulating and querying URIs.

It can be more complex than that. Consider Hyprlinkr as an example. What it does is to produce URI links to other Controllers in an ASP.NET Web API service in a strongly typed way. It's not really a polymorphic dependency as such, although it does implement an interface. It's more like a reusable component which produces a deterministic result without side-effects. In Functional Programming terminology, it's comparable to a pure function. For a number of reasons, this is a prime candidate for a concrete dependency.

Before I get to that, I want to show you what I mean when I talk about locally scoped methods, including extension methods and such. Then I want to talk about using the RouteLinker class (the main class in Hyprlinkr) as a classic polymorphic dependency, and why that doesn't really work either. Finally, I want to talk about why the best option is to treat RouteLinker as a concrete dependency.

RouteLinker as a local variable #

While Hyprlinkr was always designed with DI in mind, you actually don't have to use DI to use it. From within an ApiController class, you can just create an instance like this:

var linker = new RouteLinker(this.Request);

With this locally scoped variable you can start creating links to other resources:

Href = linker.GetUri<NoDIController>(r => r.Get(id)).ToString()

That seems easy, so why make it harder than that? Well, it's easy as long as you have only a single, default route in your web service. As soon as you add more routes, you'll need to help Hyprlinkr a bit by providing a custom IRouteDispatcher. That goes as the second argument in a constructor overload:

var linker = new RouteLinker(this.Request, ??);

The question is: how do you create an instance of the desired IRouteDispatcher? You could do it inline every time you want to create an instance of RouteLinker:

var linker = new RouteLinker(this.Request, new MyCustomRouteDispatcher());

However, that's starting to look less than DRY. This is where many people might consider creating an extension method which creates a RouteLinker instance from an HttpRequestMessage instance. Now what if you need to supply a configuration value to the custom route dispatcher? Should you pull it from app.config straight from within your extension method? Then what if you need to be able to vary that configuration value from a unit test? This could lead toward an unmaintainable mess quickly. Perhaps it would be better injecting the dependency after all...

IResourceLinker as a polymorphic dependency #

The RouteLinker class actually implements an interface (IResourceLinker) so would it be worthwhile to inject it as a polymorphic interface? This is possible, but actually leads to more trouble. The problem is that due to its signature, it's damn hard to unit test. The interface looks like this:

public interface IResourceLinker
{
    Uri GetUri<T>(Expression<Action<T>> method);
}

That may at first glance look innocuous, but is actually quite poisonous. The first issue is that it's impossible to define proper setups when using dynamic mocks. This is because of the Expression parameter. The problem is that while the following Moq setup compiles, it can never work:

linkerStub
    .Setup(x => x.GetUri<ArtistController>(c => c.Get(artistId)))
    .Returns(uri);

The problem is that the expression passed into the Setup method isn't the same as the expression used in the SUT. It may look like the same expression, but it's not. Most of the expression tree actually is the same, but the problem is the leaf of the tree. The leaf of the expression tree is the reference to the artistId variable. This is a test variable, while in the SUT it's a variable which is internal to the SUT. While the values of both variables are expected to be the same, the variables themselves aren't.
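To see why the leaf matters, here's a minimal sketch. It isn't Hyprlinkr or Moq code; ExpressionIdentityDemo, Get, and CreateCall are hypothetical names invented for the illustration. CreateCall builds an expression that closes over its own variable, just as the test and the SUT each close over theirs:

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical names; a sketch of the problem, not library code.
public static class ExpressionIdentityDemo
{
    // Stand-in for a Controller action that takes an ID.
    public static int Get(int id)
    {
        return id;
    }

    // Each call builds a new expression tree whose leaf is a reference
    // to this particular invocation's artistId variable.
    public static Expression<Func<int>> CreateCall(int artistId)
    {
        return () => Get(artistId);
    }
}
```

If you call CreateCall(42) twice, compiling and invoking either expression produces 42, but calling Equals on one with the other as argument returns false. Expression doesn't override Equals, so the comparison is by reference, and the two trees are different object graphs pointing at different variables.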

It might be possible to write a custom equality comparer that picks expression trees apart in order to compare the values of leaf nodes, but that could become messy very quickly.

The only option seems to be to define Setups like this:

linkerStub
    .Setup(x => x.GetUri(It.IsAny<Expression<Action<ArtistController>>>()))
    .Returns(uri);

That sort of defies the purpose of a dynamic Test Double...

That's not the only problem with the IResourceLinker interface. The other problem is the return type. Since Uri doesn't have a default constructor, it's necessary to tell Moq what to return when the GetUri method is called. While the default behavior of Moq is to return null if no matching Setups were found, I never allow null in my code, so I always change Moq's behavior to return something proper instead. However, this has the disadvantage that if there's no matching Setup when the SUT attempts to invoke the GetUri method, Moq will throw an exception because there's no default constructor for Uri and it doesn't know what else to return.

This leads to Fixture Setup code like this:

linkerStub
    .Setup(x => x.GetUri(It.IsAny<Expression<Action<ArtistController>>>()))
    .Returns(uri);
linkerStub
    .Setup(x => x.GetUri(It.IsAny<Expression<Action<ArtistAlbumsController>>>()))
    .Returns(uri);
linkerStub
    .Setup(x => x.GetUri(It.IsAny<Expression<Action<ArtistTracksController>>>()))
    .Returns(uri);
linkerStub
    .Setup(x => x.GetUri(It.IsAny<Expression<Action<SimilarArtistsController>>>()))
    .Returns(uri);

...and that's just to prevent the unit test from crashing. Each and every unit test that hits the same method must have this Setup because the SUT method internally invokes the GetUri method four times with four different parameters. This is pure noise and isn't even part of the test case itself. The tests become very brittle.

If only there was a better way...

RouteLinker as a concrete dependency #

What would happen if you inject the concrete RouteLinker class into other classes? This might look like this:

private readonly RouteLinker linker;
 
public HomeController(
    RouteLinker linker)
{
    this.linker = linker;
}

Creating links from within the Controller is similar to before:

Href = this.linker.GetUri<HomeController>(r => r.GetHome()).ToString(),

What about unit testing? Well, since the GetUri method is strictly deterministic, given the same input, it will always produce the same output. Thus, from a unit test, you only have to ask the instance of RouteLinker injected into the SUT what it would return if invoked with a specific input. Then you can compare this expected output with the actual output.

[Theory, AutoUserData]
public void GetHomeReturnsResultWithCorrectSelfLink(
    [Frozen]RouteLinker linker,
    HomeController sut)
{
    var actual = sut.GetHome();
 
    var expected = new AtomLinkModel
    {
        Href = linker.GetUri<HomeController>(r => r.GetHome()).ToString(),
        Rel = "self"
    }.AsSource().OfLikeness<AtomLinkModel>();
    Assert.True(actual.Links.Any(expected.Equals));
}

In this test, you Freeze the RouteLinker instance, which means that the linker variable is the same instance as the RouteLinker injected into the SUT. Next, you ask that RouteLinker instance what it would produce when invoked in a particular way, and since AtomLinkModel doesn't override Equals, you produce a Likeness from the AtomLinkModel and verify that the actual collection of links contains the expected link.

That's much more precise than those horribly forgiving It.IsAny constraints. The other advantage is also that you don't have to care about Setups of methods you don't care about in a particular test case. The SUT can invoke the GetUri method as many times as it wants, with as many different arguments as it likes, and the test is never going to break because of that. Since the real implementation is injected, it always works without further Setup.

Granted, strictly speaking these aren't unit tests any longer, but rather Facade Tests.

This technique works because the GetUri method is deterministic and has no side-effects. Thus, it's very similar to Function Composition in Functional languages.


The order of AutoFixture Customizations matter

Tuesday, 31 July 2012 16:31:11 UTC

This post answers a FAQ about ordering of AutoFixture Customizations

With AutoFixture you can encapsulate common Customizations using the Customize method and the ICustomization interface. However, customizations may 'compete' for the same requests in the sense that more than one customization is able to handle a request.

As an example, consider a request for something as basic as IEnumerable<T>. By default, AutoFixture can't create instances of IEnumerable<T>, but more than one customization can.

As previously described the MultipleCustomization handles requests for sequences just fine:

var fixture = new Fixture().Customize(new MultipleCustomization());
var seq = fixture.CreateAnonymous<IEnumerable<int>>();

However, the AutoMoqCustomization can also (sort of) create sequences:

var fixture = new Fixture().Customize(new AutoMoqCustomization());
var seq = fixture.CreateAnonymous<IEnumerable<int>>();

However, in this case, the implementation of IEnumerable<int> is a dynamic proxy, so it's not much of a sequence after all.

Mocking IEnumerable<T> #

Here I need to make a little digression on why that is, because this seems to confuse a lot of people. Consider what a dynamic mock object is: it's a dynamic proxy which implements an interface (or abstract base class). It doesn't have a lot of implemented behavior. Dynamic mocks do what we tell them through their configuration APIs (such as the Setup methods for Moq). If we don't tell them what to do, they must fall back on some sort of default implementation. When the AutoMoqCustomization is used, it sets Mock<T>.DefaultValue to DefaultValue.Mock, which means that the default behavior is to return a new dynamic proxy for reference types.

Here's how an unconfigured dynamic proxy of IEnumerable<T> will behave: the interface only has two (very similar) methods:

public interface IEnumerable<out T> : IEnumerable
{
    IEnumerator<T> GetEnumerator();
}

Via IEnumerable the interface also defines the non-generic GetEnumerator method, but it's so similar to the generic GetEnumerator method that the following discussion applies for both.

When you iterate over IEnumerable<T> using foreach, or when you use LINQ, the first thing that happens is that the GetEnumerator method is called. An unconfigured dynamic mock will respond by returning another dynamic proxy implementing IEnumerator<T>. This interface directly and indirectly defines these methods:

T Current { get; }
 
object IEnumerator.Current { get; }
 
bool MoveNext();
 
void Reset();
 
void Dispose();

Iterating over a sequence will typically start by invoking the MoveNext method. Since the dynamic proxy is unconfigured, it has to fall back to default behavior. For booleans the default value is false, so the return value of a call to MoveNext would be false. This means that there are no more elements in the sequence. Iteration stops even before it begins. In effect, such an implementation would look like an empty sequence.
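You can reproduce this behavior without any mocking library. The following hand-written classes are only a sketch of what an unconfigured proxy amounts to (DefaultEnumerable and DefaultEnumerator are hypothetical names, and this is not the code Moq generates): every member simply returns the default value of its return type.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for an unconfigured proxy of IEnumerator<T>:
// every member returns the default value for its return type.
public sealed class DefaultEnumerator<T> : IEnumerator<T>
{
    public T Current { get { return default(T); } }

    object IEnumerator.Current { get { return default(T); } }

    // default(bool) is false, so iteration ends before it begins.
    public bool MoveNext() { return default(bool); }

    public void Reset() { }

    public void Dispose() { }
}

// Hypothetical stand-in for an unconfigured proxy of IEnumerable<T>.
public sealed class DefaultEnumerable<T> : IEnumerable<T>
{
    public IEnumerator<T> GetEnumerator()
    {
        return new DefaultEnumerator<T>();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.GetEnumerator();
    }
}
```

A foreach loop over new DefaultEnumerable<int>() never enters its body, and LINQ's Any method returns false for it.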

OK, back to AutoFixture.

Ordering Customizations #

Frequently I receive questions like this:

"Creating lists with AutoFixture seems inconsistent. When MultipleCustomization comes before AutoMoqCustomization, lists are populated, but the other way around they are empty. Is this a bug?"

No, this is by design. By now, you can probably figure out why.

Still, let's look at the symptoms. Both of these tests pass:

[Fact]
public void OnlyMultipleResolvingSequence()
{
    var fixture = new Fixture().Customize(new MultipleCustomization());
    var seq = fixture.CreateAnonymous<IEnumerable<int>>();
    Assert.NotEmpty(seq);
}
 
[Fact]
public void OnlyAutoMoqResolvingSequence()
{
    var fixture = new Fixture().Customize(new AutoMoqCustomization());
    var seq = fixture.CreateAnonymous<IEnumerable<int>>();
    Assert.Empty(seq);
}

Notice that in the first test, the sequence is not empty, whereas in the second test, the sequence is empty. This is because the MultipleCustomization produces a 'proper' sequence, while the AutoMoqCustomization produces a dynamic proxy of IEnumerable<int> as described above. At this point, this should hardly be surprising.

The same observations can be made when both Customizations are in use:

[Fact]
public void WrongOrderResolvingSequence()
{
    var fixture = new Fixture().Customize(
        new CompositeCustomization(
            new AutoMoqCustomization(),
            new MultipleCustomization()));
 
    var seq = fixture.CreateAnonymous<IEnumerable<int>>();
 
    Assert.Empty(seq);
}
 
[Fact]
public void CorrectOrderResolvingSequence()
{
    var fixture = new Fixture().Customize(
        new CompositeCustomization(
            new MultipleCustomization(),                    
            new AutoMoqCustomization()));
 
    var seq = fixture.CreateAnonymous<IEnumerable<int>>();
 
    Assert.NotEmpty(seq);
}

Both of these tests also pass. In the first test the sequence is empty, and in the second it contains elements. This is because the first Customization 'wins'.

In general, a Customization may potentially be able to handle a lot of requests. For instance, the AutoMoqCustomization can handle all requests for interfaces and abstract base classes. Thus, multiple Customizations may be able to handle a request, so AutoFixture needs a conflict resolution strategy. That strategy is simply that the first Customization which can handle a request gets to do that, and the other Customizations are never invoked. You can use this feature to put specific Customizations in front of more catch-all Customizations. That's essentially what happens when you put MultipleCustomization in front of AutoMoqCustomization.
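The 'first one wins' strategy is easy to illustrate outside of AutoFixture. The following is a simplified sketch and not AutoFixture's actual implementation; IHandler, DelegateHandler, and FirstWinsResolver are hypothetical names invented for the example. Each handler either creates an object for the requested type or declines, and the resolver stops at the first handler that accepts:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for a Customization.
public interface IHandler
{
    bool TryCreate(Type request, out object result);
}

public sealed class DelegateHandler : IHandler
{
    private readonly Func<Type, bool> canHandle;
    private readonly Func<Type, object> create;

    public DelegateHandler(
        Func<Type, bool> canHandle,
        Func<Type, object> create)
    {
        this.canHandle = canHandle;
        this.create = create;
    }

    public bool TryCreate(Type request, out object result)
    {
        if (this.canHandle(request))
        {
            result = this.create(request);
            return true;
        }

        result = null;
        return false;
    }
}

public sealed class FirstWinsResolver
{
    private readonly IList<IHandler> handlers;

    public FirstWinsResolver(params IHandler[] handlers)
    {
        this.handlers = handlers;
    }

    public object Create(Type request)
    {
        // Handlers are asked in order; later ones are never invoked
        // once an earlier one has handled the request.
        foreach (var handler in this.handlers)
        {
            object result;
            if (handler.TryCreate(request, out result))
                return result;
        }

        throw new InvalidOperationException(
            "No handler could create " + request);
    }
}
```

With a specific handler that only accepts IEnumerable<int> and a catch-all handler that accepts any interface, the result depends entirely on which one you pass to the FirstWinsResolver constructor first, just like the two CompositeCustomization tests above.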


FizzBuzz kata in F#: stage 2

Wednesday, 25 July 2012 09:05:09 UTC

In my previous post I walked you through stage 1 of the FizzBuzz kata. In this post I'll walk you through stage 2 of the kata, where new requirements are introduced (see the kata itself for details). This makes the implementation much more complex.

Unit test #

In order to meet the new requirements, I first modified and expanded my existing test cases:

[<Theory>]
[<InlineData(1, "1")>]
[<InlineData(2, "2")>]
[<InlineData(3, "Fizz")>]
[<InlineData(4, "4")>]
[<InlineData(5, "Buzz")>]
[<InlineData(6, "Fizz")>]
[<InlineData(7, "7")>]
[<InlineData(8, "8")>]
[<InlineData(9, "Fizz")>]
[<InlineData(10, "Buzz")>]
[<InlineData(11, "11")>]
[<InlineData(12, "Fizz")>]
[<InlineData(13, "Fizz")>]
[<InlineData(14, "14")>]
[<InlineData(15, "FizzBuzz")>]
[<InlineData(16, "16")>]
[<InlineData(17, "17")>]
[<InlineData(18, "Fizz")>]
[<InlineData(19, "19")>]
[<InlineData(20, "Buzz")>]
[<InlineData(30, "FizzBuzz")>]
[<InlineData(31, "Fizz")>]
[<InlineData(32, "Fizz")>]
[<InlineData(33, "Fizz")>]
[<InlineData(34, "Fizz")>]
[<InlineData(35, "FizzBuzz")>]
[<InlineData(36, "Fizz")>]
[<InlineData(37, "Fizz")>]
[<InlineData(38, "Fizz")>]
[<InlineData(39, "Fizz")>]
[<InlineData(50, "Buzz")>]
[<InlineData(51, "FizzBuzz")>]
[<InlineData(52, "Buzz")>]
[<InlineData(53, "FizzBuzz")>]
[<InlineData(54, "FizzBuzz")>]
[<InlineData(55, "Buzz")>]
[<InlineData(56, "Buzz")>]
[<InlineData(57, "FizzBuzz")>]
[<InlineData(58, "Buzz")>]
[<InlineData(59, "Buzz")>]
let FizzBuzzReturnsCorrectResult number expected =
    number
    |> FizzBuzz
    |> should equal expected

This is the same test code as before, only with new or modified test data.

Implementation #

Compared with the stage 1 implementation, my implementation to meet the new requirements is much more complex. First, I'll post the entire code listing and then walk you through the details:

let FizzBuzz number =
    let arithmeticFizzBuzz number =
        seq {
            if number % 3 = 0 then yield "Fizz"
            if number % 5 = 0 then yield "Buzz"
            }
 
    let digitalFizzBuzz digit =
        seq {
            if digit = 3 then yield "Fizz"
            if digit = 5 then yield "Buzz"
            }
 
    let rec digitize number =
        seq {
                yield number % 10
                let aTenth = number / 10
                if aTenth >= 1 then yield! digitize aTenth
            }
 
    let arithmeticFizzBuzzes = number |> arithmeticFizzBuzz
    let digitalFizzBuzzes = number
                            |> digitize
                            |> Seq.collect digitalFizzBuzz
 
    let fizzOrBuzz = arithmeticFizzBuzzes
                     |> Seq.append digitalFizzBuzzes
                     |> Seq.distinct
                     |> Seq.toArray
                     |> Array.sort
                     |> Array.rev
                     |> String.Concat
 
    if fizzOrBuzz = ""
    then number.ToString()
    else fizzOrBuzz

First of all, you may wonder where the original implementation went. According to the requirements, the function must still 'Fizz' or 'Buzz' when a number is divisible by 3 or 5. This is handled by the nested arithmeticFizzBuzz function:

let arithmeticFizzBuzz number =
    seq {
        if number % 3 = 0 then yield "Fizz"
        if number % 5 = 0 then yield "Buzz"
        }

The seq symbol specifies a sequence expression, which means that everything within the curly brackets is expected to produce parts of a sequence. It works a bit like the yield keyword in C#.

Due to F#'s strong type inference, the type of the function is int -> seq<string>, which means that it takes an integer as input and returns a sequence of strings. In C# an equivalent signature would be IEnumerable<string> arithmeticFizzBuzz(int number). This function produces a sequence of strings depending on the input.

  • 1 produces an empty sequence.
  • 2 produces an empty sequence.
  • 3 produces a sequence containing the single string "Fizz".
  • 4 produces an empty sequence.
  • 5 produces a sequence containing the single string "Buzz".
  • 6 produces a sequence containing the single string "Fizz".
  • 15 produces a sequence containing the strings "Fizz" and "Buzz" (in that order).

That doesn't quite sound like the original requirements, but the trick will be to concatenate the strings. Thus, an empty sequence will be "", "Fizz" will be "Fizz", "Buzz" will be "Buzz", but "Fizz" and "Buzz" will become "FizzBuzz".
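If you're more at home in C#, the nested function corresponds roughly to the following iterator. This is only a sketch for comparison; the FizzBuzzSketch class name is made up for the example and isn't part of the kata solution.

```csharp
using System.Collections.Generic;

// Hypothetical C# counterpart to the F# arithmeticFizzBuzz function.
public static class FizzBuzzSketch
{
    public static IEnumerable<string> ArithmeticFizzBuzz(int number)
    {
        // Each matching rule contributes one string to the sequence;
        // if no rule matches, the sequence is empty.
        if (number % 3 == 0) yield return "Fizz";
        if (number % 5 == 0) yield return "Buzz";
    }
}
```

Passing the result of ArithmeticFizzBuzz(15) to string.Concat produces "FizzBuzz", while doing the same for 1 produces the empty string.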

The digitalFizzBuzz function works in much the same way, but expects only a single digit.

let digitalFizzBuzz digit =
    seq {
        if digit = 3 then yield "Fizz"
        if digit = 5 then yield "Buzz"
        }

  • 1 produces an empty sequence.
  • 2 produces an empty sequence.
  • 3 produces a sequence containing the single string "Fizz".
  • 4 produces an empty sequence.
  • 5 produces a sequence containing the single string "Buzz".
  • 6 produces an empty sequence.

In order to be able to apply the new rule of Fizzing and Buzzing if a digit is 3 or 5, it's necessary to split a number into digits. This is done by the recursive digitize function:

let rec digitize number =
    seq {
            yield number % 10
            let aTenth = number / 10
            if aTenth >= 1 then yield! digitize aTenth
        }

This function works recursively by first yielding the remainder of a division by 10, and then calling itself with a tenth of the original number. Since the number is an integer, the division also produces an integer (any fractional part is discarded). The function produces a sequence of digits, but in reverse order, least significant digit first.

  • 1 produces a sequence containing 1.
  • 2 produces a sequence containing 2.
  • 12 produces a sequence containing 2 followed by 1.
  • 23 produces a sequence containing 3 followed by 2.
  • 148 produces 8, 4, 1.
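For comparison, here's a rough C# counterpart. It's a sketch only, and the DigitizeSketch class name is invented for the example. C# has no direct equivalent of yield!, so the recursive results are re-yielded one at a time:

```csharp
using System.Collections.Generic;

// Hypothetical C# counterpart to the F# digitize function.
public static class DigitizeSketch
{
    public static IEnumerable<int> Digitize(int number)
    {
        // The remainder is the least significant digit.
        yield return number % 10;

        // Integer division discards the digit just yielded.
        var aTenth = number / 10;
        if (aTenth >= 1)
        {
            foreach (var digit in Digitize(aTenth))
                yield return digit;
        }
    }
}
```

Calling Digitize(148) yields 8, 4, and 1, in that order.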

This provides all the building blocks. To get the arithmetic (original) FizzBuzzes, the number is piped into the arithmeticFizzBuzz function:

let arithmeticFizzBuzzes = number |> arithmeticFizzBuzz

In order to get the digital (new) FizzBuzzes, the number is first piped into the digitize function, and the resulting sequence of digits is then piped to the digitalFizzBuzz function by way of the Seq.collect function.

let digitalFizzBuzzes = number
                        |> digitize
                        |> Seq.collect digitalFizzBuzz

The Seq.collect function is a built-in function that takes a sequence of elements (in this case a sequence of digits) and for each element calls a method that produces a sequence of elements, and then concatenates all the produced sequences. As an example, consider the number 53.

Calling digitize with the number 53 produces the sequence { 3; 5 }. Calling digitalFizzBuzz with 3 produces the sequence { "Fizz" } and calling digitalFizzBuzz with 5 produces { "Buzz" }. Seq.collect concatenates these two sequences to produce the single sequence { "Fizz"; "Buzz" }.
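In C#, the closest counterpart to Seq.collect is LINQ's SelectMany method. The following sketch (CollectSketch is a made-up name, not part of the kata solution) applies a C# version of digitalFizzBuzz to each digit and flattens the results:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical C# counterpart to piping digits through Seq.collect.
public static class CollectSketch
{
    public static IEnumerable<string> DigitalFizzBuzz(int digit)
    {
        if (digit == 3) yield return "Fizz";
        if (digit == 5) yield return "Buzz";
    }

    public static IEnumerable<string> DigitalFizzBuzzes(
        IEnumerable<int> digits)
    {
        // SelectMany maps each digit to a sequence and concatenates
        // all of those sequences into a single one.
        return digits.SelectMany(d => DigitalFizzBuzz(d));
    }
}
```

Given the digits 3 and 5 (the digitized form of 53), DigitalFizzBuzzes produces "Fizz" followed by "Buzz".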

Now we have two sequences of "Fizz" or "Buzz" strings - one produced by the old, arithmetic function, and one produced by the new, digital function. These two sequences can now be merged and ordered with the purpose of producing a single string:

let fizzOrBuzz = arithmeticFizzBuzzes
                 |> Seq.append digitalFizzBuzzes
                 |> Seq.distinct
                 |> Seq.toArray
                 |> Array.sort
                 |> Array.rev
                 |> String.Concat

First, the Seq.append function simply concatenates the two sequences into a single sequence. This could potentially result in a sequence like this: { "Fizz"; "Buzz"; "Fizz" }. The Seq.distinct function gets rid of the duplicates, but the ordering may be wrong - the sequence may look like this: { "Buzz"; "Fizz" }. This can be fixed by sorting the sequence, but sorting alphabetically would always put "Buzz" before "Fizz" so it's also necessary to reverse the sequence. There's no function in the Seq module which can reverse a sequence, so first the Seq.toArray function is used to convert the sequence to an array. After sorting and reversing the array, the result is one of four arrays: [], [ "Fizz" ], [ "Buzz" ], or [ "Fizz"; "Buzz" ]. The last step is to concatenate these string arrays to a single string using the String.Concat BCL method.

If there were no Fizzes or Buzzes, the string will be empty, in which case the number is converted to a string and returned; otherwise, the fizzOrBuzz string is returned.

if fizzOrBuzz = ""
then number.ToString()
else fizzOrBuzz

To print the FizzBuzz list for numbers from 1 to 100 the same solution as before can be used.

What I like about Functional Programming is that data just flows through the function. There's no state and no mutation - only operations on sequences of data.

