Code Smell: Automatic Property

Thursday, 26 May 2011 13:33:13 UTC

This post is the third in a series about Poka-yoke Design - also known as encapsulation.

Automatic properties are one of the most redundant features of C#. I know that some people really love them, but they address a problem you shouldn't have in the first place.

I totally agree that code like this looks redundant:

private string name;
public string Name
{
    get { return this.name; }
    set { this.name = value; }
}

However, the solution is not to write this instead:

public string Name { get; set; }

The problem with the first code snippet isn't that it contains too much ceremony. The problem is that it breaks encapsulation. In fact

"[…] getters and setters do not achieve encapsulation or information hiding: they are a language-legitimized way to violate them."

James O. Coplien & Gertrud Bjørnvig. Lean Architecture. Wiley. 2010. p. 134.

While I personally think that properties do have their uses, I very rarely find use for automatic properties. They are never appropriate for reference types, and only rarely for value types.

Code Smell: Automatic Reference Type Property #

First of all, let's consider the very large set of properties that expose a reference type.

In the case of reference types, null is a possible value. However, when we think about Poka-yoke design, null is never an appropriate value because it leads to NullReferenceExceptions. The Null Object pattern provides a better alternative to deal with situations where a value might be undefined.

In other words, an automatic property like the Name property above is never appropriate. The setter must have some kind of Guard Clause to protect it against null (and possibly other invalid values). Here's the most fundamental example:

private string name;
public string Name
{
    get { return this.name; }
    set 
    {
        if (value == null)
        {
            throw new ArgumentNullException("value");
        }
        this.name = value; 
    }
}

As an alternative, a Guard Clause could also check for null and provide a default Null Object in the cases where the assigned value is null:

private string name;
public string Name
{
    get { return this.name; }
    set 
    {
        if (value == null)
        {
            this.name = "";
            return;
        }
        this.name = value; 
    }
}

However, this implementation contains a POLA violation because the getter sometimes returns a different value than what was assigned. It's possible to fix this problem by adding an associated boolean field indicating whether the name was assigned null so that null can be returned from the setter in this special case, but that leads to another code smell.

Code Smell: Automatic Value Type Property #

If the type of the property is a value type, the case is less clear-cut because value types can't be null. This means that a Null Guard is never appropriate. However, directly consuming a value type may still be inappropriate. In fact, it's only appropriate if the class can meaningfully accept and handle any value of that type.

If, for example, the class can really only handle a certain subset of all possible values, a Guard Clause must be introduced. Consider this example:

public int RetryCount { get; set; }

This property might be used to set the appropriate number or retries for a given operation. The problem with using an automatic property is that it's possible to assign a negative value to it, and that wouldn't make any sense. One possible remedy is to add a Guard Clause:

private int retryCount;
public int RetryCount
{
    get { return this.retryCount; }
    set
    {
        if (value < 0)
        {
            throw new ArgumentOutOfRangeException();
        }
        this.retryCount = value;
    }
}

However, in many cases, exposing a primitive property is more likely to be a case of Primitive Obsession.

Improved Design: Guard Clause #

As I described above, the most immediate fix for automatic properties is to properly implement the property with a Guard Clause. This ensures that the class' invariants are properly encapsulated.

Improved Design: Value Object Property #

When the automatic property is a value type, a Guard Clause may still be in order. However, when the property is really a symptom of Primitive Obsession, a better alternative is to introduce a proper Value Object.

Consider, as an example, this property:

public int Temperature { get; set; }

This is bad design for a number of reasons. It doesn't communicate the unit of measure and allows unbounded values to be assigned. What happens if - 100 is assigned? If the unit of measure is Celcius it should succeed, although in the case when it's Kelvin, it should fail. No matter the unit of measure, attempting to assign int.MinValue should fail.

A more robust design can be had if we introduce a new Temperature type and change the property to have that type. Apart from protection of invariants it would also encapsulate conversion between different temperature scales.

However, if that Value Object is implemented as a reference type the situation is equivalent to the situation described above, and a Null Guard is necessary. Only in the case where the Value Object is implemented as a value type is an anonymous property appropriate.

The bottom line is that automatic properties are rarely appropriate. In fact, they are only appropriate when the type of the property is a value type and all conceivable values are allowed. Since there are a few cases where automatic properties are appropriate their use can't be entirely dismissed, but it should be treated as warranting further investigation. It's a code smell, not an anti-pattern.

On a different note properties also violate the Law of Demeter, but that's the topic of a future blog post...


Comments

James Nail #
Hi Mark,
I'm enjoying this series, and you can be sure I'll be sharing it with my teammates.
I wanted to point out a typo that might confuse readers, though -- you used the term "anonymous properties" instead of "automatic properties" toward the end of the article, which threw me off at first.
Anyway, keep up the great work!
2011-05-26 14:07 UTC
Omer Katz #
What about NetworkStream's Connected property?
It has a private set and a public get and returns a boolean that indicates if the underlaying socket is connected.
Nothing wrong with that.

By your logic, any .NET ORM is bad because it uses properties and violates the Law of Demeter.
2011-05-26 14:09 UTC
Isn't the code smell "public setters" rather than automatic properties? Making the set protected or private solves this issue. E.g.

public decimal MyValue { get; private set; }

With such usages it is legitimate for the guard code to be in a state mutating method, constructor validation etc
2011-05-26 14:15 UTC
Good points, and I completely agree with the { get; set; } kind of auto-implemented properties - in fact I don't think I've ever checked one in.

However, I'm forever using the {get; private set; } kind of auto-implemented properties - what are your opinions on those? What about a hypothetical and confusingly named { get; readonly set; } auto-implemented property?
2011-05-26 14:25 UTC
James, thanks for pointing that out. I've now corrected the error.
2011-05-26 18:31 UTC
Omer, a boolean property falls into that rare case described during the discussion of value type properties. If the class exposing the property truly can accept any value that the type can possibly take, an automatic property is in order. A boolean value can only take one of two values, and both are legal. Thus, no Guard Clause is required. However, please notice that this scenario is not (or at least should not be) representative.

Regarding ORMs you are now jumping ahead of me :) I'll come to that in a later blog post, but yes, I genuinely believe that the whole concept of an ORM (at least in any incarnations I've seen so far) is a fallacy.
2011-05-26 18:39 UTC
Ben, you are correct that the issue is at least most prevalent when the setter is public. However, keep in mind that we should always treat protected as public, as anyone can always come by and derive from a class.

For private setters it's not so clear-cut. In most cases when I have read-only properties it tends to be because the property is being supplied through the constructor and subsequently treated as immutable. When that is the case, I much prefer a backing field because it can be declared as readonly.

In other words I rarely write code where the class itself can mutate the value of a read-only property, but I know that other people do. When this is the case, I could argue that I'd still prefer the setter to protect the invariants of the class, but that argument tends to become a little silly because with a backing field the class could always just mutate the value directly. In any case, if one has a class that needs that kind of protection from its own internals one probably has more serious problems :)

Keep in mind that this whole discussion about encapsulation relates to the public API of classes. Obviously, once you start poking around in the internals of a class, you're already looking at the source code and thus past the encapsulation perimeter.
2011-05-26 18:50 UTC
Alex, see my previous comment regarding { get; private set; }

Regarding { get; readonly set; } I would probably use something like this assuming this was something that could only be used from the constructor, because I'd still be able to add the appropriate Guard Clause in the constructor before invoking the setter.
2011-05-26 18:54 UTC
I totally agree with your post. However, there are some cases when you do not need to make checks for validity for reference types. If a class with such a property does not provide a default non-null value for it, then any other checks are redundant as the user may set or may not set a value to that property. This way it will still have its default value of NULL.

Using properties with public setters in really tricky. When you have a class with properties and methods, users do not know which properties should be set before a method is called which will result in a run-time exception. That's why required parameters should be provided directly to constructors and methods.
2011-05-27 07:52 UTC
If the value is optional a public setter is in order. However, allowing the field to be null is not, as it would require the rest of the class to have to check for null every time it uses the field. A Null Object is a better alternative, which again leads to the rule that a Null Guard is always appropriate for reference types. However, in certain cases, that Null Guard might not throw an exception, but rather replace the null with a Null Object.
2011-05-29 08:55 UTC
Theo Andersen #
Hi Mark,
I really like your design smell series posts, and i have a few questions.

I get and agree on your points about encapsulation and Automatic Properties, but as i see it a lot of frameworks forces you to break this encapsulation. An example is XML Serialization (ISerializable), which needs an empty constructor and then access to all the public properties to set them.

Couldn't this lead to both the automatic property and temporal coupling smells, as you need to have basic properties so the framework can set them, plus an empty constructor which then breaks your temporal encapsulation?
To adhere to the serialization interface you're forced to open up your code (or create a DTO).

2011-05-30 09:18 UTC
Theo, please see my latest post that touches on exactly that point.
2011-05-31 18:11 UTC
I wonder if there is a more elegant way to do this than creating guard clauses for every reference type property. It would have been nice to do this in a more declerative way. I posted a question on SO as well: http://stackoverflow.com/questions/6773189/auto-implemented-properties-with-non-null-check
2011-07-21 16:38 UTC
Ben Robinson #
Your points are arguments against using automatic properties inappropriately and are far from general cases. There are perfectly valid reasons to allow the setting of properties to null (e.g. using ORM how would you set a property mapped DB field to null) i think your example of a setter that throws a null reference exception is a code smell itself unless you have a specific reason for disallowing it. Your example of the RetryCount is simply an inappropriate use of an automatic property not at all an argument against automatic properties. Your example of temperature is an example of inappropriate type and has nothing to do with automatic properties at all, with the correct type an automatic property would make perfect sense as you could build the validation into the Temperature class. Automatic properties as just shorthand for simple field backed properties and where simple field back properties are correct then so are automatic properties. Simple field backed properties with no validation are sometimes used when they shouldn't be, but that has nothing to do with automatic properties.
2011-09-02 08:38 UTC
I agree with Ben's last point, I believe automatic property(or simple field backed property) is just a way to deal with how C# compiler does assembly linking, and using simple properties allows minor modifications to them without recompiling all dependent assemblies.

I somewhat agree to your Guard Clause argument, it's something I should pay more attention to instead of blindly implementing automatic properties. But I do have some doubts, in many cases, there is no proper default value, like Birth Date, Country, etc. In fact, I think that's the case most of the time, how would you deal with those situations?
2012-01-10 17:04 UTC
2012-01-15 16:40 UTC
Juliano Leal Goncalves #
Mark, have you considered revisiting this topic or adding some more info to this particular post after the introduction of automatic readonly properties? Most of what you discuss here is still applicable, but some of the newer constructs prevents a fair share of the problems associated with properties.
2018-03-20 14:15 UTC

Juliano, thank you for writing. You're right that after the publication of this post, C# got more features, at least one of which seems to relate to this article. When I write C# code today, I often use automatic read-only properties, combined with appropriate Guard Clauses in the constructor. I find that a useful language feature.

That particular later language feature (automatically implemented read-only properties) is, however, only tangentially related to the topic of this post. Did you have some other language construct in mind?

2018-03-20 16:12 UTC

Design Smell: Primitive Obsession

Wednesday, 25 May 2011 15:03:31 UTC

This post is the second in a series about Poka-yoke Design - also known as encapsulation.

Many classes have a tendency to consume or expose primitive values like integers and strings. While such primitive types exist on any platform, they tend to lead to procedural code. Furthermore they often break encapsulation by allowing invalid values to be assigned.

This problem has been addressed many times before. Years ago Jimmy Bogard provided an excellent treatment of the issue, as well as guidance on how to resolve it. In relation to AutoFixture I also touched upon the subject some time ago. As such, the current post is mostly a placeholder.

However, it's worth noting that both Jimmy's and my own post address the concern that strings and integers do not sufficiently encapsulate the concepts of Zip codes and phone numbers.

  • When a Zip code is represented as a string it's possible to assign values such as null, string.Emtpy, “foo”, very long strings, etc. Jimmy's ZipCode class encapsulates the concept by guaranteeing that an instance can only be successfully created with a correct value.
  • When a Danish phone number is represented as an integer it's possible to assign values such as - 98, 0, int.MaxValue, etc. Once again the DanishPhoneNumber class from the above example encapsulates the concept by guaranteeing that an instance can only be successfully created with a correct value.

Encapsulation is broken unless the concept represented by a primitive value can truly take any of the possible values of the primitive type. This is rarely the case.

Design Smell #

A class consumes a primitive type. However, further analysis shows that not all possible values of the type are legal values.

Improved Design #

Encapsulate the primitive value in a Value Object that contains appropriate Guard Clauses etc. to guarantee that only valid instances are possible.

Primitives tend to not be fail-safe, but encapsulated Value Objects are.


Comments

Jon Wingfield #
What do you mean by this statement:

Encapsulation is broken unless the concept represented by a primitive value can truly take any of the possible values of the primitive type. This is rarely the case.

Should a Qty field that only takes positive integers now be represented by a separate class? You seem to be defining encapsulation as compile-safe or something.
2011-05-25 15:52 UTC
I would very much like to be able to represent a quantity as a specialized class. Unfortunately uint is not a CLS-compliant type - otherwise I would use that one a lot.

Keep in mind that I'm talking about a code smell. This means that whenever you encounter the smell it should make you stop and think. I'm not saying that using a primitive type is always wrong.

With a quantity, and given the constraints of C#, unfortunately there isn't a lot we can do. While we could encapsulate it in a PositiveInteger struct, it wouldn't add much value from a feedback perspective since we can't make the compiler choke on a negative value.

However, a hypothetical PositiveInteger struct would still add the value that it encapsulates the invariant in one place instead of spreading it out all over the code base.

Still, a positive integer is such a basic concept that it never changes. This means that even if you spread Guard Clauses that check for negative values all over your code base, you may not have much of a maintenance burden, since the Guard will never change. Still, you might forget it in a couple of places.

In any case I like the Temperature example above much better because it not only provides safety, but also encapsulates the concept as well as provides conversion logic etc.
2011-05-25 18:03 UTC
Thomas #
I try whenever possible to design that way my api especially in the domain layer. However I run into some corner cases with RESTful web services when it's more convenient for the client to provide a simple primitive than the complex object. Then on the server side the check is done to know if the provided value is correct. Would you bother to enforce it that case?
2011-05-31 22:38 UTC
Thomas, did you see my post on application boundaries? I think it ought to answer your question :)
2011-06-01 05:45 UTC
Thomas #
Not yet:) I'm catching up with the old one first. But I go to it:)
2011-06-01 06:29 UTC
Scott Peterson #
Hello Mark:

I am revisiting an old post in hopes you can shed some light on Value objects and their limits. I am currently trying to implement a Password class, which seems like a perfect example of something that should not be handled as a simple string. I modelled my Password object based on examples you've provided, in that I pass in a string to a constructor and then run an IsValid method to see if the string meets our business rules for passwords (length, types of characters, etc.). That's fine as it is, and I have unit tests to make sure all is well. But there are more business rules to a password. I need to have a privately set DateCreated field, and I need to store the number of days the password is valid, while providing a function to see if the password is still valid based on the DateCreated and the number of days the password is valid. However, adding these things to my value object seems like I'm polluting it. Plus, I want to pass in the number of days valid when the object is created, so now I have two parameters, which causes problems if I want to have an explicit operator. I thought about creating a PasswordPrimitive class and then a Password class that inherits the PasswordPrimitive class, but that seems messy.

If you have any thoughts and/or guidance, I'd appreciate the input.

Regards,
Scott
2015-02-26 21:04 UTC

Scott, thank you for writing. A Value Object can contain more than a single primitive value. The canonical Value Object example is Money, and you often see Money examples where a Money value is composed of an amount and a currency; one example in the literature is Kent Beck's implementation in Test Driven Development: By Example, which contains amount and currency.

Thus, I don't see any intrinsic problem with your password class containing both a string (or a byte array?) and an expiration time.

It's true that you can no longer have a lossless conversion from your Value Object to a primitive value, but that's not a requirement for it to be a Value Object.

(BTW, I hope you don't store passwords, but only the hashes!)

2015-02-27 7:29 UTC

Design Smell: Temporal Coupling

Tuesday, 24 May 2011 14:00:42 UTC

This post is the first in a series about Poka-yoke Design - also known as encapsulation.

A common problem in API design is temporal coupling, which occurs when there's an implicit relationship between two, or more, members of a class requiring clients to invoke one member before the other. This tightly couples the members in the temporal dimension.

The archetypical example is the use of an Initialize method, although copious other examples can be found - even in the BCL. As an example, this usage of EndpointAddressBuilder compiles, but fails at run-time:

var b = new EndpointAddressBuilder();
var e = b.ToEndpointAddress();

It turns out that at least an URI is required before an EndpointAddress can be created. The following code compiles and succeeds at run-time:

var b = new EndpointAddressBuilder();
b.Uri = new UriBuilder().Uri;
var e = b.ToEndpointAddress();

The API provides no hint that this is necessary, but there's a temporal coupling between the Uri property and the ToEndpointAddress method.

In the rest of the post I will provide a more complete example, as well as a guideline to improve the API towards Poka-yoke Design.

Smell Example #

This example describes a more abstract code smell, exhibited by the Smell class. The public API looks like this:

public class Smell
{
    public void Initialize(string name)
 
    public string Spread()
}

Semantically the name of the Initialize method is obviously a clue, but on a structural level this API gives us no indication of temporal coupling. Thus, code like this compiles, but throws an exception at run-time:

var s = new Smell();
var n = s.Spread();

It turns out that the Spread method throws an InvalidOperationException because the Smell has not been initialized with a name. The problem with the Smell class is that it doesn't properly protect its invariants. In other words, encapsulation is broken.

To fix the issue the Initialize method must be invoked before the Spread method:

var sut = new Smell();
sut.Initialize("Sulphur");
var n = sut.Spread();

While it's possible to write unit tests that explore the behavior of the Smell class, it would be better if the design was improved to enable the compiler to provide feedback.

Improvement: Constructor Injection #

Encapsulation (Poka-yoke style) requires that the class can never be in an inconsistent state. Since the name of the smell is required, a guarantee that it is always available must be built into the class. If no good default value is available, the name must be requested via the constructor:

public class Fragrance : IFragrance
{
    private readonly string name;
 
    public Fragrance(string name)
    {
        if (name == null)
        {
            throw new ArgumentNullException("name");
        }
 
        this.name = name;
    }
 
    public string Spread()
    {
        return this.name;
    }
}

This effectively guarantees that the name is always available in all instances of the class. There  are also positive side effects:

  • The cyclomatic complexity of the class has been reduced
  • The class is now immutable, and thereby thread-safe

However, there are times when the original version of the class implements an interface that causes the temporal coupling. It might have looked like this:

public interface ISmell
{
    void Initialize(string name);
 
    string Spread();
}

In many cases the injected value (name) is unknown until run-time, in which case straight use of the constructor seems prohibitive - after all, the constructor is an implementation detail and not part of the loosely coupled API. When programming against an interface it's not possible to invoke the constructor.

There's a solution for that as well.

Improvement: Abstract Factory #

To decouple the methods in the ISmell (ha ha) interface the Initialize method can be moved to a new interface. Instead of mutating the (inconsistent) state of a class, the Create method (formerly known as Initialize) returns a new instance of the IFragrance interface:

public interface IFragranceFactory
{
    IFragrance Create(string name);
}

The implementation is straightforward:

public class FragranceFactory : IFragranceFactory
{
    public IFragrance Create(string name)
    {
        if (name == null)
        {
            throw new ArgumentNullException("name");
        }
        return new Fragrance(name);
    }
}

This enables encapsulation because both the FragranceFactory and Fragrance classes protect their invariants. They can never be in an inconsistent state. A client previously interacting with the ISmell interface can use the IFragranceFactory/IFragrance combination to achieve the same funcionality:

var f = factory.Create(name);
var n = f.Spread();

This is better because improper use of the API can now be detected by the compiler instead of at run-time. An interesting side-effect by moving towards a more statically declared interaction structure is that classes tend towards immutability. Immutable classes are automatically thread-safe, which is an increasingly important trait in the (relatively) new multi-core era.


Comments

Scott #
Great insight again in to core principles that often get over looked even by big names in software development.

Looking forward to the rest of the series.
2011-05-24 16:02 UTC
juan agui #
Have you ever considered renaming your blog to programming 101? It's amazing how most my colleagues are still writing procedural code and violate each and every OOP principle. Fortunately there are blogs like yours that explain brilliantly and concisely these topics. Keep up the good job!


2011-05-24 17:28 UTC
Although, I agree with the fact that temporal coupling must be avoided, when possible, I find the solution you propose quite overkill (as always with that dependency injection jazz, IMHO). With the initial design, at least, as a developer, the problem I had were simple to solve (ie: add the Initialize call). Now with DI, as a user of the Small API, I may have to deal with all sort of runtime problems, just because I didn't "read" the API, or the documentation properly :-) Plus, now I have 4 types instead of 1 loaded in my domain. A simple design may not be the most elegant, but it bears simple problems.
2011-05-25 05:50 UTC
So, suppose I want my FragranceFactory to return a Fragrance that implements ISmell (as you have it above) instead of an IFragrance--perhaps because I can't change ISmell). I should just have FragranceFactory return initialized Fragrances wherein the Initialize method does nothing, right? This would accomplish all our goals, although ISmell would still be confusing to the consumer.
2011-05-25 17:27 UTC
Simon, exactly the opposite is true. With the Fragrance API you wouldn't have to deal with any runtime problems, because as soon as you can compile, the design of the code makes it very likely that it's also going to work at run-time (except for null values, which you can't escape no matter what you do).

Getting feedback sooner is betting than getting feedback later. I much prefer getting a compiler error over a run-time exception.

The number of types is hardly a measure of the quality of the code, but if it is, the more the better. The Single Responsibility Principle favors many small classes over a few God Classes.

However, even if you don't agree, then comparing one type against four doesn't make sense. In the absence of interfaces there'd be one class in each alternative (Smell versus Fragrance). When you add interfaces into the mix, you have two types (Smell and ISmell) in the temporally coupled API against four types in the safe API.
2011-05-25 18:19 UTC
Patrick, yes, that might be a possible variation, although perhaps a bit exotic... In which scenario would that be the case?
2011-05-25 18:21 UTC
Not sure to follow you on the number of types. Initially, I had 1 Smell class, and in then end I have 2 classes and 2 interfaces. Or maybe I missed something.
In a real world project, multiply classes by 2, 3, 4 or more (lets add some WCF interfaces to this!) is indeed an issue, unless you generate some or all of it.
I'm not saying what you propose can't be done, or is inelegant. It is exactly the opposite, it's clever and smart, and is the style of code that can be taught in classes or blogged about, but the generalization of these principles can lead to bloated code and huge assemblies if applied massively.

About maintenability, I pretend your code is harder to extend. Initially, I had one class that I could easily extend and the JIT helps me a lot here. I can create a new Spread(arguments) methods easily without breaking any of my existing clients (and without changing the semantics coupling). It's less easy with interfaces, do I need one new ISmellxx interfaces for every new method? New types again? Classes are extensible, interfaces are carved in stone forever - in fact, most people *do* change their interfaces, but that's another story :-)
2011-05-26 07:08 UTC
This blog post discusses two scenarios. The first scenario is that you have only the Smell class. In that scenario, you can replace the Smell class with the Fragrance class (minus the IFragrance interface). No new types are necessary, so you start out with one type and end up with one type.

The second scenarios is when you have a Smell class implementing an ISmell interface. In that case you'd need to go from two to four types.

When it comes to maintainability, I find that the SOLID principles are some of the best guidelines around. The Open/Closed Principle explicitly states that classes should be closed for modification, but open for extensibility. In other words, we extend code only by adding new classes.
2011-05-26 18:24 UTC
_ikke_ #
Just to be sure:

The factory class is allowed to create concrete instances without using a DI container? And any dependencies that the concrete class needs will have to be passed to the factory class?

2011-05-27 10:49 UTC
Use of a DI Container is IMO orthogonal to the issue.

Conceptually it's best to think about any such factories as concrete classes. If they need to pass dependencies to the objects they create, they can do so by requesting them by Constructor Injection. In other words you are correct that "any dependencies that the concrete class needs will have to be passed to the factory class" (excluding those that are passed as parameters to the Create method).

On a practical side, you can always consider implementing the factory as an infrastructure component based on a container. Essentially, the factory would be an Adapter over the container. However, if you do this, you must be sure to place the Adapter in the Composition Root of the application, as otherwise you risk that the container reference leaks into your application code. This means that the concrete class and the factory will end up being implemented in two different assemblies. If you use Castle Windsor, you can go all the way and not even implement the container at all because instead you can leverage the Typed Factory facility.
2011-05-28 21:52 UTC
Mohammed #
All of our code is full of temporal couplings. You can not prevent it because we solve our problems in doing first step1, followed by step2, and so on and so forth. So I don't think you can create code where there is no temporal coupling. What you can do on the other hand, and what I think this blog should have addressed, is to always create valid objects. The design of the class Smell makes it possible creating the object of type Smell without a name. Right after you issued the keyword new() your object is invalid. Any method you call after that will operate on an invalid object (Initialize(...) tries to remedy that). Your solution on the other hand is the right one.
2011-06-17 13:07 UTC
Daniel Hilgarth #
Mark, thanks for this and the other articles in this and other series.
I have one question however:
FragranceFactory.Create checks name for null. Why?
I see the following problems with it:
1) Code duplication:
As Fragrance can be instantiated without the use of the factory, the null check can't be removed from the constructor of Fragrance.
2) Knowledge of implementation details of Fragrance:
The FragranceFactory knows that Fragrance can't work with a name that is null. This is not implicit, this is explicit knowledge. Another implementation of IFragrance could simply assign a default value if null is passed. As FragranceFactory is coupled tightly to Fragrancy, that is not such a big problem. Still I think it is not correct
3) No benefit whatsoever:
Adding the null check in the Create method brings no benefit at all. On the contrary: If we were to change the constructor of Fragrance to allow null and to assign a default value, but would forget to change FragranceFactory.Create accordingly, the semantics of the object creation would differ, depending on how the object is created.

Can you please explain what I have missed and why you decided to put the null check into the Create method?
2011-07-21 12:34 UTC
Daniel, thank you for your insightful comment - you are completely correct.

I just added the Guard Clause out of habit, but your arguments are valid and I will in the future be more careful with my Guard Clauses. It only makes sense that one should only have a Guard Clause if one uses the argument in question. That's not the case when it's just being passed on, so it should not be there.

Thanks for pointing that out.
2011-07-21 15:03 UTC
Mark, thanks for your answer. I hope I didn't sound like I was nitpicking. I simply was a bit puzzled about it.
2011-07-21 15:25 UTC
Not at all - I'm happy that you pointed out something I hadn't properly considered. I learned something from it :)
2011-07-21 15:28 UTC
Hey Mark,
What about fluent interfaces? They completely rely on temporal coupling by design:

var query = new QueryBuilder().WithField("dressing").NotNull().Containing("bacon");

Do you consider this an edge case, or do you actually dislike fluent interfaces in general?
2011-08-02 17:50 UTC
Not at all (to both statements). A Fluent Interface doesn't rely on Temporal Coupling at all - I frequently implement Fluent APIs with immutable types.

See this blog post for a very simple example. For a very complex example, see AutoFixture's Ploeh.AutoFixture.Dsl namespace, which is kicked off by the fixture.Build<T>() method.

In any case, Temporal Coupling doesn't disallow mutability - it just states that if you have mutating methods, the order in which they are invoked (if at all) mustn't matter.
2011-08-02 18:23 UTC
Emanuel Pasat #
Offtopic question: striving for immutability on the domain model is a good practice (or only value objects should be immutable)?
2011-09-16 14:32 UTC
The most common approach is certainly to model Entities as mutable objects while Value Objects are kept immutable.

When you use 'classic' CRUD-style Repositories or Unit of Work, that may also be the best match to that style of architecture. However, my more recent experience with CQRS and Event Sourcing seems to indicate to me that an immutable domain model begins to make a lot more sense. However, I've still to gain more experience with this before I can say anything with confidence.

It has been my general experience over the last couple of years that the more immutable objects I can define, the easier the code is to work with and reason about.
2011-09-16 19:18 UTC
Luis #
Quick noob question, why the IFragranceFactory interface? Your code factory.Create("name") which i imagine would be preceded by "var factory = new FragranceFactory()" does not use the interface, does it?
2012-09-11 14:12 UTC
An IFragranceFactory instance can be injected into any client that needs it, freeing it from having to create the tightly coupled instance itself.

The IFragranceFactory instance can be created in the Composition Root, potentially far removed from any clients. At this point it could be created with the 'new' keyword, or by a DI Container.
2012-09-11 15:43 UTC
Luis #
i came to this blog post with the specific intention of reading about temporal coupling, but after reading your reply to my message and feeling utterly confused i decided to go and read your book, maybe after reading it i'll actually understand what you're saying, and gather some knowledge about DI in the process, obviously this DI is a thing
2012-09-21 12:45 UTC
Sorry, I didn't mean to confuse you...
2012-09-21 13:15 UTC
James Wheeler #
Hi Mark, this is an old post now but it's still very relevant for me. What about the case when initialization is a time consuming process?

For example, at my company we work with a lot of devices and we often end up writing initialization methods for some Type, let's call it 'FooDevice,' that provides functionality for that kind of device.

Let's say our FooDevice implements an interface very similar to ISmell:
IFooDevice
{
    Task Initialize(string comPort);
    int DoSomething();
}
This particular initialization method accepts as input parameter the COM port on which the device is connected to and through which we send serial commands to control the device.

So the initialization method may then create a serial port object for that port, configure the device in various ways (such as disabling physical buttons, since we're now controlling it programatically) and so forth.

The initialization may be a time consuming process that can potentially take in excess of a few seconds or longer.

Following your advice, we should apply a refactoring here. The constructor for FooDevice should accept the COM port instead. Though that seems to imply that initialization should be done in the constructor:
class FooDevice : IFooDevice
{
    public FooDevice(string comPort)
    {
        // Initialize(comPort);
    }
    
    private void Initialize(string comPort) { /* ... */ }
}
Yet that becomes problematic, because now we've lost the ability to return a Task. But of course if we have our IFooDeviceFactory, it can provide the Task:
IFooDeviceFactory
{
    Task<IFooDevice> Create(string comPort);
}
Now, if I understand your example correctly, then the whole purpose of providing an Abstract Factory is to satisify the requirement that the client be in control of when the object is created. Or else the client could just directly accept an IFragrance and no factory would be needed. Another client could feasibily skip the factory and invoke the Fragrance constructor directly.

But if we allow the same behavior with my above example, we're back in the same awkward situation with the constructor.

What can we do then? Should we make the constructor private, and instead provide a static factory method like the following?
class FooDevice : IFooDevice
{
    private FooDevice() { }

    public static Task<FooDevice> Create(string comPort)
    {
        var device = new FooDevice();
        await device.Initialize(string comPort);

        return device;
    }

    private Task Initialize(string comPort) { /* ... */ }
}
2021-06-07 22:22 UTC

James, thank you for writing. Based on what you describe, I find the last option, the one with a static factory method, preferable.

You may want to consider calling it something like Connect instead of Create, but then again, perhaps this isn't much of an improvement. The most important information is in the return type. Since the method returns a task, it's a strong signal that this could take some time.

What about closing the connection? Is that a concern? If so, you may want to make the object implement IDisposable. These days, various code analysers will tell you to dispose of such an object once you're done with it.

You write:

"the whole purpose of providing an Abstract Factory is to satisify the requirement that the client be in control of when the object is created"

Not quite. The main purpose of such a design is encapsulation, guaranteeing that an object can never be in an invalid state.

2021-06-09 12:42 UTC
James Wheeler #

Thanks for the reponse Mark. I too would consider calling the static factory method Connect but I stuck with Create for the sake of consistency with the example.

I'm not sure I follow your comment about encapsulation. With your example, FragranceFactory merely delegates its work to the Fragrance constructor--the Fragrance class already achieves encapsulation by requiring name as a constructor argument.

If you had instead introduced a SmellFactory that behaved like the following, then I would agree it encapsulates the required initialization protocol:

			class SmellFactory : ISmellFactory
			{
				ISmell Create(string name)
				{
					var smell = new Smell();
					smell.Initialize(name);
					return smell;
				}
			}
			

So the only usage I can see for keeping a factory around after improving the design with IFragrance is to allow the client to lazily create an instance of Fragrance without tightly coupling the client via the constructor. Otherwise, an already valid instance of IFragrance could be passed to the client.

Indeed, you preface the introduction of the Abstract Factory improvement with, "In many cases the injected value (name) is unknown until run-time..."

As for disposal, yes, we almost always need to implement IDisposable, since the supplier's managed API for the device usually implements IDisposable as well.

However, wouldn't using an Abstract Factory like this be undesirable? IDevice would need to implement IDisposable so that the client could invoke Dispose on the device instance created by the factory. But as you and Steven point out in your book on dependency injection, an abstraction that implements IDisposable is a leaky abstraction. Or would that not apply in this case?

2021-06-14 19:25 UTC

James, thank you for writing. There are two separate concerns here:

  • Guaranteeing that an object is in a valid state (encapsulation)
  • Enabling polymorphism
What valid state means is specific to an implementation; i.e. a concrete class. Thus, if FooDevice can only be initialised like shown above, that's an implementation detail.

The other concern is polymorphism. Your original definition of the IFooDevice interface had two methods: Initialize and DoSomething. In order to retain both of these capabilities, you're going to need to put both of these functions somewhere. In order to address the sequential coupling implied by that version of the interface, you'll have to distribute those methods over two different polymoprhic types:

IFooDeviceFactory
{
    Task<IFooDevice> Create(string comPort);
}

IFooDevice
{
    int DoSomething();
}

This prevents programmer error. With the original design, a client developer could call DoSomething without first calling Initialize. With the new design, this is no longer possible. You can't call DoSomething unless you have an IFooDevice instance, and you can't get an IFooDevice instance unless you call the Create method.

Is the factory required? Not necessarily. That depends on where the responsibility to create the object lies. If client code is the only code in possession of the comPort value, then you'll need the polymorphic factory in order to intialise a polymorphic object. On the other hand, if the comPort value is available to the application infrastructure (e.g. via a configuration value or environment variable) then you may not the the abstract factory. In that case, any client code that needs an IFooDevice object can receive it via the constructor, and the application's Composition Root can create the object using the concrete Create method.

As to the observation about interfaces inheriting from IDisposable, I still consider that a leaky abstraction. There's usually a way to avoid it, but it tends to depend on the details of the application architecture. Thus, it's difficult to give general advice on that topic without knowing more about the actual circumstances.

2021-06-17 10:50 UTC

Poka-yoke Design: From Smell to Fragrance

Tuesday, 24 May 2011 13:57:39 UTC

Encapsulation is one of the most misunderstood aspects of object-oriented programming. Most people seem to think that the related concept of information hiding simply means that private fields should be exposed by public properties (or getter/setter methods in languages that don't have native properties).

Have you ever wondered what's the real benefit to be derived from code like the following?

private string name;
public string Name
{
    get { return this.name; }
    set { this.name = value; }
}

This feels awfully much like redundant code to me (and automatic properties are not the answer - it's just a compiler trick that still creates private backing fields). No information is actually hidden. Derick Bailey has a good piece on why this view of encapsulation is too narrow, so I'm not going to reiterate all his points here.

So then what is encapsulation?

The whole point of object-orientation is to produce cohesive pieces of code (classes) that solve given problems once and for all, so that programmers can use those classes without having to learn about the intricate details of the implementations.

This is what encapsulation is all about: exposing a solution to a problem without requiring the consumer to fully understand the problem domain.

This is what all well-designed classes do.

  • You don't have to know the intricate details of TDS to use ADO.NET against SQL Server.
  • You don't have to know the intricate details of painting on the screen to use WPF or Windows Forms.
  • You don't have to know the intricate details of Reflection to use a DI Container.
  • You don't have to know how to efficiently sort a list in order to efficiently sort a list in .NET.
  • Etc.

What makes encapsulation so important is exactly this trait. The class must hide the information it encapsulates in order to protect it against ‘naïve' users. Wikipedia has this to say:

Hiding the internals of the object protects its integrity by preventing users from setting the internal data of the component into an invalid or inconsistent state.

Keep in mind that users are expected to not fully understand the internal implementation of a class. This makes it obvious what encapsulation is really about:

Encapsulation is a fail-safe mechanism.

By corollary, encapsulation does not mean hiding complexity. Whenever complexity is hidden (as is the case for Providers) feedback time increases. Rapid feedback is much preferred, so delaying feedback is not desirable if it can be avoided.

Encapsulation is not about hiding complexity, but conversely exposing complexity in a fail-safe manner.

In Lean this is known as Poka-yoke, so I find it only fitting to think about encapsulation as Poka-yoke Design: APIs that make it as hard as possible to do the wrong thing. Considering that compilation is the cheapest feedback mechanism, it's preferable to design APIs so that the code can only compile when classes are used correctly.

In a series of blog posts I will look at various design smells that break encapsulation, as well as provide guidance on how to improve the design to make it safer, thus going from smell to fragrance.

  1. Design Smell: Temporal Coupling
  2. Design Smell: Primitive Obsession
  3. Code Smell: Automatic Property
  4. Design Smell: Redundant Required Attribute
  5. Design Smell: Default Constructor
  6. DI Container smell: Captive Dependency

Postscript: At the Boundaries, Applications are Not Object-Oriented


Comments

Perhaps it would be beneficial to elaborate on your focus here by describing when/where these "smells" are applicable. With the .NET community (as a whole) only recently coming on to topics like anemic models, I fear that they will take this advice and apply it to every scenario.

That's not to say that this is not relevant information but surely you're not implying that's applicable to scenarios like messaging, RESTful APIs, and other circumstances that need easily serializable objecst?
2011-05-30 15:03 UTC
Kurt Guntheroth #
You're still doing it wrong. Encapsulation means that, instead of exposing data, you expose an interface that exports exactly the operations and values that are defined for your class.

For instance, if a string name is used as a sorting key, it isn't appropriate to change the name. You can make the name string public, but that implies that changing the name is a valid operation, and might lead to bugs later. Providing a const getter, but no setter for the name says "You can't change this name".

If you have a setter and a getter for a piece of data, it should be because the class needs to expose that data for the purpose of changing it. That happens a lot, and it shouldn't be viewed as unreasonable that you have a private data member and a setter/getter. It's not a waste of time or code. It's a clear contract with users of your class that you intend to provide these operations no matter how the class evolves.

One important property of good encapsulation is that you are free to change the data representation of your class if the interface remains the same, and your changes will be limited to the methods of the class itself. Want to change from a 1-based count to a zero-based index? If you exposed a member called items, you're screwed. If you exposed a method called CountGet() you're ok. Just change CountGet()'s implementation from returning items to return items+1.
2011-06-01 16:22 UTC
One thing at a time... There's a reason I themed this series of blog posts around fail-safing. I agree that encapsulation involves more than just that, but I don't think I ever said anything else.
2011-06-01 18:52 UTC
Florian Fordermaier #
Interesting post. A few years ago I did some research on the "absence of proper encapsulation language features" in C# and Java. It's an interesting approach to advance from smells to fragrance, pointing out code snippets that potentially violate encapsulation in some way. But - to be honest - this will help only those people that are even aware of these problems.
You mentioned the compiler as the first and cheapest feedback mechanism, so the target should be to achieve automation of the process of enforcing proper encapsulation with (a) a better compiler or (b) static analysis or (c) runtime functionality that can be applied minimally-invasive.
See C++'s const qualifier, an excellent example of a language feature to support proper encapsulation, this would allow for e.g. auto-properties with a getter/setter when making the setter const. Of course this will itself impact your design, but it offers a language integrated fail-safe mechanism for encapsulation. What do you think?
I also may have an additional smell for you, a violation of the law of demeter breaks encapsulation in most cases.
2011-06-16 08:12 UTC
Florian, I agree with your statement about Law of Demeter, although Fowler calls it 'the occasionally useful suggestion of Demeter' (or some such thing).

I don't have any comment on the C++ const qualifier, as I have no idea what it does...
2011-06-18 18:00 UTC
Florian Fordermaier #
Mark,

first of all I have to correct my first comment, auto properties with a const setter is definitely non-sense.
The C++ const qualifier is effectively a statically checked and compiler-enforced construct to syntactically express your objectives regarding "permissions" to change an object's state.
If I e.g. declare a class A with a const method. Every caller calling the const method knows, that wahtever the method itself does, it will definitely not change the state of the object - imagine you have a immutable 'this' in your const method.
The same holds for e.g. a parameter that is passed to a method. If the parameter is declared const, the compiler will enforce that the parameter (be it a value or reference type) will not be changed.
But the real problem with every object that is owned by another object and exposed in some way(e.g. property getter), is, that when I return a reference to it, the caller that received the reference can change this object's state without me knowing it - this breaks encapsulation. The const qualifier comes to the rescue, when I return a const reference, the caller cannot change the returned object (compiler-checked!).

Although the const qualifier does not solve all the problems you mentioned in your blog, it can be of help. I actually only brought your attention to this C++ language construct to have an example in hand (I'm far away from a C++ expert) for what I meant with "automation of the process of enforcing proper encapsulation". I'm still interested in your opinion regarding efforts on automation of these things...


2011-06-21 11:30 UTC
Sounds useful. Currently, Microsoft is making an effort along the same lines with Code Contracts, but although it's actually a released technology it still seems a bit immature to me.

As I've been using TDD since 2003 I usually just codify my assumptions about invariants in the tests I write. A framework like Greg Young's Grensesnitt might also come in handy there.
2011-06-21 19:03 UTC

Tennis Kata with immutable types and a cyclomatic complexity of 1

Monday, 16 May 2011 11:01:00 UTC

Recently I had the inclination to do the Tennis Kata a couple of times. The first time I saw it I thought it wasn't terribly interesting as an exercise in C# development. It would basically just be an application of the State pattern, so I decided to make it a bit more interesting. More or less by intuition I decided to give myself the following constraints:

Now that's more interesting :)

Given these constraints, what would be the correct approach? Given that this is a finite state machine with a fixed number of states, the Visitor pattern will be a good match.

Each player's score can be modeled as a Value Object that can be one of these types:

  • ZeroPoints
  • FifteenPoints
  • ThirtyPoints
  • FortyPoints
  • AdvantagePoint
  • GamePoint

All of these classes implement the IPoints interface:

public interface IPoints
{
    IPoints Accept(IPoints visitor);
 
    IPoints LoseBall();
 
    IPoints WinBall(IPoints opponentPoints);
 
    IPoints WinBall(AdvantagePoint opponentPoints);
 
    IPoints WinBall(FortyPoints opponentPoints);
}

The interesting insight here is that until the opponent's score reaches FortyPoints nothing special happens. Those states can be effectively collapsed into the WinBall(IPoints) method. However, when the opponent either has FortyPoints or AdvantagePoint, special things happen, so IPoints has specialized methods for those cases. All implementations should use double dispatch to invoke the correct overload of WinBall, so the Accept method must be implemented like this:

public IPoints Accept(IPoints visitor)
{
    return visitor.WinBall(this);
}

That's the core of the Visitor pattern in action. When the implementer of the Accept method is either FortyPoints or AdvantagePoint, the specialized overload will be invoked.

It's now possible to create a context around a pair of IPoints (called a Game) to implement a method to register that Player 1 won a ball:

public Game PlayerOneWinsBall()
{
    var newPlayerOnePoints = this.PlayerTwoScore
        .Accept(this.PlayerOneScore);
    var newPlayerTwoPoints = 
        this.PlayerTwoScore.LoseBall();
    return new Game(
        newPlayerOnePoints, newPlayerTwoPoints);
}

A similar method for player two simply reverses the roles. (I'm currently reading Lean Architecture, but have yet to reach the chapter on DCI. However, considering what I've already read about DCI, this seems to fit the bill pretty well… although I might be wrong on that account.)

The context calculates new scores for both players and returns the result as a new instance of the Game class. This keeps the Game and IPoints implementations immutable.

The new score for the winner depends on the opponent's score, so the appropriate overload of WinBall should be invoked. The Visitor implementation makes it possible to pick the right overload without resorting to casts and if statements. As an example, the FortyPoints class implements the three WinBall overloads like this:

public IPoints WinBall(IPoints opponentPoints)
{
    return new GamePoint();
}
 
public IPoints WinBall(FortyPoints opponentPoints)
{
    return new AdvantagePoint();
}
 
public IPoints WinBall(AdvantagePoint opponentPoints)
{
    return this;
}

It's also important to correctly implement the LoseBall method. In most cases, losing a ball doesn't change the current state of the loser, in which case the implementation looks like this:

public IPoints LoseBall()
{
    return this;
}

However, when the player has advantage and loses the ball, he or she loses the advantage, so for the AdvantagePoint class the implementation looks like this:

public IPoints LoseBall()
{
    return new FortyPoints();
}

To keep things simple I decided to implicitly model deuce as both players having FortyPoints, so there's not explicit Deuce class. Thus, AdvantagePoint returns FortyPoints when losing the ball.

Using the Visitor pattern it's possible to keep the cyclomatic complexity at 1. The code has no branches or loops. It's immutable to boot, so a game might look like this:

[Fact]
public void PlayerOneWinsAfterHardFight()
{
    var game = new Game()
        .PlayerOneWinsBall()
        .PlayerOneWinsBall()
        .PlayerOneWinsBall()
        .PlayerTwoWinsBall()
        .PlayerTwoWinsBall()
        .PlayerTwoWinsBall()
        .PlayerTwoWinsBall()
        .PlayerOneWinsBall()
        .PlayerOneWinsBall()
        .PlayerOneWinsBall();
 
    Assert.Equal(new GamePoint(), game.PlayerOneScore);
    Assert.Equal(new FortyPoints(), game.PlayerTwoScore);
}

In case you'd like to take a closer look at the code I'm attaching it to this post. It was driven completely by using the AutoFixture.Xunit extension, so if you are interested in idiomatic AutoFixture code it's also a good example of that.

TennisKata.zip (3.09 MB)

Comments

It's a beautiful thing
2011-05-17 14:59 UTC
Thanks :)
2011-05-17 15:02 UTC
Klaus Hebsgaard #
Funny to see, I believe I created thi kata once upon a time ...
2011-05-19 15:31 UTC

Generic unit testing with xUnit.net

Monday, 09 May 2011 12:37:17 UTC

Generics in .NET are wonderful, but sometimes when doing Test-Driven Development against a generic class I've felt frustrated because I've been feeling that dropping down to the lowest common denominator and testing against, say, Foo<object> doesn't properly capture the variability inherent in generics. On the other hand, writing the same test for five different types of T have seemed too wasteful (not to mention boring) to bother with.

That's until it occurred to me that in xUnit.net (and possibly other unit testing frameworks) I can define a generic test class. As an example, I wanted to test-drive a class with this class definition:

public class Interval<T>

Instead of writing a set of tests against Interval<object> I rather wanted to write a set of tests against a representative set of T. This is so easy to do that I don't know why I haven't thought of it before. I simply declared the test class itself as a generic class:

public abstract class IntervalFacts<T>

The reason I declared the class as abstract is because that effectively prevents the test runner from trying to run the test methods directly on the class. That would fail because T is still open. However, it enabled me to write tests like this:

[Theory, AutoCatalogData]
public void MinimumIsCorrect(IComparable<T> first, 
    IComparable<T> second)
{
    var sut = new Interval<T>(first, second);
    IComparable<T> result = sut.Minimum;
    Assert.Equal(result, first);
}

In this test, I also use AutoFixture's xUnit.net extension, but that's completely optional. You might as well just write an old-fashioned unit test, but then you'll need a SUT Factory that can resolve generics. If you don't use AutoFixture any other (auto-mocking) container will do, and if you don't use one of those, you can define a Factory Method that creates appropriate instances of T (and, in this particular case, instances of IComparable<T>).

In the above example I used a [Theory], but I might as well have been using a [Fact] as long as I had a container or a Factory Method to create the appropriate instances.

The above test doesn't execute in itself because the owning class is abstract. I needed to declare an appropriate set of constructed types for which I wanted the test to run. To do that, I defined the following test classes:

public class DecimalIntervalFacts : IntervalFacts<decimal> { }
public class StringIntervalFacts : IntervalFacts<string> { }
public class DateTimeIntervalFacts : IntervalFacts<DateTime> { }
public class TimSpanIntervalFacts : IntervalFacts<TimeSpan> { }

The only thing these classes do is to each pick a particular type for T. However, since they are concrete classes with test methods, the test runner will pick up the test cases and execute them. This means that the single unit test I wrote above is now being executed four times - one for each constructed type.

It's even possible to specialize the generic class and add more methods to a single specialized class. As a sanity check I wanted to write a set of tests specifically against Interval<int>, so I added this class as well:

public class Int32IntervalFacts : IntervalFacts<int>
{
    [Theory]
    [InlineData(0, 0, 0, true)]
    [InlineData(-1, 1, -1, true)]
    [InlineData(-1, 1, 0, true)]
    [InlineData(-1, 1, 1, true)]
    [InlineData(-1, 1, -2, false)]
    [InlineData(-1, 1, 2, false)]
    public void Int32ContainsReturnsCorrectResult(
        int minimum, int maximum, int value,
        bool expectedResult)
    {
        var sut = new Interval<int>(minimum, maximum);
        var result = sut.Contains(value);
        Assert.Equal(expectedResult, result);
    }
 
    // More tests...
}

When added as an extra class in addition to the four ‘empty' concrete classes above, it now causes each generic method to be executed five times, whereas the above unit test is only executed for the Int32IntervalFacts class (on the other hand it's a parameterized test, so the method is actually executed six times).

It's also possible to write parameterized tests in the generic test class itself:

[Theory]
[InlineData(-1, -1, false)]
[InlineData(-1, 0, true)]
[InlineData(-1, 1, true)]
[InlineData(0, -1, false)]
[InlineData(0, 0, true)]
[InlineData(0, 1, true)]
[InlineData(1, -1, false)]
[InlineData(1, 0, false)]
[InlineData(1, 1, false)]
public void ContainsReturnsCorrectResult(
    int minumResult, int maximumResult,
    bool expectedResult)
{
    // Test method body
}

Since this parameterized test in itself has 9 variations and is declared by IntervalFacts<T> which now has 5 constructed implementers, this single test method will be executed 9 x 5 = 45 times!

Not that the number of executed tests in itself is any measure of the quality of the test, but I do appreciate the ability to write generic unit tests against generic types.


AutoFixture 2.1

Monday, 02 May 2011 18:34:52 UTC

Since I announced AutoFixture 2.1 beta 1 no issues have been reported, so I've now promoted the release to the official 2.1 release. This means that if you already downloaded AutoFixture 2.1 beta 1, you already have the 2.1 binaries. If not, you can head over to the release page to get it.

The NuGet packages will soon be available.

Update (2011.5.4): The NuGet packages are now available.


Windows Azure migration smell: SQL Server over-utilization

Monday, 02 May 2011 12:23:49 UTC

Recently I partook in a Windows Azure migration workshop, helping developers from existing development organizations port their applications to Windows Azure. Once more an old design smell popped up: SQL Server over-utilization. This ought to be old news to anyone with experience designing software on the Wintel stack, but apparently it bears repetition:

Don't put logic in your database. SQL Server should be used only for persistent storage of data.

(Yes: this post is written in 2011…)

Many years ago I heard that role described as a ‘bit bucket' - you put in data and pull it out again, and that's all you do. No fancy stored procedures or functions or triggers.

Why wouldn't we want to use the database if we have one? Scalability is the answer. SQL Server doesn't scale horizontally. You can't add more servers to take the load off a database server (well, some of my old colleagues will argue that this is possible with Oracle, and that may be true, but with SQL Server it's impossible).

Yes, we can jump through hoops like partitioning and splitting the database up into several smaller databases, but it still doesn't give us horizontal scalability. SQL Server is a bottleneck in any system in which it takes part.

How is this relevant to Windows Azure? It's relevant for two important reasons:

  • There's an upper size limit on SQL Azure. Currently that size limit is 50 GB, and while it's likely to grow in the future, there's going to be a ceiling for a long time.
  • You can't fine tune the hardware for performance. The server runs on virtual hardware.

Development organizations that rely heavily on the database for execution of logic often need expensive hardware and experienced DBAs to squeeze extra performance out of the database servers. Such people know that write-intensive/append-only tables work best with one type of RAID, while read-intensive tables are better hosted on other file groups on different disks with different RAID configurations.

With SQL Azure you can just forget about all that.

The bottom line is that there are fundamental rules for software development that you must follow if you want to be able to successfully migrate to Windows Azure. I previously described an even simpler sanity check you should perform, but after that you should take a good look at your database.

The best solution is if you can completely replace SQL Server with Azure's very scalable storage services, but those come with their own set of challenges.


Feedback mechanisms and tradeoffs

Friday, 29 April 2011 13:02:14 UTC

Rapid feedback is one of the cornerstones of agile development. The faster we can get feedback, the less it costs to correct errors. Unit testing is one of the ways to get feedback, but not the only one. Each way we can get feedback comes with its own advantages and disadvantages.

In my experience there's a tradeoff between the cost of setting up a feedback mechanism and the level of confidence we can get from it. This is quite intuitive to most people, but I've rarely seen it explicitly described. The purpose of this post is to explicitly describe and relate those different mechanisms.

Compilation #

In compiled languages, compilation provides the first level of feedback. I think a lot of people don't think about this as feedback as (for compiled languages) it's a necessary step before the code can be executed.

The level of confidence may not be very high - we tend to laugh at statements like ‘if it compiles, it works'. However, the compiler still catches a lot of mistakes. Think about it: does your code always compile, or do you sometimes get compilation errors? Do you sometimes try to compile your code just to see if it's possible? I certainly do, and that's because the compiler is always available, and it's the first and fastest level of feedback we get.

Code can be designed to take advantage of this verification step. Constructor Injection, for example, ensures that we can't create a new instance of a class without supplying it with its required dependencies.

It's also interesting to note that anecdotal evidence seems to suggest that unit testing is more prevalent for interpreted languages. That makes a lot of sense, because as developers we need feedback as soon as possible, and if there's no compiler we must reach for the next level of feedback mechanism.

Static code analysis #

The idea of static code analysis isn't specific to .NET, but in certain versions of Visual Studio we have Code Analysis (which is also available as FxCop). Static code analysis is a step up from compilation - not only is code checked for syntactical correctness, but it's also checked against a set of known anti-patterns and idioms.

Static code analysis is encapsulated expert knowledge, yet surprisingly few people use it. I think part of the reason for this is because the ratio of time against confidence isn't as compelling as with other feedback mechanism. It takes some time to run the analysis, but like compilation the level of confidence gained from getting a clean result is not very high.

Personally, I use static code analysis once in a while (e.g. before checking in code to the default branch), but not as often as other feedback mechanisms. Still, I think this type of feedback mechanism is under-utilized.

Unit testing #

Running a unit test suite is one of the most efficient ways to gain rapid feedback. The execution time is often comparable to running static code analysis while the level of confidence is much higher.

Obviously a successful unit test execution is no guarantee that the final application works as intended, but it's a much better indicator than the two previous mechanisms. When presented with the choice of using either static code analysis or unit tests, I prefer unit tests every time because the confidence level is much higher.

If you have sufficiently high code coverage I'd say that a successful unit test suite is enough confidence to check in code and let continuous integration take care of the rest.

Keep in mind that continuous integration and other types of automated builds are feedback mechanisms. If you institute rules where people are ‘punished' for checking in code that breaks the build, you effectively disable the automated build as a source of feedback.

Thus, the rest of these feedback mechanisms should be used rarely (if at all) on developer work stations.

Integration testing #

Integration testing comes in several sub-categories. You can integrate multiple classes from within the same library, or integrate multiple libraries while still using Test Doubles instead of out-of-process resources. At this level, the cost/confidence ratio is comparable to unit testing.

Subcutaneous testing #

A more full level of integration testing comes when all resources are integrated, but the tests are still driven by code instead of through the UI. While this gives a degree of confidence that is close to what you can get from a full systems test, the cost of setting up the entire testing environment is much higher. In many cases I consider this the realm of a dedicated testing department, as it's often a full-time job to maintain the suite of automation tools that enable such tests - particularly if we are talking about distributed systems.

System testing #

This is a test of the full system, often driven through the UI and supplemented with manual testing (such as exploratory testing). This provides the highest degree of confidence, but comes with a very high cost. It's often at this level we find formal acceptance tests.

The time before we can get feedback at this level is often considerable. It may take days or even weeks or months.

Understand the cost/confidence ratio #

The purpose of this post has been to talk about different ways we can get feedback when developing software. If we could get instantaneous feedback with guaranteed confidence it would be easy to choose, but as this is not the case we must understand the benefits and drawbacks of each mechanism.

It's important to realize that there are many mechanisms, and that we can combine them in a staggered approach where we start with fast feedback (like the compiler) with a low degree of confidence and then move through successive stages that each provide a higher level of confidence.


Comments

Markus Hjärne #
Hi Mark, good post. I just thought of one thing worth mentioning in this context: guarding public APIs with checks that throws exceptions in case they fail and guarding private methods with asserts has been a great source of good and fast feedback for me over the years.

Regards,
Markus
2011-05-04 20:46 UTC
I totally agree, and I was (and am still) planning on writing something about that in a later post. This is essentially the Fail Fast principle, but as for feedback mechanisms, we don't discover these Guard Clauses before we invoke them.

My strong preference is for writing unit tests that invoke these Guard Clauses to verify that they work, and that means that as far as feedback mechanisms go, they fall into the unit testing 'bucket'.
2011-05-04 20:54 UTC
Your comments about compilation are interesting, I hadn't thought of it before but it is a level of feedback that I find as useful as you do.

I wonder if you'd consider IntelliSense as an even earlier form of feedback though?

Gog
2012-06-25 09:20 UTC
IntelliSense is, I think, more of a learning tool. It's not really feedback about what you did - it's more information about what you can do. It's certainly a productivity feature, although there are some people who argue that it does more harm than good.
2012-06-25 10:50 UTC
It might be worth considering a couple of additional feedback mechanisms at the opposite end of the spectrum, in the sense that they provide even later feedback, if only to underscore your points from the article with extreme counter-examples. Run-time exceptions are themselves a feedback mechanism. Most unnecessarily "stringly-typed" code could be considered to be pushing code away from the more rapid, desirable feedback mechanisms. (Primitive Obsession in general might be considered another tendency which pushes the design of code toward being run-time exception-prone.) Logic erors are flaws in code with which the code appears to execute successfully, until the program output is manually inspected. Just off the cuff, one typical cause might be the tendency in some projects to add guard clauses such as if (methodArg==null) return; which causes methods to silently and unexpectedly return without accomplishing the task that they were designed to do. Mathematical calculations performed without adequate test coverage might be another example.

We learned about 3 classes of errors in school: 1) Syntax, 2) Run-time, and 3) Logic (Unit testing and static code analysis weren't as much of a thing back then). It might seem odd to discuss types of feedback mechanisms that are undesirable by definition, until you stop to observe how common it is for the design of much code to push errors towards this undesirable side of the continuum. "To defeat your enemy you first must know him" or something like that.
2016-09-11 00:11 UTC

David, thank you for making these feedback cases explicit. When I originally wrote this article, I implicitly had such alternative, common-place feedback mechanisms in mind, but I never explicitly spelled them out. While I should have done that, I can now thank you for doing it; your comment is a good addition to the original article.

2016-09-11 08:07 UTC

Provider is not a pattern

Wednesday, 27 April 2011 12:14:52 UTC

Developers exposed to ASP.NET are likely to be familiar with the so-called Provider pattern. You see it a lot in that part of the BCL: Role Provider, Membership Provider, Profile Provider, etc. Lots of text has already been written about Providers, but the reason I want to add yet another blog post on the topic is because once in a while I get the question on how it relates to Dependency Injection (DI).

Is Provider a proper way to do DI?

No, it has nothing to do with DI, but as it tries to mimic loose coupling I can understand the confusion.

First things first. Let's start with the name. Is it a pattern at all? Regular readers of this blog may get the impression that I'm fond of calling everything and the kitchen sink an anti-pattern. That's not true because I only make that claim when I'm certain I can hold that position, so I'm not going to denounce Provider as an anti-pattern. On the contrary I will make the claim that Provider is not a pattern at all.

A design pattern is not invented - it's discovered as a repeated solution to a commonly recurring problem. Providers, on the other hand, were invented by Microsoft, and I've rarely seen them used outside their original scope. Secondly I'd also dispute that they solve anything.

That aside, however, I want to explain why Provider is bad design:

  • It uses the Constrained Construction anti-pattern
  • It hides complexity
  • It prevents proper lifetime management
  • It's not testable

In the rest of this post I will explain each point in detail, but before I do that we need an example to look at. The old OrderProcessor example suffices, but instead of injecting IOrderValidator, IOrderCollector, and IOrderShipper this variation uses Providers to provide instances of the Services:

public SuccessResult Process(Order order)
{
    IOrderValidator validator = 
        ValidatorProvider.Validator;
    bool isValid = validator.Validate(order);
    if (isValid)
    {
        CollectorProvider.Collector.Collect(order);
        ShipperProvider.Shipper.Ship(order);
    }
 
    return this.CreateStatus(isValid);
}

The ValidatorProvider uses the configuration system to create and return an instance of IOrderValidator:

public static IOrderValidator Validator
{
    get 
    {
        var section = 
            OrderValidationConfigurationSection
                .GetSection();
        var typeName = section.ValidatorTypeName;
        var type = Type.GetType(typeName, true);
        var obj = Activator.CreateInstance(type);
        return (IOrderValidator)obj;
    }
}

There are lots of details I omitted here. I could have saved the reference for later use instead of creating a new instance each time the property is accessed. In that case I would also have had to make the code thread-safe, so I decided to skip that complexity. The code could also be more defensive, but I'm sure you get the picture.

The type name is defined in the app.config file like this:

<orderValidation 
  type="Ploeh.Samples.OrderModel.UnitTest.TrueOrderValidator,
        Ploeh.Samples.OrderModel.UnitTest" />

Obviously, CollectorProvider and ShipperProvider follow the same… blueprint.

This should be well-known to most .NET developers, so what's wrong with this model?

Constrained Construction #

In my book's chapter on DI anti-patterns I describe the Constrained Construction anti-pattern. Basically it occurs every time there's an implicit constraint on the constructor of an implementer. In the case of Providers the constraint is that each implementer must have a default constructor. In the example the culprit is this line of code:

var obj = Activator.CreateInstance(type);

This constrains any implementation of IOrderValidator to have a default constructor, which obviously means that the most fundamental DI pattern Constructor Injection is out of the question.

Variations of the Provider idiom is to supply an Initialize method with a context, but this creates a temporal coupling while still not enabling us to inject arbitrary Services into our implementations. I'm not going to repeat six pages of detailed description of Constrained Construction here, but the bottom line is that you can't fix it - you have to refactor towards true DI - preferably Constructor Injection.

Hidden complexity #

Providers hide the complexity of their implementations. This is not the same as encapsulation. Rather it's a dishonest API and the problem is that it just postpones the moment when you discover how complex the implementation really is.

When you implement a client and use code like the following everything looks deceptively simple:

IOrderValidator validator = 
    ValidatorProvider.Validator;

However, if this is the only line of code you write it will fail, but you will not notice until run-time. Check back to the implementation of the Validator property if you need to refresh the implementation: there's a lot of things that can go wrong here:

  • The appropriate configuration section is not available in the app.config file.
  • The ValidatorTypeName is not provided, or is null, or is malformed.
  • The ValidatorTypeName is correctly formed, but the type in question cannot be located by Fusion.
  • The Type doesn't have a default constructor. This is one of the other problems of Constrained Construction: it can't be statically enforced because a constructor is not part of an abstraction's API.
  • The created type doesn't implement IOrderValidator.

I'm sure I even forgot a thing or two, but the above list is sufficient for me. None of these problems are caught by the compiler, so you don't discover these issues until you run an integration test. So much for rapid feedback.

I don't like APIs that lie about their complexity.

Hiding complexity does not make an API easier to use; it makes it harder.

An API that hides necessary complexity makes it impossible to discover problems at compile time. It simply creates more friction.

Lifetime management issues #

A Provider exerts too much control over the instances it creates. This is a variation of the Control Freak anti-pattern (also from my book). In the current implementation the Validator property totally violates the Principle of least surprise since it returns a new instance every time you invoke the getter. I did this to keep the implementation simple (this is, after all, example code), but a more normal implementation would reuse the same instance every time.

However, reusing the same instance every time may be problematic in a multi-threaded context (such as a web application) because you'll need to make sure that the implementation is thread-safe. Often, we'd much prefer to scope the lifetime of the Service to each HTTP request.

HTTP request scoping can be built into the Provider, but then it would only work in web applications. That's not very flexible.

What's even more problematic is that once we move away from the Singleton lifestyle (not to be confused with the Singleton design pattern) we may have a memory leak at hand, since the implementation may implement IDisposable. This can be solved by adding a Release method to each Provider, but now we are moving so far into DI Container territory that I find it far more reasonable to just use proper DI instead of trying to reinvent the wheel.

Furthermore, the fact that each Provider owns the lifetime of the Service it controls makes it impossible to share resources. What if the implementation we want to use implements several Role Interfaces each served up by a different Provider? We might want to use that common implementation to share or coordinate state across different Services, but that's not possible because we can't share an instance across multiple providers.

Even if we configure all Providers with the same concrete class, each will instantiate and serve its own separate instance.

Testability #

The Control Freak also impacts testability. Since a Provider creates instances of interfaces based on XML configuration and Activator.CreateInstance, there's no way to inject a dynamic mock.

It is possible to use hard-coded Test Doubles such as Stubs or Fakes because we can configure the XML with their type names, but even a Spy is problematic because we'll rarely have an object reference to the Test Double.

In short, the Provider idiom is not a good approach to loose coupling. Although Microsoft uses it in some of their products, it only leads to problems, so there's no reason to mimic it. Instead, use Constructor Injection to create loosely coupled components and wire them in the application's Composition Root using the Register Resolve Release pattern.


Comments

I have to disagree with your statement that the provider is not a pattern at all. The provider (in .NET) is a specific implementation of the bridge pattern as defined in "Design Patterns" by Gamma, Helm, Johnson, and Vlissides. I believe that this description has been lost over the past 10 years as people in the industry have resorted to calling it the "Provider Pattern."

When used correctly as a bridge pattern, the provider does actually solve reoccuring problems very eloquently. It decouples the interface from the implementation so that the implementation details are hidden from the client. You can extend the implementation by building on existing implementations to reduce the complexity. Complexity is only introduced as a by-product of bad design.

The notion that the provider interfers with testability is incorrect. Each individual implementation should be designed from the beginning to be testable. The bridge is not the entry point for testing the implementor. The bridge should only be responsible for forwarding client requests to its implementor. (Therefore the bridge should be testable as well.) I am not stating that there aren't providers out there that violate this priciple. I'm merely stating that providers which do this are poorly designed.
2011-06-27 16:43 UTC
I just reread the Bridge pattern and while I agree that Provider is a specialization of Bridge, I don't agree that this relationship goes the other way. If you read the description of Provider provided in the link above, you'll notice that it goes into very specific details on how a Provider should be implemented, including how it should be backed by the configuration system and that it must be created by clients by a static factory.

This goes way beyond what the Bridge pattern describes, so I hardly think you can equate the two.

Especially the part about being created by a static factory which can only read from the configuration system is testability poison.
2011-06-27 19:25 UTC
I disagree with your analysis. Provider model is a pattern.

I realize this is a year late, but I just found your article a few days ago via a reference on scott hanselmans blog (in comments). Here is my rebuttal.
http://candordeveloper.com/2012/06/26/provider-model-is-a-solid-pattern/
2012-06-29 11:31 UTC
Also going to throw in on the disagreement with your assertion:

"A design pattern is not invented - it's discovered as a repeated solution to a commonly recurring problem."

The design pattern concept does not provide an opinion on how they were arrived upon. I am always delighted when programming terminology gives me new insight into words. Kinda like "abstraction" -- I don't think I would even know the term if I were not a programmer, at least, not as well as I do.

However, programming has little to teach us about the words "design" and "pattern" that we do not already know. There is nothing mysterous or supernatural about them, even when combined. A pattern is anything that repeats in a predictable fashion.. and that's really all there is to it.

Whether or not the service provider is a good pattern, I am not sure, but I can be certain that it is, in fact, a pattern. Isn't the neuance you're describing subjective, anyway? Suppose that you invented it, and then I used it. Suppose also that a third developer "discovered" our usage. What then, does the universe unwind? My example is trivial, but scale it up and it makes a little more sense.

But, I do love your blog, thank you for the discussion.
2014-11-11 9:55 UTC

Luke, thank you for writing. Obviously, you are free to have your own opinions, and interpretation of various words. However, if we want to be able to discuss our trade in a concise manner, we need as precise a vocabulary as possible.

On this blog (and elsewhere), I strive to use a consistent and well-known jargon. Thus, when I use the term design pattern, I refer to it in a way that's compatible with the original description in Design Patterns. In the introduction (on page 2), Gamma et al. write:

"None of the design patterns in this book describes new or unproven designs. We have included only designs that have been applied more than once in different systems."
(Emphasis mine.) The point is that, using the established vocabulary, a design pattern most certainly implies that the pattern is a result of parallel evolution, and subsequently discovered and described as a common solution to a particular type of problem.

Although I grant that it's partially subjective, there's still an implicit value judgement in cataloguing a design pattern. While most patterns come with a set of known disadvantages, but those disadvantages tend to not outweigh the benefits. If they did, it wouldn't be a design pattern, but rather an anti-pattern. This, too, is a well-defined term:

"a commonly occurring solution to a problem that generates decidedly negative consequences."
This definition differs from a 'proper' design pattern because a design pattern has mostly beneficial consequences.

2014-11-14 20:32 UTC

I appreciate your effort to maintain consistent jargon and termonology. I have had dozens of heated arguments with my co-workers about individual words in our interfaces, all the while they're staring back at me with a look that says "omg dude, its just a word." I gather that you and I have that common.

Personally, I'm against the concept of jargon all together because it, by its nature, can only create confusion. By that I mean, if it is consistent with the English language, then its not jargon, its just English. Conversely, if it contradicts or conflicts with the English language, then its bad jargon anyway, and can only lead to confusion. I tend to err on the side of verbosity, and might have described the pattern as not being a "Beneficial Design Pattern".

In reading the paragraph you quoted from, I did not get the impression that the author gave an opinion on the origin of a pattern as a contributor to its validity. I think he is simply trying to disclaim the notion that the patterns described within the book are original ideas, and effort was made to only include patterns that have been implemented in the wild, and by more than one implementor. i.e. I did not write a book about my own patterns, or the clever things my co-workers have done.

"So although these designs are not new, we capture them in a new and accessible way: as a catalog of design patterns having a consistent format."

The service pattern certainly has been implemented by more than one implementor, sometimes identified by a different name, sometimes in such a way that provides more benefit than otherwise, and very likely before Microsoft introduced it to the world at-large.

My critique of your article was based solely on the language used and only because of provocativeness of your statement. It seemed intentionally unintuitive.

With all of that said, though, I concede my point. I do that mostly because you obviously have a superior understanding of design patterns than I do (by multitudes). Your blog is one of resources that first got me interested in the study of design patterns, and I've learned a great deal from it. I think it would be rather foolish for me to try to "take you on", as I'd certainly be embarrassed.

Besides that, I could not formulate an opinion on the supporting arguments you put forth as I reread the blog. I have not programmed in .NET in a very long time and I have trouble understanding if we're even referring to the same things. I fear that I am a bit out of my league.

My [perhaps misguided] interpretation of the "Service Pattern" comes from places wholy unrelated to .NET. I've seen the term used to describe a pattern akin to the "Adapter Pattern", where the term "Adapter" is replaced by "Service", and "Adaptee" is replaced by "Provider".

Implementor < Service < Provider (or ServiceProvider)

i.e.

Model < StorageService < MySQLProvider

Thank you for your thoughtful response.

2014-11-15 3:10 UTC

At the risk of coming across as a pedant, did you mean Provider when you wrote "Service Pattern"? At least, I'm not aware of any Service Pattern in the most commonly accepted pattern literature, but perhaps there's a book I should read.

The Provider design isn't the same as the Adapter pattern. The difference is, among others, that Providers rely on the .NET configuration system, as well as the Constrained Construction anti-pattern. Particularly its reliance on the .NET configuration system is what excludes it from being a proper pattern, because you couldn't possibly discover it on a different platform.

The Adapter pattern describes how to adapt one API to another API; while it's a useful pattern, it has a different concern than Provider. It also doesn't prescribe any particular method of initialization.

2014-11-15 20:53 UTC

Page 60 of 73

"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!