Poka-yoke Design: From Smell to Fragrance by Mark Seemann
Encapsulation is one of the most misunderstood aspects of object-oriented programming. Most people seem to think that the related concept of information hiding simply means that private fields should be exposed by public properties (or getter/setter methods in languages that don't have native properties).
Have you ever wondered what's the real benefit to be derived from code like the following?
    private string name;

    public string Name
    {
        get { return this.name; }
        set { this.name = value; }
    }
This feels an awful lot like redundant code to me (and automatic properties are not the answer: they are just a compiler trick that still creates private backing fields). No information is actually hidden. Derick Bailey has a good piece on why this view of encapsulation is too narrow, so I'm not going to reiterate all his points here.
So then what is encapsulation?
The whole point of object-orientation is to produce cohesive pieces of code (classes) that solve given problems once and for all, so that programmers can use those classes without having to learn about the intricate details of the implementations.
This is what encapsulation is all about: exposing a solution to a problem without requiring the consumer to fully understand the problem domain.
This is what all well-designed classes do.
- You don't have to know the intricate details of TDS to use ADO.NET against SQL Server.
- You don't have to know the intricate details of painting on the screen to use WPF or Windows Forms.
- You don't have to know the intricate details of Reflection to use a DI Container.
- You don't have to know how to efficiently sort a list in order to efficiently sort a list in .NET.
- Etc.
What makes encapsulation so important is exactly this trait. The class must hide the information it encapsulates in order to protect it against 'naïve' users. Wikipedia has this to say:
Hiding the internals of the object protects its integrity by preventing users from setting the internal data of the component into an invalid or inconsistent state.
Keep in mind that users are expected to not fully understand the internal implementation of a class. This makes it obvious what encapsulation is really about:
Encapsulation is a fail-safe mechanism.
By corollary, encapsulation does not mean hiding complexity. Whenever complexity is hidden (as is the case for Providers) feedback time increases. Rapid feedback is much preferred, so delaying feedback is not desirable if it can be avoided.
Encapsulation is not about hiding complexity but, conversely, about exposing complexity in a fail-safe manner.
In Lean this is known as Poka-yoke, so I find it only fitting to think about encapsulation as Poka-yoke Design: APIs that make it as hard as possible to do the wrong thing. Considering that compilation is the cheapest feedback mechanism, it's preferable to design APIs so that the code can only compile when classes are used correctly.
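As a rough illustration of letting the compiler give that feedback, here is a minimal sketch. The class names FragileMessage and SafeMessage are hypothetical and invented for this example. The first class compiles even when the caller forgets a required value; the second cannot be constructed without it.

```csharp
using System;

// Hypothetical example types; none of these names come from the article.

// Fragile: this compiles even when the caller forgets to set Recipient,
// so the mistake only surfaces at run time.
public class FragileMessage
{
    public string Recipient { get; set; }

    public string Render()
    {
        // Throws NullReferenceException if Recipient was never assigned.
        return "To: " + this.Recipient.Trim();
    }
}

// Safer: the only constructor demands the recipient, so
// 'new SafeMessage()' does not compile, and a Guard Clause
// rejects null as early as possible.
public class SafeMessage
{
    private readonly string recipient;

    public SafeMessage(string recipient)
    {
        if (recipient == null)
            throw new ArgumentNullException("recipient");

        this.recipient = recipient;
    }

    public string Recipient
    {
        get { return this.recipient; }
    }

    public string Render()
    {
        return "To: " + this.recipient.Trim();
    }
}
```

With SafeMessage, forgetting the recipient is a compiler error rather than a run-time surprise, and passing null fails in the constructor instead of later in Render.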
In a series of blog posts I will look at various design smells that break encapsulation, as well as provide guidance on how to improve the design to make it safer, thus going from smell to fragrance.
- Design Smell: Temporal Coupling
- Design Smell: Primitive Obsession
- Code Smell: Automatic Property
- Design Smell: Redundant Required Attribute
- Design Smell: Default Constructor
- DI Container smell: Captive Dependency
Postscript: At the Boundaries, Applications are Not Object-Oriented
Comments
That's not to say that this is not relevant information, but surely you're not implying that's applicable to scenarios like messaging, RESTful APIs, and other circumstances that need easily serializable objects?
For instance, if a string name is used as a sorting key, it isn't appropriate to change the name. You can make the name string public, but that implies that changing the name is a valid operation, and might lead to bugs later. Providing a const getter, but no setter for the name says "You can't change this name".
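In C#, the closest counterpart to a getter without a setter is a read-only property. The following is only a sketch: the Contact and ContactNameComparer types are hypothetical, and they assume the name serves as the sorting key described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: the name is fixed at construction time.
public sealed class Contact
{
    private readonly string name;

    public Contact(string name)
    {
        if (name == null)
            throw new ArgumentNullException("name");

        this.name = name;
    }

    // Getter only: 'contact.Name = "x"' is a compile-time error.
    public string Name
    {
        get { return this.name; }
    }
}

// Hypothetical comparer that orders contacts by name.
public sealed class ContactNameComparer : IComparer<Contact>
{
    public int Compare(Contact x, Contact y)
    {
        return string.CompareOrdinal(x.Name, y.Name);
    }
}
```

A SortedSet<Contact> created with new ContactNameComparer() can rely on its ordering staying valid, because no caller can change a name after the contact has been added.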
If you have a setter and a getter for a piece of data, it should be because the class needs to expose that data for the purpose of changing it. That happens a lot, and it shouldn't be viewed as unreasonable that you have a private data member and a setter/getter. It's not a waste of time or code. It's a clear contract with users of your class that you intend to provide these operations no matter how the class evolves.
One important property of good encapsulation is that you are free to change the data representation of your class if the interface remains the same, and your changes will be limited to the methods of the class itself. Want to change from a 1-based count to a zero-based index? If you exposed a member called items, you're screwed. If you exposed a method called CountGet(), you're OK. Just change CountGet()'s implementation from returning items to returning items+1.
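The CountGet() scenario could look roughly like this in C#. The Inventory class and its Add method are hypothetical; only the names items and CountGet come from the comment above.

```csharp
// Hypothetical class illustrating a change of internal representation.
public class Inventory
{
    // Internal representation after the change: the zero-based index
    // of the last item, or -1 when the inventory is empty.
    // An earlier version would have stored a 1-based count here.
    private int items = -1;

    public void Add()
    {
        this.items++;
    }

    // The public contract is unchanged: callers still receive a count.
    // Only this method had to change, from 'return this.items;'
    // to 'return this.items + 1;'.
    public int CountGet()
    {
        return this.items + 1;
    }
}
```

Because callers only ever see CountGet(), the switch in representation does not ripple beyond the class.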
You mentioned the compiler as the first and cheapest feedback mechanism, so the target should be to automate the enforcement of proper encapsulation with (a) a better compiler, (b) static analysis, or (c) runtime functionality that can be applied in a minimally invasive way.
See C++'s const qualifier, an excellent example of a language feature that supports proper encapsulation. It would allow for, e.g., auto-properties with a getter/setter when making the setter const. Of course this will itself impact your design, but it offers a language-integrated fail-safe mechanism for encapsulation. What do you think?
I may also have an additional smell for you: a violation of the Law of Demeter breaks encapsulation in most cases.
I don't have any comment on the C++ const qualifier, as I have no idea what it does...
First of all, I have to correct my first comment: auto-properties with a const setter are definitely nonsense.
The C++ const qualifier is effectively a statically checked and compiler-enforced construct to syntactically express your objectives regarding "permissions" to change an object's state.
Say I declare a class A with a const method. Every caller of the const method knows that, whatever the method itself does, it will definitely not change the state of the object. Imagine you have an immutable 'this' in your const method.
The same holds for e.g. a parameter that is passed to a method. If the parameter is declared const, the compiler will enforce that the parameter (be it a value or reference type) will not be changed.
But the real problem with every object that is owned by another object and exposed in some way (e.g. through a property getter) is that when I return a reference to it, the caller that received the reference can change this object's state without me knowing it. This breaks encapsulation. The const qualifier comes to the rescue: when I return a const reference, the caller cannot change the returned object (compiler-checked!).
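C# has no direct counterpart to a C++ const reference, so the following is only an approximation of the idea, not the same mechanism. It assumes a hypothetical Order class that owns a list and exposes it through the read-only wrapper returned by List<T>.AsReadOnly().

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical class that owns a collection and protects it.
public class Order
{
    private readonly List<string> lines = new List<string>();

    public void AddLine(string line)
    {
        if (string.IsNullOrEmpty(line))
            throw new ArgumentException("A line must have content.", "line");

        this.lines.Add(line);
    }

    // Leaky alternative (not used here): returning this.lines directly
    // would let any caller add or remove lines behind the Order's back.

    // Safer: ReadOnlyCollection<T> has no public Add or Remove, so
    // 'order.Lines.Add("x")' does not compile. Casting to IList<string>
    // still compiles, but then fails at run time with NotSupportedException.
    public ReadOnlyCollection<string> Lines
    {
        get { return this.lines.AsReadOnly(); }
    }
}
```

Unlike const in C++, this protects only the collection itself. It works here because strings are immutable; mutable elements would need their own protection.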
Although the const qualifier does not solve all the problems you mentioned in your blog, it can be of help. I actually only brought your attention to this C++ language construct to have an example in hand (I'm far from being a C++ expert) for what I meant by "automation of the process of enforcing proper encapsulation". I'm still interested in your opinion regarding efforts on automation of these things...
As I've been using TDD since 2003 I usually just codify my assumptions about invariants in the tests I write. A framework like Greg Young's Grensesnitt might also come in handy there.