Public types hidden in plain sight by Mark Seemann
Instead of making types internal to a package, you can hide them in plain sight as public types.
When I review object-oriented code, I often see lots of internal (package-private in Java) classes and members. Sometimes, most of a code base is written at that access level.
When I ask for the reasoning behind this, the answer is invariably: encapsulation.
Encapsulation is one of the most misunderstood concepts in programming. If you think it's about making everything inaccessible, then there's only one logical conclusion: all your code must have this API:
public class Program { public static int Main(string[] args) { // Everything else goes here! } }
If you've seen my Pluralsight course about encapsulation, you'll know that I prefer Bertrand Meyer's view, as outlined in Object-Oriented Software Construction. Essentially, it's about design by contract (pre- and post-conditions).
Once I start to ask more pointed questions about enthusiastic use of internal-only access, it turns out that often, the underlying reason is that the developers want to be able to change the design of their internal types and members, without breaking existing clients. A noble goal.
Imagine that such a code base has existing clients. While the maintainers can't change the public types without breaking these clients, they can change all internal types as much as they want.
Unfortunately, there are disadvantages to this design strategy. You can't easily unit test such code bases. There are ways, but none of them are good.
What if, instead of making types internal, you made them public? Are there ways to prevent clients from relying on such types? Yes, plenty of ways.
Public, append-only #
My preferred approach to this problem is to make types public anyway. If my overall feeling about a type is that "it seems okay", I make it public, even if I know that there's a risk involved with this.
Consider, as an example, the private DurationStatistics class I extracted in a recent article. In that article, I kept the class private, because I wanted to discuss visibility in a separate article.
You can easily promote the DurationStatistics class to a public class:
public class DurationStatistics
Here you only see the declaration of the class, because the implementation is exactly the same as in the other article. The code is moved out of the Estimator class and put into its own file, though.
How can you be sure that DurationStatistics is 'done'? How can you be sure that you're not going to need to change it later?
You can't.
You can, however, deal with that when it occurs. You can still add new members to the class, but you can't remove members or change the type's name.
If you need to change something (thereby breaking compatibility), then don't change it. Instead, add the new type or member, but leave the old artefact in place and deprecate it. In .NET, you can do this by adorning the member or type with the [Obsolete] attribute. You can even add a message that points clients to the new, preferred way of doing things:
[Obsolete("DurationStatistics is being retired (our fault). Use Foo instead.")] public class DurationStatistics
This will cause a compiler warning in all client code using the DurationStatistics class. You should make the message as helpful and friendly as possible.
This may leave your code with compiler warnings. This is good, because you should work to remove these warnings from your own code base. The only place you should leave them in place is in your unit tests. As long as you have a deprecated type in your published code base, you should also keep it covered by unit tests. This will cause compiler warnings in your unit tests, but you can suppress such warnings explicitly there:
[Theory] [InlineData(new[] { 1, 1, 1 }, 1)] [InlineData(new[] { 2, 3, 4 }, 3)] public void AverageIsCorrect(int[] seconds, int expectedSeconds) { #pragma warning disable 618 var sut = new DurationStatistics( seconds.Select(s => TimeSpan.FromSeconds(s)).ToArray()); #pragma warning restore 618 var actual = sut.Average; Assert.Equal(expectedSeconds, actual.TotalSeconds); }
Keep the 'disable scope' as small as possible, and always remember to restore the warning after disabling it.
Perhaps you think that this approach of just making everything public and dealing with the consequences later is an unprofessional, unhygienic, and ugly way to evolve code, but it's really the logical way to produce and publish SOLID code.
Obsolete by default #
A variant of the above technique is that instead of making a type or member internal, you can deprecate it right away. This will immediately warn clients that they're doing something they aren't supposed to do.
If you ever decide to 'promote' that type to a bona fide member of your API, you can simply remove the [Obsolete] attribute.
Separate namespaces #
You can also hide public types in separate namespaces. If you don't document the types in those 'hidden' namespaces, clients will again have to do something explicit to use them.
This is a softer way of hiding public types in plain sight, because clients get no compiler warnings if they use those 'hidden' types. Still, it can be quite effective.
My experience with maintaining public software (e.g. the now-more-than-six-year-old AutoFixture project) is that the most common problem is that users don't even find the public types I want them to find! If you knew how many times I've seen people reinvent a feature already available in the API... And that's even when all the interesting stuff is in the same namespace.
Putting types in separate namespaces is, in my experience, quite an effective way of hiding them.
Separate libraries #
Ultimately, you can put your volatile types in a separate library. In the Estimation example, you can ship the Estimator class (your 'real' public API) in one library, but put DurationStatistics in another library:
The Estimation library references the Estimation.Statistics library, so it can use all the public types in that library. You can unit test the public types in the Estimation library, but you can also unit test the public types in the Estimation.Statistics library.
When you publish your API, you give clients a reference to Estimation, but not to Estimation.Statistics. You still ship Estimation.Statistics as a dependency of your API, but clients shouldn't reference it.
Specifically, if you wish to publish your API as a NuGet package, you can use the <references>
element to ensure that only the 'official' library is added to a project:
In this library, I installed the (local) Estimation NuGet package, and as you can see, only Ploeh.Samples.Estimation is referenced, whereas Ploeh.Samples.Estimation.Statistics isn't. This means that the client can easily use the official API, e.g. to create a new instance of the Estimator class:
this.estimator = new Estimator(TimeSpan.FromMinutes(1));
On the other hand, if a client developer attempts to use the public DurationStatistics class, that's not possible:
Not only is DurationStatistics not available, all Visual Studio can suggest is to create it; it can't suggest pulling in the appropriate assembly reference, which is a good thing.
The library is still there, it's just not referenced, so Estimator still works at run-time.
The trick to set up a NuGet package in this way is to use the <references>
element in the .nuspec file:
<?xml version="1.0"?> <package > <metadata> <id>Estimation</id> <version>1.0.0</version> <authors>Mark Seemann</authors> <owners>Mark Seemann</owners> <requireLicenseAcceptance>false</requireLicenseAcceptance> <description> This is an example that demonstrates how dependent libraries can hide public types in plain sight. </description> <copyright>Copyright Mark Seemann 2015</copyright> <references> <reference file="Ploeh.Samples.Estimation.dll" /> </references> </metadata> <files> <file src="..\Estimation\bin\Debug\Ploeh.Samples.Estimation.dll" target="lib\net45" /> <file src="..\Estimation\bin\Debug\Ploeh.Samples.Estimation.Statistics.dll" target="lib\net45" /> </files> </package>
Notice that both Ploeh.Samples.Estimation.dll and Ploeh.Samples.Estimation.Statistics.dll are included in the NuGet package, but that only Ploeh.Samples.Estimation.dll should be referenced when the NuGet package is added.
For good measure I should point out that in order to create these demo files, I only installed the Estimation NuGet package from my local disk (using the -Source switch), so don't go and look for it on nuget.org.
Summary #
There are plenty of ways to hide public types in plain sight. Often, one or more of these options are better than misusing the internal/package-private access modifier. This doesn't mean that you must never make a class internal, but it means that you should consider the advantages and disadvantages of all options before making such a decision.
The main disadvantage of using the internal access modifier is that you can't easily unit test such a class. The options provided here should give you plenty of alternatives that will enable you to easily unit test classes while still hiding them from client developers.
Comments
The problem with the [InternalsVisibleTo] attribute is exactly that it enables you to unit test internals, which you shouldn't.
Dear Mark,
Thanks for another great article. I totally agree with your point of view, but I can also see why Børge might be somewhat unsatisfied with the answer to not use the InternalsVisibleTo attribute in the context of automated testing.
I think that using internal to hide certain types of a reusable code base is a violation of the Open/Closed Principle, because there is no easy way to extend the internal types without touching the source code (you can use reflection to call internal methods, but that's annoying, and you simply cannot derive from an internal class or implement an internal interface.)
What are your thoughts on this argument?
With regards,
Kenny
Kenny, thank you for writing. Overall, I like that additional argument. In order to keep this discussion balanced, I think it's worth pointing out that your argument certainly applies to reusable software, but that lots of software isn't (or rather, shouldn't be) reusable. Udi Dahan has pointed out that reuse may be overrated, or even have negative consequences, and while I've been disagreeing with certain parts of that claim, in general I think there's much wisdom in that observation.
Some software is written to be reusable, but much software shouldn't be. This is the sort of software I call Zoo software. Your argument about violation of the OCP certainly applies to Wildlife Software, but only indirectly to Zoo software.