ploeh blog danish software design
Test-Driven Development with F# Pluralsight course
Test-Driven Development and Functional Programming is a match made in heaven. Learn how and why in my new Pluralsight course.
A common criticism against Test-Driven Development (TDD) is that it leads to Test-Induced Damage. However, it doesn't have to be that way, and it turns out that with Functional Programming (FP), the design ideals of FP coincide with TDD.
In my new Pluralsight course on Test-Driven Development with F#, you'll learn how the intersection between TDD and F# presents opportunities for better design and better testability.
Introduction to Property based Testing with F# Pluralsight course
My latest Pluralsight course is an introduction to Property-Based Testing with F#.
Soon after the release of my Unit Testing with F# Pluralsight course, it gives me great pleasure to announce my new course, which is an Introduction to Property-based Testing with F#.
In this course, you'll get an introduction to the concept of Property-Based Testing, and see a comprehensive example that demonstrates how to incrementally implement a feature that otherwise would have been hard to address in an iterative fashion.
C# will eventually get all F# features, right?
C# will never get the important features that F# has. Here's why.
The relationship between C# and F# is interesting, no matter if you look at it from the C# or the F# perspective:
- Before releasing F# to the world, Don Syme, its inventor, was instrumental in getting generics into C# and .NET. Had it not been for his and his Cambridge colleagues' effort, we wouldn't have generics today, or we would have the subpar generics that Java has. (Okay, so this point isn't strictly about F#.)
- F# has had async workflows since 2007. In F#, this is simply one of many implementations of a more general language feature called Computation Expressions - other common examples are Sequence Expressions and Query Expressions, but you can also create your own. When async/await was added to C# in 2012, it was a port of that particular implementation, but turned into a one-shot language feature.
- C# 6 gets a lot of small language features, some of which F# already has: auto-property initializers, exception filters, expression-bodied function members...
- For C# 7, the design team is considering many Functional language features that F# already has: pattern matching, records, immutability, tuples...
It's not some hidden cabal that I'm trying to expose, either. Mads Torgersen has been quite open about this relationship.
Why care about F#, then? #
A common reaction to all of this seems to be that if C# eventually gets all the best F# features, there's no reason to care about F#. Just stick with C#, and get those features in the future.
The most obvious answer to such statements is that F# already has those features, while you'll have to wait for a long time to get them in C#. While C# 6 gets a few features, they are hardly killer features. So perhaps you'll get the good F# features in C#, but they could be years away, and some features might be postponed to later versions again.
In my experience, that argument mostly falls on deaf ears. Many programmers are content to wait, perhaps because they feel that the language choice is out of their hands anyway.
What F# features could C# get? #
Often, when F# enthusiasts attempt to sell the language to other programmers, they have a long list of language features that F# has, and that (e.g.) C# doesn't have. However, in the future, C# could hypothetically have those features too:
- Records. C# could have those as well, and they're being considered for C# 7. Implementation-wise, F# records compile to immutable classes anyway.
- Discriminated Unions. Nothing in principle prevents C# from getting those. After all, F# Discriminated Unions compile to a class hierarchy.
- Pattern matching Also being considered for C# 7.
- No nulls. It's a common myth that F# doesn't have nulls. It does. It's even a keyword. It's true that F# doesn't allow its Functional data types (records, unions, tuples, etc.) to have null values, but it's only a compiler trick. At run-time, these types can have null values too, and you can provide null values via Reflection. C# could get such a compiler trick as well.
- Immutability. F#'s immutability 'feature' is similar to how it deals with nulls. Lots of F# can be mutable (everything that interacts with C# code), but the special Functional data types can't. Again, it's mostly in how these specific data types are implemented under the hood that provides this feature, and C# could get that as well.
- Options. These are really just specialised Discriminated Unions, so C# could get those too.
- Object Expressions. Java has had those for years, so there's no reason C# couldn't get them as well.
- Partial Function Application. You can already do this in C# today, but the syntax for it is really awkward. Thus, there's no technical reason C# can't have that, but the C# language designers would have to come up with a better syntax.
- Scripting. F# is great for scripting, but as the success of scriptcs has shown, nothing prevents C# from being a scripting language either.
- REPL. A REPL is a really nice tool, but scriptcs already comes with a REPL, again demonstrating that C# could have that too.
What F# features can C# never get? #
There are a few F# features that many people point to as their favourite, that C# is unlikely to get. A couple of them are:
- Type Providers. Someone that I completely trust on this issue told me quite authoritatively that "C# will never get Type Providers", and then laughed quietly. While I don't know enough about the technical details of Type Providers to be able to evaluate that statement, I trust this person completely on this issue.
- Units of Measure. Here, I simply have to confess ignorance. While I haven't seen talk about units of measure for C#, I have no idea whether it's doable or not.
In a recent article, I explained how less is more when it comes to language features. Many languages come with redundant features (often for historical reasons), but the fewer redundant features a language has, the better.
The F# compiler doesn't allow circular dependencies. You can't use a type or a function before you've defined it. This may seem like a restriction, but is perhaps the most important quality of F#. Cyclic dependencies are closely correlated with coupling, and coupling is the deadliest maintainability killer of code bases.
In C# and most other languages, you can define dependency cycles, and the compiler makes it easy for you. In F#, the compiler makes it impossible.
Studies show that F# projects have fewer and smaller cycles, and that there are types of cycles (motifs) you don't see at all in F# code bases.
The F# compiler protects you from making cycles. C# will never be able to do that, because it would be a massive breaking change: if the C# compiler was changed to protect you as well, most existing C# code wouldn't be able to compile.
Microsoft has never been in the habit of introducing breaking changes, so I'm quite convinced that this will never happen.
Summary #
C# could, theoretically, get a lot of the features that F# has, but not the 'feature' that really matters: protection against coupling. Since coupling is one of the most common reasons for code rot, this is one of the most compelling reasons to switch to F# today.
F# is a great language, not only because of the features it has, but even more so because if the undesirable traits it doesn't have.
Less is more: language features
Many languages have redundant features; progress in language design includes removing those features.
(This post is also available in Japanese.)
There are many programming languages, and new ones are being introduced all the time. Are these languages better than previous languages? Obviously, that's impossible to answer, since there's no clear measurement of what constitutes a 'better' programming language.
Still, if you look at a historical trend, it looks as though one way to make a better language is to identify a redundant language feature, and design a new language that doesn't have that feature.
"perfection is attained not when there is nothing more to add, but when there is nothing more to remove." - Antoine de Saint Exupéry
In this article, you'll see various examples of language features that have already proven to be redundant, and other features where we are seeing strong indications that they are redundant.
Limitless ways to shoot yourself in the foot. #
When the first computers were created, programs had to be written in machine code or assembly language. In machine code, you can express everything the CPU can do, because the machine code is written it terms of a CPU's instruction set. While it's possible to write correct programs in machine code, you can also write a lot of incorrect programs, including programs that crash horribly, or perhaps even destroy the machine on which they are running.
You're most likely used to writing code in a higher-level language, but even so, if you share my experience with this, you'll agree that it takes a lot of trial and error to get things right. For every correct program, there are many incorrect variations. The set of incorrect programs is much bigger than the set of correct programs.
With machine code, you have limitless ways to create incorrect programs. Yes: you can express everything the CPU can execute, but most of it will be incorrect. Thus, the set of valid programs is a small subset of the set of all possible instructions.
Early computer programmers quickly discovered that writing in machine code was extremely error-prone, and the code was also as unreadable as it could be. One way to address this problem was to introduce assembly language, but it didn't solve the underlying problems: most assembly languages were just 'human-readable' one-to-one mappings over machine code, so you still have unlimited ways to express incorrect programs.
High-level languages #
After having suffered with machine code and assembly language for some time, programmers figured out how to express programs in high-level languages. The first programming languages aren't in much use today, but a good example of a 'low-level' high-level language still in use today is C.
In C, you can express almost all valid programs. There are still tons of ways to write incorrect programs that will crash the program (or the computer), but since the language is an abstraction over machine code, there are instructions you can't express. Most of these are invalid instruction sequences anyway, so it's for the better.
Notice what happened: moving from machine code to a high-level programming language removes a feature of the language. In machine code, you can express anything; in a high-level language, there are things you can't express. You're okay with that, because the options that were taken away from you were bad for you.
GOTO #
In 1968, Edsger Dijkstra published his (in)famous paper Go To Statement Considered Harmful. In it, he argued that the GOTO statement was an bad idea, and that programs would be 'better' without the GOTO statement. This sparked a decade-long controversy, but these days we've come to understand that Dijkstra was right. Some of the most popular languages in use today (e.g. Java and JavaScript) don't have GOTO at all.
Since GOTO has turned out to be an entirely redundant language feature, a language without it doesn't limit your ability to express valid programs, but it does limit your ability to express invalid programs.
Do you notice a pattern?
Take something away, and make an improvement.
There's nothing new about this; Robert C. Martin told us that years ago.
You can't just arbitrarily take away anything, because that may constrain you in such a way that there are valid programs you can no longer write:
You have to take away the right features.
Exceptions #
Today, everyone seems to agree that errors should be handled via some sort of exception mechanisms; at least, everyone can agree that error codes are not the way to go (and I agree). We need something richer than error codes, and something that balances usefulness with robustness.
The problem with exceptions is that this mechanism is really only GOTO statements in disguise - and we've already learned that GOTO is considered harmful.
A better approach is to use a sum type to indicate either success or failure in a composable form.
Pointers #
As Robert C. Martin also pointed out, in older languages, including C and C++, you can manipulate pointers, but as soon as you introduce polymorphism, you no longer need 'raw' pointers. Java doesn't have pointers; JavaScript doesn't do pointers; C# does allow you to use pointers, but I've personally never needed them outside of interop with the Windows API.
As these languages have demonstrated, you don't need access to pointers in order to be able to pass values by reference. Pointers can be abstracted away.
Numbers #
Most strongly typed languages give you an opportunity to choose between various different number types: bytes, 16-bit integers, 32-bit integers, 32-bit unsigned integers, single precision floating point numbers, etc. That made sense in the 1950s, but is rarely important these days; we waste time worrying about the micro-optimization it is to pick the right number type, while we lose sight of the bigger picture. As Douglas Crockford explains, in JavaScript there's only a single number type, which is a great idea - just too bad it's the wrong single number type.
With the modern computer resources we have today, a programming language that would restrict you to a single, sane number type would be superior to the mess we have today.
Null pointers #
Nulls are one of the most misunderstood language constructs. There's nothing wrong with the concept of a value that may or may not be present. Many great programming languages have this concept. In Haskell, it's called Maybe; in F#, it's called option; in T-SQL it's called null. What's common in all these languages is that it's an opt-in language feature: you can declare a value as being 'nullable', but by default, it isn't (nullable).
However, due to Tony Hoare's self-admitted billion-dollar mistake, mainstream languages have null pointers: C, C++, Java, C#. The problem isn't the concept of 'null', but rather that everything can be null, which makes it impossible to distinguish between the cases where null is an appropriate and expected value, from the cases where null is a defect.
Design a language without null pointers, and you take away the ability to produce null pointer exceptions.
If Tony Hoare is correct that null pointers cost billions of dollars in the last few decades, getting rid of that source of defects can save you lots of money. From Turing-complete languages like T-SQL, Haskell, and F# we know that you can express all valid programs without null pointers, while a huge source of defects is removed.
Mutation #
A central concept in Procedural, Imperative, and Object-Oriented languages is that you can change the value of a variable while executing your program. That's the reason it's called a variable. This seems intuitive, since a CPU contains registers, and what you actually do when you execute a program is that you move values in and out of those registers. It's also intuitive because the purpose of most programs is to change the state of the world: Store a record in a database. Send an email. Repaint the screen. Print a document. Etc.
However, it turns out that mutation is also a large source of defects in software, because it makes it much harder to reason about code. Consider a line of C# code like this:
var r = this.mapper.Map(rendition);
When the Map
method returns, has rendition
been modified? Well, if you follow Command Query Separation, it shouldn't, but the only way you can be sure is to review the implementation of the Map method. What if that method exhibits the same problem, by calling into other methods that could mutate the state of the application? There's nothing in C# (or Java, or JavaScript, etc.) that prevents this from happening.
In a complicated program with a big call stack, it's impossible to reason about the code, because everything could mutate, and once you have tens or hundreds of variables in play, you can no longer keep track of them. Did the isDirty
flag change? Where? What about the customerStatus
?
Imagine taking away the ability to mutate state:
Most languages don't completely take away this ability, but as Haskell (a Turing complete language) demonstrates, it's possible to write any program without implicit state mutation.
At this point, many people might object that Haskell is too difficult and unintuitive, but to me, that kind of argumentation is reminiscent of the resistance to removing GOTO. If you are used to relying on GOTO, you have to learn alternative ways to model the same behaviour without GOTO. Likewise, if you are used to relying on state mutation, you have to learn alternative ways to model the same behaviour without state mutation.
Reference Equality #
In Object-Oriented languages like C# and Java, the default equality comparison is Reference Equality. If two variables point to the same memory address, the variables are considered equal. If two variables have all identical constituent values, but point to two different memory addresses, they are considered different. That's not intuitive, and many software defects are caused by this.
What if you take away Reference Equality from a language?
What if all data structures instead had Structural Equality?
This one I'm not entirely sure about, but in my experience, I almost never need Reference Equality. For Basic Correctness you never need it, but I wonder if you may need the occasional reference comparison in order to implement some performance optimizations... Still, it wouldn't hurt to switch the defaults so that Structural Equality was the default, and there was a special function you could invoke to compare two variables for Reference Equality.
Inheritance #
Even in 2015, inheritance is everywhere, although it's more than 20 years ago the Gang of Four taught us to favor object composition over class inheritance. It turns out that there's nothing you can do with inheritance that you can't also do with composition with interfaces. The converse doesn't hold in languages with single inheritance: there are things you can do with interfaces (such as implement more than one), that you can't do with inheritance. Composition is a superset of inheritance.
This isn't just theory: in my experience, I've been able to avoid inheritance in the way I've designed my code for many years. Once you get the hang of it, it's not even difficult.
Interfaces #
Many strongly typed languages (for example Java and C#) have interfaces, which is a mechanism to implement polymorphism. In this way, you can bundle various operations together as methods on an interface. However, one of the consequences of SOLID is that you should favour Role Interfaces over Header Interfaces, and as I've previously explained, the logical conclusion is that all interfaces should only have a single member.
When interfaces only have a single member, the interface declaration itself tends to mostly be in the way - the only thing that matters is the operation. In C#, you could just use a delegate instead, and even Java has lambdas now.
There's nothing new about this. Functional languages have used functions as the basic compositional unit for years.
In my experience, you can model everything with single-member interfaces, which also implies that you can model everything with functions. Again, this isn't really surprising, since functional languages like Haskell are Turing complete too.
It's possible to get by without interfaces. In fact, Robert C. Martin finds them harmful.
Reflection #
If you've ever done any sort of meta-programming in .NET or Java, you probably know what Reflection is. It's a set of APIs and language or platform features that enable you to inspect, query, manipulate, or emit code.
Meta-programming is an indispensable tool, so I'd be sorry to lose it. However, Reflection isn't the only way to enable meta-programming. Some languages are homoiconic, which means that a program in such languages is structured as data, which itself can be queried and manipulated like any other data. Such languages don't need Reflection as a feature, because meta-programming is baked into the language, so to speak.
In other words: Reflection is a language feature aimed at the goal of meta-programming. If meta-programming can be achieved via homoiconicity instead, it would imply that Reflection is a redundant feature.
Cyclic Dependencies #
While it may be true that null pointers are the biggest single source of software defects, in my experience the greatest single source of unmaintainable code is coupling. One of the most problematic types of coupling is cyclic dependencies. In languages like C# and Java, cyclic dependencies are almost impossible to avoid.
Here's one of my own mistakes that I only discovered because I started to look for it: in the otherwise nice and maintainable AtomEventStore, there's an interface called IXmlWritable:
public interface IXmlWritable { void WriteTo(XmlWriter xmlWriter, IContentSerializer serializer); }
As you can tell, the WriteTo method takes an IContentSerializer argument.
public interface IContentSerializer { void Serialize(XmlWriter xmlWriter, object value); XmlAtomContent Deserialize(XmlReader xmlReader); }
Notice that the Deserialize method returns an XmlAtomContent value. How is XmlAtomContent defined? Like this:
public class XmlAtomContent : IXmlWritable
Notice that it implements IXmlWritable. Oh, dear!
Although I'm constantly on the lookout for these kinds of things, that one slipped right past me.
In F# (and, I believe, OCaml), on the other hand, this wouldn't even have compiled!
While F# has a way to introduce small cycles within a module using the and
and rec
keywords, there are no accidental cycles. You have to explicitly use those keywords to enable limited cycles - and there's no way to define cycles that span modules or libraries.
What an excellent protection against tightly coupled code! Take away the ability to (inadvertently) introduce Cyclic Dependencies, and get a better language!
This has even been shown to work in the wild: Scott Wlaschin examined C# and F# projects for cycles, and found that F# projects have fewer and smaller cycles than C# projects. This analysis was later enhanced and corroborated by Evelina Gabasova.
Summary #
What I've tried to illustrate in this article is that there are many ways in which you could make a better language by taking away a particular feature from an existing language. Take away a redundant feature, and you'll still have a Turing complete language that can do (close to) anything, but with fewer options for shooting yourself in the foot.
Perhaps the ultimate programming language is a language without:
- GOTO
- Exceptions
- Pointers
- Lots of specialized number types
- Null pointers
- Mutation
- Reference equality
- Inheritance
- Interfaces
- Reflection
- Cyclic dependencies
Have I identified all possible redundant features? Most likely not, so here's a great opportunity for language designers to define an even better language, by finding something new to take away!
Update September 14 2015: This article sparked a .NET Rocks! episode with even more discussion about this topic.
Comments
The idea that Javascript is a superset of "valid programs", but that C is not, could do with more explanation.
My background is in server code (Haskell/Java/Python/Javascript/Rust), web code (Javascript) and embedded code (Rust/C) (with some ARM assembly for when Rust/C can't produce the needed program unaided). This makes the idea that I could express programs in Javascript, which I couldn't in C, very interesting.
I think that Rust is going to be very important in embedded development, in a few years.
P.S. I hope I have commented in the correct way.
Chris, thank you for writing. Where in this article do I claim that there are programs you could express in JavaScript, but not in C?
Thanks for the post Mark. Some great points; I like especially the idea of disallowing cyclic dependencies. That'd be awesome on a legacy Java project I'm working on now!
Having working in Scala and a little Haskell, I can say I love the Maybe type. One thing I have trouble visualising still is Exceptions. I think I need to find some more examples on doing without exceptions in real applications.
I guess there's somewhat of a middle ground between having the "safest" language, vs the most performant language. In a lot of cases you don't need awesome performance so the safest language is the better one... but in some you do. You occasionally need control over how your bits are packed.
Let's hope for the future!
Lachlan, thank you for writing. When it comes to working without exceptions, the point is to replace them with something stronger. Scott Wlashin's post and presentation about Railway Oriented Programming is a great place to start. You can see another example in my No Mocks presentation, and in one of my up-coming Pluralsight courses.
When it comes to the balance between safe and performant, there will always be room for languages that sacrifice safety for speed, but in my experience, this shouldn't be a common concern. As soon as you're doing any sort of significant I/O, the cost of that tends to be so much higher than any potential imperfections in pure CPU processing.
Apart from that, I don't see why, in principle, a safe language couldn't also be performant. These two traits don't seem to me to be intrinsically mutually exclusive.
As one of less-is-more advocates, I feel your article makes sense instinctively. However, one thing keeps bugging me:
I think I know what you try to mean with it, but once we try to be more specific, validity can only be defined in terms of a certain framework of semantics. If we have such a framework, then we can derive a programming language from it, which would be a language that only accepts valid programs and reject others. Problem solved.
In reality we don't have such a framework. We don't know what is a valid program precisely. There are programs that doesn't seem to make sense, and there are programs that are obviously useful, but there are huge gray area inbetween. It's actually a moving target---the context, the environment, the user, and other outside factors will greatly affect what's valid or not. But the fact that we don't know what valid programs shakes the ground of this whole discussion, doesn't it? Or it can end up tautology---"valid programs are defined in terms of this language, so this language fit the best to cover the valid programs."
Shiro Kawai, thank you for writing. In this article, I deliberately left the definition of validity vague. Why do we write software? If we exclude Katas and the like (which we do for training), software is ultimately written in order to solve problems; in order to be used. A valid program is a program that's useful.
This doesn't change the discussion, which is pragmatic. From experience, we know that we don't need GOTO; from experience, we know that we don't need pointers; from experience, we know that we don't need null pointers; from experience, we know that we don't need inheritance; etc.
Hi, thanks for the reply. I believe I get your intent. What I wanted was to point out a danger of this kind of argument. Because when you write software to solve problems, you need to articulate the problem, and the way you frame the problem is inevitably influenced by the frame of your language of choice. In other words, when you think you tighten the circle of the language features, it's not necessarily that you get it close to supposed set of "valid programs", but you may just as well start ignoring useful programs outside of your langauge circle and you don't even realize that. You just think "ok, there may be some programs that falls outside but it's not that important." How do you verify that the excluded gray area isn't that important?
(I wrote "you", but actually this is something I constanly try to remind myself, since I tend to be dragged toward feature-minimalism. I love Scheme.)
For example, you talk single-member inheritance and you brought up Haskell, but you didn't mention type classes. Isn't type classe sort of interface, in a sense that it defines a protocol consists of multiple functions? Do you say it's actually redundant? Or will you dismiss programs that can be expressed well using type classes "unimportant"?
Another controversial (and probably obscure) feature is change-class in CL. This one is such a beast that it's difficult to cope with many modern features (obviously it invalidates strict typecheckers). However, in the context where the feature is needed, I don't know if there's a replacement. If you get rid of it, you just have to give up the kind of software that require the feature. That's a reasonable trade-off, but what you do is that you cut off a part of "valid programs" in order to tighten the circle.
There's no solid circle of valid programs, and as you tighten the circle of the language, you actually shape the circle of valid programs that suit to the circle of language you draw. That's not necessarily a bad thing, but the language designers need to be aware of it, that's what I wanted to say.
It's a good point that a language shapes how you approach problems. In this discussion, I assume that all languages (both existing and hypothetical) are Turing complete, but even under that assumption, there will be problems that are difficult, or perhaps even impossible, to address in a particular language.
The question is whether that isn't true for all languages?
In a sense, the most 'powerful' languages are all the dialects of machine code. By definition, they should be able to express everything a particular CPU can do, so by corollary, they should be the most complete languages available. Despite being complete, these dialects don't have inheritance, exceptions, Reflection, interfaces, or a lot of other things. Such features are features that have been added to some languages. Logically, I don't see how 'removing' such a feature constrains one's ability to express 'all valid programs'.
To be fair, that argument of mine doesn't apply to all the language features I've described. For example, CPU instruction sets most certainly allow immutability.
What I was aiming at with my article was never a formal discussion about how a hypothetical language would, or wouldn't, be able to express 'all valid programs'. The purpose was more to point out that decades of experience should by now have taught the overall programmer community that there are many language features that we can do without.
As I pointed out, you can't arbitrarily remove features. Some features are extremely important within a particular language, while it may be irrelevant in another language. While I don't know much about Type Classes, it seems to be an example of this: it's very important in Haskell, but doesn't exist at all in C# or F#.
This is also the reason why I don't think we'll ever have a single, perfect programming language. In the future, we'll also have many different languages, and they'll be good at different types of problems.
Just as it is today, it'll be important to know more than a single programming language, because, exactly as you wrote, a problem may 'fall outside' a given language, but then, if you know another language, you may realise that you can use it to solve that particular problem.
Unit Testing with F# Pluralsight course
My latest Pluralsight course is an introduction to unit testing with F#.
Perhaps you already know all about unit testing. Perhaps you already know all about F#. But do you know how to write unit tests in F#?
My new Pluralsight course explains how to write unit tests with F#. If you already know F# and unit testing on .NET, it's quite straightforward. This is my first beginner-level course on Pluralsight, so regular readers of this blog may find it too basic.
Still, if you don't know what Unquote is and can do for you, you may want to consider watching module four, which introduces this great assertion library, and provides many examples.
This entire course will, together with some of my existing Pluralsight courses, serve as a basis for more courses on F# and Test-Driven Development.
POSTing JSON to an F# Web API
How to write an ASP.NET Web API service that accepts JSON in F#.
It seems that many people have problems with accepting JSON as input to a POST method when they attempt to implement an ASP.NET Web API service in F#.
It's really quite easy, with one weird trick :)
You can follow my recipe for creating a pure F# Web API project to get started. Then, you'll need to add a Data Transfer Record and a Controller to accept your data:
[<CLIMutable>] type MyData = { MyText : string; MyNumber : int } type MyController() = inherit ApiController() member this.Post(myData : MyData) = this.Ok myData
That's quite easy; there's only one problem with this: the incoming myData
value is always null.
The weird trick #
In addition to routes etc. you'll need to add this to your Web API configuration:
GlobalConfiguration.Configuration.Formatters.JsonFormatter.SerializerSettings.ContractResolver <- Newtonsoft.Json.Serialization.CamelCasePropertyNamesContractResolver()
You add this in your Application_Start method in your Global class, so you only have to add it once for your entire project.
The explanation #
Why does this work? Part of the reason is that when you add the [<CLIMutable>] attribute to your record, it causes the record type to be compiled with auto-generated internal mutable fields, and these are named by appending an @ character - in this case, the field names become MyText@
and MyNumber@
.
Apparently, the default JSON Contract Resolver (whatever that is) goes for those fields, even though they're internal, but the CamelCasePropertyNamesContractResolver doesn't. It goes for the properly named MyText
and MyNumber
writeable public properties that the compiler also generates.
As the name implies, the CamelCasePropertyNamesContractResolver converts the names to camel case, so that the JSON properties become myText
and myNumber
instead, but I only find this appropriate anyway, since this is the convention for JSON.
Example HTTP interaction #
You can now start your service and make a POST request against it:
POST http://localhost:49378/my HTTP/1.1 Content-Type: application/json { "myText": "ploeh", "myNumber": 42 }
This request creates this response:
HTTP/1.1 200 OK Content-Type: application/json; charset=utf-8 {"myText":"ploeh","myNumber":42}
That's all there is to it.
You can also receive XML instead of JSON using a similar trick.
Comments
[<JsonObject(MemberSerialization=MemberSerialization.OptOut)>]
to the type also worksProperty Based Testing without a Property Based Testing framework
Sometimes, you don't need a Property-Based Testing framework to do Property-Based Testing.
In my previous post, I showed you how to configure FsCheck so that it creates char values exclusively from the list of the upper-case letters A-Z. This is because the only valid input for the Diamond kata is the set of these letters.
By default, FsCheck generates 100 random values for each property, and runs each property with those 100 values. My kata code has 9 properties, so that means 900 function calls (taking just over 1 second on my Lenovo X1 Carbon).
However, why would we want to select 100 random values from a set of 26 valid values? Why not simply invoke each property (which is a function) with those 26 values?
That's not so hard to do, but if there's a way to do it with FsCheck, I haven't figured it out yet. It's fairly easy to do with xUnit.net, though.
What you'll need to do is to change the Letters type to an instance class implementing seq<obj[]> (IEnumerable<object[]> for the single C# reader still reading):
type Letters () = let letters = seq {'A' .. 'Z'} |> Seq.cast<obj> |> Seq.map (fun x -> [|x|]) interface seq<obj[]> with member this.GetEnumerator () = letters.GetEnumerator() member this.GetEnumerator () = letters.GetEnumerator() :> Collections.IEnumerator
This is simply a class that enumerates the char values 'A' to 'Z' in ascending order.
You can now use xUnit.net's Theory and ClassData attributes to make each Property execute exactly 26 times - one for each letter:
[<Theory; ClassData(typeof<Letters>)>] let ``Diamond is as wide as it's high`` (letter : char) = let actual = Diamond.make letter let rows = split actual let expected = rows.Length test <@ rows |> Array.forall (fun x -> x.Length = expected) @>
Instead of 900 tests executing in just over 1 second, I now have 234 tests executing in just under 1 second. A marvellous speed improvement, and, in general, a triumph for mankind.
The point is that if the set of valid input values (the domain) is small enough, you may consider simply using all of them, in which case you don't need a Property-Based Testing framework. However, I still think this is probably a rare occurrence, so I'll most likely reach for FsCheck again next time I need to write some tests.
A simpler Arbitrary for the Diamond kata
There's a simple way to make FsCheck generate letters in a particular range.
In my post about the Diamond kata with FsCheck, I changed the way FsCheck generates char values, using this custom Arbitrary (essentially a random value generator):
type Letters = static member Char() = Arb.Default.Char() |> Arb.filter (fun c -> 'A' <= c && c <= 'Z')
This uses the default, built-in Arbitrary for char values, but filters its values so that most of them are thrown away, and only the letters 'A'-'Z' are left. This works, but isn't particularly efficient. Why generate a lot of values only to throw them away?
It's also possible to instruct FsCheck to generate values from a particular set of valid values, which seems like an appropriate action to take here:
type Letters = static member Char() = Gen.elements ['A' .. 'Z'] |> Arb.fromGen
Instead of using Arb.Default.Char()
and filtering the values generated by it, this implementation uses Gen.elements
to create a Generator of the values 'A'-'Z', and then an Arbitrary from that Generator.
Much simpler, but now it's also clear that this custom Arbitrary will be used to generate 100 test cases (for each property) from a set of 26 values; that doesn't seem right...
Library Bundle Facade
Some people want to define a Facade for a bundle of libraries. Is that a good idea?
My recent article on Composition Root reuse generated some comments:
These comments are from two different people, but they provide a decent summary of the concerns being voiced."What do you think about pushing these factories and builders to a library so that they can be reused by different composition roots?"
"We want to share the composition root, because otherwise when a component needs a new dependency and a constructor parameter is added, we'd have to change the same code in two different places."
Is it a good idea to provide one or more Facades, for example in the form of Factories or Builders, for the libraries making up a Composition Root? More specifically, is it a good idea to provide a Factory or Builder that can compose a complete object graph spanning multiple libraries?
In this article, I will attempt to answer that question for various cases of library bundles. To make the terminology a bit more streamlined, I'll refer to any Factory or Builder that composes object graphs as a Composer.
Single library #
In the case of a single library, I think I've already answered the question in the affirmative. There's nothing wrong with including a Facade in the form of a Factory or Builder in order to make that single library easier to use.
Two libraries #
When you introduce a second library, things start becoming interesting. If we consider the case of two libraries, for example a Domain Model and a Data Access Library, the Composition Root will need to compose an object graph where some of the objects in the graph are from the Domain Model, and some of the objects are from the Data Access Library.
In the spirit of Agile Principles, Patterns, and Practices (APPP), it turns out that simply drawing dependency diagrams can be helpful. From the Dependency Inversion Principle (DIP) we know that the "clients [...] own the abstract interfaces" (APPP, chapter 11), which means that for our two example libraries, the dependency graph must look like this:
At least, if you follow the most common architectures for loosely couple code, the Domain Model is the 'client', so it gets to define the interfaces it needs. Thus, it follows that the Data Access Library, in order to implement those interfaces, must have a compile-time dependency on the Domain Model. That's what the arrow means.
From this diagram, it should be clear that you can't put a Factory or Builder in the Domain Model library. If the Composer should compose object graphs from both libraries, it would need to reference both of those libraries, and the Domain Model can't reference the Data Access Library, since that would result in a circular reference.
You could put the Composer in the Data Access Library, but that somehow doesn't feel right, and in any case, as we shall see later, this solution can't be generalised to n libraries.
A solution that many people reach for, then, is to pull the interfaces out into a separate library, like this:
It's a bit like cheating, according to the DIP, but it's as decoupled as before. In this diagram, the Domain Model depends on the interfaces because it uses them, while the Data Access Library depends on the interface library because it implements the interfaces. Unfortunately, this doesn't solve the problem at all, because there's still no place to place a Composer without getting into a problem with either the DIP, or circular references (exercise: try it!).
A possible option is to keep the libraries as the DIP dictates, and then add a third Composer library:
The Composer library references both the Domain Model and the Data Access Library, so it's possible for it to compose object graphs with objects from both libraries. The only purpose of this library, then, is to compose those object graphs, so it'll likely only contain a single class.
Multiple libraries #
Does the above conclusions change if you have more than two libraries? Only in the sense that it further restricts your options. As the analysis of the special case with two libraries demonstrated, you only have two options for adding a Composer to your bundle of libraries:
- Put the Composer in the Data Access Library
- Put the Composer in a new dedicated library
Imagine that you have two Data Access Libraries instead of one:
For instance, the SQL Access Library may implement various interfaces defined by the Domain Model, based on a SQL Server database; and the Web Service Access Library may implement some other interfaces by calling out to some web service.
If the Composer must be able to compose object graphs with object from all three libraries, it must reside in a library that references all of the relevant libraries. The Domain Model is still out of the question because you can't have circular references. That leaves one of the two Data Access libraries. It'd be technically possible to e.g. add a reference from the SQL Access Library to the Web Service Access Library, and put the Composer in the SQL Access Library:
However, why would you ever do that? It's clearly wrong to let one Data Access library depend on another, and it doesn't help if you reverse the arrow.
Thus, the only option left is to add a new Composer library:
As before, the Composer library has references to all other libraries, and contains a single class that composes object graphs.
Over-engineering #
The point of this analysis is to arrive at the conclusion that no matter how you twist and turn, you'll have to add a completely new library with the only purpose of composing object graphs. Is it warranted?
If the only motivation for doing this is to avoid duplicated code, I would argue that this looks like over-engineering. In the questions quoted above, it sounds as if the Rule of Three isn't even satisfied. It's always important to question your motivations for avoiding duplication. In this case, I'd be wary of introducing so much extra complexity only in order to avoid writing the same lines of code twice - particularly when it's likely that the duplication is accidental.
TL;DR #
Attempting to provide a reusable Facade to compose object graphs across multiple libraries is hardly worth the trouble. Think twice before you do it.
Comments
Dear Mark,
Thanks again for another wonderful post. I'm one of the guys you mentioned at the beginning of this text and I totally agree with you on this subject.
Maybe I wasn't specific enough in my comment that I didn't mean to introduce factories and builders handling dependencies from two or more assemblies, but only from a single one. The composition root wires the independent subgraphs from the different assemblies together.
However, I want to reemphasize the Builder Pattern in this context: it provides default dependencies for the object graph to be created and (usually) a fluent API (i.e. some sort of method chaining) to exchange these dependencies. This has the following consequences for the client programmer using the builder(s): he or she can easily create a default object graph by calling new Builder().Build() and still exchange dependencies he or she cares about. This can keep the composition root clean (at least for Arrange phases of tests this is true).
Why I'm so excited about this? Because I use this all the time in my automated tests, but haven't really used it in my production composition roots (or seen it in others). After reading this post and Composition Root Reuse, I will try to incorporate builders more often in my production code.
Mark - thank you for making me think about this.
Kenny, thank you for writing. The Builder pattern is indeed a good way to implement Facades for a single library.
From Primitive Obsession to Domain Modelling
A string is sometimes not a string. Model it accordingly.
Recently, I was reviewing some code that looked like this:
public IHttpActionResult Get(string userName) { if (string.IsNullOrWhiteSpace(userName)) return this.BadRequest("Invalid user name."); var user = this.repository.FindUser(userName.ToUpper()); return this.Ok(user); }
There was a few things with this that struck me as a bit odd; most notably the use of IsNullOrWhiteSpace. When I review code, IsNullOrWhiteSpace is one of the many things I look for, because most people use it incorrectly.
This made me ask the author of the code why he had chosen to use IsNullOrWhiteSpace, or, more specifically, what was wrong with a string with white space in it?
The answer wasn't what I expected, though. The answer was that it was a business rule that the user name can't be all white space.
You can't really argue about business logic.
In this case, the rule even seems quite reasonable, but I was just so ready to have a discussion about invariants (pre- and postconditions) that I didn't see that one coming. It got me thinking, though.
Where should business rules go? #
It seems like a reasonable business rule that a user name can't consist entirely of white space, but is it reasonable to put that business rule in a Controller? Does that mean that everywhere you have a user name string, you must remember to add the business rule to that code, in order to validate it? That sounds like the kind of duplication that actually hurts.
Shouldn't business rules go in a proper Domain Model?
This is where many programmers would start to write extension methods for strings, but that's just putting lipstick on a pig. What if you forget to call the appropriate extension method? What if a new developer on the team doesn't know about the appropriate extension method to use?
The root problem here is Primitive Obsession. Just because you can represent a value as a string, it doesn't mean that you always should.
The string
data type literally can represent any text. A user name (in the example domain above) can not be any text - we've already established that.
Make it a type #
Instead of string, you can (and should) make user name a type. That type can encapsulate all the business rules in a single place, without violating DRY. This is what is meant by a Domain Model. In Domain-Driven Design terminology, a primitive like a string or a number can be turned into a Value Object. Jimmy Bogard already covered that ground years ago, but here's how I would define a UserName class:
public class UserName { private readonly string value; public UserName(string value) { if (value == null) throw new ArgumentNullException("value"); if (!UserName.IsValid(value)) throw new ArgumentException("Invalid value.", "value"); this.value = value; } public static bool IsValid(string candidate) { if (string.IsNullOrEmpty(candidate)) return false; return candidate.Trim().ToUpper() == candidate; } public static bool TryParse(string candidate, out UserName userName) { userName = null; if (string.IsNullOrWhiteSpace(candidate)) return false; userName = new UserName(candidate.Trim().ToUpper()); return true; } public static implicit operator string(UserName userName) { return userName.value; } public override string ToString() { return this.value.ToString(); } public override bool Equals(object obj) { var other = obj as UserName; if (other == null) return base.Equals(obj); return object.Equals(this.value, other.value); } public override int GetHashCode() { return this.value.GetHashCode(); } }
As you can tell, this class protects its invariants. In case you were wondering about the use of ToUpper, it turns out that there's also another business rule that states that user names are case-insensitive, and one of the ways you can implement that is by converting the value to upper case letters. All business rules pertaining to user names are now nicely encapsulated in this single class, so you don't need to remember where to apply them to strings.
If you want to know the underlying string, you can either invoke ToString, or take advantage of the implicit conversion from UserName to string. You can also compare two UserName instances, because the class overrides Equals. If you have a string, and want to convert it to a UserName, you can use TryParse.
The original code example above can be refactored to use the UserName class instead:
public IHttpActionResult Get(string candidate) { UserName userName; if (!UserName.TryParse(candidate, out userName)) return this.BadRequest("Invalid user name."); var user = this.repository.FindUser(userName); return this.Ok(user); }
This code has the same complexity as the original example, but now it's much clearer what's going on. You don't have to wonder about what looks like arbitrary rules; they're all nicely encapsulated in the UserName class.
Furthermore, as soon as you've left the not Object-Oriented boundary of your system, you can express the rest of your code in terms of the Domain Model; in this case, the UserName class. Here's the IUserRepository interface's Find method:
User FindUser(UserName userName);
As you can tell, it's expressed in terms of the Domain Model, so you can't accidentally pass it a string. From that point, since you're receiving a UserName instance, you know that it conforms to all business rules encapsulated in the UserName class.
Not only for OOP #
While I've used the term encapsulation once or twice here, this way of thinking is in no way limited to Object-Oriented Programming. Scott Wlaschin describes how to wrap primitives in meaningful types in F#. The motivation and the advantages you gain are the same in Functional F# as what I've described here.
Over-engineering? #
Isn't this over-engineering? A 56 lines of code class instead of a string? Really? The answer to such a question is always context-dependent, but I rarely find that it is (over-engineering). You can create a Value Object like the above UserName class in less than half an hour - even if you use Test-Driven Development. When I created this example, I wrote 27 test cases distributed over five test methods in order to make sure that I hadn't done something stupid. It took me 15 minutes.
You may argue that 15 minutes is a lot, compared to the 0 minutes it would take you if you'd 'just' use a string. On the surface, that seems like a valid counter-argument, but perhaps you're forgetting that with a primitive string, you still need to write validation and business logic 'around' the string, and you have to remember to apply that logic consistently across your entire code base. My guess is that you'll spend more than 15 minutes on doing this, and troubleshooting defects that occur when someone forgets to apply one of those rules to a string in some other part of the code base.
Summary #
Primitive values such as strings, integers, decimal numbers, etc. often represent concepts that are constrained in some ways; they're not just any string, any integer, or any decimal number. Ask yourself if extreme values (like the entire APPP manuscript, Int32.MinValue, and so on) are suitable for such variables. If that's not the case, consider introducing a Value Object instead.
If you want to learn more about Encapsulation, you can watch my Encapsulation and SOLID Pluralsight course.
Comments
Mark, I enjoyed your article. I feel that this is the way to go. Too often have I seen people using arrays if bytes for images and strings for email addresses.
An important aspect of DDD is within the language and when there are people talking about user names all the time, that usually indicates that usernames are an important aspect of what is going on and should be modeled explicitly and not use primitives.
The solution doesn't look overengineered to me, although I had to defend similar code before for the same reason. Not only is it now a good place to put further validation, but also allows further distinction using other types and a polymorphism (if carefully used, they can be a delight), each of which covers different invariants.
A UserName can be moved around and used to retrieve objects. But maybe you can't do that with a class of the type InvalidUsername. In another example, maybe there are functions that deal with email eddresses - which I have often seen modeled as strings. Some accept InvalidEmailAddress objects, others accept ValidEmail addresses. While it doesn't reduce the amount of boilerplate null checks, a SendEmail function wouldn't have to do
... SendEmail(EmailAddress email)
{
if(email.IsValid()) {
email.Send()
}
}
One could simply make it to only accept valid email addresses. Their mere existence would guarantee, that the email address is valid. One could then write functions dealing with valid email addresses and some dealing with invalid ones. You can log and analyze bad account creation requests with the invalid ones, but not send email to them and prevent everyone else from doing so, depending on what kind of email address is being passed around.
Johannes, thank you for writing. You are right, it's possible to take the idea further in the way you describe. Again, such techniques aren't even limited to OOP; for instance, Scott Wlaschin explains how to make illegal states unrepresentable in F#.
In fact, the Functional approach using Sum Types (Discriminated Unions) is nicer, because it doesn't rely on inheritance :)
Interesting article. I have no qualms about creating domain types around primitives per se, but IMHO there should be a bit more complexity in the invariants than a single condition before resorting to that. In this case, I would think that the username constraint should be implemented in the User class. If other entities shared the constraint (e.g. Company.AccountName), the logic might be refactored into a NameValidator, and then referenced concretely by the entities.
A string holding a strong password would be a good candidate for promoting to something smarter.
Mark,
This is exactly right. I honestly have no comment on the concept because I have changed primitives to value objects a number of times and use something similar.
Quick side-note, and this may just be a preference: In the TryParse, a Trim()'d value is being passed into the constructor. So if I want to create two usernames " Suamere " and "Suamere", the second will fail since, unbeknownst to me, the first had been successfully created with Trim(). It seems to me that "Do not allow pre/post white-space" is a business rule. Therefore, if there is white-space, I consider that to be breaking a business rule.
My main motivation for this comment is that this UserName Object shows a perfect example of a pattern I've seen when using the TryParse pattern (or Monads). That is: The logic in the TryParse is frequently an exact duplicate of logic in another method such as the constructor. Though instead of throwing exceptions, TryParse gracefully informs the consumer of success/failure.
So business logic is duplicated in the CTOR(), IsValid(), and the TryParse(). Instead, I consider this pattern:
public UserName(string value) { CheckIsValid(value, true); this.value = value; } public static bool TryParse(string candidate, out UserName userName) { userName = null; if (!CheckIsValid(candidate, false)) return false; userName = new UserName(candidate); return true; } public static bool IsValid(string candidate) { return CheckIsValid(candidate, false); } private static bool CheckIsValid(string candidate, bool throwExceptions) { //Brevity follows, but can be split up into more granular rules and exceptions if (string.IsNullOrWhiteSpace(candidate)) { if (throwExceptions) throw new ArgumentNullException("candidate"); return false; } if (!string.Equals(candidate, candidate.Trim(), StringComparison.OrdinalIgnoreCase)) { if (throwExceptions) throw new ArgumentException("Invalid value.", "candidate"); return false; } return true; }
Let me know what you think!
~Suamere, Steven Fletcher
Mike, thank you for writing. A business rule doesn't have to be complex in order to be important. The point isn't only to encapsulate complex logic, but to make sure that (any) business logic is applied consistently.
In my experience, things often start simple, but evolve (or do they devolve?) into something more complex; when do you draw the line?
Steven, thank you for writing. You exhibit traits of critical thinking, which is always good when dealing with business logic :) In the end, it all boils down to what exactly the business rule is. As you interpret the business rule, you rephrase it as "Do not allow pre/post white-space", but that's not the business rule I had in mind.
Since I'm the author of the blog post, I have the luxury of coming up with the example, and thus, in this case, defining the business rule. The business rule I had in mind is that user names are case-insensitive, and they're also insensitive to leading and trailing white space. Thus, " Suamere " and "Suamere" are considered to be two representations of the same user name, just as "FOO" and "foo" represent the same user name. The canonicalised representation of these two user names are "SUAMERE" and "FOO", respectively.
That's also the reason I chose the name TryParse; it's equivalent to DateTime.TryParse, which should be a well-known .NET idiom. If I parse the two strings "2015-01-21" and " 21. januar 2015 " on my machine (which is running with the da-DK locale), I get the exact same value from both strings. Notice that DateTime.TryParse also blissfully ignores leading and trailing white space. This is all correct according to Postel's law.
If all I'd wanted to do was to convert an already valid string into a UserName instance, I'd have implemented an explicit conversion, which is what Jimmy Bogard did in his article. In fact, such an explicit conversion doesn't conflict with the current TryParse method, so I'd find it quite reasonable to add that as well, if any client had the need for it.
This question might be slightly besides the point, but I'm having trouble with the bit about UserName values being case-insensitive. I take that to mean that you can create a UserName with all uppercase, all lowercase, or any combination thereof, and they'll all be treated equally.
To me, that doesn't mean that you have to pass in an uppercase-only value to the constructor (or TryParse() method, for that matter) in order to successfully instantiate a UserName. Was that your intent?
I created a Gist with some tests to illustrate what I'm talking about: I can instantiate with "JEFF", but not with "jeff", using either the constructor directly or the TryParse method.
Jeff, I attempted to run your unit tests, and the only one that fails is, as expected, if you attempt to use the constructor with "jeff"; TryParse with "jeff" works.
Would it have helped if the constructor was private?
I stand corrected regarding the failing tests; I worked with the tests again, and indeed, only the constructor method failed passing in "jeff" in lowercase.
Making the constructor private would clarify things in that there would only be one way for a client to instantiate UserName, which happens to ensure anything passed into the private constructor is all upper-case.
That seems to better match the stated business rule, that UserName values are case-insensitive, rather than a rule that prohibits anything but upper-case values. In any case, it's not quite central to the point of your article, so I thank you for taking the time to reply anyway.
Jeff, I think we fundamentally agree, and I do understand why you'd find it surprising that the constructor accepts "JEFF", but not "jeff". FWIW, I follow (a lot) of rules (of thumb) when I design APIs, and one of them is that the purpose of constructors is to initialise objects. A constructor may perform validation, but must not perform work in the sense of transforming the input or producing side-effects. In other words, constructors are used to initialise objects with valid values.
Whenever I need to transform data as part of initialisation, I often create some sort of factory method for that explicit purpose. A static TryParse method is one of many options available when such a need arises. The point is exactly that the input may not start out being 'valid', but it can be transformed into a valid representation. Going from "jeff" to UserName("JEFF") is a conversion.
This idea of constructors only accepting valid values may seem a bit foreign at first, but if you follow it as a general principle, you get a code base that's easier to reason about, because it's consistent. It's probably because I'm so used to this principle that I just left the constructor public, following another principle of mine, which is that there's no reason to make anything internal or private if the class or member can properly protect its invariants.
Sorry to rant; you just inspired me to explain some of my underlying design thoughts that caused me to arrive at this particular solution.
Thanks for your post, very informative as usual. I would add a few insights.
As much as it can feel Over engineering, I would totally agree and say it doesn't. I deal with GPU resource creation (Direct3D11) on a daily basis and this has helped me so much and saved me headaches (for example a Buffer size must have more than 0 elements, failing to do so gives you a cryptic runtime error than you can only catch error message by enabling Debug Device and looking at Debug output window!
Using a simple BufferElementCount in that case enforces the fact that I provide a correct value, and throw a meaningful exception early in the process. By passing a BufferElementCount I can (almost) safely expect that my buffer gets created.
On the lines of code, I would expect that the initial 50 lines would largely be overtaken by rewriting the same logic, but scattered across the whole code (plus likely adding try/catch blocks in many places on top of it), so I'm pretty sure doing the 0 lines version would lead to more code at the end. Plus 50 lines of code to save be a deep debug session on half million lines code base is a trade-off I'll happily take!
Also as a little bonus, for simple cases I wrote a little snippet so I guess I would share it here Domain Primitive. Now it takes even less than 2 minutes in most simple cases.
Your Equals is fragile. Because the class is not sealed, one could descend from it and break the equality contract.
Example of how to break through descending:
public class DomainUserName : UserName{ private string domain; //Equals written to compare both username and domain } UserName u1 = new DomainUserName("BOB", "GITHUB"); UserName u2 = new UserName("BOB"); u1.Equals(u2) //will use DomainUserName's Equals and return false u2.Equals(u1) //will use UserName's Equals and return true
But equality contract must be symmetrical. So either seal the class or else compare GetTypes() as example here: msdn.microsoft.com/en-us/library/vstudio/336aedhh%28v=vs.100%29.aspx
Alan, thank you for writing. You're right; I'd never considered this - most likely because I never use inheritance. Therefore, my default coping strategy would most likely be to make the class sealed by default, but otherwise, explicitly comparing GetType() return values would have to do.
Thank you for teaching me something today!
Great article and completely agree on having primitive types wrapped up into its own class. I had few questions around how this would be applied to provide a good validation feedback to consuming clients. If we want to provide more details on why the 'user name is invalid', like say 'User name should not be small letters' and 'User name should not have trailing spaces' how would we handle this in the class. How would the class communicate out the details of the rules that it is abstracting to its consuming clients.
When accepting multiple such 'classes' in a endpoint, what would be a suggested validation approach. For instance if we are to look at a similar endpoint, which takes in 'Name' and 'Phone' number(or even more in the case of POST endpoint where we are creating a new user), wouldnt the controller soon be overladed with a lot of such TryParse calls.
From the above comment of constructor only accepting valid values, where would it make sense to use the constructor as opposed to TryParse. Is it at the persistence boundary of the application?
Rahul, thank you for writing. As a general rule, at the boundary of an application, all input must be considered evil until proven otherwise. That means that if you try to pass unvalidated input values into constructors, you should expect exceptions to be thrown on a regular basis. Since exceptions are for exceptional cases, that's most likely not the best way to go forward.
Instead, domain objects could offer Validate methods. These would be a bit like TryParse methods, but instead of returning a primitive boolean value, they'd have to return a list of messages. If there are no messages, the input is good. If there are messages, the input is bad. You can encapsulate such a validation result in an object if you think the rule about messages/no messages is too implicit. Such Validate methods could still take out
parameters like TryParse methods if you want to validate and potentially convert into a domain object in one go.
This isn't particularly elegant in languages like C# or Java, but due to the lack of sum types in these languages, that's the best we can do.
In languages with sum types, you can use the Either monad and applicative composition to address this type of problem much more elegantly. A good place to start would be Scott Wlaschin's article on Railway Oriented Programming.
Mark, implicit conversion operator should never throw exception. In the example, it can throw NullReferenceException when UserName is null. What would you recommend to fix that issue?
Eugene, thank you for writing. That's probably a good rule, but I admit I hadn't thought about it. One option is to convert it to an explicit conversion instead. Another option is to forgo the conversion operators all together. The conversion ability isn't the important point in this article.
A third option is to leave it as is. In the code bases I control, null is never an appropriate value, so if null appears anywhere, I'd consider it a bug somewhere else.
To be realistic, though, I think I'd prefer one of the two first options, as I tend to agree with the Zen of Python: Explicit is better than implicit.
Insightful post Mark!
I have a question, if we are implementing a class that implements Parse and TryParse menthods and this class requires
dependency, how should that be passed?
Consider following example - MobileNumber requires a list of valid area codes, I am passing this dependency in the
constructor and Parse and TryParse method:
public class MobileNumber { public MobileNumber(IEnumerable<string> areaCodes, string countryCode, string phoneNumber) { /* Something */ } public static MobileNumber Parse(IEnumerable<string> areaCodes, string countryCode, string phoneNumber) { /* Something */ } public static bool TryParse(IEnumerable<string> areaCodes, string countryCode, string phoneNumber, out MobileNumber mobileNumber) { /* Something */ } }Is it a right design?
Moiz, thank you for writing. Is it the right design? As usual, It Depends™. Which problem are you trying to solve? What does the input look like? What should the result be of attempting to parse various examples?
It can often be illuminating to write a couple of parametrised tests. What would such tests look like?
A benefit of this technique is that it doesn't prescribe where the business logic should be; only where
- access to it
Comments
I would say C# won't get non-nullable reference types either, even in the form of a compiler trick. It would either introduce too much of breaking changes or be very limited and thus not especially usefull.
Vladimir, thank you for writing. You're probably correct. Many years ago, I overheard Anders Hejlsberg say that it wouldn't be possible to introduce non-nullable reference types into the .NET platform without massive breaking changes. I can't say I ever understood the reasoning behind this (nor was it ever explained to me), but when Anders Hejlsberg tells you that, you sort of have to accept it :)
FWIW, there's a bit of discussion about non-nullable reference types in the C# Design Meeting Notes for Jan 21, 2015, but I have to admit that I didn't follow the link to Eric Lippert's blog :$