Put cyclomatic complexity to good use
by Mark Seemann
An actually useful software metric.
In Humane Code I argue that software development suffers from a lack of useful measurements. While I stand by that general assertion, a few code metrics can be useful. Cyclomatic complexity, while no silver bullet, can be put to good use.
Recap #
I think of cyclomatic complexity as a measure of the number of pathways through a piece of code. Even the simplest body of code affords a single pathway, so the minimum cyclomatic complexity is 1. You can easily 'calculate' the cyclomatic complexity of a method or function. You start at one, and then you count how many times if and for occur. For each of these keywords you find, you increment the number (which started at 1).
The specifics are language-dependent. The idea is to count branching and looping instructions. In C#, for example, you'd also have to include foreach, while, do, and each case in a switch block. In other languages, the keywords to count will differ.
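To make the counting procedure concrete, here's a small, hypothetical C# method; the name and logic are invented only for this illustration. Starting at 1, the foreach, the if, and each of the two case labels add one, giving a cyclomatic complexity of 5.

// Hypothetical example, invented only to illustrate the counting rule.
public static int CountDigitsAndSeparators(string text)
{
    var count = 0;
    foreach (var c in text)              // + 1 (loop)
    {
        if (char.IsDigit(c))             // + 1 (branch)
            count++;
        switch (c)
        {
            case ',':                    // + 1 (case)
            case ';':                    // + 1 (case)
                count++;
                break;
        }
    }
    return count;                        // total: 1 + 4 = 5
}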
What's the cyclomatic complexity of this TryParse method?
public static bool TryParse(string candidate, out UserNamePassworCredentials? credentials)
{
    credentials = null;
    var arr = candidate.Split(',');
    if (arr.Length < 2)
        return false;
    credentials = new UserNamePassworCredentials(arr[0], arr[1]);
    return true;
}
The cyclomatic complexity of this method is 2. You start with the number 1 and then increment it every time you find one of the branching keywords. In this case, there's only a single if, so increment 1 to 2. That's it. If you're in doubt, Visual Studio can calculate metrics for you. (It calculates other metrics as well, but I don't find those useful.)
Guide for unit testing #
I find cyclomatic complexity useful because it measures the number of pathways through a method. As such, it indicates the minimum number of test cases you ought to furnish. (Edit July 14, 2023: While this tends to work well in practice, it turns out that it's not strictly true in general. See the article Is cyclomatic complexity really related to branch coverage? and the comments for more details.) This is useful when reviewing code and tests.
Sometimes I'm presented with code that other people wrote. When I look through the production code, I consider its cyclomatic complexity. If, for example, a method has a cyclomatic complexity of 5, I'd expect to find at least five test cases to cover that method.
At other times, I start by reading the tests. The number of test cases gives me a rough indication of what degree of complexity to expect. If I see four distinct tests for the same method, I expect it to have a cyclomatic complexity of about 4.
I don't demand 100% coverage. Sometimes, people don't write tests for guard clauses, and I usually accept such omissions. On the other hand, I think that proper decision logic should be covered by tests. If I were to stick unwaveringly to cyclomatic complexity, that would make my reviews more objective, but not necessarily better. I could insist on 100% code coverage, but I don't consider that a good idea.
Presented with the above TryParse method, I'd expect to see at least two unit tests, since the cyclomatic complexity is 2.
The need for more test cases #
Two unit tests aren't enough, though. You could write these two tests:
[Fact]
public void TryParseSucceeds()
{
    var couldParse = UserNamePassworCredentials.TryParse("foo,bar", out var actual);

    Assert.True(couldParse);
    var expected = new UserNamePassworCredentials("foo", "bar");
    Assert.Equal(expected, actual);
}

[Fact]
public void TryParseFails()
{
    var couldParse = UserNamePassworCredentials.TryParse("foo", out var actual);

    Assert.False(couldParse);
    Assert.Null(actual);
}
Using the Devil's advocate technique, however, this implementation of TryParse passes both tests:
public static bool TryParse(string candidate, out UserNamePassworCredentials? credentials)
{
    credentials = null;
    if (candidate != "foo,bar")
        return false;
    credentials = new UserNamePassworCredentials("foo", "bar");
    return true;
}
This is clearly not the correct implementation, but it has 100% code coverage. It also still has a cyclomatic complexity of 2. The metric suggests a minimum number of tests - not a sufficient number.
More test cases #
It often makes sense to cover each branch with a single parametrised test:
[Theory]
[InlineData("foo,bar", "foo", "bar")]
[InlineData("baz,qux", "baz", "qux")]
[InlineData("ploeh,fnaah", "ploeh", "fnaah")]
[InlineData("foo,bar,baz", "foo", "bar")]
public void TryParseSucceeds(string candidate, string userName, string password)
{
    var couldParse = UserNamePassworCredentials.TryParse(candidate, out var actual);

    Assert.True(couldParse);
    var expected = new UserNamePassworCredentials(userName, password);
    Assert.Equal(expected, actual);
}

[Theory]
[InlineData("")]
[InlineData("foobar")]
[InlineData("foo;bar")]
[InlineData("foo")]
public void TryParseFails(string candidate)
{
    var couldParse = UserNamePassworCredentials.TryParse(candidate, out var actual);

    Assert.False(couldParse);
    Assert.Null(actual);
}
Is a total of eight test cases the correct number? Cyclomatic complexity can't help you here. You'll have to rely on other heuristics, such as test-driven development, the transformation priority premise, and the Devil's Advocate.
Humane Code #
I also find cyclomatic complexity useful for another reason. I keep an eye on complexity because I care about code maintainability. In my Humane Code video, I discuss the magic number seven, plus or minus two.
When you read code, you essentially run a little emulator in your brain. You have to maintain state in order to interpret the code you look at. Will this conditional evaluate to true or false? Is the code going to exit that loop now? Is that array index out of bounds? You can only follow the code by keeping track of variables' contents, and your brain can keep track of approximately seven things.
Cyclomatic complexity is a measure of pathways - not how many things you need to keep track of. Still, in my experience, there seems to be a useful correlation. Code with high cyclomatic complexity tends to have many moving parts. There's too much to keep track of. With low cyclomatic complexity, on the other hand, the code involves few moving parts.
I use cyclomatic complexity 7 as an approximate maximum for that reason. It's only a rule of thumb, since I'm painfully aware that I'm transplanting experimental psychology to a context where no conclusions can be scientifically drawn. But like the 80/24 rule, I find that it works well in practice.
Complexity of a method call #
Consider the above parametrised tests. Some of the test cases provide enough triangulation to defeat the Devil's attempt at hard-coding return values. This explains test values like "foo,bar", "baz,qux", and "ploeh,fnaah", but why did I include the "foo,bar,baz" test case? And why did I include the empty string as one of the test cases for TryParseFails?
When I write tests, I aspire to compose tests that verify the behaviour rather than the implementation of the System Under Test. The desired behaviour, I decided, is that any extra entries in the comma-separated input should be ignored. Likewise, if there are fewer than two entries, parsing should fail. There must be both a user name and a password.
Fortunately, this happens to be how Split already works. If you consider all the behaviour that Split exhibits, it encapsulates moderate complexity. It can split on multiple alternative delimiters, it can throw away empty entries, and so on. What would happen if you inline some of that functionality?
public static bool TryParse(string candidate, out UserNamePassworCredentials? credentials)
{
    credentials = null;
    var l = new List<string>();
    var element = "";
    foreach (var c in candidate)
    {
        if (c == ',')
        {
            l.Add(element);
            element = "";
        }
        else
            element += c;
    }
    l.Add(element);
    if (l.Count < 2)
        return false;
    credentials = new UserNamePassworCredentials(l[0], l[1]);
    return true;
}
This isn't as sophisticated as the Split method it replaces, but it passes all eight test cases. Why did I do this? To illustrate the following point.
What's the cyclomatic complexity now?
Keep in mind that the externally observable behaviour (as defined by eight test cases) hasn't changed. The cyclomatic complexity, however, has. It's now 4 - double the previous metric. Starting at 1, the foreach and the two if statements each add one.
A method call (like a call to Split) can hide significant cyclomatic complexity. That's a desirable situation. This is the benefit that encapsulation offers: that you don't have to worry about implementation details as long as both caller and callee fulfil the contract.
When you calculate cyclomatic complexity, a method call doesn't increment the complexity, regardless of the degree of complexity that it encapsulates.
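As a sketch of that point, consider moving the inlined loop into a private helper; the name SplitOnComma is made up for this example. The externally observable behaviour is unchanged, but TryParse itself is back at a cyclomatic complexity of 2, because the call to the helper adds nothing, regardless of what it hides.

// Sketch only; SplitOnComma is a hypothetical helper name.
public static bool TryParse(string candidate, out UserNamePassworCredentials? credentials)
{
    credentials = null;
    var l = SplitOnComma(candidate);
    if (l.Count < 2)                     // the only branch: complexity 1 + 1 = 2
        return false;
    credentials = new UserNamePassworCredentials(l[0], l[1]);
    return true;
}

private static List<string> SplitOnComma(string candidate)
{
    // This helper's own complexity (3) doesn't count towards TryParse.
    var l = new List<string>();
    var element = "";
    foreach (var c in candidate)
    {
        if (c == ',')
        {
            l.Add(element);
            element = "";
        }
        else
            element += c;
    }
    l.Add(element);
    return l;
}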
Summary #
Cyclomatic complexity is one of the rare programming metrics that I find useful. It measures the number of pathways through a body of code.
You can use it to guide your testing efforts. The number is the minimum number of tests you must write in order to cover all branches. You'll likely need more test cases than that.
You can also use the number as a threshold. I suggest that 7 ought to be the maximum cyclomatic complexity of a method or function. You're welcome to pick another number, but keeping an eye on cyclomatic complexity is useful. It tells you when it's time to refactor a complex method.
Cyclomatic complexity considers only the code that directly implements a method or function. That code can call other code, but what happens behind a method call doesn't impact the metric.
Comments
Do you know of a tool to calculate cyclomatic complexity for F#? It appears that the Visual Studio feature doesn't support it.
Ghillie, thank you for writing. I'm not aware of any such tool.
FWIW, it's not difficult to manually calculate cyclomatic complexity for F#, but clearly that doesn't help if you'd like to automate the process.
It might be a fine project for anyone looking for a way to contribute to the F# ecosystem.
Hi, Mark. Thanks for your article. I'm commenting because I'd like to learn more about your thoughts on mutation testing. I ask this because I know you're not the biggest fan of code coverage as a useful metric. I'm not either, or at least I wasn't, until I learned about mutation testing.
My current view is that code coverage is only (mostly) meaningless if you don't have a way of measuring the quality of the tests. Since mutation testing's goal is exactly that (to test the tests, if you will), my opinion is that, if you use a mutation testing tool, then code coverage becomes really useful and you should try to get to 100%. I've even written a post about this subject.
So, in short: what are your thoughts on mutation testing, and how does it affect the meaning of code coverage, if at all? Looking forward to reading your answer. A whole post on this would be even better!
Thanks!
Carlos, thank you for writing. I'm sympathetic to the idea of mutation testing, but apart from that, I have no opinion of it. I don't think that I ought to have an opinion about something with which I've no experience.
I first heard about mutation testing decades ago, but I've never come across a mutation testing tool for C# (or F#, for that matter). Can you recommend one?
Unfortunately, tooling is indeed one of the main Achilles heels of mutation testing, at least when it comes to .NET.
In the Java world, they have PIT, which is considered state of the art. For C#, I have tried a few tools, with no success. The most promising solution I've found so far, for C#, is Stryker.net, a port of the Stryker mutation testing framework originally designed for JavaScript. The C# version is still in its early phases, but it's already usable and it looks very promising.
Is mutation testing the automated version of what Mark has called the Devil's Advocate technique?
Tyson, I actually discuss the relationship with mutation testing in that article.