CC golf by Mark Seemann
Noun. Game in which the goal is to minimise cyclomatic complexity.
Cyclomatic complexity (CC) is a rare code metric since it can be actually useful. In general, it's a good idea to minimise it as much as possible.
In short, CC measures looping and branching in code, and this is often where bugs lurk. While it's only a rough measure, I nonetheless find the metric useful as a general guideline. Lower is better.
Golf #
I'd like to propose the term "CC golf" for the activity of minimising cyclomatic complexity in an area of code. The name derives from code golf, in which you have to implement some behaviour (typically an algorithm) in fewest possible characters.
Such games can be useful because they enable you to explore different ways to express yourself in code. It's always a good kata constraint. The first time I tried that was in 2011, and when looking back on that code today, I'm not that impressed. Still, it taught me a valuable lesson about the Visitor pattern that I never forgot, and that later enabled me to connect some important dots.
But don't limit CC golf to katas and the like. Try it in your production code too. Most production code I've seen could benefit from some CC golf, and if you use Git tactically you can always stash the changes if they're no good.
Idiomatic tension #
Alternative expressions with lower cyclomatic complexity may not always be idiomatic. Let's look at a few examples. In my previous article, I listed some test code where some helper methods had a CC of 2. Here's one of them:
private static IEnumerable<DateOnly> EnumerateDates(DateOnly arrival, DateOnly departure) { var d = arrival; while (d < departure) { yield return d; d = d.AddDays(1); } }
Can you express this functionality with a CC of 1? In Haskell it's essentially built in as (. pred) . enumFromTo
, and in F# it's also idiomatic, although more verbose:
let enumerateDates (arrival : DateOnly) departure = Seq.initInfinite id |> Seq.map arrival.AddDays |> Seq.takeWhile (fun d -> d < departure)
Can we do the same in C#?
If there's a general API in .NET that corresponds to the F#-specific Seq.initInfinite
I haven't found it, but we can do something like this:
private static IEnumerable<DateOnly> EnumerateDates(DateOnly arrival, DateOnly departure) { const int infinity = int.MaxValue; // As close as int gets, at least return Enumerable.Range(0, infinity).Select(arrival.AddDays).TakeWhile(d => d < departure); }
In C# infinite sequences are generally unusual, but if you were to create one, a combination of while true
and yield return
would be the most idiomatic. The problem with that, though, is that such a construct has a cyclomatic complexity of 2.
The above suggestion gets around that problem by pretending that int.MaxValue
is infinity. Practically, at least, a 32-bit signed integer can't get larger than that anyway. I haven't tried to let F#'s Seq.initInfinite run out, but by its type it seems int
-bound as well, so in practice it, too, probably isn't infinite. (Or, if it is, the index that it supplies will have to overflow and wrap around to a negative value.)
Is this alternative C# code better than the first? You be the judge of that. It has a lower cyclomatic complexity, but is less idiomatic. This isn't uncommon. In languages with a procedural background, there's often tension between lower cyclomatic complexity and how 'things are usually done'.
Checking for null #
Is there a way to reduce the cyclomatic complexity of the GetView
helper method?
private IReadOnlyCollection<Room> GetView(DateOnly date) { if (views.TryGetValue(date, out var view)) return view; else return rooms; }
This is an example of the built-in API being in the way. In F#, you naturally write the same behaviour with a CC of 1:
let getView (date : DateOnly) =
views |> Map.tryFind date |> Option.defaultValue rooms |> Set.ofSeq
That TryGet
idiom is in the way for further CC reduction, it seems. It is possible to reach a CC of 1, though, but it's neither pretty nor idiomatic:
private IReadOnlyCollection<Room> GetView(DateOnly date) { views.TryGetValue(date, out var view); return new[] { view, rooms }.Where(x => x is { }).First()!; }
Perhaps there's a better way, but if so, it escapes me. Here, I use my knowledge that view
is going to remain null
if TryGetValue
doesn't find the dictionary entry. Thus, I can put it in front of an array where I put the fallback value rooms
as the second element. Then I filter the array by only keeping the elements that are not null
(that's what the x is { }
pun means; I usually read it as x is something). Finally, I return the first of these elements.
I know that rooms
is never null
, but apparently the compiler can't tell. Thus, I have to suppress its anxiety with the !
operator, telling it that this will result in a non-null value.
I would never use such a code construct in a professional C# code base.
Side effects #
The third helper method suggests another kind of problem that you may run into:
public void RoomBooked(Booking booking) { foreach (var d in EnumerateDates(booking.Arrival, booking.Departure)) { var view = GetView(d); var newView = QueryService.Reserve(booking, view); views[d] = newView; } }
Here the higher-than-one CC stems from the need to loop through dates in order to produce a side effect for each. Even in F# I do that:
member this.RoomBooked booking = for d in enumerateDates booking.Arrival booking.Departure do let newView = getView d |> QueryService.reserve booking |> Seq.toList views <- Map.add d newView views
This also has a cyclomatic complexity of 2. You could do something like this:
member this.RoomBooked booking = enumerateDates booking.Arrival booking.Departure |> Seq.iter (fun d -> let newView = getView d |> QueryService.reserve booking |> Seq.toList in views <- Map.add d newView views)
but while that nominally has a CC of 1, it has the same level of indentation as the previous attempt. This seems to indicate, at least, that it doesn't really address any complexity issue.
You could also try something like this:
member this.RoomBooked booking = enumerateDates booking.Arrival booking.Departure |> Seq.map (fun d -> d, getView d |> QueryService.reserve booking |> Seq.toList) |> Seq.iter (fun (d, newView) -> views <- Map.add d newView views)
which, again, may be nominally better, but forced me to wrap the map
output in a tuple so that both d
and newView
is available to Seq.iter
. I tend to regard that as a code smell.
This latter version is, however, fairly easily translated to C#:
public void RoomBooked(Booking booking) { EnumerateDates(booking.Arrival, booking.Departure) .Select(d => (d, view: QueryService.Reserve(booking, GetView(d)))) .ToList() .ForEach(x => views[x.d] = x.view); }
The standard .NET API doesn't have something equivalent to Seq.iter
(although you could trivially write such an action), but you can convert any sequence to a List<T>
and use its ForEach
method.
In practice, though, I tend to agree with Eric Lippert. There's already an idiomatic way to iterate over each item in a collection, and being explicit is generally helpful to the reader.
Church encoding #
There's a general solution to most of CC golf: Whenever you need to make a decision and branch between two or more pathways, you can model that with a sum type. In C# you can mechanically model that with Church encoding or the Visitor pattern. If you haven't tried that, I recommend it for the exercise, but once you've done it enough times, you realise that it requires little creativity.
As an example, in 2021 I revisited the Tennis kata with the explicit purpose of translating my usual F# approach to the exercise to C# using Church encoding and the Visitor pattern.
Once you've got a sense for how Church encoding enables you to simulate pattern matching in C#, there are few surprises. You may also rightfully question what is gained from such an exercise:
public IScore VisitPoints(IPoint playerOnePoint, IPoint playerTwoPoint) { return playerWhoWinsBall.Match( playerOne: playerOnePoint.Match<IScore>( love: new Points(new Fifteen(), playerTwoPoint), fifteen: new Points(new Thirty(), playerTwoPoint), thirty: new Forty(playerWhoWinsBall, playerTwoPoint)), playerTwo: playerTwoPoint.Match<IScore>( love: new Points(playerOnePoint, new Fifteen()), fifteen: new Points(playerOnePoint, new Thirty()), thirty: new Forty(playerWhoWinsBall, playerOnePoint))); }
Believe it or not, but that method has a CC of 1 despite the double indentation strongly suggesting that there's some branching going on. To a degree, this also highlights the limitations of the cyclomatic complexity metric. Conversely, stupidly simple code may have a high CC rating.
Most of the examples in this article border on the pathological, and I don't recommend that you write code like that. I recommend that you do the exercise. In less pathological scenarios, there are real benefits to be reaped.
Idioms #
In 2015 I published an article titled Idiomatic or idiosyncratic? In it, I tried to explore the idea that the notion of idiomatic code can sometimes hold you back. I revisited that idea in 2021 in an article called Against consistency. The point in both cases is that just because something looks unfamiliar, it doesn't mean that it's bad.
Coding idioms somehow arose. If you believe that there's a portion of natural selection involved in the development of coding idioms, you may assume by default that idioms represent good ways of doing things.
To a degree I believe this to be true. Many idioms represent the best way of doing things at the time they settled into the shape that we now know them. Languages and contexts change, however. Just look at the many approaches to data lookups there have been over the years. For many years now, C# has settled into the so-called TryParse idiom to solve that problem. In my opinion this represents a local maximum.
Languages that provide Maybe (AKA option
) and Either (AKA Result
) types offer a superior alternative. These types naturally compose into CC 1 pipelines, whereas TryParse requires you to stop what you're doing in order to check a return value. How very C-like.
All that said, I still think you should write idiomatic code by default, but don't be a slave by what's considered idiomatic, just as you shouldn't be a slave to consistency. If there's a better way of doing things, choose the better way.
Conclusion #
While cyclomatic complexity is a rough measure, it's one of the few useful programming metrics I know of. It should be as low as possible.
Most professional code I encounter implements decisions almost exclusively with language primitives: if
, for
, switch
, while
, etc. Once, an organisation hired me to give a one-day anti-if workshop. There are other ways to make decisions in code. Most of those alternatives reduce cyclomatic complexity.
That's not really a goal by itself, but reducing cyclomatic complexity tends to produce the beneficial side effect of structuring the code in a more sustainable way. It becomes easier to understand and change.
As the cliché goes: Choose the right tool for the job. You can't, however, do that if you have nothing to choose from. If you only know of one way to do a thing, you have no choice.
Play a little CC golf with your code from time to time. It may improve the code, or it may not. If it didn't, just stash those changes. Either way, you've probably learned something.