100% coverage is not that trivial by Mark Seemann
Dispelling a myth I helped propagate.
Most people who have been around automated testing for a few years understand that code coverage is a useless target measure. Unfortunately, through a game of Chinese whispers, this message often degenerates to the simpler, but incorrect, notion that code coverage is useless.
As I've already covered in that article, code coverage may be useful for other reasons. That's not my agenda for this article. Rather, something about this discussion have been bothering me for a long time.
Have you ever had an uneasy feeling about a topic, without being able to put your finger on exactly what the problem is? This happens to me regularly. I'm going along with the accepted narrative until the cognitive dissonance becomes so conspicuous that I can no longer ignore it.
In this article, I'll grapple with the notion that 'reaching 100% code coverage is easy.'
Origins #
This tends to come up when discussing code coverage. People will say that 100% code coverage isn't a useful measure, because it's easy to reach 100%. I have used that argument myself. Fortunately I also cited my influences in 2015; in this case Martin Fowler's Assertion Free Testing.
"[...] of course you can do this and have 100% code coverage - which is one reason why you have to be careful on interpreting code coverage data."
This may not be the only source of such a claim, but it may have been a contributing factor. There's little wrong with Fowler's article, which doesn't make any groundless claims, but I can imagine how semantic diffusion works on an idea like that.
Fowler also wrote that it's "a story from a friend of a friend." When the source of a story is twice-removed like that, alarm bells should go off. This is the stuff that urban legends are made of, and I wonder if this isn't rather an example of 'programmer folk wisdom'. I've heard variations of that story many times over the years, from various people.
It's not that easy #
Even though I've helped promulgate the idea that reaching 100% code coverage is easy if you cheat, I now realise that that's an overstatement. Even if you write no assertions, and surround the test code with a try/catch block, you can't trivially reach 100% coverage. There are going to be branches that you can't reach.
This often happens in real code bases that query databases, call web services, and so on. If a branch depends on indirect input, you can't force execution down that path just by suppressing exceptions.
An example is warranted.
Example #
Consider this ReadReservation method in the SqlReservationsRepository class from the code base that accompanies my book Code That Fits in Your Head:
public async Task<Reservation?> ReadReservation(int restaurantId, Guid id) { const string readByIdSql = @" SELECT [PublicId], [At], [Name], [Email], [Quantity] FROM [dbo].[Reservations] WHERE [PublicId] = @id"; using var conn = new SqlConnection(ConnectionString); using var cmd = new SqlCommand(readByIdSql, conn); cmd.Parameters.AddWithValue("@id", id); await conn.OpenAsync().ConfigureAwait(false); using var rdr = await cmd.ExecuteReaderAsync().ConfigureAwait(false); if (!rdr.Read()) return null; return ReadReservationRow(rdr); }
Even though it only has a cyclomatic complexity of 2, most of it is unreachable to a test that tries to avoid hard work.
You can try to cheat in the suggested way by adding a test like this:
[Fact] public async Task ReadReservation() { try { var sut = new SqlReservationsRepository("dunno"); var actual = await sut.ReadReservation(0, Guid.NewGuid()); } catch { } }
Granted, this test passes, and if you had 0% code coverage before, it does improve the metric slightly. Interestingly, the Coverlet collector for .NET reports that only the first line, which creates the conn variable, is covered. I wonder, though, if this is due to some kind of compiler optimization associated with asynchronous execution that the coverage tool fails to capture.
More understandably, execution reaches conn.OpenAsync() and crashes, since the test hasn't provided a connection to a real database. This is what happens if you run the test without the surrounding try/catch block.
Coverlet reports 18% coverage, and that's as high you can get with 'the easy hack'. 100% is some distance away.
Toward better coverage #
You may protest that we can do better than this. After all, with utter disregard for using proper arguments, I passed "dunno" as a connection string. Clearly that doesn't work.
Couldn't we easily get to 100% by providing a proper connection string? Perhaps, but what's a proper connection string?
It doesn't help if you pass a well-formed connection string instead of "dunno". In fact, it will only slow down the test, because then conn.OpenAsync() will attempt to open the connection. If the database is unreachable, that statement will eventually time out and fail with an exception.
Couldn't you, though, give it a connection string to a real database?
Yes, you could. If you do that, though, you should make sure that the database has a schema compatible with readByIdSql. Otherwise, the query will fail. What happens if the implied schema changes? Now you need to make sure that the database is updated, too. This sounds error-prone. Perhaps you should automate that.
Furthermore, you may easily cover the branch that returns null. After all, when you query for Guid.NewGuid(), that value is not going to be in the table. On the other hand, how will you cover the other branch; the one that returns a row?
You can only do that if you know the ID of a value already in that table. You may write a second test that queries for that known value. Now you have 100% coverage.
What you have done at this point, however, is no longer an easy cheat to get to 100%. You have, essentially, added integration tests of the data access subsystem.
How about adding some assertions to make the tests useful?
Integration tests for 100% #
In most systems, you will at least need some integration tests to reach 100% code coverage. While the code shown in Code That Fits in Your Head doesn't have 100% code coverage (that was never my goal), it looks quite good. (It's hard to get a single number, because Coverlet apparently can't measure coverage by running multiple test projects, so I can only get partial results. Coverage is probably better than 80%, I estimate.)
To test ReadReservation I wrote integration tests that automate setup and tear-down of a local test-specific database. The book, and the Git repository that accompanies it, has all the details.
Getting to 100%, or even 80%, requires dedicated work. In a realistic code base, the claim that reaching 100% is trivial is hardly true.
Conclusion #
Programmer folk wisdom 'knows' that code coverage is useless. One argument is that any fool can reach 100% by writing assertion-free tests surrounded by try/catch blocks.
This is hardly true in most significant code bases. Whenever you deal with indirect input, try/catch is insufficient to control whereto execution branches.
This suggests that high code-coverage numbers are good, and low numbers bad. What constitutes high and low is context-dependent. What seems to remain true, however, is that code coverage is a useless target. This has little to do with how trivial it is to reach 100%, but rather everything to do with how humans respond to incentives.