Aiming for a particular percentage of code coverage is counter-productive.

It's the end of 2015, and here I thought that it was common knowledge that using code coverage as a metric for code quality is useless at best. After all, Martin Fowler wrote a good article on the subject in 2012, but the fundamental realisation is much older than that. Apparently, it's one of those insights that one assumes that everyone else already knows, but in reality, that's not the case.

Let's make it clear, then: don't set goals for code coverage.

You may think that it could make your code base better, but asking developers to reach a certain code coverage goal will only make your code worse.

I'll show you some examples, but on a general level, the reason is that as with all other measurements, you get what you measure. Unfortunately, we can't measure productivity, so measuring code coverage will produce results that are entirely unrelated to software quality.

"People respond to incentives, although not necessarily in ways that are predictable or manifest. Therefore, one of the most powerful laws in the universe is the law of unintended consequences." - Super Freakonomics
Incentives with negative consequences are called perverse incentives; asking developers to reach a particular code coverage goal is clearly a perverse incentive.

It doesn't matter whether you set the target at 100% code coverage, 90%, 80%, or some other number.

Reaching 100% coverage is easy

Here's a simple code example:

public class GoldCustomerSpecification : ICustomerSpecification
{
    public bool IsSatisfiedBy(Customer candidate)
    {
        return candidate.TotalPurchases >= 10000;
    }
}

Imagine that you have been asked to reach a high level of code coverage, and that this class is still not covered by tests. Not only that, but you have bugs to fix, meetings to go to, new features to implement, documentation to write, time sheets to fill out, and the project is already behind schedule, over budget, and your family is complaining that you're never home.

Fortunately, it's easy to achieve 100% code coverage of the GoldCustomerSpecification class:

[Fact]
public void MeaninglessTestThatStillGivesFullCoverage()
{
    try
    {
        var sut = new GoldCustomerSpecification();
        sut.IsSatisfiedBy(new Customer());
    }
    catch { }
}

This test achieves 100% code coverage of the GoldCustomerSpecification class, but is completely useless. Because of the try/catch block and the lack of assertions, this test will never fail. This is what Martin Fowler calls Assertion-Free Testing.

If you can declare a rule that your code base must have so-and-so test coverage, however, you can also declare that all unit tests must have assertions, and must not have try/catch blocks.

Despite of this new policy, you still have lots of other things you need to attend to, so instead, you write this test:

[Fact]
public void SlightlyMoreInnocuousLookingTestThatGivesFullCoverage()
{
    var sut = new GoldCustomerSpecification();
    var actual = sut.IsSatisfiedBy(new Customer());
    Assert.False(actual);
}

This test also reaches 100% coverage of the GoldCustomerSpecification class.

What's wrong with this test? Nothing, as such. It looks like a fine test, but in itself, it doesn't prevent regressions, or proves that the System Under Test works as intended. In fact, this alternative implementation also passes the test:

public class GoldCustomerSpecification : ICustomerSpecification
{
    public bool IsSatisfiedBy(Customer candidate)
    {
        return false;
    }
}

If you want your tests to demonstrate that the software works as intended, particularly at boundary values, you'll need to add more tests:

[Theory]
[InlineData(100, false)]
[InlineData(9999, false)]
[InlineData(10000, true)]
[InlineData(20000, true)]
public void IsSatisfiedReturnsCorrectResult(
    int totalPurchases,
    bool expected)
{
    var sut = new GoldCustomerSpecification();
 
    var candidate = new Customer { TotalPurchases = totalPurchases };
    var actual = sut.IsSatisfiedBy(candidate);
 
    Assert.Equal(expected, actual);
}

This is a much better test, but it doesn't increase code coverage! Code coverage was already 100% with the SlightlyMoreInnocuousLookingTestThatGivesFullCoverage test, and it's still 100% with this test. There's no correlation between code coverage and the quality of the test(s).

Code coverage objectives inhibit quality improvement

Not only is test coverage percentage a meaningless number in itself, but setting a goal that must be reached actually hinders improvement of quality. Take another look at the GoldCustomerSpecification class:

public class GoldCustomerSpecification : ICustomerSpecification
{
    public bool IsSatisfiedBy(Customer candidate)
    {
        return candidate.TotalPurchases >= 10000;
    }
}

Is the implementation good? Can you think of any improvements to this code?

What happens if candidate is null? In that case, a NullReferenceException will be thrown. In other words, the IsSatisfiedBy method doesn't properly check that its preconditions are satisfied (which means that encapsulation is broken).

A better implementation would be to explicitly check for null:

public class GoldCustomerSpecification : ICustomerSpecification
{
    public bool IsSatisfiedBy(Customer candidate)
    {
        if (candidate == null)
            throw new ArgumentNullException(nameof(candidate));
 
        return candidate.TotalPurchases >= 10000;
    }
}

The problem, though, is that if you do this, coverage drops! That is, unless you write another test case...

Developers in a hurry often refrain from making the code better, because it would hurt their coverage target - and they don't feel they have time to also write the tests that go with the improvement in question.

Instituting a code coverage target - any percentage - will have that effect. Not only does the coverage number (e.g. 87%) tell you nothing, but setting it as a target will make the code base worse.

Attitude

You may argue that I'm taking too dim a view on developers, but I've seen examples of the behaviour I describe. People mostly have good intentions, but if you put enough pressure on them, they'll act according to that pressure. This is the reason we need to be aware of perverse incentives.

You may also argue that if a team is already doing Test Driven Development, and in general prioritise code quality, then coverage will already be high. In that case, will it hurt setting a target? Perhaps not, but it's not going to help either. At best, the target will be irrelevant.

This article, however, doesn't discuss teams that already do everything right; it describes the negative consequences that code coverage targets will have on teams where managers or lead developers mistakenly believe that setting such goals is a good idea.

Code coverage is still useful

While it's dangerous to use code coverage for target setting, collecting coverage metrics can still be useful.

Some people use it to find areas where coverage is weak. There may be good reasons that some parts of a code base are sparsely covered by tests, but doing a manual inspection once in a while is a good idea. Perhaps you find that all is good, but you may also discover that a quality effort is overdue.

In some projects, I've had some success watching the code coverage trend. When I review pull requests, I first review the changes by looking at them. If the pull request needs improvement, I work with the contributor to get the pull request to an acceptable quality. Once that is done, I've already made my decision on merging the code, and then I measure code coverage. It doesn't influence my decision to merge, but it tells me about the trend. On some projects, I've reported that trend back to the contributor while closing the pull request. I wouldn't report the exact number, but I'd remark that coverage went up, or down, or remained the same, by 'a little', 'much', etc. The point of that is to make team members aware that testing is important.

Sometimes, coverage goes down. There are many good reasons that could happen. Watching how coverage evolves over time doesn't mean that you have to pounce on developers every time it goes down, but it means that if something looks odd, it may be worth investigating.

Summary

Don't use code coverage as an objective. Code coverage has no correlation with code quality, and setting a target can easily make the quality worse.

On the other hand, it can be useful to measure code coverage once in a while, but it shouldn't be your main source of information about the status of your source code.



Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or Google Plus, or somewhere else with a permalink. Ping me with the link, and I may add it as a comment.

Published

Monday, 16 November 2015 08:38:00 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!