Why trust tests? by Mark Seemann
Tests are trustworthy if they are simple and you've seen them fail.
When I started out with TDD some ten years ago, it was really hard to convince programmers that unit testing was valuable. It's better today, but much software is still produced without test coverage. One of the most common arguments for unit tests is to ask the question: "How do you know that your code works?"
That's a darn good question, and one of the best answers is that you should cover the code with automated tests (unit tests).
Okay, then: How do you know that your unit tests work?
A test watches the correctness of the SUT, but who watches the watchmen? Should you write some more unit tests to test your unit tests? Is it going to be turtles all the way down?
Obviously, that's not what we do. Then why do we trust unit tests?
I think that there's two different reasons for that:
- They are easy to review
- We've seen them fail
Review #
The reason that a lot of code fails to do what you think it does is because it's complicated. Sometimes there's also going to be subtle bugs, but I think that the most common problem is that as the code becomes more involved, it becomes more difficult for you to hold it in your brain. A simple Hello World application is easy to understand. Typical software isn't.
Unit tests tend to be a lot simpler. First of all, unit tests should be deterministic. This means that there should be a clear path through each test case. In other words, a unit test should have a Cyclomatic Complexity of 1.
If you follow the AAA or Four-Phase Test patterns, keep the Cyclomatic Complexity at 1, keep the total line count down, and use good names, you should have a readable test case. Such test code is easy to review for correctness. If you are Pair Programming, you and your partner review the unit test code as you write it. If you use other review mechanisms (e.g. pull requests), the reviewer can review your unit test code. Even if you have no other person reviewing your code, keeping it simple helps you understand what it is that you do. Ultimately, it can't be more error prone than single-handedly writing production code without tests.
Part of the reason why we trust unit tests is because, if written well, they are easy to review for correctness.
See the test fail #
Deeply ingrained into proper TDD is the Red/Green/Refactor cycle. It's extremely important to see the test fail. Personally, I'm saved by this rule about once a week. I write a test and run it, and much to my surprise it turns out to be a false negative. Although I've been doing TDD for some ten years, this still happens to me regularly, so I think it's one of those rigid rules you should always follow. Seeing the test fail verifies that the test indeed tests something.
This is also the reason why, if you don't do TDD, but rather write tests after the fact, it becomes extremely important to follow a proper procedure for writing Characterization Tests.
Part of the reason that we trust unit tests is that we've seen them fail. It's like double-entry bookkeeping. The tests keep the SUT in place, and the SUT makes the tests pass.
Append-only tests #
Both of the reasons I've provided are focused on the test as it's being created. At that time, you see it fail and you review it for correctness. As time goes by, you should end up with a test suite with lots of tests. Do you ever look at those tests?
You trust them because they were correct when they were created. Are they still correct today?
Well, they probably are if you never again touched them, but every time you modify a test case, you make it just a little less trustworthy. Similar in spirit to the Open/Closed Principle, a test suite should be open for extension, but closed for modification. Trustworthy test suites are append-only.