TDD test suites should run in 10 seconds or less by Mark Seemann
Most guidance about Test-Driven Development (TDD) will tell you that unit tests should be fast. Examples of such guidance can be found in FIRST and xUnit Test Patterns. Rarely does it tell you how fast unit tests should be.
10 seconds for a unit test suite. Max.
Here's why.
When you follow the Red/Green/Refactor process, ideally you'd be running your unit test suite at least three times for each iteration:
- Red. Run the test suite.
- Green. Run the test suite.
- Refactor. Run the test suite.
Each time you run the unit test suite, you're essentially blocked. You have to wait until the test run has completed before you get the result.
During that wait time, it's important to keep focus. If the test run takes too long, your mind starts to wander, and you'll suffer from context switching when you have to resume work.
When does your mind start to wander? After about 10 seconds. That number just keeps popping up when the topic turns to focused attention. Obviously it's not a hard limit. Obviously there are individual and contextual variations. Still, it seems as though a 10-second short-term attention span is more or less hard-wired into the human brain.
Thus, a unit test suite used for TDD should run in less than 10 seconds. If it's slower, you'll be less productive because you'll constantly lose focus.
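To check whether a suite stays within that limit, you can time a full run. The following is only a minimal sketch using Python's built-in unittest module; the names run_timed, BUDGET_SECONDS and ExampleTests are hypothetical and chosen for illustration, and the single test stands in for a real suite.

```python
import time
import unittest

BUDGET_SECONDS = 10.0  # the limit argued for in this article


def run_timed(suite, budget_s=BUDGET_SECONDS):
    """Run the suite once; report success, elapsed seconds, and whether it fit the budget."""
    start = time.perf_counter()
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    elapsed = time.perf_counter() - start
    return result.wasSuccessful(), elapsed, elapsed <= budget_s


class ExampleTests(unittest.TestCase):
    # Placeholder test; a real TDD suite would be loaded here instead.
    def test_addition(self):
        self.assertEqual(2 + 2, 4)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
succeeded, elapsed, within_budget = run_timed(suite)
if not within_budget:
    print(f"Suite took {elapsed:.1f} s, which is over the {BUDGET_SECONDS:.0f} s budget")
```

The measurement includes the test runner's own overhead, which is what you actually wait for at the keyboard.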
Implications
The test suite you work with when you do TDD should execute in less than 10 seconds on your machine. If you have hundreds of tests, each test should be faster than 100 milliseconds. If you have thousands, each test should be faster than 10 milliseconds. You get the picture.
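The arithmetic behind those numbers is just the suite budget divided by the number of tests. As a small sketch (the function name per_test_budget_ms is made up for this example, and it computes an average, so individual tests may be slower as long as others are faster):

```python
def per_test_budget_ms(test_count, suite_budget_s=10.0):
    """Average time in milliseconds each test may take if the suite must fit the budget."""
    if test_count <= 0:
        raise ValueError("test_count must be positive")
    return suite_budget_s * 1000.0 / test_count


for count in (100, 1_000, 5_000):
    print(f"{count:>5} tests -> {per_test_budget_ms(count):.0f} ms per test")
```

For 100 tests this gives 100 ms per test, for 1,000 tests 10 ms, and for 5,000 tests 2 ms.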
That test suite doesn't need to be the same as the one running on your CI server. It could be a subset of tests. That's OK. The TDD suite may just be part of your Test Pyramid.
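How you pick that subset is up to you. One possible approach, shown here only as a sketch and not as a recommendation from any particular tool, is to take previously measured test durations and keep the fastest tests until the 10 second budget is used up, leaving the rest to the CI server. The function split_for_tdd and the test names and timings below are hypothetical.

```python
def split_for_tdd(durations_s, budget_s=10.0):
    """Split tests into a TDD subset that fits the budget and a CI-only remainder.

    durations_s maps a test name to its measured duration in seconds.
    Tests are considered from fastest to slowest.
    """
    tdd, ci_only = [], []
    total = 0.0
    for name, seconds in sorted(durations_s.items(), key=lambda item: item[1]):
        if total + seconds <= budget_s:
            tdd.append(name)
            total += seconds
        else:
            ci_only.append(name)
    return tdd, ci_only


# Made-up timings, purely for illustration.
measured = {
    "test_parse": 0.02,
    "test_pricing_rules": 0.05,
    "test_database_roundtrip": 4.0,
    "test_full_checkout": 12.0,
}
tdd_suite, ci_suite = split_for_tdd(measured)
print("TDD suite:", tdd_suite)
print("CI only:  ", ci_suite)
```

With these timings, the three fastest tests add up to about 4 seconds and go into the TDD suite, while test_full_checkout alone exceeds the budget and stays on the CI server.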
Selective test runs
Many people work around the Slow Tests anti-pattern by only running one test at a time, or perhaps one test class. In my experience, this is not an optimal solution because it slows you down. Instead of just going
- Red
- Run
- Green
- Run
- Refactor
- Run
you'd need to go
- Red
- Specifically instruct your Test Runner to run only the test you just wrote
- Green
- Decide which subset of tests may have been affected by the new test. This obviously involves the new test, but may include more tests.
- Run the tests you just selected
- Refactor
- Decide which subset of tests to run now
- Run the tests you just selected
Obviously, that introduces friction into your process. Personally, I much prefer to have a fast test suite that I can run all the time at a key press.
Still, there are tools available that promise to do this analysis for you. One of them is Mighty Moose, with which I've had varying degrees of success. Another similar approach is NCrunch.
References?
For a long time I've been wanting to write this article, but I've always felt held back by a lack of citations I could exhibit. While the Wikipedia link does provide a bit of information, it's not that convincing in itself (and it also cites the time span as 8 seconds).
However, that 10 second number just keeps popping up in all sorts of contexts, not all of them having anything to do with software development, so I decided to run with it. Still, if some of my readers can provide me with better references, I'd be delighted.
Or perhaps I'm just suffering from confirmation bias...
 
Comments
Some of that can be alleviated with a DVCS, but rapid feedback just makes things a lot easier.
>> Mighty Moose, with which I've had varying degrees of success. Another similar approach is NCrunch.
I tried both tools in a real VS solution (about 180 projects, 20 of them test projects). Mighty Moose was continuously throwing exceptions and I didn't manage to get it to work at all. NCrunch could not compile some of the projects (it uses the Mono compiler) but works with some success with the remaining ones. Yes, feedback is rather fast (2-5 secs instead of 10-20 using the ReSharper test runner), but it is unstable, so I've gone back to ReSharper.
TDD cannot be used for large applications.
I was doing some research myself. From what I understand, the Wikipedia article you link to talks mainly about attention span in the context of someone doing a certain task and being disturbed. However, while we wait for our unit tests to finish, we're not doing anything. This is more similar to waiting for a website to download and render.
The studies on "tolerable waiting time" are all over the map, but even old ones talk about 2 seconds. This paper mentions several studies, two of them from pre-internet days (scroll down to the graphics for a comparison table). This would mean that, ideally, we would need our tests to run in not 10 but 2 seconds! I say ideally, because this seems unrealistic to me at this moment (especially in projects where even compilation takes longer than 2 seconds). Maybe in the future, who knows.
Peter, thank you for writing. I recall reading about the 10 second rule some months before I wrote the article, but when I came to write it, I had trouble finding public reference material. Thank you for making the effort of researching this and sharing your findings. I don't dispute what you wrote, but here are some further thoughts on the topic:
If the time limit really is two seconds, that's cause for some concern. I agree with you that it might be difficult to achieve that level of response time for a moderately-sized test suite. This does, however, heavily depend on various factors, the least of which isn't the language and platform.
For instance, when working with a 'warm' C# code base, compilation time can be fast enough that you might actually be able to compile and run hundreds of tests within two seconds. On the other hand, just compiling a moderately-sized Haskell code base takes longer than two seconds on my machine (but then, once Haskell compiles, you don't need a lot of tests to verify that it works correctly).
When working in interpreted languages, like Smalltalk (where TDD was originally rediscovered), Ruby, or JavaScript, there's no compilation step, so tests start running immediately. Being interpreted, the test code may run slower than compiled code, but my limited experience with JavaScript is that it can still be fast enough.