You can also make mistakes that compile in Haskell.

Seeing a (unit) test fail before making it pass is an important part of empirical software engineering. This is nothing new. We've known about the red-green-refactor cycle for at least twenty years, so we know that ensuring that a test can fail is of the essence.

A fundamental problem of automated testing is that we're writing code to test code. How do we know that our test code works? After all, it's easy enough to make mistakes, even with simple code.

The importance of discipline #

When I test-drive a programming task, it regularly happens that I write a test that I expect to fail, only for it to pass. Even so, it doesn't happen that often. During a week of intensive coding, it may happen once or twice.

After all, a unit test is supposed to be a simple, short block of code, ideally with a cyclomatic complexity of 1. If you're an experienced programmer, working in a language that you know, you wouldn't expect to make simple mistakes too often.

It's easy to get lulled into a false sense of security. When new tests fail (as they should) more than 95% of the time, you may be tempted to cut corners: Just write the test, implement the desired code, and move on.

My experience is, however, that I inadvertently write a passing test now and then. I've been doing test-driven development (TDD) for more than twenty years, and this still happens. It is therefore important to keep up discipline and run that damned test, even if you don't feel like it. It's exactly when your attention is flagging that you need this safety mechanism the most.

Until you've tried a couple of times writing a test that unexpectedly pass, it can be hard to grasp why the red phase is essential. Therefore I've always felt that it was important to publish examples.

Tautologies #

A new test that passes on the first run is almost always a tautology. Theoretically, it may be that you think that the System Under Test does not yet have a certain capability, but after writing the test, it turns out that, after all, it does. This almost never happens to me. I can't rule out that this may have been the case once or twice in the last few decades, but it's as scarce as hen's teeth.

Usually, the problem is that the test is a tautology. While you believe that the test correctly self-checks something relevant, in reality, it's written in such a way that it can't possibly fail.

The first time I had the opportunity to document such a tautological assertion the underlying problem turned out to be aliasing. The next examples I ran into also had their roots in aliasing.

I began to wonder: Is the problem of tautological assertions mostly related to aliasing? If so, does it mean that the phenomenon of writing a passing test by mistake is mostly isolated to procedural and object-oriented programming? Could it be that this doesn't happen in functional programming?

If it compiles, it works #

Many so-called functional programming languages are in reality mixed-paradigm languages. The one that I know best is F#, which Don Syme calls functional-first. Even so, it's explicitly designed to interact with .NET libraries written in other languages (almost always C#), so it can still suffer from aliasing problems. The same situation applies to Clojure and Scala, which both run on the JVM.

A few languages are, however, unapologetically functional. Haskell is probably the most famous. If you rule out actions that run in IO, shared mutable state is not a concern.

Haskell's type system is so advanced and powerful that there are programmers who still seem to approach the language with an implied attitude that if it compiles, it works. Clearly, as we shall see, that is not the case.

Example #

Last week, I was writing tests against an API that was already given. As the 53rd test, I wrote this:

testCase "Cell 1 does not reproduce" $
  let cell1 = Galapagos.CellState (Just cheater) (mkStdGen 0)
      cell2 = Galapagos.CellState Nothing (mkStdGen 1) -- seeded: no repr.
      actual = Galapagos.reproduce Galapagos.defaultParams (cell1, cell2)
  in do
    -- Sanity check that cell 1 remains, sampling on strategy:
    ( Galapagos.finchStrategyExp <$> Galapagos.cellFinch (fst actual)) @?=
      Galapagos.finchStrategyExp <$> Galapagos.cellFinch cell1
    ( Galapagos.finchHP <$> Galapagos.cellFinch cell2) @?= Nothing

To my surprise, it immediately passed. What's wrong with it? Before reading on, see if you can spot the problem.

I know that you don't know the problem domain or the particular API. Even so, the problem is in the test itself, not in the implementation code. After all, according to the red-green-refactor cycle, I hadn't yet added the code that would make this test pass.

The error is that I meant to verify that the actual value's second component remained Nothing, but either due to a brain fart, or because I copied an earlier test, the last assertion checks that cell2 is Nothing.

Since cell2 is originally initialized with Nothing, and Haskell values are immutable, there's no way it could be anything else. This is a tautological assertion. The correct assertion is

(Galapagos.finchHP <$> Galapagos.cellFinch (snd actual)) @?= Nothing

Notice that it examines snd actual rather than cell2.

Causes #

How could I be so stupid? First, to err is human, and this is exactly why it's important to start with a failing test. The error was mine, but even so, it sometimes pays to analyse if there might be an underlying driving force behind the error. In this case, I can identify at least two, although they are related.

First, I don't recall exactly how I wrote this test, but looking at previous commits, I find it likely that I copied and pasted an earlier test case. Evidently, I failed to correctly alter the assertion to specify the new test case.

Why was I even copying and pasting the test? Because the test is too complicated. As a rule of thumb, pay attention to copy-and-paste, also of test cases. Test code should be well-factored, without too much duplication, for the same reasons that apply to other code. If a test is too difficult to write, it indicates that the API is too difficult to use. This is feedback about the API design, and you should consider if simplification is possible.

And the test is, indeed, too complicated. What I wanted to verify is that Galapagos.cellFinch (snd actual) is Nothing. If so, why didn't I just write Galapagos.cellFinch (snd actual) @?= Nothing? Because that doesn't compile.

The issue is that the data type in question (Finch) doesn't have an Eq instance, which @?= requires. Thus, I had to project the value I wanted to examine to a value that does have an Eq instance, such finchHP, which is an Int.

Why didn't I assert on isNothing instead? Eventually, I did, but the problem with asserting on raw Boolean values is that when the assertion fails, you get no good feedback. The test runner will only tell you that the expected value was True, but the actual value False.

The assertBool action offers a solution, but then you have to write your own error message, and I wasn't yet ready to do that. Eventually, I did, though.

The bottom line is that the risk of making mistakes increases, the more complicated things become. This also applies to functional programming, but in reality, simple is rarely easy.

Conclusion #

Tautological assertions are not only caused by aliasing, but also plain human error. In this article, you've seen an example of such an error in Haskell. This is a language with a type system that can identify many errors at compile time. Even so, some errors are run-time errors, and when it comes to TDD, in the red phase, the absence of failure indicates an error.

I find it important to share such errors when they occur, because they are at the same time regular and uncommon. While rare, they still happen with some periodicity. Since, after all, they don't happen that often, it may take time if you only have your own mistakes to learn from. So learn from mine, too.



Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Monday, 15 December 2025 14:12:00 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!
Published: Monday, 15 December 2025 14:12:00 UTC