A C# example with xUnit.net and CsCheck

This is the first comprehensive example that accompanies the article Epistemology of interaction testing. In that article, I argue that in a code base that leans toward functional programming (FP), property-based testing is a better fit than interaction-based testing. In this example, I will show how to refactor simple interaction-based tests into property-based tests.

This small article series was prompted by an email from Sergei Rogovtsev, who was kind enough to furnish example code. I'll use his code as a starting point for this example, so I've forked the repository. If you want to follow along, all my work is in a branch called no-mocks. That branch simply continues off the master branch.

Interaction-based testing #

Sergei Rogovtsev writes:

"A major thing to point out here is that I'm not following TDD here not by my own choice, but because my original question arose in a context of a legacy system devoid of tests, so I choose to present it to you in the same way. I imagine that working from tests would avoid a lot of questions."

Even when using test-driven development (TDD), most code bases I've seen make use of Stubs and Mocks (or, rather, Spies). In an object-oriented context this can make much sense. After all, a catch phrase of object-oriented programming is tell, don't ask.

If you base API design on that principle, you're modelling side effects, and it makes sense that tests use Spies to verify those side effects. The book Growing Object-Oriented Software, Guided by Tests is a good example of this approach. Thus, even if you follow established good TDD practice, you could easily arrive at a code base reminiscent of Sergei Rogovtsev's example. I've written plenty of such code bases myself.

Sergei Rogovtsev then extracts a couple of components, leaving him with a Controller class looking like this:

public string Complete(string state, string code)
{
    var knownState = _repository.GetState(state);
    try
    {
        if (_stateValidator.Validate(code, knownState))
            return _renderer.Success(knownState);
        else
            return _renderer.Failure(knownState);
    }
    catch (Exception e)
    {
        return _renderer.Error(knownState, e);
    }
}

This code snippet doesn't show the entire class, but only its solitary action method. Keep in mind that the entire repository is available on GitHub if you want to see the surrounding code.

The Complete method orchestrates three injected dependencies: _repository, _stateValidator, and _renderer. The question that Sergei Rogovtsev asks is how to test this method. You may think that it's so simple that you don't need to test it, but keep in mind that this is a minimal and self-contained example that stands in for something more complicated.

The method has a cyclomatic complexity of 3, so you need at least three test cases. That's also what Sergei Rogovtsev's code contains. I'll show each test case in turn, while I refactor them.

The overall question is still this: Both IStateValidator and IRenderer interfaces have only a single production implementation, and in both cases the implementations are pure functions. If interaction-based testing is suboptimal, is there a better way to test this code?

As I outlined in the introductory article, I consider property-based testing a good alternative. In the following, I'll refactor the tests. Since the tests already use AutoFixture, most of the preliminary work can be done without choosing a property-based testing framework. I'll postpone that decision until I need it.

State validator #

The IStateValidator interface has a single implementation:

public class StateValidator : IStateValidator
{
    public bool Validate(string code, (string expectedCode, bool isMobile, Uri redirect) knownState)
        => code == knownState.expectedCode;
}

The Validate method is a pure function, so it's completely deterministic. It means that you don't have to hide it behind an interface and replace it with a Test Double in order to control it. Rather, just feed it proper data. Still, that's not what the interaction-based tests do:

[Theory]
[AutoData]
public void HappyPath(string state, string code, (string, bool, Uri) knownState, string response)
{
    _repository.Add(state, knownState);
    _stateValidator
        .Setup(validator => validator.Validate(code, knownState))
        .Returns(true);
    _renderer
        .Setup(renderer => renderer.Success(knownState))
        .Returns(response);
 
    _target
        .Complete(state, code)
        .Should().Be(response);
}

These tests use AutoFixture, which will make it a bit easier to refactor them to properties. It also makes the test a bit more abstract, since you don't get to see concrete test data. In short, the [AutoData] attribute will generate a random state string, a random code string, and so on. If you want to see an example with concrete test data, the next article shows that variation.

The test uses Moq to control the behaviour of the Test Doubles. It states that the Validate method will return true when called with certain arguments. This is possible because you can redefine its behaviour, but as far as executable specifications go, this test doesn't reflect reality. There's only one Validate implementation, and it doesn't behave like that. Rather, it'll return true when code is equal to knownState.expectedCode. The test poorly communicates that behaviour.

Even before I replace AutoFixture with CsCheck, I'll prepare the test by making it more honest. I'll replace the code parameter with a Derived Value:

[Theory]
[AutoData]
public void HappyPath(string state, (string, bool, Uri) knownState, string response)
{
    var (expectedCode, _, _) = knownState;
    var code = expectedCode;
    // The rest of the test...

I've removed the code parameter to replace it with a variable derived from knownState. Notice how this documents the overall behaviour of the (sub-)system.

This also means that I can now replace the IStateValidator Test Double with the real, pure implementation:

[Theory]
[AutoData]
public void HappyPath(string state, (string, bool, Uri) knownState, string response)
{
    var (expectedCode, _, _) = knownState;
    var code = expectedCode;
    _repository.Add(state, knownState);
    _renderer
        .Setup(renderer => renderer.Success(knownState))
        .Returns(response);
    var sut = new Controller(_repository, new StateValidator(), _renderer.Object);
 
    sut
        .Complete(state, code)
        .Should().Be(response);
}

I give the Failure test case the same treatment:

[Theory]
[AutoData]
public void Failure(string state, (string, bool, Uri) knownState, string response)
{
    var (expectedCode, _, _) = knownState;
    var code = expectedCode + "1"; // Any extra string will do
    _repository.Add(state, knownState);
    _renderer
        .Setup(renderer => renderer.Failure(knownState))
        .Returns(response);
    var sut = new Controller(_repository, new StateValidator(), _renderer.Object);
 
    sut
        .Complete(state, code)
        .Should().Be(response);
}

The third test case is a bit more interesting.

An impossible case #

Before I make any changes to it, the third test case is this:

[Theory]
[AutoData]
public void Error(
    string state,
    string code,
    (string, bool, Uri) knownState,
    Exception e,
    string response)
{
    _repository.Add(state, knownState);
    _stateValidator
        .Setup(validator => validator.Validate(code, knownState))
        .Throws(e);
    _renderer
        .Setup(renderer => renderer.Error(knownState, e))
        .Returns(response);
 
    _target
        .Complete(state, code)
        .Should().Be(response);
}

This test case verifies the behaviour of the Controller class when the Validate method throws an exception. If we want to instead use the real, pure implementation, how can we get it to throw an exception? Consider it again:

public bool Validate(string code, (string expectedCode, bool isMobile, Uri redirect) knownState)
    => code == knownState.expectedCode;

As far as I can tell, there's no way to get this method to throw an exception. You might suggest passing null as the knownState parameter, but that's not possible. This is a new version of C# and the nullable reference types feature is turned on. I spent some fifteen minutes trying to convince the compiler to pass a null argument in place of knownState, but I couldn't make it work in a unit test.

That's interesting. The Error test is exercising a code path that's impossible in production. Is it redundant?

It might be, but here I think that it's more an artefact of the process. Sergei Rogovtsev has provided a minimal example, and as it sometimes happens, perhaps it's a bit too minimal. He did write, however, that he considered it essential for the example that the logic involved more than a Boolean true/false condition. In order to keep with the spirit of the example, then, I'm going to modify the Validate method so that it's also possible to make it throw an exception:

public bool Validate(string code, (string expectedCode, bool isMobile, Uri redirect) knownState)
{
    if (knownState == default)
        throw new ArgumentNullException(nameof(knownState));
 
    return code == knownState.expectedCode;
}

The method now throws an exception if you pass it a default value for knownState. From an implementation standpoint, there's no reason to do this, so it's only for the sake of the example. You can now test how the Controller handles an exception:

[Theory]
[AutoData]
public void Error(string state, string code, string response)
{
    _repository.Add(state, default);
    _renderer
        .Setup(renderer => renderer.Error(default, It.IsAny<Exception>()))
        .Returns(response);
    var sut = new Controller(_repository, new StateValidator(), _renderer.Object);
 
    sut
        .Complete(state, code)
        .Should().Be(response);
}

The test no longer has a reference to the specific Exception object that Validate is going to throw, so instead it has to use Moq's It.IsAny API to configure the _renderer. This is, however, only an interim step, since it's now time to treat that dependency in the same way as the validator.

Renderer #

The Renderer class has three methods, and they are all pure functions:

public class Renderer : IRenderer
{
    public string Success((string expectedCode, bool isMobile, Uri redirect) knownState)
    {
        if (knownState.isMobile)
            return "{\"success\": true, \"redirect\": \"" + knownState.redirect + "\"}";
        else
            return "302 Location: " + knownState.redirect;
    }
 
    public string Failure((string expectedCode, bool isMobile, Uri redirect) knownState)
    {
        if (knownState.isMobile)
            return "{\"success\": false, \"redirect\": \"login\"}";
        else
            return "302 Location: login";
    }
 
    public string Error((string expectedCode, bool isMobile, Uri redirect) knownState, Exception e)
    {
        if (knownState.isMobile)
            return "{\"error\": \"" + e.Message + "\"}";
        else
            return "500";
    }
}

Since all three methods are deterministic, automated tests can control their behaviour simply by passing in the appropriate arguments:

[Theory]
[AutoData]
public void HappyPath(string state, (string, bool, Uri) knownState, string response)
{
    var (expectedCode, _, _) = knownState;
    var code = expectedCode;
    _repository.Add(state, knownState);
    var renderer = new Renderer();
    var sut = new Controller(_repository, renderer);
 
    var expected = renderer.Success(knownState);
    sut
        .Complete(state, code)
        .Should().Be(expected);
}

Instead of configuring an IRenderer Stub, the test can state the expected output: That the output is equal to the output that renderer.Success would return.

Notice that the test doesn't require that the implementation calls renderer.Success. It only requires that the output is equal to the output that renderer.Success would return. Thus, it has less of an opinion about the implementation, which means that it's marginally less coupled to it.

You might protest that the test now duplicates the implementation code. This is partially true, but no more than the previous incarnation of it. Before, the test used Moq to explicitly require that renderer.Success gets called. Now, there's still coupling, but this refactoring reduces it.

As a side note, this may partially be an artefact of the process. Here I'm refactoring tests while keeping the implementation intact. Had I started with a property, perhaps the test would have turned out differently, and less coupled to the implementation. If you're interested in a successful exercise in using property-based TDD, you may find my article Property-based testing is not the same as partition testing interesting.

Simplification #

Once you've refactored the tests to use the pure functions as dependencies, you no longer need the interfaces. The interfaces IStateValidator and IRenderer only existed to support testing. Now that the tests no longer use the interfaces, you can delete them.

Furthermore, once you've removed those interfaces, there's no reason for the classes to support instantiation. Instead, make them static:

public static class StateValidator
{
    public static bool Validate(
        string code,
        (string expectedCode, bool isMobile, Uri redirect) knownState)
    {
        if (knownState == default)
            throw new ArgumentNullException(nameof(knownState));
 
        return code == knownState.expectedCode;
    }
}

You can do the same for the Renderer class.
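The repository has the full result; here's a sketch of what the static Renderer plausibly looks like, showing only the Success method (Failure and Error get the same treatment):

public static class Renderer
{
    public static string Success(
        (string expectedCode, bool isMobile, Uri redirect) knownState)
    {
        if (knownState.isMobile)
            return "{\"success\": true, \"redirect\": \"" + knownState.redirect + "\"}";
        else
            return "302 Location: " + knownState.redirect;
    }
 
    // Failure and Error follow the same pattern as the instance versions shown earlier.
}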

This doesn't change the overall flow of the Controller class' Complete method, although the implementation details have changed a bit:

public string Complete(string state, string code)
{
    var knownState = _repository.GetState(state);
    try
    {
        if (StateValidator.Validate(code, knownState))
            return Renderer.Success(knownState);
        else
            return Renderer.Failure(knownState);
    }
    catch (Exception e)
    {
        return Renderer.Error(knownState, e);
    }
}

StateValidator and Renderer are no longer injected dependencies, but rather 'modules' that afford pure functions.

Both the Controller class and the tests that cover it are simpler.

Properties #

So far I've been able to make all these changes without introducing a property-based testing framework. This was possible because the tests already used AutoFixture, which, while not a property-based testing framework, already strongly encourages you to write tests without literal test data.

This makes it easy to make the final change to property-based testing. On the other hand, it's a bit unfortunate from a pedagogical perspective. This means that you didn't get to see how to refactor a 'traditional' unit test to a property. The next article in this series will plug that hole, as well as show a more realistic example.

It's now time to pick a property-based testing framework. On .NET you have a few choices. Since this code base is C#, you may consider a framework written in C#. I'm not convinced that this is necessarily better, but it's a worthwhile experiment. Here I've used CsCheck.

Since the tests already used randomly generated test data, the conversion to CsCheck is relatively straightforward. I'm only going to show one of the tests. You can always find the rest of the code in the Git repository.

[Fact]
public void HappyPath()
{
    (from state in Gen.String
     from expectedCode in Gen.String
     from isMobile in Gen.Bool
     let urls = new[] { "https://example.com", "https://example.org" }
     from redirect in Gen.OneOfConst(urls).Select(s => new Uri(s))
     select (state, (expectedCode, isMobile, redirect)))
    .Sample((state, knownState) =>
    {
        var (expectedCode, _, _) = knownState;
        var code = expectedCode;
        var repository = new RepositoryStub();
        repository.Add(state, knownState);
        var sut = new Controller(repository);
 
        var expected = Renderer.Success(knownState);
        sut
            .Complete(state, code)
            .Should().Be(expected);
    });
}

Compared to the AutoFixture version of the test, this looks more complicated. Part of it is that CsCheck (as far as I know) doesn't have the same integration with xUnit.net that AutoFixture has. That might be an issue that someone could address; after all, FsCheck has framework integration, to name an example.

Test data generators are monads so you typically leverage whatever syntactic sugar a language offers to simplify monadic composition. In C# that syntactic sugar is query syntax, which explains that initial from block.
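If the query syntax looks mysterious, it may help to see roughly what the compiler turns it into. The following is only an illustrative sketch, not code from the example code base; it relies solely on the CsCheck members (Gen.String, Gen.Bool, SelectMany) that the query syntax above already uses:

// using CsCheck;
// These two generators are equivalent. The compiler translates the query
// syntax into a SelectMany call with a result selector.
var viaQuerySyntax =
    from state in Gen.String
    from isMobile in Gen.Bool
    select (state, isMobile);
 
var viaMethodCalls =
    Gen.String.SelectMany(
        state => Gen.Bool,
        (state, isMobile) => (state, isMobile));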

The test does look too top-heavy for my taste. An equivalent problem appears in the next article, where I also try to address it. In general, the better monad support a language offers, the more elegantly you can address this kind of problem. C# isn't really there yet, whereas languages like F# and Haskell offer superior alternatives.

Conclusion #

In this article I've tried to demonstrate how property-based testing is a viable alternative to using Stubs and Mocks for verification of composition. You can try to sabotage the Controller.Complete method in the no-mocks branch and see that one or more properties will fail.
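One example of such sabotage (a hypothetical mutation, not a change taken from the repository) would be to swap the two branches of Complete:

public string Complete(string state, string code)
{
    var knownState = _repository.GetState(state);
    try
    {
        if (StateValidator.Validate(code, knownState))
            return Renderer.Failure(knownState); // Sabotage: was Renderer.Success
        else
            return Renderer.Success(knownState); // Sabotage: was Renderer.Failure
    }
    catch (Exception e)
    {
        return Renderer.Error(knownState, e);
    }
}

The HappyPath property would now observe the output of Renderer.Failure where it expects the output of Renderer.Success, so both it and the Failure property would fail.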

While the example code base that I've used for this article has the strength of being small and self-contained, it also suffers from a few weaknesses. It's perhaps a bit too abstract to truly resonate. It also uses AutoFixture to generate test data, which already takes it halfway towards property-based testing. While that makes the refactoring easier, it also means that it may not fully demonstrate how to refactor an example-based test to a property. I'll try to address these shortcomings in the next article.

Next: A restaurant example of refactoring from example-based to property-based testing.


Comments

First of all, thanks again for continuing to explore this matter. This was very enlightening, but in the end, I was left with a sense of subtle wrongness, which is very hard to pin down, and even harder to tell apart between "this is actually not right for me" and "this is something new I'm not accustomed to".

I suppose that my main question would center around difference between your tests for IStateValidator and IRenderer. Let's start with the latter:

Instead of configuring an IRenderer Stub, the test can state the expected output: That the output is equal to the output that renderer.Success would return.

Coupled with the explanation ("[the test] has less of an opinion about the implementation, which means that it's marginally less coupled to it"), this makes a lot of sense, with the only caveat that in more production-like cases comparing the output would be harder (e.g., if IRenderer operates on HttpContext to produce a full HTTP response), but that's a technicality that can be sorted out with proper assertion library. But let's now look at the IStateValidator part:

as far as executable specifications go, this test doesn't reflect reality. There's only one Validate implementation, and it doesn't behave like that. Rather, it'll return true when code is equal to knownState.expectedCode. The test poorly communicates that behaviour.

Here you act with the opposite intent: you want the test to communicate the specification, and thus be explicitly tied to the logic in the implementation (if not the actual code of it). There are two things about that that bother me. First of all, it's somewhat inconsistent, so it makes it harder for me to choose which path to follow when testing the next code I'd write (or articulating to another developer how they should do it). But what's more important - and that comes from my example being minimal, as you've already noted - is that the validation logic might be more complicated, and thus the setup would be complicated as well. And as you've already mentioned on Twitter, when changing the code in the validator implementation, you might be forced to change the implementation in the test, even if the test is more about the controller itself.

There's also another frame for the same issue: the original test read as (at least for me): "if the state is valid, we return successful response based on this state". It didn't matter what is "valid" nor did it matter what is "successful response". The new test reads as "if state in the repository matches passed code, we return successful response for the state". It still doesn't matter what is "successful response", but the definition of validity does matter. For me, this is a change of test meaning, and I'm not sure I understand where that leads me.

Let's consider the following scenario: we need to add another validity criteria, such as "state in repository has an expiration date, and this date should be in the future". We obviously need to add a couple of tests for this (negative and positive). Where do we add them? I'd say we add them into the tests for the validator itself (which are "not shown" for the purposes of brevity), but then it feels weird that we also need to change this "happy path" test...

2023-04-03 21:24 UTC

Thanks for showing CsCheck. I've put in a PR to show how I would refactor the CsCheck tests and will attempt to explain some of the design choices of CsCheck.

First of all, it may be a personal opinion but I don't really tend to use query syntax for CsCheck. I prefer to see the SelectManys and there are a number of additional overloads that simplify ranging and composing Gens.

On the design of CsCheck, I built it not to use reflection, attributes, or target test frameworks. I've seen the very difficult problems these lead to (for author and user) when you try to move past simple examples.

I wanted the user to be able to move from simple general type generators to ranged complex types easily in a fluent style. No Arbs, attributes, PositiveInt type etc.

CsCheck has automatic shrinking even for the most complex types that just comes out of composing Gens.

I think some of the reason it was so easy to extend the library to areas such as concurrency testing was because of this simplicity (as well as the random shrinking insight).

Gen<Uri> _genRedirect = Gen.OneOfConst(new Uri("https://example.com"), new Uri("https://example.org"));

[Fact]
public void HappyPath()
{
    Gen.Select(Gen.String, Gen.String, Gen.Bool, _genRedirect)
    .Sample((state, expectedCode, isMobile, redirect) =>
    {
        var code = expectedCode;
        var knownState = (expectedCode, isMobile, redirect);
        var repository = new RepositoryStub { { state, knownState } };
        var sut = new Controller(repository);
        var actual = sut.Complete(state, code);
        var expected = Renderer.Success(knownState);
        return actual == expected;
    });
}

2023-04-04 23:56 UTC

Sergei, thank you for writing. I'm afraid all of this is context-dependent, and I seem to constantly fail to give enough context. It's a fair criticism that I seem to employ inconsistent heuristics when making technical decisions. Part of it is caused by my lack of context. The code base is deliberately stripped of context, which has many other benefits, but gives me little to navigate by. I'm flying blind, so to speak. I've had to (implicitly) imagine some forces acting on the software in order to make technical decisions. Since we haven't explicitly outlined such forces, I've had to make them up as I went. It's quite possible, even, that I've imagined one set of forces in one place, and another set somewhere else. If so, no wonder the decisions are inconsistent.

What do I mean by forces? I'm thinking of the surrounding context that you have to take into account when making technical decisions about code: Is the current feature urgent? Is it a bug already in production? Is this a new system with a small code base? Or is it an old system with a large code base? Do you have good automated tests? Do you have a Continuous Delivery pipeline? Are you experiencing problems with code quality? What does the team makeup look like? Do you have mostly seasoned veterans who've worked in that code base for years? Or do you have many newcomers? Is the system public-facing or internal? Is it a system, even, or a library or framework? What sort of organisation owns the software? Is it a product group? Or a cost centre? What are the organisation's goals? How are you incentivized? How are other stakeholders incentivized?

As you can imagine, I can keep going, asking questions like these, and they may all be relevant. Clearly, we can't expect a self-contained minimal example to also include all such superstructure, so that's what I (inconsistently) have to imagine, on the fly.

I admit that the decisions I describe seem inconsistent, and the explanation may simply be what is already implied above: I may have had a different context in mind when I made one, and a variation in mind when I made the other.

That's hardly the whole story, though. I didn't start my answer with the above litany of forces only to make a bad excuse for myself. Rather, what I had in mind was to argue that I use a wider context when making decisions. That context is not just technical, but includes, among many other considerations, the team structure.

As an example, I was recently working with some students in a university setting. These are people in their early twenties, with only a few months of academic programming under their belt, as well as, perhaps, a few years of hobby programming. They'd just been introduced to Git and GitHub a few weeks earlier, C# a month before that. I was trying to teach them how to use Git and GitHub, how to structure decent C# code, and many other things. During our project, they did send me a few pull requests I would have immediately rejected from a professional programmer. In this particular context, however, that would have been counter-productive. These students were doing a good job, based on their level of experience, and they needed the sense of accomplishment that comes from having their code accepted, which I would (often, but not always) do.

I could have insisted on a higher code quality, and I would also have been able to teach it to anyone patient enough to listen. One thing I've learned all too slowly in my decades of working with other people is that most people aren't as patient with me as I'd like them to be. I need to explicitly consider how to motivate my collaborators.

Here's another example: Years ago, I worked with a rag-tag team hastily assembled via word-of-mouth of some fine European freelancers. My challenge here was a different one. These people were used to being on top of their game - usually the ones brought in to an organisation because they were the best. I needed them to work together, and among other things, it meant showing them that even though they might think that their way was the best way, other ways exist. I wanted them to be able to work together and produce code with shared ownership. At the beginning, I was rather strict with my standards, clearly bruising a few egos, but ultimately several members have told me what a positive transformative experience it was for them. It was a positive transformative experience for me, too.

I discuss all of this because you, among various points, mention the need to be able to articulate to other developers how to make technical decisions about tests. The point is that there's a lot of context that goes into making decisions, and hardly a one-size-fits-all heuristic.

What usually guides me is an emphasis on coupling, and that's also, I believe, what ultimately motivated me here. There's always going to be some coupling between tests and production code, but the less the better. For example, when considering how to write an assertion, I consider whether a change in the production code's behaviour would cause a test to break.

Consider, for example, the renderer in the present example. How important is the exact output? What happens if I change a character in the string that is being returned?

That's a good example of context being important. If that output is part of an implementation of a network protocol or some other technical spec, just one character change could, indeed, imply that your implementation is off spec. In that case, we do want to test the exact output, and we do want the test to fail if the output changes.

On the other hand, if the output is a piece of UI, or perhaps an error message, then the exact wording is likely to change over time. Since this doesn't really imply a change in behaviour, changing such a string output shouldn't break a test.

You need that wider context in order to make decisions like that: If we change the System Under Test in this way, will the test break? Should it? What if we change it in another way?

This is relevant in order to address your final concern: What if you now decide that the expiration date should be in the future? The way you describe it, it sounds like this strengthens the preconditions of the system - in other words, it breaks backwards compatibility. So yes, making that change may break existing tests, but this could be an indication that it's also going to break existing clients.

If you have any clients, that is. Again, you know your context better than I do, so only you can decide whether making such a change is okay. I can think of situations where it is, but I usually find myself in contexts where it isn't, so I tend to err on the side of avoiding breaking changes.

2023-04-09 14:28 UTC

Mark, thank you for taking the time to discuss this.

Having the magic word "architect" somewhere in my title, I know the importance of context, and in fact that would usually be the first counter-question that I have for somebody coming at me with a question: "what is your context?". So here we are, me being contractually obliged to strip as much context from my code as possible, and you having to reinvent it back from your experience. On the other hand, this allows us to point out which decisions are actually context-driven, and how different contexts affect them.

With that in mind, I can actually propose two different contexts to reframe the decisions above, so that we could arrive at more insights.

The first would be an imaginary context, which I had in mind when writing the code, but haven't thought of communicating: the renderer is as important as the validator. In case of "mobile" state the consumer is actually a mobile application, so we need to know we've produced the right JSON it will consume, and in case of non-"mobile" state the consumer is the browser, which again needs to be properly redirected. In my mind, this is no less important than the validation logic itself, because breaking it will break at least one consumer (mobile), and more likely both of them. Thus, according to the logic above, this is a compatibility issue, and as such we need to explicitly spell out this behavior in the tests. Which gives us six outcome branches... six tests? Or something more complicated? This is especially interesting, considering the fact that we can test the renderer in isolation, so we'd be either duplicating our tests... or just discarding the isolated tests for the renderer?

And then here's the actual real context, which I can thankfully divulge to this extent: this is, in fact, a migration problem, when we move from one externally-facing framework (i.e. ASP.NET Web API) to another (i.e. ASP.NET Core). So I am not, in fact, concerned about the validation at all - I'm concerned about the data being properly passed to the validator (because the validator already existed, and worked properly, I'm just calling it from another framework), its result properly handled in the controller (which I am replacing), and then I'm concerned that despite the heavy changes between ASP.NET versions I'm still rendering the output in the exactly same way.

Now that I'm thinking about it, it seems grossly unfair that I've hidden this context beforehand, but then, I didn't see how it was affecting the decisions in test design. So hopefully we can still find some use in this.

2023-04-10 18:03 UTC

Anthony, thank you for writing. I didn't intend to disparage CsCheck. Having once run a moderately successful open source project, I've learned that it's a good idea to limit the scope to a functional minimum. I'm not complaining about the design decisions made for CsCheck.

I was a bit concerned that a casual reader might compare the CsCheck example with the previous code and be put off by the apparent complexity. I admit that your example looks less complex than mine, so to address that particular concern, I could have used code similar to yours.

Whether one prefers the syntactic sugar of query syntax or the more explicit use of SelectMany is, to a degree, subjective. There are, on the other hand, some cases where one is objectively better than the other. One day I should write an article about that.

I agree that the test-framework integration that exists in FsCheck is less than ideal. I'm not wishing for anything like that. Something declarative, however, would be nice. Contrary to you, I consider wrapper types like PositiveInt to be a boon, but perhaps not like they're implemented in FsCheck. (And this, by the way, isn't to harp on FsCheck either; I only bring up those examples because FsCheck has more of that kind of API than Hedgehog does.) The Haskell QuickCheck approach is nice: While a wrapper like Positive is predefined in the package, it's just an Arbitrary instance. There's really nothing special about it, and you can easily define domain-specific wrappers for your own testing purposes. Here's an example: Naming newtypes for QuickCheck Arbitraries. I'm wondering if something similar wouldn't be possible with interfaces in .NET.

2023-04-13 8:46 UTC

Sergei, thank you for writing. I'm sorry if I came across as condescending or implying that you weren't aware that context matters. Of course you do. I was mostly trying to reconstruct my own decision process, which is far less explicit than you might like.

Regarding the renderer component, I understand that testing such a thing in reality may be more involved than the toy example we're looking at. My first concern is to avoid duplicating efforts too much. Again, however, the external behaviour should be the primary concern. I'm increasingly shifting towards making more and more things internal, as long as I can still test them via the application boundary. As I understand it, this is the same concern that made Dan North come up with behaviour-driven development. I'm also concerned primarily with testing the behaviour of systems, just without all the Gherkin overhead.

There comes a time, however, when testing everything through an external API becomes too awkward. I'm not averse to introducing or making public classes or functions to enable testing at a different abstraction level. Doing that, however, represents a bet that it's possible to keep the new component's API stable enough that it isn't going to cause too much test churn. Still, if we imagine that we've already made such a decision, and that we now have some renderer component, then it's only natural to test it thoroughly. Then, in order to avoid duplicating assertions, we can state, as I did in this article, that the overall system should expect to see whatever the renderer component returns.

That was, perhaps, too wordy an explanation. Perhaps this is more helpful: Don't repeat yourself. What has been asserted in one place shouldn't be asserted in another place.

The other example you mention, about migrating to another framework, reminds me of two things.

The first is that we shouldn't forget about other ways of verifying code correctness. We've mostly been discussing black-box testing, and while it can be an interesting exercise to imagine an adversary developer, in general that's rarely the reality. Are there other ways to verify that methods are called with the correct values? How about looking at the code? Consistent code reviews are good at detecting bugs.

The second observation is that if you already have two working components (validator and renderer) you can treat them as Test Oracles. This still works well with property-based testing. Write tests based on equivalence classes and get a property-based testing framework to sample from those classes. Then use the Test Oracles to define the expected output. That's essentially what I have done in this article.

Does it prove that your framework-based code calls the components with the correct arguments? No, not like if you'd used a Test Spy. Property-based testing produces knowledge reminiscent of the kind of knowledge produced by experimental physics; not the kind of axiomatic knowledge produced by mathematics. That's why I named this article series Epistemology of interaction testing.

Is it wrong to test with Stubs or Spies in a case like this? Not necessarily. Ultimately, what I try to do with this blog is to investigate and present alternatives. Only once we have alternatives do we have choices.

2023-04-15 15:53 UTC


Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Monday, 03 April 2023 06:02:00 UTC

"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!
Published: Monday, 03 April 2023 06:02:00 UTC