ploeh blog

FizzBuzz kata in F#: stage 2

Wednesday, 25 July 2012 09:05:09 UTC

In my previous post I walked you through stage 1 of the FizzBuzz kata. In this post I'll walk you through stage 2 of the kata, where new requirements are introduced (see the kata itself for details). This makes the implementation much more complex.

Unit test #

In order to meet the new requirements, I first modified and expanded my existing test cases:

[<Theory>]
[<InlineData(1, "1")>]
[<InlineData(2, "2")>]
[<InlineData(3, "Fizz")>]
[<InlineData(4, "4")>]
[<InlineData(5, "Buzz")>]
[<InlineData(6, "Fizz")>]
[<InlineData(7, "7")>]
[<InlineData(8, "8")>]
[<InlineData(9, "Fizz")>]
[<InlineData(10, "Buzz")>]
[<InlineData(11, "11")>]
[<InlineData(12, "Fizz")>]
[<InlineData(13, "Fizz")>]
[<InlineData(14, "14")>]
[<InlineData(15, "FizzBuzz")>]
[<InlineData(16, "16")>]
[<InlineData(17, "17")>]
[<InlineData(18, "Fizz")>]
[<InlineData(19, "19")>]
[<InlineData(20, "Buzz")>]
[<InlineData(30, "FizzBuzz")>]
[<InlineData(31, "Fizz")>]
[<InlineData(32, "Fizz")>]
[<InlineData(33, "Fizz")>]
[<InlineData(34, "Fizz")>]
[<InlineData(35, "FizzBuzz")>]
[<InlineData(36, "Fizz")>]
[<InlineData(37, "Fizz")>]
[<InlineData(38, "Fizz")>]
[<InlineData(39, "Fizz")>]
[<InlineData(50, "Buzz")>]
[<InlineData(51, "FizzBuzz")>]
[<InlineData(52, "Buzz")>]
[<InlineData(53, "FizzBuzz")>]
[<InlineData(54, "FizzBuzz")>]
[<InlineData(55, "Buzz")>]
[<InlineData(56, "Buzz")>]
[<InlineData(57, "FizzBuzz")>]
[<InlineData(58, "Buzz")>]
[<InlineData(59, "Buzz")>]
let FizzBuzzReturnsCorrectResult number expected =
    number
    |> FizzBuzz
    |> should equal expected

This is the same test code as before, only with new or modified test data.

Implementation #

Compared with the stage 1 implementation, my implementation to meet the new requirements is much more complex. First, I'll post the entire code listing and then walk you through the details:

let FizzBuzz number =
    let arithmeticFizzBuzz number =
        seq {
            if number % 3 = 0 then yield "Fizz"
            if number % 5 = 0 then yield "Buzz"
            }
 
    let digitalFizzBuzz digit =
        seq {
            if digit = 3 then yield "Fizz"
            if digit = 5 then yield "Buzz"
            }
 
    let rec digitize number =
        seq {
                yield number % 10
                let aTenth = number / 10
                if aTenth >= 1 then yield! digitize aTenth
            }
 
    let arithmeticFizzBuzzes = number |> arithmeticFizzBuzz
    let digitalFizzBuzzes = number
                            |> digitize
                            |> Seq.collect digitalFizzBuzz
 
    let fizzOrBuzz = arithmeticFizzBuzzes
                     |> Seq.append digitalFizzBuzzes
                     |> Seq.distinct
                     |> Seq.toArray
                     |> Array.sort
                     |> Array.rev
                     |> String.Concat
 
    if fizzOrBuzz = ""
    then number.ToString()
    else fizzOrBuzz

First of all, you may wonder where the original implementation went. According to the requirements, the function must still 'Fizz' or 'Buzz' when a number is divisible by 3 or 5. This is handled by the nested arithmeticFizzBuzz function:

let arithmeticFizzBuzz number =
    seq {
        if number % 3 = 0 then yield "Fizz"
        if number % 5 = 0 then yield "Buzz"
        }

The seq symbol specifies a sequence expression, which means that everything within the curly brackets is expected to produce parts of a sequence. It works a bit like the yield keyword in C#.

Due to F#'s strong type inference, the type of the function is int -> seq<string>, which means that it takes an integer as input and returns a sequence of strings. In C# an equivalent signature would be IEnumerable<string> arithmeticFizzBuzz(int number). This function produces a sequence of strings depending on the input.

1 produces an empty sequence.
2 produces an empty sequence.
3 produces a sequence containing the single string "Fizz".
4 produces an empty seqence.
5 produces a sequence containing the single string "Buzz".
6 produces a sequence containing the single string "Fizz".
15 produces a sequence containing the strings "Fizz" and "Buzz" (in that order).

That doesn't quite sound like the original requirements, but the trick will be to concatenate the strings. Thus, an empty sequence will be "", "Fizz" will be "Fizz", "Buzz" will be "Buzz", but "Fizz" and "Buzz" will become "FizzBuzz".

The digitalFizzBuzz function works in much the same way, but expects only a single digit.

let digitalFizzBuzz digit =
    seq {
        if digit = 3 then yield "Fizz"
        if digit = 5 then yield "Buzz"
        }

1 produces an empty sequence.
2 produces an empty sequence.
3 produces a sequence containing the single string "Fizz".
4 produces an empty seqence.
5 produces a sequence containing the single string "Buzz".
6 produces an empty sequence.

In order to be able to apply the new rule of Fizzing and Buzzing if a digit is 3 or 5, it's necessary to split a number into digits. This is done by the recursive digitize function:

let rec digitize number =
    seq {
            yield number % 10
            let aTenth = number / 10
            if aTenth >= 1 then yield! digitize aTenth
        }

This function works recursively by first yielding the rest of a division by 10, and then calling itself recursively with a tenth of the original number. Since the number is an integer, the division simply still produces an integer. The function produces a sequence of digits, but in a sort of backwards way.

1 produces a sequence containing 1.
2 produces a sequence containing 2.
12 produces a sequence containing 2 followed by 1.
23 produces a sequence containing 3 followed by 2.
148 produces 8, 4, 1.

This provides all the building blocks. To get the arithmetic (original) FizzBuzzes, the number is piped into the arithmeticFizzBuzz function:

let arithmeticFizzBuzzes = number |> arithmeticFizzBuzz

In order to get the digital (new) FizzBuzzes, the number is first piped into the digitize function, and the resulting sequence of digits is then piped to the digitalFizzBuzz function by way of the Seq.collection function.

let digitalFizzBuzzes = number
                        |> digitize
                        |> Seq.collect digitalFizzBuzz

The Seq.collect function is a built-in function that takes a sequence of elements (in this case a sequence of digits) and for each element calls a method that produces a sequence of elements, and then concatenates all the produced sequences. As an example, consider the number 53.

Calling digitize with the number 53 produces the sequence { 3; 5 }. Calling digitalFizzBuzz with 3 produces the sequence { "Fizz" } and calling digitalFizzBuzz with 5 produces { "Buzz" }. Seq.collect concatenates these two sequences to produce the single sequence { "Fizz"; "Buzz" }.

Now we have two sequences of "Fizz" or "Buzz" strings - one produced by the old, arithmetic function, and one produced by the new, digital function. These two sequences can now be merged and ordered with the purpose of producing a single string:

let fizzOrBuzz = arithmeticFizzBuzzes
                 |> Seq.append digitalFizzBuzzes
                 |> Seq.distinct
                 |> Seq.toArray
                 |> Array.sort
                 |> Array.rev
                 |> String.Concat

First, the Seq.append function simply concatenates the two sequences into a single sequence. This could potentially result in a sequence like this: { "Fizz"; "Buzz"; "Fizz" }. The Seq.distinct function gets rid of the duplicates, but the ordering may be wrong - the sequence may look like this: { "Buzz"; "Fizz" }. This can be fixed by sorting the sequence, but sorting alphabetically would always put "Buzz" before "Fizz" so it's also necessary to reverse the sequence. There's no function in the Seq module which can reverse a sequence, so first the Seq.toArray function is used to convert the sequence to an array. After sorting and reversing the array, the result is one of four arrays: [], [ "Fizz" ], [ "Buzz" ], or [ "Fizz"; "Buzz" ]. The last step is to concatenate these string arrays to a single string using the String.Concat BCL method.

If there were no Fizzes or Buzzes, the string will be empty, in which case the number is converted to a string and returned; otherwise, the fizzOrBuzz string is returned.

if fizzOrBuzz = ""
then number.ToString()
else fizzOrBuzz

To print the FizzBuzz list for numbers from 1 to 100 the same solution as before can be used.

What I like about Functional Programming is that data just flows through the function. There's not state and no mutation - only operations on sequences of data.

FizzBuzz kata in F#: stage 1

Friday, 20 July 2012 05:26:12 UTC

In previous posts I've walked through the Bank OCR kata in F#. In this post, I will do the same for the first stage of the very simple FizzBuzz kata. This is a very simple kata, so if you already know F#, there will be nothing new to see here. On the other hand, if you've yet to be exposed to F#, this is a good place to start - I'll attempt to walk you through the code assuming that you don't know F# yet.

Unit test #

Since I developed the solution using Test-Driven Development, I started by writing a single Parameterized Test:

[<Theory>]
[<InlineData(1, "1")>]
[<InlineData(2, "2")>]
[<InlineData(3, "Fizz")>]
[<InlineData(4, "4")>]
[<InlineData(5, "Buzz")>]
[<InlineData(6, "Fizz")>]
[<InlineData(7, "7")>]
[<InlineData(8, "8")>]
[<InlineData(9, "Fizz")>]
[<InlineData(10, "Buzz")>]
[<InlineData(11, "11")>]
[<InlineData(12, "Fizz")>]
[<InlineData(13, "13")>]
[<InlineData(14, "14")>]
[<InlineData(15, "FizzBuzz")>]
[<InlineData(16, "16")>]
[<InlineData(17, "17")>]
[<InlineData(18, "Fizz")>]
[<InlineData(19, "19")>]
[<InlineData(20, "Buzz")>]
let FizzBuzzReturnsCorrectResult number expected =
    number
    |> FizzBuzz
    |> should equal expected

This test uses xUnit.net data theories to provide a set of test data in the form of an integer as input and an expected string.

The number input variable is piped to the FizzBuzz function, using F#'s pipe operator |>. This is just another way of writing

FizzBuzz number

The pipe operator simply takes the data being piped and uses it as the last input parameter to the function being piped. In this case, the number integer variable is the data being piped, so it's used as the last input parameter to the FizzBuzz function, which only takes a single paramter.

The result of invoking the FizzBuzz function is a string. This result is again piped to the should method, which is defined by the FsUnit module. The should method is an assertion function that takes three input parameters. The first two parameters are supplied as part of the function invokation as equal expected, but since the pipe operator is being used, the third and final parameter value is the result of invoking the FizzBuzz function.

In all, the test states that when the FizzBuzz function is called with number, the result should be equal to the expected string.

Implementation #

The FizzBuzz implementation is really simple:

let FizzBuzz number =
    match number with
    | i when i % 3 = 0 && i % 5 = 0 -> "FizzBuzz"
    | i when i % 3 = 0 -> "Fizz"
    | i when i % 5 = 0 -> "Buzz"
    | _ -> number.ToString()

All it does is to use pattern matching against the number input argument. In all cases except the last one, the value of number is matched against any number, but with a condition. The first condition is that the number should be divible by both 3 and 5. If this is the case, the result to the right of the -> operator is returned from the function ("FizzBuzz").

The last line of the match block uses an underscore as the match pattern. This is a catch-all pattern that's being triggered if none of the other patterns are matched. In this case, the number input argument is converted to a string and returned.

Printing all lines #

The kata requires me to print the output for all numbers from 1 to 100. The astute reader may have noticed that the FizzBuzz function doesn't do that - it only converts a single integer to a string. However, printing all numbers fizzed and buzzed is easy:

[1..100]
|> List.map FizzBuzz
|> List.reduce (sprintf "%s\r\n%s")

The first line defines a list of numbers from 1 to 100. The next line pipes this list of integers into the List.map function, which applies the FizzBuzz function to each integer. The output of this function call is another list of strings ["1"; "2"; "Fizz"; "4"; "Buzz"; etc.]. This list of strings is piped into the List.reduce function, which in this case uses the sprintf function to concatenate the strings and add a line break after each element, so that it formats correctly.

The List.reduce function applies a function to pairwise elements in a list in order to produce a new element of the same type. Consider the beginning of the list of strings ["1"; "2"; "Fizz"; "4"; "Buzz"; etc.]. The List.reduce function starts with "1" and "2" and applies a function in order to produce a new string from those two strings. That function is the sprintf function, which is similar to the more well-known String.Format method in the BCL. In this case, the template is to take the two strings and insert a line break between them. Thus, when applied to "1" and "2", the result is

1
2

(notice the line break). Now, the List.reduce function takes that string and the next string in the list ("Fizz") and applies the funtion again, giving this result:

1
2
Fizz

It now takes this string and the next value ("4") and applies the sprintf function once more, etc. This is how the final list is being printed.

In a future post I'll walk you through stage 2 of the kata.

Comments

Simon Skov Boisen #

One difference between sprintf and String.Format that is worth mentioning is that sprintf offers typesafety - it won't let you pass an integer to a %s since %s represents a string - it's a compile time error. More information available at http://msdn.microsoft.com/en-us/library/ee370560.aspx.

2012-08-06 11:03 UTC

Hyprlinkr

Wednesday, 18 July 2012 09:46:48 UTC

This post serves as an announcement that my latest open source project Hyprlinkr is available. It's a very sharply focused helper library for the ASP.NET Web API, enabling you to create type-safe hyperlinks between resources in a RESTful API.

It's basically a reusable package of the RouteLinker class I previously presented as a spike. The original idea wasn't mine, but José F. Romaniello's - I just took the idea and created a project out of it.

The library is mostly useful if you build level 3 RESTful APIs with the ASP.NET Web API.

It's available as a NuGet package.

Apart from that, I'll refer you to the project site for more information.

Comments

Anthony Johnston #

Thanks for this Mark, it will surely come in use

The one thing I am struggling to reconcile, you must have some insight on, is the addition of the link properties to the model.

I am thinking that hyperlinks should be a responsibility of the serializer and not the model...

2012-07-18 11:23 UTC

Mark Seemann #

In the cases I can think about right now, I'd say that the Controller (or one of its dependencies) must have the responsibility of adding the hyperlinks. It's an application concern to define what can be linked to. This isn't any different than adding "<a href" links to HTML in a 'normal' web application.

2012-07-18 11:45 UTC

Jose Romaniello #

thanks for mentioning my article, I really like what you did.

At the time I wrote that article i was trying different syntax to create a DSL to express a workflow, like the coffee workflow in "Rest on practice".

i think this is an step forward

thanks

2012-07-18 23:02 UTC

Primitive Dependencies

Monday, 02 July 2012 09:26:30 UTC

Primitives are also dependencies

There are tons of examples of how Dependency Injection (DI) can be used to decouple clients and services. When the subject is DI, the focus tends to be heavily on the Liskov Substitution Principle (LSP), so most people think about dependencies as polymorphic types (interfaces or abstract base classes). Primitive types like strings and integers tend to be ignored or discouraged. It doesn't help that most DI Containers need extra help to deal with such values.

Primitives are dependencies, too. It doesn't really matter whether or not they are polymorphic. In the end, a dependency is something that the client depends on - hence the name. It doesn't really matter whether the dependency is an interface, a class or a primitive type. In most object-oriented languages, everything is an object - even integers and booleans (although boxing occurs).

There are several ways to inject dependencies into clients. My book describes a set of patterns including Constructor Injection and Property Injection. It's important to keep in mind that ultimately, the reason why Constructor Injection should be your preferred DI pattern has nothing to do with polymorphism. It has to do with protecting the invariants of the class.

Therefore, if the class in question requires a primitive value in order to work, that is a dependency too. Primitive constructor arguments can be mixed with polymorphic arguments. There's really no difference.

Example: a chart reader #

Imagine that you're building a service which provides Top 40 music chart data. There's a ChartController which relies on an IChartReader:

public class ChartController
{
    private readonly IChartReader chartReader;
 
    public ChartController(IChartReader chartReader)
    {
        if (chartReader == null)
            throw new ArgumentNullException("chartReader");
 
        this.chartReader = chartReader;
    }
 
    // ...
}

One implementation of IChartReader is based on a database, so it requires a connection string (a primitive). It also requires a configuration value which establishes the size of the chart:

public class DbChartReader : IChartReader
{
    private readonly int top;
    private readonly string chartConnectionString;
 
    public DbChartReader(int top, string chartConnectionString)
    {
        if (top <= 0)
            throw new ArgumentOutOfRangeException(
                "top",
                "Only positive numbers allowed.");
        if (chartConnectionString == null)
            throw new ArgumentNullException("chartConnectionString");
 
        this.top = top;
        this.chartConnectionString = chartConnectionString;
    }
 
    // ...
}

When top has the value 40, the chart is a Top 40 chart; when the value is 10 it's a Top 10 chart; etc.

Unit testing #

Obviously, a class like DbChartReader is easy to wire up in a unit test:

[Fact]
public void UnitTestingExample()
{
    var sut = new DbChartReader(
        top: 10,
        chartConnectionString: "localhost;foo;bar");
 
    // Act goes here...
    // Assert goes here...
}

Hard-coded composition #

When it's time to bootstrap a complete application, one of the advantages of treating primitives as dependencies is that you have many options for how and where you define those values. At the beginning of an application's lifetime, the best option is often to hard-code some or all of the values. This is as easy to do with primitive dependencies as with polymorphic dependencies:

var controller = new ChartController(
    new DbChartReader(
        top: 40,
        chartConnectionString: "foo"));

This code is part of the application's Composition Root.

Configuration-based composition #

If the time ever comes to move the arms of the the Configuration Complexity Clock towards using the configuration system, that's easy to do too:

var topString = ConfigurationManager.AppSettings["top"];
var top = int.Parse(topString);
 
var chartConnectionString = ConfigurationManager
    .ConnectionStrings["chart"].ConnectionString;
 
var controller = new ChartController(
    new DbChartReader(
        top,
        chartConnectionString));

This is still part of the Composition Root.

Wiring a DI Container with primitives #

Most DI Containers need a little help with primitives. You can configure components with primitives, but you often need to be quite explicit about it. Here's an example of configuring Castle Windsor:

container.Register(Component
    .For<ChartController>());
container.Register(Component
    .For<IChartReader>()
    .ImplementedBy<DbChartReader>()
    .DependsOn(
        Dependency.OnAppSettingsValue("top"),
        Dependency.OnValue<string>(
            ConfigurationManager.ConnectionStrings["chart"]
                .ConnectionString)));

This configures the ChartController type exactly like the previous example, but it's actually more complicated now, and you even lost the feedback from the compiler. That's not good, but you can do better.

Conventions for primitives #

A DI Container like Castle Windsor enables you define your own conventions. How about these conventions?

If a dependency is a string and it ends with "ConnectionString", the part of the name before "ConnectionString" is the name of an entry in the app.config's connectionStrings element.
If a dependency is a primitive (e.g. an integer) the name of the constructor argument is the key to the appSettings entry.

That would be really nice because it means that you can keep on evolving you application by adding code, and it just works. Need a connection string to the 'artist database'? Just add a constructor argument called "artistConnectionString" and a corresponding artist connection string in your app.config.

Here's how those conventions could be configured with Castle Windsor:

container.Register(Classes
    .FromAssemblyInDirectory(new AssemblyFilter(".")
        .FilterByName(an => an.Name.StartsWith("Ploeh")))
    .Pick()
    .WithServiceAllInterfaces());
 
container.Kernel.Resolver.AddSubResolver(
    new ConnectionStringConventions());
container.Kernel.Resolver.AddSubResolver(
    new AppSettingsConvention());

The Register call scans all appropriate assemblies in the application's root and registers all components according to the interfaces they implement, while the two sub-resolvers each implement one of the conventions described above.

public class ConnectionStringConventions : ISubDependencyResolver
{
    public bool CanResolve(
        CreationContext context,
        ISubDependencyResolver contextHandlerResolver,
        ComponentModel model,
        DependencyModel dependency)
    {
        return dependency.TargetType == typeof(string)
            && dependency.DependencyKey.EndsWith("ConnectionString");
    }
 
    public object Resolve(
        CreationContext context,
        ISubDependencyResolver contextHandlerResolver,
        ComponentModel model,
        DependencyModel dependency)
    {
        var name = dependency.DependencyKey.Replace("ConnectionString", "");
        return ConfigurationManager.ConnectionStrings[name].ConnectionString;
    }
}

The CanResolve method ensures that the Resolve method is only invoked for string dependencies with names ending with "ConnectionString". If that's the case, the connection string is simply read from the app.config file according to the name.

public class AppSettingsConvention : ISubDependencyResolver
{
    public bool CanResolve(
        CreationContext context,
        ISubDependencyResolver contextHandlerResolver,
        ComponentModel model,
        DependencyModel dependency)
    {
        return dependency.TargetType == typeof(int); // or bool, Guid, etc.
    }
 
    public object Resolve(
        CreationContext context,
        ISubDependencyResolver contextHandlerResolver,
        ComponentModel model,
        DependencyModel dependency)
    {
        var appSettingsKey = dependency.DependencyKey;
        var s = ConfigurationManager.AppSettings[appSettingsKey];
        return Convert.ChangeType(s, dependency.TargetType);
    }
}

This other convention can be used to trigger on primitive dependencies. Since this is a bit of demo code, it only triggers on integers, but I'm sure you'll be able to figure out how to make it trigger on other types as well.

Using convention-based techniques like these can turn a DI Container into a very powerful piece of infrastructure. It just sit there, and it just works, and rarely do you have to touch it. As long as all developers follow the conventions, things just work.

Comments

Roopesh Shenoy #

Convention over configuration - A bit magical, but Nice!

2012-07-02 11:18 UTC

Marcus #

Great post!

2012-07-04 11:42 UTC

Gary McLean Hall #

Do you know if Unity supports this sort of convention?

2012-07-05 20:29 UTC

Mark Seemann #

It's been almost two years since I last worked with Unity, so I don't know - I can't remember all the details of its API. However, I remember Unity as being quite extensible, actually, so I wouldn't be surprised if you could do something like this.

Basically, Unity has very little in terms of Fluent APIs, convention over configuration, etc. but what it does have is a very open architecture. This means it's almost always possible to write a bit of Reflection code to configure Unity by convention.

FWIW, there's a chapter in my book about Unity and its extensibility mechanisms, but to be fair, it doesn't cover exactly this scenario.

2012-07-06 06:16 UTC

Gary McLean Hall #

I haven't gotten that far in your book, yet ;)

It appears it is possible, and that someone has created an extension for convention:

http://aspiringcraftsman.com/2009/06/13/convention-based-registration-extension/

2012-07-06 14:34 UTC

Acaz #

It seems like a over-engineering to me. Make things simple like in "Wiring a DI Container with primitives" is the best option for me. The other options below seems confuse and unnecessary to me.

2012-09-08 22:31 UTC

Facade Test

Wednesday, 27 June 2012 15:11:08 UTC

This post proposes the term Facade Test for an automated test which is more than a unit test, but not quite an integration test.

There are several definitions of what a unit test is, but the one that I personally prefer goes something like this:

A unit test is an automated test that tests a unit in isolation.

This is still a rather loose definition. What's a unit? What's does isolation mean? The answers to these questions seem to be a bit fluffy, but I think it's safe to say that a unit isn't bigger than a class. It's not a library (or 'assembly' in .NET terms). It's not all types in a namespace. In my opinion, a unit is a method... and a bit more. In OO, a method is coupled to it's defining class, which is the reason why I think of a unit as a bit more than a method.

If that's a unit, then isolation means that you should be able to test the unit without using or relying on any other moving pieces of production code.

Often, unit testing means that you have to decouple objects in order to achieve isolation. There's a few ways to do that, but Dependency Injection is a very common approach. That often leads to heavy use of dynamic mocks and too many interfaces that exist exclusively to support unit testing.

Sometimes, a good alternative is to write automated tests that exercise a larger group of classes. When all the collaborating classes are deterministic and implemented within the same library, this can be an effective approach. Krzysztof Koźmic describes that

Windsor, has very few unit tests. Most tests there are (and we have close to 1000 of them) exercise entire container with no part stubbed out in simulations of real life scenarios

Another example is AutoFixture, which has hundreds of automated tests against the Fixture class.

We need a name for tests like these. While we could call them Integration Tests, this label would lump them together with tests that exercise several libraries at once.

To distinguish such tests from Unit Tests on the one side, and Integration Tests on the other side, I suggest the name Facade Test

Facade Test

The reason why I suggest the term Facade Test is that such automated tests tend to mostly target a small subset of classes that act as Facades over many other classes in a library.

Comments

matthewr #

I prefer Composite Tests or even Group Tests.

2012-06-27 22:23 UTC

Bill Watson #

We break it down like this.

Unit test: usually just one class but occasionally a couple of closely related classes, constructed and executed directly from the test code. Runs fast and no contact with the outside world.

Database or Comms test: testing in isolation that a single class that interacts with something externally works. e.g. a class that accesses the database actually accesses the database, a class that contacts a remote server actual contacts a remote server (probably a special one we wrote for testing purposes).

Scenario test: multiple classes, but with all external dependencies replaced with test doubles (e.g. no database access, no SOAP calls, no HTTP requests). Object graph is constructed via the same mechanism as the real object graph. So in some sense this tests the DI configuration. This is a form of integration test, but everything runs in memory so they are quick. Most of our tests are at this level.

Acceptance test: full application running and tested with Selenium.

And then there is manual testing.

2012-06-28 00:54 UTC

Rory Primrose #

I like the term component test for this level of scope. That being said, I still put it in a project suffixed with .UnitTests.

2012-06-29 12:39 UTC

Amir Rajan #

Why differentiate at all? I test is valuable or it isn't. A tests is fast or it isn't. Thoughts?

2012-07-03 21:02 UTC

Mark Seemann #

The purpose is to have a pattern language for automated tests - just as with every other pattern language.

"Let's drive this through Facade Tests" now means something else than "Let's drive this through Unit Tests".

2012-07-03 21:16 UTC

Test-specific Equality versus Domain Equality

Friday, 22 June 2012 15:22:25 UTC

As a reaction to my two previous blog posts, Joe Miller asks on Twitter:

If a test needs to check a "special" type of equality, isn't that special equality actually part of the domain?

That's a fair question which I'd like to answer here.

You should definitely strive towards having domain objects with strong equality semantics. Many things (including unit testing) becomes easier when domain objects override Equals in a meaningful manner. By all means should you prefer that over any Resemblance or Likeness craziness.

What is meaningful equality for a domain object? Personally, I find the guidance provided by Domain-Driven Design indispensable:

If the domain object is an Entity, two objects are equal if their IDs are equal. No other properties should be compared.
If the domain object is a Value Object, two object are equal if (all) their encapsulated data is equal.
If the domain object is a Service, by default I'd say that the default reference equality is often the most correct (i.e. don't override Equals).

Now consider the very common case of mapping layers, or the slightly related scenario of an Anti-corruption Layer. In such cases, the code translates Entities to and from other representations. How do we unit test such mapping code?

If we were to rely on the domain object's built-in equality, we would only be testing that the identity of the Entity is as expected, but not whether or not the mapping code properly maps all the other data. In such cases we need Test-specific Equality, and that's exactly what Resemblances and Likenesses provide.

In the previous example, this is exactly what happens. The SUT produces a new instance of the RequestReservationCommand:

[HttpPost]
public ViewResult Post(BookingViewModel model)
{
    this.channel.Send(model.MakeReservation());
    return this.View("Receipt", model);
}

The MakeReservation method is implemented like this:

public RequestReservationCommand MakeReservation()
{
    return new RequestReservationCommand(
        this.Date, this.Email, this.Name, this.Quantity);
}

Notice that nowhere does the code specify the identity of the RequestReservationCommand instance. This is done by the RequestReservationCommand constructor itself, because it's a domain rule that (unless deserialized) each command has a unique ID.

Thus, from the unit test, you have absolutely no chance of knowing what the ID will be (it's a GUID). You could argue back and forth on whether the RequestReservationCommand class is an Entity or a Value Object, but in both cases, Domain Equality would involve comparing IDs, and those will never match. Therefore, the correct Test-specific Equality is to compare the values without the Id property.

Comments

Yves Reynhout #

As far as the "Guid Generation" goes, I like to control that part using the following code (optionally accompanied by some sort of scoping) in the bits where I'm interested in controlling it: some_code

2012-06-26 13:33 UTC

Mark Seemann #

Well, that's an Ambient Context (Dependency Injection in .NET, p. 118), and while it enables you to assign an expected Guid value from a unit test, from a conceptual perspective I don't think it's the correct solution for a test like this one.

From a behavioral point of view, we don't really care about the value of the ID (the Guid). From other tests we know that a new instance of a command will always have a unique ID. It's part of the command's invariants. Thus, we know that the ID is going to be unique, so having to configure an Ambient Context is only going to add noise to a unit test.

2012-06-26 16:09 UTC

Resemblance and Likeness

Friday, 22 June 2012 12:35:04 UTC

In a previous post I described the Resemblance idiom. In this post I present a class that can be used to automate the process of creating a Resemblance.

The Resemblance idiom enables you to write readable unit tests, but the disadvantage is that you will have to write (and maintain) quite a few test-specific Resemblance classes. If you're a Coding Neat Freak like me, you'll also have to override GetHashCode in every Resemblance, just because you're overriding Equals. If you're still looking for ways to torment yourself, you'll realize that the typical Resemblance implementation of Equals contains branching logic, so ideally, it should be unit tested too. (Yes, I do occasionally unit test my unit testing helper classes. AutoFixture, which is essentially one big unit test helper, is currently covered by some three thousand tests.)

In that light, wouldn't it be nice if you were able to automate the process of defining Resemblance classes? With the Likeness class, you can. Likeness is currently bundled with AutoFixture, so you get it if you install the AutoFixture NuGet package or if you download the AutoFixture zip file from its CodePlex site. The Likeness class can be found in an assembly named Ploeh.SemanticComparison. In the future, I plan on packaging this as a separate NuGet package, but currently, it's included with AutoFixture.

Likeness #

Before I show you how to turn a Likeness into a Resemblance, I think a very short introduction to Likeness is in order. A couple of years ago I introduced Likeness on my blog, but I think a re-introduction is in order. In the context of the unit test from the previous blog post, instead of manually writing a test like this:

[Theory, AutoWebData]
public void PostSendsOnChannel(
    [Frozen]Mock<IChannel<RequestReservationCommand>> channelMock,
    BookingController sut,
    BookingViewModel model)
{
    sut.Post(model);
 
    var expected = model.MakeReservation();
    channelMock.Verify(c =>
        c.Send(It.Is<RequestReservationCommand>(cmd =>
            cmd.Date == expected.Date &&
            cmd.Email == expected.Email &&
            cmd.Name == expected.Name &&
            cmd.Quantity == expected.Quantity)));
}

Likeness will do it for you, using Reflection to match properties according to name and type. By default, Likeness compares all properties of the destination instance. This will not work in this test, because the RequestReservationCommand also includes an Id property which is a Guid which is never going to be the same across different instances:

public MakeReservationCommand(
    DateTime date, string email, string name, int quantity)
{
    this.date = date;
    this.email = email;
    this.name = name;
    this.quantity = quantity;
    this.id = Guid.NewGuid();
}

Notice that the above test exludes the Id property. It's not very clear that this is going on, so a Test Reader may think that this is an accidental omission. With Likeness, this becomes explicit:

[Theory, AutoWebData]
public void PostSendsOnChannel(
    [Frozen]Mock<IChannel<MakeReservationCommand>> channelMock,
    BookingController sut,
    BookingViewModel model)
{
    sut.Post(model);
    var expected = model.MakeReservation()
        .AsSource().OfLikeness<MakeReservationCommand>()
        .Without(d => d.Id);
    channelMock.Verify(c =>
        c.Send(It.Is<MakeReservationCommand>(x =>
            expected.Equals(x))));
}

Notice how the Without method explicitly states that the Id property should be excluded from the comparison. By convention, all other properties on the target type (MakeReservationCommand, which in this case is the same as the source type) must be equal to each other.

The AsSource().OfLikeness extension method chain creates an instance of Likeness<MakeReservationCommand, MakeReservationCommand>, which isn't the same as an instance of MakeReservationCommand. Thus, you are still forced to use the slightly awkward It.Is syntax to express what equality means:

channelMock.Verify(c =>
    c.Send(It.Is<MakeReservationCommand>(x =>
        expected.Equals(x))));

If a Likeness could be used as a Resemblance, the test would be more readable.

Likeness as Resemblance #

The definition of the Likeness class is this:

public class Likeness<TSource, TDestination> : IEquatable<TDestination>

While it implements IEquatable<TDestination> it isn't a TDestination. That's the reason for the It.Is syntax above.

Thanks to awesome work by Nikos Baxevanis, a Likeness instance can create a dynamic proxy of TDestination, as long as TDestination is non-sealed and doesn't seal the Equals method. The method is called CreateProxy:

public TDestination CreateProxy();

This effectively turns the Likeness into a Resemblance by dynamically emitting a derived class that overrides Equals in the way that the Likeness instance (re)defines equality. With that, you can rewrite the unit test so that it's both readable and maintainable:

[Theory, AutoWebData]
public void PostSendsOnChannel(
    [Frozen]Mock<IChannel<RequestReservationCommand>> channelMock,
    BookingController sut,
    BookingViewModel model)
{
    sut.Post(model);
 
    var expected = model.MakeReservation()
        .AsSource().OfLikeness<RequestReservationCommand>()
        .Without(d => d.Id)
        .CreateProxy();
    channelMock.Verify(c => c.Send(expected));
}

The only difference in the definition of the expected variable is the additional call to CreateProxy. This changes the type of expected so that it's no longer Likeness<MakeReservationCommand, MakeReservationCommand>, but rather MakeReservationCommand (actually, a dynamically created sub-type of MakeReservationCommand). This enables you to write a readable assertion:

channelMock.Verify(c => c.Send(expected));

This is exactly the same assertion as the assertion used with the Resemblance idiom described in the previous post. Thus, Likeness has now been turned into a Resemblance, and you no longer have to manually create and maintain concrete Resemblance classes.

Comments

Simon Skov Boisen #

Great stuff - I took a look at the CreateProxy method and I must say that is some pretty hairy (and impressive) reflection and il.Emit code! How do one figure out exactly which IL needs to be emitted in order to create the dynamic proxy? Do you inspect the IL of a manually generated proxy and then use that as a guidance? (I realize you didn't write the code, but maybe you have some insight?) :)

2012-07-05 07:43 UTC

Mark Seemann #

Nah, I don't have any insights on that. Nikos Baxevanis wrote that, and I trust that he got it right (as he did with so much other crazy stuff). However, perhaps I can coax him to provide an answer here :)

2012-07-05 08:03 UTC

Nikos Baxevanis #

Hi Simon, definitely, a good starting point is to manually generate the code and then inspect the disassembled IL. Then, you can use the types in the System.Reflection.Emit namespace to create the (proxy) type. What this proxy generator supports, but usually others don't, is the ability to emit a proxy even for a type with no-default constructor. At that point, you can't manually generate and inspect the disassembled IL for all the possible combinations for (above three) number of arguments in the constructor. However, after loading the (third) constructor argument you keep loading additional arguments by referencing their index (lines 90-114 in the source, commit 8fff809). Finally, you need to create and return an instance of the newly created proxy. Since the proxy does not necessarily contains a default constructor there are some additional steps involved for discovering and choosing a compatible constructor with the source type. Definitely, there should be some points for improvement so if you have any suggestions there are more than welcome. :) Hope that helps.

2012-07-05 11:24 UTC

James Nail #

Hi Mark,
Once again, you guys have done it. I've used foo.AsSource().OfLikeness() before, but the CreateProxy trick solved the exact problem I was having.
I'm continually impressed with the power you've packed into this relatively small tool (AutoFixture).

2012-07-06 09:32 UTC

The Resemblance idiom

Thursday, 21 June 2012 08:42:17 UTC

In this post I describe a unit testing trick which can be used to make assertions more readable.

Many TDD or unit testing experts (e.g. Roy Osherove) give the advice that a unit test should only contain a single assertion. This is a pretty good piece of advice, but sometimes a single assertion is best interpreted as a single logical assertion.

When it comes to state-based testing, you can often write a Custom Assertion to encapsulate multiple assertions into a single logical assertion. For interaction-based testing (i.e. using mock objects) this can be a bit harder.

In order to write readable unit tests with single assertions, I sometimes use an idiom that I'll describe here. Since I've never seen it described anywhere else, I can't reasonably call it a design pattern, so I'll stick with the less universal term idiom - I call it the Resemblance idiom. In short, it introduces a test-specific override of an object's equality method.

Motivating example #

Assume that you're TDD'ing this method:

[HttpPost]
public ViewResult Post(BookingViewModel model)
{
    this.channel.Send(model.MakeReservation());
    return this.View("Receipt", model);
}

Obviously, when TDD'ing, the method isn't going to have any implementation before the test has been written, but for educational purposes, in this case I'm showing you the implementation before the test - I still wrote the tests first. This is the implementation you'd like to arrive at - particularly this line of code:

this.channel.Send(model.MakeReservation());

The challenge here is that the MakeReservation method returns an instance of RequestReservationCommand, which is a data-carrying class (some would call it a DTO) that doesn't override Equals. One option is to override Equals, but that would lead to Equality Pollution, since the equality semantics of RequestReservationCommand isn't going to match what you need in order to write the test.

Using Moq, your first take might look something like this:

[Theory, AutoWebData]
public void PostSendsOnChannel(
    [Frozen]Mock<IChannel<RequestReservationCommand>> channelMock,
    BookingController sut,
    BookingViewModel model)
{
    sut.Post(model);
 
    var expected = model.MakeReservation();
    channelMock.Verify(c =>
        c.Send(It.Is<RequestReservationCommand>(cmd =>
            cmd.Date == expected.Date &&
            cmd.Email == expected.Email &&
            cmd.Name == expected.Name &&
            cmd.Quantity == expected.Quantity)));
}

This works, but is awkward. First of all, I'd prefer a more readable test than this. Secondly, what happens if, in the future, someone adds a new property to the RequestReservationCommand class? If you have a lot of tests like this one, should they all be updated to include the new property in the comparison?

Attempted fix #

The first fix you might attempt involves extracting the comparison to a helper method like this one:

private static bool Equals(RequestReservationCommand expected,
    RequestReservationCommand actual)
{
    return actual.Date == expected.Date &&
        actual.Email == expected.Email &&
        actual.Name == expected.Name &&
        actual.Quantity == expected.Quantity;
}

This is better because at least it gives you a central method where you can manage which properties should be included in the comparison, but from a test invocation perspective it's still a bit unwieldy:

[Theory, AutoWebData]
public void PostSendsOnChannel(
    [Frozen]Mock<IChannel<RequestReservationCommand>> channelMock,
    BookingController sut,
    BookingViewModel model)
{
    sut.Post(model);
 
    var expected = model.MakeReservation();
    channelMock.Verify(c =>
        c.Send(It.Is<RequestReservationCommand>(cmd =>
            Equals(expected, cmd))));
}

You're still stuck with the It.Is noise and a lambda expression. Being a static helper method, it's also not very object-oriented. Perhaps you don't care about that, but by not being object-oriented, it's also not very composable. This can come back to bite you if you want to compare complex object graphs, because it's dificult to compose static methods. There's a another way which provides a better alternative.

Resemblance #

The trick is to recall that ultimately, most unit testing is about determining whether the actual outcome is equal to the expected outcome. Since the Equals method is virtual, you can simply create a test-specific child class that overrides the Equals method. This is what I call a Resemblance.

private class RequestReservationCommandResemblance : RequestReservationCommand
{
    public RequestReservationCommandResemblance(RequestReservationCommand source)
        : base(source.Date, source.Email, source.Name, source.Quantity)
    {
    }
 
    public override bool Equals(object obj)
    {
        var other = obj as RequestReservationCommand;
        if (other != null)
            return object.Equals(this.Date, other.Date)
                && object.Equals(this.Email, other.Email)
                && object.Equals(this.Name, other.Name)
                && object.Equals(this.Quantity, other.Quantity);
 
        return base.Equals(obj);
    }
 
    public override int GetHashCode()
    {
        return this.Date.GetHashCode()
            ^ this.Email.GetHashCode()
            ^ this.Name.GetHashCode()
            ^ this.Quantity.GetHashCode();
    }
}

Notice that the RequestReservationCommandResemblance class derives from RequestReservationCommand in order to override Equals (and GetHashCode for good measure). This looks a bit more complicated than the previous attempt at a fix, but it's much more object-oriented and composable, and most importantly: the test is much easier to read because almost all the noise can be removed:

[Theory, AutoWebData]
public void PostSendsOnChannel(
    [Frozen]Mock<IChannel<RequestReservationCommand>> channelMock,
    BookingController sut,
    BookingViewModel model)
{
    sut.Post(model);
    var expected = new RequestReservationCommandResemblance(model.MakeReservation());
    channelMock.Verify(c => c.Send(expected));
}

Notice in particular how you're now able to state exactly what you care about, instead of how you care about it:

channelMock.Verify(c => c.Send(expected));

You may still feel that the overhead of creating a new test-specific class just for this purpose doesn't quite justify the increased readability and composability, but stay tuned - I have more tricks to show you.

Comments

Vasily Kirichenko #

Although this idiom allows changing the algorithm of value objects equality from one test case to another (however, introducing a bunch of derived classes may be a bit boring), it has very poor diagnostics characteristics (if a test fails, we can't say what property(s) is mismatched).

"One situation we want to avoid, however, is when we can’t diagnose a test
failure that has happened. The last thing we should have to do is crack open the
debugger and step through the tested code to find the point of disagreement.", GOOS

2012-06-21 13:05 UTC

Mark Seemann #

There's that, but none of the options presented here solve that problem - all of them are going to throw an exception by the Verify method if the correct method call wasn't made.

In GOOS they are better off because they use Hamcrest, and their mocking library understands Hamcrest matchers. That's not the situation in .NET (yet).

The only way around this that I can think of is to call the Verify method four times - one for each property above. IMO that would be sacrificing either readability or maintainability to optimize for a failure scenario which should only occur rarely. I prefer to optimize for readability instead. YMMV.

2012-06-21 15:09 UTC

svick #

If you always want to compare all properties, wouldn't using Reflection solve the problem here?

Or something like

IEqualityComparer[RequestReservationCommand] CreateRequestReservationCommandComparer()
{
return CreateComparer[RequestReservationCommand](
c => c.Date, c => c.Email, c => c.Name, c => c.Quantity
);
}

Although I'm not sure how composable that would be.

BTW, I can't seem to post any comment containing the less than sign, it says:

An error has been encountered while processing the page. We have logged the error condition and are working to correct the problem. We apologize for any inconvenience.

2012-06-21 17:22 UTC

Mark Seemann #

As I wrote: stay tuned :)

(and yes: the comment engine here is really crappy - sorry about that)

2012-06-21 17:42 UTC

Daniel Hilgarth #

When using resemblance Assert.Equal is not working. You need to use Assert.IsTrue(expected.Equals(actual)) instead or make the resemblance class implement IEqualityComparer{T}.
Do you have a solution for that?

2012-10-01 15:18 UTC

Mark Seemann #

I don't have a direct solution for that particular issue, but I find the Resemblance idiom most useful when dealing with Indirect Input and Output, as in the example above. In other words, it's mostly useful when dealing with Mock objects, because the have to deal exclusively with equality.

When it comes to direct assertions, like Assert.Equal, I'd love it if we had something like Hamcrest Matchers in .NET (actually, there is a .NET port, but last time I looked, it seemed abandonded). The next best thing is often a custom IEqualityComparer, but that doesn't necessarily solve the pretty print error reporting need.

Sometimes, it's nice to have both, so one can implement a custom IEqualityComparer and then use that class to also implement a Resemblance.

FWIW, the Likeness class described in the next blog post is built that way. It also includes a ShouldEqual method as a very concrete way of providing more readable error reporting...

2012-10-01 19:28 UTC

Daniel Hilgarth #

Thanks for the comment. Likeness couldn't be used in that particular case as the condition for equality was rather complex with nested properties. You can actually see it in the HttpActionContextResemblence in the pull request to Hyprlinkr: https://github.com/dhilgarth/Hyprlinkr/blob/de27a39477a747f56ae23fd8717bb8ef24c56bea/Hyprlinkr.UnitTest/HttpActionContextResemblance.cs

I used the approach with an implementation of IEqualityComparer

2012-10-02 13:02 UTC

Bank OCR kata in F#: user stories 3 and 4

Monday, 04 June 2012 05:16:35 UTC

In previous posts, I've walked through the first two user stories of the Bank OCR kata in F#. In this post I'll walk through my implementation of user stories 3 and 4.

The reason I'm collecting both user stories in a single blog post is that the requirements of user story 4 represent a change from user story 3, and since I didn't use a DVCS for this kata, I no longer have my implementation for user story 3.

The core unit test #

As with the previous user stories, I started by writing a Parameterized Test:

[<Theory>]
[<InlineData("
    _  _     _  _  _  _  _ 
  | _| _||_||_ |_   ||_||_|
  ||_  _|  | _||_|  ||_| _|", "123456789")>]
[<InlineData("
 _     _  _  _  _  _  _  _ 
 _||_||_ |_||_| _||_||_ |_ 
 _|  | _||_||_||_ |_||_| _|", "345882865")>]
[<InlineData("
 _     _  _  _  _  _       
 _||_||_ |_||_| _||_|  ||_|
 _|  | _||_||_||_ |_|  |  |", "345882814")>]
[<InlineData("
    _  _     _  _  _  _  _ 
  | _| _||_||_ |_   ||_||_|
  ||_  _|  | _||_|  ||_||_|", "123456789")>]
[<InlineData("
    _  _     _  _  _  _  _ 
  | _| _||_||_ |_   ||_||_|
  | _| _|  | _||_|  ||_||_|", "133456788 AMB ['133456188', '193456788']")>]
[<InlineData("
    _  _     _  _  _  _  _ 
  | _| _||_||_ |_   ||_|| |
  | _| _|  | _||_|  ||_||_|", "133456780 ERR")>]
[<InlineData("
 _     _  _  _  _  _       
 _||_||_ |_||_| _||_|  ||_|
 _|| | _||_||_||_ |_|  |  |", "345882814")>]
[<InlineData("
 _  _        _  _     _  _ 
|_||_   |  || | _| _  _||_ 
|_||_|  |  ||_|   |_| _||_|", "86110??36 ILL")>]
[<InlineData("
 _     _  _  _  _  _       
 _||_||_ |_||_| _||_|  ||_|
 _|  | _||  |_|   |_|  |  |", "345?8?814 ILL")>]
[<InlineData("
 
  |  |  |  |  |  |  |  |  |
  |  |  |  |  |  |  |  |  |", "711111111")>]
[<InlineData("
 _  _  _  _  _  _  _  _  _ 
  |  |  |  |  |  |  |  |  |
  |  |  |  |  |  |  |  |  |", "777777177")>]
[<InlineData("
 _  _  _  _  _  _  _  _  _ 
 _|| || || || || || || || |
|_ |_||_||_||_||_||_||_||_|", "200800000")>]
[<InlineData("
 _  _  _  _  _  _  _  _  _ 
 _| _| _| _| _| _| _| _| _|
 _| _| _| _| _| _| _| _| _|", "333393333")>]
[<InlineData("
 _  _  _  _  _  _  _  _  _ 
|_||_||_||_||_||_||_||_||_|
|_||_||_||_||_||_||_||_||_|", "888888888 AMB ['888886888', '888888880', '888888988']")>]
[<InlineData("
 _  _  _  _  _  _  _  _  _ 
|_ |_ |_ |_ |_ |_ |_ |_ |_ 
 _| _| _| _| _| _| _| _| _|", "555555555 AMB ['559555555', '555655555']")>]
[<InlineData("
 _  _  _  _  _  _  _  _  _ 
|_ |_ |_ |_ |_ |_ |_ |_ |_ 
|_||_||_||_||_||_||_||_||_|\r\n", "666666666 AMB ['686666666', '666566666']")>]
[<InlineData("
 _  _  _  _  _  _  _  _  _ 
|_||_||_||_||_||_||_||_||_|
 _| _| _| _| _| _| _| _| _|", "999999999 AMB ['993999999', '999959999', '899999999']")>]
[<InlineData("
    _  _  _  _  _  _     _ 
|_||_|| || ||_   |  |  ||_ 
  | _||_||_||_|  |  |  | _|", "490067715 AMB ['490067115', '490867715', '490067719']")>]
[<InlineData("
    _  _     _  _  _  _  _ 
 _| _| _||_||_ |_   ||_||_|
  ||_  _|  | _||_|  ||_| _|", "123456789")>]
[<InlineData("
 _     _  _  _  _  _  _    
| || || || || || || ||_   |
|_||_||_||_||_||_||_| _|  |", "000000051")>]
[<InlineData("
    _  _  _  _  _  _     _ 
|_||_|| ||_||_   |  |  | _ 
  | _||_||_||_|  |  |  | _|", "490867715")>]
let ParseToPrintoutReturnsCorrectResult entry expected =
    entry
    |> ParseToPrintout
    |> should equal expected

Once again the test utilizes xUnit.net's data theories feature to decouple the test code from the test data, as well as FsUnit for the assertion DSL.

The test is really simple: it's 85 lines of data and three lines of code which state that the result of piping the entry argument to the ParseToPrintout function should be equal to the expected argument.

Implementation #

The implementation may seem more daunting. At first, I'll post the function in its entirety, and afterwards I'll break it down and walk you through it.

let ParseToPrintout entry =
    let toPrintable opt =
        match opt with
        | Some(d) -> d.ToString()
        | None -> "?"
 
    let formatEntry s =
        s
        |> ParseToDigits
        |> Seq.map toPrintable
        |> String.Concat    
 
    let isLegible potentialDigits = potentialDigits |> Seq.forall Option.isSome
 
    let isValidAndLegible s =
        let potentialDigits = s |> ParseToDigits
 
        (potentialDigits |> isLegible) &&
        (potentialDigits |> skipNones |> IsValid)
 
    if entry |> isValidAndLegible
    then entry |> formatEntry
    else
        let getCandidates chars =
            let withAlternatives c =
                seq {
                    match c with
                    | ' ' -> yield! [ '|'; '_' ]
                    | '|' | '_' -> yield ' '
                    | _ -> yield! []
                    }
 
            chars
            |> Seq.mapi (fun i _ -> chars
                                    |> ReplaceAt i withAlternatives)
            |> Seq.concat
 
        let validCandidates =
            entry.ToCharArray()
            |> getCandidates
            |> Seq.map ToString
            |> Seq.filter isValidAndLegible
            |> Seq.toList
 
        let formatAlternatives alternatives =
            alternatives
            |> Seq.map formatEntry
            |> Seq.map (sprintf "'%s'")
            |> Seq.reduce (sprintf "%s, %s")
 
        match validCandidates with
        | [head] -> head |> formatEntry
        | [] -> if entry |> ParseToDigits |> isLegible
                then entry |> formatEntry |> sprintf "%s ERR"
                else entry |> formatEntry |> sprintf "%s ILL"
        | _ -> validCandidates
                |> formatAlternatives
                |> sprintf "%s AMB [%s]" (formatEntry entry)

The first thing to notice is that the ParseToPrintout function is composed of quite a few helper functions. A thing that I currently don't like so much about F# is that it's necessary to define a function before it can be used, so all the helper functions must appear before the function that defines the general flow.

It's possible to move the helper functions to other modules so that they don't clutter the implementation, but most of the functions defined here seem to me to be a very specialized part of the overall implementation. In this particular code base, I've been working under the assumption that it would be best to define a function in as narrow a scope as possible, but looking at the result, I don't think that was the right decision. In the future, I think I'll move more helper functions to a separate module.

Valid and legible #

As a start, skim past the helper functions. The actual program flow of the function starts here:

if entry |> isValidAndLegible
then entry |> formatEntry

That part of the if/then branch should be fairly easy to understand. If the entry is valid (according to the checksum function from user story 2) and legible, the result is formatted and returned.

The isValidAndLegible function determines whether the entry is... well... valid and legible. The first thing it does is to pipe the string into the ParseToDigits function, which, you'll recall, returns a sequence of integer options.

To figure out whether those potential digits are legible it invokes the isLegible function, which I'll return to shortly.

In order to figure out whether the potential digits are valid, it invokes the IsValid method from user story 2. However, in order to do so, it needs to convert the sequence of integer options to a sequence of integers. It does that by invoking the skipNones helper method, that I defined in a helper module:

let skipNones sequence = Seq.collect Option.toArray sequence

This function takes a sequence of options and returns a sequence of values by skipping all the Nones. It does that by turning each option into an array of 0 or 1 elements and then concatenate all these arrays together to form a new sequence.

This array is piped into the IsValid function, which will only return true if there are exactly nine digits and the checksum is OK.

The isLegible function is simpler because all it needs to do is to verify that all the options are Somes (instead of Nones).

Formatting the entry #

The formatEntry function pipes the entry into the ParseToDigits function, which returns a sequence of integer options. Each option is mapped into a string, replacing each None with a "?".

Finally, the sequence of strings is concatenated into a single string by piping it to the built-in String.Concat method.

Finding candidates #

While formatting the entry is fairly easy if it's valid and legible, it suddenly becomes much harder to attempt to find other candidates if the entry seems invalid. What needs to be done in this case is to iterate through each and every character of the entry and try to replace just a single character at a time to see if it produces a better entry.

According to user story 4, a blank space could be either a vertical pipe or and underscore, while an underscore and a pipe might instead be a blank space. This relationship can be formalized by the withAlternatives function:

let withAlternatives c =
    seq {
        match c with
        | ' ' -> yield! [ '|'; '_' ]
        | '|' | '_' -> yield ' '
        | _ -> yield! []
        }

This function takes as input a single character and returns a sequence of characters. If the input is ' ', the output is a list containing '|' and '_'. On the other hand, if the input is either '|' or '_', the output is a sequence with the single element ' '. For all other characters (such as line breaks), the output is an empty sequence, indicating that no replacement should be attempted.

This defines the alternatives which can be fed into the ReplaceAt helper function:

let ReplaceAt index produceReplacements sequence =
    sequence
    |> Seq.nth index
    |> produceReplacements
    |> Seq.map (fun x -> sequence |> ReplaceElementAt index x)

In this case, the ReplaceAt function takes as input a sequence of characters (the entry string) and a function providing alternatives (withAlternatives) and produces a sequence of sequence of characters where only a single character has been replaced according to the alternatives.

If that sounds difficult, perhaps these unit tests explain the behavior better:

[<Fact>]
let ProduceAllReplacementsAtReturnsCorrectResult1() =
    let actual = ReplaceAt 0 (fun c -> [ 'I'; 'i' ]) "123"
    actual |> Seq.map ToString |> should equalSequence [ "I23"; "i23" ]
 
[<Fact>]
let ProduceAllReplacementsAtReturnsCorrectResult2() =
    let actual = ReplaceAt 1 (fun c -> [ 'V' ]) "153"
    actual |> Seq.map ToString |> should equalSequence [ "1V3" ]

The first unit tests replaces the first (zero-indexed) character in the input string "123" with two alternatives, 'I' and 'i', producing the output sequence of "I23" and "i23".

The second test replaces the second (zero-indexed) character in the input string "153" with the single alternative 'V' to produce the output "1V3".

The ReplaceAt function does this by first using the Seq.nth function to pick the nth character out of the sequence, and then pipe that character into the function that produces the alternatives. This produces a sequence of replacement characters, e.g. '|' and '_' as explained above.

This sequence of replacement characters is then used to produce a sequence of strings where the element at the index is replaced with each of the replacement characters. In order to do that, the original input is piped into the ReplaceElementAt helper function:

let ReplaceElementAt index element sequence =
    let beforeIndex = sequence |> Seq.take index
    let atIndex = element |> Seq.singleton
    let afterIndex = sequence |> Seq.skip (index + 1)
    afterIndex |> Seq.append atIndex |> Seq.append beforeIndex

This function takes a sequence of elements (e.g. a string) and returns another sequence (e.g. another string) by replacing the element at the specified index with the supplied element.

All these helper functions can be combined to produce a sequence of candidate strings:

chars
|> Seq.mapi (fun i _ -> chars
                        |> ReplaceAt i withAlternatives)
|> Seq.concat

Of these candidates, a lot will be invalid, but filtering them to find only the valid candidates is fairly easy:

let validCandidates =
    entry.ToCharArray()
    |> getCandidates
    |> Seq.map ToString
    |> Seq.filter isValidAndLegible
    |> Seq.toList

This final list of candidates is a list instead of a sequence because it enables me to use list pattern matching to deal with the cases of a single, none, or multiple valid alternatives:

match validCandidates with
| [head] -> head |> formatEntry
| [] -> if entry |> ParseToDigits |> isLegible
        then entry |> formatEntry |> sprintf "%s ERR"
        else entry |> formatEntry |> sprintf "%s ILL"
| _ -> validCandidates
        |> formatAlternatives
        |> sprintf "%s AMB [%s]" (formatEntry entry)

If there's only a single valid alternative, the [head] match is triggered and the decomposed head variable is simply piped to the formatEntry function.

If there's no valid alternative, the [] match is triggered, and the final formatting is then based on whether or not the entry is legible or not.

Finally, if there are multiple candidates, each is formatted and appended as ambiguous alternatives to the original (invalid) input.

In case you'd like to take an even closer look at the code, I've attached the entire solution as a zip file: KataBankOCR.zip (299.62 KB)

Comments

Jonty #

After reading this post I decided to try out some F# and it is very interesting. I'm struggling to work out "best practices" though. It would be great to hear your thoughts on this and how to move from a C# mindset to F#. I'm thinking things like how SOLID principles apply and how to structure your code to keep it maintainable.

2012-06-21 11:16 UTC

Mark Seemann #

FWIW, I found the book Real World Functional Programming: With Examples in F# and C# an excellent introduction to functional concepts for object-oriented programmers.

Soon, I'll write a small blog post on one way in which SOLID relates to Functional Programming.

2012-06-21 12:24 UTC

Simon Skov Boisen #

Excellent series of blogposts - how do you run your xunit tests? Do you run them with the xunit gui runner or using TestDriven.net?

I second 'Real World Functional Programming: With Examples in F# and C#', it's a really great book!

One very minor change to one of your helper-functions could be to use the backward pipe operator when combining the sequences in ReplaceElementAt, that way one doesn't need to have the somewhat confusing reverse append-order. Or simply use seq { } and you won't need the atIndex. See following gist for alternatives:

https://gist.github.com/3049158

2012-07-04 19:35 UTC

Mark Seemann #

I use TestDriven.net to run the xUnit.net tests.

Thanks for the alternatives for the ReplaceElementAt function. I was never happy with the readability of my implementation. I like the seq { } alternative best, so I've updated my own source code :)

2012-07-04 19:42 UTC

Bank OCR kata in F#: user story 2

Friday, 01 June 2012 05:54:27 UTC

Following up on my initial post about the Bank OCR kata, this post walks through the code required to implement user story 2, which is about calculating a checksum.

The code I previously described already contains a function called ParseToDigits which returns a sequence of digits, so it seems reasonable to express the validation function as based on a sequence of digits as input.

The core unit test #

To ensure that the behavior of the new IsValid function would be correct, I wrote this Parameterized Test:

[<Theory>]
[<InlineData(1, 2, 3, 4, 5, 6, 7, 8, 9, true)>]
[<InlineData(3, 4, 5, 8, 8, 2, 8, 6, 5, true)>]
[<InlineData(3, 4, 5, 8, 8, 2, 8, 1, 4, true)>]
[<InlineData(7, 1, 5, 8, 8, 2, 8, 6, 4, true)>]
[<InlineData(7, 4, 5, 8, 8, 2, 8, 6, 5, false)>]
[<InlineData(7, 4, 5, 8, 8, 2, 8, 6, 9, false)>]
[<InlineData(7, 4, 5, 8, 8, 2, 8, 6, 8, false)>]
[<InlineData(1, 2, 3, 4, 5, 6, 7, 8, 8, false)>]
[<InlineData(1, 3, 3, 4, 5, 6, 7, 8, 8, false)>]
[<InlineData(1, 3, 3, 4, 5, 6, 7, 8, 0, false)>]
let IsValidReturnsCorrectResult d9 d8 d7 d6 d5 d4 d3 d2 d1 expected =
    seq { yield! [d9; d8; d7; d6; d5; d4; d3; d2; d1] }
    |> IsValid
    |> should equal expected

As was the case for the previous tests, this test utilizes xUnit.net's data theories feature to succinctly express a parameterized test, as well as FsUnit for the assertion DSL.

Once again I'd like to point out how the use of pipes enables me to very compactly follow the Arrange Act Assert (AAA) test pattern.

Implementation #

The IsValid function is very simple:

let IsValid digits =
    match digits |> Seq.toArray with
    | [| d9; d8; d7; d6; d5; d4; d3; d2; d1 |] ->
        (d1 + 2 * d2 + 3 * d3 + 4 * d4 + 5 * d5 + 6 * d6 + 7 * d7 + 8 * d8 + 9 * d9) % 11 = 0
    | _ -> false

The input sequence is converted to an array by piping it into the Seq.toArray function. The resulting array is then matched against an expected shape.

If the array contains exactly 9 elements, each element is decomposed into a named variable (d9, d8, etc.) and passed to the correct checksum formula. The formula only returns true if the calculated result is zero; otherwise it returns false.

If the input doesn't match the expected pattern, the return value is false. The following tests prove this:

[<Fact>]
let IsValidOfTooShortSequenceReturnsFalse() =
    seq { yield! [3; 4; 5; 8; 8; 2; 8; 6] }
    |> IsValid
    |> should equal false
 
[<Fact>]
let IsValidOfTooLongSequenceReturnsFalse() =
    seq { yield! [3; 4; 5; 8; 8; 2; 8; 6; 5; 7] }
    |> IsValid
    |> should equal false

In a future post I will walk you through user stories 3 and 4.

Comments

svick #

Why are you using `seq { yield! [d9; d8; d7; d6; d5; d4; d3; d2; d1] }` instead of just `[d9; d8; d7; d6; d5; d4; d3; d2; d1]`? It seems to me it's just a complicated way to write List.toSeq, which isn't even necessary here.

2012-06-01 23:45 UTC

Mark Seemann #

That's a TDD artifact more than an actual requirement. I wrote the test first, and wanted it to specify that the IsValid function should be able to take any sequence, and not only a list.

Because of F# powerful type inference, that's what the function ends up doing anyway, but with TDD, it's the test that specifies the shape of the API, not the other way around.

However, I could also have written [d9; d8; d7; d6; d5; d4; d3; d2; d1] |> List.toSeq... There's no particular reason why I chose a sequence expression over that...

2012-06-02 10:13 UTC

Next Previous

Page 64 of 81

"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!

ploeh blog danish software design

Unit test #

Implementation #

Unit test #

Implementation #

Printing all lines #

Comments

Comments

Example: a chart reader #

Unit testing #

Hard-coded composition #

Configuration-based composition #

Wiring a DI Container with primitives #

Conventions for primitives #

Comments

Comments

Comments

Likeness #

Likeness as Resemblance #

Comments

Motivating example #

Attempted fix #

Resemblance #

Comments

The core unit test #

Implementation #

Valid and legible #

Formatting the entry #

Finding candidates #

Comments

The core unit test #

Implementation #

Comments