Programming languages don't have to have a built-in notion of error handling. You can implement sane error handling from first principles. An introduction for object-oriented programmers.

This article is part of a series of articles about Church encoding. In this series, you'll learn how to re-create various programming language features from first principles. In previous articles, you learned how to implement Boolean logic without Boolean primitives, how to model natural numbers, as well as how to implement Maybe (a type-safe alternative to null). Through these examples, you'll learn how to model sum types without explicit language support.

Error handling without exceptions #

In a previous article, I've discussed how a language doesn't need to have built-in exceptions in order to support composable and type-safe error handling. In fact, exceptions are noting but glorified GOTO statements. A better approach is to use the Either abstraction, which enables you to model values that are either one or another thing.

In F#, this type is known as Result<'T, 'TError>, while in Haskell it's called Either. It enables you to model an outcome that is either something (like a success) or something else (such as an error).

Scott Wlaschin has already brilliantly described how this works in F#, but the Either type can be used for error handling in Haskell in exactly the same way. When we use the terminology related to either, we distinguish between left and right. Typically, right is used to indicate success, via the pun that 'right' is 'correct'.

Lambda calculus Either #

Church encoding is based on the lambda calculus, which defines a universal model of computation based entirely on functions (lambda expressions) and recursion. As far as I can tell, you can define Either in lambda calculus as an expression that takes two arguments, and where there's two fundamental 'implementations' of the contract:

 left = λa.λl.λr.l a
right = λb.λl.λr.r b

(I admit that I'm going out on a limb here, since I haven't found any source that puts either in the above form, so I'd appreciate feedback if I did it incorrectly.)

The contract is that, similar to Maybe, the l function argument represents the left case, whereas the r argument represents the right case. Contrary to Maybe, both l and r are used as functions. (Everything in lambda calculus is a function, but we don't always use the arguments as the function that they are.)

The left function is a function that takes three arguments (a, l, and r) and always returns l a. Recall that in lambda calculus, everything is a function, which includes l (and r). In other words, left unconditionally calls l with a, and that's the return value.

The right function works like the left function, with the only difference that it always returns r b.

The idea, as usual, is that you can partially apply left and right, by, for instance calling left three (where three is the lambda calculus representation of the number 3, as described in the article on Church-encoded natural numbers). Such a partially applied function is a function that still takes the two arguments l and r.

The same is true if you partially apply right with a value, like right one.

In both cases, you have a function of the form λl.λr.[...]. If you've been given such a function by an external source, you may not know if it's a left or a right expression, and that's the point. You must supply handlers (l and r) that cover all possible cases.

In the lambda calculus, expressions are always curried, so instead of viewing left and right as functions with three arguments, you can view them as functions that take a single element (a or b) and return functions that takes two arguments. This agrees with Haskell's Left and Right data constructors:

Prelude> :t Left
Left :: a -> Either a b
Prelude> :t Right
Right :: b -> Either a b

Haskell tells us that Left is a function that takes an a value and returns an Either a b value. Similarly, Right is a function that takes a b value as input, and returns an Either a b

Church-encoded Either in C# #

Both lambda calculus and Haskell relies on currying and partial application to make the contract fit. In C#, as you've previously seen, you can instead define an interface and rely on class fields for the 'extra' function arguments. Since Church-encoded Either is represented by a function that takes two arguments, we'll once again define an interface with a single method that takes two arguments:

public interface IEither<LR>
{
    T Match<T>(Func<LT> onLeft, Func<RT> onRight);
}

The Match method takes two functions as arguments, one that handles the left case, and one that handles the right case. They correspond to the l and r variables in the above lambda expressions. The intent, as with other Church-encoded discriminated unions, is that when client code is given an IEither<L, R> object, it can only interact with that object by telling the Match method how to deal with both cases. Only one of the functions will be called, but at compile-time, you don't know which one. Both functions, however, must return a value of the generic type T, and that's how you can translate an IEither<L, R> object to a T value.

Following the normal procedure for Church encoding, you must also supply two implementations of the IEither<L, R> interface: one for each case.

public class Left<LR> : IEither<LR>
{
    private readonly L left;
 
    public Left(L left)
    {
        this.left = left;
    }
 
    public T Match<T>(Func<LT> onLeft, Func<RT> onRight)
    {
        return onLeft(left);
    }
}

The Left<L, R> class is an Adapter of a value of the generic type L , making it appear as an IEither<L, R> object.

It always calls the onLeft method argument with the adapted value left, while it ignores the onRight method argument. Since onLeft returns a T value, you can return the value produced by the function call.

The right case is implemented in a similar fashion:

public class Right<LR> : IEither<LR>
{
    private readonly R right;
 
    public Right(R right)
    {
        this.right = right;
    }
 
    public T Match<T>(Func<LT> onLeft, Func<RT> onRight)
    {
        return onRight(right);
    }
}

The Right<L, R> class is the mirror image of Left<L, R>. Instead of adapting an L value, it adapts an R value. It implements Match by always calling onRight with the right value, which, again, produces a T value that can be immediately returned.

Notice that for both implementations, the adapted values left and right are private class fields not exposed as public members. The only way you, as a caller, can potentially extract these values is by calling Match, and that forces you to explicitly deal with both cases.

Here's an example of using the API:

> IEither<stringint> e = new Right<stringint>(42);
> e.Match(s => s.Length % 2 == 0, i => i % 2 == 0)
true

I've deliberately declared e as a an IEither<string, int> in order to highlight the scenario where, as a client developer, you're often given a value of such a type, and you don't know if it's a left or a right value. Had I, instead, used the var keyword, the compiler would have detected that e is, really, a Right<string, int> variable. You may consider this choice artificial, but the point I'm trying to get across is that, when writing client code, you're often given a polymorphic value, and you don't know the concrete type of the value. According to the Liskov Substitution Principle, your client code must be able to deal with any subtype without changing the correctness of the system. In the case of an Either value, the way you deal with all subtypes is by supplying handlers for both cases to the Match method.

In the above example, the return value is true because 42 is an even number. If, instead, the e object is a left case containing the string "foo", the return value is false because the length of "foo" is 3 - an odd number:

> IEither<stringint> e = new Left<stringint>("foo");
> e.Match(s => s.Length % 2 == 0, i => i % 2 == 0)
false

Notice that the e.Match method call is the same in both examples; the onLeft and onRight functions are the same in both cases. The results differ because the input values represent different cases.

If you've been following the overall series on Church encoding, you may think that it's cheating to use C#'s built-in string and int data types, but nothing prevents us from sticking to the data types we've built from scratch:

> IEither<IChurchBooleanINaturalNumber> e;
> e = new Right<IChurchBooleanINaturalNumber>(NaturalNumber.Seven);
> e.Match(b => new ChurchNot(b), n => n.IsEven())
ChurchFalse { }
> e = new Left<IChurchBooleanINaturalNumber>(new ChurchFalse());
> e.Match(b => new ChurchNot(b), n => n.IsEven())
ChurchNot(ChurchFalse)

For both the left and the right case, the Match inverts the Boolean expression if it's a left case, and evaluates if the number is even if it's a right case. In the first example, the return value is a ChurchFalse object because 7 is odd. In the second example, the return value is a ChurchNot object containing a ChurchFalse object (in other words, true), because the negation of false is true.

Either instead of exceptions #

You can use Either to signal the success or failure of an operation. By convention, the right case is used to signal success, so, by elimination, left means failure. You can signal errors in numerous ways, e.g. by using enum values, but another common strategy is to simply use string values.

Consider the following example. You receive a collection of values, where each element represents a vote for that element. For example, the list Sandra, Zoey, Sandra indicates two votes for Sandra, and one for Zoey. You need to write a method that returns the winner of a vote, but at least two distinct errors are possible: the input collection is empty, or there's a tie.

You can model the error cases with an enum:

public enum VoteError
{
    Empty = 0,
    Tie
}

This enables you to write a method to find the winners, with an explicit Either return type:

public static IEither<VoteErrorT> FindWinner<T>(IReadOnlyCollection<T> votes)
{
    var countedVotes = from v in votes
                       group v by v into g
                       let count = g.Count()
                       orderby count descending
                       select new { Vote = g.Key, Count = count };
    var c = countedVotes.Take(2).Count();
 
    if (c == 0)
        return new Left<VoteErrorT>(VoteError.Empty);
 
    var x0 = countedVotes.ElementAt(0);
    if (c == 1)
        return new Right<VoteErrorT>(x0.Vote);
 
    var x1 = countedVotes.ElementAt(1);
    if (Equals(x0.Count, x1.Count))
        return new Left<VoteErrorT>(VoteError.Tie);
 
    return new Right<VoteErrorT>(x0.Vote);
}

Notice that the return type of the FindWinner method is IEither<VoteError, T>; either you get a VoteError value, or you get a T value, but any client code doesn't know which it'll be, so it must handle both cases.

The method uses a C# query expression to group, count, and order the votes. If there's no elements, the return value is a left value containing VoteError.Empty. If there's only a single vote group (e.g. if the votes where all for Sandra), that value is returned in a right case. Otherwise, if the two highest ranked votes have the same count, a left value is returned containing VoteError.Tie. Finally, in all other cases, the highest voted element is returned in a right case.

Here's some examples in C# Interactive:

> FindWinner<int>()
Left<VoteError, int>(Empty)
> FindWinner(1, 2, 3, 1, 4, 2)
Left<VoteError, int>(Tie)
> FindWinner("Sandra""Zoey""Sandra")
Right<VoteError, string>("Sandra")

Instead of throwing two different types of exceptions on invalid input, the FindWinner method handles invalid input as left cases, and valid input as the right case. You can do that consistently, and thereby eliminate the need for exceptions. Errors are, instead, reported as left values.

Summary #

In this article, you saw how it's possible to define the Either container from first principles, using nothing but functions (and, for the C# examples, interfaces and classes in order to make the code easier to understand for object-oriented developers).

Like Maybe, you can also make Either a functor. This'll enable you to compose various error-producing functions in a sane manner.

Church-encoding enables you to model sum types as functions. So far in this article series, you've seen how to model Boolean values, natural numbers, Maybe, and Either. Common to all four examples is that the data type in question consists of two mutually exclusive cases. This is the reason they're all modelled as methods that take two arguments. What happens if, instead of two, you have three mutually exclusive cases? Read on.

Next: Church-encoded payment types.



Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Monday, 11 June 2018 15:43:00 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!