Programming languages don't have to have a built-in notion of null values. Missing or optional values can be created from first principles. An introduction for object-oriented programmers.

This article is part of a series of articles about Church encoding. In this series, you'll learn how to re-create various programming language features from first principles. In previous articles, you learned how to implement Boolean logic without Boolean primitives, as well as how to model natural numbers. Through these examples, you'll learn how to model sum types without explicit language support.

The billion-dollar mistake #

All mainstream programming languages have a built-in notion of null: a value that isn't there. There's nothing wrong with the concept; you often run into situations where you need to return a value, but in certain cases, you'll have nothing to return. Division by zero would be one example. Attempting to retrieve the first element from an empty collection would be another.

Unfortunately, for fifty years, we've been immersed in environments where null references have been the dominant way to model the absence of data. This, despite the fact that even Sir Antony Hoare, the inventor of null references, has publicly called it his billion-dollar mistake.

You can, however, model the potential absence of data in saner ways. Haskell, for example, has no built-in null support, but it does include a built-in Maybe type. In Haskell (as well as in F#, where it's called option), Maybe is defined as a sum type:

data Maybe a = Nothing | Just a deriving (EqOrd)

If you're not familiar with Haskell syntax, this is a type declaration that states that the parametrically polymorphic (AKA generic) data type Maybe is inhabited by Just values that contain other values, plus the constant Nothing.

This article series, however, examines how to implement sum types with Church encoding.

Lambda calculus maybe #

Church encoding is based on the lambda calculus, which defines a universal model of computation based entirely on functions (lambda expressions) and recursion. In lambda calculus, the contract of Maybe is defined as an expression that takes two arguments. There's two fundamental 'implementations' of the contract:

nothing =    λn.λj.n
   just = λx.λn.λj.j x

The contract is that the first function argument (n) represents the nothing case, whereas the second argument (j) represents the just case.

The nothing function is a lambda expression that takes two arguments (n and j), and always returns the first, left-most argument (n).

The just function is a lambda expression that takes three arguments (x, n, and j), and always returns j x. Recall that in the lambda calculus, everything is a function, including j, so j x means that the function j is called with the argument x.

A few paragraphs above, I wrote that the contract of maybe is modelled as an expression that takes two arguments, yet just takes three arguments. How does that fit?

In the lambda calculus, expressions are always curried, so instead of viewing just as a function with three arguments, you can view it as a function that takes a single element (x) and returns a function that takes two arguments. This agrees with Haskell's Just data constructor:

Prelude> :t Just
Just :: a -> Maybe a

Haskell tells us that Just is a function that takes an a value (corresponding to x in the above just lambda expression) and returns a Maybe a value.

Church-encoded Maybe in C# #

Both lambda calculus and Haskell rely on currying and partial application to make the contract fit. In C#, as you've previously seen, you can instead define an interface and rely on class fields for the 'extra' function arguments. Since Church-encoded Maybe is represented by a function that takes two arguments, we'll once again define an interface with a single method that takes two arguments:

public interface IMaybe<T>
    TResult Match<TResult>(TResult nothing, Func<TTResult> just);

In the first article, about Church-encoded Boolean values, you saw how two mutually exclusive values could be modelled as a method that takes two arguments. Boolean values are simply constants (true and false), where the next example (natural numbers) included a case where one case (successor) contained data. In that example, however, the data was statically typed as another INaturalNumber value. In the current IMaybe<T> example, the data contained in the just case is generic (it's of the type T).

Notice that there's two levels of generics in play. IMaybe<T> itself is a container of the generic type T, whereas Match enables you to convert the container into the rank-2 polymorphic type TResult.

Once more, the contract of IMaybe<T> is that the first, left-hand argument represents the nothing case, whereas the second, right-hand argument represents the just case. The nothing implementation, then, is similar to the previous ChurchTrue and Zero classes:

public class Nothing<T> : IMaybe<T>
    public TResult Match<TResult>(TResult nothing, Func<TTResult> just)
        return nothing;

Again, the implementation unconditionally returns nothing while ignoring just. You may, though, have noticed that, as is appropriate for Maybe, Nothing<T> has a distinct type. In other words, Nothing<string> doesn't have the same type as Nothing<int>. This is not only 'by design', but is a fundamental result of how we define Maybe. The code simply wouldn't compile if you tried to remove the type argument from the class. This is in contrast to C# null, which has no type.

You implement the just case like this:

public class Just<T> : IMaybe<T>
    private readonly T value;
    public Just(T value)
        this.value = value;
    public TResult Match<TResult>(TResult nothing, Func<TTResult> just)
        return just(value);

According to the contract, Just<T> ignores nothing and works exclusively with the just function argument. Notice that the value class field is private and not exposed as a public member. The only way you, as a caller, can potentially extract the value is by calling Match.

Here are some examples of using the API:

> new Nothing<Guid>().Match(nothing: "empty", just: g => g.ToString())
> new Just<int>(42).Match(nothing: "empty", just: i => i.ToString())
> new Just<int>(1337).Match(nothing: 0, just: i => i)

Notice that the third example shows how to extract the value contained in a Nothing<int> object without changing the output type. All you have to do is to supply a 'fall-back' value that can be used in case the value is nothing.

Maybe predicates #

You can easily implement the standard Maybe predicates IsNothing and IsJust:

public static IChurchBoolean IsNothing<T>(this IMaybe<T> m)
    return m.Match<IChurchBoolean>(
        nothing :   new ChurchTrue(), 
        just : _ => new ChurchFalse());
public static IChurchBoolean IsJust<T>(this IMaybe<T> m)
    return m.Match<IChurchBoolean>(
        nothing :   new ChurchFalse(),
        just : _ => new ChurchTrue());

Here, I arbitrarily chose to implement IsJust 'from scratch', but I could also have implemented it by negating the result of calling IsNothing. Once again, notice that the predicates are expressed in terms of Church-encoded Boolean values, instead of the built-in bool primitives.

Functor #

From Haskell (and F#) we know that Maybe is a functor. In C#, you turn a container into a functor by implementing an appropriate Select method. You can do this with IMaybe<T> as well:

public static IMaybe<TResult> Select<TTResult>(
    this IMaybe<T> source,
    Func<TTResult> selector)
    return source.Match<IMaybe<TResult>>(
        nothing:   new Nothing<TResult>(),
        just: x => new Just<TResult>(selector(x)));

Notice that this method turns an IMaybe<T> object into an IMaybe<TResult> object, using nothing but the Match method. This is possible because Match has a generic return type; thus, among other types of values, you can make it return IMaybe<TResult>.

When source is a Nothing<T> object, Match returns the object in the nothing case, which here becomes a new Nothing<TResult> object.

When source is a Just<T> object, Match invokes selector with the value contained in the just object, packages the result in a new Just<TResult> object, and returns it.

Because the Select method has the correct signature, you can use it with query syntax, as well as with normal method call syntax:

IMaybe<int> m = new Just<int>(42);
IMaybe<string> actual = from i in m
                        select i.ToString();

This example simply creates a just value containing the number 42, and then maps it to a string. Another way to write the same expression would be with method call syntax:

IMaybe<int> m = new Just<int>(42);
IMaybe<string> actual = m.Select(i => i.ToString());

In both cases, the result is a just case containing the string "42".

Summary #

In this article, you saw how it's possible to define the Maybe container from first principles, using nothing but functions (and, for the C# examples, interfaces and classes in order to make the code easier to understand for object-oriented developers).

Church-encoding enables you to model sum types as functions. So far in this article series, you've seen how to model Boolean values, natural numbers, and Maybe. Common to all three examples is that the data type in question consists of two mutually exclusive cases. There's at least one more interesting variation on that pattern.

Next: Church-encoded Either.


It's probably not your favorite thing to do anymore, but I thank you so much for continuing to provide C# examples for these concepts. It's invaluable for programmers wishing to adopt these concepts into the OOP languages they work with for a living. It's also a tremendous aid in briding the gap of understanding between OOP and FP.
2018-06-04 17:57 UTC

Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.


Monday, 04 June 2018 10:08:00 UTC


"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!