Lists, collections, deterministic Iterators form a monad. An article for object-oriented programmers.

This article is an instalment in an article series about monads.

In the article series about functors I never included an explicit article about collections. Instead, I wrote:

"Perhaps the most well-known of all functors is List, a.k.a. Sequence. C# query syntax can handle any functor, but most people only think of it as a language feature related to IEnumerable<T>. Since the combination of IEnumerable<T> and query syntax is already well-described, I'm not going to cover it explicitly here."

Like many other useful functors, a list also forms a monad. In this article, I call it list. In Haskell the most readily available equivalent is the built-in linked list ([]). Likewise, in F#, the fundamental collection types are linked lists and arrays. In C#, any IEnumerable<T> forms a monad. The .NET IEnumerable<T> interface is an application of the Iterator design pattern. There are many possible names for this monad: list, sequence, collection, iterator, etcetera. I've arbitrarily chosen to mostly use list.

SelectMany #

In the introduction to monads you learned that monads are characterised by having either a join (or flatten) function, or a bind function. While in Haskell, monadic bind is the >>= operator, in C# it's a method called SelectMany. Perhaps you wonder about the name.

The name SelectMany makes most sense in the context of lists. An example would be useful around here.

Imagine that you have to parse a CSV file. You've already read the file from disc and split the contents into lines. Now you need to split each line on commas.

> var lines = new[] { "foo,bar,baz""qux,quux,quuz" };
> lines.Select(l => l.Split(',')).ToArray()
string[2][] { string[3] { "foo", "bar", "baz" }, string[3] { "qux", "quux", "quuz" } }

When you use Select the result is a nested list. You've already learned that a monad is a functor you can flatten. In C#, you can 'flatten as you go' with SelectMany.

The above scenario offers a hint at why the method is named SelectMany. You use it instead of Select when the selector returns many values. In the above example, the Split method returns many values.

So, instead of Select, use SelectMany if you need a flattened list:

> lines.SelectMany(l => l.Split(',')).ToArray()
string[6] { "foo", "bar", "baz", "qux", "quux", "quuz" }

I've never had the opportunity to ask any of the LINQ designers about that naming choice, but this explanation makes sense to me. Even if it turns out that they had something else in mind when they named the method, I think that my explanation at least serves as a mnemonic device.

The name is still unfortunate, since it mostly makes sense for the list monad. Already when you use SelectMany with the Maybe monad, the name makes less sense.

Flatten #

In the introduction you learned that if you have a Flatten or Join function, you can implement SelectMany, and the other way around. In C#, LINQ already comes with SelectMany, but surprisingly, Flatten isn't built in.

You can always implement Flatten by using SelectMany with the identity function.

public static IEnumerable<T> Flatten<T>(this IEnumerable<IEnumerable<T>> source)
{
    return source.SelectMany(x => x);
}

Recall that the selector passed to SelectMany must have the type Func<TSource, IEnumerable<TResult>>. There's no rule, however, that says that TSource can't be IEnumerable<TResult>. If x in x => x has the type IEnumerable<TResult>, then x => x has the type Func<IEnumerable<TResult>, IEnumerable<TResult>> and it all type-checks.

Here's a test that demonstrates how Flatten works:

[Fact]
public void FlattenExample()
{
    string[][] nested = new[] { new[] { "foo""bar" }, new[] { "baz" } };
    IEnumerable<stringflattened = nested.Flatten();
    Assert.Equal(new[] { "foo""bar""baz" }, flattened);
}

In Haskell Flatten is called join and is implemented this way:

join :: (Monad m) => m (m a) -> m a
join x = x >>= id

It's the same implementation as the above C# method: monadic bind (SelectMany in C#; >>= in Haskell) with the identity function. The Haskell implementation works for all monads.

Return #

Apart from monadic bind, a monad must also define a way to put a normal value into the monad. For the list monad, this implies a single value promoted to a list. Surprisingly, the .NET base class library doesn't come with such a built-in function, but you can easily implement it yourself:

public static IEnumerable<T> Return<T>(T x)
{
    yield return x;
}

In practice, however, you'd normally instead create a singleton array, like this: new[] { 2112 }. Since arrays implement IEnumerable<T>, this works just as well.

Why is this the correct implementation of return?

Wouldn't this work?

public static IEnumerable<T> ReturnZero<T>(T x)
{
    yield break;
}

Or this?

public static IEnumerable<T> ReturnTwo<T>(T x)
{
    yield return x;
    yield return x;
}

None of these are appropriate because they break the monad laws. Try it, as an exercise!

Left identity #

Now that we're on the topic of the monad laws, let's see what they look like for the list monad, starting with the left identity law.

[Theory]
[InlineData("")]
[InlineData("foo")]
[InlineData("foo,bar")]
[InlineData("foo,bar,baz")]
public void LeftIdentity(string a)
{
    Func<string, IEnumerable<string>> @return = List.Return;
    Func<string, IEnumerable<string>> h = s => s.Split(',');
 
    Assert.Equal(@return(a).SelectMany(h), h(a));
}

As usual, a parametrised test is no proof that the law holds. I provide it only as an example of what the law looks like.

Right identity #

Likewise, we can showcase the right identity law as a test:

[Theory]
[InlineData(0)]
[InlineData(1)]
[InlineData(9)]
public void RightIdentity(int a)
{
    Func<int, IEnumerable<string>> f = i => Enumerable.Repeat("foo", i);
    Func<string, IEnumerable<string>> @return = List.Return;
 
    IEnumerable<stringm = f(a);
 
    Assert.Equal(m.SelectMany(@return), m);
}

These tests all pass.

Associativity #

The last monad law is the associativity law that we can demonstrate like this:

[Theory]
[InlineData(0)]
[InlineData(3)]
[InlineData(8)]
public void Associativity(int a)
{
    Func<int, IEnumerable<string>> f = i => Enumerable.Repeat("foo", i);
    Func<string, IEnumerable<char>> g = s => s;
    Func<char, IEnumerable<string>> h =
        c => new[] { c.ToString().ToUpper(), c.ToString().ToLower() };
 
    IEnumerable<stringm = f(a);
 
    Assert.Equal(m.SelectMany(g).SelectMany(h), m.SelectMany(x => g(x).SelectMany(h)));
}

The three functions f, g, and h are just silly functions I chose for variety's sake.

Conclusion #

Lists, arrays, collections, Iterators are various names for a common idea: that of an ordered sequence of values. In this article, I've called it list for simplicity's sake. It's possible to flatten nested lists in a well-behaved manner, which makes list a monad.

Next: The Maybe monad.



Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Tuesday, 19 April 2022 05:45:00 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!
Published: Tuesday, 19 April 2022 05:45:00 UTC