ploeh blog: danish software design
Mark Seemann | https://blog.ploeh.dk | 2025-01-13T15:52:44+00:00

Recawr Sandwich
https://blog.ploeh.dk/2025/01/13/recawr-sandwich | 2025-01-13T15:52:00+00:00 | Mark Seemann
<div id="post">
<p>
<em>A pattern variation.</em>
</p>
<p>
After writing the articles <a href="/2024/11/18/collecting-and-handling-result-values">Collecting and handling result values</a> and <a href="/2024/12/02/short-circuiting-an-asynchronous-traversal">Short-circuiting an asynchronous traversal</a>, I realized that it might be valuable to describe a more disciplined variation of the <a href="/2020/03/02/impureim-sandwich">Impureim Sandwich</a> pattern.
</p>
<p>
The book <a href="/ref/dp">Design Patterns</a> describes each pattern over a number of sections. There's a description of the overall motivation, the structure of the pattern, UML diagrams, example code, and more. One section discusses various implementation variations. I find it worthwhile, too, to explicitly draw attention to a particular variation of the more general Impureim Sandwich pattern.
</p>
<p>
This variation imposes an additional constraint on the general pattern. While this may, at first glance, seem limiting, <a href="https://www.dotnetrocks.com/details/1542">constraints liberate</a>.
</p>
<p>
<img src="/content/binary/impureim-superset-of-recawr.png" alt="A subset labeled 'Recawr Sandwiches' contained in a superset labeled 'Impureim Sandwiches'.">
</p>
<p>
As a specialization, you may consider Recawr Sandwiches as a subset of all Impureim Sandwiches.
</p>
<h3 id="7b076cc0cc9148b9ba464bf41feb6128">
Read, calculate, write <a href="#7b076cc0cc9148b9ba464bf41feb6128">#</a>
</h3>
<p>
In short, the constraint is that the Sandwich should be organized in the following order:
</p>
<ul>
<li>Read data. This step is impure.</li>
<li>Calculate a result from the data. This step is a <a href="https://en.wikipedia.org/wiki/Pure_function">pure function</a>.</li>
<li>Write data. This step is impure.</li>
</ul>
<p>
If the sandwich has <a href="/2023/10/09/whats-a-sandwich">more than three layers</a>, this order should still be maintained. Once you start writing data to the network, to disk, to a database, or to the user interface, you shouldn't go back to reading in more data.
</p>
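<p>
To make the ordering concrete, here's a minimal F# sketch of a three-layer sandwich. It's only an illustration: the <code>calculateTotal</code> and <code>summarize</code> names, and the idea of summing the numbers found in a text file, are assumptions made for this example and aren't taken from any of the linked articles.
</p>

```fsharp
open System
open System.IO

// Pure: parse each line as an integer and sum the values.
// Lines that don't parse are ignored (an arbitrary choice for this sketch).
let calculateTotal (lines : string list) =
    lines
    |> List.choose (fun (line : string) ->
        match Int32.TryParse line with
        | true, i -> Some i
        | _ -> None)
    |> List.sum

// Hypothetical sandwich: read first, then calculate, then write.
// Once the write has happened, the function doesn't go back to reading.
let summarize (inputPath : string) (outputPath : string) =
    // Read (impure)
    let lines = File.ReadAllLines inputPath |> List.ofArray
    // Calculate (pure)
    let total = calculateTotal lines
    // Write (impure)
    File.WriteAllText (outputPath, string total)
```

<p>
Since <code>calculateTotal</code> is a pure function, you can test it without touching the file system. Only <code>summarize</code> needs actual files.
</p>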
<h3 id="12089f0da99644849da33faf7dd8ffa4">
Naming <a href="#12089f0da99644849da33faf7dd8ffa4">#</a>
</h3>
<p>
The name <em>Recawr Sandwich</em> is made from the first letters of <em>REad CAlculate WRite</em>. It's pronounced <em>recover sandwich</em>.
</p>
<p>
When the idea of naming this variation originally came to me, I first thought of the name <em>read/write sandwich</em>, but then I thought that the most important ingredient, the pure function, was missing. I've considered some other variations, such as <em>read, pure, write sandwich</em> or <em>input, referential transparency, output sandwich</em>, but none of them quite gets the point across, I think, in the same way as <em>read, calculate, write</em>.
</p>
<h3 id="954558da563244edbb98a6685b3f9460">
Precipitating example <a href="#954558da563244edbb98a6685b3f9460">#</a>
</h3>
<p>
To be clear, I've been applying the Recawr Sandwich pattern for years, but it sometimes takes a counter-example before you realize that some implicit, tacit knowledge should be made explicit. This happened to me as I was discussing <a href="/2024/11/18/collecting-and-handling-result-values">this implementation</a> of Impureim Sandwich:
</p>
<p>
<pre><span style="color:green;">// Impure</span>
<span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">OneOf</span><<span style="color:#2b91af;">ShoppingListItem</span>, <span style="color:#2b91af;">NotFound</span><<span style="color:#2b91af;">ShoppingListItem</span>>, <span style="color:#2b91af;">Error</span>>> <span style="font-weight:bold;color:#1f377f;">results</span> =
    <span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">itemsToUpdate</span>.<span style="font-weight:bold;color:#74531f;">Traverse</span>(<span style="font-weight:bold;color:#1f377f;">item</span> => <span style="color:#74531f;">UpdateItem</span>(<span style="font-weight:bold;color:#1f377f;">item</span>, <span style="font-weight:bold;color:#1f377f;">dbContext</span>));
<span style="color:green;">// Pure</span>
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">result</span> = <span style="font-weight:bold;color:#1f377f;">results</span>.<span style="font-weight:bold;color:#74531f;">Aggregate</span>(
    <span style="color:blue;">new</span> <span style="color:#2b91af;">BulkUpdateResult</span>([], [], []),
    (<span style="font-weight:bold;color:#1f377f;">state</span>, <span style="font-weight:bold;color:#1f377f;">result</span>) =>
        <span style="font-weight:bold;color:#1f377f;">result</span>.<span style="font-weight:bold;color:#74531f;">Match</span>(
            <span style="font-weight:bold;color:#1f377f;">storedItem</span> => <span style="font-weight:bold;color:#1f377f;">state</span>.<span style="font-weight:bold;color:#74531f;">Store</span>(<span style="font-weight:bold;color:#1f377f;">storedItem</span>),
            <span style="font-weight:bold;color:#1f377f;">notFound</span> => <span style="font-weight:bold;color:#1f377f;">state</span>.<span style="font-weight:bold;color:#74531f;">Fail</span>(<span style="font-weight:bold;color:#1f377f;">notFound</span>.Item),
            <span style="font-weight:bold;color:#1f377f;">error</span> => <span style="font-weight:bold;color:#1f377f;">state</span>.<span style="font-weight:bold;color:#74531f;">Error</span>(<span style="font-weight:bold;color:#1f377f;">error</span>)));
<span style="color:green;">// Impure</span>
<span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">dbContext</span>.<span style="font-weight:bold;color:#74531f;">SaveChangesAsync</span>();
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">OkResult</span>(<span style="font-weight:bold;color:#1f377f;">result</span>);</pre>
</p>
<p>
Notice that the top impure step traverses a collection of items to apply each to an action called <code>UpdateItem</code>. As I discussed in the article, I don't actually know what <code>UpdateItem</code> does, but the name strongly suggests that it updates a particular database row. Even if the actual write doesn't happen until <code>SaveChangesAsync</code> is called, this still seems off.
</p>
<p>
To be honest, I didn't realize this until I started thinking about how I'd go about solving the implied problem, if I had to do it from scratch. Because I probably wouldn't do it like that at all.
</p>
<p>
It strikes me that doing the update 'too early' makes the code more complicated than it has to be.
</p>
<p>
What would a Recawr Sandwich look like?
</p>
<h3 id="e599dadd006a4d179289ba72a1978c1f">
Recawr example <a href="#e599dadd006a4d179289ba72a1978c1f">#</a>
</h3>
<p>
Perhaps one could instead start by querying the database about which items are actually in it, then prepare the result, and finally make the update.
</p>
<p>
<pre><span style="color:green;">// Read</span>
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">existing</span> = <span style="color:blue;">await</span> <span style="color:#74531f;">FilterExisting</span>(<span style="font-weight:bold;color:#1f377f;">itemsToUpdate</span>, <span style="font-weight:bold;color:#1f377f;">dbContext</span>);
<span style="color:green;">// Calculate</span>
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">result</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">BulkUpdateResult</span>([.. <span style="font-weight:bold;color:#1f377f;">existing</span>], [.. <span style="font-weight:bold;color:#1f377f;">itemsToUpdate</span>.<span style="font-weight:bold;color:#74531f;">Except</span>(<span style="font-weight:bold;color:#1f377f;">existing</span>)], []);
<span style="color:green;">// Write</span>
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">results</span> = <span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">existing</span>.<span style="font-weight:bold;color:#74531f;">Traverse</span>(<span style="font-weight:bold;color:#1f377f;">item</span> => <span style="color:#74531f;">UpdateItem</span>(<span style="font-weight:bold;color:#1f377f;">item</span>, <span style="font-weight:bold;color:#1f377f;">dbContext</span>));
<span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">dbContext</span>.<span style="font-weight:bold;color:#74531f;">SaveChangesAsync</span>();
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">OkResult</span>(<span style="font-weight:bold;color:#1f377f;">result</span>);</pre>
</p>
<p>
To be honest, this variation has different behaviour when <code>Error</code> values occur, but then again, I wasn't entirely sure what was even the purpose of the error value. If it's to <a href="/2024/01/29/error-categories-and-category-errors">model errors that client code can't recover from</a>, throw an exception instead.
</p>
<p>
In any case, the example is typical of many <a href="https://en.wikipedia.org/wiki/Input/output">I/O</a>-heavy operations, which veer dangerously close to the degenerate. There really isn't a lot of logic required, so one may reasonably ask whether the example is useful. It was, however, the example that got me thinking about giving the Recawr Sandwich an explicit name.
</p>
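<p>
The Calculate step in the above example can also be viewed as a standalone pure function. The following F# sketch is only an approximation under stated assumptions: it uses strings as stand-ins for shopping list items, and the <code>BulkUpdateResult</code> record with its <code>Stored</code>, <code>Failed</code>, and <code>Errors</code> fields is a hypothetical counterpart of the C# type, whose actual definition isn't shown here. The <code>prepareResult</code> name is likewise made up for the occasion.
</p>

```fsharp
// Hypothetical stand-in for the C# BulkUpdateResult; the field names are guesses.
type BulkUpdateResult =
    { Stored : string list
      Failed : string list
      Errors : string list }

// Calculate (pure): split the requested items according to whether
// the preceding Read step found them in the database.
let prepareResult (existing : Set<string>) (itemsToUpdate : string list) =
    let stored, failed =
        itemsToUpdate |> List.partition (fun item -> Set.contains item existing)
    { Stored = stored; Failed = failed; Errors = [] }
```

<p>
With the decision isolated like this, the subsequent Write step only has to update the items in <code>Stored</code>.
</p>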
<h3 id="ef69b33222b44b3e889fc0c861537d48">
Other examples <a href="#ef69b33222b44b3e889fc0c861537d48">#</a>
</h3>
<p>
All the examples in the original <a href="/2020/03/02/impureim-sandwich">Impureim Sandwich</a> article are actually Recawr Sandwiches. Other articles with clear Recawr Sandwich examples are:
</p>
<ul>
<li><a href="/2019/09/09/picture-archivist-in-haskell">Picture archivist in Haskell</a></li>
<li><a href="/2019/09/16/picture-archivist-in-f">Picture archivist in F#</a></li>
<li><a href="/2021/09/06/the-command-handler-contravariant-functor">The Command Handler contravariant functor</a></li>
<li><a href="/2024/12/16/a-restaurant-sandwich">A restaurant sandwich</a></li>
</ul>
<p>
In other words, I'm just retroactively giving these examples a more specific label.
</p>
<p>
What's an example of an Impureim Sandwich which is <em>not</em> a Recawr Sandwich? Ironically, the first example in this article.
</p>
<h3 id="95dbd8e6364d429db7f040835d89e8e7">
Conclusion <a href="#95dbd8e6364d429db7f040835d89e8e7">#</a>
</h3>
<p>
A Recawr Sandwich is a specialization of the slightly more general Impureim Sandwich pattern. It specializes by assigning roles to the two impure layers of the sandwich. In the first, the code reads data. In the second impure layer, it writes data. In between, it performs referentially transparent calculations.
</p>
<p>
While more constraining, this specialization offers a good rule of thumb. Most well-designed sandwiches follow this template.
</p>
</div><hr>
This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>.

Encapsulating rod-cutting
https://blog.ploeh.dk/2025/01/06/encapsulating-rod-cutting | 2025-01-06T10:45:00+00:00 | Mark Seemann
<div id="post">
<p>
<em>Focusing on usage over implementation.</em>
</p>
<p>
This article is a part of a small article series about <a href="/2024/12/09/implementation-and-usage-mindsets">implementation and usage mindsets</a>. The hypothesis is that programmers who approach a problem with an implementation mindset may gravitate toward dynamically typed languages, whereas developers concerned with long-term maintenance and sustainability of a code base may be more inclined toward statically typed languages. This could be wrong, and is almost certainly too simplistic, but is still, I hope, worth thinking about. In the <a href="/2024/12/23/implementing-rod-cutting">previous article</a> you saw examples of an implementation-centric approach to problem-solving. In this article, I'll discuss what a usage-first perspective entails.
</p>
<p>
A usage perspective indicates that you're first and foremost concerned with how useful a programming interface is. It's what you do when you take advantage of test-driven development (TDD). First, you write a test, which furnishes an example of what a usage scenario looks like. Only then do you figure out how to implement the desired API.
</p>
<p>
In this article I didn't use TDD since I already had a particular implementation. Even so, while I didn't mention it in the previous article, I did add tests to verify that the code works as intended. In fact, because I wrote a few <a href="https://github.com/hedgehogqa/fsharp-hedgehog/">Hedgehog</a> properties, I have more than 10,000 test cases covering my implementation.
</p>
<p>
I bring this up because TDD is only one way to focus on sustainability and <a href="/encapsulation-and-solid">encapsulation</a>. It's the most scientific methodology that I know of, but you can employ more ad-hoc, ex-post analysis processes. I'll do that here.
</p>
<h3 id="03ca7a9c8c8146b6b7f0c1275ae9abcc">
Imperative origin <a href="#03ca7a9c8c8146b6b7f0c1275ae9abcc">#</a>
</h3>
<p>
In the <a href="/2024/12/23/implementing-rod-cutting">previous article</a> you saw how the <code>Extended-Bottom-Up-Cut-Rod</code> pseudocode was translated to this <a href="https://fsharp.org/">F#</a> function:
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:#74531f;">cut</span> (<span style="font-weight:bold;color:#1f377f;">p</span> : _ <span style="color:#2b91af;">array</span>) <span style="font-weight:bold;color:#1f377f;">n</span> =
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">r</span> = <span style="color:#2b91af;">Array</span>.<span style="color:#74531f;">zeroCreate</span> (<span style="font-weight:bold;color:#1f377f;">n</span> + 1)
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">s</span> = <span style="color:#2b91af;">Array</span>.<span style="color:#74531f;">zeroCreate</span> (<span style="font-weight:bold;color:#1f377f;">n</span> + 1)
    <span style="font-weight:bold;color:#1f377f;">r</span>[0] <span style="color:blue;"><-</span> 0
    <span style="color:blue;">for</span> <span style="font-weight:bold;color:#1f377f;">j</span> = 1 <span style="color:blue;">to</span> <span style="font-weight:bold;color:#1f377f;">n</span> <span style="color:blue;">do</span>
        <span style="color:blue;">let</span> <span style="color:blue;">mutable</span> <span style="color:#a08000;">q</span> = <span style="color:#2b91af;">Int32</span>.MinValue
        <span style="color:blue;">for</span> <span style="font-weight:bold;color:#1f377f;">i</span> = 1 <span style="color:blue;">to</span> <span style="font-weight:bold;color:#1f377f;">j</span> <span style="color:blue;">do</span>
            <span style="color:blue;">if</span> <span style="color:#a08000;">q</span> < <span style="font-weight:bold;color:#1f377f;">p</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] + <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span> - <span style="font-weight:bold;color:#1f377f;">i</span>] <span style="color:blue;">then</span>
                <span style="color:#a08000;">q</span> <span style="color:blue;"><-</span> <span style="font-weight:bold;color:#1f377f;">p</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] + <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span> - <span style="font-weight:bold;color:#1f377f;">i</span>]
                <span style="font-weight:bold;color:#1f377f;">s</span>[<span style="font-weight:bold;color:#1f377f;">j</span>] <span style="color:blue;"><-</span> <span style="font-weight:bold;color:#1f377f;">i</span>
        <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span>] <span style="color:blue;"><-</span> <span style="color:#a08000;">q</span>
    <span style="font-weight:bold;color:#1f377f;">r</span>, <span style="font-weight:bold;color:#1f377f;">s</span></pre>
</p>
<p>
In case anyone is wondering: This is a bona-fide <a href="https://en.wikipedia.org/wiki/Pure_function">pure function</a>, even if the implementation is as imperative as can be. Given the same input, <code>cut</code> always returns the same output, and there are no side effects. We may wish to implement the function in a more <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a> way, but that's not our first concern. <em>My</em> first concern, at least, is to make sure that preconditions, invariants, and postconditions are properly communicated.
</p>
<p>
The same goal applies to the <code>printSolution</code> action, also repeated here for your convenience.
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:#74531f;">printSolution</span> <span style="font-weight:bold;color:#1f377f;">p</span> <span style="font-weight:bold;color:#1f377f;">n</span> =
    <span style="color:blue;">let</span> _, <span style="font-weight:bold;color:#1f377f;">s</span> = <span style="color:#74531f;">cut</span> <span style="font-weight:bold;color:#1f377f;">p</span> <span style="font-weight:bold;color:#1f377f;">n</span>
    <span style="color:blue;">let</span> <span style="color:blue;">mutable</span> <span style="color:#a08000;">n</span> = <span style="font-weight:bold;color:#1f377f;">n</span>
    <span style="color:blue;">while</span> <span style="color:#a08000;">n</span> > 0 <span style="color:blue;">do</span>
        <span style="color:#74531f;">printfn</span> <span style="color:#a31515;">"</span><span style="color:#2b91af;">%i</span><span style="color:#a31515;">"</span> <span style="font-weight:bold;color:#1f377f;">s</span>[<span style="color:#a08000;">n</span>]
        <span style="color:#a08000;">n</span> <span style="color:blue;"><-</span> <span style="color:#a08000;">n</span> - <span style="font-weight:bold;color:#1f377f;">s</span>[<span style="color:#a08000;">n</span>]</pre>
</p>
<p>
Not that I'm not interested in more idiomatic implementations, but after all, they're <em>by definition</em> just implementation details, so first, I'll discuss encapsulation. Or, if you will, the usage perspective.
</p>
<h3 id="bb978144e56743639e83448c9b1d4f01">
Names and types <a href="#bb978144e56743639e83448c9b1d4f01">#</a>
</h3>
<p>
Based on the above two code snippets, we're given two artefacts: <code>cut</code> and <code>printSolution</code>. Since F# is a statically typed language, each operation also has a type.
</p>
<p>
The type of <code>cut</code> is <code>int array -> int -> int array * int array</code>. If you're not super-comfortable with F# type signatures, this means that <code>cut</code> is a function that takes an integer array and an integer as inputs, and returns a tuple as output. The output tuple is a pair; that is, it contains two elements, and in this particular case, both elements have the same type: They are both integer arrays.
</p>
<p>
Likewise, the type of <code>printSolution</code> is <code>int array -> int -> unit</code>, which again indicates that inputs must be an integer array and an integer. In this case the output is <code>unit</code>, which, in a sense, corresponds to <code>void</code> in many <a href="https://en.wikipedia.org/wiki/C_(programming_language)">C</a>-based languages.
</p>
<p>
Both operations belong to a module called <code>Rod</code>, so their slightly longer, more formal names are <code>Rod.cut</code> and <code>Rod.printSolution</code>. Even so, <a href="/2020/11/23/good-names-are-skin-deep">good names are only skin-deep</a>, and I'm not even convinced that these are particularly good names. To be fair to myself, I adopted the names from the pseudocode from <a href="/ref/clrs">Introduction to Algorithms</a>. Had I been freer to name functions and design APIs, I might have chosen different names. As it is, currently, there's no documentation, so the types are the only source of additional information.
</p>
<p>
Can we infer proper usage from these types? Do they sufficiently well communicate preconditions, invariants, and postconditions? In other words, do the types satisfactorily indicate the <em>contract</em> of each operation? Do the functions exhibit good <a href="/encapsulation-and-solid">encapsulation</a>?
</p>
<p>
We may start with the <code>cut</code> function. It takes as inputs an integer array and an integer. Are empty arrays allowed? Are all integers valid, or perhaps only natural numbers? What about zeroes? Are duplicates allowed? Does the array need to be sorted? Is there a relationship between the array and the integer? Can the single integer parameter be negative?
</p>
<p>
And what about the return value? Are the two integer arrays related in any way? Can one be empty, but the other large? Can they both be empty? May negative numbers or zeroes be present?
</p>
<p>
Similar questions apply to the <code>printSolution</code> action.
</p>
<p>
<a href="/2022/08/22/can-types-replace-validation">Not all such questions can be answered by types</a>, but since we already have a type system at our disposal, we might as well use it to address those questions that are easily modelled.
</p>
<h3 id="2a9f41707fb9425da8078a7181f6e7d6">
Encapsulating the relationship between price array and rod length <a href="#2a9f41707fb9425da8078a7181f6e7d6">#</a>
</h3>
<p>
The first question I decided to answer was this: <em>Is there a relationship between the array and the integer?</em>
</p>
<p>
The array, you may recall, is an array of prices. The integer is the length of the rod to cut up.
</p>
<p>
A relationship clearly exists. The length of the rod must not exceed the number of prices in the array. Since the array is prefixed with a <code>0</code>, this means that the rod length must be less than the length of the array. If it isn't, <code>cut</code> throws an <a href="https://learn.microsoft.com/dotnet/api/system.indexoutofrangeexception">IndexOutOfRangeException</a>. We can't calculate the optimal cuts if we lack price information.
</p>
<p>
Likewise, we can already infer that the length must be a non-negative number.
</p>
<p>
While we could choose to enforce this relationship with Guard Clauses, we may also consider a simpler API. Let the function infer the rod length from the array length.
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:#74531f;">cut</span> (<span style="font-weight:bold;color:#1f377f;">p</span> : _ <span style="color:#2b91af;">array</span>) =
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">n</span> = <span style="font-weight:bold;color:#1f377f;">p</span>.Length - 1
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">r</span> = <span style="color:#2b91af;">Array</span>.<span style="color:#74531f;">zeroCreate</span> (<span style="font-weight:bold;color:#1f377f;">n</span> + 1)
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">s</span> = <span style="color:#2b91af;">Array</span>.<span style="color:#74531f;">zeroCreate</span> (<span style="font-weight:bold;color:#1f377f;">n</span> + 1)
    <span style="font-weight:bold;color:#1f377f;">r</span>[0] <span style="color:blue;"><-</span> 0
    <span style="color:blue;">for</span> <span style="font-weight:bold;color:#1f377f;">j</span> = 1 <span style="color:blue;">to</span> <span style="font-weight:bold;color:#1f377f;">n</span> <span style="color:blue;">do</span>
        <span style="color:blue;">let</span> <span style="color:blue;">mutable</span> <span style="color:#a08000;">q</span> = <span style="color:#2b91af;">Int32</span>.MinValue
        <span style="color:blue;">for</span> <span style="font-weight:bold;color:#1f377f;">i</span> = 1 <span style="color:blue;">to</span> <span style="font-weight:bold;color:#1f377f;">j</span> <span style="color:blue;">do</span>
            <span style="color:blue;">if</span> <span style="color:#a08000;">q</span> < <span style="font-weight:bold;color:#1f377f;">p</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] + <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span> - <span style="font-weight:bold;color:#1f377f;">i</span>] <span style="color:blue;">then</span>
                <span style="color:#a08000;">q</span> <span style="color:blue;"><-</span> <span style="font-weight:bold;color:#1f377f;">p</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] + <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span> - <span style="font-weight:bold;color:#1f377f;">i</span>]
                <span style="font-weight:bold;color:#1f377f;">s</span>[<span style="font-weight:bold;color:#1f377f;">j</span>] <span style="color:blue;"><-</span> <span style="font-weight:bold;color:#1f377f;">i</span>
        <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span>] <span style="color:blue;"><-</span> <span style="color:#a08000;">q</span>
    <span style="font-weight:bold;color:#1f377f;">r</span>, <span style="font-weight:bold;color:#1f377f;">s</span></pre>
</p>
<p>
You may argue that this API is more implicit, which <a href="https://peps.python.org/pep-0020/">we generally don't like</a>. The implication is that the rod length is determined by the array length. If you have a (one-indexed) price array of length <em>10</em>, then how do you calculate the optimal cuts for a rod of length <em>7?</em>
</p>
<p>
By shortening the price array:
</p>
<p>
<pre>> let p = [|0; 1; 5; 8; 9; 10; 17; 17; 20; 24; 30|];;
val p: int array = [|0; 1; 5; 8; 9; 10; 17; 17; 20; 24; 30|]
> cut (p |> Array.take (7 + 1));;
val it: int array * int array =
([|0; 1; 5; 8; 10; 13; 17; 18|], [|0; 1; 2; 3; 2; 2; 6; 1|])</pre>
</p>
<p>
This is clearly still sub-optimal. Notice, for example, how you need to add <code>1</code> to <code>7</code> in order to deal with the prefixed <code>0</code>. On the other hand, we're not done with the redesign, so it may be worth pursuing this course a little further.
</p>
<p>
(To be honest, while this is the direction I ultimately choose, I'm not blind to the disadvantages of this implicit design. It makes it less clear to a client developer how to indicate a rod length. An alternative design would keep the price array and the rod length as two separate parameters, but then introduce a Guard Clause to check that the rod length doesn't exceed the length of the price array. Outside of <a href="https://en.wikipedia.org/wiki/Dependent_type">dependent types</a> I can't think of a way to model such a relationship between two values, and I admit to having no practical experience with dependent types. All this said, however, it's also possible that I'm missing an obvious design alternative. If you can think of a way to model this relationship in a non-<a href="https://www.hillelwayne.com/post/constructive/">predicative</a> way, please <a href="https://github.com/ploeh/ploeh.github.com?tab=readme-ov-file#comments">write a comment</a>.)
</p>
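<p>
For comparison, the alternative design with Guard Clauses might look like the following sketch. The <code>cutGuarded</code> name and the error messages are assumptions made for illustration; the <code>cut</code> function is the original two-parameter version, repeated so that the example is self-contained.
</p>

```fsharp
open System

// The original two-parameter cut function, repeated from earlier in the article.
let cut (p : _ array) n =
    let r = Array.zeroCreate (n + 1)
    let s = Array.zeroCreate (n + 1)
    r[0] <- 0
    for j = 1 to n do
        let mutable q = Int32.MinValue
        for i = 1 to j do
            if q < p[i] + r[j - i] then
                q <- p[i] + r[j - i]
                s[j] <- i
        r[j] <- q
    r, s

// Hypothetical wrapper: Guard Clauses reject invalid input up front,
// instead of letting cut fail with an IndexOutOfRangeException.
let cutGuarded (p : int array) n =
    if n < 0 then
        invalidArg (nameof n) "The rod length must not be negative."
    if p.Length <= n then
        invalidArg (nameof n) "The rod length must be less than the length of the zero-prefixed price array."
    cut p n
```

<p>
This keeps the rod length explicit, but the relationship between the two parameters is only checked at run time. The type <code>int array -> int -> int array * int array</code> still doesn't communicate it.
</p>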
<p>
I gave the <code>printSolution</code> action the same treatment, after first having extracted a <code>solve</code> function in order to <a href="/2016/09/26/decoupling-decisions-from-effects">separate decisions from effects</a>.
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:#74531f;">solve</span> <span style="font-weight:bold;color:#1f377f;">p</span> =
    <span style="color:blue;">let</span> _, <span style="font-weight:bold;color:#1f377f;">s</span> = <span style="color:#74531f;">cut</span> <span style="font-weight:bold;color:#1f377f;">p</span>
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">l</span> = <span style="color:#2b91af;">ResizeArray</span> ()
    <span style="color:blue;">let</span> <span style="color:blue;">mutable</span> <span style="color:#a08000;">n</span> = <span style="font-weight:bold;color:#1f377f;">p</span>.Length - 1
    <span style="color:blue;">while</span> <span style="color:#a08000;">n</span> > 0 <span style="color:blue;">do</span>
        <span style="font-weight:bold;color:#1f377f;">l</span>.<span style="font-weight:bold;color:#74531f;">Add</span> <span style="font-weight:bold;color:#1f377f;">s</span>[<span style="color:#a08000;">n</span>]
        <span style="color:#a08000;">n</span> <span style="color:blue;"><-</span> <span style="color:#a08000;">n</span> - <span style="font-weight:bold;color:#1f377f;">s</span>[<span style="color:#a08000;">n</span>]
    <span style="font-weight:bold;color:#1f377f;">l</span> |> <span style="color:#2b91af;">List</span>.<span style="color:#74531f;">ofSeq</span>

<span style="color:blue;">let</span> <span style="color:#74531f;">printSolution</span> <span style="font-weight:bold;color:#1f377f;">p</span> = <span style="color:#74531f;">solve</span> <span style="font-weight:bold;color:#1f377f;">p</span> |> <span style="color:#2b91af;">List</span>.<span style="color:#74531f;">iter</span> (<span style="color:#74531f;">printfn</span> <span style="color:#a31515;">"</span><span style="color:#2b91af;">%i</span><span style="color:#a31515;">"</span>)</pre>
</p>
<p>
The <em>implementation</em> of the <code>solve</code> function is still imperative, but if you view it as a black box, it's <a href="https://en.wikipedia.org/wiki/Referential_transparency">referentially transparent</a>. We'll get back to the implementation later.
</p>
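<p>
One practical consequence of separating the decision from the effect is that you can examine the result of <code>solve</code> without capturing console output. The following sketch repeats <code>cut</code> and <code>solve</code> as plain text so that it can run on its own; the <code>pieces</code> and <code>total</code> bindings are just illustrative names.
</p>

```fsharp
open System

// cut and solve, repeated from above.
let cut (p : _ array) =
    let n = p.Length - 1
    let r = Array.zeroCreate (n + 1)
    let s = Array.zeroCreate (n + 1)
    r[0] <- 0
    for j = 1 to n do
        let mutable q = Int32.MinValue
        for i = 1 to j do
            if q < p[i] + r[j - i] then
                q <- p[i] + r[j - i]
                s[j] <- i
        r[j] <- q
    r, s

let solve p =
    let _, s = cut p
    let l = ResizeArray ()
    let mutable n = p.Length - 1
    while n > 0 do
        l.Add s[n]
        n <- n - s[n]
    l |> List.ofSeq

let p = [|0; 1; 5; 8; 9; 10; 17; 17; 20; 24; 30|]

// The cuts for a rod of length 7: [1; 6]
let pieces = solve (Array.take (7 + 1) p)

// The pieces always add up to the rod length, here 7.
let total = List.sum pieces
```

<p>
Calling <code>solve</code> again with the same price array returns the same list, which is what makes assertions like these straightforward.
</p>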
<h3 id="01d4ef562a9d4552870ef093ae907f45">
Returning a list of cuts <a href="#01d4ef562a9d4552870ef093ae907f45">#</a>
</h3>
<p>
Let's return to all the questions I enumerated above, particularly the questions about the return value. Are the two integer arrays related?
</p>
<p>
Indeed they are! In fact, they have the same length.
</p>
<p>
As explained in the <a href="/2024/12/23/implementing-rod-cutting">previous article</a>, in the original pseudocode, the <code>r</code> array is supposed to be zero-indexed, but non-empty and containing <code>0</code> as the first element. The <code>s</code> array is supposed to be one-indexed, and be exactly one element shorter than the <code>r</code> array. In practice, in all three implementations shown in that article, I made both arrays zero-indexed, non-empty, and of the exact same length. This is also true for the F# implementation.
</p>
<p>
We can communicate this relationship much better to client developers by changing the return type of the <code>cut</code> function. Currently, the return type is <code>int array * int array</code>, indicating a pair of arrays. Instead, we can change the return type to an array of pairs, thereby indicating that the values are related two-and-two.
</p>
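<p>
As a quick illustration of what an array of pairs looks like, the following sketch zips the two arrays reported by F# Interactive earlier in the article. It isn't how <code>cut</code> ends up being implemented; it only shows the shape of the data, and the <code>pairs</code> name is an assumption.
</p>

```fsharp
// The pair of arrays returned for a rod of length 7, as shown earlier.
let r = [|0; 1; 5; 8; 10; 13; 17; 18|]
let s = [|0; 1; 2; 3; 2; 2; 6; 1|]

// Array.zip combines the arrays element by element.
// It throws an ArgumentException if the two arrays differ in length.
let pairs : (int * int) array = Array.zip r s

// The last pair holds the best revenue and the first cut for the full rod.
let revenue, size = pairs[7]
```

<p>
Each element now carries both values, but as tuples they're still unlabelled.
</p>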
<p>
That would be a decent change, but we can further improve the API. A pair of integers is still implicit, because it isn't clear which integer represents the revenue and which one represents the size. Instead, we introduce a custom type with clear labels:
</p>
<p>
<pre><span style="color:blue;">type</span> <span style="color:#2b91af;">Cut</span> = { Revenue : <span style="color:#2b91af;">int</span>; Size : <span style="color:#2b91af;">int</span> }</pre>
</p>
<p>
Then we change the <code>cut</code> function to return a collection of <code>Cut</code> values:
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:#74531f;">cut</span> (<span style="font-weight:bold;color:#1f377f;">p</span> : _ <span style="color:#2b91af;">array</span>) =
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">n</span> = <span style="font-weight:bold;color:#1f377f;">p</span>.Length - 1
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">r</span> = <span style="color:#2b91af;">Array</span>.<span style="color:#74531f;">zeroCreate</span> (<span style="font-weight:bold;color:#1f377f;">n</span> + 1)
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">s</span> = <span style="color:#2b91af;">Array</span>.<span style="color:#74531f;">zeroCreate</span> (<span style="font-weight:bold;color:#1f377f;">n</span> + 1)
    <span style="font-weight:bold;color:#1f377f;">r</span>[0] <span style="color:blue;"><-</span> 0
    <span style="color:blue;">for</span> <span style="font-weight:bold;color:#1f377f;">j</span> = 1 <span style="color:blue;">to</span> <span style="font-weight:bold;color:#1f377f;">n</span> <span style="color:blue;">do</span>
        <span style="color:blue;">let</span> <span style="color:blue;">mutable</span> <span style="color:#a08000;">q</span> = <span style="color:#2b91af;">Int32</span>.MinValue
        <span style="color:blue;">for</span> <span style="font-weight:bold;color:#1f377f;">i</span> = 1 <span style="color:blue;">to</span> <span style="font-weight:bold;color:#1f377f;">j</span> <span style="color:blue;">do</span>
            <span style="color:blue;">if</span> <span style="color:#a08000;">q</span> < <span style="font-weight:bold;color:#1f377f;">p</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] + <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span> - <span style="font-weight:bold;color:#1f377f;">i</span>] <span style="color:blue;">then</span>
                <span style="color:#a08000;">q</span> <span style="color:blue;"><-</span> <span style="font-weight:bold;color:#1f377f;">p</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] + <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span> - <span style="font-weight:bold;color:#1f377f;">i</span>]
                <span style="font-weight:bold;color:#1f377f;">s</span>[<span style="font-weight:bold;color:#1f377f;">j</span>] <span style="color:blue;"><-</span> <span style="font-weight:bold;color:#1f377f;">i</span>
        <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span>] <span style="color:blue;"><-</span> <span style="color:#a08000;">q</span>
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">result</span> = <span style="color:#2b91af;">ResizeArray</span> ()
    <span style="color:blue;">for</span> <span style="font-weight:bold;color:#1f377f;">i</span> = 0 <span style="color:blue;">to</span> <span style="font-weight:bold;color:#1f377f;">n</span> <span style="color:blue;">do</span>
        <span style="font-weight:bold;color:#1f377f;">result</span>.<span style="font-weight:bold;color:#74531f;">Add</span> { Revenue = <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">i</span>]; Size = <span style="font-weight:bold;color:#1f377f;">s</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] }
    <span style="font-weight:bold;color:#1f377f;">result</span> |> <span style="color:#2b91af;">List</span>.<span style="color:#74531f;">ofSeq</span></pre>
</p>
<p>
The type of <code>cut</code> is now <code>int array -> Cut list</code>. Notice that I decided to return a linked list rather than an array. This is mostly because I consider linked lists to be more idiomatic than arrays in a context of functional programming (FP), but to be honest, I'm not sure that it makes much difference as a return value.
</p>
<p>
In any case, you'll observe that the implementation is still imperative. The main topic of this article is how to give an API good encapsulation, so I treat the actual code as an implementation detail. It's not the most important thing.
</p>
<h3 id="8dca0872ab584d0ebefc10200877adde">
Linked list input <a href="#8dca0872ab584d0ebefc10200877adde">#</a>
</h3>
<p>
Although I wrote that I'm not sure it makes much difference whether <code>cut</code> returns an array or a list, it does matter when it comes to input values. Currently, <code>cut</code> takes an <code>int array</code> as input.
</p>
<p>
As the implementation so amply demonstrates, F# arrays are mutable; you can mutate the cells of an array. A client developer may worry, then, whether <code>cut</code> modifies the input array.
</p>
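<p>
To make that worry concrete, here's a minimal sketch of a function that looks harmless from its signature, but overwrites a cell of the array it receives. The name <code>sneakyTotal</code> and the sample values are hypothetical; they don't appear in the rod-cutting code.
</p>

```fsharp
// Hypothetical function, not part of the rod-cutting code base.
// Its type, int array -> int, says nothing about mutation.
let sneakyTotal (prices : int array) =
    prices.[0] <- 0      // Overwrites the caller's array in place
    Array.sum prices

let input = [| 1; 5; 8 |]
let total = sneakyTotal input

// After the call, input is [| 0; 5; 8 |] and total is 13.
// The caller's data changed, even though only a sum was requested.
```

<p>
Nothing in the type <code>int array -> int</code> warns the caller about this, which is why the choice of input type matters.
</p>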
<p>
From the implementation code we know that it doesn't, but encapsulation is all about sparing client developers the burden of having to read the implementation. Rather, an API should communicate its contract in as succinct a way as possible, either via documentation or the type system.
</p>
<p>
In this case, we can use the type system to communicate this postcondition. Changing the input type to a linked list effectively communicates to all users of the API that <code>cut</code> doesn't mutate the input. This is because F# linked lists are truly immutable.
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:#74531f;">cut</span> <span style="font-weight:bold;color:#1f377f;">prices</span> =
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">p</span> = <span style="font-weight:bold;color:#1f377f;">prices</span> |> <span style="color:#2b91af;">Array</span>.<span style="color:#74531f;">ofList</span>
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">n</span> = <span style="font-weight:bold;color:#1f377f;">p</span>.Length - 1
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">r</span> = <span style="color:#2b91af;">Array</span>.<span style="color:#74531f;">zeroCreate</span> (<span style="font-weight:bold;color:#1f377f;">n</span> + 1)
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">s</span> = <span style="color:#2b91af;">Array</span>.<span style="color:#74531f;">zeroCreate</span> (<span style="font-weight:bold;color:#1f377f;">n</span> + 1)
    <span style="font-weight:bold;color:#1f377f;">r</span>[0] <span style="color:blue;"><-</span> 0
    <span style="color:blue;">for</span> <span style="font-weight:bold;color:#1f377f;">j</span> = 1 <span style="color:blue;">to</span> <span style="font-weight:bold;color:#1f377f;">n</span> <span style="color:blue;">do</span>
        <span style="color:blue;">let</span> <span style="color:blue;">mutable</span> <span style="color:#a08000;">q</span> = <span style="color:#2b91af;">Int32</span>.MinValue
        <span style="color:blue;">for</span> <span style="font-weight:bold;color:#1f377f;">i</span> = 1 <span style="color:blue;">to</span> <span style="font-weight:bold;color:#1f377f;">j</span> <span style="color:blue;">do</span>
            <span style="color:blue;">if</span> <span style="color:#a08000;">q</span> < <span style="font-weight:bold;color:#1f377f;">p</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] + <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span> - <span style="font-weight:bold;color:#1f377f;">i</span>] <span style="color:blue;">then</span>
                <span style="color:#a08000;">q</span> <span style="color:blue;"><-</span> <span style="font-weight:bold;color:#1f377f;">p</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] + <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span> - <span style="font-weight:bold;color:#1f377f;">i</span>]
                <span style="font-weight:bold;color:#1f377f;">s</span>[<span style="font-weight:bold;color:#1f377f;">j</span>] <span style="color:blue;"><-</span> <span style="font-weight:bold;color:#1f377f;">i</span>
        <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span>] <span style="color:blue;"><-</span> <span style="color:#a08000;">q</span>
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">result</span> = <span style="color:#2b91af;">ResizeArray</span> ()
    <span style="color:blue;">for</span> <span style="font-weight:bold;color:#1f377f;">i</span> = 0 <span style="color:blue;">to</span> <span style="font-weight:bold;color:#1f377f;">n</span> <span style="color:blue;">do</span>
        <span style="font-weight:bold;color:#1f377f;">result</span>.<span style="font-weight:bold;color:#74531f;">Add</span> { Revenue = <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">i</span>]; Size = <span style="font-weight:bold;color:#1f377f;">s</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] }
    <span style="font-weight:bold;color:#1f377f;">result</span> |> <span style="color:#2b91af;">List</span>.<span style="color:#74531f;">ofSeq</span></pre>
</p>
<p>
The type of the <code>cut</code> function is now <code>int list -> Cut list</code>, which informs client developers of an invariant. You can trust that <code>cut</code> will not change the input arguments.
</p>
<h3 id="fe67c3b6121e4be780bc3d7f3b166a00">
Natural numbers <a href="#fe67c3b6121e4be780bc3d7f3b166a00">#</a>
</h3>
<p>
You've probably gotten the point by now, so let's move a bit quicker. There are still issues that we'd like to document. Perhaps the worst part of the current API is that client code is required to supply a <code>prices</code> list where the first element <em>must</em> be zero. That's a very specific requirement. It's easy to forget, and if you do, the <code>cut</code> function just silently fails. It doesn't throw an exception; it just gives you a wrong answer.
</p>
<p>
We may choose to add a Guard Clause, but why are we even putting that responsibility on the client developer? Why can't the <code>cut</code> function add that prefix itself? It can, and it turns out that once you do that, and also remove the initial zero element from the output, you're now working with natural numbers.
</p>
<p>
First, add a <code>NaturalNumber</code> wrapper of integers:
</p>
<p>
<pre>type <span style="color:#2b91af;">NaturalNumber</span> = private <span style="color:#2b91af;">NaturalNumber</span> of <span style="color:#2b91af;">int</span> with
    member <span style="font-weight:bold;color:#1f377f;">this</span>.Value = let (<span style="color:#2b91af;">NaturalNumber</span> <span style="font-weight:bold;color:#1f377f;">i</span>) = <span style="font-weight:bold;color:#1f377f;">this</span> in <span style="font-weight:bold;color:#1f377f;">i</span>
    static member <span style="font-weight:bold;color:#74531f;">tryCreate</span> <span style="font-weight:bold;color:#1f377f;">candidate</span> =
        if <span style="font-weight:bold;color:#1f377f;">candidate</span> < 1 then <span style="color:#2b91af;">None</span> else <span style="color:#2b91af;">Some</span> <| <span style="color:#2b91af;">NaturalNumber</span> <span style="font-weight:bold;color:#1f377f;">candidate</span>
    override <span style="font-weight:bold;color:#1f377f;">this</span>.<span style="font-weight:bold;color:#74531f;">ToString</span> () = let (<span style="color:#2b91af;">NaturalNumber</span> <span style="font-weight:bold;color:#1f377f;">i</span>) = <span style="font-weight:bold;color:#1f377f;">this</span> in <span style="color:#74531f;">string</span> <span style="font-weight:bold;color:#1f377f;">i</span></pre>
</p>
<p>
Since the case constructor is <code>private</code>, external code can only <em>try</em> to create values. Once you have a <code>NaturalNumber</code> value, you know that it's valid, but creation requires a run-time check. In other words, this is what Hillel Wayne calls <a href="https://www.hillelwayne.com/post/constructive/">predicative data</a>.
</p>
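<p>
As a small illustration of what that run-time check looks like from the client's side, consider this sketch. It repeats the type (minus the <code>ToString</code> override) so that it runs on its own; the <code>describe</code> function is a hypothetical helper that only exists for this example.
</p>

```fsharp
// Repeated from the article so that the sketch is self-contained.
type NaturalNumber = private NaturalNumber of int with
    member this.Value = let (NaturalNumber i) = this in i
    static member tryCreate candidate =
        if candidate < 1 then None else Some <| NaturalNumber candidate

// Hypothetical helper: client code must handle both outcomes of tryCreate.
let describe candidate =
    match NaturalNumber.tryCreate candidate with
    | Some n -> sprintf "%i is a natural number" n.Value
    | None -> sprintf "%i is rejected" candidate

let accepted = describe 3    // "3 is a natural number"
let rejected = describe 0    // "0 is rejected"
```

<p>
Client code can't skip the <code>None</code> case without the compiler complaining about an incomplete match, which is the point of modelling the check as an <code>option</code>.
</p>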
<p>
Armed with this new type, however, we can now strengthen the definition of the <code>Cut</code> record type:
</p>
<p>
<pre><span style="color:blue;">type</span> <span style="color:#2b91af;">Cut</span> = { Revenue : <span style="color:#2b91af;">int</span>; Size : <span style="color:#2b91af;">NaturalNumber</span> } <span style="color:blue;">with</span>
    <span style="color:blue;">static</span> <span style="color:blue;">member</span> <span style="font-weight:bold;color:#74531f;">tryCreate</span> <span style="font-weight:bold;color:#1f377f;">revenue</span> <span style="font-weight:bold;color:#1f377f;">size</span> =
        <span style="color:#2b91af;">NaturalNumber</span>.<span style="font-weight:bold;color:#74531f;">tryCreate</span> <span style="font-weight:bold;color:#1f377f;">size</span>
        |> <span style="color:#2b91af;">Option</span>.<span style="color:#74531f;">map</span> (<span style="color:blue;">fun</span> <span style="font-weight:bold;color:#1f377f;">size</span> <span style="color:blue;">-></span> { Revenue = <span style="font-weight:bold;color:#1f377f;">revenue</span>; Size = <span style="font-weight:bold;color:#1f377f;">size</span> })</pre>
</p>
<p>
The <code>Revenue</code> may still be any integer, because it turns out that the algorithm also works with negative prices. (For a book that's very meticulous in its analysis of algorithms, <a href="/ref/clrs">CLRS</a> is surprisingly silent on this topic. Thorough testing with <a href="https://github.com/hedgehogqa/fsharp-hedgehog">Hedgehog</a>, however, indicates that this is so.) On the other hand, the <code>Size</code> of the <code>Cut</code> must be a <code>NaturalNumber</code>. Since, again, we don't have any constructive way (outside of using <a href="https://en.wikipedia.org/wiki/Refinement_type">refinement types</a>) of modelling this requirement, we also supply a <code>tryCreate</code> function.
</p>
<p>
This enables us to define the <code>cut</code> function like this:
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:#74531f;">cut</span> <span style="font-weight:bold;color:#1f377f;">prices</span> =
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">p</span> = <span style="font-weight:bold;color:#1f377f;">prices</span> |> <span style="color:#2b91af;">List</span>.<span style="color:#74531f;">append</span> [0] |> <span style="color:#2b91af;">Array</span>.<span style="color:#74531f;">ofList</span>
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">n</span> = <span style="font-weight:bold;color:#1f377f;">p</span>.Length - 1
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">r</span> = <span style="color:#2b91af;">Array</span>.<span style="color:#74531f;">zeroCreate</span> (<span style="font-weight:bold;color:#1f377f;">n</span> + 1)
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">s</span> = <span style="color:#2b91af;">Array</span>.<span style="color:#74531f;">zeroCreate</span> (<span style="font-weight:bold;color:#1f377f;">n</span> + 1)
    <span style="font-weight:bold;color:#1f377f;">r</span>[0] <span style="color:blue;"><-</span> 0
    <span style="color:blue;">for</span> <span style="font-weight:bold;color:#1f377f;">j</span> = 1 <span style="color:blue;">to</span> <span style="font-weight:bold;color:#1f377f;">n</span> <span style="color:blue;">do</span>
        <span style="color:blue;">let</span> <span style="color:blue;">mutable</span> <span style="color:#a08000;">q</span> = <span style="color:#2b91af;">Int32</span>.MinValue
        <span style="color:blue;">for</span> <span style="font-weight:bold;color:#1f377f;">i</span> = 1 <span style="color:blue;">to</span> <span style="font-weight:bold;color:#1f377f;">j</span> <span style="color:blue;">do</span>
            <span style="color:blue;">if</span> <span style="color:#a08000;">q</span> < <span style="font-weight:bold;color:#1f377f;">p</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] + <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span> - <span style="font-weight:bold;color:#1f377f;">i</span>] <span style="color:blue;">then</span>
                <span style="color:#a08000;">q</span> <span style="color:blue;"><-</span> <span style="font-weight:bold;color:#1f377f;">p</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] + <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span> - <span style="font-weight:bold;color:#1f377f;">i</span>]
                <span style="font-weight:bold;color:#1f377f;">s</span>[<span style="font-weight:bold;color:#1f377f;">j</span>] <span style="color:blue;"><-</span> <span style="font-weight:bold;color:#1f377f;">i</span>
        <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span>] <span style="color:blue;"><-</span> <span style="color:#a08000;">q</span>
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">result</span> = <span style="color:#2b91af;">ResizeArray</span> ()
    <span style="color:blue;">for</span> <span style="font-weight:bold;color:#1f377f;">i</span> = 1 <span style="color:blue;">to</span> <span style="font-weight:bold;color:#1f377f;">n</span> <span style="color:blue;">do</span>
        <span style="color:#2b91af;">Cut</span>.<span style="font-weight:bold;color:#74531f;">tryCreate</span> <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] <span style="font-weight:bold;color:#1f377f;">s</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] |> <span style="color:#2b91af;">Option</span>.<span style="color:#74531f;">iter</span> <span style="font-weight:bold;color:#1f377f;">result</span>.<span style="font-weight:bold;color:#74531f;">Add</span>
    <span style="font-weight:bold;color:#1f377f;">result</span> |> <span style="color:#2b91af;">List</span>.<span style="color:#74531f;">ofSeq</span></pre>
</p>
<p>
It still has the type <code>int list -> Cut list</code>, but the <code>Cut</code> type is now more restrictively designed. In other words, we've provided a more conservative definition of what we return, in keeping with <a href="https://en.wikipedia.org/wiki/Robustness_principle">Postel's law</a>.
</p>
<p>
Furthermore, notice that the first line prepends <code>0</code> to the <code>prices</code> list before converting it to the <code>p</code> array, so that the client developer doesn't have to do that. Likewise, when returning the result, the <code>for</code> loop goes from <code>1</code> to <code>n</code>, which means that it omits the first zero cut.
</p>
<p>
These changes ripple through and also improve encapsulation of the <code>solve</code> function:
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:#74531f;">solve</span> <span style="font-weight:bold;color:#1f377f;">prices</span> =
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">cuts</span> = <span style="color:#74531f;">cut</span> <span style="font-weight:bold;color:#1f377f;">prices</span>
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">l</span> = <span style="color:#2b91af;">ResizeArray</span> ()
    <span style="color:blue;">let</span> <span style="color:blue;">mutable</span> <span style="color:#a08000;">n</span> = <span style="font-weight:bold;color:#1f377f;">prices</span>.Length
    <span style="color:blue;">while</span> <span style="color:#a08000;">n</span> > 0 <span style="color:blue;">do</span>
        <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">idx</span> = <span style="color:#a08000;">n</span> - 1
        <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">s</span> = <span style="font-weight:bold;color:#1f377f;">cuts</span>.[<span style="font-weight:bold;color:#1f377f;">idx</span>].Size
        <span style="font-weight:bold;color:#1f377f;">l</span>.<span style="font-weight:bold;color:#74531f;">Add</span> <span style="font-weight:bold;color:#1f377f;">s</span>
        <span style="color:#a08000;">n</span> <span style="color:blue;"><-</span> <span style="color:#a08000;">n</span> - <span style="font-weight:bold;color:#1f377f;">s</span>.Value
    <span style="font-weight:bold;color:#1f377f;">l</span> |> <span style="color:#2b91af;">List</span>.<span style="color:#74531f;">ofSeq</span></pre>
</p>
<p>
The type of <code>solve</code> is now <code>int list -> NaturalNumber list</code>.
</p>
<p>
This is about as strong as I can think of making the API using F#'s type system. A type like <code>int list -> NaturalNumber list</code> tells you something about what you're allowed to do, what you're expected to do, and what you can expect in return. You can provide (almost) any list of integers, whether positive, zero, or negative. You may also give an empty list. If we had wanted to prevent that, we could have used a <code>NonEmpty</code> list, as seen (among other places) in the article <a href="/2024/05/06/conservative-codomain-conjecture">Conservative codomain conjecture</a>.
</p>
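<p>
To see what the API looks like from the outside, here's a condensed, self-contained sketch with two sample calls. The types and algorithm follow the article, but the explicit type annotations, the <code>.[ ]</code> indexing, and the use of <code>List.choose</code> are my own simplifications, and the price list is just sample data (the first seven entries of the price table commonly used with this problem).
</p>

```fsharp
open System

type NaturalNumber = private NaturalNumber of int with
    member this.Value = let (NaturalNumber i) = this in i
    static member tryCreate candidate =
        if candidate < 1 then None else Some <| NaturalNumber candidate

type Cut = { Revenue : int; Size : NaturalNumber } with
    static member tryCreate revenue size =
        NaturalNumber.tryCreate size
        |> Option.map (fun size -> { Revenue = revenue; Size = size })

// Condensed restatement of the imperative implementation.
let cut (prices : int list) : Cut list =
    let p = 0 :: prices |> Array.ofList   // Prepend the zero price
    let n = p.Length - 1
    let r : int array = Array.zeroCreate (n + 1)
    let s : int array = Array.zeroCreate (n + 1)
    for j = 1 to n do
        let mutable q = Int32.MinValue
        for i = 1 to j do
            if q < p.[i] + r.[j - i] then
                q <- p.[i] + r.[j - i]
                s.[j] <- i
        r.[j] <- q
    // Skip index 0, so that only cuts with a natural-number size remain.
    [ 1 .. n ] |> List.choose (fun i -> Cut.tryCreate (r.[i]) (s.[i]))

let solve (prices : int list) =
    let cuts = cut prices
    let l = ResizeArray ()
    let mutable n = prices.Length
    while n > 0 do
        let s = cuts.[n - 1].Size
        l.Add s
        n <- n - s.Value
    l |> List.ofSeq

// Sample data: prices for rods of length 1 through 7.
let sizes = solve [1; 5; 8; 9; 10; 17; 17] |> List.map (fun s -> s.Value)
// sizes = [1; 6]: cut the rod of length 7 into pieces of length 1 and 6.

let none = solve []
// An empty price list is allowed and yields an empty list of cuts.
```

<p>
Notice that the caller supplies no leading zero and receives only sizes that are valid by construction.
</p>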
<p>
Okay, to be perfectly honest, there's one more change that might be in order, but this is where I ran out of steam. One remaining precondition that I haven't yet discussed is that the input list must not contain 'too big' numbers. The problem is that the algorithm adds numbers together, and since 32-bit integers are bounded, you could run into overflow situations. Ask me how I know.
</p>
<p>
Changing the types to use 64-bit integers doesn't solve that problem (it only moves the boundary of where overflow happens), but consistently changing the API to work with <a href="https://learn.microsoft.com/dotnet/api/system.numerics.biginteger">BigInteger</a> values might. To be honest, I haven't tried.
</p>
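<p>
The following sketch only illustrates the arithmetic difference; it doesn't convert the rod-cutting API. The helper names <code>addInt</code> and <code>addBig</code> are hypothetical. With default (unchecked) F# arithmetic, 32-bit addition wraps around silently, whereas <code>BigInteger</code> addition keeps the exact value.
</p>

```fsharp
open System
open System.Numerics

// Hypothetical helpers, only here to contrast the two kinds of addition.
let addInt (x : int) (y : int) = x + y
let addBig (x : BigInteger) (y : BigInteger) = x + y

// Unchecked 32-bit addition wraps around to Int32.MinValue.
let wrapped = addInt Int32.MaxValue 1

// BigInteger addition has no fixed upper bound.
let exact = addBig (BigInteger Int32.MaxValue) (BigInteger 1)
// exact = 2147483648
```

<p>
A wrapped sum like this is one way an algorithm that adds prices can silently produce a wrong answer.
</p>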
<h3 id="641bc16e730542a1a4a231886d208f24">
Functional programming <a href="#641bc16e730542a1a4a231886d208f24">#</a>
</h3>
<p>
From an encapsulation perspective, we're done now. By using the type system, we've emphasized how to <em>use</em> the API, rather than how it's implemented. Along the way, we even hid away some warts that came with the implementation. If I wanted to take this further, I would seriously consider making the <code>cut</code> function a <code>private</code> helper function, because it doesn't really return a solution. It only returns an intermediary value that makes it easier for the <code>solve</code> function to return the actual solution.
</p>
<p>
If you're even just a little bit familiar with F# or functional programming, you may have found it painful to read this far. <em>All that imperative code. My eyes! For the love of God, please rewrite the implementation with proper FP idioms and patterns.</em>
</p>
<p>
Well, the point of the whole article is that the implementation doesn't really matter. It's how client code may <em>use</em> the API that's important.
</p>
<p>
That is, of course, until you have to go and change the implementation code. In any case, as a little consolation prize for those brave FP readers who've made it all this way, here follow more functional implementations of the functions.
</p>
<p>
The <code>NaturalNumber</code> and <code>Cut</code> types haven't changed, so the first change comes with the <code>cut</code> function:
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:blue;">private</span> <span style="color:#74531f;">cons</span> <span style="font-weight:bold;color:#1f377f;">x</span> <span style="font-weight:bold;color:#1f377f;">xs</span> = <span style="font-weight:bold;color:#1f377f;">x</span> <span style="color:#2b91af;">::</span> <span style="font-weight:bold;color:#1f377f;">xs</span>
<span style="color:blue;">let</span> <span style="color:#74531f;">cut</span> <span style="font-weight:bold;color:#1f377f;">prices</span> =
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">p</span> = 0 <span style="color:#2b91af;">::</span> <span style="font-weight:bold;color:#1f377f;">prices</span> |> <span style="color:#2b91af;">Array</span>.<span style="color:#74531f;">ofList</span>
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">n</span> = <span style="font-weight:bold;color:#1f377f;">p</span>.Length - 1
    <span style="color:blue;">let</span> <span style="color:#74531f;">findBestCut</span> <span style="font-weight:bold;color:#1f377f;">revenues</span> <span style="font-weight:bold;color:#1f377f;">j</span> =
        [1..<span style="font-weight:bold;color:#1f377f;">j</span>]
        |> <span style="color:#2b91af;">List</span>.<span style="color:#74531f;">map</span> (<span style="color:blue;">fun</span> <span style="font-weight:bold;color:#1f377f;">i</span> <span style="color:blue;">-></span> <span style="font-weight:bold;color:#1f377f;">p</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] + <span style="color:#2b91af;">Map</span>.<span style="color:#74531f;">find</span> (<span style="font-weight:bold;color:#1f377f;">j</span> - <span style="font-weight:bold;color:#1f377f;">i</span>) <span style="font-weight:bold;color:#1f377f;">revenues</span>, <span style="font-weight:bold;color:#1f377f;">i</span>)
        |> <span style="color:#2b91af;">List</span>.<span style="color:#74531f;">maxBy</span> <span style="color:#74531f;">fst</span>
    <span style="color:blue;">let</span> <span style="color:#74531f;">aggregate</span> <span style="font-weight:bold;color:#1f377f;">acc</span> <span style="font-weight:bold;color:#1f377f;">j</span> =
        <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">revenues</span> = <span style="color:#74531f;">snd</span> <span style="font-weight:bold;color:#1f377f;">acc</span>
        <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">q</span>, <span style="font-weight:bold;color:#1f377f;">i</span> = <span style="color:#74531f;">findBestCut</span> <span style="font-weight:bold;color:#1f377f;">revenues</span> <span style="font-weight:bold;color:#1f377f;">j</span>
        <span style="color:blue;">let</span> <span style="color:#74531f;">cuts</span> = <span style="color:#74531f;">fst</span> <span style="font-weight:bold;color:#1f377f;">acc</span>
        <span style="color:#74531f;">cuts</span> << (<span style="color:#74531f;">cons</span> (<span style="font-weight:bold;color:#1f377f;">q</span>, <span style="font-weight:bold;color:#1f377f;">i</span>)), <span style="color:#2b91af;">Map</span>.<span style="color:#74531f;">add</span> <span style="font-weight:bold;color:#1f377f;">revenues</span>.Count <span style="font-weight:bold;color:#1f377f;">q</span> <span style="font-weight:bold;color:#1f377f;">revenues</span>
    [1..<span style="font-weight:bold;color:#1f377f;">n</span>]
    |> <span style="color:#2b91af;">List</span>.<span style="color:#74531f;">fold</span> <span style="color:#74531f;">aggregate</span> (<span style="color:#74531f;">id</span>, <span style="color:#2b91af;">Map</span>.<span style="color:#74531f;">add</span> 0 0 <span style="color:#2b91af;">Map</span>.empty)
    |> <span style="color:#74531f;">fst</span> <| [] <span style="color:green;">// Evaluate Hughes list</span>
    |> <span style="color:#2b91af;">List</span>.<span style="color:#74531f;">choose</span> (<span style="color:blue;">fun</span> (<span style="font-weight:bold;color:#1f377f;">r</span>, <span style="font-weight:bold;color:#1f377f;">i</span>) <span style="color:blue;">-></span> <span style="color:#2b91af;">Cut</span>.<span style="font-weight:bold;color:#74531f;">tryCreate</span> <span style="font-weight:bold;color:#1f377f;">r</span> <span style="font-weight:bold;color:#1f377f;">i</span>)</pre>
</p>
<p>
Even here, however, some implementation choices are dubious at best. For instance, I decided to use a Hughes list or difference list (see <a href="/2015/12/22/tail-recurse">Tail Recurse</a> for a detailed explanation of how this works in F#) without measuring whether or not it was better than just using normal <em>list consing</em> followed by <code>List.rev</code> (which is, in fact, often faster). That's one of the advantages of writing code for articles; such things don't really matter that much in that context.
</p>
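<p>
For readers unfamiliar with the two techniques, the following sketch contrasts them on a trivial task: copying a list while preserving its order. The functions <code>copyViaHughes</code> and <code>copyViaRev</code> are hypothetical names for illustration, and the sketch makes no claim about which is faster.
</p>

```fsharp
let cons x xs = x :: xs

// Hughes (difference) list: the accumulator is a function that prepends
// everything seen so far. Applying it to [] produces the final list.
let copyViaHughes xs =
    let build = List.fold (fun acc x -> acc << cons x) id xs
    build []

// Conventional alternative: cons each element onto the front,
// which reverses the order, then reverse once at the end.
let copyViaRev xs =
    xs |> List.fold (fun acc x -> x :: acc) [] |> List.rev

let viaHughes = copyViaHughes [1; 2; 3]   // [1; 2; 3]
let viaRev = copyViaRev [1; 2; 3]         // [1; 2; 3]
```

<p>
Both approaches preserve element order; they differ only in how the intermediate accumulator is represented.
</p>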
<p>
Another choice that may leave you scratching your head is that I decided to model the <code>revenues</code> as a map (that is, an immutable dictionary) rather than an array. I did this because I was concerned that with the move towards immutable code, I'd have <code>n</code> reallocations of arrays. Perhaps, I thought, adding incrementally to a <code>Map</code> structure would be more efficient.
</p>
<p>
But really, all of that is just wanking, because I haven't measured.
</p>
<p>
The FP-style implementation of <code>solve</code> is, I believe, less controversial:
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:#74531f;">solve</span> <span style="font-weight:bold;color:#1f377f;">prices</span> =
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">cuts</span> = <span style="color:#74531f;">cut</span> <span style="font-weight:bold;color:#1f377f;">prices</span>
    <span style="color:blue;">let</span> <span style="color:blue;">rec</span> <span style="color:#74531f;">imp</span> <span style="font-weight:bold;color:#1f377f;">n</span> =
        <span style="color:blue;">if</span> <span style="font-weight:bold;color:#1f377f;">n</span> <= 0 <span style="color:blue;">then</span> [] <span style="color:blue;">else</span>
            <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">idx</span> = <span style="font-weight:bold;color:#1f377f;">n</span> - 1
            <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">s</span> = <span style="font-weight:bold;color:#1f377f;">cuts</span>[<span style="font-weight:bold;color:#1f377f;">idx</span>].Size
            <span style="font-weight:bold;color:#1f377f;">s</span> <span style="color:#2b91af;">::</span> <span style="color:#74531f;">imp</span> (<span style="font-weight:bold;color:#1f377f;">n</span> - <span style="font-weight:bold;color:#1f377f;">s</span>.Value)
    <span style="color:#74531f;">imp</span> <span style="font-weight:bold;color:#1f377f;">prices</span>.Length</pre>
</p>
<p>
This is a fairly standard implementation using a local recursive helper function.
</p>
<p>
Both <code>cut</code> and <code>solve</code> have the types previously reported. In other words, this final refactoring to functional implementations didn't change their types.
</p>
<h3 id="c009b6e42470466c9556f52a7c5af175">
Conclusion <a href="#c009b6e42470466c9556f52a7c5af175">#</a>
</h3>
<p>
This article goes through a series of code improvements to illustrate how a static type system can make it easier to use an API. Use it <em>correctly</em>, that is.
</p>
<p>
There's a common misconception about ease of use that it implies typing fewer characters, or getting instant <a href="/2024/05/13/gratification">gratification</a>. That's not my position. <a href="/2018/09/17/typing-is-not-a-programming-bottleneck">Typing is not a bottleneck</a>, and in any case, not much is gained if you make it easier for client developers to get the wrong answers from your API.
</p>
<p>
Static types give you a consistent vocabulary you can use to communicate an API's contract to client developers. What must client code do in order to make a valid method or function call? What guarantees can client code rely on? <a href="/encapsulation-and-solid">Encapsulation</a>, in other words.
</p>
</div>
<hr>
This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>.
Pytest is fast
https://blog.ploeh.dk/2024/12/30/pytest-is-fast
2024-12-30T16:01:00+00:00
Mark Seemann
<div id="post">
<p>
<em>One major attraction of Python. A recent realization.</em>
</p>
<p>
Ever since I became aware of the distinction between statically and dynamically typed languages, I've struggled to understand the attraction of dynamically typed languages. As regular readers may have noticed, this is <a href="/2021/08/09/am-i-stuck-in-a-local-maximum">a bias that doesn't sit well with me</a>. Clearly, there are advantages to dynamic languages that I fail to notice. Is it <a href="/2024/12/09/implementation-and-usage-mindsets">a question of mindset</a>? Or is it a combination of several small advantages?
</p>
<p>
In this article, I'll discuss another potential benefit of at least one dynamically typed language, <a href="https://www.python.org/">Python</a>.
</p>
<h3 id="c9d1927a5f6e4df0bdb1e71766f37d1f">
Fast feedback <a href="#c9d1927a5f6e4df0bdb1e71766f37d1f">#</a>
</h3>
<p>
Rapid feedback is a cornerstone of <a href="/ref/modern-software-engineering">modern software engineering</a>. I've always considered the <a href="/2011/04/29/Feedbackmechanismsandtradeoffs">feedback from the compiler an important mechanism</a>, but I've recently begun to realize that it comes with a price. While a good type system keeps you honest, compilation takes time, too.
</p>
<p>
Since I've been so entrenched in the camp of statically typed languages (C#, <a href="https://fsharp.org/">F#</a>, <a href="https://www.haskell.org/">Haskell</a>), I've tended to regard compilation as a mandatory step. And since the compiler needs to run anyway, you might as well take advantage of it. Use the type system to <a href="https://blog.janestreet.com/effective-ml-video/">make illegal states unrepresentable</a>, and all that.
</p>
<p>
Even so, I've noticed that compilation time isn't a fixed size. This observation surely borders on the banal, but with sufficient cognitive bias, it can, apparently, take years to come to even such a trivial conclusion. After initial years with various programming languages, my formative years as a programmer were spent with C#. As it turns out, the C# compiler is relatively fast.
</p>
<p>
This is probably a combination of factors. Since C# is one of the most popular languages, it has a big and skilled engineering team, and it's my impression that much effort goes into making it as fast and efficient as possible.
</p>
<p>
I also speculate that, since the C# type system isn't as powerful as F#'s or Haskell's, there's simply work that it can't do. When you can't express certain constraints or relationships with the type system, the compiler can't check them, either.
</p>
<p>
That said, the C# compiler seems to have become slower over the years. This could be a consequence of all the extra language features that accumulate.
</p>
<p>
The F# compiler, in comparison, has always taken longer than the C# compiler. Again, this may be due to a combination of a smaller engineering team and the fact that it actually <em>can</em> check more things at compile time, since the type system is more expressive.
</p>
<p>
This, at least, seems to fit with the observation that the Haskell compiler is even slower than F#. The language is incredibly expressive. There are a lot of constraints and relationships that you can model with the type system. Clearly, the compiler has to perform extra work to check that your types line up.
</p>
<p>
You're often left with the impression that <em>if it compiles, it works</em>. The drawback is that getting Haskell code to compile may be a non-trivial undertaking.
</p>
<p>
One thing is that you'll have to wait for the compiler. Another is that if you practice test-driven development (TDD), you'll have to compile the test code, too. Only once the tests are compiled can you run them. And <a href="/2012/05/24/TDDtestsuitesshouldrunin10secondsorless">TDD test suites should run in 10 seconds or less</a>.
</p>
<h3 id="c156a0786a0940a29d37d9982881d5d5">
Skipping compilation with pytest <a href="#c156a0786a0940a29d37d9982881d5d5">#</a>
</h3>
<p>
A few years ago I had to learn a bit of Python, so I decided to try <a href="https://adventofcode.com/2022">Advent of Code 2022</a> in Python. As the puzzles got harder, I added unit tests with <a href="https://pytest.org/">pytest</a>. When I ran them, I was taken aback at how fast they ran.
</p>
<p>
There's no compilation step, so the test suite runs immediately. Obviously, if you've made a mistake that a compiler would have caught, the test fails, but if the code makes sense to the interpreter, it just runs.
</p>
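<p>
To illustrate how little ceremony is involved, here's a minimal sketch of a pytest test module. The function <code>largest_group_sum</code> and both test names are hypothetical, chosen only for this illustration; they aren't taken from the Advent of Code solutions mentioned above.
</p>

```python
# Hypothetical System Under Test, defined here only so that the example is
# self-contained.
def largest_group_sum(groups):
    """Return the largest sum among several groups of numbers."""
    return max(sum(group) for group in groups)


# pytest collects functions whose names start with 'test_' and reports a
# failing 'assert' statement as a failing test.
def test_largest_group_wins():
    assert largest_group_sum([[1000, 2000], [4000], [500, 500]]) == 4000


def test_single_group():
    assert largest_group_sum([[1, 2, 3]]) == 6
```

<p>
Saved in a file with a name like <code>test_example.py</code>, the tests run when you invoke <code>pytest</code> from that directory. There's nothing to build first.
</p>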
<p>
For various reasons, I ran out of steam, as one does with Advent of Code, but I managed to write a good little test suite. Until day 17, it ran in 0.15-0.20 seconds on my little laptop. To be honest, though, once I added tests for day 17, feedback time jumped to just under two seconds. This is clearly because I'd written some inefficient code for my System Under Test.
</p>
<p>
I can't really blame a test framework for being slow, when it's really my own code that slows it down.
</p>
<p>
A counter-argument is that a compiled language is much faster than an interpreted one. Thus, one might think that a faster language would counter poor implementations. Not so.
</p>
<h3 id="c97e1f3c539a4ef78d2a07f25e9c6b0c">
TDD with Haskell <a href="#c97e1f3c539a4ef78d2a07f25e9c6b0c">#</a>
</h3>
<p>
As I've already outlined, the Haskell compiler takes more time than C#, and obviously it takes more time than a language that isn't compiled at all. On the other hand, Haskell compiles to native machine code. My experience with it is that once you've compiled your program, it's <em>fast</em>.
</p>
<p>
In order to compare the two styles, I decided to record compilation and test times while doing <a href="https://adventofcode.com/">Advent of Code 2024</a> in Haskell. I set up a Haskell code base with <a href="https://haskellstack.org/">Stack</a> and <a href="https://hackage.haskell.org/package/HUnit">HUnit</a>, as <a href="/2018/05/07/inlined-hunit-test-lists">I usually do</a>. As I worked through the puzzles, I'd be adding and running tests. Every time, I recorded how long it took, using the <a href="https://en.wikipedia.org/wiki/Time_(Unix)">time</a> command to measure the duration of a <code>stack test</code> run.
</p>
<p>
I've plotted the observations in this chart:
</p>
<p>
<img src="/content/binary/haskell-compile-and-test-times.png" alt="Scatter plot of more than a thousand compile-and-test times, measured in seconds.">
</p>
<p>
The chart shows more than a thousand observations, with the first to the left, and the latest to the right. The times recorded are the entire time it took from when I started a test run until I had an answer. For this, I used the time command's <em>real</em> time measurement, rather than <em>user</em> or <em>sys</em> time. What matters is the feedback time; not the CPU time.
</p>
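<p>
As an illustration of what a <em>real</em> (wall-clock) measurement amounts to, here's a small Python sketch. It is not the procedure used for the measurements in this article, which came from the time command; the helper name <code>wall_clock_seconds</code> is an assumption made for the example.
</p>

```python
import subprocess
import sys
import time


def wall_clock_seconds(command):
    """Run command to completion and return the elapsed wall-clock time in
    seconds. This corresponds to 'real' time, not 'user' or 'sys' CPU time."""
    start = time.perf_counter()
    subprocess.run(command, check=False)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in command so that the sketch runs anywhere Python does.
    # To time a Haskell test run, pass ["stack", "test"] instead.
    elapsed = wall_clock_seconds([sys.executable, "-c", "pass"])
    print(f"{elapsed:.2f} s")
```

<p>
The measurement includes everything that happens between launching the command and getting an answer, which is exactly what you wait for when you run tests.
</p>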
<p>
Each measurement is in seconds. The dashed orange line indicates the linear trend.
</p>
<p>
It's not the first time I've written Haskell code, so I knew what to expect. While you get the occasional fast turnaround time, it easily takes around ten seconds to compile even an empty code base. It seems that there's a constant overhead of that size. While there's an upward trend line as I added more and more code, and more tests, actually running the tests takes almost no time. The initial 'average' feedback time was around eight seconds, and 1100 observations later, the trend sits around 11.5 seconds. At this time, I had more than 200 test cases.
</p>
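<p>
The article doesn't state how the trend line was computed. Assuming an ordinary least-squares fit against the observation number, a sketch could look like the following. The function name <code>linear_trend</code> is hypothetical, and the sample values in the usage line are made up for illustration; they aren't the recorded measurements.
</p>

```python
def linear_trend(values):
    """Fit y = intercept + slope * x by ordinary least squares, where x is
    the zero-based observation number. Returns (slope, intercept)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up sample: feedback times in seconds for four consecutive test runs.
slope, intercept = linear_trend([8.1, 7.9, 8.4, 8.2])
```

<p>
Going by the figures quoted above, a line that rises from about 8 to about 11.5 seconds over 1100 observations has a slope of roughly 0.003 seconds per test run.
</p>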
<p>
You may also notice that the observations vary quite a bit. You occasionally see sub-second times, but also turnaround times over thirty seconds. There's an explanation for both.
</p>
<p>
The sub-second times usually happen if I run the test suite twice without changing any code. In that case, the Haskell Stack correctly skips recompiling the code and instead just reruns the tests. This only highlights that I'm not waiting for the tests to execute. The tests are fast. It's the compiler that causes over 90% of the delay.
</p>
<p>
(Why would I rerun a test suite without changing any code? This mostly happens when I take a break from programming, or if I get distracted by another task. In such cases, when I return to the code, I usually run the test suite in order to remind myself of the state in which I left it. Sometimes, it turns out, I'd left the code in a state where the last thing I did was to run all tests.)
</p>
<p>
The other extremes have a different explanation.
</p>
<h3 id="8fa24383e06349718fca6cb70f95c98f">
IDE woes <a href="#8fa24383e06349718fca6cb70f95c98f">#</a>
</h3>
<p>
Why do I have to suffer through those turnaround times over twenty seconds? A few times over thirty?
</p>
<p>
The short answer is that these represent complete rebuilds. Most of these are caused by problems with the <a href="https://en.wikipedia.org/wiki/Integrated_development_environment">IDE</a>. For Haskell development, I use <a href="https://code.visualstudio.com/">Visual Studio Code</a> with the <a href="https://marketplace.visualstudio.com/items?itemName=haskell.haskell">Haskell extension</a>.
</p>
<p>
Perhaps it's only my setup that's messed up, but whenever I change a function in the System Under Test (SUT), I can. not. make. VS Code pick up that the API changed. Even if I correct my tests so that they still compile and run successfully from the command line, VS Code will keep insisting that the code is wrong.
</p>
<p>
This is, of course, annoying. One of the major selling points of statically typed languages is that a good IDE can tell you if you made mistakes. Well, if it operates on an outdated view of what the SUT looks like, this no longer works.
</p>
<p>
I've tried restarting the Haskell Language Server, but that doesn't work. The only thing that works, as far as I've been able to discover, is to close VS Code, delete <code>.stack-work</code>, recompile, and reopen VS Code. Yes, that takes a minute or two, so not something I like doing too often.
</p>
<p>
Deleting <code>.stack-work</code> does trigger a full rebuild, which is why we see those long build times.
</p>
<h3 id="81ef3b41740145af98094e9335a130eb">
Striking a good balance <a href="#81ef3b41740145af98094e9335a130eb">#</a>
</h3>
<p>
What bothers me about dynamic languages is that I find discoverability and encapsulation so hard. I can't just look at the type of an operation and deduce what inputs it might take, or what the output might look like.
</p>
<p>
To be honest, if you give me a plain text file with F# or Haskell, I can't do that either. A static type system doesn't magically surface that kind of information. Instead, you may rely on an IDE to provide such information at your fingertips. The Haskell extension, for example, gives you a little automatic type annotation above your functions, as discussed in the article <a href="/2024/11/04/pendulum-swing-no-haskell-type-annotation-by-default">Pendulum swing: no Haskell type annotation by default</a>, and shown in a figure reprinted here for your convenience:
</p>
<p>
<img src="/content/binary/haskell-code-with-inferred-type-displayed-by-vs-code.png" alt="Screen shot of a Haskell function in Visual Studio Code with the function's type automatically displayed above it by the Haskell extension.">
</p>
<p>
If this is to work well, this information must be immediate and responsive. On my system it isn't.
</p>
<p>
It may, again, be that there's some problem with my particular tool chain setup. Or perhaps a four-year-old Lenovo X1 Carbon is just too puny a machine to effectively run such a tool.
</p>
<p>
On the other hand, I don't have similar issues with C# in Visual Studio (not VS Code). When I make changes, the IDE quickly responds and tells me if I've made a mistake. To be honest, even here, I feel that <a href="/2023/07/24/is-software-getting-worse">it was faster and more responsive a decade ago</a>, but compared to Haskell programming, the feedback I get with C# is close to immediate.
</p>
<p>
My experience with F# is somewhere in between. Visual Studio is quicker to pick up changes in F# code than VS Code is to reflect changes in Haskell, but it's not as fast as C#.
</p>
<p>
With Python, what little IDE integration is available is usually not trustworthy. Essentially, when suggesting callable operations, the IDE is mostly guessing, based on what it's already seen.
</p>
<p>
But, good Lord! The tests run fast.
</p>
<h3 id="4ea6dd100fbf4cc1b9c2941520a051bf">
Conclusion <a href="#4ea6dd100fbf4cc1b9c2941520a051bf">#</a>
</h3>
<p>
My recent experiences with both Haskell and Python programming are giving me a better understanding of the balances and trade-offs involved with picking a language. While I still favour statically typed languages, I'm beginning to see some attractive qualities on the other side.
</p>
<p>
Particularly, if you buy the argument that TDD suites should run in 10 seconds or less, this effectively means that I can't do TDD in Haskell. Not with the hardware I'm running. Python, on the other hand, seems eminently well-suited for TDD.
</p>
<p>
That doesn't sit too well with me, but on the other hand, I'm glad. I've learned about a benefit of a dynamically typed language. While you may consider all of this ordinary and trite, it feels like a small breakthrough to me. I've been trying hard to see past my own limitations, and it finally feels as though I've found a few chinks in the armour of my biases.
</p>
<p>
I'll keep pushing those envelopes to see what else I may learn.
</p>
</div>
<div id="comments">
<hr>
<h2 id="comments-header">
Comments
</h2>
<div class="comment" id="b760e53201c74532a33f1ae4a10407f9">
<div class="comment-author">Daniel Tartaglia <a href="#b760e53201c74532a33f1ae4a10407f9">#</a></div>
<div class="commentt-content">
<p>An interesting insight, but if you consider that the compiler is effectively an additional test suite that is verifying the types are being used correctly, that extra compilation time is really just a whole suite of tests that you didn't have to write. I can't help but wonder how long it would take to manually implement all the tests that would be required to satisfy those checks in Python, and how much slower the Python test suite would then be.</p>
<p>Like you, I have a strong bias for typesafe languages (or at least moderately typesafe ones). The way I've always explained it is as follows: Developers tend to work faster when writing with dynamically typed languages because they don't have to explain as much to a compiler. This literally means less code to write. However, because the developer <i>hasn't</i> fully explained themself, any follow-on developer does not have as much context to work with.</p>
<p>After all, whether the language requires it or not, the developers need to define and consider types. The only question is, do they have to <i>write it down</i>?</p>
</div>
<div class="comment-date">2025-01-01 01:26 UTC</div>
</div>
<div class="comment" id="d9b64e35daa34be7b5a5c34c55043583">
<div class="comment-author"><a href="/">Mark Seemann</a> <a href="#d9b64e35daa34be7b5a5c34c55043583">#</a></div>
<div class="comment-content">
<p>
Daniel, thank you for writing. I'm <a href="/2011/04/29/Feedbackmechanismsandtradeoffs">well aware that a type checker is a 'first line of defence'</a>, and I agree that if we truly had to replicate everything that a type checker does, as tests, it would take a long time. It would take a long time to write all those tests, and it would probably also take a long time to execute them all.
</p>
<p>
That said, I think that any sane proponent of dynamically typed languages would counter that that's an unreasonable demand. After all, in most cases, it's hardly the case that the code was written by <a href="https://en.wikipedia.org/wiki/Infinite_monkey_theorem">a monkey with a typewriter</a>, but rather by a well-meaning human who did his or her best to write correct code.
</p>
<p>
In the end, however, it's all a question about context. <a href="/2018/11/12/what-to-test-and-not-to-test">How important is correctness</a>, after all?
<a href="https://dannorth.net/about/">Dan North</a> once kindly pointed out to me that in many cases, the software owner doesn't even know what he or she wants. It's only through a series of iterations that we learn what a business system is supposed to do. Until we reach that point, correctness is, at best, a secondary priority. On the other hand, you <a href="https://en.wikipedia.org/wiki/Mars_Climate_Orbiter">should really test your outer space probe software</a>.
</p>
<p>
But you're right. The <a href="https://lexi-lambda.github.io/blog/2020/01/19/no-dynamic-type-systems-are-not-inherently-more-open/">types are still there</a>, either way.
</p>
<p>
The last word in this debate has hardly been said yet, but you may also find my recent article series <a href="/2024/12/09/implementation-and-usage-mindsets">Implementation and usage mindsets</a> interesting.
</p>
</div>
<div class="comment-date">2025-01-07 06:53 UTC</div>
</div>
</div><hr>
This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>.
Implementing rod-cutting https://blog.ploeh.dk/2024/12/23/implementing-rod-cutting 2024-12-23T08:53:00+00:00 Mark Seemann
<div id="post">
<p>
<em>From pseudocode to implementation in three languages.</em>
</p>
<p>
This article picks up where <a href="/2024/12/09/implementation-and-usage-mindsets">Implementation and usage mindsets</a> left off, examining how <a href="https://www.infoq.com/presentations/Simple-Made-Easy/">easy</a> it is to implement an algorithm in three different programming languages.
</p>
<p>
As an example, I'll use the bottom-up rod-cutting algorithm from <a href="/ref/clrs">Introduction to Algorithms</a>.
</p>
<h3 id="0a09280df48e48c7b5257346dc98eab3">
Rod-cutting <a href="#0a09280df48e48c7b5257346dc98eab3">#</a>
</h3>
<p>
The problem is simple:
</p>
<blockquote>
<p>
"Serling Enterprises buys long steel rods and cuts them into shorter rods, which it then sells. Each cut is free. The management of Serling Enterprises wants to know the best way to cut up the rods."
</p>
<footer><cite><a href="/ref/clrs">Introduction to Algorithms. Fourth edition</a>, ch. 14.1</cite></footer>
</blockquote>
<p>
You're given an array of prices, or rather revenues, that each size is worth. The example from the book is given as a table:
</p>
<table>
<tbody>
<tr>
<td>length <em>i</em></td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
<td>5</td>
<td>6</td>
<td>7</td>
<td>8</td>
<td>9</td>
<td>10</td>
</tr>
<tr>
<td>price <em>p<sub>i</sub></em></td>
<td>1</td>
<td>5</td>
<td>8</td>
<td>9</td>
<td>10</td>
<td>17</td>
<td>17</td>
<td>20</td>
<td>24</td>
<td>30</td>
</tr>
</tbody>
</table>
<p>
Notice that while this implies an array like <code>[1, 5, 8, 9, 10, 17, 17, 20, 24, 30]</code>, the array is understood to be one-indexed, as is the most common case in the book. Most languages, including all three languages in this article, have zero-indexed arrays, but it turns out that we can get around the issue by adding a leading zero to the array: <code>[0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]</code>.
</p>
<p>
Thus, given that price array, the best you can do with a rod of length <em>10</em> is to leave it uncut, yielding a revenue of <em>30</em>.
</p>
<p>
<img src="/content/binary/rod-size-10-no-cut.png" alt="A rod divided into 10 segments, left uncut, with the number 30 above it." width="400">
</p>
<p>
On the other hand, if you have a rod of length <em>7</em>, you can cut it into two rods of lengths <em>1</em> and <em>6</em>.
</p>
<p>
<img src="/content/binary/rod-size-7-cut-into-2.png" alt="Two rods, one of a single segment, and one made from six segments. Above the single segment is the number 1, and above the six segments is the number 17." width="320">
</p>
<p>
Another solution for a rod of length <em>7</em> is to cut it into three rods of sizes <em>2</em>, <em>2</em>, and <em>3</em>. Both solutions yield a total revenue of <em>18</em>. Thus, while more than one optimal solution exists, the algorithm given here only identifies one of them.
</p>
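<p>
The arithmetic behind those two solutions is easy to check. The following sketch uses the price array with the leading zero described above; the helper name <code>revenue</code> is made up for this illustration.
</p>

```python
# Price table from the example, with a leading zero so that p[i] is the
# price of a rod of length i.
p = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]


def revenue(prices, pieces):
    """Total revenue from selling rods of the given lengths."""
    return sum(prices[length] for length in pieces)


print(revenue(p, [1, 6]))     # 1 + 17
print(revenue(p, [2, 2, 3]))  # 5 + 5 + 8
```

<p>
Both calls print <em>18</em>, which is more than the <em>17</em> you'd get from selling the rod of length <em>7</em> uncut.
</p>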
<p>
<pre>Extended-Bottom-Up-Cut-Rod(p, n)
let r[0:n] and s[1:n] be new arrays
r[0] = 0
for j = 1 to n                 // for increasing rod length j
    q = -∞
    for i = 1 to j             // i is the position of the first cut
        if q < p[i] + r[j - i]
            q = p[i] + r[j - i]
            s[j] = i           // best cut location so far for length j
    r[j] = q                   // remember the solution value for length j
return r and s</pre>
</p>
<p>
Which programming language is this? It's no particular language, but rather pseudocode.
</p>
<p>
The reason that the function is called <code>Extended-Bottom-Up-Cut-Rod</code> is that the book pedagogically goes through a few other algorithms before arriving at this one. Going forward, I don't intend to keep that rather verbose name, but instead just call the function <code>cut_rod</code>, <code>cutRod</code>, or <code>Rod.cut</code>.
</p>
<p>
The <code>p</code> parameter is a one-indexed price (or revenue) array, as explained above, and <code>n</code> is a rod size (e.g. <code>10</code> or <code>7</code>, reflecting the above examples).
</p>
<p>
Given the above price array and <code>n = 10</code>, the algorithm returns two arrays: <code>r</code> holds the maximum possible revenue for each rod length, and <code>s</code> holds the size of the first piece to cut off in order to achieve that revenue.
</p>
<table>
<thead>
<tr>
<td><em>i</em></td>
<td>0</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
<td>5</td>
<td>6</td>
<td>7</td>
<td>8</td>
<td>9</td>
<td>10</td>
</tr>
</thead>
<tbody>
<tr>
<td><em>r</em>[<em>i</em>]</td>
<td>0</td>
<td>1</td>
<td>5</td>
<td>8</td>
<td>10</td>
<td>13</td>
<td>17</td>
<td>18</td>
<td>22</td>
<td>25</td>
<td>30</td>
</tr>
<tr>
<td><em>s</em>[<em>i</em>]</td>
<td></td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>2</td>
<td>2</td>
<td>6</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>10</td>
</tr>
</tbody>
</table>
<p>
Such output doesn't really give a <em>solution</em>, but rather the raw data to find a solution. For example, for <code>n = 10</code> (= <em>i</em>), you consult the table for (one-indexed) index <em>10</em>, and see that you can get the revenue <em>30</em> from making a cut at position <em>10</em> (which effectively means no cut). For <code>n = 7</code>, you consult the table for index 7 and observe that you can get the total revenue <em>18</em> by making a cut at position <em>1</em>. This leaves you with two rods, and you again consult the table. For <code>n = 1</code>, you can get the revenue <em>1</em> by making a cut at position <em>1</em>; i.e. no further cut. For <code>n = 7 - 1 = 6</code> you consult the table and observe that you can get the revenue <em>17</em> by making a cut at position <em>6</em>, again indicating that no further cut is necessary.
</p>
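<p>
The table lookup just described can be sketched as a small function that collects the piece sizes instead of printing them. The name <code>piece_sizes</code> is hypothetical, and the <code>s</code> list is copied from the table above, with a placeholder <code>0</code> at the unused index <em>0</em>.
</p>

```python
# The s row from the table above. Index 0 is unused, so it holds a
# placeholder value.
s = [0, 1, 2, 3, 2, 2, 6, 1, 2, 3, 10]


def piece_sizes(s, n):
    """Follow the table from length n down to 0, collecting piece sizes."""
    pieces = []
    while n > 0:
        pieces.append(s[n])  # size of the next piece to cut off
        n -= s[n]            # length of the remainder of the rod
    return pieces


print(piece_sizes(s, 10))  # the rod is left uncut
print(piece_sizes(s, 7))   # one piece of length 1 and one of length 6
```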
<p>
Another procedure prints the solution, using the above process:
</p>
<p>
<pre>Print-Cut-Rod-Solution(p, n)
(r, s) = Extended-Bottom-Up-Cut-Rod(p, n)
while n > 0
    print s[n]      // cut location for length n
    n = n - s[n]    // length of the remainder of the rod</pre>
</p>
<p>
Again, the procedure is given as pseudocode.
</p>
<p>
How easy is it to translate this algorithm into code in a real programming language? Not surprisingly, this depends on the language.
</p>
<h3 id="36447b3aa2a14becbb895fd70fdd9d4a">
Translation to Python <a href="#36447b3aa2a14becbb895fd70fdd9d4a">#</a>
</h3>
<p>
The hypothesis of the <a href="/2024/12/09/implementation-and-usage-mindsets">previous</a> article is that dynamically typed languages may be more suited for implementation tasks. The dynamically typed language that I know best is <a href="https://www.python.org/">Python</a>, so let's try that.
</p>
<p>
<pre><span style="color:blue;">def</span> <span style="color:#2b91af;">cut_rod</span>(p, n):
    r = [0] * (n + 1)
    s = [0] * (n + 1)
    r[0] = 0
    <span style="color:blue;">for</span> j <span style="color:blue;">in</span> <span style="color:blue;">range</span>(1, n + 1):
        q = <span style="color:#2b91af;">float</span>(<span style="color:#a31515;">'-inf'</span>)
        <span style="color:blue;">for</span> i <span style="color:blue;">in</span> <span style="color:blue;">range</span>(1, j + 1):
            <span style="color:blue;">if</span> q < p[i] + r[j - i]:
                q = p[i] + r[j - i]
                s[j] = i
        r[j] = q
    <span style="color:blue;">return</span> r, s</pre>
</p>
<p>
That does, indeed, turn out to be straightforward. I had to figure out the syntax for initializing arrays, and how to represent negative infinity, but a combination of <a href="https://github.com/features/copilot">GitHub Copilot</a> and a few web searches quickly cleared that up.
</p>
<p>
The same is true for the <code>Print-Cut-Rod-Solution</code> procedure.
</p>
<p>
<pre><span style="color:blue;">def</span> <span style="color:#2b91af;">print_cut_rod_solution</span>(p, n):
    r, s = cut_rod(p, n)
    <span style="color:blue;">while</span> n > 0:
        <span style="color:blue;">print</span>(s[n])
        n = n - s[n]</pre>
</p>
<p>
Apart from minor syntactical differences, the pseudocode translates directly to Python.
</p>
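<p>
One way to gain confidence in such a translation is to check it against the table of expected results shown earlier. The following pytest-style sketch repeats the <code>cut_rod</code> function so that it's self-contained; the test name is made up for this illustration.
</p>

```python
def cut_rod(p, n):
    # Same algorithm as above, repeated so that this sketch is self-contained.
    r = [0] * (n + 1)
    s = [0] * (n + 1)
    for j in range(1, n + 1):
        q = float('-inf')
        for i in range(1, j + 1):
            if q < p[i] + r[j - i]:
                q = p[i] + r[j - i]
                s[j] = i
        r[j] = q
    return r, s


def test_cut_rod_reproduces_the_table():
    # Price array with a leading zero to emulate one-based indexing.
    p = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]
    r, s = cut_rod(p, 10)
    assert r == [0, 1, 5, 8, 10, 13, 17, 18, 22, 25, 30]
    # s[0] is never assigned by the algorithm, so it keeps its initial 0.
    assert s == [0, 1, 2, 3, 2, 2, 6, 1, 2, 3, 10]
```

<p>
The only difference from the table is that <code>s[0]</code> holds <code>0</code> rather than being blank, because the Python list has to contain something at that index.
</p>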
<p>
So far, the hypothesis seems to hold. This particular dynamically typed language, at least, easily implements that particular algorithm. If we must speculate about underlying reasons, we may argue that a dynamically typed language is <a href="/2019/12/16/zone-of-ceremony">low on ceremony</a>. You don't have to get side-tracked by declaring types of parameters, variables, or return values.
</p>
<p>
That, at least, is a common complaint about statically typed languages that I hear when I discuss with lovers of dynamically typed languages.
</p>
<p>
Let us, then, try to implement the rod-cutting algorithm in a statically typed language.
</p>
<h3 id="a55c4ff33cf247f0b57ae58aa6795343">
Translation to Java <a href="#a55c4ff33cf247f0b57ae58aa6795343">#</a>
</h3>
<p>
Together with other <a href="https://en.wikipedia.org/wiki/C_(programming_language)">C</a>-based languages, <a href="https://www.java.com/">Java</a> is infamous for requiring a high amount of ceremony to get anything done. How easy is it to translate the rod-cutting pseudocode to Java? Not surprisingly, it turns out that one has to jump through a few more hoops.
</p>
<p>
First, of course, one has to set up a code base and choose a build system. I'm not well-versed in Java development, but here I (more or less) arbitrarily chose <a href="https://gradle.org/">gradle</a>. When you're new to an ecosystem, this can be a significant barrier, but I know from decades of C# programming that tooling alleviates much of that pain. Still, a single <code>.py</code> file this isn't.
</p>
<p>
Apart from that, the biggest hurdle turned out to be that, as far as I can tell, Java doesn't have native tuple support. Thus, in order to return two arrays, I would have to either pick a reusable package that implements tuples, or define a custom class for that purpose. Object-oriented programmers often argue that tuples represent poor design, since a tuple doesn't really communicate the role or intent of each element. Given that the rod-cutting algorithm returns two integer arrays, I'd be inclined to agree. You can't even tell them apart based on their types. For that reason, I chose to define a class to hold the result of the algorithm.
</p>
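<p>
For comparison, Python offers a middle ground between an anonymous tuple and a hand-written class. The following is only a sketch of that idea, not part of the Python translation above; the name <code>RodCuttingSolution</code> mirrors the Java class, and the sample values are the first three entries of each row in the result table.
</p>

```python
from typing import List, NamedTuple


class RodCuttingSolution(NamedTuple):
    """A tuple whose elements also have names, so that two lists of the
    same type can still be told apart."""
    revenues: List[int]
    sizes: List[int]


solution = RodCuttingSolution(revenues=[0, 1, 5], sizes=[0, 1, 2])

# Access by name communicates the role of each element...
print(solution.sizes)

# ...while positional unpacking still works as with a plain tuple.
revenues, sizes = solution
```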
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">RodCuttingSolution</span> {
    <span style="color:blue;">private</span> <span style="color:blue;">int</span>[] revenues;
    <span style="color:blue;">private</span> <span style="color:blue;">int</span>[] sizes;

    <span style="color:blue;">public</span> <span style="color:#2b91af;">RodCuttingSolution</span>(<span style="color:blue;">int</span>[] revenues, <span style="color:blue;">int</span>[] sizes) {
        <span style="color:blue;">this</span>.revenues = revenues;
        <span style="color:blue;">this</span>.sizes = sizes;
    }

    <span style="color:blue;">public</span> <span style="color:blue;">int</span>[] <span style="color:#2b91af;">getRevenues</span>() {
        <span style="color:blue;">return</span> revenues;
    }

    <span style="color:blue;">public</span> <span style="color:blue;">int</span>[] <span style="color:#2b91af;">getSizes</span>() {
        <span style="color:blue;">return</span> sizes;
    }
}</pre>
</p>
<p>
Armed with this return type, the rest of the translation went smoothly.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:blue;">RodCuttingSolution</span> <span style="color:#2b91af;">cutRod</span>(<span style="color:blue;">int</span>[] p, <span style="color:blue;">int</span> n) {
    var r = <span style="color:blue;">new</span> <span style="color:blue;">int</span>[n + 1];
    var s = <span style="color:blue;">new</span> <span style="color:blue;">int</span>[n + 1];
    r[0] = 0;
    <span style="color:blue;">for</span> (<span style="color:blue;">int</span> j = 1; j <= n; j++) {
        var q = <span style="color:blue;">Integer</span>.MIN_VALUE;
        <span style="color:blue;">for</span> (<span style="color:blue;">int</span> i = 1; i <= j; i++) {
            <span style="color:blue;">if</span> (q < p[i] + r[j - i]) {
                q = p[i] + r[j - i];
                s[j] = i;
            }
        }
        r[j] = q;
    }
    <span style="color:blue;">return</span> <span style="color:blue;">new</span> <span style="color:blue;">RodCuttingSolution</span>(r, s);
}</pre>
</p>
<p>
Granted, there's a bit more ceremony involved compared to the Python code, since one must declare the types of both input parameters and method return type. You also have to declare the type of the arrays when initializing them, and you could argue that the <code>for</code> loop syntax is more complicated than Python's <code>for ... in range ...</code> syntax. One may also complain that all the brackets and parentheses make it harder to read the code.
</p>
<p>
While I'm used to such C-like code, I'm not immune to such criticism. I actually do find the Python code more readable.
</p>
<p>
Translating the <code>Print-Cut-Rod-Solution</code> pseudocode is a bit easier:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:blue;">void</span> <span style="color:#2b91af;">printCutRodSolution</span>(<span style="color:blue;">int</span>[] p, <span style="color:blue;">int</span> n) {
    var result = cutRod(p, n);
    <span style="color:blue;">while</span> (n > 0) {
        <span style="color:blue;">System</span>.out.println(result.getSizes()[n]);
        n = n - result.getSizes()[n];
    }
}</pre>
</p>
<p>
The overall structure of the code remains intact, but again we're burdened with extra ceremony. We have to declare input and output types, and call that awkward <code>getSizes</code> method to retrieve the array of cut sizes.
</p>
<p>
It's possible that my Java isn't perfectly <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a>. After all, although I've read many books with Java examples over the years, I rarely write Java code. Additionally, you may argue that <code>static</code> methods exhibit a code smell like <a href="https://wiki.c2.com/?FeatureEnvySmell">Feature Envy</a>. I might agree, but the purpose of the current example is to examine how easy or difficult it is to implement a particular algorithm in various languages. Now that we have an implementation in Java, we might wish to refactor to a more object-oriented design, but that's outside the scope of this article.
</p>
<p>
Given that the rod-cutting algorithm isn't the most complex algorithm that exists, we may jump to the conclusion that Java isn't <em>that bad</em> compared to Python. Consider, however, how the extra ceremony on display here impacts your work if you have to implement a larger algorithm, or if you need to iterate to find an algorithm on your own.
</p>
<p>
To be clear, C# would require a similar amount of ceremony, and I don't even want to think about doing this in C.
</p>
<p>
All that said, it'd be irresponsible to extrapolate from only a few examples. You'd need both more languages and more problems before it even seems reasonable to draw any conclusions. I don't, however, intend the present example to constitute a full argument. Rather, it's an illustration of an idea that I haven't pulled out of thin air.
</p>
<p>
One of the points of <a href="/2019/12/16/zone-of-ceremony">Zone of Ceremony</a> is that the degree of awkwardness isn't necessarily correlated to whether types are dynamically or statically defined. While I'm sure that I miss lots of points made by 'dynamists', this is a point that I often feel is missed by that camp. One language that exemplifies that 'beyond-ceremony' zone is <a href="https://fsharp.org/">F#</a>.
</p>
<h3 id="0bb95c0e7967419680fe3e6fcc9aed41">
Translation to F# <a href="#0bb95c0e7967419680fe3e6fcc9aed41">#</a>
</h3>
<p>
If I'm right, we should be able to translate the rod-cutting pseudocode to F# with approximately the same amount of trouble as when translating to Python. How do we fare?
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:#74531f;">cut</span> (<span style="font-weight:bold;color:#1f377f;">p</span> : _ <span style="color:#2b91af;">array</span>) <span style="font-weight:bold;color:#1f377f;">n</span> =
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">r</span> = <span style="color:#2b91af;">Array</span>.<span style="color:#74531f;">zeroCreate</span> (<span style="font-weight:bold;color:#1f377f;">n</span> + 1)
    <span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">s</span> = <span style="color:#2b91af;">Array</span>.<span style="color:#74531f;">zeroCreate</span> (<span style="font-weight:bold;color:#1f377f;">n</span> + 1)
    <span style="font-weight:bold;color:#1f377f;">r</span>[0] <span style="color:blue;"><-</span> 0
    <span style="color:blue;">for</span> <span style="font-weight:bold;color:#1f377f;">j</span> = 1 <span style="color:blue;">to</span> <span style="font-weight:bold;color:#1f377f;">n</span> <span style="color:blue;">do</span>
        <span style="color:blue;">let</span> <span style="color:blue;">mutable</span> <span style="color:#a08000;">q</span> = <span style="color:#2b91af;">Int32</span>.MinValue
        <span style="color:blue;">for</span> <span style="font-weight:bold;color:#1f377f;">i</span> = 1 <span style="color:blue;">to</span> <span style="font-weight:bold;color:#1f377f;">j</span> <span style="color:blue;">do</span>
            <span style="color:blue;">if</span> <span style="color:#a08000;">q</span> < <span style="font-weight:bold;color:#1f377f;">p</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] + <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span> - <span style="font-weight:bold;color:#1f377f;">i</span>] <span style="color:blue;">then</span>
                <span style="color:#a08000;">q</span> <span style="color:blue;"><-</span> <span style="font-weight:bold;color:#1f377f;">p</span>[<span style="font-weight:bold;color:#1f377f;">i</span>] + <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span> - <span style="font-weight:bold;color:#1f377f;">i</span>]
                <span style="font-weight:bold;color:#1f377f;">s</span>[<span style="font-weight:bold;color:#1f377f;">j</span>] <span style="color:blue;"><-</span> <span style="font-weight:bold;color:#1f377f;">i</span>
        <span style="font-weight:bold;color:#1f377f;">r</span>[<span style="font-weight:bold;color:#1f377f;">j</span>] <span style="color:blue;"><-</span> <span style="color:#a08000;">q</span>
    <span style="font-weight:bold;color:#1f377f;">r</span>, <span style="font-weight:bold;color:#1f377f;">s</span></pre>
</p>
<p>
Fairly well, as it turns out, although we <em>do</em> have to annotate <code>p</code> by indicating that it's an array. Still, the underscore in front of the <code>array</code> keyword indicates that we're happy to let the compiler infer the type of array (which is <code>int array</code>).
</p>
<p>
(We <em>can</em> get around that issue by writing <code>Array.item i p</code> instead of <code>p[i]</code>, but that's verbose in a different way.)
</p>
<p>
Had we chosen to instead implement the algorithm based on an input list or map, we wouldn't have needed the type hint. One could therefore argue that the reason that the hint is even required is because arrays aren't the most idiomatic data structure for a functional language like F#.
</p>
<p>
Otherwise, I don't find that this translation was much harder than translating to Python, and I personally prefer <code><span style="color:blue;">for</span> <span style="color:#1f377f;">j</span> = 1 <span style="color:blue;">to</span> <span style="color:#1f377f;">n</span> <span style="color:blue;">do</span></code> over <code><span style="color:blue;">for</span> j <span style="color:blue;">in</span> <span style="color:blue;">range</span>(1, n + 1):</code>.
</p>
<p>
We also need to add the <code>mutable</code> keyword to allow <code>q</code> to change during the loop. You could argue that this is another example of additional ceremony. While I agree, it's not much related to static versus dynamic typing, but more to how values are immutable by default in F#. If I recall correctly, JavaScript similarly distinguishes between <code>let</code>, <code>var</code>, and <code>const</code>.
</p>
<p>
Translating <code>Print-Cut-Rod-Solution</code> requires, again due to values being immutable by default, a bit more effort than Python, but not much:
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:#74531f;">printSolution</span> <span style="font-weight:bold;color:#1f377f;">p</span> <span style="font-weight:bold;color:#1f377f;">n</span> =
    <span style="color:blue;">let</span> _, <span style="font-weight:bold;color:#1f377f;">s</span> = <span style="color:#74531f;">cut</span> <span style="font-weight:bold;color:#1f377f;">p</span> <span style="font-weight:bold;color:#1f377f;">n</span>
    <span style="color:blue;">let</span> <span style="color:blue;">mutable</span> <span style="color:#a08000;">n</span> = <span style="font-weight:bold;color:#1f377f;">n</span>
    <span style="color:blue;">while</span> <span style="color:#a08000;">n</span> > 0 <span style="color:blue;">do</span>
        <span style="color:#74531f;">printfn</span> <span style="color:#a31515;">"</span><span style="color:#2b91af;">%i</span><span style="color:#a31515;">"</span> <span style="font-weight:bold;color:#1f377f;">s</span>[<span style="color:#a08000;">n</span>]
        <span style="color:#a08000;">n</span> <span style="color:blue;"><-</span> <span style="color:#a08000;">n</span> - <span style="font-weight:bold;color:#1f377f;">s</span>[<span style="color:#a08000;">n</span>]</pre>
</p>
<p>
I had to shadow the <code>n</code> parameter with a <code>mutable</code> variable to stay as close to the pseudocode as possible. Again, one may argue that the overall problem here isn't the static type system, but that programming based on mutation isn't idiomatic for F# (or other functional programming languages). As you'll see in the next article, a more idiomatic implementation is even simpler than this one.
</p>
<p>
Notice, however, that the <code>printSolution</code> action requires no type declarations or annotations.
</p>
<p>
Let's see it all in use:
</p>
<p>
<pre>> let p = [|0; 1; 5; 8; 9; 10; 17; 17; 20; 24; 30|];;
val p: int array = [|0; 1; 5; 8; 9; 10; 17; 17; 20; 24; 30|]
> Rod.printSolution p 7;;
1
6</pre>
</p>
<p>
This little interactive session reproduces the example illustrated in the beginning of this article. When given the price array from the book and a rod of size <em>7</em>, the solution printed indicates cuts at positions <em>1</em> and <em>6</em>.
</p>
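<p>
As a cross-check on that session, here's a minimal Python sketch of the same two procedures. It collects the piece sizes in a list instead of printing them, and the name <code>solution</code> is only an illustrative choice, not something taken from the pseudocode.
</p>

```python
def cut(p, n):
    """Bottom-up rod cutting; p[i] is the price of a piece of length i."""
    r = [0] * (n + 1)  # r[j]: best revenue for a rod of length j
    s = [0] * (n + 1)  # s[j]: size of the first piece in an optimal cut
    for j in range(1, n + 1):
        q = float("-inf")
        for i in range(1, j + 1):
            # Strict comparison: on ties, the smallest i wins.
            if q < p[i] + r[j - i]:
                q = p[i] + r[j - i]
                s[j] = i
        r[j] = q
    return r, s


def solution(p, n):
    """Return the piece sizes that the printing procedure would print."""
    _, s = cut(p, n)
    pieces = []
    while n > 0:
        pieces.append(s[n])
        n -= s[n]
    return pieces


p = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]
print(solution(p, 7))  # [1, 6]
```

<p>
With the same price array and a rod of size <em>7</em>, the sketch yields the pieces <em>1</em> and <em>6</em>, in agreement with the F# session.
</p>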
<p>
I find it telling that the translation to F# is on par with the translation to Python, even though the structure of the pseudocode is quite imperative.
</p>
<h3 id="eb28d2ce98b34628b2ec4d0df8905492">
Conclusion <a href="#eb28d2ce98b34628b2ec4d0df8905492">#</a>
</h3>
<p>
You could, perhaps, say that if your mindset is predominantly imperative, implementing an algorithm using Python is likely easier than using either F# or Java. If, on the other hand, you're mostly in an implementation mindset, but not strongly attached to whether the implementation should be imperative, object-oriented, or functional, I'd offer the conjecture that a language like F# is as implementation-friendly as a language like Python.
</p>
<p>
If, on the other hand, you're more focused on encapsulating and documenting how an existing API works, perhaps that shift of perspective suggests another evaluation of dynamically versus statically typed languages.
</p>
<p>
In any case, the F# code shown here is hardly idiomatic, so it might be illuminating to see what happens if we refactor it.
</p>
<p>
<strong>Next:</strong> <a href="/2025/01/06/encapsulating-rod-cutting">Encapsulating rod-cutting</a>.
</p>
</div><hr>
This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>.
<p>
<a href="https://blog.ploeh.dk/2024/12/16/a-restaurant-sandwich">A restaurant sandwich</a>, 2024-12-16T19:11:00+00:00, Mark Seemann
</p>
<div id="post">
<p>
<em>An Impureim Sandwich example in C#.</em>
</p>
<p>
When learning functional programming (FP), people often struggle with how to organize code. How do you <a href="/2020/02/24/discerning-and-maintaining-purity">discern and maintain purity</a>? <a href="/2017/02/02/dependency-rejection">How do you do Dependency Injection in FP?</a> What does <a href="/2018/11/19/functional-architecture-a-definition">a functional architecture</a> look like?
</p>
<p>
A common FP design pattern is the <a href="/2020/03/02/impureim-sandwich">Impureim Sandwich</a>. The entry point of an application is always impure, so you push all impure actions to the boundary of the system. This is also known as <a href="https://www.destroyallsoftware.com/screencasts/catalog/functional-core-imperative-shell">Functional Core, Imperative Shell</a>. If you have a <a href="/2017/07/10/pure-interactions">micro-operation-based architecture</a>, which includes all web-based systems, you can often get by with a 'sandwich'. Perform impure actions to collect all the data you need. Pass all data to a <a href="https://en.wikipedia.org/wiki/Pure_function">pure function</a>. Finally, use impure actions to handle the <a href="https://en.wikipedia.org/wiki/Referential_transparency">referentially transparent</a> return value from the pure function.
</p>
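<p>
To make the three layers concrete, here's a minimal Python sketch of such a sandwich. Everything in it is hypothetical: a dictionary stands in for the database, and <code>will_accept</code> is a deliberately simplistic capacity check, not the decision logic from the sample code base.
</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    id: str
    quantity: int


def will_accept(capacity, existing, candidate):
    """Pure: the decision depends only on the arguments."""
    reserved = sum(r.quantity for r in existing)
    return reserved + candidate.quantity <= capacity


def handle(db, candidate, capacity=10):
    existing = list(db.values())                     # impure: read
    ok = will_accept(capacity, existing, candidate)  # pure: decide
    if ok:
        db[candidate.id] = candidate                 # impure: write
    return ok


# A dict plays the role of the database in this sketch.
db = {"a": Reservation("a", 6)}
print(handle(db, Reservation("b", 4)))  # True
print(handle(db, Reservation("c", 1)))  # False
```

<p>
All the reading happens before the pure function runs, and all the writing happens after it. The pure function never touches the 'database' itself.
</p>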
<p>
No design pattern applies universally, and neither does this one. In my experience, however, it's surprisingly often possible to apply this architecture. We're far past the <a href="https://en.wikipedia.org/wiki/Pareto_principle">Pareto principle</a>'s 80 percent.
</p>
<p>
Examples may help illustrate the pattern, as well as explore its boundaries. In this article you'll see how I refactored an entry point of a <a href="https://en.wikipedia.org/wiki/REST">REST</a> API, specifically the <code>PUT</code> handler in the sample code base that accompanies <a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a>.
</p>
<h3 id="67463d3ade684a2b9de807b261ebb03c">
Starting point <a href="#67463d3ade684a2b9de807b261ebb03c">#</a>
</h3>
<p>
As discussed in the book, the architecture of the sample code base is, in fact, Functional Core, Imperative Shell. This isn't, however, the main theme of the book, and the code doesn't explicitly apply the Impureim Sandwich. In spirit, that's actually what's going on, but it isn't clear from looking at the code. This was a deliberate choice I made, because I wanted to highlight other software engineering practices. This does have the effect, though, that the Impureim Sandwich is invisible.
</p>
<p>
For example, the book follows <a href="/2019/11/04/the-80-24-rule">the 80/24 rule</a> closely. This was a didactic choice on my part. Most code bases I've seen in the wild have far too big methods, so I wanted to hammer home the message that it's possible to develop and maintain a non-trivial code base with small code blocks. This meant, however, that I had to split up HTTP request handlers (in ASP.NET known as <em>action methods</em> on Controllers).
</p>
<p>
The most complex HTTP handler is the one that handles <code>PUT</code> requests for reservations. Clients use this action when they want to make changes to a restaurant reservation.
</p>
<p>
The action method actually invoked by an HTTP request is this <code>Put</code> method:
</p>
<p>
<pre>[<span style="color:#2b91af;">HttpPut</span>(<span style="color:#a31515;">"restaurants/{restaurantId}/reservations/{id}"</span>)]
<span style="color:blue;">public</span> <span style="color:blue;">async</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">ActionResult</span>> <span style="font-weight:bold;color:#74531f;">Put</span>(
    <span style="color:blue;">int</span> <span style="font-weight:bold;color:#1f377f;">restaurantId</span>,
    <span style="color:blue;">string</span> <span style="font-weight:bold;color:#1f377f;">id</span>,
    <span style="color:#2b91af;">ReservationDto</span> <span style="font-weight:bold;color:#1f377f;">dto</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">dto</span> <span style="color:blue;">is</span> <span style="color:blue;">null</span>)
        <span style="font-weight:bold;color:#8f08c4;">throw</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(<span style="font-weight:bold;color:#1f377f;">dto</span>));
    <span style="font-weight:bold;color:#8f08c4;">if</span> (!<span style="color:#2b91af;">Guid</span>.<span style="color:#74531f;">TryParse</span>(<span style="font-weight:bold;color:#1f377f;">id</span>, <span style="color:blue;">out</span> <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">rid</span>))
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>();
    <span style="color:#2b91af;">Reservation</span>? <span style="font-weight:bold;color:#1f377f;">reservation</span> = <span style="font-weight:bold;color:#1f377f;">dto</span>.<span style="font-weight:bold;color:#74531f;">Validate</span>(<span style="font-weight:bold;color:#1f377f;">rid</span>);
    <span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">reservation</span> <span style="color:blue;">is</span> <span style="color:blue;">null</span>)
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">BadRequestResult</span>();
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">restaurant</span> = <span style="color:blue;">await</span> RestaurantDatabase
        .<span style="font-weight:bold;color:#74531f;">GetRestaurant</span>(<span style="font-weight:bold;color:#1f377f;">restaurantId</span>).<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
    <span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">restaurant</span> <span style="color:blue;">is</span> <span style="color:blue;">null</span>)
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>();
    <span style="font-weight:bold;color:#8f08c4;">return</span>
        <span style="color:blue;">await</span> <span style="font-weight:bold;color:#74531f;">TryUpdate</span>(<span style="font-weight:bold;color:#1f377f;">restaurant</span>, <span style="font-weight:bold;color:#1f377f;">reservation</span>).<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
}</pre>
</p>
<p>
Since I, for pedagogical reasons, wanted to fit each method inside an 80x24 box, I made a few somewhat unnatural design choices. The above code is one of them. While I don't consider it completely indefensible, this method does a bit of up-front input validation and verification, and then delegates execution to the <code>TryUpdate</code> method.
</p>
<p>
This may seem all fine and dandy until you realize that the only caller of <code>TryUpdate</code> is that <code>Put</code> method. A similar thing happens in <code>TryUpdate</code>: It calls a method that has only that one caller. We may try to inline those two methods to see if we can spot the Impureim Sandwich.
</p>
<h3 id="dab8cd4011a5493ea55b47cb2240839b">
Inlined Transaction Script <a href="#dab8cd4011a5493ea55b47cb2240839b">#</a>
</h3>
<p>
Inlining those two methods leaves us with a larger, <a href="https://martinfowler.com/eaaCatalog/transactionScript.html">Transaction Script</a>-like entry point:
</p>
<p>
<pre>[<span style="color:#2b91af;">HttpPut</span>(<span style="color:#a31515;">"restaurants/{restaurantId}/reservations/{id}"</span>)]
<span style="color:blue;">public</span> <span style="color:blue;">async</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">ActionResult</span>> <span style="font-weight:bold;color:#74531f;">Put</span>(
    <span style="color:blue;">int</span> <span style="font-weight:bold;color:#1f377f;">restaurantId</span>,
    <span style="color:blue;">string</span> <span style="font-weight:bold;color:#1f377f;">id</span>,
    <span style="color:#2b91af;">ReservationDto</span> <span style="font-weight:bold;color:#1f377f;">dto</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">dto</span> <span style="color:blue;">is</span> <span style="color:blue;">null</span>)
        <span style="font-weight:bold;color:#8f08c4;">throw</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(<span style="font-weight:bold;color:#1f377f;">dto</span>));
    <span style="font-weight:bold;color:#8f08c4;">if</span> (!<span style="color:#2b91af;">Guid</span>.<span style="color:#74531f;">TryParse</span>(<span style="font-weight:bold;color:#1f377f;">id</span>, <span style="color:blue;">out</span> <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">rid</span>))
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>();
    <span style="color:#2b91af;">Reservation</span>? <span style="font-weight:bold;color:#1f377f;">reservation</span> = <span style="font-weight:bold;color:#1f377f;">dto</span>.<span style="font-weight:bold;color:#74531f;">Validate</span>(<span style="font-weight:bold;color:#1f377f;">rid</span>);
    <span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">reservation</span> <span style="color:blue;">is</span> <span style="color:blue;">null</span>)
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">BadRequestResult</span>();
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">restaurant</span> = <span style="color:blue;">await</span> RestaurantDatabase
        .<span style="font-weight:bold;color:#74531f;">GetRestaurant</span>(<span style="font-weight:bold;color:#1f377f;">restaurantId</span>).<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
    <span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">restaurant</span> <span style="color:blue;">is</span> <span style="color:blue;">null</span>)
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>();
    <span style="color:blue;">using</span> <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">scope</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">TransactionScope</span>(
        <span style="color:#2b91af;">TransactionScopeAsyncFlowOption</span>.Enabled);
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">existing</span> = <span style="color:blue;">await</span> Repository
        .<span style="font-weight:bold;color:#74531f;">ReadReservation</span>(<span style="font-weight:bold;color:#1f377f;">restaurant</span>.Id, <span style="font-weight:bold;color:#1f377f;">reservation</span>.Id)
        .<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
    <span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">existing</span> <span style="color:blue;">is</span> <span style="color:blue;">null</span>)
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>();
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">reservations</span> = <span style="color:blue;">await</span> Repository
        .<span style="font-weight:bold;color:#74531f;">ReadReservations</span>(<span style="font-weight:bold;color:#1f377f;">restaurant</span>.Id, <span style="font-weight:bold;color:#1f377f;">reservation</span>.At)
        .<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
    <span style="font-weight:bold;color:#1f377f;">reservations</span> =
        <span style="font-weight:bold;color:#1f377f;">reservations</span>.<span style="font-weight:bold;color:#74531f;">Where</span>(<span style="font-weight:bold;color:#1f377f;">r</span> => <span style="font-weight:bold;color:#1f377f;">r</span>.Id <span style="font-weight:bold;color:#74531f;">!=</span> <span style="font-weight:bold;color:#1f377f;">reservation</span>.Id).<span style="font-weight:bold;color:#74531f;">ToList</span>();
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">now</span> = Clock.<span style="font-weight:bold;color:#74531f;">GetCurrentDateTime</span>();
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">ok</span> = <span style="font-weight:bold;color:#1f377f;">restaurant</span>.MaitreD.<span style="font-weight:bold;color:#74531f;">WillAccept</span>(
        <span style="font-weight:bold;color:#1f377f;">now</span>,
        <span style="font-weight:bold;color:#1f377f;">reservations</span>,
        <span style="font-weight:bold;color:#1f377f;">reservation</span>);
    <span style="font-weight:bold;color:#8f08c4;">if</span> (!<span style="font-weight:bold;color:#1f377f;">ok</span>)
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:#74531f;">NoTables500InternalServerError</span>();
    <span style="color:blue;">await</span> Repository.<span style="font-weight:bold;color:#74531f;">Update</span>(<span style="font-weight:bold;color:#1f377f;">restaurant</span>.Id, <span style="font-weight:bold;color:#1f377f;">reservation</span>)
        .<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
    <span style="font-weight:bold;color:#1f377f;">scope</span>.<span style="font-weight:bold;color:#74531f;">Complete</span>();
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">OkObjectResult</span>(<span style="font-weight:bold;color:#1f377f;">reservation</span>.<span style="font-weight:bold;color:#74531f;">ToDto</span>());
}</pre>
</p>
<p>
While I've definitely seen longer methods in the wild, this variation is already so big that it no longer fits on my laptop screen. I have to scroll up and down to read the whole thing. When looking at the bottom of the method, I have to <em>remember</em> what was at the top, because I can no longer see it.
</p>
<p>
A major point of <a href="/code-that-fits-in-your-head">Code That Fits in Your Head</a> is that what limits programmer productivity is human cognition. If you have to scroll your screen because you can't see the whole method at once, does that fit in your brain? Chances are, it doesn't.
</p>
<p>
Can you spot the Impureim Sandwich now?
</p>
<p>
If you can't, that's understandable. It's not really clear because there are quite a few small decisions being made in this code. You could argue, for example, that this decision is referentially transparent:
</p>
<p>
<pre><span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">existing</span> <span style="color:blue;">is</span> <span style="color:blue;">null</span>)
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>();</pre>
</p>
<p>
These two lines of code are deterministic and have no side effects. The branch only returns a <code>NotFoundResult</code> when <code>existing is null</code>. Additionally, these two lines of code are surrounded by impure actions both before and after. Is this the Sandwich, then?
</p>
<p>
No, it's not. This is how <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a> imperative code looks. To borrow a diagram from <a href="/2020/03/23/repeatable-execution">another article</a>, pure and impure code are interleaved without discipline:
</p>
<p>
<img src="/content/binary/impure-with-stripes-of-purity.png" alt="A box of mostly impure (red) code with vertical stripes of green symbolising pure code.">
</p>
<p>
Even so, the above <code>Put</code> method implements the Functional Core, Imperative Shell architecture. The <code>Put</code> method <em>is</em> the Imperative Shell, but where's the Functional Core?
</p>
<h3 id="e9ccab8ae8234c139934b87238dcf672">
Shell perspective <a href="#e9ccab8ae8234c139934b87238dcf672">#</a>
</h3>
<p>
One thing to be aware of is that when looking at the Imperative Shell code, the Functional Core is close to invisible. This is because it's typically only a single function call.
</p>
<p>
In the above <code>Put</code> method, this is the Functional Core:
</p>
<p>
<pre><span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">ok</span> = <span style="font-weight:bold;color:#1f377f;">restaurant</span>.MaitreD.<span style="font-weight:bold;color:#74531f;">WillAccept</span>(
    <span style="font-weight:bold;color:#1f377f;">now</span>,
    <span style="font-weight:bold;color:#1f377f;">reservations</span>,
    <span style="font-weight:bold;color:#1f377f;">reservation</span>);
<span style="font-weight:bold;color:#8f08c4;">if</span> (!<span style="font-weight:bold;color:#1f377f;">ok</span>)
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:#74531f;">NoTables500InternalServerError</span>();</pre>
</p>
<p>
It's only a few lines of code, and had I not given myself the constraint of staying within an 80 character line width, I could have instead laid it out like this and inlined the <code>ok</code> flag:
</p>
<p>
<pre><span style="font-weight:bold;color:#8f08c4;">if</span> (!<span style="font-weight:bold;color:#1f377f;">restaurant</span>.MaitreD.<span style="font-weight:bold;color:#74531f;">WillAccept</span>(<span style="font-weight:bold;color:#1f377f;">now</span>, <span style="font-weight:bold;color:#1f377f;">reservations</span>, <span style="font-weight:bold;color:#1f377f;">reservation</span>))
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:#74531f;">NoTables500InternalServerError</span>();</pre>
</p>
<p>
Now that I try this, in fact, it turns out that this actually still stays within 80 characters. To be honest, I don't know exactly why I had that former code instead of this, but perhaps I found the latter alternative too dense. Or perhaps I simply didn't think of it. Code is rarely perfect. Usually when I revisit a piece of code after having been away from it for some time, I find something that I want to change.
</p>
<p>
In any case, that's beside the point. What matters here is that when you're looking through the Imperative Shell code, the Functional Core looks insignificant. Blink and you'll miss it. Even if we ignore all the other small pure decisions (the <code>if</code> statements) and pretend that we already have an Impureim Sandwich, from this viewpoint, the architecture <em>looks</em> like this:
</p>
<p>
<img src="/content/binary/impure-tiny-pure-impure-sandwich-box.png" alt="A box with a big red section on top, a thin green sliver middle, and another big red part at the bottom.">
</p>
<p>
It's tempting to ask, then: What's all the fuss about? Why even bother?
</p>
<p>
This is a natural experience for a code reader. After all, if you don't know a code base well, you often start at the entry point to try to understand how the application handles a certain stimulus. Such as an HTTP <code>PUT</code> request. When you do that, you see all of the Imperative Shell code before you see the Functional Core code. This could give you the wrong impression about the balance of responsibility.
</p>
<p>
After all, code like the above <code>Put</code> method has inlined most of the impure code so that it's right in your face. Granted, there's still some code hiding behind, say, <code>Repository.ReadReservations</code>, but a substantial fraction of the imperative code is visible in the method.
</p>
<p>
On the other hand, the Functional Core is just a single function call. If we inlined all of that code, too, the picture might rather look like this:
</p>
<p>
<img src="/content/binary/impure-pure-impure-sandwich-box.png" alt="A box with a thin red slice on top, a thick green middle, and a thin red slice at the bottom.">
</p>
<p>
This obviously depends on the de-facto ratio of pure to imperative code. In any case, inlining the pure code is a thought experiment only, because the whole point of functional architecture is that <a href="/2021/07/28/referential-transparency-fits-in-your-head">a referentially transparent function fits in your head</a>. Regardless of the complexity and amount of code hiding behind that <code>MaitreD.WillAccept</code> function, the return value is <em>equal</em> to the function call. It's the ultimate abstraction.
</p>
<h3 id="f3019f4107254a82b4280753cbbfab5f">
Standard combinators <a href="#f3019f4107254a82b4280753cbbfab5f">#</a>
</h3>
<p>
As I've already suggested, the inlined <code>Put</code> method looks like a Transaction Script. The <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">cyclomatic complexity</a> fortunately hovers around <a href="https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two">the magical number seven</a>, and branching is exclusively organized around <a href="https://en.wikipedia.org/wiki/Guard_(computer_science)">Guard Clauses</a>. Apart from that, there are no nested <code>if</code> statements or <code>for</code> loops.
</p>
<p>
Apart from the Guard Clauses, this mostly looks like a procedure that runs in a straight line from top to bottom. The exception is all those small conditionals that may cause the procedure to exit prematurely. Conditions like this:
</p>
<p>
<pre><span style="font-weight:bold;color:#8f08c4;">if</span> (!<span style="color:#2b91af;">Guid</span>.<span style="color:#74531f;">TryParse</span>(<span style="font-weight:bold;color:#1f377f;">id</span>, <span style="color:blue;">out</span> <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">rid</span>))
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>();</pre>
</p>
<p>
or
</p>
<p>
<pre><span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">reservation</span> <span style="color:blue;">is</span> <span style="color:blue;">null</span>)
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">BadRequestResult</span>();</pre>
</p>
<p>
Such checks occur throughout the method. Each of them is actually a small pure island amidst all the imperative code, but each is ad hoc. Each checks if it's possible for the procedure to continue, and returns a kind of error value if it decides that it's not.
</p>
<p>
Is there a way to model such 'derailments' from the main flow?
</p>
<p>
If you've ever encountered Scott Wlaschin's <a href="https://fsharpforfunandprofit.com/rop/">Railway Oriented Programming</a> you may already see where this is going. Railway-oriented programming is a fantastic metaphor, because it gives you a way to visualize that you have, indeed, a main track, but then you have a side track that you may shuffle some trains to. And once the train is on the side track, it can't go back to the main track.
</p>
<p>
That's how the <a href="/2022/05/09/an-either-monad">Either monad</a> works. Instead of all those ad-hoc <code>if</code> statements, we should be able to replace them with what we may call <em>standard combinators</em>. The most important of these combinators is <a href="/2022/03/28/monads">monadic bind</a>. Composing a Transaction Script like <code>Put</code> with standard combinators will 'hide away' those small decisions, and make the Sandwich nature more apparent.
</p>
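<p>
As a rough illustration of the idea, here's a minimal Python sketch of an Either-like type with a <code>bind</code> method. The names <code>Left</code>, <code>Right</code>, <code>parse_guid</code>, and <code>validate</code> are hypothetical, and the error values are plain strings rather than real HTTP results. It's a sketch of the technique, not the implementation used in the code base.
</p>

```python
from dataclasses import dataclass
from typing import Any
from uuid import UUID


@dataclass(frozen=True)
class Left:
    """The side track: carries an error value."""
    value: Any

    def bind(self, f):
        return self  # once derailed, later steps are skipped


@dataclass(frozen=True)
class Right:
    """The main track: carries a successful value."""
    value: Any

    def bind(self, f):
        return f(self.value)


def parse_guid(candidate):
    try:
        return Right(UUID(candidate))
    except ValueError:
        return Left("404 Not Found")


def validate(quantity):
    # Returns a function so that it can be used as a bind step.
    def step(rid):
        if quantity < 1:
            return Left("400 Bad Request")
        return Right((rid, quantity))
    return step


good = "0f8fad5b-d9cb-469f-a165-70867728950e"
print(parse_guid(good).bind(validate(0)))    # Left(value='400 Bad Request')
print(parse_guid("nope").bind(validate(2)))  # Left(value='404 Not Found')
```

<p>
Notice that neither call site contains an <code>if</code> statement. The decision about whether to continue lives in <code>bind</code>, which is written once.
</p>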
<p>
If we had had pure code, we could just have composed Either-valued functions. Unfortunately, most of what's going on in the <code>Put</code> method happens in a Task-based context. Thankfully, Either is one of those monads that nest well, implying that we can <a href="/2024/11/25/nested-monads">turn the combination into a composed TaskEither monad</a>. The linked article shows the core <code>TaskEither</code> <code>SelectMany</code> implementations.
</p>
<p>
The way to encode all those small decisions between 'main track' and 'side track', then, is to wrap 'naked' values in the desired <code><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>>></code> <a href="https://bartoszmilewski.com/2014/01/14/functors-are-containers/">container</a>:
</p>
<p>
<pre><span style="color:#2b91af;">Task</span>.<span style="color:#74531f;">FromResult</span>(<span style="font-weight:bold;color:#1f377f;">id</span>.<span style="font-weight:bold;color:#74531f;">TryParseGuid</span>().<span style="font-weight:bold;color:#74531f;">OnNull</span>((<span style="color:#2b91af;">ActionResult</span>)<span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>()))</pre>
</p>
<p>
This little code snippet makes use of a few small building blocks that we also need to introduce. First, .NET's standard <code>TryParse</code> APIs don't compose, but since <a href="/2019/07/15/tester-doer-isomorphisms">they're isomorphic to Maybe-valued functions</a>, you can write an adapter like this:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">Guid</span>? <span style="color:#74531f;">TryParseGuid</span>(<span style="color:blue;">this</span> <span style="color:blue;">string</span> <span style="font-weight:bold;color:#1f377f;">candidate</span>)
{
<span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="color:#2b91af;">Guid</span>.<span style="color:#74531f;">TryParse</span>(<span style="font-weight:bold;color:#1f377f;">candidate</span>, <span style="color:blue;">out</span> <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">guid</span>))
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">guid</span>;
<span style="font-weight:bold;color:#8f08c4;">else</span>
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">null</span>;
}</pre>
</p>
<p>
In this code base, I treat <a href="https://learn.microsoft.com/dotnet/csharp/language-reference/builtin-types/nullable-reference-types">nullable reference types</a> as equivalent to the <a href="/2022/04/25/the-maybe-monad">Maybe monad</a>, but if your language doesn't have that feature, you can use Maybe instead.
</p>
<p>
To implement the <code>Put</code> method, however, we don't want nullable (or Maybe) values. We need Either values, so we may introduce a <a href="/2022/07/18/natural-transformations">natural transformation</a>:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>> <span style="color:#74531f;">OnNull</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>>(<span style="color:blue;">this</span> <span style="color:#2b91af;">R</span>? <span style="font-weight:bold;color:#1f377f;">candidate</span>, <span style="color:#2b91af;">L</span> <span style="font-weight:bold;color:#1f377f;">left</span>) <span style="color:blue;">where</span> <span style="color:#2b91af;">R</span> : <span style="color:blue;">struct</span>
{
<span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">candidate</span>.HasValue)
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:#74531f;">Right</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>>(<span style="font-weight:bold;color:#1f377f;">candidate</span>.Value);
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:#74531f;">Left</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>>(<span style="font-weight:bold;color:#1f377f;">left</span>);
}</pre>
</p>
<p>
In <a href="https://www.haskell.org/">Haskell</a> one might just make use of the <a href="https://hackage.haskell.org/package/base/docs/Data-Maybe.html#v:maybe">built-in</a> <a href="/2019/05/20/maybe-catamorphism">Maybe catamorphism</a>:
</p>
<p>
<pre>ghci> maybe (Left "boo!") Right $ Just 123
Right 123
ghci> maybe (Left "boo!") Right $ Nothing
Left "boo!"</pre>
</p>
<p>
Such conversions from <code>Maybe</code> to <code>Either</code> hover just around the <a href="https://wiki.haskell.org/Fairbairn_threshold">Fairbairn threshold</a>, but since we are going to need it more than once, it makes sense to add a specialized <code>OnNull</code> transformation to the C# code base. The one shown here handles <a href="https://learn.microsoft.com/dotnet/csharp/language-reference/builtin-types/nullable-value-types">nullable value types</a>, but the code base also includes an overload that handles nullable reference types. It's almost identical.
</p>
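<p>
The overload for nullable reference types isn't shown in this article. What follows is a sketch of what it might look like; it's an assumption based on the value-type version above, and it uses a simplified stand-in for <code>Either</code> so that the example is self-contained. The <code>Demo</code> class is hypothetical.
</p>
<p>
<pre>#nullable enable
using System;

// Simplified stand-in for an Either type; an assumption for illustration only.
public sealed class Either&lt;L, R&gt;
{
    private readonly bool isRight;
    private readonly L left;
    private readonly R right;

    private Either(bool isRight, L left, R right)
    {
        this.isRight = isRight;
        this.left = left;
        this.right = right;
    }

    public static Either&lt;L, R&gt; Left(L value) =&gt;
        new Either&lt;L, R&gt;(false, value, default!);

    public static Either&lt;L, R&gt; Right(R value) =&gt;
        new Either&lt;L, R&gt;(true, default!, value);

    public T Match&lt;T&gt;(Func&lt;L, T&gt; onLeft, Func&lt;R, T&gt; onRight) =&gt;
        isRight ? onRight(right) : onLeft(left);
}

public static class NullableExtensions
{
    // Assumed reference-type counterpart of OnNull: null becomes Left,
    // anything else becomes Right.
    public static Either&lt;L, R&gt; OnNull&lt;L, R&gt;(this R? candidate, L left)
        where R : class
    {
        if (candidate is null)
            return Either&lt;L, R&gt;.Left(left);
        return Either&lt;L, R&gt;.Right(candidate);
    }
}

public static class Demo
{
    // Hypothetical usage: a missing name maps to a 404 status code.
    public static string Describe(string? name) =&gt;
        name.OnNull(404).Match(
            status =&gt; "Left: " + status,
            value =&gt; "Right: " + value);
}</pre>
</p>
<p>
The two overloads can coexist because <code>R?</code> denotes <code>Nullable&lt;R&gt;</code> when <code>R</code> is constrained to <code>struct</code>, but an annotated <code>R</code> when it's constrained to <code>class</code>, so the parameter types differ.
</p>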
<h3 id="f158fc8250db4f07b1419a044fe23f91">
Support for query syntax <a href="#f158fc8250db4f07b1419a044fe23f91">#</a>
</h3>
<p>
There's more than one way to consume monadic values in C#. While many C# developers like <a href="https://learn.microsoft.com/dotnet/csharp/linq/">LINQ</a>, most seem to prefer the familiar <em>method call syntax</em>; that is, just call the <code>Select</code>, <code>SelectMany</code>, and <code>Where</code> methods as the normal <a href="https://learn.microsoft.com/dotnet/csharp/programming-guide/classes-and-structs/extension-methods">extension methods</a> they are. Another option, however, is to use <a href="https://learn.microsoft.com/dotnet/csharp/linq/get-started/query-expression-basics">query syntax</a>. This is what I'm aiming for here, since it'll make it easier to spot the Impureim Sandwich.
</p>
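<p>
If the two syntaxes are unfamiliar, the following sketch shows the same computation written both ways. It uses plain arrays rather than the types from this article, and the class and method names are made up for the occasion. The point is that the compiler translates consecutive <code>from</code> clauses into a call to a <code>SelectMany</code> overload that takes a result selector.
</p>
<p>
<pre>using System.Linq;

public static class SyntaxComparison
{
    // Query syntax: two from clauses followed by a select.
    public static int[] QuerySyntax(int[] xs, int[] ys) =&gt;
        (from x in xs
         from y in ys
         select x * y).ToArray();

    // Method call syntax: what the query above translates to.
    public static int[] MethodCallSyntax(int[] xs, int[] ys) =&gt;
        xs.SelectMany(x =&gt; ys, (x, y) =&gt; x * y).ToArray();
}</pre>
</p>
<p>
Both methods return the same sequence for the same input. Any type that supplies suitable <code>Select</code> and <code>SelectMany</code> methods can take part in query syntax, not only collections.
</p>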
<p>
You'll see the entire sandwich later in the article. Before that, I'll highlight details and explain how to implement them. You can always scroll down to see the end result, and then scroll back here, if that's more to your liking.
</p>
<p>
The sandwich starts by parsing the <code>id</code> into a <a href="https://learn.microsoft.com/dotnet/api/system.guid">GUID</a> using the above building blocks:
</p>
<p>
<pre><span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">sandwich</span> =
<span style="color:blue;">from</span> rid <span style="color:blue;">in</span> <span style="color:#2b91af;">Task</span>.<span style="color:#74531f;">FromResult</span>(<span style="font-weight:bold;color:#1f377f;">id</span>.<span style="font-weight:bold;color:#74531f;">TryParseGuid</span>().<span style="font-weight:bold;color:#74531f;">OnNull</span>((<span style="color:#2b91af;">ActionResult</span>)<span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>()))</pre>
</p>
<p>
It then immediately proceeds to <code>Validate</code> (<a href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/">parse</a>, really) the <code>dto</code> into a proper Domain Model:
</p>
<p>
<pre><span style="color:blue;">from</span> reservation <span style="color:blue;">in</span> <span style="font-weight:bold;color:#1f377f;">dto</span>.<span style="font-weight:bold;color:#74531f;">Validate</span>(rid).<span style="font-weight:bold;color:#74531f;">OnNull</span>((<span style="color:#2b91af;">ActionResult</span>)<span style="color:blue;">new</span> <span style="color:#2b91af;">BadRequestResult</span>())</pre>
</p>
<p>
Notice that the second <code>from</code> expression doesn't wrap the result with <code>Task.FromResult</code>. How does that work? Is the return value of <code>dto.Validate</code> already a <code>Task</code>? No, this works because I added 'degenerate' <code>SelectMany</code> overloads:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R1</span>>> <span style="color:#74531f;">SelectMany</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>, <span style="color:#2b91af;">R1</span>>(
<span style="color:blue;">this</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>>> <span style="font-weight:bold;color:#1f377f;">source</span>,
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">R</span>, <span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R1</span>>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">source</span>.<span style="font-weight:bold;color:#74531f;">SelectMany</span>(<span style="font-weight:bold;color:#1f377f;">x</span> => <span style="color:#2b91af;">Task</span>.<span style="color:#74531f;">FromResult</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>(<span style="font-weight:bold;color:#1f377f;">x</span>)));
}
<span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R1</span>>> <span style="color:#74531f;">SelectMany</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">U</span>, <span style="color:#2b91af;">R</span>, <span style="color:#2b91af;">R1</span>>(
<span style="color:blue;">this</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>>> <span style="font-weight:bold;color:#1f377f;">source</span>,
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">R</span>, <span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">U</span>>> <span style="font-weight:bold;color:#1f377f;">k</span>,
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">R</span>, <span style="color:#2b91af;">U</span>, <span style="color:#2b91af;">R1</span>> <span style="font-weight:bold;color:#1f377f;">s</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">source</span>.<span style="font-weight:bold;color:#74531f;">SelectMany</span>(<span style="font-weight:bold;color:#1f377f;">x</span> => <span style="font-weight:bold;color:#1f377f;">k</span>(<span style="font-weight:bold;color:#1f377f;">x</span>).<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">y</span> => <span style="font-weight:bold;color:#1f377f;">s</span>(<span style="font-weight:bold;color:#1f377f;">x</span>, <span style="font-weight:bold;color:#1f377f;">y</span>)));
}</pre>
</p>
<p>
Notice that the <code>selector</code> only produces an <code><span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R1</span>></code> value, rather than <code><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R1</span>>></code>. This allows query syntax to 'pick up' the previous value (<code>rid</code>, which is 'really' a <code><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">ActionResult</span>, <span style="color:#2b91af;">Guid</span>>></code>) and continue with a function that doesn't produce a <code>Task</code>, but rather just an <code>Either</code> value. The first of these two overloads then takes that <code>Either</code> value and wraps it with <code>Task.FromResult</code>. The second overload is just the usual <a href="/2019/12/16/zone-of-ceremony">ceremony</a> that enables query syntax.
</p>
<p>
Why, then, doesn't the <code>sandwich</code> use the same trick for <code>rid</code>? Why does it explicitly call <code>Task.FromResult</code>?
</p>
<p>
As far as I can tell, this is because of type inference. It looks as though the C# compiler infers the monad's type from the first expression. If I change the first expression to
</p>
<p>
<pre><span style="color:blue;">from</span> rid <span style="color:blue;">in</span> <span style="font-weight:bold;color:#1f377f;">id</span>.<span style="font-weight:bold;color:#74531f;">TryParseGuid</span>().<span style="font-weight:bold;color:#74531f;">OnNull</span>((<span style="color:#2b91af;">ActionResult</span>)<span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>())</pre>
</p>
<p>
the compiler thinks that the query expression is based on <code><span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>></code>, rather than <code><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>>></code>. This means that once we run into the first <code>Task</code> value, the entire expression no longer works.
</p>
<p>
By explicitly wrapping the first expression in a <code>Task</code>, the compiler correctly infers the monad we'd like it to. If there's a more elegant way to do this, I'm not aware of it.
</p>
<h3 id="066473b442cc4dd4904b43dccd257fa4">
Values that don't fail <a href="#066473b442cc4dd4904b43dccd257fa4">#</a>
</h3>
<p>
The sandwich proceeds to query various databases, using the now-familiar <code>OnNull</code> combinators to transform nullable values to <code>Either</code> values.
</p>
<p>
<pre><span style="color:blue;">from</span> restaurant <span style="color:blue;">in</span> RestaurantDatabase
.<span style="font-weight:bold;color:#74531f;">GetRestaurant</span>(<span style="font-weight:bold;color:#1f377f;">restaurantId</span>)
.<span style="font-weight:bold;color:#74531f;">OnNull</span>((<span style="color:#2b91af;">ActionResult</span>)<span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>())
<span style="color:blue;">from</span> existing <span style="color:blue;">in</span> Repository
.<span style="font-weight:bold;color:#74531f;">ReadReservation</span>(restaurant.Id, reservation.Id)
.<span style="font-weight:bold;color:#74531f;">OnNull</span>((<span style="color:#2b91af;">ActionResult</span>)<span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>())</pre>
</p>
<p>
This works like before because both <code>GetRestaurant</code> and <code>ReadReservation</code> are queries that may fail to return a value. Here's the interface definition of <code>ReadReservation</code>:
</p>
<p>
<pre><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Reservation</span>?> <span style="font-weight:bold;color:#74531f;">ReadReservation</span>(<span style="color:blue;">int</span> <span style="font-weight:bold;color:#1f377f;">restaurantId</span>, <span style="color:#2b91af;">Guid</span> <span style="font-weight:bold;color:#1f377f;">id</span>);</pre>
</p>
<p>
Notice the question mark that indicates that the result may be <code>null</code>.
</p>
<p>
The <code>GetRestaurant</code> method is similar.
</p>
<p>
The next query that the sandwich has to perform, however, is different. The return type of the <code>ReadReservations</code> method is <code><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">Reservation</span>>></code>. Notice that the type contained in the <code>Task</code> is <em>not</em> nullable. Barring database connection errors, this query <a href="/2024/01/29/error-categories-and-category-errors">can't fail</a>. If it finds no data, it returns an empty collection.
</p>
<p>
Since the value isn't nullable, we can't use <code>OnNull</code> to turn it into a <code><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>>></code> value. We could try to use the <code>Right</code> creation function for that.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>> <span style="color:#74531f;">Right</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>>(<span style="color:#2b91af;">R</span> <span style="font-weight:bold;color:#1f377f;">right</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>>.<span style="color:#74531f;">Right</span>(<span style="font-weight:bold;color:#1f377f;">right</span>);
}</pre>
</p>
<p>
This works, but is awkward:
</p>
<p>
<pre><span style="color:blue;">from</span> reservations <span style="color:blue;">in</span> Repository
.<span style="font-weight:bold;color:#74531f;">ReadReservations</span>(restaurant.Id, reservation.At)
.<span style="font-weight:bold;color:#74531f;">Traverse</span>(<span style="font-weight:bold;color:#1f377f;">rs</span> => <span style="color:#2b91af;">Either</span>.<span style="color:#74531f;">Right</span><<span style="color:#2b91af;">ActionResult</span>, <span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">Reservation</span>>>(<span style="font-weight:bold;color:#1f377f;">rs</span>))</pre>
</p>
<p>
The problem with calling <code>Either.Right</code> is that while the compiler can infer which type to use for <code>R</code>, it doesn't know what the <code>L</code> type is. Thus, we have to tell it, and we can't tell it what <code>L</code> is without <em>also</em> telling it what <code>R</code> is. Even though it already knows that.
</p>
<p>
In such scenarios, the <a href="https://fsharp.org/">F#</a> compiler can usually figure it out, and <a href="https://en.wikipedia.org/wiki/Glasgow_Haskell_Compiler">GHC</a> always can (unless you add some exotic language extensions to your code). C# doesn't have any syntax that enables you to tell the compiler about only the type that it doesn't know about, and let it infer the rest.
</p>
<p>
All is not lost, though, because there's a little trick you can use in cases such as this. You <em>can</em> let the C# compiler infer the <code>R</code> type so that you only have to tell it what <code>L</code> is. It's a two-stage process. First, define an extension method on <code>R</code>:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">RightBuilder</span><<span style="color:#2b91af;">R</span>> <span style="color:#74531f;">ToRight</span><<span style="color:#2b91af;">R</span>>(<span style="color:blue;">this</span> <span style="color:#2b91af;">R</span> <span style="font-weight:bold;color:#1f377f;">right</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">RightBuilder</span><<span style="color:#2b91af;">R</span>>(<span style="font-weight:bold;color:#1f377f;">right</span>);
}</pre>
</p>
<p>
The only type argument on this <code>ToRight</code> method is <code>R</code>, and since the <code>right</code> parameter is of the type <code>R</code>, the C# compiler can always infer the type of <code>R</code> from the type of <code>right</code>.
</p>
<p>
What's <code><span style="color:#2b91af;">RightBuilder</span><<span style="color:#2b91af;">R</span>></code>? It's this little auxiliary class:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">RightBuilder</span><<span style="color:#2b91af;">R</span>>
{
<span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">R</span> right;
<span style="color:blue;">public</span> <span style="color:#2b91af;">RightBuilder</span>(<span style="color:#2b91af;">R</span> <span style="font-weight:bold;color:#1f377f;">right</span>)
{
<span style="color:blue;">this</span>.right = <span style="font-weight:bold;color:#1f377f;">right</span>;
}
<span style="color:blue;">public</span> <span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>> <span style="font-weight:bold;color:#74531f;">WithLeft</span><<span style="color:#2b91af;">L</span>>()
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:#2b91af;">Either</span>.<span style="color:#74531f;">Right</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>>(right);
}
}</pre>
</p>
<p>
The code base for <a href="/code-that-fits-in-your-head">Code That Fits in Your Head</a> was written on .NET Core 3.1, but today you could have made this a <a href="https://learn.microsoft.com/dotnet/csharp/language-reference/builtin-types/record">record</a> instead. The only purpose of this class is to break the type inference into two steps so that the <code>R</code> type can be automatically inferred. In this way, you only need to tell the compiler what the <code>L</code> type is.
</p>
<p>
<pre><span style="color:blue;">from</span> reservations <span style="color:blue;">in</span> Repository
.<span style="font-weight:bold;color:#74531f;">ReadReservations</span>(restaurant.Id, reservation.At)
.<span style="font-weight:bold;color:#74531f;">Traverse</span>(<span style="font-weight:bold;color:#1f377f;">rs</span> => <span style="font-weight:bold;color:#1f377f;">rs</span>.<span style="font-weight:bold;color:#74531f;">ToRight</span>().<span style="font-weight:bold;color:#74531f;">WithLeft</span><<span style="color:#2b91af;">ActionResult</span>>())</pre>
</p>
<p>
As indicated, this style of programming isn't language-neutral. Even if you find this little trick neat, I'd much rather have the compiler just figure it out for me. The entire <code>sandwich</code> query expression is already defined as working with <code><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">ActionResult</span>, <span style="color:#2b91af;">R</span>>></code>, and the <code>L</code> type can't change like the <code>R</code> type can. Functional compilers can figure this out, and while I intend this article to show object-oriented programmers how functional programming sometimes works, I don't wish to pretend that it's a good idea to write code like this in C#. I've <a href="/2019/03/18/the-programmer-as-decision-maker">covered that ground already</a>.
</p>
<p>
Not surprisingly, there's a mirror-image <code>ToLeft</code>/<code>WithRight</code> combo, too.
</p>
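<p>
That mirror image isn't listed here, but a sketch might look like the following. The <code>LeftBuilder</code> class name is a guess, and <code>Either</code> is once more a simplified stand-in that keeps the example self-contained.
</p>
<p>
<pre>using System;

// Simplified stand-in for an Either type; an assumption for illustration only.
public sealed class Either&lt;L, R&gt;
{
    private readonly bool isRight;
    private readonly L left;
    private readonly R right;

    private Either(bool isRight, L left, R right)
    {
        this.isRight = isRight;
        this.left = left;
        this.right = right;
    }

    public static Either&lt;L, R&gt; Left(L value) =&gt;
        new Either&lt;L, R&gt;(false, value, default!);

    public static Either&lt;L, R&gt; Right(R value) =&gt;
        new Either&lt;L, R&gt;(true, default!, value);

    public T Match&lt;T&gt;(Func&lt;L, T&gt; onLeft, Func&lt;R, T&gt; onRight) =&gt;
        isRight ? onRight(right) : onLeft(left);
}

// Holds on to the left value until the caller supplies the R type.
public sealed class LeftBuilder&lt;L&gt;
{
    private readonly L left;

    public LeftBuilder(L left)
    {
        this.left = left;
    }

    // Second step: R is given explicitly; L is already known.
    public Either&lt;L, R&gt; WithRight&lt;R&gt;()
    {
        return Either&lt;L, R&gt;.Left(left);
    }
}

public static class EitherBuilders
{
    // First step: L is inferred from the argument.
    public static LeftBuilder&lt;L&gt; ToLeft&lt;L&gt;(this L left)
    {
        return new LeftBuilder&lt;L&gt;(left);
    }
}</pre>
</p>
<p>
An expression like <code>"No tables available".ToLeft().WithRight&lt;int&gt;()</code> then produces an <code>Either&lt;string, int&gt;</code> without having to spell out <code>string</code>.
</p>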
<h3 id="8628f41e4a8d4c6e8d282c5a64ad1c44">
Working with Commands <a href="#8628f41e4a8d4c6e8d282c5a64ad1c44">#</a>
</h3>
<p>
The ultimate goal with the <code>Put</code> method is to modify a row in the database. The method to do that has this interface definition:
</p>
<p>
<pre><span style="color:#2b91af;">Task</span> <span style="font-weight:bold;color:#74531f;">Update</span>(<span style="color:blue;">int</span> <span style="font-weight:bold;color:#1f377f;">restaurantId</span>, <span style="color:#2b91af;">Reservation</span> <span style="font-weight:bold;color:#1f377f;">reservation</span>);</pre>
</p>
<p>
I usually describe that non-generic <a href="https://learn.microsoft.com/dotnet/api/system.threading.tasks.task">Task</a> class as 'asynchronous <code>void</code>' when explaining it to non-C# programmers. The <code>Update</code> method is an asynchronous <a href="https://en.wikipedia.org/wiki/Command%E2%80%93query_separation">Command</a>.
</p>
<p>
<code>Task</code> and <code>void</code> aren't legal values for use with LINQ query syntax, so you have to find a way to work around that limitation. In this case I defined a local helper method to make it look like a Query:
</p>
<p>
<pre><span style="color:blue;">async</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Reservation</span>> <span style="font-weight:bold;color:#74531f;">RunUpdate</span>(<span style="color:blue;">int</span> <span style="font-weight:bold;color:#1f377f;">restaurantId</span>, <span style="color:#2b91af;">Reservation</span> <span style="font-weight:bold;color:#1f377f;">reservation</span>, <span style="color:#2b91af;">TransactionScope</span> <span style="font-weight:bold;color:#1f377f;">scope</span>)
{
<span style="color:blue;">await</span> Repository.<span style="font-weight:bold;color:#74531f;">Update</span>(<span style="font-weight:bold;color:#1f377f;">restaurantId</span>, <span style="font-weight:bold;color:#1f377f;">reservation</span>).<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
<span style="font-weight:bold;color:#1f377f;">scope</span>.<span style="font-weight:bold;color:#74531f;">Complete</span>();
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">reservation</span>;
}</pre>
</p>
<p>
It just echoes back the <code>reservation</code> parameter once the <code>Update</code> has completed. This makes it composable in the larger query expression.
</p>
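<p>
If you need this more than once, the idea generalizes. The following sketch is not part of the code base; <code>Echo</code> is a hypothetical name for an extension method that awaits any asynchronous Command and then returns a value of your choice. Unlike <code>RunUpdate</code>, it knows nothing about transaction scopes.
</p>
<p>
<pre>using System.Threading.Tasks;

public static class CommandExtensions
{
    // Hypothetical helper: turns an asynchronous Command into something
    // that looks like a Query by echoing back the supplied value.
    public static async Task&lt;T&gt; Echo&lt;T&gt;(this Task command, T value)
    {
        await command.ConfigureAwait(false);
        return value;
    }
}</pre>
</p>
<p>
With such a helper, a call like <code>Repository.Update(restaurantId, reservation).Echo(reservation)</code> would have the type <code>Task&lt;Reservation&gt;</code>, which can take part in a query expression.
</p>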
<p>
You'll probably not be surprised when I tell you that both F# and Haskell handle this scenario gracefully, without requiring any hoop-jumping.
</p>
<h3 id="d7c6feabfcb74e2eb5174a9ad3dd9c7f">
Full sandwich <a href="#d7c6feabfcb74e2eb5174a9ad3dd9c7f">#</a>
</h3>
<p>
Those are all the building blocks. Here's the full <code>sandwich</code> definition, colour-coded like the examples in <a href="/2020/03/02/impureim-sandwich">Impureim sandwich</a>.
</p>
<p>
<pre><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">ActionResult</span>, <span style="color:#2b91af;">OkObjectResult</span>>> <span style="font-weight:bold;color:#1f377f;">sandwich</span> =
<span style="background-color: palegreen;"> <span style="color:blue;">from</span> rid <span style="color:blue;">in</span> <span style="color:#2b91af;">Task</span>.<span style="color:#74531f;">FromResult</span>(
<span style="font-weight:bold;color:#1f377f;">id</span>.<span style="font-weight:bold;color:#74531f;">TryParseGuid</span>().<span style="font-weight:bold;color:#74531f;">OnNull</span>((<span style="color:#2b91af;">ActionResult</span>)<span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>()))
<span style="color:blue;">from</span> reservation <span style="color:blue;">in</span>
<span style="font-weight:bold;color:#1f377f;">dto</span>.<span style="font-weight:bold;color:#74531f;">Validate</span>(rid).<span style="font-weight:bold;color:#74531f;">OnNull</span>(
(<span style="color:#2b91af;">ActionResult</span>)<span style="color:blue;">new</span> <span style="color:#2b91af;">BadRequestResult</span>())</span>
<span style="background-color: lightsalmon;"> <span style="color:blue;">from</span> restaurant <span style="color:blue;">in</span> RestaurantDatabase
.<span style="font-weight:bold;color:#74531f;">GetRestaurant</span>(<span style="font-weight:bold;color:#1f377f;">restaurantId</span>)
.<span style="font-weight:bold;color:#74531f;">OnNull</span>((<span style="color:#2b91af;">ActionResult</span>)<span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>())
<span style="color:blue;">from</span> existing <span style="color:blue;">in</span> Repository
.<span style="font-weight:bold;color:#74531f;">ReadReservation</span>(restaurant.Id, reservation.Id)
.<span style="font-weight:bold;color:#74531f;">OnNull</span>((<span style="color:#2b91af;">ActionResult</span>)<span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>())
<span style="color:blue;">from</span> reservations <span style="color:blue;">in</span> Repository
.<span style="font-weight:bold;color:#74531f;">ReadReservations</span>(restaurant.Id, reservation.At)
.<span style="font-weight:bold;color:#74531f;">Traverse</span>(<span style="font-weight:bold;color:#1f377f;">rs</span> => <span style="font-weight:bold;color:#1f377f;">rs</span>.<span style="font-weight:bold;color:#74531f;">ToRight</span>().<span style="font-weight:bold;color:#74531f;">WithLeft</span><<span style="color:#2b91af;">ActionResult</span>>())
<span style="color:blue;">let</span> now = Clock.<span style="font-weight:bold;color:#74531f;">GetCurrentDateTime</span>()</span>
<span style="background-color: palegreen;"> <span style="color:blue;">let</span> reservations2 =
reservations.<span style="font-weight:bold;color:#74531f;">Where</span>(<span style="font-weight:bold;color:#1f377f;">r</span> => <span style="font-weight:bold;color:#1f377f;">r</span>.Id <span style="font-weight:bold;color:#74531f;">!=</span> reservation.Id)
<span style="color:blue;">let</span> ok = restaurant.MaitreD.<span style="font-weight:bold;color:#74531f;">WillAccept</span>(
now,
reservations2,
reservation)
<span style="color:blue;">from</span> reservation2 <span style="color:blue;">in</span>
ok
? reservation.<span style="font-weight:bold;color:#74531f;">ToRight</span>().<span style="font-weight:bold;color:#74531f;">WithLeft</span><<span style="color:#2b91af;">ActionResult</span>>()
: <span style="color:#74531f;">NoTables500InternalServerError</span>().<span style="font-weight:bold;color:#74531f;">ToLeft</span>().<span style="font-weight:bold;color:#74531f;">WithRight</span><<span style="color:#2b91af;">Reservation</span>>()</span>
<span style="background-color: lightsalmon;"> <span style="color:blue;">from</span> reservation3 <span style="color:blue;">in</span>
<span style="font-weight:bold;color:#74531f;">RunUpdate</span>(restaurant.Id, reservation2, <span style="font-weight:bold;color:#1f377f;">scope</span>)
.<span style="font-weight:bold;color:#74531f;">Traverse</span>(<span style="font-weight:bold;color:#1f377f;">r</span> => <span style="font-weight:bold;color:#1f377f;">r</span>.<span style="font-weight:bold;color:#74531f;">ToRight</span>().<span style="font-weight:bold;color:#74531f;">WithLeft</span><<span style="color:#2b91af;">ActionResult</span>>())</span>
<span style="color:blue;">select</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">OkObjectResult</span>(reservation3.<span style="font-weight:bold;color:#74531f;">ToDto</span>());</pre>
</p>
<p>
As is evident from the colour-coding, this isn't quite a sandwich. The structure is honestly more accurately depicted like this:
</p>
<p>
<img src="/content/binary/pure-impure-pure-impure-box.png" alt="A box with green, red, green, and red horizontal tiers.">
</p>
<p>
As I've previously argued, <a href="/2023/10/09/whats-a-sandwich">while the metaphor becomes strained, this still works well as a functional-programming architecture</a>.
</p>
<p>
As defined here, the <code>sandwich</code> value is a <code>Task</code> that must be awaited.
</p>
<p>
<pre><span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">ActionResult</span>, <span style="color:#2b91af;">OkObjectResult</span>> <span style="font-weight:bold;color:#1f377f;">either</span> = <span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">sandwich</span>.<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">either</span>.<span style="font-weight:bold;color:#74531f;">Match</span>(<span style="font-weight:bold;color:#1f377f;">x</span> => <span style="font-weight:bold;color:#1f377f;">x</span>, <span style="font-weight:bold;color:#1f377f;">x</span> => <span style="font-weight:bold;color:#1f377f;">x</span>);</pre>
</p>
<p>
By awaiting the task, we get an <code>Either</code> value. The <code>Put</code> method, on the other hand, must return an <code>ActionResult</code>. How do you turn an <code>Either</code> object into a single object?
</p>
<p>
By pattern matching on it, as the code snippet shows. The <code>L</code> type is already an <code>ActionResult</code>, so we return it without changing it. If C# had had a built-in identity function, I'd have used that, but idiomatically, we instead use the <code><span style="font-weight:bold;color:#1f377f;">x</span> => <span style="font-weight:bold;color:#1f377f;">x</span></code> lambda expression.
</p>
<p>
The same is the case for the <code>R</code> type, because <code>OkObjectResult</code> inherits from <code>ActionResult</code>. The identity expression automatically performs the type conversion for us.
</p>
<p>
This, by the way, is a recurring pattern with Either values that I run into in all languages. You've essentially computed an <code>Either<T, T></code>, with the same type on both sides, and now you just want to return whichever <code>T</code> value is contained in the Either value. You'd think this is such a common pattern that Haskell has a nice abstraction for it, but <a href="https://hoogle.haskell.org/?hoogle=Either%20a%20a%20-%3E%20a">even Hoogle fails to suggest a commonly-accepted function that does this</a>. Apparently, <code>either id id</code> is considered below the Fairbairn threshold, too.
</p>
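<p>
To make the idea concrete, here's a minimal sketch of such a helper in C#. The <code>Either</code> type below is a hypothetical, stripped-down Church encoding written only for this illustration (it's not the type from the code base), and <code>Merge</code> is an assumed name, not an established API. Note that it requires the same type on both sides, whereas the <code>Put</code> method above relies on <code>OkObjectResult</code> being upcast to <code>ActionResult</code>.
</p>
<p>
<pre>using System;

Either<string, string> success = new Right<string, string>("stored");
Either<string, string> failure = new Left<string, string>("rejected");

Console.WriteLine(success.Merge()); // stored
Console.WriteLine(failure.Merge()); // rejected

// Hypothetical minimal Either, for illustration only.
public abstract class Either<L, R>
{
    public abstract T Match<T>(Func<L, T> onLeft, Func<R, T> onRight);
}

public sealed class Left<L, R>(L value) : Either<L, R>
{
    public override T Match<T>(Func<L, T> onLeft, Func<R, T> onRight) => onLeft(value);
}

public sealed class Right<L, R>(R value) : Either<L, R>
{
    public override T Match<T>(Func<L, T> onLeft, Func<R, T> onRight) => onRight(value);
}

public static class EitherExtensions
{
    // The C# counterpart of Haskell's 'either id id':
    // return whichever value the Either contains.
    public static T Merge<T>(this Either<T, T> either) => either.Match(x => x, x => x);
}</pre>
</p>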
<h3 id="fc8f5dd20a494cc297a55ce57c865212">
Conclusion <a href="#fc8f5dd20a494cc297a55ce57c865212">#</a>
</h3>
<p>
This article presents an example of a non-trivial Impureim Sandwich. When I introduced the pattern, I gave a few examples. I'd deliberately chosen these examples to be simple so that they highlighted the structure of the idea. The downside of that didactic choice is that some commenters found the examples too simplistic. Therefore, I think that there's value in going through more complex examples.
</p>
<p>
The code base that accompanies <a href="/code-that-fits-in-your-head">Code That Fits in Your Head</a> is complex enough that it borders on the realistic. It was deliberately written that way, and since I assume that the code base is familiar to readers of the book, I thought it'd be a good resource to show how an Impureim Sandwich might look. I explicitly chose to refactor the <code>Put</code> method, since it's easily the most complicated process in the code base.
</p>
<p>
The benefit of that code base is that it's written in a programming language that reaches a large audience. Thus, for the reader curious about functional programming, I thought that this could also be a useful introduction to some intermediate concepts.
</p>
<p>
As I've commented along the way, however, I wouldn't expect anyone to write production C# code like this. If you're able to do this, you're also able to do it in a language better suited for this programming paradigm.
</p>
</div><hr>
This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>.
<p>
<a href="https://blog.ploeh.dk/2024/12/09/implementation-and-usage-mindsets">Implementation and usage mindsets</a>, 2024-12-09T21:45:00+00:00, Mark Seemann
</p>
<div id="post">
<p>
<em>A one-dimensional take on the enduring static-versus-dynamic debate.</em>
</p>
<p>
It recently occurred to me that one possible explanation for the standing, and probably never-ending, debate about static versus dynamic types may be that the two camps have disjoint perspectives on the kinds of problems their favourite languages help them solve. In short, my hypothesis is that perhaps lovers of dynamically-typed languages often approach a problem from an implementation mindset, whereas proponents of static types emphasize usage.
</p>
<p>
<img src="/content/binary/implementation-versus-usage.png" alt="A question mark in the middle. An arrow from left labelled 'implementation' points to the question mark from a figure indicating a person. Another arrow from the right labelled 'usage' points to the question mark from another figure indicating a person.">
</p>
<p>
I'll expand on this idea here, and then provide examples in two subsequent articles.
</p>
<h3 id="d748f29ae31543fbb6db537711800c62">
Background <a href="#d748f29ae31543fbb6db537711800c62">#</a>
</h3>
<p>
For years I've struggled to understand 'the other side'. While I'm firmly in the statically typed camp, I realize that many highly skilled programmers and thought leaders enjoy, or get great use out of, dynamically typed languages. This worries me, because it <a href="/2021/08/09/am-i-stuck-in-a-local-maximum">might indicate that I'm stuck in a local maximum</a>.
</p>
<p>
In other words, just because I, personally, prefer static types, it doesn't follow that static types are universally better than dynamic types.
</p>
<p>
In reality, it's probably rather the case that we're dealing with a false dichotomy, and that the problem is really multi-dimensional.
</p>
<blockquote>
<p>
"Let me stop you right there: I don't think there is a real dynamic typing versus static typing debate.
</p>
<p>
"What such debates normally are is language X vs language Y debates (where X happens to be dynamic and Y happens to be static)."
</p>
<footer><cite><a href="https://twitter.com/KevlinHenney/status/1425513161252278280">Kevlin Henney</a></cite></footer>
</blockquote>
<p>
Even so, I can't help thinking about such things. Am I missing something?
</p>
<p>
For the past few years, I've dabbled with <a href="https://www.python.org/">Python</a> to see what writing in a popular dynamically typed language is like. It's not a bad language, and I can clearly see how it's attractive. Even so, I'm still frustrated every time I return to some Python code after a few weeks or more. The lack of static types makes it hard for me to pick up, or revisit, old code.
</p>
<h3 id="8b6d87e0536d40b6aaec28d8e6356553">
A question of perspective? <a href="#8b6d87e0536d40b6aaec28d8e6356553">#</a>
</h3>
<p>
Whenever I run into a difference of opinion, I often interpret it as a difference in perspective. Perhaps it's my academic background as an economist, but I consider it a given that people have different motivations, and that incentives influence actions.
</p>
<p>
A related kind of analysis deals with problem definitions. Are we even trying to solve the same problem?
</p>
<p>
I've <a href="/2021/08/09/am-i-stuck-in-a-local-maximum">discussed such questions before, but in a different context</a>. Here, it strikes me that perhaps programmers who gravitate toward dynamically typed languages are focused on a different problem than the other group.
</p>
<p>
Again, I'd like to emphasize that I don't consider the world so black and white in reality. Some developers straddle the two camps, and as the above Kevlin Henney quote suggests, there really aren't only two kinds of languages. <a href="https://en.wikipedia.org/wiki/C_(programming_language)">C</a> and <a href="https://www.haskell.org/">Haskell</a> are both statically typed, but the similarities stop there. Likewise, I don't know if it's fair to put JavaScript and <a href="https://clojure.org/">Clojure</a> in the same bucket.
</p>
<p>
That said, I'd still like to offer the following hypothesis, in the spirit that although <a href="https://en.wikipedia.org/wiki/All_models_are_wrong">all models are wrong</a>, some are useful.
</p>
<p>
The idea is that if you're trying to solve a problem related to <em>implementation</em>, dynamically typed languages may be more suitable. If you're trying to implement an algorithm, or even trying to invent one, a dynamic language seems useful. One year, I did a good chunk of <a href="https://adventofcode.com/">Advent of Code</a> in Python, and didn't find it harder than in Haskell. (I ultimately ran out of steam for reasons unrelated to Python.)
</p>
<p>
On the other hand, if your main focus is <em>usage</em> of your code, perhaps you'll find a statically typed language more useful. At least, I do. I can use the static type system to communicate how my APIs work. How to instantiate my classes. How to call my functions. How return values are shaped. In other words, the preconditions, invariants, and postconditions of my reusable code: <a href="/encapsulation-and-solid/">Encapsulation</a>.
</p>
<h3 id="f0cbf02e11484e9a8c8d0fab9a6463f2">
Examples <a href="#f0cbf02e11484e9a8c8d0fab9a6463f2">#</a>
</h3>
<p>
Some examples may be in order. In the next two articles, I'll first examine how easy it is to implement an algorithm in various programming languages. Then I'll discuss how to encapsulate that algorithm.
</p>
<ul>
<li><a href="/2024/12/23/implementing-rod-cutting">Implementing rod-cutting</a></li>
<li><a href="/2025/01/06/encapsulating-rod-cutting">Encapsulating rod-cutting</a></li>
</ul>
<p>
The articles will both discuss the rod-cutting problem from <a href="/ref/clrs">Introduction to Algorithms</a>, but I'll introduce the problem in the next article.
</p>
<h3 id="97b3e722024b4228924faa2d6ff5d188">
Conclusion <a href="#97b3e722024b4228924faa2d6ff5d188">#</a>
</h3>
<p>
I'd be naive if I believed that a single model can fully explain why some people prefer dynamically typed languages, and others rather like statically typed languages. Even so, suggesting a model helps me understand how to analyze problems.
</p>
<p>
My hypothesis is that dynamically typed languages may be suitable for implementing algorithms, whereas statically typed languages offer better encapsulation.
</p>
<p>
This may be used as a heuristic for 'picking the right tool for the job'. If I need to suss out an algorithm, perhaps I should do it in Python. If, on the other hand, I need to publish a reusable library, perhaps Haskell is a better choice.
</p>
<p>
<strong>Next:</strong> <a href="/2024/12/23/implementing-rod-cutting">Implementing rod-cutting</a>.
</p>
</div><hr>
<p>
<a href="https://blog.ploeh.dk/2024/12/02/short-circuiting-an-asynchronous-traversal">Short-circuiting an asynchronous traversal</a>, 2024-12-02T09:32:00+00:00, Mark Seemann
</p>
<div id="post">
<p>
<em>Another C# example.</em>
</p>
<p>
This article is a continuation of <a href="/2024/11/18/collecting-and-handling-result-values">an earlier post</a> about refactoring a piece of imperative code to a <a href="/2018/11/19/functional-architecture-a-definition">functional architecture</a>. It all started with <a href="https://stackoverflow.com/q/79112836/126014">a Stack Overflow question</a>, but read the previous article, and you'll be up to speed.
</p>
<h3 id="2bf66b90d3ba4dfe980538175b647070">
Imperative outset <a href="#2bf66b90d3ba4dfe980538175b647070">#</a>
</h3>
<p>
To begin, consider this mostly imperative code snippet:
</p>
<p>
<pre><span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">storedItems</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">List</span><<span style="color:#2b91af;">ShoppingListItem</span>>();
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">failedItems</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">List</span><<span style="color:#2b91af;">ShoppingListItem</span>>();
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">state</span> = (<span style="font-weight:bold;color:#1f377f;">storedItems</span>, <span style="font-weight:bold;color:#1f377f;">failedItems</span>, hasError: <span style="color:blue;">false</span>);
<span style="font-weight:bold;color:#8f08c4;">foreach</span> (<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">item</span> <span style="font-weight:bold;color:#8f08c4;">in</span> <span style="font-weight:bold;color:#1f377f;">itemsToUpdate</span>)
{
<span style="color:#2b91af;">OneOf</span><<span style="color:#2b91af;">ShoppingListItem</span>, <span style="color:#2b91af;">NotFound</span>, <span style="color:#2b91af;">Error</span>> <span style="font-weight:bold;color:#1f377f;">updateResult</span> = <span style="color:blue;">await</span> <span style="color:#74531f;">UpdateItem</span>(<span style="font-weight:bold;color:#1f377f;">item</span>, <span style="font-weight:bold;color:#1f377f;">dbContext</span>);
<span style="font-weight:bold;color:#1f377f;">state</span> = <span style="font-weight:bold;color:#1f377f;">updateResult</span>.<span style="font-weight:bold;color:#74531f;">Match</span><(<span style="color:#2b91af;">List</span><<span style="color:#2b91af;">ShoppingListItem</span>>, <span style="color:#2b91af;">List</span><<span style="color:#2b91af;">ShoppingListItem</span>>, <span style="color:blue;">bool</span>)>(
<span style="font-weight:bold;color:#1f377f;">storedItem</span> => { <span style="font-weight:bold;color:#1f377f;">storedItems</span>.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="font-weight:bold;color:#1f377f;">storedItem</span>); <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">state</span>; },
<span style="font-weight:bold;color:#1f377f;">notFound</span> => { <span style="font-weight:bold;color:#1f377f;">failedItems</span>.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="font-weight:bold;color:#1f377f;">item</span>); <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">state</span>; },
<span style="font-weight:bold;color:#1f377f;">error</span> => { <span style="font-weight:bold;color:#1f377f;">state</span>.hasError = <span style="color:blue;">true</span>; <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">state</span>; }
);
<span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">state</span>.hasError)
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:#2b91af;">Results</span>.<span style="color:#74531f;">BadRequest</span>();
}
<span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">dbContext</span>.<span style="font-weight:bold;color:#74531f;">SaveChangesAsync</span>();
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:#2b91af;">Results</span>.<span style="color:#74531f;">Ok</span>(<span style="color:blue;">new</span> <span style="color:#2b91af;">BulkUpdateResult</span>([.. <span style="font-weight:bold;color:#1f377f;">storedItems</span>], [.. <span style="font-weight:bold;color:#1f377f;">failedItems</span>]));</pre>
</p>
<p>
I'll recap a few points from the previous article. Apart from one crucial detail, it's similar to the other post. One has to infer most of the types and APIs, since the original post didn't show more code than that. If you're used to engaging with Stack Overflow questions, however, it's not too hard to figure out what most of the moving parts do.
</p>
<p>
The most non-obvious detail is that the code uses a library called <a href="https://github.com/mcintyre321/OneOf/">OneOf</a>, which supplies general-purpose, but rather abstract, sum types. Both the container type <code>OneOf</code> and the two indicator types <code>NotFound</code> and <code>Error</code> are defined in that library.
</p>
<p>
The <code>Match</code> method implements standard <a href="/2018/05/22/church-encoding">Church encoding</a>, which enables the code to pattern-match on the three alternative values that <code>UpdateItem</code> returns.
</p>
<p>
One more detail also warrants an explicit description: The <code>itemsToUpdate</code> object is an input argument of the type <code><span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">ShoppingListItem</span>></code>.
</p>
<p>
The major difference from before is that now the update process short-circuits on the first <code>Error</code>. If an error occurs, it stops processing the rest of the items. In that case, it now returns <code>Results.BadRequest()</code>, and it <em>doesn't</em> save the changes to <code>dbContext</code>.
</p>
<p>
The implementation makes use of mutable state and undisciplined I/O. How do you refactor it to a more functional design?
</p>
<h3 id="d5b47b3ebb0345ea9b1d2879755bec12">
Short-circuiting traversal <a href="#d5b47b3ebb0345ea9b1d2879755bec12">#</a>
</h3>
<p>
<a href="/2024/11/11/traversals">The standard Traverse function</a> isn't lazy, or rather, it does consume the entire input sequence. Even various <a href="https://www.haskell.org/">Haskell</a> data structures I investigated do that. And yes, I even tried to <code>traverse</code> <a href="https://hackage.haskell.org/package/list-t/docs/ListT.html">ListT</a>. If there's a data structure that you can <code>traverse</code> with deferred execution of I/O-bound actions, I'm not aware of it.
</p>
<p>
That said, all is not lost, but you'll need to implement a more specialized traversal. While consuming the input sequence, the function needs to know when to stop. It can't do that on just any <a href="https://learn.microsoft.com/dotnet/api/system.collections.generic.ienumerable-1">IEnumerable<T></a>, because it has no information about <code>T</code>.
</p>
<p>
If, on the other hand, you specialize the traversal to a sequence of items with more information, the traversal can stop processing when it encounters a particular condition. You could generalize this to, say, <code>IEnumerable<Either<L, R>></code>, but since I already have the OneOf library in scope, I'll use that, instead of implementing or pulling in a general-purpose <a href="/2018/06/11/church-encoded-either">Either</a> data type.
</p>
<p>
In fact, I'll just use a three-way <code>OneOf</code> type compatible with the one that <code>UpdateItem</code> returns.
</p>
<p>
<pre><span style="color:blue;">internal</span> <span style="color:blue;">static</span> <span style="color:blue;">async</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">OneOf</span><<span style="color:#2b91af;">T1</span>, <span style="color:#2b91af;">T2</span>, <span style="color:#2b91af;">Error</span>>>> <span style="color:#74531f;">Sequence</span><<span style="color:#2b91af;">T1</span>, <span style="color:#2b91af;">T2</span>>(
<span style="color:blue;">this</span> <span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">OneOf</span><<span style="color:#2b91af;">T1</span>, <span style="color:#2b91af;">T2</span>, <span style="color:#2b91af;">Error</span>>>> <span style="font-weight:bold;color:#1f377f;">tasks</span>)
{
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">results</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">List</span><<span style="color:#2b91af;">OneOf</span><<span style="color:#2b91af;">T1</span>, <span style="color:#2b91af;">T2</span>, <span style="color:#2b91af;">Error</span>>>();
<span style="font-weight:bold;color:#8f08c4;">foreach</span> (<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">task</span> <span style="font-weight:bold;color:#8f08c4;">in</span> <span style="font-weight:bold;color:#1f377f;">tasks</span>)
{
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">result</span> = <span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">task</span>;
<span style="font-weight:bold;color:#1f377f;">results</span>.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="font-weight:bold;color:#1f377f;">result</span>);
<span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">result</span>.IsT2)
<span style="font-weight:bold;color:#8f08c4;">break</span>;
}
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">results</span>;
}</pre>
</p>
<p>
This implementation doesn't care what <code>T1</code> or <code>T2</code> is, so they're free to be <code>ShoppingListItem</code> and <code>NotFound</code>. The third type argument, on the other hand, must be <code>Error</code>.
</p>
<p>
The <code>if</code> conditional looks a bit odd, but as I wrote, the types that ship with the OneOf library have rather abstract APIs. A three-way <code>OneOf</code> value comes with three case tests called <code>IsT0</code>, <code>IsT1</code>, and <code>IsT2</code>. Notice that the library uses a zero-indexed naming convention for its type parameters. <code>IsT2</code> returns <code>true</code> if the value is the <em>third</em> kind, in this case <code>Error</code>. If a <code>task</code> turns out to produce an <code>Error</code>, the <code>Sequence</code> method adds that one error, but then stops processing any remaining items.
</p>
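<p>
As a minimal sketch of those case tests (assuming the OneOf NuGet package is referenced, and using <code>string</code> as a stand-in for <code>ShoppingListItem</code>), this is how a three-way value reports which case it holds:
</p>
<p>
<pre>using System;
using OneOf;
using OneOf.Types;

// Stand-in for a value returned by UpdateItem; string replaces ShoppingListItem.
OneOf<string, NotFound, Error> value = new Error();

Console.WriteLine(value.IsT0); // False: not the first case (string)
Console.WriteLine(value.IsT1); // False: not the second case (NotFound)
Console.WriteLine(value.IsT2); // True: the third case (Error)</pre>
</p>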
<p>
Some readers may complain that the entire implementation of <code>Sequence</code> is imperative. It hardly matters that much, since the mutation doesn't escape the method. The behaviour is as functional as it's possible to make it. It's fundamentally I/O-bound, so we can't consider it a <a href="https://en.wikipedia.org/wiki/Pure_function">pure function</a>. That said, if we hypothetically imagine that all the <code>tasks</code> are deterministic and have no side effects, the <code>Sequence</code> function does become a pure function when viewed as a black box. From the outside, you can't tell that the implementation is imperative.
</p>
<p>
It <em>is</em> possible to implement <code>Sequence</code> in a proper functional style, and it might make <a href="/2020/01/13/on-doing-katas">a good exercise</a>. I think, however, that it'll be difficult in C#. In <a href="https://fsharp.org/">F#</a> or Haskell I'd use recursion, and while you <em>can</em> do that in C#, I admit that I've lost sight of whether or not <a href="/2015/12/22/tail-recurse">tail recursion</a> is supported by the C# compiler.
</p>
<p>
Be that as it may, the traversal implementation doesn't change.
</p>
<p>
<pre><span style="color:blue;">internal</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">OneOf</span><<span style="color:#2b91af;">TResult</span>, <span style="color:#2b91af;">T2</span>, <span style="color:#2b91af;">Error</span>>>> <span style="color:#74531f;">Traverse</span><<span style="color:#2b91af;">T1</span>, <span style="color:#2b91af;">T2</span>, <span style="color:#2b91af;">TResult</span>>(
<span style="color:blue;">this</span> <span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T1</span>> <span style="font-weight:bold;color:#1f377f;">items</span>,
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T1</span>, <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">OneOf</span><<span style="color:#2b91af;">TResult</span>, <span style="color:#2b91af;">T2</span>, <span style="color:#2b91af;">Error</span>>>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">items</span>.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>).<span style="font-weight:bold;color:#74531f;">Sequence</span>();
}</pre>
</p>
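<p>
The short-circuiting relies on <code>Select</code> using deferred execution: the <code>selector</code> only runs when the <code>foreach</code> loop in <code>Sequence</code> asks for the next element. The following sketch shows that effect in isolation, without OneOf. The <code>Update</code> function is a hypothetical stand-in for <code>UpdateItem</code> that records each call and treats negative numbers as errors.
</p>
<p>
<pre>using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

var started = new List<int>();

// Hypothetical stand-in for UpdateItem: records that it was called,
// and returns false for negative numbers.
Task<bool> Update(int i)
{
    started.Add(i);
    return Task.FromResult(i >= 0);
}

// Select is deferred, so nothing calls Update yet.
IEnumerable<Task<bool>> tasks = new[] { 1, 2, -3, 4, 5 }.Select(Update);

foreach (var task in tasks)
{
    if (!await task)
        break; // Stop enumerating; Update is never called for 4 and 5.
}

Console.WriteLine(string.Join(", ", started)); // 1, 2, -3</pre>
</p>
<p>
Had the input been materialized first, for example with <code>ToList</code>, every update would already have started before the loop had a chance to stop.
</p>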
<p>
You can now <code>Traverse</code> the <code>itemsToUpdate</code>:
</p>
<p>
<pre><span style="color:green;">// Impure</span>
<span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">OneOf</span><<span style="color:#2b91af;">ShoppingListItem</span>, <span style="color:#2b91af;">NotFound</span><<span style="color:#2b91af;">ShoppingListItem</span>>, <span style="color:#2b91af;">Error</span>>> <span style="font-weight:bold;color:#1f377f;">results</span> =
<span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">itemsToUpdate</span>.<span style="font-weight:bold;color:#74531f;">Traverse</span>(<span style="font-weight:bold;color:#1f377f;">item</span> => <span style="color:#74531f;">UpdateItem</span>(<span style="font-weight:bold;color:#1f377f;">item</span>, <span style="font-weight:bold;color:#1f377f;">dbContext</span>));</pre>
</p>
<p>
As the <code>// Impure</code> comment may suggest, this constitutes the first impure layer of an <a href="/2020/03/02/impureim-sandwich">Impureim Sandwich</a>.
</p>
<h3 id="e7d6b741e8e1406b9588a5788df0ff9b">
Aggregating the results <a href="#e7d6b741e8e1406b9588a5788df0ff9b">#</a>
</h3>
<p>
Since the above statement awaits the traversal, the <code>results</code> object is a 'pure' object that can be passed to a pure function. This does, however, assume that <code>ShoppingListItem</code> is an immutable object.
</p>
<p>
The next step must collect results and <code>NotFound</code>-related failures, but contrary to the previous article, it must short-circuit if it encounters an <code>Error</code>. This again suggests an Either-like data structure, but again I'll repurpose a <code>OneOf</code> container. I'll start by defining a <code>seed</code> for an aggregation (a <em>left fold</em>).
</p>
<p>
<pre><span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">seed</span> =
<span style="color:#2b91af;">OneOf</span><(<span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">ShoppingListItem</span>>, <span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">ShoppingListItem</span>>), <span style="color:#2b91af;">Error</span>>
.<span style="color:#74531f;">FromT0</span>(([], []));</pre>
</p>
<p>
This type can be either a tuple or an error. The .NET tendency is often to define an explicit <code>Result<TSuccess, TFailure></code> type, where <code>TSuccess</code> is defined to the left of <code>TFailure</code>. This, for example, is <a href="https://learn.microsoft.com/dotnet/fsharp/language-reference/results">how F# defines Result types</a>, and other .NET libraries tend to emulate that design. That's also what I've done here, although I admit that I'm regularly confused when going back and forth between F# and Haskell, where the <code>Right</code> case is <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatically</a> considered to indicate success.
</p>
<p>
As already discussed, OneOf follows a zero-indexed naming convention for type parameters, so <code>FromT0</code> indicates the first (or leftmost) case. The seed is thus initialized with a tuple that contains two empty sequences.
</p>
<p>
As in the previous article, you can now use the <a href="https://learn.microsoft.com/dotnet/api/system.linq.enumerable.aggregate">Aggregate</a> method to collect the result you want.
</p>
<p>
<pre><span style="color:#2b91af;">OneOf</span><<span style="color:#2b91af;">BulkUpdateResult</span>, <span style="color:#2b91af;">Error</span>> <span style="font-weight:bold;color:#1f377f;">result</span> = <span style="font-weight:bold;color:#1f377f;">results</span>
.<span style="font-weight:bold;color:#74531f;">Aggregate</span>(
<span style="font-weight:bold;color:#1f377f;">seed</span>,
(<span style="font-weight:bold;color:#1f377f;">state</span>, <span style="font-weight:bold;color:#1f377f;">result</span>) =>
<span style="font-weight:bold;color:#1f377f;">result</span>.<span style="font-weight:bold;color:#74531f;">Match</span>(
<span style="font-weight:bold;color:#1f377f;">storedItem</span> => <span style="font-weight:bold;color:#1f377f;">state</span>.<span style="font-weight:bold;color:#74531f;">MapT0</span>(
<span style="font-weight:bold;color:#1f377f;">t</span> => (<span style="font-weight:bold;color:#1f377f;">t</span>.Item1.<span style="font-weight:bold;color:#74531f;">Append</span>(<span style="font-weight:bold;color:#1f377f;">storedItem</span>), <span style="font-weight:bold;color:#1f377f;">t</span>.Item2)),
<span style="font-weight:bold;color:#1f377f;">notFound</span> => <span style="font-weight:bold;color:#1f377f;">state</span>.<span style="font-weight:bold;color:#74531f;">MapT0</span>(
<span style="font-weight:bold;color:#1f377f;">t</span> => (<span style="font-weight:bold;color:#1f377f;">t</span>.Item1, <span style="font-weight:bold;color:#1f377f;">t</span>.Item2.<span style="font-weight:bold;color:#74531f;">Append</span>(<span style="font-weight:bold;color:#1f377f;">notFound</span>.Item))),
<span style="font-weight:bold;color:#1f377f;">e</span> => <span style="font-weight:bold;color:#1f377f;">e</span>))
.<span style="font-weight:bold;color:#74531f;">MapT0</span>(<span style="font-weight:bold;color:#1f377f;">t</span> => <span style="color:blue;">new</span> <span style="color:#2b91af;">BulkUpdateResult</span>(<span style="font-weight:bold;color:#1f377f;">t</span>.Item1.<span style="font-weight:bold;color:#74531f;">ToArray</span>(), <span style="font-weight:bold;color:#1f377f;">t</span>.Item2.<span style="font-weight:bold;color:#74531f;">ToArray</span>()));</pre>
</p>
<p>
This expression is a two-step composition. I'll get back to the concluding <code>MapT0</code> in a moment, but let's first discuss what happens in the <code>Aggregate</code> step. Since the <code>state</code> is now a discriminated union, the big lambda expression not only has to <code>Match</code> on the <code>result</code>, but it also has to deal with the two mutually exclusive cases in which <code>state</code> can be.
</p>
<p>
Although it comes third in the code listing, it may be easiest to explain if we start with the error case. Keep in mind that the <code>seed</code> starts with the optimistic assumption that the operation is going to succeed. If, however, we encounter an error <code>e</code>, we now switch the <code>state</code> to the <code>Error</code> case. Once in that state, it stays there.
</p>
<p>
The two other <code>result</code> cases map over the first (i.e. the success) case, appending the result to the appropriate sequence in the tuple <code>t</code>. Since these expressions map over the first (zero-indexed) case, these updates only run as long as the <code>state</code> is in the success case. If the <code>state</code> is in the error state, these lambda expressions don't run, and the <code>state</code> doesn't change.
</p>
<p>
After having collected the tuple of sequences, the final step is to map over the success case, turning the tuple <code>t</code> into a <code>BulkUpdateResult</code>. That's what <code>MapT0</code> does: It maps over the first (zero-indexed) case, which contains the tuple of sequences. It's a standard <a href="/2018/03/22/functors">functor</a> projection.
</p>
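<p>
To see in isolation how the <code>state</code> stays in the error case once it gets there, consider this simplified sketch. It assumes the OneOf NuGet package, uses <code>int</code> instead of <code>ShoppingListItem</code>, and drops the <code>NotFound</code> case, so it isn't the code from the example above. The <code>Fold</code> helper and the explicit type argument to <code>Match</code> are additions made for the sketch.
</p>
<p>
<pre>using System;
using System.Linq;
using OneOf;
using OneOf.Types;

// Optimistic seed: a (sum, count) tuple in the first (success) case.
var seed = OneOf<(int, int), Error>.FromT0((0, 0));

// Simplified fold: each number updates the tuple; an error replaces the state.
OneOf<(int, int), Error> Fold(OneOf<int, Error>[] results) =>
    results.Aggregate(
        seed,
        (state, result) => result.Match<OneOf<(int, int), Error>>(
            n => state.MapT0(t => (t.Item1 + n, t.Item2 + 1)),
            e => e));

var allGood = Fold(new OneOf<int, Error>[] { 1, 2, 3 });
var failed = Fold(new OneOf<int, Error>[] { 1, new Error(), 3 });

Console.WriteLine(allGood.IsT0); // True
Console.WriteLine(allGood.AsT0); // (6, 3)
Console.WriteLine(failed.IsT1);  // True: the 3 after the error is ignored</pre>
</p>
<p>
In the second call, <code>MapT0</code> doesn't invoke its lambda expression for the final <code>3</code>, because by then the <code>state</code> is no longer in the first case.
</p>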
<h3 id="e4c3b20a30c34b4785ccdd886b20d197">
Saving the changes and returning the results <a href="#e4c3b20a30c34b4785ccdd886b20d197">#</a>
</h3>
<p>
The final, impure step in the sandwich is to save the changes and return the results:
</p>
<p>
<pre><span style="color:green;">// Impure</span>
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">result</span>.<span style="font-weight:bold;color:#74531f;">Match</span>(
<span style="color:blue;">async</span> <span style="font-weight:bold;color:#1f377f;">bulkUpdateResult</span> =>
{
<span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">dbContext</span>.<span style="font-weight:bold;color:#74531f;">SaveChangesAsync</span>();
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:#2b91af;">Results</span>.<span style="color:#74531f;">Ok</span>(<span style="font-weight:bold;color:#1f377f;">bulkUpdateResult</span>);
},
<span style="font-weight:bold;color:#1f377f;">_</span> => <span style="color:#2b91af;">Task</span>.<span style="color:#74531f;">FromResult</span>(<span style="color:#2b91af;">Results</span>.<span style="color:#74531f;">BadRequest</span>()));</pre>
</p>
<p>
Note that it only calls <code>dbContext.SaveChangesAsync()</code> in case the <code>result</code> is a success.
</p>
<h3 id="a6d28bd9d66a4e068bc4cd4ba21dde32">
Accumulating the bulk-update result <a href="#a6d28bd9d66a4e068bc4cd4ba21dde32">#</a>
</h3>
<p>
So far, I've assumed that the final <code>BulkUpdateResult</code> class is just a simple immutable container without much functionality. If, however, we add some copy-and-update functions to it, we can use them to aggregate the result instead of an anonymous tuple.
</p>
<p>
<pre><span style="color:blue;">internal</span> <span style="color:#2b91af;">BulkUpdateResult</span> <span style="font-weight:bold;color:#74531f;">Store</span>(<span style="color:#2b91af;">ShoppingListItem</span> <span style="font-weight:bold;color:#1f377f;">item</span>) =>
    <span style="color:blue;">new</span>([.. StoredItems, <span style="font-weight:bold;color:#1f377f;">item</span>], FailedItems);
<span style="color:blue;">internal</span> <span style="color:#2b91af;">BulkUpdateResult</span> <span style="font-weight:bold;color:#74531f;">Fail</span>(<span style="color:#2b91af;">ShoppingListItem</span> <span style="font-weight:bold;color:#1f377f;">item</span>) =>
    <span style="color:blue;">new</span>(StoredItems, [.. FailedItems, <span style="font-weight:bold;color:#1f377f;">item</span>]);</pre>
</p>
<p>
I would have personally preferred the name <code>NotFound</code> instead of <code>Fail</code>, but I was going with the original post's <code>failedItems</code> terminology, and I thought that it made more sense to call a method <code>Fail</code> when it adds to a collection called <code>FailedItems</code>.
</p>
<p>
Adding these two instance methods to <code>BulkUpdateResult</code> simplifies the composing code:
</p>
<p>
<pre><span style="color:green;">// Pure</span>
<span style="color:#2b91af;">OneOf</span><<span style="color:#2b91af;">BulkUpdateResult</span>, <span style="color:#2b91af;">Error</span>> <span style="font-weight:bold;color:#1f377f;">result</span> = <span style="font-weight:bold;color:#1f377f;">results</span>
    .<span style="font-weight:bold;color:#74531f;">Aggregate</span>(
        <span style="color:#2b91af;">OneOf</span><<span style="color:#2b91af;">BulkUpdateResult</span>, <span style="color:#2b91af;">Error</span>>.<span style="color:#74531f;">FromT0</span>(<span style="color:blue;">new</span>([], [])),
        (<span style="font-weight:bold;color:#1f377f;">state</span>, <span style="font-weight:bold;color:#1f377f;">result</span>) =>
            <span style="font-weight:bold;color:#1f377f;">result</span>.<span style="font-weight:bold;color:#74531f;">Match</span>(
                <span style="font-weight:bold;color:#1f377f;">storedItem</span> => <span style="font-weight:bold;color:#1f377f;">state</span>.<span style="font-weight:bold;color:#74531f;">MapT0</span>(<span style="font-weight:bold;color:#1f377f;">bur</span> => <span style="font-weight:bold;color:#1f377f;">bur</span>.<span style="font-weight:bold;color:#74531f;">Store</span>(<span style="font-weight:bold;color:#1f377f;">storedItem</span>)),
                <span style="font-weight:bold;color:#1f377f;">notFound</span> => <span style="font-weight:bold;color:#1f377f;">state</span>.<span style="font-weight:bold;color:#74531f;">MapT0</span>(<span style="font-weight:bold;color:#1f377f;">bur</span> => <span style="font-weight:bold;color:#1f377f;">bur</span>.<span style="font-weight:bold;color:#74531f;">Fail</span>(<span style="font-weight:bold;color:#1f377f;">notFound</span>.Item)),
                <span style="font-weight:bold;color:#1f377f;">e</span> => <span style="font-weight:bold;color:#1f377f;">e</span>));</pre>
</p>
<p>
This variation starts with an empty <code>BulkUpdateResult</code> and then uses <code>Store</code> or <code>Fail</code> as appropriate to update the state. The final, impure step of the sandwich remains the same.
</p>
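<p>
For context, here's a minimal sketch of what such a <code>BulkUpdateResult</code> could look like as a whole. The record shape, the <code>IReadOnlyCollection</code> property types, and the one-property <code>ShoppingListItem</code> are assumptions made for illustration; only the two methods are taken from the code above.
</p>
<p>
<pre>using System.Collections.Generic;

// Hypothetical stand-in for the real ShoppingListItem class.
public sealed record ShoppingListItem(string Name);

// Assumed shape: an immutable container of two collections.
public sealed record BulkUpdateResult(
    IReadOnlyCollection<ShoppingListItem> StoredItems,
    IReadOnlyCollection<ShoppingListItem> FailedItems)
{
    // Copy-and-update: returns a new value and leaves the receiver unchanged.
    internal BulkUpdateResult Store(ShoppingListItem item) =>
        new([.. StoredItems, item], FailedItems);

    internal BulkUpdateResult Fail(ShoppingListItem item) =>
        new(StoredItems, [.. FailedItems, item]);
}</pre>
</p>
<p>
With such a definition, <code>new BulkUpdateResult([], []).Store(a).Fail(b)</code> produces a value with one stored and one failed item, while the empty starting value stays empty. That's what makes it safe to use as the accumulator in <code>Aggregate</code>.
</p>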
<h3 id="ed88649e2d75403ab654fe7c034b6c1f">
Conclusion <a href="#ed88649e2d75403ab654fe7c034b6c1f">#</a>
</h3>
<p>
It's a bit trickier to implement a short-circuiting traversal than the standard traversal. You can still implement a specialized <code>Sequence</code> or <code>Traverse</code> method, but it requires that the input stream carries enough information to decide when to stop processing more items. In this article, I used a specialized three-way union, but you could generalize this to use a standard Either or Result type.
</p>
</div><hr>
Nested monads
https://blog.ploeh.dk/2024/11/25/nested-monads
2024-11-25T07:31:00+00:00
Mark Seemann
<div id="post">
<p>
<em>You can stack some monads in such a way that the composition is also a monad.</em>
</p>
<p>
This article is part of <a href="/2022/07/11/functor-relationships">a series of articles about functor relationships</a>. In a previous article you learned that <a href="/2024/10/28/functor-compositions">nested functors form a functor</a>. You may have wondered if <a href="/2022/03/28/monads">monads</a> compose in the same way. Does a monad nested in a monad form a monad?
</p>
<p>
As far as I know, there's no universal rule like that, but some monads compose well. Fortunately, it's been my experience that the combinations that you need in practice are among those that exist and are well-known. In a <a href="https://www.haskell.org/">Haskell</a> context, it's often the case that you need to run some kind of 'effect' inside <code>IO</code>. Perhaps you want to use <code>Maybe</code> or <code>Either</code> nested within <code>IO</code>.
</p>
<p>
In .NET, you may run into a similar need to compose task-based programming with an effect. This happens more often in <a href="https://fsharp.org/">F#</a> than in C#, since F# comes with other native monads (<code>option</code> and <code>Result</code>, to name the most common).
</p>
<h3 id="d84f448d09124e31a8fbeb27abe3d826">
Abstract shape <a href="#d84f448d09124e31a8fbeb27abe3d826">#</a>
</h3>
<p>
You'll see some real examples in a moment, but as usual it helps to outline what it is that we're looking for. Imagine that you have a monad. We'll call it <code>F</code> in keeping with tradition. In this article series, you've seen how two or more <a href="/2018/03/22/functors">functors</a> compose. When discussing the abstract shapes of things, we've typically called our two abstract functors <code>F</code> and <code>G</code>. I'll stick to that naming scheme here, because monads are functors (<a href="/2022/03/28/monads">that you can flatten</a>).
</p>
<p>
Now imagine that you have a value that stacks two monads: <code>F<G<T>></code>. If the inner monad <code>G</code> is the 'right' kind of monad, that configuration itself forms a monad.
</p>
<p>
<img src="/content/binary/nested-monads-transformed-to-single-monad.png" alt="Nested monads depicted as concentric circles. To the left the circle F contains the circle G that again contains the circle a. To the right the wider circle FG contains the circle that contains a. An arrow points from the left circles to the right circles.">
</p>
<p>
In the diagram, I've simply named the combined monad <code>FG</code>, which is a naming strategy I've seen in the real world, too: <code>TaskResult</code>, etc.
</p>
<p>
As I've already mentioned, if there's a general theorem that says that this is always possible, I'm not aware of it. On the contrary, I seem to recall reading that this is distinctly not the case, but the source escapes me at the moment. One hint, though, is offered in the documentation of <a href="https://hackage.haskell.org/package/base/docs/Data-Functor-Compose.html">Data.Functor.Compose</a>:
</p>
<blockquote>
<p>
"The composition of applicative functors is always applicative, but the composition of monads is not always a monad."
</p>
</blockquote>
<p>
Thankfully, the monads that you mostly need to compose do, in fact, compose. They include <a href="/2022/04/25/the-maybe-monad">Maybe</a>, <a href="/2022/05/09/an-either-monad">Either</a>, <a href="/2022/06/20/the-state-monad">State</a>, <a href="/2022/11/14/the-reader-monad">Reader</a>, and <a href="/2022/05/16/the-identity-monad">Identity</a> (okay, that one maybe isn't that useful). In other words, for any monad <code>F</code>, nesting e.g. <code>Maybe</code> inside it, that is, <code>F<Maybe<T>></code>, also forms a monad.
</p>
<p>
Notice that it's the 'inner' monad that determines whether composition is possible. Not the 'outer' monad.
</p>
<p>
For what it's worth, I'm basing much of this on my personal experience, which was again helpfully guided by <a href="https://hackage.haskell.org/package/transformers/docs/Control-Monad-Trans-Class.html">Control.Monad.Trans.Class</a>. I don't, however, wish to turn this article into an article about monad transformers, because if you already know Haskell, you can read the documentation and look at examples. And if you don't know Haskell, the specifics of monad transformers don't readily translate to languages like C# or F#.
</p>
<p>
The conclusions do translate, but the specific language mechanics don't.
</p>
<p>
Let's look at some common examples.
</p>
<h3 id="51dcb0d54afc46d7b26b7f4021e08dbc">
TaskMaybe monad <a href="#51dcb0d54afc46d7b26b7f4021e08dbc">#</a>
</h3>
<p>
We'll start with a simple, yet realistic example. The article <a href="/2019/02/11/asynchronous-injection">Asynchronous Injection</a> shows a simple operation that involves reading from a database, making a decision, and potentially writing to the database. The final composition, repeated here for your convenience, is an asynchronous (that is, <code>Task</code>-based) process.
</p>
<p>
<pre><span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">await</span> Repository.<span style="font-weight:bold;color:#74531f;">ReadReservations</span>(<span style="font-weight:bold;color:#1f377f;">reservation</span>.Date)
    .<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">rs</span> => maîtreD.<span style="font-weight:bold;color:#74531f;">TryAccept</span>(<span style="font-weight:bold;color:#1f377f;">rs</span>, <span style="font-weight:bold;color:#1f377f;">reservation</span>))
    .<span style="font-weight:bold;color:#74531f;">SelectMany</span>(<span style="font-weight:bold;color:#1f377f;">m</span> => <span style="font-weight:bold;color:#1f377f;">m</span>.<span style="font-weight:bold;color:#74531f;">Traverse</span>(Repository.<span style="font-weight:bold;color:#74531f;">Create</span>))
    .<span style="font-weight:bold;color:#74531f;">Match</span>(<span style="font-weight:bold;color:#74531f;">InternalServerError</span>(<span style="color:#a31515;">"Table unavailable"</span>), <span style="font-weight:bold;color:#74531f;">Ok</span>);</pre>
</p>
<p>
The problem here is that <code>TryAccept</code> returns <code><span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">Reservation</span>></code>, but since the overall workflow already 'runs in' an <a href="/2022/06/06/asynchronous-monads">asynchronous monad</a> (<code>Task</code>), the monads are now nested as <code><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">T</span>>></code>.
</p>
<p>
The way I dealt with that issue in the above code snippet was to rely on a <a href="/2024/11/11/traversals">traversal</a>, but it's actually an inelegant solution. The way that the <code>SelectMany</code> invocation maps over the <code><span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">Reservation</span>></code> <code>m</code> is awkward. Instead of <a href="/2018/07/02/terse-operators-make-business-code-more-readable">composing a business process</a>, the scaffolding is on display, so to speak. Sometimes this is unavoidable, but at other times, there may be a better way.
</p>
<p>
In my defence, when I wrote that article in 2019 I had another pedagogical goal than teaching nested monads. It turns out, however, that you can rewrite the business process using the <code><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">T</span>>></code> stack as a monad in its own right.
</p>
<p>
A monad needs two functions: <em>return</em> and either <em>bind</em> or <em>join</em>. In C# or F#, you can often treat <em>return</em> as 'implied', in the sense that you can always wrap <code><span style="color:blue;">new</span> <span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">T</span>></code> in a call to <a href="https://learn.microsoft.com/dotnet/api/system.threading.tasks.task.fromresult">Task.FromResult</a>. You'll see that in a moment.
</p>
<p>
While you can be cavalier about monadic <em>return</em>, you'll need to explicitly implement either <em>bind</em> or <em>join</em>. In this case, it turns out that the sample code base already had a <code>SelectMany</code> implementation:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:blue;">async</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">TResult</span>>> <span style="color:#74531f;">SelectMany</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>>(
    <span style="color:blue;">this</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">T</span>>> <span style="font-weight:bold;color:#1f377f;">source</span>,
    <span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">TResult</span>>>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
    <span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">m</span> = <span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">source</span>;
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">m</span>.<span style="font-weight:bold;color:#74531f;">Match</span>(
        <span style="font-weight:bold;color:#1f377f;">nothing</span>: <span style="color:#2b91af;">Task</span>.<span style="color:#74531f;">FromResult</span>(<span style="color:blue;">new</span> <span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">TResult</span>>()),
        <span style="font-weight:bold;color:#1f377f;">just</span>: <span style="font-weight:bold;color:#1f377f;">x</span> => <span style="font-weight:bold;color:#1f377f;">selector</span>(<span style="font-weight:bold;color:#1f377f;">x</span>));
}</pre>
</p>
<p>
The method first awaits the <code>Maybe</code> value, and then proceeds to <code>Match</code> on it. In the <code>nothing</code> case, you see the implicit <em>return</em> being used. In the <code>just</code> case, the <code>SelectMany</code> method calls <code>selector</code> with whatever <code>x</code> value was contained in the <code>Maybe</code> object. The result of calling <code>selector</code> already has the desired type <code><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">TResult</span>>></code>, so the implementation simply returns that value without further ado.
</p>
<p>
This enables you to rewrite the <code>SelectMany</code> call in the business process so that it instead looks like this:
</p>
<p>
<pre><span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">await</span> Repository.<span style="font-weight:bold;color:#74531f;">ReadReservations</span>(<span style="font-weight:bold;color:#1f377f;">reservation</span>.Date)
    .<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">rs</span> => maîtreD.<span style="font-weight:bold;color:#74531f;">TryAccept</span>(<span style="font-weight:bold;color:#1f377f;">rs</span>, <span style="font-weight:bold;color:#1f377f;">reservation</span>))
    .<span style="font-weight:bold;color:#74531f;">SelectMany</span>(<span style="font-weight:bold;color:#1f377f;">r</span> => Repository.<span style="font-weight:bold;color:#74531f;">Create</span>(<span style="font-weight:bold;color:#1f377f;">r</span>).<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">i</span> => <span style="color:blue;">new</span> <span style="color:#2b91af;">Maybe</span><<span style="color:blue;">int</span>>(<span style="font-weight:bold;color:#1f377f;">i</span>)))
    .<span style="font-weight:bold;color:#74531f;">Match</span>(<span style="font-weight:bold;color:#74531f;">InternalServerError</span>(<span style="color:#a31515;">"Table unavailable"</span>), <span style="font-weight:bold;color:#74531f;">Ok</span>);</pre>
</p>
<p>
At first glance, it doesn't look like much of an improvement. To be sure, the lambda expression within the <code>SelectMany</code> method no longer operates on a <code>Maybe</code> value, but rather on the <code>Reservation</code> Domain Model <code>r</code>. On the other hand, we're now saddled with that graceless <code><span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">i</span> => <span style="color:blue;">new</span> <span style="color:#2b91af;">Maybe</span><<span style="color:blue;">int</span>>(<span style="font-weight:bold;color:#1f377f;">i</span>))</code>.
</p>
<p>
Had this been Haskell, we could have made this more succinct by eta reducing the <code>Maybe</code> case constructor and using the <code><$></code> infix operator instead of <code>fmap</code>; something like <code>Just <$> create r</code>. In C#, on the other hand, we can do something that Haskell doesn't allow. We can overload the <code>SelectMany</code> method:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">TResult</span>>> <span style="color:#74531f;">SelectMany</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>>(
    <span style="color:blue;">this</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">T</span>>> <span style="font-weight:bold;color:#1f377f;">source</span>,
    <span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">TResult</span>>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">source</span>.<span style="font-weight:bold;color:#74531f;">SelectMany</span>(<span style="font-weight:bold;color:#1f377f;">x</span> => <span style="font-weight:bold;color:#1f377f;">selector</span>(<span style="font-weight:bold;color:#1f377f;">x</span>).<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">y</span> => <span style="color:blue;">new</span> <span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">TResult</span>>(<span style="font-weight:bold;color:#1f377f;">y</span>)));
}</pre>
</p>
<p>
This overload generalizes the 'pattern' exemplified by the above business process composition. Instead of a specific method call, it now works with any <code>selector</code> function that returns <code><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">TResult</span>></code>. Since <code>selector</code> only returns a <code><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">TResult</span>></code> value, and not a <code><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">TResult</span>>></code> value, as actually required in this nested monad, the overload has to map (that is, <code>Select</code>) the result by wrapping it in a <code><span style="color:blue;">new</span> <span style="color:#2b91af;">Maybe</span><<span style="color:#2b91af;">TResult</span>></code>.
</p>
<p>
This now enables you to improve the business process composition to something more readable.
</p>
<p>
<pre><span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">await</span> Repository.<span style="font-weight:bold;color:#74531f;">ReadReservations</span>(<span style="font-weight:bold;color:#1f377f;">reservation</span>.Date)
    .<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">rs</span> => maîtreD.<span style="font-weight:bold;color:#74531f;">TryAccept</span>(<span style="font-weight:bold;color:#1f377f;">rs</span>, <span style="font-weight:bold;color:#1f377f;">reservation</span>))
    .<span style="font-weight:bold;color:#74531f;">SelectMany</span>(Repository.<span style="font-weight:bold;color:#74531f;">Create</span>)
    .<span style="font-weight:bold;color:#74531f;">Match</span>(<span style="font-weight:bold;color:#74531f;">InternalServerError</span>(<span style="color:#a31515;">"Table unavailable"</span>), <span style="font-weight:bold;color:#74531f;">Ok</span>);</pre>
</p>
<p>
It even turned out to be possible to eta reduce the lambda expression, passing <code>Repository.Create</code> as a method group instead of the (also valid, but more verbose) <code><span style="font-weight:bold;color:#1f377f;">r</span> => Repository.<span style="font-weight:bold;color:#74531f;">Create</span>(<span style="font-weight:bold;color:#1f377f;">r</span>)</code>.
</p>
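<p>
If you'd like to experiment with the two overloads outside the sample code base, the following self-contained sketch may help. The <code>Maybe</code> class and the <code>Select</code> method for <code>Task</code> are simplified stand-ins that I've assumed for illustration, not the definitions from the repository; the two <code>SelectMany</code> methods repeat the ones shown above.
</p>
<p>
<pre>using System;
using System.Threading.Tasks;

// Hypothetical, minimal Maybe. The sample code base has its own.
public sealed class Maybe<T>
{
    private readonly bool hasItem;
    private readonly T item;

    public Maybe() { hasItem = false; item = default!; }
    public Maybe(T item) { hasItem = true; this.item = item; }

    public TResult Match<TResult>(TResult nothing, Func<T, TResult> just) =>
        hasItem ? just(item) : nothing;
}

public static class TaskMaybe
{
    // Functor map for Task; assumed, since the second overload relies on it.
    public static async Task<TResult> Select<T, TResult>(
        this Task<T> source, Func<T, TResult> selector)
    {
        return selector(await source);
    }

    // Monadic bind for the Task<Maybe<T>> stack.
    public static async Task<Maybe<TResult>> SelectMany<T, TResult>(
        this Task<Maybe<T>> source,
        Func<T, Task<Maybe<TResult>>> selector)
    {
        Maybe<T> m = await source;
        return await m.Match(
            nothing: Task.FromResult(new Maybe<TResult>()),
            just: x => selector(x));
    }

    // Convenience overload for selectors that only return Task<TResult>.
    public static Task<Maybe<TResult>> SelectMany<T, TResult>(
        this Task<Maybe<T>> source,
        Func<T, Task<TResult>> selector)
    {
        return source.SelectMany(x => selector(x).Select(y => new Maybe<TResult>(y)));
    }
}

public static class Program
{
    public static async Task Main()
    {
        Task<Maybe<int>> just = Task.FromResult(new Maybe<int>(41));
        Task<Maybe<int>> nothing = Task.FromResult(new Maybe<int>());

        // The lambdas return Task<int>, so the second overload applies.
        Maybe<int> m1 = await just.SelectMany(i => Task.FromResult(i + 1));
        Maybe<int> m2 = await nothing.SelectMany(i => Task.FromResult(i + 1));

        Console.WriteLine(m1.Match("nothing", i => i.ToString())); // 42
        Console.WriteLine(m2.Match("nothing", i => i.ToString())); // nothing
    }
}</pre>
</p>
<p>
Notice that in the empty case, the selector never runs. That's the short-circuiting behaviour you'd expect from Maybe, now carried along inside the <code>Task</code>.
</p>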
<p>
If you're interested in the sample code, I've pushed a branch named <code>use-monad-stack</code> to <a href="https://github.com/ploeh/asynchronous-injection">the GitHub repository</a>.
</p>
<p>
Not surprisingly, the F# <code>bind</code> function is much terser:
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:#74531f;">bind</span> <span style="color:#74531f;">f</span> <span style="font-weight:bold;color:#1f377f;">x</span> = <span style="color:blue;">async</span> {
    <span style="color:blue;">match!</span> <span style="font-weight:bold;color:#1f377f;">x</span> <span style="color:blue;">with</span>
    | <span style="color:#2b91af;">Some</span> <span style="font-weight:bold;color:#1f377f;">x'</span> <span style="color:blue;">-></span> <span style="color:blue;">return!</span> <span style="color:#74531f;">f</span> <span style="font-weight:bold;color:#1f377f;">x'</span>
    | <span style="color:#2b91af;">None</span> <span style="color:blue;">-></span> <span style="color:blue;">return</span> <span style="color:#2b91af;">None</span> }</pre>
</p>
<p>
You can find that particular snippet in the code base that accompanies the article <a href="/2019/12/02/refactoring-registration-flow-to-functional-architecture">Refactoring registration flow to functional architecture</a>, although as far as I can tell, it's not actually in use in that code base. I probably just added it because I could.
</p>
<p>
You can find Haskell examples of combining <a href="https://hackage.haskell.org/package/transformers/docs/Control-Monad-Trans-Maybe.html">MaybeT</a> with <code>IO</code> in various articles on this blog. One of them is <a href="/2017/02/02/dependency-rejection">Dependency rejection</a>.
</p>
<h3 id="74c0764ee623459596700a6462dd5452">
TaskResult monad <a href="#74c0764ee623459596700a6462dd5452">#</a>
</h3>
<p>
A similar, but slightly more complex, example involves nesting Either values in asynchronous workflows. In some languages, such as F#, Either is instead called <a href="https://learn.microsoft.com/dotnet/fsharp/language-reference/results">Result</a>, and asynchronous workflows are modelled by a <code>Task</code> <a href="https://bartoszmilewski.com/2014/01/14/functors-are-containers/">container</a>, as already demonstrated above. Thus, on .NET at least, this nested monad is often called <em>TaskResult</em>, but you may also see <em>AsyncResult</em>, <em>AsyncEither</em>, or other combinations. Depending on the programming language, such names may be used only for modules, and not for the container type itself. In C# or F# code, for example, you may look in vain for a class called <code>TaskResult<T></code>, but rather find a <code>TaskResult</code> static class or module.
</p>
<p>
In C# you can define monadic <em>bind</em> like this:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:blue;">async</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R1</span>>> <span style="color:#74531f;">SelectMany</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>, <span style="color:#2b91af;">R1</span>>(
    <span style="color:blue;">this</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>>> <span style="font-weight:bold;color:#1f377f;">source</span>,
    <span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">R</span>, <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R1</span>>>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">source</span> <span style="color:blue;">is</span> <span style="color:blue;">null</span>)
        <span style="font-weight:bold;color:#8f08c4;">throw</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(<span style="font-weight:bold;color:#1f377f;">source</span>));
    <span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>> <span style="font-weight:bold;color:#1f377f;">x</span> = <span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">source</span>.<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">x</span>.<span style="font-weight:bold;color:#74531f;">Match</span>(
        <span style="font-weight:bold;color:#1f377f;">l</span> => <span style="color:#2b91af;">Task</span>.<span style="color:#74531f;">FromResult</span>(<span style="color:#2b91af;">Either</span>.<span style="color:#74531f;">Left</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R1</span>>(<span style="font-weight:bold;color:#1f377f;">l</span>)),
        <span style="font-weight:bold;color:#1f377f;">selector</span>).<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
}</pre>
</p>
<p>
Here I've again passed the eta-reduced <code>selector</code> straight to the <em>right</em> case of the <code>Either</code> value, but <code><span style="font-weight:bold;color:#1f377f;">r</span> => <span style="font-weight:bold;color:#1f377f;">selector</span>(<span style="font-weight:bold;color:#1f377f;">r</span>)</code> works, too.
</p>
<p>
The <em>left</em> case shows another example of 'implicit monadic <em>return</em>'. I didn't bother defining an explicit <code>Return</code> function, but rather use <code><span style="color:#2b91af;">Task</span>.<span style="color:#74531f;">FromResult</span>(<span style="color:#2b91af;">Either</span>.<span style="color:#74531f;">Left</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R1</span>>(<span style="font-weight:bold;color:#1f377f;">l</span>))</code> to return a <code><span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R1</span>>></code> value.
</p>
<p>
As is usually the case in C#, you'll also need to add a special overload to enable the syntactic sugar of <a href="https://learn.microsoft.com/dotnet/csharp/linq/get-started/query-expression-basics">query expressions</a>:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R1</span>>> <span style="color:#74531f;">SelectMany</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">U</span>, <span style="color:#2b91af;">R</span>, <span style="color:#2b91af;">R1</span>>(
    <span style="color:blue;">this</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">R</span>>> <span style="font-weight:bold;color:#1f377f;">source</span>,
    <span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">R</span>, <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">Either</span><<span style="color:#2b91af;">L</span>, <span style="color:#2b91af;">U</span>>>> <span style="font-weight:bold;color:#1f377f;">k</span>,
    <span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">R</span>, <span style="color:#2b91af;">U</span>, <span style="color:#2b91af;">R1</span>> <span style="font-weight:bold;color:#1f377f;">s</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">source</span>.<span style="font-weight:bold;color:#74531f;">SelectMany</span>(<span style="font-weight:bold;color:#1f377f;">x</span> => <span style="font-weight:bold;color:#1f377f;">k</span>(<span style="font-weight:bold;color:#1f377f;">x</span>).<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">y</span> => <span style="font-weight:bold;color:#1f377f;">s</span>(<span style="font-weight:bold;color:#1f377f;">x</span>, <span style="font-weight:bold;color:#1f377f;">y</span>)));
}</pre>
</p>
<p>
You'll see a comprehensive example using these functions in a future article.
</p>
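<p>
In the meantime, here's a small, self-contained sketch of how the two <code>SelectMany</code> methods enable query syntax. The <code>Either</code> type, the <code>Select</code> method, and the <code>ParseAsync</code> helper are all hypothetical and kept as small as possible; a real code base would have richer definitions, and I've left out the null guard for brevity.
</p>
<p>
<pre>using System;
using System.Threading.Tasks;

// Hypothetical, minimal Either with a Match method.
public abstract class Either<L, R>
{
    public abstract T Match<T>(Func<L, T> onLeft, Func<R, T> onRight);
}

public static class Either
{
    public static Either<L, R> Left<L, R>(L l) => new LeftCase<L, R>(l);
    public static Either<L, R> Right<L, R>(R r) => new RightCase<L, R>(r);

    private sealed class LeftCase<L, R> : Either<L, R>
    {
        private readonly L value;
        public LeftCase(L value) { this.value = value; }
        public override T Match<T>(Func<L, T> onLeft, Func<R, T> onRight) =>
            onLeft(value);
    }

    private sealed class RightCase<L, R> : Either<L, R>
    {
        private readonly R value;
        public RightCase(R value) { this.value = value; }
        public override T Match<T>(Func<L, T> onLeft, Func<R, T> onRight) =>
            onRight(value);
    }
}

public static class TaskEither
{
    // Functor map over the right case; assumed, since the overload needs it.
    public static async Task<Either<L, R1>> Select<L, R, R1>(
        this Task<Either<L, R>> source, Func<R, R1> selector)
    {
        Either<L, R> x = await source.ConfigureAwait(false);
        return x.Match(
            l => Either.Left<L, R1>(l),
            r => Either.Right<L, R1>(selector(r)));
    }

    public static async Task<Either<L, R1>> SelectMany<L, R, R1>(
        this Task<Either<L, R>> source,
        Func<R, Task<Either<L, R1>>> selector)
    {
        Either<L, R> x = await source.ConfigureAwait(false);
        return await x.Match(
            l => Task.FromResult(Either.Left<L, R1>(l)),
            selector).ConfigureAwait(false);
    }

    public static Task<Either<L, R1>> SelectMany<L, U, R, R1>(
        this Task<Either<L, R>> source,
        Func<R, Task<Either<L, U>>> k,
        Func<R, U, R1> s)
    {
        return source.SelectMany(x => k(x).Select(y => s(x, y)));
    }
}

public static class Program
{
    // Hypothetical helper: an asynchronous computation that may fail.
    private static Task<Either<string, int>> ParseAsync(string s) =>
        Task.FromResult(
            int.TryParse(s, out int i)
                ? Either.Right<string, int>(i)
                : Either.Left<string, int>($"Not an integer: {s}"));

    public static async Task Main()
    {
        // The compiler translates the two 'from' clauses to the three-argument SelectMany.
        Either<string, int> sum = await (
            from x in ParseAsync("40")
            from y in ParseAsync("2")
            select x + y);

        Console.WriteLine(sum.Match(l => l, r => r.ToString())); // 42
    }
}</pre>
</p>
<p>
If either input fails to parse, the query short-circuits and the result is the corresponding <em>left</em> value.
</p>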
<p>
In F# I'd often first define a module with a few functions including <code>bind</code>, and then use those implementations to define a <a href="https://learn.microsoft.com/dotnet/fsharp/language-reference/computation-expressions">computation expression</a>, but in <a href="/2016/04/11/async-as-surrogate-io">one article</a>, I jumped straight to the expression builder:
</p>
<p>
<pre><span style="color:blue;">type</span> <span style="color:#4ec9b0;">AsyncEitherBuilder</span> () =
    <span style="color:green;">// Async<Result<'a,'c>> * ('a -> Async<Result<'b,'c>>)</span>
    <span style="color:green;">// -> Async<Result<'b,'c>></span>
    <span style="color:blue;">member</span> this.<span style="color:navy;">Bind</span>(x, <span style="color:navy;">f</span>) =
        <span style="color:blue;">async</span> {
            <span style="color:blue;">let!</span> x' = x
            <span style="color:blue;">match</span> x' <span style="color:blue;">with</span>
            | <span style="color:navy;">Success</span> s <span style="color:blue;">-></span> <span style="color:blue;">return!</span> <span style="color:navy;">f</span> s
            | <span style="color:navy;">Failure</span> f <span style="color:blue;">-></span> <span style="color:blue;">return</span> <span style="color:navy;">Failure</span> f }
    <span style="color:green;">// 'a -> 'a</span>
    <span style="color:blue;">member</span> this.<span style="color:navy;">ReturnFrom</span> x = x

<span style="color:blue;">let</span> asyncEither = <span style="color:#4ec9b0;">AsyncEitherBuilder</span> ()</pre>
</p>
<p>
That article also shows usage examples. Another article, <a href="/2022/02/14/a-conditional-sandwich-example">A conditional sandwich example</a>, shows more examples of using this nested monad, although there, the computation expression is named <code>taskResult</code>.
</p>
<h3 id="e6426619b2ae4f8d97d62edfe9cae0ca">
Stateful computations that may fail <a href="#e6426619b2ae4f8d97d62edfe9cae0ca">#</a>
</h3>
<p>
To be honest, you mostly run into a scenario where nested monads are useful when some kind of 'effect' (errors, mostly) is embedded in an <a href="https://en.wikipedia.org/wiki/Input/output">I/O</a>-bound computation. In Haskell, this means <code>IO</code>, in C# <code>Task</code>, and in F# either <code>Task</code> or <code>Async</code>.
</p>
<p>
Other combinations are possible, but I've rarely encountered a need for additional nested monads outside of Haskell. In multi-paradigmatic languages, you can usually find other good designs that address issues that you may occasionally run into in a purely functional language. The following example is Haskell-only. You can skip it if you don't know or care about Haskell.
</p>
<p>
Imagine that you want to keep track of some statistics related to a software service you offer. If the <a href="https://en.wikipedia.org/wiki/Variance">variance</a> of some number (say, response time) exceeds 10 then you want to issue an alert that the <a href="https://en.wikipedia.org/wiki/Service-level_agreement">SLA</a> was violated. Apparently, in your system, reliability means staying consistent.
</p>
<p>
You have millions of observations, and they keep arriving, so you need an <a href="https://en.wikipedia.org/wiki/Online_algorithm">online algorithm</a>. For average and variance we'll use <a href="https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance">Welford's algorithm</a>.
</p>
<p>
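The following code uses these imports, as well as the <code>NamedFieldPuns</code> language extension, which the <code>Statistics { variance }</code> patterns in the <code>monitor</code> function require:
</p>
<p>
<pre>{-# LANGUAGE NamedFieldPuns #-}
<span style="color:blue;">import</span> Control.Monad
<span style="color:blue;">import</span> Control.Monad.Trans.State.Strict
<span style="color:blue;">import</span> Control.Monad.Trans.Maybe</pre>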
</p>
<p>
First, you can define a data structure to hold the aggregate values required for the algorithm, as well as an initial, empty value:
</p>
<p>
<pre><span style="color:blue;">data</span> Aggregate = Aggregate { count :: Int, meanA :: Double, m2 :: Double } <span style="color:blue;">deriving</span> (<span style="color:#2b91af;">Eq</span>, <span style="color:#2b91af;">Show</span>)
<span style="color:#2b91af;">emptyA</span> <span style="color:blue;">::</span> <span style="color:blue;">Aggregate</span>
emptyA = Aggregate 0 0 0</pre>
</p>
<p>
You can also define a function to update the aggregate values with a new observation:
</p>
<p>
<pre><span style="color:#2b91af;">update</span> <span style="color:blue;">::</span> <span style="color:blue;">Aggregate</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Double</span> <span style="color:blue;">-></span> <span style="color:blue;">Aggregate</span>
update (Aggregate count mean m2) x =
  <span style="color:blue;">let</span> count' = count + 1
      delta = x - mean
      mean' = mean + delta / <span style="color:blue;">fromIntegral</span> count'
      delta2 = x - mean'
      m2' = m2 + delta * delta2
  <span style="color:blue;">in</span> Aggregate count' mean' m2'</pre>
</p>
<p>
Given an existing <code>Aggregate</code> record and a new observation, this function implements the algorithm to calculate a new <code>Aggregate</code> record.
</p>
<p>
The values in an <code>Aggregate</code> record, however, are only intermediary values that you can use to calculate statistics such as mean, variance, and sample variance. You'll need a data type and function to do that, as well:
</p>
<p>
<pre><span style="color:blue;">data</span> Statistics =
  Statistics
    { mean :: Double, variance :: Double, sampleVariance :: Maybe Double }
    <span style="color:blue;">deriving</span> (<span style="color:#2b91af;">Eq</span>, <span style="color:#2b91af;">Show</span>)

<span style="color:#2b91af;">extractStatistics</span> <span style="color:blue;">::</span> <span style="color:blue;">Aggregate</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Maybe</span> <span style="color:blue;">Statistics</span>
extractStatistics (Aggregate count mean m2) =
  <span style="color:blue;">if</span> count < 1 <span style="color:blue;">then</span> Nothing
  <span style="color:blue;">else</span>
    <span style="color:blue;">let</span> variance = m2 / <span style="color:blue;">fromIntegral</span> count
        sampleVariance =
          <span style="color:blue;">if</span> count < 2 <span style="color:blue;">then</span> Nothing <span style="color:blue;">else</span> Just $ m2 / <span style="color:blue;">fromIntegral</span> (count - 1)
    <span style="color:blue;">in</span> Just $ Statistics mean variance sampleVariance</pre>
</p>
<p>
This is where the computation becomes 'failure-prone'. Granted, we only have a real problem when we have zero observations, but this still means that we need to return a <code>Maybe Statistics</code> value in order to avoid division by zero.
</p>
<p>
(There might be other designs that avoid that problem, or you might simply decide to tolerate that edge case and code around it in other ways. I've decided to design the <code>extractStatistics</code> function in this particular way in order to furnish an example. Work with me here.)
</p>
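<p>
To get a feel for how the two functions behave around those edge cases, here's a small, self-contained sketch. It repeats the definitions from above so that it loads on its own in GHCi. The <code>aggregateAll</code> helper is a hypothetical addition for illustration only; it isn't used elsewhere in this article.
</p>
<p>
<pre>import Data.List (foldl')

-- Repeated from above so that the snippet loads on its own.
data Aggregate = Aggregate { count :: Int, meanA :: Double, m2 :: Double } deriving (Eq, Show)

emptyA :: Aggregate
emptyA = Aggregate 0 0 0

update :: Aggregate -> Double -> Aggregate
update (Aggregate count mean m2) x =
  let count' = count + 1
      delta = x - mean
      mean' = mean + delta / fromIntegral count'
      delta2 = x - mean'
      m2' = m2 + delta * delta2
  in Aggregate count' mean' m2'

data Statistics =
  Statistics
    { mean :: Double, variance :: Double, sampleVariance :: Maybe Double }
    deriving (Eq, Show)

extractStatistics :: Aggregate -> Maybe Statistics
extractStatistics (Aggregate count mean m2) =
  if count < 1 then Nothing
  else
    let variance = m2 / fromIntegral count
        sampleVariance =
          if count < 2 then Nothing else Just $ m2 / fromIntegral (count - 1)
    in Just $ Statistics mean variance sampleVariance

-- Hypothetical helper (not part of the article's code):
-- fold a whole list of observations into an Aggregate, one update at a time.
aggregateAll :: [Double] -> Aggregate
aggregateAll = foldl' update emptyA</pre>
</p>
<p>
With zero observations there are no statistics, with one observation there's a variance but no sample variance, and with two observations both are available:
</p>
<p>
<pre>ghci> extractStatistics emptyA
Nothing
ghci> extractStatistics $ aggregateAll [1]
Just (Statistics {mean = 1.0, variance = 0.0, sampleVariance = Nothing})
ghci> extractStatistics $ aggregateAll [1, 7]
Just (Statistics {mean = 4.0, variance = 9.0, sampleVariance = Just 18.0})</pre>
</p>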
<p>
Let's say that as the next step, you'd like to compose these two functions into a single function that adds a new observation, computes the statistics, and also returns the updated <code>Aggregate</code>.
</p>
<p>
You <em>could</em> write it like this:
</p>
<p>
<pre><span style="color:#2b91af;">addAndCompute</span> <span style="color:blue;">::</span> <span style="color:#2b91af;">Double</span> <span style="color:blue;">-></span> <span style="color:blue;">Aggregate</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Maybe</span> (<span style="color:blue;">Statistics</span>, <span style="color:blue;">Aggregate</span>)
addAndCompute x agg = <span style="color:blue;">do</span>
  <span style="color:blue;">let</span> agg' = update agg x
  stats <- extractStatistics agg'
  <span style="color:blue;">return</span> (stats, agg')</pre>
</p>
<p>
This implementation uses <code>do</code> notation to automate handling of <code>Nothing</code> values. Still, it's a bit inelegant with its two <code>agg</code> values only distinguishable by the prime sign after one of them, and the need to explicitly return a tuple of the value and the new state.
</p>
<p>
This is the kind of problem that the State monad addresses. You could instead write the function like this:
</p>
<p>
<pre><span style="color:#2b91af;">addAndCompute</span> <span style="color:blue;">::</span> <span style="color:#2b91af;">Double</span> <span style="color:blue;">-></span> <span style="color:blue;">State</span> <span style="color:blue;">Aggregate</span> (<span style="color:#2b91af;">Maybe</span> <span style="color:blue;">Statistics</span>)
addAndCompute x = <span style="color:blue;">do</span>
  modify $ <span style="color:blue;">flip</span> update x
  gets extractStatistics</pre>
</p>
<p>
You could also write it as a one-liner, but that's already a bit too terse for my liking:
</p>
<p>
<pre><span style="color:#2b91af;">addAndCompute</span> <span style="color:blue;">::</span> <span style="color:#2b91af;">Double</span> <span style="color:blue;">-></span> <span style="color:blue;">State</span> <span style="color:blue;">Aggregate</span> (<span style="color:#2b91af;">Maybe</span> <span style="color:blue;">Statistics</span>)
addAndCompute x = modify (`update` x) >> gets extractStatistics</pre>
</p>
<p>
And if you really hate your co-workers, you can always visit <a href="https://pointfree.io">pointfree.io</a> to entirely obscure that expression, but I digress.
</p>
<p>
The point is that the State monad <a href="/ref/doocautbm">amplifies the essential and eliminates the irrelevant</a>.
</p>
<p>
Now you'd like to add a function that issues an alert if the variance is greater than 10. Again, you <em>could</em> write it like this:
</p>
<p>
<pre><span style="color:#2b91af;">monitor</span> <span style="color:blue;">::</span> <span style="color:#2b91af;">Double</span> <span style="color:blue;">-></span> <span style="color:blue;">State</span> <span style="color:blue;">Aggregate</span> (<span style="color:#2b91af;">Maybe</span> <span style="color:#2b91af;">String</span>)
monitor x = <span style="color:blue;">do</span>
  stats <- addAndCompute x
  <span style="color:blue;">case</span> stats <span style="color:blue;">of</span>
    Just Statistics { variance } -> <span style="color:blue;">return</span> $
      <span style="color:blue;">if</span> 10 < variance
        <span style="color:blue;">then</span> Just <span style="color:#a31515;">"SLA Violation"</span>
        <span style="color:blue;">else</span> Nothing
    Nothing -> <span style="color:blue;">return</span> Nothing</pre>
</p>
<p>
But again, the code is graceless with its explicit handling of <code>Maybe</code> cases. Whenever you see code that matches <code>Maybe</code> cases and maps <code>Nothing</code> to <code>Nothing</code>, your spider sense should be tingling. Could you abstract that away with a functor or monad?
</p>
<p>
Yes you can! You can use the <code>MaybeT</code> monad transformer, which nests <code>Maybe</code> computations inside another monad. In this case <code>State</code>:
</p>
<p>
<pre><span style="color:#2b91af;">monitor</span> <span style="color:blue;">::</span> <span style="color:#2b91af;">Double</span> <span style="color:blue;">-></span> <span style="color:blue;">State</span> <span style="color:blue;">Aggregate</span> (<span style="color:#2b91af;">Maybe</span> <span style="color:#2b91af;">String</span>)
monitor x = runMaybeT $ <span style="color:blue;">do</span>
  Statistics { variance } <- MaybeT $ addAndCompute x
  guard (10 < variance)
  <span style="color:blue;">return</span> <span style="color:#a31515;">"SLA Violation"</span></pre>
</p>
<p>
The function type is the same, but the implementation is much simpler. First, the code lifts the <code>Maybe</code>-valued <code>addAndCompute</code> result into <code>MaybeT</code> and pattern-matches on the <code>variance</code>. Since the code is now 'running in' a <code>Maybe</code>-like context, this line of code only executes if there's a <code>Statistics</code> value to extract. If, on the other hand, <code>addAndCompute</code> returns <code>Nothing</code>, the function already short-circuits there.
</p>
<p>
The <code>guard</code> works just like imperative <a href="https://en.wikipedia.org/wiki/Guard_(computer_science)">Guard Clauses</a>. The third line of code only runs if the <code>variance</code> is greater than 10. In that case, it returns an alert message.
</p>
<p>
The entire <code>do</code> workflow gets unwrapped with <code>runMaybeT</code> so that we get back to a normal stateful computation that may fail.
</p>
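<p>
One detail that's easy to miss is that short-circuiting only affects the <code>Maybe</code> part of the computation. State changes made before the short-circuit are retained, which is why <code>monitor</code> still updates the <code>Aggregate</code> even when it returns <code>Nothing</code>. The following self-contained sketch illustrates this with a deliberately simple, hypothetical <code>countAndReport</code> function that isn't part of the article's code. It also uses <code>lift</code> from <code>Control.Monad.Trans.Class</code>, which isn't among the imports listed above.
</p>
<p>
<pre>import Control.Monad (guard)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Maybe (MaybeT (..))
import Control.Monad.Trans.State.Strict (State, get, modify, runState)

-- Hypothetical example: count every observation,
-- but only report those that are greater than 10.
countAndReport :: Int -> State Int (Maybe String)
countAndReport x = runMaybeT $ do
  lift $ modify (+ 1)  -- Always runs: the counter is incremented for every observation.
  guard (10 < x)       -- Short-circuits to Nothing when x is 10 or less.
  n <- lift get        -- Only runs if the guard passes.
  return $ "Observation " ++ show n ++ " exceeded 10"</pre>
</p>
<p>
Running it with <code>runState</code> shows both the result and the final state. The counter is incremented even for the observation that produces no message:
</p>
<p>
<pre>ghci> runState (countAndReport 3) 0
(Nothing,1)
ghci> runState (countAndReport 3 >> countAndReport 42) 0
(Just "Observation 2 exceeded 10",2)</pre>
</p>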
<p>
Let's try it out:
</p>
<p>
<pre>ghci> (evalState $ monitor 1 >> monitor 7) emptyA
Nothing
ghci> (evalState $ monitor 1 >> monitor 8) emptyA
Just "SLA Violation"</pre>
</p>
<p>
Good, rigorous testing suggests that it's working.
</p>
<h3 id="e67fa8bc1b40459c91c1c8b45595c379">
Conclusion <a href="#e67fa8bc1b40459c91c1c8b45595c379">#</a>
</h3>
<p>
You sometimes run into situations where monads are nested. This mostly happens in I/O-bound computations, where you may have a Maybe or Either value embedded inside <code>Task</code> or <code>IO</code>. This can sometimes make working with the 'inner' monad awkward, but in many cases there's a good solution at hand.
</p>
<p>
Some monads, like Maybe, Either, State, Reader, and Identity, nest nicely inside other monads. Thus, if your 'inner' monad is one of those, you can turn the nested arrangement into a monad in its own right. This may help simplify your code base.
</p>
<p>
In addition to the common monads listed here, there are a few more exotic ones that also play well in a nested configuration. Additionally, if your 'inner' monad is a custom data structure of your own creation, it's up to you to investigate if it nests nicely in another monad. As far as I can tell, though, if you can make it nest in one monad (e.g. Task, Async, or IO) you can probably make it nest in any monad.
</p>
<p>
<strong>Next:</strong> <a href="/2018/01/08/software-design-isomorphisms">Software design isomorphisms</a>.
</p>
</div><hr>
This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>.
Collecting and handling result values
https://blog.ploeh.dk/2024/11/18/collecting-and-handling-result-values
2024-11-18T07:39:00+00:00
Mark Seemann
<div id="post">
<p>
<em>The answer is traverse. It's always traverse.</em>
</p>
<p>
I recently came across <a href="https://stackoverflow.com/q/79112836/126014">a Stack Overflow question</a> about collecting and handling <a href="https://en.wikipedia.org/wiki/Tagged_union">sum types</a> (AKA discriminated unions or, in this case, result types). While the question was tagged <em>functional-programming</em>, the overall structure of the code was so imperative, with so much interleaved <a href="https://en.wikipedia.org/wiki/Input/output">I/O</a>, that it hardly <a href="/2018/11/19/functional-architecture-a-definition">qualified as functional architecture</a>.
</p>
<p>
Instead, I gave <a href="https://stackoverflow.com/a/79112992/126014">an answer which involved a minimal change to the code</a>. Subsequently, the original poster asked to see a more functional version of the code. That's a bit too large a task for a Stack Overflow answer, I think, so I'll do it here on the blog instead.
</p>
<p>
Further comments and discussion on the original post reveal that the poster is interested in two alternatives. I'll start with the alternative that's only discussed, but not shown, in the question. The motivation for this ordering is that this variation is easier to implement than the other one, and I consider it pedagogical to start with the simplest case.
</p>
<p>
I'll do that in this article, and then follow up with another article that covers the short-circuiting case.
</p>
<h3 id="9b3987ad5daf4df48c8155a54fb39318">
Imperative outset <a href="#9b3987ad5daf4df48c8155a54fb39318">#</a>
</h3>
<p>
To begin, consider this mostly imperative code snippet:
</p>
<p>
<pre><span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">storedItems</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">List</span><<span style="color:#2b91af;">ShoppingListItem</span>>();
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">failedItems</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">List</span><<span style="color:#2b91af;">ShoppingListItem</span>>();
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">errors</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">List</span><<span style="color:#2b91af;">Error</span>>();
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">state</span> = (<span style="font-weight:bold;color:#1f377f;">storedItems</span>, <span style="font-weight:bold;color:#1f377f;">failedItems</span>, <span style="font-weight:bold;color:#1f377f;">errors</span>);
<span style="font-weight:bold;color:#8f08c4;">foreach</span> (<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">item</span> <span style="font-weight:bold;color:#8f08c4;">in</span> <span style="font-weight:bold;color:#1f377f;">itemsToUpdate</span>)
{
    <span style="color:#2b91af;">OneOf</span><<span style="color:#2b91af;">ShoppingListItem</span>, <span style="color:#2b91af;">NotFound</span>, <span style="color:#2b91af;">Error</span>> <span style="font-weight:bold;color:#1f377f;">updateResult</span> = <span style="color:blue;">await</span> <span style="color:#74531f;">UpdateItem</span>(<span style="font-weight:bold;color:#1f377f;">item</span>, <span style="font-weight:bold;color:#1f377f;">dbContext</span>);
    <span style="font-weight:bold;color:#1f377f;">state</span> = <span style="font-weight:bold;color:#1f377f;">updateResult</span>.<span style="font-weight:bold;color:#74531f;">Match</span><(<span style="color:#2b91af;">List</span><<span style="color:#2b91af;">ShoppingListItem</span>>, <span style="color:#2b91af;">List</span><<span style="color:#2b91af;">ShoppingListItem</span>>, <span style="color:#2b91af;">List</span><<span style="color:#2b91af;">Error</span>>)>(
        <span style="font-weight:bold;color:#1f377f;">storedItem</span> => { <span style="font-weight:bold;color:#1f377f;">storedItems</span>.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="font-weight:bold;color:#1f377f;">storedItem</span>); <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">state</span>; },
        <span style="font-weight:bold;color:#1f377f;">notFound</span> => { <span style="font-weight:bold;color:#1f377f;">failedItems</span>.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="font-weight:bold;color:#1f377f;">item</span>); <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">state</span>; },
        <span style="font-weight:bold;color:#1f377f;">error</span> => { <span style="font-weight:bold;color:#1f377f;">errors</span>.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="font-weight:bold;color:#1f377f;">error</span>); <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">state</span>; }
        );
}
<span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">dbContext</span>.<span style="font-weight:bold;color:#74531f;">SaveChangesAsync</span>();
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:#2b91af;">Results</span>.<span style="color:#74531f;">Ok</span>(<span style="color:blue;">new</span> <span style="color:#2b91af;">BulkUpdateResult</span>([.. <span style="font-weight:bold;color:#1f377f;">storedItems</span>], [.. <span style="font-weight:bold;color:#1f377f;">failedItems</span>], [.. <span style="font-weight:bold;color:#1f377f;">errors</span>]));</pre>
</p>
<p>
There are quite a few things to take in, and one has to infer most of the types and APIs, since the original post didn't show more code than that. If you're used to engaging with Stack Overflow questions, however, it's not too hard to figure out what most of the moving parts do.
</p>
<p>
The least obvious detail is that the code uses a library called <a href="https://github.com/mcintyre321/OneOf/">OneOf</a>, which supplies general-purpose, but rather abstract, sum types. The container type <code>OneOf</code>, as well as the two indicator types <code>NotFound</code> and <code>Error</code>, are defined in that library.
</p>
<p>
The <code>Match</code> method implements standard <a href="/2018/05/22/church-encoding">Church encoding</a>, which enables the code to pattern-match on the three alternative values that <code>UpdateItem</code> returns.
</p>
<p>
One more detail also warrants an explicit description: The <code>itemsToUpdate</code> object is an input argument of the type <code><span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">ShoppingListItem</span>></code>.
</p>
<p>
The implementation makes use of mutable state and undisciplined I/O. How do you refactor it to a more functional design?
</p>
<h3 id="c4e1b030e919464aa22ade11a511414f">
Standard traversal <a href="#c4e1b030e919464aa22ade11a511414f">#</a>
</h3>
<p>
I'll pretend that we only need to turn the above code snippet into a functional design. Thus, I'm ignoring that the code is most likely part of a larger code base. Because of the implied database interaction, the method isn't a <a href="https://en.wikipedia.org/wiki/Pure_function">pure function</a>. Unless it's a top-level method (that is, at the boundary of the application), it doesn't exemplify larger-scale <a href="/2018/11/19/functional-architecture-a-definition">functional architecture</a>.
</p>
<p>
That said, my goal is to refactor the code to an <a href="/2020/03/02/impureim-sandwich">Impureim Sandwich</a>: Impure actions first, then the meat of the functionality as a pure function, and then some more impure actions to complete the functionality. This strongly suggests that the first step should be to map over <code>itemsToUpdate</code> and call <code>UpdateItem</code> for each.
</p>
<p>
If, however, you do that, you get this:
</p>
<p>
<pre><span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">OneOf</span><<span style="color:#2b91af;">ShoppingListItem</span>, <span style="color:#2b91af;">NotFound</span>, <span style="color:#2b91af;">Error</span>>>> <span style="font-weight:bold;color:#1f377f;">results</span> =
    <span style="font-weight:bold;color:#1f377f;">itemsToUpdate</span>.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">item</span> => <span style="color:#74531f;">UpdateItem</span>(<span style="font-weight:bold;color:#1f377f;">item</span>, <span style="font-weight:bold;color:#1f377f;">dbContext</span>));</pre>
</p>
<p>
The <code>results</code> object is a sequence of tasks. If we consider <a href="/2020/07/27/task-asynchronous-programming-as-an-io-surrogate">Task as a surrogate for IO</a>, each task should be considered impure, as it's either non-deterministic, has side effects, or both. This means that we can't pass <code>results</code> to a pure function, and that frustrates the ambition to structure the code as an Impureim Sandwich.
</p>
<p>
This is one of the most common problems in functional programming, and the answer is usually: Use a <a href="/2024/11/11/traversals">traversal</a>.
</p>
<p>
<pre><span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">OneOf</span><<span style="color:#2b91af;">ShoppingListItem</span>, <span style="color:#2b91af;">NotFound</span><<span style="color:#2b91af;">ShoppingListItem</span>>, <span style="color:#2b91af;">Error</span>>> <span style="font-weight:bold;color:#1f377f;">results</span> =
    <span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">itemsToUpdate</span>.<span style="font-weight:bold;color:#74531f;">Traverse</span>(<span style="font-weight:bold;color:#1f377f;">item</span> => <span style="color:#74531f;">UpdateItem</span>(<span style="font-weight:bold;color:#1f377f;">item</span>, <span style="font-weight:bold;color:#1f377f;">dbContext</span>));</pre>
</p>
<p>
Because this first, impure layer of the sandwich awaits the task, <code>results</code> is now an immutable value that can be passed to the pure step. This, by the way, assumes that <code>ShoppingListItem</code> is immutable, too.
</p>
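<p>
If you know Haskell, the difference between mapping and traversing may be easier to see in the types. This is only a sketch: <code>updateItem</code> is a hypothetical stand-in for <code>UpdateItem</code> that doesn't touch a database, and <code>Either</code> stands in for the three-way <code>OneOf</code> type.
</p>
<p>
<pre>-- Hypothetical stand-in for UpdateItem: 'updates' succeed only for positive IDs.
updateItem :: Int -> IO (Either String Int)
updateItem itemId =
  return $
    if 0 < itemId then Right itemId else Left ("Not found: " ++ show itemId)

-- Mapping produces a list of actions, none of which have run yet.
mapped :: [IO (Either String Int)]
mapped = map updateItem [1, -2, 3]

-- Traversing produces a single action that, when run, yields a list of plain values.
traversed :: IO [Either String Int]
traversed = traverse updateItem [1, -2, 3]</pre>
</p>
<p>
Only the traversed version can be run as a single impure step, after which the resulting list can be handed to a pure function:
</p>
<p>
<pre>ghci> traversed
[Right 1,Left "Not found: -2",Right 3]</pre>
</p>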
<p>
Notice that I adjusted one of the cases of the discriminated union to <code><span style="color:#2b91af;">NotFound</span><<span style="color:#2b91af;">ShoppingListItem</span>></code> rather than just <code>NotFound</code>. While the OneOf library ships with a <code>NotFound</code> type, it doesn't have a generic container of that name, so I defined it myself:
</p>
<p>
<pre><span style="color:blue;">internal</span> <span style="color:blue;">sealed</span> <span style="color:blue;">record</span> <span style="color:#2b91af;">NotFound</span><<span style="color:#2b91af;">T</span>>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">Item</span>);</pre>
</p>
<p>
I added it to make the next step simpler.
</p>
<h3 id="8f0e6fb0f34047ed99c59f6140a2b08f">
Aggregating the results <a href="#8f0e6fb0f34047ed99c59f6140a2b08f">#</a>
</h3>
<p>
The next step is to sort the <code>results</code> into three 'buckets', as it were.
</p>
<p>
<pre><span style="color:green;">// Pure</span>
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">seed</span> =
    (
        <span style="color:#2b91af;">Enumerable</span>.<span style="color:#74531f;">Empty</span><<span style="color:#2b91af;">ShoppingListItem</span>>(),
        <span style="color:#2b91af;">Enumerable</span>.<span style="color:#74531f;">Empty</span><<span style="color:#2b91af;">ShoppingListItem</span>>(),
        <span style="color:#2b91af;">Enumerable</span>.<span style="color:#74531f;">Empty</span><<span style="color:#2b91af;">Error</span>>()
    );
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">result</span> = <span style="font-weight:bold;color:#1f377f;">results</span>.<span style="font-weight:bold;color:#74531f;">Aggregate</span>(
    <span style="font-weight:bold;color:#1f377f;">seed</span>,
    (<span style="font-weight:bold;color:#1f377f;">state</span>, <span style="font-weight:bold;color:#1f377f;">result</span>) =>
        <span style="font-weight:bold;color:#1f377f;">result</span>.<span style="font-weight:bold;color:#74531f;">Match</span>(
            <span style="font-weight:bold;color:#1f377f;">storedItem</span> => (<span style="font-weight:bold;color:#1f377f;">state</span>.Item1.<span style="font-weight:bold;color:#74531f;">Append</span>(<span style="font-weight:bold;color:#1f377f;">storedItem</span>), <span style="font-weight:bold;color:#1f377f;">state</span>.Item2, <span style="font-weight:bold;color:#1f377f;">state</span>.Item3),
            <span style="font-weight:bold;color:#1f377f;">notFound</span> => (<span style="font-weight:bold;color:#1f377f;">state</span>.Item1, <span style="font-weight:bold;color:#1f377f;">state</span>.Item2.<span style="font-weight:bold;color:#74531f;">Append</span>(<span style="font-weight:bold;color:#1f377f;">notFound</span>.Item), <span style="font-weight:bold;color:#1f377f;">state</span>.Item3),
            <span style="font-weight:bold;color:#1f377f;">error</span> => (<span style="font-weight:bold;color:#1f377f;">state</span>.Item1, <span style="font-weight:bold;color:#1f377f;">state</span>.Item2, <span style="font-weight:bold;color:#1f377f;">state</span>.Item3.<span style="font-weight:bold;color:#74531f;">Append</span>(<span style="font-weight:bold;color:#1f377f;">error</span>))));</pre>
</p>
<p>
It's also possible to inline the <code>seed</code> value, but here I defined it in a separate expression in an attempt at making the code a little more readable. I don't know if I succeeded, because regardless of where it goes, it's hardly <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a> to break tuple initialization over multiple lines. I had to, though, because otherwise the code would run <a href="/2019/11/04/the-80-24-rule">too far to the right</a>.
</p>
<p>
The lambda expression handles each <code>result</code> in <code>results</code> and uses <code>Match</code> to append the value to its proper 'bucket'. The outer <code>result</code> is a tuple of the three collections.
</p>
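<p>
For comparison, the same 'bucketing' can be sketched in Haskell as a right fold over a sum type. All names here (<code>UpdateResult</code>, <code>Buckets</code>, and <code>collect</code>) are hypothetical stand-ins for the C# types, not a translation of the actual code base.
</p>
<p>
<pre>-- Hypothetical stand-in for the three OneOf cases.
data UpdateResult item err = Stored item | NotFound item | Failed err

-- Hypothetical stand-in for the three-element tuple.
data Buckets item err = Buckets
  { storedItems :: [item], failedItems :: [item], errors :: [err] }
  deriving (Eq, Show)

-- A right fold that conses onto each bucket keeps the original order.
collect :: [UpdateResult item err] -> Buckets item err
collect = foldr step (Buckets [] [] [])
  where
    step (Stored i) b = b { storedItems = i : storedItems b }
    step (NotFound i) b = b { failedItems = i : failedItems b }
    step (Failed e) b = b { errors = e : errors b }</pre>
</p>
<p>
Each input value ends up in exactly one bucket:
</p>
<p>
<pre>ghci> collect [Stored 1, NotFound 2, Failed "oops", Stored 3]
Buckets {storedItems = [1,3], failedItems = [2], errors = ["oops"]}</pre>
</p>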
<h3 id="035012be047e431d8904686ec9915b8f">
Saving the changes and returning the results <a href="#035012be047e431d8904686ec9915b8f">#</a>
</h3>
<p>
The final, impure step in the sandwich is to save the changes and return the results:
</p>
<p>
<pre><span style="color:green;">// Impure</span>
<span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">dbContext</span>.<span style="font-weight:bold;color:#74531f;">SaveChangesAsync</span>();
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">OkResult</span>(
    <span style="color:blue;">new</span> <span style="color:#2b91af;">BulkUpdateResult</span>([.. <span style="font-weight:bold;color:#1f377f;">result</span>.Item1], [.. <span style="font-weight:bold;color:#1f377f;">result</span>.Item2], [.. <span style="font-weight:bold;color:#1f377f;">result</span>.Item3]));</pre>
</p>
<p>
To be honest, the last line of code is pure, but <a href="/2023/10/09/whats-a-sandwich">that's not unusual</a> when it comes to Impureim Sandwiches.
</p>
<h3 id="178ff7d455e44a619b67d911a6aecba7">
Accumulating the bulk-update result <a href="#178ff7d455e44a619b67d911a6aecba7">#</a>
</h3>
<p>
So far, I've assumed that the final <code>BulkUpdateResult</code> class is just a simple immutable container without much functionality. If, however, we add some copy-and-update functions to it, we can use them to aggregate the result, instead of an anonymous tuple.
</p>
<p>
<pre><span style="color:blue;">internal</span> <span style="color:#2b91af;">BulkUpdateResult</span> <span style="font-weight:bold;color:#74531f;">Store</span>(<span style="color:#2b91af;">ShoppingListItem</span> <span style="font-weight:bold;color:#1f377f;">item</span>) =>
    <span style="color:blue;">new</span>([.. StoredItems, <span style="font-weight:bold;color:#1f377f;">item</span>], FailedItems, Errors);
<span style="color:blue;">internal</span> <span style="color:#2b91af;">BulkUpdateResult</span> <span style="font-weight:bold;color:#74531f;">Fail</span>(<span style="color:#2b91af;">ShoppingListItem</span> <span style="font-weight:bold;color:#1f377f;">item</span>) =>
    <span style="color:blue;">new</span>(StoredItems, [.. FailedItems, <span style="font-weight:bold;color:#1f377f;">item</span>], Errors);
<span style="color:blue;">internal</span> <span style="color:#2b91af;">BulkUpdateResult</span> <span style="font-weight:bold;color:#74531f;">Error</span>(<span style="color:#2b91af;">Error</span> <span style="font-weight:bold;color:#1f377f;">error</span>) =>
    <span style="color:blue;">new</span>(StoredItems, FailedItems, [.. Errors, <span style="font-weight:bold;color:#1f377f;">error</span>]);</pre>
</p>
<p>
I would have personally preferred the name <code>NotFound</code> instead of <code>Fail</code>, but I was going with the original post's <code>failedItems</code> terminology, and I thought that it made more sense to call a method <code>Fail</code> when it adds to a collection called <code>FailedItems</code>.
</p>
<p>
Adding these three instance methods to <code>BulkUpdateResult</code> simplifies the composing code:
</p>
<p>
<pre><span style="color:green;">// Impure</span>
<span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">OneOf</span><<span style="color:#2b91af;">ShoppingListItem</span>, <span style="color:#2b91af;">NotFound</span><<span style="color:#2b91af;">ShoppingListItem</span>>, <span style="color:#2b91af;">Error</span>>> <span style="font-weight:bold;color:#1f377f;">results</span> =
    <span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">itemsToUpdate</span>.<span style="font-weight:bold;color:#74531f;">Traverse</span>(<span style="font-weight:bold;color:#1f377f;">item</span> => <span style="color:#74531f;">UpdateItem</span>(<span style="font-weight:bold;color:#1f377f;">item</span>, <span style="font-weight:bold;color:#1f377f;">dbContext</span>));
<span style="color:green;">// Pure</span>
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">result</span> = <span style="font-weight:bold;color:#1f377f;">results</span>.<span style="font-weight:bold;color:#74531f;">Aggregate</span>(
    <span style="color:blue;">new</span> <span style="color:#2b91af;">BulkUpdateResult</span>([], [], []),
    (<span style="font-weight:bold;color:#1f377f;">state</span>, <span style="font-weight:bold;color:#1f377f;">result</span>) =>
        <span style="font-weight:bold;color:#1f377f;">result</span>.<span style="font-weight:bold;color:#74531f;">Match</span>(
            <span style="font-weight:bold;color:#1f377f;">storedItem</span> => <span style="font-weight:bold;color:#1f377f;">state</span>.<span style="font-weight:bold;color:#74531f;">Store</span>(<span style="font-weight:bold;color:#1f377f;">storedItem</span>),
            <span style="font-weight:bold;color:#1f377f;">notFound</span> => <span style="font-weight:bold;color:#1f377f;">state</span>.<span style="font-weight:bold;color:#74531f;">Fail</span>(<span style="font-weight:bold;color:#1f377f;">notFound</span>.Item),
            <span style="font-weight:bold;color:#1f377f;">error</span> => <span style="font-weight:bold;color:#1f377f;">state</span>.<span style="font-weight:bold;color:#74531f;">Error</span>(<span style="font-weight:bold;color:#1f377f;">error</span>)));
<span style="color:green;">// Impure</span>
<span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">dbContext</span>.<span style="font-weight:bold;color:#74531f;">SaveChangesAsync</span>();
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">OkResult</span>(<span style="font-weight:bold;color:#1f377f;">result</span>);</pre>
</p>
<p>
This variation starts with an empty <code>BulkUpdateResult</code> and then uses <code>Store</code>, <code>Fail</code>, or <code>Error</code> as appropriate to update the state.
</p>
<h3 id="32e680ea1dbb4bc7bc097e8fcfcb90e9">
Parallel Sequence <a href="#32e680ea1dbb4bc7bc097e8fcfcb90e9">#</a>
</h3>
<p>
If the tasks you want to traverse are thread-safe, you might consider making the traversal concurrent. You can use <a href="https://learn.microsoft.com/dotnet/api/system.threading.tasks.task.whenall">Task.WhenAll</a> for that. It has the same type as <code>Sequence</code>, so if you can live with the extra non-determinism that comes with parallel execution, you can use that instead:
</p>
<p>
<pre><span style="color:blue;">internal</span> <span style="color:blue;">static</span> <span style="color:blue;">async</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>>> <span style="color:#74531f;">Sequence</span><<span style="color:#2b91af;">T</span>>(<span style="color:blue;">this</span> <span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">T</span>>> <span style="font-weight:bold;color:#1f377f;">tasks</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">await</span> <span style="color:#2b91af;">Task</span>.<span style="color:#74531f;">WhenAll</span>(<span style="font-weight:bold;color:#1f377f;">tasks</span>);
}</pre>
</p>
<p>
Since the method signature doesn't change, the rest of the code remains unchanged.
</p>
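<p>
To get a feel for the non-determinism, consider the following small sketch, which is not part of the original code base. The <code>Delayed</code> helper and the delays are made up for the illustration. The tasks will most likely complete in the opposite order of how they were started, but the array returned by <code>Task.WhenAll</code> still lists the results in the order of the input tasks. Only the order of the side effects varies.
</p>

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// Records the order in which the tasks actually complete.
var completionOrder = new ConcurrentQueue<int>();

// Hypothetical stand-in for an impure action such as a database call.
async Task<int> Delayed(int value, int milliseconds)
{
    await Task.Delay(milliseconds);
    completionOrder.Enqueue(value);
    return value;
}

// The first task is the slowest, so it is expected to finish last.
var tasks = new[] { Delayed(1, 300), Delayed(2, 150), Delayed(3, 10) };
int[] results = await Task.WhenAll(tasks);

// Results follow the order of the input tasks: prints "1, 2, 3".
Console.WriteLine(string.Join(", ", results));

// The completion order depends on timing, so no output is promised here.
Console.WriteLine(string.Join(", ", completionOrder));
```

<p>
If the actions write to a shared resource, it's that second, timing-dependent order that you have to be able to live with.
</p>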
<h3 id="a54fe20498bd4aca99d7d4184209a4df">
Conclusion <a href="#a54fe20498bd4aca99d7d4184209a4df">#</a>
</h3>
<p>
One of the most common stumbling blocks in functional programming is when you have a collection of values, and you need to perform an impure action (typically I/O) for each. This leaves you with a collection of impure values (<code>Task</code> in C#, <code>Task</code> or <code>Async</code> in <a href="https://fsharp.org/">F#</a>, <code>IO</code> in <a href="https://www.haskell.org/">Haskell</a>, etc.). What you actually need is a single impure value that contains the collection of results.
</p>
<p>
The solution to this kind of problem is to <em>traverse</em> the collection, rather than mapping over it (with <code>Select</code>, <code>map</code>, <code>fmap</code>, or similar). Note that computer scientists often talk about <em>traversing</em> a data structure like a <a href="https://en.wikipedia.org/wiki/Tree_(abstract_data_type)">tree</a>. This is a less well-defined use of the word, and not directly related. That said, you <em>can</em> also write <code>Traverse</code> and <code>Sequence</code> functions for trees.
</p>
<p>
This article used a Stack Overflow question as the starting point for an example showing how to refactor imperative code to an Impureim Sandwich.
</p>
<p>
This completes the first variation requested in the Stack Overflow question.
</p>
<p>
<strong>Next:</strong> <a href="/2024/12/02/short-circuiting-an-asynchronous-traversal">Short-circuiting an asynchronous traversal</a>.
</p>
</div><hr>
Traversals
https://blog.ploeh.dk/2024/11/11/traversals
2024-11-11T07:45:00+00:00
Mark Seemann
<div id="post">
<p>
<em>How to convert a list of tasks into an asynchronous list, and similar problems.</em>
</p>
<p>
This article is part of <a href="/2022/07/11/functor-relationships">a series of articles about functor relationships</a>. In a previous article you learned about <a href="/2022/07/18/natural-transformations">natural transformations</a>, and then how <a href="/2018/03/22/functors">functors</a> compose. You can skip several of them if you like, but you might find the one about <a href="/2024/10/28/functor-compositions">functor compositions</a> relevant. Still, this article can be read independently of the rest of the series.
</p>
<p>
You can go a long way with just a single functor or <a href="/2022/03/28/monads">monad</a>. Consider how useful C#'s LINQ API is, or similar kinds of APIs in other languages - typically <code>map</code> and <code>flatMap</code> methods. These APIs work exclusively with the <a href="/2022/04/19/the-list-monad">List monad</a> (which is also a functor). Working with lists, sequences, or collections is so useful that many languages have other kinds of special syntax specifically aimed at working with multiple values: <a href="https://en.wikipedia.org/wiki/List_comprehension">List comprehension</a>.
</p>
<p>
<a href="/2022/06/06/asynchronous-monads">Asynchronous monads</a> like <a href="https://docs.microsoft.com/dotnet/api/system.threading.tasks.task-1">Task<T></a> or <a href="https://fsharp.org/">F#</a>'s <a href="https://fsharp.github.io/fsharp-core-docs/reference/fsharp-control-fsharpasync-1.html">Async<'T></a> are another kind of functor so useful in their own right that languages have special <code>async</code> and <code>await</code> keywords to compose them.
</p>
<p>
Sooner or later, though, you run into situations where you'd like to combine two different functors.
</p>
<h3 id="ebf67a9789e44ad8997832e1ac7c17da">
Lists and tasks <a href="#ebf67a9789e44ad8997832e1ac7c17da" title="permalink">#</a>
</h3>
<p>
It's not unusual to combine collections and asynchrony. If you make an asynchronous database query, you could easily receive something like <code>Task<IEnumerable<Reservation>></code>. This, in isolation, hardly causes problems, but things get more interesting when you need to compose multiple reads.
</p>
<p>
Consider a query like this:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> Task<Foo> Read(<span style="color:blue;">int</span> id)</pre>
</p>
<p>
What happens if you have a collection of IDs that you'd like to read? This happens:
</p>
<p>
<pre><span style="color:blue;">var</span> ids = <span style="color:blue;">new</span>[] { 42, 1337, 2112 };
IEnumerable<Task<Foo>> fooTasks = ids.Select(id => Foo.Read(id));</pre>
</p>
<p>
You get a collection of Tasks, which may be awkward because you can't <code>await</code> it. Perhaps you'd prefer a single Task that contains a collection: <code>Task<IEnumerable<Foo>></code>. In other words, you'd like to flip the functors:
</p>
<p>
<pre>IEnumerable<Task<Foo>>
Task<IEnumerable<Foo>></pre>
</p>
<p>
The top type is what you have. The bottom type is what you'd like to have.
</p>
<p>
The combination of asynchrony and collections is so common that .NET has special methods to do that. I'll briefly mention one of these later, but what's the <em>general</em> solution to this problem?
</p>
<p>
Whenever you need to flip two functors, you need a <em>traversal</em>.
</p>
<h3 id="b962041a5e3d4eb9ba5101641407ca3f">
Sequence <a href="#b962041a5e3d4eb9ba5101641407ca3f" title="permalink">#</a>
</h3>
<p>
As is almost always the case, we can look to <a href="https://www.haskell.org/">Haskell</a> for a canonical definition of traversals - or, as the type class is called: <a href="https://hackage.haskell.org/package/base/docs/Data-Traversable.html">Traversable</a>.
</p>
<p>
A <em>traversable functor</em> is a functor that enables you to flip that functor and another functor, like the above C# example. In more succinct syntax:
</p>
<p>
<pre>t (f a) -> f (t a)</pre>
</p>
<p>
Here, <code>t</code> symbolises any traversable functor (like <code>IEnumerable<T></code> in the above C# example), and <code>f</code> is another functor (like <code>Task<T></code>, above). By flipping the functors I mean making <code>t</code> and <code>f</code> change places; just like <code>IEnumerable</code> and <code>Task</code>, above.
</p>
<p>
Thinking of <a href="https://bartoszmilewski.com/2014/01/14/functors-are-containers/">functors as containers</a> we might depict the function like this:
</p>
<p>
<img src="/content/binary/traversal-sequence.png" alt="Nested functors depicted as concentric circles. To the left the circle t contains the circle f that again contains the circle a. To the right the circle f contains the circle t that again contains the circle a. An arrow points from the left circles to the right circles.">
</p>
<p>
To the left, we have an outer functor <code>t</code> (e.g. <code>IEnumerable</code>) that contains another functor <code>f</code> (e.g. <code>Task</code>) that again 'contains' values of type <code>a</code> (in C# typically called <code>T</code>). We'd like to flip how the containers are nested so that <code>f</code> contains <code>t</code>.
</p>
<p>
Contrary to what you might expect, the function that does that isn't called <em>traverse</em>; it's called <em>sequence</em>. (For those readers who are interested in Haskell specifics, the function I'm going to be talking about is actually called <a href="https://hackage.haskell.org/package/base/docs/Data-Traversable.html#v:sequenceA">sequenceA</a>. There's also a function called <a href="https://hackage.haskell.org/package/base/docs/Data-Traversable.html#v:sequence">sequence</a>, but it's not as general. The reason for the odd names is related to the evolution of various Haskell type classes.)
</p>
<p>
The <em>sequence</em> function doesn't work for any old functor. First, <code>t</code> has to be a <em>traversable functor</em>. We'll get back to that later. Second, <code>f</code> has to be an <a href="/2018/10/01/applicative-functors">applicative functor</a>. (To be honest, I'm not sure if this is <em>always</em> required, or if it's possible to produce an example of a specific functor that isn't applicative, but where it's still possible to implement a <em>sequence</em> function. The Haskell <code>sequenceA</code> function has <code>Applicative f</code> as a constraint, but as far as I can tell, this only means that this is a <em>sufficient</em> requirement - not that it's necessary.)
</p>
<p>
Since tasks (e.g. <code>Task<T></code>) are applicative functors (they are, because <a href="/2022/06/06/asynchronous-monads">they are monads</a>, and <a href="/2022/03/28/monads">all monads are applicative functors</a>), that second requirement is fulfilled for the above example. I'll show you how to implement a <code>Sequence</code> function in C# and how to use it, and then we'll return to the general discussion of what a traversable functor is:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> Task<IEnumerable<T>> Sequence<<span style="color:#2b91af;">T</span>>(
    <span style="color:blue;">this</span> IEnumerable<Task<T>> source)
{
    <span style="color:blue;">return</span> source.Aggregate(
        Task.FromResult(Enumerable.Empty<T>()),
        <span style="color:blue;">async</span> (acc, t) =>
        {
            <span style="color:blue;">var</span> xs = <span style="color:blue;">await</span> acc;
            <span style="color:blue;">var</span> x = <span style="color:blue;">await</span> t;
            <span style="color:blue;">return</span> xs.Concat(<span style="color:blue;">new</span>[] { x });
        });
}</pre>
</p>
<p>
This <code>Sequence</code> function enables you to flip any <code>IEnumerable<Task<T>></code> to a <code>Task<IEnumerable<T>></code>, including the above <code>fooTasks</code>:
</p>
<p>
<pre>Task<IEnumerable<Foo>> foosTask = fooTasks.Sequence();</pre>
</p>
<p>
You can also implement <code>sequence</code> in F#:
</p>
<p>
<pre><span style="color:green;">// Async<'a> list -> Async<'a list></span>
<span style="color:blue;">let</span> sequence asyncs =
    <span style="color:blue;">let</span> go acc t = async {
        <span style="color:blue;">let!</span> xs = acc
        <span style="color:blue;">let!</span> x = t
        <span style="color:blue;">return</span> List.append xs [x] }
    List.fold go (fromValue []) asyncs</pre>
</p>
<p>
and use it like this:
</p>
<p>
<pre><span style="color:blue;">let</span> fooTasks = ids |> List.map Foo.Read
<span style="color:blue;">let</span> foosTask = fooTasks |> Async.sequence</pre>
</p>
<p>
For this example, I put the <code>sequence</code> function in a local <code>Async</code> module; it's not part of any published <code>Async</code> module.
</p>
<p>
These C# and F# examples are specific translations: From lists of tasks to a task of list. If you need another translation, you'll have to write a new function for that particular combination of functors. Haskell has more general capabilities, so that you don't have to write functions for all combinations. I'm not assuming that you know Haskell, however, so I'll proceed with the description.
</p>
<h3 id="d63d059d841b4d9783f42c0360b21662">
Traversable functor <a href="#d63d059d841b4d9783f42c0360b21662" title="permalink">#</a>
</h3>
<p>
The <em>sequence</em> function requires that the 'other' functor (the one that's <em>not</em> the traversable functor) is an applicative functor, but what about the traversable functor itself? What does it take to be a traversable functor?
</p>
<p>
I have to admit that I have to rely on Haskell specifics to a greater extent than normal. For most other concepts and abstractions in <a href="/2017/10/04/from-design-patterns-to-category-theory">the overall article series</a>, I've been able to draw on various sources, chief of which is <a href="https://bartoszmilewski.com/2014/10/28/category-theory-for-programmers-the-preface/">Category Theory for Programmers</a>. In various articles, I've cited my sources whenever possible. While I've relied on Haskell libraries for 'canonical' ways to <em>represent</em> concepts in a programming language, I've tried to present ideas as having a more universal origin than just Haskell.
</p>
<p>
When it comes to traversable functors, I haven't come across universal reasoning like that which gives rise to concepts like <a href="/2017/10/06/monoids">monoids</a>, functors, <a href="/2018/05/22/church-encoding">Church encodings</a>, or <a href="/2019/04/29/catamorphisms">catamorphisms</a>. This is most likely a failing on my part.
</p>
<p>
Traversals of the Haskell kind are, however, so <em>useful</em> that I find it appropriate to describe them. When consulting, I find that they're a common solution to a lot of problems that people have with functional programming.
</p>
<p>
Thus, based on Haskell's <a href="https://hackage.haskell.org/package/base/docs/Data-Traversable.html">Data.Traversable</a>, a traversable functor must:
<ul>
<li>be a functor</li>
<li>be a 'foldable' functor</li>
<li>define a <em>sequence</em> or <em>traverse</em> function</li>
</ul>
You've already seen examples of <em>sequence</em> functions, and I'm also assuming that (since you've made it so far in the article already) you know what a functor is. But what's a <em>foldable</em> functor?
</p>
<p>
Haskell comes with a <a href="https://hackage.haskell.org/package/base/docs/Data-Foldable.html">Foldable</a> type class. It defines a class of data that has a particular type of <a href="/2019/04/29/catamorphisms">catamorphism</a>. As I've outlined in my article on catamorphisms, Haskell's notion of a <em>fold</em> sometimes coincides with a (or 'the') catamorphism for a type, and sometimes not. For <a href="/2019/05/20/maybe-catamorphism">Maybe</a> and <a href="/2019/05/27/list-catamorphism">List</a> they do coincide, while they don't for <a href="/2019/06/03/either-catamorphism">Either</a> or <a href="/2019/06/10/tree-catamorphism">Tree</a>. It's not that you can't define <code>Foldable</code> for <a href="/2018/06/11/church-encoded-either">Either</a> or <a href="/2018/08/06/a-tree-functor">Tree</a>, it's just that it's not 'the' <em>general</em> catamorphism for that type.
</p>
<p>
I can't tell whether <code>Foldable</code> is a universal abstraction, or if it's just an ad-hoc API that turns out to be useful in practice. It looks like the latter to me, but my knowledge is only limited. Perhaps I'll be wiser in a year or two.
</p>
<p>
I will, however, take it as licence to treat this topic a little less formally than I've done with other articles. While there <em>are</em> laws associated with <code>Traversable</code>, they are rather complex, so I'm going to skip them.
</p>
<p>
The above requirements will enable you to define traversable functors if you run into some more exotic ones, but in practice, the common functors List, <a href="/2018/03/26/the-maybe-functor">Maybe</a>, <a href="/2019/01/14/an-either-functor">Either</a>, <a href="/2018/08/06/a-tree-functor">Tree</a>, and <a href="/2018/09/03/the-identity-functor">Identity</a> are all traversable. That is useful to know. If any of those functors is the outer functor in a composition of functors, then you can flip them to the inner position as long as the other functor is an applicative functor.
</p>
<p>
Since <code>IEnumerable<T></code> is traversable, and <code>Task<T></code> (or <code>Async<'T></code>) is an applicative functor, it's possible to use <code>Sequence</code> to convert <code>IEnumerable<Task<Foo>></code> to <code>Task<IEnumerable<Foo>></code>.
</p>
<h3 id="3346c092666c4dacb9a61cc1f622fc0f">
Traverse <a href="#3346c092666c4dacb9a61cc1f622fc0f" title="permalink">#</a>
</h3>
<p>
The C# and F# examples you've seen so far arrive at the desired type in a two-step process. First they produce the 'wrong' type with <code>ids.Select(Foo.Read)</code> or <code>ids |> List.map Foo.Read</code>, and then they use <code>Sequence</code> to arrive at the desired type.
</p>
<p>
When you use two expressions, you need two lines of code, and you also need to come up with a name for the intermediary value. It might be easier to chain the two function calls into a single expression:
</p>
<p>
<pre>Task<IEnumerable<Foo>> foosTask = ids.Select(Foo.Read).Sequence();</pre>
</p>
<p>
Or, in F#:
</p>
<p>
<pre><span style="color:blue;">let</span> foosTask = ids |> List.map Foo.Read |> Async.sequence</pre>
</p>
<p>
Chaining <code>Select</code>/<code>map</code> with <code>Sequence</code>/<code>sequence</code> is so common that it's a named function: <em>traverse</em>. In C#:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> Task<IEnumerable<TResult>> Traverse<<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>>(
    <span style="color:blue;">this</span> IEnumerable<T> source,
    Func<T, Task<TResult>> selector)
{
    <span style="color:blue;">return</span> source.Select(selector).Sequence();
}</pre>
</p>
<p>
This makes usage a little easier:
</p>
<p>
<pre>Task<IEnumerable<Foo>> foosTask = ids.Traverse(Foo.Read);</pre>
</p>
<p>
In F# the implementation might be similar:
</p>
<p>
<pre><span style="color:green;">// ('a -> Async<'b>) -> 'a list -> Async<'b list></span>
<span style="color:blue;">let</span> traverse f xs = xs |> List.map f |> sequence</pre>
</p>
<p>
Usage then looks like this:
</p>
<p>
<pre><span style="color:blue;">let</span> foosTask = ids |> Async.traverse Foo.Read</pre>
</p>
<p>
As you can tell, if you've already implemented <em>sequence</em> you can always implement <em>traverse</em>. The converse is also true: If you've already implemented <em>traverse</em>, you can always implement <em>sequence</em>. You'll see an example of that later.
</p>
<h3 id="117fac3b686e4db8b6c3c4e0ac556929">
A reusable idea <a href="#117fac3b686e4db8b6c3c4e0ac556929" title="permalink">#</a>
</h3>
<p>
If you know the .NET Task Parallel Library (TPL), you may demur that my implementation of <code>Sequence</code> seems like an inefficient version of <a href="https://docs.microsoft.com/dotnet/api/system.threading.tasks.task.whenall">Task.WhenAll</a>, and that <code>Traverse</code> could be written like this:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:blue;">async</span> Task<IEnumerable<TResult>> Traverse<<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>>(
    <span style="color:blue;">this</span> IEnumerable<T> source,
    Func<T, Task<TResult>> selector)
{
    <span style="color:blue;">return</span> <span style="color:blue;">await</span> Task.WhenAll(source.Select(selector));
}</pre>
</p>
<p>
This alternative is certainly possible. Whether it's more efficient I don't know; I haven't measured. As foreshadowed in the beginning of the article, the combination of collections and asynchrony is so common that .NET has special APIs to handle that. You may ask, then: <em>What's the point?</em>
</p>
<p>
The point is that a traversable functor is <em>a reusable idea</em>.
</p>
<p>
You may be able to find existing APIs like <code>Task.WhenAll</code> to deal with combinations of collections and asynchrony, but what if you need to deal with asynchronous Maybe or Either? Or a List of Maybes?
</p>
<p>
There may be no existing API to flip things around - before you add it. Now you know that there's a (dare I say it?) design pattern you can implement.
</p>
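<p>
As a minimal sketch of the 'List of Maybes' case, assume a bare-bones <code>Maybe<T></code> class with a <code>Match</code> method. The class below is written for this illustration only, and so is the class name <code>MaybeTraversal</code>; neither is taken from a published library. Since Maybe is an applicative functor and List is traversable, you can flip <code>IEnumerable<Maybe<T>></code> to <code>Maybe<IEnumerable<T>></code>. The result is only populated if every element is.
</p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var allPresent = new[] { new Maybe<int>(1), new Maybe<int>(2) }.Sequence();
var oneMissing = new[] { new Maybe<int>(1), new Maybe<int>() }.Sequence();

// Prints "1, 2" and then "nothing".
Console.WriteLine(allPresent.Match("nothing", xs => string.Join(", ", xs)));
Console.WriteLine(oneMissing.Match("nothing", xs => string.Join(", ", xs)));

// Minimal Maybe type; an assumption made for this sketch only.
public sealed class Maybe<T>
{
    private readonly bool hasItem;
    private readonly T item;

    public Maybe() { hasItem = false; item = default!; }
    public Maybe(T item) { hasItem = true; this.item = item; }

    public TResult Match<TResult>(TResult nothing, Func<T, TResult> just)
    {
        return hasItem ? just(item) : nothing;
    }
}

public static class MaybeTraversal
{
    // IEnumerable<Maybe<T>> -> Maybe<IEnumerable<T>>
    public static Maybe<IEnumerable<T>> Sequence<T>(
        this IEnumerable<Maybe<T>> source)
    {
        // Start with a populated, empty list and append each value.
        // As soon as either side is empty, the result stays empty.
        return source.Aggregate(
            new Maybe<IEnumerable<T>>(Enumerable.Empty<T>()),
            (acc, m) => acc.Match(
                nothing: new Maybe<IEnumerable<T>>(),
                just: xs => m.Match(
                    nothing: new Maybe<IEnumerable<T>>(),
                    just: x => new Maybe<IEnumerable<T>>(xs.Concat(new[] { x })))));
    }
}
```

<p>
Notice that the shape is the same as the <code>Sequence</code> function for tasks: a fold over the list that combines the accumulator with each element inside the 'other' functor.
</p>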
<h3 id="f81375a0121247698f0ad5eac4deebff">
Asynchronous Maybe <a href="#f81375a0121247698f0ad5eac4deebff" title="permalink">#</a>
</h3>
<p>
Once people go beyond collections they often run into problems. You may, for example, decide to use the <a href="/2022/04/25/the-maybe-monad">Maybe monad</a> in order to model the presence or absence of a value. Then, once you combine Maybe-based decision values with asynchronous processing, you may run into problems.
</p>
<p>
For example, in my article <a href="/2019/02/11/asynchronous-injection">Asynchronous Injection</a> I modelled the core domain logic as returning <code>Maybe<Reservation></code>. When handling an HTTP request, the application should use that value to determine what to do next. If the return value is empty it should do nothing, but when the Maybe value is populated, it should save the reservation in a data store using this method:
</p>
<p>
<pre>Task<<span style="color:blue;">int</span>> Create(Reservation reservation)</pre>
</p>
<p>
Finally, if accepting the reservation, the HTTP handler (<code>ReservationsController</code>) should return the reservation ID, which is the <code>int</code> returned by <code>Create</code>. Please refer to the article for details. It also links to the sample code on GitHub.
</p>
<p>
The entire expression is, however, <code>Task</code>-based:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">async</span> Task<IActionResult> Post(Reservation reservation)
{
    <span style="color:blue;">return</span> <span style="color:blue;">await</span> Repository.ReadReservations(reservation.Date)
        .Select(rs => maîtreD.TryAccept(rs, reservation))
        .SelectMany(m => m.Traverse(Repository.Create))
        .Match(InternalServerError(<span style="color:#a31515;">"Table unavailable"</span>), Ok);
}</pre>
</p>
<p>
The <code>Select</code> and <code>SelectMany</code> methods are defined on the <code>Task</code> monad. The <code>m</code> in the <code>SelectMany</code> lambda expression is the <code>Maybe<Reservation></code> returned by <code>TryAccept</code>. What would happen if you didn't have a <code>Traverse</code> method?
</p>
<p>
<pre>Task<Maybe<Task<<span style="color:blue;">int</span>>>> whatIsThis = Repository.ReadReservations(reservation.Date)
    .Select(rs => maîtreD.TryAccept(rs, reservation))
    .Select(m => m.Select(Repository.Create));</pre>
</p>
<p>
Notice that <code>whatIsThis</code> (so named because it's a temporary variable used to investigate the type of the expression so far) has an awkward type: <code>Task<Maybe<Task<<span style="color:blue;">int</span>>>></code>. That's a Task within a Maybe within a Task.
</p>
<p>
This makes it difficult to continue the composition and return an HTTP result.
</p>
<p>
Instead, use <code>Traverse</code>:
</p>
<p>
<pre>Task<Task<Maybe<<span style="color:blue;">int</span>>>> whatIsThis = Repository.ReadReservations(reservation.Date)
    .Select(rs => maîtreD.TryAccept(rs, reservation))
    .Select(m => m.Traverse(Repository.Create));</pre>
</p>
<p>
This flips the inner <code>Maybe<Task<<span style="color:blue;">int</span>>></code> to <code>Task<Maybe<<span style="color:blue;">int</span>>></code>. Now you have a Maybe within a Task within a Task. The outer two Tasks are now nicely nested, and it's a job for a monad to remove one level of nesting. That's the reason that the final composition uses <code>SelectMany</code> instead of <code>Select</code>.
</p>
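<p>
The <code>Select</code> and <code>SelectMany</code> methods on <code>Task</code> aren't part of .NET. As a minimal sketch, assuming that they follow the usual functor and monad shapes, they might be defined as below. The class name <code>TaskMonad</code> is made up for the illustration, and the implementation in the sample code base may differ.
</p>

```csharp
using System;
using System.Threading.Tasks;

// Select maps the value inside a Task. If the selector itself returns a
// Task, the result is nested.
Task<Task<int>> nested = Task.FromResult(20).Select(x => Task.FromResult(x + 1));

// SelectMany performs the same mapping, but removes one level of nesting.
Task<int> flat = Task.FromResult(20).SelectMany(x => Task.FromResult(x + 1));

Console.WriteLine(await await nested); // 21
Console.WriteLine(await flat);         // 21

public static class TaskMonad
{
    // Functor map: Task<T> -> (T -> TResult) -> Task<TResult>
    public static async Task<TResult> Select<T, TResult>(
        this Task<T> source,
        Func<T, TResult> selector)
    {
        return selector(await source);
    }

    // Monadic bind: Task<T> -> (T -> Task<TResult>) -> Task<TResult>
    public static async Task<TResult> SelectMany<T, TResult>(
        this Task<T> source,
        Func<T, Task<TResult>> selector)
    {
        return await selector(await source);
    }
}
```

<p>
With definitions like these, <code>Select</code> over a Task-returning function produces a Task within a Task, while <code>SelectMany</code> flattens the two layers into one.
</p>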
<p>
The <code>Traverse</code> function is implemented like this:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> Task<Maybe<TResult>> Traverse<<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>>(
    <span style="color:blue;">this</span> Maybe<T> source,
    Func<T, Task<TResult>> selector)
{
    <span style="color:blue;">return</span> source.Match(
        nothing: Task.FromResult(<span style="color:blue;">new</span> Maybe<TResult>()),
        just: <span style="color:blue;">async</span> x => <span style="color:blue;">new</span> Maybe<TResult>(<span style="color:blue;">await</span> selector(x)));
}</pre>
</p>
<p>
The <em>idea</em> is reusable. You can also implement a similar traversal in F#:
</p>
<p>
<pre><span style="color:green;">// ('a -> Async<'b>) -> 'a option -> Async<'b option></span>
<span style="color:blue;">let</span> traverse f = <span style="color:blue;">function</span>
    | Some x <span style="color:blue;">-></span> async {
        <span style="color:blue;">let!</span> x' = f x
        <span style="color:blue;">return</span> Some x' }
    | None <span style="color:blue;">-></span> async { <span style="color:blue;">return</span> None }</pre>
</p>
<p>
You can see the F# function as well as a usage example in the article <a href="/2019/12/02/refactoring-registration-flow-to-functional-architecture">Refactoring registration flow to functional architecture</a>.
</p>
<h3 id="a9e25f8c3dc24d99b669f90a4e46afa0">
Sequence from traverse <a href="#a9e25f8c3dc24d99b669f90a4e46afa0" title="permalink">#</a>
</h3>
<p>
You've already seen that if you have a <em>sequence</em> function, you can implement <em>traverse</em>. I also claimed that the reverse is true: If you have <em>traverse</em> you can implement <em>sequence</em>.
</p>
<p>
When you've encountered these kinds of dual definitions a couple of times, you start to expect the ubiquitous identity function to make an appearance, and indeed it does:
</p>
<p>
<pre><span style="color:blue;">let</span> sequence x = traverse id x</pre>
</p>
<p>
That's the F# version where the identity function is built in as <code>id</code>. In C# you'd use a lambda expression:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> Task<Maybe<T>> Sequence<<span style="color:#2b91af;">T</span>>(<span style="color:blue;">this</span> Maybe<Task<T>> source)
{
    <span style="color:blue;">return</span> source.Traverse(x => x);
}</pre>
</p>
<p>
Since C# doesn't come with a predefined identity function, it's <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a> to use <code>x => x</code> instead.
</p>
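<p>
The same trick should work for the combination of lists and tasks. The following sketch assumes a <code>Traverse</code> method with the same signature as the <code>Task.WhenAll</code>-based version shown earlier; the class name <code>TaskTraversal</code> is made up for the example. With that in place, <code>Sequence</code> is again just <code>Traverse</code> with the identity function:
</p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

IEnumerable<Task<int>> tasks =
    new[] { 1, 2, 3 }.Select(i => Task.FromResult(i * 2));
IEnumerable<int> doubled = await tasks.Sequence();
Console.WriteLine(string.Join(", ", doubled)); // 2, 4, 6

public static class TaskTraversal
{
    // IEnumerable<T> -> (T -> Task<TResult>) -> Task<IEnumerable<TResult>>
    public static async Task<IEnumerable<TResult>> Traverse<T, TResult>(
        this IEnumerable<T> source,
        Func<T, Task<TResult>> selector)
    {
        return await Task.WhenAll(source.Select(selector));
    }

    // Sequence is Traverse with the identity function.
    public static Task<IEnumerable<T>> Sequence<T>(
        this IEnumerable<Task<T>> source)
    {
        return source.Traverse(x => x);
    }
}
```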
<h3 id="cc6c409706e24ea9b3ebefa49fcc3235">
Conclusion <a href="#cc6c409706e24ea9b3ebefa49fcc3235" title="permalink">#</a>
</h3>
<p>
Traversals are useful when you need to 'flip' the order of two different, nested functors. The outer one must be a traversable functor, and the inner an applicative functor.
</p>
<p>
Common traversable functors are List, Maybe, Either, Tree, and Identity, but there are more than those. In .NET you often need them when combining them with Tasks. In Haskell, they are useful when combined with <code>IO</code>.
</p>
<p>
<strong>Next:</strong> <a href="/2024/11/25/nested-monads">Nested monads</a>.
</p>
</div>
<div id="comments">
<hr>
<h2 id="comments-header">
Comments
</h2>
<div class="comment" id="c72c30e16cdd48419f95fd7ad5c74f81">
<div class="comment-author">qfilip <a href="#c72c30e16cdd48419f95fd7ad5c74f81">#</a></div>
<div class="comment-content">
<p>
Thanks for this one. You might be interested in <a href="https://andrewlock.net/working-with-the-result-pattern-part-1-replacing-exceptions-as-control-flow/">Andrew Lock's</a> take on the whole subject as well.
</p>
</div>
<div class="comment-date">2024-11-17 14:51 UTC</div>
</div>
</div>
<hr>
Pendulum swing: no Haskell type annotation by default
https://blog.ploeh.dk/2024/11/04/pendulum-swing-no-haskell-type-annotation-by-default
2024-11-04T07:45:00+00:00
Mark Seemann
<div id="post">
<p>
<em>Are Haskell IDE plugins now good enough that you don't need explicit type annotations?</em>
</p>
<p>
More than three years ago, I published <a href="/2021/02/22/pendulum-swings">a small article series</a> to document that I'd changed my mind on various small practices. Belatedly, here comes a fourth article, which, frankly, is a cousin rather than a sibling. Still, it fits the overall theme well enough to become another instalment in the series.
</p>
<p>
Here, I consider using fewer <a href="https://www.haskell.org/">Haskell</a> type annotations, following a practice that I've always followed in <a href="https://fsharp.org/">F#</a>.
</p>
<p>
To be honest, though, it's not that I've already applied the following practice for a long time, and only now write about it. It's rather that I feel the need to write this article to kick an old habit and start a new.
</p>
<h3 id="227874a509f24b93b9a091429b9ad03e">
Inertia <a href="#227874a509f24b93b9a091429b9ad03e">#</a>
</h3>
<p>
As I write in the dedication in <a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a>,
</p>
<blockquote>
<p>
"To my parents:
</p>
<p>
"My mother, Ulla Seemann, to whom I owe my attention to detail.
</p>
<p>
"My father, Leif Seemann, from whom I inherited my contrarian streak."
</p>
<footer><cite><a href="/code-that-fits-in-your-head">Code That Fits in Your Head</a></cite>, dedication</footer>
</blockquote>
<p>
One should always be careful simplifying one's personality to a simple, easy-to-understand model, but a major point here is that I have two traits that pull in almost opposite directions.
</p>
<p>
<img src="/content/binary/neatness-contrariness-vector-sum.png" alt="Two vectors labelled respectively neatness and contrariness pulling in almost opposing directions, while still not quite cancelling each other out, leaving a short vector sum pointing to the right.">
</p>
<p>
Despite much work, I only make slow progress. My desire to make things neat and proper almost cancels out my tendency to go against the norms. I tend to automatically toe whatever line exists until the cognitive dissonance becomes so great that I can no longer ignore it.
</p>
<p>
I then write an article for the blog to clarify my thoughts.
</p>
<p>
You may read what comes next and ask, <em>what took you so long?!</em>
</p>
<p>
I can only refer to the above. I may look calm on the surface, but underneath I'm paddling like the dickens. Despite much work, though, only limited progress is visible.
</p>
<h3 id="a00a292d223a435b873f7cc1de1730c3">
Nudge <a href="#a00a292d223a435b873f7cc1de1730c3">#</a>
</h3>
<p>
Haskell is a statically typed language with the most powerful type system I know my way around. The types carry so much information that one can often infer <a href="/2022/10/24/encapsulation-in-functional-programming">a function's contract</a> from the type alone. This is also fortunate, since many Haskell libraries tend to have, shall we say, minimal documentation. Even so, I've often found myself able to figure out how to use an unfamiliar Haskell API by examining the various types that a library exports.
</p>
<p>
In fact, the type system is so powerful that it drives <a href="https://hoogle.haskell.org/">a specialized search engine</a>. If you need a function with the type <code>(<span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">IO</span> <span style="color:#2b91af;">Int</span>) <span style="color:blue;">-></span> [<span style="color:#2b91af;">String</span>] <span style="color:blue;">-></span> <span style="color:#2b91af;">IO</span> [<span style="color:#2b91af;">Int</span>]</code> you can search for it. Hoogle will list all functions that match that type, including functions that are more abstract than your specialized need. You don't even have to imagine what the name might be.
</p>
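<p>
As a small, hedged illustration of what such a search turns up: <code>traverse</code> from the <code>Prelude</code> has the more abstract type <code>(Traversable t, Applicative f) => (a -> f b) -> t a -> f (t b)</code>, which specializes to the type above. In this sketch, <code>countChars</code>, <code>countAll</code>, and <code>measureSamples</code> are hypothetical names made up for the occasion.
</p>

```haskell
-- A hypothetical effectful function, made up for this sketch.
countChars :: String -> IO Int
countChars s = do
  putStrLn ("Measuring " ++ s)
  return (length s)

-- traverse, specialized to lists and IO, has exactly the type searched for.
countAll :: (String -> IO Int) -> [String] -> IO [Int]
countAll = traverse

-- Runs countChars over two sample strings and collects the lengths.
measureSamples :: IO [Int]
measureSamples = countAll countChars ["foo", "quux"]
```

<p>
The explicit signature on <code>countAll</code> is only there to show that the specialization type-checks; the compiler would otherwise infer the general type of <code>traverse</code>.
</p>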
<p>
Since the type system is so powerful, it's a major means of communication. Thus, it makes sense that <a href="https://en.wikipedia.org/wiki/Glasgow_Haskell_Compiler">GHC</a> regularly issues <a href="https://downloads.haskell.org/ghc/latest/docs/users_guide/using-warnings.html#ghc-flag--Wmissing-signatures">a warning</a> if a function lacks a type annotation.
</p>
<p>
While the compiler enables you to control which warnings are turned on, the <code>missing-signatures</code> warning is included in the popular <a href="https://downloads.haskell.org/ghc/latest/docs/users_guide/using-warnings.html#ghc-flag--Wall">all</a> flag that most people, I take it, use. I do, at least.
</p>
<p>
If you forget to declare the type of a function, the compiler will complain:
</p>
<p>
<pre>src\SecurityManager.hs:15:1: <span style="color:red;">warning</span>: [<span style="color:red;">GHC-38417</span>] [<span style="color:red;">-Wmissing-signatures</span>]
    Top-level binding with no type signature:
      createUser :: (Monad m, Text.Printf.PrintfArg b,
                     Text.Printf.PrintfArg (t a), Foldable t, Eq (t a)) =>
                    (String -> m ()) -> m (t a) -> (t a -> b) -> m ()
<span style="color:blue;"> |</span>
<span style="color:blue;">15 |</span> <span style="color:red;">createUser</span> writeLine readLine encrypt = do
<span style="color:blue;"> |</span> <span style="color:red;">^^^^^^^^^^</span></pre>
</p>
<p>
This is a strong nudge that you're supposed to give each function a type declaration, so I've been doing that for years. Neat and proper.
</p>
<p>
Of course, if you treat warnings as errors, as <a href="/code-that-fits-in-your-head">I recommend</a>, the nudge becomes a law.
</p>
<h3 id="cf16318003ef46ed8c67d81217e56011">
Learning from F# <a href="#cf16318003ef46ed8c67d81217e56011">#</a>
</h3>
<p>
While I try to adopt the style and <a href="/2015/08/03/idiomatic-or-idiosyncratic">idioms</a> of any language I work in, it's always annoyed me that I had to add a type annotation to a Haskell function. After all, the compiler can usually infer the type. Frankly, adding a type signature feels like redundant ceremony. It's like having to declare a function in a header file before being able to implement it in another file.
</p>
<p>
This particularly bothers me because I've long since abandoned type annotations in F#. As far as I can tell, most of the F# community has, too.
</p>
<p>
When you implement an F# function, you just write the implementation and let the compiler infer the type. (Code example from <a href="/2019/12/16/zone-of-ceremony">Zone of Ceremony</a>.)
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:blue;">inline</span> <span style="color:#74531f;">consume</span> <span style="color:#1f377f;">quantity</span> =
    <span style="color:blue;">let</span> <span style="color:#74531f;">go</span> (<span style="color:#1f377f;">acc</span>, <span style="color:#1f377f;">xs</span>) <span style="color:#1f377f;">x</span> =
        <span style="color:blue;">if</span> <span style="color:#1f377f;">quantity</span> <= <span style="color:#1f377f;">acc</span>
        <span style="color:blue;">then</span> (<span style="color:#1f377f;">acc</span>, <span style="color:#2b91af;">Seq</span>.<span style="color:#74531f;">append</span> <span style="color:#1f377f;">xs</span> (<span style="color:#2b91af;">Seq</span>.<span style="color:#74531f;">singleton</span> <span style="color:#1f377f;">x</span>))
        <span style="color:blue;">else</span> (<span style="color:#1f377f;">acc</span> + <span style="color:#1f377f;">x</span>, <span style="color:#1f377f;">xs</span>)
    <span style="color:#2b91af;">Seq</span>.<span style="color:#74531f;">fold</span> <span style="color:#74531f;">go</span> (<span style="color:#2b91af;">LanguagePrimitives</span>.GenericZero, <span style="color:#2b91af;">Seq</span>.empty) >> <span style="color:#74531f;">snd</span></pre>
</p>
<p>
Since F# often has to interact with .NET code written in C#, you regularly have to add <em>some</em> type annotations to help the compiler along:
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:#74531f;">average</span> (<span style="font-weight:bold;color:#1f377f;">timeSpans</span> : <span style="color:#2b91af;">NonEmpty</span><<span style="color:#2b91af;">TimeSpan</span>>) =
    [ <span style="font-weight:bold;color:#1f377f;">timeSpans</span>.Head ] @ <span style="color:#2b91af;">List</span>.<span style="color:#74531f;">ofSeq</span> <span style="font-weight:bold;color:#1f377f;">timeSpans</span>.Tail
    |> <span style="color:#2b91af;">List</span>.<span style="color:#74531f;">averageBy</span> (_.Ticks >> <span style="color:#74531f;">double</span>)
    |> <span style="color:#74531f;">int64</span>
    |> <span style="color:#2b91af;">TimeSpan</span>.<span style="font-weight:bold;color:#74531f;">FromTicks</span></pre>
</p>
<p>
Even so, I follow the rule of minimal annotations: Only add the type information required to compile, and let the compiler infer the rest. For example, the above <a href="/2024/05/06/conservative-codomain-conjecture">average function</a> has the inferred type <code><span style="color:#2b91af;">NonEmpty</span><span style="color:#2b91af;"><</span><span style="color:#2b91af;">TimeSpan</span><span style="color:#2b91af;">></span> <span style="color:blue;">-></span> <span style="color:#2b91af;">TimeSpan</span></code>. While I had to specify the input type in order to be able to use the <a href="https://learn.microsoft.com/dotnet/api/system.datetime.ticks">Ticks property</a>, I didn't have to specify the return type. So I didn't.
</p>
<p>
My impression from reading other people's F# code is that this is a common, albeit not universal, approach to type annotation.
</p>
<p>
This minimizes ceremony, since you only need to declare and maintain the types that the compiler can't infer. There's no reason to repeat the work that the compiler can already do, and in practice, if you do, it just gets in the way.
</p>
<h3 id="fdd9161164f64f438aa0bedf5ff6f9a8">
Motivation for explicit type definitions <a href="#fdd9161164f64f438aa0bedf5ff6f9a8">#</a>
</h3>
<p>
When I extol the merits of static types, proponents of dynamically typed languages often argue that the types are in the way. Granted, this is <a href="/2021/08/09/am-i-stuck-in-a-local-maximum">a discussion that I still struggle with</a>, but based on my understanding of the argument, it seems entirely reasonable. After all, if you have to spend time declaring the type of each and every parameter, as well as a function's return type, it does seem to be in the way. This is only exacerbated if you later change your mind.
</p>
<p>
Programming is, to a large extent, an explorative activity. You start with one notion of how your code should be structured, but as you progress, you learn. You'll often have to go back and change existing code. This, as far as I can tell, is much easier in, say, <a href="https://www.python.org/">Python</a> or <a href="https://clojure.org/">Clojure</a> than in C# or <a href="https://www.java.com/">Java</a>.
</p>
<p>
If, however, one extrapolates from the experience with Java or C# to all statically typed languages, that would be a logical fallacy. My point with <a href="/2019/12/16/zone-of-ceremony">Zone of Ceremony</a> was exactly that there's a group of languages 'to the right' of high-ceremony languages with low levels of ceremony. Even though they're statically typed.
</p>
<p>
I have to admit, however, that in that article I cheated a little in order to drive home a point. While you <em>can</em> write Haskell code in a low-ceremony style, the tooling (in the form of the <code>all</code> warning set, at least) encourages a high-ceremony style. Add those type definitions, even though they're redundant.
</p>
<p>
It's not that I don't understand some of the underlying motivation behind that rule. <a href="http://dmwit.com/">Daniel Wagner</a> enumerated several reasons in <a href="https://stackoverflow.com/a/19626857/126014">a 2013 Stack Overflow answer</a>. Some of the reasons still apply, but on the other hand, the world has also moved on in the intervening decade.
</p>
<p>
To be honest, the Haskell <a href="https://en.wikipedia.org/wiki/Integrated_development_environment">IDE</a> situation has always been precarious. One day, it works really well; the next day, I struggle with it. Over the years, though, things have improved.
</p>
<p>
There was a time when an explicit type definition was an indisputable help, because you couldn't rely on tools to light up and tell you what the inferred type was.
</p>
<p>
Today, on the other hand, the <a href="https://marketplace.visualstudio.com/items?itemName=haskell.haskell">Haskell extension for Visual Studio Code</a> automatically displays the inferred type above a function implementation:
</p>
<p>
<img src="/content/binary/haskell-code-with-inferred-type-displayed-by-vs-code.png" alt="Screen shot of a Haskell function in Visual Studio Code with the function's type automatically displayed above it by the Haskell extension.">
</p>
<p>
To be clear, the top line that shows the type definition is not part of the source code. It's just shown by Visual Studio Code as a code lens (I think it's called), and it automatically changes if I edit the code in such a way that the type changes.
</p>
<p>
If you can rely on such automatic type information, it seems that an explicit type declaration is less useful. It's at least one less reason to add type annotations to the source code.
</p>
<h3 id="367135868de54bcb8eebd2d9bc9a0f8c">
Ceremony example <a href="#367135868de54bcb8eebd2d9bc9a0f8c">#</a>
</h3>
<p>
In order to explain what I mean by <em>the types being in the way</em>, I'll give an example. Consider the code example from the article <a href="/2024/10/21/legacy-security-manager-in-haskell">Legacy Security Manager in Haskell</a>. In it, I described how every time I made a change to the <code>createUser</code> action, I had to effectively remove and re-add the type declaration.
</p>
<p>
It doesn't have to be like that. If instead I'd started without type annotations, I could have moved forward without being slowed down by having to edit type definitions. Take the first edit, breaking the dependency on the console, as an example. Without type annotations, the <code>createUser</code> action would look exactly as before, just without the type declaration. Its type would still be <code>IO ()</code>.
</p>
<p>
After the first edit, the first lines of the action now look like this:
</p>
<p>
<pre>createUser writeLine readLine = <span style="color:blue;">do</span>
  <span style="color:blue;">()</span> <- writeLine <span style="color:#a31515;">"Enter a username"</span>
  <span style="color:green;">-- ...</span></pre>
</p>
<p>
Even without a type definition, the action still has a type. The compiler infers it to be <code>(<span style="color:blue;">Monad</span> m, <span style="color:blue;">Eq</span> a, <span style="color:blue;">IsChar</span> a) <span style="color:blue;">=></span> (<span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> m ()) <span style="color:blue;">-></span> m [a] <span style="color:blue;">-></span> m ()</code>, which is certainly a bit of a mouthful, but exactly what I had explicitly added in the other article.
</p>
<p>
The code doesn't compile until I also change the <code>main</code> action to pass the new parameters:
</p>
<p>
<pre>main = createUser <span style="color:blue;">putStrLn</span> <span style="color:blue;">getLine</span></pre>
</p>
<p>
You'd have to make a similar edit in, say, Python, although there'd be no compiler to remind you. My point isn't that this is better than a dynamically typed language, but rather that it's on par. The types aren't in the way.
</p>
<p>
We see a similar lack of required ceremony when the <code>createUser</code> action finally pulls in the <code>comparePasswords</code> and <code>validatePassword</code> functions:
</p>
<p>
<pre>createUser writeLine readLine encrypt = <span style="color:blue;">do</span>
  <span style="color:blue;">()</span> <- writeLine <span style="color:#a31515;">"Enter a username"</span>
  username <- readLine
  writeLine <span style="color:#a31515;">"Enter your full name"</span>
  fullName <- readLine
  writeLine <span style="color:#a31515;">"Enter your password"</span>
  password <- readLine
  writeLine <span style="color:#a31515;">"Re-enter your password"</span>
  confirmPassword <- readLine
  writeLine $ either
    <span style="color:blue;">id</span>
    (printf <span style="color:#a31515;">"Saving Details for User (%s, %s, %s)"</span> username fullName . encrypt)
    (validatePassword =<< comparePasswords password confirmPassword)</pre>
</p>
<p>
Again, there's no type annotation, and while the type actually <em>does</em> change to
</p>
<p>
<pre>(<span style="color:blue;">Monad</span> m, <span style="color:blue;">PrintfArg</span> b, <span style="color:blue;">PrintfArg</span> (t a), <span style="color:blue;">Foldable</span> t, <span style="color:blue;">Eq</span> (t a)) <span style="color:blue;">=></span>
(<span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> m ()) <span style="color:blue;">-></span> m (t a) <span style="color:blue;">-></span> (t a <span style="color:blue;">-></span> b) <span style="color:blue;">-></span> m ()</pre>
</p>
<p>
it impacts none of the existing code. Again, the types aren't in the way, and no ceremony is required.
</p>
<p>
Compare that inferred type signature with the explicit final type annotation in <a href="/2024/10/21/legacy-security-manager-in-haskell">the previous article</a>. The inferred type is much more abstract and permissive than the explicit declaration, although I also grant that Daniel Wagner had a point that you can make explicit type definitions more reader-friendly.
</p>
<h3 id="d4469073def54f289edb56d1ca8417ee">
Flies in the ointment <a href="#d4469073def54f289edb56d1ca8417ee">#</a>
</h3>
<p>
Do the inferred types communicate intent? That's debatable. For example, it's not immediately clear that the above <code>t a</code> allows <code>String</code>.
</p>
<p>
Another thing that annoys me is that I had to add that <em>unit</em> binding on the first line:
</p>
<p>
<pre>createUser writeLine readLine encrypt = <span style="color:blue;">do</span>
  <span style="color:blue;">()</span> <- writeLine <span style="color:#a31515;">"Enter a username"</span>
  <span style="color:green;">-- ...</span></pre>
</p>
<p>
The reason for that is that if I don't do that (that is, if I just write <code>writeLine "Xyz"</code> all the way), the compiler infers the type of <code>writeLine</code> to be <code><span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> m b2</code>, rather than just <code><span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> m ()</code>. In effect, I want <code>b2 ~ ()</code>, but because the compiler thinks that <code>b2</code> may be anything, it issues an <a href="https://downloads.haskell.org/ghc/latest/docs/users_guide/using-warnings.html#ghc-flag--Wunused-do-bind">unused-do-bind</a> warning.
</p>
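<p>
To isolate the effect, here's a minimal sketch with two hypothetical actions (<code>greetLoose</code> and <code>greetStrict</code> aren't part of the original code base). Both are deliberately left without type annotations; the inferred types in the comments are what I'd expect GHC to report.
</p>

```haskell
-- Hypothetical actions, made up to isolate the unused-do-bind issue.

-- Without the unit pattern, GHC infers
--   greetLoose :: Monad m => (String -> m b) -> m b
-- and, with -Wall, warns that the result of the first statement is discarded.
greetLoose writeLine = do
  writeLine "Hello"
  writeLine "World"

-- Matching on () forces the result type to be unit, so GHC infers
--   greetStrict :: Monad m => (String -> m ()) -> m ()
-- and the warning goes away.
greetStrict writeLine = do
  () <- writeLine "Hello"
  writeLine "World"
```

<p>
Both actions work with <code>putStrLn</code>, but only <code>greetLoose</code> also accepts a <code>writeLine</code> that returns something other than unit.
</p>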
<p>
The idiomatic way to resolve that situation is to add a type definition, but that's the situation I'm trying to avoid. Thus, my desire to do without annotations pushes me to write unnatural implementation code. This reminds me of the notion of <a href="https://dhh.dk/2014/test-induced-design-damage.html">test-induced damage</a>. This is at best a disagreeable compromise.
</p>
<p>
It also annoys me that implementation details leak out to the inferred type, witnessed by the <code>PrintfArg</code> type constraint. What happens if I change the implementation to use list concatenation?
</p>
<p>
<pre>createUser writeLine readLine encrypt = <span style="color:blue;">do</span>
  <span style="color:blue;">()</span> <- writeLine <span style="color:#a31515;">"Enter a username"</span>
  username <- readLine
  writeLine <span style="color:#a31515;">"Enter your full name"</span>
  fullName <- readLine
  writeLine <span style="color:#a31515;">"Enter your password"</span>
  password <- readLine
  writeLine <span style="color:#a31515;">"Re-enter your password"</span>
  confirmPassword <- readLine
  <span style="color:blue;">let</span> createMsg pwd =
        <span style="color:#a31515;">"Saving Details for User ("</span> ++ username ++ <span style="color:#a31515;">", "</span> ++ fullName ++ <span style="color:#a31515;">", "</span> ++ pwd ++ <span style="color:#a31515;">")"</span>
  writeLine $ either
    <span style="color:blue;">id</span>
    (createMsg . encrypt)
    (validatePassword =<< comparePasswords password confirmPassword)</pre>
</p>
<p>
If I do that, the type also changes:
</p>
<p>
<pre><span style="color:blue;">Monad</span> m <span style="color:blue;">=></span> (<span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> m ()) <span style="color:blue;">-></span> m [<span style="color:#2b91af;">Char</span>] <span style="color:blue;">-></span> ([<span style="color:#2b91af;">Char</span>] <span style="color:blue;">-></span> [<span style="color:#2b91af;">Char</span>]) <span style="color:blue;">-></span> m ()</pre>
</p>
<p>
While we get rid of the <code>PrintfArg</code> type constraint, the type becomes otherwise more concrete, now operating on <code>String</code> values (keeping in mind that <code>String</code> is a type synonym for <code>[Char]</code>).
</p>
<p>
The code still compiles, and all tests still pass, because the abstraction I've had in mind all along is essentially this last type.
</p>
<p>
The <code>writeLine</code> action should take a <code>String</code> and have some side effect, but return no data. The type <code><span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> m ()</code> nicely models that, striking a fine balance between being sufficiently concrete to capture intent, but still abstract enough to be testable.
</p>
<p>
The <code>readLine</code> action should provide input <code>String</code> values, and again <code>m String</code> nicely models that concern.
</p>
<p>
Finally, <code>encrypt</code> is indeed a naked <code>String</code> <a href="https://en.wikipedia.org/wiki/Endomorphism">endomorphism</a>: <code>String -> String</code>.
</p>
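<p>
To make the 'abstract enough to be testable' point concrete, here's a minimal sketch. The <code>askName</code> action is a hypothetical, cut-down stand-in for <code>createUser</code>, with <code>reverse</code> playing the role of a toy <code>encrypt</code> function. The pure variant leans on the fact that a pair with a <code>Monoid</code> as its first element is a <code>Monad</code> in <code>base</code>, which makes it usable as a crude output log.
</p>

```haskell
-- A hypothetical, simplified action with the same parameter shapes as createUser.
askName :: Monad m => (String -> m ()) -> m String -> (String -> String) -> m ()
askName writeLine readLine encrypt = do
  writeLine "Enter a username"
  username <- readLine
  writeLine ("Saving " ++ encrypt username)

-- In production, the monad is IO.
runInIO :: IO ()
runInIO = askName putStrLn getLine reverse

-- In a test, the monad can be a pair whose first element collects the output.
-- This relies on the base instance Monoid a => Monad ((,) a).
runPure :: ([String], ())
runPure = askName (\s -> ([s], ())) ([], "ploeh") reverse
```

<p>
Evaluating <code>fst runPure</code> yields the lines that would have been written, without touching the console.
</p>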
<p>
With my decades of experience with object-oriented design, it still strikes me as odd that implementation details can make a type more abstract, but once you think it over, it may be okay.
</p>
<h3 id="a82d4017be064ce980c40e22aa6f801e">
More liberal abstractions <a href="#a82d4017be064ce980c40e22aa6f801e">#</a>
</h3>
<p>
The inferred types are consistently more liberal than the abstraction I have in mind, which is
</p>
<p>
<pre><span style="color:blue;">Monad</span> m <span style="color:blue;">=></span> (<span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> m ()) <span style="color:blue;">-></span> m <span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> (<span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">String</span>) <span style="color:blue;">-></span> m ()</pre>
</p>
<p>
In all cases, the inferred types include that type as a subset.
</p>
<p>
<img src="/content/binary/create-user-abstraction-sets.png" alt="Various sets of inferred types.">
</p>
<p>
I hope that I've created the above diagram so that it makes sense, but the point I'm trying to get across is that the two type definitions in the lower middle are equivalent, and are the most specific types. That's the intended abstraction. Thinking of <a href="/2021/11/15/types-as-sets">types as sets</a>, all the other inferred types are supersets of that type, in various ways. Even though implementation details leak out in the shape of <code>PrintfArg</code> and <code>IsChar</code>, these are effectively larger sets.
</p>
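<p>
The superset relationship can be sketched on a much smaller scale. In this hypothetical example (neither function comes from the code base discussed here), the unannotated function gets a liberal inferred type, and the compiler accepts a narrower, explicit type for the same implementation.
</p>

```haskell
-- Hypothetical example. With no signature, GHC infers the liberal type
--   describe :: Show a => a -> String
describe x = "Value: " ++ show x

-- A narrower, explicit type for the same implementation. This compiles
-- because Int -> String is one of the types that the inferred type covers.
describeInt :: Int -> String
describeInt = describe
```

<p>
Going the other way isn't possible: you can't give <code>describeInt</code> the type <code>Show a => a -> String</code> once it's been locked down to <code>Int</code>.
</p>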
<p>
This takes some getting used to: The implementation details are <em>more</em> liberal than the abstraction. This seems to be at odds with the <a href="https://en.wikipedia.org/wiki/Dependency_inversion_principle">Dependency Inversion Principle</a> (DIP), which suggests that abstractions shouldn't depend on implementation details. I'm not yet sure what to make of this, but I suspect that this is more of a problem of overlapping linguistic semantics than software design. What I mean is that I have a feeling that 'implementation detail' has more than one meaning. At least, in the perspective of the DIP, an implementation detail <em>limits</em> your options. For example, depending on a particular database technology is more constraining than depending on some abstract notion of what the persistence mechanism might be. Contrast this with an implementation detail such as the <code>PrintfArg</code> type constraint. It doesn't narrow your options; on the contrary, it makes the implementation more liberal.
</p>
<p>
Still, while an implementation should <a href="https://en.wikipedia.org/wiki/Robustness_principle">be liberal in what it accepts</a>, it's probably not a good idea to publish such a capability to the wider world. After all, if you do, <a href="https://www.hyrumslaw.com/">someone will eventually rely on it</a>.
</p>
<h3 id="42ffe5249c7542809ca55a95a8f15f6c">
For internal use only <a href="#42ffe5249c7542809ca55a95a8f15f6c">#</a>
</h3>
<p>
Going through all these considerations, I think I'll revise my position as follows.
</p>
<p>
I'll forgo type annotations as long as I explore a problem space. For internal application use, this may effectively mean forever, in the sense that how you compose an application from smaller building blocks is likely to be in permanent flux. Here I have in mind your average web asset or other public-facing service that's in constant development. You keep adding new features, or changing domain logic as the overall business evolves.
</p>
<p>
As I've also recently discussed, <a href="/2024/02/05/statically-and-dynamically-typed-scripts">Haskell is a great scripting language</a>, and I think that here, too, I'll dial down the type definitions.
</p>
<p>
If I ever do another <a href="https://adventofcode.com/">Advent of Code</a> in Haskell, I think I'll also eschew explicit type annotations.
</p>
<p>
On the other hand, I can see that once an API stabilizes, you may want to lock it down. This may also apply to internal abstractions if you're working in a team and you explicitly want to communicate what a contract is.
</p>
<p>
If the code is a reusable library, I think that explicit type definitions are still required. Both for the reasons outlined by Daniel Wagner, and also to avoid being the victim of <a href="https://www.hyrumslaw.com/">Hyrum's law</a>.
</p>
<p>
That's why I phrase this pendulum swing as a new <em>default</em>. I'll begin programming without type definitions, but add them as needed. The point is rather that there may be parts of a code base where they're never needed, and then it's okay to keep going without them.
</p>
<p>
You can use an <code>OPTIONS_GHC</code> pragma to opt out of the <code>missing-signatures</code> compiler warning on a module-by-module basis:
</p>
<p>
<pre>{-# <span style="color:gray;">OPTIONS_GHC</span> -Wno-missing-signatures #-}</pre>
</p>
<p>
This will enable me to rely on type inference in parts of the code base, while keeping the build clean of compiler warnings.
</p>
<h3 id="36e2b141fff548678e34d24eda5a3e03">
Conclusion <a href="#36e2b141fff548678e34d24eda5a3e03">#</a>
</h3>
<p>
I've always appreciated the F# compiler's ability to infer types and just let type changes automatically ripple through the code base. For that reason, the Haskell norm of explicitly adding a (redundant) type annotation has always vexed me.
</p>
<p>
It often takes me a long time to reach seemingly obvious conclusions, such as: Don't always add type definitions to Haskell functions. Let the type inference engine do its job.
</p>
<p>
The reason it takes me so long to take such a small step is that I want to follow 'best practice'; I want to write idiomatic code. When the standard compiler-warning set complains about missing type definitions, it takes me significant deliberation to discard such advice. I could imagine other programmers being in the same situation, which is one reason I wrote this article.
</p>
<p>
The point isn't that type definitions are a universally bad idea. They aren't. Rather, the point is only that it's also okay to do without them in parts of a code base. Perhaps only temporarily, but in some cases maybe permanently.
</p>
<p>
The <code>missing-signatures</code> warning shouldn't, I now believe, be considered an absolute law, but rather a contextual rule.
</p>
</div>
<hr>
Functor compositions https://blog.ploeh.dk/2024/10/28/functor-compositions 2024-10-28T06:58:00+00:00 Mark Seemann
<div id="post">
<p>
<em>A functor nested within another functor forms a functor. With examples in C# and another language.</em>
</p>
<p>
This article is part of <a href="/2022/07/11/functor-relationships">a series of articles about functor relationships</a>. In this one you'll learn about a universal composition of functors. In short, if you have one functor nested within another functor, then this composition itself gives rise to a functor.
</p>
<p>
Together with other articles in this series, this result can help you answer questions such as: <em>Does this data structure form a functor?</em>
</p>
<p>
Since <a href="/2018/03/22/functors">functors</a> tend to be quite common, and since they're useful enough that many programming languages have special support or syntax for them, the ability to recognize a potential functor can be useful. Given a type like <code>Foo<T></code> (C# syntax) or <code>Bar<T1, T2></code>, being able to recognize it as a functor can come in handy. One scenario is if you yourself have just defined this data type. Recognizing that it's a functor strongly suggests that you should give it a <code>Select</code> method in C#, a <code>map</code> function in <a href="https://fsharp.org/">F#</a>, and so on.
</p>
<p>
Not all generic types give rise to a (covariant) functor. Some are rather <a href="/2021/09/02/contravariant-functors">contravariant functors</a>, and some are <a href="/2022/08/01/invariant-functors">invariant</a>.
</p>
<p>
If, on the other hand, you have a data type where one functor is nested within another functor, then the data type itself gives rise to a functor. You'll see some examples in this article.
</p>
<h3 id="a97b2f6471b74db6a83362a552ee5b03">
Abstract shape <a href="#a97b2f6471b74db6a83362a552ee5b03">#</a>
</h3>
<p>
Before we look at some examples found in other code, it helps if we know what we're looking for. Imagine that you have two functors <code>F</code> and <code>G</code>, and you're now considering a data structure that contains a value where <code>G</code> is nested inside of <code>F</code>.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">GInF</span><<span style="color:#2b91af;">T</span>>
{
    <span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">F</span><<span style="color:#2b91af;">G</span><<span style="color:#2b91af;">T</span>>> ginf;
    <span style="color:blue;">public</span> <span style="color:#2b91af;">GInF</span>(<span style="color:#2b91af;">F</span><<span style="color:#2b91af;">G</span><<span style="color:#2b91af;">T</span>>> <span style="font-weight:bold;color:#1f377f;">ginf</span>)
    {
        <span style="color:blue;">this</span>.ginf = <span style="font-weight:bold;color:#1f377f;">ginf</span>;
    }
    <span style="color:green;">// Methods go here...</span></pre>
</p>
<p>
The <code><span style="color:#2b91af;">GInF</span><<span style="color:#2b91af;">T</span>></code> class has a single class field. The type of this field is an <code>F</code> <a href="https://bartoszmilewski.com/2014/01/14/functors-are-containers/">container</a>, but 'inside' <code>F</code> there's a <code>G</code> functor.
</p>
<p>
This kind of data structure gives rise to a functor. Knowing that, you can give it a <code>Select</code> method:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">GInF</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#74531f;">Select</span><<span style="color:#2b91af;">TResult</span>>(<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">GInF</span><<span style="color:#2b91af;">TResult</span>>(ginf.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">g</span> => <span style="font-weight:bold;color:#1f377f;">g</span>.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>)));
}</pre>
</p>
<p>
The composed <code>Select</code> method calls <code>Select</code> on the <code>F</code> functor, passing it a lambda expression that calls <code>Select</code> on the <code>G</code> functor. The inner call turns each <code><span style="color:#2b91af;">G</span><<span style="color:#2b91af;">T</span>></code> into a <code><span style="color:#2b91af;">G</span><<span style="color:#2b91af;">TResult</span>></code>, so the outer call produces an <code><span style="color:#2b91af;">F</span><<span style="color:#2b91af;">G</span><<span style="color:#2b91af;">TResult</span>>></code> that the composed <code>Select</code> method finally wraps in a <code><span style="color:blue;">new</span> <span style="color:#2b91af;">GInF</span><<span style="color:#2b91af;">TResult</span>></code> object that it returns.
</p>
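<p>
For comparison, Haskell's <code>base</code> library ships this composition as <code>Data.Functor.Compose</code>. The following is only a sketch of the same idea, with a list standing in for <code>F</code> and <code>Maybe</code> for <code>G</code>; the value names are made up for the example.
</p>

```haskell
import Data.Functor.Compose (Compose (..))

-- A list (the 'outer' functor F) of Maybe values (the 'inner' functor G).
nested :: Compose [] Maybe Int
nested = Compose [Just 1, Nothing, Just 3]

-- fmap on Compose maps through both layers, like the composed Select method.
incremented :: [Maybe Int]
incremented = getCompose (fmap (+ 1) nested)

-- The same result, written as two nested fmap calls.
incrementedByHand :: [Maybe Int]
incrementedByHand = fmap (fmap (+ 1)) [Just 1, Nothing, Just 3]
```

<p>
The second definition mirrors the C# implementation directly: the outer <code>fmap</code> corresponds to <code>Select</code> on <code>F</code>, and the inner one to <code>Select</code> on <code>G</code>.
</p>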
<p>
I'll have more to say about how this generalizes to a nested composition of more than two functors, but first, let's consider some examples.
</p>
<h3 id="fcd4126b51c24b10867de4280f5e8844">
Priority list <a href="#fcd4126b51c24b10867de4280f5e8844">#</a>
</h3>
<p>
A common configuration is when the 'outer' functor is a collection, and the 'inner' functor is some other kind of container. The article <a href="/2024/07/01/an-immutable-priority-collection">An immutable priority collection</a> shows a straightforward example. The <code><span style="color:#2b91af;">PriorityCollection</span><<span style="color:#2b91af;">T</span>></code> class composes a single class field:
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">Prioritized</span><<span style="color:#2b91af;">T</span>>[] priorities;</pre>
</p>
<p>
The <code>priorities</code> field is an array (a collection) of <code><span style="color:#2b91af;">Prioritized</span><<span style="color:#2b91af;">T</span>></code> objects. That type is a simple <a href="https://learn.microsoft.com/dotnet/csharp/language-reference/builtin-types/record">record</a> type:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">sealed</span> <span style="color:blue;">record</span> <span style="color:#2b91af;">Prioritized</span><<span style="color:#2b91af;">T</span>>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">Item</span>, <span style="color:blue;">byte</span> <span style="font-weight:bold;color:#1f377f;">Priority</span>);</pre>
</p>
<p>
If we squint a little and consider only the parameter list, we may realize that this is fundamentally an 'embellished' tuple: <code>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">Item</span>, <span style="color:blue;">byte</span> <span style="font-weight:bold;color:#1f377f;">Priority</span>)</code>. <a href="/2018/12/31/tuple-bifunctor">A pair forms a bifunctor</a>, but in the <a href="https://www.haskell.org/">Haskell</a> <code>Prelude</code> a tuple is also a <code>Functor</code> instance over its rightmost element. In other words, if we'd swapped the <code><span style="color:#2b91af;">Prioritized</span><<span style="color:#2b91af;">T</span>></code> constructor parameters, it might have naturally looked like something we could <code>fmap</code>:
</p>
<p>
<pre>ghci> fmap (elem 'r') (55, "foo")
(55,False)</pre>
</p>
<p>
Here we have a tuple of an integer and a string. Imagine that the number <code>55</code> is the priority that we give to the label <code>"foo"</code>. This little ad-hoc example demonstrates how to map that tuple to another tuple with a priority, but now it instead holds a Boolean value indicating whether or not the string contained the character <code>'r'</code> (which it didn't).
</p>
<p>
You can easily swap the elements:
</p>
<p>
<pre>ghci> import Data.Tuple
ghci> swap (55, "foo")
("foo",55)</pre>
</p>
<p>
This looks just like the <code><span style="color:#2b91af;">Prioritized</span><<span style="color:#2b91af;">T</span>></code> parameter list. This also implies that if you originally have the parameter list in that order, you could <code>swap</code> it, map it, and swap it again:
</p>
<p>
<pre>ghci> swap $ fmap (elem 'r') $ swap ("foo", 55)
(False,55)</pre>
</p>
<p>
My point is only that <code><span style="color:#2b91af;">Prioritized</span><<span style="color:#2b91af;">T</span>></code> is isomorphic to a known functor. In reality you rarely need to analyze things that thoroughly to come to that realization, but the bottom line is that you can give <code><span style="color:#2b91af;">Prioritized</span><<span style="color:#2b91af;">T</span>></code> a lawful <code>Select</code> method:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">sealed</span> <span style="color:blue;">record</span> <span style="color:#2b91af;">Prioritized</span><<span style="color:#2b91af;">T</span>>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">Item</span>, <span style="color:blue;">byte</span> <span style="font-weight:bold;color:#1f377f;">Priority</span>)
{
    <span style="color:blue;">public</span> <span style="color:#2b91af;">Prioritized</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#74531f;">Select</span><<span style="color:#2b91af;">TResult</span>>(<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
    {
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>(Item), Priority);
    }
}</pre>
</p>
<p>
Hardly surprising, but since this article postulates that a functor of a functor is a functor, and since we already know that collections give rise to a functor, we should deduce that we can give <code><span style="color:#2b91af;">PriorityCollection</span><<span style="color:#2b91af;">T</span>></code> a <code>Select</code> method. And we can:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">PriorityCollection</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#74531f;">Select</span><<span style="color:#2b91af;">TResult</span>>(<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">PriorityCollection</span><<span style="color:#2b91af;">TResult</span>>(
        priorities.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">p</span> => <span style="font-weight:bold;color:#1f377f;">p</span>.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>)).<span style="font-weight:bold;color:#74531f;">ToArray</span>());
}</pre>
</p>
<p>
Notice how much this implementation looks like the above <code><span style="color:#2b91af;">GInF</span><<span style="color:#2b91af;">T</span>></code> 'shape' implementation.
</p>
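<p>
For comparison, here's a minimal Haskell sketch of the same composition. The type and constructor names mirror the C# classes, but they're assumptions made for illustration; they aren't taken from the priority collection article, and the sketch ignores any invariants that the C# class may enforce.
</p>
<p>
<pre>{-# LANGUAGE DeriveFunctor #-}

import Data.Word (Word8)

-- Hypothetical counterpart to Prioritized<T>: an item paired with a priority.
-- The derived Functor instance maps the item and leaves the priority alone.
data Prioritized a = Prioritized a Word8
  deriving (Eq, Show, Functor)

-- Hypothetical counterpart to PriorityCollection<T>: a list of prioritized items.
newtype PriorityCollection a = PriorityCollection [Prioritized a]
  deriving (Eq, Show)

instance Functor PriorityCollection where
  -- Outer fmap: the list functor. Inner fmap: the Prioritized functor.
  fmap f (PriorityCollection ps) = PriorityCollection (fmap (fmap f) ps)</pre>
</p>
<p>
Mapping <code>length</code> over a collection of prioritized strings would replace each string with its length, while every priority stays where it was.
</p>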
<h3 id="32b4e828d4584c3d8cda81a9682aee34">
Tree <a href="#32b4e828d4584c3d8cda81a9682aee34">#</a>
</h3>
<p>
An example only marginally more complicated than the above is shown in <a href="/2018/08/06/a-tree-functor">A Tree functor</a>. The <code><span style="color:#2b91af;">Tree</span><<span style="color:#2b91af;">T</span>></code> class shown in that article contains two constituents:
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">Tree</span><<span style="color:#2b91af;">T</span>>> children;
<span style="color:blue;">public</span> <span style="color:#2b91af;">T</span> Item { <span style="color:blue;">get</span>; }</pre>
</p>
<p>
Just like <code><span style="color:#2b91af;">PriorityCollection</span><<span style="color:#2b91af;">T</span>></code> there's a collection, as well as a 'naked' <code>T</code> value. The main difference is that here, the collection is of the same type as the object itself: <code><span style="color:#2b91af;">Tree</span><<span style="color:#2b91af;">T</span>></code>.
</p>
<p>
You've seen a similar example in <a href="/2024/10/14/functor-sums">the previous article</a>, which also had a recursive data structure. If you assume, however, that <code><span style="color:#2b91af;">Tree</span><<span style="color:#2b91af;">T</span>></code> gives rise to a functor, then so does the nested composition of putting it in a collection. This means, from the 'theorem' put forth in this article, that <code><span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">Tree</span><<span style="color:#2b91af;">T</span>>></code> composes as a functor. Finally you have a product of a <code>T</code> (which is isomorphic to the <a href="/2018/09/03/the-identity-functor">Identity functor</a>) and that composed functor. From <a href="/2024/09/16/functor-products">Functor products</a> it follows that that's a functor too, which explains why <code><span style="color:#2b91af;">Tree</span><<span style="color:#2b91af;">T</span>></code> forms a functor. <a href="/2018/08/06/a-tree-functor">The article</a> shows the <code>Select</code> implementation.
</p>
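<p>
As a rough illustration of that argument, a Haskell rendition might look like the following. This is a sketch with assumed names, not the code from the Tree functor article.
</p>
<p>
<pre>-- Hypothetical Haskell rendition of Tree<T>: a 'naked' item and a list of subtrees.
data Tree a = Tree a [Tree a]
  deriving (Eq, Show)

instance Functor Tree where
  -- The item is mapped directly (the Identity part of the product).
  -- The children are mapped with the list's fmap wrapped around Tree's own fmap
  -- (the functor-in-a-functor part of the product).
  fmap f (Tree x children) = Tree (f x) (fmap (fmap f) children)</pre>
</p>
<p>
The recursive call to <code>fmap</code> on each child is where the assumption that <code>Tree</code> is already a functor enters the picture.
</p>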
<h3 id="17209725eab64da598ba924342dafbd0">
Binary tree Zipper <a href="#17209725eab64da598ba924342dafbd0">#</a>
</h3>
<p>
In both previous articles you've seen pieces of the puzzle explaining why the <a href="/2024/09/09/a-binary-tree-zipper-in-c">binary tree Zipper</a> gives rise to a functor. There's one missing piece, however, that we can now finally address.
</p>
<p>
Recall that <code><span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>></code> composes these two objects:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> Tree { <span style="color:blue;">get</span>; }
<span style="color:blue;">public</span> <span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>>> Breadcrumbs { <span style="color:blue;">get</span>; }</pre>
</p>
<p>
We've already established that both <code><span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>></code> and <code><span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>></code> form functors. In this article you've learned that a functor in a functor is a functor, which applies to <code><span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>>></code>. Both of the above read-only properties are functors, then, which means that the entire class is a product of functors. The <code>Select</code> method follows:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#74531f;">Select</span><<span style="color:#2b91af;">TResult</span>>(<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">TResult</span>>(
        Tree.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>),
        Breadcrumbs.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">c</span> => <span style="font-weight:bold;color:#1f377f;">c</span>.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>)));
}</pre>
</p>
<p>
Notice that this <code>Select</code> implementation calls <code>Select</code> on the 'outer' <code>Breadcrumbs</code> by calling <code>Select</code> on each <code><span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>></code>. This is similar to the previous examples in this article.
</p>
<h3 id="800728c4c9c54aec815c62352843d52b">
Other nested containers <a href="#800728c4c9c54aec815c62352843d52b">#</a>
</h3>
<p>
There are plenty of other examples of functors that contain other functor values. Asynchronous programming supplies its own family of examples.
</p>
<p>
The way that C# and many other languages model asynchronous or I/O-bound actions is to wrap them in a <a href="https://learn.microsoft.com/dotnet/api/system.threading.tasks.task-1">Task</a> container. If the value inside the <code>Task<T></code> container is itself a functor, you can make that a functor, too. Examples include <code>Task<IEnumerable<T>></code>, <code>Task<Maybe<T>></code> (or its close cousin <code>Task<T?></code>; notice <a href="https://learn.microsoft.com/dotnet/csharp/language-reference/builtin-types/nullable-reference-types">the question mark</a>), <code>Task<Result<T1, T2>></code>, etc. You'll run into such types every time you have an I/O-bound or concurrent operation that returns <code>IEnumerable<T></code>, <code>Maybe<T></code> etc. as an asynchronous result.
</p>
<p>
While you <em>can</em> make such a nested task functor a functor in its own right, you rarely need that in languages with native <code>async</code> and <code>await</code> features, since those languages nudge you in other directions.
</p>
<p>
You can, however, run into other issues with task-based programming, but you'll see examples and solutions in <a href="/2024/11/11/traversals">a future article</a>.
</p>
<p>
You'll run into other examples of nested containers with many property-based testing libraries. They typically define <a href="/2017/09/18/the-test-data-generator-functor">Test Data Generators</a>, often called <code>Gen<T></code>. For .NET, <a href="https://fscheck.github.io/FsCheck/">FsCheck</a>, <a href="https://github.com/hedgehogqa/fsharp-hedgehog">Hedgehog</a>, and <a href="https://github.com/AnthonyLloyd/CsCheck">CsCheck</a> all do this. For Haskell, <a href="https://hackage.haskell.org/package/QuickCheck">QuickCheck</a>, too, defines <code>Gen a</code>.
</p>
<p>
You often need to generate random collections, in which case you'd work with <code>Gen<IEnumerable<T>></code> or a similar collection type. If you need random <a href="/2018/03/26/the-maybe-functor">Maybe</a> values, you'll work with <code>Gen<Maybe<T>></code>, and so on.
</p>
<p>
On the other hand, <a href="/2016/06/28/roman-numerals-via-property-based-tdd">sometimes you need</a> to work with a collection of generators, such as <code>seq<Gen<'a>></code>.
</p>
<p>
These are all examples of functors within functors. It's not a given that you <em>must</em> treat such a combination as a functor in its own right. To be honest, typically, you don't. On the other hand, if you find yourself writing <code>Select</code> within <code>Select</code>, or <code>map</code> within <code>map</code>, depending on your language, it might make your code more succinct and readable if you give that combination a specialized functor affordance.
</p>
<h3 id="bffe8909eb904260be8aa4ab1a22efb2">
Higher arities <a href="#bffe8909eb904260be8aa4ab1a22efb2">#</a>
</h3>
<p>
Like the previous two articles, the 'theorem' presented here generalizes to more than two functors. If you have a third <code>H</code> functor, then <code>F<G<H<T>>></code> also gives rise to a functor. You can easily prove this by simple induction. We may first consider the base case. With a single functor (<em>n = 1</em>) any functor (say, <code>F</code>) is trivially a functor.
</p>
<p>
In the induction step (<em>n > 1</em>), you then assume that the <em>n - 1</em> 'stack' of functors already gives rise to a functor, and then proceed to prove that the configuration where all those nested functors are wrapped by yet another functor also forms a functor. Since the 'inner stack' of functors forms a functor (by assumption), you only need to prove that a configuration of the outer functor, and that 'inner stack', gives rise to a functor. You've seen how this works in this article, but I admit that a few examples constitute no proof. I'll leave you with only a sketch of this step, but you may consider using equational reasoning <a href="https://bartoszmilewski.com/2015/01/20/functors/">as demonstrated by Bartosz Milewski</a> and then prove the functor laws for such a composition.
</p>
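<p>
For what it's worth, the equational reasoning for that step might run along the following lines. Treat it as a sketch rather than a formal proof. The name <code>mapComposed</code> is made up for the occasion, and the derivation assumes that both the outer functor <code>f</code> and the 'inner stack' <code>g</code> already obey the functor laws.
</p>
<p>
<pre>-- Hypothetical name for the composed mapping: the outer functor's fmap
-- applied to the inner functor's fmap.
mapComposed :: (Functor f, Functor g) => (a -> b) -> f (g a) -> f (g b)
mapComposed = fmap . fmap

-- Identity law:
--   mapComposed id
-- = fmap (fmap id)                   -- definition of mapComposed
-- = fmap id                          -- identity law for g
-- = id                               -- identity law for f
--
-- Composition law:
--   mapComposed (h . k)
-- = fmap (fmap (h . k))              -- definition of mapComposed
-- = fmap (fmap h . fmap k)           -- composition law for g
-- = fmap (fmap h) . fmap (fmap k)    -- composition law for f
-- = mapComposed h . mapComposed k    -- definition of mapComposed</pre>
</p>
<p>
Since the 'inner stack' counts as a single functor by the induction hypothesis, the same two derivations cover a stack of any height.
</p>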
<p>
The Haskell <a href="https://hackage.haskell.org/package/base/docs/Data-Functor-Compose.html">Data.Functor.Compose</a> module defines a general-purpose data type to compose functors. You may, for example, compose a tuple inside a Maybe inside a list:
</p>
<p>
<pre><span style="color:#2b91af;">thriceNested</span> <span style="color:blue;">::</span> <span style="color:blue;">Compose</span> [] (<span style="color:blue;">Compose</span> <span style="color:#2b91af;">Maybe</span> ((,) <span style="color:#2b91af;">Integer</span>)) <span style="color:#2b91af;">String</span>
thriceNested = Compose [Compose (Just (42, <span style="color:#a31515;">"foo"</span>)), Compose Nothing, Compose (Just (89, <span style="color:#a31515;">"ba"</span>))]</pre>
</p>
<p>
You can easily <code>fmap</code> that data structure, for example by evaluating whether the number of characters in each string is an odd number (if it's there at all):
</p>
<p>
<pre>ghci> fmap (odd . length) thriceNested
Compose [Compose (Just (42,True)),Compose Nothing,Compose (Just (89,False))]</pre>
</p>
<p>
The first element now has <code>True</code> as the second tuple element, since <code>"foo"</code> has an odd number of characters (3). The next element is <code>Nothing</code>, because <code>Nothing</code> maps to <code>Nothing</code>. The third element has <code>False</code> in the rightmost tuple element, since <code>"ba"</code> doesn't have an odd number of characters (it has 2).
</p>
<h3 id="8c6ca7bcdc554856bee94bd11981aa6f">
Relations to monads <a href="#8c6ca7bcdc554856bee94bd11981aa6f">#</a>
</h3>
<p>
A nested 'stack' of functors may remind you of the way that I prefer to teach <a href="/2022/03/28/monads">monads</a>: <em>A monad is a functor you can flatten</em>. In short, the definition is the ability to 'flatten' <code>F<F<T>></code> to <code>F<T></code>. A function that can do that is often called <code>join</code> or <code>Flatten</code>.
</p>
<p>
So far in this article, we've been looking at stacks of different functors, abstractly denoted <code>F<G<T>></code>. There's no rule, however, that says that <code>F</code> and <code>G</code> may not be the same. If <code>F = G</code> then <code>F<G<T>></code> is really <code>F<F<T>></code>. This starts to look like the <a href="https://en.wikipedia.org/wiki/Antecedent_(logic)">antecedent</a> of the monad definition.
</p>
<p>
While the starting point may be the same, these notions are not equivalent. Yes, <code>F<F<T>></code> <em>may</em> form a monad (if you can flatten it), but it does, universally, give rise to a functor. On the other hand, we can hardly talk about flattening <code>F<G<T>></code>, because that would imply that you'd have to somehow 'throw away' either <code>F</code> or <code>G</code>. There may be specific functors (e.g. Identity) for which this is possible, but there's no universal law to that effect.
</p>
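<p>
A small Haskell sketch may make the distinction concrete. The value names are only illustrative; <code>join</code> is the standard function from <code>Control.Monad</code>.
</p>
<p>
<pre>import Control.Monad (join)

-- When both layers are the same monad, the stack can be flattened.
flatList :: [Int]
flatList = join [[1, 2], [3]]

flatMaybe :: Maybe Int
flatMaybe = join (Just (Just 42))

-- A stack of two different functors can still be mapped...
mixed :: [Maybe Int]
mixed = fmap (fmap (+ 1)) [Just 1, Nothing]

-- ...but it can't be flattened with join. The following line doesn't type-check,
-- because [] and Maybe are different type constructors:
-- flatMixed = join mixed</pre>
</p>
<p>
You can write specialized functions that collapse particular combinations, but that's a property of those specific types, not of nested functors in general.
</p>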
<p>
Not all 'stacks' of functors are monads. <a href="/2022/03/28/monads">All monads, on the other hand, are functors</a>.
</p>
<h3 id="14f39729b7ab426e83a35a067cf8f3a1">
Conclusion <a href="#14f39729b7ab426e83a35a067cf8f3a1">#</a>
</h3>
<p>
A data structure that configures one type of functor inside of another functor itself forms a functor. The examples shown in this article are mostly constrained to two functors, but if you have a 'stack' of three, four, or more functors, that arrangement still gives rise to a functor.
</p>
<p>
This is useful to know, particularly if you're working in a language with only partial support for functors. Mainstream languages aren't going to automatically turn such stacks into functors, in the way that Haskell's <code>Compose</code> container almost does. Thus, knowing when you can safely give your generic types a <code>Select</code> method or <code>map</code> function may come in handy.
</p>
<p>
To be honest, though, this result is hardly the most important 'theorem' concerning stacks of functors. In reality, you often run into situations where you <em>do</em> have a stack of functors, but they're in the wrong order. You may have a collection of asynchronous tasks, but you really need an asynchronous task that contains a collection of values. The next article addresses that problem.
</p>
<p>
<strong>Next:</strong> <a href="/2024/11/11/traversals">Traversals</a>.
</p>
</div><hr>
Legacy Security Manager in Haskellhttps://blog.ploeh.dk/2024/10/21/legacy-security-manager-in-haskell2024-10-21T06:14:00+00:00Mark Seemann
<div id="post">
<p>
<em>A translation of the kata, and my first attempt at it.</em>
</p>
<p>
In early 2013 Richard Dalton published an article about <a href="https://www.devjoy.com/blog/legacy-code-katas/">legacy code katas</a>. The idea is to present a piece of 'legacy code' that you have to somehow refactor or improve. Of course, in order to make the exercise manageable, it's necessary to reduce it to some essence of what we might regard as legacy code. It'll only be one aspect of true legacy code. For the legacy Security Manager exercise, the main problem is that the code is difficult to unit test.
</p>
<p>
The original kata presents the 'legacy code' in C#, which may exclude programmers who aren't familiar with that language and platform. Since I find the exercise useful, I've previously published <a href="https://github.com/ploeh/SecurityManagerPython">a port to Python</a>. In this article, I'll port the exercise to <a href="https://www.haskell.org/">Haskell</a>, as well as walk through one attempt at achieving the goals of the kata.
</p>
<h3 id="03ee8805b5a44e77b92f9f6d132513bf">
The legacy code <a href="#03ee8805b5a44e77b92f9f6d132513bf">#</a>
</h3>
<p>
The original C# code is a <code>static</code> procedure that uses the <a href="https://learn.microsoft.com/dotnet/api/system.console">Console</a> API to ask a user a few simple questions, do some basic input validation, and print a message to the standard output stream. That's easy enough to port to Haskell:
</p>
<p>
<pre><span style="color:blue;">module</span> SecurityManager (<span style="color:#2b91af;">createUser</span>) <span style="color:blue;">where</span>
<span style="color:blue;">import</span> Text.Printf (<span style="color:#2b91af;">printf</span>)
<span style="color:#2b91af;">createUser</span> <span style="color:blue;">::</span> <span style="color:#2b91af;">IO</span> ()
createUser = <span style="color:blue;">do</span>
  <span style="color:blue;">putStrLn</span> <span style="color:#a31515;">"Enter a username"</span>
  username <- <span style="color:blue;">getLine</span>
  <span style="color:blue;">putStrLn</span> <span style="color:#a31515;">"Enter your full name"</span>
  fullName <- <span style="color:blue;">getLine</span>
  <span style="color:blue;">putStrLn</span> <span style="color:#a31515;">"Enter your password"</span>
  password <- <span style="color:blue;">getLine</span>
  <span style="color:blue;">putStrLn</span> <span style="color:#a31515;">"Re-enter your password"</span>
  confirmPassword <- <span style="color:blue;">getLine</span>
  <span style="color:blue;">if</span> password /= confirmPassword
    <span style="color:blue;">then</span>
      <span style="color:blue;">putStrLn</span> <span style="color:#a31515;">"The passwords don't match"</span>
    <span style="color:blue;">else</span>
      <span style="color:blue;">if</span> <span style="color:blue;">length</span> password < 8
        <span style="color:blue;">then</span>
          <span style="color:blue;">putStrLn</span> <span style="color:#a31515;">"Password must be at least 8 characters in length"</span>
        <span style="color:blue;">else</span> <span style="color:blue;">do</span>
          <span style="color:green;">-- Encrypt the password (just reverse it, should be secure)</span>
          <span style="color:blue;">let</span> array = <span style="color:blue;">reverse</span> password
          <span style="color:blue;">putStrLn</span> $
            printf <span style="color:#a31515;">"Saving Details for User (%s, %s, %s)"</span> username fullName array</pre>
</p>
<p>
Notice how the Haskell code seems to suffer slightly from the <a href="https://wiki.c2.com/?ArrowAntiPattern">Arrow code smell</a>, which is a problem that the C# code actually doesn't exhibit. The reason is that when using Haskell in an 'imperative style' (which you can, after a fashion, with <code>do</code> notation), you can't 'exit early' from an <code>if</code> check. The problem is that you can't have <code>if</code>-<code>then</code> without <code>else</code>.
</p>
<p>
Haskell has other language features that enable you to get rid of Arrow code, but in the spirit of the exercise, this would take us too far away from the original C# code. Making the code prettier should be a task for the refactoring exercise, rather than the starting point.
</p>
<p>
I've <a href="https://github.com/ploeh/SecurityManagerHaskell">published the code to GitHub</a>, if you want a leg up.
</p>
<p>
Combined with Richard Dalton's original article, that's all you need to try your hand at the exercise. In the rest of this article, I'll go through my own attempt at the exercise. That said, while this was my first attempt at the Haskell version of it, I've done it multiple times in C#, and once in <a href="https://www.python.org/">Python</a>. In other words, this isn't my first rodeo.
</p>
<h3 id="b5098b724e8443c4afeaa56e92c2f0d2">
Break the dependency on the Console <a href="#b5098b724e8443c4afeaa56e92c2f0d2">#</a>
</h3>
<p>
As warned, the rest of the article is a walkthrough of the exercise, so if you'd like to try it yourself, stop reading now. On the other hand, if you want to read on, but follow along in the GitHub repository, I've pushed the rest of the code to a branch called <code>first-pass</code>.
</p>
<p>
The first part of the exercise is to <em>break the dependency on the console</em>. In a language like Haskell where functions are first-class citizens, this part is trivial. I removed the type declaration, moved <code>putStrLn</code> and <code>getLine</code> to parameters and renamed them. Finally, I asked the compiler what the new type is, and added the new type signature.
</p>
<p>
<pre><span style="color:blue;">import</span> Text.Printf (<span style="color:#2b91af;">printf</span>, <span style="color:blue;">IsChar</span>)
<span style="color:#2b91af;">createUser</span> <span style="color:blue;">::</span> (<span style="color:blue;">Monad</span> m, <span style="color:blue;">Eq</span> a, <span style="color:blue;">IsChar</span> a) <span style="color:blue;">=></span> (<span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> m ()) <span style="color:blue;">-></span> m [a] <span style="color:blue;">-></span> m ()
createUser writeLine readLine = <span style="color:blue;">do</span>
  writeLine <span style="color:#a31515;">"Enter a username"</span>
  username <- readLine
  writeLine <span style="color:#a31515;">"Enter your full name"</span>
  fullName <- readLine
  writeLine <span style="color:#a31515;">"Enter your password"</span>
  password <- readLine
  writeLine <span style="color:#a31515;">"Re-enter your password"</span>
  confirmPassword <- readLine
  <span style="color:blue;">if</span> password /= confirmPassword
    <span style="color:blue;">then</span>
      writeLine <span style="color:#a31515;">"The passwords don't match"</span>
    <span style="color:blue;">else</span>
      <span style="color:blue;">if</span> <span style="color:blue;">length</span> password < 8
        <span style="color:blue;">then</span>
          writeLine <span style="color:#a31515;">"Password must be at least 8 characters in length"</span>
        <span style="color:blue;">else</span> <span style="color:blue;">do</span>
          <span style="color:green;">-- Encrypt the password (just reverse it, should be secure)</span>
          <span style="color:blue;">let</span> array = <span style="color:blue;">reverse</span> password
          writeLine $
            printf <span style="color:#a31515;">"Saving Details for User (%s, %s, %s)"</span> username fullName array</pre>
</p>
<p>
I also changed the <code>main</code> action of the program to pass <code>putStrLn</code> and <code>getLine</code> as arguments:
</p>
<p>
<pre><span style="color:blue;">import</span> SecurityManager (<span style="color:#2b91af;">createUser</span>)
<span style="color:#2b91af;">main</span> <span style="color:blue;">::</span> <span style="color:#2b91af;">IO</span> ()
main = createUser <span style="color:blue;">putStrLn</span> <span style="color:blue;">getLine</span></pre>
</p>
<p>
Manual testing indicates that I didn't break any functionality.
</p>
<h3 id="53e3144fa5b04528a8d54ae035dc40b8">
Get the password comparison feature under test <a href="#53e3144fa5b04528a8d54ae035dc40b8">#</a>
</h3>
<p>
The next task is to <em>get the password comparison feature under test</em>. Over a small series of Git commits, I added these <a href="/2018/05/07/inlined-hunit-test-lists">inlined, parametrized HUnit tests</a>:
</p>
<p>
<pre><span style="color:#a31515;">"Matching passwords"</span> ~: <span style="color:blue;">do</span>
  pw <- [<span style="color:#a31515;">"password"</span>, <span style="color:#a31515;">"12345678"</span>, <span style="color:#a31515;">"abcdefgh"</span>]
  <span style="color:blue;">let</span> actual = comparePasswords pw pw
  <span style="color:blue;">return</span> $ Right pw ~=? actual
,
<span style="color:#a31515;">"Non-matching passwords"</span> ~: <span style="color:blue;">do</span>
  (pw1, pw2) <-
    [
      (<span style="color:#a31515;">"password"</span>, <span style="color:#a31515;">"PASSWORD"</span>),
      (<span style="color:#a31515;">"12345678"</span>, <span style="color:#a31515;">"12345677"</span>),
      (<span style="color:#a31515;">"abcdefgh"</span>, <span style="color:#a31515;">"bacdefgh"</span>),
      (<span style="color:#a31515;">"aaa"</span>, <span style="color:#a31515;">"bbb"</span>)
    ]
  <span style="color:blue;">let</span> actual = comparePasswords pw1 pw2
  <span style="color:blue;">return</span> $ Left <span style="color:#a31515;">"The passwords don't match"</span> ~=? actual</pre>
</p>
<p>
The resulting implementation is this <code>comparePasswords</code> function:
</p>
<p>
<pre><span style="color:#2b91af;">comparePasswords</span> <span style="color:blue;">::</span> <span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Either</span> <span style="color:#2b91af;">String</span> <span style="color:#2b91af;">String</span>
comparePasswords pw1 pw2 =
  <span style="color:blue;">if</span> pw1 == pw2
    <span style="color:blue;">then</span> Right pw1
    <span style="color:blue;">else</span> Left <span style="color:#a31515;">"The passwords don't match"</span></pre>
</p>
<p>
You'll notice that I chose to implement it as an <code>Either</code>-valued function. While I consider <a href="/2020/12/14/validation-a-solved-problem">validation a solved problem</a>, the usual solution involves some <a href="/2018/11/05/applicative-validation">applicative validation</a> container. In this exercise, validation is already short-circuiting, which means that we can use the standard monadic composition that <code>Either</code> affords.
</p>
<p>
At this point in the exercise, I just left the <code>comparePasswords</code> function there, without trying to use it within <code>createUser</code>. The reason for that is that <code>Either</code>-based composition is sufficiently different from <code>if</code>-<code>then</code>-<code>else</code> code that I wanted to get the entire system under test before I attempted that.
</p>
<h3 id="a1dc5d33f8eb4d5b80d015b197d1afc3">
Get the password validation feature under test <a href="#a1dc5d33f8eb4d5b80d015b197d1afc3">#</a>
</h3>
<p>
The third task of the exercise is to <em>get the password validation feature under test</em>. That's similar to the previous task. Once more, I'll show the tests first, and then the function driven by those tests, but I want to point out that both code artefacts came iteratively into existence through the usual <a href="/2019/10/21/a-red-green-refactor-checklist">red-green-refactor</a> cycle.
</p>
<p>
<pre><span style="color:#a31515;">"Validate short password"</span> ~: <span style="color:blue;">do</span>
  pw <- [<span style="color:#a31515;">""</span>, <span style="color:#a31515;">"1"</span>, <span style="color:#a31515;">"12"</span>, <span style="color:#a31515;">"abc"</span>, <span style="color:#a31515;">"1234"</span>, <span style="color:#a31515;">"gtrex"</span>, <span style="color:#a31515;">"123456"</span>, <span style="color:#a31515;">"1234567"</span>]
  <span style="color:blue;">let</span> actual = validatePassword pw
  <span style="color:blue;">return</span> $ Left <span style="color:#a31515;">"Password must be at least 8 characters in length"</span> ~=? actual
,
<span style="color:#a31515;">"Validate long password"</span> ~: <span style="color:blue;">do</span>
  pw <- [<span style="color:#a31515;">"12345678"</span>, <span style="color:#a31515;">"123456789"</span>, <span style="color:#a31515;">"abcdefghij"</span>, <span style="color:#a31515;">"elevenchars"</span>]
  <span style="color:blue;">let</span> actual = validatePassword pw
  <span style="color:blue;">return</span> $ Right pw ~=? actual</pre>
</p>
<p>
The resulting function is hardly surprising.
</p>
<p>
<pre><span style="color:#2b91af;">validatePassword</span> <span style="color:blue;">::</span> <span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Either</span> <span style="color:#2b91af;">String</span> <span style="color:#2b91af;">String</span>
validatePassword pw =
  <span style="color:blue;">if</span> <span style="color:blue;">length</span> pw < 8
    <span style="color:blue;">then</span> Left <span style="color:#a31515;">"Password must be at least 8 characters in length"</span>
    <span style="color:blue;">else</span> Right pw</pre>
</p>
<p>
As in the previous step, I chose to postpone <em>using</em> this function from within <code>createUser</code> until I had a set of characterization tests. That may not be entirely in the spirit of the four subtasks of the exercise, but on the other hand, I intended to do more than just those four activities. The code here is actually simple enough that I could easily refactor without full test coverage, but recalling that this is a legacy code exercise, I find it warranted to <em>pretend</em> that it's complicated.
</p>
<p>
To be fair to the exercise, there'd <em>also</em> be a valuable exercise in attempting to extract each feature piecemeal, because it's not always possible to add complete characterization test coverage to a piece of gnarly legacy code. Be that as it may, I've already done that kind of exercise in C# a few times, and I had a different agenda for the Haskell exercise. In short, I was curious about what sort of inferred type <code>createUser</code> would have, once I'd gone through all four subtasks. I'll return to that topic in a moment. First, I want to address the fourth subtask.
</p>
<h3 id="dc17b82e5e374cce8d59e2791eadfdfb">
Allow different encryption algorithms to be used <a href="#dc17b82e5e374cce8d59e2791eadfdfb">#</a>
</h3>
<p>
The final part of the exercise is to <em>add a feature to allow different encryption algorithms to be used</em>. Once again, when you're working in a language where functions are first-class citizens, and <a href="https://en.wikipedia.org/wiki/Higher-order_function">higher-order functions</a> are <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a>, one solution is easily at hand:
</p>
<p>
<pre><span style="color:#2b91af;">createUser</span> <span style="color:blue;">::</span> (<span style="color:blue;">Monad</span> m, <span style="color:blue;">Foldable</span> t, <span style="color:blue;">Eq</span> (t a), <span style="color:blue;">PrintfArg</span> (t a), <span style="color:blue;">PrintfArg</span> b)
           <span style="color:blue;">=></span> (<span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> m ()) <span style="color:blue;">-></span> m (t a) <span style="color:blue;">-></span> (t a <span style="color:blue;">-></span> b) <span style="color:blue;">-></span> m ()
createUser writeLine readLine encrypt = <span style="color:blue;">do</span>
  writeLine <span style="color:#a31515;">"Enter a username"</span>
  username <- readLine
  writeLine <span style="color:#a31515;">"Enter your full name"</span>
  fullName <- readLine
  writeLine <span style="color:#a31515;">"Enter your password"</span>
  password <- readLine
  writeLine <span style="color:#a31515;">"Re-enter your password"</span>
  confirmPassword <- readLine
  <span style="color:blue;">if</span> password /= confirmPassword
    <span style="color:blue;">then</span>
      writeLine <span style="color:#a31515;">"The passwords don't match"</span>
    <span style="color:blue;">else</span>
      <span style="color:blue;">if</span> <span style="color:blue;">length</span> password < 8
        <span style="color:blue;">then</span>
          writeLine <span style="color:#a31515;">"Password must be at least 8 characters in length"</span>
        <span style="color:blue;">else</span> <span style="color:blue;">do</span>
          <span style="color:blue;">let</span> array = encrypt password
          writeLine $
            printf <span style="color:#a31515;">"Saving Details for User (%s, %s, %s)"</span> username fullName array</pre>
</p>
<p>
The only change I've made is to promote <code>encrypt</code> to a parameter. This, of course, ripples through the code that calls the action, but currently, that's only the <code>main</code> action, where I had to add <code>reverse</code> as a third argument:
</p>
<p>
<pre><span style="color:#2b91af;">main</span> <span style="color:blue;">::</span> <span style="color:#2b91af;">IO</span> ()
main = createUser <span style="color:blue;">putStrLn</span> <span style="color:blue;">getLine</span> <span style="color:blue;">reverse</span></pre>
</p>
<p>
Before I made the change, I removed the type annotation from <code>createUser</code>, because adding a parameter causes the type to change. Keeping the type annotation would have caused a compilation error. Eschewing type annotations makes it easier to make changes. Once I'd made the change, I added the new annotation, inferred by the <a href="https://marketplace.visualstudio.com/items?itemName=haskell.haskell">Haskell Visual Studio Code extension</a>.
</p>
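<p>
To illustrate what promoting <code>encrypt</code> to a parameter buys, here's a minimal sketch. It's not code from the repository: <code>savingDetails</code> is a hypothetical stand-in for the last line of <code>createUser</code>, and <code>rot13</code> is a toy alternative to <code>reverse</code>. Neither is real encryption.
</p>
<p>
<pre>module Main where

import Data.Char (chr, isAsciiLower, ord)
import Text.Printf (PrintfArg, printf)

-- Hypothetical stand-in for the final step of createUser:
-- the 'encryption' algorithm is just another argument.
savingDetails :: PrintfArg a => (String -> a) -> String -> String -> String -> String
savingDetails encrypt username fullName password =
  printf "Saving Details for User (%s, %s, %s)" username fullName (encrypt password)

-- Toy alternative 'algorithm' (an assumption for illustration only).
-- Rotates lower-case ASCII letters by 13 places; leaves everything else alone.
rot13 :: String -> String
rot13 = map rotate
  where
    rotate c
      | isAsciiLower c = chr ((ord c - ord 'a' + 13) `mod` 26 + ord 'a')
      | otherwise = c

main :: IO ()
main = do
  putStrLn $ savingDetails reverse "just.inhale" "Justin Hale" "12345678"
  putStrLn $ savingDetails rot13 "i.lean.right" "Ilene Wright" "password"</pre>
</p>
<p>
The caller picks the algorithm, and the function that uses it doesn't need to change.
</p>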
<p>
I was curious what kind of abstraction would arise. Would it be testable in some way?
</p>
<h3 id="da305705261f4c1fae7842d204097c6b">
Testability <a href="#da305705261f4c1fae7842d204097c6b">#</a>
</h3>
<p>
Consider the inferred type of <code>createUser</code> above. It's quite abstract, and I was curious if it was flexible enough to allow testability without adding <a href="https://dhh.dk/2014/test-induced-design-damage.html">test-induced damage</a>. In short, in object-oriented programming, you often need to add Dependency Injection to make code testable, and the valid criticism is that this makes code more complicated than it would otherwise have been. I consider such reproval justified, although I disagree with the conclusion. It's not the desire for testability that causes the damage, but rather that object-oriented design is at odds with testability.
</p>
<p>
That's my conjecture, anyway, so I'm always curious when working with other paradigms like functional programming. Is idiomatic code already testable, or do you need to 'do damage to it' in order to make it testable?
</p>
<p>
As Haskell actions go, I would consider its type fairly idiomatic. The code, too, is straightforward, although perhaps rather naive. It looks like beginner Haskell, and as we'll see later, we can rewrite it to be more elegant.
</p>
<p>
Before I started the exercise, I wondered whether it'd be necessary to <a href="/2017/07/11/hello-pure-command-line-interaction">use free monads to model pure command-line interactions</a>. Since <code>createUser</code> returns <code>m ()</code>, where <code>m</code> is any <code>Monad</code> instance, using a free monad would be possible, but turns out to be overkill. After having thought about it a bit, I recalled that in many languages and platforms, you can <a href="https://stackoverflow.com/a/2139303/126014">redirect <em>standard in</em> and <em>standard out</em> for testing purposes</a>. The way you do that is typically by replacing each with some kind of text stream. Based on that knowledge, I thought I could use <a href="/2022/06/20/the-state-monad">the State monad</a> for characterization testing, with a list of strings for each text stream.
</p>
<p>
In other words, the code is already testable as it is. No test-induced damage here.
</p>
<h3 id="ae4ba5da448b4e248cb63f124b135834">
Characterization tests <a href="#ae4ba5da448b4e248cb63f124b135834">#</a>
</h3>
<p>
To use the State monad, I started by importing <a href="https://hackage.haskell.org/package/transformers/docs/Control-Monad-Trans-State-Lazy.html">Control.Monad.Trans.State.Lazy</a> into my test code. This enabled me to write the first characterization test:
</p>
<p>
<pre><span style="color:#a31515;">"Happy path"</span> ~: <span style="color:blue;">flip</span> evalState
    ([<span style="color:#a31515;">"just.inhale"</span>, <span style="color:#a31515;">"Justin Hale"</span>, <span style="color:#a31515;">"12345678"</span>, <span style="color:#a31515;">"12345678"</span>], <span style="color:blue;">[]</span>) $ <span style="color:blue;">do</span>
  <span style="color:blue;">let</span> writeLine x = modify (second (++ [x]))
  <span style="color:blue;">let</span> readLine = state (\(i, o) -> (<span style="color:blue;">head</span> i, (<span style="color:blue;">tail</span> i, o)))
  <span style="color:blue;">let</span> encrypt = <span style="color:blue;">reverse</span>
  createUser writeLine readLine encrypt
  actual <- gets <span style="color:blue;">snd</span>
  <span style="color:blue;">let</span> expected = [
        <span style="color:#a31515;">"Enter a username"</span>,
        <span style="color:#a31515;">"Enter your full name"</span>,
        <span style="color:#a31515;">"Enter your password"</span>,
        <span style="color:#a31515;">"Re-enter your password"</span>,
        <span style="color:#a31515;">"Saving Details for User (just.inhale, Justin Hale, 87654321)"</span>]
  <span style="color:blue;">return</span> $ expected ~=? actual</pre>
</p>
<p>
I consulted my earlier code from <a href="/2019/03/11/an-example-of-state-based-testing-in-haskell">An example of state-based testing in Haskell</a> instead of reinventing the wheel, so if you want a more detailed walkthrough, you may want to consult that article as well as this one.
</p>
<p>
The type of the state that the test makes use of is <code>([String], [String])</code>. As the lambda expression suggests by naming the elements <code>i</code> and <code>o</code>, the two string lists are used for input and output, respectively. The test starts with an 'input stream' populated by 'user input' values, corresponding to each of the four answers a user might give to the questions asked.
</p>
<p>
The <code>readLine</code> function works by pulling the <code>head</code> off the input list <code>i</code>, while on the other hand not touching the output list <code>o</code>. Its type is <code>State ([a], b) a</code>, compatible with <code>createUser</code>, which requires its <code>readLine</code> parameter to have the type <code>m (t a)</code>, where <code>m</code> is a <code>Monad</code> instance, and <code>t</code> a <code>Foldable</code> instance. The effective type turns out to be <code>t a ~ [Char] = String</code>, so that <code>readLine</code> effectively has the type <code>State ([String], b) String</code>. Since <code>State ([String], b)</code> is a <code>Monad</code> instance, it fits the <code>m</code> type argument of the requirement.
</p>
<p>
The same kind of reasoning applies to <code>writeLine</code>, which appends the input value to the 'output stream', which is the second list in the I/O tuple.
</p>
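<p>
To see the two 'streams' in isolation, consider this small self-contained sketch. The <code>Console</code> type alias and the <code>greet</code> action are hypothetical names introduced for illustration. <code>readLine</code> and <code>writeLine</code> are the same expressions as in the test, only with explicit type annotations, and with <code>second</code> imported from <code>Data.Bifunctor</code>, which is an assumption about where the test code gets it from.
</p>
<p>
<pre>module Main where

import Control.Monad.Trans.State.Lazy (State, modify, runState, state)
import Data.Bifunctor (second)

-- Hypothetical alias: the state is a pair of (input lines, output lines).
type Console = State ([String], [String])

-- Pull the next line off the input list; leave the output list untouched.
-- Like the test code, this is partial: it fails on an empty input list.
readLine :: Console String
readLine = state (\(i, o) -> (head i, (tail i, o)))

-- Append a line to the output list; leave the input list untouched.
writeLine :: String -> Console ()
writeLine x = modify (second (++ [x]))

-- Hypothetical action that uses both 'streams'.
greet :: Console ()
greet = do
  writeLine "Enter your name"
  name <- readLine
  writeLine ("Hello, " ++ name)

main :: IO ()
main = print (runState greet (["Ilene"], []))
-- prints ((),([],["Enter your name","Hello, Ilene"]))</pre>
</p>
<p>
After running the action, the input list is exhausted, and the output list holds everything that was 'written', in order.
</p>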
<p>
The test runs the <code>createUser</code> action and then checks that the output list contains the <code>expected</code> values.
</p>
<p>
A similar test verifies the behaviour when the passwords don't match:
</p>
<p>
<pre><span style="color:#a31515;">"Mismatched passwords"</span> ~: <span style="color:blue;">flip</span> evalState
    ([<span style="color:#a31515;">"i.lean.right"</span>, <span style="color:#a31515;">"Ilene Wright"</span>, <span style="color:#a31515;">"password"</span>, <span style="color:#a31515;">"Password"</span>], <span style="color:blue;">[]</span>) $ <span style="color:blue;">do</span>
  <span style="color:blue;">let</span> writeLine x = modify (second (++ [x]))
  <span style="color:blue;">let</span> readLine = state (\(i, o) -> (<span style="color:blue;">head</span> i, (<span style="color:blue;">tail</span> i, o)))
  <span style="color:blue;">let</span> encrypt = <span style="color:blue;">reverse</span>
  createUser writeLine readLine encrypt
  actual <- gets <span style="color:blue;">snd</span>
  <span style="color:blue;">let</span> expected = [
        <span style="color:#a31515;">"Enter a username"</span>,
        <span style="color:#a31515;">"Enter your full name"</span>,
        <span style="color:#a31515;">"Enter your password"</span>,
        <span style="color:#a31515;">"Re-enter your password"</span>,
        <span style="color:#a31515;">"The passwords don't match"</span>]
  <span style="color:blue;">return</span> $ expected ~=? actual</pre>
</p>
<p>
You can see the third and final characterization test in the GitHub repository.
</p>
<h3 id="ba7601efc69a4b929e738396588dc69a">
Refactored action <a href="#ba7601efc69a4b929e738396588dc69a">#</a>
</h3>
<p>
With <a href="/2015/11/16/code-coverage-is-a-useless-target-measure">full test coverage</a> I could proceed to refactor the <code>createUser</code> action, pulling in the two functions I'd test-driven into existence earlier:
</p>
<p>
<pre><span style="color:#2b91af;">createUser</span> <span style="color:blue;">::</span> (<span style="color:blue;">Monad</span> m, <span style="color:blue;">PrintfArg</span> a)
           <span style="color:blue;">=></span> (<span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> m ()) <span style="color:blue;">-></span> m <span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> (<span style="color:#2b91af;">String</span> <span style="color:blue;">-></span> a) <span style="color:blue;">-></span> m ()
createUser writeLine readLine encrypt = <span style="color:blue;">do</span>
  writeLine <span style="color:#a31515;">"Enter a username"</span>
  username <- readLine
  writeLine <span style="color:#a31515;">"Enter your full name"</span>
  fullName <- readLine
  writeLine <span style="color:#a31515;">"Enter your password"</span>
  password <- readLine
  writeLine <span style="color:#a31515;">"Re-enter your password"</span>
  confirmPassword <- readLine
  writeLine $ either
    <span style="color:blue;">id</span>
    (printf <span style="color:#a31515;">"Saving Details for User (%s, %s, %s)"</span> username fullName . encrypt)
    (validatePassword =<< comparePasswords password confirmPassword)</pre>
</p>
<p>
Because <code>createUser</code> now calls <code>comparePasswords</code> and <code>validatePassword</code>, the type of the overall composition is also more concrete. That's really just an artefact of my (misguided?) decision to give each of the two helper functions types that are more concrete than necessary.
</p>
<p>
As you can see, I left the initial call-and-response sequence intact, since I didn't feel that it needed improvement.
</p>
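<p>
The last expression in the refactored action relies on <code>Either</code> short-circuiting: <code>=<<</code> only passes the password on to <code>validatePassword</code> if <code>comparePasswords</code> returns a <code>Right</code> value, and <code>either</code> then collapses both outcomes to a single string. The following sketch isolates that pure part. The <code>report</code> function is a hypothetical name, and the <code>comparePasswords</code> definition is reconstructed for illustration under the assumption that it has the type <code>String -> String -> Either String String</code>.
</p>
<p>
<pre>module Main where

import Text.Printf (printf)

-- Assumed shape of the helper; reconstructed for this sketch.
comparePasswords :: String -> String -> Either String String
comparePasswords pw1 pw2 =
  if pw1 == pw2
    then Right pw2
    else Left "The passwords don't match"

validatePassword :: String -> Either String String
validatePassword pw =
  if length pw < 8
    then Left "Password must be at least 8 characters in length"
    else Right pw

-- Hypothetical helper: the pure decision at the end of createUser,
-- with reverse hard-coded as the 'encryption'.
report :: String -> String -> String -> String -> String
report username fullName password confirmPassword =
  either
    id
    (printf "Saving Details for User (%s, %s, %s)" username fullName . reverse)
    (validatePassword =<< comparePasswords password confirmPassword)

main :: IO ()
main = do
  putStrLn $ report "just.inhale" "Justin Hale" "12345678" "12345678"
  putStrLn $ report "i.lean.right" "Ilene Wright" "password" "Password"
  putStrLn $ report "short" "Sho Rt" "1234" "1234"</pre>
</p>
<p>
The first call takes the <code>Right</code> path all the way through, the second stops at the comparison, and the third passes the comparison but stops at the length check.
</p>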
<h3 id="5dcbfa4c67c64780a76dc380fb64b138">
Conclusion <a href="#5dcbfa4c67c64780a76dc380fb64b138">#</a>
</h3>
<p>
I ported the Legacy Security Manager kata to Haskell because I thought it'd be easy enough to port the code itself, and I also found the exercise compelling enough in its own right.
</p>
<p>
The most interesting point, I think, is that the <code>createUser</code> action remains testable without making any other concession to testability than turning it into a higher-order function. For pure functions, we would expect this to be the case, since <a href="/2015/05/07/functional-design-is-intrinsically-testable">pure functions are intrinsically testable</a>, but for impure actions like <code>createUser</code>, this isn't a given. Interacting exclusively with the command-line API is, however, sufficiently simple that we can get by with the State monad. No free monad is needed, and so test-induced damage is kept to a minimum.
</p>
</div><hr>
Functor sumshttps://blog.ploeh.dk/2024/10/14/functor-sums2024-10-14T18:26:00+00:00Mark Seemann
<div id="post">
<p>
<em>A choice of two or more functors gives rise to a functor. An article for object-oriented programmers.</em>
</p>
<p>
This article is part of <a href="/2022/07/11/functor-relationships">a series of articles about functor relationships</a>. In this one you'll learn about a universal composition of functors. In short, if you have a <a href="https://en.wikipedia.org/wiki/Tagged_union">sum type</a> of functors, that data structure itself gives rise to a functor.
</p>
<p>
Together with other articles in this series, this result can help you answer questions such as: <em>Does this data structure form a functor?</em>
</p>
<p>
Since <a href="/2018/03/22/functors">functors</a> tend to be quite common, and since they're useful enough that many programming languages have special support or syntax for them, the ability to recognize a potential functor can be useful. Given a type like <code>Foo<T></code> (C# syntax) or <code>Bar<T1, T2></code>, being able to recognize it as a functor can come in handy. One scenario is if you yourself have just defined this data type. Recognizing that it's a functor strongly suggests that you should give it a <code>Select</code> method in C#, a <code>map</code> function in <a href="https://fsharp.org/">F#</a>, and so on.
</p>
<p>
Not all generic types give rise to a (covariant) functor. Some are rather <a href="/2021/09/02/contravariant-functors">contravariant functors</a>, and some are <a href="/2022/08/01/invariant-functors">invariant</a>.
</p>
<p>
If, on the other hand, you have a data type which is a sum of two or more (covariant) functors <em>with the same type parameter</em>, then the data type itself gives rise to a functor. You'll see some examples in this article.
</p>
<h3 id="fd1c2960d14946008a49b07698151647">
Abstract shape in F# <a href="#fd1c2960d14946008a49b07698151647">#</a>
</h3>
<p>
Before we look at some examples found in other code, it helps if we know what we're looking for. You'll see a C# example in a minute, but since sum types require so much <a href="/2019/12/16/zone-of-ceremony">ceremony</a> in C#, we'll make a brief detour around F#.
</p>
<p>
Imagine that you have two lawful functors, <code>F</code> and <code>G</code>. Also imagine that you have a data structure that holds either an <code><span style="color:#2b91af;">F</span><<span style="color:#2b91af;">'a</span>></code> value or a <code><span style="color:#2b91af;">G</span><<span style="color:#2b91af;">'a</span>></code> value:
</p>
<p>
<pre><span style="color:blue;">type</span> <span style="color:#2b91af;">FOrG</span><<span style="color:#2b91af;">'a</span>> = <span style="color:#2b91af;">FOrGF</span> <span style="color:blue;">of</span> <span style="color:#2b91af;">F</span><<span style="color:#2b91af;">'a</span>> | <span style="color:#2b91af;">FOrGG</span> <span style="color:blue;">of</span> <span style="color:#2b91af;">G</span><<span style="color:#2b91af;">'a</span>></pre>
</p>
<p>
The name of the type is <code>FOrG</code>. In the <code>FOrGF</code> case, it holds an <code><span style="color:#2b91af;">F</span><<span style="color:#2b91af;">'a</span>></code> value, and in the <code>FOrGG</code> case it holds a <code><span style="color:#2b91af;">G</span><<span style="color:#2b91af;">'a</span>></code> value.
</p>
<p>
The point of this article is that since both <code>F</code> and <code>G</code> are (lawful) functors, then <code>FOrG</code> also gives rise to a functor. The composed <code>map</code> function can pattern-match on each case and call the respective <code>map</code> function that belongs to each of the two functors.
</p>
<p>
<pre><span style="color:blue;">let</span> <span style="color:#74531f;">map</span> <span style="color:#74531f;">f</span> <span style="font-weight:bold;color:#1f377f;">forg</span> =
    <span style="color:blue;">match</span> <span style="font-weight:bold;color:#1f377f;">forg</span> <span style="color:blue;">with</span>
    | <span style="color:#2b91af;">FOrGF</span> <span style="font-weight:bold;color:#1f377f;">fa</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">FOrGF</span> (<span style="color:#2b91af;">F</span>.<span style="color:#74531f;">map</span> <span style="color:#74531f;">f</span> <span style="font-weight:bold;color:#1f377f;">fa</span>)
    | <span style="color:#2b91af;">FOrGG</span> <span style="font-weight:bold;color:#1f377f;">ga</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">FOrGG</span> (<span style="color:#2b91af;">G</span>.<span style="color:#74531f;">map</span> <span style="color:#74531f;">f</span> <span style="font-weight:bold;color:#1f377f;">ga</span>)</pre>
</p>
<p>
For clarity I've named the values <code>fa</code> indicating <em>f of a</em> and <code>ga</code> indicating <em>g of a</em>.
</p>
<p>
Notice that it's an essential requirement that the individual functors (here <code>F</code> and <code>G</code>) are parametrized by the same type parameter (here <code>'a</code>). If your data structure contains <code><span style="color:#2b91af;">F</span><<span style="color:#2b91af;">'a</span>></code> and <code><span style="color:#2b91af;">G</span><<span style="color:#2b91af;">'b</span>></code>, the 'theorem' doesn't apply.
</p>
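<p>
For readers more at home in Haskell, the same abstract shape can be sketched as follows. This is an illustration rather than code that accompanies the article. Here the two functors are type parameters <code>f</code> and <code>g</code>, and the <code>describe</code> function is a hypothetical helper that only exists to print the results. The <code>base</code> library ships a similar type as <code>Sum</code> in <code>Data.Functor.Sum</code>.
</p>
<p>
<pre>module Main where

-- Either an 'f a' or a 'g a'; both cases share the type parameter a.
data FOrG f g a = FOrGF (f a) | FOrGG (g a)

-- If f and g are functors, so is their sum: match on the case
-- and delegate to the fmap of whichever functor is present.
instance (Functor f, Functor g) => Functor (FOrG f g) where
  fmap h (FOrGF fa) = FOrGF (fmap h fa)
  fmap h (FOrGG ga) = FOrGG (fmap h ga)

-- Hypothetical helper for display, fixing f to Maybe and g to lists.
describe :: FOrG Maybe [] Int -> String
describe (FOrGF m) = "F: " ++ show m
describe (FOrGG xs) = "G: " ++ show xs

main :: IO ()
main = do
  putStrLn $ describe $ fmap (+ 1) (FOrGF (Just 41)) -- F: Just 42
  putStrLn $ describe $ fmap (+ 1) (FOrGG [1, 2, 3]) -- G: [2,3,4]</pre>
</p>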
<h3 id="9ff2f85804104bf192941ec8634757b6">
Abstract shape in C# <a href="#9ff2f85804104bf192941ec8634757b6">#</a>
</h3>
<p>
The same kind of abstract shape requires much more boilerplate in C#. When defining a sum type in a language that doesn't support them, we may instead either <a href="/2018/06/25/visitor-as-a-sum-type">turn to the Visitor design pattern</a> or alternatively use <a href="/2018/05/22/church-encoding">Church encoding</a>. While the two are isomorphic, Church encoding is a bit simpler while the <a href="https://en.wikipedia.org/wiki/Visitor_pattern">Visitor pattern</a> seems more object-oriented. In this example I've chosen the simplicity of Church encoding.
</p>
<p>
As in the F# code above, I've given the data structure the same name, but it's now a class:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">FOrG</span><<span style="color:#2b91af;">T</span>></pre>
</p>
<p>
Two constructors enable you to initialize it with either an <code><span style="color:#2b91af;">F</span><<span style="color:#2b91af;">T</span>></code> or a <code><span style="color:#2b91af;">G</span><<span style="color:#2b91af;">T</span>></code> value.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">FOrG</span>(<span style="color:#2b91af;">F</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">f</span>)
<span style="color:blue;">public</span> <span style="color:#2b91af;">FOrG</span>(<span style="color:#2b91af;">G</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">g</span>)</pre>
</p>
<p>
Notice that <code><span style="color:#2b91af;">F</span><<span style="color:#2b91af;">T</span>></code> and <code><span style="color:#2b91af;">G</span><<span style="color:#2b91af;">T</span>></code> share the same type parameter <code>T</code>. If a class had, instead, composed either <code><span style="color:#2b91af;">F</span><<span style="color:#2b91af;">T1</span>></code> or <code><span style="color:#2b91af;">G</span><<span style="color:#2b91af;">T2</span>></code>, the 'theorem' wouldn't apply.
</p>
<p>
Finally, a <code>Match</code> method completes the Church encoding.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">TResult</span> <span style="font-weight:bold;color:#74531f;">Match</span><<span style="color:#2b91af;">TResult</span>>(
    <span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">F</span><<span style="color:#2b91af;">T</span>>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenF</span>,
    <span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">G</span><<span style="color:#2b91af;">T</span>>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenG</span>)</pre>
</p>
<p>
Regardless of exactly what <code>F</code> and <code>G</code> are, you can add a <code>Select</code> method to <code><span style="color:#2b91af;">FOrG</span><<span style="color:#2b91af;">T</span>></code> like this:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">FOrG</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#74531f;">Select</span><<span style="color:#2b91af;">TResult</span>>(<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#74531f;">Match</span>(
        <span style="font-weight:bold;color:#1f377f;">whenF</span>: <span style="font-weight:bold;color:#1f377f;">f</span> => <span style="color:blue;">new</span> <span style="color:#2b91af;">FOrG</span><<span style="color:#2b91af;">TResult</span>>(<span style="font-weight:bold;color:#1f377f;">f</span>.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>)),
        <span style="font-weight:bold;color:#1f377f;">whenG</span>: <span style="font-weight:bold;color:#1f377f;">g</span> => <span style="color:blue;">new</span> <span style="color:#2b91af;">FOrG</span><<span style="color:#2b91af;">TResult</span>>(<span style="font-weight:bold;color:#1f377f;">g</span>.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>)));
}</pre>
</p>
<p>
Since we assume that <code>F</code> and <code>G</code> are functors, which in C# <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatically</a> have a <code>Select</code> method, we pass the <code>selector</code> to their respective <code>Select</code> methods. <code>f.Select</code> returns a new <code>F</code> value, while <code>g.Select</code> returns a new <code>G</code> value, but there's a constructor for each case, so the composed <code>Select</code> method repackages those return values in <code><span style="color:blue;">new</span> <span style="color:#2b91af;">FOrG</span><<span style="color:#2b91af;">TResult</span>></code> objects.
</p>
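<p>
Church encoding isn't tied to C#. As a rough sketch, and not something taken from the article's code base, the same idea can be expressed in Haskell with the <code>RankNTypes</code> extension. The names <code>fromF</code>, <code>fromG</code>, and <code>render</code> are hypothetical. <code>fromF</code> and <code>fromG</code> play the role of the two constructors, while the <code>match</code> field corresponds to the <code>Match</code> method.
</p>
<p>
<pre>{-# LANGUAGE RankNTypes #-}
module Main where

-- A Church-encoded sum: the value *is* its own Match function.
newtype FOrG f g a = FOrG
  { match :: forall r. (f a -> r) -> (g a -> r) -> r }

-- Counterparts of the two C# constructors (hypothetical names).
fromF :: f a -> FOrG f g a
fromF fa = FOrG (\whenF _ -> whenF fa)

fromG :: g a -> FOrG f g a
fromG ga = FOrG (\_ whenG -> whenG ga)

-- Same structure as the C# Select method: map inside each case,
-- then repackage the result with the matching constructor.
instance (Functor f, Functor g) => Functor (FOrG f g) where
  fmap h x = match x (fromF . fmap h) (fromG . fmap h)

-- Hypothetical helper for display, fixing f to Maybe and g to lists.
render :: FOrG Maybe [] Int -> String
render x = match x (\m -> "F: " ++ show m) (\xs -> "G: " ++ show xs)

main :: IO ()
main = do
  putStrLn $ render $ fmap (* 2) (fromF (Just 21)) -- F: Just 42
  putStrLn $ render $ fmap (* 2) (fromG [1, 2, 3]) -- G: [2,4,6]</pre>
</p>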
<p>
I'll have more to say about how this generalizes to a sum of more than two alternatives, but first, let's consider some examples.
</p>
<h3 id="03a6f1ef94ca4ca2927b38d95e34c31f">
Open or closed endpoints <a href="#03a6f1ef94ca4ca2927b38d95e34c31f">#</a>
</h3>
<p>
The simplest example that I can think of is that of <a href="/2024/01/01/variations-of-the-range-kata">range</a> endpoints. A range may be open, closed, or a mix thereof. Some mathematical notations use <code>(1, 6]</code> to indicate the range between 1 and 6, where 1 is excluded from the range, but 6 is included. An alternative notation is <code>]1, 6]</code>.
</p>
<p>
A given endpoint (1 and 6, above) is either open or closed, which implies a sum type. <a href="/2024/01/15/a-range-kata-implementation-in-f">In F# I defined it like this</a>:
</p>
<p>
<pre><span style="color:blue;">type</span> Endpoint<'a> = Open <span style="color:blue;">of</span> 'a | Closed <span style="color:blue;">of</span> 'a</pre>
</p>
<p>
If you're at all familiar with F#, this is clearly a <a href="https://learn.microsoft.com/dotnet/fsharp/language-reference/discriminated-unions">discriminated union</a>, which is just what the F# documentation calls sum types.
</p>
<p>
The article <a href="/2024/02/12/range-as-a-functor">Range as a functor</a> goes through examples in <a href="https://www.haskell.org/">Haskell</a>, F#, and C#, demonstrating, among other points, how an endpoint sum type forms a functor.
</p>
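<p>
As a quick reminder of what that looks like, here's a minimal Haskell sketch. It's written for this illustration and may differ in detail from the code in the linked article. Both cases hold a 'naked' value of the same type, so mapping applies the function and keeps the case.
</p>
<p>
<pre>module Main where

-- An endpoint is either open or closed; both cases carry an 'a'.
data Endpoint a = Open a | Closed a deriving (Show, Eq)

-- Map the contained value, preserving whether the endpoint is open or closed.
instance Functor Endpoint where
  fmap f (Open x) = Open (f x)
  fmap f (Closed x) = Closed (f x)

main :: IO ()
main = do
  print $ fmap (+ 1) (Open 1) -- Open 2
  print $ fmap show (Closed 6) -- Closed "6"</pre>
</p>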
<h3 id="9cf974abd1fb497aa43087e7697bb982">
Binary tree <a href="#9cf974abd1fb497aa43087e7697bb982">#</a>
</h3>
<p>
The next example we'll consider is the binary tree from <a href="/2024/09/09/a-binary-tree-zipper-in-c">A Binary Tree Zipper in C#</a>. In the <a href="https://learnyouahaskell.com/zippers">original Haskell Zippers article</a>, the data type is defined like this:
</p>
<p>
<pre><span style="color:blue;">data</span> Tree a = Empty | Node a (Tree a) (Tree a) <span style="color:blue;">deriving</span> (<span style="color:#2b91af;">Show</span>)</pre>
</p>
<p>
Even if you're not familiar with Haskell syntax, the vertical bar (<code>|</code>) indicates a choice between the left-hand side and the right-hand side. Many programming languages use the <code>|</code> character for Boolean disjunction (<em>or</em>), so the syntax should be intuitive. In this definition, a binary tree is either empty or a node with a value and two subtrees. What interests us here is that it's a sum type.
</p>
<p>
One way this manifests in C# is in the choice of two alternative constructors:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTree</span>() : <span style="color:blue;">this</span>(<span style="color:#2b91af;">Empty</span>.Instance)
{
}
<span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTree</span>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">value</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">left</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">right</span>)
    : <span style="color:blue;">this</span>(<span style="color:blue;">new</span> <span style="color:#2b91af;">Node</span>(<span style="font-weight:bold;color:#1f377f;">value</span>, <span style="font-weight:bold;color:#1f377f;">left</span>.root, <span style="font-weight:bold;color:#1f377f;">right</span>.root))
{
}</pre>
</p>
<p>
<code><span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>></code> clearly has a generic type parameter. Does the class give rise to a functor?
</p>
<p>
It does if it's composed from a sum of two functors. Is that the case?
</p>
<p>
On the 'left' side, it seems that we have nothing. In the Haskell code, it's called <code>Empty</code>. In the C# code, this case is represented by the parameterless constructor (also known as the <em>default constructor</em>). There's no <code>T</code> there, so that doesn't look much like a functor.
</p>
<p>
All is, however, not lost. We may view this lack of data as a particular value ('nothing') wrapped in <a href="/2024/10/07/the-const-functor">the Const functor</a>. In Haskell and F# a value without data is called <em>unit</em> and written <code>()</code>. In C# or <a href="https://www.java.com/">Java</a> you may <a href="/2018/01/15/unit-isomorphisms">think of it as void</a>, although <em>unit</em> is a value that you can pass around, which isn't the case for <code>void</code>.
</p>
<p>
In Haskell, we could instead represent <code>Empty</code> as <code>Const ()</code>, which is a bona-fide <code>Functor</code> instance that you can <code>fmap</code>:
</p>
<p>
<pre>ghci> emptyNode = Const ()
ghci> fmap (+1) emptyNode
Const ()</pre>
</p>
<p>
This example pretends to 'increment' a number that isn't there. Not that you'd need to do this. I'm only showing you this to make the argument that the empty node forms a functor.
</p>
<p>
The 'right' side of the sum type is most succinctly summarized by the Haskell code:
</p>
<p>
<pre>Node a (Tree a) (Tree a)</pre>
</p>
<p>
It's a 'naked' generic value and two generic trees. In C# it's the parameter list
</p>
<p>
<pre>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">value</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">left</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">right</span>)</pre>
</p>
<p>
Does that make a functor? Yes, it's a triple of a 'naked' generic value and two recursive subtrees, all sharing the same <code>T</code>. Just like in the <a href="/2024/09/16/functor-products">previous article</a> we can view a 'naked' generic value as equivalent to <a href="/2018/09/03/the-identity-functor">the Identity functor</a>, so that parameter is a functor. The other ones are recursive types: They are of the same type as the type we're trying to evaluate, <code><span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>></code>. If we assume that that forms a functor, that triple is a product type of functors. From the previous article, we know that that gives rise to a functor.
</p>
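<p>
The argument can be made concrete in Haskell. The following is a sketch for illustration, using the <code>Tree</code> definition quoted above with an added <code>Eq</code> instance. The <code>Empty</code> case has nothing to map, the 'naked' value is mapped directly, and the two subtrees are mapped recursively, which is where the assumption that the type itself forms a functor comes into play.
</p>
<p>
<pre>module Main where

data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show, Eq)

instance Functor Tree where
  -- 'Left' side of the sum: no data, so nothing to map (like Const ()).
  fmap _ Empty = Empty
  -- 'Right' side: a product of a 'naked' value (like Identity)
  -- and two recursive subtrees.
  fmap f (Node x l r) = Node (f x) (fmap f l) (fmap f r)

main :: IO ()
main = print $ fmap (* 10) (Node 1 (Node 2 Empty Empty) Empty)
-- prints Node 10 (Node 20 Empty Empty) Empty</pre>
</p>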
<p>
This means that in C#, for example, you can add the idiomatic <code>Select</code> method:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#74531f;">Select</span><<span style="color:#2b91af;">TResult</span>>(<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#74531f;">Aggregate</span>(
        <span style="font-weight:bold;color:#1f377f;">whenEmpty</span>: () => <span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">TResult</span>>(),
        <span style="font-weight:bold;color:#1f377f;">whenNode</span>: (<span style="font-weight:bold;color:#1f377f;">value</span>, <span style="font-weight:bold;color:#1f377f;">left</span>, <span style="font-weight:bold;color:#1f377f;">right</span>) =>
            <span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">TResult</span>>(<span style="font-weight:bold;color:#1f377f;">selector</span>(<span style="font-weight:bold;color:#1f377f;">value</span>), <span style="font-weight:bold;color:#1f377f;">left</span>, <span style="font-weight:bold;color:#1f377f;">right</span>));
}</pre>
</p>
<p>
In languages that support pattern-matching on sum types (such as F#), you'd have to match on each case and explicitly deal with the recursive mapping. Notice, however, that here I've used the <code>Aggregate</code> method to implement <code>Select</code>. The <code>Aggregate</code> method is the <code><span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>></code> class' <a href="/2019/04/29/catamorphisms">catamorphism</a>, and it already handles the recursion for us. In other words, <code>left</code> and <code>right</code> are already <code><span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">TResult</span>></code> objects.
</p>
<p>
What remains is only to tell <code>Aggregate</code> what to do when the tree is empty, and how to transform the 'naked' node <code>value</code>. The <code>Select</code> implementation handles the former by returning a new empty tree, and the latter by invoking <code><span style="font-weight:bold;color:#1f377f;">selector</span>(<span style="font-weight:bold;color:#1f377f;">value</span>)</code>.
</p>
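<p>
For comparison, the same idea can be sketched in Haskell. This is a minimal illustration under stated assumptions, not code from the article's code base: the <code>Tree</code> declaration mirrors the one in the Haskell Zippers article, and <code>foldTree</code> is a hypothetical name for the catamorphism.
</p>
<p>
<pre>-- Assumed to mirror the Tree type from the Haskell Zippers article.
data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show, Eq)

-- Hypothetical catamorphism: all recursion over the tree lives here.
foldTree :: r -> (a -> r -> r -> r) -> Tree a -> r
foldTree whenEmpty _ Empty = whenEmpty
foldTree whenEmpty whenNode (Node x l r) =
  whenNode x (foldTree whenEmpty whenNode l) (foldTree whenEmpty whenNode r)

instance Functor Tree where
  -- l and r have already been mapped by foldTree,
  -- so only the 'naked' value x needs f applied.
  fmap f = foldTree Empty (\x l r -> Node (f x) l r)</pre>
</p>
<p>
With these definitions, <code>fmap (+1) (Node 1 Empty Empty)</code> evaluates to <code>Node 2 Empty Empty</code>. As in the C# version, the lambda expression never recurses; the catamorphism does that.
</p>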
<p>
Not only does the binary tree form a functor, but it turns out that the <a href="/2024/08/19/zippers">Zipper</a> does as well, because the breadcrumbs also give rise to a functor.
</p>
<h3 id="02e7e55d7f6f4c0d94c50cf577238859">
Breadcrumbs <a href="#02e7e55d7f6f4c0d94c50cf577238859">#</a>
</h3>
<p>
The <a href="https://learnyouahaskell.com/zippers">original Haskell Zippers article</a> defines a breadcrumb for the binary tree Zipper like this:
</p>
<p>
<pre><span style="color:blue;">data</span> Crumb a = LeftCrumb a (Tree a) | RightCrumb a (Tree a) <span style="color:blue;">deriving</span> (<span style="color:#2b91af;">Show</span>)</pre>
</p>
<p>
That's another sum type with generics on the left as well as the right. In C# the two options may be best illustrated by these two creation methods:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>> <span style="color:#74531f;">Left</span><<span style="color:#2b91af;">T</span>>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">value</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">right</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>>.<span style="color:#74531f;">Left</span>(<span style="font-weight:bold;color:#1f377f;">value</span>, <span style="font-weight:bold;color:#1f377f;">right</span>);
}
<span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>> <span style="color:#74531f;">Right</span><<span style="color:#2b91af;">T</span>>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">value</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">left</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>>.<span style="color:#74531f;">Right</span>(<span style="font-weight:bold;color:#1f377f;">value</span>, <span style="font-weight:bold;color:#1f377f;">left</span>);
}</pre>
</p>
<p>
Notice that the <code>Left</code> and <code>Right</code> choices have the same structure: A 'naked' generic <code>T</code> value, and a <code><span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>></code> object. Only the names differ. This suggests that we only need to think about one of them, and then we can reuse our conclusion for the other.
</p>
<p>
As we've already done once, we consider a <code>T</code> value equivalent to <code><span style="color:#2b91af;">Identity</span><<span style="color:#2b91af;">T</span>></code>, which is a functor. We've also, just above, established that <code><span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>></code> forms a functor. We have a product (argument list, or tuple) of functors, so that combination forms a functor.
</p>
<p>
Since this is true for both alternatives, this sum type, too, gives rise to a functor. This enables you to implement a <code>Select</code> method:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#74531f;">Select</span><<span style="color:#2b91af;">TResult</span>>(<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#74531f;">Match</span>(
(<span style="font-weight:bold;color:#1f377f;">v</span>, <span style="font-weight:bold;color:#1f377f;">r</span>) => <span style="color:#2b91af;">Crumb</span>.<span style="color:#74531f;">Left</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>(<span style="font-weight:bold;color:#1f377f;">v</span>), <span style="font-weight:bold;color:#1f377f;">r</span>.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>)),
(<span style="font-weight:bold;color:#1f377f;">v</span>, <span style="font-weight:bold;color:#1f377f;">l</span>) => <span style="color:#2b91af;">Crumb</span>.<span style="color:#74531f;">Right</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>(<span style="font-weight:bold;color:#1f377f;">v</span>), <span style="font-weight:bold;color:#1f377f;">l</span>.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>)));
}</pre>
</p>
<p>
By now the pattern should be familiar. Call <code><span style="font-weight:bold;color:#1f377f;">selector</span>(<span style="font-weight:bold;color:#1f377f;">v</span>)</code> directly on the 'naked' values, and pass <code>selector</code> to any other functors' <code>Select</code> method.
</p>
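<p>
The same pattern can be sketched in Haskell for the <code>Crumb</code> type shown above. Treat this as an illustration under assumptions: the <code>Tree</code> declaration and both instances are written out by hand here, and aren't taken from the Zippers article. (GHC can also derive such instances with the <code>DeriveFunctor</code> extension.)
</p>
<p>
<pre>-- Assumed to mirror the Tree type from the Haskell Zippers article.
data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show, Eq)

instance Functor Tree where
  fmap _ Empty = Empty
  fmap f (Node x l r) = Node (f x) (fmap f l) (fmap f r)

data Crumb a = LeftCrumb a (Tree a) | RightCrumb a (Tree a) deriving (Show, Eq)

instance Functor Crumb where
  -- Apply f directly to the 'naked' value,
  -- and pass it on to the nested functor via fmap.
  fmap f (LeftCrumb x r) = LeftCrumb (f x) (fmap f r)
  fmap f (RightCrumb x l) = RightCrumb (f x) (fmap f l)</pre>
</p>
<p>
Both cases have the same shape, which is why the two <code>fmap</code> equations differ only by constructor name.
</p>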
<p>
That's <em>almost</em> all the building blocks we have to declare <code><span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>></code> a functor as well, but we need one last theorem before we can do that. We'll conclude this work in <a href="/2024/10/28/functor-compositions">the next article</a>.
</p>
<h3 id="2b3a70f8791c41eb952ff160398fe441">
Higher arities <a href="#2b3a70f8791c41eb952ff160398fe441">#</a>
</h3>
<p>
Although we finally saw a 'real' triple product, all the sum types have involved binary choices between a 'left side' and a 'right side'. As was the case with functor products, the result generalizes to higher arities. A sum type with any number of cases forms a functor if all the cases give rise to a functor.
</p>
<p>
We can, again, use canonicalized forms to argue the case. (See <a href="https://thinkingwithtypes.com/">Thinking with Types</a> for a clear explanation of canonicalization of types.) A two-way choice is isomorphic to <a href="/2019/01/14/an-either-functor">Either</a>, and a three-way choice is isomorphic to <code>Either a (Either b c)</code>. Just like it's possible to build triples, quadruples, etc. by nesting pairs, we can construct n-ary choices by nesting Eithers. It's the same kind of inductive reasoning.
</p>
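<p>
To make the nesting concrete, here's a small sketch of a three-way choice and its canonical form. The <code>Three</code> type and the two conversion functions are hypothetical names, invented for this illustration only.
</p>
<p>
<pre>-- Hypothetical three-way sum type.
data Three a b c = First a | Second b | Third c deriving (Show, Eq)

-- Canonicalize: a three-way choice becomes nested Eithers.
toNested :: Three a b c -> Either a (Either b c)
toNested (First x) = Left x
toNested (Second y) = Right (Left y)
toNested (Third z) = Right (Right z)

-- The inverse translation, which makes the two types isomorphic.
fromNested :: Either a (Either b c) -> Three a b c
fromNested (Left x) = First x
fromNested (Right (Left y)) = Second y
fromNested (Right (Right z)) = Third z</pre>
</p>
<p>
Round-tripping through <code>toNested</code> and <code>fromNested</code> returns the original value for each of the three cases, which is what the isomorphism requires. A four-way choice would add one more level of nesting, and so on.
</p>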
<p>
This is relevant because just as Haskell's <a href="https://hackage.haskell.org/package/base">base</a> library provides <a href="https://hackage.haskell.org/package/base/docs/Data-Functor-Product.html">Data.Functor.Product</a> for composing two (and thereby any number of) functors, it also provides <a href="https://hackage.haskell.org/package/base/docs/Data-Functor-Sum.html">Data.Functor.Sum</a> for composing functor sums.
</p>
<p>
The <code>Sum</code> type defines two case constructors: <code>InL</code> and <code>InR</code>, but it's isomorphic with <code>Either</code>:
</p>
<p>
<pre><span style="color:#2b91af;">canonizeSum</span> <span style="color:blue;">::</span> <span style="color:blue;">Sum</span> f g a <span style="color:blue;">-></span> <span style="color:#2b91af;">Either</span> (f a) (g a)
canonizeSum (InL x) = Left x
canonizeSum (InR y) = Right y
<span style="color:#2b91af;">summarizeEither</span> <span style="color:blue;">::</span> <span style="color:#2b91af;">Either</span> (f a) (g a) <span style="color:blue;">-></span> <span style="color:blue;">Sum</span> f g a
summarizeEither (Left x) = InL x
summarizeEither (Right y) = InR y</pre>
</p>
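<p>
One consequence of the isomorphism is that mapping commutes with canonicalization. The following sketch illustrates this under assumptions: <code>mapEither</code> is a hypothetical helper, and <code>canonizeSum</code> is repeated only to keep the listing self-contained.
</p>
<p>
<pre>import Data.Functor.Sum (Sum (..))

canonizeSum :: Sum f g a -> Either (f a) (g a)
canonizeSum (InL x) = Left x
canonizeSum (InR y) = Right y

-- Hypothetical helper: map whichever functor the Either holds.
mapEither :: (Functor f, Functor g)
          => (a -> b) -> Either (f a) (g a) -> Either (f b) (g b)
mapEither h = either (Left . fmap h) (Right . fmap h)</pre>
</p>
<p>
For any <code>Sum</code> value <code>s</code>, <code>canonizeSum (fmap h s)</code> should equal <code>mapEither h (canonizeSum s)</code>. Mapping first and then canonicalizing gives the same result as canonicalizing first and then mapping.
</p>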
<p>
The point is that we can compose not only a choice of two, but of any number of functors, to a single functor type. A simple example is this choice between <a href="/2018/03/26/the-maybe-functor">Maybe</a>, list, or <a href="/2018/08/06/a-tree-functor">Tree</a>:
</p>
<p>
<pre><span style="color:#2b91af;">maybeOrListOrTree</span> <span style="color:blue;">::</span> <span style="color:blue;">Sum</span> (<span style="color:blue;">Sum</span> <span style="color:#2b91af;">Maybe</span> []) <span style="color:blue;">Tree</span> <span style="color:#2b91af;">String</span>
maybeOrListOrTree = InL (InL (Just <span style="color:#a31515;">"foo"</span>))</pre>
</p>
<p>
If we'd rather embed a list in that type, we can do that as well:
</p>
<p>
<pre><span style="color:#2b91af;">maybeOrListOrTree'</span> <span style="color:blue;">::</span> <span style="color:blue;">Sum</span> (<span style="color:blue;">Sum</span> <span style="color:#2b91af;">Maybe</span> []) <span style="color:blue;">Tree</span> <span style="color:#2b91af;">String</span>
maybeOrListOrTree' = InL (InR [<span style="color:#a31515;">"bar"</span>, <span style="color:#a31515;">"baz"</span>])</pre>
</p>
<p>
Both values have the same type, and since it's a <code>Functor</code> instance, you can <code>fmap</code> over it:
</p>
<p>
<pre>ghci> fmap (elem 'r') maybeOrListOrTree
InL (InL (Just False))
ghci> fmap (elem 'r') maybeOrListOrTree'
InL (InR [True,False])</pre>
</p>
<p>
These queries examine each <code>String</code> to determine whether or not it contains the letter <code>'r'</code>, which only <code>"bar"</code> does.
</p>
<p>
The point, anyway, is that sum types of any arity form a functor if all the cases do.
</p>
<h3 id="8545e09908fb4df4ace08e7b20ffc509">
Conclusion <a href="#8545e09908fb4df4ace08e7b20ffc509">#</a>
</h3>
<p>
In the previous article, you learned that a functor product gives rise to a functor. In this article, you learned that a functor sum does, too. If a data structure contains a choice of two or more functors, then that data type itself forms a functor.
</p>
<p>
As the previous article argues, this is useful to know, particularly if you're working in a language with only partial support for functors. Mainstream languages aren't going to automatically turn such sums into functors, in the way that Haskell's <code>Sum</code> <a href="https://bartoszmilewski.com/2014/01/14/functors-are-containers/">container</a> almost does. Thus, knowing when you can safely give your generic types a <code>Select</code> method or <code>map</code> function may come in handy.
</p>
<p>
There's one more rule like this one.
</p>
<p>
<strong>Next:</strong> <a href="/2024/10/28/functor-compositions">Functor compositions</a>.
</p>
</div>
<hr>
The Const functor
https://blog.ploeh.dk/2024/10/07/the-const-functor
2024-10-07T18:37:00+00:00
Mark Seemann
<div id="post">
<p>
<em>Package a constant value, but make it look like a functor. An article for object-oriented programmers.</em>
</p>
<p>
This article is an instalment in <a href="/2018/03/22/functors">an article series about functors</a>. In previous articles, you've learned about useful functors such as <a href="/2018/03/26/the-maybe-functor">Maybe</a> and <a href="/2019/01/14/an-either-functor">Either</a>. You've also seen at least one less-than-useful functor: <a href="/2018/09/03/the-identity-functor">The Identity functor</a>. In this article, you'll learn about another (practically) useless functor called <em>Const</em>. You can skip this article if you want.
</p>
<p>
Like Identity, the Const functor may not be that useful, but it nonetheless exists. You'll probably not need it for actual programming tasks, but knowing that it exists, like Identity, can be useful as an analysis tool. It may help you quickly evaluate whether a particular data structure affords various compositions. For example, it may enable you to quickly identify whether, say, a constant type and a list <a href="/2022/07/11/functor-relationships">may compose to a functor</a>.
</p>
<p>
This article starts with C#, then proceeds via <a href="https://fsharp.org/">F#</a> to finally discuss <a href="https://www.haskell.org/">Haskell</a>'s built-in Const functor. You can just skip the languages you don't care about.
</p>
<h3 id="050cc4bc478f449ca11c28a83f8a2fda">
C# Const class <a href="#050cc4bc478f449ca11c28a83f8a2fda">#</a>
</h3>
<p>
While C# supports <a href="https://learn.microsoft.com/dotnet/csharp/language-reference/builtin-types/record">records</a>, and you can implement Const as one, I here present it as a full-fledged class. For readers who may not be that familiar with modern C#, a normal class may be more recognizable.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">Const</span><<span style="color:#2b91af;">T1</span>, <span style="color:#2b91af;">T2</span>>
{
<span style="color:blue;">public</span> <span style="color:#2b91af;">T1</span> Value { <span style="color:blue;">get</span>; }
<span style="color:blue;">public</span> <span style="color:#2b91af;">Const</span>(<span style="color:#2b91af;">T1</span> <span style="font-weight:bold;color:#1f377f;">value</span>)
{
Value = <span style="font-weight:bold;color:#1f377f;">value</span>;
}
<span style="color:blue;">public</span> <span style="color:#2b91af;">Const</span><<span style="color:#2b91af;">T1</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#74531f;">Select</span><<span style="color:#2b91af;">TResult</span>>(<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T2</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">Const</span><<span style="color:#2b91af;">T1</span>, <span style="color:#2b91af;">TResult</span>>(Value);
}
<span style="color:blue;">public</span> <span style="color:blue;">override</span> <span style="color:blue;">bool</span> <span style="font-weight:bold;color:#74531f;">Equals</span>(<span style="color:blue;">object</span> <span style="font-weight:bold;color:#1f377f;">obj</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">obj</span> <span style="color:blue;">is</span> <span style="color:#2b91af;">Const</span><<span style="color:#2b91af;">T1</span>, <span style="color:#2b91af;">T2</span>> <span style="font-weight:bold;color:#1f377f;">@const</span> &&
<span style="color:#2b91af;">EqualityComparer</span><<span style="color:#2b91af;">T1</span>>.Default.<span style="font-weight:bold;color:#74531f;">Equals</span>(Value, <span style="font-weight:bold;color:#1f377f;">@const</span>.Value);
}
<span style="color:blue;">public</span> <span style="color:blue;">override</span> <span style="color:blue;">int</span> <span style="font-weight:bold;color:#74531f;">GetHashCode</span>()
{
<span style="font-weight:bold;color:#8f08c4;">return</span> -1584136870 + <span style="color:#2b91af;">EqualityComparer</span><<span style="color:#2b91af;">T1</span>>.Default.<span style="font-weight:bold;color:#74531f;">GetHashCode</span>(Value);
}
}</pre>
</p>
<p>
The point of the Const functor is to make a constant value look like a functor; that is, <a href="https://bartoszmilewski.com/2014/01/14/functors-are-containers/">a container</a> that you can map from one type to another. The difference from the Identity functor is that Const doesn't allow you to map the constant. Rather, it cheats and pretends to have a mappable type that, however, has no value associated with it; a <a href="https://wiki.haskell.org/Phantom_type">phantom type</a>.
</p>
<p>
In <code><span style="color:#2b91af;">Const</span><<span style="color:#2b91af;">T1</span>, <span style="color:#2b91af;">T2</span>></code>, the <code>T2</code> type parameter is the 'pretend' type. While the class contains a <code>T1</code> value, it contains no <code>T2</code> value. The <code>Select</code> method, on the other hand, maps <code>T2</code> to <code>TResult</code>. The operation is close to being a <a href="https://en.wikipedia.org/wiki/NOP_(code)">no-op</a>, but still not quite. While it doesn't do anything particularly practical, it <em>does</em> change the type of the returned value.
</p>
<p>
Here's a simple example of using the <code>Select</code> method:
</p>
<p>
<pre><span style="color:#2b91af;">Const</span><<span style="color:blue;">string</span>, <span style="color:blue;">double</span>> <span style="font-weight:bold;color:#1f377f;">c</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">Const</span><<span style="color:blue;">string</span>, <span style="color:blue;">int</span>>(<span style="color:#a31515;">"foo"</span>).<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">i</span> => <span style="color:#2b91af;">Math</span>.<span style="color:#74531f;">Sqrt</span>(<span style="font-weight:bold;color:#1f377f;">i</span>));</pre>
</p>
<p>
The new <code>c</code> value <em>also</em> contains <code>"foo"</code>. Only its type has changed.
</p>
<p>
If you find this peculiar, think of it as similar to mapping an empty list, or an empty Maybe value. In those cases, too, no <em>values</em> change; only the type changes. The difference between empty Maybe objects or empty lists, and the Const functor is that Const isn't empty. There <em>is</em> a value; it's just not the value being mapped.
</p>
<h3 id="3262b7a3818d46bca452500138f776b2">
Functor laws <a href="#3262b7a3818d46bca452500138f776b2">#</a>
</h3>
<p>
Although the Const functor doesn't really do anything, it still obeys the functor laws. To illustrate it (but not to prove it), here's an <a href="https://fscheck.github.io/FsCheck/">FsCheck</a> property that exercises the first functor law:
</p>
<p>
<pre>[<span style="color:#2b91af;">Property</span>(QuietOnSuccess = <span style="color:blue;">true</span>)]
<span style="color:blue;">public</span> <span style="color:blue;">void</span> <span style="font-weight:bold;color:#74531f;">ConstObeysFirstFunctorLaw</span>(<span style="color:blue;">int</span> <span style="font-weight:bold;color:#1f377f;">i</span>)
{
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">left</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">Const</span><<span style="color:blue;">int</span>, <span style="color:blue;">string</span>>(<span style="font-weight:bold;color:#1f377f;">i</span>);
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">right</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">Const</span><<span style="color:blue;">int</span>, <span style="color:blue;">string</span>>(<span style="font-weight:bold;color:#1f377f;">i</span>).<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">x</span> => <span style="font-weight:bold;color:#1f377f;">x</span>);
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>(<span style="font-weight:bold;color:#1f377f;">left</span>, <span style="font-weight:bold;color:#1f377f;">right</span>);
}</pre>
</p>
<p>
If you think it over for a minute, this makes sense. The test creates a <code><span style="color:#2b91af;">Const</span><<span style="color:blue;">int</span>, <span style="color:blue;">string</span>></code> that contains the integer <code>i</code>, and then proceeds to map <em>the string that isn't there</em> to 'itself'. Clearly, this doesn't change the <code>i</code> value contained in the <code><span style="color:#2b91af;">Const</span><<span style="color:blue;">int</span>, <span style="color:blue;">string</span>></code> container.
</p>
<p>
In the same spirit, a property demonstrates the second functor law:
</p>
<p>
<pre>[<span style="color:#2b91af;">Property</span>(QuietOnSuccess = <span style="color:blue;">true</span>)]
<span style="color:blue;">public</span> <span style="color:blue;">void</span> <span style="font-weight:bold;color:#74531f;">ConstObeysSecondFunctorLaw</span>(
<span style="color:#2b91af;">Func</span><<span style="color:blue;">string</span>, <span style="color:blue;">byte</span>> <span style="font-weight:bold;color:#1f377f;">f</span>,
<span style="color:#2b91af;">Func</span><<span style="color:blue;">int</span>, <span style="color:blue;">string</span>> <span style="font-weight:bold;color:#1f377f;">g</span>,
<span style="color:blue;">short</span> <span style="font-weight:bold;color:#1f377f;">s</span>)
{
<span style="color:#2b91af;">Const</span><<span style="color:blue;">short</span>, <span style="color:blue;">byte</span>> <span style="font-weight:bold;color:#1f377f;">left</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">Const</span><<span style="color:blue;">short</span>, <span style="color:blue;">int</span>>(<span style="font-weight:bold;color:#1f377f;">s</span>).<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">g</span>).<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">f</span>);
<span style="color:#2b91af;">Const</span><<span style="color:blue;">short</span>, <span style="color:blue;">byte</span>> <span style="font-weight:bold;color:#1f377f;">right</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">Const</span><<span style="color:blue;">short</span>, <span style="color:blue;">int</span>>(<span style="font-weight:bold;color:#1f377f;">s</span>).<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">x</span> => <span style="font-weight:bold;color:#1f377f;">f</span>(<span style="font-weight:bold;color:#1f377f;">g</span>(<span style="font-weight:bold;color:#1f377f;">x</span>)));
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>(<span style="font-weight:bold;color:#1f377f;">left</span>, <span style="font-weight:bold;color:#1f377f;">right</span>);
}</pre>
</p>
<p>
Again, the same kind of almost-no-op takes place. The <code>g</code> function first changes the <code>int</code> type to <code>string</code>, and then <code>f</code> changes the <code>string</code> type to <code>byte</code>, but no <em>value</em> ever changes; only the second type parameter. Thus, <code>left</code> and <code>right</code> remain equal, since they both contain the same value <code>s</code>.
</p>
<h3 id="ca40bd6e23794a0b9de36b0835dce6cb">
F# Const <a href="#ca40bd6e23794a0b9de36b0835dce6cb">#</a>
</h3>
<p>
In F# we may <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatically</a> express Const as a single-case union:
</p>
<p>
<pre><span style="color:blue;">type</span> <span style="color:#2b91af;">Const</span><<span style="color:#2b91af;">'v</span>, <span style="color:#2b91af;">'a</span>> = <span style="color:#2b91af;">Const</span> <span style="color:blue;">of</span> <span style="color:#2b91af;">'v</span></pre>
</p>
<p>
Here I've chosen to name the first type parameter <code>'v</code> (for <em>value</em>) in order to keep the 'functor type parameter' name <code>'a</code>. This enables me to meaningfully annotate the functor mapping function with the type <code><span style="color:#2b91af;">'a</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">'b</span></code>:
</p>
<p>
<pre><span style="color:blue;">module</span> <span style="color:#2b91af;">Const</span> =
<span style="color:blue;">let</span> <span style="color:#74531f;">get</span> (<span style="color:#2b91af;">Const</span> <span style="font-weight:bold;color:#1f377f;">x</span>) = <span style="font-weight:bold;color:#1f377f;">x</span>
<span style="color:blue;">let</span> <span style="color:#74531f;">map</span> (<span style="color:#74531f;">f</span> : <span style="color:#2b91af;">'a</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">'b</span>) (<span style="color:#2b91af;">Const</span> <span style="font-weight:bold;color:#1f377f;">x</span> : <span style="color:#2b91af;">Const</span><<span style="color:#2b91af;">'v</span>, <span style="color:#2b91af;">'a</span>>) : <span style="color:#2b91af;">Const</span><<span style="color:#2b91af;">'v</span>, <span style="color:#2b91af;">'b</span>> = <span style="color:#2b91af;">Const</span> <span style="font-weight:bold;color:#1f377f;">x</span></pre>
</p>
<p>
Usually, you don't need to annotate F# functions like <code>map</code>, but in this case I added explicit types in order to make it a recognizable functor map.
</p>
<p>
I could also have defined <code>map</code> like this:
</p>
<p>
<pre><span style="color:green;">// 'a -> Const<'b,'c> -> Const<'b,'d></span>
<span style="color:blue;">let</span> <span style="color:#74531f;">map</span> <span style="color:#1f377f;">f</span> (<span style="color:#2b91af;">Const</span> <span style="font-weight:bold;color:#1f377f;">x</span>) = <span style="color:#2b91af;">Const</span> <span style="font-weight:bold;color:#1f377f;">x</span></pre>
</p>
<p>
This still works, but is less recognizable as a functor map, since <code>f</code> may be any <code>'a</code>. Notice that if type inference is left to its own devices, it names the input type <code>Const<'b,'c></code> and the return type <code>Const<'b,'d></code>. This also means that if you want to supply <code>f</code> as a mapping function, this is legal, because we may consider <code>'a ~ 'c -> 'd</code>. It's still a functor map, but a less familiar representation.
</p>
<p>
Similar to the above C# code, two FsCheck properties demonstrate that the <code>Const</code> type obeys the functor laws.
</p>
<p>
<pre>[<<span style="color:#2b91af;">Property</span>(QuietOnSuccess = <span style="color:blue;">true</span>)>]
<span style="color:blue;">let</span> <span style="color:#74531f;">``Const obeys first functor law``</span> (<span style="font-weight:bold;color:#1f377f;">i</span> : <span style="color:#2b91af;">int</span>) =
<span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">left</span> = <span style="color:#2b91af;">Const</span> <span style="font-weight:bold;color:#1f377f;">i</span>
<span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">right</span> = <span style="color:#2b91af;">Const</span> <span style="font-weight:bold;color:#1f377f;">i</span> |> <span style="color:#2b91af;">Const</span>.<span style="color:#74531f;">map</span> <span style="color:#74531f;">id</span>
<span style="font-weight:bold;color:#1f377f;">left</span> =! <span style="font-weight:bold;color:#1f377f;">right</span>
[<<span style="color:#2b91af;">Property</span>(QuietOnSuccess = <span style="color:blue;">true</span>)>]
<span style="color:blue;">let</span> <span style="color:#74531f;">``Const obeys second functor law``</span> (<span style="color:#74531f;">f</span> : <span style="color:#2b91af;">string</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">byte</span>) (<span style="color:#74531f;">g</span> : <span style="color:#2b91af;">int</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">string</span>) (<span style="font-weight:bold;color:#1f377f;">s</span> : <span style="color:#2b91af;">int16</span>) =
<span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">left</span> = <span style="color:#2b91af;">Const</span> <span style="font-weight:bold;color:#1f377f;">s</span> |> <span style="color:#2b91af;">Const</span>.<span style="color:#74531f;">map</span> <span style="color:#74531f;">g</span> |> <span style="color:#2b91af;">Const</span>.<span style="color:#74531f;">map</span> <span style="color:#74531f;">f</span>
<span style="color:blue;">let</span> <span style="font-weight:bold;color:#1f377f;">right</span> = <span style="color:#2b91af;">Const</span> <span style="font-weight:bold;color:#1f377f;">s</span> |> <span style="color:#2b91af;">Const</span>.<span style="color:#74531f;">map</span> (<span style="color:#74531f;">g</span> >> <span style="color:#74531f;">f</span>)
<span style="font-weight:bold;color:#1f377f;">left</span> =! <span style="font-weight:bold;color:#1f377f;">right</span></pre>
</p>
<p>
The assertions use <a href="https://github.com/SwensenSoftware/unquote">Unquote</a>'s <code>=!</code> operator, which I usually read as <em>should equal</em> or <em>must equal</em>.
</p>
<h3 id="9474bc7665ed4f1da688dbb2484ccbf9">
Haskell Const <a href="#9474bc7665ed4f1da688dbb2484ccbf9">#</a>
</h3>
<p>
The Haskell <a href="https://hackage.haskell.org/package/base">base</a> library already comes with a <a href="https://hackage.haskell.org/package/base/docs/Control-Applicative.html#t:Const">Const</a> <code>newtype</code>.
</p>
<p>
You can easily create a new <code>Const</code> value:
</p>
<p>
<pre>ghci> Const "foo"
Const "foo"</pre>
</p>
<p>
If you inquire about its type, GHCi will tell you in a rather verbose way that the first type parameter is <code>String</code>, but the second may be any type <code>b</code>:
</p>
<p>
<pre>ghci> :t Const "foo"
Const "foo" :: forall {k} {b :: k}. Const String b</pre>
</p>
<p>
You can also map by 'incrementing' its non-existent second value:
</p>
<p>
<pre>ghci> (+1) <$> Const "foo"
Const "foo"
ghci> :t (+1) <$> Const "foo"
(+1) <$> Const "foo" :: Num b => Const String b</pre>
</p>
<p>
While the value remains <code>Const "foo"</code>, the type of <code>b</code> is now constrained to a <a href="https://hackage.haskell.org/package/base/docs/Prelude.html#t:Num">Num</a> instance, which follows from the use of the <code>+</code> operator.
</p>
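<p>
If you'd rather pin down the phantom type than leave it polymorphic, you can add type annotations. The following sketch mirrors the earlier C# example; the names <code>original</code> and <code>mapped</code> are made up for the occasion, and the import assumes a version of base that ships <code>Data.Functor.Const</code>.
</p>
<p>
<pre>import Data.Functor.Const (Const (..))

-- The second type argument is a phantom: no Int is stored anywhere.
original :: Const String Int
original = Const "foo"

-- Mapping changes the phantom type from Int to Double,
-- but leaves the wrapped String untouched.
mapped :: Const String Double
mapped = fmap (sqrt . fromIntegral) original</pre>
</p>
<p>
Here, <code>getConst mapped</code> still returns <code>"foo"</code>. Only the second type argument differs between <code>original</code> and <code>mapped</code>.
</p>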
<h3 id="83eea33a91f84b2c9ff4d364b0c868d6">
Functor law proofs <a href="#83eea33a91f84b2c9ff4d364b0c868d6">#</a>
</h3>
<p>
If you look at the source code for the <code>Functor</code> instance, it looks much like its F# equivalent:
</p>
<p>
<pre>instance Functor (Const m) where
fmap _ (Const v) = Const v</pre>
</p>
<p>
We can use equational reasoning with <a href="https://bartoszmilewski.com/2015/01/20/functors/">the notation that Bartosz Milewski uses</a> to prove that both functor laws hold, starting with the first:
</p>
<p>
<pre>  fmap id (Const x)
= { definition of fmap }
  Const x</pre>
</p>
<p>
Clearly, there's not much to that part. What about the second functor law?
</p>
<p>
<pre>  fmap (g . f) (Const x)
= { definition of fmap }
  Const x
= { definition of fmap }
  fmap g (Const x)
= { definition of fmap }
  fmap g (fmap f (Const x))
= { definition of composition }
  (fmap g . fmap f) (Const x)</pre>
</p>
<p>
While that proof takes a few more steps, most are as trivial as the first proof.
</p>
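<p>
The proofs cover every possible value, but it can still be reassuring to see the laws hold for a concrete case. The following is a minimal sketch, assuming <code>Data.Functor.Const</code> from <code>base</code>; the names <code>sample</code>, <code>firstLawHolds</code>, and <code>secondLawHolds</code> are made up for the illustration, and the phantom type parameter is fixed to <code>Int</code> only so that there's something to map over.
</p>

```haskell
import Data.Functor.Const (Const (..))

-- A sample constant value. The second type parameter is a phantom;
-- Int is an arbitrary choice for this illustration.
sample :: Const String Int
sample = Const "foo"

-- First functor law: mapping id leaves the value unchanged.
firstLawHolds :: Bool
firstLawHolds = getConst (fmap id sample) == getConst sample

-- Second functor law: mapping a composition gives the same result
-- as composing the two maps.
secondLawHolds :: Bool
secondLawHolds =
  getConst (fmap (show . (+ 1)) sample)
    == getConst ((fmap show . fmap (+ 1)) sample)
```

<p>
Both values evaluate to <code>True</code>. This is only a spot check for one value and one pair of functions, not a replacement for the equational proofs.
</p>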
<h3 id="e71a037a6f3f491ca3f755ce31809123">
Conclusion <a href="#e71a037a6f3f491ca3f755ce31809123">#</a>
</h3>
<p>
The Const functor is hardly a programming construct you'll use in your day-to-day work, but the fact that it exists can be used to generalize some results that involve functors. Now, whenever you have a result that involves a functor, you know that it also generalizes to constant values, just as the Identity functor teaches us that 'naked' type parameters can be thought of as functors.
</p>
<p>
To give a few examples, we may already know that <code>Tree<T></code> (C# syntax) is a functor, but a 'naked' generic type parameter <code>T</code> also gives rise to a functor (Identity), as does a non-generic type (such as <code>int</code> or <code>MyCustomClass</code>).
</p>
<p>
Thus, if you have a function that operates on any functor, it may also, conceivably, operate on data structures that have non-generic types. This may for example be interesting when we begin to consider <a href="/2022/07/11/functor-relationships">how functors compose</a>.
</p>
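<p>
As a small sketch of that idea, consider a hypothetical function <code>render</code> that works for any functor. The function and the three example values are invented for this illustration; only <code>Const</code> and <code>Identity</code> come from <code>base</code>.
</p>

```haskell
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- A hypothetical function that operates on any functor.
render :: Functor f => f Int -> f String
render = fmap show

-- Applied to a 'normal' container: every element is converted.
overList :: [String]
overList = render [1, 2, 3]

-- Applied to a 'naked' value wrapped in Identity.
overIdentity :: String
overIdentity = runIdentity (render (Identity 42))

-- Applied to a constant, non-generic value: nothing to convert,
-- so the Bool passes through untouched.
overConst :: Bool
overConst = getConst (render (Const True))
```

<p>
Here, <code>overList</code> is <code>["1","2","3"]</code>, <code>overIdentity</code> is <code>"42"</code>, and <code>overConst</code> is still <code>True</code>.
</p>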
<p>
<strong>Next:</strong> <a href="/2021/07/19/the-state-functor">The State functor</a>.
</p>
</div><hr>
This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>.
Das verflixte Hunde-Spiel
https://blog.ploeh.dk/2024/10/03/das-verflixte-hunde-spiel
2024-10-03T17:41:00+00:00
Mark Seemann
<div id="post">
<p>
<em>A puzzle kata, and a possible solution.</em>
</p>
<p>
When I was a boy I had a nine-piece puzzle that I'd been gifted by the Swiss branch of my family. It's called <em>Das verflixte Hunde-Spiel</em>, which means something like <em>the confounded dog game</em> in English. And while a puzzle with nine pieces doesn't sound like much, it is, in fact, incredibly difficult.
</p>
<p>
It's just a specific incarnation of a kind of game that you've almost certainly encountered, too.
</p>
<p>
<img src="/content/binary/hunde-spiel.jpg" alt="A picture of the box of the puzzle, together with the tiles spread out in unordered fashion.">
</p>
<p>
There are nine tiles, each with two dog heads and two dog ends. A dog may be coloured in one of four different patterns. The object of the game is to lay out the nine tiles in a 3x3 square so that all dog halves line up.
</p>
<h3 id="ddf5aa390eed4147a55a35e95803b6ad">
Game details <a href="#ddf5aa390eed4147a55a35e95803b6ad">#</a>
</h3>
<p>
The game is from 1979. Two of the tiles are identical, and, according to the information on the back of the box, two possible solutions exist. Described from top clockwise, the tiles are the following:
</p>
<ul>
<li>Brown head, grey head, umber tail, spotted tail</li>
<li>Brown head, spotted head, brown tail, umber tail</li>
<li>Brown head, spotted head, grey tail, umber tail</li>
<li>Brown head, spotted head, grey tail, umber tail</li>
<li>Brown head, umber head, spotted tail, grey tail</li>
<li>Grey head, brown head, spotted tail, umber tail</li>
<li>Grey head, spotted head, brown tail, umber tail</li>
<li>Grey head, umber head, brown tail, spotted tail</li>
<li>Grey head, umber head, grey tail, spotted tail</li>
</ul>
<p>
I've taken the liberty of using a shorthand for the patterns. The grey dogs are actually also spotted, but since there's only one grey pattern, the <em>grey</em> label is unambiguous. The dogs I've named <em>umber</em> are actually rather <em>burnt umber</em>, but that's too verbose for my tastes, so I just named them <em>umber</em>. Finally, the label <em>spotted</em> indicates dogs that are actually burnt umber with brown blotches.
</p>
<p>
Notice that there are two tiles with a brown head, a spotted head, a grey tail, and an umber tail.
</p>
<p>
The object of the game is to lay down the tiles in a 3x3 square so that all dogs fit. For further reference, I've numbered each position from one to nine like this:
</p>
<p>
<img src="/content/binary/numbered-3x3-tiles.png" alt="Nine tiles arranged in a three-by-three square, numbered from 1 to 9 from top left to bottom right.">
</p>
<p>
What makes the game hard? There are nine tiles, so if you start with the upper left corner, you have nine choices. If you just randomly put down the tiles, you now have eight left for the top middle position, and so on. Standard combinatorics indicate that there are at least 9! = 362,880 permutations.
</p>
<p>
That's not the whole story, however, since you can rotate each tile in four different ways. You can rotate the first tile four ways, the second tile four ways, etc. for a total of 4<sup>9</sup> = 262,144 ways. Multiply these two numbers together, and you get 4<sup>9</sup>9! = 95,126,814,720 combinations. No wonder this puzzle is hard if there are only two solutions.
</p>
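<p>
The arithmetic is easy to double-check. The following sketch uses made-up function names (<code>orderings</code>, <code>rotationChoices</code>, <code>configurations</code>) and <code>Integer</code> to stay clear of any overflow concerns.
</p>

```haskell
-- Number of ways to order n tiles: n!
orderings :: Integer -> Integer
orderings n = product [1 .. n]

-- Number of ways to rotate n tiles independently,
-- with four rotations per tile: 4^n
rotationChoices :: Integer -> Integer
rotationChoices n = 4 ^ n

-- Orderings and rotations combined.
configurations :: Integer -> Integer
configurations n = rotationChoices n * orderings n
```

<p>
With these definitions, <code>orderings 9</code> is 362880, <code>rotationChoices 9</code> is 262144, and <code>configurations 9</code> is 95126814720.
</p>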
<p>
When analysed this way, however, there are actually 16 solutions, but that still makes it incredibly unlikely to arrive at a solution by chance. I'll get back to why there are 16 solutions later. For now, you should have enough information to try your hand with this game, if you'd like.
</p>
<p>
I found that the game made for an interesting <a href="/2020/01/13/on-doing-katas">kata</a>: Write a program that finds all possible solutions to the puzzle.
</p>
<p>
If you'd like to try your hand at this exercise, I suggest that you pause reading here.
</p>
<p>
In the rest of the article, I'll outline my first attempt. Spoiler alert: I'll also show one of the solutions.
</p>
<h3 id="113acd886fde4791b10c4a2b6f394216">
Types <a href="#113acd886fde4791b10c4a2b6f394216">#</a>
</h3>
<p>
When you program in <a href="https://www.haskell.org/">Haskell</a>, it's natural to start by defining some types.
</p>
<p>
<pre><span style="color:blue;">data</span> Half = Head | Tail <span style="color:blue;">deriving</span> (<span style="color:#2b91af;">Show</span>, <span style="color:#2b91af;">Eq</span>)
<span style="color:blue;">data</span> Pattern = Brown | Grey | Spotted | Umber <span style="color:blue;">deriving</span> (<span style="color:#2b91af;">Show</span>, <span style="color:#2b91af;">Eq</span>)
<span style="color:blue;">data</span> Tile = Tile {
    <span style="color:#2b91af;">top</span> <span style="color:blue;">::</span> (<span style="color:blue;">Pattern</span>, <span style="color:blue;">Half</span>),
    <span style="color:#2b91af;">right</span> <span style="color:blue;">::</span> (<span style="color:blue;">Pattern</span>, <span style="color:blue;">Half</span>),
    <span style="color:#2b91af;">bottom</span> <span style="color:blue;">::</span> (<span style="color:blue;">Pattern</span>, <span style="color:blue;">Half</span>),
    <span style="color:#2b91af;">left</span> <span style="color:blue;">::</span> (<span style="color:blue;">Pattern</span>, <span style="color:blue;">Half</span>) }
  <span style="color:blue;">deriving</span> (<span style="color:#2b91af;">Show</span>, <span style="color:#2b91af;">Eq</span>)</pre>
</p>
<p>
Each tile describes what you find on its <code>top</code>, <code>right</code> side, <code>bottom</code>, and <code>left</code> side.
</p>
<p>
We're also going to need a function to evaluate whether two halves match:
</p>
<p>
<pre><span style="color:#2b91af;">matches</span> <span style="color:blue;">::</span> (<span style="color:blue;">Pattern</span>, <span style="color:blue;">Half</span>) <span style="color:blue;">-></span> (<span style="color:blue;">Pattern</span>, <span style="color:blue;">Half</span>) <span style="color:blue;">-></span> <span style="color:#2b91af;">Bool</span>
matches (p1, h1) (p2, h2) = p1 == p2 && h1 /= h2</pre>
</p>
<p>
This function demands that the patterns match, but that the halves are opposites.
</p>
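<p>
A few sample evaluations may make the rule concrete. The type and function definitions below are repeated from above so that the sketch is self-contained; the three named results are invented for the illustration.
</p>

```haskell
data Half = Head | Tail deriving (Show, Eq)
data Pattern = Brown | Grey | Spotted | Umber deriving (Show, Eq)

matches :: (Pattern, Half) -> (Pattern, Half) -> Bool
matches (p1, h1) (p2, h2) = p1 == p2 && h1 /= h2

-- Same pattern, opposite halves: a complete dog.
completeDog :: Bool
completeDog = (Brown, Head) `matches` (Brown, Tail)

-- Same pattern, but two heads don't make a dog.
twoHeads :: Bool
twoHeads = (Brown, Head) `matches` (Brown, Head)

-- Opposite halves, but the patterns differ.
mixedPatterns :: Bool
mixedPatterns = (Brown, Head) `matches` (Grey, Tail)
```

<p>
Only <code>completeDog</code> is <code>True</code>; the other two are <code>False</code>.
</p>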
<p>
You can use the <code>Tile</code> type and its constituents to define the nine tiles of the game:
</p>
<p>
<pre><span style="color:#2b91af;">tiles</span> <span style="color:blue;">::</span> [<span style="color:blue;">Tile</span>]
tiles =
  [
    Tile (Brown, Head) (Grey, Head) (Umber, Tail) (Spotted, Tail),
    Tile (Brown, Head) (Spotted, Head) (Brown, Tail) (Umber, Tail),
    Tile (Brown, Head) (Spotted, Head) (Grey, Tail) (Umber, Tail),
    Tile (Brown, Head) (Spotted, Head) (Grey, Tail) (Umber, Tail),
    Tile (Brown, Head) (Umber, Head) (Spotted, Tail) (Grey, Tail),
    Tile (Grey, Head) (Brown, Head) (Spotted, Tail) (Umber, Tail),
    Tile (Grey, Head) (Spotted, Head) (Brown, Tail) (Umber, Tail),
    Tile (Grey, Head) (Umber, Head) (Brown, Tail) (Spotted, Tail),
    Tile (Grey, Head) (Umber, Head) (Grey, Tail) (Spotted, Tail)
  ]</pre>
</p>
<p>
Because I'm the neatnik that I am, I've sorted the tiles in lexicographic order, but the solution below doesn't rely on that.
</p>
<h3 id="1568796e41484e21bae6bb5734f996eb">
Brute force doesn't work <a href="#1568796e41484e21bae6bb5734f996eb">#</a>
</h3>
<p>
Before I started, I cast around the internet to see if there was an appropriate algorithm for the problem. While I found a few answers on <a href="https://stackoverflow.com/">Stack Overflow</a>, none of them gave me any indication that a sophisticated algorithm was available. (Even so, there may be, and I just didn't find it.)
</p>
<p>
It seems clear, however, that you can implement some kind of recursive search-tree algorithm that cuts a branch off as soon as it realizes that it doesn't work. I'll get back to that later, so let's leave that for now.
</p>
<p>
Since I'd planned on writing the code in Haskell, I decided to first try something that might look like brute force. Because Haskell is lazily evaluated, you can sometimes get away with techniques that look wasteful when you're used to strict/eager evaluation. In this case, it turned out to not work, but it's often quicker to just make the attempt than trying to analyze the problem.
</p>
<p>
As already outlined, I first attempted a purely brute-force solution, betting that Haskell's lazy evaluation would be enough to skip over the unnecessary calculations:
</p>
<p>
<pre>allRotationsOf9 = replicateM 9 [0..3]
<span style="color:#2b91af;">allRotations</span> <span style="color:blue;">::</span> [<span style="color:blue;">Tile</span>] <span style="color:blue;">-></span> [[<span style="color:blue;">Tile</span>]]
allRotations ts = <span style="color:blue;">fmap</span> (\rs -> (\(r, t) -> rotations t !! r) <$> <span style="color:blue;">zip</span> rs ts) allRotationsOf9
<span style="color:#2b91af;">allConfigurations</span> <span style="color:blue;">::</span> [[<span style="color:blue;">Tile</span>]]
allConfigurations = permutations tiles >>= allRotations
solutions = <span style="color:blue;">filter</span> isSolution allConfigurations</pre>
</p>
<p>
My idea with the <code>allConfigurations</code> value was that it's supposed to enumerate all 95 billion combinations. Whether it actually does that, I was never able to verify, because if I try to run that code, my poor laptop runs for a couple of hours before it eventually runs out of memory. In other words, the GHCi process crashes.
</p>
<p>
I haven't shown <code>isSolution</code> or <code>rotations</code>, because I consider the implementations irrelevant. This attempt doesn't work anyway.
</p>
<p>
Now that I look at it, it's quite clear why this isn't a good strategy. There's little to be gained from lazy evaluation when the final step just attempts to <code>filter</code> a list. Even with lazy evaluation, the code still has to run through all 95 billion combinations.
</p>
<p>
Things might have been different if I just had to find one solution. With a little luck, it might be that the first solution appears after, say, a hundred million iterations, and lazy evaluation would then have meant that the remaining combinations would never run. Not so here, but hindsight is 20-20.
</p>
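<p>
To illustrate the difference between asking for one result and asking for all of them, here's a toy sketch that has nothing to do with the puzzle itself. The <code>candidates</code> list and the <code>isHit</code> predicate are hypothetical stand-ins for a large search space and a solution test.
</p>

```haskell
-- A hypothetical, infinitely large search space.
candidates :: [Integer]
candidates = [1 ..]

-- A hypothetical solution test.
isHit :: Integer -> Bool
isHit n = n * n > 1000000

-- Asking for only the first hit: lazy evaluation stops the search
-- as soon as one candidate passes the filter.
firstHit :: [Integer]
firstHit = take 1 (filter isHit candidates)
```

<p>
Evaluating <code>firstHit</code> returns <code>[1001]</code> after examining only 1,001 candidates. Asking for the full filtered list, on the other hand, would force the entire search space, which in this toy example never terminates.
</p>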
<h3 id="93754ac1a84e4a42b87253f1ffded97b">
Search tree <a href="#93754ac1a84e4a42b87253f1ffded97b">#</a>
</h3>
<p>
Back to the search tree idea. It goes like this: Start from the top left position and pick a random tile and rotation. Now pick an arbitrary tile <em>that fits</em> and place it to the right of it, and so on. As far as I can tell, you can always place the first four tiles, but from there, you can easily encounter a combination that allows no further tiles. Here's an example:
</p>
<p>
<img src="/content/binary/hunde-spiel-no-fifth-tile.jpg" alt="Four matching tiles put down, with the remaining five tiles arranged to show that none of them fit the fifth position.">
</p>
<p>
None of the remaining five tiles fit in the fifth position. This means that we don't have to do <em>any</em> permutations that involve these four tiles in that combination. While the algorithm has to search through all five remaining tiles and rotations to discover that none fit in position 5, once it knows that, it doesn't have to go through the remaining four positions. That's 4<sup>4</sup>4! = 6,144 combinations that it can skip every time it discovers an impossible beginning. That doesn't sound like that much, but if we assume that this happens more often than not, it's still an improvement by orders of magnitude.
</p>
<p>
We may think of this algorithm as constructing a search tree, but immediately pruning all branches that aren't viable, as close to the root as possible.
</p>
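<p>
The pruning idea can be sketched on a much smaller problem. The following is not the puzzle solver, only a generic illustration under an invented constraint: arrange the numbers 1 to 4 so that neighbours differ by more than one. The names <code>fits</code>, <code>search</code>, and <code>arrangements</code> are hypothetical.
</p>

```haskell
import Data.List (delete)

-- Hypothetical constraint: a new element fits if it differs from
-- the most recently placed element by more than one.
fits :: [Int] -> Int -> Bool
fits [] _ = True
fits (prev : _) x = abs (prev - x) > 1

-- Extend a partial arrangement (kept in reverse order) one element
-- at a time. Candidates that don't fit are discarded immediately,
-- so none of the arrangements that would start that way are explored.
search :: [Int] -> [Int] -> [[Int]]
search placed [] = [reverse placed]
search placed remaining = do
  x <- filter (fits placed) remaining
  search (x : placed) (delete x remaining)

arrangements :: [[Int]]
arrangements = search [] [1 .. 4]
```

<p>
Here, <code>arrangements</code> evaluates to <code>[[2,4,1,3],[3,1,4,2]]</code>. A beginning such as 1, 2 is rejected as soon as the 2 is considered, so no arrangement with that prefix is ever generated.
</p>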
<h3 id="76d4e7e1898a4da89de0d7afabbdc4e8">
Matches <a href="#76d4e7e1898a4da89de0d7afabbdc4e8">#</a>
</h3>
<p>
Before we get to the algorithm proper we need a few simple helper functions. One kind of function is a predicate that determines if a particular tile can occupy a given position. Since we may place any tile in any rotation in the first position, we don't need to write a predicate for that, but if we wanted to generalize, <code>const True</code> would do.
</p>
<p>
Whether or not we can place a given tile in the second position depends exclusively on the tile in the first position:
</p>
<p>
<pre><span style="color:#2b91af;">tile2Matches</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Bool</span>
tile2Matches t1 t2 = right t1 `matches` left t2</pre>
</p>
<p>
If the <code>right</code> dog part of the first tile <code>matches</code> the <code>left</code> part of the second tile, the return value is <code>True</code>; otherwise, it's <code>False</code>. Note that I'm using infix notation for <code>matches</code>. I could also have written the function as
</p>
<p>
<pre><span style="color:#2b91af;">tile2Matches</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Bool</span>
tile2Matches t1 t2 = matches (right t1) (left t2)</pre>
</p>
<p>
but it doesn't read as well.
</p>
<p>
In any case, the corresponding matching functions for the third and fourth tiles look similar:
</p>
<p>
<pre><span style="color:#2b91af;">tile3Matches</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Bool</span>
tile3Matches t2 t3 = right t2 `matches` left t3
<span style="color:#2b91af;">tile4Matches</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Bool</span>
tile4Matches t1 t4 = bottom t1 `matches` top t4</pre>
</p>
<p>
Notice that <code>tile4Matches</code> compares the fourth tile with the first tile rather than the third tile, because position 4 is directly beneath position 1, rather than to the right of position 3 (cf. the grid above). For that reason it also compares the <code>bottom</code> of tile 1 to the <code>top</code> of the fourth tile.
</p>
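<p>
To see a horizontal and a vertical matcher at work, here's a self-contained sketch that repeats the definitions from above and applies them to two tiles from the list. The names <code>firstTile</code>, <code>fifthTile</code>, and <code>fifthTurned</code> are made up for the example; <code>fifthTurned</code> is the fifth tile written out by hand after a quarter turn counter-clockwise.
</p>

```haskell
data Half = Head | Tail deriving (Show, Eq)
data Pattern = Brown | Grey | Spotted | Umber deriving (Show, Eq)

data Tile = Tile
  { top :: (Pattern, Half),
    right :: (Pattern, Half),
    bottom :: (Pattern, Half),
    left :: (Pattern, Half)
  }
  deriving (Show, Eq)

matches :: (Pattern, Half) -> (Pattern, Half) -> Bool
matches (p1, h1) (p2, h2) = p1 == p2 && h1 /= h2

tile2Matches :: Tile -> Tile -> Bool
tile2Matches t1 t2 = right t1 `matches` left t2

tile4Matches :: Tile -> Tile -> Bool
tile4Matches t1 t4 = bottom t1 `matches` top t4

-- The first tile in the list: grey head to the right, umber tail below.
firstTile :: Tile
firstTile = Tile (Brown, Head) (Grey, Head) (Umber, Tail) (Spotted, Tail)

-- The fifth tile in the list, unrotated: grey tail on its left side.
fifthTile :: Tile
fifthTile = Tile (Brown, Head) (Umber, Head) (Spotted, Tail) (Grey, Tail)

-- The fifth tile turned a quarter counter-clockwise: umber head on top.
fifthTurned :: Tile
fifthTurned = Tile (Umber, Head) (Spotted, Tail) (Grey, Tail) (Brown, Head)

-- Position 2 is to the right of position 1.
fitsToTheRight :: Bool
fitsToTheRight = tile2Matches firstTile fifthTile

-- Position 4 is directly beneath position 1.
fitsBelow :: Bool
fitsBelow = tile4Matches firstTile fifthTurned
```

<p>
Both <code>fitsToTheRight</code> and <code>fitsBelow</code> are <code>True</code>, whereas <code>tile2Matches firstTile fifthTurned</code> is <code>False</code>, since the turned tile now shows a brown head on its left side.
</p>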
<p>
The matcher for the fifth tile is different:
</p>
<p>
<pre><span style="color:#2b91af;">tile5Matches</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Bool</span>
tile5Matches t2 t4 t5 = bottom t2 `matches` top t5 && right t4 `matches` left t5</pre>
</p>
<p>
This is the first predicate that depends on two, rather than one, previous tiles. In position 5 we need to examine both the tile in position 2 and the one in position 4.
</p>
<p>
The same is true for position 6:
</p>
<p>
<pre><span style="color:#2b91af;">tile6Matches</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Bool</span>
tile6Matches t3 t5 t6 = bottom t3 `matches` top t6 && right t5 `matches` left t6</pre>
</p>
<p>
but then the matcher for position 7 looks like the predicate for position 4:
</p>
<p>
<pre><span style="color:#2b91af;">tile7Matches</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Bool</span>
tile7Matches t4 t7 = bottom t4 `matches` top t7</pre>
</p>
<p>
This is, of course, because the tile in position 7 only has to consider the tile in position 4. Finally, not surprisingly, the two remaining predicates look like something we've already seen:
</p>
<p>
<pre><span style="color:#2b91af;">tile8Matches</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Bool</span>
tile8Matches t5 t7 t8 = bottom t5 `matches` top t8 && right t7 `matches` left t8
<span style="color:#2b91af;">tile9Matches</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Bool</span>
tile9Matches t6 t8 t9 = bottom t6 `matches` top t9 && right t8 `matches` left t9</pre>
</p>
<p>
You may suggest that it'd be possible to reduce the number of predicates. After all, there are effectively only three different predicates: One that only looks at the tile to the left, one that only looks at the tile above, and one that looks both to the left and above.
</p>
<p>
Indeed, I could have boiled it down to just three functions:
</p>
<p>
<pre><span style="color:#2b91af;">matchesHorizontally</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Bool</span>
matchesHorizontally x y = right x `matches` left y
<span style="color:#2b91af;">matchesVertically</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Bool</span>
matchesVertically x y = bottom x `matches` top y
<span style="color:#2b91af;">matchesBoth</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:#2b91af;">Bool</span>
matchesBoth x y z = matchesVertically x z && matchesHorizontally y z</pre>
</p>
<p>
but I would then run the risk of calling the wrong predicate from my implementation of the algorithm. As you'll see, I'll call each predicate by name at each appropriate step, but if I had only these three functions, there's a risk that I might mistakenly use <code>matchesHorizontally</code> when I should have used <code>matchesVertically</code>, or vice versa. Reducing eight one-liners to three one-liners doesn't really seem to warrant the risk.
</p>
<h3 id="01bf66f9df1947d296a004e93638450d">
Rotations <a href="#01bf66f9df1947d296a004e93638450d">#</a>
</h3>
<p>
In addition to examining whether a given tile fits in a given position, we also need to be able to rotate any tile:
</p>
<p>
<pre><span style="color:#2b91af;">rotateClockwise</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span>
rotateClockwise (Tile t r b l) = Tile l t r b
<span style="color:#2b91af;">rotateCounterClockwise</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span>
rotateCounterClockwise (Tile t r b l) = Tile r b l t
<span style="color:#2b91af;">upend</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> <span style="color:blue;">Tile</span>
upend (Tile t r b l) = Tile b l t r</pre>
</p>
<p>
What is really needed, it turns out, is to enumerate all four rotations of a tile:
</p>
<p>
<pre><span style="color:#2b91af;">rotations</span> <span style="color:blue;">::</span> <span style="color:blue;">Tile</span> <span style="color:blue;">-></span> [<span style="color:blue;">Tile</span>]
rotations t = [t, rotateClockwise t, upend t, rotateCounterClockwise t]</pre>
</p>
<p>
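<p>
A few sanity checks can confirm that the three rotation functions agree with each other. The sketch below repeats the rotation functions, uses a positional <code>Tile</code> definition for brevity, and adds some hypothetical property names (<code>fullTurnIsIdentity</code> and so on) that are evaluated for a single sample tile only.
</p>

```haskell
data Half = Head | Tail deriving (Show, Eq)
data Pattern = Brown | Grey | Spotted | Umber deriving (Show, Eq)

-- Positional version of the Tile type: top, right, bottom, left.
data Tile = Tile (Pattern, Half) (Pattern, Half) (Pattern, Half) (Pattern, Half)
  deriving (Show, Eq)

rotateClockwise :: Tile -> Tile
rotateClockwise (Tile t r b l) = Tile l t r b

rotateCounterClockwise :: Tile -> Tile
rotateCounterClockwise (Tile t r b l) = Tile r b l t

upend :: Tile -> Tile
upend (Tile t r b l) = Tile b l t r

rotations :: Tile -> [Tile]
rotations t = [t, rotateClockwise t, upend t, rotateCounterClockwise t]

sample :: Tile
sample = Tile (Brown, Head) (Grey, Head) (Umber, Tail) (Spotted, Tail)

-- Four quarter turns bring the tile back to where it started.
fullTurnIsIdentity :: Bool
fullTurnIsIdentity = iterate rotateClockwise sample !! 4 == sample

-- Two quarter turns are the same as turning the tile upside down.
halfTurnIsUpend :: Bool
halfTurnIsUpend = rotateClockwise (rotateClockwise sample) == upend sample

-- A counter-clockwise turn undoes a clockwise turn.
counterUndoesClockwise :: Bool
counterUndoesClockwise = rotateCounterClockwise (rotateClockwise sample) == sample

-- The four rotations are exactly zero to three clockwise quarter turns.
rotationsAreQuarterTurns :: Bool
rotationsAreQuarterTurns = rotations sample == take 4 (iterate rotateClockwise sample)
```

<p>
All four checks evaluate to <code>True</code> for the sample tile.
</p>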
Since this, like everything else here, is a pure function, I experimented with defining a 'memoized tile' type that embedded all four rotations upon creation, so that the algorithm doesn't need to call the <code>rotations</code> function millions of times, but I couldn't measure any discernible performance improvement from it. There's no reason to make things more complicated than they need to be, so I didn't keep that change. (Since I do, however, <a href="https://stackoverflow.blog/2022/12/19/use-git-tactically/">use Git tactically</a>, I did, of course, <a href="https://git-scm.com/docs/git-stash">stash</a> the experiment.)
</p>
<h3 id="8e7ce1c1bb9c4403abd72d5d3d87bf02">
Permutations <a href="#8e7ce1c1bb9c4403abd72d5d3d87bf02">#</a>
</h3>
<p>
While I couldn't make things work by enumerating all 95 billion combinations, enumerating all 362,880 permutations of non-rotated tiles is well within the realm of the possible:
</p>
<p>
<pre><span style="color:#2b91af;">allPermutations</span> <span style="color:blue;">::</span> [(<span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)]
allPermutations =
  (\[t1, t2, t3, t4, t5, t6, t7, t8, t9] -> (t1, t2, t3, t4, t5, t6, t7, t8, t9))
  <$> permutations tiles</pre>
</p>
<p>
Doing this in GHCi on my old laptop takes 300 milliseconds, which is good enough compared to what comes next.
</p>
<p>
This list value uses <a href="https://hackage.haskell.org/package/base/docs/Data-List.html#v:permutations">permutations</a> to enumerate all the permutations. You may already have noticed that it converts the result into a nine-tuple. The reason for that is that this enables the algorithm to pattern-match into specific positions without having to resort to the <a href="https://hackage.haskell.org/package/base/docs/Data-List.html#v:-33--33-">index operator</a>, which is both partial and requires iteration of the list to reach the indexed element. Granted, the list is only nine elements long, and often the algorithm will only need to index to the fourth or fifth element. On the other hand, it's going to do it <em>a lot</em>. Perhaps it's a premature optimization, but if it is, it's at least one that makes the code more, rather than less, readable.
</p>
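<p>
The list-to-tuple conversion can be shown on a smaller scale. The sketch below is a variation rather than the code used in this article: it converts permutations of three characters into triples, and it uses a hypothetical total helper, <code>toTriple</code>, that returns <code>Nothing</code> for lists of any other length instead of relying on a partial lambda expression.
</p>

```haskell
import Data.List (permutations)
import Data.Maybe (mapMaybe)

-- Convert a list of exactly three elements to a triple.
-- Lists of any other length produce Nothing.
toTriple :: [a] -> Maybe (a, a, a)
toTriple [x, y, z] = Just (x, y, z)
toTriple _ = Nothing

-- All permutations of three characters as triples, so that each
-- position can be reached by pattern matching rather than indexing.
triples :: [(Char, Char, Char)]
triples = mapMaybe toTriple (permutations "abc")
```

<p>
The <code>triples</code> list contains all six orderings, and a consumer can write <code>(x, _, z)</code> to get at the first and third elements without touching the index operator. With nine elements the same idea applies; only the patterns get wider.
</p>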
<h3 id="3f0af3d6c91a4cd68b026a0ccf93a0e2">
Algorithm <a href="#3f0af3d6c91a4cd68b026a0ccf93a0e2">#</a>
</h3>
<p>
I found it easiest to begin at the 'bottom' of what is effectively a recursive algorithm, even though I didn't implement it that way. At the 'bottom', I imagine that I'm almost done: That I've found eight tiles that match, and now I only need to examine if I can rotate the final tile so that it matches:
</p>
<p>
<pre><span style="color:#2b91af;">solve9th</span> <span style="color:blue;">::</span> (a, b, c, d, e, <span style="color:blue;">Tile</span>, g, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)
  <span style="color:blue;">-></span> [(a, b, c, d, e, <span style="color:blue;">Tile</span>, g, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)]
solve9th (t1, t2, t3, t4, t5, t6, t7, t8, t9) = <span style="color:blue;">do</span>
  match <- <span style="color:blue;">filter</span> (tile9Matches t6 t8) $ rotations t9
  <span style="color:blue;">return</span> (t1, t2, t3, t4, t5, t6, t7, t8, match)</pre>
</p>
<p>
Recalling that Haskell functions compose from right to left, the function starts by enumerating the four <code>rotations</code> of the ninth and final tile <code>t9</code>. It then filters those four rotations by the <code>tile9Matches</code> predicate.
</p>
<p>
The <code>match</code> value is a rotation of <code>t9</code> that matches <code>t6</code> and <code>t8</code>. Whenever <code>solve9th</code> finds such a match, it returns the entire nine-tuple, because the assumption is that the first eight tiles are already valid.
</p>
<p>
Notice that the function uses <code>do</code> notation in the list monad, so it's quite possible that the first <code>filter</code> expression produces no <code>match</code>. In that case, the second line of code never runs, and instead, the function returns the empty list.
</p>
<p>
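<p>
The short-circuiting behaviour of the list monad is easier to see with plain numbers. In this toy sketch, the hypothetical function <code>pairWithEven</code> has the same shape as <code>solve9th</code>: a <code>filter</code> expression followed by a <code>return</code>.
</p>

```haskell
-- Pair a label with every even candidate. If no candidate is even,
-- the line after the filter never runs, and the result is empty.
pairWithEven :: Int -> [Int] -> [(Int, Int)]
pairWithEven label candidates = do
  match <- filter even candidates
  return (label, match)
```

<p>
Evaluating <code>pairWithEven 1 [1, 2, 4]</code> produces <code>[(1,2),(1,4)]</code>, one result per match, while <code>pairWithEven 1 [1, 3, 5]</code> produces <code>[]</code>.
</p>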
How do we find a tuple where the first eight elements are valid? Well, if we have seven valid tiles, we may consider the eighth and subsequently call <code>solve9th</code>:
</p>
<p>
<pre><span style="color:#2b91af;">solve8th</span> <span style="color:blue;">::</span> (a, b, c, d, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)
  <span style="color:blue;">-></span> [(a, b, c, d, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)]
solve8th (t1, t2, t3, t4, t5, t6, t7, t8, t9) = <span style="color:blue;">do</span>
  match <- <span style="color:blue;">filter</span> (tile8Matches t5 t7) $ rotations t8
  solve9th (t1, t2, t3, t4, t5, t6, t7, match, t9)</pre>
</p>
<p>
This function looks a lot like <code>solve9th</code>, but it instead enumerates the four <code>rotations</code> of the eighth tile <code>t8</code> and filters with the <code>tile8Matches</code> predicate. Due to the <code>do</code> notation, it'll only call <code>solve9th</code> if it finds a <code>match</code>.
</p>
<p>
Once more, this function assumes that the first seven tiles are already in a legal constellation. How do we find seven valid tiles? The same way we find eight: By assuming that we have six valid tiles, and then finding the seventh, and so on:
</p>
<p>
<pre><span style="color:#2b91af;">solve7th</span> <span style="color:blue;">::</span> (a, b, c, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)
  <span style="color:blue;">-></span> [(a, b, c, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)]
solve7th (t1, t2, t3, t4, t5, t6, t7, t8, t9) = <span style="color:blue;">do</span>
  match <- <span style="color:blue;">filter</span> (tile7Matches t4) $ rotations t7
  solve8th (t1, t2, t3, t4, t5, t6, match, t8, t9)

<span style="color:#2b91af;">solve6th</span> <span style="color:blue;">::</span> (a, b, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)
  <span style="color:blue;">-></span> [(a, b, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)]
solve6th (t1, t2, t3, t4, t5, t6, t7, t8, t9) = <span style="color:blue;">do</span>
  match <- <span style="color:blue;">filter</span> (tile6Matches t3 t5) $ rotations t6
  solve7th (t1, t2, t3, t4, t5, match, t7, t8, t9)

<span style="color:#2b91af;">solve5th</span> <span style="color:blue;">::</span> (a, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)
  <span style="color:blue;">-></span> [(a, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)]
solve5th (t1, t2, t3, t4, t5, t6, t7, t8, t9) = <span style="color:blue;">do</span>
  match <- <span style="color:blue;">filter</span> (tile5Matches t2 t4) $ rotations t5
  solve6th (t1, t2, t3, t4, match, t6, t7, t8, t9)

<span style="color:#2b91af;">solve4th</span> <span style="color:blue;">::</span> (<span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)
  <span style="color:blue;">-></span> [(<span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)]
solve4th (t1, t2, t3, t4, t5, t6, t7, t8, t9) = <span style="color:blue;">do</span>
  match <- <span style="color:blue;">filter</span> (tile4Matches t1) $ rotations t4
  solve5th (t1, t2, t3, match, t5, t6, t7, t8, t9)

<span style="color:#2b91af;">solve3rd</span> <span style="color:blue;">::</span> (<span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)
  <span style="color:blue;">-></span> [(<span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)]
solve3rd (t1, t2, t3, t4, t5, t6, t7, t8, t9) = <span style="color:blue;">do</span>
  match <- <span style="color:blue;">filter</span> (tile3Matches t2) $ rotations t3
  solve4th (t1, t2, match, t4, t5, t6, t7, t8, t9)

<span style="color:#2b91af;">solve2nd</span> <span style="color:blue;">::</span> (<span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)
  <span style="color:blue;">-></span> [(<span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)]
solve2nd (t1, t2, t3, t4, t5, t6, t7, t8, t9) = <span style="color:blue;">do</span>
match <- <span style="color:blue;">filter</span> (tile2Matches t1) $ rotations t2
solve3rd (t1, match, t3, t4, t5, t6, t7, t8, t9)</pre>
</p>
<p>
You'll observe that <code>solve7th</code> down to <code>solve2nd</code> are very similar. The only things that really vary are the predicates, and the positions of the tile being examined, as well as its neighbours. Clearly I can generalize this code, but I'm not sure it's worth it. I wrote a few of these in the order I've presented them here, because it helped me think the problem through, and to be honest, once I had two or three of them, <a href="https://github.com/features/copilot">GitHub Copilot</a> picked up on the pattern and wrote the remaining functions for me.
</p>
<p>
Granted, <a href="/2018/09/17/typing-is-not-a-programming-bottleneck">typing isn't a programming bottleneck</a>, so we should rather ask if this kind of duplication looks like a maintenance problem. Given that this is a one-time exercise, I'll just leave it be and move on.
</p>
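<p>
For the curious, a generalization could look roughly like the following sketch. It is not code from the puzzle solution: it assumes that the board is a list rather than a nine-tuple, and <code>placeAt</code>, <code>solveAll</code>, <code>fits</code>, and <code>rots</code> are hypothetical names introduced only for illustration.
</p>
<p>
<pre>-- Hypothetical sketch: the board is a list here, not the article's nine-tuple.

-- Try every rotation of the piece at index i, keeping only those accepted by
-- the predicate. The predicate sees the pieces already placed before index i.
placeAt :: Int -> ([a] -> a -> Bool) -> (a -> [a]) -> [a] -> [[a]]
placeAt i fits rots board =
  case splitAt i board of
    (placed, current : rest) -> do
      match <- filter (fits placed) (rots current)
      return (placed ++ match : rest)
    _ -> [] -- index out of range, so there are no candidates

-- Run placeAt for every position in turn. Each step only runs for boards
-- that survived the previous one.
solveAll :: (Int -> [a] -> a -> Bool) -> (a -> [a]) -> [a] -> [[a]]
solveAll fits rots board =
  foldl (>>=) [board] [placeAt i (fits i) rots | i <- [0 .. length board - 1]]</pre>
</p>
<p>
The price of such a design is that the predicate has to look up neighbours by index instead of receiving them as named tuple elements, which arguably makes each rule harder to read than the explicit functions shown above.
</p>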
<p>
Particularly, if you're struggling to understand how this implements the 'truncated search tree', keep in mind that e.g. <code>solve5th</code> is likely to produce no valid <code>match</code>, in which case it'll never call <code>solve6th</code>. The same may happen in <code>solve6th</code>, etc.
</p>
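<p>
If the pruning behaviour of the list monad is unfamiliar, a toy example may help. The following is not part of the puzzle code: <code>step1</code> and <code>step2</code> are made-up names, and integers merely stand in for tiles.
</p>
<p>
<pre>-- Toy stand-ins for the solveNth functions; integers play the role of tiles.

-- Keep only even candidates, then hand each one to the next step.
step1 :: [Int] -> [(Int, Int)]
step1 candidates = do
  x <- filter even candidates
  step2 x

-- Only reached for values that survived the filter in step1.
step2 :: Int -> [(Int, Int)]
step2 x = do
  y <- filter (\c -> x + c == 10) [6, 7, 8]
  return (x, y)</pre>
</p>
<p>
Evaluating <code>step1 [1, 3, 5]</code> yields the empty list without ever calling <code>step2</code>, because the <code>filter</code> leaves nothing to bind <code>x</code> to. That is the same mechanism that cuts off whole branches of the search tree.
</p>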
<p>
The 'top' function is a bit different because it doesn't need to <code>filter</code> anything:
</p>
<p>
<pre><span style="color:#2b91af;">solve1st</span> <span style="color:blue;">::</span> (<span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)
         <span style="color:blue;">-></span> [(<span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)]
solve1st (t1, t2, t3, t4, t5, t6, t7, t8, t9) = <span style="color:blue;">do</span>
  match <- rotations t1
  solve2nd (match, t2, t3, t4, t5, t6, t7, t8, t9)</pre>
</p>
<p>
In the first position, any tile in any rotation is legal, so <code>solve1st</code> only enumerates all four <code>rotations</code> of <code>t1</code> and calls <code>solve2nd</code> for each.
</p>
<p>
The final step is to compose <code>allPermutations</code> with <code>solve1st</code>:
</p>
<p>
<pre><span style="color:#2b91af;">solutions</span> <span style="color:blue;">::</span> [(<span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>, <span style="color:blue;">Tile</span>)]
solutions = allPermutations >>= solve1st</pre>
</p>
<p>
Running this in GHCi on my 4½-year-old laptop produces all 16 solutions in approximately 22 seconds.
</p>
<h3 id="d3d8c77398334534b5a200a240d7bddc">
Evaluation <a href="#d3d8c77398334534b5a200a240d7bddc">#</a>
</h3>
<p>
Is that good performance? Well, it turns out that it's possible to substantially improve on the situation. As I've mentioned a couple of times, so far I've been running the program from GHCi, the Haskell <a href="https://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop">REPL</a>. Most of the 22 seconds are spent interpreting or compiling the code.
</p>
<p>
If I compile the code with some optimizations turned on, the executable runs in approximately 300 ms. That seems quite decent, if I may say so.
</p>
<p>
I can think of a few tweaks to the code that might conceivably improve things even more, but when I test, there's no discernable difference. Thus, I'll keep the code as shown here.
</p>
<p>
Here's one of the solutions:
</p>
<p>
<img src="/content/binary/hunde-spiel-solution.jpg" alt="One of the game solutions.">
</p>
<p>
The information on the box claims that there are two solutions. Why does the code shown here produce 16 solutions?
</p>
<p>
There's a good explanation for that. Recall that two of the tiles are identical. In the above solution picture, it's tile 1 and 3, although they're rotated 90° in relation to each other. This implies that you could take tile 1, rotate it counter-clockwise and put it in position 3, while simultaneously taking tile 3, rotating it clockwise, and putting it in position 1. Visually, you can't tell the difference, so they don't count as two distinct solutions. The algorithm, however, doesn't make that distinction, so it enumerates what is effectively the same solution twice.
</p>
<p>
Not surprisingly, it turns out that all 16 solutions are duplicated in that way. We can confirm that by evaluating <code>length $ <a href="https://hackage.haskell.org/package/base/docs/Data-List.html#v:nub">nub</a> solutions</code>, which returns <code>8</code>.
</p>
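<p>
The effect of <code>nub</code> can be seen in miniature in the following sketch. The <code>Piece</code> type and the <code>arrangements</code> list are invented for the illustration and have nothing to do with the actual tiles.
</p>
<p>
<pre>import Data.List (nub)

-- Made-up stand-in for tiles; the two X pieces are indistinguishable.
data Piece = X | Y deriving (Eq, Show)

-- Four enumerated arrangements, but swapping the two identical X pieces
-- yields values that compare equal.
arrangements :: [[Piece]]
arrangements = [[X, Y, X], [X, Y, X], [Y, X, X], [Y, X, X]]

-- nub keeps the first occurrence of each distinct value.
distinctCount :: Int
distinctCount = length $ nub arrangements</pre>
</p>
<p>
Here, four enumerated arrangements collapse to two distinct ones, just as the 16 enumerated solutions collapse to eight.
</p>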
<p>
Eight solutions are, however, still four times more than two. Can you figure out what's going on?
</p>
<p>
The algorithm also enumerates four rotations of each solution. Once we take this into account, there are only two visually distinct solutions left. One of them is shown above. I also have a picture of the other one, but I'm not going to totally spoil things for you.
</p>
<h3 id="f97500846a6e481ebe1706278f324979">
Conclusion <a href="#f97500846a6e481ebe1706278f324979">#</a>
</h3>
<p>
When I was eight, I might have had the time and the patience to actually lay the puzzle. Despite the incredibly bad odds, I vaguely remember finally solving it. There must be some more holistic processing going on in the brain, if even a kid can solve the puzzle, because it seems inconceivable that it should be done as described here.
</p>
<p>
Today, I don't care for that kind of puzzle in analog form, but I did, on the other hand, find it an interesting programming exercise.
</p>
<p>
The code could be smaller, but I like it as it is. While a bit on the verbose side, I think that it communicates well what's going on.
</p>
<p>
I was pleasantly surprised that I managed to get execution time down to 300 ms. I'd honestly not expected that when I started.
</p>
</div>
<div id="comments">
<hr>
<h2 id="comments-header">
Comments
</h2>
<div class="comment" id="fa087e5b49ce4a58936ac782cc44561b">
<div class="comment-author"><a href="https://github.com/anka-213">Andreas Källberg</a> <a href="#fa087e5b49ce4a58936ac782cc44561b">#</a></div>
<div class="comment-content">
<p>
Thanks for a nice blog post! I found the challenge interesting, so I have written my own version of the code that both tries to be faster and also removes the redundant solutions, so it only generates two solutions in total. The code is available <a href="https://github.com/anka-213/haskell_toy_experiments/blob/master/HundeSpiel.hs">here</a>. It executes in roughly 8 milliseconds both in ghci and compiled (and takes a second to compile and run using runghc) on my laptop.
</p>
<p>
In order to improve the performance, I start with a blank grid and one-by-one add tiles until it is no longer possible to do so, and then backtrack, kind of like how you would do it by hand. As a tiny bonus, which I haven't actually measured to see if it makes any practical difference, I also selected the order of filling in the grid so that they can constrain each other as much as possible, by filling 2-by-2 squares as early as possible. I have however calculated the number of boards explored in each of the two variations. With a spiral order, 6852 boards are explored, while with a linear order, 9332 boards are explored.
</p>
<p>
In order to eliminate rotational symmetry, I start by filling the center square and fixing its rotation, rather than trying all rotations for it, since we could view any initial rotation of the center square as equivalent to rotating the whole board. In order to eliminate the identical solutions from the two identical tiles, I changed the encoding to use a number next to the tile to say how many copies are left of it, so when we choose a tile, there is only a single way to choose each tile, even if there are multiple copies of it. Both of these would also in theory make the code slightly faster if the time wasn't already dominated by general IO and other unrelated things.
</p>
<p>
I also added various pretty printing and tracing utilities to the code, so you can see exactly how it executes and which partial solutions it explores.
</p>
</div>
<div class="comment-date">2024-10-16 00:32 UTC</div>
</div>
<div class="comment" id="984fc5acb2314c79b2f2d7ddfacea285">
<div class="comment-author"><a href="/">Mark Seemann</a> <a href="#984fc5acb2314c79b2f2d7ddfacea285">#</a></div>
<div class="comment-content">
<p>
Thank you for writing. I did try filling the two-by-two square first, as you suggest, but in isolation it makes no discernable difference.
</p>
<p>
I haven't tried your two other optimizations. The one to eliminate rotations should, I guess, reduce the search space to a fourth of mine, unless I'm mistaken. That would reduce my 300 ms to approximately 75 ms.
</p>
<p>
I can't easily guess how much time the other optimization shaves off, but it could be the one that makes the bigger difference.
</p>
</div>
<div class="comment-date">2024-10-19 08:21 UTC</div>
</div>
</div>
<hr>
FSZipper in C#https://blog.ploeh.dk/2024/09/23/fszipper-in-c2024-09-23T06:13:00+00:00Mark Seemann
<div id="post">
<p>
<em>Another functional model of a file system, with code examples in C#.</em>
</p>
<p>
This article is part of <a href="/2024/08/19/zippers">a series about Zippers</a>. In this one, I port the <code>FSZipper</code> data structure from the <a href="https://learnyouahaskell.com/">Learn You a Haskell for Great Good!</a> article <a href="https://learnyouahaskell.com/zippers">Zippers</a>.
</p>
<p>
A word of warning: I'm assuming that you're familiar with the contents of that article, so I'll skip the pedagogical explanations; I can hardly do it better than it's done there. Additionally, I'll make heavy use of certain standard constructs to port <a href="https://www.haskell.org/">Haskell</a> code, most notably <a href="/2018/05/22/church-encoding">Church encoding</a> to model <a href="https://en.wikipedia.org/wiki/Tagged_union">sum types</a> in languages that don't natively have them. Such as C#. In some cases, I'll implement the Church encoding using the data structure's <a href="/2019/04/29/catamorphisms">catamorphism</a>. Since the <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">cyclomatic complexity</a> of the resulting code is quite low, you may be able to follow what's going on even if you don't know what Church encoding or catamorphisms are, but if you want to understand the background and motivation for that style of programming, you can consult the cited resources.
</p>
<p>
The code shown in this article is <a href="https://github.com/ploeh/CSharpZippers">available on GitHub</a>.
</p>
<h3 id="dd4cbc996cfa4347afa4b9279c95f6e1">
File system item initialization and structure <a href="#dd4cbc996cfa4347afa4b9279c95f6e1">#</a>
</h3>
<p>
If you haven't already noticed, Haskell (and other statically typed functional programming languages like <a href="https://fsharp.org/">F#</a>) makes heavy use of <a href="https://en.wikipedia.org/wiki/Tagged_union">sum types</a>, and the <code>FSZipper</code> example is no exception. It starts with a one-liner to define a file system item, which may be either a file or a folder. In C# we must instead use a class:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">FSItem</span></pre>
</p>
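<p>
For reference, the Haskell definition that this class ports looks as follows, as far as the Learn You a Haskell article goes. Consult the original article for the authoritative version.
</p>
<p>
<pre>-- Type aliases that make the constructor fields self-describing.
type Name = String
type Data = String

-- A file system item is either a file with contents, or a folder of items.
data FSItem = File Name Data | Folder Name [FSItem] deriving (Show)</pre>
</p>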
<p>
Contrary to the two previous examples, the <code>FSItem</code> class has no generic type parameter. This is because I'm following the Haskell example code as closely as possible, but as I've previously shown, you can <a href="/2019/08/26/functional-file-system">model a file hierarchy with a general-purpose rose tree</a>.
</p>
<p>
Staying consistent with the two previous articles, I'll use Church encoding to model a sum type, and as discussed in <a href="/2024/09/09/a-binary-tree-zipper-in-c">the previous article</a> I use a <code>private</code> implementation for that.
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">IFSItem</span> imp;

<span style="color:blue;">private</span> <span style="color:#2b91af;">FSItem</span>(<span style="color:#2b91af;">IFSItem</span> <span style="font-weight:bold;color:#1f377f;">imp</span>)
{
    <span style="color:blue;">this</span>.imp = <span style="font-weight:bold;color:#1f377f;">imp</span>;
}

<span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">FSItem</span> <span style="color:#74531f;">CreateFile</span>(<span style="color:blue;">string</span> <span style="font-weight:bold;color:#1f377f;">name</span>, <span style="color:blue;">string</span> <span style="font-weight:bold;color:#1f377f;">data</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span>(<span style="color:blue;">new</span> <span style="color:#2b91af;">File</span>(<span style="font-weight:bold;color:#1f377f;">name</span>, <span style="font-weight:bold;color:#1f377f;">data</span>));
}

<span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">FSItem</span> <span style="color:#74531f;">CreateFolder</span>(<span style="color:blue;">string</span> <span style="font-weight:bold;color:#1f377f;">name</span>, <span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">FSItem</span>> <span style="font-weight:bold;color:#1f377f;">items</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span>(<span style="color:blue;">new</span> <span style="color:#2b91af;">Folder</span>(<span style="font-weight:bold;color:#1f377f;">name</span>, <span style="font-weight:bold;color:#1f377f;">items</span>));
}</pre>
</p>
<p>
Two <code>static</code> creation methods enable client developers to create a single <code>FSItem</code> object, or an entire tree, like the example from the Haskell code, here ported to C#:
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">static</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">FSItem</span> myDisk =
    <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFolder</span>(<span style="color:#a31515;">"root"</span>,
    [
        <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"goat_yelling_like_man.wmv"</span>, <span style="color:#a31515;">"baaaaaa"</span>),
        <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"pope_time.avi"</span>, <span style="color:#a31515;">"god bless"</span>),
        <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFolder</span>(<span style="color:#a31515;">"pics"</span>,
        [
            <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"ape_throwing_up.jpg"</span>, <span style="color:#a31515;">"bleargh"</span>),
            <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"watermelon_smash.gif"</span>, <span style="color:#a31515;">"smash!!"</span>),
            <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"skull_man(scary).bmp"</span>, <span style="color:#a31515;">"Yikes!"</span>)
        ]),
        <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"dijon_poupon.doc"</span>, <span style="color:#a31515;">"best mustard"</span>),
        <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFolder</span>(<span style="color:#a31515;">"programs"</span>,
        [
            <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"fartwizard.exe"</span>, <span style="color:#a31515;">"10gotofart"</span>),
            <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"owl_bandit.dmg"</span>, <span style="color:#a31515;">"mov eax, h00t"</span>),
            <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"not_a_virus.exe"</span>, <span style="color:#a31515;">"really not a virus"</span>),
            <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFolder</span>(<span style="color:#a31515;">"source code"</span>,
            [
                <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"best_hs_prog.hs"</span>, <span style="color:#a31515;">"main = print (fix error)"</span>),
                <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"random.hs"</span>, <span style="color:#a31515;">"main = print 4"</span>)
            ])
        ])
    ]);</pre>
</p>
<p>
Since the <code>imp</code> class field is just a <code>private</code> implementation detail, a client developer needs a way to query an <code>FSItem</code> object about its contents.
</p>
<h3 id="246063d761a94eb880a079f1d31b817d">
File system item catamorphism <a href="#246063d761a94eb880a079f1d31b817d">#</a>
</h3>
<p>
Just like the previous article, I'll start with the catamorphism. This is essentially the <a href="/2019/08/05/rose-tree-catamorphism">rose tree catamorphism</a>, just less generic, since <code>FSItem</code> doesn't have a generic type parameter.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">TResult</span> <span style="font-weight:bold;color:#74531f;">Aggregate</span><<span style="color:#2b91af;">TResult</span>>(
    <span style="color:#2b91af;">Func</span><<span style="color:blue;">string</span>, <span style="color:blue;">string</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenFile</span>,
    <span style="color:#2b91af;">Func</span><<span style="color:blue;">string</span>, <span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">TResult</span>>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenFolder</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">return</span> imp.<span style="font-weight:bold;color:#74531f;">Aggregate</span>(<span style="font-weight:bold;color:#1f377f;">whenFile</span>, <span style="font-weight:bold;color:#1f377f;">whenFolder</span>);
}</pre>
</p>
<p>
The <code>Aggregate</code> method delegates to its internal implementation class field, which is defined as the <code>private</code> nested interface <code>IFSItem</code>:
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">interface</span> <span style="color:#2b91af;">IFSItem</span>
{
    <span style="color:#2b91af;">TResult</span> <span style="font-weight:bold;color:#74531f;">Aggregate</span><<span style="color:#2b91af;">TResult</span>>(
        <span style="color:#2b91af;">Func</span><<span style="color:blue;">string</span>, <span style="color:blue;">string</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenFile</span>,
        <span style="color:#2b91af;">Func</span><<span style="color:blue;">string</span>, <span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">TResult</span>>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenFolder</span>);
}</pre>
</p>
<p>
As discussed in the previous article, the interface is hidden away because it's only a vehicle for polymorphism. It's not intended to be used by client developers (although that would be benign) or implemented by them (which could break <a href="/encapsulation-and-solid">encapsulation</a>). There are only, and should ever only be, two implementations. The one that represents a file is the simplest:
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">sealed</span> <span style="color:blue;">record</span> <span style="color:#2b91af;">File</span>(<span style="color:blue;">string</span> <span style="font-weight:bold;color:#1f377f;">Name</span>, <span style="color:blue;">string</span> <span style="font-weight:bold;color:#1f377f;">Data</span>) : <span style="color:#2b91af;">IFSItem</span>
{
    <span style="color:blue;">public</span> <span style="color:#2b91af;">TResult</span> <span style="font-weight:bold;color:#74531f;">Aggregate</span><<span style="color:#2b91af;">TResult</span>>(
        <span style="color:#2b91af;">Func</span><<span style="color:blue;">string</span>, <span style="color:blue;">string</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenFile</span>,
        <span style="color:#2b91af;">Func</span><<span style="color:blue;">string</span>, <span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">TResult</span>>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenFolder</span>)
    {
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">whenFile</span>(Name, Data);
    }
}</pre>
</p>
<p>
The <code>File</code> record's <code>Aggregate</code> method unconditionally calls the supplied <code>whenFile</code> function argument with the <code>Name</code> and <code>Data</code> that were originally supplied via its constructor.
</p>
<p>
The <code>Folder</code> implementation is a bit trickier, mostly due to its recursive nature, but also because I wanted it to have structural equality.
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">Folder</span> : <span style="color:#2b91af;">IFSItem</span>
{
    <span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:blue;">string</span> name;
    <span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">FSItem</span>> items;

    <span style="color:blue;">public</span> <span style="color:#2b91af;">Folder</span>(<span style="color:blue;">string</span> <span style="font-weight:bold;color:#1f377f;">Name</span>, <span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">FSItem</span>> <span style="font-weight:bold;color:#1f377f;">Items</span>)
    {
        name = <span style="font-weight:bold;color:#1f377f;">Name</span>;
        items = <span style="font-weight:bold;color:#1f377f;">Items</span>;
    }

    <span style="color:blue;">public</span> <span style="color:#2b91af;">TResult</span> <span style="font-weight:bold;color:#74531f;">Aggregate</span><<span style="color:#2b91af;">TResult</span>>(
        <span style="color:#2b91af;">Func</span><<span style="color:blue;">string</span>, <span style="color:blue;">string</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenFile</span>,
        <span style="color:#2b91af;">Func</span><<span style="color:blue;">string</span>, <span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">TResult</span>>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenFolder</span>)
    {
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">whenFolder</span>(
            name,
            items.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">i</span> => <span style="font-weight:bold;color:#1f377f;">i</span>.<span style="font-weight:bold;color:#74531f;">Aggregate</span>(<span style="font-weight:bold;color:#1f377f;">whenFile</span>, <span style="font-weight:bold;color:#1f377f;">whenFolder</span>)).<span style="font-weight:bold;color:#74531f;">ToList</span>());
    }

    <span style="color:blue;">public</span> <span style="color:blue;">override</span> <span style="color:blue;">bool</span> <span style="font-weight:bold;color:#74531f;">Equals</span>(<span style="color:blue;">object</span>? <span style="font-weight:bold;color:#1f377f;">obj</span>)
    {
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">obj</span> <span style="color:blue;">is</span> <span style="color:#2b91af;">Folder</span> <span style="font-weight:bold;color:#1f377f;">folder</span> &&
            name == <span style="font-weight:bold;color:#1f377f;">folder</span>.name &&
            items.<span style="font-weight:bold;color:#74531f;">SequenceEqual</span>(<span style="font-weight:bold;color:#1f377f;">folder</span>.items);
    }

    <span style="color:blue;">public</span> <span style="color:blue;">override</span> <span style="color:blue;">int</span> <span style="font-weight:bold;color:#74531f;">GetHashCode</span>()
    {
        <span style="color:green;">// Hash the items themselves, not the collection reference, so that</span>
        <span style="color:green;">// folders that are equal according to Equals also share a hash code.</span>
        <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">hash</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">HashCode</span>();
        <span style="font-weight:bold;color:#1f377f;">hash</span>.<span style="font-weight:bold;color:#74531f;">Add</span>(name);
        <span style="font-weight:bold;color:#8f08c4;">foreach</span> (<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">item</span> <span style="font-weight:bold;color:#8f08c4;">in</span> items)
            <span style="font-weight:bold;color:#1f377f;">hash</span>.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="font-weight:bold;color:#1f377f;">item</span>);
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">hash</span>.<span style="font-weight:bold;color:#74531f;">ToHashCode</span>();
    }
}</pre>
</p>
<p>
It, too, unconditionally calls one of the two functions passed to its <code>Aggregate</code> method, but this time <code>whenFolder</code>. It does that, however, by first <em>recursively</em> calling <code>Aggregate</code> within a <code>Select</code> expression. It needs to do that because the <code>whenFolder</code> function expects the subtree to have been already converted to values of the <code>TResult</code> return type. This is a common pattern with catamorphisms, and takes a bit of time getting used to. You can see similar examples in the articles <a href="/2019/06/10/tree-catamorphism">Tree catamorphism</a>, <a href="/2019/08/05/rose-tree-catamorphism">Rose tree catamorphism</a>, <a href="/2019/06/24/full-binary-tree-catamorphism">Full binary tree catamorphism</a>, as well as the previous one in this series.
</p>
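<p>
The recursion may be easier to see in Haskell, where the catamorphism fits in a few lines. The following sketch is an illustration under assumptions rather than code from the article's repository: <code>foldFSItem</code> and <code>countFiles</code> are hypothetical names, and the type is redefined here only to make the example self-contained.
</p>
<p>
<pre>-- Simplified stand-in for the file system item type.
data FSItem = File String String | Folder String [FSItem] deriving (Eq, Show)

-- Catamorphism: the folder handler receives the already-folded children,
-- which is why the recursive call happens before whenFolder is applied.
foldFSItem :: (String -> String -> r) -> (String -> [r] -> r) -> FSItem -> r
foldFSItem whenFile _ (File name content) = whenFile name content
foldFSItem whenFile whenFolder (Folder name items) =
  whenFolder name (map (foldFSItem whenFile whenFolder) items)

-- Count the files in a tree: each file counts as 1, each folder sums its children.
countFiles :: FSItem -> Int
countFiles = foldFSItem (\_ _ -> 1) (\_ counts -> sum counts)</pre>
</p>
<p>
Notice that the folder handler in <code>countFiles</code> never sees an <code>FSItem</code>, only the counts already computed for the children. That corresponds to <code>whenFolder</code> receiving a collection of <code>TResult</code> values in the C# code.
</p>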
<p>
I also had to make <code>Folder</code> a <code>class</code> rather than a <a href="https://learn.microsoft.com/dotnet/csharp/language-reference/builtin-types/record">record</a>, because I wanted the type to have structural equality, and you can't override <a href="https://learn.microsoft.com/dotnet/api/system.object.equals">Equals(object)</a> on records (and if the base class library has any collection type with structural equality, I'm not aware of it).
</p>
<h3 id="761c6a5ede2b4df68985e61f6664822f">
File system item Church encoding <a href="#761c6a5ede2b4df68985e61f6664822f">#</a>
</h3>
<p>
True to the structure of the previous article, the catamorphism doesn't look quite like a Church encoding, but it's possible to define the latter from the former.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">TResult</span> <span style="font-weight:bold;color:#74531f;">Match</span><<span style="color:#2b91af;">TResult</span>>(
    <span style="color:#2b91af;">Func</span><<span style="color:blue;">string</span>, <span style="color:blue;">string</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenFile</span>,
    <span style="color:#2b91af;">Func</span><<span style="color:blue;">string</span>, <span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">FSItem</span>>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenFolder</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#74531f;">Aggregate</span>(
        <span style="font-weight:bold;color:#1f377f;">whenFile</span>: (<span style="font-weight:bold;color:#1f377f;">name</span>, <span style="font-weight:bold;color:#1f377f;">data</span>) =>
            (item: <span style="color:#74531f;">CreateFile</span>(<span style="font-weight:bold;color:#1f377f;">name</span>, <span style="font-weight:bold;color:#1f377f;">data</span>), result: <span style="font-weight:bold;color:#1f377f;">whenFile</span>(<span style="font-weight:bold;color:#1f377f;">name</span>, <span style="font-weight:bold;color:#1f377f;">data</span>)),
        <span style="font-weight:bold;color:#1f377f;">whenFolder</span>: (<span style="font-weight:bold;color:#1f377f;">name</span>, <span style="font-weight:bold;color:#1f377f;">pairs</span>) =>
        {
            <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">items</span> = <span style="font-weight:bold;color:#1f377f;">pairs</span>.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">i</span> => <span style="font-weight:bold;color:#1f377f;">i</span>.item).<span style="font-weight:bold;color:#74531f;">ToList</span>();
            <span style="font-weight:bold;color:#8f08c4;">return</span> (<span style="color:#74531f;">CreateFolder</span>(<span style="font-weight:bold;color:#1f377f;">name</span>, <span style="font-weight:bold;color:#1f377f;">items</span>), <span style="font-weight:bold;color:#1f377f;">whenFolder</span>(<span style="font-weight:bold;color:#1f377f;">name</span>, <span style="font-weight:bold;color:#1f377f;">items</span>));
        }).result;
}</pre>
</p>
<p>
The trick is the same as in the previous article: Build up an intermediate tuple that contains both the current <code>item</code> as well as the <code>result</code> being accumulated. Once the <code>Aggregate</code> method returns, the <code>Match</code> method returns only the <code>result</code> part of the resulting tuple.
</p>
<p>
I implemented the <code>whenFolder</code> expression as a code block, because both tuple elements needed the <code>items</code> collection. You could inline the <code>Select</code> expression instead, but then it would run twice. Avoiding that is probably a premature optimization, but the local variable also made the code a bit shorter, and, one may hope, a bit more readable.
</p>
<h3 id="c2ffdb1994bc400d99f359ecf4edb312">
File system breadcrumb <a href="#c2ffdb1994bc400d99f359ecf4edb312">#</a>
</h3>
<p>
Finally, things seem to be becoming a little easier. The port of <code>FSCrumb</code> is straightforward.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">FSCrumb</span>
{
<span style="color:blue;">public</span> <span style="color:#2b91af;">FSCrumb</span>(
<span style="color:blue;">string</span> <span style="font-weight:bold;color:#1f377f;">name</span>,
<span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">FSItem</span>> <span style="font-weight:bold;color:#1f377f;">left</span>,
<span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">FSItem</span>> <span style="font-weight:bold;color:#1f377f;">right</span>)
{
Name = <span style="font-weight:bold;color:#1f377f;">name</span>;
Left = <span style="font-weight:bold;color:#1f377f;">left</span>;
Right = <span style="font-weight:bold;color:#1f377f;">right</span>;
}
<span style="color:blue;">public</span> <span style="color:blue;">string</span> Name { <span style="color:blue;">get</span>; }
<span style="color:blue;">public</span> <span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">FSItem</span>> Left { <span style="color:blue;">get</span>; }
<span style="color:blue;">public</span> <span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">FSItem</span>> Right { <span style="color:blue;">get</span>; }
<span style="color:blue;">public</span> <span style="color:blue;">override</span> <span style="color:blue;">bool</span> <span style="font-weight:bold;color:#74531f;">Equals</span>(<span style="color:blue;">object</span>? <span style="font-weight:bold;color:#1f377f;">obj</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">obj</span> <span style="color:blue;">is</span> <span style="color:#2b91af;">FSCrumb</span> <span style="font-weight:bold;color:#1f377f;">crumb</span> &&
Name == <span style="font-weight:bold;color:#1f377f;">crumb</span>.Name &&
Left.<span style="font-weight:bold;color:#74531f;">SequenceEqual</span>(<span style="font-weight:bold;color:#1f377f;">crumb</span>.Left) &&
Right.<span style="font-weight:bold;color:#74531f;">SequenceEqual</span>(<span style="font-weight:bold;color:#1f377f;">crumb</span>.Right);
}
<span style="color:blue;">public</span> <span style="color:blue;">override</span> <span style="color:blue;">int</span> <span style="font-weight:bold;color:#74531f;">GetHashCode</span>()
{
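<span style="color:green;">// Hash the name and the counts, so that crumbs that are equal according to Equals also get equal hash codes.</span>
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:#2b91af;">HashCode</span>.<span style="color:#74531f;">Combine</span>(Name, Left.Count, Right.Count);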
}
}</pre>
</p>
<p>
The only reason this isn't a <code>record</code> is, once again, that I want to override <code>Equals</code> so that the type can have structural equality. <a href="https://visualstudio.microsoft.com/">Visual Studio</a> wants me to convert to a <a href="https://learn.microsoft.com/dotnet/csharp/programming-guide/classes-and-structs/instance-constructors">primary constructor</a>. That would simplify the code a bit, but actually not that much.
</p>
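<p>
For comparison, here's a minimal sketch of what a primary-constructor version might look like. It isn't taken from the article's code base: the <code>FSItem</code> record is a hypothetical one-property stand-in that only exists so that the example compiles on its own, and the hash code is computed from <code>Name</code> and the two counts, which is an assumption about a reasonable way to stay consistent with <code>Equals</code>. It requires C# 12 or later.
</p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the article's FSItem type,
// included only so that this sketch compiles on its own.
public sealed record FSItem(string Name);

// Primary constructor: the parameters are in scope for the property initializers.
public sealed class FSCrumb(
    string name,
    IReadOnlyCollection<FSItem> left,
    IReadOnlyCollection<FSItem> right)
{
    public string Name { get; } = name;
    public IReadOnlyCollection<FSItem> Left { get; } = left;
    public IReadOnlyCollection<FSItem> Right { get; } = right;

    // Structural equality: compare the name and the elements of both collections.
    public override bool Equals(object? obj)
    {
        return obj is FSCrumb crumb &&
            Name == crumb.Name &&
            Left.SequenceEqual(crumb.Left) &&
            Right.SequenceEqual(crumb.Right);
    }

    // Assumption: hashing the name and the counts keeps the hash code
    // consistent with the structural Equals above.
    public override int GetHashCode()
    {
        return HashCode.Combine(Name, Left.Count, Right.Count);
    }
}
```

<p>
The constructor body disappears, but the three properties, <code>Equals</code>, and <code>GetHashCode</code> all remain, which is why the saving is modest.
</p>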
<p>
(I'm still somewhat conservative in my choice of new C# language features. Not that I have anything against primary constructors which, after all, F# has had forever. I'm holding back for didactic reasons. Not every reader is on the latest language version, and some readers may be using another programming language entirely. On the other hand, primary constructors seem natural and intuitive, so I may start using them here on the blog as well. I don't think that they're going to be much of a barrier to understanding.)
</p>
<p>
Now that we have both the data type we want to zip, as well as the breadcrumb type we need, we can proceed to add the Zipper.
</p>
<h3 id="bb452627a3c3420a95a412dd33ad0efa">
File system Zipper <a href="#bb452627a3c3420a95a412dd33ad0efa">#</a>
</h3>
<p>
The <code>FSZipper</code> C# class fills the position of the eponymous Haskell type alias. The data structure and its initialization are straightforward.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">FSZipper</span>
{
<span style="color:blue;">private</span> <span style="color:#2b91af;">FSZipper</span>(<span style="color:#2b91af;">FSItem</span> <span style="font-weight:bold;color:#1f377f;">fSItem</span>, <span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">FSCrumb</span>> <span style="font-weight:bold;color:#1f377f;">breadcrumbs</span>)
{
FSItem = <span style="font-weight:bold;color:#1f377f;">fSItem</span>;
Breadcrumbs = <span style="font-weight:bold;color:#1f377f;">breadcrumbs</span>;
}
<span style="color:blue;">public</span> <span style="color:#2b91af;">FSZipper</span>(<span style="color:#2b91af;">FSItem</span> <span style="font-weight:bold;color:#1f377f;">fSItem</span>) : <span style="color:blue;">this</span>(<span style="font-weight:bold;color:#1f377f;">fSItem</span>, [])
{
}
<span style="color:blue;">public</span> <span style="color:#2b91af;">FSItem</span> FSItem { <span style="color:blue;">get</span>; }
<span style="color:blue;">public</span> <span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">FSCrumb</span>> Breadcrumbs { <span style="color:blue;">get</span>; }
<span style="color:green;">// Methods follow here...</span></pre>
</p>
<p>
True to the style I've already established, I've made the master constructor <code>private</code> in order to highlight that the <code>Breadcrumbs</code> are the responsibility of the <code>FSZipper</code> class itself. It's not something client code need worry about.
</p>
<h3 id="5dd35bee62764a3fb455c406f1a63754">
Going down <a href="#5dd35bee62764a3fb455c406f1a63754">#</a>
</h3>
<p>
The Haskell Zippers article introduces <code>fsUp</code> before <code>fsTo</code>, but if we want to see some example code, we need to navigate <em>to</em> somewhere before we can navigate up. Thus, I'll instead start with the function that navigates to a child node.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">FSZipper</span>? <span style="font-weight:bold;color:#74531f;">GoTo</span>(<span style="color:blue;">string</span> <span style="font-weight:bold;color:#1f377f;">name</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> FSItem.<span style="font-weight:bold;color:#74531f;">Match</span>(
(<span style="font-weight:bold;color:#1f377f;">_</span>, <span style="font-weight:bold;color:#1f377f;">_</span>) => <span style="color:blue;">null</span>,
(<span style="font-weight:bold;color:#1f377f;">folderName</span>, <span style="font-weight:bold;color:#1f377f;">items</span>) =>
{
<span style="color:#2b91af;">FSItem</span>? <span style="font-weight:bold;color:#1f377f;">item</span> = <span style="color:blue;">null</span>;
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">ls</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">List</span><<span style="color:#2b91af;">FSItem</span>>();
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">rs</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">List</span><<span style="color:#2b91af;">FSItem</span>>();
<span style="font-weight:bold;color:#8f08c4;">foreach</span> (<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">i</span> <span style="font-weight:bold;color:#8f08c4;">in</span> <span style="font-weight:bold;color:#1f377f;">items</span>)
{
<span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">item</span> <span style="color:blue;">is</span> <span style="color:blue;">null</span> && <span style="font-weight:bold;color:#1f377f;">i</span>.<span style="font-weight:bold;color:#74531f;">IsNamed</span>(<span style="font-weight:bold;color:#1f377f;">name</span>))
<span style="font-weight:bold;color:#1f377f;">item</span> = <span style="font-weight:bold;color:#1f377f;">i</span>;
<span style="font-weight:bold;color:#8f08c4;">else</span> <span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">item</span> <span style="color:blue;">is</span> <span style="color:blue;">null</span>)
<span style="font-weight:bold;color:#1f377f;">ls</span>.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="font-weight:bold;color:#1f377f;">i</span>);
<span style="font-weight:bold;color:#8f08c4;">else</span>
<span style="font-weight:bold;color:#1f377f;">rs</span>.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="font-weight:bold;color:#1f377f;">i</span>);
}
<span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">item</span> <span style="color:blue;">is</span> <span style="color:blue;">null</span>)
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">null</span>;
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">FSZipper</span>(
<span style="font-weight:bold;color:#1f377f;">item</span>,
Breadcrumbs.<span style="font-weight:bold;color:#74531f;">Prepend</span>(<span style="color:blue;">new</span> <span style="color:#2b91af;">FSCrumb</span>(<span style="font-weight:bold;color:#1f377f;">folderName</span>, <span style="font-weight:bold;color:#1f377f;">ls</span>, <span style="font-weight:bold;color:#1f377f;">rs</span>)).<span style="font-weight:bold;color:#74531f;">ToList</span>());
});
}</pre>
</p>
<p>
This is by far the most complicated navigation we've seen so far, and I've even taken the liberty of writing an imperative implementation. It's not that I don't know how I could implement it in a purely functional fashion, but I've chosen this implementation for a couple of reasons. The first of which is that, frankly, it was easier this way.
</p>
<p>
This stems from the second reason: That the .NET base class library, as far as I know, offers no functionality like Haskell's <a href="https://hackage.haskell.org/package/base/docs/Data-List.html#v:break">break</a> function. I could have written such a function myself, but felt that it was too much of a digression, even for me. Maybe I'll do that another day. It might make for <a href="/2020/01/13/on-doing-katas">a nice little exercise</a>.
</p>
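<p>
To give an idea of the scale of such a digression, a function in the spirit of <code>break</code> could look like the following sketch. The name <code>Break</code>, the <code>SequenceExtensions</code> class, and the tuple element names are all hypothetical; nothing like this is part of the code base discussed here. Like the Haskell function, it returns the longest prefix of elements that don't satisfy the predicate, together with the rest of the sequence, starting at the first element that does.
</p>

```csharp
using System;
using System.Collections.Generic;

public static class SequenceExtensions
{
    // Hypothetical helper modelled on Haskell's Data.List.break.
    // Prefix: the leading elements for which the predicate is false.
    // Rest: everything from the first element for which it is true.
    public static (IReadOnlyList<T> Prefix, IReadOnlyList<T> Rest) Break<T>(
        this IEnumerable<T> source,
        Func<T, bool> predicate)
    {
        var prefix = new List<T>();
        var rest = new List<T>();
        var found = false;
        foreach (var x in source)
        {
            // Once the predicate has matched, stop testing it.
            if (!found && predicate(x))
                found = true;

            if (found)
                rest.Add(x);
            else
                prefix.Add(x);
        }
        return (prefix, rest);
    }
}
```

<p>
With a helper like this, <code>GoTo</code> could take the first element of <code>Rest</code> as the item in focus and the remainder of <code>Rest</code> as the right-hand items, and return <code>null</code> when <code>Rest</code> is empty.
</p>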
<p>
The third reason is that <a href="/2011/10/11/CheckingforexactlyoneiteminasequenceusingCandF">C# doesn't afford pattern matching on sequences</a>, in the shape of destructuring the head and the tail of a list. (Not that I know of, anyway, but the language changes rapidly at the moment, and it does have <em>some</em> pattern-matching features now.) This means that I have to check <code>item</code> for <code>null</code> anyway.
</p>
<p>
In any case, while the implementation is imperative, an external caller can't tell. The <code>GoTo</code> method is still <a href="https://en.wikipedia.org/wiki/Referential_transparency">referentially transparent</a>. Which means that <a href="/2021/07/28/referential-transparency-fits-in-your-head">it fits in your head</a>.
</p>
<p>
You may have noticed that the implementation calls <code>IsNamed</code>, which is also new.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">bool</span> <span style="font-weight:bold;color:#74531f;">IsNamed</span>(<span style="color:blue;">string</span> <span style="font-weight:bold;color:#1f377f;">name</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#74531f;">Match</span>((<span style="font-weight:bold;color:#1f377f;">n</span>, <span style="font-weight:bold;color:#1f377f;">_</span>) => <span style="font-weight:bold;color:#1f377f;">n</span> == <span style="font-weight:bold;color:#1f377f;">name</span>, (<span style="font-weight:bold;color:#1f377f;">n</span>, <span style="font-weight:bold;color:#1f377f;">_</span>) => <span style="font-weight:bold;color:#1f377f;">n</span> == <span style="font-weight:bold;color:#1f377f;">name</span>);
}</pre>
</p>
<p>
This is an instance method I added to <code>FSItem</code>.
</p>
<p>
In summary, the <code>GoTo</code> method enables client code to navigate down in the file hierarchy, as this unit test demonstrates:
</p>
<p>
<pre>[<span style="color:#2b91af;">Fact</span>]
<span style="color:blue;">public</span> <span style="color:blue;">void</span> <span style="font-weight:bold;color:#74531f;">GoToSkullMan</span>()
{
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">sut</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">FSZipper</span>(myDisk);
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">actual</span> = <span style="font-weight:bold;color:#1f377f;">sut</span>.<span style="font-weight:bold;color:#74531f;">GoTo</span>(<span style="color:#a31515;">"pics"</span>)?.<span style="font-weight:bold;color:#74531f;">GoTo</span>(<span style="color:#a31515;">"skull_man(scary).bmp"</span>);
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">NotNull</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>);
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>(
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"skull_man(scary).bmp"</span>, <span style="color:#a31515;">"Yikes!"</span>),
<span style="font-weight:bold;color:#1f377f;">actual</span>.FSItem);
}</pre>
</p>
<p>
The example is elementary. First go to the <code>pics</code> folder, and from there to the <code>skull_man(scary).bmp</code> file.
</p>
<h3 id="be9c842baa2c4cbb8afc50fdb9ea13c7">
Going up <a href="#be9c842baa2c4cbb8afc50fdb9ea13c7">#</a>
</h3>
<p>
Going back up the hierarchy isn't as complicated.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">FSZipper</span>? <span style="font-weight:bold;color:#74531f;">GoUp</span>()
{
<span style="font-weight:bold;color:#8f08c4;">if</span> (Breadcrumbs.Count == 0)
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">null</span>;
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">head</span> = Breadcrumbs.<span style="font-weight:bold;color:#74531f;">First</span>();
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">tail</span> = Breadcrumbs.<span style="font-weight:bold;color:#74531f;">Skip</span>(1);
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">FSZipper</span>(
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFolder</span>(<span style="font-weight:bold;color:#1f377f;">head</span>.Name, [.. <span style="font-weight:bold;color:#1f377f;">head</span>.Left, FSItem, .. <span style="font-weight:bold;color:#1f377f;">head</span>.Right]),
<span style="font-weight:bold;color:#1f377f;">tail</span>.<span style="font-weight:bold;color:#74531f;">ToList</span>());
}</pre>
</p>
<p>
If the <code>Breadcrumbs</code> collection is empty, we're already at the root, in which case we can't go further up. In that case, the <code>GoUp</code> method returns <code>null</code>, as does the <code>GoTo</code> method if it can't find an item with the desired name. This possibility is explicitly indicated by the <code><span style="color:#2b91af;">FSZipper</span>?</code> return type; notice the question mark, <a href="https://learn.microsoft.com/dotnet/csharp/nullable-references">which indicates that the value may be null</a>. If you're working in a context or language where that feature isn't available, you may instead consider taking advantage of the <a href="/2022/04/25/the-maybe-monad">Maybe monad</a> (which is also what you'd <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatically</a> do in Haskell).
</p>
<p>
If <code>Breadcrumbs</code> is <em>not</em> empty, it means that there's a place to go up to. It also implies that the previous operation navigated down, and the only way that's possible is if the previous node was a folder. Thus, the <code>GoUp</code> method knows that it needs to reconstitute a folder, and from the <code>head</code> breadcrumb, it knows that folder's name, and what was originally to the <code>Left</code> and <code>Right</code> of the Zipper's <code>FSItem</code> property.
</p>
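<p>
The relationship between the two navigation methods can be illustrated in isolation. The following is only a sketch on plain lists, with the hypothetical names <code>ZipperRoundTrip</code>, <code>Split</code>, and <code>Join</code>: <code>Split</code> produces the same left, focus, and right parts as the loop in <code>GoTo</code>, while <code>Join</code> puts them back together the way <code>GoUp</code> does. It assumes C# 12 for the collection expression.
</p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipperRoundTrip
{
    // Splits around the first matching element, like going down:
    // elements before it go left, elements after it go right.
    // Returns null when no element matches.
    public static (List<T> Left, T Focus, List<T> Right)? Split<T>(
        IEnumerable<T> items,
        Func<T, bool> isMatch)
    {
        var list = items.ToList();
        var index = list.FindIndex(x => isMatch(x));
        if (index < 0)
            return null;

        return (
            list.Take(index).ToList(),
            list[index],
            list.Skip(index + 1).ToList());
    }

    // Reassembles the original sequence, like going up.
    public static List<T> Join<T>(IEnumerable<T> left, T focus, IEnumerable<T> right)
    {
        return [.. left, focus, .. right];
    }
}
```

<p>
Splitting a list and then joining the three parts again yields the original list, which is why a breadcrumb only needs to remember the folder's name and the items to either side of the focus.
</p>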
<p>
This unit test demonstrates how client code may use the <code>GoUp</code> method:
</p>
<p>
<pre>[<span style="color:#2b91af;">Fact</span>]
<span style="color:blue;">public</span> <span style="color:blue;">void</span> <span style="font-weight:bold;color:#74531f;">GoUpFromSkullMan</span>()
{
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">sut</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">FSZipper</span>(myDisk);
<span style="color:green;">// This is the same as the GoToSkullMan test</span>
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">newFocus</span> = <span style="font-weight:bold;color:#1f377f;">sut</span>.<span style="font-weight:bold;color:#74531f;">GoTo</span>(<span style="color:#a31515;">"pics"</span>)?.<span style="font-weight:bold;color:#74531f;">GoTo</span>(<span style="color:#a31515;">"skull_man(scary).bmp"</span>);
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">actual</span> = <span style="font-weight:bold;color:#1f377f;">newFocus</span>?.<span style="font-weight:bold;color:#74531f;">GoUp</span>()?.<span style="font-weight:bold;color:#74531f;">GoTo</span>(<span style="color:#a31515;">"watermelon_smash.gif"</span>);
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">NotNull</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>);
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>(
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"watermelon_smash.gif"</span>, <span style="color:#a31515;">"smash!!"</span>),
<span style="font-weight:bold;color:#1f377f;">actual</span>.FSItem);
}</pre>
</p>
<p>
This test first repeats the navigation also performed by the other test, then uses <code>GoUp</code> to go one level up, which finally enables it to navigate to the <code>watermelon_smash.gif</code> file.
</p>
<h3 id="7c96d9a847f04adfb660973e66246d13">
Renaming a file or folder <a href="#7c96d9a847f04adfb660973e66246d13">#</a>
</h3>
<p>
A Zipper enables you to navigate a data structure, but you can also use it to modify the element in focus. One option is to rename a file or folder.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">FSZipper</span> <span style="font-weight:bold;color:#74531f;">Rename</span>(<span style="color:blue;">string</span> <span style="font-weight:bold;color:#1f377f;">newName</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">FSZipper</span>(
FSItem.<span style="font-weight:bold;color:#74531f;">Match</span>(
(<span style="font-weight:bold;color:#1f377f;">_</span>, <span style="font-weight:bold;color:#1f377f;">dat</span>) => <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="font-weight:bold;color:#1f377f;">newName</span>, <span style="font-weight:bold;color:#1f377f;">dat</span>),
(<span style="font-weight:bold;color:#1f377f;">_</span>, <span style="font-weight:bold;color:#1f377f;">items</span>) => <span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFolder</span>(<span style="font-weight:bold;color:#1f377f;">newName</span>, <span style="font-weight:bold;color:#1f377f;">items</span>)),
Breadcrumbs);
}</pre>
</p>
<p>
The <code>Rename</code> method 'pattern-matches' on the 'current' <code>FSItem</code> and in both cases creates a new file or folder with the new name. Since it doesn't need the old name for anything, it uses the wildcard pattern to ignore that value. This operation is always possible, so the return type is <code>FSZipper</code>, without a question mark, indicating that the method never returns <code>null</code>.
</p>
<p>
The following unit test replicates the Haskell article's example by renaming the <code>pics</code> folder to <code>cspi</code>.
</p>
<p>
<pre>[<span style="color:#2b91af;">Fact</span>]
<span style="color:blue;">public</span> <span style="color:blue;">void</span> <span style="font-weight:bold;color:#74531f;">RenamePics</span>()
{
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">sut</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">FSZipper</span>(myDisk);
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">actual</span> = <span style="font-weight:bold;color:#1f377f;">sut</span>.<span style="font-weight:bold;color:#74531f;">GoTo</span>(<span style="color:#a31515;">"pics"</span>)?.<span style="font-weight:bold;color:#74531f;">Rename</span>(<span style="color:#a31515;">"cspi"</span>).<span style="font-weight:bold;color:#74531f;">GoUp</span>();
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">NotNull</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>);
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Empty</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>.Breadcrumbs);
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>(
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFolder</span>(<span style="color:#a31515;">"root"</span>,
[
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"goat_yelling_like_man.wmv"</span>, <span style="color:#a31515;">"baaaaaa"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"pope_time.avi"</span>, <span style="color:#a31515;">"god bless"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFolder</span>(<span style="color:#a31515;">"cspi"</span>,
[
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"ape_throwing_up.jpg"</span>, <span style="color:#a31515;">"bleargh"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"watermelon_smash.gif"</span>, <span style="color:#a31515;">"smash!!"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"skull_man(scary).bmp"</span>, <span style="color:#a31515;">"Yikes!"</span>)
]),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"dijon_poupon.doc"</span>, <span style="color:#a31515;">"best mustard"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFolder</span>(<span style="color:#a31515;">"programs"</span>,
[
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"fartwizard.exe"</span>, <span style="color:#a31515;">"10gotofart"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"owl_bandit.dmg"</span>, <span style="color:#a31515;">"mov eax, h00t"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"not_a_virus.exe"</span>, <span style="color:#a31515;">"really not a virus"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFolder</span>(<span style="color:#a31515;">"source code"</span>,
[
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"best_hs_prog.hs"</span>, <span style="color:#a31515;">"main = print (fix error)"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"random.hs"</span>, <span style="color:#a31515;">"main = print 4"</span>)
])
])
]),
<span style="font-weight:bold;color:#1f377f;">actual</span>.FSItem);
}</pre>
</p>
<p>
Since the test uses <code>GoUp</code> after <code>Rename</code>, the <code>actual</code> value contains the entire tree, while the <code>Breadcrumbs</code> collection is empty.
</p>
<h3 id="827bcbd5632844fa97b2a92e8beb17cf">
Adding a new file <a href="#827bcbd5632844fa97b2a92e8beb17cf">#</a>
</h3>
<p>
Finally, we can add a new file to a folder.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">FSZipper</span>? <span style="font-weight:bold;color:#74531f;">Add</span>(<span style="color:#2b91af;">FSItem</span> <span style="font-weight:bold;color:#1f377f;">item</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> FSItem.<span style="font-weight:bold;color:#74531f;">Match</span><<span style="color:#2b91af;">FSZipper</span>?>(
<span style="font-weight:bold;color:#1f377f;">whenFile</span>: (<span style="font-weight:bold;color:#1f377f;">_</span>, <span style="font-weight:bold;color:#1f377f;">_</span>) => <span style="color:blue;">null</span>,
<span style="font-weight:bold;color:#1f377f;">whenFolder</span>: (<span style="font-weight:bold;color:#1f377f;">name</span>, <span style="font-weight:bold;color:#1f377f;">items</span>) => <span style="color:blue;">new</span> <span style="color:#2b91af;">FSZipper</span>(
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFolder</span>(<span style="font-weight:bold;color:#1f377f;">name</span>, <span style="font-weight:bold;color:#1f377f;">items</span>.<span style="font-weight:bold;color:#74531f;">Prepend</span>(<span style="font-weight:bold;color:#1f377f;">item</span>).<span style="font-weight:bold;color:#74531f;">ToList</span>()),
Breadcrumbs));
}</pre>
</p>
<p>
This operation may fail, since we can't add a file to a file. This is, again, clearly indicated by the return type, which allows <code>null</code>.
</p>
<p>
This implementation adds the file to the start of the folder, but it would also be possible to add it at the end. I would consider that slightly more idiomatic in C#, but here I've followed the Haskell example code, which conses the new <code>item</code> to the beginning of the list. As is idiomatic in Haskell.
</p>
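<p>
If you'd rather add at the end, the change is small. The following sketch isolates the two options on a plain collection; the <code>FolderItems</code> class and its two method names are made up for this illustration and aren't part of the article's code.
</p>

```csharp
using System.Collections.Generic;
using System.Linq;

public static class FolderItems
{
    // What the Add method above does: the new item becomes the first element.
    public static IReadOnlyCollection<T> AddFirst<T>(IReadOnlyCollection<T> items, T item)
    {
        return items.Prepend(item).ToList();
    }

    // The alternative: the new item becomes the last element.
    public static IReadOnlyCollection<T> AddLast<T>(IReadOnlyCollection<T> items, T item)
    {
        return items.Append(item).ToList();
    }
}
```

<p>
Both methods leave the input collection untouched and return a new one, so either choice keeps the operation free of side effects.
</p>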
<p>
The following unit test reproduces the Haskell article's example.
</p>
<p>
<pre>[<span style="color:#2b91af;">Fact</span>]
<span style="color:blue;">public</span> <span style="color:blue;">void</span> <span style="font-weight:bold;color:#74531f;">AddPic</span>()
{
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">sut</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">FSZipper</span>(myDisk);
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">actual</span> = <span style="font-weight:bold;color:#1f377f;">sut</span>.<span style="font-weight:bold;color:#74531f;">GoTo</span>(<span style="color:#a31515;">"pics"</span>)?.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"heh.jpg"</span>, <span style="color:#a31515;">"lol"</span>))?.<span style="font-weight:bold;color:#74531f;">GoUp</span>();
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">NotNull</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>);
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>(
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFolder</span>(<span style="color:#a31515;">"root"</span>,
[
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"goat_yelling_like_man.wmv"</span>, <span style="color:#a31515;">"baaaaaa"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"pope_time.avi"</span>, <span style="color:#a31515;">"god bless"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFolder</span>(<span style="color:#a31515;">"pics"</span>,
[
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"heh.jpg"</span>, <span style="color:#a31515;">"lol"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"ape_throwing_up.jpg"</span>, <span style="color:#a31515;">"bleargh"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"watermelon_smash.gif"</span>, <span style="color:#a31515;">"smash!!"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"skull_man(scary).bmp"</span>, <span style="color:#a31515;">"Yikes!"</span>)
]),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"dijon_poupon.doc"</span>, <span style="color:#a31515;">"best mustard"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFolder</span>(<span style="color:#a31515;">"programs"</span>,
[
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"fartwizard.exe"</span>, <span style="color:#a31515;">"10gotofart"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"owl_bandit.dmg"</span>, <span style="color:#a31515;">"mov eax, h00t"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"not_a_virus.exe"</span>, <span style="color:#a31515;">"really not a virus"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFolder</span>(<span style="color:#a31515;">"source code"</span>,
[
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"best_hs_prog.hs"</span>, <span style="color:#a31515;">"main = print (fix error)"</span>),
<span style="color:#2b91af;">FSItem</span>.<span style="color:#74531f;">CreateFile</span>(<span style="color:#a31515;">"random.hs"</span>, <span style="color:#a31515;">"main = print 4"</span>)
])
])
]),
<span style="font-weight:bold;color:#1f377f;">actual</span>.FSItem);
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Empty</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>.Breadcrumbs);
}</pre>
</p>
<p>
This example also follows the edit with a <code>GoUp</code> call, with the effect that the Zipper is once more focused on the entire tree. The assertion verifies that the new <code>heh.jpg</code> file is the first file in the <code>pics</code> folder.
</p>
<h3 id="18720e9e88d94384921a2b664b4e0a7a">
Conclusion <a href="#18720e9e88d94384921a2b664b4e0a7a">#</a>
</h3>
<p>
The code for <code>FSZipper</code> is actually a bit simpler than for the binary tree. This, I think, is mostly attributable to the <code>FSZipper</code> having fewer constituent sum types. While sum types are trivial, and extraordinarily useful in languages that natively support them, they require a lot of boilerplate in a language like C#.
</p>
<p>
Do you need something like <code>FSZipper</code> in C#? Probably not. As I've already discussed, this article series mostly exists as a programming exercise.
</p>
</div><hr>
This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>.
Functor products
https://blog.ploeh.dk/2024/09/16/functor-products
2024-09-16T06:08:00+00:00
Mark Seemann
<div id="post">
<p>
<em>A tuple or class of functors is also a functor. An article for object-oriented developers.</em>
</p>
<p>
This article is part of <a href="/2022/07/11/functor-relationships">a series of articles about functor relationships</a>. In this one you'll learn about a universal composition of <a href="/2018/03/22/functors">functors</a>. In short, if you have a <a href="https://en.wikipedia.org/wiki/Product_type">product type</a> of functors, that data structure itself gives rise to a functor.
</p>
<p>
Together with other articles in this series, this result can help you answer questions such as: <em>Does this data structure form a functor?</em>
</p>
<p>
Since functors tend to be quite common, and since they're useful enough that many programming languages have special support or syntax for them, the ability to recognize a potential functor can be useful. Given a type like <code>Foo<T></code> (C# syntax) or <code>Bar<T1, T2></code>, being able to recognize it as a functor can come in handy. One scenario is if you yourself have just defined such a data type. Recognizing that it's a functor strongly suggests that you should give it a <code>Select</code> method in C#, a <code>map</code> function in <a href="https://fsharp.org/">F#</a>, and so on.
</p>
<p>
Not all generic types give rise to a (covariant) functor. Some are rather <a href="/2021/09/02/contravariant-functors">contravariant functors</a>, and some are <a href="/2022/08/01/invariant-functors">invariant</a>.
</p>
<p>
If, on the other hand, you have a data type which is a product of two or more (covariant) functors <em>with the same type parameter</em>, then the data type itself gives rise to a functor. You'll see some examples in this article.
</p>
<h3 id="9fc25288b4504ff3b4fabe932ecf2ea2">
Abstract shape <a href="#9fc25288b4504ff3b4fabe932ecf2ea2">#</a>
</h3>
<p>
Before we look at some examples found in other code, it helps if we know what we're looking for. Most (if not all?) languages support product types. In canonical form, they're just tuples of values, but in an object-oriented language like C#, such types are typically classes.
</p>
<p>
Imagine that you have two functors <code>F</code> and <code>G</code>, and you're now considering a data structure that contains a value of both types.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">FAndG</span><<span style="color:#2b91af;">T</span>>
{
<span style="color:blue;">public</span> <span style="color:#2b91af;">FAndG</span>(<span style="color:#2b91af;">F</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">f</span>, <span style="color:#2b91af;">G</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">g</span>)
{
F = <span style="font-weight:bold;color:#1f377f;">f</span>;
G = <span style="font-weight:bold;color:#1f377f;">g</span>;
}
<span style="color:blue;">public</span> <span style="color:#2b91af;">F</span><<span style="color:#2b91af;">T</span>> F { <span style="color:blue;">get</span>; }
<span style="color:blue;">public</span> <span style="color:#2b91af;">G</span><<span style="color:#2b91af;">T</span>> G { <span style="color:blue;">get</span>; }
<span style="color:green;">// Methods go here...</span></pre>
</p>
<p>
The name of the type is <code><span style="color:#2b91af;">FAndG</span><<span style="color:#2b91af;">T</span>></code> because it contains both an <code><span style="color:#2b91af;">F</span><<span style="color:#2b91af;">T</span>></code> object and a <code><span style="color:#2b91af;">G</span><<span style="color:#2b91af;">T</span>></code> object.
</p>
<p>
Notice that it's an essential requirement that the individual functors (here <code>F</code> and <code>G</code>) are parametrized by the same type parameter (here <code>T</code>). If your data structure contains <code><span style="color:#2b91af;">F</span><<span style="color:#2b91af;">T1</span>></code> and <code><span style="color:#2b91af;">G</span><<span style="color:#2b91af;">T2</span>></code>, the following 'theorem' doesn't apply.
</p>
<p>
The point of this article is that such an <code><span style="color:#2b91af;">FAndG</span><<span style="color:#2b91af;">T</span>></code> data structure forms a functor. The <code>Select</code> implementation is quite unsurprising:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">FAndG</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#74531f;">Select</span><<span style="color:#2b91af;">TResult</span>>(<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">FAndG</span><<span style="color:#2b91af;">TResult</span>>(F.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>), G.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>));
}</pre>
</p>
<p>
Since we've assumed that both <code>F</code> and <code>G</code> already are functors, they must come with some projection function. In C# it's <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatically</a> called <code>Select</code>, while in F# it'd typically be called <code>map</code>:
</p>
<p>
<pre><span style="color:green;">// ('a -> 'b) -> FAndG<'a> -> FAndG<'b></span>
<span style="color:blue;">let</span> <span style="color:#74531f;">map</span> <span style="color:#74531f;">f</span> <span style="font-weight:bold;color:#1f377f;">fandg</span> = { F = <span style="color:#2b91af;">F</span>.<span style="color:#74531f;">map</span> <span style="color:#74531f;">f</span> <span style="font-weight:bold;color:#1f377f;">fandg</span>.F; G = <span style="color:#2b91af;">G</span>.<span style="color:#74531f;">map</span> <span style="color:#74531f;">f</span> <span style="font-weight:bold;color:#1f377f;">fandg</span>.G }</pre>
</p>
<p>
assuming a record type like
</p>
<p>
<pre><span style="color:blue;">type</span> <span style="color:#2b91af;">FAndG</span><<span style="color:#2b91af;">'a</span>> = { F : <span style="color:#2b91af;">F</span><<span style="color:#2b91af;">'a</span>>; G : <span style="color:#2b91af;">G</span><<span style="color:#2b91af;">'a</span>> }</pre>
</p>
<p>
In both the C# <code>Select</code> example and the F# <code>map</code> function, the composed functor passes the function argument (<code>selector</code> or <code>f</code>) to both <code>F</code> and <code>G</code> and uses it to map both constituents. It then composes a new product from these individual results.
</p>
<p>
I'll have more to say about how this generalizes to a product of more than two functors, but first, let's consider some examples.
</p>
<h3 id="e3b18df7ac4440d7aada000ce27044f3">
List Zipper <a href="#e3b18df7ac4440d7aada000ce27044f3">#</a>
</h3>
<p>
One of the simplest examples I can think of is a List Zipper, which <a href="https://learnyouahaskell.com/zippers">in Haskell</a> is nothing but a type alias of a tuple of lists:
</p>
<p>
<pre><span style="color:blue;">type</span> ListZipper a = ([a],[a])</pre>
</p>
<p>
In the article <a href="/2024/08/26/a-list-zipper-in-c">A List Zipper in C#</a> you saw how the <code><span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">T</span>></code> class composes two <code><span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>></code> objects.
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>> values;
<span style="color:blue;">public</span> <span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>> Breadcrumbs { <span style="color:blue;">get</span>; }
<span style="color:blue;">private</span> <span style="color:#2b91af;">ListZipper</span>(<span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">values</span>, <span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">breadcrumbs</span>)
{
<span style="color:blue;">this</span>.values = <span style="font-weight:bold;color:#1f377f;">values</span>;
Breadcrumbs = <span style="font-weight:bold;color:#1f377f;">breadcrumbs</span>;
}</pre>
</p>
<p>
Since we already know that sequences like <code><span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>></code> form functors, we now know that so must <code><span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">T</span>></code>. And indeed, the <code>Select</code> implementation looks similar to the above 'shape outline'.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#74531f;">Select</span><<span style="color:#2b91af;">TResult</span>>(<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">TResult</span>>(values.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>), Breadcrumbs.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>));
}</pre>
</p>
<p>
It passes the <code>selector</code> function to the <code>Select</code> method of both <code>values</code> and <code>Breadcrumbs</code>, and composes the results into a <code><span style="color:blue;">new</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">TResult</span>></code>.
</p>
<p>
While this example is straightforward, it may not be the most compelling, because <code><span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">T</span>></code> composes two identical functors: <code><span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>></code>. The knowledge that functors compose is more general than that.
</p>
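<p>
As a minimal sketch of a product of two <em>different</em> functors, consider a class that pairs a <code>Lazy<T></code> value with an <code>IEnumerable<T></code>. The names <code>LazyAndSequence<T></code> and <code>LazyFunctor</code> are hypothetical and only serve this illustration. Since <code>Lazy<T></code> has no built-in <code>Select</code> method, the sketch assumes a small extension method for it.
</p>
<p>
<pre>using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyFunctor
{
    // Hypothetical helper: maps a lazy value without forcing it.
    public static Lazy<TResult> Select<T, TResult>(
        this Lazy<T> source,
        Func<T, TResult> selector)
    {
        return new Lazy<TResult>(() => selector(source.Value));
    }
}

// Hypothetical product of two different functors over the same T.
public sealed class LazyAndSequence<T>
{
    public LazyAndSequence(Lazy<T> f, IEnumerable<T> g)
    {
        F = f;
        G = g;
    }

    public Lazy<T> F { get; }
    public IEnumerable<T> G { get; }

    // Same shape as the abstract FAndG example: map both constituents
    // with the same function and compose a new product.
    public LazyAndSequence<TResult> Select<TResult>(Func<T, TResult> selector)
    {
        return new LazyAndSequence<TResult>(F.Select(selector), G.Select(selector));
    }
}

public static class Program
{
    public static void Main()
    {
        var product = new LazyAndSequence<int>(
            new Lazy<int>(() => 42),
            new[] { 1, 2, 3 });

        var actual = product.Select(i => i.ToString());

        Console.WriteLine(actual.F.Value);             // prints 42
        Console.WriteLine(string.Join(",", actual.G)); // prints 1,2,3
    }
}</pre>
</p>
<p>
The <code>Select</code> method doesn't care that the two constituents are different functors. All that matters is that both are parametrized by the same <code>T</code>.
</p>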
<h3 id="051a4cc14ad74c2ca2f62fd12051f97c">
Non-empty collection <a href="#051a4cc14ad74c2ca2f62fd12051f97c">#</a>
</h3>
<p>
After the List Zipper above, the next-simplest example I can think of is a non-empty list. On this blog I originally introduced it in the article <a href="/2017/12/11/semigroups-accumulate">Semigroups accumulate</a>, but here I'll use the variant from <a href="/2023/08/07/nonempty-catamorphism">NonEmpty catamorphism</a>. It composes a single value of the type <code>T</code> with an <code><span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">T</span>></code>.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">NonEmptyCollection</span>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">head</span>, <span style="color:blue;">params</span> <span style="color:#2b91af;">T</span>[] <span style="font-weight:bold;color:#1f377f;">tail</span>)
{
<span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">head</span> == <span style="color:blue;">null</span>)
<span style="font-weight:bold;color:#8f08c4;">throw</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(<span style="font-weight:bold;color:#1f377f;">head</span>));
<span style="color:blue;">this</span>.Head = <span style="font-weight:bold;color:#1f377f;">head</span>;
<span style="color:blue;">this</span>.Tail = <span style="font-weight:bold;color:#1f377f;">tail</span>;
}
<span style="color:blue;">public</span> <span style="color:#2b91af;">T</span> Head { <span style="color:blue;">get</span>; }
<span style="color:blue;">public</span> <span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">T</span>> Tail { <span style="color:blue;">get</span>; }</pre>
</p>
<p>
The <code>Tail</code>, being an <code><span style="color:#2b91af;">IReadOnlyCollection</span><<span style="color:#2b91af;">T</span>></code>, easily forms a functor, since it's a kind of list. But what about <code>Head</code>, which is a 'naked' <code>T</code> value? Does that form a functor? If so, which one?
</p>
<p>
Indeed, a 'naked' <code>T</code> value is isomorphic to <a href="/2018/09/03/the-identity-functor">the Identity functor</a>. This situation is an example of how knowing about the Identity functor is useful, even if you never actually write code that uses it. Once you realize that <code>T</code> is equivalent to a functor, you've established that <code><span style="color:#2b91af;">NonEmptyCollection</span><<span style="color:#2b91af;">T</span>></code> composes two functors. Therefore, it must itself form a functor, and you realize that you can give it a <code>Select</code> method.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">NonEmptyCollection</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#74531f;">Select</span><<span style="color:#2b91af;">TResult</span>>(<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">NonEmptyCollection</span><<span style="color:#2b91af;">TResult</span>>(<span style="font-weight:bold;color:#1f377f;">selector</span>(Head), Tail.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>).<span style="font-weight:bold;color:#74531f;">ToArray</span>());
}</pre>
</p>
<p>
Notice that even though we understand that <code>T</code> is equivalent to the Identity functor, there's no reason to actually wrap <code>Head</code> in an <code><span style="color:#2b91af;">Identity</span><<span style="color:#2b91af;">T</span>></code> <a href="https://bartoszmilewski.com/2014/01/14/functors-are-containers/">container</a> just to call <code>Select</code> on it and unwrap the result. Rather, the above <code>Select</code> implementation directly invokes <code>selector</code> with <code>Head</code>. It is, after all, a function that takes a <code>T</code> value as input and returns a <code>TResult</code> object as output.
</p>
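<p>
To see why the wrapping is redundant, here's a minimal sketch that compares the two routes. The <code>Identity<T></code> class below is an assumption made for this illustration (including the <code>Item</code> property name), not code taken from the <code>NonEmptyCollection<T></code> code base.
</p>
<p>
<pre>using System;

// Hypothetical, minimal Identity functor: a container of exactly one value.
public sealed class Identity<T>
{
    public Identity(T item)
    {
        Item = item;
    }

    public T Item { get; }

    public Identity<TResult> Select<TResult>(Func<T, TResult> selector)
    {
        return new Identity<TResult>(selector(Item));
    }
}

public static class Program
{
    public static void Main()
    {
        Func<int, string> selector = i => (i * 2).ToString();
        int head = 21;

        // Wrap, map, and unwrap...
        string viaIdentity = new Identity<int>(head).Select(selector).Item;
        // ...or just call the function directly.
        string direct = selector(head);

        Console.WriteLine(viaIdentity == direct); // prints True
    }
}</pre>
</p>
<p>
Both routes produce the same value, which is why <code>Select</code> can call <code>selector(Head)</code> without any ceremony.
</p>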
<h3 id="d721c5ba6eda4016be1417ea01105bea">
Ranges <a href="#d721c5ba6eda4016be1417ea01105bea">#</a>
</h3>
<p>
It's hard to come up with an example that's both somewhat compelling and realistic, and at the same time prototypically pure. Stripped of all 'noise', functor products are just tuples, but that hardly makes for a compelling example. On the other hand, most other examples I can think of combine results about functors where they compose in more than one way. Not only as products, but also as <a href="/2024/10/14/functor-sums">sums of functors</a>, as well as nested compositions. You'll be able to read about these in future articles, but for the next examples, you'll have to accept some claims about functors at face value.
</p>
<p>
In <a href="/2024/02/12/range-as-a-functor">Range as a functor</a> you saw how both <code><span style="color:#2b91af;">Endpoint</span><<span style="color:#2b91af;">T</span>></code> and <code><span style="color:#2b91af;">Range</span><<span style="color:#2b91af;">T</span>></code> are functors. The article shows functor implementations for each, in C#, F#, and <a href="https://www.haskell.org/">Haskell</a>. For now we'll ignore the deeper underlying reason why <code><span style="color:#2b91af;">Endpoint</span><<span style="color:#2b91af;">T</span>></code> forms a functor, and instead focus on <code><span style="color:#2b91af;">Range</span><<span style="color:#2b91af;">T</span>></code>.
</p>
<p>
In Haskell I never defined an explicit <code>Range</code> type, but rather just treated ranges as tuples. As stated repeatedly already, tuples are the essential products, so if you accept that <code>Endpoint</code> gives rise to a functor, then a 'range tuple' does, too.
</p>
<p>
In F# <code>Range</code> is defined like this:
</p>
<p>
<pre><span style="color:blue;">type</span> Range<'a> = { LowerBound : Endpoint<'a>; UpperBound : Endpoint<'a> }</pre>
</p>
<p>
Such a record type is also easily identified as a product type. In a sense, we can think of a record type as a 'tuple with metadata', where the metadata contains <em>names</em> of elements.
</p>
<p>
In C# <code><span style="color:#2b91af;">Range</span><<span style="color:#2b91af;">T</span>></code> is a class with two <code><span style="color:#2b91af;">Endpoint</span><<span style="color:#2b91af;">T</span>></code> fields.
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">Endpoint</span><<span style="color:#2b91af;">T</span>> min;
<span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">Endpoint</span><<span style="color:#2b91af;">T</span>> max;
<span style="color:blue;">public</span> <span style="color:#2b91af;">Range</span>(<span style="color:#2b91af;">Endpoint</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">min</span>, <span style="color:#2b91af;">Endpoint</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">max</span>)
{
<span style="color:blue;">this</span>.min = <span style="font-weight:bold;color:#1f377f;">min</span>;
<span style="color:blue;">this</span>.max = <span style="font-weight:bold;color:#1f377f;">max</span>;
}</pre>
</p>
<p>
In a sense, you can think of such an immutable class as equivalent to a record type, only requiring substantial <a href="/2019/12/16/zone-of-ceremony">ceremony</a>. The point is that because a range is a product of two functors, it itself gives rise to a functor. You can see all the implementations in <a href="/2024/02/12/range-as-a-functor">Range as a functor</a>.
</p>
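<p>
The linked article contains the actual implementations. Purely as a sketch of the product shape, and assuming a much simplified <code>Endpoint<T></code> stand-in (a value plus an open/closed flag, which is not how the linked article defines it), a <code>Select</code> method on the range could look like this. The <code>ToString</code> overrides are only added to make the result visible.
</p>
<p>
<pre>using System;

// Simplified stand-in for illustration only.
public sealed class Endpoint<T>
{
    public Endpoint(T value, bool isClosed)
    {
        Value = value;
        IsClosed = isClosed;
    }

    public T Value { get; }
    public bool IsClosed { get; }

    // Maps the value and preserves the open/closed flag.
    public Endpoint<TResult> Select<TResult>(Func<T, TResult> selector)
    {
        return new Endpoint<TResult>(selector(Value), IsClosed);
    }
}

public sealed class Range<T>
{
    private readonly Endpoint<T> min;
    private readonly Endpoint<T> max;

    public Range(Endpoint<T> min, Endpoint<T> max)
    {
        this.min = min;
        this.max = max;
    }

    // Product of two functors: map both endpoints with the same function.
    public Range<TResult> Select<TResult>(Func<T, TResult> selector)
    {
        return new Range<TResult>(min.Select(selector), max.Select(selector));
    }

    public override string ToString()
    {
        return (min.IsClosed ? "[" : "(") + min.Value + ", " + max.Value
            + (max.IsClosed ? "]" : ")");
    }
}

public static class Program
{
    public static void Main()
    {
        var range = new Range<int>(
            new Endpoint<int>(1, true),
            new Endpoint<int>(5, false));

        Console.WriteLine(range.Select(i => i * 10)); // prints [10, 50)
    }
}</pre>
</p>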
<h3 id="25e4dea36f644217ba1e28f2a509f3ab">
Binary tree Zipper <a href="#25e4dea36f644217ba1e28f2a509f3ab">#</a>
</h3>
<p>
In <a href="/2024/09/09/a-binary-tree-zipper-in-c">A Binary Tree Zipper in C#</a> you saw that the <code><span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>></code> class exposes two read-only properties:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> Tree { <span style="color:blue;">get</span>; }
<span style="color:blue;">public</span> <span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>>> Breadcrumbs { <span style="color:blue;">get</span>; }</pre>
</p>
<p>
Both have the same generic type parameter <code>T</code>, so the question is whether <code><span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>></code> may form a functor. We now know that the answer is affirmative if <code><span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>></code> and <code><span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>>></code> are both functors.
</p>
<p>
For now, believe me when I claim that this is the case. This means that you can add a <code>Select</code> method to the class:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#74531f;">Select</span><<span style="color:#2b91af;">TResult</span>>(<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">selector</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">TResult</span>>(
Tree.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>),
Breadcrumbs.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">c</span> => <span style="font-weight:bold;color:#1f377f;">c</span>.<span style="font-weight:bold;color:#74531f;">Select</span>(<span style="font-weight:bold;color:#1f377f;">selector</span>)));
}</pre>
</p>
<p>
By now, this should hardly be surprising: Call <code>Select</code> on each constituent functor and create a proper return value from the results.
</p>
<h3 id="1179fc33850d430780c843583c16adcb">
Higher arities <a href="#1179fc33850d430780c843583c16adcb">#</a>
</h3>
<p>
All examples have involved products of only two functors, but the result generalizes to higher arities. To gain an understanding of why, consider that it's always possible to rewrite tuples of higher arities as nested pairs. As an example, a triple like <code>(42, <span style="color:#a31515;">"foo"</span>, True)</code> can be rewritten as <code>(42, (<span style="color:#a31515;">"foo"</span>, True))</code> without loss of information. The latter representation is a pair (a two-tuple) where the first element is <code>42</code>, but the second element is another pair. These two representations are isomorphic, meaning that we can go back and forth without losing data.
</p>
<p>
By induction you can generalize this result to any arity. The point is that the only data type you need to describe a product is a pair.
</p>
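<p>
The isomorphism between a triple and a nested pair can be sketched in C# with value tuples. The <code>TupleIso</code> class and its method names are hypothetical, chosen only for this illustration.
</p>
<p>
<pre>using System;

public static class TupleIso
{
    // (a, b, c) -> (a, (b, c))
    public static (T1, (T2, T3)) ToNested<T1, T2, T3>((T1, T2, T3) triple)
    {
        return (triple.Item1, (triple.Item2, triple.Item3));
    }

    // (a, (b, c)) -> (a, b, c)
    public static (T1, T2, T3) ToFlat<T1, T2, T3>((T1, (T2, T3)) nested)
    {
        return (nested.Item1, nested.Item2.Item1, nested.Item2.Item2);
    }
}

public static class Program
{
    public static void Main()
    {
        var triple = (42, "foo", true);

        var nested = TupleIso.ToNested(triple);
        var roundTrip = TupleIso.ToFlat(nested);

        // The round trip loses no information.
        Console.WriteLine(triple.Equals(roundTrip)); // prints True
    }
}</pre>
</p>
<p>
Since the two conversions are each other's inverses, any statement about pairs carries over to triples, and from there to tuples of any arity.
</p>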
<p>
Haskell's <a href="https://hackage.haskell.org/package/base">base</a> library defines a specialized container called <a href="https://hackage.haskell.org/package/base/docs/Data-Functor-Product.html">Product</a> for this very purpose: If you have two <code>Functor</code> instances, you can <code>Pair</code> them up, and they become a single <code>Functor</code>.
</p>
<p>
Let's start with a <code>Pair</code> of <code>Maybe</code> and a list:
</p>
<p>
<pre>ghci> Pair (Just "foo") ["bar", "baz", "qux"]
Pair (Just "foo") ["bar","baz","qux"]</pre>
</p>
<p>
This is a single 'object', if you will, that composes those two <code>Functor</code> instances. This means that you can map over it:
</p>
<p>
<pre>ghci> elem 'b' <$> Pair (Just "foo") ["bar", "baz", "qux"]
Pair (Just False) [True,True,False]</pre>
</p>
<p>
Here I've used the infix <code><$></code> operator as an alternative to <code>fmap</code>. By composing with <code>elem 'b'</code>, I'm asking every value inside the container whether or not it contains the character <code>b</code>. The <code>Maybe</code> value doesn't, while the first two list elements do.
</p>
<p>
If you want to compose three, rather than two, <code>Functor</code> instances, you just nest the <code>Pairs</code>, just like you can nest tuples:
</p>
<p>
<pre>ghci> elem 'b' <$> Pair (Identity "quux") (Pair (Just "foo") ["bar", "baz", "qux"])
Pair (Identity False) (Pair (Just False) [True,True,False])</pre>
</p>
<p>
This example now introduces the <code>Identity</code> container as a third <code>Functor</code> instance. I could have used any other <code>Functor</code> instance instead of <code>Identity</code>, but some of them are more awkward to create or display. For example, the <a href="/2021/08/30/the-reader-functor">Reader</a> or <a href="/2021/07/19/the-state-functor">State</a> functors have no <code>Show</code> instances in Haskell, meaning that GHCi doesn't know how to print them as values. Other <code>Functor</code> instances didn't work as well for the example, since they tend to be more awkward to create. As an example, any non-trivial <a href="https://hackage.haskell.org/package/containers/docs/Data-Tree.html#t:Tree">Tree</a> requires substantial editor space to express.
</p>
<h3 id="329c3274f8f54171905d747867fc293b">
Conclusion <a href="#329c3274f8f54171905d747867fc293b">#</a>
</h3>
<p>
A product of functors may itself be made a functor. The examples shown in this article are all constrained to two functors, but if you have a product of three, four, or more functors, that product still gives rise to a functor.
</p>
<p>
This is useful to know, particularly if you're working in a language with only partial support for functors. Mainstream languages aren't going to automatically turn such products into functors, in the way that Haskell's <code>Product</code> container almost does. Thus, knowing when you can safely give your generic types a <code>Select</code> method or <code>map</code> function may come in handy.
</p>
<p>
There are more rules like this one. The next article examines another.
</p>
<p>
<strong>Next:</strong> <a href="/2024/10/14/functor-sums">Functor sums</a>.
</p>
</div><hr>
A Binary Tree Zipper in C#
https://blog.ploeh.dk/2024/09/09/a-binary-tree-zipper-in-c
2024-09-09T06:09:00+00:00
Mark Seemann
<div id="post">
<p>
<em>A port of another Haskell example, still just because.</em>
</p>
<p>
This article is part of <a href="/2024/08/19/zippers">a series about Zippers</a>. In this one, I port the <code>Zipper</code> data structure from the <a href="https://learnyouahaskell.com/">Learn You a Haskell for Great Good!</a> article also called <a href="https://learnyouahaskell.com/zippers">Zippers</a>.
</p>
<p>
A word of warning: I'm assuming that you're familiar with the contents of that article, so I'll skip the pedagogical explanations; I can hardly do it better than it's done there. Additionally, I'll make heavy use of certain standard constructs to port <a href="https://www.haskell.org/">Haskell</a> code, most notably <a href="/2018/05/22/church-encoding">Church encoding</a> to model <a href="https://en.wikipedia.org/wiki/Tagged_union">sum types</a> in languages that don't natively have them. Such as C#. In some cases, I'll implement the Church encoding using the data structure's <a href="/2019/04/29/catamorphisms">catamorphism</a>. Since the <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">cyclomatic complexity</a> of the resulting code is quite low, you may be able to follow what's going on even if you don't know what Church encoding or catamorphisms are, but if you want to understand the background and motivation for that style of programming, you can consult the cited resources.
</p>
<p>
The code shown in this article is <a href="https://github.com/ploeh/CSharpZippers">available on GitHub</a>.
</p>
<h3 id="e612adde6ff2487ebd026c858f36233f">
Binary tree initialization and structure <a href="#e612adde6ff2487ebd026c858f36233f">#</a>
</h3>
<p>
In the Haskell code, the binary <code>Tree</code> type is a recursive <a href="https://en.wikipedia.org/wiki/Tagged_union">sum type</a>, defined on a single line of code. C#, on the other hand, has no built-in language construct that supports sum types, so a more elaborate solution is required. At least two options are available to us. One is to <a href="/2018/06/25/visitor-as-a-sum-type">model a sum type as a Visitor</a>. Another is to use <a href="/2018/05/22/church-encoding">Church encoding</a>. In this article, I'll do the latter.
</p>
<p>
I find the type name (<code>Tree</code>) used in the Zippers article a bit too vague, and since I consider <a href="https://peps.python.org/pep-0020/">explicit better than implicit</a>, I'll use a more precise class name:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>></pre>
</p>
<p>
Even so, there are different kinds of binary trees. In <a href="/2019/06/24/full-binary-tree-catamorphism">a previous article</a> I've shown a catamorphism for a <em>full <a href="https://en.wikipedia.org/wiki/Binary_tree">binary tree</a></em>. This variation is not as strict, since it allows a node to have zero, one, or two children. Or, strictly speaking, a node always has exactly two children, but both, or one of them, may be empty. <code><span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>></code> uses Church encoding to distinguish between the two, but we'll return to that in a moment.
</p>
<p>
First, we'll examine how the class allows initialization:
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">IBinaryTree</span> root;
<span style="color:blue;">private</span> <span style="color:#2b91af;">BinaryTree</span>(<span style="color:#2b91af;">IBinaryTree</span> <span style="font-weight:bold;color:#1f377f;">root</span>)
{
<span style="color:blue;">this</span>.root = <span style="font-weight:bold;color:#1f377f;">root</span>;
}
<span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTree</span>() : <span style="color:blue;">this</span>(<span style="color:#2b91af;">Empty</span>.Instance)
{
}
<span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTree</span>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">value</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">left</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">right</span>)
: <span style="color:blue;">this</span>(<span style="color:blue;">new</span> <span style="color:#2b91af;">Node</span>(<span style="font-weight:bold;color:#1f377f;">value</span>, <span style="font-weight:bold;color:#1f377f;">left</span>.root, <span style="font-weight:bold;color:#1f377f;">right</span>.root))
{
}</pre>
</p>
<p>
The class uses a <code>private</code> <code>root</code> object to implement behaviour, and constructor chaining for initialization. The master constructor is <code>private</code>, since the <code>IBinaryTree</code> interface is <code>private</code>. The parameterless constructor implicitly indicates an empty node, whereas the other <code>public</code> constructor indicates a node with a value and two children. Yes, I know that I just wrote that explicit is better than implicit, but it turns out that with the <a href="https://learn.microsoft.com/dotnet/csharp/language-reference/operators/new-operator">target-typed <code>new</code></a> operator feature in C#, constructing trees in code becomes easier with this design choice:
</p>
<p>
<pre><span style="color:#2b91af;">BinaryTree</span><<span style="color:blue;">int</span>> <span style="font-weight:bold;color:#1f377f;">sut</span> = <span style="color:blue;">new</span>(
42,
<span style="color:blue;">new</span>(),
<span style="color:blue;">new</span>(2, <span style="color:blue;">new</span>(), <span style="color:blue;">new</span>()));</pre>
</p>
<p>
As <a href="/2020/11/30/name-by-role">the variable name suggests</a>, I've taken this code example from a unit test.
</p>
<h3 id="57fddbbeebc44489b3ebc0c4fd7c0d9f">
Private interface <a href="#57fddbbeebc44489b3ebc0c4fd7c0d9f">#</a>
</h3>
<p>
The class delegates method calls to the <code>root</code> field, which is an instance of the <code>private</code>, nested <code>IBinaryTree</code> interface:
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">interface</span> <span style="color:#2b91af;">IBinaryTree</span>
{
<span style="color:#2b91af;">TResult</span> <span style="font-weight:bold;color:#74531f;">Aggregate</span><<span style="color:#2b91af;">TResult</span>>(
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenEmpty</span>,
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>, <span style="color:#2b91af;">TResult</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenNode</span>);
}</pre>
</p>
<p>
Why is <code>IBinaryTree</code> a <code>private</code> interface? Why does that interface even exist?
</p>
<p>
To be frank, I could have chosen another implementation strategy. Since there are only two mutually exclusive alternatives (<em>node</em> or <em>empty</em>), I could also have indicated which is which with a Boolean flag. You can see an example of that implementation tactic in the <code>Table</code> class in the sample code that accompanies <a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a>.
</p>
<p>
Using a Boolean flag, however, only works when there are exactly two choices. If you have three or more, things become more complicated. You could try to use an <a href="https://en.wikipedia.org/wiki/Enumerated_type">enum</a>, but in most languages, these tend to be nothing but glorified integers, and are typically not type-safe. If you define a three-way enum, there's no guarantee that a value of that type takes only one of these three values, and a good compiler will typically insist that you check for any other value as well. The C# compiler certainly does.
</p>
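<p>
As a minimal sketch of that enum problem, consider the following program. The <code>Answer</code> enum and the <code>Describe</code> function are hypothetical and only serve as illustration; they aren't part of the article's code base.
</p>

```csharp
using System;

// A cast can produce a value that matches none of the declared members,
// and the compiler accepts it.
var bogus = (Answer)42;

Console.WriteLine(Enum.IsDefined(typeof(Answer), bogus)); // False
Console.WriteLine(Describe(Answer.Maybe));                // maybe
Console.WriteLine(Describe(bogus));                       // not a declared Answer: 42

// Without the discard arm, the compiler warns that the switch expression
// isn't exhaustive, and an unmatched value throws at run time.
static string Describe(Answer answer) => answer switch
{
    Answer.Yes => "yes",
    Answer.No => "no",
    Answer.Maybe => "maybe",
    _ => $"not a declared Answer: {(int)answer}"
};

// Hypothetical three-way enum, used only for this illustration.
enum Answer { Yes, No, Maybe }
```

<p>
The discard arm is exactly the kind of 'any other value' check that a sum type with a fixed set of cases makes unnecessary.
</p>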
<p>
Church encoding offers a better alternative, but since it makes use of polymorphism, the most <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a> choice in C# is either an interface or a base class. Since I favour interfaces over base classes, that's what I've chosen here, but for the purposes of this little digression, it makes no difference: The following argument applies to base classes as well.
</p>
<p>
An interface (or base class) suggests to users of an API that they can implement it in order to extend behaviour. That's an impression I don't wish to give client developers. The purpose of the interface is exclusively to enable <a href="https://en.wikipedia.org/wiki/Double_dispatch">double dispatch</a> to work. There are only two implementations of the <code>IBinaryTree</code> interface, and under no circumstances should there be more.
</p>
<p>
The interface is an implementation detail, which is why both it, and its implementations, are <code>private</code>.
</p>
<h3 id="72ecf86f028f482ebcdb02e914e4cd06">
Binary tree catamorphism <a href="#72ecf86f028f482ebcdb02e914e4cd06">#</a>
</h3>
<p>
The <code>IBinaryTree</code> interface defines a catamorphism for the <code><span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>></code> class. Since we may often view a catamorphism as a sort of 'generalized fold', and since these kinds of operations in C# are typically called <code>Aggregate</code>, that's what I've called the method.
</p>
<p>
An aggregate function affords a way to traverse a data structure and collect information into a single value, here of type <code>TResult</code>. The return type may, however, be a complex type, including another <code><span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>></code>. You'll see examples of complex return values later in this article.
</p>
<p>
As already discussed, there are exactly two implementations of <code>IBinaryTree</code>. The one representing an empty node is the simplest:
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">Empty</span> : <span style="color:#2b91af;">IBinaryTree</span>
{
<span style="color:blue;">public</span> <span style="color:blue;">readonly</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">Empty</span> Instance = <span style="color:blue;">new</span>();
<span style="color:blue;">private</span> <span style="color:#2b91af;">Empty</span>()
{
}
<span style="color:blue;">public</span> <span style="color:#2b91af;">TResult</span> <span style="font-weight:bold;color:#74531f;">Aggregate</span><<span style="color:#2b91af;">TResult</span>>(
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenEmpty</span>,
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>, <span style="color:#2b91af;">TResult</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenNode</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">whenEmpty</span>();
}
}</pre>
</p>
<p>
The <code>Aggregate</code> implementation unconditionally calls the supplied <code>whenEmpty</code> function, which returns some <code>TResult</code> value unknown to the <code>Empty</code> class.
</p>
<p>
Although not strictly necessary, I've made the class a <a href="https://en.wikipedia.org/wiki/Singleton_pattern">Singleton</a>. Since I like to <a href="/2021/05/03/structural-equality-for-better-tests">take advantage of structural equality to write better tests</a>, it was either that, or overriding <code>Equals</code> and <code>GetHashCode</code>.
</p>
<p>
The other implementation gets around that problem by being a <a href="https://learn.microsoft.com/dotnet/csharp/language-reference/builtin-types/record">record</a>:
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">sealed</span> <span style="color:blue;">record</span> <span style="color:#2b91af;">Node</span>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">Value</span>, <span style="color:#2b91af;">IBinaryTree</span> <span style="font-weight:bold;color:#1f377f;">Left</span>, <span style="color:#2b91af;">IBinaryTree</span> <span style="font-weight:bold;color:#1f377f;">Right</span>) : <span style="color:#2b91af;">IBinaryTree</span>
{
<span style="color:blue;">public</span> <span style="color:#2b91af;">TResult</span> <span style="font-weight:bold;color:#74531f;">Aggregate</span><<span style="color:#2b91af;">TResult</span>>(
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenEmpty</span>,
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>, <span style="color:#2b91af;">TResult</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenNode</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">whenNode</span>(
Value,
Left.<span style="font-weight:bold;color:#74531f;">Aggregate</span>(<span style="font-weight:bold;color:#1f377f;">whenEmpty</span>, <span style="font-weight:bold;color:#1f377f;">whenNode</span>),
Right.<span style="font-weight:bold;color:#74531f;">Aggregate</span>(<span style="font-weight:bold;color:#1f377f;">whenEmpty</span>, <span style="font-weight:bold;color:#1f377f;">whenNode</span>));
}
}</pre>
</p>
<p>
It, too, unconditionally calls one of the two functions passed to its <code>Aggregate</code> method, but this time <code>whenNode</code>. It does that, however, by first <em>recursively</em> calling <code>Aggregate</code> on both <code>Left</code> and <code>Right</code>. It needs to do that because the <code>whenNode</code> function expects the subtrees to have been already converted to values of the <code>TResult</code> return type. This is a common pattern with catamorphisms, and takes a bit of time getting used to. You can see similar examples in the articles <a href="/2019/06/10/tree-catamorphism">Tree catamorphism</a>, <a href="/2019/08/05/rose-tree-catamorphism">Rose tree catamorphism</a>, and <a href="/2019/06/24/full-binary-tree-catamorphism">Full binary tree catamorphism</a>.
</p>
<p>
The <code><span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>></code> class defines a <code>public</code> <code>Aggregate</code> method that delegates to its <code>root</code> field:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">TResult</span> <span style="font-weight:bold;color:#74531f;">Aggregate</span><<span style="color:#2b91af;">TResult</span>>(
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenEmpty</span>,
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">TResult</span>, <span style="color:#2b91af;">TResult</span>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenNode</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> root.<span style="font-weight:bold;color:#74531f;">Aggregate</span>(<span style="font-weight:bold;color:#1f377f;">whenEmpty</span>, <span style="font-weight:bold;color:#1f377f;">whenNode</span>);
}</pre>
</p>
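<p>
To get a feel for how <code>whenNode</code> receives already-aggregated subtree results, here's a small sketch that counts nodes, sums values, and measures depth. The class is a compact restatement of the code shown above (expression-bodied members instead of block bodies), included only so that the example compiles on its own. The three calculations are illustrations, not methods from the article's code base.
</p>

```csharp
using System;

// The same example tree as in the unit test shown earlier.
BinaryTree<int> tree = new(42, new(), new(2, new(), new()));

// Each whenNode call receives the node's value plus the results that were
// already computed for the left and right subtrees.
int count = tree.Aggregate(() => 0, (_, l, r) => 1 + l + r);
int sum = tree.Aggregate(() => 0, (x, l, r) => x + l + r);
int depth = tree.Aggregate(() => 0, (_, l, r) => 1 + Math.Max(l, r));

Console.WriteLine((count, sum, depth)); // (2, 44, 2)

// Compact restatement of the article's class.
public sealed class BinaryTree<T>
{
    private readonly IBinaryTree root;

    private BinaryTree(IBinaryTree root) => this.root = root;

    public BinaryTree() : this(Empty.Instance) { }

    public BinaryTree(T value, BinaryTree<T> left, BinaryTree<T> right)
        : this(new Node(value, left.root, right.root)) { }

    public TResult Aggregate<TResult>(
        Func<TResult> whenEmpty,
        Func<T, TResult, TResult, TResult> whenNode) =>
        root.Aggregate(whenEmpty, whenNode);

    private interface IBinaryTree
    {
        TResult Aggregate<TResult>(
            Func<TResult> whenEmpty,
            Func<T, TResult, TResult, TResult> whenNode);
    }

    private sealed class Empty : IBinaryTree
    {
        public static readonly Empty Instance = new();

        private Empty() { }

        public TResult Aggregate<TResult>(
            Func<TResult> whenEmpty,
            Func<T, TResult, TResult, TResult> whenNode) =>
            whenEmpty();
    }

    private sealed record Node(T Value, IBinaryTree Left, IBinaryTree Right) : IBinaryTree
    {
        public TResult Aggregate<TResult>(
            Func<TResult> whenEmpty,
            Func<T, TResult, TResult, TResult> whenNode) =>
            whenNode(
                Value,
                Left.Aggregate(whenEmpty, whenNode),
                Right.Aggregate(whenEmpty, whenNode));
    }
}
```

<p>
In all three calls, <code>TResult</code> is inferred to be <code>int</code> from the <code>whenEmpty</code> lambda expression.
</p>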
<p>
The astute reader may now remark that the <code>Aggregate</code> method doesn't look like a Church encoding.
</p>
<h3 id="e99e9074e04e416c82a7345574d4944b">
Binary tree Church encoding <a href="#e99e9074e04e416c82a7345574d4944b">#</a>
</h3>
<p>
A Church encoding will typically have a <code>Match</code> method that enables client code to match on all the alternative cases in the sum type, without those confusing already-converted <code>TResult</code> values. It turns out that you can implement the desired <code>Match</code> method with the <code>Aggregate</code> method.
</p>
<p>
One of the advantages of doing meaningless coding exercises like this one is that you can pursue various ideas that interest you. One idea that interests me is the potential universality of catamorphisms. I conjecture that a catamorphism is an <a href="https://en.wikipedia.org/wiki/Algebraic_data_type">algebraic data type</a>'s universal API, and that you can implement all other methods or functions with it. I admit that I haven't done much research in the form of perusing existing literature, but at least it seems to be the case conspicuously often.
</p>
<p>
As it is here.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">TResult</span> <span style="font-weight:bold;color:#74531f;">Match</span><<span style="color:#2b91af;">TResult</span>>(
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenEmpty</span>,
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenNode</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> root
.<span style="font-weight:bold;color:#74531f;">Aggregate</span>(
() => (tree: <span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>(), result: <span style="font-weight:bold;color:#1f377f;">whenEmpty</span>()),
(<span style="font-weight:bold;color:#1f377f;">x</span>, <span style="font-weight:bold;color:#1f377f;">l</span>, <span style="font-weight:bold;color:#1f377f;">r</span>) => (
<span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>(<span style="font-weight:bold;color:#1f377f;">x</span>, <span style="font-weight:bold;color:#1f377f;">l</span>.tree, <span style="font-weight:bold;color:#1f377f;">r</span>.tree),
<span style="font-weight:bold;color:#1f377f;">whenNode</span>(<span style="font-weight:bold;color:#1f377f;">x</span>, <span style="font-weight:bold;color:#1f377f;">l</span>.tree, <span style="font-weight:bold;color:#1f377f;">r</span>.tree)))
.result;
}</pre>
</p>
<p>
Now, I readily admit that it took me a couple of hours tossing and turning in my bed before this solution came to me. I don't find it intuitive at all, but it works.
</p>
<p>
The <code>Aggregate</code> method requires that the <code>whenNode</code> function's <em>left</em> and <em>right</em> values are of <em>the same</em> <code>TResult</code> type as the return type. How do we reconcile that requirement with the <code>Match</code> method's variation, where <em>its</em> <code>whenNode</code> function requires the <em>left</em> and <em>right</em> values to be <code><span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>></code> values, while the return type is still <code>TResult</code>?
</p>
<p>
The way out of this conundrum, it turns out, is to combine both in a tuple. Thus, when <code>Match</code> calls <code>Aggregate</code>, the implied <code>TResult</code> type is <em>not</em> the <code>TResult</code> visible in the <code>Match</code> method declaration. Rather, it's inferred to be of the type <code>(<span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>, <span style="color:#2b91af;">TResult</span>)</code>. That is, a tuple where the first element is a <code><span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>></code> value, and the second element is a <code><span style="color:#2b91af;">TResult</span></code> value. The C# compiler's type inference engine then figures out that <code>(<span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>, <span style="color:#2b91af;">TResult</span>)</code> must also be the return type of the <code>Aggregate</code> method call.
</p>
<p>
That's not what <code>Match</code> should return, but the second tuple element contains a value of the correct type, so it returns that. Since I've given the tuple elements names, the <code>Match</code> implementation accomplishes that by returning the <code>result</code> tuple field.
</p>
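<p>
If the inference is hard to follow, it may help to see the same method with the tuple type spelled out as an explicit type argument. The following sketch is a compact restatement of the tree class with that one change; the <code>Describe</code> helper and the output format are assumptions made for the sake of the example.
</p>

```csharp
using System;

BinaryTree<int> tree = new(42, new(), new(2, new(), new()));

// Hypothetical helper that describes only the root of a (sub)tree.
string Describe(BinaryTree<int> t) =>
    t.Match(() => "empty", (x, _, _) => $"node {x}");

// Match hands the immediate children to whenNode as trees, not as results.
string description = tree.Match(
    () => "empty",
    (x, l, r) => $"{x} (left: {Describe(l)}, right: {Describe(r)})");

Console.WriteLine(description); // 42 (left: empty, right: node 2)

// Compact restatement of the article's class, reduced to what Match needs.
public sealed class BinaryTree<T>
{
    private readonly IBinaryTree root;

    private BinaryTree(IBinaryTree root) => this.root = root;

    public BinaryTree() : this(Empty.Instance) { }

    public BinaryTree(T value, BinaryTree<T> left, BinaryTree<T> right)
        : this(new Node(value, left.root, right.root)) { }

    public TResult Match<TResult>(
        Func<TResult> whenEmpty,
        Func<T, BinaryTree<T>, BinaryTree<T>, TResult> whenNode)
    {
        // The type argument is the tuple type that the compiler would
        // otherwise infer. It is not the TResult of Match itself.
        return root
            .Aggregate<(BinaryTree<T> tree, TResult result)>(
                () => (new BinaryTree<T>(), whenEmpty()),
                (x, l, r) => (
                    new BinaryTree<T>(x, l.tree, r.tree),
                    whenNode(x, l.tree, r.tree)))
            .result;
    }

    private interface IBinaryTree
    {
        TResult Aggregate<TResult>(
            Func<TResult> whenEmpty,
            Func<T, TResult, TResult, TResult> whenNode);
    }

    private sealed class Empty : IBinaryTree
    {
        public static readonly Empty Instance = new();

        private Empty() { }

        public TResult Aggregate<TResult>(
            Func<TResult> whenEmpty,
            Func<T, TResult, TResult, TResult> whenNode) =>
            whenEmpty();
    }

    private sealed record Node(T Value, IBinaryTree Left, IBinaryTree Right) : IBinaryTree
    {
        public TResult Aggregate<TResult>(
            Func<TResult> whenEmpty,
            Func<T, TResult, TResult, TResult> whenNode) =>
            whenNode(
                Value,
                Left.Aggregate(whenEmpty, whenNode),
                Right.Aggregate(whenEmpty, whenNode));
    }
}
```

<p>
Note that, as written, the fold visits the entire tree, so <code>whenNode</code> runs once for every node and only the result computed at the root is kept.
</p>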
<h3 id="816773c095624bfcb5cced827ba76455">
Breadcrumbs <a href="#816773c095624bfcb5cced827ba76455">#</a>
</h3>
<p>
That's just the tree that we want to zip. So far, we can only move from root to branches, but not the other way. Before we can define a Zipper for the tree, we need a data structure to store breadcrumbs (the navigation log, if you will).
</p>
<p>
In Haskell it's just another one-liner, but in C# this requires another full-fledged class:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>></pre>
</p>
<p>
It's another sum type, so once more, I make the constructor private and use a <code>private</code> class field for the implementation:
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">ICrumb</span> imp;
<span style="color:blue;">private</span> <span style="color:#2b91af;">Crumb</span>(<span style="color:#2b91af;">ICrumb</span> <span style="font-weight:bold;color:#1f377f;">imp</span>)
{
<span style="color:blue;">this</span>.imp = <span style="font-weight:bold;color:#1f377f;">imp</span>;
}
<span style="color:blue;">internal</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>> <span style="color:#74531f;">Left</span>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">value</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">right</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span>(<span style="color:blue;">new</span> <span style="color:#2b91af;">LeftCrumb</span>(<span style="font-weight:bold;color:#1f377f;">value</span>, <span style="font-weight:bold;color:#1f377f;">right</span>));
}
<span style="color:blue;">internal</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>> <span style="color:#74531f;">Right</span>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">value</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">left</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span>(<span style="color:blue;">new</span> <span style="color:#2b91af;">RightCrumb</span>(<span style="font-weight:bold;color:#1f377f;">value</span>, <span style="font-weight:bold;color:#1f377f;">left</span>));
}</pre>
</p>
<p>
To stay consistent throughout the code base, I also use Church encoding to distinguish between a <code>Left</code> and <code>Right</code> breadcrumb, and the technique is similar. First, define a <code>private</code> interface:
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">interface</span> <span style="color:#2b91af;">ICrumb</span>
{
<span style="color:#2b91af;">TResult</span> <span style="font-weight:bold;color:#74531f;">Match</span><<span style="color:#2b91af;">TResult</span>>(
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenLeft</span>,
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenRight</span>);
}</pre>
</p>
<p>
Then, use <code>private</code> nested types to implement the interface.
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">sealed</span> <span style="color:blue;">record</span> <span style="color:#2b91af;">LeftCrumb</span>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">Value</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">Right</span>) : <span style="color:#2b91af;">ICrumb</span>
{
<span style="color:blue;">public</span> <span style="color:#2b91af;">TResult</span> <span style="font-weight:bold;color:#74531f;">Match</span><<span style="color:#2b91af;">TResult</span>>(
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenLeft</span>,
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenRight</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">whenLeft</span>(Value, Right);
}
}</pre>
</p>
<p>
The <code>RightCrumb</code> record is essentially just the 'mirror image' of the <code>LeftCrumb</code> record, and just as was the case with <code><span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>></code>, the <code><span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>></code> class exposes an externally accessible <code>Match</code> method that just delegates to the <code>private</code> class field:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">TResult</span> <span style="font-weight:bold;color:#74531f;">Match</span><<span style="color:#2b91af;">TResult</span>>(
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenLeft</span>,
<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>, <span style="color:#2b91af;">TResult</span>> <span style="font-weight:bold;color:#1f377f;">whenRight</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> imp.<span style="font-weight:bold;color:#74531f;">Match</span>(<span style="font-weight:bold;color:#1f377f;">whenLeft</span>, <span style="font-weight:bold;color:#1f377f;">whenRight</span>);
}</pre>
</p>
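<p>
Since the article doesn't show <code>RightCrumb</code>, here's a sketch of how the pieces might fit together. The <code>RightCrumb</code> record is an assumption based on the 'mirror image' description, and <code>BinaryTree&lt;T&gt;</code> is reduced to an empty placeholder, because its members play no role in this example.
</p>

```csharp
using System;

var crumb = Crumb<string>.Right("parent", new BinaryTree<string>());

Console.WriteLine(crumb.Match(
    whenLeft: (x, _) => $"went left from {x}",
    whenRight: (x, _) => $"went right from {x}")); // went right from parent

// Placeholder for the article's tree class; its members are irrelevant here.
public sealed class BinaryTree<T> { }

public sealed class Crumb<T>
{
    private readonly ICrumb imp;

    private Crumb(ICrumb imp) => this.imp = imp;

    internal static Crumb<T> Left(T value, BinaryTree<T> right) =>
        new(new LeftCrumb(value, right));

    internal static Crumb<T> Right(T value, BinaryTree<T> left) =>
        new(new RightCrumb(value, left));

    public TResult Match<TResult>(
        Func<T, BinaryTree<T>, TResult> whenLeft,
        Func<T, BinaryTree<T>, TResult> whenRight) =>
        imp.Match(whenLeft, whenRight);

    private interface ICrumb
    {
        TResult Match<TResult>(
            Func<T, BinaryTree<T>, TResult> whenLeft,
            Func<T, BinaryTree<T>, TResult> whenRight);
    }

    private sealed record LeftCrumb(T Value, BinaryTree<T> Right) : ICrumb
    {
        public TResult Match<TResult>(
            Func<T, BinaryTree<T>, TResult> whenLeft,
            Func<T, BinaryTree<T>, TResult> whenRight) =>
            whenLeft(Value, Right);
    }

    // Assumed shape: stores the left sibling and calls whenRight.
    private sealed record RightCrumb(T Value, BinaryTree<T> Left) : ICrumb
    {
        public TResult Match<TResult>(
            Func<T, BinaryTree<T>, TResult> whenLeft,
            Func<T, BinaryTree<T>, TResult> whenRight) =>
            whenRight(Value, Left);
    }
}
```

<p>
A left breadcrumb remembers the right subtree that was left behind, and a right breadcrumb remembers the left one.
</p>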
<p>
Finally, all the building blocks are ready for the actual Zipper.
</p>
<h3 id="f345665355144ccfbbc5767d75f48ece">
Zipper data structure and initialization <a href="#f345665355144ccfbbc5767d75f48ece">#</a>
</h3>
<p>
In the Haskell code, the Zipper is another one-liner, and really just a type alias. In C#, once more, we're going to need a full class.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>></pre>
</p>
<p>
The Haskell article simply calls this type alias <code>Zipper</code>, but I find that name too general, since there's more than one kind of Zipper. I think I understand that the article chooses that name for didactic reasons, but here I've chosen a more consistent disambiguation scheme, so I've named the class <code><span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>></code>.
</p>
<p>
The Haskell example is just a type alias for a tuple, and the C# class is similar, although with significantly more <a href="/2019/12/16/zone-of-ceremony">ceremony</a>:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> Tree { <span style="color:blue;">get</span>; }
<span style="color:blue;">public</span> <span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>>> Breadcrumbs { <span style="color:blue;">get</span>; }
<span style="color:blue;">private</span> <span style="color:#2b91af;">BinaryTreeZipper</span>(
<span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">tree</span>,
<span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">Crumb</span><<span style="color:#2b91af;">T</span>>> <span style="font-weight:bold;color:#1f377f;">breadcrumbs</span>)
{
Tree = <span style="font-weight:bold;color:#1f377f;">tree</span>;
Breadcrumbs = <span style="font-weight:bold;color:#1f377f;">breadcrumbs</span>;
}
<span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTreeZipper</span>(<span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">tree</span>) : <span style="color:blue;">this</span>(<span style="font-weight:bold;color:#1f377f;">tree</span>, [])
{
}</pre>
</p>
<p>
I've here chosen to add an extra bit of <a href="/2022/10/24/encapsulation-in-functional-programming">encapsulation</a> by making the master constructor <code>private</code>. This prevents client code from creating an arbitrary object with breadcrumbs without having navigated through the tree. To be honest, I don't think it violates any contract even if we allow this, but it at least highlights that the role of <code>Breadcrumbs</code> is to keep a log of what previously happened to the object.
</p>
<h3 id="9cafe5fe05cd4d619b8d50cd3a86f549">
Navigation <a href="#9cafe5fe05cd4d619b8d50cd3a86f549">#</a>
</h3>
<p>
We can now reproduce the navigation functions from the Haskell article.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>>? <span style="font-weight:bold;color:#74531f;">GoLeft</span>()
{
<span style="font-weight:bold;color:#8f08c4;">return</span> Tree.<span style="font-weight:bold;color:#74531f;">Match</span><<span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>>?>(
<span style="font-weight:bold;color:#1f377f;">whenEmpty</span>: () => <span style="color:blue;">null</span>,
<span style="font-weight:bold;color:#1f377f;">whenNode</span>: (<span style="font-weight:bold;color:#1f377f;">x</span>, <span style="font-weight:bold;color:#1f377f;">l</span>, <span style="font-weight:bold;color:#1f377f;">r</span>) => <span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>>(
<span style="font-weight:bold;color:#1f377f;">l</span>,
Breadcrumbs.<span style="font-weight:bold;color:#74531f;">Prepend</span>(<span style="color:#2b91af;">Crumb</span>.<span style="color:#74531f;">Left</span>(<span style="font-weight:bold;color:#1f377f;">x</span>, <span style="font-weight:bold;color:#1f377f;">r</span>))));
}</pre>
</p>
<p>
Going left 'pattern-matches' on the <code>Tree</code> and, if not empty, constructs a new <code>BinaryTreeZipper</code> object with the left tree, and a <code>Left</code> breadcrumb that stores the 'current' node value and the right subtree. If the 'current' node is empty, on the other hand, the method returns <code>null</code>. This possibility is explicitly indicated by the <code><span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>>?</code> return type; notice the question mark, <a href="https://learn.microsoft.com/dotnet/csharp/nullable-references">which indicates that the value may be null</a>. If you're working in a context or language where that feature isn't available, you may instead consider taking advantage of the <a href="/2022/04/25/the-maybe-monad">Maybe monad</a> (which is also what you'd <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatically</a> do in Haskell).
</p>
<p>
The <code>GoRight</code> method is similar to <code>GoLeft</code>.
</p>
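<p>
The article doesn't list <code>GoRight</code>, but a plausible mirror image of <code>GoLeft</code> could look like the following sketch. To keep it short and compilable on its own, the supporting types are deliberately simplified stand-ins, not the article's Church-encoded classes: the tree implements <code>Match</code> directly, and the breadcrumb is a record with a Boolean flag.
</p>

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

BinaryTree<int> tree = new(42, new(), new(2, new(), new()));
var zipper = new BinaryTreeZipper<int>(tree);

var right = zipper.GoRight();
Console.WriteLine(right?.Tree.Match(() => "empty", (x, _, _) => $"node {x}")); // node 2
Console.WriteLine(right?.Breadcrumbs.Count());                                  // 1

// Node 2 has an empty right subtree, so a third step to the right fails.
Console.WriteLine(right?.GoRight()?.GoRight() is null);                         // True

// Stand-in tree: Match is implemented directly instead of via Aggregate.
public sealed class BinaryTree<T>
{
    private readonly (T value, BinaryTree<T> left, BinaryTree<T> right)? node;

    public BinaryTree() { }

    public BinaryTree(T value, BinaryTree<T> left, BinaryTree<T> right) =>
        node = (value, left, right);

    public TResult Match<TResult>(
        Func<TResult> whenEmpty,
        Func<T, BinaryTree<T>, BinaryTree<T>, TResult> whenNode) =>
        node.HasValue
            ? whenNode(node.Value.value, node.Value.left, node.Value.right)
            : whenEmpty();
}

// Stand-in breadcrumb: a flag instead of the article's Crumb<T> class.
public sealed record Crumb<T>(bool IsLeft, T Value, BinaryTree<T> Sibling);

public sealed class BinaryTreeZipper<T>
{
    public BinaryTree<T> Tree { get; }
    public IEnumerable<Crumb<T>> Breadcrumbs { get; }

    private BinaryTreeZipper(BinaryTree<T> tree, IEnumerable<Crumb<T>> breadcrumbs)
    {
        Tree = tree;
        Breadcrumbs = breadcrumbs;
    }

    public BinaryTreeZipper(BinaryTree<T> tree)
        : this(tree, Enumerable.Empty<Crumb<T>>()) { }

    // Assumed mirror image of GoLeft: focus on the right subtree and
    // remember the current value together with the left subtree.
    public BinaryTreeZipper<T>? GoRight() =>
        Tree.Match<BinaryTreeZipper<T>?>(
            whenEmpty: () => null,
            whenNode: (x, l, r) => new BinaryTreeZipper<T>(
                r,
                Breadcrumbs.Prepend(new Crumb<T>(false, x, l))));
}
```

<p>
Moving onto an empty subtree succeeds; it's only the attempt to move further down from an empty node that returns <code>null</code>.
</p>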
<p>
We may also attempt to navigate up in the tree, undoing our last downward move:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>>? <span style="font-weight:bold;color:#74531f;">GoUp</span>()
{
<span style="font-weight:bold;color:#8f08c4;">if</span> (!Breadcrumbs.<span style="font-weight:bold;color:#74531f;">Any</span>())
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">null</span>;
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">head</span> = Breadcrumbs.<span style="font-weight:bold;color:#74531f;">First</span>();
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">tail</span> = Breadcrumbs.<span style="font-weight:bold;color:#74531f;">Skip</span>(1);
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">head</span>.<span style="font-weight:bold;color:#74531f;">Match</span>(
<span style="font-weight:bold;color:#1f377f;">whenLeft</span>: (<span style="font-weight:bold;color:#1f377f;">x</span>, <span style="font-weight:bold;color:#1f377f;">r</span>) => <span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>>(
<span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>(<span style="font-weight:bold;color:#1f377f;">x</span>, Tree, <span style="font-weight:bold;color:#1f377f;">r</span>),
<span style="font-weight:bold;color:#1f377f;">tail</span>),
<span style="font-weight:bold;color:#1f377f;">whenRight</span>: (<span style="font-weight:bold;color:#1f377f;">x</span>, <span style="font-weight:bold;color:#1f377f;">l</span>) => <span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>>(
<span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>(<span style="font-weight:bold;color:#1f377f;">x</span>, <span style="font-weight:bold;color:#1f377f;">l</span>, Tree),
<span style="font-weight:bold;color:#1f377f;">tail</span>));
}</pre>
</p>
<p>
This is another operation that may fail. If we're already at the root of the tree, there are no <code>Breadcrumbs</code>, in which case the only option is to return a value indicating that the operation failed; here, <code>null</code>, but in other languages perhaps <code>None</code> or <code>Nothing</code>.
</p>
<p>
If, on the other hand, there's at least one breadcrumb, the <code>GoUp</code> method uses the most recent one (<code>head</code>) to construct a new <code><span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>></code> object that reconstitutes the opposite (sibling) subtree and the parent node. It does that by 'pattern-matching' on the <code>head</code> breadcrumb, which enables it to distinguish a left breadcrumb from a right breadcrumb.
</p>
<p>
Finally, we may keep trying to <code>GoUp</code> until we reach the root:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#74531f;">TopMost</span>()
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#74531f;">GoUp</span>()?.<span style="font-weight:bold;color:#74531f;">TopMost</span>() ?? <span style="color:blue;">this</span>;
}</pre>
</p>
<p>
You'll see an example of that a little later.
</p>
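<p>
The one-liner in <code>TopMost</code> combines the null-conditional (<code>?.</code>) and null-coalescing (<code>??</code>) operators. To show that pattern in isolation, the following sketch uses a hypothetical <code>Folder</code> record as a stand-in for the Zipper; it has nothing to do with the article's code base apart from the shape of its <code>GoUp</code> and <code>TopMost</code> methods.
</p>

```csharp
#nullable enable
using System;

var root = new Folder("root", null);
var leaf = new Folder("leaf", new Folder("middle", root));

Console.WriteLine(leaf.TopMost().Name); // root
Console.WriteLine(root.TopMost().Name); // root

// Hypothetical stand-in for the Zipper: GoUp returns null at the top.
public sealed record Folder(string Name, Folder? Parent)
{
    public Folder? GoUp() => Parent;

    // When GoUp returns null, '?.' skips the recursive call and yields null,
    // after which '??' falls back to the current object.
    public Folder TopMost() => GoUp()?.TopMost() ?? this;
}
```

<p>
The recursion stops at the first object for which <code>GoUp</code> returns <code>null</code>, which is then returned unchanged.
</p>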
<h3 id="56a16be50dc4405d8931e9210895b5a0">
Modifications <a href="#56a16be50dc4405d8931e9210895b5a0">#</a>
</h3>
<p>
Continuing the port of the Haskell code, we can <code>Modify</code> the current node with a function:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#74531f;">Modify</span>(<span style="color:#2b91af;">Func</span><<span style="color:#2b91af;">T</span>, <span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">f</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>>(
Tree.<span style="font-weight:bold;color:#74531f;">Match</span>(
<span style="font-weight:bold;color:#1f377f;">whenEmpty</span>: () => <span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>(),
<span style="font-weight:bold;color:#1f377f;">whenNode</span>: (<span style="font-weight:bold;color:#1f377f;">x</span>, <span style="font-weight:bold;color:#1f377f;">l</span>, <span style="font-weight:bold;color:#1f377f;">r</span>) => <span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>>(<span style="font-weight:bold;color:#1f377f;">f</span>(<span style="font-weight:bold;color:#1f377f;">x</span>), <span style="font-weight:bold;color:#1f377f;">l</span>, <span style="font-weight:bold;color:#1f377f;">r</span>)),
Breadcrumbs);
}</pre>
</p>
<p>
This operation always succeeds, since it chooses to ignore the change if the tree is empty. Thus, there's no question mark on the return type, indicating that the method never returns <code>null</code>.
</p>
<p>
Finally, we may replace a node with a new subtree:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#74531f;">Attach</span>(<span style="color:#2b91af;">BinaryTree</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">tree</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:#2b91af;">T</span>>(<span style="font-weight:bold;color:#1f377f;">tree</span>, Breadcrumbs);
}</pre>
</p>
<p>
The following unit test demonstrates a combination of several of the methods shown above:
</p>
<p>
<pre>[<span style="color:#2b91af;">Fact</span>]
<span style="color:blue;">public</span> <span style="color:blue;">void</span> <span style="font-weight:bold;color:#74531f;">AttachAndGoTopMost</span>()
{
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">sut</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">BinaryTreeZipper</span><<span style="color:blue;">char</span>>(freeTree);
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">farLeft</span> = <span style="font-weight:bold;color:#1f377f;">sut</span>.<span style="font-weight:bold;color:#74531f;">GoLeft</span>()?.<span style="font-weight:bold;color:#74531f;">GoLeft</span>()?.<span style="font-weight:bold;color:#74531f;">GoLeft</span>()?.<span style="font-weight:bold;color:#74531f;">GoLeft</span>();
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">actual</span> = <span style="font-weight:bold;color:#1f377f;">farLeft</span>?.<span style="font-weight:bold;color:#74531f;">Attach</span>(<span style="color:blue;">new</span>(<span style="color:#a31515;">'Z'</span>, <span style="color:blue;">new</span>(), <span style="color:blue;">new</span>())).<span style="font-weight:bold;color:#74531f;">TopMost</span>();
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">NotNull</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>);
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>(
<span style="color:blue;">new</span>(<span style="color:#a31515;">'P'</span>,
<span style="color:blue;">new</span>(<span style="color:#a31515;">'O'</span>,
<span style="color:blue;">new</span>(<span style="color:#a31515;">'L'</span>,
<span style="color:blue;">new</span>(<span style="color:#a31515;">'N'</span>,
<span style="color:blue;">new</span>(<span style="color:#a31515;">'Z'</span>, <span style="color:blue;">new</span>(), <span style="color:blue;">new</span>()),
<span style="color:blue;">new</span>()),
<span style="color:blue;">new</span>(<span style="color:#a31515;">'T'</span>, <span style="color:blue;">new</span>(), <span style="color:blue;">new</span>())),
<span style="color:blue;">new</span>(<span style="color:#a31515;">'Y'</span>,
<span style="color:blue;">new</span>(<span style="color:#a31515;">'S'</span>, <span style="color:blue;">new</span>(), <span style="color:blue;">new</span>()),
<span style="color:blue;">new</span>(<span style="color:#a31515;">'A'</span>, <span style="color:blue;">new</span>(), <span style="color:blue;">new</span>()))),
<span style="color:blue;">new</span>(<span style="color:#a31515;">'L'</span>,
<span style="color:blue;">new</span>(<span style="color:#a31515;">'W'</span>,
<span style="color:blue;">new</span>(<span style="color:#a31515;">'C'</span>, <span style="color:blue;">new</span>(), <span style="color:blue;">new</span>()),
<span style="color:blue;">new</span>(<span style="color:#a31515;">'R'</span>, <span style="color:blue;">new</span>(), <span style="color:blue;">new</span>())),
<span style="color:blue;">new</span>(<span style="color:#a31515;">'A'</span>,
<span style="color:blue;">new</span>(<span style="color:#a31515;">'A'</span>, <span style="color:blue;">new</span>(), <span style="color:blue;">new</span>()),
<span style="color:blue;">new</span>(<span style="color:#a31515;">'C'</span>, <span style="color:blue;">new</span>(), <span style="color:blue;">new</span>())))),
<span style="font-weight:bold;color:#1f377f;">actual</span>.Tree);
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Empty</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>.Breadcrumbs);
}</pre>
</p>
<p>
The test starts with <code>freeTree</code> (not shown) and first navigates to the leftmost empty node. Here it uses <code>Attach</code> to add a new 'singleton' subtree with the value <code>'Z'</code>. Finally, it uses <code>TopMost</code> to return to the root node.
</p>
<p>
In <a href="/2013/06/24/a-heuristic-for-formatting-code-according-to-the-aaa-pattern">the Assert phase</a>, the test verifies that the <code>actual</code> object contains the expected values.
</p>
<h3 id="8eaa9438655f4bcbb9447796a7ed7154">
Conclusion <a href="#8eaa9438655f4bcbb9447796a7ed7154">#</a>
</h3>
<p>
The Tree Zipper shown here is a port of the example given in the Haskell <a href="https://learnyouahaskell.com/zippers">Zippers article</a>. As I've already discussed in the <a href="/2024/08/19/zippers">introduction article</a>, this data structure doesn't make much sense in C#, where you can easily implement a navigable tree with two-way links. Even if this requires state mutation, you can package such a data structure in a proper object with good <a href="/encapsulation-and-solid">encapsulation</a>, so that operations don't leave any dangling pointers or the like.
</p>
<p>
As far as I can tell, the code shown in this article isn't useful in production code, but I hope that, at least, you still learned something from it. I always learn a new thing or two from <a href="/2020/01/13/on-doing-katas">doing programming exercises</a> and writing about them, and this was no exception.
</p>
<p>
In the next article, I continue with the last of the Haskell article's three examples.
</p>
<p>
<strong>Next:</strong> <a href="/2024/09/23/fszipper-in-c">FSZipper in C#</a>.
</p>
</div><hr>
This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>.
Keeping cross-cutting concerns out of application code
https://blog.ploeh.dk/2024/09/02/keeping-cross-cutting-concerns-out-of-application-code
2024-09-02T06:19:00+00:00
Mark Seemann
<div id="post">
<p>
<em>Don't inject third-party dependencies. Use Decorators.</em>
</p>
<p>
I recently came across <a href="https://stackoverflow.com/q/78887199/126014">a Stack Overflow question</a> that reminded me of a topic I've been meaning to write about for a long time: <a href="https://en.wikipedia.org/wiki/Cross-cutting_concern">Cross-cutting concerns</a>.
</p>
<p>
When it comes to <a href="https://en.wikipedia.org/wiki/Casablanca_(film)">the usual suspects</a> (logging, fault tolerance, caching), the best solution is usually to apply the <a href="https://en.wikipedia.org/wiki/Decorator_pattern">Decorator pattern</a>.
</p>
<p>
I often see code that uses Dependency Injection (DI) to inject, say, a logging interface into application code. You can see an example of that in <a href="/2020/03/23/repeatable-execution">Repeatable execution</a>, as well as a suggestion for a better design. Not surprisingly, the better design involves logging Decorators.
</p>
<p>
The Stack Overflow question isn't about logging, but rather about fault tolerance; <a href="https://martinfowler.com/bliki/CircuitBreaker.html">Circuit Breaker</a>, retry policies, timeouts, etc.
</p>
<h3 id="02d07297ea6341c6aef55c0fcb76678c">
Injected concern <a href="#02d07297ea6341c6aef55c0fcb76678c">#</a>
</h3>
<p>
The question does a good job of presenting a minimal, reproducible example. At the outset, the code looks like this:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">MyApi</span>
{
<span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">ResiliencePipeline</span> pipeline;
<span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">IOrganizationService</span> service;
<span style="color:blue;">public</span> <span style="color:#2b91af;">MyApi</span>(<span style="color:#2b91af;">ResiliencePipelineProvider</span><<span style="color:blue;">string</span>> <span style="font-weight:bold;color:#1f377f;">provider</span>, <span style="color:#2b91af;">IOrganizationService</span> <span style="font-weight:bold;color:#1f377f;">service</span>)
{
<span style="color:blue;">this</span>.pipeline = <span style="font-weight:bold;color:#1f377f;">provider</span>.<span style="font-weight:bold;color:#74531f;">GetPipeline</span>(<span style="color:#a31515;">"retry-pipeline"</span>);
<span style="color:blue;">this</span>.service = <span style="font-weight:bold;color:#1f377f;">service</span>;
}
<span style="color:blue;">public</span> <span style="color:#2b91af;">List</span><<span style="color:blue;">string</span>> <span style="font-weight:bold;color:#74531f;">GetSomething</span>(<span style="color:#2b91af;">QueryByAttribute</span> <span style="font-weight:bold;color:#1f377f;">query</span>)
{
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">result</span> = <span style="color:blue;">this</span>.pipeline.<span style="font-weight:bold;color:#74531f;">Execute</span>(() => service.<span style="font-weight:bold;color:#74531f;">RetrieveMultiple</span>(<span style="font-weight:bold;color:#1f377f;">query</span>));
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">result</span>.Entities.<span style="font-weight:bold;color:#74531f;">Cast</span><<span style="color:blue;">string</span>>().<span style="font-weight:bold;color:#74531f;">ToList</span>();
}
}</pre>
</p>
<p>
The Stack Overflow question asks how to test this implementation, but I'd rather take the example as an opportunity to discuss design alternatives. Not surprisingly, it turns out that with a more decoupled design, testing becomes easier, too.
</p>
<p>
Before we proceed, a few words about this example code. I assume that this isn't <a href="https://stackoverflow.com/users/3805597">Andy Cooke</a>'s actual production code. Rather, I interpret it as a reduced example that highlights the actual question. This is important because you might ask: <em>Why bother testing two lines of code?</em>
</p>
<p>
Indeed, as presented, the <code>GetSomething</code> method is <a href="/2018/11/12/what-to-test-and-not-to-test">so simple that you may consider not testing it</a>. Thus, I interpret the second line of code as a stand-in for more complicated production code. Hold on to that thought, because once I'm done, that's all that's going to be left, and you may then think that it's so simple that it really doesn't warrant all this hoo-ha.
</p>
<h3 id="32211a755e0a4b9bbd04a049ddbba0c8">
Coupling <a href="#32211a755e0a4b9bbd04a049ddbba0c8">#</a>
</h3>
<p>
As shown, the <code>MyApi</code> class is coupled to <a href="https://www.thepollyproject.org/">Polly</a>, because <code>ResiliencePipeline</code> is defined by that library. To be clear, all I've heard is that Polly is a fine library. I've used it for a few projects myself, but I also admit that I don't have that much experience with it. I'd probably use it again the next time I need a Circuit Breaker or similar, so the following discussion isn't a denouncement of Polly. Rather, it applies to all third-party dependencies, or perhaps even dependencies that are part of your language's base library.
</p>
<p>
Coupling is a major cause of <a href="https://en.wikipedia.org/wiki/Spaghetti_code">spaghetti code</a> and code rot in general. To write sustainable code, you should be cognizant of coupling. The most decoupled code is <a href="/2022/11/21/decouple-to-delete">code that you can easily delete</a>.
</p>
<p>
This doesn't mean that you shouldn't use high-quality third-party libraries like Polly. Among myriads of software engineering heuristics, we know that we should be aware of the <a href="https://en.wikipedia.org/wiki/Not_invented_here">not-invented-here syndrome</a>.
</p>
<p>
When it comes to classic cross-cutting concerns, the Decorator pattern is usually a better design than injecting the concern into application code. The above example clearly looks innocuous, but imagine injecting a <code>ResiliencePipeline</code>, a logger, and perhaps a caching service, and your real application code eventually disappears in 'infrastructure code'.
</p>
<p>
It's not that we don't want to have these third-party dependencies, but rather that we want to move them somewhere else.
</p>
<h3 id="67a215289ba944b984b4d113b10e419c">
Resilient Decorator <a href="#67a215289ba944b984b4d113b10e419c">#</a>
</h3>
<p>
The concern in the above example is the desire to make the <code>IOrganizationService</code> dependency more resilient. The <code>MyApi</code> class only becomes more resilient as a transitive effect. The first refactoring step, then, is to introduce a resilient Decorator.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">ResilientOrganizationService</span>(
<span style="color:#2b91af;">ResiliencePipeline</span> <span style="font-weight:bold;color:#1f377f;">pipeline</span>,
<span style="color:#2b91af;">IOrganizationService</span> <span style="font-weight:bold;color:#1f377f;">inner</span>) : <span style="color:#2b91af;">IOrganizationService</span>
{
<span style="color:blue;">public</span> <span style="color:#2b91af;">QueryResult</span> <span style="font-weight:bold;color:#74531f;">RetrieveMultiple</span>(<span style="color:#2b91af;">QueryByAttribute</span> <span style="font-weight:bold;color:#1f377f;">query</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">pipeline</span>.<span style="font-weight:bold;color:#74531f;">Execute</span>(() => <span style="font-weight:bold;color:#1f377f;">inner</span>.<span style="font-weight:bold;color:#74531f;">RetrieveMultiple</span>(<span style="font-weight:bold;color:#1f377f;">query</span>));
}
}</pre>
</p>
<p>
As Decorators must, this class composes another <code>IOrganizationService</code> while also implementing that interface itself. It does so by being an <a href="https://en.wikipedia.org/wiki/Adapter_pattern">Adapter</a> over the Polly API.
</p>
<p>
I've applied <a href="https://vuscode.wordpress.com/2009/10/16/inversion-of-control-single-responsibility-principle-and-nikola-s-laws-of-dependency-injection/">Nikola Malovic's 4th law of DI</a>:
</p>
<blockquote>
<p>
"Every constructor of a class being resolved should not have any implementation other then accepting a set of its own dependencies."
</p>
<footer><cite><a href="https://vuscode.wordpress.com/2009/10/16/inversion-of-control-single-responsibility-principle-and-nikola-s-laws-of-dependency-injection/">Inversion Of Control, Single Responsibility Principle and Nikola’s laws of dependency injection</a></cite>, Nikola Malovic, 2009</footer>
</blockquote>
<p>
Instead of injecting a <code><span style="color:#2b91af;">ResiliencePipelineProvider</span><<span style="color:blue;">string</span>></code> only to call <code>GetPipeline</code> on it, it just receives a <code>ResiliencePipeline</code> and saves the object for use in the <code>RetrieveMultiple</code> method. It does that via a <a href="https://learn.microsoft.com/dotnet/csharp/programming-guide/classes-and-structs/instance-constructors#primary-constructors">primary constructor</a>, which is a recent C# language addition. It's just syntactic sugar for Constructor Injection, and as usual <a href="https://fsharp.org/">F#</a> developers should feel right at home.
</p>
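<p>
To make the 'syntactic sugar' remark concrete, here's a sketch of roughly the same Decorator written with an explicit constructor and fields. The <code>QueryByAttribute</code>, <code>QueryResult</code>, and <code>IOrganizationService</code> declarations are minimal stand-ins, assumed here only so that the sketch compiles on its own with the Polly package referenced; the real types may look different. <code>ExplicitResilientOrganizationService</code> is a hypothetical name, chosen to avoid a clash with the class above.
</p>
<p>
<pre>using Polly;

// Assumed stand-ins for the example's types; the real ones may differ.
public sealed record QueryByAttribute;
public sealed record QueryResult(string[] Entities);

public interface IOrganizationService
{
    QueryResult RetrieveMultiple(QueryByAttribute query);
}

// Roughly what the primary-constructor version expresses, spelled out.
public sealed class ExplicitResilientOrganizationService : IOrganizationService
{
    private readonly ResiliencePipeline pipeline;
    private readonly IOrganizationService inner;

    public ExplicitResilientOrganizationService(
        ResiliencePipeline pipeline,
        IOrganizationService inner)
    {
        // The constructor only accepts and stores its dependencies.
        this.pipeline = pipeline;
        this.inner = inner;
    }

    public QueryResult RetrieveMultiple(QueryByAttribute query)
    {
        // Delegate to the inner service through the Polly pipeline.
        return pipeline.Execute(() => inner.RetrieveMultiple(query));
    }
}</pre>
</p>
<p>
One small difference is that the explicit fields are <code>readonly</code>, whereas captured primary constructor parameters aren't. For a class this small, that hardly matters.
</p>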
<h3 id="8e967ac0c4ea4323b280e7a665825903">
Simplifying MyApi <a href="#8e967ac0c4ea4323b280e7a665825903">#</a>
</h3>
<p>
Now that you have a resilient version of <code>IOrganizationService</code>, you don't need to have any Polly code in <code>MyApi</code>. Remove it and simplify:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">MyApi</span>
{
<span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">IOrganizationService</span> service;
<span style="color:blue;">public</span> <span style="color:#2b91af;">MyApi</span>(<span style="color:#2b91af;">IOrganizationService</span> <span style="font-weight:bold;color:#1f377f;">service</span>)
{
<span style="color:blue;">this</span>.service = <span style="font-weight:bold;color:#1f377f;">service</span>;
}
<span style="color:blue;">public</span> <span style="color:#2b91af;">List</span><<span style="color:blue;">string</span>> <span style="font-weight:bold;color:#74531f;">GetSomething</span>(<span style="color:#2b91af;">QueryByAttribute</span> <span style="font-weight:bold;color:#1f377f;">query</span>)
{
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">result</span> = service.<span style="font-weight:bold;color:#74531f;">RetrieveMultiple</span>(<span style="font-weight:bold;color:#1f377f;">query</span>);
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">result</span>.Entities.<span style="font-weight:bold;color:#74531f;">Cast</span><<span style="color:blue;">string</span>>().<span style="font-weight:bold;color:#74531f;">ToList</span>();
}
}</pre>
</p>
<p>
As promised, there's almost nothing left of it now, but I'll remind you that I consider the second line of <code>GetSomething</code> as a stand-in for something more complicated that you might need to test. As it is now, though, testing it is trivial:
</p>
<p>
<pre>[<span style="color:#2b91af;">Theory</span>]
[<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">"foo"</span>, <span style="color:#a31515;">"bar"</span>, <span style="color:#a31515;">"baz"</span>)]
[<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">"qux"</span>, <span style="color:#a31515;">"quux"</span>, <span style="color:#a31515;">"corge"</span>)]
[<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">"grault"</span>, <span style="color:#a31515;">"garply"</span>, <span style="color:#a31515;">"waldo"</span>)]
<span style="color:blue;">public</span> <span style="color:blue;">void</span> <span style="font-weight:bold;color:#74531f;">GetSomething</span>(<span style="color:blue;">params</span> <span style="color:blue;">string</span>[] <span style="font-weight:bold;color:#1f377f;">expected</span>)
{
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">service</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">Mock</span><<span style="color:#2b91af;">IOrganizationService</span>>();
<span style="font-weight:bold;color:#1f377f;">service</span>
.<span style="font-weight:bold;color:#74531f;">Setup</span>(<span style="font-weight:bold;color:#1f377f;">s</span> => <span style="font-weight:bold;color:#1f377f;">s</span>.<span style="font-weight:bold;color:#74531f;">RetrieveMultiple</span>(<span style="color:blue;">new</span> <span style="color:#2b91af;">QueryByAttribute</span>()))
.<span style="font-weight:bold;color:#74531f;">Returns</span>(<span style="color:blue;">new</span> <span style="color:#2b91af;">QueryResult</span>(<span style="font-weight:bold;color:#1f377f;">expected</span>));
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">sut</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">MyApi</span>(<span style="font-weight:bold;color:#1f377f;">service</span>.Object);
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">actual</span> = <span style="font-weight:bold;color:#1f377f;">sut</span>.<span style="font-weight:bold;color:#74531f;">GetSomething</span>(<span style="color:blue;">new</span> <span style="color:#2b91af;">QueryByAttribute</span>());
<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>(<span style="font-weight:bold;color:#1f377f;">expected</span>, <span style="font-weight:bold;color:#1f377f;">actual</span>);
}</pre>
</p>
<p>
The larger point, however, is that not only have you now managed to keep third-party dependencies out of your application code, you've also simplified it and made it easier to test.
</p>
<h3 id="c9dd80b39e234c6595ef31de1fea30c2">
Composition <a href="#c9dd80b39e234c6595ef31de1fea30c2">#</a>
</h3>
<p>
You can still create a resilient <code>MyApi</code> object in your <a href="/2011/07/28/CompositionRoot">Composition Root</a>:
</p>
<p>
<pre><span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">service</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">ResilientOrganizationService</span>(<span style="font-weight:bold;color:#1f377f;">pipeline</span>, <span style="font-weight:bold;color:#1f377f;">inner</span>);
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">myApi</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">MyApi</span>(<span style="font-weight:bold;color:#1f377f;">service</span>);</pre>
</p>
<p>
Decomposing the problem in this way, you decouple your application code from third-party dependencies. You can define <code>ResilientOrganizationService</code> in the application's Composition Root, which also keeps the Polly dependency there. Even so, you can implement <code>MyApi</code> as part of your application layer.
</p>
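<p>
The same composition extends to other cross-cutting concerns: each one becomes another Decorator that you stack in the Composition Root. The following is only a sketch under stated assumptions. <code>LoggingOrganizationService</code>, <code>FakeOrganizationService</code>, and <code>CompositionRoot</code> are hypothetical classes, the log is just an <code>Action</code> delegate standing in for whatever logging API you'd really use, and the three supporting types are minimal stand-ins that make the sketch self-contained.
</p>
<p>
<pre>using System;

// Assumed stand-ins for the example's types; the real ones may differ.
public sealed record QueryByAttribute;
public sealed record QueryResult(string[] Entities);

public interface IOrganizationService
{
    QueryResult RetrieveMultiple(QueryByAttribute query);
}

// Hypothetical logging Decorator. It knows nothing about MyApi.
public sealed class LoggingOrganizationService(
    Action<string> log,
    IOrganizationService inner) : IOrganizationService
{
    public QueryResult RetrieveMultiple(QueryByAttribute query)
    {
        log("Retrieving entities.");
        var result = inner.RetrieveMultiple(query);
        log($"Retrieved {result.Entities.Length} entities.");
        return result;
    }
}

// Hypothetical innermost implementation, returning canned data.
public sealed class FakeOrganizationService : IOrganizationService
{
    public QueryResult RetrieveMultiple(QueryByAttribute query)
    {
        return new QueryResult(["foo", "bar"]);
    }
}

public static class CompositionRoot
{
    // Stack Decorators from the inside out.
    public static IOrganizationService Compose(Action<string> log)
    {
        return new LoggingOrganizationService(log, new FakeOrganizationService());
    }
}</pre>
</p>
<p>
A consumer such as <code>MyApi</code> receives the composed object as an <code>IOrganizationService</code> and never sees the logging. In a real Composition Root, a resilient Decorator could sit in the same stack.
</p>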
<p>
<img src="/content/binary/polly-in-outer-shell.png" alt="Three circles arranged in layers. In the outer layer, there's a box labelled 'ResilientOrganizationService' and another box labelled 'Polly'. An arrow points from 'ResilientOrganizationService' to 'Polly'. In the second layer in there's a box labelled 'MyApi'. The inner circle is empty." >
</p>
<p>
I usually illustrate <a href="/2013/12/03/layers-onions-ports-adapters-its-all-the-same">Ports and Adapters</a>, or, if you will, <a href="/ref/clean-architecture">Clean Architecture</a> as concentric circles, but in this diagram I've skewed the circles to make space for the boxes. In other words, the diagram is 'not to scale'. Ideally, the outermost layer is much smaller and thinner than any of the other layers. I've also included an inner green layer which indicates the architecture's Domain Model, but since I assume that <code>MyApi</code> is part of some application layer, I've left the Domain Model empty.
</p>
<h3 id="ba7304112c214dd6be84011aea811dbf">
Reasons to decouple <a href="#ba7304112c214dd6be84011aea811dbf">#</a>
</h3>
<p>
Why is it important to decouple application code from Polly? First, keep in mind that in this discussion Polly is just a stand-in for any third-party dependency. It's up to you as a software architect to decide how you'll structure your code, but third-party dependencies are one of the first things I look for. A third-party component changes with time, and often independently of your base platform. You may have to deal with breaking changes or security patches at inopportune times. The organization that maintains the component may cease to operate. This happens to commercial entities and open-source contributors alike, although for different reasons.
</p>
<p>
Second, even a top-tier library like Polly will undergo changes. If your time horizon is five to ten years, you'll be surprised how much things change. You may protest that no-one designs software systems with such a long view, but I think that if you ask the business people involved with your software, they most certainly expect your system to last a long time.
</p>
<p>
I believe that I heard on a podcast that some Microsoft teams had taken a dependency on Polly. Assuming, for the sake of argument, that this is true, while we may not wish to depend on some random open-source component, depending on Polly is safe, right? In the long run, it isn't. Five years ago, you had the same situation with <a href="https://www.newtonsoft.com/json">Json.NET</a>, but then Microsoft hired James Newton-King and had him make a JSON API as part of the .NET base library. While Json.NET isn't dead by any means, now you have two competing JSON libraries, and Microsoft uses their own in the frameworks and libraries that they release.
</p>
<p>
Deciding to decouple your application code from a third-party component is ultimately a question of risk management. It's up to you to make the bet. Do you pay the up-front cost of decoupling, or do you postpone it, hoping it'll never be necessary?
</p>
<p>
I usually do the former, because the cost is low, and there are other benefits as well. As I've already touched on, unit testing becomes easier.
</p>
<h3 id="f126ff285a014a6d85cff276436321c8">
Configuration <a href="#f126ff285a014a6d85cff276436321c8">#</a>
</h3>
<p>
Since Polly only lives in the Composition Root, you'll also need to define the <code>ResiliencePipeline</code> there. You can write the code that creates that pipeline wherever you like, but it might be natural to make it a creation function on the <code>ResilientOrganizationService</code> class:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">static</span> <span style="color:#2b91af;">ResiliencePipeline</span> <span style="color:#74531f;">CreatePipeline</span>()
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">ResiliencePipelineBuilder</span>()
.<span style="font-weight:bold;color:#74531f;">AddRetry</span>(<span style="color:blue;">new</span> <span style="color:#2b91af;">RetryStrategyOptions</span>
{
MaxRetryAttempts = 4
})
.<span style="font-weight:bold;color:#74531f;">AddTimeout</span>(<span style="color:#2b91af;">TimeSpan</span>.<span style="color:#74531f;">FromSeconds</span>(1))
.<span style="font-weight:bold;color:#74531f;">Build</span>();
}</pre>
</p>
<p>
That's just an example, and perhaps not what you'd like to do. Perhaps you'd rather have some of these values defined in a configuration file. Thus, this isn't what you <em>have</em> to do, but rather what you <em>could</em> do.
</p>
<p>
If you use this option, however, you could take the return value of this method and inject it into the <code>ResilientOrganizationService</code> constructor.
</p>
<h3 id="11bbc9df98474c33a6ce0902b13178d4">
Conclusion <a href="#11bbc9df98474c33a6ce0902b13178d4">#</a>
</h3>
<p>
Cross-cutting concerns, like caching, logging, security, or, in this case, fault tolerance, are usually best addressed with the Decorator pattern. In this article, you saw an example of using the Decorator pattern to decouple the concern of fault tolerance from the consumer of the service that you need to handle in a fault-tolerant manner.
</p>
<p>
The specific example dealt with the Polly library, but the point isn't that Polly is a particularly nasty third-party component that you need to protect yourself against. Rather, it just so happened that I came across a Stack Overflow question that used Polly, and I thought it was a nice example.
</p>
<p>
As far as I can tell, Polly is actually one of the top .NET open-source packages, so this article is not a denouncement of Polly. It's just a sketch of how to move useful dependencies around in your code base to make sure that they impact your application code as little as possible.
</p>
</div><hr>
This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>.
A List Zipper in C#
https://blog.ploeh.dk/2024/08/26/a-list-zipper-in-c
2024-08-26T13:19:00+00:00
Mark Seemann
<div id="post">
<p>
<em>A port of a Haskell example, just because.</em>
</p>
<p>
This article is part of <a href="/2024/08/19/zippers">a series about Zippers</a>. In this one, I port the <code>ListZipper</code> data structure from the <a href="https://learnyouahaskell.com/">Learn You a Haskell for Great Good!</a> article also called <a href="https://learnyouahaskell.com/zippers">Zippers</a>.
</p>
<p>
A word of warning: I'm assuming that you're familiar with the contents of that article, so I'll skip the pedagogical explanations; I can hardly do it better than it's done there.
</p>
<p>
The code shown in this article is <a href="https://github.com/ploeh/CSharpZippers">available on GitHub</a>.
</p>
<h3 id="04e3cad425414735aff6a3a0507a9855">
Initialization and structure <a href="#04e3cad425414735aff6a3a0507a9855">#</a>
</h3>
<p>
In the Haskell code, <code>ListZipper</code> is just a type alias, but C# doesn't have that, so instead, we'll have to introduce a class.
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">T</span>> : <span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>></pre>
</p>
<p>
Since it implements <code><span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>></code>, it may be used like any other sequence, but it also comes with some special operations that enable client code to move forward and backward, as well as to insert and remove values.
</p>
<p>
The class has the following fields, properties, and constructors:
</p>
<p>
<pre><span style="color:blue;">private</span> <span style="color:blue;">readonly</span> <span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>> values;

<span style="color:blue;">public</span> <span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>> Breadcrumbs { <span style="color:blue;">get</span>; }

<span style="color:blue;">private</span> <span style="color:#2b91af;">ListZipper</span>(<span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">values</span>, <span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">breadcrumbs</span>)
{
    <span style="color:blue;">this</span>.values = <span style="font-weight:bold;color:#1f377f;">values</span>;
    Breadcrumbs = <span style="font-weight:bold;color:#1f377f;">breadcrumbs</span>;
}

<span style="color:blue;">public</span> <span style="color:#2b91af;">ListZipper</span>(<span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#1f377f;">values</span>) : <span style="color:blue;">this</span>(<span style="font-weight:bold;color:#1f377f;">values</span>, [])
{
}

<span style="color:blue;">public</span> <span style="color:#2b91af;">ListZipper</span>(<span style="color:blue;">params</span> <span style="color:#2b91af;">T</span>[] <span style="font-weight:bold;color:#1f377f;">values</span>) : <span style="color:blue;">this</span>(<span style="font-weight:bold;color:#1f377f;">values</span>.<span style="font-weight:bold;color:#74531f;">AsEnumerable</span>())
{
}</pre>
</p>
<p>
It uses constructor chaining to initialize a <code>ListZipper</code> object with proper <a href="/encapsulation-and-solid">encapsulation</a>. Notice that the master constructor is private. This prevents client code from initializing an object with arbitrary <code>Breadcrumbs</code>. Rather, the <code>Breadcrumbs</code> (the log, if you will) is going to be the result of various operations performed by client code, and only the <code>ListZipper</code> class itself can use this constructor.
</p>
<p>
You may consider the constructor that takes a single <code><span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>></code> as the 'main' <code>public</code> constructor, and the other one as a convenience that enables a client developer to write code like <code><span style="color:blue;">new</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:blue;">string</span>>(<span style="color:#a31515;">"foo"</span>, <span style="color:#a31515;">"bar"</span>, <span style="color:#a31515;">"baz"</span>)</code>.
</p>
<p>
The class' <code><span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>></code> implementation only enumerates the <code>values</code>:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">IEnumerator</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#74531f;">GetEnumerator</span>()
{
    <span style="font-weight:bold;color:#8f08c4;">return</span> values.<span style="font-weight:bold;color:#74531f;">GetEnumerator</span>();
}</pre>
</p>
<p>
In other words, when enumerating a <code>ListZipper</code>, you only get the 'forward' <code>values</code>. Client code may still examine the <code>Breadcrumbs</code>, since this is a <code>public</code> property, but it should have little need for that.
</p>
<p>
(I admit that making <code>Breadcrumbs</code> public is a concession to testability, since it enabled me to write assertions against this property. It's a form of <a href="/2013/04/04/structural-inspection">structural inspection</a>, which is a technique that I use much less than I did a decade ago. Still, in this case, while you may argue that it violates <a href="https://en.wikipedia.org/wiki/Information_hiding">information hiding</a>, it at least doesn't allow client code to put an object in an invalid state. Had the <code>ListZipper</code> class been a part of a reusable library, I would probably have hidden that data, too, but since this is exercise code, I found this an acceptable compromise. Notice, too, that in the original Haskell code, the breadcrumbs are available to client code.)
</p>
<p>
Regular readers of this blog may be aware that <a href="/2013/07/20/linq-versus-the-lsp">I usually favour IReadOnlyCollection<T> over IEnumerable<T></a>. Here, on the other hand, I've allowed <code>values</code> to be any <code><span style="color:#2b91af;">IEnumerable</span><<span style="color:#2b91af;">T</span>></code>, which includes infinite sequences. I decided to do that because Haskell lists, too, may be infinite, and as far as I can tell, <code>ListZipper</code> actually does work with infinite sequences. I have, at least, written a few tests with infinite sequences, and they pass. (I may still have missed an edge case or two. I can't rule that out.)
</p>
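<p>
To illustrate the point about infinite sequences, here's a minimal, self-contained sketch. It assumes a condensed copy of the <code>ListZipper</code> class (movement only) and a hypothetical <code>Naturals</code> helper that isn't part of the original code base. Since LINQ's <code>Take</code>, <code>Skip</code>, and <code>Concat</code> are lazy, each operation only ever touches a finite prefix of the sequence.
</p>
<p>
<pre>using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Move three steps into an infinite sequence, then one step back.
var zipper = new ListZipper<int>(Naturals());
var moved = zipper.GoForward()?.GoForward()?.GoForward()?.GoBack();

// Take is lazy, so only a finite prefix is ever enumerated.
Console.WriteLine(string.Join(", ", moved!.Take(3)));      // 2, 3, 4
Console.WriteLine(string.Join(", ", moved!.Breadcrumbs));  // 1, 0

// Hypothetical helper: the natural numbers as an endless iterator.
static IEnumerable<int> Naturals()
{
    for (var i = 0; ; i++)
        yield return i;
}

// Condensed copy of the class from this article (movement only).
public sealed class ListZipper<T> : IEnumerable<T>
{
    private readonly IEnumerable<T> values;

    public IEnumerable<T> Breadcrumbs { get; }

    private ListZipper(IEnumerable<T> values, IEnumerable<T> breadcrumbs)
    {
        this.values = values;
        Breadcrumbs = breadcrumbs;
    }

    public ListZipper(IEnumerable<T> values) : this(values, [])
    {
    }

    public ListZipper<T>? GoForward()
    {
        var head = values.Take(1);
        if (!head.Any())
            return null;
        return new ListZipper<T>(values.Skip(1), head.Concat(Breadcrumbs));
    }

    public ListZipper<T>? GoBack()
    {
        var head = Breadcrumbs.Take(1);
        if (!head.Any())
            return null;
        return new ListZipper<T>(head.Concat(values), Breadcrumbs.Skip(1));
    }

    public IEnumerator<T> GetEnumerator() => values.GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}</pre>
</p>
<p>
Enumerating <code>moved</code> without <code>Take</code> would, of course, never terminate.
</p>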
<h3 id="908d0fe3cf5d453da3541127ae365d00">
Movement <a href="#908d0fe3cf5d453da3541127ae365d00">#</a>
</h3>
<p>
It's not much fun just being able to initialize an object. You also want to be able to do something with it, such as moving forward:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">T</span>>? <span style="font-weight:bold;color:#74531f;">GoForward</span>()
{
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">head</span> = values.<span style="font-weight:bold;color:#74531f;">Take</span>(1);
    <span style="font-weight:bold;color:#8f08c4;">if</span> (!<span style="font-weight:bold;color:#1f377f;">head</span>.<span style="font-weight:bold;color:#74531f;">Any</span>())
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">null</span>;
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">tail</span> = values.<span style="font-weight:bold;color:#74531f;">Skip</span>(1);
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">T</span>>(<span style="font-weight:bold;color:#1f377f;">tail</span>, <span style="font-weight:bold;color:#1f377f;">head</span>.<span style="font-weight:bold;color:#74531f;">Concat</span>(Breadcrumbs));
}</pre>
</p>
<p>
You can move forward through any <code>IEnumerable</code>, so why make things so complicated? The benefit of this <code>GoForward</code> method (<a href="https://en.wikipedia.org/wiki/Pure_function">function</a>, really) is that it records where it came from, which means that moving backwards becomes an option:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">T</span>>? <span style="font-weight:bold;color:#74531f;">GoBack</span>()
{
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">head</span> = Breadcrumbs.<span style="font-weight:bold;color:#74531f;">Take</span>(1);
    <span style="font-weight:bold;color:#8f08c4;">if</span> (!<span style="font-weight:bold;color:#1f377f;">head</span>.<span style="font-weight:bold;color:#74531f;">Any</span>())
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">null</span>;
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">tail</span> = Breadcrumbs.<span style="font-weight:bold;color:#74531f;">Skip</span>(1);
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">T</span>>(<span style="font-weight:bold;color:#1f377f;">head</span>.<span style="font-weight:bold;color:#74531f;">Concat</span>(values), <span style="font-weight:bold;color:#1f377f;">tail</span>);
}</pre>
</p>
<p>
This test may serve as an example of client code that makes use of those two operations:
</p>
<p>
<pre>[<span style="color:#2b91af;">Fact</span>]
<span style="color:blue;">public</span> <span style="color:blue;">void</span> <span style="font-weight:bold;color:#74531f;">GoBack1</span>()
{
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">sut</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:blue;">int</span>>(1, 2, 3, 4);
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">actual</span> = <span style="font-weight:bold;color:#1f377f;">sut</span>.<span style="font-weight:bold;color:#74531f;">GoForward</span>()?.<span style="font-weight:bold;color:#74531f;">GoForward</span>()?.<span style="font-weight:bold;color:#74531f;">GoForward</span>()?.<span style="font-weight:bold;color:#74531f;">GoBack</span>();
    <span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>([3, 4], <span style="font-weight:bold;color:#1f377f;">actual</span>);
    <span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>([2, 1], <span style="font-weight:bold;color:#1f377f;">actual</span>?.Breadcrumbs);
}</pre>
</p>
<p>
Going forward takes the first element off <code>values</code> and adds it to the front of <code>Breadcrumbs</code>. Going backwards is nearly symmetrical: It takes the first element off the <code>Breadcrumbs</code> and adds it back to the front of the <code>values</code>. Used in this way, <code>Breadcrumbs</code> works as a <a href="https://en.wikipedia.org/wiki/Stack_(abstract_data_type)">stack</a>.
</p>
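<p>
The stack analogy can be made explicit. The following sketch is not how <code>ListZipper</code> is implemented; it swaps the lazy sequences for two <code>ImmutableStack<int></code> values from <code>System.Collections.Immutable</code> and replays the steps of the <code>GoBack1</code> test. It assumes that neither stack is empty when popped, so it doesn't model failure.
</p>
<p>
<pre>using System;
using System.Collections.Immutable;

// CreateRange pushes items in order, so the last item (1) ends up on top.
var values = ImmutableStack.CreateRange(new[] { 4, 3, 2, 1 });
var breadcrumbs = ImmutableStack<int>.Empty;

// Going forward: pop from values, push onto breadcrumbs.
for (var i = 0; i < 3; i++)
{
    values = values.Pop(out var head);
    breadcrumbs = breadcrumbs.Push(head);
}

// Going back: pop from breadcrumbs, push onto values.
breadcrumbs = breadcrumbs.Pop(out var previous);
values = values.Push(previous);

// Enumerating a stack starts at the top.
Console.WriteLine(string.Join(", ", values));       // 3, 4
Console.WriteLine(string.Join(", ", breadcrumbs));  // 2, 1</pre>
</p>
<p>
Both moves are a pop from one stack followed by a push onto the other, which is why the two operations look so alike.
</p>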
<p>
Notice that both <code>GoForward</code> and <code>GoBack</code> admit the possibility of failure. If <code>values</code> is empty, you can't go forward. If <code>Breadcrumbs</code> is empty, you can't go back. In both cases, the functions return <code>null</code>, which is also indicated by the <code><span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">T</span>>?</code> return types; notice the question mark, <a href="https://learn.microsoft.com/dotnet/csharp/nullable-references">which indicates that the value may be null</a>. If you're working in a context or language where that feature isn't available, you may instead consider taking advantage of the <a href="/2022/04/25/the-maybe-monad">Maybe monad</a> (which is also what you'd <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatically</a> do in Haskell).
</p>
<p>
To be clear, the <a href="https://learnyouahaskell.com/zippers">Zippers article</a> does discuss handling failures using Maybe, but only applies it to its binary tree example. Thus, the error handling shown here is my own addition.
</p>
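<p>
As a rough sketch of the Maybe alternative, a minimal <code>Maybe<T></code> class might look like the following. The class, its <code>Match</code> method, and the <code>Half</code> function are hypothetical stand-ins written for illustration; they aren't taken from the Zippers article or from any library. <code>Half</code> plays the role of a partial operation such as <code>GoForward</code>, and <code>SelectMany</code> plays the role that <code>?.</code> plays in the tests above.
</p>
<p>
<pre>using System;

// Both steps succeed: 20 -> 10 -> 5.
var five = Maybe<int>.Some(20).SelectMany(Half).SelectMany(Half);

// The first step fails, so the second one never runs.
var none = Maybe<int>.Some(3).SelectMany(Half).SelectMany(Half);

Console.WriteLine(five.Match("nothing", x => x.ToString()));  // 5
Console.WriteLine(none.Match("nothing", x => x.ToString()));  // nothing

// Hypothetical partial operation: only even numbers can be halved.
static Maybe<int> Half(int x) =>
    x % 2 == 0 ? Maybe<int>.Some(x / 2) : Maybe<int>.None;

// Hypothetical, minimal Maybe type; not a library API.
public sealed class Maybe<T>
{
    private readonly bool hasValue;
    private readonly T value;

    private Maybe(bool hasValue, T value)
    {
        this.hasValue = hasValue;
        this.value = value;
    }

    public static Maybe<T> None { get; } = new Maybe<T>(false, default!);

    public static Maybe<T> Some(T value) => new Maybe<T>(true, value);

    // Monadic bind: run the next step only if there's a value.
    public Maybe<TResult> SelectMany<TResult>(Func<T, Maybe<TResult>> selector) =>
        hasValue ? selector(value) : Maybe<TResult>.None;

    // Handle both cases and leave the Maybe container.
    public TResult Match<TResult>(TResult nothing, Func<T, TResult> just) =>
        hasValue ? just(value) : nothing;
}</pre>
</p>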
<h3 id="704f23586ead4b199b171baa50dfd1da">
Modifications <a href="#704f23586ead4b199b171baa50dfd1da">#</a>
</h3>
<p>
In addition to moving back and forth in the list, we can also modify it. The following operations are also not in the Zippers article, but are rather my own contributions. Adding a new element is easy:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">T</span>> <span style="font-weight:bold;color:#74531f;">Insert</span>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">value</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">T</span>>(values.<span style="font-weight:bold;color:#74531f;">Prepend</span>(<span style="font-weight:bold;color:#1f377f;">value</span>), Breadcrumbs);
}</pre>
</p>
<p>
Notice that this operation is always possible. Even if the list is empty, we can <code>Insert</code> a value. In that case, it just becomes the list's first and only element.
</p>
<p>
A simple test demonstrates usage:
</p>
<p>
<pre>[<span style="color:#2b91af;">Fact</span>]
<span style="color:blue;">public</span> <span style="color:blue;">void</span> <span style="font-weight:bold;color:#74531f;">InsertAtFocus</span>()
{
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">sut</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:blue;">string</span>>(<span style="color:#a31515;">"foo"</span>, <span style="color:#a31515;">"bar"</span>);
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">actual</span> = <span style="font-weight:bold;color:#1f377f;">sut</span>.<span style="font-weight:bold;color:#74531f;">GoForward</span>()?.<span style="font-weight:bold;color:#74531f;">Insert</span>(<span style="color:#a31515;">"ploeh"</span>).<span style="font-weight:bold;color:#74531f;">GoBack</span>();
    <span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">NotNull</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>);
    <span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>([<span style="color:#a31515;">"foo"</span>, <span style="color:#a31515;">"ploeh"</span>, <span style="color:#a31515;">"bar"</span>], <span style="font-weight:bold;color:#1f377f;">actual</span>);
    <span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Empty</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>.Breadcrumbs);
}</pre>
</p>
<p>
Likewise, we may attempt to remove an element from the list:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">T</span>>? <span style="font-weight:bold;color:#74531f;">Remove</span>()
{
    <span style="font-weight:bold;color:#8f08c4;">if</span> (!values.<span style="font-weight:bold;color:#74531f;">Any</span>())
        <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">null</span>;
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">T</span>>(values.<span style="font-weight:bold;color:#74531f;">Skip</span>(1), Breadcrumbs);
}</pre>
</p>
<p>
Contrary to <code>Insert</code>, the <code>Remove</code> operation will fail if <code>values</code> is empty. Notice that this doesn't necessarily imply that the list as such is empty, but only that the focus is at the end of the list (which, of course, never happens if <code>values</code> is infinite):
</p>
<p>
<pre>[<span style="color:#2b91af;">Fact</span>]
<span style="color:blue;">public</span> <span style="color:blue;">void</span> <span style="font-weight:bold;color:#74531f;">RemoveAtEnd</span>()
{
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">sut</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:blue;">string</span>>(<span style="color:#a31515;">"foo"</span>, <span style="color:#a31515;">"bar"</span>).<span style="font-weight:bold;color:#74531f;">GoForward</span>()?.<span style="font-weight:bold;color:#74531f;">GoForward</span>();
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">actual</span> = <span style="font-weight:bold;color:#1f377f;">sut</span>?.<span style="font-weight:bold;color:#74531f;">Remove</span>();
    <span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Null</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>);
    <span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">NotNull</span>(<span style="font-weight:bold;color:#1f377f;">sut</span>);
    <span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Empty</span>(<span style="font-weight:bold;color:#1f377f;">sut</span>);
    <span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>([<span style="color:#a31515;">"bar"</span>, <span style="color:#a31515;">"foo"</span>], <span style="font-weight:bold;color:#1f377f;">sut</span>.Breadcrumbs);
}</pre>
</p>
<p>
In this example, the focus is at the end of the list, so there's nothing to remove. The list, however, is not empty, but all the data currently reside in the <code>Breadcrumbs</code>.
</p>
<p>
Finally, we can combine insertion and removal to implement a replacement operation:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">T</span>>? <span style="font-weight:bold;color:#74531f;">Replace</span>(<span style="color:#2b91af;">T</span> <span style="font-weight:bold;color:#1f377f;">newValue</span>)
{
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#74531f;">Remove</span>()?.<span style="font-weight:bold;color:#74531f;">Insert</span>(<span style="font-weight:bold;color:#1f377f;">newValue</span>);
}</pre>
</p>
<p>
As the name implies, this operation replaces the value currently in focus with a completely different value. Here's an example:
</p>
<p>
<pre>[<span style="color:#2b91af;">Fact</span>]
<span style="color:blue;">public</span> <span style="color:blue;">void</span> <span style="font-weight:bold;color:#74531f;">ReplaceAtFocus</span>()
{
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">sut</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">ListZipper</span><<span style="color:blue;">string</span>>(<span style="color:#a31515;">"foo"</span>, <span style="color:#a31515;">"bar"</span>, <span style="color:#a31515;">"baz"</span>);
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">actual</span> = <span style="font-weight:bold;color:#1f377f;">sut</span>.<span style="font-weight:bold;color:#74531f;">GoForward</span>()?.<span style="font-weight:bold;color:#74531f;">Replace</span>(<span style="color:#a31515;">"qux"</span>)?.<span style="font-weight:bold;color:#74531f;">GoBack</span>();
    <span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">NotNull</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>);
    <span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>([<span style="color:#a31515;">"foo"</span>, <span style="color:#a31515;">"qux"</span>, <span style="color:#a31515;">"baz"</span>], <span style="font-weight:bold;color:#1f377f;">actual</span>);
    <span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Empty</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>.Breadcrumbs);
}</pre>
</p>
<p>
Once more, this may fail if the current focus is empty, so <code>Replace</code> also returns a nullable value.
</p>
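<p>
The failure case can be sketched as follows. The example assumes a condensed copy of the <code>ListZipper</code> class with only the members it needs; the one-element list and the variable names are made up for illustration. With the focus at the end of the list, both <code>Remove</code> and <code>Replace</code> return <code>null</code>, while <code>Insert</code> still succeeds.
</p>
<p>
<pre>using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// One step forward moves the focus past the only element.
var atEnd = new ListZipper<string>(new[] { "foo" }).GoForward();

// Nothing is in focus, so there's nothing to remove or replace...
Console.WriteLine(atEnd?.Remove() is null);        // True
Console.WriteLine(atEnd?.Replace("qux") is null);  // True

// ...but Insert still succeeds, and GoBack restores the full list.
var inserted = atEnd?.Insert("bar").GoBack();
Console.WriteLine(string.Join(", ", inserted!));   // foo, bar

// Condensed copy of the class from this article.
public sealed class ListZipper<T> : IEnumerable<T>
{
    private readonly IEnumerable<T> values;

    public IEnumerable<T> Breadcrumbs { get; }

    private ListZipper(IEnumerable<T> values, IEnumerable<T> breadcrumbs)
    {
        this.values = values;
        Breadcrumbs = breadcrumbs;
    }

    public ListZipper(IEnumerable<T> values) : this(values, [])
    {
    }

    public ListZipper<T>? GoForward()
    {
        var head = values.Take(1);
        if (!head.Any())
            return null;
        return new ListZipper<T>(values.Skip(1), head.Concat(Breadcrumbs));
    }

    public ListZipper<T>? GoBack()
    {
        var head = Breadcrumbs.Take(1);
        if (!head.Any())
            return null;
        return new ListZipper<T>(head.Concat(values), Breadcrumbs.Skip(1));
    }

    public ListZipper<T> Insert(T value) =>
        new ListZipper<T>(values.Prepend(value), Breadcrumbs);

    public ListZipper<T>? Remove() =>
        values.Any() ? new ListZipper<T>(values.Skip(1), Breadcrumbs) : null;

    public ListZipper<T>? Replace(T newValue) => Remove()?.Insert(newValue);

    public IEnumerator<T> GetEnumerator() => values.GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}</pre>
</p>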
<h3 id="5979ae4ab42543f79df4e572f7f5c2c3">
Conclusion <a href="#5979ae4ab42543f79df4e572f7f5c2c3">#</a>
</h3>
<p>
For a C# developer, the <code><span style="color:#2b91af;">ListZipper</span><<span style="color:#2b91af;">T</span>></code> class looks odd. Why would you ever want to use this data structure? Why not just use <a href="https://learn.microsoft.com/dotnet/api/system.collections.generic.list-1">List<T></a>?
</p>
<p>
As I hope I've made clear in the <a href="/2024/08/19/zippers">introduction article</a>, I can't, indeed, think of a good reason.
</p>
<p>
I've gone through this exercise <a href="/2020/01/13/on-doing-katas">to hone my skills</a>, and to prepare myself for the more intimidating exercise of implementing a binary tree Zipper.
</p>
<p>
<strong>Next:</strong> <a href="/2024/09/09/a-binary-tree-zipper-in-c">A Binary Tree Zipper in C#</a>.
</p>
</div><hr>
This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>.
<p>
<strong><a href="https://blog.ploeh.dk/2024/08/19/zippers">Zippers</a></strong>, 2024-08-19T14:13:00+00:00, Mark Seemann
</p>
<div id="post">
<p>
<em>Some functional programming examples ported to C#, just because.</em>
</p>
<p>
Many algorithms rely on data structures that enable the implementation to move in more than one way. A simple example is a <a href="https://en.wikipedia.org/wiki/Doubly_linked_list">doubly-linked list</a>, where an algorithm can move both forward and backward from a given element. Other examples are various tree-based algorithms, such as <a href="https://en.wikipedia.org/wiki/Red%E2%80%93black_tree">red-black trees</a> where certain operations trigger reorganization of the tree. Yet other data structures, such as <a href="https://en.wikipedia.org/wiki/Fibonacci_heap">Fibonacci heaps</a>, combine doubly-linked lists with trees that allow navigation in more than one direction.
</p>
<p>
In an imperative programming language, you can easily implement such data structures, as long as the language allows data mutation. Here's a simple example:
</p>
<p>
<pre><span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">node1</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">Node</span><<span style="color:blue;">string</span>>(<span style="color:#a31515;">"foo"</span>);
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">node2</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">Node</span><<span style="color:blue;">string</span>>(<span style="color:#a31515;">"bar"</span>) { Previous = <span style="font-weight:bold;color:#1f377f;">node1</span> };
<span style="font-weight:bold;color:#1f377f;">node1</span>.Next = <span style="font-weight:bold;color:#1f377f;">node2</span>;</pre>
</p>
<p>
It's possible to double-link <code>node1</code> to <code>node2</code> by first creating <code>node1</code>. At that point, <code>node2</code> still doesn't exist, so you can't yet assign <code><span style="font-weight:bold;color:#1f377f;">node1</span>.Next</code>, but once you've initialized <code>node2</code>, you can mutate the state of <code>node1</code> by changing its <code>Next</code> property.
</p>
<p>
When data structures are immutable (as they must be in functional programming) this is no longer possible. How may you get around that limitation?
</p>
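<p>
To see the problem, consider a hypothetical immutable counterpart to the <code>Node</code> class, here sketched as a C# record. The record definition is an assumption made for illustration, not code from the sample code base. You can link <code>node2</code> back to <code>node1</code>, but 'updating' <code>node1</code> only produces a copy that <code>node2</code> knows nothing about.
</p>
<p>
<pre>using System;

var node1 = new Node<string>("foo");
var node2 = new Node<string>("bar", Previous: node1);

// 'with' can't change node1; it returns a new, updated copy instead.
var linked = node1 with { Next = node2 };

// node2 still points back to the original node1, which has no Next.
Console.WriteLine(object.ReferenceEquals(node2.Previous, linked));  // False
Console.WriteLine(node2.Previous?.Next is null);                    // True

// Hypothetical immutable variant of the mutable Node class shown above.
public sealed record Node<T>(
    T Value,
    Node<T>? Previous = null,
    Node<T>? Next = null);</pre>
</p>
<p>
Each node would have to exist before the other in order to be passed to its constructor, so the cycle can't be closed without mutation.
</p>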
<h3 id="3b3c3d4cba1f4ae8bef462b28047860a">
Alternatives <a href="#3b3c3d4cba1f4ae8bef462b28047860a">#</a>
</h3>
<p>
Some languages get around this problem in various ways. <a href="https://www.haskell.org/">Haskell</a>, because of its lazy evaluation, enables a technique called <a href="https://wiki.haskell.org/Tying_the_Knot">tying the knot</a> that, frankly, makes my head hurt.
</p>
<p>
Even though I write a decent amount of Haskell code, that's not something that I make use of. Usually, it turns out, you can solve most problems by thinking about them differently. By choosing another perspective, and another data structure, you can often arrive at a good, functional solution to your problem.
</p>
<p>
One family of general-purpose data structures is called Zippers. The general idea is that the data structure has a natural 'focus' (e.g. the head of a list), but it also keeps a record of 'breadcrumbs', that is, where the caller has previously been. This enables client code to 'go back' or 'go up', if the natural direction is to 'go forward' or 'go down'. It's a bit like <a href="https://martinfowler.com/eaaDev/EventSourcing.html">Event Sourcing</a>, in that every operation leaves a log entry that can later be used to reconstruct what happened. <a href="/2020/03/23/repeatable-execution">Repeatable Execution</a> also comes to mind, although it's not quite the same.
</p>
<p>
For an introduction to Zippers, I recommend the excellent and highly readable article <a href="https://learnyouahaskell.com/zippers">Zippers</a>. In this article series, I'm going to assume that you're familiar with the contents of that article.
</p>
<h3 id="8ec371f87d2f468ea6ebbc3a2e420cbb">
C# ports <a href="#8ec371f87d2f468ea6ebbc3a2e420cbb">#</a>
</h3>
<p>
While I may add more articles to this series in the future, as I'm writing this, I have nothing more planned than writing about how it's possible to implement the article's three Zippers in C#.
</p>
<ul>
<li><a href="/2024/08/26/a-list-zipper-in-c">A List Zipper in C#</a></li>
<li><a href="/2024/09/09/a-binary-tree-zipper-in-c">A Binary Tree Zipper in C#</a></li>
<li><a href="/2024/09/23/fszipper-in-c">FSZipper in C#</a></li>
</ul>
<p>
Why would you want to do this?
</p>
<p>
To be honest, for production code, I can't think of a good reason. I did it for a few reasons, most of them didactic. Additionally, <a href="/2020/01/13/on-doing-katas">writing code for exercise</a> helps you improve. If you know enough Haskell to understand what's going on in the <a href="https://learnyouahaskell.com/zippers">Zippers article</a>, you may consider porting some of it to your favourite language, as an exercise.
</p>
<p>
It may help you <a href="/ref/stranger-in-a-strange-land">grok</a> functional programming.
</p>
<p>
That's really it, though. There's no reason to use Zippers in a language like C#, which <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatically</a> makes use of mutation. If you want a doubly-linked list, you can just write code as shown in the beginning of this article.
</p>
<p>
If you're interested in an <a href="https://fsharp.org/">F#</a> perspective on Zippers, <a href="https://tomasp.net/">Tomas Petricek</a> has a cool article: <a href="https://tomasp.net/blog/tree-zipper-query.aspx/">Processing trees with F# zipper computation</a>.
</p>
<h3 id="8a124e3b10aa4b0b889efe866f63dc91">
Conclusion <a href="#8a124e3b10aa4b0b889efe866f63dc91">#</a>
</h3>
<p>
Zippers constitute a family of data structures that enables you to move in multiple directions. Left and right in a list. Up or down in a tree. For an imperative programmer, that's literally just another day at the office, but in disciplined functional programming, making cyclic graphs can be surprisingly tricky.
</p>
<p>
Even in functional programming, I rarely reach for a Zipper, since I can often find a library with a higher level of abstraction that does what I need it to do. Still, learning new ways to solve problems never seems a waste to me.
</p>
<p>
In the next three articles, I'll go through the examples from <a href="https://learnyouahaskell.com/zippers">the Zipper article</a> and show how I ported them to C#. While that article starts with a <a href="https://en.wikipedia.org/wiki/Binary_tree">binary tree</a>, I'll instead begin with the doubly-linked list, since it's the simplest of the three.
</p>
<p>
<strong>Next:</strong> <a href="/2024/08/26/a-list-zipper-in-c">A List Zipper in C#</a>.
</p>
</div><hr>
This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>.
<p>
<strong><a href="https://blog.ploeh.dk/2024/08/12/using-only-a-domain-model-to-persist-restaurant-table-configurations">Using only a Domain Model to persist restaurant table configurations</a></strong>, 2024-08-12T12:57:00+00:00, Mark Seemann
</p>
<div id="post">
<p>
<em>A data architecture example in C# and ASP.NET.</em>
</p>
<p>
This is part of a <a href="/2024/07/25/three-data-architectures-for-the-server">small article series on data architectures</a>. In this, the third instalment, you'll see an alternative way of modelling data in a server-based application. One that doesn't rely on statically typed classes to model data. As the introductory article explains, the example code shows how to create a new restaurant table configuration, or how to display an existing resource. The sample code base is an ASP.NET 8.0 <a href="https://en.wikipedia.org/wiki/REST">REST</a> API.
</p>
<p>
Keep in mind that while the sample code does store data in a relational database, the term <em>table</em> in this article mainly refers to physical tables, rather than database tables.
</p>
<p>
The idea is to use 'raw' serialization APIs to handle communication with external systems. For the presentation layer, the example even moves representation concerns to middleware, so that it's nicely abstracted away from the application layer.
</p>
<p>
An architecture diagram like this attempts to capture the design:
</p>
<p>
<img src="/content/binary/domain-model-only-data-architecture.png" alt="Architecture diagram showing a box labelled Domain Model with bidirectional arrows both above and below, pointing below towards a cylinder, and above towards a document.">
</p>
<p>
Here, the arrows indicate mappings, not dependencies.
</p>
<p>
Like in the <a href="/2024/07/29/using-ports-and-adapters-to-persist-restaurant-table-configurations">DTO-based Ports and Adapters architecture</a>, the goal is to be able to design Domain Models unconstrained by serialization concerns, but also to be able to format external data unconstrained by Reflection-based serializers. Thus, while this architecture is centred on a Domain Model, there are no <a href="https://en.wikipedia.org/wiki/Data_transfer_object">Data Transfer Objects</a> (DTOs) to represent <a href="https://json.org/">JSON</a>, <a href="https://en.wikipedia.org/wiki/XML">XML</a>, or database rows.
</p>
<h3 id="799ef3debcb748079610a1ff818360e2">
HTTP interaction <a href="#799ef3debcb748079610a1ff818360e2">#</a>
</h3>
<p>
To establish the context of the application, here's how HTTP interactions may play out. The following is a copy of the identically named section in the article <a href="/2024/07/29/using-ports-and-adapters-to-persist-restaurant-table-configurations">Using Ports and Adapters to persist restaurant table configurations</a>, repeated here for your convenience.
</p>
<p>
A client can create a new table with a <code>POST</code> HTTP request:
</p>
<p>
<pre>POST /tables HTTP/1.1
content-type: application/json

{ <span style="color:#2e75b6;">"communalTable"</span>: { <span style="color:#2e75b6;">"capacity"</span>: 16 } }</pre>
</p>
<p>
Which might elicit a response like this:
</p>
<p>
<pre>HTTP/1.1 201 Created
Location: https://example.com/Tables/844581613e164813aa17243ff8b847af</pre>
</p>
<p>
Clients can later use the address indicated by the <code>Location</code> header to retrieve a representation of the resource:
</p>
<p>
<pre>GET /Tables/844581613e164813aa17243ff8b847af HTTP/1.1
accept: application/json</pre>
</p>
<p>
Which would result in this response:
</p>
<p>
<pre>HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8

{<span style="color:#2e75b6;">"communalTable"</span>:{<span style="color:#2e75b6;">"capacity"</span>:16}}</pre>
</p>
<p>
By default, ASP.NET handles and returns JSON. Later in this article you'll see how well it deals with other data formats.
</p>
<h3 id="63cacca8023b4adb9534a34aba0c50ff">
Boundary <a href="#63cacca8023b4adb9534a34aba0c50ff">#</a>
</h3>
<p>
ASP.NET supports some variation of the <a href="https://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller">model-view-controller</a> (MVC) pattern, and Controllers handle HTTP requests. At the outset, the <em>action method</em> that handles the <code>POST</code> request looks like this:
</p>
<p>
<pre>[<span style="color:#2b91af;">HttpPost</span>]
<span style="color:blue;">public</span> <span style="color:blue;">async</span> <span style="color:#2b91af;">Task</span><<span style="color:#2b91af;">IActionResult</span>> <span style="font-weight:bold;color:#74531f;">Post</span>(<span style="color:#2b91af;">Table</span> <span style="font-weight:bold;color:#1f377f;">table</span>)
{
    <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">id</span> = <span style="color:#2b91af;">Guid</span>.<span style="color:#74531f;">NewGuid</span>();
    <span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">repository</span>.<span style="font-weight:bold;color:#74531f;">Create</span>(<span style="font-weight:bold;color:#1f377f;">id</span>, <span style="font-weight:bold;color:#1f377f;">table</span>).<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
    <span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">CreatedAtActionResult</span>(
        <span style="color:blue;">nameof</span>(<span style="font-weight:bold;color:#74531f;">Get</span>),
        <span style="color:blue;">null</span>,
        <span style="color:blue;">new</span> { id = <span style="font-weight:bold;color:#1f377f;">id</span>.<span style="font-weight:bold;color:#74531f;">ToString</span>(<span style="color:#a31515;">"N"</span>) },
        <span style="color:blue;">null</span>);
}</pre>
</p>
<p>
While this looks identical to the <code>Post</code> method for <a href="/2024/08/05/using-a-shared-data-model-to-persist-restaurant-table-configurations">the Shared Data Model architecture</a>, it's not, because it's not the same <code>Table</code> class. Not by a long shot. The <code>Table</code> class in use here is the one originally introduced in the article <a href="/2023/12/25/serializing-restaurant-tables-in-c">Serializing restaurant tables in C#</a>, with a few inconsequential differences.
</p>
<p>
How does a Controller <em>action method</em> receive an input parameter directly in the form of a Domain Model, keeping in mind that this particular Domain Model is far from serialization-friendly? The short answer is <em>middleware</em>, which we'll get to in a moment. Before we look at that, however, let's also look at the <code>Get</code> method that supports HTTP <code>GET</code> requests:
</p>
<p>
<pre>[<span style="color:#2b91af;">HttpGet</span>(<span style="color:#a31515;">"</span><span style="color:#0073ff;">{</span>id<span style="color:#0073ff;">}</span><span style="color:#a31515;">"</span>)]
<span style="color:blue;">public</span> <span style="color:blue;">async</span> <span style="color:#2b91af;">Task</span>&lt;<span style="color:#2b91af;">IActionResult</span>&gt; <span style="font-weight:bold;color:#74531f;">Get</span>(<span style="color:blue;">string</span> <span style="font-weight:bold;color:#1f377f;">id</span>)
{
<span style="font-weight:bold;color:#8f08c4;">if</span> (!<span style="color:#2b91af;">Guid</span>.<span style="color:#74531f;">TryParseExact</span>(<span style="font-weight:bold;color:#1f377f;">id</span>, <span style="color:#a31515;">"N"</span>, <span style="color:blue;">out</span> <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">guid</span>))
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">BadRequestResult</span>();
<span style="color:#2b91af;">Table</span>? <span style="font-weight:bold;color:#1f377f;">table</span> = <span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">repository</span>.<span style="font-weight:bold;color:#74531f;">Read</span>(<span style="font-weight:bold;color:#1f377f;">guid</span>).<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
<span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">table</span> <span style="color:blue;">is</span> <span style="color:blue;">null</span>)
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">NotFoundResult</span>();
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">OkObjectResult</span>(<span style="font-weight:bold;color:#1f377f;">table</span>);
}</pre>
</p>
<p>
This, too, looks exactly like the Shared Data Model architecture, again with the crucial difference that the <code>Table</code> class is completely different. The <code>Get</code> method just takes the <code>table</code> object and wraps it in an <code>OkObjectResult</code> and returns it.
</p>
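<p>
The <code>Post</code> and <code>Get</code> methods implicitly agree on how an ID appears in a URL: the <code>"N"</code> format specifier, which is 32 hexadecimal digits without hyphens. The following stand-alone sketch illustrates that round trip. The <code>IdFormat</code> class and its method names are made up for illustration, and aren't part of the example code base.
</p>
<p>
<pre>using System;

// Hypothetical helper; only illustrates the "N" format contract.
public static class IdFormat
{
    // Mirrors the ToString("N") call in the Post method.
    public static string ToUrlSegment(Guid id) =&gt; id.ToString("N");

    // Mirrors the guard clause at the top of the Get method.
    public static bool TryParseUrlSegment(string segment, out Guid id) =&gt;
        Guid.TryParseExact(segment, "N", out id);
}</pre>
</p>
<p>
With these helpers, <code>TryParseUrlSegment(ToUrlSegment(id), out var parsed)</code> returns <code>true</code>, and <code>parsed</code> equals <code>id</code>. The default hyphenated form produced by <code>id.ToString()</code> is, on the other hand, rejected. That's why a client that sends a hyphenated GUID gets a <code>400 Bad Request</code> response from <code>Get</code>, rather than <code>404 Not Found</code>.
</p>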
<p>
The <code>Table</code> class is, in reality, extraordinarily opaque, and not at all friendly to serialization, so how does the service turn it into JSON?
</p>
<h3 id="38d9d73532fc4912834452fff3d33b3a">
JSON middleware <a href="#38d9d73532fc4912834452fff3d33b3a">#</a>
</h3>
<p>
Most web frameworks come with extensibility points where you can add middleware. A common need is to be able to add custom serializers. In ASP.NET they're called <em>formatters</em>, and can be added at application startup:
</p>
<p>
<pre><span style="font-weight:bold;color:#1f377f;">builder</span>.Services.<span style="font-weight:bold;color:#74531f;">AddControllers</span>(<span style="font-weight:bold;color:#1f377f;">opts</span> =>
{
<span style="font-weight:bold;color:#1f377f;">opts</span>.InputFormatters.<span style="font-weight:bold;color:#74531f;">Insert</span>(0, <span style="color:blue;">new</span> <span style="color:#2b91af;">TableJsonInputFormatter</span>());
<span style="font-weight:bold;color:#1f377f;">opts</span>.OutputFormatters.<span style="font-weight:bold;color:#74531f;">Insert</span>(0, <span style="color:blue;">new</span> <span style="color:#2b91af;">TableJsonOutputFormatter</span>());
});</pre>
</p>
<p>
As the names imply, <code>TableJsonInputFormatter</code> deserializes JSON input, while <code>TableJsonOutputFormatter</code> serializes strongly typed objects to JSON.
</p>
<p>
We'll look at each in turn, starting with <code>TableJsonInputFormatter</code>, which is responsible for deserializing JSON documents into <code>Table</code> objects, as used by, for example, the <code>Post</code> method.
</p>
<h3 id="9ff7ca45e5fd4d19bb9be29769e9a298">
JSON input formatter <a href="#9ff7ca45e5fd4d19bb9be29769e9a298">#</a>
</h3>
<p>
You create an input formatter by implementing the <a href="https://learn.microsoft.com/dotnet/api/microsoft.aspnetcore.mvc.formatters.iinputformatter">IInputFormatter</a> interface, although in this example code base, inheriting from <a href="https://learn.microsoft.com/dotnet/api/microsoft.aspnetcore.mvc.formatters.textinputformatter">TextInputFormatter</a> is enough:
</p>
<p>
<pre><span style="color:blue;">internal</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">TableJsonInputFormatter</span> : <span style="color:#2b91af;">TextInputFormatter</span></pre>
</p>
<p>
You can use the constructor to define which media types and encodings the formatter will support:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:#2b91af;">TableJsonInputFormatter</span>()
{
SupportedMediaTypes.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="color:#2b91af;">MediaTypeHeaderValue</span>.<span style="color:#74531f;">Parse</span>(<span style="color:#a31515;">"application/json"</span>));
SupportedEncodings.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="color:#2b91af;">Encoding</span>.UTF8);
SupportedEncodings.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="color:#2b91af;">Encoding</span>.Unicode);
}</pre>
</p>
<p>
You'll also need to tell the formatter which .NET type it supports:
</p>
<p>
<pre><span style="color:blue;">protected</span> <span style="color:blue;">override</span> <span style="color:blue;">bool</span> <span style="font-weight:bold;color:#74531f;">CanReadType</span>(<span style="color:#2b91af;">Type</span> <span style="font-weight:bold;color:#1f377f;">type</span>)
{
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">type</span> <span style="font-weight:bold;color:#74531f;">==</span> <span style="color:blue;">typeof</span>(<span style="color:#2b91af;">Table</span>);
}</pre>
</p>
<p>
As far as I can tell, the ASP.NET framework will first determine which <em>action method</em> (that is, which Controller, and which method on that Controller) should handle a given HTTP request. For a <code>POST</code> request, as shown above, it'll determine that the appropriate <em>action method</em> is the <code>Post</code> method.
</p>
<p>
Since the <code>Post</code> method takes a <code>Table</code> object as input, the framework then goes through the registered formatters and asks them whether they can read from an HTTP request into that type. In this case, the <code>TableJsonInputFormatter</code> answers <code>true</code> only if the <code>type</code> is <code>Table</code>.
</p>
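<p>
The lookup described above can be modelled as a first-match search over an ordered list. The following is only a toy model, and not ASP.NET's actual implementation; every type name in it is hypothetical. It does, however, suggest why the formatters are registered with <code>Insert(0, ...)</code>: a formatter at the front of the list gets asked first.
</p>
<p>
<pre>using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, drastically simplified stand-in for an input formatter.
public interface ISketchFormatter
{
    bool CanRead(Type type);
}

// Answers true for exactly one type, like TableJsonInputFormatter does.
public sealed class SingleTypeFormatter : ISketchFormatter
{
    private readonly Type supported;

    public SingleTypeFormatter(Type supported)
    {
        this.supported = supported;
    }

    public bool CanRead(Type type) =&gt; type == supported;
}

// Stands in for a general-purpose formatter that accepts any type.
public sealed class CatchAllFormatter : ISketchFormatter
{
    public bool CanRead(Type type) =&gt; true;
}

public static class FormatterSelection
{
    // Returns the first formatter that claims the type, or null if none does.
    public static ISketchFormatter? SelectFor(
        Type type,
        IEnumerable&lt;ISketchFormatter&gt; formatters) =&gt;
        formatters.FirstOrDefault(f =&gt; f.CanRead(type));
}</pre>
</p>
<p>
In this model, if a <code>SingleTypeFormatter</code> is inserted at index <code>0</code> ahead of a <code>CatchAllFormatter</code>, <code>SelectFor</code> returns the specialized formatter for its one supported type, and the catch-all formatter for everything else.
</p>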
<p>
When <code>CanReadType</code> answers <code>true</code>, the framework then invokes a method to turn the HTTP request into an object:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">override</span> <span style="color:blue;">async</span> <span style="color:#2b91af;">Task</span>&lt;<span style="color:#2b91af;">InputFormatterResult</span>&gt; <span style="font-weight:bold;color:#74531f;">ReadRequestBodyAsync</span>(
<span style="color:#2b91af;">InputFormatterContext</span> <span style="font-weight:bold;color:#1f377f;">context</span>,
<span style="color:#2b91af;">Encoding</span> <span style="font-weight:bold;color:#1f377f;">encoding</span>)
{
<span style="color:blue;">using</span> <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">rdr</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">StreamReader</span>(<span style="font-weight:bold;color:#1f377f;">context</span>.HttpContext.Request.Body, <span style="font-weight:bold;color:#1f377f;">encoding</span>);
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">json</span> = <span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">rdr</span>.<span style="font-weight:bold;color:#74531f;">ReadToEndAsync</span>().<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
<span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">table</span> = <span style="color:#2b91af;">TableJson</span>.<span style="color:#74531f;">Deserialize</span>(<span style="font-weight:bold;color:#1f377f;">json</span>);
<span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">table</span> <span style="color:blue;">is</span> { })
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">await</span> <span style="color:#2b91af;">InputFormatterResult</span>.<span style="color:#74531f;">SuccessAsync</span>(<span style="font-weight:bold;color:#1f377f;">table</span>).<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
<span style="font-weight:bold;color:#8f08c4;">else</span>
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="color:blue;">await</span> <span style="color:#2b91af;">InputFormatterResult</span>.<span style="color:#74531f;">FailureAsync</span>().<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
}</pre>
</p>
<p>
The <code>ReadRequestBodyAsync</code> method reads the HTTP request body into a <code>string</code> value called <code>json</code>, and then passes the value to <code>TableJson.Deserialize</code>. You can see the implementation of the <code>Deserialize</code> method in the article <a href="/2023/12/25/serializing-restaurant-tables-in-c">Serializing restaurant tables in C#</a>. In short, it uses the default .NET JSON parser to probe a document object model. If it can turn the JSON document into a <code>Table</code> value, it does that. Otherwise, it returns <code>null</code>.
</p>
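<p>
To give an idea of what probing a document object model can look like, here's a minimal sketch based on <code>System.Text.Json</code>. It isn't the implementation from the linked article. The <code>ProbedTable</code> record and the JSON property names (<code>singleTable</code>, <code>communalTable</code>, <code>capacity</code>) are assumptions chosen for illustration only.
</p>
<p>
<pre>using System;
using System.Text.Json;

// Hypothetical stand-in for the Domain Model; the real Table class is richer.
public sealed record ProbedTable(string Kind, int Capacity);

public static class ProbingJson
{
    // Returns null when the text isn't JSON, or lacks the expected shape.
    public static ProbedTable? Deserialize(string json)
    {
        try
        {
            using var doc = JsonDocument.Parse(json);
            var root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Object)
                return null;

            // Probe for each known kind of table in turn.
            foreach (var kind in new[] { "singleTable", "communalTable" })
            {
                if (root.TryGetProperty(kind, out var body)
                    &amp;&amp; body.ValueKind == JsonValueKind.Object
                    &amp;&amp; body.TryGetProperty("capacity", out var cap)
                    &amp;&amp; cap.ValueKind == JsonValueKind.Number
                    &amp;&amp; cap.TryGetInt32(out var capacity))
                    return new ProbedTable(kind, capacity);
            }
            return null;
        }
        catch (JsonException)
        {
            // Malformed JSON is treated like any other unrecognized input.
            return null;
        }
    }
}</pre>
</p>
<p>
The point of the sketch is the shape of the contract: the function either produces a valid object or <code>null</code>, and never a half-populated one. That's what enables the calling code to decide between success and failure with a single <code>null</code> check.
</p>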
<p>
The above <code>ReadRequestBodyAsync</code> method then checks if the return value from <code>TableJson.Deserialize</code> is <code>null</code>. If it's not, it wraps the result in a value that indicates success. If it's <code>null</code>, it uses <code>FailureAsync</code> to indicate a deserialization failure.
</p>
<p>
With this input formatter in place as middleware, any action method that takes a <code>Table</code> parameter will automatically receive a deserialized JSON object, if possible.
</p>
<h3 id="e04e265f6cdd48869aa8510e14092644">
JSON output formatter <a href="#e04e265f6cdd48869aa8510e14092644">#</a>
</h3>
<p>
The <code>TableJsonOutputFormatter</code> class works much in the same way, but instead derives from the <a href="https://learn.microsoft.com/dotnet/api/microsoft.aspnetcore.mvc.formatters.textoutputformatter">TextOutputFormatter</a> base class:
</p>
<p>
<pre><span style="color:blue;">internal</span> <span style="color:blue;">sealed</span> <span style="color:blue;">class</span> <span style="color:#2b91af;">TableJsonOutputFormatter</span> : <span style="color:#2b91af;">TextOutputFormatter</span></pre>
</p>
<p>
The constructor looks just like the one for <code>TableJsonInputFormatter</code>, and instead of a <code>CanReadType</code> method, it has a <code>CanWriteType</code> method that also looks identical.
</p>
<p>
The <code>WriteResponseBodyAsync</code> method serializes a <code>Table</code> object to JSON:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">override</span> <span style="color:#2b91af;">Task</span> <span style="font-weight:bold;color:#74531f;">WriteResponseBodyAsync</span>(
<span style="color:#2b91af;">OutputFormatterWriteContext</span> <span style="font-weight:bold;color:#1f377f;">context</span>,
<span style="color:#2b91af;">Encoding</span> <span style="font-weight:bold;color:#1f377f;">selectedEncoding</span>)
{
<span style="font-weight:bold;color:#8f08c4;">if</span> (<span style="font-weight:bold;color:#1f377f;">context</span>.Object <span style="color:blue;">is</span> <span style="color:#2b91af;">Table</span> <span style="font-weight:bold;color:#1f377f;">table</span>)
<span style="font-weight:bold;color:#8f08c4;">return</span> <span style="font-weight:bold;color:#1f377f;">context</span>.HttpContext.Response.<span style="font-weight:bold;color:#74531f;">WriteAsync</span>(<span style="font-weight:bold;color:#1f377f;">table</span>.<span style="font-weight:bold;color:#74531f;">Serialize</span>(), <span style="font-weight:bold;color:#1f377f;">selectedEncoding</span>);
<span style="font-weight:bold;color:#8f08c4;">throw</span> <span style="color:blue;">new</span> <span style="color:#2b91af;">InvalidOperationException</span>(<span style="color:#a31515;">"Expected a Table object."</span>);
}</pre>
</p>
<p>
If <code>context.Object</code> is, in fact, a <code>Table</code> object, the method calls <code>table.Serialize()</code>, which you can also see in the article <a href="/2023/12/25/serializing-restaurant-tables-in-c">Serializing restaurant tables in C#</a>. In short, it pattern-matches on the two possible kinds of tables and builds an appropriate <a href="https://en.wikipedia.org/wiki/Abstract_syntax_tree">abstract syntax tree</a> or document object model that it then serializes to JSON.
</p>
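<p>
As a rough illustration of that idea, the following sketch builds a small document object model with <code>System.Text.Json.Nodes</code> and serializes it. It isn't the <code>Serialize</code> method from the linked article. The <code>TableShape</code> hierarchy and the JSON property names are assumptions, and a C# <code>switch</code> expression stands in for whichever pattern-matching mechanism the real <code>Table</code> class exposes.
</p>
<p>
<pre>using System;
using System.Text.Json.Nodes;

// Hypothetical two-case model, standing in for the real Table class.
public abstract record TableShape;
public sealed record SingleShape(int Capacity, int MinimalReservation) : TableShape;
public sealed record CommunalShape(int Capacity) : TableShape;

public static class ShapeJson
{
    public static string Serialize(TableShape table)
    {
        // Build a document object model that matches the kind of table.
        JsonObject dom = table switch
        {
            SingleShape s =&gt; new JsonObject
            {
                ["singleTable"] = new JsonObject
                {
                    ["capacity"] = s.Capacity,
                    ["minimalReservation"] = s.MinimalReservation
                }
            },
            CommunalShape c =&gt; new JsonObject
            {
                ["communalTable"] = new JsonObject { ["capacity"] = c.Capacity }
            },
            _ =&gt; throw new ArgumentException("Unknown table kind.", nameof(table))
        };

        // Turn the document object model into JSON text.
        return dom.ToJsonString();
    }
}</pre>
</p>
<p>
Building the JSON by hand like this keeps the Domain Model free of serialization concerns such as public setters or parameterless constructors. The price is that every change to the wire format has to be made explicitly in code like this.
</p>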
<h3 id="7c935b48e0cf42369b5d0c55c688d5bf">
Data access <a href="#7c935b48e0cf42369b5d0c55c688d5bf">#</a>
</h3>
<p>
While the application stores data in <a href="https://en.wikipedia.org/wiki/Microsoft_SQL_Server">SQL Server</a>, it uses no <a href="https://en.wikipedia.org/wiki/Object%E2%80%93relational_mapping">object-relational mapper</a> (ORM). Instead, it simply uses ADO.NET, as also outlined in the article <a href="/2023/09/18/do-orms-reduce-the-need-for-mapping">Do ORMs reduce the need for mapping?</a>
</p>
<p>
At first glance, the <code>Create</code> method looks simple:
</p>
<p>
<pre><span style="color:blue;">public</span> <span style="color:blue;">async</span> <span style="color:#2b91af;">Task</span> <span style="font-weight:bold;color:#74531f;">Create</span>(<span style="color:#2b91af;">Guid</span> <span style="font-weight:bold;color:#1f377f;">id</span>, <span style="color:#2b91af;">Table</span> <span style="font-weight:bold;color:#1f377f;">table</span>)
{
<span style="color:blue;">using</span> <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">conn</span> = <span style="color:blue;">new</span> <span style="color:#2b91af;">SqlConnection</span>(<span style="font-weight:bold;color:#1f377f;">connectionString</span>);
<span style="color:blue;">using</span> <span style="color:blue;">var</span> <span style="font-weight:bold;color:#1f377f;">cmd</span> = <span style="font-weight:bold;color:#1f377f;">table</span>.<span style="font-weight:bold;color:#74531f;">Accept</span>(<span style="color:blue;">new</span> <span style="color:#2b91af;">SqlInsertCommandVisitor</span>(<span style="font-weight:bold;color:#1f377f;">id</span>));
<span style="font-weight:bold;color:#1f377f;">cmd</span>.Connection = <span style="font-weight:bold;color:#1f377f;">conn</span>;
<span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">conn</span>.<span style="font-weight:bold;color:#74531f;">OpenAsync</span>().<span style="font-weight:bold;color:#74531f;">ConfigureAwait</span>(<span style="color:blue;">false</span>);
<span style="color:blue;">await</span> <span style="font-weight:bold;color:#1f377f;">cmd</span>.<span style="font-weight:bold;color:#74531f;">ExecuteNonQueryAsync</span>().<span style="font-weight:bold;color: