ploeh blog 2019-09-16T13:23:58+00:00 Mark Seemann danish software design https://blog.ploeh.dk Picture archivist in F# https://blog.ploeh.dk/2019/09/16/picture-archivist-in-f 2019-09-16T05:59:00+00:00 Mark Seemann <div id="post"> <p> <em>A comprehensive code example showing how to implement a functional architecture in F#.</em> </p> <p> This article shows how to implement the <a href="/2019/08/26/functional-file-system">picture archivist architecture described in a previous article</a>. In short, the task is to move some image files to directories based on their date-taken metadata. The architectural idea is to load a directory structure from disk into an in-memory tree, manipulate that tree, and use the resulting tree to perform the desired actions: </p> <p> <img src="/content/binary/functional-file-system-interaction.png" alt="A functional program typically loads data, transforms it, and stores it again."> </p> <p> Much of the program will manipulate the tree data, which is immutable. </p> <p> The previous article showed how to implement the <a href="/2019/09/09/picture-archivist-in-haskell">picture archivist architecture in Haskell</a>. In this article, you'll see how to do it in <a href="https://fsharp.org">F#</a>. This is essentially a port of the <a href="https://www.haskell.org">Haskell</a> code. 
</p> <h3 id="949a876ffec843e09d4faa5ae1c1b4c5"> Tree <a href="#949a876ffec843e09d4faa5ae1c1b4c5" title="permalink">#</a> </h3> <p> You can start by defining a <a href="https://en.wikipedia.org/wiki/Rose_tree">rose tree</a>: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;Tree&lt;&#39;a,&nbsp;&#39;b&gt;&nbsp;=&nbsp;Node&nbsp;<span style="color:blue;">of</span>&nbsp;&#39;a&nbsp;*&nbsp;Tree&lt;&#39;a,&nbsp;&#39;b&gt;&nbsp;list&nbsp;|&nbsp;Leaf&nbsp;<span style="color:blue;">of</span>&nbsp;&#39;b</pre> </p> <p> If you wanted to, you could put all the <code>Tree</code> code in a reusable library, because none of it is coupled to a particular application, such as <a href="https://amzn.to/2V06Kji">moving pictures</a>. You could also write a comprehensive test suite for the following functions, but in this article, I'll skip that. </p> <p> Notice that this sort of tree explicitly distinguishes between internal and leaf nodes. This is necessary because you'll need to keep track of the directory names (the internal nodes), while at the same time you'll want to enrich the leaves with additional data - data that you can't meaningfully add to the internal nodes. You'll see this later in the article. 
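</p> <p> To make the shape of the type concrete, here's a minimal sketch of a tree value. The directory and file names are made up for illustration; they aren't taken from the picture archivist code base: </p> <p> <pre>type Tree&lt;'a, 'b&gt; = Node of 'a * Tree&lt;'a, 'b&gt; list | Leaf of 'b

// Hypothetical sample data: directory names go in the internal nodes,
// file names go in the leaves.
let sample : Tree&lt;string, string&gt; =
    Node ("Pictures", [
        Leaf "beach.jpg";
        Node ("Holiday", [
            Leaf "sunset.jpg";
            Leaf "notes.txt"])])</pre> </p> <p> In this sketch both type arguments happen to be <code>string</code>, but since the node and leaf types are independent, the leaves could carry richer data without affecting the directory names.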
</p> <p> While I typically tend to define F# types outside of modules (so that you don't have to, say, prefix the type name with the module name - <code>Tree.Tree</code> is so awkward), the rest of the tree code goes into a module, including two helper functions: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Tree&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;&#39;b&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;leaf&nbsp;=&nbsp;Leaf
&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;&#39;a&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;list&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;node&nbsp;x&nbsp;xs&nbsp;=&nbsp;Node&nbsp;(x,&nbsp;xs)</pre> </p> <p> The <code>leaf</code> function doesn't add much value, but the <code>node</code> function offers a curried alternative to the <code>Node</code> case constructor. That's occasionally useful. </p> <p> The rest of the code related to trees is also defined in the <code>Tree</code> module, but I'm going to present it formatted as free-standing functions. If you're confused about the layout of the code, the entire code base is <a href="https://github.com/ploeh/picture-archivist">available on GitHub</a>. 
</p> <p> The <a href="/2019/08/05/rose-tree-catamorphism">rose tree catamorphism</a> is this <code>cata</code> function: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;a&nbsp;-&gt;&nbsp;&#39;c&nbsp;list&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;(&#39;b&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;-&gt;&nbsp;&#39;c</span>
<span style="color:blue;">let</span>&nbsp;<span style="color:blue;">rec</span>&nbsp;cata&nbsp;fd&nbsp;ff&nbsp;=&nbsp;<span style="color:blue;">function</span>
&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Leaf&nbsp;x&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;ff&nbsp;x
&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Node&nbsp;(x,&nbsp;xs)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;xs&nbsp;|&gt;&nbsp;List.map&nbsp;(cata&nbsp;fd&nbsp;ff)&nbsp;|&gt;&nbsp;fd&nbsp;x</pre> </p> <p> In the corresponding Haskell implementation of this architecture, I called this function <code>foldTree</code>, so why not retain that name? The short answer is that the naming conventions differ between Haskell and F#, and while I favour learning from Haskell, I still want my F# code to be as <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a> as possible. </p> <p> While I don't enforce that client code <em>must</em> use the <code>Tree</code> module name to access the functions within, I prefer to name the functions so that they make sense when used with qualified access. Having to write <code>Tree.foldTree</code> seems redundant. A more idiomatic name would be <code>fold</code>, so that you could write <code>Tree.fold</code>. The problem with that name, though, is that <code>fold</code> usually implies a list-biased <em>fold</em> (corresponding to <code>foldl</code> in Haskell), and I'll actually need that name for that particular purpose later. </p> <p> So, <code>cata</code> it is. </p> <p> In this article, tree functionality is (with one exception) directly or transitively implemented with <code>cata</code>. 
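</p> <p> As a small sketch of what <code>cata</code> makes possible, here are two reductions written directly with it. The names <code>countLeaves</code> and <code>depth</code>, as well as the sample tree, are hypothetical; they're not part of the <code>Tree</code> module: </p> <p> <pre>type Tree&lt;'a, 'b&gt; = Node of 'a * Tree&lt;'a, 'b&gt; list | Leaf of 'b

let rec cata fd ff = function
    | Leaf x -&gt; ff x
    | Node (x, xs) -&gt; xs |&gt; List.map (cata fd ff) |&gt; fd x

// Hypothetical helper: every leaf counts as 1; a node sums its children.
let countLeaves t = cata (fun _ (xs : int list) -&gt; List.sum xs) (fun _ -&gt; 1) t

// Hypothetical helper: a leaf has depth 0; a node sits one level above
// its deepest child.
let depth t = cata (fun _ xs -&gt; 1 + List.fold max 0 xs) (fun _ -&gt; 0) t

let sample = Node ("a", [Leaf 1; Node ("b", [Leaf 2; Leaf 3])])

printfn "%i" (countLeaves sample) // prints 3
printfn "%i" (depth sample)       // prints 2</pre> </p> <p> In both cases, the first function argument receives the node value together with the <em>already reduced</em> children, while the second function argument handles a single leaf.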
</p> <h3 id="3f30722983ad47bd83c88cec4ba80983"> Filtering trees <a href="#3f30722983ad47bd83c88cec4ba80983" title="permalink">#</a> </h3> <p> It'll be useful to be able to filter the contents of a tree. For example, the picture archivist program will only move image files with valid metadata. This means that it'll need to filter out all files that aren't image files, as well as image files without valid metadata. </p> <p> It turns out that it'll be useful to supply a function that throws away <code>None</code> values from a tree of <code>option</code> leaves. This is similar to <a href="https://msdn.microsoft.com/en-us/visualfsharpdocs/conceptual/list.choose%5B't%2C'u%5D-function-%5Bfsharp%5D">List.choose</a>, so I call it <code>Tree.choose</code>: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;a&nbsp;-&gt;&nbsp;&#39;b&nbsp;option)&nbsp;-&gt;&nbsp;Tree&lt;&#39;c,&#39;a&gt;&nbsp;-&gt;&nbsp;Tree&lt;&#39;c,&#39;b&gt;&nbsp;option</span>
<span style="color:blue;">let</span>&nbsp;choose&nbsp;f&nbsp;=&nbsp;cata&nbsp;(<span style="color:blue;">fun</span>&nbsp;x&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;List.choose&nbsp;id&nbsp;&gt;&gt;&nbsp;node&nbsp;x&nbsp;&gt;&gt;&nbsp;Some)&nbsp;(f&nbsp;&gt;&gt;&nbsp;Option.map&nbsp;Leaf)</pre> </p> <p> You may find the type of the function surprising. Why does it return a <code>Tree option</code>, instead of simply a <code>Tree</code>? </p> <p> While <code>List.choose</code> simply returns a list, it can do this because lists can be empty. This <code>Tree</code> type, on the other hand, can't be empty. If the purpose of <code>Tree.choose</code> is to throw away all <code>None</code> values, then how do you return a tree from <code>Leaf None</code>? </p> <p> You can't return a <code>Leaf</code> because you have no value to put in the leaf. Similarly, you can't return a <code>Node</code> because, again, you have no value to put in the node. 
</p> <p> In order to handle this edge case, then, you'll have to return <code>None</code>: </p> <p> <pre>&gt; let l : Tree&lt;string, int option&gt; = Leaf None;;
val l : Tree&lt;string,int option&gt; = Leaf None
&gt; Tree.choose id l;;
val it : Tree&lt;string,int&gt; option = None</pre> </p> <p> If you have anything other than a <code>None</code> leaf, though, you'll get a proper tree, but wrapped in an <code>option</code>: </p> <p> <pre>&gt; Tree.node "Foo" [Leaf (Some 42); Leaf None; Leaf (Some 2112)] |&gt; Tree.choose id;;
val it : Tree&lt;string,int&gt; option = Some (Node ("Foo",[Leaf 42; Leaf 2112]))</pre> </p> <p> While the resulting tree is wrapped in a <code>Some</code> case, the leaves contain unwrapped values. </p> <h3 id="32f46f2c16cf428abc39c3d79433caa6"> Bifunctor, functor, and folds <a href="#32f46f2c16cf428abc39c3d79433caa6" title="permalink">#</a> </h3> <p> Through its type class language feature, Haskell has formal definitions of <a href="/2018/03/22/functors">functors</a>, <a href="/2018/12/24/bifunctors">bifunctors</a>, and other types of <em>folds</em> (list-biased <a href="/2019/04/29/catamorphisms">catamorphisms</a>). F# doesn't have a similar degree of formalism, which means that while you can still implement the corresponding functionality, you'll have to rely on conventions to make the functions recognisable. </p> <p> It's straightforward to start with the bifunctor functionality: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;a&nbsp;-&gt;&nbsp;&#39;b)&nbsp;-&gt;&nbsp;(&#39;c&nbsp;-&gt;&nbsp;&#39;d)&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;c&gt;&nbsp;-&gt;&nbsp;Tree&lt;&#39;b,&#39;d&gt;</span>
<span style="color:blue;">let</span>&nbsp;bimap&nbsp;f&nbsp;g&nbsp;=&nbsp;cata&nbsp;(f&nbsp;&gt;&gt;&nbsp;node)&nbsp;(g&nbsp;&gt;&gt;&nbsp;leaf)</pre> </p> <p> This is, apart from the syntax differences, the same implementation as in Haskell. 
Based on <code>bimap</code>, you can also trivially implement <code>mapNode</code> and <code>mapLeaf</code> functions if you'd like, but you're not going to need those for the code in this article. You do need, however, a function that we could consider an alias of a hypothetical <code>mapLeaf</code> function: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;b&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;c&gt;</span>
<span style="color:blue;">let</span>&nbsp;map&nbsp;f&nbsp;=&nbsp;bimap&nbsp;id&nbsp;f</pre> </p> <p> This makes <code>Tree</code> a functor. </p> <p> It'll also be useful to reduce a tree to a potentially more compact value, so you can add some specialised folds: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;c&nbsp;-&gt;&nbsp;&#39;a&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;(&#39;c&nbsp;-&gt;&nbsp;&#39;b&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;&#39;c&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;-&gt;&nbsp;&#39;c</span>
<span style="color:blue;">let</span>&nbsp;bifold&nbsp;f&nbsp;g&nbsp;z&nbsp;t&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;flip&nbsp;f&nbsp;x&nbsp;y&nbsp;=&nbsp;f&nbsp;y&nbsp;x
&nbsp;&nbsp;&nbsp;&nbsp;cata&nbsp;(<span style="color:blue;">fun</span>&nbsp;x&nbsp;xs&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;flip&nbsp;f&nbsp;x&nbsp;&gt;&gt;&nbsp;List.fold&nbsp;(&gt;&gt;)&nbsp;id&nbsp;xs)&nbsp;(flip&nbsp;g)&nbsp;t&nbsp;z

<span style="color:green;">//&nbsp;(&#39;a&nbsp;-&gt;&nbsp;&#39;c&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;(&#39;b&nbsp;-&gt;&nbsp;&#39;c&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;-&gt;&nbsp;&#39;c&nbsp;-&gt;&nbsp;&#39;c</span>
<span style="color:blue;">let</span>&nbsp;bifoldBack&nbsp;f&nbsp;g&nbsp;t&nbsp;z&nbsp;=&nbsp;cata&nbsp;(<span style="color:blue;">fun</span>&nbsp;x&nbsp;xs&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;List.foldBack&nbsp;(&lt;&lt;)&nbsp;xs&nbsp;id&nbsp;&gt;&gt;&nbsp;f&nbsp;x)&nbsp;g&nbsp;t&nbsp;z</pre> </p> <p> In an attempt to emulate the F# naming conventions, I named the functions as I did. There are similar functions in the <code>List</code> and <code>Option</code> modules, for instance. If you're comparing the F# code with the Haskell code in the previous article, <code>Tree.bifold</code> corresponds to <code>bifoldl</code>, and <code>Tree.bifoldBack</code> corresponds to <code>bifoldr</code>. </p> <p> These enable you to implement folds over leaves only: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;c&nbsp;-&gt;&nbsp;&#39;b&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;&#39;c&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;-&gt;&nbsp;&#39;c</span>
<span style="color:blue;">let</span>&nbsp;fold&nbsp;f&nbsp;=&nbsp;bifold&nbsp;(<span style="color:blue;">fun</span>&nbsp;x&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;x)&nbsp;f

<span style="color:green;">//&nbsp;(&#39;b&nbsp;-&gt;&nbsp;&#39;c&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;-&gt;&nbsp;&#39;c&nbsp;-&gt;&nbsp;&#39;c</span>
<span style="color:blue;">let</span>&nbsp;foldBack&nbsp;f&nbsp;=&nbsp;bifoldBack&nbsp;(<span style="color:blue;">fun</span>&nbsp;_&nbsp;x&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;x)&nbsp;f</pre> </p> <p> These, again, enable you to implement another function that'll turn out to be useful in this article: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;b&nbsp;-&gt;&nbsp;unit)&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;-&gt;&nbsp;unit</span>
<span style="color:blue;">let</span>&nbsp;iter&nbsp;f&nbsp;=&nbsp;fold&nbsp;(<span style="color:blue;">fun</span>&nbsp;()&nbsp;x&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;f&nbsp;x)&nbsp;()</pre> </p> <p> The picture archivist program isn't going to explicitly need all of these, but transitively, it will. 
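</p> <p> To get a feel for how the leaf-only folds behave, here's a minimal sketch that repeats the definitions as free-standing functions and applies them to a made-up sample tree. The names <code>sample</code>, <code>total</code>, <code>reversed</code>, and <code>inOrder</code> are only for illustration: </p> <p> <pre>type Tree&lt;'a, 'b&gt; = Node of 'a * Tree&lt;'a, 'b&gt; list | Leaf of 'b

let rec cata fd ff = function
    | Leaf x -&gt; ff x
    | Node (x, xs) -&gt; xs |&gt; List.map (cata fd ff) |&gt; fd x

let bifold f g z t =
    let flip f x y = f y x
    cata (fun x xs -&gt; flip f x &gt;&gt; List.fold (&gt;&gt;) id xs) (flip g) t z

let bifoldBack f g t z = cata (fun x xs -&gt; List.foldBack (&lt;&lt;) xs id &gt;&gt; f x) g t z

let fold f = bifold (fun x _ -&gt; x) f
let foldBack f = bifoldBack (fun _ x -&gt; x) f

// Hypothetical sample tree with integer leaves.
let sample = Node ("a", [Leaf 1; Node ("b", [Leaf 2; Leaf 3])])

// Node values are ignored; only the leaves contribute.
let total = fold (+) 0 sample                             // 6
// fold visits leaves left to right, so consing reverses them.
let reversed = fold (fun acc x -&gt; x :: acc) [] sample     // [3; 2; 1]
// foldBack visits leaves right to left, so consing keeps their order.
let inOrder = foldBack (fun x acc -&gt; x :: acc) sample []  // [1; 2; 3]</pre> </p> <p> The last expression hints at why <code>foldBack</code> is convenient when you want to accumulate leaves into a list-like structure without reversing their order.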
</p> <h3 id="8a9a50c69a2d461cac5bb87fa4cf3cd9"> Moving pictures <a href="#8a9a50c69a2d461cac5bb87fa4cf3cd9" title="permalink">#</a> </h3> <p> So far, all the code shown here could be in a general-purpose reusable library, since it contains no functionality specifically related to image files. The rest of the code in this article, however, will be specific to the program. I'll put the domain model code in another module that I call <code>Archive</code>. Later in the article, we'll look at how to load a tree from the file system, but for now, we'll just pretend that we have such a tree. </p> <p> The major logic of the program is to create a destination tree based on a source tree. The leaves of the tree will have to carry some extra information apart from a file path, so you can introduce a specific type to capture that information: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;PhotoFile&nbsp;=&nbsp;{&nbsp;File&nbsp;:&nbsp;FileInfo;&nbsp;TakenOn&nbsp;:&nbsp;DateTime&nbsp;}</pre> </p> <p> A <code>PhotoFile</code> not only contains the file path for an image file, but also the date the photo was taken. This date can be extracted from the file's metadata, but that's an impure operation, so we'll delegate that work to the start of the program. We'll return to that later. 
</p> <p> Given a source tree of <code>PhotoFile</code> leaves, though, the program must produce a destination tree of files: </p> <p> <pre><span style="color:green;">//&nbsp;string&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,PhotoFile&gt;&nbsp;-&gt;&nbsp;Tree&lt;string,FileInfo&gt;</span>
<span style="color:blue;">let</span>&nbsp;moveTo&nbsp;destination&nbsp;t&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;dirNameOf&nbsp;(dt&nbsp;:&nbsp;DateTime)&nbsp;=&nbsp;sprintf&nbsp;<span style="color:#a31515;">&quot;%d-%02d&quot;</span>&nbsp;dt.Year&nbsp;dt.Month
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;groupByDir&nbsp;pf&nbsp;m&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;key&nbsp;=&nbsp;dirNameOf&nbsp;pf.TakenOn
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;dir&nbsp;=&nbsp;Map.tryFind&nbsp;key&nbsp;m&nbsp;|&gt;&nbsp;Option.defaultValue&nbsp;[]
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Map.add&nbsp;key&nbsp;(pf.File&nbsp;::&nbsp;dir)&nbsp;m
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;addDir&nbsp;name&nbsp;files&nbsp;dirs&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Tree.node&nbsp;name&nbsp;(List.map&nbsp;Leaf&nbsp;files)&nbsp;::&nbsp;dirs
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;m&nbsp;=&nbsp;Tree.foldBack&nbsp;groupByDir&nbsp;t&nbsp;Map.empty
&nbsp;&nbsp;&nbsp;&nbsp;Map.foldBack&nbsp;addDir&nbsp;m&nbsp;[]&nbsp;|&gt;&nbsp;Tree.node&nbsp;destination</pre> </p> <p> This <code>moveTo</code> function looks, perhaps, overwhelming, but it's composed of three conceptual steps: <ol> <li>Create a map of destination folders (<code>m</code>).</li> <li>Create a list of branches from the map (<code>Map.foldBack addDir m []</code>).</li> <li>Create a tree from the list (<code>Tree.node destination</code>).</li> </ol> The <code>moveTo</code> function starts by folding the input data into a map 
<code>m</code>. The map is keyed by the directory name, which is formatted by the <code>dirNameOf</code> function. This function takes a <code>DateTime</code> as input and formats it to a <code>YYYY-MM</code> format. For example, December 20, 2018 becomes <code>"2018-12"</code>. </p> <p> The entire mapping step groups the <code>PhotoFile</code> values into a map of the type <code>Map&lt;string,FileInfo list&gt;</code>. All the image files taken in April 2014 are added to the list with the <code>"2014-04"</code> key, all the image files taken in July 2011 are added to the list with the <code>"2011-07"</code> key, and so on. </p> <p> In the next step, the <code>moveTo</code> function converts the map to a list of trees. This will be the branches (or sub-directories) of the <code>destination</code> directory. Because of the desired structure of the destination tree, this is a list of shallow branches. Each node contains only leaves. </p> <p> <img src="/content/binary/shallow-photo-destination-directories.png" alt="Shallow photo destination directories."> </p> <p> The only remaining step is to add that list of branches to a <code>destination</code> node. This is done by piping (<code>|&gt;</code>) the list of sub-directories into <code>Tree.node destination</code>. </p> <p> Since this is a <a href="https://en.wikipedia.org/wiki/Pure_function">pure function</a>, it's <a href="/2015/05/07/functional-design-is-intrinsically-testable">easy to unit test</a>. Just create some test cases and call the function. First, the test cases. 
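</p> <p> As a brief aside before the test cases, the grouping step can be tried in isolation. The following is only a sketch: it lifts <code>dirNameOf</code> and <code>groupByDir</code> out of <code>moveTo</code> and, as a simplifying assumption, folds over a plain list with <code>List.foldBack</code> instead of over a tree with <code>Tree.foldBack</code>. The <code>photos</code> and <code>names</code> values are hypothetical: </p> <p> <pre>open System
open System.IO

type PhotoFile = { File : FileInfo; TakenOn : DateTime }

// Same formatting as inside moveTo.
let dirNameOf (dt : DateTime) = sprintf "%d-%02d" dt.Year dt.Month

// Same grouping function as inside moveTo.
let groupByDir pf m =
    let key = dirNameOf pf.TakenOn
    let dir = Map.tryFind key m |&gt; Option.defaultValue []
    Map.add key (pf.File :: dir) m

// Hypothetical input; FileInfo only wraps a path and doesn't touch the disk here.
let photos = [
    { File = FileInfo "a"; TakenOn = DateTime (2010, 1, 12) };
    { File = FileInfo "b"; TakenOn = DateTime (2010, 3, 12) };
    { File = FileInfo "c"; TakenOn = DateTime (2010, 1, 21) } ]

// Assumption: List.foldBack stands in for Tree.foldBack in this sketch.
let m = List.foldBack groupByDir photos Map.empty

// Project to file names, since FileInfo has no structural equality.
let names = m |&gt; Map.map (fun _ files -&gt; files |&gt; List.map (fun (f : FileInfo) -&gt; f.Name))
// names = map [("2010-01", ["a"; "c"]); ("2010-03", ["b"])]</pre> </p> <p> Because the fold runs from the back, the files within each directory retain the order in which they appear in the input.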
</p> <p> In this code base, I'm using <a href="https://xunit.github.io">xUnit.net</a> 2.4.1, so I'll first create a set of test cases as a test-specific class: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;MoveToDestinationTestData&nbsp;()&nbsp;<span style="color:blue;">as</span>&nbsp;this&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">inherit</span>&nbsp;TheoryData&lt;Tree&lt;string,&nbsp;PhotoFile&gt;,&nbsp;string,&nbsp;Tree&lt;string,&nbsp;string&gt;&gt;&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;photoLeaf&nbsp;name&nbsp;(y,&nbsp;mth,&nbsp;d,&nbsp;h,&nbsp;m,&nbsp;s)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;{&nbsp;File&nbsp;=&nbsp;FileInfo&nbsp;name;&nbsp;TakenOn&nbsp;=&nbsp;DateTime&nbsp;(y,&nbsp;mth,&nbsp;d,&nbsp;h,&nbsp;m,&nbsp;s)&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>&nbsp;(2018,&nbsp;11,&nbsp;9,&nbsp;11,&nbsp;47,&nbsp;17), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;D&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(&nbsp;<span style="color:#a31515;">&quot;D&quot;</span>,&nbsp;[Node&nbsp;(<span style="color:#a31515;">&quot;2018-11&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>])])) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;S&quot;</span>,&nbsp;[photoLeaf&nbsp;<span style="color:#a31515;">&quot;4&quot;</span>&nbsp;(1972,&nbsp;6,&nbsp;6,&nbsp;16,&nbsp;15,&nbsp;0)]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;D&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;D&quot;</span>,&nbsp;[Node&nbsp;(<span 
style="color:#a31515;">&quot;1972-06&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;4&quot;</span>])])) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;S&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;L&quot;</span>&nbsp;(2002,&nbsp;10,&nbsp;12,&nbsp;17,&nbsp;16,&nbsp;15); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;J&quot;</span>&nbsp;(2007,&nbsp;4,&nbsp;21,&nbsp;17,&nbsp;18,&nbsp;19)]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;D&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;D&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2002-10&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;L&quot;</span>]); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2007-04&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;J&quot;</span>])])) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;(2010,&nbsp;1,&nbsp;12,&nbsp;17,&nbsp;16,&nbsp;15); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>&nbsp;(2010,&nbsp;3,&nbsp;12,&nbsp;17,&nbsp;16,&nbsp;15); 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>&nbsp;(2010,&nbsp;1,&nbsp;21,&nbsp;17,&nbsp;18,&nbsp;19)]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2010-01&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>;&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>]); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2010-03&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>])])) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;foo&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;bar&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;(2010,&nbsp;1,&nbsp;12,&nbsp;17,&nbsp;16,&nbsp;15); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>&nbsp;(2010,&nbsp;3,&nbsp;12,&nbsp;17,&nbsp;16,&nbsp;15); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>&nbsp;(2010,&nbsp;1,&nbsp;21,&nbsp;17,&nbsp;18,&nbsp;19)]); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span 
style="color:#a31515;">&quot;baz&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;d&quot;</span>&nbsp;(2010,&nbsp;3,&nbsp;1,&nbsp;2,&nbsp;3,&nbsp;4); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;e&quot;</span>&nbsp;(2011,&nbsp;3,&nbsp;4,&nbsp;3,&nbsp;2,&nbsp;1)])]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;qux&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;qux&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2010-01&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>;&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>]); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2010-03&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>;&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;d&quot;</span>]); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2011-03&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;e&quot;</span>])]))</pre> </p> <p> That looks like a lot of code, but is really just a list of test cases. Each test case is a triple of a source tree, a destination directory name, and an expected result (another tree). 
</p> <p> The test itself, on the other hand, is compact: </p> <p> <pre>[&lt;Theory;&nbsp;ClassData(typeof&lt;MoveToDestinationTestData&gt;)&gt;]
<span style="color:blue;">let</span>&nbsp;``Move&nbsp;to&nbsp;destination``&nbsp;source&nbsp;destination&nbsp;expected&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;Archive.moveTo&nbsp;destination&nbsp;source
&nbsp;&nbsp;&nbsp;&nbsp;expected&nbsp;=!&nbsp;Tree.map&nbsp;string&nbsp;actual</pre> </p> <p> The <code>=!</code> operator comes from <a href="https://github.com/SwensenSoftware/unquote">Unquote</a> and means something like <em>must equal</em>. It's an assertion that will throw an exception if <code>expected</code> isn't equal to <code>Tree.map string actual</code>. </p> <p> The reason that the assertion maps <code>actual</code> to a tree of strings is that <code>actual</code> is a <code>Tree&lt;string,FileInfo&gt;</code>, but <code>FileInfo</code> doesn't have structural equality. So either I had to implement a test-specific equality comparer for <code>FileInfo</code> (and for <code>Tree&lt;string,FileInfo&gt;</code>), or map the tree to something with proper equality, such as a <code>string</code>. I chose the latter. </p> <h3 id="abe95ba6865745bc9df8004079d8a250"> Calculating moves <a href="#abe95ba6865745bc9df8004079d8a250" title="permalink">#</a> </h3> <p> One pure step remains. The result of calling the <code>moveTo</code> function is a tree with the desired structure. In order to actually move the files, though, for each file you'll need to keep track of both the source path and the destination path. To make that explicit, you can define a type for that purpose: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;Move&nbsp;=&nbsp;{&nbsp;Source&nbsp;:&nbsp;FileInfo;&nbsp;Destination&nbsp;:&nbsp;FileInfo&nbsp;}</pre> </p> <p> A <code>Move</code> is simply a data structure. 
Contrast this with typical object-oriented design, where it would be a (possibly polymorphic) method on an object. In functional programming, you'll regularly model <em>intent</em> with a data structure. As long as intents remain data, you can easily manipulate them, and once you're done with that, you can run an interpreter over your data structure to perform the work you want accomplished. </p> <p> The unit test cases for the <code>moveTo</code> function suggest that file names are local file names like <code>"L"</code>, <code>"J"</code>, <code>"a"</code>, and so on. That was only to make the tests as compact as possible, since the function actually doesn't manipulate the specific <code>FileInfo</code> objects. </p> <p> In reality, the file names will most likely be longer, and they could also contain the full path, instead of the local path: <code>"C:\foo\bar\a.jpg"</code>. </p> <p> If you call <code>moveTo</code> with a tree where each leaf has a fully qualified path, the output tree will have the desired structure of the destination tree, but the leaves will still contain the full path to each source file. 
That means that you can calculate a <code>Move</code> for each file: </p> <p> <pre><span style="color:green;">//&nbsp;Tree&lt;string,FileInfo&gt;&nbsp;-&gt;&nbsp;Tree&lt;string,Move&gt;</span>
<span style="color:blue;">let</span>&nbsp;calculateMoves&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;replaceDirectory&nbsp;(f&nbsp;:&nbsp;FileInfo)&nbsp;d&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;FileInfo&nbsp;(Path.Combine&nbsp;(d,&nbsp;f.Name))
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;<span style="color:blue;">rec</span>&nbsp;imp&nbsp;path&nbsp;=&nbsp;<span style="color:blue;">function</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Leaf&nbsp;x&nbsp;<span style="color:blue;">-&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;{&nbsp;Source&nbsp;=&nbsp;x;&nbsp;Destination&nbsp;=&nbsp;replaceDirectory&nbsp;x&nbsp;path&nbsp;}
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Node&nbsp;(x,&nbsp;xs)&nbsp;<span style="color:blue;">-&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;newNPath&nbsp;=&nbsp;Path.Combine&nbsp;(path,&nbsp;x)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Tree.node&nbsp;newNPath&nbsp;(List.map&nbsp;(imp&nbsp;newNPath)&nbsp;xs)
&nbsp;&nbsp;&nbsp;&nbsp;imp&nbsp;<span style="color:#a31515;">&quot;&quot;</span></pre> </p> <p> This function takes as input a <code>Tree&lt;string,FileInfo&gt;</code>, which is compatible with the output of <code>moveTo</code>. It returns a <code>Tree&lt;string,Move&gt;</code>, i.e. a tree where the leaves are <code>Move</code> values. </p> <p> Earlier, I wrote that you can implement desired <code>Tree</code> functionality with the <code>cata</code> function, but that was a simplification. If you can implement the functionality of <code>calculateMoves</code> with <code>cata</code>, I don't know how. 
You can, however, implement it using explicit pattern matching and simple recursion. </p> <p> The <code>imp</code> function builds up a file path as it recursively negotiates the tree. All <code>Leaf</code> nodes are converted to a <code>Move</code> value using the leaf node's current <code>FileInfo</code> value as the <code>Source</code>, and the <code>path</code> to figure out the desired <code>Destination</code>. </p> <p> This code is still easy to unit test. First, test cases: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;CalculateMovesTestData&nbsp;()&nbsp;<span style="color:blue;">as</span>&nbsp;this&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">inherit</span>&nbsp;TheoryData&lt;Tree&lt;string,&nbsp;FileInfo&gt;,&nbsp;Tree&lt;string,&nbsp;(string&nbsp;*&nbsp;string)&gt;&gt;&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;(Leaf&nbsp;(FileInfo&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>),&nbsp;Leaf&nbsp;(<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>)) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;[Leaf&nbsp;(FileInfo&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>)]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;[Leaf&nbsp;(<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>))])) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;[Leaf&nbsp;(FileInfo&nbsp;<span 
style="color:#a31515;">&quot;1&quot;</span>);&nbsp;Leaf&nbsp;(FileInfo&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>)]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>)); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(<span style="color:#a31515;">&quot;2&quot;</span>,&nbsp;Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>))])) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;b&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(FileInfo&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(FileInfo&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>)]); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;c&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(FileInfo&nbsp;<span style="color:#a31515;">&quot;3&quot;</span>)])]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;[ 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>),&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>)); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(<span style="color:#a31515;">&quot;2&quot;</span>,&nbsp;Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>))]); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>),&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(<span style="color:#a31515;">&quot;3&quot;</span>,&nbsp;Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;3&quot;</span>))])]))</pre> </p> <p> The test cases in this parametrised test are tuples of an input tree and the expected tree. 
For each test case, the test calls the <code>Archive.calculateMoves</code> function with <code>tree</code> and asserts that the <code>actual</code> tree is equal to the <code>expected</code> tree: </p> <p> <pre>[&lt;Theory;&nbsp;ClassData(typeof&lt;CalculateMovesTestData&gt;)&gt;]
<span style="color:blue;">let</span>&nbsp;``Calculate&nbsp;moves``&nbsp;tree&nbsp;expected&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;Archive.calculateMoves&nbsp;tree
&nbsp;&nbsp;&nbsp;&nbsp;expected&nbsp;=!&nbsp;Tree.map&nbsp;(<span style="color:blue;">fun</span>&nbsp;m&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;(m.Source.ToString&nbsp;(),&nbsp;m.Destination.ToString&nbsp;()))&nbsp;actual</pre> </p> <p> Again, the test maps <code>FileInfo</code> objects to <code>strings</code> to support easy comparison. </p> <p> That's all the pure code you need in order to implement the desired functionality. Now you only need to write some code that loads a tree from disk, and imprints a destination tree to disk, as well as the code that composes it all. </p> <h3 id="bac6be79cf8c44a7b47923e2ec90d99f"> Loading a tree from disk <a href="#bac6be79cf8c44a7b47923e2ec90d99f" title="permalink">#</a> </h3> <p> The remaining code in this article is impure. You could put it in dedicated modules, but for this program, you're only going to need three functions and a bit of composition code, so you could also just put it all in the <code>Program</code> module. That's what I did. </p> <p> To load a tree from disk, you'll need a root directory, under which you load the entire tree. 
Given a directory path, you read a tree using a recursive function like this: </p> <p> <pre><span style="color:green;">//&nbsp;string&nbsp;-&gt;&nbsp;Tree&lt;string,string&gt;</span> <span style="color:blue;">let</span>&nbsp;<span style="color:blue;">rec</span>&nbsp;readTree&nbsp;path&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;File.Exists&nbsp;path &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">then</span>&nbsp;Leaf&nbsp;path &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;dirsAndFiles&nbsp;=&nbsp;Directory.EnumerateFileSystemEntries&nbsp;path &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;branches&nbsp;=&nbsp;Seq.map&nbsp;readTree&nbsp;dirsAndFiles&nbsp;|&gt;&nbsp;Seq.toList &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(path,&nbsp;branches)</pre> </p> <p> This recursive function starts by checking whether the <code>path</code> is a file that exists. If it does, the path is a file, so it creates a new <code>Leaf</code> with that path. </p> <p> If <code>path</code> isn't a file, it's a directory. In that case, use <code>Directory.EnumerateFileSystemEntries</code> to enumerate all the directories and files in that directory, and map all those directory entries recursively. That produces all the <code>branches</code> for the current node. Finally, return a new <code>Node</code> with the <code>path</code> and the <code>branches</code>. </p> <h3 id="7f5e06eb61024264ad214d41b63a8a74"> Loading metadata <a href="#7f5e06eb61024264ad214d41b63a8a74" title="permalink">#</a> </h3> <p> The <code>readTree</code> function only produces a tree with <code>string</code> leaves, while the program requires a tree with <code>PhotoFile</code> leaves. 
You'll need to read the <a href="https://en.wikipedia.org/wiki/Exif">Exif</a> metadata from each file and enrich the tree with the <em>date-taken</em> data. </p> <p> In this code base, I've written a little <code>Photo</code> module to extract the desired metadata from an image file. I'm not going to list all the code here; if you're interested, the code is <a href="https://github.com/ploeh/picture-archivist">available on GitHub</a>. The <code>Photo</code> module enables you to write an impure operation like this: </p> <p> <pre><span style="color:green;">//&nbsp;FileInfo&nbsp;-&gt;&nbsp;PhotoFile&nbsp;option</span>
<span style="color:blue;">let</span>&nbsp;readPhoto&nbsp;file&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;Photo.extractDateTaken&nbsp;file
&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Option.map&nbsp;(<span style="color:blue;">fun</span>&nbsp;dateTaken&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;{&nbsp;File&nbsp;=&nbsp;file;&nbsp;TakenOn&nbsp;=&nbsp;dateTaken&nbsp;})</pre> </p> <p> This operation can fail for various reasons: <ul> <li>The file may not exist.</li> <li>The file exists, but has no metadata.</li> <li>The file has metadata, but no <em>date-taken</em> metadata.</li> <li>The <em>date-taken</em> metadata string is malformed.</li> </ul> When you traverse a <code>Tree&lt;string,FileInfo&gt;</code> with <code>readPhoto</code> (that is, once the <code>string</code> leaves have been mapped to <code>FileInfo</code> objects), you'll get a <code>Tree&lt;string,PhotoFile option&gt;</code>. That's when you'll need <code>Tree.choose</code>. You'll see this soon. </p> <h3 id="59159ef499884e10ae92e5ef6e666c36"> Writing a tree to disk <a href="#59159ef499884e10ae92e5ef6e666c36" title="permalink">#</a> </h3> <p> The above <code>calculateMoves</code> function creates a <code>Tree&lt;string,Move&gt;</code>. The final piece of impure code you'll need to write is an operation that traverses such a tree and executes each <code>Move</code>. 
</p> <p> <pre><span style="color:green;">//&nbsp;Tree&lt;&#39;a,Move&gt;&nbsp;-&gt;&nbsp;unit</span> <span style="color:blue;">let</span>&nbsp;writeTree&nbsp;t&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;copy&nbsp;m&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Directory.CreateDirectory&nbsp;m.Destination.DirectoryName&nbsp;|&gt;&nbsp;ignore &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;m.Source.CopyTo&nbsp;m.Destination.FullName&nbsp;|&gt;&nbsp;ignore &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;printfn&nbsp;<span style="color:#a31515;">&quot;Copied&nbsp;to&nbsp;%s&quot;</span>&nbsp;m.Destination.FullName &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;compareFiles&nbsp;m&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;sourceStream&nbsp;=&nbsp;File.ReadAllBytes&nbsp;m.Source.FullName &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;destinationStream&nbsp;=&nbsp;File.ReadAllBytes&nbsp;m.Destination.FullName &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sourceStream&nbsp;=&nbsp;destinationStream &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;move&nbsp;m&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;copy&nbsp;m &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;compareFiles&nbsp;m&nbsp;<span style="color:blue;">then</span>&nbsp;m.Source.Delete&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;Tree.iter&nbsp;move&nbsp;t</pre> </p> <p> The <code>writeTree</code> function traverses the input tree, and for each <code>Move</code>, it first copies the file, then it verifies that the copy was successful, and finally, if that's the case, it deletes the source file. 
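</p> <p> Since all the decisions are already captured in the <code>Tree&lt;&#39;a,Move&gt;</code> value, you could also inspect them without touching the disk. What follows is only a sketch under stated assumptions: <code>describeTree</code> is a hypothetical name that isn't part of the code base, and the <code>Tree</code>, <code>Move</code>, and <code>Tree.iter</code> definitions are restated the way I assume them from how <code>writeTree</code> uses them, so that the example stands alone: </p> <p> <pre>open System.IO

// Restated so that the sketch compiles on its own.
type Tree&lt;&#39;a, &#39;b&gt; = Node of &#39;a * Tree&lt;&#39;a, &#39;b&gt; list | Leaf of &#39;b

// Assumed shape of Move, inferred from how writeTree uses it.
type Move = { Source : FileInfo; Destination : FileInfo }

module Tree =
    // Assumed behaviour of Tree.iter: run f on every leaf, ignore the nodes.
    let rec iter f = function
        | Leaf x -&gt; f x
        | Node (_, xs) -&gt; List.iter (iter f) xs

// Tree&lt;&#39;a,Move&gt; -&gt; unit
// Hypothetical dry-run counterpart to writeTree: it only prints.
let describeTree t =
    let describe m =
        printfn &quot;Would copy %s to %s&quot; m.Source.FullName m.Destination.FullName
    Tree.iter describe t

// Usage: FileInfo only wraps a path, so the file doesn&#39;t have to exist.
let move =
    { Source = FileInfo &quot;1.jpg&quot;
      Destination = FileInfo (Path.Combine (&quot;2014-04&quot;, &quot;1.jpg&quot;)) }
describeTree (Node (&quot;2014-04&quot;, [ Leaf move ]))</pre> </p> <p> Applying such a function instead of <code>writeTree</code> prints one line per planned move and leaves both directories as they were, which may be a useful sanity check before running the real thing. 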
</p> <h3 id="f30093164b184bbf877f307fa4cf4c63"> Composition <a href="#f30093164b184bbf877f307fa4cf4c63" title="permalink">#</a> </h3> <p> You can now compose an <em>impure-pure-impure sandwich</em> from all the Lego pieces: </p> <p> <pre><span style="color:green;">//&nbsp;string&nbsp;-&gt;&nbsp;string&nbsp;-&gt;&nbsp;unit</span>
<span style="color:blue;">let</span>&nbsp;movePhotos&nbsp;source&nbsp;destination&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;sourceTree&nbsp;=&nbsp;readTree&nbsp;source&nbsp;|&gt;&nbsp;Tree.map&nbsp;FileInfo
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;photoTree&nbsp;=&nbsp;Tree.choose&nbsp;readPhoto&nbsp;sourceTree
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;destinationTree&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Option.map&nbsp;(Archive.moveTo&nbsp;destination&nbsp;&gt;&gt;&nbsp;Archive.calculateMoves)&nbsp;photoTree
&nbsp;&nbsp;&nbsp;&nbsp;Option.iter&nbsp;writeTree&nbsp;destinationTree</pre> </p> <p> First, you load the <code>sourceTree</code> using the <code>readTree</code> operation. This returns a <code>Tree&lt;string,string&gt;</code>, so map the leaves to <code>FileInfo</code> objects. You then load the image metadata by traversing <code>sourceTree</code> with <code>Tree.choose readPhoto</code>. Each call to <code>readPhoto</code> produces a <code>PhotoFile option</code>, so this is where you want to use <code>Tree.choose</code> to throw all the <code>None</code> values away. </p> <p> Those two lines of code are the initial impure step of the sandwich (yes: mixed metaphors, I know). </p> <p> The pure part of the sandwich is the composition of the pure functions <code>moveTo</code> and <code>calculateMoves</code>. Since <code>photoTree</code> is a <code>Tree&lt;string,PhotoFile&gt; option</code>, you'll need to perform that transformation inside of <code>Option.map</code>. 
The resulting <code>destinationTree</code> is a <code>Tree&lt;string,Move&gt; option</code>. </p> <p> The final, impure step of the sandwich, then, is to apply all the moves with <code>writeTree</code>. </p> <h3 id="ab0013f79c184586a10aa014db496bef"> Execution <a href="#ab0013f79c184586a10aa014db496bef" title="permalink">#</a> </h3> <p> The <code>movePhotos</code> operation takes <code>source</code> and <code>destination</code> arguments. You could hypothetically call it from a rich client or a background process, but here I'll just call it from a command-line program. The <code>main</code> operation will have to parse the input arguments and call <code>movePhotos</code>: </p> <p> <pre>[&lt;EntryPoint&gt;]
<span style="color:blue;">let</span>&nbsp;main&nbsp;argv&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;argv&nbsp;<span style="color:blue;">with</span>
&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;[|source;&nbsp;destination|]&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;movePhotos&nbsp;source&nbsp;destination
&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;printfn&nbsp;<span style="color:#a31515;">&quot;Please&nbsp;provide&nbsp;source&nbsp;and&nbsp;destination&nbsp;directories&nbsp;as&nbsp;arguments.&quot;</span>
&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;<span style="color:green;">//&nbsp;return&nbsp;an&nbsp;integer&nbsp;exit&nbsp;code</span></pre> </p> <p> You could write more sophisticated parsing of the program arguments, but that's not the topic of this article, so I only wrote the bare minimum required to get the program working. 
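</p> <p> If you do want a little more, one possible direction is to turn the arguments into a <code>Result</code> and let <code>main</code> pick the exit code. This is purely a sketch: the <code>parseArgs</code> function and its error messages are hypothetical and not part of the code base, and <code>movePhotos</code> is replaced by a printing stand-in so that the example compiles on its own: </p> <p> <pre>open System.IO

// Stand-in for the article&#39;s movePhotos, so that the sketch is self-contained.
let movePhotos (source : string) (destination : string) =
    printfn &quot;Moving photos from %s to %s&quot; source destination

// string [] -&gt; Result&lt;string * string, string&gt;
// Hypothetical helper: accepts exactly two arguments and checks the source.
let parseArgs argv =
    match argv with
    | [| source; destination |] -&gt;
        if Directory.Exists source
        then Ok (source, destination)
        else Error (sprintf &quot;Source directory not found: %s&quot; source)
    | _ -&gt; Error &quot;Please provide source and destination directories as arguments.&quot;

[&lt;EntryPoint&gt;]
let main argv =
    match parseArgs argv with
    | Ok (source, destination) -&gt;
        movePhotos source destination
        0 // success
    | Error message -&gt;
        eprintfn &quot;%s&quot; message
        1 // non-zero exit code signals failure to the caller</pre> </p> <p> The only design choice worth noting is that the check for the source directory stays in the impure shell of the program, so the pure functions remain unaware of it. 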
</p> <p> You can now compile and run the program: </p> <p> <pre>$ ./ArchivePictures "C:\Users\mark\Desktop\Test" "C:\Users\mark\Desktop\Test-Out"
Copied to C:\Users\mark\Desktop\Test-Out\2003-04\2003-04-29 15.11.50.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2011-07\2011-07-10 13.09.36.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2014-04\2014-04-18 14.05.02.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2014-04\2014-04-17 17.11.40.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2014-05\2014-05-23 16.07.20.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2014-06\2014-06-21 16.48.40.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2014-06\2014-06-30 15.44.52.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2016-05\2016-05-01 09.25.23.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2017-08\2017-08-22 19.53.28.jpg</pre> </p> <p> This does indeed produce the expected destination directory structure. </p> <p> <img src="/content/binary/picture-archivist-destination-directory.png" alt="Seven example directories with pictures."> </p> <p> It's always nice when something turns out to work in practice, as well as in theory. </p> <h3 id="3e4503b89d8f4b81b8b9cac9d1f39021"> Summary <a href="#3e4503b89d8f4b81b8b9cac9d1f39021" title="permalink">#</a> </h3> <p> <a href="/2018/11/19/functional-architecture-a-definition">Functional software architecture</a> involves separating pure from impure code so that no pure functions invoke impure operations. Often, you can achieve that with what I call the <em>impure-pure-impure sandwich</em> architecture. In this example, you saw how to model the file system as a tree. This enables you to separate the impure file interactions from the pure program logic. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. 
Picture archivist in Haskell https://blog.ploeh.dk/2019/09/09/picture-archivist-in-haskell 2019-09-09T08:19:00+00:00 Mark Seemann <div id="post"> <p> <em>A comprehensive code example showing how to implement a functional architecture in Haskell.</em> </p> <p> This article shows how to implement the <a href="/2019/08/26/functional-file-system">picture archivist architecture described in the previous article</a>. In short, the task is to move some image files to directories based on their date-taken metadata. The architectural idea is to load a directory structure from disk into an in-memory tree, manipulate that tree, and use the resulting tree to perform the desired actions: </p> <p> <img src="/content/binary/functional-file-system-interaction.png" alt="A functional program typically loads data, transforms it, and stores it again."> </p> <p> Much of the program will manipulate the tree data, which is immutable. </p> <h3 id="770cf37f0e3c457782ea20b53257f2d1"> Tree <a href="#770cf37f0e3c457782ea20b53257f2d1" title="permalink">#</a> </h3> <p> You can start by defining a <a href="https://en.wikipedia.org/wiki/Rose_tree">rose tree</a>: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;Tree&nbsp;a&nbsp;b&nbsp;=&nbsp;Node&nbsp;a&nbsp;[Tree&nbsp;a&nbsp;b]&nbsp;|&nbsp;Leaf&nbsp;b&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> If you wanted to, you could put all the <code>Tree</code> code in a reusable library, because none of it is coupled to a particular application, such as <a href="https://amzn.to/2V06Kji">moving pictures</a>. You could also write a comprehensive test suite for the following functions, but in this article, I'll skip that. </p> <p> Notice that this sort of tree explicitly distinguishes between internal and leaf nodes. 
This is necessary because you'll need to keep track of the directory names (the internal nodes), while at the same time you'll want to enrich the leaves with additional data - data that you can't meaningfully add to the internal nodes. You'll see this later in the article. </p> <p> The <a href="/2019/08/05/rose-tree-catamorphism">rose tree catamorphism</a> is this <code>foldTree</code> function: </p> <p> <pre><span style="color:#2b91af;">foldTree</span>&nbsp;::&nbsp;(a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;[c]&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;(b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Tree</span>&nbsp;a&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c foldTree&nbsp;&nbsp;_&nbsp;fl&nbsp;(Leaf&nbsp;x)&nbsp;=&nbsp;fl&nbsp;x foldTree&nbsp;fn&nbsp;fl&nbsp;(Node&nbsp;x&nbsp;xs)&nbsp;=&nbsp;fn&nbsp;x&nbsp;$&nbsp;foldTree&nbsp;fn&nbsp;fl&nbsp;&lt;$&gt;&nbsp;xs</pre> </p> <p> Sometimes I name the catamorphism <code>cata</code>, sometimes something like <code>tree</code>, but using a library like <code>Data.Tree</code> as another source of inspiration, in this article I chose to name it <code>foldTree</code>. </p> <p> In this article, tree functionality is (with one exception) directly or transitively implemented with <code>foldTree</code>. </p> <h3 id="f5541d8a36b04cf9a455824c5f3a21c7"> Filtering trees <a href="#f5541d8a36b04cf9a455824c5f3a21c7" title="permalink">#</a> </h3> <p> It'll be useful to be able to filter the contents of a tree. For example, the picture archivist program will only move image files with valid metadata. This means that it'll need to filter out all files that aren't image files, as well as image files without valid metadata. </p> <p> It turns out that it'll be useful to supply a function that throws away <code>Nothing</code> values from a tree of <code>Maybe</code> leaves. 
This is similar to the <code>catMaybes</code> function from <code>Data.Maybe</code>, so I call it <code>catMaybeTree</code>: </p> <p> <pre><span style="color:#2b91af;">catMaybeTree</span>&nbsp;::&nbsp;<span style="color:blue;">Tree</span>&nbsp;a&nbsp;(<span style="color:#2b91af;">Maybe</span>&nbsp;b)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;(<span style="color:blue;">Tree</span>&nbsp;a&nbsp;b)
catMaybeTree&nbsp;=&nbsp;foldTree&nbsp;(\x&nbsp;-&gt;&nbsp;Just&nbsp;.&nbsp;Node&nbsp;x&nbsp;.&nbsp;catMaybes)&nbsp;(<span style="color:blue;">fmap</span>&nbsp;Leaf)</pre> </p> <p> You may find the type of the function surprising. Why does it return a <code>Maybe Tree</code>, instead of simply a <code>Tree</code>? And if you accept the type as given, isn't this simply the <code>sequence</code> function? </p> <p> While <code>catMaybes</code> simply returns a list, it can do this because lists can be empty. This <code>Tree</code> type, on the other hand, can't be empty. If the purpose of <code>catMaybeTree</code> is to throw away all <code>Nothing</code> values, then how do you return a tree from <code>Leaf Nothing</code>? </p> <p> You can't return a <code>Leaf</code> because you have no value to put in the leaf. Similarly, you can't return a <code>Node</code> because, again, you have no value to put in the node. </p> <p> In order to handle this edge case, then, you'll have to return <code>Nothing</code>: </p> <p> <pre>Prelude Tree&gt; catMaybeTree $ Leaf Nothing
Nothing</pre> </p> <p> Isn't this the same as <code>sequence</code>, then? 
It's not, because <code>sequence</code> short-circuits all data, as this list example shows: </p> <p> <pre>Prelude&gt; sequence [Just 42, Nothing, Just 2112]
Nothing</pre> </p> <p> Contrast this with the behaviour of <code>catMaybes</code>: </p> <p> <pre>Prelude Data.Maybe&gt; catMaybes [Just 42, Nothing, Just 2112]
[42,2112]</pre> </p> <p> You've yet to see the <code>Traversable</code> instance for <code>Tree</code>, but it behaves in the same way: </p> <p> <pre>Prelude Tree&gt; sequence $ Node "Foo" [Leaf (Just 42), Leaf Nothing, Leaf (Just 2112)]
Nothing</pre> </p> <p> The <code>catMaybeTree</code> function, on the other hand, returns a filtered tree: </p> <p> <pre>Prelude Tree&gt; catMaybeTree $ Node "Foo" [Leaf (Just 42), Leaf Nothing, Leaf (Just 2112)]
Just (Node "Foo" [Leaf 42,Leaf 2112])</pre> </p> <p> While the resulting tree is wrapped in a <code>Just</code> case, the leaves contain unwrapped values. </p> <h3 id="5f0287c6d6fe42f3ad73a8e31ba9b3c4"> Instances <a href="#5f0287c6d6fe42f3ad73a8e31ba9b3c4" title="permalink">#</a> </h3> <p> The <a href="/2019/08/05/rose-tree-catamorphism">article about the rose tree catamorphism</a> already covered how to add instances of <code>Bifunctor</code>, <code>Bifoldable</code>, and <code>Bitraversable</code>, so I'll give this only cursory treatment. Refer to that article for a more detailed treatment. The code that accompanies that article also has <a href="http://hackage.haskell.org/package/QuickCheck">QuickCheck</a> properties that verify the various laws associated with those instances. 
Here, I'll just list the instances without further comment: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Bifunctor</span>&nbsp;<span style="color:blue;">Tree</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;bimap&nbsp;f&nbsp;s&nbsp;=&nbsp;foldTree&nbsp;(Node&nbsp;.&nbsp;f)&nbsp;(Leaf&nbsp;.&nbsp;s) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Bifoldable</span>&nbsp;<span style="color:blue;">Tree</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;bifoldMap&nbsp;f&nbsp;=&nbsp;foldTree&nbsp;(\x&nbsp;xs&nbsp;-&gt;&nbsp;f&nbsp;x&nbsp;&lt;&gt;&nbsp;mconcat&nbsp;xs) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Bitraversable</span>&nbsp;<span style="color:blue;">Tree</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;bitraverse&nbsp;f&nbsp;s&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;foldTree&nbsp;(\x&nbsp;xs&nbsp;-&gt;&nbsp;Node&nbsp;&lt;$&gt;&nbsp;f&nbsp;x&nbsp;&lt;*&gt;&nbsp;sequenceA&nbsp;xs)&nbsp;(<span style="color:blue;">fmap</span>&nbsp;Leaf&nbsp;.&nbsp;s) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;(<span style="color:blue;">Tree</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;=&nbsp;second <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Foldable</span>&nbsp;(<span style="color:blue;">Tree</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;foldMap&nbsp;=&nbsp;bifoldMap&nbsp;mempty <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Traversable</span>&nbsp;(<span style="color:blue;">Tree</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;sequenceA&nbsp;=&nbsp;bisequenceA&nbsp;.&nbsp;first&nbsp;pure</pre> </p> <p> The picture archivist program isn't going to explicitly need all of these, but transitively, it will. 
</p> <h3 id="d1bbd6ef895f45619822126f44bf6bfb"> Moving pictures <a href="#d1bbd6ef895f45619822126f44bf6bfb" title="permalink">#</a> </h3> <p> So far, all the code shown here could be in a general-purpose reusable library, since it contains no functionality specifically related to image files. The rest of the code in this article, however, will be specific to the program. I'll put the domain model code in another module and import some functionality: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Archive&nbsp;<span style="color:blue;">where</span> <span style="color:blue;">import</span>&nbsp;Data.Time <span style="color:blue;">import</span>&nbsp;Text.Printf <span style="color:blue;">import</span>&nbsp;System.FilePath <span style="color:blue;">import</span>&nbsp;<span style="color:blue;">qualified</span>&nbsp;Data.Map.Strict&nbsp;<span style="color:blue;">as</span>&nbsp;Map <span style="color:blue;">import</span>&nbsp;Tree</pre> </p> <p> Notice that <code>Tree</code> is one of the imported modules. </p> <p> Later, we'll look at how to load a tree from the file system, but for now, we'll just pretend that we have such a tree. </p> <p> The major logic of the program is to create a destination tree based on a source tree. The leaves of the tree will have to carry some extra information apart from a file path, so you can introduce a specific type to capture that information: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;PhotoFile&nbsp;= &nbsp;&nbsp;PhotoFile&nbsp;{&nbsp;photoFileName&nbsp;::&nbsp;FilePath,&nbsp;takenOn&nbsp;::&nbsp;LocalTime&nbsp;} &nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> A <code>PhotoFile</code> not only contains the file path for an image file, but also the date the photo was taken. 
This date can be extracted from the file's metadata, but that's an impure operation, so we'll delegate that work to the start of the program. We'll return to that later. </p> <p> Given a source tree of <code>PhotoFile</code> leaves, though, the program must produce a destination tree of files: </p> <p> <pre><span style="color:#2b91af;">moveTo</span>&nbsp;::&nbsp;(<span style="color:blue;">Foldable</span>&nbsp;t,&nbsp;<span style="color:blue;">Ord</span>&nbsp;a,&nbsp;<span style="color:blue;">PrintfType</span>&nbsp;a)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;t&nbsp;<span style="color:blue;">PhotoFile</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Tree</span>&nbsp;a&nbsp;<span style="color:#2b91af;">FilePath</span> moveTo&nbsp;destination&nbsp;= &nbsp;&nbsp;Node&nbsp;destination&nbsp;.&nbsp;Map.foldrWithKey&nbsp;addDir&nbsp;<span style="color:blue;">[]</span>&nbsp;.&nbsp;<span style="color:blue;">foldr</span>&nbsp;groupByDir&nbsp;Map.empty &nbsp;&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;&nbsp;&nbsp;dirNameOf&nbsp;(LocalTime&nbsp;d&nbsp;_)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;(y,&nbsp;m,&nbsp;_)&nbsp;=&nbsp;toGregorian&nbsp;d&nbsp;<span style="color:blue;">in</span>&nbsp;printf&nbsp;<span style="color:#a31515;">&quot;%d-%02d&quot;</span>&nbsp;y&nbsp;m &nbsp;&nbsp;&nbsp;&nbsp;groupByDir&nbsp;(PhotoFile&nbsp;fileName&nbsp;t)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Map.insertWith&nbsp;<span style="color:#2b91af;">(++)</span>&nbsp;(dirNameOf&nbsp;t)&nbsp;[fileName] &nbsp;&nbsp;&nbsp;&nbsp;addDir&nbsp;name&nbsp;files&nbsp;dirs&nbsp;=&nbsp;Node&nbsp;name&nbsp;(Leaf&nbsp;&lt;$&gt;&nbsp;files)&nbsp;:&nbsp;dirs</pre> </p> <p> This <code>moveTo</code> function looks, perhaps, overwhelming, but it's composed of only three steps: <ol> <li>Create a map of destination folders (<code>foldr groupByDir Map.empty</code>).</li> 
<li>Create a list of branches from the map (<code>Map.foldrWithKey addDir []</code>).</li> <li>Create a tree from the list (<code>Node destination</code>).</li> </ol> Recall that when Haskell functions are composed with the <code>.</code> operator, you'll have to read the composition from right to left. </p> <p> Notice that this function works with any <code>Foldable</code> data container, so it'd work with lists and other data structures besides trees. </p> <p> The <code>moveTo</code> function starts by folding the input data into a map. The map is keyed by the directory name, which is formatted by the <code>dirNameOf</code> function. This function takes a <code>LocalTime</code> as input and formats it to a <code>YYYY-MM</code> format. For example, December 20, 2018 becomes <code>"2018-12"</code>. </p> <p> The entire mapping step groups the <code>PhotoFile</code> values into a map of the type <code>Map a [FilePath]</code>. All the image files taken in April 2014 are added to the list with the <code>"2014-04"</code> key, all the image files taken in July 2011 are added to the list with the <code>"2011-07"</code> key, and so on. </p> <p> In the next step, the <code>moveTo</code> function converts the map to a list of trees. This will be the branches (or sub-directories) of the <code>destination</code> directory. Because of the desired structure of the destination tree, this is a list of shallow branches. Each node contains only leaves. </p> <p> <img src="/content/binary/shallow-photo-destination-directories.png" alt="Shallow photo destination directories."> </p> <p> The only remaining step is to add that list of branches to a <code>destination</code> node. </p> <p> Since this is a <a href="https://en.wikipedia.org/wiki/Pure_function">pure function</a>, it's <a href="/2015/05/07/functional-design-is-intrinsically-testable">easy to unit test</a>. 
Just create some input values and call the function: </p> <p> <pre><span style="color:#a31515;">&quot;Move&nbsp;to&nbsp;destination&quot;</span>&nbsp;~:&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;(source,&nbsp;destination,&nbsp;expected)&nbsp;&lt;- &nbsp;&nbsp;&nbsp;&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>&nbsp;$&nbsp;lt&nbsp;2018&nbsp;11&nbsp;9&nbsp;11&nbsp;47&nbsp;17 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;<span style="color:#a31515;">&quot;D&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;D&quot;</span>&nbsp;[Node&nbsp;<span style="color:#a31515;">&quot;2018-11&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>]]) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;S&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;4&quot;</span>&nbsp;$&nbsp;lt&nbsp;1972&nbsp;6&nbsp;6&nbsp;16&nbsp;15&nbsp;00] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;<span style="color:#a31515;">&quot;D&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;D&quot;</span>&nbsp;[Node&nbsp;<span style="color:#a31515;">&quot;1972-06&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;4&quot;</span>]]) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;S&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;L&quot;</span>&nbsp;$&nbsp;lt&nbsp;2002&nbsp;10&nbsp;12&nbsp;17&nbsp;16&nbsp;15, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span 
style="color:#a31515;">&quot;J&quot;</span>&nbsp;$&nbsp;lt&nbsp;2007&nbsp;4&nbsp;21&nbsp;17&nbsp;18&nbsp;19] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;<span style="color:#a31515;">&quot;D&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;D&quot;</span>&nbsp;[Node&nbsp;<span style="color:#a31515;">&quot;2002-10&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;L&quot;</span>],&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;2007-04&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;J&quot;</span>]]) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;$&nbsp;lt&nbsp;2010&nbsp;1&nbsp;12&nbsp;17&nbsp;16&nbsp;15, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>&nbsp;$&nbsp;lt&nbsp;2010&nbsp;3&nbsp;12&nbsp;17&nbsp;16&nbsp;15, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>&nbsp;$&nbsp;lt&nbsp;2010&nbsp;1&nbsp;21&nbsp;17&nbsp;18&nbsp;19] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;<span style="color:#a31515;">&quot;2&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;2010-01&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>], &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;2010-03&quot;</span>&nbsp;[Leaf&nbsp;<span 
style="color:#a31515;">&quot;b&quot;</span>]]) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;foo&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;bar&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;$&nbsp;lt&nbsp;2010&nbsp;1&nbsp;12&nbsp;17&nbsp;16&nbsp;15, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>&nbsp;$&nbsp;lt&nbsp;2010&nbsp;3&nbsp;12&nbsp;17&nbsp;16&nbsp;15, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>&nbsp;$&nbsp;lt&nbsp;2010&nbsp;1&nbsp;21&nbsp;17&nbsp;18&nbsp;19], &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;baz&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;d&quot;</span>&nbsp;$&nbsp;lt&nbsp;2010&nbsp;3&nbsp;1&nbsp;2&nbsp;3&nbsp;4, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;e&quot;</span>&nbsp;$&nbsp;lt&nbsp;2011&nbsp;3&nbsp;4&nbsp;3&nbsp;2&nbsp;1 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;]] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;<span style="color:#a31515;">&quot;qux&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;qux&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;2010-01&quot;</span>&nbsp;[Leaf&nbsp;<span 
style="color:#a31515;">&quot;a&quot;</span>,&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>], &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;2010-03&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>,&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;d&quot;</span>], &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;2011-03&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;e&quot;</span>]]) &nbsp;&nbsp;&nbsp;&nbsp;] &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;moveTo&nbsp;destination&nbsp;source &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;expected&nbsp;~=?&nbsp;actual</pre> </p> <p> This is an <a href="/2018/05/07/inlined-hunit-test-lists">inlined</a> <a href="/2018/04/30/parametrised-unit-tests-in-haskell">parametrised HUnit test</a>. While it looks like a big unit test, it still follows my <a href="/2013/06/24/a-heuristic-for-formatting-code-according-to-the-aaa-pattern">test formatting heuristic</a>. There are only three expressions, but the <em>arrange</em> expression is big because it creates a list of test cases. </p> <p> Each test case is a triple of a <code>source</code> tree, a <code>destination</code> directory name, and an <code>expected</code> result. In order to make the test data code more compact, it utilises this test-specific helper function: </p> <p> <pre>lt&nbsp;y&nbsp;mth&nbsp;d&nbsp;h&nbsp;m&nbsp;s&nbsp;=&nbsp;LocalTime&nbsp;(fromGregorian&nbsp;y&nbsp;mth&nbsp;d)&nbsp;(TimeOfDay&nbsp;h&nbsp;m&nbsp;s)</pre> </p> <p> For each test case, the test calls the <code>moveTo</code> function with the <code>destination</code> directory name and the <code>source</code> tree. It then asserts that the <code>expected</code> value is equal to the <code>actual</code> value. 
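</p> <p> If you'd like to see the grouping step of <code>moveTo</code> in isolation, here's a minimal sketch of the idea. It isn't the code from the repository: <code>monthKey</code> and <code>groupByMonth</code> are hypothetical names, <code>monthKey</code> is only a stand-in for <code>dirNameOf</code>, and the sketch works on a plain list of pairs instead of <code>PhotoFile</code> values. It assumes the <code>containers</code> and <code>time</code> packages, which normally ship with GHC. </p> <p> <pre>import qualified Data.Map.Strict as Map
import Data.Time
  (LocalTime (..), TimeOfDay (..), defaultTimeLocale, formatTime, fromGregorian)

-- Hypothetical stand-in for dirNameOf: format a LocalTime as YYYY-MM.
monthKey :: LocalTime -&gt; String
monthKey = formatTime defaultTimeLocale "%Y-%m"

-- Hypothetical helper: group file paths by month key.
-- The right fold keeps the input order within each group.
groupByMonth :: [(FilePath, LocalTime)] -&gt; Map.Map String [FilePath]
groupByMonth = foldr add Map.empty
  where add (path, t) = Map.insertWith (++) (monthKey t) [path]

-- The same kind of helper as the one used by the test above.
lt y mth d h m s = LocalTime (fromGregorian y mth d) (TimeOfDay h m s)

main :: IO ()
main = print $ groupByMonth
  [ ("a", lt 2010 1 12 17 16 15)
  , ("b", lt 2010 3 12 17 16 15)
  , ("c", lt 2010 1 21 17 18 19) ]</pre> </p> <p> With this input, <code>"a"</code> and <code>"c"</code> end up under the <code>"2010-01"</code> key and <code>"b"</code> under <code>"2010-03"</code>, which mirrors the fourth test case above. 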
</p> <h3 id="bcf9e8fd9d1b42bbb47b811be75385d0"> Calculating moves <a href="#bcf9e8fd9d1b42bbb47b811be75385d0" title="permalink">#</a> </h3> <p> One pure step remains. The result of calling the <code>moveTo</code> function is a tree with the desired structure. In order to actually move the files, though, for each file you'll need to keep track of both the source path and the destination path. To make that explicit, you can define a type for that purpose: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;Move&nbsp;= &nbsp;&nbsp;Move&nbsp;{&nbsp;sourcePath&nbsp;::&nbsp;FilePath,&nbsp;destinationPath&nbsp;::&nbsp;FilePath&nbsp;} &nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> A <code>Move</code> is simply a data structure. Contrast this with typical object-oriented design, where it would be a (possibly polymorphic) method on an object. In functional programming, you'll regularly model <em>intent</em> with a data structure. As long as intents remain data, you can easily manipulate them, and once you're done with that, you can run an interpreter over your data structure to perform the work you want accomplished. </p> <p> The unit test cases for the <code>moveTo</code> function suggest that file names are local file names like <code>"L"</code>, <code>"J"</code>, <code>"a"</code>, and so on. That was only to make the tests as compact as possible, since the function actually doesn't manipulate the specific <code>FilePath</code> values. </p> <p> In reality, the file names will most likely be longer, and they could also contain the full path, instead of the local path: <code>"C:\foo\bar\a.jpg"</code>. 
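</p> <p> To illustrate what it means that an intent is just data, here's a small sketch of an alternative 'interpreter' that only describes the moves instead of performing them. The <code>describeMove</code> function is a hypothetical helper that I've made up for this illustration; it isn't part of the code base. The paths are made-up examples as well. </p> <p> <pre>-- The Move type from above, repeated so that the sketch is self-contained.
data Move =
  Move { sourcePath :: FilePath, destinationPath :: FilePath }
  deriving (Eq, Show, Read)

-- Hypothetical helper: a pure 'dry run' interpretation of a Move.
-- It only renders a description; it never touches the file system.
describeMove :: Move -&gt; String
describeMove (Move s d) = "Would move " ++ show s ++ " to " ++ show d

main :: IO ()
main = mapM_ (putStrLn . describeMove)
  [ Move "S/a.jpg" "D/2010-01/a.jpg"
  , Move "S/b.jpg" "D/2010-03/b.jpg" ]</pre> </p> <p> Because a <code>Move</code> is only a value, the same list could be passed to a dry-run function like this one, to a logger, or to an operation that actually moves the files. 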
</p> <p> If you call <code>moveTo</code> with a tree where each leaf has a fully qualified path, the output tree will have the desired structure of the destination tree, but the leaves will still contain the full path to each source file. That means that you can calculate a <code>Move</code> for each file: </p> <p> <pre><span style="color:#2b91af;">calculateMoves</span>&nbsp;::&nbsp;<span style="color:blue;">Tree</span>&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Tree</span>&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:blue;">Move</span> calculateMoves&nbsp;=&nbsp;imp&nbsp;<span style="color:#a31515;">&quot;&quot;</span> &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;imp&nbsp;path&nbsp;&nbsp;&nbsp;&nbsp;(Leaf&nbsp;x)&nbsp;=&nbsp;Leaf&nbsp;$&nbsp;Move&nbsp;x&nbsp;$&nbsp;replaceDirectory&nbsp;x&nbsp;path &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;imp&nbsp;path&nbsp;(Node&nbsp;x&nbsp;xs)&nbsp;=&nbsp;Node&nbsp;(path&nbsp;&lt;/&gt;&nbsp;x)&nbsp;$&nbsp;imp&nbsp;(path&nbsp;&lt;/&gt;&nbsp;x)&nbsp;&lt;$&gt;&nbsp;xs</pre> </p> <p> This function takes as input a <code>Tree FilePath FilePath</code>, which is compatible with the output of <code>moveTo</code>. It returns a <code>Tree FilePath Move</code>, i.e. a tree where the leaves are <code>Move</code> values. </p> <p> To be fair, returning a tree is overkill. A <code>[Move]</code> (list of moves) would have been just as useful, but in this article, I'm trying to describe how to write code with a <a href="/2018/11/19/functional-architecture-a-definition">functional architecture</a>. In the overview article, I explained how you can model a file system using a rose tree, and in order to emphasise that point, I'll stick with that model a little while longer. 
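</p> <p> If you do prefer a flat list, a sketch like the following shows one way to get there. It repeats the <code>Tree</code> and <code>Move</code> definitions and <code>calculateMoves</code> so that it can run on its own, and adds <code>leaves</code>, a hypothetical helper that isn't part of the code base. The input tree uses made-up relative paths, and the sketch assumes the <code>filepath</code> package that normally ships with GHC. </p> <p> <pre>import System.FilePath ((&lt;/&gt;), replaceDirectory)

-- The rose tree and Move types from the article, repeated for self-containment.
data Tree a b = Node a [Tree a b] | Leaf b deriving (Eq, Show)

data Move =
  Move { sourcePath :: FilePath, destinationPath :: FilePath }
  deriving (Eq, Show, Read)

-- calculateMoves as shown above.
calculateMoves :: Tree FilePath FilePath -&gt; Tree FilePath Move
calculateMoves = imp ""
  where imp path    (Leaf x) = Leaf $ Move x $ replaceDirectory x path
        imp path (Node x xs) = Node (path &lt;/&gt; x) $ imp (path &lt;/&gt; x) &lt;$&gt; xs

-- Hypothetical helper: collect all leaves of a tree, left to right.
leaves :: Tree a b -&gt; [b]
leaves (Leaf x) = [x]
leaves (Node _ xs) = concatMap leaves xs

main :: IO ()
main = mapM_ print $ leaves $ calculateMoves $
  Node "D"
    [ Node "2010-01" [Leaf ("S" &lt;/&gt; "a.jpg"), Leaf ("S" &lt;/&gt; "c.jpg")]
    , Node "2010-03" [Leaf ("S" &lt;/&gt; "b.jpg")] ]</pre> </p> <p> Each printed <code>Move</code> keeps the original source path, while the destination path combines the directory names collected on the way down the tree with the file name. 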
</p> <p> Earlier, I wrote that you can implement desired <code>Tree</code> functionality with the <code>foldTree</code> function, but that was a simplification. If you can implement the functionality of <code>calculateMoves</code> with <code>foldTree</code>, I don't know how. You can, however, implement it using explicit pattern matching and simple recursion. </p> <p> The <code>imp</code> function builds up a file path (using the <code>&lt;/&gt;</code> path combinator) as it recursively negotiates the tree. All <code>Leaf</code> nodes are converted to a <code>Move</code> value using the leaf node's current <code>FilePath</code> value as the <code>sourcePath</code>, and the <code>path</code> to figure out the desired <code>destinationPath</code>. </p> <p> This code is still easy to unit test: </p> <p> <pre><span style="color:#a31515;">&quot;Calculate&nbsp;moves&quot;</span>&nbsp;~:&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;(tree,&nbsp;expected)&nbsp;&lt;- &nbsp;&nbsp;&nbsp;&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(Leaf&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;Leaf&nbsp;$&nbsp;Move&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(Node&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>],&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;[Leaf&nbsp;$&nbsp;Move&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>&nbsp;$&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(Node&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>],&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;[ 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;Move&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>&nbsp;$&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;Move&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>&nbsp;$&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(Node&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;[Node&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>],&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;3&quot;</span>]], &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>)&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;Move&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>&nbsp;$&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;Move&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>&nbsp;$&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>], 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>)&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;Move&nbsp;<span style="color:#a31515;">&quot;3&quot;</span>&nbsp;$&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;3&quot;</span>]]) &nbsp;&nbsp;&nbsp;&nbsp;] &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;calculateMoves&nbsp;tree &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;expected&nbsp;~=?&nbsp;actual</pre> </p> <p> The test cases in this parametrised test are tuples of an input <code>tree</code> and the <code>expected</code> tree. For each test case, the test calls the <code>calculateMoves</code> function with <code>tree</code> and asserts that the <code>actual</code> tree is equal to the <code>expected</code> tree. </p> <p> That's all the pure code you need in order to implement the desired functionality. Now you only need to write some code that loads a tree from disk, and imprints a destination tree to disk, as well as the code that composes it all. </p> <h3 id="062fff475b2b47e188dbd2bc930aa882"> Loading a tree from disk <a href="#062fff475b2b47e188dbd2bc930aa882" title="permalink">#</a> </h3> <p> The remaining code in this article is impure. You could put it in dedicated modules, but for this program, you're only going to need three functions and a bit of composition code, so you could also just put it all in the <code>Main</code> module. That's what I did. </p> <p> To load a tree from disk, you'll need a root directory, under which you load the entire tree. 
Given a directory path, you read a tree using a recursive function like this: </p> <p> <pre><span style="color:#2b91af;">readTree</span>&nbsp;::&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;(<span style="color:blue;">Tree</span>&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:#2b91af;">FilePath</span>) readTree&nbsp;path&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;isFile&nbsp;&lt;-&nbsp;doesFileExist&nbsp;path &nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;isFile &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">then</span>&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;Leaf&nbsp;path &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span>&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dirsAndfiles&nbsp;&lt;-&nbsp;listDirectory&nbsp;path &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;paths&nbsp;=&nbsp;<span style="color:blue;">fmap</span>&nbsp;(path&nbsp;&lt;/&gt;)&nbsp;dirsAndfiles &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;branches&nbsp;&lt;-&nbsp;traverse&nbsp;readTree&nbsp;paths &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;Node&nbsp;path&nbsp;branches</pre> </p> <p> This recursive function starts by checking whether the <code>path</code> is a file or a directory. If it's a file, it creates a new <code>Leaf</code> with that <code>FilePath</code>. </p> <p> If <code>path</code> isn't a file, it's a directory. In that case, use <code>listDirectory</code> to enumerate all the directories and files in that directory. These are only local names, so prefix them with <code>path</code> to create full paths, then <code>traverse</code> all those directory entries recursively. That produces all the <code>branches</code> for the current node. 
Finally, return a new <code>Node</code> with the <code>path</code> and the <code>branches</code>. </p> <h3 id="5ba31d6e6e7f4eee942e39349a45e1ed"> Loading metadata <a href="#5ba31d6e6e7f4eee942e39349a45e1ed" title="permalink">#</a> </h3> <p> The <code>readTree</code> function only produces a tree with <code>FilePath</code> leaves, while the program requires a tree with <code>PhotoFile</code> leaves. You'll need to read the <a href="https://en.wikipedia.org/wiki/Exif">Exif</a> metadata from each file and enrich the tree with the <em>date-taken</em> data. </p> <p> In this code base, I've used the <a href="http://hackage.haskell.org/package/hsexif">hsexif</a> library for this. That enables you to write an impure operation like this: </p> <p> <pre><span style="color:#2b91af;">readPhoto</span>&nbsp;::&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;(<span style="color:#2b91af;">Maybe</span>&nbsp;<span style="color:blue;">PhotoFile</span>) readPhoto&nbsp;path&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;exifData&nbsp;&lt;-&nbsp;parseFileExif&nbsp;path &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;dateTaken&nbsp;=&nbsp;either&nbsp;(<span style="color:blue;">const</span>&nbsp;Nothing)&nbsp;Just&nbsp;exifData&nbsp;&gt;&gt;=&nbsp;getDateTimeOriginal &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;PhotoFile&nbsp;path&nbsp;&lt;$&gt;&nbsp;dateTaken</pre> </p> <p> This operation can fail for various reasons: <ul> <li>The file may not exist.</li> <li>The file exists, but has no metadata.</li> <li>The file has metadata, but no <em>date-taken</em> metadata.</li> <li>The <em>date-taken</em> metadata string is malformed.</li> </ul> The program is just going to skip all files from which it can't extract <em>date-taken</em> metadata, so <code>readPhoto</code> converts the <code>Either</code> value returned by <code>parseFileExif</code> to 
<code>Maybe</code> and binds the result with <code>getDateTimeOriginal</code>. </p> <p> When you <code>traverse</code> a <code>Tree FilePath FilePath</code> with <code>readPhoto</code>, you'll get a <code>Tree FilePath (Maybe PhotoFile)</code>. That's when you'll need <code>catMaybeTree</code>. You'll see this soon. </p> <h3 id="8b8d1709f9ed4fe2bc78e4ea9b2a2508"> Writing a tree to disk <a href="#8b8d1709f9ed4fe2bc78e4ea9b2a2508" title="permalink">#</a> </h3> <p> The above <code>calculateMoves</code> function creates a <code>Tree FilePath Move</code>. The final piece of impure code you'll need to write is an operation that traverses such a tree and executes each <code>Move</code>. </p> <p> <pre><span style="color:#2b91af;">applyMoves</span>&nbsp;::&nbsp;<span style="color:blue;">Foldable</span>&nbsp;t&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;t&nbsp;<span style="color:blue;">Move</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;() applyMoves&nbsp;=&nbsp;traverse_&nbsp;move &nbsp;&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;&nbsp;&nbsp;move&nbsp;m&nbsp;=&nbsp;copy&nbsp;m&nbsp;&gt;&gt;&nbsp;compareFiles&nbsp;m&nbsp;&gt;&gt;=&nbsp;deleteSource &nbsp;&nbsp;&nbsp;&nbsp;copy&nbsp;(Move&nbsp;s&nbsp;d)&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;createDirectoryIfMissing&nbsp;True&nbsp;$&nbsp;takeDirectory&nbsp;d &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;copyFileWithMetadata&nbsp;s&nbsp;d &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">putStrLn</span>&nbsp;$&nbsp;<span style="color:#a31515;">&quot;Copied&nbsp;to&nbsp;&quot;</span>&nbsp;++&nbsp;<span style="color:blue;">show</span>&nbsp;d &nbsp;&nbsp;&nbsp;&nbsp;compareFiles&nbsp;m@(Move&nbsp;s&nbsp;d)&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sourceBytes&nbsp;&lt;-&nbsp;B.<span style="color:blue;">readFile</span>&nbsp;s 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;destinationBytes&nbsp;&lt;-&nbsp;B.<span style="color:blue;">readFile</span>&nbsp;d &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;<span style="color:blue;">if</span>&nbsp;sourceBytes&nbsp;==&nbsp;destinationBytes&nbsp;<span style="color:blue;">then</span>&nbsp;Just&nbsp;m&nbsp;<span style="color:blue;">else</span>&nbsp;Nothing &nbsp;&nbsp;&nbsp;&nbsp;deleteSource&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Nothing&nbsp;=&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">()</span> &nbsp;&nbsp;&nbsp;&nbsp;deleteSource&nbsp;(Just&nbsp;(Move&nbsp;s&nbsp;_))&nbsp;=&nbsp;removeFile&nbsp;s</pre> </p> <p> As I wrote above, a tree of <code>Move</code> values is, to be honest, overkill. Any <code>Foldable</code> container will do, as the <code>applyMoves</code> operation demonstrates. It traverses the data structure, and for each <code>Move</code>, it first copies the file, then it verifies that the copy was successful, and finally, if that's the case, it deletes the source file. </p> <p> All of the operations invoked by these three steps are defined in various libraries that are part of the base GHC installation. You're welcome to peruse <a href="https://github.com/ploeh/picture-archivist">the source code repository</a> if you're interested in the details. 
</p> <h3 id="d336cf55dc9746c08cbed32041803173"> Composition <a href="#d336cf55dc9746c08cbed32041803173" title="permalink">#</a> </h3> <p> You can now compose an <em>impure-pure-impure sandwich</em> from all the Lego pieces: </p> <p> <pre><span style="color:#2b91af;">movePhotos</span>&nbsp;::&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;() movePhotos&nbsp;source&nbsp;destination&nbsp;=&nbsp;<span style="color:blue;">fmap</span>&nbsp;fold&nbsp;$&nbsp;runMaybeT&nbsp;$&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;sourceTree&nbsp;&lt;-&nbsp;lift&nbsp;$&nbsp;readTree&nbsp;source &nbsp;&nbsp;photoTree&nbsp;&lt;-&nbsp;MaybeT&nbsp;$&nbsp;catMaybeTree&nbsp;&lt;$&gt;&nbsp;traverse&nbsp;readPhoto&nbsp;sourceTree &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;destinationTree&nbsp;=&nbsp;calculateMoves&nbsp;$&nbsp;moveTo&nbsp;destination&nbsp;photoTree &nbsp;&nbsp;lift&nbsp;$&nbsp;applyMoves&nbsp;destinationTree</pre> </p> <p> First, you load the <code>sourceTree</code> using the <code>readTree</code> operation. This is a <code>Tree FilePath FilePath</code> value, because the code is written in <code>do</code> notation, and the context is <code>MaybeT IO ()</code>. You then load the image metadata by traversing <code>sourceTree</code> with <code>readPhoto</code>. This produces a <code>Tree FilePath (Maybe PhotoFile)</code> that you then filter with <code>catMaybeTree</code>. Again, because of <code>do</code> notation and monad transformer shenanigans, <code>photoTree</code> is a <code>Tree FilePath PhotoFile</code> value. </p> <p> Those two lines of code are the initial impure step of the sandwich (yes: mixed metaphors, I know). </p> <p> The pure part of the sandwich is the composition of the pure functions <code>moveTo</code> and <code>calculateMoves</code>. 
The result is a <code>Tree FilePath Move</code> value. </p> <p> The final, impure step of the sandwich, then, is to <code>applyMoves</code>. </p> <h3 id="8b44f4d2cd2241e18bff6d40c1ad9ee9"> Execution <a href="#8b44f4d2cd2241e18bff6d40c1ad9ee9" title="permalink">#</a> </h3> <p> The <code>movePhotos</code> operation takes <code>source</code> and <code>destination</code> arguments. You could hypothetically call it from a rich client or a background process, but here I'll just call it from a command-line program. The <code>main</code> operation will have to parse the input arguments and call <code>movePhotos</code>: </p> <p> <pre><span style="color:#2b91af;">main</span>&nbsp;::&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;() main&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;args&nbsp;&lt;-&nbsp;getArgs &nbsp;&nbsp;<span style="color:blue;">case</span>&nbsp;args&nbsp;<span style="color:blue;">of</span> &nbsp;&nbsp;&nbsp;&nbsp;[source,&nbsp;destination]&nbsp;-&gt;&nbsp;movePhotos&nbsp;source&nbsp;destination &nbsp;&nbsp;&nbsp;&nbsp;_&nbsp;-&gt;&nbsp;<span style="color:blue;">putStrLn</span>&nbsp;<span style="color:#a31515;">&quot;Please&nbsp;provide&nbsp;source&nbsp;and&nbsp;destination&nbsp;directories&nbsp;as&nbsp;arguments.&quot;</span></pre> </p> <p> You could write more sophisticated parsing of the program arguments, but that's not the topic of this article, so I only wrote the bare minimum required to get the program working. 
</p> <p> You can now compile and run the program: </p> <p> <pre>$ ./archpics "C:\Users\mark\Desktop\Test" "C:\Users\mark\Desktop\Test-Out" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2003-04\\2003-04-29 15.11.50.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2011-07\\2011-07-10 13.09.36.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2014-04\\2014-04-17 17.11.40.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2014-04\\2014-04-18 14.05.02.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2014-05\\2014-05-23 16.07.20.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2014-06\\2014-06-30 15.44.52.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2014-06\\2014-06-21 16.48.40.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2016-05\\2016-05-01 09.25.23.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2017-08\\2017-08-22 19.53.28.jpg"</pre> </p> <p> This does indeed produce the expected destination directory structure. </p> <p> <img src="/content/binary/picture-archivist-destination-directory.png" alt="Seven example directories with pictures."> </p> <p> It's always nice when something turns out to work in practice, as well as in theory. </p> <h3 id="c50c7ac1276146d79715a5e7ddadfe6d"> Summary <a href="#c50c7ac1276146d79715a5e7ddadfe6d" title="permalink">#</a> </h3> <p> Functional software architecture involves separating pure from impure code so that no pure functions invoke impure operations. Often, you can achieve that with what I call the <em>impure-pure-impure sandwich</em> architecture. In this example, you saw how to model the file system as a tree. This enables you to separate the impure file interactions from the pure program logic. </p> <p> The Haskell type system enforces the <em>functional interaction law</em>, which implies that the architecture is, indeed, properly functional. Other languages, like <a href="https://fsharp.org">F#</a>, don't enforce the law via the compiler, but that doesn't prevent you doing functional programming. 
Now that we've verified that the architecture is, indeed, functional, we can port it to F#. </p> <p> <strong>Next:</strong> <a href="/2019/09/16/picture-archivist-in-f">Picture archivist in F#</a>. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="f237d98d453a4bcb9a3d58a05bf21d34"> <div class="comment-author"><a href="https://majiehong.com">Jiehong</a></div> <div class="comment-content"> <p> This seems a fair architecture. </p> <p> However, at first glance it does not seem very memory efficient, because everything might be loaded in RAM, and that poses a strict limit. </p> <p> But then, I remember that Haskell does lazy evaluation, so is it the case here? Are path and the tree lazily loaded and processed? </p> <p> In "traditional" architectures, IO would be scattered inside the program, and as each file might be read one at a time, and handled. This sandwich of purity with impure buns forces not to do that. </p> </div> <div class="comment-date">2019-09-09 11:47 UTC</div> </div> <div class="comment" id="ca660cdc1f094bfb8cc9896bb1084460"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Jiehong, thank you for writing. It's true that Haskell is lazily evaluated, but some strictness rules apply to <code>IO</code>, so it's not so simple. </p> <p> Just running a quick experiment with the code base shown here, when I try to move thousands of files, the program sits and thinks for quite some time before it starts to output progress. This indicates to me that it does, indeed, load at least the <em>structure</em> of the tree into memory before it starts moving the files. Once it does that, though, it looks like it runs at constant memory. </p> <p> There's an interplay of laziness and <code>IO</code> in Haskell that I still don't sufficiently master. 
When I publish the port to F#, however, it should be clear that you could replace all the nodes of the tree with explicitly lazy values. I'd be surprised if something like that isn't possible in Haskell as well, but here I'll solicit help from readers more well-versed in these matters than I am. </p> </div> <div class="comment-date">2019-09-09 19:16 UTC</div> </div> <div class="comment" id="dd26f6d047b5492b8a012b30d96ad18b"> <div class="comment-author">André Cardoso</div> <div class="comment-content"> <p> I really like your posts and I'm really liking this series. But I struggle with Haskell syntax, specially the difference between the operators $, &lt;$&gt;, &lt;&gt;, &lt;*&gt;. Is there a cheat sheet explaining these operators? </p> </div> <div class="comment-date">2019-09-12 13:51 UTC</div> </div> <div class="comment" id="2e71f695ed9f4cfa8467df818f072da8"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> André, thank you for writing. I've written about why <a href="/2018/07/02/terse-operators-make-business-code-more-readable">I think that terse operators make the code overall more readable</a>, but that's obviously not an explanation of any of those operators. </p> <p> I'm not aware of any cheat sheets for Haskell, although a Google search seems to indicate that many exist. I'm not sure that a cheat sheet will help much if one doesn't know Haskell, and if one does know Haskell, one is likely to also know those operators. </p> <p> <a href="https://hackage.haskell.org/package/base/docs/Prelude.html#v:-36-">$</a> is a sort of delimiter that often saves you from having to nest other function calls in brackets. </p> <p> <a href="https://hackage.haskell.org/package/base/docs/Prelude.html#v:-60--36--62-">&lt;$&gt;</a> is just an infix alias for <code>fmap</code>. In C#, that <a href="/2018/03/22/functors">corresponds to the <code>Select</code> method</a>. 
</p> <p> <code>&lt;&gt;</code> is a generalised associative binary operation as defined by <a href="http://hackage.haskell.org/package/base/docs/Data-Semigroup.html">Data.Semigroup</a> or <a href="http://hackage.haskell.org/package/base/docs/Data-Monoid.html">Data.Monoid</a>. You can <a href="/2017/10/05/monoids-semigroups-and-friends">read more about monoids and semigroups here on the blog</a>. </p> <p> <a href="http://hackage.haskell.org/package/base/docs/Control-Applicative.html">&lt;*&gt;</a> is part of the <code>Applicative</code> type class. It's hard to translate to other languages, but <a href="/2018/10/01/applicative-functors">when I make the attempt</a>, I usually call it <code>Apply</code>. </p> </div> <div class="comment-date">2019-09-12 15:45 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Naming newtypes for QuickCheck Arbitraries https://blog.ploeh.dk/2019/09/02/naming-newtypes-for-quickcheck-arbitraries 2019-09-02T13:07:00+00:00 Mark Seemann <div id="post"> <p> <em>A simple naming scheme for newtypes to add Arbitrary instances.</em> </p> <p> Naming is one of those recurring difficult problems in software development. How do you come up with good names? </p> <p> I'm not aware of any <em>general</em> heuristic for that, but sometimes, in specific contexts, a naming scheme presents itself. Here's one. </p> <h3 id="c7391ad662e943f1bbe2b52d6b8bde59"> Orphan instances <a href="#c7391ad662e943f1bbe2b52d6b8bde59" title="permalink">#</a> </h3> <p> When you write <a href="http://hackage.haskell.org/package/QuickCheck">QuickCheck</a> properties that involve your own custom types, you'll have to add <code>Arbitrary</code> instances for those types. 
As an example, here's a restaurant reservation record type: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;Reservation&nbsp;=&nbsp;Reservation &nbsp;&nbsp;{&nbsp;reservationId&nbsp;::&nbsp;UUID &nbsp;&nbsp;,&nbsp;reservationDate&nbsp;::&nbsp;LocalTime &nbsp;&nbsp;,&nbsp;reservationName&nbsp;::&nbsp;String &nbsp;&nbsp;,&nbsp;reservationEmail&nbsp;::&nbsp;String &nbsp;&nbsp;,&nbsp;reservationQuantity&nbsp;::&nbsp;Int &nbsp;&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Read</span>,&nbsp;<span style="color:#2b91af;">Generic</span>)</pre> </p> <p> You can easily add an Arbitrary instance to such a type: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Arbitrary</span>&nbsp;<span style="color:blue;">Reservation</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;arbitrary&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;liftM5&nbsp;Reservation&nbsp;arbitrary&nbsp;arbitrary&nbsp;arbitrary&nbsp;arbitrary&nbsp;arbitrary</pre> </p> <p> The type itself is part of your domain model, while the <code>Arbitrary</code> instance only belongs to your test code. You shouldn't add the <code>Arbitrary</code> instance to the domain model, but that means that you'll have to define the instance apart from the type definition. That, however, is an orphan instance, and the compiler will complain: </p> <p> <pre>test\ReservationAPISpec.hs:31:1: <span style="color:red;">warning:</span> [<span style="color:red;">-Worphans</span>] Orphan instance: instance Arbitrary Reservation To avoid this move the instance declaration to the module of the class or of the type, or wrap the type with a newtype and declare the instance on the new type. 
<span style="color:blue;">|</span> <span style="color:blue;">31 |</span> <span style="color:red;">instance Arbitrary Reservation where</span> <span style="color:blue;">|</span> <span style="color:red;">^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^...</span></pre> </p> <p> Technically, this isn't a difficult problem to solve. The warning even suggests remedies. Moving the instance to the module that declares the type is, however, inappropriate, since test-specific instances don't belong in the domain model. Wrapping the type in a <code>newtype</code> is more appropriate, but what should you call the type? </p> <h3 id="c192d6524b4b4444a35121443f9a61a8"> Suppress the warning <a href="#c192d6524b4b4444a35121443f9a61a8" title="permalink">#</a> </h3> <p> I had trouble coming up with good names for such <code>newtype</code> wrappers, so at first I decided to just suppress that particular compiler warning. I simply added the <code>-fno-warn-orphans</code> flag <em>exclusively to my test code</em>. </p> <p> That solved the immediate problem, but I felt a little dirty. It's okay, though, because you're not supposed to reuse test libraries anyway, so the usual problems with orphan instances don't apply. </p> <p> After having worked a little like this, however, it dawned on me that I needed more than one <code>Arbitrary</code> instance, and a naming scheme presented itself. </p> <h3 id="a946b2c622c6403cb69a3f224551514c"> Naming scheme <a href="#a946b2c622c6403cb69a3f224551514c" title="permalink">#</a> </h3> <p> For some of the properties I wrote, I needed a <em>valid</em> <code>Reservation</code> value. In this case, <em>valid</em> means that the <code>reservationQuantity</code> is a positive number, and that the <code>reservationDate</code> is in the future. 
It seemed natural to signify these constraints with a <code>newtype</code>: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;ValidReservation&nbsp;=&nbsp;ValidReservation&nbsp;Reservation&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>)
<span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Arbitrary</span>&nbsp;<span style="color:blue;">ValidReservation</span>&nbsp;<span style="color:blue;">where</span>
&nbsp;&nbsp;arbitrary&nbsp;=&nbsp;<span style="color:blue;">do</span>
&nbsp;&nbsp;&nbsp;&nbsp;rid&nbsp;&lt;-&nbsp;arbitrary
&nbsp;&nbsp;&nbsp;&nbsp;d&nbsp;&lt;-&nbsp;(\dt&nbsp;-&gt;&nbsp;addLocalTime&nbsp;(getPositive&nbsp;dt)&nbsp;now2019)&nbsp;&lt;$&gt;&nbsp;arbitrary
&nbsp;&nbsp;&nbsp;&nbsp;n&nbsp;&lt;-&nbsp;arbitrary
&nbsp;&nbsp;&nbsp;&nbsp;e&nbsp;&lt;-&nbsp;arbitrary
&nbsp;&nbsp;&nbsp;&nbsp;(Positive&nbsp;q)&nbsp;&lt;-&nbsp;arbitrary
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;ValidReservation&nbsp;$&nbsp;Reservation&nbsp;rid&nbsp;d&nbsp;n&nbsp;e&nbsp;q</pre> </p> <p> The <code>newtype</code> is, naturally, called <code>ValidReservation</code> and can, for example, be used like this: </p> <p> <pre>it&nbsp;<span style="color:#a31515;">&quot;responds&nbsp;with&nbsp;200&nbsp;after&nbsp;reservation&nbsp;is&nbsp;added&quot;</span>&nbsp;$&nbsp;WQC.property&nbsp;$&nbsp;\
&nbsp;&nbsp;(ValidReservation&nbsp;r)&nbsp;-&gt;&nbsp;<span style="color:blue;">do</span>
&nbsp;&nbsp;_&nbsp;&lt;-&nbsp;postJSON&nbsp;<span style="color:#a31515;">&quot;/reservations&quot;</span>&nbsp;$&nbsp;encode&nbsp;r
&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;get&nbsp;$&nbsp;<span style="color:#a31515;">&quot;/reservations/&quot;</span>&nbsp;&lt;&gt;&nbsp;toASCIIBytes&nbsp;(reservationId&nbsp;r)
&nbsp;&nbsp;actual&nbsp;`shouldRespondWith`&nbsp;200</pre> </p> <p> For the few properties where <em>any</em> 
<code>Reservation</code> goes, a name for a <code>newtype</code> now also suggests itself: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;AnyReservation&nbsp;=&nbsp;AnyReservation&nbsp;Reservation&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>)
<span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Arbitrary</span>&nbsp;<span style="color:blue;">AnyReservation</span>&nbsp;<span style="color:blue;">where</span>
&nbsp;&nbsp;arbitrary&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;AnyReservation&nbsp;&lt;$&gt;
&nbsp;&nbsp;&nbsp;&nbsp;liftM5&nbsp;Reservation&nbsp;arbitrary&nbsp;arbitrary&nbsp;arbitrary&nbsp;arbitrary&nbsp;arbitrary</pre> </p> <p> The only use I've had for that particular instance so far, though, is to ensure that any <code>Reservation</code> correctly serialises to, and deserialises from, JSON: </p> <p> <pre>it&nbsp;<span style="color:#a31515;">&quot;round-trips&quot;</span>&nbsp;$&nbsp;property&nbsp;$&nbsp;\(AnyReservation&nbsp;r)&nbsp;-&gt;&nbsp;<span style="color:blue;">do</span>
&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;json&nbsp;=&nbsp;encode&nbsp;r
&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;decode&nbsp;json
&nbsp;&nbsp;actual&nbsp;`shouldBe`&nbsp;Just&nbsp;r</pre> </p> <p> With those two <code>newtype</code> wrappers, I no longer have any orphan instances. </p> <h3 id="758fef8609784b998c3fad65b2fe6e2f"> Summary <a href="#758fef8609784b998c3fad65b2fe6e2f" title="permalink">#</a> </h3> <p> A simple naming scheme for <code>newtype</code> wrappers for QuickCheck <code>Arbitrary</code> instances, then, is: <ul> <li>If the instance is truly unbounded, prefix the wrapper name with <em>Any</em></li> <li>If the instance only produces valid values, prefix the wrapper name with <em>Valid</em></li> </ul> This strikes me as a practical naming scheme. Other variations seem natural. 
If, for example, you need an <em>invalid</em> value, you can prefix the wrapper name with <em>Invalid</em>. Why you'd need that, though, I'm not sure. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Functional file system https://blog.ploeh.dk/2019/08/26/functional-file-system 2019-08-26T06:00:00+00:00 Mark Seemann <div id="post"> <p> <em>How do you model file systems in a functional manner, so that unit testing is enabled? An overview.</em> </p> <p> One of the many reasons that I like functional programming is that it's <a href="/2015/05/07/functional-design-is-intrinsically-testable">intrinsically testable</a>. In object-oriented programming, you often have to jump through hoops to enable testing. This is also the case whenever you need to interact with the computer's file system. Just try to search the web for <em>file system interface</em>, or <em>mock file system</em>. I'm not going to give you any links, because I think such questions are <a href="https://en.wikipedia.org/wiki/XY_problem">XY problems</a>. I don't think that the most common suggestions are proper solutions. </p> <p> In functional programming, anyway, <a href="/2017/01/30/partial-application-is-dependency-injection">Dependency Injection isn't functional, because it makes everything impure</a>. How, then, do you model the file system in such a way that it's pure, decoupled from the logic you'd like to add on top of it, and still has enough fidelity that you can perform most tasks? </p> <p> You model the file system as a tree, or a forest. </p> <h3 id="4920bedd948d4f7487a13fa96f836371"> File systems are hierarchies <a href="#4920bedd948d4f7487a13fa96f836371" title="permalink">#</a> </h3> <p> It should come as no surprise that file systems are hierarchies, or trees. Each logical drive is the root of a tree. Files are leaves, and directories are internal nodes. Does that sound familiar? 
That sounds like a <a href="/2019/07/29/church-encoded-rose-tree">rose tree</a>. </p> <p> Rose trees are immutable data structures. It doesn't get much more functional than that. Why not use a rose tree (or a forest) to model the file system? </p> <p> What about interaction with the actual file system? Usually, when you encounter object-oriented attempts at decoupling an abstraction from the actual file system, you'll find polymorphic operations such as <code>WriteAllText</code>, <code>GetFileSystemEntries</code>, <code>CreateDirectory</code>, and so on. These would be the (mockable) methods that you have to implement, usually as <a href="http://xunitpatterns.com/Humble%20Object.html">Humble Objects</a>. </p> <p> If, instead of defining a set of interfaces, you model the file system as a forest, interacting with the actual file system is not even part of the abstraction. That's a typical shift of perspective from object-oriented design to functional programming. </p> <p> <img src="/content/binary/ood-and-fp-views-on-fily-system-abstraction.png" alt="Object-oriented and functional ways to abstractly model file systems."> </p> <p> In object-oriented design, you typically attempt to model <em>data with behaviour</em>. Sometimes that fits the underlying reality well, but in this case it doesn't. While you have file and directory objects with behaviour, the actual structure of a file system is implicit. It's hidden in the interactions between the objects. </p> <p> By modelling the file system as a tree, you explicitly use the structure of the data. How you load a tree into program memory, or how you imprint a tree onto the file system, isn't part of the abstraction. When it comes to input and output, you're free to do what you want. </p> <p> Once you have a model of a directory structure in memory, you can manipulate it to your heart's content. 
Since <a href="/2019/08/19/a-rose-tree-functor">rose trees are functors</a>, you know that every mapping over such a tree is structure-preserving. That means that you don't even need to write tests for those parts of your application. </p> <p> You'll appreciate an example, I'm sure. </p> <h3 id="5e19438122b94e059c155509e96c964f"> Picture archivist example <a href="#5e19438122b94e059c155509e96c964f" title="permalink">#</a> </h3> <p> As an example, I'll attempt to answer <a href="https://codereview.stackexchange.com/q/99271/3878">an old Code Review question</a>. I already gave <a href="https://codereview.stackexchange.com/a/99290/3878">an answer</a> in 2015, but I'm not as happy with it today as I was back then. The question is great, though, because it explicitly demonstrates how people have a hard time escaping the notion that abstraction is only available via interfaces or abstract base classes. In 2015, I had long since figured out that <a href="/2009/05/28/DelegatesAreAnonymousInterfaces">delegates (and thus functions) are anonymous interfaces</a>, but I still hadn't figured out how to separate pure from impure behaviour. </p> <p> The question's scenario is how to implement a small program that can inspect a collection of image files, extract the date-taken metadata from each file, and move the files to a new directory structure based on that information. </p> <p> For example, you could have files organised in various directories according to motif. </p> <p> <img src="/content/binary/picture-archivist-source-directory.png" alt="Three example directories with pictures."> </p> <p> You soon realise, however, that this archiving strategy is untenable, because what do you do if there's more than one type of motif in a picture? Instead, you decide to organise the files according to month and year. 
</p> <p> <img src="/content/binary/picture-archivist-destination-directory.png" alt="Seven example directories with pictures."> </p> <p> Clearly, there's some input and output involved in this application, but there's also some logic that you'd like to unit test. You need to parse the metadata, figure out where to move each image file, filter out files that are not images, and so on. </p> <h3 id="e3bc8b23a3494628a44348749a0369ca"> Object-oriented picture archivist <a href="#e3bc8b23a3494628a44348749a0369ca" title="permalink">#</a> </h3> <p> If you were to implement such a picture archivist program with an object-oriented design, you may use Dependency Injection so that you can 'mock' the file system during unit testing. A typical program might then work like this at run time: </p> <p> <img src="/content/binary/object-oriented-file-system-interaction.png" alt="An object-oriented program typically has busy interaction with the file system."> </p> <p> The program has fine-grained, busy interaction with the file system (through a polymorphic interface). It'll typically read one file, load its metadata, decide where to put the file, and copy it there. Then it'll move on to the next file, although it might also do this in parallel. Throughout the program execution, there's input and output going on, which makes it difficult to isolate the pure from the impure code. </p> <p> Even if you write a program like that in <a href="https://fsharp.org">F#</a>, it's hardly a <a href="/2018/11/19/functional-architecture-a-definition">functional architecture</a>. </p> <p> Such an architecture is, in theory, testable, but my experience is that if you attempt to reproduce such busy, fine-grained interaction with mocks and stubs, you're likely to end up with brittle tests. 
</p> <h3 id="6cddf0e7ca3549c49a87006bfba5d349"> Functional picture archivist <a href="#6cddf0e7ca3549c49a87006bfba5d349" title="permalink">#</a> </h3> <p> In functional programming, you'll have to <a href="/2017/02/02/dependency-rejection">reject the notion of dependencies</a>. Instead, you can often resort to the simple architecture I call an <em>impure-pure-impure sandwich</em>; here, specifically: <ol> <li>Load data from disk (impure)</li> <li>Transform the data (pure)</li> <li>Write data to disk (impure)</li> </ol> A typical program might then work like this at run time: </p> <p> <img src="/content/binary/functional-file-system-interaction.png" alt="A functional program typically loads data, transforms it, and stores it again."> </p> <p> When the program starts, it loads data from disk into a tree. It then manipulates the in-memory model of the files in question, and once it's done, it traverses the entire tree and applies the changes. </p> <p> This gives you a much clearer separation between the pure and impure parts of the code base. The pure part is bigger, and easier to unit test. </p> <h3 id="09d2184be64a428d85b4f01f1149ea7a"> Example code <a href="#09d2184be64a428d85b4f01f1149ea7a" title="permalink">#</a> </h3> <p> This article gave you an overview of the functional architecture. In the next two articles, you'll see how to do this in practice. First, I'll implement the above architecture in <a href="https://www.haskell.org">Haskell</a>, so that we know that if it works there, the architecture does, indeed, respect <a href="/2018/11/19/functional-architecture-a-definition">the functional interaction law</a>. </p> <p> Based on the Haskell implementation, you'll then see a port to F#. <ul> <li><a href="/2019/09/09/picture-archivist-in-haskell">Picture archivist in Haskell</a></li> <li><a href="/2019/09/16/picture-archivist-in-f">Picture archivist in F#</a></li> </ul> These two articles share the same architecture. 
You can read both, or one of them, as you like. The source code is available on GitHub. </p> <h3 id="09e32b681b7a48aa808965bd66c4794b"> Summary <a href="#09e32b681b7a48aa808965bd66c4794b" title="permalink">#</a> </h3> <p> One of the hardest problems in transitioning from object-oriented programming to functional programming is that the design approach is so different. Many well-understood design patterns and principles don't translate easily. Dependency Injection is one of those. Often, you'll have to flip the model on its head, so to speak, before you can take it on in a functional manner. </p> <p> While most object-oriented programmers would say that object-oriented design involves focusing on 'the nouns', in practice, it often revolves around interactions and behaviour. Sometimes, that's appropriate, but often, it's not. </p> <p> Functional programming, in contrast, tends to take a more data-oriented perspective. Load some data, manipulate it, and publish it. If you can come up with an appropriate data structure for the data, you're probably on your way to implementing a functional architecture. </p> <p> <strong>Next:</strong> <a href="/2019/09/09/picture-archivist-in-haskell">Picture archivist in Haskell</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. A rose tree functor https://blog.ploeh.dk/2019/08/19/a-rose-tree-functor 2019-08-19T08:08:00+00:00 Mark Seemann <div id="post"> <p> <em>Rose trees form normal functors. A place-holder article for object-oriented programmers.</em> </p> <p> This article is an instalment in <a href="/2018/03/22/functors">an article series about functors</a>. As another article explains, <a href="/2019/08/12/rose-tree-bifunctor">a rose tree is a bifunctor</a>. This makes it trivially a functor. 
As such, this article is mostly a place-holder to fit the spot in the <em>functor table of contents</em>, thereby indicating that rose trees are functors. </p> <p> Since a rose tree is a bifunctor, it's actually not one, but two, functors. Many languages, C# included, are best equipped to deal with unambiguous functors. This is also true in <a href="https://haskell.org">Haskell</a>, where you'd usually define the <code>Functor</code> instance over a bifunctor's right, or second, side. Likewise, in C#, you can make <code>IRoseTree&lt;N, L&gt;</code> a functor by implementing <code>Select</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;&nbsp;Select&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;(
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;source,
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;&nbsp;selector)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.SelectLeaf(selector);
}</pre> </p> <p> This method simply delegates all implementation to the <code>SelectLeaf</code> method; it's just <code>SelectLeaf</code> by another name. It obeys the functor laws, since these are just specialisations of the bifunctor laws, and we know that a rose tree is a proper bifunctor. </p> <p> It would have been technically possible to instead implement a <code>Select</code> method by calling <code>SelectNode</code>, but it seems marginally more useful to enable syntactic sugar for mapping over the leaves. 
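</p> <p> For comparison, here is a minimal Haskell sketch of the same idea. The <code>RoseTree</code> type, its constructor names, and the <code>leafLengths</code> value are assumptions made for illustration; they aren't code from this article series. The <code>Functor</code> instance maps the second (leaf) side by reusing <code>second</code> from <code>Data.Bifunctor</code>, much like <code>Select</code> delegates to <code>SelectLeaf</code>: </p> <p> <pre>import Data.Bifunctor (Bifunctor (..))

-- Hypothetical rose tree: internal nodes carry an n, leaves carry an l.
data RoseTree n l = Node n [RoseTree n l] | Leaf l
  deriving (Eq, Show)

-- Maps both sides at once, like SelectBoth.
instance Bifunctor RoseTree where
  bimap f g (Node n branches) = Node (f n) (map (bimap f g) branches)
  bimap _ g (Leaf l) = Leaf (g l)

-- The Functor instance covers the second (leaf) side only.
instance Functor (RoseTree n) where
  fmap = second

-- Mapping over a small menu tree changes the leaves, not the node labels.
leafLengths :: RoseTree String Int
leafLengths = length &lt;$&gt; Node &quot;Edit&quot; [Leaf &quot;Cut&quot;, Leaf &quot;Copy&quot;]
-- leafLengths == Node &quot;Edit&quot; [Leaf 3, Leaf 4]</pre>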
</p> <h3 id="134b75d98069421e9fe70a8630ac140f"> Menu example <a href="#134b75d98069421e9fe70a8630ac140f" title="permalink">#</a> </h3> <p> As an example, imagine that you're defining part of a menu bar for an old-fashioned desktop application. Perhaps you're even loading the structure of the menu from a text file. Doing so, you could create a simple tree that represents the <em>edit</em> menu: </p> <p> <pre><span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;editMenuTemplate&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#a31515;">&quot;Edit&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#a31515;">&quot;Find&nbsp;and&nbsp;Replace&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;Find&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;Replace&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#a31515;">&quot;Case&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;Upper&quot;</span>), 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;Lower&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;Cut&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;Copy&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;Paste&quot;</span>));</pre> </p> <p> At this point, you have an <code>IRoseTree&lt;string, string&gt;</code>, so you might as well have used a <a href="/2018/08/06/a-tree-functor">'normal' tree</a> instead of a rose tree. 
The above template, however, is only a first step, because you have this <a href="https://en.wikipedia.org/wiki/Command_pattern">Command</a> class: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">Command</span>
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;Command(<span style="color:blue;">string</span>&nbsp;name)
&nbsp;&nbsp;&nbsp;&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Name&nbsp;=&nbsp;name;
&nbsp;&nbsp;&nbsp;&nbsp;}
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">string</span>&nbsp;Name&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;}
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">virtual</span>&nbsp;<span style="color:blue;">void</span>&nbsp;Execute()
&nbsp;&nbsp;&nbsp;&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;}
}</pre> </p> <p> Apart from this base class, you also have classes that derive from it: <code>FindCommand</code>, <code>ReplaceCommand</code>, and so on. These classes override the <code>Execute</code> method by implementing <em>find</em>, <em>replace</em>, etc. functionality. Imagine that you also have a store or dictionary of these derived objects. This enables you to transform the template tree into a useful user menu: </p> <p> <pre><span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:#2b91af;">Command</span>&gt;&nbsp;editMenu&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;name&nbsp;<span style="color:blue;">in</span>&nbsp;editMenuTemplate
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">select</span>&nbsp;commandStore.Lookup(name);</pre> </p> <p> Notice how this transforms only the leaves, using the command store's <code>Lookup</code> method. 
This example uses C# query syntax, because this is what the <code>Select</code> method enables, but you could also have written the translation by just calling the <code>Select</code> method. </p> <p> The internal nodes in a menu have no behaviour, so it makes little sense to attempt to turn them into <code>Command</code> objects as well. They're only there to provide structure to the menu. With a 'normal' tree, you wouldn't have been able to enrich only the leaves, while leaving the internal nodes untouched, but with a rose tree you can. </p> <p> The above example uses the <code>Select</code> method (via query syntax) to translate the leaves, thereby providing a demonstration of how to use the rose tree as the functor it is. </p> <h3 id="c77f1f9491b246f1bdb7c75d93eaa4ff"> Summary <a href="#c77f1f9491b246f1bdb7c75d93eaa4ff" title="permalink">#</a> </h3> <p> The <code>Select</code> method doesn't implement any behaviour not already provided by <code>SelectLeaf</code>, but it enables C# query syntax. The C# compiler understands functors, but not bifunctors, so when you have a bifunctor, you might as well light up that language feature by adding a <code>Select</code> method. </p> <p> <strong>Next:</strong> <a href="/2018/08/13/a-visitor-functor">A Visitor functor</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Rose tree bifunctor https://blog.ploeh.dk/2019/08/12/rose-tree-bifunctor 2019-08-12T10:33:00+00:00 Mark Seemann <div id="post"> <p> <em>A rose tree forms a bifunctor. An article for object-oriented developers.</em> </p> <p> This article is an instalment in <a href="/2018/12/24/bifunctors">an article series about bifunctors</a>. While the overview article explains that there are essentially two practically useful bifunctors, here's a third one: <a href="https://en.wikipedia.org/wiki/Rose_tree">rose trees</a>. 
</p> <h3 id="985e3bc5291c4f8ba98ce258e78f4ec8"> Mapping both dimensions <a href="#985e3bc5291c4f8ba98ce258e78f4ec8" title="permalink">#</a> </h3> <p> Like in the <a href="/2019/01/07/either-bifunctor">previous article on the Either bifunctor</a>, I'll start by implementing the simultaneous two-dimensional translation <code>SelectBoth</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N1</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;&nbsp;SelectBoth&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">N1</span>,&nbsp;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;source, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">N1</span>&gt;&nbsp;selectNode, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;&nbsp;selectLeaf) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.Cata( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;node:&nbsp;(n,&nbsp;branches)&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseNode</span>&lt;<span style="color:#2b91af;">N1</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;(selectNode(n),&nbsp;branches), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;leaf:&nbsp;l&nbsp;=&gt;&nbsp;(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N1</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;)<span style="color:blue;">new</span>&nbsp;<span 
style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">N1</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;(selectLeaf(l))); }</pre> </p> <p> This article uses the previously shown <a href="/2019/07/29/church-encoded-rose-tree">Church-encoded rose tree</a> and <a href="/2019/08/05/rose-tree-catamorphism">its catamorphism</a> <code>Cata</code>. </p> <p> In the <code>leaf</code> case, the <code>l</code> argument received by the lambda expression is an object of the type <code>L</code>, since the <code>source</code> tree is an <code>IRoseTree&lt;N, L&gt;</code> object; i.e. a tree with leaves of the type <code>L</code> and nodes of the type <code>N</code>. The <code>selectLeaf</code> argument is a function that converts an <code>L</code> object to an <code>L1</code> object. Since <code>l</code> is an <code>L</code> object, you can call <code>selectLeaf</code> with it to produce an <code>L1</code> object. You can use this resulting object to create a new <code>RoseLeaf&lt;N1, L1&gt;</code>. Keep in mind that while the <code>RoseLeaf</code> class requires two type arguments, it never requires an object of its <code>N</code> type argument, which means that you can create an object with any <em>node</em> type argument, including <code>N1</code>, even if you don't have an object of that type. </p> <p> In the <code>node</code> case, the lambda expression receives two objects: <code>n</code> and <code>branches</code>. The <code>n</code> object has the type <code>N</code>, while the <code>branches</code> object has the type <code>IEnumerable&lt;IRoseTree&lt;N1, L1&gt;&gt;</code>. In other words, the <code>branches</code> have already been translated to the desired result type. That's how the catamorphism works. This means that you only have to figure out how to translate the <code>N</code> object <code>n</code> to an <code>N1</code> object. 
The <code>selectNode</code> function argument can do that, so you can then create a new <code>RoseNode&lt;N1, L1&gt;</code> and return it. </p> <p> This works as expected: </p> <p> <pre>&gt; <span style="color:blue;">var</span>&nbsp;tree&nbsp;=&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#a31515;">&quot;foo&quot;</span>,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(42),&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(1337));
&gt; tree
RoseNode&lt;string, int&gt;("foo", IRoseTree&lt;string, int&gt;[2] { 42, 1337 })
&gt; tree.SelectBoth(s&nbsp;=&gt;&nbsp;s.Length,&nbsp;i&nbsp;=&gt;&nbsp;i.ToString())
RoseNode&lt;int, string&gt;(3, IRoseTree&lt;int, string&gt;[2] { "42", "1337" })</pre> </p> <p> This <em>C# Interactive</em> example shows how to convert a tree with internal string nodes and integer leaves to a tree of internal integer nodes and string leaves. The strings are converted to integers by taking their <code>Length</code>, while the integers are turned into strings using the standard <code>ToString</code> method available on all objects. </p> <h3 id="c0ea04cfe7d3412c86b9ba3953812025"> Mapping nodes <a href="#c0ea04cfe7d3412c86b9ba3953812025" title="permalink">#</a> </h3> <p> When you have <code>SelectBoth</code>, you can trivially implement the translations for each dimension in isolation. For <a href="/2018/12/31/tuple-bifunctor">tuple bifunctors</a>, I called these methods <code>SelectFirst</code> and <code>SelectSecond</code>, while for <a href="/2019/01/07/either-bifunctor">Either bifunctors</a>, I chose to name them <code>SelectLeft</code> and <code>SelectRight</code>. 
Continuing the trend of naming the translations after what they translate, instead of their positions, I'll name the corresponding methods here <code>SelectNode</code> and <code>SelectLeaf</code>. In <a href="https://www.haskell.org">Haskell</a>, the functions associated with <code>Data.Bifunctor</code> are always called <code>first</code> and <code>second</code>, but I see no reason to preserve such abstract naming in C#. In Haskell, these functions are part of the <code>Bifunctor</code> type class; the abstract names serve an actual purpose. This isn't the case in C#, so there's no reason to retain the abstract names. You might as well use names that communicate intent, which is what I've tried to do here. </p> <p> If you want to map only the internal nodes, you can implement a <code>SelectNode</code> method based on <code>SelectBoth</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N1</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;SelectNode&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">N1</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;source, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">N1</span>&gt;&nbsp;selector) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.SelectBoth(selector,&nbsp;l&nbsp;=&gt;&nbsp;l); }</pre> </p> <p> This simply uses the <code>l =&gt; l</code> lambda expression as an ad-hoc <em>identity</em> function, while passing <code>selector</code> as the <code>selectNode</code> argument to the <code>SelectBoth</code> method. 
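</p> <p> For comparison, here's a minimal Haskell sketch of the same idea. The <code>Rose</code> type and the <code>selectNode</code> and <code>tree</code> names are assumptions made for illustration; they aren't part of the C# code base. In the sketch, <code>bimap</code> plays the role of <code>SelectBoth</code>: </p> <p> <pre>import Data.Bifunctor (Bifunctor (..))

-- Hypothetical stand-in for IRoseTree&lt;N, L&gt;: internal nodes carry an 'n',
-- leaves carry an 'l'.
data Rose n l = Node n [Rose n l] | Leaf l
  deriving (Show, Eq)

instance Bifunctor Rose where
  -- bimap corresponds to SelectBoth: map nodes with f and leaves with g.
  bimap f g (Node n bs) = Node (f n) (map (bimap f g) bs)
  bimap _ g (Leaf l) = Leaf (g l)

-- Counterpart of SelectNode: map the nodes and pass 'id' for the leaves.
selectNode :: (n -&gt; n1) -&gt; Rose n l -&gt; Rose n1 l
selectNode f = bimap f id

-- The same shape as the tree used in the C# Interactive sessions.
tree :: Rose String Int
tree = Node "foo" [Leaf 42, Leaf 1337]</pre> </p> <p> Under these assumptions, <code>selectNode length tree</code> evaluates to <code>Node 3 [Leaf 42,Leaf 1337]</code>, the counterpart of calling <code>SelectNode(s =&gt; s.Length)</code> on the C# tree. Since the default implementation of <code>first</code> in <code>Data.Bifunctor</code> is <code>bimap f id</code>, <code>selectNode</code> is simply <code>first</code> under a more descriptive name. 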
</p> <p> You can use this to map the above <code>tree</code> to a tree made entirely of numbers: </p> <p> <pre>&gt; <span style="color:blue;">var</span>&nbsp;tree&nbsp;=&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#a31515;">&quot;foo&quot;</span>,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(42),&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(1337));
&gt; tree.SelectNode(s =&gt; s.Length)
RoseNode&lt;int, int&gt;(3, IRoseTree&lt;int, int&gt;[2] { 42, 1337 })</pre> </p> <p> Such a tree is, incidentally, isomorphic to a <a href="/2018/08/06/a-tree-functor">'normal' tree</a>. It might be a good exercise, if you need one, to demonstrate the isomorphism by writing functions that convert a <code>Tree&lt;T&gt;</code> into an <code>IRoseTree&lt;T, T&gt;</code>, and vice versa. 
</p> <h3 id="baa9136b506241e39e13639e43679b31"> Mapping leaves <a href="#baa9136b506241e39e13639e43679b31" title="permalink">#</a> </h3> <p> Similar to <code>SelectNode</code>, you can also trivially implement <code>SelectLeaf</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;&nbsp;SelectLeaf&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;source, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;&nbsp;selector) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.SelectBoth(n&nbsp;=&gt;&nbsp;n,&nbsp;selector); }</pre> </p> <p> This is another one-liner calling <code>SelectBoth</code>, with the difference that the identity function <code>n =&gt; n</code> is passed as the first argument, instead of as the last. 
This ensures that only <code>RoseLeaf</code> values are mapped: </p> <p> <pre>&gt; <span style="color:blue;">var</span>&nbsp;tree&nbsp;=&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#a31515;">&quot;foo&quot;</span>,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(42),&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(1337)); &gt; tree.SelectLeaf(i =&gt; i % 2 == 0) RoseNode&lt;string, bool&gt;("foo", IRoseTree&lt;string, bool&gt;[2] { true, false })</pre> </p> <p> In the above <em>C# Interactive</em> session, the leaves are mapped to Boolean values, indicating whether they're even or odd. </p> <h3 id="afddb846bd244f4aa8f658fb5716b392"> Identity laws <a href="#afddb846bd244f4aa8f658fb5716b392" title="permalink">#</a> </h3> <p> Rose trees obey all the bifunctor laws. While it's formal work to prove that this is the case, you can get an intuition for it via examples. Often, I use a property-based testing library like <a href="https://fscheck.github.io/FsCheck">FsCheck</a> or <a href="https://github.com/hedgehogqa/fsharp-hedgehog">Hedgehog</a> to demonstrate (not prove) that laws hold, but in this article, I'll keep it simple and only cover each law with a parametrised test. 
</p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">T</span>&nbsp;Id&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:#2b91af;">T</span>&nbsp;x)&nbsp;=&gt;&nbsp;x; <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:blue;">object</span>[]&gt;&nbsp;BifunctorLawsData { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">get</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">yield</span>&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;&quot;</span>)&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">yield</span>&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;foo&quot;</span>)&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">yield</span>&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(42)&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">yield</span>&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span 
style="color:#2b91af;">RoseTree</span>.Node(42,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;bar&quot;</span>))&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">yield</span>&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;exampleTree&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;} } [<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">MemberData</span>(<span style="color:blue;">nameof</span>(BifunctorLawsData))] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;SelectNodeObeysFirstFunctorLaw(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;t) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(t,&nbsp;t.SelectNode(Id)); }</pre> </p> <p> This test uses <a href="https://xunit.github.io">xUnit.net</a>'s <code>[Theory]</code> feature to supply a small set of example input values. The input values are defined by the <code>BifunctorLawsData</code> property, since I'll reuse the same values for all the bifunctor law demonstration tests. The <code>exampleTree</code> object is the tree shown in <a href="/2019/07/29/church-encoded-rose-tree">Church-encoded rose tree</a>. </p> <p> The tests also use the identity function implemented as a <code>private</code> function called <code>Id</code>, since C# doesn't come equipped with such a function in the Base Class Library. </p> <p> For all the <code>IRoseTree&lt;int, string&gt;</code> objects <code>t</code>, the test simply verifies that the original tree <code>t</code> is equal to the tree projected over the first axis with the <code>Id</code> function. 
</p> <p> Likewise, the first functor law applies when translating over the second dimension: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">MemberData</span>(<span style="color:blue;">nameof</span>(BifunctorLawsData))] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;SelectLeafObeysFirstFunctorLaw(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;t) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(t,&nbsp;t.SelectLeaf(Id)); }</pre> </p> <p> This is the same test as the previous test, with the only exception that it calls <code>SelectLeaf</code> instead of <code>SelectNode</code>. </p> <p> Both <code>SelectNode</code> and <code>SelectLeaf</code> are implemented by <code>SelectBoth</code>, so the real test is whether this method obeys the identity law: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">MemberData</span>(<span style="color:blue;">nameof</span>(BifunctorLawsData))] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;SelectBothObeysIdentityLaw(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;t) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(t,&nbsp;t.SelectBoth(Id,&nbsp;Id)); }</pre> </p> <p> Projecting over both dimensions with the identity function does, indeed, return an object equal to the input object. 
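</p> <p> The identity laws can also be sketched in Haskell. This is an illustration under assumptions, not a port of the xUnit.net tests: the <code>Rose</code> type, the <code>samples</code> list, and <code>identityLawsHold</code> are hypothetical names, and the samples only loosely mirror <code>BifunctorLawsData</code>: </p> <p> <pre>import Data.Bifunctor (Bifunctor (..))

-- Hypothetical stand-in for IRoseTree&lt;N, L&gt;.
data Rose n l = Node n [Rose n l] | Leaf l
  deriving (Show, Eq)

instance Bifunctor Rose where
  bimap f g (Node n bs) = Node (f n) (map (bimap f g) bs)
  bimap _ g (Leaf l) = Leaf (g l)

-- A few sample trees: two leaves, a node without branches, a node with one leaf.
samples :: [Rose Int String]
samples =
  [ Leaf ""
  , Leaf "foo"
  , Node 42 []
  , Node 42 [Leaf "bar"]
  ]

-- True when mapping with 'id' along either axis, or both, changes nothing.
identityLawsHold :: Bool
identityLawsHold = all check samples
  where
    check t =
      and
        [ first id t == t
        , second id t == t
        , bimap id id t == t
        ]</pre> </p> <p> As with the C# tests, a handful of examples demonstrates the laws without proving them. 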
</p> <h3 id="bfaa0b763e5346c488f4bd9576ab894c"> Consistency law <a href="#bfaa0b763e5346c488f4bd9576ab894c" title="permalink">#</a> </h3> <p> In general, it shouldn't matter whether you map with <code>SelectBoth</code> or a combination of <code>SelectNode</code> and <code>SelectLeaf</code>: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">MemberData</span>(<span style="color:blue;">nameof</span>(BifunctorLawsData))] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;ConsistencyLawHolds(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;t) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">DateTime</span>&nbsp;f(<span style="color:blue;">int</span>&nbsp;i)&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(i); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">bool</span>&nbsp;g(<span style="color:blue;">string</span>&nbsp;s)&nbsp;=&gt;&nbsp;<span style="color:blue;">string</span>.IsNullOrWhiteSpace(s); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(t.SelectBoth(f,&nbsp;g),&nbsp;t.SelectLeaf(g).SelectNode(f)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectNode(f).SelectLeaf(g), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectLeaf(g).SelectNode(f)); }</pre> </p> <p> This example creates two local functions <code>f</code> and <code>g</code>. The first function, <code>f</code>, creates a new <code>DateTime</code> object from an integer, using one of the <code>DateTime</code> constructor overloads. The second function, <code>g</code>, just delegates to <code>string.IsNullOrWhiteSpace</code>, although I want to stress that this is just an example. 
The law should hold for any two (<a href="https://en.wikipedia.org/wiki/Pure_function">pure</a>) functions. </p> <p> The test then verifies that you get the same result from calling <code>SelectBoth</code> as when you call <code>SelectNode</code> followed by <code>SelectLeaf</code>, or the other way around. </p> <h3 id="dd3046c49d564991bb47924b6e8e65fb"> Composition laws <a href="#dd3046c49d564991bb47924b6e8e65fb" title="permalink">#</a> </h3> <p> The composition laws insist that you can compose functions, or translations, and that again, the choice to do one or the other doesn't matter. Along each of the axes, it's just the second functor law applied. This parametrised test demonstrates that the law holds for <code>SelectNode</code>: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">MemberData</span>(<span style="color:blue;">nameof</span>(BifunctorLawsData))]
<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;SecondFunctorLawHoldsForSelectNode(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;t)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">char</span>&nbsp;f(<span style="color:blue;">bool</span>&nbsp;b)&nbsp;=&gt;&nbsp;b&nbsp;?&nbsp;<span style="color:#a31515;">&#39;T&#39;</span>&nbsp;:&nbsp;<span style="color:#a31515;">&#39;F&#39;</span>;
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">bool</span>&nbsp;g(<span style="color:blue;">int</span>&nbsp;i)&nbsp;=&gt;&nbsp;i&nbsp;%&nbsp;2&nbsp;==&nbsp;0;
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectNode(x&nbsp;=&gt;&nbsp;f(g(x))),
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectNode(g).SelectNode(f));
}</pre> </p> <p> Here, <code>f</code> is a local function that returns the character <code>'T'</code> for <code>true</code>, and <code>'F'</code> for 
<code>false</code>; <code>g</code> is the <em>even</em> function. The second functor law states that mapping <code>f(g(x))</code> in a single step is equivalent to first mapping over <code>g</code> and then mapping the result of that using <code>f</code>. </p> <p> The same law applies if you fix the first dimension and translate over the second: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">MemberData</span>(<span style="color:blue;">nameof</span>(BifunctorLawsData))]
<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;SecondFunctorLawHoldsForSelectLeaf(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;t)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">bool</span>&nbsp;f(<span style="color:blue;">int</span>&nbsp;x)&nbsp;=&gt;&nbsp;x&nbsp;%&nbsp;2&nbsp;==&nbsp;0;
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;g(<span style="color:blue;">string</span>&nbsp;s)&nbsp;=&gt;&nbsp;s.Length;
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectLeaf(x&nbsp;=&gt;&nbsp;f(g(x))),
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectLeaf(g).SelectLeaf(f));
}</pre> </p> <p> Here, <code>f</code> is the <em>even</em> function, whereas <code>g</code> is a local function that returns the length of a string. Again, the test demonstrates that the output is the same whether you map over an intermediary step, or whether you map using only a single step. 
</p> <p> This generalises to the composition law for <code>SelectBoth</code>: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">MemberData</span>(<span style="color:blue;">nameof</span>(BifunctorLawsData))]
<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;SelectBothCompositionLawHolds(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;t)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">char</span>&nbsp;f(<span style="color:blue;">bool</span>&nbsp;b)&nbsp;=&gt;&nbsp;b&nbsp;?&nbsp;<span style="color:#a31515;">&#39;T&#39;</span>&nbsp;:&nbsp;<span style="color:#a31515;">&#39;F&#39;</span>;
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">bool</span>&nbsp;g(<span style="color:blue;">int</span>&nbsp;x)&nbsp;=&gt;&nbsp;x&nbsp;%&nbsp;2&nbsp;==&nbsp;0;
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">bool</span>&nbsp;h(<span style="color:blue;">int</span>&nbsp;x)&nbsp;=&gt;&nbsp;x&nbsp;%&nbsp;2&nbsp;==&nbsp;0;
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;i(<span style="color:blue;">string</span>&nbsp;s)&nbsp;=&gt;&nbsp;s.Length;
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectBoth(x&nbsp;=&gt;&nbsp;f(g(x)),&nbsp;y&nbsp;=&gt;&nbsp;h(i(y))),
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectBoth(g,&nbsp;i).SelectBoth(f,&nbsp;h));
}</pre> </p> <p> Again, whether you translate in one or two steps shouldn't affect the outcome. </p> <p> As all of these tests demonstrate, the bifunctor laws hold for rose trees. The tests only showcase five examples, but I hope they give you an intuition for how any rose tree is a bifunctor. After all, the <code>SelectNode</code>, <code>SelectLeaf</code>, and <code>SelectBoth</code> methods are all generic, and they behave the same for all generic type arguments. 
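</p> <p> The consistency and composition laws can be sketched in Haskell as well. Again, this is only an illustration: the <code>Rose</code> type, the <code>samples</code> list, and the two <code>...Holds</code> values are hypothetical names, and the functions being composed are arbitrary pure functions chosen for the example: </p> <p> <pre>import Data.Bifunctor (Bifunctor (..))

-- Hypothetical stand-in for IRoseTree&lt;N, L&gt;.
data Rose n l = Node n [Rose n l] | Leaf l
  deriving (Show, Eq)

instance Bifunctor Rose where
  bimap f g (Node n bs) = Node (f n) (map (bimap f g) bs)
  bimap _ g (Leaf l) = Leaf (g l)

samples :: [Rose Int String]
samples =
  [ Leaf ""
  , Leaf "foo"
  , Node 42 []
  , Node 42 [Leaf "bar", Node 7 [Leaf "baz"]]
  ]

-- Consistency: bimap agrees with first and second applied in either order.
consistencyHolds :: Bool
consistencyHolds = all check samples
  where
    f = show :: Int -&gt; String
    g = null :: String -&gt; Bool
    check t =
      and
        [ bimap f g t == first f (second g t)
        , first f (second g t) == second g (first f t)
        ]

-- Composition: mapping composed functions once equals mapping twice.
compositionHolds :: Bool
compositionHolds = all check samples
  where
    f b = if b then 'T' else 'F'
    g = even :: Int -&gt; Bool
    h = even :: Int -&gt; Bool
    i = length :: String -&gt; Int
    check t = bimap (f . g) (h . i) t == bimap f h (bimap g i t)</pre> </p> <p> The names <code>f</code>, <code>g</code>, <code>h</code>, and <code>i</code> in <code>compositionHolds</code> follow the C# test above, so the two can be read side by side. 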
</p> <h3 id="a1a5dea3d85d4ed1a3ee3fb0a4dca820"> Summary <a href="#a1a5dea3d85d4ed1a3ee3fb0a4dca820" title="permalink">#</a> </h3> <p> Rose trees are bifunctors. You can translate the node and leaf dimension of a rose tree independently of each other, and the bifunctor laws hold for any pure translation, no matter how you compose the projections. </p> <p> As always, there can be performance differences between the various compositions, but the outputs will be the same regardless of composition. </p> <p> A functor, and by extension, a bifunctor, is a structure-preserving map. This means that any projection preserves the structure of the underlying container. For rose trees this means that the shape of the tree remains the same. The number of leaves remains the same, as does the number of internal nodes. </p> <p> <strong>Next:</strong> <a href="/2018/01/08/software-design-isomorphisms">Software design isomorphisms</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Rose tree catamorphism https://blog.ploeh.dk/2019/08/05/rose-tree-catamorphism 2019-08-05T08:30:00+00:00 Mark Seemann <div id="post"> <p> <em>The catamorphism for a tree with different types of nodes and leaves is made up from two functions.</em> </p> <p> This article is part of an <a href="/2019/04/29/catamorphisms">article series about catamorphisms</a>. A catamorphism is a <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstraction</a> that describes how to digest a data structure into a potentially more compact value. </p> <p> This article presents the catamorphism for a <a href="https://en.wikipedia.org/wiki/Rose_tree">rose tree</a>, as well as how to identify it. The beginning of this article presents the catamorphism in C#, with examples. The rest of the article describes how to deduce the catamorphism. 
This part of the article presents my work in <a href="https://www.haskell.org">Haskell</a>. Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix. </p> <p> A rose tree is a general-purpose data structure where each node in a tree has an associated value. Each node can have an arbitrary number of branches, including none. What distinguishes a rose tree from just any <a href="https://en.wikipedia.org/wiki/Tree_(data_structure)">tree</a> is that internal nodes can hold values of a different type than leaf values. </p> <p> <img src="/content/binary/rose-tree-example.png" alt="A rose tree example diagram, with internal nodes containing integers, and leafs containing strings."> </p> <p> The diagram shows an example of a tree of internal integers and leaf strings. All internal nodes contain integer values, and all leaves contain strings. Each node can have an arbitrary number of branches. </p> <h3 id="078386d5f3924a63add86ff199fd88d0"> C# catamorphism <a href="#078386d5f3924a63add86ff199fd88d0" title="permalink">#</a> </h3> <p> As a C# representation of a rose tree, I'll use the <a href="/2019/07/29/church-encoded-rose-tree">Church-encoded rose tree I've previously described</a>. 
The catamorphism is this extension method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">TResult</span>&nbsp;Cata&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;(
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;tree,
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">TResult</span>&gt;,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;node,
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;leaf)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;tree.Match(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;node:&nbsp;(n,&nbsp;branches)&nbsp;=&gt;&nbsp;node(n,&nbsp;branches.Select(t&nbsp;=&gt;&nbsp;t.Cata(node,&nbsp;leaf))),
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;leaf:&nbsp;leaf);
}</pre> </p> <p> Like most of the other catamorphisms shown in this article series, this one consists of two functions. One that handles the <em>leaf</em> case, and one that handles the partially reduced <em>node</em> case. Compare it with the <a href="/2019/06/10/tree-catamorphism">tree catamorphism</a>: notice that the rose tree catamorphism's <code>node</code> function is identical to the tree catamorphism. The <code>leaf</code> function, however, is new. </p> <p> In previous articles, you've seen other examples of catamorphisms for <a href="/2018/05/22/church-encoding">Church-encoded</a> types. 
The most common pattern has been that the Church encoding (the <code>Match</code> method) was also the catamorphism, with the <a href="/2019/05/13/peano-catamorphism">Peano catamorphism</a> being the only exception so far. When it comes to the Peano catamorphism, however, I'm not entirely confident that the difference between Church encoding and catamorphism is real, or whether it's just an artefact of the way I originally designed the Church encoding. </p> <p> When it comes to the present rose tree, however, notice that the catamorphism is distinctly different from the Church encoding. That's the reason I called the method <code>Cata</code> instead of <code>Match</code>. </p> <p> The method simply delegates the <code>leaf</code> handler to <code>Match</code>, while it adds behaviour to the <code>node</code> case. It works the same way as for the 'normal' tree catamorphism. </p> <h3 id="87e2c79711c24c63a5ed82fbe4f7b581"> Examples <a href="#87e2c79711c24c63a5ed82fbe4f7b581" title="permalink">#</a> </h3> <p> You can use <code>Cata</code> to implement most other behaviour you'd like <code>IRoseTree&lt;N, L&gt;</code> to have. In a future article, you'll see how to <a href="/2019/08/12/rose-tree-bifunctor">turn the rose tree into a bifunctor</a> and <a href="/2019/08/19/a-rose-tree-functor">functor</a>, so here, we'll look at some other, more ad hoc, examples. As is also the case for the 'normal' tree, you can calculate the sum of all nodes, if you can associate a number with each node. </p> <p> Consider the example tree in the above diagram. 
You can create it as an <code>IRoseTree&lt;int, string&gt;</code> object like this: </p> <p> <pre><span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;exampleTree&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(42, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(1337, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;foo&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;bar&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(2112, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(90125, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;baz&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;qux&quot;</span>), 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;quux&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;quuz&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;corge&quot;</span>));</pre> </p> <p> If you want to calculate a sum for a tree like that, you can use the integers for the internal nodes, and perhaps the length of the strings of the leaves. That hardly makes much sense, but is technically possible: </p> <p> <pre>&gt; exampleTree.Cata((x,&nbsp;xs)&nbsp;=&gt;&nbsp;x&nbsp;+&nbsp;xs.Sum(),&nbsp;x&nbsp;=&gt;&nbsp;x.Length) 93641</pre> </p> <p> Perhaps slightly more useful is to count the number of leaves: </p> <p> <pre>&gt; exampleTree.Cata((_,&nbsp;xs)&nbsp;=&gt;&nbsp;xs.Sum(),&nbsp;_&nbsp;=&gt;&nbsp;1) 7</pre> </p> <p> A leaf node has, by definition, exactly one leaf node, so the <code>leaf</code> lambda expression always returns <code>1</code>. In the <code>node</code> case, <code>xs</code> contains the partially summed leaf node count, so just <code>Sum</code> those together while ignoring the value of the internal node. 
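</p> <p> The two reductions above can be sketched in Haskell with an ordinary recursive data type. This is a simplified illustration, not the <code>Fix</code>-based derivation that follows later in the article; <code>Rose</code>, <code>cataRose</code>, <code>sumTree</code>, and <code>leafCount</code> are hypothetical names: </p> <p> <pre>-- Hypothetical stand-in for IRoseTree&lt;N, L&gt;.
data Rose n l = Node n [Rose n l] | Leaf l
  deriving (Show, Eq)

-- Counterpart of Cata: 'node' combines a node value with the already
-- reduced branches, while 'leaf' reduces a single leaf value.
cataRose :: (n -&gt; [r] -&gt; r) -&gt; (l -&gt; r) -&gt; Rose n l -&gt; r
cataRose node leaf = go
  where
    go (Node n bs) = node n (map go bs)
    go (Leaf l) = leaf l

-- The example tree from the diagram.
exampleTree :: Rose Int String
exampleTree =
  Node 42
    [ Node 1337 [Leaf "foo", Leaf "bar"]
    , Node 2112
        [ Node 90125 [Leaf "baz", Leaf "qux", Leaf "quux"]
        , Leaf "quuz"
        ]
    , Leaf "corge"
    ]

-- Sum of all node values plus the lengths of all leaf strings.
sumTree :: Int
sumTree = cataRose (\x xs -&gt; x + sum xs) length exampleTree

-- Number of leaves: each leaf counts as 1, each node sums its branches.
leafCount :: Int
leafCount = cataRose (\_ xs -&gt; sum xs) (const 1) exampleTree</pre> </p> <p> With this tree, <code>sumTree</code> is <code>93641</code> and <code>leafCount</code> is <code>7</code>, the same results as in the C# Interactive sessions. 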
</p> <p> You can also measure the maximum depth of the tree: </p> <p> <pre>&gt; exampleTree.Cata((_,&nbsp;xs)&nbsp;=&gt;&nbsp;1&nbsp;+&nbsp;xs.Max(),&nbsp;_&nbsp;=&gt;&nbsp;0) 3</pre> </p> <p> Consistent with the example for 'normal' trees, you can arbitrarily decide that the depth of a leaf node is <code>0</code>, so again, the <code>leaf</code> lambda expression just returns a constant value. The <code>node</code> lambda expression takes the <code>Max</code> of the partially reduced <code>xs</code> and adds <code>1</code>, since an internal node represents another level of depth in a tree. </p> <h3 id="9e673c50edc14c1790a9e89a67d069d1"> Rose tree F-Algebra <a href="#9e673c50edc14c1790a9e89a67d069d1" title="permalink">#</a> </h3> <p> As in the <a href="/2019/06/10/tree-catamorphism">previous article</a>, I'll use <code>Fix</code> and <code>cata</code> as explained in <a href="https://bartoszmilewski.com">Bartosz Milewski</a>'s excellent <a href="https://bartoszmilewski.com/2017/02/28/f-algebras/">article on F-Algebras</a>. 
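</p> <p> As a reminder, <code>Fix</code> and <code>cata</code> are usually defined as in the following sketch. The <code>NatF</code> functor, <code>three</code>, and <code>toInt</code> are hypothetical additions that only serve to show the definitions in use; they aren't part of the rose tree code: </p> <p> <pre>-- Fixed point of a functor: ties the recursive knot of 'f'.
newtype Fix f = Fix { unFix :: f (Fix f) }

-- Generic catamorphism: reduce the children first, then apply the algebra.
cata :: Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a
cata alg = alg . fmap (cata alg) . unFix

-- Hypothetical toy functor for natural numbers, used only as an example.
data NatF c = ZeroF | SuccF c

instance Functor NatF where
  fmap _ ZeroF = ZeroF
  fmap f (SuccF c) = SuccF (f c)

-- The number three: three successors wrapped around zero.
three :: Fix NatF
three = Fix (SuccF (Fix (SuccF (Fix (SuccF (Fix ZeroF))))))

-- An algebra that counts the successors.
toInt :: Fix NatF -&gt; Int
toInt = cata alg
  where
    alg ZeroF = 0
    alg (SuccF n) = n + 1</pre> </p> <p> Here, <code>toInt three</code> evaluates to <code>3</code>. The rose tree derivation follows the same recipe with a different functor. 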
</p> <p> As always, start with the underlying endofunctor: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;RoseTreeF&nbsp;a&nbsp;b&nbsp;c&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;NodeF&nbsp;{&nbsp;nodeValue&nbsp;::&nbsp;a,&nbsp;nodes&nbsp;::&nbsp;ListFix&nbsp;c&nbsp;} &nbsp;&nbsp;|&nbsp;LeafF&nbsp;{&nbsp;leafValue&nbsp;::&nbsp;b&nbsp;} &nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;(<span style="color:blue;">RoseTreeF</span>&nbsp;a&nbsp;b)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;f&nbsp;(NodeF&nbsp;x&nbsp;ns)&nbsp;=&nbsp;NodeF&nbsp;x&nbsp;$&nbsp;<span style="color:blue;">fmap</span>&nbsp;f&nbsp;ns &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;&nbsp;&nbsp;&nbsp;(LeafF&nbsp;x)&nbsp;=&nbsp;LeafF&nbsp;x</pre> </p> <p> Instead of using Haskell's standard list (<code>[]</code>) for the nodes, I've used <code>ListFix</code> from <a href="/2019/05/27/list-catamorphism">the article on list catamorphism</a>. This should, hopefully, demonstrate how you can build on already established definitions derived from first principles. </p> <p> As usual, I've called the 'data' types <code>a</code> and <code>b</code>, and the carrier type <code>c</code> (for <em>carrier</em>). The <code>Functor</code> instance as usual translates the carrier type; the <code>fmap</code> function has the type <code>(c -&gt; c1) -&gt; RoseTreeF a b c -&gt; RoseTreeF a b c1</code>. </p> <p> As was the case when deducing the recent catamorphisms, Haskell isn't too happy about defining instances for a type like <code>Fix (RoseTreeF a b)</code>. 
To address that problem, you can introduce a <code>newtype</code> wrapper: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;RoseTreeFix&nbsp;a&nbsp;b&nbsp;= &nbsp;&nbsp;RoseTreeFix&nbsp;{&nbsp;unRoseTreeFix&nbsp;::&nbsp;Fix&nbsp;(RoseTreeF&nbsp;a&nbsp;b)&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> You can define <code>Bifunctor</code>, <code>Bifoldable</code>, <code>Bitraversable</code>, etc. instances for this type without resorting to any funky GHC extensions. Keep in mind that ultimately, the purpose of all this code is just to figure out what the catamorphism looks like. This code isn't intended for actual use. </p> <p> A pair of helper functions make it easier to define <code>RoseTreeFix</code> values: </p> <p> <pre><span style="color:#2b91af;">roseLeafF</span>&nbsp;::&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;a&nbsp;b roseLeafF&nbsp;=&nbsp;RoseTreeFix&nbsp;.&nbsp;Fix&nbsp;.&nbsp;LeafF <span style="color:#2b91af;">roseNodeF</span>&nbsp;::&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;(<span style="color:blue;">RoseTreeFix</span>&nbsp;a&nbsp;b)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;a&nbsp;b roseNodeF&nbsp;x&nbsp;=&nbsp;RoseTreeFix&nbsp;.&nbsp;Fix&nbsp;.&nbsp;NodeF&nbsp;x&nbsp;.&nbsp;<span style="color:blue;">fmap</span>&nbsp;unRoseTreeFix</pre> </p> <p> <code>roseLeafF</code> creates a leaf node: </p> <p> <pre>Prelude Fix List RoseTree&gt; roseLeafF "ploeh" RoseTreeFix {unRoseTreeFix = Fix (LeafF "ploeh")}</pre> </p> <p> <code>roseNodeF</code> is a helper function to create internal nodes: </p> <p> <pre>Prelude Fix List RoseTree&gt; roseNodeF 6 (consF (roseLeafF 0) nilF) RoseTreeFix {unRoseTreeFix = Fix (NodeF 
6 (ListFix (Fix (ConsF (Fix (LeafF 0)) (Fix NilF)))))}</pre> </p> <p> Even with helper functions, construction of <code>RoseTreeFix</code> values is cumbersome, but keep in mind that the code shown here isn't meant to be used in practice. The goal is only to deduce catamorphisms from more basic universal abstractions, and you now have all you need to do that. </p> <h3 id="0bfc3f600a9e43eea1026f1a4a3b7604"> Haskell catamorphism <a href="#0bfc3f600a9e43eea1026f1a4a3b7604" title="permalink">#</a> </h3> <p> At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (<code>RoseTreeF a b</code>), and an object <code>c</code>, but you still need to find a morphism <code>RoseTreeF a b c -&gt; c</code>. Notice that the algebra you have to find is the function that reduces the functor to its <em>carrier type</em> <code>c</code>, not any of the 'data types' <code>a</code> or <code>b</code>. This takes some time to get used to, but that's how catamorphisms work. This doesn't mean, however, that you get to ignore <code>a</code> or <code>b</code>, as you'll see. </p> <p> As in the previous articles, start by writing a function that will become the catamorphism, based on <code>cata</code>: </p> <p> <pre>roseTreeF&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unRoseTreeFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(NodeF&nbsp;x&nbsp;ns)&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;(LeafF&nbsp;x)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> While this compiles, with its <code>undefined</code> implementations, it obviously doesn't do anything useful. I find, however, that it helps me think. How can you return a value of the type <code>c</code> from the <code>LeafF</code> case? 
You could pass a function argument to the <code>roseTreeF</code> function and use it with <code>x</code>: </p> <p> <pre>roseTreeF&nbsp;fl&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unRoseTreeFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(NodeF&nbsp;x&nbsp;ns)&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;(LeafF&nbsp;x)&nbsp;=&nbsp;fl&nbsp;x</pre> </p> <p> While you could, technically, pass an argument of the type <code>c</code> to <code>roseTreeF</code> and then return that value from the <code>LeafF</code> case, that would mean that you would ignore the <code>x</code> value. This would be incorrect, so instead, make the argument a function and call it with <code>x</code>. Likewise, you can deal with the <code>NodeF</code> case in the same way: </p> <p> <pre><span style="color:#2b91af;">roseTreeF</span>&nbsp;::&nbsp;(a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;c&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;(b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;a&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c roseTreeF&nbsp;fn&nbsp;fl&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unRoseTreeFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(NodeF&nbsp;x&nbsp;ns)&nbsp;=&nbsp;fn&nbsp;x&nbsp;ns &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;(LeafF&nbsp;x)&nbsp;=&nbsp;fl&nbsp;x</pre> </p> <p> This works. Since <code>cata</code> has the type <code>Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a</code>, that means that <code>alg</code> has the type <code>f a -&gt; a</code>. 
In the case of <code>RoseTreeF</code>, the compiler infers that the <code>alg</code> function has the type <code>RoseTreeF a b c -&gt; c</code>, which is just what you need! </p> <p> You can now see what the carrier type <code>c</code> is for. It's the type that the algebra extracts, and thus the type that the catamorphism returns. </p> <p> This, then, is the catamorphism for a rose tree. As has been the most common pattern so far, it's a pair, made from two functions. It's still not the only possible catamorphism, since you could trivially flip the arguments to <code>roseTreeF</code>, or the arguments to <code>fn</code>. </p> <p> I've chosen the representation shown here because it's similar to the catamorphism I've shown for a 'normal' tree, just with the added function for leaves. </p> <h3 id="256fd0a09c4a4651b6c27b5626b0fb33"> Basis <a href="#256fd0a09c4a4651b6c27b5626b0fb33" title="permalink">#</a> </h3> <p> You can implement most other useful functionality with <code>roseTreeF</code>. Here's the <code>Bifunctor</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Bifunctor</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;bimap&nbsp;f&nbsp;s&nbsp;=&nbsp;roseTreeF&nbsp;(roseNodeF&nbsp;.&nbsp;f)&nbsp;(roseLeafF&nbsp;.&nbsp;s)</pre> </p> <p> Notice how naturally the catamorphism implements <code>bimap</code>. 
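</p> <p> To see the same idea without the <code>Fix</code> machinery, here's a sketch that uses an ordinary recursive data type and Haskell's standard list instead of <code>ListFix</code>. The names <code>Tree</code>, <code>cataTree</code>, and <code>bimapTree</code> are assumptions made for this illustration; they aren't defined in this article: </p> <p> <pre>-- A plain rose tree: internal nodes carry an a, leaves carry a b.
data Tree a b = Node a [Tree a b] | Leaf b deriving (Show, Eq)

-- The catamorphism: one function for internal nodes, one for leaves.
-- The node function receives the already reduced branches.
cataTree :: (a -&gt; [c] -&gt; c) -&gt; (b -&gt; c) -&gt; Tree a b -&gt; c
cataTree fn fl = go
  where
    go (Node x ts) = fn x (map go ts)
    go (Leaf x) = fl x

-- Mapping both type parameters rebuilds the tree with the same shape.
bimapTree :: (a -&gt; c) -&gt; (b -&gt; d) -&gt; Tree a b -&gt; Tree c d
bimapTree f s = cataTree (Node . f) (Leaf . s)</pre> </p> <p> The shape of <code>bimapTree</code> mirrors the <code>bimap</code> implementation above: each case is rebuilt with its own constructor after the corresponding function has been applied. 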
</p> <p> From that instance, the <code>Functor</code> instance trivially follows: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;(<span style="color:blue;">RoseTreeFix</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;=&nbsp;second</pre> </p> <p> You could probably also add <code>Applicative</code> and <code>Monad</code> instances, but I find those hard to grasp, so I'm going to skip them in favour of <code>Bifoldable</code>: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Bifoldable</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;bifoldMap&nbsp;f&nbsp;=&nbsp;roseTreeF&nbsp;(\x&nbsp;xs&nbsp;-&gt;&nbsp;f&nbsp;x&nbsp;&lt;&gt;&nbsp;fold&nbsp;xs)</pre> </p> <p> The <code>Bifoldable</code> instance enables you to trivially implement the <code>Foldable</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Foldable</span>&nbsp;(<span style="color:blue;">RoseTreeFix</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;foldMap&nbsp;=&nbsp;bifoldMap&nbsp;mempty</pre> </p> <p> You may find the presence of <code>mempty</code> puzzling, since <code>bifoldMap</code> takes two functions as arguments. Is <code>mempty</code> a function? </p> <p> Yes, <code>mempty</code> can be a function. Here, it is. There's a <code>Monoid</code> instance for any function <code>a -&gt; m</code>, where <code>m</code> is a <code>Monoid</code> instance, and <code>mempty</code> is the identity for that monoid. That's the instance in use here. 
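</p> <p> If you'd like to convince yourself of this, here's a small sketch that uses only the <code>base</code> library. The function names <code>ignoreInt</code> and <code>describe</code> are made up for the occasion: </p> <p> <pre>import Data.Monoid (Sum (..))

-- mempty at a function type ignores its input and returns the mempty
-- of the result monoid; here that is Sum 0.
ignoreInt :: Int -&gt; Sum Int
ignoreInt = mempty

-- Functions into a monoid combine pointwise with (&lt;&gt;).
describe :: Int -&gt; String
describe = show &lt;&gt; const "!"</pre> </p> <p> Here, <code>getSum (ignoreInt 42)</code> is <code>0</code>, and <code>describe 42</code> is <code>"42!"</code>. In the <code>Foldable</code> instance, <code>mempty</code> plays the role of <code>ignoreInt</code>: it discards the values of the internal nodes, so that only the leaves contribute to the result. 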
</p> <p> Just as <code>RoseTreeFix</code> is <code>Bifoldable</code>, it's also <code>Bitraversable</code>: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Bitraversable</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;bitraverse&nbsp;f&nbsp;s&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;roseTreeF&nbsp;(\x&nbsp;xs&nbsp;-&gt;&nbsp;roseNodeF&nbsp;&lt;$&gt;&nbsp;f&nbsp;x&nbsp;&lt;*&gt;&nbsp;sequenceA&nbsp;xs)&nbsp;(<span style="color:blue;">fmap</span>&nbsp;roseLeafF&nbsp;.&nbsp;s)</pre> </p> <p> You can comfortably implement the <code>Traversable</code> instance based on the <code>Bitraversable</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Traversable</span>&nbsp;(<span style="color:blue;">RoseTreeFix</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;sequenceA&nbsp;=&nbsp;bisequenceA&nbsp;.&nbsp;first&nbsp;pure</pre> </p> <p> That rose trees are <code>Traversable</code> turns out to be useful, as a future article will show. </p> <h3 id="c02950d3b4954435b384b1f7520d24d4"> Relationships <a href="#c02950d3b4954435b384b1f7520d24d4" title="permalink">#</a> </h3> <p> As was the case for 'normal' trees, the catamorphism for rose trees is more powerful than the <em>fold</em>. There are operations that you can express with the <code>Foldable</code> instance, but other operations that you can't. Consider the tree shown in the diagram at the beginning of the article. This is also the tree that the above C# examples use. 
In Haskell, using <code>RoseTreeFix</code>, you can define that tree like this: </p> <p> <pre>exampleTree&nbsp;=
&nbsp;&nbsp;roseNodeF&nbsp;42&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;roseNodeF&nbsp;1337&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(roseLeafF&nbsp;<span style="color:#a31515;">&quot;foo&quot;</span>)&nbsp;$
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(roseLeafF&nbsp;<span style="color:#a31515;">&quot;bar&quot;</span>)&nbsp;nilF))&nbsp;$
&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;roseNodeF&nbsp;2112&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;roseNodeF&nbsp;90125&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(roseLeafF&nbsp;<span style="color:#a31515;">&quot;baz&quot;</span>)&nbsp;$
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(roseLeafF&nbsp;<span style="color:#a31515;">&quot;qux&quot;</span>)&nbsp;$
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(roseLeafF&nbsp;<span style="color:#a31515;">&quot;quux&quot;</span>)&nbsp;nilF))&nbsp;$
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(roseLeafF&nbsp;<span style="color:#a31515;">&quot;quuz&quot;</span>)&nbsp;nilF))&nbsp;$
&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;roseLeafF&nbsp;<span style="color:#a31515;">&quot;corge&quot;</span>)
&nbsp;&nbsp;&nbsp;&nbsp;nilF)</pre> </p> <p> You can trivially calculate the sum of string lengths of all leaves, using only the <code>Foldable</code> instance: </p> <p> <pre>Prelude RoseTree&gt; sum $ length &lt;$&gt; exampleTree
25</pre> </p> <p> You can also fairly easily calculate a sum of all nodes, using the length of the strings as in the above C# example, but that requires the <code>Bifoldable</code> instance: </p> <p> 
<pre>Prelude Data.Bifoldable Data.Semigroup RoseTree&gt; bifoldMap Sum (Sum . length) exampleTree
Sum {getSum = 93641}</pre> </p> <p> Fortunately, we get the same result as above. </p> <p> Counting leaves, or measuring the depth of a tree, on the other hand, is impossible with the <code>Foldable</code> instance, but interestingly, it turns out that counting leaves is possible with the <code>Bifoldable</code> instance: </p> <p> <pre><span style="color:#2b91af;">countLeaves</span>&nbsp;::&nbsp;(<span style="color:blue;">Bifoldable</span>&nbsp;p,&nbsp;<span style="color:blue;">Num</span>&nbsp;n)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;p&nbsp;a&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;n
countLeaves&nbsp;=&nbsp;getSum&nbsp;.&nbsp;bifoldMap&nbsp;(<span style="color:blue;">const</span>&nbsp;$&nbsp;Sum&nbsp;0)&nbsp;(<span style="color:blue;">const</span>&nbsp;$&nbsp;Sum&nbsp;1)</pre> </p> <p> This works well with the example tree: </p> <p> <pre>Prelude RoseTree&gt; countLeaves exampleTree
7</pre> </p> <p> Notice, however, that <code>countLeaves</code> works for any <code>Bifoldable</code> instance. Does that mean that you can 'count the leaves' of a tuple? Yes, it does: </p> <p> <pre>Prelude RoseTree&gt; countLeaves ("foo", "bar")
1
Prelude RoseTree&gt; countLeaves (1, 42)
1</pre> </p> <p> Or what about <code>EitherFix</code>: </p> <p> <pre>Prelude RoseTree Either&gt; countLeaves $ leftF "foo"
0
Prelude RoseTree Either&gt; countLeaves $ rightF "bar"
1</pre> </p> <p> Notice that 'counting the leaves' of tuples always returns <code>1</code>, while 'counting the leaves' of <code>Either</code> always returns <code>0</code> for <code>Left</code> values, and <code>1</code> for <code>Right</code> values. This is because <code>countLeaves</code> considers the left, or <em>first</em>, data type to represent internal nodes, and the right, or <em>second</em>, data type to indicate leaves. 
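</p> <p> You can turn that observation around and count the <em>first</em> type parameter instead. The following is a sketch that relies only on <code>Data.Bifoldable</code> from the <code>base</code> library; the name <code>countInternal</code> is an assumption, and the examples use the built-in <code>Either</code> and tuple types rather than <code>EitherFix</code> or <code>RoseTreeFix</code>: </p> <p> <pre>import Data.Bifoldable (Bifoldable, bifoldMap)
import Data.Monoid (Sum (..))

-- Counts values of the first type parameter, which is what a rose tree
-- would use for its internal nodes, and ignores the second.
countInternal :: (Bifoldable p, Num n) =&gt; p a b -&gt; n
countInternal = getSum . bifoldMap (const (Sum 1)) (const (Sum 0))</pre> </p> <p> With this function, <code>countInternal ("foo", "bar")</code> is <code>1</code>, <code>countInternal (Left "foo")</code> is <code>1</code>, and <code>countInternal (Right 42)</code> is <code>0</code>, which is the mirror image of what <code>countLeaves</code> reports. 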
</p> <p> You can further follow that train of thought to realise that you can convert both tuples and <code>EitherFix</code> values to small rose trees: </p> <p> <pre><span style="color:#2b91af;">fromTuple</span>&nbsp;::&nbsp;(a,&nbsp;b)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;a&nbsp;b
fromTuple&nbsp;(x,&nbsp;y)&nbsp;=&nbsp;roseNodeF&nbsp;x&nbsp;(consF&nbsp;(roseLeafF&nbsp;y)&nbsp;nilF)

<span style="color:#2b91af;">fromEitherFix</span>&nbsp;::&nbsp;<span style="color:blue;">EitherFix</span>&nbsp;a&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;a&nbsp;b
fromEitherFix&nbsp;=&nbsp;eitherF&nbsp;(`roseNodeF`&nbsp;nilF)&nbsp;roseLeafF</pre> </p> <p> The <code>fromTuple</code> function creates a small rose tree with one internal node and one leaf. The label of the internal node is the first value of the tuple, and the label of the leaf is the second value. Here's an example: </p> <p> <pre>Prelude RoseTree&gt; fromTuple ("foo", 42)
RoseTreeFix {unRoseTreeFix = Fix (NodeF "foo" (ListFix (Fix (ConsF (Fix (LeafF 42)) (Fix NilF)))))}</pre> </p> <p> The <code>fromEitherFix</code> function turns a <em>left</em> value into an internal node with no leaves, and a <em>right</em> value into a leaf. Here are some examples: </p> <p> <pre>Prelude RoseTree Either&gt; fromEitherFix $ leftF "foo"
RoseTreeFix {unRoseTreeFix = Fix (NodeF "foo" (ListFix (Fix NilF)))}
Prelude RoseTree Either&gt; fromEitherFix $ rightF 42
RoseTreeFix {unRoseTreeFix = Fix (LeafF 42)}</pre> </p> <p> While counting leaves can be implemented using <code>Bifoldable</code>, that's not the case for measuring the depths of trees (I think; leave a comment if you know of a way to do this with one of the instances shown here). 
You can, however, measure tree depth with the catamorphism: </p> <p> <pre><span style="color:#2b91af;">treeDepth</span>&nbsp;::&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;a&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Integer</span> treeDepth&nbsp;=&nbsp;roseTreeF&nbsp;(\_&nbsp;xs&nbsp;-&gt;&nbsp;1&nbsp;+&nbsp;<span style="color:blue;">maximum</span>&nbsp;xs)&nbsp;(<span style="color:blue;">const</span>&nbsp;0)</pre> </p> <p> The implementation is similar to the implementation for 'normal' trees. I've arbitrarily decided that leaves have a depth of zero, so the function that handles leaves always returns <code>0</code>. The function that handles internal nodes receives <code>xs</code> as a partially reduced list of depths below the node in question. Take the maximum of those and add <code>1</code>, since each internal node has a depth of one. </p> <p> <pre>Prelude RoseTree&gt; treeDepth exampleTree 3</pre> </p> <p> This, hopefully, illustrates that the catamorphism is more capable, and that the fold is just a (list-biased) specialisation. </p> <h3 id="4276c6f8fab248c0acc52a7f14462e41"> Summary <a href="#4276c6f8fab248c0acc52a7f14462e41" title="permalink">#</a> </h3> <p> The catamorphism for rose trees is a pair of functions. One function transforms internal nodes with their partially reduced branches, while the other function transforms leaves. </p> <p> For a realistic example of using a rose tree in a real program, see <a href="/2019/09/09/picture-archivist-in-haskell">Picture archivist in Haskell</a>. </p> <p> This article series has so far covered progressively more complex data structures. The first examples (<a href="/2019/05/06/boolean-catamorphism">Boolean catamorphism</a> and <a href="/2019/05/13/peano-catamorphism">Peano catamorphism</a>) were neither <a href="/2018/03/22/functors">functors</a>, <a href="/2018/10/01/applicative-functors">applicatives</a>, nor monads. 
All subsequent examples, on the other hand, are all of these, and more. The next example presents a functor that's neither applicative nor monad, yet still foldable. Obviously, what functionality it offers is still based on a catamorphism. </p> <p> <strong>Next:</strong> <a href="/2019/06/24/full-binary-tree-catamorphism">Full binary tree catamorphism</a>. </p> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Church-encoded rose tree https://blog.ploeh.dk/2019/07/29/church-encoded-rose-tree 2019-07-29T13:14:00+00:00 Mark Seemann <div id="post"> <p> <em>A rose tree is a tree with leaf nodes of one type, and internal nodes of another.</em> </p> <p> This article is part of <a href="/2018/05/22/church-encoding">a series of articles about Church encoding</a>. In the previous articles, you've seen <a href="/2018/06/04/church-encoded-maybe">how to implement a Maybe container</a>, and <a href="/2018/06/11/church-encoded-either">how to implement an Either container</a>. Through these examples, you've learned how to model <a href="https://en.wikipedia.org/wiki/Tagged_union">sum types</a> without explicit language support. In this article, you'll see how to model a <a href="https://en.wikipedia.org/wiki/Rose_tree">rose tree</a>. </p> <p> A rose tree is a general-purpose data structure where each node in a tree has an associated value. Each node can have an arbitrary number of branches, including none. What distinguishes a rose tree from just any <a href="https://en.wikipedia.org/wiki/Tree_(data_structure)">tree</a> is that internal nodes can hold values of a different type than leaf values. </p> <p> <img src="/content/binary/rose-tree-example.png" alt="A rose tree example diagram, with internal nodes containing integers, and leaves containing strings."> </p> <p> The diagram shows an example of a tree of internal integers and leaf strings. 
All internal nodes contain integer values, and all leaves contain strings. Each node can have an arbitrary number of branches. </p> <h3 id="5255946728c14810a5aaef3c1022d126"> Contract <a href="#5255946728c14810a5aaef3c1022d126" title="permalink">#</a> </h3> <p> In C#, you can represent the fundamental structure of a rose tree with a Church encoding, starting with an interface: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">TResult</span>&nbsp;Match&lt;<span style="color:#2b91af;">TResult</span>&gt;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&gt;,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;node,
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;leaf);
}</pre> </p> <p> The structure of a rose tree includes two mutually exclusive cases: internal nodes and leaf nodes. Since there are two cases, the <code>Match</code> method takes two arguments, one for each case. </p> <p> The interface is generic, with two type arguments: <code>N</code> (for <em>node</em>) and <code>L</code> (for <em>leaf</em>). Any consumer of an <code>IRoseTree&lt;N, L&gt;</code> object must supply two functions when calling the <code>Match</code> method: a function that turns a node into a <code>TResult</code> value, and a function that turns a leaf into a <code>TResult</code> value. 
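</p> <p> If you find Haskell easier to read than C#, the contract corresponds roughly to the following sketch. It's only meant for comparison: the names <code>RoseTree</code>, <code>RoseNode</code>, <code>RoseLeaf</code>, and <code>match</code> are assumptions chosen to mirror the C# names, and a standard list stands in for <code>IEnumerable</code>: </p> <p> <pre>-- n is the type of internal node values, l the type of leaf values.
data RoseTree n l = RoseNode n [RoseTree n l] | RoseLeaf l

-- Like the C# Match method: exactly one of the two functions is called,
-- depending on the case. The branches are passed on as they are;
-- match does not recurse into them.
match :: (n -&gt; [RoseTree n l] -&gt; r) -&gt; (l -&gt; r) -&gt; RoseTree n l -&gt; r
match node _ (RoseNode x branches) = node x branches
match _ leaf (RoseLeaf x) = leaf x

-- A small consumer: is the tree a single leaf?
isLeaf :: RoseTree n l -&gt; Bool
isLeaf = match (\_ _ -&gt; False) (const True)</pre> 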
</p> <p> Both cases must have a corresponding implementation. </p> <h3 id="89c4833c4e4d46cc8eef2d5eb546f61d"> Leaves <a href="#89c4833c4e4d46cc8eef2d5eb546f61d" title="permalink">#</a> </h3> <p> The <em>leaf</em> implementation is the simplest: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;:&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">L</span>&nbsp;value; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;RoseLeaf(<span style="color:#2b91af;">L</span>&nbsp;value) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.value&nbsp;=&nbsp;value; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">TResult</span>&nbsp;Match&lt;<span style="color:#2b91af;">TResult</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&gt;,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;node, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;leaf) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">return</span>&nbsp;leaf(value); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;Equals(<span style="color:blue;">object</span>&nbsp;obj) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(!(obj&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;other)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Equals(value,&nbsp;other.value); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">int</span>&nbsp;GetHashCode() &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;value.GetHashCode(); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> The <code>RoseLeaf</code> class is an <a href="https://en.wikipedia.org/wiki/Adapter_pattern">Adapter</a> over a value of the generic type <code>L</code>. As is always the case with Church encoding, it implements the <code>Match</code> method by unconditionally calling one of the arguments, in this case the <code>leaf</code> function, with its adapted <code>value</code>. </p> <p> While it doesn't have to do this, it also overrides <code>Equals</code> and <code>GetHashCode</code>. This is an immutable class, so it's a great candidate to be a <a href="https://martinfowler.com/bliki/ValueObject.html">Value Object</a>. 
Making it a Value Object makes it easier to compare expected and actual values in unit tests, among other benefits. </p> <h3 id="f211476563fe40379eac66ee887ed75b"> Nodes <a href="#f211476563fe40379eac66ee887ed75b" title="permalink">#</a> </h3> <p> The <em>node</em> implementation is slightly more complex: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">RoseNode</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;:&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">N</span>&nbsp;value; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&gt;&nbsp;branches; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;RoseNode(<span style="color:#2b91af;">N</span>&nbsp;value,&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&gt;&nbsp;branches) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.value&nbsp;=&nbsp;value; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.branches&nbsp;=&nbsp;branches; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">TResult</span>&nbsp;Match&lt;<span 
style="color:#2b91af;">TResult</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&gt;,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;node, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;leaf) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;node(value,&nbsp;branches); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;Equals(<span style="color:blue;">object</span>&nbsp;obj) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(!(obj&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:#2b91af;">RoseNode</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;other)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Equals(value,&nbsp;other.value) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&amp;&amp;&nbsp;<span style="color:#2b91af;">Enumerable</span>.SequenceEqual(branches,&nbsp;other.branches); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">int</span>&nbsp;GetHashCode() 
&nbsp;&nbsp;&nbsp;&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;branches.Aggregate(value.GetHashCode(),&nbsp;(hash,&nbsp;branch)&nbsp;=&gt;&nbsp;hash&nbsp;^&nbsp;branch.GetHashCode());
&nbsp;&nbsp;&nbsp;&nbsp;}
}</pre> </p> <p> A node contains both a value (of the type <code>N</code>) and a collection of sub-trees, or <code>branches</code>. The class implements the <code>Match</code> method by unconditionally calling the <code>node</code> function argument with its constituent values. </p> <p> Again, it overrides <code>Equals</code> and <code>GetHashCode</code> for the same reasons as <code>RoseLeaf</code>. The hash code is derived from the value and from each branch's hash code, so two nodes that are equal according to <code>Equals</code> also have equal hash codes. This isn't required to implement Church encoding, but makes comparison and unit testing easier. </p> <h3 id="a5c04c7e127349ed9b759e6361af5ab3"> Usage <a href="#a5c04c7e127349ed9b759e6361af5ab3" title="permalink">#</a> </h3> <p> You can use the <code>RoseLeaf</code> and <code>RoseNode</code> constructors to create new trees, but it sometimes helps to have a static helper method to create values. 
It turns out that there's little value in a helper method for leaves, but for nodes, it's marginally useful: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;Node&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;(<span style="color:#2b91af;">N</span>&nbsp;value,&nbsp;<span style="color:blue;">params</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;[]&nbsp;branches) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseNode</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;(value,&nbsp;branches); }</pre> </p> <p> This enables you to create tree objects, like this: </p> <p> <pre><span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;&nbsp;tree&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#a31515;">&quot;foo&quot;</span>,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(42),&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(1337));</pre> </p> <p> That's a single node with the label <code>"foo"</code> and two leaves with the values <code>42</code> and <code>1337</code>, respectively. 
You can create the tree shown in the above diagram like this: </p> <p> <pre><span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;exampleTree&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(42, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(1337, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;foo&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;bar&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(2112, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(90125, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;baz&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;qux&quot;</span>), 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;quux&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;quuz&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;corge&quot;</span>));</pre> </p> <p> You can add various extension methods to implement useful functionality. In later articles, you'll see some more compelling examples, so here, I'm only going to show a few basic examples. 
One of the simplest features you can add is a method that will tell you if an <code>IRoseTree&lt;N, L&gt;</code> object is a node or a leaf: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IChurchBoolean</span>&nbsp;IsLeaf&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;source) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.Match&lt;<span style="color:#2b91af;">IChurchBoolean</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;node:&nbsp;(_,&nbsp;__)&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ChurchFalse</span>(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;leaf:&nbsp;_&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ChurchTrue</span>()); } <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IChurchBoolean</span>&nbsp;IsNode&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;source) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ChurchNot</span>(source.IsLeaf()); }</pre> </p> <p> Since this article is part of the overall article series on Church encoding, and the purpose of that article series is also to show how basic language features can be created from Church encodings, these two methods return <a 
href="/2018/05/24/church-encoded-boolean-values">Church-encoded Boolean values</a> instead of the built-in <code>bool</code> type. I'm sure you can imagine how you could change the type to <code>bool</code> if you'd like. </p> <p> You can use these methods like this: </p> <p> <pre>&gt; <span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">Guid</span>,&nbsp;<span style="color:blue;">double</span>&gt;&nbsp;tree&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">Guid</span>,&nbsp;<span style="color:blue;">double</span>&gt;(-3.2); &gt; tree.IsLeaf() ChurchTrue { } &gt; tree.IsNode() ChurchNot(ChurchTrue) &gt; <span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">long</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;tree&nbsp;=&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node&lt;<span style="color:blue;">long</span>,&nbsp;<span style="color:blue;">string</span>&gt;(42); &gt; tree.IsLeaf() ChurchFalse { } &gt; tree.IsNode() ChurchNot(ChurchFalse)</pre> </p> <p> In a <a href="/2019/09/16/picture-archivist-in-f">future article, you'll see some more compelling examples</a>. </p> <h3 id="3be01779f059443799df57342e2510cb"> Terminology <a href="#3be01779f059443799df57342e2510cb" title="permalink">#</a> </h3> <p> It's not entirely clear what to call a tree like the one shown here. <a href="https://en.wikipedia.org/wiki/Rose_tree">The Wikipedia entry</a> doesn't state one way or the other whether internal node types ought to be distinguishable from leaf node types, but there are <a href="https://twitter.com/kbattocchi/status/1072538730911752192">indications that this could be the case</a>. At least, it seems that the <a href="https://mail.haskell.org/pipermail/haskell-cafe/2015-May/119633.html">term isn't well-defined</a>, so I took the liberty to retcon the name <em>rose tree</em> to the data structure shown here. 
</p> <p> In the paper that introduces the <em>rose tree</em> term, Meertens writes: <blockquote> <p> "We consider trees whose internal nodes may fork into an arbitrary (natural) number of sub-trees. (If such a node has zero descendants, we still consider it internal.) Each external node carries a data item. No further information is stored in the tree; in particular, internal nodes are unlabelled." </p> <footer><cite><em>First Steps towards the Theory of Rose Trees</em>, Lambert Meertens, 1988</cite></footer> </blockquote> While the concept is foreign in C#, you can trivially introduce a <a href="/2018/01/15/unit-isomorphisms">unit</a> data type: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">Unit</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Unit</span>&nbsp;Instance&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Unit</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;Unit()&nbsp;{&nbsp;} }</pre> </p> <p> This enables you to create a rose tree according to Meertens' definition: </p> <p> <pre><span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;&nbsp;meertensTree&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#2b91af;">Unit</span>.Instance, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#2b91af;">Unit</span>.Instance, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#2b91af;">Unit</span>.Instance, 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;(2112)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;(42), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;(1337), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;(90125)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#2b91af;">Unit</span>.Instance, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;(1984)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;(666));</pre> </p> <p> Visually, you could draw it like this: </p> <p> <img src="/content/binary/meertens-tree-example.png" alt="A Meertens rose tree example diagram, with leaves containing integers."> </p> <p> Thus, the tree structure shown here seems to be a generalisation of Meertens' original definition. 
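</p> <p> To make the generalisation concrete, here's a minimal sketch of a function that turns any labelled tree into a Meertens-style tree by replacing every node label with <code>Unit.Instance</code>. The <code>ForgetNodeLabels</code> and <code>CountLeaves</code> names are hypothetical, and the interface and classes are pared-down restatements (without <code>Equals</code> and <code>GetHashCode</code>) so that the example compiles on its own; the <code>Match</code> signature is assumed from the way it's called in <code>IsLeaf</code>: </p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Pared-down restatement of the Church-encoded rose tree (assumed shape).
public interface IRoseTree<N, L>
{
    TResult Match<TResult>(
        Func<N, IEnumerable<IRoseTree<N, L>>, TResult> node,
        Func<L, TResult> leaf);
}

public sealed class RoseLeaf<N, L> : IRoseTree<N, L>
{
    private readonly L value;

    public RoseLeaf(L value) { this.value = value; }

    public TResult Match<TResult>(
        Func<N, IEnumerable<IRoseTree<N, L>>, TResult> node,
        Func<L, TResult> leaf)
    {
        return leaf(value);
    }
}

public sealed class RoseNode<N, L> : IRoseTree<N, L>
{
    private readonly N value;
    private readonly IEnumerable<IRoseTree<N, L>> branches;

    public RoseNode(N value, IEnumerable<IRoseTree<N, L>> branches)
    {
        this.value = value;
        this.branches = branches;
    }

    public TResult Match<TResult>(
        Func<N, IEnumerable<IRoseTree<N, L>>, TResult> node,
        Func<L, TResult> leaf)
    {
        return node(value, branches);
    }
}

public sealed class Unit
{
    public static readonly Unit Instance = new Unit();

    private Unit() { }
}

public static class MeertensExtensions
{
    // Hypothetical helper: discard every node label, keep every leaf value.
    public static IRoseTree<Unit, L> ForgetNodeLabels<N, L>(
        this IRoseTree<N, L> source)
    {
        return source.Match<IRoseTree<Unit, L>>(
            node: (_, branches) => new RoseNode<Unit, L>(
                Unit.Instance,
                branches.Select(b => b.ForgetNodeLabels()).ToList()),
            leaf: l => new RoseLeaf<Unit, L>(l));
    }

    // Hypothetical helper, only here to inspect the result.
    public static int CountLeaves<N, L>(this IRoseTree<N, L> source)
    {
        return source.Match<int>(
            node: (_, branches) => branches.Sum(b => b.CountLeaves()),
            leaf: _ => 1);
    }
}
```

<p> Going the other way isn't possible without inventing labels for the internal nodes, which suggests that the labelled tree is the more general of the two.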
</p> <p> I'm not a mathematician, so I may have misunderstood some things. If you have a better name than <em>rose tree</em> for the data structure shown here, please leave a comment. </p> <h3 id="331fa8452cdd435c86ce87b5d39d51c5"> Yeats <a href="#331fa8452cdd435c86ce87b5d39d51c5" title="permalink">#</a> </h3> <p> Now that we're on the topic of <em>rose tree</em> as a term, you may, as a bonus, enjoy a similarly-titled poem: <blockquote> <h4>THE ROSE TREE</h4> <p> "O words are lightly spoken"<br> Said Pearse to Connolly,<br> "Maybe a breath of politic words<br> Has withered our Rose Tree;<br> Or maybe but a wind that blows<br> Across the bitter sea." </p> <p> "It needs to be but watered,"<br> James Connolly replied,<br> "To make the green come out again<br> And spread on every side,<br> And shake the blossom from the bud<br> To be the garden's pride."<br> </p> <p> "But where can we draw water"<br> Said Pearse to Connolly,<br> "When all the wells are parched away?<br> O plain as plain can be<br> There's nothing but our own red blood<br> Can make a right Rose Tree." </p> <footer><cite><a href="https://en.wikipedia.org/wiki/W._B._Yeats">W. B. Yeats</a></cite></footer> </blockquote> As far as I can tell, though, Yeats' metaphor is dissimilar to Meertens'. </p> <h3 id="9906b9a8856248f38b4f03e40252b761"> Summary <a href="#9906b9a8856248f38b4f03e40252b761" title="permalink">#</a> </h3> <p> You may occasionally find use for a tree that distinguishes between internal and leaf nodes. You can model such a tree with a Church encoding, as shown in this article. </p> <p> <strong>Next: </strong> <a href="/2019/04/29/catamorphisms">Catamorphisms</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. 
Chain of Responsibility as catamorphisms https://blog.ploeh.dk/2019/07/22/chain-of-responsibility-as-catamorphisms 2019-07-22T14:11:00+00:00 Mark Seemann <div id="post"> <p> <em>The Chain of Responsibility design pattern can be viewed as a list fold over the First monoid, followed by a Maybe fold.</em> </p> <p> This article is part of <a href="/2018/03/05/some-design-patterns-as-universal-abstractions">a series of articles about specific design patterns and their category theory counterparts</a>. In it, you'll see how the <a href="https://en.wikipedia.org/wiki/Chain-of-responsibility_pattern">Chain of Responsibility design pattern</a> is equivalent to a succession of <a href="/2019/04/29/catamorphisms">catamorphisms</a>. First, you apply the <a href="/2018/04/03/maybe-monoids">First Maybe monoid</a> over the <a href="/2019/05/27/list-catamorphism">list catamorphism</a>, and then you conclude the reduction with the <a href="/2019/05/20/maybe-catamorphism">Maybe catamorphism</a>. </p> <h3 id="46a6c41949db446d9387c8befbf3fdb1"> Pattern <a href="#46a6c41949db446d9387c8befbf3fdb1" title="permalink">#</a> </h3> <p> The Chain of Responsibility design pattern gives you a way to model cascading conditionals with an object structure. It's a chain (or linked list) of objects that all implement the same interface (or base class). Each object (apart from the last) has a reference to the next object in the list. </p> <p> <img src="/content/binary/chain-of-responsibility-diagram.png" alt="General diagram of the Chain of Responsibility design pattern."> </p> <p> A client (some other code) calls a method on the first object in the list. If that object can handle the request, it does so, and the interaction ends there. If the method returns a value, the object returns the value. </p> <p> If the first object determines that it can't handle the method call, it calls the next object in the chain. 
It only knows the next object as the interface, so the only way it can delegate the call is by calling the same method as the first one. In the above diagram, <em>Imp1</em> can't handle the method call, so it calls the same method on <em>Imp2</em>, which also can't handle the request and delegates responsibility to <em>Imp3</em>. In the diagram, <em>Imp3</em> can handle the method call, so it does so and returns a result that propagates back up the chain. In that particular example, <em>Imp4</em> never gets involved. </p> <p> You'll see an example below. </p> <p> One of the advantages of the pattern is that you can rearrange the chain to change its behaviour. You can even do this at run time, if you'd like, since all objects implement the same interface. </p> <h3 id="08a67dafd71f4bdd9a2e2577b0e43f9a"> User icon example <a href="#08a67dafd71f4bdd9a2e2577b0e43f9a" title="permalink">#</a> </h3> <p> Consider an online system that maintains user profiles. A user is modelled with the <code>User</code> class: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;User(<span style="color:blue;">int</span>&nbsp;id,&nbsp;<span style="color:blue;">string</span>&nbsp;name,&nbsp;<span style="color:blue;">string</span>&nbsp;email,&nbsp;<span style="color:blue;">bool</span>&nbsp;useGravatar,&nbsp;<span style="color:blue;">bool</span>&nbsp;useIdenticon)</pre> </p> <p> While I only show the signature of the class' constructor, it should be enough to give you an idea. If you need more details, the entire example code base is <a href="https://github.com/ploeh/UserProfile">available on GitHub</a>. </p> <p> Apart from an <code>id</code>, a <code>name</code> and an <code>email</code> address, a user also has two flags. One flag tracks whether the user wishes to use his or her <a href="http://www.gravatar.com">Gravatar</a>, while another flag tracks if the user would like to use an <a href="https://en.wikipedia.org/wiki/Identicon">Identicon</a>. 
Obviously, both flags could be <code>true</code>, in which case the current business rule states that the Gravatar should take precedence. </p> <p> If neither flag is set, users might still have a picture associated with their profile. This could be a picture that they've uploaded to the system, and that is tracked in a database. </p> <p> If no user icon can be found or generated, ultimately the system should use a fallback, default icon: </p> <p> <img src="/content/binary/default-user-icon.png" alt="Default user icon."> </p> <p> To summarise, the current rules are: <ol> <li>Use Gravatar if flag is set.</li> <li>Use Identicon if flag is set.</li> <li>Use uploaded picture if available.</li> <li>Use default icon.</li> </ol> The order of precedence could change in the future, new image sources could be added, or some of the present sources could be removed. Modelling this set of rules as a Chain of Responsibility makes it easy for you to reorder the rules, should you need to. </p> <p> To request an icon, a client can use the <code>IIconReader</code> interface: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user); }</pre> </p> <p> The <code>Icon</code> class is just a <a href="https://martinfowler.com/bliki/ValueObject.html">Value Object</a> wrapper around a URL. The idea is that such a URL can be used in an <code>img</code> tag to show the icon. Again, the full source code is available on GitHub if you'd like to investigate the details. </p> <p> The various rules for icon retrieval can be implemented using this interface. 
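</p> <p> For comparison, the four rules listed above could also be written as one cascading conditional. This is only a sketch of the alternative that the pattern replaces, not code from the example repository: the <code>CascadingIconRules</code> class and the <code>TryReadIconId</code> delegate are hypothetical, the <code>User</code> and <code>Icon</code> classes are simplified stand-ins, and all URLs are placeholders: </p>

```csharp
using System;

// Simplified stand-ins for the article's types; the URLs are placeholders.
public sealed class Icon
{
    public static readonly Icon Default =
        new Icon(new Uri("https://example.com/default-icon"));

    public Icon(Uri url) { Url = url; }

    public Uri Url { get; }
}

public sealed class User
{
    public User(int id, string name, string email, bool useGravatar, bool useIdenticon)
    {
        Id = id;
        Name = name;
        Email = email;
        UseGravatar = useGravatar;
        UseIdenticon = useIdenticon;
    }

    public int Id { get; }
    public string Name { get; }
    public string Email { get; }
    public bool UseGravatar { get; }
    public bool UseIdenticon { get; }
}

// Hypothetical stand-in for the database lookup of an uploaded picture.
public delegate bool TryReadIconId(int userId, out string iconId);

public static class CascadingIconRules
{
    public static Icon ReadIcon(User user, TryReadIconId tryReadIconId)
    {
        // Rule 1: Gravatar takes precedence when its flag is set.
        if (user.UseGravatar)
            return new Icon(new Uri("https://example.com/gravatar/" + user.Id));

        // Rule 2: Identicon, if its flag is set.
        if (user.UseIdenticon)
            return new Icon(new Uri("https://example.com/identicon/" + user.Id));

        // Rule 3: an uploaded picture, if the lookup finds one.
        if (tryReadIconId(user.Id, out string iconId))
            return new Icon(new Uri("https://example.com/users/" + iconId + "/icon"));

        // Rule 4: the default icon.
        return Icon.Default;
    }
}
```

<p> Reordering or removing a rule in this version means editing the method body, whereas with a Chain of Responsibility it only means changing how the objects are composed.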
</p> <h3 id="b2a4cbfb576949c392ea0e0b3d440175"> Gravatar reader <a href="#b2a4cbfb576949c392ea0e0b3d440175" title="permalink">#</a> </h3> <p> Although you don't have to implement the classes in the order in which you are going to compose them, it seems natural to do so, starting with the Gravatar implementation. </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">GravatarReader</span>&nbsp;:&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IIconReader</span>&nbsp;next; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;GravatarReader(<span style="color:#2b91af;">IIconReader</span>&nbsp;next) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.next&nbsp;=&nbsp;next; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(user.UseGravatar) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Icon</span>(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Gravatar</span>(user.Email).Url); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;next.ReadIcon(user); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> The <code>GravatarReader</code> class both implements the <code>IIconReader</code> interface and decorates another object of the same polymorphic type. 
If <code>user.UseGravatar</code> is <code>true</code>, it generates the appropriate Gravatar URL based on the user's <code>Email</code> address; otherwise, it delegates the work to the <code>next</code> object in the Chain of Responsibility. </p> <p> The <code>Gravatar</code> class contains the implementation details to generate the Gravatar <code>Url</code>. Again, please refer to the GitHub repository if you're interested in the details. </p> <h3 id="222ae025b264455695f1dbbd74cad17b"> Identicon reader <a href="#222ae025b264455695f1dbbd74cad17b" title="permalink">#</a> </h3> <p> When you compose the chain, according to the above business logic, the next type of icon you should attempt to generate is an Identicon. It's natural to implement the Identicon reader next, then: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">IdenticonReader</span>&nbsp;:&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IIconReader</span>&nbsp;next; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;IdenticonReader(<span style="color:#2b91af;">IIconReader</span>&nbsp;next) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.next&nbsp;=&nbsp;next; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(user.UseIdenticon) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Icon</span>(<span 
style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Uri</span>(baseUrl,&nbsp;HashUser(user))); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;next.ReadIcon(user); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;Implementation&nbsp;details&nbsp;go&nbsp;here...</span> }</pre> </p> <p> Again, I'm omitting implementation details in order to focus on the Chain of Responsibility design pattern. If <code>user.UseIdenticon</code> is <code>true</code>, the <code>IdenticonReader</code> generates the appropriate Identicon and returns the URL for it; otherwise, it delegates the work to the <code>next</code> object in the chain. </p> <h3 id="e9f2904333b940c1a9a90522d19a41f3"> Database icon reader <a href="#e9f2904333b940c1a9a90522d19a41f3" title="permalink">#</a> </h3> <p> The <code>DBIconReader</code> class attempts to find an icon ID in a database. If it succeeds, it creates a URL corresponding to that ID. The assumption is that that resource exists; either it's a file on disk, or it's an image resource generated on the spot based on binary data stored in the database. 
</p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">DBIconReader</span>&nbsp;:&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IUserRepository</span>&nbsp;repository; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IIconReader</span>&nbsp;next; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;DBIconReader(<span style="color:#2b91af;">IUserRepository</span>&nbsp;repository,&nbsp;<span style="color:#2b91af;">IIconReader</span>&nbsp;next) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.repository&nbsp;=&nbsp;repository; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.next&nbsp;=&nbsp;next; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(!repository.TryReadIconId(user.Id,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">string</span>&nbsp;iconId)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;next.ReadIcon(user); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;parameters&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Dictionary</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;<span style="color:#a31515;">&quot;iconId&quot;</span>,&nbsp;iconId&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Icon</span>(urlTemplate.BindByName(baseUrl,&nbsp;parameters)); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">Uri</span>&nbsp;baseUrl&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Uri</span>(<span style="color:#a31515;">&quot;https://example.com&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">UriTemplate</span>&nbsp;urlTemplate&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">UriTemplate</span>(<span style="color:#a31515;">&quot;users/{iconId}/icon&quot;</span>); }</pre> </p> <p> This class demonstrates some variations in the way you can implement the Chain of Responsibility design pattern. The above <code>GravatarReader</code> and <code>IdenticonReader</code> classes both follow the same implementation pattern of checking a condition, and then performing work if the condition is <code>true</code>. The delegation to the next object in the chain happens, in those two classes, outside of the <code>if</code> statement. </p> <p> The <code>DBIconReader</code> class, on the other hand, reverses the structure of the code. It uses a <a href="https://refactoring.com/catalog/replaceNestedConditionalWithGuardClauses.html">Guard Clause</a> to detect whether to exit early, which is done by delegating work to the <code>next</code> object in the chain. 
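</p> <p> The <code>IUserRepository</code> interface isn't shown in this article. Judging only from the call <code>repository.TryReadIconId(user.Id, out string iconId)</code>, a minimal version might look like the following sketch. The member signature is an assumption, and <code>InMemoryUserRepository</code> is a hypothetical stand-in for a database-backed implementation; the actual code in the GitHub repository may differ: </p>

```csharp
using System.Collections.Generic;

// Assumed shape, inferred from the call site in DBIconReader.ReadIcon.
public interface IUserRepository
{
    // Returns true and sets iconId when an icon is registered for the user.
    bool TryReadIconId(int userId, out string iconId);
}

// Hypothetical in-memory stand-in for a database-backed repository.
public sealed class InMemoryUserRepository : IUserRepository
{
    private readonly IReadOnlyDictionary<int, string> iconIds;

    public InMemoryUserRepository(IReadOnlyDictionary<int, string> iconIds)
    {
        this.iconIds = iconIds;
    }

    public bool TryReadIconId(int userId, out string iconId)
    {
        // Delegates to the dictionary lookup, which has the same Try shape.
        return iconIds.TryGetValue(userId, out iconId);
    }
}
```

<p> The <code>Try</code>-method shape with an <code>out</code> parameter is what makes the Guard Clause in <code>ReadIcon</code> possible: a <code>false</code> return value means that there's nothing to do but delegate to the <code>next</code> object.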
</p> <p> If <code>TryReadIconId</code> returns <code>true</code>, however, the <code>ReadIcon</code> method proceeds to create the appropriate icon URL. </p> <p> Another variation on the Chain of Responsibility design pattern demonstrated by the <code>DBIconReader</code> class is that it takes a second dependency, apart from <code>next</code>. The <code>repository</code> is the usual misapplication of the Repository design pattern that everyone thinks they use correctly. Here, it's used in the common sense to provide access to a database. The main point, though, is that you can add as many other dependencies to a link in the chain as you'd like. All links, apart from the last, however, must have a reference to the <code>next</code> link in the chain. </p> <h3 id="cee40120578b4732892e6fd72329d5de"> Default icon reader <a href="#cee40120578b4732892e6fd72329d5de" title="permalink">#</a> </h3> <p> Like linked lists, a Chain of Responsibility has to ultimately terminate. You can use the following <code>DefaultIconReader</code> for that. </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">DefaultIconReader</span>&nbsp;:&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:#2b91af;">Icon</span>.Default; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> This class unconditionally returns the <code>Default</code> icon. Notice that it doesn't have any <code>next</code> object it delegates to. This terminates the chain. If no previous implementation of the <code>IIconReader</code> has returned an <code>Icon</code> for the <code>user</code>, this one does. 
</p> <h3 id="8eb05bed2d98488a91c09bab52b00a53"> Chain composition <a href="#8eb05bed2d98488a91c09bab52b00a53" title="permalink">#</a> </h3> <p> With four implementations of <code>IIconReader</code>, you can now compose the Chain of Responsibility: </p> <p> <pre><span style="color:#2b91af;">IIconReader</span>&nbsp;reader&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">GravatarReader</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">IdenticonReader</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DBIconReader</span>(repo, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DefaultIconReader</span>())));</pre> </p> <p> The first link in the chain is a <code>GravatarReader</code> object that contains an <code>IdenticonReader</code> object as its <code>next</code> link, and so on. Referring back to the source code of <code>GravatarReader</code>, notice that its <code>next</code> dependency is declared as an <code>IIconReader</code>. Since the <code>IdenticonReader</code> class implements that interface, you can compose the chain like this, but if you later decide to change the order of the objects, you can do so simply by changing the composition. You could remove objects altogether, or add new classes, and you could even do this at run time, if required. </p> <p> The <code>DBIconReader</code> class requires an extra <code>IUserRepository</code> dependency, here simply an existing object called <code>repo</code>. </p> <p> The <code>DefaultIconReader</code> takes no other dependencies, so this effectively terminates the chain. If you try to pass another <code>IIconReader</code> to its constructor, the code doesn't compile. 
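</p> <p> Since every link only knows its <code>next</code> object as an <code>IIconReader</code>, composing the chain at run time can be sketched as a fold over a list of functions, each of which wraps the next link. The <code>Chain.Compose</code> helper is hypothetical, and the <code>User</code>, <code>Icon</code>, and reader classes below are trimmed-down stand-ins with placeholder URLs, included only so that the example compiles on its own: </p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down stand-ins for the article's types; URLs are placeholders.
public sealed class Icon
{
    public static readonly Icon Default =
        new Icon(new Uri("https://example.com/default-icon"));

    public Icon(Uri url) { Url = url; }

    public Uri Url { get; }
}

public sealed class User
{
    public User(bool useGravatar, bool useIdenticon)
    {
        UseGravatar = useGravatar;
        UseIdenticon = useIdenticon;
    }

    public bool UseGravatar { get; }
    public bool UseIdenticon { get; }
}

public interface IIconReader
{
    Icon ReadIcon(User user);
}

public sealed class GravatarReader : IIconReader
{
    private readonly IIconReader next;

    public GravatarReader(IIconReader next) { this.next = next; }

    public Icon ReadIcon(User user)
    {
        if (user.UseGravatar)
            return new Icon(new Uri("https://example.com/gravatar"));
        return next.ReadIcon(user);
    }
}

public sealed class IdenticonReader : IIconReader
{
    private readonly IIconReader next;

    public IdenticonReader(IIconReader next) { this.next = next; }

    public Icon ReadIcon(User user)
    {
        if (user.UseIdenticon)
            return new Icon(new Uri("https://example.com/identicon"));
        return next.ReadIcon(user);
    }
}

public sealed class DefaultIconReader : IIconReader
{
    public Icon ReadIcon(User user) { return Icon.Default; }
}

public static class Chain
{
    // Hypothetical helper: builds the chain from the last link backwards,
    // so that the first function in 'links' becomes the first link.
    public static IIconReader Compose(
        IEnumerable<Func<IIconReader, IIconReader>> links,
        IIconReader last)
    {
        return links
            .Reverse()
            .Aggregate(last, (next, link) => link(next));
    }
}
```

<p> With such a helper, changing the order of precedence amounts to reordering the list of functions passed to <code>Compose</code>, and that list could be read from configuration or built in response to user input.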
</p> <h3 id="fc1551665bb940b8ba5e75be81c0629a"> Haskell proof of concept <a href="#fc1551665bb940b8ba5e75be81c0629a" title="permalink">#</a> </h3> <p> When evaluating whether a design is <a href="/2018/11/19/functional-architecture-a-definition">a functional architecture</a>, I often port the relevant parts to <a href="https://www.haskell.org">Haskell</a>. You can do the same with the above example, and put it in a form where it's clearer that the Chain of Responsibility pattern is equivalent to two well-known catamorphisms. </p> <p> Readers not comfortable with Haskell can skip the next few sections. The object-oriented example continues below. </p> <p> The <code>User</code> and <code>Icon</code> types are equivalent to the ones above. There's no explicit interface, however. The functions that create Gravatars and Identicons are both pure, with the type <code>User -&gt; Maybe Icon</code>. Here's the Gravatar function, but the Identicon function looks similar: </p> <p> <pre><span style="color:#2b91af;">gravatarUrl</span>&nbsp;::&nbsp;<span style="color:#2b91af;">String</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">String</span> gravatarUrl&nbsp;email&nbsp;= &nbsp;&nbsp;<span style="color:#a31515;">&quot;https://www.gravatar.com/avatar/&quot;</span>&nbsp;++&nbsp;<span style="color:blue;">show</span>&nbsp;(hashString&nbsp;email&nbsp;::&nbsp;MD5Digest) <span style="color:#2b91af;">getGravatar</span>&nbsp;::&nbsp;<span style="color:blue;">User</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;<span style="color:blue;">Icon</span> getGravatar&nbsp;u&nbsp;= &nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;useGravatar&nbsp;u &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">then</span>&nbsp;Just&nbsp;$&nbsp;Icon&nbsp;$&nbsp;gravatarUrl&nbsp;$&nbsp;userEmail&nbsp;u &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span>&nbsp;Nothing</pre> </p> <p> Reading an icon ID from a 
database, however, is an impure operation, so the function to do this has the type <code>User -&gt; IO (Maybe Icon)</code>. </p> <h3 id="11adf8bd104d41fab9e6bcaef249210c"> Lazy I/O in Haskell <a href="#11adf8bd104d41fab9e6bcaef249210c" title="permalink">#</a> </h3> <p> Notice that the database icon-querying function has the return type <code>IO (Maybe Icon)</code>. In the introduction you read that the Chain of Responsibility design pattern is a sequence of catamorphisms - the first one over a list of <code>First</code> values. While <code>First</code> is, in itself, a <code>Semigroup</code> instance, it gives rise to a <code>Monoid</code> instance when combined with <code>Maybe</code>. Thus, to showcase the abstractions being used, you could create a list of <code>Maybe (First Icon)</code> values. This forms a <code>Monoid</code>, so is easy to fold. </p> <p> The problem with that, however, is that <code>IO</code> is strict under evaluation, so while it works, <a href="https://stackoverflow.com/q/47120384/126014">it's no longer lazy</a>. You can combine <code>IO (Maybe (First Icon))</code> values, but it leads to too much I/O activity. 
</p> <p> You can <a href="https://stackoverflow.com/q/47120384/126014">solve this problem with a newtype wrapper</a>: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;FirstIO&nbsp;a&nbsp;=&nbsp;FirstIO&nbsp;(MaybeT&nbsp;IO&nbsp;a)&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Functor</span>,&nbsp;<span style="color:#2b91af;">Applicative</span>,&nbsp;<span style="color:#2b91af;">Monad</span>,&nbsp;<span style="color:#2b91af;">Alternative</span>) <span style="color:#2b91af;">firstIO</span>&nbsp;::&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;(<span style="color:#2b91af;">Maybe</span>&nbsp;a)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">FirstIO</span>&nbsp;a firstIO&nbsp;=&nbsp;FirstIO&nbsp;.&nbsp;MaybeT <span style="color:#2b91af;">getFirstIO</span>&nbsp;::&nbsp;<span style="color:blue;">FirstIO</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;(<span style="color:#2b91af;">Maybe</span>&nbsp;a) getFirstIO&nbsp;(FirstIO&nbsp;(MaybeT&nbsp;x))&nbsp;=&nbsp;x <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Semigroup</span>&nbsp;(<span style="color:blue;">FirstIO</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:#2b91af;">(&lt;&gt;)</span>&nbsp;=&nbsp;<span style="color:#2b91af;">(&lt;|&gt;)</span> <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Monoid</span>&nbsp;(<span style="color:blue;">FirstIO</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;mempty&nbsp;=&nbsp;empty</pre> </p> <p> This uses the <code>GeneralizedNewtypeDeriving</code> GHC extension to automatically make <code>FirstIO</code> <code>Functor</code>, <code>Applicative</code>, <code>Monad</code>, and <code>Alternative</code>. It also uses the <code>Alternative</code> instance to implement <code>Semigroup</code> and <code>Monoid</code>. 
You may recall from <a href="http://hackage.haskell.org/package/base/docs/Control-Applicative.html">the documentation</a> that <code>Alternative</code> is already a "monoid on applicative functors." </p> <h3 id="995f9ea8f8344aea93b2ffd0b3aad71f"> Alignment <a href="#995f9ea8f8344aea93b2ffd0b3aad71f" title="permalink">#</a> </h3> <p> You now have three functions with different types: two pure functions with the type <code>User -&gt; Maybe Icon</code> and one impure database-bound function with the type <code>User -&gt; IO (Maybe Icon)</code>. In order to have a common abstraction, you should align them so that all types match. At first glance, <code>User -&gt; IO (Maybe (First Icon))</code> seems like a type that fits all implementations, but that causes too much I/O to take place, so instead, use <code>User -&gt; FirstIO Icon</code>. Here's how to lift the pure <code>getGravatar</code> function: </p> <p> <pre><span style="color:#2b91af;">getGravatarIO</span>&nbsp;::&nbsp;<span style="color:blue;">User</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">FirstIO</span>&nbsp;<span style="color:blue;">Icon</span> getGravatarIO&nbsp;=&nbsp;firstIO&nbsp;.&nbsp;<span style="color:blue;">return</span>&nbsp;.&nbsp;getGravatar</pre> </p> <p> You can lift the other functions in similar fashion, to produce <code>getGravatarIO</code>, <code>getIdenticonIO</code>, and <code>getDBIconIO</code>, all with the mutual type <code>User -&gt; FirstIO Icon</code>. </p> <h3 id="f601a51f3006430398232e05b6595da0"> Haskell composition <a href="#f601a51f3006430398232e05b6595da0" title="permalink">#</a> </h3> <p> The goal of the Haskell proof of concept is to compose a function that can provide an <code>Icon</code> for any <code>User</code> - just like the above C# composition that uses Chain of Responsibility. 
There's, however, no way around impurity, because one of the steps involves a database, so the aim is a composition with the type <code>User -&gt; IO Icon</code>. </p> <p> While a more compact composition is possible, I'll show it in a way that makes the catamorphisms explicit: </p> <p> <pre><span style="color:#2b91af;">getIcon</span>&nbsp;::&nbsp;<span style="color:blue;">User</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;<span style="color:blue;">Icon</span> getIcon&nbsp;u&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;lazyIcons&nbsp;=&nbsp;<span style="color:blue;">fmap</span>&nbsp;(\f&nbsp;-&gt;&nbsp;f&nbsp;u)&nbsp;[getGravatarIO,&nbsp;getIdenticonIO,&nbsp;getDBIconIO] &nbsp;&nbsp;m&nbsp;&lt;-&nbsp;getFirstIO&nbsp;$&nbsp;fold&nbsp;lazyIcons &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;fromMaybe&nbsp;defaultIcon&nbsp;m</pre> </p> <p> The <code>getIcon</code> function starts with a list of all three functions. For each of them, it calls the function with the <code>User</code> value <code>u</code>. This may seem inefficient and redundant, because all three function calls may not be required, but since the return values are <code>FirstIO</code> values, all three function calls are lazily evaluated - even under <code>IO</code>. The result, <code>lazyIcons</code>, is a <code>[FirstIO Icon]</code> value; i.e. a lazily evaluated list of lazily evaluated values. </p> <p> This first step is just to put the potential values in a form that's recognisable. You can now <code>fold</code> the <code>lazyIcons</code> to a single <code>FirstIO Icon</code> value, and then use <code>getFirstIO</code> to unwrap it. Due to <code>do</code> notation, <code>m</code> is a <code>Maybe Icon</code> value. </p> <p> This is the first catamorphism. 
Granted, the generalisation that <code>fold</code> offers is not really required, since <code>lazyIcons</code> is a list; <code>mconcat</code> would have worked just as well. I did, however, choose to use <code>fold</code> (from <code>Data.Foldable</code>) to emphasise the point. While the <code>fold</code> function itself isn't the catamorphism for lists, we know that <a href="/2019/05/27/list-catamorphism">it's derived from the list catamorphism</a>. </p> <p> The final step is to utilise the Maybe catamorphism to reduce the <code>Maybe Icon</code> value to an <code>Icon</code> value. Again, the <code>getIcon</code> function doesn't use the Maybe catamorphism directly, but rather the derived <code>fromMaybe</code> function. The <a href="/2019/05/20/maybe-catamorphism">Maybe catamorphism</a> is the <code>maybe</code> function, but you can trivially implement <code>fromMaybe</code> with <code>maybe</code>. </p> <p> For <a href="https://en.wikipedia.org/wiki/Code_golf">golfers</a>, it's certainly possible to write this function in a more compact manner. Here's a <a href="https://en.wikipedia.org/wiki/Tacit_programming">point-free</a> version: </p> <p> <pre><span style="color:#2b91af;">getIcon</span>&nbsp;::&nbsp;<span style="color:blue;">User</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;<span style="color:blue;">Icon</span> getIcon&nbsp;= &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;(fromMaybe&nbsp;defaultIcon)&nbsp;.&nbsp;getFirstIO&nbsp;.&nbsp;fold&nbsp;[getGravatarIO,&nbsp;getIdenticonIO,&nbsp;getDBIconIO]</pre> </p> <p> This alternative version utilises that <code>a -&gt; m</code> is a <code>Monoid</code> instance when <code>m</code> is a <code>Monoid</code> instance. That's the reason that you can <code>fold</code> a list of functions. The more explicit version above doesn't do that, but the behaviour is the same in both cases. 
</p> <p> That's all the Haskell code we need to discern the universal abstractions involved in the Chain of Responsibility design pattern. We can now return to the C# code example. </p> <h3 id="492ff50788784d7dbf6560ed08ed6bf7"> Chains as lists <a href="#492ff50788784d7dbf6560ed08ed6bf7" title="permalink">#</a> </h3> <p> The Chain of Responsibility design pattern is often illustrated like above, in a staircase-like diagram. There's, however, no inherent requirement to do so. You could also flatten the diagram: </p> <p> <img src="/content/binary/chain-of-responsibility-as-a-linked-list.png" alt="Chain of Responsibility illustrated as a linked list."> </p> <p> This looks a lot like a linked list. </p> <p> The difference is, however, that the terminator of a linked list is usually empty. Here, however, you have two types of objects. All objects apart from the rightmost object represent a <em>potential</em>. Each object may, or may not, handle the method call and produce an outcome; if an object can't handle the method call, it'll delegate to the next object in the chain. </p> <p> The rightmost object, however, is different. This object can't delegate any further, but <em>must</em> handle the method call. In the icon reader example, this is the <code>DefaultIconReader</code> class. </p> <p> Once you start to see most of the list as a list of potential values, you may realise that you'll be able to collapse it into a single potential value. This is possible because <a href="/2018/04/03/maybe-monoids">a list of values where you pick the first non-empty value forms a monoid</a>. This is sometimes called the <em>First</em> <a href="/2017/10/06/monoids">monoid</a>. 
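</p> <p> To make the <em>First</em> monoid a little more concrete, here's a minimal sketch that isn't part of the article's code base. It assumes that <code>int?</code> can stand in for a potential value, with <code>null</code> as the empty case; the names <code>FirstMonoidSketch</code>, <code>Combine</code>, and <code>FindFirst</code> are invented for illustration: </p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative sketch only: int? stands in for a 'potential' value.
public static class FirstMonoidSketch
{
    // Binary operation: keep the left value if it's present,
    // otherwise fall back to the right one.
    public static int? Combine(int? x, int? y)
    {
        return x ?? y;
    }

    // Fold the candidates with Combine, using the identity (null) as seed.
    public static int? FindFirst(IEnumerable<int?> candidates)
    {
        return candidates.Aggregate((int?)null, (acc, x) => Combine(acc, x));
    }

    public static void Main()
    {
        int? first = FindFirst(new int?[] { null, 42, 1337 });
        int? none = FindFirst(new int?[] { null, null });

        Console.WriteLine(first);         // prints 42
        Console.WriteLine(none.HasValue); // prints False
    }
}
```

<p> The null-coalescing operator plays the role of the binary operation, and <code>null</code> is the identity. Unlike a Chain of Responsibility, this fold visits every element, but since the values are already computed, that's harmless here. 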
</p> <p> In other words, you can reduce, or fold, all of the list, except the rightmost value, to a single potential value: </p> <p> <img src="/content/binary/chain-of-responsibility-as-a-linked-list-single-fold.png" alt="Chain of Responsibility illustrated as a linked list, with all but the rightmost objects folded to one."> </p> <p> When you do that, however, you're left with a single potential value. The result of folding most of the list is that you get the leftmost non-empty value in the list. There's no guarantee, however, that that value is non-empty. If all the values in the list are empty, the result is also empty. This means that you somehow need to combine a potential value with a value that's guaranteed to be present: the terminator. </p> <p> You can do that with another fold: </p> <p> <img src="/content/binary/chain-of-responsibility-as-a-linked-list-double-fold.png" alt="Chain of Responsibility illustrated as a linked list, with two consecutive folds."> </p> <p> This second fold isn't a list fold, but rather a Maybe fold. </p> <h3 id="7632b9ff458d417fa49b1c65f7b198ed"> Maybe <a href="#7632b9ff458d417fa49b1c65f7b198ed" title="permalink">#</a> </h3> <p> The <em>First</em> monoid is a monoid over <a href="/2018/03/26/the-maybe-functor">Maybe</a>, so add a <code>Maybe</code> class to the code base. In Haskell, the catamorphism for Maybe is called <code>maybe</code>, but that's not a good method name in object-oriented design. 
Another option is some variation of <em>fold</em>, but in C#, this functionality tends to be called <code>Aggregate</code>, at least for <code>IEnumerable&lt;T&gt;</code>, so I'll reuse that terminology: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">TResult</span>&nbsp;Aggregate&lt;<span style="color:#2b91af;">TResult</span>&gt;(<span style="color:#2b91af;">TResult</span>&nbsp;@default,&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;func) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(func&nbsp;==&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(func)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;hasItem&nbsp;?&nbsp;func(item)&nbsp;:&nbsp;@default; }</pre> </p> <p> You can implement another, more list-like <code>Aggregate</code> overload from this one, but for this article, you don't need it. 
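</p> <p> The <code>Aggregate</code> method refers to <code>hasItem</code> and <code>item</code>, which aren't shown. The following is only a sketch of what the surrounding class might look like. It assumes a parameterless constructor for the empty case and a one-argument constructor for the populated case (both are used in later examples); the field layout and the <code>MaybeSketch</code> class are guesses made for illustration: </p>

```csharp
using System;

// A guess at the class around Aggregate; not the article's actual code.
public sealed class Maybe<T>
{
    private readonly bool hasItem;
    private readonly T item;

    // Creates an empty Maybe.
    public Maybe()
    {
        hasItem = false;
    }

    // Creates a populated Maybe.
    public Maybe(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));

        this.item = item;
        hasItem = true;
    }

    // The Maybe catamorphism, as shown above.
    public TResult Aggregate<TResult>(TResult @default, Func<T, TResult> func)
    {
        if (func == null)
            throw new ArgumentNullException(nameof(func));

        return hasItem ? func(item) : @default;
    }
}

public static class MaybeSketch
{
    // Reduces a Maybe<int> to a string, handling both cases.
    public static string Describe(Maybe<int> m)
    {
        return m.Aggregate("nothing", i => "found " + i);
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new Maybe<int>(42))); // prints found 42
        Console.WriteLine(Describe(new Maybe<int>()));   // prints nothing
    }
}
```

<p> The caller never inspects <code>hasItem</code> directly; <code>Aggregate</code> forces it to supply a result for both the empty and the populated case. 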
</p> <h3 id="8b60d0c605d14cffbfa5e237cf26b7b2"> From TryReadIconId to Maybe <a href="#8b60d0c605d14cffbfa5e237cf26b7b2" title="permalink">#</a> </h3> <p> In the above code examples, <code>DBIconReader</code> depends on <code>IUserRepository</code>, which defined this method: </p> <p> <pre><span style="color:blue;">bool</span>&nbsp;TryReadIconId(<span style="color:blue;">int</span>&nbsp;userId,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">string</span>&nbsp;iconId);</pre> </p> <p> From <a href="/2019/07/15/tester-doer-isomorphisms">Tester-Doer isomorphisms</a> we know, however, that such a design is isomorphic to returning a Maybe value, and since that's more composable, do that: </p> <p> <pre><span style="color:#2b91af;">Maybe</span>&lt;<span style="color:blue;">string</span>&gt;&nbsp;ReadIconId(<span style="color:blue;">int</span>&nbsp;userId);</pre> </p> <p> This requires you to refactor the <code>DBIconReader</code> implementation of the <code>ReadIcon</code> method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:blue;">string</span>&gt;&nbsp;mid&nbsp;=&nbsp;repository.ReadIconId(user.Id); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&nbsp;lazyResult&nbsp;=&nbsp;mid.Aggregate( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;@default:&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;(()&nbsp;=&gt;&nbsp;next.ReadIcon(user)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;func:&nbsp;id&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;(()&nbsp;=&gt;&nbsp;CreateIcon(id))); 
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;lazyResult.Value; }</pre> </p> <p> A few things are worth a mention. Notice that the above <code>Aggregate</code> method (the Maybe catamorphism) requires you to supply a <code>@default</code> value (to be used if the Maybe object is empty). In the Chain of Responsibility design pattern, however, the fallback value is produced by calling the <code>next</code> object in the chain. If you do this unconditionally, however, you perform too much work. You're only supposed to call <code>next</code> if the current object can't handle the method call. </p> <p> The solution is to aggregate the <code>mid</code> object to a <code>Lazy&lt;Icon&gt;</code> and then return its <code>Value</code>. The <code>@default</code> value is now a lazy computation that calls <code>next</code> only if its <code>Value</code> is read. When <code>mid</code> is populated, on the other hand, the lazy computation calls the private <code>CreateIcon</code> method when <code>Value</code> is accessed. The private <code>CreateIcon</code> method contains the same logic as before the refactoring. </p> <p> This change of <code>DBIconReader</code> isn't strictly necessary in order to change the overall Chain of Responsibility to a pair of catamorphisms, but serves, I think, as a nice introduction to the use of the Maybe catamorphism. 
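</p> <p> The difference between an eager and a deferred fallback can be sketched in isolation. The following stand-alone example isn't taken from the article's code base: <code>Fallback</code> is a hypothetical stand-in for <code>next.ReadIcon(user)</code>, the <code>FallbackCalls</code> counter exists only to make the extra work visible, and the <code>Maybe</code> class is a stripped-down assumption: </p>

```csharp
using System;

// Stripped-down Maybe, assumed for the sake of a self-contained example.
public sealed class Maybe<T>
{
    private readonly bool hasItem;
    private readonly T item;

    public Maybe() { }
    public Maybe(T item) { this.item = item; hasItem = true; }

    public TResult Aggregate<TResult>(TResult @default, Func<T, TResult> func)
    {
        return hasItem ? func(item) : @default;
    }
}

public static class LazyFallbackSketch
{
    // Counts how often the (hypothetical) expensive fallback runs.
    public static int FallbackCalls;

    private static string Fallback()
    {
        FallbackCalls++;
        return "default";
    }

    // The argument is evaluated before Aggregate runs, so Fallback
    // is called even when m is populated.
    public static string Eager(Maybe<string> m)
    {
        return m.Aggregate(Fallback(), id => "icon:" + id);
    }

    // Both branches are wrapped in Lazy<T>; only the chosen one runs.
    public static string Deferred(Maybe<string> m)
    {
        Lazy<string> lazyResult = m.Aggregate(
            @default: new Lazy<string>(() => Fallback()),
            func: id => new Lazy<string>(() => "icon:" + id));
        return lazyResult.Value;
    }

    public static void Main()
    {
        var populated = new Maybe<string>("abc");

        Eager(populated);
        Console.WriteLine(FallbackCalls); // prints 1

        FallbackCalls = 0;
        Deferred(populated);
        Console.WriteLine(FallbackCalls); // prints 0
    }
}
```

<p> With a populated Maybe, the eager variant still pays for the fallback, whereas the deferred variant never touches it. 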
</p> <h3 id="ec329c8a0b70432d81d6f69e7084c13f"> Optional icon readers <a href="#ec329c8a0b70432d81d6f69e7084c13f" title="permalink">#</a> </h3> <p> Previously, the <code>IIconReader</code> interface <em>required</em> each implementation to return an <code>Icon</code> object: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user); }</pre> </p> <p> When you have an object like <code>GravatarReader</code> that may or may not return an <code>Icon</code>, this requirement leads toward the Chain of Responsibility design pattern. You can, however, shift the responsibility of what to do next by changing the interface: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user); }</pre> </p> <p> An implementation like <code>GravatarReader</code> becomes simpler: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">GravatarReader</span>&nbsp;:&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(user.UseGravatar) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span 
style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Icon</span>(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Gravatar</span>(user.Email).Url)); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;(); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> No longer do you have to pass in a <code>next</code> dependency. Instead, you just return an empty <code>Maybe&lt;Icon&gt;</code> if you can't handle the method call. The same change applies to the <code>IdenticonReader</code> class. </p> <p> Since <a href="/2018/03/26/the-maybe-functor">Maybe is a functor</a>, and the <code>DBIconReader</code> already works on a <code>Maybe&lt;string&gt;</code> value, its implementation is greatly simplified: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;repository.ReadIconId(user.Id).Select(CreateIcon); }</pre> </p> <p> Since <code>ReadIconId</code> returns a <code>Maybe&lt;string&gt;</code>, you can simply use <code>Select</code> to transform the icon ID to an <code>Icon</code> object if the Maybe is populated. 
</p> <h3 id="94cac3b9e52e48c2a1768fd24c72e4bd"> Coalescing Composite <a href="#94cac3b9e52e48c2a1768fd24c72e4bd" title="permalink">#</a> </h3> <p> As an intermediate step, you can compose the various readers using a <a href="/2018/04/09/coalescing-composite-as-a-monoid">Coalescing Composite</a>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">CompositeIconReader</span>&nbsp;:&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IIconReader</span>[]&nbsp;iconReaders; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;CompositeIconReader(<span style="color:blue;">params</span>&nbsp;<span style="color:#2b91af;">IIconReader</span>[]&nbsp;iconReaders) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.iconReaders&nbsp;=&nbsp;iconReaders; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">foreach</span>&nbsp;(<span style="color:blue;">var</span>&nbsp;iconReader&nbsp;<span style="color:blue;">in</span>&nbsp;iconReaders) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;mIcon&nbsp;=&nbsp;iconReader.ReadIcon(user); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(IsPopulated(mIcon)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">return</span>&nbsp;mIcon; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;IsPopulated&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;m) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;m.Aggregate(<span style="color:blue;">false</span>,&nbsp;_&nbsp;=&gt;&nbsp;<span style="color:blue;">true</span>); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> I prefer a more explicit design over this one, so this is just an intermediate step. This <code>IIconReader</code> implementation composes an array of other <code>IIconReader</code> objects and queries each in order to return the first populated Maybe value it finds. If it doesn't find any populated value, it returns an empty Maybe object. 
</p> <p> You can now compose your <code>IIconReader</code> objects into a <a href="https://en.wikipedia.org/wiki/Composite_pattern">Composite</a>: </p> <p> <pre><span style="color:#2b91af;">IIconReader</span>&nbsp;reader&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">CompositeIconReader</span>( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">GravatarReader</span>(), &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">IdenticonReader</span>(), &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DBIconReader</span>(repo));</pre> </p> <p> While this gives you a single object on which you can call <code>ReadIcon</code>, the return value of that method is still a <code>Maybe&lt;Icon&gt;</code> object. You still need to reduce the <code>Maybe&lt;Icon&gt;</code> object to an <code>Icon</code> object. You can do this with a Maybe helper method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">T</span>&nbsp;GetValueOrDefault(<span style="color:#2b91af;">T</span>&nbsp;@default) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Aggregate(@default,&nbsp;x&nbsp;=&gt;&nbsp;x); }</pre> </p> <p> Given a <code>User</code> object named <code>user</code>, you can now use the composition and the <code>GetValueOrDefault</code> method to get an <code>Icon</code> object: </p> <p> <pre><span style="color:#2b91af;">Icon</span>&nbsp;icon&nbsp;=&nbsp;reader.ReadIcon(user).GetValueOrDefault(<span style="color:#2b91af;">Icon</span>.Default);</pre> </p> <p> First you use the composed <code>reader</code> to produce a <code>Maybe&lt;Icon&gt;</code> object, and then you use the <code>GetValueOrDefault</code> method to reduce the <code>Maybe&lt;Icon&gt;</code> object to an <code>Icon</code> object. 
</p> <p> The latter of these two steps, <code>GetValueOrDefault</code>, is already based on the Maybe catamorphism, but the first step is still too implicit to clearly show the nature of what's actually going on. The next step is to refactor the Coalescing Composite to a list of monoidal values. </p> <h3 id="c75ce57c2b4f4315a93eaa91b653a370"> First <a href="#c75ce57c2b4f4315a93eaa91b653a370" title="permalink">#</a> </h3> <p> While not strictly necessary, you can introduce a <code>First&lt;T&gt;</code> wrapper: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;First(<span style="color:#2b91af;">T</span>&nbsp;item) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(item&nbsp;==&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(item)); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Item&nbsp;=&nbsp;item; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">T</span>&nbsp;Item&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;Equals(<span style="color:blue;">object</span>&nbsp;obj) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(!(obj&nbsp;<span style="color:blue;">is</span>&nbsp;<span 
style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;other)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Equals(Item,&nbsp;other.Item); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">int</span>&nbsp;GetHashCode() &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Item.GetHashCode(); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> In this particular example, the <code>First&lt;T&gt;</code> class adds no new capabilities, so it's technically redundant. You could add to it methods to combine two <code>First&lt;T&gt;</code> objects into one (since <em>First</em> forms a <a href="/2017/11/27/semigroups">semigroup</a>), and perhaps a method or two to <a href="/2017/12/11/semigroups-accumulate">accumulate multiple values</a>, but in this article, none of those are required. </p> <p> While the class as shown above doesn't add any behaviour, I like that it signals intent, so I'll use it in that role. </p> <h3 id="c3feb40d90fc4d389fa0b3812abaa62c"> Lazy I/O in C# <a href="#c3feb40d90fc4d389fa0b3812abaa62c" title="permalink">#</a> </h3> <p> Like in the above Haskell code, you'll need to be able to combine two <code>First&lt;T&gt;</code> objects in a lazy fashion, in such a way that if the first object is populated, the I/O associated with producing the second value never happens. In Haskell I addressed that concern with a <code>newtype</code> that, among other abstractions, is a monoid. 
You can do the same in C# with an extension method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;&gt;&nbsp;FindFirst&lt;<span style="color:#2b91af;">T</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;&gt;&nbsp;m, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;&gt;&nbsp;other) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(m.Value.IsPopulated()) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;m; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;other; } <span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;IsPopulated&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;m) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;m.Aggregate(<span style="color:blue;">false</span>,&nbsp;_&nbsp;=&gt;&nbsp;<span style="color:blue;">true</span>); }</pre> </p> <p> The <code>FindFirst</code> method returns the first (leftmost) non-empty object of two options. It's a lazy version of the <em>First</em> monoid, and <a href="/2019/04/15/lazy-monoids">that's still a monoid</a>. It's truly lazy because it never accesses the <code>Value</code> property on <code>other</code>. 
While it has to force evaluation of the first lazy computation, <code>m</code>, it doesn't have to evaluate <code>other</code>. Thus, whenever <code>m</code> is populated, <code>other</code> can remain non-evaluated. </p> <p> Since <a href="/2017/11/20/monoids-accumulate">monoids accumulate</a>, you can also write an extension method to implement that functionality: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;&gt;&nbsp;FindFirst&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;&gt;&gt;&nbsp;source) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;identity&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;&gt;(()&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.Aggregate(identity,&nbsp;(acc,&nbsp;x)&nbsp;=&gt;&nbsp;acc.FindFirst(x)); }</pre> </p> <p> This overload just uses the earlier <code>FindFirst</code> extension method to fold an arbitrary number of lazy <code>First&lt;T&gt;</code> objects into one. Notice that <code>Aggregate</code> is the C# name for the list catamorphism. 
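</p> <p> To see the laziness at work, consider the following stand-alone sketch. It isn't the article's code: it drops the <code>First&lt;T&gt;</code> wrapper to stay short, uses a stripped-down <code>Maybe</code> class, and simulates I/O with a hypothetical <code>Reader</code> helper that increments a <code>Reads</code> counter whenever a lazy value is forced: </p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stripped-down Maybe, assumed for the sake of a self-contained example.
public sealed class Maybe<T>
{
    private readonly bool hasItem;
    private readonly T item;

    public Maybe() { }
    public Maybe(T item) { this.item = item; hasItem = true; }

    public TResult Aggregate<TResult>(TResult @default, Func<T, TResult> func)
    {
        return hasItem ? func(item) : @default;
    }
}

public static class LazyFirst
{
    // Lazy binary operation: forces m, but never touches other.Value.
    public static Lazy<Maybe<T>> FindFirst<T>(
        this Lazy<Maybe<T>> m,
        Lazy<Maybe<T>> other)
    {
        return m.Value.Aggregate(false, _ => true) ? m : other;
    }

    // Folds a sequence of lazy values, starting from the empty identity.
    public static Lazy<Maybe<T>> FindFirst<T>(
        this IEnumerable<Lazy<Maybe<T>>> source)
    {
        var identity = new Lazy<Maybe<T>>(() => new Maybe<T>());
        return source.Aggregate(identity, (acc, x) => acc.FindFirst(x));
    }
}

public static class LazyFirstSketch
{
    // Number of simulated I/O operations that actually ran.
    public static int Reads;

    // Hypothetical reader: null means 'can't handle the request'.
    private static Lazy<Maybe<string>> Reader(string result)
    {
        return new Lazy<Maybe<string>>(() =>
        {
            Reads++;
            return result == null
                ? new Maybe<string>()
                : new Maybe<string>(result);
        });
    }

    public static string Run()
    {
        var lazyIcons = new[] { Reader(null), Reader("identicon"), Reader("db") };
        return lazyIcons.FindFirst().Value.Aggregate("default", x => x);
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints identicon
        Console.WriteLine(Reads); // prints 2
    }
}
```

<p> The fold forces the first two lazy values, finds the second one populated, and leaves the third untouched, so the simulated database read never happens. 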
</p> <p> You can now compose the desired functionality using the basic building blocks of monoids, <a href="/2018/03/22/functors">functors</a>, and catamorphisms. </p> <h3 id="0fe80a69c74c463dacb8af0f86898518"> Composition from universal abstractions <a href="#0fe80a69c74c463dacb8af0f86898518" title="permalink">#</a> </h3> <p> The goal is still a function that takes a <code>User</code> object as input and produces an <code>Icon</code> object as output. While you could compose that functionality directly in-line where you need it, I think it may be helpful to package the composition in a <a href="https://en.wikipedia.org/wiki/Facade_pattern">Facade</a> object. </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">IconReaderFacade</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IReadOnlyCollection</span>&lt;<span style="color:#2b91af;">IIconReader</span>&gt;&nbsp;readers; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;IconReaderFacade(<span style="color:#2b91af;">IUserRepository</span>&nbsp;repository) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;readers&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">IIconReader</span>[] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">GravatarReader</span>(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">IdenticonReader</span>(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DBIconReader</span>(repository) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&gt;&gt;&gt;&nbsp;lazyIcons&nbsp;=&nbsp;readers &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Select(r&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&gt;&gt;(()&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;r.ReadIcon(user).Select(i&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;(i)))); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&gt;&gt;&nbsp;m&nbsp;=&nbsp;lazyIcons.FindFirst(); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;m.Value.Aggregate(<span style="color:#2b91af;">Icon</span>.Default,&nbsp;fi&nbsp;=&gt;&nbsp;fi.Item); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> When you initialise an 
<code>IconReaderFacade</code> object, it creates an array of the desired <code>readers</code>. Whenever <code>ReadIcon</code> is invoked, it first transforms all those readers to a sequence of potential icons. All the values in the sequence are lazily evaluated, so in this step, nothing actually happens, even though it looks as though all readers' <code>ReadIcon</code> method gets called. The <code>Select</code> method is a structure-preserving map, so all readers are still potential producers of <code>Icon</code> objects. </p> <p> You now have an <code>IEnumerable&lt;Lazy&lt;Maybe&lt;First&lt;Icon&gt;&gt;&gt;&gt;</code>, which must be a good candidate for the prize for the <em>most nested generic .NET type of 2019</em>. It fits, though, the input type for the above <code>FindFirst</code> overload, so you can call that. The result is a single potential value <code>m</code>. That's the list catamorphism applied. </p> <p> Finally, you force evaluation of the lazy computation and apply the Maybe catamorphism (<code>Aggregate</code>). The <code>@default</code> value is <code>Icon.Default</code>, which gets returned if <code>m</code> turns out to be empty. When <code>m</code> is populated, you pull the <code>Item</code> out of the <code>First</code> object. In either case, you now have an <code>Icon</code> object to return. </p> <p> This composition has exactly the same behaviour as the initial Chain of Responsibility implementation, but is now composed from universal abstractions. </p> <h3 id="23819ca370344b94875ddbf5bde5aef3"> Summary <a href="#23819ca370344b94875ddbf5bde5aef3" title="permalink">#</a> </h3> <p> The Chain of Responsibility design pattern describes a flexible way to implement conditional logic. Instead of relying on keywords like <code>if</code> or <code>switch</code>, you can compose the conditional logic from polymorphic objects. This gives you several advantages. 
One is that you get better separation of concerns, which will tend to make it easier to refactor the code. Another is that it's possible to change the behaviour at run time, by moving the objects around. </p> <p> You can achieve a similar design, with equivalent advantages, by composing polymorphically similar functions in a list, mapping the functions to a list of potential values, and then using the list catamorphism to reduce many potential values to one. Finally, you apply the Maybe catamorphism to produce a value, even if the potential value is empty. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Tester-Doer isomorphisms https://blog.ploeh.dk/2019/07/15/tester-doer-isomorphisms 2019-07-15T07:35:00+00:00 Mark Seemann <div id="post"> <p> <em>The Tester-Doer pattern is equivalent to the Try-Parse idiom; both are equivalent to Maybe.</em> </p> <p> This article is part of <a href="/2018/01/08/software-design-isomorphisms">a series of articles about software design isomorphisms</a>. Two representations are isomorphic when a bi-directional, lossless translation exists between them. Such translations exist between the <em>Tester-Doer</em> pattern and the <em>Try-Parse</em> idiom. Both can also be translated into operations that return <a href="/2018/03/26/the-maybe-functor">Maybe</a>. </p> <p> <img src="/content/binary/tester-doer-try-parse-maybe-isomorphism.png" alt="Isomorphisms between Tester-Doer, Try-Parse, and Maybe."> </p> <p> Given an implementation that uses one of those three idioms or abstractions, you can translate your design into one of the other options. This doesn't imply that each is of equal value. When it comes to composability, Maybe is superior to the two other alternatives, and Tester-Doer isn't thread-safe. 
</p> <h3 id="e95c8f5d7a6445139b58445d30498493"> Tester-Doer <a href="#e95c8f5d7a6445139b58445d30498493" title="permalink">#</a> </h3> <p> The first time I explicitly encountered the Tester-Doer pattern was in the <a href="https://amzn.to/2zXCCfH">Framework Design Guidelines</a>, which is from where I've taken the name. The pattern is, however, older. The idea that you can query an object about whether a given operation would be possible, and then you only perform it if the answer is affirmative, is almost a leitmotif in <a href="http://amzn.to/1claOin">Object-Oriented Software Construction</a>. Bertrand Meyer often uses linked lists and stacks as examples, but I'll instead use the example that Krzysztof Cwalina and Brad Abrams use: </p> <p> <pre><span style="color:#2b91af;">ICollection</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;numbers&nbsp;=&nbsp;<span style="color:green;">//&nbsp;...</span> <span style="color:blue;">if</span>&nbsp;(!numbers.IsReadOnly) &nbsp;&nbsp;&nbsp;&nbsp;numbers.Add(1);</pre> </p> <p> The idea with the Tester-Doer pattern is that you test whether an intended operation is legal, and only perform it if the answer is affirmative. In the example, you only add to the <code>numbers</code> collection if <code>IsReadOnly</code> is <code>false</code>. Here, <code>IsReadOnly</code> is the <em>Tester</em>, and <code>Add</code> is the <em>Doer</em>. </p> <p> As Jeffrey Richter points out in the book, this is a dangerous pattern: <blockquote> "The potential problem occurs when you have multiple threads accessing the object at the same time. For example, one thread could execute the test method, which reports that all is OK, and before the doer method executes, another thread could change the object, causing the doer to fail." </blockquote> In other words, the pattern isn't thread-safe. While multi-threaded programming was always supported in .NET, this was less of a concern when the guidelines were first published (2006) than it is today. 
The guidelines were in internal use in Microsoft years before they were published, and there weren't many multi-core processors in use back then. </p> <p> Another problem with the Tester-Doer pattern is with discoverability. If you're looking for a way to add an element to a collection, you'd usually consider your search over once you find the <code>Add</code> method. Even if you wonder <em>Is this operation safe? Can I always add an element to a collection?</em> you <em>might</em> consider looking for a <code>CanAdd</code> method, but not an <code>IsReadOnly</code> property. Most people don't even ask the question in the first place, though. </p> <h3 id="08bc9f42d8f048119f952aa9c2d94b34"> From Tester-Doer to Try-Parse <a href="#08bc9f42d8f048119f952aa9c2d94b34" title="permalink">#</a> </h3> <p> You could refactor such a Tester-Doer API to a single method, which is both thread-safe and discoverable. One option is a variation of the Try-Parse idiom (discussed in detail below). Using it could look like this: </p> <p> <pre><span style="color:#2b91af;">ICollection</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;numbers&nbsp;=&nbsp;<span style="color:green;">//&nbsp;...</span> <span style="color:blue;">bool</span>&nbsp;wasAdded&nbsp;=&nbsp;numbers.TryAdd(1);</pre> </p> <p> In this special case, you may not need the <code>wasAdded</code> variable, because the original <code>Add</code> operation never returned a value. If, on the other hand, you do care whether or not the element was added to the collection, you'd have to figure out what to do in the cases where the return value is <code>true</code> or <code>false</code>, respectively. </p> <p> Compared to the more idiomatic example of the Try-Parse idiom below, you may have noticed that the <code>TryAdd</code> method shown here takes no <code>out</code> parameter. This is because the original <code>Add</code> method returns <code>void</code>; there's nothing to return. 
From <a href="/2018/01/15/unit-isomorphisms">unit isomorphisms</a>, however, we know that <em>unit</em> is isomorphic to <code>void</code>, so we could, more explicitly, have defined a <code>TryAdd</code> method with this signature: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;TryAdd(<span style="color:#2b91af;">T</span>&nbsp;item,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:#2b91af;">Unit</span>&nbsp;unit)</pre> </p> <p> There's no point in doing this, however, apart from demonstrating that the isomorphism holds. </p> <h3 id="e246bcfabcab42e8b76e2b3e314174c4"> From Tester-Doer to Maybe <a href="#e246bcfabcab42e8b76e2b3e314174c4" title="permalink">#</a> </h3> <p> You can also refactor the add-to-collection example to return a Maybe value, although in this degenerate case, it makes little sense. If you automate the refactoring process, you'd arrive at an API like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Unit</span>&gt;&nbsp;TryAdd(<span style="color:#2b91af;">T</span>&nbsp;item)</pre> </p> <p> Using it would look like this: </p> <p> <pre><span style="color:#2b91af;">ICollection</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;numbers&nbsp;=&nbsp;<span style="color:green;">//&nbsp;...</span> <span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Unit</span>&gt;&nbsp;m&nbsp;=&nbsp;numbers.TryAdd(1);</pre> </p> <p> The contract is consistent with what Maybe implies: You'd get an empty <code>Maybe&lt;Unit&gt;</code> object if the <em>add</em> operation 'failed', and a populated <code>Maybe&lt;Unit&gt;</code> object if the <em>add</em> operation succeeded. Even in the populated case, though, the value contained in the Maybe object would be <em>unit</em>, which carries no further information than its existence. 
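</p> <p> As a rough sketch of that contract, the following self-contained program wraps the existing Tester-Doer members in a single Maybe-returning method. The <code>Unit</code> and <code>Maybe&lt;T&gt;</code> types here are hypothetical, minimal stand-ins written for this illustration (only the <code>Aggregate</code> method mirrors the Maybe API used elsewhere in these articles), and <code>CollectionEnvy</code> is an invented class name. </p> <p> <pre>using System;
using System.Collections.Generic;

// Hypothetical minimal Unit type, for illustration only.
public sealed class Unit
{
    public static readonly Unit Value = new Unit();
    private Unit() { }
}

// Hypothetical minimal Maybe type, for illustration only.
public sealed class Maybe&lt;T&gt;
{
    private readonly bool hasItem;
    private readonly T item;

    public Maybe() { }
    public Maybe(T item) { this.hasItem = true; this.item = item; }

    // Returns @default when empty; otherwise applies func to the item.
    public TResult Aggregate&lt;TResult&gt;(TResult @default, Func&lt;T, TResult&gt; func)
    {
        return hasItem ? func(item) : @default;
    }
}

public static class CollectionEnvy
{
    // Empty result when the collection is read-only; populated with unit otherwise.
    public static Maybe&lt;Unit&gt; TryAdd&lt;T&gt;(this ICollection&lt;T&gt; source, T item)
    {
        if (source.IsReadOnly)
            return new Maybe&lt;Unit&gt;();
        source.Add(item);
        return new Maybe&lt;Unit&gt;(Unit.Value);
    }
}

public static class Program
{
    public static void Main()
    {
        ICollection&lt;int&gt; writable = new List&lt;int&gt;();
        ICollection&lt;int&gt; readOnly = new List&lt;int&gt;().AsReadOnly();

        Console.WriteLine(writable.TryAdd(1).Aggregate(false, _ =&gt; true));
        Console.WriteLine(readOnly.TryAdd(1).Aggregate(false, _ =&gt; true));
    }
}</pre> </p> <p> With these stand-ins, the program prints <code>True</code> and then <code>False</code>. Keep in mind that this adapter still tests and then acts internally, so by itself it only changes the shape of the API; a genuinely thread-safe version would have to be implemented by the collection itself. 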
</p> <p> To be clear, this isn't close to a proper functional design because all the interesting action happens as a side effect. Does the design have to be functional? No, it clearly isn't in this case, but Maybe is a concept that originated in functional programming, so you could be misled to believe that I'm trying to pass this particular design off as functional. It's not. </p> <p> A functional version of this API could look like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">ICollection</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;&nbsp;TryAdd(<span style="color:#2b91af;">T</span>&nbsp;item)</pre> </p> <p> An implementation wouldn't mutate the object itself, but rather return a new collection with the added item, in case that was possible. This is, however, always possible, because you can always concatenate <code>item</code> to the front of the collection. In other words, this particular line of inquiry is increasingly veering into the territory of the absurd. This isn't, however, a counter-example of my proposition that the isomorphism exists; it's just a result of the initial example being degenerate. </p> <h3 id="9817f0d35d99428f93c38cab9fabc9ad"> Try-Parse <a href="#9817f0d35d99428f93c38cab9fabc9ad" title="permalink">#</a> </h3> <p> Another idiom described in the Framework Design Guidelines is the Try-Parse idiom. This seems to be a coding idiom more specific to the .NET framework, which is the reason I call it an <em>idiom</em> instead of a <em>pattern</em>. (Perhaps it is, after all, a pattern... I'm sure many of my readers are better informed about how problems like these are solved in other languages, and can enlighten me.) </p> <p> A better name might be <em>Try-Do</em>, since the idiom doesn't have to be constrained to parsing. 
The example that Cwalina and Abrams supply, however, relates to parsing a <code>string</code> into a <code>DateTime</code> value. Such an API is <a href="https://docs.microsoft.com/en-us/dotnet/api/system.datetime.tryparse">already available in the base class library</a>. Using it looks like this: </p> <p> <pre><span style="color:blue;">bool</span>&nbsp;couldParse&nbsp;=&nbsp;<span style="color:#2b91af;">DateTime</span>.TryParse(candidate,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:#2b91af;">DateTime</span>&nbsp;dateTime);</pre> </p> <p> Since <code>DateTime</code> is a <a href="https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/value-types">value type</a>, the <code>out</code> parameter will never be <code>null</code>, even if parsing fails. You can, however, examine the return value <code>couldParse</code> to determine whether the <code>candidate</code> could be parsed. </p> <p> In the running commentary in the book, Jeffrey Richter likes this much better: <blockquote> "I like this guideline a lot. It solves the race-condition problem and the performance problem." </blockquote> I agree that it's better than Tester-Doer, but that doesn't mean that you can't refactor such a design to that pattern. </p> <h3 id="166ef01b6b64481a85fe64a6e9e07dc6"> From Try-Parse to Tester-Doer <a href="#166ef01b6b64481a85fe64a6e9e07dc6" title="permalink">#</a> </h3> <p> While I see no compelling reason to design parsing attempts with the Tester-Doer pattern, it's possible. 
You could create an API that enables interaction like this: </p> <p> <pre><span style="color:#2b91af;">DateTime</span>&nbsp;dateTime&nbsp;=&nbsp;<span style="color:blue;">default</span>(<span style="color:#2b91af;">DateTime</span>); <span style="color:blue;">bool</span>&nbsp;canParse&nbsp;=&nbsp;<span style="color:#2b91af;">DateTimeEnvy</span>.CanParse(candidate); <span style="color:blue;">if</span>&nbsp;(canParse) &nbsp;&nbsp;&nbsp;&nbsp;dateTime&nbsp;=&nbsp;<span style="color:#2b91af;">DateTime</span>.Parse(candidate);</pre> </p> <p> You'd need to add a new <code>CanParse</code> method with this signature: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;CanParse(<span style="color:blue;">string</span>&nbsp;candidate)</pre> </p> <p> In this particular example, you don't have to add a <code>Parse</code> method, because it already exists in the base class library, but in other examples, you'd have to add such a method as well. </p> <p> This example doesn't suffer from issues with thread safety, since strings are immutable, but in general, that problem is always a concern with the Tester-Doer <a href="/2019/01/21/some-thoughts-on-anti-patterns">anti-pattern</a>. Discoverability still suffers in this example. </p> <h3 id="ffd6284cfc8f4f528d1a3b80849fbf8c"> From Try-Parse to Maybe <a href="#ffd6284cfc8f4f528d1a3b80849fbf8c" title="permalink">#</a> </h3> <p> While the Try-Parse idiom is thread-safe, it isn't composable. Every time you run into an API modelled over this template, you have to stop what you're doing and check the return value. Did the operation succeed? What should the code do if it didn't? </p> <p> <em>Maybe</em>, on the other hand, is composable, so it's a much better way to model problems such as parsing. Typically, methods or functions that return Maybe values are still prefixed with <em>Try</em>, but there's no longer any <code>out</code> parameter. 
A Maybe-based <code>TryParse</code> function could look like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">DateTime</span>&gt;&nbsp;TryParse(<span style="color:blue;">string</span>&nbsp;candidate)</pre> </p> <p> You could use it like this: </p> <p> <pre><span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">DateTime</span>&gt;&nbsp;m&nbsp;=&nbsp;<span style="color:#2b91af;">DateTimeEnvy</span>.TryParse(candidate);</pre> </p> <p> If the <code>candidate</code> was successfully parsed, you get a populated <code>Maybe&lt;DateTime&gt;</code>; if the string was invalid, you get an empty <code>Maybe&lt;DateTime&gt;</code>. </p> <p> A Maybe object composes much better with other computations. Contrary to the Try-Parse idiom, you don't have to stop and examine a Boolean return value. You don't even have to deal with empty cases at the point where you parse. Instead, you can defer the decision about what to do in case of failure until a later time, where it may be more obvious what to do in that case. </p> <h3 id="4f27ce3476114a5f9b0f80fd415e5370"> Maybe <a href="#4f27ce3476114a5f9b0f80fd415e5370" title="permalink">#</a> </h3> <p> In my <a href="https://blog.ploeh.dk/encapsulation-and-solid">Encapsulation and SOLID</a> Pluralsight course, you get a walk-through of all three options for dealing with an operation that could potentially fail. Like in this article, the course starts with Tester-Doer, progresses over Try-Parse, and arrives at a Maybe-based implementation. In that course, the example involves reading a (previously stored) message from a text file. 
The final API looks like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:blue;">string</span>&gt;&nbsp;Read(<span style="color:blue;">int</span>&nbsp;id)</pre> </p> <p> The protocol implied by such a signature is that you supply an ID, and if a message with that ID exists on disc, you receive a populated <code>Maybe&lt;string&gt;</code>; otherwise, an empty object. This is not only composable, but also thread-safe. For anyone who understands the <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstraction</a> of Maybe, it's clear that this is an operation that could fail. Ultimately, client code will have to deal with empty Maybe values, but this doesn't have to happen immediately. Such a decision can be deferred until a proper context exists for that purpose. </p> <h3 id="d35fbacb32bb4ef6afc843813ba901f1"> From Maybe to Tester-Doer <a href="#d35fbacb32bb4ef6afc843813ba901f1" title="permalink">#</a> </h3> <p> Since Tester-Doer is the least useful of the patterns discussed in this article, it makes little sense to refactor a Maybe-based API to a Tester-Doer implementation. Nonetheless, it's still possible. The API could look like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;Exists(<span style="color:blue;">int</span>&nbsp;id) <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">string</span>&nbsp;Read(<span style="color:blue;">int</span>&nbsp;id)</pre> </p> <p> Not only is this design not thread-safe, but it's another example of poor discoverability. While the doer is called <code>Read</code>, the tester isn't called <code>CanRead</code>, but rather <code>Exists</code>. If the class has other members, these could be listed interleaved between <code>Exists</code> and <code>Read</code>. It wouldn't be obvious that these two members were designed to be used together. 
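</p> <p> To make the translation concrete, here's a minimal sketch of how both members could be derived from a Maybe-based operation. Everything in it is an assumption made for illustration: <code>Maybe&lt;T&gt;</code> is a hypothetical stand-in type, <code>InMemoryStore</code> replaces the file-based store with a dictionary, and the Maybe-returning method is renamed <code>TryRead</code> so that it can coexist with <code>Read</code> in one class. </p> <p> <pre>using System;
using System.Collections.Generic;

// Hypothetical minimal Maybe type, for illustration only.
public sealed class Maybe&lt;T&gt;
{
    private readonly bool hasItem;
    private readonly T item;

    public Maybe() { }
    public Maybe(T item) { this.hasItem = true; this.item = item; }

    // Returns @default when empty; otherwise applies func to the item.
    public TResult Aggregate&lt;TResult&gt;(TResult @default, Func&lt;T, TResult&gt; func)
    {
        return hasItem ? func(item) : @default;
    }
}

// Hypothetical in-memory stand-in for a file-based message store.
public sealed class InMemoryStore
{
    private readonly IDictionary&lt;int, string&gt; messages;

    public InMemoryStore(IDictionary&lt;int, string&gt; messages)
    {
        this.messages = messages;
    }

    // The Maybe-based operation that the other two members are derived from.
    public Maybe&lt;string&gt; TryRead(int id)
    {
        string message;
        return messages.TryGetValue(id, out message)
            ? new Maybe&lt;string&gt;(message)
            : new Maybe&lt;string&gt;();
    }

    // Tester: reports whether the Maybe value is populated.
    public bool Exists(int id)
    {
        return TryRead(id).Aggregate(false, _ =&gt; true);
    }

    // Doer: throws if no message exists for the ID.
    public string Read(int id)
    {
        if (!Exists(id))
            throw new InvalidOperationException("No message with ID " + id + ".");
        return TryRead(id).Aggregate("", s =&gt; s);
    }
}

public static class Program
{
    public static void Main()
    {
        var store = new InMemoryStore(new Dictionary&lt;int, string&gt; { { 49, "Hello" } });

        if (store.Exists(49))
            Console.WriteLine(store.Read(49));
        Console.WriteLine(store.Exists(50));
    }
}</pre> </p> <p> Under these assumptions, the program prints <code>Hello</code> and then <code>False</code>. No information is lost in the translation, but the window between the call to <code>Exists</code> and the call to <code>Read</code> is back. 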
</p> <p> Again, the intended usage is code like this: </p> <p> <pre><span style="color:blue;">string</span>&nbsp;message; <span style="color:blue;">if</span>&nbsp;(fileStore.Exists(49)) &nbsp;&nbsp;&nbsp;&nbsp;message&nbsp;=&nbsp;fileStore.Read(49);</pre> </p> <p> This is still problematic, because you need to decide what to do in the <code>else</code> case as well, although you don't see that case here. </p> <p> The point is, still, that you <em>can</em> translate from one representation to another without loss of information; not that you should. </p> <h3 id="3bbc92082af143d29681b2ce0bb11ccb"> From Maybe to Try-Parse <a href="#3bbc92082af143d29681b2ce0bb11ccb" title="permalink">#</a> </h3> <p> Of the three representations discussed in this article, I firmly believe that a Maybe-based API is superior. Unfortunately, the .NET base class library doesn't (yet) come with a built-in Maybe object, so if you're developing an API as part of a reusable library, you have two options: <ul> <li>Export the library's <code>Maybe&lt;T&gt;</code> type together with the methods that return it.</li> <li>Use Try-Parse for interoperability reasons.</li> </ul> This is the only reason I can think of to use the Try-Parse idiom. For the <code>FileStore</code> example from my Pluralsight course, this would imply not a <code>TryParse</code> method, but a <code>TryRead</code> method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;TryRead(<span style="color:blue;">int</span>&nbsp;id,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">string</span>&nbsp;message)</pre> </p> <p> This would enable you to expose the method in a reusable library. 
Client code could interact with it like this: </p> <p> <pre><span style="color:blue;">string</span>&nbsp;message; <span style="color:blue;">if</span>&nbsp;(!fileStore.TryRead(50,&nbsp;<span style="color:blue;">out</span>&nbsp;message)) &nbsp;&nbsp;&nbsp;&nbsp;message&nbsp;=&nbsp;<span style="color:#a31515;">&quot;&quot;</span>;</pre> </p> <p> This has all the problems associated with the Try-Parse idiom already discussed in this article, but it does, at least, have a basic use case. </p> <h3 id="c04073bcc534481eaaf1ba43dd2a22a4"> Isomorphism with Either <a href="#c04073bcc534481eaaf1ba43dd2a22a4" title="permalink">#</a> </h3> <p> At this point, I hope that you find it reasonable to believe that the three representations, Tester-Doer, Try-Parse, and Maybe, are isomorphic. You can translate from any of these representations to any of the others without loss of information. This also means that you can translate back again. </p> <p> While I've only argued with a series of examples, it's my experience that these three representations are truly isomorphic. You can always translate any of these representations into another. Mostly, though, I translate into Maybe. If you disagree with my proposition, all you have to do is to provide a counter-example. </p> <p> There's a fourth isomorphism that's already well-known, and that's between Maybe and <a href="/2018/06/11/church-encoded-either">Either</a>. Specifically, <code>Maybe&lt;T&gt;</code> is isomorphic to <code>Either&lt;Unit, T&gt;</code>. 
In <a href="https://www.haskell.org">Haskell</a>, this is easily demonstrated with this set of functions: </p> <p> <pre><span style="color:#2b91af;">toMaybe</span>&nbsp;::&nbsp;<span style="color:#2b91af;">Either</span>&nbsp;()&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;a toMaybe&nbsp;(Left&nbsp;<span style="color:blue;">()</span>)&nbsp;=&nbsp;Nothing toMaybe&nbsp;(Right&nbsp;x)&nbsp;=&nbsp;Just&nbsp;x <span style="color:#2b91af;">fromMaybe</span>&nbsp;::&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Either</span>&nbsp;()&nbsp;a fromMaybe&nbsp;Nothing&nbsp;=&nbsp;Left&nbsp;<span style="color:blue;">()</span> fromMaybe&nbsp;(Just&nbsp;x)&nbsp;=&nbsp;Right&nbsp;x</pre> </p> <p> Translated to C#, using the <a href="/2018/06/04/church-encoded-maybe">Church-encoded Maybe</a> together with the Church-encoded Either, these two functions could look like the following, starting with the conversion from Maybe to Either: </p> <p> <pre><span style="color:green;">//&nbsp;On&nbsp;Maybe:</span> <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IEither</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;ToEither&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IMaybe</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;source) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.Match&lt;<span style="color:#2b91af;">IEither</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;nothing:&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Left</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span 
style="color:#2b91af;">T</span>&gt;(<span style="color:#2b91af;">Unit</span>.Value), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;just:&nbsp;x&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Right</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;(x)); }</pre> </p> <p> Likewise, the conversion from Either to Maybe: </p> <p> <pre><span style="color:green;">//&nbsp;On&nbsp;Either:</span> <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IMaybe</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;ToMaybe&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IEither</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;source) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.Match&lt;<span style="color:#2b91af;">IMaybe</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;onLeft:&nbsp;_&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Nothing</span>&lt;<span style="color:#2b91af;">T</span>&gt;(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;onRight:&nbsp;x&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Just</span>&lt;<span style="color:#2b91af;">T</span>&gt;(x)); }</pre> </p> <p> You can convert back and forth to your heart's content, as this parametrised <a href="https://xunit.github.io">xUnit.net</a> 2.3.1 test shows: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(42)] [<span style="color:#2b91af;">InlineData</span>(1337)] [<span style="color:#2b91af;">InlineData</span>(2112)] [<span style="color:#2b91af;">InlineData</span>(90125)] <span 
style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;IsomorphicWithPopulatedMaybe(<span style="color:blue;">int</span>&nbsp;i) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;expected&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Right</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;(i); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;expected.ToMaybe().ToEither(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(expected,&nbsp;actual); }</pre> </p> <p> I decided to exclude <code>IEither&lt;Unit, T&gt;</code> from the overall theme of this article in order to better contrast three alternatives that may not otherwise look equivalent. That <code>IEither&lt;Unit, T&gt;</code> is isomorphic to <code>IMaybe&lt;T&gt;</code> is a well-known result. Besides, I think that both of these two representations already inhabit the same conceptual space. Either and Maybe are both well-known in statically typed functional programming. </p> <h3 id="8e3e7b55ac1e49568712675713426e59"> Summary <a href="#8e3e7b55ac1e49568712675713426e59" title="permalink">#</a> </h3> <p> The Tester-Doer pattern is a decades-old design pattern that attempts to model how to perform operations that can potentially fail, without relying on exceptions for flow control. It predates mainstream multi-core processors by decades, which can explain why it even exists as a pattern in the first place. At the time people arrived at the pattern, thread-safety wasn't a big concern. </p> <p> The Try-Parse idiom is a thread-safe alternative to the Tester-Doer pattern. It combines the two <em>tester</em> and <em>doer</em> methods into a single method with an <code>out</code> parameter. While thread-safe, it's not composable. </p> <p> <em>Maybe</em> offers the best of both worlds. It's both thread-safe and composable. 
It's also as discoverable as any Try-Parse method. </p> <p> These three alternatives are all, however, isomorphic. This means that you can refactor any of the three designs into one of the other designs, without loss of information. It also means that you can implement <a href="https://en.wikipedia.org/wiki/Adapter_pattern">Adapters</a> between particular implementations, should you so desire. You see this frequently in <a href="https://fsharp.org">F#</a> code, where functions that return <code>'a option</code> adapt Try-Parse methods from the .NET base class library. </p> <p> While all three designs are equivalent in the sense that you can translate one into another, it doesn't imply that they're equally useful. <em>Maybe</em> is the superior design, and Tester-Doer clearly inferior. </p> <p> <strong>Next:</strong> <a href="/2018/05/22/church-encoding">Church encoding</a>. </p> </div><hr> Payment types catamorphism https://blog.ploeh.dk/2019/07/08/payment-types-catamorphism 2019-07-08T06:08:00+00:00 Mark Seemann <div id="post"> <p> <em>You can find the catamorphism for a custom sum type. Here's an example.</em> </p> <p> This article is part of an <a href="/2019/04/29/catamorphisms">article series about catamorphisms</a>. A catamorphism is a <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstraction</a> that describes how to digest a data structure into a potentially more compact value. </p> <p> This article presents the catamorphism for a domain-specific <a href="https://en.wikipedia.org/wiki/Tagged_union">sum type</a>, as well as how to identify it. The beginning of this article presents the catamorphism in C#, with a few examples. The rest of the article describes how to deduce the catamorphism. This part of the article presents my work in <a href="https://www.haskell.org">Haskell</a>. 
Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix. </p> <p> In all previous articles in the series, you've seen catamorphisms for well-known data structures: <a href="/2019/05/06/boolean-catamorphism">Boolean values</a>, <a href="/2019/05/13/peano-catamorphism">Peano numbers</a>, <a href="/2019/05/20/maybe-catamorphism">Maybe</a>, <a href="/2019/06/10/tree-catamorphism">trees</a>, and so on. These are all general-purpose data structures, so you might be left with the impression that catamorphisms are only related to such general types. That's not the case. The point of this article is to demonstrate that you can find the catamorphism for your own custom, domain-specific sum type as well. </p> <h3 id="2b6f7df594c0474589ae9805f1e1a1d0"> C# catamorphism <a href="#2b6f7df594c0474589ae9805f1e1a1d0" title="permalink">#</a> </h3> <p> The custom type we'll examine in this article is the <a href="/2018/06/18/church-encoded-payment-types">Church-encoded payment types</a> I've previously written about. It's just an example of a custom data type, but it serves the purpose of illustration because I've already shown it as a Church encoding in C#, <a href="/2018/06/25/visitor-as-a-sum-type">as a Visitor in C#</a>, and <a href="/2016/11/28/easy-domain-modelling-with-types">as a discriminated union in F#</a>. 
</p> <p> The catamorphism for the <code>IPaymentType</code> interface is the <code>Match</code> method: </p> <p> <pre><span style="color:#2b91af;">T</span>&nbsp;Match&lt;<span style="color:#2b91af;">T</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">PaymentService</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;individual, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">PaymentService</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;parent, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">ChildPaymentService</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;child);</pre> </p> <p> As has turned out to be a common trait, the catamorphism is identical to the Church encoding. </p> <p> I'm not going to show more than a few examples of using the <code>Match</code> method, because you can find other examples in the previous articles, </p> <p> <pre>&gt; <span style="color:#2b91af;">IPaymentType</span>&nbsp;p&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Individual</span>(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">PaymentService</span>(<span style="color:#a31515;">&quot;Visa&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Pay&quot;</span>)); &gt; p.Match(ps&nbsp;=&gt;&nbsp;ps.Name,&nbsp;ps&nbsp;=&gt;&nbsp;ps.Name,&nbsp;cps&nbsp;=&gt;&nbsp;cps.PaymentService.Name) "Visa" &gt; <span style="color:#2b91af;">IPaymentType</span>&nbsp;p&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Parent</span>(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">PaymentService</span>(<span style="color:#a31515;">&quot;Visa&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Pay&quot;</span>)); &gt; 
p.Match(ps&nbsp;=&gt;&nbsp;ps.Name,&nbsp;ps&nbsp;=&gt;&nbsp;ps.Name,&nbsp;cps&nbsp;=&gt;&nbsp;cps.PaymentService.Name) "Visa" &gt; <span style="color:#2b91af;">IPaymentType</span>&nbsp;p&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Child</span>(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ChildPaymentService</span>(<span style="color:#a31515;">&quot;1234&quot;</span>,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">PaymentService</span>(<span style="color:#a31515;">&quot;Visa&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Pay&quot;</span>))); &gt; p.Match(ps&nbsp;=&gt;&nbsp;ps.Name,&nbsp;ps&nbsp;=&gt;&nbsp;ps.Name,&nbsp;cps&nbsp;=&gt;&nbsp;cps.PaymentService.Name) "Visa"</pre> </p> <p> These three examples from a <em>C# Interactive</em> session demonstrate that no matter which payment method you use, you can use the same <code>Match</code> method call to extract the payment name from the <code>p</code> object. </p> <h3 id="f2334a900eef421cb24c6e48a96e411b"> Payment types F-Algebra <a href="#f2334a900eef421cb24c6e48a96e411b" title="permalink">#</a> </h3> <p> As in the <a href="/2019/06/24/full-binary-tree-catamorphism">previous article</a>, I'll use <code>Fix</code> and <code>cata</code> as explained in <a href="https://bartoszmilewski.com">Bartosz Milewski</a>'s excellent <a href="https://bartoszmilewski.com/2017/02/28/f-algebras/">article on F-Algebras</a>. 
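</p> <p> For reference, those two definitions can be sketched as follows. This is a reconstruction consistent with the type of <code>cata</code> quoted later in this article, not a copy of the original source; the <code>unFix</code> accessor name is an assumption: </p> <p> <pre>-- Fixed point of a functor: one layer of f wrapped around itself.
newtype Fix f = Fix { unFix :: f (Fix f) }

-- Generic catamorphism: unwrap one layer, recurse into any children
-- via fmap, then collapse the layer with the algebra.
cata :: Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a
cata alg = alg . fmap (cata alg) . unFix</pre> </p> <p> For a non-recursive functor, <code>fmap</code> has no children to recurse into, so <code>cata</code> amounts to unwrapping the value and applying the algebra once. 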
</p> <p> First, you'll have to define the auxiliary types involved in this API: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;PaymentService&nbsp;=&nbsp;PaymentService&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;paymentServiceName&nbsp;::&nbsp;String &nbsp;&nbsp;,&nbsp;paymentServiceAction&nbsp;::&nbsp;String &nbsp;&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>) <span style="color:blue;">data</span>&nbsp;ChildPaymentService&nbsp;=&nbsp;ChildPaymentService&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;originalTransactionKey&nbsp;::&nbsp;String &nbsp;&nbsp;,&nbsp;parentPaymentService&nbsp;::&nbsp;PaymentService &nbsp;&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> While F-Algebras and fixed points are mostly used for recursive data structures, you can also define an F-Algebra for a non-recursive data structure. You already saw examples of that in the articles about <a href="/2019/05/06/boolean-catamorphism">Boolean catamorphism</a>, <a href="/2019/05/20/maybe-catamorphism">Maybe catamorphism</a>, and <a href="/2019/06/03/either-catamorphism">Either catamorphism</a>. 
While each of the three payment types has associated data, none of it is parametrically polymorphic, so a single type argument for the carrier type suffices: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;PaymentTypeF&nbsp;c&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;IndividualF&nbsp;PaymentService
&nbsp;&nbsp;|&nbsp;ParentF&nbsp;PaymentService
&nbsp;&nbsp;|&nbsp;ChildF&nbsp;ChildPaymentService
&nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>)

<span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;<span style="color:blue;">PaymentTypeF</span>&nbsp;<span style="color:blue;">where</span>
&nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;(IndividualF&nbsp;ps)&nbsp;=&nbsp;IndividualF&nbsp;ps
&nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ParentF&nbsp;ps)&nbsp;=&nbsp;ParentF&nbsp;ps
&nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ChildF&nbsp;cps)&nbsp;=&nbsp;ChildF&nbsp;cps</pre> </p> <p> I chose to call the carrier type <code>c</code> (for <em>carrier</em>). As was also the case with <code>BoolF</code>, <code>MaybeF</code>, and <code>EitherF</code>, the <code>Functor</code> instance ignores the map function because the carrier type is missing from all three cases. Like the <code>Functor</code> instances for <code>BoolF</code>, <code>MaybeF</code>, and <code>EitherF</code>, it'd seem that nothing happens, but at the type level, this is still a translation from <code>PaymentTypeF c</code> to <code>PaymentTypeF c1</code>. Not much of a function, perhaps, but definitely an <em>endofunctor</em>. 
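</p> <p> To make that type-level translation concrete, consider the following self-contained sketch. It repeats the definitions from above so that it compiles on its own; the <code>retag</code> function and its choice of <code>Int</code> and <code>String</code> are hypothetical, picked only for illustration: </p> <p> <pre>data PaymentService = PaymentService {
    paymentServiceName :: String
  , paymentServiceAction :: String
  } deriving (Show, Eq, Read)

data ChildPaymentService = ChildPaymentService {
    originalTransactionKey :: String
  , parentPaymentService :: PaymentService
  } deriving (Show, Eq, Read)

data PaymentTypeF c =
    IndividualF PaymentService
  | ParentF PaymentService
  | ChildF ChildPaymentService
  deriving (Show, Eq, Read)

instance Functor PaymentTypeF where
  fmap _ (IndividualF ps) = IndividualF ps
  fmap _     (ParentF ps) = ParentF ps
  fmap _     (ChildF cps) = ChildF cps

-- Hypothetical helper: the mapped function (show) is never called,
-- because no case contains a value of the carrier type.
-- Only the type index changes, from Int to String.
retag :: PaymentTypeF Int -&gt; PaymentTypeF String
retag = fmap show</pre> </p> <p> The data inside each case passes through untouched, so <code>retag (ParentF ps)</code> carries the same <code>ps</code> as its input; only the compiler's view of the carrier type differs. 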
</p> <p> Some helper functions make it a little easier to create <code>Fix PaymentTypeF</code> values, but there's really not much to them: </p> <p> <pre><span style="color:#2b91af;">individualF</span>&nbsp;::&nbsp;<span style="color:blue;">PaymentService</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">PaymentTypeF</span> individualF&nbsp;=&nbsp;Fix&nbsp;.&nbsp;IndividualF <span style="color:#2b91af;">parentF</span>&nbsp;::&nbsp;<span style="color:blue;">PaymentService</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">PaymentTypeF</span> parentF&nbsp;=&nbsp;Fix&nbsp;.&nbsp;ParentF <span style="color:#2b91af;">childF</span>&nbsp;::&nbsp;<span style="color:blue;">ChildPaymentService</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">PaymentTypeF</span> childF&nbsp;=&nbsp;Fix&nbsp;.&nbsp;ChildF</pre> </p> <p> That's all you need to identify the catamorphism. </p> <h3 id="da3c2c0fee2747bebb1db38c15110bcb"> Haskell catamorphism <a href="#da3c2c0fee2747bebb1db38c15110bcb" title="permalink">#</a> </h3> <p> At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (<code>PaymentTypeF</code>), and an object <code>c</code>, but you still need to find a morphism <code>PaymentTypeF c -&gt; c</code>. 
</p> <p> As in the previous articles, start by writing a function that will become the catamorphism, based on <code>cata</code>: </p> <p> <pre>paymentF&nbsp;=&nbsp;cata&nbsp;alg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(IndividualF&nbsp;ps)&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ParentF&nbsp;ps)&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ChildF&nbsp;cps)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> While this compiles, with its <code>undefined</code> implementations, it obviously doesn't do anything useful. I find, however, that it helps me think. How can you return a value of the type <code>c</code> from the <code>IndividualF</code> case? You could pass an argument to the <code>paymentF</code> function, but you shouldn't ignore the data <code>ps</code> contained in the case, so it has to be a function: </p> <p> <pre>paymentF&nbsp;fi&nbsp;=&nbsp;cata&nbsp;alg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(IndividualF&nbsp;ps)&nbsp;=&nbsp;fi&nbsp;ps &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ParentF&nbsp;ps)&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ChildF&nbsp;cps)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> I chose to call the argument <code>fi</code>, for <em>function, individual</em>. 
You can pass a similar argument to deal with the <code>ParentF</code> case: </p> <p> <pre>paymentF&nbsp;fi&nbsp;fp&nbsp;=&nbsp;cata&nbsp;alg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(IndividualF&nbsp;ps)&nbsp;=&nbsp;fi&nbsp;ps &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ParentF&nbsp;ps)&nbsp;=&nbsp;fp&nbsp;ps &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ChildF&nbsp;cps)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> And of course with the remaining <code>ChildF</code> case as well: </p> <p> <pre><span style="color:#2b91af;">paymentF</span>&nbsp;::&nbsp;(<span style="color:blue;">PaymentService</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">PaymentService</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">ChildPaymentService</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">Fix&nbsp;PaymentTypeF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c paymentF&nbsp;fi&nbsp;fp&nbsp;fc&nbsp;=&nbsp;cata&nbsp;alg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(IndividualF&nbsp;ps)&nbsp;=&nbsp;fi&nbsp;ps &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ParentF&nbsp;ps)&nbsp;=&nbsp;fp&nbsp;ps &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ChildF&nbsp;cps)&nbsp;=&nbsp;fc&nbsp;cps</pre> </p> <p> This works. 
Since <code>cata</code> has the type <code>Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a</code>, that means that <code>alg</code> has the type <code>f a -&gt; a</code>. In the case of <code>PaymentTypeF</code>, the compiler infers that the <code>alg</code> function has the type <code>PaymentTypeF c -&gt; c</code>, which is just what you need! </p> <p> You can now see what the carrier type <code>c</code> is for. It's the type that the algebra extracts, and thus the type that the catamorphism returns. </p> <p> This, then, is the catamorphism for the payment types. Except for the <a href="/2019/06/10/tree-catamorphism">tree catamorphism</a>, all catamorphisms so far have been pairs, but this one is a triplet of functions. This is because the sum type has three cases instead of two. </p> <p> As you've seen repeatedly, this isn't the only possible catamorphism, since you can, for example, trivially reorder the arguments to <code>paymentF</code>. The version shown here is, however, equivalent to the above C# <code>Match</code> method. </p> <h3 id="e6248a9ea34148c79c2b03acc92de5f7"> Usage <a href="#e6248a9ea34148c79c2b03acc92de5f7" title="permalink">#</a> </h3> <p> You can use the catamorphism as a basis for other functionality. 
If, for example, you want to convert a <code>Fix PaymentTypeF</code> value to JSON, you can first define an <a href="http://hackage.haskell.org/package/aeson/docs/Data-Aeson.html">Aeson</a> record type for that purpose: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;PaymentJson&nbsp;=&nbsp;PaymentJson&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;name&nbsp;::&nbsp;String &nbsp;&nbsp;,&nbsp;action&nbsp;::&nbsp;String &nbsp;&nbsp;,&nbsp;startRecurrent&nbsp;::&nbsp;Bool &nbsp;&nbsp;,&nbsp;transactionKey&nbsp;::&nbsp;Maybe&nbsp;String &nbsp;&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Generic</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">ToJSON</span>&nbsp;<span style="color:blue;">PaymentJson</span></pre> </p> <p> Subsequently, you can use <code>paymentF</code> to implement a conversion from <code>Fix PaymentTypeF</code> to <code>PaymentJson</code>, as in the previous articles: </p> <p> <pre><span style="color:#2b91af;">toJson</span>&nbsp;::&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">PaymentTypeF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">PaymentJson</span> toJson&nbsp;= &nbsp;&nbsp;paymentF &nbsp;&nbsp;&nbsp;&nbsp;(\(PaymentService&nbsp;n&nbsp;a)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;-&gt;&nbsp;PaymentJson&nbsp;n&nbsp;a&nbsp;False&nbsp;Nothing) &nbsp;&nbsp;&nbsp;&nbsp;(\(PaymentService&nbsp;n&nbsp;a)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;-&gt;&nbsp;PaymentJson&nbsp;n&nbsp;a&nbsp;True&nbsp;Nothing) 
&nbsp;&nbsp;&nbsp;&nbsp;(\(ChildPaymentService&nbsp;k&nbsp;(PaymentService&nbsp;n&nbsp;a))&nbsp;-&gt;&nbsp;PaymentJson&nbsp;n&nbsp;a&nbsp;False&nbsp;$&nbsp;Just&nbsp;k)</pre> </p> <p> Testing it in GHCi, it works as it's supposed to: </p> <p> <pre>Prelude Data.Aeson B Payment&gt; B.putStrLn $ encode $ toJson $ parentF $ PaymentService "Visa" "Pay"
{"transactionKey":null,"startRecurrent":true,"action":"Pay","name":"Visa"}</pre> </p> <p> Clearly, it would have been easier to define the payment types shown here as a regular Haskell sum type and just use standard pattern matching, but the purpose of this article isn't to present useful code; the only purpose of the code here is to demonstrate how to identify the catamorphism for a custom domain-specific sum type. </p> <h3 id="153479fffaf647f6ad6f5fc6a63fe025"> Summary <a href="#153479fffaf647f6ad6f5fc6a63fe025" title="permalink">#</a> </h3> <p> Even custom, domain-specific sum types have catamorphisms. This article presented the catamorphism for a custom payment sum type. Because this particular sum type has three cases, the catamorphism is a triplet, instead of a pair, which has otherwise been the most common shape of catamorphisms in previous articles. </p> <p> <strong>Next:</strong> <a href="/2018/03/05/some-design-patterns-as-universal-abstractions">Some design patterns as universal abstractions</a>. </p> </div><hr> 
Yes silver bullet https://blog.ploeh.dk/2019/07/01/yes-silver-bullet 2019-07-01T07:38:00+00:00 Mark Seemann <div id="post"> <p> <em>Since Fred Brooks published his essay, I believe that we, contrary to his prediction, have witnessed several silver bullets.</em> </p> <p> I've been rereading <a href="https://en.wikipedia.org/wiki/Fred_Brooks">Fred Brooks</a>'s 1986 essay <a href="https://en.wikipedia.org/wiki/No_Silver_Bullet">No Silver Bullet</a> because I've become increasingly concerned that people seem to draw the wrong conclusions from it. <a href="https://martinfowler.com/bliki/SemanticDiffusion.html">Semantic diffusion</a> seems to have set in. These days, when people state something along the lines that there's <em>no silver bullet in software development</em>, I often get the impression that they mean that there's no panacea. </p> <p> Indeed; I agree. There's no miracle cure that will magically make all problems in software development go away. That's not what the essay states, however. It is, fortunately, more subtle than that. </p> <h3 id="712292e6c9c34663801dd40b4f278d3d"> No silver bullet reread <a href="#712292e6c9c34663801dd40b4f278d3d" title="permalink">#</a> </h3> <p> It's a great essay. It's not my intent to dispute the central argument of the essay, but I think that Brooks made one particular assumption that I disagree with. That doesn't make me smarter in any way. He wrote the essay in 1986. I'm writing this in 2019, with the benefit of the experience of all the years in-between. Hindsight is 20-20, so anyone could make the observations that I do here. </p> <p> Before we get to that, though, a brief summary of the essence of the essay is in order. In short, the conclusion is this: <blockquote> <p> "There is no single development, in either technology or management technique, which by itself promises even one order-of-magnitude improvement within a decade in productivity, in reliability, in simplicity." 
</p> <footer><cite>Fred Brooks, <em>No Silver Bullet</em>, 1986</cite></footer> </blockquote> The beginning of the essay is a brilliant analysis of the reasons why software development is inherently difficult. If you read this together with Jack Reeves's <em>What Is Software Design?</em> (available in various places on the internet, or as an appendix in <a href="http://amzn.to/19W4JHk">APPP</a>), you'll probably agree that there's an inherent complexity to software development that no invention is likely to dispel. </p> <p> Ostensibly in the tradition of <a href="https://en.wikipedia.org/wiki/Aristotle">Aristotle</a>, Brooks distinguishes between <em>essential</em> and <em>accidental</em> complexity. This distinction is central to his argument, so it's worth discussing for a minute. </p> <p> Software development problems are complex, i.e. made up of many interacting sub-problems. Some of that complexity is <em>accidental</em>. This doesn't imply randomness or sloppiness, but only that the complexity isn't inherent to the problem; that it's only the result of our (human) failure to achieve perfection. </p> <p> If you imagine that you could whittle away all the accidental complexity, you'd ultimately reach a point where, in the words of Saint-Exupéry, <em>there is nothing more to remove</em>. What's left is the <em>essential</em> complexity. </p> <p> Brooks' conjecture is that a typical software development project comes with both essential and accidental complexity. In his 1995 reflections <em>"No Silver Bullet" Refired</em> (available in <a href="http://bit.ly/mythical-man-month">The Mythical Man-Month</a>), he clarifies what he already implied in 1986: <blockquote> <p> "It is my opinion, and that is all, that the accidental or representational part of the work is now down to about half or less of the total." </p> <footer><cite>Fred Brooks, <em>"No Silver Bullet" Refired</em>, 1995</cite></footer> </blockquote> This I fundamentally disagree with, but more on that later. 
It makes sense to me to graphically represent the argument like this: </p> <p> <img src="/content/binary/essential-accidental-complexity-shells-brooks-scenario.png" alt="Some, but not much, accidental complexity as a shell around essential complexity."> </p> <p> The way that I think of Brooks' argument is that any software project contains some essential and some accidental complexity. For a given project, the size of the essential complexity is fixed. </p> <p> Brooks believes that less than half of the overall complexity is accidental: </p> <p> <img src="/content/binary/essential-accidental-complexity-pie-chart-brooks-scenario.png" alt="Essential and accidental complexity pie chart."> </p> <p> While a pie chart better illustrates the supposed ratio between the two types of complexity, I prefer to view Brooks' arguments as the first diagram, above. In that visualisation, the essential complexity is a core of fixed size, while accidental complexity is something you can work at removing. If you keep improving your process and technology, you may, conceptually, be able to remove (almost) all of it. </p> <p> <img src="/content/binary/essential-almost-no-accidental-complexity-shells.png" alt="Essential complexity with a very thin shell of accidental complexity."> </p> <p> Brooks' point, with which I agree, is that if the essential complexity is inherent, then you can't reduce the size of it. The only way to decrease the overall complexity is to reduce the accidental complexity. </p> <p> If you agree with the assessment that less than half of the overall complexity in modern software development is accidental, then it follows that no dramatic improvements are available. Even if you remove all accidental complexity, you've only reduced overall complexity by, say, forty percent. </p> <h3 id="d8e6f84d104b4ff6ad6b5473e46a4e30"> Accidental complexity abounds <a href="#d8e6f84d104b4ff6ad6b5473e46a4e30" title="permalink">#</a> </h3> <p> I find Brooks' arguments compelling. 
I do not, however, accept the premise that there's only a little accidental complexity left. Instead of the above diagrams, I believe that the situation looks more like this (not to scale): </p> <p> <img src="/content/binary/accidental-complexity-with-tiny-core-of-essential-complexity.png" alt="Accidental complexity with a tiny core of essential complexity."> </p> <p> I think that most of the complexity in software development is accidental. I'm not sure about today, but I believe that I have compelling evidence that this was the case in 1986, so I don't see why it shouldn't still be the case. </p> <p> To be clear, this is all anecdotal, since I don't believe that software development is quantifiable. In the essay, Brooks explicitly talks about the <em>invisibility</em> of software. Software is pure <em>thought stuff;</em> you can't measure it. I discuss this in my <a href="https://cleancoders.com/episode/humane-code-real-episode-1/show">Humane Code video</a>, but I also recommend that you read <a href="http://bit.ly/leprechauns-of-software-engineering">The Leprechauns of Software Engineering</a> if you have any illusions that we, as an industry, have any reliable measurements of productivity. </p> <p> Brooks predicted that, within the decade (from 1986 to 1996), there would be no single development that would increase productivity by an order of magnitude, i.e. by a factor of at least ten. Ironically, when he wrote <em>"No Silver Bullet" Refired</em> in 1995, at least two such developments were already in motion. </p> <p> We can't blame Brooks for not identifying those developments, because in 1995, their impact was not yet apparent. Again, hindsight is 20-20. </p> <p> Neither of these two developments is purely technological, although technology plays a role. Notice, though, that Brooks' prediction included <em>technology or management technique</em>. It's in the interaction between technology and the humane that the orders-of-magnitude developments emerged. 
</p> <h3 id="1d23f6fb89884b6d9833ce09d68a3b0f"> World Wide Web <a href="#1d23f6fb89884b6d9833ce09d68a3b0f" title="permalink">#</a> </h3> <p> I have a dirty little secret. At the beginning of my programming career, I became quite the expert on a programming framework called <a href="https://en.wikipedia.org/wiki/Microsoft_Commerce_Server">Microsoft Commerce Server</a>. In fact, I co-authored a chapter of <a href="https://amzn.to/2CpE4rr">Professional Commerce Server 2000 Programming</a>, and in 2003 I received an <a href="https://mvp.microsoft.com">MVP</a> award as an acknowledgement of my work in the Commerce Server community (such as it was; it was mostly on <a href="https://en.wikipedia.org/wiki/Usenet">Usenet</a>). </p> <p> The Commerce Server framework was a black box. This was long before Microsoft embraced open source, and while there was a bit of official documentation, it was superficial; it was mostly of the <em>getting-started</em> kind. </p> <p> Over several years, I managed to figure out how the framework really worked, and thus, how one could extend it. This was a painstaking process. Since it was a black box, I couldn't just go and read the code to figure out how it worked. The framework was written in C++ and Visual Basic, so there wasn't even IL code to decompile. </p> <p> I had one window into the framework. It relied on SQL Server, and I could attach the profiler tool to spy on its interaction with the database. Painstakingly, over several years, I managed to wrest the framework's secrets from it. </p> <p> I wasted much time doing detective work like that. </p> <p> In general, programming in the late nineties and early two-thousands was less productive, not because the languages or tools were orders-of-magnitude worse than today, but because when you hit a snag, you were in trouble. </p> <p> These days, if you run into a problem beyond your abilities, you can ask for help on the World Wide Web. 
Usually, you'll find an existing answer on <a href="https://stackoverflow.com">Stack Overflow</a>, and you'll be able to proceed without too much delay. </p> <p> Compared to twenty years ago, I believe that the World Wide Web has increased my productivity more than ten-fold. While it also existed in 1995, there wasn't much content. It's not the technology itself that provides the productivity increase, but rather the synergy of technology and human knowledge. </p> <p> I think that Brooks vastly underestimated how much time one can waste when one is stuck. That's a sort of accidental complexity, although in the development process rather than in the technology itself. </p> <h3 id="a3b19483cd6a4c509d8c3a77fe324872"> Automated testing <a href="#a3b19483cd6a4c509d8c3a77fe324872" title="permalink">#</a> </h3> <p> In the late nineties, I was developing web sites (with Commerce Server). When I wanted to run my code to see if it worked, I'd launch the web site on my laptop, log in, click around and enter data until I was convinced that the functionality was working as it should. Most of the time, however, it wasn't, so I'd change a bit of the code, and go through the same process again. </p> <p> I think that's a common way to 'test' software; at least, it was back then. </p> <p> While you could get good at going through these motions quickly, verifying a single functionality, or a handful of related ones, could easily take at least a couple of seconds, and usually more like half a minute. </p> <p> If you had dozens, or even hundreds, of different scenarios to address, you obviously wouldn't run through them all every time you changed the code. At the very best, you'd click your way through three or four usage scenarios that you thought were relevant to the change you'd made. Other functionality, earlier declared <em>done</em>, you just considered to be unaffected. </p> <p> Needless to say, regressions were regular occurrences. 
</p> <p> In 2003 I discovered test-driven development, and through that, automated testing. While you can't directly compare unit tests with whole usage scenarios, I think it's fair to compare something like automated integration tests or user-scenario tests (whatever you want to call them) with manually clicking through an application. </p> <p> Even an integration test, if written properly, can verify a scenario <em>at least</em> ten times faster than you can do it by hand. A more realistic estimate is probably a hundred times faster, or more. </p> <p> Granted, you have to write the automated test as well, and I know that it's not always trivial. Still, once you have an automated test suite in place, you can run it all the time. </p> <p> I never ran through <em>all</em> usage scenarios when I manually 'tested' my software. With automated tests, I do. This saves me from most regressions. </p> <p> This improvement is, in my opinion, a no-brainer. It's easily a factor-of-ten improvement. All the time wasted manually 'testing' the software, plus the time wasted fixing regressions, can be put to better use. </p> <p> At the time Brooks was writing his own retrospective (in 1995), Kent Beck was beginning to talk to other people about test-driven development. As is a common theme in this article, hindsight is 20-20. </p> <h3 id="c7ca9269cce04b3ab934c97bc8cf0328"> Honourable mentions <a href="#c7ca9269cce04b3ab934c97bc8cf0328" title="permalink">#</a> </h3> <p> There have been other improvements in software development since 1986. I considered including several other improvements as bona fide orders-of-magnitude improvements, but I think that's probably going too far. Each of the following developments has, however, offered significant improvements: <ul> <li> <strong>Git.</strong> It's surprising how much more productive Git can make you. 
While it's somewhat better than centralised source control systems at the functionality also available with those other systems, the productivity increase comes from all the new, unanticipated workflows it enables. Before I started using DVCS, I'd have lots of code that was commented out, so that I could experiment with various alternatives. With Git, I just create a new branch, or stash my changes, and experiment with abandon. While it's probably not a ten-fold increase in productivity, I believe it's the simplest technology change you can make to dramatically increase your productivity. </li> <li> <strong>Garbage collection.</strong> Since I've admitted that I worked with Microsoft Commerce Server, I've probably lost all credibility with my reader already, but let's see if I can win back a little. While Commerce Server programming involved <a href="https://en.wikipedia.org/wiki/VBScript">VBScript</a> programming, it also often involved <a href="https://en.wikipedia.org/wiki/Component_Object_Model">COM</a> programming, and I did quite a bit of that in C++. Having to make sure that you've cleaned up all memory after use is a bother. Garbage collection just makes this work go away. It's hardly a ten-fold improvement in productivity, but I do find it significant. </li> <li> <strong>Agile software development.</strong> The methodology of decreasing the feedback time between implementation and deployment has made me much more productive. I'm not interested in peddling any particular methodology like Scrum as much as just the general concept of getting rapid feedback. Particularly if you combine continuous delivery with Git, you have a powerful combination. Brooks already talked about incremental software development, and had some hopes attached to this as well. My personal experience can only agree with his sentiment. 
Again, probably not in itself a ten-fold increase in productivity, but enough that I wouldn't want to work on a project where rapid feedback and incremental development wasn't valued. </li> </ul> I'm probably forgetting lots of other improvements that have happened in the last decades. That's fine. The purpose of this article isn't to produce an exhaustive list, but rather to make the argument that significant improvements have been made since Brooks wrote his essay. I think it'd be folly, then, to believe that we've seen the last of such improvements. </p> <p> Personally, I'm inclined to believe another order-of-magnitude improvement is right at our feet. </p> <h3 id="bd2d47d8dac2401e936ca7902bc9109d"> Statically typed functional programming <a href="#bd2d47d8dac2401e936ca7902bc9109d" title="permalink">#</a> </h3> <p> This section is conjecture on my part. The improvements I've so far covered are already realised (at least for those who choose to take advantage of them). The improvement I'll cover here is more speculative. </p> <p> I believe that statically typed functional programming offers another order-of-magnitude improvement over existing software development. Twenty years ago, I believed that object-oriented programming was a good idea. I now believe that I was wrong about that, so it's possible that in another twenty years, I'll also believe that I was wrong about functional programming. Take the following for what it is. </p> <p> When I carefully reread <em>No Silver Bullet</em>, I got the distinct impression that Brooks considered low-level details of programming part of its essential complexity: <blockquote> <p> "Much of the complexity in a software construct is, however, not due to conformity to the external world but rather to the implementation itself - its data structures, its algorithms, its connectivity." 
</p> <footer><cite>Fred Brooks, <em>"No Silver Bullet" Refired</em>, 1995</cite></footer> </blockquote> It's unreasonable to blame anyone writing in 1986, or 1995 for that matter, for thinking that <code>for</code> loops, variables, program state, and such other programming staples were anything but essential parts of the complexity of developing software. </p> <p> Someone, unfortunately I forget who, once made the point that all mainstream programming languages are layers of abstractions of how a CPU works. Assembly language is basically just mnemonics on top of a CPU instruction set, then C can be thought of as an abstraction over assembly language, C++ as the next step in abstraction, Java and C# as sort of abstractions of C++, and so on. The origin of the design is the physical CPU. You could say that these languages are designed in a bottom-up fashion. </p> <p> <img src="/content/binary/imperative-bottom-up-functional-top-down.png" alt="Imperative languages depicted as designed bottom-up, and functional languages as designed top-down."> </p> <p> Some functional languages (perhaps most famously <a href="https://www.haskell.org">Haskell</a>, but also <a href="https://en.wikipedia.org/wiki/APL_(programming_language)">APL</a>, and, possibly, <a href="https://en.wikipedia.org/wiki/Lisp_(programming_language)">Lisp</a>) are designed in a much more top-down fashion. You start with mathematical abstractions like <a href="https://en.wikipedia.org/wiki/Category_theory">category theory</a> and then figure out how to crystallise the theory into a programming language, and then again, via more layers of abstractions, how to turn the abstract language into machine code. </p> <p> The more you learn about the <a href="https://en.wikipedia.org/wiki/Pure_function">pure</a> functional alternative to programming, the more you begin to see mutable program state, variables, <code>for</code> loops, and similar language constructs merely as artefacts of the underlying model. 
Brooks, I think, thought of these as part of the essential complexity of programming. I don't think that that's the case. You can get by just fine with other abstractions instead. </p> <p> Besides, Brooks writes, under the heading of <em>Complexity:</em> <blockquote> <p> "From the complexity comes the difficulty of enumerating, much less understanding, all the possible states of the program, and from that comes the unreliability. From the complexity of the functions comes the difficulty of invoking those functions, which makes programs hard to use." </p> <footer><cite>Fred Brooks, <em>No Silver Bullet</em>, 1986</cite></footer> </blockquote> When he writes <em>functions</em>, I don't think that he means functions in the Haskell sense. I think that he means <em>operations</em>, <em>procedures</em>, or <em>methods</em>. </p> <p> Indeed, when you look at a C# method signature like the following, it's hard to enumerate, understand, or remember, all that it does: </p> <p> <pre><span style="color:blue;">int</span>?&nbsp;TryAccept(<span style="color:#2b91af;">Reservation</span>&nbsp;reservation);</pre> </p> <p> If this is a high-level function, many things could happen when you call that method. It could change the state of a database. It could send an email. It could mutate a variable. Not only that, but the behaviour could depend on non-deterministic factors, such as the date, time of day, or just raw randomness. Finally, how should you handle the return value? What does it mean if the return value is <em>null</em>? What if it's not? Is <code>0</code> a valid value? Are negative numbers valid? Are they different from positive values? </p> <p> It is, indeed, difficult to enumerate all the possible states of such a function. 
</p> <p> Consider, instead, a Haskell function with a type like this: </p> <p> <pre><span style="color:#2b91af;">tryAccept</span>&nbsp;::&nbsp;<span style="color:#2b91af;">Int</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Reservation</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">MaybeT</span>&nbsp;<span style="color:blue;">ReservationsProgram</span>&nbsp;<span style="color:#2b91af;">Int</span></pre> </p> <p> What happens if you invoke this function? It returns a value. Does it send any emails? Does it mutate any state? No, it can't, because the static type informs us that this is a pure function. If any programmer, anywhere inside of the function, or the functions it calls, or functions they call, etc. tried to do something impure, it wouldn't have compiled. </p> <p> Can we enumerate the states of the program? Certainly. We just have to figure out what <code>ReservationsProgram</code> is. After following a few types, we find this statically typed enumeration: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;ReservationsInstruction&nbsp;next&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;IsReservationInFuture&nbsp;Reservation&nbsp;(Bool&nbsp;-&gt;&nbsp;next) &nbsp;&nbsp;|&nbsp;ReadReservations&nbsp;UTCTime&nbsp;([Reservation]&nbsp;-&gt;&nbsp;next) &nbsp;&nbsp;|&nbsp;Create&nbsp;Reservation&nbsp;(Int&nbsp;-&gt;&nbsp;next) &nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;Functor</pre> </p> <p> Essentially, there's three 'actions' that this type enables. The <code>tryAccept</code> function returns the <code>ReservationsProgram</code> inside of a <code>MaybeT</code> container, so there's a fourth option that something short-circuits along the way. </p> <p> You don't even have to keep track of this yourself. The compiler keeps you honest. Whenever you invoke the <code>tryAccept</code> function, the compiler will insist that you write code that can handle all possible outcomes. 
If you turn on the right compiler flags, the code is not going to compile if you don't. </p> <p> (Both code examples are taken from <a href="https://github.com/ploeh/dependency-injection-revisited">the same repository</a>.) </p> <p> Haskellers jokingly declare that <em>if Haskell code compiles, it works</em>. While humorous, there's a kernel of truth in that. An advanced type system can carry much information about the behaviour of a program. Some people, particularly programmers who come from a dynamically typed background, find Haskell's type system rigid. That's not an unreasonable criticism, but often, in dynamically typed languages, you have to write many automated tests to ensure that your program behaves as desired, and that it correctly handles various edge cases. A type system like Haskell's, on the other hand, embeds those rules in types instead of in tests. </p> <p> While you should still write automated tests for Haskell programs, fewer are needed. How many fewer? Compared to C-based languages, a factor ten isn't an unreasonable guess. </p> <p> After a few false starts, in 2014 I finally decided that <a href="https://fsharp.org">F#</a> would be my default choice of language on .NET. The reason for that decision was that I felt so much more productive in F# compared to C#. While F#'s type system doesn't embed information about pure versus impure functions, it does support <a href="https://en.wikipedia.org/wiki/Tagged_union">sum types</a>, which is what enables the sort of compile-time <em>enumeration</em> that Brooks discusses. </p> <p> F# is still my .NET language of choice, but I find that I mostly 'think in' Haskell these days. My conjecture is that a sufficiently advanced type system (like Haskell's) could easily represent another order-of-magnitude improvement over mainstream imperative languages. 
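</p> <p> To make the idea of compile-time enumeration concrete, here is a minimal Haskell sketch. The <code>Outcome</code> type and the <code>describe</code> function are hypothetical names invented for this illustration; they aren't taken from the repository linked above. The sketch assumes GHC with the <code>-Wincomplete-patterns</code> warning turned into an error via <code>-Werror</code>, so that deleting any one of the three equations of <code>describe</code> should make compilation fail: </p> <p>

```haskell
{-# OPTIONS_GHC -Wincomplete-patterns -Werror #-}

-- Hypothetical sum type (not from the original repository): every possible
-- result of trying to accept a reservation is an explicit constructor.
data Outcome
  = Accepted Int -- the ID of the newly created reservation
  | InThePast    -- the requested date has already passed
  | SoldOut      -- not enough remaining capacity
  deriving (Show, Eq)

-- Outcome is a closed sum type, so the compiler can check that this
-- function handles every case. Remove one equation and, with the flags
-- above, the module is rejected.
describe :: Outcome -> String
describe (Accepted i) = "Accepted with ID " ++ show i
describe InThePast    = "Rejected: the date is in the past"
describe SoldOut      = "Rejected: no remaining capacity"
```
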
</p> <h3 id="a75ae35933314755b1a0cdb665262bc5"> Improvements for those who want them <a href="#a75ae35933314755b1a0cdb665262bc5" title="permalink">#</a> </h3> <p> The essay <em>No Silver Bullet</em> is a perspicacious work. I think more people should read at least the first part, where Brooks explains why software development is hard. I find that analysis brilliant, and I agree: software development presupposes essential complexity. It's inherently hard. </p> <p> There's no reason to make it harder than it has to be, though. </p> <p> More than once, I've discussed productivity improvements with people, only to be met with the dismissal that 'there's no silver bullet'. </p> <p> Granted, there's no magical solution that will solve all problems with software development, but that doesn't mean that improvements can't be had. </p> <p> Consider the improvements I've argued for here. Everyone now uses the World Wide Web and sites like Stack Overflow for research; that particular improvement is firmly embedded in all organisations. On the other hand, I still regularly talk to organisations that don't routinely use automated testing. </p> <p> People still use centralised version control (like TFS or SVN). If there was ever a low-hanging fruit, changing to Git is one. Git is <em>free</em>, and there's plenty of tools you can use to migrate your version history to it. There's also plenty of training and help to be had. Yes, it'll require a small investment to make the change, but the productivity increase is significant. <blockquote> <p> "The future is already here — it's just not very evenly distributed." </p> <footer><cite>William Gibson</cite></footer> </blockquote> So it is with technology improvements. Automated testing is available, but not ubiquitous. Git is free, but still organisations stick to suboptimal version control. Haskell and F# are mature languages, yet programmers still program in C# or Java. 
</p> <h3 id="864e39a22bc84129bfecaafe33dd1757"> Summary <a href="#864e39a22bc84129bfecaafe33dd1757" title="permalink">#</a> </h3> <p> The essay <em>No Silver Bullet</em> was written in 1986, but seems to me to be increasingly misunderstood. When people today talk about it at all, it's mostly as an excuse to stay where they are. "There's no silver bullets," they'll say. </p> <p> The essay, however, doesn't argue that no improvements can be had. It only argues that no more order-of-magnitude improvements can be had. </p> <p> In the present essay I argue that, since Brooks wrote <em>No Silver Bullet</em>, more than one such improvement happened. Once the World Wide Web truly began furnishing <em>information at your fingertips</em>, you could be more productive because you wouldn't be <em>stuck</em> for days or weeks. Automated testing reduces the work that manual testers used to perform, as well as limiting regressions. </p> <p> If you accept my argument, that order-of-magnitude improvements appeared after 1986, this implies that Brooks' premise was wrong. In that case, there's no reason to believe that we've seen the last significant improvement to software development. </p> <p> I think that more such improvements await us. I suggest that statically typed functional programming offers such an advance, but if history teaches us anything, it seems that breakthroughs tend to be unpredictable. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="7e7e932f5eea47f3bab328c58e9d164a"> <div class="comment-author"><a href="http://blog.strobaek.org">Karsten Strøbæk</a></div> <div class="comment-content"> <p> As always I enjoy reading your blog, even though I don't understand half of it most of the time. Or is that most of it half of the time? Allow me to put a few observations forward. </p> <p> First I should confess that I have actually not read the whole of Brooks' essay. 
When I initially tried I got about half way through; it sounds like I should make another go at it. That of course will not stop me from commenting on the above. </p> <p> Brooks talks about complexity. To me designing and implementing a software system is not complex. Quantum physics is complex. Flying an airplane is difficult. Software development may be difficult depending on the task at hand (and unfortunately the qualifications of the team), but I would argue that it at most falls into the same category as flying an airplane. </p> <p> I would probably also state that there are no silver bullets. But like you I feel that people understand it incorrectly and there is definitely no reason for making things harder than they are. I think the examples of technology that helps are excellent and exactly describe that things do move forward. </p> <p> That being said, it does not take away the creativity of the right decomposition, the responsibility for getting the use cases right, and especially the liability for getting it wrong. Sadly, especially the last is overlooked. People should be reminded of where the phrase 'live under the bridge' comes from. </p> <p> To end my ramblings, I would also look a little into the future. As you know I am somewhat sceptical about machine learning and AI. However, looking at the recent breakthroughs and use cases in these areas, I would not be surprised by a future where software development is done by 'an AI' assembling pre-defined 'entities' to create the software we need. Like an F16 cannot be flown without a computer, future software cannot be created by a human. </p> </div> <div class="comment-date">2019-07-04 18:29:00 UTC</div> </div> <div class="comment" id="756066e5cb0e42368ff9eeb9569fa47f"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Karsten, thank you for writing. 
I'm not inclined to agree that software development falls into the same category of complexity as flying a plane. It seems to me to be orders of magnitude more complex. </p> <p> Just look at error rates. </p> <p> Would you ever board an airplane if flying had error rates similar to those observed in software development? Would you fly even if only one percent of all flights ended with a plane crash? </p> <p> In reality, flying is extremely safe. Would you claim that software development is as safe, predictable, and manageable as flying? </p> <p> I see no evidence of that. </p> <p> Are pilots significantly more capable human beings than software developers, or does something else explain the discrepancy in failure rates? </p> </div> <div class="comment-date">2019-07-05 15:47 UTC</div> </div> <div class="comment" id="7e7e932f5eea47f3bab328c58e9d164b"> <div class="comment-author"><a href="http://blog.strobaek.org">Karsten Strøbæk</a></div> <div class="comment-content"> <p> Hi Mark. The fact that error rates are higher in software development is more a statement about the bad state our industry is in and has been for a millennium or more. </p> <p> Why do we accept that we produce crappy systems or, in your words, software that is not safe, predictable, and manageable? The list of excuses is very long and the list of results is very short. We as an industry are simply doing it wrong, but most people prefer hand waving and marketing to simple and plausible heuristics. </p> <p> To use your analogy about planes I could ask if you would fly with a plane that had (only) been unit tested? Probably not, as it is never the unit that fails, but always the integration. Should we test all integrations then? Yes, why not? </p> <p> The use of planes or pilots (or whatever) may have been bad. My point was that I do not see software development as complex. 
</p> </div> <div class="comment-date">2019-07-05 20:12 UTC</div> </div> <div class="comment" id="0df7412992fb499d915e6f4cdbb644a0"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Karsten, if we, as an industry, are doing it wrong, then why are we doing that? </p> <p> And what should we be doing instead? </p> </div> <div class="comment-date">2019-07-06 16:00 UTC</div> </div> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Full binary tree catamorphism https://blog.ploeh.dk/2019/06/24/full-binary-tree-catamorphism 2019-06-24T06:00:00+00:00 Mark Seemann <div id="post"> <p> <em>The catamorphism for a full binary tree is a pair of functions.</em> </p> <p> This article is part of an <a href="/2019/04/29/catamorphisms">article series about catamorphisms</a>. A catamorphism is a <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstraction</a> that describes how to digest a data structure into a potentially more compact value. </p> <p> This article presents the catamorphism for a full <a href="https://en.wikipedia.org/wiki/Binary_tree">binary tree</a>, as well as how to identify it. The beginning of this article presents the catamorphism in C#, with examples. The rest of the article describes how to deduce the catamorphism. This part of the article presents my work in <a href="https://www.haskell.org">Haskell</a>. Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix. </p> <p> A <em>full binary tree</em> (also known as a <em>proper</em> or <em>plane</em> binary tree) is a tree in which each node has either two or no branches. </p> <p> <img src="/content/binary/full-binary-tree-example.png" alt="A full binary tree example diagram, with each node containing integers."> </p> <p> The diagram shows an example of a tree of integers. 
The left branch contains two children, of which the right branch again contains two sub-branches. The rest of the nodes are leaf-nodes with no sub-branches. </p> <h3 id="d6b9699fa3894a4383f9b2b2992a9e8f"> C# catamorphism <a href="#d6b9699fa3894a4383f9b2b2992a9e8f" title="permalink">#</a> </h3> <p> As a C# representation of a full binary tree, I'll start with the <code>IBinaryTree&lt;T&gt;</code> API from <a href="/2018/08/13/a-visitor-functor">A Visitor functor</a>. The catamorphism is the <code>Accept</code> method: </p> <p> <pre><span style="color:#2b91af;">TResult</span>&nbsp;Accept&lt;<span style="color:#2b91af;">TResult</span>&gt;(<span style="color:#2b91af;">IBinaryTreeVisitor</span>&lt;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;visitor);</pre> </p> <p> So far in this article series, you've mostly seen <a href="/2018/05/22/church-encoding">Church-encoded</a> catamorphisms, so a catamorphism represented as a <a href="https://en.wikipedia.org/wiki/Visitor_pattern">Visitor</a> may be too big of a cognitive leap. We know, however, from <a href="/2018/06/25/visitor-as-a-sum-type">Visitor as a sum type</a> that a Visitor representation is isomorphic to a Church encoding. Since these are isomorphic, it's possible to refactor <code>IBinaryTree&lt;T&gt;</code> to a Church encoding. The <a href="https://github.com/ploeh/ChurchEncoding">GitHub repository</a> contains a series of commits that demonstrates how that refactoring works. 
Once you're done, you arrive at this <code>Match</code> method, which is the refactored <code>Accept</code> method: </p> <p> <pre><span style="color:#2b91af;">TResult</span>&nbsp;Match&lt;<span style="color:#2b91af;">TResult</span>&gt;(<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">TResult</span>,&nbsp;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;node,&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;leaf);</pre> </p> <p> This method takes a pair of functions as arguments. The <code>node</code> function deals with an internal node in the tree (the blue nodes in the above diagram), whereas the <code>leaf</code> function deals with the leaf nodes (the green nodes in the diagram). </p> <p> The <code>leaf</code> function may be the easiest one to understand. A leaf node only contains a value of the type <code>T</code>, so the only operation the function has to support is translating the <code>T</code> value to a <code>TResult</code> value. 
This is also the premise of the <code>Leaf</code> class' implementation of the method: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">T</span>&nbsp;item; <span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">TResult</span>&nbsp;Match&lt;<span style="color:#2b91af;">TResult</span>&gt;(<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">TResult</span>,&nbsp;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;node,&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;leaf) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;leaf(item); }</pre> </p> <p> The <code>node</code> function is more tricky. It takes three input arguments, of the types <code>TResult</code>, <code>T</code>, and <code>TResult</code>. The roles of these are respectively <em>left</em>, <em>item</em>, and <em>right</em>. This is a typical representation of a binary node. Since there's always a left and a right branch, you put the node's value in the middle. As was the case with the <a href="/2019/06/10/tree-catamorphism">tree catamorphism</a>, the catamorphism function receives the branches as already-translated values; that is, both the left and right branch have already been translated to <code>TResult</code> when <code>node</code> is called. 
While it looks like magic, as always it's just the result of recursion: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IBinaryTree</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;left; <span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">T</span>&nbsp;item; <span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IBinaryTree</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;right; <span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">TResult</span>&nbsp;Match&lt;<span style="color:#2b91af;">TResult</span>&gt;(<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">TResult</span>,&nbsp;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;node,&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;leaf) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;node(left.Match(node,&nbsp;leaf),&nbsp;item,&nbsp;right.Match(node,&nbsp;leaf)); }</pre> </p> <p> This is the <code>Node&lt;T&gt;</code> class implementation of the <code>Match</code> method. It calls <code>node</code> and returns whatever it returns, but notice that as the <code>left</code> and <code>right</code> arguments, it first, recursively, calls <code>left.Match</code> and <code>right.Match</code>. This is how it can call <code>node</code> with the translated branches, as well as with the basic <code>item</code>. </p> <p> The recursion stops and unwinds on <code>left</code> and <code>right</code> whenever one of those is a <code>Leaf</code> instance. 
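</p> <p> The same recursion can be sketched in Haskell with explicit pattern matching. This is only an illustration under my own naming: <code>FullBinaryTree</code>, <code>match</code>, and <code>inOrder</code> are hypothetical names, and the sketch uses direct recursion rather than any derivation from an F-Algebra. The <code>Leaf</code> equation is where the recursion stops, and the <code>Node</code> equation translates both branches before calling <code>node</code>: </p> <p>

```haskell
-- A full binary tree: each node has either two branches or none.
data FullBinaryTree a
  = Leaf a
  | Node (FullBinaryTree a) a (FullBinaryTree a)
  deriving (Show, Eq)

-- Hypothetical counterpart of the C# Match method, written with explicit
-- recursion. The node function receives both branches as values that have
-- already been translated to the result type r.
match :: (r -> a -> r -> r) -> (a -> r) -> FullBinaryTree a -> r
match _    leaf (Leaf x)     = leaf x
match node leaf (Node l x r) = node (match node leaf l) x (match node leaf r)

-- Example use: list the items in order (left branch, item, right branch).
inOrder :: FullBinaryTree a -> [a]
inOrder = match (\l x r -> l ++ [x] ++ r) (\x -> [x])
```

</p> <p> For example, <code>inOrder (Node (Leaf 1) 2 (Leaf 3))</code> evaluates to <code>[1,2,3]</code>. 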
</p> <h3 id="c64210d585c94cb78653b96380cbf0e6"> Examples <a href="#c64210d585c94cb78653b96380cbf0e6" title="permalink">#</a> </h3> <p> You can use <code>Match</code> to implement most other behaviour you'd like <code>IBinaryTree&lt;T&gt;</code> to have. In <a href="/2018/08/13/a-visitor-functor">the original article on the full binary tree functor</a> you saw how to implement <code>Select</code> with a Visitor, but now that the API is Church-encoded, you can derive <code>Select</code> from <code>Match</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IBinaryTree</span>&lt;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;Select&lt;<span style="color:#2b91af;">TResult</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IBinaryTree</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;tree, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;selector) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(tree&nbsp;==&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(tree)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(selector&nbsp;==&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(selector)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;tree.Match( 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;node:&nbsp;(l,&nbsp;x,&nbsp;r)&nbsp;=&gt;&nbsp;Create(l,&nbsp;selector(x),&nbsp;r), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;leaf:&nbsp;x&nbsp;=&gt;&nbsp;Leaf(selector(x))); }</pre> </p> <p> In the <code>leaf</code> case, the <code>Select</code> method simply calls <code>selector</code> with the <code>x</code> value it receives, and puts the resulting <code>TResult</code> object into a new <code>Leaf</code> object. </p> <p> In the <code>node</code> case, the lambda expression receives three arguments: <code>l</code> and <code>r</code> are the <em>already-translated</em> left and right branches, so you only need to call <code>selector</code> on <code>x</code> and call the <code>Create</code> helper method to produce a new <code>Node</code> object. </p> <p> You can also implement more specialised functionality, like calculating the sum of nodes, measuring the depth of the tree, and similar functions. You saw equivalent examples in the <a href="/2019/06/10/tree-catamorphism">previous article</a>. </p> <p> For the examples in this article, I'll use the tree shown in the above diagram. 
Using static helper methods, you can write it like this: </p> <p> <pre><span style="color:blue;">var</span>&nbsp;tree&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">BinaryTree</span>.Create( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">BinaryTree</span>.Create( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">BinaryTree</span>.Leaf(42), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1337, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">BinaryTree</span>.Create( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">BinaryTree</span>.Leaf(2112), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;5040, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">BinaryTree</span>.Leaf(1984))), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;2, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">BinaryTree</span>.Leaf(90125));</pre> </p> <p> To calculate the sum of all nodes, you can write a function like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">int</span>&nbsp;Sum(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IBinaryTree</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;tree) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;tree.Match((l,&nbsp;x,&nbsp;r)&nbsp;=&gt;&nbsp;l&nbsp;+&nbsp;x&nbsp;+&nbsp;r,&nbsp;x&nbsp;=&gt;&nbsp;x); }</pre> </p> <p> The <code>leaf</code> function just returns the value of the node, while the <code>node</code> function adds the numbers together. 
It works for the above <code>tree</code>: </p> <p> <pre>&gt; tree.Sum() 100642</pre> </p> <p> To find the maximum value, you can write another extension method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">int</span>&nbsp;Max(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IBinaryTree</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;tree) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;tree.Match((l,&nbsp;x,&nbsp;r)&nbsp;=&gt;&nbsp;<span style="color:#2b91af;">Math</span>.Max(<span style="color:#2b91af;">Math</span>.Max(l,&nbsp;r),&nbsp;x),&nbsp;x&nbsp;=&gt;&nbsp;x); }</pre> </p> <p> Again, the <code>leaf</code> function just returns the value of the node. The <code>node</code> function receives the value of the current node <code>x</code>, as well as the already-found maximum value of the left branch and the right branch; it then returns the maximum of these three values: </p> <p> <pre>&gt; tree.Max() 90125</pre> </p> <p> As was also the case for trees, both of these operations are part of the standard repertoire available via a data structure's <em>fold</em>. That's not the case for the next two functions, which can't be implemented using a fold, but which can be defined with the catamorphism. 
The first is a function to count the leaves of a tree: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">int</span>&nbsp;CountLeaves&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IBinaryTree</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;tree) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;tree.Match((l,&nbsp;_,&nbsp;r)&nbsp;=&gt;&nbsp;l&nbsp;+&nbsp;r,&nbsp;_&nbsp;=&gt;&nbsp;1); }</pre> </p> <p> Since the <code>leaf</code> function handles a leaf node, the number of leaf nodes in a leaf node is, by definition, <em>one</em>. Thus, that function can ignore the value of the node and always return <code>1</code>. The <code>node</code> function, on the other hand, receives the number of leaf nodes on the left-hand side (<code>l</code>), the value of the current node, and the number of leaf nodes on the right-hand side (<code>r</code>). Notice that since an internal node is never a leaf node, it doesn't count; instead, just add <code>l</code> and <code>r</code> together. Notice that, again, the value of the node itself is irrelevant. </p> <p> How many leaf nodes does the above tree have? 
</p> <p> <pre>&gt; tree.CountLeaves() 4</pre> </p> <p> You can also measure the maximum depth of a tree: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">int</span>&nbsp;MeasureDepth&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IBinaryTree</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;tree) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;tree.Match((l,&nbsp;_,&nbsp;r)&nbsp;=&gt;&nbsp;1&nbsp;+&nbsp;<span style="color:#2b91af;">Math</span>.Max(l,&nbsp;r),&nbsp;_&nbsp;=&gt;&nbsp;0); }</pre> </p> <p> Like in the previous article, I've arbitrarily decided that the depth of a leaf node is <em>zero</em>; therefore, the <code>leaf</code> function always returns <code>0</code>. The <code>node</code> function receives the depth of the left and right branches, and returns the maximum of those two values, plus one, since the current node adds one level of depth. </p> <p> <pre>&gt; tree.MeasureDepth() 3</pre> </p> <p> You may not have much need for working with full binary trees in your normal, day-to-day C# work, but I found it worthwhile to include this example for a couple of reasons. First, because the origin of the API shows that a catamorphism may be hiding in a Visitor. Second, because binary trees are interesting, in that they're foldable <a href="/2018/03/22/functors">functors</a>, but not monads. </p> <p> Where does the catamorphism come from, though? How can you trust that the <code>Match</code> method is the catamorphism? 
</p> <h3 id="d015bcc9afe742408d7c8ba6c6edce2a"> Binary tree F-Algebra <a href="#d015bcc9afe742408d7c8ba6c6edce2a" title="permalink">#</a> </h3> <p> As in the <a href="/2019/06/10/tree-catamorphism">previous article</a>, I'll use <code>Fix</code> and <code>cata</code> as explained in <a href="https://bartoszmilewski.com">Bartosz Milewski</a>'s excellent <a href="https://bartoszmilewski.com/2017/02/28/f-algebras/">article on F-Algebras</a>. </p> <p> As always, start with the underlying endofunctor. You can think of this one as a specialisation of the rose tree from the previous article: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;FullBinaryTreeF&nbsp;a&nbsp;c&nbsp;=&nbsp;LeafF&nbsp;a&nbsp;|&nbsp;NodeF&nbsp;c&nbsp;a&nbsp;c&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;(<span style="color:blue;">FullBinaryTreeF</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(LeafF&nbsp;x)&nbsp;=&nbsp;LeafF&nbsp;x &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;f&nbsp;(NodeF&nbsp;l&nbsp;x&nbsp;r)&nbsp;=&nbsp;NodeF&nbsp;(f&nbsp;l)&nbsp;x&nbsp;(f&nbsp;r)</pre> </p> <p> As usual, I've called the 'data' type <code>a</code> and the carrier type <code>c</code> (for <em>carrier</em>). The <code>Functor</code> instance as usual translates the carrier type; the <code>fmap</code> function has the type <code>(c -&gt; c1) -&gt; FullBinaryTreeF a c -&gt; FullBinaryTreeF a c1</code>. </p> <p> As was the case when deducing the recent catamorphisms, Haskell isn't too happy about defining instances for a type like <code>Fix (FullBinaryTreeF a)</code>. 
To address that problem, you can introduce a <code>newtype</code> wrapper: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;FullBinaryTreeFix&nbsp;a&nbsp;= &nbsp;&nbsp;FullBinaryTreeFix&nbsp;{&nbsp;unFullBinaryTreeFix&nbsp;::&nbsp;Fix&nbsp;(FullBinaryTreeF&nbsp;a)&nbsp;} &nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> You can define <code>Functor</code>, <code>Foldable</code>, and <code>Traversable</code> instances (but not <code>Monad</code>) for this type without resorting to any funky GHC extensions. Keep in mind that ultimately, the purpose of all this code is just to figure out what the catamorphism looks like. This code isn't intended for actual use. </p> <p> A pair of helper functions make it easier to define <code>FullBinaryTreeFix</code> values: </p> <p> <pre><span style="color:#2b91af;">fbtLeafF</span>&nbsp;::&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">FullBinaryTreeFix</span>&nbsp;a fbtLeafF&nbsp;=&nbsp;FullBinaryTreeFix&nbsp;.&nbsp;Fix&nbsp;.&nbsp;LeafF <span style="color:#2b91af;">fbtNodeF</span>&nbsp;::&nbsp;<span style="color:blue;">FullBinaryTreeFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">FullBinaryTreeFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">FullBinaryTreeFix</span>&nbsp;a fbtNodeF&nbsp;(FullBinaryTreeFix&nbsp;l)&nbsp;x&nbsp;(FullBinaryTreeFix&nbsp;r)&nbsp;=&nbsp;FullBinaryTreeFix&nbsp;$&nbsp;Fix&nbsp;$&nbsp;NodeF&nbsp;l&nbsp;x&nbsp;r</pre> </p> <p> In order to distinguish these helper functions from the ones that create <code>TreeFix a</code> values, I prefixed them with <code>fbt</code> (for <em>Full Binary Tree</em>). 
<code>fbtLeafF</code> creates a leaf node: </p> <p> <pre>Prelude Fix FullBinaryTree&gt; fbtLeafF "fnaah" FullBinaryTreeFix {unFullBinaryTreeFix = Fix (LeafF "fnaah")}</pre> </p> <p> <code>fbtNodeF</code> is a helper function to create an internal node: </p> <p> <pre>Prelude Fix FullBinaryTree&gt; fbtNodeF (fbtLeafF 1337) 42 (fbtLeafF 2112) FullBinaryTreeFix {unFullBinaryTreeFix = Fix (NodeF (Fix (LeafF 1337)) 42 (Fix (LeafF 2112)))}</pre> </p> <p> The <code>FullBinaryTreeFix</code> type, or rather the underlying <code>FullBinaryTreeF a</code> functor, is all you need to identify the catamorphism. </p> <h3 id="ced0da7dc61943b0be872ec79b4e3651"> Haskell catamorphism <a href="#ced0da7dc61943b0be872ec79b4e3651" title="permalink">#</a> </h3> <p> At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (<code>FullBinaryTreeF a</code>), and an object <code>c</code>, but you still need to find a morphism <code>FullBinaryTreeF a c -&gt; c</code>. Notice that the algebra you have to find is the function that reduces the functor to its <em>carrier type</em> <code>c</code>, not the 'data type' <code>a</code>. This takes some time to get used to, but that's how catamorphisms work. This doesn't mean, however, that you get to ignore <code>a</code>, as you'll see. </p> <p> As in the previous articles, start by writing a function that will become the catamorphism, based on <code>cata</code>: </p> <p> <pre>fullBinaryTreeF&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unFullBinaryTreeFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(LeafF&nbsp;x)&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(NodeF&nbsp;l&nbsp;x&nbsp;r)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> While this compiles, with its <code>undefined</code> implementation of <code>alg</code>, it obviously doesn't do anything useful. 
I find, however, that it helps me think. How can you return a value of the type <code>c</code> from <code>alg</code>? You could pass a function argument to the <code>fullBinaryTreeF</code> function and use it with <code>x</code>: </p> <p> <pre>fullBinaryTreeF&nbsp;fl&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unFullBinaryTreeFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(LeafF&nbsp;x)&nbsp;=&nbsp;fl&nbsp;x &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(NodeF&nbsp;l&nbsp;x&nbsp;r)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> I called the function <code>fl</code> for <em>function, leaf</em>, because we're also going to need a function for the <code>NodeF</code> case: </p> <p> <pre><span style="color:#2b91af;">fullBinaryTreeF</span>&nbsp;::&nbsp;(c&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;(a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">FullBinaryTreeFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c fullBinaryTreeF&nbsp;fn&nbsp;fl&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unFullBinaryTreeFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(LeafF&nbsp;x)&nbsp;=&nbsp;fl&nbsp;x &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(NodeF&nbsp;l&nbsp;x&nbsp;r)&nbsp;=&nbsp;fn&nbsp;l&nbsp;x&nbsp;r</pre> </p> <p> This works. Since <code>cata</code> has the type <code>Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a</code>, that means that <code>alg</code> has the type <code>f a -&gt; a</code>. In the case of <code>FullBinaryTreeF</code>, the compiler infers that the <code>alg</code> function has the type <code>FullBinaryTreeF a c -&gt; c</code>, which is just what you need! 
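</p> <p> To make the derivation easier to try out, here's a minimal, self-contained sketch that gathers the pieces in one file. It assumes that <code>Fix</code> and <code>cata</code> are defined along the lines of Bartosz Milewski's article; the actual module used for this article series may differ in details such as derived instances. The <code>toListInOrder</code> function is a hypothetical helper that I've added only to exercise the catamorphism: </p>

```haskell
module Main where

-- Assumed definitions of Fix and cata, following Milewski's F-Algebras article.
newtype Fix f = Fix { unFix :: f (Fix f) }

cata :: Functor f => (f a -> a) -> Fix f -> a
cata alg = alg . fmap (cata alg) . unFix

-- The endofunctor from the article: a is the data type, c the carrier type.
data FullBinaryTreeF a c = LeafF a | NodeF c a c

instance Functor (FullBinaryTreeF a) where
  fmap _     (LeafF x) = LeafF x
  fmap f (NodeF l x r) = NodeF (f l) x (f r)

newtype FullBinaryTreeFix a =
  FullBinaryTreeFix { unFullBinaryTreeFix :: Fix (FullBinaryTreeF a) }

fbtLeafF :: a -> FullBinaryTreeFix a
fbtLeafF = FullBinaryTreeFix . Fix . LeafF

fbtNodeF :: FullBinaryTreeFix a -> a -> FullBinaryTreeFix a -> FullBinaryTreeFix a
fbtNodeF (FullBinaryTreeFix l) x (FullBinaryTreeFix r) =
  FullBinaryTreeFix $ Fix $ NodeF l x r

-- The catamorphism: fn handles internal nodes, fl handles leaves.
fullBinaryTreeF :: (c -> a -> c -> c) -> (a -> c) -> FullBinaryTreeFix a -> c
fullBinaryTreeF fn fl = cata alg . unFullBinaryTreeFix
  where alg     (LeafF x) = fl x
        alg (NodeF l x r) = fn l x r

-- Hypothetical helper (not in the article): in-order list of all values.
toListInOrder :: FullBinaryTreeFix a -> [a]
toListInOrder = fullBinaryTreeF (\l x r -> l ++ [x] ++ r) (: [])

main :: IO ()
main = do
  let t = fbtNodeF (fbtLeafF 1337) 42 (fbtLeafF 2112) :: FullBinaryTreeFix Int
  print (toListInOrder t)                            -- [1337,42,2112]
  print (fullBinaryTreeF (\l x r -> l + x + r) id t) -- 3491
```

<p> In both calls, the carrier type <code>c</code> is whatever the two functions return: a list in the first case, and a number in the second.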
</p> <p> You can now see what the carrier type <code>c</code> is for. It's the type that the algebra extracts, and thus the type that the catamorphism returns. </p> <p> This, then, is the catamorphism for a full binary tree. As always, it's not the only possible catamorphism, since you can easily reorder the arguments to <code>fullBinaryTreeF</code>, <code>fn</code>, and <code>fl</code>. These would all be isomorphic, though. </p> <h3 id="3f87d49db58f4cd59dec76a97d31c0d2"> Basis <a href="#3f87d49db58f4cd59dec76a97d31c0d2" title="permalink">#</a> </h3> <p> You can implement most other useful functionality with <code>fullBinaryTreeF</code>. Here's the <code>Functor</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;<span style="color:blue;">FullBinaryTreeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;f&nbsp;=&nbsp;fullBinaryTreeF&nbsp;(\l&nbsp;x&nbsp;r&nbsp;-&gt;&nbsp;fbtNodeF&nbsp;l&nbsp;(f&nbsp;x)&nbsp;r)&nbsp;(fbtLeafF&nbsp;.&nbsp;f)</pre> </p> <p> The <code>fl</code> function first invokes <code>f</code>, followed by <code>fbtLeafF</code>. The <code>fn</code> function uses the <code>fbtNodeF</code> helper function to create a new internal node. <code>l</code> and <code>r</code> are already-translated branches, so you just need to call <code>f</code> with the node value <code>x</code>. </p> <p> There's no <code>Monad</code> instance for binary trees, because you can't flatten a binary tree of binary trees. 
You can, on the other hand, define a <code>Foldable</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Foldable</span>&nbsp;<span style="color:blue;">FullBinaryTreeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;foldMap&nbsp;f&nbsp;=&nbsp;fullBinaryTreeF&nbsp;(\l&nbsp;x&nbsp;r&nbsp;-&gt;&nbsp;l&nbsp;&lt;&gt;&nbsp;f&nbsp;x&nbsp;&lt;&gt;&nbsp;r)&nbsp;f</pre> </p> <p> The <code>f</code> function passed to <code>foldMap</code> has the type <code>Monoid m =&gt; (a -&gt; m)</code>, so the <code>fl</code> function that handles leaf nodes simply calls <code>f</code> with the contents of the node. The <code>fn</code> function receives two branches already translated to <code>m</code>, so it just has to call <code>f</code> with <code>x</code> and combine all the <code>m</code> values using the <code>&lt;&gt;</code> operator. </p> <p> The <code>Traversable</code> instance follows right on the heels of <code>Foldable</code>: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Traversable</span>&nbsp;<span style="color:blue;">FullBinaryTreeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;sequenceA&nbsp;=&nbsp;fullBinaryTreeF&nbsp;(liftA3&nbsp;fbtNodeF)&nbsp;(<span style="color:blue;">fmap</span>&nbsp;fbtLeafF)</pre> </p> <p> There are operations on binary trees that you can implement with a fold, but some that you can't. Consider the tree shown in the diagram at the beginning of the article. This is also the tree that the above C# examples use. 
In Haskell, using <code>FullBinaryTreeFix</code>, you can define that tree like this: </p> <p> <pre>tree&nbsp;=&nbsp; &nbsp;&nbsp;fbtNodeF &nbsp;&nbsp;&nbsp;&nbsp;(fbtNodeF &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(fbtLeafF&nbsp;42) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1337 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(fbtNodeF &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(fbtLeafF&nbsp;2112) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;5040 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(fbtLeafF&nbsp;1984))) &nbsp;&nbsp;&nbsp;&nbsp;2 &nbsp;&nbsp;&nbsp;&nbsp;(fbtLeafF&nbsp;90125)</pre> </p> <p> Since <code>FullBinaryTreeFix</code> is <code>Foldable</code>, and that type class already comes with <code>sum</code> and <code>maximum</code> functions, no further work is required to repeat the first two of the above C# examples: </p> <p> <pre>Prelude Fix FullBinaryTree&gt; sum tree 100642 Prelude Fix FullBinaryTree&gt; maximum tree 90125</pre> </p> <p> Counting leaves, or measuring the depth of a tree, on the other hand, is impossible with the <code>Foldable</code> instance, but can be implemented using the catamorphism: </p> <p> <pre><span style="color:#2b91af;">countLeaves</span>&nbsp;::&nbsp;<span style="color:blue;">Num</span>&nbsp;n&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:blue;">FullBinaryTreeFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;n countLeaves&nbsp;=&nbsp;fullBinaryTreeF&nbsp;(\l&nbsp;_&nbsp;r&nbsp;-&gt;&nbsp;l&nbsp;+&nbsp;r)&nbsp;(<span style="color:blue;">const</span>&nbsp;1) <span style="color:#2b91af;">treeDepth</span>&nbsp;::&nbsp;(<span style="color:blue;">Ord</span>&nbsp;n,&nbsp;<span style="color:blue;">Num</span>&nbsp;n)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:blue;">FullBinaryTreeFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;n treeDepth&nbsp;=&nbsp;fullBinaryTreeF&nbsp;(\l&nbsp;_&nbsp;r&nbsp;-&gt;&nbsp;1&nbsp;+&nbsp;<span 
style="color:blue;">max</span>&nbsp;l&nbsp;r)&nbsp;(<span style="color:blue;">const</span>&nbsp;0)</pre> </p> <p> The reasoning is the same as already explained in the above C# examples. The functions also produce the same results: </p> <p> <pre>Prelude Fix FullBinaryTree&gt; countLeaves tree 4 Prelude Fix FullBinaryTree&gt; treeDepth tree 3</pre> </p> <p> This, hopefully, illustrates that the catamorphism is more capable, and that the fold is just a (list-biased) specialisation. </p> <h3 id="81b3e77b6fbe4760bc8c74805e4edba8"> Summary <a href="#81b3e77b6fbe4760bc8c74805e4edba8" title="permalink">#</a> </h3> <p> The catamorphism for a full binary tree is a pair of functions. One function handles internal nodes, while the other function handles leaf nodes. </p> <p> I thought it was interesting to show this example for two reasons: First, the original example was a Visitor implementation, and I think it's worth realising that a Visitor's <code>Accept</code> method can also be viewed as a catamorphism. Second, a binary tree is an example of a data structure that has a fold, but isn't a monad. </p> <p> All articles in the article series have, so far, covered data structures well-known from computer science. The next example will, on the other hand, demonstrate that even completely ad-hoc domain-specific data structures have catamorphisms. </p> <p> <strong>Next:</strong> <a href="/2019/07/08/payment-types-catamorphism">Payment types catamorphism</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. 
Composition Root location https://blog.ploeh.dk/2019/06/17/composition-root-location 2019-06-17T05:55:00+00:00 Mark Seemann <div id="post"> <p> <em>A Composition Root should be located near the point where user code first executes.</em> </p> <p> Prompted by a recent Internet discussion, my <a href="https://amzn.to/2TE8tJx">DIPPP</a> co-author <a href="https://blogs.cuttingedge.it/steven/">Steven van Deursen</a> wrote to me in order to help clarify the <a href="/2011/07/28/CompositionRoot">Composition Root</a> pattern. </p> <p> In the email, Steven ponders whether it's defensible to use an API that <a href="/2010/11/01/PatternRecognitionAbstractFactoryorServiceLocator">looks like a Service Locator</a> from within a unit test. He specifically calls out my article that describes the <a href="/2013/03/11/auto-mocking-container">Auto-mocking Container design pattern</a>. </p> <p> In that article, I show how to use Castle Windsor's <code>Resolve</code> method from within a unit test: </p> <p> <pre style="margin: 0px;">[<span style="color: #2b91af;">Fact</span>] <span style="color: blue;">public</span> <span style="color: blue;">void</span> SutIsController() { &nbsp;&nbsp;&nbsp; <span style="color: blue;">var</span> container = <span style="color: blue;">new</span> <span style="color: #2b91af;">WindsorContainer</span>().Install(<span style="color: blue;">new</span> <span style="color: #2b91af;">ShopFixture</span>()); &nbsp;&nbsp;&nbsp; <span style="color: blue;">var</span> sut = container.Resolve&lt;<span style="color: #2b91af;">BasketController</span>&gt;(); &nbsp;&nbsp;&nbsp; <span style="color: #2b91af;">Assert</span>.IsAssignableFrom&lt;<span style="color: #2b91af;">IHttpController</span>&gt;(sut); }</pre> </p> <p> Is the test using a <a href="/2010/02/03/ServiceLocatorisanAnti-Pattern">Service Locator</a>? If so, why is that okay? If not, why isn't it a Service Locator? </p> <p> This article argues that this use of <code>Resolve</code> isn't a Service Locator. 
</p> <h3 id="e9a6c124fa1d4610ae57b3cba83254b0"> Entry points defined <a href="#e9a6c124fa1d4610ae57b3cba83254b0" title="permalink">#</a> </h3> <p> The <a href="/2011/07/28/CompositionRoot">original article about the Composition Root pattern</a> defines a Composition Root as the place where you compose your object graph(s). It repeatedly describes how this ought to happen in, or as close as possible to, the application's entry point. I believe that this definition is compatible with the pattern description given in <a href="https://amzn.to/2TE8tJx">our book</a>. </p> <p> I do realise, however, that we may never have explicitly defined what an <em>entry point</em> is. </p> <p> In order to do so, it may be helpful to establish a bit of terminology. In the following, I'll use the terms <em>user code</em> as opposed to <em>framework code</em>. </p> <p> Much of the code you write probably runs within some sort of framework. If you're writing a web application, you're probably using a web framework. If you're writing a message-based application, you might be using some message bus, or actor, framework. If you're writing an app for a mobile device, you're probably using some sort of framework for that, too. </p> <p> Even as a programmer, you're a <em>user</em> of frameworks. </p> <p> As I usually do, I'll use <a href="http://tomasp.net">Tomas Petricek</a>'s distinction between <a href="http://tomasp.net/blog/2015/library-frameworks">libraries and frameworks</a>. A library is a collection of APIs that you can call. A framework is a software system that calls your code. </p> <p> <img src="/content/binary/user-code-in-framework.png" alt="User code running in a framework."> </p> <p> The reality is often more complex, as illustrated by the figure. While a framework will call your code, you can also invoke APIs afforded by the framework. 
</p> <p> The point, however, is that <em>user code</em> is code that you write, while <em>framework code</em> is code that someone else wrote to develop the framework. The framework starts up first, and at some point in its lifetime, it calls your code. </p> <p class="text-center"> <strong>Definition:</strong> The <em>entry point</em> is the user code that the framework calls first. </p> <p> As an example, in ASP.NET Core, the (conventional) <code>Startup</code> class is the first user code that the framework calls. (If you follow Tomas Petricek's definition to the letter, ASP.NET Core isn't a framework, but a library, because you have to write a <code>Main</code> method and call <code>WebHost.CreateDefaultBuilder(args).UseStartup&lt;Startup&gt;().Build().Run()</code>. In reality, though, you're supposed to configure the application from your <code>Startup</code> class, making it the <em>de facto</em> entry point.) </p> <h3 id="61e3f212e0e244f18ac998f4b9fbb635"> Unit testing endpoints <a href="#61e3f212e0e244f18ac998f4b9fbb635" title="permalink">#</a> </h3> <p> Most .NET-based unit testing packages are frameworks. There's typically little explicit configuration. 
Instead, you just write a method and adorn it with an attribute: </p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;ReservationSucceeds() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;repo&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">FakeReservationsRepository</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sut&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationsController</span>(10,&nbsp;repo); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;reservation&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;date:&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTimeOffset</span>(2018,&nbsp;8,&nbsp;13,&nbsp;16,&nbsp;53,&nbsp;0,&nbsp;<span style="color:#2b91af;">TimeSpan</span>.FromHours(2)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;email:&nbsp;<span style="color:#a31515;">&quot;mark@example.com&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;name:&nbsp;<span style="color:#a31515;">&quot;Mark&nbsp;Seemann&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;quantity:&nbsp;4); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;sut.Post(reservation); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.True(repo.Contains(reservation.Accept())); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;expectedId&nbsp;=&nbsp;repo.GetId(reservation.Accept()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;ok&nbsp;=&nbsp;<span style="color:#2b91af;">Assert</span>.IsAssignableFrom&lt;<span 
style="color:#2b91af;">OkActionResult</span>&gt;(actual); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(expectedId,&nbsp;ok.Value); } [<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;ReservationFails() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;repo&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">FakeReservationsRepository</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;sut&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationsController</span>(10,&nbsp;repo); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;reservation&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;date:&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTimeOffset</span>(2018,&nbsp;8,&nbsp;13,&nbsp;16,&nbsp;53,&nbsp;0,&nbsp;<span style="color:#2b91af;">TimeSpan</span>.FromHours(2)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;email:&nbsp;<span style="color:#a31515;">&quot;mark@example.com&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;name:&nbsp;<span style="color:#a31515;">&quot;Mark&nbsp;Seemann&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;quantity:&nbsp;11); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;sut.Post(reservation); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.False(reservation.IsAccepted); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.False(repo.Contains(reservation)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.IsAssignableFrom&lt;<span 
style="color:#2b91af;">InternalServerErrorActionResult</span>&gt;(actual); }</pre> </p> <p> With <a href="https://xunit.net">xUnit.net</a>, the attribute is called <code>[Fact]</code>, but the principle is the same in <a href="https://nunit.org">NUnit</a> and MSTest; only the names differ. </p> <p> Where's the entry point? </p> <p> Each test is its own entry point. The test is (typically) the first user code that the test runner executes. Furthermore, each test runs independently of any other. </p> <p> For the sake of argument, you could write each test case in a new application, and run all your test applications in parallel. It would be impractical, but it oughtn't change the way you organise the tests. Each test method is, conceptually, a mini-application. </p> <p> A test method is its own Composition Root; or, more generally, each test has its own Composition Root. In fact, xUnit.net has various extensibility points that enable you to hook into the framework before each test method executes. You can, for example, <a href="/2010/10/08/AutoDataTheorieswithAutoFixture">combine a <code>[Theory]</code> attribute with a custom <code>AutoDataAttribute</code></a>, or you can adorn your tests with a <code>BeforeAfterTestAttribute</code>. This doesn't change the fact that the test runner will run each test case independently of all the other tests. Those pre-execution hooks play the same role as middleware in real applications. </p> <p> You can, therefore, consider the <a href="/2013/06/24/a-heuristic-for-formatting-code-according-to-the-aaa-pattern">Arrange phase</a> the Composition Root for each test. </p> <p> Thus, I don't consider the use of an Auto-mocking Container to be a Service Locator, since <a href="/2011/08/25/ServiceLocatorrolesvs.mechanics">its role is to resolve object graphs at the entry point instead of locating services from arbitrary locations in the code base</a>. 
</p> <h3 id="200be4483e4b4369abe5912b2a8213c3"> Summary <a href="#200be4483e4b4369abe5912b2a8213c3" title="permalink">#</a> </h3> <p> A Composition Root is located at, or near, the <em>entry point</em>. An entry point is where <em>user code</em> is first executed by a framework. Each unit test method constitutes a separate, independent entry point. Therefore, it's consistent with these definitions to use an Auto-mocking Container in a unit test. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Tree catamorphism https://blog.ploeh.dk/2019/06/10/tree-catamorphism 2019-06-10T09:10:00+00:00 Mark Seemann <div id="post"> <p> <em>The catamorphism for a tree is just a single function with a particular type.</em> </p> <p> This article is part of an <a href="/2019/04/29/catamorphisms">article series about catamorphisms</a>. A catamorphism is a <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstraction</a> that describes how to digest a data structure into a potentially more compact value. </p> <p> This article presents the catamorphism for a <a href="https://en.wikipedia.org/wiki/Tree_(data_structure)">tree</a>, as well as how to identify it. The beginning of this article presents the catamorphism in C#, with examples. The rest of the article describes how to deduce the catamorphism. This part of the article presents my work in <a href="https://www.haskell.org">Haskell</a>. Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix. </p> <p> A tree is a general-purpose data structure where each node in a tree has an associated value. Each node can have an arbitrary number of branches, including none. </p> <p> <img src="/content/binary/tree-example.png" alt="A tree example diagram, with each node containing integers."> </p> <p> The diagram shows an example of a tree of integers. 
The left branch contains a sub-tree with only a single branch, whereas the right branch contains a sub-tree with three branches. Each of the leaf nodes are trees in their own right, but they all have zero branches. </p> <p> In this example, each branch at the 'same level' has the same depth, but this isn't required. </p> <h3 id="7d3f657d0c6b443f83eac89370e0c660"> C# catamorphism <a href="#7d3f657d0c6b443f83eac89370e0c660" title="permalink">#</a> </h3> <p> As a C# representation of a tree, I'll use the <code>Tree&lt;T&gt;</code> class from <a href="/2018/08/06/a-tree-functor">A Tree functor</a>. The catamorphism is this instance method on <code>Tree&lt;T&gt;</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">TResult</span>&nbsp;Cata&lt;<span style="color:#2b91af;">TResult</span>&gt;(<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">IReadOnlyCollection</span>&lt;<span style="color:#2b91af;">TResult</span>&gt;,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;func) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;func(Item,&nbsp;children.Select(c&nbsp;=&gt;&nbsp;c.Cata(func)).ToArray()); }</pre> </p> <p> Contrary to previous articles, I didn't call this method <code>Match</code>, but simply <code>Cata</code> (for <em>catamorphism</em>). The reason is that those other methods are called <code>Match</code> for a particular reason. The data structures for which they are catamorphisms are all <a href="/2018/05/22/church-encoding">Church-encoded</a> <a href="https://en.wikipedia.org/wiki/Tagged_union">sum types</a>. For those types, the <code>Match</code> methods enable a syntax similar to pattern matching in <a href="https://fsharp.org">F#</a>. That's not the case for <code>Tree&lt;T&gt;</code>. It's not a sum type, and it isn't Church-encoded. </p> <p> The method takes a single function as an input argument. 
This is the first catamorphism in this article series that isn't made up of a pair of some sort. The <a href="/2019/05/06/boolean-catamorphism">Boolean catamorphism</a> is a pair of values, the <a href="/2019/05/20/maybe-catamorphism">Maybe catamorphism</a> is a pair made up of a value and a function, and the <a href="/2019/06/03/either-catamorphism">Either catamorphism</a> is a pair of functions. The tree catamorphism, in contrast, is just a single function. </p> <p> The first argument to the function is a value of the type <code>T</code>. This will be an <code>Item</code> value. The second argument to the function is a finite collection of <code>TResult</code> values. This may take a little time getting used to, but it's a collection of already reduced sub-trees. When you supply such a function to <code>Cata</code>, that function must return a single value of the type <code>TResult</code>. Thus, the function must be able to digest a finite collection of <code>TResult</code> values, as well as a <code>T</code> value, to a single <code>TResult</code> value. </p> <p> The <code>Cata</code> method accomplishes this by calling <code>func</code> with the current <code>Item</code>, as well as by recursively applying itself to each of the sub-trees. Eventually, <code>Cata</code> will recurse into leaf nodes, which means that <code>children</code> will be empty. When that happens, the lambda expression inside <code>children.Select</code> never runs, and recursion stops and unwinds. </p> <h3 id="167ba023ee654db39fb5eb448d35a8df"> Examples <a href="#167ba023ee654db39fb5eb448d35a8df" title="permalink">#</a> </h3> <p> You can use <code>Cata</code> to implement most other behaviour you'd like <code>Tree&lt;T&gt;</code> to have. 
In <a href="/2018/08/06/a-tree-functor">the original article on the Tree functor</a> you saw an ad-hoc implementation of <code>Select</code>, but instead, you can derive <code>Select</code> from <code>Cata</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Tree</span>&lt;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;Select&lt;<span style="color:#2b91af;">TResult</span>&gt;(<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;selector) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Cata&lt;<span style="color:#2b91af;">Tree</span>&lt;<span style="color:#2b91af;">TResult</span>&gt;&gt;((x,&nbsp;nodes)&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Tree</span>&lt;<span style="color:#2b91af;">TResult</span>&gt;(selector(x),&nbsp;nodes)); }</pre> </p> <p> The lambda expression receives <code>x</code>, an object of the type <code>T</code>, as well as <code>nodes</code>, which is a finite collection of already translated sub-trees. It simply translates <code>x</code> with <code>selector</code> and returns a <code>new Tree&lt;TResult&gt;</code> with the translated value and the already translated <code>nodes</code>. </p> <p> This works just as well as the ad-hoc implementation; it passes all the same tests as shown in the previous article. 
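</p> <p> For readers who'd rather see the idea in Haskell, here's a minimal sketch of the same derivation. It's only an illustration: the <code>Tree</code> type and the names <code>cata</code>, <code>mapTree</code>, and <code>example</code> are assumptions made for this sketch, not part of the C# code base above. </p> <p> <pre>-- A minimal rose tree. The type and function names are illustrative
-- assumptions, not a port of the article's C# API.
data Tree a = Node a [Tree a] deriving (Show, Eq)

-- The catamorphism: reduce every sub-tree first, then pass the node value
-- and the already reduced children to f.
cata :: (a -&gt; [b] -&gt; b) -&gt; Tree a -&gt; b
cata f (Node x ts) = f x (map (cata f) ts)

-- The functor map derived from cata, mirroring the C# Select method.
mapTree :: (a -&gt; b) -&gt; Tree a -&gt; Tree b
mapTree f = cata (\x ns -&gt; Node (f x) ns)

-- A small tree of numbers translated to a tree of strings.
example :: Tree String
example = mapTree show (Node (42 :: Int) [Node 1337 [Node (-3) []], Node 7 []])</pre> </p> <p> For a leaf, <code>map (cata f) []</code> is the empty list, so the recursion stops there, much as the lambda expression inside <code>children.Select</code> never runs in the C# version.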
</p> <p> If you have a tree of numbers, you can add them all together: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">int</span>&nbsp;Sum(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Tree</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;tree) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;tree.Cata&lt;<span style="color:blue;">int</span>&gt;((x,&nbsp;xs)&nbsp;=&gt;&nbsp;x&nbsp;+&nbsp;xs.Sum()); }</pre> </p> <p> This uses the built-in <a href="https://docs.microsoft.com/dotnet/api/system.linq.enumerable.sum">Sum method</a> for <code>IEnumerable&lt;T&gt;</code> to add all the partly calculated sub-trees together, and then adds the value of the current node. In this and remaining examples, I'll use the tree shown in the above diagram: </p> <p> <pre><span style="color:#2b91af;">Tree</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;tree&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Tree</span>.Create(42, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Tree</span>.Create(1337, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Tree</span>.Leaf(-3)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Tree</span>.Create(7, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Tree</span>.Leaf(-99), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Tree</span>.Leaf(100), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Tree</span>.Leaf(0)));</pre> </p> <p> You can now calculate the sum of all these nodes: </p> <p> <pre>&gt; tree.Sum() 1384</pre> </p> <p> Another option is to find the maximum value anywhere in a tree: </p> <p> <pre><span 
style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">int</span>&nbsp;Max(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Tree</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;tree) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;tree.Cata&lt;<span style="color:blue;">int</span>&gt;((x,&nbsp;xs)&nbsp;=&gt;&nbsp;xs.Any()&nbsp;?&nbsp;<span style="color:#2b91af;">Math</span>.Max(x,&nbsp;xs.Max())&nbsp;:&nbsp;x); }</pre> </p> <p> This method again utilises one of the LINQ methods available via the .NET base class library: <a href="https://docs.microsoft.com/dotnet/api/system.linq.enumerable.max">Max</a>. It is, however, necessary to first check whether the partially reduced <code>xs</code> is empty or not, because the <code>Max</code> extension method on <code>IEnumerable&lt;int&gt;</code> doesn't know how to deal with an empty collection (it throws an exception). When <code>xs</code> is empty that implies a leaf node, in which case you can simply return <code>x</code>; otherwise, you'll first have to use the <code>Max</code> method on <code>xs</code> to find the maximum value there, and then use <code>Math.Max</code> to find the maximum of that value and <code>x</code>. (I'll here remind the attentive reader that finding the maximum number forms a <a href="/2017/11/27/semigroups">semigroup</a> and that <a href="/2017/12/11/semigroups-accumulate">semigroups accumulate</a> when collections are non-empty. It all fits together. Isn't maths lovely?) </p> <p> Using the same <code>tree</code> as before, you can see that this method, too, works as expected: </p> <p> <pre>&gt; tree.Max() 1337</pre> </p> <p> So far, these two extension methods are just specialised <em>folds</em>. In Haskell, <code>Foldable</code> is a specific type class, and <code>sum</code> and <code>maximum</code> are available for all instances. 
As promised in <a href="/2019/04/29/catamorphisms">the introduction to the series</a>, though, there are some functions on trees that you can't implement using a fold. One of these is to count all the leaf nodes. You can still derive that functionality from the catamorphism, though: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">int</span>&nbsp;CountLeaves() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Cata&lt;<span style="color:blue;">int</span>&gt;((x,&nbsp;xs)&nbsp;=&gt;&nbsp;xs.Any()&nbsp;?&nbsp;xs.Sum()&nbsp;:&nbsp;1); }</pre> </p> <p> Like <code>Max</code>, the lambda expression used to implement <code>CountLeaves</code> uses <a href="https://docs.microsoft.com/dotnet/api/system.linq.enumerable.any">Any</a> to detect whether or not <code>xs</code> is empty, which is when <code>Any</code> is <code>false</code>. Empty <code>xs</code> indicates that you've found a leaf node, so return <code>1</code>. When <code>xs</code> isn't empty, it contains a collection of <code>1</code> values - one for each leaf node recursively found; add them together with <code>Sum</code>. </p> <p> This also works for the same <code>tree</code> as before: </p> <p> <pre>&gt; tree.CountLeaves() 4</pre> </p> <p> You can also measure the maximum depth of a tree: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">int</span>&nbsp;MeasureDepth() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Cata&lt;<span style="color:blue;">int</span>&gt;((x,&nbsp;xs)&nbsp;=&gt;&nbsp;xs.Any()&nbsp;?&nbsp;1&nbsp;+&nbsp;xs.Max()&nbsp;:&nbsp;0); }</pre> </p> <p> This implementation considers a leaf node to have no depth: </p> <p> <pre>&gt; <span style="color:#2b91af;">Tree</span>.Leaf(<span style="color:#a31515;">&quot;foo&quot;</span>).MeasureDepth() 0</pre> </p> <p> This is a discretionary definition; you could also argue that, by definition, a leaf node ought to have a depth of one. 
If you think so, you'll need to change the <code>0</code> to <code>1</code> in the above <code>MeasureDepth</code> implementation. </p> <p> Once more, you can use <code>Any</code> to detect leaf nodes. Whenever you find a leaf node, you return its depth, which, by definition, is <code>0</code>. Otherwise, you find the maximum depth already found among <code>xs</code>, and add <code>1</code>, because <code>xs</code> contains the maximum depths of all immediate sub-trees. </p> <p> Using the same <code>tree</code> again: </p> <p> <pre>&gt; tree.MeasureDepth() 2</pre> </p> <p> The above <code>tree</code> has the same depth for all sub-trees, so here's an example of a tilted tree: </p> <p> <pre>&gt; <span style="color:#2b91af;">Tree</span>.Create(3, . <span style="color:#2b91af;">Tree</span>.Create(1, . <span style="color:#2b91af;">Tree</span>.Leaf(0), . <span style="color:#2b91af;">Tree</span>.Leaf(0)), . <span style="color:#2b91af;">Tree</span>.Leaf(0), . <span style="color:#2b91af;">Tree</span>.Leaf(0), . <span style="color:#2b91af;">Tree</span>.Create(2, . <span style="color:#2b91af;">Tree</span>.Create(1, . <span style="color:#2b91af;">Tree</span>.Leaf(0)))) . .MeasureDepth() 3</pre> </p> <p> To make it easier to understand, I've labelled all the leaf nodes with <code>0</code>, because that's their depth. I've then labelled the other nodes with the maximum number 'under' them, plus one. That's the algorithm used. </p> <h3 id="82e5042d05534db2a67c8f7c37f78419"> Tree F-Algebra <a href="#82e5042d05534db2a67c8f7c37f78419" title="permalink">#</a> </h3> <p> As in the <a href="/2019/06/03/either-catamorphism">previous article</a>, I'll use <code>Fix</code> and <code>cata</code> as explained in <a href="https://bartoszmilewski.com">Bartosz Milewski</a>'s excellent <a href="https://bartoszmilewski.com/2017/02/28/f-algebras/">article on F-Algebras</a>. </p> <p> As always, start with the underlying endofunctor. 
I've taken some inspiration from <code>Tree a</code> from <code>Data.Tree</code>, but changed some names: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;TreeF&nbsp;a&nbsp;c&nbsp;=&nbsp;NodeF&nbsp;{&nbsp;nodeValue&nbsp;::&nbsp;a,&nbsp;nodes&nbsp;::&nbsp;ListFix&nbsp;c&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;(<span style="color:blue;">TreeF</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;f&nbsp;(NodeF&nbsp;x&nbsp;ns)&nbsp;=&nbsp;NodeF&nbsp;x&nbsp;$&nbsp;<span style="color:blue;">fmap</span>&nbsp;f&nbsp;ns</pre> </p> <p> Instead of using Haskell's standard list (<code>[]</code>) for the sub-forest, I've used <code>ListFix</code> from <a href="/2019/05/27/list-catamorphism">the article on list catamorphism</a>. This should, hopefully, demonstrate how you can build on already established definitions derived from first principles. </p> <p> As usual, I've called the 'data' type <code>a</code> and the carrier type <code>c</code> (for <em>carrier</em>). The <code>Functor</code> instance as usual translates the carrier type; the <code>fmap</code> function has the type <code>(c -&gt; c1) -&gt; TreeF a c -&gt; TreeF a c1</code>. </p> <p> As was the case when deducing the recent catamorphisms, Haskell isn't too happy about defining instances for a type like <code>Fix (TreeF a)</code>. 
To address that problem, you can introduce a <code>newtype</code> wrapper: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;TreeFix&nbsp;a&nbsp;=&nbsp;TreeFix&nbsp;{&nbsp;unTreeFix&nbsp;::&nbsp;Fix&nbsp;(TreeF&nbsp;a)&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> You can define <code>Functor</code>, <code>Applicative</code>, <code>Monad</code>, etc. instances for this type without resorting to any funky GHC extensions. Keep in mind that ultimately, the purpose of all this code is just to figure out what the catamorphism looks like. This code isn't intended for actual use. </p> <p> A pair of helper functions make it easier to define <code>TreeFix</code> values: </p> <p> <pre><span style="color:#2b91af;">leafF</span>&nbsp;::&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">TreeFix</span>&nbsp;a leafF&nbsp;x&nbsp;=&nbsp;TreeFix&nbsp;$&nbsp;Fix&nbsp;$&nbsp;NodeF&nbsp;x&nbsp;nilF <span style="color:#2b91af;">nodeF</span>&nbsp;::&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;(<span style="color:blue;">TreeFix</span>&nbsp;a)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">TreeFix</span>&nbsp;a nodeF&nbsp;x&nbsp;=&nbsp;TreeFix&nbsp;.&nbsp;Fix&nbsp;.&nbsp;NodeF&nbsp;x&nbsp;.&nbsp;<span style="color:blue;">fmap</span>&nbsp;unTreeFix</pre> </p> <p> <code>leafF</code> creates a leaf node: </p> <p> <pre>Prelude Fix List Tree&gt; leafF "ploeh" TreeFix {unTreeFix = Fix (NodeF "ploeh" (ListFix (Fix NilF)))}</pre> </p> <p> <code>nodeF</code> is a helper function to create a non-leaf node: </p> <p> <pre>Prelude Fix List Tree&gt; nodeF 4 (consF (leafF 9) nilF) TreeFix {unTreeFix = Fix (NodeF 4 (ListFix (Fix (ConsF (Fix (NodeF 9 (ListFix (Fix NilF)))) (Fix NilF)))))}</pre> </p> <p> Even with helper 
functions, construction of <code>TreeFix</code> values is cumbersome, but keep in mind that the code shown here isn't meant to be used in practice. The goal is only to deduce catamorphisms from more basic universal abstractions, and you now have all you need to do that. </p> <h3 id="ca5669298d814809a3f0d4b0422b860f"> Haskell catamorphism <a href="#ca5669298d814809a3f0d4b0422b860f" title="permalink">#</a> </h3> <p> At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (<code>TreeF a</code>), and an object <code>c</code>, but you still need to find a morphism <code>TreeF a c -&gt; c</code>. Notice that the algebra you have to find is the function that reduces the functor to its <em>carrier type</em> <code>c</code>, not the 'data type' <code>a</code>. This takes some time to get used to, but that's how catamorphisms work. This doesn't mean, however, that you get to ignore <code>a</code>, as you'll see. </p> <p> As in the previous articles, start by writing a function that will become the catamorphism, based on <code>cata</code>: </p> <p> <pre>treeF&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unTreeFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(NodeF&nbsp;x&nbsp;ns)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> While this compiles, with its <code>undefined</code> implementation of <code>alg</code>, it obviously doesn't do anything useful. I find, however, that it helps me think. How can you return a value of the type <code>c</code> from <code>alg</code>? 
You could pass a function argument to the <code>treeF</code> function and use it with <code>x</code> and <code>ns</code>: </p> <p> <pre><span style="color:#2b91af;">treeF</span>&nbsp;::&nbsp;(a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;c&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">TreeFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c treeF&nbsp;f&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unTreeFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(NodeF&nbsp;x&nbsp;ns)&nbsp;=&nbsp;f&nbsp;x&nbsp;ns</pre> </p> <p> This works. Since <code>cata</code> has the type <code>Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a</code>, that means that <code>alg</code> has the type <code>f a -&gt; a</code>. In the case of <code>TreeF</code>, the compiler infers that the <code>alg</code> function has the type <code>TreeF a c -&gt; c</code>, which is just what you need! </p> <p> You can now see what the carrier type <code>c</code> is for. It's the type that the algebra extracts, and thus the type that the catamorphism returns. </p> <p> This, then, is the catamorphism for a tree. So far in this article series, all previous catamorphisms have been pairs, but this one is just a single function. It's still not the only possible catamorphism, since you could trivially flip the arguments to <code>f</code>. </p> <p> I've chosen the representation shown here because it's isomorphic to the <code>foldTree</code> function from Haskell's built-in <code>Data.Tree</code> module, which explicitly documents that the function "is also known as the catamorphism on trees." <code>foldTree</code> is defined using Haskell's standard list type (<code>[]</code>), so the type is simpler: <code>(a -&gt; [b] -&gt; b) -&gt; Tree a -&gt; b</code>. 
The two representations of trees, <code>TreeFix</code> and <code>Tree</code> are, however, isomorphic, so <code>foldTree</code> is equivalent to <code>treeF</code>. Notice how both of these functions are also equivalent to the above C# <code>Cata</code> method. </p> <h3 id="8647c7bd03aa4d4b8a01a8252058830f"> Basis <a href="#8647c7bd03aa4d4b8a01a8252058830f" title="permalink">#</a> </h3> <p> You can implement most other useful functionality with <code>treeF</code>. Here's the <code>Functor</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;<span style="color:blue;">TreeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;f&nbsp;=&nbsp;treeF&nbsp;(nodeF&nbsp;.&nbsp;f)</pre> </p> <p> <code>nodeF . f</code> is just the <a href="https://en.wikipedia.org/wiki/Tacit_programming">point-free</a> version of <code>\x ns -&gt; nodeF (f x) ns</code>, which follows the exact same implementation logic as the above C# <code>Select</code> implementation. 
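</p> <p> To make the point-free step concrete, here's a small sketch that uses <code>foldTree</code> and the <code>Tree</code> type from <code>Data.Tree</code> instead of <code>treeF</code> and <code>TreeFix</code>. The names <code>mapPointFree</code>, <code>mapPointed</code>, <code>sample</code>, and <code>agree</code> are assumptions made for this illustration, and the sketch assumes a version of the <code>containers</code> package recent enough to export <code>foldTree</code>: </p> <p> <pre>import Data.Tree (Tree (..), foldTree)

-- Point-free map, analogous to treeF (nodeF . f).
mapPointFree :: (a -&gt; b) -&gt; Tree a -&gt; Tree b
mapPointFree f = foldTree (Node . f)

-- The same function with the lambda expression written out.
mapPointed :: (a -&gt; b) -&gt; Tree a -&gt; Tree b
mapPointed f = foldTree (\x ns -&gt; Node (f x) ns)

-- The tree of numbers used in the C# examples.
sample :: Tree Int
sample =
  Node 42
    [ Node 1337 [Node (-3) []]
    , Node 7 [Node (-99) [], Node 100 [], Node 0 []]
    ]

-- Both variants should agree with each other and with the built-in fmap.
agree :: Bool
agree =
  mapPointFree (* 2) sample == mapPointed (* 2) sample
    &amp;&amp; mapPointFree (* 2) sample == fmap (* 2) sample</pre> </p> <p> If the reasoning above holds, <code>agree</code> evaluates to <code>True</code>; this is a spot check on a single tree, not a proof.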
</p> <p> The <code>Applicative</code> instance is, I'm afraid, the most complex code you've seen so far in this article series: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Applicative</span>&nbsp;<span style="color:blue;">TreeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;pure&nbsp;=&nbsp;leafF &nbsp;&nbsp;ft&nbsp;&lt;*&gt;&nbsp;xt&nbsp;=&nbsp;treeF&nbsp;(\f&nbsp;ts&nbsp;-&gt;&nbsp;addNodes&nbsp;ts&nbsp;$&nbsp;f&nbsp;&lt;$&gt;&nbsp;xt)&nbsp;ft &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;addNodes&nbsp;ns&nbsp;(TreeFix&nbsp;(Fix&nbsp;(NodeF&nbsp;x&nbsp;xs)))&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;TreeFix&nbsp;(Fix&nbsp;(NodeF&nbsp;x&nbsp;(xs&nbsp;&lt;&gt;&nbsp;(unTreeFix&nbsp;&lt;$&gt;&nbsp;ns))))</pre> </p> <p> I'd be surprised if it's impossible to make this terser, but I thought that it was just complicated enough that I needed to make one of the steps explicit. The <code>addNodes</code> helper function has the type <code>ListFix (TreeFix a) -&gt; TreeFix a -&gt; TreeFix a</code>, and it adds a list of sub-trees to the top node of a tree. It looks worse than it is, but it really just peels off the wrappers (<code>TreeFix</code>, <code>Fix</code>, and <code>NodeF</code>) to access the data (<code>x</code> and <code>xs</code>) of the top node. It then concatenates <code>xs</code> with <code>ns</code>, and puts all the wrappers back on. </p> <p> I have to admit, though, that the <code>Applicative</code> and <code>Monad</code> instances in general are mind-bending to me. The <code>&lt;*&gt;</code> operation, particularly, takes a <em>tree of functions</em> and has to combine it with a <em>tree of values</em>. What does that even mean? How does one do that? </p> <p> Like the above, apparently. I took the <code>Applicative</code> behaviour from <code>Data.Tree</code> and made sure that my implementation is isomorphic. 
I even have a property to make 'sure' that's the case: </p> <p> <pre>testProperty&nbsp;<span style="color:#a31515;">&quot;Applicative&nbsp;behaves&nbsp;like&nbsp;Data.Tree&quot;</span>&nbsp;$&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;<span style="color:#2b91af;">xt</span>&nbsp;::&nbsp;<span style="color:blue;">TreeFix</span>&nbsp;<span style="color:#2b91af;">Integer</span>&nbsp;&lt;-&nbsp;fromTree&nbsp;&lt;$&gt;&nbsp;resize&nbsp;10&nbsp;arbitrary &nbsp;&nbsp;<span style="color:#2b91af;">ft</span>&nbsp;::&nbsp;<span style="color:blue;">TreeFix</span>&nbsp;(<span style="color:#2b91af;">Integer</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">String</span>)&nbsp;&lt;-&nbsp;fromTree&nbsp;&lt;$&gt;&nbsp;resize&nbsp;5&nbsp;arbitrary &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;ft&nbsp;&lt;*&gt;&nbsp;xt &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;expected&nbsp;=&nbsp;toTree&nbsp;ft&nbsp;&lt;*&gt;&nbsp;toTree&nbsp;xt &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;expected&nbsp;===&nbsp;toTree&nbsp;actual</pre> </p> <p> The <code>Monad</code> instance looks similar to the <code>Applicative</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Monad</span>&nbsp;<span style="color:blue;">TreeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;t&nbsp;&gt;&gt;=&nbsp;f&nbsp;=&nbsp;treeF&nbsp;(\x&nbsp;ns&nbsp;-&gt;&nbsp;addNodes&nbsp;ns&nbsp;$&nbsp;f&nbsp;x)&nbsp;t &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;addNodes&nbsp;ns&nbsp;(TreeFix&nbsp;(Fix&nbsp;(NodeF&nbsp;x&nbsp;xs)))&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;TreeFix&nbsp;(Fix&nbsp;(NodeF&nbsp;x&nbsp;(xs&nbsp;&lt;&gt;&nbsp;(unTreeFix&nbsp;&lt;$&gt;&nbsp;ns))))</pre> </p> <p> The <code>addNodes</code> helper function is the same as for <code>&lt;*&gt;</code>, so you may wonder why I didn't extract 
that as a separate, reusable function. I decided, however, to apply the <a href="https://en.wikipedia.org/wiki/Rule_of_three_(computer_programming)">rule of three</a>, and since, ultimately, <code>addNodes</code> appears only twice, I left both copies as the implementation details they are. </p> <p> Fortunately, the <code>Foldable</code> instance is easier on the eyes: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Foldable</span>&nbsp;<span style="color:blue;">TreeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;foldMap&nbsp;f&nbsp;=&nbsp;treeF&nbsp;(\x&nbsp;xs&nbsp;-&gt;&nbsp;f&nbsp;x&nbsp;&lt;&gt;&nbsp;fold&nbsp;xs)</pre> </p> <p> Since <code>f</code> is a function of the type <code>a -&gt; m</code>, where <code>m</code> is a <code>Monoid</code> instance, you can use <code>fold</code> and <code>&lt;&gt;</code> to accumulate everything to a single <code>m</code> value. </p> <p> The <code>Traversable</code> instance is similarly terse: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Traversable</span>&nbsp;<span style="color:blue;">TreeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;sequenceA&nbsp;=&nbsp;treeF&nbsp;(\x&nbsp;ns&nbsp;-&gt;&nbsp;nodeF&nbsp;&lt;$&gt;&nbsp;x&nbsp;&lt;*&gt;&nbsp;sequenceA&nbsp;ns)</pre> </p> <p> Finally, you can implement conversions to and from the <code>Tree</code> type from <code>Data.Tree</code>, using <code>ana</code> as the dual of <code>cata</code>: </p> <p> <pre><span style="color:#2b91af;">toTree</span>&nbsp;::&nbsp;<span style="color:blue;">TreeFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Tree</span>&nbsp;a toTree&nbsp;=&nbsp;treeF&nbsp;(\x&nbsp;ns&nbsp;-&gt;&nbsp;Node&nbsp;x&nbsp;$&nbsp;toList&nbsp;ns) <span style="color:#2b91af;">fromTree</span>&nbsp;::&nbsp;<span style="color:blue;">Tree</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span 
style="color:blue;">TreeFix</span>&nbsp;a fromTree&nbsp;=&nbsp;TreeFix&nbsp;.&nbsp;ana&nbsp;coalg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;coalg&nbsp;(Node&nbsp;x&nbsp;ns)&nbsp;=&nbsp;NodeF&nbsp;x&nbsp;(fromList&nbsp;ns)</pre> </p> <p> This demonstrates that <code>TreeFix</code> is isomorphic to <code>Tree</code>, which again establishes that <code>treeF</code> and <code>foldTree</code> are equivalent. </p> <h3 id="7180d37efb404a70b707f8c9b8639a35"> Relationships <a href="#7180d37efb404a70b707f8c9b8639a35" title="permalink">#</a> </h3> <p> In this series, you've seen various examples of catamorphisms of structures that have no folds, catamorphisms that coincide with folds, and catamorphisms that are more general than the fold. The introduction to the series included this diagram: </p> <p> <img src="/content/binary/catamorphism-and-fold-relations.png" alt="Catamorphisms and folds as sets, for various sum types."> </p> <p> The <a href="/2019/06/03/either-catamorphism">Either catamorphism</a> is another example of a catamorphism that is more general than the fold, but that one turned out to be identical to the <em>bifold</em>. That's not the case here, because <code>TreeFix</code> isn't a <code>Bifoldable</code> instance at all. </p> <p> There are operations on trees that you can implement with a fold, but some that you can't. Consider the tree shown in the diagram at the beginning of the article. This is also the tree that the above C# examples use. 
In Haskell, using <code>TreeFix</code>, you can define that tree like this: </p> <p> <pre>tree&nbsp;= &nbsp;&nbsp;nodeF&nbsp;42 &nbsp;&nbsp;&nbsp;&nbsp;(consF&nbsp;(nodeF&nbsp;1337 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(consF&nbsp;(leafF&nbsp;(-3))&nbsp;nilF))&nbsp;$&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(nodeF&nbsp;7 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(consF&nbsp;(leafF&nbsp;(-99))&nbsp;$ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(leafF&nbsp;100)&nbsp;$&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(leafF&nbsp;0)&nbsp;nilF)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;nilF)</pre> </p> <p> Yes, that almost looks like some Lisp dialect... </p> <p> Since <code>TreeFix</code> is <code>Foldable</code>, and that type class already comes with <code>sum</code> and <code>maximum</code> functions, no further work is required to repeat the first two of the above C# examples: </p> <p> <pre>*Tree Fix List Tree&gt; sum tree 1384 *Tree Fix List Tree&gt; maximum tree 1337</pre> </p> <p> Counting leaves, or measuring the depth of a tree, on the other hand, is impossible with the <code>Foldable</code> instance, but can be implemented using the catamorphism: </p> <p> <pre><span style="color:#2b91af;">countLeaves</span>&nbsp;::&nbsp;<span style="color:blue;">Num</span>&nbsp;n&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:blue;">TreeFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;n countLeaves&nbsp;=&nbsp;treeF&nbsp;(\_&nbsp;xs&nbsp;-&gt;&nbsp;<span style="color:blue;">if</span>&nbsp;<span style="color:blue;">null</span>&nbsp;xs&nbsp;<span style="color:blue;">then</span>&nbsp;1&nbsp;<span style="color:blue;">else</span>&nbsp;<span style="color:blue;">sum</span>&nbsp;xs) <span style="color:#2b91af;">treeDepth</span>&nbsp;::&nbsp;(<span style="color:blue;">Ord</span>&nbsp;n,&nbsp;<span style="color:blue;">Num</span>&nbsp;n)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:blue;">TreeFix</span>&nbsp;a&nbsp;<span 
style="color:blue;">-&gt;</span>&nbsp;n treeDepth&nbsp;=&nbsp;treeF&nbsp;(\_&nbsp;xs&nbsp;-&gt;&nbsp;<span style="color:blue;">if</span>&nbsp;<span style="color:blue;">null</span>&nbsp;xs&nbsp;<span style="color:blue;">then</span>&nbsp;0&nbsp;<span style="color:blue;">else</span>&nbsp;1&nbsp;+&nbsp;<span style="color:blue;">maximum</span>&nbsp;xs)</pre> </p> <p> The reasoning is the same as already explained in the above C# examples. The functions also produce the same results: </p> <p> <pre>*Tree Fix List Tree&gt; countLeaves tree 4 *Tree Fix List Tree&gt; treeDepth tree 2</pre> </p> <p> This, hopefully, illustrates that the catamorphism is more capable, and that the fold is just a (list-biased) specialisation. </p> <h3 id="66f69ba33cef4ed58d01f1f0bafef14a"> Summary <a href="#66f69ba33cef4ed58d01f1f0bafef14a" title="permalink">#</a> </h3> <p> The catamorphism for a tree is just a single function, which is recursively evaluated. It enables you to translate, traverse, and reduce trees in many interesting ways. </p> <p> You can use the catamorphism to implement a (list-biased) fold, including enumerating all nodes as a flat list, but there are operations (such as counting leaves) that you can implement with the catamorphism, but not with the fold. </p> <p> <strong>Next:</strong> <a href="/2019/08/05/rose-tree-catamorphism">Rose tree catamorphism</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Either catamorphism https://blog.ploeh.dk/2019/06/03/either-catamorphism 2019-06-03T06:05:00+00:00 Mark Seemann <div id="post"> <p> <em>The catamorphism for Either is a generalisation of its fold. The catamorphism enables operations not available via fold.</em> </p> <p> This article is part of an <a href="/2019/04/29/catamorphisms">article series about catamorphisms</a>. 
A catamorphism is a <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstraction</a> that describes how to digest a data structure into a potentially more compact value. </p> <p> This article presents the catamorphism for <a href="/2018/06/11/church-encoded-either">Either</a> (also known as <em>Result</em>), as well as how to identify it. The beginning of this article presents the catamorphism in C#, with examples. The rest of the article describes how to deduce the catamorphism. This part of the article presents my work in <a href="https://www.haskell.org">Haskell</a>. Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix. </p> <p> <em>Either</em> is a <a href="https://bartoszmilewski.com/2014/01/14/functors-are-containers">data container</a> that models two mutually exclusive results. It's often used to return values that may be either correct (<em>right</em>), or incorrect (<em>left</em>). In statically typed functional programming with a preference for total functions, Either offers a saner, more reasonable way to model success and error results than throwing exceptions. </p> <h3 id="d8214c38aac04ee7b80b9352d57d3bd1"> C# catamorphism <a href="#d8214c38aac04ee7b80b9352d57d3bd1" title="permalink">#</a> </h3> <p> This article uses <a href="/2018/06/11/church-encoded-either">the Church encoding of Either</a>. 
The catamorphism for Either is the <code>Match</code> method: </p> <p> <pre><span style="color:#2b91af;">T</span>&nbsp;Match&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;onLeft,&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">R</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;onRight);</pre> </p> <p> Until this article, all previous catamorphisms have been a pair made from an initial value and a function. The Either catamorphism is a generalisation, since it's a pair of functions. One function handles the case where there's a <em>left</em> value, and the other function handles the case where there's a <em>right</em> value. Both functions must return the same, unifying type, which is often a string or something similar that can be shown in a user interface: </p> <p> <pre>&gt; <span style="color:#2b91af;">IEither</span>&lt;<span style="color:#2b91af;">TimeSpan</span>,&nbsp;<span style="color:blue;">int</span>&gt;&nbsp;e&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Left</span>&lt;<span style="color:#2b91af;">TimeSpan</span>,&nbsp;<span style="color:blue;">int</span>&gt;(<span style="color:#2b91af;">TimeSpan</span>.FromMinutes(3)); &gt; e.Match(ts&nbsp;=&gt;&nbsp;ts.ToString(),&nbsp;i&nbsp;=&gt;&nbsp;i.ToString()) "00:03:00" &gt; <span style="color:#2b91af;">IEither</span>&lt;<span style="color:#2b91af;">TimeSpan</span>,&nbsp;<span style="color:blue;">int</span>&gt;&nbsp;e&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Right</span>&lt;<span style="color:#2b91af;">TimeSpan</span>,&nbsp;<span style="color:blue;">int</span>&gt;(42); &gt; e.Match(ts&nbsp;=&gt;&nbsp;ts.ToString(),&nbsp;i&nbsp;=&gt;&nbsp;i.ToString()) "42"</pre> </p> <p> You'll often see examples like the above that turn both left and right cases into strings or 
something that can be represented as a unifying response type, but this is in no way required. If you can come up with a unifying type, you can convert both cases to that type: </p> <p> <pre>&gt; <span style="color:#2b91af;">IEither</span>&lt;<span style="color:#2b91af;">Guid</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;e&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Left</span>&lt;<span style="color:#2b91af;">Guid</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#2b91af;">Guid</span>.NewGuid()); &gt; e.Match(g&nbsp;=&gt;&nbsp;g.ToString().Count(c&nbsp;=&gt;&nbsp;<span style="color:#a31515;">&#39;a&#39;</span>&nbsp;&lt;=&nbsp;c),&nbsp;s&nbsp;=&gt;&nbsp;s.Length) 12 &gt; <span style="color:#2b91af;">IEither</span>&lt;<span style="color:#2b91af;">Guid</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;e&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Right</span>&lt;<span style="color:#2b91af;">Guid</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;foo&quot;</span>); &gt; e.Match(g&nbsp;=&gt;&nbsp;g.ToString().Count(c&nbsp;=&gt;&nbsp;<span style="color:#a31515;">&#39;a&#39;</span>&nbsp;&lt;=&nbsp;c),&nbsp;s&nbsp;=&gt;&nbsp;s.Length) 3</pre> </p> <p> In the two above examples, you use two different functions that reduce <code>Guid</code> and <code>string</code> values, respectively, to a number. The function that turns <code>Guid</code> values into a number counts how many of the hexadecimal digits are greater than or equal to A (10). The other function simply returns the length of the <code>string</code>, if it's there. That example makes little sense, but the <code>Match</code> method doesn't care about that. </p> <p> In practical use, Either is often used for error handling. The <a href="/2018/06/11/church-encoded-either">article on the Church encoding of Either</a> contains an example. 
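</p> <p> As a minimal sketch of that error-handling use, here's the same idea in Haskell, where the built-in <code>either</code> function plays the role of <code>Match</code>. The names <code>safeDiv</code> and <code>render</code> are hypothetical and only serve the illustration: </p>

```haskell
-- Hypothetical example: division that reports failure as a Left value
-- instead of throwing an exception.
safeDiv :: Int -> Int -> Either String Int
safeDiv _ 0 = Left "division by zero"
safeDiv x y = Right (x `div` y)

-- Both cases are reduced to the same unifying type (String), in the same
-- way that the two functions passed to Match must agree on a return type.
render :: Either String Int -> String
render = either ("Error: " ++) (("Result: " ++) . show)
```

<p> With these definitions, <code>render (safeDiv 10 2)</code> evaluates to <code>"Result: 5"</code>, while <code>render (safeDiv 1 0)</code> evaluates to <code>"Error: division by zero"</code>. 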
</p> <h3 id="99e1027823114e95bebf81c08d35779f"> Either F-Algebra <a href="#99e1027823114e95bebf81c08d35779f" title="permalink">#</a> </h3> <p> As in the <a href="/2019/05/27/list-catamorphism">previous article</a>, I'll use <code>Fix</code> and <code>cata</code> as explained in <a href="https://bartoszmilewski.com">Bartosz Milewski</a>'s excellent <a href="https://bartoszmilewski.com/2017/02/28/f-algebras/">article on F-Algebras</a>. </p> <p> While F-Algebras and fixed points are mostly used for recursive data structures, you can also define an F-Algebra for a non-recursive data structure. You already saw an example of that in the articles about <a href="/2019/05/06/boolean-catamorphism">Boolean catamorphism</a> and <a href="/2019/05/20/maybe-catamorphism">Maybe catamorphism</a>. The difference between e.g. Maybe values and Either is that both cases of Either carry a value. You can model this as a <code>Functor</code> with both a carrier type and two type arguments for the data that Either may contain: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;EitherF&nbsp;l&nbsp;r&nbsp;c&nbsp;=&nbsp;LeftF&nbsp;l&nbsp;|&nbsp;RightF&nbsp;r&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;(<span style="color:blue;">EitherF</span>&nbsp;l&nbsp;r)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;&nbsp;(LeftF&nbsp;l)&nbsp;=&nbsp;LeftF&nbsp;l &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;(RightF&nbsp;r)&nbsp;=&nbsp;RightF&nbsp;r</pre> </p> <p> I chose to call the 'data types' <code>l</code> (for <em>left</em>) and <code>r</code> (for <em>right</em>), and the carrier type <code>c</code> (for <em>carrier</em>). 
As was also the case with <code>BoolF</code> and <code>MaybeF</code>, the <code>Functor</code> instance ignores the map function because the carrier type is missing from both the <code>LeftF</code> case and the <code>RightF</code> case. Like the <code>Functor</code> instances for <code>BoolF</code> and <code>MaybeF</code>, it'd seem that nothing happens, but at the type level, this is still a translation from <code>EitherF l r c</code> to <code>EitherF l r c1</code>. Not much of a function, perhaps, but definitely an <em>endofunctor</em>. </p> <p> As was also the case when deducing the Maybe and List catamorphisms, Haskell isn't too happy about defining instances for a type like <code>Fix (EitherF l r)</code>. To address that problem, you can introduce a <code>newtype</code> wrapper: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;EitherFix&nbsp;l&nbsp;r&nbsp;=&nbsp;EitherFix&nbsp;{&nbsp;unEitherFix&nbsp;::&nbsp;Fix&nbsp;(EitherF&nbsp;l&nbsp;r)&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> You can define <code>Functor</code>, <code>Applicative</code>, <code>Monad</code>, etc. instances for this type without resorting to any funky GHC extensions. Keep in mind that ultimately, the purpose of all this code is just to figure out what the catamorphism looks like. This code isn't intended for actual use. 
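</p> <p> The claim that the <code>Functor</code> instance only changes the carrier type can be checked with a small sketch. The value <code>retyped</code> is a hypothetical name, and the concrete types <code>Bool</code>, <code>Char</code>, <code>Int</code>, and <code>String</code> are arbitrary choices for the illustration: </p>

```haskell
data EitherF l r c = LeftF l | RightF r deriving (Show, Eq, Read)

instance Functor (EitherF l r) where
  fmap _ (LeftF l) = LeftF l
  fmap _ (RightF r) = RightF r

-- Mapping 'show' changes the carrier type from Int to String, but since no
-- constructor contains a carrier value, the data itself is left untouched.
retyped :: EitherF Bool Char String
retyped = fmap show (RightF 'x' :: EitherF Bool Char Int)
```

<p> Here, <code>retyped</code> is still <code>RightF 'x'</code>; only its type has changed. 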
</p> <p> A pair of helper functions make it easier to define <code>EitherFix</code> values: </p> <p> <pre><span style="color:#2b91af;">leftF</span>&nbsp;::&nbsp;l&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">EitherFix</span>&nbsp;l&nbsp;r leftF&nbsp;=&nbsp;EitherFix&nbsp;.&nbsp;Fix&nbsp;.&nbsp;LeftF <span style="color:#2b91af;">rightF</span>&nbsp;::&nbsp;r&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">EitherFix</span>&nbsp;l&nbsp;r rightF&nbsp;=&nbsp;EitherFix&nbsp;.&nbsp;Fix&nbsp;.&nbsp;RightF</pre> </p> <p> With those functions, you can create <code>EitherFix</code> values: </p> <p> <pre>Prelude Data.UUID Data.UUID.V4 Fix Either&gt; leftF &lt;$&gt; nextRandom EitherFix {unEitherFix = Fix (LeftF e65378c2-0d6e-47e0-8bcb-7cc29d185fad)} Prelude Data.UUID Data.UUID.V4 Fix Either&gt; rightF "foo" EitherFix {unEitherFix = Fix (RightF "foo")}</pre> </p> <p> That's all you need to identify the catamorphism. </p> <h3 id="67d13d05a8564f3481e181d0c32b3165"> Haskell catamorphism <a href="#67d13d05a8564f3481e181d0c32b3165" title="permalink">#</a> </h3> <p> At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (<code>EitherF l r</code>), and an object <code>c</code>, but you still need to find a morphism <code>EitherF l r c -&gt; c</code>. Notice that the algebra you have to find is the function that reduces the functor to its <em>carrier type</em> <code>c</code>, not the 'data types' <code>l</code> and <code>r</code>. This takes some time to get used to, but that's how catamorphisms work. This doesn't mean, however, that you get to ignore <code>l</code> and <code>r</code>, as you'll see. 
</p> <p> As in the previous articles, start by writing a function that will become the catamorphism, based on <code>cata</code>: </p> <p> <pre>eitherF&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unEitherFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;(LeftF&nbsp;l)&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(RightF&nbsp;r)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> While this compiles, with its <code>undefined</code> implementations, it obviously doesn't do anything useful. I find, however, that it helps me think. How can you return a value of the type <code>c</code> from the <code>LeftF</code> case? You could pass an argument to the <code>eitherF</code> function: </p> <p> <pre>eitherF&nbsp;fl&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unEitherFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;(LeftF&nbsp;l)&nbsp;=&nbsp;fl&nbsp;l &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(RightF&nbsp;r)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> While you could, technically, pass an argument of the type <code>c</code> to <code>eitherF</code> and then return that value from the <code>LeftF</code> case, that would mean that you would ignore the <code>l</code> value. This would be incorrect, so instead, make the argument a function and call it with <code>l</code>. 
Likewise, you can deal with the <code>RightF</code> case in the same way: </p> <p> <pre><span style="color:#2b91af;">eitherF</span>&nbsp;::&nbsp;(l&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;(r&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">EitherFix</span>&nbsp;l&nbsp;r&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c eitherF&nbsp;fl&nbsp;fr&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unEitherFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;(LeftF&nbsp;l)&nbsp;=&nbsp;fl&nbsp;l &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(RightF&nbsp;r)&nbsp;=&nbsp;fr&nbsp;r</pre> </p> <p> This works. Since <code>cata</code> has the type <code>Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a</code>, that means that <code>alg</code> has the type <code>f a -&gt; a</code>. In the case of <code>EitherF</code>, the compiler infers that the <code>alg</code> function has the type <code>EitherF l r c -&gt; c</code>, which is just what you need! </p> <p> You can now see what the carrier type <code>c</code> is for. It's the type that the algebra extracts, and thus the type that the catamorphism returns. </p> <p> This, then, is the catamorphism for Either. As has been consistent so far, it's a pair, but now made from two functions. As you've seen repeatedly, this isn't the only possible catamorphism, since you can, for example, trivially flip the arguments to <code>eitherF</code>. </p> <p> I've chosen the representation shown here because it's isomorphic to the <code>either</code> function from Haskell's built-in <code>Data.Either</code> module, which calls the function the "Case analysis for the <code>Either</code> type". Both of these functions (<code>eitherF</code> and <code>either</code>) are equivalent to the above C# <code>Match</code> method. 
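</p> <p> If you want to try this out, the following self-contained sketch collects the definitions in one place. <code>Fix</code> and <code>cata</code> are written the way I understand them from the F-Algebras article, the <code>deriving</code> clauses are left out to keep the sketch short, and <code>toFix</code> and <code>agrees</code> are hypothetical helpers that only exist to compare <code>eitherF</code> with the built-in <code>either</code>: </p>

```haskell
newtype Fix f = Fix { unFix :: f (Fix f) }

cata :: Functor f => (f a -> a) -> Fix f -> a
cata alg = alg . fmap (cata alg) . unFix

data EitherF l r c = LeftF l | RightF r

instance Functor (EitherF l r) where
  fmap _ (LeftF l) = LeftF l
  fmap _ (RightF r) = RightF r

newtype EitherFix l r = EitherFix { unEitherFix :: Fix (EitherF l r) }

leftF :: l -> EitherFix l r
leftF = EitherFix . Fix . LeftF

rightF :: r -> EitherFix l r
rightF = EitherFix . Fix . RightF

eitherF :: (l -> c) -> (r -> c) -> EitherFix l r -> c
eitherF fl fr = cata alg . unEitherFix
  where alg (LeftF l) = fl l
        alg (RightF r) = fr r

-- Hypothetical helper: converts a standard Either to an EitherFix value.
toFix :: Either l r -> EitherFix l r
toFix = either leftF rightF

-- Hypothetical check: eitherF and either give the same result for the
-- same pair of functions.
agrees :: Either Int String -> Bool
agrees e = eitherF show id (toFix e) == either show id e
```

<p> Both <code>agrees (Left 42)</code> and <code>agrees (Right "foo")</code> evaluate to <code>True</code>. 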
</p> <h3 id="77731d63c79543b0994c06721521b6f3"> Basis <a href="#77731d63c79543b0994c06721521b6f3" title="permalink">#</a> </h3> <p> You can implement most other useful functionality with <code>eitherF</code>. Here's the <code>Bifunctor</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Bifunctor</span>&nbsp;<span style="color:blue;">EitherFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;bimap&nbsp;f&nbsp;s&nbsp;=&nbsp;eitherF&nbsp;(leftF&nbsp;.&nbsp;f)&nbsp;(rightF&nbsp;.&nbsp;s)</pre> </p> <p> From that instance, the <code>Functor</code> instance trivially follows: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;(<span style="color:blue;">EitherFix</span>&nbsp;l)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;=&nbsp;second</pre> </p> <p> On top of <code>Functor</code> you can add <code>Applicative</code>: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Applicative</span>&nbsp;(<span style="color:blue;">EitherFix</span>&nbsp;l)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;pure&nbsp;=&nbsp;rightF &nbsp;&nbsp;f&nbsp;&lt;*&gt;&nbsp;x&nbsp;=&nbsp;eitherF&nbsp;leftF&nbsp;(&lt;$&gt;&nbsp;x)&nbsp;f</pre> </p> <p> Notice that the <code>&lt;*&gt;</code> implementation is similar to the <code>&lt;*&gt;</code> implementation for <code>MaybeFix</code>. 
The same is the case for the <code>Monad</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Monad</span>&nbsp;(<span style="color:blue;">EitherFix</span>&nbsp;l)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;x&nbsp;&gt;&gt;=&nbsp;f&nbsp;=&nbsp;eitherF&nbsp;leftF&nbsp;f&nbsp;x</pre> </p> <p> Not only is <code>EitherFix</code> <code>Foldable</code>, it's <code>Bifoldable</code>: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Bifoldable</span>&nbsp;<span style="color:blue;">EitherFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;bifoldMap&nbsp;=&nbsp;eitherF</pre> </p> <p> Notice, interestingly, that <code>bifoldMap</code> is identical to <code>eitherF</code>. </p> <p> The <code>Bifoldable</code> instance enables you to trivially implement the <code>Foldable</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Foldable</span>&nbsp;(<span style="color:blue;">EitherFix</span>&nbsp;l)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;foldMap&nbsp;=&nbsp;bifoldMap&nbsp;mempty</pre> </p> <p> You may find the presence of <code>mempty</code> puzzling, since <code>bifoldMap</code> (or <code>eitherF</code>; they're identical) takes as arguments two functions. Is <code>mempty</code> a function? </p> <p> Yes, <code>mempty</code> can be a function. Here, it is. There's a <code>Monoid</code> instance for any function <code>a -&gt; m</code>, where <code>m</code> is a <code>Monoid</code> instance, and <code>mempty</code> is the identity for that monoid. That's the instance in use here. 
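</p> <p> To see that function instance in isolation, here's a small sketch that uses the standard <code>Either</code> type and <code>bifoldMap</code> from <code>Data.Bifoldable</code> instead of <code>EitherFix</code>. The names <code>emptyFor</code> and <code>foldRight</code> are hypothetical: </p>

```haskell
import Data.Bifoldable (bifoldMap)

-- mempty at a function type: it ignores its argument and returns the
-- mempty of the result monoid (here, the empty String).
emptyFor :: Int -> String
emptyFor = mempty

-- The same trick as in the Foldable instance above, but for the standard
-- Either type: the left case is mapped to mempty, the right case to f.
foldRight :: Monoid m => (r -> m) -> Either l r -> m
foldRight = bifoldMap mempty
```

<p> With these definitions, <code>emptyFor 42</code> is <code>""</code>, and <code>foldRight show</code> returns <code>""</code> for any <em>left</em> value, while it returns the shown value for a <em>right</em> value. 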
</p> <p> Just as <code>EitherFix</code> is <code>Bifoldable</code>, it's also <code>Bitraversable</code>: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Bitraversable</span>&nbsp;<span style="color:blue;">EitherFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;bitraverse&nbsp;fl&nbsp;fr&nbsp;=&nbsp;eitherF&nbsp;(<span style="color:blue;">fmap</span>&nbsp;leftF&nbsp;.&nbsp;fl)&nbsp;(<span style="color:blue;">fmap</span>&nbsp;rightF&nbsp;.&nbsp;fr)</pre> </p> <p> You can comfortably implement the <code>Traversable</code> instance based on the <code>Bitraversable</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Traversable</span>&nbsp;(<span style="color:blue;">EitherFix</span>&nbsp;l)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;sequenceA&nbsp;=&nbsp;bisequenceA&nbsp;.&nbsp;first&nbsp;pure</pre> </p> <p> Finally, you can implement conversions to and from the standard <code>Either</code> type, using <code>ana</code> as the dual of <code>cata</code>: </p> <p> <pre><span style="color:#2b91af;">toEither</span>&nbsp;::&nbsp;<span style="color:blue;">EitherFix</span>&nbsp;l&nbsp;r&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Either</span>&nbsp;l&nbsp;r toEither&nbsp;=&nbsp;eitherF&nbsp;Left&nbsp;Right <span style="color:#2b91af;">fromEither</span>&nbsp;::&nbsp;<span style="color:#2b91af;">Either</span>&nbsp;a&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">EitherFix</span>&nbsp;a&nbsp;b fromEither&nbsp;=&nbsp;EitherFix&nbsp;.&nbsp;ana&nbsp;coalg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;coalg&nbsp;&nbsp;(Left&nbsp;l)&nbsp;=&nbsp;&nbsp;LeftF&nbsp;l &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;coalg&nbsp;(Right&nbsp;r)&nbsp;=&nbsp;RightF&nbsp;r</pre> </p> <p> This demonstrates that <code>EitherFix</code> is isomorphic to <code>Either</code>, which again establishes that 
<code>eitherF</code> and <code>either</code> are equivalent. </p> <h3 id="00a274a2317548538521d4a171191230"> Relationships <a href="#00a274a2317548538521d4a171191230" title="permalink">#</a> </h3> <p> In this series, you've seen various examples of catamorphisms of structures that have no folds, catamorphisms that coincide with folds, and now a catamorphism that is more general than the fold. The introduction to the series included this diagram: </p> <p> <img src="/content/binary/catamorphism-and-fold-relations.png" alt="Catamorphisms and folds as sets, for various sum types."> </p> <p> This shows that Boolean values and Peano numbers have catamorphisms, but no folds, whereas for Maybe and List, the fold and the catamorphism are the same. For Either, however, the fold is a special case of the catamorphism. The fold for Either 'pretends' that the left side doesn't exist. Instead, the left value is interpreted as a missing right value. Thus, in order to fold Either values, you must supply a 'fallback' value that will be used in case an Either value isn't a <em>right</em> value: </p> <p> <pre>Prelude Fix Either&gt; e = rightF LT :: EitherFix Integer Ordering Prelude Fix Either&gt; foldr (const . show) "" e "LT" Prelude Fix Either&gt; e = leftF 42 :: EitherFix Integer Ordering Prelude Fix Either&gt; foldr (const . show) "" e ""</pre> </p> <p> In a GHCi session like the above, you can create two Either values of the same type. The <em>right</em> case is an <code>Ordering</code> value, while the <em>left</em> case is an <code>Integer</code> value. </p> <p> With <code>foldr</code>, there's no way to access the <em>left</em> case. While you can access and transform the right <code>Ordering</code> value, the number <code>42</code> is simply ignored during the fold. Instead, the default value <code>""</code> is returned. 
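</p> <p> One way to make 'special case' concrete is to write the fold in terms of the catamorphism. The following sketch does that for the standard <code>Either</code> type, using the built-in <code>either</code> function. The name <code>foldrViaEither</code> is hypothetical: </p>

```haskell
-- Hypothetical helper: foldr for Either, expressed with the catamorphism.
-- The left case ignores its value and returns the fallback z; the right
-- case combines the value with the fallback.
foldrViaEither :: (r -> c -> c) -> c -> Either l r -> c
foldrViaEither f z = either (const z) (\r -> f r z)
```

<p> Going the other way isn't possible, because the function passed to <code>foldr</code> never receives the <em>left</em> value. 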
</p> <p> Contrast this with the catamorphism, which can access both cases: </p> <p> <pre>Prelude Fix Either&gt; e = rightF LT :: EitherFix Integer Ordering Prelude Fix Either&gt; eitherF show show e "LT" Prelude Fix Either&gt; e = leftF 42 :: EitherFix Integer Ordering Prelude Fix Either&gt; eitherF show show e "42"</pre> </p> <p> In a session like this, you recreate the same values, but using the catamorphism <code>eitherF</code>, you can now access and transform both the <em>left</em> and the <em>right</em> cases. In other words, the catamorphism enables you to perform operations not possible with the fold. </p> <p> It's interesting, however, to note that while the fold is a specialisation of the catamorphism, the <em>bifold</em> is identical to the catamorphism. </p> <h3 id="7512bca753b747ad9accde04bf13b6ca"> Summary <a href="#7512bca753b747ad9accde04bf13b6ca" title="permalink">#</a> </h3> <p> The catamorphism for Either is a pair of functions. One function transforms the <em>left</em> case, while the other function transforms the <em>right</em> case. For any Either value, only one of those functions will be used. </p> <p> When I originally encountered the concept of a <em>catamorphism</em>, I found it difficult to distinguish between catamorphism and fold. My problem was, I think, that the tutorials I ran into mostly used linked lists to demonstrate how, <a href="/2019/05/27/list-catamorphism">in that case</a>, the fold <em>is</em> the catamorphism. It turns out, however, that this isn't always the case. A catamorphism is a general abstraction. A fold, on the other hand, seems to me to be mostly related to collections. </p> <p> In this article you saw the first example of a catamorphism that can do more than the fold. For Either, the fold is just a special case of the catamorphism. You also saw, however, how the catamorphism was identical to the <em>bifold</em>. Thus, it's still not entirely clear how these concepts relate. 
Therefore, in the next article, you'll get an example of a <a href="https://bartoszmilewski.com/2014/01/14/functors-are-containers">container</a> where there's no bifold, and where the catamorphism is, indeed, a generalisation of the fold. </p> <p> <strong>Next:</strong> <a href="/2019/06/10/tree-catamorphism">Tree catamorphism</a>. </p> </div> List catamorphism https://blog.ploeh.dk/2019/05/27/list-catamorphism 2019-05-27T06:10:00+00:00 Mark Seemann <div id="post"> <p> <em>The catamorphism for a list is the same as its fold.</em> </p> <p> This article is part of an <a href="/2019/04/29/catamorphisms">article series about catamorphisms</a>. A catamorphism is a <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstraction</a> that describes how to digest a data structure into a potentially more compact value. </p> <p> This article presents the catamorphism for (linked) lists, and other collections in general. It also shows how to identify it. The beginning of this article presents the catamorphism in C#, with an example. The rest of the article describes how to deduce the catamorphism. This part of the article presents my work in <a href="https://www.haskell.org">Haskell</a>. Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix. </p> <p> The C# part of the article will discuss <code>IEnumerable&lt;T&gt;</code>, while the Haskell part will deal specifically with linked lists. Since C# is a less strict language anyway, we have to make some concessions when we consider how concepts translate. In my experience, the functionality of <code>IEnumerable&lt;T&gt;</code> closely mirrors that of Haskell lists. 
</p> <h3 id="3190f7181a954b6388d77f61a1dbb928"> C# catamorphism <a href="#3190f7181a954b6388d77f61a1dbb928" title="permalink">#</a> </h3> <p> The .NET base class library defines this <a href="https://docs.microsoft.com/dotnet/api/system.linq.enumerable.aggregate">Aggregate</a> overload: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">TAccumulate</span>&nbsp;Aggregate&lt;<span style="color:#2b91af;">TSource</span>,&nbsp;<span style="color:#2b91af;">TAccumulate</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">TSource</span>&gt;&nbsp;source, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">TAccumulate</span>&nbsp;seed, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">TAccumulate</span>,&nbsp;<span style="color:#2b91af;">TSource</span>,&nbsp;<span style="color:#2b91af;">TAccumulate</span>&gt;&nbsp;func);</pre> </p> <p> This is the catamorphism for linked lists (and, I'll conjecture, for <code>IEnumerable&lt;T&gt;</code> in general). The <a href="/2019/04/29/catamorphisms">introductory article</a> already used it to show several motivating examples, of which I'll only repeat the last: </p> <p> <pre>&gt; <span style="color:blue;">new</span>[]&nbsp;{&nbsp;42,&nbsp;1337,&nbsp;2112,&nbsp;90125,&nbsp;5040,&nbsp;7,&nbsp;1984&nbsp;} . .Aggregate(<span style="color:#2b91af;">Angle</span>.Identity,&nbsp;(a,&nbsp;i)&nbsp;=&gt;&nbsp;a.Add(<span style="color:#2b91af;">Angle</span>.FromDegrees(i))) [{ Angle = 207° }]</pre> </p> <p> In short, the catamorphism is, similar to the previous catamorphisms covered in this article series, a pair made from an initial value and a function. 
This has been true for both the <a href="/2019/05/13/peano-catamorphism">Peano catamorphism</a> and the <a href="/2019/05/20/maybe-catamorphism">Maybe catamorphism</a>. An initial value is just a value in all three cases, but you may notice that the function in question becomes increasingly elaborate. For <code>IEnumerable&lt;T&gt;</code>, it's a function that takes two values. One of the values is of the type of the input list's elements, i.e. for <code>IEnumerable&lt;TSource&gt;</code> it would be <code>TSource</code>. By elimination you can deduce that this value must come from the input list. The other value is of the type <code>TAccumulate</code>, which implies that it could be the <code>seed</code>, or the result from a previous call to <code>func</code>. </p> <h3 id="46afb325e03743d9ac2c2cf391607f82"> List F-Algebra <a href="#46afb325e03743d9ac2c2cf391607f82" title="permalink">#</a> </h3> <p> As in the <a href="/2019/05/20/maybe-catamorphism">previous article</a>, I'll use <code>Fix</code> and <code>cata</code> as explained in <a href="https://bartoszmilewski.com">Bartosz Milewski</a>'s excellent <a href="https://bartoszmilewski.com/2017/02/28/f-algebras/">article on F-Algebras</a>. 
The <code>ListF</code> type comes from his article as well, although I've renamed the type arguments: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;ListF&nbsp;a&nbsp;c&nbsp;=&nbsp;NilF&nbsp;|&nbsp;ConsF&nbsp;a&nbsp;c&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;(<span style="color:blue;">ListF</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;NilF&nbsp;&nbsp;=&nbsp;NilF &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;f&nbsp;(ConsF&nbsp;a&nbsp;c)&nbsp;=&nbsp;ConsF&nbsp;a&nbsp;$&nbsp;f&nbsp;c</pre> </p> <p> Like I did with <code>MaybeF</code>, I've named the 'data' type argument <code>a</code>, and the carrier type <code>c</code> (for <em>carrier</em>). Once again, notice that the <code>Functor</code> instance maps over the carrier type <code>c</code>; not over the 'data' type <code>a</code>. </p> <p> As was also the case when deducing the Maybe catamorphism, Haskell isn't too happy about defining instances for a type like <code>Fix (ListF a)</code>. To address that problem, you can introduce a <code>newtype</code> wrapper: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;ListFix&nbsp;a&nbsp;=&nbsp;ListFix&nbsp;{&nbsp;unListFix&nbsp;::&nbsp;Fix&nbsp;(ListF&nbsp;a)&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> You can define <code>Functor</code>, <code>Applicative</code>, <code>Monad</code>, etc. instances for this type without resorting to any funky GHC extensions. 
Keep in mind that ultimately, the purpose of all this code is just to figure out what the catamorphism looks like. This code isn't intended for actual use. </p> <p> A few helper functions make it easier to define <code>ListFix</code> values: </p> <p> <pre><span style="color:#2b91af;">nilF</span>&nbsp;::&nbsp;<span style="color:blue;">ListFix</span>&nbsp;a nilF&nbsp;=&nbsp;ListFix&nbsp;$&nbsp;Fix&nbsp;NilF <span style="color:#2b91af;">consF</span>&nbsp;::&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;a consF&nbsp;x&nbsp;=&nbsp;ListFix&nbsp;.&nbsp;Fix&nbsp;.&nbsp;ConsF&nbsp;x&nbsp;.&nbsp;unListFix</pre> </p> <p> With those functions, you can create <code>ListFix</code> linked lists: </p> <p> <pre>Prelude Fix List&gt; nilF ListFix {unListFix = Fix NilF} Prelude Fix List&gt; consF 42 $ consF 1337 $ consF 2112 nilF ListFix {unListFix = Fix (ConsF 42 (Fix (ConsF 1337 (Fix (ConsF 2112 (Fix NilF))))))}</pre> </p> <p> The first example creates an empty list, whereas the second creates a list of three integers, corresponding to <code>[42,1337,2112]</code>. </p> <p> That's all you need to identify the catamorphism. </p> <h3 id="db44e85b27fb40d0b570c53cf4ce1843"> Haskell catamorphism <a href="#db44e85b27fb40d0b570c53cf4ce1843" title="permalink">#</a> </h3> <p> At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (<code>ListF a</code>), and an object <code>c</code>, but you still need to find a morphism <code>ListF a c -&gt; c</code>. Notice that the algebra you have to find is the function that reduces the functor to its <em>carrier type</em> <code>c</code>, not the 'data type' <code>a</code>. This takes some time to get used to, but that's how catamorphisms work. This doesn't mean, however, that you get to ignore <code>a</code>, as you'll see. 
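</p> <p> To make the shape <code>ListF a c -&gt; c</code> concrete before deducing the general function, here's a sketch of two hand-written algebras. The names <code>sumAlg</code> and <code>lengthAlg</code> are hypothetical. Notice that in the <code>ConsF</code> case, the carrier position already holds a value of the type <code>c</code>, which you can think of as the result for the rest of the list: </p>

```haskell
data ListF a c = NilF | ConsF a c deriving (Show, Eq, Read)

-- An algebra with data type Int and carrier type Int: add the head to the
-- already-computed sum of the tail.
sumAlg :: ListF Int Int -> Int
sumAlg NilF = 0
sumAlg (ConsF h t) = h + t

-- An algebra that ignores the data value, but still has to account for it
-- by counting it: the carrier is the already-computed length of the tail.
lengthAlg :: ListF a Int -> Int
lengthAlg NilF = 0
lengthAlg (ConsF _ t) = 1 + t
```

<p> Both functions reduce a single layer of <code>ListF</code> to its carrier type. What remains is to find the general function of which they are special cases. 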
</p> <p> As in the previous article, start by writing a function that will become the catamorphism, based on <code>cata</code>: </p> <p> <pre>listF&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unListFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;NilF&nbsp;&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(ConsF&nbsp;h&nbsp;t)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> While this compiles, with its <code>undefined</code> implementations, it obviously doesn't do anything useful. I find, however, that it helps me think. How can you return a value of the type <code>c</code> from the <code>NilF</code> case? You could pass an argument to the <code>listF</code> function: </p> <p> <pre>listF&nbsp;n&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unListFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;NilF&nbsp;&nbsp;=&nbsp;n &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(ConsF&nbsp;h&nbsp;t)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> The <code>ConsF</code> case, contrary to <code>NilF</code>, contains both a head <code>h</code> (of type <code>a</code>) and a tail <code>t</code> (of type <code>c</code>). While you could make the code compile by simply returning <code>t</code>, it'd be incorrect to ignore <code>h</code>. In order to deal with both, you'll need a function that turns both <code>h</code> and <code>t</code> into a value of the type <code>c</code>. 
You can do this by passing a function to <code>listF</code> and using it: </p> <p> <pre><span style="color:#2b91af;">listF</span>&nbsp;::&nbsp;(a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c listF&nbsp;f&nbsp;n&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unListFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;NilF&nbsp;&nbsp;=&nbsp;n &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(ConsF&nbsp;h&nbsp;t)&nbsp;=&nbsp;f&nbsp;h&nbsp;t</pre> </p> <p> This works. Since <code>cata</code> has the type <code>Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a</code>, that means that <code>alg</code> has the type <code>f a -&gt; a</code>. In the case of <code>ListF</code>, the compiler infers that the <code>alg</code> function has the type <code>ListF a c -&gt; c</code>, which is just what you need! </p> <p> You can now see what the carrier type <code>c</code> is for. It's the type that the algebra extracts, and thus the type that the catamorphism returns. </p> <p> This, then, is the catamorphism for lists. As has been consistent so far, it's a pair made from an initial value and a function. 
Once more, this isn't the only possible catamorphism, since you can, for example, trivially flip the arguments to <code>listF</code>: </p> <p> <pre><span style="color:#2b91af;">listF</span>&nbsp;::&nbsp;c&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;(a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c listF&nbsp;n&nbsp;f&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unListFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;NilF&nbsp;&nbsp;=&nbsp;n &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(ConsF&nbsp;h&nbsp;t)&nbsp;=&nbsp;f&nbsp;h&nbsp;t</pre> </p> <p> You can also flip the arguments of <code>f</code>: </p> <p> <pre><span style="color:#2b91af;">listF</span>&nbsp;::&nbsp;c&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;(c&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c listF&nbsp;n&nbsp;f&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unListFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;NilF&nbsp;&nbsp;=&nbsp;n &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(ConsF&nbsp;h&nbsp;t)&nbsp;=&nbsp;f&nbsp;t&nbsp;h</pre> </p> <p> These representations are all isomorphic to each other, but notice that the last variation is similar to the above C# <code>Aggregate</code> overload. The initial <code>n</code> value is the <code>seed</code>, and the function <code>f</code> has the same shape as <code>func</code>. Thus, I consider it reasonable to conjecture that that <code>Aggregate</code> overload is the catamorphism for <code>IEnumerable&lt;T&gt;</code>. 
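</p> <p> The following self-contained sketch collects the definitions for the first variation, so that you can try it out. <code>Fix</code> and <code>cata</code> are written the way I understand them from the F-Algebras article, the <code>deriving</code> clauses are left out to keep the sketch short, and <code>fromList</code> and <code>toBuiltIn</code> are hypothetical helpers that convert to and from built-in Haskell lists: </p>

```haskell
newtype Fix f = Fix { unFix :: f (Fix f) }

cata :: Functor f => (f a -> a) -> Fix f -> a
cata alg = alg . fmap (cata alg) . unFix

data ListF a c = NilF | ConsF a c

instance Functor (ListF a) where
  fmap _ NilF = NilF
  fmap f (ConsF a c) = ConsF a $ f c

newtype ListFix a = ListFix { unListFix :: Fix (ListF a) }

nilF :: ListFix a
nilF = ListFix $ Fix NilF

consF :: a -> ListFix a -> ListFix a
consF x = ListFix . Fix . ConsF x . unListFix

listF :: (a -> c -> c) -> c -> ListFix a -> c
listF f n = cata alg . unListFix
  where alg NilF = n
        alg (ConsF h t) = f h t

-- Hypothetical helper: builds a ListFix from a built-in list.
fromList :: [a] -> ListFix a
fromList = foldr consF nilF

-- Hypothetical helper: converts back, using only the catamorphism.
toBuiltIn :: ListFix a -> [a]
toBuiltIn = listF (:) []
```

<p> With these definitions, <code>listF (+) 0 (fromList [42, 1337, 2112])</code> evaluates to <code>3491</code>, the same result as <code>foldr (+) 0 [42, 1337, 2112]</code>. 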
</p> <h3 id="3ba9143e87544443b713727eb4ea60ba"> Basis <a href="#3ba9143e87544443b713727eb4ea60ba" title="permalink">#</a> </h3> <p> You can implement most other useful functionality with <code>listF</code>. The rest of this article uses the first of the variations shown above, with the type <code>(a -&gt; c -&gt; c) -&gt; c -&gt; ListFix a -&gt; c</code>. Here's the <code>Semigroup</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Semigroup</span>&nbsp;(<span style="color:blue;">ListFix</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;xs&nbsp;&lt;&gt;&nbsp;ys&nbsp;=&nbsp;listF&nbsp;consF&nbsp;ys&nbsp;xs</pre> </p> <p> The initial value passed to <code>listF</code> is <code>ys</code>, and the function to apply is simply the <code>consF</code> function, thus 'consing' the two lists together. Here's an example of the operation in action: </p> <p> <pre>Prelude Fix List&gt; consF 42 $ consF 1337 nilF &lt;&gt; (consF 2112 $ consF 1 nilF) ListFix {unListFix = Fix (ConsF 42 (Fix (ConsF 1337 (Fix (ConsF 2112 (Fix (ConsF 1 (Fix NilF))))))))}</pre> </p> <p> With a <code>Semigroup</code> instance, it's trivial to also implement the <code>Monoid</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Monoid</span>&nbsp;(<span style="color:blue;">ListFix</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;mempty&nbsp;=&nbsp;nilF</pre> </p> <p> While you <em>could</em> implement <code>mempty</code> with <code>listF</code> (<code>mempty = listF (const id) nilF nilF</code>), that'd be overcomplicated. Just because you can implement all functionality using <code>listF</code>, it doesn't mean that you should, if a simpler alternative exists. 
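</p> <p> If you'd like to try the <code>Semigroup</code> and <code>Monoid</code> instances without the rest of the code base, here's a minimal, self-contained sketch. It makes a few assumptions: <code>Fix</code> and <code>cata</code> are repeated in the form given in the F-Algebras article, the <code>ListF</code> definition is reconstructed from the way its constructors are used above, and <code>toList</code> is only there to print results. It also assumes a compiler recent enough to have <code>Semigroup</code> in the <code>Prelude</code>: </p> <p> <pre>module Main where

-- Fixed point and catamorphism, repeated so that the sketch compiles alone.
newtype Fix f = Fix { unFix :: f (Fix f) }

cata :: Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a
cata alg = alg . fmap (cata alg) . unFix

-- Assumed shape of the list functor: element type a, carrier type c.
data ListF a c = NilF | ConsF a c

instance Functor (ListF a) where
  fmap _  NilF       = NilF
  fmap f (ConsF h t) = ConsF h (f t)

newtype ListFix a = ListFix { unListFix :: Fix (ListF a) }

nilF :: ListFix a
nilF = ListFix (Fix NilF)

consF :: a -&gt; ListFix a -&gt; ListFix a
consF x (ListFix xs) = ListFix (Fix (ConsF x xs))

listF :: (a -&gt; c -&gt; c) -&gt; c -&gt; ListFix a -&gt; c
listF f n = cata alg . unListFix
  where alg  NilF       = n
        alg (ConsF h t) = f h t

instance Semigroup (ListFix a) where
  xs &lt;&gt; ys = listF consF ys xs

instance Monoid (ListFix a) where
  mempty = nilF

-- Conversion used only to display the results.
toList :: ListFix a -&gt; [a]
toList = listF (:) []

main :: IO ()
main = do
  let xs = consF 42 (consF 1337 nilF) :: ListFix Int
      ys = consF 2112 (consF 1 nilF)
  print $ toList (xs &lt;&gt; ys)
  -- mempty should be both a left and a right identity.
  print $ toList (mempty &lt;&gt; xs) == toList xs
  print $ toList (xs &lt;&gt; mempty) == toList xs</pre> </p> <p> Running <code>main</code> prints <code>[42,1337,2112,1]</code> followed by <code>True</code> twice, which agrees with the GHCi session above and with the identity laws for <code>mempty</code>.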
</p> <p> You can, on the other hand, use <code>listF</code> for the <code>Functor</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;f&nbsp;=&nbsp;listF&nbsp;(\h&nbsp;l&nbsp;-&gt;&nbsp;consF&nbsp;(f&nbsp;h)&nbsp;l)&nbsp;nilF</pre> </p> <p> You could write the function you pass to <code>listF</code> in a point-free fashion as <code>consF . f</code>, but I thought it'd be easier to follow what happens when written as an explicit lambda expression. The function receives a 'current value' <code>h</code>, as well as the part of the list which has already been translated <code>l</code>. Use <code>f</code> to translate <code>h</code>, and <code>consF</code> the result unto <code>l</code>. </p> <p> You can add <code>Applicative</code> and <code>Monad</code> instances in a similar fashion: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Applicative</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;pure&nbsp;x&nbsp;=&nbsp;consF&nbsp;x&nbsp;nilF &nbsp;&nbsp;fs&nbsp;&lt;*&gt;&nbsp;xs&nbsp;=&nbsp;listF&nbsp;(\f&nbsp;acc&nbsp;-&gt;&nbsp;(f&nbsp;&lt;$&gt;&nbsp;xs)&nbsp;&lt;&gt;&nbsp;acc)&nbsp;nilF&nbsp;fs <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Monad</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;xs&nbsp;&gt;&gt;=&nbsp;f&nbsp;=&nbsp;listF&nbsp;(\x&nbsp;acc&nbsp;-&gt;&nbsp;f&nbsp;x&nbsp;&lt;&gt;&nbsp;acc)&nbsp;nilF&nbsp;xs</pre> </p> <p> What may be more interesting, however, is the <code>Foldable</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Foldable</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;<span 
style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">foldr</span>&nbsp;=&nbsp;listF</pre> </p> <p> This demonstrates that <code>listF</code> and <code>foldr</code> are the same. </p> <p> Next, you can also add a <code>Traversable</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Traversable</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;sequenceA&nbsp;=&nbsp;listF&nbsp;(\x&nbsp;acc&nbsp;-&gt;&nbsp;consF&nbsp;&lt;$&gt;&nbsp;x&nbsp;&lt;*&gt;&nbsp;acc)&nbsp;(pure&nbsp;nilF)</pre> </p> <p> Finally, you can implement conversions to and from the standard list <code>[]</code> type, using <code>ana</code> as the dual of <code>cata</code>: </p> <p> <pre><span style="color:#2b91af;">toList</span>&nbsp;::&nbsp;<span style="color:blue;">ListFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;[a] toList&nbsp;=&nbsp;listF&nbsp;<span style="color:#2b91af;">(:)</span>&nbsp;<span style="color:blue;">[]</span> <span style="color:#2b91af;">fromList</span>&nbsp;::&nbsp;[a]&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;a fromList&nbsp;=&nbsp;ListFix&nbsp;.&nbsp;ana&nbsp;coalg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;coalg&nbsp;&nbsp;&nbsp;<span style="color:blue;">[]</span>&nbsp;&nbsp;=&nbsp;NilF &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;coalg&nbsp;(h:t)&nbsp;=&nbsp;ConsF&nbsp;h&nbsp;t</pre> </p> <p> This demonstrates that <code>ListFix</code> is isomorphic to <code>[]</code>, which again establishes that <code>listF</code> and <code>foldr</code> are equivalent. </p> <h3 id="616c31c72b1f43cca647760d9fa8b226"> Summary <a href="#616c31c72b1f43cca647760d9fa8b226" title="permalink">#</a> </h3> <p> The catamorphism for lists is a pair made from an initial value and a function. One variation is equal to <code>foldr</code>. Like Maybe, the catamorphism is the same as the fold. 
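</p> <p> One practical consequence is that, once <code>foldr</code> is defined as the catamorphism, the rest of the <code>Foldable</code> API comes along for free. The following self-contained sketch illustrates this. It assumes that <code>Fix</code>, <code>cata</code>, and <code>ana</code> look as in the F-Algebras article, and that <code>ListF</code> has the shape implied by its constructors: </p> <p> <pre>module Main where

-- Fixed point, catamorphism, and anamorphism, repeated so the sketch
-- compiles on its own.
newtype Fix f = Fix { unFix :: f (Fix f) }

cata :: Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a
cata alg = alg . fmap (cata alg) . unFix

ana :: Functor f =&gt; (a -&gt; f a) -&gt; a -&gt; Fix f
ana coalg = Fix . fmap (ana coalg) . coalg

-- Assumed shape of the list functor: element type a, carrier type c.
data ListF a c = NilF | ConsF a c

instance Functor (ListF a) where
  fmap _  NilF       = NilF
  fmap f (ConsF h t) = ConsF h (f t)

newtype ListFix a = ListFix { unListFix :: Fix (ListF a) }

listF :: (a -&gt; c -&gt; c) -&gt; c -&gt; ListFix a -&gt; c
listF f n = cata alg . unListFix
  where alg  NilF       = n
        alg (ConsF h t) = f h t

-- The catamorphism is the fold.
instance Foldable ListFix where
  foldr = listF

fromList :: [a] -&gt; ListFix a
fromList = ListFix . ana coalg
  where coalg   []  = NilF
        coalg (h:t) = ConsF h t

main :: IO ()
main = do
  let xs = fromList [1 .. 10] :: ListFix Int
  print $ sum xs           -- derived from foldr
  print $ length xs        -- derived from foldr
  print $ elem 3 xs        -- derived from foldr
  print $ foldr (:) [] xs  -- round-trip back to a standard list</pre> </p> <p> The program prints <code>55</code>, <code>10</code>, <code>True</code>, and <code>[1,2,3,4,5,6,7,8,9,10]</code>, all computed through <code>listF</code>.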
</p> <p> In C#, this function corresponds to the <code>Aggregate</code> extension method identified above. </p> <p> You've now seen two examples where the catamorphism coincides with the fold. You've also seen examples (<a href="/2019/05/06/boolean-catamorphism">Boolean catamorphism</a> and <a href="/2019/05/13/peano-catamorphism">Peano catamorphism</a>) where there's a catamorphism, but no fold at all. In the next article, you'll see an example of a <a href="https://bartoszmilewski.com/2014/01/14/functors-are-containers">container</a> that has both catamorphism and fold, but where the catamorphism is more general than the fold. </p> <p> <strong>Next:</strong> <a href="/2019/06/03/either-catamorphism">Either catamorphism</a>. </p> </div><hr> Maybe catamorphism https://blog.ploeh.dk/2019/05/20/maybe-catamorphism 2019-05-20T06:04:00+00:00 Mark Seemann <div id="post"> <p> <em>The catamorphism for Maybe is just a simplification of its fold.</em> </p> <p> This article is part of an <a href="/2019/04/29/catamorphisms">article series about catamorphisms</a>. A catamorphism is a <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstraction</a> that describes how to digest a data structure into a potentially more compact value. </p> <p> This article presents the catamorphism for <a href="/2018/03/26/the-maybe-functor">Maybe</a>, as well as how to identify it. The beginning of this article presents the catamorphism in C#, with examples. The rest of the article describes how to deduce the catamorphism. This part of the article presents my work in <a href="https://www.haskell.org">Haskell</a>. Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix. 
</p> <p> <em>Maybe</em> is a <a href="https://bartoszmilewski.com/2014/01/14/functors-are-containers">data container</a> that models the absence or presence of a value. <a href="/2015/11/13/null-has-no-type-but-maybe-has">Contrary to null, Maybe has a type</a>, so offers a sane and reasonable way to model that situation. </p> <h3 id="1feeee3382ff44d182e0a28a33f8f80a"> C# catamorphism <a href="#1feeee3382ff44d182e0a28a33f8f80a" title="permalink">#</a> </h3> <p> This article uses <a href="/2018/06/04/church-encoded-maybe">Church-encoded Maybe</a>. Other, <a href="/2018/03/26/the-maybe-functor">alternative implementations of Maybe are possible</a>. The catamorphism for Maybe is the <code>Match</code> method: </p> <p> <pre><span style="color:#2b91af;">TResult</span>&nbsp;Match&lt;<span style="color:#2b91af;">TResult</span>&gt;(<span style="color:#2b91af;">TResult</span>&nbsp;nothing,&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;just);</pre> </p> <p> Like the <a href="/2019/05/13/peano-catamorphism">Peano catamorphism</a>, the Maybe catamorphism is a pair of a value and a function. The <code>nothing</code> value corresponds to the absence of data, whereas the <code>just</code> function handles the presence of data. 
</p> <p> Given, for example, a Maybe containing a number, you can use <code>Match</code> to <a href="/2019/02/04/how-to-get-the-value-out-of-the-monad">get the value out of the Maybe</a>: </p> <p> <pre>&gt; <span style="color:#2b91af;">IMaybe</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;maybe&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Just</span>&lt;<span style="color:blue;">int</span>&gt;(42); &gt; maybe.Match(0,&nbsp;x&nbsp;=&gt;&nbsp;x) 42 &gt; <span style="color:#2b91af;">IMaybe</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;maybe&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Nothing</span>&lt;<span style="color:blue;">int</span>&gt;(); &gt; maybe.Match(0,&nbsp;x&nbsp;=&gt;&nbsp;x) 0</pre> </p> <p> The functionality is, however, more useful than a simple <em>get-value-or-default</em> operation. Often, you don't have a good default value for the type potentially wrapped in a Maybe object. 
In the core of your application architecture, it may not be clear how to deal with, say, the absence of a <code>Reservation</code> object, whereas at the boundary of your system, it's evident how to convert both absence and presence of data into a unifying type, such as an HTTP response: </p> <p> <pre><span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Reservation</span>&gt;&nbsp;maybe&nbsp;=&nbsp;<span style="color:green;">//&nbsp;...</span> <span style="color:blue;">return</span>&nbsp;maybe &nbsp;&nbsp;&nbsp;&nbsp;.Select(r&nbsp;=&gt;&nbsp;Repository.Create(r)) &nbsp;&nbsp;&nbsp;&nbsp;.Match&lt;<span style="color:#2b91af;">IHttpActionResult</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;nothing:&nbsp;Content( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">HttpStatusCode</span>.InternalServerError, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">HttpError</span>(<span style="color:#a31515;">&quot;Couldn&#39;t&nbsp;accept.&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;just:&nbsp;id&nbsp;=&gt;&nbsp;Ok(id));</pre> </p> <p> This enables you to avoid special cases, such as null <code>Reservation</code> objects, or magic numbers like <code>-1</code> to indicate the absence of <code>id</code> values. At the boundary of an HTTP-based application, you know that you must return an HTTP response. That's the unifying type, so you can return <code>200 OK</code> with a reservation ID in the response body when data is present, and <code>500 Internal Server Error</code> when data is absent. 
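</p> <p> For readers who prefer Haskell, the same idea can be sketched with the built-in <code>maybe</code> function. Everything application-specific here is hypothetical: the <code>HttpResult</code> type and the <code>toHttpResult</code> function are invented for illustration, and only stand in for a real HTTP response type: </p> <p> <pre>module Main where

-- Hypothetical unifying type, standing in for an HTTP response.
data HttpResult
  = Ok Int                      -- 200 OK with a reservation ID
  | InternalServerError String  -- 500 with an error message
  deriving (Show, Eq)

-- Both absence and presence of an ID are converted to the same type.
toHttpResult :: Maybe Int -&gt; HttpResult
toHttpResult = maybe (InternalServerError "Couldn't accept.") Ok

main :: IO ()
main = do
  print $ toHttpResult (Just 1337)
  print $ toHttpResult Nothing</pre> </p> <p> This prints <code>Ok 1337</code> and <code>InternalServerError "Couldn't accept."</code>. Neither a null reference nor a magic number is required, because both cases end up as an <code>HttpResult</code>.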
</p> <h3 id="87d91da8944f4eb5b8b24e9ea20d3e1b"> Maybe F-Algebra <a href="#87d91da8944f4eb5b8b24e9ea20d3e1b" title="permalink">#</a> </h3> <p> As in the <a href="/2019/05/13/peano-catamorphism">previous article</a>, I'll use <code>Fix</code> and <code>cata</code> as explained in <a href="https://bartoszmilewski.com">Bartosz Milewski</a>'s excellent <a href="https://bartoszmilewski.com/2017/02/28/f-algebras/">article on F-Algebras</a>. </p> <p> While F-Algebras and fixed points are mostly used for recursive data structures, you can also define an F-Algebra for a non-recursive data structure. You already saw an example of that in the article about <a href="/2019/05/06/boolean-catamorphism">Boolean catamorphism</a>. The difference between Boolean values and Maybe is that the <em>just</em> case of Maybe carries a value. You can model this as a <code>Functor</code> with both a carrier type and a type argument for the data that Maybe may contain: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;MaybeF&nbsp;a&nbsp;c&nbsp;=&nbsp;NothingF&nbsp;|&nbsp;JustF&nbsp;a&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;(<span style="color:blue;">MaybeF</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;&nbsp;NothingF&nbsp;=&nbsp;NothingF &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;(JustF&nbsp;x)&nbsp;=&nbsp;JustF&nbsp;x</pre> </p> <p> I chose to call the 'data type' <code>a</code> and the carrier type <code>c</code> (for <em>carrier</em>). As was also the case with <code>BoolF</code>, the <code>Functor</code> instance ignores the map function because the carrier type is missing from both the <code>NothingF</code> case and the <code>JustF</code> case. 
Like the <code>Functor</code> instance for <code>BoolF</code>, it'd seem that nothing happens, but at the type level, this is still a translation from <code>MaybeF a c</code> to <code>MaybeF a c1</code>. Not much of a function, perhaps, but definitely an <em>endofunctor</em>. </p> <p> In the previous articles, it was possible to work directly with the fixed points of both functors; i.e. <code>Fix BoolF</code> and <code>Fix NatF</code>. Haskell isn't happy about attempts to define various instances for <code>Fix (MaybeF a)</code>, so in order to make this easier, you can define a <code>newtype</code> wrapper: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;MaybeFix&nbsp;a&nbsp;= &nbsp;&nbsp;MaybeFix&nbsp;{&nbsp;unMaybeFix&nbsp;::&nbsp;Fix&nbsp;(MaybeF&nbsp;a)&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> In order to make it easier to work with <code>MaybeFix</code> you can add helper functions to create values: </p> <p> <pre><span style="color:#2b91af;">nothingF</span>&nbsp;::&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;a nothingF&nbsp;=&nbsp;MaybeFix&nbsp;$&nbsp;Fix&nbsp;NothingF <span style="color:#2b91af;">justF</span>&nbsp;::&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;a justF&nbsp;=&nbsp;MaybeFix&nbsp;.&nbsp;Fix&nbsp;.&nbsp;JustF</pre> </p> <p> You can now create <code>MaybeFix</code> values to your heart's content: </p> <p> <pre>Prelude Fix Maybe&gt; justF 42 MaybeFix {unMaybeFix = Fix (JustF 42)} Prelude Fix Maybe&gt; nothingF MaybeFix {unMaybeFix = Fix NothingF}</pre> </p> <p> That's all you need to identify the catamorphism. 
</p> <h3 id="24db4c715d1f4540bd8f87604819952f"> Haskell catamorphism <a href="#24db4c715d1f4540bd8f87604819952f" title="permalink">#</a> </h3> <p> At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (<code>MaybeF</code>), and an object <code>a</code>, but you still need to find a morphism <code>MaybeF a c -&gt; c</code>. Notice that the algebra you have to find is the function that reduces the functor to its <em>carrier type</em> <code>c</code>, not the 'data type' <code>a</code>. This takes some time to get used to, but that's how catamorphisms work. This doesn't mean, however, that you get to ignore <code>a</code>, as you'll see. </p> <p> As in the previous article, start by writing a function that will become the catamorphism, based on <code>cata</code>: </p> <p> <pre>maybeF&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unMaybeFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;NothingF&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(JustF&nbsp;x)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> While this compiles, with its <code>undefined</code> implementations, it obviously doesn't do anything useful. I find, however, that it helps me think. How can you return a value of the type <code>c</code> from the <code>NothingF</code> case? You could pass an argument to the <code>maybeF</code> function: </p> <p> <pre>maybeF&nbsp;n&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unMaybeFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;NothingF&nbsp;=&nbsp;n &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(JustF&nbsp;x)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> The <code>JustF</code> case, contrary to <code>NothingF</code>, already contains a value, and it'd be incorrect to ignore it. 
On the other hand, <code>x</code> is a value of type <code>a</code>, and you need to return a value of type <code>c</code>. You'll need a function to perform the conversion, so pass such a function as an argument to <code>maybeF</code> as well: </p> <p> <pre><span style="color:#2b91af;">maybeF</span>&nbsp;::&nbsp;c&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;(a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c maybeF&nbsp;n&nbsp;f&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unMaybeFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;NothingF&nbsp;=&nbsp;n &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(JustF&nbsp;x)&nbsp;=&nbsp;f&nbsp;x</pre> </p> <p> This works. Since <code>cata</code> has the type <code>Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a</code>, that means that <code>alg</code> has the type <code>f a -&gt; a</code>. In the case of <code>MaybeF</code>, the compiler infers that the <code>alg</code> function has the type <code>MaybeF a c -&gt; c</code>, which is just what you need! </p> <p> You can now see what the carrier type <code>c</code> is for. It's the type that the algebra extracts, and thus the type that the catamorphism returns. </p> <p> Notice that <code>maybeF</code>, like the above C# <code>Match</code> method, takes as arguments a pair of a value and a function (together with the Maybe value itself). Those are two representations of the same idea. Furthermore, this is nearly identical to the <code>maybe</code> function in Haskell's <code>Data.Maybe</code> module. I found it fitting, therefore, to name the function <code>maybeF</code>. </p> <h3 id="d8a0eed800de48a994085c419b7b5379"> Basis <a href="#d8a0eed800de48a994085c419b7b5379" title="permalink">#</a> </h3> <p> You can implement most other useful functionality with <code>maybeF</code>. 
Here's the <code>Functor</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;f&nbsp;=&nbsp;maybeF&nbsp;nothingF&nbsp;(justF&nbsp;.&nbsp;f)</pre> </p> <p> Since <code>fmap</code> should be a structure-preserving map, you'll have to map the <em>nothing</em> case to the <em>nothing</em> case, and <em>just</em> to <em>just</em>. When calling <code>maybeF</code>, you must supply a value for the <em>nothing</em> case and a function to deal with the <em>just</em> case. The <em>nothing</em> case is easy to handle: just use <code>nothingF</code>. </p> <p> In the <em>just</em> case, first apply the function <code>f</code> to map from <code>a</code> to <code>b</code>, and then use <code>justF</code> to wrap the <code>b</code> value in a new <code>MaybeFix</code> container to get <code>MaybeFix b</code>. </p> <p> <code>Applicative</code> is a little harder, but not much: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Applicative</span>&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;pure&nbsp;=&nbsp;justF &nbsp;&nbsp;f&nbsp;&lt;*&gt;&nbsp;x&nbsp;=&nbsp;maybeF&nbsp;nothingF&nbsp;(&lt;$&gt;&nbsp;x)&nbsp;f</pre> </p> <p> The <code>pure</code> function is just <em>justF</em> (pun intended). The <em>apply</em> operator <code>&lt;*&gt;</code> is more complex. </p> <p> Both <code>f</code> and <code>x</code> surrounding <code>&lt;*&gt;</code> are <code>MaybeFix</code> values: <code>f</code> is <code>MaybeFix (a -&gt; b)</code>, and <code>x</code> is <code>MaybeFix a</code>. While it's becoming increasingly clear that you can use a catamorphism like <code>maybeF</code> to implement most other functionality, to which <code>MaybeFix</code> value should you apply it? 
To <code>f</code> or <code>x</code>? </p> <p> Both are possible, but the code looks (in my opinion) more readable if you apply it to <code>f</code>. Again, when <code>f</code> is <em>nothing</em>, return <code>nothingF</code>. When <code>f</code> is <em>just</em>, use the functor instance to map <code>x</code> (using the infix <code>fmap</code> alias <code>&lt;$&gt;</code>). </p> <p> The <code>Monad</code> instance, on the other hand, is almost trivial: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Monad</span>&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;x&nbsp;&gt;&gt;=&nbsp;f&nbsp;=&nbsp;maybeF&nbsp;nothingF&nbsp;f&nbsp;x</pre> </p> <p> As usual, map <em>nothing</em> to <em>nothing</em> by supplying <code>nothingF</code>. <code>f</code> is already a function that returns a <code>MaybeFix b</code> value, so just use that. </p> <p> The <code>Foldable</code> instance is likewise trivial (although, as you'll see below, you can make it even more trivial): </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Foldable</span>&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;foldMap&nbsp;=&nbsp;maybeF&nbsp;mempty</pre> </p> <p> The <code>foldMap</code> function must return a <code>Monoid</code> instance, so for the <em>nothing</em> case, simply return the identity, <em>mempty</em>. Furthermore, <code>foldMap</code> takes a function <code>a -&gt; m</code>, but since the <code>foldMap</code> implementation is <a href="https://en.wikipedia.org/wiki/Tacit_programming">point-free</a>, you can't 'see' that function as an argument. 
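</p> <p> To see the instances defined so far working together, you can collect them in a single module. This is only a sketch: <code>Fix</code> and <code>cata</code> are repeated in the form given in the F-Algebras article so that the code compiles alone, the <code>deriving</code> clauses are left out for brevity, and <code>safeDiv</code> is a hypothetical helper invented to exercise the <code>Monad</code> instance: </p> <p> <pre>module Main where

newtype Fix f = Fix { unFix :: f (Fix f) }

cata :: Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a
cata alg = alg . fmap (cata alg) . unFix

data MaybeF a c = NothingF | JustF a

instance Functor (MaybeF a) where
  fmap _  NothingF = NothingF
  fmap _ (JustF x) = JustF x

newtype MaybeFix a = MaybeFix { unMaybeFix :: Fix (MaybeF a) }

nothingF :: MaybeFix a
nothingF = MaybeFix $ Fix NothingF

justF :: a -&gt; MaybeFix a
justF = MaybeFix . Fix . JustF

maybeF :: c -&gt; (a -&gt; c) -&gt; MaybeFix a -&gt; c
maybeF n f = cata alg . unMaybeFix
  where alg  NothingF = n
        alg (JustF x) = f x

instance Functor MaybeFix where
  fmap f = maybeF nothingF (justF . f)

instance Applicative MaybeFix where
  pure = justF
  f &lt;*&gt; x = maybeF nothingF (&lt;$&gt; x) f

instance Monad MaybeFix where
  x &gt;&gt;= f = maybeF nothingF f x

instance Foldable MaybeFix where
  foldMap = maybeF mempty

-- Hypothetical helper: division that yields nothing for a zero divisor.
safeDiv :: Int -&gt; Int -&gt; MaybeFix Int
safeDiv _ 0 = nothingF
safeDiv x y = justF (x `div` y)

main :: IO ()
main = do
  print $ maybeF 0 id (succ &lt;$&gt; justF 41)         -- Functor
  print $ maybeF 0 id (justF (+ 1) &lt;*&gt; justF 41)  -- Applicative
  print $ maybeF 0 id (justF 84 &gt;&gt;= safeDiv 3528) -- Monad, just case
  print $ maybeF 0 id (justF 0 &gt;&gt;= safeDiv 3528)  -- Monad, nothing case
  print $ sum (justF 42)                          -- Foldable
  print $ length (nothingF :: MaybeFix Int)       -- Foldable</pre> </p> <p> The program prints <code>42</code> three times, then <code>0</code>, <code>42</code>, and <code>0</code>. Every result passes through <code>maybeF</code>, either directly or via one of the instances built on it.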
</p> <p> Finally, for the sake of completeness, here's the <code>Traversable</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Traversable</span>&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;sequenceA&nbsp;=&nbsp;maybeF&nbsp;(pure&nbsp;nothingF)&nbsp;(justF&nbsp;&lt;$&gt;)</pre> </p> <p> In the <em>nothing</em> case, you can put <code>nothingF</code> into the desired <code>Applicative</code> with <code>pure</code>. In the <em>just</em> case you can take advantage of the desired <code>Applicative</code> being also a <code>Functor</code> by simply mapping the inner value(s) with <code>justF</code>. </p> <p> Since the <code>Applicative</code> instance for <code>MaybeFix</code> defines <code>pure</code> as <code>justF</code>, you could alternatively write the <code>Traversable</code> instance like this: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Traversable</span>&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;sequenceA&nbsp;=&nbsp;maybeF&nbsp;(pure&nbsp;nothingF)&nbsp;(pure&nbsp;&lt;$&gt;)</pre> </p> <p> I like this alternative less, since I find it confusing. The two appearances of <code>pure</code> relate to two different types. The <code>pure</code> in <code>pure nothingF</code> has the type <code>MaybeFix a -&gt; f (MaybeFix a)</code>, while the <code>pure</code> in <code>pure&nbsp;&lt;$&gt;</code> has the type <code>a -&gt; MaybeFix a</code>! </p> <p> Both implementations work the same, though: </p> <p> <pre>Prelude Fix Maybe&gt; sequenceA (justF ("foo", 42)) ("foo",MaybeFix {unMaybeFix = Fix (JustF 42)})</pre> </p> <p> Here, I'm using the <code>Applicative</code> instance of <code>(,) String</code>. 
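</p> <p> The same instance also works with other applicative functors, such as lists. The following self-contained sketch shows this under a few assumptions: <code>Fix</code> and <code>cata</code> are repeated so the module compiles alone, and results are converted to the standard <code>Maybe</code> type purely so they can be printed without <code>Show</code> instances for the fixed-point types: </p> <p> <pre>module Main where

newtype Fix f = Fix { unFix :: f (Fix f) }

cata :: Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a
cata alg = alg . fmap (cata alg) . unFix

data MaybeF a c = NothingF | JustF a

instance Functor (MaybeF a) where
  fmap _  NothingF = NothingF
  fmap _ (JustF x) = JustF x

newtype MaybeFix a = MaybeFix { unMaybeFix :: Fix (MaybeF a) }

nothingF :: MaybeFix a
nothingF = MaybeFix $ Fix NothingF

justF :: a -&gt; MaybeFix a
justF = MaybeFix . Fix . JustF

maybeF :: c -&gt; (a -&gt; c) -&gt; MaybeFix a -&gt; c
maybeF n f = cata alg . unMaybeFix
  where alg  NothingF = n
        alg (JustF x) = f x

-- Functor and Foldable are superclasses of Traversable.
instance Functor MaybeFix where
  fmap f = maybeF nothingF (justF . f)

instance Foldable MaybeFix where
  foldMap = maybeF mempty

instance Traversable MaybeFix where
  sequenceA = maybeF (pure nothingF) (justF &lt;$&gt;)

-- Used only to display results.
toMaybe :: MaybeFix a -&gt; Maybe a
toMaybe = maybeF Nothing Just

main :: IO ()
main = do
  -- The Applicative instance of (,) String, as in the GHCi session.
  print $ toMaybe &lt;$&gt; sequenceA (justF ("foo", 42 :: Int))
  -- The Applicative instance of lists.
  print $ toMaybe &lt;$&gt; sequenceA (justF [1, 2, 3 :: Int])
  print $ toMaybe &lt;$&gt; sequenceA (nothingF :: MaybeFix [Int])</pre> </p> <p> This prints <code>("foo",Just 42)</code>, <code>[Just 1,Just 2,Just 3]</code>, and <code>[Nothing]</code>. The last result comes from <code>pure nothingF</code> in the list applicative.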
</p> <p> Finally, you can implement conversions to and from the standard <code>Maybe</code> type, using <code>ana</code> as the dual of <code>cata</code>: </p> <p> <pre><span style="color:#2b91af;">toMaybe</span>&nbsp;::&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;a toMaybe&nbsp;=&nbsp;maybeF&nbsp;Nothing&nbsp;<span style="color:blue;">return</span> <span style="color:#2b91af;">fromMaybe</span>&nbsp;::&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;a fromMaybe&nbsp;=&nbsp;MaybeFix&nbsp;.&nbsp;ana&nbsp;coalg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;coalg&nbsp;&nbsp;Nothing&nbsp;=&nbsp;NothingF &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;coalg&nbsp;(Just&nbsp;x)&nbsp;=&nbsp;JustF&nbsp;x</pre> </p> <p> This demonstrates that <code>MaybeFix</code> is isomorphic to <code>Maybe</code>, which again establishes that <code>maybeF</code> and <code>maybe</code> are equivalent. </p> <h3 id="2ec047e5122b4750a10cbe2012285524"> Alternatives <a href="#2ec047e5122b4750a10cbe2012285524" title="permalink">#</a> </h3> <p> As usual, the above <code>maybeF</code> isn't the only possible catamorphism. A trivial variation is to flip its arguments, but other variations exist. </p> <p> It's a recurring observation that a catamorphism is just a generalisation of a <em>fold</em>. 
In the above code, the <code>Foldable</code> instance already looked as simple as anyone could desire, but another variation of a catamorphism for Maybe is this gratuitously embellished definition: </p> <p> <pre><span style="color:#2b91af;">maybeF</span>&nbsp;::&nbsp;(a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c maybeF&nbsp;f&nbsp;n&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unMaybeFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;NothingF&nbsp;=&nbsp;n &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(JustF&nbsp;x)&nbsp;=&nbsp;f&nbsp;x&nbsp;n</pre> </p> <p> This variation redundantly passes <code>n</code> as an argument to <code>f</code>, thereby changing the type of <code>f</code> to <code>a -&gt; c -&gt; c</code>. There's no particular motivation for doing this, apart from establishing that this catamorphism is exactly the same as the fold: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Foldable</span>&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">foldr</span>&nbsp;=&nbsp;maybeF</pre> </p> <p> You can still implement the other instances as well, but the rest of the code suffers in clarity. 
Here's a few examples: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;f&nbsp;=&nbsp;maybeF&nbsp;(<span style="color:blue;">const</span>&nbsp;.&nbsp;justF&nbsp;.&nbsp;f)&nbsp;nothingF <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Applicative</span>&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;pure&nbsp;=&nbsp;justF &nbsp;&nbsp;f&nbsp;&lt;*&gt;&nbsp;x&nbsp;=&nbsp;maybeF&nbsp;(<span style="color:blue;">const</span>&nbsp;.&nbsp;(&lt;$&gt;&nbsp;x))&nbsp;nothingF&nbsp;f <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Monad</span>&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;x&nbsp;&gt;&gt;=&nbsp;f&nbsp;=&nbsp;maybeF&nbsp;(<span style="color:blue;">const</span>&nbsp;.&nbsp;f)&nbsp;nothingF&nbsp;x</pre> </p> <p> I find that the need to compose with <code>const</code> does nothing to improve the readability of the code, so this variation is mostly, I think, of academic interest. 
It does show, though, that the catamorphism of Maybe is isomorphic to its fold, as the diagram in the overview article indicated: </p> <p> <img src="/content/binary/catamorphism-and-fold-relations.png" alt="Catamorphisms and folds as sets, for various sum types."> </p> <p> You can demonstrate that this variation, too, is isomorphic to <code>Maybe</code> with a set of conversions: </p> <p> <pre><span style="color:#2b91af;">toMaybe</span>&nbsp;::&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;a toMaybe&nbsp;=&nbsp;maybeF&nbsp;(<span style="color:blue;">const</span>&nbsp;.&nbsp;<span style="color:blue;">return</span>)&nbsp;Nothing <span style="color:#2b91af;">fromMaybe</span>&nbsp;::&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">MaybeFix</span>&nbsp;a fromMaybe&nbsp;=&nbsp;MaybeFix&nbsp;.&nbsp;ana&nbsp;coalg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;coalg&nbsp;&nbsp;Nothing&nbsp;=&nbsp;NothingF &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;coalg&nbsp;(Just&nbsp;x)&nbsp;=&nbsp;JustF&nbsp;x</pre> </p> <p> Only <code>toMaybe</code> has changed, compared to above; the <code>fromMaybe</code> function remains the same. The only change to <code>toMaybe</code> is that the arguments have been flipped, and <code>return</code> is now composed with <code>const</code>. </p> <p> Since (according to <a href="http://amzn.to/13tGJ0f">Conceptual Mathematics</a>) isomorphisms are transitive, this means that the two variations of <code>maybeF</code> are isomorphic. The latter, more complex, variation of <code>maybeF</code> is identical to <code>foldr</code>, so we can consider the simpler, more frequently encountered variation a simplification of <em>fold</em>. 
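</p> <p> A compact way to see the relationship is to define the simple variation in terms of the fold-shaped one. The following is a sketch under stated assumptions: <code>Fix</code> and <code>cata</code> are repeated so the module compiles alone, and <code>maybeSimple</code> is a name invented here for the value-and-function variation: </p> <p> <pre>module Main where

newtype Fix f = Fix { unFix :: f (Fix f) }

cata :: Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a
cata alg = alg . fmap (cata alg) . unFix

data MaybeF a c = NothingF | JustF a

instance Functor (MaybeF a) where
  fmap _  NothingF = NothingF
  fmap _ (JustF x) = JustF x

newtype MaybeFix a = MaybeFix { unMaybeFix :: Fix (MaybeF a) }

nothingF :: MaybeFix a
nothingF = MaybeFix $ Fix NothingF

justF :: a -&gt; MaybeFix a
justF = MaybeFix . Fix . JustF

-- The fold-shaped variation: f also receives the initial value n.
maybeF :: (a -&gt; c -&gt; c) -&gt; c -&gt; MaybeFix a -&gt; c
maybeF f n = cata alg . unMaybeFix
  where alg  NothingF = n
        alg (JustF x) = f x n

instance Foldable MaybeFix where
  foldr = maybeF

-- Hypothetical name: the simpler variation, recovered by ignoring n.
maybeSimple :: c -&gt; (a -&gt; c) -&gt; MaybeFix a -&gt; c
maybeSimple n f = maybeF (const . f) n

main :: IO ()
main = do
  print $ foldr (:) [] (justF (42 :: Int))
  print $ sum (justF (42 :: Int))
  print $ maybeSimple 0 (+ 1) (justF (41 :: Int))
  print $ maybeSimple 0 (+ 1) (nothingF :: MaybeFix Int)</pre> </p> <p> This prints <code>[42]</code>, <code>42</code>, <code>42</code>, and <code>0</code>. Composing with <code>const</code> is all it takes to get from the fold back to the simpler catamorphism.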
</p> <h3 id="f88757d425a04e97956d89270b32c0c0"> Summary <a href="#f88757d425a04e97956d89270b32c0c0" title="permalink">#</a> </h3> <p> The catamorphism for Maybe is the same as its Church encoding: a pair made from a default value and a function. In Haskell's base library, this is simply called <code>maybe</code>. In the above C# code, it's called <code>Match</code>. </p> <p> This function is total, and you can implement any other functionality you need with it. I therefore consider it the canonical representation of Maybe, which is also why it annoys me that most Maybe implementations come equipped with partial functions like <code>fromJust</code>, or F#'s <code>Option.get</code>. Those functions shouldn't be part of a sane and reasonable Maybe API. You shouldn't need them. </p> <p> In this series of articles about catamorphisms, you've now seen the first example of catamorphism and fold coinciding. In the next article, you'll see another such example - probably the most well-known catamorphism example of them all. </p> <p> <strong>Next:</strong> <a href="/2019/05/27/list-catamorphism">List catamorphism</a>. </p> </div><hr> Peano catamorphism https://blog.ploeh.dk/2019/05/13/peano-catamorphism 2019-05-13T05:10:00+00:00 Mark Seemann <div id="post"> <p> <em>The catamorphism for Peano numbers involves a base value and a successor function.</em> </p> <p> This article is part of an <a href="/2019/04/29/catamorphisms">article series about catamorphisms</a>. A catamorphism is a <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstraction</a> that describes how to digest a data structure into a potentially more compact value. </p> <p> This article presents the catamorphism for <a href="https://en.wikipedia.org/wiki/Natural_number">natural numbers</a>, as well as how to identify it. 
The beginning of the article presents the catamorphism in C#, with examples. The rest of the article describes how I deduced the catamorphism. This part of the article presents my work in <a href="https://www.haskell.org">Haskell</a>. Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix. </p> <h3 id="742f4b9c0c014152a933961c10f98b66"> C# catamorphism <a href="#742f4b9c0c014152a933961c10f98b66" title="permalink">#</a> </h3> <p> In this article, I model natural numbers using <a href="https://en.wikipedia.org/wiki/Peano_axioms">Peano's model</a>, and I'll reuse the <a href="/2018/05/28/church-encoded-natural-numbers">Church-encoded implementation you've seen before</a>. The catamorphism for <code>INaturalNumber</code> is: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">T</span>&nbsp;Cata&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">INaturalNumber</span>&nbsp;n,&nbsp;<span style="color:#2b91af;">T</span>&nbsp;zero,&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;succ) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;n.Match(zero,&nbsp;p&nbsp;=&gt;&nbsp;p.Cata(succ(zero),&nbsp;succ)); }</pre> </p> <p> Notice that this is an extension method on <code>INaturalNumber</code>, taking two other arguments: a <code>zero</code> argument which will be returned when the number is <em>zero</em>, and a successor function to return the 'next' value based on a previous value. </p> <p> The <code>zero</code> argument is the easiest to understand. It's simply passed to <code>Match</code> so that this is the value that <code>Cata</code> returns when <code>n</code> is <em>zero</em>. 
</p> <p> The other argument to <code>Match</code> must be a <code>Func&lt;INaturalNumber, T&gt;</code>; that is, a function that takes an <code>INaturalNumber</code> as input and returns a value of the type <code>T</code>. You can supply such a function by using a lambda expression. This expression receives a predecessor <code>p</code> as input, and has to return a value of the type <code>T</code>. The only function available in this context, however, is <code>succ</code>, which has the type <code>Func&lt;T, T&gt;</code>. How can you make that work? </p> <p> As is often the case when programming with generics, it pays to <em>follow the types</em>. A <code>Func&lt;T, T&gt;</code> requires an input of the type <code>T</code>. Do you have any <code>T</code> objects around? </p> <p> The only available <code>T</code> object is <code>zero</code>, so you could call <code>succ(zero)</code> to produce another <code>T</code> value. While you could return that immediately, that'd ignore the predecessor <code>p</code>, so that's not going to work. Another option, which is the one that works, is to recursively call <code>Cata</code> on <code>p</code> with <code>succ(zero)</code> as the <code>zero</code> value, and <code>succ</code> as the second argument. </p> <p> What this accomplishes is that <code>Cata</code> keeps recursively calling itself until <code>n</code> is <em>zero</em>. The value passed as <code>zero</code>, however, accumulates one application of <code>succ</code> for each recursive call. In other words, <code>succ</code> will be called as many times as the natural number indicates. If <code>n</code> is 7, then <code>succ</code> will be called seven times, the first time with the original <code>zero</code> value, the next time with the result of <code>succ(zero)</code>, the third time with the result of <code>succ(succ(zero))</code>, and so on. If the number is 42, <code>succ</code> will be called 42 times. 
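</p> <p> To watch that accumulation happen, here is a minimal Haskell sketch of the same recursion. It assumes a plain Peano type rather than the Church-encoded <code>INaturalNumber</code>, and the names <code>Nat</code>, <code>cataNat</code>, <code>three</code>, and <code>applications</code> are hypothetical, chosen only for this illustration: </p> <p> <pre>-- A plain Peano type. Nat, Zero and Succ are hypothetical names for this sketch.
data Nat = Zero | Succ Nat

-- Mirrors the shape of the C# Cata method: every Succ layer that is
-- peeled off applies 'next' once to the running 'z' value.
cataNat :: a -&gt; (a -&gt; a) -&gt; Nat -&gt; a
cataNat z _    Zero     = z
cataNat z next (Succ p) = cataNat (next z) next p

three :: Nat
three = Succ (Succ (Succ Zero))

-- Record each application as text, so that the nesting becomes visible.
applications :: String
applications = cataNat "zero" (\x -&gt; "succ(" ++ x ++ ")") three
-- applications == "succ(succ(succ(zero)))"</pre> </p> <p> If you load this in GHCi and evaluate <code>applications</code>, you should see three nested applications around the original value, one for each <code>Succ</code> layer. 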
</p> <h3 id="633dae2048cd45ebaa17962710048c67"> Arithmetic <a href="#633dae2048cd45ebaa17962710048c67" title="permalink">#</a> </h3> <p> You can implement all the functionality you saw in the article on Church-encoded natural numbers. You can start gently by converting a Peano number into a normal C# <code>int</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">int</span>&nbsp;Count(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">INaturalNumber</span>&nbsp;n) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;n.Cata(0,&nbsp;x&nbsp;=&gt;&nbsp;1&nbsp;+&nbsp;x); }</pre> </p> <p> You can play with the functionality in <em>C# Interactive</em> to get a feel for how it works: </p> <p> <pre>&gt; <span style="color:#2b91af;">NaturalNumber</span>.Eight.Count() 8 &gt; <span style="color:#2b91af;">NaturalNumber</span>.Five.Count() 5</pre> </p> <p> The <code>Count</code> extension method uses <code>Cata</code> to count the level of recursion. The <code>zero</code> value is, not surprisingly, <code>0</code>, and the successor function simply adds one to the previous number. Since the successor function runs as many times as encoded by the Peano number, and since the initial value is <code>0</code>, you get the integer value of the number when <code>Cata</code> exits. 
</p> <p> A useful building block you can write using <code>Cata</code> is a function to increment a natural number by one: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">INaturalNumber</span>&nbsp;Increment(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">INaturalNumber</span>&nbsp;n)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;n.Cata(One,&nbsp;p&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Successor</span>(p));
}</pre> </p> <p> This, again, works as you'd expect: </p> <p> <pre>&gt; <span style="color:#2b91af;">NaturalNumber</span>.Zero.Increment().Count()
1
&gt; <span style="color:#2b91af;">NaturalNumber</span>.Eight.Increment().Count()
9</pre> </p> <p> With the <code>Increment</code> method and <code>Cata</code>, you can easily implement addition: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">INaturalNumber</span>&nbsp;Add(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">INaturalNumber</span>&nbsp;x,&nbsp;<span style="color:#2b91af;">INaturalNumber</span>&nbsp;y)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;x.Cata(y,&nbsp;p&nbsp;=&gt;&nbsp;p.Increment());
}</pre> </p> <p> The trick here is to use <code>y</code> as the <code>zero</code> case for <code>x</code>. In other words, if <code>x</code> is <em>zero</em>, then <code>Add</code> should return <code>y</code>. If <code>x</code> isn't <em>zero</em>, then <code>Increment</code> it as many times as the number encodes, but starting at <code>y</code>. In other words, start with <code>y</code> and <code>Increment</code> <code>x</code> times. </p> <p> The catamorphism makes it much easier to implement arithmetic operations. 
Just consider multiplication, which wasn't the simplest implementation in the previous article. Now, it's as simple as this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">INaturalNumber</span>&nbsp;Multiply(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">INaturalNumber</span>&nbsp;x,&nbsp;<span style="color:#2b91af;">INaturalNumber</span>&nbsp;y) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;x.Cata(Zero,&nbsp;p&nbsp;=&gt;&nbsp;p.Add(y)); }</pre> </p> <p> Start at <code>0</code> and simply <code>Add(y)</code> <code>x</code> times. </p> <p> <pre>&gt; <span style="color:#2b91af;">NaturalNumber</span>.Nine.Multiply(<span style="color:#2b91af;">NaturalNumber</span>.Four).Count() 36</pre> </p> <p> Finally, you can also implement some common predicates: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IChurchBoolean</span>&nbsp;IsZero(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">INaturalNumber</span>&nbsp;n) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;n.Cata&lt;<span style="color:#2b91af;">IChurchBoolean</span>&gt;(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ChurchTrue</span>(),&nbsp;_&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ChurchFalse</span>()); } <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IChurchBoolean</span>&nbsp;IsEven(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">INaturalNumber</span>&nbsp;n) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;n.Cata&lt;<span style="color:#2b91af;">IChurchBoolean</span>&gt;(<span style="color:blue;">new</span>&nbsp;<span 
style="color:#2b91af;">ChurchTrue</span>(),&nbsp;b&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ChurchNot</span>(b)); } <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IChurchBoolean</span>&nbsp;IsOdd(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">INaturalNumber</span>&nbsp;n) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ChurchNot</span>(n.IsEven()); }</pre> </p> <p> Particularly <code>IsEven</code> is elegant: It considers <code>zero</code> even, so simply uses a <code>new ChurchTrue()</code> object for that case. In all other cases, it alternates between <em>false</em> and <em>true</em> by negating the predecessor. </p> <p> <pre>&gt; <span style="color:#2b91af;">NaturalNumber</span>.Three.IsEven().ToBool() false</pre> </p> <p> It seems convincing that we can use <code>Cata</code> to implement all the other functionality we need. That seems to be a characteristic of a catamorphism. Still, how do we know that <code>Cata</code> is, in fact, the catamorphism for natural numbers? </p> <h3 id="05dbca489b8a49be830df87e13bfcae3"> Peano F-Algebra <a href="#05dbca489b8a49be830df87e13bfcae3" title="permalink">#</a> </h3> <p> As in the <a href="/2019/05/06/boolean-catamorphism">previous article</a>, I'll use <code>Fix</code> and <code>cata</code> as explained in <a href="https://bartoszmilewski.com">Bartosz Milewski</a>'s excellent <a href="https://bartoszmilewski.com/2017/02/28/f-algebras/">article on F-Algebras</a>. 
The <code>NatF</code> type comes from his article as well: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;NatF&nbsp;a&nbsp;=&nbsp;ZeroF&nbsp;|&nbsp;SuccF&nbsp;a&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;<span style="color:blue;">NatF</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;ZeroF&nbsp;=&nbsp;ZeroF &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;f&nbsp;(SuccF&nbsp;x)&nbsp;=&nbsp;SuccF&nbsp;$&nbsp;f&nbsp;x</pre> </p> <p> You can use the fixed point of this functor to define numbers with a shared type. Here's just the first ten: </p> <p> <pre><span style="color:#2b91af;">zeroF</span>,&nbsp;<span style="color:#2b91af;">oneF</span>,&nbsp;<span style="color:#2b91af;">twoF</span>,&nbsp;<span style="color:#2b91af;">threeF</span>,&nbsp;<span style="color:#2b91af;">fourF</span>,&nbsp;<span style="color:#2b91af;">fiveF</span>,&nbsp;<span style="color:#2b91af;">sixF</span>,&nbsp;<span style="color:#2b91af;">sevenF</span>,&nbsp;<span style="color:#2b91af;">eightF</span>,&nbsp;<span style="color:#2b91af;">nineF</span>&nbsp;::&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">NatF</span> zeroF&nbsp;&nbsp;=&nbsp;Fix&nbsp;ZeroF oneF&nbsp;&nbsp;&nbsp;=&nbsp;Fix&nbsp;$&nbsp;SuccF&nbsp;zeroF twoF&nbsp;&nbsp;&nbsp;=&nbsp;Fix&nbsp;$&nbsp;SuccF&nbsp;oneF threeF&nbsp;=&nbsp;Fix&nbsp;$&nbsp;SuccF&nbsp;twoF fourF&nbsp;&nbsp;=&nbsp;Fix&nbsp;$&nbsp;SuccF&nbsp;threeF fiveF&nbsp;&nbsp;=&nbsp;Fix&nbsp;$&nbsp;SuccF&nbsp;fourF sixF&nbsp;&nbsp;&nbsp;=&nbsp;Fix&nbsp;$&nbsp;SuccF&nbsp;fiveF sevenF&nbsp;=&nbsp;Fix&nbsp;$&nbsp;SuccF&nbsp;sixF eightF&nbsp;=&nbsp;Fix&nbsp;$&nbsp;SuccF&nbsp;sevenF nineF&nbsp;&nbsp;=&nbsp;Fix&nbsp;$&nbsp;SuccF&nbsp;eightF</pre> 
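</p> <p> Writing each number out by hand stops scaling after the first few. As a sketch only, assuming the <code>Fix</code> and <code>NatF</code> definitions used in this series, you could build an arbitrary number by wrapping <code>ZeroF</code> in the required number of <code>SuccF</code> layers. The helpers <code>natFromInt</code> and <code>layers</code> are hypothetical and not part of the original code: </p> <p> <pre>newtype Fix f = Fix { unFix :: f (Fix f) }

data NatF a = ZeroF | SuccF a deriving (Show, Eq, Read)

-- Hypothetical helper: wrap ZeroF in n SuccF layers.
-- Assumes n &gt;= 0; with a negative n, (!!) raises an error.
natFromInt :: Int -&gt; Fix NatF
natFromInt n = iterate (Fix . SuccF) (Fix ZeroF) !! n

-- Hypothetical helper: count the SuccF layers by direct recursion,
-- without relying on cata.
layers :: Fix NatF -&gt; Int
layers (Fix ZeroF)     = 0
layers (Fix (SuccF p)) = 1 + layers p</pre> </p> <p> With these two helpers, <code>layers (natFromInt 9)</code> evaluates to <code>9</code>, which corresponds to the hand-written <code>nineF</code> value. 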
</p> <p> That's all you need to identify the catamorphism. </p> <h3 id="f0e66c873a034830a4b069229971299a"> Haskell catamorphism <a href="#f0e66c873a034830a4b069229971299a" title="permalink">#</a> </h3> <p> At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (<code>NatF</code>), and an object <code>a</code>, but you still need to find a morphism <code>NatF a -&gt; a</code>. </p> <p> As in the previous article, start by writing a function that will become the catamorphism, based on <code>cata</code>: </p> <p> <pre>natF&nbsp;=&nbsp;cata&nbsp;alg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;ZeroF&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(SuccF&nbsp;predecessor)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> While this compiles, with its <code>undefined</code> implementations, it obviously doesn't do anything useful. I find, however, that it helps me think. How can you return a value of the type <code>a</code> from the <code>ZeroF</code> case? 
You could pass an argument to the <code>natF</code> function: </p> <p> <pre>natF&nbsp;z&nbsp;=&nbsp;cata&nbsp;alg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;ZeroF&nbsp;=&nbsp;z &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(SuccF&nbsp;predecessor)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> In the <code>SuccF</code> case, <code>predecessor</code> is already of the polymorphic type <code>a</code>, so instead of returning a constant value, you can supply a function as an argument to <code>natF</code> and use it in that case: </p> <p> <pre><span style="color:#2b91af;">natF</span>&nbsp;::&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;(a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">NatF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a natF&nbsp;z&nbsp;next&nbsp;=&nbsp;cata&nbsp;alg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;ZeroF&nbsp;=&nbsp;z &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;(SuccF&nbsp;predecessor)&nbsp;=&nbsp;next&nbsp;predecessor</pre> </p> <p> This works. Since <code>cata</code> has the type <code>Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a</code>, that means that <code>alg</code> has the type <code>f a -&gt; a</code>. In the case of <code>NatF</code>, the compiler infers that the <code>alg</code> function has the type <code>NatF a -&gt; a</code>, which is just what you need! </p> <p> For good measure, I should point out that, as usual, the above <code>natF</code> function isn't the only possible catamorphism. Trivially, you can flip the order of the arguments, and this would also be a catamorphism. These two alternatives are isomorphic. </p> <p> The <code>natF</code> function identifies the Peano number catamorphism, which is equivalent to the C# representation in the beginning of the article. 
I called the function <code>natF</code>, because there's a tendency in Haskell to name the 'case analysis' or catamorphism after the type, just with a lower-case initial letter. </p> <h3 id="e78ae79059e14f4b92f525393cc74861"> Basis <a href="#e78ae79059e14f4b92f525393cc74861" title="permalink">#</a> </h3> <p> A catamorphism can be used to implement most (if not all) other useful functionality, like all of the above C# functionality. In fact, I wrote the Haskell code first, and then translated my implementations into the above C# extension methods. This means that the following functions apply the same reasoning: </p> <p> <pre><span style="color:#2b91af;">evenF</span>&nbsp;::&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">NatF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">BoolF</span> evenF&nbsp;=&nbsp;natF&nbsp;trueF&nbsp;notF <span style="color:#2b91af;">oddF</span>&nbsp;::&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">NatF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">BoolF</span> oddF&nbsp;=&nbsp;notF&nbsp;.&nbsp;evenF <span style="color:#2b91af;">incF</span>&nbsp;::&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">NatF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">NatF</span> incF&nbsp;=&nbsp;natF&nbsp;oneF&nbsp;$&nbsp;Fix&nbsp;.&nbsp;SuccF <span style="color:#2b91af;">addF</span>&nbsp;::&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">NatF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">NatF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">NatF</span> 
addF&nbsp;x&nbsp;y&nbsp;=&nbsp;natF&nbsp;y&nbsp;incF&nbsp;x

<span style="color:#2b91af;">multiplyF</span>&nbsp;::&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">NatF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">NatF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">NatF</span>
multiplyF&nbsp;x&nbsp;y&nbsp;=&nbsp;natF&nbsp;zeroF&nbsp;(addF&nbsp;y)&nbsp;x</pre> </p> <p> Here are some GHCi usage examples: </p> <p> <pre>Prelude Boolean Nat&gt; evenF eightF
Fix TrueF
Prelude Boolean Nat&gt; toNum $ multiplyF sevenF sixF
42</pre> </p> <p> The <code>toNum</code> function corresponds to the above <code>Count</code> C# method. It is, again, based on <code>cata</code>. You can use <code>ana</code> to convert the other way: </p> <p> <pre><span style="color:#2b91af;">toNum</span>&nbsp;::&nbsp;<span style="color:blue;">Num</span>&nbsp;a&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">NatF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a
toNum&nbsp;=&nbsp;natF&nbsp;0&nbsp;(+&nbsp;1)

<span style="color:#2b91af;">fromNum</span>&nbsp;::&nbsp;(<span style="color:blue;">Eq</span>&nbsp;a,&nbsp;<span style="color:blue;">Num</span>&nbsp;a)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">NatF</span>
fromNum&nbsp;=&nbsp;ana&nbsp;coalg
&nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;coalg&nbsp;0&nbsp;=&nbsp;ZeroF
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;coalg&nbsp;x&nbsp;=&nbsp;SuccF&nbsp;$&nbsp;x&nbsp;-&nbsp;1</pre> </p> <p> This demonstrates that <code>Fix NatF</code> is isomorphic to the non-negative values of <code>Num</code> instances, such as <code>Integer</code>. 
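</p> <p> One way to gain some confidence in that claim is to check that a round trip returns the original number. The following is a self-contained sketch that repeats the definitions from this series. The <code>roundTrips</code> function is a hypothetical addition, and it assumes non-negative input, because <code>fromNum</code> never reaches <code>0</code> when it starts from a negative number: </p> <p> <pre>newtype Fix f = Fix { unFix :: f (Fix f) }

cata :: Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a
cata alg = alg . fmap (cata alg) . unFix

ana :: Functor f =&gt; (a -&gt; f a) -&gt; a -&gt; Fix f
ana coalg = Fix . fmap (ana coalg) . coalg

data NatF a = ZeroF | SuccF a

instance Functor NatF where
  fmap _ ZeroF     = ZeroF
  fmap f (SuccF x) = SuccF (f x)

natF :: a -&gt; (a -&gt; a) -&gt; Fix NatF -&gt; a
natF z next = cata alg
  where alg ZeroF               = z
        alg (SuccF predecessor) = next predecessor

toNum :: Num a =&gt; Fix NatF -&gt; a
toNum = natF 0 (+ 1)

fromNum :: (Eq a, Num a) =&gt; a -&gt; Fix NatF
fromNum = ana coalg
  where coalg 0 = ZeroF
        coalg x = SuccF $ x - 1

-- Hypothetical check: converting to Fix NatF and back yields the input.
-- Only call this with non-negative numbers; for negative input the
-- coalgebra never produces ZeroF, so fromNum does not terminate.
roundTrips :: Integer -&gt; Bool
roundTrips n = toNum (fromNum n) == n</pre> </p> <p> In GHCi, <code>all roundTrips [0 .. 20]</code> evaluates to <code>True</code>. That's a spot check rather than a proof, but it illustrates the two conversions undoing each other. 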
</p> <h3 id="d09e79446af14875a42f66869e10f33a"> Summary <a href="#d09e79446af14875a42f66869e10f33a" title="permalink">#</a> </h3> <p> The catamorphism for Peano numbers is a pair consisting of a zero value and a successor function. The most common descriptions of catamorphisms that I've found emphasise how a catamorphism is like a <em>fold;</em> an operation you can use to reduce a data structure like a list or a tree to a single value. This is what happens here, but even so, the <code>Fix NatF</code> type isn't a <code>Foldable</code> instance. The reason is that while <code>NatF</code> is a polymorphic type, its fixed point <code>Fix NatF</code> isn't. Haskell's <code>Foldable</code> type class requires foldable containers to be polymorphic (what C# programmers would call 'generic'). </p> <p> When I first ran into the concept of a <em>catamorphism</em>, it was invariably described as a 'generalisation of fold'. The examples always showed how the catamorphism for a linked list is the same as its <em>fold</em>. I found such explanations unhelpful, because I couldn't understand how those two concepts differ. </p> <p> The purpose of this article series is to show just how much more general the abstraction of a catamorphism is. In this article you saw how an infinitely recursive data structure like Peano numbers has a catamorphism, even though it isn't a parametrically polymorphic type. In the next article, though, you'll see the first example of a polymorphic type where the catamorphism coincides with the fold. </p> <p> <strong>Next:</strong> <a href="/2019/05/20/maybe-catamorphism">Maybe catamorphism</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. 
Boolean catamorphism https://blog.ploeh.dk/2019/05/06/boolean-catamorphism 2019-05-06T12:30:00+00:00 Mark Seemann <div id="post"> <p> <em>The catamorphism for Boolean values is just the common ternary operator.</em> </p> <p> This article is part of an <a href="/2019/04/29/catamorphisms">article series about catamorphisms</a>. A catamorphism is a <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstraction</a> that describes how to digest a data structure into a potentially more compact value. </p> <p> This article presents the catamorphism for Boolean values, as well as how you identify it. The beginning of this article presents the catamorphism in C#, with a simple example. The rest of the article describes how I deduced the catamorphism. That part of the article presents my work in <a href="https://www.haskell.org">Haskell</a>. Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix. </p> <h3 id="35155b758274445cbe57f75d730a4eb6"> C# catamorphism <a href="#35155b758274445cbe57f75d730a4eb6" title="permalink">#</a> </h3> <p> The catamorphism for Boolean values is the familiar <a href="https://en.wikipedia.org/wiki/%3F:">ternary conditional operator</a>: </p> <p> <pre>&gt; <span style="color:#2b91af;">DateTime</span>.Now.Day&nbsp;%&nbsp;2&nbsp;==&nbsp;0&nbsp;?&nbsp;<span style="color:#a31515;">&quot;Even&nbsp;date&quot;</span>&nbsp;:&nbsp;<span style="color:#a31515;">&quot;Odd&nbsp;date&quot;</span> "Odd date"</pre> </p> <p> Given a Boolean expression, you basically provide two values: one to use in case the Boolean expression is <em>true</em>, and one to use in case it's <em>false</em>. 
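</p> <p> For readers who'd rather see the operator as an ordinary function, here's a small Haskell sketch of the same idea. The names <code>ternary</code> and <code>dateLabel</code> are hypothetical, and the argument order is an assumption that simply follows the C# operator: </p> <p> <pre>-- Hypothetical name. The arguments follow the order of C#'s ?: operator:
-- first the condition, then the value for true, then the value for false.
ternary :: Bool -&gt; a -&gt; a -&gt; a
ternary True  trueCase _         = trueCase
ternary False _        falseCase = falseCase

-- Hypothetical helper: label a day of the month, like the C# example.
dateLabel :: Int -&gt; String
dateLabel day = ternary (day `mod` 2 == 0) "Even date" "Odd date"</pre> </p> <p> Here, <code>dateLabel 16</code> returns <code>"Even date"</code>, while <code>dateLabel 9</code> returns <code>"Odd date"</code>. 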
</p> <p> For <a href="/2018/05/24/church-encoded-boolean-values">Church-encoded Boolean values</a>, the catamorphism looks like this: </p> <p> <pre><span style="color:#2b91af;">T</span>&nbsp;Match&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:#2b91af;">T</span>&nbsp;trueCase,&nbsp;<span style="color:#2b91af;">T</span>&nbsp;falseCase);</pre> </p> <p> This is an instance method where you must, again, supply two alternatives. When the instance represents <em>true</em>, you'll get the left-most value <code>trueCase</code>; otherwise, when the instance represents <em>false</em>, you'll get the right-most value <code>falseCase</code>. </p> <p> The catamorphism turns out to be the same as the <a href="/2018/05/22/church-encoding">Church encoding</a>. This seems to be a recurring pattern. </p> <h3 id="cd81cb92ed2d42f8bc0ad5adbde4b014"> Alternatives <a href="#cd81cb92ed2d42f8bc0ad5adbde4b014" title="permalink">#</a> </h3> <p> To be accurate, there's more than one catamorphism for Boolean values. It's only by convention that the value corresponding to <em>true</em> goes on the left, and the <em>false</em> value goes to the right. You could flip the arguments, and it would still be a catamorphism. This is, in fact, what Haskell's <code>Data.Bool</code> module does: </p> <p> <pre>Prelude Data.Bool&gt; bool "Odd date" "Even date" $ even date
"Odd date"</pre> </p> <p> The <a href="http://hackage.haskell.org/package/base/docs/Data-Bool.html">module documentation</a> calls this the <em>"Case analysis for the <code>Bool</code> type"</em>, instead of a catamorphism, but the two representations are isomorphic: <blockquote> "This is equivalent to <code>if p then y else x</code>; that is, one can think of it as an if-then-else construct with its arguments reordered." </blockquote> This is another recurring result. There's typically more than one catamorphism, but the alternatives are isomorphic. 
In this article series, I'll mostly present the alternative that strikes me as the one you'll encounter most frequently. </p> <h3 id="60235fb428d14785a5aeea440c05cce5"> Fix <a href="#60235fb428d14785a5aeea440c05cce5" title="permalink">#</a> </h3> <p> In this and future articles, I'll derive the catamorphism from an F-Algebra. For an introduction to F-Algebras and fixed points, I'll refer you to <a href="https://bartoszmilewski.com">Bartosz Milewski</a>'s excellent <a href="https://bartoszmilewski.com/2017/02/28/f-algebras/">article on the topic</a>. In it, he presents a generic data type for a fixed point, as well as polymorphic functions for catamorphisms and anamorphisms. While they're available in his article, I'll repeat them here for good measure: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;Fix&nbsp;f&nbsp;=&nbsp;Fix&nbsp;{&nbsp;unFix&nbsp;::&nbsp;f&nbsp;(Fix&nbsp;f)&nbsp;} <span style="color:#2b91af;">cata</span>&nbsp;::&nbsp;<span style="color:blue;">Functor</span>&nbsp;f&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;(f&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;f&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a cata&nbsp;alg&nbsp;=&nbsp;alg&nbsp;.&nbsp;<span style="color:blue;">fmap</span>&nbsp;(cata&nbsp;alg)&nbsp;.&nbsp;unFix <span style="color:#2b91af;">ana</span>&nbsp;::&nbsp;<span style="color:blue;">Functor</span>&nbsp;f&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;(a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;f&nbsp;a)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;f ana&nbsp;coalg&nbsp;=&nbsp;Fix&nbsp;.&nbsp;<span style="color:blue;">fmap</span>&nbsp;(ana&nbsp;coalg)&nbsp;.&nbsp;coalg</pre> </p> <p> This should be recognisable from Bartosz Milewski's article. 
With one small exception, this is just a copy of the code shown there. </p> <h3 id="6b0fdc2b04e540d19322f0e30c00e86c"> Boolean F-Algebra <a href="#6b0fdc2b04e540d19322f0e30c00e86c" title="permalink">#</a> </h3> <p> While F-Algebras and fixed points are mostly used for recursive data structures, you can also define an F-Algebra for a non-recursive data structure. As data types go, they don't get much simpler than Boolean values, which are just two mutually exclusive cases. In order to make a <code>Functor</code> out of the definition, though, you can equip it with a <em>carrier type:</em> </p> <p> <pre><span style="color:blue;">data</span>&nbsp;BoolF&nbsp;a&nbsp;=&nbsp;TrueF&nbsp;|&nbsp;FalseF&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;<span style="color:blue;">BoolF</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;&nbsp;TrueF&nbsp;=&nbsp;&nbsp;TrueF &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;FalseF&nbsp;=&nbsp;FalseF</pre> </p> <p> The <code>Functor</code> instance simply ignores the carrier type and just returns <code>TrueF</code> and <code>FalseF</code>, respectively. It'd seem that nothing happens, but at the type level, this is still a translation from <code>BoolF a</code> to <code>BoolF b</code>. Not much of a function, perhaps, but definitely an <em>endofunctor</em>. </p> <p> Another note that may be in order here, as well as for all future articles in this series, is that you'll notice that most types and custom functions come with the <code>F</code> suffix. This is simply a suffix I've added to avoid conflicts with built-in types, values, and functions, such as <code>Bool</code>, <code>True</code>, <code>and</code>, and so on. 
The <code>F</code> is for <em>F-Algebra</em>. </p> <p> You can lift these values into <code>Fix</code> in order to make it fit with the <code>cata</code> function: </p> <p> <pre><span style="color:#2b91af;">trueF</span>,&nbsp;<span style="color:#2b91af;">falseF</span>&nbsp;::&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">BoolF</span> trueF&nbsp;&nbsp;=&nbsp;Fix&nbsp;&nbsp;TrueF falseF&nbsp;=&nbsp;Fix&nbsp;FalseF</pre> </p> <p> That's all you need to identify the catamorphism. </p> <h3 id="a28e972fc7eb45038427cff258c0c8f2"> Haskell catamorphism <a href="#a28e972fc7eb45038427cff258c0c8f2" title="permalink">#</a> </h3> <p> At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (<code>BoolF</code>), and an object <code>a</code>, but you still need to find a morphism <code>BoolF a -&gt; a</code>. At first glance, this seems impossible, because neither <code>TrueF</code> nor <code>FalseF</code> actually contain a value of the type <code>a</code>. How, then, can you conjure an <code>a</code> value out of thin air? </p> <p> The <code>cata</code> function has the answer. </p> <p> What you can do is to start writing the function that will become the catamorphism, basing it on <code>cata</code>: </p> <p> <pre>boolF&nbsp;=&nbsp;cata&nbsp;alg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;TrueF&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;FalseF&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> While this compiles, with its <code>undefined</code> implementations, it obviously doesn't do anything useful. I find, however, that it helps me think. How can you return a value of the type <code>a</code> from the <code>TrueF</code> case? 
You could pass an argument to the <code>boolF</code> function: </p> <p> <pre>boolF&nbsp;x&nbsp;=&nbsp;cata&nbsp;alg
&nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;TrueF&nbsp;=&nbsp;x
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;FalseF&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> That seems promising, so do that for the <code>FalseF</code> case as well: </p> <p> <pre><span style="color:#2b91af;">boolF</span>&nbsp;::&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">BoolF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a
boolF&nbsp;x&nbsp;y&nbsp;=&nbsp;cata&nbsp;alg
&nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;&nbsp;TrueF&nbsp;=&nbsp;x
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;FalseF&nbsp;=&nbsp;y</pre> </p> <p> This works. Since <code>cata</code> has the type <code>Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a</code>, that means that <code>alg</code> has the type <code>f a -&gt; a</code>. In the case of <code>BoolF</code>, the compiler infers that the <code>alg</code> function has the type <code>BoolF a -&gt; a</code>, which is just what you need! </p> <p> The <code>boolF</code> function identifies the Boolean catamorphism, which is equivalent to the representations at the beginning of the article. I called the function <code>boolF</code>, because there's a tendency in Haskell to name the 'case analysis' or catamorphism after the type, just with a lower-case initial letter. </p> <p> You can use the <code>boolF</code> function just like the above ternary operator: </p> <p> <pre>Prelude Boolean Nat&gt; boolF "Even date" "Odd date" $ evenF dateF
"Odd date"</pre> </p> <p> Here, I've also used <code>evenF</code> from the <code>Nat</code> module shown in the next article in the series. 
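</p> <p> If you'd like to try <code>boolF</code> without the <code>Nat</code> module, the following self-contained sketch may help. It repeats the definitions from this article, and replaces <code>evenF dateF</code> with a hypothetical stand-in, <code>isEvenDay</code>, that takes a plain <code>Int</code>. Both <code>isEvenDay</code> and <code>describeDay</code> are assumptions made for this illustration: </p> <p> <pre>newtype Fix f = Fix { unFix :: f (Fix f) }

cata :: Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a
cata alg = alg . fmap (cata alg) . unFix

data BoolF a = TrueF | FalseF

instance Functor BoolF where
  fmap _ TrueF  = TrueF
  fmap _ FalseF = FalseF

boolF :: a -&gt; a -&gt; Fix BoolF -&gt; a
boolF x y = cata alg
  where alg TrueF  = x
        alg FalseF = y

-- Hypothetical stand-in for 'evenF dateF': test a plain Int with the
-- built-in 'even' function and lift the answer into Fix BoolF.
isEvenDay :: Int -&gt; Fix BoolF
isEvenDay day = if even day then Fix TrueF else Fix FalseF

-- Hypothetical helper: the first argument to boolF is the true case.
describeDay :: Int -&gt; String
describeDay = boolF "Even date" "Odd date" . isEvenDay</pre> </p> <p> With this in place, <code>describeDay 13</code> returns <code>"Odd date"</code>, and <code>describeDay 16</code> returns <code>"Even date"</code>. 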
</p> <p> From the above definition of <code>boolF</code>, it should also be clear that you can arrive at the alternative catamorphism defined by <code>Data.Bool.bool</code> by simply flipping <code>x</code> and <code>y</code>. </p> <h3 id="54886f1be8684dd4a5909e4d50b7d5dc"> Basis <a href="#54886f1be8684dd4a5909e4d50b7d5dc" title="permalink">#</a> </h3> <p> A catamorphism can be used to implement most (if not all) other useful functionality. For Boolean values, that would be the standard Boolean operations <em>and</em>, <em>or</em>, and <em>not:</em> </p> <p> <pre><span style="color:#2b91af;">andF</span>&nbsp;::&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">BoolF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">BoolF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">BoolF</span> andF&nbsp;x&nbsp;y&nbsp;=&nbsp;boolF&nbsp;y&nbsp;falseF&nbsp;x <span style="color:#2b91af;">orF</span>&nbsp;::&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">BoolF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">BoolF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">BoolF</span> orF&nbsp;x&nbsp;y&nbsp;=&nbsp;boolF&nbsp;trueF&nbsp;y&nbsp;x <span style="color:#2b91af;">notF</span>&nbsp;::&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">BoolF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">BoolF</span> notF&nbsp;=&nbsp;boolF&nbsp;falseF&nbsp;trueF</pre> </p> <p> They work as you'd expect them to work: </p> <p> <pre>Prelude Boolean&gt; andF trueF falseF Fix FalseF Prelude Boolean&gt; orF trueF falseF Fix TrueF Prelude Boolean&gt; orF (notF trueF) 
falseF Fix FalseF</pre> </p> <p> You can also implement conversion to and from the built-in <code>Bool</code> type: </p> <p> <pre><span style="color:#2b91af;">toBool</span>&nbsp;::&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">BoolF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Bool</span> toBool&nbsp;=&nbsp;boolF&nbsp;True&nbsp;False <span style="color:#2b91af;">fromBool</span>&nbsp;::&nbsp;<span style="color:#2b91af;">Bool</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">BoolF</span> fromBool&nbsp;=&nbsp;ana&nbsp;coalg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;coalg&nbsp;&nbsp;True&nbsp;=&nbsp;TrueF &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;coalg&nbsp;False&nbsp;=&nbsp;FalseF</pre> </p> <p> This demonstrates that <code>Fix BoolF</code> is isomorphic to <code>Bool</code>. </p> <h3 id="327411e72cea46bbb1f6fe167738c7b2"> Summary <a href="#327411e72cea46bbb1f6fe167738c7b2" title="permalink">#</a> </h3> <p> The catamorphism for Boolean values is a function, method, or operator akin to the familiar ternary conditional operator. The most common descriptions of catamorphisms that I've found emphasise how a catamorphism is like a <em>fold;</em> an operation you can use to reduce a data structure like a list or a tree to a single value. In that light, it may be surprising that something as simple as Boolean values have an associated catamorphism. </p> <p> Since <code>Fix BoolF</code> is isomorphic to <code>Bool</code>, you may wonder what the point is. Why define this data type, and implement functions like <code>andF</code>, <code>orF</code>, and <code>notF</code>? </p> <p> The code presented here is nothing but an analysis tool. It's a way to identify the catamorphism for Boolean values. </p> <p> <strong>Next:</strong> <a href="/2019/05/13/peano-catamorphism">Peano catamorphism</a>. 
</p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Catamorphisms https://blog.ploeh.dk/2019/04/29/catamorphisms 2019-04-29T18:31:00+00:00 Mark Seemann <div id="post"> <p> <em>A catamorphism is a general abstraction that enables you to handle multiple values, for example in order to reduce them to a single value.</em> </p> <p> This article series is part of <a href="/2017/10/04/from-design-patterns-to-category-theory">an even larger series of articles about the relationship between design patterns and category theory</a>. In another article series in this big series of articles, you learned about <a href="/2018/03/19/functors-applicatives-and-friends">functors, applicatives, and other types of data containers</a>. </p> <p> You may have heard about <em>map-reduce</em> architectures. Much software can be designed around two general types of operations: those that <em>map</em> data, and those that <em>reduce</em> data. A <a href="https://bartoszmilewski.com/2014/01/14/functors-are-containers">functor is a container of data</a> that supports structure-preserving maps. Thus, you can think of <a href="/2018/03/22/functors">functors</a> as the general abstraction for map operations (also sometimes called <em>projections</em>). Does a similar universal abstraction exist for operations that reduce data? </p> <p> Yes, that abstraction is called a <em>catamorphism</em>. </p> <h3 id="bb64d005b16b49f892c00824ef803997"> Aggregation <a href="#bb64d005b16b49f892c00824ef803997" title="permalink">#</a> </h3> <p> <em>Catamorphism</em> is an intimidating word, so let's start with an example. You often have a collection of values that you'd like to reduce to a single value. 
Such a collection can contain arbitrarily complex objects, but I'll keep it simple and start with a collection of numbers: </p> <p> <pre><span style="color:blue;">new</span>[]&nbsp;{&nbsp;42,&nbsp;1337,&nbsp;2112,&nbsp;90125,&nbsp;5040,&nbsp;7,&nbsp;1984&nbsp;};</pre> </p> <p> This particular list of numbers is an array, but that's not important. What comes next works for any <code>IEnumerable&lt;T&gt;</code>, including arrays. I only chose an array because the C# syntax for array creation is more compact than for other collection types. </p> <p> How do you reduce those seven numbers to a single number? That depends on what you want that number to tell you. One option is to add the numbers together. There's a specific, built-in function for that: </p> <p> <pre>&gt; <span style="color:blue;">new</span>[]&nbsp;{&nbsp;42,&nbsp;1337,&nbsp;2112,&nbsp;90125,&nbsp;5040,&nbsp;7,&nbsp;1984&nbsp;}.Sum();
100647</pre> </p> <p> The <a href="https://docs.microsoft.com/en-us/dotnet/api/system.linq.enumerable.sum">Sum</a> extension method is one of many built-in functions that enable you to reduce a list of numbers to a single number: <a href="https://docs.microsoft.com/en-us/dotnet/api/system.linq.enumerable.average">Average</a>, <a href="https://docs.microsoft.com/en-us/dotnet/api/system.linq.enumerable.max">Max</a>, <a href="https://docs.microsoft.com/en-us/dotnet/api/system.linq.enumerable.count">Count</a>, and so on. </p> <p> What do you do, though, if you need to reduce many values to one, and there's no existing function for that? What if, for example, you need to add all the numbers using <a href="/2018/07/16/angular-addition-monoid">modulo 360 addition</a>? 
</p> <p> In that case, you use <a href="https://docs.microsoft.com/en-us/dotnet/api/system.linq.enumerable.aggregate">Aggregate</a>: </p> <p> <pre>&gt; <span style="color:blue;">new</span>[]&nbsp;{&nbsp;42,&nbsp;1337,&nbsp;2112,&nbsp;90125,&nbsp;5040,&nbsp;7,&nbsp;1984&nbsp;}.Aggregate((x,&nbsp;y)&nbsp;=&gt;&nbsp;(x&nbsp;+&nbsp;y)&nbsp;%&nbsp;360) 207</pre> </p> <p> The way to interpret this result is that the initial array represents a sequence of rotations (measured in degrees), and the result is the final angle after all the rotations have completed. </p> <p> In other (functional) languages, such a 'reduce' operation is called a <em>fold</em>. The metaphor, I suppose, is that you fold multiple values together, two by two. </p> <p> A <em>fold</em> is a catamorphism, but a catamorphism is a more general abstraction. For some data structures, the catamorphism is more powerful than the fold, but for collections, there's no difference. </p> <p> There's one edge case we need to be aware of, though. What if the collection is empty? </p> <h3 id="65483950f21d453ebe4e8949eac5751f"> Aggregation of empty containers <a href="#65483950f21d453ebe4e8949eac5751f" title="permalink">#</a> </h3> <p> What happens if you attempt to aggregate an empty collection? </p> <p> <pre>&gt; <span style="color:blue;">new</span>&nbsp;<span style="color:blue;">int</span>[0].Aggregate((x,&nbsp;y)&nbsp;=&gt;&nbsp;(x&nbsp;+&nbsp;y)&nbsp;%&nbsp;360) <span style="color:red;">Sequence contains no elements + System.Linq.Enumerable.Aggregate&lt;TSource&gt;(IEnumerable&lt;TSource&gt;, Func&lt;TSource, TSource, TSource&gt;)</span></pre> </p> <p> The <code>Aggregate</code> method throws an exception because it doesn't know how to deal with empty collections. The lambda expression you supply tells the <code>Aggregate</code> method how to combine two values into one. This is, for instance, how <a href="/2017/12/11/semigroups-accumulate">semigroups accumulate</a>. 
</p> <p> The lambda expression handles all cases where you have two or more values. If you have only a single value, then that's no problem either: </p> <p> <pre>&gt; <span style="color:blue;">new</span>[]&nbsp;{&nbsp;1337&nbsp;}.Aggregate((x,&nbsp;y)&nbsp;=&gt;&nbsp;(x&nbsp;+&nbsp;y)&nbsp;%&nbsp;360) 1337</pre> </p> <p> In that case, the lambda expression isn't involved at all, because the single value is simply returned without modification. In this example, this could even be interpreted as being incorrect, since you'd expect the result to be 257 (<code>1337 % 360</code>). </p> <p> It's safer to use the <code>Aggregate</code> overload that takes a <em>seed</em> value: </p> <p> <pre>&gt; <span style="color:blue;">new</span>&nbsp;<span style="color:blue;">int</span>[0].Aggregate(0,&nbsp;(x,&nbsp;y)&nbsp;=&gt;&nbsp;(x&nbsp;+&nbsp;y)&nbsp;%&nbsp;360) 0</pre> </p> <p> Not only does that gracefully handle empty collections, it also gives you a 'better' result for a single value: </p> <p> <pre>&gt; <span style="color:blue;">new</span>[]&nbsp;{&nbsp;1337&nbsp;}.Aggregate(0,&nbsp;(x,&nbsp;y)&nbsp;=&gt;&nbsp;(x&nbsp;+&nbsp;y)&nbsp;%&nbsp;360) 257</pre> </p> <p> This works better because the method always starts with the <em>seed</em> value, which means that even if there's only a single value (<code>1337</code>), the lambda expression still runs (<code>(0 + 1337) % 360</code>). 
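</p> <p> The same distinction exists in Haskell, where a seedless reduction and a seeded left fold are separate functions. The following sketch mirrors the C# examples above. The names <code>addAngle</code>, <code>reduceAngles</code>, and <code>foldAngles</code> are made up for this illustration, and defining the seedless version on <code>NonEmpty</code> is a design choice of the sketch, not something the C# API does. </p> <p> <pre>module Main where

import Data.List (foldl')
import Data.List.NonEmpty (NonEmpty (..))

-- Modulo 360 addition, like the C# lambda expression.
addAngle :: Int -&gt; Int -&gt; Int
addAngle x y = (x + y) `mod` 360

-- Seedless reduction: the first element is the initial accumulator.
-- Taking a NonEmpty list means that the empty case can't occur.
reduceAngles :: NonEmpty Int -&gt; Int
reduceAngles (x :| xs) = foldl' addAngle x xs

-- Seeded fold: always starts from 0, so addAngle runs for every element.
foldAngles :: [Int] -&gt; Int
foldAngles = foldl' addAngle 0

-- Hypothetical harness for trying out the two functions.
main :: IO ()
main = do
  print $ reduceAngles (1337 :| [])
  print $ foldAngles [1337]
  print $ foldAngles []
  print $ foldAngles [42, 1337, 2112, 90125, 5040, 7, 1984]</pre> </p> <p> This should print <code>1337</code>, <code>257</code>, <code>0</code>, and <code>207</code>. The seedless version returns a single rotation unchanged, while the seeded version normalises it and also handles the empty list. 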
</p> <p> This overload of <code>Aggregate</code> has a different type, though: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">TAccumulate</span>&nbsp;Aggregate&lt;<span style="color:#2b91af;">TSource</span>,&nbsp;<span style="color:#2b91af;">TAccumulate</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">TSource</span>&gt;&nbsp;source, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">TAccumulate</span>&nbsp;seed, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">TAccumulate</span>,&nbsp;<span style="color:#2b91af;">TSource</span>,&nbsp;<span style="color:#2b91af;">TAccumulate</span>&gt;&nbsp;func);</pre> </p> <p> Notice that the <code>func</code> doesn't require the accumulator to have the same type as elements from the <code>source</code> collection. This enables you to translate on the fly, so to speak. You can still use binary operations like the above modulo 360 addition, because that just implies that both <code>TSource</code> and <code>TAccumulate</code> are <code>int</code>. </p> <p> With this overload, you could, for example, use <a href="/2018/07/16/angular-addition-monoid">the Angle class</a> to perform the work: </p> <p> <pre>&gt; <span style="color:blue;">new</span>[]&nbsp;{&nbsp;42,&nbsp;1337,&nbsp;2112,&nbsp;90125,&nbsp;5040,&nbsp;7,&nbsp;1984&nbsp;} . .Aggregate(<span style="color:#2b91af;">Angle</span>.Identity,&nbsp;(a,&nbsp;i)&nbsp;=&gt;&nbsp;a.Add(<span style="color:#2b91af;">Angle</span>.FromDegrees(i))) [{ Angle = 207° }]</pre> </p> <p> Now the <code>seed</code> argument is <code>Angle.Identity</code>, which implies that <code>TAccumulate</code> is <code>Angle</code>. The <code>source</code> is still a collection of numbers, so <code>TSource</code> is <code>int</code>. 
Hence, I called the angle <code>a</code> and the integer <code>i</code> in the lambda expression. The output is an <code>Angle</code> object that represents 207°. </p> <p> That <code>Aggregate</code> overload is the catamorphism for collections. It reduces a collection to an object. </p> <h3 id="5a964dee9b1f4cdd8427ce0a0806d65d"> Catamorphisms and folds <a href="#5a964dee9b1f4cdd8427ce0a0806d65d" title="permalink">#</a> </h3> <p> Is <em>catamorphism</em> just an intimidating word for <em>aggregate</em>, <em>accumulate</em>, <em>fold</em>, or <em>reduce?</em> </p> <p> It took me a long time to be able to tell the difference, because in many cases, it seems that there's no difference. The purpose of this article series is to make the distinction clearer. In short, a catamorphism is a more general concept. </p> <p> <img src="/content/binary/catamorphism-and-fold-relations.png" alt="Catamorphisms and folds as sets, for various sum types."> </p> <p> For some data structures, such as <a href="/2018/05/24/church-encoded-boolean-values">Boolean values</a>, or <a href="/2018/05/28/church-encoded-natural-numbers">Peano numbers</a>, the catamorphism is all there is; no fold exists. For other data structures, such as <a href="/2018/06/04/church-encoded-maybe">Maybe</a> or collections, the catamorphism and the fold coincide. Still other data structures, such as <a href="/2018/06/11/church-encoded-either">Either</a> and <a href="/2018/08/06/a-tree-functor">trees</a>, support folding, but the fold is based on the catamorphism. For those types, there are operations you can do with the catamorphism that are impossible to implement with the <em>fold</em> function. One example is that a tree's catamorphism enables you to count its leaves; you can't do that with its <em>fold</em> function. 
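</p> <p> To make that last point concrete, here's a sketch in Haskell. It assumes a simple rose tree where every node carries a value and a leaf is a node without children; the tree articles later in the series may define their trees differently. The names <code>cataTree</code> and <code>countLeaves</code> are hypothetical. </p> <p> <pre>{-# LANGUAGE DeriveFoldable #-}
module Main where

-- Assumed tree shape: a value and zero or more sub-trees.
data Tree a = Node a [Tree a] deriving (Show, Foldable)

-- Catamorphism: the function sees a node's value together with the
-- already-reduced results of its children.
cataTree :: (a -&gt; [b] -&gt; b) -&gt; Tree a -&gt; b
cataTree f (Node x ts) = f x (map (cataTree f) ts)

-- A node without children is a leaf; otherwise add up the children's counts.
countLeaves :: Tree a -&gt; Int
countLeaves = cataTree (\_ counts -&gt; if null counts then 1 else sum counts)

example :: Tree Int
example = Node 1 [Node 2 [], Node 3 [Node 4 [], Node 5 []]]

main :: IO ()
main = do
  print (sum example)         -- fold: adds all values
  print (length example)      -- fold: counts all values
  print (countLeaves example) -- catamorphism: counts only the leaves</pre> </p> <p> For the example tree, this should print <code>15</code>, <code>5</code>, and <code>3</code>. The derived fold only ever sees the values, one after another, so it can't tell that 2, 4, and 5 sit in leaves. The catamorphism also sees how many children each node has. 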
</p> <p> You'll see plenty of examples in this article series: </p> <p> <ul> <li><a href="/2019/05/06/boolean-catamorphism">Boolean catamorphism</a></li> <li><a href="/2019/05/13/peano-catamorphism">Peano catamorphism</a></li> <li><a href="/2019/05/20/maybe-catamorphism">Maybe catamorphism</a></li> <li><a href="/2019/05/27/list-catamorphism">List catamorphism</a></li> <li><a href="/2019/06/03/either-catamorphism">Either catamorphism</a></li> <li><a href="/2019/06/10/tree-catamorphism">Tree catamorphism</a></li> <li><a href="/2019/08/05/rose-tree-catamorphism">Rose tree catamorphism</a></li> <li><a href="/2019/06/24/full-binary-tree-catamorphism">Full binary tree catamorphism</a></li> <li><a href="/2019/07/08/payment-types-catamorphism">Payment types catamorphism</a></li> </ul> </p> <p> Each of these articles will contain a fair amount of <a href="https://www.haskell.org">Haskell</a> code, but even if you're an object-oriented programmer who doesn't read Haskell, you should still scan them, as I'll start each with some C# examples. The Haskell code, by the way, is <a href="https://github.com/ploeh/FAlgebras">available on GitHub</a>. </p> <h3 id="cc687d1bebed47229cbdeffdf98fd666"> Greek <a href="#cc687d1bebed47229cbdeffdf98fd666" title="permalink">#</a> </h3> <p> When encountering a word like <em>catamorphism</em>, your reaction might be: <blockquote> "Catamorphism?! What does that even mean? It's all Greek to me." </blockquote> Indeed, it's Greek, as is so much of mathematical terminology. The <em>cata</em> prefix means 'down'; lots of words start with <em>cata</em>, like <em>catastrophe</em>, <em>catalogue</em>, <em>catatonia</em>, <em>catacomb</em>, etc. </p> <p> The <em>morph</em> suffix generally means 'shape'. While the <em>cata</em> prefix appears in common words like <em>catastrophe</em>, the <em>morph</em> suffix mostly appears in more academic contexts. 
Programmers will probably have encountered <em>polymorphism</em> and <em>skeuomorphism</em>, not to mention <a href="/2018/01/08/software-design-isomorphisms">isomorphism</a>. While <em>morphism</em> is heavily used in mathematics, other sciences use the suffix too, like <em>dimorphism</em> in biology. </p> <p> In category theory, a <em>morphism</em> is basically just an arrow that points from one object to another. Think of it as a function. </p> <p> If a morphism is just a function, why don't we just call it that, then? Is it really necessary with this intimidating terminology? Yes and no. </p> <p> If someone had originally figured all of this out in the context of mainstream programming, he or she would probably have used friendlier names, like <em>condense</em>, <em>reduce</em>, <em>fold</em>, and so on. This would have been more encouraging, although <a href="/2017/10/05/monoids-semigroups-and-friends">I'm not sure it would have been better</a>. </p> <p> In software architecture we use many overloaded terms. For example, what's a <em>service</em>, or a <em>client?</em> What does <em>tier</em> mean? Is it the same as a <em>layer</em>, or is it something different? What's the <a href="http://tomasp.net/blog/2015/library-frameworks">difference between a library and a framework</a>? </p> <p> At least a word like <em>catamorphism</em> is concise. It's not in common use, so isn't overloaded and vague. </p> <p> Another, more pragmatic, concern is that whether you like it or not, the terminology is already established. Mathematicians decided to name the concept <em>catamorphism</em>. While the name may seem intimidating, I prefer to teach concepts like these using established terminology. This means that if my articles are unclear, you can do further research with other resources. That's the benefit of established terminology, whether you like the specific words or not. 
</p> <h3 id="c179dad7693c48c2ae592d33cb2be792"> Summary <a href="#c179dad7693c48c2ae592d33cb2be792" title="permalink">#</a> </h3> <p> You can compose entire applications based on the abstractions of <em>map</em> and <em>reduce</em>. You can see one example of such a system in my <a href="https://blog.ploeh.dk/functional-architecture-with-fsharp">A Functional Architecture with F#</a> Pluralsight course. </p> <p> The terms <em>map</em> and <em>reduce</em> may, however, not be helpful, because it may not be clear exactly what types of data you can map, and what types you can reduce. One of the most important goals of this overall article series about universal abstractions is to help you identify when such software architectures apply. This is more often than you think. </p> <p> What sort of data can you map? You can map <em>functors</em>. While hardly finite, there's a catalogue of well-known functors, of which I've covered some, but not all. That catalogue contains data containers like <a href="/2018/03/26/the-maybe-functor">Maybe</a>, <a href="/2018/08/06/a-tree-functor">Tree</a>, <a href="/2018/09/10/the-lazy-functor">lazy computations</a>, <a href="/2018/09/24/asynchronous-functors">tasks</a>, and perhaps a score more. The catalogue of (actually useful) functors has, in my experience, a manageable size. </p> <p> Likewise you could ask: What sort of data can you reduce? How do you implement that reduction? Again, there's a compact set of well-known catamorphisms. How do you reduce a collection? You use its catamorphism (which is equal to a fold). How do you reduce a tree? You use its catamorphism. How do you reduce an Either object? You use its catamorphism. </p> <p> When we learn new programming languages, new libraries, new frameworks, we gladly invest time in learning hundreds, if not thousands, of keywords, APIs, extensibility points, and so on. 
May I offer, for your consideration, that your mental resources are better spent learning only a handful of universal abstractions? </p> <p> <strong>Next:</strong> <a href="/2019/05/06/boolean-catamorphism">Boolean catamorphism</a>. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="673934773ecd4263a557599471eaf57c"> <div class="comment-author">Tyson Williams</div> <div class="comment-content"> <blockquote> In other (functional) languages, such a 'reduce' operation is called a <em>fold</em>. The metaphor, I suppose, is that you fold multiple values together, two by two. <br> ...the <code>Aggregate</code> overload that takes a <em>seed</em> value... </blockquote> <p> My impression is that part of the functional programming style is to avoid function overloading. Consistent with that is the naming used by Language Ext for these concepts. In Language Ext, the function with type (in F# notation) <code>seq&lt;'a&gt; -&gt; ('a -&gt; 'a -&gt; 'a) -&gt; 'a</code> is called <a href="https://github.com/louthy/language-ext/blob/master/LanguageExt.Core/DataTypes/List/Lst.Extensions.cs#L599">Reduce</a> and the function with type (in F# notation) <code>seq&lt;'a&gt; -&gt; 'b -&gt; ('b -&gt; 'a -&gt; 'b) -&gt; 'b</code> is called <a href="https://github.com/louthy/language-ext/blob/master/LanguageExt.Core/DataTypes/List/Lst.Extensions.cs#L420">Fold</a>. </p> <p> I don't know the origin of these two names, but I remember the difference by thinking about preparing food. In cooking, <a href="https://en.wikipedia.org/wiki/Reduction_(cooking)">reduction</a> increases the concentration of a liquid by boiling away some of its water. I think of the returned <code>'a</code> as being a highly concentrated form of the input sequence since every sequence element (and only those elements) was used to create that return value. 
In baking, <a href="https://www.wikihow.com/Fold-(Baking)">folding</a> is a technique that carefully combines two mixtures into one. This reminds me of how the seed value <code>'b</code> and the sequence of <code>'a</code> are (typically) two different types and are combined by the given function. They are not perfect analogies, but they work for me. </p> <p> On a related note, <a href="https://github.com/louthy/language-ext/issues/583">I dislike</a> that Reduce returns <code>'a</code> instead of <code>Option<'a></code>. </p> </div> <div class="comment-date">2019-07-12 12:20 UTC</div> </div> <div class="comment" id="43e48df0645b4c05992b0f599cc71d10"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tyson, thank you for writing. As you may know, <a href="https://amzn.to/2TE8tJx">my book</a> liberally employs cooking analogies, but I admit that I've never thought about <em>reduction</em> and <em>fold</em> in that light before. Good analogies, although perhaps a bit <em>strained</em> (pun intended). </p> <p> They do work well, though, for the reasons you give. <blockquote> "the functional programming style is to avoid function overloading" </blockquote> As far as I can tell, this has more to do with the combination of <a href="https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system">Hindley–Milner type inference</a> and currying you encounter in Haskell and ML-derived languages than it has to do with functional programming in itself. If I recall correctly, <a href="https://clojure.org">Clojure</a> makes copious use of overloading. </p> <p> The problem with overloading in a language like <a href="https://fsharp.org">F#</a> is that if you imagine that the function you refer to as <code>fold</code> was also called <code>reduce</code>, a partially applied expression like this would be ambiguous: </p> <p> <pre>let foo = reduce xs bar</pre> </p> <p> What is <code>bar</code>, here? 
If <code>reduce</code> is overloaded, is it a function, or is it a 'seed value'? </p> <p> As far as I can tell, the compiler can't infer that, so instead of compromising on type inference, the languages in question disallow function overloading. </p> <p> Notice, additionally, that F# does allow <em>method</em> overloading, for the part of the language that enables interoperability with the rest of .NET. In that part of the language, type inference rarely works anyway. I'm not an expert in how the F# compiler works, but I've always understood this to indicate that the interop part of F# isn't based on Hindley-Milner. I don't see how it could be, since the .NET/IL type system isn't a Hindley-Milner type system. </p> <p> The <code>reduce</code> function you refer to is, by the way, based on a <a href="/2017/11/27/semigroups">semigroup</a> instance. More specifically, it's simply how <a href="/2017/12/11/semigroups-accumulate">semigroups accumulate</a>. I agree that <code>reduce</code> is partial, and therefore not as pretty as one could wish, but I think a more appropriate solution is to define it on <code>NotEmptyCollection&lt;T&gt;</code>, instead of on <code>IEnumerable&lt;T&gt;</code>, as shown in <a href="/2017/12/11/semigroups-accumulate">that article</a>. </p> <p> In other words, I don't think <code>reduce</code> belongs on <code>IEnumerable&lt;T&gt;</code> at all. I know it's in both F# and Haskell, but my personal opinion is that it shouldn't be, just like Haskell's <code>head</code> function ought not to exist either... </p> </div> <div class="comment-date">2019-07-14 16:29 UTC</div> </div> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. 
Applicative monoids https://blog.ploeh.dk/2019/04/22/applicative-monoids 2019-04-22T05:36:00+00:00 Mark Seemann <div id="post"> <p> <em>An applicative functor containing monoidal values itself forms a monoid.</em> </p> <p> This article is an instalment in <a href="/2018/10/01/applicative-functors">an article series about applicative functors</a>. An applicative functor is a <a href="https://bartoszmilewski.com/2014/01/14/functors-are-containers">data container</a> that supports combinations. If an applicative functor contains values of a type that gives rise to a <a href="/2017/10/06/monoids">monoid</a>, then the <a href="/2018/03/22/functors">functor</a> itself forms a monoid. </p> <p> In a previous article you learned that <a href="/2019/04/15/lazy-monoids">lazy computations of monoids remain monoids</a>. Furthermore, <a href="/2018/12/17/the-lazy-applicative-functor">a lazy computation is an applicative functor</a>, and it turns out that the result generalises. The result regarding lazy computation is just a special case. </p> <h3 id="3c6acb0da15b4ae8b78f5d8879b7efe3"> Monap <a href="#3c6acb0da15b4ae8b78f5d8879b7efe3" title="permalink">#</a> </h3> <p> Since version 4.11 of <a href="https://www.haskell.org">Haskell</a>'s <em>base</em> library, <code>Monoid</code> is a subset of <code>Semigroup</code>, so in order to create a <code>Monoid</code> instance, you must first define a <code>Semigroup</code> instance. </p> <p> In order to escape the need for flexible contexts, you'll have to define a wrapper <code>newtype</code> that'll be the instance. What should you call it? It's going to be an applicative functor of monoids, so perhaps something like <em>ApplicativeMonoid?</em> Nah, that's too long. <em>AppMon</em>, then? Sure, but how about flipping the terms: <em>MonApp?</em> That's better. Let's drop the last <em>p</em> and dispense with the <a href="https://en.wikipedia.org/wiki/Camel_case">Pascal case</a>: <em>Monap</em>. 
</p> <p> <em>Monap</em> almost looks like <em>Monad</em>, only with the last letter rotated half a revolution. This should allow for maximum confusion. </p> <p> To be clear, I normally don't advocate for droll word play when writing production code, but I occasionally do it in articles and presentations. The <em>Monap</em> in this article exists only to illustrate a point. It's not intended to be used. Furthermore, this article doesn't discuss monads at all, so the risk of confusion should, hopefully, be minimised. I may, however, regret this decision... </p> <h3 id="fb2568655a314bba8e417d06317b2690"> Applicative semigroup <a href="#fb2568655a314bba8e417d06317b2690" title="permalink">#</a> </h3> <p> First, introduce the wrapper <code>newtype</code>: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;Monap&nbsp;f&nbsp;a&nbsp;=&nbsp;Monap&nbsp;{&nbsp;runMonap&nbsp;::&nbsp;f&nbsp;a&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>)</pre> </p> <p> This only states that there's a type called <code>Monap</code> that wraps some higher-kinded type <code>f a</code>; that is, a container <code>f</code> of values of the type <code>a</code>. The intent is that <code>f</code> is an applicative functor, hence the use of the letter <em>f</em>, but the type itself doesn't constrain <code>f</code> to any type class. 
</p> <p> The <code>Semigroup</code> instance does, though: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;(<span style="color:blue;">Applicative</span>&nbsp;f,&nbsp;<span style="color:blue;">Semigroup</span>&nbsp;a)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:blue;">Semigroup</span>&nbsp;(<span style="color:blue;">Monap</span>&nbsp;f&nbsp;a)&nbsp;<span style="color:blue;">where</span>
&nbsp;&nbsp;(Monap&nbsp;x)&nbsp;&lt;&gt;&nbsp;(Monap&nbsp;y)&nbsp;=&nbsp;Monap&nbsp;$&nbsp;liftA2&nbsp;<span style="color:#2b91af;">(&lt;&gt;)</span>&nbsp;x&nbsp;y</pre> </p> <p> This states that when <code>f</code> is an <code>Applicative</code> instance, and <code>a</code> is a <code>Semigroup</code> instance, then <code>Monap f a</code> is also a <code>Semigroup</code> instance. </p> <p> Here's an example of combining two applicative <a href="/2017/11/27/semigroups">semigroups</a>: </p> <p> <pre>λ&gt; Monap (Just (Max 42)) &lt;&gt; Monap (Just (Max 1337))
Monap {runMonap = Just (Max {getMax = 1337})}</pre> </p> <p> This example uses the <code>Max</code> semigroup container, and <code>Maybe</code> as the applicative functor. For <code>Max</code>, the <code>&lt;&gt;</code> operator returns the value that contains the highest value, which in this case is 1337. </p> <p> It even works when the applicative functor in question is <code>IO</code>: </p> <p> <pre>λ&gt; runMonap $ Monap (Sum &lt;$&gt; randomIO @Word8) &lt;&gt; Monap (Sum &lt;$&gt; randomIO @Word8)
Sum {getSum = 165}</pre> </p> <p> This example uses <code>randomIO</code> to generate two random values. It uses the <code>TypeApplications</code> GHC extension to make <code>randomIO</code> generate <code>Word8</code> values. Each random number is projected into the <code>Sum</code> container, which means that <code>&lt;&gt;</code> will add the numbers together. 
In the above example, the result is 165, but if you evaluate the expression a second time, you'll most likely get another result: </p> <p> <pre>λ&gt; runMonap $ Monap (Sum &lt;$&gt; randomIO @Word8) &lt;&gt; Monap (Sum &lt;$&gt; randomIO @Word8)
Sum {getSum = 246}</pre> </p> <p> You can also use a linked list (<code>[]</code>) as the applicative functor. In this case, the result may be surprising (depending on what you expect): </p> <p> <pre>λ&gt; Monap [Product 2, Product 3] &lt;&gt; Monap [Product 4, Product 5, Product 6]
Monap {runMonap = [Product {getProduct = 8},Product {getProduct = 10},Product {getProduct = 12},
                   Product {getProduct = 12},Product {getProduct = 15},Product {getProduct = 18}]}</pre> </p> <p> Notice that we get all the combinations of products: <em>2</em> multiplied by each element in the second list, followed by <em>3</em> multiplied by each of the elements in the second list. This shouldn't be that startling, though, since you've already, previously in this article series, seen several examples of how an applicative functor implies combinations. 
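</p> <p> The combinations come from the way <code>liftA2</code> works for lists. As a sketch, the following self-contained program repeats the <code>Monap</code> type and its <code>Semigroup</code> instance and compares the result with an ordinary list comprehension. The names <code>combined</code> and <code>viaComprehension</code>, the choice of <code>Int</code>, and the <code>main</code> action are assumptions made for illustration. </p> <p> <pre>module Main where

import Control.Applicative (liftA2)
import Data.Monoid (Product (..))

newtype Monap f a = Monap { runMonap :: f a } deriving (Show, Eq)

instance (Applicative f, Semigroup a) =&gt; Semigroup (Monap f a) where
  (Monap x) &lt;&gt; (Monap y) = Monap $ liftA2 (&lt;&gt;) x y

-- Combine two lists of Product values via Monap, then unwrap the numbers.
combined :: [Int]
combined = map getProduct $ runMonap $
  Monap [Product 2, Product 3] &lt;&gt; Monap [Product 4, Product 5, Product 6]

-- The same combinations, written as a list comprehension.
viaComprehension :: [Int]
viaComprehension = [x * y | x &lt;- [2, 3], y &lt;- [4, 5, 6]]

-- Hypothetical harness.
main :: IO ()
main = do
  print combined
  print (combined == viaComprehension)</pre> </p> <p> This should print <code>[8,10,12,12,15,18]</code> followed by <code>True</code>: for lists, <code>liftA2</code> applies the function to every pairing of an element from the first list with an element from the second. 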
</p> <h3 id="33d4ff4f0fbf4e3cbf4d802b0f287f63"> Applicative monoid <a href="#33d4ff4f0fbf4e3cbf4d802b0f287f63" title="permalink">#</a> </h3> <p> With the <code>Semigroup</code> instance in place, you can now add the <code>Monoid</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;(<span style="color:blue;">Applicative</span>&nbsp;f,&nbsp;<span style="color:blue;">Monoid</span>&nbsp;a)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:blue;">Monoid</span>&nbsp;(<span style="color:blue;">Monap</span>&nbsp;f&nbsp;a)&nbsp;<span style="color:blue;">where</span>
&nbsp;&nbsp;mempty&nbsp;=&nbsp;Monap&nbsp;$&nbsp;pure&nbsp;$&nbsp;mempty
</pre> </p> <p> This is straightforward: you take the identity (<code>mempty</code>) of the monoid <code>a</code>, promote it to the applicative functor <code>f</code> with <code>pure</code>, and finally put that value into the <code>Monap</code> wrapper. </p> <p> This works fine as well: </p> <p> <pre>λ&gt; mempty :: Monap Maybe (Sum Integer)
Monap {runMonap = Just (Sum {getSum = 0})}
λ&gt; mempty :: Monap [] (Product Word8)
Monap {runMonap = [Product {getProduct = 1}]}</pre> </p> <p> The identity laws also seem to hold: </p> <p> <pre>λ&gt; Monap (Right mempty) &lt;&gt; Monap (Right (Sum 2112))
Monap {runMonap = Right (Sum {getSum = 2112})}
λ&gt; Monap ("foo", All False) &lt;&gt; Monap mempty
Monap {runMonap = ("foo",All {getAll = False})}</pre> </p> <p> The last, right-identity example is interesting, because the applicative functor in question is a tuple. Tuples are <code>Applicative</code> instances when the first, or left, element is a <code>Monoid</code> instance. In other words, <code>f</code> is, in this case, <code>(,) String</code>. The <code>Monoid</code> instance that <code>Monap</code> sees as <code>a</code>, on the other hand, is <code>All</code>. 
</p> <p> Since <a href="/2017/10/30/tuple-monoids">tuples of monoids are themselves monoids</a>, however, I can get away with writing <code>Monap mempty</code> on the right-hand side, instead of the more elaborate template the other examples use: </p> <p> <pre>λ&gt; Monap ("foo", All False) &lt;&gt; Monap ("", mempty)
Monap {runMonap = ("foo",All {getAll = False})}</pre> </p> <p> or perhaps even: </p> <p> <pre>λ&gt; Monap ("foo", All False) &lt;&gt; Monap (mempty, mempty)
Monap {runMonap = ("foo",All {getAll = False})}</pre> </p> <p> Ultimately, all three alternatives mean the same. </p> <h3 id="6b484b60d19b4ef4a3716ceaa693cb8b"> Associativity <a href="#6b484b60d19b4ef4a3716ceaa693cb8b" title="permalink">#</a> </h3> <p> As usual, I'm not going to do the work of formally proving that the monoid laws hold for the <code>Monap</code> instances, but I'd like to share some QuickCheck properties that indicate that they do, starting with a property that verifies associativity: </p> <p> <pre><span style="color:#2b91af;">assocLaw</span>&nbsp;::&nbsp;(<span style="color:blue;">Eq</span>&nbsp;a,&nbsp;<span style="color:blue;">Show</span>&nbsp;a,&nbsp;<span style="color:blue;">Semigroup</span>&nbsp;a)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Property</span>
assocLaw&nbsp;x&nbsp;y&nbsp;z&nbsp;=&nbsp;(x&nbsp;&lt;&gt;&nbsp;y)&nbsp;&lt;&gt;&nbsp;z&nbsp;===&nbsp;x&nbsp;&lt;&gt;&nbsp;(y&nbsp;&lt;&gt;&nbsp;z)</pre> </p> <p> This property is entirely generic. It'll verify associativity for any <code>Semigroup a</code>, not only for <code>Monap</code>. You can, however, run it for various <code>Monap</code> types, as well. You'll see how this is done a little later. 
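</p> <p> If you'd like to try the property in isolation, without a test runner, something like the following sketch might do. It assumes that the QuickCheck package is available, repeats the wrapper so that the file stands on its own, and adds an <code>Arbitrary</code> instance for <code>Monap</code>. That instance is my assumption for the sake of illustration; the article doesn't show how values are generated. </p> <p> <pre>{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeApplications #-}

import Control.Applicative (liftA2)
import Test.QuickCheck

newtype Monap f a = Monap { runMonap :: f a } deriving (Show, Eq)

instance (Applicative f, Semigroup a) =&gt; Semigroup (Monap f a) where
  Monap x &lt;&gt; Monap y = Monap $ liftA2 (&lt;&gt;) x y

-- Hypothetical instance (an assumption, not from the article): reuse the
-- generator that the wrapped type already has.
instance Arbitrary (f a) =&gt; Arbitrary (Monap f a) where
  arbitrary = Monap &lt;$&gt; arbitrary

assocLaw :: (Eq a, Show a, Semigroup a) =&gt; a -&gt; a -&gt; a -&gt; Property
assocLaw x y z = (x &lt;&gt; y) &lt;&gt; z === x &lt;&gt; (y &lt;&gt; z)

main :: IO ()
main = do
  -- A type application picks the concrete type that the property runs against.
  quickCheck (assocLaw @(Monap Maybe String))
  quickCheck (assocLaw @(Monap (Either Int) String))
</pre> 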
</p> <h3 id="51973eba155f418e8903d37f6d3938d2"> Identity <a href="#51973eba155f418e8903d37f6d3938d2" title="permalink">#</a> </h3> <p> Likewise, you can write two properties that check left and right identity, respectively. </p> <p> <pre><span style="color:#2b91af;">leftIdLaw</span>&nbsp;::&nbsp;(<span style="color:blue;">Eq</span>&nbsp;a,&nbsp;<span style="color:blue;">Show</span>&nbsp;a,&nbsp;<span style="color:blue;">Monoid</span>&nbsp;a)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Property</span>
leftIdLaw&nbsp;x&nbsp;=&nbsp;x&nbsp;===&nbsp;mempty&nbsp;&lt;&gt;&nbsp;x

<span style="color:#2b91af;">rightIdLaw</span>&nbsp;::&nbsp;(<span style="color:blue;">Eq</span>&nbsp;a,&nbsp;<span style="color:blue;">Show</span>&nbsp;a,&nbsp;<span style="color:blue;">Monoid</span>&nbsp;a)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Property</span>
rightIdLaw&nbsp;x&nbsp;=&nbsp;x&nbsp;===&nbsp;x&nbsp;&lt;&gt;&nbsp;mempty
</pre> </p> <p> Again, this is entirely generic. These properties can be used to test the identity laws for any monoid, including <code>Monap</code>. </p> <h3 id="0cc5968e519c48a0816574e1dc1667fc"> Properties <a href="#0cc5968e519c48a0816574e1dc1667fc" title="permalink">#</a> </h3> <p> You can run each of these properties multiple times, for various functors and monoids. As <code>Applicative</code> instances, I've used <code>Maybe</code>, <code>[]</code>, <code>(,) Any</code>, and <code>Identity</code>. As <code>Monoid</code> instances, I've used <code>String</code>, <code>Sum Integer</code>, <code>Max Int8</code>, and <code>[Float]</code>. Notice that a list (<code>[]</code>) is both an applicative functor and a monoid. In this test set, I've used it in both roles. 
</p> <p> <pre>tests&nbsp;= &nbsp;&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;testGroup&nbsp;<span style="color:#a31515;">&quot;Properties&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;testProperty&nbsp;<span style="color:#a31515;">&quot;Associativity&nbsp;law,&nbsp;Maybe&nbsp;String&quot;</span>&nbsp;(assocLaw&nbsp;@(Monap&nbsp;Maybe&nbsp;String)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;testProperty&nbsp;<span style="color:#a31515;">&quot;Left&nbsp;identity&nbsp;law,&nbsp;Maybe&nbsp;String&quot;</span>&nbsp;(leftIdLaw&nbsp;@(Monap&nbsp;Maybe&nbsp;String)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;testProperty&nbsp;<span style="color:#a31515;">&quot;Right&nbsp;identity&nbsp;law,&nbsp;Maybe&nbsp;String&quot;</span>&nbsp;(rightIdLaw&nbsp;@(Monap&nbsp;Maybe&nbsp;String)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;testProperty&nbsp;<span style="color:#a31515;">&quot;Associativity&nbsp;law,&nbsp;[Sum&nbsp;Integer]&quot;</span>&nbsp;(assocLaw&nbsp;@(Monap&nbsp;<span style="color:blue;">[]</span>&nbsp;(Sum&nbsp;Integer))), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;testProperty&nbsp;<span style="color:#a31515;">&quot;Left&nbsp;identity&nbsp;law,&nbsp;[Sum&nbsp;Integer]&quot;</span>&nbsp;(leftIdLaw&nbsp;@(Monap&nbsp;<span style="color:blue;">[]</span>&nbsp;(Sum&nbsp;Integer))), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;testProperty&nbsp;<span style="color:#a31515;">&quot;Right&nbsp;identity&nbsp;law,&nbsp;[Sum&nbsp;Integer]&quot;</span>&nbsp;(rightIdLaw&nbsp;@(Monap&nbsp;<span style="color:blue;">[]</span>&nbsp;(Sum&nbsp;Integer))), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;testProperty&nbsp;<span style="color:#a31515;">&quot;Associativity&nbsp;law,&nbsp;(Any,&nbsp;Max&nbsp;Int8)&quot;</span>&nbsp;(assocLaw&nbsp;@(Monap&nbsp;(<span style="color:#2b91af;">(,)</span>&nbsp;Any)&nbsp;(Max&nbsp;Int8))), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;testProperty&nbsp;<span style="color:#a31515;">&quot;Left&nbsp;identity&nbsp;law,&nbsp;(Any,&nbsp;Max&nbsp;Int8)&quot;</span>&nbsp;(leftIdLaw&nbsp;@(Monap&nbsp;(<span 
style="color:#2b91af;">(,)</span>&nbsp;Any)&nbsp;(Max&nbsp;Int8))), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;testProperty&nbsp;<span style="color:#a31515;">&quot;Right&nbsp;identity&nbsp;law,&nbsp;(Any,&nbsp;Max&nbsp;Int8)&quot;</span>&nbsp;(rightIdLaw&nbsp;@(Monap&nbsp;(<span style="color:#2b91af;">(,)</span>&nbsp;Any)&nbsp;(Max&nbsp;Int8))), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;testProperty&nbsp;<span style="color:#a31515;">&quot;Associativity&nbsp;law,&nbsp;Identity&nbsp;[Float]&quot;</span>&nbsp;(assocLaw&nbsp;@(Monap&nbsp;Identity&nbsp;[Float])), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;testProperty&nbsp;<span style="color:#a31515;">&quot;Left&nbsp;identity&nbsp;law,&nbsp;Identity&nbsp;[Float]&quot;</span>&nbsp;(leftIdLaw&nbsp;@(Monap&nbsp;Identity&nbsp;[Float])), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;testProperty&nbsp;<span style="color:#a31515;">&quot;Right&nbsp;identity&nbsp;law,&nbsp;Identity&nbsp;[Float]&quot;</span>&nbsp;(rightIdLaw&nbsp;@(Monap&nbsp;Identity&nbsp;[Float])) &nbsp;&nbsp;&nbsp;&nbsp;] &nbsp;&nbsp;] </pre> </p> <p> All of these properties pass. </p> <h3 id="af35f2986a734a16be80590c86d0432d"> Summary <a href="#af35f2986a734a16be80590c86d0432d" title="permalink">#</a> </h3> <p> It seems that any applicative functor that contains monoidal values itself forms a monoid. The <code>Monap</code> type presented in this article only exists to demonstrate this conjecture; it's not intended to be <em>used</em>. </p> <p> If it holds, I think it's an interesting result, because it further enables you to reason about the properties of complex systems, based on the properties of simpler systems. </p> <p> <strong>Next: </strong> <a href="/2018/12/24/bifunctors">Bifunctors</a>. 
</p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="9a3acf3cf4174e178dc9349e11fee488"> <div class="comment-author">Tyson Williams</div> <div class="comment-content"> <blockquote> It seems that any applicative functor that contains monoidal values itself forms a monoid. </blockquote> <p> Is it necessary for the functor to be applicative? Do you know of a functor that contains monoidal values for which itself does <em>not</em> form a monoid? </p> </div> <div class="comment-date">2019-05-13 11:28 UTC</div> </div> <div class="comment" id="a164909adb884cd78b309a81029a2dd8"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tyson, thank you for writing. Yes, it's necessary for the functor to be applicative, because you need the applicative combination operator <code>&lt;*&gt;</code> in order to implement the combination. In C#, you'd need an <code>Apply</code> method <a href="/2018/10/01/applicative-functors/#cef395ee19644f30bfd1ad7a84b6f912">as shown here</a>. </p> <p> Technically, the monoidal <code>&lt;&gt;</code> operator for <code>Monap</code> is, as you can see, implemented with a call to <code>liftA2</code>. In Haskell, you can implement an instance of <code>Applicative</code> by implementing either <code>liftA2</code> or <code>&lt;*&gt;</code>, as well as <code>pure</code>. You usually see <code>Applicative</code> described by <code>&lt;*&gt;</code>, which is what I've done in <a href="/2018/10/01/applicative-functors">my article series on applicative functors</a>. If you do that, you can define <code>liftA2</code> by a combination of <code>&lt;*&gt;</code> and <code>fmap</code> (the <code>Select</code> method that defines functors). </p> <p> If you want to put this in C# terms, you need both <code>Select</code> and <code>Apply</code> in order to be able to lift a monoid into a functor. 
</p> <p> Is there a functor that contains monoidal values that itself doesn't form a monoid? </p> <p> Yes, indeed. In order to answer that question, we 'just' need to identify a functor that's <em>not</em> an applicative functor. Tuples are good examples. </p> <p> A <a href="https://blog.ploeh.dk/2018/12/31/tuple-bifunctor/#d918d0271c33406ba3047ef162212100">tuple forms a functor</a>, but in general nothing more than that. Consider a tuple where the first element is a <code>Guid</code>. It's a functor, but can you implement the following function? </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:#2b91af;">Guid</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;Apply&lt;<span style="color:#2b91af;">TResult</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:#2b91af;">Guid</span>,&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&gt;&nbsp;selector, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:#2b91af;">Guid</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;source) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">NotImplementedException</span>(<span style="color:#a31515;">&quot;What&nbsp;would&nbsp;you&nbsp;write&nbsp;here?&quot;</span>); }</pre> </p> <p> You can pull the <code>T</code> value out of <code>source</code> and project it to a <code>TResult</code> value with <code>selector</code>, but you'll need to put it back in a <code>Tuple&lt;Guid, TResult&gt;</code>. Which <code>Guid</code> value are you going to use for that tuple? 
</p> <p> There's no clear answer to that question. </p> <p> More specifically, consider <code>Tuple&lt;Guid, int&gt;</code>. This is a functor that contains monoidal values. Let's say that we want to use the addition monoid over integers. How would you implement the following method? </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:#2b91af;">Guid</span>,&nbsp;<span style="color:blue;">int</span>&gt;&nbsp;Add(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:#2b91af;">Guid</span>,&nbsp;<span style="color:blue;">int</span>&gt;&nbsp;x,&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:#2b91af;">Guid</span>,&nbsp;<span style="color:blue;">int</span>&gt;&nbsp;y) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">NotImplementedException</span>(<span style="color:#a31515;">&quot;What&nbsp;would&nbsp;you&nbsp;write&nbsp;here?&quot;</span>); }</pre> </p> <p> Again, you run into the issue that while you can pull the integers out of the tuples and add them together, there's no clear way to figure out which <code>Guid</code> value to put into the tuple that contains the sum. </p> <p> The issue particularly with tuples is that there's no general way to combine the leftmost values of the tuples. If there is - that is, if leftmost values form a monoid - then the tuple is also an applicative functor. 
For example, <code>Tuple&lt;string, int&gt;</code> is applicative and forms a monoid over addition: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;Apply&lt;<span style="color:#2b91af;">TResult</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&gt;&nbsp;selector, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;source) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:#2b91af;">Tuple</span>.Create( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;selector.Item1&nbsp;+&nbsp;source.Item1, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;selector.Item2(source.Item2)); } <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;&nbsp;Add(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;&nbsp;x,&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;&nbsp;y) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:#2b91af;">Tuple</span>.Create(x.Item1&nbsp;+&nbsp;y.Item1,&nbsp;x.Item2&nbsp;+&nbsp;y.Item2); }</pre> </p> <p> You can also 
implement <code>Add</code> with <code>Apply</code>, but you're going to need two <code>Apply</code> overloads to make it work. </p> <p> Incidentally, unbeknownst to me, the <code>Ap</code> wrapper was added to Haskell's <code>Data.Monoid</code> module 12 days before I wrote this article. In all but name, it's equivalent to the <code>Monap</code> wrapper presented here. </p> </div> <div class="comment-date">2019-05-14 20:44 UTC</div> </div> <div class="comment" id="8900297a3b484ac0adfb8574a66cff87"> <div class="comment-author">Tyson Williams</div> <div class="comment-content"> <blockquote> <p> ...if leftmost values form a monoid - then the tuple is also an applicative functor. For example, <code>Tuple&lt;string, int&gt;</code> is applicative... </p> </blockquote> <p> I want to add some prepositional phrases to our statements like I <a href="https://blog.ploeh.dk/2019/01/07/either-bifunctor/#79f5d74763e34cb0997a7a79df1e05f0">commented here</a> to help clarify things. I don't think that <code>Tuple&lt;string, int&gt;</code> can be applicative because there are no type parameters in which it could be applicative. Were you trying to say that <code>Tuple&lt;string, B&gt;</code> is applicative in <code>B</code>? This seems to match your <code>Apply</code> function, which doesn't depend on <code>int</code>. </p> </div> <div class="comment-date">2019-05-30 05:02 UTC</div> </div> <div class="comment" id="daa6444c8fb0484a8bf1accdb3cc544b"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tyson, you're quite right; good catch. My wording was incorrect (I was probably either tired or in a hurry when I wrote that), but fortunately, the code looks like it's correct. </p> <p> Thank you for pointing out my error. 
</p> </div> <div class="comment-date">2019-05-30 13:00 UTC</div> </div> <div class="comment" id="075df592e63948e695851d7ae20842ea"> <div class="comment-author">Tyson Williams</div> <div class="comment-content"> <blockquote> <p> ...<code>Tuple&lt;string, int&gt;</code> is applicative and forms a monoid over addition... </p> </blockquote> <p> I do agree with the monoid part, where "addition" means string concatenation for the first item and integer addition for the second item. </p> <blockquote> <p> <code>Tuple&lt;string, B&gt;</code> is applicative in <code>B</code> </p> </blockquote> <p> Now I am trying to improve my understanding of this statement. In Haskell, my understanding is that the definition of the <a href="https://en.wikibooks.org/wiki/Haskell/Applicative_functors#The_Applicative_class">Applicative type class</a> applied to <code>Tuple&lt;string, B&gt;</code> requires a function <code>pure</code> from <code>B</code> to <code>Tuple&lt;string, B&gt;</code>. What is the definition of this function? Does it use the empty string in order to make an instance of <code>Tuple&lt;string, B&gt;</code>? If so, what is the justification for this? Or maybe my reasoning here is mistaken. </p> </div> <div class="comment-date">2019-05-31 12:52 UTC</div> </div> <div class="comment" id="ce69d4440f314827a8ec441511d0ce71"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tyson, thank you for writing. In Haskell, it's true that applicative functors must also define <code>pure</code>. In this article series, I've glossed over that constraint, since I'm not aware of any data containers that can lawfully implement <code>&lt;*&gt;</code> or <code>liftA2</code>, but <em>can't</em> define <code>pure</code>. 
</p> <p> The applicative instance for tuples is, however, constrained: </p> <p> <pre>Monoid a =&gt; Applicative ((,) a)</pre> </p> <p> The construct <code>((,) a)</code> means any tuple where the first element has the generic type <code>a</code>. The entire expression means that tuples are applicative functors when the first element forms a monoid; that's the restriction on <code>a</code>. The definition of <code>pure</code>, then, is: </p> <p> <pre>pure x = (mempty, x)</pre> </p> <p> That is, use the monoidal identity (<code>mempty</code>) for the first element, and use <code>x</code> as the second element. For strings, since the identity for string concatenation is the empty string, yes, it does exactly mean that <code>pure</code> for <code>Tuple&lt;string, B&gt;</code> would return a tuple with the empty string, and the input argument as the second element: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;Pure&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">T</span>&nbsp;x) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:#2b91af;">Tuple</span>.Create(<span style="color:#a31515;">&quot;&quot;</span>,&nbsp;x); }</pre> </p> <p> That's the behaviour you get from Haskell as well: </p> <p> <pre>Prelude Data.Monoid&gt; pure 42 :: (String, Int) ("",42)</pre> </p> </div> <div class="comment-date">2019-05-31 14:59 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Lazy monoids https://blog.ploeh.dk/2019/04/15/lazy-monoids 2019-04-15T13:54:00+00:00 Mark Seemann <div id="post"> <p> <em>Lazy monoids are monoids. 
An article for object-oriented programmers.</em> </p> <p> This article is part of a <a href="/2017/10/06/monoids">series about monoids</a>. In short, a <em>monoid</em> is an associative binary operation with a neutral element (also known as <em>identity</em>). Previous articles have shown how more complex monoids arise from simpler monoids, such as <a href="/2017/10/30/tuple-monoids">tuple monoids</a>, <a href="/2017/11/06/function-monoids">function monoids</a>, and <a href="/2018/04/03/maybe-monoids">Maybe monoids</a>. This article shows another such result: how lazy computations of monoids themselves form monoids. </p> <p> You'll see how simple this is through a series of examples. Specifically, you'll revisit several of the examples you've already seen in this article series. </p> <h3 id="a715cff45376401db9863a095a5e156d"> Lazy addition <a href="#a715cff45376401db9863a095a5e156d" title="permalink">#</a> </h3> <p> Perhaps the most intuitive monoid is <em>addition</em>. Lazy addition forms a monoid as well. In C#, you can implement this with a simple extension method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;Add(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;x,&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;y)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;(()&nbsp;=&gt;&nbsp;x.Value&nbsp;+&nbsp;y.Value);
}</pre> </p> <p> This <code>Add</code> method simply adds two lazy integers together in a lazy computation. 
You use it like any other extension method: </p> <p> <pre><span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;x&nbsp;=&nbsp;<span style="color:green;">//&nbsp;...</span> <span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;y&nbsp;=&nbsp;<span style="color:green;">//&nbsp;...</span> <span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;sum&nbsp;=&nbsp;x.Add(y);</pre> </p> <p> I'll spare you the tedious listing of <a href="https://fscheck.github.io/FsCheck">FsCheck</a>-based properties that demonstrate that the monoid laws hold. We'll look at an example of such a set of properties later in this article, for one of the other monoids. </p> <h3 id="eda5d39029904d70995ebee84570cf60"> Lazy multiplication <a href="#eda5d39029904d70995ebee84570cf60" title="permalink">#</a> </h3> <p> Not surprisingly, I hope, you can implement multiplication over lazy numbers in the same way: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;Multiply(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;x,&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;y) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;(()&nbsp;=&gt;&nbsp;x.Value&nbsp;*&nbsp;y.Value); }</pre> </p> <p> Usage is similar to lazy addition: </p> <p> <pre><span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;x&nbsp;=&nbsp;<span style="color:green;">//&nbsp;...</span> <span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;y&nbsp;=&nbsp;<span 
style="color:green;">//&nbsp;...</span> <span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;product&nbsp;=&nbsp;x.Multiply(y);</pre> </p> <p> As is the case with lazy addition, this <code>Multiply</code> method currently only works with lazy <code>int</code> values. If you also want it to work with <code>long</code>, <code>short</code>, or other types of numbers, you'll have to add method overloads. </p> <h3 id="93b85bb3c55045dabdb16fd11167dad7"> Lazy Boolean monoids <a href="#93b85bb3c55045dabdb16fd11167dad7" title="permalink">#</a> </h3> <p> There are four monoids over Boolean values, although I've customarily only shown two of them: <em>and</em> and <em>or</em>. These also, trivially, work with lazy Boolean values: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;And(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;x,&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;y) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;(()&nbsp;=&gt;&nbsp;x.Value&nbsp;&amp;&amp;&nbsp;y.Value); } <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;Or(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;x,&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;y) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span 
style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;(()&nbsp;=&gt;&nbsp;x.Value&nbsp;||&nbsp;y.Value); }</pre> </p> <p> Given the previous examples, you'll hardly be surprised to see how you can use one of these extension methods: </p> <p> <pre><span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;x&nbsp;=&nbsp;<span style="color:green;">//&nbsp;...</span> <span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;y&nbsp;=&nbsp;<span style="color:green;">//&nbsp;...</span> <span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;b&nbsp;=&nbsp;x.And(y);</pre> </p> <p> Have you noticed a pattern in how the lazy binary operations <code>Add</code>, <code>Multiply</code>, <code>And</code>, and <code>Or</code> are implemented? Could this be generalised? </p> <h3 id="12b7d0077481413cab6acf8f6bf696cf"> Lazy angular addition <a href="#12b7d0077481413cab6acf8f6bf696cf" title="permalink">#</a> </h3> <p> In a previous article you saw how <a href="/2018/07/16/angular-addition-monoid">angular addition forms a monoid</a>. 
Lazy angular addition forms a monoid as well, which you can implement with another extension method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Angle</span>&gt;&nbsp;Add(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Angle</span>&gt;&nbsp;x,&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Angle</span>&gt;&nbsp;y)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Angle</span>&gt;(()&nbsp;=&gt;&nbsp;x.Value.Add(y.Value));
}</pre> </p> <p> Until now, you may have noticed that all the extension methods seemed to follow a common pattern that looks like this: </p> <p> <pre><span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Foo</span>&gt;(()&nbsp;=&gt;&nbsp;x.Value&nbsp;&diamond;&nbsp;y.Value);</pre> </p> <p> Here, I've used the diamond operator <code>&diamond;</code> as a place-holder for any sort of binary operation. My choice of that particular character is strongly influenced by <a href="https://www.haskell.org">Haskell</a>, where <a href="/2017/11/27/semigroups">semigroups</a> and monoids are modelled polymorphically with (among other options) the <code>&lt;&gt;</code> operator. </p> <p> The lazy angular addition implementation looks a bit different, though. This is because the original example uses an instance method to model the binary operation, instead of an infix operator such as <code>+</code>, <code>&amp;&amp;</code>, and so on. Given that the implementation of a lazy binary operation can also look like this, can you still imagine a generalisation? 
</p> <h3 id="669710b096144c6d8aaaca4426f8f795"> Lazy string concatenation <a href="#669710b096144c6d8aaaca4426f8f795" title="permalink">#</a> </h3> <p> If we follow the rough ordering of examples introduced in this article series about monoids, we've now reached <a href="/2017/10/10/strings-lists-and-sequences-as-a-monoid">concatenation as a monoid</a>. While various lists, arrays, and other sorts of collections also form a monoid over concatenation, in .NET, <code>IEnumerable&lt;T&gt;</code> already enables lazy evaluation, so I think it's more interesting to consider lazy string concatenation. </p> <p> The implementation, however, holds few surprises: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">string</span>&gt;&nbsp;Concat(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">string</span>&gt;&nbsp;x,&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">string</span>&gt;&nbsp;y) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">string</span>&gt;(()&nbsp;=&gt;&nbsp;x.Value&nbsp;+&nbsp;y.Value); }</pre> </p> <p> The overall result, so far, seems encouraging. All the basic monoids we've covered are also monoids when lazily computed. </p> <h3 id="29df873ebae145e2b903f6562825b69e"> Lazy money addition <a href="#29df873ebae145e2b903f6562825b69e" title="permalink">#</a> </h3> <p> The portfolio example from Kent Beck's book <a href="http://bit.ly/tddbe">Test-Driven Development By Example</a> also <a href="/2017/10/16/money-monoid">forms a monoid</a>. 
You can, again, implement an extension method that enables you to add lazy expressions together: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;Plus( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;source, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;addend) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;(()&nbsp;=&gt;&nbsp;source.Value.Plus(addend.Value)); }</pre> </p> <p> So far, you've seen several examples of implementations, but are they really monoids? All are clearly binary operations, but are they associative? Do identities exist? In other words, do the lazy binary operations obey the monoid laws? </p> <p> As usual, I'm not going to prove that they do, but I do want to share a set of FsCheck properties that demonstrate that the monoid laws hold. As an example, I'll share the properties for this lazy <code>Plus</code> method, but you can write similar properties for all of the above methods as well. 
</p> <p> You can verify the associativity law like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;LazyPlusIsAssociative( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;x, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;y, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;z) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x.Plus(y).Plus(z), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x.Plus(y.Plus(z)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Compare</span>.UsingBank); }</pre> </p> <p> Here, <code>Compare.UsingBank</code> is just a <a href="http://xunitpatterns.com/Test%20Utility%20Method.html">test utility API</a> to make the code more readable: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">Compare</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">ExpressionEqualityComparer</span>&nbsp;UsingBank&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ExpressionEqualityComparer</span>(); }</pre> </p> <p> This takes advantage of the overloads for <a href="https://xunit.github.io">xUnit.net</a>'s <code>Assert</code> methods that take custom equality comparers as an extra, optional argument. 
<code>ExpressionEqualityComparer</code> is implemented in the test code base: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">ExpressionEqualityComparer</span>&nbsp;: &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">IEqualityComparer</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;,&nbsp;<span style="color:#2b91af;">IEqualityComparer</span>&lt;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">Bank</span>&nbsp;bank; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;ExpressionEqualityComparer() &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;bank&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Bank</span>(); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;bank.AddRate(<span style="color:#a31515;">&quot;CHF&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;USD&quot;</span>,&nbsp;2); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;Equals(<span style="color:#2b91af;">IExpression</span>&nbsp;x,&nbsp;<span style="color:#2b91af;">IExpression</span>&nbsp;y) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;xm&nbsp;=&nbsp;bank.Reduce(x,&nbsp;<span style="color:#a31515;">&quot;USD&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;ym&nbsp;=&nbsp;bank.Reduce(y,&nbsp;<span style="color:#a31515;">&quot;USD&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">object</span>.Equals(xm,&nbsp;ym); &nbsp;&nbsp;&nbsp;&nbsp;} 
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">int</span>&nbsp;GetHashCode(<span style="color:#2b91af;">IExpression</span>&nbsp;obj) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;bank.Reduce(obj,&nbsp;<span style="color:#a31515;">&quot;USD&quot;</span>).GetHashCode(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;Equals(<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;x,&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;y) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Equals(x.Value,&nbsp;y.Value); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">int</span>&nbsp;GetHashCode(<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;obj) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;GetHashCode(obj.Value); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> If you think that the exchange rate between Dollars and Swiss Francs looks ridiculous, it's because I'm using the rate that Kent Beck used in his book, which is from 2002 (but otherwise timeless). 
</p> <p> The above property passes for hundreds of randomly generated input values, as is the case for this property, which verifies the left and right identity: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;LazyPlusHasIdentity(<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;x) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(x,&nbsp;x.Plus(<span style="color:#2b91af;">Plus</span>.Identity.ToLazy()),&nbsp;<span style="color:#2b91af;">Compare</span>.UsingBank); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(x,&nbsp;<span style="color:#2b91af;">Plus</span>.Identity.ToLazy().Plus(x),&nbsp;<span style="color:#2b91af;">Compare</span>.UsingBank); }</pre> </p> <p> These properties are just examples, not proofs. Still, they give confidence that lazy computations of monoids are themselves monoids. </p> <h3 id="6b5aa8f3a0b34d4c9d145eac3e59fe9b"> Lazy Roster combinations <a href="#6b5aa8f3a0b34d4c9d145eac3e59fe9b" title="permalink">#</a> </h3> <p> The last example you'll get in this article is the <code>Roster</code> example from the article on <a href="/2017/10/30/tuple-monoids">tuple monoids</a>. 
Here's yet another extension method that enables you to combine two lazy rosters: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Roster</span>&gt;&nbsp;Combine(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Roster</span>&gt;&nbsp;x,&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Roster</span>&gt;&nbsp;y) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Roster</span>&gt;(()&nbsp;=&gt;&nbsp;x.Value.Combine(y.Value)); }</pre> </p> <p> At this point it should be clear that there are essentially two variations in how the above extension methods are implemented. One variation is when the binary operation is implemented with an infix operator (like <code>+</code>, <code>||</code>, and so on), and another variation is when it's modelled as an instance method. How do these implementations generalise? </p> <p> I'm sure you could come up with an ad-hoc higher-order function, abstract base class, or interface to model such a generalisation, but I'm not motivated by <a href="/2018/09/17/typing-is-not-a-programming-bottleneck">saving keystrokes</a>. What I'm trying to uncover with this <a href="/2017/10/04/from-design-patterns-to-category-theory">overall article series</a> is how universal abstractions apply to programming. </p> <p> Which universal abstraction is in play here? </p> <h3 id="6d247327cf214fa698dc9ec29693bd6f"> Lazy monoids as applicative operations <a href="#6d247327cf214fa698dc9ec29693bd6f" title="permalink">#</a> </h3> <p> <a href="/2018/12/17/the-lazy-applicative-functor">Lazy&lt;T&gt; forms an applicative functor</a>. 
Using appropriate <code>Apply</code> overloads, you can rewrite all the above implementations in applicative style. Here's lazy addition: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;Add(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;x,&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;y) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">int</span>&gt;&nbsp;f&nbsp;=&nbsp;(i,&nbsp;j)&nbsp;=&gt;&nbsp;i&nbsp;+&nbsp;j; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;f.Apply(x).Apply(y); }</pre> </p> <p> This first declares a <code>Func</code> value <code>f</code> that invokes the non-lazy binary operation, in this case <code>+</code>. Next, you can leverage the applicative nature of <code>Lazy&lt;T&gt;</code> to apply <code>f</code> to the two lazy values <code>x</code> and <code>y</code>. 
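</p> <p> This article doesn't repeat the <code>Apply</code> overloads, so here's a minimal sketch of what they could look like. It assumes one overload that takes a plain two-argument <code>Func</code> and another that takes a lazy one-argument <code>Func</code>; the class names <code>LazyApplicative</code> and <code>ApplyDemo</code> are hypothetical, and the overloads in the linked article may differ in detail: </p> <p> <pre>using System;

// Hypothetical sketch; not necessarily the overloads from the linked article.
public static class LazyApplicative
{
    // Partially applies a two-argument function to a lazy value.
    // Nothing is evaluated until the resulting Value is read.
    public static Lazy&lt;Func&lt;T2, TResult&gt;&gt; Apply&lt;T1, T2, TResult&gt;(
        this Func&lt;T1, T2, TResult&gt; f,
        Lazy&lt;T1&gt; source)
    {
        return new Lazy&lt;Func&lt;T2, TResult&gt;&gt;(() =&gt; y =&gt; f(source.Value, y));
    }

    // Applies a lazily computed one-argument function to a lazy value.
    public static Lazy&lt;TResult&gt; Apply&lt;T, TResult&gt;(
        this Lazy&lt;Func&lt;T, TResult&gt;&gt; selector,
        Lazy&lt;T&gt; source)
    {
        return new Lazy&lt;TResult&gt;(() =&gt; selector.Value(source.Value));
    }
}

// Hypothetical usage example.
public static class ApplyDemo
{
    public static int Run()
    {
        Func&lt;int, int, int&gt; f = (i, j) =&gt; i + j;
        var x = new Lazy&lt;int&gt;(() =&gt; 2);
        var y = new Lazy&lt;int&gt;(() =&gt; 3);

        // Builds a deferred computation; neither x nor y is forced here.
        Lazy&lt;int&gt; sum = f.Apply(x).Apply(y);

        // Reading Value forces x and y and adds them.
        return sum.Value;
    }
}</pre> </p> <p> With overloads shaped like these, <code>f.Apply(x)</code> produces a <code>Lazy&lt;Func&lt;int, int&gt;&gt;</code>, and the second <code>Apply</code> call produces the final <code>Lazy&lt;int&gt;</code>. Neither call reads a <code>Value</code> property, so evaluation stays deferred. 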
</p> <p> Multiplication, as well as the Boolean operations, follow the exact same template: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;Multiply(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;x,&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;y) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">int</span>&gt;&nbsp;f&nbsp;=&nbsp;(i,&nbsp;j)&nbsp;=&gt;&nbsp;i&nbsp;*&nbsp;j; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;f.Apply(x).Apply(y); } <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;And(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;x,&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;y) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:blue;">bool</span>,&nbsp;<span style="color:blue;">bool</span>,&nbsp;<span style="color:blue;">bool</span>&gt;&nbsp;f&nbsp;=&nbsp;(b1,&nbsp;b2)&nbsp;=&gt;&nbsp;b1&nbsp;&amp;&amp;&nbsp;b2; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;f.Apply(x).Apply(y); } <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;Or(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span 
style="color:blue;">bool</span>&gt;&nbsp;x,&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">bool</span>&gt;&nbsp;y) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:blue;">bool</span>,&nbsp;<span style="color:blue;">bool</span>,&nbsp;<span style="color:blue;">bool</span>&gt;&nbsp;f&nbsp;=&nbsp;(b1,&nbsp;b2)&nbsp;=&gt;&nbsp;b1&nbsp;||&nbsp;b2; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;f.Apply(x).Apply(y); }</pre> </p> <p> Notice that in all four implementations, the second line of code is verbatim the same: <code>return f.Apply(x).Apply(y);</code> </p> <p> Does this generalisation also hold when the underlying, non-lazy binary operation is modelled as an instance method, as is the case of e.g. angular addition? Yes, indeed: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Angle</span>&gt;&nbsp;Add(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Angle</span>&gt;&nbsp;x,&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Angle</span>&gt;&nbsp;y) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">Angle</span>,&nbsp;<span style="color:#2b91af;">Angle</span>,&nbsp;<span style="color:#2b91af;">Angle</span>&gt;&nbsp;f&nbsp;=&nbsp;(i,&nbsp;j)&nbsp;=&gt;&nbsp;i.Add(j); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;f.Apply(x).Apply(y); }</pre> </p> <p> You can implement the <code>Combine</code> method for lazy <code>Roster</code> objects in the same way, as well as <code>Plus</code> for lazy monetary expressions. The latter is worth revisiting. 
</p> <h3 id="445af7622459483ab64f4ff31ceb1c2e"> Using the Lazy functor over portfolio expressions <a href="#445af7622459483ab64f4ff31ceb1c2e" title="permalink">#</a> </h3> <p> The lazy <code>Plus</code> implementation looks like all of the above <code>Apply</code>-based implementations: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;Plus( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;source,&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;addend) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">IExpression</span>,&nbsp;<span style="color:#2b91af;">IExpression</span>,&nbsp;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;f&nbsp;=&nbsp;(x,&nbsp;y)&nbsp;=&gt;&nbsp;x.Plus(y); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;f.Apply(source).Apply(addend); }</pre> </p> <p> In my article, however, you saw how, when <code>Plus</code> is a monoid, you can implement <code>Times</code> as an extension method. Can you implement a lazy version of <code>Times</code> as well? Must it be another extension method? 
</p> <p> Yes, but instead of an ad-hoc implementation, you can take advantage of the functor nature of Lazy: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;Times(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;exp,&nbsp;<span style="color:blue;">int</span>&nbsp;multiplier) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;exp.Select(x&nbsp;=&gt;&nbsp;x.Times(multiplier)); }</pre> </p> <p> Notice that instead of explicitly reaching into the lazy <code>Value</code>, you can simply call <code>Select</code> on <code>exp</code>. This lazily projects the <code>Times</code> operation, while preserving the invariants of <code>Lazy&lt;T&gt;</code> (i.e. that the computation is deferred until you ultimately access the <code>Value</code> property). </p> <p> You can implement a lazy version of <code>Reduce</code> in the same way: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Money</span>&gt;&nbsp;Reduce(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;exp,&nbsp;<span style="color:#2b91af;">Bank</span>&nbsp;bank,&nbsp;<span style="color:blue;">string</span>&nbsp;to) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;exp.Select(x&nbsp;=&gt;&nbsp;x.Reduce(bank,&nbsp;to)); }</pre> </p> <p> The question is, however, is it even worthwhile? Do you need to create all these overloads, or could you just leverage <code>Select</code> when you have a lazy value? 
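</p> <p> Either way, everything hinges on <code>Select</code>, which this article uses without showing. Here's a minimal sketch of such a method, assuming it's an ordinary extension method on <code>Lazy&lt;T&gt;</code>. The class names <code>LazyFunctor</code> and <code>SelectDemo</code> are hypothetical: </p> <p> <pre>using System;

// Hypothetical sketch of a functor-style Select for Lazy&lt;T&gt;.
public static class LazyFunctor
{
    // Maps selector over a lazy value without forcing it.
    public static Lazy&lt;TResult&gt; Select&lt;T, TResult&gt;(
        this Lazy&lt;T&gt; source,
        Func&lt;T, TResult&gt; selector)
    {
        return new Lazy&lt;TResult&gt;(() =&gt; selector(source.Value));
    }
}

// Hypothetical usage example.
public static class SelectDemo
{
    public static int Run()
    {
        var number = new Lazy&lt;int&gt;(() =&gt; 21);

        // The projection is only recorded here, not executed.
        Lazy&lt;int&gt; doubled = number.Select(i =&gt; i * 2);

        // Reading Value forces number and then applies the projection.
        return doubled.Value;
    }
}</pre> </p> <p> Since the projection runs inside the new <code>Lazy&lt;TResult&gt;</code>, calling <code>Select</code> never touches <code>source.Value</code>; that only happens once a caller reads the projected <code>Value</code>. 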
</p> <p> For example, if the above <code>Reduce</code> overload didn't exist, you'd still be able to work with the portfolio API like this: </p> <p> <pre><span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;portfolio&nbsp;=&nbsp;<span style="color:green;">//&nbsp;...</span> <span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Money</span>&gt;&nbsp;result&nbsp;=&nbsp;portfolio.Select(x&nbsp;=&gt;&nbsp;x.Reduce(bank,&nbsp;<span style="color:#a31515;">&quot;USD&quot;</span>));</pre> </p> <p> If you only occasionally use <code>Reduce</code>, then perhaps this is good enough. If you frequently call <code>Reduce</code>, however, it might be worth adding the above overload, in which case you could then instead write: </p> <p> <pre><span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">IExpression</span>&gt;&nbsp;portfolio&nbsp;=&nbsp;<span style="color:green;">//&nbsp;...</span> <span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Money</span>&gt;&nbsp;result&nbsp;=&nbsp;portfolio.Reduce(bank,&nbsp;<span style="color:#a31515;">&quot;USD&quot;</span>);</pre> </p> <p> In both cases, however, I think that you'd be putting the concept of an applicative functor to good use. </p> <h3 id="a5942465683c459a92457ea13e3bd21d"> Towards generalisation <a href="#a5942465683c459a92457ea13e3bd21d" title="permalink">#</a> </h3> <p> Is the applicative style better than the initial ad-hoc implementations? That depends on how you evaluate 'better'. If you count lines of code, then the applicative style is twice as verbose as the ad-hoc implementations. 
In other words, this: </p> <p> <pre><span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:blue;">int</span>&gt;(()&nbsp;=&gt;&nbsp;x.Value&nbsp;+&nbsp;y.Value);</pre> </p> <p> seems simpler than this: </p> <p> <pre><span style="color:#2b91af;">Func</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">int</span>&gt;&nbsp;f&nbsp;=&nbsp;(i,&nbsp;j)&nbsp;=&gt;&nbsp;i&nbsp;+&nbsp;j; <span style="color:blue;">return</span>&nbsp;f.Apply(x).Apply(y);</pre> </p> <p> This is, however, mostly because C# is too weak to express such abstractions in an elegant way. In <a href="https://fsharp.org">F#</a>, using the custom <code>&lt;*&gt;</code> operator from <a href="/2018/12/17/the-lazy-applicative-functor">the article on the Lazy applicative functor</a>, you could express the lazy addition as simply as: </p> <p> <pre><span style="color:blue;">lazy</span>&nbsp;(+)&nbsp;&lt;*&gt;&nbsp;x&nbsp;&lt;*&gt;&nbsp;y</pre> </p> <p> In <a href="https://www.haskell.org">Haskell</a> (if we, once more, pretend that <code>Identity</code> is equivalent to <code>Lazy</code>), you can simplify even further to: </p> <p> <pre>(+) &lt;$&gt; x &lt;*&gt; y</pre> </p> <p> Or rather, if you want it in <a href="https://en.wikipedia.org/wiki/Tacit_programming">point-free style</a>, <code>liftA2 (+)</code>. </p> <h3 id="f418f44f772b44c287365c7054d51aa1"> Summary <a href="#f418f44f772b44c287365c7054d51aa1" title="permalink">#</a> </h3> <p> The point of this article series isn't to save keystrokes, but to identify universal laws of computing, even as they relate to object-oriented programming. The pattern seems clear enough that I dare propose the following: </p> <p> <em>All monoids remain monoids under lazy computation.</em> </p> <p> In a future article I'll offer further support for that proposition. 
</p> <p> <strong>Next:</strong> <a href="/2017/11/20/monoids-accumulate">Monoids accumulate</a>. </p> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. A pure Test Spy https://blog.ploeh.dk/2019/04/08/a-pure-test-spy 2019-04-08T06:02:00+00:00 Mark Seemann <div id="post"> <p> <em>Ad-hoc Test Spies can be implemented in Haskell using the Writer monad.</em> </p> <p> In a previous article on <a href="/2019/03/11/an-example-of-state-based-testing-in-haskell">state-based testing in Haskell</a>, I made the following throw-away statement: <blockquote> "you <em>could</em> write an ad-hoc Mock using, for example, the Writer monad" </blockquote> In that article, I didn't pursue that thought, since the theme was another. Instead, I'll expand on it here. </p> <h3 id="552a47e82a99466ebb8fe9d840b80a2f"> Test Double continuum <a href="#552a47e82a99466ebb8fe9d840b80a2f" title="permalink">#</a> </h3> <p> More than a decade ago, I wrote an MSDN Magazine article called <em>Exploring The Continuum Of Test Doubles</em>. It was in the September 2007 issue, and since then, the Magazine restructured so that the article is no longer available online. You can still <a href="https://msdn.microsoft.com/en-us/magazine/msdn-magazine-issues.aspx">download the entire issue as a single file</a> and read the article offline, should you want to. </p> <p> In the article, I made the argument that the classification of <a href="http://xunitpatterns.com/Test%20Double.html">Test Doubles</a> presented in the excellent <a href="http://bit.ly/xunitpatterns">xUnit Test Patterns</a> should be thought of more as a continuum with vague and fuzzy transitions, rather than discrete categories. </p> <p> <img src="/content/binary/msdn-test-double-continuum.png" alt="Spectrum of Test Doubles."> </p> <p> This figure appeared in the original article. 
Given that the entire MSDN Magazine issue is available for free, and that I'm the original author of the article, I consider it fair use to repeat it here. </p> <p> The point is that it's not always clear whether a Test Double is, say, a Mock, or a Spy. What I'll show you in this article is closer to a Test Spy than to a Mock, but since the distinction is blurred anyway, I think that I can get away with it. </p> <h3 id="200ba4d9157442fe9caef2429371f7c6"> Test Spy <a href="#200ba4d9157442fe9caef2429371f7c6" title="permalink">#</a> </h3> <p> <em>xUnit Test Patterns</em> defines a Test Spy as a Test Double that captures "the indirect output calls made to another component by the SUT [System Under Test] for later verification by the test." When, as shown in <a href="/2019/02/25/an-example-of-interaction-based-testing-in-c">a previous article</a>, you use <code>Mock&lt;T&gt;.Verify</code> to assert that an interaction took place, you're using the Test Double more as a Spy than a Mock: </p> <p> <pre>repoTD.Verify(r&nbsp;=&gt;&nbsp;r.Update(user));</pre> </p> <p> Strictly speaking, a Mock is a Test Double that <em>immediately</em> fails the test if any unexpected interaction takes place. People often call those <em>Strict Mocks</em>, but according to the book, that's a Mock. If the Test Double only records what happens, so that you can later query it to verify whether some interaction took place, it's closer to being a Test Spy. </p> <p> Whether you call it a Mock or a Spy, you can implement verification similar to the above <code>Verify</code> method in functional programming using the Writer monad. </p> <h3 id="cf14cbf8e65a4dd6a47f879c1b850de5"> Writer-based Spy <a href="#cf14cbf8e65a4dd6a47f879c1b850de5" title="permalink">#</a> </h3> <p> I'll show you a single example in <a href="https://www.haskell.org">Haskell</a>. 
In <a href="http://blog.ploeh.dk/2018/07/30/flattening-arrow-code-using-a-stack-of-monads">a previous article</a>, you saw a simplified function to implement a restaurant reservation feature, repeated here for your convenience: </p> <p> <pre><span style="color:#2b91af;">tryAccept</span>&nbsp;::&nbsp;<span style="color:#2b91af;">Int</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Reservation</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">MaybeT</span>&nbsp;<span style="color:blue;">ReservationsProgram</span>&nbsp;<span style="color:#2b91af;">Int</span> tryAccept&nbsp;capacity&nbsp;reservation&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;guard&nbsp;=&lt;&lt;&nbsp;isReservationInFuture&nbsp;reservation &nbsp;&nbsp;reservations&nbsp;&lt;-&nbsp;readReservations&nbsp;$&nbsp;reservationDate&nbsp;reservation &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;reservedSeats&nbsp;=&nbsp;<span style="color:blue;">sum</span>&nbsp;$&nbsp;reservationQuantity&nbsp;&lt;$&gt;&nbsp;reservations &nbsp;&nbsp;guard&nbsp;$&nbsp;reservedSeats&nbsp;+&nbsp;reservationQuantity&nbsp;reservation&nbsp;&lt;=&nbsp;capacity &nbsp;&nbsp;create&nbsp;$&nbsp;reservation&nbsp;{&nbsp;reservationIsAccepted&nbsp;=&nbsp;True&nbsp;} </pre> </p> <p> This function runs in the <code>MaybeT</code> monad, so the two <code>guard</code> functions could easily prevent it from running 'to completion'. In the happy path, though, execution should reach 'the end' of the function and call the <code>create</code> function. </p> <p> In order to test this happy path, you'll need to not only run a test-specific interpreter over the <code>ReservationsProgram</code> free monad, you should also verify that <code>reservationIsAccepted</code> is <code>True</code>. 
</p> <p> You can do this using the <code>Writer</code> monad to implement a Test Spy: </p> <p> <pre>testProperty&nbsp;<span style="color:#a31515;">&quot;tryAccept,&nbsp;happy&nbsp;path&quot;</span>&nbsp;$&nbsp;\ &nbsp;&nbsp;(NonNegative&nbsp;i) &nbsp;&nbsp;(<span style="color:blue;">fmap</span>&nbsp;getReservation&nbsp;-&gt;&nbsp;reservations) &nbsp;&nbsp;(ArbReservation&nbsp;reservation) &nbsp;&nbsp;expected &nbsp;&nbsp;-&gt; &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;spy&nbsp;(IsReservationInFuture&nbsp;_&nbsp;next)&nbsp;=&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;next&nbsp;True &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;spy&nbsp;(ReadReservations&nbsp;_&nbsp;next)&nbsp;=&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;next&nbsp;reservations &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;spy&nbsp;(Create&nbsp;r&nbsp;next)&nbsp;=&nbsp;tell&nbsp;[r]&nbsp;&gt;&gt;&nbsp;<span style="color:blue;">return</span>&nbsp;(next&nbsp;expected) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;reservedSeats&nbsp;=&nbsp;<span style="color:blue;">sum</span>&nbsp;$&nbsp;reservationQuantity&nbsp;&lt;$&gt;&nbsp;reservations &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;capacity&nbsp;=&nbsp;reservedSeats&nbsp;+&nbsp;reservationQuantity&nbsp;reservation&nbsp;+&nbsp;i &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(actual,&nbsp;observedReservations)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;runWriter&nbsp;$&nbsp;foldFreeT&nbsp;spy&nbsp;$&nbsp;runMaybeT&nbsp;$&nbsp;tryAccept&nbsp;capacity&nbsp;reservation &nbsp;&nbsp;<span style="color:blue;">in</span>&nbsp;&nbsp;Just&nbsp;expected&nbsp;==&nbsp;actual&nbsp;&amp;&amp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[True]&nbsp;==&nbsp;(reservationIsAccepted&nbsp;&lt;$&gt;&nbsp;observedReservations)</pre> </p> <p> This test is an <a href="http://blog.ploeh.dk/2018/05/07/inlined-hunit-test-lists">inlined</a> QuickCheck-based property. The entire <a href="https://github.com/ploeh/dependency-injection-revisited">source code is available on GitHub</a>. 
</p> <p> Notice the <code>spy</code> function. As the name implies, it's the Test Spy for the test. Its full type is: </p> <p> <pre>spy&nbsp;::&nbsp;Monad&nbsp;m&nbsp;=&gt;&nbsp;ReservationsInstruction&nbsp;a&nbsp;-&gt;&nbsp;WriterT&nbsp;[Reservation]&nbsp;m&nbsp;a</pre> </p> <p> This is a function that, for a given <code>ReservationsInstruction</code> value returns a <code>WriterT</code> value where the type of data being written is <code>[Reservation]</code>. The function only writes to the writer context in one of the three cases: the <code>Create</code> case. The <code>Create</code> case carries with it a <code>Reservation</code> value here named <code>r</code>. Before returning the <code>next</code> step in interpreting the free monad, the <code>spy</code> function calls <code>tell</code>, thereby writing a singleton list of <code>[r]</code> to the writer context. </p> <p> In the Act phase of the test, it calls the <code>tryAccept</code> function and proceeds to interpret the result, which is a <code>MaybeT ReservationsProgram Int</code> value. Calling <code>runMaybeT</code> produces a <code>ReservationsProgram (Maybe Int)</code>, which you can then interpret with <code>foldFreeT spy</code>. This returns a <code>Writer [Reservation] (Maybe Int)</code>, which you can finally run with <code>runWriter</code> to get a <code>(Maybe Int, [Reservation])</code> tuple. Thus, <code>actual</code> is a <code>Maybe Int</code> value, and <code>observedReservations</code> is a <code>[Reservation]</code> value - the reservation that was written by <code>spy</code> using <code>tell</code>. </p> <p> The Assert phase of the test is a Boolean expression that checks that <code>actual</code> is as expected, and that <code>reservationIsAccepted</code> of the observed reservation is <code>True</code>. </p> <p> It takes a little while to make the pieces of the puzzle fit, but it's basically just standard Haskell library functions clicked together. 
</p> <h3 id="fd3478dba1a241a7bb95d9a97140ebac"> Summary <a href="#fd3478dba1a241a7bb95d9a97140ebac" title="permalink">#</a> </h3> <p> People sometimes ask me: <em>How do Mocks and Stubs work in functional programming?</em> </p> <p> In general, my answer is that you don't need Mocks and Stubs because when functions are <a href="https://en.wikipedia.org/wiki/Pure_function">pure</a>, you don't need to test interactions. Sooner or later, though, you may run into higher-level interactions, even if they're <a href="http://blog.ploeh.dk/2017/07/10/pure-interactions">pure interactions</a>, and you'll likely want to unit test those. </p> <p> In a previous article you saw how to apply <a href="/2019/03/11/an-example-of-state-based-testing-in-haskell">state-based testing in Haskell, using the State monad</a>. In this article you've seen how you can create ad-hoc Mocks or Spies with the Writer monad. No auto-magical test-specific 'isolation frameworks' are required. </p> </div><hr> An example of state-based testing in C# https://blog.ploeh.dk/2019/04/01/an-example-of-state-based-testing-in-c 2019-04-01T05:50:00+00:00 Mark Seemann <div id="post"> <p> <em>An example of avoiding Mocks and Stubs in C# unit testing.</em> </p> <p> This article is an instalment in an article series about how to move <a href="/2019/02/18/from-interaction-based-to-state-based-testing">from interaction-based testing to state-based testing</a>. In the previous article, you saw <a href="/2019/03/25/an-example-of-state-based-testing-in-f">an example of a pragmatic state-based test in F#</a>. You can now take your new-found knowledge and apply it to the <a href="/2019/02/25/an-example-of-interaction-based-testing-in-c">original C# example</a>. 
</p> <p> In the spirit of <a href="http://bit.ly/xunitpatterns">xUnit Test Patterns</a>, in this article you'll see how to refactor the tests while keeping the implementation code constant. </p> <p> The code shown in this article is <a href="https://github.com/ploeh/UserManagement">available on GitHub</a>. </p> <h3 id="da7f393fdef24bae9e72c4bcad7e8373"> Connect two users <a href="#da7f393fdef24bae9e72c4bcad7e8373" title="permalink">#</a> </h3> <p> The <a href="/2019/02/25/an-example-of-interaction-based-testing-in-c">previous article</a> provides more details on the System Under Test (SUT), but here it is, repeated, for your convenience: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">ConnectionsController</span>&nbsp;:&nbsp;<span style="color:#2b91af;">ApiController</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;ConnectionsController( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">IUserReader</span>&nbsp;userReader, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">IUserRepository</span>&nbsp;userRepository) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;UserReader&nbsp;=&nbsp;userReader; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;UserRepository&nbsp;=&nbsp;userRepository; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">IUserReader</span>&nbsp;UserReader&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">IUserRepository</span>&nbsp;UserRepository&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">IHttpActionResult</span>&nbsp;Post(<span 
style="color:blue;">string</span>&nbsp;userId,&nbsp;<span style="color:blue;">string</span>&nbsp;otherUserId) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;userRes&nbsp;=&nbsp;UserReader.Lookup(userId).SelectError( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;error&nbsp;=&gt;&nbsp;error.Accept(<span style="color:#2b91af;">UserLookupError</span>.Switch( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;onInvalidId:&nbsp;<span style="color:#a31515;">&quot;Invalid&nbsp;user&nbsp;ID.&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;onNotFound:&nbsp;&nbsp;<span style="color:#a31515;">&quot;User&nbsp;not&nbsp;found.&quot;</span>))); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;otherUserRes&nbsp;=&nbsp;UserReader.Lookup(otherUserId).SelectError( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;error&nbsp;=&gt;&nbsp;error.Accept(<span style="color:#2b91af;">UserLookupError</span>.Switch( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;onInvalidId:&nbsp;<span style="color:#a31515;">&quot;Invalid&nbsp;ID&nbsp;for&nbsp;other&nbsp;user.&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;onNotFound:&nbsp;&nbsp;<span style="color:#a31515;">&quot;Other&nbsp;user&nbsp;not&nbsp;found.&quot;</span>))); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;connect&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;user&nbsp;<span style="color:blue;">in</span>&nbsp;userRes &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;otherUser&nbsp;<span 
style="color:blue;">in</span>&nbsp;otherUserRes &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">select</span>&nbsp;Connect(user,&nbsp;otherUser); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;connect.SelectBoth(Ok,&nbsp;BadRequest).Bifold(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:#2b91af;">User</span>&nbsp;Connect(<span style="color:#2b91af;">User</span>&nbsp;user,&nbsp;<span style="color:#2b91af;">User</span>&nbsp;otherUser) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;user.Connect(otherUser); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;UserRepository.Update(user); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;otherUser; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> This implementation code is a simplification of the code example that serves as an example running through my two <a href="https://cleancoders.com">Clean Coders</a> videos, <a href="https://cleancoders.com/episode/humane-code-real-episode-4/show">Church Visitor</a> and <a href="https://cleancoders.com/episode/humane-code-real-episode-5/show">Preserved in translation</a>. 
</p> <h3 id="4e6812ee83ad4cda9b8deedef99ab4e1"> A Fake database <a href="#4e6812ee83ad4cda9b8deedef99ab4e1" title="permalink">#</a> </h3> <p> As in the previous article, you can define a test-specific <a href="http://xunitpatterns.com/Fake%20Object.html">Fake</a> database: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">FakeDB</span>&nbsp;:&nbsp;<span style="color:#2b91af;">Collection</span>&lt;<span style="color:#2b91af;">User</span>&gt;,&nbsp;<span style="color:#2b91af;">IUserReader</span>,&nbsp;<span style="color:#2b91af;">IUserRepository</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">IResult</span>&lt;<span style="color:#2b91af;">User</span>,&nbsp;<span style="color:#2b91af;">IUserLookupError</span>&gt;&nbsp;Lookup(<span style="color:blue;">string</span>&nbsp;id) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(!(<span style="color:blue;">int</span>.TryParse(id,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">int</span>&nbsp;i))) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:#2b91af;">Result</span>.Error&lt;<span style="color:#2b91af;">User</span>,&nbsp;<span style="color:#2b91af;">IUserLookupError</span>&gt;(<span style="color:#2b91af;">UserLookupError</span>.InvalidId); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;user&nbsp;=&nbsp;<span style="color:blue;">this</span>.FirstOrDefault(u&nbsp;=&gt;&nbsp;u.Id&nbsp;==&nbsp;i); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(user&nbsp;==&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span 
style="color:#2b91af;">Result</span>.Error&lt;<span style="color:#2b91af;">User</span>,&nbsp;<span style="color:#2b91af;">IUserLookupError</span>&gt;(<span style="color:#2b91af;">UserLookupError</span>.NotFound); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:#2b91af;">Result</span>.Success&lt;<span style="color:#2b91af;">User</span>,&nbsp;<span style="color:#2b91af;">IUserLookupError</span>&gt;(user); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;IsDirty&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;<span style="color:blue;">set</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;Update(<span style="color:#2b91af;">User</span>&nbsp;user) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;IsDirty&nbsp;=&nbsp;<span style="color:blue;">true</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(!Contains(user)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Add(user); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> This is one of the few cases where I find inheritance more convenient than composition. By deriving from <code>Collection&lt;User&gt;</code>, you don't have to explicitly write code to expose a <em>Retrieval Interface</em>. The entirety of a standard collection API is already available via the base class. Had this class been part of a public API, I'd be concerned that inheritance could introduce future breaking changes, but as part of a suite of unit tests, I hope that I've made the right decision. </p> <p> Although you can derive a Fake database from a base class, you can still implement required interfaces - in this case <code>IUserReader</code> and <code>IUserRepository</code>. 
The <code>Update</code> method is the easiest one to implement, since it simply sets the <code>IsDirty</code> flag to <code>true</code> and adds the <code>user</code> if it's not already part of the collection. </p> <p> The <code>IsDirty</code> flag is the only custom Retrieval Interface added to the <code>FakeDB</code> class. As the previous article explains, this flag provides a convenient way to verify whether or not the database has changed. </p> <p> The <code>Lookup</code> method is a bit more involved, since it has to support all three outcomes implied by the protocol: <ul> <li>If the <code>id</code> is invalid, a result to that effect is returned.</li> <li>If the user isn't found, a result to that effect is returned.</li> <li>If the user with the requested <code>id</code> is found, then that user is returned.</li> </ul> This is a typical quality of a Fake: it contains <em>some</em> production-like behaviour, while still taking shortcuts compared to a full production implementation. In this case, it properly adheres to the protocol implied by the interface and protects its invariants. It still doesn't implement persistent storage, though. </p> <h3 id="92480e7451ad4c02a524f9fcd1ef1c8d"> Happy path test case <a href="#92480e7451ad4c02a524f9fcd1ef1c8d" title="permalink">#</a> </h3> <p> This is all you need in terms of <a href="http://xunitpatterns.com/Test%20Double.html">Test Doubles</a>. You now have a test-specific <code>IUserReader</code> and <code>IUserRepository</code> implementation that you can pass to the <code>Post</code> method. Notice that a single class implements multiple interfaces. This is often key to being able to implement a Fake object in the first place. 
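</p> <p> To see why it can matter that one class fills both roles, consider the following minimal sketch. It isn't taken from the sample code base: the types are hypothetical, simplified stand-ins (here, <code>Lookup</code> takes an <code>int</code> and returns <code>null</code> instead of an <code>IResult</code>), and the <code>IsDirty</code> flag is left out. It only illustrates that a write made through the repository role is visible through the reader role, because both roles share the same underlying collection: </p> <p> <pre>using System;
using System.Collections.ObjectModel;
using System.Linq;

// Hypothetical, simplified stand-ins for the article's types.
public class User
{
    public User(int id) { Id = id; }
    public int Id { get; }
}

public interface IUserReader
{
    // Assumption: returns null when no user is found,
    // instead of the article's IResult&lt;User, IUserLookupError&gt;.
    User Lookup(int id);
}

public interface IUserRepository
{
    void Update(User user);
}

// One class, two roles, one shared collection.
public class FakeDB : Collection&lt;User&gt;, IUserReader, IUserRepository
{
    public User Lookup(int id) =&gt; this.FirstOrDefault(u =&gt; u.Id == id);

    public void Update(User user)
    {
        if (!Contains(user))
            Add(user);
    }
}

public static class Program
{
    public static void Main()
    {
        var db = new FakeDB();
        IUserReader reader = db;         // same object, 'read' role
        IUserRepository repository = db; // same object, 'write' role

        repository.Update(new User(42));

        // The write is visible through the reader, since both roles share state.
        Console.WriteLine(reader.Lookup(42) != null); // True
        Console.WriteLine(reader.Lookup(7) != null);  // False
    }
}</pre> </p> <p> Had the reader and the repository been two separate Test Doubles, you'd typically have to configure each of them to agree with the other, and that tends to pull the tests back towards specifying interactions. 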
</p> <p> Like in the previous article, you can start by exercising the happy path where a user successfully connects with another user: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">UserManagementTestConventions</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;UsersSuccessfullyConnect( &nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#2b91af;">Frozen</span>(<span style="color:#2b91af;">Matching</span>.ImplementedInterfaces)]<span style="color:#2b91af;">FakeDB</span>&nbsp;db, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">User</span>&nbsp;user, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">User</span>&nbsp;otherUser, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">ConnectionsController</span>&nbsp;sut) { &nbsp;&nbsp;&nbsp;&nbsp;db.Add(user); &nbsp;&nbsp;&nbsp;&nbsp;db.Add(otherUser); &nbsp;&nbsp;&nbsp;&nbsp;db.IsDirty&nbsp;=&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;sut.Post(user.Id.ToString(),&nbsp;otherUser.Id.ToString()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;ok&nbsp;=&nbsp;<span style="color:#2b91af;">Assert</span>.IsAssignableFrom&lt;<span style="color:#2b91af;">OkNegotiatedContentResult</span>&lt;<span style="color:#2b91af;">User</span>&gt;&gt;(actual); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(otherUser,&nbsp;ok.Content); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.True(db.IsDirty); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Contains(otherUser.Id,&nbsp;user.Connections); }</pre> </p> <p> This, and all other tests in this article use <a href="https://xunit.github.io">xUnit.net</a> 2.3.1 and <a href="https://github.com/AutoFixture/AutoFixture">AutoFixture</a> 4.1.0. 
</p> <p> The test is organised according to my standard <a href="http://blog.ploeh.dk/2013/06/24/a-heuristic-for-formatting-code-according-to-the-aaa-pattern">heuristic for formatting tests according to the Arrange Act Assert pattern</a>. In the Arrange phase, it adds the two valid <code>User</code> objects to the Fake <code>db</code> and sets the <code>IsDirty</code> flag to false. </p> <p> Setting the flag is necessary because this is object-oriented code, where objects have mutable state. In the previous articles with examples in <a href="https://fsharp.org">F#</a> and <a href="https://www.haskell.org">Haskell</a>, the <code>User</code> types were immutable. Connecting two users didn't mutate one of the users, but rather returned a new <code>User</code> value, as this F# example demonstrates: </p> <p> <pre><span style="color:green;">//&nbsp;User&nbsp;-&gt;&nbsp;User&nbsp;-&gt;&nbsp;User</span> <span style="color:blue;">let</span>&nbsp;addConnection&nbsp;user&nbsp;otherUser&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;user&nbsp;<span style="color:blue;">with</span>&nbsp;ConnectedUsers&nbsp;=&nbsp;otherUser&nbsp;::&nbsp;user.ConnectedUsers&nbsp;}</pre> </p> <p> In the current object-oriented code base, however, connecting one user to another is an instance method on the <code>User</code> class that mutates its state: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;Connect(<span style="color:#2b91af;">User</span>&nbsp;otherUser) { &nbsp;&nbsp;&nbsp;&nbsp;connections.Add(otherUser.Id); }</pre> </p> <p> As a consequence, the <code>Post</code> method could, if someone made a mistake in its implementation, call <code>user.Connect</code>, but forget to invoke <code>UserRepository.Update</code>. Even if that happened, then all the other assertions would pass. This is the reason that you need the <code>Assert.True(db.IsDirty)</code> assertion in the Assert phase of the test. 
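</p> <p> A minimal sketch of that scenario, with hypothetical, cut-down stand-ins for the article's types, might look like this. <code>BuggyConnect</code> is an invented example of a defective implementation; it's not part of the code base, and this <code>FakeDB</code> implements no interfaces: </p> <p> <pre>using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

// Hypothetical, simplified User with mutable connections.
public class User
{
    private readonly List&lt;int&gt; connections = new List&lt;int&gt;();

    public User(int id) { Id = id; }
    public int Id { get; }
    public IReadOnlyCollection&lt;int&gt; Connections =&gt; connections;

    public void Connect(User otherUser)
    {
        connections.Add(otherUser.Id);
    }
}

// Cut-down Fake database: only the parts needed for this illustration.
public class FakeDB : Collection&lt;User&gt;
{
    public bool IsDirty { get; set; }

    public void Update(User user)
    {
        IsDirty = true;
        if (!Contains(user))
            Add(user);
    }
}

public static class Program
{
    // Invented defect: mutates the user, but never saves it.
    public static void BuggyConnect(FakeDB db, User user, User otherUser)
    {
        user.Connect(otherUser);
        // Missing: db.Update(user);
    }

    public static void Main()
    {
        var db = new FakeDB();
        var user = new User(1);
        var otherUser = new User(2);
        db.Add(user);
        db.Add(otherUser);
        db.IsDirty = false;

        BuggyConnect(db, user, otherUser);

        // The in-memory object was mutated, so this check can't detect the defect...
        Console.WriteLine(user.Connections.Contains(otherUser.Id)); // True
        // ...but the flag reveals that Update was never called.
        Console.WriteLine(db.IsDirty);                              // False
    }
}</pre> </p> <p> Since the <code>user</code> object in the Fake database is the same object that <code>BuggyConnect</code> mutates, an assertion against <code>Connections</code> alone would pass. Only the <code>IsDirty</code> flag exposes the missing <code>Update</code> call. 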
</p> <p> While we can apply to object-oriented code what we've learned from functional programming, the latter remains simpler. </p> <h3 id="011dc9e733ad44c9a656c07b2d420e1f"> Error test cases <a href="#011dc9e733ad44c9a656c07b2d420e1f" title="permalink">#</a> </h3> <p> While there's one happy path, there are four distinct error paths that you ought to cover. You can use the Fake database for that as well: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">UserManagementTestConventions</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;UsersFailToConnectWhenUserIdIsInvalid( &nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#2b91af;">Frozen</span>(<span style="color:#2b91af;">Matching</span>.ImplementedInterfaces)]<span style="color:#2b91af;">FakeDB</span>&nbsp;db, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;userId, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">User</span>&nbsp;otherUser, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">ConnectionsController</span>&nbsp;sut) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.False(<span style="color:blue;">int</span>.TryParse(userId,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">var</span>&nbsp;_)); &nbsp;&nbsp;&nbsp;&nbsp;db.Add(otherUser); &nbsp;&nbsp;&nbsp;&nbsp;db.IsDirty&nbsp;=&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;sut.Post(userId,&nbsp;otherUser.Id.ToString()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;err&nbsp;=&nbsp;<span style="color:#2b91af;">Assert</span>.IsAssignableFrom&lt;<span style="color:#2b91af;">BadRequestErrorMessageResult</span>&gt;(actual); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(<span style="color:#a31515;">&quot;Invalid&nbsp;user&nbsp;ID.&quot;</span>,&nbsp;err.Message); 
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.False(db.IsDirty); } [<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">UserManagementTestConventions</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;UsersFailToConnectWhenOtherUserIdIsInvalid( &nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#2b91af;">Frozen</span>(<span style="color:#2b91af;">Matching</span>.ImplementedInterfaces)]<span style="color:#2b91af;">FakeDB</span>&nbsp;db, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">User</span>&nbsp;user, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;otherUserId, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">ConnectionsController</span>&nbsp;sut) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.False(<span style="color:blue;">int</span>.TryParse(otherUserId,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">var</span>&nbsp;_)); &nbsp;&nbsp;&nbsp;&nbsp;db.Add(user); &nbsp;&nbsp;&nbsp;&nbsp;db.IsDirty&nbsp;=&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;sut.Post(user.Id.ToString(),&nbsp;otherUserId); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;err&nbsp;=&nbsp;<span style="color:#2b91af;">Assert</span>.IsAssignableFrom&lt;<span style="color:#2b91af;">BadRequestErrorMessageResult</span>&gt;(actual); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(<span style="color:#a31515;">&quot;Invalid&nbsp;ID&nbsp;for&nbsp;other&nbsp;user.&quot;</span>,&nbsp;err.Message); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.False(db.IsDirty); } [<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">UserManagementTestConventions</span>] <span style="color:blue;">public</span>&nbsp;<span 
style="color:blue;">void</span>&nbsp;UsersDoNotConnectWhenUserDoesNotExist( &nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#2b91af;">Frozen</span>(<span style="color:#2b91af;">Matching</span>.ImplementedInterfaces)]<span style="color:#2b91af;">FakeDB</span>&nbsp;db, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;userId, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">User</span>&nbsp;otherUser, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">ConnectionsController</span>&nbsp;sut) { &nbsp;&nbsp;&nbsp;&nbsp;db.Add(otherUser); &nbsp;&nbsp;&nbsp;&nbsp;db.IsDirty&nbsp;=&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;sut.Post(userId.ToString(),&nbsp;otherUser.Id.ToString()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;err&nbsp;=&nbsp;<span style="color:#2b91af;">Assert</span>.IsAssignableFrom&lt;<span style="color:#2b91af;">BadRequestErrorMessageResult</span>&gt;(actual); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(<span style="color:#a31515;">&quot;User&nbsp;not&nbsp;found.&quot;</span>,&nbsp;err.Message); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.False(db.IsDirty); } [<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">UserManagementTestConventions</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;UsersDoNotConnectWhenOtherUserDoesNotExist( &nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#2b91af;">Frozen</span>(<span style="color:#2b91af;">Matching</span>.ImplementedInterfaces)]<span style="color:#2b91af;">FakeDB</span>&nbsp;db, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">User</span>&nbsp;user, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;otherUserId, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">ConnectionsController</span>&nbsp;sut) { &nbsp;&nbsp;&nbsp;&nbsp;db.Add(user); 
&nbsp;&nbsp;&nbsp;&nbsp;db.IsDirty&nbsp;=&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;sut.Post(user.Id.ToString(),&nbsp;otherUserId.ToString()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;err&nbsp;=&nbsp;<span style="color:#2b91af;">Assert</span>.IsAssignableFrom&lt;<span style="color:#2b91af;">BadRequestErrorMessageResult</span>&gt;(actual); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(<span style="color:#a31515;">&quot;Other&nbsp;user&nbsp;not&nbsp;found.&quot;</span>,&nbsp;err.Message); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.False(db.IsDirty); }</pre> </p> <p> There's little to say about these tests that hasn't already been said in at least one of the previous articles. All tests inspect the state of the Fake database after calling the <code>Post</code> method. The exact interactions between <code>Post</code> and <code>db</code> aren't specified. Instead, these tests rely on setting up the initial state, exercising the SUT, and verifying the final state. These are all state-based tests that avoid over-specifying the interactions. </p> <p> Specifically, none of these tests use Mocks and Stubs. In fact, at this incarnation of the test code, I was able to entirely remove the reference to <a href="https://github.com/moq/moq4">Moq</a>. </p> <h3 id="2e73c70d74cf40dfbb3e258e4293a5cf"> Summary <a href="#2e73c70d74cf40dfbb3e258e4293a5cf" title="permalink">#</a> </h3> <p> The premise of <a href="http://amzn.to/YPdQDf">Refactoring</a> is that in order to be able to refactor, the "precondition is [...] solid tests". In reality, many development organisations have the opposite experience. When programmers attempt to make changes to how their code is organised, tests break. 
In <a href="http://bit.ly/xunitpatterns">xUnit Test Patterns</a> this problem is called <em>Fragile Tests</em>, and the cause is often <em>Overspecified Software</em>. This means that tests are tightly coupled to implementation details of the SUT. </p> <p> It's easy to inadvertently fall into this trap when you use Mocks and Stubs, even when you follow the rule of using <a href="http://blog.ploeh.dk/2013/10/23/mocks-for-commands-stubs-for-queries">Mocks for Commands and Stubs for Queries</a>. Refactoring tests towards state-based testing with Fake objects, instead of interaction-based testing, could make test suites more robust to changes. </p> <p> It's intriguing, though, that state-based testing is simpler in functional programming. In Haskell, you can simply write your tests in the State monad and compare the expected outcome to the actual outcome. Since state in Haskell is immutable, it's trivial to compare the expected with the actual state. </p> <p> As soon as you introduce mutable state, structural equality is no longer safe, and instead you have to rely on other inspection mechanisms, such as the <code>IsDirty</code> flag seen in this, and the previous, article. This makes the tests slightly more brittle, because it tends to pull towards interaction-based testing. </p> <p> While you can implement the State monad in both F# and C#, it's probably more pragmatic to express state-based tests using mutable state and the occasional <code>IsDirty</code> flag. As always, there's no panacea. </p> <p> While this article concludes the series on moving towards state-based testing, I think that an appendix on Test Spies is in order. </p> <p> <strong>Next:</strong> <a href="/2019/04/08/a-pure-test-spy">A pure Test Spy</a>. 
</p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="b66a6b327b8949b69db335722c9501e3"> <div class="comment-author">ladeak</div> <div class="comment-content"> <p> If we had checked the FakeDB contains to user (by retrieving, similar as in the F# case), and assert Connections property on the retrieved objects, would we still need the IsDirty flag? I think it would be good to create a couple of cases which demonstrates refactoring, and how overspecified tests break with the interaction based tests, while works nicely here. </p> </div> <div class="comment-date">2019-04-05 17:20 UTC</div> </div> <div class="comment" id="39fa5e0bfdf24153854a260addf32d0e"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> ladeak, thank you for writing. The <code>IsDirty</code> flag is essentially a hack to work around the mutable nature of the <code>FakeDB</code>. As the <a href="/2019/03/25/an-example-of-state-based-testing-in-f#dd0384ed4c2d478f8374dbd55c86b197">previous article describes</a>: <blockquote> "In the previous article, the Fake database was simply an immutable dictionary. This meant that tests could easily compare expected and actual values, since they were immutable. When you use a mutable object, like the above dictionary, this is harder. Instead, what I chose to do here was to introduce an <code>IsDirty</code> flag. This enables easy verification of whether or not the database changed." </blockquote> The <a href="/2019/03/11/an-example-of-state-based-testing-in-haskell">Haskell example</a> demonstrates how no <code>IsDirty</code> flag is required, because you can simply compare the state before and after the SUT was exercised. </p> <p> You could do something similar in C# or F#, but that would require you to take an immutable snapshot of the Fake database before exercising the SUT, and then compare that snapshot with the state of the Fake database after the SUT was exercised. 
This is definitely also doable (as the Haskell example demonstrates), but a bit more work, which is the (unprincipled, pragmatic) reason I instead chose to use an <code>IsDirty</code> flag. </p> <p> Regarding more examples, I originally wrote another sample code base to support <a href="https://vimeo.com/296652563">this talk</a>. That sample code base contains examples that demonstrate how overspecified tests break even when you make small internal changes. I haven't yet, however, found a good home for that code base. </p> </div> <div class="comment-date">2019-04-06 10:25 UTC</div> </div> </div> <hr> An example of state based-testing in F# https://blog.ploeh.dk/2019/03/25/an-example-of-state-based-testing-in-f 2019-03-25T06:34:00+00:00 Mark Seemann <div id="post"> <p> <em>While F# is a functional-first language, it's okay to occasionally be pragmatic and use mutable state, for example to easily write some sustainable state-based tests.</em> </p> <p> This article is an instalment in an article series about how to move <a href="/2019/02/18/from-interaction-based-to-state-based-testing">from interaction-based testing to state-based testing</a>. In the previous article, you saw how to write <a href="/2019/03/11/an-example-of-state-based-testing-in-haskell">state-based tests in Haskell</a>. In this article, you'll see how to apply what you've learned in <a href="https://fsharp.org">F#</a>. </p> <p> The code shown in this article is <a href="https://github.com/ploeh/UserManagement">available on GitHub</a>. </p> <h3 id="57929a79c0a740c5986df12fc6e4b1c1"> A function to connect two users <a href="#57929a79c0a740c5986df12fc6e4b1c1" title="permalink">#</a> </h3> <p> This article, like the others in this series, implements an operation to connect two users. 
I explain the example in detail in my two <a href="https://cleancoders.com">Clean Coders</a> videos, <a href="https://cleancoders.com/episode/humane-code-real-episode-4/show">Church Visitor</a> and <a href="https://cleancoders.com/episode/humane-code-real-episode-5/show">Preserved in translation</a>. </p> <p> Like in the previous <a href="https://www.haskell.org">Haskell</a> example, in this article we'll start with the implementation, and then see how to unit test it. </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;a&nbsp;-&gt;&nbsp;Result&lt;User,UserLookupError&gt;)&nbsp;-&gt;&nbsp;(User&nbsp;-&gt;&nbsp;unit)&nbsp;-&gt;&nbsp;&#39;a&nbsp;-&gt;&nbsp;&#39;a&nbsp;-&gt;&nbsp;HttpResponse&lt;User&gt;</span> <span style="color:blue;">let</span>&nbsp;post&nbsp;lookupUser&nbsp;updateUser&nbsp;userId&nbsp;otherUserId&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;userRes&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lookupUser&nbsp;userId&nbsp;|&gt;&nbsp;Result.mapError&nbsp;(<span style="color:blue;">function</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;InvalidId&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#a31515;">&quot;Invalid&nbsp;user&nbsp;ID.&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;NotFound&nbsp;&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#a31515;">&quot;User&nbsp;not&nbsp;found.&quot;</span>) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;otherUserRes&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lookupUser&nbsp;otherUserId&nbsp;|&gt;&nbsp;Result.mapError&nbsp;(<span style="color:blue;">function</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;InvalidId&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#a31515;">&quot;Invalid&nbsp;ID&nbsp;for&nbsp;other&nbsp;user.&quot;</span> 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;NotFound&nbsp;&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#a31515;">&quot;Other&nbsp;user&nbsp;not&nbsp;found.&quot;</span>) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;connect&nbsp;=&nbsp;result&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;user&nbsp;=&nbsp;userRes &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;otherUser&nbsp;=&nbsp;otherUserRes &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;addConnection&nbsp;user&nbsp;otherUser&nbsp;|&gt;&nbsp;updateUser &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;otherUser&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;connect&nbsp;<span style="color:blue;">with</span>&nbsp;Ok&nbsp;u&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;OK&nbsp;u&nbsp;|&nbsp;Error&nbsp;msg&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;BadRequest&nbsp;msg</pre> </p> <p> While <a href="/2019/02/25/an-example-of-interaction-based-testing-in-c">the original C# example</a> used Constructor Injection, the above <code>post</code> function uses <a href="/2017/01/30/partial-application-is-dependency-injection">partial application for Dependency Injection</a>. The two function arguments <code>lookupUser</code> and <code>updateUser</code> represent interactions with a database. Since functions are polymorphic, however, it's possible to replace them with <a href="http://xunitpatterns.com/Test%20Double.html">Test Doubles</a>. </p> <h3 id="dd0384ed4c2d478f8374dbd55c86b197"> A Fake database <a href="#dd0384ed4c2d478f8374dbd55c86b197" title="permalink">#</a> </h3> <p> Like in the Haskell example, you can implement a <a href="http://xunitpatterns.com/Fake%20Object.html">Fake</a> database in F#. 
It's also possible to implement the State monad in F#, but there's less need for it. F# is a functional-first language, but you can also write mutable code if need be. You could, then, choose to be pragmatic and base your Fake database on mutable state. </p> <p> <pre><span style="color:blue;">type</span>&nbsp;FakeDB&nbsp;()&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;users&nbsp;=&nbsp;Dictionary&lt;int,&nbsp;User&gt;&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">member</span>&nbsp;<span style="color:blue;">val</span>&nbsp;IsDirty&nbsp;=&nbsp;<span style="color:blue;">false</span>&nbsp;<span style="color:blue;">with</span>&nbsp;get,&nbsp;set &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">member</span>&nbsp;this.AddUser&nbsp;user&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;this.IsDirty&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;<span style="color:blue;">true</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;users.Add&nbsp;(user.UserId,&nbsp;user) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">member</span>&nbsp;this.TryFind&nbsp;i&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;users.TryGetValue&nbsp;i&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;<span style="color:blue;">false</span>,&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;None &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;<span style="color:blue;">true</span>,&nbsp;&nbsp;u&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Some&nbsp;u &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">member</span>&nbsp;this.LookupUser&nbsp;s&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;Int32.TryParse&nbsp;s&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;<span style="color:blue;">false</span>,&nbsp;_&nbsp;<span 
style="color:blue;">-&gt;</span>&nbsp;Error&nbsp;InvalidId &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;<span style="color:blue;">true</span>,&nbsp;i&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;users.TryGetValue&nbsp;i&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;<span style="color:blue;">false</span>,&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Error&nbsp;NotFound &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;<span style="color:blue;">true</span>,&nbsp;u&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Ok&nbsp;u &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">member</span>&nbsp;this.UpdateUser&nbsp;u&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;this.IsDirty&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;<span style="color:blue;">true</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;users.[u.UserId]&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;u</pre> </p> <p> This <code>FakeDB</code> type is <em>a class</em> that wraps a mutable dictionary. While it 'implements' <code>LookupUser</code> and <code>UpdateUser</code>, it also exposes what <a href="http://bit.ly/xunitpatterns">xUnit Test Patterns</a> calls a <em>Retrieval Interface:</em> an API that tests can use to examine the state of the object. </p> <p> Immutable values normally have structural equality. This means that two values are considered equal if they contain the same constituent values, and have the same structure. Mutable objects, on the other hand, typically have reference equality. This makes it harder to compare two objects, which is, however, what almost all unit testing is about. You compare expected state with actual state. </p> <p> In the previous article, the Fake database was simply an immutable dictionary. 
This meant that tests could easily compare expected and actual values, since they were immutable. When you use a mutable object, like the above dictionary, this is harder. Instead, what I chose to do here was to introduce an <code>IsDirty</code> flag. This enables easy verification of whether or not the database changed. </p> <h3 id="88dbfa0dd8f34f4ba91f2b7acc6173b1"> Happy path test case <a href="#88dbfa0dd8f34f4ba91f2b7acc6173b1" title="permalink">#</a> </h3> <p> This is all you need in terms of Test Doubles. You now have test-specific <code>LookupUser</code> and <code>UpdateUser</code> methods that you can pass to the <code>post</code> function. </p> <p> Like in the previous article, you can start by exercising the happy path where a user successfully connects with another user: </p> <p> <pre>[&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``Users&nbsp;successfully&nbsp;connect``&nbsp;()&nbsp;=&nbsp;Property.check&nbsp;&lt;|&nbsp;property&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;user&nbsp;=&nbsp;Gen.user &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;otherUser&nbsp;=&nbsp;Gen.withOtherId&nbsp;user &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;db&nbsp;=&nbsp;FakeDB&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;db.AddUser&nbsp;user &nbsp;&nbsp;&nbsp;&nbsp;db.AddUser&nbsp;otherUser &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;post&nbsp;db.LookupUser&nbsp;db.UpdateUser&nbsp;(string&nbsp;user.UserId)&nbsp;(string&nbsp;otherUser.UserId) &nbsp;&nbsp;&nbsp;&nbsp;test&nbsp;&lt;@&nbsp;db.TryFind&nbsp;user.UserId &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Option.exists &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">fun</span>&nbsp;u&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;u.ConnectedUsers&nbsp;|&gt;&nbsp;List.contains&nbsp;otherUser)&nbsp;@&gt; 
&nbsp;&nbsp;&nbsp;&nbsp;test&nbsp;&lt;@&nbsp;isOK&nbsp;actual&nbsp;@&gt;&nbsp;}</pre> </p> <p> All tests in this article use <a href="https://xunit.github.io">xUnit.net</a> 2.3.1, <a href="https://github.com/SwensenSoftware/unquote">Unquote</a> 4.0.0, and <a href="https://github.com/hedgehogqa/fsharp-hedgehog">Hedgehog</a> 0.7.0.0. </p> <p> This test first adds two valid users to the Fake database <code>db</code>. It then calls the <code>post</code> function, passing the <code>db.LookupUser</code> and <code>db.UpdateUser</code> methods as arguments. Finally, it verifies that the 'first' user's <code>ConnectedUsers</code> now contains the <code>otherUser</code>. It also verifies that <code>actual</code> represents a <code>200 OK</code> HTTP response. </p> <h3 id="750f435b6e854db898e0da44119ee4f6"> Missing user test case <a href="#750f435b6e854db898e0da44119ee4f6" title="permalink">#</a> </h3> <p> While there's one happy-path test case, there are four other test cases left. One of these is when the first user doesn't exist: </p> <p> <pre>[&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``Users&nbsp;don&#39;t&nbsp;connect&nbsp;when&nbsp;user&nbsp;doesn&#39;t&nbsp;exist``&nbsp;()&nbsp;=&nbsp;Property.check&nbsp;&lt;|&nbsp;property&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;i&nbsp;=&nbsp;Range.linear&nbsp;1&nbsp;1_000_000&nbsp;|&gt;&nbsp;Gen.int &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;otherUser&nbsp;=&nbsp;Gen.user &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;db&nbsp;=&nbsp;FakeDB&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;db.AddUser&nbsp;otherUser &nbsp;&nbsp;&nbsp;&nbsp;db.IsDirty&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;<span style="color:blue;">false</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;uniqueUserId&nbsp;=&nbsp;string&nbsp;(otherUser.UserId&nbsp;+&nbsp;i) &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;post&nbsp;db.LookupUser&nbsp;db.UpdateUser&nbsp;uniqueUserId&nbsp;(string&nbsp;otherUser.UserId) &nbsp;&nbsp;&nbsp;&nbsp;test&nbsp;&lt;@&nbsp;not&nbsp;db.IsDirty&nbsp;@&gt; &nbsp;&nbsp;&nbsp;&nbsp;test&nbsp;&lt;@&nbsp;isBadRequest&nbsp;actual&nbsp;@&gt;&nbsp;}</pre> </p> <p> This test adds one valid user to the Fake database. Once it's done with configuring the database, it sets <code>IsDirty</code> to <code>false</code>. The <code>AddUser</code> method sets <code>IsDirty</code> to <code>true</code>, so it's important to reset the flag before the <em>act</em> phase of the test. You could consider this a bit of a hack, but I think it makes the intent of the test clear. This is, however, a position I'm ready to reassess should the tests evolve to make this design awkward. </p> <p> As explained in the previous article, this test case requires an ID of a user that doesn't exist. Since this is a property-based test, there's a risk that Hedgehog might generate a number <code>i</code> equal to <code>otherUser.UserId</code>. One way to get around that problem is to add the two numbers together. Since <code>i</code> is generated from the range <em>1 - 1,000,000</em>, <code>uniqueUserId</code> is guaranteed to be different from <code>otherUser.UserId</code>. </p> <p> The test verifies that the state of the database didn't change (that <code>IsDirty</code> is still <code>false</code>), and that <code>actual</code> represents a <code>400 Bad Request</code> HTTP response. 
</p> <h3 id="4dbe21251f8440b1a594192d07b53c9f"> Remaining test cases <a href="#4dbe21251f8440b1a594192d07b53c9f" title="permalink">#</a> </h3> <p> You can write the remaining three test cases in the same vein: </p> <p> <pre>[&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``Users&nbsp;don&#39;t&nbsp;connect&nbsp;when&nbsp;other&nbsp;user&nbsp;doesn&#39;t&nbsp;exist``&nbsp;()&nbsp;=&nbsp;Property.check&nbsp;&lt;|&nbsp;property&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;i&nbsp;=&nbsp;Range.linear&nbsp;1&nbsp;1_000_000&nbsp;|&gt;&nbsp;Gen.int &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;user&nbsp;=&nbsp;Gen.user &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;db&nbsp;=&nbsp;FakeDB&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;db.AddUser&nbsp;user &nbsp;&nbsp;&nbsp;&nbsp;db.IsDirty&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;<span style="color:blue;">false</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;uniqueOtherUserId&nbsp;=&nbsp;string&nbsp;(user.UserId&nbsp;+&nbsp;i) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;post&nbsp;db.LookupUser&nbsp;db.UpdateUser&nbsp;(string&nbsp;user.UserId)&nbsp;uniqueOtherUserId&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;test&nbsp;&lt;@&nbsp;not&nbsp;db.IsDirty&nbsp;@&gt; &nbsp;&nbsp;&nbsp;&nbsp;test&nbsp;&lt;@&nbsp;isBadRequest&nbsp;actual&nbsp;@&gt;&nbsp;} [&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``Users&nbsp;don&#39;t&nbsp;connect&nbsp;when&nbsp;user&nbsp;Id&nbsp;is&nbsp;invalid``&nbsp;()&nbsp;=&nbsp;Property.check&nbsp;&lt;|&nbsp;property&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;s&nbsp;=&nbsp;Gen.alphaNum&nbsp;|&gt;&nbsp;Gen.string&nbsp;(Range.linear&nbsp;0&nbsp;100)&nbsp;|&gt;&nbsp;Gen.filter&nbsp;isIdInvalid &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;otherUser&nbsp;=&nbsp;Gen.user &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">let</span>&nbsp;db&nbsp;=&nbsp;FakeDB&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;db.AddUser&nbsp;otherUser &nbsp;&nbsp;&nbsp;&nbsp;db.IsDirty&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;<span style="color:blue;">false</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;post&nbsp;db.LookupUser&nbsp;db.UpdateUser&nbsp;s&nbsp;(string&nbsp;otherUser.UserId) &nbsp;&nbsp;&nbsp;&nbsp;test&nbsp;&lt;@&nbsp;not&nbsp;db.IsDirty&nbsp;@&gt; &nbsp;&nbsp;&nbsp;&nbsp;test&nbsp;&lt;@&nbsp;isBadRequest&nbsp;actual&nbsp;@&gt;&nbsp;} [&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``Users&nbsp;don&#39;t&nbsp;connect&nbsp;when&nbsp;other&nbsp;user&nbsp;Id&nbsp;is&nbsp;invalid``&nbsp;()&nbsp;=&nbsp;Property.check&nbsp;&lt;|&nbsp;property&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;s&nbsp;=&nbsp;Gen.alphaNum&nbsp;|&gt;&nbsp;Gen.string&nbsp;(Range.linear&nbsp;0&nbsp;100)&nbsp;|&gt;&nbsp;Gen.filter&nbsp;isIdInvalid &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;user&nbsp;=&nbsp;Gen.user &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;db&nbsp;=&nbsp;FakeDB&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;db.AddUser&nbsp;user &nbsp;&nbsp;&nbsp;&nbsp;db.IsDirty&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;<span style="color:blue;">false</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;post&nbsp;db.LookupUser&nbsp;db.UpdateUser&nbsp;(string&nbsp;user.UserId)&nbsp;s &nbsp;&nbsp;&nbsp;&nbsp;test&nbsp;&lt;@&nbsp;not&nbsp;db.IsDirty&nbsp;@&gt; &nbsp;&nbsp;&nbsp;&nbsp;test&nbsp;&lt;@&nbsp;isBadRequest&nbsp;actual&nbsp;@&gt;&nbsp;}</pre> </p> <p> All tests inspect the state of the Fake database after calling the <code>post</code> function. The exact interactions between <code>post</code> and <code>db</code> aren't specified. 
Instead, these tests rely on setting up the initial state, exercising the System Under Test, and verifying the final state. These are all state-based tests that avoid over-specifying the interactions. </p> <h3 id="102ff7e3a9b247b4914f113b06afd6df"> Summary <a href="#102ff7e3a9b247b4914f113b06afd6df" title="permalink">#</a> </h3> <p> While the previous Haskell example demonstrated that it's possible to write state-based unit tests in a functional style, when using F#, it sometimes makes sense to leverage the object-oriented features already available in the .NET framework, such as mutable dictionaries. It would have been possible to write purely functional state-based tests in F# as well, by porting the Haskell examples, but here, I wanted to demonstrate that this isn't required. </p> <p> I tend to be of the opinion that it's only possible to be pragmatic if you know how to be dogmatic, but now that we know how to write state-based tests in a strictly functional style, I think it's fine to be pragmatic and use a bit of mutable state in F#. The benefit of this is that it now seems clear how to apply what we've learned to the original C# example. </p> <p> <strong>Next: </strong> <a href="/2019/04/01/an-example-of-state-based-testing-in-c">An example of state-based testing in C#</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. The programmer as decision maker https://blog.ploeh.dk/2019/03/18/the-programmer-as-decision-maker 2019-03-18T07:44:00+00:00 Mark Seemann <div id="post"> <p> <em>As a programmer, your job is to make technical decisions. Make some more.</em> </p> <p> When <a href="/schedule">I speak at conferences</a>, people often come and talk to me. (I welcome that, BTW.) Among all the conversations I've had over the years, there's a pattern to some of them. The attendee will start by telling me how inspired (s)he is by the talk I just gave, or something I've written. 
That's gratifying, and a good way to start a conversation, but is often followed up like this: </p> <p> <strong>Attendee:</strong> "I just wish that we could do something like that in our organisation..." </p> <p> Let's just say that here we're talking about test-driven development, or perhaps just unit testing. Nothing too controversial. I'd typically respond, </p> <p> <strong>Me:</strong> "Why can't you?" </p> <p> <strong>Attendee:</strong> "Our boss won't let us..." </p> <p> That's unfortunate. If your boss has explicitly forbidden you to write and run unit tests, then there's not much you can do. Let me make this absolutely clear: I'm not going on record saying that you should actively disobey a direct order (unless it's unethical, that is). I do wonder, however: </p> <p> <em>Why is the boss even involved in that decision?</em> </p> <p> It seems to me that programmers often defer too much authority to their managers. </p> <h3 id="510367ded1ca47e7922b540d2fff50b0"> A note on culture <a href="#510367ded1ca47e7922b540d2fff50b0" title="permalink">#</a> </h3> <p> I'd like to preface the rest of this article with my own context. I've spent most of my programming career in Danish organisations. Even when I worked for Microsoft, I worked for Danish subsidiaries, with Danish managers. </p> <p> The power distance in Denmark is (in)famously short. It's not unheard of for individual contributors to question their superiors' decisions; sometimes to their face, and sometimes even when other people witness this. When done respectfully (which it often is), this can be extremely efficient. Managers are as fallible as the rest of us, and often their subordinates know of details that could impact a decision that a manager is about to make. Immediately discussing such details can help ensure that good decisions are made, and bad decisions are cancelled. </p> <p> This helps managers make better decisions, so enlightened managers welcome feedback. 
</p> <p> In general, Danish employees also tend to have a fair degree of autonomy. What I'll suggest in this article is unlikely to get you fired in Denmark. Please use your own judgement if you consider transplanting the following to your own culture. </p> <h3 id="eaa896ed8a6240729301c54fed859ab7"> Technical decisions <a href="#eaa896ed8a6240729301c54fed859ab7" title="permalink">#</a> </h3> <p> If your job is <em>programmer</em>, <em>software developer</em>, or similar, the value you add to the team is that you bring <em>technical expertise</em>. Maybe some of your colleagues are programmers as well, but together, you are the people with the technical expertise. </p> <p> Even if the project manager or other superiors used to program, unless they're also writing code for the current code base, they only have general technical expertise, but not specific expertise related to the code base you're working with. The people with most technical expertise are you and your colleagues. </p> <p> You are decision makers. </p> <p> Whenever you interact with your code base, you make technical decisions. </p> <p> In order to handle incoming HTTP requests to a <code>/reservations</code> resource, you may first decide to create a <a href="/2019/02/11/asynchronous-injection">new file called <code>ReservationsController.cs</code></a>. You'd most likely also decide to open that file and start adding code to it. </p> <p> Perhaps you add a method called <code>Post</code> that takes a <code>Reservation</code> argument. Perhaps you decide to inject an <code>IMaîtreD</code> dependency. </p> <p> At various steps along the way, you may decide to compile the code. </p> <p> Once you think that you've made enough changes to address your current work item, you may decide to run the program to see if it works. For a web-based piece of software, that typically involves starting up a browser and somehow interacting with the service. 
If your program is a web site, you may start at the front page, log in, click around, and fill in some forms. If your program is a REST API, you may interact with it via Fiddler or Postman (I prefer curl or <a href="https://github.com/ploeh/Furl">Furl</a>, but most people I've met still prefer something they can click on, it seems). </p> <p> What often happens is that your changes don't work the first time around, so you'll have to troubleshoot. Perhaps you decide to use a debugger. </p> <p> How many decisions is that? </p> <p> I just described seven or eight types of the sort of decisions you make as a programmer. You make such decisions all the time. Do you ask your manager's permission before you start a debugging session? Before you create a new file? Before you name a variable? </p> <p> Of course you don't. You're the technical expert. There's no-one better equipped than you or your team members to make those decisions. </p> <h3 id="bed169525ccf4a90aaa9c0e33fcbf8d0"> Decide to add unit tests <a href="#bed169525ccf4a90aaa9c0e33fcbf8d0" title="permalink">#</a> </h3> <p> If you want to add unit tests, why don't you just decide to add them? If you want to apply test-driven development, why don't you just do so? </p> <p> A unit test is one or more code files. You're already authorised to make decisions about adding files. </p> <p> You can run a test suite instead of launching the software every time you want to interact with it. It's likely to be faster, even. </p> <p> Why should you ask permission to do that? </p> <h3 id="e42aac76e52141e68a0bc08fefa56a9b"> Decide to refactor <a href="#e42aac76e52141e68a0bc08fefa56a9b" title="permalink">#</a> </h3> <p> Another complaint I hear is that people aren't allowed to refactor. </p> <p> Why are you even asking permission to refactor? </p> <p> <a href="http://amzn.to/YPdQDf">Refactoring</a> means reorganising the code without changing the behaviour of the system. Another word for that is <em>editing</em> the code. It's okay. 
You're already permitted to edit code. It's part of your job description. </p> <p> I think I know what the underlying problem is, though... </p> <h3 id="f8c290ef29bd437daf2a81b3e06e131c"> Make technical decisions in the small <a href="#f8c290ef29bd437daf2a81b3e06e131c" title="permalink">#</a> </h3> <p> As an individual contributor, you're empowered to make small-scale technical decisions. These are decisions that are unlikely to impact schedules or allocation of programmers, including new hires. Big decisions probably should involve your manager. </p> <p> I have an inkling of why people feel that they need permission to refactor. It's because the refactoring they have in mind is going to take weeks. Weeks in which nothing else can be done. Weeks where perhaps the code doesn't even compile. </p> <p> Many years ago (but not as many as I'd like it to be), my colleague and I had what Eric Evans in <a href="http://amzn.to/WBCwx7">DDD</a> calls a <em>breakthrough</em>. We wanted to refactor towards deeper insight. What prompted the insight was a new feature that we had to add, and we'd been throwing design ideas back and forth for some time before the new insight arrived. </p> <p> We could implement the new feature if we changed one of the core abstractions in our domain model, but it required substantial changes to the existing code base. We informed our manager of our new insight and our plan, estimating that it would take less than a week to make the changes and implement the new feature. Our manager agreed with the plan. </p> <p> Two weeks later our code hadn't been in a compilable state for a week. Our manager pulled me away to tell me, quietly and equitably, that he was not happy with our lack of progress. I could only concur. </p> <p> After more heroic work, we finally managed to complete the changes and implement the new feature. Nonetheless, blocking all other development for two-three weeks in order to make a change isn't acceptable. 
</p> <p> That sort of change is a big decision because it impacts other team members, schedules, and perhaps overall business plans. Don't make those kinds of decisions without consulting with stakeholders. </p> <p> This still leaves, I believe, lots of room for individual decision-making in the small. What I learned from the experience I just recounted was not to engage in big changes to a code base. Learn how to make multiple incremental changes instead. In case that's completely impossible, add the new model side-by-side with the old model, and incrementally change over. That's what I should have done those many years ago. </p> <h3 id="298ca23885544e5f906379c1a6d15586"> Don't be sneaky <a href="#298ca23885544e5f906379c1a6d15586" title="permalink">#</a> </h3> <p> When I give talks about the blessings of functional programming, I sometimes get into another type of discussion. </p> <p> <strong>Attendee:</strong> It's so inspiring how beautiful and simple complex domain models become in <a href="https://fsharp.org">F#</a>. How can we do the same in C#? </p> <p> <strong>Me:</strong> You can't. If you're already using C#, you should strongly consider F# if you wish to do functional programming. Since it's also a .NET language, you can gradually introduce F# code and mix the compiled code with your existing C# code. </p> <p> <strong>Attendee:</strong> Yes... [already getting impatient with me] But we can't do that... </p> <p> <strong>Me:</strong> Why not? </p> <p> <strong>Attendee:</strong> Because our manager will not allow it. </p> <p> Based on the suggestions I've already made here, you may expect me to say that that's another technical decision that you should make without asking permission. Like the previous example about blocking refactorings, however, this is another large-scale decision. </p> <p> Your manager may be concerned that it'd be hard to find new employees if the code base is written in some niche language. 
<a href="/2015/12/03/the-rules-of-attraction-language">I tend to disagree with that position</a>, but I do understand why a manager would take that position. While I think it suboptimal to restrict an entire development organisation to a single language (whether it's C#, Java, C++, Ruby, etc.), I'll readily accept that language choice is a strategic decision. </p> <p> If every programmer got to choose the programming language they prefer the most that day, you'd have code bases written in dozens of different languages. While you can train bright new hires to learn a new language or two, it's unrealistic that a new employee will be able to learn thirty different languages in a short while. </p> <p> I find it reasonable that a manager has the final word on the choice of language, even when I often disagree with the decisions. </p> <p> The outcome usually is that people are stuck with C# (or Java, or...). Hence the question: <em>How can we do functional programming in C#?</em> </p> <p> I'll give the answer that I often give here on the blog: <a href="https://en.wikipedia.org/wiki/Mu_(negative)">mu</a> (<em>unask the question</em>). 
You can, in fact, translate functional concepts to C#, but <a href="/2018/07/24/dependency-injection-revisited">the result is so non-idiomatic</a> that only the syntax remains of C#: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IReservationsInstruction</span>&lt;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;Select&lt;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IReservationsInstruction</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;source, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;selector) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.Match&lt;<span style="color:#2b91af;">IReservationsInstruction</span>&lt;<span style="color:#2b91af;">TResult</span>&gt;&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;isReservationInFuture:&nbsp;t&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">IsReservationInFuture</span>&lt;<span style="color:#2b91af;">TResult</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:#2b91af;">Reservation</span>,&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:blue;">bool</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.Item1, 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;b&nbsp;=&gt;&nbsp;selector(t.Item2(b)))), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;readReservations:&nbsp;t&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReadReservations</span>&lt;<span style="color:#2b91af;">TResult</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:#2b91af;">DateTimeOffset</span>,&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">IReadOnlyCollection</span>&lt;<span style="color:#2b91af;">Reservation</span>&gt;,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.Item1, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;d&nbsp;=&gt;&nbsp;selector(t.Item2(d)))), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;create:&nbsp;t&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Create</span>&lt;<span style="color:#2b91af;">TResult</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Tuple</span>&lt;<span style="color:#2b91af;">Reservation</span>,&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.Item1, 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;r&nbsp;=&gt;&nbsp;selector(t.Item2(r))))); }</pre> </p> <p> Keep in mind the manager's motivation for standardising on C#. It's often related to concerns about being able to hire new employees, or move employees from project to project. </p> <p> If you write 'functional' C#, you'll end up with code like the above, or the following real-life example: </p> <p> <pre><span style="color:blue;">return</span>&nbsp;<span style="color:blue;">await</span>&nbsp;sendRequest( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">ApiMethodNames</span>.InitRegistration, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">GSObject</span>()) &nbsp;&nbsp;&nbsp;&nbsp;.Map(r&nbsp;=&gt;&nbsp;<span style="color:#2b91af;">ValidateResponse</span>.Validate(r) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.MapFailure(_&nbsp;=&gt;&nbsp;<span style="color:#2b91af;">ErrorResponse</span>.RegisterErrorResponse())) &nbsp;&nbsp;&nbsp;&nbsp;.Bind(r&nbsp;=&gt;&nbsp;r.RetrieveField(<span style="color:#a31515;">&quot;regToken&quot;</span>)) &nbsp;&nbsp;&nbsp;&nbsp;.BindAsync(token&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sendRequest( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">ApiMethodNames</span>.RegisterAccount, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;CreateRegisterRequest( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;mailAddress, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;password, 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;token)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Map(<span style="color:#2b91af;">ValidateResponse</span>.Validate) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Bind(response&nbsp;=&gt;&nbsp;getIdentity(response) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.ToResult(<span style="color:#2b91af;">ErrorResponse</span>.ExternalServiceResponseInvalid))) &nbsp;&nbsp;&nbsp;&nbsp;.Map(id&nbsp;=&gt;&nbsp;<span style="color:#2b91af;">GigyaIdentity</span>.CreateNewSiteUser(id.UserId,&nbsp;mailAddress));</pre> </p> <p> (I'm indebted to <a href="https://twitter.com/runeibsen">Rune Ibsen</a> for this example.) </p> <p> A new hire can have ten years of C# experience and still have no chance in a code base like that. You'll first have to teach him or her functional programming. If you can do that, you might as well also teach a new language, like F#. </p> <p> It's my experience that learning the syntax of a new language is easy, and usually doesn't take much time. The hard part is learning a new way to think. </p> <p> Writing 'functional' C# makes it doubly hard on new team members. Not only do they have to learn a new paradigm (functional programming), but they have to learn it in a language unsuited for that paradigm. </p> <p> That's why I think you should unask the question. If your manager doesn't want to allow F#, then writing 'functional' C# is just being sneaky. That'd be obeying the letter of the law while breaking the spirit of it. That is, in my opinion, immoral. Don't be sneaky. </p> <h3 id="2f1b38953bbc4814994ee471caf4c5ac"> Summary <a href="#2f1b38953bbc4814994ee471caf4c5ac" title="permalink">#</a> </h3> <p> As a professional programmer, your job is to be a technical expert. 
In normal circumstances (at least the ones I know from my own career), you have agency. In order to get anything done, you make small decisions all the time, such as editing code. That's not only okay, but expected of you. </p> <p> Some decisions, on the other hand, can have substantial ramifications. Choosing to write code in an unsanctioned language tends to fall on the side where a manager should be involved in the decision. </p> <p> In between is a grey area. </p> <p> <img src="/content/binary/small-vs-big-decisions-gradient.png" alt="A spectrum of decisions from small to the left to big to the right."> </p> <p> I don't even consider adding unit tests to be in the grey area, but some refactorings may be. <blockquote> <p>"It's easier to ask forgiveness than it is to get permission."</p> <footer><cite>Grace Hopper</cite></footer> </blockquote> </p> <p> To navigate grey areas you need a moral compass. </p> <p> I'll let you be the final judge of what you can get away with, but I consider it both appropriate and ethical to make the decision to add unit tests, and to continually improve code bases. You shouldn't have to ask permission to do that. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="dd2ab8d5dc6e4c5f9e49d7d7f35a8759"> <div class="comment-author"><a href="https://github.com/chicocode">Francisco Berrocal</a></div> <div class="comment-content"> <p>Before all, I'd just like to thank you for all the content you share; it all makes me think in a good way!</p> <p> Now, regarding this post, while I tend to agree that a developer can take the decision to add (or not) unit tests by himself, there is no great value coming out of it, if that's not an approach of the whole development team, right? I believe we need the entire team on board to maximize the value of unit tests. 
There are changes we need to consider, from changes in the mindset of how you develop to actually running them on continuous integration pipelines. Doesn't all of that push simple decisions like "add unit test" from green area towards orange area? </p> </div> <div class="comment-date">2019-03-18 13:14 UTC</div> </div> <div class="comment" id="2bc69a5123d8499ca40631b9ce946919"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Francisco, thank you for writing. If you have a team of developers, then I agree that unit tests are going to be most valuable if the team decides to use them. </p> <p> This is still something that you ought to be competent to decide as a self-organising team of developers. Do you need to ask a manager's permission? </p> <p> I'm not trying to pretend that this is easy. I realise that it can be difficult. </p> <p> I've heard about teams where other developers are hostile to the idea of unit testing. In that situation, I can offer no easy fixes. What a lone developer can try to do in that situation is to add and run unit tests locally, on his or her own machine. This will incur some friction, because other team members will be oblivious to the tests, so they'll change code that will cause those unit tests to break. </p> <p> This might teach the lone developer to write tests so that they're as robust to trivial changes as possible. That's a valuable skill in any case. There's still going to be some overhead of maintaining the unit tests in a scenario like that, but if that overhead is smaller than the productivity gained, then it might still be worthwhile. </p> <p> What might then happen could be that other developers who are on the fence see that the lone unit tester is more effective than they are. Perhaps they'll get curious about unit tests after all, once they can see the contours of the advantages. 
</p> <p> The next scenario, then, is a team with a few developers writing unit tests, and others who don't. At some number, you'll have achieved enough critical mass that, at least, you get to check in the unit tests together with the source code. Soon after, you may be able to institute a policy that while not everyone writes unit tests, it's not okay to break existing tests. </p> <p> The next thing you can do, then, is to set up a test run as part of continuous integration and declare that a failing test run means that the build broke. You still have team members who don't write tests, but at least you get to do it, and the tests add value to the whole team. </p> <p> Perhaps the sceptics will slowly start to write unit tests over time. Some die-hards probably never will. </p> <p> You may be able to progress through such stages without asking a manager, but I do understand that there's much variation in organisation and team dynamics. If you can use any of the above sketches as inspiration, then that's great. If you (or other readers) have other success stories to tell, then please share them. </p> <p> The point I was trying to make with this article is that programmers have agency. This isn't a licence to do whatever you please. You still have to navigate the dynamics of whatever organisation you're in. You may not, however, need to ask your manager about every little thing that you're competent to decide yourselves. </p> </div> <div class="comment-date">2019-03-19 7:57 UTC</div> </div> <div class="comment"> <div class="comment-author"><a href="https://hettomei.github.io/">Timothée GAUTHIER</a></div> <div class="comment-content"> <p> Thank you A LOT for putting words on all these thoughts. You'll be my reference whenever I want to introduce unit tests. </p> <p> My usual example is "a surgeon doesn't need to ask the manager if he can wash his hands. Washing his hands is part of his job". 
(Not mine, but I can't remember where it comes from) </p> </div> <div class="comment-date">2019-03-19 20:15 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. An example of state-based testing in Haskell https://blog.ploeh.dk/2019/03/11/an-example-of-state-based-testing-in-haskell 2019-03-11T07:55:00+00:00 Mark Seemann <div id="post"> <p> <em>How do you do state-based testing when state is immutable? You use the State monad.</em> </p> <p> This article is an instalment in an article series about how to move <a href="/2019/02/18/from-interaction-based-to-state-based-testing">from interaction-based testing to state-based testing</a>. In the previous article, you saw <a href="/2019/02/25/an-example-of-interaction-based-testing-in-c">an example of an interaction-based unit test</a> written in C#. The problem that this article series attempts to address is that interaction-based testing can lead to what <a href="http://bit.ly/xunitpatterns">xUnit Test Patterns</a> calls <em>Fragile Tests</em>, because the tests get coupled to implementation details, instead of overall behaviour. </p> <p> My experience is that functional programming is better aligned with unit testing because <a href="/2015/05/07/functional-design-is-intrinsically-testable">functional design is intrinsically testable</a>. While I believe that functional programming is no panacea, it still seems to me that we can learn many valuable lessons about programming from it. </p> <p> People often ask me about <a href="https://fsharp.org">F#</a> programming: <em>How do I know that my F# code is functional?</em> </p> <p> I sometimes wonder that myself, about my own F# code. One can certainly choose to ignore such a question as irrelevant, and I sometimes do, as well. Still, in my experience, asking such questions can create learning opportunities. 
</p> <p> The best answer that I've found is: <em>Port the F# code to <a href="https://www.haskell.org">Haskell</a>.</em> </p> <p> Haskell enforces <a href="https://en.wikipedia.org/wiki/Referential_transparency">referential transparency</a> via its compiler. If Haskell code compiles, it's functional. In this article, then, I take the problem from the previous article and port it to Haskell. </p> <p> The code shown in this article is <a href="https://github.com/ploeh/UserManagement">available on GitHub</a>. </p> <h3 id="ae68546c3d9f4810a18ea99c2bcc873c"> A function to connect two users <a href="#ae68546c3d9f4810a18ea99c2bcc873c" title="permalink">#</a> </h3> <p> In the previous article, you saw implementation and test coverage of a piece of software functionality to connect two users with each other. This was a simplification of the example running through my two <a href="https://cleancoders.com">Clean Coders</a> videos, <a href="https://cleancoders.com/episode/humane-code-real-episode-4/show">Church Visitor</a> and <a href="https://cleancoders.com/episode/humane-code-real-episode-5/show">Preserved in translation</a>. </p> <p> In contrast to the previous article, we'll start with the implementation of the System Under Test (SUT). 
</p> <p> <pre><span style="color:#2b91af;">post</span>&nbsp;::&nbsp;<span style="color:blue;">Monad</span>&nbsp;m&nbsp;<span style="color:blue;">=&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(a&nbsp;-&gt;&nbsp;m&nbsp;(Either&nbsp;UserLookupError&nbsp;User))&nbsp;-&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(User&nbsp;-&gt;&nbsp;m&nbsp;<span style="color:blue;">()</span>)&nbsp;-&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a&nbsp;-&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a&nbsp;-&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;m&nbsp;(HttpResponse&nbsp;User)
post&nbsp;lookupUser&nbsp;updateUser&nbsp;userId&nbsp;otherUserId&nbsp;=&nbsp;<span style="color:blue;">do</span>
&nbsp;&nbsp;userRes&nbsp;&lt;-&nbsp;first&nbsp;(\<span style="color:blue;">case</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;InvalidId&nbsp;-&gt;&nbsp;<span style="color:#a31515;">&quot;Invalid&nbsp;user&nbsp;ID.&quot;</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;NotFound&nbsp;&nbsp;-&gt;&nbsp;<span style="color:#a31515;">&quot;User&nbsp;not&nbsp;found.&quot;</span>)
&nbsp;&nbsp;&nbsp;&nbsp;&lt;$&gt;&nbsp;lookupUser&nbsp;userId
&nbsp;&nbsp;otherUserRes&nbsp;&lt;-&nbsp;first&nbsp;(\<span style="color:blue;">case</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;InvalidId&nbsp;-&gt;&nbsp;<span style="color:#a31515;">&quot;Invalid&nbsp;ID&nbsp;for&nbsp;other&nbsp;user.&quot;</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;NotFound&nbsp;&nbsp;-&gt;&nbsp;<span style="color:#a31515;">&quot;Other&nbsp;user&nbsp;not&nbsp;found.&quot;</span>)
&nbsp;&nbsp;&nbsp;&nbsp;&lt;$&gt;&nbsp;lookupUser&nbsp;otherUserId
&nbsp;&nbsp;connect&nbsp;&lt;-&nbsp;runExceptT&nbsp;$&nbsp;<span style="color:blue;">do</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;user&nbsp;&lt;-&nbsp;ExceptT&nbsp;$&nbsp;<span style="color:blue;">return</span>&nbsp;userRes
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;otherUser&nbsp;&lt;-&nbsp;ExceptT&nbsp;$&nbsp;<span style="color:blue;">return</span>&nbsp;otherUserRes
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lift&nbsp;$&nbsp;updateUser&nbsp;$&nbsp;addConnection&nbsp;user&nbsp;otherUser
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;otherUser
&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;either&nbsp;BadRequest&nbsp;OK&nbsp;connect</pre> </p> <p> This is as direct a translation of the C# code as makes sense. If I'd only been implementing the desired functionality in Haskell, without having to port existing code, I'd have designed the code differently. </p> <p> This <code>post</code> function uses <a href="/2017/01/30/partial-application-is-dependency-injection">partial application as an analogy to dependency injection</a>, but in order to enable potentially impure operations to take place, everything must happen inside of some monad. While the production code must ultimately run in the <code>IO</code> monad in order to interact with a database, tests can choose to run in another monad. </p> <p> In the C# example, two dependencies are injected into the class that defines the <code>Post</code> method. In the above Haskell function, these two dependencies are instead passed as function arguments. Notice that both functions return values in the monad <code>m</code>. </p> <p> The intent of the <code>lookupUser</code> argument is that it'll query a database with a user ID. It'll return the user if present, but it could also return a <code>UserLookupError</code>, which is a simple <a href="https://en.wikipedia.org/wiki/Tagged_union">sum type</a>: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;UserLookupError&nbsp;=&nbsp;InvalidId&nbsp;|&nbsp;NotFound&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>)</pre> </p> <p> If both users are found, the function connects the users and calls the <code>updateUser</code> function argument. The intent of this 'dependency' is that it updates the database. 
This is recognisably a <a href="https://en.wikipedia.org/wiki/Command%E2%80%93query_separation">Command</a>, since its return type is <code>m ()</code> - <a href="/2018/01/15/unit-isomorphisms"><em>unit</em> (<code>()</code>) is equivalent to <code>void</code></a>. </p> <h3 id="6855a96abefe47c3a924b6d7d94e7bc8"> State-based testing <a href="#6855a96abefe47c3a924b6d7d94e7bc8" title="permalink">#</a> </h3> <p> How do you unit test such a function? How do you use Mocks and Stubs in Haskell? You don't; you don't have to. While the <code>post</code> method <em>can</em> be impure (when <code>m</code> is <code>IO</code>), it doesn't have to be. Functional design is intrinsically testable, but that proposition depends on purity. Thus, it's worth figuring out how to keep the <code>post</code> function <a href="https://en.wikipedia.org/wiki/Pure_function">pure</a> in the context of unit testing. </p> <p> While <code>IO</code> implies impurity, most common monads are pure. Which one should you choose? You could attempt to entirely 'erase' the monadic quality of the <code>post</code> function with the <code>Identity</code> monad, but if you do that, you can't verify whether or not <code>updateUser</code> was invoked. </p> <p> While you <em>could</em> write an ad-hoc Mock using, for example, the <code>Writer</code> monad, it might be a better choice to investigate if something closer to state-based testing would be possible. </p> <p> In an object-oriented context, state-based testing implies that you exercise the SUT, which mutates some state, and then you verify that the (mutated) state matches your expectations. You can't do that when you test a pure function, but you can examine the state of the function's return value. The <code>State</code> monad is an obvious choice, then. 
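</p> <p> To make that choice concrete, here's a minimal sketch that runs the same monad-polymorphic function in both <code>Identity</code> and <code>State</code>. The <code>notify</code> function and its <code>send</code> argument are hypothetical stand-ins for <code>post</code> and <code>updateUser</code>; they aren't part of the article's code base, and the sketch assumes that the <code>mtl</code> package is available. </p>

```haskell
import Control.Monad.State (State, modify, runState)
import Data.Functor.Identity (runIdentity)

-- Hypothetical stand-in for 'post': it calls an injected Command
-- ('send') and returns a value, all inside some monad m.
notify :: Monad m => (String -> m ()) -> String -> m Int
notify send msg = do
  send msg
  return (length msg)

-- Identity erases the monadic context. The Command has nowhere to
-- leave a trace, so only the return value can be inspected.
viaIdentity :: Int
viaIdentity = runIdentity (notify (\_ -> return ()) "hello")

-- A State-based Command records each message in a list.
record :: String -> State [String] ()
record msg = modify (msg :)

-- runState returns the result together with the final state, so a
-- test can examine both.
viaState :: (Int, [String])
viaState = runState (notify record "hello") []
```

<p> Both expressions produce the same return value, but only <code>viaState</code> also carries evidence that the Command was invoked. That's the property the tests in the rest of this article rely on. 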
</p> <h3 id="6d9ce24e5e374c5d9987ac481b1fbd37"> A Fake database <a href="#6d9ce24e5e374c5d9987ac481b1fbd37" title="permalink">#</a> </h3> <p> Haskell's <code>State</code> monad is parametrised on the state type as well as the normal 'value type', so in order to be able to test the <code>post</code> function, you'll have to figure out what type of state to use. The interactions implied by the <code>post</code> function's <code>lookupUser</code> and <code>updateUser</code> arguments are those of database interactions. A <a href="http://xunitpatterns.com/Fake%20Object.html">Fake</a> database seems an obvious choice. </p> <p> For the purposes of testing the <code>post</code> function, an in-memory database implemented using a <code>Map</code> is appropriate: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;DB&nbsp;=&nbsp;Map&nbsp;Integer&nbsp;User</pre> </p> <p> This is simply a dictionary keyed by <code>Integer</code> values and containing <code>User</code> values. You can implement compatible <code>lookupUser</code> and <code>updateUser</code> functions with <code>State DB</code> as the <code>Monad</code>. The <code>updateUser</code> function is the easiest one to implement: </p> <p> <pre><span style="color:#2b91af;">updateUser</span>&nbsp;::&nbsp;<span style="color:blue;">User</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">State</span>&nbsp;<span style="color:blue;">DB</span>&nbsp;() updateUser&nbsp;user&nbsp;=&nbsp;modify&nbsp;$&nbsp;Map.insert&nbsp;(userId&nbsp;user)&nbsp;user</pre> </p> <p> This simply inserts the <code>user</code> into the database, using the <code>userId</code> as the key. The type of the function is compatible with the general requirement of <code>User -&gt; m ()</code>, since here, <code>m</code> is <code>State DB</code>. 
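</p> <p> As a quick illustration, you can run the Fake <code>updateUser</code> against an empty database and inspect the result. This is only a sketch: the <code>User</code> record below is an assumed shape (the real definition lives in the source code repository), and <code>alice</code> is made-up sample data. </p>

```haskell
import Control.Monad.State (State, execState, modify)
import Data.Map (Map)
import qualified Data.Map as Map

-- Assumed shape of User; the article doesn't show the real definition.
data User = User
  { userId :: Integer
  , connectedUsers :: [User]
  } deriving (Show, Eq)

type DB = Map Integer User

-- The same Fake as shown above.
updateUser :: User -> State DB ()
updateUser user = modify $ Map.insert (userId user) user

-- Hypothetical sample data.
alice :: User
alice = User { userId = 1, connectedUsers = [] }

-- execState discards the () result and returns the final database.
dbAfterUpdate :: DB
dbAfterUpdate = execState (updateUser alice) Map.empty

-- Updating an existing key replaces the stored value.
dbAfterSecondUpdate :: DB
dbAfterSecondUpdate =
  execState (updateUser alice { connectedUsers = [User 2 []] }) dbAfterUpdate
```

<p> Since <code>Map.insert</code> overwrites an existing key, the same function covers both 'create' and 'update', which is all the tests need from a Fake database. 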
</p> <p> The <code>lookupUser</code> Fake implementation is a bit more involved: </p> <p> <pre><span style="color:#2b91af;">lookupUser</span>&nbsp;::&nbsp;<span style="color:#2b91af;">String</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">State</span>&nbsp;<span style="color:blue;">DB</span>&nbsp;(<span style="color:#2b91af;">Either</span>&nbsp;<span style="color:blue;">UserLookupError</span>&nbsp;<span style="color:blue;">User</span>) lookupUser&nbsp;s&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;maybeInt&nbsp;=&nbsp;readMaybe&nbsp;s&nbsp;::&nbsp;Maybe&nbsp;Integer &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;eitherInt&nbsp;=&nbsp;<span style="color:blue;">maybe</span>&nbsp;(Left&nbsp;InvalidId)&nbsp;Right&nbsp;maybeInt &nbsp;&nbsp;db&nbsp;&lt;-&nbsp;get &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;eitherInt&nbsp;&gt;&gt;=&nbsp;<span style="color:blue;">maybe</span>&nbsp;(Left&nbsp;NotFound)&nbsp;Right&nbsp;.&nbsp;<span style="color:blue;">flip</span>&nbsp;Map.<span style="color:blue;">lookup</span>&nbsp;db</pre> </p> <p> First, consider the type. The function takes a <code>String</code> value as an argument and returns a <code>State DB (Either UserLookupError User)</code>. The requirement is a function compatible with the type <code>a -&gt; m (Either UserLookupError User)</code>. This works when <code>a</code> is <code>String</code> and <code>m</code> is, again, <code>State DB</code>. </p> <p> The entire function is written in <code>do</code> notation, where the inferred <code>Monad</code> is, indeed, <code>State DB</code>. The first line attempts to parse the <code>String</code> into an <code>Integer</code>. 
Since the built-in <code>readMaybe</code> function returns a <code>Maybe Integer</code>, the next line uses the <code>maybe</code> function to handle the two possible cases, converting the <code>Nothing</code> case into the <code>Left InvalidId</code> value, and the <code>Just</code> case into a <code>Right</code> value. </p> <p> It then uses the <code>State</code> module's <code>get</code> function to access the database <code>db</code>, and finally attempts a <code>lookup</code> against that <code>Map</code>. Again, <code>maybe</code> is used to convert the <code>Maybe</code> value returned by <code>Map.lookup</code> into an <code>Either</code> value. </p> <h3 id="280bfa2e191e42d095f6f762e0a94a55"> Happy path test case <a href="#280bfa2e191e42d095f6f762e0a94a55" title="permalink">#</a> </h3> <p> This is all you need in terms of <a href="http://xunitpatterns.com/Test%20Double.html">Test Doubles</a>. You now have test-specific <code>lookupUser</code> and <code>updateUser</code> functions that you can pass to the <code>post</code> function. 
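</p> <p> Before wiring the Test Doubles into <code>post</code>, it can help to exercise the Fake <code>lookupUser</code> on its own. The following sketch repeats the function from above and evaluates it against a small database. The <code>User</code> record is an assumed shape, and <code>sampleDB</code> is made-up data for illustration. </p>

```haskell
import Control.Monad.State (State, evalState, get)
import Data.Map (Map)
import qualified Data.Map as Map
import Text.Read (readMaybe)

data UserLookupError = InvalidId | NotFound deriving (Show, Eq)

-- Assumed shape of User; the article doesn't show the real definition.
data User = User
  { userId :: Integer
  , connectedUsers :: [User]
  } deriving (Show, Eq)

type DB = Map Integer User

-- The same Fake as shown above.
lookupUser :: String -> State DB (Either UserLookupError User)
lookupUser s = do
  let maybeInt = readMaybe s :: Maybe Integer
  let eitherInt = maybe (Left InvalidId) Right maybeInt
  db <- get
  return $ eitherInt >>= maybe (Left NotFound) Right . flip Map.lookup db

-- Hypothetical sample database with a single user.
sampleDB :: DB
sampleDB = Map.fromList [(1, User 1 [])]

-- evalState runs the computation and keeps only the result.
found, missing, invalid :: Either UserLookupError User
found   = evalState (lookupUser "1") sampleDB   -- ID parses and exists
missing = evalState (lookupUser "2") sampleDB   -- ID parses, but no such key
invalid = evalState (lookupUser "foo") sampleDB -- ID doesn't parse
```

<p> The three values correspond to the three outcomes that <code>post</code> has to handle: a <code>Right</code> user, <code>Left NotFound</code>, and <code>Left InvalidId</code>. 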
</p> <p> Like in the previous article, you can start by exercising the happy path where a user successfully connects with another user: </p> <p> <pre>testProperty&nbsp;<span style="color:#a31515;">&quot;Users&nbsp;successfully&nbsp;connect&quot;</span>&nbsp;$&nbsp;\ &nbsp;&nbsp;user&nbsp;otherUser&nbsp;-&gt;&nbsp;runStateTest&nbsp;$&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;put&nbsp;$&nbsp;Map.fromList&nbsp;[toDBEntry&nbsp;user,&nbsp;toDBEntry&nbsp;otherUser] &nbsp;&nbsp;actual&nbsp;&lt;-&nbsp;post&nbsp;lookupUser&nbsp;updateUser&nbsp;(<span style="color:blue;">show</span>&nbsp;$&nbsp;userId&nbsp;user)&nbsp;(<span style="color:blue;">show</span>&nbsp;$&nbsp;userId&nbsp;otherUser) &nbsp;&nbsp;db&nbsp;&lt;-&nbsp;get &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;&nbsp;&nbsp;&nbsp;isOK&nbsp;actual&nbsp;&amp;&amp; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">any</span>&nbsp;(<span style="color:blue;">elem</span>&nbsp;otherUser&nbsp;.&nbsp;connectedUsers)&nbsp;(Map.<span style="color:blue;">lookup</span>&nbsp;(userId&nbsp;user)&nbsp;db)</pre> </p> <p> Here I'm <a href="/2018/05/07/inlined-hunit-test-lists">inlining test cases as anonymous functions</a> - this time expressing the tests as QuickCheck properties. I'll later return to the <code>runStateTest</code> helper function, but first I want to focus on the test body itself. It's written in <code>do</code> notation, and specifically, it runs in the <code>State DB</code> monad. </p> <p> <code>user</code> and <code>otherUser</code> are input arguments to the property. These are both <code>User</code> values, since the test also defines <code>Arbitrary</code> instances for that type (not shown in this article; see the source code repository for details). </p> <p> The first step in the test is to 'save' both users in the Fake database. 
This is easily done by converting each <code>User</code> value to a database entry: </p> <p> <pre><span style="color:#2b91af;">toDBEntry</span>&nbsp;::&nbsp;<span style="color:blue;">User</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;(<span style="color:#2b91af;">Integer</span>,&nbsp;<span style="color:blue;">User</span>)
toDBEntry&nbsp;=&nbsp;userId&nbsp;&amp;&amp;&amp;&nbsp;<span style="color:blue;">id</span></pre> </p> <p> Recall that the Fake database is nothing but an alias over <code>Map Integer User</code>, so the only operation required to turn a <code>User</code> into a database entry is to extract the key and pair it with the value. The <code>&amp;&amp;&amp;</code> operator from <code>Control.Arrow</code> does that by applying both <code>userId</code> and <code>id</code> to the same input and returning the two results as a tuple. </p> <p> The next step in the test is to exercise the SUT, passing the test-specific <code>lookupUser</code> and <code>updateUser</code> Test Doubles to the <code>post</code> function, together with the user IDs converted to <code>String</code> values. </p> <p> In the <em>assert</em> phase, the test first extracts the current state of the database, using the <code>State</code> library's built-in <code>get</code> function. It then verifies that <code>actual</code> represents a <code>200 OK</code> value, and that the <code>user</code> entry in the database now contains <code>otherUser</code> as a connected user. </p> <h3 id="050ecaa9962a487c9381da982ab264e7"> Missing user test case <a href="#050ecaa9962a487c9381da982ab264e7" title="permalink">#</a> </h3> <p> While there's one happy-path test case, there are four other test cases left. 
One of these is when the first user doesn't exist: </p> <p> <pre>testProperty&nbsp;<span style="color:#a31515;">&quot;Users&nbsp;don&#39;t&nbsp;connect&nbsp;when&nbsp;user&nbsp;doesn&#39;t&nbsp;exist&quot;</span>&nbsp;$&nbsp;\
&nbsp;&nbsp;(Positive&nbsp;i)&nbsp;otherUser&nbsp;-&gt;&nbsp;runStateTest&nbsp;$&nbsp;<span style="color:blue;">do</span>
&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;db&nbsp;=&nbsp;Map.fromList&nbsp;[toDBEntry&nbsp;otherUser]
&nbsp;&nbsp;put&nbsp;db
&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;uniqueUserId&nbsp;=&nbsp;<span style="color:blue;">show</span>&nbsp;$&nbsp;userId&nbsp;otherUser&nbsp;+&nbsp;i
&nbsp;&nbsp;actual&nbsp;&lt;-&nbsp;post&nbsp;lookupUser&nbsp;updateUser&nbsp;uniqueUserId&nbsp;(<span style="color:blue;">show</span>&nbsp;$&nbsp;userId&nbsp;otherUser)
&nbsp;&nbsp;assertPostFailure&nbsp;db&nbsp;actual</pre> </p> <p> What ought to trigger this test case is that the 'first' user doesn't exist, even if the <code>otherUser</code> does exist. For this reason, the test inserts the <code>otherUser</code> into the Fake database. </p> <p> Since the test is a QuickCheck property, <code>i</code> could be <em>any</em> positive <code>Integer</code> value - <em>including</em> the <code>userId</code> of <code>otherUser</code>. In order to properly exercise the test case, however, you'll need to call the <code>post</code> function with a <code>uniqueUserId</code> - that is: an ID which is guaranteed to not be equal to the <code>userId</code> of <code>otherUser</code>. There are several options for achieving this guarantee (including, as you'll see soon, the <code>==&gt;</code> operator), but a simple way is to add a non-zero number to the number you need to avoid. 
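</p> <p> The arithmetic behind that trick can be checked in isolation. The following sketch uses only the base library; <code>uniqueIdFrom</code> is a hypothetical helper that isn't part of the article's code base, and the exhaustive check over a small range merely stands in for a QuickCheck property. </p>

```haskell
-- Hypothetical helper: project a strictly positive offset onto the ID
-- that must be avoided, and render the result as a String ID.
uniqueIdFrom :: Integer -> Integer -> String
uniqueIdFrom avoid offset = show (avoid + offset)

-- For every ID in a small range and every positive offset, the
-- projected ID differs from the ID to avoid.
neverCollides :: Bool
neverCollides =
  and [ uniqueIdFrom x i /= show x | x <- [-20 .. 20], i <- [1 .. 20] ]

-- An offset of zero reproduces the original ID, which is why the
-- property asks QuickCheck for Positive values.
collidesAtZero :: Bool
collidesAtZero = uniqueIdFrom 7 0 == show 7
```

<p> Projecting a seed value like this never discards generated input, whereas filtering can make QuickCheck throw away test cases that don't meet the condition. 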
</p> <p> You then exercise the <code>post</code> function and, as a verification, call a reusable <code>assertPostFailure</code> function: </p> <p> <pre><span style="color:#2b91af;">assertPostFailure</span>&nbsp;::&nbsp;(<span style="color:blue;">Eq</span>&nbsp;s,&nbsp;<span style="color:blue;">Monad</span>&nbsp;m)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;s&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">HttpResponse</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">StateT</span>&nbsp;s&nbsp;m&nbsp;<span style="color:#2b91af;">Bool</span> assertPostFailure&nbsp;stateBefore&nbsp;resp&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;stateAfter&nbsp;&lt;-&nbsp;get &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;stateDidNotChange&nbsp;=&nbsp;stateBefore&nbsp;==&nbsp;stateAfter &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;stateDidNotChange&nbsp;&amp;&amp;&nbsp;isBadRequest&nbsp;resp</pre> </p> <p> This function verifies that the state of the database didn't change, and that the response value represents a <code>400 Bad Request</code> HTTP response. This verification doesn't actually verify that the error message associated with the <code>BadRequest</code> case is the expected message, like in the previous article. This would, however, involve a fairly trivial change to the code. </p> <h3 id="dbd934d3e4f7431d95d78dfb0c1d7f4e"> Missing other user test case <a href="#dbd934d3e4f7431d95d78dfb0c1d7f4e" title="permalink">#</a> </h3> <p> Similar to the above test case, users will also fail to connect if the 'other user' doesn't exist. 
The property is almost identical: </p> <p> <pre>testProperty&nbsp;<span style="color:#a31515;">&quot;Users&nbsp;don&#39;t&nbsp;connect&nbsp;when&nbsp;other&nbsp;user&nbsp;doesn&#39;t&nbsp;exist&quot;</span>&nbsp;$&nbsp;\ &nbsp;&nbsp;(Positive&nbsp;i)&nbsp;user&nbsp;-&gt;&nbsp;runStateTest&nbsp;$&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp; &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;db&nbsp;=&nbsp;Map.fromList&nbsp;[toDBEntry&nbsp;user] &nbsp;&nbsp;put&nbsp;db &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;uniqueOtherUserId&nbsp;=&nbsp;<span style="color:blue;">show</span>&nbsp;$&nbsp;userId&nbsp;user&nbsp;+&nbsp;i &nbsp;&nbsp;actual&nbsp;&lt;-&nbsp;post&nbsp;lookupUser&nbsp;updateUser&nbsp;(<span style="color:blue;">show</span>&nbsp;$&nbsp;userId&nbsp;user)&nbsp;uniqueOtherUserId &nbsp;&nbsp;assertPostFailure&nbsp;db&nbsp;actual</pre> </p> <p> Since this test body is so similar to the previous test, I'm not going to give you a detailed walkthrough. I did, however, promise to describe the <code>runStateTest</code> helper function: </p> <p> <pre><span style="color:#2b91af;">runStateTest</span>&nbsp;::&nbsp;<span style="color:blue;">State</span>&nbsp;(<span style="color:blue;">Map</span>&nbsp;k&nbsp;a)&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;b runStateTest&nbsp;=&nbsp;<span style="color:blue;">flip</span>&nbsp;evalState&nbsp;Map.empty</pre> </p> <p> Since this is a one-liner, you could also write all the tests by simply in-lining that little expression, but I thought that it made the tests more readable to give this function an explicit name. </p> <p> It takes any <code>State (Map k a) b</code> and runs it with an empty map. Thus, all <code>State</code>-valued functions, like the tests, must explicitly put data into the state. This is also what the tests do. </p> <p> Notice that all the tests return <code>State</code> values. 
For example, the <code>assertPostFailure</code> function returns <code>StateT s m Bool</code>, of which <code>State s Bool</code> is an alias. This fits <code>State (Map k a) b</code> when <code>s</code> is <code>Map k a</code>, which again is aliased to <code>DB</code>. Reducing all of this, the tests are simply functions that return <code>Bool</code>. </p> <h3 id="bc67a5ebaf7347ffa4906b209c31d156"> Invalid user ID test cases <a href="#bc67a5ebaf7347ffa4906b209c31d156" title="permalink">#</a> </h3> <p> Finally, you can also cover the two test cases where one of the user IDs is invalid: </p> <p> <pre>testProperty&nbsp;<span style="color:#a31515;">&quot;Users&nbsp;don&#39;t&nbsp;connect&nbsp;when&nbsp;user&nbsp;Id&nbsp;is&nbsp;invalid&quot;</span>&nbsp;$&nbsp;\ &nbsp;&nbsp;s&nbsp;otherUser&nbsp;-&gt;&nbsp;isIdInvalid&nbsp;s&nbsp;==&gt;&nbsp;runStateTest&nbsp;$&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;db&nbsp;=&nbsp;Map.fromList&nbsp;[toDBEntry&nbsp;otherUser] &nbsp;&nbsp;put&nbsp;db &nbsp;&nbsp;actual&nbsp;&lt;-&nbsp;post&nbsp;lookupUser&nbsp;updateUser&nbsp;s&nbsp;(<span style="color:blue;">show</span>&nbsp;$&nbsp;userId&nbsp;otherUser) &nbsp;&nbsp;assertPostFailure&nbsp;db&nbsp;actual , testProperty&nbsp;<span style="color:#a31515;">&quot;Users&nbsp;don&#39;t&nbsp;connect&nbsp;when&nbsp;other&nbsp;user&nbsp;Id&nbsp;is&nbsp;invalid&quot;</span>&nbsp;$&nbsp;\ &nbsp;&nbsp;s&nbsp;user&nbsp;-&gt;&nbsp;isIdInvalid&nbsp;s&nbsp;==&gt;&nbsp;runStateTest&nbsp;$&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;db&nbsp;=&nbsp;Map.fromList&nbsp;[toDBEntry&nbsp;user] &nbsp;&nbsp;put&nbsp;db &nbsp;&nbsp;actual&nbsp;&lt;-&nbsp;post&nbsp;lookupUser&nbsp;updateUser&nbsp;(<span style="color:blue;">show</span>&nbsp;$&nbsp;userId&nbsp;user)&nbsp;s &nbsp;&nbsp;assertPostFailure&nbsp;db&nbsp;actual</pre> </p> <p> Both of these properties take a <code>String</code> value <code>s</code> as 
input. When QuickCheck generates a <code>String</code>, that could be any <code>String</code> value. Both tests require that the value is an invalid user ID. Specifically, it mustn't be possible to parse the string into an <code>Integer</code>. If you don't constrain QuickCheck, it'll generate various strings, including e.g. <code>"8"</code> and other strings that can be parsed as numbers. </p> <p> In the above <code>"Users don't connect when user doesn't exist"</code> test, you saw how one way to explicitly model constraints on data is to project a seed value in such a way that the constraint always holds. Another way is to use QuickCheck's built-in <code>==&gt;</code> operator to filter out undesired values. In this example, both tests employ the <code>isIdInvalid</code> function: </p> <p> <pre><span style="color:#2b91af;">isIdInvalid</span>&nbsp;::&nbsp;<span style="color:#2b91af;">String</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Bool</span> isIdInvalid&nbsp;s&nbsp;= &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;userInt&nbsp;=&nbsp;readMaybe&nbsp;s&nbsp;::&nbsp;Maybe&nbsp;Integer &nbsp;&nbsp;<span style="color:blue;">in</span>&nbsp;isNothing&nbsp;userInt</pre> </p> <p> Using <code>isIdInvalid</code> with the <code>==&gt;</code> operator guarantees that <code>s</code> is an invalid ID. </p> <h3 id="5d89ffd490454c0890dedff39c2f0852"> Summary <a href="#5d89ffd490454c0890dedff39c2f0852" title="permalink">#</a> </h3> <p> While state-based testing may, at first, sound incompatible with strictly functional programming, it's not only possible with the State monad, but even, with good language support, easily done. </p> <p> The tests shown in this article aren't concerned with the interactions between the SUT and its dependencies. Instead, they compare the initial state with the state after exercising the SUT. Comparing values, even complex data structures such as maps, tends to be trivial in functional programming. 
Immutable values typically have built-in structural equality (in Haskell signified by the automatic <code>Eq</code> type class), which makes comparing them trivial. </p> <p> Now that we know that state-based testing is possible even with Haskell's enforced purity, it should be clear that we can repeat the feat in F#. </p> <p> <strong>Next:</strong> <a href="/2019/03/25/an-example-of-state-based-testing-in-f">An example of state-based testing in F#</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Code quality isn't software quality https://blog.ploeh.dk/2019/03/04/code-quality-is-not-software-quality 2019-03-04T07:38:00+00:00 Mark Seemann <div id="post"> <p> <em>A trivial observation made explicit.</em> </p> <p> You'd think that it's evident that code quality and software quality are two different things. Yet, I often see or hear arguments about one or the other that indicate to me that some people don't make that distinction. I wonder why; I do. </p> <h3 id="56d219ddf746496c94b28d22def9a182"> Software quality <a href="#56d219ddf746496c94b28d22def9a182" title="permalink">#</a> </h3> <p> There's a school of thought leaders who advocate that, ultimately, we write code to solve problems, or to improve life, for people. I have nothing against that line of reasoning; it's just not one that I pursue much. Why should I use my energy on this message when someone like <a href="https://dannorth.net">Dan North</a> does it so much better than I could? </p> <p> Dan North is far from the only person making the point that our employers, or clients, or end-users don't care about the code; he is, in my opinion, one of the best communicators in that field. It makes sense that, with that perspective on software development, you'd invent something like <a href="https://en.wikipedia.org/wiki/Behavior-driven_development">behaviour-driven development</a>. 
</p> <p> The evaluation criterion used in this discourse is one of utility. Does the software serve a purpose? Does it do it well? </p> <p> In that light, <em>quality software</em> is software that serves its purpose beyond expectation. It rarely, if ever, crashes. It's easy to use. It's sufficiently responsive. It's pretty. It works both on-line and off-line. Attributes like that are externally observable qualities. </p> <p> You can write quality software in many different languages, using various styles. When you evaluate the externally observable qualities of software, the code is invisible. It's not part of the evaluation. </p> <p> It seems to me that some people try to draw an erroneous conclusion from this premise. They'd say that since no employer, client, or end user evaluates the software based on the code that produced it, no one cares about the code. </p> <h3 id="6bfca38c59a543b485fb2658ec86a615"> Code quality <a href="#6bfca38c59a543b485fb2658ec86a615" title="permalink">#</a> </h3> <p> It's easy to refute that argument. All you have to do is to come up with a counter-example. You just have to find <em>one</em> person who cares about the code. That's easy. </p> <p> <em>You</em> care about the code. </p> <p> Perhaps you react negatively to that assertion. Perhaps you say: <em>"No! I'm not one of those effete aesthetes who <a href="http://www.sandraandwoo.com/2015/12/24/0747-melodys-guide-to-programming-languages">only program in Plankalkül</a>."</em> Fine. Maybe you're not the type who likes to polish the code; maybe you're the practical, down-to-earth type who just likes to get stuff done, so that your employer/client/end-user is happy. </p> <p> Even so, I think that you still care about the code. Have you ever looked with bewilderment at a piece of code and thought: <em>"Who the hell wrote this piece of shit!?"</em> How many <a href="https://www.osnews.com/story/19266/wtfsm/">WTFs/m</a> is your code? 
</p> <p> I think every programmer cares about their code bases; if not in an active manner, then at least in a passive way. Bad code can seriously impede progress. I've seen more than one organisation effectively go out of business because of bad legacy code. </p> <p> Code quality is when you care about the readability and malleability of the code. It's when you care about the code's ability to <em>sustain</em> the business, not only today, but also in the future. </p> <h3 id="19f49a7c758947d9b9a8b4c52e2f8e8d"> Sustainable code <a href="#19f49a7c758947d9b9a8b4c52e2f8e8d" title="permalink">#</a> </h3> <p> I often get the impression that some people look at code quality and software quality as a (false) dichotomy. </p> <p> <img src="/content/binary/software-vs-code-quality-false-dichotomy.png" alt="Software quality versus code quality as a false dichotomy."> </p> <p> Such arguments often seem to imply that you can't have one without sacrificing the other. You must choose. </p> <p> The reality is, of course, that you can do both. </p> <p> <img src="/content/binary/software-code-quality-venn.png" alt="Software and code quality Venn diagram."> </p> <p> At the intersection between software and code quality the code sustains the business both now, and in the future. </p> <p> Yes, you should write code such that it produces software that provides value here and now, but you should also do your best to enable it to provide value in the future. This is <em>sustainable code</em>. It's code that can sustain the organisation during its lifetime. </p> <h3 id="1b88784a7a8a4ce5b44334a5ef474a85"> No gold-plating <a href="#1b88784a7a8a4ce5b44334a5ef474a85" title="permalink">#</a> </h3> <p> To be clear: this is not a call for <a href="https://en.wikipedia.org/wiki/Gold_plating_(project_management)">gold plating</a> or <a href="http://wiki.c2.com/?SpeculativeGenerality">speculative generality</a>. You probably can't predict the future needs of the stake-holders. 
</p> <p> Quality code doesn't have to be able to perfectly address all future requirements. In order to be sustainable, though, it should be easy to modify in the future, or perhaps just easy to throw away and rewrite. I think a good start is to write <a href="https://cleancoders.com/episode/humane-code-real-episode-1/show">humane code</a>; code that fits in your brain. </p> <p> At least, do your best to avoid writing legacy code. </p> <h3 id="74a27b6da02840919eed541b1c93f92f"> Summary <a href="#74a27b6da02840919eed541b1c93f92f" title="permalink">#</a> </h3> <p> Software quality and code quality can co-exist. You can write quality code that compiles to quality software, but one doesn't imply the other. These are two independent quality dimensions. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. An example of interaction-based testing in C# https://blog.ploeh.dk/2019/02/25/an-example-of-interaction-based-testing-in-c 2019-02-25T05:42:00+00:00 Mark Seemann <div id="post"> <p> <em>An example of using Mocks and Stubs for unit testing in C#.</em> </p> <p> This article is an instalment in an article series about how to move <a href="/2019/02/18/from-interaction-based-to-state-based-testing">from interaction-based testing to state-based testing</a>. In this series, you'll be presented with some alternatives to interaction-based testing with Mocks and Stubs. Before we reach the alternatives, however, we need to establish an example of interaction-based testing, so that you have something against which you can compare those alternatives. In this article, I'll present a simple example, in the form of C# code. </p> <p> The code shown in this article is <a href="https://github.com/ploeh/UserManagement">available on GitHub</a>. 
</p> <h3 id="7cc38b5dd1bc44c6aacb5077ae65c288"> Connect two users <a href="#7cc38b5dd1bc44c6aacb5077ae65c288" title="permalink">#</a> </h3> <p> For the example, I'll use a simplified version of the example that runs through my two <a href="https://cleancoders.com">Clean Coders</a> videos, <a href="https://cleancoders.com/episode/humane-code-real-episode-4/show">Church Visitor</a> and <a href="https://cleancoders.com/episode/humane-code-real-episode-5/show">Preserved in translation</a>. </p> <p> The desired functionality is simple: implement a REST API that enables one user to connect to another user. You could imagine some sort of social media platform, or essentially any sort of online service where users might be interested in connecting with, or following, other users. </p> <p> In essence, you could imagine that a user interface makes an HTTP POST request against our REST API: </p> <p> <pre>POST /connections/42 HTTP/1.1 Content-Type: application/json { "otherUserId": 1337 }</pre> </p> <p> Let's further imagine that we implement the desired functionality with a C# method with this signature: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">IHttpActionResult</span>&nbsp;Post(<span style="color:blue;">string</span>&nbsp;userId,&nbsp;<span style="color:blue;">string</span>&nbsp;otherUserId)</pre> </p> <p> We'll return to the implementation later, but I want to point out a few things. </p> <p> First, notice that both <code>userId</code> and <code>otherUserId</code> are <code>string</code> arguments. While the above example encodes both IDs as numbers, essentially, both URLs and JSON are text-based. Following <a href="https://en.wikipedia.org/wiki/Robustness_principle">Postel's law</a>, the method should also accept JSON like <code>{ "otherUserId": "1337" }</code>. That's the reason the <code>Post</code> method takes <code>string</code> arguments instead of <code>int</code> arguments. 
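</p> <p> To make the consequence of that decision concrete, here's a minimal sketch of how a string ID might be checked before any lookup takes place. It assumes that IDs are integers, and <code>UserIdParser</code> is a hypothetical helper that isn't part of the article's code base: </p> <p> <pre>using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of the UserManagement code base.
public static class UserIdParser
{
    // Returns true, and the parsed number, only when the text represents an integer.
    public static bool TryParse(string candidate, out int id)
    {
        return int.TryParse(
            candidate,
            NumberStyles.Integer,
            CultureInfo.InvariantCulture,
            out id);
    }
}

public static class Program
{
    public static void Main()
    {
        // Both a URL segment and a JSON string like &quot;1337&quot; arrive as text.
        Console.WriteLine(UserIdParser.TryParse(&quot;1337&quot;, out var id)); // True
        Console.WriteLine(id);                                        // 1337
        Console.WriteLine(UserIdParser.TryParse(&quot;foo&quot;, out _));       // False
    }
}</pre> </p> <p> A string such as <code>"foo"</code> fails to parse, which is the situation that the invalid-ID test cases in this article cover.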
</p> <p> Second, the return type is <code>IHttpActionResult</code>. Don't worry if you don't know that interface. It's just a way to model HTTP responses, such as <code>200 OK</code> or <code>400 Bad Request</code>. </p> <p> Depending on the input values, and the state of the application, several outcomes are possible: <table> <col> <col> <colgroup span="3"></colgroup> <thead> <tr> <td colspan="2" rowspan="2"></td> <th colspan="3" scope="colgroup">Other user</th> </tr> <tr> <th scope="col">Found</th> <th scope="col">Not found</th> <th scope="col">Invalid</th> </tr> </thead> <tbody> <tr> <th rowspan="3" scope="rowgroup">User</th> <th scope="row">Found</th> <td><em>Other user</em></td> <td><code>"Other user not found."</code></td> <td><code>"Invalid ID for other user."</code></td> </tr> <tr> <th scope="row">Not found</th> <td><code>"User not found."</code></td> <td><code>"User not found."</code></td> <td><code>"User not found."</code></td> </tr> <tr> <th scope="row">Invalid</th> <td><code>"Invalid user ID."</code></td> <td><code>"Invalid user ID."</code></td> <td><code>"Invalid user ID."</code></td> </tr> </tbody> </table> You'll notice that although this is a 3x3 matrix, there's only five distinct outcomes. This is just an implementation decision. If the first user ID is invalid (e.g. if it's a string like <code>"foo"</code> that doesn't represent a number), then it doesn't matter if the other user exists. Likewise, even if the first user ID is well-formed, it might still be the case that no user with that ID exists in the database. </p> <p> The assumption here is that the underlying user database uses integers as row IDs. </p> <p> When both users are found, the other user should be returned in the HTTP response, like this: </p> <p> <pre>HTTP/1.1 200 OK Content-Type: application/json { "id": 1337, "name": "ploeh", "connections": [{ "id": 42, "name": "fnaah" }, { "id": 2112, "name": "ndøh" }] }</pre> </p> <p> The intent is that when the first user (e.g. 
the one with the <code>42</code> ID) successfully connects to user <em>1337</em>, a user interface can show the full details of the other user, including the other user's connections. </p> <h3 id="2f013c5e86ef4abda25de01c06a90f25"> Happy path test case <a href="#2f013c5e86ef4abda25de01c06a90f25" title="permalink">#</a> </h3> <p> Since there's five distinct outcomes, you ought to write at least five test cases. You could start with the happy-path case, where both user IDs are well-formed and the users exist. </p> <p> All tests in this article use <a href="https://xunit.github.io">xUnit.net</a> 2.3.1, <a href="https://github.com/moq/moq4">Moq</a> 4.8.1, and <a href="https://github.com/AutoFixture/AutoFixture">AutoFixture</a> 4.1.0. </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">UserManagementTestConventions</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;UsersSuccessfullyConnect( &nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#2b91af;">Frozen</span>]<span style="color:#2b91af;">Mock</span>&lt;<span style="color:#2b91af;">IUserReader</span>&gt;&nbsp;readerTD, &nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#2b91af;">Frozen</span>]<span style="color:#2b91af;">Mock</span>&lt;<span style="color:#2b91af;">IUserRepository</span>&gt;&nbsp;repoTD, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">User</span>&nbsp;user, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">User</span>&nbsp;otherUser, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">ConnectionsController</span>&nbsp;sut) { &nbsp;&nbsp;&nbsp;&nbsp;readerTD &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Setup(r&nbsp;=&gt;&nbsp;r.Lookup(user.Id.ToString())) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Returns(<span style="color:#2b91af;">Result</span>.Success&lt;<span style="color:#2b91af;">User</span>,&nbsp;<span style="color:#2b91af;">IUserLookupError</span>&gt;(user)); &nbsp;&nbsp;&nbsp;&nbsp;readerTD 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Setup(r&nbsp;=&gt;&nbsp;r.Lookup(otherUser.Id.ToString())) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Returns(<span style="color:#2b91af;">Result</span>.Success&lt;<span style="color:#2b91af;">User</span>,&nbsp;<span style="color:#2b91af;">IUserLookupError</span>&gt;(otherUser)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;sut.Post(user.Id.ToString(),&nbsp;otherUser.Id.ToString()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;ok&nbsp;=&nbsp;<span style="color:#2b91af;">Assert</span>.IsAssignableFrom&lt;<span style="color:#2b91af;">OkNegotiatedContentResult</span>&lt;<span style="color:#2b91af;">User</span>&gt;&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;actual); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(otherUser,&nbsp;ok.Content); &nbsp;&nbsp;&nbsp;&nbsp;repoTD.Verify(r&nbsp;=&gt;&nbsp;r.Update(user)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Contains(otherUser.Id,&nbsp;user.Connections); }</pre> </p> <p> To be clear, as far as Overspecified Software goes, this isn't a bad test. It only has two Test Doubles, <code>readerTD</code> and <code>repoTD</code>. My current habit is to name any Test Double with the <em>TD</em> suffix (for <em>Test Double</em>), instead of explicitly naming them <code>readerStub</code> and <code>repoMock</code>. The latter would have been more correct, though, since the <code>Mock&lt;IUserReader&gt;</code> object is consistently used as a Stub, whereas the <code>Mock&lt;IUserRepository&gt;</code> object is used only as a Mock. This is as it should be, because it follows the rule that you should use <a href="/2013/10/23/mocks-for-commands-stubs-for-queries">Mocks for Commands, Stubs for Queries</a>. 
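</p> <p> If the difference between the two roles seems abstract, a hand-written sketch without Moq may help. All types here (<code>IGreetingReader</code>, <code>IGreetingSender</code>, <code>Greeter</code>) are hypothetical and deliberately simpler than the article's interfaces: </p> <p> <pre>using System;
using System.Collections.Generic;

// Hypothetical, simplified dependencies; not the article's IUserReader or IUserRepository.
public interface IGreetingReader { string Read(int id); }        // Query
public interface IGreetingSender { void Send(string greeting); } // Command

// Stub: feeds a canned answer to the SUT.
public sealed class GreetingReaderStub : IGreetingReader
{
    public string Read(int id) =&gt; &quot;Hello&quot;;
}

// Hand-written recording double: remembers invocations so that a test can
// verify them afterwards, which is the job that Moq's Verify does.
public sealed class GreetingSenderSpy : IGreetingSender
{
    public List&lt;string&gt; Sent { get; } = new List&lt;string&gt;();
    public void Send(string greeting) =&gt; Sent.Add(greeting);
}

public sealed class Greeter
{
    private readonly IGreetingReader reader;
    private readonly IGreetingSender sender;

    public Greeter(IGreetingReader reader, IGreetingSender sender)
    {
        this.reader = reader;
        this.sender = sender;
    }

    public void Greet(int id) =&gt; sender.Send(reader.Read(id) + &quot;, world&quot;);
}

public static class Program
{
    public static void Main()
    {
        var spy = new GreetingSenderSpy();
        var sut = new Greeter(new GreetingReaderStub(), spy);

        sut.Greet(42);

        // The assertion targets the Command only; the Query is never verified.
        Console.WriteLine(spy.Sent.Count == 1 &amp;&amp; spy.Sent[0] == &quot;Hello, world&quot;); // True
    }
}</pre> </p> <p> The Stub only supplies data, while the recording double is the only place where the sketch makes assertions about an interaction.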
</p> <p> <code>IUserRepository.Update</code> is, indeed, a Command: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IUserRepository</span>
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">void</span>&nbsp;Update(<span style="color:#2b91af;">User</span>&nbsp;user);
}</pre> </p> <p> Since the method returns <code>void</code>, unless it doesn't do anything at all, the only thing it can do is to somehow change the state of the system. The test verifies that <code>IUserRepository.Update</code> was invoked with the appropriate input argument. </p> <p> This is fine. </p> <p> I'd like to emphasise that this isn't the biggest problem with this test. A Mock like this verifies that a desired interaction took place. If <code>IUserRepository.Update</code> isn't called in this test case, it would constitute a defect. The software wouldn't have the desired behaviour, so the test ought to fail. </p> <p> The signature of <code>IUserReader.Lookup</code>, on the other hand, implies that it's a Query: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IUserReader</span>
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">IResult</span>&lt;<span style="color:#2b91af;">User</span>,&nbsp;<span style="color:#2b91af;">IUserLookupError</span>&gt;&nbsp;Lookup(<span style="color:blue;">string</span>&nbsp;id);
}</pre> </p> <p> In C# and most other languages, you can't be sure that implementations of the <code>Lookup</code> method have no side effects. If, however, we assume that the code base in question obeys the <a href="http://en.wikipedia.org/wiki/Command%E2%80%93query_separation">Command Query Separation</a> principle, then, by elimination, this must be a Query (since it's not a Command, because the return type isn't <code>void</code>). 
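</p> <p> The reasoning by elimination can be illustrated with a deliberately tiny sketch. <code>Counter</code> is a hypothetical class, not part of the article's code base, that follows the principle by convention; the C# compiler doesn't enforce it: </p> <p> <pre>using System;

// Hypothetical class for illustration only.
public sealed class Counter
{
    private int count;

    // Command: changes state and returns void.
    public void Increment() { count++; }

    // Query: returns a value and leaves state untouched.
    public int Read() { return count; }
}

public static class Program
{
    public static void Main()
    {
        var counter = new Counter();
        counter.Increment();

        // Asking twice gives the same answer, because the Query has no side effects.
        Console.WriteLine(counter.Read() == counter.Read()); // True
    }
}</pre> </p> <p> In a code base that sticks to this convention, the return type alone tells you which kind of method you're looking at.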
</p> <p> For a detailed walkthrough of the <code>IResult&lt;S, E&gt;</code> interface, see my <a href="https://cleancoders.com/episode/humane-code-real-episode-5/show">Preserved in translation</a> video. It's just an <a href="/2018/06/11/church-encoded-either">Either</a> with different terminology, though. <code>Right</code> is equivalent to <code>SuccessResult</code>, and <code>Left</code> corresponds to <code>ErrorResult</code>. </p> <p> The test configures the <code>IUserReader</code> Stub twice. It's necessary to give the Stub some behaviour, but unfortunately you can't just use Moq's <code>It.IsAny&lt;string&gt;()</code> for configuration, because in order to model the test case, the reader should return two different objects for two different inputs. </p> <p> This starts to look like Overspecified Software. </p> <p> Ideally, a Stub should just be present to 'make happy noises' in case the SUT decides to interact with the dependency, but with these two <code>Setup</code> calls, the interaction is overspecified. The test is tightly coupled to how the SUT is implemented. If you change the interaction implemented in the <code>Post</code> method, you could break the test. </p> <p> In any case, what the test does specify is that when the <code>UserReader</code> returns a <code>Success</code> object for both user lookups, the SUT returns a <code>200 OK</code> result and calls the <code>Update</code> method with <code>user</code>. </p> <h3 id="e60081720b2e4468883e3b35b9555c47"> Invalid user ID test case <a href="#e60081720b2e4468883e3b35b9555c47" title="permalink">#</a> </h3> <p> If the first user ID is invalid (i.e. not an integer), then the return value should represent <code>400 Bad Request</code> and the message body should indicate as much. 
This test verifies that this is the case: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">UserManagementTestConventions</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;UsersFailToConnectWhenUserIdIsInvalid( &nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#2b91af;">Frozen</span>]<span style="color:#2b91af;">Mock</span>&lt;<span style="color:#2b91af;">IUserReader</span>&gt;&nbsp;readerTD, &nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#2b91af;">Frozen</span>]<span style="color:#2b91af;">Mock</span>&lt;<span style="color:#2b91af;">IUserRepository</span>&gt;&nbsp;repoTD, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;userId, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">User</span>&nbsp;otherUser, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">ConnectionsController</span>&nbsp;sut) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.False(<span style="color:blue;">int</span>.TryParse(userId,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">var</span>&nbsp;_)); &nbsp;&nbsp;&nbsp;&nbsp;readerTD &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Setup(r&nbsp;=&gt;&nbsp;r.Lookup(userId)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Returns(<span style="color:#2b91af;">Result</span>.Error&lt;<span style="color:#2b91af;">User</span>,&nbsp;<span style="color:#2b91af;">IUserLookupError</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">UserLookupError</span>.InvalidId)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;sut.Post(userId,&nbsp;otherUser.Id.ToString()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;err&nbsp;=&nbsp;<span style="color:#2b91af;">Assert</span>.IsAssignableFrom&lt;<span style="color:#2b91af;">BadRequestErrorMessageResult</span>&gt;(actual); &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:#2b91af;">Assert</span>.Equal(<span style="color:#a31515;">&quot;Invalid&nbsp;user&nbsp;ID.&quot;</span>,&nbsp;err.Message); &nbsp;&nbsp;&nbsp;&nbsp;repoTD.Verify(r&nbsp;=&gt;&nbsp;r.Update(<span style="color:#2b91af;">It</span>.IsAny&lt;<span style="color:#2b91af;">User</span>&gt;()),&nbsp;<span style="color:#2b91af;">Times</span>.Never()); }</pre> </p> <p> This test starts with a Guard Assertion that <code>userId</code> isn't an integer. This is mostly an artefact of using AutoFixture. Had you used specific example values, then this wouldn't have been necessary. On the other hand, had you written the test case as a property-based test, it would have been even more important to <a href="/2016/01/18/make-pre-conditions-explicit-in-property-based-tests">explicitly encode such a constraint</a>. </p> <p> Perhaps a better design would have been to use a domain-specific method to check for the validity of the ID, but there's always room for improvement. </p> <p> This test is more brittle than it looks. It only defines what should happen when <code>IUserReader.Lookup</code> is called with the invalid <code>userId</code>. What happens if <code>IUserReader.Lookup</code> is called with the <code>Id</code> associated with <code>otherUser</code>? </p> <p> This currently doesn't matter, so the test passes. </p> <p> The test relies, however, on an implementation detail. This test implicitly assumes that the implementation short-circuits as soon as it discovers that <code>userId</code> is invalid. What if, however, you'd made some performance measurements, and you'd discovered that in most cases, the software would run faster if you <code>Lookup</code> both users in parallel? </p> <p> Such an innocuous performance optimisation could break the test, because the behaviour of <code>readerTD</code> is unspecified for all other cases than for <code>userId</code>. 
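</p> <p> The following sketch illustrates that fragility under a stated assumption: <code>PartialReaderStub</code> is a hypothetical hand-written Stub that returns <code>null</code> for any input it hasn't been configured with. What a real Moq object returns for an unconfigured call depends on how it's set up, so treat this as an illustration of the mechanism rather than a description of the article's tests. Results are modelled as plain strings to keep the example short: </p> <p> <pre>using System;
using System.Collections.Generic;

// Hypothetical Stub that only knows the inputs it has been configured with.
// ASSUMPTION: it returns null for everything else.
public sealed class PartialReaderStub
{
    private readonly Dictionary&lt;string, string&gt; answers =
        new Dictionary&lt;string, string&gt;();

    public void Setup(string id, string answer) { answers[id] = answer; }

    public string Lookup(string id)
    {
        return answers.TryGetValue(id, out var answer) ? answer : null;
    }
}

public static class Program
{
    private static bool IsError(string result)
    {
        return result.StartsWith(&quot;Error&quot;, StringComparison.Ordinal);
    }

    // Short-circuits: the other user is never looked up if the first lookup fails.
    public static string PostSequentially(
        PartialReaderStub reader, string userId, string otherUserId)
    {
        var user = reader.Lookup(userId);
        if (IsError(user)) return user;
        var other = reader.Lookup(otherUserId);
        return IsError(other) ? other : &quot;OK&quot;;
    }

    // Looks up and inspects both results up front, as a parallel version would.
    public static string PostEagerly(
        PartialReaderStub reader, string userId, string otherUserId)
    {
        var user = reader.Lookup(userId);
        var other = reader.Lookup(otherUserId);
        var userFailed = IsError(user);
        var otherFailed = IsError(other); // throws if the Stub returned null
        if (userFailed) return user;
        return otherFailed ? other : &quot;OK&quot;;
    }

    public static void Main()
    {
        var reader = new PartialReaderStub();
        reader.Setup(&quot;foo&quot;, &quot;Error: Invalid user ID.&quot;);

        Console.WriteLine(PostSequentially(reader, &quot;foo&quot;, &quot;1337&quot;));
        try
        {
            Console.WriteLine(PostEagerly(reader, &quot;foo&quot;, &quot;1337&quot;));
        }
        catch (NullReferenceException)
        {
            Console.WriteLine(&quot;The eager version tripped over the unconfigured lookup.&quot;);
        }
    }
}</pre> </p> <p> Both methods have the same observable behaviour when the Stub is fully configured; only the order of interactions differs, and that's enough to make the under-configured arrangement fail.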
</p> <h3 id="d691fe8c85fa42b7baa3ad565c3e6f10"> Invalid ID for other user test case <a href="#d691fe8c85fa42b7baa3ad565c3e6f10" title="permalink">#</a> </h3> <p> What happens if the other user ID is invalid? This unit test exercises that test case: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">UserManagementTestConventions</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;UsersFailToConnectWhenOtherUserIdIsInvalid( &nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#2b91af;">Frozen</span>]<span style="color:#2b91af;">Mock</span>&lt;<span style="color:#2b91af;">IUserReader</span>&gt;&nbsp;readerTD, &nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#2b91af;">Frozen</span>]<span style="color:#2b91af;">Mock</span>&lt;<span style="color:#2b91af;">IUserRepository</span>&gt;&nbsp;repoTD, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">User</span>&nbsp;user, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;otherUserId, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">ConnectionsController</span>&nbsp;sut) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.False(<span style="color:blue;">int</span>.TryParse(otherUserId,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">var</span>&nbsp;_)); &nbsp;&nbsp;&nbsp;&nbsp;readerTD &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Setup(r&nbsp;=&gt;&nbsp;r.Lookup(user.Id.ToString())) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Returns(<span style="color:#2b91af;">Result</span>.Success&lt;<span style="color:#2b91af;">User</span>,&nbsp;<span style="color:#2b91af;">IUserLookupError</span>&gt;(user)); &nbsp;&nbsp;&nbsp;&nbsp;readerTD &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Setup(r&nbsp;=&gt;&nbsp;r.Lookup(otherUserId)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Returns(<span style="color:#2b91af;">Result</span>.Error&lt;<span style="color:#2b91af;">User</span>,&nbsp;<span 
style="color:#2b91af;">IUserLookupError</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">UserLookupError</span>.InvalidId)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;sut.Post(user.Id.ToString(),&nbsp;otherUserId); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;err&nbsp;=&nbsp;<span style="color:#2b91af;">Assert</span>.IsAssignableFrom&lt;<span style="color:#2b91af;">BadRequestErrorMessageResult</span>&gt;(actual); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(<span style="color:#a31515;">&quot;Invalid&nbsp;ID&nbsp;for&nbsp;other&nbsp;user.&quot;</span>,&nbsp;err.Message); &nbsp;&nbsp;&nbsp;&nbsp;repoTD.Verify(r&nbsp;=&gt;&nbsp;r.Update(<span style="color:#2b91af;">It</span>.IsAny&lt;<span style="color:#2b91af;">User</span>&gt;()),&nbsp;<span style="color:#2b91af;">Times</span>.Never()); }</pre> </p> <p> Notice how the test configures <code>readerTD</code> twice: once for the <code>Id</code> associated with <code>user</code>, and once for <code>otherUserId</code>. Why does this test look different from the previous test? </p> <p> Why is the first <code>Setup</code> required? Couldn't the <em>arrange</em> phase of the test just look like the following? 
</p> <p> <pre><span style="color:#2b91af;">Assert</span>.False(<span style="color:blue;">int</span>.TryParse(otherUserId,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">var</span>&nbsp;_)); readerTD &nbsp;&nbsp;&nbsp;&nbsp;.Setup(r&nbsp;=&gt;&nbsp;r.Lookup(otherUserId)) &nbsp;&nbsp;&nbsp;&nbsp;.Returns(<span style="color:#2b91af;">Result</span>.Error&lt;<span style="color:#2b91af;">User</span>,&nbsp;<span style="color:#2b91af;">IUserLookupError</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">UserLookupError</span>.InvalidId));</pre> </p> <p> If you wrote the test like that, it would resemble the previous test (<code>UsersFailToConnectWhenUserIdIsInvalid</code>). The problem, though, is that if you remove the <code>Setup</code> for the valid user, the test fails. </p> <p> This is another example of how the use of interaction-based testing makes the tests brittle. The tests are tightly coupled to the implementation. </p> <h3 id="1d8f30969e8345f394af8f3e1bda37ae"> Missing users test cases <a href="#1d8f30969e8345f394af8f3e1bda37ae" title="permalink">#</a> </h3> <p> I don't want to belabour the point, so here's the two remaining tests: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">UserManagementTestConventions</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;UsersDoNotConnectWhenUserDoesNotExist( &nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#2b91af;">Frozen</span>]<span style="color:#2b91af;">Mock</span>&lt;<span style="color:#2b91af;">IUserReader</span>&gt;&nbsp;readerTD, &nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#2b91af;">Frozen</span>]<span style="color:#2b91af;">Mock</span>&lt;<span style="color:#2b91af;">IUserRepository</span>&gt;&nbsp;repoTD, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;userId, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">User</span>&nbsp;otherUser, 
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">ConnectionsController</span>&nbsp;sut) { &nbsp;&nbsp;&nbsp;&nbsp;readerTD &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Setup(r&nbsp;=&gt;&nbsp;r.Lookup(userId)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Returns(<span style="color:#2b91af;">Result</span>.Error&lt;<span style="color:#2b91af;">User</span>,&nbsp;<span style="color:#2b91af;">IUserLookupError</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">UserLookupError</span>.NotFound)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;sut.Post(userId,&nbsp;otherUser.Id.ToString()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;err&nbsp;=&nbsp;<span style="color:#2b91af;">Assert</span>.IsAssignableFrom&lt;<span style="color:#2b91af;">BadRequestErrorMessageResult</span>&gt;(actual); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(<span style="color:#a31515;">&quot;User&nbsp;not&nbsp;found.&quot;</span>,&nbsp;err.Message); &nbsp;&nbsp;&nbsp;&nbsp;repoTD.Verify(r&nbsp;=&gt;&nbsp;r.Update(<span style="color:#2b91af;">It</span>.IsAny&lt;<span style="color:#2b91af;">User</span>&gt;()),&nbsp;<span style="color:#2b91af;">Times</span>.Never()); } [<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">UserManagementTestConventions</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;UsersDoNotConnectWhenOtherUserDoesNotExist( &nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#2b91af;">Frozen</span>]<span style="color:#2b91af;">Mock</span>&lt;<span style="color:#2b91af;">IUserReader</span>&gt;&nbsp;readerTD, &nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#2b91af;">Frozen</span>]<span style="color:#2b91af;">Mock</span>&lt;<span style="color:#2b91af;">IUserRepository</span>&gt;&nbsp;repoTD, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">User</span>&nbsp;user, 
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;otherUserId, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">ConnectionsController</span>&nbsp;sut) { &nbsp;&nbsp;&nbsp;&nbsp;readerTD &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Setup(r&nbsp;=&gt;&nbsp;r.Lookup(user.Id.ToString())) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Returns(<span style="color:#2b91af;">Result</span>.Success&lt;<span style="color:#2b91af;">User</span>,&nbsp;<span style="color:#2b91af;">IUserLookupError</span>&gt;(user)); &nbsp;&nbsp;&nbsp;&nbsp;readerTD &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Setup(r&nbsp;=&gt;&nbsp;r.Lookup(otherUserId.ToString())) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Returns(<span style="color:#2b91af;">Result</span>.Error&lt;<span style="color:#2b91af;">User</span>,&nbsp;<span style="color:#2b91af;">IUserLookupError</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">UserLookupError</span>.NotFound)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;sut.Post(user.Id.ToString(),&nbsp;otherUserId.ToString()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;err&nbsp;=&nbsp;<span style="color:#2b91af;">Assert</span>.IsAssignableFrom&lt;<span style="color:#2b91af;">BadRequestErrorMessageResult</span>&gt;(actual); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(<span style="color:#a31515;">&quot;Other&nbsp;user&nbsp;not&nbsp;found.&quot;</span>,&nbsp;err.Message); &nbsp;&nbsp;&nbsp;&nbsp;repoTD.Verify(r&nbsp;=&gt;&nbsp;r.Update(<span style="color:#2b91af;">It</span>.IsAny&lt;<span style="color:#2b91af;">User</span>&gt;()),&nbsp;<span style="color:#2b91af;">Times</span>.Never()); }</pre> </p> <p> Again, notice the asymmetry of these two tests. The top one passes with only one <code>Setup</code> of <code>readerTD</code>, whereas the bottom test requires two in order to pass. 
</p> <p> You can add a second <code>Setup</code> to the top test to make the two tests equivalent, but people often forget to take such precautions. The result is Fragile Tests. </p> <h3 id="ece9e2b8532f4e7c9e445e9840789121"> Post implementation <a href="#ece9e2b8532f4e7c9e445e9840789121" title="permalink">#</a> </h3> <p> In the spirit of test-driven development, I've shown you the tests before the implementation. </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">ConnectionsController</span>&nbsp;:&nbsp;<span style="color:#2b91af;">ApiController</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;ConnectionsController( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">IUserReader</span>&nbsp;userReader, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">IUserRepository</span>&nbsp;userRepository) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;UserReader&nbsp;=&nbsp;userReader; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;UserRepository&nbsp;=&nbsp;userRepository; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">IUserReader</span>&nbsp;UserReader&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">IUserRepository</span>&nbsp;UserRepository&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">IHttpActionResult</span>&nbsp;Post(<span style="color:blue;">string</span>&nbsp;userId,&nbsp;<span style="color:blue;">string</span>&nbsp;otherUserId) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">var</span>&nbsp;userRes&nbsp;=&nbsp;UserReader.Lookup(userId).SelectError( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;error&nbsp;=&gt;&nbsp;error.Accept(<span style="color:#2b91af;">UserLookupError</span>.Switch( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;onInvalidId:&nbsp;<span style="color:#a31515;">&quot;Invalid&nbsp;user&nbsp;ID.&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;onNotFound:&nbsp;&nbsp;<span style="color:#a31515;">&quot;User&nbsp;not&nbsp;found.&quot;</span>))); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;otherUserRes&nbsp;=&nbsp;UserReader.Lookup(otherUserId).SelectError( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;error&nbsp;=&gt;&nbsp;error.Accept(<span style="color:#2b91af;">UserLookupError</span>.Switch( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;onInvalidId:&nbsp;<span style="color:#a31515;">&quot;Invalid&nbsp;ID&nbsp;for&nbsp;other&nbsp;user.&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;onNotFound:&nbsp;&nbsp;<span style="color:#a31515;">&quot;Other&nbsp;user&nbsp;not&nbsp;found.&quot;</span>))); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;connect&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;user&nbsp;<span style="color:blue;">in</span>&nbsp;userRes &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;otherUser&nbsp;<span style="color:blue;">in</span>&nbsp;otherUserRes &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">select</span>&nbsp;Connect(user,&nbsp;otherUser); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;connect.SelectBoth(Ok,&nbsp;BadRequest).Bifold(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:#2b91af;">User</span>&nbsp;Connect(<span style="color:#2b91af;">User</span>&nbsp;user,&nbsp;<span style="color:#2b91af;">User</span>&nbsp;otherUser) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;user.Connect(otherUser); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;UserRepository.Update(user); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;otherUser; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> This is a simplified version of the code shown towards the end of my <a href="https://cleancoders.com/episode/humane-code-real-episode-5/show">Preserved in translation</a> video, so I'll refer you there for a detailed explanation. </p> <h3 id="ed29f1f1efa14245a29e33010c4dd4d1"> Summary <a href="#ed29f1f1efa14245a29e33010c4dd4d1" title="permalink">#</a> </h3> <p> The premise of <a href="http://amzn.to/YPdQDf">Refactoring</a> is that in order to be able to refactor, the "precondition is [...] solid tests". In reality, many development organisations have the opposite experience. When programmers attempt to make changes to how their code is organised, tests break. In <a href="http://bit.ly/xunitpatterns">xUnit Test Patterns</a> this problem is called <em>Fragile Tests</em>, and the cause is often <em>Overspecified Software</em>. This means that tests are tightly coupled to implementation details of the System Under Test (SUT). </p> <p> It's easy to inadvertently fall into this trap when you use Mocks and Stubs, even when you follow the rule of using Mocks for Commands and Stubs for Queries. 
In my experience, it's often the explicit configuration of Stubs that tends to make tests brittle. A Command represents an intentional side effect, and you want to verify that such a side effect takes place. A Query, on the other hand, has no side effect, so a black-box test shouldn't be concerned with any interactions involving Queries. </p> <p> Yet, using an 'isolation framework' such as Moq, <a href="https://fakeiteasy.github.io/">FakeItEasy</a>, <a href="http://nsubstitute.github.io/">NSubstitute</a>, and so on, will pull you towards overspecifying the interactions the SUT has with its Query dependencies. </p> <p> How can we improve? One strategy is to move towards a more functional design, which is <a href="/2015/05/07/functional-design-is-intrinsically-testable">intrinsically testable</a>. In the next article, you'll see how to rewrite both tests and implementation in <a href="https://www.haskell.org">Haskell</a>. </p> <p> <strong>Next:</strong> <a href="/2019/03/11/an-example-of-state-based-testing-in-haskell">An example of state-based testing in Haskell</a>. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment"> <div class="comment-author"><a href="https://remibou.github.io/">Rémi Bourgarel</a></div> <div class="comment-content"> <p> Hi Mark, </p> <p> I think I came to the same conclusion (maybe not the same solution), meaning you can't write solid tests when mocking all the dependency interactions: all these dependency interactions are implementation details (even the database system you chose). For writing solid tests I chose to write my tests like this: start all the services I can in the test environment (database, queue ...), mock only the things I have no choice about (external PSP or Google Captcha), issue a command (using MediatR) and check the result with a query. 
You can find some of my work <a href="https://github.com/RemiBou/Toss.Blazor/blob/master/Toss.Tests/Server/Models/Tosses/LastTossQueryHandlerTest.cs">here</a>. The work is not done on all the tests but this is the way I want to go. Let me know what you think about it. </p> <p> I could have launched the tests at the Controller level but I chose Command and Query handlers.</p> <p> Can't wait to see your solution </p> </div> <div class="comment-date">2019-02-25 07:53 UTC</div> </div> <div class="comment" id="7b60769bf7eb4be3969623d4819d5c0e"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Rémi, thank you for writing. Hosting services as part of a test run can be a valuable addition to an overall testing or release pipeline. It's reminiscent of the approach taken in <a href="http://bit.ly/growingoos">GOOS</a>. I've also touched on this option in my Pluralsight course <a href="https://blog.ploeh.dk/outside-in-tdd">Outside-In Test-Driven Development</a>. This is, however, a set of tests I would identify as belonging towards the top of a <a href="https://martinfowler.com/bliki/TestPyramid.html">Test Pyramid</a>. In my experience, such tests tend to run (an order of magnitude) slower than unit tests. </p> <p> That doesn't preclude their use. Depending on circumstances, I still prefer having tests like that. I think that I've written a few applications where tests like that constituted the main body of unit tests. </p> <p> I do, however, also find this style of testing too limiting in many situations. I tend to prefer 'real' unit tests, since they tend to be easier to write, and they execute faster. </p> <p> Apart from performance and maintainability concerns, one problem that I often see with integration tests is that <a href="https://www.infoq.com/presentations/integration-tests-scam">it's practically impossible to cover all edge cases</a>. This tends to lead to either bug-ridden software, or unmaintainable test suites. 
</p> <p> Still, I think that, ultimately, having enough experience with different styles of testing enables one to make an informed choice. That's my purpose with these articles: to point out that alternatives exist. </p> </div> <div class="comment-date">2019-03-01 9:31 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. From interaction-based to state-based testing https://blog.ploeh.dk/2019/02/18/from-interaction-based-to-state-based-testing 2019-02-18T08:19:00+00:00 Mark Seemann <div id="post"> <p> <em>Indiscriminate use of Mocks and Stubs can lead to brittle test suites. A more functional design can make state-based testing easier, leading to more robust test suites.</em> </p> <p> The original premise of <a href="http://amzn.to/YPdQDf">Refactoring</a> was that in order to refactor, you must have a trustworthy suite of unit tests, so that you can be confident that you didn't break any functionality. <blockquote> <p>"to refactor, the essential precondition is [...] solid tests"</p> <footer><cit