Algebraic data types aren't numbers on steroids https://blog.ploeh.dk/2020/01/20/algebraic-data-types-arent-numbers-on-steroids 2020-01-20T07:39:00+00:00 Mark Seemann What is it that smart people have figured out that I haven't? </p> <p> The debate continues, and this article isn't going to stop it. It may, perhaps, put one misconception to rest. There are still good arguments on either side. It's not my goal to dispute any of the good arguments. It's my goal to counter a common bad argument. </p> <h3 id="41cff26ef5fe4a56943d34da2e7ad657"> Misconception: static typing as numbers on steroids <a href="#41cff26ef5fe4a56943d34da2e7ad657" title="permalink">#</a> </h3> <p> I get the impression that many people think about static types as something that has to do with strings and numbers - particularly numbers. Introductions to programming languages often introduce strings first. That's natural, since the most common first example is <a href="https://en.wikipedia.org/wiki/%22Hello,_World!%22_program">Hello, world!</a>. After that usually follows an introduction to basic arithmetic, and that often includes an explanation about types of numbers - at least the distinction between integers and floating-point numbers. At the time I'm writing this, <a href="https://docs.microsoft.com/dotnet/csharp/tutorials/">the online C# tutorial</a> is a typical example of this. <a href="http://bit.ly/real-world-haskell">Real World Haskell</a> takes the same approach to introducing types. </p> <p> It's a natural enough way to introduce static types, but it seems to leave some learners with the impression that static types are mostly useful to prevent them from calling a method with a floating-point number when an integer was expected. That's the vibe I'm getting from <a href="https://blog.cleancoder.com/uncle-bob/2017/01/13/TypesAndTests.html">this article by Robert C. Martin</a>. 
</p> <p> When presented with the notion of a 'stronger' type system, people with that mindset seem to extrapolate what they already know about static types. </p> <p> <img src="/content/binary/extrapolation-of-static-primitive-types.png" alt="Three boxes, from left to right: no types, static primitive types, and static primitive types on steroids."> </p> <p> If you mostly think of static types as a way to distinguish between various primitive types (such as strings and a zoo of number types), I can't blame you for extrapolating that notion. This seems to happen often, and it leads to a lot of frustration. </p> <p> People who want 'stronger numbers' try to: <ul> <li>Model natural numbers; i.e. to define a type that represents only positive integers</li> <li>Model positive numbers; i.e. rational or real numbers greater than zero</li> <li>Model non-negative numbers</li> <li>Model numbers in a particular range; e.g. between 0 and 100</li> <li><a href="https://ren.zone/articles/safe-money">Model money in different currencies</a></li> </ul> Particularly, people run into all sorts of trouble when they try to accomplish such goals with <a href="https://www.haskell.org">Haskell</a>. They've heard that Haskell has a powerful type system, and now they want to do those things. </p> <p> Haskell does have a powerful type system, but it's a type system that builds on the concept of <a href="https://en.wikipedia.org/wiki/Algebraic_data_type">algebraic data types</a>. (If you want to escape the jargon of that Wikipedia article, I recommend <a href="http://tomasp.net">Tomas Petricek</a>'s lucid and straightforward explanation <a href="http://tomasp.net/blog/types-and-math.aspx">Power of mathematics: Reasoning about functional types</a>.) </p> <p> There are type systems that enable you to take the notion of numbers to the next level. 
This is called either <a href="https://en.wikipedia.org/wiki/Refinement_type">refinement types</a> or <a href="https://en.wikipedia.org/wiki/Dependent_type">dependent types</a>, contingent on what exactly it is that you want to do. Haskell doesn't support that out of the box. The most prominent dependently-typed programming language is probably <a href="https://www.idris-lang.org">Idris</a>, which is still a research language. As far as I know, there are no 'production strength' languages that support refinement or dependent types, unless you consider <a href="https://en.wikipedia.org/wiki/Liquid_Haskell">Liquid Haskell</a> to fit that description. Honestly, all this is at the fringe of my expertise. </p> <p> I'll return to an example of this kind of frustration later, and also suggest a simple alternative. Before I do that, though, I'd like to outline what it is that proponents of 'strong' type systems mean. </p> <h3 id="76b0c9b8e461448796d44443351619f1"> Make illegal states unrepresentable <a href="#76b0c9b8e461448796d44443351619f1" title="permalink">#</a> </h3> <p> Languages like Haskell, <a href="https://ocaml.org">OCaml</a>, and <a href="https://fsharp.org">F#</a> have algebraic type systems. They still distinguish between various primitive types, but they take the notion of static types in a completely different direction. They introduce a new dimension of static type safety, so to speak. </p> <p> <img src="/content/binary/algebraic-data-types-as-another-dimension.png" alt="Three boxes. At the bottom left: no types. To the right of that: static primitive types. To the top of the no-types box: algebraic data types"> </p> <p> It's a completely different way to think about static types. The advantage isn't that it prevents you from using a floating-point number where an integer was required. The advantage is that it enables you to model domain logic in a way that flushes out all sorts of edge cases at compile time. 
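</p> <p> As a minimal sketch of what that can look like, consider an order that moves through three states, where each case carries only the data that exists in that state. The <code>Order</code> type, its cases, and <code>trackingCode</code> are hypothetical illustrations, not taken from any real code base: </p> <p> <pre>-- Hypothetical domain type: each case carries only the data
-- that exists in that state.
data Order
  = Draft [String]                    -- item names; nothing paid yet
  | Paid [String] Rational            -- items and the amount received
  | Shipped [String] Rational String  -- items, amount, and tracking code
  deriving (Eq, Show)

-- Only a shipped order has a tracking code. Every case is handled
-- explicitly; a forgotten case triggers -Wincomplete-patterns.
trackingCode :: Order -&gt; Maybe String
trackingCode (Draft _) = Nothing
trackingCode (Paid _ _) = Nothing
trackingCode (Shipped _ _ code) = Just code</pre> </p> <p> With a definition like this, a shipped order without a tracking code can't be constructed, and if a case is left out of <code>trackingCode</code>, GHC's <code>-Wincomplete-patterns</code> warning (part of <code>-Wall</code>) points it out. 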
</p> <p> I've previously <a href="/2016/11/28/easy-domain-modelling-with-types">described a real-world example of domain modelling with types</a>, so I'm not going to repeat that effort here. Most business processes can be described as a progression of states. With algebraic data types, not only can you model what a valid state looks like - you can also model the state machine in such a way that you can't represent illegal states. </p> <p> This notion is eloquently captured by <a href="https://twitter.com/yminsky">Yaron Minsky</a>'s aphorism: <blockquote> <p> Make illegal states unrepresentable. </p> </blockquote> This is solving an entirely different type of problem than distinguishing between 32-bit and 64-bit integers. Writing even moderately complex code involves dealing with many edge cases. In most mainstream languages (including C# and Java), it's your responsibility to ensure that you've handled all edge cases. It's easy to overlook or forget a few of those. With algebraic data types, the compiler keeps track of that for you. That's a tremendous boon because it enables you to forget about those technical details and instead focus on adding value. </p> <p> Scott Wlaschin wrote <a href="https://amzn.to/2OyI51M">an entire book about domain modelling with algebraic data types</a>. That's what we talk about when we talk about stronger type systems. Not 'numbers on steroids'. </p> <h3 id="4c5ac4aa723f4f078d5cfbb13164194c"> Exhibit: summing notionals <a href="#4c5ac4aa723f4f078d5cfbb13164194c" title="permalink">#</a> </h3> <p> I consider this notion of <em>strong type systems viewed as numbers on steroids</em> a red herring. I don't blame anyone for extrapolating from what they already know. That's a natural way to try to make sense of the world. We all do it. 
</p> <p> I came across a recent example of this way of thinking in a great article by <a href="https://alexnixon.github.io">Alex Nixon</a> titled <a href="https://alexnixon.github.io/2020/01/14/static-types-are-dangerous.html">Static types are dangerously interesting</a>. The following is in no way meant to excoriate Alex or his article, but I think it's a great example of how easily one can be led astray by thinking that strong type systems imply numbers on steroids. </p> <p> You should read the article. It's well-written and uses more sophisticated features of Haskell than I'm comfortable with. The example problem it tries to solve is basically this: Given a set of trades, calculate the <em>total notional in each currency</em>. Consider a collection of trades: </p> <p> <pre>Quantity, Ticker, Price, Currency
100, VOD.L, 1, GBP
200, VOD.L, 2, GBP
300, AAPL.O, 3, USD
50, 4151.T, 5, JPY</pre> </p> <p> I'll let Alex explain what it is that he wants to do: <blockquote> <p> "I want to write a function which calculates the <em>total notional in each currency</em>. The word <em>notional</em> is a fancy way of saying <code>price * quantity</code>. Think of it as "value of the thing that changed hands". </p> <p> "For illustration, the function signature might look something like this: </p> <p> "<code>sumNotionals :: [Trade] -> Map Currency Rational</code> </p> <p> "In English, it’s a function that takes a list of trades and returns a map from currency to quantity." </p> </blockquote> If given the above trades, the output would be: </p> <p> <pre>Currency, Notional
GBP, 500
USD, 900
JPY, 250</pre> </p> <p> The article proceeds to explore how to model this problem with Haskell's strong type system. Alex wants to be able to calculate with money, but on the other hand, he wants the type system to prevent accidents. You can't add <em>100 GBP</em> to <em>300 USD</em>. The type system should prevent that. 
</p> <p> Early on, he defines a <a href="https://en.wikipedia.org/wiki/Tagged_union">sum type</a> to model currencies: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;Currency
&nbsp;&nbsp;=&nbsp;USD
&nbsp;&nbsp;|&nbsp;GBP
&nbsp;&nbsp;|&nbsp;JPY
&nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Ord</span>,&nbsp;<span style="color:#2b91af;">Show</span>)</pre> </p> <p> Things basically go downhill from there. Read the article; it's good. </p> <h3 id="f04845b0c38f4924a5d34d1d5802912e"> Sum types should distinguish behaviour, not values <a href="#f04845b0c38f4924a5d34d1d5802912e" title="permalink">#</a> </h3> <p> I doubt that Alex Nixon views his proposed <code>Currency</code> type as anything but a proof of concept. In a 'real' code base, you'd enumerate all the currencies you'd trade, right? </p> <p> I wouldn't. This is the red herring in action. Algebraic data types are useful because they enable us to distinguish between cases that we should treat differently, by writing specific code that deals with each case. That's not the case with a currency. You add US dollars together in exactly the same way that you add euros together. The currency doesn't change the behaviour of that operation. </p> <p> But we can't just enable addition of arbitrary monetary values, right? After all, we shouldn't be able to add <em>20 USD</em> and <em>300 DKK</em>. At least, without an exchange rate, that shouldn't compile. </p> <p> Let's imagine, for the sake of argument, that we encode all the currencies we trade into a type. What happens if our traders decide to trade a currency that they haven't previously traded? What if a country decides to <a href="https://en.wikipedia.org/wiki/Redenomination">reset their currency</a>? What if a country splits into two countries, each with <a href="https://en.wikipedia.org/wiki/South_Sudanese_pound">their own currency</a>? 
</p> <p> If you model currency as a type, you'd have to edit and recompile your code every time such an external event occurs. I don't think this is a good use of a type system. </p> <p> Types should, I think, help us programmers identify the parts of our code bases where we need to treat various cases differently. They shouldn't be used to distinguish run-time values. Types provide value at compile time; run-time values only exist at run time. To paraphrase Kent Beck, <em>keep things together that change together; keep things apart that don't</em>. </p> <p> I'd model currency as a run-time value, because the behaviour of money doesn't vary with the currency. </p> <h3 id="cfa838f3fad84259b9629f24f09f37eb"> Boring Haskell <a href="#cfa838f3fad84259b9629f24f09f37eb" title="permalink">#</a> </h3> <p> How would I calculate the notionals, then? With <a href="https://www.snoyman.com/blog/2019/11/boring-haskell-manifesto">boring Haskell</a>. Really boring Haskell, in fact. I'm only going to need two imports and no language pragmas: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Trades&nbsp;<span style="color:blue;">where</span>

<span style="color:blue;">import</span>&nbsp;Data.List
<span style="color:blue;">import</span>&nbsp;Data.Map.Strict&nbsp;(<span style="color:blue;">Map</span>)
<span style="color:blue;">import</span>&nbsp;<span style="color:blue;">qualified</span>&nbsp;Data.Map.Strict&nbsp;<span style="color:blue;">as</span>&nbsp;Map</pre> </p> <p> Which types do I need? 
For this particular purpose, I think I'll just stick with a single <code>Trade</code> type: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;Trade&nbsp;=&nbsp;Trade&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;tradeQuantity&nbsp;::&nbsp;Int
&nbsp;&nbsp;,&nbsp;tradeTicker&nbsp;::&nbsp;String
&nbsp;&nbsp;,&nbsp;tradePrice&nbsp;::&nbsp;Rational
&nbsp;&nbsp;,&nbsp;tradeCurrency&nbsp;::&nbsp;String&nbsp;}
&nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>)</pre> </p> <p> Shouldn't I introduce a <code>Money</code> type? I could, but I don't have to. As <a href="https://lexi-lambda.github.io">Alexis King</a> so clearly explains, <a href="https://lexi-lambda.github.io/blog/2020/01/19/no-dynamic-type-systems-are-not-inherently-more-open">you don't have to model more than you need to do the job</a>. </p> <p> By not introducing a <code>Money</code> type and making it an instance of various type classes, I still prevent client code from adding things together that shouldn't be added together. You can't add <code>Trade</code> values together because <code>Trade</code> isn't a <code>Num</code> instance. </p> <p> How do we calculate the notionals, then? 
It's easy; it's a one-liner: </p> <p> <pre><span style="color:#2b91af;">sumNotionals</span>&nbsp;::&nbsp;<span style="color:blue;">Foldable</span>&nbsp;t&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;t&nbsp;<span style="color:blue;">Trade</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Map</span>&nbsp;<span style="color:#2b91af;">String</span>&nbsp;<span style="color:blue;">Rational</span>
sumNotionals&nbsp;=&nbsp;foldl&#39;&nbsp;(\m&nbsp;t&nbsp;-&gt;&nbsp;Map.insertWith&nbsp;<span style="color:#2b91af;">(+)</span>&nbsp;(key&nbsp;t)&nbsp;(value&nbsp;t)&nbsp;m)&nbsp;Map.empty
&nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;key&nbsp;&nbsp;&nbsp;(Trade&nbsp;_&nbsp;_&nbsp;_&nbsp;currency)&nbsp;=&nbsp;currency
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;value&nbsp;(Trade&nbsp;quantity&nbsp;_&nbsp;price&nbsp;_)&nbsp;=&nbsp;<span style="color:blue;">toRational</span>&nbsp;quantity&nbsp;*&nbsp;price</pre> </p> <p> Okay, that looks more like four lines of code, but the first is an optional type declaration, so it doesn't count. The <code>key</code> and <code>value</code> functions could be inlined to make the function a single (wide) line of code, but I made them two named functions in order to make the code more readable. </p> <p> It gets the job done: </p> <p> <pre>*Trades&gt; sumNotionals trades
fromList [("GBP",500 % 1),("JPY",250 % 1),("USD",900 % 1)]</pre> </p> <p> While this code addresses this particular problem, you probably consider it cheating because I've failed to address a wider concern. How does one model money in several currencies? I've <a href="/2017/10/16/money-monoid">previously covered that, including a simple Haskell example</a>, but in general, I consider it more productive to have a problem and <em>then</em> go looking for a solution, rather than inventing a solution and go looking for a problem. 
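</p> <p> Should that wider concern ever turn into an actual problem, one low-ceremony option is to notice that per-currency totals combine associatively. The following is only a sketch under stated assumptions, not the approach of any linked article: <code>Notionals</code>, <code>notional</code>, and <code>sumNotionals'</code> are hypothetical names, and the <code>trades</code> list is a transcription of the sample table of trades: </p> <p> <pre>import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- The Trade type as defined in the article.
data Trade = Trade
  { tradeQuantity :: Int
  , tradeTicker :: String
  , tradePrice :: Rational
  , tradeCurrency :: String }
  deriving (Eq, Show)

-- Hypothetical wrapper: totals keyed by currency code.
newtype Notionals = Notionals (Map String Rational)
  deriving (Eq, Show)

-- Combining two sets of totals adds amounts currency by currency.
instance Semigroup Notionals where
  Notionals x &lt;&gt; Notionals y = Notionals (Map.unionWith (+) x y)

instance Monoid Notionals where
  mempty = Notionals Map.empty

-- The notional of a single trade, keyed by its currency.
notional :: Trade -&gt; Notionals
notional t =
  Notionals (Map.singleton (tradeCurrency t) amount)
  where amount = toRational (tradeQuantity t) * tradePrice t

-- The same totals as the fold, expressed with foldMap.
sumNotionals' :: Foldable t =&gt; t Trade -&gt; Notionals
sumNotionals' = foldMap notional

-- Assumed sample data, transcribed from the table of trades.
trades :: [Trade]
trades =
  [ Trade 100 "VOD.L" 1 "GBP"
  , Trade 200 "VOD.L" 2 "GBP"
  , Trade 300 "AAPL.O" 3 "USD"
  , Trade 50 "4151.T" 5 "JPY" ]</pre> </p> <p> Because <code>Notionals</code> is a <code>Monoid</code>, totals computed from separate batches of trades can be merged with <code>&lt;&gt;</code>, and currencies remain plain run-time values, so a new currency requires no recompilation. 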
</p> <h3 id="8d472a61051e4ddfaf2698c8d215d658"> Summary <a href="#8d472a61051e4ddfaf2698c8d215d658" title="permalink">#</a> </h3> <p> When people enter into a debate, they use the knowledge they have. This is also the case in the debate about static versus dynamic types. Most programmers have experience with statically typed languages like C# or Java. It's natural to argue from what you know, and extrapolate from that. </p> <p> I think that when confronted with a phrase like <em>a more powerful type system</em>, many people extrapolate and think that they know what that means. They think that it means statically typed numbers on steroids. That's a red herring. </p> <p> That's usually not what we mean when we talk about <em>more powerful type systems</em>. We talk about algebraic data types, which make illegal states unrepresentable. Judged by the debates I've participated in, you can't <em>extrapolate</em> from mainstream type systems to algebraic data types. If you haven't tried programming with both sum and <a href="https://en.wikipedia.org/wiki/Product_type">product types</a>, you aren't going to <a href="http://bit.ly/stranger-in-a-strange-land">grok</a> what we mean when we talk about <em>strong type systems</em>. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="61115b6622294923b7440cb1c4e32f84"> <div class="comment-author"><a href="https://github.com/Jankowski-J">Jakub Jankowski</a></div> <div class="comment-content"> <p> "but in general, I consider it more productive to have a problem and <em>then</em> go looking for a solution, rather than inventing a solution and go looking for a problem." </p> <p> This really resonates with me. I've been observing this in my current team and the tendency to "lookout" for the solutions to problems not yet present, just for the sake of "making it a robust solution" so to say. </p> <p> I really like the properties of the Haskell solution. 
It handles all the currencies (no matter how many of them come in the dataset) without explicitly specifying them. And you can't accidentally add two different currencies together. The last part would be pretty verbose to implement in C#. </p> </div> <div class="comment-date">2020-01-20 20:54 UTC</div> </div> <div class="comment" id="a496e710c3ea4391a9de433da3e6d54d"> <div class="comment-author"><a href="https://github.com/drewjcooper">Andrew Cooper</a></div> <div class="comment-content"> <p> I'm not sure the above is a good example of what you're trying to say about algebraic data types. The problem can be solved identically (at least semantically) in C#. Granted, the definition of the <code>Trade</code> type would be way more verbose, but once you have that, the <code>SumNotionals</code> method is basically the same as your code, albeit with different syntax: </p> <p> <pre>Dictionary&lt;string,&nbsp;int&gt;&nbsp;SumNotionals(IEnumerable&lt;Trade&gt;&nbsp;trades)
{
&nbsp;&nbsp;&nbsp;&nbsp;return&nbsp;trades
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.GroupBy(t&nbsp;=&gt;&nbsp;t.Currency,&nbsp;t&nbsp;=&gt;&nbsp;t.Price&nbsp;*&nbsp;t.Quantity)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.ToDictionary(g&nbsp;=&gt;&nbsp;g.Key,&nbsp;g&nbsp;=&gt;&nbsp;g.Sum());
}</pre> </p> <p> Am I missing something? </p> </div> <div class="comment-date">2020-01-20 22:30 UTC</div> </div> <div class="comment" id="bd48b3267e014ccc9b377e983369ddca"> <div class="comment-author"><a href="https://github.com/Jankowski-J">Jakub Jankowski</a></div> <div class="comment-content"> <p> You are right, Andrew. The LINQ query indeed has the same properties as the Haskell function. </p> <p> I'm not sure what I was thinking yesterday, but I think I subconsciously "wanted" C# to be less robust. 
</p> </div> <div class="comment-date">2020-01-21 18:04 UTC</div> </div> <div class="comment" id="c9069e6554934325a86a27921cff7ac3"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Andrew, thank you for writing. I didn't intend to say much about algebraic data types in this article. It wasn't the topic I had in mind. It can be difficult to communicate any but the simplest ideas, so it's possible that I didn't state my intention well enough. If so, the fault is mine. I've <a href="/2016/11/28/easy-domain-modelling-with-types">tried to demonstrate the power of algebraic data types before</a>, so I didn't want to repeat the effort, since my agenda was another. That's why I linked to that other article. </p> <p> The reason I discussed Alex Nixon's blog post was that it was the article that originally inspired me to write this article. I always try to include an example so that the reader gets to see the connection between the general concept and specifics. </p> <p> I could have discussed Alex' article solely on its merits of showcasing failed attempts to model a 'stronger number'. That would, however, have left the reader without a resolution. I found that a bad way to structure my text. This blog is totally free, but if you like it, please consider supporting it. On doing katas https://blog.ploeh.dk/2020/01/13/on-doing-katas 2020-01-13T06:23:00+00:00 Mark Seemann <div id="post"> <p> <em>Approach programming katas differently than martial arts katas.</em> </p> <p> Would you like to become a better programmer? Then practice. It's no different from becoming a better musician, a better sports(wo)man, a better cook, a better artist, etcetera. </p> <p> How do you practice programming? </p> <p> There are many ways. Doing <a href="https://en.wikipedia.org/wiki/Kata_(programming)">programming katas</a> is one way. 
</p> <h3 id="16e8612d24b14930a4d7cc02ebad6fd6"> Variation, not repetition <a href="#16e8612d24b14930a4d7cc02ebad6fd6" title="permalink">#</a> </h3> <p> When I talk to other programmers about katas, I often get the impression that people fail to extract value from the exercises. You can find catalogues of exercises on the internet, but there's a dearth of articles that discuss <em>how</em> to do katas. </p> <p> Part of the problem is, I think, that <a href="https://en.wikipedia.org/wiki/Kata">the term comes from martial arts practice</a>. In martial arts, one repeats the same movements over and over again in order to build up <a href="https://en.wikipedia.org/wiki/Muscle_memory">muscle memory</a>. Repetition produces improvements. </p> <p> Some people translate that concept literally. They try to do programming katas by doing the <em>same exercise</em> again and again, with no variation. After a few days or weeks, they stop because they can't see the point. </p> <p> That's no wonder. Neither can I. </p> <p> Programming and software design is mostly an intellectual (and perhaps artistic) endeavour. Unless you can't <a href="https://en.wikipedia.org/wiki/Touch_typing">touch type</a>, there's little need to build up muscle memory. You don't train your brain the way you train your muscles. Repetition numbs the brain. Variation stimulates it. </p> <h3 id="5e0804109e5c40cb80f0c3c9274afb30"> Suggested variations <a href="#5e0804109e5c40cb80f0c3c9274afb30" title="permalink">#</a> </h3> <p> I find that doing a kata is a great opportunity to explore alternatives. A kata is usually a limited exercise, which means that you can do it multiple times and compare outcomes. </p> <p> You can find various kata catalogues on the internet. One of my favourites is the <a href="http://bit.ly/codekatas">Coding Dojo</a>. Among the katas there, I particularly like the <a href="http://codingdojo.org/kata/Tennis">Tennis kata</a>. I'll use that as an example to describe how I often approach a kata. 
</p> <p> The first time I encounter a kata I've never done before, I do it with as little fuss as possible. I use the programming language I'm most comfortable with, and don't attempt any stunts. I no longer remember when I first encountered the Tennis kata, but it was many years ago, and C# was my preferred language. I'd do the Tennis kata in C#, then, just to get acquainted with the problem. </p> <p> Most good katas contain small surprises. They may sound simpler than they actually turn out to be. On the other hand, they're typically not overwhelmingly difficult. It pays to overcome the surprise the kata may hold without getting bogged down by trying some feat. The Tennis kata, for example, sounds easy, but most people stumble on the rules associated with <em>deuce</em> and <em>advantage</em>. How do you model the API? How do you implement the algorithm? </p> <p> Once you're comfortable with the essence of the exercise, introduce variations. Most of the variations I use take the form of some sort of constraint. <a href="https://www.dotnetrocks.com/?show=1542">Constraints liberate</a>. <a href="/2015/04/13/less-is-more-language-features">Less is more</a>. </p> <p> Here's a list of suggestions: <ul> <li><a href="/2019/10/21/a-red-green-refactor-checklist">Follow test-driven development</a> (TDD). That's my usual modus operandi, but if you don't normally practice TDD, a kata is a great opportunity.</li> <li>Use the (<em>Gollum style</em>) <a href="/2019/10/07/devils-advocate">Devil's Advocate</a> technique with TDD.</li> <li>Follow the <a href="https://blog.cleancoder.com/uncle-bob/2013/05/27/TheTransformationPriorityPremise.html">Transformation Priority Premise</a>.</li> <li>Do TDD without mocks.</li> <li>Do TDD with mocks.</li> <li>Use the <a href="http://www.natpryce.com/articles/000714.html">Test Data Builder design pattern</a>.</li> <li>Try <a href="/property-based-testing-intro">property-based testing</a>. 
I've <a href="/2016/02/10/types-properties-software">done that with the Tennis</a> kata multiple times.</li> <li>Put your mouse away.</li> <li>Use another editor or IDE.</li> <li>Use another programming language. A kata is a great way to practice a new language. When you're learning a new language, you're often fighting with unfamiliar syntax, which is the reason I recommend that you <em>first</em> do the kata in a language with which you're familiar.</li> <li>Use only immutable data structures. This is a good first step towards learning functional programming.</li> <li>Keep the <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">cyclomatic complexity</a> of all methods at <em>1</em>. I <a href="/2011/05/16/TennisKatawithimmutabletypesandacyclomaticcomplexityof1">once did that with the Tennis kata</a>.</li> <li>Use an unfamiliar API. If you normally use <a href="https://nunit.org">NUnit</a> then try <a href="https://xunit.net">xUnit.net</a> instead. Use a new Test Double library. Use a different assertion library. I once did the Tennis kata in <a href="https://www.haskell.org">Haskell</a> using the <a href="http://hackage.haskell.org/package/lens">lens</a> library because I wanted to hone those skills. I've also done the <em>Mark IV coffee maker</em> exercise from <a href="http://amzn.to/19W4JHk">APPP</a> with <a href="http://reactivex.io">Reactive Extensions</a>.</li> <li>Employ a design pattern you'd like to understand better. 
I've had particular success with the <a href="https://en.wikipedia.org/wiki/Visitor_pattern">Visitor design pattern</a>.</li> <li>Refactor an existing kata solution to another design.</li> <li>Refactor another programmer's kata solution.</li> <li><a href="https://en.wikipedia.org/wiki/Pair_programming">Pair-program</a> the kata.</li> <li>Use the <a href="http://wiki.c2.com/?PairProgrammingPingPongPattern">Ping Pong pattern</a> when pair programming.</li> <li><a href="https://en.wikipedia.org/wiki/Mob_programming">Mob-program</a> it.</li> </ul> You'll probably come up with your own list of variations. </p> <p> What I like about katas is that they're small enough that you can do the same exercise multiple times, but with different designs. This makes it easy to learn new ways of doing things, because you can compare different approaches to the same problem. </p> <h3 id="1faffc34ac644021b87c7f0b0691eeef"> Conclusion <a href="#1faffc34ac644021b87c7f0b0691eeef" title="permalink">#</a> </h3> <p> The way that the idea of a programming kata <a href="http://www.butunclebob.com/ArticleS.UncleBob.TheProgrammingDojo">was originally introduced</a> is a bit unfortunate. On one hand, the metaphor may have helped adoption because martial arts are cool, and Japanese is a beautiful language. On the other hand, the underlying message is one of repetition, which is hardly helpful when it comes to exercising the brain. </p> <p> Repetition dulls the brain, while variation stimulates it. Katas are great because they're short exercises, but you have to deliberately introduce diversity to make them work for you. You're not building muscle memory, you're forming new neural pathways. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="?"> <div class="comment-author">Johannes Schmitt</div> <div class="comment-content"> <p> Regarding kata variations, I'd like to mention Jeff Bay's <i>Object Calisthenics</i>. 
The case of the unbalanced brackets https://blog.ploeh.dk/2020/01/06/the-case-of-the-unbalanced-brackets 2020-01-06T06:37:00+00:00 Mark Seemann <div id="post"> <p> <em>A code mystery.</em> </p> <p> One of my clients was kind enough to let me look at some of their legacy code. As I was struggling to understand how it worked, I encountered something that <em>looked</em> like this: </p> <p> <pre>ApplyDueAmountG89.<span style="font-weight:bold;color:#74531f;">Calculate</span>(<span style="font-weight:bold;color:#1f377f;">postState</span>.PartialMebershipsBAT.<span style="font-weight:bold;color:#74531f;">Where</span>(
&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">d</span>&nbsp;=&gt;&nbsp;(<span style="font-weight:bold;color:#1f377f;">d</span>.Data.Choicetype&nbsp;==&nbsp;<span style="color:#2b91af;">GarplyChoicetype</span>.AtoC&nbsp;||
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">retirablePartialMembershipNr</span>.<span style="font-weight:bold;color:#74531f;">Contains</span>(<span style="font-weight:bold;color:#1f377f;">d</span>.Data.PartialMembershipNr)).<span style="font-weight:bold;color:#74531f;">ToList</span>(),
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ApplyDueAmountG89.Situation.Depreciation,
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ApplyDueAmountG89.RecordType.Primo);</pre> </p> <p> For the record, this isn't the actual code that my client gave me. I wouldn't post someone else's code without their permission. It is, however, a faithful imitation of the original code. What's wrong with it? </p> <p> I'll wait. </p> <h3 id="4eb9d6fa6cf04f45a4afdeb3638a9c8a"> Brackets <a href="#4eb9d6fa6cf04f45a4afdeb3638a9c8a" title="permalink">#</a> </h3> <p> Count the brackets. There's a missing closing bracket. </p> <p> Yet, the code compiles. 
How? </p> <p> Legacy code isn't <a href="https://cleancoders.com/video-details/humane-code-real-episode-1">humane code</a>. There's a multitude of ways in which code can be obscure. This article describes one of them. </p> <p> When brackets are nested and far apart, it's hard for the brain to parse and balance them. Yet, on closer inspection the brackets seem unbalanced. </p> <h3 id="0595d741e4ff4a139d3b26ab8fb3e75d"> Show whitespace <a href="#0595d741e4ff4a139d3b26ab8fb3e75d" title="permalink">#</a> </h3> <p> Ever since I started programming in <a href="https://fsharp.org">F#</a>, I've turned on the Visual Studio feature that shows whitespace. F# does, after all, use significant whitespace (AKA the <a href="https://en.wikipedia.org/wiki/Off-side_rule">Off-side rule</a>), and it helps to be able to detect if a tab character has slipped in among the spaces. </p> <p> Visual Studio shows whitespace with pale blue dots and arrows. When that feature is turned on (<kbd>Ctrl</kbd> + <kbd>e</kbd>, <kbd>s</kbd>), the above code example looks different: </p> <p> <pre>ApplyDueAmountG89.<span style="font-weight:bold;color:#74531f;">Calculate</span>(<span style="font-weight:bold;color:#1f377f;">postState</span>.PartialMebershipsBAT.<span style="font-weight:bold;color:#74531f;">Where</span>( <span style="color:#2b91af;">&middot;&middot;&middot;&middot;</span><span style="font-weight:bold;color:#1f377f;">d</span><span style="color:#2b91af;">&middot;</span>=&gt;<span style="color:#2b91af;">&middot;</span>(<span style="font-weight:bold;color:#1f377f;">d</span>.Data.Choicetype<span style="color:#2b91af;">&middot;</span>==<span style="color:#2b91af;">&middot;</span><span style="color:#2b91af;">GarplyChoicetype</span>.AtoC<span style="color:#2b91af;">&middot;</span>||<span 
style="color:#2b91af;">&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;</span> <span style="color:#2b91af;">&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;</span><span style="font-weight:bold;color:#1f377f;">retirablePartialMembershipNr</span>.<span style="font-weight:bold;color:#74531f;">Contains</span>(<span style="font-weight:bold;color:#1f377f;">d</span>.Data.PartialMembershipNr)).<span style="font-weight:bold;color:#74531f;">ToList</span>(), <span style="color:#2b91af;">&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;</span>ApplyDueAmountG89.Situation.Depreciation, <span style="color:#2b91af;">&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;&middot;</span>ApplyDueAmountG89.RecordType.Primo);</pre> </p> <p> Notice the space characters that seem to run off to the right of the <code>||</code> operator. What's at the end of those spaces? 
</p> <p> Yes, you guessed it: another Boolean expression, including the missing closing bracket: </p> <p> <pre><span style="font-weight:bold;color:#1f377f;">d</span>.Data.Choicetype&nbsp;==&nbsp;<span style="color:#2b91af;">GarplyChoicetype</span>.BtoC)&nbsp;&amp;&amp;</pre> </p> <p> If you delete all those redundant spaces, this is the actual code: </p> <p> <pre>ApplyDueAmountG89.<span style="font-weight:bold;color:#74531f;">Calculate</span>(<span style="font-weight:bold;color:#1f377f;">postState</span>.PartialMebershipsBAT.<span style="font-weight:bold;color:#74531f;">Where</span>( &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">d</span>&nbsp;=&gt;&nbsp;(<span style="font-weight:bold;color:#1f377f;">d</span>.Data.Choicetype&nbsp;==&nbsp;<span style="color:#2b91af;">GarplyChoicetype</span>.AtoC&nbsp;||&nbsp;<span style="font-weight:bold;color:#1f377f;">d</span>.Data.Choicetype&nbsp;==&nbsp;<span style="color:#2b91af;">GarplyChoicetype</span>.BtoC)&nbsp;&amp;&amp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">retirablePartialMembershipNr</span>.<span style="font-weight:bold;color:#74531f;">Contains</span>(<span style="font-weight:bold;color:#1f377f;">d</span>.Data.PartialMembershipNr)).<span style="font-weight:bold;color:#74531f;">ToList</span>(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ApplyDueAmountG89.Situation.Depreciation, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ApplyDueAmountG89.RecordType.Primo);</pre> </p> <p> Imagine troubleshooting code like that, and not realising that there's another Boolean expression so far right that even a large screen doesn't show it. This blog is totally free, but if you like it, please consider supporting it. 
Semigroup resonance FizzBuzz https://blog.ploeh.dk/2019/12/30/semigroup-resonance-fizzbuzz 2019-12-30T10:44:00+00:00 Mark Seemann <div id="post"> <p> <em>An alternative solution to the FizzBuzz kata.</em> </p> <p> A common solution to the <a href="http://codingdojo.org/kata/FizzBuzz/">FizzBuzz kata</a> is to write a loop from 1 to 100 and perform a modulo check for each number. Functional programming languages like <a href="https://www.haskell.org">Haskell</a> don't have loops, so instead you'd typically solve the kata like this: </p> <p> <pre><span style="color:#2b91af;">isAMultipleOf</span>&nbsp;::&nbsp;<span style="color:blue;">Integral</span>&nbsp;a&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Bool</span>
isAMultipleOf&nbsp;i&nbsp;multiplier&nbsp;=&nbsp;i&nbsp;`mod`&nbsp;multiplier&nbsp;==&nbsp;0
<span style="color:#2b91af;">convert</span>&nbsp;::&nbsp;(<span style="color:blue;">Integral</span>&nbsp;a,&nbsp;<span style="color:blue;">Show</span>&nbsp;a)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">String</span>
convert&nbsp;i&nbsp;|&nbsp;i&nbsp;`isAMultipleOf`&nbsp;3&nbsp;&amp;&amp;&nbsp;i&nbsp;`isAMultipleOf`&nbsp;5&nbsp;=&nbsp;<span style="color:#a31515;">&quot;FizzBuzz&quot;</span>
convert&nbsp;i&nbsp;|&nbsp;i&nbsp;`isAMultipleOf`&nbsp;3&nbsp;=&nbsp;<span style="color:#a31515;">&quot;Fizz&quot;</span>
convert&nbsp;i&nbsp;|&nbsp;i&nbsp;`isAMultipleOf`&nbsp;5&nbsp;=&nbsp;<span style="color:#a31515;">&quot;Buzz&quot;</span>
convert&nbsp;i&nbsp;=&nbsp;<span style="color:blue;">show</span>&nbsp;i
<span style="color:#2b91af;">main</span>&nbsp;::&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;()
main&nbsp;=&nbsp;<span style="color:blue;">mapM_</span>&nbsp;<span style="color:blue;">putStrLn</span>&nbsp;$&nbsp;convert&nbsp;&lt;$&gt;&nbsp;[1..100]</pre> </p> <p> 
There's more than one way to skin this cat. In this article, I'll demonstrate one based on <code>Semigroup</code> resonance. </p> <h3 id="66370c86e5c74b44a18bb9827c7dc34e"> Fizz stream <a href="#66370c86e5c74b44a18bb9827c7dc34e" title="permalink">#</a> </h3> <p> The fundamental idea is to use infinite streams that repeat at different intervals. That idea isn't mine, but I've never seen it done without resorting to some sort of Boolean conditional or pattern matching. </p> <p> You start with a finite sequence of values that represent the pulse of <em>Fizz</em> values: </p> <p> <pre>[Nothing,&nbsp;Nothing,&nbsp;Just&nbsp;<span style="color:#a31515;">&quot;Fizz&quot;</span>]</pre> </p> <p> If you repeat that sequence indefinitely, you now have a pulse of <em>Fizz</em> values: </p> <p> <pre><span style="color:#2b91af;">fizzes</span>&nbsp;::&nbsp;[<span style="color:#2b91af;">Maybe</span>&nbsp;<span style="color:#2b91af;">String</span>] fizzes&nbsp;=&nbsp;<span style="color:blue;">cycle</span>&nbsp;[Nothing,&nbsp;Nothing,&nbsp;Just&nbsp;<span style="color:#a31515;">&quot;Fizz&quot;</span>]</pre> </p> <p> This stream of values is one-based, since the first two entries are <code>Nothing</code>, and only every third is <code>Just "Fizz"</code>: </p> <p> <pre>*FizzBuzz&gt; take 9 fizzes [Nothing, Nothing, Just "Fizz", Nothing, Nothing, Just "Fizz", Nothing, Nothing, Just "Fizz"]</pre> </p> <p> If you're wondering why I chose a stream of <code>Maybe String</code> instead of just a stream of <code>String</code> values, I'll explain that now. 
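</p> <p> Before that explanation, here's a quick sanity check of the pulse: a minimal sketch that lists the one-based positions at which the stream holds a value. The helper name <code>fizzPositions</code> is made up for this illustration and isn't part of the original solution.

```haskell
import Data.Maybe (isJust)

-- The Fizz pulse from the article: two silent beats, then "Fizz".
fizzes :: [Maybe String]
fizzes = cycle [Nothing, Nothing, Just "Fizz"]

-- Hypothetical helper (not from the article): the one-based positions,
-- among the first n entries, that carry a value.
fizzPositions :: Int -> [Int]
fizzPositions n = [i | (i, x) <- zip [1 ..] (take n fizzes), isJust x]

main :: IO ()
main = print (fizzPositions 12) -- prints [3,6,9,12]
```

Every position that carries a value is a multiple of three, which is what the kata asks of <em>Fizz</em>, and no modulo check was needed to get there.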
</p> <h3 id="000531d5739e4d3bbaeff702e052ca6c"> Buzz stream <a href="#000531d5739e4d3bbaeff702e052ca6c" title="permalink">#</a> </h3> <p> You can define an equivalent infinite stream of <em>Buzz</em> values: </p> <p> <pre><span style="color:#2b91af;">buzzes</span>&nbsp;::&nbsp;[<span style="color:#2b91af;">Maybe</span>&nbsp;<span style="color:#2b91af;">String</span>] buzzes&nbsp;=&nbsp;<span style="color:blue;">cycle</span>&nbsp;[Nothing,&nbsp;Nothing,&nbsp;Nothing,&nbsp;Nothing,&nbsp;Just&nbsp;<span style="color:#a31515;">&quot;Buzz&quot;</span>]</pre> </p> <p> The idea is the same, but the rhythm is different: </p> <p> <pre>*FizzBuzz&gt; take 10 buzzes [Nothing, Nothing, Nothing, Nothing, Just "Buzz", Nothing, Nothing, Nothing, Nothing, Just "Buzz"]</pre> </p> <p> Why not simply generate a stream of <code>String</code> values, like the following? </p> <p> <pre>*FizzBuzz&gt; take 10 $cycle ["", "", "", "", "Buzz"] ["", "", "", "", "Buzz", "", "", "", "", "Buzz"]</pre> </p> <p> At first glance this looks simpler, but it makes it harder to merge the stream of <em>Fizz</em> and <em>Buzz</em> values with actual numbers. Distinguishing between <code>Just</code> and <code>Nothing</code> values enables you to use the <em>Maybe catamorphism</em>. </p> <h3 id="f95fb4415768473ca8fdc576e227882c"> Resonance <a href="#f95fb4415768473ca8fdc576e227882c" title="permalink">#</a> </h3> <p> You can now <em>zip</em> the <code>fizzes</code> with the <code>buzzes</code>: </p> <p> <pre><span style="color:#2b91af;">fizzBuzzes</span>&nbsp;::&nbsp;[<span style="color:#2b91af;">Maybe</span>&nbsp;<span style="color:#2b91af;">String</span>] fizzBuzzes&nbsp;=&nbsp;<span style="color:blue;">zipWith</span>&nbsp;<span style="color:#2b91af;">(&lt;&gt;)</span>&nbsp;fizzes&nbsp;buzzes</pre> </p> <p> You combine the values by monoidal composition. 
Any <code>Maybe</code> over a <code>Semigroup</code> itself gives rise to a <code>Monoid</code>, and since <code>String</code> forms a <code>Monoid</code> (and therefore also a <code>Semigroup</code>) over concatenation, you can <em>zip</em> the two streams using the <code>&lt;&gt;</code> operator. </p> <p> <pre>*FizzBuzz&gt; take 20 fizzBuzzes
[Nothing, Nothing, Just "Fizz", Nothing, Just "Buzz", Just "Fizz", Nothing, Nothing, Just "Fizz", Just "Buzz", Nothing, Just "Fizz", Nothing, Nothing, Just "FizzBuzz", Nothing, Nothing, Just "Fizz", Nothing, Just "Buzz"]</pre> </p> <p> Notice how the stream of <code>fizzes</code> enters into a resonance pattern with the stream of <code>buzzes</code>. Every fifteenth element the values <em>Fizz</em> and <em>Buzz</em> amplify each other and become <em>FizzBuzz</em>. </p> <h3 id="ef186fa307bd47b8a1651761cc1ae7af"> Numbers <a href="#ef186fa307bd47b8a1651761cc1ae7af" title="permalink">#</a> </h3> <p> While you have an infinite stream of <code>fizzBuzzes</code>, you also need a list of numbers. That's easy: </p> <p> <pre><span style="color:#2b91af;">numbers</span>&nbsp;::&nbsp;[<span style="color:#2b91af;">String</span>]
numbers&nbsp;=&nbsp;<span style="color:blue;">show</span>&nbsp;&lt;$&gt;&nbsp;[1..100]</pre> </p> <p> You just enumerate the range <code>[1..100]</code> and map each number to its <code>String</code> representation using <code>show</code>: </p> <p> <pre>*FizzBuzz&gt; take 18 numbers
["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18"]</pre> </p> <p> Now you just need to figure out how to merge the <code>fizzBuzzes</code> with the <code>numbers</code>. 
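</p> <p> Before moving on, it may help to see how the <code>&lt;&gt;</code> operator behind the resonance treats individual <code>Maybe String</code> values. The following is a minimal sketch covering the four possible combinations. The name <code>samples</code> is made up for this illustration, and the sketch assumes a reasonably recent GHC where <code>&lt;&gt;</code> is exported from the Prelude.

```haskell
-- Hypothetical name (not from the article): the four ways two
-- Maybe String values can meet under the Semigroup operator.
samples :: [Maybe String]
samples =
  [ Nothing <> Nothing          -- neither stream has a value
  , Just "Fizz" <> Nothing      -- only the Fizz stream has a value
  , Nothing <> Just "Buzz"      -- only the Buzz stream has a value
  , Just "Fizz" <> Just "Buzz"  -- both have values, so the strings concatenate
  ]

main :: IO ()
main = mapM_ print samples
-- prints:
-- Nothing
-- Just "Fizz"
-- Just "Buzz"
-- Just "FizzBuzz"
```

<code>Nothing</code> acts as the identity, so a lone <em>Fizz</em> or <em>Buzz</em> passes through unchanged, and only two <code>Just</code> values combine.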
</p> <h3 id="a84444b7decc42f1b2fb22b7a02718aa"> Zip with catamorphism <a href="#a84444b7decc42f1b2fb22b7a02718aa" title="permalink">#</a> </h3> <p> While you can trivially <code>zip</code> <code>fizzBuzzes</code> with <code>numbers</code>, it doesn't solve the problem of which value to pick: </p> <p> <pre>*FizzBuzz&gt; take 5 $ zip numbers fizzBuzzes
[("1", Nothing), ("2", Nothing), ("3", Just "Fizz"), ("4", Nothing), ("5", Just "Buzz")]</pre> </p> <p> You want to use the second element of each tuple when there's a value, and only use the first element (the number) when the second element is <code>Nothing</code>. </p> <p> That's easily done with <code>fromMaybe</code> (you'll need to <code>import Data.Maybe</code> for that): </p> <p> <pre>*FizzBuzz&gt; fromMaybe "2" Nothing
"2"
*FizzBuzz&gt; fromMaybe "3" $ Just "Fizz"
"Fizz"</pre> </p> <p> That's just what you need, so <em>zip</em> <code>numbers</code> with <code>fizzBuzzes</code> using <code>fromMaybe</code>: </p> <p> <pre><span style="color:#2b91af;">elements</span>&nbsp;::&nbsp;[<span style="color:#2b91af;">String</span>]
elements&nbsp;=&nbsp;<span style="color:blue;">zipWith</span>&nbsp;fromMaybe&nbsp;numbers&nbsp;fizzBuzzes</pre> </p> <p> This <code>elements</code> list contains the values the kata instructs you to produce: </p> <p> <pre>*FizzBuzz&gt; take 14 elements
["1", "2", "Fizz", "4", "Buzz", "Fizz", "7", "8", "Fizz", "Buzz", "11", "Fizz", "13", "14"]</pre> </p> <p> <code>fromMaybe</code> is a specialisation of the <a href="/2019/05/20/maybe-catamorphism">Maybe catamorphism</a>. I always find it interesting when I can solve a problem with <a href="/2019/04/29/catamorphisms">catamorphisms</a> and <a href="/2017/10/06/monoids">monoids</a>, because it shows that, perhaps, there's some value in knowing <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstractions</a>. 
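</p> <p> To make the 'specialisation' claim concrete, here's a minimal sketch that defines <code>fromMaybe</code> in terms of <code>maybe</code>, the Maybe catamorphism from the Prelude. The primed name <code>fromMaybe'</code> is made up for this illustration, to avoid clashing with the real function in <code>Data.Maybe</code>.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical re-implementation (not from the article): fromMaybe is the
-- Maybe catamorphism `maybe` with the identity function for the Just case.
fromMaybe' :: a -> Maybe a -> a
fromMaybe' def = maybe def id

main :: IO ()
main = do
  putStrLn (fromMaybe' "2" Nothing)        -- prints 2
  putStrLn (fromMaybe' "3" (Just "Fizz"))  -- prints Fizz
  -- The sketch agrees with the library function on this input.
  print (fromMaybe' "2" Nothing == fromMaybe "2" Nothing) -- prints True
```

The catamorphism takes one argument per case: a default for <code>Nothing</code> and a function for <code>Just</code>. Passing <code>id</code> as that function is all it takes to arrive at <code>fromMaybe</code>.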
</p> <h3 id="02f08a325df6462994e6081928bf67bc"> From 1 to 100 <a href="#02f08a325df6462994e6081928bf67bc" title="permalink">#</a> </h3> <p> The kata instructions are to write a program that prints the numbers from 1 to 100, according to the special rules. You can use <code>mapM_ putStrLn</code> for that: </p> <p> <pre><span style="color:#2b91af;">main</span>&nbsp;::&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;() main&nbsp;=&nbsp;<span style="color:blue;">mapM_</span>&nbsp;<span style="color:blue;">putStrLn</span>&nbsp;elements</pre> </p> <p> When you execute the <code>main</code> function, you get the desired output: </p> <p> <pre>1 2 Fizz 4 Buzz Fizz 7 8 Fizz Buzz 11 Fizz 13 14 FizzBuzz 16</pre> </p> <p> ... and so on. </p> <h3 id="399d6c6550344aa29b5bb2a88ff8c216"> Golf <a href="#399d6c6550344aa29b5bb2a88ff8c216" title="permalink">#</a> </h3> <p> Haskell <a href="https://en.wikipedia.org/wiki/Code_golf">golfers</a> may complain that the above code is unnecessarily verbose. I disagree, but you can definitely write the entire kata as a 'one-liner' if you want to: </p> <p> <pre><span style="color:#2b91af;">main</span>&nbsp;::&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;() main&nbsp;= &nbsp;&nbsp;<span style="color:blue;">mapM_</span>&nbsp;<span style="color:blue;">putStrLn</span>&nbsp;$&nbsp;&nbsp;<span style="color:blue;">zipWith</span>&nbsp;fromMaybe&nbsp;(<span style="color:blue;">show</span>&nbsp;&lt;$&gt;&nbsp;[1..100])&nbsp;$&nbsp;&nbsp;<span style="color:blue;">zipWith</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">(&lt;&gt;)</span> &nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">cycle</span>&nbsp;[Nothing,&nbsp;Nothing,&nbsp;Just&nbsp;<span style="color:#a31515;">&quot;Fizz&quot;</span>]) &nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">cycle</span>&nbsp;[Nothing,&nbsp;Nothing,&nbsp;Nothing,&nbsp;Nothing,&nbsp;Just&nbsp;<span style="color:#a31515;">&quot;Buzz&quot;</span>])</pre> </p> <p> I've just mechanically in-lined all the 
values like <code>fizzes</code>, <code>buzzes</code>, etc. and formatted the code so that it fits comfortably in an <a href="/2019/11/04/the-80-24-rule">80x24 box</a>. The case of the mysterious curly bracket https://blog.ploeh.dk/2019/12/23/the-case-of-the-mysterious-curly-bracket 2019-12-23T06:46:00+00:00 Mark Seemann <div id="post"> <p> <em>The story of a curly bracket that I thought was redundant. Not so.</em> </p> <p> One of my clients was kind enough to show me some of their legacy code. As I was struggling to understand how it worked, I encountered something like this: </p> <p> <pre><span style="color:green;">//&nbsp;A&nbsp;lot&nbsp;of&nbsp;code&nbsp;has&nbsp;come&nbsp;before&nbsp;this.&nbsp;This&nbsp;is&nbsp;really&nbsp;on&nbsp;line&nbsp;665,&nbsp;column&nbsp;29.</span> <span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(<span style="color:#2b91af;">BARLifeScheme_BAR</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">scheme</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">postPartialMembership</span>.<span style="font-weight:bold;color:#74531f;">SchemesFilterObsolete</span>(<span style="color:#2b91af;">BARCompany</span>.ONETIMESUM,&nbsp;<span style="color:blue;">false</span>)) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">schemeCalculated</span>&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:#2b91af;">BARLifeSchemeCalculated_BAR</span>)<span style="font-weight:bold;color:#1f377f;">scheme</span>.SchemeCalculatedObsolete[<span style="font-weight:bold;color:#1f377f;">basis</span>.Data.Basis1order]; &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">decimal</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">hfcFactor</span>; 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(<span style="font-weight:bold;color:#1f377f;">postState</span>.OverallFuturerule&nbsp;==&nbsp;<span style="color:#2b91af;">BAROverallFuturerule</span>.Standard) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">bonusKey</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">BonusKey</span>(<span style="font-weight:bold;color:#1f377f;">postState</span>.<span style="font-weight:bold;color:#74531f;">PNr</span>()); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">hfcFactor</span>&nbsp;=&nbsp;1M&nbsp;-&nbsp;<span style="color:#2b91af;">CostFactory</span>.<span style="color:#74531f;">Instance</span>() &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.<span style="font-weight:bold;color:#74531f;">CostProvider</span>(<span style="font-weight:bold;color:#1f377f;">postState</span>.Data.FrameContractNr,&nbsp;<span style="font-weight:bold;color:#1f377f;">postState</span>.StateDate) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.<span style="font-weight:bold;color:#74531f;">GetAdministrationpercentageContribution</span>(<span style="font-weight:bold;color:#1f377f;">bonusKey</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">basis</span>.Data.Basis1order) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;/&nbsp;100M; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;Much&nbsp;more&nbsp;code&nbsp;comes&nbsp;after&nbsp;this...</span> &nbsp;&nbsp;&nbsp;&nbsp;} 
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;...and&nbsp;after&nbsp;this...</span> }</pre> </p> <p> For the record, this isn't the actual code that my client gave me. I wouldn't post someone else's code without their permission. It is, however, a faithful imitation of the original code. </p> <p> The actual code started at line 665 and further to the right. It was part of a larger block of code with <code>if</code> statements within <code>foreach</code> loops within <code>if</code> statements within <code>foreach</code> loops, and so on. The <code>foreach</code> keyword actually appeared at column 29. The entire file was 1708 lines long. </p> <p> The code has numerous smells, but here I'll focus on a single oddity. </p> <h3 id="62fa371342be48698054d125b8b326e0"> Inexplicable bracket <a href="#62fa371342be48698054d125b8b326e0" title="permalink">#</a> </h3> <p> Notice the curly bracket on the line before <code>hfcFactor</code>. Why is it there? </p> <p> Take a moment and see if you can guess. </p> <p> It doesn't seem to be required. It just surrounds a block of code, but it belongs to none of the usual language constructs that would normally call for the use of curly brackets. There's no <code>if</code>, <code>foreach</code>, <code>using</code>, or <code>try</code> before it. </p> <h3 id="f56822a0b88d4d288843723fece26f76"> Residue <a href="#f56822a0b88d4d288843723fece26f76" title="permalink">#</a> </h3> <p> I formed a theory as to why those brackets were in place. 
I thought that it might be the residue of an <code>if</code> statement that was no longer there; that, perhaps, the code had once looked like this: </p> <p> <pre><span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(<span style="font-weight:bold;color:#1f377f;">something</span>&nbsp;&lt;&nbsp;<span style="font-weight:bold;color:#1f377f;">somethingElse</span>)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">decimal</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">hfcFactor</span>;</pre> </p> <p> Later, a developer had discovered that the code in that block should <em>always</em> be executed, and had removed the <code>if</code> statement without removing the curly brackets. </p> <p> That was my theory, but then I noticed that this structure appeared frequently throughout the code. Mysterious curly brackets were common, sometimes even nested within each other. </p> <p> This idiom appeared too often for it to look like a legacy from earlier epochs. It looked deliberate. </p> <h3 id="1b30ee97ed92442caf8d054b4239e334"> The real reason <a href="#1b30ee97ed92442caf8d054b4239e334" title="permalink">#</a> </h3> <p> When I had the opportunity, I asked one of the developers. </p> <p> He smiled sheepishly when he told me that those curly brackets were there to introduce a <em>variable scope</em>. The curly brackets protected variables within them from colliding with other variables elsewhere in the 744-line method. </p> <p> Those scopes enabled programmers to declare variables with names that would otherwise collide with other variables. They even enabled developers to declare a variable with the same name, but a different type. </p> <p> I was appalled. </p> <h3 id="c0fc550039c341ac81340d20d4011e29"> Legacy <a href="#c0fc550039c341ac81340d20d4011e29" title="permalink">#</a> </h3> <p> I didn't write this article to point fingers. I don't think that professional software developers deliberately decide to write obscure code. 
</p> <p> Code becomes obscure over time. It's a slow, unyielding process. As Brian Foote and Joseph Yoder wrote in <em>The Selfish Class</em> (here quoted from <a href="http://amzn.to/1dEKjcj">Pattern Languages of Program Design 3</a>, p. 461): <blockquote> <p>"Will highly comprehensible code, by virtue of being easy to modify, inevitably be supplanted by increasingly less elegant code until some equilibrium is achieved between comprehensibility and fragility?"</p> <footer><cite>Brian Foote and Joseph Yoder</cite></footer> </blockquote> That's a disturbing thought. It suggests that 'good' code is unstable. I suspect that code tends to rot beyond comprehension. It's death by a thousand cuts. It's not any single edit that produces legacy code. This blog is totally free, but if you like it, please consider supporting it. The purpose is to correct a common misconception about statically typed languages. </p> <h3 id="07df6b539dd24732bd2c1f0038a27a71"> Ceremony <a href="#07df6b539dd24732bd2c1f0038a27a71" title="permalink">#</a> </h3> <p> People who favour dynamically typed languages over statically typed languages often emphasise that they find the lack of ceremony productive. That seems reasonable; only, it's a false dichotomy. </p> <ins datetime="2019-12-18T11:27Z"> <blockquote> <p> "Ceremony is what you have to do before you get to do what you really want to do." </p> <footer><cite><a href="https://youtu.be/4jCjDEb9KZI">Venkat Subramaniam</a></cite></footer> </blockquote> </ins> <p> Dynamically typed languages do seem to be light on ceremony, but you can't infer from that that statically typed languages have to require lots of ceremony. Unfortunately, all mainstream statically typed languages belong to the same family, and they <em>do</em> involve ceremony. I think that people extrapolate from what they know; they falsely conclude that all statically typed languages must come with the overhead of ceremony. 
</p> <p> It looks to me more as though there's an unfortunate <em>Zone of Ceremony</em>: </p> <p> <img src="/content/binary/zone-of-ceremony.png" alt="A conceptual spectrum of typing, from dynamic on the left, to static on the right. There's a zone of ceremony slightly to the right of the middle with the languages C++, C#, and Java."> </p> <p> Such a diagram can never be anything but a simplification, but I hope that it's illuminating. C++, Java, and C# are all languages that involve ceremony. To the right of them are what we could term the <em>trans-ceremonial languages</em>. These include <a href="https://fsharp.org">F#</a> and <a href="https://www.haskell.org">Haskell</a>. </p> <ins datetime="2019-12-18T19:30Z"> <p> In the following, I'll show some code examples in various languages. I'll discuss ceremony according to the above definition. The discussion focuses on the amount of preparatory work one has to do, such as creating a new file, declaring a new class, and declaring types. The discussion is <em>not</em> about the implementation code. For that reason, I've removed colouring from the implementation code, and emphasised the code that I consider ceremonial. </p> </ins> <h3 id="32ff9013011b4c04a24e32d635b55ab2"> Low ceremony of JavaScript <a href="#32ff9013011b4c04a24e32d635b55ab2" title="permalink">#</a> </h3> <p> Imagine that you're given a list of numbers, as well as a quantity. The quantity is a number to be consumed. You must remove elements from the left until you've consumed at least that quantity. Then return the rest of the list. 
</p> <p> <pre>&gt; consume ([1,2,3], 1);
[ 2, 3 ]
&gt; consume ([1,2,3], 2);
[ 3 ]
&gt; consume ([1,2,3], 3);
[ 3 ]
&gt; consume ([1,2,3], 4);
[]</pre> </p> <p> The first example consumes only the leading <code>1</code>, while both the second and the third example consume both <code>1</code> and <code>2</code> because the sum of those values is <code>3</code>, and the requested quantity is <code>2</code> and <code>3</code>, respectively. The fourth example consumes all elements because the requested quantity is <code>4</code>, and you need all of <code>1</code>, <code>2</code>, and <code>3</code> before the sum is large enough. You have to pick strictly from the left, so you can't decide to just take the elements <code>1</code> and <code>3</code>. </p> <p> In JavaScript, you could implement the <code>consume</code> function like this: </p> <p> <pre><strong><span style="color:blue;">var</span>&nbsp;consume&nbsp;=&nbsp;<span style="color:blue;">function</span>&nbsp;(source,&nbsp;quantity)&nbsp;{</strong>
<span style="color: silver;">&nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;(!source)&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return&nbsp;[];
&nbsp;&nbsp;&nbsp;&nbsp;}
&nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;accumulator&nbsp;=&nbsp;0;
&nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;result&nbsp;=&nbsp;[];
&nbsp;&nbsp;&nbsp;&nbsp;for&nbsp;(var&nbsp;i&nbsp;=&nbsp;0;&nbsp;i&nbsp;&lt;&nbsp;source.length;&nbsp;i++)&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;x&nbsp;=&nbsp;source[i];
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;(quantity&nbsp;&lt;=&nbsp;accumulator)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;result.push(x);
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;accumulator&nbsp;+=&nbsp;x;
&nbsp;&nbsp;&nbsp;&nbsp;}
&nbsp;&nbsp;&nbsp;&nbsp;return&nbsp;result;</span>
<strong>}</strong></pre> </p> <p> I'm a terrible JavaScript programmer, so I'm sure that it could have been done more elegantly, but as far as I can tell, it gets the 
job done. I wrote some tests, and I have 17 passing test cases. The point isn't about how you write the function, but how much ceremony is required. In JavaScript you don't need to declare any types. Just name the function and its arguments, and you're ready to write code. </p> <h3 id="2b1bcebf36084abcae5bb231ca0ebe15"> High ceremony of C# <a href="#2b1bcebf36084abcae5bb231ca0ebe15" title="permalink">#</a> </h3> <p> Contrast the JavaScript example with C#. The same function in C# would look like this: </p> <p> <pre><strong><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">Enumerable</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;<span style="color:#74531f;">Consume</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">source</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">quantity</span>) &nbsp;&nbsp;&nbsp;&nbsp;{</strong> <span style="color: silver;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;(source&nbsp;is&nbsp;null) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;yield&nbsp;break; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;accumulator&nbsp;=&nbsp;0; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;foreach&nbsp;(var&nbsp;i&nbsp;in&nbsp;source) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;(quantity&nbsp;&lt;=&nbsp;accumulator) 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;yield&nbsp;return&nbsp;i; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;accumulator&nbsp;+=&nbsp;i; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}</span> <strong>&nbsp;&nbsp;&nbsp;&nbsp;} }</strong></pre> </p> <p> Here you have to declare the type of each method argument, as well as the return type of the method. You also have to put the method in a class. This may not seem like much overhead, but if you later need to change the types, editing is required. This can affect downstream callers, so simple type changes ripple through code bases. </p> <p> It gets worse, though. The above <code>Consume</code> method only handles <code>int</code> values. What if you need to call the method with <code>long</code> arrays? </p> <p> You'd have to add an overload: </p> <p> <pre><strong><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:blue;">long</span>&gt;&nbsp;<span style="color:#74531f;">Consume</span>( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:blue;">long</span>&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">source</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">long</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">quantity</span>) {</strong> <span style="color: silver;">&nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;(source&nbsp;is&nbsp;null) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;yield&nbsp;break; &nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;accumulator&nbsp;=&nbsp;0L; &nbsp;&nbsp;&nbsp;&nbsp;foreach&nbsp;(var&nbsp;i&nbsp;in&nbsp;source) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;(quantity&nbsp;&lt;=&nbsp;accumulator) 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;yield&nbsp;return&nbsp;i; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;accumulator&nbsp;+=&nbsp;i; &nbsp;&nbsp;&nbsp;&nbsp;}</span> <strong>}</strong></pre> </p> <p> Do you need support for <code>short</code>? Add an overload. <code>decimal</code>? Add an overload. <code>byte</code>? Add an overload. </p> <p> No wonder people used to dynamic languages find this awkward. </p> <h3 id="bc9ab1e2693d41678a09cf843436c736"> Low ceremony of F# <a href="#bc9ab1e2693d41678a09cf843436c736" title="permalink">#</a> </h3> <p> You can write the same functionality in F#: </p> <p> <pre><strong><span style="color:blue;">let</span>&nbsp;<span style="color:blue;">inline</span></strong><span style="color: silver;">&nbsp;consume&nbsp;quantity&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;let&nbsp;go&nbsp;(acc,&nbsp;xs)&nbsp;x&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;quantity&nbsp;&lt;=&nbsp;acc
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;then&nbsp;(acc,&nbsp;Seq.append&nbsp;xs&nbsp;(Seq.singleton&nbsp;x))
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;else&nbsp;(acc&nbsp;+&nbsp;x,&nbsp;xs)
&nbsp;&nbsp;&nbsp;&nbsp;Seq.fold&nbsp;go&nbsp;(LanguagePrimitives.GenericZero,&nbsp;Seq.empty)&nbsp;&gt;&gt;&nbsp;snd</span></pre> </p> <p> There's no type declaration in sight, but nonetheless the function is statically typed. It has this somewhat complicated type: </p> <p> <pre>quantity: ^a -&gt; (seq&lt; ^b&gt; -&gt; seq&lt; ^b&gt;) when ( ^a or ^b) : (static member ( + ) : ^a * ^b -&gt; ^a) and ^a : (static member get_Zero : -&gt; ^a) and ^a : comparison</pre> </p> <p> While this looks arcane, it means that it supports sequences of any type that comes with a zero value and supports addition and comparison. 
You can call it with both 32-bit integers, decimals, and so on: </p> <p> <pre>&gt; consume 2 [1;2;3];; val it : seq&lt;int&gt; = seq [3] &gt; consume 2m [1m;2m;3m];; val it : seq&lt;decimal&gt; = seq [3M]</pre> </p> <p> Static typing still means that you can't just call it with any type of value. An expression like <code>consume "foo" [true;false;true]</code> will not compile. </p> <p> You can explicitly declare types in F# (like you can in C#), but my experience is that if you don't, type changes tend to just propagate throughout your code base. Change a type of a function, and upstream callers generally just 'figure it out'. If you think of functions calling other functions as a graph, you often only have to adjust leaf nodes even when you change the type of something deep in your code base. </p> <h3 id="e360bba3f4954f6988fd1267c70b78e1"> Low ceremony of Haskell <a href="#e360bba3f4954f6988fd1267c70b78e1" title="permalink">#</a> </h3> <p> Likewise, you can write the function in Haskell: </p> <p> <pre><span style="color: silver;">consume&nbsp;quantity&nbsp;=&nbsp;reverse&nbsp;.&nbsp;snd&nbsp;.&nbsp;foldl&nbsp;go&nbsp;(0,&nbsp;[]) &nbsp;&nbsp;</span><strong>where</strong><span style="color: silver;"> &nbsp;&nbsp;&nbsp;&nbsp;go&nbsp;(acc,&nbsp;ys)&nbsp;x&nbsp;=&nbsp;if&nbsp;quantity&nbsp;&lt;=&nbsp;acc&nbsp;then&nbsp;(acc,&nbsp;x:ys)&nbsp;else&nbsp;(acc&nbsp;+&nbsp;x,&nbsp;ys)</span></pre> </p> <p> Again, you don't have to explicitly declare any types. The compiler figures them out. You can ask GHCi about the function's type, and it'll tell you: </p> <p> <pre>&gt; :t consume consume :: (Foldable t, Ord a, Num a) =&gt; a -&gt; t a -&gt; [a]</pre> </p> <p> It's more compact than the inferred F# type, but the idea is the same. It'll compile for any <code>Foldable</code> container <code>t</code> and any type <code>a</code> that belongs to the classes of types called <code>Ord</code> and <code>Num</code>. 
<code>Num</code> supports addition and <code>Ord</code> supports comparison. </p> <p> There's little ceremony involved with the types in Haskell or F#, yet both languages are statically typed. In fact, their type systems are more powerful than C#'s or Java's. They can express relationships between types that those languages can't. </p> <h3 id="d215cd3b1cd84ce3877dc88e8ce944be"> Summary <a href="#d215cd3b1cd84ce3877dc88e8ce944be" title="permalink">#</a> </h3> <p> In debates about static versus dynamic typing, contributors often generalise from their experience with C++, Java, or C#. They dislike the amount of ceremony required in these languages, but falsely believe that it means that you can't have static types without ceremony. </p> <p> The statically typed mainstream languages seem to occupy a <em>Zone of Ceremony</em>. </p> <p> Static typing without ceremony is possible, as evidenced by languages like F# and Haskell. You could call such languages <em>trans-ceremonial languages</em>. They offer the best of both worlds: compile-time checking and little ceremony. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="7ed32b43867d490cad18388ec86baab4"> <div class="comment-author">Tyson Williams</div> <div class="comment-content"> <p> In your initial/<code>int</code> C# example, I think your point is that method arguments and the return type require <a href="https://en.wikipedia.org/wiki/Manifest_typing">manifest</a> typing. Then for your example about <code>long</code> (and comments about <code>short</code>, <code>decimal</code>, and <code>byte</code>), I think your point is that C#'s type system is primarily <a href="https://en.wikipedia.org/wiki/Nominal_type_system">nominal</a>. 
You then contrast those C# examples with F# and Haskell examples that utilize <a href="https://en.wikipedia.org/wiki/Type_inference">inferred</a> and <a href="https://en.wikipedia.org/wiki/Structural_type_system">structural</a> aspects of their type systems. </p> <p> I also sometimes get involved in debates about static versus dynamic typing and find myself on the side of static typing. Furthermore, I also typically hear arguments against manifest and nominal typing instead of against static typing. In theory, I agree with those arguments; I also prefer type systems that are inferred and structural instead of those that are manifest and nominal. </p> <p> I see the tradeoff as being among the users of the programming language, those responsible for writing and maintaining the compiler/interpreter, and what can be said about the correctness of the code. (In the rest of this paragraph, all statements about things being simple or complex are meant to be relative. I will also exaggerate for the sake of simplifying my statements.) For a dynamic language, the interpreter and coding are simple but there are no guarantees about correctness. For a static, manifest, and nominal language, the compiler is somewhere between simple and complex, the coding is complex, but at least there are some guarantees about correctness. For a static, inferred, structural language, the compiler is complex, coding is simple, and there are some guarantees about correctness. </p> <p> Contrasting a dynamic language with one that is static, inferred, and structural, I see the tradeoff as being directly between the compiler/interpreter writers and what can be said about the correctness of the code while the experience of those writing code in the language is mostly unchanged. I think that is your point being made by contrasting the JavaScript example (a dynamic language) with the F# and Haskell examples (that demonstrate the static, inferred, and structural behavior of their type systems).
</p> <p> While we are on the topic, I would like to say something that I think is controversial about <a href="https://en.wikipedia.org/wiki/Duck_typing">duck typing</a>. I think duck typing is "just" a dynamic type system that is also structural. This contradicts the lead of its Wikipedia article (linked above) as well as the <a href="https://en.wikipedia.org/wiki/Duck_typing#Structural_type_systems">subsection about structural type systems</a>. They both imply that nominal vs structural typing is a spectrum that only exists for static languages. I disagree; I think dynamic languages can also exist on that spectrum. It is just that most dynamic languages are also structural. In contrast, I think that the manifest vs inferred spectrum exists for static languages but not for dynamic languages. </p> <p> Nonetheless, that subsection makes a great observation. For structural languages, the difference between static and dynamic languages is not just some guarantees about correctness. Dynamic languages check for type correctness at the last possible moment. (That is saying more than saying that the type check happens at runtime.) For example, consider a function with dead code that "doesn't type". If the type system were static, then this function cannot be executed, but if the type system were dynamic, then it could be executed. More practically, suppose the function is a simple <code>if-else</code> statement with code in the <code>else</code> branch that "doesn't type" and that the corresponding Boolean expression always evaluates to <code>true</code>. If the type system were static, then this function cannot be executed, but if the type system were dynamic, then it could be executed. </p> <p> In my experience, the typical solution of a functional programmer would be to strengthen the input types so that the <code>else</code> branch can be proved by the compiler to be dead code and then delete the dead code. 
This approach makes this one function simpler, and I generally am in favor of this. However, there is a sense in which we can't always repeat this for the calling function. Otherwise, we would end up with a program that is provably correct, which is impossible for a Turing-complete language. Instead, I think the practical solution is to (at some appropriate level) short-circuit the computation when given input that is not known to be good and either do nothing or report back to the user that the input wasn't accepted. </p> </div> <div class="comment-date">2019-12-16 17:12 UTC</div> </div> <div class="comment" id="048a007a6d7f4f67b4ed92b748c78c13"> <div class="comment-author">Romain Deneau <a href="https://twitter.com/DeneauRomain">@DeneauRomain</a></div> <div class="comment-content"> <p> Using mostly both C# and TypeScript, two statically typed languages, I’ve experienced how it’s terser in TypeScript, essentially thanks to its type inference and its structural typing. I like the notion of <em>“Ceremony”</em> you gave to describe this and the fact that it’s not correlated to the kind of typing, dynamic or static 👍 </p> <p> Still, TypeScript is more verbose than F#, as we can see with the following code translation from F# to TypeScript using object literal instead of tuple for the better support of the former: </p> <pre> <span class="hljs-comment" style="color:green;">// const consume = (source: number[], quantity: number): number[]</span> <span class="hljs-keyword" style="color:blue;">const</span> consume = (source: <span class="hljs-built_in" style="color:blue;">number</span>[], quantity: <span class="hljs-built_in" style="color:blue;">number</span>) =&gt; source.reduce(({ acc, xs }, x) =&gt; quantity &lt;= acc ?
{ acc, xs: xs.concat(x) } : { acc: acc + x, xs }, { acc: <span class="hljs-number" style="color:purple">0</span>, xs: [] <span class="hljs-keyword" style="color:blue;">as</span> <span class="hljs-built_in" style="color:blue;">number</span>[] } ).xs;</pre> <p> Checks: </p> <pre> &gt; consume([1,2,3], 1) [2,3] &gt; consume([1,2,3], 2) [3] &gt; consume([1,2,3], 3) [3] &gt; consume([1,2,3], 4) []</pre> <p> As we can see, the code is a little more verbose than in JavaScript but still terser than in C#. The returned type is inferred as <code>number[]</code> but the <code>as number[]</code> is a pity, necessary because the inferred type of the empty array <code>[]</code> is <code>any[]</code>. </p> <p> <code>consume</code> is not generic: TypeScript/JavaScript has only one primitive for numbers: <code>number</code>. It works for common scenarios but there's no simple way to make it work with <code>BigInt</code>, for instance using the union type <code>number | bigint</code>. The more pragmatic option would be to copy-paste, replacing <code>number</code> with <code>bigint</code> and <code>0</code> with <code>0n</code>. </p> </div> <div class="comment-date">2019-12-20 10:10 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Put cyclomatic complexity to good use https://blog.ploeh.dk/2019/12/09/put-cyclomatic-complexity-to-good-use 2019-12-09T14:37:00+00:00 Mark Seemann <div id="post"> <p> <em>An actually useful software metric.</em> </p> <p> In <a href="https://cleancoders.com/video-details/humane-code-real-episode-1">Humane Code</a> I argue that software development suffers from a lack of useful measurements. While I stand by that general assertion, a few code metrics can be useful. <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">Cyclomatic complexity</a>, while no <a href="/2019/07/01/yes-silver-bullet">silver bullet</a>, can be put to good use.
</p> <h3 id="1462c0daaa1d42eba09691214f9ac8da"> Recap <a href="#1462c0daaa1d42eba09691214f9ac8da" title="permalink">#</a> </h3> <p> I think of cyclomatic complexity as a measure of the number of pathways through a piece of code. Even the simplest body of code affords a single pathway, so the minimum cyclomatic complexity is <em>1</em>. You can easily 'calculate' the cyclomatic complexity of a method or function. You start at one, and then you count how many times <code>if</code> and <code>for</code> occur. For each of these keywords you find, you increment the number (which started at <em>1</em>). </p> <p> The specifics are language-dependent. The idea is to count branching and looping instructions. In C#, for example, you'd also have to include <code>foreach</code>, <code>while</code>, <code>do</code>, and each <code>case</code> in a <code>switch</code> block. In other languages, the keywords to count will differ. </p> <p> What's the cyclomatic complexity of this <code>TryParse</code> method? </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:#74531f;">TryParse</span>(<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">candidate</span>,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:#2b91af;">UserNamePassworCredentials</span>?&nbsp;<span style="font-weight:bold;color:#1f377f;">credentials</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">credentials</span>&nbsp;=&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">arr</span>&nbsp;=&nbsp;<span style="font-weight:bold;color:#1f377f;">candidate</span>.<span style="font-weight:bold;color:#74531f;">Split</span>(<span style="color:#a31515;">&#39;,&#39;</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span
style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(<span style="font-weight:bold;color:#1f377f;">arr</span>.Length&nbsp;&lt;&nbsp;2) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">credentials</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">UserNamePassworCredentials</span>(<span style="font-weight:bold;color:#1f377f;">arr</span>[0],&nbsp;<span style="font-weight:bold;color:#1f377f;">arr</span>[1]); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">true</span>; }</pre> </p> <p> The cyclomatic complexity of this method is <em>2</em>. You start with the number <em>1</em> and then increment it every time you find one of the branching keywords. In this case, there's only a single <code>if</code>, so increment <em>1</em> to <em>2</em>. That's it. If you're in doubt, <a href="https://docs.microsoft.com/en-us/visualstudio/code-quality/code-metrics-values">Visual Studio can calculate metrics for you</a>. (It calculates other metrics as well, but I don't find those useful.) </p> <h3 id="ca340e4d0dbd4391a1e15bd56ab43076"> Guide for unit testing <a href="#ca340e4d0dbd4391a1e15bd56ab43076" title="permalink">#</a> </h3> <p> I find cyclomatic complexity useful because it measures the number of pathways through a method. As such, it indicates the <em>minimum</em> number of test cases you ought to furnish. This is useful when reviewing code and tests. </p> <p> Sometimes I'm presented with code that other people wrote. When I look through the production code, I consider its cyclomatic complexity. If, for example, a method has a cyclomatic complexity of <em>5</em>, I'd expect to find at least five test cases to cover that method. </p> <p> At other times, I start by reading the tests. 
The number of test cases gives me a rough indication of what degree of complexity to expect. If I see four distinct tests for the same method, I expect it to have a cyclomatic complexity of about <em>4</em>. </p> <p> I don't demand 100% coverage. Sometimes, people don't write tests for <a href="https://en.wikipedia.org/wiki/Guard_(computer_science)">guard clauses</a>, and I usually accept such omissions. On the other hand, I think that proper decision logic should be covered by tests. If I were to stick unwaveringly to cyclomatic complexity, that would make my reviews more objective, but not necessarily better. I could insist on 100% code coverage, but <a href="/2015/11/16/code-coverage-is-a-useless-target-measure">I don't consider that a good idea</a>. </p> <p> Presented with the above <code>TryParse</code> method, I'd expect to see at least two unit tests, since the cyclomatic complexity is <em>2</em>. </p> <h3 id="bd88752e8821476984182c968e7a00bb"> The need for more test cases <a href="#bd88752e8821476984182c968e7a00bb" title="permalink">#</a> </h3> <p> Two unit tests aren't enough, though.
You could write these two tests: </p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">TryParseSucceeds</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">couldParse</span>&nbsp;=&nbsp;<span style="color:#2b91af;">UserNamePassworCredentials</span>.<span style="color:#74531f;">TryParse</span>(<span style="color:#a31515;">&quot;foo,bar&quot;</span>,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">True</span>(<span style="font-weight:bold;color:#1f377f;">couldParse</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">expected</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">UserNamePassworCredentials</span>(<span style="color:#a31515;">&quot;foo&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;bar&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>(<span style="font-weight:bold;color:#1f377f;">expected</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>); } [<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">TryParseFails</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">couldParse</span>&nbsp;=&nbsp;<span style="color:#2b91af;">UserNamePassworCredentials</span>.<span style="color:#74531f;">TryParse</span>(<span style="color:#a31515;">&quot;foo&quot;</span>,&nbsp;<span 
style="color:blue;">out</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">False</span>(<span style="font-weight:bold;color:#1f377f;">couldParse</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Null</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>); }</pre> </p> <p> Using the <a href="/2019/10/07/devils-advocate">Devil's advocate</a> technique, however, this implementation of <code>TryParse</code> passes both tests: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:#74531f;">TryParse</span>(<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">candidate</span>,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:#2b91af;">UserNamePassworCredentials</span>?&nbsp;<span style="font-weight:bold;color:#1f377f;">credentials</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">credentials</span>&nbsp;=&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(<span style="font-weight:bold;color:#1f377f;">candidate</span>&nbsp;!=&nbsp;<span style="color:#a31515;">&quot;foo,bar&quot;</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">credentials</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">UserNamePassworCredentials</span>(<span style="color:#a31515;">&quot;foo&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;bar&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">true</span>; }</pre> </p> <p> This is clearly not the correct implementation, but it has 100% code coverage. It also still has cyclomatic complexity of <em>2</em>. The metric suggests a <em>minimum</em> number of tests - not a sufficient number. </p> <h3 id="15d3b6efa5f74ab4ba5428d81a9e2443"> More test cases <a href="#15d3b6efa5f74ab4ba5428d81a9e2443" title="permalink">#</a> </h3> <p> It often makes sense to cover each branch with a single parametrised test: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;foo,bar&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;foo&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;bar&quot;</span>)] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;baz,qux&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;baz&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;qux&quot;</span>)] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;ploeh,fnaah&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;ploeh&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;fnaah&quot;</span>)] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;foo,bar,baz&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;foo&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;bar&quot;</span>)] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">TryParseSucceeds</span>(<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">candidate</span>,&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">userName</span>,&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">password</span>) 
{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">couldParse</span>&nbsp;=&nbsp;<span style="color:#2b91af;">UserNamePassworCredentials</span>.<span style="color:#74531f;">TryParse</span>(<span style="font-weight:bold;color:#1f377f;">candidate</span>,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">True</span>(<span style="font-weight:bold;color:#1f377f;">couldParse</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">expected</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">UserNamePassworCredentials</span>(<span style="font-weight:bold;color:#1f377f;">userName</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">password</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>(<span style="font-weight:bold;color:#1f377f;">expected</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>); } [<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;&quot;</span>)] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;foobar&quot;</span>)] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;foo;bar&quot;</span>)] [<span style="color:#2b91af;">InlineData</span>(<span style="color:#a31515;">&quot;foo&quot;</span>)] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">TryParseFails</span>(<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">candidate</span>) { 
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">couldParse</span>&nbsp;=&nbsp;<span style="color:#2b91af;">UserNamePassworCredentials</span>.<span style="color:#74531f;">TryParse</span>(<span style="font-weight:bold;color:#1f377f;">candidate</span>,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">False</span>(<span style="font-weight:bold;color:#1f377f;">couldParse</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Null</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>); }</pre> </p> <p> Is a total of eight test cases the correct number? Cyclomatic complexity can't help you here. You'll have to rely on other heuristics, such as test-driven development, the <a href="https://blog.cleancoder.com/uncle-bob/2013/05/27/TheTransformationPriorityPremise.html">transformation priority premise</a>, and the Devil's Advocate. </p> <h3 id="a8b2fa3d0101430d871b99a4bab136b7"> Humane Code <a href="#a8b2fa3d0101430d871b99a4bab136b7" title="permalink">#</a> </h3> <p> I also find cyclomatic complexity useful for another reason. I keep an eye on complexity because I care about code maintainability. In my <a href="https://cleancoders.com/video-details/humane-code-real-episode-1">Humane Code</a> video, I discuss <a href="https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two">the magic number seven, plus or minus two</a>. </p> <p> When you read code, you essentially run a little emulator in your brain. You have to maintain state in order to interpret the code you look at. <em>Will this conditional evaluate to true or false? Is the code going to exit that loop now? 
Is that array index out of bounds?</em> You can only follow the code by keeping track of variables' contents, and your brain can keep track of approximately seven things. </p> <p> Cyclomatic complexity is a measure of pathways - not how many things you need to keep track of. Still, in my experience, there seems to be a useful correlation. Code with high cyclomatic complexity tends to have many moving parts. There's too much to keep track of. With low cyclomatic complexity, on the other hand, the code involves few moving parts. </p> <p> I use cyclomatic complexity <em>7</em> as an approximate maximum for that reason. It's only a rule of thumb, since I'm painfully aware that I'm transplanting experimental psychology to a context where no conclusions can be scientifically drawn. But like <a href="/2019/11/04/the-80-24-rule">the 80/24 rule</a>, I find that it works well in practice. </p> <h3 id="de927bfcc95d410bbfcd0adf7a63926b"> Complexity of a method call <a href="#de927bfcc95d410bbfcd0adf7a63926b" title="permalink">#</a> </h3> <p> Consider the above parametrised tests. Some of the test cases provide enough triangulation to defeat the Devil's attempt at hard-coding return values. This explains test values like <code>"foo,bar"</code>, <code>"baz,qux"</code>, and <code>"ploeh,fnaah"</code>, but why did I include the <code>"foo,bar,baz"</code> test case? And why did I include the empty string as one of the test cases for <code>TryParseFails</code>? </p> <p> When I write tests, I aspire to compose tests that verify the behaviour rather than the implementation of the System Under Test. The desired behaviour, I decided, is that any extra entries in the comma-separated input should be ignored. Likewise, if there are fewer than two entries, parsing should fail. There must be both a user name and a password. </p> <p> Fortunately, this happens to be how <a href="https://docs.microsoft.com/dotnet/api/system.string.split">Split</a> already works.
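</p> <p> To see why those two test cases come out the way they do, a minimal sketch like the following prints how many entries <code>Split</code> produces for some of the test inputs. The <code>SplitDemo</code> class and its <code>CountEntries</code> helper are hypothetical names made up for this illustration; only <code>string.Split</code> comes from the base class library. </p> <p> <pre>using System;

public static class SplitDemo
{
    // Hypothetical helper: how many entries does Split(',') produce?
    public static int CountEntries(string candidate)
    {
        return candidate.Split(',').Length;
    }

    public static void Main()
    {
        Console.WriteLine(CountEntries("foo,bar"));     // 2
        Console.WriteLine(CountEntries("foo,bar,baz")); // 3: TryParse only reads arr[0] and arr[1]
        Console.WriteLine(CountEntries(""));            // 1: a single, empty entry
        Console.WriteLine(CountEntries("foo"));         // 1
        Console.WriteLine(CountEntries("foo;bar"));     // 1: ';' is not the delimiter
    }
}</pre> </p> <p> The empty string still yields one (empty) entry, so the <code>arr.Length &lt; 2</code> check rejects it, while <code>"foo,bar,baz"</code> yields three entries, of which <code>TryParse</code> only uses the first two. </p> <p>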
If you consider all the behaviour that <code>Split</code> exhibits, it encapsulates moderate complexity. It can split on multiple alternative delimiters, it can throw away empty entries, and so on. What would happen if you inline some of that functionality? </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:#74531f;">TryParse</span>(<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">candidate</span>,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:#2b91af;">UserNamePassworCredentials</span>?&nbsp;<span style="font-weight:bold;color:#1f377f;">credentials</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">credentials</span>&nbsp;=&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">l</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">List</span>&lt;<span style="color:blue;">string</span>&gt;(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">element</span>&nbsp;=&nbsp;<span style="color:#a31515;">&quot;&quot;</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">c</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">candidate</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(<span style="font-weight:bold;color:#1f377f;">c</span>&nbsp;==&nbsp;<span style="color:#a31515;">&#39;,&#39;</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">l</span>.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="font-weight:bold;color:#1f377f;">element</span>); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">element</span>&nbsp;=&nbsp;<span style="color:#a31515;">&quot;&quot;</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">else</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">element</span>&nbsp;+=&nbsp;<span style="font-weight:bold;color:#1f377f;">c</span>; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">l</span>.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="font-weight:bold;color:#1f377f;">element</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(<span style="font-weight:bold;color:#1f377f;">l</span>.Count&nbsp;&lt;&nbsp;2) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">credentials</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">UserNamePassworCredentials</span>(<span style="font-weight:bold;color:#1f377f;">l</span>[0],&nbsp;<span style="font-weight:bold;color:#1f377f;">l</span>[1]); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">true</span>; }</pre> </p> <p> This isn't as sophisticated as the <code>Split</code> method it replaces, but it passes all eight test cases. Why did I do this? To illustrate the following point. 
</p> <p> What's the cyclomatic complexity now? </p> <p> Keep in mind that the externally observable behaviour (as defined by eight test cases) hasn't changed. The cyclomatic complexity, however, has. It's now <em>4</em> - double the previous metric. </p> <p> A method call (like a call to <code>Split</code>) can hide significant cyclomatic complexity. That's a desirable situation. This is the benefit that <a href="/encapsulation-and-solid">encapsulation</a> offers: that you don't have to worry about implementation details as long as both caller and callee fulfil the contract. </p> <p> When you calculate cyclomatic complexity, a method call doesn't increment the complexity, regardless of the degree of complexity that it encapsulates. </p> <h3 id="10a3f65b7bf54b04b35b8e69552803b9"> Summary <a href="#10a3f65b7bf54b04b35b8e69552803b9" title="permalink">#</a> </h3> <p> Cyclomatic complexity is one of the rare programming metrics that I find useful. It measures the number of pathways through a body of code. </p> <p> You can use it to guide your testing efforts. The number is the minimum number of tests you must write in order to cover all branches. You'll likely need more test cases than that. </p> <p> You can also use the number as a threshold. I suggest that <em>7</em> ought to be the maximum cyclomatic complexity of a method or function. You're welcome to pick another number, but keeping an eye on cyclomatic complexity is useful. It tells you when it's time to refactor a complex method. </p> <p> Cyclomatic complexity considers only the code that directly implements a method or function. That code can call other code, but what happens behind a method call doesn't impact the metric. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="6fa1adc55d8841918b27599749283abd"> <div class="comment-author">Ghillie Dhu</div> <div class="comment-content"> <p> Do you know of a tool to calculate cyclomatic complexity for F#? 
It appears that the Visual Studio feature doesn't support it. </p> </div> <div class="comment-date">2019-12-09 19:20 UTC</div> </div> <div class="comment" id="1caf28073f484530ad8389f44ad4a531"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Ghillie, thank you for writing. I'm not aware of any such tool. </p> <p> FWIW, it's not difficult to manually calculate cyclomatic complexity for F#, but clearly that doesn't help if you'd like to automate the process. </p> <p> It might be a fine project for anyone looking for a way to contribute to the F# ecosystem. </p> </div> <div class="comment-date">2019-12-09 20:06 UTC</div> </div> <div class="comment" id="ab53d55f92eb4acea0a896436294f3af"> <div class="comment-author">Carlos Schults</div> <div class="comment-content"> <p> Hi, Mark. Thanks for your article. I'm commenting because I'd like to learn more about your thoughts on mutation testing. I ask this because I know you're not the biggest fan of code coverage as a useful metric. I'm not either, or at least I wasn't, until I learned about mutation testing. </p> <p>My current view is that code coverage is only (mostly) meaningless if you don't have a way of measuring the quality of the tests. Since mutation testing's goal is exactly that (to test the tests, if you will) my opinion is that, if you use a mutation testing tool, then code coverage becomes really useful and you should try to get to 100%. I've even written a <a href="https://blog.ncrunch.net/post/mutation-testing-code-coverage.aspx">post about this subject.</a></p> <p>So, in short: what are your thoughts on mutation testing and how it affects the meaning of code coverage, if at all? Looking forward to reading your answer. A whole post on this would be even better! 
</p> <p>Thanks!</p> </div> <div class="comment-date">2019-12-14 12:32 UTC</div> </div> <div class="comment" id="4b240074da164f87bfabcc484a2b8c7b"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Carlos, thank you for writing. I'm sympathetic to the idea of mutation testing, but apart from that, I have no opinion of it. I don't think that I ought to have an opinion about something with which I've no experience. </p> <p> I first heard about mutation testing decades ago, but I've never come across a mutation testing tool for C# (or F#, for that matter). Can you recommend one? </p> </div> <div class="comment-date">2019-12-14 13:51 UTC</div> </div> <div class="comment" id="46eafa3c33ba47b1bacd997f7e217c4f"> <div class="comment-author">Carlos Schults</div> <div class="comment-content"> <p> Unfortunately, tooling is indeed one of the main Achilles heels of mutation testing, at least when it comes to .NET. </p> <p> In the Java world, they have <a href="https://pitest.org/">PIT</a>, which is considered state of the art. For C#, I have tried a few tools, with no success. The most promising solution I've found so far, for C#, is <a href="https://stryker-mutator.io/stryker-net/">Stryker.net</a>, which is a port of the Stryker mutation, designed originally for JavaScript. The C# version is still in its early phases but it's already usable and it looks very promising. </p> </div> <div class="comment-date">2019-12-14 16:16 UTC</div> </div> <div class="comment" id="32db969bac2d466a9be05111cc505f9d"> <div class="comment-author">Tyson Williams</div> <div class="comment-content"> <p> Is mutation testing the automated version of what Mark has called the <a href="/2019/10/07/devils-advocate">Devil's Advocate technique</a>? 
</p> </div> <div class="comment-date">2019-12-15 02:26 UTC</div> </div> <div class="comment" id="c889aa4c4ee04df38e8954afc30e6a6f"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tyson, I actually <a href="/2019/10/07/devils-advocate#26be7b38248c4dcba5134eb4529d8214">discuss the relationship with mutation testing</a> in that article. </p> </div> <div class="comment-date">2019-12-15 9:28 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Refactoring registration flow to functional architecture https://blog.ploeh.dk/2019/12/02/refactoring-registration-flow-to-functional-architecture 2019-12-02T08:19:00+00:00 Mark Seemann <div id="post"> <p> <em>An example showing a refactoring from F# partial application 'dependency injection' to an impure/pure/impure sandwich.</em> </p> <p> In <a href="/2017/02/02/dependency-rejection#36c724b49f614104842c47909cd9c916">a comment</a> to <a href="/2017/02/02/dependency-rejection">Dependency rejection</a>, I wrote: <blockquote> "I'd welcome a simplified, but still concrete example where the impure/pure/impure sandwich described here isn't going to be possible." </blockquote> <a href="https://www.relativisticramblings.com">Christer van der Meeren</a> kindly <a href="/2017/02/02/dependency-rejection#ade3787e6e3c4e569854e2c2bd038e29">replied with a suggestion.</a> </p> <p> The code in question relates to validation of user accounts. You can read the complete description in the linked comment, but I'll try to summarise it here. I'll then show a refactoring to a <a href="/2018/11/19/functional-architecture-a-definition">functional architecture</a> - specifically, to an impure/pure/impure sandwich. </p> <p> The code is <a href="https://github.com/ploeh/RegistrationFlow">available on GitHub</a>. 
</p> <h3 id="53c0b4111cf640e0b6fd13066e24f3bd"> Registration flow <a href="#53c0b4111cf640e0b6fd13066e24f3bd" title="permalink">#</a> </h3> <p> The system in question uses two-factor authentication with mobile phones. When you sign up for the service, you give your phone number. You then receive an SMS, and must use whatever is in that SMS to prove ownership of the phone number. Christer van der Meeren illustrates the flow like this: </p> <p> <img src="/content/binary/complete-registration-workflow-with-2fa-difficult-to-sandwich.png" alt="A flowchart describing the workflow for completing a registration."> </p> <p> He also supplies sample code: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;completeRegistrationWorkflow &nbsp;&nbsp;&nbsp;&nbsp;(createProof:&nbsp;Mobile&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Async&lt;ProofId&gt;) &nbsp;&nbsp;&nbsp;&nbsp;(verifyProof:&nbsp;Mobile&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;ProofId&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Async&lt;bool&gt;) &nbsp;&nbsp;&nbsp;&nbsp;(completeRegistration:&nbsp;Registration&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Async&lt;unit&gt;) &nbsp;&nbsp;&nbsp;&nbsp;(proofId:&nbsp;ProofId&nbsp;option) &nbsp;&nbsp;&nbsp;&nbsp;(registration:&nbsp;Registration) &nbsp;&nbsp;&nbsp;&nbsp;:&nbsp;Async&lt;CompleteRegistrationResult&gt;&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;proofId&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;None&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;proofId&nbsp;=&nbsp;createProof&nbsp;registration.Mobile &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;ProofRequired&nbsp;proofId 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;proofId&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;isValid&nbsp;=&nbsp;verifyProof&nbsp;registration.Mobile&nbsp;proofId &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;isValid&nbsp;<span style="color:blue;">then</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do!</span>&nbsp;completeRegistration&nbsp;registration &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;RegistrationCompleted &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;proofId&nbsp;=&nbsp;createProof&nbsp;registration.Mobile &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;ProofRequired&nbsp;proofId &nbsp;&nbsp;&nbsp;&nbsp;}</pre> </p> <p> While this is <a href="https://fsharp.org">F#</a>, it's not functional, since it uses <a href="/2017/01/30/partial-application-is-dependency-injection">partial application for dependency injection</a>. From the description, I find it safe to assume that we can consider <a href="/2016/04/11/async-as-surrogate-io">Async as a surrogate for IO</a>. </p> <p> The code implies the existence of other types. 
I decided to define them like this: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;Mobile&nbsp;=&nbsp;Mobile&nbsp;<span style="color:blue;">of</span>&nbsp;int <span style="color:blue;">type</span>&nbsp;ProofId&nbsp;=&nbsp;ProofId&nbsp;<span style="color:blue;">of</span>&nbsp;Guid <span style="color:blue;">type</span>&nbsp;Registration&nbsp;=&nbsp;{&nbsp;Mobile&nbsp;:&nbsp;Mobile&nbsp;} <span style="color:blue;">type</span>&nbsp;CompleteRegistrationResult&nbsp;=&nbsp;ProofRequired&nbsp;<span style="color:blue;">of</span>&nbsp;ProofId&nbsp;|&nbsp;RegistrationCompleted</pre> </p> <p> In reality, they're probably more complicated, but this is enough to make the code compile. </p> <p> Is it possible to refactor <code>completeRegistrationWorkflow</code> to an impure/pure/impure sandwich? </p> <h3 id="3266f23516dc4d9e98f3a8c87d072f89"> Applicability <a href="#3266f23516dc4d9e98f3a8c87d072f89" title="permalink">#</a> </h3> <p> It <em>is</em> possible to refactor <code>completeRegistrationWorkflow</code> to an impure/pure/impure sandwich. You'll see how to do that soon. Before we start that work, however, I'd like to warn against jumping to conclusions. It's possible that the problem statement doesn't capture some subtleties that one would have to deal with in the real world. It's also possible that I've misunderstood the essence of Christer van der Meeren's problem description. </p> <p> It's (relatively) easy to teach the basics of programming. You teach a beginner about keywords, programming constructs, how to compile or interpret a program, and so on. </p> <p> On the other hand, it's hard to write about dealing with complicated code. There are ways to make legacy code better, but the moves you have to make depend on myriad details. Complicated code is, by definition, something that's hard to learn. This means that truly complicated legacy code is rarely suitable for instructive examples. 
One has to strike a delicate balance and produce an example that looks complicated enough to warrant improvement, but on the other hand still be simple enough to be understood. </p> <p> I think that Christer van der Meeren has struck that balance. With three dependencies, the sample code looks just complicated enough to warrant refactoring. On the other hand, you can understand what it's supposed to do in a few minutes. There's a risk, however, that the example is <em>too</em> simplified. That could weaken the result of the refactoring that follows. Could you still apply that refactoring if the problem was more complicated? </p> <p> It's my experience that it's conspicuously often possible to implement an impure/pure/impure sandwich. </p> <h3 id="fedd0146b0a84ab3b768f3adcf4f684f"> Fakes <a href="#fedd0146b0a84ab3b768f3adcf4f684f" title="permalink">#</a> </h3> <p> In the rest of this article, I want to show how to refactor <code>completeRegistrationWorkflow</code> to an impure/pure/impure sandwich. As <a href="http://amzn.to/YPdQDf">Refactoring</a> admonishes: <blockquote> <p> "to refactor, the essential precondition is [...] solid tests" </p> <footer><cite>Martin Fowler</cite></footer> </blockquote> Right now, however, there's no tests, so I'm going to add some. </p> <p> The tests will need some <a href="https://en.wikipedia.org/wiki/Test_double">Test Doubles</a> to stand in for the three dependency functions. If possible, <a href="/2019/03/25/an-example-of-state-based-testing-in-f">I prefer state-based testing</a> over <a href="/2019/02/25/an-example-of-interaction-based-testing-in-c">interaction-based testing</a>. First, then, we need some Fakes. 
</p> <p> While <code>completeRegistrationWorkflow</code> takes three dependency functions, it looks as though there's only two architectural dependencies: <ul> <li>A two-factor authentication service</li> <li>A registration database (or service)</li> </ul> Defining a Fake two-factor authentication object is the most complicated of the two, but still manageable: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;Fake2FA&nbsp;()&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;<span style="color:blue;">mutable</span>&nbsp;proofs&nbsp;=&nbsp;Map.empty &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">member</span>&nbsp;_.CreateProof&nbsp;mobile&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;Map.tryFind&nbsp;mobile&nbsp;proofs&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;(proofId,&nbsp;_)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;proofId &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;None&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;proofId&nbsp;=&nbsp;ProofId&nbsp;(Guid.NewGuid&nbsp;()) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;proofs&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;Map.add&nbsp;mobile&nbsp;(proofId,&nbsp;<span style="color:blue;">false</span>)&nbsp;proofs &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;proofId &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;<span style="color:blue;">fun</span>&nbsp;proofId&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;async&nbsp;{&nbsp;<span style="color:blue;">return</span>&nbsp;proofId&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">member</span>&nbsp;_.VerifyProof&nbsp;mobile&nbsp;proofId&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">match</span>&nbsp;Map.tryFind&nbsp;mobile&nbsp;proofs&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;(_,&nbsp;<span style="color:blue;">true</span>)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">true</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">false</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;<span style="color:blue;">fun</span>&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;async&nbsp;{&nbsp;<span style="color:blue;">return</span>&nbsp;b&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">member</span>&nbsp;_.VerifyMobile&nbsp;mobile&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;Map.tryFind&nbsp;mobile&nbsp;proofs&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;(proofId,&nbsp;_)&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;proofs&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;Map.add&nbsp;mobile&nbsp;(proofId,&nbsp;<span style="color:blue;">true</span>)&nbsp;proofs &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;()</pre> </p> <p> In F#, I find that the easiest way to model a mutable resource is to use an object. This one just keeps track of a collection of proofs. The <code>CreateProof</code> method fits the function signature of <code>completeRegistrationWorkflow</code>'s <code>createProof</code> function argument. It looks for an existing proof for the mobile number so that it can reuse the same proof multiple times. If there's no proof for <code>mobile</code>, it creates a new <code>Guid</code> and returns it after having first added it to the collection. 
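</p> <p> That reuse is easy to observe in isolation. The following is only a sketch: <code>Fake2FASketch</code> is a hypothetical, rearranged copy of just the <code>CreateProof</code> part of the Fake, with type annotations added and stand-in type definitions included so that it can run on its own, for example in F# Interactive. The mobile numbers are arbitrary. </p> <p> <pre>open System

// Stand-in definitions so that the sketch is self-contained.
type Mobile = Mobile of int
type ProofId = ProofId of Guid

// Hypothetical, condensed copy of the Fake: only CreateProof is included.
type Fake2FASketch () =
    let mutable proofs : Map&lt;Mobile, ProofId * bool&gt; = Map.empty
    member _.CreateProof (mobile : Mobile) : Async&lt;ProofId&gt; =
        let proofId =
            match Map.tryFind mobile proofs with
            | Some (existing, _) -&gt; existing   // reuse the stored proof
            | None -&gt;
                let created = ProofId (Guid.NewGuid ())
                proofs &lt;- Map.add mobile (created, false) proofs
                created
        async { return proofId }

let twoFA = Fake2FASketch ()
let first = twoFA.CreateProof (Mobile 123) |&gt; Async.RunSynchronously
let second = twoFA.CreateProof (Mobile 123) |&gt; Async.RunSynchronously
let other = twoFA.CreateProof (Mobile 456) |&gt; Async.RunSynchronously

printfn "%b" (first = second)    // same number, same proof
printfn "%b" (first &lt;&gt; other)    // different number, different proof</pre>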
</p> <p> Likewise, the <code>VerifyProof</code> method fits the type of the <code>verifyProof</code> function argument. Proofs are actually tuples of an ID and a flag that keeps track of whether or not the proof has been verified. The method returns the flag if it's there, and <code>false</code> otherwise. </p> <p> The third method, <code>VerifyMobile</code>, is test-specific functionality that enables a test to mark a proof as having been verified via two-factor authentication. </p> <p> Compared to <code>Fake2FA</code>, the Fake registration database is simple: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;FakeRegistrationDB&nbsp;()&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">inherit</span>&nbsp;Collection&lt;Registration&gt;&nbsp;()
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">member</span>&nbsp;this.CompleteRegistration&nbsp;r&nbsp;=&nbsp;async&nbsp;{&nbsp;this.Add&nbsp;r&nbsp;}</pre> </p> <p> Again, the <code>CompleteRegistration</code> method fits the <code>completeRegistration</code> function argument to <code>completeRegistrationWorkflow</code>. It just makes the inherited <code>Add</code> method <code>Async</code>. </p> <h3 id="8dbbe25d331f4517b3fe8ace6e95ffa9"> Fixture creation <a href="#8dbbe25d331f4517b3fe8ace6e95ffa9" title="permalink">#</a> </h3> <p> My plan is to add <a href="https://en.wikipedia.org/wiki/Characterization_test">Characterisation Tests</a> so that I can refactor. I do, however, plan to change the API of the System Under Test (SUT). This could break the tests, which would defeat their purpose. To protect against this, I'll test against a <a href="https://en.wikipedia.org/wiki/Facade_pattern">Facade</a>. Initially, this Facade will be equivalent to the <code>completeRegistrationWorkflow</code> function, but this will change as I refactor. </p> <p> In addition to the SUT Facade, the tests will also need access to the 'injected' dependencies. 
You can address this by creating a <a href="/2009/03/16/FixtureObject">Fixture Object</a>: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;createFixture&nbsp;()&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;twoFA&nbsp;=&nbsp;Fake2FA&nbsp;()
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;db&nbsp;=&nbsp;FakeRegistrationDB&nbsp;()
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;sut&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;completeRegistrationWorkflow
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;twoFA.CreateProof
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;twoFA.VerifyProof
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;db.CompleteRegistration
&nbsp;&nbsp;&nbsp;&nbsp;sut,&nbsp;twoFA,&nbsp;db</pre> </p> <p> This function returns a triple of values: the SUT Facade and the two Fakes. </p> <p> The SUT Facade is a partially applied function of the type <code>ProofId option -&gt; Registration -&gt; Async&lt;CompleteRegistrationResult&gt;</code>. In other words, it abstracts away the specifics about how impure actions are executed. It seems reasonable to imagine that the two remaining input arguments, <code>ProofId option</code> and <code>Registration</code>, are run-time values. Regardless of refactoring, the resulting function should be able to receive those arguments and produce the desired outcome. </p> <h3 id="a6dbde952b53422992ae006bdc305053"> Characterising the missing proof ID case <a href="#a6dbde952b53422992ae006bdc305053" title="permalink">#</a> </h3> <p> It looks like the <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">cyclomatic complexity</a> of <code>completeRegistrationWorkflow</code> is <em>3</em>, so you're going to need three Characterisation Tests. 
You can add them in any order you like, but in this case I found it natural to follow the order in which the branches are laid out in the SUT. </p> <p> This test case verifies what happens if the proof ID is missing: </p> <p> <pre>[&lt;Theory&gt;]
[&lt;InlineData&nbsp;123&gt;]
[&lt;InlineData&nbsp;432&gt;]
<span style="color:blue;">let</span>&nbsp;``Missing&nbsp;proof&nbsp;ID``&nbsp;mobile&nbsp;=&nbsp;async&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;sut,&nbsp;twoFA,&nbsp;db&nbsp;=&nbsp;createFixture&nbsp;()
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;r&nbsp;=&nbsp;{&nbsp;Mobile&nbsp;=&nbsp;Mobile&nbsp;mobile&nbsp;}
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;actual&nbsp;=&nbsp;sut&nbsp;None&nbsp;r
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;expectedProofId&nbsp;=&nbsp;twoFA.CreateProof&nbsp;r.Mobile
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;expected&nbsp;=&nbsp;ProofRequired&nbsp;expectedProofId
&nbsp;&nbsp;&nbsp;&nbsp;expected&nbsp;=!&nbsp;actual
&nbsp;&nbsp;&nbsp;&nbsp;test&nbsp;&lt;@&nbsp;Seq.isEmpty&nbsp;db&nbsp;@&gt;&nbsp;}</pre> </p> <p> All the tests in this article use <a href="https://xunit.net">xUnit.net</a> 2.4.0 with <a href="https://github.com/SwensenSoftware/unquote">Unquote</a> 5.0.0. </p> <p> This test calls the <code>sut</code> Facade with a <code>None</code> proof ID and an arbitrary <code>Registration</code> <code>r</code>. Had I used a <a href="/property-based-testing-intro">property-based testing</a> framework such as <a href="https://fscheck.github.io/FsCheck">FsCheck</a> or <a href="https://github.com/hedgehogqa/fsharp-hedgehog">Hedgehog</a>, I could have made the <code>Registration</code> value itself an arbitrary test argument, but I thought that this was overkill for this situation. </p> <p> In order to figure out the <code>expectedProofId</code>, the test relies on the behaviour of the <code>Fake2FA</code> class. 
The <code>CreateProof</code> method is <a href="https://en.wikipedia.org/wiki/Idempotence">idempotent</a>, so calling it several times with the same number should return the same proof. In this test case, we expect the <code>sut</code> to have done so already, so calling the method once more from the test should return the same value that the SUT received. The test then wraps the proof ID in the <code>ProofRequired</code> case and uses Unquote's <code>=!</code> (<em>must equal</em>) operator to verify that <code>expected</code> is equal to <code>actual</code>. </p> <p> Finally, the test also verifies that the registration database remains empty. </p> <p> Since this is a Characterisation Test it already passes, <a href="/2013/04/02/why-trust-tests">which makes it untrustworthy</a>. How do I know that I didn't write a <a href="/2019/10/14/tautological-assertion">Tautological Assertion</a>? </p> <p> When I write Characterisation Tests, I always try to change the SUT to verify that the test fails for the appropriate reason. In order to fail the first assertion, I can make this change to the <code>None</code> branch of the SUT: </p> <p> <pre><span style="color:blue;">match</span>&nbsp;proofId&nbsp;<span style="color:blue;">with</span>
|&nbsp;None&nbsp;<span style="color:blue;">-&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//let!&nbsp;proofId&nbsp;=&nbsp;createProof&nbsp;registration.Mobile</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;proofId&nbsp;=&nbsp;ProofId&nbsp;(Guid.NewGuid&nbsp;())
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;ProofRequired&nbsp;proofId</pre> </p> <p> This fails the <code>expected =! actual</code> assertion, as expected. 
</p> <p> Likewise, you can fail the second assertion with this change: </p> <p> <pre><span style="color:blue;">match</span>&nbsp;proofId&nbsp;<span style="color:blue;">with</span> |&nbsp;None&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do!</span>&nbsp;completeRegistration&nbsp;registration &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;proofId&nbsp;=&nbsp;createProof&nbsp;registration.Mobile &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;ProofRequired&nbsp;proofId</pre> </p> <p> The addition of the <code>completeRegistration</code> statement causes the <code>test &lt;@ Seq.isEmpty db @&gt;</code> assertion to fail, again as expected. </p> <p> Now I trust that test. </p> <h3 id="a9dceb6d72af4d06bc46bae83464b201"> Characterising the valid proof ID case <a href="#a9dceb6d72af4d06bc46bae83464b201" title="permalink">#</a> </h3> <p> Next, you have the case where all is good. The proof ID is present and valid. 
You can characterise the behaviour with this test: </p> <p> <pre>[&lt;Theory&gt;]
[&lt;InlineData&nbsp;987&gt;]
[&lt;InlineData&nbsp;247&gt;]
<span style="color:blue;">let</span>&nbsp;``Valid&nbsp;proof&nbsp;ID``&nbsp;mobile&nbsp;=&nbsp;async&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;sut,&nbsp;twoFA,&nbsp;db&nbsp;=&nbsp;createFixture&nbsp;()
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;r&nbsp;=&nbsp;{&nbsp;Mobile&nbsp;=&nbsp;Mobile&nbsp;mobile&nbsp;}
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;p&nbsp;=&nbsp;twoFA.CreateProof&nbsp;r.Mobile
&nbsp;&nbsp;&nbsp;&nbsp;twoFA.VerifyMobile&nbsp;r.Mobile
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;actual&nbsp;=&nbsp;sut&nbsp;(Some&nbsp;p)&nbsp;r
&nbsp;&nbsp;&nbsp;&nbsp;RegistrationCompleted&nbsp;=!&nbsp;actual
&nbsp;&nbsp;&nbsp;&nbsp;test&nbsp;&lt;@&nbsp;Seq.contains&nbsp;r&nbsp;db&nbsp;@&gt;&nbsp;}</pre> </p> <p> This test uses <code>CreateProof</code> to create a proof before the <code>sut</code> is exercised. It also uses the test-specific <code>VerifyMobile</code> method to mark the mobile number (and thereby the proof) as valid. </p> <p> Again, there are two assertions: one against the return value <code>actual</code>, and one that verifies that the registration database <code>db</code> now contains the registration <code>r</code>. 
</p> <p> As before, you can't trust a Characterisation Test before you've seen it fail, so first edit the <code>isValid</code> branch of the SUT like this: </p> <p> <pre><span style="color:blue;">if</span>&nbsp;isValid&nbsp;<span style="color:blue;">then</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do!</span>&nbsp;completeRegistration&nbsp;registration &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//return&nbsp;RegistrationCompleted</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;ProofRequired&nbsp;proofId</pre> </p> <p> This fails the <code>RegistrationCompleted =! actual</code> assertion, as expected. </p> <p> Now make this change: </p> <p> <pre><span style="color:blue;">if</span>&nbsp;isValid&nbsp;<span style="color:blue;">then</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//do!&nbsp;completeRegistration&nbsp;registration</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;RegistrationCompleted</pre> </p> <p> Now the <code>test &lt;@ Seq.contains r db @&gt;</code> assertion fails, as expected. </p> <p> This test also seems trustworthy. 
</p> <h3 id="a4f44c3575914d628931c88095df477e"> Characterising the invalid proof ID case <a href="#a4f44c3575914d628931c88095df477e" title="permalink">#</a> </h3> <p> The final test case is when a proof ID exists, but it's invalid: </p> <p> <pre>[&lt;Theory&gt;]
[&lt;InlineData&nbsp;327&gt;]
[&lt;InlineData&nbsp;666&gt;]
<span style="color:blue;">let</span>&nbsp;``Invalid&nbsp;proof&nbsp;ID``&nbsp;mobile&nbsp;=&nbsp;async&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;sut,&nbsp;twoFA,&nbsp;db&nbsp;=&nbsp;createFixture&nbsp;()
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;r&nbsp;=&nbsp;{&nbsp;Mobile&nbsp;=&nbsp;Mobile&nbsp;mobile&nbsp;}
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;p&nbsp;=&nbsp;twoFA.CreateProof&nbsp;r.Mobile
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;actual&nbsp;=&nbsp;sut&nbsp;(Some&nbsp;p)&nbsp;r
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;expectedProofId&nbsp;=&nbsp;twoFA.CreateProof&nbsp;r.Mobile
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;expected&nbsp;=&nbsp;ProofRequired&nbsp;expectedProofId
&nbsp;&nbsp;&nbsp;&nbsp;expected&nbsp;=!&nbsp;actual
&nbsp;&nbsp;&nbsp;&nbsp;test&nbsp;&lt;@&nbsp;Seq.isEmpty&nbsp;db&nbsp;@&gt;&nbsp;}</pre> </p> <p> The <a href="/2013/06/24/a-heuristic-for-formatting-code-according-to-the-aaa-pattern">arrange phase</a> of the test is comparable to the previous test case. The only difference is that the new test <em>doesn't</em> invoke <code>twoFA.VerifyMobile r.Mobile</code>. This leaves the generated proof ID <code>p</code> invalid. </p> <p> The assertions, on the other hand, are identical to those of the <code>Missing proof ID</code> test case, which means that you can make the same edits to the <code>else</code> branch as you can to the <code>None</code> branch, as described above. If you do that, the assertions fail as they're supposed to. 
You can also trust this Characterisation Test. </p> <h3 id="3f733ce502814d458395b3561c63b897"> Eta expansion <a href="#3f733ce502814d458395b3561c63b897" title="permalink">#</a> </h3> <p> While I want to keep the SUT Facade's type unchanged, I do want to change the way I compose it. The goal is an impure/pure/impure sandwich: Do something impure first, then call a pure function with the data obtained, and finally do something impure with the output of the pure function. </p> <p> This means that the composition is going to manipulate the input values to the SUT Facade. To make that easier, I perform an <em>eta expansion</em> on the <code>sut</code>: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;createFixture&nbsp;()&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;twoFA&nbsp;=&nbsp;Fake2FA&nbsp;()
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;db&nbsp;=&nbsp;FakeRegistrationDB&nbsp;()
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;sut&nbsp;pid&nbsp;r&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;completeRegistrationWorkflow
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;twoFA.CreateProof
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;twoFA.VerifyProof
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;db.CompleteRegistration
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;pid
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;r
&nbsp;&nbsp;&nbsp;&nbsp;sut,&nbsp;twoFA,&nbsp;db</pre> </p> <p> This doesn't change the behaviour or how the SUT is composed. It only makes the <code>pid</code> and <code>r</code> arguments explicitly visible. 
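</p> <p> If the term is unfamiliar, a smaller example may help. This sketch has nothing to do with the registration code; <code>add</code>, <code>increment</code>, and <code>incrementExpanded</code> are names invented for the illustration. </p> <p> <pre>// A curried function of two arguments.
let add x y = x + y

// Partially applied: the remaining argument is implicit.
let increment = add 1

// Eta-expanded: the same function, but the remaining argument now has a name.
let incrementExpanded y = add 1 y

printfn "%i" (increment 41)
printfn "%i" (incrementExpanded 41)</pre> </p> <p> Both functions have the type <code>int -&gt; int</code> and return the same result for the same input. The expanded form only gives the argument a name, which is what makes it possible to do something with it before passing it on.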
</p> <h3 id="9afbddfce5e14b3c98435a9d2e3f6848"> Move proof verification <a href="#9afbddfce5e14b3c98435a9d2e3f6848" title="permalink">#</a> </h3> <p> When you consider the current implementation of <code>completeRegistrationWorkflow</code>, it seems that the impure actions are interleaved with the decision-making code. How to separate them? </p> <p> The first opportunity that I identified was that it always calls <code>verifyProof</code> in the <code>Some</code> case. Whenever you want to call a method only in the <code>Some</code> case, but not in the <code>None</code> case, it suggests <code>Option.map</code>. </p> <p> It should be possible to run <code>Option.map (twoFA.VerifyProof r.Mobile) pid</code> as the initial impure action of the impure/pure/impure sandwich. If that's possible, we could pass the output of that impure action as an argument to <code>completeRegistrationWorkflow</code>. That would already make it simpler: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;completeRegistrationWorkflow &nbsp;&nbsp;&nbsp;&nbsp;(createProof:&nbsp;Mobile&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Async&lt;ProofId&gt;) &nbsp;&nbsp;&nbsp;&nbsp;(completeRegistration:&nbsp;Registration&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Async&lt;unit&gt;) &nbsp;&nbsp;&nbsp;&nbsp;(proof:&nbsp;bool&nbsp;option) &nbsp;&nbsp;&nbsp;&nbsp;(registration:&nbsp;Registration) &nbsp;&nbsp;&nbsp;&nbsp;:&nbsp;Async&lt;CompleteRegistrationResult&gt;&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;proof&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;None&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;proofId&nbsp;=&nbsp;createProof&nbsp;registration.Mobile 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;ProofRequired&nbsp;proofId &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;isValid&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;isValid&nbsp;<span style="color:blue;">then</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do!</span>&nbsp;completeRegistration&nbsp;registration &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;RegistrationCompleted &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;proofId&nbsp;=&nbsp;createProof&nbsp;registration.Mobile &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;ProofRequired&nbsp;proofId &nbsp;&nbsp;&nbsp;&nbsp;}</pre> </p> <p> Notice that by changing the <code>proof</code> argument to a <code>bool option</code>, you no longer need to call <code>verifyProof</code>, so you can remove it. </p> <p> There's just one problem. The result of <code>Option.map (twoFA.VerifyProof r.Mobile) pid</code> is an <code>Option&lt;Async&lt;bool&gt;&gt;</code>, but you need an <code>Option&lt;bool&gt;</code>. </p> <p> You can compose the SUT Facade in an asynchronous workflow, and use a <code>let!</code> binding, but that's not going to solve the problem. A <code>let!</code> binding only works when the outer container is <code>Async</code>. Here, the outermost container is <code>Option</code>. 
You're going to need to flip the containers around so that you get an <code>Async&lt;Option&lt;bool&gt;&gt;</code> that you can <code>let!</code>-bind: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;sut&nbsp;pid&nbsp;r&nbsp;=&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;p&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;Option.map&nbsp;(twoFA.VerifyProof&nbsp;r.Mobile)&nbsp;pid&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;b&#39;&nbsp;=&nbsp;b &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Some&nbsp;b&#39;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;None&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;async&nbsp;{&nbsp;<span style="color:blue;">return</span>&nbsp;None&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return!</span>&nbsp;completeRegistrationWorkflow &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;twoFA.CreateProof &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;db.CompleteRegistration &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;p &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;r &nbsp;&nbsp;&nbsp;&nbsp;}</pre> </p> <p> By pattern-matching on <code>Option.map (twoFA.VerifyProof r.Mobile) pid</code>, you can return one of two alternative asynchronous workflows. </p> <p> Due to the <code>let!</code> binding, <code>p</code> is a <code>bool option</code> that you can pass to <code>completeRegistrationWorkflow</code>. </p> <h3 id="a271fa42747a4271a5420951763d3559"> Traversal <a href="#a271fa42747a4271a5420951763d3559" title="permalink">#</a> </h3> <p> I know what you're going to say. 
You'll protest that I just moved complex behaviour out of <code>completeRegistrationWorkflow</code>. The implied assumption here is that <code>completeRegistrationWorkflow</code> is the top-level behaviour that you'd compose in a <a href="/2011/07/28/CompositionRoot">Composition Root</a>. The <code>createFixture</code> function plays that role in this refactoring exercise. </p> <p> You'd normally view the Composition Root as a <a href="http://xunitpatterns.com/Humble%20Object.html">Humble Object</a> - an object that we accept isn't covered by tests because it has a cyclomatic complexity of one. This is no longer the case. </p> <p> The conversion of <code>Option&lt;Async&lt;bool&gt;&gt;</code> to <code>Async&lt;Option&lt;bool&gt;&gt;</code> is, however, a well-known operation. In <a href="https://www.haskell.org">Haskell</a> this is known as a <em>traversal</em>, and it's a completely generic operation: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;a&nbsp;-&gt;&nbsp;Async&lt;&#39;b&gt;)&nbsp;-&gt;&nbsp;&#39;a&nbsp;option&nbsp;-&gt;&nbsp;Async&lt;&#39;b&nbsp;option&gt;</span> <span style="color:blue;">let</span>&nbsp;traverse&nbsp;f&nbsp;=&nbsp;<span style="color:blue;">function</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;x&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;x&#39;&nbsp;=&nbsp;f&nbsp;x &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Some&nbsp;x&#39;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;None&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;async&nbsp;{&nbsp;<span style="color:blue;">return</span>&nbsp;None&nbsp;}</pre> </p> <p> You can put this function in a general-purpose module called <code>AsyncOption</code> and cover it by unit tests if you will. You can even put this module in a separate library; it's perfectly decoupled from the specifics of the registration flow domain. 
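</p> <p> As a hedged sketch of what such unit tests might check, the following self-contained script repeats <code>traverse</code> inside an <code>AsyncOption</code> module and exercises both cases. The <code>isEven</code> function is a hypothetical stand-in for an asynchronous dependency; it's not part of the article's code base: </p> <p> <pre>module AsyncOption =
    // ('a -&gt; Async&lt;'b&gt;) -&gt; 'a option -&gt; Async&lt;'b option&gt;
    let traverse f = function
        | Some x -&gt; async {
            let! x' = f x
            return Some x' }
        | None -&gt; async { return None }

// Hypothetical asynchronous function, used only for illustration.
let isEven (i : int) = async { return i % 2 = 0 }

// The Some case runs the function and keeps the Some wrapper.
let someResult =
    AsyncOption.traverse isEven (Some 42) |&gt; Async.RunSynchronously

// The None case never calls the function.
let noneResult =
    AsyncOption.traverse isEven None |&gt; Async.RunSynchronously</pre> </p> <p> Here, <code>someResult</code> should be <code>Some true</code> and <code>noneResult</code> should be <code>None</code>. 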
</p> <p> If you do that, <code>completeRegistrationWorkflow</code> doesn't change, but the composition does: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;sut&nbsp;pid&nbsp;r&nbsp;=&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;p&nbsp;=&nbsp;AsyncOption.traverse&nbsp;(twoFA.VerifyProof&nbsp;r.Mobile)&nbsp;pid &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return!</span>&nbsp;completeRegistrationWorkflow &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;twoFA.CreateProof &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;db.CompleteRegistration &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;p &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;r &nbsp;&nbsp;&nbsp;&nbsp;}</pre> </p> <p> You're now back where you'd like to be: One impure action produces a value that you can pass to another function. There's no explicit branching in the code. The cyclomatic complexity remains one. </p> <h3 id="58e1461e9e304afd8c491f94150ebd35"> Change return type <a href="#58e1461e9e304afd8c491f94150ebd35" title="permalink">#</a> </h3> <p> That first refactoring takes care of one out of three impure dependencies. Next, you can get rid of <code>createProof</code>. This one seems to be more difficult to get rid of. It doesn't seem to be required only in the <code>Some</code> case, so a <code>map</code> or <code>traverse</code> can't work. In both cases, however, the result of calling <code>createProof</code> is handled in exactly the same way. </p> <p> Here's another common trick in functional programming: <a href="/2016/09/26/decoupling-decisions-from-effects">Decouple decisions from effects</a>. Return a value that indicates the decision that the function reaches, and then let the second impure action of the impure/pure/impure sandwich act on the decision. </p> <p> In this case, you can model your decision as a <code>Mobile option</code>. 
You might want to consider a more explicit type, in order to better communicate intent, but it's best to keep each refactoring step small: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;completeRegistrationWorkflow &nbsp;&nbsp;&nbsp;&nbsp;(completeRegistration:&nbsp;Registration&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Async&lt;unit&gt;) &nbsp;&nbsp;&nbsp;&nbsp;(proof:&nbsp;bool&nbsp;option) &nbsp;&nbsp;&nbsp;&nbsp;(registration:&nbsp;Registration) &nbsp;&nbsp;&nbsp;&nbsp;:&nbsp;Async&lt;Mobile&nbsp;option&gt;&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;proof&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;None&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">return</span>&nbsp;Some&nbsp;registration.Mobile &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;isValid&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;isValid&nbsp;<span style="color:blue;">then</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do!</span>&nbsp;completeRegistration&nbsp;registration &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;None &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Some&nbsp;registration.Mobile &nbsp;&nbsp;&nbsp;&nbsp;}</pre> </p> <p> Notice that the <code>createProof</code> dependency is no longer required. I've removed it from the argument list of <code>completeRegistrationWorkflow</code>. 
</p> <p> The composition now looks like this: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;createFixture&nbsp;()&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;twoFA&nbsp;=&nbsp;Fake2FA&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;db&nbsp;=&nbsp;FakeRegistrationDB&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;sut&nbsp;pid&nbsp;r&nbsp;=&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;p&nbsp;=&nbsp;AsyncOption.traverse&nbsp;(twoFA.VerifyProof&nbsp;r.Mobile)&nbsp;pid &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;res&nbsp;=&nbsp;completeRegistrationWorkflow&nbsp;db.CompleteRegistration&nbsp;p&nbsp;r &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;pidr&nbsp;=&nbsp;AsyncOption.traverse&nbsp;twoFA.CreateProof&nbsp;res &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;pidr &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Option.map&nbsp;ProofRequired &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Option.defaultValue&nbsp;RegistrationCompleted&nbsp;}</pre> </p> <p> Thanks to the <code>let!</code> binding, the result <code>res</code> is a <code>Mobile option</code>. You can now let the <code>twoFA.CreateProof</code> method <code>traverse</code> over <code>res</code>. This produces an <code>Async&lt;Option&lt;ProofId&gt;&gt;</code> that you can <code>let!</code>-bind to <code>pidr</code> - a <code>ProofId option</code>. </p> <p> You can use <code>Option.map</code> to wrap the <code>ProofId</code> value in a <code>ProofRequired</code> case, if it's there. This step of the final pipeline produces a <code>CompleteRegistrationResult option</code>. 
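</p> <p> To see the last two pipeline steps in isolation, here's a minimal sketch. The type definitions are simplified stand-ins (assuming, for illustration only, that a <code>ProofId</code> wraps an integer), and <code>toResult</code> is a hypothetical helper name: </p> <p> <pre>// Simplified stand-ins for the domain types (assumptions for illustration).
type ProofId = ProofId of int

type CompleteRegistrationResult =
    | ProofRequired of ProofId
    | RegistrationCompleted

// Hypothetical helper: the two final steps of the pipeline.
let toResult (pidr : ProofId option) =
    pidr
    |&gt; Option.map ProofRequired                    // CompleteRegistrationResult option
    |&gt; Option.defaultValue RegistrationCompleted   // CompleteRegistrationResult

let required = toResult (Some (ProofId 1))
let completed = toResult None</pre> </p> <p> With these definitions, <code>required</code> should be <code>ProofRequired (ProofId 1)</code>, while <code>completed</code> should be <code>RegistrationCompleted</code>. 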
</p> <p> Finally, you can use <code>Option.defaultValue</code> to fold the <code>option</code> into a <code>CompleteRegistrationResult</code>. The default value is <code>RegistrationCompleted</code>. This is the case value that'll be used if the <code>option</code> is <code>None</code>. </p> <p> Again, the composition has a cyclomatic complexity of one, and the type of the <code>sut</code> remains <code>ProofId option -&gt; Registration -&gt; Async&lt;CompleteRegistrationResult&gt;</code>. This is a true refactoring. The type of the SUT remains the same, and no behaviour changes. The tests still pass, even though I haven't had to edit them. </p> <h3 id="6550df202542434e85937da702901cd1"> Change return type to Result <a href="#6550df202542434e85937da702901cd1" title="permalink">#</a> </h3> <p> Consider the intent of <code>completeRegistrationWorkflow</code>. The purpose of the operation is to <em>complete</em> a registration workflow. The name is quite explicit. Thus, the <em>happy path</em> is when the proof ID is valid and the function can call <code>completeRegistration</code>. </p> <p> Usually, when you call a function that returns an <code>option</code>, the implied contract is that the <code>Some</code> case represents the happy path. That's not the case here. The <code>Some</code> case carries information about the error paths. This isn't <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a>. 
</p> <p> It'd be more appropriate to use a <code>Result</code> return value: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;completeRegistrationWorkflow &nbsp;&nbsp;&nbsp;&nbsp;(completeRegistration:&nbsp;Registration&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Async&lt;unit&gt;) &nbsp;&nbsp;&nbsp;&nbsp;(proof:&nbsp;bool&nbsp;option) &nbsp;&nbsp;&nbsp;&nbsp;(registration:&nbsp;Registration) &nbsp;&nbsp;&nbsp;&nbsp;:&nbsp;Async&lt;Result&lt;unit,&nbsp;Mobile&gt;&gt;&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;proof&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;None&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">return</span>&nbsp;Error&nbsp;registration.Mobile &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;isValid&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;isValid&nbsp;<span style="color:blue;">then</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do!</span>&nbsp;completeRegistration&nbsp;registration &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Ok&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Error&nbsp;registration.Mobile &nbsp;&nbsp;&nbsp;&nbsp;}</pre> </p> <p> This change is in itself small, but it does require some changes to the composition. 
Just as you had to add an <code>AsyncOption.traverse</code> function when the return type was an <code>option</code>, you'll now have to add similar functionality to <code>Result</code>. <em>Result</em> is also known as <a href="/2018/06/11/church-encoded-either">Either</a>. Not only <a href="/2019/01/07/either-bifunctor">is it a bifunctor</a>, you can also traverse both axes. Haskell calls this a <code>bitraversable</code> functor. </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;a&nbsp;-&gt;&nbsp;Async&lt;&#39;b&gt;)&nbsp;-&gt;&nbsp;(&#39;c&nbsp;-&gt;&nbsp;Async&lt;&#39;d&gt;)&nbsp;-&gt;&nbsp;Result&lt;&#39;a,&#39;c&gt;&nbsp;-&gt;&nbsp;Async&lt;Result&lt;&#39;b,&#39;d&gt;&gt;</span> <span style="color:blue;">let</span>&nbsp;traverseBoth&nbsp;f&nbsp;g&nbsp;=&nbsp;<span style="color:blue;">function</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Ok&nbsp;x&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;x&#39;&nbsp;=&nbsp;f&nbsp;x &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Ok&nbsp;x&#39;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Error&nbsp;e&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;e&#39;&nbsp;=&nbsp;g&nbsp;e &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Error&nbsp;e&#39;&nbsp;}</pre> </p> <p> Here I just decided to call the function <code>traverseBoth</code> and the module <code>AsyncResult</code>. </p> <p> You're also going to need the equivalent of <code>Option.defaultValue</code> for <code>Result</code>. Something that translates both dimensions of <code>Result</code> into the same type. 
That's the <a href="/2019/06/03/either-catamorphism">Either catamorphism</a>, so you could, for example, introduce another general-purpose function called <code>cata</code>: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;a&nbsp;-&gt;&nbsp;&#39;b)&nbsp;-&gt;&nbsp;(&#39;c&nbsp;-&gt;&nbsp;&#39;b)&nbsp;-&gt;&nbsp;Result&lt;&#39;a,&#39;c&gt;&nbsp;-&gt;&nbsp;&#39;b</span> <span style="color:blue;">let</span>&nbsp;cata&nbsp;f&nbsp;g&nbsp;=&nbsp;<span style="color:blue;">function</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Ok&nbsp;x&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;f&nbsp;x &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Error&nbsp;e&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;g&nbsp;e</pre> </p> <p> This is another entirely general-purpose function that you can put in a general-purpose module called <code>Result</code>, in a general-purpose library. You can also cover it by unit tests, if you like. </p> <p> These two general-purpose functions enable you to compose the workflow: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;createFixture&nbsp;()&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;twoFA&nbsp;=&nbsp;Fake2FA&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;db&nbsp;=&nbsp;FakeRegistrationDB&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;sut&nbsp;pid&nbsp;r&nbsp;=&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;p&nbsp;=&nbsp;AsyncOption.traverse&nbsp;(twoFA.VerifyProof&nbsp;r.Mobile)&nbsp;pid &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;res&nbsp;=&nbsp;completeRegistrationWorkflow&nbsp;db.CompleteRegistration&nbsp;p&nbsp;r &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;pidr&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AsyncResult.traverseBoth 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">fun</span>&nbsp;()&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;async&nbsp;{&nbsp;<span style="color:blue;">return</span>&nbsp;()&nbsp;}) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;twoFA.CreateProof &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;res &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;pidr &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Result.cata&nbsp;(<span style="color:blue;">fun</span>&nbsp;()&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;RegistrationCompleted)&nbsp;ProofRequired&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;sut,&nbsp;twoFA,&nbsp;db</pre> </p> <p> This looks more confused than previous iterations. From here, though, it'll get better again. The first two lines of code are the same as before, but now <code>res</code> is a <code>Result&lt;unit, Mobile&gt;</code>. You still need to let <code>twoFA.CreateProof</code> traverse the 'error path', but now you also need to take care of the happy path. </p> <p> In the <code>Ok</code> case you have a <code>unit</code> value (<code>()</code>), but <code>traverseBoth</code> expects its <code>f</code> and <code>g</code> functions to return <code>Async</code> values. I could have fixed that with a more specialised <code>traverseError</code> function, but we'll soon move on from here, so it's hardly worthwhile. </p> <p> In Haskell, you can 'elevate' a value simply with the <code>pure</code> function, but in F#, you need the more cumbersome <code>(fun () -&gt; async { return () })</code> to achieve the same effect. </p> <p> The traversal produces <code>pidr</code> (for <em>Proof ID Result</em>) - a <code>Result&lt;unit, ProofId&gt;</code> value. 
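</p> <p> For comparison, the more specialised <code>traverseError</code> function mentioned above might look like the following sketch. It's not part of the article's code base: the name comes from the text, but the implementation and the <code>describe</code> helper are assumptions made for illustration: </p> <p> <pre>module AsyncResult =
    // ('c -&gt; Async&lt;'d&gt;) -&gt; Result&lt;'a,'c&gt; -&gt; Async&lt;Result&lt;'a,'d&gt;&gt;
    let traverseError g = function
        | Ok x -&gt; async { return Ok x }
        | Error e -&gt; async {
            let! e' = g e
            return Error e' }

// Hypothetical asynchronous function standing in for an impure action.
let describe (i : int) = async { return sprintf "error %i" i }

// The Ok case passes through untouched.
let ok : Result&lt;unit, string&gt; =
    AsyncResult.traverseError describe (Ok ()) |&gt; Async.RunSynchronously

// Only the Error case runs the function.
let err : Result&lt;unit, string&gt; =
    AsyncResult.traverseError describe (Error 42) |&gt; Async.RunSynchronously</pre> </p> <p> With such a function, the <code>(fun () -&gt; async { return () })</code> argument would no longer be needed. 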
</p> <p> Finally, it uses <code>Result.cata</code> to turn both the <code>Ok</code> and <code>Error</code> dimensions into a single <code>CompleteRegistrationResult</code> that can be returned. </p> <h3 id="d1c0adc81bc241c7a7a2ea9042356f24"> Removing the last dependency <a href="#d1c0adc81bc241c7a7a2ea9042356f24" title="permalink">#</a> </h3> <p> There's still one dependency left: the <code>completeRegistration</code> function, but it's now trivial to remove. Instead of calling the dependency function from within <code>completeRegistrationWorkflow</code> you can use the same trick as before. Decouple the decision from the effect. </p> <p> Return information about the decision the function made. In the above incarnation of the code, the <code>Ok</code> dimension is currently empty, since it only returns <code>unit</code>. You can use that 'channel' to communicate that you decided to complete a registration: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;completeRegistrationWorkflow &nbsp;&nbsp;&nbsp;&nbsp;(proof:&nbsp;bool&nbsp;option) &nbsp;&nbsp;&nbsp;&nbsp;(registration:&nbsp;Registration) &nbsp;&nbsp;&nbsp;&nbsp;:&nbsp;Async&lt;Result&lt;Registration,&nbsp;Mobile&gt;&gt;&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;proof&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;None&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">return</span>&nbsp;Error&nbsp;registration.Mobile &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;isValid&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;isValid&nbsp;<span style="color:blue;">then</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">return</span>&nbsp;Ok&nbsp;registration &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Error&nbsp;registration.Mobile &nbsp;&nbsp;&nbsp;&nbsp;}</pre> </p> <p> This is another small change. When <code>isValid</code> is <code>true</code>, the function no longer calls <code>completeRegistration</code>. Instead, it returns <code>Ok registration</code>. This means that the return type is now <code>Async&lt;Result&lt;Registration, Mobile&gt;&gt;</code>. It also means that you can remove the <code>completeRegistration</code> function argument. </p> <p> In order to compose this variation, you need one new general-purpose function. Perhaps you find this barrage of general-purpose functions exhausting, but it's an artefact of a design philosophy of the F# language. The F# base library contains only a few general-purpose functions. Contrast this with <a href="https://en.wikipedia.org/wiki/Glasgow_Haskell_Compiler">GHC</a>'s <a href="http://hackage.haskell.org/package/base">base</a> library, which comes with all of these functions built in. </p> <p> The new function is like <code>Result.cata</code>, but over <code>Async&lt;Result&lt;_&gt;&gt;</code>. 
</p> <p> <pre><span style="color:green;">//&nbsp;(&#39;a&nbsp;-&gt;&nbsp;&#39;b)&nbsp;-&gt;&nbsp;(&#39;c&nbsp;-&gt;&nbsp;&#39;b)&nbsp;-&gt;&nbsp;Async&lt;Result&lt;&#39;a,&#39;c&gt;&gt;&nbsp;-&gt;&nbsp;Async&lt;&#39;b&gt;</span> <span style="color:blue;">let</span>&nbsp;cata&nbsp;f&nbsp;g&nbsp;r&nbsp;=&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;r&#39;&nbsp;=&nbsp;r &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Result.cata&nbsp;f&nbsp;g&nbsp;r&#39;&nbsp;}</pre> </p> <p> Since this function does conceptually the same as <code>Result.cata</code> I decided to retain the name <code>cata</code> and just put it in the <code>AsyncResult</code> module. (This may not be strictly correct, as I haven't really given a lot of thought to what a catamorphism for <code>Async</code> would look like, if one exists. I'm open to suggestions about better naming. After all, <code>cata</code> is hardly an idiomatic F# name.) </p> <p> With <code>AsyncResult.cata</code> you can now compose the system: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;sut&nbsp;pid&nbsp;r&nbsp;=&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;p&nbsp;=&nbsp;AsyncOption.traverse&nbsp;(twoFA.VerifyProof&nbsp;r.Mobile)&nbsp;pid &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;res&nbsp;=&nbsp;completeRegistrationWorkflow&nbsp;p&nbsp;r &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return!</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;res &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;AsyncResult.traverseBoth&nbsp;db.CompleteRegistration&nbsp;twoFA.CreateProof &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;AsyncResult.cata&nbsp;(<span style="color:blue;">fun</span>&nbsp;()&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;RegistrationCompleted)&nbsp;ProofRequired &nbsp;&nbsp;&nbsp;&nbsp;}</pre> </p> <p> Not only did the call to 
<code>completeRegistrationWorkflow</code> get even simpler, but you also now avoid the awkwardly named <code>pidr</code> value. Thanks to the <code>let!</code> binding, <code>res</code> has the type <code>Result&lt;Registration, Mobile&gt;</code>. </p> <p> Note that you can now let both impure actions (<code>db.CompleteRegistration</code> and <code>twoFA.CreateProof</code>) traverse the result. This step produces an <code>Async&lt;Result&lt;unit, ProofId&gt;&gt;</code> that's immediately piped to <code>AsyncResult.cata</code>. This reduces the two alternative dimensions of the <code>Result</code> to a single <code>Async&lt;CompleteRegistrationResult&gt;</code> value. </p> <p> The <code>completeRegistrationWorkflow</code> function now begs to be further simplified. </p> <h3 id="db6e044e55f749ba8794d7a8f74e02f4"> Pure registration workflow <a href="#db6e044e55f749ba8794d7a8f74e02f4" title="permalink">#</a> </h3> <p> <a href="/2019/02/11/asynchronous-injection">Once you remove all dependencies, your domain logic doesn't have to be asynchronous</a>. 
Nothing asynchronous happens in <code>completeRegistrationWorkflow</code>, so simplify it: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;completeRegistrationWorkflow &nbsp;&nbsp;&nbsp;&nbsp;(proof:&nbsp;bool&nbsp;option) &nbsp;&nbsp;&nbsp;&nbsp;(registration:&nbsp;Registration) &nbsp;&nbsp;&nbsp;&nbsp;:&nbsp;Result&lt;Registration,&nbsp;Mobile&gt;&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;proof&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;None&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Error&nbsp;registration.Mobile &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;isValid&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;isValid&nbsp;<span style="color:blue;">then</span>&nbsp;Ok&nbsp;registration &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span>&nbsp;Error&nbsp;registration.Mobile</pre> </p> <p> Gone is the <code>async</code> computation expression, including the <code>return</code> keyword. This is now a <a href="https://en.wikipedia.org/wiki/Pure_function">pure function</a>. 
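</p> <p> Because the function is pure, you can exercise it without any asynchronous machinery or Test Doubles. The following sketch repeats the function together with simplified stand-ins for the domain types (assuming, as the test data suggests, that a <code>Mobile</code> wraps an integer and that a <code>Registration</code> only needs its <code>Mobile</code> field): </p> <p> <pre>// Simplified stand-ins for the domain types (assumptions for illustration).
type Mobile = Mobile of int
type Registration = { Mobile : Mobile }

let completeRegistrationWorkflow
    (proof: bool option)
    (registration: Registration)
    : Result&lt;Registration, Mobile&gt; =
    match proof with
    | None -&gt; Error registration.Mobile
    | Some isValid -&gt;
        if isValid then Ok registration
        else Error registration.Mobile

let r = { Mobile = Mobile 327 }

// Plain function calls; no Async.RunSynchronously required.
let valid = completeRegistrationWorkflow (Some true) r
let invalid = completeRegistrationWorkflow (Some false) r
let missing = completeRegistrationWorkflow None r</pre> </p> <p> Here, <code>valid</code> should be <code>Ok r</code>, whereas both <code>invalid</code> and <code>missing</code> should be <code>Error (Mobile 327)</code>. 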
</p> <p> You'll have to adjust the composition once more, but it's only a minor change: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;sut&nbsp;pid&nbsp;r&nbsp;=&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;p&nbsp;=&nbsp;AsyncOption.traverse&nbsp;(twoFA.VerifyProof&nbsp;r.Mobile)&nbsp;pid &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return!</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;completeRegistrationWorkflow&nbsp;p&nbsp;r &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;AsyncResult.traverseBoth&nbsp;db.CompleteRegistration&nbsp;twoFA.CreateProof &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;AsyncResult.cata&nbsp;(<span style="color:blue;">fun</span>&nbsp;()&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;RegistrationCompleted)&nbsp;ProofRequired &nbsp;&nbsp;&nbsp;&nbsp;}</pre> </p> <p> The result of invoking <code>completeRegistrationWorkflow</code> is no longer an <code>Async</code> value, so there's no reason to <code>let!</code>-bind it. Instead, you can call it and immediately pipe its output to <code>AsyncResult.traverseBoth</code>. </p> <h3 id="9a4c3af0f30843c2816af97a08b2f99b"> DRY <a href="#9a4c3af0f30843c2816af97a08b2f99b" title="permalink">#</a> </h3> <p> Consider <code>completeRegistrationWorkflow</code>. Can you make it simpler? </p> <p> At this point it should be evident that two of the branches contain duplicate code. 
Applying the <a href="https://en.wikipedia.org/wiki/Don%27t_repeat_yourself">DRY principle</a> you can simplify it: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;completeRegistrationWorkflow &nbsp;&nbsp;&nbsp;&nbsp;(proof:&nbsp;bool&nbsp;option) &nbsp;&nbsp;&nbsp;&nbsp;(registration:&nbsp;Registration) &nbsp;&nbsp;&nbsp;&nbsp;:&nbsp;Result&lt;Registration,&nbsp;Mobile&gt;&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;proof&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;<span style="color:blue;">true</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Ok&nbsp;registration &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Error&nbsp;registration.Mobile</pre> </p> <p> I'm not too fond of this style of type annotation for simple functions like this, so I'd like to remove it: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;completeRegistrationWorkflow&nbsp;proof&nbsp;registration&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;proof&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;<span style="color:blue;">true</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Ok&nbsp;registration &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Error&nbsp;registration.Mobile</pre> </p> <p> These two steps are pure refactorings: they only reorganise the code that implements <code>completeRegistrationWorkflow</code>, so the composition doesn't change. </p> <h3 id="705c34900d6342f3a79356734a67b355"> Essential complexity <a href="#705c34900d6342f3a79356734a67b355" title="permalink">#</a> </h3> <p> While reading this article, you may have felt frustration gather. <em>This is cheating! You took out all of the complexity. Now there's nothing left!</em> You're likely to feel that I've moved a lot of behaviour into untestable code. I've done nothing of the sort. 
</p> <p> I'll remind you that while functions like <code>AsyncOption.traverse</code> and <code>AsyncResult.cata</code> do contain branching behaviour, they <em>can</em> be tested. In fact, <a href="/2015/05/07/functional-design-is-intrinsically-testable">since they're pure functions, they're intrinsically testable</a>. </p> <p> It's true that a <em>composition</em> of a pure function with its impure dependencies may not be (unit) testable, but that's also true for a Dependency Injection-based object graph composed in a Composition Root. </p> <p> Compositions of functions may look non-trivial, but to a degree, the type system will assist you. If your composition compiles, it's likely that you've composed the impure/pure/impure sandwich correctly. </p> <p> Did I take out all the complexity? I didn't. There's a bit left; the function now has a cyclomatic complexity of <em>two</em>. If you look at the original function, you'll see that <em>the duplication was there all along</em>. Once you remove all the accidental complexity, you uncover the essential complexity. This happens to me so often when I apply functional programming principles that <a href="/2019/07/01/yes-silver-bullet">I fancy that functional programming is a silver bullet</a>. </p> <h3 id="0086592a037947e397169271eeaad627"> Pipeline composition <a href="#0086592a037947e397169271eeaad627" title="permalink">#</a> </h3> <p> We're mostly done now. The problem now appears in all its simplicity, and you have an impure/pure/impure sandwich. </p> <p> You can still improve the code, though. </p> <p> If you consider the current composition, you may find that <code>p</code> isn't the best variable name. I admit that I struggled with naming that variable. <a href="/2016/10/25/when-variable-names-are-in-the-way">Sometimes, variable names are in the way</a> and the code might be clearer if you could elide them by composing a pipeline of functions. </p> <p> That's always worth an attempt. 
This time, ultimately I find that it doesn't improve things, but even an attempt can be illustrative. </p> <p> If you want to eliminate a named value, you can often do so by piping the output of the function that produced the variable directly to the next function. This does, however, require that the function argument is the right-most. Currently, that's not the case. <code>registration</code> is right-most, and <code>proof</code> is to the left. </p> <p> There's no compelling reason that the arguments should come in that order, so flip them: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;completeRegistrationWorkflow&nbsp;registration&nbsp;proof&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;proof&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;<span style="color:blue;">true</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Ok&nbsp;registration &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Error&nbsp;registration.Mobile</pre> </p> <p> This enables you to write the entire composition as a single pipeline: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;sut&nbsp;pid&nbsp;r&nbsp;=&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return!</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AsyncOption.traverse&nbsp;(twoFA.VerifyProof&nbsp;r.Mobile)&nbsp;pid &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Async.map&nbsp;(completeRegistrationWorkflow&nbsp;r) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Async.bind&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AsyncResult.traverseBoth&nbsp;db.CompleteRegistration&nbsp;twoFA.CreateProof &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&gt;&gt;&nbsp;AsyncResult.cata&nbsp;(<span style="color:blue;">fun</span>&nbsp;()&nbsp;<span 
style="color:blue;">-&gt;</span>&nbsp;RegistrationCompleted)&nbsp;ProofRequired) &nbsp;&nbsp;&nbsp;&nbsp;}</pre> </p> <p> This does, however, call for two new general-purpose functions: <code>Async.map</code> and <code>Async.bind</code>: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;a&nbsp;-&gt;&nbsp;&#39;b)&nbsp;-&gt;&nbsp;Async&lt;&#39;a&gt;&nbsp;-&gt;&nbsp;Async&lt;&#39;b&gt;</span>
<span style="color:blue;">let</span>&nbsp;map&nbsp;f&nbsp;x&nbsp;=&nbsp;async&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;x&#39;&nbsp;=&nbsp;x
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;f&nbsp;x&#39;&nbsp;}
<span style="color:green;">//&nbsp;(&#39;a&nbsp;-&gt;&nbsp;Async&lt;&#39;b&gt;)&nbsp;-&gt;&nbsp;Async&lt;&#39;a&gt;&nbsp;-&gt;&nbsp;Async&lt;&#39;b&gt;</span>
<span style="color:blue;">let</span>&nbsp;bind&nbsp;f&nbsp;x&nbsp;=&nbsp;async&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;x&#39;&nbsp;=&nbsp;x
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return!</span>&nbsp;f&nbsp;x&#39;&nbsp;}</pre> </p> <p> In my opinion, these functions ought to belong to F#'s <code>Async</code> module, but <a href="https://github.com/fsharp/fslang-suggestions/issues/318">for reasons that aren't clear to me, they don't</a>. As you can see, though, they're easy to add. </p> <p> While this change gets rid of the <code>p</code> variable, I don't think it makes the overall composition easier to understand. The action of swapping the function arguments does, however, enable another simplification. 
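</p> <p> As a minimal sketch of how the two helpers differ in use, the following places them in a module named <code>Async</code> and chains them in a pipeline. The values <code>fetchNumber</code> and <code>describe</code> are hypothetical and exist only for illustration; they aren't part of the registration example. </p> <p> <pre>module Async =
    // ('a -&gt; 'b) -&gt; Async&lt;'a&gt; -&gt; Async&lt;'b&gt;
    let map f x = async {
        let! x' = x
        return f x' }

    // ('a -&gt; Async&lt;'b&gt;) -&gt; Async&lt;'a&gt; -&gt; Async&lt;'b&gt;
    let bind f x = async {
        let! x' = x
        return! f x' }

// Hypothetical asynchronous source and continuation.
let fetchNumber = async { return 41 }
let describe n = async { return sprintf "The answer is %i" n }

let result =
    fetchNumber
    |&gt; Async.map (fun n -&gt; n + 1)  // a pure function is lifted with map
    |&gt; Async.bind describe          // an Async-returning function needs bind
    |&gt; Async.RunSynchronously

printfn "%s" result</pre> </p> <p> Using <code>map</code> with <code>describe</code> would instead produce a nested <code>Async&lt;Async&lt;string&gt;&gt;</code>, which is why the pipeline above needs <code>bind</code> for the last step.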
</p> <h3 id="db99963569414669a865d4d10ad95b6e"> Eta reduction <a href="#db99963569414669a865d4d10ad95b6e" title="permalink">#</a> </h3> <p> Now that <code>proof</code> is <code>completeRegistrationWorkflow</code>'s last function argument, you can perform an <em>eta reduction:</em> </p> <p> <pre><span style="color:blue;">let</span>&nbsp;completeRegistrationWorkflow&nbsp;registration&nbsp;=&nbsp;<span style="color:blue;">function</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;<span style="color:blue;">true</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Ok&nbsp;registration &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Error&nbsp;registration.Mobile</pre> </p> <p> Not everyone is a fan of the <a href="https://en.wikipedia.org/wiki/Tacit_programming">point-free style</a>, but I like it. YMMV. </p> <h3 id="798d1bb566224090a676d386afc54ea4"> Sandwich <a href="#798d1bb566224090a676d386afc54ea4" title="permalink">#</a> </h3> <p> Regardless of whether you prefer <code>completeRegistrationWorkflow</code> in point-free or pointed style, I think that the composition needs improvement. It should explicitly communicate that it's an impure/pure/impure sandwich. This makes it necessary to reintroduce some variables, so I'm also going to bite the bullet and devise some better names. 
</p> <p> <pre><span style="color:blue;">let</span>&nbsp;sut&nbsp;pid&nbsp;r&nbsp;=&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;validityOfProof&nbsp;=&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AsyncOption.traverse&nbsp;(twoFA.VerifyProof&nbsp;r.Mobile)&nbsp;pid &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;decision&nbsp;=&nbsp;completeRegistrationWorkflow&nbsp;r&nbsp;validityOfProof &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return!</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;decision &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;AsyncResult.traverseBoth&nbsp;db.CompleteRegistration&nbsp;twoFA.CreateProof &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;AsyncResult.cata&nbsp;(<span style="color:blue;">fun</span>&nbsp;()&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;RegistrationCompleted)&nbsp;ProofRequired &nbsp;&nbsp;&nbsp;&nbsp;}</pre> </p> <p> Instead of <code>p</code>, I decided to call the first value <code>validityOfProof</code>. This is the result of the first impure action in the sandwich (the upper slice of bread). </p> <p> While <code>validityOfProof</code> is the result of an impure action, the value itself is pure and can be used as input to <code>completeRegistrationWorkflow</code>. This is the pure part of the sandwich. I called the output <code>decision</code> because the workflow makes a decision based on its input, and it's up to the caller to act on that decision. </p> <p> Notice that <code>decision</code> is bound with a <code>let</code> binding (instead of a <code>let!</code> binding), despite taking place inside an <code>async</code> workflow. This is because <code>completeRegistrationWorkflow</code> is pure. It doesn't return an <code>Async</code> value. 
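</p> <p> The difference between the two kinds of binding can be shown in isolation. This is only a sketch with hypothetical stand-ins: <code>queryIsValid</code> plays the part of an impure query and <code>decide</code> the part of the pure workflow; neither is taken from the registration code. </p> <p> <pre>// Hypothetical impure query: returns Async&lt;bool&gt;.
let queryIsValid (proofId : int) = async { return proofId % 2 = 0 }

// Hypothetical pure decision: returns a plain string.
let decide isValid =
    if isValid then "complete registration" else "request proof"

let workflow proofId = async {
    // let! awaits the Async&lt;bool&gt; and binds the bool inside it.
    let! isValid = queryIsValid proofId
    // decide returns a plain value, so an ordinary let is enough.
    let decision = decide isValid
    return decision }

printfn "%s" (workflow 2 |&gt; Async.RunSynchronously)</pre> </p> <p> Swapping the two keywords doesn't compile as intended: <code>let!</code> on the result of <code>decide</code> is a type error, and a plain <code>let</code> on <code>queryIsValid proofId</code> would bind the unevaluated <code>Async&lt;bool&gt;</code> rather than the <code>bool</code>.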
</p> <p> The second impure action acts on <code>decision</code> through a pipeline of <code>AsyncResult.traverseBoth</code> and <code>AsyncResult.cata</code>, as previously explained. </p> <p> I think that the impure/pure/impure sandwich is more visible like this, so that was my final edit. I'm happy with how it looks now. </p> <h3 id="a4c98d81322e4010a0dfcb1c59955812"> Conclusion <a href="#a4c98d81322e4010a0dfcb1c59955812" title="permalink">#</a> </h3> <p> I don't claim that you can always refactor code to an impure/pure/impure sandwich. In fact, <a href="/2017/07/10/pure-interactions">I can easily envision categories of software where such an architecture seems impossible</a>. </p> <p> Still, I find it intriguing that when I find myself in the realm of web services or message-based applications, I can't recall a case where a sandwich has been impossible. Surely, there must be cases where it is so. That's the reason that I solicit examples. This article was a response to such an example. I found it fruitful, because it enabled me to discuss several useful techniques for composing behaviour in a functional architecture. On the other hand, it failed to be a counter-example. </p> <p> I'm sure that some readers are left with a nagging doubt. <em>That's all very impressive, but would you actually write code like that in a piece of production software?</em> </p> <p> If it was up to me, then: <em>yes.</em> I find that when I can keep code pure, it's trivial to unit test and there's no test-induced damage. Functions also compose in a way objects don't easily do, so there are many advantages to functional programming. I'll take them when they're available. </p> <p> As always, context matters. I've been in team settings where other team members would embrace this style of programming, and in other environments where team members wouldn't understand what was going on. In the latter case, I'd adjust my approach to challenge, not alienate, other team members. 
</p> <p> My intention with this article was to show what's <em>possible</em>, not to dictate what you should do. That's up to you. </p> <p> This article is the December 2 entry in the <a href="https://sergeytihon.com/2019/11/05/f-advent-calendar-in-english-2019">F# Advent Calendar in English 2019</a>. </p> </div> <div id="comments"> <hr> <h2 id="comments-header">Comments</h2> <div class="comment" id="7c05edb624b54cafacc204e60b42bbf3"> <div class="comment-author"><a href="https://www.relativisticramblings.com/">Christer van der Meeren</a></div> <div class="comment-content"> <p>Thank you so much for the comprehensive reply to my comment. It was very instructive to see the refactoring process, from thought to code. The post is an excellent reply to the question I asked.</p> <h3>A slight modification</h3> <p>In my original comment, I made one simplification that, in hindsight, I perhaps should not have made. It is not critical, but it complicates things slightly. In reality, the <code>completeRegistration</code> function does not return <code>Async&lt;unit&gt;</code>, but <code>Async&lt;Result&lt;unit, CompleteRegistrationError&gt;&gt;</code> (where, currently, <code>CompleteRegistrationError</code> has the single case <code>UserExists</code>, returned if the DB throws a unique constraint error).</p> <p>As I see it, the impact of this on your refactoring is two-fold:</p> <ul> <li>You can&#39;t easily use <code>AsyncResult.traverseBoth</code>, since the signatures between the two cases aren&#39;t compatible (unless you want to mess around with nested <code>Result</code> values). You could write a custom <code>traverse</code> function just for the needed signature, but then we’ve traveled well into the lands of “generic does not imply general”.</li> <li>It might be better to model the registration result (completed vs. 
proof required) as its own DU, with <code>Result</code> being reserved for actual errors.</li> </ul> <h3>Evaluating the refactoring</h3> <p>My original comment ended in the following question (emphasis added):</p> <blockquote><p>Is it possible to refactor this to direct input/output, <strong>in a way that actually reduces complexity where it matters?</strong></p> </blockquote> <p>With this (vague) question and the above modifications in mind, let&#39;s look at the relevant code before/after. In both cases, there are two functions: The workflow/logic, and the composition.</p> <h4>Before</h4> <p>Before refactoring, we have a slightly complex impure workflow (which still is fairly easily testable using state-based testing, as you so aptly demonstrated) – note the <code>asyncResult</code> CE (I’m using the excellent FsToolkit.ErrorHandling, if anyone wonders) and the updated signatures; otherwise it’s the same:</p> <pre><code class='language-f#' lang='f#'>let completeRegistrationWorkflow (createProof: Mobile -&gt; Async&lt;ProofId&gt;) (verifyProof: Mobile -&gt; ProofId -&gt; Async&lt;bool&gt;) (completeRegistration: Registration -&gt; Async&lt;Result&lt;unit, CompleteRegistrationError&gt;&gt;) (proofId: ProofId option) (registration: Registration) : Async&lt;Result&lt;CompleteRegistrationResult, CompleteRegistrationError&gt;&gt; = asyncResult { match proofId with | None -&gt; let! proofId = createProof registration.Mobile return ProofRequired proofId | Some proofId -&gt; let! isValid = verifyProof registration.Mobile proofId if isValid then do! completeRegistration registration return RegistrationCompleted else let! 
proofId = createProof registration.Mobile return ProofRequired proofId } </code></pre> <p>Secondly, we have the trivial &quot;humble object&quot; composition, which looks like this:</p> <pre><code class='language-f#' lang='f#'>let complete proofId validReg = Workflows.Registration.complete Http.createMobileClaimProof Http.verifyMobileClaimProof Db.completeRegistration proofId validReg </code></pre> <p>The composition is, indeed, humble – the only thing it does is call the higher-order workflow function with the correct parameters. It has no cyclomatic complexity and is trivial to read, and I don&#39;t think anyone would consider it necessary to test.</p> <h4>After</h4> <p>After refactoring, we have the almost trivial pure function we extracted (for simplicity I let it return <code>Result</code> here, as you proposed):</p> <pre><code class='language-f#' lang='f#'>let completePure reg proofValidity = match proofValidity with | Some true -&gt; Ok reg | Some false | None -&gt; Error reg.Mobile </code></pre> <p>Secondly, we have the composition function. Now, with the modification to <code>completeRegistration</code> (returning <code>Async&lt;Result&lt;_,_&gt;&gt;</code>), it can&#39;t as easily be written in point-free style. You might certainly be able to improve it, but here is my quick initial take.</p> <pre><code class='language-f#' lang='f#'>let complete proofId reg : Async&lt;Result&lt;CompleteRegistrationResult, CompleteRegistrationError&gt;&gt; = asyncResult { let! proofValidity = proofId |&gt; Option.traverseAsync (Http.verifyMobileClaimProof reg.Mobile) match completePure reg proofValidity with | Ok reg -&gt; do! Db.completeRegistration reg return RegistrationCompleted | Error mobile -&gt; let! 
proofId = Http.createMobileClaimProof mobile return ProofRequired proofId } </code></pre> <h4>Evaluation</h4> <p>Now that we have presented the code before/after, let us take stock of what we have gained and lost by the refactoring.</p> <p>Pros:</p> <ul> <li>We have gotten rid of the &quot;DI workflow&quot; entirely</li> <li>More of the logic is pure</li> </ul> <p>Cons:</p> <ul> <li>The logic we extracted to a pure function is almost trivial. This is not in itself bad, but one can wonder whether it was worth it (apart from the purely instructive aspects).</li> <li>If the extracted logic is pure, where then did the rest of the complexity go? The only place it could – it ended up in the &quot;composition&quot;, i.e. the &quot;humble object&quot;. The composition function isn&#39;t just calling a higher-order function with the correct function arguments any more; it has higher cyclomatic complexity and is much harder to read, and can&#39;t be easily tested (since it&#39;s a composition function). The new composition is, so to say, quite a bit less humble than the original composition. This is particularly evident in my updated version, but personally I also have to look at your simpler(?), point-free version a couple of times to convince myself that it is, really, not doing anything wrong. (Though regardless of whether a function is written point-free or not, it does the exact same thing and has the same complexity.)</li> <li>To the point above: The composition function needs many &quot;complex&quot; helper functions that would likely confuse, if not outright alienate beginner F# devs (which could, for example, lead to worse onboarding). This is particularly relevant for non-standard functions like <code>AsyncOption.traverse</code>, <code>AsyncResult.traverseBoth</code>, <code>AsyncResult.cata</code>, etc.</li> </ul> <p>Returning to my initial question: Does the refactoring “reduce complexity where it matters?“ I’m not sure. 
This is (at least partly) “personal opinions” territory, of course, and my vague question doesn’t help. But personally I find the result of the refactoring more complex to understand than the original, DI workflow-based version.</p> <p>Based on Scott Wlaschin’s book Domain Modeling Made Functional, it’s possible he might agree. He seems very fond of the “DI workflow” approach there. I personally prefer a bit more dependency rejection than that, because I find “DR”/sandwiches often lead to simpler code, but in this particular case, I may prefer the impure DI workflow, tested using state-based testing. At least for the more complex code I described, but perhaps also for your original example.</p> <p>Still, I truly appreciate your taking the time to respond in this manner. It was very instructive, as always, which was after all the point. And you’re welcome to share any insights regarding this comment, too.</p> </div> <div class="comment-date">2019-12-03 13:46 UTC</div> </div> <div class="comment" id="e90332adb7d24e2b8aa1484c302b6f8c"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Christer, thank you for writing. This is great! One of your comments inspires me to compose another article that I've long wanted to write. If I manage to produce it in time, I'll publish it Monday. Once that's done, I'll respond here in a more thorough manner. </p> <p> When I do that, however, I don't plan to reproduce your updated example, or address it in detail. I see nothing in it that invalidates what I've already written. As far as I can tell, you don't need to explicitly pattern-match on <code>completePure reg proofValidity</code>. You should be able to map or traverse over it like already shown. If you want my help with the details, I'll be happy to do so, but then please prepare a <a href="https://en.wikipedia.org/wiki/Minimal_working_example">minimal working example</a> like I did for this article. 
You can either fork <a href="https://github.com/ploeh/RegistrationFlow">my example</a> or make a new repository. </p> </div> <div class="comment-date">2019-12-04 8:35 UTC</div> </div> <div class="comment" id="785e708f61b14ad0825d1359cbebd8a2"> <div class="comment-author">Tyson Williams</div> <div class="comment-content"> <p> This is a fantastic post, Mark! Thank you very much for going step-by-step while explaining how you refactored this code. </p> <blockquote> I find it intriguing that when I find myself in the realm of web services or message-based applications, I can't recall a case where an [impure/pure/impure] sandwich has been impossible. Surely, there must be cases where it is so. That's the reason that I solicit examples. </blockquote> <p> I would like to suggest an example in the realm of web services or message-based applications that cannot be expressed as an impure/pure/impure sandwich. </p> <p> Let's call an "impure/pure/impure sandwich" an impure-pure-impure composition. More generally, any impure function can be expressed as a composition of the form <code>[pure-]impure(-pure-impure)*[-pure]</code>. That is, (1) it might begin with a pure step, then (2) there is an impure step, then (3) there is a sequence of length zero or more containing a pure step followed by an impure step, and lastly (4) it might end with another pure step. One reason an impure function might intentionally be expressed by a composition that ends with a pure step is to erase sensitive information from the memory hierarchy. For simplicity though, let's assume that any impure function can be refactored so that the corresponding composition ends with an impure step. Let the length of a composition be one plus the number of dashes (<code>-</code>) that it contains. </p> <p> Suppose <code>f</code> is a function with an impure-pure-impure composition such that <code>f</code> cannot be refactored to a function with a composition of a smaller length. 
Then there exists a function <code>f'</code> with a pure-impure-pure-impure composition. The construction uses public-key cryptography. I think this is a natural and practical example. </p> <p> Here is the definition of <code>f'</code> in words. The user sends to the server ciphertext encrypted using the server's public key. The user's request is received by a process that already has the server's private key loaded into memory. This process decrypts the user's ciphertext using its private key to obtain some plaintext <code>p</code>. This step is pure. Then the process passes <code>p</code> into <code>f</code>. </p> <p> Using symmetric-key cryptography, it is possible to construct a function with a composition of an arbitrarily high length. The following construction reminds me of how <a href="https://en.wikipedia.org/wiki/Onion_routing">onion routing</a> works (though each decryption in that case is intended to happen in a different process on a different server). I admit that this example is not very natural or practical. </p> <p> Suppose <code>f</code> is a function with a composition of length <code>n</code>. Then there exists a function <code>f'</code> with a composition of length greater than <code>n</code>. Specifically, if the original composition starts with a pure step, then the length is larger by one; if the original composition starts with an impure step, then the length is larger by two. </p> <p> Here is the definition of <code>f'</code> in words. The user sends to the server an ID and ciphertext encrypted using a symmetric key that corresponds to the ID. The user's request is received by a process that does not have any keys loaded into memory. First, this process obtains from disk the appropriate symmetric key using the ID. This step is impure. Then this process decrypts the user's ciphertext using this key to obtain some plaintext <code>p</code>. This step is pure. Then the process passes <code>p</code> into <code>f</code>. 
</p> </div> <div class="comment-date">2019-12-06 17:21 UTC</div> </div> <div class="comment" id="0ae83af0cc824c848e5988eb2fb35356"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tyson, thank you for writing. Unfortunately, I don't follow your chain of reasoning. Cryptography strikes me as fitting the impure/pure/impure sandwich architecture quite well. There's definitely an initial impure step because you have to initialise a random number generator, as well as load keys, salts, and whatnot from storage. From there, though, the cryptographic algorithms are, as far as I'm aware, pure calculation. I don't see how asymmetric cryptography changes that. </p> <p> The reason that I'm soliciting examples that defy the impure/pure/impure sandwich architecture, however, is that I'm looking for a compelling example. What to do when the sandwich architecture is impossible is a frequently asked question. To be clear, I know what to do in that situation, but I'd like to write an article that answers the question in a compelling way. For that, I need an example that an uninitiated reader can follow. </p> </div> <div class="comment-date">2019-12-07 10:24 UTC</div> </div> <div class="comment" id="e53d31d11b814bccac392dfc0bc03230"> <div class="comment-author">Tyson Williams</div> <div class="comment-content"> <p> Sorry that my explanation was unclear. I should have included an example. </p> <blockquote> Cryptography strikes me as fitting the impure/pure/impure sandwich architecture quite well. There's definitely an initial impure step because you have to initialise a random number generator, as well as load keys, salts, and whatnot from storage. From there, though, the cryptographic algorithms are, as far as I'm aware, pure calculation. I don't see how asymmetric cryptography changes that. </blockquote> <p> I agree that cryptographic algorithms are pure. 
From your quote that I included above, I get the impression that you have neglected to consider what computation is to be done with the output of the cryptographic algorithm. </p> <p> Here is a specific example of my first construction, which uses public-key cryptography. Consider the function <code>sut</code> that concluded your post. I repeat it for clarity. </p> <p><pre><span style="color:blue;">let</span>&nbsp;sut&nbsp;pid&nbsp;r&nbsp;=&nbsp;async&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;validityOfProof&nbsp;=&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AsyncOption.traverse&nbsp;(twoFA.VerifyProof&nbsp;r.Mobile)&nbsp;pid
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;decision&nbsp;=&nbsp;completeRegistrationWorkflow&nbsp;r&nbsp;validityOfProof
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return!</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;decision
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;AsyncResult.traverseBoth&nbsp;db.CompleteRegistration&nbsp;twoFA.CreateProof
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;AsyncResult.cata&nbsp;(<span style="color:blue;">fun</span>&nbsp;()&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;RegistrationCompleted)&nbsp;ProofRequired
&nbsp;&nbsp;&nbsp;&nbsp;}</pre> </p> <p> Let <code>sut</code> be the function <code>f</code> in that construction. In particular, <code>sut</code> is an impure/pure/impure sandwich, or equivalently an impure-pure-impure composition that of course has length 3. Furthermore, I think it is clear that this behavior cannot be expressed as a pure-impure-pure composition, a pure-impure composition, an impure-pure composition, or an impure composition. You worked very hard to simplify that code, and I believe an implicit claim of yours is that it cannot be simplified any further. </p> <p> In this case, <code>f'</code> would be the following function. 
</p> <p><pre><span style="color:blue;">let</span>&nbsp;privateKey&nbsp;=&nbsp;... <span style="color:blue;">let</span>&nbsp;sut'&nbsp;ciphertext&nbsp;=&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;(pid,&nbsp;r)&nbsp;=&nbsp;decrypt&nbsp;privateKey&nbsp;ciphertext &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;validityOfProof&nbsp;=&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AsyncOption.traverse&nbsp;(twoFA.VerifyProof&nbsp;r.Mobile)&nbsp;pid &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;decision&nbsp;=&nbsp;completeRegistrationWorkflow&nbsp;r&nbsp;validityOfProof &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return!</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;decision &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;AsyncResult.traverseBoth&nbsp;db.CompleteRegistration&nbsp;twoFA.CreateProof &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;AsyncResult.cata&nbsp;(<span style="color:blue;">fun</span>&nbsp;()&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;RegistrationCompleted)&nbsp;ProofRequired &nbsp;&nbsp;&nbsp;&nbsp;}</pre> </p> <p> As I defined it, this function is a pure-impure-pure-impure composition, which has length 4. Maybe in your jargon you would call this a pure/impure/pure/impure sandwich. My claim is that this function cannot be refactored into an impure/pure/impure sandwich. </p> <p> Do you think that my claim is correct? </p> </div> <div class="comment-date">2019-12-07 14:48 UTC</div> </div> <div class="comment" id="189069ae41704900b56403288656a8fe"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tyson, thank you for your patience with me. Now I get it. As stated, your composition looks like a pure-impure-pure-impure composition, but unless you hard-code <code>privateKey</code>, you'll have to load that value, which is an impure operation. 
That would make it an impure-pure-impure-pure-impure composition. </p> <p> The decryption step itself is an impure-pure composition, assuming that we need to load keys, salts, etc. from persistent storage. You might also want to think of it as a 'mostly' pure function, since you could probably load decryption keys once when the application process starts, and keep them around for its entire lifetime. </p> <p> It's a correct example of a more involved interaction model. Thank you for supplying it. Unfortunately, it's not one I can use for an article. Like other cross-cutting concerns such as caching, logging, retry mechanisms, etcetera, security can be abstracted away as middleware. This implies that you'd have a middleware action that's implemented as an impure-pure-impure sandwich, and an application feature that's implemented as another impure-pure-impure sandwich. These two sandwiches are unrelated. A change to one of them is unlikely to trigger a change in the other. Thus, we can still base our application architecture on the notion of the impure-pure-impure sandwich. </p> <p> I hope I've explained my demurral in a sensible way. </p> </div> <div class="comment-date">2019-12-08 13:24 UTC</div> </div> <div class="comment" id="c7b6630ebec24c0fad9b5821a0802878"> <div class="comment-author">Tyson Williams</div> <div class="comment-content"> <blockquote> This implies that you'd have a middleware action that's implemented as an impure-pure-impure sandwich, and an application feature that's implemented as another impure-pure-impure sandwich. These two sandwiches are unrelated. A change to one of them is unlikely to trigger a change in the other. </blockquote> <p> They are unrelated semantically. Syntactically, the whole application sandwich is the last piece of impure bread on the middleware sandwich. This reminds me of a thought I have had and also heard recently, which is that the structure of code is like a fractal. 
</p> <p> Anyway, I am hearing you say that you want functions to have "one responsibility", to do "one thing", to change for "one reason". With that constraint satisfied, you are requesting an example of a function that is not an impure/pure/impure sandwich. I am up to that challenge. Here is another attempt. </p> <p> Suppose our job is to implement a <a href="https://en.wikipedia.org/wiki/Man-in-the-middle_attack">man-in-the-middle attack</a> in the style of <a href="https://www.schneier.com/blog/archives/2011/06/man-in-the-midd_3.html">Schneier's Chess Grandmaster Problem</a> in which Alice and Bob know that they are communicating with Malory while Malory simply repeats what she hears to the other person. Specifically, Alice is a client and Bob is a server. Malory acts like a server to Alice and like a client to Bob. The function would look something like this. </p> <p><pre><span style="color:blue;">let</span>&nbsp;malroyInTheMiddle&nbsp;aliceToMalory&nbsp;=&nbsp;async&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;maloryToBob&nbsp;=&nbsp;convertIncoming&nbsp;aliceToMalory
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;bobToMalory&nbsp;=&nbsp;service&nbsp;maloryToBob
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;maloryToAlice&nbsp;=&nbsp;convertOutgoing&nbsp;bobToMalory
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;maloryToAlice
}</pre></p> <p> This is a pure-impure-pure composition, which is different from an impure-pure-impure composition. </p> </div> <div class="comment-date">2019-12-09 14:28 UTC</div> </div> <div class="comment" id="f2db5259c16e4241b53934ae4dcb17a0"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tyson, thank you for writing. 
The way I understand it, we are to assume that both <code>convertIncoming</code> and <code>convertOutgoing</code> are complicated functions that require substantial testing to get right. Under that assumption, I think that you're right. This doesn't directly fit the impure-pure-impure sandwich architecture. </p> <p> It does, however, fit a simple function composition. As far as I can see, it's equivalent to something like this: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;malroyInTheMiddle&nbsp;&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;Async.fromResult &nbsp;&nbsp;&nbsp;&nbsp;&gt;&gt;&nbsp;Async.map&nbsp;convertIncoming &nbsp;&nbsp;&nbsp;&nbsp;&gt;&gt;&nbsp;Async.bind&nbsp;service &nbsp;&nbsp;&nbsp;&nbsp;&gt;&gt;&nbsp;Async.map&nbsp;convertOutgoing</pre> </p> <p> I haven't tested it, but I'd imagine it to be something like that. </p> <p> To nitpick, this isn't a pure-impure-pure composition, but rather an impure-pure-impure-pure-impure composition. The entry point of a system is always impure, as is the output. </p> </div> <div class="comment-date">2019-12-10 11:29 UTC</div> </div> <div class="comment" id="fea1498fe43543f29a01eb6101dbdb9f"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Christer, I'd hoped that I'd already addressed some of your concerns in the article itself, but I may not have done a good enough job of it. Overall, given a question like <blockquote> "Is it possible to refactor this to direct input/output, in a way that actually reduces complexity where it matters?" </blockquote> I tend to put emphasis on <em>is it possible</em>. Not that the rest of the question is unimportant, but it's more subjective. Perhaps you find that my article didn't answer your question, but I hope at least that I managed to establish that, yes, it's possible to refactor to an impure-pure-impure sandwich. </p> <p> Does it matter? I think it does, but that's subjective. 
I do think, though, that I can objectively say that <a href="/2018/11/19/functional-architecture-a-definition">my refactoring is functional</a>, whereas <a href="/2017/01/30/partial-application-is-dependency-injection">passing impure functions as arguments isn't</a>. Whether or not an architecture ought to be functional is, again, subjective. No-one says that it has to be. That's up to you. </p> <p> As I wrote in <a href="#e90332adb7d24e2b8aa1484c302b6f8c">my preliminary response</a>, I'm not going to address your modification. I don't see that it matters. Even when you return an <code>Async&lt;Result&lt;_,_&gt;&gt;</code> you can <code>map</code>, <code>bind</code>, or <code>traverse</code> over it. You may not be able to use <code>AsyncResult.traverseBoth</code>, but you can derive specialisations like <code>AsyncResult.traverseOk</code> and <code>AsyncResult.traverseError</code>. </p> <p> First, like you did, I find it illustrative to juxtapose the alternatives. I'm going to use the original example first: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;completeRegistrationWorkflow &nbsp;&nbsp;&nbsp;&nbsp;(createProof:&nbsp;Mobile&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Async&lt;ProofId&gt;) &nbsp;&nbsp;&nbsp;&nbsp;(verifyProof:&nbsp;Mobile&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;ProofId&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Async&lt;bool&gt;) &nbsp;&nbsp;&nbsp;&nbsp;(completeRegistration:&nbsp;Registration&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Async&lt;unit&gt;) &nbsp;&nbsp;&nbsp;&nbsp;(proofId:&nbsp;ProofId&nbsp;option) &nbsp;&nbsp;&nbsp;&nbsp;(registration:&nbsp;Registration) &nbsp;&nbsp;&nbsp;&nbsp;:&nbsp;Async&lt;CompleteRegistrationResult&gt;&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;proofId&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;None&nbsp;<span 
style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;proofId&nbsp;=&nbsp;createProof&nbsp;registration.Mobile &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;ProofRequired&nbsp;proofId &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;proofId&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;isValid&nbsp;=&nbsp;verifyProof&nbsp;registration.Mobile&nbsp;proofId &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;isValid&nbsp;<span style="color:blue;">then</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do!</span>&nbsp;completeRegistration&nbsp;registration &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;RegistrationCompleted &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;proofId&nbsp;=&nbsp;createProof&nbsp;registration.Mobile &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;ProofRequired&nbsp;proofId &nbsp;&nbsp;&nbsp;&nbsp;} <span style="color:blue;">let</span>&nbsp;sut&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;completeRegistrationWorkflow &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;twoFA.CreateProof &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;twoFA.VerifyProof &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;db.CompleteRegistration</pre> </p> <p> In contrast, here's 
my refactoring: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;completeRegistrationWorkflow&nbsp;registration&nbsp;=&nbsp;<span style="color:blue;">function</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;<span style="color:blue;">true</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Ok&nbsp;registration &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Error&nbsp;registration.Mobile <span style="color:blue;">let</span>&nbsp;sut&nbsp;pid&nbsp;r&nbsp;=&nbsp;async&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;validityOfProof&nbsp;=&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AsyncOption.traverse&nbsp;(twoFA.VerifyProof&nbsp;r.Mobile)&nbsp;pid &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;decision&nbsp;=&nbsp;completeRegistrationWorkflow&nbsp;r&nbsp;validityOfProof &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return!</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;decision &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;AsyncResult.traverseBoth&nbsp;db.CompleteRegistration&nbsp;twoFA.CreateProof &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;AsyncResult.cata&nbsp;(<span style="color:blue;">fun</span>&nbsp;()&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;RegistrationCompleted)&nbsp;ProofRequired &nbsp;&nbsp;&nbsp;&nbsp;}</pre> </p> <p> It's true that my composition (<code>sut</code>) seems more involved than yours, but the overall trade-off looks good to me. In total, the code is simpler. </p> <p> In your <em>evaluation</em>, you make some claims that I'd like to specifically address. Most are reasonable, but I think a few require special attention: <blockquote> "The logic we extracted to a pure function is almost trivial." </blockquote> Indeed. This should be listed as a <em>pro</em>, not a <em>con</em>. Whether or not it's worth it is a different discussion. 
</p> <p> As I wrote in the article: <blockquote> "If you look at the original function, you'll see that <em>the duplication was there all along</em>. Once you remove all the accidental complexity, you uncover the essential complexity." </blockquote> (Emphasis from the original.) Isn't it always worth it to take away accidental complexity? <blockquote> "The composition function isn't just calling a higher-order function with the correct function arguments any more; it has higher cyclomatic complexity" </blockquote> <a href="/2019/12/09/put-cyclomatic-complexity-to-good-use/#de927bfcc95d410bbfcd0adf7a63926b">No, it doesn't</a>. It has a cyclomatic complexity of <em>1</em>, exactly like the original humble object. <blockquote> "can't be easily tested" </blockquote> True, but neither can the original humble object. <blockquote> "The new composition is, so to say, quite a bit less humble than the original composition." </blockquote> According to which criterion? It has the same cyclomatic complexity, but I admit that more characters went into typing it. On the other hand, the composition juxtaposed with the actual function has far fewer characters than the original example. </p> <p> You also write that <blockquote> "The composition function needs many "complex" helper functions [...]. This is particularly relevant for non-standard functions like <code>AsyncOption.traverse</code>, <code>AsyncResult.traverseBoth</code>, <code>AsyncResult.cata</code>, etc." </blockquote> I don't like the epithet <em>non-standard</em>. It's true that these functions aren't in <code>FSharp.Core</code>, for reasons that aren't clear to me. In comparison, they're part of the standard <code>base</code> library in Haskell. </p> <p> There's nothing non-standard about functions like these. Like <code>map</code> and <code>bind</code>, traversals and catamorphisms are <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstractions</a>. 
They exist independently of particular programming languages or library packages. </p> <p> I think that it's fair criticism that they may not be friendly to absolute beginners, but they're still fairly basic ideas that address real needs. The same can be said for the <code>asyncResult</code> computation expression that you've decided to use. It's also 'non-standard' only in the sense that it's not part of <code>FSharp.Core</code>, but otherwise standard in that it's just a stack of monads, and plenty of libraries supply that functionality. You can also write that computation expression yourself in a dozen lines of code. </p> <p> In the end, all of this is subjective. As I also wrote in my conclusion: <blockquote> <p> "I've been in team settings where [...] team members wouldn't understand what was going on. In the latter case, I'd adjust my approach to challenge, not alienate, other team members. </p> <p> "My intention with this article was to show what's possible, not to dictate what you should do." </p> </blockquote> What I do, however, think is important to realise is that what I suggest is to learn a set of concepts <em>once</em>. Once you understand <a href="/2018/03/22/functors">functors</a>, monads, traversals etcetera, that's knowledge that applies to F#, Haskell, C#, JavaScript (I suppose) and so on. </p> <p> Personally, I find it a better investment of my time to learn a general concept once, and then work with trivial code, rather than having to learn, over and over again, how to deal with each new code base's accidental complexity. </p> </div> <div class="comment-date">2019-12-12 2:56 UTC</div> </div> <div class="comment" id="e0c12e5a6148400aac256cf3b800ff4f"> <div class="comment-author"><a href="https://www.relativisticramblings.com/">Christer van der Meeren</a></div> <div class="comment-content"> <p>Mark, thank you for getting back to me with a detailed response.</p> <p>First, a general remark. 
I see that my comment might have been a bit “sharp around the edges” and phrased somewhat carelessly, giving the impression that I was not happy with your treatment of my example. I’d just like to clearly state that I am. You replied in your usual clear manner to exactly the question I posed, and seeing your process and solution was instructive for me.</p> <p>We are all learning, all the time, and if I use a strong voice, that is primarily because <a href='https://blog.codinghorror.com/strong-opinions-weakly-held/'>strong opinions, weakly held</a> often seems to be a fruitful way to drive discussion and learning.</p> <p>With that in mind, allow me to address some of your remarks and possibly soften my previous comment.</p> <blockquote><p>Perhaps you find that my article didn&#39;t answer your question, but I hope at least that I managed to establish that, yes, it&#39;s possible to refactor to an impure-pure-impure sandwich.</p> </blockquote> <p>Your article did indeed answer my question. My takeaway (then, not necessarily now after reading the rest of your comment) was that you managed to refactor to impure-pure-impure at the “expense” of making the non-pure part harder to understand. But as you say, that’s subjective, and your remarks on that later in your comment were a good point of reflection for me. I’ll get to that later.</p> <blockquote><p>Does it matter? I think it does, but that&#39;s subjective. I do think, though, that I can objectively say that <a href='https://blog.ploeh.dk/2018/11/19/functional-architecture-a-definition'>my refactoring is functional</a>, whereas <a href='https://blog.ploeh.dk/2017/01/30/partial-application-is-dependency-injection'>passing impure functions as arguments isn&#39;t</a>. Whether or not an architecture ought to be functional is, again, subjective. No-one says that it has to be. 
That&#39;s up to you.</p> </blockquote> <p>I agree on all points.</p> <blockquote><p>First, like you did, I find it illustrative to juxtapose the alternatives.</p> </blockquote> <p>I don’t agree 100% that it’s a completely fair comparison, since you’re leaving out the implementations of <code>AsyncOption.traverse</code>, <code>AsyncResult.traverseBoth</code>, and <code>AsyncResult.cata</code>. However, I get why you are doing it. These are generic utility functions for universal concepts that, as you say later in your comment, you “learn once”. In that respect, it’s fair to leave them out. My only issue with it is that since F# doesn’t have higher-kinded types, these utility functions have to be specific to the monads and monad stacks in use. I originally thought this made such functions less understandable and less useful, but after reading the rest of your comment, I’m not sure they are. More on that below.</p> <blockquote><p>In your <em>evaluation</em></p> </blockquote> <p>(Emphasis yours.) Just in case: “Evaluation” might have been a poor choice of words. I hope you did not take it to mean that I was a teacher grading a student’s test. This was not in any way intended personally (e.g. evaluating “<em>your</em> solution”). I was merely looking to sum up my subjective opinions about the refactoring.</p> <blockquote><p>Isn&#39;t it always worth it to take away accidental complexity?</p> </blockquote> <p>I find it hard to say an unequivocal “yes” to such general statements. Ultimately it depends on the specific context and tradeoffs involved. If the context is “a codebase to be used for onboarding new F# devs” and the tradeoffs are “use generic helper functions to traverse bifunctors in a stack of monads”, then I’m not sure. 
(It <em>may</em> still be, but it’s certainly not a given.)</p> <p>But generally, though I haven’t reflected deeply on this, I’m sure you’re right that it’s worthwhile to always take away accidental complexity.</p> <blockquote><p><a href='https://blog.ploeh.dk/2019/12/09/put-cyclomatic-complexity-to-good-use/#de927bfcc95d410bbfcd0adf7a63926b'>No, it doesn&#39;t</a>. It has a cyclomatic complexity of <em>1</em>, exactly like the original humble object.</p> </blockquote> <p>You’re right. Thank you for the clarifying article on cyclomatic complexity.</p> <blockquote><blockquote><p>can&#39;t be easily tested</p> </blockquote> <p>True, but neither can the original humble object.</p> </blockquote> <p>That’s correct, but my point was that the original composition was trivial (just calling a “DI function” with the correct arguments/dependencies) and didn’t need to be tested, whereas the refactored composition does more and might warrant testing (at least to a larger degree than the original).</p> <p>This raises an interesting point. It seems (subjectively to me based on what I’ve read) to be a general consensus that a function can be left untested (is “humble”, so to speak) as long as it consists of just generic helpers, like the refactored composition. That “if it compiles, it works”. This is not a general truth, since for some signatures there may exist several transformations from the input type to the output type, where the output value is different for the different transformations. I have come across such cases, and even had bugs because I used the wrong transformation. Which is why I said:</p> <blockquote><blockquote><p>The new composition is, so to say, quite a bit less humble than the original composition.</p> </blockquote> <p>According to which criterion?</p> </blockquote> <p>It is more complex in the sense that it doesn’t just call a function with the correct dependencies. The original composition is more or less immediately recognizable as correct. 
The refactored composition, as I said, required me to look at it more carefully to convince myself that it was correct. (I will grant that this is to some extent subjective, though.)</p> <blockquote><p>I don&#39;t like the epithet <em>non-standard</em>. It&#39;s true that these functions aren&#39;t in <code>FSharp.Core</code>, for reasons that aren&#39;t clear to me. In comparison, they&#39;re part of the standard <code>base</code> library in Haskell.</p> <p>There&#39;s nothing non-standard about functions like these. Like <code>map</code> and <code>bind</code>, traversals and catamorphisms are <a href='https://blog.ploeh.dk/2017/10/04/from-design-patterns-to-category-theory'>universal abstractions</a>. They exist independently of particular programming languages or library packages. </p> <p>I think that it&#39;s fair criticism that they may not be friendly to absolute beginners, but they&#39;re still fairly basic ideas that address real needs.</p> <p>…</p> <p>What I do, however, think is important to realise is that what I suggest is to learn a set of concepts <em>once</em>. Once you understand <a href='https://blog.ploeh.dk/2018/03/22/functors'>functors</a>, monads, traversals etcetera, that&#39;s knowledge that applies to F#, Haskell, C#, JavaScript (I suppose) and so on.</p> <p>Personally, I find it a better investment of my time to learn a general concept once, and then work with trivial code, rather than having to learn, over and over again, how to deal with each new code base&#39;s accidental complexity.</p> </blockquote> <p>This is the primary point of reflection for me in your comment. While I frequently use monads and monad stacks (particularly <code>Async&lt;Result&lt;_,_&gt;&gt;</code>) and often write utility code to transform when needed (e.g. <code>List.traverseResult</code>), I try to limit the number of such custom utility functions. Why? I’m not sure, actually. 
It may very well have to do with my work environment, where for a long time I have been the only F# dev and I don’t want to alienate the other .NET devs before they even get started with F#.</p> <p>In light of your comment, perhaps F# devs are doing others a disservice if we limit our use of important, general concepts like functors, monads, traversals etc.? Then again, there’s certainly a balance to be struck. I got started (and thrilled) with F# by reading <a href='https://fsharpforfunandprofit.com/'>F# for fun and profit</a> and learning about algebraic types, the concise syntax, “railway-oriented programming” etc. If my first glimpse of F# had instead been <code>AsyncSeq.traverseAsyncResultOption</code>, then I might never have left the warm embrace of C#.</p> <p>I might check out <a href='http://fsprojects.github.io/FSharpPlus/'>FSharpPlus</a>, which seems to make this kind of programming easier. I have previously steered away from that library because I deemed it “too complex” (cf. my remarks about alienating coworkers), but it might be time to reconsider. If you have tried it, I would love to hear your thoughts on it in some form or another, though admittedly that isn’t directly related to the topic at hand.</p> </div> <div class="comment-date">2019-12-12 9:04 UTC</div> </div> <div class="comment" id="477da2bf6ca04dc2ac478811cd77435e"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Christer, don't worry about the tone of the debate. I'm not in the least offended or vexed. On the contrary, I find this a valuable discussion, and I'm glad that we're having it in a medium where it's also visible to other people. </p> <p> I think that we're gravitating towards consensus. I definitely agree that the changes I suggest aren't beginner-friendly. </p> <p> People sometimes ask me for advice on how to get started with functional programming, and I always tell .NET developers to start with F#. 
It's a friendly language that enables everyone to learn gradually. If you already know C# (or Visual Basic .NET) the only thing you need to learn about F# is some syntax. Then you can write object-oriented F#. As you learn new functional concepts, you can gradually change the way you write F# code. That's what I did. </p> <p> I agree with your reservations about onboarding and beginner-friendliness. When that's a concern, I wouldn't write the F# code like I suggested either. </p> <p> For a more sophisticated team, however, I feel that my suggestions are improvements that matter. I grant you that the composition seems more convoluted, but I consider the overall trade-off beneficial. In the <a href="https://www.infoq.com/presentations/Simple-Made-Easy">terminology suggested by Rich Hickey</a>, it may not be easier, but it's simpler. </p> <p> I have no experience with FSharpPlus or any similar libraries. I usually just add the monad stacks and functions to my code base on an as-needed basis. As we've seen here, such functions are mostly useful to compose other functions, so they rarely need to be exported as part of a code base's surface area. </p> </div> <div class="comment-date">2019-12-12 15:02 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. TimeSpan configuration values in .NET Core https://blog.ploeh.dk/2019/11/25/timespan-configuration-values-in-net-core 2019-11-25T07:04:00+00:00 Mark Seemann <div id="post"> <p> <em>You can use a standard string format for TimeSpan values in configuration files.</em> </p> <p> Sometimes you need to make <code>TimeSpan</code> values configurable. 
I often see configuration files that look like this: </p> <p> <pre>{ &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;SeatingDurationInSeconds&quot;</span>:&nbsp;<span style="color:#a31515;">&quot;9000&quot;</span> }</pre> </p> <p> Code can read such values from configuration files like this: </p> <p> <pre><span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">seatingDuration</span>&nbsp;=&nbsp;<span style="color:#2b91af;">TimeSpan</span>.<span style="color:#74531f;">FromSeconds</span>(Configuration.<span style="font-weight:bold;color:#74531f;">GetValue</span>&lt;<span style="color:blue;">int</span>&gt;(<span style="color:#a31515;">&quot;SeatingDurationInSeconds&quot;</span>));</pre> </p> <p> This works, but is abstruse. How long is 9000 seconds? </p> <p> The <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a> configuration file format for .NET Core is JSON, which even prevents you from adding comments. Had the configuration been in XML, at least you could have added a comment: </p> <p> <pre><span style="color:blue;">&lt;!--</span><span style="color:green;">9000&nbsp;seconds&nbsp;=&nbsp;2½&nbsp;hours</span><span style="color:blue;">--&gt;</span> <span style="color:blue;">&lt;</span><span style="color:#a31515;">SeatingDurationInSeconds</span><span style="color:blue;">&gt;</span>9000<span style="color:blue;">&lt;/</span><span style="color:#a31515;">SeatingDurationInSeconds</span><span style="color:blue;">&gt;</span></pre> </p> <p> In this case, however, it doesn't matter. 
Use the <a href="https://docs.microsoft.com/en-us/dotnet/standard/base-types/standard-timespan-format-strings">standard TimeSpan string representation</a> instead: </p> <p> <pre>{ &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;SeatingDuration&quot;</span>:&nbsp;<span style="color:#a31515;">&quot;2:30:00&quot;</span> }</pre> </p> <p> Code can read the value like this: </p> <p> <pre><span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">seatingDuration</span>&nbsp;=&nbsp;Configuration.<span style="font-weight:bold;color:#74531f;">GetValue</span>&lt;<span style="color:#2b91af;">TimeSpan</span>&gt;(<span style="color:#a31515;">&quot;SeatingDuration&quot;</span>);</pre> </p> <p> I find a configuration value like <code>"2:30:00"</code> much easier to understand than <code>9000</code>, and the end result is the same. </p> <p> I haven't found this documented anywhere, but from experience I know that this capability is present in the .NET Framework, so I wondered if it was also available in .NET Core. It is. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="95e18fd2054447ab901aecc4d74c1547"> <div class="comment-author">Rex Ng</div> <div class="comment-content"> <p>There is actually an <a href="https://en.wikipedia.org/wiki/ISO_8601#Durations">ISO 8601 standard for durations.</a></p> <p> In your example you can use <pre>{ "SeatingDuration": "PT9000S" }</pre> or <pre>{ "SeatingDuration": "PT2H30M" }</pre> which is also quite readable in my opinion.</p> <p>Not every JSON serializer supports it though.</p> </div> <div class="comment-date">2019-11-26 01:02 UTC</div> </div> <div class="comment" id="edb7484150934d94848fe923f0ee4a39"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Rex, thank you for writing. I was aware of the ISO standard, although I didn't know that you can use it in a .NET Core configuration file. 
This is clearly subjective, but I don't find that format as readable as <code>2:30:00</code>. </p> </div> <div class="comment-date">2019-11-26 6:59 UTC</div> </div> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Small methods are easy to troubleshoot https://blog.ploeh.dk/2019/11/18/small-methods-are-easy-to-troubleshoot 2019-11-18T06:48:00+00:00 Mark Seemann <div id="post"> <p> <em>Write small methods. How small? Small enough that any unhandled exception is easy to troubleshoot.</em> </p> <p> Imagine that you receive a bug report. This one includes a logged exception: </p> <p> <pre>System.NullReferenceException: Object reference not set to an instance of an object. at Ploeh.Samples.BookingApi.Validator.Validate(ReservationDto dto) at Ploeh.Samples.BookingApi.ReservationsController.Post(ReservationDto dto) at lambda_method(Closure , Object , Object[] ) at Microsoft.Extensions.Internal.ObjectMethodExecutor.Execute(Object target, Object[] parameters) at Microsoft.AspNetCore.Mvc.Internal.ActionMethodExecutor.SyncActionResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments) at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeActionMethodAsync() at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeNextActionFilterAsync() at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Rethrow(ActionExecutedContext context) at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Next(State&amp; next, Scope&amp; scope, Object&amp; state, Boolean&amp; isCompleted) at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeInnerFilterAsync() at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeNextResourceFilter() at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Rethrow(ResourceExecutedContext context) at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Next(State&amp; next, 
Scope&amp; scope, Object&amp; state, Boolean&amp; isCompleted) at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeFilterPipelineAsync() at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeAsync() at Microsoft.AspNetCore.Builder.RouterMiddleware.Invoke(HttpContext httpContext) at Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware.Invoke(HttpContext context) </pre> </p> <p> <em>Oh, no,</em> you think, <em>not a NullReferenceException.</em> </p> <p> If you find it hard to troubleshoot NullReferenceExceptions, you're not alone. It doesn't have to be difficult, though. Open the method at the top of the stack trace, <code>Validate</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">Validator</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="color:#74531f;">Validate</span>(<span style="color:#2b91af;">ReservationDto</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">dto</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(!<span style="color:#2b91af;">DateTime</span>.<span style="color:#74531f;">TryParse</span>(<span style="font-weight:bold;color:#1f377f;">dto</span>.Date,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:blue;">_</span>)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:#a31515;">$&quot;Invalid&nbsp;date:&nbsp;</span>{<span style="font-weight:bold;color:#1f377f;">dto</span>.Date}<span style="color:#a31515;">.&quot;</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:#a31515;">&quot;&quot;</span>; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> Take a moment to consider this method in the light of the logged exception. What do you think went wrong? Which object was null? </p> <h3 id="efd0166dab0c4a85b23c624162f8da67"> Failed hypothesis <a href="#efd0166dab0c4a85b23c624162f8da67" title="permalink">#</a> </h3> <p> You may form one of a few hypotheses about which object was null. Could it be <code>dto</code>? <code>dto.Date</code>? Those are the only options I can see. </p> <p> When you encounter a bug in a production system, if at all possible, reproduce it as a unit test. </p> <p> If you think that the problem is that <code>dto.Date</code> is null, test your hypothesis in a unit test: </p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">ValidateNullDate</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">dto</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ReservationDto</span>&nbsp;{&nbsp;Date&nbsp;=&nbsp;<span style="color:blue;">null</span>&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#2b91af;">Validator</span>.<span style="color:#74531f;">Validate</span>(<span style="font-weight:bold;color:#1f377f;">dto</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">NotEmpty</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>); }</pre> </p> <p> If you think that a null <code>dto.Date</code> should reproduce the above exception, you'd expect the test to fail. When you run it, however, it passes. 
It passes because <code>DateTime.TryParse(null, out var _)</code> returns <code>false</code>. It doesn't throw an exception. </p> <p> That's not the problem, then. </p> <p> That's okay, this sometimes happens. You form a hypothesis and fail to validate it. Reject it and move on. </p> <h3 id="e35877f1cc064a259b27046f05777807"> Validated hypothesis <a href="#e35877f1cc064a259b27046f05777807" title="permalink">#</a> </h3> <p> If the problem isn't with <code>dto.Date</code>, it must be with <code>dto</code> itself. Write a unit test to test that hypothesis: </p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">ValidateNullDto</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#2b91af;">Validator</span>.<span style="color:#74531f;">Validate</span>(<span style="color:blue;">null</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">NotEmpty</span>(<span style="font-weight:bold;color:#1f377f;">actual</span>); }</pre> </p> <p> When you run this test, it does indeed fail: </p> <p> <pre>Ploeh.Samples.BookingApi.UnitTests.ValidatorTests.ValidateNullDto Duration: 6 ms Message: System.NullReferenceException : Object reference not set to an instance of an object. Stack Trace: Validator.Validate(ReservationDto dto) line 12 ValidatorTests.ValidateNullDto() line 36</pre> </p> <p> This looks like the exception included in the bug report. You can consider this as validation of your hypothesis. This test reproduces the defect. 
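</p> <p> The two hypotheses can also be checked side by side in a small console program. This is only a sketch under assumptions: <code>ReservationDto</code> is reduced to the single <code>Date</code> property that <code>Validate</code> reads, and <code>Probe</code> with its <code>ThrowsNullReference</code> method is a hypothetical helper that isn't part of the sample code base. </p>

```csharp
using System;

// Assumption: a stripped-down stand-in for the real DTO.
public class ReservationDto
{
    public string Date { get; set; }
}

// Same body as the original, unguarded Validate method.
public static class Validator
{
    public static string Validate(ReservationDto dto)
    {
        if (!DateTime.TryParse(dto.Date, out var _))
            return $"Invalid date: {dto.Date}.";
        return "";
    }
}

// Hypothetical helper; not part of the booking API.
public static class Probe
{
    // Returns true if Validate throws a NullReferenceException for the given dto.
    public static bool ThrowsNullReference(ReservationDto dto)
    {
        try
        {
            Validator.Validate(dto);
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        // Hypothesis 1: a null Date. TryParse returns false, so nothing throws.
        Console.WriteLine(ThrowsNullReference(new ReservationDto { Date = null }));

        // Hypothesis 2: a null dto. Reading dto.Date throws.
        Console.WriteLine(ThrowsNullReference(null));
    }
}
```

<p> Run as written, this should print <code>False</code> for the null <code>Date</code> and <code>True</code> for the null <code>dto</code>, which agrees with the two unit tests above.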
</p> <p> It's easy to address the issue: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="color:#74531f;">Validate</span>(<span style="color:#2b91af;">ReservationDto</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">dto</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(<span style="font-weight:bold;color:#1f377f;">dto</span>&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:#a31515;">&quot;No&nbsp;reservation&nbsp;data&nbsp;supplied.&quot;</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(!<span style="color:#2b91af;">DateTime</span>.<span style="color:#74531f;">TryParse</span>(<span style="font-weight:bold;color:#1f377f;">dto</span>.Date,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:blue;">_</span>)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:#a31515;">$&quot;Invalid&nbsp;date:&nbsp;</span>{<span style="font-weight:bold;color:#1f377f;">dto</span>.Date}<span style="color:#a31515;">.&quot;</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:#a31515;">&quot;&quot;</span>; }</pre> </p> <p> The point of this article, however, is to show that small methods are easy to troubleshoot. How you resolve the problem, once you've identified it, is up to you. </p> <h3 id="7b8e2476b4fb45a0b4f32b6b2d108724"> Ambiguity <a href="#7b8e2476b4fb45a0b4f32b6b2d108724" title="permalink">#</a> </h3> <p> Methods are usually more complex than the above example. 
Imagine, then, that you receive another bug report with this logged exception: </p> <p> <pre>System.NullReferenceException: Object reference not set to an instance of an object.
   at Ploeh.Samples.BookingApi.MaîtreD.CanAccept(IEnumerable`1 reservations, Reservation reservation)
   at Ploeh.Samples.BookingApi.ReservationsController.Post(ReservationDto dto)
   at lambda_method(Closure , Object , Object[] )
   at Microsoft.Extensions.Internal.ObjectMethodExecutor.Execute(Object target, Object[] parameters)
   at Microsoft.AspNetCore.Mvc.Internal.ActionMethodExecutor.SyncActionResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeActionMethodAsync()
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeNextActionFilterAsync()
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Rethrow(ActionExecutedContext context)
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Next(State&amp; next, Scope&amp; scope, Object&amp; state, Boolean&amp; isCompleted)
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeInnerFilterAsync()
   at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeNextResourceFilter()
   at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Rethrow(ResourceExecutedContext context)
   at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Next(State&amp; next, Scope&amp; scope, Object&amp; state, Boolean&amp; isCompleted)
   at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeFilterPipelineAsync()
   at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeAsync()
   at Microsoft.AspNetCore.Builder.RouterMiddleware.Invoke(HttpContext httpContext)
   at Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware.Invoke(HttpContext context)</pre> </p> <p> When you open the code file for the <code>MaîtreD</code> class, this is what you see: </p> <p> <pre><span
style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">capacity</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Capacity&nbsp;=&nbsp;<span style="font-weight:bold;color:#1f377f;">capacity</span>; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">int</span>&nbsp;Capacity&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="font-weight:bold;color:#74531f;">CanAccept</span>(<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Reservation</span>&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">reservations</span>,&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">reservation</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">reservedSeats</span>&nbsp;=&nbsp;0; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">r</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">reservations</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">reservedSeats</span>&nbsp;+=&nbsp;<span style="font-weight:bold;color:#1f377f;">r</span>.Quantity; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(Capacity&nbsp;&lt;&nbsp;<span style="font-weight:bold;color:#1f377f;">reservedSeats</span>&nbsp;+&nbsp;<span style="font-weight:bold;color:#1f377f;">reservation</span>.Quantity) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">true</span>; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> This code throws a <code>NullReferenceException</code>, but which object is <code>null</code>? Can you identify it from the code? </p> <p> I don't think that you can. It could be <code>reservations</code> or <code>reservation</code> (or both). Without more details, you can't tell. The exception is ambiguous. </p> <p> The key to making troubleshooting easy is to increase your chances that, when exceptions are thrown, they're unambiguous. The problem with a <code>NullReferenceException</code> is that you can't tell which object was <code>null</code>. </p> <h3 id="6a46508d10734d5aaeaad9252e1f70ac"> Remove ambiguity by protecting invariants <a href="#6a46508d10734d5aaeaad9252e1f70ac" title="permalink">#</a> </h3> <p> Consider the <code>CanAccept</code> method. Clearly, it requires both <code>reservations</code> and <code>reservation</code> in order to work. This requirement is, however, currently implicit. You can make it more explicit by letting the method protect its invariants. (This is also known as <em>encapsulation</em>. For more details, watch my Pluralsight course <a href="https://blog.ploeh.dk/encapsulation-and-solid">Encapsulation and SOLID</a>.) 
</p> <p> A simple improvement is to add a <a href="https://en.wikipedia.org/wiki/Guard_(computer_science)">Guard Clause</a> for each argument: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="font-weight:bold;color:#74531f;">CanAccept</span>(<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Reservation</span>&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">reservations</span>,&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">reservation</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(<span style="font-weight:bold;color:#1f377f;">reservations</span>&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(<span style="font-weight:bold;color:#1f377f;">reservations</span>)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(<span style="font-weight:bold;color:#1f377f;">reservation</span>&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(<span style="font-weight:bold;color:#1f377f;">reservation</span>)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">reservedSeats</span>&nbsp;=&nbsp;0; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(<span 
style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">r</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">reservations</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">reservedSeats</span>&nbsp;+=&nbsp;<span style="font-weight:bold;color:#1f377f;">r</span>.Quantity; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(Capacity&nbsp;&lt;&nbsp;<span style="font-weight:bold;color:#1f377f;">reservedSeats</span>&nbsp;+&nbsp;<span style="font-weight:bold;color:#1f377f;">reservation</span>.Quantity) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">true</span>; }</pre> </p> <p> If you'd done that from the start, the logged exception would instead have been this: </p> <p> <pre>System.ArgumentNullException: Value cannot be null. 
Parameter name: reservations
   at Ploeh.Samples.BookingApi.Ma&#xEE;treD.CanAccept(IEnumerable`1 reservations, Reservation reservation)
   at Ploeh.Samples.BookingApi.ReservationsController.Post(ReservationDto dto)
   at lambda_method(Closure , Object , Object[] )
   at Microsoft.Extensions.Internal.ObjectMethodExecutor.Execute(Object target, Object[] parameters)
   at Microsoft.AspNetCore.Mvc.Internal.ActionMethodExecutor.SyncActionResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeActionMethodAsync()
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeNextActionFilterAsync()
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Rethrow(ActionExecutedContext context)
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Next(State&amp; next, Scope&amp; scope, Object&amp; state, Boolean&amp; isCompleted)
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeInnerFilterAsync()
   at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeNextResourceFilter()
   at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Rethrow(ResourceExecutedContext context)
   at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Next(State&amp; next, Scope&amp; scope, Object&amp; state, Boolean&amp; isCompleted)
   at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeFilterPipelineAsync()
   at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeAsync()
   at Microsoft.AspNetCore.Builder.RouterMiddleware.Invoke(HttpContext httpContext)
   at Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware.Invoke(HttpContext context)</pre> </p> <p> It's now clear that it was the <code>reservations</code> argument that was <code>null</code>. Now fix that issue. Why does <code>CanAccept</code> receive a <code>null</code> argument? 
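</p> <p> To see how the Guard Clauses remove the ambiguity, consider the following self-contained sketch. It makes some assumptions: <code>Reservation</code> only has the <code>Quantity</code> property, <code>CanAccept</code> takes an array instead of <code>IEnumerable</code> to keep the listing short, and <code>GuardDemo</code> is a hypothetical helper that isn't part of the booking API. </p>

```csharp
using System;

// Assumption: only the property that CanAccept reads is modelled here.
public class Reservation
{
    public int Quantity { get; set; }
}

public class MaîtreD
{
    public MaîtreD(int capacity)
    {
        Capacity = capacity;
    }

    public int Capacity { get; }

    // Simplification: an array instead of the original sequence parameter.
    public bool CanAccept(Reservation[] reservations, Reservation reservation)
    {
        if (reservations is null)
            throw new ArgumentNullException(nameof(reservations));
        if (reservation is null)
            throw new ArgumentNullException(nameof(reservation));

        int reservedSeats = 0;
        foreach (var r in reservations)
            reservedSeats += r.Quantity;

        // Same decision as the original: reject only when capacity is exceeded.
        return Capacity >= reservedSeats + reservation.Quantity;
    }
}

// Hypothetical helper; not part of the booking API.
public static class GuardDemo
{
    // Returns the parameter name reported by a Guard Clause,
    // or the empty string if no ArgumentNullException was thrown.
    public static string NullParameterName(Reservation[] reservations, Reservation reservation)
    {
        try
        {
            new MaîtreD(10).CanAccept(reservations, reservation);
            return "";
        }
        catch (ArgumentNullException e)
        {
            return e.ParamName;
        }
    }

    public static void Main()
    {
        var reservation = new Reservation { Quantity = 2 };
        Console.WriteLine(NullParameterName(null, reservation));
        Console.WriteLine(NullParameterName(new Reservation[0], null));
    }
}
```

<p> Under those assumptions, the program should print <code>reservations</code> and then <code>reservation</code>: the exception itself names the offending argument, so there is nothing left to guess.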
</p> <p> You may now consider examining the next frame in the stack trace: <code>ReservationsController.Post</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">ActionResult</span>&nbsp;<span style="font-weight:bold;color:#74531f;">Post</span>(<span style="color:#2b91af;">ReservationDto</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">dto</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">validationMsg</span>&nbsp;=&nbsp;<span style="color:#2b91af;">Validator</span>.<span style="color:#74531f;">Validate</span>(<span style="font-weight:bold;color:#1f377f;">dto</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(<span style="font-weight:bold;color:#1f377f;">validationMsg</span>&nbsp;!=&nbsp;<span style="color:#a31515;">&quot;&quot;</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="font-weight:bold;color:#74531f;">BadRequest</span>(<span style="font-weight:bold;color:#1f377f;">validationMsg</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span style="color:#2b91af;">Mapper</span>.<span style="color:#74531f;">Map</span>(<span style="font-weight:bold;color:#1f377f;">dto</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">reservations</span>&nbsp;=&nbsp;Repository.<span style="font-weight:bold;color:#74531f;">ReadReservations</span>(<span style="font-weight:bold;color:#1f377f;">reservation</span>.Date); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">accepted</span>&nbsp;=&nbsp;maîtreD.<span style="font-weight:bold;color:#74531f;">CanAccept</span>(<span 
style="font-weight:bold;color:#1f377f;">reservations</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(!<span style="font-weight:bold;color:#1f377f;">accepted</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="font-weight:bold;color:#74531f;">StatusCode</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">StatusCodes</span>.Status500InternalServerError, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;Couldn&#39;t&nbsp;accept.&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">id</span>&nbsp;=&nbsp;Repository.<span style="font-weight:bold;color:#74531f;">Create</span>(<span style="font-weight:bold;color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="font-weight:bold;color:#74531f;">Ok</span>(<span style="font-weight:bold;color:#1f377f;">id</span>); }</pre> </p> <p> The <code>Post</code> code only calls <code>CanAccept</code> once, so again, you can unambiguously deduce where the <code>null</code> reference comes from. It comes from the <code>Repository.ReadReservations</code> method call. Now go there and figure out why it returns <code>null</code>. </p> <p> Notice how troubleshooting is trivial when the methods are small. </p> <h3 id="952d4ebb24e94cb990ab1b047c4a7140"> Size matters <a href="#952d4ebb24e94cb990ab1b047c4a7140" title="permalink">#</a> </h3> <p> How small is a small method? How large does a method have to be before it's no longer small? 
</p> <p> <em>A method is small when you can easily troubleshoot it based exclusively on a logged exception.</em> </p> <p> Until you get the hang of it, it can be hard to predict if a method is going to be easy to troubleshoot. I therefore also follow another rule of thumb: </p> <p> For languages like C#, <a href="/2019/11/04/the-80-24-rule">methods should have a maximum size of 80x24 characters</a>. </p> <p> What if you already have bigger methods? Those are much harder to troubleshoot. If you have a method that spans hundreds of lines of code, a logged exception is rarely sufficient for troubleshooting. In order to save space, I'm not going to show an example of such a method, but I'm sure that you can easily find one in the code base that you work with professionally. Such a method is likely to have dozens of objects that could be <code>null</code>, so a <code>NullReferenceException</code> and a stack trace will contain too little information to assist troubleshooting. </p> <p> I often see developers add tracing or logging to such code in order to facilitate troubleshooting. This makes the code even more bloated, and harder to troubleshoot. </p> <p> Instead, cut the big method into smaller helper methods. You can keep the helper methods <code>private</code> so that you don't change the API. Even so, a logged exception will now report the helper method in its stack trace. Keep those helper methods small, and troubleshooting becomes trivial. </p> <h3 id="2a34a1a9c82443d8975a22fb9da9845b"> Conclusion <a href="#2a34a1a9c82443d8975a22fb9da9845b" title="permalink">#</a> </h3> <p> Small methods come with many benefits. One of these is that it's easy to troubleshoot a small method from a logged exception. Small methods typically take few arguments, and use few variables. A logged exception will often be all that you need to pinpoint where the problem occurred. </p> <p> A large method body, on the other hand, is difficult to troubleshoot. 
If all you have is a logged exception, you'll be able to find multiple places in the code where that exception could have been thrown. </p> <p> Refactor big methods into smaller helper methods. If one of these helper methods throws an exception, the method name will be included in the stack trace of a logged exception. When the helper method is small, you can easily troubleshoot it. </p> <p> Keep methods small. Size does matter. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="1734b6e6cd1b4898bf28c2412286f381"> <div class="comment-author">Tyson Williams</div> <div class="comment-content"> <p> How did you create those exception stack traces? </p> <p> Compared to your examples, I am used to seeing each stack frame line end with the source code line number. (Although, sometimes I have seen all line numbers being 0, which is equivalent to having no line numbers at all.) The exception stack trace in your unit test includes line numbers but your other exception stack traces do not. This <a href="https://stackoverflow.com/questions/4272579/how-to-print-full-stack-trace-in-exception/4272629#4272629">Stack Overflow answer</a> contains an exception stack trace with line numbers, like I am used to seeing. </p> <p> The line containing "lambda_method" also seems odd to me. As I recall, invoking a lambda expression also contributes more information to the stack trace than your examples include. Although, that information is cryptic enough that I don't remember off the top of my head what it means. (After a quick search, I didn't find a good example of how I am used to seeing lambda expressions displayed in a stack trace.) </p> <p> With this additional information, the methods can be longer while remaining unambiguous (in the sense you described). 
</p> </div> <div class="comment-date">2019-11-18 14:57 UTC</div> </div> <div class="comment" id="8a66b203db9245d8b3db5e40399f039e"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tyson, thank you for writing. IIRC, you only see line numbers for code compiled with debug information. That's why you normally don't see line numbers for .NET base class library methods, ASP.NET Core, etc. </p> <p> While you can deploy and run an application in Debug mode, I wouldn't. When you compile in Release mode, the compiler makes optimisations (such as in-lining) that it can't do when it needs to retain a map to a specific source. In Debug mode, you squander those optimisations. </p> <p> I've always deployed systems in Release mode. To make the examples realistic, I didn't include the line numbers in the stack trace, because in Release mode, they aren't going to be there. </p> <p> When you're troubleshooting by reproducing a logged exception as a unit test, on the other hand, it's natural to do so in Debug mode. </p> </div> <div class="comment-date">2019-11-18 16:39 UTC</div> </div> <div class="comment" id="e2c03feb26854496b7d0ec528ea3ae8d"> <div class="comment-author">Tyson Williams</div> <div class="comment-content"> <p> I just tried to verify this. I found no difference between compiling in Debug and Release mode, and no fundamental issue with in-lining. In all four cases (Debug vs Release and in-lining vs no in-lining), exception stack traces still included line numbers. When in-lining, the only difference was the absence of the in-lined method from the stack trace. </p> <p> Instead, I found that line numbers are included in exception stack traces if and only if the corresponding <a href="https://en.wikipedia.org/wiki/Program_database">.pdb</a> file is next to the .exe file when executed. 
</p> </div> <div class="comment-date">2019-11-18 19:07 UTC</div> </div> <div class="comment" id="2818246690a94b7b9c81ddfb47fc6d89"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tyson, that sounds correct. To be honest, I stopped putting effort into learning advanced debugging skills a long time ago. Early in my career we debugged by writing debug statements directly to output! Then I learned test-driven development, and with that disappeared most of the need to debug full systems. At the time I began working with systems that logged unhandled exceptions, the habit of writing small methods was already so entrenched that I never felt that I needed the line numbers. </p> <p> FWIW, more than five years ago, I decided to <a href="https://github.com/AutoFixture/AutoFixture/issues/65">publish symbols with AutoFixture</a>, but I never found it useful. </p> </div> <div class="comment-date">2019-11-19 7:18 UTC</div> </div> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Diamond rock https://blog.ploeh.dk/2019/11/11/diamond-rock 2019-11-11T09:15:00+00:00 Mark Seemann <div id="post"> <p> <em>A diamond kata implementation written in Rockstar</em> </p> <p> I've <a href="/2015/01/10/diamond-kata-with-fscheck">previously written about the diamond kata</a>, which has become one of my favourite programming exercises. I wanted to take the <a href="https://codewithrockstar.com">Rockstar</a> language out for a spin, and it seemed a good candidate problem. </p> <h3 id="ab491bf89ae945f48821fe7b324febeb"> Rockstar <a href="#ab491bf89ae945f48821fe7b324febeb" title="permalink">#</a> </h3> <p> If you're not aware of the Rockstar programming language, it started with a tweet from <a href="http://paulstovell.com">Paul Stovell</a>: <blockquote> <p> "To really confuse recruiters, someone should make a programming language called Rockstar." 
</p> <footer><cite><a href="https://twitter.com/paulstovell/status/1013960369465782273">Paul Stovell</a></cite></footer> </blockquote> This inspired <a href="http://www.dylanbeattie.net">Dylan Beattie</a> to create the Rockstar programming language. The language's <a href="https://codewithrockstar.com">landing page</a> already sports an <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a> implementation of the <a href="http://codingdojo.org/kata/FizzBuzz">FizzBuzz kata</a>, so I had to pick another exercise. The <a href="http://claysnow.co.uk/recycling-tests-in-tdd">diamond kata</a> was a natural choice for me. </p> <h3 id="c17e0b1d17f04d19b9e965ad531f10d9"> Lyrics <a href="#c17e0b1d17f04d19b9e965ad531f10d9" title="permalink">#</a> </h3> <p> I'll start with the final result. What follows here are the final lyrics to <em>Diamond rock</em>. As you'll see, it starts with a short prologue after which it settles into a repetitive pattern. I imagine that this might make it useful for audience participation, in the anthem rock style of e.g. <a href="https://en.wikipedia.org/wiki/We_Will_Rock_You">We Will Rock You</a>. </p> <p> After 25 repetitions the lyrics change. I haven't written music to any of them, but I imagine that this essentially transitions into another song, just like <em>We Will Rock You</em> traditionally moves into <a href="https://en.wikipedia.org/wiki/We_Are_the_Champions">We Are the Champions</a>. The style of the remaining lyrics does, however, suggest something musically more complex than a rock anthem. 
</p> <p> <pre>Your memory says&nbsp;&nbsp; The way says A Despair takes your hope The key is nothing Your courage is a tightrope Let it be without it If your hope is your courage Let the key be the way Build your courage up If your hope is your courage The key says B Build your courage up If your hope is your courage The key says C Build your courage up If your hope is your courage The key says D Build your courage up If your hope is your courage The key says E Build your courage up If your hope is your courage The key says F Build your courage up If your hope is your courage The key says G Build your courage up If your hope is your courage The key says H Build your courage up If your hope is your courage The key says I Build your courage up If your hope is your courage The key says J Build your courage up If your hope is your courage The key says K Build your courage up If your hope is your courage The key says L Build your courage up If your hope is your courage The key says M Build your courage up If your hope is your courage The key says N Build your courage up If your hope is your courage The key says O Build your courage up If your hope is your courage The key says P Build your courage up If your hope is your courage The key says Q Build your courage up If your hope is your courage The key says R Build your courage up If your hope is your courage The key says S Build your courage up If your hope is your courage The key says T Build your courage up If your hope is your courage The key says U Build your courage up If your hope is your courage The key says V Build your courage up If your hope is your courage The key says W Build your courage up If your hope is your courage The key says X Build your courage up If your hope is your courage The key says Y Build your courage up If your hope is your courage The key says Z Give back the key Dream takes a tightrope Your hope's start'n'stop While Despair taking your hope ain't a tightrope Build your hope up Give back 
your hope Raindrop is a wintercrop Light is a misanthrope Let it be without raindrop Darkness is a kaleidoscope Let it be without raindrop Diamond takes your hour, love, and your day If love is the way Let your sorrow be your hour without light Put your sorrow over darkness into flight Put your memory of flight into motion Put it with love, and motion into the ocean Whisper the ocean If love ain't the way Let ray be your day of darkness without light Let your courage be your hour without darkness, and ray Put your courage over darkness into the night Put your memory of ray into action Let satisfaction be your memory of the night Let alright be satisfaction with love, action, love, and satisfaction Shout alright Listen to the wind Let flight be the wind Let your heart be Dream taking flight Let your breath be your heart of darkness with light Your hope's stop'n'start While your hope is lower than your heart Let away be Despair taking your hope Diamond taking your breath, away, and your hope Build your hope up Renown is southbound While your hope is as high as renown Let away be Despair taking your hope Diamond taking your breath, away, and your hope Knock your hope down</pre> </p> <p> Not only do these look like lyrics, but it's also an executable program! </p> <h3 id="994b9a9d46dc467383c528a17dc99f65"> Execution <a href="#994b9a9d46dc467383c528a17dc99f65" title="permalink">#</a> </h3> <p> If you want to run the program, you can copy the text and paste it into <a href="https://codewithrockstar.com/online">the web-based Rockstar interpreter</a>; that's what I did. </p> <p> When you <em>Rock!</em> the lyrics, the interpreter will prompt you for an input. There's no instructions or input validation, but <em>the only valid input is the letters</em> <code>A</code> to <code>Z</code>, and <em>only in upper case</em>. If you type anything else, I don't know what'll happen, but most likely it'll just enter an infinite loop, and you'll have to reboot your computer. 
</p> <p> If you input, say, <code>E</code>, the output will be the expected diamond figure: </p> <p> <pre>&nbsp;&nbsp;&nbsp;&nbsp;A
&nbsp;&nbsp;&nbsp;B&nbsp;B
&nbsp;&nbsp;C&nbsp;&nbsp;&nbsp;C
&nbsp;D&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;D
E&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;E
&nbsp;D&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;D
&nbsp;&nbsp;C&nbsp;&nbsp;&nbsp;C
&nbsp;&nbsp;&nbsp;B&nbsp;B
&nbsp;&nbsp;&nbsp;&nbsp;A</pre> </p> <p> When you paste the code, be sure to include everything. There's significant whitespace in those lyrics; I'll explain later. </p> <h3 id="12f2c388f5c8499594a3bf90f37e07f4"> Readable code <a href="#12f2c388f5c8499594a3bf90f37e07f4" title="permalink">#</a> </h3> <p> As the Rockstar documentation strongly implies, <em>singability</em> is more important than readability. You can, however, write more readable Rockstar code, and that's what I started with: </p> <p> <pre>Space says&nbsp;&nbsp; LetterA says A GetLetter takes index If index is 0 Give back LetterA If index is 1 retVal says B Give back retVal If index is 2 retVal says C Give back retVal GetIndex takes letter Index is 0 While GetLetter taking Index ain't letter Build Index up Give back Index PrintLine takes width, l, lidx If l is LetterA Let completeSpaceCount be width minus 1 Let paddingCount be completeSpaceCount over 2 Let padding be Space times paddingCount Let line be padding plus l plus padding Say line Else Let internalSpaceSize be lidx times 2 minus 1 Let filler be Space times internalSpaceSize Let totalOuterPaddingSize be width minus 2, internalSpaceSize Let paddingSize be totalOuterPaddingSize over 2 Let padding be Space times paddingSize Let line be padding plus l, filler, l, padding Say line Listen to input Let idx be GetIndex taking input Let width be idx times 2 plus 1 Let counter be 0 While counter is lower than idx Let l be GetLetter taking counter PrintLine taking width, l, counter Build counter up While counter is as high as 0 Let l be GetLetter taking counter PrintLine taking width, l, counter Knock counter down</pre> </p> <p> This prototype only handled the input letters <code>A</code>, <code>B</code>, and <code>C</code>, but it was enough to verify that the algorithm worked. 
I've done the diamond kata several times before, so I only had to find the most imperative implementation on my hard drive. It wasn't too hard to translate to Rockstar. </p> <p> Although Rockstar supports mainstream quoted strings like <code>"A"</code>, <code>"B"</code>, and so on, you can see that I went straight for <em>poetic string literals</em>. Before I started persisting Rockstar code to a file, I experimented with the language using the online interpreter. I wanted the program to look as much like rock lyrics as I could, so I didn't want to have too many statements like <code>Shout "FizzBuzz!"</code> in my code. </p> <h3 id="1ae1c1c329894b438bc934097090901b"> Obscuring space <a href="#1ae1c1c329894b438bc934097090901b" title="permalink">#</a> </h3> <p> My first concern was whether I could obscure the space character. Using a poetic string literal, I could: </p> <p> <pre>Space says&nbsp;&nbsp;</pre> </p> <p> The rule for poetic string literals is that everything between <code>says&nbsp;</code> and the newline character becomes the value of the string variable. So there's an extra space after <code>says&nbsp;</code>! </p> <p> After I renamed all the variables and functions, that line became: </p> <p> <pre>Your memory says&nbsp;&nbsp;</pre> </p> <p> Perhaps it isn't an unprintable character, but it <em>is</em> unsingable. </p> <h3 id="d1ae4568cc104312a3c20fe8019ccb18"> No else <a href="#d1ae4568cc104312a3c20fe8019ccb18" title="permalink">#</a> </h3> <p> The keyword <code>Else</code> looks conspicuously like a programming construct, so I wanted to get rid of that as well. That was easy, because I could just invert the initial <code>if</code> condition: </p> <p> <pre>If l ain't LetterA</pre> </p> <p> This effectively switches between the two alternative code blocks. 
</p> <h3 id="cbaa9c218905453bb3df2155b9e9b822"> Obscuring letter indices <a href="#cbaa9c218905453bb3df2155b9e9b822" title="permalink">#</a> </h3> <p> I also wanted to obscure the incrementing index values <code>1</code>, <code>2</code>, <code>3</code>, etcetera. Since the indices are monotonically increasing, I realised that I could use a counter and increment it: </p> <p> <pre>number is 0 If index is number Let retVal be LetterA Build number up If index is number retVal says B Build number up If index is number retVal says C</pre> </p> <p> The function initialises <code>number</code> to <code>0</code> and assigns a value to <code>retVal</code> if the input <code>index</code> is also <code>0</code>. </p> <p> If not, it increments the <code>number</code> (so that it's now <code>1</code>) and again compares it to <code>index</code>. This sufficiently obscures the indices, but if there's a way to hide the letters of the alphabet, I'm not aware of it. </p> <p> After I renamed the variables, the code became: </p> <p> <pre>Your courage is a tightrope Let it be without it If your hope is your courage Let the key be the way Build your courage up If your hope is your courage The key says B Build your courage up If your hope is your courage The key says C</pre> </p> <p> There's one more line of code in the final lyrics, compared to the above snippet. The line <code>Let it be without it</code> has no corresponding line of code in the readable version. What's going on? </p> <h3 id="aef4cf50f4484327b37cd61995c74d99"> Obscuring numbers <a href="#aef4cf50f4484327b37cd61995c74d99" title="permalink">#</a> </h3> <p> Like poetic string literals, Rockstar also supports <em>poetic number literals</em>. Due to its modulo-ten-based system, however, I found it difficult to come up with a good ten-letter word that fit the song's lyrical theme. 
I <em>could</em> have done something like this to produce the number <code>0</code>: </p> <p> <pre>Your courage is barbershop</pre> </p> <p> or some other ten-letter word. My problem was that regardless of what I chose, it didn't sound good. Some article like <code>a</code> or <code>the</code> would sound better, but that would change the value of the poetic number literal. <code>a tightrope</code> is the number <em>19</em>, because <code>a</code> has one letter, and <code>tightrope</code> has nine. </p> <p> There's a simple way to produce <em>0</em> from any number: just subtract the number from itself. That's what <code>Let it be without it</code> does. I could also have written it as <code>Let your courage be without your courage</code>, but I chose to take advantage of Rockstar's <em>pronoun</em> feature instead. I'd been looking for an opportunity to include the phrase <a href="https://en.wikipedia.org/wiki/Let_It_Be_(Beatles_song)">Let It Be</a> ever since I learned about the <code>Let x be y</code> syntax. </p> <p> The following code snippet initialises the variable <code>Your courage</code> to <code>19</code>, but on the next line subtracts 19 from 19 and updates the variable so that its value is now <code>0</code>. </p> <p> <pre>Your courage is a tightrope Let it be without it</pre> </p> <p> I had the same problem with initialising the numbers <em>1</em> and <em>2</em>, so further down I resorted to similar tricks: </p> <p> <pre>Raindrop is a wintercrop Light is a misanthrope Let it be without raindrop Darkness is a kaleidoscope Let it be without raindrop </pre> </p> <p> Here I had the additional constraint that I wanted the words to rhyme. The rhymes are a continuation of the previous lines' <code>up</code> and <code>hope</code>, so I struggled to come up with a ten-letter word that rhymes with <code>up</code>; <code>wintercrop</code> was the best I could do. 
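</p> <p> To make the letter-counting rule concrete, here's a minimal Haskell sketch of how a poetic number literal could be turned into a number. It's an illustration, not the Rockstar interpreter's implementation: the name <code>poeticNumber</code> is hypothetical, and the sketch assumes that each word contributes one digit (its letter count modulo ten), that only alphabetic characters count, and that there's no decimal point involved. </p>

```haskell
import Data.Char (isAlpha)

-- Hypothetical helper, not part of any Rockstar tooling.
-- Each word contributes one digit: its number of letters modulo ten.
-- Assumptions: only alphabetic characters count (apostrophes are ignored),
-- and decimal points are not handled.
poeticNumber :: String -> Integer
poeticNumber = foldl appendDigit 0 . map wordDigit . words
  where
    -- A ten-letter word yields the digit 0, a nine-letter word the digit 9.
    wordDigit w = fromIntegral (length (filter isAlpha w) `mod` 10)
    -- Append the next digit to the number built so far.
    appendDigit acc d = acc * 10 + d
```

<p> Under those assumptions, <code>poeticNumber "barbershop"</code> evaluates to <em>0</em> and <code>poeticNumber "a tightrope"</code> to <em>19</em>, in line with the values quoted above. </p> <p>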
<code>a wintercrop</code> is <em>10</em>, and the strategy is to define <code>Light</code> and <code>Darkness</code> as <em>11</em> and <em>12</em>, and then subtract <em>10</em> from both. At the first occurrence of <code>Let it be without raindrop</code>, <code>it</code> refers to <code>Light</code>, whereas the second time <code>it</code> refers to <code>Darkness</code>. </p> <h3 id="46b60c0d25e54f9599e5bd40b7e88d80"> Lyrical theme <a href="#46b60c0d25e54f9599e5bd40b7e88d80" title="permalink">#</a> </h3> <p> Once I had figured out how to obscure strings and numbers, it was time to rename all the readable variables and function names into idiomatic Rockstar. </p> <p> At first, I thought that I'd pattern my lyrics after <a href="https://en.wikipedia.org/wiki/Shine_On_You_Crazy_Diamond">Shine On You Crazy Diamond</a>, but I soon ran into problems with the keyword <code>taking</code>. I found it difficult to find words that would naturally succeed <code>taking</code>. Some options I thought of were: <ul> <li>taking the bus</li> <li>taking a chance</li> <li>taking hold</li> <li>taking flight</li> <li>taking time</li> </ul> Some of these didn't work for various reasons. In Rockstar <code>times</code> is a keyword, and apparently <code>time</code> is reserved as well. At least, the online interpreter choked on it. </p> <p> <code>Taking a chance</code> sounded <a href="https://en.wikipedia.org/wiki/Take_a_Chance_on_Me">too much like ABBA</a>. <code>Taking hold</code> created the derived problem that I had to initialise and use a variable called <code>hold</code>, and I couldn't make that work. </p> <p> <code>Taking flight</code>, on the other hand, turned out to provide a fertile opening. </p> <p> I soon realised, though, that my choice of words pulled the lyrical theme away from idiomatic Rockstar vocabulary. While I do get the <a href="https://en.wikipedia.org/wiki/Livin%27_on_a_Prayer">Tommy and Gina</a> references, I didn't feel at home in that poetic universe. 
</p> <p> On the other hand, I thought that the words started to sound like <a href="https://en.wikipedia.org/wiki/Yes_(band)">Yes</a>. I've listened to a lot of Yes. The lyrics are the same kind of lame and vapid as what was taking form in my editor. I decided to go in that direction. </p> <p> Granted, this is no longer idiomatic Rockstar, since it's more <a href="https://en.wikipedia.org/wiki/Progressive_rock">prog rock</a> than <a href="https://en.wikipedia.org/wiki/Glam_metal">hair metal</a>. I invoke creative license. </p> <p> Soon I also conceived of the extra ambition that I wanted the verses to rhyme. Here, it proved fortunate that the form <code>let x be y</code> is interchangeable with the form <code>put y into x</code>. Some words, like <em>darkness</em>, are difficult to rhyme with, so it helps that you can hide them within a <code>put y into x</code> form. </p> <p> Over the hours(!) I worked on this, a theme started to emerge. I'm particularly fond of the repeated motifs like: </p> <p> <pre>Your hope's start'n'stop</pre> </p> <p> which rhymes with <code>up</code>, but then later it appears again as </p> <p> <pre>Your hope's stop'n'start</pre> </p> <p> which rhymes with <code>heart</code>. Both words, by the way, represent the number <em>0</em>, since there's ten letters when you ignore the single quotes. </p> <h3 id="234366a7cb1a4998813719845f3c08d7"> Conclusion <a href="#234366a7cb1a4998813719845f3c08d7" title="permalink">#</a> </h3> <p> I spent more time on this than I suppose I ought to have, but once I got started, it was hard to stop. I found the translation from readable code into 'idiomatic' Rockstar at least as difficult as writing working software. There's a lesson there, I believe. </p> <p> Rockstar is still a budding language, so I did miss a few features, chief among which would be <em>arrays</em>, but I'm not sure how one would make arrays sufficiently rock'n'roll. </p> <p> A unit testing framework would also be nice. 
</p> <p> If you liked this article, please <a href="https://www.linkedin.com/in/ploeh/">endorse my <em>Rockstar</em> skills on LinkedIn</a> so that we can <em>"confuse recruiters."</em> </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. The 80/24 rule https://blog.ploeh.dk/2019/11/04/the-80-24-rule 2019-11-04T06:51:00+00:00 Mark Seemann <div id="post"> <p> <em>Write small blocks of code. How small? Here's how small.</em> </p> <p> One of the most common questions I get is this: </p> <p> <em>If you could give just one advice to programmers, what would it be?</em> </p> <p> That's easy: </p> <p> <em>Write small blocks of code.</em> </p> <p> Small methods. Small functions. Small procedures. </p> <p> How small? </p> <h3 id="849d07675e3a4c5681641eb0c67dfafb"> Few lines of code <a href="#849d07675e3a4c5681641eb0c67dfafb" title="permalink">#</a> </h3> <p> You can't give a universally good answer to that question. Among other things, it depends on the programming language in question. Some languages are much denser than others. The densest language I've ever encountered is <a href="https://en.wikipedia.org/wiki/APL_(programming_language)">APL</a>. </p> <p> Most mainstream languages, however, seem to be verbose to approximately the same order of magnitude. My experience is mostly with C#, so I'll use that (and similar languages like Java) as a starting point. </p> <p> When I write C# code, I become uncomfortable when my method size approaches fifteen or twenty lines of code. C# is, however, a fairly wordy language, so it sometimes happens that I have to allow a method to grow larger. My limit is probably somewhere around 25 lines of code. </p> <p> That's an arbitrary number, but if I have to quote a number, it would be around that size. Since it's arbitrary anyway, let's make it <em>24</em>, for reasons that I'll explain later. 
</p> <p> The maximum line count of a C# (or Java, or JavaScript, etc.) method, then, should be 24. </p> <p> To repeat the point from before, this depends on the language. I'd consider a 24-line <a href="https://www.haskell.org">Haskell</a> or <a href="https://fsharp.org">F#</a> function to be so huge that if I received it as a pull request, I'd reject it <a href="/2015/01/15/10-tips-for-better-pull-requests">on the grounds of size</a> alone. </p> <h3 id="8506ebcba585459b9739d84a7bcad758"> Narrow line width <a href="#8506ebcba585459b9739d84a7bcad758" title="permalink">#</a> </h3> <p> Most languages allow for flexibility in layout. For example, C-based languages use the <code>;</code> character as a delimiter. This enables you to write more than one statement per line: </p> <p> <pre><span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">foo</span>&nbsp;=&nbsp;32;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">bar</span>&nbsp;=&nbsp;<span style="font-weight:bold;color:#1f377f;">foo</span>&nbsp;+&nbsp;10;&nbsp;<span style="color:#2b91af;">Console</span>.<span style="color:#74531f;">WriteLine</span>(<span style="font-weight:bold;color:#1f377f;">bar</span>);</pre> </p> <p> You could attempt to avoid the 24-line-height rule by writing wide lines. That would, however, defeat the purpose. </p> <p> The purpose of writing small methods is to nudge yourself towards writing readable code; code that fits in your brain. The smaller, the better. </p> <p> For completeness' sake, let's institute a maximum line width as well. If there's any accepted industry standard for maximum line width, it's 80 characters. I've used that maximum for years, and it's a good maximum. </p> <p> Like all other programmers, other people's code annoys me. The most common annoyance is that people write code that's too wide. 
</p> <p> This is probably because most programmers have drunk the Kool-Aid that bigger screens make you more productive. When you code on a big screen, you don't notice how wide your lines become. </p> <p> There are many scenarios where wide code is problematic: <ul> <li>When you're comparing changes to a file side-by-side. This often happens when you review pull requests. Now you have only half of your normal screen width.</li> <li>When you're looking at code on a smaller device.</li> <li>When you're getting old, or are otherwise visually impaired. After I turned 40, I discovered that I found it increasingly difficult to see small things. I still use a 10-point font for programming, but I foresee that this will not last much longer.</li> <li>When you're <a href="https://en.wikipedia.org/wiki/Mob_programming">mob programming</a> you're limited to the size of the shared screen.</li> <li>When you're sharing your screen via the web, for remote pair programming or similar.</li> <li>When you're presenting code at meetups, user groups, conferences, etc.</li> </ul> What most programmers need, I think, is just a <a href="https://en.wikipedia.org/wiki/Nudge_theory">nudge</a>. In Visual Studio, for example, you can install the <a href="https://marketplace.visualstudio.com/items?itemName=PaulHarrington.EditorGuidelines">Editor Guidelines</a> extension, which will display one or more vertical guidelines. You can configure it as you'd like, but I have mine set to 80 characters, and bright red: </p> <p> <img src="/content/binary/vertical-guideline-at-80-characters.png" alt="Screen shot of editor with code, showing red vertical line at 80 characters."> </p> <p> Notice the red dotted vertical line that cuts through <code>universe</code>. It tells me where the 80 character limit is. 
</p> <h3 id="d94a52940215425e9cb19492d9f51a41"> Terminal box <a href="#d94a52940215425e9cb19492d9f51a41" title="permalink">#</a> </h3> <p> The 80-character limit has a long and venerable history, but what about the 24-line limit? While both are, ultimately, arbitrary, both fit the size of the popular <a href="https://en.wikipedia.org/wiki/VT100">VT100</a> terminal, which had a display resolution of 80x24 characters. </p> <p> A box of 80x24 characters thus reproduces the size of an old terminal. Does this mean that I suggest that you should write programs on terminals? No, people always misunderstand this. That should be the maximum size of a method. On larger screens, you'd be able to see multiple small methods at once. For example, you could view a unit test and its target in a split screen configuration. </p> <p> The exact sizes are arbitrary, but I think that there's something fundamentally right about such continuity with the past. </p> <p> I've been using the 80-character mark as a soft limit for years. I tend to stay within it, but I occasionally decide to make my code a little wider. I haven't paid quite as much attention to the number of lines of my methods, but only for the reason that I know that I tend to write methods shorter than that. Both limits have served me well for years. 
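</p> <p> The box is easy to express as a check. The following is a minimal Haskell sketch, not something taken from an existing tool: the names <code>fitsTerminalBox</code>, <code>maxWidth</code>, and <code>maxHeight</code> are hypothetical, and it assumes that the method arrives as plain text where every character, indentation included, takes up one column (tabs aren't expanded). </p>

```haskell
-- Hypothetical limits, taken from the 80x24 box discussed above.
maxWidth, maxHeight :: Int
maxWidth = 80
maxHeight = 24

-- Does a block of source code fit inside the box?
-- Assumption: each character counts as exactly one column.
fitsTerminalBox :: String -> Bool
fitsTerminalBox code =
  length codeLines <= maxHeight && all ((<= maxWidth) . length) codeLines
  where
    codeLines = lines code
```

<p> Since both numbers are soft limits, a check like this is probably better suited to producing a warning than to failing a build. 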
</p> <h3 id="e5301bcefd8f444487906af03df293b0"> Example <a href="#e5301bcefd8f444487906af03df293b0" title="permalink">#</a> </h3> <p> Consider this example: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">ActionResult</span>&nbsp;<span style="font-weight:bold;color:#74531f;">Post</span>(<span style="color:#2b91af;">ReservationDto</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">dto</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">validationMsg</span>&nbsp;=&nbsp;<span style="color:#2b91af;">Validator</span>.<span style="color:#74531f;">Validate</span>(<span style="font-weight:bold;color:#1f377f;">dto</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(<span style="font-weight:bold;color:#1f377f;">validationMsg</span>&nbsp;!=&nbsp;<span style="color:#a31515;">&quot;&quot;</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="font-weight:bold;color:#74531f;">BadRequest</span>(<span style="font-weight:bold;color:#1f377f;">validationMsg</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span style="color:#2b91af;">Mapper</span>.<span style="color:#74531f;">Map</span>(<span style="font-weight:bold;color:#1f377f;">dto</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">reservations</span>&nbsp;=&nbsp;Repository.<span style="font-weight:bold;color:#74531f;">ReadReservations</span>(<span style="font-weight:bold;color:#1f377f;">reservation</span>.Date); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">accepted</span>&nbsp;=&nbsp;maîtreD.<span style="font-weight:bold;color:#74531f;">CanAccept</span>(<span 
style="font-weight:bold;color:#1f377f;">reservations</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(!<span style="font-weight:bold;color:#1f377f;">accepted</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="font-weight:bold;color:#74531f;">StatusCode</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">StatusCodes</span>.Status500InternalServerError, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;Couldn&#39;t&nbsp;accept.&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">id</span>&nbsp;=&nbsp;Repository.<span style="font-weight:bold;color:#74531f;">Create</span>(<span style="font-weight:bold;color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="font-weight:bold;color:#74531f;">Ok</span>(<span style="font-weight:bold;color:#1f377f;">id</span>); }</pre> </p> <p> This method is 18 lines long, which includes the method declaration, curly brackets and blank lines. It easily stays within the 80-character limit. Note that I've deliberately formatted the code so that it behaves. 
You can see it in this fragment: </p> <p> <pre><span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="font-weight:bold;color:#74531f;">StatusCode</span>( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">StatusCodes</span>.Status500InternalServerError, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;Couldn&#39;t&nbsp;accept.&quot;</span>);</pre> </p> <p> Most people write it like this: </p> <p> <pre><span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="font-weight:bold;color:#74531f;">StatusCode</span>(<span style="color:#2b91af;">StatusCodes</span>.Status500InternalServerError,&nbsp;<span style="color:#a31515;">&quot;Couldn&#39;t&nbsp;accept.&quot;</span>);</pre> </p> <p> That doesn't look bad, but I've seen much worse examples. </p> <p> Another key to writing small methods is to call other methods. The above <code>Post</code> method doesn't look like much, but significant functionality could be hiding behind <code>Validator.Validate</code>, <code>Repository.ReadReservations</code>, or <code>maîtreD.CanAccept</code>. I hope that you agree that each of these objects and methods are named well enough to give you an idea about their purpose. </p> <h3 id="ca4e5a8403244f93a95c2736cdaf7eee"> Code that fits in your brain <a href="#ca4e5a8403244f93a95c2736cdaf7eee" title="permalink">#</a> </h3> <p> As I describe in my <a href="https://cleancoders.com/episode/humane-code-real-episode-1/show">Humane Code</a> video, the human brain can only keep track of <a href="https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two">about seven things</a>. I think that this rule of thumb applies to the way we read and interpret code. If you need to understand and keep track of more than seven separate things at the same time, the code becomes harder to understand. </p> <p> This could explain why small methods are good. They're only good, however, if they're self-contained. 
When you look at a method like the above <code>Post</code> method, you'll be most effective if you don't need to have a deep understanding of how each of the dependencies works. If this is true, the method only juggles about five dependencies: <code>Validator</code>, <code>Mapper</code>, <code>Repository</code>, <code>maîtreD</code>, and its own base class (which provides the methods <code>BadRequest</code>, <code>StatusCode</code>, and <code>Ok</code>). Five dependencies is fewer than seven. </p> <p> Another way to evaluate the cognitive load of a method is to measure its <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">cyclomatic complexity</a>. The <code>Post</code> method's cyclomatic complexity is <em>3</em>, so that should be easily within the brain's capacity. </p> <p> These are all heuristics, so read this for inspiration, not as law. They've served me well for years, though. </p> <h3 id="1c2120e143784dde8e520026b67651d4"> Conclusion <a href="#1c2120e143784dde8e520026b67651d4" title="permalink">#</a> </h3> <p> You've probably heard about the <em>80/20 rule</em>, also known as the <a href="https://en.wikipedia.org/wiki/Pareto_principle">Pareto principle</a>. Perhaps the title led you to believe that this article was a misunderstanding. I admit that I went for an arresting title; perhaps a more proper name is the <em>80x24 rule</em>. </p> <p> The exact numbers can vary, but I've found a maximum method size of 80x24 characters to work well for C#. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="cbdf43b5551242efa330a163f01119ca"> <div class="comment-author">Jiehong</div> <div class="comment-content"> <p> As a matter of fact, <em>terminals</em> had 80 characters lines, because <a href="https://en.wikipedia.org/wiki/Punched_card#/media/File:FortranCardPROJ039.agr.jpg">IBM punch cards</a>, representing only 1 line, had 80 symbols (even though only the first 72 were used at first). 
However, I don't know why terminals settled for 24 lines! In Java, which is similar to C# in term of verbosity, Clean Code tend to push towards 20-lines long functions or less. One of the danger to make functions even smaller is that many more functions can create many indirections, and that becomes harder to keep track within our brains. </p> </div> <div class="comment-date">2019-11-04 11:13 UTC</div> </div> <div class="comment" id="7fc45f8fd537ba9907ad73daa2c85b52"> <div class="comment-author">Terrell</div> <div class="comment-content"> <p> Some additional terminal sizing history in Mike Hoye's recent similarly-named post: <a href="http://exple.tive.org/blarg/2019/10/23/80x25/">http://exple.tive.org/blarg/2019/10/23/80x25/</a> Spoiler - Banknotes! </p> </div> <div class="comment-date">2019-11-04 13:25 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. A basic Haskell solution to the robot journeys coding exercise https://blog.ploeh.dk/2019/10/28/a-basic-haskell-solution-to-the-robot-journeys-coding-exercise 2019-10-28T04:34:00+00:00 Mark Seemann <div id="post"> <p> <em>This article shows an idiomatic, yet beginner-friendly Haskell solution to a coding exercise.</em> </p> <p> <a href="https://twitter.com/mikehadlow/status/1186332184086495233">Mike Hadlow tweeted</a> a coding exercise that involves parsing and evaluating instruction sets. <a href="https://www.haskell.org">Haskell</a> excels at such problems, so I decided to give it a go. Since this was only an exercise for the fun of it, I didn't want to set up a complete Haskell project. Rather, I wanted to write one or two <code>.hs</code> files that I could interact with via <em>GHCi</em>. This means no lenses, monad transformers, or other fancy libraries. </p> <p> Hopefully, this makes the code friendly to Haskell beginners. 
It shows what I consider <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a>, but basic Haskell, solving a problem of moderate difficulty. </p> <h3 id="79b308125b2d4bcc9df79f7c6569a6e9"> The problem <a href="#79b308125b2d4bcc9df79f7c6569a6e9" title="permalink">#</a> </h3> <p> <a href="https://github.com/mikehadlow/Journeys">Mike Hadlow has a detailed description of the exercise</a>, but in short, you're given a file with a set of instructions that look like this: </p> <p> <pre>1 1 E RFRFRFRF 1 1 E</pre> </p> <p> The first and last lines describe the position and orientation of a robot. The first line, for example, describes a robot at position (1, 1) facing east. A robot can face in one of the four normal directions of the map: north, east, south, and west. </p> <p> The first line gives the robot's start position, and the last line the <em>expected</em> end position. </p> <p> The middle line is a set of instructions to the robot. It can turn left or right, or move forward. </p> <p> The exercise is to evaluate whether journeys are valid; that is, whether the robot's end position matches the expected end position if it follows the commands. </p> <h3 id="cf43d781d49545c38e30f487867e904f"> Imports <a href="#cf43d781d49545c38e30f487867e904f" title="permalink">#</a> </h3> <p> I managed to solve the exercise with a single <code>Main.hs</code> file. 
Here's the module declaration and the required imports: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Main&nbsp;<span style="color:blue;">where</span> <span style="color:blue;">import</span>&nbsp;Data.Foldable <span style="color:blue;">import</span>&nbsp;Data.Ord <span style="color:blue;">import</span>&nbsp;Text.Read&nbsp;(<span style="color:#2b91af;">readPrec</span>) <span style="color:blue;">import</span>&nbsp;Text.ParserCombinators.ReadP <span style="color:blue;">import</span>&nbsp;Text.ParserCombinators.ReadPrec&nbsp;(<span style="color:#2b91af;">readPrec_to_P</span>,&nbsp;<span style="color:#2b91af;">minPrec</span>)</pre> </p> <p> These imports are only required to support parsing of input. Once parsed, you can evaluate each journey using nothing but the functions available in the standard <code>Prelude</code>. </p> <h3 id="8329bba327f54dbfb8474f9102ba8662"> Types <a href="#8329bba327f54dbfb8474f9102ba8662" title="permalink">#</a> </h3> <p> Haskell is a statically typed language, so it often pays to define some types. Granted, the exercise hardly warrants all of these types, but as an example of idiomatic Haskell, I think that this is still good practice. After all, Haskell types are easy to declare. Often, they are one-liners: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;Direction&nbsp;=&nbsp;North&nbsp;|&nbsp;East&nbsp;|&nbsp;South&nbsp;|&nbsp;West&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> The <code>Direction</code> type enumerates the four corners of the world. 
</p> <p> <pre><span style="color:blue;">data</span>&nbsp;Robot&nbsp;=&nbsp;Robot&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;robotPosition&nbsp;::&nbsp;(Integer,&nbsp;Integer) &nbsp;&nbsp;,&nbsp;robotDirection&nbsp;::&nbsp;Direction&nbsp;} &nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> The <code>Robot</code> record type represents the state of a robot: its position and the direction it faces. </p> <p> You'll also need to enumerate the commands that you can give a robot: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;Command&nbsp;=&nbsp;TurnLeft&nbsp;|&nbsp;TurnRight&nbsp;|&nbsp;MoveForward&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> Finally, you can also define a type for a <code>Journey</code>: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;Journey&nbsp;=&nbsp;Journey&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;journeyStart&nbsp;::&nbsp;Robot &nbsp;&nbsp;,&nbsp;journeyCommands&nbsp;::&nbsp;[Command] &nbsp;&nbsp;,&nbsp;journeyEnd&nbsp;::&nbsp;Robot&nbsp;} &nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> These are all the types required for solving the exercise. </p> <h3 id="5ada3f24aa4e4858892855ae076f457b"> Parsing <a href="#5ada3f24aa4e4858892855ae076f457b" title="permalink">#</a> </h3> <p> The format of the input file is simple enough that parsing it could be done in an ad-hoc fashion using <code>lines</code>, <code>words</code>, <code>read</code>, and a few other low-level functions. While the format barely warrants the use of parser combinators, I'll still use some to showcase the power of that approach. 
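</p> <p> For comparison, the ad-hoc alternative might look something like the following sketch. It isn't part of the solution described in this article: the name <code>parseRobotAdHoc</code> is hypothetical, it uses <code>readMaybe</code> from <code>Text.Read</code> instead of the partial <code>read</code>, and, to stay self-contained, it returns a plain tuple with the direction as a raw character. </p>

```haskell
import Text.Read (readMaybe)

-- Hypothetical ad-hoc parser for a robot line such as "1 1 E".
-- Returns the position and the raw direction character, or Nothing
-- if the line doesn't consist of two integers and one of N, E, S, W.
parseRobotAdHoc :: String -> Maybe ((Integer, Integer), Char)
parseRobotAdHoc line =
  case words line of
    [xs, ys, [d]] | d `elem` "NESW" -> do
      x <- readMaybe xs
      y <- readMaybe ys
      return ((x, y), d)
    _ -> Nothing
```

<p> This works for a format this small, but each new rule tends to add another pattern match or guard, whereas parser combinators compose. 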
</p> <p> Since one of my goals is to implement the functionality using a single <code>.hs</code> file, I can't pull in external parser combinator libraries. Instead, I'll use the built-in <code>ReadP</code> module, which I've often found sufficient to parse files like the present exercise input file. </p> <p> First, you're going to have to be able to parse numbers, which can be done using the <code>Read</code> type class. You'll need, however, to be able to compose <code>Integer</code> parsers with other <code>ReadP</code> parsers. </p> <p> <pre><span style="color:#2b91af;">parseRead</span>&nbsp;::&nbsp;<span style="color:blue;">Read</span>&nbsp;a&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:blue;">ReadP</span>&nbsp;a parseRead&nbsp;=&nbsp;readPrec_to_P&nbsp;readPrec&nbsp;minPrec</pre> </p> <p> This turns every <code>Read</code> instance value into a <code>ReadP</code> value. (I admit that I wasn't sure which precedence number to use, but <code>minPrec</code> seems to work.) </p> <p> Next, you need a parser for <code>Direction</code> values: </p> <p> <pre><span style="color:#2b91af;">parseDirection</span>&nbsp;::&nbsp;<span style="color:blue;">ReadP</span>&nbsp;<span style="color:blue;">Direction</span> parseDirection&nbsp;= &nbsp;&nbsp;choice&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;char&nbsp;<span style="color:#a31515;">&#39;N&#39;</span>&nbsp;&gt;&gt;&nbsp;<span style="color:blue;">return</span>&nbsp;North, &nbsp;&nbsp;&nbsp;&nbsp;char&nbsp;<span style="color:#a31515;">&#39;E&#39;</span>&nbsp;&gt;&gt;&nbsp;<span style="color:blue;">return</span>&nbsp;East, &nbsp;&nbsp;&nbsp;&nbsp;char&nbsp;<span style="color:#a31515;">&#39;S&#39;</span>&nbsp;&gt;&gt;&nbsp;<span style="color:blue;">return</span>&nbsp;South, &nbsp;&nbsp;&nbsp;&nbsp;char&nbsp;<span style="color:#a31515;">&#39;W&#39;</span>&nbsp;&gt;&gt;&nbsp;<span style="color:blue;">return</span>&nbsp;West&nbsp;]</pre> </p> <p> Notice how declarative this looks. 
The <code>choice</code> function combines a list of other parsers. When an individual parser in that list encounters the <code>'N'</code> character, it'll parse it as <code>North</code>, <code>'E'</code> as <code>East</code>, and so on. </p> <p> You can now parse an entire <code>Robot</code> using the <code>Applicative</code> <code>&lt;*&gt;</code> and <code>&lt;*</code> operators. </p> <p> <pre><span style="color:#2b91af;">parseRobot</span>&nbsp;::&nbsp;<span style="color:blue;">ReadP</span>&nbsp;<span style="color:blue;">Robot</span>
parseRobot&nbsp;=
&nbsp;&nbsp;(\x&nbsp;y&nbsp;d&nbsp;-&gt;&nbsp;Robot&nbsp;(x,&nbsp;y)&nbsp;d)&nbsp;&lt;$&gt;
&nbsp;&nbsp;(parseRead&nbsp;&lt;*&nbsp;char&nbsp;<span style="color:#a31515;">&#39;&nbsp;&#39;</span>)&nbsp;&lt;*&gt;
&nbsp;&nbsp;(parseRead&nbsp;&lt;*&nbsp;char&nbsp;<span style="color:#a31515;">&#39;&nbsp;&#39;</span>)&nbsp;&lt;*&gt;
&nbsp;&nbsp;&nbsp;parseDirection</pre> </p> <p> The <code>&lt;*&gt;</code> operator combines two parsers by using the output of both of them, whereas the <code>&lt;*</code> operator combines two parsers by running both of them, but discarding the output of the right-hand parser. A good mnemonic is that the operator points to the parser that produces an output. Here, the <code>parseRobot</code> function uses the <code>&lt;*</code> operator to require that each number is followed by a space. The space, however, is just a delimiter, so you throw it away. </p> <p> <code>parseRead</code> parses any <code>Read</code> instance. Here, the <code>parseRobot</code> function uses it to parse each <code>Integer</code> in a robot's position. It also uses <code>parseDirection</code> to parse the robot's direction.
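</p> <p> If the two operators still seem abstract, the following stand-alone sketch may help. It parses two numbers separated by a space into a pair. The names <code>digits</code>, <code>pair</code>, and <code>parseAll</code> are hypothetical, and <code>digits</code> uses <code>munch1</code> rather than the <code>parseRead</code> function above, only to keep the example small. </p>

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP, char, munch1, readP_to_S)

-- Hypothetical helper: one or more digits, read as an Integer.
digits :: ReadP Integer
digits = read <$> munch1 isDigit

-- (<*) keeps the number and drops the space;
-- (<*>) feeds both numbers to the pair constructor.
pair :: ReadP (Integer, Integer)
pair = (,) <$> (digits <* char ' ') <*> digits

-- Keep only the parses that consume the entire input.
parseAll :: ReadP a -> String -> [a]
parseAll p s = [x | (x, "") <- readP_to_S p s]
```

<p> <code>parseAll pair "12 34"</code> should evaluate to <code>[(12,34)]</code>. The space is required by the parser, but it doesn't show up in the result. Without the space, as in <code>"12-34"</code>, the result is the empty list.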
</p> <p> Similar to how you can parse directions, you can also parse the commands: </p> <p> <pre><span style="color:#2b91af;">parseCommand</span>&nbsp;::&nbsp;<span style="color:blue;">ReadP</span>&nbsp;<span style="color:blue;">Command</span>
parseCommand&nbsp;=
&nbsp;&nbsp;choice&nbsp;[
&nbsp;&nbsp;&nbsp;&nbsp;char&nbsp;<span style="color:#a31515;">&#39;L&#39;</span>&nbsp;&gt;&gt;&nbsp;<span style="color:blue;">return</span>&nbsp;TurnLeft,
&nbsp;&nbsp;&nbsp;&nbsp;char&nbsp;<span style="color:#a31515;">&#39;R&#39;</span>&nbsp;&gt;&gt;&nbsp;<span style="color:blue;">return</span>&nbsp;TurnRight,
&nbsp;&nbsp;&nbsp;&nbsp;char&nbsp;<span style="color:#a31515;">&#39;F&#39;</span>&nbsp;&gt;&gt;&nbsp;<span style="color:blue;">return</span>&nbsp;MoveForward]</pre> </p> <p> Likewise, similar to how you parse a single robot, you can now parse a journey: </p> <p> <pre><span style="color:#2b91af;">parseJourney</span>&nbsp;::&nbsp;<span style="color:blue;">ReadP</span>&nbsp;<span style="color:blue;">Journey</span>
parseJourney&nbsp;=
&nbsp;&nbsp;Journey&nbsp;&lt;$&gt;
&nbsp;&nbsp;(parseRobot&nbsp;&lt;*&nbsp;string&nbsp;<span style="color:#a31515;">&quot;\n&quot;</span>)&nbsp;&lt;*&gt;
&nbsp;&nbsp;(many&nbsp;parseCommand&nbsp;&lt;*&nbsp;string&nbsp;<span style="color:#a31515;">&quot;\n&quot;</span>)&nbsp;&lt;*&gt;
&nbsp;&nbsp;&nbsp;parseRobot</pre> </p> <p> The only new element compared to <code>parseRobot</code> is the use of the <code>many</code> parser combinator, which looks for zero, one, or many <code>Command</code> values.
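</p> <p> To see what 'zero, one, or many' means in practice, you can run <code>many</code> on its own. <code>readP_to_S</code> runs a <code>ReadP</code> parser and returns every alternative as a pair of the parse result and the remaining input. The following is only a sketch; <code>forwards</code> and <code>alternatives</code> are hypothetical names that aren't part of the solution. </p>

```haskell
import Text.ParserCombinators.ReadP (ReadP, char, many, readP_to_S)

-- Zero or more 'F' characters.
forwards :: ReadP String
forwards = many (char 'F')

-- Every alternative that ReadP reports for the input "FF",
-- each paired with the part of the input it left unconsumed.
alternatives :: [(String, String)]
alternatives = readP_to_S forwards "FF"
```

<p> Evaluating <code>alternatives</code> in GHCi should list three results, one for each number of <code>'F'</code> characters that <code>many</code> may consume: none, one, or both. Only one of them leaves no input behind.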
</p> <p> This gives you a way to parse a complete journey, but the input file contains many of those, separated by newlines and other whitespace: </p> <p> <pre><span style="color:#2b91af;">parseJourneys</span>&nbsp;::&nbsp;<span style="color:blue;">ReadP</span>&nbsp;[<span style="color:blue;">Journey</span>]
parseJourneys&nbsp;=&nbsp;parseJourney&nbsp;`sepBy`&nbsp;skipSpaces</pre> </p> <p> Finally, you can parse a multi-line string into a list of journeys: </p> <p> <pre><span style="color:#2b91af;">parseInput</span>&nbsp;::&nbsp;<span style="color:#2b91af;">String</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;[<span style="color:blue;">Journey</span>]
parseInput&nbsp;=&nbsp;<span style="color:blue;">fst</span>&nbsp;.&nbsp;minimumBy&nbsp;(comparing&nbsp;<span style="color:blue;">snd</span>)&nbsp;.&nbsp;readP_to_S&nbsp;parseJourneys</pre> </p> <p> When you run <code>readP_to_S</code>, it'll produce a list of alternatives, as there's more than one way to interpret the file according to <code>parseJourneys</code>. Each alternative is presented as a tuple of the parse result and the remaining (or unconsumed) string. I'm after the alternative that consumes as much of the input file as possible (which turns out to be all of it), so I use <code>minimumBy</code> to find the tuple that has the smallest second element. Then I return the first element of that tuple. </p> <p> Play around with <code>readP_to_S parseJourneys</code> in GHCi if you want all the details. </p> <h3 id="34135daae9634db2885e28382760d1fd"> Evaluation <a href="#34135daae9634db2885e28382760d1fd" title="permalink">#</a> </h3> <p> Haskell beginners may still find operators like <code>&lt;*&gt;</code> cryptic, but they're essential to parser combinators. Evaluation of the journeys is, in comparison, simple.
</p> <p> You can start by defining a function to turn right: </p> <p> <pre><span style="color:#2b91af;">turnRight</span>&nbsp;::&nbsp;<span style="color:blue;">Robot</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Robot</span>
turnRight&nbsp;r@(Robot&nbsp;_&nbsp;North)&nbsp;=&nbsp;r&nbsp;{&nbsp;robotDirection&nbsp;=&nbsp;East&nbsp;}
turnRight&nbsp;r@(Robot&nbsp;_&nbsp;&nbsp;East)&nbsp;=&nbsp;r&nbsp;{&nbsp;robotDirection&nbsp;=&nbsp;South&nbsp;}
turnRight&nbsp;r@(Robot&nbsp;_&nbsp;South)&nbsp;=&nbsp;r&nbsp;{&nbsp;robotDirection&nbsp;=&nbsp;West&nbsp;}
turnRight&nbsp;r@(Robot&nbsp;_&nbsp;&nbsp;West)&nbsp;=&nbsp;r&nbsp;{&nbsp;robotDirection&nbsp;=&nbsp;North&nbsp;}</pre> </p> <p> There's more than one way to write a function that rotates one direction to the right, but I chose the one that I found most readable. It favours clarity over brevity by relying on simple pattern matching. I hope that it's easy to understand for Haskell beginners, and perhaps even for people who haven't seen Haskell code before.
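</p> <p> As an illustration of one of those other ways, here's a terser sketch that works on the direction alone. It assumes that the constructors are declared in clockwise order and that <code>Direction</code> derives <code>Enum</code> and <code>Bounded</code>, which the type used in this article may not do; <code>rotateRight</code> is a hypothetical name. </p>

```haskell
-- Assumption: constructors are listed clockwise, and the type
-- derives Enum and Bounded (not necessarily true of the article's type).
data Direction = North | East | South | West
  deriving (Eq, Show, Enum, Bounded)

-- Step to the next constructor, wrapping around after the last one.
rotateRight :: Direction -> Direction
rotateRight d
  | d == maxBound = minBound
  | otherwise     = succ d
```

<p> <code>map rotateRight [North, East, South, West]</code> should evaluate to <code>[East,South,West,North]</code>. It's shorter, but it silently depends on the order of the constructors, which is one reason to prefer the explicit pattern match.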
</p> <p> The function to turn left uses the same structure: </p> <p> <pre><span style="color:#2b91af;">turnLeft</span>&nbsp;::&nbsp;<span style="color:blue;">Robot</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Robot</span>
turnLeft&nbsp;r@(Robot&nbsp;_&nbsp;North)&nbsp;=&nbsp;r&nbsp;{&nbsp;robotDirection&nbsp;=&nbsp;West&nbsp;}
turnLeft&nbsp;r@(Robot&nbsp;_&nbsp;&nbsp;West)&nbsp;=&nbsp;r&nbsp;{&nbsp;robotDirection&nbsp;=&nbsp;South&nbsp;}
turnLeft&nbsp;r@(Robot&nbsp;_&nbsp;South)&nbsp;=&nbsp;r&nbsp;{&nbsp;robotDirection&nbsp;=&nbsp;East&nbsp;}
turnLeft&nbsp;r@(Robot&nbsp;_&nbsp;&nbsp;East)&nbsp;=&nbsp;r&nbsp;{&nbsp;robotDirection&nbsp;=&nbsp;North&nbsp;}</pre> </p> <p> The last command you need to implement is moving forward: </p> <p> <pre><span style="color:#2b91af;">moveForward</span>&nbsp;::&nbsp;<span style="color:blue;">Robot</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Robot</span>
moveForward&nbsp;(Robot&nbsp;(x,&nbsp;y)&nbsp;North)&nbsp;=&nbsp;Robot&nbsp;(x,&nbsp;y&nbsp;+&nbsp;1)&nbsp;North
moveForward&nbsp;(Robot&nbsp;(x,&nbsp;y)&nbsp;&nbsp;East)&nbsp;=&nbsp;Robot&nbsp;(x&nbsp;+&nbsp;1,&nbsp;y)&nbsp;East
moveForward&nbsp;(Robot&nbsp;(x,&nbsp;y)&nbsp;South)&nbsp;=&nbsp;Robot&nbsp;(x,&nbsp;y&nbsp;-&nbsp;1)&nbsp;South
moveForward&nbsp;(Robot&nbsp;(x,&nbsp;y)&nbsp;&nbsp;West)&nbsp;=&nbsp;Robot&nbsp;(x&nbsp;-&nbsp;1,&nbsp;y)&nbsp;West</pre> </p> <p> The <code>moveForward</code> function also pattern-matches on the direction the robot is facing, this time to increment or decrement the <code>x</code> or <code>y</code> coordinate as appropriate.
</p> <p> You can now evaluate all three commands: </p> <p> <pre><span style="color:#2b91af;">evalCommand</span>&nbsp;::&nbsp;<span style="color:blue;">Command</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Robot</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Robot</span>
evalCommand&nbsp;&nbsp;&nbsp;TurnRight&nbsp;=&nbsp;turnRight
evalCommand&nbsp;&nbsp;&nbsp;&nbsp;TurnLeft&nbsp;=&nbsp;turnLeft
evalCommand&nbsp;MoveForward&nbsp;=&nbsp;moveForward</pre> </p> <p> The <code>evalCommand</code> pattern-matches on all three <code>Command</code> cases and returns the appropriate function for each. </p> <p> You can now evaluate whether a <code>Journey</code> is valid: </p> <p> <pre><span style="color:#2b91af;">isJourneyValid</span>&nbsp;::&nbsp;<span style="color:blue;">Journey</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Bool</span>
isJourneyValid&nbsp;(Journey&nbsp;s&nbsp;cs&nbsp;e)&nbsp;=&nbsp;<span style="color:blue;">foldl</span>&nbsp;(<span style="color:blue;">flip</span>&nbsp;evalCommand)&nbsp;s&nbsp;cs&nbsp;==&nbsp;e</pre> </p> <p> The <code>isJourneyValid</code> function pattern-matches the constituent values out of <code>Journey</code>. I named the <code>journeyStart</code> value <code>s</code> (for <em>start</em>), the <code>journeyCommands</code> value <code>cs</code> (for <em>commands</em>), and the <code>journeyEnd</code> value <code>e</code> (for <em>end</em>). </p> <p> The <code>evalCommand</code> function evaluates a single <code>Command</code>, but a <code>Journey</code> contains many commands. You'll need to evaluate the first command to find the position from which you evaluate the second command, and so on. Imperative programmers would use a <em>for loop</em> for something like that, but in functional programming, a <em>fold</em>, in this case from the left, is how it's done.
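</p> <p> If folds are new to you, a smaller model of the same idea may be easier to follow. The sketch below replaces the robot with a single <code>Integer</code> position on a line. <code>Step</code>, <code>evalStep</code>, <code>run</code>, and <code>states</code> are all hypothetical names, chosen only to mirror the shape of <code>foldl (flip evalCommand) s cs</code>. </p>

```haskell
-- Hypothetical one-dimensional stand-in for Command.
data Step = Forth | Back deriving (Eq, Show)

-- Same argument order as evalCommand: the command first, then the state.
evalStep :: Step -> Integer -> Integer
evalStep Forth = (+ 1)
evalStep Back  = subtract 1

-- The fold threads the state through every step, from left to right.
run :: Integer -> [Step] -> Integer
run = foldl (flip evalStep)

-- scanl has the same shape, but also keeps every intermediate state.
states :: Integer -> [Step] -> [Integer]
states = scanl (flip evalStep)
```

<p> <code>run 0 [Forth, Forth, Back]</code> should evaluate to <code>1</code>, and <code>states 0 [Forth, Forth, Back]</code> to <code>[0,1,2,1]</code>. The <code>flip</code> is there because <code>foldl</code> passes the accumulated state first and the list element second, whereas <code>evalStep</code>, like <code>evalCommand</code>, takes the command first.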
</p> <p> <code>foldl</code> requires you to supply an initial state <code>s</code> as well as the list of commands <code>cs</code>. The entire <code>foldl</code> expression produces a final <code>Robot</code> state that you can compare against the expected end state <code>e</code>. </p> <h3 id="6267cc3a03594d0fb21cd7cc61430eb0"> Execution <a href="#6267cc3a03594d0fb21cd7cc61430eb0" title="permalink">#</a> </h3> <p> Load the input file, parse it, and evaluate each journey in the <code>main</code> function: </p> <p> <pre><span style="color:#2b91af;">main</span>&nbsp;::&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;()
main&nbsp;=&nbsp;<span style="color:blue;">do</span>
&nbsp;&nbsp;input&nbsp;&lt;-&nbsp;parseInput&nbsp;&lt;$&gt;&nbsp;<span style="color:blue;">readFile</span>&nbsp;<span style="color:#a31515;">&quot;input.txt&quot;</span>
&nbsp;&nbsp;<span style="color:blue;">mapM_</span>&nbsp;<span style="color:blue;">print</span>&nbsp;$&nbsp;isJourneyValid&nbsp;&lt;$&gt;&nbsp;input</pre> </p> <p> I just load the <code>Main.hs</code> file in GHCi and run the <code>main</code> function: </p> <p> <pre>Prelude&gt; :load Main.hs
[1 of 1] Compiling Main ( Main.hs, interpreted )
Ok, one module loaded.
*Main&gt; main
True
True
True</pre> </p> <p> I used the same input file as Mike Hadlow, and it turns out that all journeys are valid. That's not what I'd expected from an exercise like this, so I cloned and ran Mike's solution as well, and it seems that it arrives at the same result. </p> <h3 id="683ea9809a774338b16aed1ad41e1984"> Conclusion <a href="#683ea9809a774338b16aed1ad41e1984" title="permalink">#</a> </h3> <p> Haskell is a great language for small coding exercises that require parsing and interpretation. In this article, I demonstrated one solution to the <em>robot journeys</em> coding exercise. My goal was to show some beginner-friendly, but still idiomatic Haskell code.
</p> <p> Granted, the use of parser combinators is on the verge of being overkill, but I wanted to show an example; Haskell examples are scarce, so I hope it's helpful. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. A red-green-refactor checklist https://blog.ploeh.dk/2019/10/21/a-red-green-refactor-checklist 2019-10-21T06:49:00+00:00 Mark Seemann <div id="post"> <p> <em>A simple read-do checklist for test-driven development.</em> </p> <p> I recently read <a href="https://amzn.to/35Wk5yD">The Checklist Manifesto</a>, a book about the power of checklists. That may sound off-putting and tedious, but I actually <a href="https://www.goodreads.com/review/show/2949987528">found it inspiring</a>. It explains how checklists empower skilled professionals to focus on difficult problems, while preventing avoidable mistakes. </p> <p> Since I read the book with the intent to see if there were ideas that we could apply in software development, I thought about checklists one might create for software development. Possibly the simplest checklist is one that describes the <em>red-green-refactor</em> cycle of test-driven development. </p> <h3 id="530e04dfea574602952023fa218d8a7c"> Types of checklists <a href="#530e04dfea574602952023fa218d8a7c" title="permalink">#</a> </h3> <p> As the book describes, there are basically two types of checklists: <ul> <li><strong>Do-confirm.</strong> With such a checklist, you perform a set of tasks, and then subsequently, at a sufficient <em>pause point</em>, go through the checklist to verify that you remembered to perform all the tasks on the list.</li> <li><strong>Read-do.</strong> With this type of checklist, you read each item for instructions and then perform the task. Only when you've performed the task do you move on to the next item on the list.</li> </ul> I find it most intuitive to describe the red-green-refactor cycle as a <em>read-do</em> list.
I did, however, find it expedient to include a <em>do-confirm</em> sub-list for one of the overall steps. </p> <p> This list is, I think, mostly useful if you're still learning test-driven development. It can be easily internalised. As such, I offer this for inspiration, and as a learning aid. </p> <h3 id="05b38ebc9c0c419b9a146be976578bd2"> Red-green-refactor checklist <a href="#05b38ebc9c0c419b9a146be976578bd2" title="permalink">#</a> </h3> <p> Read each of the steps in the list and perform the task. <ol> <li>Write a failing test. <ul> <li>Did you run the test?</li> <li>Did it fail?</li> <li>Did it fail because of an assertion?</li> <li>Did it fail because of the <em>last</em> assertion?</li> </ul> </li> <li>Make all tests pass by doing the simplest thing that could possibly work.</li> <li>Consider the resulting code. Can it be improved? If so, do it, but make sure that all tests still pass.</li> <li>Repeat</li> </ol> Perhaps the most value this checklist provides isn't so much the overall <em>read-do</em> list, but rather the subordinate <em>do-confirm</em> list associated with the first step. </p> <p> I regularly see people write failing tests as an initial step. The reason the test fails, however, is because the implementation throws an exception. </p> <h3 id="24a066fa0b9b401687d47b92473d63d0"> Improperly failing tests <a href="#24a066fa0b9b401687d47b92473d63d0" title="permalink">#</a> </h3> <p> Consider, as an example, the first test you might write when doing the <a href="https://en.wikipedia.org/wiki/Fizz_buzz">FizzBuzz</a> kata. 
</p> <p> <pre>[<span style="color:#2b91af;">Fact</span>]
<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">One</span>()
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#2b91af;">FizzBuzz</span>.<span style="color:#74531f;">Convert</span>(1);
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Equal</span>(<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;<span style="color:#1f377f;">actual</span>);
}</pre> </p> <p> I wrote this test first (i.e. before the 'production' code) and used Visual Studio's refactoring tools to generate the implied type and method. </p> <p> When I run the test, it fails. </p> <p> Further investigation, however, reveals that the test fails when <code>Convert</code> is called: </p> <p> <pre>Ploeh.Katas.FizzBuzzKata.FizzBuzzTests.One
  Source: FizzBuzzTests.cs line: 11
  Duration: 8 ms

  Message:
    System.NotImplementedException : The method or operation is not implemented.
  Stack Trace:
    at FizzBuzz.Convert(Int32 i) in FizzBuzz.cs line: 9
    at FizzBuzzTests.One() in FizzBuzzTests.cs line: 13
</pre> </p> <p> This is hardly surprising, since this is the current 'implementation': </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="color:#74531f;">Convert</span>(<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">i</span>)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">NotImplementedException</span>();
}</pre> </p> <p> This is what the subordinate <em>do-confirm</em> checklist is for. Did the test fail because of an assertion? In this case, the answer is no.
</p> <p> This means that you're not yet done with the <em>red</em> phase. </p> <h3 id="97295029364d4cd7ace15be9c9f8dc64"> Properly failing tests <a href="#97295029364d4cd7ace15be9c9f8dc64" title="permalink">#</a> </h3> <p> You can address the issue by changing the <code>Convert</code> method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="color:#74531f;">Convert</span>(<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">i</span>)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;<span style="color:#a31515;">&quot;&quot;</span>;
}</pre> </p> <p> This causes the test to fail because of an assertion: </p> <p> <pre>Ploeh.Katas.FizzBuzzKata.FizzBuzzTests.One
  Source: FizzBuzzTests.cs line: 11
  Duration: 13 ms

  Message:
    Assert.Equal() Failure
              ↓ (pos 0)
    Expected: 1
    Actual:
              ↑ (pos 0)
  Stack Trace:
    at FizzBuzzTests.One() in FizzBuzzTests.cs line: 14
</pre> </p> <p> Not only does the test fail because of an assertion - it fails because of the last assertion (since there's only one assertion). This completes the <em>do-confirm</em> checklist, and you're now ready to make the simplest change that could possibly work: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="color:#74531f;">Convert</span>(<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">i</span>)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>;
}</pre> </p> <p> This passes the test suite. </p> <h3 id="bdb785a78ebf475c95c64197274268c7"> Conclusion <a href="#bdb785a78ebf475c95c64197274268c7" title="permalink">#</a> </h3> <p> It's important to see tests fail. Particularly, it's important to see tests fail for the reason you expect them to fail.
You'd be surprised how often you inadvertently write an <a href="/2019/10/14/tautological-assertion">assertion that can never fail</a>. </p> <p> Once you've seen the test fail for the proper reason, make it pass. </p> <p> Finally, refactor the code if necessary. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="99be0da15a164d5782afdef808300828"> <div class="comment-author">Tyson Williams</div> <div class="comment-content"> <p> I remember the first time that I realized that I did the red step wrong because my test didn't fail for the intended reason (i.e. it didn't fail because of an assertion). Before that, I didn't realize that I needed to. This is a nice programming checklist. Thanks for sharing it :) </p> <blockquote> 3. Consider the resulting code. Can it be improved? If so, do it, but make sure that all tests still pass. </blockquote> <blockquote> Finally, refactor the code if necessary. </blockquote> <p> If I can be a <a href="https://blog.ploeh.dk/2019/10/07/devils-advocate/">Devil's advocate</a> for a moment, then I would say that code can always be improved and few things are necessary. In all honesty though, I think the refactoring step is the most interesting. All three steps include aspects of science and art, but I think the refactor step includes the most of both. On the one hand, it is extremely creative and full of judgement calls about what code should be refactored and what properties the resulting code should have. On the other hand, much of the work of how to (properly) refactor is laid out in books like <a href="https://www.amazon.com/Refactoring-Improving-Existing-Addison-Wesley-Signature/dp/0134757599">Martin Fowler's Refactoring</a> and is akin to algebraic manipulations of an algebraic formula. </p> <p> In other words, I feel like there is room to expand on this checklist in the refactor step. Do you have any thoughts about how you might expand it?
</p> </div> <div class="comment-date">2019-10-25 00:33 UTC</div> </div> <div class="comment" id="28976782c7984115a65d539eff3d0414"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tyson, thank you for writing. I agree that the <em>refactoring</em> step is both important and compelling. I can't, however, imagine how a checklist would be useful. </p> <p> The point of <em>The Checklist Manifesto</em> is that checklists help identify avoidable mistakes. A checklist isn't intended to describe an algorithm, but rather to make sure that crucial steps aren't forgotten. </p> <p> Another important point from <em>The Checklist Manifesto</em> is that a checklist is only effective if it's not too big. A checklist that tries to cover every eventuality isn't useful, because then people don't follow it. </p> <p> As you write, refactoring is a big topic, covered by several books. All the creativity and experience that goes into refactoring doesn't seem like something that can easily be expressed as an effective checklist. </p> <p> I don't mind being proven wrong, though, so by all means give it a go. </p> </div> <div class="comment-date">2019-10-25 21:51 UTC</div> </div> </div><hr> Tautological assertion https://blog.ploeh.dk/2019/10/14/tautological-assertion 2019-10-14T18:39:00+00:00 Mark Seemann <div id="post"> <p> <em>It's surprisingly easy to write a unit test assertion that never fails.</em> </p> <p> Recently I was mob programming with a pair of <a href="https://idq.dk">IDQ</a>'s programmers. We were starting a new code base, using test-driven development (TDD).
This was the first test we wrote: </p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;<span style="font-weight:bold;color:#74531f;">HandleObserveUnitStatusStartsSaga</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">subscribers</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">List</span>&lt;<span style="color:#2b91af;">Guid</span>&gt;&nbsp;{&nbsp;<span style="color:#2b91af;">Guid</span>.<span style="color:#74531f;">Parse</span>(<span style="color:#a31515;">&quot;{4D093799-9CCC-4135-8CB3-8661985A5853}&quot;</span>)&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">StatusPolicy</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Data&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">StatusPolicyData</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;UnitId&nbsp;=&nbsp;123, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Subscribers&nbsp;=&nbsp;<span style="font-weight:bold;color:#1f377f;">subscribers</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">subscriber</span>&nbsp;=&nbsp;<span style="color:#2b91af;">Guid</span>.<span style="color:#74531f;">Parse</span>(<span style="color:#a31515;">&quot;{003C5527-7747-4C7A-980E-67040DB738C3}&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span 
style="font-weight:bold;color:#1f377f;">message</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ObserveUnitStatus</span>(123,&nbsp;<span style="font-weight:bold;color:#1f377f;">subscriber</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">context</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">TestableMessageHandlerContext</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sut</span>.<span style="font-weight:bold;color:#74531f;">Handle</span>(<span style="font-weight:bold;color:#1f377f;">message</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">context</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Contains</span>(<span style="font-weight:bold;color:#1f377f;">subscriber</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">sut</span>.Data.Subscribers); }</pre> </p> <p> This unit test uses <a href="https://xunit.net">xUnit.net</a> 2.4.0 and <a href="https://particular.net/nservicebus">NServiceBus</a> 7.1.10 on .NET Core 2.2. The System Under Test (SUT) is intended to be an NServiceBus Saga that monitors a resource for status changes. If a <em>unit</em> changes status, the Saga will alert its subscribers. </p> <p> The test verifies that when a new subscriber wishes to observe a unit, then its ID is added to the policy's list of subscribers. 
</p> <p> The test induced us to implement <code>Handle</code> like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;<span style="font-weight:bold;color:#74531f;">Handle</span>(<span style="color:#2b91af;">ObserveUnitStatus</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">message</span>,&nbsp;<span style="color:#2b91af;">IMessageHandlerContext</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">context</span>) { &nbsp;&nbsp;&nbsp;&nbsp;Data.Subscribers.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="font-weight:bold;color:#1f377f;">message</span>.SubscriberId); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:#2b91af;">Task</span>.CompletedTask; }</pre> </p> <p> Following the <em>red-green-refactor</em> cycle of TDD, this seemed an appropriate implementation. </p> <h3 id="8182b3c72b2940b98195f41b7a1193e8"> Enter the Devil <a href="#8182b3c72b2940b98195f41b7a1193e8" title="permalink">#</a> </h3> <p> I often use the <a href="/2019/10/07/devils-advocate">Devil's advocate</a> technique to figure out what to do next, so I made this change to the <code>Handle</code> method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;<span style="font-weight:bold;color:#74531f;">Handle</span>(<span style="color:#2b91af;">ObserveUnitStatus</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">message</span>,&nbsp;<span style="color:#2b91af;">IMessageHandlerContext</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">context</span>) { &nbsp;&nbsp;&nbsp;&nbsp;Data.Subscribers.<span style="font-weight:bold;color:#74531f;">Clear</span>(); &nbsp;&nbsp;&nbsp;&nbsp;Data.Subscribers.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="font-weight:bold;color:#1f377f;">message</span>.SubscriberId); &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:#2b91af;">Task</span>.CompletedTask; }</pre> </p> <p> The change is that the method first deletes all existing subscribers. This is obviously wrong, but it passes all tests. That's no surprise, since I intentionally introduced the change to make us improve the test. </p> <h3 id="55331957e1de4691bb182c2aae614f4e"> False negative <a href="#55331957e1de4691bb182c2aae614f4e" title="permalink">#</a> </h3> <p> We had to write a new test, or improve the existing test, so that the defect I just introduced would be caught. I suggested an improvement to the existing test: </p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;<span style="font-weight:bold;color:#74531f;">HandleObserveUnitStatusStartsSaga</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">subscribers</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">List</span>&lt;<span style="color:#2b91af;">Guid</span>&gt;&nbsp;{&nbsp;<span style="color:#2b91af;">Guid</span>.<span style="color:#74531f;">Parse</span>(<span style="color:#a31515;">&quot;{4D093799-9CCC-4135-8CB3-8661985A5853}&quot;</span>)&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">StatusPolicy</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Data&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">StatusPolicyData</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;UnitId&nbsp;=&nbsp;123, 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Subscribers&nbsp;=&nbsp;<span style="font-weight:bold;color:#1f377f;">subscribers</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">subscriber</span>&nbsp;=&nbsp;<span style="color:#2b91af;">Guid</span>.<span style="color:#74531f;">Parse</span>(<span style="color:#a31515;">&quot;{003C5527-7747-4C7A-980E-67040DB738C3}&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">message</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ObserveUnitStatus</span>(123,&nbsp;<span style="font-weight:bold;color:#1f377f;">subscriber</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">context</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">TestableMessageHandlerContext</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sut</span>.<span style="font-weight:bold;color:#74531f;">Handle</span>(<span style="font-weight:bold;color:#1f377f;">message</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">context</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Contains</span>(<span style="font-weight:bold;color:#1f377f;">subscriber</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">sut</span>.Data.Subscribers); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Superset</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">expectedSubset</span>:&nbsp;<span style="color:blue;">new</span>&nbsp;<span 
style="color:#2b91af;">HashSet</span>&lt;<span style="color:#2b91af;">Guid</span>&gt;(<span style="font-weight:bold;color:#1f377f;">subscribers</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>:&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">HashSet</span>&lt;<span style="color:#2b91af;">Guid</span>&gt;(<span style="font-weight:bold;color:#1f377f;">sut</span>.Data.Subscribers)); }</pre> </p> <p> The only change is the addition of the last assertion. </p> <p> Smugly I asked the keyboard driver to run the tests, anticipating that the test would now fail. </p> <p> It passed. </p> <p> We'd just managed to write a <a href="http://xunitpatterns.com/false%20negative.html">false negative</a>. Even though there's a defect in the code, the test still passes. I was nonplussed. None of us expected the test to pass, yet it did. </p> <p> It took us a few minutes to figure out what was wrong. Before you read on, try to figure it out for yourself. Perhaps it's immediately clear to you, but it took three people with decades of programming experience a few minutes to spot the problem. </p> <h3 id="6f985240963a40d8af913e251b3b86bf"> Aliasing <a href="#6f985240963a40d8af913e251b3b86bf" title="permalink">#</a> </h3> <p> The problem is <a href="https://en.wikipedia.org/wiki/Aliasing_(computing)">aliasing</a>. While named differently, <code>subscribers</code> and <code>sut.Data.Subscribers</code> refer to the same object. Both <code>HashSet&lt;Guid&gt;</code> copies are created after <code>Handle</code> has run, so they're built from the same, already modified list. Of course one is a subset of the other, since a set is considered to be a subset of itself. </p> <p> The assertion is tautological. It can never fail. </p> <h3 id="9e8ec91f731e4f67aceefbebb76799d4"> Fixing the problem <a href="#9e8ec91f731e4f67aceefbebb76799d4" title="permalink">#</a> </h3> <p> It's surprisingly easy to write tautological assertions when working with mutable state. This regularly happens to me, perhaps a few times a month. 
Once you've realised that this has happened, however, it's easy to address. </p> <p> <code>subscribers</code> shouldn't change during the test, so make it immutable. </p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;<span style="font-weight:bold;color:#74531f;">HandleObserveUnitStatusStartsSaga</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Guid</span>&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">subscribers</span>&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:#2b91af;">Guid</span>.<span style="color:#74531f;">Parse</span>(<span style="color:#a31515;">&quot;{4D093799-9CCC-4135-8CB3-8661985A5853}&quot;</span>)&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">StatusPolicy</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Data&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">StatusPolicyData</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;UnitId&nbsp;=&nbsp;123, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Subscribers&nbsp;=&nbsp;<span style="font-weight:bold;color:#1f377f;">subscribers</span>.<span style="font-weight:bold;color:#74531f;">ToList</span>() &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">subscriber</span>&nbsp;=&nbsp;<span style="color:#2b91af;">Guid</span>.<span 
style="color:#74531f;">Parse</span>(<span style="color:#a31515;">&quot;{003C5527-7747-4C7A-980E-67040DB738C3}&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">message</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ObserveUnitStatus</span>(123,&nbsp;<span style="font-weight:bold;color:#1f377f;">subscriber</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">context</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">TestableMessageHandlerContext</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sut</span>.<span style="font-weight:bold;color:#74531f;">Handle</span>(<span style="font-weight:bold;color:#1f377f;">message</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">context</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Contains</span>(<span style="font-weight:bold;color:#1f377f;">subscriber</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">sut</span>.Data.Subscribers); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">Superset</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">expectedSubset</span>:&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">HashSet</span>&lt;<span style="color:#2b91af;">Guid</span>&gt;(<span style="font-weight:bold;color:#1f377f;">subscribers</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>:&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">HashSet</span>&lt;<span style="color:#2b91af;">Guid</span>&gt;(<span 
style="font-weight:bold;color:#1f377f;">sut</span>.Data.Subscribers)); }</pre> </p> <p> Strictly speaking, an array isn't immutable, but declaring it as <code>IEnumerable&lt;Guid&gt;</code> hides the mutation capabilities. The test now has to copy <code>subscribers</code> to a list before assigning it to the policy's data. This anti-aliases <code>subscribers</code> from <code>sut.Data.Subscribers</code>, and causes the test to fail. After all, there's a defect in the <code>Handle</code> method. </p> <p> You now have to remove the offending line: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Task</span>&nbsp;<span style="font-weight:bold;color:#74531f;">Handle</span>(<span style="color:#2b91af;">ObserveUnitStatus</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">message</span>,&nbsp;<span style="color:#2b91af;">IMessageHandlerContext</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">context</span>) { &nbsp;&nbsp;&nbsp;&nbsp;Data.Subscribers.<span style="font-weight:bold;color:#74531f;">Add</span>(<span style="font-weight:bold;color:#1f377f;">message</span>.SubscriberId); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:#2b91af;">Task</span>.CompletedTask; }</pre> </p> <p> This makes the test pass. </p> <h3 id="fd8e8ab5e7314ebbadc4775741af11fa"> Summary <a href="#fd8e8ab5e7314ebbadc4775741af11fa" title="permalink">#</a> </h3> <p> This article shows an example where I was surprised by aliasing. </p> <p> <em>The Devil's Advocate technique can help you decide.</em> </p> <p> When I review unit tests, I often utilise a technique I call <em>Devil's Advocate</em>. I do the same whenever I consider if I have a sufficient number of test cases. The first time I explicitly named the technique was, I think, in my <a href="/outside-in-tdd">Outside-in TDD Pluralsight course</a>, in which I also discuss the so-called <em>Gollum style</em> variation. 
I don't think, however, that I've ever written an article explicitly about this topic. The current text attempts to rectify that omission. </p> <h3 id="a66b04a5812b4a84ba3a60a8609e58be"> Coverage <a href="#a66b04a5812b4a84ba3a60a8609e58be" title="permalink">#</a> </h3> <p> Programmers new to unit testing often struggle with identifying useful test cases. I sometimes see people writing redundant unit tests, while, on the other hand, forgetting to add important test cases. How do you know which test cases to add, and how do you know when you've added enough? </p> <p> I may return to the first question in another article, but in this, I wish to address the second question. How do you know that you have a sufficient set of test cases? </p> <p> You may think that this is a question of turning on code coverage. Surely, if you have <a href="/2015/11/16/code-coverage-is-a-useless-target-measure">100% code coverage</a>, that's sufficient? </p> <p> It's not. Consider this simple class: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">capacity</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Capacity&nbsp;=&nbsp;<span style="color:#1f377f;">capacity</span>; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">int</span>&nbsp;Capacity&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:#74531f;">CanAccept</span>(<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Reservation</span>&gt;&nbsp;<span 
style="color:#1f377f;">reservations</span>,&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;<span style="color:#1f377f;">reservation</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;=&nbsp;<span style="color:#1f377f;">reservations</span>.<span style="color:#74531f;">Sum</span>(<span style="color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;<span style="color:#1f377f;">r</span>.Quantity); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">if</span>&nbsp;(Capacity&nbsp;&lt;&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;+&nbsp;<span style="color:#1f377f;">reservation</span>.Quantity) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">true</span>; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> This class implements the (simplified) decision logic for an online restaurant reservation system. 
The <code>CanAccept</code> method has a cyclomatic complexity of 2, so it should be easy to cover with a pair of unit tests: </p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">CanAcceptWithNoPriorReservations</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Date&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2018,&nbsp;8,&nbsp;30), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;4 &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span style="color:#1f377f;">capacity</span>:&nbsp;10); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#1f377f;">sut</span>.<span style="color:#74531f;">CanAccept</span>(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>[0],&nbsp;<span style="color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">True</span>(<span style="color:#1f377f;">actual</span>); } [<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">CanAcceptOnInsufficientCapacity</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span 
style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Date&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2018,&nbsp;8,&nbsp;30), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;4 &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span style="color:#1f377f;">capacity</span>:&nbsp;10); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#1f377f;">sut</span>.<span style="color:#74531f;">CanAccept</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;{&nbsp;Quantity&nbsp;=&nbsp;7&nbsp;}&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">False</span>(<span style="color:#1f377f;">actual</span>); }</pre> </p> <p> These two tests together completely cover the <code>CanAccept</code> method: </p> <p> <img src="/content/binary/coverage-of-can-accept-method.png" alt="Screen shot showing that the CanAccept method is 100% covered."> </p> <p> You'd think that this is a sufficient number of test cases of the method, then. 
</p> <h3 id="0ac61ef87e3f4e738475e97179242db5"> As the Devil reads the Bible <a href="#0ac61ef87e3f4e738475e97179242db5" title="permalink">#</a> </h3> <p> In Scandinavia we have an idiom that <a href="https://www.kentbeck.com">Kent Beck</a> (who's worked with Norwegian companies) has also encountered: <blockquote> <p> "TIL: "like the devil reads the Bible"--meaning someone who carefully reads a book to subvert its intent" </p> <footer><cite><a href="https://twitter.com/kentbeck/status/651817458857320449">Kent Beck</a></cite></footer> </blockquote> We have the same saying in Danish, and the Swedes also use it. </p> <p> If you think of a unit test suite as an executable specification, you may consider whether you can follow the specification to the letter while intentionally introducing a defect. You can easily do that with the above <code>CanAccept</code> method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:#74531f;">CanAccept</span>(<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Reservation</span>&gt;&nbsp;<span style="color:#1f377f;">reservations</span>,&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;<span style="color:#1f377f;">reservation</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;=&nbsp;<span style="color:#1f377f;">reservations</span>.<span style="color:#74531f;">Sum</span>(<span style="color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;<span style="color:#1f377f;">r</span>.Quantity); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">if</span>&nbsp;(Capacity&nbsp;&lt;=&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;+&nbsp;<span style="color:#1f377f;">reservation</span>.Quantity) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">true</span>; }</pre> </p> <p> This still passes both tests, and still has a code coverage of 100%, yet it's 'obviously' wrong. </p> <p> Can you spot the difference? </p> <p> Instead of a <em>less-than</em> comparison, it now uses a <em>less-than-or-equal</em> comparison. You could easily, inadvertently, make such a mistake while programming. It belongs in the category of <em>off-by-one errors</em>, which is one of the most common types of bugs. </p> <p> This is, in a nutshell, the Devil's Advocate technique. The intent isn't to break the software by sneaking in defects, but to explore how effectively the test suite detects bugs. In the current (simplified) example, the effectiveness of the test suite isn't impressive. </p> <h3 id="d6e4c657adec4cc1bb6d10af351a415f"> Add test cases <a href="#d6e4c657adec4cc1bb6d10af351a415f" title="permalink">#</a> </h3> <p> The problem introduced by the Devil's Advocate is an edge case. If the reservation under consideration fits the restaurant's remaining capacity, but entirely consumes it, the <code>MaîtreD</code> class should still accept it. Currently, however, it doesn't. 
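</p> <p> To make the edge case concrete, here's a minimal sketch that evaluates both comparisons side by side. The <code>Boundary</code> class and its <code>Original</code> and <code>Devil</code> methods are hypothetical names introduced only for illustration; they reduce <code>CanAccept</code> to plain integers and aren't part of the code base. </p> <p> <pre>using System;

public static class Boundary
{
    // The original comparison: reject only when the capacity would be exceeded.
    public static bool Original(int capacity, int reservedSeats, int quantity)
    {
        return !(capacity &lt; reservedSeats + quantity);
    }

    // The Devil's variation: also rejects a request that exactly fills the restaurant.
    public static bool Devil(int capacity, int reservedSeats, int quantity)
    {
        return !(capacity &lt;= reservedSeats + quantity);
    }

    public static void Main()
    {
        // The inputs used by the two existing tests: both versions agree.
        Console.WriteLine(Original(10, 0, 4) == Devil(10, 0, 4)); // True
        Console.WriteLine(Original(10, 7, 4) == Devil(10, 7, 4)); // True

        // The boundary: ten seats requested in an empty ten-seat restaurant.
        Console.WriteLine(Original(10, 0, 10)); // True
        Console.WriteLine(Devil(10, 0, 10));    // False
    }
}</pre> </p> <p> Neither of the two existing tests exercises that boundary, which is why both versions pass the suite. 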
</p> <p> It'd seem that the obvious solution is to 'fix' the unit test: </p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">CanAcceptWithNoPriorReservations</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Date&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2018,&nbsp;8,&nbsp;30), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;10 &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span style="color:#1f377f;">capacity</span>:&nbsp;10); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#1f377f;">sut</span>.<span style="color:#74531f;">CanAccept</span>(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>[0],&nbsp;<span style="color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">True</span>(<span style="color:#1f377f;">actual</span>); }</pre> </p> <p> Changing the requested <code>Quantity</code> to <code>10</code> does, indeed, cause the test to fail. 
</p> <h3 id="26be7b38248c4dcba5134eb4529d8214"> Beyond mutation testing <a href="#26be7b38248c4dcba5134eb4529d8214" title="permalink">#</a> </h3> <p> Until this point, you may think that the Devil's Advocate just looks like <em>an ad-hoc, informally-specified, error-prone, manual version of half of <a href="https://en.wikipedia.org/wiki/Mutation_testing">mutation testing</a></em>. So far, the change I made above could also have been made during mutation testing. </p> <p> What I sometimes do with the Devil's Advocate technique is to experiment with other, less heuristically driven changes. For instance, based on my knowledge of the existing test cases, it's not too difficult to come up with this change: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:#74531f;">CanAccept</span>(<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Reservation</span>&gt;&nbsp;<span style="color:#1f377f;">reservations</span>,&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;<span style="color:#1f377f;">reservation</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;=&nbsp;<span style="color:#1f377f;">reservations</span>.<span style="color:#74531f;">Sum</span>(<span style="color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;<span style="color:#1f377f;">r</span>.Quantity); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">if</span>&nbsp;(<span style="color:#1f377f;">reservation</span>.Quantity&nbsp;!=&nbsp;10) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">true</span>; }</pre> </p> <p> That's an even simpler implementation than the original, but obviously wrong. 
</p> <p> This should prompt you to add at least one other test case: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(&nbsp;4)] [<span style="color:#2b91af;">InlineData</span>(10)] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">CanAcceptWithNoPriorReservations</span>(<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">quantity</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Date&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2018,&nbsp;8,&nbsp;30), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">quantity</span> &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span style="color:#1f377f;">capacity</span>:&nbsp;10); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#1f377f;">sut</span>.<span style="color:#74531f;">CanAccept</span>(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>[0],&nbsp;<span style="color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">True</span>(<span style="color:#1f377f;">actual</span>); }</pre> </p> <p> Notice that I converted the test to a parametrised test. This breaks the Devil's latest attempt, while the original implementation passes all tests. 
</p> <p> The Devil, not to be outdone, now switches tactics and goes after the <code>reservations</code> instead: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:#74531f;">CanAccept</span>(<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Reservation</span>&gt;&nbsp;<span style="color:#1f377f;">reservations</span>,&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;<span style="color:#1f377f;">reservation</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;!<span style="color:#1f377f;">reservations</span>.<span style="color:#74531f;">Any</span>(); }</pre> </p> <p> This still passes all tests, including the new test case. This indicates that you'll need to add at least one test case with existing reservations, but where there's still enough capacity to accept another reservation: </p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">CanAcceptWithOnePriorReservation</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Date&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2018,&nbsp;8,&nbsp;30), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;4 &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span style="color:#1f377f;">capacity</span>:&nbsp;10); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span 
style="color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#1f377f;">sut</span>.<span style="color:#74531f;">CanAccept</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;{&nbsp;Quantity&nbsp;=&nbsp;4&nbsp;}&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">True</span>(<span style="color:#1f377f;">actual</span>); }</pre> </p> <p> This new test fails, prompting you to correct the implementation of <code>CanAccept</code>. The Devil, however, can do this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:#74531f;">CanAccept</span>(<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Reservation</span>&gt;&nbsp;<span style="color:#1f377f;">reservations</span>,&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;<span style="color:#1f377f;">reservation</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;=&nbsp;<span style="color:#1f377f;">reservations</span>.<span style="color:#74531f;">Sum</span>(<span style="color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;<span style="color:#1f377f;">r</span>.Quantity); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;!=&nbsp;7; }</pre> </p> <p> This is still not correct, but passes all tests. It does, however, look like you're getting closer to a proper implementation. 
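</p> <p> One way to see why this passes is to write the test cases collected so far as data and compare them with what the Devil's implementation returns. The following is only a sketch: the <code>DevilCheck</code> class is a hypothetical stand-in that keeps nothing but the quantities of the existing reservations, since that's all this implementation inspects. </p> <p> <pre>using System;
using System.Linq;

public static class DevilCheck
{
    // The Devil's latest implementation, reduced to the one value it looks at.
    public static bool Devil(int[] reservedQuantities)
    {
        return reservedQuantities.Sum() != 7;
    }

    public static void Main()
    {
        // (existing reservations, expected result) for the tests written so far.
        // All of them use a capacity of 10.
        var cases = new (int[] Reserved, bool Expected)[]
        {
            (new int[0], true),   // no prior reservations, quantity 4 or 10
            (new[] { 4 }, true),  // one prior reservation, quantity 4
            (new[] { 7 }, false), // insufficient capacity, quantity 4
        };

        // Every expectation in the suite is met.
        Console.WriteLine(cases.All(c =&gt; Devil(c.Reserved) == c.Expected)); // True

        // Eight seats already taken: not covered by any test so far.
        Console.WriteLine(Devil(new[] { 8 })); // True
    }
}</pre> </p> <p> The last line hints at where the suite is still weak: with eight seats taken and four requested, a capacity of 10 is exceeded, yet this implementation would still accept the reservation. 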
</p> <h3 id="9b955ad4a2084823a9eeb668415eb696"> Reverse Transformation Priority Premise <a href="#9b955ad4a2084823a9eeb668415eb696" title="permalink">#</a> </h3> <p> If you find this process oddly familiar, it's because it resembles the <a href="https://blog.cleancoder.com/uncle-bob/2013/05/27/TheTransformationPriorityPremise.html">Transformation Priority Premise</a> (TPP), just reversed. <blockquote> <p> “As the tests get more specific, the code gets more generic.” </p> <footer><cite><a href="https://blog.cleancoder.com/uncle-bob/2013/05/27/TheTransformationPriorityPremise.html">Robert C. Martin</a></cite></footer> </blockquote> </p> <p> When I test-drive code, I often try to follow the TPP, but when I review code with tests, the code and the tests are already in place, and it's my task to assess both. </p> <p> Applying the Devil's Advocate review technique to <code>CanAccept</code>, it seems as though I'm getting closer to a proper implementation. It does, however, require more tests. 
As your next move you may, for instance, consider parametrising the test case that verifies what happens when capacity is insufficient: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(7)] [<span style="color:#2b91af;">InlineData</span>(8)] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">CanAcceptOnInsufficientCapacity</span>(<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Date&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2018,&nbsp;8,&nbsp;30), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;4 &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span style="color:#1f377f;">capacity</span>:&nbsp;10); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#1f377f;">sut</span>.<span style="color:#74531f;">CanAccept</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;{&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;}&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span 
style="color:#74531f;">False</span>(<span style="color:#1f377f;">actual</span>); }</pre> </p> <p> That doesn't help much, though, because this passes all tests: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:#74531f;">CanAccept</span>(<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Reservation</span>&gt;&nbsp;<span style="color:#1f377f;">reservations</span>,&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;<span style="color:#1f377f;">reservation</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;=&nbsp;<span style="color:#1f377f;">reservations</span>.<span style="color:#74531f;">Sum</span>(<span style="color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;<span style="color:#1f377f;">r</span>.Quantity); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;&lt;&nbsp;7; }</pre> </p> <p> Compared to the initial, 'desired' implementation, there are at least two issues with this code: <ul> <li>It doesn't consider <code>reservation.Quantity</code></li> <li>It doesn't take into account the <code>Capacity</code> of the restaurant</li> </ul> This indicates that you're going to have to add more test cases, varying both <code>reservation.Quantity</code> and <code>Capacity</code>. 
The happy-path test cases already vary <code>reservation.Quantity</code> a bit, but <code>CanAcceptOnInsufficientCapacity</code> does not, so perhaps you can follow the TPP by varying <code>reservation.Quantity</code> in that method as well: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(&nbsp;1,&nbsp;10)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;2,&nbsp;&nbsp;9)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;3,&nbsp;&nbsp;8)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;4,&nbsp;&nbsp;7)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;4,&nbsp;&nbsp;8)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;5,&nbsp;&nbsp;6)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;6,&nbsp;&nbsp;5)] [<span style="color:#2b91af;">InlineData</span>(10,&nbsp;&nbsp;1)] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">CanAcceptOnInsufficientCapacity</span>(<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">quantity</span>,&nbsp;<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Date&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2018,&nbsp;8,&nbsp;30), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">quantity</span> &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span 
style="color:#1f377f;">capacity</span>:&nbsp;10); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#1f377f;">sut</span>.<span style="color:#74531f;">CanAccept</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;{&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;}&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">False</span>(<span style="color:#1f377f;">actual</span>); }</pre> </p> <p> This makes it harder for the Devil to come up with a malevolent implementation. Harder, but not impossible. </p> <p> It seems clear that since all test cases still use a hard-coded capacity, it ought to be possible to write an implementation that ignores the <code>Capacity</code>, but at this point I don't see a simple way to avoid looking at <code>reservation.Quantity</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:#74531f;">CanAccept</span>(<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Reservation</span>&gt;&nbsp;<span style="color:#1f377f;">reservations</span>,&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;<span style="color:#1f377f;">reservation</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;=&nbsp;<span style="color:#1f377f;">reservations</span>.<span style="color:#74531f;">Sum</span>(<span style="color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;<span style="color:#1f377f;">r</span>.Quantity); &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:#8f08c4;">return</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;+&nbsp;<span style="color:#1f377f;">reservation</span>.Quantity&nbsp;&lt;&nbsp;11; }</pre> </p> <p> This implementation passes all the tests. The last batch of test cases forced the Devil to consider <code>reservation.Quantity</code>. This strongly implies that if you vary <code>Capacity</code> as well, the proper implementation ought to emerge. </p> <h3 id="e3ed81c93d5847039ea53e7acd978d99"> Diminishing returns <a href="#e3ed81c93d5847039ea53e7acd978d99" title="permalink">#</a> </h3> <p> What happens, then, if you add just one test case with a different <code>Capacity</code>? </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(&nbsp;1,&nbsp;10,&nbsp;10)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;2,&nbsp;&nbsp;9,&nbsp;10)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;3,&nbsp;&nbsp;8,&nbsp;10)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;4,&nbsp;&nbsp;7,&nbsp;10)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;4,&nbsp;&nbsp;8,&nbsp;10)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;5,&nbsp;&nbsp;6,&nbsp;10)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;6,&nbsp;&nbsp;5,&nbsp;10)] [<span style="color:#2b91af;">InlineData</span>(10,&nbsp;&nbsp;1,&nbsp;10)] [<span style="color:#2b91af;">InlineData</span>(&nbsp;1,&nbsp;&nbsp;1,&nbsp;&nbsp;1)] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">CanAcceptOnInsufficientCapacity</span>( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">quantity</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">capacity</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Date&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2018,&nbsp;8,&nbsp;30), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">quantity</span> &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span style="color:#1f377f;">capacity</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#1f377f;">sut</span>.<span style="color:#74531f;">CanAccept</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;{&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;}&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">False</span>(<span style="color:#1f377f;">actual</span>); }</pre> </p> <p> Notice that I just added one test case with a <code>Capacity</code> of <code>1</code>. </p> <p> You may think that this is about where the Devil ought to capitulate, but not so. 
This passes all tests: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:#74531f;">CanAccept</span>(<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Reservation</span>&gt;&nbsp;<span style="color:#1f377f;">reservations</span>,&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;<span style="color:#1f377f;">reservation</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;=&nbsp;0; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">foreach</span>&nbsp;(<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">r</span>&nbsp;<span style="color:#8f08c4;">in</span>&nbsp;<span style="color:#1f377f;">reservations</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;=&nbsp;<span style="color:#1f377f;">r</span>.Quantity; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">break</span>; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;+&nbsp;<span style="color:#1f377f;">reservation</span>.Quantity&nbsp;&lt;=&nbsp;Capacity; }</pre> </p> <p> Here you may feel the urge to protest. So far, all the Devil's Advocate implementations have been objectively <em>simpler</em> than the 'desired' implementation because they have involved fewer elements and have had a lower or equivalent <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">cyclomatic complexity</a>. This new attempt to circumvent the specification seems more complex. </p> <p> It also seems clearly ill-intentioned. Recall that the intent of the Devil's Advocate technique isn't to 'cheat' the unit tests, but rather to explore how well the tests describe the desired behaviour of the system. 
The motivation is that it's easy to make off-by-one errors like inadvertently using <code>&lt;=</code> instead of <code>&lt;</code>. It doesn't seem quite as reasonable that a well-intentioned programmer would accidentally leave behind an implementation like the above. </p> <p> You can, however, make it <em>look</em> less complicated: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:#74531f;">CanAccept</span>(<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Reservation</span>&gt;&nbsp;<span style="color:#1f377f;">reservations</span>,&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;<span style="color:#1f377f;">reservation</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;=&nbsp;<span style="color:#1f377f;">reservations</span>.<span style="color:#74531f;">Select</span>(<span style="color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;<span style="color:#1f377f;">r</span>.Quantity).<span style="color:#74531f;">FirstOrDefault</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;+&nbsp;<span style="color:#1f377f;">reservation</span>.Quantity&nbsp;&lt;=&nbsp;Capacity; }</pre> </p> <p> You could argue that this still looks intentionally wrong, but I've seen much code that looks like this. It seems to me that there's a kind of programmer who seems generally uncomfortable thinking in collections; they seem to subconsciously gravitate towards code that deals with singular objects. Code that attempts to get 'the' value out of a collection is, unfortunately, not that uncommon. </p> <p> Still, you might think that at this point, you've added enough test cases. That's reasonable. </p> <p> The Devil's Advocate technique isn't an <em>algorithm</em>; it has no deterministic exit criterion. 
It's just a heuristic that I use to explore the quality of tests. There comes a point where subjectively, I judge that the test cases <em>sufficiently</em> describe the desired behaviour. </p> <p> You may find that we've reached that point now. You could, for example, argue that in order to calculate <code>reservedSeats</code>, <code>reservations.Sum(r =&gt; r.Quantity)</code> is simpler than <code>reservations.Select(r =&gt; r.Quantity).FirstOrDefault()</code>. I'd be inclined to agree. </p> <p> There are diminishing returns to the Devil's Advocate technique. Once you find that the gains from insisting on intentionally pernicious implementations are smaller than the effort required to add more test cases, it's time to stop and commit to the test cases now in place. </p> <h3 id="609ddb35ae364efbbfb7965a646d857e"> Test case variability <a href="#609ddb35ae364efbbfb7965a646d857e" title="permalink">#</a> </h3> <p> Tests specify desired behaviour. If the tests contain less variability than the code they cover, then how can you be certain that the implementation code is correct? </p> <p> The discussion now moves into territory where I usually exercise a great deal of judgement. Read the following for inspiration, not as rigid instructions. My intent with the following is not to imply that you must always go to such extremes, but simply to demonstrate what you <em>can</em> do. Depending on circumstances (such as the cost of a defect in production), I may choose to do the following, and sometimes I may choose to skip it. </p> <p> If you consider the original implementation of <code>CanAccept</code> at the top of the article, notice that it works with <code>reservations</code> of indefinite size. If you think of <code>reservations</code> as a finite collection, it can contain zero, one, two, ten, or hundreds of elements. Yet, no test case goes beyond a single existing reservation. This is, I think, a disconnect. 
The tests don't come even close to the degree of variability that the method can handle. If this is a piece of mission-critical software, that could be a cause for concern. </p> <p> You should add some test cases where there are two, three, or more existing reservations. People often don't do that because it seems that you'd now have to write a test method that exercises one or more test cases with two existing reservations: </p> <p> <pre>[<span style="color:#2b91af;">Fact</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">CanAcceptWithTwoPriorReservations</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Date&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2018,&nbsp;8,&nbsp;30), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;4 &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span style="color:#1f377f;">capacity</span>:&nbsp;10); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#1f377f;">sut</span>.<span style="color:#74531f;">CanAccept</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;{&nbsp;Quantity&nbsp;=&nbsp;4&nbsp;},&nbsp;<span style="color:blue;">new</span>&nbsp;<span 
style="color:#2b91af;">Reservation</span>&nbsp;{&nbsp;Quantity&nbsp;=&nbsp;1&nbsp;}&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">True</span>(<span style="color:#1f377f;">actual</span>); }</pre> </p> <p> While this method now covers the two-existing-reservations test case, you need one to cover the three-existing-reservations test case, and so on. This seems repetitive, and probably bothers you at more than one level: <ul> <li>It's just plain tedious to have to add that kind of variability</li> <li>It seems to violate the <a href="https://en.wikipedia.org/wiki/Don%27t_repeat_yourself">DRY principle</a></li> </ul> I don't hold the DRY principle as an absolute that must always be followed, but it often indicates a maintainability problem. I think this is the case here, because the new <code>CanAcceptWithTwoPriorReservations</code> test method looks a lot like the previous <code>CanAcceptWithOnePriorReservation</code> method. If someone makes changes to the <code>MaîtreD</code> class, they would have to go and revisit all those test methods. </p> <p> What you can do instead is to parametrise the key values of the collection(s) in question. While you can't put collections of objects in <code>[InlineData]</code> attributes, you <em>can</em> put arrays of constants. 
For existing reservations, the key values are the quantities, so supply an array of integers as a test argument: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(&nbsp;4,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:blue;">int</span>[0])] [<span style="color:#2b91af;">InlineData</span>(10,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:blue;">int</span>[0])] [<span style="color:#2b91af;">InlineData</span>(&nbsp;4,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;4&nbsp;})] [<span style="color:#2b91af;">InlineData</span>(&nbsp;4,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;4,&nbsp;1&nbsp;})] [<span style="color:#2b91af;">InlineData</span>(&nbsp;2,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;2,&nbsp;1,&nbsp;3,&nbsp;2&nbsp;})] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">CanAcceptWhenCapacityIsSufficient</span>(<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">quantity</span>,&nbsp;<span style="color:blue;">int</span>[]&nbsp;<span style="color:#1f377f;">reservationQantities</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Date&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2018,&nbsp;8,&nbsp;30), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">quantity</span> &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span 
style="color:#1f377f;">capacity</span>:&nbsp;10); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservations</span>&nbsp;=&nbsp;<span style="color:#1f377f;">reservationQantities</span>.<span style="color:#74531f;">Select</span>(<span style="color:#1f377f;">q</span>&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;{&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">q</span>&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#1f377f;">sut</span>.<span style="color:#74531f;">CanAccept</span>(<span style="color:#1f377f;">reservations</span>,&nbsp;<span style="color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">True</span>(<span style="color:#1f377f;">actual</span>); }</pre> </p> <p> This single test method replaces the previous three 'happy path' test methods. The first four <code>[InlineData]</code> annotations reproduce the previous test cases, whereas the fifth <code>[InlineData]</code> annotation adds a new test case with four existing reservations. </p> <p> I gave the method a new name to better reflect the more general nature of it. </p> <p> Notice that the <code>CanAcceptWhenCapacityIsSufficient</code> method uses <code>Select</code> to turn the array of integers into a collection of <code>Reservation</code> objects. </p> <p> You may think that I cheated, since I didn't supply any other values, such as the <code>Date</code> property, to the existing reservations. 
This is easily addressed: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(&nbsp;4,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:blue;">int</span>[0])] [<span style="color:#2b91af;">InlineData</span>(10,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:blue;">int</span>[0])] [<span style="color:#2b91af;">InlineData</span>(&nbsp;4,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;4&nbsp;})] [<span style="color:#2b91af;">InlineData</span>(&nbsp;4,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;4,&nbsp;1&nbsp;})] [<span style="color:#2b91af;">InlineData</span>(&nbsp;2,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;2,&nbsp;1,&nbsp;3,&nbsp;2&nbsp;})] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">CanAcceptWhenCapacityIsSufficient</span>(<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">quantity</span>,&nbsp;<span style="color:blue;">int</span>[]&nbsp;<span style="color:#1f377f;">reservationQantities</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">date</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2018,&nbsp;8,&nbsp;30); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Date&nbsp;=&nbsp;<span style="color:#1f377f;">date</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">quantity</span> &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">sut</span>&nbsp;=&nbsp;<span 
style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span style="color:#1f377f;">capacity</span>:&nbsp;10); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservations</span>&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">reservationQantities</span>.<span style="color:#74531f;">Select</span>(<span style="color:#1f377f;">q</span>&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;{&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">q</span>,&nbsp;Date&nbsp;=&nbsp;<span style="color:#1f377f;">date</span>&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#1f377f;">sut</span>.<span style="color:#74531f;">CanAccept</span>(<span style="color:#1f377f;">reservations</span>,&nbsp;<span style="color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">True</span>(<span style="color:#1f377f;">actual</span>); }</pre> </p> <p> The only change compared to before is that <code>date</code> is now a variable assigned not only to <code>reservation</code>, but also to all the <code>Reservation</code> objects in <code>reservations</code>. </p> <h3 id="0564ebd7cafc44f4ba6ad017e3f0d0ce"> Towards property-based testing <a href="#0564ebd7cafc44f4ba6ad017e3f0d0ce" title="permalink">#</a> </h3> <p> Looking at a test method like <code>CanAcceptWhenCapacityIsSufficient</code> it should bother you that the <code>capacity</code> is still hard-coded. Why don't you make that a test argument as well? 
</p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(10,&nbsp;&nbsp;4,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:blue;">int</span>[0])] [<span style="color:#2b91af;">InlineData</span>(10,&nbsp;10,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:blue;">int</span>[0])] [<span style="color:#2b91af;">InlineData</span>(10,&nbsp;&nbsp;4,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;4&nbsp;})] [<span style="color:#2b91af;">InlineData</span>(10,&nbsp;&nbsp;4,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;4,&nbsp;1&nbsp;})] [<span style="color:#2b91af;">InlineData</span>(10,&nbsp;&nbsp;2,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;2,&nbsp;1,&nbsp;3,&nbsp;2&nbsp;})] [<span style="color:#2b91af;">InlineData</span>(20,&nbsp;10,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;2,&nbsp;2,&nbsp;2,&nbsp;2&nbsp;})] [<span style="color:#2b91af;">InlineData</span>(20,&nbsp;&nbsp;4,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;2,&nbsp;2,&nbsp;4,&nbsp;1,&nbsp;3,&nbsp;3&nbsp;})] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">CanAcceptWhenCapacityIsSufficient</span>( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">capacity</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">quantity</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>[]&nbsp;<span style="color:#1f377f;">reservationQantities</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">date</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2018,&nbsp;8,&nbsp;30); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span 
style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Date&nbsp;=&nbsp;<span style="color:#1f377f;">date</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">quantity</span> &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span style="color:#1f377f;">capacity</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservations</span>&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">reservationQantities</span>.<span style="color:#74531f;">Select</span>(<span style="color:#1f377f;">q</span>&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;{&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">q</span>,&nbsp;Date&nbsp;=&nbsp;<span style="color:#1f377f;">date</span>&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#1f377f;">sut</span>.<span style="color:#74531f;">CanAccept</span>(<span style="color:#1f377f;">reservations</span>,&nbsp;<span style="color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">True</span>(<span style="color:#1f377f;">actual</span>); }</pre> </p> <p> The first five <code>[InlineData]</code> annotations just reproduce the test cases that were already present, whereas the bottom two annotations are new test cases with another <code>capacity</code>. </p> <p> How do I come up with new test cases? 
It's easy: In the happy-path case, the sum of existing reservation quantities, plus the requested quantity, must be less than or equal to the <code>capacity</code>. </p> <p> It sometimes helps to slightly reframe the test method. If you allow the collection of existing reservations to be the most variable element in the test method, you can express the other values relative to that input. For example, instead of supplying the <code>capacity</code> as an absolute number, you can express a test case's capacity in relation to the existing reservations: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(6,&nbsp;&nbsp;4,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:blue;">int</span>[0])] [<span style="color:#2b91af;">InlineData</span>(0,&nbsp;10,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:blue;">int</span>[0])] [<span style="color:#2b91af;">InlineData</span>(2,&nbsp;&nbsp;4,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;4&nbsp;})] [<span style="color:#2b91af;">InlineData</span>(1,&nbsp;&nbsp;4,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;4,&nbsp;1&nbsp;})] [<span style="color:#2b91af;">InlineData</span>(0,&nbsp;&nbsp;2,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;2,&nbsp;1,&nbsp;3,&nbsp;2&nbsp;})] [<span style="color:#2b91af;">InlineData</span>(2,&nbsp;10,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;2,&nbsp;2,&nbsp;2,&nbsp;2&nbsp;})] [<span style="color:#2b91af;">InlineData</span>(1,&nbsp;&nbsp;4,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;2,&nbsp;2,&nbsp;4,&nbsp;1,&nbsp;3,&nbsp;3&nbsp;})] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">CanAcceptWhenCapacityIsSufficient</span>( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">capacitySurplus</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">int</span>&nbsp;<span style="color:#1f377f;">quantity</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>[]&nbsp;<span style="color:#1f377f;">reservationQantities</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">date</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2018,&nbsp;8,&nbsp;30); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Date&nbsp;=&nbsp;<span style="color:#1f377f;">date</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">quantity</span> &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;=&nbsp;<span style="color:#1f377f;">reservationQantities</span>.<span style="color:#74531f;">Sum</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">capacity</span>&nbsp;=&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;+&nbsp;<span style="color:#1f377f;">quantity</span>&nbsp;+&nbsp;<span style="color:#1f377f;">capacitySurplus</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span style="color:#1f377f;">capacity</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservations</span>&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">reservationQantities</span>.<span style="color:#74531f;">Select</span>(<span 
style="color:#1f377f;">q</span>&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;{&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">q</span>,&nbsp;Date&nbsp;=&nbsp;<span style="color:#1f377f;">date</span>&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:#1f377f;">sut</span>.<span style="color:#74531f;">CanAccept</span>(<span style="color:#1f377f;">reservations</span>,&nbsp;<span style="color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">True</span>(<span style="color:#1f377f;">actual</span>); }</pre> </p> <p> Notice that the value supplied as a test argument is now named <code>capacitySurplus</code>. This represents the surplus capacity for each test case. For example, in the first test case, the <code>capacity</code> was previously supplied as the absolute number <code>10</code>. The requested quantity is <code>4</code>, and since there are no prior reservations in that test case, the capacity surplus, after accepting the reservation, is <code>6</code>. </p> <p> Likewise, in the second test case, the requested quantity is <code>10</code>, and since the absolute capacity is also <code>10</code>, when you reframe the test case, the surplus capacity, after accepting the reservation, is <code>0</code>. </p> <p> This seems odd if you aren't used to it. You'd probably intuitively think of a restaurant's <code>Capacity</code> as 'the most absolute' number, in that it's often a number that originates from physical constraints. </p> <p> When you're looking for test cases, however, you aren't looking for test cases for a particular restaurant. You're looking for test cases for an arbitrary restaurant. In other words, you're looking for test inputs that belong to the same <em>equivalence class</em>. 
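</p> <p> If the arithmetic behind that reframing is hard to see, the following framework-free sketch may help. It is only an illustration under stated assumptions: the <code>MaîtreD</code> and <code>Reservation</code> classes are minimal stand-ins that I've reconstructed from the 'desired' behaviour discussed above, <code>CapacityFor</code> is a hypothetical helper that isn't part of the article's code base, and the negative surplus in the last lines is my own addition rather than one of the test cases above. </p>

```csharp
using System;
using System.Linq;

// Minimal stand-in (an assumption), carrying only the two properties used above.
public class Reservation
{
    public DateTime Date { get; set; }
    public int Quantity { get; set; }
}

// Minimal stand-in (an assumption) for the 'desired' implementation:
// accept when all reserved seats plus the requested seats fit the capacity.
public class MaîtreD
{
    public MaîtreD(int capacity) { Capacity = capacity; }

    public int Capacity { get; }

    public bool CanAccept(Reservation[] reservations, Reservation reservation)
    {
        var reservedSeats = reservations.Sum(r => r.Quantity);
        return Capacity >= reservedSeats + reservation.Quantity;
    }
}

public static class Program
{
    // Hypothetical helper: derive the absolute capacity from the surplus,
    // exactly as the reframed test method does inline.
    public static int CapacityFor(
        int capacitySurplus, int quantity, int[] reservationQuantities)
    {
        return reservationQuantities.Sum() + quantity + capacitySurplus;
    }

    public static void Main()
    {
        var date = new DateTime(2018, 8, 30);
        var quantities = new[] { 2, 2, 4, 1, 3, 3 };
        var reservations = quantities
            .Select(q => new Reservation { Quantity = q, Date = date })
            .ToArray();
        var reservation = new Reservation { Date = date, Quantity = 4 };

        // Surplus 1: 15 reserved + 4 requested + 1 surplus = 20 seats.
        var accepting = new MaîtreD(CapacityFor(1, 4, quantities));
        Console.WriteLine(accepting.Capacity);                             // 20
        Console.WriteLine(accepting.CanAccept(reservations, reservation)); // True

        // Surplus -1 (my own illustration): one seat short, so the request is rejected.
        var rejecting = new MaîtreD(CapacityFor(-1, 4, quantities));
        Console.WriteLine(rejecting.CanAccept(reservations, reservation)); // False
    }
}
```

<p> The first case reproduces the last happy-path test case above: a surplus of <code>1</code> corresponds to the absolute capacity <code>20</code>. Any non-negative surplus lands in the same equivalence class, regardless of how many existing reservations there are. 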
</p> <h3 id="174e2338027e4f3ca0b84dd0fb6adc5f"> Property-based testing <a href="#174e2338027e4f3ca0b84dd0fb6adc5f" title="permalink">#</a> </h3> <p> I haven't explicitly stated this yet, but both the <code>capacity</code> and each reservation <code>Quantity</code> should be a positive number. This should really have been <a href="/2015/01/19/from-primitive-obsession-to-domain-modelling">captured as a proper domain object</a>, but I chose to keep these values as primitive integers in order to not complicate the example too much. </p> <p> If you look at the test parameters for the latest incarnation of <code>CanAcceptWhenCapacityIsSufficient</code>, you may now observe the following: <ul> <li><code>capacitySurplus</code> can be an arbitrary non-negative number</li> <li><code>quantity</code> can be an arbitrary positive number</li> <li><code>reservationQantities</code> can be an arbitrary array of positive numbers, including the empty array</li> </ul> This isn't too hard to express with, say, <a href="https://fscheck.github.io/FsCheck">FsCheck</a> (2.14.0): </p> <p> <pre>[<span style="color:#2b91af;">Property</span>] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">CanAcceptWhenCapacityIsSufficient</span>( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">NonNegativeInt</span>&nbsp;<span style="color:#1f377f;">capacitySurplus</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">PositiveInt</span>&nbsp;<span style="color:#1f377f;">quantity</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">PositiveInt</span>[]&nbsp;<span style="color:#1f377f;">reservationQantities</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">date</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(2018,&nbsp;8,&nbsp;30); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span 
style="color:#1f377f;">reservation</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Date&nbsp;=&nbsp;<span style="color:#1f377f;">date</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">quantity</span>.Item &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;=&nbsp;<span style="color:#1f377f;">reservationQantities</span>.<span style="color:#74531f;">Sum</span>(<span style="color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;<span style="color:#1f377f;">x</span>.Item); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">capacity</span>&nbsp;=&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;+&nbsp;<span style="color:#1f377f;">quantity</span>.Item&nbsp;+&nbsp;<span style="color:#1f377f;">capacitySurplus</span>.Item; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">MaîtreD</span>(<span style="color:#1f377f;">capacity</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservations</span>&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#1f377f;">reservationQantities</span>.<span style="color:#74531f;">Select</span>(<span style="color:#1f377f;">q</span>&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;{&nbsp;Quantity&nbsp;=&nbsp;<span style="color:#1f377f;">q</span>.Item,&nbsp;Date&nbsp;=&nbsp;<span style="color:#1f377f;">date</span>&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">actual</span>&nbsp;=&nbsp;<span 
style="color:#1f377f;">sut</span>.<span style="color:#74531f;">CanAccept</span>(<span style="color:#1f377f;">reservations</span>,&nbsp;<span style="color:#1f377f;">reservation</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.<span style="color:#74531f;">True</span>(<span style="color:#1f377f;">actual</span>); }</pre> </p> <p> This refactoring takes advantage of FsCheck's built-in wrapper types <code>NonNegativeInt</code> and <code>PositiveInt</code>. If you'd like an introduction to FsCheck, you could watch my <a href="/property-based-testing-intro">Introduction to Property-based Testing with F#</a> Pluralsight course. </p> <p> By default, FsCheck runs each property 100 times, so instead of seven test cases, you now have 100. </p> <h3 id="1eddf5bb91324b18880c78d1f1825ea6"> Limits to the Devil's Advocate technique <a href="#1eddf5bb91324b18880c78d1f1825ea6" title="permalink">#</a> </h3> <p> There's a limit to the Devil's Advocate technique. Unless you're working with <a href="/2015/02/23/property-based-testing-without-a-property-based-testing-framework">a problem where you can exhaust the entire domain of possible test cases</a>, your testing strategy is always going to be a sampling strategy. You run your automated tests with either hard-coded values or randomly generated values, but regardless, a test run isn't going to cover all possible input combinations.
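</p> <p> To make the idea of a sampling strategy concrete, here's a minimal sketch that samples the same equivalence class by hand, using only <code>System.Random</code> instead of FsCheck. The <code>Reservation</code> and <code>MaîtreD</code> classes are simplified stand-ins assumed for illustration (they take an array rather than a general sequence), and the ranges, the seed, and the names <code>SamplingSketch</code> and <code>HoldsForOneSample</code> are arbitrary choices for this example, not something FsCheck prescribes. </p>

```csharp
using System;
using System.Linq;

// Simplified, hypothetical stand-ins for the types used in the article.
public class Reservation
{
    public DateTime Date { get; set; }
    public int Quantity { get; set; }
}

public class MaîtreD
{
    public MaîtreD(int capacity)
    {
        Capacity = capacity;
    }

    public int Capacity { get; }

    // Accepts when the capacity covers the existing reservations plus the new one.
    public bool CanAccept(Reservation[] reservations, Reservation reservation)
    {
        var reservedSeats = reservations.Sum(r => r.Quantity);
        return Capacity >= reservedSeats + reservation.Quantity;
    }
}

public static class SamplingSketch
{
    // Draws one sample from the equivalence class and checks the expectation.
    public static bool HoldsForOneSample(Random rnd)
    {
        var capacitySurplus = rnd.Next(0, 101);   // non-negative: 0..100
        var quantity = rnd.Next(1, 101);          // positive: 1..100
        var reservationQuantities = Enumerable
            .Range(0, rnd.Next(0, 11))            // zero to ten prior reservations
            .Select(_ => rnd.Next(1, 101))
            .ToArray();

        var date = new DateTime(2018, 8, 30);
        var reservation = new Reservation { Date = date, Quantity = quantity };
        var reservedSeats = reservationQuantities.Sum();
        var capacity = reservedSeats + quantity + capacitySurplus;
        var sut = new MaîtreD(capacity);
        var reservations = reservationQuantities
            .Select(q => new Reservation { Quantity = q, Date = date })
            .ToArray();

        return sut.CanAccept(reservations, reservation);
    }

    public static void Main()
    {
        var rnd = new Random(42); // arbitrary seed, only for repeatability
        var allPassed = Enumerable.Range(0, 100).All(_ => HoldsForOneSample(rnd));
        Console.WriteLine(allPassed); // True
    }
}
```

<p> Every sample passes by construction, because the capacity is derived from the other values. Like the FsCheck property, the loop only visits 100 points in a much larger input space.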
</p> <p> For example, a truly hostile Devil could make this change to the <code>CanAccept</code> method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:#74531f;">CanAccept</span>(<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Reservation</span>&gt;&nbsp;<span style="color:#1f377f;">reservations</span>,&nbsp;<span style="color:#2b91af;">Reservation</span>&nbsp;<span style="color:#1f377f;">reservation</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">if</span>&nbsp;(<span style="color:#1f377f;">reservation</span>.Quantity&nbsp;==&nbsp;3953911) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">true</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;=&nbsp;<span style="color:#1f377f;">reservations</span>.<span style="color:#74531f;">Sum</span>(<span style="color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;<span style="color:#1f377f;">r</span>.Quantity); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#8f08c4;">return</span>&nbsp;<span style="color:#1f377f;">reservedSeats</span>&nbsp;+&nbsp;<span style="color:#1f377f;">reservation</span>.Quantity&nbsp;&lt;=&nbsp;Capacity; }</pre> </p> <p> Even if you increase the number of test cases that FsCheck generates to, say, 100,000, it's unlikely to find the poisonous branch. The chance of randomly generating a <code>quantity</code> of <em>exactly</em> <code>3953911</code> isn't that great. </p> <p> The Devil's Advocate technique doesn't guarantee that you'll have enough test cases to protect yourself against all sorts of odd defects. It does, however, still work well as an analysis tool to figure out if there's 'enough' test cases. 
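</p> <p> To put a rough number on how unlikely it is to hit the poisoned branch, the following sketch computes the chance that at least one of a given number of random quantities equals one specific value. It assumes, purely for illustration, that quantities are drawn uniformly from all positive 32-bit integers. FsCheck's default generators don't necessarily work like that (as far as I know, they favour small values), so the real odds are, if anything, probably lower. The names <code>PoisonedBranchOdds</code> and <code>ChanceOfHit</code> are made up for this example. </p>

```csharp
using System;

public static class PoisonedBranchOdds
{
    // Probability that at least one of 'runs' independent draws, uniform over
    // 1..int.MaxValue (an assumption), equals one particular value.
    public static double ChanceOfHit(int runs)
    {
        double perRun = 1.0 / int.MaxValue;
        return 1.0 - Math.Pow(1.0 - perRun, runs);
    }

    public static void Main()
    {
        Console.WriteLine(ChanceOfHit(100));      // the default number of runs
        Console.WriteLine(ChanceOfHit(100_000));  // the increased number of runs
    }
}
```

<p> Under that assumption, even 100,000 runs have only about a 0.005 % chance of ever producing the magic number.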
</p> <h3 id="6ad7a48fd0c04f91af236d99f0722617"> Conclusion <a href="#6ad7a48fd0c04f91af236d99f0722617" title="permalink">#</a> </h3> <p> The Devil's Advocate technique is a heuristic you can use to evaluate whether more test cases would improve confidence in the test suite. You can use it to review existing (test) code, but you can also use it as inspiration for new test cases that you should consider adding. </p> <p> The technique is to deliberately implement the system under test incorrectly. The more incorrect you can make it, the more test cases you'll be likely to have to add. </p> <p> When there's only a few test cases, you can probably get away with a decidedly unsound implementation that still passes all tests. These are often simpler than the 'intended' implementation. In this phase of applying the heuristic, this clearly demonstrates the need for more test cases. </p> <p> At a later stage, you'll have to go deliberately out of your way to produce a wrong implementation that still passes all tests. When that happens, it may be time to stop. </p> <p> The intent of the technique is to uncover how many test cases you need to protect against common defects in the future. Thus, it's not a measure of <em>current</em> code coverage. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="fd53c72c360b42999b87c87649460e78"> <div class="comment-author">Tyson Williams</div> <div class="comment-content"> <blockquote> <p> When there's only a few test cases, you can probably get away with a decidedly unsound implementation that still passes all tests. These are often simpler than the 'intended' implementation. In this phase of applying the heuristic, this clearly demonstrates the need for more test cases. </p> <p> At a later stage, you'll have to go deliberately out of your way to produce a wrong implementation that still passes all tests. When that happens, it may be time to stop. 
</p> </blockquote> <p> I like to think of this behavior as a phase transition. </p> <blockquote> Unless you're working with a problem where you can exhaust the entire domain of possible test cases, your testing strategy is always going to be a sampling strategy. </blockquote> <p> I agree with this in practice, but it is not always true in theory. A counterexample is <a href="https://en.wikipedia.org/wiki/Polynomial_interpolation">polynomial interpolation</a>. </p> <p> Normally we think of a polynomial in an indeterminate <code>x</code> of degree <code>n</code> as being specified by a list of <code>n + 1</code> coefficients, where the <code>i</code>th coefficient is the coefficient of <code>x<sup>i</sup></code>. Evaluating this polynomial given a value for <code>x</code> is easy; it just involves exponentiation, multiplication, and addition. Polynomial evaluation has a conceptual inverse called polynomial interpolation. In this direction, the input is evaluations at <code>n + 1</code> points in "general position" and the output is the <code>n + 1</code> coefficients. For example, a line is a polynomial of degree <code>1</code> and two points are in general position if they are not the same point. This is commonly expressed by the phrase "Any two (distinct) points defines a line." Three points are in general position if they are not co-linear, where co-linear means that all three points are on the same line. In general, <code>n + 1</code> points are in general position if they are not all on the same polynomial of degree (at most) <code>n - 1</code>. </p> <p> Anyway, here is the point. If a pure function is known to implement some polynomial of degree (at most) <code>n</code>, then even if the domain is infinite, there exist <code>n + 1</code> inputs such that it is sufficient to test this function for correctness on those inputs. </p> <p> This is why I think the phase transition in the Devil's advocate testing is critical.
There is some objective measure of complexity of the function under test (such as cyclomatic complexity), and we have an intuitive sense that a certain number of tests is sufficient for testing functions with that complexity. If the Devil is allowed to add monomials to the polynomial (or, heaven forbid, modify the implementation so that it is not a polynomial), then any finite number of tests can be circumvented. If instead the Devil is only allowed to modify the coefficients of the polynomial, then we have a winning strategy. </p> <blockquote> Here you may feel the urge to protest. So far, all the Devil's Advocate implementations have been objectively simpler than the 'desired' implementation because it has involved fewer elements and has had a lower or equivalent cyclomatic complexity. This new attempt to circumvent the specification seems more complex. </blockquote> <p> I think it would be exceedingly interesting if you can formally define what you mean here by "objectively". In the case of a polynomial (and speaking slightly roughly), changing the "first" nonzero coefficient to <code>0</code> decreases the complexity (i.e. the degree of the polynomial) while any other change to that coefficient or any change to any other coefficient maintains the complexity. </p> </div> <div class="comment-date">2019-10-25 01:32 UTC</div> </div> <div class="comment" id="cb15452b8c96429998efd50b67373da3"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Tyson, thank you for writing. What I meant by <em>objectively simpler</em> I partially explain in the same paragraph. I consider cyclomatic complexity one of the few useful measurements in software development. As I also imply in the article, I consider Robert C. Martin's <em>Transformation Priority Premise</em> to include a good ranking of code constructs, e.g. that using a constant is simpler than using a variable, and so on.
</p> <p> I don't think you need to reach for polynomial interpolation in order to make your point. Just consider a function that returns a constant value, like this one: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="color:#74531f;">Foo</span>(<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">i</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:#a31515;">&quot;foo&quot;</span>; }</pre> </p> <p> You can make a similar argument about this function: You only need a single test value in order to demonstrate that it works as intended. I suppose you could view that as a zero-degree polynomial. </p> <p> Beyond what you think of as the <em>phase transition</em> I sometimes try to see what happens if I slightly <em>increase</em> the complexity of a function. For the <code>Foo</code> function, it could be a change like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="color:#74531f;">Foo</span>(<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">i</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(<span style="font-weight:bold;color:#1f377f;">i</span>&nbsp;&lt;&nbsp;-1000) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:#a31515;">&quot;bar&quot;</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:#a31515;">&quot;foo&quot;</span>; }</pre> </p> <p> Unless you just happened to pick a number less than <code>-1000</code> for your test value, your test will not discover such a change. 
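</p> <p> As a small illustration, this sketch puts both variants side by side and checks whether a set of sample inputs can tell them apart. The names <code>FooComparison</code>, <code>SabotagedFoo</code>, and <code>Agree</code> are made up for this example. </p>

```csharp
using System;
using System.Linq;

public static class FooComparison
{
    // The original function: always returns the same constant.
    public static string Foo(int i)
    {
        return "foo";
    }

    // The slightly more complex variant (hypothetical name). It's written
    // with >= but is equivalent to returning "bar" when i is less than -1000.
    public static string SabotagedFoo(int i)
    {
        return i >= -1000 ? "foo" : "bar";
    }

    // True when the samples can't distinguish the two implementations.
    public static bool Agree(params int[] samples)
    {
        return samples.All(i => Foo(i) == SabotagedFoo(i));
    }

    public static void Main()
    {
        Console.WriteLine(Agree(0, 1, 42));         // True
        Console.WriteLine(Agree(0, -1000, -1001));  // False
    }
}
```

<p> Only a sample below <code>-1000</code> exposes the difference; the first call agrees on every input it tries.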
</p> <p> Your argument attempts to guard against that sort of change by assuming that we can somehow 'forbid' a change from a polynomial to something irregular. Real code doesn't work that way. Real code is rarely a continuous function, but rather discrete. That's the reason we have a concept such as <em>edge case</em>, because code branches at discrete values. </p> <p> A polynomial is a single function, regardless of degree. Implemented in code, it'll have a cyclomatic complexity of 1. That may not even be worth testing, because you'd essentially only be reproducing the implementation code in your test. </p> <p> The purpose of the Devil's Advocate technique isn't to demonstrate correctness; that's what unit tests are for. The purpose of the Devil's Advocate technique is to critique the tests. </p> <p> In reality, I never imagine that some malicious developer gains access to the source code. On the other hand, we all make mistakes, and I try to imagine what a likely mistake might look like. </p> </div> <div class="comment-date">2019-10-26 3:57 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. 10x developers https://blog.ploeh.dk/2019/09/30/10x-developers 2019-09-30T06:56:00+00:00 Mark Seemann <div id="post"> <p> <em>Do 10x developers exist? I believe that they do, but not like you may think.</em> </p> <p> The notion that some software developers are ten times (10x) as productive as 'normal' developers is decades old. Once in a while, the discussion resurfaces. It's a controversial subject, but something I've been thinking about for years, so I thought that I'd share my perspective because I don't see anyone else arguing from this position. </p> <p> While I'll try to explain my reasoning, I'll make no attempt at passing this off as anything but my current, subjective viewpoint. Please leave a comment if you have something to add. 
</p> <h3 id="11f6ec960f694758892bff549eb79d59"> Perspective <a href="#11f6ec960f694758892bff549eb79d59" title="permalink">#</a> </h3> <p> Meet Yohan. You've probably had a colleague like him. He's one of those software developers who gets things done, who never says no when the business asks him to help them out, who always responds with a smile to any request. </p> <p> I've had a few colleagues like Yohan in my career. It can be enlightening overhearing non-technical stakeholders discuss software developers: </p> <p> <strong>Alice:</strong> Yohan is such a dear; he helped me out with that feature on the web site, you know... </p> <p> <strong>Bob:</strong> Yes, he's a real go-getter. All the other programmers just say no and look pissed when I approach them about anything. </p> <p> <strong>Alice:</strong> Yohan always says yes, and he gets things done. He's a real 10x developer. </p> <p> <strong>Bob:</strong> We're so lucky we have him... </p> <p> Overhearing such a conversation can be frustrating. Yohan is your colleague, and you've just about had enough of him. Yohan is one of those developers who'll surround all code with a <code>try-catch</code> block, because then there'll be no exceptions in production. Yohan will make changes directly to the production system and tell no-one. Yohan will copy and paste code. Yohan will put business logic in database triggers, or rewrite logs, or use email as a messaging system, or call, parse, and run HTML-embedded JavaScript code on back-end servers. All 'because it's faster and provides more business value.' </p> <p> Yohan is a 10x developer. </p> <p> You, and the rest of your team, get nothing done. </p> <p> You get nothing done because you waste all your time cleaning up the trail of garbage and technical debt Yohan leaves in his wake. </p> <p> Business stakeholders may view Yohan as being orders of magnitude more productive than other developers, because most programming work is invisible and intangible.
Whether or not someone is a 10x developer is highly subjective, and depends on perspective. </p> <h3 id="57c36b775bca4e848cadf50182028191"> Context <a href="#57c36b775bca4e848cadf50182028191" title="permalink">#</a> </h3> <p> The notion that some people are orders of magnitude more productive than the 'baseline' programmer has other problems. It implicitly assumes that a 'baseline' programmer exists in the first place. Modern software development, however, is specialised. </p> <p> As an example, I've been doing test-driven, ASP.NET-based C# server-side enterprise development for decades. Drop me into a project with my favourite stack and watch me go. On the other hand, try asking me to develop a game for the Sony PlayStation, and watch me stall. </p> <p> Clearly, then, I'm a 10x developer, for the tautological reason that I'm much better at the things that I'm good at than the things I'm not good at. </p> <p> Even the greatest <a href="https://en.wikipedia.org/wiki/R_(programming_language)">R</a> developer is unlikely to be of much help on your next <a href="https://en.wikipedia.org/wiki/COBOL">COBOL</a> project. </p> <p> As always, context matters. You can be a great programmer in a particular context, and suck in another. </p> <p> This isn't limited to technology stacks. Some people prefer co-location, while others work best by themselves. Some people are detail-oriented, while others like to look at the big picture. Some people do their best work early in the morning, and others late at night. </p> <p> And some teams of 'mediocre' programmers outperform all-star teams. (This, incidentally, is a phenomenon also <a href="https://en.wikipedia.org/wiki/UEFA_Euro_1992">sometimes seen</a> in professional <a href="https://en.wikipedia.org/wiki/Association_football">Soccer</a>.) 
</p> <h3 id="34e113a19c8040b0bcb38f7abf1b532d"> Evidence <a href="#34e113a19c8040b0bcb38f7abf1b532d" title="permalink">#</a> </h3> <p> Unfortunately, as I explain in my <a href="https://cleancoders.com/video-details/humane-code-real-episode-1">Humane Code</a> video, I believe that you can't measure software development productivity. Thus, the notion of a 10x developer is subjective. </p> <p> The original idea, however, is decades old, and seems, at first glance, to originate in a 'study'. If you're curious about its origins, I can't recommend <a href="http://bit.ly/leprechauns-of-software-engineering">The Leprechauns of Software Engineering</a> enough. In that book, Laurent Bossavit explains just how insubstantial the evidence is. </p> <p> If the evidence is so weak, then why does the idea that 10x developers exist keep coming back? </p> <h3 id="6988798f60bf4622808b58a3e7f55bce"> 0x developers <a href="#6988798f60bf4622808b58a3e7f55bce" title="permalink">#</a> </h3> <p> I think that the reason that the belief is recurring is that (subjectively) <em>it seems so evident</em>. Barring confirmation bias, I'm sure everyone has encountered a team member that never seemed to get anything done. </p> <p> I know that I've certainly had that experience from time to time. </p> <p> The first job I had, I hated. I just couldn't muster any enthusiasm for the work, and I'd postpone and drag out as long as possible even the simplest task. That wasn't very mature, but I was 25 and it was my first job, and I didn't know how to handle the situation I found myself in. I'm sure that my colleagues back then found that I didn't pull my part. I didn't, and I'm not proud of it, but it's true. </p> <p> I believe now that I was in the wrong context. It wasn't that I was incapable of doing the job, but at that time in my career, I absolutely loathed it, and for that reason, I wasn't productive. 
</p> <p> Another time, I had a colleague who seemed incapable of producing anything that helped us achieve our goals. I was concerned that I'd <a href="https://en.wikipedia.org/wiki/Bozo_bit">flipped the bozo bit</a> on that colleague, so I started to collect evidence. Our Git repository had few commits from that colleague, and the few that I could find I knew had been made in collaboration with another team member. We shared an office, and I had a pretty good idea about who worked together with whom when. </p> <p> This colleague spent a lot of time talking to other people. Us, other stakeholders, or any hapless victim who didn't escape in time. Based on these meetings and discussions, we'd hear about all sorts of ideas for improvements for our code or development process, but nothing would be implemented, and rarely did it have any relevance to what we were trying to accomplish. </p> <p> I've met programmers who get nothing done more than once. Sometimes, like the above story, they're boisterous bluffs, but most often, they just sit quietly in their corner and fidget with who knows what. </p> <p> Based on the above, mind you, I'm not saying that these people are necessarily incompetent (although I suspect that some are). They might also just find themselves in a wrong context, like I did in my first job. </p> <p> It seems clear to me, then, that there's such a thing as a <em>0x developer</em>. This is a developer who gets zero times (0x) as much done as the 'average' developer. </p> <p> For that reason it seems evident to me that 10x developers exist. Any developer who regularly manages to get code deployed to production is not only ten times, but infinitely more productive than 0x developers. </p> <p> It gets worse, though. 
</p> <h3 id="fe07476743704027acab9c7b949bc3a4"> −nx developers <a href="#fe07476743704027acab9c7b949bc3a4" title="permalink">#</a> </h3> <p> Not only is it my experience that 0x developers exist, I also believe that I've met more than one <em>−nx developer</em>. These are developers who are <em>minus n times</em> 'more' productive than the 'baseline' developer. In other words, they are software developers who have negative productivity. </p> <p> I've never met anyone who I suspected of deliberately sabotaging our efforts; they always seem well-meaning, but some people can produce more mess than three colleagues can clean up. Yohan, above, is such an archetype. </p> <p> One colleague I had, long ago, was so bad that the rest of the team deliberately compartmentalised him/her. We'd ask him/her to work on an isolated piece of the system, knowing that (s)he would be assigned to another project after four months. We then secretly planned to throw away the code once (s)he was gone, and rewrite it. I don't know if that was the right decision, but since we had padded all other estimates accordingly, we made our deadlines without more than the usual overruns. </p> <p> If you accept the assertion that −nx developers exist, then clearly, anyone who gets anything done at all is an <em>∞x developer</em>. </p> <h3 id="8257729b743641d0afa1c2ddbfe1b0df"> Summary <a href="#8257729b743641d0afa1c2ddbfe1b0df" title="permalink">#</a> </h3> <p> 10x developers exist, but not in the way that people normally interpret the term. </p> <p> 10x developers exist because there's great variability in (perceived) productivity. Much of the variability is context-dependent, so it's less clear if some people are just 'better at programming' than others. Still, when we consider that people like <a href="https://en.wikipedia.org/wiki/Linus_Torvalds">Linus Torvalds</a> exist, it seems compelling that this might be the case. </p> <p> Most of the variability, however, I think correlates with environment. 
Are you working in a technology stack with which you're comfortable? Do you like what you're doing? Do you like your colleagues? Do you like your hours? Do you like your working environment? </p> <p> Still, even if we could control for all of those variables, we might still find that some people get stuff done, and some people don't. The people who get anything done are ∞x developers. </p> <p> Employers and non-technical start-up founders sometimes look for the 10x unicorns, just like they look for <em>rock star developers</em>. <blockquote> <p> "To really confuse recruiters, someone should make a programming language called Rockstar." </p> <footer><cite><a href="https://twitter.com/paulstovell/status/1013960369465782273">Paul Stovell</a></cite></footer> </blockquote> The above tweet inspired <a href="http://www.dylanbeattie.net">Dylan Beattie</a> to create <a href="https://codewithrockstar.com">the Rockstar programming language</a>. </p> <p> Perhaps we should also create a <em>10x</em> programming language, so that we could put <em>certified Rockstar programmer, 10x developer</em> on our resumes. </p> </div> It's possible that I only found it inflexible because I didn't understand it well enough, but I don't think you can argue with my experience of finding it confusing. </p> <p> I prefer a combination of <a href="https://hackage.haskell.org/package/HUnit">HUnit</a> and <a href="http://hackage.haskell.org/package/QuickCheck">QuickCheck</a>. It turns out that it's possible to test a <a href="http://hackage.haskell.org/package/wai">wai</a> application (including Servant) using only those test libraries.
</p> <h3 id="1c7a7365bd0c425e85691625d00adcd0"> Testable HTTP requests <a href="#1c7a7365bd0c425e85691625d00adcd0" title="permalink">#</a> </h3> <p> When testing against the HTTP API itself, you want something that can simulate the HTTP traffic. That capability is provided by <a href="http://hackage.haskell.org/package/wai-extra/docs/Network-Wai-Test.html">Network.Wai.Test</a>. At first, however, it wasn't entirely clear to me how that library works, but I could see that the Servant-recommended <a href="http://hackage.haskell.org/package/hspec-wai/docs/Test-Hspec-Wai.html">Test.Hspec.Wai</a> is just a thin wrapper over <em>Network.Wai.Test</em> (notice how <a href="/2019/07/01/yes-silver-bullet">open source makes such research much easier</a>). </p> <p> It turns out that <em>Network.Wai.Test</em> enables you to run your tests in a <code>Session</code> monad. You can, for example, define a simple HTTP GET request like this: </p> <p> <pre><span style="color:blue;">import</span>&nbsp;<span style="color:blue;">qualified</span>&nbsp;Data.ByteString&nbsp;<span style="color:blue;">as</span>&nbsp;BS <span style="color:blue;">import</span>&nbsp;<span style="color:blue;">qualified</span>&nbsp;Data.ByteString.Lazy&nbsp;<span style="color:blue;">as</span>&nbsp;LBS <span style="color:blue;">import</span>&nbsp;Network.HTTP.Types <span style="color:blue;">import</span>&nbsp;Network.Wai <span style="color:blue;">import</span>&nbsp;Network.Wai.Test <span style="color:#2b91af;">get</span>&nbsp;::&nbsp;<span style="color:blue;">BS</span>.<span style="color:blue;">ByteString</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Session</span>&nbsp;<span style="color:blue;">SResponse</span> get&nbsp;url&nbsp;=&nbsp;request&nbsp;$&nbsp;setPath&nbsp;defaultRequest&nbsp;{&nbsp;requestMethod&nbsp;=&nbsp;methodGet&nbsp;}&nbsp;url </pre> </p> <p> This <code>get</code> function takes a <code>url</code> and returns a <code>Session SResponse</code>. 
It uses the <code>defaultRequest</code>, so it doesn't set any specific HTTP headers. </p> <p> For HTTP POST requests, I needed a function that'd POST a JSON document to a particular URL. For that purpose, I had to do a little more work: </p> <p> <pre><span style="color:#2b91af;">postJSON</span>&nbsp;::&nbsp;<span style="color:blue;">BS</span>.<span style="color:blue;">ByteString</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">LBS</span>.<span style="color:blue;">ByteString</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Session</span>&nbsp;<span style="color:blue;">SResponse</span> postJSON&nbsp;url&nbsp;json&nbsp;=&nbsp;srequest&nbsp;$&nbsp;SRequest&nbsp;req&nbsp;json &nbsp;&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;&nbsp;&nbsp;req&nbsp;=&nbsp;setPath&nbsp;defaultRequest &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;requestMethod&nbsp;=&nbsp;methodPost &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;requestHeaders&nbsp;=&nbsp;[(hContentType,&nbsp;<span style="color:#a31515;">&quot;application/json&quot;</span>)]}&nbsp;url</pre> </p> <p> This is a little more involved than the <code>get</code> function, because it also has to supply the <code>Content-Type</code> HTTP header. If you don't supply that header with the <code>application/json</code> value, your API is going to reject the request when you attempt to post a string with a JSON object. </p> <p> Apart from that, it works the same way as the <code>get</code> function. </p> <h3 id="d726d432f53b4817b5dc9716a2fabc36"> Running a test session <a href="#d726d432f53b4817b5dc9716a2fabc36" title="permalink">#</a> </h3> <p> The <code>get</code> and <code>postJSON</code> functions both return <code>Session</code> values, so a test must run in the <code>Session</code> monad. 
This is easily done with Haskell's <code>do</code> notation; you'll see an example of that later in the article. </p> <p> First, however, you'll need a way to run a <code>Session</code>. <em>Network.Wai.Test</em> provides a function for that, called <code>runSession</code>. Besides a <code>Session a</code> value, though, it also requires an <code>Application</code> value. </p> <p> In my test library, I already have an <code>Application</code>, although it's running in <code>IO</code> (for reasons that'll take another article to explain): </p> <p> <pre><span style="color:#2b91af;">app</span>&nbsp;::&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;<span style="color:blue;">Application</span></pre> </p> <p> With this value, you can easily convert any <code>Session a</code> to <code>IO a</code>: </p> <p> <pre><span style="color:#2b91af;">runSessionWithApp</span>&nbsp;::&nbsp;<span style="color:blue;">Session</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;a runSessionWithApp&nbsp;s&nbsp;=&nbsp;app&nbsp;&gt;&gt;=&nbsp;runSession&nbsp;s</pre> </p> <p> The next step is to figure out how to turn an <code>IO a</code> into a test. </p> <h3 id="febab39aa10d4d78bfd0bc4d3d45ca8a"> Running a property <a href="#febab39aa10d4d78bfd0bc4d3d45ca8a" title="permalink">#</a> </h3> <p> You can turn an <code>IO a</code> into a <code>Property</code> with either <code>ioProperty</code> or <code>idempotentIOProperty</code>. I admit that the documentation doesn't make the distinction between the two entirely clear, but <code>ioProperty</code> sounds like the safer choice, so that's what I went for here. 
</p> <p> With <code>ioProperty</code> you now have a <code>Property</code> that you can turn into a <code>Test</code> using <code>testProperty</code> from <a href="http://hackage.haskell.org/package/test-framework-quickcheck2/docs/Test-Framework-Providers-QuickCheck2.html">Test.Framework.Providers.QuickCheck2</a>: </p> <p> <pre><span style="color:#2b91af;">appProperty</span>&nbsp;::&nbsp;(<span style="color:blue;">Functor</span>&nbsp;f,&nbsp;<span style="color:blue;">Testable</span>&nbsp;prop,&nbsp;<span style="color:blue;">Testable</span>&nbsp;(f&nbsp;<span style="color:blue;">Property</span>)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;=&gt;&nbsp;TestName&nbsp;-&gt;&nbsp;f&nbsp;(Session&nbsp;prop)&nbsp;-&gt;&nbsp;Test appProperty&nbsp;name&nbsp;= &nbsp;&nbsp;testProperty&nbsp;name&nbsp;.&nbsp;<span style="color:blue;">fmap</span>&nbsp;(ioProperty&nbsp;.&nbsp;runSessionWithApp)</pre> </p> <p> The type of this function seems more cryptic than strictly necessary. What's that <code>Functor f</code> doing there? </p> <p> The way I've written the tests, each property receives input from QuickCheck in the form of function arguments. I could have given the <code>appProperty</code> function a more restricted type, to make it clearer what's going on: </p> <p> <pre><span style="color:#2b91af;">appProperty</span>&nbsp;::&nbsp;(<span style="color:blue;">Arbitrary</span>&nbsp;a,&nbsp;<span style="color:blue;">Show</span>&nbsp;a,&nbsp;<span style="color:blue;">Testable</span>&nbsp;prop) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;=&gt;&nbsp;TestName&nbsp;-&gt;&nbsp;(a&nbsp;-&gt;&nbsp;Session&nbsp;prop)&nbsp;-&gt;&nbsp;Test appProperty&nbsp;name&nbsp;= &nbsp;&nbsp;testProperty&nbsp;name&nbsp;.&nbsp;<span style="color:blue;">fmap</span>&nbsp;(ioProperty&nbsp;.&nbsp;runSessionWithApp)</pre> </p> <p> This is the same function, just with a more restricted type. 
It states that for any <code>Arbitrary a, Show a</code>, a test is a function that takes <code>a</code> as input and returns a <code>Session prop</code>. This restricts tests to take a single input value, which means that you'll have to write all those properties in tupled, uncurried form. You could relax that requirement by introducing a <code>newtype</code> and a type class with an instance that recursively enables curried functions. That's what <a href="http://hackage.haskell.org/package/hspec-wai/docs/Test-Hspec-Wai-QuickCheck.html">Test.Hspec.Wai.QuickCheck</a> does. I decided not to add that extra level of indirection, and instead live with having to write all my properties in tupled form. </p> <p> The <code>Functor f</code> in the relaxed type above is, in actual use, the Reader functor. You'll see some examples next. </p> <h3 id="0ebc8724e39149e88e5e71763b03d499"> Properties <a href="#0ebc8724e39149e88e5e71763b03d499" title="permalink">#</a> </h3> <p> You can now define some properties. Here's a simple example: </p> <p> <pre>appProperty&nbsp;<span style="color:#a31515;">&quot;responds&nbsp;with&nbsp;404&nbsp;when&nbsp;no&nbsp;reservation&nbsp;exists&quot;</span>&nbsp;$&nbsp;\rid&nbsp;-&gt;&nbsp;<span style="color:blue;">do</span>
&nbsp;&nbsp;actual&nbsp;&lt;-&nbsp;get&nbsp;$&nbsp;<span style="color:#a31515;">&quot;/reservations/&quot;</span>&nbsp;&lt;&gt;&nbsp;toASCIIBytes&nbsp;rid
&nbsp;&nbsp;assertStatus&nbsp;404&nbsp;actual</pre> </p> <p> This is an inlined property, similar to how <a href="/2018/05/07/inlined-hunit-test-lists">I inline HUnit tests in test lists</a>. </p> <p> First, notice that the property is written as a lambda expression, which means that it fits the mould of <code>a -&gt; Session prop</code>. 
The input value <code>rid</code> (<em>reservationID</em>) is a <a href="http://hackage.haskell.org/package/uuid/docs/Data-UUID.html">UUID</a> value (for which an <code>Arbitrary</code> instance exists via <a href="http://hackage.haskell.org/package/quickcheck-instances">quickcheck-instances</a>). </p> <p> While the test runs in the <code>Session</code> monad, the <code>do</code> notation makes <code>actual</code> an <code>SResponse</code> value that you can then assert with <code>assertStatus</code> (from <em>Network.Wai.Test</em>). </p> <p> This property reproduces an interaction like this: </p> <p> <pre>&amp; curl -v http://localhost:8080/reservations/db38ac75-9ccd-43cc-864a-ce13e90a71d8
* Trying ::1:8080...
* TCP_NODELAY set
* Trying 127.0.0.1:8080...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
&gt; GET /reservations/db38ac75-9ccd-43cc-864a-ce13e90a71d8 HTTP/1.1
&gt; Host: localhost:8080
&gt; User-Agent: curl/7.65.1
&gt; Accept: */*
&gt;
* Mark bundle as not supporting multiuse
&lt; HTTP/1.1 404 Not Found
&lt; Transfer-Encoding: chunked
&lt; Date: Tue, 02 Jul 2019 18:09:51 GMT
&lt; Server: Warp/3.2.27
&lt;
* Connection #0 to host localhost left intact</pre> </p> <p> The important result is that the status code is <code>404 Not Found</code>, which is also what the property asserts. 
</p> <p> If you need more than one input value to your property, you have to write the property in tupled form: </p> <p> <pre>appProperty&nbsp;<span style="color:#a31515;">&quot;fails&nbsp;when&nbsp;reservation&nbsp;is&nbsp;POSTed&nbsp;with&nbsp;invalid&nbsp;quantity&quot;</span>&nbsp;$&nbsp;\
&nbsp;&nbsp;(ValidReservation&nbsp;r,&nbsp;NonNegative&nbsp;q)&nbsp;-&gt;&nbsp;<span style="color:blue;">do</span>
&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;invalid&nbsp;=&nbsp;r&nbsp;{&nbsp;reservationQuantity&nbsp;=&nbsp;<span style="color:blue;">negate</span>&nbsp;q&nbsp;}
&nbsp;&nbsp;actual&nbsp;&lt;-&nbsp;postJSON&nbsp;<span style="color:#a31515;">&quot;/reservations&quot;</span>&nbsp;$&nbsp;encode&nbsp;invalid
&nbsp;&nbsp;assertStatus&nbsp;400&nbsp;actual</pre> </p> <p> This property still takes a single input, but that input is a tuple where the first element is a <code>ValidReservation</code> and the second element a <code>NonNegative Int</code>. The <a href="/2019/09/02/naming-newtypes-for-quickcheck-arbitraries">ValidReservation newtype wrapper</a> ensures that <code>r</code> is a valid reservation record. Because <code>r</code> is otherwise valid, the property only exercises the path where the reservation quantity is zero or negative. It sets up that case by negating <code>q</code> and replacing the <code>reservationQuantity</code> with that negative (or zero) number. </p> <p> It then encodes (with <a href="http://hackage.haskell.org/package/aeson">aeson</a>) the <code>invalid</code> reservation and posts it using the <code>postJSON</code> function. </p> <p> Finally, it asserts that the HTTP status code is <code>400 Bad Request</code>. </p> <h3 id="ae5b3f7b07634799ad2af0d9a2ac668c"> Summary <a href="#ae5b3f7b07634799ad2af0d9a2ac668c" title="permalink">#</a> </h3> <p> After having tried using <em>Test.Hspec.Wai</em> for some time, I decided to refactor my tests to QuickCheck and HUnit. 
Once I figured out how <em>Network.Wai.Test</em> works, the remaining work wasn't too difficult. While there's little written documentation for the modules, the types (as usual) act as documentation. Using the types, and looking a little at the underlying code, I was able to figure out how to use the test API. </p> <p> You write tests against <em>wai</em> applications in the <code>Session</code> monad. You can then use <code>runSession</code> to turn the <code>Session</code> into an <code>IO</code> value. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. The architectural idea is to load a directory structure from disk into an in-memory tree, manipulate that tree, and use the resulting tree to perform the desired actions: </p> <p> <img src="/content/binary/functional-file-system-interaction.png" alt="A functional program typically loads data, transforms it, and stores it again."> </p> <p> Much of the program will manipulate the tree data, which is immutable. </p> <p> The previous article showed how to implement the <a href="/2019/09/09/picture-archivist-in-haskell">picture archivist architecture in Haskell</a>. In this article, you'll see how to do it in <a href="https://fsharp.org">F#</a>. This is essentially a port of the <a href="https://www.haskell.org">Haskell</a> code. 
</p> <h3 id="949a876ffec843e09d4faa5ae1c1b4c5"> Tree <a href="#949a876ffec843e09d4faa5ae1c1b4c5" title="permalink">#</a> </h3> <p> You can start by defining a <a href="https://en.wikipedia.org/wiki/Rose_tree">rose tree</a>: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;Tree&lt;&#39;a,&nbsp;&#39;b&gt;&nbsp;=&nbsp;Node&nbsp;<span style="color:blue;">of</span>&nbsp;&#39;a&nbsp;*&nbsp;Tree&lt;&#39;a,&nbsp;&#39;b&gt;&nbsp;list&nbsp;|&nbsp;Leaf&nbsp;<span style="color:blue;">of</span>&nbsp;&#39;b</pre> </p> <p> If you wanted to, you could put all the <code>Tree</code> code in a reusable library, because none of it is coupled to a particular application, such as <a href="https://amzn.to/2V06Kji">moving pictures</a>. You could also write a comprehensive test suite for the following functions, but in this article, I'll skip that. </p> <p> Notice that this sort of tree explicitly distinguishes between internal and leaf nodes. This is necessary because you'll need to keep track of the directory names (the internal nodes), while at the same time you'll want to enrich the leaves with additional data - data that you can't meaningfully add to the internal nodes. You'll see this later in the article. 
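</p> <p> To make the distinction concrete, here's a minimal sketch of a small tree value. The directory and file names are made up for illustration; they aren't taken from the picture archivist code base: </p> <p> <pre>type Tree&lt;'a, 'b&gt; = Node of 'a * Tree&lt;'a, 'b&gt; list | Leaf of 'b

// Hypothetical sample data: directory names in the internal nodes,
// file names in the leaves.
let sample : Tree&lt;string, string&gt; =
    Node ("photos", [
        Leaf "a.jpg";
        Node ("holiday", [Leaf "b.jpg"; Leaf "c.jpg"]);
        Node ("empty", [])])</pre> </p> <p> In this sketch, a <code>Node</code> may have an empty list of children (an empty directory), whereas a <code>Leaf</code> always carries a value. 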
</p> <p> While I typically tend to define F# types outside of modules (so that you don't have to, say, prefix the type name with the module name - <code>Tree.Tree</code> is so awkward), the rest of the tree code goes into a module, including two helper functions: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Tree&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;&#39;b&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;leaf&nbsp;=&nbsp;Leaf &nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;&#39;a&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;list&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;node&nbsp;x&nbsp;xs&nbsp;=&nbsp;Node&nbsp;(x,&nbsp;xs)</pre> </p> <p> The <code>leaf</code> function doesn't add much value, but the <code>node</code> function offers a curried alternative to the <code>Node</code> case constructor. That's occasionally useful. </p> <p> The rest of the code related to trees is also defined in the <code>Tree</code> module, but I'm going to present it formatted as free-standing functions. If you're confused about the layout of the code, the entire code base is <a href="https://github.com/ploeh/picture-archivist">available on GitHub</a>. 
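</p> <p> As a small illustration of when the curried form may be convenient, consider the following sketch. It repeats the type and the two helper functions so that it stands on its own; the file and directory names are hypothetical: </p> <p> <pre>type Tree&lt;'a, 'b&gt; = Node of 'a * Tree&lt;'a, 'b&gt; list | Leaf of 'b

module Tree =
    let leaf = Leaf
    let node x xs = Node (x, xs)

// Tree.node "photos" is a partially applied function,
// so it fits at the end of a pipeline.
let t = ["a.jpg"; "b.jpg"] |&gt; List.map Tree.leaf |&gt; Tree.node "photos"
// t = Node ("photos", [Leaf "a.jpg"; Leaf "b.jpg"])</pre> </p> <p> With the <code>Node</code> case constructor alone, you'd have to build the tuple explicitly, which tends to interrupt a pipeline like this one. 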
</p> <p> The <a href="/2019/08/05/rose-tree-catamorphism">rose tree catamorphism</a> is this <code>cata</code> function: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;a&nbsp;-&gt;&nbsp;&#39;c&nbsp;list&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;(&#39;b&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;-&gt;&nbsp;&#39;c</span> <span style="color:blue;">let</span>&nbsp;<span style="color:blue;">rec</span>&nbsp;cata&nbsp;fd&nbsp;ff&nbsp;=&nbsp;<span style="color:blue;">function</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Leaf&nbsp;x&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;ff&nbsp;x &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Node&nbsp;(x,&nbsp;xs)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;xs&nbsp;|&gt;&nbsp;List.map&nbsp;(cata&nbsp;fd&nbsp;ff)&nbsp;|&gt;&nbsp;fd&nbsp;x</pre> </p> <p> In the corresponding Haskell implementation of this architecture, I called this function <code>foldTree</code>, so why not retain that name? The short answer is that the naming conventions differ between Haskell and F#, and while I favour learning from Haskell, I still want my F# code to be as <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a> as possible. </p> <p> While I don't enforce that client code <em>must</em> use the <code>Tree</code> module name to access the functions within, I prefer to name the functions so that they make sense when used with qualified access. Having to write <code>Tree.foldTree</code> seems redundant. A more idiomatic name would be <code>fold</code>, so that you could write <code>Tree.fold</code>. The problem with that name, though, is that <code>fold</code> usually implies a list-biased <em>fold</em> (corresponding to <code>foldl</code> in Haskell), and I'll actually need that name for that particular purpose later. </p> <p> So, <code>cata</code> it is. </p> <p> In this article, tree functionality is (with one exception) directly or transitively implemented with <code>cata</code>. 
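</p> <p> If you haven't used a catamorphism before, a tiny example may help. The following sketch counts the leaves of a tree; <code>countLeaves</code> and the sample tree are hypothetical names introduced only for this illustration: </p> <p> <pre>type Tree&lt;'a, 'b&gt; = Node of 'a * Tree&lt;'a, 'b&gt; list | Leaf of 'b

let rec cata fd ff = function
    | Leaf x -&gt; ff x
    | Node (x, xs) -&gt; xs |&gt; List.map (cata fd ff) |&gt; fd x

// Each leaf counts as 1; a node ignores its own label and
// sums the counts already computed for its children.
let countLeaves t =
    cata (fun _ (counts : int list) -&gt; List.sum counts) (fun _ -&gt; 1) t

let sample = Node ("root", [Leaf 'a'; Node ("sub", [Leaf 'b'; Leaf 'c'])])
let n = countLeaves sample // 3</pre> </p> <p> The first function argument handles internal nodes, receiving the node's value and the already-reduced children; the second handles leaves. 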
</p> <h3 id="3f30722983ad47bd83c88cec4ba80983"> Filtering trees <a href="#3f30722983ad47bd83c88cec4ba80983" title="permalink">#</a> </h3> <p> It'll be useful to be able to filter the contents of a tree. For example, the picture archivist program will only move image files with valid metadata. This means that it'll need to filter out all files that aren't image files, as well as image files without valid metadata. </p> <p> It turns out that it'll be useful to supply a function that throws away <code>None</code> values from a tree of <code>option</code> leaves. This is similar to <a href="https://msdn.microsoft.com/en-us/visualfsharpdocs/conceptual/list.choose%5B't%2C'u%5D-function-%5Bfsharp%5D">List.choose</a>, so I call it <code>Tree.choose</code>: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;a&nbsp;-&gt;&nbsp;&#39;b&nbsp;option)&nbsp;-&gt;&nbsp;Tree&lt;&#39;c,&#39;a&gt;&nbsp;-&gt;&nbsp;Tree&lt;&#39;c,&#39;b&gt;&nbsp;option</span> <span style="color:blue;">let</span>&nbsp;choose&nbsp;f&nbsp;=&nbsp;cata&nbsp;(<span style="color:blue;">fun</span>&nbsp;x&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;List.choose&nbsp;id&nbsp;&gt;&gt;&nbsp;node&nbsp;x&nbsp;&gt;&gt;&nbsp;Some)&nbsp;(f&nbsp;&gt;&gt;&nbsp;Option.map&nbsp;Leaf)</pre> </p> <p> You may find the type of the function surprising. Why does it return a <code>Tree option</code>, instead of simply a <code>Tree</code>? </p> <p> While <code>List.choose</code> simply returns a list, it can do this because lists can be empty. This <code>Tree</code> type, on the other hand, can't be empty. If the purpose of <code>Tree.choose</code> is to throw away all <code>None</code> values, then how do you return a tree from <code>Leaf None</code>? </p> <p> You can't return a <code>Leaf</code> because you have no value to put in the leaf. Similarly, you can't return a <code>Node</code> because, again, you have no value to put in the node. 
</p> <p> In order to handle this edge case, then, you'll have to return <code>None</code>: </p> <p> <pre>&gt; let l : Tree&lt;string, int option&gt; = Leaf None;;
val l : Tree&lt;string,int option&gt; = Leaf None

&gt; Tree.choose id l;;
val it : Tree&lt;string,int&gt; option = None</pre> </p> <p> If you have anything other than a <code>None</code> leaf, though, you'll get a proper tree, but wrapped in an <code>option</code>: </p> <p> <pre>&gt; Tree.node "Foo" [Leaf (Some 42); Leaf None; Leaf (Some 2112)] |&gt; Tree.choose id;;
val it : Tree&lt;string,int&gt; option = Some (Node ("Foo",[Leaf 42; Leaf 2112]))</pre> </p> <p> While the resulting tree is wrapped in a <code>Some</code> case, the leaves contain unwrapped values. </p> <h3 id="32f46f2c16cf428abc39c3d79433caa6"> Bifunctor, functor, and folds <a href="#32f46f2c16cf428abc39c3d79433caa6" title="permalink">#</a> </h3> <p> Through its type class language feature, Haskell has formal definitions of <a href="/2018/03/22/functors">functors</a>, <a href="/2018/12/24/bifunctors">bifunctors</a>, and various types of <em>folds</em> (list-biased <a href="/2019/04/29/catamorphisms">catamorphisms</a>). F# doesn't have a similar degree of formalism, which means that while you can still implement the corresponding functionality, you'll have to rely on conventions to make the functions recognisable. </p> <p> It's straightforward to start with the bifunctor functionality: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;a&nbsp;-&gt;&nbsp;&#39;b)&nbsp;-&gt;&nbsp;(&#39;c&nbsp;-&gt;&nbsp;&#39;d)&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;c&gt;&nbsp;-&gt;&nbsp;Tree&lt;&#39;b,&#39;d&gt;</span>
<span style="color:blue;">let</span>&nbsp;bimap&nbsp;f&nbsp;g&nbsp;=&nbsp;cata&nbsp;(f&nbsp;&gt;&gt;&nbsp;node)&nbsp;(g&nbsp;&gt;&gt;&nbsp;leaf)</pre> </p> <p> This is, apart from the syntax differences, the same implementation as in Haskell. 
Based on <code>bimap</code>, you can also trivially implement <code>mapNode</code> and <code>mapLeaf</code> functions if you'd like, but you're not going to need those for the code in this article. You do need, however, a function that we could consider an alias of a hypothetical <code>mapLeaf</code> function: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;b&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;c&gt;</span> <span style="color:blue;">let</span>&nbsp;map&nbsp;f&nbsp;=&nbsp;bimap&nbsp;id&nbsp;f</pre> </p> <p> This makes <code>Tree</code> a functor. </p> <p> It'll also be useful to reduce a tree to a potentially more compact value, so you can add some specialised folds: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;c&nbsp;-&gt;&nbsp;&#39;a&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;(&#39;c&nbsp;-&gt;&nbsp;&#39;b&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;&#39;c&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;-&gt;&nbsp;&#39;c</span> <span style="color:blue;">let</span>&nbsp;bifold&nbsp;f&nbsp;g&nbsp;z&nbsp;t&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;flip&nbsp;f&nbsp;x&nbsp;y&nbsp;=&nbsp;f&nbsp;y&nbsp;x &nbsp;&nbsp;&nbsp;&nbsp;cata&nbsp;(<span style="color:blue;">fun</span>&nbsp;x&nbsp;xs&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;flip&nbsp;f&nbsp;x&nbsp;&gt;&gt;&nbsp;List.fold&nbsp;(&gt;&gt;)&nbsp;id&nbsp;xs)&nbsp;(flip&nbsp;g)&nbsp;t&nbsp;z <span style="color:green;">//&nbsp;(&#39;a&nbsp;-&gt;&nbsp;&#39;c&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;(&#39;b&nbsp;-&gt;&nbsp;&#39;c&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;-&gt;&nbsp;&#39;c&nbsp;-&gt;&nbsp;&#39;c</span> <span style="color:blue;">let</span>&nbsp;bifoldBack&nbsp;f&nbsp;g&nbsp;t&nbsp;z&nbsp;=&nbsp;cata&nbsp;(<span style="color:blue;">fun</span>&nbsp;x&nbsp;xs&nbsp;<span 
style="color:blue;">-&gt;</span>&nbsp;List.foldBack&nbsp;(&lt;&lt;)&nbsp;xs&nbsp;id&nbsp;&gt;&gt;&nbsp;f&nbsp;x)&nbsp;g&nbsp;t&nbsp;z</pre> </p> <p> In an attempt to emulate the F# naming conventions, I named the functions as I did. There are similar functions in the <code>List</code> and <code>Option</code> modules, for instance. If you're comparing the F# code with the Haskell code in the previous article, <code>Tree.bifold</code> corresponds to <code>bifoldl</code>, and <code>Tree.bifoldBack</code> corresponds to <code>bifoldr</code>. </p> <p> These enable you to implement folds over leaves only: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;c&nbsp;-&gt;&nbsp;&#39;b&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;&#39;c&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;-&gt;&nbsp;&#39;c</span> <span style="color:blue;">let</span>&nbsp;fold&nbsp;f&nbsp;=&nbsp;bifold&nbsp;(<span style="color:blue;">fun</span>&nbsp;x&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;x)&nbsp;f <span style="color:green;">//&nbsp;(&#39;b&nbsp;-&gt;&nbsp;&#39;c&nbsp;-&gt;&nbsp;&#39;c)&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;-&gt;&nbsp;&#39;c&nbsp;-&gt;&nbsp;&#39;c</span> <span style="color:blue;">let</span>&nbsp;foldBack&nbsp;f&nbsp;=&nbsp;bifoldBack&nbsp;(<span style="color:blue;">fun</span>&nbsp;_&nbsp;x&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;x)&nbsp;f</pre> </p> <p> These, again, enable you to implement another function that'll turn out to be useful in this article: </p> <p> <pre><span style="color:green;">//&nbsp;(&#39;b&nbsp;-&gt;&nbsp;unit)&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,&#39;b&gt;&nbsp;-&gt;&nbsp;unit</span> <span style="color:blue;">let</span>&nbsp;iter&nbsp;f&nbsp;=&nbsp;fold&nbsp;(<span style="color:blue;">fun</span>&nbsp;()&nbsp;x&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;f&nbsp;x)&nbsp;()</pre> </p> <p> The picture archivist program isn't going to explicitly need all of these, but transitively, it will. 
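</p> <p> To see how the leaf-only folds behave, here's a small sketch that repeats the definitions above as free-standing functions and applies them to a hypothetical sample tree. The names <code>sample</code>, <code>total</code>, and <code>leaves</code> are made up for this illustration: </p> <p> <pre>type Tree&lt;'a, 'b&gt; = Node of 'a * Tree&lt;'a, 'b&gt; list | Leaf of 'b

let rec cata fd ff = function
    | Leaf x -&gt; ff x
    | Node (x, xs) -&gt; xs |&gt; List.map (cata fd ff) |&gt; fd x

let bifold f g z t =
    let flip f x y = f y x
    cata (fun x xs -&gt; flip f x &gt;&gt; List.fold (&gt;&gt;) id xs) (flip g) t z

let bifoldBack f g t z =
    cata (fun x xs -&gt; List.foldBack (&lt;&lt;) xs id &gt;&gt; f x) g t z

let fold f = bifold (fun x _ -&gt; x) f
let foldBack f = bifoldBack (fun _ x -&gt; x) f
let iter f = fold (fun () x -&gt; f x) ()

// Hypothetical sample tree with integer leaves.
let sample = Node ("root", [Leaf 1; Node ("sub", [Leaf 2; Leaf 3])])

// Accumulate over the leaves from left to right.
let total = fold (+) 0 sample                           // 6
// Cons-ing from the right keeps the leaves in their original order.
let leaves = foldBack (fun x acc -&gt; x :: acc) sample [] // [1; 2; 3]
// Perform a side effect for each leaf, left to right.
iter (printfn "%i") sample</pre> </p> <p> Both folds ignore the values in the internal nodes; only the leaves contribute to the result. 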
</p> <h3 id="8a9a50c69a2d461cac5bb87fa4cf3cd9"> Moving pictures <a href="#8a9a50c69a2d461cac5bb87fa4cf3cd9" title="permalink">#</a> </h3> <p> So far, all the code shown here could be in a general-purpose reusable library, since it contains no functionality specifically related to image files. The rest of the code in this article, however, will be specific to the program. I'll put the domain model code in another module that I call <code>Archive</code>. Later in the article, we'll look at how to load a tree from the file system, but for now, we'll just pretend that we have such a tree. </p> <p> The major logic of the program is to create a destination tree based on a source tree. The leaves of the tree will have to carry some extra information apart from a file path, so you can introduce a specific type to capture that information: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;PhotoFile&nbsp;=&nbsp;{&nbsp;File&nbsp;:&nbsp;FileInfo;&nbsp;TakenOn&nbsp;:&nbsp;DateTime&nbsp;}</pre> </p> <p> A <code>PhotoFile</code> not only contains the file path for an image file, but also the date the photo was taken. This date can be extracted from the file's metadata, but that's an impure operation, so we'll delegate that work to the start of the program. We'll return to that later. 
</p> <p> Given a source tree of <code>PhotoFile</code> leaves, though, the program must produce a destination tree of files: </p> <p> <pre><span style="color:green;">//&nbsp;string&nbsp;-&gt;&nbsp;Tree&lt;&#39;a,PhotoFile&gt;&nbsp;-&gt;&nbsp;Tree&lt;string,FileInfo&gt;</span> <span style="color:blue;">let</span>&nbsp;moveTo&nbsp;destination&nbsp;t&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;dirNameOf&nbsp;(dt&nbsp;:&nbsp;DateTime)&nbsp;=&nbsp;sprintf&nbsp;<span style="color:#a31515;">&quot;%d-%02d&quot;</span>&nbsp;dt.Year&nbsp;dt.Month &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;groupByDir&nbsp;pf&nbsp;m&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;key&nbsp;=&nbsp;dirNameOf&nbsp;pf.TakenOn &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;dir&nbsp;=&nbsp;Map.tryFind&nbsp;key&nbsp;m&nbsp;|&gt;&nbsp;Option.defaultValue&nbsp;[] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Map.add&nbsp;key&nbsp;(pf.File&nbsp;::&nbsp;dir)&nbsp;m &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;addDir&nbsp;name&nbsp;files&nbsp;dirs&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Tree.node&nbsp;name&nbsp;(List.map&nbsp;Leaf&nbsp;files)&nbsp;::&nbsp;dirs &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;m&nbsp;=&nbsp;Tree.foldBack&nbsp;groupByDir&nbsp;t&nbsp;Map.empty &nbsp;&nbsp;&nbsp;&nbsp;Map.foldBack&nbsp;addDir&nbsp;m&nbsp;[]&nbsp;|&gt;&nbsp;Tree.node&nbsp;destination</pre> </p> <p> This <code>moveTo</code> function looks, perhaps, overwhelming, but it's composed of three conceptual steps: <ol> <li>Create a map of destination folders (<code>m</code>).</li> <li>Create a list of branches from the map (<code>Map.foldBack addDir m []</code>).</li> <li>Create a tree from the list (<code>Tree.node destination</code>).</li> </ol> The <code>moveTo</code> function starts by folding the input data into a map 
<code>m</code>. The map is keyed by the directory name, which is formatted by the <code>dirNameOf</code> function. This function takes a <code>DateTime</code> as input and formats it to a <code>YYYY-MM</code> format. For example, December 20, 2018 becomes <code>"2018-12"</code>. </p> <p> The entire mapping step groups the <code>PhotoFile</code> values into a map of the type <code>Map&lt;string,FileInfo list&gt;</code>. All the image files taken in April 2014 are added to the list with the <code>"2014-04"</code> key, all the image files taken in July 2011 are added to the list with the <code>"2011-07"</code> key, and so on. </p> <p> In the next step, the <code>moveTo</code> function converts the map to a list of trees. This will be the branches (or sub-directories) of the <code>destination</code> directory. Because of the desired structure of the destination tree, this is a list of shallow branches. Each node contains only leaves. </p> <p> <img src="/content/binary/shallow-photo-destination-directories.png" alt="Shallow photo destination directories."> </p> <p> The only remaining step is to add that list of branches to a <code>destination</code> node. This is done by piping (<code>|&gt;</code>) the list of sub-directories into <code>Tree.node destination</code>. </p> <p> Since this is a <a href="https://en.wikipedia.org/wiki/Pure_function">pure function</a>, it's <a href="/2015/05/07/functional-design-is-intrinsically-testable">easy to unit test</a>. Just create some test cases and call the function. First, the test cases. 
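</p> <p> Before turning to the tests, the first two steps can be sketched in isolation. This is a simplified illustration, not the code from the repository: it assumes plain file names (strings) instead of <code>FileInfo</code> values, and it folds over a list with <code>List.foldBack</code> rather than over a tree. The <code>photos</code> data is hypothetical: </p> <p> <pre>open System

let dirNameOf (dt : DateTime) = sprintf "%d-%02d" dt.Year dt.Month

// Add a (file name, date) pair to the list stored under its year-month key.
let groupByDir (name, takenOn) m =
    let key = dirNameOf takenOn
    let dir = Map.tryFind key m |&gt; Option.defaultValue []
    Map.add key (name :: dir) m

// Hypothetical input data.
let photos = [
    ("a", DateTime (2010, 1, 12));
    ("b", DateTime (2010, 3, 12));
    ("c", DateTime (2010, 1, 21))]

// Step 1: folding from the right keeps files in their original order.
let m = List.foldBack groupByDir photos Map.empty
// m = map [("2010-01", ["a"; "c"]); ("2010-03", ["b"])]

// Step 2: turn the map into a list of (directory, files) pairs, sorted by key.
let dirs = Map.foldBack (fun name files acc -&gt; (name, files) :: acc) m []
// dirs = [("2010-01", ["a"; "c"]); ("2010-03", ["b"])]</pre> </p> <p> In the real function, each pair becomes a <code>Tree.node</code> with a list of leaves, and the resulting list becomes the children of the <code>destination</code> node. 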
</p> <p> In this code base, I'm using <a href="https://xunit.github.io">xUnit.net</a> 2.4.1, so I'll first create a set of test cases as a test-specific class: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;MoveToDestinationTestData&nbsp;()&nbsp;<span style="color:blue;">as</span>&nbsp;this&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">inherit</span>&nbsp;TheoryData&lt;Tree&lt;string,&nbsp;PhotoFile&gt;,&nbsp;string,&nbsp;Tree&lt;string,&nbsp;string&gt;&gt;&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;photoLeaf&nbsp;name&nbsp;(y,&nbsp;mth,&nbsp;d,&nbsp;h,&nbsp;m,&nbsp;s)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;{&nbsp;File&nbsp;=&nbsp;FileInfo&nbsp;name;&nbsp;TakenOn&nbsp;=&nbsp;DateTime&nbsp;(y,&nbsp;mth,&nbsp;d,&nbsp;h,&nbsp;m,&nbsp;s)&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>&nbsp;(2018,&nbsp;11,&nbsp;9,&nbsp;11,&nbsp;47,&nbsp;17), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;D&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(&nbsp;<span style="color:#a31515;">&quot;D&quot;</span>,&nbsp;[Node&nbsp;(<span style="color:#a31515;">&quot;2018-11&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>])])) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;S&quot;</span>,&nbsp;[photoLeaf&nbsp;<span style="color:#a31515;">&quot;4&quot;</span>&nbsp;(1972,&nbsp;6,&nbsp;6,&nbsp;16,&nbsp;15,&nbsp;0)]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;D&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;D&quot;</span>,&nbsp;[Node&nbsp;(<span 
style="color:#a31515;">&quot;1972-06&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;4&quot;</span>])])) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;S&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;L&quot;</span>&nbsp;(2002,&nbsp;10,&nbsp;12,&nbsp;17,&nbsp;16,&nbsp;15); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;J&quot;</span>&nbsp;(2007,&nbsp;4,&nbsp;21,&nbsp;17,&nbsp;18,&nbsp;19)]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;D&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;D&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2002-10&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;L&quot;</span>]); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2007-04&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;J&quot;</span>])])) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;(2010,&nbsp;1,&nbsp;12,&nbsp;17,&nbsp;16,&nbsp;15); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>&nbsp;(2010,&nbsp;3,&nbsp;12,&nbsp;17,&nbsp;16,&nbsp;15); 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>&nbsp;(2010,&nbsp;1,&nbsp;21,&nbsp;17,&nbsp;18,&nbsp;19)]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2010-01&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>;&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>]); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2010-03&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>])])) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;foo&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;bar&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;(2010,&nbsp;1,&nbsp;12,&nbsp;17,&nbsp;16,&nbsp;15); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>&nbsp;(2010,&nbsp;3,&nbsp;12,&nbsp;17,&nbsp;16,&nbsp;15); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>&nbsp;(2010,&nbsp;1,&nbsp;21,&nbsp;17,&nbsp;18,&nbsp;19)]); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span 
style="color:#a31515;">&quot;baz&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;d&quot;</span>&nbsp;(2010,&nbsp;3,&nbsp;1,&nbsp;2,&nbsp;3,&nbsp;4); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;photoLeaf&nbsp;<span style="color:#a31515;">&quot;e&quot;</span>&nbsp;(2011,&nbsp;3,&nbsp;4,&nbsp;3,&nbsp;2,&nbsp;1)])]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;qux&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;qux&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2010-01&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>;&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>]); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2010-03&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>;&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;d&quot;</span>]); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;2011-03&quot;</span>,&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;e&quot;</span>])]))</pre> </p> <p> That looks like a lot of code, but is really just a list of test cases. Each test case is a triple of a source tree, a destination directory name, and an expected result (another tree). 
</p> <p> The test itself, on the other hand, is compact: </p> <p> <pre>[&lt;Theory;&nbsp;ClassData(typeof&lt;MoveToDestinationTestData&gt;)&gt;] <span style="color:blue;">let</span>&nbsp;``Move&nbsp;to&nbsp;destination``&nbsp;source&nbsp;destination&nbsp;expected&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;Archive.moveTo&nbsp;destination&nbsp;source &nbsp;&nbsp;&nbsp;&nbsp;expected&nbsp;=!&nbsp;Tree.map&nbsp;string&nbsp;actual</pre> </p> <p> The <code>=!</code> operator comes from <a href="https://github.com/SwensenSoftware/unquote">Unquote</a> and means something like <em>must equal</em>. It's an assertion that will throw an exception if <code>expected</code> isn't equal to <code>Tree.map string actual</code>. </p> <p> The reason that the assertion maps <code>actual</code> to a tree of strings is that <code>actual</code> is a <code>Tree&lt;string,FileInfo&gt;</code>, but <code>FileInfo</code> doesn't have structural equality. So either I had to implement a test-specific equality comparer for <code>FileInfo</code> (and for <code>Tree&lt;string,FileInfo&gt;</code>), or map the tree to something with proper equality, such as a <code>string</code>. I chose the latter. </p> <h3 id="abe95ba6865745bc9df8004079d8a250"> Calculating moves <a href="#abe95ba6865745bc9df8004079d8a250" title="permalink">#</a> </h3> <p> One pure step remains. The result of calling the <code>moveTo</code> function is a tree with the desired structure. In order to actually move the files, though, for each file you'll need to keep track of both the source path and the destination path. To make that explicit, you can define a type for that purpose: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;Move&nbsp;=&nbsp;{&nbsp;Source&nbsp;:&nbsp;FileInfo;&nbsp;Destination&nbsp;:&nbsp;FileInfo&nbsp;}</pre> </p> <p> A <code>Move</code> is simply a data structure. 
Contrast this with typical object-oriented design, where it would be a (possibly polymorphic) method on an object. In functional programming, you'll regularly model <em>intent</em> with a data structure. As long as intents remain data, you can easily manipulate them, and once you're done with that, you can run an interpreter over your data structure to perform the work you want accomplished. </p> <p> The unit test cases for the <code>moveTo</code> function suggest that file names are local file names like <code>"L"</code>, <code>"J"</code>, <code>"a"</code>, and so on. That was only to make the tests as compact as possible, since the function actually doesn't manipulate the specific <code>FileInfo</code> objects. </p> <p> In reality, the file names will most likely be longer, and they could also contain the full path, instead of the local path: <code>"C:\foo\bar\a.jpg"</code>. </p> <p> If you call <code>moveTo</code> with a tree where each leaf has a fully qualified path, the output tree will have the desired structure of the destination tree, but the leaves will still contain the full path to each source file. 
That means that you can calculate a <code>Move</code> for each file: </p> <p> <pre><span style="color:green;">//&nbsp;Tree&lt;string,FileInfo&gt;&nbsp;-&gt;&nbsp;Tree&lt;string,Move&gt;</span> <span style="color:blue;">let</span>&nbsp;calculateMoves&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;replaceDirectory&nbsp;(f&nbsp;:&nbsp;FileInfo)&nbsp;d&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;FileInfo&nbsp;(Path.Combine&nbsp;(d,&nbsp;f.Name)) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;<span style="color:blue;">rec</span>&nbsp;imp&nbsp;path&nbsp;=&nbsp;<span style="color:blue;">function</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Leaf&nbsp;x&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;{&nbsp;Source&nbsp;=&nbsp;x;&nbsp;Destination&nbsp;=&nbsp;replaceDirectory&nbsp;x&nbsp;path&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Node&nbsp;(x,&nbsp;xs)&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;newNPath&nbsp;=&nbsp;Path.Combine&nbsp;(path,&nbsp;x) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Tree.node&nbsp;newNPath&nbsp;(List.map&nbsp;(imp&nbsp;newNPath)&nbsp;xs) &nbsp;&nbsp;&nbsp;&nbsp;imp&nbsp;<span style="color:#a31515;">&quot;&quot;</span></pre> </p> <p> This function takes as input a <code>Tree&lt;string,FileInfo&gt;</code>, which is compatible with the output of <code>moveTo</code>. It returns a <code>Tree&lt;string,Move&gt;</code>, i.e. a tree where the leaves are <code>Move</code> values. </p> <p> Earlier, I wrote that you can implement desired <code>Tree</code> functionality with the <code>cata</code> function, but that was a simplification. If you can implement the functionality of <code>calculateMoves</code> with <code>cata</code>, I don't know how. 
You can, however, implement it using explicit pattern matching and simple recursion. </p> <p> The <code>imp</code> function builds up a file path as it recursively negotiates the tree. All <code>Leaf</code> nodes are converted to a <code>Move</code> value using the leaf node's current <code>FileInfo</code> value as the <code>Source</code>, and the <code>path</code> to figure out the desired <code>Destination</code>. </p> <p> This code is still easy to unit test. First, test cases: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;CalculateMovesTestData&nbsp;()&nbsp;<span style="color:blue;">as</span>&nbsp;this&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">inherit</span>&nbsp;TheoryData&lt;Tree&lt;string,&nbsp;FileInfo&gt;,&nbsp;Tree&lt;string,&nbsp;(string&nbsp;*&nbsp;string)&gt;&gt;&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;(Leaf&nbsp;(FileInfo&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>),&nbsp;Leaf&nbsp;(<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>)) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;[Leaf&nbsp;(FileInfo&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>)]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;[Leaf&nbsp;(<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>))])) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;[Leaf&nbsp;(FileInfo&nbsp;<span 
style="color:#a31515;">&quot;1&quot;</span>);&nbsp;Leaf&nbsp;(FileInfo&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>)]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>)); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(<span style="color:#a31515;">&quot;2&quot;</span>,&nbsp;Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>))])) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">do</span>&nbsp;this.Add&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;b&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(FileInfo&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(FileInfo&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>)]); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;c&quot;</span>,&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(FileInfo&nbsp;<span style="color:#a31515;">&quot;3&quot;</span>)])]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;[ 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>),&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>)); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(<span style="color:#a31515;">&quot;2&quot;</span>,&nbsp;Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>))]); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>),&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;(<span style="color:#a31515;">&quot;3&quot;</span>,&nbsp;Path.Combine&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;3&quot;</span>))])]))</pre> </p> <p> The test cases in this parametrised test are tuples of an input tree and the expected tree. 
For each test case, the test calls the <code>Archive.calculateMoves</code> function with <code>tree</code> and asserts that the <code>actual</code> tree is equal to the <code>expected</code> tree: </p> <p> <pre>[&lt;Theory;&nbsp;ClassData(typeof&lt;CalculateMovesTestData&gt;)&gt;] <span style="color:blue;">let</span>&nbsp;``Calculate&nbsp;moves``&nbsp;tree&nbsp;expected&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;Archive.calculateMoves&nbsp;tree &nbsp;&nbsp;&nbsp;&nbsp;expected&nbsp;=!&nbsp;Tree.map&nbsp;(<span style="color:blue;">fun</span>&nbsp;m&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;(m.Source.ToString&nbsp;(),&nbsp;m.Destination.ToString&nbsp;()))&nbsp;actual</pre> </p> <p> Again, the test maps <code>FileInfo</code> objects to <code>strings</code> to support easy comparison. </p> <p> That's all the pure code you need in order to implement the desired functionality. Now you only need to write some code that loads a tree from disk, and imprints a destination tree to disk, as well as the code that composes it all. </p> <h3 id="bac6be79cf8c44a7b47923e2ec90d99f"> Loading a tree from disk <a href="#bac6be79cf8c44a7b47923e2ec90d99f" title="permalink">#</a> </h3> <p> The remaining code in this article is impure. You could put it in dedicated modules, but for this program, you're only going to need three functions and a bit of composition code, so you could also just put it all in the <code>Program</code> module. That's what I did. </p> <p> To load a tree from disk, you'll need a root directory, under which you load the entire tree. 
Given a directory path, you read a tree using a recursive function like this: </p> <p> <pre><span style="color:green;">//&nbsp;string&nbsp;-&gt;&nbsp;Tree&lt;string,string&gt;</span> <span style="color:blue;">let</span>&nbsp;<span style="color:blue;">rec</span>&nbsp;readTree&nbsp;path&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;File.Exists&nbsp;path &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">then</span>&nbsp;Leaf&nbsp;path &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;dirsAndFiles&nbsp;=&nbsp;Directory.EnumerateFileSystemEntries&nbsp;path &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;branches&nbsp;=&nbsp;Seq.map&nbsp;readTree&nbsp;dirsAndFiles&nbsp;|&gt;&nbsp;Seq.toList &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(path,&nbsp;branches)</pre> </p> <p> This recursive function starts by checking whether the <code>path</code> is a file that exists. If it does, the path is a file, so it creates a new <code>Leaf</code> with that path. </p> <p> If <code>path</code> isn't a file, it's a directory. In that case, use <code>Directory.EnumerateFileSystemEntries</code> to enumerate all the directories and files in that directory, and map all those directory entries recursively. That produces all the <code>branches</code> for the current node. Finally, return a new <code>Node</code> with the <code>path</code> and the <code>branches</code>. </p> <h3 id="7f5e06eb61024264ad214d41b63a8a74"> Loading metadata <a href="#7f5e06eb61024264ad214d41b63a8a74" title="permalink">#</a> </h3> <p> The <code>readTree</code> function only produces a tree with <code>string</code> leaves, while the program requires a tree with <code>PhotoFile</code> leaves. 
You'll need to read the <a href="https://en.wikipedia.org/wiki/Exif">Exif</a> metadata from each file and enrich the tree with the <em>date-taken</em> data. </p> <p> In this code base, I've written a little <code>Photo</code> module to extract the desired metadata from an image file. I'm not going to list all the code here; if you're interested, the code is <a href="https://github.com/ploeh/picture-archivist">available on GitHub</a>. The <code>Photo</code> module enables you to write an impure operation like this: </p> <p> <pre><span style="color:green;">//&nbsp;FileInfo&nbsp;-&gt;&nbsp;PhotoFile&nbsp;option</span> <span style="color:blue;">let</span>&nbsp;readPhoto&nbsp;file&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;Photo.extractDateTaken&nbsp;file &nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Option.map&nbsp;(<span style="color:blue;">fun</span>&nbsp;dateTaken&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;{&nbsp;File&nbsp;=&nbsp;file;&nbsp;TakenOn&nbsp;=&nbsp;dateTaken&nbsp;})</pre> </p> <p> This operation can fail for various reasons: <ul> <li>The file may not exist.</li> <li>The file exists, but has no metadata.</li> <li>The file has metadata, but no <em>date-taken</em> metadata.</li> <li>The <em>date-taken</em> metadata string is malformed.</li> </ul> When you traverse a <code>Tree&lt;string,string&gt;</code> with <code>readPhoto</code>, you'll get a <code>Tree&lt;string,PhotoFile option&gt;</code>. That's when you'll need <code>Tree.choose</code>. You'll see this soon. </p> <h3 id="59159ef499884e10ae92e5ef6e666c36"> Writing a tree to disk <a href="#59159ef499884e10ae92e5ef6e666c36" title="permalink">#</a> </h3> <p> The above <code>calculateMoves</code> function creates a <code>Tree&lt;string,Move&gt;</code>. The final piece of impure code you'll need to write is an operation that traverses such a tree and executes each <code>Move</code>. 
</p> <p> <pre><span style="color:green;">//&nbsp;Tree&lt;&#39;a,Move&gt;&nbsp;-&gt;&nbsp;unit</span> <span style="color:blue;">let</span>&nbsp;writeTree&nbsp;t&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;copy&nbsp;m&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Directory.CreateDirectory&nbsp;m.Destination.DirectoryName&nbsp;|&gt;&nbsp;ignore &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;m.Source.CopyTo&nbsp;m.Destination.FullName&nbsp;|&gt;&nbsp;ignore &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;printfn&nbsp;<span style="color:#a31515;">&quot;Copied&nbsp;to&nbsp;%s&quot;</span>&nbsp;m.Destination.FullName &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;compareFiles&nbsp;m&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;sourceStream&nbsp;=&nbsp;File.ReadAllBytes&nbsp;m.Source.FullName &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;destinationStream&nbsp;=&nbsp;File.ReadAllBytes&nbsp;m.Destination.FullName &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sourceStream&nbsp;=&nbsp;destinationStream &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;move&nbsp;m&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;copy&nbsp;m &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;compareFiles&nbsp;m&nbsp;<span style="color:blue;">then</span>&nbsp;m.Source.Delete&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;Tree.iter&nbsp;move&nbsp;t</pre> </p> <p> The <code>writeTree</code> function traverses the input tree, and for each <code>Move</code>, it first copies the file, then it verifies that the copy was successful, and finally, if that's the case, it deletes the source file. 
</p> <h3 id="f30093164b184bbf877f307fa4cf4c63"> Composition <a href="#f30093164b184bbf877f307fa4cf4c63" title="permalink">#</a> </h3> <p> You can now compose an <em>impure-pure-impure sandwich</em> from all the Lego pieces: </p> <p> <pre><span style="color:green;">//&nbsp;string&nbsp;-&gt;&nbsp;string&nbsp;-&gt;&nbsp;unit</span> <span style="color:blue;">let</span>&nbsp;movePhotos&nbsp;source&nbsp;destination&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;sourceTree&nbsp;=&nbsp;readTree&nbsp;source&nbsp;|&gt;&nbsp;Tree.map&nbsp;FileInfo &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;photoTree&nbsp;=&nbsp;Tree.choose&nbsp;readPhoto&nbsp;sourceTree &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;destinationTree&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Option.map&nbsp;(Archive.moveTo&nbsp;destination&nbsp;&gt;&gt;&nbsp;Archive.calculateMoves)&nbsp;photoTree &nbsp;&nbsp;&nbsp;&nbsp;Option.iter&nbsp;writeTree&nbsp;destinationTree</pre> </p> <p> First, you load the <code>sourceTree</code> using the <code>readTree</code> operation. This returns a <code>Tree&lt;string,string&gt;</code>, so map the leaves to <code>FileInfo</code> objects. You then load the image metadata by traversing <code>sourceTree</code> with <code>Tree.choose readPhoto</code>. Each call to <code>readPhoto</code> produces a <code>PhotoFile option</code>, so this is where you want to use <code>Tree.choose</code> to throw all the <code>None</code> values away. </p> <p> Those two lines of code are the initial impure step of the sandwich (yes: mixed metaphors, I know). </p> <p> The pure part of the sandwich is the composition of the pure functions <code>moveTo</code> and <code>calculateMoves</code>. Since <code>photoTree</code> is a <code>Tree&lt;string,PhotoFile&gt; option</code>, you'll need to perform that transformation inside of <code>Option.map</code>. 
The resulting <code>destinationTree</code> is a <code>Tree&lt;string,Move&gt; option</code>. </p> <p> The final, impure step of the sandwich, then, is to apply all the moves with <code>writeTree</code>. </p> <h3 id="ab0013f79c184586a10aa014db496bef"> Execution <a href="#ab0013f79c184586a10aa014db496bef" title="permalink">#</a> </h3> <p> The <code>movePhotos</code> operation takes <code>source</code> and <code>destination</code> arguments. You could hypothetically call it from a rich client or a background process, but here I'll just call it from a command-line program. The <code>main</code> operation will have to parse the input arguments and call <code>movePhotos</code>: </p> <p> <pre>[&lt;EntryPoint&gt;] <span style="color:blue;">let</span>&nbsp;main&nbsp;argv&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;argv&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;[|source;&nbsp;destination|]&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;movePhotos&nbsp;source&nbsp;destination &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;printfn&nbsp;<span style="color:#a31515;">&quot;Please&nbsp;provide&nbsp;source&nbsp;and&nbsp;destination&nbsp;directories&nbsp;as&nbsp;arguments.&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;<span style="color:green;">//&nbsp;return&nbsp;an&nbsp;integer&nbsp;exit&nbsp;code</span></pre> </p> <p> You could write more sophisticated parsing of the program arguments, but that's not the topic of this article, so I only wrote the bare minimum required to get the program working. 
</p> <p> You can now compile and run the program: </p> <p> <pre>$ ./ArchivePictures "C:\Users\mark\Desktop\Test" "C:\Users\mark\Desktop\Test-Out"
Copied to C:\Users\mark\Desktop\Test-Out\2003-04\2003-04-29 15.11.50.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2011-07\2011-07-10 13.09.36.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2014-04\2014-04-18 14.05.02.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2014-04\2014-04-17 17.11.40.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2014-05\2014-05-23 16.07.20.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2014-06\2014-06-21 16.48.40.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2014-06\2014-06-30 15.44.52.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2016-05\2016-05-01 09.25.23.jpg
Copied to C:\Users\mark\Desktop\Test-Out\2017-08\2017-08-22 19.53.28.jpg</pre> </p> <p> This does indeed produce the expected destination directory structure. </p> <p> <img src="/content/binary/picture-archivist-destination-directory.png" alt="Seven example directories with pictures."> </p> <p> It's always nice when something turns out to work in practice, as well as in theory. </p> <h3 id="3e4503b89d8f4b81b8b9cac9d1f39021"> Summary <a href="#3e4503b89d8f4b81b8b9cac9d1f39021" title="permalink">#</a> </h3> <p> <a href="/2018/11/19/functional-architecture-a-definition">Functional software architecture</a> involves separating pure from impure code so that no pure functions invoke impure operations. Often, you can achieve that with what I call the <em>impure-pure-impure sandwich</em> architecture. In this example, you saw how to model the file system as a tree. This enables you to separate the impure file interactions from the pure program logic. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. 
Picture archivist in Haskell https://blog.ploeh.dk/2019/09/09/picture-archivist-in-haskell 2019-09-09T08:19:00+00:00 Mark Seemann <div id="post"> <p> <em>A comprehensive code example showing how to implement a functional architecture in Haskell.</em> </p> <p> This article shows how to implement the <a href="/2019/08/26/functional-file-system">picture archivist architecture described in the previous article</a>. In short, the task is to move some image files to directories based on their date-taken metadata. You could also write a comprehensive test suite for the following functions, but in this article, I'll skip that. </p> <p> Notice that this sort of tree explicitly distinguishes between internal and leaf nodes. This is necessary because you'll need to keep track of the directory names (the internal nodes), while at the same time you'll want to enrich the leaves with additional data - data that you can't meaningfully add to the internal nodes. You'll see this later in the article. </p> <p> The <a href="/2019/08/05/rose-tree-catamorphism">rose tree catamorphism</a> is this <code>foldTree</code> function: </p> <p> <pre><span style="color:#2b91af;">foldTree</span>&nbsp;::&nbsp;(a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;[c]&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;(b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Tree</span>&nbsp;a&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c foldTree&nbsp;&nbsp;_&nbsp;fl&nbsp;(Leaf&nbsp;x)&nbsp;=&nbsp;fl&nbsp;x foldTree&nbsp;fn&nbsp;fl&nbsp;(Node&nbsp;x&nbsp;xs)&nbsp;=&nbsp;fn&nbsp;x&nbsp;$&nbsp;foldTree&nbsp;fn&nbsp;fl&nbsp;&lt;$&gt;&nbsp;xs</pre> </p> <p> Sometimes I name the catamorphism <code>cata</code>, sometimes something like <code>tree</code>, but using a library like <code>Data.Tree</code> as another source of inspiration, in this article I chose to name it 
<code>foldTree</code>. </p> <p> In this article, tree functionality is (with one exception) directly or transitively implemented with <code>foldTree</code>. </p> <h3 id="f5541d8a36b04cf9a455824c5f3a21c7"> Filtering trees <a href="#f5541d8a36b04cf9a455824c5f3a21c7" title="permalink">#</a> </h3> <p> It'll be useful to be able to filter the contents of a tree. For example, the picture archivist program will only move image files with valid metadata. This means that it'll need to filter out all files that aren't image files, as well as image files without valid metadata. </p> <p> It turns out that it'll be useful to supply a function that throws away <code>Nothing</code> values from a tree of <code>Maybe</code> leaves. This is similar to the <code>catMaybes</code> function from <code>Data.Maybe</code>, so I call it <code>catMaybeTree</code>: </p> <p> <pre><span style="color:#2b91af;">catMaybeTree</span>&nbsp;::&nbsp;<span style="color:blue;">Tree</span>&nbsp;a&nbsp;(<span style="color:#2b91af;">Maybe</span>&nbsp;b)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;(<span style="color:blue;">Tree</span>&nbsp;a&nbsp;b) catMaybeTree&nbsp;=&nbsp;foldTree&nbsp;(\x&nbsp;-&gt;&nbsp;Just&nbsp;.&nbsp;Node&nbsp;x&nbsp;.&nbsp;catMaybes)&nbsp;(<span style="color:blue;">fmap</span>&nbsp;Leaf)</pre> </p> <p> You may find the type of the function surprising. Why does it return a <code>Maybe Tree</code>, instead of simply a <code>Tree</code>? And if you accept the type as given, isn't this simply the <code>sequence</code> function? </p> <p> While <code>catMaybes</code> simply returns a list, it can do this because lists can be empty. This <code>Tree</code> type, on the other hand, can't be empty. If the purpose of <code>catMaybeTree</code> is to throw away all <code>Nothing</code> values, then how do you return a tree from <code>Leaf Nothing</code>? </p> <p> You can't return a <code>Leaf</code> because you have no value to put in the leaf. 
Similarly, you can't return a <code>Node</code> because, again, you have no value to put in the node. </p> <p> In order to handle this edge case, then, you'll have to return <code>Nothing</code>: </p> <p> <pre>Prelude Tree&gt; catMaybeTree $ Leaf Nothing
Nothing</pre> </p> <p> Isn't this the same as <code>sequence</code>, then? It's not, because <code>sequence</code> short-circuits all data, as this list example shows: </p> <p> <pre>Prelude&gt; sequence [Just 42, Nothing, Just 2112]
Nothing</pre> </p> <p> Contrast this with the behaviour of <code>catMaybes</code>: </p> <p> <pre>Prelude Data.Maybe&gt; catMaybes [Just 42, Nothing, Just 2112]
[42,2112]</pre> </p> <p> You've yet to see the <code>Traversable</code> instance for <code>Tree</code>, but it behaves in the same way: </p> <p> <pre>Prelude Tree&gt; sequence $ Node "Foo" [Leaf (Just 42), Leaf Nothing, Leaf (Just 2112)]
Nothing</pre> </p> <p> The <code>catMaybeTree</code> function, on the other hand, returns a filtered tree: </p> <p> <pre>Prelude Tree&gt; catMaybeTree $ Node "Foo" [Leaf (Just 42), Leaf Nothing, Leaf (Just 2112)]
Just (Node "Foo" [Leaf 42,Leaf 2112])</pre> </p> <p> While the resulting tree is wrapped in a <code>Just</code> case, the leaves contain unwrapped values. </p> <h3 id="5f0287c6d6fe42f3ad73a8e31ba9b3c4"> Instances <a href="#5f0287c6d6fe42f3ad73a8e31ba9b3c4" title="permalink">#</a> </h3> <p> The <a href="/2019/08/05/rose-tree-catamorphism">article about the rose tree catamorphism</a> already covered how to add instances of <code>Bifunctor</code>, <code>Bifoldable</code>, and <code>Bitraversable</code>, so I'll give this only cursory treatment. Refer to that article for a more detailed treatment. The code that accompanies that article also has <a href="http://hackage.haskell.org/package/QuickCheck">QuickCheck</a> properties that verify the various laws associated with those instances. 
Here, I'll just list the instances without further comment: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Bifunctor</span>&nbsp;<span style="color:blue;">Tree</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;bimap&nbsp;f&nbsp;s&nbsp;=&nbsp;foldTree&nbsp;(Node&nbsp;.&nbsp;f)&nbsp;(Leaf&nbsp;.&nbsp;s) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Bifoldable</span>&nbsp;<span style="color:blue;">Tree</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;bifoldMap&nbsp;f&nbsp;=&nbsp;foldTree&nbsp;(\x&nbsp;xs&nbsp;-&gt;&nbsp;f&nbsp;x&nbsp;&lt;&gt;&nbsp;mconcat&nbsp;xs) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Bitraversable</span>&nbsp;<span style="color:blue;">Tree</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;bitraverse&nbsp;f&nbsp;s&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;foldTree&nbsp;(\x&nbsp;xs&nbsp;-&gt;&nbsp;Node&nbsp;&lt;$&gt;&nbsp;f&nbsp;x&nbsp;&lt;*&gt;&nbsp;sequenceA&nbsp;xs)&nbsp;(<span style="color:blue;">fmap</span>&nbsp;Leaf&nbsp;.&nbsp;s) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;(<span style="color:blue;">Tree</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;=&nbsp;second <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Foldable</span>&nbsp;(<span style="color:blue;">Tree</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;foldMap&nbsp;=&nbsp;bifoldMap&nbsp;mempty <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Traversable</span>&nbsp;(<span style="color:blue;">Tree</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;sequenceA&nbsp;=&nbsp;bisequenceA&nbsp;.&nbsp;first&nbsp;pure</pre> </p> <p> The picture archivist program isn't going to explicitly need all of these, but transitively, it will. 
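</p> <p> If you'd like to see what those instances buy you, here's a minimal, self-contained sketch that you can paste into a file and run. It restates <code>foldTree</code> and the instances from above. The <code>data Tree</code> declaration is an assumption on my part (internal nodes carry an <code>a</code>, leaves carry a <code>b</code>), and <code>example</code> is a made-up value that only serves as illustration. </p>

```haskell
module Main where

import Data.Bifoldable (Bifoldable (..))
import Data.Bifunctor (Bifunctor (..))
import Data.Bitraversable (Bitraversable (..), bisequenceA)
import Data.Foldable (toList)

-- Assumed shape of the tree: internal nodes hold an 'a', leaves hold a 'b'.
data Tree a b = Node a [Tree a b] | Leaf b deriving (Eq, Show)

-- The catamorphism, as listed in the article.
foldTree :: (a -> [c] -> c) -> (b -> c) -> Tree a b -> c
foldTree  _ fl (Leaf x) = fl x
foldTree fn fl (Node x xs) = fn x $ foldTree fn fl <$> xs

instance Bifunctor Tree where
  bimap f s = foldTree (Node . f) (Leaf . s)

instance Bifoldable Tree where
  bifoldMap f = foldTree (\x xs -> f x <> mconcat xs)

instance Bitraversable Tree where
  bitraverse f s =
    foldTree (\x xs -> Node <$> f x <*> sequenceA xs) (fmap Leaf . s)

instance Functor (Tree a) where
  fmap = second

instance Foldable (Tree a) where
  foldMap = bifoldMap mempty

instance Traversable (Tree a) where
  sequenceA = bisequenceA . first pure

-- Hypothetical example value, not taken from the program.
example :: Tree String Int
example = Node "Foo" [Leaf 42, Node "Bar" [Leaf 2112]]

main :: IO ()
main = do
  -- Functor: map over the leaves only.
  print $ fmap (+ 1) example
  -- Bifunctor: map over the internal nodes only.
  print $ first length example
  -- Foldable: collect the leaves, ignoring the internal nodes.
  print $ toList example
  -- Traversable: succeeds only if every leaf succeeds.
  print $ traverse (\x -> if x > 0 then Just x else Nothing) example
```

<p> The point of the sketch is that <code>fmap</code>, <code>toList</code>, and <code>traverse</code> all fall out of <code>foldTree</code> via the <code>Bi</code>-instances; none of them needs its own recursion.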
</p> <h3 id="d1bbd6ef895f45619822126f44bf6bfb"> Moving pictures <a href="#d1bbd6ef895f45619822126f44bf6bfb" title="permalink">#</a> </h3> <p> So far, all the code shown here could be in a general-purpose reusable library, since it contains no functionality specifically related to image files. The rest of the code in this article, however, will be specific to the program. I'll put the domain model code in another module and import some functionality: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Archive&nbsp;<span style="color:blue;">where</span>

<span style="color:blue;">import</span>&nbsp;Data.Time
<span style="color:blue;">import</span>&nbsp;Text.Printf
<span style="color:blue;">import</span>&nbsp;System.FilePath
<span style="color:blue;">import</span>&nbsp;<span style="color:blue;">qualified</span>&nbsp;Data.Map.Strict&nbsp;<span style="color:blue;">as</span>&nbsp;Map
<span style="color:blue;">import</span>&nbsp;Tree</pre> </p> <p> Notice that <code>Tree</code> is one of the imported modules. </p> <p> Later, we'll look at how to load a tree from the file system, but for now, we'll just pretend that we have such a tree. </p> <p> The major logic of the program is to create a destination tree based on a source tree. The leaves of the tree will have to carry some extra information apart from a file path, so you can introduce a specific type to capture that information: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;PhotoFile&nbsp;=
&nbsp;&nbsp;PhotoFile&nbsp;{&nbsp;photoFileName&nbsp;::&nbsp;FilePath,&nbsp;takenOn&nbsp;::&nbsp;LocalTime&nbsp;}
&nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> A <code>PhotoFile</code> not only contains the file path for an image file, but also the date the photo was taken.
This date can be extracted from the file's metadata, but that's an impure operation, so we'll delegate that work to the start of the program. We'll return to that later. </p> <p> Given a source tree of <code>PhotoFile</code> leaves, though, the program must produce a destination tree of files: </p> <p> <pre><span style="color:#2b91af;">moveTo</span>&nbsp;::&nbsp;(<span style="color:blue;">Foldable</span>&nbsp;t,&nbsp;<span style="color:blue;">Ord</span>&nbsp;a,&nbsp;<span style="color:blue;">PrintfType</span>&nbsp;a)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;t&nbsp;<span style="color:blue;">PhotoFile</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Tree</span>&nbsp;a&nbsp;<span style="color:#2b91af;">FilePath</span>
moveTo&nbsp;destination&nbsp;=
&nbsp;&nbsp;Node&nbsp;destination&nbsp;.&nbsp;Map.foldrWithKey&nbsp;addDir&nbsp;<span style="color:blue;">[]</span>&nbsp;.&nbsp;<span style="color:blue;">foldr</span>&nbsp;groupByDir&nbsp;Map.empty
&nbsp;&nbsp;<span style="color:blue;">where</span>
&nbsp;&nbsp;&nbsp;&nbsp;dirNameOf&nbsp;(LocalTime&nbsp;d&nbsp;_)&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;(y,&nbsp;m,&nbsp;_)&nbsp;=&nbsp;toGregorian&nbsp;d&nbsp;<span style="color:blue;">in</span>&nbsp;printf&nbsp;<span style="color:#a31515;">&quot;%d-%02d&quot;</span>&nbsp;y&nbsp;m
&nbsp;&nbsp;&nbsp;&nbsp;groupByDir&nbsp;(PhotoFile&nbsp;fileName&nbsp;t)&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Map.insertWith&nbsp;<span style="color:#2b91af;">(++)</span>&nbsp;(dirNameOf&nbsp;t)&nbsp;[fileName]
&nbsp;&nbsp;&nbsp;&nbsp;addDir&nbsp;name&nbsp;files&nbsp;dirs&nbsp;=&nbsp;Node&nbsp;name&nbsp;(Leaf&nbsp;&lt;$&gt;&nbsp;files)&nbsp;:&nbsp;dirs</pre> </p> <p> This <code>moveTo</code> function looks, perhaps, overwhelming, but it's composed of only three steps: <ol> <li>Create a map of destination folders (<code>foldr groupByDir Map.empty</code>).</li>
<li>Create a list of branches from the map (<code>Map.foldrWithKey addDir []</code>).</li> <li>Create a tree from the list (<code>Node destination</code>).</li> </ol> Recall that when Haskell functions are composed with the <code>.</code> operator, you'll have to read the composition from right to left. </p> <p> Notice that this function works with any <code>Foldable</code> data container, so it'd work with lists and other data structures besides trees. </p> <p> The <code>moveTo</code> function starts by folding the input data into a map. The map is keyed by the directory name, which is formatted by the <code>dirNameOf</code> function. This function takes a <code>LocalTime</code> as input and formats it to a <code>YYYY-MM</code> format. For example, December 20, 2018 becomes <code>"2018-12"</code>. </p> <p> The entire mapping step groups the <code>PhotoFile</code> values into a map of the type <code>Map a [FilePath]</code>. All the image files taken in April 2014 are added to the list with the <code>"2014-04"</code> key, all the image files taken in July 2011 are added to the list with the <code>"2011-07"</code> key, and so on. </p> <p> In the next step, the <code>moveTo</code> function converts the map to a list of trees. This will be the branches (or sub-directories) of the <code>destination</code> directory. Because of the desired structure of the destination tree, this is a list of shallow branches. Each node contains only leaves. </p> <p> <img src="/content/binary/shallow-photo-destination-directories.png" alt="Shallow photo destination directories."> </p> <p> The only remaining step is to add that list of branches to a <code>destination</code> node. </p> <p> Since this is a <a href="https://en.wikipedia.org/wiki/Pure_function">pure function</a>, it's <a href="/2015/05/07/functional-design-is-intrinsically-testable">easy to unit test</a>. 
Just create some input values and call the function: </p> <p> <pre><span style="color:#a31515;">&quot;Move&nbsp;to&nbsp;destination&quot;</span>&nbsp;~:&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;(source,&nbsp;destination,&nbsp;expected)&nbsp;&lt;- &nbsp;&nbsp;&nbsp;&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>&nbsp;$&nbsp;lt&nbsp;2018&nbsp;11&nbsp;9&nbsp;11&nbsp;47&nbsp;17 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;<span style="color:#a31515;">&quot;D&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;D&quot;</span>&nbsp;[Node&nbsp;<span style="color:#a31515;">&quot;2018-11&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>]]) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;S&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;4&quot;</span>&nbsp;$&nbsp;lt&nbsp;1972&nbsp;6&nbsp;6&nbsp;16&nbsp;15&nbsp;00] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;<span style="color:#a31515;">&quot;D&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;D&quot;</span>&nbsp;[Node&nbsp;<span style="color:#a31515;">&quot;1972-06&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;4&quot;</span>]]) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;S&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;L&quot;</span>&nbsp;$&nbsp;lt&nbsp;2002&nbsp;10&nbsp;12&nbsp;17&nbsp;16&nbsp;15, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span 
style="color:#a31515;">&quot;J&quot;</span>&nbsp;$&nbsp;lt&nbsp;2007&nbsp;4&nbsp;21&nbsp;17&nbsp;18&nbsp;19] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;<span style="color:#a31515;">&quot;D&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;D&quot;</span>&nbsp;[Node&nbsp;<span style="color:#a31515;">&quot;2002-10&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;L&quot;</span>],&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;2007-04&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;J&quot;</span>]]) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;$&nbsp;lt&nbsp;2010&nbsp;1&nbsp;12&nbsp;17&nbsp;16&nbsp;15, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>&nbsp;$&nbsp;lt&nbsp;2010&nbsp;3&nbsp;12&nbsp;17&nbsp;16&nbsp;15, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>&nbsp;$&nbsp;lt&nbsp;2010&nbsp;1&nbsp;21&nbsp;17&nbsp;18&nbsp;19] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;<span style="color:#a31515;">&quot;2&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;2010-01&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>,&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>], &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;2010-03&quot;</span>&nbsp;[Leaf&nbsp;<span 
style="color:#a31515;">&quot;b&quot;</span>]]) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;foo&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;bar&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;$&nbsp;lt&nbsp;2010&nbsp;1&nbsp;12&nbsp;17&nbsp;16&nbsp;15, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>&nbsp;$&nbsp;lt&nbsp;2010&nbsp;3&nbsp;12&nbsp;17&nbsp;16&nbsp;15, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>&nbsp;$&nbsp;lt&nbsp;2010&nbsp;1&nbsp;21&nbsp;17&nbsp;18&nbsp;19], &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;baz&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;d&quot;</span>&nbsp;$&nbsp;lt&nbsp;2010&nbsp;3&nbsp;1&nbsp;2&nbsp;3&nbsp;4, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;PhotoFile&nbsp;<span style="color:#a31515;">&quot;e&quot;</span>&nbsp;$&nbsp;lt&nbsp;2011&nbsp;3&nbsp;4&nbsp;3&nbsp;2&nbsp;1 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;]] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;<span style="color:#a31515;">&quot;qux&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;,&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;qux&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;2010-01&quot;</span>&nbsp;[Leaf&nbsp;<span 
style="color:#a31515;">&quot;a&quot;</span>,&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>], &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;2010-03&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>,&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;d&quot;</span>], &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;2011-03&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;e&quot;</span>]]) &nbsp;&nbsp;&nbsp;&nbsp;] &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;moveTo&nbsp;destination&nbsp;source &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;expected&nbsp;~=?&nbsp;actual</pre> </p> <p> This is an <a href="/2018/05/07/inlined-hunit-test-lists">inlined</a> <a href="/2018/04/30/parametrised-unit-tests-in-haskell">parametrised HUnit test</a>. While it looks like a big unit test, it still follows my <a href="/2013/06/24/a-heuristic-for-formatting-code-according-to-the-aaa-pattern">test formatting heuristic</a>. There are only three expressions, but the <em>arrange</em> expression is big because it creates a list of test cases. </p> <p> Each test case is a triple of a <code>source</code> tree, a <code>destination</code> directory name, and an <code>expected</code> result. In order to make the test data code more compact, it utilises this test-specific helper function: </p> <p> <pre>lt&nbsp;y&nbsp;mth&nbsp;d&nbsp;h&nbsp;m&nbsp;s&nbsp;=&nbsp;LocalTime&nbsp;(fromGregorian&nbsp;y&nbsp;mth&nbsp;d)&nbsp;(TimeOfDay&nbsp;h&nbsp;m&nbsp;s)</pre> </p> <p> For each test case, the test calls the <code>moveTo</code> function with the <code>destination</code> directory name and the <code>source</code> tree. It then asserts that the <code>expected</code> value is equal to the <code>actual</code> value.
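</p> <p> If you'd like to poke at the grouping step of <code>moveTo</code> on its own, outside the test suite, a sketch like the following may help. It isn't part of the code base: <code>groupByMonth</code> is a hypothetical helper that works on plain pairs instead of <code>PhotoFile</code> values, and <code>dirNameOf</code> is pinned to <code>String</code> for simplicity. It assumes the <code>time</code> and <code>containers</code> packages, which ship with GHC: </p> <p> <pre>import Data.Fixed (Pico)
import qualified Data.Map.Strict as Map
import Data.Time (LocalTime (..), TimeOfDay (..), fromGregorian, toGregorian)
import Text.Printf (printf)

-- Same formatting as dirNameOf inside moveTo, but fixed to String.
dirNameOf :: LocalTime -&gt; String
dirNameOf (LocalTime d _) =
  let (y, m, _) = toGregorian d in printf "%d-%02d" y m

-- Hypothetical helper: group (file name, date taken) pairs by month.
groupByMonth :: [(FilePath, LocalTime)] -&gt; Map.Map String [FilePath]
groupByMonth = foldr add Map.empty
  where add (name, t) = Map.insertWith (++) (dirNameOf t) [name]

-- The test helper from above, with an explicit type.
lt :: Integer -&gt; Int -&gt; Int -&gt; Int -&gt; Int -&gt; Pico -&gt; LocalTime
lt y mth d h m s = LocalTime (fromGregorian y mth d) (TimeOfDay h m s)

main :: IO ()
main = print $ Map.toList $ groupByMonth
  [ ("a", lt 2010 1 12 17 16 15)
  , ("b", lt 2010 3 12 17 16 15)
  , ("c", lt 2010 1 21 17 18 19) ]</pre> </p> <p> The input mirrors the fourth test case above, so <code>"a"</code> and <code>"c"</code> end up under the <code>"2010-01"</code> key, and <code>"b"</code> under <code>"2010-03"</code>. Because the fold runs from the right and <code>Map.insertWith (++)</code> puts the new element in front, the files keep their original relative order.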
</p> <h3 id="bcf9e8fd9d1b42bbb47b811be75385d0"> Calculating moves <a href="#bcf9e8fd9d1b42bbb47b811be75385d0" title="permalink">#</a> </h3> <p> One pure step remains. The result of calling the <code>moveTo</code> function is a tree with the desired structure. In order to actually move the files, though, for each file you'll need to keep track of both the source path and the destination path. To make that explicit, you can define a type for that purpose: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;Move&nbsp;=
&nbsp;&nbsp;Move&nbsp;{&nbsp;sourcePath&nbsp;::&nbsp;FilePath,&nbsp;destinationPath&nbsp;::&nbsp;FilePath&nbsp;}
&nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> A <code>Move</code> is simply a data structure. Contrast this with typical object-oriented design, where it would be a (possibly polymorphic) method on an object. In functional programming, you'll regularly model <em>intent</em> with a data structure. As long as intents remain data, you can easily manipulate them, and once you're done with that, you can run an interpreter over your data structure to perform the work you want accomplished. </p> <p> The unit test cases for the <code>moveTo</code> function suggest that file names are local file names like <code>"L"</code>, <code>"J"</code>, <code>"a"</code>, and so on. That was only to make the tests as compact as possible, since the function actually doesn't manipulate the specific <code>FilePath</code> values. </p> <p> In reality, the file names will most likely be longer, and they could also contain the full path, instead of the local path: <code>"C:\foo\bar\a.jpg"</code>.
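</p> <p> To make the point about intent as data a little more concrete, here's a small sketch. The <code>describe</code> and <code>withoutNoOps</code> functions are hypothetical and not part of the code base; they only illustrate that you can inspect and filter <code>Move</code> values with pure functions before anything touches the disk. The paths are made up: </p> <p> <pre>data Move =
  Move { sourcePath :: FilePath, destinationPath :: FilePath }
  deriving (Eq, Show, Read)

-- Hypothetical pure 'interpreter': render a move instead of performing it.
describe :: Move -&gt; String
describe (Move s d) = s ++ " -&gt; " ++ d

-- Hypothetical pure manipulation: drop moves that wouldn't change anything.
withoutNoOps :: [Move] -&gt; [Move]
withoutNoOps = filter (\m -&gt; sourcePath m /= destinationPath m)

main :: IO ()
main = mapM_ (putStrLn . describe) $ withoutNoOps
  [ Move "in/a.jpg" "out/2010-01/a.jpg"
  , Move "out/2011-03/e.jpg" "out/2011-03/e.jpg" ]</pre> </p> <p> A dry run like this prints only the first move, since the second one has identical source and destination paths.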
</p> <p> If you call <code>moveTo</code> with a tree where each leaf has a fully qualified path, the output tree will have the desired structure of the destination tree, but the leaves will still contain the full path to each source file. That means that you can calculate a <code>Move</code> for each file: </p> <p> <pre><span style="color:#2b91af;">calculateMoves</span>&nbsp;::&nbsp;<span style="color:blue;">Tree</span>&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Tree</span>&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:blue;">Move</span>
calculateMoves&nbsp;=&nbsp;imp&nbsp;<span style="color:#a31515;">&quot;&quot;</span>
&nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;imp&nbsp;path&nbsp;&nbsp;&nbsp;&nbsp;(Leaf&nbsp;x)&nbsp;=&nbsp;Leaf&nbsp;$&nbsp;Move&nbsp;x&nbsp;$&nbsp;replaceDirectory&nbsp;x&nbsp;path
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;imp&nbsp;path&nbsp;(Node&nbsp;x&nbsp;xs)&nbsp;=&nbsp;Node&nbsp;(path&nbsp;&lt;/&gt;&nbsp;x)&nbsp;$&nbsp;imp&nbsp;(path&nbsp;&lt;/&gt;&nbsp;x)&nbsp;&lt;$&gt;&nbsp;xs</pre> </p> <p> This function takes as input a <code>Tree FilePath FilePath</code>, which is compatible with the output of <code>moveTo</code>. It returns a <code>Tree FilePath Move</code>, i.e. a tree where the leaves are <code>Move</code> values. </p> <p> To be fair, returning a tree is overkill. A <code>[Move]</code> (list of moves) would have been just as useful, but in this article, I'm trying to describe how to write code with a <a href="/2018/11/19/functional-architecture-a-definition">functional architecture</a>. In the overview article, I explained how you can model a file system using a rose tree, and in order to emphasise that point, I'll stick with that model a little while longer.
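</p> <p> For comparison, a list-returning variant could look like the following sketch. The name <code>calculateMoveList</code> is hypothetical, and the <code>Tree</code> and <code>Move</code> types are repeated, in their assumed shape, only so that the snippet compiles on its own. It uses the same <code>replaceDirectory</code> and <code>&lt;/&gt;</code> functions from <code>System.FilePath</code> as <code>calculateMoves</code>: </p> <p> <pre>import System.FilePath (replaceDirectory, (&lt;/&gt;))

-- Assumed shapes of the types used in this article.
data Tree a b = Node a [Tree a b] | Leaf b deriving (Eq, Show)

data Move =
  Move { sourcePath :: FilePath, destinationPath :: FilePath }
  deriving (Eq, Show)

-- Hypothetical variant of calculateMoves that flattens the result.
calculateMoveList :: Tree FilePath FilePath -&gt; [Move]
calculateMoveList = imp ""
  where
    imp path (Leaf x)    = [Move x (replaceDirectory x path)]
    imp path (Node x xs) = concatMap (imp (path &lt;/&gt; x)) xs

main :: IO ()
main = mapM_ print $ calculateMoveList $
  Node "D" [Node "2010-01" [Leaf "src/foo/a.jpg", Leaf "src/bar/c.jpg"]]</pre> </p> <p> Each source file keeps its file name, but its directory is replaced with the path accumulated on the way down the tree. The exact path separators in the output depend on the platform, since <code>System.FilePath</code> picks the POSIX or Windows flavour at compile time.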
</p> <p> Earlier, I wrote that you can implement desired <code>Tree</code> functionality with the <code>foldTree</code> function, but that was a simplification. If you can implement the functionality of <code>calculateMoves</code> with <code>foldTree</code>, I don't know how. You can, however, implement it using explicit pattern matching and simple recursion. </p> <p> The <code>imp</code> function builds up a file path (using the <code>&lt;/&gt;</code> path combinator) as it recursively negotiates the tree. All <code>Leaf</code> nodes are converted to a <code>Move</code> value using the leaf node's current <code>FilePath</code> value as the <code>sourcePath</code>, and the <code>path</code> to figure out the desired <code>destinationPath</code>. </p> <p> This code is still easy to unit test: </p> <p> <pre><span style="color:#a31515;">&quot;Calculate&nbsp;moves&quot;</span>&nbsp;~:&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;(tree,&nbsp;expected)&nbsp;&lt;- &nbsp;&nbsp;&nbsp;&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(Leaf&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;Leaf&nbsp;$&nbsp;Move&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(Node&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>],&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;[Leaf&nbsp;$&nbsp;Move&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>&nbsp;$&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(Node&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>],&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;[ 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;Move&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>&nbsp;$&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;Move&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>&nbsp;$&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(Node&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;[Node&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>,&nbsp;Leaf&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>],&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>&nbsp;[Leaf&nbsp;<span style="color:#a31515;">&quot;3&quot;</span>]], &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>)&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;Move&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>&nbsp;$&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;1&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;Move&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>&nbsp;$&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;b&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;2&quot;</span>], 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Node&nbsp;(<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>)&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Leaf&nbsp;$&nbsp;Move&nbsp;<span style="color:#a31515;">&quot;3&quot;</span>&nbsp;$&nbsp;<span style="color:#a31515;">&quot;a&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;c&quot;</span>&nbsp;&lt;/&gt;&nbsp;<span style="color:#a31515;">&quot;3&quot;</span>]]) &nbsp;&nbsp;&nbsp;&nbsp;] &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;calculateMoves&nbsp;tree &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;expected&nbsp;~=?&nbsp;actual</pre> </p> <p> The test cases in this parametrised test are tuples of an input <code>tree</code> and the <code>expected</code> tree. For each test case, the test calls the <code>calculateMoves</code> function with <code>tree</code> and asserts that the <code>actual</code> tree is equal to the <code>expected</code> tree. </p> <p> That's all the pure code you need in order to implement the desired functionality. Now you only need to write some code that loads a tree from disk, and imprints a destination tree to disk, as well as the code that composes it all. </p> <h3 id="062fff475b2b47e188dbd2bc930aa882"> Loading a tree from disk <a href="#062fff475b2b47e188dbd2bc930aa882" title="permalink">#</a> </h3> <p> The remaining code in this article is impure. You could put it in dedicated modules, but for this program, you're only going to need three functions and a bit of composition code, so you could also just put it all in the <code>Main</code> module. That's what I did. </p> <p> To load a tree from disk, you'll need a root directory, under which you load the entire tree. 
Given a directory path, you read a tree using a recursive function like this: </p> <p> <pre><span style="color:#2b91af;">readTree</span>&nbsp;::&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;(<span style="color:blue;">Tree</span>&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:#2b91af;">FilePath</span>)
readTree&nbsp;path&nbsp;=&nbsp;<span style="color:blue;">do</span>
&nbsp;&nbsp;isFile&nbsp;&lt;-&nbsp;doesFileExist&nbsp;path
&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;isFile
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">then</span>&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;Leaf&nbsp;path
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span>&nbsp;<span style="color:blue;">do</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dirsAndfiles&nbsp;&lt;-&nbsp;listDirectory&nbsp;path
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;paths&nbsp;=&nbsp;<span style="color:blue;">fmap</span>&nbsp;(path&nbsp;&lt;/&gt;)&nbsp;dirsAndfiles
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;branches&nbsp;&lt;-&nbsp;traverse&nbsp;readTree&nbsp;paths
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;Node&nbsp;path&nbsp;branches</pre> </p> <p> This recursive function starts by checking whether the <code>path</code> is a file or a directory. If it's a file, it creates a new <code>Leaf</code> with that <code>FilePath</code>. </p> <p> If <code>path</code> isn't a file, it's a directory. In that case, use <code>listDirectory</code> to enumerate all the directories and files in that directory. These are only local names, so prefix them with <code>path</code> to create full paths, then <code>traverse</code> all those directory entries recursively. That produces all the <code>branches</code> for the current node.
Finally, return a new <code>Node</code> with the <code>path</code> and the <code>branches</code>. </p> <h3 id="5ba31d6e6e7f4eee942e39349a45e1ed"> Loading metadata <a href="#5ba31d6e6e7f4eee942e39349a45e1ed" title="permalink">#</a> </h3> <p> The <code>readTree</code> function only produces a tree with <code>FilePath</code> leaves, while the program requires a tree with <code>PhotoFile</code> leaves. You'll need to read the <a href="https://en.wikipedia.org/wiki/Exif">Exif</a> metadata from each file and enrich the tree with the <em>date-taken</em> data. </p> <p> In this code base, I've used the <a href="http://hackage.haskell.org/package/hsexif">hsexif</a> library for this. That enables you to write an impure operation like this: </p> <p> <pre><span style="color:#2b91af;">readPhoto</span>&nbsp;::&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;(<span style="color:#2b91af;">Maybe</span>&nbsp;<span style="color:blue;">PhotoFile</span>)
readPhoto&nbsp;path&nbsp;=&nbsp;<span style="color:blue;">do</span>
&nbsp;&nbsp;exifData&nbsp;&lt;-&nbsp;parseFileExif&nbsp;path
&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;dateTaken&nbsp;=&nbsp;either&nbsp;(<span style="color:blue;">const</span>&nbsp;Nothing)&nbsp;Just&nbsp;exifData&nbsp;&gt;&gt;=&nbsp;getDateTimeOriginal
&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;PhotoFile&nbsp;path&nbsp;&lt;$&gt;&nbsp;dateTaken</pre> </p> <p> This operation can fail for various reasons: <ul> <li>The file may not exist.</li> <li>The file exists, but has no metadata.</li> <li>The file has metadata, but no <em>date-taken</em> metadata.</li> <li>The <em>date-taken</em> metadata string is malformed.</li> </ul> The program is just going to skip all files from which it can't extract <em>date-taken</em> metadata, so <code>readPhoto</code> converts the <code>Either</code> value returned by <code>parseFileExif</code> to
<code>Maybe</code> and binds the result with <code>getDateTimeOriginal</code>. </p> <p> When you <code>traverse</code> a <code>Tree FilePath FilePath</code> with <code>readPhoto</code>, you'll get a <code>Tree FilePath (Maybe PhotoFile)</code>. That's when you'll need <code>catMaybeTree</code>. You'll see this soon. </p> <h3 id="8b8d1709f9ed4fe2bc78e4ea9b2a2508"> Writing a tree to disk <a href="#8b8d1709f9ed4fe2bc78e4ea9b2a2508" title="permalink">#</a> </h3> <p> The above <code>calculateMoves</code> function creates a <code>Tree FilePath Move</code>. To write the moves to disk, though, any <code>Foldable</code> container will do, as the <code>applyMoves</code> operation demonstrates. It traverses the data structure, and for each <code>Move</code>, it first copies the file, then it verifies that the copy was successful, and finally, if that's the case, it deletes the source file. </p> <p> All of the operations invoked by these three steps are defined in various libraries that are part of the base GHC installation. You're welcome to peruse <a href="https://github.com/ploeh/picture-archivist">the source code repository</a> if you're interested in the details.
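</p> <p> The repository has the real thing. Purely as an illustration of the three steps, a sketch of such an operation might look like this. The name <code>applyMovesSketch</code> is made up, the <code>Move</code> type is repeated so that the snippet compiles on its own, and the choice of <code>copyFile</code>, a byte-for-byte comparison, and <code>removeFile</code> is an assumption made for this illustration, not necessarily what the code base does: </p> <p> <pre>import Control.Monad (when)
import qualified Data.ByteString as B
import Data.Foldable (traverse_)
import System.Directory (copyFile, createDirectoryIfMissing, removeFile)
import System.FilePath (takeDirectory)

data Move =
  Move { sourcePath :: FilePath, destinationPath :: FilePath }
  deriving (Eq, Show)

-- Hypothetical sketch: copy, verify, then delete the source.
applyMovesSketch :: Foldable t =&gt; t Move -&gt; IO ()
applyMovesSketch = traverse_ move
  where
    move (Move s d) = do
      -- 1. Copy the file, creating the destination directory if needed.
      createDirectoryIfMissing True (takeDirectory d)
      copyFile s d
      -- 2. Verify the copy by comparing the bytes of both files.
      same &lt;- (==) &lt;$&gt; B.readFile s &lt;*&gt; B.readFile d
      -- 3. Only delete the source if the copy is identical.
      when same (removeFile s)

main :: IO ()
main = applyMovesSketch ([] :: [Move])  -- an empty container performs no work</pre> </p> <p> Since the only constraint is <code>Foldable</code>, the same operation accepts a list, a <code>Maybe</code>, or a tree of moves. Be careful if you experiment with it, because it really does delete source files.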
</p> <h3 id="d336cf55dc9746c08cbed32041803173"> Composition <a href="#d336cf55dc9746c08cbed32041803173" title="permalink">#</a> </h3> <p> You can now compose an <em>impure-pure-impure sandwich</em> from all the Lego pieces: </p> <p> <pre><span style="color:#2b91af;">movePhotos</span>&nbsp;::&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">FilePath</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;()
movePhotos&nbsp;source&nbsp;destination&nbsp;=&nbsp;<span style="color:blue;">fmap</span>&nbsp;fold&nbsp;$&nbsp;runMaybeT&nbsp;$&nbsp;<span style="color:blue;">do</span>
&nbsp;&nbsp;sourceTree&nbsp;&lt;-&nbsp;lift&nbsp;$&nbsp;readTree&nbsp;source
&nbsp;&nbsp;photoTree&nbsp;&lt;-&nbsp;MaybeT&nbsp;$&nbsp;catMaybeTree&nbsp;&lt;$&gt;&nbsp;traverse&nbsp;readPhoto&nbsp;sourceTree
&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;destinationTree&nbsp;=&nbsp;calculateMoves&nbsp;$&nbsp;moveTo&nbsp;destination&nbsp;photoTree
&nbsp;&nbsp;lift&nbsp;$&nbsp;applyMoves&nbsp;destinationTree</pre> </p> <p> First, you load the <code>sourceTree</code> using the <code>readTree</code> operation. This is a <code>Tree FilePath FilePath</code> value, because the code is written in <code>do</code> notation, and the context is <code>MaybeT IO ()</code>. You then load the image metadata by traversing <code>sourceTree</code> with <code>readPhoto</code>. This produces a <code>Tree FilePath (Maybe PhotoFile)</code> that you then filter with <code>catMaybeTree</code>. Again, because of <code>do</code> notation and monad transformer shenanigans, <code>photoTree</code> is a <code>Tree FilePath PhotoFile</code> value. </p> <p> Those two lines of code are the initial impure step of the sandwich (yes: mixed metaphors, I know). </p> <p> The pure part of the sandwich is the composition of the pure functions <code>moveTo</code> and <code>calculateMoves</code>.
The result is a <code>Tree FilePath Move</code> value. </p> <p> The final, impure step of the sandwich, then, is to <code>applyMoves</code>. </p> <h3 id="8b44f4d2cd2241e18bff6d40c1ad9ee9"> Execution <a href="#8b44f4d2cd2241e18bff6d40c1ad9ee9" title="permalink">#</a> </h3> <p> The <code>movePhotos</code> operation takes <code>source</code> and <code>destination</code> arguments. You could hypothetically call it from a rich client or a background process, but here I'll just call it from a command-line program. The <code>main</code> operation will have to parse the input arguments and call <code>movePhotos</code>: </p> <p> <pre><span style="color:#2b91af;">main</span>&nbsp;::&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;()
main&nbsp;=&nbsp;<span style="color:blue;">do</span>
&nbsp;&nbsp;args&nbsp;&lt;-&nbsp;getArgs
&nbsp;&nbsp;<span style="color:blue;">case</span>&nbsp;args&nbsp;<span style="color:blue;">of</span>
&nbsp;&nbsp;&nbsp;&nbsp;[source,&nbsp;destination]&nbsp;-&gt;&nbsp;movePhotos&nbsp;source&nbsp;destination
&nbsp;&nbsp;&nbsp;&nbsp;_&nbsp;-&gt;&nbsp;<span style="color:blue;">putStrLn</span>&nbsp;<span style="color:#a31515;">&quot;Please&nbsp;provide&nbsp;source&nbsp;and&nbsp;destination&nbsp;directories&nbsp;as&nbsp;arguments.&quot;</span></pre> </p> <p> You could write more sophisticated parsing of the program arguments, but that's not the topic of this article, so I only wrote the bare minimum required to get the program working.
</p> <p> You can now compile and run the program: </p> <p> <pre>$ ./archpics "C:\Users\mark\Desktop\Test" "C:\Users\mark\Desktop\Test-Out" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2003-04\\2003-04-29 15.11.50.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2011-07\\2011-07-10 13.09.36.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2014-04\\2014-04-17 17.11.40.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2014-04\\2014-04-18 14.05.02.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2014-05\\2014-05-23 16.07.20.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2014-06\\2014-06-30 15.44.52.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2014-06\\2014-06-21 16.48.40.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2016-05\\2016-05-01 09.25.23.jpg" Copied to "C:\\Users\\mark\\Desktop\\Test-Out\\2017-08\\2017-08-22 19.53.28.jpg"</pre> </p> <p> This does indeed produce the expected destination directory structure. </p> <p> <img src="/content/binary/picture-archivist-destination-directory.png" alt="Seven example directories with pictures."> </p> <p> It's always nice when something turns out to work in practice, as well as in theory. </p> <h3 id="c50c7ac1276146d79715a5e7ddadfe6d"> Summary <a href="#c50c7ac1276146d79715a5e7ddadfe6d" title="permalink">#</a> </h3> <p> Functional software architecture involves separating pure from impure code so that no pure functions invoke impure operations. Often, you can achieve that with what I call the <em>impure-pure-impure sandwich</em> architecture. In this example, you saw how to model the file system as a tree. This enables you to separate the impure file interactions from the pure program logic. </p> <p> The Haskell type system enforces the <em>functional interaction law</em>, which implies that the architecture is, indeed, properly functional. Other languages, like <a href="https://fsharp.org">F#</a>, don't enforce the law via the compiler, but that doesn't prevent you doing functional programming. 
Now that we've verified that the architecture is, indeed, functional, we can port it to F#. </p> <p> <strong>Next:</strong> <a href="/2019/09/16/picture-archivist-in-f">Picture archivist in F#</a>. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="f237d98d453a4bcb9a3d58a05bf21d34"> <div class="comment-author"><a href="https://majiehong.com">Jiehong</a></div> <div class="comment-content"> <p> This seems a fair architecture. </p> <p> However, at first glance it does not seem very memory efficient, because everything might be loaded in RAM, and that poses a strict limit. </p> <p> But then, I remember that Haskell does lazy evaluation, so is it the case here? Are path and the tree lazily loaded and processed? </p> <p> In "traditional" architectures, IO would be scattered inside the program, and as each file might be read one at a time, and handled. This sandwich of purity with impure buns forces not to do that. </p> </div> <div class="comment-date">2019-09-09 11:47 UTC</div> </div> <div class="comment" id="ca660cdc1f094bfb8cc9896bb1084460"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> Jiehong, thank you for writing. It's true that Haskell is lazily evaluated, but some strictness rules apply to <code>IO</code>, so it's not so simple. </p> <p> Just running a quick experiment with the code base shown here, when I try to move thousands of files, the program sits and thinks for quite some time before it starts to output progress. This indicates to me that it does, indeed, load at least the <em>structure</em> of the tree into memory before it starts moving the files. Once it does that, though, it looks like it runs at constant memory. </p> <p> There's an interplay of laziness and <code>IO</code> in Haskell that I still don't sufficiently master. 
When I publish the port to F#, however, it should be clear that you could replace all the nodes of the tree with explicitly lazy values. I'd be surprised if something like that isn't possible in Haskell as well, but here I'll solicit help from readers more well-versed in these matters than I am. </p> </div> <div class="comment-date">2019-09-09 19:16 UTC</div> </div> <div class="comment" id="dd26f6d047b5492b8a012b30d96ad18b"> <div class="comment-author">André Cardoso</div> <div class="comment-content"> <p> I really like your posts and I'm really liking this series. But I struggle with Haskell syntax, specially the difference between the operators $, &lt;$&gt;, &lt;&gt;, &lt;*&gt;. Is there a cheat sheet explaining these operators? </p> </div> <div class="comment-date">2019-09-12 13:51 UTC</div> </div> <div class="comment" id="2e71f695ed9f4cfa8467df818f072da8"> <div class="comment-author"><a href="/">Mark Seemann</a></div> <div class="comment-content"> <p> André, thank you for writing. I've written about why <a href="/2018/07/02/terse-operators-make-business-code-more-readable">I think that terse operators make the code overall more readable</a>, but that's obviously not an explanation of any of those operators. </p> <p> I'm not aware of any cheat sheets for Haskell, although a Google search seems to indicate that many exist. I'm not sure that a cheat sheet will help much if one doesn't know Haskell, and if one does know Haskell, one is likely to also know those operators. </p> <p> <a href="https://hackage.haskell.org/package/base/docs/Prelude.html#v:-36-">$</a> is a sort of delimiter that often saves you from having to nest other function calls in brackets. </p> <p> <a href="https://hackage.haskell.org/package/base/docs/Prelude.html#v:-60--36--62-">&lt;$&gt;</a> is just an infix alias for <code>fmap</code>. In C#, that <a href="/2018/03/22/functors">corresponds to the <code>Select</code> method</a>. 
</p> <p> <code>&lt;&gt;</code> is a generalised associative binary operation as defined by <a href="http://hackage.haskell.org/package/base/docs/Data-Semigroup.html">Data.Semigroup</a> or <a href="http://hackage.haskell.org/package/base/docs/Data-Monoid.html">Data.Monoid</a>. You can <a href="/2017/10/05/monoids-semigroups-and-friends">read more about monoids and semigroups here on the blog</a>. </p> <p> <a href="http://hackage.haskell.org/package/base/docs/Control-Applicative.html">&lt;*&gt;</a> is part of the <code>Applicative</code> type class. It's hard to translate to other languages, but when I make the attempt, I usually call it <code>Apply</code>.
Here's one. </p> <h3 id="c7391ad662e943f1bbe2b52d6b8bde59"> Orphan instances <a href="#c7391ad662e943f1bbe2b52d6b8bde59" title="permalink">#</a> </h3> <p> When you write <a href="http://hackage.haskell.org/package/QuickCheck">QuickCheck</a> properties that involve your own custom types, you'll have to add <code>Arbitrary</code> instances for those types. As an example, here's a restaurant reservation record type: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;Reservation&nbsp;=&nbsp;Reservation &nbsp;&nbsp;{&nbsp;reservationId&nbsp;::&nbsp;UUID &nbsp;&nbsp;,&nbsp;reservationDate&nbsp;::&nbsp;LocalTime &nbsp;&nbsp;,&nbsp;reservationName&nbsp;::&nbsp;String &nbsp;&nbsp;,&nbsp;reservationEmail&nbsp;::&nbsp;String &nbsp;&nbsp;,&nbsp;reservationQuantity&nbsp;::&nbsp;Int &nbsp;&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Read</span>,&nbsp;<span style="color:#2b91af;">Generic</span>)</pre> </p> <p> You can easily add an <code>Arbitrary</code> instance to such a type: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Arbitrary</span>&nbsp;<span style="color:blue;">Reservation</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;arbitrary&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;liftM5&nbsp;Reservation&nbsp;arbitrary&nbsp;arbitrary&nbsp;arbitrary&nbsp;arbitrary&nbsp;arbitrary</pre> </p> <p> The type itself is part of your domain model, while the <code>Arbitrary</code> instance only belongs to your test code. You shouldn't add the <code>Arbitrary</code> instance to the domain model, but that means that you'll have to define the instance apart from the type definition. 
That, however, is an orphan instance, and the compiler will complain: </p> <p> <pre>test\ReservationAPISpec.hs:31:1: <span style="color:red;">warning:</span> [<span style="color:red;">-Worphans</span>] Orphan instance: instance Arbitrary Reservation To avoid this move the instance declaration to the module of the class or of the type, or wrap the type with a newtype and declare the instance on the new type. <span style="color:blue;">|</span> <span style="color:blue;">31 |</span> <span style="color:red;">instance Arbitrary Reservation where</span> <span style="color:blue;">|</span> <span style="color:red;">^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^...</span></pre> </p> <p> Technically, this isn't a difficult problem to solve. The warning even suggests remedies. Moving the instance to the module that declares the type is, however, inappropriate, since test-specific instances don't belong in the domain model. Wrapping the type in a <code>newtype</code> is more appropriate, but what should you call the type? </p> <h3 id="c192d6524b4b4444a35121443f9a61a8"> Suppress the warning <a href="#c192d6524b4b4444a35121443f9a61a8" title="permalink">#</a> </h3> <p> I had trouble coming up with good names for such <code>newtype</code> wrappers, so at first I decided to just suppress that particular compiler warning. I simply added the <code>-fno-warn-orphans</code> flag <em>exclusively to my test code</em>. </p> <p> That solved the immediate problem, but I felt a little dirty. It's okay, though, because you're not supposed to reuse test libraries anyway, so the usual problems with orphan instances don't apply. </p> <p> After having worked a little like this, however, it dawned on me that I needed more than one <code>Arbitrary</code> instance, and a naming scheme presented itself. 
</p> <h3 id="a946b2c622c6403cb69a3f224551514c"> Naming scheme <a href="#a946b2c622c6403cb69a3f224551514c" title="permalink">#</a> </h3> <p> For some of the properties I wrote, I needed a <em>valid</em> <code>Reservation</code> value. In this case, <em>valid</em> means that the <code>reservationQuantity</code> is a positive number, and that the <code>reservationDate</code> is in the future. It seemed natural to signify these constraints with a <code>newtype</code>: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;ValidReservation&nbsp;=&nbsp;ValidReservation&nbsp;Reservation&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Arbitrary</span>&nbsp;<span style="color:blue;">ValidReservation</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;arbitrary&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;&nbsp;&nbsp;rid&nbsp;&lt;-&nbsp;arbitrary &nbsp;&nbsp;&nbsp;&nbsp;d&nbsp;&lt;-&nbsp;(\dt&nbsp;-&gt;&nbsp;addLocalTime&nbsp;(getPositive&nbsp;dt)&nbsp;now2019)&nbsp;&lt;$&gt;&nbsp;arbitrary &nbsp;&nbsp;&nbsp;&nbsp;n&nbsp;&lt;-&nbsp;arbitrary &nbsp;&nbsp;&nbsp;&nbsp;e&nbsp;&lt;-&nbsp;arbitrary &nbsp;&nbsp;&nbsp;&nbsp;(Positive&nbsp;q)&nbsp;&lt;-&nbsp;arbitrary &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;ValidReservation&nbsp;$&nbsp;Reservation&nbsp;rid&nbsp;d&nbsp;n&nbsp;e&nbsp;q</pre> </p> <p> The <code>newtype</code> is, naturally, called <code>ValidReservation</code> and can, for example, be used like this: </p> <p> <pre>it&nbsp;<span style="color:#a31515;">&quot;responds&nbsp;with&nbsp;200&nbsp;after&nbsp;reservation&nbsp;is&nbsp;added&quot;</span>&nbsp;$&nbsp;WQC.property&nbsp;$&nbsp;\ &nbsp;&nbsp;(ValidReservation&nbsp;r)&nbsp;-&gt;&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;_&nbsp;&lt;-&nbsp;postJSON&nbsp;<span 
style="color:#a31515;">&quot;/reservations&quot;</span>&nbsp;$&nbsp;encode&nbsp;r &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;get&nbsp;$&nbsp;<span style="color:#a31515;">&quot;/reservations/&quot;</span>&nbsp;&lt;&gt;&nbsp;toASCIIBytes&nbsp;(reservationId&nbsp;r) &nbsp;&nbsp;actual&nbsp;`shouldRespondWith`&nbsp;200</pre> </p> <p> For the few properties where <em>any</em> <code>Reservation</code> goes, a name for a <code>newtype</code> now also suggests itself: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;AnyReservation&nbsp;=&nbsp;AnyReservation&nbsp;Reservation&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Arbitrary</span>&nbsp;<span style="color:blue;">AnyReservation</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;arbitrary&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;AnyReservation&nbsp;&lt;$&gt; &nbsp;&nbsp;&nbsp;&nbsp;liftM5&nbsp;Reservation&nbsp;arbitrary&nbsp;arbitrary&nbsp;arbitrary&nbsp;arbitrary&nbsp;arbitrary</pre> </p> <p> The only use I've had for that particular instance so far, though, is to ensure that any <code>Reservation</code> correctly serialises to, and deserialises from, JSON: </p> <p> <pre>it&nbsp;<span style="color:#a31515;">&quot;round-trips&quot;</span>&nbsp;$&nbsp;property&nbsp;$&nbsp;\(AnyReservation&nbsp;r)&nbsp;-&gt;&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;json&nbsp;=&nbsp;encode&nbsp;r &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;decode&nbsp;json &nbsp;&nbsp;actual&nbsp;`shouldBe`&nbsp;Just&nbsp;r</pre> </p> <p> With those two <code>newtype</code> wrappers, I no longer have any orphan instances. 
</p> <h3 id="758fef8609784b998c3fad65b2fe6e2f"> Summary <a href="#758fef8609784b998c3fad65b2fe6e2f" title="permalink">#</a> </h3> <p> A simple naming scheme for <code>newtype</code> wrappers for QuickCheck <code>Arbitrary</code> instances, then, is: <ul> <li>If the instance is truly unbounded, prefix the wrapper name with <em>Any</em></li> <li>If the instance only produces valid values, prefix the wrapper name with <em>Valid</em></li> </ul> This strikes me as a practical naming scheme. Just try to search the web for <em>file system interface</em>, or <em>mock file system</em>. I'm not going to give you any links, because I think such questions are <a href="https://en.wikipedia.org/wiki/XY_problem">XY problems</a>. I don't think that the most common suggestions are proper solutions. </p> <p> In functional programming, anyway, <a href="/2017/01/30/partial-application-is-dependency-injection">Dependency Injection isn't functional, because it makes everything impure</a>. How, then, do you model the file system in such a way that it's pure, decoupled from the logic you'd like to add on top of it, and still has enough fidelity that you can perform most tasks? </p> <p> You model the file system as a tree, or a forest. </p> <h3 id="4920bedd948d4f7487a13fa96f836371"> File systems are hierarchies <a href="#4920bedd948d4f7487a13fa96f836371" title="permalink">#</a> </h3> <p> It should come as no surprise that file systems are hierarchies, or trees. Each logical drive is the root of a tree. Files are leaves, and directories are internal nodes. Does that sound familiar? That sounds like a <a href="/2019/07/29/church-encoded-rose-tree">rose tree</a>. </p> <p> Rose trees are immutable data structures. It doesn't get much more functional than that. Why not use a rose tree (or a forest) to model the file system? </p> <p> What about interaction with the actual file system? 
Usually, when you encounter object-oriented attempts at decoupling an abstraction from the actual file system, you'll find polymorphic operations such as <code>WriteAllText</code>, <code>GetFileSystemEntries</code>, <code>CreateDirectory</code>, and so on. These would be the (mockable) methods that you have to implement, usually as <a href="http://xunitpatterns.com/Humble%20Object.html">Humble Objects</a>. </p> <p> If you, instead of a set of interfaces, model the file system as a forest, interacting with the actual file system is not even part of the abstraction. That's a typical shift of perspective from object-oriented design to functional programming. </p> <p> <img src="/content/binary/ood-and-fp-views-on-fily-system-abstraction.png" alt="Object-oriented and functional ways to abstractly model file systems."> </p> <p> In object-oriented design, you typically attempt to model <em>data with behaviour</em>. Sometimes that fits the underlying reality well, but in this case it doesn't. While you have file and directory objects with behaviour, the actual structure of a file system is implicit. It's hidden in the interactions between the objects. </p> <p> By modelling the file system as a tree, you explicitly use the structure of the data. How you load a tree into program memory, or how you imprint a tree unto the file system isn't part of the abstraction. When it comes to input and output, you're free to do what you want. </p> <p> Once you have a model of a directory structure in memory, you can manipulate it to your heart's content. Since <a href="/2019/08/19/a-rose-tree-functor">rose trees are functors</a>, you know that all transformations are structure-preserving. That means that you don't even need to write tests for those parts of your application. </p> <p> You'll appreciate an example, I'm sure. 
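</p> <p> Before the full example, here's a minimal Haskell sketch of the idea. The <code>Tree</code> type, its constructor names, the sample <code>photos</code> value, and the <code>shout</code> function are assumptions made for illustration only; the code in the following articles may well look different. </p> <p> <pre>{-# LANGUAGE DeriveFunctor #-}
import Data.Char (toUpper)

-- A rose tree: internal nodes carry an 'a', leaves carry a 'b'.
-- The derived Functor instance maps over the leaves only.
data Tree a b = Node a [Tree a b] | Leaf b
  deriving (Eq, Show, Functor)

-- Hypothetical sample: directories are internal nodes, files are leaves.
photos :: Tree FilePath FilePath
photos =
  Node "Pictures"
    [ Node "Holiday" [Leaf "beach.jpg", Leaf "hotel.jpg"]
    , Leaf "cat.jpg"
    ]

-- A pure transformation: upper-case every file name.
shout :: Tree FilePath FilePath -&gt; Tree FilePath FilePath
shout = fmap (map toUpper)</pre> </p> <p> Since <code>fmap</code> only touches the leaves, <code>shout photos</code> has exactly the same directory layout as <code>photos</code>. No file system access is involved at any point. 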
</p> <h3 id="5e19438122b94e059c155509e96c964f"> Picture archivist example <a href="#5e19438122b94e059c155509e96c964f" title="permalink">#</a> </h3> <p> As an example, I'll attempt to answer <a href="https://codereview.stackexchange.com/q/99271/3878">an old Code Review question</a>. I already gave <a href="https://codereview.stackexchange.com/a/99290/3878">an answer</a> in 2015, but I'm not so happy with it today as I was back then. The question is great, though, because it explicitly demonstrates how people have a hard time escaping the notion that abstraction is only available via interfaces or abstract base classes. In 2015, I had long since figured out that <a href="/2009/05/28/DelegatesAreAnonymousInterfaces">delegates (and thus functions) are anonymous interfaces</a>, but I still hadn't figured out how to separate pure from impure behaviour. </p> <p> The question's scenario is how to implement a small program that can inspect a collection of image files, extract the date-taken metadata from each file, and move the files to a new directory structure based on that information. </p> <p> For example, you could have files organised in various directories according to motive. </p> <p> <img src="/content/binary/picture-archivist-source-directory.png" alt="Three example directories with pictures."> </p> <p> You soon realise, however, that that archiving strategy is untenable, because what do you do if there's more than one type of motive in a picture? Instead, you decide to organise the files according to month and year. </p> <p> <img src="/content/binary/picture-archivist-destination-directory.png" alt="Seven example directories with pictures."> </p> <p> Clearly, there's some input and output involved in this application, but there's also some logic that you'd like to unit test. You need to parse the metadata, figure out where to move each image file, filter out files that are not images, and so on. 
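</p> <p> Much of that logic needs no file system at all. As a rough sketch, assuming that the date-taken metadata has already been reduced to a year and a month, the pure parts could look something like the following. The names <code>YearMonth</code>, <code>destinationDir</code>, and <code>isPicture</code> are hypothetical, and treating only <em>.jpg</em> files as pictures is a simplifying assumption. </p> <p> <pre>import Data.Char (toLower)
import Data.List (isSuffixOf)
import Text.Printf (printf)

-- Hypothetical stand-in for the date-taken metadata: (year, month).
type YearMonth = (Int, Int)

-- Name of the destination directory, zero-padded, e.g. "2014-04".
destinationDir :: YearMonth -&gt; FilePath
destinationDir (y, m) = printf "%04d-%02d" y m

-- Assumption: a file is a picture if its name ends in .jpg,
-- regardless of case.
isPicture :: FilePath -&gt; Bool
isPicture name = ".jpg" `isSuffixOf` map toLower name</pre> </p> <p> Both functions are pure, so you can unit test them without touching the disk. 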
</p> <h3 id="e3bc8b23a3494628a44348749a0369ca"> Object-oriented picture archivist <a href="#e3bc8b23a3494628a44348749a0369ca" title="permalink">#</a> </h3> <p> If you were to implement such a picture archivist program with an object-oriented design, you may use Dependency Injection so that you can 'mock' the file system during unit testing. A typical program might then work like this at run time: </p> <p> <img src="/content/binary/object-oriented-file-system-interaction.png" alt="An object-oriented program typically has busy interaction with the file system."> </p> <p> The program has fine-grained, busy interaction with the file system (through a polymorphic interface). It'll typically read one file, load its metadata, decide where to put the file, and copy it there. Then it'll move on to the next file, although it might also do this in parallel. Throughout the program execution, there's input and output going on, which makes it difficult to isolate the pure from the impure code. </p> <p> Even if you write a program like that in <a href="https://fsharp.org">F#</a>, it's hardly a <a href="/2018/11/19/functional-architecture-a-definition">functional architecture</a>. </p> <p> Such an architecture is, in theory, testable, but my experience is that if you attempt to reproduce such busy, fine-grained interaction with mocks and stubs, you're likely to end up with brittle tests. </p> <h3 id="6cddf0e7ca3549c49a87006bfba5d349"> Functional picture archivist <a href="#6cddf0e7ca3549c49a87006bfba5d349" title="permalink">#</a> </h3> <p> In functional programming, you'll have to <a href="/2017/02/02/dependency-rejection">reject the notion of dependencies</a>. 
Instead, you can often resort to the simple architecture I call an <em>impure-pure-impure sandwich</em>; here, specifically: <ol> <li>Load data from disk (impure)</li> <li>Transform the data (pure)</li> <li>Write data to disk (impure)</li> </ol> A typical program might then work like this at run time: </p> <p> <img src="/content/binary/functional-file-system-interaction.png" alt="A functional program typically loads data, transforms it, and stores it again."> </p> <p> When the program starts, it loads data from disk into a tree. It then manipulates the in-memory model of the files in question, and once it's done, it traverses the entire tree and applies the changes. </p> <p> This gives you a much clearer separation between the pure and impure parts of the code base. The pure part is bigger, and easier to unit test. </p> <h3 id="09d2184be64a428d85b4f01f1149ea7a"> Example code <a href="#09d2184be64a428d85b4f01f1149ea7a" title="permalink">#</a> </h3> <p> This article gave you an overview of the functional architecture. In the next two articles, you'll see how to do this in practice. First, I'll implement the above architecture in <a href="https://www.haskell.org">Haskell</a>, so that we know that if it works there, the architecture does, indeed, respect <a href="/2018/11/19/functional-architecture-a-definition">the functional interaction law</a>. </p> <p> Based on the Haskell implementation, you'll then see a port to F#. <ul> <li><a href="/2019/09/09/picture-archivist-in-haskell">Picture archivist in Haskell</a></li> <li><a href="/2019/09/16/picture-archivist-in-f">Picture archivist in F#</a></li> </ul> These two articles share the same architecture. You can read both, or one of them, as you like. The source code is available on GitHub. 
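</p> <p> As a taste of what's to come, the impure-pure-impure sandwich described above could be sketched like this in Haskell. This is only an illustration under simplifying assumptions: the data is plain file content instead of a tree, and <code>transform</code> and <code>sandwich</code> are hypothetical names that don't come from the example code base. </p> <p> <pre>import Data.Char (toUpper)

-- Pure filling: there's no IO in the type, so it can't touch the disk.
transform :: String -&gt; String
transform = map toUpper

-- Impure bread around the pure filling.
sandwich :: FilePath -&gt; FilePath -&gt; IO ()
sandwich source destination = do
  input &lt;- readFile source        -- 1. load data from disk (impure)
  let output = transform input    -- 2. transform the data (pure)
  writeFile destination output    -- 3. write data to disk (impure)</pre> </p> <p> Only <code>sandwich</code> has an <code>IO</code> type; all decisions are made in <code>transform</code>. Since <code>readFile</code> is lazy, <code>source</code> and <code>destination</code> should be different files. 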
</p> <h3 id="09e32b681b7a48aa808965bd66c4794b"> Summary <a href="#09e32b681b7a48aa808965bd66c4794b" title="permalink">#</a> </h3> <p> One of the hardest problems in transitioning from object-oriented programming to functional programming is that the design approach is so different. Many well-understood design patterns and principles don't translate easily. Dependency Injection is one of those. Often, you'll have to flip the model on its head, so to speak, before you can take it on in a functional manner. </p> <p> While most object-oriented programmers would say that object-oriented design involves focusing on 'the nouns', in practice, it often revolves around interactions and behaviour. Sometimes, that's appropriate, but often, it's not. </p> <p> Functional programming, in contrast, tends to take a more data-oriented perspective. Load some data, manipulate it, and publish it. If you can come up with an appropriate data structure for the data, you're probably on your way to implementing a functional architecture. </p> <p> <strong>Next:</strong> <a href="/2019/09/09/picture-archivist-in-haskell">Picture archivist in Haskell</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. A rose tree functor https://blog.ploeh.dk/2019/08/19/a-rose-tree-functor 2019-08-19T08:08:00+00:00 Mark Seemann <div id="post"> <p> <em>Rose trees form normal functors. A place-holder article for object-oriented programmers.</em> </p> <p> This article is an instalment in <a href="/2018/03/22/functors">an article series about functors</a>. As another article explains, <a href="/2019/08/12/rose-tree-bifunctor">a rose tree is a bifunctor</a>. This makes it trivially a functor. As such, this article is mostly a place-holder to fit the spot in the <em>functor table of contents</em>, thereby indicating that rose trees are functors. 
</p> <p> Since a rose tree is a bifunctor, it's actually not one, but two, functors. Many languages, C# included, are best equipped to deal with unambiguous functors. This is also true in <a href="https://haskell.org">Haskell</a>, where you'd usually define the <code>Functor</code> instance over a bifunctor's right, or second, side. Likewise, in C#, you can make <code>IRoseTree&lt;N, L&gt;</code> a functor by implementing <code>Select</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;&nbsp;Select&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;source, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;&nbsp;selector) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.SelectLeaf(selector); }</pre> </p> <p> This method simply delegates all implementation to the <code>SelectLeaf</code> method; it's just <code>SelectLeaf</code> by another name. It obeys the functor laws, since these are just specializations of the bifunctor laws, and we know that a rose tree is a proper bifunctor. </p> <p> It would have been technically possible to instead implement a <code>Select</code> method by calling <code>SelectNode</code>, but it seems marginally more useful to enable syntactic sugar for mapping over the leaves. 
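</p> <p> For comparison, here's a sketch of how the same relationship could be expressed in Haskell. The <code>Tree</code> type below is an assumed, simplified rose tree defined for this illustration, not the Church-encoded C# version used in this article. </p> <p> <pre>import Data.Bifunctor (Bifunctor (..))

-- Assumed rose tree: internal nodes carry an 'a', leaves carry a 'b'.
data Tree a b = Node a [Tree a b] | Leaf b
  deriving (Eq, Show)

-- bimap plays the role of SelectBoth: it maps nodes and leaves at once.
instance Bifunctor Tree where
  bimap f g (Node x branches) = Node (f x) (map (bimap f g) branches)
  bimap _ g (Leaf x) = Leaf (g x)

-- The Functor instance maps the second (leaf) side, just like
-- Select delegates to SelectLeaf.
instance Functor (Tree a) where
  fmap = second</pre> </p> <p> Here, <code>fmap</code> is just <code>second</code> by another name, in the same way that <code>Select</code> is just <code>SelectLeaf</code> by another name. 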
</p> <h3 id="134b75d98069421e9fe70a8630ac140f"> Menu example <a href="#134b75d98069421e9fe70a8630ac140f" title="permalink">#</a> </h3> <p> As an example, imagine that you're defining part of a menu bar for an old-fashioned desktop application. Perhaps you're even loading the structure of the menu from a text file. Doing so, you could create a simple tree that represents the <em>edit</em> menu: </p> <p> <pre><span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;editMenuTemplate&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#a31515;">&quot;Edit&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#a31515;">&quot;Find&nbsp;and&nbsp;Replace&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;Find&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;Replace&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#a31515;">&quot;Case&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;Upper&quot;</span>), 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;Lower&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;Cut&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;Copy&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;Paste&quot;</span>));</pre> </p> <p> At this point, you have an <code>IRoseTree&lt;string, string&gt;</code>, so you might as well have used a <a href="/2018/08/06/a-tree-functor">'normal' tree</a> instead of a rose tree. 
The above template, however, is only a first step, because you have this <a href="https://en.wikipedia.org/wiki/Command_pattern">Command</a> class: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">Command</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;Command(<span style="color:blue;">string</span>&nbsp;name) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Name&nbsp;=&nbsp;name; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">string</span>&nbsp;Name&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">virtual</span>&nbsp;<span style="color:blue;">void</span>&nbsp;Execute() &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> Apart from this base class, you also have classes that derive from it: <code>FindCommand</code>, <code>ReplaceCommand</code>, and so on. These classes override the <code>Execute</code> method by implementing <em>find</em>, <em>replace</em>, etc. functionality. Imagine that you also have a store or dictionary of these derived objects. This enables you to transform the template tree into a useful user menu: </p> <p> <pre><span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:#2b91af;">Command</span>&gt;&nbsp;editMenu&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;name&nbsp;<span style="color:blue;">in</span>&nbsp;editMenuTemplate &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">select</span>&nbsp;commandStore.Lookup(name);</pre> </p> <p> Notice how this transforms only the leaves, using the command store's <code>Lookup</code> method. 
This example uses C# query syntax, because this is what the <code>Select</code> method enables, but you could also have written the translation by just calling the <code>Select</code> method. </p> <p> The internal nodes in a menu have no behaviour, so it makes little sense to attempt to turn them into <code>Command</code> objects as well. They're only there to provide structure to the menu. With a 'normal' tree, you wouldn't have been able to enrich only the leaves, while leaving the internal nodes untouched, but with a rose tree you can. </p> <p> The above example uses the <code>Select</code> method (via query syntax) to translate the leaves, thereby providing a demonstration of how to use the rose tree as the functor it is. </p> <h3 id="c77f1f9491b246f1bdb7c75d93eaa4ff"> Summary <a href="#c77f1f9491b246f1bdb7c75d93eaa4ff" title="permalink">#</a> </h3> <p> The <code>Select</code> method doesn't implement any behaviour not already provided by <code>SelectLeaf</code>, but it enables C# query syntax. The C# compiler understands functors, but not bifunctors, so when you have a bifunctor, you might as well light up that language feature as well by adding a <code>Select</code> method. </p> <p> <strong>Next:</strong> <a href="/2018/08/13/a-visitor-functor">A Visitor functor</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Rose tree bifunctor https://blog.ploeh.dk/2019/08/12/rose-tree-bifunctor 2019-08-12T10:33:00+00:00 Mark Seemann <div id="post"> <p> <em>A rose tree forms a bifunctor. An article for object-oriented developers.</em> </p> <p> This article is an instalment in <a href="/2018/12/24/bifunctors">an article series about bifunctors</a>. While the overview article explains that there are essentially two practically useful bifunctors, here's a third: <a href="https://en.wikipedia.org/wiki/Rose_tree">rose trees</a>. 
</p> <h3 id="985e3bc5291c4f8ba98ce258e78f4ec8"> Mapping both dimensions <a href="#985e3bc5291c4f8ba98ce258e78f4ec8" title="permalink">#</a> </h3> <p> Like in the <a href="/2019/01/07/either-bifunctor">previous article on the Either bifunctor</a>, I'll start by implementing the simultaneous two-dimensional translation <code>SelectBoth</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N1</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;&nbsp;SelectBoth&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">N1</span>,&nbsp;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;source, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">N1</span>&gt;&nbsp;selectNode, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;&nbsp;selectLeaf) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.Cata( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;node:&nbsp;(n,&nbsp;branches)&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseNode</span>&lt;<span style="color:#2b91af;">N1</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;(selectNode(n),&nbsp;branches), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;leaf:&nbsp;l&nbsp;=&gt;&nbsp;(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N1</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;)<span style="color:blue;">new</span>&nbsp;<span 
style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">N1</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;(selectLeaf(l))); }</pre> </p> <p> This article uses the previously shown <a href="/2019/07/29/church-encoded-rose-tree">Church-encoded rose tree</a> and <a href="/2019/08/05/rose-tree-catamorphism">its catamorphism</a> <code>Cata</code>. </p> <p> In the <code>leaf</code> case, the <code>l</code> argument received by the lambda expression is an object of the type <code>L</code>, since the <code>source</code> tree is an <code>IRoseTree&lt;N, L&gt;</code> object; i.e. a tree with leaves of the type <code>L</code> and nodes of the type <code>N</code>. The <code>selectLeaf</code> argument is a function that converts an <code>L</code> object to an <code>L1</code> object. Since <code>l</code> is an <code>L</code> object, you can call <code>selectLeaf</code> with it to produce an <code>L1</code> object. You can use this resulting object to create a new <code>RoseLeaf&lt;N1, L1&gt;</code>. Keep in mind that while the <code>RoseLeaf</code> class requires two type arguments, it never requires an object of its <code>N</code> type argument, which means that you can create an object with any <em>node</em> type argument, including <code>N1</code>, even if you don't have an object of that type. </p> <p> In the <code>node</code> case, the lambda expression receives two objects: <code>n</code> and <code>branches</code>. The <code>n</code> object has the type <code>N</code>, while the <code>branches</code> object has the type <code>IEnumerable&lt;IRoseTree&lt;N1, L1&gt;&gt;</code>. In other words, the <code>branches</code> have already been translated to the desired result type. That's how the catamorphism works. This means that you only have to figure out how to translate the <code>N</code> object <code>n</code> to an <code>N1</code> object. 
The <code>selectNode</code> function argument can do that, so you can then create a new <code>RoseNode&lt;N1, L1&gt;</code> and return it. </p> <p> This works as expected: </p> <p> <pre>&gt; <span style="color:blue;">var</span>&nbsp;tree&nbsp;=&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#a31515;">&quot;foo&quot;</span>,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(42),&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(1337)); &gt; tree RoseNode&lt;string, int&gt;("foo", IRoseTree&lt;string, int&gt;[2] { 42, 1337 }) &gt; tree.SelectBoth(s&nbsp;=&gt;&nbsp;s.Length,&nbsp;i&nbsp;=&gt;&nbsp;i.ToString()) RoseNode&lt;int, string&gt;(3, IRoseTree&lt;int, string&gt;[2] { "42", "1337" })</pre> </p> <p> This <em>C# Interactive</em> example shows how to convert a tree with internal string nodes and integer leaves to a tree of internal integer nodes and string leaves. The strings are converted to integers by reading their <code>Length</code>, while the integers are turned into strings using the standard <code>ToString</code> method available on all objects. </p> <h3 id="c0ea04cfe7d3412c86b9ba3953812025"> Mapping nodes <a href="#c0ea04cfe7d3412c86b9ba3953812025" title="permalink">#</a> </h3> <p> When you have <code>SelectBoth</code>, you can trivially implement the translations for each dimension in isolation. For <a href="/2018/12/31/tuple-bifunctor">tuple bifunctors</a>, I called these methods <code>SelectFirst</code> and <code>SelectSecond</code>, while for <a href="/2019/01/07/either-bifunctor">Either bifunctors</a>, I chose to name them <code>SelectLeft</code> and <code>SelectRight</code>. 
Continuing the trend of naming the translations after what they translate, instead of their positions, I'll name the corresponding methods here <code>SelectNode</code> and <code>SelectLeaf</code>. In <a href="https://www.haskell.org">Haskell</a>, the functions associated with <code>Data.Bifunctor</code> are always called <code>first</code> and <code>second</code>, but I see no reason to preserve such abstract naming in C#. In Haskell, these functions are part of the <code>Bifunctor</code> type class; the abstract names serve an actual purpose. This isn't the case in C#, so there's no reason to retain the abstract names. You might as well use names that communicate intent, which is what I've tried to do here. </p> <p> If you want to map only the internal nodes, you can implement a <code>SelectNode</code> method based on <code>SelectBoth</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N1</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;SelectNode&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">N1</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;source, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">N1</span>&gt;&nbsp;selector) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.SelectBoth(selector,&nbsp;l&nbsp;=&gt;&nbsp;l); }</pre> </p> <p> This simply uses the <code>l =&gt; l</code> lambda expression as an ad-hoc <em>identity</em> function, while passing <code>selector</code> as the <code>selectNode</code> argument to the <code>SelectBoth</code> method. 
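</p> <p> The relationship between the three methods can be tried out in isolation. What follows is only a minimal sketch: <code>SketchTree</code>, <code>SketchNode</code>, <code>SketchLeaf</code>, and <code>Show</code> are hypothetical names invented for this illustration, and the tree is a plain class hierarchy rather than the Church-encoded <code>IRoseTree&lt;N, L&gt;</code> used in this article. The methods are instance methods instead of extension methods, so that the code can also be pasted into a script session. </p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for IRoseTree<N, L>; not the Church encoding.
public abstract class SketchTree<N, L>
{
    // Catamorphism: the node handler receives the already-reduced branches.
    public abstract T Cata<T>(Func<N, IEnumerable<T>, T> node, Func<L, T> leaf);

    public SketchTree<N1, L1> SelectBoth<N1, L1>(
        Func<N, N1> selectNode,
        Func<L, L1> selectLeaf)
    {
        return Cata<SketchTree<N1, L1>>(
            (n, branches) => new SketchNode<N1, L1>(selectNode(n), branches.ToArray()),
            l => new SketchLeaf<N1, L1>(selectLeaf(l)));
    }

    // Each single-axis map fixes the other axis with an identity lambda.
    public SketchTree<N1, L> SelectNode<N1>(Func<N, N1> selector)
    {
        return SelectBoth(selector, l => l);
    }

    public SketchTree<N, L1> SelectLeaf<L1>(Func<L, L1> selector)
    {
        return SelectBoth(n => n, selector);
    }

    // Renders the tree as text, which makes results easy to inspect.
    public string Show()
    {
        return Cata<string>(
            (n, branches) => "Node(" + n + ", [" + string.Join(", ", branches) + "])",
            l => "Leaf(" + l + ")");
    }
}

public sealed class SketchLeaf<N, L> : SketchTree<N, L>
{
    private readonly L value;

    public SketchLeaf(L value) { this.value = value; }

    public override T Cata<T>(Func<N, IEnumerable<T>, T> node, Func<L, T> leaf)
    {
        return leaf(value);
    }
}

public sealed class SketchNode<N, L> : SketchTree<N, L>
{
    private readonly N value;
    private readonly IReadOnlyList<SketchTree<N, L>> branches;

    public SketchNode(N value, params SketchTree<N, L>[] branches)
    {
        this.value = value;
        this.branches = branches;
    }

    public override T Cata<T>(Func<N, IEnumerable<T>, T> node, Func<L, T> leaf)
    {
        // Reduce every branch first, then hand the results to the node handler.
        return node(value, branches.Select(b => b.Cata(node, leaf)).ToList());
    }
}
```

<p> With these assumed types, <code>new SketchNode&lt;string, int&gt;("foo", new SketchLeaf&lt;string, int&gt;(42), new SketchLeaf&lt;string, int&gt;(1337)).SelectNode(s =&gt; s.Length).Show()</code> should render as <code>Node(3, [Leaf(42), Leaf(1337)])</code>: only the internal node changes, while the leaves pass through the identity lambda untouched.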
</p> <p> You can use this to map the above <code>tree</code> to a tree made entirely of numbers: </p> <p> <pre>&gt; <span style="color:blue;">var</span>&nbsp;tree&nbsp;=&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#a31515;">&quot;foo&quot;</span>,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(42),&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(1337)); &gt; tree.SelectNode(s =&gt; s.Length) RoseNode&lt;int, int&gt;(3, IRoseTree&lt;int, int&gt;[2] { 42, 1337 })</pre> </p> <p> Such a tree is, incidentally, isomorphic to a <a href="/2018/08/06/a-tree-functor">'normal' tree</a>. It might be a good exercise, if you need one, to demonstrate the isomorphism by writing functions that convert a <code>Tree&lt;T&gt;</code> into an <code>IRoseTree&lt;T, T&gt;</code>, and vice versa. 
</p> <h3 id="baa9136b506241e39e13639e43679b31"> Mapping leaves <a href="#baa9136b506241e39e13639e43679b31" title="permalink">#</a> </h3> <p> Similar to <code>SelectNode</code>, you can also trivially implement <code>SelectLeaf</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;&nbsp;SelectLeaf&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;source, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">L1</span>&gt;&nbsp;selector) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.SelectBoth(n&nbsp;=&gt;&nbsp;n,&nbsp;selector); }</pre> </p> <p> This is another one-liner calling <code>SelectBoth</code>, with the difference that the identity function <code>n =&gt; n</code> is passed as the first argument, instead of as the last. 
This ensures that only <code>RoseLeaf</code> values are mapped: </p> <p> <pre>&gt; <span style="color:blue;">var</span>&nbsp;tree&nbsp;=&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#a31515;">&quot;foo&quot;</span>,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(42),&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(1337)); &gt; tree.SelectLeaf(i =&gt; i % 2 == 0) RoseNode&lt;string, bool&gt;("foo", IRoseTree&lt;string, bool&gt;[2] { true, false })</pre> </p> <p> In the above <em>C# Interactive</em> session, the leaves are mapped to Boolean values, indicating whether they're even or odd. </p> <h3 id="afddb846bd244f4aa8f658fb5716b392"> Identity laws <a href="#afddb846bd244f4aa8f658fb5716b392" title="permalink">#</a> </h3> <p> Rose trees obey all the bifunctor laws. While it's formal work to prove that this is the case, you can get an intuition for it via examples. Often, I use a property-based testing library like <a href="https://fscheck.github.io/FsCheck">FsCheck</a> or <a href="https://github.com/hedgehogqa/fsharp-hedgehog">Hedgehog</a> to demonstrate (not prove) that laws hold, but in this article, I'll keep it simple and only cover each law with a parametrised test. 
</p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">T</span>&nbsp;Id&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:#2b91af;">T</span>&nbsp;x)&nbsp;=&gt;&nbsp;x; <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:blue;">object</span>[]&gt;&nbsp;BifunctorLawsData { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">get</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">yield</span>&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;&quot;</span>)&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">yield</span>&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;foo&quot;</span>)&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">yield</span>&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(42)&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">yield</span>&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span 
style="color:#2b91af;">RoseTree</span>.Node(42,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;bar&quot;</span>))&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">yield</span>&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;exampleTree&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;} } [<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">MemberData</span>(<span style="color:blue;">nameof</span>(BifunctorLawsData))] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;SelectNodeObeysFirstFunctorLaw(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;t) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(t,&nbsp;t.SelectNode(Id)); }</pre> </p> <p> This test uses <a href="https://xunit.github.io">xUnit.net</a>'s <code>[Theory]</code> feature to supply a small set of example input values. The input values are defined by the <code>BifunctorLawsData</code> property, since I'll reuse the same values for all the bifunctor law demonstration tests. The <code>exampleTree</code> object is the tree shown in <a href="/2019/07/29/church-encoded-rose-tree">Church-encoded rose tree</a>. </p> <p> The tests also use the identity function implemented as a <code>private</code> function called <code>Id</code>, since C# doesn't come equipped with such a function in the Base Class Library. </p> <p> For all the <code>IRoseTree&lt;int, string&gt;</code> objects <code>t</code>, the test simply verifies that the original tree <code>t</code> is equal to the tree projected over the first axis with the <code>Id</code> function. 
</p> <p> Likewise, the first functor law applies when translating over the second dimension: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">MemberData</span>(<span style="color:blue;">nameof</span>(BifunctorLawsData))] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;SelectLeafObeysFirstFunctorLaw(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;t) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(t,&nbsp;t.SelectLeaf(Id)); }</pre> </p> <p> This is the same test as the previous test, with the only exception that it calls <code>SelectLeaf</code> instead of <code>SelectNode</code>. </p> <p> Both <code>SelectNode</code> and <code>SelectLeaf</code> are implemented by <code>SelectBoth</code>, so the real test is whether this method obeys the identity law: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">MemberData</span>(<span style="color:blue;">nameof</span>(BifunctorLawsData))] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;SelectBothObeysIdentityLaw(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;t) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(t,&nbsp;t.SelectBoth(Id,&nbsp;Id)); }</pre> </p> <p> Projecting over both dimensions with the identity function does, indeed, return an object equal to the input object. 
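</p> <p> These assertions only work if two separately constructed trees with the same shape and values compare as equal. The following is a minimal sketch of what such structural equality could look like. The <code>EqTree</code>, <code>EqNode</code>, <code>EqLeaf</code>, and <code>EqTreeSketch</code> names are hypothetical, the mapping uses plain recursion instead of <code>Cata</code>, and the classes used in this article may implement equality differently. </p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only.
public abstract class EqTree<N, L> { }

public sealed class EqLeaf<N, L> : EqTree<N, L>
{
    public L Value { get; }

    public EqLeaf(L value) { Value = value; }

    // Two leaves are equal when their values are equal.
    public override bool Equals(object obj)
    {
        return obj is EqLeaf<N, L> other && object.Equals(Value, other.Value);
    }

    public override int GetHashCode()
    {
        return Value == null ? 0 : Value.GetHashCode();
    }
}

public sealed class EqNode<N, L> : EqTree<N, L>
{
    public N Value { get; }
    public IReadOnlyList<EqTree<N, L>> Branches { get; }

    public EqNode(N value, params EqTree<N, L>[] branches)
    {
        Value = value;
        Branches = branches;
    }

    // Two nodes are equal when their values and all branches, in order, are equal.
    public override bool Equals(object obj)
    {
        return obj is EqNode<N, L> other
            && object.Equals(Value, other.Value)
            && Branches.SequenceEqual(other.Branches);
    }

    public override int GetHashCode()
    {
        return Branches.Aggregate(
            Value == null ? 0 : Value.GetHashCode(),
            (hash, branch) => unchecked(hash * 31 + branch.GetHashCode()));
    }
}

public static class EqTreeSketch
{
    // Plain recursion instead of a catamorphism, to keep the sketch short.
    public static EqTree<N1, L1> SelectBoth<N, L, N1, L1>(
        EqTree<N, L> tree,
        Func<N, N1> selectNode,
        Func<L, L1> selectLeaf)
    {
        if (tree is EqNode<N, L> node)
            return new EqNode<N1, L1>(
                selectNode(node.Value),
                node.Branches.Select(b => SelectBoth(b, selectNode, selectLeaf)).ToArray());

        return new EqLeaf<N1, L1>(selectLeaf(((EqLeaf<N, L>)tree).Value));
    }

    // Mapping both axes with identity functions rebuilds the tree; the copy must equal the original.
    public static bool IdentityLawHolds<N, L>(EqTree<N, L> tree)
    {
        return tree.Equals(SelectBoth(tree, n => n, l => l));
    }
}
```

<p> Without the <code>Equals</code> overrides, the mapped tree would be a different object graph and a reference comparison would fail, even though the identity law holds in every meaningful sense.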
</p> <h3 id="bfaa0b763e5346c488f4bd9576ab894c"> Consistency law <a href="#bfaa0b763e5346c488f4bd9576ab894c" title="permalink">#</a> </h3> <p> In general, it shouldn't matter whether you map with <code>SelectBoth</code> or a combination of <code>SelectNode</code> and <code>SelectLeaf</code>: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">MemberData</span>(<span style="color:blue;">nameof</span>(BifunctorLawsData))] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;ConsistencyLawHolds(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;t) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">DateTime</span>&nbsp;f(<span style="color:blue;">int</span>&nbsp;i)&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DateTime</span>(i); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">bool</span>&nbsp;g(<span style="color:blue;">string</span>&nbsp;s)&nbsp;=&gt;&nbsp;<span style="color:blue;">string</span>.IsNullOrWhiteSpace(s); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(t.SelectBoth(f,&nbsp;g),&nbsp;t.SelectLeaf(g).SelectNode(f)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectNode(f).SelectLeaf(g), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectLeaf(g).SelectNode(f)); }</pre> </p> <p> This example creates two local functions <code>f</code> and <code>g</code>. The first function, <code>f</code>, creates a new <code>DateTime</code> object from an integer, using one of the <code>DateTime</code> constructor overloads. The second function, <code>g</code>, just delegates to <code>string.IsNullOrWhiteSpace</code>, although I want to stress that this is just an example. 
The law should hold for any two (<a href="https://en.wikipedia.org/wiki/Pure_function">pure</a>) functions. </p> <p> The test then verifies that you get the same result from calling <code>SelectBoth</code> as when you call <code>SelectNode</code> followed by <code>SelectLeaf</code>, or the other way around. </p> <h3 id="dd3046c49d564991bb47924b6e8e65fb"> Composition laws <a href="#dd3046c49d564991bb47924b6e8e65fb" title="permalink">#</a> </h3> <p> The composition laws insist that you can compose functions, or translations, and that again, the choice to do one or the other doesn't matter. Along each of the axes, it's just the second functor law applied. This parametrised test demonstrates that the law holds for <code>SelectNode</code>: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">MemberData</span>(<span style="color:blue;">nameof</span>(BifunctorLawsData))] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;SecondFunctorLawHoldsForSelectNode(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;t) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">char</span>&nbsp;f(<span style="color:blue;">bool</span>&nbsp;b)&nbsp;=&gt;&nbsp;b&nbsp;?&nbsp;<span style="color:#a31515;">&#39;T&#39;</span>&nbsp;:&nbsp;<span style="color:#a31515;">&#39;F&#39;</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">bool</span>&nbsp;g(<span style="color:blue;">int</span>&nbsp;i)&nbsp;=&gt;&nbsp;i&nbsp;%&nbsp;2&nbsp;==&nbsp;0; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectNode(x&nbsp;=&gt;&nbsp;f(g(x))), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectNode(g).SelectNode(f)); }</pre> </p> <p> Here, <code>f</code> is a local function that returns the character <code>'T'</code> for <code>true</code>, and <code>'F'</code> for 
<code>false</code>; <code>g</code> is the <em>even</em> function. The second functor law states that mapping <code>f(g(x))</code> in a single step is equivalent to first mapping over <code>g</code> and then mapping the result of that using <code>f</code>. </p> <p> The same law applies if you fix the first dimension and translate over the second: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">MemberData</span>(<span style="color:blue;">nameof</span>(BifunctorLawsData))] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;SecondFunctorLawHoldsForSelectLeaf(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;t) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">bool</span>&nbsp;f(<span style="color:blue;">int</span>&nbsp;x)&nbsp;=&gt;&nbsp;x&nbsp;%&nbsp;2&nbsp;==&nbsp;0; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;g(<span style="color:blue;">string</span>&nbsp;s)&nbsp;=&gt;&nbsp;s.Length; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectLeaf(x&nbsp;=&gt;&nbsp;f(g(x))), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectLeaf(g).SelectLeaf(f)); }</pre> </p> <p> Here, <code>f</code> is the <em>even</em> function, whereas <code>g</code> is a local function that returns the length of a string. Again, the test demonstrates that the output is the same whether you map over an intermediary step, or whether you map using only a single step. 
</p> <p> This generalises to the composition law for <code>SelectBoth</code>: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>,&nbsp;<span style="color:#2b91af;">MemberData</span>(<span style="color:blue;">nameof</span>(BifunctorLawsData))] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;SelectBothCompositionLawHolds(<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;t) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">char</span>&nbsp;f(<span style="color:blue;">bool</span>&nbsp;b)&nbsp;=&gt;&nbsp;b&nbsp;?&nbsp;<span style="color:#a31515;">&#39;T&#39;</span>&nbsp;:&nbsp;<span style="color:#a31515;">&#39;F&#39;</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">bool</span>&nbsp;g(<span style="color:blue;">int</span>&nbsp;x)&nbsp;=&gt;&nbsp;x&nbsp;%&nbsp;2&nbsp;==&nbsp;0; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">bool</span>&nbsp;h(<span style="color:blue;">int</span>&nbsp;x)&nbsp;=&gt;&nbsp;x&nbsp;%&nbsp;2&nbsp;==&nbsp;0; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;i(<span style="color:blue;">string</span>&nbsp;s)&nbsp;=&gt;&nbsp;s.Length; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectBoth(x&nbsp;=&gt;&nbsp;f(g(x)),&nbsp;y&nbsp;=&gt;&nbsp;h(i(y))), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.SelectBoth(g,&nbsp;i).SelectBoth(f,&nbsp;h)); }</pre> </p> <p> Again, whether you translate in one or two steps shouldn't affect the outcome. </p> <p> As all of these tests demonstrate, the bifunctor laws hold for rose trees. The tests only showcase five examples, but I hope it gives you an intuition how any rose tree is a bifunctor. After all, the <code>SelectNode</code>, <code>SelectLeaf</code>, and <code>SelectBoth</code> methods are all generic, and they behave the same for all generic type arguments. 
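</p> <p> Five hand-picked examples can be supplemented with generated ones. The sketch below is not a substitute for a property-based testing library; it only shows the idea with a seeded <code>System.Random</code>. All type and method names (<code>LawTree</code>, <code>LawNode</code>, <code>LawLeaf</code>, <code>RandomLawCheck</code>) are hypothetical, the tree is a simplified class hierarchy rather than the Church-encoded one, and two trees are compared via a textual rendering, which is an assumption that happens to be good enough for the value types used here. </p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for IRoseTree<N, L>.
public abstract class LawTree<N, L>
{
    public abstract T Cata<T>(Func<N, IEnumerable<T>, T> node, Func<L, T> leaf);

    public LawTree<N1, L1> SelectBoth<N1, L1>(Func<N, N1> selectNode, Func<L, L1> selectLeaf)
    {
        return Cata<LawTree<N1, L1>>(
            (n, branches) => new LawNode<N1, L1>(selectNode(n), branches.ToArray()),
            l => new LawLeaf<N1, L1>(selectLeaf(l)));
    }

    // Textual rendering, used here as a cheap way to compare two trees.
    public string Show()
    {
        return Cata<string>(
            (n, branches) => "Node(" + n + ", [" + string.Join(", ", branches) + "])",
            l => "Leaf(" + l + ")");
    }
}

public sealed class LawLeaf<N, L> : LawTree<N, L>
{
    private readonly L value;

    public LawLeaf(L value) { this.value = value; }

    public override T Cata<T>(Func<N, IEnumerable<T>, T> node, Func<L, T> leaf)
    {
        return leaf(value);
    }
}

public sealed class LawNode<N, L> : LawTree<N, L>
{
    private readonly N value;
    private readonly IReadOnlyList<LawTree<N, L>> branches;

    public LawNode(N value, params LawTree<N, L>[] branches)
    {
        this.value = value;
        this.branches = branches;
    }

    public override T Cata<T>(Func<N, IEnumerable<T>, T> node, Func<L, T> leaf)
    {
        return node(value, branches.Select(b => b.Cata(node, leaf)).ToList());
    }
}

public static class RandomLawCheck
{
    // Builds a random tree with at most maxDepth levels of internal nodes.
    public static LawTree<int, string> Generate(Random rnd, int maxDepth)
    {
        if (maxDepth == 0 || rnd.Next(3) == 0)
            return new LawLeaf<int, string>(new string('x', rnd.Next(5)));

        var branches = Enumerable.Range(0, rnd.Next(4))
            .Select(_ => Generate(rnd, maxDepth - 1))
            .ToArray();
        return new LawNode<int, string>(rnd.Next(100), branches);
    }

    // Same functions as in the SelectBoth composition test above.
    public static bool CompositionLawHolds(LawTree<int, string> t)
    {
        Func<bool, char> f = b => b ? 'T' : 'F';
        Func<int, bool> g = x => x % 2 == 0;
        Func<int, bool> h = x => x % 2 == 0;
        Func<string, int> i = s => s.Length;

        var oneStep = t.SelectBoth(x => f(g(x)), y => h(i(y)));
        var twoSteps = t.SelectBoth(g, i).SelectBoth(f, h);
        return oneStep.Show() == twoSteps.Show();
    }

    public static bool HoldsForRandomTrees(int seed, int count)
    {
        var rnd = new Random(seed);
        return Enumerable.Range(0, count).All(_ => CompositionLawHolds(Generate(rnd, 4)));
    }
}
```

<p> Calling <code>RandomLawCheck.HoldsForRandomTrees(42, 100)</code> checks the composition law against a hundred generated trees. Like the parametrised tests, this demonstrates the law rather than proves it.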
</p> <h3 id="a1a5dea3d85d4ed1a3ee3fb0a4dca820"> Summary <a href="#a1a5dea3d85d4ed1a3ee3fb0a4dca820" title="permalink">#</a> </h3> <p> Rose trees are bifunctors. You can translate the node and leaf dimension of a rose tree independently of each other, and the bifunctor laws hold for any pure translation, no matter how you compose the projections. </p> <p> As always, there can be performance differences between the various compositions, but the outputs will be the same regardless of composition. </p> <p> A functor, and by extension, a bifunctor, is a structure-preserving map. This means that any projection preserves the structure of the underlying container. For rose trees this means that the shape of the tree remains the same. The number of leaves remains the same, as does the number of internal nodes. </p> <p> A catamorphism is a <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstraction</a> that describes how to digest a data structure into a potentially more compact value. </p> <p> This article presents the catamorphism for a <a href="https://en.wikipedia.org/wiki/Rose_tree">rose tree</a>, as well as how to identify it. The beginning of this article presents the catamorphism in C#, with examples. The rest of the article describes how to deduce the catamorphism. This part of the article presents my work in <a href="https://www.haskell.org">Haskell</a>. Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix. </p> <p> A rose tree is a general-purpose data structure where each node in a tree has an associated value. Each node can have an arbitrary number of branches, including none. What distinguishes a rose tree from just any <a href="https://en.wikipedia.org/wiki/Tree_(data_structure)">tree</a> is that internal nodes can hold values of a different type than leaf values. 
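</p> <p> To make that description concrete, here's a minimal sketch of a two-case tree with separate type arguments for internal nodes and leaves. The names <code>IMatchTree</code>, <code>MatchNode</code>, <code>MatchLeaf</code>, and <code>MatchTreeSketch</code> are assumptions made for this illustration; the Church-encoded rose tree used in the rest of the article may differ in its details. </p>

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a tree with two cases; the names are assumptions.
public interface IMatchTree<N, L>
{
    // Exactly one of the two handlers runs, depending on which case the object is.
    T Match<T>(Func<N, IReadOnlyList<IMatchTree<N, L>>, T> node, Func<L, T> leaf);
}

public sealed class MatchNode<N, L> : IMatchTree<N, L>
{
    private readonly N value;
    private readonly IReadOnlyList<IMatchTree<N, L>> branches;

    // An internal node holds an N value and any number of branches, including none.
    public MatchNode(N value, params IMatchTree<N, L>[] branches)
    {
        this.value = value;
        this.branches = branches;
    }

    public T Match<T>(Func<N, IReadOnlyList<IMatchTree<N, L>>, T> node, Func<L, T> leaf)
    {
        return node(value, branches);
    }
}

public sealed class MatchLeaf<N, L> : IMatchTree<N, L>
{
    private readonly L value;

    // A leaf holds only an L value; it never needs an N value.
    public MatchLeaf(L value) { this.value = value; }

    public T Match<T>(Func<N, IReadOnlyList<IMatchTree<N, L>>, T> node, Func<L, T> leaf)
    {
        return leaf(value);
    }
}

public static class MatchTreeSketch
{
    // Internal nodes carry integers, leaves carry strings.
    public static readonly IMatchTree<int, string> Small =
        new MatchNode<int, string>(42,
            new MatchLeaf<int, string>("foo"),
            new MatchNode<int, string>(1337)); // an internal node without branches

    // Looks only at the top of the tree; Match doesn't recurse on its own.
    public static string Describe<N, L>(IMatchTree<N, L> tree)
    {
        return tree.Match(
            (n, branches) => "node " + n + " with " + branches.Count + " branches",
            l => "leaf " + l);
    }
}
```

<p> With these assumed types, <code>MatchTreeSketch.Describe(MatchTreeSketch.Small)</code> yields <code>node 42 with 2 branches</code>. Note that the node handler receives unprocessed branches, so any recursion has to be added separately.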
</p> <p> <img src="/content/binary/rose-tree-example.png" alt="A rose tree example diagram, with internal nodes containing integers, and leafs containing strings."> </p> <p> The diagram shows an example of a tree of internal integers and leaf strings. All internal nodes contain integer values, and all leaves contain strings. Each node can have an arbitrary number of branches. </p> <h3 id="078386d5f3924a63add86ff199fd88d0"> C# catamorphism <a href="#078386d5f3924a63add86ff199fd88d0" title="permalink">#</a> </h3> <p> As a C# representation of a rose tree, I'll use the <a href="/2019/07/29/church-encoded-rose-tree">Church-encoded rose tree I've previously described</a>. The catamorphism is this extension method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">TResult</span>&nbsp;Cata&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;tree, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">TResult</span>&gt;,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;node, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;leaf) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;tree.Match( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;node:&nbsp;(n,&nbsp;branches)&nbsp;=&gt;&nbsp;node(n,&nbsp;branches.Select(t&nbsp;=&gt;&nbsp;t.Cata(node,&nbsp;leaf))), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;leaf:&nbsp;leaf); 
}</pre> </p> <p> Like most of the other catamorphisms shown in this article series, this one consists of two functions: one that handles the <em>leaf</em> case, and one that handles the partially reduced <em>node</em> case. Compare it with the <a href="/2019/06/10/tree-catamorphism">tree catamorphism</a>: notice that the rose tree catamorphism's <code>node</code> function is identical to the tree catamorphism. The <code>leaf</code> function, however, is new. </p> <p> In previous articles, you've seen other examples of catamorphisms for <a href="/2018/05/22/church-encoding">Church-encoded</a> types. The most common pattern has been that the Church encoding (the <code>Match</code> method) was also the catamorphism, with the <a href="/2019/05/13/peano-catamorphism">Peano catamorphism</a> being the only exception so far. When it comes to the Peano catamorphism, however, I'm not entirely confident that the difference between Church encoding and catamorphism is real, or whether it's just an artefact of the way I originally designed the Church encoding. </p> <p> When it comes to the present rose tree, however, notice that the catamorphism is distinctly different from the Church encoding. That's the reason I called the method <code>Cata</code> instead of <code>Match</code>. </p> <p> The method simply delegates the <code>leaf</code> handler to <code>Match</code>, while it adds behaviour to the <code>node</code> case. It works the same way as for the 'normal' tree catamorphism. </p> <h3 id="87e2c79711c24c63a5ed82fbe4f7b581"> Examples <a href="#87e2c79711c24c63a5ed82fbe4f7b581" title="permalink">#</a> </h3> <p> You can use <code>Cata</code> to implement most other behaviour you'd like <code>IRoseTree&lt;N, L&gt;</code> to have. In a future article, you'll see how to <a href="/2019/08/12/rose-tree-bifunctor">turn the rose tree into a bifunctor</a> and <a href="/2019/08/19/a-rose-tree-functor">functor</a>, so here, we'll look at some other, more ad hoc, examples. 
As is also the case for the 'normal' tree, you can calculate the sum of all nodes, if you can associate a number with each node. </p> <p> Consider the example tree in the above diagram. You can create it as an <code>IRoseTree&lt;int, string&gt;</code> object like this: </p> <p> <pre><span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;exampleTree&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(42, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(1337, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;foo&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;bar&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(2112, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(90125, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;baz&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span 
style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;qux&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;quux&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;quuz&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;corge&quot;</span>));</pre> </p> <p> If you want to calculate a sum for a tree like that, you can use the integers for the internal nodes, and perhaps the length of the strings of the leaves. That hardly makes much sense, but is technically possible: </p> <p> <pre>&gt; exampleTree.Cata((x,&nbsp;xs)&nbsp;=&gt;&nbsp;x&nbsp;+&nbsp;xs.Sum(),&nbsp;x&nbsp;=&gt;&nbsp;x.Length) 93641</pre> </p> <p> Perhaps slightly more useful is to count the number of leaves: </p> <p> <pre>&gt; exampleTree.Cata((_,&nbsp;xs)&nbsp;=&gt;&nbsp;xs.Sum(),&nbsp;_&nbsp;=&gt;&nbsp;1) 7</pre> </p> <p> A leaf node has, by definition, exactly one leaf node, so the <code>leaf</code> lambda expression always returns <code>1</code>. In the <code>node</code> case, <code>xs</code> contains the partially summed leaf node count, so just <code>Sum</code> those together while ignoring the value of the internal node. 
</p> <p> You can also measure the maximum depth of the tree: </p> <p> <pre>&gt; exampleTree.Cata((_,&nbsp;xs)&nbsp;=&gt;&nbsp;1&nbsp;+&nbsp;xs.Max(),&nbsp;_&nbsp;=&gt;&nbsp;0) 3</pre> </p> <p> Consistent with the example for 'normal' trees, you can arbitrarily decide that the depth of a leaf node is <code>0</code>, so again, the <code>leaf</code> lambda expression just returns a constant value. The <code>node</code> lambda expression takes the <code>Max</code> of the partially reduced <code>xs</code> and adds <code>1</code>, since an internal node represents another level of depth in a tree. </p> <h3 id="9e673c50edc14c1790a9e89a67d069d1"> Rose tree F-Algebra <a href="#9e673c50edc14c1790a9e89a67d069d1" title="permalink">#</a> </h3> <p> As in the <a href="/2019/06/10/tree-catamorphism">previous article</a>, I'll use <code>Fix</code> and <code>cata</code> as explained in <a href="https://bartoszmilewski.com">Bartosz Milewski</a>'s excellent <a href="https://bartoszmilewski.com/2017/02/28/f-algebras/">article on F-Algebras</a>. 
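</p>
<p>
If you don't have those definitions at hand, here's a minimal sketch of <code>Fix</code> and <code>cata</code> as I understand them from that article. The <code>NatF</code> functor, <code>toInt</code>, and <code>three</code> are hypothetical names that I've added only to exercise the definitions; they aren't part of the rose tree code.
</p>
<p>
<pre>-- Fixed point of a functor.
newtype Fix f = Fix { unFix :: f (Fix f) }

-- Generic catamorphism: reduce the children first, then apply the algebra.
cata :: Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a
cata alg = alg . fmap (cata alg) . unFix

-- Hypothetical toy functor (illustration only): Peano-style naturals.
data NatF c = ZeroF | SuccF c

instance Functor NatF where
  fmap _ ZeroF = ZeroF
  fmap f (SuccF c) = SuccF (f c)

-- An algebra with carrier type Int: count the SuccF layers.
toInt :: Fix NatF -&gt; Int
toInt = cata alg
  where alg ZeroF = 0
        alg (SuccF n) = 1 + n

three :: Fix NatF
three = Fix (SuccF (Fix (SuccF (Fix (SuccF (Fix ZeroF))))))</pre>
</p>
<p>
With these definitions, <code>toInt three</code> evaluates to <code>3</code>. The algebra only ever sees one layer of the structure, in which every recursive position has already been replaced by a value of the carrier type.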
</p> <p> As always, start with the underlying endofunctor: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;RoseTreeF&nbsp;a&nbsp;b&nbsp;c&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;NodeF&nbsp;{&nbsp;nodeValue&nbsp;::&nbsp;a,&nbsp;nodes&nbsp;::&nbsp;ListFix&nbsp;c&nbsp;} &nbsp;&nbsp;|&nbsp;LeafF&nbsp;{&nbsp;leafValue&nbsp;::&nbsp;b&nbsp;} &nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;(<span style="color:blue;">RoseTreeF</span>&nbsp;a&nbsp;b)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;f&nbsp;(NodeF&nbsp;x&nbsp;ns)&nbsp;=&nbsp;NodeF&nbsp;x&nbsp;$&nbsp;<span style="color:blue;">fmap</span>&nbsp;f&nbsp;ns &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;&nbsp;&nbsp;&nbsp;(LeafF&nbsp;x)&nbsp;=&nbsp;LeafF&nbsp;x</pre> </p> <p> Instead of using Haskell's standard list (<code>[]</code>) for the nodes, I've used <code>ListFix</code> from <a href="/2019/05/27/list-catamorphism">the article on list catamorphism</a>. This should, hopefully, demonstrate how you can build on already established definitions derived from first principles. </p> <p> As usual, I've called the 'data' types <code>a</code> and <code>b</code>, and the carrier type <code>c</code> (for <em>carrier</em>). The <code>Functor</code> instance as usual translates the carrier type; the <code>fmap</code> function has the type <code>(c -&gt; c1) -&gt; RoseTreeF a b c -&gt; RoseTreeF a b c1</code>. </p> <p> As was the case when deducing the recent catamorphisms, Haskell isn't too happy about defining instances for a type like <code>Fix (RoseTreeF a b)</code>. 
To address that problem, you can introduce a <code>newtype</code> wrapper: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;RoseTreeFix&nbsp;a&nbsp;b&nbsp;= &nbsp;&nbsp;RoseTreeFix&nbsp;{&nbsp;unRoseTreeFix&nbsp;::&nbsp;Fix&nbsp;(RoseTreeF&nbsp;a&nbsp;b)&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> You can define <code>Bifunctor</code>, <code>Bifoldable</code>, <code>Bitraversable</code>, etc. instances for this type without resorting to any funky GHC extensions. Keep in mind that ultimately, the purpose of all this code is just to figure out what the catamorphism looks like. This code isn't intended for actual use. </p> <p> A pair of helper functions make it easier to define <code>RoseTreeFix</code> values: </p> <p> <pre><span style="color:#2b91af;">roseLeafF</span>&nbsp;::&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;a&nbsp;b roseLeafF&nbsp;=&nbsp;RoseTreeFix&nbsp;.&nbsp;Fix&nbsp;.&nbsp;LeafF <span style="color:#2b91af;">roseNodeF</span>&nbsp;::&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;(<span style="color:blue;">RoseTreeFix</span>&nbsp;a&nbsp;b)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;a&nbsp;b roseNodeF&nbsp;x&nbsp;=&nbsp;RoseTreeFix&nbsp;.&nbsp;Fix&nbsp;.&nbsp;NodeF&nbsp;x&nbsp;.&nbsp;<span style="color:blue;">fmap</span>&nbsp;unRoseTreeFix</pre> </p> <p> <code>roseLeafF</code> creates a leaf node: </p> <p> <pre>Prelude Fix List RoseTree&gt; roseLeafF "ploeh" RoseTreeFix {unRoseTreeFix = Fix (LeafF "ploeh")}</pre> </p> <p> <code>roseNodeF</code> is a helper function to create internal nodes: </p> <p> <pre>Prelude Fix List RoseTree&gt; roseNodeF 6 (consF (roseLeafF 0) nilF) RoseTreeFix {unRoseTreeFix = Fix (NodeF 
6 (ListFix (Fix (ConsF (Fix (LeafF 0)) (Fix NilF)))))}</pre> </p> <p> Even with helper functions, construction of <code>RoseTreeFix</code> values is cumbersome, but keep in mind that the code shown here isn't meant to be used in practice. The goal is only to deduce catamorphisms from more basic universal abstractions, and you now have all you need to do that. </p> <h3 id="0bfc3f600a9e43eea1026f1a4a3b7604"> Haskell catamorphism <a href="#0bfc3f600a9e43eea1026f1a4a3b7604" title="permalink">#</a> </h3> <p> At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (<code>RoseTreeF a b</code>), and an object <code>c</code>, but you still need to find a morphism <code>RoseTreeF a b c -&gt; c</code>. Notice that the algebra you have to find is the function that reduces the functor to its <em>carrier type</em> <code>c</code>, not any of the 'data types' <code>a</code> or <code>b</code>. This takes some time to get used to, but that's how catamorphisms work. This doesn't mean, however, that you get to ignore <code>a</code> or <code>b</code>, as you'll see. </p> <p> As in the previous articles, start by writing a function that will become the catamorphism, based on <code>cata</code>: </p> <p> <pre>roseTreeF&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unRoseTreeFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(NodeF&nbsp;x&nbsp;ns)&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;(LeafF&nbsp;x)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> While this compiles, with its <code>undefined</code> implementations, it obviously doesn't do anything useful. I find, however, that it helps me think. How can you return a value of the type <code>c</code> from the <code>LeafF</code> case? 
You could pass a function argument to the <code>roseTreeF</code> function and use it with <code>x</code>: </p> <p> <pre>roseTreeF&nbsp;fl&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unRoseTreeFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(NodeF&nbsp;x&nbsp;ns)&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;(LeafF&nbsp;x)&nbsp;=&nbsp;fl&nbsp;x</pre> </p> <p> While you could, technically, pass an argument of the type <code>c</code> to <code>roseTreeF</code> and then return that value from the <code>LeafF</code> case, that would mean that you would ignore the <code>x</code> value. This would be incorrect, so instead, make the argument a function and call it with <code>x</code>. Likewise, you can deal with the <code>NodeF</code> case in the same way: </p> <p> <pre><span style="color:#2b91af;">roseTreeF</span>&nbsp;::&nbsp;(a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">ListFix</span>&nbsp;c&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;(b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;a&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c roseTreeF&nbsp;fn&nbsp;fl&nbsp;=&nbsp;cata&nbsp;alg&nbsp;.&nbsp;unRoseTreeFix &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(NodeF&nbsp;x&nbsp;ns)&nbsp;=&nbsp;fn&nbsp;x&nbsp;ns &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;(LeafF&nbsp;x)&nbsp;=&nbsp;fl&nbsp;x</pre> </p> <p> This works. Since <code>cata</code> has the type <code>Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a</code>, that means that <code>alg</code> has the type <code>f a -&gt; a</code>. 
In the case of <code>RoseTreeF</code>, the compiler infers that the <code>alg</code> function has the type <code>RoseTreeF a b c -&gt; c</code>, which is just what you need! </p> <p> You can now see what the carrier type <code>c</code> is for. It's the type that the algebra extracts, and thus the type that the catamorphism returns. </p> <p> This, then, is the catamorphism for a rose tree. As has been the most common pattern so far, it's a pair, made from two functions. It's still not the only possible catamorphism, since you could trivially flip the arguments to <code>roseTreeF</code>, or the arguments to <code>fn</code>. </p> <p> I've chosen the representation shown here because it's similar to the catamorphism I've shown for a 'normal' tree, just with the added function for leaves. </p> <h3 id="256fd0a09c4a4651b6c27b5626b0fb33"> Basis <a href="#256fd0a09c4a4651b6c27b5626b0fb33" title="permalink">#</a> </h3> <p> You can implement most other useful functionality with <code>roseTreeF</code>. Here's the <code>Bifunctor</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Bifunctor</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;bimap&nbsp;f&nbsp;s&nbsp;=&nbsp;roseTreeF&nbsp;(roseNodeF&nbsp;.&nbsp;f)&nbsp;(roseLeafF&nbsp;.&nbsp;s)</pre> </p> <p> Notice how naturally the catamorphism implements <code>bimap</code>. 
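</p>
<p>
If the <code>Fix</code> machinery obscures why this works, a sketch with an ordinary recursive type may help. It assumes a plain <code>Rose</code> type with standard lists standing in for <code>RoseTreeFix</code> and <code>ListFix</code>; <code>Rose</code>, <code>roseCata</code>, and <code>example</code> are hypothetical names used only for illustration.
</p>
<p>
<pre>import Data.Bifunctor (Bifunctor (..))

-- Assumption: a plain recursive rose tree, not the article's RoseTreeFix.
data Rose a b = Node a [Rose a b] | Leaf b
  deriving (Show, Eq)

-- Catamorphism with the same shape as roseTreeF.
roseCata :: (a -&gt; [c] -&gt; c) -&gt; (b -&gt; c) -&gt; Rose a b -&gt; c
roseCata fn fl = go
  where go (Node x ts) = fn x (map go ts)
        go (Leaf y) = fl y

-- The carrier type is itself a tree: each case rebuilds its constructor
-- around a mapped label, and the branches arrive already mapped.
instance Bifunctor Rose where
  bimap f s = roseCata (Node . f) (Leaf . s)

example :: Rose Int String
example = Node 42 [Leaf "foo", Node 1337 [Leaf "ploeh"]]</pre>
</p>
<p>
Here, <code>bimap show length example</code> produces <code>Node "42" [Leaf 3,Node "1337" [Leaf 5]]</code>. The shape of the tree is preserved; only the labels change.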
</p> <p> From that instance, the <code>Functor</code> instance trivially follows: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;(<span style="color:blue;">RoseTreeFix</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;=&nbsp;second</pre> </p> <p> You could probably also add <code>Applicative</code> and <code>Monad</code> instances, but I find those hard to grasp, so I'm going to skip them in favour of <code>Bifoldable</code>: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Bifoldable</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;bifoldMap&nbsp;f&nbsp;=&nbsp;roseTreeF&nbsp;(\x&nbsp;xs&nbsp;-&gt;&nbsp;f&nbsp;x&nbsp;&lt;&gt;&nbsp;fold&nbsp;xs)</pre> </p> <p> The <code>Bifoldable</code> instance enables you to trivially implement the <code>Foldable</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Foldable</span>&nbsp;(<span style="color:blue;">RoseTreeFix</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;foldMap&nbsp;=&nbsp;bifoldMap&nbsp;mempty</pre> </p> <p> You may find the presence of <code>mempty</code> puzzling, since <code>bifoldMap</code> takes two functions as arguments. Is <code>mempty</code> a function? </p> <p> Yes, <code>mempty</code> can be a function. Here, it is. There's a <code>Monoid</code> instance for any function <code>a -&gt; m</code>, where <code>m</code> is a <code>Monoid</code> instance, and <code>mempty</code> is the identity for that monoid. That's the instance in use here. 
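</p>
<p>
A small sketch may make that concrete. It uses only <code>base</code>, and a tuple as the <code>Bifoldable</code> instance, so that it doesn't depend on any of the code above; <code>ignored</code> and <code>onlySecond</code> are hypothetical names.
</p>
<p>
<pre>import Data.Bifoldable (bifoldMap)

-- mempty at a function type ignores its argument and returns the
-- mempty of the result type.
ignored :: String
ignored = (mempty :: Int -&gt; String) 42

-- Passing mempty as the first function discards the first component
-- of the tuple and keeps only the second.
onlySecond :: [Int]
onlySecond = bifoldMap mempty (\x -&gt; [x]) ("foo", 42)</pre>
</p>
<p>
<code>ignored</code> is the empty string, and <code>onlySecond</code> is <code>[42]</code>. The same thing happens in the <code>Foldable</code> instance: every internal node contributes <code>mempty</code>, so only the leaves take part in the fold.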
</p> <p> Just as <code>RoseTreeFix</code> is <code>Bifoldable</code>, it's also <code>Bitraversable</code>: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Bitraversable</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;bitraverse&nbsp;f&nbsp;s&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;roseTreeF&nbsp;(\x&nbsp;xs&nbsp;-&gt;&nbsp;roseNodeF&nbsp;&lt;$&gt;&nbsp;f&nbsp;x&nbsp;&lt;*&gt;&nbsp;sequenceA&nbsp;xs)&nbsp;(<span style="color:blue;">fmap</span>&nbsp;roseLeafF&nbsp;.&nbsp;s)</pre> </p> <p> You can comfortably implement the <code>Traversable</code> instance based on the <code>Bitraversable</code> instance: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Traversable</span>&nbsp;(<span style="color:blue;">RoseTreeFix</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;sequenceA&nbsp;=&nbsp;bisequenceA&nbsp;.&nbsp;first&nbsp;pure</pre> </p> <p> That rose trees are <code>Traversable</code> turns out to be useful, as a future article will show. </p> <h3 id="c02950d3b4954435b384b1f7520d24d4"> Relationships <a href="#c02950d3b4954435b384b1f7520d24d4" title="permalink">#</a> </h3> <p> As was the case for 'normal' trees, the catamorphism for rose trees is more powerful than the <em>fold</em>. There are operations that you can express with the <code>Foldable</code> instance, but other operations that you can't. Consider the tree shown in the diagram at the beginning of the article. This is also the tree that the above C# examples use. 
In Haskell, using <code>RoseTreeFix</code>, you can define that tree like this: </p> <p> <pre>exampleTree&nbsp;=
&nbsp;&nbsp;roseNodeF&nbsp;42&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;roseNodeF&nbsp;1337&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(roseLeafF&nbsp;<span style="color:#a31515;">&quot;foo&quot;</span>)&nbsp;$
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(roseLeafF&nbsp;<span style="color:#a31515;">&quot;bar&quot;</span>)&nbsp;nilF))&nbsp;$
&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;roseNodeF&nbsp;2112&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;roseNodeF&nbsp;90125&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(roseLeafF&nbsp;<span style="color:#a31515;">&quot;baz&quot;</span>)&nbsp;$
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(roseLeafF&nbsp;<span style="color:#a31515;">&quot;qux&quot;</span>)&nbsp;$
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(roseLeafF&nbsp;<span style="color:#a31515;">&quot;quux&quot;</span>)&nbsp;nilF))&nbsp;$
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(roseLeafF&nbsp;<span style="color:#a31515;">&quot;quuz&quot;</span>)&nbsp;nilF))&nbsp;$
&nbsp;&nbsp;&nbsp;&nbsp;consF&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;roseLeafF&nbsp;<span style="color:#a31515;">&quot;corge&quot;</span>)
&nbsp;&nbsp;&nbsp;&nbsp;nilF)</pre> </p> <p> You can trivially calculate the sum of string lengths of all leaves, using only the <code>Foldable</code> instance: </p> <p> <pre>Prelude RoseTree&gt; sum $ length &lt;$&gt; exampleTree
25</pre> </p> <p> You can also fairly easily calculate a sum of all nodes, using the length of the strings as in the above C# example, but that requires the <code>Bifoldable</code> instance: </p> <p> 
<pre>Prelude Data.Bifoldable Data.Semigroup RoseTree&gt; bifoldMap Sum (Sum . length) exampleTree
Sum {getSum = 93641}</pre> </p> <p> Fortunately, we get the same result as above. </p> <p> Counting leaves, or measuring the depth of a tree, on the other hand, is impossible with the <code>Foldable</code> instance, but interestingly, it turns out that counting leaves is possible with the <code>Bifoldable</code> instance: </p> <p> <pre><span style="color:#2b91af;">countLeaves</span>&nbsp;::&nbsp;(<span style="color:blue;">Bifoldable</span>&nbsp;p,&nbsp;<span style="color:blue;">Num</span>&nbsp;n)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;p&nbsp;a&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;n
countLeaves&nbsp;=&nbsp;getSum&nbsp;.&nbsp;bifoldMap&nbsp;(<span style="color:blue;">const</span>&nbsp;$&nbsp;Sum&nbsp;0)&nbsp;(<span style="color:blue;">const</span>&nbsp;$&nbsp;Sum&nbsp;1)</pre> </p> <p> This works well with the example tree: </p> <p> <pre>Prelude RoseTree&gt; countLeaves exampleTree
7</pre> </p> <p> Notice, however, that <code>countLeaves</code> works for any <code>Bifoldable</code> instance. Does that mean that you can 'count the leaves' of a tuple? Yes, it does: </p> <p> <pre>Prelude RoseTree&gt; countLeaves ("foo", "bar")
1
Prelude RoseTree&gt; countLeaves (1, 42)
1</pre> </p> <p> Or what about <code>EitherFix</code>: </p> <p> <pre>Prelude RoseTree Either&gt; countLeaves $ leftF "foo"
0
Prelude RoseTree Either&gt; countLeaves $ rightF "bar"
1</pre> </p> <p> Notice that 'counting the leaves' of tuples always returns <code>1</code>, while 'counting the leaves' of <code>Either</code> always returns <code>0</code> for <code>Left</code> values, and <code>1</code> for <code>Right</code> values. This is because <code>countLeaves</code> considers the left, or <em>first</em>, data type to represent internal nodes, and the right, or <em>second</em>, data type to indicate leaves. 
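</p>
<p>
To make that reading explicit, here's a sketch that pairs <code>countLeaves</code> with a hypothetical counterpart, <code>countNodes</code>, which counts only the <em>first</em> values. I've used the standard <code>Either</code> and tuple instances from <code>base</code>, rather than <code>EitherFix</code>, so that the example stands on its own.
</p>
<p>
<pre>import Data.Bifoldable (Bifoldable, bifoldMap)
import Data.Monoid (Sum (..))

-- As above: count only the 'second' values.
countLeaves :: (Bifoldable p, Num n) =&gt; p a b -&gt; n
countLeaves = getSum . bifoldMap (const $ Sum 0) (const $ Sum 1)

-- Hypothetical counterpart: count only the 'first' values.
countNodes :: (Bifoldable p, Num n) =&gt; p a b -&gt; n
countNodes = getSum . bifoldMap (const $ Sum 1) (const $ Sum 0)</pre>
</p>
<p>
A tuple always has one of each, so both functions return <code>1</code> for <code>("foo", "bar")</code>. For <code>Left "foo"</code>, <code>countNodes</code> returns <code>1</code> and <code>countLeaves</code> returns <code>0</code>; for <code>Right "bar"</code>, it's the other way around.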
</p> <p> You can further follow that train of thought to realise that you can convert both tuples and <code>EitherFix</code> values to small rose trees: </p> <p> <pre><span style="color:#2b91af;">fromTuple</span>&nbsp;::&nbsp;(a,&nbsp;b)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;a&nbsp;b
fromTuple&nbsp;(x,&nbsp;y)&nbsp;=&nbsp;roseNodeF&nbsp;x&nbsp;(consF&nbsp;(roseLeafF&nbsp;y)&nbsp;nilF)

<span style="color:#2b91af;">fromEitherFix</span>&nbsp;::&nbsp;<span style="color:blue;">EitherFix</span>&nbsp;a&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;a&nbsp;b
fromEitherFix&nbsp;=&nbsp;eitherF&nbsp;(<span style="color:blue;">flip</span>&nbsp;roseNodeF&nbsp;nilF)&nbsp;roseLeafF</pre> </p> <p> The <code>fromTuple</code> function creates a small rose tree with one internal node and one leaf. The label of the internal node is the first value of the tuple, and the label of the leaf is the second value. Here's an example: </p> <p> <pre>Prelude RoseTree&gt; fromTuple ("foo", 42)
RoseTreeFix {unRoseTreeFix = Fix (NodeF "foo" (ListFix (Fix (ConsF (Fix (LeafF 42)) (Fix NilF)))))}</pre> </p> <p> The <code>fromEitherFix</code> function turns a <em>left</em> value into an internal node with no leaves, and a <em>right</em> value into a leaf. Here are some examples: </p> <p> <pre>Prelude RoseTree Either&gt; fromEitherFix $ leftF "foo"
RoseTreeFix {unRoseTreeFix = Fix (NodeF "foo" (ListFix (Fix NilF)))}
Prelude RoseTree Either&gt; fromEitherFix $ rightF 42
RoseTreeFix {unRoseTreeFix = Fix (LeafF 42)}</pre> </p> <p> While counting leaves can be implemented using <code>Bifoldable</code>, that's not the case for measuring the depths of trees (I think; leave a comment if you know of a way to do this with one of the instances shown here). 
You can, however, measure tree depth with the catamorphism: </p> <p> <pre><span style="color:#2b91af;">treeDepth</span>&nbsp;::&nbsp;<span style="color:blue;">RoseTreeFix</span>&nbsp;a&nbsp;b&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Integer</span> treeDepth&nbsp;=&nbsp;roseTreeF&nbsp;(\_&nbsp;xs&nbsp;-&gt;&nbsp;1&nbsp;+&nbsp;<span style="color:blue;">maximum</span>&nbsp;xs)&nbsp;(<span style="color:blue;">const</span>&nbsp;0)</pre> </p> <p> The implementation is similar to the implementation for 'normal' trees. I've arbitrarily decided that leaves have a depth of zero, so the function that handles leaves always returns <code>0</code>. The function that handles internal nodes receives <code>xs</code> as a partially reduced list of depths below the node in question. Take the maximum of those and add <code>1</code>, since each internal node has a depth of one. </p> <p> <pre>Prelude RoseTree&gt; treeDepth exampleTree 3</pre> </p> <p> This, hopefully, illustrates that the catamorphism is more capable, and that the fold is just a (list-biased) specialisation. </p> <h3 id="4276c6f8fab248c0acc52a7f14462e41"> Summary <a href="#4276c6f8fab248c0acc52a7f14462e41" title="permalink">#</a> </h3> <p> The catamorphism for rose trees is a pair of functions. One function transforms internal nodes with their partially reduced branches, while the other function transforms leaves. </p> <p> For a realistic example of using a rose tree in a real program, see <a href="/2019/09/09/picture-archivist-in-haskell">Picture archivist in Haskell</a>. </p> <p> This article series has so far covered progressively more complex data structures. The first examples (Boolean catamorphism and Peano catamorphism) were neither functors, applicatives, nor monads. All subsequent examples, on the other hand, are all of these, and more. The next example presents a functor that's neither applicative nor monad, yet still foldable. 
Obviously, what functionality it offers is still based on a catamorphism.

<strong>Next:</strong> Full binary tree catamorphism. Church-encoded rose tree https://blog.ploeh.dk/2019/07/29/church-encoded-rose-tree 2019-07-29T13:14:00+00:00 Mark Seemann <div id="post"> <p> <em>A rose tree is a tree with leaf nodes of one type, and internal nodes of another.</em> </p> <p> This article is part of <a href="/2018/05/22/church-encoding">a series of articles about Church encoding</a>. In the previous articles, you've seen <a href="/2018/06/04/church-encoded-maybe">how to implement a Maybe container</a>, and <a href="/2018/06/11/church-encoded-either">how to implement an Either container</a>. Through these examples, you've learned how to model <a href="https://en.wikipedia.org/wiki/Tagged_union">sum types</a> without explicit language support. In this article, you'll see how to model a <a href="https://en.wikipedia.org/wiki/Rose_tree">rose tree</a>. </p> <p> A rose tree is a general-purpose data structure where each node in a tree has an associated value. Each node can have an arbitrary number of branches, including none. What distinguishes a rose tree from just any <a href="https://en.wikipedia.org/wiki/Tree_(data_structure)">tree</a> is that internal nodes can hold values of a different type than leaf values. </p> <p> <img src="/content/binary/rose-tree-example.png" alt="A rose tree example diagram, with internal nodes containing integers, and leaves containing strings."> </p> <p> The diagram shows an example of a tree of internal integers and leaf strings. All internal nodes contain integer values, and all leaves contain strings. Each node can have an arbitrary number of branches. 
</p> <h3 id="5255946728c14810a5aaef3c1022d126"> Contract <a href="#5255946728c14810a5aaef3c1022d126" title="permalink">#</a> </h3> <p> In C#, you can represent the fundamental structure of a rose tree with a Church encoding, starting with an interface: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">TResult</span>&nbsp;Match&lt;<span style="color:#2b91af;">TResult</span>&gt;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&gt;,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;node,
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;leaf);
}</pre> </p> <p> The structure of a rose tree includes two mutually exclusive cases: internal nodes and leaf nodes. Since there are two cases, the <code>Match</code> method takes two arguments, one for each case. </p> <p> The interface is generic, with two type arguments: <code>N</code> (for <em>node</em>) and <code>L</code> (for <em>leaf</em>). Any consumer of an <code>IRoseTree&lt;N, L&gt;</code> object must supply two functions when calling the <code>Match</code> method: a function that turns a node into a <code>TResult</code> value, and a function that turns a leaf into a <code>TResult</code> value. </p> <p> Both cases must have a corresponding implementation. 
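</p>
<p>
If you find it easier to read such a contract in Haskell, here's a rough sketch of the same Church encoding. It's my own translation, not code from the accompanying repository: it assumes the <code>RankNTypes</code> GHC extension, and the names <code>RoseTree</code>, <code>match</code>, <code>roseLeaf</code>, <code>roseNode</code>, and <code>describe</code> are hypothetical.
</p>
<p>
<pre>{-# LANGUAGE RankNTypes #-}

-- A tree is whatever can answer a 'match' with two handlers,
-- one for each case, for any result type r.
newtype RoseTree n l = RoseTree
  { match :: forall r. (n -&gt; [RoseTree n l] -&gt; r) -&gt; (l -&gt; r) -&gt; r }

-- A leaf unconditionally calls the leaf handler.
roseLeaf :: l -&gt; RoseTree n l
roseLeaf x = RoseTree (\_ leaf -&gt; leaf x)

-- An internal node unconditionally calls the node handler.
roseNode :: n -&gt; [RoseTree n l] -&gt; RoseTree n l
roseNode x bs = RoseTree (\node _ -&gt; node x bs)

-- Inspect only the top-level case; 'match' doesn't recurse.
describe :: RoseTree Int String -&gt; String
describe t =
  match t
    (\n bs -&gt; "node " ++ show n ++ " with " ++ show (length bs) ++ " branches")
    (\s -&gt; "leaf " ++ s)</pre>
</p>
<p>
With this sketch, <code>describe (roseNode 42 [roseLeaf "foo", roseLeaf "bar"])</code> returns <code>"node 42 with 2 branches"</code>, while <code>describe (roseLeaf "foo")</code> returns <code>"leaf foo"</code>. As in the C# interface, a consumer has to supply a function for each case.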
</p> <h3 id="89c4833c4e4d46cc8eef2d5eb546f61d"> Leaves <a href="#89c4833c4e4d46cc8eef2d5eb546f61d" title="permalink">#</a> </h3> <p> The <em>leaf</em> implementation is the simplest: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;:&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">L</span>&nbsp;value; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;RoseLeaf(<span style="color:#2b91af;">L</span>&nbsp;value) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.value&nbsp;=&nbsp;value; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">TResult</span>&nbsp;Match&lt;<span style="color:#2b91af;">TResult</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&gt;,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;node, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;leaf) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;leaf(value); &nbsp;&nbsp;&nbsp;&nbsp;} 
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;Equals(<span style="color:blue;">object</span>&nbsp;obj) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(!(obj&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;other)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Equals(value,&nbsp;other.value); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">int</span>&nbsp;GetHashCode() &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;value.GetHashCode(); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> The <code>RoseLeaf</code> class is an <a href="https://en.wikipedia.org/wiki/Adapter_pattern">Adapter</a> over a value of the generic type <code>L</code>. As is always the case with Church encoding, it implements the <code>Match</code> method by unconditionally calling one of the arguments, in this case the <code>leaf</code> function, with its adapted <code>value</code>. </p> <p> While it doesn't have to do this, it also overrides <code>Equals</code> and <code>GetHashCode</code>. This is an immutable class, so it's a great candidate to be a <a href="https://martinfowler.com/bliki/ValueObject.html">Value Object</a>. Making it a Value Object makes it easier to compare expected and actual values in unit tests, among other benefits. 
</p> <h3 id="f211476563fe40379eac66ee887ed75b"> Nodes <a href="#f211476563fe40379eac66ee887ed75b" title="permalink">#</a> </h3> <p> The <em>node</em> implementation is slightly more complex: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">RoseNode</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;:&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">N</span>&nbsp;value; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&gt;&nbsp;branches; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;RoseNode(<span style="color:#2b91af;">N</span>&nbsp;value,&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&gt;&nbsp;branches) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.value&nbsp;=&nbsp;value; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.branches&nbsp;=&nbsp;branches; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">TResult</span>&nbsp;Match&lt;<span style="color:#2b91af;">TResult</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&gt;,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;node, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">L</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;leaf) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;node(value,&nbsp;branches); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;Equals(<span style="color:blue;">object</span>&nbsp;obj) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(!(obj&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:#2b91af;">RoseNode</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;other)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Equals(value,&nbsp;other.value) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&amp;&amp;&nbsp;<span style="color:#2b91af;">Enumerable</span>.SequenceEqual(branches,&nbsp;other.branches); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">int</span>&nbsp;GetHashCode() &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">return</span>&nbsp;value.GetHashCode()&nbsp;^&nbsp;branches.GetHashCode(); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> A node contains both a value (of the type <code>N</code>) and a collection of sub-trees, or <code>branches</code>. The class implements the <code>Match</code> method by unconditionally calling the <code>node</code> function argument with its constituent values. </p> <p> Again, it overrides <code>Equals</code> and <code>GetHashCode</code> for the same reasons as <code>RoseLeaf</code>. This isn't required to implement Church encoding, but makes comparison and unit testing easier. </p> <h3 id="a5c04c7e127349ed9b759e6361af5ab3"> Usage <a href="#a5c04c7e127349ed9b759e6361af5ab3" title="permalink">#</a> </h3> <p> You can use the <code>RoseLeaf</code> and <code>RoseNode</code> constructors to create new trees, but it sometimes helps to have a static helper method to create values. It turns out that there's little value in a helper method for leaves, but for nodes, it's marginally useful: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;Node&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;(<span style="color:#2b91af;">N</span>&nbsp;value,&nbsp;<span style="color:blue;">params</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;[]&nbsp;branches) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseNode</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;(value,&nbsp;branches); }</pre> </p> <p> This enables you to create tree objects, like this: </p> <p> <pre><span 
style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;&nbsp;tree&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#a31515;">&quot;foo&quot;</span>,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(42),&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">int</span>&gt;(1337));</pre> </p> <p> That's a single node with the label <code>"foo"</code> and two leaves with the values <code>42</code> and <code>1337</code>, respectively. You can create the tree shown in the above diagram like this: </p> <p> <pre><span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;exampleTree&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(42, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(1337, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;foo&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;bar&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(2112, 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(90125, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;baz&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;qux&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;quux&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;quuz&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">string</span>&gt;(<span style="color:#a31515;">&quot;corge&quot;</span>));</pre> </p> <p> You can add various extension methods to implement useful functionality. In later articles, you'll see some more compelling examples, so here, I'm only going to show a few basic examples. 
One of the simplest features you can add is a method that will tell you if an <code>IRoseTree&lt;N, L&gt;</code> object is a node or a leaf: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IChurchBoolean</span>&nbsp;IsLeaf&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;source) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.Match&lt;<span style="color:#2b91af;">IChurchBoolean</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;node:&nbsp;(_,&nbsp;__)&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ChurchFalse</span>(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;leaf:&nbsp;_&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ChurchTrue</span>()); } <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IChurchBoolean</span>&nbsp;IsNode&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">N</span>,&nbsp;<span style="color:#2b91af;">L</span>&gt;&nbsp;source) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ChurchNot</span>(source.IsLeaf()); }</pre> </p> <p> Since this article is part of the overall article series on Church encoding, and the purpose of that article series is also to show how basic language features can be created from Church encodings, these two methods return <a 
href="/2018/05/24/church-encoded-boolean-values">Church-encoded Boolean values</a> instead of the built-in <code>bool</code> type. I'm sure you can imagine how you could change the type to <code>bool</code> if you'd like. </p> <p> You can use these methods like this: </p> <p> <pre>&gt; <span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">Guid</span>,&nbsp;<span style="color:blue;">double</span>&gt;&nbsp;tree&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">Guid</span>,&nbsp;<span style="color:blue;">double</span>&gt;(-3.2); &gt; tree.IsLeaf() ChurchTrue { } &gt; tree.IsNode() ChurchNot(ChurchTrue) &gt; <span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:blue;">long</span>,&nbsp;<span style="color:blue;">string</span>&gt;&nbsp;tree&nbsp;=&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node&lt;<span style="color:blue;">long</span>,&nbsp;<span style="color:blue;">string</span>&gt;(42); &gt; tree.IsLeaf() ChurchFalse { } &gt; tree.IsNode() ChurchNot(ChurchFalse)</pre> </p> <p> In a <a href="/2019/09/16/picture-archivist-in-f">future article, you'll see some more compelling examples</a>. </p> <h3 id="3be01779f059443799df57342e2510cb"> Terminology <a href="#3be01779f059443799df57342e2510cb" title="permalink">#</a> </h3> <p> It's not entirely clear what to call a tree like the one shown here. <a href="https://en.wikipedia.org/wiki/Rose_tree">The Wikipedia entry</a> doesn't state one way or the other whether internal node types ought to be distinguishable from leaf node types, but there are <a href="https://twitter.com/kbattocchi/status/1072538730911752192">indications that this could be the case</a>. At least, it seems that the <a href="https://mail.haskell.org/pipermail/haskell-cafe/2015-May/119633.html">term isn't well-defined</a>, so I took the liberty to retcon the name <em>rose tree</em> to the data structure shown here. 
</p> <p> In the paper that introduces the <em>rose tree</em> term, Meertens writes: <blockquote> <p> "We consider trees whose internal nodes may fork into an arbitrary (natural) number of sub-trees. (If such a node has zero descendants, we still consider it internal.) Each external node carries a data item. No further information is stored in the tree; in particular, internal nodes are unlabelled." </p> <footer><cite><em>First Steps towards the Theory of Rose Trees</em>, Lambert Meertens, 1988</cite></footer> </blockquote> While the concept is foreign in C#, you can trivially introduce a <a href="/2018/01/15/unit-isomorphisms">unit</a> data type: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">Unit</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Unit</span>&nbsp;Instance&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Unit</span>(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;Unit()&nbsp;{&nbsp;} }</pre> </p> <p> This enables you to create a rose tree according to Meertens' definition: </p> <p> <pre><span style="color:#2b91af;">IRoseTree</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;&nbsp;meertensTree&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#2b91af;">Unit</span>.Instance, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#2b91af;">Unit</span>.Instance, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#2b91af;">Unit</span>.Instance, 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;(2112)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;(42), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;(1337), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;(90125)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">RoseTree</span>.Node(<span style="color:#2b91af;">Unit</span>.Instance, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;(1984)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">RoseLeaf</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;(666));</pre> </p> <p> Visually, you could draw it like this: </p> <p> <img src="/content/binary/meertens-tree-example.png" alt="A Meertens rose tree example diagram, with leaves containing integers."> </p> <p> Thus, the tree structure shown here seems to be a generalisation of Meertens' original definition. 
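</p> <p> As a rough sketch of how such a tree can be consumed, the following self-contained program adds up the leaves of the <code>meertensTree</code> value above. The type definitions are restated (without the <code>Equals</code> and <code>GetHashCode</code> overrides) so that the example compiles by itself; the <code>SumLeaves</code> function and the <code>Program</code> class are assumptions made only for illustration: </p> <p> <pre>using System;
using System.Collections.Generic;
using System.Linq;

public interface IRoseTree&lt;N, L&gt;
{
    TResult Match&lt;TResult&gt;(
        Func&lt;N, IEnumerable&lt;IRoseTree&lt;N, L&gt;&gt;, TResult&gt; node,
        Func&lt;L, TResult&gt; leaf);
}

public sealed class RoseLeaf&lt;N, L&gt; : IRoseTree&lt;N, L&gt;
{
    private readonly L value;

    public RoseLeaf(L value) { this.value = value; }

    public TResult Match&lt;TResult&gt;(
        Func&lt;N, IEnumerable&lt;IRoseTree&lt;N, L&gt;&gt;, TResult&gt; node,
        Func&lt;L, TResult&gt; leaf)
    {
        return leaf(value);
    }
}

public sealed class RoseNode&lt;N, L&gt; : IRoseTree&lt;N, L&gt;
{
    private readonly N value;
    private readonly IEnumerable&lt;IRoseTree&lt;N, L&gt;&gt; branches;

    public RoseNode(N value, IEnumerable&lt;IRoseTree&lt;N, L&gt;&gt; branches)
    {
        this.value = value;
        this.branches = branches;
    }

    public TResult Match&lt;TResult&gt;(
        Func&lt;N, IEnumerable&lt;IRoseTree&lt;N, L&gt;&gt;, TResult&gt; node,
        Func&lt;L, TResult&gt; leaf)
    {
        return node(value, branches);
    }
}

public static class RoseTree
{
    public static IRoseTree&lt;N, L&gt; Node&lt;N, L&gt;(N value, params IRoseTree&lt;N, L&gt;[] branches)
    {
        return new RoseNode&lt;N, L&gt;(value, branches);
    }
}

public class Unit
{
    public readonly static Unit Instance = new Unit();

    private Unit() { }
}

public static class Program
{
    // Hypothetical helper: adds up all leaf values, ignoring the Unit labels.
    public static int SumLeaves(IRoseTree&lt;Unit, int&gt; tree)
    {
        return tree.Match&lt;int&gt;(
            node: (_, branches) =&gt; branches.Sum(b =&gt; SumLeaves(b)),
            leaf: i =&gt; i);
    }

    public static void Main()
    {
        IRoseTree&lt;Unit, int&gt; meertensTree =
            RoseTree.Node(Unit.Instance,
                RoseTree.Node(Unit.Instance,
                    RoseTree.Node(Unit.Instance,
                        new RoseLeaf&lt;Unit, int&gt;(2112)),
                    new RoseLeaf&lt;Unit, int&gt;(42),
                    new RoseLeaf&lt;Unit, int&gt;(1337),
                    new RoseLeaf&lt;Unit, int&gt;(90125)),
                RoseTree.Node(Unit.Instance,
                    new RoseLeaf&lt;Unit, int&gt;(1984)),
                new RoseLeaf&lt;Unit, int&gt;(666));

        Console.WriteLine(SumLeaves(meertensTree)); // 96266
    }
}</pre> </p> <p> Since the internal nodes carry only <code>Unit</code>, the <code>node</code> case ignores its first argument and simply recurses into the branches.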
</p> <p> I'm not a mathematician, so I may have misunderstood some things. If you have a better name than <em>rose tree</em> for the data structure shown here, please leave a comment. </p> <h3 id="331fa8452cdd435c86ce87b5d39d51c5"> Yeats <a href="#331fa8452cdd435c86ce87b5d39d51c5" title="permalink">#</a> </h3> <p> Now that we're on the topic of <em>rose tree</em> as a term, you may, as a bonus, enjoy a similarly-titled poem: <blockquote> <h4>THE ROSE TREE</h4> <p> "O words are lightly spoken"<br> Said Pearse to Connolly,<br> "Maybe a breath of politic words<br> Has withered our Rose Tree;<br> Or maybe but a wind that blows<br> Across the bitter sea." </p> <p> "It needs to be but watered,"<br> James Connolly replied,<br> "To make the green come out again<br> And spread on every side,<br> And shake the blossom from the bud<br> To be the garden's pride."<br> </p> <p> "But where can we draw water"<br> Said Pearse to Connolly,<br> "When all the wells are parched away?<br> O plain as plain can be<br> There's nothing but our own red blood<br> Can make a right Rose Tree." </p> <footer><cite><a href="https://en.wikipedia.org/wiki/W._B._Yeats">W. B. Yeats</a></cite></footer> </blockquote> As far as I can tell, though, Yeats' metaphor is dissimilar to Meertens'. </p> <h3 id="9906b9a8856248f38b4f03e40252b761"> Summary <a href="#9906b9a8856248f38b4f03e40252b761" title="permalink">#</a> </h3> <p> You may occasionally find use for a tree that distinguishes between internal and leaf nodes. You can model such a tree with a Church encoding, as shown in this article. </p> <p> <strong>Next: </strong> <a href="/2019/04/29/catamorphisms">Catamorphisms</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. The catamorphism for rose trees is a pair of functions. One function transforms internal nodes with their partially reduced branches, while the other function transforms leaves.
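<p> A minimal sketch of what such a pair of functions could look like in C#, assuming the Church-encoded <code>IRoseTree</code> types restated below (without their <code>Equals</code> and <code>GetHashCode</code> overrides). The <code>Cata</code> name and signature, and the <code>Program</code> class, are assumptions made for illustration, not necessarily the definitive implementation: </p> <p> <pre>using System;
using System.Collections.Generic;
using System.Linq;

public interface IRoseTree&lt;N, L&gt;
{
    TResult Match&lt;TResult&gt;(
        Func&lt;N, IEnumerable&lt;IRoseTree&lt;N, L&gt;&gt;, TResult&gt; node,
        Func&lt;L, TResult&gt; leaf);
}

public sealed class RoseLeaf&lt;N, L&gt; : IRoseTree&lt;N, L&gt;
{
    private readonly L value;

    public RoseLeaf(L value) { this.value = value; }

    public TResult Match&lt;TResult&gt;(
        Func&lt;N, IEnumerable&lt;IRoseTree&lt;N, L&gt;&gt;, TResult&gt; node,
        Func&lt;L, TResult&gt; leaf)
    {
        return leaf(value);
    }
}

public sealed class RoseNode&lt;N, L&gt; : IRoseTree&lt;N, L&gt;
{
    private readonly N value;
    private readonly IEnumerable&lt;IRoseTree&lt;N, L&gt;&gt; branches;

    public RoseNode(N value, IEnumerable&lt;IRoseTree&lt;N, L&gt;&gt; branches)
    {
        this.value = value;
        this.branches = branches;
    }

    public TResult Match&lt;TResult&gt;(
        Func&lt;N, IEnumerable&lt;IRoseTree&lt;N, L&gt;&gt;, TResult&gt; node,
        Func&lt;L, TResult&gt; leaf)
    {
        return node(value, branches);
    }
}

public static class RoseTree
{
    public static IRoseTree&lt;N, L&gt; Node&lt;N, L&gt;(N value, params IRoseTree&lt;N, L&gt;[] branches)
    {
        return new RoseNode&lt;N, L&gt;(value, branches);
    }

    // Assumed signature: 'node' receives the node value together with the
    // already reduced branches, while 'leaf' transforms a leaf value.
    public static TResult Cata&lt;N, L, TResult&gt;(
        this IRoseTree&lt;N, L&gt; tree,
        Func&lt;N, IEnumerable&lt;TResult&gt;, TResult&gt; node,
        Func&lt;L, TResult&gt; leaf)
    {
        return tree.Match&lt;TResult&gt;(
            node: (n, branches) =&gt; node(n, branches.Select(b =&gt; b.Cata(node, leaf))),
            leaf: leaf);
    }
}

public static class Program
{
    public static void Main()
    {
        IRoseTree&lt;string, int&gt; tree =
            RoseTree.Node("foo",
                new RoseLeaf&lt;string, int&gt;(42),
                new RoseLeaf&lt;string, int&gt;(1337));

        // Each leaf counts as one; a node adds up the counts of its branches.
        int leafCount = tree.Cata&lt;string, int, int&gt;(
            node: (_, counts) =&gt; counts.Sum(),
            leaf: _ =&gt; 1);

        // A leaf has depth zero; a node is one deeper than its deepest branch.
        int depth = tree.Cata&lt;string, int, int&gt;(
            node: (_, depths) =&gt; 1 + depths.DefaultIfEmpty(0).Max(),
            leaf: _ =&gt; 0);

        Console.WriteLine(leafCount); // 2
        Console.WriteLine(depth);     // 1
    }
}</pre> </p> <p> In this sketch, the recursion lives entirely in <code>Cata</code>; the two functions passed by the caller only describe what to do with a leaf and with a node whose branches have already been reduced. </p>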

For a realistic example of using a rose tree in a real program, see Picture archivist in Haskell.

This article series has so far covered progressively more complex data structures. The first examples (Boolean catamorphism and Peano catamorphism) were neither functors, applicatives, nor monads. All subsequent examples, on the other hand, are all of these, and more. The next example presents a functor that's neither applicative nor monad, yet still foldable. Obviously, what functionality it offers is still based on a catamorphism.

<strong>Next:</strong> Full binary tree catamorphism. First, you apply the <a href="/2018/04/03/maybe-monoids">First Maybe monoid</a> over the <a href="/2019/05/27/list-catamorphism">list catamorphism</a>, and then you conclude the reduction with the <a href="/2019/05/20/maybe-catamorphism">Maybe catamorphism</a>. </p> <h3 id="46a6c41949db446d9387c8befbf3fdb1"> Pattern <a href="#46a6c41949db446d9387c8befbf3fdb1" title="permalink">#</a> </h3> <p> The Chain of Responsibility design pattern gives you a way to model cascading conditionals with an object structure. It's a chain (or linked list) of objects that all implement the same interface (or base class). Each object (apart from the last) has a reference to the next object in the list. </p> <p> <img src="/content/binary/chain-of-responsibility-diagram.png" alt="General diagram of the Chain of Responsibility design pattern."> </p> <p> A client (some other code) calls a method on the first object in the list. If that object can handle the request, it does so, and the interaction ends there. If the method returns a value, the object returns the value. </p> <p> If the first object determines that it can't handle the method call, it calls the next object in the chain. It only knows the next object as the interface, so the only way it can delegate the call is by calling the same method as the first one. In the above diagram, <em>Imp1</em> can't handle the method call, so it calls the same method on <em>Imp2</em>, which also can't handle the request and delegates responsibility to <em>Imp3</em>. In the diagram, <em>Imp3</em> can handle the method call, so it does so and returns a result that propagates back up the chain. In that particular example, <em>Imp4</em> never gets involved. </p> <p> You'll see an example below. </p> <p> One of the advantages of the pattern is that you can rearrange the chain to change its behaviour. 
You can even do this at run time, if you'd like, since all objects implement the same interface. </p> <h3 id="08a67dafd71f4bdd9a2e2577b0e43f9a"> User icon example <a href="#08a67dafd71f4bdd9a2e2577b0e43f9a" title="permalink">#</a> </h3> <p> Consider an online system that maintains user profiles. A user is modelled with the <code>User</code> class: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;User(<span style="color:blue;">int</span>&nbsp;id,&nbsp;<span style="color:blue;">string</span>&nbsp;name,&nbsp;<span style="color:blue;">string</span>&nbsp;email,&nbsp;<span style="color:blue;">bool</span>&nbsp;useGravatar,&nbsp;<span style="color:blue;">bool</span>&nbsp;useIdenticon)</pre> </p> <p> While I only show the signature of the class' constructor, it should be enough to give you an idea. If you need more details, the entire example code base is <a href="https://github.com/ploeh/UserProfile">available on GitHub</a>. </p> <p> Apart from an <code>id</code>, a <code>name</code> and <code>email</code> address, a user also has two flags. One flag tracks whether the user wishes to use his or her <a href="http://www.gravatar.com">Gravatar</a>, while another flag tracks if the user would like to use an <a href="https://en.wikipedia.org/wiki/Identicon">Identicon</a>. Obviously, both flags could be <code>true</code>, in which case the current business rule states that the Gravatar should take precedence. </p> <p> If neither flag is set, users might still have a picture associated with their profile. This could be a picture that they've uploaded to the system, tracked by a database. 
</p> <p> If no user icon can be found or generated, ultimately the system should use a fallback, default icon: </p> <p> <img src="/content/binary/default-user-icon.png" alt="Default user icon."> </p> <p> To summarise, the current rules are: <ol> <li>Use Gravatar if flag is set.</li> <li>Use Identicon if flag is set.</li> <li>Use uploaded picture if available.</li> <li>Use default icon.</li> </ol> The order of precedence could change in the future, new image sources could be added, or some of the present sources could be removed. Modelling this set of rules as a Chain of Responsibility makes it easy for you to reorder the rules, should you need to. </p> <p> To request an icon, a client can use the <code>IIconReader</code> interface: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IIconReader</span>
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user);
}</pre> </p> <p> The <code>Icon</code> class is just a <a href="https://martinfowler.com/bliki/ValueObject.html">Value Object</a> wrapper around a URL. The idea is that such a URL can be used in an <code>img</code> tag to show the icon. Again, the full source code is available on GitHub if you'd like to investigate the details. </p> <p> The various rules for icon retrieval can be implemented using this interface. </p> <h3 id="b2a4cbfb576949c392ea0e0b3d440175"> Gravatar reader <a href="#b2a4cbfb576949c392ea0e0b3d440175" title="permalink">#</a> </h3> <p> Although you don't have to implement the classes in the order in which you are going to compose them, it seems natural to do so, starting with the Gravatar implementation. 
</p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">GravatarReader</span>&nbsp;:&nbsp;<span style="color:#2b91af;">IIconReader</span>
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IIconReader</span>&nbsp;next;

&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;GravatarReader(<span style="color:#2b91af;">IIconReader</span>&nbsp;next)
&nbsp;&nbsp;&nbsp;&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.next&nbsp;=&nbsp;next;
&nbsp;&nbsp;&nbsp;&nbsp;}

&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user)
&nbsp;&nbsp;&nbsp;&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(user.UseGravatar)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Icon</span>(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Gravatar</span>(user.Email).Url);

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;next.ReadIcon(user);
&nbsp;&nbsp;&nbsp;&nbsp;}
}</pre> </p> <p> The <code>GravatarReader</code> class not only implements the <code>IIconReader</code> interface, but also decorates another object of the same polymorphic type. If <code>user.UseGravatar</code> is <code>true</code>, it generates the appropriate Gravatar URL based on the user's <code>Email</code> address; otherwise, it delegates the work to the <code>next</code> object in the Chain of Responsibility. </p> <p> The <code>Gravatar</code> class contains the implementation details to generate the Gravatar <code>Url</code>. 
Again, please refer to the GitHub repository if you're interested in the details. </p> <h3 id="222ae025b264455695f1dbbd74cad17b"> Identicon reader <a href="#222ae025b264455695f1dbbd74cad17b" title="permalink">#</a> </h3> <p> When you compose the chain, according to the above business logic, the next type of icon you should attempt to generate is an Identicon. It's natural to implement the Identicon reader next, then: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">IdenticonReader</span>&nbsp;:&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IIconReader</span>&nbsp;next; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;IdenticonReader(<span style="color:#2b91af;">IIconReader</span>&nbsp;next) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.next&nbsp;=&nbsp;next; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(user.UseIdenticon) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Icon</span>(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Uri</span>(baseUrl,&nbsp;HashUser(user))); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;next.ReadIcon(user); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;Implementation&nbsp;details&nbsp;go&nbsp;here...</span> 
}</pre> </p> <p> Again, I'm omitting implementation details in order to focus on the Chain of Responsibility design pattern. If <code>user.UseIdenticon</code> is <code>true</code>, the <code>IdenticonReader</code> generates the appropriate Identicon and returns the URL for it; otherwise, it delegates the work to the <code>next</code> object in the chain. </p> <h3 id="e9f2904333b940c1a9a90522d19a41f3"> Database icon reader <a href="#e9f2904333b940c1a9a90522d19a41f3" title="permalink">#</a> </h3> <p> The <code>DBIconReader</code> class attempts to find an icon ID in a database. If it succeeds, it creates a URL corresponding to that ID. The assumption is that that resource exists; either it's a file on disk, or it's an image resource generated on the spot based on binary data stored in the database. </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">DBIconReader</span>&nbsp;:&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IUserRepository</span>&nbsp;repository; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IIconReader</span>&nbsp;next; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;DBIconReader(<span style="color:#2b91af;">IUserRepository</span>&nbsp;repository,&nbsp;<span style="color:#2b91af;">IIconReader</span>&nbsp;next) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.repository&nbsp;=&nbsp;repository; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.next&nbsp;=&nbsp;next; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span 
style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(!repository.TryReadIconId(user.Id,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">string</span>&nbsp;iconId)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;next.ReadIcon(user); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;parameters&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Dictionary</span>&lt;<span style="color:blue;">string</span>,&nbsp;<span style="color:blue;">string</span>&gt; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;<span style="color:#a31515;">&quot;iconId&quot;</span>,&nbsp;iconId&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Icon</span>(urlTemplate.BindByName(baseUrl,&nbsp;parameters)); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">Uri</span>&nbsp;baseUrl&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Uri</span>(<span style="color:#a31515;">&quot;https://example.com&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">UriTemplate</span>&nbsp;urlTemplate&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">UriTemplate</span>(<span style="color:#a31515;">&quot;users/{iconId}/icon&quot;</span>); }</pre> </p> <p> 
This class demonstrates some variations in the way you can implement the Chain of Responsibility design pattern. The above <code>GravatarReader</code> and <code>IdenticonReader</code> classes both follow the same implementation pattern of checking a condition, and then performing work if the condition is <code>true</code>. The delegation to the next object in the chain happens, in those two classes, outside of the <code>if</code> statement. </p> <p> The <code>DBIconReader</code> class, on the other hand, reverses the structure of the code. It uses a <a href="https://refactoring.com/catalog/replaceNestedConditionalWithGuardClauses.html">Guard Clause</a> to detect whether to exit early, which is done by delegating work to the <code>next</code> object in the chain. </p> <p> If <code>TryReadIconId</code> returns <code>true</code>, however, the <code>ReadIcon</code> method proceeds to create the appropriate icon URL. </p> <p> Another variation on the Chain of Responsibility design pattern demonstrated by the <code>DBIconReader</code> class is that it takes a second dependency, apart from <code>next</code>. The <code>repository</code> is the usual misapplication of the Repository design pattern that everyone thinks they use correctly. Here, it's used in the common sense to provide access to a database. The main point, though, is that you can add as many other dependencies to a link in the chain as you'd like. All links, apart from the last, however, must have a reference to the <code>next</code> link in the chain. </p> <h3 id="cee40120578b4732892e6fd72329d5de"> Default icon reader <a href="#cee40120578b4732892e6fd72329d5de" title="permalink">#</a> </h3> <p> Like linked lists, a Chain of Responsibility has to ultimately terminate. You can use the following <code>DefaultIconReader</code> for that. 
</p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">DefaultIconReader</span>&nbsp;:&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:#2b91af;">Icon</span>.Default; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> This class unconditionally returns the <code>Default</code> icon. Notice that it doesn't have any <code>next</code> object it delegates to. This terminates the chain. If no previous implementation of the <code>IIconReader</code> has returned an <code>Icon</code> for the <code>user</code>, this one does. </p> <h3 id="8eb05bed2d98488a91c09bab52b00a53"> Chain composition <a href="#8eb05bed2d98488a91c09bab52b00a53" title="permalink">#</a> </h3> <p> With four implementations of <code>IIconReader</code>, you can now compose the Chain of Responsibility: </p> <p> <pre><span style="color:#2b91af;">IIconReader</span>&nbsp;reader&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">GravatarReader</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">IdenticonReader</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DBIconReader</span>(repo, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DefaultIconReader</span>())));</pre> </p> <p> The first link in the chain is a <code>GravatarReader</code> object that contains an <code>IdenticonReader</code> 
object as its <code>next</code> link, and so on. Referring back to the source code of <code>GravatarReader</code>, notice that its <code>next</code> dependency is declared as an <code>IIconReader</code>. Since the <code>IdenticonReader</code> class implements that interface, you can compose the chain like this, but if you later decide to change the order of the objects, you can do so simply by changing the composition. You could remove objects altogether, or add new classes, and you could even do this at run time, if required. </p> <p> The <code>DBIconReader</code> class requires an extra <code>IUserRepository</code> dependency, here simply an existing object called <code>repo</code>. </p> <p> The <code>DefaultIconReader</code> takes no other dependencies, so this effectively terminates the chain. If you try to pass another <code>IIconReader</code> to its constructor, the code doesn't compile. </p> <h3 id="fc1551665bb940b8ba5e75be81c0629a"> Haskell proof of concept <a href="#fc1551665bb940b8ba5e75be81c0629a" title="permalink">#</a> </h3> <p> When evaluating whether a design is <a href="/2018/11/19/functional-architecture-a-definition">a functional architecture</a>, I often port the relevant parts to <a href="https://www.haskell.org">Haskell</a>. You can do the same with the above example, and put it in a form where it's clearer that the Chain of Responsibility pattern is equivalent to two well-known catamorphisms. </p> <p> Readers not comfortable with Haskell can skip the next few sections. The object-oriented example continues below. </p> <p> <code>User</code> and <code>Icon</code> types are defined by types equivalent to above. There's no explicit interface, however. Creation of Gravatars and Identicons are both pure functions with the type <code>User -&gt; Maybe Icon</code>. 
Here's the Gravatar function, but the Identicon function looks similar: </p> <p> <pre><span style="color:#2b91af;">gravatarUrl</span>&nbsp;::&nbsp;<span style="color:#2b91af;">String</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">String</span> gravatarUrl&nbsp;email&nbsp;= &nbsp;&nbsp;<span style="color:#a31515;">&quot;https://www.gravatar.com/avatar/&quot;</span>&nbsp;++&nbsp;<span style="color:blue;">show</span>&nbsp;(hashString&nbsp;email&nbsp;::&nbsp;MD5Digest) <span style="color:#2b91af;">getGravatar</span>&nbsp;::&nbsp;<span style="color:blue;">User</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;<span style="color:blue;">Icon</span> getGravatar&nbsp;u&nbsp;= &nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;useGravatar&nbsp;u &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">then</span>&nbsp;Just&nbsp;$&nbsp;Icon&nbsp;$&nbsp;gravatarUrl&nbsp;$&nbsp;userEmail&nbsp;u &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span>&nbsp;Nothing</pre> </p> <p> Reading an icon ID from a database, however, is an impure operation, so the function to do this has the type <code>User -&gt; IO (Maybe Icon)</code>. </p> <h3 id="11adf8bd104d41fab9e6bcaef249210c"> Lazy I/O in Haskell <a href="#11adf8bd104d41fab9e6bcaef249210c" title="permalink">#</a> </h3> <p> Notice that the database icon-querying function has the return type <code>IO (Maybe Icon)</code>. In the introduction you read that the Chain of Responsibility design pattern is a sequence of catamorphisms - the first one over a list of <code>First</code> values. While <code>First</code> is, in itself, a <code>Semigroup</code> instance, it gives rise to a <code>Monoid</code> instance when combined with <code>Maybe</code>. Thus, to showcase the abstractions being used, you could create a list of <code>Maybe (First Icon)</code> values. This forms a <code>Monoid</code>, so is easy to fold. 
</p> <p> The problem with that, however, is that <code>IO</code> is strict under evaluation, so while it works, <a href="https://stackoverflow.com/q/47120384/126014">it's no longer lazy</a>. You can combine <code>IO (Maybe (First Icon))</code> values, but it leads to too much I/O activity. </p> <p> You can <a href="https://stackoverflow.com/q/47120384/126014">solve this problem with a newtype wrapper</a>: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;FirstIO&nbsp;a&nbsp;=&nbsp;FirstIO&nbsp;(MaybeT&nbsp;IO&nbsp;a)&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Functor</span>,&nbsp;<span style="color:#2b91af;">Applicative</span>,&nbsp;<span style="color:#2b91af;">Monad</span>,&nbsp;<span style="color:#2b91af;">Alternative</span>) <span style="color:#2b91af;">firstIO</span>&nbsp;::&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;(<span style="color:#2b91af;">Maybe</span>&nbsp;a)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">FirstIO</span>&nbsp;a firstIO&nbsp;=&nbsp;FirstIO&nbsp;.&nbsp;MaybeT <span style="color:#2b91af;">getFirstIO</span>&nbsp;::&nbsp;<span style="color:blue;">FirstIO</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;(<span style="color:#2b91af;">Maybe</span>&nbsp;a) getFirstIO&nbsp;(FirstIO&nbsp;(MaybeT&nbsp;x))&nbsp;=&nbsp;x <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Semigroup</span>&nbsp;(<span style="color:blue;">FirstIO</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:#2b91af;">(&lt;&gt;)</span>&nbsp;=&nbsp;<span style="color:#2b91af;">(&lt;|&gt;)</span> <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Monoid</span>&nbsp;(<span style="color:blue;">FirstIO</span>&nbsp;a)&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;mempty&nbsp;=&nbsp;empty</pre> </p> <p> This uses the <code>GeneralizedNewtypeDeriving</code> GHC 
extension to automatically make <code>FirstIO</code> <code>Functor</code>, <code>Applicative</code>, <code>Monad</code>, and <code>Alternative</code>. It also uses the <code>Alternative</code> instance to implement <code>Semigroup</code> and <code>Monoid</code>. You may recall from <a href="http://hackage.haskell.org/package/base/docs/Control-Applicative.html">the documentation</a> that <code>Alternative</code> is already a "monoid on applicative functors." </p> <h3 id="995f9ea8f8344aea93b2ffd0b3aad71f"> Alignment <a href="#995f9ea8f8344aea93b2ffd0b3aad71f" title="permalink">#</a> </h3> <p> You now have three functions with different types: two pure functions with the type <code>User -&gt; Maybe Icon</code> and one impure database-bound function with the type <code>User -&gt; IO (Maybe Icon)</code>. In order to have a common abstraction, you should align them so that all types match. At first glance, <code>User -&gt; IO (Maybe (First Icon))</code> seems like a type that fits all implementations, but that causes too much I/O to take place, so instead, use <code>User -&gt; FirstIO Icon</code>. Here's how to lift the pure <code>getGravatar</code> function: </p> <p> <pre><span style="color:#2b91af;">getGravatarIO</span>&nbsp;::&nbsp;<span style="color:blue;">User</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">FirstIO</span>&nbsp;<span style="color:blue;">Icon</span> getGravatarIO&nbsp;=&nbsp;firstIO&nbsp;.&nbsp;<span style="color:blue;">return</span>&nbsp;.&nbsp;getGravatar</pre> </p> <p> You can lift the other functions in similar fashion, to produce <code>getGravatarIO</code>, <code>getIdenticonIO</code>, and <code>getDBIconIO</code>, all with the mutual type <code>User -&gt; FirstIO Icon</code>. 
</p> <h3 id="f601a51f3006430398232e05b6595da0"> Haskell composition <a href="#f601a51f3006430398232e05b6595da0" title="permalink">#</a> </h3> <p> The goal of the Haskell proof of concept is to compose a function that can provide an <code>Icon</code> for any <code>User</code> - just like the above C# composition that uses Chain of Responsibility. There's, however, no way around impurity, because one of the steps involves a database, so the aim is a composition with the type <code>User -&gt; IO Icon</code>. </p> <p> While a more compact composition is possible, I'll show it in a way that makes the catamorphisms explicit: </p> <p> <pre><span style="color:#2b91af;">getIcon</span>&nbsp;::&nbsp;<span style="color:blue;">User</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;<span style="color:blue;">Icon</span> getIcon&nbsp;u&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;lazyIcons&nbsp;=&nbsp;<span style="color:blue;">fmap</span>&nbsp;(\f&nbsp;-&gt;&nbsp;f&nbsp;u)&nbsp;[getGravatarIO,&nbsp;getIdenticonIO,&nbsp;getDBIconIO] &nbsp;&nbsp;m&nbsp;&lt;-&nbsp;getFirstIO&nbsp;$&nbsp;fold&nbsp;lazyIcons &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;fromMaybe&nbsp;defaultIcon&nbsp;m</pre> </p> <p> The <code>getIcon</code> function starts with a list of all three functions. For each of them, it calls the function with the <code>User</code> value <code>u</code>. This may seem inefficient and redundant, because all three function calls may not be required, but since the return values are <code>FirstIO</code> values, all three function calls are lazily evaluated - even under <code>IO</code>. The result, <code>lazyIcons</code>, is a <code>[FirstIO Icon]</code> value; i.e. a lazily evaluated list of lazily evaluated values. </p> <p> This first step is just to put the potential values in a form that's recognisable. 
You can now <code>fold</code> the <code>lazyIcons</code> to a single <code>FirstIO Icon</code> value, and then use <code>getFirstIO</code> to unwrap it. Due to <code>do</code> notation, <code>m</code> is a <code>Maybe Icon</code> value. </p> <p> This is the first catamorphism. Granted, the generalisation that <code>fold</code> offers is not really required, since <code>lazyIcons</code> is a list; <code>mconcat</code> would have worked just as well. I did, however, choose to use <code>fold</code> (from <code>Data.Foldable</code>) to emphasise the point. While the <code>fold</code> function itself isn't the catamorphism for lists, we know that <a href="/2019/05/27/list-catamorphism">it's derived from the list catamorphism</a>. </p> <p> The final step is to utilise the Maybe catamorphism to reduce the <code>Maybe Icon</code> value to an <code>Icon</code> value. Again, the <code>getIcon</code> function doesn't use the Maybe catamorphism directly, but rather the derived <code>fromMaybe</code> function. The <a href="/2019/05/20/maybe-catamorphism">Maybe catamorphism</a> is the <code>maybe</code> function, but you can trivially implement <code>fromMaybe</code> with <code>maybe</code>. </p> <p> For <a href="https://en.wikipedia.org/wiki/Code_golf">golfers</a>, it's certainly possible to write this function in a more compact manner. 
Here's a <a href="https://en.wikipedia.org/wiki/Tacit_programming">point-free</a> version: </p> <p> <pre><span style="color:#2b91af;">getIcon</span>&nbsp;::&nbsp;<span style="color:blue;">User</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;<span style="color:blue;">Icon</span> getIcon&nbsp;= &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;(fromMaybe&nbsp;defaultIcon)&nbsp;.&nbsp;getFirstIO&nbsp;.&nbsp;fold&nbsp;[getGravatarIO,&nbsp;getIdenticonIO,&nbsp;getDBIconIO]</pre> </p> <p> This alternative version utilises that <code>a -&gt; m</code> is a <code>Monoid</code> instance when <code>m</code> is a <code>Monoid</code> instance. That's the reason that you can <code>fold</code> a list of functions. The more explicit version above doesn't do that, but the behaviour is the same in both cases. </p> <p> That's all the Haskell code we need to discern the universal abstractions involved in the Chain of Responsibility design pattern. We can now return to the C# code example. </p> <h3 id="492ff50788784d7dbf6560ed08ed6bf7"> Chains as lists <a href="#492ff50788784d7dbf6560ed08ed6bf7" title="permalink">#</a> </h3> <p> The Chain of Responsibility design pattern is often illustrated like above, in a staircase-like diagram. There's, however, no inherent requirement to do so. You could also flatten the diagram: </p> <p> <img src="/content/binary/chain-of-responsibility-as-a-linked-list.png" alt="Chain of Responsibility illustrated as a linked list."> </p> <p> This looks a lot like a linked list. </p> <p> The difference is, however, that the terminator of a linked list is usually empty. Here, however, you have two types of objects. All objects apart from the rightmost object represent a <em>potential</em>. Each object may, or may not, handle the method call and produce an outcome; if an object can't handle the method call, it'll delegate to the next object in the chain. </p> <p> The rightmost object, however, is different. 
This object can't delegate any further, but <em>must</em> handle the method call. In the icon reader example, this is the <code>DefaultIconReader</code> class. </p> <p> Once you start to see most of the list as a list of potential values, you may realise that you'll be able to collapse it into a single potential value. This is possible because <a href="/2018/04/03/maybe-monoids">a list of values where you pick the first non-empty value forms a monoid</a>. This is sometimes called the <em>First</em> <a href="/2017/10/06/monoids">monoid</a>. </p> <p> In other words, you can reduce, or fold, all of the list, except the rightmost value, to a single potential value: </p> <p> <img src="/content/binary/chain-of-responsibility-as-a-linked-list-single-fold.png" alt="Chain of Responsibility illustrated as a linked list, with all but the rightmost objects folded to one."> </p> <p> When you do that, however, you're left with a single potential value. The result of folding most of the list is that you get the leftmost non-empty value in the list. There's no guarantee, however, that that value is non-empty. If all the values in the list are empty, the result is also empty. This means that you somehow need to combine a potential value with a value that's guaranteed to be present: the terminator. </p> <p> You can do that with another fold: </p> <p> <img src="/content/binary/chain-of-responsibility-as-a-linked-list-double-fold.png" alt="Chain of Responsibility illustrated as a linked list, with two consecutive folds."> </p> <p> This second fold isn't a list fold, but rather a Maybe fold. </p> <h3 id="7632b9ff458d417fa49b1c65f7b198ed"> Maybe <a href="#7632b9ff458d417fa49b1c65f7b198ed" title="permalink">#</a> </h3> <p> The <em>First</em> monoid is a monoid over <a href="/2018/03/26/the-maybe-functor">Maybe</a>, so add a <code>Maybe</code> class to the code base. 
In Haskell, the catamorphism for Maybe is called <code>maybe</code>, but that's not a good method name in object-oriented design. Another option is some variation of <em>fold</em>, but in C#, this functionality tends to be called <code>Aggregate</code>, at least for <code>IEnumerable&lt;T&gt;</code>, so I'll reuse that terminology: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">TResult</span>&nbsp;Aggregate&lt;<span style="color:#2b91af;">TResult</span>&gt;(<span style="color:#2b91af;">TResult</span>&nbsp;@default,&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">T</span>,&nbsp;<span style="color:#2b91af;">TResult</span>&gt;&nbsp;func) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(func&nbsp;==&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(func)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;hasItem&nbsp;?&nbsp;func(item)&nbsp;:&nbsp;@default; }</pre> </p> <p> You can implement another, more list-like <code>Aggregate</code> overload from this one, but for this article, you don't need it. 
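</p> <p> The <code>Aggregate</code> method above reads two fields, <code>hasItem</code> and <code>item</code>, that this article doesn't show. As a minimal sketch, and assuming that the enclosing class has just those two fields plus the two constructors used later in this article, <code>Maybe&lt;T&gt;</code> could look like the following. The field layout and the null check in the constructor are assumptions, not necessarily what the actual code base does: </p> <p> <pre>using System;

public sealed class Maybe&lt;T&gt;
{
    // Assumed fields; Aggregate refers to them by these names.
    private readonly bool hasItem;
    private readonly T item;

    // Creates an empty Maybe.
    public Maybe()
    {
        hasItem = false;
    }

    // Creates a populated Maybe.
    public Maybe(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));

        this.item = item;
        hasItem = true;
    }

    // The Maybe catamorphism, as shown above.
    public TResult Aggregate&lt;TResult&gt;(TResult @default, Func&lt;T, TResult&gt; func)
    {
        if (func == null)
            throw new ArgumentNullException(nameof(func));

        return hasItem ? func(item) : @default;
    }
}</pre> </p> <p> With such a class, <code>new Maybe&lt;int&gt;(42).Aggregate(&quot;none&quot;, i =&gt; i.ToString())</code> takes the <code>func</code> branch, while the same call on <code>new Maybe&lt;int&gt;()</code> returns the <code>@default</code> value. 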
</p> <h3 id="8b60d0c605d14cffbfa5e237cf26b7b2"> From TryReadIconId to Maybe <a href="#8b60d0c605d14cffbfa5e237cf26b7b2" title="permalink">#</a> </h3> <p> In the above code examples, <code>DBIconReader</code> depends on <code>IUserRepository</code>, which defined this method: </p> <p> <pre><span style="color:blue;">bool</span>&nbsp;TryReadIconId(<span style="color:blue;">int</span>&nbsp;userId,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">string</span>&nbsp;iconId);</pre> </p> <p> From <a href="/2019/07/15/tester-doer-isomorphisms">Tester-Doer isomorphisms</a> we know, however, that such a design is isomorphic to returning a Maybe value, and since that's more composable, do that: </p> <p> <pre><span style="color:#2b91af;">Maybe</span>&lt;<span style="color:blue;">string</span>&gt;&nbsp;ReadIconId(<span style="color:blue;">int</span>&nbsp;userId);</pre> </p> <p> This requires you to refactor the <code>DBIconReader</code> implementation of the <code>ReadIcon</code> method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:blue;">string</span>&gt;&nbsp;mid&nbsp;=&nbsp;repository.ReadIconId(user.Id); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&nbsp;lazyResult&nbsp;=&nbsp;mid.Aggregate( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;@default:&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;(()&nbsp;=&gt;&nbsp;next.ReadIcon(user)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;func:&nbsp;id&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;(()&nbsp;=&gt;&nbsp;CreateIcon(id))); 
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;lazyResult.Value; }</pre> </p> <p> A few things are worth a mention. Notice that the above <code>Aggregate</code> method (the Maybe catamorphism) requires you to supply a <code>@default</code> value (to be used if the Maybe object is empty). In the Chain of Responsibility design pattern, however, the fallback value is produced by calling the <code>next</code> object in the chain. If you do this unconditionally, however, you perform too much work. You're only supposed to call <code>next</code> if the current object can't handle the method call. </p> <p> The solution is to aggregate the <code>mid</code> object to a <code>Lazy&lt;Icon&gt;</code> and then return its <code>Value</code>. The <code>@default</code> value is now a lazy computation that calls <code>next</code> only if its <code>Value</code> is read. When <code>mid</code> is populated, on the other hand, the lazy computation calls the private <code>CreateIcon</code> method when <code>Value</code> is accessed. The private <code>CreateIcon</code> method contains the same logic as before the refactoring. </p> <p> This change of <code>DBIconReader</code> isn't strictly necessary in order to change the overall Chain of Responsibility to a pair of catamorphisms, but serves, I think, as a nice introduction to the use of the Maybe catamorphism. 
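</p> <p> To make the difference between an eager and a lazy <code>@default</code> value concrete, here's a small self-contained sketch. The <code>Fallback</code> method is a hypothetical stand-in for <code>next.ReadIcon(user)</code>, strings stand in for <code>Icon</code> objects, and the minimal <code>Maybe&lt;T&gt;</code> class is only an assumption about the shape of the real one: </p> <p> <pre>using System;

// Minimal, assumed Maybe implementation, just enough for the demonstration.
public sealed class Maybe&lt;T&gt;
{
    private readonly bool hasItem;
    private readonly T item;

    public Maybe() { }

    public Maybe(T item)
    {
        this.item = item;
        hasItem = true;
    }

    public TResult Aggregate&lt;TResult&gt;(TResult @default, Func&lt;T, TResult&gt; func)
    {
        return hasItem ? func(item) : @default;
    }
}

public static class Program
{
    private static int fallbackCalls;

    // Hypothetical stand-in for next.ReadIcon(user); counts its invocations.
    private static string Fallback()
    {
        fallbackCalls++;
        return &quot;default&quot;;
    }

    public static void Main()
    {
        var populated = new Maybe&lt;string&gt;(&quot;42&quot;);

        // Eager: Fallback() runs as an argument, before Aggregate is entered.
        string eager = populated.Aggregate(Fallback(), id =&gt; &quot;icon &quot; + id);
        Console.WriteLine(eager + &quot;, fallback calls: &quot; + fallbackCalls);

        fallbackCalls = 0;

        // Lazy: both branches are wrapped; only the chosen one runs on Value.
        Lazy&lt;string&gt; lazy = populated.Aggregate(
            new Lazy&lt;string&gt;(() =&gt; Fallback()),
            id =&gt; new Lazy&lt;string&gt;(() =&gt; &quot;icon &quot; + id));
        Console.WriteLine(lazy.Value + &quot;, fallback calls: &quot; + fallbackCalls);
    }
}</pre> </p> <p> When run, this prints <code>icon 42, fallback calls: 1</code> for the eager call and <code>icon 42, fallback calls: 0</code> for the lazy one. In the eager case the fallback runs even though its result is discarded, which is exactly the redundant work the <code>Lazy&lt;Icon&gt;</code> wrapper avoids. 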
</p> <h3 id="ec329c8a0b70432d81d6f69e7084c13f"> Optional icon readers <a href="#ec329c8a0b70432d81d6f69e7084c13f" title="permalink">#</a> </h3> <p> Previously, the <code>IIconReader</code> interface <em>required</em> each implementation to return an <code>Icon</code> object: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user); }</pre> </p> <p> When you have an object like <code>GravatarReader</code> that may or may not return an <code>Icon</code>, this requirement leads toward the Chain of Responsibility design pattern. You can, however, shift the responsibility of what to do next by changing the interface: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user); }</pre> </p> <p> An implementation like <code>GravatarReader</code> becomes simpler: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">GravatarReader</span>&nbsp;:&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(user.UseGravatar) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span 
style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Icon</span>(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Gravatar</span>(user.Email).Url)); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;(); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> No longer do you have to pass in a <code>next</code> dependency. Instead, you just return an empty <code>Maybe&lt;Icon&gt;</code> if you can't handle the method call. The same change applies to the <code>IdenticonReader</code> class. </p> <p> Since <a href="/2018/03/26/the-maybe-functor">Maybe is a functor</a>, and the <code>DBIconReader</code> already works on a <code>Maybe&lt;string&gt;</code> value, its implementation is greatly simplified: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;repository.ReadIconId(user.Id).Select(CreateIcon); }</pre> </p> <p> Since <code>ReadIconId</code> returns a <code>Maybe&lt;string&gt;</code>, you can simply use <code>Select</code> to transform the icon ID to an <code>Icon</code> object if the Maybe is populated. 
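</p> <p> The <code>Select</code> method used here isn't shown in this article. As a sketch, and assuming the same <code>hasItem</code> and <code>item</code> fields that the <code>Aggregate</code> method relies on, you could derive it from the Maybe catamorphism like this. The constructors and the guard clauses are assumptions: </p> <p> <pre>using System;

public sealed class Maybe&lt;T&gt;
{
    // Assumed fields, named as in the Aggregate method.
    private readonly bool hasItem;
    private readonly T item;

    public Maybe() { }

    public Maybe(T item)
    {
        this.item = item;
        hasItem = true;
    }

    public TResult Aggregate&lt;TResult&gt;(TResult @default, Func&lt;T, TResult&gt; func)
    {
        if (func == null)
            throw new ArgumentNullException(nameof(func));

        return hasItem ? func(item) : @default;
    }

    // Functor mapping: an empty Maybe stays empty,
    // while a populated Maybe has its item transformed.
    public Maybe&lt;TResult&gt; Select&lt;TResult&gt;(Func&lt;T, TResult&gt; selector)
    {
        if (selector == null)
            throw new ArgumentNullException(nameof(selector));

        return Aggregate(
            new Maybe&lt;TResult&gt;(),
            x =&gt; new Maybe&lt;TResult&gt;(selector(x)));
    }
}</pre> </p> <p> Given this sketch, <code>new Maybe&lt;string&gt;(&quot;42&quot;).Select(s =&gt; s.Length)</code> produces a populated <code>Maybe&lt;int&gt;</code>, whereas calling <code>Select</code> on an empty Maybe never invokes the selector. 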
</p> <h3 id="94cac3b9e52e48c2a1768fd24c72e4bd"> Coalescing Composite <a href="#94cac3b9e52e48c2a1768fd24c72e4bd" title="permalink">#</a> </h3> <p> As an intermediate step, you can compose the various readers using a <a href="/2018/04/09/coalescing-composite-as-a-monoid">Coalescing Composite</a>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">CompositeIconReader</span>&nbsp;:&nbsp;<span style="color:#2b91af;">IIconReader</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IIconReader</span>[]&nbsp;iconReaders; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;CompositeIconReader(<span style="color:blue;">params</span>&nbsp;<span style="color:#2b91af;">IIconReader</span>[]&nbsp;iconReaders) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.iconReaders&nbsp;=&nbsp;iconReaders; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">foreach</span>&nbsp;(<span style="color:blue;">var</span>&nbsp;iconReader&nbsp;<span style="color:blue;">in</span>&nbsp;iconReaders) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;mIcon&nbsp;=&nbsp;iconReader.ReadIcon(user); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(IsPopulated(mIcon)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">return</span>&nbsp;mIcon; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;IsPopulated&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;m) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;m.Aggregate(<span style="color:blue;">false</span>,&nbsp;_&nbsp;=&gt;&nbsp;<span style="color:blue;">true</span>); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> I prefer a more explicit design over this one, so this is just an intermediate step. This <code>IIconReader</code> implementation composes an array of other <code>IIconReader</code> objects and queries each in order to return the first populated Maybe value it finds. If it doesn't find any populated value, it returns an empty Maybe object. 
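</p> <p> To see the short-circuiting behaviour in isolation, here's a small sketch of the same loop. It isn't the <code>CompositeIconReader</code> itself: the <code>FirstPopulated</code> helper, the use of plain delegates and strings instead of <code>IIconReader</code>, <code>User</code>, and <code>Icon</code>, and the minimal <code>Maybe&lt;T&gt;</code> class are all assumptions made to keep the example self-contained: </p> <p> <pre>using System;
using System.Collections.Generic;

// Minimal, assumed Maybe implementation, just enough for the demonstration.
public sealed class Maybe&lt;T&gt;
{
    private readonly bool hasItem;
    private readonly T item;

    public Maybe() { }

    public Maybe(T item)
    {
        this.item = item;
        hasItem = true;
    }

    public TResult Aggregate&lt;TResult&gt;(TResult @default, Func&lt;T, TResult&gt; func)
    {
        return hasItem ? func(item) : @default;
    }
}

public static class Program
{
    // Hypothetical helper: the same first-populated-wins loop, over delegates.
    public static Maybe&lt;string&gt; FirstPopulated(
        string user, params Func&lt;string, Maybe&lt;string&gt;&gt;[] readers)
    {
        foreach (var reader in readers)
        {
            var candidate = reader(user);
            if (candidate.Aggregate(false, _ =&gt; true))
                return candidate;
        }

        return new Maybe&lt;string&gt;();
    }

    public static void Main()
    {
        // Records which readers were actually queried.
        var log = new List&lt;string&gt;();

        Func&lt;string, Maybe&lt;string&gt;&gt; gravatar =
            u =&gt; { log.Add(&quot;gravatar&quot;); return new Maybe&lt;string&gt;(); };
        Func&lt;string, Maybe&lt;string&gt;&gt; identicon =
            u =&gt; { log.Add(&quot;identicon&quot;); return new Maybe&lt;string&gt;(&quot;identicon for &quot; + u); };
        Func&lt;string, Maybe&lt;string&gt;&gt; db =
            u =&gt; { log.Add(&quot;db&quot;); return new Maybe&lt;string&gt;(&quot;db icon&quot;); };

        var result = FirstPopulated(&quot;foo&quot;, gravatar, identicon, db);

        Console.WriteLine(result.Aggregate(&quot;none&quot;, x =&gt; x));
        Console.WriteLine(string.Join(&quot;, &quot;, log));
    }
}</pre> </p> <p> When run, this prints <code>identicon for foo</code> followed by <code>gravatar, identicon</code>. The third reader is never queried, because the loop returns as soon as it finds a populated value. 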
</p> <p> You can now compose your <code>IIconReader</code> objects into a <a href="https://en.wikipedia.org/wiki/Composite_pattern">Composite</a>: </p> <p> <pre><span style="color:#2b91af;">IIconReader</span>&nbsp;reader&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">CompositeIconReader</span>( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">GravatarReader</span>(), &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">IdenticonReader</span>(), &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DBIconReader</span>(repo));</pre> </p> <p> While this gives you a single object on which you can call <code>ReadIcon</code>, the return value of that method is still a <code>Maybe&lt;Icon&gt;</code> object. You still need to reduce the <code>Maybe&lt;Icon&gt;</code> object to an <code>Icon</code> object. You can do this with a Maybe helper method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">T</span>&nbsp;GetValueOrDefault(<span style="color:#2b91af;">T</span>&nbsp;@default) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Aggregate(@default,&nbsp;x&nbsp;=&gt;&nbsp;x); }</pre> </p> <p> Given a <code>User</code> object named <code>user</code>, you can now use the composition and the <code>GetValueOrDefault</code> method to get an <code>Icon</code> object: </p> <p> <pre><span style="color:#2b91af;">Icon</span>&nbsp;icon&nbsp;=&nbsp;reader.ReadIcon(user).GetValueOrDefault(<span style="color:#2b91af;">Icon</span>.Default);</pre> </p> <p> First you use the composed <code>reader</code> to produce a <code>Maybe&lt;Icon&gt;</code> object, and then you use the <code>GetValueOrDefault</code> method to reduce the <code>Maybe&lt;Icon&gt;</code> object to an <code>Icon</code> object. 
</p> <p> The latter of these two steps, <code>GetValueOrDefault</code>, is already based on the Maybe catamorphism, but the first step is still too implicit to clearly show the nature of what's actually going on. The next step is to refactor the Coalescing Composite to a list of monoidal values. </p> <h3 id="c75ce57c2b4f4315a93eaa91b653a370"> First <a href="#c75ce57c2b4f4315a93eaa91b653a370" title="permalink">#</a> </h3> <p> While not strictly necessary, you can introduce a <code>First&lt;T&gt;</code> wrapper: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;First(<span style="color:#2b91af;">T</span>&nbsp;item) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(item&nbsp;==&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentNullException</span>(<span style="color:blue;">nameof</span>(item)); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Item&nbsp;=&nbsp;item; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">T</span>&nbsp;Item&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;Equals(<span style="color:blue;">object</span>&nbsp;obj) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(!(obj&nbsp;<span style="color:blue;">is</span>&nbsp;<span 
style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;other)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">false</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Equals(Item,&nbsp;other.Item); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">int</span>&nbsp;GetHashCode() &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Item.GetHashCode(); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> In this particular example, the <code>First&lt;T&gt;</code> class adds no new capabilities, so it's technically redundant. You could add to it methods to combine two <code>First&lt;T&gt;</code> objects into one (since <em>First</em> forms a <a href="/2017/11/27/semigroups">semigroup</a>), and perhaps a method or two to <a href="/2017/12/11/semigroups-accumulate">accumulate multiple values</a>, but in this article, none of those are required. </p> <p> While the class as shown above doesn't add any behaviour, I like that it signals intent, so I'll use it in that role. </p> <h3 id="c3feb40d90fc4d389fa0b3812abaa62c"> Lazy I/O in C# <a href="#c3feb40d90fc4d389fa0b3812abaa62c" title="permalink">#</a> </h3> <p> Like in the above Haskell code, you'll need to be able to combine two <code>First&lt;T&gt;</code> objects in a lazy fashion, in such a way that if the first object is populated, the I/O associated with producing the second value never happens. In Haskell I addressed that concern with a <code>newtype</code> that, among other abstractions, is a monoid. 
You can do the same in C# with an extension method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;&gt;&nbsp;FindFirst&lt;<span style="color:#2b91af;">T</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;&gt;&nbsp;m, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;&gt;&nbsp;other) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(m.Value.IsPopulated()) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;m; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;other; } <span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;IsPopulated&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;m) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;m.Aggregate(<span style="color:blue;">false</span>,&nbsp;_&nbsp;=&gt;&nbsp;<span style="color:blue;">true</span>); }</pre> </p> <p> The <code>FindFirst</code> method returns the first (leftmost) non-empty object of two options. It's a lazy version of the <em>First</em> monoid, and <a href="/2019/04/15/lazy-monoids">that's still a monoid</a>. It's truly lazy because it never accesses the <code>Value</code> property on <code>other</code>. 
While it has to force evaluation of the first lazy computation, <code>m</code>, it doesn't have to evaluate <code>other</code>. Thus, whenever <code>m</code> is populated, <code>other</code> can remain unevaluated. </p> <p> Since <a href="/2017/11/20/monoids-accumulate">monoids accumulate</a>, you can also write an extension method to implement that functionality: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;&gt;&nbsp;FindFirst&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;&gt;&gt;&nbsp;source) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;identity&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;&gt;(()&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.Aggregate(identity,&nbsp;(acc,&nbsp;x)&nbsp;=&gt;&nbsp;acc.FindFirst(x)); }</pre> </p> <p> This overload just uses the earlier <code>FindFirst</code> extension method to fold an arbitrary number of lazy <code>First&lt;T&gt;</code> objects into one. Notice that <code>Aggregate</code> is the C# name for the list catamorphism. 
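</p> <p> To see the laziness at work, here's a hedged sketch that records which computations actually run. It assumes a hypothetical, minimal <code>Maybe&lt;T&gt;</code> stand-in, and it leaves out the <code>First&lt;T&gt;</code> wrapper for brevity. The <code>LazyFirstDemo</code> class and the names <code>a</code>, <code>b</code>, and <code>c</code> are illustrative only. </p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for the article's Maybe<T>.
public sealed class Maybe<T>
{
    private readonly bool hasItem;
    private readonly T item;

    public Maybe() { }

    public Maybe(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));
        this.hasItem = true;
        this.item = item;
    }

    public TResult Aggregate<TResult>(TResult @default, Func<T, TResult> func)
    {
        return hasItem ? func(item) : @default;
    }
}

public static class LazyFirst
{
    // Same shape as the binary FindFirst above, minus the First<T> wrapper.
    public static Lazy<Maybe<T>> FindFirst<T>(
        this Lazy<Maybe<T>> m,
        Lazy<Maybe<T>> other)
    {
        if (m.Value.Aggregate(false, _ => true))
            return m;
        return other;
    }

    // Folds a sequence with the binary operation, starting from the empty identity.
    public static Lazy<Maybe<T>> FindFirst<T>(this IEnumerable<Lazy<Maybe<T>>> source)
    {
        var identity = new Lazy<Maybe<T>>(() => new Maybe<T>());
        return source.Aggregate(identity, (acc, x) => acc.FindFirst(x));
    }
}

public static class LazyFirstDemo
{
    public static (string result, List<string> log) Run()
    {
        var log = new List<string>();

        // Each lazy computation writes its name to the log when it's forced.
        Lazy<Maybe<string>> Source(string name, Maybe<string> value) =>
            new Lazy<Maybe<string>>(() => { log.Add(name); return value; });

        var candidates = new[]
        {
            Source("a", new Maybe<string>()),
            Source("b", new Maybe<string>("icon-b")),
            Source("c", new Maybe<string>("icon-c")),
        };

        string result = candidates.FindFirst().Value.Aggregate("default", x => x);
        return (result, log);
    }
}
```

<p> In this sketch, <code>Run</code> returns <code>"icon-b"</code>, and the log only contains <code>a</code> and <code>b</code>. The computation named <code>c</code> is never forced, because the fold has already found a populated value when it reaches it.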
</p> <p> You can now compose the desired functionality using the basic building blocks of monoids, <a href="/2018/03/22/functors">functors</a>, and catamorphisms. </p> <h3 id="0fe80a69c74c463dacb8af0f86898518"> Composition from universal abstractions <a href="#0fe80a69c74c463dacb8af0f86898518" title="permalink">#</a> </h3> <p> The goal is still a function that takes a <code>User</code> object as input and produces an <code>Icon</code> object as output. While you could compose that functionality directly in-line where you need it, I think it may be helpful to package the composition in a <a href="https://en.wikipedia.org/wiki/Facade_pattern">Facade</a> object. </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">IconReaderFacade</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:#2b91af;">IReadOnlyCollection</span>&lt;<span style="color:#2b91af;">IIconReader</span>&gt;&nbsp;readers; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;IconReaderFacade(<span style="color:#2b91af;">IUserRepository</span>&nbsp;repository) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;readers&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">IIconReader</span>[] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">GravatarReader</span>(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">IdenticonReader</span>(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">DBIconReader</span>(repository) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Icon</span>&nbsp;ReadIcon(<span style="color:#2b91af;">User</span>&nbsp;user) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&gt;&gt;&gt;&nbsp;lazyIcons&nbsp;=&nbsp;readers &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Select(r&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&gt;&gt;(()&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;r.ReadIcon(user).Select(i&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;(i)))); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Lazy</span>&lt;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">First</span>&lt;<span style="color:#2b91af;">Icon</span>&gt;&gt;&gt;&nbsp;m&nbsp;=&nbsp;lazyIcons.FindFirst(); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;m.Value.Aggregate(<span style="color:#2b91af;">Icon</span>.Default,&nbsp;fi&nbsp;=&gt;&nbsp;fi.Item); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> When you initialise an 
<code>IconReaderFacade</code> object, it creates an array of the desired <code>readers</code>. Whenever <code>ReadIcon</code> is invoked, it first transforms all those readers to a sequence of potential icons. All the values in the sequence are lazily evaluated, so in this step, nothing actually happens, even though it looks as though all readers' <code>ReadIcon</code> method gets called. The <code>Select</code> method is a structure-preserving map, so all readers are still potential producers of <code>Icon</code> objects. </p> <p> You now have an <code>IEnumerable&lt;Lazy&lt;Maybe&lt;First&lt;Icon&gt;&gt;&gt;&gt;</code>, which must be a good candidate for the prize for the <em>most nested generic .NET type of 2019</em>. It fits, though, the input type for the above <code>FindFirst</code> overload, so you can call that. The result is a single potential value <code>m</code>. That's the list catamorphism applied. </p> <p> Finally, you force evaluation of the lazy computation and apply the Maybe catamorphism (<code>Aggregate</code>). The <code>@default</code> value is <code>Icon.Default</code>, which gets returned if <code>m</code> turns out to be empty. When <code>m</code> is populated, you pull the <code>Item</code> out of the <code>First</code> object. In either case, you now have an <code>Icon</code> object to return. </p> <p> This composition has exactly the same behaviour as the initial Chain of Responsibility implementation, but is now composed from universal abstractions. </p> <h3 id="23819ca370344b94875ddbf5bde5aef3"> Summary <a href="#23819ca370344b94875ddbf5bde5aef3" title="permalink">#</a> </h3> <p> The Chain of Responsibility design pattern describes a flexible way to implement conditional logic. Instead of relying on keywords like <code>if</code> or <code>switch</code>, you can compose the conditional logic from polymorphic objects. This gives you several advantages. 
One is that you get better separation of concerns, which will tend to make it easier to refactor the code. Another is that it's possible to change the behaviour at run time, by moving the objects around. </p> <p> You can achieve a similar design, with equivalent advantages, by composing polymorphically similar functions in a list, mapping the functions to a list of potential values, and then using the list catamorphism to reduce many potential values to one. Finally, you apply the Maybe catamorphism to produce a value, even if the potential value is empty. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Tester-Doer isomorphisms https://blog.ploeh.dk/2019/07/15/tester-doer-isomorphisms 2019-07-15T07:35:00+00:00 Mark Seemann <div id="post"> <p> <em>The Tester-Doer pattern is equivalent to the Try-Parse idiom; both are equivalent to Maybe.</em> </p> <p> This article is part of <a href="/2018/01/08/software-design-isomorphisms">a series of articles about software design isomorphisms</a>. An isomorphism is when a bi-directional lossless translation exists between two representations. Such translations exist between the <em>Tester-Doer</em> pattern and the <em>Try-Parse</em> idiom. Both can also be translated into operations that return <a href="/2018/03/26/the-maybe-functor">Maybe</a>. </p> <p> <img src="/content/binary/tester-doer-try-parse-maybe-isomorphism.png" alt="Isomorphisms between Tester-Doer, Try-Parse, and Maybe."> </p> <p> Given an implementation that uses one of those three idioms or abstractions, you can translate your design into one of the other options. This doesn't imply that each is of equal value. When it comes to composability, Maybe is superior to the two other alternatives, and Tester-Doer isn't thread-safe. 
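</p> <p> As a rough side-by-side sketch, a dictionary lookup can be expressed in all three styles. <code>ContainsKey</code>, the indexer, and <code>TryGetValue</code> are real members of <code>IDictionary&lt;TKey, TValue&gt;</code>. The <code>Maybe&lt;T&gt;</code> class, the <code>LookupIdioms</code> class, and the <code>"n/a"</code> fallback are hypothetical and only serve as illustration. </p>

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for the article's Maybe<T>.
public sealed class Maybe<T>
{
    private readonly bool hasItem;
    private readonly T item;

    public Maybe() { }

    public Maybe(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));
        this.hasItem = true;
        this.item = item;
    }

    public TResult Aggregate<TResult>(TResult @default, Func<T, TResult> func)
    {
        return hasItem ? func(item) : @default;
    }
}

public static class LookupIdioms
{
    // Tester-Doer: ContainsKey is the tester, the indexer is the doer.
    public static string TesterDoer(IDictionary<string, string> d, string key)
    {
        if (d.ContainsKey(key))
            return d[key];
        return "n/a";
    }

    // Try-Parse: a single call with a Boolean result and an out parameter.
    public static string TryParseStyle(IDictionary<string, string> d, string key)
    {
        if (d.TryGetValue(key, out string value))
            return value;
        return "n/a";
    }

    // Maybe: a single call that returns a potential value.
    // This sketch assumes that the dictionary holds no null values.
    public static Maybe<string> TryFind(IDictionary<string, string> d, string key)
    {
        return d.TryGetValue(key, out string value)
            ? new Maybe<string>(value)
            : new Maybe<string>();
    }
}
```

<p> The first two methods must decide on the spot what a missing key means, while <code>TryFind</code> lets the caller postpone that decision until it calls <code>Aggregate</code>.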
</p> <h3 id="e95c8f5d7a6445139b58445d30498493"> Tester-Doer <a href="#e95c8f5d7a6445139b58445d30498493" title="permalink">#</a> </h3> <p> The first time I explicitly encountered the Tester-Doer pattern was in the <a href="https://amzn.to/2zXCCfH">Framework Design Guidelines</a>, which is from where I've taken the name. The pattern is, however, older. The idea that you can query an object about whether a given operation would be possible, and then you only perform it if the answer is affirmative, is almost a leitmotif in <a href="http://amzn.to/1claOin">Object-Oriented Software Construction</a>. Bertrand Meyer often uses linked lists and stacks as examples, but I'll instead use the example that Krzysztof Cwalina and Brad Abrams use: </p> <p> <pre><span style="color:#2b91af;">ICollection</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;numbers&nbsp;=&nbsp;<span style="color:green;">//&nbsp;...</span> <span style="color:blue;">if</span>&nbsp;(!numbers.IsReadOnly) &nbsp;&nbsp;&nbsp;&nbsp;numbers.Add(1);</pre> </p> <p> The idea with the Tester-Doer pattern is that you test whether an intended operation is legal, and only perform it if the answer is affirmative. In the example, you only add to the <code>numbers</code> collection if <code>IsReadOnly</code> is <code>false</code>. Here, <code>IsReadOnly</code> is the <em>Tester</em>, and <code>Add</code> is the <em>Doer</em>. </p> <p> As Jeffrey Richter points out in the book, this is a dangerous pattern: <blockquote> "The potential problem occurs when you have multiple threads accessing the object at the same time. For example, one thread could execute the test method, which reports that all is OK, and before the doer method executes, another thread could change the object, causing the doer to fail." </blockquote> In other words, the pattern isn't thread-safe. While multi-threaded programming was always supported in .NET, this was less of a concern when the guidelines were first published (2006) than it is today. 
The guidelines were in internal use at Microsoft years before they were published, and there weren't many multi-core processors in use back then. </p> <p> Another problem with the Tester-Doer pattern is with discoverability. If you're looking for a way to add an element to a collection, you'd usually consider your search over once you find the <code>Add</code> method. Even if you wonder <em>Is this operation safe? Can I always add an element to a collection?</em> you <em>might</em> consider looking for a <code>CanAdd</code> method, but not an <code>IsReadOnly</code> property. Most people don't even ask the question in the first place, though. </p> <h3 id="08bc9f42d8f048119f952aa9c2d94b34"> From Tester-Doer to Try-Parse <a href="#08bc9f42d8f048119f952aa9c2d94b34" title="permalink">#</a> </h3> <p> You could refactor such a Tester-Doer API to a single method, which is both thread-safe and discoverable. One option is a variation of the Try-Parse idiom (discussed in detail below). Using it could look like this: </p> <p> <pre><span style="color:#2b91af;">ICollection</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;numbers&nbsp;=&nbsp;<span style="color:green;">//&nbsp;...</span> <span style="color:blue;">bool</span>&nbsp;wasAdded&nbsp;=&nbsp;numbers.TryAdd(1);</pre> </p> <p> In this special case, you may not need the <code>wasAdded</code> variable, because the original <code>Add</code> operation never returned a value. If, on the other hand, you do care whether or not the element was added to the collection, you'd have to figure out what to do in the case where the return value is <code>true</code> and <code>false</code>, respectively. </p> <p> Compared to the more idiomatic example of the Try-Parse idiom below, you may have noticed that the <code>TryAdd</code> method shown here takes no <code>out</code> parameter. This is because the original <code>Add</code> method returns <code>void</code>; there's nothing to return. 
From <a href="/2018/01/15/unit-isomorphisms">unit isomorphisms</a>, however, we know that <em>unit</em> is isomorphic to <code>void</code>, so we could, more explicitly, have defined a <code>TryAdd</code> method with this signature: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;TryAdd(<span style="color:#2b91af;">T</span>&nbsp;item,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:#2b91af;">Unit</span>&nbsp;unit)</pre> </p> <p> There's no point in doing this, however, apart from demonstrating that the isomorphism holds. </p> <h3 id="e246bcfabcab42e8b76e2b3e314174c4"> From Tester-Doer to Maybe <a href="#e246bcfabcab42e8b76e2b3e314174c4" title="permalink">#</a> </h3> <p> You can also refactor the add-to-collection example to return a Maybe value, although in this degenerate case, it makes little sense. If you automate the refactoring process, you'd arrive at an API like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Unit</span>&gt;&nbsp;TryAdd(<span style="color:#2b91af;">T</span>&nbsp;item)</pre> </p> <p> Using it would look like this: </p> <p> <pre><span style="color:#2b91af;">ICollection</span>&lt;<span style="color:blue;">int</span>&gt;&nbsp;numbers&nbsp;=&nbsp;<span style="color:green;">//&nbsp;...</span> <span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">Unit</span>&gt;&nbsp;m&nbsp;=&nbsp;numbers.TryAdd(1);</pre> </p> <p> The contract is consistent with what Maybe implies: You'd get an empty <code>Maybe&lt;Unit&gt;</code> object if the <em>add</em> operation 'failed', and a populated <code>Maybe&lt;Unit&gt;</code> object if the <em>add</em> operation succeeded. Even in the populated case, though, the value contained in the Maybe object would be <em>unit</em>, which carries no further information than its existence. 
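</p> <p> One way to sketch such a <code>TryAdd</code> method is as an extension method on <code>ICollection&lt;T&gt;</code>. The <code>Unit</code> and <code>Maybe&lt;T&gt;</code> classes below are hypothetical stand-ins. The sketch also assumes that a read-only collection signals failure by throwing <code>NotSupportedException</code> from <code>Add</code>, which is the documented behaviour of that interface method, although custom collections might not honour it. </p>

```csharp
using System;
using System.Collections.Generic;

// Hypothetical unit type: it has exactly one value.
public sealed class Unit
{
    public static readonly Unit Value = new Unit();

    private Unit() { }
}

// Hypothetical, simplified stand-in for the article's Maybe<T>.
public sealed class Maybe<T>
{
    private readonly bool hasItem;
    private readonly T item;

    public Maybe() { }

    public Maybe(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));
        this.hasItem = true;
        this.item = item;
    }

    public TResult Aggregate<TResult>(TResult @default, Func<T, TResult> func)
    {
        return hasItem ? func(item) : @default;
    }
}

public static class CollectionMaybeExtensions
{
    // Attempts the add in a single call instead of testing IsReadOnly first.
    public static Maybe<Unit> TryAdd<T>(this ICollection<T> source, T item)
    {
        try
        {
            source.Add(item);
            return new Maybe<Unit>(Unit.Value);
        }
        catch (NotSupportedException)
        {
            // Assumed failure signal for read-only collections.
            return new Maybe<Unit>();
        }
    }
}
```

<p> Calling <code>TryAdd</code> on a <code>List&lt;int&gt;</code> would return a populated object, while calling it on the result of <code>AsReadOnly</code> would return an empty one. As in the signature above, the populated case carries nothing but <em>unit</em>.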
</p> <p> To be clear, this isn't close to a proper functional design because all the interesting action happens as a side effect. Does the design have to be functional? No, it clearly isn't in this case, but Maybe is a concept that originated in functional programming, so you could be misled to believe that I'm trying to pass this particular design off as functional. It's not. </p> <p> A functional version of this API could look like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">ICollection</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;&nbsp;TryAdd(<span style="color:#2b91af;">T</span>&nbsp;item)</pre> </p> <p> An implementation wouldn't mutate the object itself, but rather return a new collection with the added item, in case that was possible. This is, however, always possible, because you can always concatenate <code>item</code> to the front of the collection. In other words, this particular line of inquiry is increasingly veering into the territory of the absurd. This isn't, however, a counter-example of my proposition that the isomorphism exists; it's just a result of the initial example being degenerate. </p> <h3 id="9817f0d35d99428f93c38cab9fabc9ad"> Try-Parse <a href="#9817f0d35d99428f93c38cab9fabc9ad" title="permalink">#</a> </h3> <p> Another idiom described in the Framework Design Guidelines is the Try-Parse idiom. This seems to be a coding idiom more specific to the .NET framework, which is the reason I call it an <em>idiom</em> instead of a <em>pattern</em>. (Perhaps it is, after all, a pattern... I'm sure many of my readers are better informed about how problems like these are solved in other languages, and can enlighten me.) </p> <p> A better name might be <em>Try-Do</em>, since the idiom doesn't have to be constrained to parsing. 
The example that Cwalina and Abrams supply, however, relates to parsing a <code>string</code> into a <code>DateTime</code> value. Such an API is <a href="https://docs.microsoft.com/en-us/dotnet/api/system.datetime.tryparse">already available in the base class library</a>. Using it looks like this: </p> <p> <pre><span style="color:blue;">bool</span>&nbsp;couldParse&nbsp;=&nbsp;<span style="color:#2b91af;">DateTime</span>.TryParse(candidate,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:#2b91af;">DateTime</span>&nbsp;dateTime);</pre> </p> <p> Since <code>DateTime</code> is a <a href="https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/value-types">value type</a>, the <code>out</code> parameter will never be <code>null</code>, even if parsing fails. You can, however, examine the return value <code>couldParse</code> to determine whether the <code>candidate</code> could be parsed. </p> <p> In the running commentary in the book, Jeffrey Richter likes this much better: <blockquote> "I like this guideline a lot. It solves the race-condition problem and the performance problem." </blockquote> I agree that it's better than Tester-Doer, but that doesn't mean that you can't refactor such a design to that pattern. </p> <h3 id="166ef01b6b64481a85fe64a6e9e07dc6"> From Try-Parse to Tester-Doer <a href="#166ef01b6b64481a85fe64a6e9e07dc6" title="permalink">#</a> </h3> <p> While I see no compelling reason to design parsing attempts with the Tester-Doer pattern, it's possible. 
You could create an API that enables interaction like this: </p> <p> <pre><span style="color:#2b91af;">DateTime</span>&nbsp;dateTime&nbsp;=&nbsp;<span style="color:blue;">default</span>(<span style="color:#2b91af;">DateTime</span>); <span style="color:blue;">bool</span>&nbsp;canParse&nbsp;=&nbsp;<span style="color:#2b91af;">DateTimeEnvy</span>.CanParse(candidate); <span style="color:blue;">if</span>&nbsp;(canParse) &nbsp;&nbsp;&nbsp;&nbsp;dateTime&nbsp;=&nbsp;<span style="color:#2b91af;">DateTime</span>.Parse(candidate);</pre> </p> <p> You'd need to add a new <code>CanParse</code> method with this signature: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;CanParse(<span style="color:blue;">string</span>&nbsp;candidate)</pre> </p> <p> In this particular example, you don't have to add a <code>Parse</code> method, because it already exists in the base class library, but in other examples, you'd have to add such a method as well. </p> <p> This example doesn't suffer from issues with thread safety, since strings are immutable, but in general, that problem is always a concern with the Tester-Doer <a href="/2019/01/21/some-thoughts-on-anti-patterns">anti-pattern</a>. Discoverability still suffers in this example. </p> <h3 id="ffd6284cfc8f4f528d1a3b80849fbf8c"> From Try-Parse to Maybe <a href="#ffd6284cfc8f4f528d1a3b80849fbf8c" title="permalink">#</a> </h3> <p> While the Try-Parse idiom is thread-safe, it isn't composable. Every time you run into an API modelled over this template, you have to stop what you're doing and check the return value. Did the operation succeed? What should the code do if it didn't? </p> <p> <em>Maybe</em>, on the other hand, is composable, so it's a much better way to model problems such as parsing. Typically, methods or functions that return Maybe values are still prefixed with <em>Try</em>, but there's no longer any <code>out</code> parameter. 
A Maybe-based <code>TryParse</code> function could look like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">DateTime</span>&gt;&nbsp;TryParse(<span style="color:blue;">string</span>&nbsp;candidate)</pre> </p> <p> You could use it like this: </p> <p> <pre><span style="color:#2b91af;">Maybe</span>&lt;<span style="color:#2b91af;">DateTime</span>&gt;&nbsp;m&nbsp;=&nbsp;<span style="color:#2b91af;">DateTimeEnvy</span>.TryParse(candidate);</pre> </p> <p> If the <code>candidate</code> was successfully parsed, you get a populated <code>Maybe&lt;DateTime&gt;</code>; if the string was invalid, you get an empty <code>Maybe&lt;DateTime&gt;</code>. </p> <p> A Maybe object composes much better with other computations. Contrary to the Try-Parse idiom, you don't have to stop and examine a Boolean return value. You don't even have to deal with empty cases at the point where you parse. Instead, you can defer the decision about what to do in case of failure until a later time, where it may be more obvious what to do in that case. </p> <h3 id="4f27ce3476114a5f9b0f80fd415e5370"> Maybe <a href="#4f27ce3476114a5f9b0f80fd415e5370" title="permalink">#</a> </h3> <p> In my <a href="https://blog.ploeh.dk/encapsulation-and-solid">Encapsulation and SOLID</a> Pluralsight course, you get a walk-through of all three options for dealing with an operation that could potentially fail. Like in this article, the course starts with Tester-Doer, progresses over Try-Parse, and arrives at a Maybe-based implementation. In that course, the example involves reading a (previously stored) message from a text file. 
The final API looks like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&lt;<span style="color:blue;">string</span>&gt;&nbsp;Read(<span style="color:blue;">int</span>&nbsp;id)</pre> </p> <p> The protocol implied by such a signature is that you supply an ID, and if a message with that ID exists on disc, you receive a populated <code>Maybe&lt;string&gt;</code>; otherwise, an empty object. This is not only composable, but also thread-safe. For anyone who understands the <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstraction</a> of Maybe, it's clear that this is an operation that could fail. Ultimately, client code will have to deal with empty Maybe values, but this doesn't have to happen immediately. Such a decision can be deferred until a proper context exists for that purpose. </p> <h3 id="d35fbacb32bb4ef6afc843813ba901f1"> From Maybe to Tester-Doer <a href="#d35fbacb32bb4ef6afc843813ba901f1" title="permalink">#</a> </h3> <p> Since Tester-Doer is the least useful of the patterns discussed in this article, it makes little sense to refactor a Maybe-based API to a Tester-Doer implementation. Nonetheless, it's still possible. The API could look like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;Exists(<span style="color:blue;">int</span>&nbsp;id) <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">string</span>&nbsp;Read(<span style="color:blue;">int</span>&nbsp;id)</pre> </p> <p> Not only is this design not thread-safe, but it's another example of poor discoverability. While the doer is called <code>Read</code>, the tester isn't called <code>CanRead</code>, but rather <code>Exists</code>. If the class has other members, these could be listed interleaved between <code>Exists</code> and <code>Read</code>. It wouldn't be obvious that these two members were designed to be used together. 
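</p> <p> To illustrate the translation, here's a hedged sketch that derives both Tester-Doer members from a Maybe-based <code>Read</code> method. The in-memory <code>MaybeStore</code> class is a hypothetical stand-in for the file-based store from the course, and <code>Maybe&lt;T&gt;</code> is a simplified stand-in as well. Throwing <code>InvalidOperationException</code> from the doer is an assumption, not something the original API prescribes. </p>

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for the article's Maybe<T>.
public sealed class Maybe<T>
{
    private readonly bool hasItem;
    private readonly T item;

    public Maybe() { }

    public Maybe(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));
        this.hasItem = true;
        this.item = item;
    }

    public TResult Aggregate<TResult>(TResult @default, Func<T, TResult> func)
    {
        return hasItem ? func(item) : @default;
    }
}

// Hypothetical in-memory stand-in for the file-backed message store.
public sealed class MaybeStore
{
    private readonly Dictionary<int, string> messages =
        new Dictionary<int, string>();

    public void Save(int id, string message)
    {
        messages[id] = message;
    }

    // The Maybe-based API.
    public Maybe<string> Read(int id)
    {
        return messages.TryGetValue(id, out string message)
            ? new Maybe<string>(message)
            : new Maybe<string>();
    }
}

// A Tester-Doer view derived entirely from the Maybe-based API.
public sealed class TesterDoerStore
{
    private readonly MaybeStore store;

    public TesterDoerStore(MaybeStore store)
    {
        this.store = store;
    }

    // Tester: is the Maybe populated?
    public bool Exists(int id)
    {
        return store.Read(id).Aggregate(false, _ => true);
    }

    // Doer: returns the message, or throws if the tester would have said no.
    public string Read(int id)
    {
        Maybe<string> m = store.Read(id);
        if (!m.Aggregate(false, _ => true))
            throw new InvalidOperationException($"No message with ID {id}.");
        return m.Aggregate("", x => x);
    }
}
```

<p> Nothing is lost in the translation, since both members are defined purely in terms of the Maybe-based <code>Read</code> method. The derived API still forces client code to call the two members in the right order, though.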
</p> <p> Again, the intended usage is code like this: </p> <p> <pre><span style="color:blue;">string</span>&nbsp;message; <span style="color:blue;">if</span>&nbsp;(fileStore.Exists(49)) &nbsp;&nbsp;&nbsp;&nbsp;message&nbsp;=&nbsp;fileStore.Read(49);</pre> </p> <p> This is still problematic, because you need to decide what to do in the <code>else</code> case as well, although you don't see that case here. </p> <p> The point is, still, that you <em>can</em> translate from one representation to another without loss of information; not that you should. </p> <h3 id="3bbc92082af143d29681b2ce0bb11ccb"> From Maybe to Try-Parse <a href="#3bbc92082af143d29681b2ce0bb11ccb" title="permalink">#</a> </h3> <p> Of the three representations discussed in this article, I firmly believe that a Maybe-based API is superior. Unfortunately, the .NET base class library doesn't (yet) come with a built-in Maybe object, so if you're developing an API as part of a reusable library, you have two options: <ul> <li>Export the library's <code>Maybe&lt;T&gt;</code> type together with the methods that return it.</li> <li>Use Try-Parse for interoperability reasons.</li> </ul> This is the only reason I can think of to use the Try-Parse idiom. For the <code>FileStore</code> example from my Pluralsight course, this would imply not a <code>TryParse</code> method, but a <code>TryRead</code> method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;TryRead(<span style="color:blue;">int</span>&nbsp;id,&nbsp;<span style="color:blue;">out</span>&nbsp;<span style="color:blue;">string</span>&nbsp;message)</pre> </p> <p> This would enable you to expose the method in a reusable library. 
Client code could interact with it like this: </p> <p> <pre><span style="color:blue;">string</span>&nbsp;message; <span style="color:blue;">if</span>&nbsp;(!fileStore.TryRead(50,&nbsp;<span style="color:blue;">out</span>&nbsp;message)) &nbsp;&nbsp;&nbsp;&nbsp;message&nbsp;=&nbsp;<span style="color:#a31515;">&quot;&quot;</span>;</pre> </p> <p> This has all the problems associated with the Try-Parse idiom already discussed in this article, but it does, at least, have a basic use case. </p> <h3 id="c04073bcc534481eaaf1ba43dd2a22a4"> Isomorphism with Either <a href="#c04073bcc534481eaaf1ba43dd2a22a4" title="permalink">#</a> </h3> <p> At this point, I hope that you find it reasonable to believe that the three representations, Tester-Doer, Try-Parse, and Maybe, are isomorphic. You can translate between any of these representations to any other of these without loss of information. This also means that you can translate back again. </p> <p> While I've only argued with a series of examples, it's my experience that these three representations are truly isomorphic. You can always translate any of these representations into another. Mostly, though, I translate into Maybe. If you disagree with my proposition, all you have to do is to provide a counter-example. </p> <p> There's a fourth isomorphism that's already well-known, and that's between Maybe and <a href="/2018/06/11/church-encoded-either">Either</a>. Specifically, <code>Maybe&lt;T&gt;</code> is isomorphic to <code>Either&lt;Unit, T&gt;</code>. 
In <a href="https://www.haskell.org">Haskell</a>, this is easily demonstrated with this set of functions: </p> <p> <pre><span style="color:#2b91af;">toMaybe</span>&nbsp;::&nbsp;<span style="color:#2b91af;">Either</span>&nbsp;()&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;a toMaybe&nbsp;(Left&nbsp;<span style="color:blue;">()</span>)&nbsp;=&nbsp;Nothing toMaybe&nbsp;(Right&nbsp;x)&nbsp;=&nbsp;Just&nbsp;x <span style="color:#2b91af;">fromMaybe</span>&nbsp;::&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Either</span>&nbsp;()&nbsp;a fromMaybe&nbsp;Nothing&nbsp;=&nbsp;Left&nbsp;<span style="color:blue;">()</span> fromMaybe&nbsp;(Just&nbsp;x)&nbsp;=&nbsp;Right&nbsp;x</pre> </p> <p> Translated to C#, using the <a href="/2018/06/04/church-encoded-maybe">Church-encoded Maybe</a> together with the Church-encoded Either, these two functions could look like the following, starting with the conversion from Maybe to Either: </p> <p> <pre><span style="color:green;">//&nbsp;On&nbsp;Maybe:</span> <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IEither</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;ToEither&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IMaybe</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;source) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.Match&lt;<span style="color:#2b91af;">IEither</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;nothing:&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Left</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span 
style="color:#2b91af;">T</span>&gt;(<span style="color:#2b91af;">Unit</span>.Value), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;just:&nbsp;x&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Right</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;(x)); }</pre> </p> <p> Likewise, the conversion from Either to Maybe: </p> <p> <pre><span style="color:green;">//&nbsp;On&nbsp;Either:</span> <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">IMaybe</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;ToMaybe&lt;<span style="color:#2b91af;">T</span>&gt;(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IEither</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;source) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;source.Match&lt;<span style="color:#2b91af;">IMaybe</span>&lt;<span style="color:#2b91af;">T</span>&gt;&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;onLeft:&nbsp;_&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Nothing</span>&lt;<span style="color:#2b91af;">T</span>&gt;(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;onRight:&nbsp;x&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Just</span>&lt;<span style="color:#2b91af;">T</span>&gt;(x)); }</pre> </p> <p> You can convert back and forth to your heart's content, as this parametrised <a href="https://xunit.github.io">xUnit.net</a> 2.3.1 test shows: </p> <p> <pre>[<span style="color:#2b91af;">Theory</span>] [<span style="color:#2b91af;">InlineData</span>(42)] [<span style="color:#2b91af;">InlineData</span>(1337)] [<span style="color:#2b91af;">InlineData</span>(2112)] [<span style="color:#2b91af;">InlineData</span>(90125)] <span 
style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;IsomorphicWithPopulatedMaybe(<span style="color:blue;">int</span>&nbsp;i) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;expected&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Right</span>&lt;<span style="color:#2b91af;">Unit</span>,&nbsp;<span style="color:blue;">int</span>&gt;(i); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;actual&nbsp;=&nbsp;expected.ToMaybe().ToEither(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Assert</span>.Equal(expected,&nbsp;actual); }</pre> </p> <p> I decided to exclude <code>IEither&lt;Unit, T&gt;</code> from the overall theme of this article in order to better contrast three alternatives that may not otherwise look equivalent. That <code>IEither&lt;Unit, T&gt;</code> is isomorphic to <code>IMaybe&lt;T&gt;</code> is a well-known result. Besides, I think that both of these two representations already inhabit the same conceptual space. Either and Maybe are both well-known in statically typed functional programming. </p> <h3 id="8e3e7b55ac1e49568712675713426e59"> Summary <a href="#8e3e7b55ac1e49568712675713426e59" title="permalink">#</a> </h3> <p> The Tester-Doer pattern is a decades-old design pattern that attempts to model how to perform operations that can potentially fail, without relying on exceptions for flow control. It predates mainstream multi-core processors by decades, which can explain why it even exists as a pattern in the first place. At the time people arrived at the pattern, thread-safety wasn't a big concern. </p> <p> The Try-Parse idiom is a thread-safe alternative to the Tester-Doer pattern. It combines the two <em>tester</em> and <em>doer</em> methods into a single method with an <code>out</code> parameter. While thread-safe, it's not composable. </p> <p> <em>Maybe</em> offers the best of both worlds. It's both thread-safe and composable. 
It's also as discoverable as any Try-Parse method. </p> <p> These three alternatives are all, however, isomorphic. This means that you can refactor any of the three designs into one of the other designs, without loss of information. It also means that you can implement <a href="https://en.wikipedia.org/wiki/Adapter_pattern">Adapters</a> between particular implementations, should you so desire. You see this frequently in <a href="https://fsharp.org">F#</a> code, where functions that return <code>'a option</code> adapt Try-Parse methods from the .NET base class library. </p> <p> While all three designs are equivalent in the sense that you can translate one into another, it doesn't imply that they're equally useful. <em>Maybe</em> is the superior design, and Tester-Doer clearly inferior. </p> <p> <strong>Next:</strong> <a href="/2018/05/22/church-encoding">Church encoding</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Payment types catamorphism https://blog.ploeh.dk/2019/07/08/payment-types-catamorphism 2019-07-08T06:08:00+00:00 Mark Seemann <div id="post"> <p> <em>You can find the catamorphism for a custom sum type. Here's an example.</em> </p> <p> This article is part of an <a href="/2019/04/29/catamorphisms">article series about catamorphisms</a>. A catamorphism is a <a href="/2017/10/04/from-design-patterns-to-category-theory">universal abstraction</a> that describes how to digest a data structure into a potentially more compact value. </p> <p> This article presents the catamorphism for a domain-specific <a href="https://en.wikipedia.org/wiki/Tagged_union">sum type</a>, as well as how to identify it. The beginning of this article presents the catamorphism in C#, with a few examples. The rest of the article describes how to deduce the catamorphism. This part of the article presents my work in <a href="https://www.haskell.org">Haskell</a>. 
Readers not comfortable with Haskell can just read the first part, and consider the rest of the article as an optional appendix. </p> <p> In all previous articles in the series, you've seen catamorphisms for well-known data structures: <a href="/2019/05/06/boolean-catamorphism">Boolean values</a>, <a href="/2019/05/13/peano-catamorphism">Peano numbers</a>, <a href="/2019/05/20/maybe-catamorphism">Maybe</a>, <a href="/2019/06/10/tree-catamorphism">trees</a>, and so on. These are all general-purpose data structures, so you might be left with the impression that catamorphisms are only related to such general types. That's not the case. The point of this article is to demonstrate that you can find the catamorphism for your own custom, domain-specific sum type as well. </p> <h3 id="2b6f7df594c0474589ae9805f1e1a1d0"> C# catamorphism <a href="#2b6f7df594c0474589ae9805f1e1a1d0" title="permalink">#</a> </h3> <p> The custom type we'll examine in this article is the <a href="/2018/06/18/church-encoded-payment-types">Church-encoded payment types</a> I've previously written about. It's just an example of a custom data type, but it serves the purpose of illustration because I've already shown it as a Church encoding in C#, <a href="/2018/06/25/visitor-as-a-sum-type">as a Visitor in C#</a>, and <a href="/2016/11/28/easy-domain-modelling-with-types">as a discriminated union in F#</a>. 
</p> <p> The catamorphism for the <code>IPaymentType</code> interface is the <code>Match</code> method: </p> <p> <pre><span style="color:#2b91af;">T</span>&nbsp;Match&lt;<span style="color:#2b91af;">T</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">PaymentService</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;individual, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">PaymentService</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;parent, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2b91af;">Func</span>&lt;<span style="color:#2b91af;">ChildPaymentService</span>,&nbsp;<span style="color:#2b91af;">T</span>&gt;&nbsp;child);</pre> </p> <p> As has turned out to be a common trait, the catamorphism is identical to the Church encoding. </p> <p> I'm not going to show more than a few examples of using the <code>Match</code> method, because you can find other examples in the previous articles: </p> <p> <pre>&gt; <span style="color:#2b91af;">IPaymentType</span>&nbsp;p&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Individual</span>(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">PaymentService</span>(<span style="color:#a31515;">&quot;Visa&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Pay&quot;</span>)); &gt; p.Match(ps&nbsp;=&gt;&nbsp;ps.Name,&nbsp;ps&nbsp;=&gt;&nbsp;ps.Name,&nbsp;cps&nbsp;=&gt;&nbsp;cps.PaymentService.Name) "Visa" &gt; <span style="color:#2b91af;">IPaymentType</span>&nbsp;p&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Parent</span>(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">PaymentService</span>(<span style="color:#a31515;">&quot;Visa&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Pay&quot;</span>)); &gt; 
p.Match(ps&nbsp;=&gt;&nbsp;ps.Name,&nbsp;ps&nbsp;=&gt;&nbsp;ps.Name,&nbsp;cps&nbsp;=&gt;&nbsp;cps.PaymentService.Name) "Visa" &gt; <span style="color:#2b91af;">IPaymentType</span>&nbsp;p&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">Child</span>(<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ChildPaymentService</span>(<span style="color:#a31515;">&quot;1234&quot;</span>,&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">PaymentService</span>(<span style="color:#a31515;">&quot;Visa&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Pay&quot;</span>))); &gt; p.Match(ps&nbsp;=&gt;&nbsp;ps.Name,&nbsp;ps&nbsp;=&gt;&nbsp;ps.Name,&nbsp;cps&nbsp;=&gt;&nbsp;cps.PaymentService.Name) "Visa"</pre> </p> <p> These three examples from a <em>C# Interactive</em> session demonstrate that no matter which payment method you use, you can use the same <code>Match</code> method call to extract the payment name from the <code>p</code> object. </p> <h3 id="f2334a900eef421cb24c6e48a96e411b"> Payment types F-Algebra <a href="#f2334a900eef421cb24c6e48a96e411b" title="permalink">#</a> </h3> <p> As in the <a href="/2019/06/24/full-binary-tree-catamorphism">previous article</a>, I'll use <code>Fix</code> and <code>cata</code> as explained in <a href="https://bartoszmilewski.com">Bartosz Milewski</a>'s excellent <a href="https://bartoszmilewski.com/2017/02/28/f-algebras/">article on F-Algebras</a>. 
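</p> <p> If you don't have that article at hand, the two definitions can be sketched as follows. This is a reconstruction that assumes the usual formulation of <code>Fix</code> and <code>cata</code>, not a quote from any particular library. The <code>main</code> action is only a hypothetical smoke test that treats <code>Maybe</code> as the functor and counts the nested <code>Just</code> layers: </p> <p> <pre>-- Fixed point of a functor f: a value of f whose "holes" are again Fix f.
newtype Fix f = Fix { unFix :: f (Fix f) }

-- Fold a Fix f value from the inside out, using the algebra alg at each layer.
cata :: Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a
cata alg = alg . fmap (cata alg) . unFix

-- Hypothetical smoke test: Fix Maybe behaves like a Peano number.
main :: IO ()
main = print (cata (maybe 0 (+ 1)) two :: Int) -- prints 2
  where
    two = Fix (Just (Fix (Just (Fix Nothing))))</pre> 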
</p> <p> First, you'll have to define the auxiliary types involved in this API: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;PaymentService&nbsp;=&nbsp;PaymentService&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;paymentServiceName&nbsp;::&nbsp;String &nbsp;&nbsp;,&nbsp;paymentServiceAction&nbsp;::&nbsp;String &nbsp;&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>) <span style="color:blue;">data</span>&nbsp;ChildPaymentService&nbsp;=&nbsp;ChildPaymentService&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;originalTransactionKey&nbsp;::&nbsp;String &nbsp;&nbsp;,&nbsp;parentPaymentService&nbsp;::&nbsp;PaymentService &nbsp;&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>)</pre> </p> <p> While F-Algebras and fixed points are mostly used for recursive data structures, you can also define an F-Algebra for a non-recursive data structure. You already saw examples of that in the articles about <a href="/2019/05/06/boolean-catamorphism">Boolean catamorphism</a>, <a href="/2019/05/20/maybe-catamorphism">Maybe catamorphism</a>, and <a href="/2019/06/03/either-catamorphism">Either catamorphism</a>. 
While each of the three payment types has associated data, none of it is parametrically polymorphic, so a single type argument for the carrier type suffices: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;PaymentTypeF&nbsp;c&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;IndividualF&nbsp;PaymentService &nbsp;&nbsp;|&nbsp;ParentF&nbsp;PaymentService &nbsp;&nbsp;|&nbsp;ChildF&nbsp;ChildPaymentService &nbsp;&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Read</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">Functor</span>&nbsp;<span style="color:blue;">PaymentTypeF</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;(IndividualF&nbsp;ps)&nbsp;=&nbsp;IndividualF&nbsp;ps &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ParentF&nbsp;ps)&nbsp;=&nbsp;ParentF&nbsp;ps &nbsp;&nbsp;<span style="color:blue;">fmap</span>&nbsp;_&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ChildF&nbsp;cps)&nbsp;=&nbsp;ChildF&nbsp;cps</pre> </p> <p> I chose to call the carrier type <code>c</code> (for <em>carrier</em>). As was also the case with <code>BoolF</code>, <code>MaybeF</code>, and <code>EitherF</code>, the <code>Functor</code> instance ignores the map function because the carrier type is missing from all three cases. Like the <code>Functor</code> instances for <code>BoolF</code>, <code>MaybeF</code>, and <code>EitherF</code>, it'd seem that nothing happens, but at the type level, this is still a translation from <code>PaymentTypeF c</code> to <code>PaymentTypeF c1</code>. Not much of a function, perhaps, but definitely an <em>endofunctor</em>. 
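</p> <p> To see that type-level translation concretely, consider this self-contained sketch. The names <code>visa</code> and <code>retagged</code>, and the choice of <code>Int</code> and <code>String</code> as carrier types, are assumptions made purely for illustration: </p> <p> <pre>data PaymentService = PaymentService
  { paymentServiceName :: String
  , paymentServiceAction :: String
  } deriving (Show, Eq)

data ChildPaymentService = ChildPaymentService
  { originalTransactionKey :: String
  , parentPaymentService :: PaymentService
  } deriving (Show, Eq)

data PaymentTypeF c =
    IndividualF PaymentService
  | ParentF PaymentService
  | ChildF ChildPaymentService
  deriving (Show, Eq)

instance Functor PaymentTypeF where
  fmap _ (IndividualF ps) = IndividualF ps
  fmap _ (ParentF ps) = ParentF ps
  fmap _ (ChildF cps) = ChildF cps

-- Hypothetical sample value; the annotation pins the carrier type to Int.
visa :: PaymentTypeF Int
visa = IndividualF (PaymentService "Visa" "Pay")

-- Mapping show over it never calls show, because no case contains a carrier
-- value. Only the type changes: PaymentTypeF Int becomes PaymentTypeF String.
retagged :: PaymentTypeF String
retagged = fmap show visa

main :: IO ()
main = print (retagged == IndividualF (PaymentService "Visa" "Pay")) -- prints True</pre> 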
</p> <p> Some helper functions make it a little easier to create <code>Fix PaymentTypeF</code> values, but there's really not much to them: </p> <p> <pre><span style="color:#2b91af;">individualF</span>&nbsp;::&nbsp;<span style="color:blue;">PaymentService</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">PaymentTypeF</span> individualF&nbsp;=&nbsp;Fix&nbsp;.&nbsp;IndividualF <span style="color:#2b91af;">parentF</span>&nbsp;::&nbsp;<span style="color:blue;">PaymentService</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">PaymentTypeF</span> parentF&nbsp;=&nbsp;Fix&nbsp;.&nbsp;ParentF <span style="color:#2b91af;">childF</span>&nbsp;::&nbsp;<span style="color:blue;">ChildPaymentService</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">PaymentTypeF</span> childF&nbsp;=&nbsp;Fix&nbsp;.&nbsp;ChildF</pre> </p> <p> That's all you need to identify the catamorphism. </p> <h3 id="da3c2c0fee2747bebb1db38c15110bcb"> Haskell catamorphism <a href="#da3c2c0fee2747bebb1db38c15110bcb" title="permalink">#</a> </h3> <p> At this point, you have two out of three elements of an F-Algebra. You have an endofunctor (<code>PaymentTypeF</code>), and an object <code>c</code>, but you still need to find a morphism <code>PaymentTypeF c -&gt; c</code>. 
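</p> <p> It may help to look at one concrete morphism of that shape before generalising. The following sketch fixes the carrier type to <code>String</code> and extracts the payment service name, much like the C# <code>Match</code> examples. The name <code>nameAlg</code> is hypothetical, and <code>Fix</code> and <code>cata</code> are written out on the assumption that they follow the usual definitions: </p> <p> <pre>newtype Fix f = Fix { unFix :: f (Fix f) }

cata :: Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a
cata alg = alg . fmap (cata alg) . unFix

data PaymentService = PaymentService
  { paymentServiceName :: String
  , paymentServiceAction :: String }

data ChildPaymentService = ChildPaymentService
  { originalTransactionKey :: String
  , parentPaymentService :: PaymentService }

data PaymentTypeF c =
    IndividualF PaymentService
  | ParentF PaymentService
  | ChildF ChildPaymentService

instance Functor PaymentTypeF where
  fmap _ (IndividualF ps) = IndividualF ps
  fmap _ (ParentF ps) = ParentF ps
  fmap _ (ChildF cps) = ChildF cps

-- One specific algebra (hypothetical name), with the carrier fixed to String.
nameAlg :: PaymentTypeF String -&gt; String
nameAlg (IndividualF ps) = paymentServiceName ps
nameAlg (ParentF ps) = paymentServiceName ps
nameAlg (ChildF cps) = paymentServiceName (parentPaymentService cps)

main :: IO ()
main = putStrLn (cata nameAlg child) -- prints Visa
  where
    child = Fix (ChildF (ChildPaymentService "1234" (PaymentService "Visa" "Pay")))</pre> 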
</p> <p> As in the previous articles, start by writing a function that will become the catamorphism, based on <code>cata</code>: </p> <p> <pre>paymentF&nbsp;=&nbsp;cata&nbsp;alg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(IndividualF&nbsp;ps)&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ParentF&nbsp;ps)&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ChildF&nbsp;cps)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> While this compiles, with its <code>undefined</code> implementations, it obviously doesn't do anything useful. I find, however, that it helps me think. How can you return a value of the type <code>c</code> from the <code>IndividualF</code> case? You could pass an argument to the <code>paymentF</code> function, but you shouldn't ignore the data <code>ps</code> contained in the case, so it has to be a function: </p> <p> <pre>paymentF&nbsp;fi&nbsp;=&nbsp;cata&nbsp;alg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(IndividualF&nbsp;ps)&nbsp;=&nbsp;fi&nbsp;ps &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ParentF&nbsp;ps)&nbsp;=&nbsp;<span style="color:blue;">undefined</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ChildF&nbsp;cps)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> I chose to call the argument <code>fi</code>, for <em>function, individual</em>. 
You can pass a similar argument to deal with the <code>ParentF</code> case: </p> <p> <pre>paymentF&nbsp;fi&nbsp;fp&nbsp;=&nbsp;cata&nbsp;alg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(IndividualF&nbsp;ps)&nbsp;=&nbsp;fi&nbsp;ps &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ParentF&nbsp;ps)&nbsp;=&nbsp;fp&nbsp;ps &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ChildF&nbsp;cps)&nbsp;=&nbsp;<span style="color:blue;">undefined</span></pre> </p> <p> And of course with the remaining <code>ChildF</code> case as well: </p> <p> <pre><span style="color:#2b91af;">paymentF</span>&nbsp;::&nbsp;(<span style="color:blue;">PaymentService</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">PaymentService</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">ChildPaymentService</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c)&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">Fix&nbsp;PaymentTypeF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;c paymentF&nbsp;fi&nbsp;fp&nbsp;fc&nbsp;=&nbsp;cata&nbsp;alg &nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;alg&nbsp;(IndividualF&nbsp;ps)&nbsp;=&nbsp;fi&nbsp;ps &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ParentF&nbsp;ps)&nbsp;=&nbsp;fp&nbsp;ps &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;alg&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(ChildF&nbsp;cps)&nbsp;=&nbsp;fc&nbsp;cps</pre> </p> <p> This works. 
Since <code>cata</code> has the type <code>Functor f =&gt; (f a -&gt; a) -&gt; Fix f -&gt; a</code>, that means that <code>alg</code> has the type <code>f a -&gt; a</code>. In the case of <code>PaymentTypeF</code>, the compiler infers that the <code>alg</code> function has the type <code>PaymentTypeF c -&gt; c</code>, which is just what you need! </p> <p> You can now see what the carrier type <code>c</code> is for. It's the type that the algebra extracts, and thus the type that the catamorphism returns. </p> <p> This, then, is the catamorphism for the payment types. Except for the <a href="/2019/06/10/tree-catamorphism">tree catamorphism</a>, all catamorphisms so far have been pairs, but this one is a triplet of functions. This is because the sum type has three cases instead of two. </p> <p> As you've seen repeatedly, this isn't the only possible catamorphism, since you can, for example, trivially reorder the arguments to <code>paymentF</code>. The version shown here is, however, equivalent to the above C# <code>Match</code> method. </p> <h3 id="e6248a9ea34148c79c2b03acc92de5f7"> Usage <a href="#e6248a9ea34148c79c2b03acc92de5f7" title="permalink">#</a> </h3> <p> You can use the catamorphism as a basis for other functionality. 
If, for example, you want to convert a <code>Fix PaymentTypeF</code> value to JSON, you can first define an <a href="http://hackage.haskell.org/package/aeson/docs/Data-Aeson.html">Aeson</a> record type for that purpose: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;PaymentJson&nbsp;=&nbsp;PaymentJson&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;name&nbsp;::&nbsp;String &nbsp;&nbsp;,&nbsp;action&nbsp;::&nbsp;String &nbsp;&nbsp;,&nbsp;startRecurrent&nbsp;::&nbsp;Bool &nbsp;&nbsp;,&nbsp;transactionKey&nbsp;::&nbsp;Maybe&nbsp;String &nbsp;&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Generic</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">ToJSON</span>&nbsp;<span style="color:blue;">PaymentJson</span></pre> </p> <p> Subsequently, you can use <code>paymentF</code> to implement a conversion from <code>Fix PaymentTypeF</code> to <code>PaymentJson</code>, as in the previous articles: </p> <p> <pre><span style="color:#2b91af;">toJson</span>&nbsp;::&nbsp;<span style="color:blue;">Fix</span>&nbsp;<span style="color:blue;">PaymentTypeF</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">PaymentJson</span> toJson&nbsp;= &nbsp;&nbsp;paymentF &nbsp;&nbsp;&nbsp;&nbsp;(\(PaymentService&nbsp;n&nbsp;a)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;-&gt;&nbsp;PaymentJson&nbsp;n&nbsp;a&nbsp;False&nbsp;Nothing) &nbsp;&nbsp;&nbsp;&nbsp;(\(PaymentService&nbsp;n&nbsp;a)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;-&gt;&nbsp;PaymentJson&nbsp;n&nbsp;a&nbsp;True&nbsp;Nothing) 
&nbsp;&nbsp;&nbsp;&nbsp;(\(ChildPaymentService&nbsp;k&nbsp;(PaymentService&nbsp;n&nbsp;a))&nbsp;-&gt;&nbsp;PaymentJson&nbsp;n&nbsp;a&nbsp;False&nbsp;$&nbsp;Just&nbsp;k)</pre> </p> <p> Testing it in GHCi, it works as it's supposed to: </p> <p> <pre>Prelude Data.Aeson B Payment&gt; B.putStrLn $ encode $ toJson $ parentF $ PaymentService "Visa" "Pay" {"transactionKey":null,"startRecurrent":true,"action":"Pay","name":"Visa"}</pre> </p> <p> Clearly, it would have been easier to define the payment types shown here as a regular Haskell sum type and just use standard pattern matching, but the purpose of this article isn't to present useful code; the only purpose of the code here is to demonstrate how to identify the catamorphism for a custom domain-specific sum type. </p> <h3 id="153479fffaf647f6ad6f5fc6a63fe025"> Summary <a href="#153479fffaf647f6ad6f5fc6a63fe025" title="permalink">#</a> </h3> <p> Even custom, domain-specific sum types have catamorphisms. This article presented the catamorphism for a custom payment sum type. Because this particular sum type has three cases, the catamorphism is a triplet, instead of a pair, which has otherwise been the most common shape of catamorphisms in previous articles. </p> <p> <strong>Next:</strong> <a href="/2018/03/05/some-design-patterns-as-universal-abstractions">Some design patterns as universal abstractions</a>. </p> </div><hr> I've been rereading Fred Brooks's 1986 essay No Silver Bullet because I've become increasingly concerned that people seem to draw the wrong conclusions from it. Semantic diffusion seems to have set in. These days, when people state something along the lines that there's no silver bullet in software development, I often get the impression that they mean that there's no panacea.

Indeed; I agree. There's no miracle cure that will magically make all problems in software development go away. That's not what the essay states, however. It is, fortunately, more subtle than that. </p> <h3 id="712292e6c9c34663801dd40b4f278d3d"> No silver bullet reread <a href="#712292e6c9c34663801dd40b4f278d3d" title="permalink">#</a> </h3> <p> It's a great essay. It's not my intent to dispute the central argument of the essay, but I think that Brooks made one particular assumption that I disagree with. That doesn't make me smarter in any way. He wrote the essay in 1986. I'm writing this in 2019, with the benefit of the experience of all the years in-between. Hindsight is 20-20, so anyone could make the observations that I do here. </p> <p> Before we get to that, though, a brief summary of the essence of the essay is in order. In short, the conclusion is this: <blockquote> <p> "There is no single development, in either technology or management technique, which by itself promises even one order-of-magnitude improvement within a decade in productivity, in reliability, in simplicity." </p> <footer><cite>Fred Brooks, <em>No Silver Bullet</em>, 1986</cite></footer> </blockquote> The beginning of the essay is a brilliant analysis of the reasons why software development is inherently difficult. If you read this together with Jack Reeves's <em>What Is Software Design?</em> (available in various places on the internet, or as an appendix in <a href="http://amzn.to/19W4JHk">APPP</a>), you'll probably agree that there's an inherent complexity to software development that no invention is likely to dispel. </p> <p> Ostensibly in the tradition of <a href="https://en.wikipedia.org/wiki/Aristotle">Aristotle</a>, Brooks distinguishes between <em>essential</em> and <em>accidental</em> complexity. This distinction is central to his argument, so it's worth discussing for a minute. 
</p> <p> Software development problems are complex, i.e. made up of many interacting sub-problems. Some of that complexity is <em>accidental</em>. This doesn't imply randomness or sloppiness, but only that the complexity isn't inherent to the problem; that it's only the result of our (human) failure to achieve perfection. </p> <p> If you imagine that you could whittle away all the accidental complexity, you'd ultimately reach a point where, in the words of Saint Exupéry, <em>there is nothing more to remove</em>. What's left is the <em>essential</em> complexity. </p> <p> Brooks' conjecture is that a typical software development project comes with both essential and accidental complexity. In his 1995 reflections <em>"No Silver Bullet" Refired</em> (available in <a href="http://bit.ly/mythical-man-month">The Mythical Man-Month</a>), he clarifies what he already implied in 1986: <blockquote> <p> "It is my opinion, and that is all, that the accidental or representational part of the work is now down to about half or less of the total." </p> <footer><cite>Fred Brooks, <em>"No Silver Bullet" Refired</em>, 1995</cite></footer> </blockquote> This I fundamentally disagree with, but more on that later. It makes sense to me to graphically represent the argument like this: </p> <p> <img src="/content/binary/essential-accidental-complexity-shells-brooks-scenario.png" alt="Some, but not much, accidental complexity as a shell around essential complexity."> </p> <p> The way that I think of Brooks' argument is that any software project contains some essential and some accidental complexity. For a given project, the size of the essential complexity is fixed. 
</p> <p> Brooks believes that less than half of the overall complexity is accidental: </p> <p> <img src="/content/binary/essential-accidental-complexity-pie-chart-brooks-scenario.png" alt="Essential and accidental complexity pie chart."> </p> <p> While a pie chart better illustrates the supposed ratio between the two types of complexity, I prefer to view Brooks' arguments as the first diagram, above. In that visualisation, the essential complexity is a core of fixed size, while accidental complexity is something you can work at removing. If you keep improving your process and technology, you may, conceptually, be able to remove (almost) all of it. </p> <p> <img src="/content/binary/essential-almost-no-accidental-complexity-shells.png" alt="Essential complexity with a very thin shell of accidental complexity."> </p> <p> Brooks' point, with which I agree, is that if the essential complexity is inherent, then you can't reduce the size of it. The only way to decrease the overall complexity is to reduce the accidental complexity. </p> <p> If you agree with the assessment that less than half of the overall complexity in modern software development is accidental, then it follows that no dramatic improvements are available. Even if you remove all accidental complexity, you've only reduced overall complexity by, say, forty percent. </p> <h3 id="d8e6f84d104b4ff6ad6b5473e46a4e30"> Accidental complexity abounds <a href="#d8e6f84d104b4ff6ad6b5473e46a4e30" title="permalink">#</a> </h3> <p> I find Brooks' arguments compelling. I do not, however, accept the premise that there's only little accidental complexity left. Instead of the above diagrams, I believe that the situation looks more like this (not to scale): </p> <p> <img src="/content/binary/accidental-complexity-with-tiny-core-of-essential-complexity.png" alt="Accidental complexity with a tiny core of essential complexity."> </p> <p> I think that most of the complexity in software development is accidental. 
I'm not sure about today, but I believe that I have compelling evidence that this was the case in 1986, so I don't see why it shouldn't still be the case. </p> <p> To be clear, this is all anecdotal, since I don't believe that software development is quantifiable. In the essay, Brooks explicitly talks about the <em>invisibility</em> of software. Software is pure <em>thought stuff;</em> you can't measure it. I discuss this in my <a href="https://cleancoders.com/episode/humane-code-real-episode-1/show">Humane Code video</a>, but I also recommend that you read <a href="http://bit.ly/leprechauns-of-software-engineering">The Leprechauns of Software Engineering</a> if you have any illusions that we, as an industry, have any reliable measurements of productivity. </p> <p> Brooks predicted that, within the decade (from 1986 to 1996), there would be no single development that would increase productivity by an order of magnitude, i.e. by a factor of at least ten. Ironically, when he wrote <em>"No Silver Bullet" Refired</em> in 1995, at least two such developments were already in motion. </p> <p> We can't blame Brooks for not identifying those developments, because in 1995, their impact was not yet apparent. Again, hindsight is 20-20. </p> <p> Neither of these two developments is purely technological, although technology plays a role. Notice, though, that Brooks' prediction included <em>technology or management technique</em>. It's in the interaction between technology and the humane that the orders-of-magnitude developments emerged. </p> <h3 id="1d23f6fb89884b6d9833ce09d68a3b0f"> World Wide Web <a href="#1d23f6fb89884b6d9833ce09d68a3b0f" title="permalink">#</a> </h3> <p> I have a dirty little secret. In the beginning of my programming career, I became quite the expert on a programming framework called <a href="https://en.wikipedia.org/wiki/Microsoft_Commerce_Server">Microsoft Commerce Server</a>. 
In fact, I co-authored a chapter of <a href="https://amzn.to/2CpE4rr">Professional Commerce Server 2000 Programming</a>, and in 2003 I received an <a href="https://mvp.microsoft.com">MVP</a> award as an acknowledgement of my work in the Commerce Server community (such as it was; it was mostly on <a href="https://en.wikipedia.org/wiki/Usenet">Usenet</a>). </p> <p> The Commerce Server framework was a black box. This was long before Microsoft embraced open source, and while there was a bit of official documentation, it was superficial; it was mostly of the <em>getting-started</em> kind. </p> <p> Over several years, I managed to figure out how the framework really worked, and thus, how one could extend it. This was a painstaking process. Since it was a black box, I couldn't just go and read the code to figure out how it worked. The framework was written in C++ and Visual Basic, so there wasn't even IL code to decompile. </p> <p> I had one window into the framework. It relied on SQL Server, and I could attach the profiler tool to spy on its interaction with the database. Painstakingly, over several years, I managed to wrest the framework's secrets from it. </p> <p> I wasted much time doing detective work like that. </p> <p> In general, programming in the late nineties and early two-thousands was less productive, not because the languages or tools were orders-of-magnitude worse than today, but because when you hit a snag, you were in trouble. </p> <p> These days, if you run into a problem beyond your abilities, you can ask for help on the World Wide Web. Usually, you'll find an existing answer on <a href="https://stackoverflow.com">Stack Overflow</a>, and you'll be able to proceed without too much delay. </p> <p> Compared to twenty years ago, I believe that the World Wide Web has increased my productivity more than ten-fold. While it also existed in 1995, there wasn't much content. 
It's not the technology itself that provides the productivity increase, but rather the synergy of technology and human knowledge. </p> <p> I think that Brooks vastly underestimated how much time one can waste when one is stuck. That's a sort of accidental complexity, although in the development process rather than in the technology itself. </p> <h3 id="a3b19483cd6a4c509d8c3a77fe324872"> Automated testing <a href="#a3b19483cd6a4c509d8c3a77fe324872" title="permalink">#</a> </h3> <p> In the late nineties, I was developing web sites (with Commerce Server). When I wanted to run my code to see if it worked, I'd launch the web site on my laptop, log in, click around and enter data until I was convinced that the functionality was working as it should. Most of the time, however, it wasn't, so I'd change a bit of the code, and go through the same process again. </p> <p> I think that's a common way to 'test' software; at least, it was back then. </p> <p> While you could get good at going through these motions quickly, verifying a single piece of functionality, or a handful of related ones, could easily take at least a couple of seconds, and usually more like half a minute. </p> <p> If you had dozens, or even hundreds, of different scenarios to address, you obviously wouldn't run through them all every time you changed the code. At the very best, you'd click your way through three or four usage scenarios that you thought were relevant to the change you'd made. Other functionality, earlier declared <em>done</em>, you just considered to be unaffected. </p> <p> Needless to say, regressions were regular occurrences. </p> <p> In 2003 I discovered test-driven development, and through that, automated testing. While you can't directly compare unit tests with whole usage scenarios, I think it's fair to compare something like automated integration tests or user-scenario tests (whatever you want to call them) with manually clicking through an application. 
</p> <p> Even an integration test, if written properly, can verify a scenario <em>at least</em> ten times faster than you can do it by hand. A more realistic estimate is probably a hundred times faster, or more. </p> <p> Granted, you have to write the automated test as well, and I know that it's not always trivial. Still, once you have an automated test suite in place, you can run it all the time. </p> <p> I never ran through <em>all</em> usage scenarios when I manually 'tested' my software. With automated tests, I do. This saves me from most regressions. </p> <p> This improvement is, in my opinion, a no-brainer. It's easily a factor-of-ten improvement. All the time wasted manually 'testing' the software, plus the time wasted fixing regressions, can be put to better use. </p> <p> At the time Brooks was writing his own retrospective (in 1995), Kent Beck was beginning to talk to other people about test-driven development. As is a common theme in this article, hindsight is 20-20. </p> <h3 id="c7ca9269cce04b3ab934c97bc8cf0328"> Honourable mentions <a href="#c7ca9269cce04b3ab934c97bc8cf0328" title="permalink">#</a> </h3> <p> There have been other improvements in software development since 1986. I considered including several other improvements as bona fide orders-of-magnitude improvements, but I think that's probably going too far. Each of the following developments has, however, offered significant improvements: <ul> <li> <strong>Git.</strong> It's surprising how much more productive Git can make you. While it's somewhat better than centralised source control systems at the functionality also available with those other systems, the productivity increase comes from all the new, unanticipated workflows it enables. Before I started using DVCS, I'd have lots of code that was commented out, so that I could experiment with various alternatives. With Git, I just create a new branch, or stash my changes, and experiment with abandon. 
While it's probably not a ten-fold increase in productivity, I believe it's the simplest technology change you can make to dramatically increase your productivity. </li> <li> <strong>Garbage collection.</strong> Since I've admitted that I worked with Microsoft Commerce Server, I've probably lost all credibility with my reader already, but let's see if I can win back a little. While Commerce Server programming involved <a href="https://en.wikipedia.org/wiki/VBScript">VBScript</a> programming, it also often involved <a href="https://en.wikipedia.org/wiki/Component_Object_Model">COM</a> programming, and I did quite a bit of that in C++. Having to make sure that you've cleaned up all memory after use is a bother. Garbage collection just makes this work go away. It's hardly a ten-fold improvement in productivity, but I do find it significant. </li> <li> <strong>Agile software development.</strong> The methodology of decreasing the feedback time between implementation and deployment has made me much more productive. I'm not interested in peddling any particular methodology like Scrum as much as just the general concept of getting rapid feedback. Particularly if you combine continuous delivery with Git, you have a powerful combination. Brooks already talked about incremental software development, and had some hopes attached to this as well. My personal experience can only agree with his sentiment. Again, probably not in itself a ten-fold increase in productivity, but enough that I wouldn't want to work on a project where rapid feedback and incremental development weren't valued. </li> </ul> I'm probably forgetting lots of other improvements that have happened in recent decades. That's fine. The purpose of this article isn't to produce an exhaustive list, but rather to make the argument that significant improvements have been made since Brooks wrote his essay. I think it'd be folly, then, to believe that we've seen the last of such improvements. 
</p> <p> Personally, I'm inclined to believe another order-of-magnitude improvement is right at our feet. </p> <h3 id="bd2d47d8dac2401e936ca7902bc9109d"> Statically typed functional programming <a href="#bd2d47d8dac2401e936ca7902bc9109d" title="permalink">#</a> </h3> <p> This section is conjecture on my part. The improvements I've so far covered are already realised (at least for those who choose to take advantage of them). The improvement I'll cover here is more speculative. </p> <p> I believe that statically typed functional programming offers another order-of-magnitude improvement over existing software development. Twenty years ago, I believed that object-oriented programming was a good idea. I now believe that I was wrong about that, so it's possible that in another twenty years, I'll also believe that I was wrong about functional programming. Take the following for what it is. </p> <p> When I carefully reread <em>No Silver Bullet</em>, I got the distinct impression that Brooks considered low-level details of programming part of its essential complexity: <blockquote> <p> "Much of the complexity in a software construct is, however, not due to conformity to the external world but rather to th