ploeh blog 2024-05-22T10:14:01+00:00 Mark Seemann danish software design https://blog.ploeh.dk Fundamentals https://blog.ploeh.dk/2024/05/20/fundamentals 2024-05-20T07:04:00+00:00 Mark Seemann <div id="post"> <p> <em>How to stay current with technology progress.</em> </p> <p> A long time ago, I landed my dream job. My new employer was a consulting company, and my role was to be the resident <a href="https://en.wikipedia.org/wiki/Microsoft_Azure">Azure</a> expert. Cloud computing was still in its infancy, and there was a good chance that I might be able to establish myself as a leading regional authority on the topic. </p> <p> As part of the role, I was supposed to write articles and give presentations showing how to solve various problems with Azure. I dug in with fervour, writing sample code bases and even <a href="http://msdn.microsoft.com/en-us/magazine/gg983487.aspx">an MSDN Magazine article</a>. To my surprise, after half a year I realized that I was bored. </p> <p> At that time I'd already spent more than a decade learning new technology, and I knew that I was good at it. For instance, I worked five years for Microsoft Consulting Services, and a dirty little secret of that kind of role is that, although you're sold as an expert in some new technology, you're often only a few weeks ahead of your customer. For example, I was once engaged as a <a href="https://en.wikipedia.org/wiki/Windows_Workflow_Foundation">Windows Workflow Foundation</a> expert at a time when it was still in beta. No-one had years of experience with that technology, but I was still expected to know much more about it than my customer. </p> <p> I had lots of engagements like that, and they usually went well. I've always been good at cramming, and as a consultant you're also unencumbered by all the daily responsibilities and politics that often occupy the time and energy of regular employees. 
The point being that while I'm decent at learning new stuff, the role of being a consultant also facilitates that sort of activity. </p> <p> After more than a decade of learning new frameworks, new software libraries, new programming languages, new tools, new online services, it turned out that I was ready for something else. After spending a few months learning Azure, I realized that I'd lost interest in that kind of learning. When investigating a new Azure SDK, I'd quickly come to the conclusion that, <em>oh, this is just another object-oriented library</em>. There are these objects, and you call this method to do that, etc. That's not to say that learning a specific technology is a trivial undertaking. The worse the design, the more difficult it is to learn. </p> <p> Still, after years of learning new technologies, I'd started recognizing certain patterns. Perhaps, I thought, well-designed technologies are based on some fundamental ideas that may be worth learning instead. </p> <h3 id="ac37913a2b8248e6b51d4506c2da0481"> Staying current <a href="#ac37913a2b8248e6b51d4506c2da0481">#</a> </h3> <p> A common lament among software developers is that the pace of technology is so overwhelming that they can't keep up. This is true. You can't keep up. </p> <p> There will always be something that you don't know. In fact, most things you don't know. This isn't a condition isolated only to technology. The sum total of all human knowledge is so vast that you can't know it all. What you will learn, even after a lifetime of diligent study, will be a nanoscopic fraction of all human knowledge - even of everything related to software development. You can't stay current. Get used to it. </p> <p> A more appropriate question is: <em>How do I keep my skill set relevant?</em> </p> <p> Assuming that you wish to stay employable in some capacity, it's natural to be concerned with how your mad <a href="https://en.wikipedia.org/wiki/Adobe_Flash">Flash</a> skillz will land you the next gig. 
</p> <p> Trying to keep abreast of all new technologies in your field is likely to lead to burnout. Rather, put yourself in a position so that you can quickly learn necessary skills, just in time. </p> <h3 id="c529c0131b284fe1bca42bec0663fc8e"> Study fundamentals, rather than specifics <a href="#c529c0131b284fe1bca42bec0663fc8e">#</a> </h3> <p> Those many years ago, I realized that it'd be a better investment of my time to study fundamentals. Often, once you have some foundational knowledge, you can apply it in many circumstances. Your general knowledge will enable you to get quickly up to speed with specific technologies. </p> <p> Success isn't guaranteed, but knowing fundamentals increases your chances. </p> <p> This may still seem too abstract. Which fundamentals should you learn? </p> <p> In the remainder of this article, I'll give you some examples. The following collection of general programmer knowledge spans software engineering, computer science, broad ideas, but also specific tools. I only intend this set of examples to serve as inspiration. The list isn't complete, nor does it constitute a minimum of what you should learn. </p> <p> If you have other interests, you may put together your own research programme. What follows here are just some examples of fundamentals that I've found useful during my career. </p> <p> A criterion, however, for constituting foundational knowledge is that you should be able to apply that knowledge in a wide variety of contexts. The fundamental should not be tied to a particular programming language, platform, or operating system. </p> <h3 id="4f474189809f4d53b447b4005cef1bfd"> Design patterns <a href="#4f474189809f4d53b447b4005cef1bfd">#</a> </h3> <p> Perhaps the first foundational notion that I personally encountered was that of <em>design patterns</em>. 
As the Gang of Four (GoF) wrote in <a href="https://en.wikipedia.org/wiki/Design_Patterns">the book</a>, a design pattern is an abstract description of a solution that has been observed 'in the wild', more than once, independently evolved. </p> <p> Please pay attention to the causality. A design pattern isn't prescriptive, but descriptive. It's an observation that a particular code organisation tends to solve a particular problem. </p> <p> There are lots of misconceptions related to design patterns. One of them is that the 'library of patterns' is finite, and more or less constrained to the patterns included in the original book. </p> <p> There are, however, many more patterns. To illustrate how much wider this area is, here's a list of some patterns books in my personal library: </p> <ul> <li><a href="/ref/dp">Design Patterns</a></li> <li><a href="/ref/plopd3">Pattern Languages of Program Design 3</a></li> <li><a href="/ref/peaa">Patterns of Enterprise Application Architecture</a></li> <li><a href="/ref/eip">Enterprise Integration Patterns</a></li> <li><a href="/ref/xunit-patterns">xUnit Test Patterns</a></li> <li><a href="/ref/service-design-patterns">Service Design Patterns</a></li> <li><a href="/ref/implementation-patterns">Implementation Patterns</a></li> <li><a href="/ref/rest-cookbook">RESTful Web Services Cookbook</a></li> <li><a href="/ref/antipatterns">AntiPatterns</a></li> </ul> <p> In addition to these, there are many more books in my library that are patterns-adjacent, including <a href="/dippp">one of my own</a>. The point is that software design patterns is a vast topic, and it pays to know at least the most important ones. </p> <p> A design pattern fits the criterion that you can apply the knowledge independently of technology. The original GoF book has examples in <a href="https://en.wikipedia.org/wiki/C%2B%2B">C++</a> and <a href="https://en.wikipedia.org/wiki/Smalltalk">Smalltalk</a>, but I've found that they apply well to C#. 
Other people employ them in their <a href="https://www.java.com/">Java</a> code. </p> <p> Knowing design patterns not only helps you design solutions. That knowledge also enables you to recognize patterns in existing libraries and frameworks. It's this fundamental knowledge that makes it easier to learn new technologies. </p> <p> Often (although not always) successful software libraries and frameworks tend to follow known patterns, so if you're aware of these patterns, it becomes easier to learn such technologies. Again, be aware of the causality involved. I'm not claiming that successful libraries are explicitly designed according to published design patterns. Rather, some libraries become successful because they offer good solutions to certain problems. It's not surprising if such a good solution falls into a pattern that other people have already observed and recorded. It's like <a href="https://en.wikipedia.org/wiki/Parallel_evolution">parallel evolution</a>. </p> <p> This was my experience when I started to learn the details of Azure. Many of those SDKs and APIs manifested various design patterns, and once I'd recognized a pattern it became much easier to learn the rest. </p> <p> The idea of design patterns, particularly object-oriented design patterns, has its detractors, too. Let's visit that as the next set of fundamental ideas. </p> <h3 id="c44e7624ea3e4cef9485522146d17a6d"> Functional programming abstractions <a href="#c44e7624ea3e4cef9485522146d17a6d">#</a> </h3> <p> As I'm writing this, yet another Twitter thread pokes fun at object-oriented design (OOD) patterns as being nothing but a published collection of workarounds for the shortcomings of object orientation. The people who most zealously pursue that agenda tend to be functional programmers. 
</p> <p> Well, I certainly like functional programming (FP) better than OOD too, but rather than poking fun at OOD, I'm more interested in <a href="/2018/03/05/some-design-patterns-as-universal-abstractions">how design patterns relate to universal abstractions</a>. I also believe that FP has shortcomings of its own, but I'll have more to say about that in a future article. </p> <p> Should you learn about <a href="/2017/10/06/monoids">monoids</a>, <a href="/2018/03/22/functors">functors</a>, <a href="/2022/03/28/monads">monads</a>, <a href="/2019/04/29/catamorphisms">catamorphisms</a>, and so on? </p> <p> Yes you should, because these ideas also fit the criterion that the knowledge is technology-independent. I've used my knowledge of these topics in <a href="https://www.haskell.org/">Haskell</a> (hardly surprising) and <a href="https://fsharp.org/">F#</a>, but also in C# and <a href="https://www.python.org/">Python</a>. The various <a href="https://en.wikipedia.org/wiki/Language_Integrated_Query">LINQ</a> methods are really just well-known APIs associated with, you guessed it, functors, monads, monoids, and catamorphisms. </p> <p> Once you've learned these fundamental ideas, it becomes easier to learn new technologies. This has happened to me multiple times, for example in contexts as diverse as property-based testing and asynchronous message-passing architectures. Once I realize that an API gives rise to a monad, say, I know that certain functions must be available. I also know how I should best compose larger code blocks from smaller ones. </p> <p> Must you know all of these concepts before learning, say, F#? No, not at all. Rather, a language like F# is a great vehicle for learning such fundamentals. There's a first time for learning anything, and you need to start somewhere. Rather, the point is that once you know these concepts, it becomes easier to learn the next thing. 
</p> <p> If, for example, you already know what a monad is when learning F#, picking up the idea behind <a href="https://learn.microsoft.com/dotnet/fsharp/language-reference/computation-expressions">computation expressions</a> is easy once you realize that it's just a compiler-specific way to enable syntactic sugaring of monadic expressions. You can learn how computation expressions work without that knowledge, too; it's just harder. </p> <p> This is a recurring theme with many of these examples. You can learn a particular technology without knowing the fundamentals, but you'll have to put in more time to do that. </p> <p> On to the next example. </p> <h3 id="02c8fb3f6fe74fb9bffda719122c60a9"> SQL <a href="#02c8fb3f6fe74fb9bffda719122c60a9">#</a> </h3> <p> Which <a href="https://en.wikipedia.org/wiki/Object%E2%80%93relational_mapping">object-relational mapper</a> (ORM) should you learn? <a href="https://hibernate.org/orm/">Hibernate</a>? <a href="https://learn.microsoft.com/ef/">Entity Framework</a>? </p> <p> How about learning <a href="https://en.wikipedia.org/wiki/SQL">SQL</a>? I learned SQL in 1999, I believe, and it's served me well ever since. I <a href="/2023/09/18/do-orms-reduce-the-need-for-mapping">consider raw SQL to be more productive than using an ORM</a>. Once more, SQL is largely technology-independent. While each database typically has its own SQL dialect, the fundamentals are the same. I'm most well-versed in the <a href="https://en.wikipedia.org/wiki/Microsoft_SQL_Server">SQL Server</a> dialect, but I've also used my SQL knowledge to interact with <a href="https://www.oracle.com/database/">Oracle</a> and <a href="https://www.postgresql.org/">PostgreSQL</a>. Once you know one SQL dialect, you can quickly solve data problems in one of the other dialects. </p> <p> It doesn't matter much whether you're interacting with a database from .NET, Haskell, Python, <a href="https://www.ruby-lang.org/">Ruby</a>, or another language. 
SQL is not only universal, the core of the language is stable. What I learned in 1999 is still useful today. Can you say the same about your current ORM? </p> <p> Most programmers prefer learning the newest, most cutting-edge technology, but that's a risky gamble. Once upon a time <a href="https://en.wikipedia.org/wiki/Microsoft_Silverlight">Silverlight</a> was a cutting-edge technology, and more than one of my contemporaries went all-in on it. </p> <p> Conversely, most programmers find old stuff boring. It turns out, though, that it may be worthwhile learning some old technologies like SQL. Be aware of the <a href="https://en.wikipedia.org/wiki/Lindy_effect">Lindy effect</a>. If it's been around for a long time, it's likely to still be around for a long time. This is true for the next example as well. </p> <h3 id="e4a7c033c0964420a0abbf83a0bbb773"> HTTP <a href="#e4a7c033c0964420a0abbf83a0bbb773">#</a> </h3> <p> The <a href="https://en.wikipedia.org/wiki/HTTP">HTTP protocol</a> has been around since 1991. It's an effectively text-based protocol, and you can easily engage with a web server on a near-protocol level. This is true for other older protocols as well. </p> <p> In my first IT job in the late 1990s, one of my tasks was to set up and maintain <a href="https://en.wikipedia.org/wiki/Microsoft_Exchange_Server">Exchange Servers</a>. It was also my responsibility to make sure that email could flow not only within the organization, but that we could exchange email with the rest of the internet. In order to test my mail servers, I would often just <a href="https://en.wikipedia.org/wiki/Telnet">telnet</a> into them on port 25 and type in the correct, <a href="https://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol">text-based instructions to send a test email</a>. </p> <p> Granted, it's not that easy to telnet into a modern web server on port 80, but a ubiquitous tool like <a href="https://curl.se/">curl</a> accomplishes the same goal. 
I recently wrote how <a href="/2024/05/13/gratification">knowing curl is better</a> than knowing <a href="https://www.postman.com/">Postman</a>. While this wasn't meant as an attack on Postman specifically, neither was it meant as a facile claim that curl is the only tool useful for ad-hoc interaction with HTTP-based APIs. Sometimes you only realize an underlying truth when you write about a thing and then <a href="/2024/05/13/gratification#9efea1cadb8c4e388bfba1a2064dd59a">other people find fault with your argument</a>. The underlying truth, I think, is that it pays to understand HTTP and to be able to engage with an HTTP-based web service at that level of abstraction. </p> <p> Preferably in an automatable way. </p> <h3 id="e3a250b707b243dabc6609134e864aee"> Shells and scripting <a href="#e3a250b707b243dabc6609134e864aee">#</a> </h3> <p> The reason I favour curl over other tools to interact with HTTP is that I already spend quite a bit of time at the command line. I typically have a little handful of terminal windows open on my laptop. If I need to test an HTTP server, curl is already available. </p> <p> Many years ago, an employer introduced me to <a href="https://git-scm.com/">Git</a>. Back then, there were no good graphical tools to interact with Git, so I had to learn to use it from the command line. I'm eternally grateful that it turned out that way. I still use Git from the command line. </p> <p> When you install Git, by default you also install Git Bash. Since I was already using that shell to interact with Git, it began to dawn on me that it's a full-fledged shell, and that I could do all sorts of other things with it. It also struck me that learning <a href="https://www.gnu.org/software/bash/">Bash</a> would be a better investment of my time than learning <a href="https://learn.microsoft.com/powershell/">PowerShell</a>. 
At the time, there was no indication that PowerShell would ever be relevant outside of Windows, while Bash was already available on most systems. Even today, knowing Bash strikes me as more useful than knowing PowerShell. </p> <p> It's not that I do much Bash-scripting, but I could. Since I'm a programmer, if I need to automate something, I naturally reach for something more robust than shell scripting. Still, it gives me confidence to know that, since I already know Bash, Git, curl, etc., I <em>could</em> automate some tasks if I needed to. </p> <p> Many a reader will probably complain that the Git CLI has horrible <a href="/2024/05/13/gratification">developer experience</a>, but I will, again, postulate that it's not that bad. It helps if you understand some fundamentals. </p> <h3 id="a511cfd8d9bf4bbda433dbf70184284a"> Algorithms and data structures <a href="#a511cfd8d9bf4bbda433dbf70184284a">#</a> </h3> <p> Git really isn't that difficult to understand once you realize that a Git repository is just a <a href="https://en.wikipedia.org/wiki/Directed_acyclic_graph">directed acyclic graph</a> (DAG), and that branches are just labels that point to nodes in the graph. There are basic data structures that it's just useful to know. DAGs, <a href="https://en.wikipedia.org/wiki/Tree_(graph_theory)">trees</a>, <a href="https://en.wikipedia.org/wiki/Graph_(discrete_mathematics)">graphs</a> in general, <a href="https://en.wikipedia.org/wiki/Adjacency_list">adjacency lists</a> or <a href="https://en.wikipedia.org/wiki/Adjacency_matrix">adjacency matrices</a>. </p> <p> Knowing that such data structures exist is, however, not that useful if you don't know what you can <em>do</em> with them. If you have a graph, you can find a <a href="https://en.wikipedia.org/wiki/Minimum_spanning_tree">minimum spanning tree</a> or a <a href="https://en.wikipedia.org/wiki/Shortest-path_tree">shortest-path tree</a>, which sometimes turn out to be useful. 
Adjacency lists or matrices give you ways to represent graphs in code, which is why they are useful. </p> <p> Contrary to certain infamous interview practices, you don't need to know these algorithms by heart. It's usually enough to know that they exist. I can't remember <a href="https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm">Dijkstra's algorithm</a> off the top of my head, but if I encounter a problem where I need to find the shortest path, I can look it up. </p> <p> Or, if presented with the problem of constructing current state from an Event Store, you may realize that it's just a left <a href="https://en.wikipedia.org/wiki/Fold_(higher-order_function)">fold</a> over a <a href="https://en.wikipedia.org/wiki/Linked_list">linked list</a>. (This isn't my own realization; I first heard it from <a href="https://gotocon.com/cph-2011/presentation/Behavior!">Greg Young in 2011</a>.) </p> <p> Now we're back at one of the first examples, that of FP knowledge. A <a href="/2019/05/27/list-catamorphism">list fold is its catamorphism</a>. Again, these things are much easier to learn if you already know some fundamentals. </p> <h3 id="f109425f27014cd5bd395a74e9575355"> What to learn <a href="#f109425f27014cd5bd395a74e9575355">#</a> </h3> <p> These examples may seem overwhelming. Do you really need to know all of that before things become easier? </p> <p> No, that's not the point. I didn't start out knowing all these things, and some of them, I'm still not very good at. The point is rather that if you're wondering how to invest your limited time so that you can remain up to date, consider pursuing general-purpose knowledge rather than learning a specific technology. </p> <p> Of course, if your employer asks you to use a particular library or programming language, you need to study <em>that</em>, if you're not already good at it. If, on the other hand, you decide to better yourself, you can choose what to learn next. 
</p> <p> Ultimately, if you're learning for your own sake, the most important criterion may be: Choose something that interests you. If no-one forces you to study, it's too easy to give up if you lose interest. </p> <p> If, however, you have the choice between learning <a href="https://mjvl.github.io/Noun.js/">Noun.js</a> or design patterns, may I suggest the latter? </p> <h3 id="94c3f380b556403d82dd9f3cd0c1d1e9"> For life <a href="#94c3f380b556403d82dd9f3cd0c1d1e9">#</a> </h3> <p> When are you done, you ask? </p> <p> Never. There's more stuff than you can learn in a lifetime. I've met a lot of programmers who finally give up on the grind to keep up, and instead become managers. </p> <p> As if there's nothing to learn when you're a manager. I'm fortunate that, before <a href="/2011/11/08/Independency">I went solo</a>, I mainly had good managers. I'm under no illusion that they automatically became good managers. All I've heard said about management is that there's a lot to learn in that field, too. Really, it'd be surprising if that wasn't the case. </p> <p> I can understand, however, how just learning the next library, the next framework, the next tool becomes tiring. As I've already outlined, I hit that wall more than a decade ago. </p> <p> On the other hand, there are so many wonderful fundamentals that you can learn. You can do self-study, or you can enrol in a more formal programme if you have the opportunity. I'm currently following a course on compiler design. 
It's not that I expect to pivot to writing compilers for the rest of my career, but rather, </p> <blockquote> <ol type="a"> <li>"It is considered a topic that you should know in order to be "well-cultured" in computer science.</li> <li>"A good craftsman should know his tools, and compilers are important tools for programmers and computer scientists.</li> <li>"The techniques used for constructing a compiler are useful for other purposes as well.</li> <li>"There is a good chance that a programmer or computer scientist will need to write a compiler or interpreter for a domain-specific language."</li> </ol> <footer><cite><a href="/ref/introduction-to-compiler-design">Introduction to Compiler Design</a></cite> (from the introduction), Torben Ægidius Mogensen</footer> </blockquote> <p> That's good enough for me, and so far, I'm enjoying the course (although it's also hard work). </p> <p> You may not find this particular topic interesting, but then hopefully you can find something else that you fancy. 3D rendering? Machine learning? Distributed systems architecture? </p> <h3 id="7519e9b6147d49379f545c69871c381a"> Conclusion <a href="#7519e9b6147d49379f545c69871c381a">#</a> </h3> <p> Technology moves at a pace with which it's impossible to keep up. It's not just you who's falling behind. Everyone is. Even the best-paid <a href="https://en.wikipedia.org/wiki/Big_Tech">GAMMA</a> programmer knows next to nothing of all there is to know in the field. They may have superior skills in certain areas, but there will be so much other stuff that they don't know. </p> <p> You may think of me as a <a href="https://x.com/hillelogram/status/1445435617047990273">thought leader</a> if you will. If nothing else, I tend to be a prolific writer. Perhaps you even think I'm a good programmer. I should hope so. Who fancies themselves bad at something? 
</p> <p> You should, however, have seen me struggle with <a href="https://en.wikipedia.org/wiki/C_(programming_language)">C</a> programming during a course on computer systems programming. There's a thing I'm happy if I never have to revisit. </p> <p> You can't know it all. You can't keep up. But you can focus on learning the fundamentals. That tends to make it easier to learn specific technologies that build on those foundations. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Gratification https://blog.ploeh.dk/2024/05/13/gratification 2024-05-13T06:27:00+00:00 Mark Seemann <div id="post"> <p> <em>Some thoughts on developer experience.</em> </p> <p> Years ago, I was introduced to a concept called <em>developer ergonomics</em>. Despite the name, it's not about good chairs, standing desks, or multiple monitors. Rather, the concept was related to how easy it'd be for a developer to achieve a certain outcome. How easy is it to set up a new code base in a particular language? How much work is required to save a row in a database? How hard is it to read rows from a database and display the data on a web page? And so on. </p> <p> These days, we tend to discuss <em>developer experience</em> rather than ergonomics, and that's probably a good thing. This term more immediately conveys what it's about. </p> <p> I've recently had some discussions about developer experience (DevEx, DX) with one of my customers, and this has led me to reflect more explicitly on this topic than previously. Most of what I'm going to write here are opinions and beliefs that go back a long time, but apparently, it's only recently that these notions have congealed in my mind under the category name <em>developer experience</em>. </p> <p> This article may look like your usual old-man-yells-at-cloud article, but I hope that I can avoid that. 
It's not the case that I yearn for some lost past where 'we' wrote <a href="https://en.wikipedia.org/wiki/Plankalk%C3%BCl">Plankalkül</a> in <a href="https://en.wikipedia.org/wiki/Edlin">Edlin</a>. That, in fact, sounds like a horrible developer experience. </p> <p> The point, rather, is that most attractive things come with consequences. For anyone who has been reading this blog even once in a while, this should come as no surprise. </p> <h3 id="cbc9752f754e40cc94267689f5dd87bf"> Instant gratification <a href="#cbc9752f754e40cc94267689f5dd87bf">#</a> </h3> <p> Fat foods, cakes, and wine can be wonderful, but can be detrimental to your health if you overindulge. It can, however, be hard to resist a piece of chocolate, and even if we think that we shouldn't, we often fail to restrain ourselves. The temptation of instant gratification is simply too great. </p> <p> There are other examples like this. The most obvious are the use of narcotics, lack of exercise, smoking, and dropping out of school. It may feel good in the moment, but can have long-term consequences. </p> <p> Small children are notoriously bad at delaying gratification, and we often associate the ability to delay gratification with maturity. We all, however, give in from time to time. Food and wine are my weak spots, while I don't do drugs, and I didn't drop out of school. </p> <p> It strikes me that we often talk about ideas related to developer experience in a way where we treat developers as children. To be fair, many developers also act like children. I don't know how many times I've heard something like, <em>"I don't want to write tests/go through a code review/refactor! I just want to ship working code now!"</em> </p> <p> Fine, so do I. </p> <p> Even if wine is bad for me, it makes life worth living. 
As the saying goes, even if you don't smoke, don't drink, exercise rigorously, eat healthily, don't do drugs, and don't engage in dangerous activities, you're not guaranteed to live until ninety, but you're guaranteed that it's going to <em>feel</em> that long. </p> <p> Likewise, I'm aware that doing everything right can sometimes take so long that by the time we've deployed the software, it's too late. The point isn't to always or never do certain things, but rather to be aware of the consequences of our choices. </p> <h3 id="ac2969093f264da092186fa0cb7196e5"> Developer experience <a href="#ac2969093f264da092186fa0cb7196e5">#</a> </h3> <p> I've no problem with aiming to make the experience of writing software as good as possible. Some developer-experience thought leaders talk about the importance of documentation, predictability, and timeliness. Neither do I mind that a development environment looks good, completes my words, or helps me refactor. </p> <p> To return to the analogy of human vices, not everything that feels good is ultimately bad for you. While I do like wine and chocolate, I also love <a href="https://en.wikipedia.org/wiki/Sushi">sushi</a>, white <a href="https://en.wikipedia.org/wiki/Asparagus">asparagus</a>, <a href="https://en.wikipedia.org/wiki/Turbot">turbot</a>, <a href="https://en.wikipedia.org/wiki/Chanterelle">chanterelles</a>, <a href="https://en.wikipedia.org/wiki/Cyclopterus">lumpfish</a> roe <a href="https://en.wikipedia.org/wiki/Caviar">caviar</a>, <a href="https://en.wikipedia.org/wiki/Morchella">true morels</a>, <a href="https://en.wikipedia.org/wiki/Nephrops_norvegicus">Norway lobster</a>, and various other foods that tend to be categorized as healthy. </p> <p> A good <a href="https://en.wikipedia.org/wiki/Integrated_development_environment">IDE</a> with refactoring support, statement completion, type information, test runner, etc. 
is certainly preferable to writing all code in <a href="https://en.wikipedia.org/wiki/Windows_Notepad">Notepad</a>. </p> <p> That said, there's a certain kind of developer tooling and language features that strikes me as more akin to candy. These are typically tools and technologies that tend to demo well. Recent examples include <a href="https://www.openapis.org/">OpenAPI</a>, <a href="https://github.com/features/copilot">GitHub Copilot</a>, <a href="https://learn.microsoft.com/dotnet/csharp/fundamentals/program-structure/top-level-statements">C# top-level statements</a>, code generation, and <a href="https://www.postman.com/">Postman</a>. Not all of these are unequivocally bad, but they strike me as mostly aiming at immature developers. </p> <p> The point of this article isn't to single out these particular products, standards, or language features, but on the other hand, in order to make a point, I do have to at least outline why I find them problematic. They're just examples, and I hope that by explaining what is on my mind, you can see the pattern and apply it elsewhere. </p> <h3 id="f7f676bf5a334b189b3c2baab18b1e6a"> OpenAPI <a href="#f7f676bf5a334b189b3c2baab18b1e6a">#</a> </h3> <p> A standard like OpenAPI, for example, looks attractive because it automates or standardizes much work related to developing and maintaining <a href="https://en.wikipedia.org/wiki/REST">REST APIs</a>. Frameworks and tools that leverage that standard automatically create machine-readable <a href="/2024/04/15/services-share-schema-and-contract-not-class">schema and contract</a>, which can be used to generate client code. Furthermore, an OpenAPI-aware framework can also autogenerate an entire web-based graphical user interface, which developers can use for ad-hoc testing. </p> <p> I've worked with clients who also published these OpenAPI user interfaces to their customers, so that it was easy to get started with the APIs. Easy onboarding. </p> <p> Instant gratification. 
</p> <p> What's the problem with this? There are clearly enough apparent benefits that I usually have a hard time talking my clients out of pursuing this strategy. What are the disadvantages? Essentially, OpenAPI locks you into <a href="https://martinfowler.com/articles/richardsonMaturityModel.html">level 2</a> APIs. No hypermedia controls, no <a href="/2015/06/22/rest-implies-content-negotiation">smooth conneg-based versioning</a>, no <a href="https://en.wikipedia.org/wiki/HATEOAS">HATEOAS</a>. In fact, most of what makes REST flexible is lost. What remains is an ad-hoc, informally-specified, bug-ridden, slow implementation of half of <a href="https://en.wikipedia.org/wiki/SOAP">SOAP</a>. </p> <p> I've <a href="/2022/12/05/github-copilot-preliminary-experience-report">previously described my misgivings about Copilot</a>, and while I actually still use it, I don't want to repeat all of that here. Let's move on to another example. </p> <h3 id="f56e835825464650a86c557c7253f095"> Top-level statements <a href="#f56e835825464650a86c557c7253f095">#</a> </h3> <p> Among many other language features, C# 9 got <em>top-level-statements</em>. This means that you don't need to write a <code>Main</code> method in a static class. Rather, you can have a single C# code file where you can immediately start executing code. </p> <p> It's not that I consider this language feature particularly harmful, but it also solves what seems to me a non-problem. It demos well, though. If I understand the motivation right, the feature exists because 'modern' developers are used to languages like <a href="https://www.python.org/">Python</a> where you can, indeed, just create a <code>.py</code> file and start adding code statements. </p> <p> In an attempt to make C# more attractive to such an audience, it, too, got that kind of developer experience enabled. 
</p> <p> You may argue that this is a bid to remove some of the ceremony from the language, but I'm not convinced that this moves that needle much. The <a href="/2019/12/16/zone-of-ceremony">level of ceremony that a language like C# has is much deeper than that</a>. That's not to target C# in particular. <a href="https://www.java.com/">Java</a> is similar, and don't even get me started on <a href="https://en.wikipedia.org/wiki/C_(programming_language)">C</a> or <a href="https://en.wikipedia.org/wiki/C%2B%2B">C++</a>! Did anyone say <em>header files?</em> </p> <p> Do 'modern' developers choose Python over C# because they can't be arsed to write a <code>Main</code> method? If that's the <em>only</em> reason, it strikes me as incredibly immature. <em>I want instant gratification, and writing a <code>Main</code> method is just too much trouble!</em> </p> <p> If developers do, indeed, choose Python or JavaScript over C# and Java, I hope and believe that it's for other reasons. </p> <p> This particular C# feature doesn't bother me, but I find it symptomatic of a kind of 'innovation' where language designers target instant gratification. </p> <h3 id="b9ce02aa90074838bd7b8e2cec0189e2"> Postman <a href="#b9ce02aa90074838bd7b8e2cec0189e2">#</a> </h3> <p> Let's consider one more example. You may think that I'm now attacking a company that, for all I know, makes a decent product. I don't really care about that, though. What I do care about is the developer mentality that makes a particular tool so ubiquitous. </p> <p> I've met web service developers who would be unable to interact with the HTTP APIs that they are themselves developing if they didn't have Postman. Likewise, there are innumerable questions on <a href="https://stackoverflow.com/">Stack Overflow</a> where people ask questions about HTTP APIs and post screen shots of Postman sessions. </p> <p> It's okay if you don't know how to interact with an HTTP API. 
After all, there's a first time for everything, and there was a time when I didn't know how to do this either. Apparently, however, it's easier to install an application with a graphical user interface than it is to use <a href="https://curl.se/">curl</a>. </p> <p> Do yourself a favour and learn curl instead of using Postman. Curl is a command-line tool, which means that you can use it for both ad-hoc experimentation and automation. It takes five to ten minutes to learn the basics. It's also free. </p> <p> It still seems to me that many people are of a mind that it's easier to use Postman than to learn curl. Ultimately, I'd wager that for any task you do with some regularity, it's more productive to learn the text-based tool than the point-and-click tool. In a situation like this, I'd suggest that delayed gratification beats instant gratification. </p> <h3 id="50ed56effb784c95a6f6de4967e883ef"> CV-driven development <a href="#50ed56effb784c95a6f6de4967e883ef">#</a> </h3> <p> It is, perhaps, easy to get the wrong impression from the above examples. I'm not pointing fingers at just any 'cool' new technology. There are techniques, languages, frameworks, and so on, which people pick up because they're exciting for other reasons. Often, such technologies solve real problems in their niches, but are then applied for the sole reason that people want to get on the bandwagon. Examples include <a href="https://kubernetes.io/">Kubernetes</a>, mocks, <a href="/2012/11/06/WhentouseaDIContainer">DI Containers</a>, <a href="/2023/12/04/serialization-with-and-without-reflection">reflection</a>, <a href="https://en.wikipedia.org/wiki/Aspect-oriented_programming">AOP</a>, and <a href="https://en.wikipedia.org/wiki/Microservices">microservices</a>. All of these have legitimate applications, but we also hear about many examples where people use them just to use them. </p> <p> That's a different problem from the one I'm discussing in this article. 
Usually, learning about such advanced techniques requires delaying gratification. There's nothing wrong with learning new skills, but part of that process is also gaining the understanding of when to apply the skill, and when not to. That's a different discussion. </p> <h3 id="10cc039f39ba4c2caab34f66f17f90b2"> Innovation is fine <a href="#10cc039f39ba4c2caab34f66f17f90b2">#</a> </h3> <p> The point of this article isn't that every innovation is bad. Contrary to <a href="https://www.charlespetzold.com/">Charles Petzold</a>, I don't really believe that Visual Studio rots the mind, although I once did publish <a href="/2013/02/04/BewareofProductivityTools">an article</a> that navigated the same waters. </p> <p> Despite my misgivings, I haven't uninstalled GitHub Copilot, and I do enjoy many of the features in both Visual Studio (VS) and Visual Studio Code (VS Code). I also welcome and use many new language features in various languages. </p> <p> I can certainly appreciate how an IDE makes many things easier. Every time I have to begin a new <a href="https://www.haskell.org/">Haskell</a> code base, I long for the hand-holding offered by Visual Studio when creating a new C# project. </p> <p> And although I don't use the debugger much, the built-in debuggers in VS and VS Code sure beat <a href="https://en.wikipedia.org/wiki/GNU_Debugger">GDB</a>. It even works in Python! </p> <p> There's even tooling that <a href="https://developercommunity.visualstudio.com/t/Test-Explorer:-Better-support-for-TDD-wo/701822">I wish for</a>, but apparently never will get. </p> <h3 id="675450a5c0cf441fa433b928251de8a5"> Simple made easy <a href="#675450a5c0cf441fa433b928251de8a5">#</a> </h3> <p> In <a href="https://www.infoq.com/presentations/Simple-Made-Easy/">Simple Made Easy</a> Rich Hickey follows his usual look-up-a-word-in-the-dictionary-and-build-a-talk-around-the-definition style to contrast <em>simple</em> with <em>easy</em>. I find his distinction useful. 
A tool or technique that's <em>close at hand</em> is <em>easy</em>. This certainly includes many of the above instant-gratification examples. </p> <p> An <em>easy</em> technique is not, however, necessarily <em>simple</em>. It may or may not be. <a href="https://en.wikipedia.org/wiki/Rich_Hickey">Rich Hickey</a> defines <em>simple</em> as the opposite of <em>complex</em>. Something that is complex is assembled from parts, whereas a simple thing is, ideally, single and indivisible. In practice, truly simple ideas and tools may not be available, and instead we may have to settle for things that are less complex than their alternatives. </p> <p> Once you start looking for things that make simple things easy, you see them in many places. A big category that I personally favour contains all the language features and tools that make functional programming (FP) easier. FP tends to be simpler than object-oriented or procedural programming, because it <a href="/2018/11/19/functional-architecture-a-definition">explicitly distinguishes between and separates</a> predictable code from unpredictable code. This does, however, in itself tend to make some programming tasks harder. How do you generate a random number? Look up the system time? Write a record to a database? </p> <p> Several FP languages have special features that make even those difficult tasks easy. <a href="https://fsharp.org/">F#</a> has <a href="https://learn.microsoft.com/dotnet/fsharp/language-reference/computation-expressions">computation expressions</a> and <a href="https://www.haskell.org/">Haskell</a> has <a href="https://en.wikibooks.org/wiki/Haskell/do_notation">do notation</a>. </p> <p> Let's say you want to call a function that consumes a random number generator. In Haskell (as in .NET) random number generators are actually deterministic, as long as you give them the same seed. 
Generating a random seed, on the other hand, is non-deterministic, so has to happen in <a href="/2020/06/08/the-io-container">IO</a>. </p> <p> Without <code>do</code> notation, you could write the action like this: </p> <p> <pre><span style="color:#2b91af;">rndSelect</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">Integral</span>&nbsp;i&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;[a]&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;i&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;[a] rndSelect&nbsp;xs&nbsp;count&nbsp;=&nbsp;(\rnd&nbsp;-&gt;&nbsp;rndGenSelect&nbsp;rnd&nbsp;xs&nbsp;count)&nbsp;&lt;$&gt;&nbsp;newStdGen</pre> </p> <p> (The type annotation is optional.) While terse, this is hardly readable, and the developer experience also leaves something to be desired. Fortunately, however, you can <a href="/2018/07/09/typing-and-testing-problem-23">rewrite this action</a> with <code>do</code> notation, like this: </p> <p> <pre><span style="color:#2b91af;">rndSelect</span>&nbsp;::&nbsp;<span style="color:blue;">Integral</span>&nbsp;i&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;[a]&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;i&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;[a] rndSelect&nbsp;xs&nbsp;count&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;rnd&nbsp;&lt;-&nbsp;newStdGen &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;rndGenSelect&nbsp;rnd&nbsp;xs&nbsp;count </pre> </p> <p> Now we can clearly see that the action first creates the <code>rnd</code> random number generator and then passes it to <code>rndGenSelect</code>. That's what happened before, but it was buried in a lambda expression and Haskell's right-to-left causality. Most people would find the first version (without <code>do</code> notation) less readable, and more difficult to write. 
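</p> <p> To make the relationship between the two styles concrete, here's a minimal sketch of what a <code>do</code> block is syntactic sugar for. It's not the code from the linked article: so that it only depends on <code>base</code>, the hypothetical <code>nextSeed</code> action (a plain counter in an <code>IORef</code>) stands in for <code>newStdGen</code>, and the hypothetical pure function <code>pickWith</code> stands in for <code>rndGenSelect</code>. All names in the sketch are assumptions made for illustration. </p> <p> <pre>
```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)

-- Hypothetical pure function (stand-in for rndGenSelect):
-- fully deterministic once the seed is known.
pickWith :: Int -> [a] -> Maybe a
pickWith _ [] = Nothing
pickWith seed xs = Just (xs !! (seed `mod` length xs))

-- Hypothetical impure action (stand-in for newStdGen): returns the
-- current counter value and increments it. Not random at all; it only
-- illustrates that producing a seed has to happen in IO.
nextSeed :: IORef Int -> IO Int
nextSeed ref = atomicModifyIORef' ref (\n -> (n + 1, n))

-- With do notation: first get the seed, then call the pure function.
pickDo :: IORef Int -> [a] -> IO (Maybe a)
pickDo ref xs = do
  seed <- nextSeed ref
  return $ pickWith seed xs

-- The same action with an explicit bind, which is roughly what the
-- do block above desugars to.
pickBind :: IORef Int -> [a] -> IO (Maybe a)
pickBind ref xs = nextSeed ref >>= \seed -> return (pickWith seed xs)

-- Small demonstration: two consecutive seeds select two consecutive elements.
demo :: IO (Maybe Char, Maybe Char)
demo = do
  ref <- newIORef 0
  a <- pickDo ref "abc"
  b <- pickBind ref "abc"
  return (a, b)
```
</pre> </p> <p> Both <code>pickDo</code> and <code>pickBind</code> describe the same action; the <code>do</code> version only changes how the steps read. In both cases the unpredictable part (<code>nextSeed</code>) stays separate from the predictable part (<code>pickWith</code>). 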
</p> <p> Related to <em>developer ergonomics</em>, though, <code>do</code> notation makes the simple code (i.e. code that separates predictable code from unpredictable code) easy (that is, <em>at hand</em>). </p> <p> F# computation expressions offer the same kind of syntactic sugar, making it easy to write simple code. </p> <h3 id="86e9a9bd6cfc4408bedace8acd330f64"> Delay gratification <a href="#86e9a9bd6cfc4408bedace8acd330f64">#</a> </h3> <p> While it's possible to set up a development context in such a way that it nudges you to work in a way that's ultimately good for you, temptation is everywhere. </p> <p> Not only may new language features, IDE functionality, or frameworks entice you to do something that may be disadvantageous in the long run. There may also be actions you don't take because it just feels better to move on. </p> <p> Do you take the time to write good commit messages? Not just a single-line heading, but <a href="https://github.com/GreanTech/AtomEventStore/commit/615cdee2c4d675d412e6669bcc0678655376c4d1">a proper message that explains your context and reasoning</a>? </p> <p> Most people I've observed working with source control 'just want to move on', and can't be bothered to write a useful commit message. </p> <p> I hear about the same mindset when it comes to code reviews, particularly pull request reviews. Everyone 'just wants to write code', and no-one wants to review other people's code. Yet, in a shared code base, you have to live with the code that other people write. Why not review it so that you have a chance to decide what that shared code base should look like? </p> <p> Delay your own gratification a bit, and reap the rewards later. </p> <h3 id="630055ed606d43289d71232dd1ef1c25"> Conclusion <a href="#630055ed606d43289d71232dd1ef1c25">#</a> </h3> <p> The only goal I have with this article is to make you think about the consequences of new and innovative tools and frameworks. 
Particularly if they are immediately compelling, they may be empty calories. Consider if there may be disadvantages to adopting a new way of doing things. </p> <p> Some tools and technologies give you instant gratification, but may be unhealthy in the long run. This is, like most other things, context-dependent. <a href="/2023/01/16/in-the-long-run">In the long run</a> your company may no longer be around. Sometimes, it pays to deliberately do something that you know is bad, in order to reach a goal before your competition. That was the original <em>technical debt</em> metaphor. </p> <p> Often, however, it pays to delay gratification. Learn curl instead of Postman. Learn to design proper REST APIs instead of relying on OpenAPI. If you need to write ad-hoc scripts, <a href="/2024/02/05/statically-and-dynamically-typed-scripts">use a language suitable for that</a>. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="9efea1cadb8c4e388bfba1a2064dd59a"> <div class="comment-author"><a href="https://thomaslevesque.com">Thomas Levesque</a> <a href="#9efea1cadb8c4e388bfba1a2064dd59a">#</a></div> <div class="comment-content"> <p> Regarding Postman vs. curl, I have to disagree. Sure, curl is pretty easy to use. But while it's good for one-off tests, it sucks when you need to maintain a collection of requests that you can re-execute whenever you want. In a testing session, you either need to re-type the whole command, or reuse a previous command from the shell's history. Or have a file with all your commands and copy-paste to the shell. Either way, it's not a good experience. </p> <p> That being said, I'm not very fond of Postman either. It's too heavyweight for what it does, IMHO, and the import/export mechanism is terrible for sharing collections with the team. 
These days, I tend to use VSCode extensions like <a href="https://github.com/AnWeber/vscode-httpyac">httpYac</a> or <a href="https://github.com/Huachao/vscode-restclient">REST Client</a>, or the equivalent that is now built into Visual Studio and Rider. It's much easier to work with than Postman (it's just text), while still being interactive. And since it's just a text file, you can just add it to Git to share it with the team. </p> </div> <div class="comment-date">2024-05-14 02:38 UTC</div> </div> <div class="comment" id="9efea1cadb8c4e388bfba1a2064dd59b"> <div class="comment-author"><a href="https://majiehong.com/">Jiehong</a> <a href="#9efea1cadb8c4e388bfba1a2064dd59b">#</a></div> <div class="comment-content"> <p> @Thomas Levesque: I agree with you, yet VSCode or Rider's extensions lock you into an editor quite quickly. </p> <p> But you can have the best of both worlds: a cli tool first, with editor extensions. Just like <a href="https://github.com/Orange-OpenSource/hurl">Hurl</a>. </p> <p> Note that while you can run a <a href="https://everything.curl.dev/cmdline/configfile.html#urls">curl command from a file with</a> <code>curl --config [curl_request.file]</code>, it makes chaining requests (like with login and secrets) rather cumbersome very quickly. </p> </div> <div class="comment-date">2024-05-16 13:57 UTC</div> </div> <div class="comment" id="2a6dd3839e2e4bf9b06071221b330356"> <div class="comment-author"><a href="/">Mark Seemann</a> <a href="#2a6dd3839e2e4bf9b06071221b330356">#</a></div> <div class="comment-content"> <p> Thank you, both, for writing. In the end, it's up to every team to settle on technical solutions that work for them, in that context. Likewise, it's up to each developer to identify methodology and tools that work for her or him, as long as it doesn't impact the rest of the team. </p> <p> The reason I suggest curl over other alternatives is that not only is it free, it also tends to be ubiquitous. 
Most systems come with curl baked in - perhaps not a consumer installation of Windows, but if you have developer tools installed, it's highly likely that you have curl on your machine. It's <a href="/2024/05/20/fundamentals">a fundamental skill that may serve you well if you know it</a>. </p> <p> In addition to that, since curl is a CLI you can always script it if you need a kind of semi-automation. What prevents you from maintaining a collection of script files? They could even take command-line arguments, if you'd like. </p> <p> That said, personally, if I realize that I need to maintain a collection of requests that I can re-execute whenever I want, I'd prefer writing a 'real' program. On the other hand, I find a tool like curl useful for ad-hoc testing. </p> </div> <div class="comment-date">2024-05-21 5:36 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Conservative codomain conjecture https://blog.ploeh.dk/2024/05/06/conservative-codomain-conjecture 2024-05-06T06:35:00+00:00 Mark Seemann <div id="post"> <p> <em>An API design heuristic.</em> </p> <p> For a while now, I've been wondering whether, in the language of <a href="https://en.wikipedia.org/wiki/Robustness_principle">Postel's law</a>, one should favour being liberal in what one accepts over being conservative in what one sends. Yes, according to the design principle, a protocol or API should do both, but sometimes, you can't do that. Instead, you'll have to choose. I've recently reached the tentative conclusion that it may be a good idea to favour being conservative in what one sends. </p> <p> Good API design explicitly considers <em>contracts</em>. What are the preconditions for invoking an operation? What are the postconditions? Are there any invariants? These questions are relevant far beyond object-oriented design. 
They are <a href="/2022/10/24/encapsulation-in-functional-programming">equally important in Functional Programming</a>, as well as <a href="/2024/04/15/services-share-schema-and-contract-not-class">in service-oriented design</a>. </p> <p> If you have a type system at your disposal, you can often model pre- and postconditions as types. In practice, however, it frequently turns out that there's more than one way of doing that. You can model an additional precondition with an input type, but you can also model potential errors as a return type. Which option is best? </p> <p> That's what this article is about, and my conjecture is that constraining the input type may be preferable, thus being conservative about what is returned. </p> <h3 id="7ef0610940fb4670b7cf12a21bdd725f"> An average example <a href="#7ef0610940fb4670b7cf12a21bdd725f">#</a> </h3> <p> That's all quite abstract, so for the rest of this article, I'll discuss this kind of problem in the context of an example. We'll revisit the <a href="/2020/02/03/non-exceptional-averages">good old example of calculating an average value</a>. This example, however, is only a placeholder for any kind of API design problem. This article is only superficially about designing an API for calculating an <a href="https://en.wikipedia.org/wiki/Average">average</a>. More generally, this is about API design. I like the <em>average</em> example because it's easy to follow, and it does exhibit some characteristics that you can hopefully extrapolate from. </p> <p> In short, what is the contract of the following method? 
</p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">TimeSpan</span>&nbsp;<span style="color:#74531f;">Average</span>(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">TimeSpan</span>&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">timeSpans</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sum</span>&nbsp;=&nbsp;<span style="color:#2b91af;">TimeSpan</span>.Zero; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">count</span>&nbsp;=&nbsp;0; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">ts</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">timeSpans</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">sum</span>&nbsp;<span style="font-weight:bold;color:#74531f;">+=</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">ts</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">count</span>++; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sum</span>&nbsp;<span style="font-weight:bold;color:#74531f;">/</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">count</span>; }</pre> </p> <p> What are the preconditions? What are the postconditions? Are there any invariants? </p> <p> Before I answer these questions, I'll offer equivalent code in two other languages. 
Here it is in <a href="https://fsharp.org/">F#</a>: </p> <p> <pre>let&nbsp;average&nbsp;(timeSpans&nbsp;:&nbsp;TimeSpan&nbsp;seq)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;timeSpans &nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Seq.averageBy&nbsp;(_.Ticks&nbsp;&gt;&gt;&nbsp;double) &nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;int64 &nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;TimeSpan.FromTicks</pre> </p> <p> And in <a href="https://www.haskell.org/">Haskell</a>: </p> <p> <pre><span style="color:#2b91af;">average</span>&nbsp;<span style="color:blue;">::</span>&nbsp;(<span style="color:blue;">Fractional</span>&nbsp;a,&nbsp;<span style="color:blue;">Foldable</span>&nbsp;t)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;t&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a average&nbsp;xs&nbsp;=&nbsp;<span style="color:blue;">sum</span>&nbsp;xs&nbsp;/&nbsp;<span style="color:blue;">fromIntegral</span>&nbsp;(<span style="color:blue;">length</span>&nbsp;xs)</pre> </p> <p> These three examples have somewhat different implementations, but the same externally observable behaviour. What is the contract? </p> <p> It seems straightforward: If you input a sequence of values, you get the average of all of those values. Are there any preconditions? Yes, the sequence can't be empty. Given an empty sequence, all three implementations throw an exception. (The Haskell version is a little more nuanced than that, but given an empty list of <a href="https://hackage.haskell.org/package/time/docs/Data-Time-Clock.html#t:NominalDiffTime">NominalDiffTime</a>, it does throw an exception.) </p> <p> Any other preconditions? At least one more: The sequence must be finite. All three functions allow infinite streams as input, but if given one, they will fail to return an average. </p> <p> Are there any postconditions? 
I can only think of a statement that relates to the preconditions: <em>If</em> the preconditions are fulfilled, the functions will return the correct average value (within the precision allowed by floating-point calculations). </p> <p> All of this, however, is just warming up. We've <a href="/2020/02/03/non-exceptional-averages">been over this ground before</a>. </p> <h3 id="7922b269c9924877abe993cb282440a8"> Modelling contracts <a href="#7922b269c9924877abe993cb282440a8">#</a> </h3> <p> Keep in mind that this <em>average</em> function is just an example. Think of it as a stand-in for a procedure that's much more complicated. Think of the most complicated operation in your code base. </p> <p> Not only do real code bases have many complicated operations. Each comes with its own contract, different from the other operations, and if the team isn't explicitly thinking in terms of contracts, these contracts may change over time, as the team adds new features and fixes bugs. </p> <p> It's difficult work to keep track of all those contracts. As I argue in <a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a>, it helps if you can automate away some of that work. One way is having good test coverage. Another is to leverage a static type system, if you're fortunate enough to work in a language that has one. As I've <em>also</em> already covered, <a href="/2022/08/22/can-types-replace-validation">you can't replace all rules with types</a>, but it doesn't mean that using the type system is ineffectual. Quite the contrary. Every part of a contract that you can offload to the type system frees up your brain to think about something else - something more important, hopefully. </p> <p> Sometimes there's no good way to model a precondition with a type, or <a href="https://buttondown.email/hillelwayne/archive/making-illegal-states-unrepresentable/">perhaps it's just too awkward</a>. 
At other times, there's really only a single way to address a concern. When it comes to the precondition that you can't pass an infinite sequence to the <em>average</em> function, <a href="/2020/02/03/non-exceptional-averages">change the type so that it takes some finite collection</a> instead. That's not what this article is about, though. </p> <p> Assuming that you've already dealt with the infinite-sequence issue, how do you address the other precondition? </p> <h3 id="03c13848cbd54058a3dfed204bc85878"> Error-handling <a href="#03c13848cbd54058a3dfed204bc85878">#</a> </h3> <p> A typical object-oriented move is to introduce a Guard Clause: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">TimeSpan</span>&nbsp;<span style="color:#74531f;">Average</span>(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IReadOnlyCollection</span>&lt;<span style="color:#2b91af;">TimeSpan</span>&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">timeSpans</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(!<span style="font-weight:bold;color:#1f377f;">timeSpans</span>.<span style="font-weight:bold;color:#74531f;">Any</span>()) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;<span style="color:#2b91af;">ArgumentOutOfRangeException</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">nameof</span>(<span style="font-weight:bold;color:#1f377f;">timeSpans</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;Can&#39;t&nbsp;calculate&nbsp;the&nbsp;average&nbsp;of&nbsp;an&nbsp;empty&nbsp;collection.&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span 
style="font-weight:bold;color:#1f377f;">sum</span>&nbsp;=&nbsp;<span style="color:#2b91af;">TimeSpan</span>.Zero; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">ts</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">timeSpans</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">sum</span>&nbsp;<span style="font-weight:bold;color:#74531f;">+=</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">ts</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sum</span>&nbsp;<span style="font-weight:bold;color:#74531f;">/</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">timeSpans</span>.Count; }</pre> </p> <p> You could do the same in F#: </p> <p> <pre>let&nbsp;average&nbsp;(timeSpans&nbsp;:&nbsp;TimeSpan&nbsp;seq)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;Seq.isEmpty&nbsp;timeSpans&nbsp;then &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;raise&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ArgumentOutOfRangeException( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;nameof&nbsp;timeSpans, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&quot;Can&#39;t&nbsp;calculate&nbsp;the&nbsp;average&nbsp;of&nbsp;an&nbsp;empty&nbsp;collection.&quot;)) &nbsp;&nbsp;&nbsp;&nbsp;timeSpans &nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Seq.averageBy&nbsp;(_.Ticks&nbsp;&gt;&gt;&nbsp;double) &nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;int64 &nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;TimeSpan.FromTicks</pre> </p> <p> You <em>could</em> also replicate such behaviour in Haskell, but it'd be highly unidiomatic. 
Instead, I'd rather discuss one <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a> solution in Haskell, and then back-port it. </p> <p> While you can throw exceptions in Haskell, you typically handle <a href="/2024/01/29/error-categories-and-category-errors">predictable errors</a> with a <a href="https://en.wikipedia.org/wiki/Tagged_union">sum type</a>. Here's a version of the Haskell function equivalent to the above C# code: </p> <p> <pre><span style="color:#2b91af;">average</span>&nbsp;<span style="color:blue;">::</span>&nbsp;(<span style="color:blue;">Foldable</span>&nbsp;t,&nbsp;<span style="color:blue;">Fractional</span>&nbsp;a)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;t&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Either</span>&nbsp;<span style="color:#2b91af;">String</span>&nbsp;a average&nbsp;xs&nbsp;= &nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;<span style="color:blue;">null</span>&nbsp;xs &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">then</span>&nbsp;Left&nbsp;<span style="color:#a31515;">&quot;Can&#39;t&nbsp;calculate&nbsp;the&nbsp;average&nbsp;of&nbsp;an&nbsp;empty&nbsp;collection.&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span>&nbsp;Right&nbsp;$&nbsp;<span style="color:blue;">sum</span>&nbsp;xs&nbsp;/&nbsp;<span style="color:blue;">fromIntegral</span>&nbsp;(<span style="color:blue;">length</span>&nbsp;xs) </pre> </p> <p> For the readers that don't know the Haskell <a href="https://hackage.haskell.org/package/base">base</a> library by heart, <a href="https://hackage.haskell.org/package/base/docs/Data-List.html#v:null">null</a> is a predicate that checks whether or not a collection is empty. It has nothing to do with <a href="https://en.wikipedia.org/wiki/Null_pointer">null pointers</a>. </p> <p> This variation returns an <a href="/2018/06/11/church-encoded-either">Either</a> value. 
In practice you shouldn't just return a <code>String</code> as the error value, but rather a strongly-typed value that other code can deal with in a robust manner. </p> <p> On the other hand, in this particular example, there's really only one error condition that the function is able to detect, so you often see a variation where instead of a single error message, such a function just doesn't return anything: </p> <p> <pre><span style="color:#2b91af;">average</span>&nbsp;<span style="color:blue;">::</span>&nbsp;(<span style="color:blue;">Foldable</span>&nbsp;t,&nbsp;<span style="color:blue;">Fractional</span>&nbsp;a)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;t&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;a average&nbsp;xs&nbsp;=&nbsp;<span style="color:blue;">if</span>&nbsp;<span style="color:blue;">null</span>&nbsp;xs&nbsp;<span style="color:blue;">then</span>&nbsp;Nothing&nbsp;<span style="color:blue;">else</span>&nbsp;Just&nbsp;$&nbsp;<span style="color:blue;">sum</span>&nbsp;xs&nbsp;/&nbsp;<span style="color:blue;">fromIntegral</span>&nbsp;(<span style="color:blue;">length</span>&nbsp;xs) </pre> </p> <p> This iteration of the function returns a <a href="/2018/03/26/the-maybe-functor">Maybe</a> value, indicating that a return value may or may not be present. 
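</p> <p> As a minimal sketch of what this means for calling code, assume a hypothetical <code>describe</code> function (not part of the original example) that consumes the <code>Maybe</code>-returning <code>average</code> from above. The caller has to decide what an empty collection means before it can get at the number: </p> <p> <pre>average :: (Foldable t, Fractional a) =&gt; t a -&gt; Maybe a
average xs = if null xs then Nothing else Just $ sum xs / fromIntegral (length xs)

-- Hypothetical caller, for illustration only: render the average as text,
-- or fall back to a fixed message when there is nothing to average.
describe :: [Double] -&gt; String
describe xs = maybe &quot;no data&quot; show (average xs)

main :: IO ()
main = do
  putStrLn (describe [1, 2, 3]) -- prints 2.0
  putStrLn (describe [])        -- prints no data</pre>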
</p> <h3 id="ccc6a2a1804740a8942feee3b637db90"> Liberal domain <a href="#ccc6a2a1804740a8942feee3b637db90">#</a> </h3> <p> We can back-port this design to F#, where I'd also consider it idiomatic: </p> <p> <pre>let&nbsp;average&nbsp;(timeSpans&nbsp;:&nbsp;IReadOnlyCollection&lt;TimeSpan&gt;)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;timeSpans.Count&nbsp;=&nbsp;0&nbsp;then&nbsp;None&nbsp;else &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;timeSpans &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Seq.averageBy&nbsp;(_.Ticks&nbsp;&gt;&gt;&nbsp;double) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;int64 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;TimeSpan.FromTicks &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Some</pre> </p> <p> This version returns a <code>TimeSpan option</code> rather than just a <code>TimeSpan</code>. While this may seem to put the burden of error-handling on the caller, nothing has really changed. The fundamental situation is the same. Now the function is just being more <a href="https://peps.python.org/pep-0020/">explicit</a> (more honest, you could say) about the pre- and postconditions. The type system also now insists that you deal with the possibility of error, rather than just hoping that the problem doesn't occur. </p> <p> In C# you can <a href="/2024/01/29/error-categories-and-category-errors">expand the codomain by returning a nullable TimeSpan value</a>, but such an option may not always be available at the language level. Keep in mind that the <code>Average</code> method is just an example standing in for something that may be more complicated. 
If the original return type is a <a href="https://learn.microsoft.com/dotnet/csharp/language-reference/keywords/reference-types">reference type</a> rather than a <a href="https://learn.microsoft.com/dotnet/csharp/language-reference/builtin-types/value-types">value type</a>, only recent versions of C# allow statically-checked <a href="https://learn.microsoft.com/dotnet/csharp/nullable-references">nullable reference types</a>. What if you're working in an older version of C#, or another language that doesn't have that feature? </p> <p> In that case, you may need to introduce an explicit <a href="/2018/03/26/the-maybe-functor">Maybe</a> class and return that: </p> <p> <pre>public&nbsp;static&nbsp;Maybe&lt;TimeSpan&gt;&nbsp;Average(this&nbsp;IReadOnlyCollection&lt;TimeSpan&gt;&nbsp;timeSpans)
{
&nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;(timeSpans.Count&nbsp;==&nbsp;0)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return&nbsp;new&nbsp;Maybe&lt;TimeSpan&gt;();
&nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;sum&nbsp;=&nbsp;TimeSpan.Zero;
&nbsp;&nbsp;&nbsp;&nbsp;foreach&nbsp;(var&nbsp;ts&nbsp;in&nbsp;timeSpans)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sum&nbsp;+=&nbsp;ts;
&nbsp;&nbsp;&nbsp;&nbsp;return&nbsp;new&nbsp;Maybe&lt;TimeSpan&gt;(sum&nbsp;/&nbsp;timeSpans.Count);
}</pre> </p> <p> Two things are going on here; one is obvious while the other is more subtle. Clearly, all of these alternatives change the static type of the function in order to make the pre- and postconditions more explicit. So far, they've all been loosening the <a href="https://en.wikipedia.org/wiki/Codomain">codomain</a> (the return <a href="/2021/11/15/types-as-sets">type</a>). This suggests a connection with <a href="https://en.wikipedia.org/wiki/Robustness_principle">Postel's law</a>: <em>be conservative in what you send, be liberal in what you accept</em>. 
These variations are all liberal in what they accept, but it seems that the API design pays the price by also having to widen the set of possible return values. In other words, such designs aren't conservative in what they send. </p> <p> Do we have other options? </p> <h3 id="4fb2cc5775c44f80965cacbc37825f27"> Conservative codomain <a href="#4fb2cc5775c44f80965cacbc37825f27">#</a> </h3> <p> Is it possible to instead design the API in such a way that it's conservative in what it returns? Ideally, we'd like it to guarantee that it returns a number. This is possible by making the preconditions even more explicit. I've also <a href="/2020/02/03/non-exceptional-averages">covered that alternative already</a>, so I'm just going to repeat the C# code here without further comments: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">TimeSpan</span>&nbsp;<span style="color:#74531f;">Average</span>(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">NotEmptyCollection</span>&lt;<span style="color:#2b91af;">TimeSpan</span>&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">timeSpans</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sum</span>&nbsp;=&nbsp;<span style="font-weight:bold;color:#1f377f;">timeSpans</span>.Head; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">ts</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">timeSpans</span>.Tail) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">sum</span>&nbsp;<span style="font-weight:bold;color:#74531f;">+=</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">ts</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sum</span>&nbsp;<span style="font-weight:bold;color:#74531f;">/</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">timeSpans</span>.Count; }</pre> </p> <p> This variation promotes another precondition to a type. The precondition that the input collection mustn't be empty can be explicitly modelled with a type. This enables us to be conservative about the codomain. The method now guarantees that it will return a value. </p> <p> This idea is also easily ported to F#: </p> <p> <pre>type&nbsp;NonEmpty&lt;&#39;a&gt;&nbsp;=&nbsp;{&nbsp;Head&nbsp;:&nbsp;&#39;a;&nbsp;Tail&nbsp;:&nbsp;IReadOnlyCollection&lt;&#39;a&gt;&nbsp;} let&nbsp;average&nbsp;(timeSpans&nbsp;:&nbsp;NonEmpty&lt;TimeSpan&gt;)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;[&nbsp;timeSpans.Head&nbsp;]&nbsp;@&nbsp;List.ofSeq&nbsp;timeSpans.Tail &nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;List.averageBy&nbsp;(_.Ticks&nbsp;&gt;&gt;&nbsp;double) &nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;int64 &nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;TimeSpan.FromTicks</pre> </p> <p> The <code>average</code> function now takes a <code>NonEmpty</code> collection as input, and always returns a proper <code>TimeSpan</code> value. 
</p> <p> Haskell already comes with a built-in <a href="https://hackage.haskell.org/package/base/docs/Data-List-NonEmpty.html">NonEmpty</a> collection type, and while it oddly doesn't come with an <code>average</code> function, it's easy enough to write: </p> <p> <pre><span style="color:blue;">import</span>&nbsp;<span style="color:blue;">qualified</span>&nbsp;Data.List.NonEmpty&nbsp;<span style="color:blue;">as</span>&nbsp;NE <span style="color:#2b91af;">average</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">Fractional</span>&nbsp;a&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:blue;">NE</span>.<span style="color:blue;">NonEmpty</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a average&nbsp;xs&nbsp;=&nbsp;<span style="color:blue;">sum</span>&nbsp;xs&nbsp;/&nbsp;<span style="color:blue;">fromIntegral</span>&nbsp;(NE.<span style="color:blue;">length</span>&nbsp;xs) </pre> </p> <p> You can find a recent example of using a variation of that function <a href="/2024/04/08/extracting-curve-coordinates-from-a-bitmap">here</a>. </p> <h3 id="6f42a53e7c5f4ddb994e85c9d15ec37a"> Choosing between the two alternatives <a href="#6f42a53e7c5f4ddb994e85c9d15ec37a">#</a> </h3> <p> While Postel's law recommends having liberal domains and conservative codomains, in the case of the <em>average</em> API, we can't have both. If we design the API with a liberal input type, the output type has to be liberal as well. If we design with a restrictive input type, the output can be guaranteed. In my experience, you'll often find yourself in such a conundrum. The <em>average</em> API examined in this article is just an example, while the problem occurs often. </p> <p> Given such a choice, what should you choose? Is it even possible to give general guidance on this sort of problem? </p> <p> For decades, I considered such a choice a toss-up. After all, these solutions seem to be equivalent. Perhaps even isomorphic? 
</p> <p> When I recently began to explore this isomorphism more closely, it dawned on me that there's a small asymmetry in the isomorphism that favours the <em>conservative codomain</em> option. </p> <h3 id="976ac8645de44d51a5796be7481b1c12"> Isomorphism <a href="#976ac8645de44d51a5796be7481b1c12">#</a> </h3> <p> An <a href="https://en.wikipedia.org/wiki/Isomorphism">isomorphism</a> is a two-way translation between two representations. You can go back and forth between the two alternatives without loss of information. </p> <p> Is this possible with the two alternatives outlined above? For example, if you have the conservative version, can you create the liberal alternative? Yes, you can: </p> <p> <pre><span style="color:#2b91af;">average&#39;</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">Fractional</span>&nbsp;a&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;[a]&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;a
average&#39;&nbsp;=&nbsp;<span style="color:blue;">fmap</span>&nbsp;average&nbsp;.&nbsp;NE.nonEmpty</pre> </p> <p> Not surprisingly, this is trivial in Haskell. If you have the conservative version, you can just map it over a more liberal input. 
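</p> <p> To see the composition at work, here's a small self-contained sketch that repeats the two definitions from above and evaluates the liberal wrapper on a populated and an empty list. The sample numbers and the <code>main</code> action are mine, added only for illustration: </p> <p> <pre>import qualified Data.List.NonEmpty as NE

-- Conservative version: defined only for non-empty input.
average :: Fractional a =&gt; NE.NonEmpty a -&gt; a
average xs = sum xs / fromIntegral (NE.length xs)

-- Liberal version: NE.nonEmpty returns Nothing for the empty list,
-- and fmap applies average only when there's something to average.
average&#39; :: Fractional a =&gt; [a] -&gt; Maybe a
average&#39; = fmap average . NE.nonEmpty

main :: IO ()
main = do
  print (average&#39; [2, 4, 9 :: Double]) -- prints Just 5.0
  print (average&#39; ([] :: [Double]))    -- prints Nothing</pre>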
</p> <p> In F# it looks like this: </p> <p> <pre>module&nbsp;NonEmpty&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;let&nbsp;tryOfSeq&nbsp;xs&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;Seq.isEmpty&nbsp;xs&nbsp;then&nbsp;None &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;else&nbsp;Some&nbsp;{&nbsp;Head&nbsp;=&nbsp;Seq.head&nbsp;xs;&nbsp;Tail&nbsp;=&nbsp;Seq.tail&nbsp;xs&nbsp;|&gt;&nbsp;List.ofSeq&nbsp;} let&nbsp;average&#39;&nbsp;(timeSpans&nbsp;:&nbsp;IReadOnlyCollection&lt;TimeSpan&gt;)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;NonEmpty.tryOfSeq&nbsp;timeSpans&nbsp;|&gt;&nbsp;Option.map&nbsp;average</pre> </p> <p> In C# we can create a liberal overload that calls the conservative method: </p> <p> <pre>public&nbsp;static&nbsp;TimeSpan?&nbsp;Average(this&nbsp;IReadOnlyCollection&lt;TimeSpan&gt;&nbsp;timeSpans) { &nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;(timeSpans.Count&nbsp;==&nbsp;0) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return&nbsp;null; &nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;arr&nbsp;=&nbsp;timeSpans.ToArray(); &nbsp;&nbsp;&nbsp;&nbsp;return&nbsp;new&nbsp;NotEmptyCollection&lt;TimeSpan&gt;(arr[0],&nbsp;arr[1..]).Average(); }</pre> </p> <p> Here I just used a Guard Clause and explicit construction of the <code>NotEmptyCollection</code>. I could also have added a <code>NotEmptyCollection.TryCreate</code> method, like in the F# and Haskell examples, but I chose the above slightly more imperative style in order to demonstrate that my point isn't tightly coupled to the concept of <a href="/2018/03/22/functors">functors</a>, mapping, and other Functional Programming trappings. </p> <p> These examples highlight how you can trivially make a conservative API look like a liberal API. Is it possible to go the other way? Can you make a liberal API look like a conservative API? </p> <p> Yes and no. </p> <p> Consider the liberal Haskell version of <code>average</code>, shown above; that's the one that returns <code>Maybe a</code>. Can you make a conservative function based on that? 
</p> <p> <pre><span style="color:#2b91af;">average&#39;</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">Fractional</span>&nbsp;a&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:blue;">NE</span>.<span style="color:blue;">NonEmpty</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a average&#39;&nbsp;xs&nbsp;=&nbsp;fromJust&nbsp;$&nbsp;average&nbsp;xs</pre> </p> <p> Yes, this is possible, but only by resorting to the <a href="https://wiki.haskell.org/Partial_functions">partial function</a> <a href="https://hackage.haskell.org/package/base/docs/Data-Maybe.html#v:fromJust">fromJust</a>. I'll explain why that is a problem once we've covered examples in the two other languages, such as F#: </p> <p> <pre>let&nbsp;average&#39;&nbsp;(timeSpans&nbsp;:&nbsp;NonEmpty&lt;TimeSpan&gt;)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;[&nbsp;timeSpans.Head&nbsp;]&nbsp;@&nbsp;List.ofSeq&nbsp;timeSpans.Tail&nbsp;|&gt;&nbsp;average&nbsp;|&gt;&nbsp;Option.get</pre> </p> <p> In this variation, <code>average</code> is the liberal version shown above; the one that returns a <code>TimeSpan option</code>. In order to make a conservative version, the <code>average'</code> function can call the liberal <code>average</code> function, but has to resort to the partial function <code>Option.get</code>. </p> <p> The same issue repeats a third time in C#: </p> <p> <pre>public&nbsp;static&nbsp;TimeSpan&nbsp;Average(this&nbsp;NotEmptyCollection&lt;TimeSpan&gt;&nbsp;timeSpans) { &nbsp;&nbsp;&nbsp;&nbsp;return&nbsp;timeSpans.ToList().Average().Value; }</pre> </p> <p> This time, the partial function is the unsafe <a href="https://learn.microsoft.com/dotnet/api/system.nullable-1.value">Value</a> property, which throws an <code>InvalidOperationException</code> if there's no value. </p> <p> This even violates Microsoft's own design guidelines: </p> <blockquote> <p> "AVOID throwing exceptions from property getters." 
</p> <footer><cite><a href="https://learn.microsoft.com/dotnet/standard/design-guidelines/property">Krzysztof Cwalina and Brad Abrams</a></cite></footer> </blockquote> <p> I've cited Cwalina and Abrams as the authors, since this rule can be found in my 2006 edition of <a href="/ref/fdg">Framework Design Guidelines</a>. This isn't a new insight. </p> <p> While the two alternatives are 'isomorphic enough' that we can translate both ways, the translations are asymmetric in the sense that one is safe, while the other has to resort to an inherently unsafe operation to make it work. </p> <h3 id="e10b4b0269b74efa9d89275644c88d8e"> Encapsulation <a href="#e10b4b0269b74efa9d89275644c88d8e">#</a> </h3> <p> I've called the operations <code>fromJust</code>, <code>Option.get</code>, and <code>Value</code> <em>partial</em>, and only just now used the word <em>unsafe</em>. You may protest that none of the three examples is unsafe in practice, since we know that the input is never empty. Thus, we know that the liberal function will always return a value, and therefore it's safe to call a partial function, even though these operations are unsafe in the general case. </p> <p> While that's true, consider how the burden shifts. When you want to promote a conservative variant to a liberal variant, you can rely on all the operations being total. On the other hand, if you want to make a liberal variant look conservative, the onus is on you. None of the three type systems on display here can perform that analysis for you. </p> <p> This may not be so bad when the example is as simple as taking the average of a collection of numbers, but does it scale? What if the operation you're invoking is much more complicated? Can you still be sure that you can safely invoke a partial function on the return value? </p> <p> As <a href="/ctfiyh">Code That Fits in Your Head</a> argues, procedures quickly become so complicated that they no longer fit in your head. 
If you don't have well-described and patrolled contracts, you don't know what the postconditions are. You can't trust the return values from method calls, or even the state of the objects you passed as arguments. This tends to lead to <a href="/2013/07/08/defensive-coding">defensive coding</a>, where you write code that checks the state of everything all too often. </p> <p> The remedy is, as always, good old <a href="/encapsulation-and-solid">encapsulation</a>. In this case, check the preconditions at the beginning, and capture the result of that check in an object or type that is guaranteed to be always valid. This goes beyond <a href="https://blog.janestreet.com/effective-ml-video/">making illegal states unrepresentable</a> because it also works with <a href="https://www.hillelwayne.com/post/constructive/">predicative</a> types. Once you're past the Guard Clauses, you don't have to check the preconditions <em>again</em>. </p> <p> This kind of thinking illustrates why you need a multidimensional view on API design. As useful as Postel's law sometimes is, it doesn't address all problems. In fact, it turned out to be unhelpful in this context, while another perspective proves more fruitful. Encapsulation is the art and craft of designing APIs in such a way that they suggest or even compel correct interactions. The more I think of this, the more it strikes me that a <em>ranking</em> is implied: Preconditions are more important than postconditions, because if the preconditions are unfulfilled, you can't trust the postconditions, either. </p> <h3 id="35bb4abc87b8402da82c82c5baa71235"> Mapping <a href="#35bb4abc87b8402da82c82c5baa71235">#</a> </h3> <p> What's going on here? One perspective is to view <a href="/2021/11/15/types-as-sets">types as sets</a>. 
In the <em>average</em> example, the function maps from one set to another: </p> <p> <img src="/content/binary/mapping-from-collections-to-reals.png" alt="Mapping from the set of collections to the set of real numbers."> </p> <p> Which sets are they? We can think of the <em>average</em> function as a mapping from the set of non-empty collections of numbers to the set of <a href="https://en.wikipedia.org/wiki/Real_number">real numbers</a>. In programming, we can't represent real numbers, so instead, the left set is going to be the set of all the non-empty collections the computer or the language can represent and hold in (virtual) memory, and the right-hand set is the set of all the possible numbers of whichever type you'd like (32-bit signed integers, <a href="https://en.wikipedia.org/wiki/Double-precision_floating-point_format">64-bit floating-point numbers</a>, 8-bit unsigned integers, etc.). </p> <p> In reality, the left-hand set is much larger than the set to the right. </p> <p> Drawing all those arrows quickly becomes awkward, so instead, we may <a href="/2021/11/22/functions-as-pipes">draw each mapping as a pipe</a>. Such a pipe also corresponds to a function. Here's an intermediate step in such a representation: </p> <p> <img src="/content/binary/mapping-from-collections-to-reals-transparent-pipe.png" alt="Mapping from one set to the other, drawn inside a transparent pipe."> </p> <p> One common element is, however, missing from the left set. Which one? </p> <h3 id="8920c8df3b9f4f978a5c560d5c9cdcb4"> Pipes <a href="#8920c8df3b9f4f978a5c560d5c9cdcb4">#</a> </h3> <p> The above mapping corresponds to the conservative variation of the function. It's a total function that maps all values in the domain to a value in the codomain. It accomplishes this trick by explicitly constraining the domain to only those elements on which it's defined. Due to the preconditions, that excludes the empty collection, which is therefore absent from the left set. 
</p> <p> What if we also want to allow the empty collection to be a valid input? </p> <p> Unless we find ourselves in some special context where it makes sense to define a 'default average value', we can't map an empty collection to any meaningful number. Rather, we'll have to map it to some special value, such as <code>Nothing</code>, <code>None</code>, or <code>null</code>: </p> <p> <img src="/content/binary/mapping-with-none-channel-transparent-pipe.png" alt="Mapping the empty collection to null in a pipe separate, but on top of, the proper function pipe."> </p> <p> This extra pipe is free, because it's supplied by the <a href="/2018/03/26/the-maybe-functor">Maybe functor</a>'s mapping (<code>Select</code>, <code>map</code>, <code>fmap</code>). </p> <p> What happens if we need to go the other way? If the function is the liberal variant that also maps the empty collection to a special element that indicates a missing value? </p> <p> <img src="/content/binary/mapping-from-all-collections-to-reals-transparent-pipe.png" alt="Mapping all collections, including the empty collection, to the set of real numbers."> </p> <p> In this case, it's much harder to disentangle the mappings. If you imagine that a liquid flows through the pipes, we can try to be careful and avoid 'filling up' the pipe. </p> <p> <img src="/content/binary/pipe-partially-filled-with-liquid.png" alt="Pipe partially filled with liquid."> </p> <p> The liquid represents the data that we <em>do</em> want to transmit through the pipe. As this illustration suggests, we now have to be careful that nothing goes wrong. In order to catch just the right outputs on the right side, you need to know how high the liquid may go, and attach a 'flat-top' pipe to it: </p> <p> <img src="/content/binary/pipe-composed-with-open-top-pipe.png" alt="Pipe composed with open-top pipe."> </p> <p> As this illustration tries to get across, this kind of composition is awkward and error-prone. 
What's worse is that you need to know how high the liquid is going to get on the right side. This depends on what actually goes on inside the pipe, and what kind of input goes into the left-hand side. </p> <p> This is a metaphor. The longer the pipe is, the more difficult it gets to keep track of that knowledge. The stubby little pipe in these illustrations may correspond to the <em>average</em> function, which is an operation that easily fits in our heads. It's not too hard to keep track of the preconditions, and how they map to postconditions. </p> <p> Thus, turning such a small liberal function into a conservative function is possible, but already awkward. If the operation is complicated, you can no longer keep track of all the details of how the inputs relate to the outputs. </p> <h3 id="18b682600e5a4d1baf542b0cd1dcda7f"> Additive extensibility <a href="#18b682600e5a4d1baf542b0cd1dcda7f">#</a> </h3> <p> This really shouldn't surprise us. Most programming languages come with all sorts of facilities that enable <em>extensibility</em>: The ability to <em>add</em> more functionality, more behaviour, more capabilities, to existing building blocks. Conversely, few languages come with <em>removability</em> facilities. You can't, commonly, declare that an object is an instance of a class, <em>except</em> one method, or that a function is just like another function, <em>except</em> that it doesn't accept a particular subset of input. </p> <p> This explains why we can safely make a conservative function liberal, but why it's difficult to make a liberal function conservative. This is because making a conservative function liberal <em>adds</em> functionality, while making a liberal function conservative attempts to remove functionality. 
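</p> <p> One way to make the 'additive' direction concrete is to note that it doesn't depend on <em>average</em> at all. The following is only a sketch, and <code>liberalise</code> is a hypothetical helper name that I've made up for the illustration: it turns <em>any</em> function on non-empty lists into one that accepts ordinary lists, using nothing but total operations. I'm not aware of a similarly total helper of the type <code>([a] -&gt; Maybe b) -&gt; NE.NonEmpty a -&gt; b</code>; without a partial function or a default value, there's nothing sensible to return in the <code>Nothing</code> case. </p> <p> <pre>import qualified Data.List.NonEmpty as NE

-- Hypothetical helper, for illustration only: add the 'empty input'
-- case to any conservative function. Every step is total.
liberalise :: (NE.NonEmpty a -&gt; b) -&gt; [a] -&gt; Maybe b
liberalise f = fmap f . NE.nonEmpty

main :: IO ()
main = do
  print (liberalise maximum [3, 1, 4 :: Int]) -- prints Just 4
  print (liberalise NE.head (&quot;&quot; :: String))   -- prints Nothing</pre>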
</p> <h3 id="6545430a4e1f47a38e121aee1a342b40"> Conjecture <a href="#6545430a4e1f47a38e121aee1a342b40">#</a> </h3> <p> All this leads me to the following conjecture: When faced with a choice between two versions of an API, where one has a liberal domain, and the other a conservative codomain, choose the design with the conservative codomain. </p> <p> If you need the liberal version, you can create it from the conservative operation. The converse need not be true. </p> <h3 id="312f8f4df0c44390a43f4f0f92d2c9d6"> Conclusion <a href="#312f8f4df0c44390a43f4f0f92d2c9d6">#</a> </h3> <p> Postel's law encourages us to be liberal with what we accept, but conservative with what we return. This is a good design heuristic, but sometimes you're faced with mutually exclusive alternatives. If you're liberal with what you accept, you'll also need to be too loose with what you return, because there are input values that you can't handle. On the other hand, sometimes the only way to be conservative with the output is to also be restrictive when it comes to input. </p> <p> Given two such alternatives, which one should you choose? </p> <p> This article conjectures that you should choose the conservative alternative. This isn't a political statement, but simply a result of the conservative design being the smaller building block. From a small building block, you can compose something bigger, whereas from a bigger unit, you can't easily extract something smaller that's still robust and useful. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. 
Service compatibility is determined based on policy https://blog.ploeh.dk/2024/04/29/service-compatibility-is-determined-based-on-policy 2024-04-29T11:12:00+00:00 Mark Seemann <div id="post"> <p> <em>A reading of the fourth Don Box tenet, with some commentary.</em> </p> <p> This article is part of a series titled <a href="/2024/03/04/the-four-tenets-of-soa-revisited">The four tenets of SOA revisited</a>. In each of these articles, I'll pull one of <a href="https://en.wikipedia.org/wiki/Don_Box">Don Box</a>'s <em>four tenets of service-oriented architecture</em> (SOA) out of the <a href="https://learn.microsoft.com/en-us/archive/msdn-magazine/2004/january/a-guide-to-developing-and-running-connected-systems-with-indigo">original MSDN Magazine article</a> and add some of my own commentary. If you're curious why I do that, I cover that in the introductory article. </p> <p> In this article, I'll go over the fourth tenet, quoting from the MSDN Magazine article unless otherwise indicated. </p> <h3 id="57382e74449c40409a7d73d91bc5fd14"> Service compatibility is determined based on policy <a href="#57382e74449c40409a7d73d91bc5fd14">#</a> </h3> <p> The fourth tenet is the forgotten one. I could rarely remember exactly what it included, but it does give me an opportunity to bring up a few points about compatibility. The article said: </p> <blockquote> <p> Object-oriented designs often confuse structural compatibility with semantic compatibility. Service-orientation deals with these two axes separately. Structural compatibility is based on contract and schema and can be validated (if not enforced) by machine-based techniques (such as packet-sniffing, validating firewalls). Semantic compatibility is based on explicit statements of capabilities and requirements in the form of policy. </p> <p> Every service advertises its capabilities and requirements in the form of a machine-readable policy expression. 
Policy expressions indicate which conditions and guarantees (called assertions) must hold true to enable the normal operation of the service. Policy assertions are identified by a stable and globally unique name whose meaning is consistent in time and space no matter which service the assertion is applied to. Policy assertions may also have parameters that qualify the exact interpretation of the assertion. Individual policy assertions are opaque to the system at large, which enables implementations to apply simple propositional logic to determine service compatibility. </p> </blockquote> <p> As you can tell, this description is the shortest of the four. This is also the point where I begin to suspect that my reading of <a href="/2024/04/15/services-share-schema-and-contract-not-class">the third tenet</a> may deviate from what Don Box originally had in mind. </p> <p> This tenet is also the most baffling to me. As I understand it, the motivation behind the four tenets was to describe assumptions about the kind of systems that people would develop with <a href="https://en.wikipedia.org/wiki/Windows_Communication_Foundation">Windows Communication Foundation</a> (WCF), or <a href="https://en.wikipedia.org/wiki/SOAP">SOAP</a> in general. </p> <p> While I worked with WCF for a decade, the above description doesn't ring a bell. Reading it now, the description of <em>policy</em> sounds more like a system such as <a href="https://clojure.org/about/spec">clojure.spec</a>, although that's not something I know much about either. I don't recall WCF ever having a machine-readable policy subsystem, and if it had, I never encountered it. </p> <p> It does seem, however, as though what I interpret as <em>contract</em>, Don Box called <em>policy</em>. </p> <p> Despite my confusion, the word <em>compatibility</em> is worth discussing, regardless of whether that was what Don Box meant. 
A well-designed service is one where you've explicitly considered forwards and backwards compatibility. </p> <h3 id="77bf7878d5304ba08f686cbfbc6cb941"> Versioning <a href="#77bf7878d5304ba08f686cbfbc6cb941">#</a> </h3> <p> Planning for forwards and backwards compatibility does <em>not</em> imply that you're expected to be able to predict the future. It's fine if you have so much experience developing and maintaining online systems that you may have enough foresight to plan for certain likely changes that you may have to make in the future, but that's not what I have in mind. </p> <p> Rather, what you <em>should</em> do is to have a system that enables you to detect breaking changes before you deploy them. Furthermore you should have a strategy for how to deal with the perceived necessity to introduce breaking changes. </p> <p> The most effective strategy that I know of is to employ explicit versioning, particularly <em>message versioning</em>. You <em>can</em> version an entire service as one indivisible block, but I often find it more useful to version at the message level. If you're designing a <a href="https://en.wikipedia.org/wiki/REST">REST</a> API, for example, you can <a href="/2015/06/22/rest-implies-content-negotiation">take advantage of Content Negotiation</a>. </p> <p> If you like, you can use <a href="https://semver.org/">Semantic Versioning</a> as a versioning scheme, but for services, the thing that mostly matters is the major version. Thus, you may simply label your messages with the version numbers <em>1</em>, <em>2</em>, etc. </p> <p> If you already have a published service without explicit message version information, then you can still retrofit versioning afterwards. 
<a href="/2023/12/04/serialization-with-and-without-reflection">Imagine that your existing data looks like this</a>: </p> <p> <pre>{ &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;singleTable&quot;</span>:&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;capacity&quot;</span>:&nbsp;16, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;minimalReservation&quot;</span>:&nbsp;10 &nbsp;&nbsp;} }</pre> </p> <p> This <a href="https://json.org/">JSON</a> document has no explicit version information, but you can interpret that as implying that the document has the 'default' version, which is always <em>1:</em> </p> <p> <pre>{ &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;singleTable&quot;</span>:&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;version&quot;</span>:&nbsp;1, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;capacity&quot;</span>:&nbsp;16, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;minimalReservation&quot;</span>:&nbsp;10 &nbsp;&nbsp;} }</pre> </p> <p> If you later realize that you need to make a breaking change, you can do that by increasing the (major) version: </p> <p> <pre>{ &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;singleTable&quot;</span>:&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;version&quot;</span>:&nbsp;2, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;id&quot;</span>:&nbsp;12, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;capacity&quot;</span>:&nbsp;16, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;minimalReservation&quot;</span>:&nbsp;10 &nbsp;&nbsp;} }</pre> </p> <p> Recipients can now look for the <code>version</code> property to learn how to interpret the rest of the message, and failing to find it, infer that this is version <em>1</em>. </p> <p> As Don Box wrote, in a service-oriented system, you can't just update all systems in a single coordinated release. Therefore, you must never break compatibility. 
Versioning enables you to move forward in a way that does break with the past, but without breaking existing clients. </p> <p> Ultimately, you <a href="/2020/06/01/retiring-old-service-versions">may attempt to retire old service versions</a>, but be ready to keep them around for a long time. </p> <p> For more of my thoughts about backwards compatibility, see <a href="/2021/12/13/backwards-compatibility-as-a-profunctor">Backwards compatibility as a profunctor</a>. </p> <h3 id="ad9cec4f54c243d08fc71d38ff13ac17"> Conclusion <a href="#ad9cec4f54c243d08fc71d38ff13ac17">#</a> </h3> <p> The fourth tenet is the most nebulous, and I wonder if it was ever implemented. If it was, I'm not aware of it. Even so, compatibility is an important component of service design, so I took the opportunity to write about that. In most cases, it pays to think explicitly about message versioning. </p> <p> I have the impression that Don Box had something in mind more akin to what I call <em>contract</em>. Whether you call it one thing or another, it stands to reason that you often need to attach extra rules to simple types. The <em>schema</em> may define an input value as a number, but the service does require that this particular number is a natural number. Or that a string is really a <a href="https://en.wikipedia.org/wiki/ISO_8601">proper encoding</a> of a date. Perhaps you call that <em>policy</em>. I call it <em>contract</em>. In any case, clearly communicating such expectations is important for systems to be compatible. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. 
Fitting a polynomial to a set of points https://blog.ploeh.dk/2024/04/22/fitting-a-polynomial-to-a-set-of-points 2024-04-22T05:35:00+00:00 Mark Seemann <div id="post"> <p> <em>The story of a fiasco.</em> </p> <p> This is the second in a small series of articles titled <a href="/2024/04/01/trying-to-fit-the-hype-cycle">Trying to fit the hype cycle</a>. In the introduction, I've described the exercise I had in mind: Determining a formula, or at least a <a href="https://en.wikipedia.org/wiki/Piecewise">piecewise</a> <a href="https://en.wikipedia.org/wiki/Function_(mathematics)">function</a>, for the <a href="https://en.wikipedia.org/wiki/Gartner_hype_cycle">Gartner hype cycle</a>. This, to be clear, is an entirely frivolous exercise with little practical application. </p> <p> In the previous article, I <a href="/2024/04/08/extracting-curve-coordinates-from-a-bitmap">extracted a set of <em>(x, y)</em> coordinates from a bitmap</a>. In this article, I'll showcase my failed attempt at fitting the data to a <a href="https://en.wikipedia.org/wiki/Polynomial">polynomial</a>. </p> <h3 id="36f71204d90b44a8b39a7d8103f46cca"> Failure <a href="#36f71204d90b44a8b39a7d8103f46cca">#</a> </h3> <p> I've already revealed that I failed to accomplish what I set out to do. Why should you read on, then? </p> <p> You don't have to, and I can't predict the many reasons my readers have for occasionally swinging by. Therefore, I can't tell you why <em>you</em> should keep reading, but I <em>can</em> tell you why I'm writing this article. </p> <p> This blog is a mix of articles that I write because readers ask me interesting questions, and partly, it's my personal research-and-development log. In that mode, I write about things that I've learned, and I write in order to learn. One can learn from failure as well as from success. 
</p> <p> I'm not <em>that</em> connected to 'the' research community (if such a thing exists), but I'm getting the sense that researchers in academia rarely publish their negative results. This could be a problem, because it means that the rest of us never learn about the <em>thousands of ways that don't work</em>. </p> <p> Additionally, in 'the' programming community, we also tend to boast of our victories and hide our failures. More than one podcast (sorry about the <a href="https://en.wikipedia.org/wiki/Weasel_word">weasel words</a>, but I don't remember which ones) has discussed how this gives young programmers the wrong impression of what programming is like. It is, indeed, a process of much trial and error, but usually, we only publish our polished, final result. </p> <p> Well, I did manage to produce code to fit a polynomial to the Gartner hype cycle, but I never managed to get a <em>good</em> fit. </p> <h3 id="34ad323fc07f48709fb86c4045bd5892"> The big picture <a href="#34ad323fc07f48709fb86c4045bd5892">#</a> </h3> <p> I realize that I have a habit of burying the lede when I write technical articles. I don't know if I've picked up that tendency from <a href="https://fsharp.org/">F#</a>, which does demand that you define a value or function before you can use it. This, by the way, <a href="/2015/04/15/c-will-eventually-get-all-f-features-right">is a good feature</a>.
</p> <p> Here, I'll try to do it the other way around, and start with the big picture: </p> <p> <pre>data&nbsp;=&nbsp;numpy.loadtxt(<span style="color:#a31515;">&#39;coords.txt&#39;</span>,&nbsp;delimiter=<span style="color:#a31515;">&#39;,&#39;</span>)
x&nbsp;=&nbsp;data[:,&nbsp;0]
t&nbsp;=&nbsp;data[:,&nbsp;1]
w&nbsp;=&nbsp;fit_polynomial(x,&nbsp;t,&nbsp;9)
plot_fit(x,&nbsp;t,&nbsp;w)</pre> </p> <p> This, by the way, is a <a href="https://www.python.org/">Python</a> script, and it opens with these imports: </p> <p> <pre><span style="color:blue;">import</span>&nbsp;numpy
<span style="color:blue;">import</span>&nbsp;matplotlib.pyplot&nbsp;<span style="color:blue;">as</span>&nbsp;plt</pre> </p> <p> The first line of code reads the <a href="https://en.wikipedia.org/wiki/Comma-separated_values">CSV</a> file into the <code>data</code> variable. The first column in that file contains all the <em>x</em> values, and the second column the <em>y</em> values. <a href="/ref/rogers-girolami">The book</a> that I've been following uses <em>t</em> for the data, rather than <em>y</em>. (Now that I think about it, I believe that this may only be because it works from an example in which the data to be fitted are <a href="https://en.wikipedia.org/wiki/100_metres">100 m dash</a> times, denoted <em>t</em>.) </p> <p> Once the script has extracted the data, it calls the <code>fit_polynomial</code> function to produce a set of weights <code>w</code>. The constant <code>9</code> is the degree of the polynomial to fit, although I think that I've made an off-by-one error so that the result is only an eighth-degree polynomial. </p> <p> Finally, the code plots the original data together with the polynomial: </p> <p> <img src="/content/binary/hype-8th-degree-poly.png" alt="Gartner hype cycle and an eighth-degree fitted polynomial."> </p> <p> The green dots are the <em>(x, y)</em> coordinates that I extracted in the previous article, while the red curve is the fitted eighth-degree polynomial.
Even though we're definitely in the realm of over-fitting, it doesn't reproduce the Gartner hype cycle. </p> <p> I've even arrived at the value <code>9</code> after some trial and error. After all, I wasn't trying to do any real science here, so over-fitting is definitely allowed. Even so, <code>9</code> seems to be the best fit I can achieve. With lower values, like <code>8</code>, below, the curve deviates too much: </p> <p> <img src="/content/binary/hype-7th-degree-poly.png" alt="Gartner hype cycle and a seventh-degree fitted polynomial."> </p> <p> The value <code>10</code> looks much like <code>9</code>, but above that (<code>11</code>), the curve completely disconnects from the data, it seems: </p> <p> <img src="/content/binary/hype-10th-degree-poly.png" alt="Gartner hype cycle and a tenth-degree fitted polynomial."> </p> <p> I'm not sure why it does this, to be honest. I would have thought that the more degrees you added, the more (over-)fitted the curve would be. Apparently, this is not so, or perhaps I made a mistake in my code. </p> <h3 id="183834d3c95544d9a185b5ba84bba9a1"> Calculating the weights <a href="#183834d3c95544d9a185b5ba84bba9a1">#</a> </h3> <p> The <code>fit_polynomial</code> function calculates the polynomial coefficients using a <a href="https://en.wikipedia.org/wiki/Linear_algebra">linear algebra</a> formula that I've found in at least two textbooks. Numpy makes it easy to invert, transpose, and multiply matrices, so the formula itself is just a one-liner. Here it is in the entire context of the function, though: </p> <p> <pre><span style="color:blue;">def</span>&nbsp;<span style="color:#2b91af;">fit_polynomial</span>(x,&nbsp;t,&nbsp;degree):
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;&quot;&quot;
&nbsp;&nbsp;&nbsp;&nbsp;Fits&nbsp;a&nbsp;polynomial&nbsp;to&nbsp;the&nbsp;given&nbsp;data.
&nbsp;&nbsp;&nbsp;&nbsp;Parameters
&nbsp;&nbsp;&nbsp;&nbsp;----------
&nbsp;&nbsp;&nbsp;&nbsp;x&nbsp;:&nbsp;Array&nbsp;of&nbsp;shape&nbsp;[n_samples]
&nbsp;&nbsp;&nbsp;&nbsp;t&nbsp;:&nbsp;Array&nbsp;of&nbsp;shape&nbsp;[n_samples]
&nbsp;&nbsp;&nbsp;&nbsp;degree&nbsp;:&nbsp;degree&nbsp;of&nbsp;the&nbsp;polynomial
&nbsp;&nbsp;&nbsp;&nbsp;Returns
&nbsp;&nbsp;&nbsp;&nbsp;-------
&nbsp;&nbsp;&nbsp;&nbsp;w&nbsp;:&nbsp;Array&nbsp;of&nbsp;shape&nbsp;[degree&nbsp;+&nbsp;1]
&nbsp;&nbsp;&nbsp;&nbsp;&quot;&quot;&quot;</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">#&nbsp;This&nbsp;expansion&nbsp;creates&nbsp;a&nbsp;matrix,&nbsp;so&nbsp;we&nbsp;name&nbsp;that&nbsp;with&nbsp;an&nbsp;upper-case&nbsp;letter</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">#&nbsp;rather&nbsp;than&nbsp;a&nbsp;lower-case&nbsp;letter,&nbsp;which&nbsp;is&nbsp;used&nbsp;for&nbsp;vectors.</span>
&nbsp;&nbsp;&nbsp;&nbsp;X&nbsp;=&nbsp;expand(x.reshape((<span style="color:blue;">len</span>(x),&nbsp;1)),&nbsp;degree)
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;numpy.linalg.inv(X.T&nbsp;@&nbsp;X)&nbsp;@&nbsp;X.T&nbsp;@&nbsp;t</pre> </p> <p> This may look daunting, but is really just two lines of code. The rest is <a href="https://en.wikipedia.org/wiki/Docstring">docstring</a> and a comment. </p> <p> The above-mentioned formula is the last line of code. The one before that expands the input data <code>x</code> from a simple one-dimensional array to a matrix of those values squared, cubed, etc. That's how you use the <a href="https://en.wikipedia.org/wiki/Least_squares">least squares</a> method if you want to fit data to a polynomial of arbitrary degree.
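</p> <p> In matrix notation, the last line of <code>fit_polynomial</code> computes <em>w = (X<sup>T</sup>X)<sup>-1</sup>X<sup>T</sup>t</em>. As a cross-check, the same least-squares problem can be handed to <code>numpy.linalg.lstsq</code>, which solves it without explicitly inverting a matrix. The following is only a sketch under stated assumptions, not the code that produced the plots: <code>fit_polynomial_lstsq</code> is a hypothetical name, the sample data is made up, and the expanded matrix is built with <code>numpy.vander</code> so that the example is self-contained: </p> <p> <pre>import numpy


def fit_polynomial_lstsq(x, t, degree):
    # Columns are x**0, x**1, ..., x**degree: a column of ones followed
    # by increasing powers of x.
    X = numpy.vander(x, degree + 1, increasing=True)
    # lstsq minimises the squared error without forming inv(X.T @ X).
    w, _, _, _ = numpy.linalg.lstsq(X, t, rcond=None)
    return w


# Made-up sample points that lie exactly on t = 1 + 2x + 3x^2.
x = numpy.array([0.0, 1.0, 2.0, 3.0, 4.0])
t = 1 + 2 * x + 3 * x ** 2
w = fit_polynomial_lstsq(x, t, 2)</pre> </p> <p> For small, well-conditioned input like this, both approaches should agree on the weights. When large <em>x</em> values are raised to high powers, <em>X<sup>T</sup>X</em> can become badly conditioned, and explicit inversion may then lose precision. Whether that plays any role in the behaviour at higher degrees described above is only a guess, not something verified here.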
</p> <h3 id="782c5cbd64de43878eea4a3ddfcdf755"> Expansion <a href="#782c5cbd64de43878eea4a3ddfcdf755">#</a> </h3> <p> The <code>expand</code> function looks like this: </p> <p> <pre><span style="color:blue;">def</span>&nbsp;<span style="color:#2b91af;">expand</span>(x,&nbsp;degree):
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;&quot;&quot;
&nbsp;&nbsp;&nbsp;&nbsp;Expands&nbsp;the&nbsp;given&nbsp;array&nbsp;to&nbsp;polynomial&nbsp;elements&nbsp;of&nbsp;the&nbsp;given&nbsp;degree.
&nbsp;&nbsp;&nbsp;&nbsp;Parameters
&nbsp;&nbsp;&nbsp;&nbsp;----------
&nbsp;&nbsp;&nbsp;&nbsp;x&nbsp;:&nbsp;Array&nbsp;of&nbsp;shape&nbsp;[n_samples,&nbsp;1]
&nbsp;&nbsp;&nbsp;&nbsp;degree&nbsp;:&nbsp;degree&nbsp;of&nbsp;the&nbsp;polynomial
&nbsp;&nbsp;&nbsp;&nbsp;Returns
&nbsp;&nbsp;&nbsp;&nbsp;-------
&nbsp;&nbsp;&nbsp;&nbsp;Xp&nbsp;:&nbsp;Array&nbsp;of&nbsp;shape&nbsp;[n_samples,&nbsp;degree&nbsp;+&nbsp;1]
&nbsp;&nbsp;&nbsp;&nbsp;&quot;&quot;&quot;</span>
&nbsp;&nbsp;&nbsp;&nbsp;Xp&nbsp;=&nbsp;numpy.ones((<span style="color:blue;">len</span>(x),&nbsp;1))
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">for</span>&nbsp;i&nbsp;<span style="color:blue;">in</span>&nbsp;<span style="color:blue;">range</span>(1,&nbsp;degree&nbsp;+&nbsp;1):
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Xp&nbsp;=&nbsp;numpy.hstack((Xp,&nbsp;numpy.power(x,&nbsp;i)))
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;Xp</pre> </p> <p> The function begins by creating a column vector of ones, here illustrated with only three rows: </p> <p> <pre>&gt;&gt;&gt; Xp = numpy.ones((3, 1))
&gt;&gt;&gt; Xp
array([[1.],
       [1.],
       [1.]])</pre> </p> <p> It then proceeds to loop over as many degrees as you've asked it to, each time adding a column to the <code>Xp</code> matrix.
Here's an example of doing that up to a power of three, on example input <code>[1,2,3]</code>: </p> <p> <pre>&gt;&gt;&gt; x = numpy.array([1,2,3]).reshape((3, 1))
&gt;&gt;&gt; x
array([[1],
       [2],
       [3]])
&gt;&gt;&gt; Xp = numpy.hstack((Xp, numpy.power(x, 1)))
&gt;&gt;&gt; Xp
array([[1., 1.],
       [1., 2.],
       [1., 3.]])
&gt;&gt;&gt; Xp = numpy.hstack((Xp, numpy.power(x, 2)))
&gt;&gt;&gt; Xp
array([[1., 1., 1.],
       [1., 2., 4.],
       [1., 3., 9.]])
&gt;&gt;&gt; Xp = numpy.hstack((Xp, numpy.power(x, 3)))
&gt;&gt;&gt; Xp
array([[ 1.,  1.,  1.,  1.],
       [ 1.,  2.,  4.,  8.],
       [ 1.,  3.,  9., 27.]])</pre> </p> <p> Once it's done looping, the <code>expand</code> function returns the resulting <code>Xp</code> matrix. </p> <h3 id="cfb27c6067d2486c95836dc61484b2a0"> Plotting <a href="#cfb27c6067d2486c95836dc61484b2a0">#</a> </h3> <p> Finally, here's the <code>plot_fit</code> procedure: </p> <p> <pre><span style="color:blue;">def</span>&nbsp;<span style="color:#2b91af;">plot_fit</span>(x,&nbsp;t,&nbsp;w):
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;&quot;&quot;
&nbsp;&nbsp;&nbsp;&nbsp;Plots&nbsp;the&nbsp;polynomial&nbsp;with&nbsp;the&nbsp;given&nbsp;weights&nbsp;and&nbsp;the&nbsp;data.
&nbsp;&nbsp;&nbsp;&nbsp;Parameters
&nbsp;&nbsp;&nbsp;&nbsp;----------
&nbsp;&nbsp;&nbsp;&nbsp;x&nbsp;:&nbsp;Array&nbsp;of&nbsp;shape&nbsp;[n_samples]
&nbsp;&nbsp;&nbsp;&nbsp;t&nbsp;:&nbsp;Array&nbsp;of&nbsp;shape&nbsp;[n_samples]
&nbsp;&nbsp;&nbsp;&nbsp;w&nbsp;:&nbsp;Array&nbsp;of&nbsp;shape&nbsp;[degree&nbsp;+&nbsp;1]
&nbsp;&nbsp;&nbsp;&nbsp;&quot;&quot;&quot;</span>
&nbsp;&nbsp;&nbsp;&nbsp;xs&nbsp;=&nbsp;numpy.linspace(x[0],&nbsp;x[0]+<span style="color:blue;">len</span>(x),&nbsp;100)
&nbsp;&nbsp;&nbsp;&nbsp;ys&nbsp;=&nbsp;numpy.polyval(w[::-1],&nbsp;xs)
&nbsp;&nbsp;&nbsp;&nbsp;plt.plot(xs,&nbsp;ys,&nbsp;<span style="color:#a31515;">&#39;r&#39;</span>)
&nbsp;&nbsp;&nbsp;&nbsp;plt.scatter(x,&nbsp;t,&nbsp;s=10,&nbsp;c=<span style="color:#a31515;">&#39;g&#39;</span>)
&nbsp;&nbsp;&nbsp;&nbsp;plt.show()</pre> </p> <p> This is fairly standard pyplot code, so I don't have much to say about it. </p> <h3 id="3730027db8614b01960cf5379d8add78"> Conclusion <a href="#3730027db8614b01960cf5379d8add78">#</a> </h3> <p> When I started this exercise, I'd hoped that I could get close to the Gartner hype cycle by over-fitting the model to some ridiculous polynomial degree. This turned out not to be the case, for reasons that I don't fully understand. As I increase the degree, the curve begins to deviate from the data. </p> <p> I can't say that I'm a data scientist or a statistician of any skill, so it's possible that my understanding is still too shallow. Perhaps I'll return to this article later and marvel at the ineptitude on display here. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="4bef47fad250438a94c2f1de28dc330d"> <div class="comment-author"><a href="https://www.mit.edu/~amu/">Aaron M.
Ucko</a> <a href="#4bef47fad250438a94c2f1de28dc330d">#</a></div> <div class="comment-content"> <p> I suspect that increasing the degree wound up backfiring by effectively putting too much weight on the right side, whose flatness clashed with the increasingly steep powers you were trying to mix in. A vertically offset damped sinusoid might make a better starting point for modeling, though identifying its parameters wouldn't be quite as straightforward. One additional wrinkle there is that you want to level fully off after the valley; you could perhaps make that happen by plugging a scaled arctangent or something along those lines into the sinusoid. </p> <p> Incidentally, a neighboring post in my feed reader was about a new release of an open-source data analysis and curve fitting program (QSoas) that might help if you don't want to take such a DIY approach. </p> </div> <div class="comment-date">2024-05-16 02:37 UTC</div> </div> <div class="comment" id="831d9f6360da4cbaa2ab5a08315b532a"> <div class="comment-author"><a href="/">Mark Seemann</a> <a href="#831d9f6360da4cbaa2ab5a08315b532a">#</a></div> <div class="comment-content"> <p> Aaron, thank you for writing. In retrospect, it becomes increasingly clear to me why this doesn't work. This highlights, I think, why it's a good idea to sometimes do stupid exercises like this one. You learn something from it, even when you fail. </p> </div> <div class="comment-date">2024-05-22 6:15 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Services share schema and contract, not class https://blog.ploeh.dk/2024/04/15/services-share-schema-and-contract-not-class 2024-04-15T07:25:00+00:00 Mark Seemann <div id="post"> <p> <em>A reading of the third Don Box tenet, with some commentary.</em> </p> <p> This article is part of a series titled <a href="/2024/03/04/the-four-tenets-of-soa-revisited">The four tenets of SOA revisited</a>. 
In each of these articles, I'll pull one of <a href="https://en.wikipedia.org/wiki/Don_Box">Don Box</a>'s <em>four tenets of service-oriented architecture</em> (SOA) out of the <a href="https://learn.microsoft.com/en-us/archive/msdn-magazine/2004/january/a-guide-to-developing-and-running-connected-systems-with-indigo">original MSDN Magazine article</a> and add some of my own commentary. If you're curious why I do that, I cover that in the introductory article. </p> <p> In this article, I'll go over the third tenet, quoting from the MSDN Magazine article unless otherwise indicated. </p> <h3 id="3a56e1083c454dec90a28f8c7ff44d5f"> Services share schema and contract, not class <a href="#3a56e1083c454dec90a28f8c7ff44d5f">#</a> </h3> <p> Compared to <a href="/2024/03/25/services-are-autonomous">the second tenet</a>, the following description may at first seem more dated. Here's what the article said: </p> <blockquote> <p> Object-oriented programming encourages developers to create new abstractions in the form of classes. Most modern development environments not only make it trivial to define new classes, modern IDEs do a better job guiding you through the development process as the number of classes increases (as features like IntelliSense® provide a more specific list of options for a given scenario). </p> <p> Classes are convenient abstractions as they share both structure and behavior in a single named unit. Service-oriented development has no such construct. Rather, services interact based solely on schemas (for structures) and contracts (for behaviors). Every service advertises a contract that describes the structure of messages it can send and/or receive as well as some degree of ordering constraints over those messages. 
This strict separation between structure and behavior vastly simplifies deployment, as distributed object concepts such as marshal-by-value require a common execution and security environment which is in direct conflict with the goals of autonomous computing. </p> <p> Services do not deal in types or classes per se; rather, only with machine readable and verifiable descriptions of the legal "ins and outs" the service supports. The emphasis on machine verifiability and validation is important given the inherently distributed nature of how a service-oriented application is developed and deployed. Unlike a traditional class library, a service must be exceedingly careful about validating the input data that arrives in each message. Basing the architecture on machine-validatible schema and contract gives both developers and infrastructure the hints they need to protect the integrity of an individual service as well as the overall application as a whole. </p> <p> Because the contract and schema for a given service are visible over broad ranges of both space and time, service-orientation requires that contracts and schema remain stable over time. In the general case, it is impossible to propagate changes in schema and/or contract to all parties who have ever encountered a service. For that reason, the contract and schema used in service-oriented designs tend to have more flexibility than traditional object-oriented interfaces. It is common for services to use features such as XML element wildcards (like xsd:any) and optional SOAP header blocks to evolve a service in ways that do not break already deployed code. </p> </blockquote> <p> With its explicit discussion of <a href="https://en.wikipedia.org/wiki/XML">XML</a>, <a href="https://en.wikipedia.org/wiki/SOAP">SOAP</a>, and <a href="https://en.wikipedia.org/wiki/XML_schema">XSD</a>, this description may seem more stuck in 2004 than the two first tenets. </p> <p> I'll cover the most obvious consequence first. 
</p> <h3 id="7ddbc0f966b74c499d0414de8741e454"> At the boundaries... <a href="#7ddbc0f966b74c499d0414de8741e454">#</a> </h3> <p> In the MSDN article, the four tenets guide the design of <a href="https://en.wikipedia.org/wiki/Windows_Communication_Foundation">Windows Communication Foundation</a> (WCF) - a technology that in 2004 was under development, but still not completed. While SOAP already existed as a platform-independent protocol, WCF was a .NET endeavour. Most developers using the Microsoft platform at the time were used to some sort of binary protocol, such as <a href="https://en.wikipedia.org/wiki/Distributed_Component_Object_Model">DCOM</a> or <a href="https://en.wikipedia.org/wiki/.NET_Remoting">.NET Remoting</a>. Thus, it makes sense that Don Box was deliberately explicit that this was <em>not</em> how SOA (or WCF) was supposed to work. </p> <p> In fact, since SOAP is platform-independent, you could write a web service in one language (say, <a href="https://www.java.com/">Java</a>) and consume it with a different language (e.g. <a href="https://en.wikipedia.org/wiki/C%2B%2B">C++</a>). WCF was Microsoft's SOAP technology for .NET. </p> <p> If you squint enough that you don't see the explicit references to XML or SOAP, however, the description still applies. Today, you may exchange data with <a href="https://www.json.org">JSON</a> over <a href="https://en.wikipedia.org/wiki/REST">REST</a>, <a href="https://en.wikipedia.org/wiki/Protocol_Buffers">Protocol Buffers</a> via <a href="https://en.wikipedia.org/wiki/GRPC">gRPC</a>, or something else, but it's still common to have a communications protocol that is independent of specific service implementations. A service may be written in <a href="https://www.python.org/">Python</a>, <a href="https://www.haskell.org/">Haskell</a>, <a href="https://en.wikipedia.org/wiki/C_(programming_language)">C</a>, or any other language that supports the wire format. 
As this little list suggests, the implementation language doesn't even have to be object-oriented. </p> <p> In fact, </p> <ul> <li><a href="/2011/05/31/AttheBoundaries,ApplicationsareNotObject-Oriented">At the Boundaries, Applications are Not Object-Oriented</a></li> <li><a href="/2022/05/02/at-the-boundaries-applications-arent-functional">At the boundaries, applications aren't functional</a></li> <li><a href="/2023/10/16/at-the-boundaries-static-types-are-illusory">At the boundaries, static types are illusory</a></li> </ul> <p> A formal <a href="https://en.wikipedia.org/wiki/Interface_description_language">interface definition language</a> (IDL) may enable you to automate serialization and deserialization, but these are usually constrained to defining the shape of data and operations. Don Box talks about validation, and <a href="/2022/08/22/can-types-replace-validation">types don't replace validation</a> - particularly if you allow <code>xsd:any</code>. That particular remark is quite at odds with the notion that a formal schema definition is necessary, or even desirable. </p> <p> And indeed, today we often see JSON-based REST APIs that are more loosely defined. Even so, the absence of a machine-readable IDL doesn't entail the absence of a schema. As <a href="https://lexi-lambda.github.io/">Alexis King</a> wrote related to the static-versus-dynamic-types debate, <a href="https://lexi-lambda.github.io/blog/2020/01/19/no-dynamic-type-systems-are-not-inherently-more-open/">dynamic type systems are not inherently more open</a>. A similar argument can be made about schema. Regardless of whether or not a formal specification exists, a service always has a de-facto schema. </p> <p> To be honest, though, when I try to interpret what this and the next tenet seem to imply, an IDL may have been all that Don Box had in mind. By <em>schema</em> he may only have meant XSD, and by <em>contract</em>, he may only have meant SOAP. 
More broadly speaking, this notion of <em>contract</em> may entail nothing more than a list of named operations, and references to schemas that indicate what input each operation takes, and what output it returns. </p> <p> What I have in mind with the rest of this article may be quite an embellishment on that notion. In fact, my usual interpretation of the word <em>contract</em> may be more aligned with what Don Box calls <em>policy</em>. Thus, if you want a very literal reading of the four tenets, what comes next may fit better with the fourth tenet, that service compatibility is determined based on policy. </p> <p> Regardless of whether you think that the following discussion belongs here, or in the next article, I'll assert that it's paramount to designing and developing useful and maintainable web services. </p> <h3 id="99146c84ab1d4d439879970bc17ca728"> Encapsulation <a href="#99146c84ab1d4d439879970bc17ca728">#</a> </h3> <p> If we, once more, ignore the particulars related to SOAP and XML, we may rephrase the notion of schema and contract as follows. Schema describes the shape of data: Is it a number, a string, a tuple, or a combination of these? Is there only one, or several? Is the data composed from smaller such definitions? Does the composition describe the combination of several such definitions, or does it describe mutually exclusive alternatives? </p> <p> Compliant data may be encoded as objects or data structures in memory, or serialized to JSON, XML, <a href="https://en.wikipedia.org/wiki/Comma-separated_values">CSV</a>, byte streams, etc. We may choose to call a particular agglomeration of data a <em>message</em>, which we may pass from one system to another. The <a href="/2024/03/11/boundaries-are-explicit">first tenet</a> already used this metaphor. </p> <p> You can't, however, just pass arbitrary valid messages from one system to another. Certain operations allow certain data, and may promise to return other kinds of messages. 
In addition to the schema, we also need to describe a <em>contract</em>. </p> <p> What's a contract? If you consult <a href="/ref/oosc">Object-Oriented Software Construction</a>, a contract stipulates invariants, preconditions, and postconditions for various operations. </p> <p> Preconditions state what must be true before an operation can take place. This often puts the responsibility on the caller to ensure that the system is in an appropriate state, and that the message that it intends to pass to the other system is valid according to that state. </p> <p> Postconditions, on the other hand, detail what the caller can expect in return. This includes guarantees about response messages, but may also describe the posterior state of the system. </p> <p> Invariants, finally, outline what is always true about the system. </p> <p> Although such a description of a contract originates from a book about object-oriented design, it's <a href="/2022/10/24/encapsulation-in-functional-programming">useful in other areas, too, such as functional programming</a>. It strikes me that it applies equally well in the context of service-orientation. </p> <p> The combination of contract and well-described message structure is, in other words, <a href="/encapsulation-and-solid">encapsulation</a>. There's nothing wrong with that: It works. If you actually apply it as a design principle, that is. </p> <h3 id="7d42dff045a24a4c89a894f8ed5d5166"> Conclusion <a href="#7d42dff045a24a4c89a894f8ed5d5166">#</a> </h3> <p> The third SOA tenet emphasizes that only data travels over service boundaries. In order to communicate effectively, services must agree on the shape of data, and which operations are legal when. While they exchange data, however, they don't share address space, or even internal representation. </p> <p> One service may be written in <a href="https://fsharp.org/">F#</a> and the client in <a href="https://clojure.org/">Clojure</a>.
Even so, it's important that they have a shared understanding of what is possible, and what is not. The more explicit you, as a service owner, can be, the better. </p> <p> <strong>Next:</strong> <a href="/2024/04/29/service-compatibility-is-determined-based-on-policy">Service compatibility is determined based on policy</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Extracting curve coordinates from a bitmap https://blog.ploeh.dk/2024/04/08/extracting-curve-coordinates-from-a-bitmap 2024-04-08T05:32:00+00:00 Mark Seemann <div id="post"> <p> <em>Another example of using Haskell as an ad-hoc scripting language.</em> </p> <p> This article is part of a short series titled <a href="/2024/04/01/trying-to-fit-the-hype-cycle">Trying to fit the hype cycle</a>. In the first article, I outlined what it is that I'm trying to do. In this article, I'll describe how I extract a set of <em>x</em> and <em>y</em> coordinates from this bitmap: </p> <p> <img src="/content/binary/hype-cycle-cleaned.png" alt="Gartner hype cycle."> </p> <p> (Actually, this is a scaled-down version of the image. The file I work with is a bit larger.) </p> <p> As I already mentioned in the previous article, these days there are online tools for just about everything. Most likely, there's also an online tool that will take a bitmap like that and return a set of <em>(x, y)</em> coordinates. </p> <p> Since I'm doing this for the programming exercise, I'm not interested in that. Rather, I'd like to write a little <a href="https://www.haskell.org/">Haskell</a> script to do it for me. </p> <h3 id="2ed7ee24ae244f3688dc8a362e149c17"> Module and imports <a href="#2ed7ee24ae244f3688dc8a362e149c17">#</a> </h3> <p> Yes, I wrote Haskell <em>script</em>.
As I've described before, with good type inference, <a href="/2024/02/05/statically-and-dynamically-typed-scripts">a statically typed language can be as good for scripting as a dynamic one</a>. Just as might be the case with, say, a <a href="https://www.python.org/">Python</a> script, you'll be iterating, trying things out until finally the script settles into its final form. What I present here is the result of my exercise. You should imagine that I made lots of mistakes along the way, tried things that didn't work, commented out code and print statements, imported modules I eventually didn't need, etc. Just like I imagine you'd also do with a script in a dynamically typed language. At least, that's how I write Python, when I'm forced to do that. </p> <p> In other words, the following is not the result of perfect foresight, but rather the equilibrium into which the script settled. </p> <p> I named the module <code>HypeCoords</code>, because the purpose of it is to extract the <em>(x, y)</em> coordinates from the above <a href="https://en.wikipedia.org/wiki/Gartner_hype_cycle">Gartner hype cycle</a> image. These are the imports that I ultimately turned out to need: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;HypeCoords&nbsp;<span style="color:blue;">where</span>
<span style="color:blue;">import</span>&nbsp;<span style="color:blue;">qualified</span>&nbsp;Data.List.NonEmpty&nbsp;<span style="color:blue;">as</span>&nbsp;NE
<span style="color:blue;">import</span>&nbsp;Data.List.NonEmpty&nbsp;(<span style="color:blue;">NonEmpty</span>((:|)))
<span style="color:blue;">import</span>&nbsp;Codec.Picture
<span style="color:blue;">import</span>&nbsp;Codec.Picture.Types</pre> </p> <p> The <code>Codec.Picture</code> modules come from the <a href="https://hackage.haskell.org/package/JuicyPixels">JuicyPixels</a> package. This is what enables me to read a <code>.png</code> file and extract the pixels.
</p> <h3 id="e0f66bef266249ea8a9546c0edf0b15c"> Black and white <a href="#e0f66bef266249ea8a9546c0edf0b15c">#</a> </h3> <p> If you look at the above bitmap, you may notice that it has some vertical lines in a lighter grey than the curve itself. My first task, then, is to get rid of those. The easiest way to do that is to convert the image to a black-and-white bitmap, with no grey scale. </p> <p> Since this is a one-off exercise, I could easily do that with a bitmap editor, but on the other hand, I thought that this was a good first task to give myself. After all, I didn't know the JuicyPixels library <em>at all</em>, so this was an opportunity to start with a task just a notch simpler than the one that was my actual goal. </p> <p> I thought that the easiest way to convert to a black-and-white image would be to turn all pixels white if they are lighter than some threshold, and black otherwise. </p> <p> A <a href="https://en.wikipedia.org/wiki/PNG">PNG</a> file has more information than I need, so I first converted the image to an 8-bit <a href="https://en.wikipedia.org/wiki/RGB_color_model">RGB</a> bitmap. Even though the above image looks as though it's entirely grey scale, each pixel is actually composed of three colours. In order to compare a pixel with a threshold, I needed a single measure of how light or dark it is. </p> <p> That turned out to be about as simple as it sounds: Just take the average of the three colours. 
Later, I'd need a function to compute the average for another reason, so I made it a reusable function: </p> <p> <pre><span style="color:#2b91af;">average</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">Integral</span>&nbsp;a&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:blue;">NE</span>.<span style="color:blue;">NonEmpty</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a average&nbsp;nel&nbsp;=&nbsp;<span style="color:blue;">sum</span>&nbsp;nel&nbsp;`div`&nbsp;<span style="color:blue;">fromIntegral</span>&nbsp;(NE.<span style="color:blue;">length</span>&nbsp;nel)</pre> </p> <p> It's a bit odd that the Haskell <a href="https://hackage.haskell.org/package/base">base</a> library doesn't come with such a function (at least to my knowledge), but anyway, this one is specialized to do integer division. Notice that this function computes only <a href="/2020/02/03/non-exceptional-averages">non-exceptional averages</a>, since it requires the input to be a <a href="https://hackage.haskell.org/package/base/docs/Data-List-NonEmpty.html">NonEmpty</a> list. No division-by-zero errors here, please! </p> <p> Once I'd computed a pixel average and compared it to a threshold value, I wanted to replace it with either black or white. 
In order to make the code more readable I defined two named constants: </p> <p> <pre><span style="color:#2b91af;">black</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">PixelRGB8</span> black&nbsp;=&nbsp;PixelRGB8&nbsp;<span style="color:blue;">minBound</span>&nbsp;<span style="color:blue;">minBound</span>&nbsp;<span style="color:blue;">minBound</span> <span style="color:#2b91af;">white</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">PixelRGB8</span> white&nbsp;=&nbsp;PixelRGB8&nbsp;<span style="color:blue;">maxBound</span>&nbsp;<span style="color:blue;">maxBound</span>&nbsp;<span style="color:blue;">maxBound</span></pre> </p> <p> With that in place, converting to black-and-white is only a few more lines of code: </p> <p> <pre><span style="color:#2b91af;">toBW</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">PixelRGB8</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">PixelRGB8</span> toBW&nbsp;(PixelRGB8&nbsp;r&nbsp;g&nbsp;b)&nbsp;= &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;threshold&nbsp;=&nbsp;192&nbsp;::&nbsp;Integer &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lum&nbsp;=&nbsp;average&nbsp;(<span style="color:blue;">fromIntegral</span>&nbsp;r&nbsp;:|&nbsp;[<span style="color:blue;">fromIntegral</span>&nbsp;g,&nbsp;<span style="color:blue;">fromIntegral</span>&nbsp;b]) &nbsp;&nbsp;<span style="color:blue;">in</span>&nbsp;<span style="color:blue;">if</span>&nbsp;lum&nbsp;&lt;=&nbsp;threshold&nbsp;<span style="color:blue;">then</span>&nbsp;black&nbsp;<span style="color:blue;">else</span>&nbsp;white</pre> </p> <p> I arrived at the threshold of <code>192</code> after a bit of trial-and-error. That's dark enough that the light vertical lines fall to the <code>white</code> side, while the real curve becomes <code>black</code>. 
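</p> <p> For comparison, the same thresholding idea can be sketched in Python, the other language that appears in this series. This is only an illustration, not part of the script described here: the names <code>to_bw</code>, <code>average</code>, <code>BLACK</code>, <code>WHITE</code>, and <code>THRESHOLD</code> are made up for the example, and pixels are assumed to be plain <code>(r, g, b)</code> tuples of integers between 0 and 255. </p>

```python
# Illustrative sketch only; all names here are hypothetical.
BLACK = (0, 0, 0)
WHITE = (255, 255, 255)
THRESHOLD = 192  # the value the article settles on after trial and error


def average(values):
    """Integer average of a non-empty sequence of ints."""
    if not values:
        raise ValueError("average requires at least one value")
    return sum(values) // len(values)


def to_bw(pixel, threshold=THRESHOLD):
    """Map an (r, g, b) tuple to BLACK or WHITE based on its average channel value."""
    return BLACK if average(pixel) <= threshold else WHITE
```

<p> With these definitions, <code>to_bw((200, 200, 200))</code> is <code>WHITE</code> and <code>to_bw((90, 90, 90))</code> is <code>BLACK</code>. Python integers have arbitrary precision, so the sum of three channel values can't overflow here, whereas the Haskell version widens each 8-bit channel with <code>fromIntegral</code> before averaging.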
</p> <p> What remained was to glue the parts together to save the black-and-white file: </p> <p> <pre><span style="color:#2b91af;">main</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;() main&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;readResult&nbsp;&lt;-&nbsp;readImage&nbsp;<span style="color:#a31515;">&quot;hype-cycle-cleaned.png&quot;</span> &nbsp;&nbsp;<span style="color:blue;">case</span>&nbsp;readResult&nbsp;<span style="color:blue;">of</span> &nbsp;&nbsp;&nbsp;&nbsp;Left&nbsp;msg&nbsp;-&gt;&nbsp;<span style="color:blue;">putStrLn</span>&nbsp;msg &nbsp;&nbsp;&nbsp;&nbsp;Right&nbsp;img&nbsp;-&gt;&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;bwImg&nbsp;=&nbsp;pixelMap&nbsp;toBW&nbsp;$&nbsp;convertRGB8&nbsp;img &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;writePng&nbsp;<span style="color:#a31515;">&quot;hype-cycle-bw.png&quot;</span>&nbsp;bwImg</pre> </p> <p> The <a href="https://hackage.haskell.org/package/JuicyPixels/docs/Codec-Picture.html#v:convertRGB8">convertRGB8</a> function comes from JuicyPixels. </p> <p> The <code>hype-cycle-bw.png</code> picture unsurprisingly looks like this: </p> <p> <img src="/content/binary/hype-cycle-bw.png" alt="Black-and-white Gartner hype cycle."> </p> <p> Ultimately, I didn't need the black-and-white bitmap <em>file</em>. I just wrote the script to create the file in order to be able to get some insights into what I was doing. Trust me, I made a lot of stupid mistakes along the way, and among other issues had some <a href="https://stackoverflow.com/q/77952762/126014">'fun' with integer overflows</a>. </p> <h3 id="2bd5b7d3dbd44e4a93594030bf5faca5"> Extracting image coordinates <a href="#2bd5b7d3dbd44e4a93594030bf5faca5">#</a> </h3> <p> Now I had a general feel for how to work with the JuicyPixels library. 
It still required quite a bit of spelunking through the documentation before I found a useful API to extract all the pixels from a bitmap: </p> <p> <pre><span style="color:#2b91af;">pixelCoordinates</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">Pixel</span>&nbsp;a&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:blue;">Image</span>&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;[((<span style="color:#2b91af;">Int</span>,&nbsp;<span style="color:#2b91af;">Int</span>),&nbsp;a)] pixelCoordinates&nbsp;=&nbsp;pixelFold&nbsp;(\acc&nbsp;x&nbsp;y&nbsp;px&nbsp;-&gt;&nbsp;((x,y),px):acc)&nbsp;<span style="color:blue;">[]</span></pre> </p> <p> While this is, after all, just a one-liner, I'm surprised that something like this doesn't come in the box. It returns a list of tuples, where the first element contains the pixel coordinates (another tuple), and the second element the pixel information (e.g. the RGB value). </p> <h3 id="2b2a30265e1b4577b845bb8d235a97eb"> One y value per x value <a href="#2b2a30265e1b4577b845bb8d235a97eb">#</a> </h3> <p> There were a few more issues to be addressed. The black curve in the black-and-white bitmap is thicker than a single pixel. This means that for each <em>x</em> value, there will be several black pixels. In order to do linear regression, however, we need a single <em>y</em> value per <em>x</em> value. </p> <p> One easy way to address that concern is to calculate the average <em>y</em> value for each <em>x</em> value. This may not always be the best choice, but as far as we can see in the above black-and-white image, it doesn't look as though there's any noise left in the picture. This means that we don't have to worry about outliers pulling the average value away from the curve. In other words, finding the average <em>y</em> value is an easy way to get what we need. 
</p> <p> <pre><span style="color:#2b91af;">averageY</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">Integral</span>&nbsp;b&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:blue;">NonEmpty</span>&nbsp;(a,&nbsp;b)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;(a,&nbsp;b) averageY&nbsp;nel&nbsp;=&nbsp;(<span style="color:blue;">fst</span>&nbsp;$&nbsp;NE.<span style="color:blue;">head</span>&nbsp;nel,&nbsp;average&nbsp;$&nbsp;<span style="color:blue;">snd</span>&nbsp;&lt;$&gt;&nbsp;nel)</pre> </p> <p> The <code>averageY</code> function converts a <code>NonEmpty</code> list of tuples to a single tuple. <em>Watch out!</em> The input tuples are not the 'outer' tuples that <code>pixelCoordinates</code> returns, but rather a list of actual pixel coordinates. Each tuple is a set of coordinates, but since the function never manipulates the <em>x</em> coordinate, the type of the first element is just unconstrained <code>a</code>. It can literally be anything, but will, in practice, be an integer. </p> <p> The assumption is that the input is a small list of coordinates that all share the same <em>x</em> coordinate, such as <code>(42, 99) :| [(42, 100), (42, 102)]</code>. The function simply returns a single tuple that it creates on the fly. For the first element of the return tuple, it picks the <code>head</code> tuple from the input (<code>(42, 99)</code> in the example), and then that tuple's <code>fst</code> element (<code>42</code>). For the second element, the function averages all the <code>snd</code> elements (<code>99</code>, <code>100</code>, and <code>102</code>) to get <code>100</code> (integer division, you may recall): </p> <p> <pre>ghci&gt; averageY ((42, 99) :| [(42, 100), (42, 102)]) (42,100)</pre> </p> <p> What remains is to glue together the building blocks. 
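</p> <p> The averaging step can also be sketched in Python, for readers who want to compare. This is a rough equivalent under stated assumptions, not code from the script: the function name <code>average_y_per_x</code> is hypothetical, the input is assumed to be a list of <code>(x, y)</code> integer tuples, and the grouping by <em>x</em> (which the Haskell script does with <code>NE.groupAllWith</code>) is folded into the same function. </p>

```python
from itertools import groupby
from operator import itemgetter


def average_y_per_x(coords):
    """Collapse (x, y) pairs to one (x, average y) pair per distinct x, sorted by x."""
    # groupby only groups adjacent items, so sort by x first.
    ordered = sorted(coords, key=itemgetter(0))
    result = []
    for x, group in groupby(ordered, key=itemgetter(0)):
        ys = [y for _, y in group]  # never empty: groupby only yields non-empty groups
        result.append((x, sum(ys) // len(ys)))  # integer division, like `div`
    return result
```

<p> Calling <code>average_y_per_x([(42, 99), (43, 10), (42, 100), (42, 102)])</code> returns <code>[(42, 100), (43, 10)]</code>, which matches the GHCi example for the group where <em>x</em> is 42. Because each group comes from <code>groupby</code>, the division by <code>len(ys)</code> can't be a division by zero, which is the same guarantee that <code>NonEmpty</code> gives the Haskell function.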
</p> <h3 id="305d72ac4dd94c41aa23f6581f7aa716"> Extracting curve coordinates <a href="#305d72ac4dd94c41aa23f6581f7aa716">#</a> </h3> <p> A few more steps were required, but these I just composed <em>in situ</em>. I found no need to define them as individual functions. </p> <p> The final composition looks like this: </p> <p> <pre><span style="color:#2b91af;">main</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:#2b91af;">IO</span>&nbsp;() main&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;readResult&nbsp;&lt;-&nbsp;readImage&nbsp;<span style="color:#a31515;">&quot;hype-cycle-cleaned.png&quot;</span> &nbsp;&nbsp;<span style="color:blue;">case</span>&nbsp;readResult&nbsp;<span style="color:blue;">of</span> &nbsp;&nbsp;&nbsp;&nbsp;Left&nbsp;msg&nbsp;-&gt;&nbsp;<span style="color:blue;">putStrLn</span>&nbsp;msg &nbsp;&nbsp;&nbsp;&nbsp;Right&nbsp;img&nbsp;-&gt;&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;bwImg&nbsp;=&nbsp;pixelMap&nbsp;toBW&nbsp;$&nbsp;convertRGB8&nbsp;img &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;blackPixels&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">fst</span>&nbsp;&lt;$&gt;&nbsp;<span style="color:blue;">filter</span>&nbsp;((black&nbsp;==)&nbsp;.&nbsp;<span style="color:blue;">snd</span>)&nbsp;(pixelCoordinates&nbsp;bwImg) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;h&nbsp;=&nbsp;imageHeight&nbsp;bwImg &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;lineCoords&nbsp;=&nbsp;<span style="color:blue;">fmap</span>&nbsp;(h&nbsp;-)&nbsp;.&nbsp;averageY&nbsp;&lt;$&gt;&nbsp;NE.groupAllWith&nbsp;<span style="color:blue;">fst</span>&nbsp;blackPixels &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">writeFile</span>&nbsp;<span 
style="color:#a31515;">&quot;coords.txt&quot;</span>&nbsp;$ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">unlines</span>&nbsp;$&nbsp;(\(x,y)&nbsp;-&gt;&nbsp;<span style="color:blue;">show</span>&nbsp;x&nbsp;++&nbsp;<span style="color:#a31515;">&quot;,&quot;</span>&nbsp;++&nbsp;<span style="color:blue;">show</span>&nbsp;y)&nbsp;&lt;$&gt;&nbsp;lineCoords</pre> </p> <p> The first lines of code, until and including <code>let bwImg</code>, are identical to what you've already seen. </p> <p> We're only interested in the black pixels, so the <code>main</code> action uses the standard <code>filter</code> function to keep only those that are equal to the <code>black</code> constant value. Once the white pixels are gone, we no longer need the pixel information. The expression that defines the <code>blackPixels</code> value finally (remember, you read Haskell code from right to left) throws away the pixel information by only retaining the <code>fst</code> element. That's the tuple that contains the coordinates. You may want to refer back to the type signature of <code>pixelCoordinates</code> to see what I mean. </p> <p> The <code>blackPixels</code> value has the type <code>[(Int, Int)]</code>. </p> <p> Two more things need to happen. One is to group the pixels together per <em>x</em> value so that we can use <code>averageY</code>. The other is that we want the coordinates as normal Cartesian coordinates, and right now, they're in screen coordinates. </p> <p> When working with bitmaps, it's quite common that pixels are measured out from the top left corner, instead of from the bottom left corner. It's not difficult to flip the coordinates, but we need to know the height of the image: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;h&nbsp;=&nbsp;imageHeight&nbsp;bwImg</pre> </p> <p> The <a href="https://hackage.haskell.org/package/JuicyPixels/docs/Codec-Picture.html#v:imageHeight">imageHeight</a> function is another JuicyPixels function. 
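</p> <p> The flip itself is a one-line calculation. As a minimal Python sketch (the function name <code>to_cartesian</code> is hypothetical, and the input is assumed to be a list of <code>(x, y)</code> integer tuples in screen coordinates): </p>

```python
def to_cartesian(coords, height):
    """Flip screen coordinates (origin top-left) to Cartesian ones (origin bottom-left)."""
    # Mirrors the Haskell expression `fmap (h -)`: only the y value changes.
    return [(x, height - y) for x, y in coords]
```

<p> For an image that is 100 pixels high, <code>to_cartesian([(9, 87)], 100)</code> returns <code>[(9, 13)]</code>. Note that subtracting from the height maps the bottom pixel row to 1 rather than 0. That constant offset doesn't change the shape of the curve.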
</p> <p> Because I sometimes get carried away, I write the code in a 'nice' compact style that could be more readable. I accomplished both of the above remaining tasks with a single line of code: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;lineCoords&nbsp;=&nbsp;<span style="color:blue;">fmap</span>&nbsp;(h&nbsp;-)&nbsp;.&nbsp;averageY&nbsp;&lt;$&gt;&nbsp;NE.groupAllWith&nbsp;<span style="color:blue;">fst</span>&nbsp;blackPixels</pre> </p> <p> This first groups the coordinates according to <em>x</em> value, so that all coordinates that share an <em>x</em> value are collected in a single <code>NonEmpty</code> list. This means that we can map <code>averageY</code> over all of those groups. Finally, the expression flips from screen coordinates to Cartesian coordinates by subtracting the <em>y</em> coordinate from the height <code>h</code>. </p> <p> The final <code>writeFile</code> expression writes the coordinates to a text file as <a href="https://en.wikipedia.org/wiki/Comma-separated_values">comma-separated values</a>. The first ten lines of that file look like this: </p> <p> <pre>9,13 10,13 11,13 12,14 13,15 14,15 15,16 16,17 17,17 18,18 ...</pre> </p> <p> Do these points plot the Gartner hype cycle? </p> <h3 id="eed82185c8cd43dda147bce839454ca9"> Sanity checking by plotting the coordinates <a href="#eed82185c8cd43dda147bce839454ca9">#</a> </h3> <p> To check whether the coordinates look useful, we could plot them.
If I wanted to spend a few more hours, I could probably figure out how to do that with JuicyPixels as well, but on the other hand, I already know how to do that with Python: </p> <p> <pre><span style="color:blue;">import</span>&nbsp;numpy
<span style="color:blue;">import</span>&nbsp;matplotlib.pyplot&nbsp;<span style="color:blue;">as</span>&nbsp;plt

data&nbsp;=&nbsp;numpy.loadtxt(<span style="color:#a31515;">&#39;coords.txt&#39;</span>,&nbsp;delimiter=<span style="color:#a31515;">&#39;,&#39;</span>)
x&nbsp;=&nbsp;data[:,&nbsp;0]
t&nbsp;=&nbsp;data[:,&nbsp;1]
plt.scatter(x,&nbsp;t,&nbsp;s=10,&nbsp;c=<span style="color:#a31515;">&#39;g&#39;</span>)
plt.show()</pre> </p> <p> That produces this plot: </p> <p> <img src="/content/binary/hype-cycle-pyplot.png" alt="Coordinates plotted with Python."> </p> <p> LGTM. </p> <h3 id="9836c90de9ac487f9295acfb667090b7"> Conclusion <a href="#9836c90de9ac487f9295acfb667090b7">#</a> </h3> <p> In this article, you've seen how a single Haskell script can extract curve coordinates from a bitmap. The file is 41 lines all in all, including module declaration and white space. This article shows every single line in that file, apart from some blank lines. </p> <p> I loaded the file into GHCi and ran the <code>main</code> action in order to produce the CSV file. </p> <p> I did spend a few hours looking around in the JuicyPixels documentation before I'd identified the functions that I needed. All in all, I spent some hours on this exercise. I didn't keep track of time, but I guess that I spent more than three, but probably fewer than six, hours on it. </p> <p> This was the successful part of the overall exercise. Now onto the fiasco. </p> <p> <strong>Next:</strong> <a href="/2024/04/22/fitting-a-polynomial-to-a-set-of-points">Fitting a polynomial to a set of points</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>.
Trying to fit the hype cycle https://blog.ploeh.dk/2024/04/01/trying-to-fit-the-hype-cycle 2024-04-01T07:14:00+00:00 Mark Seemann <div id="post"> <p> <em>An amateur tries his hand at linear modelling.</em> </p> <p> About a year ago, I was contemplating a conference talk I was going to give. Although I later abandoned the idea for other reasons, for a few days I was thinking about using the <a href="https://en.wikipedia.org/wiki/Gartner_hype_cycle">Gartner hype cycle</a> for an animation. What I had in mind would require me to draw the curve in a way that would enable me to zoom in and out. Vector graphics would be much more useful for that job than a bitmap. </p> <p> <img src="/content/binary/hype-cycle-cleaned.png" alt="Gartner hype cycle."> </p> <p> Along the way, I considered if there was a <a href="https://en.wikipedia.org/wiki/Function_(mathematics)">function</a> that would enable me to draw it on the fly. A few web searches revealed the <a href="https://stats.stackexchange.com/">Cross Validated</a> question <a href="https://stats.stackexchange.com/q/268293/397132">Is there a linear/mixture function that can fit the Gartner hype curve?</a> So I wasn't the first person to have that idea, but at the time I found it, the question was effectively dismissed without a proper answer. Off topic, dontcha know? </p> <p> A web search also seems to indicate the existence of a few research papers where people have undertaken this task, but there's not a lot about it. True, the Gartner hype cycle isn't a real function, but it sounds like a relevant exercise in statistics, if one's into that kind of thing. </p> <p> Eventually, for my presentation, I went with another way to illustrate what I wanted to say, so for half a year, I didn't think more about it. </p> <h3 id="f3bfad5e6e80409e9703c80b1c98099b"> Linear regression?
<a href="#f3bfad5e6e80409e9703c80b1c98099b">#</a> </h3> <p> Recently, however, I was following a course in mathematical analysis of data, and among other things, I learned how to fit a line to data. Not just a straight line, but any degree of <a href="https://en.wikipedia.org/wiki/Polynomial">polynomial</a>. So I thought that perhaps it'd be an interesting exercise to see if I could fit the hype cycle to some high-degree polynomial - even though I do realize that the hype cycle isn't a real function, and neither does it look like a straight polynomial function. </p> <p> In order to fit a polynomial to the curve, I needed some data, so my first task was to convert an image to a series of data points. </p> <p> I'm sure that there are online tools and apps that offer to do that for me, but the whole point of this was that I wanted to learn how to tackle problems like these. It's like <a href="/2020/01/13/on-doing-katas">doing katas</a>. The journey is the goal. </p> <p> This turned out to be an exercise consisting of two phases so distinct that I wrote them in two different languages. </p> <ul> <li><a href="/2024/04/08/extracting-curve-coordinates-from-a-bitmap">Extracting curve coordinates from a bitmap</a></li> <li><a href="/2024/04/22/fitting-a-polynomial-to-a-set-of-points">Fitting a polynomial to a set of points</a></li> </ul> <p> As the articles will reveal, the first part went quite well, while the other was, essentially, a fiasco. </p> <h3 id="fc418f36d6c74aa2a056b48489be7162"> Conclusion <a href="#fc418f36d6c74aa2a056b48489be7162">#</a> </h3> <p> There's not much point in finding a formula for the Gartner hype cycle, but the goal of this exercise was, for me, to tinker with some new techniques to see if I could learn from doing the exercise. And I <em>did</em> learn something. </p> <p> In the next articles in this series, I'll go over some of the details. 
</p> <p> <strong>Next:</strong> <a href="/2024/04/08/extracting-curve-coordinates-from-a-bitmap">Extracting curve coordinates from a bitmap</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Services are autonomous https://blog.ploeh.dk/2024/03/25/services-are-autonomous 2024-03-25T08:31:00+00:00 Mark Seemann <div id="post"> <p> <em>A reading of the second Don Box tenet, with some commentary.</em> </p> <p> This article is part of a series titled <a href="/2024/03/04/the-four-tenets-of-soa-revisited">The four tenets of SOA revisited</a>. In each of these articles, I'll pull one of <a href="https://en.wikipedia.org/wiki/Don_Box">Don Box</a>'s <em>four tenets of service-oriented architecture</em> (SOA) out of the <a href="https://learn.microsoft.com/en-us/archive/msdn-magazine/2004/january/a-guide-to-developing-and-running-connected-systems-with-indigo">original MSDN Magazine article</a> and add some of my own commentary. If you're curious why I do that, I cover that in the introductory article. </p> <p> In this article, I'll go over the second tenet. The quotes are from the MSDN Magazine article unless otherwise indicated. </p> <h3 id="5021be8510304665ba3a8b9d9287a531"> Services are autonomous <a href="#5021be8510304665ba3a8b9d9287a531">#</a> </h3> <p> Compared with <a href="/2024/03/11/boundaries-are-explicit">the first tenet</a>, you'll see that Don Box had more to say about this one. I, conversely, have less to add. First, here's what the article said: </p> <blockquote> <p> Service-orientation mirrors the real world in that it does not assume the presence of an omniscient or omnipotent oracle that has awareness and control over all parts of a running system. This notion of service autonomy appears in several facets of development, the most obvious place being the area of deployment and versioning. </p> <p> Object-oriented programs tend to be deployed as a unit. 
Despite the Herculean efforts made in the 1990s to enable classes to be independently deployed, the discipline required to enable object-oriented interaction with a component proved to be impractical for most development organizations. When coupled with the complexities of versioning object-oriented interfaces, many organizations have become extremely conservative in how they roll out object-oriented code. The popularity of the XCOPY deployment and private assemblies capabilities of the .NET Framework is indicative of this trend. </p> <p> Service-oriented development departs from object-orientation by assuming that atomic deployment of an application is the exception, not the rule. While individual services are almost always deployed atomically, the aggregate deployment state of the overall system/application rarely stands still. It is common for an individual service to be deployed long before any consuming applications are even developed, let alone deployed into the wild. Amazon.com is one example of this build-it-and-they-will-come philosophy. There was no way the developers at Amazon could have known the multitude of ways their service would be used to build interesting and novel applications. </p> <p> It is common for the topology of a service-oriented application to evolve over time, sometimes without direct intervention from an administrator or developer. The degree to which new services may be introduced into a service-oriented system depends on both the complexity of the service interaction and the ubiquity of services that interact in a common way. Service-orientation encourages a model that increases ubiquity by reducing the complexity of service interactions. As service-specific assumptions leak into the public facade of a service, fewer services can reasonably mimic that facade and stand in as a reasonable substitute. </p> <p> The notion of autonomous services also impacts the way failures are handled. 
Objects are deployed to run in the same execution context as the consuming application. Service-oriented designs assume that this situation is the exception, not the rule. For that reason, services expect that the consuming application can fail without notice and often without any notification. To maintain system integrity, service-oriented designs use a variety of techniques to deal with partial failure modes. Techniques such as transactions, durable queues, and redundant deployment and failover are quite common in a service-oriented system. </p> <p> Because many services are deployed to function over public networks (such as the Internet), service-oriented development assumes not only that incoming message data may be malformed but also that it may have been transmitted for malicious purposes. Service-oriented architectures protect themselves by placing the burden of proof on all message senders by requiring applications to prove that all required rights and privileges have been granted. Consistent with the notion of service autonomy, service-oriented architectures invariably rely on administratively managed trust relationships in order to avoid per-service authentication mechanisms common in classic Web applications. </p> </blockquote> <p> Again, I'd like to highlight how general these ideas are. Once lifted out of the context of <a href="https://en.wikipedia.org/wiki/Windows_Communication_Foundation">Windows Communication Foundation</a>, all of this applies more broadly. </p> <p> Perhaps a few details now seem dated, but in general I find that this description holds up well. </p> <h3 id="f921c1135edd46d688729181489a9c73"> Wildlife <a href="#f921c1135edd46d688729181489a9c73">#</a> </h3> <p> It's striking that someone in 2004 observed that big, complex, coordinated releases are impractical. Even so, it doesn't seem as though adopting a network-based technology and architecture in itself solves that problem. 
<a href="/2012/12/18/ZookeepersmustbecomeRangers">I wrote about that in 2012</a>, and I've seen <a href="https://youtu.be/jdliXz70NtM?si=NRSHFqaVHMvWnOPF">Adam Ralph make a similar observation</a>. Many organizations inadvertently create distributed monoliths. I think that this often stems from a failure to heed the tenet that services are autonomous. </p> <p> I've experienced the following more than once. A team of developers rely on a service. As they take on a new feature, they realize that the way things are currently modelled prevents them from moving forward. Typical examples include mismatched cardinalities. For example, a customer record has a single active address, but the new feature requires that customers may have multiple active addresses. It could be that a customer has a permanent address, but also a summerhouse. </p> <p> It is, however, the other service that defines how customer addresses are modelled, so the development team contacts the service team to discuss a breaking change. The service team agrees to the breaking change, but this means that the service and the relying client team now have to coordinate when they deploy the new versions of their software. The service is no longer autonomous. </p> <p> I've already discussed this kind of problem in <a href="/2023/11/27/synchronizing-concurrent-teams">a previous article</a>, and as Don Box also implied, this discussion is related to the question of versioning, which we'll get back to when covering the fourth tenet. </p> <h3 id="11028dabd5a540cf9160c06c3e1b283c"> Transactions <a href="#11028dabd5a540cf9160c06c3e1b283c">#</a> </h3> <p> It may be worthwhile to comment on this sentence: </p> <blockquote> <p> Techniques such as transactions, durable queues, and redundant deployment and failover are quite common in a service-oriented system.
</p> </blockquote> <p> Indeed, but particularly regarding database transactions, a service may use them <em>internally</em> (typically leveraging a database engine like <a href="https://en.wikipedia.org/wiki/Microsoft_SQL_Server">SQL Server</a>, <a href="https://en.wikipedia.org/wiki/Oracle_Database">Oracle</a>, <a href="https://en.wikipedia.org/wiki/PostgreSQL">PostgreSQL</a>, etc.), but not across services. Around the time Don Box wrote the original MSDN Magazine article, an extension to SOAP colloquially known as <em>WS-Death Star</em> was in the works, and it included <a href="https://en.wikipedia.org/wiki/WS-Transaction">WS-Transaction</a>. </p> <p> I don't know whether Don Box had something like this in mind when he wrote the word <em>transaction</em>, but in my experience, you don't want to go there. If you need to, you can make use of database transactions to keep your own service <a href="https://en.wikipedia.org/wiki/ACID">ACID</a>-consistent, but don't presume that this is possible with multiple autonomous services. </p> <p> As always, even if a catchphrase such as <em>services are autonomous</em> sounds good, it's always illuminating to understand that there are trade-offs involved - and what they are. Here, a major trade-off is that you need to think about error-handling in a different way. If you don't already know how to address such concerns, look up <em>lock-free transactions</em> and <a href="https://en.wikipedia.org/wiki/Eventual_consistency">eventual consistency</a>. As Don Box also mentioned, durable queues are often part of such a solution, as is <a href="https://en.wikipedia.org/wiki/Idempotence">idempotence</a>. </p> <h3 id="7dc237c5f67c42c8b2c439140fc7a05b"> Validation <a href="#7dc237c5f67c42c8b2c439140fc7a05b">#</a> </h3> <p> From this discussion it follows that an autonomous service should, ideally, exist independently of the software ecosystem in which it exists.
While an individual service can't impose its will on its surroundings, it can, and should, behave in a consistent and correct manner. </p> <p> This does include deliberate consistency for the service itself. An autonomous service may make use of ACID or eventual consistency as the service owner deems appropriate. </p> <p> It should also treat all input as suspect, until proven otherwise. Input validation is an important part of service design. It is my belief that <a href="/2020/12/14/validation-a-solved-problem">validation is a solved problem</a>, but that doesn't mean that you don't have to put in the work. You should consider correctness, versioning, as well as <a href="https://en.wikipedia.org/wiki/Robustness_principle">Postel's law</a>. </p> <h3 id="2482dbc1c20248fdb61a7347abce49ef"> Security <a href="#2482dbc1c20248fdb61a7347abce49ef">#</a> </h3> <p> A similar observation relates to security. Some services (particularly read-only services) may allow for anonymous access, but if a service needs to authenticate or authorize requests, consider how this is done in an autonomous manner. Looking up account information in a centralized database isn't the autonomous way. If a service does that, it now relies on the account database, and is no longer autonomous. </p> <p> Instead, rely on <a href="https://en.wikipedia.org/wiki/Claims-based_identity">claims-based identity</a>. In my experience, <a href="https://en.wikipedia.org/wiki/OAuth">OAuth</a> with <a href="https://en.wikipedia.org/wiki/JSON_Web_Token">JWT</a> is usually fine. </p> <p> If your service needs to know something about the user that only an external source can tell it, don't look it up in an external system. Instead, demand that it's included in the JWT as a claim. Do you need to validate the age of the user? Require a <em>date-of-birth</em> or <em>age</em> claim. Do you need to know if the request is made on behalf of a system administrator? Demand a list of <em>role</em> claims. 
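</p> <p> As a minimal Python sketch of what that could look like, assume that a JWT library has already verified the token's signature and decoded its payload into a dictionary. The claim names <code>date-of-birth</code> and <code>role</code>, the role value <code>admin</code>, and the function names are all hypothetical; a real system would use whatever claims the identity provider and the service have agreed on. </p>

```python
from datetime import date


def age_from_claims(claims, today):
    """Whole-year age computed from a hypothetical 'date-of-birth' claim (ISO 8601 date)."""
    dob = date.fromisoformat(claims["date-of-birth"])
    had_birthday = (today.month, today.day) >= (dob.month, dob.day)
    return today.year - dob.year - (0 if had_birthday else 1)


def is_admin(claims):
    """True if the hypothetical 'role' claim lists an administrator role."""
    return "admin" in claims.get("role", [])
```

<p> The point of the sketch is that both functions only look at data that arrived with the request. Neither of them calls out to an account database or any other external system, so the service can make its decision without depending on another service being available.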
</p> <h3 id="75412f1e737a45dfaaf11c54e28013fa"> Conclusion <a href="#75412f1e737a45dfaaf11c54e28013fa">#</a> </h3> <p> The second of Don Box's four tenets of SOA states that services should be autonomous. At first glance, you may think that all this means is that a service shouldn't share its database with another service. That is, however, a minimum bar. You need to consider how a service exists in an environment that it doesn't control. Again, the <a href="/2012/12/18/RangersandZookeepers">wildlife metaphor</a> seems apt. Particularly if your service is exposed to the internet, it lives in a hostile environment. </p> <p> Not only should you consider all input belligerent, you must also take into account that friendly systems may disappear or change. Your service exists by itself, supported by itself, relying on itself. If you need to coordinate work with other service owners, that's a strong hint that your service isn't, after all, autonomous. </p> <p> <strong>Next:</strong> <a href="/2024/04/15/services-share-schema-and-contract-not-class">Services share schema and contract, not class</a>. </p> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Extracting data from a small CSV file with Python https://blog.ploeh.dk/2024/03/18/extracting-data-from-a-small-csv-file-with-python 2024-03-18T08:36:00+00:00 Mark Seemann <div id="post"> <p> <em>My inept adventures with a dynamically typed language.</em> </p> <p> This article is the third in <a href="/2024/02/05/statically-and-dynamically-typed-scripts">a small series about ad-hoc programming in two languages</a>.
In <a href="/2024/02/19/extracting-data-from-a-small-csv-file-with-haskell">the previous article</a> you saw how I originally solved a small data extraction and analysis problem with <a href="https://www.haskell.org/">Haskell</a>, even though it was strongly implied that <a href="https://www.python.org/">Python</a> was the language for the job. </p> <p> Months after having solved the problem I'd learned a bit more Python, so I decided to return to it and do it again in Python as an exercise. In this article, I'll briefly describe what I did. </p> <h3 id="590b0c98bf064ac0b8893ae41d398daa"> Reading CSV data <a href="#590b0c98bf064ac0b8893ae41d398daa">#</a> </h3> <p> When writing Python, I feel the way I suppose a script kiddie might feel. I cobble together code based on various examples I've seen somewhere else, without a full or deep understanding of what I'm doing. There's more than a hint of <a href="/ref/pragmatic-programmer">programming by coincidence</a>, I'm afraid. One thing I've picked up along the way is that I can use <a href="https://pandas.pydata.org/">pandas</a> to read a <a href="https://en.wikipedia.org/wiki/Comma-separated_values">CSV file</a>: </p> <p> <pre>data&nbsp;=&nbsp;pd.read_csv(<span style="color:#a31515;">&#39;survey_data.csv&#39;</span>,&nbsp;header=<span style="color:blue;">None</span>) grades&nbsp;=&nbsp;data.iloc[:,&nbsp;2] experiences&nbsp;=&nbsp;data.iloc[:,&nbsp;3]</pre> </p> <p> In order for this to work, I needed to import <code>pandas</code>. 
Ultimately, my imports looked like this: </p> <p> <pre><span style="color:blue;">import</span>&nbsp;pandas&nbsp;<span style="color:blue;">as</span>&nbsp;pd <span style="color:blue;">from</span>&nbsp;collections&nbsp;<span style="color:blue;">import</span>&nbsp;Counter <span style="color:blue;">from</span>&nbsp;itertools&nbsp;<span style="color:blue;">import</span>&nbsp;combinations,&nbsp;combinations_with_replacement <span style="color:blue;">import</span>&nbsp;matplotlib.pyplot&nbsp;<span style="color:blue;">as</span>&nbsp;plt</pre> </p> <p> In other Python code that I've written, I've been a heavy user of <a href="https://numpy.org/">NumPy</a>, and while I several times added it to my imports, I never needed it for this task. That was a bit surprising, but I've only done Python programming for a year, and I still don't have a good feel for the ecosystem. </p> <p> The above code snippet also demonstrates how easy it is to slice a <em>dataframe</em> into columns: <code>grades</code> contains all the values in the (zero-indexed) second column, and <code>experiences</code> likewise the third column. </p> <h3 id="2a5c679e37394960acf5cf283abd41d5"> Sum of grades <a href="#2a5c679e37394960acf5cf283abd41d5">#</a> </h3> <p> All the trouble that I had with binomial choice without replacement in my Haskell code is handled by <code>combinations</code>, which happily handles duplicate values: </p> <p> <pre>&gt;&gt;&gt; list(combinations('foo', 2)) [('f', 'o'), ('f', 'o'), ('o', 'o')]</pre> </p> <p> Notice that <code>combinations</code> doesn't list <code>('o', 'f')</code>, since (apparently) it doesn't consider ordering important. That's more in line with the <a href="https://en.wikipedia.org/wiki/Binomial_coefficient">binomial coefficient</a>, whereas <a href="/2024/02/19/extracting-data-from-a-small-csv-file-with-haskell">my Haskell code</a> considers a tuple like <code>('f', 'o')</code> to be distinct from <code>('o', 'f')</code>.
This is completely consistent with how Haskell works, but means that all the counts I arrived at with Haskell are double what they are in this article. Ultimately, <em>6/1406</em> is equal to <em>3/703</em>, so the probabilities are the same. I'll try to call out this factor-of-two difference whenever it occurs. </p> <p> A <code>Counter</code> object counts the number of occurrences of each value, so reading, picking combinations without replacement and adding them together is just two lines of code, and one more to print them: </p> <p> <pre>sumOfGrades&nbsp;=&nbsp;Counter(<span style="color:blue;">map</span>(<span style="color:blue;">sum</span>,&nbsp;combinations(grades,&nbsp;2))) sumOfGrades&nbsp;=&nbsp;<span style="color:blue;">sorted</span>(sumOfGrades.items(),&nbsp;key=<span style="color:blue;">lambda</span>&nbsp;item:&nbsp;item[0]) <span style="color:blue;">print</span>(<span style="color:blue;">f</span><span style="color:#a31515;">&#39;Sums&nbsp;of&nbsp;grades:&nbsp;</span>{sumOfGrades}<span style="color:#a31515;">&#39;</span>)</pre> </p> <p> The output is: </p> <p> <pre>Sums of grades: [(0, 3), (2, 51), (4, 157), (6, 119), (7, 24), (8, 21), (9, 136), (10, 3), (11, 56), (12, 23), (14, 69), (16, 14), (17, 8), (19, 16), (22, 2), (24, 1)]</pre> </p> <p> (Formatting courtesy of yours truly.) </p> <p> As already mentioned, these values are off by a factor of two compared to the previous Haskell code, but since I'll ultimately be dealing in ratios, it doesn't matter. What this output indicates is that the sum <em>0</em> occurs three times, the sum <em>2</em> appears <em>51</em> times, and so on. </p> <p> This is where I, in my Haskell code, dropped down to a few ephemeral <a href="https://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop">REPL</a>-based queries that enabled me to collect enough information to paste into Excel in order to produce a figure.
In Python, however, I have <a href="https://matplotlib.org/">Matplotlib</a>, which means that I can create the desired plots entirely in code. It does require that I write a bit more code, though. </p> <p> First, I need to calculate the range of the <a href="https://www.probabilitycourse.com/chapter3/3_1_3_pmf.php">Probability Mass Function</a> (PMF), since there are values that are possible, but not represented in the above data set. To calculate all possible values in the PMF's range, I use <code>combinations_with_replacement</code> against the <a href="https://en.wikipedia.org/wiki/Academic_grading_in_Denmark">Danish grading scale</a>. </p> <p> <pre>grade_scale&nbsp;=&nbsp;[-3,&nbsp;0,&nbsp;2,&nbsp;4,&nbsp;7,&nbsp;10,&nbsp;12] sumOfGradesRange&nbsp;=&nbsp;<span style="color:#2b91af;">set</span>(<span style="color:blue;">map</span>(<span style="color:blue;">sum</span>,&nbsp;combinations_with_replacement(grade_scale,&nbsp;2))) sumOfGradesRange&nbsp;=&nbsp;<span style="color:blue;">sorted</span>(sumOfGradesRange) <span style="color:blue;">print</span>(<span style="color:blue;">f</span><span style="color:#a31515;">&#39;Range&nbsp;of&nbsp;sums&nbsp;of&nbsp;grades:&nbsp;</span>{sumOfGradesRange}<span style="color:#a31515;">&#39;</span>)</pre> </p> <p> The output is this: </p> <p> <pre>Range of sums of grades: [-6, -3, -1, 0, 1, 2, 4, 6, 7, 8, 9, 10, 11, 12, 14, 16, 17, 19, 20, 22, 24]</pre> </p> <p> Next, I create a dictionary of all possible grades, initializing all entries to zero, but then updating that dictionary with the observed values, where they are present: </p> <p> <pre>probs&nbsp;=&nbsp;<span style="color:#2b91af;">dict</span>.fromkeys(sumOfGradesRange,&nbsp;0) probs.update(<span style="color:#2b91af;">dict</span>(sumOfGrades))</pre> </p> <p> Finally, I recompute the dictionary entries to probabilities. 
</p> <p> <pre>total&nbsp;=&nbsp;<span style="color:blue;">sum</span>(x[1]&nbsp;<span style="color:blue;">for</span>&nbsp;x&nbsp;<span style="color:blue;">in</span>&nbsp;sumOfGrades) <span style="color:blue;">for</span>&nbsp;k,&nbsp;v&nbsp;<span style="color:blue;">in</span>&nbsp;probs.items(): &nbsp;&nbsp;&nbsp;&nbsp;probs[k]&nbsp;=&nbsp;v&nbsp;/&nbsp;total</pre> </p> <p> Now I have all the data needed to plot the desired bar chart: </p> <p> <pre>plt.bar(probs.keys(),&nbsp;probs.values()) plt.xlabel(<span style="color:#a31515;">&#39;Sum&#39;</span>) plt.ylabel(<span style="color:#a31515;">&#39;Probability&#39;</span>) plt.show()</pre> </p> <p> The result looks like this: </p> <p> <img src="/content/binary/sum-pmf-plot.png" alt="Bar chart of the sum-of-grades PMF."> </p> <p> While I'm already on line 34 in my Python file, with one more question to answer, I've written proper code in order to produce data that I only wrote ephemeral queries for in Haskell. </p> <h3 id="8831d23c67bd48e9b22db86ca3c21bd4"> Difference of experiences <a href="#8831d23c67bd48e9b22db86ca3c21bd4">#</a> </h3> <p> The next question is almost a repetition of the first one, and I've addressed it by copying and pasting. After all, it's only <em>duplication</em>, not <em>triplication</em>, so I can always invoke the <a href="https://en.wikipedia.org/wiki/Rule_of_three_(computer_programming)">Rule of Three</a>.
Furthermore, this is a one-off script that I don't expect to have to maintain in the future, so copy-and-paste, here we go: </p> <p> <pre>diffOfExperiances&nbsp;=&nbsp;\ &nbsp;&nbsp;&nbsp;&nbsp;Counter(<span style="color:blue;">map</span>(<span style="color:blue;">lambda</span>&nbsp;x:&nbsp;<span style="color:blue;">abs</span>(x[0]&nbsp;-&nbsp;x[1]),&nbsp;combinations(experiences,&nbsp;2))) diffOfExperiances&nbsp;=&nbsp;<span style="color:blue;">sorted</span>(diffOfExperiances.items(),&nbsp;key=<span style="color:blue;">lambda</span>&nbsp;item:&nbsp;item[0]) <span style="color:blue;">print</span>(<span style="color:blue;">f</span><span style="color:#a31515;">&#39;Differences&nbsp;of&nbsp;experiences:&nbsp;</span>{diffOfExperiances}<span style="color:#a31515;">&#39;</span>) experience_scale&nbsp;=&nbsp;<span style="color:#2b91af;">list</span>(<span style="color:blue;">range</span>(1,&nbsp;8)) diffOfExperiancesRange&nbsp;=&nbsp;<span style="color:#2b91af;">set</span>(\ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">map</span>(<span style="color:blue;">lambda</span>&nbsp;x:&nbsp;<span style="color:blue;">abs</span>(x[0]&nbsp;-&nbsp;x[1]),\ &nbsp;&nbsp;&nbsp;&nbsp;combinations_with_replacement(experience_scale,&nbsp;2))) diffOfExperiancesRange&nbsp;=&nbsp;<span style="color:blue;">sorted</span>(diffOfExperiancesRange) probs&nbsp;=&nbsp;<span style="color:#2b91af;">dict</span>.fromkeys(diffOfExperiancesRange,&nbsp;0) probs.update(<span style="color:#2b91af;">dict</span>(diffOfExperiances)) total&nbsp;=&nbsp;<span style="color:blue;">sum</span>(x[1]&nbsp;<span style="color:blue;">for</span>&nbsp;x&nbsp;<span style="color:blue;">in</span>&nbsp;diffOfExperiances) <span style="color:blue;">for</span>&nbsp;k,&nbsp;v&nbsp;<span style="color:blue;">in</span>&nbsp;probs.items(): &nbsp;&nbsp;&nbsp;&nbsp;probs[k]&nbsp;=&nbsp;v&nbsp;/&nbsp;total <span style="color:green;">#&nbsp;Plot&nbsp;the&nbsp;histogram&nbsp;of&nbsp;differences&nbsp;of&nbsp;experiences</span> 
plt.bar(probs.keys(),&nbsp;probs.values()) plt.xlabel(<span style="color:#a31515;">&#39;Difference&#39;</span>) plt.ylabel(<span style="color:#a31515;">&#39;Probability&#39;</span>) plt.show()</pre> </p> <p> The bar chart has the same style as before, but obviously displays different data. See the bar chart in the <a href="/2024/02/19/extracting-data-from-a-small-csv-file-with-haskell">previous article</a> for the Excel-based rendition of that data. </p> <h3 id="8d7d707edeba43c59d07b5753a4bdb2d"> Conclusion <a href="#8d7d707edeba43c59d07b5753a4bdb2d">#</a> </h3> <p> The Python code runs to 61 lines of code, compared with the 34 lines of Haskell code. The Python code, however, is much more complete than the Haskell code, since it also contains the code that computes the range of each PMF, as well as code that produces the figures. </p> <p> Like the Haskell code, it took me a couple of hours to produce this, so I can't say that I feel much more productive in Python than in Haskell. On the other hand, I also acknowledge that I have less experience writing Python code. If I had to do a lot of ad-hoc data crunching like this, I can see how Python is useful. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Boundaries are explicit https://blog.ploeh.dk/2024/03/11/boundaries-are-explicit 2024-03-11T08:03:00+00:00 Mark Seemann <div id="post"> <p> <em>A reading of the first Don Box tenet, with some commentary.</em> </p> <p> This article is part of a series titled <a href="/2024/03/04/the-four-tenets-of-soa-revisited">The four tenets of SOA revisited</a>. 
In each of these articles, I'll pull one of <a href="https://en.wikipedia.org/wiki/Don_Box">Don Box</a>'s <em>four tenets of service-oriented architecture</em> (SOA) out of the <a href="https://learn.microsoft.com/en-us/archive/msdn-magazine/2004/january/a-guide-to-developing-and-running-connected-systems-with-indigo">original MSDN Magazine article</a> and add some of my own commentary. If you're curious why I do that, I cover that in the introductory article. </p> <p> In this article, I'll go over the first tenet, quoting from the MSDN Magazine article unless otherwise indicated. </p> <h3 id="3d25f37d4da8482fa846b8660823b8cd"> Boundaries are explicit <a href="#3d25f37d4da8482fa846b8660823b8cd">#</a> </h3> <p> This tenet was the one I struggled with the most. It took me a long time to come to grips with how to apply it, but I'll get back to that in a moment. First, here's what the article said: </p> <blockquote> <p> A service-oriented application often consists of services that are spread over large geographical distances, multiple trust authorities, and distinct execution environments. The cost of traversing these various boundaries is nontrivial in terms of complexity and performance. Service-oriented designs acknowledge these costs by putting a premium on boundary crossings. Because each cross-boundary communication is potentially costly, service-orientation is based on a model of explicit message passing rather than implicit method invocation. Compared to distributed objects, the service-oriented model views cross-service method invocation as a private implementation technique, not as a primitive construct—the fact that a given interaction may be implemented as a method call is a private implementation detail that is not visible outside the service boundary. </p> <p> Though service-orientation does not impose the RPC-style notion of a network-wide call stack, it can support a strong notion of causality. 
It is common for messages to explicitly indicate which chain(s) of messages a particular message belongs to. This indication is useful for message correlation and for implementing several common concurrency models. </p> <p> The notion that boundaries are explicit applies not only to inter-service communication but also to inter-developer communication. Even in scenarios in which all services are deployed in a single location, it is commonplace for the developers of each service to be spread across geographical, cultural, and/or organizational boundaries. Each of these boundaries increases the cost of communication between developers. Service orientation adapts to this model by reducing the number and complexity of abstractions that must be shared across service boundaries. By keeping the surface area of a service as small as possible, the interaction and communication between development organizations is reduced. One theme that is consistent in service-oriented designs is that simplicity and generality aren't a luxury but rather a critical survival skill. </p> </blockquote> <p> Notice that there's nothing here about <a href="https://en.wikipedia.org/wiki/Windows_Communication_Foundation">Windows Communication Foundation</a> (WCF), or any other specific technology. This is common to all four tenets, and one of the reasons that I think they deserve to be lifted out of their original context and put on display as the general ideas that they are. </p> <p> I'm getting the vibe that the above description was written under the impression of the disenchantment with distributed objects that was setting in around that time. The year before, <a href="https://martinfowler.com/">Martin Fowler</a> had formulated his </p> <blockquote> <p> "<strong>First Law of Distributed Object Design:</strong> Don't distribute your objects!"
</p> <footer><cite>Martin Fowler, <a href="/ref/peaa">Patterns of Enterprise Application Architecture</a>, (his emphasis)</cite></footer> </blockquote> <p> The way that I read the tenet then, and the way I <em>still</em> read it today, is that in contrast to distributed objects, you should treat any service invocation as a noticeable operation, <em>"putting a premium on boundary crossings"</em>, somehow distinct from normal code. </p> <p> Perhaps I read too much into that, because WCF immediately proceeded to convert any <a href="https://en.wikipedia.org/wiki/SOAP">SOAP</a> service into a lot of auto-generated C# code that would then enable you to invoke operations on a remote service using (you guessed it) a method invocation. </p> <p> Here's a code snippet from the <a href="https://learn.microsoft.com/dotnet/framework/wcf/how-to-use-a-wcf-client">WCF documentation</a>: </p> <p> <pre><span style="color:blue;">double</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">value1</span>&nbsp;=&nbsp;100.00D; <span style="color:blue;">double</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">value2</span>&nbsp;=&nbsp;15.99D; <span style="color:blue;">double</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">result</span>&nbsp;=&nbsp;client.Add(value1,&nbsp;value2);</pre> </p> <p> What happens here is that <code>client.Add</code> creates and sends a SOAP message to a service, receives the response, unpacks it, and returns it as a <code>double</code> value. Even so, it looks just like any other method call. There's no <em>"premium on boundary crossings"</em> here. </p> <p> So much for the principle that boundaries are explicit. They're not, and it bothered me twenty years ago, as it bothers me today. </p> <p> I'll remind you what the problem is. When the boundary is <em>not</em> explicit, you may inadvertently write client code that makes network calls, and you may not be aware of it.
This could noticeably slow down the application, particularly if you do it in a loop. </p> <h3 id="55bd772540a047a3b8db0d1aee373e87"> How do you make boundaries explicit? <a href="#55bd772540a047a3b8db0d1aee373e87">#</a> </h3> <p> This problem isn't isolated to WCF or SOAP. <a href="https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing">Network calls are slow and unreliable</a>. Perhaps you're connecting to a system on the other side of the Earth. Perhaps the system is unavailable. This is true regardless of protocol. </p> <p> From the software architect's perspective, the tenet that boundaries are explicit is a really good idea. The clearer it is where in a code base network operations take place, the easier it is to troubleshoot and maintain that code. This could make it easier to spot <em>n + 1</em> problems, as well as give you opportunities to <a href="/2020/03/23/repeatable-execution">add logging</a>, <a href="https://martinfowler.com/bliki/CircuitBreaker.html">Circuit Breakers</a>, etc. </p> <p> How do you make boundaries explicit? Clearly, WCF failed to do so, despite the design goal. </p> <h3 id="3c69e9213db946dc8d389c9b4bf19de2"> Only Commands <a href="#3c69e9213db946dc8d389c9b4bf19de2">#</a> </h3> <p> After having struggled with this question for years, I had an idea. This idea, however, doesn't really work, but I'll briefly cover it here. After all, if I can have that idea, other people may get it as well. It could save you some time if I explain why I believe that it doesn't address the problem. </p> <p> The idea is to mandate that all network operations are <a href="https://en.wikipedia.org/wiki/Command%E2%80%93query_separation">Commands</a>. In a C-like language, that would indicate a <code>void</code> method. </p> <p> While it turns out that it ultimately doesn't work, this isn't just some arbitrary rule that I've invented. After all, if a method doesn't return anything, the boundary does, in a sense, become explicit.
You can't just 'keep dotting', <a href="https://martinfowler.com/bliki/FluentInterface.html">fluent-interface</a> style. </p> <p> <pre>channel.UpdateProduct(pc);</pre> </p> <p> This gives you the opportunity to treat network operations as fire-and-forget operations. While you could still run such Commands in a tight loop, you could at least add them to a queue and move on. Such a queue could be an in-process data structure, or a persistent queue. Your network card also holds a small queue of network packets. </p> <p> This is essentially an asynchronous messaging architecture. It seems to correlate with Don Box's talk about messages. </p> <p> Although this may seem as though it addresses some concerns about making boundaries explicit, an obvious question arises: How do you perform queries in this model? </p> <p> You <em>could</em> keep such an architecture clean. You might, for example, implement a <a href="https://martinfowler.com/bliki/CQRS.html">CQRS</a> architecture where Commands create Events to which your application may subscribe. Such events could be handled by <em>event handlers</em> (other <code>void</code> methods) to update in-memory data as it changes. </p> <p> Even so, there are practical limitations with such a model. What's likely to happen, instead, is the following. </p> <h3 id="04a7c349122e45e38341eb0b50b877c0"> Request-Reply <a href="#04a7c349122e45e38341eb0b50b877c0">#</a> </h3> <p> It's hardly unlikely that you may need to perform some kind of Query against a remote system. If you can only communicate with services using <code>void</code> methods, such a scenario seems impossible. </p> <p> It's not. There's even a pattern for that. <a href="/ref/eip">Enterprise Integration Patterns</a> call it <a href="https://www.enterpriseintegrationpatterns.com/patterns/messaging/RequestReply.html">Request-Reply</a>.
You create a Query message and give it a correlation ID, send it, and wait for the reply message to arrive at your own <em>message handler</em>. Client code might look like this: </p> <p> <pre>var&nbsp;correlationId&nbsp;=&nbsp;Guid.NewGuid(); var&nbsp;query&nbsp;=&nbsp;new&nbsp;FavouriteSongsQuery(UserId:&nbsp;123,&nbsp;correlationId); channel.Send(query); IReadOnlyCollection&lt;Song&gt;&nbsp;songs&nbsp;=&nbsp;[]; while&nbsp;(true) { &nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;response&nbsp;=&nbsp;subscriber.GetNextResponse(correlationId); &nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;(response&nbsp;is&nbsp;null) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Thread.Sleep(100); &nbsp;&nbsp;&nbsp;&nbsp;else &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;songs&nbsp;=&nbsp;response; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;break; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> While this works, it's awkward to use, so it doesn't take long before someone decides to wrap it in a helpful helper method: </p> <p> <pre>public&nbsp;IReadOnlyCollection&lt;Song&gt;&nbsp;GetFavouriteSongs(int&nbsp;userId) { &nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;correlationId&nbsp;=&nbsp;Guid.NewGuid(); &nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;query&nbsp;=&nbsp;new&nbsp;FavouriteSongsQuery(userId,&nbsp;correlationId); &nbsp;&nbsp;&nbsp;&nbsp;channel.Send(query); &nbsp;&nbsp;&nbsp;&nbsp;IReadOnlyCollection&lt;Song&gt;&nbsp;songs&nbsp;=&nbsp;[]; &nbsp;&nbsp;&nbsp;&nbsp;while&nbsp;(true) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;response&nbsp;=&nbsp;subscriber.GetNextResponse(correlationId); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;(response&nbsp;is&nbsp;null) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Thread.Sleep(100); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;else &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;songs&nbsp;=&nbsp;response; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;break; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;}
&nbsp;&nbsp;&nbsp;&nbsp;return&nbsp;songs; }</pre> </p> <p> This now enables you to write client code like this: </p> <p> <pre>var&nbsp;songService&nbsp;=&nbsp;new&nbsp;SongService(); var&nbsp;songs&nbsp;=&nbsp;songService.GetFavouriteSongs(123);</pre> </p> <p> We're back where we started. Boundaries are no longer explicit. Equivalent to how <a href="/2020/11/23/good-names-are-skin-deep">good names are only skin-deep</a>, this attempt to make boundaries explicit can't resist programmers' natural tendency to make things easier for themselves. </p> <p> If only there was some way to make an abstraction <a href="https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/">contagious</a>... </p> <h3 id="fffc782323e746ef9a7b662132e17257"> Contagion <a href="#fffc782323e746ef9a7b662132e17257">#</a> </h3> <p> Ideally, we'd like to make boundaries explicit in such a way that they can't be hidden away. After all, </p> <blockquote> <p> "Abstraction is <em>the elimination of the irrelevant and the amplification of the essential.</em>" </p> <footer><cite>Robert C. Martin, <a href="/ref/doocautbm">Designing Object-Oriented C++ Applications Using The Booch Method</a>, chapter 00 (sic), (his emphasis)</cite></footer> </blockquote> <p> The existence of a boundary is essential, so while we might want to eliminate various other irrelevant details, this is a property that we should retain and surface in APIs. Even better, it'd be best if we could do it in such a way that it can't easily be swept under the rug, as shown above. </p> <p> In <a href="https://www.haskell.org/">Haskell</a>, this is true for all input/output - not only network requests, but also file access, and other non-deterministic actions. In Haskell this is a 'wrapper' type called <code>IO</code>; for an explanation with C# examples, see <a href="/2020/06/08/the-io-container">The IO Container</a>. 
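 </p> <p> Python has no such type, but a rough analogue can be sketched with coroutines. In the following sketch, <code>fetch_favourite_songs</code> is a hypothetical stand-in for a network call (it only yields to the event loop and returns canned data). Calling it doesn't produce songs, but rather a coroutine object, so the boundary remains visible to the caller: </p>

```python
import asyncio
import inspect


async def fetch_favourite_songs(user_id: int) -> list[str]:
    """Hypothetical stand-in for a network request."""
    await asyncio.sleep(0)  # Yield control, as real network I/O would.
    return ["Song A", "Song B"] if user_id == 123 else []


async def count_favourite_songs(user_id: int) -> int:
    # The caller must itself be a coroutine function in order to await.
    songs = await fetch_favourite_songs(user_id)
    return len(songs)


# Calling a coroutine function runs nothing; it only creates a coroutine object.
pending = fetch_favourite_songs(123)
is_wrapped = inspect.iscoroutine(pending)
pending.close()  # Discard it explicitly to avoid a 'never awaited' warning.

# Only at the outer edge of the program is the event loop started.
song_count = asyncio.run(count_favourite_songs(123))
```

<p> This is only an approximation. Unlike Haskell's <code>IO</code>, nothing prevents a determined programmer from calling <code>asyncio.run</code> deep inside a helper function, so the guarantee rests on convention rather than on the type system.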
</p> <p> In a more <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a> way, we can use <a href="/2020/07/27/task-asynchronous-programming-as-an-io-surrogate">task asynchronous programming as an IO surrogate</a>. People often complain that <code>async</code> code is contagious. By that they mean that once a piece of code is asynchronous, the caller must also be asynchronous. This effect is transitive, and while this is often lamented as a problem, this is exactly what we need. Amplify the essential. Make boundaries explicit. </p> <p> This doesn't mean that your entire code base has to be asynchronous. Only your network (and similar, non-deterministic) code should be asynchronous. Write your Domain Model and application code as pure functions, and <a href="/2019/02/11/asynchronous-injection">compose them with the asynchronous code using standard combinators</a>. </p> <h3 id="16ff65dbc4784ad1939257635d08039c"> Conclusion <a href="#16ff65dbc4784ad1939257635d08039c">#</a> </h3> <p> The first of Don Box's four tenets of SOA is that boundaries should be explicit. WCF failed to deliver on that ideal, and it took me more than a decade to figure out how to square that circle. </p> <p> Many languages now come with support for asynchronous programming, often utilizing some kind of generic <code>Task</code> or <code>Async</code> <a href="/2022/03/28/monads">monad</a>. Since such types are usually contagious, you can use them to make boundaries explicit. </p> <p> <strong>Next:</strong> <a href="/2024/03/25/services-are-autonomous">Services are autonomous</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. 
The four tenets of SOA revisited https://blog.ploeh.dk/2024/03/04/the-four-tenets-of-soa-revisited 2024-03-04T06:39:00+00:00 Mark Seemann <div id="post"> <p> <em>Twenty years after.</em> </p> <p> In the <a href="https://learn.microsoft.com/en-us/archive/msdn-magazine/2004/january/msdn-magazine-january-2004">January 2004 issue of MSDN Magazine</a> you can find an article by <a href="https://en.wikipedia.org/wiki/Don_Box">Don Box</a> titled <a href="https://learn.microsoft.com/en-us/archive/msdn-magazine/2004/january/a-guide-to-developing-and-running-connected-systems-with-indigo">A Guide to Developing and Running Connected Systems with Indigo</a>. Buried within the (now dated) discussion of the technology code-named <em>Indigo</em> (later <a href="https://en.wikipedia.org/wiki/Windows_Communication_Foundation">Windows Communication Foundation</a>) you can find a general discussion of <em>four tenets of service-oriented architecture</em> (SOA). </p> <p> I remember that they resonated strongly with me back then, or that they at least prompted me to think explicitly about how to design software services. Some of these ideas have stayed with me ever since, while another has nagged at me for decades before I found a way to reconcile it with other principles of software design. </p> <p> Now that it's twenty years since the MSDN article was published, I find that this is as good a time as ever to revisit it. </p> <h3 id="96e92c4bccef4d5789bbb5d860e3ce3f"> Legacy <a href="#96e92c4bccef4d5789bbb5d860e3ce3f">#</a> </h3> <p> Why should we care about an old article about <a href="https://en.wikipedia.org/wiki/SOAP">SOAP</a> and SOA? Does anyone even use such things today, apart from legacy systems?
</p> <p> After all, we've moved on from SOAP to <a href="https://en.wikipedia.org/wiki/REST">REST</a>, <a href="https://en.wikipedia.org/wiki/GRPC">gRPC</a>, or <a href="https://en.wikipedia.org/wiki/GraphQL">GraphQL</a>, and from SOA to <a href="https://en.wikipedia.org/wiki/Microservices">microservices</a> - that is, if we're not already swinging back towards monoliths. </p> <p> Even so, I find much of what Don Box wrote twenty years ago surprisingly prescient. If you're interested in distributed software design involving some kind of remote API design, the four tenets of service-orientation apply beyond their original context. Some of the ideas, at least. </p> <p> As is often the case in our field, various resources discuss the tenets without much regard to proper citation. Thus, I can't be sure that the MSDN article is where they first appeared, but I haven't found any earlier source. </p> <p> My motivation for writing these articles is partly to rescue the four tenets from obscurity, and partly to add some of my own commentary. </p> <p> Much of the original article is about Indigo, and I'm going to skip that. On the other hand, I'm going to quote rather extensively from the article, in order to lift the more universal ideas out of their original context. </p> <p> I'll do that in a series of articles, each covering one of the tenets. </p> <ul> <li><a href="/2024/03/11/boundaries-are-explicit">Boundaries are explicit</a></li> <li><a href="/2024/03/25/services-are-autonomous">Services are autonomous</a></li> <li><a href="/2024/04/15/services-share-schema-and-contract-not-class">Services share schema and contract, not class</a></li> <li><a href="/2024/04/29/service-compatibility-is-determined-based-on-policy">Service compatibility is determined based on policy</a></li> </ul> <p> Not all of the tenets have stood the test of time equally well, so I may not add an equal amount of commentary to all four.
</p> <h3 id="ad6f66b0ac954647bebf4d288939d2ab"> Conclusion <a href="#ad6f66b0ac954647bebf4d288939d2ab">#</a> </h3> <p> Ever since I first encountered the four tenets of SOA, they've stayed with me in one form or other. When helping teams to design services, even what we may today consider 'modern services', I've drawn on some of those ideas. There are insights of a general nature that are worth considering even today. </p> <p> <strong>Next:</strong> <a href="/2024/03/11/boundaries-are-explicit">Boundaries are explicit</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Testing exceptions https://blog.ploeh.dk/2024/02/26/testing-exceptions 2024-02-26T06:47:00+00:00 Mark Seemann <div id="post"> <p> <em>Some thoughts on testing past the happy path.</em> </p> <p> Test-driven development is a great development technique that enables you to get rapid feedback on design and implementation ideas. It enables you to rapidly move towards a working solution. </p> <p> The emphasis on the <em>happy path</em>, however, can make you forget about all the things that may go wrong. Sooner or later, though, you may realize that the implementation can fail for a number of reasons, and, wanting to make things more robust, you may want to also subject your error-handling code to automated testing. </p> <p> This doesn't have to be difficult, but can raise some interesting questions. In this article, I'll try to address a few. </p> <h3 id="ead73eb4bc4b45eba0bde0cf61269814"> Throwing exceptions with a dynamic mock <a href="#ead73eb4bc4b45eba0bde0cf61269814">#</a> </h3> <p> In <a href="/2023/08/14/replacing-mock-and-stub-with-a-fake#0afe67b375254fe193a3fd10234a1ce9">a question to another article</a> AmirB asks how to use a <a href="http://xunitpatterns.com/Fake%20Object.html">Fake Object</a> to test exceptions. 
Specifically, since <a href="/2023/11/13/fakes-are-test-doubles-with-contracts">a Fake is a Test Double with a coherent contract</a> it'll be inappropriate to let it throw exceptions that relate to different implementations. </p> <p> Egads, that was quite abstract, so let's consider a concrete example. </p> <p> <a href="/2023/08/14/replacing-mock-and-stub-with-a-fake">The original article</a> that AmirB asked about used this interface as an example: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IUserRepository</span> { &nbsp;&nbsp;&nbsp;&nbsp;User&nbsp;<span style="font-weight:bold;color:#74531f;">Read</span>(<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">userId</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">Create</span>(<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">userId</span>); }</pre> </p> <p> Granted, this interface is a little odd, but it should be good enough for the present purpose. As AmirB wrote: </p> <blockquote> <p> "In scenarios where dynamic mocks (like Moq) are employed, we can mock a method to throw an exception, allowing us to test the expected behavior of the System Under Test (SUT)." 
</p> <footer><cite><a href="/2023/08/14/replacing-mock-and-stub-with-a-fake#0afe67b375254fe193a3fd10234a1ce9">AmirB</a></cite></footer> </blockquote> <p> Specifically, this might look like this, using <a href="https://github.com/devlooped/moq">Moq</a>: </p> <p> <pre>[Fact] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">CreateThrows</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">td</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;Mock&lt;IUserRepository&gt;(); &nbsp;&nbsp;&nbsp;&nbsp;td.Setup(<span style="font-weight:bold;color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;r.Read(1234)).Returns(<span style="color:blue;">new</span>&nbsp;User&nbsp;{&nbsp;Id&nbsp;=&nbsp;0&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;td.Setup(<span style="font-weight:bold;color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;r.Create(It.IsAny&lt;<span style="color:blue;">int</span>&gt;())).Throws(MakeSqlException()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;SomeController(td.Object); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;sut.GetUser(1234); &nbsp;&nbsp;&nbsp;&nbsp;Assert.NotNull(actual); }</pre> </p> <p> It's just an example, but the point is that since you can make a dynamic mock do anything that you can define in code, you can also use it to simulate database exceptions. This test pretends that the <code>IUserRepository</code> throws a <a href="https://learn.microsoft.com/dotnet/api/system.data.sqlclient.sqlexception">SqlException</a> from the <code>Create</code> method. 
</p> <p> Perhaps the <code>GetUser</code> implementation now looks like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;User&nbsp;<span style="font-weight:bold;color:#74531f;">GetUser</span>(<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">userId</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">u</span>&nbsp;=&nbsp;<span style="color:blue;">this</span>.userRepository.Read(userId); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(u.Id&nbsp;==&nbsp;0) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">try</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.userRepository.Create(userId); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">catch</span>&nbsp;(SqlException) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;u; }</pre> </p> <p> If you find the example contrived, I don't blame you. The <code>IUserRepository</code> interface, the <code>User</code> class, and the <code>GetUser</code> method that orchestrates them are all degenerate in various ways. I originally created this little code example to discuss <a href="/2013/10/23/mocks-for-commands-stubs-for-queries">data flow verification</a>, and I'm now stretching it beyond any reason. I hope that you can look past that. The point I'm making here is more general, and doesn't hinge on particulars. 
</p> <h3 id="ed58e2e387234a7ebd3c97a384841d9f"> Fake <a href="#ed58e2e387234a7ebd3c97a384841d9f">#</a> </h3> <p> <a href="/2023/08/14/replacing-mock-and-stub-with-a-fake">The article</a> also suggests a <code>FakeUserRepository</code> that is small enough that I can repeat it here. </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">FakeUserRepository</span>&nbsp;:&nbsp;Collection&lt;User&gt;,&nbsp;IUserRepository { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">Create</span>(<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">userId</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Add(<span style="color:blue;">new</span>&nbsp;User&nbsp;{&nbsp;Id&nbsp;=&nbsp;userId&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;User&nbsp;<span style="font-weight:bold;color:#74531f;">Read</span>(<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">userId</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">user</span>&nbsp;=&nbsp;<span style="color:blue;">this</span>.SingleOrDefault(<span style="font-weight:bold;color:#1f377f;">u</span>&nbsp;=&gt;&nbsp;u.Id&nbsp;==&nbsp;userId); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(user&nbsp;==&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;User&nbsp;{&nbsp;Id&nbsp;=&nbsp;0&nbsp;}; 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;user; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> The question is how to use something like this when you want to test exceptions? It's possible that this little class may produce <a href="/2024/01/29/error-categories-and-category-errors">errors that I've failed to predict</a>, but it certainly doesn't throw any <code>SqlExceptions</code>! </p> <p> Should we inflate <code>FakeUserRepository</code> by somehow also giving it the capability to throw particular exceptions? </p> <h3 id="c116b036864348b7938dfb6805e0c2dd"> Throwing exceptions from Test Doubles <a href="#c116b036864348b7938dfb6805e0c2dd">#</a> </h3> <p> I understand why AmirB asks that question, because it doesn't seem right. As a start, it would go against the <a href="https://en.wikipedia.org/wiki/Single_responsibility_principle">Single Responsibility Principle</a>. The <code>FakeUserRepository</code> would then have more than one reason to change: You'd have to change it if the <code>IUserRepository</code> interface changes, but you'd also have to change it if you wanted to simulate a different error situation. </p> <p> Good coding practices apply to test code as well. Test code is code that you have to read and maintain, so all the good practices that keep production code in good shape also apply to test code. This may include <a href="https://en.wikipedia.org/wiki/SOLID">the SOLID principles</a>, unless you're of the mind that <a href="https://dannorth.net/cupid-for-joyful-coding/">SOLID ought to be a thing of the past</a>. </p> <p> If you really <em>must</em> throw exceptions from a <a href="https://martinfowler.com/bliki/TestDouble.html">Test Double</a>, perhaps a dynamic mock object as shown above is the best option. 
No-one says that if you use a Fake Object for most of your tests you can't use a dynamic mock library for truly one-off test cases. Or perhaps a one-off Test Double that throws the desired exception. </p> <p> I would, however, consider it a code smell if this happens too often. Not a test smell, but a code smell. </p> <h3 id="302a9e4462744a55974b8fdab6f70054"> Is the exception part of the contract? <a href="#302a9e4462744a55974b8fdab6f70054">#</a> </h3> <p> You may ask yourself whether a particular exception type is part of an object's contract. As I always do, when I use the word <em>contract</em>, I refer to a set of invariants, pre-, and postconditions, taking a cue from <a href="/ref/oosc">Object-Oriented Software Construction</a>. See also my video course <a href="/encapsulation-and-solid">Encapsulation and SOLID</a> for more details. </p> <p> You can <em>imply</em> many things about a contract when you have a static type system at your disposal, but there are always rules that you can't express that way. Parts of a contract are implicitly understood, or communicated in other ways. Code comments, <a href="https://en.wikipedia.org/wiki/Docstring">docstrings</a>, or similar, are good options. </p> <p> What may you infer from the <code>IUserRepository</code> interface? What should you <em>not</em> infer? </p> <p> I'd expect the <code>Read</code> method to return a <code>User</code> object. This code example hails <a href="/2013/10/23/mocks-for-commands-stubs-for-queries">from 2013</a>, before C# had <a href="https://learn.microsoft.com/dotnet/csharp/nullable-references">nullable reference types</a>. Around that time I'd begun using <a href="/2018/03/26/the-maybe-functor">Maybe</a> to signal that the return value might be missing. This is a <em>convention</em>, so the reader needs to be aware of it in order to correctly infer that part of the contract. 
Since the <code>Read</code> method does <em>not</em> return <code>Maybe&lt;User&gt;</code> I might infer that a non-null <code>User</code> object is guaranteed; that's a post-condition. </p> <p> These days, I'd also use <a href="/2020/07/27/task-asynchronous-programming-as-an-io-surrogate">asynchronous APIs to hint that I/O is involved</a>, but again, the example is so old and simplified that this isn't the case here. Still, regardless of how this is communicated to the reader, if an interface (or base class) is intended for I/O, we may expect it to fail at times. In most languages, such errors manifest as exceptions. </p> <p> At least two questions arise from such deliberations: </p> <ul> <li>Which exception types may the methods throw?</li> <li>Can you even handle such exceptions?</li> </ul> <p> Should <code>SqlException</code> even be part of the contract? Isn't that an implementation detail? </p> <p> The <code>FakeUserRepository</code> class neither uses SQL Server nor throws <code>SqlExceptions</code>. You may imagine other implementations that use a document database, or even just another relational database than SQL Server (Oracle, MySQL, PostgreSQL, etc.). Those wouldn't throw <code>SqlExceptions</code>, but perhaps other exception types. </p> <p> According to the <a href="https://en.wikipedia.org/wiki/Dependency_inversion_principle">Dependency Inversion Principle</a>, </p> <blockquote> <p> "Abstractions should not depend upon details. Details should depend upon abstractions." </p> <footer><cite>Robert C. Martin, <a href="/ref/appp">Agile Principles, Patterns, and Practices in C#</a></cite></footer> </blockquote> <p> If we make <code>SqlException</code> part of the contract, an implementation detail becomes part of the contract. 
Not only that: With an implementation like the above <code>GetUser</code> method, which catches <code>SqlException</code>, we've also violated the <a href="https://en.wikipedia.org/wiki/Liskov_substitution_principle">Liskov Substitution Principle</a>. If you injected another implementation, one that throws a different type of exception, the code would no longer work as intended. </p> <p> Loosely coupled code shouldn't look like that. </p> <p> Many specific exceptions are of <a href="/2024/01/29/error-categories-and-category-errors">a kind that you can't handle anyway</a>. On the other hand, if you do decide to handle particular error scenarios, make it part of the contract, or, as Michael Feathers puts it, <a href="https://youtu.be/AnZ0uTOerUI?si=1gJXYFoVlNTSbjEt">extend the domain</a>. </p> <h3 id="ed86f41415724219a0afbf9d669ec1b7"> Integration testing <a href="#ed86f41415724219a0afbf9d669ec1b7">#</a> </h3> <p> How should we unit test a specific exception? <a href="https://en.wikipedia.org/wiki/Mu_(negative)">Mu</a>, we shouldn't. </p> <blockquote> <p> "Personally, I avoid using try-catch blocks in repositories or controllers and prefer handling exceptions in middleware (e.g., ErrorHandler). In such cases, I write separate unit tests for the middleware. Could this be a more fitting approach?" </p> <footer><cite><a href="/2023/08/14/replacing-mock-and-stub-with-a-fake#0afe67b375254fe193a3fd10234a1ce9">AmirB</a></cite></footer> </blockquote> <p> That is, I think, an excellent approach to those exceptions that you've decided to not handle explicitly. Such middleware would typically log or otherwise notify operators that a problem has arisen. You could also write some general-purpose middleware that performs retries or implements the <a href="https://martinfowler.com/bliki/CircuitBreaker.html">Circuit Breaker</a> pattern, but reusable libraries that do that already exist. Consider using one. 
</p> <p> Still, you may have decided to implement a particular feature by leveraging a capability of a particular piece of technology, and the code you intend to write is complicated enough, or important enough, that you'd like to have good test coverage. How do you do that? </p> <p> I'd suggest an integration test. </p> <p> I don't have a good example lying around that involves throwing specific exceptions, but something similar may be of service. The example code base that accompanies my book <a href="/code-that-fits-in-your-head">Code That Fits in Your Head</a> pretends to be an online restaurant reservation system. Two concurrent clients may compete for the last table on a particular date; a typical race condition. </p> <p> There is more than one way to address such a concern. As implied in <a href="/2024/01/29/error-categories-and-category-errors">a previous article</a>, you may decide to rearchitect the entire application to be able to handle such edge cases in a robust manner. For the purposes of the book's example code base, however, I considered a <em>lock-free architecture</em> out of scope. Instead, I had in mind dealing with that issue by taking advantage of .NET and SQL Server's support for lightweight transactions via a <a href="https://learn.microsoft.com/dotnet/api/system.transactions.transactionscope">TransactionScope</a>. While this is a handy solution, it's utterly dependent on the technology stack. It's a good example of an implementation detail that I'd rather not expose to a unit test. </p> <p> Instead, I wrote a <a href="/2021/01/25/self-hosted-integration-tests-in-aspnet">self-hosted integration test</a> that runs against a real SQL Server instance (automatically deployed and configured on demand). 
It tests <em>behaviour</em> rather than implementation details: </p> <p> <pre>[Fact] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;Task&nbsp;<span style="font-weight:bold;color:#74531f;">NoOverbookingRace</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">start</span>&nbsp;=&nbsp;DateTimeOffset.UtcNow; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">timeOut</span>&nbsp;=&nbsp;TimeSpan.FromSeconds(30); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">i</span>&nbsp;=&nbsp;0; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">while</span>&nbsp;(DateTimeOffset.UtcNow&nbsp;-&nbsp;start&nbsp;&lt;&nbsp;timeOut) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;PostTwoConcurrentLiminalReservations(start.DateTime.AddDays(++i)); } <span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">async</span>&nbsp;Task&nbsp;<span style="font-weight:bold;color:#74531f;">PostTwoConcurrentLiminalReservations</span>(DateTime&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>) { &nbsp;&nbsp;&nbsp;&nbsp;date&nbsp;=&nbsp;date.Date.AddHours(18.5); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">service</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;RestaurantService(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">task1</span>&nbsp;=&nbsp;service.PostReservation(<span style="color:blue;">new</span>&nbsp;ReservationDtoBuilder() &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.WithDate(date) 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.WithQuantity(10) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Build()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">task2</span>&nbsp;=&nbsp;service.PostReservation(<span style="color:blue;">new</span>&nbsp;ReservationDtoBuilder() &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.WithDate(date) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.WithQuantity(10) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Build()); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;Task.WhenAll(task1,&nbsp;task2); &nbsp;&nbsp;&nbsp;&nbsp;Assert.Single(actual,&nbsp;<span style="font-weight:bold;color:#1f377f;">msg</span>&nbsp;=&gt;&nbsp;msg.IsSuccessStatusCode); &nbsp;&nbsp;&nbsp;&nbsp;Assert.Single( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;actual, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">msg</span>&nbsp;=&gt;&nbsp;msg.StatusCode&nbsp;==&nbsp;HttpStatusCode.InternalServerError); }</pre> </p> <p> This test attempts to make two concurrent reservations for ten people. This is also the maximum capacity of the restaurant: It's impossible to seat twenty people. We'd like for one of the requests to win that race, while the server should reject the loser. </p> <p> This test is only concerned with the behaviour that clients can observe, and since this code base contains hundreds of other tests that inspect HTTP response messages, this one only looks at the status codes. 
</p> <p> The implementation handles the potential overbooking scenario like this: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">async</span>&nbsp;Task&lt;ActionResult&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">TryCreate</span>(Restaurant&nbsp;<span style="font-weight:bold;color:#1f377f;">restaurant</span>,&nbsp;Reservation&nbsp;<span style="font-weight:bold;color:#1f377f;">reservation</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">scope</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;TransactionScope(TransactionScopeAsyncFlowOption.Enabled); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">reservations</span>&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;Repository.ReadReservations(restaurant.Id,&nbsp;reservation.At); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">now</span>&nbsp;=&nbsp;Clock.GetCurrentDateTime(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(!restaurant.MaitreD.WillAccept(now,&nbsp;reservations,&nbsp;reservation)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;NoTables500InternalServerError(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;Repository.Create(restaurant.Id,&nbsp;reservation); &nbsp;&nbsp;&nbsp;&nbsp;scope.Complete(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Reservation201Created(restaurant.Id,&nbsp;reservation); }</pre> </p> <p> Notice the <code>TransactionScope</code>. </p> <p> I'm under the illusion that I could radically change this implementation detail without breaking the above test. 
Granted, unlike <a href="/2023/09/04/decomposing-ctfiyhs-sample-code-base">another experiment</a>, this hypothesis isn't one I've put to the test. </p> <h3 id="9659f21863e74c288d5c2d36534eaa37"> Conclusion <a href="#9659f21863e74c288d5c2d36534eaa37">#</a> </h3> <p> How does one automatically test error branches? Most unit testing frameworks come with APIs that make it easy to verify that specific exceptions were thrown, so that's not the hard part. If a particular exception is part of the System Under Test's contract, just test it like that. </p> <p> On the other hand, when it comes to objects composed with other objects, implementation details may easily leak through in the shape of specific exception types. I'd think twice before writing a test that verifies whether a piece of client code (such as the above <code>SomeController</code>) handles a particular exception type (such as <code>SqlException</code>). </p> <p> If such a test is difficult to write because all you have is a Fake Object (e.g. <code>FakeUserRepository</code>), that's only good. The rapid feedback furnished by test-driven development strikes again. Listen to your tests. </p> <p> You should probably not write that test at all, because there seems to be an issue with the planned structure of the code. Address <em>that</em> problem instead. </p> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. 
Extracting data from a small CSV file with Haskell https://blog.ploeh.dk/2024/02/19/extracting-data-from-a-small-csv-file-with-haskell 2024-02-19T12:57:00+00:00 Mark Seemann <div id="post"> <p> <em>Statically typed languages are also good for ad-hoc scripting.</em> </p> <p> This article is part of a <a href="/2024/02/05/statically-and-dynamically-typed-scripts">short series of articles</a> that compares ad-hoc scripting in <a href="https://www.haskell.org/">Haskell</a> with solving the same problem in <a href="https://www.python.org/">Python</a>. The <a href="/2024/02/05/statically-and-dynamically-typed-scripts">introductory article</a> describes the problem to be solved, so here I'll jump straight into the Haskell code. In the next article I'll give a similar walkthrough of my Python script. </p> <h3 id="0a705367eb2f4080ac168eb1bbe9b2ec"> Getting started <a href="#0a705367eb2f4080ac168eb1bbe9b2ec">#</a> </h3> <p> When working with Haskell for more than a few true one-off expressions that I can type into GHCi (the Haskell <a href="https://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop">REPL</a>), I usually create a module file. Since I'd been asked to crunch some data, and I wasn't feeling very imaginative that day, I just named the module (and the file) <code>Crunch</code>. 
After some iterative exploration of the problem, I also arrived at a set of imports: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Crunch&nbsp;<span style="color:blue;">where</span> <span style="color:blue;">import</span>&nbsp;Data.List&nbsp;(<span style="color:#2b91af;">sort</span>) <span style="color:blue;">import</span>&nbsp;<span style="color:blue;">qualified</span>&nbsp;Data.List.NonEmpty&nbsp;<span style="color:blue;">as</span>&nbsp;NE <span style="color:blue;">import</span>&nbsp;Data.List.Split <span style="color:blue;">import</span>&nbsp;Control.Applicative <span style="color:blue;">import</span>&nbsp;Control.Monad <span style="color:blue;">import</span>&nbsp;Data.Foldable</pre> </p> <p> As we go along, you'll see where some of these fit in. </p> <p> Reading the actual data file, however, can be done with just the Haskell <code>Prelude</code>: </p> <p> <pre>inputLines&nbsp;=&nbsp;<span style="color:blue;">words</span>&nbsp;&lt;$&gt;&nbsp;<span style="color:blue;">readFile</span>&nbsp;<span style="color:#a31515;">&quot;survey_data.csv&quot;</span></pre> </p> <p> Already now, it's possible to load the module in GHCi and start examining the data: </p> <p> <pre>ghci&gt; :l Crunch.hs [1 of 1] Compiling Crunch ( Crunch.hs, interpreted ) Ok, one module loaded. ghci&gt; length &lt;$&gt; inputLines 38</pre> </p> <p> Looks good, but reading a text file is hardly the difficult part. The first obstacle, surprisingly, is to split comma-separated values into individual parts. For some reason that I've never understood, the Haskell base library doesn't even include something as basic as <a href="https://learn.microsoft.com/dotnet/api/system.string.split">String.Split</a> from .NET. I could probably hack together a function that does that, but on the other hand, it's available in the <a href="https://hackage.haskell.org/package/split/docs/Data-List-Split.html">split</a> package; that explains the <code>Data.List.Split</code> import. 
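</p> <p> As a rough sketch of what such a hand-rolled function might look like, one could recurse with <code>break</code> from the <code>Prelude</code>. The name <code>splitOnComma</code> is made up for illustration and isn't part of the script. It assumes a single-character delimiter and knows nothing about quoted fields, so it's no substitute for a proper CSV parser: </p> <p> <pre>-- Hypothetical helper, not part of the Crunch module.
-- Splits a string on commas, keeping empty fields.
splitOnComma :: String -&gt; [String]
splitOnComma s =
  case break (== ',') s of
    (field, [])       -&gt; [field]                    -- no comma left
    (field, _ : rest) -&gt; field : splitOnComma rest  -- drop the comma and recurse</pre> </p> <p> For simple lines this should agree with <code>splitOn ","</code>, but it lacks the generality of the library function, which accepts any list as a delimiter. 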
It's just a bit of a bother that one has to pull in another package only to do that. </p> <h3 id="a70030690c1645a2b0923ad354fa665b"> Grades <a href="#a70030690c1645a2b0923ad354fa665b">#</a> </h3> <p> Extracting all the grades is now relatively easy. This function extracts and parses a grade from a single line: </p> <p> <pre><span style="color:#2b91af;">grade</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">Read</span>&nbsp;a&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:#2b91af;">String</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a grade&nbsp;line&nbsp;=&nbsp;<span style="color:blue;">read</span>&nbsp;$&nbsp;splitOn&nbsp;<span style="color:#a31515;">&quot;,&quot;</span>&nbsp;line&nbsp;!!&nbsp;2</pre> </p> <p> It splits the line on commas, picks the third element (zero-indexed, of course, so element <code>2</code>), and finally parses it. </p> <p> One may experiment with it in GHCi to get an impression that it works: </p> <p> <pre>ghci&gt; fmap grade &lt;$&gt; inputLines :: IO [Int] [2,2,12,10,4,12,2,7,2,2,2,7,2,7,2,4,2,7,4,7,0,4,0,7,2,2,2,2,2,2,4,4,2,7,4,0,7,2]</pre> </p> <p> This lists all 38 expected grades found in the data file. </p> <p> In the <a href="/2024/02/05/statically-and-dynamically-typed-scripts">introduction article</a> I spent some time explaining how languages with strong type inference don't need type declarations. This makes iterative development easier, because you can fiddle with an expression until it does what you'd like it to do. When you change an expression, often the inferred type changes as well, but there's no programmer overhead involved with that. The compiler figures that out for you. </p> <p> Even so, the above <code>grade</code> function does have a type annotation. How does that gel with what I just wrote? </p> <p> It doesn't, on the surface, but when I was fiddling with the code, there was no type annotation. 
The Haskell compiler is perfectly happy to infer the type of an expression like </p> <p> <pre>grade&nbsp;line&nbsp;=&nbsp;<span style="color:blue;">read</span>&nbsp;$&nbsp;splitOn&nbsp;<span style="color:#a31515;">&quot;,&quot;</span>&nbsp;line&nbsp;!!&nbsp;2</pre> </p> <p> The human reader, however, is not so clever (I'm not, at least), so once a particular expression settles, and I'm fairly sure that it's not going to change further, I sometimes add the type annotation to aid myself. </p> <p> When writing this, I was debating the didactics of showing the function <em>with</em> the type annotation, against showing it without it. Eventually I decided to include it, because it's more understandable that way. That decision, however, prompted this explanation. </p> <h3 id="754ea5fced264c439f8784705176851b"> Binomial choice <a href="#754ea5fced264c439f8784705176851b">#</a> </h3> <p> The next thing I needed to do was to list all pairs from the data file. Usually, <a href="/2022/01/17/enumerate-wordle-combinations-with-an-applicative-functor">when I run into a problem related to combinations, I reach for applicative composition</a>. For example, to list all possible combinations of the first three primes, I might do this: </p> <p> <pre>ghci&gt; liftA2 (,) [2,3,5] [2,3,5] [(2,2),(2,3),(2,5),(3,2),(3,3),(3,5),(5,2),(5,3),(5,5)]</pre> </p> <p> You may now protest that this is sampling with replacement, whereas the task is to pick two <em>different</em> rows from the data file. Usually, when I run into that requirement, I just remove the ones that pick the same value twice: </p> <p> <pre>ghci&gt; filter (uncurry (/=)) $ liftA2 (,) [2,3,5] [2,3,5] [(2,3),(2,5),(3,2),(3,5),(5,2),(5,3)]</pre> </p> <p> That works great as long as the values are unique, but what if that's not the case? 
</p> <p> <pre>ghci&gt; liftA2 (,) "foo" "foo"
[('f','f'),('f','o'),('f','o'),('o','f'),('o','o'),('o','o'),('o','f'),('o','o'),('o','o')]
ghci&gt; filter (uncurry (/=)) $ liftA2 (,) "foo" "foo"
[('f','o'),('f','o'),('o','f'),('o','f')]</pre> </p> <p> This removes too many values! We don't want the combinations where the first <code>o</code> is paired with itself, or where the second <code>o</code> is paired with itself, but we <em>do</em> want the combination where the first <code>o</code> is paired with the second, and vice versa. </p> <p> This is relevant because the data set turns out to contain identical rows. Thus, I needed something that would deal with that issue. </p> <p> Now, bear with me, because it's quite possible that what I did isn't the simplest solution to the problem. On the other hand, I'm reporting what I did, and how I used Haskell to solve a one-off problem. If you have a simpler solution, please <a href="https://github.com/ploeh/ploeh.github.com?tab=readme-ov-file#comments">leave a comment</a>. </p> <p> You often reach for the tool that you already know, so I used a variation of the above. Instead of combining values, I decided to combine row indices. This meant that I needed a function that would produce the indices for a particular list: </p> <p> <pre><span style="color:#2b91af;">indices</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">Foldable</span>&nbsp;t&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;t&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;[<span style="color:#2b91af;">Int</span>]
indices&nbsp;f&nbsp;=&nbsp;[0&nbsp;..&nbsp;<span style="color:blue;">length</span>&nbsp;f&nbsp;-&nbsp;1]</pre> </p> <p> Again, the type annotation came later.
This just produces sequential numbers, starting from zero: </p> <p> <pre>ghci&gt; indices &lt;$&gt; inputLines
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,
21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37]</pre> </p> <p> Such a function hovers just around the <a href="https://wiki.haskell.org/Fairbairn_threshold">Fairbairn threshold</a>; some experienced Haskellers would probably just inline it. </p> <p> Since row numbers (indices) are unique, the above approach to binomial choice works, so I also added a function for that: </p> <p> <pre><span style="color:#2b91af;">choices</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">Eq</span>&nbsp;a&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;[a]&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;[(a,&nbsp;a)]
choices&nbsp;=&nbsp;<span style="color:blue;">filter</span>&nbsp;(<span style="color:blue;">uncurry</span>&nbsp;<span style="color:#2b91af;">(/=)</span>)&nbsp;.&nbsp;join&nbsp;(liftA2&nbsp;<span style="color:#2b91af;">(,)</span>)</pre> </p> <p> Combined with <code>indices</code> I can now enumerate all combinations of two rows in the data set: </p> <p> <pre>ghci&gt; choices . indices &lt;$&gt; inputLines
[(0,1),(0,2),(0,3),(0,4),(0,5),(0,6),(0,7),(0,8),(0,9),...</pre> </p> <p> I'm only showing the first few results here, because in reality, there are <em>1406</em> such pairs. </p> <p> Perhaps you think that all of this seems quite elaborate, but so far it's only four lines of code. The reason it looks like more is that I've gone to some lengths to explain what the code does. </p> <h3 id="bf2d1d52fb3c4a29b6987840b2d46530"> Sum of grades <a href="#bf2d1d52fb3c4a29b6987840b2d46530">#</a> </h3> <p> The above combinations are pairs of <em>indices</em>, not values. What I need is to use each index to look up the row, from the row get the grade, and then sum the two grades.
The first parts of that I can accomplish with the <code>grade</code> function, but I need to do it for every row, and for both elements of each pair. </p> <p> While tuples are <code>Functor</code> instances, they only map over the second element, and that's not what I need: </p> <p> <pre>ghci&gt; rows = ["foo", "bar", "baz"]
ghci&gt; fmap (rows!!) &lt;$&gt; [(0,1),(0,2)]
[(0,"bar"),(0,"baz")]</pre> </p> <p> While this is just a simple example that maps over the two pairs <code>(0,1)</code> and <code>(0,2)</code>, it illustrates the problem: It only finds the row for each tuple's second element, but I need it for both. </p> <p> On the other hand, a type like <code>(a, a)</code> gives rise to a <a href="/2018/03/22/functors">functor</a>, and while a wrapper type like that is not readily available in the <em>base</em> library, defining one is a one-liner: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;Pair&nbsp;a&nbsp;=&nbsp;Pair&nbsp;{&nbsp;unPair&nbsp;::&nbsp;(a,&nbsp;a)&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Functor</span>)</pre> </p> <p> This enables me to map over pairs in one go: </p> <p> <pre>ghci&gt; unPair &lt;$&gt; fmap (rows!!) &lt;$&gt; Pair &lt;$&gt; [(0,1),(0,2)]
[("foo","bar"),("foo","baz")]</pre> </p> <p> This makes things a little easier.
What remains is to use the <code>grade</code> function to look up the grade value for each row, then add the two numbers together, and finally count how many occurrences there are of each: </p> <p> <pre>sumGrades&nbsp;ls&nbsp;= &nbsp;&nbsp;liftA2&nbsp;<span style="color:#2b91af;">(,)</span>&nbsp;NE.<span style="color:blue;">head</span>&nbsp;<span style="color:blue;">length</span>&nbsp;&lt;$&gt;&nbsp;NE.group &nbsp;&nbsp;&nbsp;&nbsp;(sort&nbsp;(<span style="color:blue;">uncurry</span>&nbsp;<span style="color:#2b91af;">(+)</span>&nbsp;.&nbsp;unPair&nbsp;.&nbsp;<span style="color:blue;">fmap</span>&nbsp;(grade&nbsp;.&nbsp;(ls&nbsp;!!))&nbsp;.&nbsp;Pair&nbsp;&lt;$&gt; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;choices&nbsp;(indices&nbsp;ls)))</pre> </p> <p> You'll notice that this function doesn't have a type annotation, but we can ask GHCi if we're curious: </p> <p> <pre>ghci&gt; :t sumGrades sumGrades :: (Ord a, Num a, Read a) =&gt; [String] -&gt; [(a, Int)]</pre> </p> <p> This enabled me to get a count of each sum of grades: </p> <p> <pre>ghci&gt; sumGrades &lt;$&gt; inputLines [(0,6),(2,102),(4,314),(6,238),(7,48),(8,42),(9,272),(10,6), (11,112),(12,46),(14,138),(16,28),(17,16),(19,32),(22,4),(24,2)]</pre> </p> <p> The way to read this is that the sum <em>0</em> occurs six times, <em>2</em> appears <em>102</em> times, etc. </p> <p> There's one remaining task to accomplish before we can produce a PMF of the sum of grades: We need to enumerate the range, because, as it turns out, there are sums that are possible, but that don't appear in the data set. Can you spot which ones? </p> <p> Using tools already covered, it's easy to enumerate all possible sums: </p> <p> <pre>ghci&gt; import Data.List ghci&gt; sort $ nub $ (uncurry (+)) &lt;$&gt; join (liftA2 (,)) [-3,0,2,4,7,10,12] [-6,-3,-1,0,1,2,4,6,7,8,9,10,11,12,14,16,17,19,20,22,24]</pre> </p> <p> The sums <em>-6</em>, <em>-3</em>, <em>-1</em>, and more, are possible, but don't appear in the data set. 
Thus, in the PMF for two randomly picked grades, the probability that the sum is <em>-6</em> is <em>0</em>. On the other hand, the probability that the sum is <em>0</em> is <em>6/1406 ~ 0.004267</em>, and so on. </p> <h3 id="ddcac27fb3ff468cb9312f0fcc333865"> Difference of experience levels <a href="#ddcac27fb3ff468cb9312f0fcc333865">#</a> </h3> <p> The other question posed in the assignment was to produce the PMF for the absolute difference between two randomly selected students' experience levels. </p> <p> Answering that question follows the same mould as above. First, extract experience level from each data row, instead of the grade: </p> <p> <pre><span style="color:#2b91af;">experience</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">Read</span>&nbsp;a&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;<span style="color:#2b91af;">String</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;a experience&nbsp;line&nbsp;=&nbsp;<span style="color:blue;">read</span>&nbsp;$&nbsp;splitOn&nbsp;<span style="color:#a31515;">&quot;,&quot;</span>&nbsp;line&nbsp;!!&nbsp;3</pre> </p> <p> Since I was doing an ad-hoc script, I just copied the <code>grade</code> function and changed the index from <code>2</code> to <code>3</code>. 
Enumerating the experience differences was also a close copy of <code>sumGrades</code>: </p> <p> <pre>diffExp&nbsp;ls&nbsp;=
&nbsp;&nbsp;liftA2&nbsp;<span style="color:#2b91af;">(,)</span>&nbsp;NE.<span style="color:blue;">head</span>&nbsp;<span style="color:blue;">length</span>&nbsp;&lt;$&gt;&nbsp;NE.group
&nbsp;&nbsp;&nbsp;&nbsp;(sort&nbsp;(<span style="color:blue;">abs</span>&nbsp;.&nbsp;<span style="color:blue;">uncurry</span>&nbsp;<span style="color:#2b91af;">(-)</span>&nbsp;.&nbsp;unPair&nbsp;.&nbsp;<span style="color:blue;">fmap</span>&nbsp;(experience&nbsp;.&nbsp;(ls&nbsp;!!))&nbsp;.&nbsp;Pair&nbsp;&lt;$&gt;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;choices&nbsp;(indices&nbsp;ls)))</pre> </p> <p> Running it in the REPL produces some other numbers, to be interpreted the same way as above: </p> <p> <pre>ghci&gt; diffExp &lt;$&gt; inputLines
[(0,246),(1,472),(2,352),(3,224),(4,82),(5,24),(6,6)]</pre> </p> <p> This means that the difference <em>0</em> occurs <em>246</em> times, <em>1</em> appears <em>472</em> times, and so on. From those numbers, it's fairly simple to set up the PMF. </p> <h3 id="948660097f7748f2844ebfd91371b2a2"> Figures <a href="#948660097f7748f2844ebfd91371b2a2">#</a> </h3> <p> Another part of the assignment was to produce plots of both PMFs. I don't know how to produce figures with Haskell, and since the final results are just a handful of numbers each, I just copied them into a text editor to align them, and then pasted them into Excel to produce the figures there. </p> <p> Here's the PMF for the differences: </p> <p> <img src="/content/binary/difference-pmf-plot.png" alt="Bar chart of the differences PMF."> </p> <p> I originally created the figure with Danish labels. I'm sure that you can guess what <em>differens</em> means, and <em>sandsynlighed</em> means <em>probability</em>.
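</p> <p> Going from the counts above to an actual PMF is a matter of dividing each count by the total number of pairs. As a minimal sketch, which is not part of the original <code>Crunch</code> module, the hypothetical <code>pmf</code> function below does that division, and the equally hypothetical <code>withZeros</code> function adds entries for outcomes that are possible but never occur in the data. The <code>diffCounts</code> list simply repeats the numbers reported by <code>diffExp</code>: </p> <p>

```haskell
import Data.Maybe (fromMaybe)
import Data.Ratio ((%))

-- Hypothetical helper: turn (outcome, count) pairs into
-- (outcome, probability) pairs by dividing each count by the total.
pmf :: [(a, Int)] -> [(a, Rational)]
pmf counts = [(x, fromIntegral c % total) | (x, c) <- counts]
  where
    total = fromIntegral (sum (snd <$> counts))

-- Hypothetical helper: give every possible outcome an entry, using
-- probability 0 for outcomes that don't appear in the observed data.
withZeros :: Eq a => [a] -> [(a, Rational)] -> [(a, Rational)]
withZeros outcomes ps = [(x, fromMaybe 0 (lookup x ps)) | x <- outcomes]

-- The experience-difference counts reported by diffExp above.
diffCounts :: [(Int, Int)]
diffCounts = [(0,246),(1,472),(2,352),(3,224),(4,82),(5,24),(6,6)]
```

With these definitions, <code>pmf diffCounts</code> pairs the difference <em>0</em> with the probability <em>246/1406</em>, and the probabilities sum to <em>1</em>. Using <code>Rational</code> rather than <code>Double</code> is a design choice that keeps the fractions exact.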
</p> <h3 id="9b225242b63e4616b25954ad9141e273"> Conclusion <a href="#9b225242b63e4616b25954ad9141e273">#</a> </h3> <p> In this article you've seen the artefacts of an ad-hoc script to extract and analyze a small data set. While I've spent quite a few words to explain what's going on, the entire <code>Crunch</code> module is only 34 lines of code. Add to that a few ephemeral queries done directly in GHCi, but never saved to a file. It's been some months since I wrote the code, but as far as I recall, it took me a few hours all in all. </p> <p> If you do stuff like this every day, you probably find that appalling, but data crunching isn't really my main thing. </p> <p> Is it quicker to do it in Python? Not for me, it turns out. It also took me a couple of hours to repeat the exercise in Python. </p> <p> <strong>Next:</strong> <a href="/2024/03/18/extracting-data-from-a-small-csv-file-with-python">Extracting data from a small CSV file with Python</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Range as a functor https://blog.ploeh.dk/2024/02/12/range-as-a-functor 2024-02-12T06:59:00+00:00 Mark Seemann <div id="post"> <p> <em>With examples in C#, F#, and Haskell.</em> </p> <p> This article is an instalment in <a href="/2024/01/01/variations-of-the-range-kata">a short series of articles on the Range kata</a>. In the previous three articles you've seen <a href="https://codingdojo.org/kata/Range/">the Range kata</a> implemented <a href="/2024/01/08/a-range-kata-implementation-in-haskell">in Haskell</a>, <a href="/2024/01/15/a-range-kata-implementation-in-f">in F#</a>, and <a href="/2024/01/22/a-range-kata-implementation-in-c">in C#</a>. </p> <p> The reason I engaged with this kata was that I find that it provides a credible example of how a pair of <a href="/2018/03/22/functors">functors</a> itself forms a functor.
In this article, you'll see how that works out in three languages. If you don't care about one or two of those languages, just skip that section. </p> <h3 id="f8b28f239ca6444c8f32c768e725fad7"> Haskell perspective <a href="#f8b28f239ca6444c8f32c768e725fad7">#</a> </h3> <p> If you've done any <a href="https://www.haskell.org/">Haskell</a> programming, you may be thinking that I have in mind the default <code>Functor</code> instances for tuples. As part of the <a href="https://hackage.haskell.org/package/base">base</a> library, tuples (pairs, triples, quadruples, etc.) are already <code>Functor</code> instances. Specifically for pairs, we have this instance: </p> <p> <pre>instance Functor ((,) a)</pre> </p> <p> Those are not the functor instances I have in mind. To a degree, I find these default <code>Functor</code> instances unfortunate, or at least arbitrary. Let's briefly explore the above instance to see why that is. </p> <p> Haskell is a notoriously terse language, but if we expand the above instance to (invalid) pseudocode, it says something like this: </p> <p> <pre>instance Functor ((a,b) b)</pre> </p> <p> What I'm trying to get across here is that the <code>a</code> type argument is fixed, and only the second type argument <code>b</code> can be mapped. Thus, you can map a <code>(Bool, String)</code> pair to a <code>(Bool, Int)</code> pair: </p> <p> <pre>ghci&gt; fmap length (True, "foo") (True,3)</pre> </p> <p> but the first element (<code>Bool</code>, in this example) is fixed, and you can't map that. To be clear, the first element can be any type, but once you've fixed it, you can't change it (within the constraints of the <code>Functor</code> API, mind): </p> <p> <pre>ghci&gt; fmap (replicate 3) (42, 'f') (42,"fff") ghci&gt; fmap ($ 3) ("bar", (* 2)) ("bar",6)</pre> </p> <p> The reason I find these default instances arbitrary is that this isn't the only possible <code>Functor</code> instance. 
Pairs, in particular, are also <a href="https://hackage.haskell.org/package/base/docs/Data-Bifunctor.html">Bifunctor</a> instances, so you can easily map over the first element, instead of the second: </p> <p> <pre>ghci&gt; first show (42, 'f')
("42",'f')</pre> </p> <p> Similarly, one can easily imagine a <code>Functor</code> instance for triples (three-tuples) that maps the middle element. The default instance, however, maps the third (i.e. last) element only. </p> <p> There are some hand-wavy rationalizations out there that argue that in Haskell, application and reduction are usually done from the right, so it's most appropriate to map over the rightmost element of tuples. I admit that it at least argues from a position of consistency, and it does make it easier to remember, but from a didactic perspective I still find it a bit unfortunate. It suggests that a tuple functor only maps the last element. </p> <p> What I had in mind for <em>ranges</em>, however, wasn't to map only the first or the last element. Neither did I wish to treat ranges as <a href="/2018/12/24/bifunctors">bifunctors</a>. What I really wanted was the ability to project an entire range. </p> <p> In my Haskell Range implementation, I'd simply treated ranges as tuples of <code>Endpoint</code> values, and although I didn't show that in the article, I ultimately declared <code>Endpoint</code> as a <code>Functor</code> instance: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;Endpoint&nbsp;a&nbsp;=&nbsp;Open&nbsp;a&nbsp;|&nbsp;Closed&nbsp;a&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Functor</span>)</pre> </p> <p> This enables you to map a single <code>Endpoint</code> value: </p> <p> <pre>ghci&gt; fmap length $ Closed "foo"
Closed 3</pre> </p> <p> That's just a single value, but the Range kata API operates with pairs of <code>Endpoint</code> values.
For example, the <code>contains</code> function has this type: </p> <p> <pre><span style="color:#2b91af;">contains</span>&nbsp;<span style="color:blue;">::</span>&nbsp;(<span style="color:blue;">Foldable</span>&nbsp;t,&nbsp;<span style="color:blue;">Ord</span>&nbsp;a)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;(<span style="color:blue;">Endpoint</span>&nbsp;a,&nbsp;<span style="color:blue;">Endpoint</span>&nbsp;a)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;t&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Bool</span></pre> </p> <p> Notice the <code>(Endpoint a, Endpoint a)</code> input type. </p> <p> Is it possible to treat such a pair as a functor? Yes, indeed, just import <a href="https://hackage.haskell.org/package/base/docs/Data-Functor-Product.html">Data.Functor.Product</a>, which enables you to package two functor values in a single wrapper: </p> <p> <pre>ghci&gt; import Data.Functor.Product ghci&gt; Pair (Closed "foo") (Open "corge") Pair (Closed "foo") (Open "corge")</pre> </p> <p> Now, granted, the <code>Pair</code> data constructor doesn't wrap a <em>tuple</em>, but that's easily fixed: </p> <p> <pre>ghci&gt; uncurry Pair (Closed "foo", Open "corge") Pair (Closed "foo") (Open "corge")</pre> </p> <p> The resulting <code>Pair</code> value is a <code>Functor</code> instance, which means that you can project it: </p> <p> <pre>ghci&gt; fmap length $ uncurry Pair (Closed "foo", Open "corge") Pair (Closed 3) (Open 5)</pre> </p> <p> Now, granted, I find the <code>Data.Functor.Product</code> API a bit lacking in convenience. For instance, there's no <code>getPair</code> function to retrieve the underlying values; you'd have to use pattern matching for that. </p> <p> In any case, my motivation for covering this ground wasn't to argue that <code>Data.Functor.Product</code> is all we need. The point was rather to observe that when you have two functors, you can combine them, and the combination is also a functor. 
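</p> <p> To make that observation concrete, here's a minimal sketch of how a tuple-based range could be mapped via <code>Data.Functor.Product</code>. The <code>Endpoint</code> type is a stand-in for the one shown above, and <code>toPair</code>, <code>getPair</code>, and <code>mapRange</code> are hypothetical helpers invented for this illustration; they exist neither in <em>base</em> nor in the Range kata code base: </p> <p>

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Data.Functor.Product (Product (Pair))

-- Stand-in for the article's Endpoint type.
data Endpoint a = Open a | Closed a deriving (Eq, Show, Functor)

-- Hypothetical helper: package a tuple of functor values in a Product.
toPair :: (f a, g a) -> Product f g a
toPair = uncurry Pair

-- Hypothetical helper: recover the tuple by pattern matching,
-- since base offers no accessor function for this.
getPair :: Product f g a -> (f a, g a)
getPair (Pair x y) = (x, y)

-- Map over both endpoints of a tuple-based range in one go, relying
-- on the Functor instance that Product gets from its two components.
mapRange :: (a -> b) -> (Endpoint a, Endpoint a) -> (Endpoint b, Endpoint b)
mapRange f = getPair . fmap f . toPair
```

Under these assumptions, <code>mapRange length (Closed "foo", Open "corge")</code> evaluates to <code>(Closed 3, Open 5)</code>, mirroring the GHCi session above.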
</p> <p> This is one of the many reasons I get so much value out of Haskell. Its abstraction level is so high that it substantiates relationships that may also exist in other code bases, written in other programming languages. Even if a language like <a href="https://fsharp.org/">F#</a> or C# can't formally express some of those abstractions, you can still make use of them as 'design patterns' (for lack of a better term). </p> <h3 id="ba94968ed2bc4780b995639212f8371b"> F# functor <a href="#ba94968ed2bc4780b995639212f8371b">#</a> </h3> <p> What we've learned from Haskell is that if we have two functors we can combine them into one. Specifically, I made <code>Endpoint</code> a <code>Functor</code> instance, and from that it followed automatically that a <code>Pair</code> of those was also a <code>Functor</code> instance. </p> <p> I can do the same in F#, starting with <code>Endpoint</code>. In F# I've unsurprisingly defined the type like this: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;Endpoint&lt;&#39;a&gt;&nbsp;=&nbsp;Open&nbsp;<span style="color:blue;">of</span>&nbsp;&#39;a&nbsp;|&nbsp;Closed&nbsp;<span style="color:blue;">of</span>&nbsp;&#39;a</pre> </p> <p> That's just a standard <a href="https://en.wikipedia.org/wiki/Tagged_union">discriminated union</a>. In order to make it a functor, you'll have to add a <code>map</code> function: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Endpoint&nbsp;=
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;map&nbsp;f&nbsp;=&nbsp;<span style="color:blue;">function</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Open&nbsp;&nbsp;&nbsp;x&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Open&nbsp;&nbsp;&nbsp;(f&nbsp;x)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Closed&nbsp;x&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Closed&nbsp;(f&nbsp;x)</pre> </p> <p> The function alone, however, isn't enough to give rise to a functor.
We must also convince ourselves that the <code>map</code> function obeys the functor laws. One way to do that is to write tests. While tests aren't <em>proofs</em>, we may still be sufficiently reassured by the tests that that's good enough for us. While I could, I'm not going to <em>prove</em> that <code>Endpoint.map</code> satisfies the functor laws. I will, later, do just that with the pair, but I'll leave this one as an exercise for the interested reader. </p> <p> Since I was already using <a href="https://hedgehog.qa/">Hedgehog</a> for property-based testing in my F# code, it was obvious to write properties for the functor laws as well. </p> <p> <pre>[&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``First&nbsp;functor&nbsp;law``&nbsp;()&nbsp;=&nbsp;Property.check&nbsp;&lt;|&nbsp;property&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;genInt32&nbsp;=&nbsp;Gen.int32&nbsp;(Range.linearBounded&nbsp;()) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;expected&nbsp;=&nbsp;Gen.choice&nbsp;[Gen.map&nbsp;Open&nbsp;genInt32;&nbsp;Gen.map&nbsp;Closed&nbsp;genInt32] &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;Endpoint.map&nbsp;id&nbsp;expected &nbsp;&nbsp;&nbsp;&nbsp;expected&nbsp;=!&nbsp;actual&nbsp;}</pre> </p> <p> This property exercises the first functor law for integer endpoints. Recall that this law states that if you map a value with the <a href="https://en.wikipedia.org/wiki/Identity_function">identity function</a>, nothing really happens. </p> <p> The second functor law is more interesting. 
</p> <p> <pre>[&lt;Fact&gt;]
<span style="color:blue;">let</span>&nbsp;``Second&nbsp;functor&nbsp;law``&nbsp;()&nbsp;=&nbsp;Property.check&nbsp;&lt;|&nbsp;property&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;genInt32&nbsp;=&nbsp;Gen.int32&nbsp;(Range.linearBounded&nbsp;())
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;endpoint&nbsp;=&nbsp;Gen.choice&nbsp;[Gen.map&nbsp;Open&nbsp;genInt32;&nbsp;Gen.map&nbsp;Closed&nbsp;genInt32]
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;f&nbsp;=&nbsp;Gen.item&nbsp;[id;&nbsp;((+)&nbsp;1);&nbsp;((*)&nbsp;2)]
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;g&nbsp;=&nbsp;Gen.item&nbsp;[id;&nbsp;((+)&nbsp;1);&nbsp;((*)&nbsp;2)]
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;Endpoint.map&nbsp;(f&nbsp;&lt;&lt;&nbsp;g)&nbsp;endpoint
&nbsp;&nbsp;&nbsp;&nbsp;Endpoint.map&nbsp;f&nbsp;(Endpoint.map&nbsp;g&nbsp;endpoint)&nbsp;=!&nbsp;actual&nbsp;}</pre> </p> <p> This property again exercises the law for integer endpoints. Not only does the property pick a random integer and vary whether the <code>Endpoint</code> is <code>Open</code> or <code>Closed</code>, it also picks two random functions from a small list of functions: The identity function (again), a function that increments by one, and a function that doubles the input. These two functions, <code>f</code> and <code>g</code>, might then be the same, but might also be different from each other. Thus, the composition <code>f&nbsp;&lt;&lt;&nbsp;g</code> <em>might</em> be <code>id &lt;&lt; id</code> or <code>((+) 1) &lt;&lt; ((+) 1)</code>, but might just as well be <code>((+) 1) &lt;&lt; ((*) 2)</code>, or one of the other possible combinations. </p> <p> The law states that the result should be the same regardless of whether you first compose the functions and then map them, or map them one after the other. </p> <p> Which is the case.
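</p> <p> For comparison, the same two laws could be spot-checked in Haskell without a property-based testing library. The following is only a sketch under stated assumptions: the <code>Endpoint</code> type is a stand-in for the one from the Haskell section, the <code>samples</code> list is hand-picked rather than randomly generated, and the names <code>candidates</code>, <code>samples</code>, <code>firstLawHolds</code>, and <code>secondLawHolds</code> are invented for this illustration. Like the properties above, it demonstrates the laws for some inputs; it proves nothing: </p> <p>

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Stand-in for the article's Endpoint type.
data Endpoint a = Open a | Closed a deriving (Eq, Show, Functor)

-- The same three candidate functions as in the F# property:
-- identity, increment by one, and doubling.
candidates :: [Int -> Int]
candidates = [id, (+ 1), (* 2)]

-- A few hand-picked endpoints (an assumption; a property-based
-- test would generate these at random).
samples :: [Endpoint Int]
samples = [Open 0, Closed 0, Open 42, Closed (-7), Open maxBound]

-- First functor law: mapping the identity function changes nothing.
firstLawHolds :: Bool
firstLawHolds = all (\e -> fmap id e == e) samples

-- Second functor law: mapping a composition gives the same result
-- as mapping the two functions one after the other.
secondLawHolds :: Bool
secondLawHolds =
  and [ fmap (f . g) e == (fmap f . fmap g) e
      | f <- candidates, g <- candidates, e <- samples ]
```

Both <code>firstLawHolds</code> and <code>secondLawHolds</code> evaluate to <code>True</code> for these samples.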
</p> <p> A <code>Range</code> is defined like this: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;Range&lt;&#39;a&gt;&nbsp;=&nbsp;{&nbsp;LowerBound&nbsp;:&nbsp;Endpoint&lt;&#39;a&gt;;&nbsp;UpperBound&nbsp;:&nbsp;Endpoint&lt;&#39;a&gt;&nbsp;}</pre> </p> <p> This record type also gives rise to a functor: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Range&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;map&nbsp;f&nbsp;{&nbsp;LowerBound&nbsp;=&nbsp;lowerBound;&nbsp;UpperBound&nbsp;=&nbsp;upperBound&nbsp;}&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;LowerBound&nbsp;=&nbsp;Endpoint.map&nbsp;f&nbsp;lowerBound &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;UpperBound&nbsp;=&nbsp;Endpoint.map&nbsp;f&nbsp;upperBound&nbsp;}</pre> </p> <p> This <code>map</code> function uses the projection <code>f</code> on both the <code>lowerBound</code> and the <code>upperBound</code>. It, too, obeys the functor laws: </p> <p> <pre>[&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``First&nbsp;functor&nbsp;law``&nbsp;()&nbsp;=&nbsp;Property.check&nbsp;&lt;|&nbsp;property&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;genInt64&nbsp;=&nbsp;Gen.int64&nbsp;(Range.linearBounded&nbsp;()) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;genEndpoint&nbsp;=&nbsp;Gen.choice&nbsp;[Gen.map&nbsp;Open&nbsp;genInt64;&nbsp;Gen.map&nbsp;Closed&nbsp;genInt64] &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;expected&nbsp;=&nbsp;Gen.tuple&nbsp;genEndpoint&nbsp;|&gt;&nbsp;Gen.map&nbsp;Range.ofEndpoints &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;expected&nbsp;|&gt;&nbsp;Ploeh.Katas.Range.map&nbsp;id &nbsp;&nbsp;&nbsp;&nbsp;expected&nbsp;=!&nbsp;actual&nbsp;} [&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``Second&nbsp;functor&nbsp;law``&nbsp;()&nbsp;=&nbsp;Property.check&nbsp;&lt;|&nbsp;property&nbsp;{ 
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;genInt16&nbsp;=&nbsp;Gen.int16&nbsp;(Range.linearBounded&nbsp;()) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;genEndpoint&nbsp;=&nbsp;Gen.choice&nbsp;[Gen.map&nbsp;Open&nbsp;genInt16;&nbsp;Gen.map&nbsp;Closed&nbsp;genInt16] &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;range&nbsp;=&nbsp;Gen.tuple&nbsp;genEndpoint&nbsp;|&gt;&nbsp;Gen.map&nbsp;Range.ofEndpoints &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;f&nbsp;=&nbsp;Gen.item&nbsp;[id;&nbsp;((+)&nbsp;1s);&nbsp;((*)&nbsp;2s)] &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;g&nbsp;=&nbsp;Gen.item&nbsp;[id;&nbsp;((+)&nbsp;1s);&nbsp;((*)&nbsp;2s)] &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;range&nbsp;|&gt;&nbsp;Ploeh.Katas.Range.map&nbsp;(f&nbsp;&lt;&lt;&nbsp;g) &nbsp;&nbsp;&nbsp;&nbsp;Ploeh.Katas.Range.map&nbsp;f&nbsp;(Ploeh.Katas.Range.map&nbsp;g&nbsp;range)&nbsp;=!&nbsp;actual&nbsp;}</pre> </p> <p> These two Hedgehog properties are cast in the same mould as the <code>Endpoint</code> properties, only they create 64-bit and 16-bit ranges for variation's sake. </p> <h3 id="c91ec56a7b22445b85ac4253f81c5c74"> C# functor <a href="#c91ec56a7b22445b85ac4253f81c5c74">#</a> </h3> <p> As I wrote about the Haskell result, it teaches us which abstractions are possible, even if we can't formalise them to the same degree in, say, C# as we can in Haskell. It should come as no surprise, then, that we can also make <code><span style="color:#2b91af;">Range</span>&lt;<span style="color:#2b91af;">T</span>&gt;</code> a functor in C#. </p> <p> In C# we <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatically</a> do that by giving a class a <code>Select</code> method. 
Again, we'll have to begin with <code>Endpoint</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;Endpoint&lt;TResult&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">Select</span>&lt;<span style="color:#2b91af;">TResult</span>&gt;(Func&lt;T,&nbsp;TResult&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">selector</span>)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Match(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;whenClosed:&nbsp;<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;Endpoint.Closed(selector(x)),
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;whenOpen:&nbsp;<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;Endpoint.Open(selector(x)));
}</pre> </p> <p> Does that <code>Select</code> method obey the functor laws? Yes, as we can demonstrate (not prove) with a few properties: </p> <p> <pre>[Fact]
<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">FirstFunctorLaw</span>()
{
&nbsp;&nbsp;&nbsp;&nbsp;Gen.OneOf(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Gen.Int.Select(Endpoint.Open),
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Gen.Int.Select(Endpoint.Closed))
&nbsp;&nbsp;&nbsp;&nbsp;.Sample(<span style="font-weight:bold;color:#1f377f;">expected</span>&nbsp;=&gt;
&nbsp;&nbsp;&nbsp;&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;expected.Select(<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x);
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Assert.Equal(expected,&nbsp;actual);
&nbsp;&nbsp;&nbsp;&nbsp;});
}
[Fact]
<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">SecondFunctorLaw</span>()
{
&nbsp;&nbsp;&nbsp;&nbsp;(<span
style="color:blue;">from</span>&nbsp;endpoint&nbsp;<span style="color:blue;">in</span>&nbsp;Gen.OneOf( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Gen.Int.Select(Endpoint.Open), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Gen.Int.Select(Endpoint.Closed)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;f&nbsp;<span style="color:blue;">in</span>&nbsp;Gen.OneOfConst&lt;Func&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">int</span>&gt;&gt;(<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x,&nbsp;<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x&nbsp;+&nbsp;1,&nbsp;<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x&nbsp;*&nbsp;2) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;g&nbsp;<span style="color:blue;">in</span>&nbsp;Gen.OneOfConst&lt;Func&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">int</span>&gt;&gt;(<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x,&nbsp;<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x&nbsp;+&nbsp;1,&nbsp;<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x&nbsp;*&nbsp;2) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">select</span>&nbsp;(endpoint,&nbsp;f,&nbsp;g)) &nbsp;&nbsp;&nbsp;&nbsp;.Sample(<span style="font-weight:bold;color:#1f377f;">t</span>&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;t.endpoint.Select(<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;t.g(t.f(x))); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Assert.Equal( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.endpoint.Select(t.f).Select(t.g), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;actual); 
&nbsp;&nbsp;&nbsp;&nbsp;}); }</pre> </p> <p> These two tests follow the scheme laid out by the above F# properties, and they both pass. </p> <p> The <code>Range</code> class gets the same treatment. First, a <code>Select</code> method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;Range&lt;TResult&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">Select</span>&lt;<span style="color:#2b91af;">TResult</span>&gt;(Func&lt;T,&nbsp;TResult&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">selector</span>) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">where</span>&nbsp;TResult&nbsp;:&nbsp;IComparable&lt;TResult&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;Range&lt;TResult&gt;(min.Select(selector),&nbsp;max.Select(selector)); }</pre> </p> <p> which, again, can be demonstrated with two properties that exercise the functor laws: </p> <p> <pre>[Fact] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">FirstFunctorLaw</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">genEndpoint</span>&nbsp;=&nbsp;Gen.OneOf( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Gen.Int.Select(Endpoint.Closed), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Gen.Int.Select(Endpoint.Open)); &nbsp;&nbsp;&nbsp;&nbsp;genEndpoint.SelectMany(<span style="font-weight:bold;color:#1f377f;">min</span>&nbsp;=&gt;&nbsp;genEndpoint &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Select(<span style="font-weight:bold;color:#1f377f;">max</span>&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;Range&lt;<span style="color:blue;">int</span>&gt;(min,&nbsp;max))) &nbsp;&nbsp;&nbsp;&nbsp;.Sample(<span style="font-weight:bold;color:#1f377f;">sut</span>&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;sut.Select(<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Assert.Equal(sut,&nbsp;actual); &nbsp;&nbsp;&nbsp;&nbsp;}); } [Fact] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">SecondFunctorLaw</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">genEndpoint</span>&nbsp;=&nbsp;Gen.OneOf( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Gen.Int.Select(Endpoint.Closed), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Gen.Int.Select(Endpoint.Open)); &nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">from</span>&nbsp;min&nbsp;<span style="color:blue;">in</span>&nbsp;genEndpoint &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;max&nbsp;<span style="color:blue;">in</span>&nbsp;genEndpoint &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;f&nbsp;<span style="color:blue;">in</span>&nbsp;Gen.OneOfConst&lt;Func&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">int</span>&gt;&gt;(<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x,&nbsp;<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x&nbsp;+&nbsp;1,&nbsp;<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x&nbsp;*&nbsp;2) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;g&nbsp;<span style="color:blue;">in</span>&nbsp;Gen.OneOfConst&lt;Func&lt;<span style="color:blue;">int</span>,&nbsp;<span style="color:blue;">int</span>&gt;&gt;(<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x,&nbsp;<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x&nbsp;+&nbsp;1,&nbsp;<span 
style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x&nbsp;*&nbsp;2) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">select</span>&nbsp;(sut&nbsp;:&nbsp;<span style="color:blue;">new</span>&nbsp;Range&lt;<span style="color:blue;">int</span>&gt;(min,&nbsp;max),&nbsp;f,&nbsp;g)) &nbsp;&nbsp;&nbsp;&nbsp;.Sample(<span style="font-weight:bold;color:#1f377f;">t</span>&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;t.sut.Select(<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;t.g(t.f(x))); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Assert.Equal( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;t.sut.Select(t.f).Select(t.g), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;actual); &nbsp;&nbsp;&nbsp;&nbsp;}); }</pre> </p> <p> These tests also pass. </p> <h3 id="222d65b253b145679994d1b9336069c7"> Laws <a href="#222d65b253b145679994d1b9336069c7">#</a> </h3> <p> Exercising a pair of properties can give us a good warm feeling that the data structures and functions defined above are proper functors. Sometimes, tests are all we have, but in this case we can do better. We can prove that the functor laws always hold. </p> <p> The various above incarnations of a <code>Range</code> type are all <a href="https://en.wikipedia.org/wiki/Product_type">product types</a>, and the canonical form of a product type is a tuple (see e.g. <a href="https://thinkingwithtypes.com/">Thinking with Types</a> for a clear explanation of why that is). That's the reason I stuck with a tuple in my Haskell code. 
</p> <p> Consider the <code>fmap</code> implementation for <code>Pair</code>: </p> <p> <pre>fmap f (Pair x y) = Pair (fmap f x) (fmap f y)</pre> </p> <p> We can use equational reasoning, and as always I'll use <a href="https://bartoszmilewski.com/2015/01/20/functors/">the notation that Bartosz Milewski uses</a>. It's only natural to begin with the first functor law, using <code>F</code> and <code>G</code> as placeholders for two arbitrary <code>Functor</code> data constructors. </p> <p> <pre>  fmap id (Pair (F x) (G y))
= { definition of fmap }
  Pair (fmap id (F x)) (fmap id (G y))
= { first functor law }
  Pair (F x) (G y)
= { definition of id }
  id (Pair (F x) (G y))</pre> </p> <p> Keep in mind that in this notation, the equal signs are true equalities, going both ways. Thus, you can read this proof from the top to the bottom, or from the bottom to the top. The equality holds both ways, as should be the case for a true equality. </p> <p> We can proceed in the same vein to prove the second functor law, being careful to distinguish between <code>Functor</code> instances (<code>F</code> and <code>G</code>) and functions (<code>f</code> and <code>g</code>): </p> <p> <pre>  fmap (g . f) (Pair (F x) (G y))
= { definition of fmap }
  Pair (fmap (g . f) (F x)) (fmap (g . f) (G y))
= { second functor law }
  Pair ((fmap g . fmap f) (F x)) ((fmap g . fmap f) (G y))
= { definition of composition }
  Pair (fmap g (fmap f (F x))) (fmap g (fmap f (G y)))
= { definition of fmap }
  fmap g (Pair (fmap f (F x)) (fmap f (G y)))
= { definition of fmap }
  fmap g (fmap f (Pair (F x) (G y)))
= { definition of composition }
  (fmap g . fmap f) (Pair (F x) (G y))</pre> </p> <p> Notice that both proofs make use of the functor laws. This may seem self-referential, but is rather recursive. When the proofs refer to the functor laws, they refer to the functors <code>F</code> and <code>G</code>, which are both assumed to be lawful. 
</p> <p> This is how we know that the product of two lawful functors is itself a functor. </p> <h3 id="54ac7f2fadef46c4a295333a2037656e"> Negations <a href="#54ac7f2fadef46c4a295333a2037656e">#</a> </h3> <p> During all of this, you may have thought: <em>What happens if we project a range with a negation?</em> </p> <p> As a simple example, let's consider the range from <em>-1</em> to <em>2:</em> </p> <p> <pre>ghci&gt; uncurry Pair (Closed (-1), Closed 2)
Pair (Closed (-1)) (Closed 2)</pre> </p> <p> We may draw this range on the number line like this: </p> <p> <img src="/content/binary/single-range-on-number-line.png" alt="The range from -1 to 2 drawn on the number line."> </p> <p> What happens if we map that range by multiplying by <em>-1?</em> </p> <p> <pre>ghci&gt; fmap negate $ uncurry Pair (Closed (-1), Closed 2)
Pair (Closed 1) (Closed (-2))</pre> </p> <p> We get a range from <em>1</em> to <em>-2!</em> </p> <p> <em>Aha!</em> you say, <em>clearly that's wrong!</em> We've just found a counterexample. After all, <em>range</em> isn't a functor. </p> <p> Not so. The functor laws say nothing about the interpretation of projections (but I'll get back to that in a moment). Rather, they say something about composition, so let's consider an example that reaches a similar, seemingly wrong result: </p> <p> <pre>ghci&gt; fmap ((+1) . negate) $ uncurry Pair (Closed (-1), Closed 2)
Pair (Closed 2) (Closed (-1))</pre> </p> <p> This is a range from <em>2</em> to <em>-1</em>, so just as problematic as before. </p> <p> The second functor law states that the outcome should be the same if we map piecewise: </p> <p> <pre>ghci&gt; (fmap (+ 1) . fmap negate) $ uncurry Pair (Closed (-1), Closed 2)
Pair (Closed 2) (Closed (-1))</pre> </p> <p> Still a range from <em>2</em> to <em>-1</em>. The second functor law holds. </p> <p> <em>But,</em> you protest, <em>that doesn't make any sense!</em> </p> <p> I disagree. It could make sense in at least three different ways. 
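</p> <p> If Haskell isn't your thing, a minimal Python sketch of the same check may help. It assumes a hypothetical <code>Closed</code> endpoint and a hypothetical <code>Pair</code> class, each with a <code>map</code> method standing in for <code>fmap</code>. None of these Python names come from the code above, and open endpoints are left out for brevity. </p>

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Closed:
    """A closed endpoint (hypothetical stand-in for the Haskell constructor)."""
    value: int

    def map(self, f: Callable[[int], int]) -> "Closed":
        return Closed(f(self.value))


@dataclass(frozen=True)
class Pair:
    """A range modelled as a pair of endpoints (hypothetical)."""
    first: Closed
    second: Closed

    def map(self, f: Callable[[int], int]) -> "Pair":
        # Map both endpoints independently, like fmap does for Pair.
        return Pair(self.first.map(f), self.second.map(f))


def negate(x: int) -> int:
    return -x


def add_one(x: int) -> int:
    return x + 1


original = Pair(Closed(-1), Closed(2))

# Map once with the composed function ...
composed = original.map(lambda x: add_one(negate(x)))
# ... and map twice, one function at a time.
piecewise = original.map(negate).map(add_one)

# The second functor law says that these two must be equal.
assert composed == piecewise
```

<p> Both expressions produce a pair whose first element is 2 and whose second element is -1, mirroring the GHCi session. The result may look odd, but the composition law holds all the same. 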
</p> <p> What does a range from <em>2</em> to <em>-1</em> mean? I can think of three interpretations: </p> <ul> <li>It's the empty set</li> <li>It's the range from <em>-1</em> to <em>2</em></li> <li>It's the set of numbers that are either less than or equal to <em>-1</em> or greater than or equal to <em>2</em></li> </ul> <p> We may illustrate those three interpretations, together with the original range, like this: </p> <p> <img src="/content/binary/three-ranges-on-number-lines.png" alt="Four number lines, each with a range interpretation drawn in."> </p> <p> According to the first interpretation, we consider the range as the Boolean <em>and</em> of two predicates. In this interpretation the initial range is really the Boolean expression <em>-1 ≤ x ∧ x ≤ 2</em>. The projected range then becomes the expression <em>2 ≤ x ∧ x ≤ -1</em>, which is not possible. This is how I've chosen to implement the <code>contains</code> function: </p> <p> <pre>ghci&gt; Pair x y = fmap ((+1) . negate) $ uncurry Pair (Closed (-1), Closed 2) ghci&gt; contains (x, y) [0] False ghci&gt; contains (x, y) [-3] False ghci&gt; contains (x, y) [4] False</pre> </p> <p> In this interpretation, the result is the empty set. The range isn't impossible; it's just empty. That's the second number line from the top in the above illustration. </p> <p> This isn't, however, the only interpretation. Instead, we may choose to <a href="https://en.wikipedia.org/wiki/Robustness_principle">be liberal in what we accept</a> and interpret the range from <em>2</em> to <em>-1</em> as a 'programmer mistake': <em>What you asked me to do is formally wrong, but I think that I understand that you meant the range from </em>-1<em> to </em>2. </p> <p> That's the third number line in the above illustration. 
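</p> <p> The three interpretations in the list above can be sketched as three membership predicates. This is a Python sketch under simplifying assumptions: both endpoints are closed integers passed as plain arguments, and the function names are made up for illustration. Only the first predicate corresponds to the behaviour of the <code>contains</code> function shown in the GHCi session. </p>

```python
def contains_conjunction(first, second, x):
    # Interpretation 1: the Boolean 'and' of two predicates.
    # If first > second, no x can satisfy both, so the set is empty.
    return x >= first and second >= x


def contains_reordered(first, second, x):
    # Interpretation 2: treat reversed endpoints as a programmer mistake
    # and silently put them in order.
    lower, upper = min(first, second), max(first, second)
    return x >= lower and upper >= x


def contains_outside(first, second, x):
    # Interpretation 3: reversed endpoints denote everything at or below
    # the second element, or at or above the first element.
    if first > second:
        return second >= x or x >= first
    return x >= first and second >= x


# The range from 2 to -1, probed with the same candidates as above.
candidates = [0, -3, 4]
empty = [contains_conjunction(2, -1, x) for x in candidates]
reordered = [contains_reordered(2, -1, x) for x in candidates]
outside = [contains_outside(2, -1, x) for x in candidates]
```

<p> For the range from 2 to -1 and the candidates 0, -3, and 4, the first predicate rejects all three, the second accepts only 0, and the third accepts -3 and 4. 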
</p> <p> The third interpretation is that when the first element of the range is greater than the second, the range represents the <a href="https://en.wikipedia.org/wiki/Complement_(set_theory)">complement</a> of the range. That's the fourth number line in the above illustration. </p> <p> The reason I spent some time on this is that it's easy to confuse the functor laws with other properties that you may associate with a data structure. This may lead you to falsely conclude that a functor isn't a functor, because you feel that it violates some other invariant. </p> <p> If this happens, consider instead whether you could possibly expand the interpretation of the data structure in question. </p> <h3 id="859ece7acfdb415da1ba52578189e9ca"> Conclusion <a href="#859ece7acfdb415da1ba52578189e9ca">#</a> </h3> <p> You can model a <em>range</em> as a functor, which enables you to project ranges, either moving them around on an imaginary number line, or changing the type of the range. This might for example enable you to map a date range to an integer range, or vice versa. </p> <p> A functor enables mapping or projection, and some maps may produce results that you find odd or counter-intuitive. In this article you saw an example of that in the shape of a negated range where the first element (the 'minimum', in one interpretation) becomes greater than the second element (the 'maximum'). You may take that as an indication that the functor isn't, after all, a functor. </p> <p> This isn't the case. A data structure and its <em>map</em> function is a functor if the mapping obeys the functor laws, which is the case for the range structures you've seen here. </p> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. 
Statically and dynamically typed scripts https://blog.ploeh.dk/2024/02/05/statically-and-dynamically-typed-scripts 2024-02-05T07:53:00+00:00 Mark Seemann <div id="post"> <p> <em>Extracting and analysing data in Haskell and Python.</em> </p> <p> I was recently following a course in mathematical analysis and probability for computer scientists. One assignment asked us to analyze a small <a href="https://en.wikipedia.org/wiki/Comma-separated_values">CSV file</a> with data collected in a student survey. The course contained a mix of pure maths and practical application, and the official programming language to be used was <a href="https://www.python.org/">Python</a>. It was understood that one was to do the work in Python, but it wasn't an explicit requirement, and I was so tired that I didn't have the energy for it. </p> <p> I can get by in Python, but it's not a language I'm actually comfortable with. For small experiments, ad-hoc scripting, etc. I reach for <a href="https://www.haskell.org/">Haskell</a>, so that's what I did. </p> <p> This was a few months ago, and I've since followed another course that required more intense use of Python. With a few more months of Python programming under my belt, I decided to revisit that old problem and do it in Python with the explicit purpose of comparing and contrasting the two. </p> <h3 id="ae9c59e5fd0744f98841c6f864b20e33"> Static or dynamic types for scripting <a href="#ae9c59e5fd0744f98841c6f864b20e33">#</a> </h3> <p> I'd like to make one point with these articles, and that is that dynamically typed languages aren't inherently better suited for scripting than statically typed languages. From this, it does not, however, follow that statically typed languages are better, either. Rather, I increasingly believe that whether you find one or the other more productive is a question of personality, past experiences, programming background, etc. I've been over this ground before. 
<a href="/2021/08/09/am-i-stuck-in-a-local-maximum">Many of my heroes seem to favour dynamically typed languages</a>, while I keep returning to statically typed languages. </p> <p> For more than a decade I've preferred <a href="https://fsharp.org/">F#</a> or Haskell for ad-hoc scripting. Note that while these languages are statically typed, they are <a href="/2019/12/16/zone-of-ceremony">low on ceremony</a>. Types are <em>inferred</em> rather than declared. This means that for scripts, you can experiment with small code blocks, iteratively move closer to what you need, just as you would with a language like Python. Change a line of code, and the inferred type changes with it; there are no type declarations that you also need to fix. </p> <p> When I talk about writing scripts in statically typed languages, I have such languages in mind. I wouldn't write a script in C#, <a href="https://en.wikipedia.org/wiki/C_(programming_language)">C</a>, or <a href="https://www.java.com/">Java</a>. </p> <blockquote> <p> "Let me stop you right there: I don't think there is a real dynamic typing versus static typing debate. </p> <p> "What such debates normally are is language X vs language Y debates (where X happens to be dynamic and Y happens to be static)." </p> <footer><cite><a href="https://twitter.com/KevlinHenney/status/1425513161252278280">Kevlin Henney</a></cite></footer> </blockquote> <p> The present articles compare Haskell and Python, so be careful that you don't extrapolate and draw any conclusions about, say, <a href="https://en.wikipedia.org/wiki/C%2B%2B">C++</a> versus <a href="https://www.erlang.org/">Erlang</a>. </p> <p> When writing an ad-hoc script to extract data from a file, it's important to be able to experiment and iterate. Load the file, inspect the data, figure out how to extract subsets of it (particular columns, for example), calculate totals, averages, etc. 
A <a href="https://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop">REPL</a> is indispensable in such situations. The Haskell REPL (called <em><a href="https://en.wikipedia.org/wiki/Glasgow_Haskell_Compiler">Glasgow Haskell Compiler</a> interactive</em>, or just <em>GHCi</em>) is the best one I've encountered. </p> <p> I imagine that a Python expert would start by reading the data to slice and dice it various ways. We may label this a <em>data-first</em> approach, but be careful not to read too much into this, as I don't really know what I'm talking about. That's not how my mind works. Instead, I tend to take a <em>types-first</em> approach. I'll look at the data and start with the types. </p> <h3 id="6454710fbb5644ae979af4aa247dce96"> The assignment <a href="#6454710fbb5644ae979af4aa247dce96">#</a> </h3> <p> The actual task is the following. At the beginning of the course, the professors asked students to fill out a survey. Among the questions asked was which grade the student expected to receive, and how much experience with programming he or she already had. </p> <p> Grades are given according to the <a href="https://en.wikipedia.org/wiki/Academic_grading_in_Denmark">Danish academic scale</a>: -3, 00, 02, 4, 7, 10, and 12, and experience level on a simple numeric scale from 1 to 7, with 1 indicating no experience and 7 indicating expert-level experience. </p> <p> Here's a small sample of the data: </p> <p> <pre>No,3,2,6,6 No,4,2,3,7 No,1,12,6,2 No,4,10,4,3 No,3,4,4,6</pre> </p> <p> The expected grade is in the third column (i.e. <em>2, 2, 12, 10, 4</em>) and the experience level is in the fourth column (<em>6,3,6,4,4</em>). The other columns are answers to different survey questions. The full data set contains 38 rows. </p> <p> The assignment poses the following questions: Two rows from the survey data are randomly selected. 
What is the <a href="https://www.probabilitycourse.com/chapter3/3_1_3_pmf.php">probability mass function</a> (PMF) of the sum of their expected grades, and what is the PMF of the absolute difference between their programming experience levels? </p> <p> In both cases I was also asked to plot the PMFs. </p> <h3 id="331b2ed3198f4a59872eb0e9d2f4ebd9"> Comparisons <a href="#331b2ed3198f4a59872eb0e9d2f4ebd9">#</a> </h3> <p> As outlined above, I originally wrote a Haskell script to answer the questions, and only months later returned to the problem to give it a go in Python. When reading my detailed walkthroughs, keep in mind that I have 8-9 years of Haskell experience, and that I tend to 'think in Haskell', while I have only about a year of experience with Python. I don't consider myself proficient with Python, so the competition is rigged from the outset. </p> <ul> <li><a href="/2024/02/19/extracting-data-from-a-small-csv-file-with-haskell">Extracting data from a small CSV file with Haskell</a></li> <li><a href="/2024/03/18/extracting-data-from-a-small-csv-file-with-python">Extracting data from a small CSV file with Python</a></li> </ul> <p> For this small task, I don't think that there's a clear winner. I still like my Haskell code the best, but I'm sure someone better at Python could write a much cleaner script. I also have to admit that <a href="https://matplotlib.org/">Matplotlib</a> makes it a breeze to produce nice-looking plots with Python, whereas I don't even know where to start with that with Haskell. </p> <p> Recently I've done some more advanced data analysis with Python, such as random forest classification, principal component analysis, KNN-classification, etc. While I understand that I'm only scratching the surface of data science and machine learning, it's obvious that there's a rich Python ecosystem for that kind of work. 
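</p> <p> To make the assignment concrete, here's a minimal Python sketch that computes both PMFs for the five sample rows shown earlier. It isn't the script from either of the linked articles. It also rests on an assumption that the assignment text doesn't settle: that the two rows are distinct, that their order doesn't matter, and that every pair is equally likely. The helper name <code>pmf</code> is made up for illustration. </p>

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# The five sample rows quoted above; the full data set has 38 rows.
sample = """No,3,2,6,6
No,4,2,3,7
No,1,12,6,2
No,4,10,4,3
No,3,4,4,6"""

rows = [line.split(",") for line in sample.splitlines()]
grades = [int(row[2]) for row in rows]      # third column
experience = [int(row[3]) for row in rows]  # fourth column


def pmf(values, combine):
    """PMF of combine(a, b) over all unordered pairs of distinct rows.

    Assumption: sampling without replacement, all pairs equally likely.
    """
    outcomes = Counter(combine(a, b) for a, b in combinations(values, 2))
    total = sum(outcomes.values())
    return {k: Fraction(n, total) for k, n in sorted(outcomes.items())}


grade_sum_pmf = pmf(grades, lambda a, b: a + b)
experience_diff_pmf = pmf(experience, lambda a, b: abs(a - b))

for outcome, probability in grade_sum_pmf.items():
    print(outcome, probability)
```

<p> Using exact fractions rather than floating-point numbers makes it easy to check that the probabilities add up to one. With only five rows there are just ten pairs, so you can verify the result by hand before running it on the full data set. 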
</p> <h3 id="dcf63d011f02487eb051e3a75cbe59f7"> Conclusion <a href="#dcf63d011f02487eb051e3a75cbe59f7">#</a> </h3> <p> This lays the foundations for comparing a small Haskell script with an equivalent Python script. There's no scientific method to the comparison; it's just me doing the same exercise twice, a bit like I'd <a href="/2020/01/13/on-doing-katas">do katas</a> with multiple variations in order to learn. </p> <p> While I still like Haskell better than Python, that's only a personal preference. I'm deliberately not declaring a winner. </p> <p> One point I'd like to make, however, is that there's nothing inherently better about a dynamically typed language when it comes to ad-hoc scripting. Languages with strong type inference work well, too. </p> <p> <strong>Next:</strong> <a href="/2024/02/19/extracting-data-from-a-small-csv-file-with-haskell">Extracting data from a small CSV file with Haskell</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Error categories and category errors https://blog.ploeh.dk/2024/01/29/error-categories-and-category-errors 2024-01-29T16:05:00+00:00 Mark Seemann <div id="post"> <p> <em>How I currently think about errors in programming.</em> </p> <p> A reader <a href="/2023/08/14/replacing-mock-and-stub-with-a-fake#0afe67b375254fe193a3fd10234a1ce9">recently asked a question</a> that caused me to reflect on the way I think about errors in software. While my approach to error handling has remained largely the same for years, I don't think I've described it in an organized way. I'll try to present those thoughts here. </p> <p> This article is, for lack of a better term, a <em>think piece</em>. I don't pretend that it represents any fundamental truth, or that this is the only way to tackle problems. Rather, I write this article for two reasons. </p> <ul> <li>Writing things down often helps clarify your thoughts. 
While I already feel that my thinking on the topic of error handling is fairly clear, I've written enough articles that I know that by writing this one, I'll learn something new.</li> <li>Publishing this article enables the exchange of ideas. By sharing my thoughts, I enable readers to point out errors in my thinking, or to improve on my work. Again, I may learn something. Perhaps others will, too.</li> </ul> <p> Although I don't claim that the following is universal, I've found it useful for years. </p> <h3 id="43e58fcae3184b0597def9a4ec5629d7"> Error categories <a href="#43e58fcae3184b0597def9a4ec5629d7">#</a> </h3> <p> Almost all software is at risk of failing for a myriad of reasons: User input, malformed data, network partitions, cosmic rays, race conditions, bugs, etc. Even so, we may categorize errors like this: </p> <ul> <li>Predictable errors we can handle</li> <li>Predictable errors we can't handle</li> <li>Errors we've failed to predict</li> </ul> <p> This distinction is hardly original. I believe I've picked it up from Michael Feathers, but although I've searched, I can't find the source, so perhaps I'm remembering it wrong. </p> <p> You may find these three error categories underwhelming, but I find it useful to first consider what may be done about an error. Plenty of error situations are predictable. For example, all input should be considered suspect. This includes user input, but also data you receive from other systems. This kind of potential error you can typically solve with input validation, which I believe is <a href="/2020/12/14/validation-a-solved-problem">a solved problem</a>. Another predictable kind of error is unavailable services. Many systems store data in databases. You can easily predict that the database <em>will</em>, sooner or later, be unreachable. Potential causes include network partitions, a misconfigured connection string, logs running full, a crashed server, denial-of-service attacks, etc. 
</p> <p> With some experience with software development, it's not that hard to produce a list of things that could go wrong. The next step is to decide what to do about it. </p> <p> There are scenarios that are so likely to happen, and where the solution is so well-known, that they fall into the category of predictable errors that you can handle. User input belongs here. You examine the input and inform the user if it's invalid. </p> <p> Even with input, however, other scenarios may lead you down different paths. What if, instead of a system with a user interface, you're developing a batch job that receives a big data file every night? How do you deal with invalid input in that scenario? Do you reject the entire data set, or do you filter it so that you only handle the valid input? Do you raise a notification to asynchronously inform the sender that input was malformed? </p> <p> Notice how categorization is context-dependent. It would be a (category?) error to interpret the above model as fixed and universal. Rather, it's an analysis framework that helps identify how to categorize various fault scenarios in a particular application context. </p> <p> Another example may be in order. If your system depends on a database, a predictable error is that the database will be unavailable. Can you handle that situation? </p> <p> A common reaction is that there's really not a lot one can do about that. You may retry the operation, log the problem, or notify an on-call engineer, but ultimately the system <em>depends</em> on the database. If the database is unreachable, the system can't work. You can't handle that problem, so this falls in the category of predictable errors that you can't handle. </p> <p> Or does it? 
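</p> <p> The common reaction described above, retry a few times and then give up, might be sketched like this in Python. The <code>DatabaseUnavailable</code> exception and the <code>with_retries</code> helper are hypothetical names invented for illustration; a real database client raises its own exception types, and you'd probably want a back-off policy rather than a fixed delay. </p>

```python
import time


class DatabaseUnavailable(Exception):
    """Hypothetical stand-in for whatever a real database client raises."""


def with_retries(operation, attempts=3, delay_seconds=0.0):
    """Call operation, retrying a bounded number of times.

    When the budget is spent, the last exception propagates unchanged,
    so that a global handler can log it and inform the user.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except DatabaseUnavailable:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)
```

<p> Notice that the helper doesn't make the problem go away. After the last attempt the exception still bubbles up, which is exactly what it means to predict an error without handling it. 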
</p> <h3 id="b703b2c68f3b4656a1cf1f4042974ab7"> Trade-offs of error handling <a href="#b703b2c68f3b4656a1cf1f4042974ab7">#</a> </h3> <p> The example of an unreachable database is useful to explore in order to demonstrate that error handling isn't writ in stone, but rather an architectural design decision. Consider <a href="/2014/08/11/cqs-versus-server-generated-ids">a common API design</a> like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IRepository</span>&lt;T&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;Create(T&nbsp;item); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;other&nbsp;members</span> }</pre> </p> <p> What happens if client code calls <code>Create</code> but the database is unreachable? This is C# code, but the problem generalizes. With most implementations, the <code>Create</code> method will throw an exception. </p> <p> Can you handle that situation? You may retry a couple of times, but if you have a user waiting for a response, you can't retry for too long. Once time is up, you'll have to accept that the operation failed. In a language like C#, the most robust implementation is to <em>not</em> handle the specific exception, but instead let it bubble up to be handled by a global exception handler that usually can't do much else than showing the user a generic error message, and then log the exception. </p> <p> This isn't your only option, though. You may find yourself in a context where this kind of attitude towards errors is unacceptable. If you're working with <a href="https://twitter.com/ploeh/status/530320252790669313">BLOBAs</a> it's probably fine, but if you're working with medical life-support systems, or deep-space probes, or in other high-value contexts, the overall error-tolerance may be lower. Then what do you do? 
</p> <p> You may try to address the concern with IT operations: configuring failover systems for the database, installing two network cards in every machine, and so on. This may (also) be a way to address the problem, but isn't your only option. You may also consider changing the software architecture. </p> <p> One option may be to switch to an asynchronous message-based system where messages are transmitted via durable queues. Granted, durable queues may fail as well (everything may fail), but when done right, they tend to be more robust. Even a machine that has lost all network connectivity may queue messages on its local disk until the network returns. Yes, the disk may run full, etc. but it's <em>less</em> likely to happen than a network partition or an unreachable database. </p> <p> Notice that an unreachable database now goes into the category of errors that you've predicted, and that you can handle. On the other hand, failing to send an asynchronous message is now a new kind of error in your system: One that you can predict, but can't handle. </p> <p> Making this change, however, impacts your software architecture. You can no longer have an interface method like the above <code>Create</code> method, because you can't rely on it returning an <code>int</code> in reasonable time. During error scenarios, messages may sit in queues for hours, if not days, so you can't block on such code. 
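</p> <p> To illustrate the idea of queuing messages on local disk until the network returns, here's a deliberately naive Python sketch. The <code>Outbox</code> class, its one-JSON-document-per-line file format, and the <code>send</code> callback are all assumptions made for illustration. A production-grade durable queue also has to deal with concurrency, crash safety, and message ordering, all of which this sketch ignores. </p>

```python
import json
import os


class Outbox:
    """A file-backed message queue (hypothetical, for illustration only)."""

    def __init__(self, path):
        self.path = path

    def enqueue(self, message):
        # Append to local disk; the caller never waits on the network.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(message) + "\n")

    def flush(self, send):
        """Try to deliver all pending messages; return how many got through.

        Assumption: send raises ConnectionError when the receiver is
        unreachable. Messages that fail stay in the file for a later attempt.
        """
        if not os.path.exists(self.path):
            return 0
        with open(self.path, encoding="utf-8") as f:
            pending = [json.loads(line) for line in f if line.strip()]

        undelivered = []
        for message in pending:
            try:
                send(message)
            except ConnectionError:
                undelivered.append(message)

        # Rewrite the file so that it only holds what still needs sending.
        with open(self.path, "w", encoding="utf-8") as f:
            for message in undelivered:
                f.write(json.dumps(message) + "\n")
        return len(pending) - len(undelivered)
```

<p> Calling <code>enqueue</code> never touches the network, so it returns immediately even when the receiver is down. A background job can call <code>flush</code> periodically, and messages that fail to send simply stay in the file. 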
</p> <p> As <a href="/2014/08/11/cqs-versus-server-generated-ids">I've explained elsewhere</a> you can instead model a <code>Create</code> method like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IRepository</span>&lt;T&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">void</span>&nbsp;Create(<span style="color:#2b91af;">Guid</span>&nbsp;id,&nbsp;T&nbsp;item); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;other&nbsp;members</span> }</pre> </p> <p> Not only does this follow the <a href="https://en.wikipedia.org/wiki/Command%E2%80%93query_separation">Command Query Separation</a> principle, it also makes it easier for you to adopt an asynchronous message-based architecture. Done consistently, however, this requires that you approach application design in a way different from a design where you assume that the database is reachable. </p> <p> It may even impact a user interface, because it'd be a good idea to design user experience in such a way that it helps the user have a congruent mental model of how the system works. This may include making the concept of an <em>outbox</em> explicit in the user interface, as it may help users realize that writes happen asynchronously. Most users understand that email works that way, so it's not inconceivable that they may be able to adopt a similar mental model of other applications. </p> <p> The point is that this is an <em>option</em> that you may consider as an architect. Should you always design systems that way? I wouldn't. There's much extra complexity that you have to deal with in order to make asynchronous messaging work: UX, out-of-order messages, dead-letter queues, message versioning, etc. Getting to <a href="https://en.wikipedia.org/wiki/High_availability">five nines</a> is expensive, and often not warranted. 
</p> <p> The point is rather that what goes in the <em>predictable errors we can't handle</em> category isn't fixed, but context-dependent. Perhaps we should rather name the category <em>predictable errors we've decided not to handle</em>. </p> <h3 id="529bcc700301441aa3337bdb1911f74d"> Bugs <a href="#529bcc700301441aa3337bdb1911f74d">#</a> </h3> <p> How about the third category of errors, those we've failed to predict? We also call these <em>bugs</em> or <em>defects</em>. By definition, we only learn about them when they manifest. As soon as they become apparent, however, they fall into one of the other categories. If an error occurs once, it may occur again. It is now up to you to decide what to do about it. </p> <p> I usually consider <a href="/2023/01/23/agilean">errors as stop-the-line issues</a>, so I'd be inclined to immediately address them. On the other hand, if you don't do that, you've implicitly decided to put them in the category of predictable errors that you've decided not to handle. </p> <p> We don't intentionally write bugs, but there will always be some of those around. On the other hand, various practices help reduce them: Test-driven development, code reviews, property-based testing, but also up-front design. </p> <h3 id="4f0b84e839d54f16af37d0295140dc17"> Error-free code <a href="#4f0b84e839d54f16af37d0295140dc17">#</a> </h3> <p> Do consider explicitly how code may fail. </p> <p> Despite the title of this section, there's no such thing as error-free code. Still, you can explicitly think about edge cases. For example, how might the following function fail? 
</p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:#2b91af;">TimeSpan</span>&nbsp;<span style="color:#74531f;">Average</span>(<span style="color:blue;">this</span>&nbsp;<span style="color:#2b91af;">IEnumerable</span>&lt;<span style="color:#2b91af;">TimeSpan</span>&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">timeSpans</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sum</span>&nbsp;=&nbsp;<span style="color:#2b91af;">TimeSpan</span>.Zero; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">count</span>&nbsp;=&nbsp;0; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">ts</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">timeSpans</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">sum</span>&nbsp;<span style="font-weight:bold;color:#74531f;">+=</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">ts</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">count</span>++; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sum</span>&nbsp;<span style="font-weight:bold;color:#74531f;">/</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">count</span>; }</pre> </p> <p> In at least two ways: The input collection may be empty or infinite. I've <a href="/2020/02/03/non-exceptional-averages">already suggested a few ways to address those problems</a>. 
Some of them are similar to what Michael Feathers calls <a href="https://youtu.be/AnZ0uTOerUI?si=1gJXYFoVlNTSbjEt">unconditional code</a>, in that we may change the <a href="https://en.wikipedia.org/wiki/Domain_of_a_function">domain</a>. Another option, that I didn't cover in the linked article, is to expand the <a href="https://en.wikipedia.org/wiki/Codomain">codomain</a>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;TimeSpan?&nbsp;<span style="font-weight:bold;color:#74531f;">Average</span>(<span style="color:blue;">this</span>&nbsp;IReadOnlyCollection&lt;TimeSpan&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">timeSpans</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(!timeSpans.Any()) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sum</span>&nbsp;=&nbsp;TimeSpan.Zero; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(var&nbsp;<span style="font-weight:bold;color:#1f377f;">ts</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span>&nbsp;timeSpans) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sum&nbsp;+=&nbsp;ts; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;sum&nbsp;/&nbsp;timeSpans.Count; }</pre> </p> <p> Now, instead of diminishing the domain, we expand the codomain by allowing the return value to be null. (Interestingly, this is the inverse of <a href="/2021/12/06/the-liskov-substitution-principle-as-a-profunctor">my profunctor description of the Liskov Substitution Principle</a>. I don't yet know what to make of that. See: Just by writing things down, I learn something I hadn't realized before.) 
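</p> <p> As a hedged illustration of what that means for a caller, here is a minimal, self-contained sketch. The <code>TimeSpanExtensions</code> and <code>Report</code> class names and the <code>Describe</code> method are hypothetical, introduced only for this example, and the <code>Average</code> body follows the version above, except that it checks <code>Count</code> instead of calling <code>Any</code>. Because the return type is <code>TimeSpan?</code>, the result can't be assigned directly to a <code>TimeSpan</code> variable, so the caller has to decide what the empty case means: </p> <p> <pre>using System;
using System.Collections.Generic;

public static class TimeSpanExtensions
{
    // Same shape as the Average function above: null signals 'no average'.
    public static TimeSpan? Average(this IReadOnlyCollection&lt;TimeSpan&gt; timeSpans)
    {
        if (timeSpans.Count == 0)
            return null;

        var sum = TimeSpan.Zero;
        foreach (var ts in timeSpans)
            sum += ts;
        return sum / timeSpans.Count;
    }
}

public static class Report
{
    // Hypothetical caller. 'TimeSpan avg = durations.Average();' would not
    // compile, so the empty case has to be handled explicitly.
    public static string Describe(IReadOnlyCollection&lt;TimeSpan&gt; durations)
    {
        TimeSpan? average = durations.Average();
        return average is TimeSpan avg
            ? $"Average duration: {avg}"
            : "No durations recorded.";
    }
}</pre> </p> <p> With this sketch, <code>Report.Describe(Array.Empty&lt;TimeSpan&gt;())</code> takes the second branch and returns <code>"No durations recorded."</code> instead of failing at run time.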
</p> <p> This is beneficial in a statically typed language, because such a change makes hidden knowledge explicit. It makes it so explicit that a type checker can point out when we make mistakes. <a href="https://blog.janestreet.com/effective-ml-video/">Make illegal states unrepresentable</a>. <a href="https://en.wikipedia.org/wiki/Poka-yoke">Poka-yoke</a>. A potential run-time exception is now a compile-time error, and it's firmly in the category of errors that we've predicted and decided to handle. </p> <p> In the above example, we could use the built-in .NET <a href="https://learn.microsoft.com/dotnet/api/system.nullable-1">Nullable&lt;T&gt;</a> (with the <code>?</code> syntactic-sugar alias). In other cases, you may resort to returning a <a href="/2018/03/26/the-maybe-functor">Maybe</a> (AKA <em>option</em>). </p> <h3 id="0855ccbcd46f4c29980a219993293279"> Modelling errors <a href="#0855ccbcd46f4c29980a219993293279">#</a> </h3> <p> Explicitly expanding the codomain of functions to signal potential errors is beneficial if you expect the caller to be able to handle the problem. If callers can't handle an error, forcing them to deal with it is just going to make things more difficult. I've never done any professional <a href="https://www.java.com/">Java</a> programming, but I've heard plenty of Java developers complain about checked exceptions. As far as I can tell, the problem in Java isn't so much with the language feature per se, but rather with the exception types that APIs force you to handle. </p> <p> As an example, imagine that every time you call a database API, the compiler forces you to handle an <a href="https://learn.microsoft.com/dotnet/api/system.io.ioexception">IOException</a>. Unless you explicitly architect around it (as outlined above), this is likely to be one of the errors you can predict, but decide not to handle. But if the compiler forces you to handle it, then what do you do? 
You probably find some workaround that involves re-throwing the exception, or, as I understand that some Java developers do, declare that their own APIs may throw <em>any</em> exception, and by that means just pass the buck. Not helpful. </p> <p> As far as I can tell, (checked) exceptions are equivalent to the <a href="/2018/06/11/church-encoded-either">Either</a> container, also known as <em>Result</em>. We may imagine that instead of throwing exceptions, a function may return an Either value: <em>Right</em> for a right result (explicit mnemonic, there!), and left for an error. </p> <p> It might be tempting to model all error-producing operations as Either-returning, but <a href="https://eiriktsarpalis.wordpress.com/2017/02/19/youre-better-off-using-exceptions/">you're often better off using exceptions</a>. Throw exceptions in those situations that you expect most clients can't recover from. Return <em>left</em> (or <em>error</em>) cases in those situations that you expect that a typical client would want to handle. </p> <p> Again, it's context-specific, so if you're developing a reusable library, there's a balance to strike in API design (or overloads to supply). </p> <h3 id="4ecf10fdb9a842e88aeecf1024cd466a"> Most errors are just branches <a href="#4ecf10fdb9a842e88aeecf1024cd466a">#</a> </h3> <p> In many languages, errors are somehow special. Most modern languages include a facility to model errors as exceptions, and special syntax to throw or catch them. (The odd man out may be C, with its reliance on error codes as return values, but that is incredibly awkward for other reasons. You may also reasonably argue that C is hardly a modern language.) </p> <p> Even Haskell has exceptions, even though it also has deep language support for <code>Maybe</code> and <code>Either</code>. Fortunately, Haskell APIs <em>tend</em> to only throw exceptions in those cases where average clients are unlikely to handle them: Timeouts, I/O failures, and so on.
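</p> <p> To make the Either idea concrete in C#, here is a minimal sketch under stated assumptions. The <code>Either</code> class below is a hypothetical, stripped-down type written for this example (it is not part of the .NET base class library), and <code>PortParser.ParsePort</code> is an invented function. It treats malformed input as an expected condition that a typical caller would want to handle, so it returns a <em>left</em> case rather than throwing: </p> <p> <pre>using System;

// Hypothetical minimal Either type for this sketch; not a .NET library type.
public sealed class Either&lt;L, R&gt;
{
    private readonly bool isRight;
    private readonly L left;
    private readonly R right;

    private Either(bool isRight, L left, R right)
    {
        this.isRight = isRight;
        this.left = left;
        this.right = right;
    }

    public static Either&lt;L, R&gt; Left(L value) =&gt;
        new Either&lt;L, R&gt;(false, value, default!);

    public static Either&lt;L, R&gt; Right(R value) =&gt;
        new Either&lt;L, R&gt;(true, default!, value);

    // The caller must supply a function for each case.
    public T Match&lt;T&gt;(Func&lt;L, T&gt; onLeft, Func&lt;R, T&gt; onRight) =&gt;
        isRight ? onRight(right) : onLeft(left);
}

public static class PortParser
{
    // Invented example: bad input is an ordinary branch, not an exception.
    public static Either&lt;string, int&gt; ParsePort(string candidate)
    {
        if (!int.TryParse(candidate, out var port))
            return Either&lt;string, int&gt;.Left($"'{candidate}' is not a number.");
        if (port &lt; 1 || 65535 &lt; port)
            return Either&lt;string, int&gt;.Left($"{port} is outside the valid port range.");
        return Either&lt;string, int&gt;.Right(port);
    }
}</pre> </p> <p> A caller then handles both outcomes with ordinary code. For example, <code>PortParser.ParsePort("8080").Match(onLeft: msg =&gt; "Error: " + msg, onRight: port =&gt; "Listening on " + port)</code> evaluates to <code>"Listening on 8080"</code>, while non-numeric input follows the <code>onLeft</code> branch.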
</p> <p> It's unfortunate that languages treat errors as something exceptional, because this nudges us to make a proper category error: That errors are somehow special, and that we can't use normal coding constructs or API design practices to model them. </p> <p> But you can. That's what <a href="https://youtu.be/AnZ0uTOerUI?si=1gJXYFoVlNTSbjEt">Michael Feathers' presentation is about</a>, and that's what you can do by <a href="https://blog.janestreet.com/effective-ml-video/">making illegal states unrepresentable</a>, or by returning Maybe or Either values. </p> <p> Most errors are just branches in your code; where it diverges from the happy path in order to do something else. </p> <h3 id="a0e442d5e5c842108555236745f23155"> Conclusion <a href="#a0e442d5e5c842108555236745f23155">#</a> </h3> <p> This article presents a framework for thinking about software errors. There are those you can predict may happen, and you choose to handle; those you predict may happen, but you choose to ignore; and those that you have not yet predicted: bugs. </p> <p> A little up-front thinking will often help you predict some errors, but I'm not advocating that you foresee all errors. Some errors are programmer errors, and we make those errors because we're human, exactly because we're failing to predict the behaviour of a particular state of the code. Once you discover a bug, however, you have a choice: Do you address it or ignore it? </p> <p> There are error conditions that you may deliberately choose to ignore. This doesn't necessarily make you an irresponsible programmer, but may rather be the result of a deliberate feasibility study. For example, every network operation may fail. How important is it that your application can keep running without the network? Is it worthwhile to make the code so robust that it can handle that situation? Or can you rather live with a few hours of downtime per quarter? 
If the latter, it may be best to let a human deal with network partitions when they occur. </p> <p> The three error categories I suggest here are context-dependent. You decide which problems to deal with, and which ones to ignore, but apart from that, error-handling doesn't have to be difficult. </p> </div> A Range kata implementation in C# https://blog.ploeh.dk/2024/01/22/a-range-kata-implementation-in-c 2024-01-22T07:05:00+00:00 Mark Seemann <div id="post"> <p> <em>A port of the corresponding F# code.</em> </p> <p> This article is an instalment in <a href="/2024/01/01/variations-of-the-range-kata">a short series of articles on the Range kata</a>. In the <a href="/2024/01/15/a-range-kata-implementation-in-f">previous article</a> I made a pass at <a href="https://codingdojo.org/kata/Range/">the kata</a> in <a href="https://fsharp.org/">F#</a>, using property-based testing with <a href="https://hedgehog.qa/">Hedgehog</a> to generate test data. </p> <p> In the conclusion I mused about the properties I was able to come up with. Is it possible to describe open, closed, and mixed ranges in a way that's less coupled to the implementation? To be honest, I still don't have an answer to that question. Instead, in this article, I describe a straight port of the F# code to C#. There's value in that, too, for people who wonder <a href="/2015/04/15/c-will-eventually-get-all-f-features-right">how to reap the benefits of F# in C#</a>. </p> <p> The code is <a href="https://github.com/ploeh/RangeCSharp">available on GitHub</a>. </p> <h3 id="2b05848a3b494ec99cc0e50da22bdd15"> First property <a href="#2b05848a3b494ec99cc0e50da22bdd15">#</a> </h3> <p> Both F# and C# are .NET languages. They run in the same substrate, and are interoperable. While Hedgehog is written in F#, it's possible to consume F# libraries from C#, and vice versa.
I've done this multiple times with <a href="https://fscheck.github.io/FsCheck/">FsCheck</a>, but I admit to never having tried it with Hedgehog. </p> <p> If you want to try property-based testing in C#, a third alternative is available: <a href="https://github.com/AnthonyLloyd/CsCheck">CsCheck</a>. It's written in C# and is more <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a> in that context. While I sometimes <a href="/2021/02/15/when-properties-are-easier-than-examples">still use FsCheck from C#</a>, I often choose CsCheck for didactic reasons. </p> <p> The first property I wrote was a direct port of the idea of the first property I wrote in F#: </p> <p> <pre>[Fact] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">ClosedRangeContainsList</span>() { &nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">from</span>&nbsp;xs&nbsp;<span style="color:blue;">in</span>&nbsp;Gen.Short.Enumerable.Nonempty &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;min&nbsp;=&nbsp;xs.Min() &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;max&nbsp;=&nbsp;xs.Max() &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">select</span>&nbsp;(xs,&nbsp;min,&nbsp;max)) &nbsp;&nbsp;&nbsp;&nbsp;.Sample(<span style="font-weight:bold;color:#1f377f;">t</span>&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;Range&lt;<span style="color:blue;">short</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;ClosedEndpoint&lt;<span style="color:blue;">short</span>&gt;(t.min), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">new</span>&nbsp;ClosedEndpoint&lt;<span style="color:blue;">short</span>&gt;(t.max)); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;sut.Contains(t.xs); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Assert.True(actual,&nbsp;<span style="color:#a31515;">$&quot;</span><span style="color:#a31515;">Expected&nbsp;</span>{t.xs}<span style="color:#a31515;">&nbsp;to&nbsp;be&nbsp;contained&nbsp;in&nbsp;</span>{sut}<span style="color:#a31515;">.</span><span style="color:#a31515;">&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;}); }</pre> </p> <p> This test (or property, if you will) uses a technique that I often use with property-based testing. I'm still searching for a catchy name for this, but here we may call it something like <em>reverse test-case assembly</em>. My <em>goal</em> is to test a predicate, and this particular property should verify that for a given <a href="https://en.wikipedia.org/wiki/Equivalence_class">Equivalence Class</a>, the predicate is always true. </p> <p> While we may think of an Equivalence Class as a set from which we pick test cases, I don't actually have a full enumeration of such a set. I can't have that, since that set is infinitely big. Instead of randomly picking values from a set that I can't fully populate, I instead carefully pick test case values in such a way that they would all belong to the same <a href="https://en.wikipedia.org/wiki/Partition_of_a_set">set partition</a> (Equivalence Class). </p> <p> The <a href="/2022/06/13/some-thoughts-on-naming-tests">test name suggests the test case</a>: I'd like to verify that given I have a closed range, when I ask it whether a list <em>within</em> that range is contained, then the answer is <em>true</em>. How do I pick such a test case? </p> <p> I do it in reverse. You can say that the sampling is the dual of the test. 
I start with a list (<code>xs</code>) and only then do I create a range that contains it. Since the first test case is for a closed range, the <code>min</code> and <code>max</code> values are sufficient to define such a range. </p> <p> How do I pass that property? </p> <p> Degenerately, as is often the case with TDD beginnings: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="font-weight:bold;color:#74531f;">Contains</span>(IEnumerable&lt;T&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">candidates</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">true</span>; }</pre> </p> <p> Even though the <code>ClosedRangeContainsList</code> property effectively executes a hundred test cases, the <a href="/2019/10/07/devils-advocate">Devil's Advocate</a> can easily ignore that and instead return hard-coded <code>true</code>. </p> <h3 id="61c1050202934baa99baeaa4dfed40e1"> Endpoint sum type <a href="#61c1050202934baa99baeaa4dfed40e1">#</a> </h3> <p> I'm not going to bore you with the remaining properties. The repository is available on GitHub if you're interested in those details. </p> <p> If you've programmed in F# for some time, you typically miss <a href="https://en.wikipedia.org/wiki/Algebraic_data_type">algebraic data types</a> when forced to return to C#. A language like C# does have <a href="https://en.wikipedia.org/wiki/Product_type">product types</a>, but lacks native <a href="https://en.wikipedia.org/wiki/Tagged_union">sum types</a>. Even so, not all is lost. I've previously demonstrated that <a href="/2018/06/25/visitor-as-a-sum-type">you can employ the Visitor pattern to encode a sum type</a>. Another option is to use <a href="/2018/05/22/church-encoding">Church encoding</a>, which I've decided to do here.
</p> <p> When choosing between Church encoding and the <a href="https://en.wikipedia.org/wiki/Visitor_pattern">Visitor</a> pattern, Visitor is more object-oriented (after all, it's an original <a href="/ref/dp">GoF</a> design pattern), but Church encoding has fewer moving parts. Since I was just doing an exercise, I went for the simpler implementation. </p> <p> An <code>Endpoint</code> object should allow one of two cases: <code>Open</code> or <code>Closed</code>. To avoid <a href="https://wiki.c2.com/?PrimitiveObsession">primitive obsession</a> I gave the class a <code>private</code> constructor: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">Endpoint</span>&lt;<span style="color:#2b91af;">T</span>&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;T&nbsp;value; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;isClosed; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:#2b91af;">Endpoint</span>(T&nbsp;<span style="font-weight:bold;color:#1f377f;">value</span>,&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">isClosed</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.value&nbsp;=&nbsp;value; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.isClosed&nbsp;=&nbsp;isClosed; &nbsp;&nbsp;&nbsp;&nbsp;}</pre> </p> <p> Since the constructor is <code>private</code> you need another way to create <code>Endpoint</code> objects. 
Two factory methods provide that affordance: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;Endpoint&lt;T&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">Closed</span>&lt;<span style="color:#2b91af;">T</span>&gt;(T&nbsp;<span style="font-weight:bold;color:#1f377f;">value</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Endpoint&lt;T&gt;.Closed(value); } <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;Endpoint&lt;T&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">Open</span>&lt;<span style="color:#2b91af;">T</span>&gt;(T&nbsp;<span style="font-weight:bold;color:#1f377f;">value</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Endpoint&lt;T&gt;.Open(value); }</pre> </p> <p> The heart of the Church encoding is the <code>Match</code> method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;TResult&nbsp;<span style="font-weight:bold;color:#74531f;">Match</span>&lt;<span style="color:#2b91af;">TResult</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;Func&lt;T,&nbsp;TResult&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">whenClosed</span>, &nbsp;&nbsp;&nbsp;&nbsp;Func&lt;T,&nbsp;TResult&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">whenOpen</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(isClosed) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;whenClosed(value); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">else</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;whenOpen(value); }</pre> </p> <p> Such an API is an example of <a href="https://en.wikipedia.org/wiki/Poka-yoke">poka-yoke</a> because it obliges you to deal with both cases. 
The compiler will keep you honest: <em>Did you remember to deal with both the open and the closed case?</em> When calling the <code>Match</code> method, you must supply both arguments, or your code doesn't compile. <a href="https://blog.janestreet.com/effective-ml-video/">Make illegal states unrepresentable</a>. </p> <h3 id="484d1f121cc44050b76f23d92bca429c"> Containment <a href="#484d1f121cc44050b76f23d92bca429c">#</a> </h3> <p> With the <code>Endpoint</code> class in place, you can implement a <code>Range</code> class. </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">Range</span>&lt;<span style="color:#2b91af;">T</span>&gt;&nbsp;<span style="color:blue;">where</span>&nbsp;T&nbsp;:&nbsp;IComparable&lt;T&gt;</pre> </p> <p> It made sense to me to constrain the <code>T</code> type argument to <code>IComparable&lt;T&gt;</code>, although it's possible that I could have deferred that constraint to the actual <code>Contains</code> method, like I did with <a href="/2024/01/08/a-range-kata-implementation-in-haskell">my Haskell implementation</a>. 
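</p> <p> For comparison, here is a rough sketch of what deferring the constraint could look like. In C#, a method inside a generic class can't add a constraint to the class's own type parameter, so one option is an extension method that carries the constraint. The <code>ClosedRange</code> types below are hypothetical and deliberately simplified (closed endpoints only, no <code>Endpoint</code> class); they are not the code from the repository: </p> <p> <pre>using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified range type: no constraint on T at the class level.
public sealed class ClosedRange&lt;T&gt;
{
    public ClosedRange(T min, T max)
    {
        Min = min;
        Max = max;
    }

    public T Min { get; }
    public T Max { get; }
}

public static class ClosedRange
{
    // The IComparable&lt;T&gt; constraint lives on the method that needs it.
    public static bool Contains&lt;T&gt;(
        this ClosedRange&lt;T&gt; range,
        IEnumerable&lt;T&gt; candidates)
        where T : IComparable&lt;T&gt;
    {
        return candidates.All(c =&gt;
            range.Min.CompareTo(c) &lt;= 0 &amp;&amp; c.CompareTo(range.Max) &lt;= 0);
    }
}</pre> </p> <p> With this sketch, <code>new ClosedRange&lt;int&gt;(1, 5).Contains(new[] { 2, 3 })</code> returns <code>true</code>. The trade-off is that a range of non-comparable values can be constructed, but it can't be queried with <code>Contains</code>.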
</p> <p> A <code>Range</code> holds two <code>Endpoint</code> values: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">Range</span>(Endpoint&lt;T&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">min</span>,&nbsp;Endpoint&lt;T&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">max</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.min&nbsp;=&nbsp;min; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.max&nbsp;=&nbsp;max; }</pre> </p> <p> The <code>Contains</code> method makes use of the built-in <a href="https://learn.microsoft.com/dotnet/api/system.linq.enumerable.all">All</a> method, using a <code>private</code> helper function as the predicate: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="font-weight:bold;color:#74531f;">IsInRange</span>(T&nbsp;<span style="font-weight:bold;color:#1f377f;">candidate</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;min.Match( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;whenClosed:&nbsp;<span style="font-weight:bold;color:#1f377f;">l</span>&nbsp;=&gt;&nbsp;max.Match( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;whenClosed:&nbsp;<span style="font-weight:bold;color:#1f377f;">h</span>&nbsp;=&gt;&nbsp;l.CompareTo(candidate)&nbsp;&lt;=&nbsp;0&nbsp;&amp;&amp;&nbsp;candidate.CompareTo(h)&nbsp;&lt;=&nbsp;0, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;whenOpen:&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">h</span>&nbsp;=&gt;&nbsp;l.CompareTo(candidate)&nbsp;&lt;=&nbsp;0&nbsp;&amp;&amp;&nbsp;candidate.CompareTo(h)&nbsp;&lt;&nbsp;&nbsp;0), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;whenOpen:&nbsp;<span style="font-weight:bold;color:#1f377f;">l</span>&nbsp;=&gt;&nbsp;max.Match( 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;whenClosed:&nbsp;<span style="font-weight:bold;color:#1f377f;">h</span>&nbsp;=&gt;&nbsp;l.CompareTo(candidate)&nbsp;&lt;&nbsp;&nbsp;0&nbsp;&amp;&amp;&nbsp;candidate.CompareTo(h)&nbsp;&lt;=&nbsp;0, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;whenOpen:&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">h</span>&nbsp;=&gt;&nbsp;l.CompareTo(candidate)&nbsp;&lt;&nbsp;&nbsp;0&nbsp;&amp;&amp;&nbsp;candidate.CompareTo(h)&nbsp;&lt;&nbsp;&nbsp;0)); }</pre> </p> <p> This implementation performs a nested <code>Match</code> to arrive at the appropriate answer. The code isn't as elegant or readable as its F# counterpart, but it comes with comparable compile-time safety. You can't forget a combination, because if you do, your code isn't going to compile. </p> <p> Still, you can't deny that C# involves more <a href="/2019/12/16/zone-of-ceremony">ceremony</a>. </p> <h3 id="54fd635d237d4f57891bda8ef5d13623"> Conclusion <a href="#54fd635d237d4f57891bda8ef5d13623">#</a> </h3> <p> Once you know how, it's not that difficult to port a functional design from F# or <a href="https://www.haskell.org/">Haskell</a> to a language like C#. The resulting code tends to be more complicated, but to a large degree, it's possible to retain the type safety. </p> <p> In this article you saw a sketch of how to make that transition, using the Range kata as an example. The resulting C# API is perfectly serviceable, as the test code demonstrates. </p> <p> Now that we have covered the fundamentals of the Range kata we have learned enough about it to go beyond the exercise and examine some more abstract properties. </p> <p> <strong>Next:</strong> <a href="/2024/02/12/range-as-a-functor">Range as a functor</a>. </p> </div>
A Range kata implementation in F# https://blog.ploeh.dk/2024/01/15/a-range-kata-implementation-in-f 2024-01-15T07:20:00+00:00 Mark Seemann <div id="post"> <p> <em>This time with some property-based testing.</em> </p> <p> This article is an instalment in <a href="/2024/01/01/variations-of-the-range-kata">a short series of articles on the Range kata</a>. In the <a href="/2024/01/08/a-range-kata-implementation-in-haskell">previous article</a> I described my first attempt at the kata, and also complained that I had to think of test cases myself. When I find it tedious coming up with new test cases, I usually start to wonder if it'd be easier to use property-based testing. </p> <p> Thus, when I decided to revisit <a href="https://codingdojo.org/kata/Range/">the kata</a>, the variation that I was most interested in pursuing was to explore whether it would make sense to use property-based testing instead of a set of existing examples. </p> <p> Since I also wanted to do the second attempt in <a href="https://fsharp.org/">F#</a>, I had a choice between <a href="https://fscheck.github.io/FsCheck/">FsCheck</a> and <a href="https://hedgehog.qa/">Hedgehog</a>. Each has its strengths and weaknesses, but since I already know FsCheck so well, I decided to go with Hedgehog. </p> <p> I also soon discovered that I had no interest in developing the full suite of capabilities implied by the kata. Instead, I decided to focus on just the data structure itself, as well as the <code>contains</code> function. As in the previous article, this function can also be used to cover the kata's <em>ContainsRange</em> feature. </p> <h3 id="81447639b1a54c8c9437b81f3856018f"> Getting started <a href="#81447639b1a54c8c9437b81f3856018f">#</a> </h3> <p> There's no rule that you can't combine property-based testing with test-driven development (TDD). On the contrary, that's how I often do it.
In this exercise, I first wrote this test: </p> <p> <pre>[&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``Closed&nbsp;range&nbsp;contains&nbsp;list``&nbsp;()&nbsp;=&nbsp;Property.check&nbsp;&lt;|&nbsp;property&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;xs&nbsp;=&nbsp;Gen.int16&nbsp;(Range.linearBounded&nbsp;())&nbsp;|&gt;&nbsp;Gen.list&nbsp;(Range.linear&nbsp;1&nbsp;99) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;min&nbsp;=&nbsp;List.min&nbsp;xs &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;max&nbsp;=&nbsp;List.max&nbsp;xs &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;(Closed&nbsp;min,&nbsp;Closed&nbsp;max)&nbsp;|&gt;&nbsp;Range.contains&nbsp;xs &nbsp;&nbsp;&nbsp;&nbsp;Assert.True&nbsp;(actual,&nbsp;sprintf&nbsp;<span style="color:#a31515;">&quot;Range&nbsp;[%i,&nbsp;%i]&nbsp;expected&nbsp;to&nbsp;contain&nbsp;list.&quot;</span>&nbsp;min&nbsp;max)&nbsp;}</pre> </p> <p> We have to be careful when reading and understanding this code: There are two <code>Range</code> modules in action here! </p> <p> Hedgehog comes with a <code>Range</code> module that you must use to define how it samples values from <a href="https://en.wikipedia.org/wiki/Domain_of_a_function">domains</a>. Examples of that here are <code>Range.linearBounded</code> and <code>Range.linear</code>. </p> <p> On the other hand, I've defined <em>my</em> <code>contains</code> function in a <code>Range</code> module, too. As long as there's no ambiguity, the F# compiler doesn't have a problem with that. Since there's no <code>contains</code> function in the Hedgehog <code>Range</code> module, the F# compiler isn't confused. </p> <p> We humans, on the other hand, might be confused, and had this been a code base that I had to maintain for years, I might seriously consider whether I should rename my own <code>Range</code> module to something else, like <code>Interval</code>, perhaps. 
</p> <p> In any case, the first test (or property, if you will) uses a technique that I often use with property-based testing. I'm still searching for a catchy name for this, but here we may call it something like <em>reverse test-case assembly</em>. My <em>goal</em> is to test a predicate, and this particular property should verify that for a given <a href="https://en.wikipedia.org/wiki/Equivalence_class">Equivalence Class</a>, the predicate is always true. </p> <p> While we may think of an Equivalence Class as a set from which we pick test cases, I don't actually have a full enumeration of such a set. I can't have that, since that set is infinitely big. Instead of randomly picking values from a set that I can't fully populate, I instead carefully pick test case values in such a way that they would all belong to the same <a href="https://en.wikipedia.org/wiki/Partition_of_a_set">set partition</a> (Equivalence Class). </p> <p> The <a href="/2022/06/13/some-thoughts-on-naming-tests">test name suggests the test case</a>: I'd like to verify that given I have a closed range, when I ask it whether a list <em>within</em> that range is contained, then the answer is <em>true</em>. How do I pick such a test case? </p> <p> I do it in reverse. You can say that the sampling is the dual of the test. I start with a list (<code>xs</code>) and only then do I create a range that contains it. Since the first test case is for a closed range, the <code>min</code> and <code>max</code> values are sufficient to define such a range. </p> <p> How do I pass that property? 
</p> <p> Degenerately, as is often the case with TDD beginnings: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Range&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;contains&nbsp;_&nbsp;_&nbsp;=&nbsp;<span style="color:blue;">true</span></pre> </p> <p> Even though the <code>Closed range contains list</code> property effectively executes a hundred test cases, the <a href="/2019/10/07/devils-advocate">Devil's Advocate</a> can easily ignore that and instead return hard-coded <code>true</code>. </p> <p> More properties are required to flesh out the behaviour of the function. </p> <h3 id="50564e684f074ae1956b106333320c9b"> Open range <a href="#50564e684f074ae1956b106333320c9b">#</a> </h3> <p> While I do keep the <a href="https://blog.cleancoder.com/uncle-bob/2013/05/27/TheTransformationPriorityPremise.html">transformation priority premise</a> in mind when picking the next test (or, here, <em>property</em>), I'm rarely particularly analytic about it. Since the first property tests that a closed range barely contains a list of values from its minimum to its maximum, it seemed like a promising next step to consider the case where the range consisted of open endpoints. 
That was the second test I wrote, then: </p> <p> <pre>[&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``Open&nbsp;range&nbsp;doesn&#39;t&nbsp;contain&nbsp;endpoints``&nbsp;()&nbsp;=&nbsp;Property.check&nbsp;&lt;|&nbsp;property&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;min&nbsp;=&nbsp;Gen.int32&nbsp;(Range.linearBounded&nbsp;()) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;max&nbsp;=&nbsp;Gen.int32&nbsp;(Range.linearBounded&nbsp;()) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;(Open&nbsp;min,&nbsp;Open&nbsp;max)&nbsp;|&gt;&nbsp;Range.contains&nbsp;[min;&nbsp;max] &nbsp;&nbsp;&nbsp;&nbsp;Assert.False&nbsp;(actual,&nbsp;sprintf&nbsp;<span style="color:#a31515;">&quot;Range&nbsp;(%i,&nbsp;%i)&nbsp;expected&nbsp;not&nbsp;to&nbsp;contain&nbsp;list.&quot;</span>&nbsp;min&nbsp;max)&nbsp;}</pre> </p> <p> This property simply states that if you query the <code>contains</code> predicate about a list that only contains the endpoints of an open range, then the answer is <code>false</code> because the endpoints are <code>Open</code>. 
</p> <p> One implementation that passes both tests is this one: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Range&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;contains&nbsp;_&nbsp;endpoints&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;endpoints&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Open&nbsp;_,&nbsp;Open&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">false</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">true</span></pre> </p> <p> This implementation is obviously still incorrect, but we have reason to believe that we're moving closer to something that will eventually work. </p> <h3 id="f01d9a05b57d470fac8f48bcfc85df4d"> Tick-tock <a href="#f01d9a05b57d470fac8f48bcfc85df4d">#</a> </h3> <p> In the spirit of the transformation priority premise, I've often found that when test-driving a predicate, I seem to fall into a tick-tock pattern where I alternate between tests for a <code>true</code> return value, followed by a test for a <code>false</code> return value, or the other way around. This was also the case here. 
The previous test was for a <code>false</code> value, so the third test requires <code>true</code> to be returned: </p> <p> <pre>[&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``Open&nbsp;range&nbsp;contains&nbsp;list``&nbsp;()&nbsp;=&nbsp;Property.check&nbsp;&lt;|&nbsp;property&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;xs&nbsp;=&nbsp;Gen.int64&nbsp;(Range.linearBounded&nbsp;())&nbsp;|&gt;&nbsp;Gen.list&nbsp;(Range.linear&nbsp;1&nbsp;99) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;min&nbsp;=&nbsp;List.min&nbsp;xs&nbsp;-&nbsp;1L &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;max&nbsp;=&nbsp;List.max&nbsp;xs&nbsp;+&nbsp;1L &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;(Open&nbsp;min,&nbsp;Open&nbsp;max)&nbsp;|&gt;&nbsp;Range.contains&nbsp;xs &nbsp;&nbsp;&nbsp;&nbsp;Assert.True&nbsp;(actual,&nbsp;sprintf&nbsp;<span style="color:#a31515;">&quot;Range&nbsp;(%i,&nbsp;%i)&nbsp;expected&nbsp;to&nbsp;contain&nbsp;list.&quot;</span>&nbsp;min&nbsp;max)&nbsp;}</pre> </p> <p> This then led to this implementation of the <code>contains</code> function: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Range&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;contains&nbsp;ys&nbsp;endpoints&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;endpoints&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Open&nbsp;x,&nbsp;Open&nbsp;z&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ys&nbsp;|&gt;&nbsp;List.forall&nbsp;(<span style="color:blue;">fun</span>&nbsp;y&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;x&nbsp;&lt;&nbsp;y&nbsp;&amp;&amp;&nbsp;y&nbsp;&lt;&nbsp;z) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;_&nbsp;<span 
style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">true</span></pre> </p> <p> Following up on the above <code>true</code>-demanding test, I added one that tested a <code>false</code> scenario: </p> <p> <pre>[&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``Open-closed&nbsp;range&nbsp;doesn&#39;t&nbsp;contain&nbsp;endpoints``&nbsp;()&nbsp;=&nbsp;Property.check&nbsp;&lt;|&nbsp;property&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;min&nbsp;=&nbsp;Gen.int16&nbsp;(Range.linearBounded&nbsp;()) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;max&nbsp;=&nbsp;Gen.int16&nbsp;(Range.linearBounded&nbsp;()) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;(Open&nbsp;min,&nbsp;Closed&nbsp;max)&nbsp;|&gt;&nbsp;Range.contains&nbsp;[min;&nbsp;max] &nbsp;&nbsp;&nbsp;&nbsp;Assert.False&nbsp;(actual,&nbsp;sprintf&nbsp;<span style="color:#a31515;">&quot;Range&nbsp;(%i,&nbsp;%i]&nbsp;expected&nbsp;not&nbsp;to&nbsp;contain&nbsp;list.&quot;</span>&nbsp;min&nbsp;max)&nbsp;}</pre> </p> <p> This again led to this implementation: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Range&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;contains&nbsp;ys&nbsp;endpoints&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;endpoints&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Open&nbsp;x,&nbsp;Open&nbsp;z&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ys&nbsp;|&gt;&nbsp;List.forall&nbsp;(<span style="color:blue;">fun</span>&nbsp;y&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;x&nbsp;&lt;&nbsp;y&nbsp;&amp;&amp;&nbsp;y&nbsp;&lt;&nbsp;z) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Open&nbsp;x,&nbsp;Closed&nbsp;z&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span 
style="color:blue;">false</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">true</span></pre> </p> <p> I had to add four more tests before I felt confident that I had the right implementation. I'm not going to show them all here, but you can look at the <a href="https://github.com/ploeh/RangeFSharp">repository on GitHub</a> if you're interested in the interim steps. </p> <h3 id="0cd66550fe5a43b0a7e8a6d3a2b0ea32"> Types and functionality <a href="#0cd66550fe5a43b0a7e8a6d3a2b0ea32">#</a> </h3> <p> So far I had treated a range as a pair (two-tuple), just as I had done with the code in <a href="/2024/01/08/a-range-kata-implementation-in-haskell">my first attempt</a>. I did, however, have a few other things planned for this code base, so I introduced a set of explicit types: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;Endpoint&lt;&#39;a&gt;&nbsp;=&nbsp;Open&nbsp;<span style="color:blue;">of</span>&nbsp;&#39;a&nbsp;|&nbsp;Closed&nbsp;<span style="color:blue;">of</span>&nbsp;&#39;a <span style="color:blue;">type</span>&nbsp;Range&lt;&#39;a&gt;&nbsp;=&nbsp;{&nbsp;LowerBound&nbsp;:&nbsp;Endpoint&lt;&#39;a&gt;;&nbsp;UpperBound&nbsp;:&nbsp;Endpoint&lt;&#39;a&gt;&nbsp;}</pre> </p> <p> The <code>Range</code> record type is isomorphic to a pair of <code>Endpoint</code> values, so it's not strictly required, but does make things <a href="https://peps.python.org/pep-0020/">more explicit</a>. 
</p> <p> To support the new type, I added an <code>ofEndpoints</code> function, and finalized the implementation of <code>contains</code>: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Range&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;ofEndpoints&nbsp;(lowerBound,&nbsp;upperBound)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;LowerBound&nbsp;=&nbsp;lowerBound;&nbsp;UpperBound&nbsp;=&nbsp;upperBound&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;contains&nbsp;ys&nbsp;r&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;r.LowerBound,&nbsp;r.UpperBound&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;Open&nbsp;x,&nbsp;&nbsp;&nbsp;Open&nbsp;z&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;ys&nbsp;|&gt;&nbsp;List.forall&nbsp;(<span style="color:blue;">fun</span>&nbsp;y&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;x&nbsp;&nbsp;&lt;&nbsp;y&nbsp;&amp;&amp;&nbsp;y&nbsp;&nbsp;&lt;&nbsp;z) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;Open&nbsp;x,&nbsp;Closed&nbsp;z&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;ys&nbsp;|&gt;&nbsp;List.forall&nbsp;(<span style="color:blue;">fun</span>&nbsp;y&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;x&nbsp;&nbsp;&lt;&nbsp;y&nbsp;&amp;&amp;&nbsp;y&nbsp;&lt;=&nbsp;z) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Closed&nbsp;x,&nbsp;&nbsp;&nbsp;Open&nbsp;z&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;ys&nbsp;|&gt;&nbsp;List.forall&nbsp;(<span style="color:blue;">fun</span>&nbsp;y&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;x&nbsp;&lt;=&nbsp;y&nbsp;&amp;&amp;&nbsp;y&nbsp;&nbsp;&lt;&nbsp;z) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Closed&nbsp;x,&nbsp;Closed&nbsp;z&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;ys&nbsp;|&gt;&nbsp;List.forall&nbsp;(<span 
style="color:blue;">fun</span>&nbsp;y&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;x&nbsp;&lt;=&nbsp;y&nbsp;&amp;&amp;&nbsp;y&nbsp;&lt;=&nbsp;z)</pre> </p> <p> As is so often the case in F#, pattern matching makes such functions a pleasure to implement. </p> <h3 id="c00252811495433987c37f7bcfc751a5"> Conclusion <a href="#c00252811495433987c37f7bcfc751a5">#</a> </h3> <p> I was curious whether using property-based testing would make the development process of the Range kata simpler. While each property was simple, I still had to write eight of them before I felt I'd fully described the problem. This doesn't seem like much of an improvement over the example-driven approach I took the first time around. It seems to be a comparable amount of code, and on one hand a property is more abstract than an example, but on the other hand usually also covers more ground. I feel more confident that this implementation works, because I know that it's being exercised more rigorously. </p> <p> When I find myself writing a property per branch, so to speak, I always feel that I missed a better way to describe the problem. As an example, for years <a href="https://youtu.be/2oN9caQflJ8?si=em1VvFqYFA_AjDlk">I would demonstrate</a> how to test <a href="https://codingdojo.org/kata/FizzBuzz/">the FizzBuzz kata</a> with property-based testing by dividing the problem into Equivalence Classes and then writing a property for each partition. Just as I've done here. This is usually possible, but smells of being too coupled to the implementation. </p> <p> Sometimes, if you think about the problem long enough, you may be able to produce an alternative set of properties that describe the problem in a way that's entirely decoupled from the implementation. After years, <a href="/2021/06/28/property-based-testing-is-not-the-same-as-partition-testing">I finally managed to do that with the FizzBuzz kata</a>. </p> <p> I didn't succeed in doing that with the Range kata this time around, but maybe later. 
</p> <p> <strong>Next:</strong> <a href="/2024/01/22/a-range-kata-implementation-in-c">A Range kata implementation in C#</a>. </p> </div> A Range kata implementation in Haskell https://blog.ploeh.dk/2024/01/08/a-range-kata-implementation-in-haskell 2024-01-08T07:06:00+00:00 Mark Seemann <div id="post"> <p> <em>A first crack at the exercise.</em> </p> <p> This article is an instalment in <a href="/2024/01/01/variations-of-the-range-kata">a short series of articles on the Range kata</a>. Here I describe my first attempt at the exercise. As I usually advise people <a href="/2020/01/13/on-doing-katas">on doing katas</a>, the first time you try your hand at a kata, use the language with which you're most comfortable. To be honest, I may be most habituated to C#, having programmed in it since 2002, but on the other hand, I currently 'think in <a href="https://www.haskell.org/">Haskell</a>', and am often frustrated with C#'s lack of structural equality, higher-order abstractions, and support for functional expressions. </p> <p> Thus, I usually start with Haskell even though I always find myself struggling with the ecosystem. If you do, too, the source code is <a href="https://github.com/ploeh/RangeHaskell">available on GitHub</a>. </p> <p> I took my own advice by setting out with the explicit intent to follow <a href="https://codingdojo.org/kata/Range/">the Range kata description</a> as closely as possible. This kata doesn't beat about the bush, but instead just dumps a set of test cases on you. It wasn't clear if this is the most useful set of tests, or whether the order in which they're presented is the one most conducive to a good experience of test-driven development, but there was only one way to find out. </p> <p> I quickly learned, however, that the suggested test cases were insufficient to describe the behaviour in enough detail. 
</p> <h3 id="287256d6f0fe412585cce5f16fcf5363"> Containment <a href="#287256d6f0fe412585cce5f16fcf5363">#</a> </h3> <p> I started by adding the first two test cases as <a href="/2018/05/07/inlined-hunit-test-lists">inlined HUnit test lists</a>: </p> <p> <pre><span style="color:#a31515;">&quot;integer&nbsp;range&nbsp;contains&quot;</span>&nbsp;~:&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;(r,&nbsp;candidate,&nbsp;expected)&nbsp;&lt;- &nbsp;&nbsp;&nbsp;&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;2,&nbsp;Open&nbsp;6),&nbsp;[2,4],&nbsp;True), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;2,&nbsp;Open&nbsp;6),&nbsp;[-1,1,6,10],&nbsp;False) &nbsp;&nbsp;&nbsp;&nbsp;] &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;r&nbsp;`contains`&nbsp;candidate &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;expected&nbsp;~=?&nbsp;actual</pre> </p> <p> I wasn't particularly keen on going full <a href="/2019/10/07/devils-advocate">Devil's Advocate</a> on the exercise. 
I could, on the other hand, trivially pass both tests with this obviously degenerate implementation: </p> <p> <pre><span style="color:blue;">import</span>&nbsp;Data.List <span style="color:blue;">data</span>&nbsp;Endpoint&nbsp;a&nbsp;=&nbsp;Open&nbsp;a&nbsp;|&nbsp;Closed&nbsp;a&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>) contains&nbsp;_&nbsp;candidate&nbsp;=&nbsp;[2]&nbsp;`isPrefixOf`&nbsp;candidate</pre> </p> <p> Reluctantly, I had to invent some additional test cases: </p> <p> <pre><span style="color:#a31515;">&quot;integer&nbsp;range&nbsp;contains&quot;</span>&nbsp;~:&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;(r,&nbsp;candidate,&nbsp;expected)&nbsp;&lt;- &nbsp;&nbsp;&nbsp;&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;&nbsp;&nbsp;2&nbsp;,&nbsp;&nbsp;&nbsp;Open&nbsp;&nbsp;6),&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[2,4],&nbsp;&nbsp;True), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;&nbsp;&nbsp;2&nbsp;,&nbsp;&nbsp;&nbsp;Open&nbsp;&nbsp;6),&nbsp;[-1,1,6,10],&nbsp;False), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;(-1),&nbsp;Closed&nbsp;10),&nbsp;[-1,1,6,10],&nbsp;&nbsp;True), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;(-1),&nbsp;&nbsp;&nbsp;Open&nbsp;10),&nbsp;[-1,1,6,10],&nbsp;False), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;(-1),&nbsp;&nbsp;&nbsp;Open&nbsp;10),&nbsp;&nbsp;[-1,1,6,9],&nbsp;&nbsp;True), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((&nbsp;&nbsp;Open&nbsp;&nbsp;&nbsp;2,&nbsp;&nbsp;Closed&nbsp;&nbsp;6),&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[3,5,6],&nbsp;&nbsp;True), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((&nbsp;&nbsp;Open&nbsp;&nbsp;&nbsp;2,&nbsp;&nbsp;&nbsp;&nbsp;Open&nbsp;&nbsp;6),&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[2,5],&nbsp;False), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((&nbsp;&nbsp;Open&nbsp;&nbsp;&nbsp;2,&nbsp;&nbsp;&nbsp;&nbsp;Open&nbsp;&nbsp;6),&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">[]</span>,&nbsp;&nbsp;True), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;&nbsp;&nbsp;2,&nbsp;&nbsp;Closed&nbsp;&nbsp;6),&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[3,7,4],&nbsp;False) &nbsp;&nbsp;&nbsp;&nbsp;] &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;r&nbsp;`contains`&nbsp;candidate &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;expected&nbsp;~=?&nbsp;actual</pre> </p> <p> This was when I began to wonder whether it would have been easier to use property-based testing. That would entail, however, a departure from the kata's suggested test cases, so I decided to stick to the plan and then perhaps return to property-based testing when repeating the exercise. </p> <p> Ultimately I implemented the <code>contains</code> function this way: </p> <p> <pre><span style="color:#2b91af;">contains</span>&nbsp;<span style="color:blue;">::</span>&nbsp;(<span style="color:blue;">Foldable</span>&nbsp;t,&nbsp;<span style="color:blue;">Ord</span>&nbsp;a)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;(<span style="color:blue;">Endpoint</span>&nbsp;a,&nbsp;<span style="color:blue;">Endpoint</span>&nbsp;a)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;t&nbsp;a&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Bool</span> contains&nbsp;(lowerBound,&nbsp;upperBound)&nbsp;= &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;isHighEnough&nbsp;=&nbsp;<span style="color:blue;">case</span>&nbsp;lowerBound&nbsp;<span style="color:blue;">of</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Closed&nbsp;x&nbsp;-&gt;&nbsp;(x&nbsp;&lt;=) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Open&nbsp;&nbsp;&nbsp;x&nbsp;-&gt;&nbsp;(x&nbsp;&lt;) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;isLowEnough&nbsp;=&nbsp;<span style="color:blue;">case</span>&nbsp;upperBound&nbsp;<span style="color:blue;">of</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Closed&nbsp;y&nbsp;-&gt;&nbsp;(&lt;=&nbsp;y) 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Open&nbsp;&nbsp;&nbsp;y&nbsp;-&gt;&nbsp;&nbsp;(&lt;&nbsp;y) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;isContained&nbsp;x&nbsp;=&nbsp;isHighEnough&nbsp;x&nbsp;&amp;&amp;&nbsp;isLowEnough&nbsp;x &nbsp;&nbsp;<span style="color:blue;">in</span>&nbsp;<span style="color:blue;">all</span>&nbsp;isContained</pre> </p> <p> In some ways it seems a bit verbose to me, but I couldn't easily think of a simpler implementation. </p> <p> One of the features I find so fascinating about Haskell is how <em>general</em> it enables me to be. While the tests use integers for concision, the <code>contains</code> function works with any <code>Ord</code> instance; not only <code>Integer</code>, but also <code>Double</code>, <code>Word</code>, <code>Day</code>, <code>TimeOfDay</code>, or some new type I can't even predict. </p> <h3 id="67c8d29aeb5d4c2ca18b4a6664cf6af8"> All points <a href="#67c8d29aeb5d4c2ca18b4a6664cf6af8">#</a> </h3> <p> The next function suggested by the kata is a function to enumerate all points in a range. 
There's only a single test case, so again I added some more: </p> <p> <pre><span style="color:#a31515;">&quot;getAllPoints&quot;</span>&nbsp;~:&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;(r,&nbsp;expected)&nbsp;&lt;- &nbsp;&nbsp;&nbsp;&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;2,&nbsp;&nbsp;&nbsp;Open&nbsp;6),&nbsp;[2..5]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;4,&nbsp;&nbsp;&nbsp;Open&nbsp;8),&nbsp;[4..7]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;2,&nbsp;Closed&nbsp;6),&nbsp;[2..6]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;4,&nbsp;Closed&nbsp;8),&nbsp;[4..8]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((&nbsp;&nbsp;Open&nbsp;2,&nbsp;Closed&nbsp;6),&nbsp;[3..6]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((&nbsp;&nbsp;Open&nbsp;4,&nbsp;Closed&nbsp;8),&nbsp;[5..8]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((&nbsp;&nbsp;Open&nbsp;2,&nbsp;&nbsp;&nbsp;Open&nbsp;6),&nbsp;[3..5]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((&nbsp;&nbsp;Open&nbsp;4,&nbsp;&nbsp;&nbsp;Open&nbsp;8),&nbsp;[5..7]) &nbsp;&nbsp;&nbsp;&nbsp;] &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;allPoints&nbsp;r &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;expected&nbsp;~=?&nbsp;actual</pre> </p> <p> Ultimately, after I'd implemented the <em>next</em> feature, I refactored the <code>allPoints</code> function to make use of it, and it became a simple one-liner: </p> <p> <pre><span style="color:#2b91af;">allPoints</span>&nbsp;<span style="color:blue;">::</span>&nbsp;(<span style="color:blue;">Enum</span>&nbsp;a,&nbsp;<span style="color:blue;">Num</span>&nbsp;a)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;(<span style="color:blue;">Endpoint</span>&nbsp;a,&nbsp;<span style="color:blue;">Endpoint</span>&nbsp;a)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;[a] allPoints&nbsp;=&nbsp;<span style="color:blue;">uncurry</span>&nbsp;<span style="color:blue;">enumFromTo</span>&nbsp;.&nbsp;endpoints</pre> </p> <p> The 
<code>allPoints</code> function also enabled me to express the kata's <em>ContainsRange</em> test cases without introducing a new API: </p> <p> <pre><span style="color:#a31515;">&quot;ContainsRange&quot;</span>&nbsp;~:&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;(r,&nbsp;candidate,&nbsp;expected)&nbsp;&lt;- &nbsp;&nbsp;&nbsp;&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;2,&nbsp;&nbsp;&nbsp;Open&nbsp;&nbsp;5),&nbsp;allPoints&nbsp;(Closed&nbsp;7,&nbsp;Open&nbsp;&nbsp;&nbsp;10),&nbsp;False), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;2,&nbsp;&nbsp;&nbsp;Open&nbsp;&nbsp;5),&nbsp;allPoints&nbsp;(Closed&nbsp;3,&nbsp;Open&nbsp;&nbsp;&nbsp;10),&nbsp;False), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;3,&nbsp;&nbsp;&nbsp;Open&nbsp;&nbsp;5),&nbsp;allPoints&nbsp;(Closed&nbsp;2,&nbsp;Open&nbsp;&nbsp;&nbsp;10),&nbsp;False), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;2,&nbsp;&nbsp;&nbsp;Open&nbsp;10),&nbsp;allPoints&nbsp;(Closed&nbsp;3,&nbsp;Closed&nbsp;&nbsp;5),&nbsp;&nbsp;True), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;3,&nbsp;Closed&nbsp;&nbsp;5),&nbsp;allPoints&nbsp;(Closed&nbsp;3,&nbsp;Open&nbsp;&nbsp;&nbsp;&nbsp;5),&nbsp;&nbsp;True) &nbsp;&nbsp;&nbsp;&nbsp;] &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;r&nbsp;`contains`&nbsp;candidate &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;expected&nbsp;~=?&nbsp;actual</pre> </p> <p> As I've already mentioned, the above implementation of <code>allPoints</code> is based on the next feature, <code>endpoints</code>. </p> <h3 id="a16cc4c45e614bb9a726c14ef19afc8f"> Endpoints <a href="#a16cc4c45e614bb9a726c14ef19afc8f">#</a> </h3> <p> The kata also suggests a function to return the two endpoints of a range, as well as some test cases to describe it. 
Once more, I had to add more test cases to adequately describe the desired functionality: </p> <p> <pre><span style="color:#a31515;">&quot;endPoints&quot;</span>&nbsp;~:&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;(r,&nbsp;expected)&nbsp;&lt;- &nbsp;&nbsp;&nbsp;&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;2,&nbsp;&nbsp;&nbsp;Open&nbsp;6),&nbsp;(2,&nbsp;5)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;1,&nbsp;&nbsp;&nbsp;Open&nbsp;7),&nbsp;(1,&nbsp;6)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;2,&nbsp;Closed&nbsp;6),&nbsp;(2,&nbsp;6)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;1,&nbsp;Closed&nbsp;7),&nbsp;(1,&nbsp;7)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((&nbsp;&nbsp;Open&nbsp;2,&nbsp;&nbsp;&nbsp;Open&nbsp;6),&nbsp;(3,&nbsp;5)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((&nbsp;&nbsp;Open&nbsp;1,&nbsp;&nbsp;&nbsp;Open&nbsp;7),&nbsp;(2,&nbsp;6)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((&nbsp;&nbsp;Open&nbsp;2,&nbsp;Closed&nbsp;6),&nbsp;(3,&nbsp;6)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((&nbsp;&nbsp;Open&nbsp;1,&nbsp;Closed&nbsp;7),&nbsp;(2,&nbsp;7)) &nbsp;&nbsp;&nbsp;&nbsp;] &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;endpoints&nbsp;r &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;expected&nbsp;~=?&nbsp;actual</pre> </p> <p> The implementation is fairly trivial: </p> <p> <pre><span style="color:#2b91af;">endpoints</span>&nbsp;<span style="color:blue;">::</span>&nbsp;(<span style="color:blue;">Num</span>&nbsp;a1,&nbsp;<span style="color:blue;">Num</span>&nbsp;a2)&nbsp;<span style="color:blue;">=&gt;</span>&nbsp;(<span style="color:blue;">Endpoint</span>&nbsp;a2,&nbsp;<span style="color:blue;">Endpoint</span>&nbsp;a1)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;(a2,&nbsp;a1) endpoints&nbsp;(Closed&nbsp;x,&nbsp;Closed&nbsp;y)&nbsp;=&nbsp;(x&nbsp;&nbsp;,&nbsp;y) endpoints&nbsp;(Closed&nbsp;x,&nbsp;&nbsp;&nbsp;Open&nbsp;y)&nbsp;=&nbsp;(x&nbsp;&nbsp;,&nbsp;y-1) 
endpoints&nbsp;(&nbsp;&nbsp;Open&nbsp;x,&nbsp;Closed&nbsp;y)&nbsp;=&nbsp;(x+1,&nbsp;y) endpoints&nbsp;(&nbsp;&nbsp;Open&nbsp;x,&nbsp;&nbsp;&nbsp;Open&nbsp;y)&nbsp;=&nbsp;(x+1,&nbsp;y-1)</pre> </p> <p> One attractive quality of <a href="https://en.wikipedia.org/wiki/Algebraic_data_type">algebraic data types</a> is that the 'algebra' of the type(s) tells you how many cases you need to pattern-match against. Since I'm treating a range as a pair of <code>Endpoint</code> values, and since each <code>Endpoint</code> can be one of two cases (<code>Open</code> or <code>Closed</code>), there are exactly 2 * 2 = 4 possible combinations (since a tuple is a <a href="https://en.wikipedia.org/wiki/Product_type">product type</a>). </p> <p> That fits with the number of pattern-matches required to implement the function. </p> <h3 id="c2b611a0c9ba494b9ccc30c5cd3ec4e8"> Overlapping ranges <a href="#c2b611a0c9ba494b9ccc30c5cd3ec4e8">#</a> </h3> <p> The final interesting feature is a predicate to determine whether one range overlaps another. 
As has become a refrain by now, I didn't find the suggested test cases sufficient to describe the desired behaviour, so I had to add a few more: </p> <p> <pre><span style="color:#a31515;">&quot;overlapsRange&quot;</span>&nbsp;~:&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;(r,&nbsp;candidate,&nbsp;expected)&nbsp;&lt;- &nbsp;&nbsp;&nbsp;&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;2,&nbsp;Open&nbsp;&nbsp;5),&nbsp;(Closed&nbsp;7,&nbsp;Open&nbsp;10),&nbsp;False), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;2,&nbsp;Open&nbsp;10),&nbsp;(Closed&nbsp;3,&nbsp;Open&nbsp;&nbsp;5),&nbsp;&nbsp;True), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;3,&nbsp;Open&nbsp;&nbsp;5),&nbsp;(Closed&nbsp;3,&nbsp;Open&nbsp;&nbsp;5),&nbsp;&nbsp;True), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;2,&nbsp;Open&nbsp;&nbsp;5),&nbsp;(Closed&nbsp;3,&nbsp;Open&nbsp;10),&nbsp;&nbsp;True), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;3,&nbsp;Open&nbsp;&nbsp;5),&nbsp;(Closed&nbsp;2,&nbsp;Open&nbsp;10),&nbsp;&nbsp;True), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;3,&nbsp;Open&nbsp;&nbsp;5),&nbsp;(Closed&nbsp;1,&nbsp;Open&nbsp;&nbsp;3),&nbsp;False), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;3,&nbsp;Open&nbsp;&nbsp;5),&nbsp;(Closed&nbsp;5,&nbsp;Open&nbsp;&nbsp;7),&nbsp;False) &nbsp;&nbsp;&nbsp;&nbsp;] &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;r&nbsp;`overlaps`&nbsp;candidate &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;expected&nbsp;~=?&nbsp;actual</pre> </p> <p> I'm not entirely happy with the implementation: </p> <p> <pre><span style="color:#2b91af;">overlaps</span>&nbsp;<span style="color:blue;">::</span>&nbsp;(<span style="color:blue;">Ord</span>&nbsp;a1,&nbsp;<span style="color:blue;">Ord</span>&nbsp;a2)&nbsp;<span style="color:blue;">=&gt;</span> 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(Endpoint&nbsp;a1,&nbsp;Endpoint&nbsp;a2)&nbsp;-&gt;&nbsp;(Endpoint&nbsp;a2,&nbsp;Endpoint&nbsp;a1)&nbsp;-&gt;&nbsp;Bool overlaps&nbsp;(l1,&nbsp;h1)&nbsp;(l2,&nbsp;h2)&nbsp;= &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;less&nbsp;(Closed&nbsp;x)&nbsp;(Closed&nbsp;y)&nbsp;=&nbsp;x&nbsp;&lt;=&nbsp;y &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;less&nbsp;(Closed&nbsp;x)&nbsp;&nbsp;&nbsp;(Open&nbsp;y)&nbsp;=&nbsp;x&nbsp;&lt;&nbsp;&nbsp;y &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;less&nbsp;&nbsp;&nbsp;(Open&nbsp;x)&nbsp;(Closed&nbsp;y)&nbsp;=&nbsp;x&nbsp;&lt;&nbsp;&nbsp;y &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;less&nbsp;&nbsp;&nbsp;(Open&nbsp;x)&nbsp;&nbsp;&nbsp;(Open&nbsp;y)&nbsp;=&nbsp;x&nbsp;&lt;&nbsp;&nbsp;y &nbsp;&nbsp;<span style="color:blue;">in</span>&nbsp;l1&nbsp;`less`&nbsp;h2&nbsp;&amp;&amp;&nbsp;l2&nbsp;`less`&nbsp;h1</pre> </p> <p> Not that the code presented here is problematic in isolation, but if you compare it to the above <code>contains</code> function, there seems to be some repetition going on. Still, it's not <em>quite</em> the same, but the code looks similar enough that it bothers me. I feel that some kind of abstraction is sitting there, right before my nose, mocking me because I can't see it. Still, the code isn't completely duplicated, and even if it was, I can always invoke the <a href="https://en.wikipedia.org/wiki/Rule_of_three_(computer_programming)">rule of three</a> and let it remain as it is. </p> <p> Which is ultimately what I did. </p> <h3 id="c09e7c817ecc48c097d6660a5438f5e0"> Equality <a href="#c09e7c817ecc48c097d6660a5438f5e0">#</a> </h3> <p> The kata also suggests some test cases to verify that it's possible to compare two ranges for equality. Dutifully I added those test cases to the code base, even though I knew that they'd automatically pass. 
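</p> <p> A minimal sketch of why such comparisons can't fail to compile or behave unexpectedly, assuming only the <code>Endpoint</code> declaration shown earlier. The names <code>r1</code>, <code>r2</code>, <code>r3</code>, <code>sameRange</code>, and <code>differentRange</code> are hypothetical, introduced purely for illustration: </p>

```haskell
-- Same declaration as earlier in the article; deriving Eq gives
-- structural equality on constructor and payload.
data Endpoint a = Open a | Closed a deriving (Eq, Show)

-- Hypothetical example values, not part of the kata or the code base.
r1, r2, r3 :: (Endpoint Integer, Endpoint Integer)
r1 = (Closed 3, Open 5)
r2 = (Closed 3, Open 5)
r3 = (Closed 3, Closed 5)

-- Pairs of Eq instances are themselves Eq instances,
-- so (==) and (/=) work on ranges without any extra code.
sameRange :: Bool
sameRange = r1 == r2

differentRange :: Bool
differentRange = r1 /= r3
```

<p> In this sketch <code>Open 5</code> and <code>Closed 5</code> compare as different, since the derived instance distinguishes the two constructors. The kata's own test cases follow. 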
</p> <p> <pre><span style="color:#a31515;">&quot;Equals&quot;</span>&nbsp;~:&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;(x,&nbsp;y,&nbsp;expected)&nbsp;&lt;- &nbsp;&nbsp;&nbsp;&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;3,&nbsp;Open&nbsp;&nbsp;5),&nbsp;(Closed&nbsp;3,&nbsp;Open&nbsp;&nbsp;5),&nbsp;&nbsp;True), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;2,&nbsp;Open&nbsp;10),&nbsp;(Closed&nbsp;3,&nbsp;Open&nbsp;&nbsp;5),&nbsp;False), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;2,&nbsp;Open&nbsp;&nbsp;5),&nbsp;(Closed&nbsp;3,&nbsp;Open&nbsp;10),&nbsp;False), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;((Closed&nbsp;3,&nbsp;Open&nbsp;&nbsp;5),&nbsp;(Closed&nbsp;2,&nbsp;Open&nbsp;10),&nbsp;False) &nbsp;&nbsp;&nbsp;&nbsp;] &nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;x&nbsp;==&nbsp;y &nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;expected&nbsp;~=?&nbsp;actual</pre> </p> <p> In the beginning of this article, I called attention to C#'s regrettable lack of structural equality. Here's an example of what I mean. In Haskell, these tests automatically pass because <code>Endpoint</code> is an <code>Eq</code> instance (by declaration), and all pairs of <code>Eq</code> instances are themselves <code>Eq</code> instances. Simple, elegant, powerful. </p> <h3 id="094e6dd4c07e40739d6fca4945dc7018"> Conclusion <a href="#094e6dd4c07e40739d6fca4945dc7018">#</a> </h3> <p> As a first pass at the (admittedly uncomplicated) Range kata, I tried to follow the 'plan' implied by the kata description's test cases. I quickly became frustrated with their lack of completion. They were adequate in indicating to a human (me) what the desired behaviour should be, but insufficient to satisfactorily describe the desired behaviour. 
</p> <p> I could, of course, have stuck with only those test cases, and instead of employing the Devil's Advocate technique (which I actively tried to avoid) made an honest effort to implement the functionality. </p> <p> The thing is, however, that <a href="/2023/03/20/on-trust-in-software-development">I don't trust myself</a>. At its essence, the Range kata is all about edge cases, which are where most bugs tend to lurk. Thus, these are exactly the cases that should be covered by tests. </p> <p> Having made enough 'dumb' programming mistakes during my career, I didn't trust myself to be able to write correct implementations without more test coverage than originally suggested. That's the reason I added more tests. </p> <p> On the other hand, I more than once speculated whether property-based testing would make this work easier. I decided to pursue that idea during my second pass at the kata. </p> <p> <strong>Next:</strong> <a href="/2024/01/15/a-range-kata-implementation-in-f">A Range kata implementation in F#</a>. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="f768e9d0ec73603ff40542ae07a1a9bd"> <div class="comment-author"><a href="https://github.com/mormegil-cz">Petr Kadlec</a> <a href="#f768e9d0ec73603ff40542ae07a1a9bd">#</a></div> <div class="comment-content"> <p> I’d have another test case for the Equality function: <code>((Open 2, Open 6), (Closed 3, Closed 5), True)</code>. While it is nice Haskell provides (automatic) structural equality, I don’t think we want to say that the (2, 6) range (on integers!) is something else than the [3, 5] range. </p> <p> But yes, this opens a can of worms: While (2, 6) = [3, 5] on integers, (2.0, 6.0) is obviously different than [3.0, 5.0] (on reals/Doubles/…). I have no idea: In Haskell, could you write an implementation of a function which would behave differently depending on whether the type argument belongs to a typeclass or not? 
</p> </div> <div class="comment-date">2024-01-09 13:38 UTC</div> </div> <div class="comment" id="9d0f60b0a2654424b10d264cfd8b6c96"> <div class="comment-author"><a href="/">Mark Seemann</a> <a href="#9d0f60b0a2654424b10d264cfd8b6c96">#</a></div> <div class="comment-content"> <p> Petr, thank you for writing. I don't think I'd add that (or similar) test cases, but it's a judgment call, and it's partly language-specific. What you're suggesting is to consider things that are <em>equivalent</em> equal. I agree that for integers this would be the case, but it wouldn't be for rational numbers, or floating points (or real numbers, if we had those in programming). </p> <p> In Haskell it wouldn't really be idiomatic, because equality is defined by the <code>Eq</code> type class, and most types just go with the default implementation. What you suggest requires writing an explicit <code>Eq</code> instance for <code>Endpoint</code>. It'd be possible, but then you'd have to deal explicitly with the various integer representations separately from other representations that use floating points. </p> <p> The distinction between <em>equivalence</em> and <em>equality</em> is largely artificial or a convenient hand wave. To explain what I mean, consider mathematical expressions. Obviously, <em>3 + 1</em> is equal to <em>2 + 2</em> when evaluated, but they're different <em>expressions</em>. Thus, on an expression level, we don't consider those two expressions equal. I think of the integer ranges <em>(2, 6)</em> and <em>[3, 5]</em> the same way. They evaluate to the same thing, but they aren't equal. </p> <p> I don't think that this is a strong argument, mind. In other programming languages, I might arrive at a different decision. It also matters what client code needs to <em>do</em> with the API. In any case, the decision to not consider <em>equivalence</em> the same as <em>equality</em> is congruent with how Haskell works. 
</p> <p> The existence of floating points and rational numbers, however, opens another can of worms that I happily glossed over, since I had a completely different goal with the kata than producing a reusable library. </p> <p> Haskell actually supports rational numbers with the <code>%</code> operator: </p> <p> <pre>ghci&gt; 1%2
1 % 2</pre> </p> <p> This value represents ½, to be explicit. </p> <p> Unfortunately, according to the specification (or, at least, <a href="https://hackage.haskell.org/package/base/docs/GHC-Enum.html#v:succ">the documentation</a>) of the <code>Enum</code> type class, the two 'movement' operations <code>succ</code> and <code>pred</code> jump by increments of <em>1</em>: </p> <p> <pre>ghci&gt; succ $ 1%2
3 % 2
ghci&gt; succ $ succ $ 1%2
5 % 2</pre> </p> <p> The same is the case with floating points: </p> <p> <pre>ghci&gt; succ 1.5
2.5
ghci&gt; succ $ succ 1.5
3.5</pre> </p> <p> This is unfortunate when it comes to floating points, since it would be possible to enumerate all floating points in a range. (For example, if a <a href="https://en.wikipedia.org/wiki/Single-precision_floating-point_format">single-precision floating point</a> occupies 32 bits, there's a finite number of them, and you can enumerate them.) </p> <p> As <a href="https://twitter.com/sonatsuer/status/1744326173524394372">Sonat Süer points out</a>, this means that the <code>allPoints</code> function is fundamentally broken for floating points and rational numbers (and possibly other types as well). </p> <p> One way around that in Haskell would be to introduce a <em>new</em> type class for the purpose of truly enumerating ranges, and either implement it correctly for floating points, or explicitly avoid making <code>Float</code> and <code>Double</code> instances of that new type class. This, on the other hand, would have the downside that all of a sudden, the <code>allPoints</code> function wouldn't support any custom type of which I, as the implementer, am unaware. 
</p> <p> If this was a library that I'd actually have to ship as a reusable API, I think I'd start by <em>not</em> including the <code>allPoints</code> function, and then see if anyone asks for it. If or when that happens, I'd begin a process to chart why people need it, and what could be done to serve those needs in a useful and mathematically consistent manner. </p> </div> <div class="comment-date">2024-01-13 19:51 UTC</div> </div> </div> Variations of the Range kata https://blog.ploeh.dk/2024/01/01/variations-of-the-range-kata 2024-01-01T17:00:00+00:00 Mark Seemann <div id="post"> <p> <em>In the languages I usually employ.</em> </p> <p> The <a href="https://codingdojo.org/kata/Range/">Range kata</a> is succinct, bordering on the spartan in both description and requirements. To be honest, it's hardly the most inspiring kata available, and yet it may help showcase a few interesting points about software design in general. It's what it demonstrates about <a href="/2018/03/22/functors">functors</a> that makes it marginally interesting. </p> <p> In this short article series I first cover a few incarnations of the kata in my usual programming languages, and then conclude by looking at <em>range</em> as a functor. </p> <p> The article series contains the following articles: </p> <ul> <li><a href="/2024/01/08/a-range-kata-implementation-in-haskell">A Range kata implementation in Haskell</a></li> <li><a href="/2024/01/15/a-range-kata-implementation-in-f">A Range kata implementation in F#</a></li> <li><a href="/2024/01/22/a-range-kata-implementation-in-c">A Range kata implementation in C#</a></li> <li><a href="/2024/02/12/range-as-a-functor">Range as a functor</a></li> </ul> <p> I didn't take the same approaches through all three exercises. 
An important point about <a href="/2020/01/13/on-doing-katas">doing katas</a> is to learn something, and when you've done the kata once, you've already gained some knowledge that can't easily be unlearned. Thus, on the second, or third time through, it's only natural to apply that knowledge, but then try different tactics to solve the problem in a different way. That's what I did here, starting with <a href="https://www.haskell.org/">Haskell</a>, proceeding with <a href="https://fsharp.org/">F#</a>, and concluding with C#. </p> <p> <strong>Next:</strong> <a href="/2024/01/08/a-range-kata-implementation-in-haskell">A Range kata implementation in Haskell</a>. </p> </div> Serializing restaurant tables in C# https://blog.ploeh.dk/2023/12/25/serializing-restaurant-tables-in-c 2023-12-25T11:42:00+00:00 Mark Seemann <div id="post"> <p> <em>Using System.Text.Json, with and without Reflection.</em> </p> <p> This article is part of a short series of articles about <a href="/2023/12/04/serialization-with-and-without-reflection">serialization with and without Reflection</a>. In this instalment I'll explore some options for serializing <a href="https://en.wikipedia.org/wiki/JSON">JSON</a> with C# using the API built into .NET: <a href="https://learn.microsoft.com/dotnet/api/system.text.json">System.Text.Json</a>. I'm not going to use <a href="https://www.newtonsoft.com/json">Json.NET</a> in this article, but I've <a href="/2022/01/03/to-id-or-not-to-id">done similar things with that library</a> in the past, so what's here is, at least, somewhat generalizable. </p> <p> Since the API is the same, the only difference from <a href="/2023/12/18/serializing-restaurant-tables-in-f">the previous article</a> is the language syntax. 
</p> <h3 id="e949466d51f647bfbee9016d551d9b78"> Natural numbers <a href="#e949466d51f647bfbee9016d551d9b78">#</a> </h3> <p> Before we start investigating how to serialize to and from JSON, we must have something to serialize. As described in the <a href="/2023/12/04/serialization-with-and-without-reflection">introductory article</a> we'd like to parse and write restaurant table configurations like this: </p> <p> <pre>{ &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;singleTable&quot;</span>:&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;capacity&quot;</span>:&nbsp;16, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;minimalReservation&quot;</span>:&nbsp;10 &nbsp;&nbsp;} }</pre> </p> <p> On the other hand, I'd like to represent the Domain Model in a way that <a href="/encapsulation-and-solid">encapsulates the rules</a> governing the model, <a href="https://blog.janestreet.com/effective-ml-video/">making illegal states unrepresentable</a>. Even though that's a catchphrase associated with functional programming, it applies equally well to a statically typed object-oriented language like C#. </p> <p> As the first step, we observe that the numbers involved are all <a href="https://en.wikipedia.org/wiki/Natural_number">natural numbers</a>. In C# it's rarer to define <a href="https://www.hillelwayne.com/post/constructive/">predicative data types</a> than in a language like <a href="https://fsharp.org/">F#</a>, but people should do it more. 
</p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:blue;">struct</span>&nbsp;<span style="color:#2b91af;">NaturalNumber</span>&nbsp;:&nbsp;IEquatable&lt;NaturalNumber&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:blue;">int</span>&nbsp;value; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">NaturalNumber</span>(<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">value</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(value&nbsp;&lt;&nbsp;1) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;ArgumentOutOfRangeException( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;nameof(value), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;Value&nbsp;must&nbsp;be&nbsp;a&nbsp;natural&nbsp;number&nbsp;greater&nbsp;than&nbsp;zero.&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.value&nbsp;=&nbsp;value; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;NaturalNumber?&nbsp;<span style="font-weight:bold;color:#74531f;">TryCreate</span>(<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">candidate</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(candidate&nbsp;&lt;&nbsp;1) 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;NaturalNumber(candidate); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:blue;">operator</span>&nbsp;&lt;(NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">left</span>,&nbsp;NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">right</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;left.value&nbsp;&lt;&nbsp;right.value; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:blue;">operator</span>&nbsp;&gt;(NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">left</span>,&nbsp;NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">right</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;left.value&nbsp;&gt;&nbsp;right.value; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:blue;">operator</span>&nbsp;&lt;=(NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">left</span>,&nbsp;NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">right</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;left.value&nbsp;&lt;=&nbsp;right.value; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:blue;">operator</span>&nbsp;&gt;=(NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">left</span>,&nbsp;NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">right</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;left.value&nbsp;&gt;=&nbsp;right.value; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:blue;">operator</span>&nbsp;==(NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">left</span>,&nbsp;NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">right</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;left.value&nbsp;==&nbsp;right.value; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="color:blue;">operator</span>&nbsp;!=(NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">left</span>,&nbsp;NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">right</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;left.value&nbsp;!=&nbsp;right.value; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span 
style="color:blue;">explicit</span>&nbsp;<span style="color:blue;">operator</span>&nbsp;<span style="color:blue;">int</span>(NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">number</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;number.value; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="font-weight:bold;color:#74531f;">Equals</span>(<span style="color:blue;">object</span>?&nbsp;<span style="font-weight:bold;color:#1f377f;">obj</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;obj&nbsp;<span style="color:blue;">is</span>&nbsp;NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">number</span>&nbsp;&amp;&amp;&nbsp;Equals(number); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="font-weight:bold;color:#74531f;">Equals</span>(NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">other</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;value&nbsp;==&nbsp;other.value; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#74531f;">GetHashCode</span>() &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;HashCode.Combine(value); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> When comparing all that boilerplate code to the <a 
href="/2023/12/18/serializing-restaurant-tables-in-f">three lines required to achieve the same result in F#</a>, it seems, at first glance, understandable that C# developers rarely reach for that option. Still, <a href="/2018/09/17/typing-is-not-a-programming-bottleneck">typing is not a programming bottleneck</a>, and most of that code was generated by a combination of Visual Studio and <a href="https://github.com/features/copilot">GitHub Copilot</a>. </p> <p> The <code>TryCreate</code> method may not be <em>strictly</em> necessary, but I consider it good practice to give client code a way to perform a fault-prone operation in a safe manner, without having to resort to a <code>try/catch</code> construct. </p> <p> That's it for natural numbers. 72 lines of code. Compare that to <a href="/2023/12/18/serializing-restaurant-tables-in-f">the F# implementation</a>, which required three lines of code. Syntax does matter. </p> <h3 id="542d19e6713f46d79cfc013fc577980a"> Domain Model <a href="#542d19e6713f46d79cfc013fc577980a">#</a> </h3> <p> Modelling a restaurant table follows in the same vein. One invariant I would like to enforce is that for a 'single' table, the minimal reservation should be a <code>NaturalNumber</code> less than or equal to the table's capacity. It doesn't make sense to configure a table for four with a minimum reservation of six. 
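</p> <p> Before moving on to the table itself, here's a brief sketch of how that invariant can be checked with nothing but <code>NaturalNumber</code>. It assumes that the type defined above is in scope and is written as C# top-level statements; the variable names are chosen only for this illustration. </p> <p> <pre>using System;

// TryCreate returns null, rather than throwing, when the candidate
// isn't a natural number.
Console.WriteLine(NaturalNumber.TryCreate(0) is null);   // True

// A table for four with a minimum reservation of six.
NaturalNumber? capacity = NaturalNumber.TryCreate(4);
NaturalNumber? minimalReservation = NaturalNumber.TryCreate(6);

// The invariant: both values exist, and the minimum doesn't exceed
// the capacity. The comparison uses NaturalNumber's own &lt;= operator.
bool isValid =
    capacity is not null &amp;&amp;
    minimalReservation is not null &amp;&amp;
    minimalReservation.Value &lt;= capacity.Value;
Console.WriteLine(isValid);   // False

// The constructor guards the same precondition, but by throwing.
try
{
    _ = new NaturalNumber(0);
}
catch (ArgumentOutOfRangeException)
{
    Console.WriteLine("Rejected");
}</pre> 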
</p> <p> In the same spirit as above, then, define this type: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:blue;">struct</span>&nbsp;<span style="color:#2b91af;">Table</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;NaturalNumber&nbsp;capacity; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;NaturalNumber?&nbsp;minimalReservation; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:#2b91af;">Table</span>(NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">capacity</span>,&nbsp;NaturalNumber?&nbsp;<span style="font-weight:bold;color:#1f377f;">minimalReservation</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.capacity&nbsp;=&nbsp;capacity; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.minimalReservation&nbsp;=&nbsp;minimalReservation; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;Table?&nbsp;<span style="font-weight:bold;color:#74531f;">TryCreateSingle</span>(<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">capacity</span>,&nbsp;<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">minimalReservation</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">cap</span>&nbsp;=&nbsp;NaturalNumber.TryCreate(capacity); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(cap&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">min</span>&nbsp;=&nbsp;NaturalNumber.TryCreate(minimalReservation); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(min&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(cap&nbsp;&lt;&nbsp;min) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;Table(cap.Value,&nbsp;min.Value); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;Table?&nbsp;<span style="font-weight:bold;color:#74531f;">TryCreateCommunal</span>(<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">capacity</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">cap</span>&nbsp;=&nbsp;NaturalNumber.TryCreate(capacity); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(cap&nbsp;<span style="color:blue;">is</span>&nbsp;<span 
style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;Table(cap.Value,&nbsp;<span style="color:blue;">null</span>); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;T&nbsp;<span style="font-weight:bold;color:#74531f;">Accept</span>&lt;<span style="color:#2b91af;">T</span>&gt;(ITableVisitor&lt;T&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">visitor</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(minimalReservation&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;visitor.VisitCommunal(capacity); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">else</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;visitor.VisitSingle(capacity,&nbsp;minimalReservation.Value); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> Here I've <a href="/2018/06/25/visitor-as-a-sum-type">Visitor-encoded</a> the <a href="https://en.wikipedia.org/wiki/Tagged_union">sum type</a> that <code>Table</code> is. It can either be a 'single' table or a communal table. </p> <p> Notice that <code>TryCreateSingle</code> checks the invariant that the <code>capacity</code> must be greater than or equal to the <code>minimalReservation</code>. 
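</p> <p> To make the two cases concrete, here's a small usage sketch. It assumes that the <code>NaturalNumber</code> and <code>Table</code> types shown above are in scope. The <code>ITableVisitor&lt;T&gt;</code> declaration is inferred from how <code>Accept</code> calls it, and <code>DescriptionVisitor</code> is a hypothetical name invented for this illustration, so treat both as assumptions rather than as the actual code base. </p> <p> <pre>using System;

// Usage, written as C# top-level statements.
var rejected = Table.TryCreateSingle(4, 6);   // minimum exceeds capacity
var single = Table.TryCreateSingle(4, 2);
var communal = Table.TryCreateCommunal(12);
var invalid = Table.TryCreateCommunal(0);     // zero isn't a natural number

Console.WriteLine(rejected is null);   // True
Console.WriteLine(invalid is null);    // True
Console.WriteLine(single?.Accept(new DescriptionVisitor()));
Console.WriteLine(communal?.Accept(new DescriptionVisitor()));

// Assumed shape of the Visitor interface, inferred from Table.Accept.
// Leave this out if your code base already declares it.
public interface ITableVisitor&lt;T&gt;
{
    T VisitCommunal(NaturalNumber capacity);
    T VisitSingle(NaturalNumber capacity, NaturalNumber minimalReservation);
}

// Hypothetical visitor that turns a Table into a short description.
public sealed class DescriptionVisitor : ITableVisitor&lt;string&gt;
{
    public string VisitCommunal(NaturalNumber capacity)
    {
        return $"Communal table seating {(int)capacity}";
    }

    public string VisitSingle(
        NaturalNumber capacity,
        NaturalNumber minimalReservation)
    {
        return $"Single table seating {(int)capacity}, " +
            $"minimum {(int)minimalReservation}";
    }
}</pre> </p> <p> In both rejected cases the result is simply <code>null</code>, so client code can branch on it without any exception handling. 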
</p> <p> The point of this little exercise, so far, is that it <em>encapsulates</em> the contract implied by the Domain Model. It does this by using the static type system to its advantage. </p> <h3 id="13ac203cee494ec18420959fbad03003"> JSON serialization by hand <a href="#13ac203cee494ec18420959fbad03003">#</a> </h3> <p> At the boundaries of applications, however, <a href="/2023/10/16/at-the-boundaries-static-types-are-illusory">there are no static types</a>. Is the static type system still useful in that situation? </p> <p> For a long time, the most popular .NET library for JSON serialization was <a href="https://www.newtonsoft.com/json">Json.NET</a>, but these days I find the built-in API offered in the <a href="https://learn.microsoft.com/dotnet/api/system.text.json">System.Text.Json</a> namespace adequate. This is also the case here. </p> <p> The original rationale for this article series was to demonstrate how serialization can be done without Reflection, so I'll start there and return to Reflection later. </p> <p> In this article series, I consider the JSON format fixed. A single table should be rendered as shown above, and a communal table should be rendered like this: </p> <p> <pre>{&nbsp;<span style="color:#2e75b6;">&quot;communalTable&quot;</span>:&nbsp;{&nbsp;<span style="color:#2e75b6;">&quot;capacity&quot;</span>:&nbsp;42&nbsp;}&nbsp;}</pre> </p> <p> Often in the real world you'll have to conform to a particular protocol format, or, even if that's not the case, being able to control the shape of the wire format is important to deal with backwards compatibility. </p> <p> As I outlined in the <a href="/2023/12/04/serialization-with-and-without-reflection">introduction article</a> you can usually find a more weakly typed API to get the job done. 
For serializing <code>Table</code> to JSON it looks like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#74531f;">Serialize</span>(<span style="color:blue;">this</span>&nbsp;Table&nbsp;<span style="font-weight:bold;color:#1f377f;">table</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;table.Accept(<span style="color:blue;">new</span>&nbsp;TableVisitor()); } <span style="color:blue;">private</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">TableVisitor</span>&nbsp;:&nbsp;ITableVisitor&lt;<span style="color:blue;">string</span>&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#74531f;">VisitCommunal</span>(NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">capacity</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">j</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;JsonObject &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#a31515;">&quot;communalTable&quot;</span>]&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;JsonObject &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#a31515;">&quot;capacity&quot;</span>]&nbsp;=&nbsp;(<span style="color:blue;">int</span>)capacity &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}; 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;j.ToJsonString(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#74531f;">VisitSingle</span>(NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">capacity</span>,&nbsp;NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">value</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">j</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;JsonObject &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#a31515;">&quot;singleTable&quot;</span>]&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;JsonObject &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#a31515;">&quot;capacity&quot;</span>]&nbsp;=&nbsp;(<span style="color:blue;">int</span>)capacity, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[<span style="color:#a31515;">&quot;minimalReservation&quot;</span>]&nbsp;=&nbsp;(<span style="color:blue;">int</span>)value &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;j.ToJsonString(); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> In order to separate concerns, I've defined this functionality in a new static class that references the Domain Model. 
The <code>Serialize</code> extension method uses a <code>private</code> Visitor to write two different <a href="https://learn.microsoft.com/dotnet/api/system.text.json.nodes.jsonobject">JsonObject</a> objects, using the JSON API's underlying Document Object Model (DOM). </p> <h3 id="53c250b548c64292b2d704be12c91aa5"> JSON deserialization by hand <a href="#53c250b548c64292b2d704be12c91aa5">#</a> </h3> <p> You can also go the other way, and when it looks more complicated, it's because it is. When serializing an encapsulated value, not a lot can go wrong because the value is already valid. When deserializing a JSON string, on the other hand, all sorts of things can go wrong: It might not even be a valid string, or the string may not be valid JSON, or the JSON may not be a valid <code>Table</code> representation, or the values may be illegal, etc. </p> <p> Since there are several values that explicitly must be integers, it makes sense to define a helper method to try to parse an integer: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">int</span>?&nbsp;<span style="font-weight:bold;color:#74531f;">TryInt</span>(<span style="color:blue;">this</span>&nbsp;JsonNode?&nbsp;<span style="font-weight:bold;color:#1f377f;">node</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(node&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(node.GetValueKind()&nbsp;!=&nbsp;JsonValueKind.Number) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">null</span>; 
&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">try</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;(<span style="color:blue;">int</span>)node; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">catch</span>&nbsp;(FormatException) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> I'm surprised that there's no built-in way to do that, but if there is, I couldn't find it. </p> <p> With a helper method like that you can now implement the <code>Deserialize</code> method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;Table?&nbsp;<span style="font-weight:bold;color:#74531f;">Deserialize</span>(<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">json</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">try</span> &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">node</span>&nbsp;=&nbsp;JsonNode.Parse(json); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">cnode</span>&nbsp;=&nbsp;node?[<span style="color:#a31515;">&quot;communalTable&quot;</span>]; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(cnode&nbsp;<span style="color:blue;">is</span>&nbsp;{&nbsp;}) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span 
style="font-weight:bold;color:#1f377f;">capacity</span>&nbsp;=&nbsp;cnode[<span style="color:#a31515;">&quot;capacity&quot;</span>].TryInt(); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(capacity&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Table.TryCreateCommunal(capacity.Value); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">snode</span>&nbsp;=&nbsp;node?[<span style="color:#a31515;">&quot;singleTable&quot;</span>]; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(snode&nbsp;<span style="color:blue;">is</span>&nbsp;{&nbsp;}) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">capacity</span>&nbsp;=&nbsp;snode[<span style="color:#a31515;">&quot;capacity&quot;</span>].TryInt(); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">minimalReservation</span>&nbsp;=&nbsp;snode[<span style="color:#a31515;">&quot;minimalReservation&quot;</span>].TryInt(); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(capacity&nbsp;<span style="color:blue;">is</span>&nbsp;<span 
style="color:blue;">null</span>&nbsp;||&nbsp;minimalReservation&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Table.TryCreateSingle( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;capacity.Value, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;minimalReservation.Value); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">catch</span>&nbsp;(JsonException) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> Since both serialisation and deserialization is based on string values, you should write automated tests that verify that the code works, and in fact, I did. 
Here are a few examples: </p> <p> <pre>[Fact] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">DeserializeSingleTableFor4</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">json</span>&nbsp;=&nbsp;<span style="color:#a31515;">&quot;&quot;&quot;{&quot;singleTable&quot;:{&quot;capacity&quot;:4,&quot;minimalReservation&quot;:3}}&quot;&quot;&quot;</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;TableJson.Deserialize(json); &nbsp;&nbsp;&nbsp;&nbsp;Assert.Equal(Table.TryCreateSingle(4,&nbsp;3),&nbsp;actual); } [Fact] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">DeserializeNonTable</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">json</span>&nbsp;=&nbsp;<span style="color:#a31515;">&quot;&quot;&quot;{&quot;foo&quot;:42}&quot;&quot;&quot;</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;TableJson.Deserialize(json); &nbsp;&nbsp;&nbsp;&nbsp;Assert.Null(actual); }</pre> </p> <p> Apart from using directives and namespace declaration this hand-written JSON capability requires 87 lines of code, although, to be fair, <code>TryInt</code> is a general-purpose method that ought to be part of the <code>System.Text.Json</code> API. Can we do better with static types and Reflection? 
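</p> <p> As an aside on <code>TryInt</code>: one possible alternative, sketched here under the assumption that the node comes from <code>JsonNode.Parse</code>, is to lean on <code>JsonValue.TryGetValue&lt;T&gt;</code> rather than a cast wrapped in <code>try</code>/<code>catch</code>. The <code>JsonNodeExtensions</code> class name is hypothetical, and this is a sketch rather than a drop-in replacement; it hasn't been checked against the helper above for every edge case: </p> <p> <pre>using System.Text.Json.Nodes;

public static class JsonNodeExtensions
{
    // Sketch only (hypothetical helper): returns the node's value as an int,
    // or null when the node is missing, isn't a JSON value, or can't be
    // read as a 32-bit integer.
    public static int? TryInt(this JsonNode? node)
    {
        if (node is JsonValue value &amp;&amp; value.TryGetValue&lt;int&gt;(out var i))
            return i;

        return null;
    }
}</pre> </p> <p> Called as <code>node?["capacity"].TryInt()</code>, this variant is intended to produce the number when the property holds an integer, and <code>null</code> when the property is absent or holds, say, a JSON string. Whether it's preferable to the exception-based version is a judgment call; it mostly trades a <code>catch</code> block for a pattern match.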
</p> <h3 id="fc6832fe72874427b32a0ee062d4fbf6"> JSON serialisation based on types <a href="#fc6832fe72874427b32a0ee062d4fbf6">#</a> </h3> <p> The static <a href="https://learn.microsoft.com/dotnet/api/system.text.json.jsonserializer">JsonSerializer</a> class comes with <code>Serialize&lt;T&gt;</code> and <code>Deserialize&lt;T&gt;</code> methods that use Reflection to convert a statically typed object to and from JSON. You can define a type (a <a href="https://en.wikipedia.org/wiki/Data_transfer_object">Data Transfer Object</a> (DTO) if you will) and let Reflection do the hard work. </p> <p> In <a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a> I explain how you're usually better off separating the role of serialization from the role of Domain Model. One way to do that is exactly by defining a DTO for serialisation, and let the Domain Model remain exclusively to model the rules of the application. The above <code>Table</code> type plays the latter role, so we need new DTO types: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">TableDto</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;CommunalTableDto?&nbsp;CommunalTable&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;<span style="color:blue;">set</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;SingleTableDto?&nbsp;SingleTable&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;<span style="color:blue;">set</span>;&nbsp;} } <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">CommunalTableDto</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">int</span>&nbsp;Capacity&nbsp;{&nbsp;<span 
style="color:blue;">get</span>;&nbsp;<span style="color:blue;">set</span>;&nbsp;} } <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">SingleTableDto</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">int</span>&nbsp;Capacity&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;<span style="color:blue;">set</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">int</span>&nbsp;MinimalReservation&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;<span style="color:blue;">set</span>;&nbsp;} }</pre> </p> <p> One way to model a <a href="https://en.wikipedia.org/wiki/Tagged_union">sum type</a> with a DTO is to declare both cases as nullable fields. While it does allow illegal states to be representable (i.e. both kinds of tables defined at the same time, or none of them present) this is only par for the course at the application boundary. </p> <p> While you can serialize values of that type, by default the generated JSON doesn't have the right format. 
Instead, a serialized communal table looks like this: </p> <p> <pre>{ &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;CommunalTable&quot;</span>:&nbsp;{&nbsp;<span style="color:#2e75b6;">&quot;Capacity&quot;</span>:&nbsp;42&nbsp;}, &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;SingleTable&quot;</span>:&nbsp;<span style="color:blue;">null</span> }</pre> </p> <p> There are two problems with the generated JSON document: </p> <ul> <li>The casing is wrong</li> <li>The null value shouldn't be there</li> </ul> <p> None of those are too hard to address, but it does make the API a bit more awkward to use, as this test demonstrates: </p> <p> <pre>[Fact] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">SerializeCommunalTableViaReflection</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">dto</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;TableDto &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;CommunalTable&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;CommunalTableDto&nbsp;{&nbsp;Capacity&nbsp;=&nbsp;42&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;JsonSerializer.Serialize( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dto, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;JsonSerializerOptions &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;PropertyNamingPolicy&nbsp;=&nbsp;JsonNamingPolicy.CamelCase, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;DefaultIgnoreCondition&nbsp;=&nbsp;JsonIgnoreCondition.WhenWritingNull &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;Assert.Equal(<span 
style="color:#a31515;">&quot;&quot;&quot;{&quot;communalTable&quot;:{&quot;capacity&quot;:42}}&quot;&quot;&quot;</span>,&nbsp;actual); }</pre> </p> <p> You can, of course, define this particular serialization behaviour as a reusable method, so it's not a problem that you can't address. I just wanted to include this, since it's part of the overall work that you have to do in order to make this work. </p> <h3 id="213e3959eb49407ab3cdf59a4d2aed06"> JSON deserialisation based on types <a href="#213e3959eb49407ab3cdf59a4d2aed06">#</a> </h3> <p> To allow parsing of JSON into the above DTO the Reflection-based <code>Deserialize</code> method pretty much works out of the box, although again, it needs to be configured. Here's a passing test that demonstrates how that works: </p> <p> <pre>[Fact] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">DeserializeSingleTableViaReflection</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">json</span>&nbsp;=&nbsp;<span style="color:#a31515;">&quot;&quot;&quot;{&quot;singleTable&quot;:{&quot;capacity&quot;:4,&quot;minimalReservation&quot;:2}}&quot;&quot;&quot;</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;JsonSerializer.Deserialize&lt;TableDto&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;json, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;JsonSerializerOptions&nbsp;{&nbsp;PropertyNamingPolicy&nbsp;=&nbsp;JsonNamingPolicy.CamelCase&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;Assert.Null(actual?.CommunalTable); &nbsp;&nbsp;&nbsp;&nbsp;Assert.Equal(4,&nbsp;actual?.SingleTable?.Capacity); &nbsp;&nbsp;&nbsp;&nbsp;Assert.Equal(2,&nbsp;actual?.SingleTable?.MinimalReservation); }</pre> </p> <p> There's only difference in casing, so you'd expect the 
<code>Deserialize</code> method to be a <a href="https://martinfowler.com/bliki/TolerantReader.html">Tolerant Reader</a>, but no. It's very particular about that, so the <code>JsonNamingPolicy.CamelCase</code> configuration is necessary. Perhaps the API designers found that <a href="https://peps.python.org/pep-0020/">explicit is better than implicit</a>. </p> <p> In any case, you could package that in a reusable <code>Deserialize</code> function that has all the options that are appropriate in a particular code context, so not a big deal. That takes care of actually writing and parsing JSON, but that's only half the battle. This only gives you a way to parse and serialize the DTO. What you ultimately want is to persist or dehydrate <code>Table</code> data. </p> <h3 id="ef08c05a3ae84141b2e8b20238af83df"> Converting DTO to Domain Model, and vice versa <a href="#ef08c05a3ae84141b2e8b20238af83df">#</a> </h3> <p> As usual, converting a nice, encapsulated value to a more relaxed format is safe and trivial: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;TableDto&nbsp;<span style="font-weight:bold;color:#74531f;">ToDto</span>(<span style="color:blue;">this</span>&nbsp;Table&nbsp;<span style="font-weight:bold;color:#1f377f;">table</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;table.Accept(<span style="color:blue;">new</span>&nbsp;TableDtoVisitor()); } <span style="color:blue;">private</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">TableDtoVisitor</span>&nbsp;:&nbsp;ITableVisitor&lt;TableDto&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;TableDto&nbsp;<span style="font-weight:bold;color:#74531f;">VisitCommunal</span>(NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">capacity</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;TableDto &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;CommunalTable&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;CommunalTableDto &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Capacity&nbsp;=&nbsp;(<span style="color:blue;">int</span>)capacity &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;TableDto&nbsp;<span style="font-weight:bold;color:#74531f;">VisitSingle</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">capacity</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">value</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;TableDto &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;SingleTable&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;SingleTableDto &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Capacity&nbsp;=&nbsp;(<span style="color:blue;">int</span>)capacity, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;MinimalReservation&nbsp;=&nbsp;(<span style="color:blue;">int</span>)value 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> Going the other way is <a href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/">fundamentally a parsing exercise</a>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;Table?&nbsp;<span style="font-weight:bold;color:#74531f;">TryParse</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(CommunalTable&nbsp;<span style="color:blue;">is</span>&nbsp;{&nbsp;}) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Table.TryCreateCommunal(CommunalTable.Capacity); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(SingleTable&nbsp;<span style="color:blue;">is</span>&nbsp;{&nbsp;}) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Table.TryCreateSingle( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;SingleTable.Capacity, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;SingleTable.MinimalReservation); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">null</span>; }</pre> </p> <p> Here, like in <a href="/code-that-fits-in-your-head">Code That Fits in Your Head</a>, I've made that conversion an instance method on <code>TableDto</code>. </p> <p> Such an operation may fail, so the result is a nullable <code>Table</code> object. </p> <p> Let's take stock of the type-based alternative. It requires 58 lines of code, distributed over three DTO types and the two conversions <code>ToDto</code> and <code>TryParse</code>, but here I haven't counted configuration of <code>Serialize</code> and <code>Deserialize</code>, since I left that to each test case that I wrote. 
Since all of this code generally stays within 80 characters in line width, that would realistically add another 10 lines of code, for a total of around 68 lines. </p> <p> This is smaller than the DOM-based code, but not by much. </p> <h3 id="bfac4d5d5ca940a2a61f964c4336adcf"> Conclusion <a href="#bfac4d5d5ca940a2a61f964c4336adcf">#</a> </h3> <p> In this article I've explored two alternatives for converting a well-encapsulated Domain Model to and from JSON. One option is to directly manipulate the DOM. Another option is to take a more declarative approach and define <em>types</em> that model the shape of the JSON data, and then leverage type-based automation (here, Reflection) to automatically parse and write the JSON. </p> <p> I've deliberately chosen a Domain Model with some constraints, in order to demonstrate how persisting a non-trivial data model might work. With that setup, writing 'loosely coupled' code directly against the DOM requires 87 lines of code, while taking advantage of type-based automation requires 68 lines of code. Again, Reflection seems 'easier' if you count lines of code, but the difference is marginal. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="467ce74e29064c60bfa9559140710e51"> <div class="comment-author"><a href="https://blog.oakular.xyz">Callum Warrilow</a> <a href="#467ce74e29064c60bfa9559140710e51">#</a></div> <div class="comment-content"> <p> Great piece as ever Mark. Always enjoy reading about alternatives to methods that have become unquestioned convention. </p> <p> I generally try to avoid reflection, especially within business code, and mainly use it for application bootstrapping, such as to discover services for dependency injection by convention. I also don't like attributes muddying model definitions, even on DTOs, so I would happily take an alternative to <code>System.Text.Json</code>.
It is however increasingly integrated into other System libraries in ways that make it almost too useful to pass up. For example, the <code><a href="https://learn.microsoft.com/en-us/dotnet/api/system.net.http.httpcontent?view=net-8.0">System.Net.Http.HttpContent</a></code> class has the <code><a href="https://learn.microsoft.com/en-us/dotnet/api/system.net.http.json.httpcontentjsonextensions.readfromjsonasync?view=net-8.0">ReadFromJsonAsync</a></code> extension method, which makes it trivial to deserialize a response body. Analogous methods exist for <code><a href="https://learn.microsoft.com/en-us/dotnet/api/system.binarydata?view=dotnet-plat-ext-8.0">BinaryData</a></code>. I'm not normally a sucker for convenience, but it is difficult to turn down strong integration like this. </p> </div> <div class="comment-date">2024-01-05 21:13 UTC</div> </div> <div class="comment" id="b9a5340eec9c45f49a438a37c7499520"> <div class="comment-author"><a href="/">Mark Seemann</a> <a href="#b9a5340eec9c45f49a438a37c7499520">#</a></div> <div class="comment-content"> <p> Callum, thank you for writing. You are correct that the people who design and develop .NET put a lot of effort into making things convenient. Some of that convenience, however, comes with a price. You have to buy into a certain way of doing things, and that certain way can sometimes be at odds with other good software practices, such as the <a href="https://en.wikipedia.org/wiki/Dependency_inversion_principle">Dependency Inversion Principle</a> or test-driven development. </p> <p> My goal with this (and other) article(s) isn't, however, to say that you mustn't take advantage of convenient integrations, but rather to highlight that alternatives exist. </p> <p> The many 'convenient' ways that a framework gives you to solve various problems come with the risk that you may paint yourself into a corner, if you aren't careful.
You've invested heavily in the framework's way of doing things, but there's just this small edge case that you can't get right. So you write a bit of custom code, after having figured out the correct extensibility point to hook into. Until the framework changes 'how things are done' in the next iteration. </p> <p> This is what I call <a href="/2023/10/02/dependency-whac-a-mole">Framework Whac-A-Mole</a> - a syndrome that I'm becoming increasingly wary of the more experience I gain. Of the examples linked to in that article, <a href="/2022/08/15/aspnet-validation-revisited">ASP.NET validation revisited</a> may be the most relevant to this discussion. </p> <p> As a final note, I'd be remiss if I entered into a discussion about programmer convenience without drawing on <a href="https://en.wikipedia.org/wiki/Rich_Hickey">Rich Hickey</a>'s excellent presentation <a href="https://www.infoq.com/presentations/Simple-Made-Easy/">Simple Made Easy</a>, where he goes to great length distinguishing between what is <em>easy</em> (i.e. close at hand) and what is <em>simple</em> (i.e. not complex). The sweet spot, of course, is the intersection, where things are both simple and easy. </p> <p> Most 'convenient' framework features do not, in my opinion, check that box. </p> </div> <div class="comment-date">2024-01-10 13:37 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Serializing restaurant tables in F# https://blog.ploeh.dk/2023/12/18/serializing-restaurant-tables-in-f 2023-12-18T13:59:00+00:00 Mark Seemann <div id="post"> <p> <em>Using System.Text.Json, with and without Reflection.</em> </p> <p> This article is part of a short series of articles about <a href="/2023/12/04/serialization-with-and-without-reflection">serialization with and without Reflection</a>. 
In this instalment I'll explore some options for serializing <a href="https://en.wikipedia.org/wiki/JSON">JSON</a> with <a href="https://fsharp.org/">F#</a> using the API built into .NET: <a href="https://learn.microsoft.com/dotnet/api/system.text.json">System.Text.Json</a>. I'm not going to use <a href="https://www.newtonsoft.com/json">Json.NET</a> in this article, but I've <a href="/2022/01/03/to-id-or-not-to-id">done similar things with that library</a> in the past, so what's here is, at least, somewhat generalizable. </p> <h3 id="64e7a2f8c5634026ae4ffd1497dd58f9"> Natural numbers <a href="#64e7a2f8c5634026ae4ffd1497dd58f9">#</a> </h3> <p> Before we start investigating how to serialize to and from JSON, we must have something to serialize. As described in the <a href="/2023/12/04/serialization-with-and-without-reflection">introductory article</a> we'd like to parse and write restaurant table configurations like this: </p> <p> <pre>{ &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;singleTable&quot;</span>:&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;capacity&quot;</span>:&nbsp;16, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;minimalReservation&quot;</span>:&nbsp;10 &nbsp;&nbsp;} }</pre> </p> <p> On the other hand, I'd like to represent the Domain Model in a way that <a href="/2022/10/24/encapsulation-in-functional-programming">encapsulates the rules</a> governing the model, <a href="https://blog.janestreet.com/effective-ml-video/">making illegal states unrepresentable</a>. </p> <p> As the first step, we observe that the numbers involved are all <a href="https://en.wikipedia.org/wiki/Natural_number">natural numbers</a>.
In F# it's both <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a> and easy to define a <a href="https://www.hillelwayne.com/post/constructive/">predicative data type</a>: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;NaturalNumber&nbsp;=&nbsp;<span style="color:blue;">private</span>&nbsp;NaturalNumber&nbsp;<span style="color:blue;">of</span>&nbsp;int</pre> </p> <p> Since it's defined with a <code>private</code> constructor we need to also supply a way to create valid values of the type: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;NaturalNumber&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;tryCreate&nbsp;n&nbsp;=&nbsp;<span style="color:blue;">if</span>&nbsp;n&nbsp;&lt;&nbsp;1&nbsp;<span style="color:blue;">then</span>&nbsp;None&nbsp;<span style="color:blue;">else</span>&nbsp;Some&nbsp;(NaturalNumber&nbsp;n)</pre> </p> <p> In this, as well as the other articles in this series, I've chosen to model the potential for errors with <code>Option</code> values. I could also have chosen to use <code>Result</code> if I wanted to communicate information along the 'error channel', but sticking with <code>Option</code> makes the code a bit simpler. Not so much in F# or <a href="https://www.haskell.org/">Haskell</a>, but once we reach C#, <a href="/2022/07/25/an-applicative-reservation-validation-example-in-c">applicative validation</a> becomes complicated. </p> <p> There's no loss of generality in this decision, since both <code>Option</code> and <code>Result</code> are <a href="/2018/10/01/applicative-functors">applicative functors</a>. 
</p> <p> <pre>&gt; NaturalNumber.tryCreate&nbsp;-1;; val it: NaturalNumber option = None &gt; <span style="color:blue;">let</span>&nbsp;x&nbsp;=&nbsp;NaturalNumber.tryCreate&nbsp;42;; val x: NaturalNumber option = Some NaturalNumber 42</pre> </p> <p> The <code>tryCreate</code> function enables client developers to create <code>NaturalNumber</code> values, and due to F#'s default equality and comparison implementation, you can even compare them: </p> <p> <pre>&gt; <span style="color:blue;">let</span>&nbsp;y = NaturalNumber.tryCreate 2112;; val y: NaturalNumber option = Some NaturalNumber 2112 &gt; x &lt; y;; val it: bool = true</pre> </p> <p> That's it for natural numbers. Three lines of code. Compare that to <a href="/2023/12/11/serializing-restaurant-tables-in-haskell">the Haskell implementation</a>, which required eight lines of code. This is mostly due to F#'s <code>private</code> keyword, which Haskell doesn't have. </p> <h3 id="8957ab6a606a4279a654e040d0788051"> Domain Model <a href="#8957ab6a606a4279a654e040d0788051">#</a> </h3> <p> Modelling a restaurant table follows in the same vein. One invariant I would like to enforce is that for a 'single' table, the minimal reservation should be a <code>NaturalNumber</code> less than or equal to the table's capacity. It doesn't make sense to configure a table for four with a minimum reservation of six.
</p> <p> In the same spirit as above, then, define this type: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;Table&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;SingleTable&nbsp;<span style="color:blue;">of</span>&nbsp;NaturalNumber&nbsp;*&nbsp;NaturalNumber &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;CommunalTable&nbsp;<span style="color:blue;">of</span>&nbsp;NaturalNumber</pre> </p> <p> Once more the <code>private</code> keyword makes it impossible for client code to create instances directly, so we need a pair of functions to create values: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Table&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;trySingle&nbsp;capacity&nbsp;minimalReservation&nbsp;=&nbsp;option&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;cap&nbsp;=&nbsp;NaturalNumber.tryCreate&nbsp;capacity &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;min&nbsp;=&nbsp;NaturalNumber.tryCreate&nbsp;minimalReservation &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;cap&nbsp;&lt;&nbsp;min&nbsp;<span style="color:blue;">then</span>&nbsp;<span style="color:blue;">return!</span>&nbsp;None &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span>&nbsp;<span style="color:blue;">return</span>&nbsp;SingleTable&nbsp;(cap,&nbsp;min)&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;tryCommunal&nbsp;=&nbsp;NaturalNumber.tryCreate&nbsp;&gt;&gt;&nbsp;Option.map&nbsp;CommunalTable</pre> </p> <p> Notice that <code>trySingle</code> checks the invariant that the <code>capacity</code> must be greater than or equal to the <code>minimalReservation</code>. </p> <p> Again, notice how much easier it is to define a predicative type in F#, compared to Haskell. 
</p> <p> This isn't a competition between languages, and while F# certainly scores a couple of points here, Haskell has other advantages. </p> <p> The point of this little exercise, so far, is that it <em>encapsulates</em> the contract implied by the Domain Model. It does this by using the static type system to its advantage. </p> <h3 id="560a74686ec64d36858a893c7c63cbb4"> JSON serialization by hand <a href="#560a74686ec64d36858a893c7c63cbb4">#</a> </h3> <p> At the boundaries of applications, however, <a href="/2023/10/16/at-the-boundaries-static-types-are-illusory">there are no static types</a>. Is the static type system still useful in that situation? </p> <p> For a long time, the most popular .NET library for JSON serialization was <a href="https://www.newtonsoft.com/json">Json.NET</a>, but these days I find the built-in API offered in the <a href="https://learn.microsoft.com/dotnet/api/system.text.json">System.Text.Json</a> namespace adequate. This is also the case here. </p> <p> The original rationale for this article series was to demonstrate how serialization can be done without Reflection, so I'll start there and return to Reflection later. </p> <p> In this article series, I consider the JSON format fixed. A single table should be rendered as shown above, and a communal table should be rendered like this: </p> <p> <pre>{&nbsp;<span style="color:#2e75b6;">&quot;communalTable&quot;</span>:&nbsp;{&nbsp;<span style="color:#2e75b6;">&quot;capacity&quot;</span>:&nbsp;42&nbsp;}&nbsp;}</pre> </p> <p> Often in the real world you'll have to conform to a particular protocol format, or, even if that's not the case, being able to control the shape of the wire format is important to deal with backwards compatibility. </p> <p> As I outlined in the <a href="/2023/12/04/serialization-with-and-without-reflection">introduction article</a> you can usually find a more weakly typed API to get the job done. 
For serializing <code>Table</code> to JSON it looks like this: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;serializeTable&nbsp;=&nbsp;<span style="color:blue;">function</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;SingleTable&nbsp;(NaturalNumber&nbsp;capacity,&nbsp;NaturalNumber&nbsp;minimalReservation)&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;j&nbsp;=&nbsp;JsonObject&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;j[<span style="color:#a31515;">&quot;singleTable&quot;</span>]&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;JsonObject&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;j[<span style="color:#a31515;">&quot;singleTable&quot;</span>][<span style="color:#a31515;">&quot;capacity&quot;</span>]&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;capacity &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;j[<span style="color:#a31515;">&quot;singleTable&quot;</span>][<span style="color:#a31515;">&quot;minimalReservation&quot;</span>]&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;minimalReservation &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;j.ToJsonString&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;CommunalTable&nbsp;(NaturalNumber&nbsp;capacity)&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;j&nbsp;=&nbsp;JsonObject&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;j[<span style="color:#a31515;">&quot;communalTable&quot;</span>]&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;JsonObject&nbsp;() &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;j[<span style="color:#a31515;">&quot;communalTable&quot;</span>][<span style="color:#a31515;">&quot;capacity&quot;</span>]&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;capacity &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;j.ToJsonString&nbsp;()</pre> </p> <p> In order to separate concerns, I've defined this 
functionality in a new module that references the module that defines the Domain Model. The <code>serializeTable</code> function pattern-matches on <code>SingleTable</code> and <code>CommunalTable</code> to write two different <a href="https://learn.microsoft.com/dotnet/api/system.text.json.nodes.jsonobject">JsonObject</a> objects, using the JSON API's underlying Document Object Model (DOM). </p> <h3 id="7fb5251937ac4f86b016f7c782db7680"> JSON deserialization by hand <a href="#7fb5251937ac4f86b016f7c782db7680">#</a> </h3> <p> You can also go the other way, and when it looks more complicated, it's because it is. When serializing an encapsulated value, not a lot can go wrong because the value is already valid. When deserializing a JSON string, on the other hand, all sorts of things can go wrong: It might not even be a valid string, or the string may not be valid JSON, or the JSON may not be a valid <code>Table</code> representation, or the values may be illegal, etc. </p> <p> Here I found it appropriate to first define a small API of parsing functions, mostly in order to make the object-oriented API more composable. First, I need some code that looks at the root JSON object to determine which kind of table it is (if it's a table at all). 
I found it appropriate to do that as a pair of <a href="https://learn.microsoft.com/dotnet/fsharp/language-reference/active-patterns">active patterns</a>: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;<span style="color:blue;">private</span>&nbsp;(|Single|_|)&nbsp;(node&nbsp;:&nbsp;JsonNode)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;node[<span style="color:#a31515;">&quot;singleTable&quot;</span>]&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;<span style="color:blue;">null</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;None &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;tn&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Some&nbsp;tn <span style="color:blue;">let</span>&nbsp;<span style="color:blue;">private</span>&nbsp;(|Communal|_|)&nbsp;(node&nbsp;:&nbsp;JsonNode)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;node[<span style="color:#a31515;">&quot;communalTable&quot;</span>]&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;<span style="color:blue;">null</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;None &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;tn&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Some&nbsp;tn</pre> </p> <p> It turned out that I also needed a function to even check if a string is a valid JSON document: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;<span style="color:blue;">private</span>&nbsp;tryParseJson&nbsp;(candidate&nbsp;:&nbsp;string)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">try</span>&nbsp;JsonNode.Parse&nbsp;candidate&nbsp;|&gt;&nbsp;Some &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">with</span>&nbsp;|&nbsp;:?&nbsp;System.Text.Json.JsonException&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;None</pre> </p> <p> If there's a way to do that without a <code>try/with</code> expression, I couldn't find it. 
Likewise, trying to parse an integer turns out to be surprisingly complicated: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;<span style="color:blue;">private</span>&nbsp;tryParseInt&nbsp;(node&nbsp;:&nbsp;JsonNode)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;node&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;<span style="color:blue;">null</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;None &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;node.GetValueKind&nbsp;()&nbsp;=&nbsp;JsonValueKind.Number &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">then</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">try</span>&nbsp;node&nbsp;|&gt;&nbsp;int&nbsp;|&gt;&nbsp;Some &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">with</span>&nbsp;|&nbsp;:?&nbsp;FormatException&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;None&nbsp;<span style="color:green;">//&nbsp;Thrown&nbsp;on&nbsp;decimal&nbsp;numbers</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">else</span>&nbsp;None</pre> </p> <p> Both <code>tryParseJson</code> and <code>tryParseInt</code> are, however, general-purpose functions, so if you have a lot of JSON you need to parse, you can put them in a reusable library. 
</p> <p> With those building blocks you can now define a function to parse a <code>Table</code>: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;tryDeserializeTable&nbsp;(candidate&nbsp;:&nbsp;string)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;tryParseJson&nbsp;candidate&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;(Single&nbsp;node)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;option&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;capacity&nbsp;=&nbsp;node[<span style="color:#a31515;">&quot;capacity&quot;</span>]&nbsp;|&gt;&nbsp;tryParseInt &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;minimalReservation&nbsp;=&nbsp;node[<span style="color:#a31515;">&quot;minimalReservation&quot;</span>]&nbsp;|&gt;&nbsp;tryParseInt &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return!</span>&nbsp;Table.trySingle&nbsp;capacity&nbsp;minimalReservation&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;(Communal&nbsp;node)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;option&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;capacity&nbsp;=&nbsp;node[<span style="color:#a31515;">&quot;capacity&quot;</span>]&nbsp;|&gt;&nbsp;tryParseInt &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return!</span>&nbsp;Table.tryCommunal&nbsp;capacity&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;None</pre> </p> <p> Since both serialization and deserialization are based on string values, you should write automated tests that verify that the code works, and in fact, I did.
Here are a few examples: </p> <p> <pre>[&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``Deserialize&nbsp;single&nbsp;table&nbsp;for&nbsp;4``&nbsp;()&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;json&nbsp;=&nbsp;<span style="color:#a31515;">&quot;&quot;&quot;{&quot;singleTable&quot;:{&quot;capacity&quot;:4,&quot;minimalReservation&quot;:3}}&quot;&quot;&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;tryDeserializeTable&nbsp;json &nbsp;&nbsp;&nbsp;&nbsp;Table.trySingle&nbsp;4&nbsp;3&nbsp;=!&nbsp;actual [&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``Deserialize&nbsp;non-table``&nbsp;()&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;json&nbsp;=&nbsp;<span style="color:#a31515;">&quot;&quot;&quot;{&quot;foo&quot;:42}&quot;&quot;&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;tryDeserializeTable&nbsp;json &nbsp;&nbsp;&nbsp;&nbsp;None&nbsp;=!&nbsp;actual</pre> </p> <p> Apart from module declaration and imports etc. this hand-written JSON capability requires 46 lines of code, although, to be fair, some of that code (<code>tryParseJson</code> and <code>tryParseInt</code>) consists of general-purpose functions that belong in a reusable library. Can we do better with static types and Reflection? </p> <h3 id="08491ec39df4485e83cd0b5cf80cdb7e"> JSON serialisation based on types <a href="#08491ec39df4485e83cd0b5cf80cdb7e">#</a> </h3> <p> The static <a href="https://learn.microsoft.com/dotnet/api/system.text.json.jsonserializer">JsonSerializer</a> class comes with <code>Serialize&lt;T&gt;</code> and <code>Deserialize&lt;T&gt;</code> methods that use Reflection to convert a statically typed object to and from JSON. You can define a type (a <a href="https://en.wikipedia.org/wiki/Data_transfer_object">Data Transfer Object</a> (DTO) if you will) and let Reflection do the hard work.
</p> <p> In <a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a> I explain how you're usually better off separating the role of serialization from the role of Domain Model. One way to do that is to define a DTO for serialisation, and let the Domain Model exist exclusively to model the rules of the application. The above <code>Table</code> type plays the latter role, so we need new DTO types: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;CommunalTableDto&nbsp;=&nbsp;{&nbsp;Capacity&nbsp;:&nbsp;int&nbsp;} <span style="color:blue;">type</span>&nbsp;SingleTableDto&nbsp;=&nbsp;{&nbsp;Capacity&nbsp;:&nbsp;int;&nbsp;MinimalReservation&nbsp;:&nbsp;int&nbsp;} <span style="color:blue;">type</span>&nbsp;TableDto&nbsp;=&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;CommunalTable&nbsp;:&nbsp;CommunalTableDto&nbsp;option &nbsp;&nbsp;&nbsp;&nbsp;SingleTable&nbsp;:&nbsp;SingleTableDto&nbsp;option&nbsp;}</pre> </p> <p> One way to model a <a href="https://en.wikipedia.org/wiki/Tagged_union">sum type</a> with a DTO is to declare both cases as <code>option</code> fields. While it does allow illegal states to be representable (i.e. both kinds of tables defined at the same time, or neither of them present) this is only par for the course at the application boundary.
</p> <p> While you can serialize values of that type, by default the generated JSON doesn't have the right format: </p> <p> <pre>&gt; val dto: TableDto = { CommunalTable = Some { Capacity = 42 } SingleTable = None } &gt; JsonSerializer.Serialize dto;; val it: string = "{"CommunalTable":{"Capacity":42},"SingleTable":null}"</pre> </p> <p> There are two problems with the generated JSON document: </p> <ul> <li>The casing is wrong</li> <li>The null value shouldn't be there</li> </ul> <p> None of those are too hard to address, but it does make the API a bit more awkward to use, as this test demonstrates: </p> <p> <pre>[&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``Serialize&nbsp;communal&nbsp;table&nbsp;via&nbsp;reflection``&nbsp;()&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;dto&nbsp;=&nbsp;{&nbsp;CommunalTable&nbsp;=&nbsp;Some&nbsp;{&nbsp;Capacity&nbsp;=&nbsp;42&nbsp;};&nbsp;SingleTable&nbsp;=&nbsp;None&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;JsonSerializer.Serialize&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dto, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;JsonSerializerOptions&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;PropertyNamingPolicy&nbsp;=&nbsp;JsonNamingPolicy.CamelCase, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;IgnoreNullValues&nbsp;=&nbsp;<span style="color:blue;">true</span>&nbsp;)) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;&quot;&quot;{&quot;communalTable&quot;:{&quot;capacity&quot;:42}}&quot;&quot;&quot;</span>&nbsp;=!&nbsp;actual</pre> </p> <p> You can, of course, define this particular serialization behaviour as a reusable function, so it's not a problem that you can't address. 
I just wanted to include this, since it's part of the overall effort required to make this approach work. </p> <h3 id="a215bb56b64446afbe2ca6861f724126"> JSON deserialisation based on types <a href="#a215bb56b64446afbe2ca6861f724126">#</a> </h3> <p> To allow parsing of JSON into the above DTO, the Reflection-based <code>Deserialize</code> method pretty much works out of the box, although again, it needs to be configured. Here's a passing test that demonstrates how that works: </p> <p> <pre>[&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;``Deserialize&nbsp;single&nbsp;table&nbsp;via&nbsp;reflection``&nbsp;()&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;json&nbsp;=&nbsp;<span style="color:#a31515;">&quot;&quot;&quot;{&quot;singleTable&quot;:{&quot;capacity&quot;:4,&quot;minimalReservation&quot;:2}}&quot;&quot;&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;JsonSerializer.Deserialize&lt;TableDto&gt;&nbsp;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;json, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;JsonSerializerOptions&nbsp;(&nbsp;PropertyNamingPolicy&nbsp;=&nbsp;JsonNamingPolicy.CamelCase&nbsp;)) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;CommunalTable&nbsp;=&nbsp;None &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;SingleTable&nbsp;=&nbsp;Some&nbsp;{&nbsp;Capacity&nbsp;=&nbsp;4;&nbsp;MinimalReservation&nbsp;=&nbsp;2&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;}&nbsp;=!&nbsp;actual</pre> </p> <p> There's only a difference in casing, so you'd expect the <code>Deserialize</code> method to be a <a href="https://martinfowler.com/bliki/TolerantReader.html">Tolerant Reader</a>, but no. It's very particular about that, so the <code>JsonNamingPolicy.CamelCase</code> configuration is necessary.
Perhaps the API designers found that <a href="https://peps.python.org/pep-0020/">explicit is better than implicit</a>. </p> <p> In any case, you could package that in a reusable <code>Deserialize</code> function that has all the options that are appropriate in a particular code context, so not a big deal. That takes care of actually writing and parsing JSON, but that's only half the battle. This only gives you a way to parse and serialize the DTO. What you ultimately want is to persist or dehydrate <code>Table</code> data. </p> <h3 id="8faed9f5f0b149d68ec5e1a457046e59"> Converting DTO to Domain Model, and vice versa <a href="#8faed9f5f0b149d68ec5e1a457046e59">#</a> </h3> <p> As usual, converting a nice, encapsulated value to a more relaxed format is safe and trivial: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;toTableDto&nbsp;=&nbsp;<span style="color:blue;">function</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;SingleTable&nbsp;(NaturalNumber&nbsp;capacity,&nbsp;NaturalNumber&nbsp;minimalReservation)&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;CommunalTable&nbsp;=&nbsp;None &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;SingleTable&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Some &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Capacity&nbsp;=&nbsp;capacity &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;MinimalReservation&nbsp;=&nbsp;minimalReservation 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;CommunalTable&nbsp;(NaturalNumber&nbsp;capacity)&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;CommunalTable&nbsp;=&nbsp;Some&nbsp;{&nbsp;Capacity&nbsp;=&nbsp;capacity&nbsp;};&nbsp;SingleTable&nbsp;=&nbsp;None&nbsp;}</pre> </p> <p> Going the other way is <a href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/">fundamentally a parsing exercise</a>: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;tryParseTableDto&nbsp;candidate&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">match</span>&nbsp;candidate.CommunalTable,&nbsp;candidate.SingleTable&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;Some&nbsp;{&nbsp;Capacity&nbsp;=&nbsp;capacity&nbsp;},&nbsp;None&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;Table.tryCommunal&nbsp;capacity &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;None,&nbsp;Some&nbsp;{&nbsp;Capacity&nbsp;=&nbsp;capacity;&nbsp;MinimalReservation&nbsp;=&nbsp;minimalReservation&nbsp;}&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Table.trySingle&nbsp;capacity&nbsp;minimalReservation &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;_&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;None</pre> </p> <p> Such an operation may fail, so the result is a <code>Table option</code>. It could also have been a <code>Result&lt;Table, 'something&gt;</code>, if you wanted to return information about errors when things go wrong. It makes the code marginally more complex, but doesn't change the overall thrust of this exploration. </p> <p> Ironically, while <code>tryParseTableDto</code> is actually more complex than <code>toTableDto</code> it looks smaller, or at least denser. </p> <p> Let's take stock of the type-based alternative. 
It requires 26 lines of code, distributed over three DTO types and the two conversions <code>tryParseTableDto</code> and <code>toTableDto</code>, but here I haven't counted configuration of <code>Serialize</code> and <code>Deserialize</code>, since I left that to each test case that I wrote. Since all of this code generally stays within 80 characters in line width, that would realistically add another 10 lines of code, for a total of around 36 lines. </p> <p> This is smaller than the DOM-based code, although of the same order of magnitude. </p> <h3 id="2292c83546d441159ece864f3636cd41"> Conclusion <a href="#2292c83546d441159ece864f3636cd41">#</a> </h3> <p> In this article I've explored two alternatives for converting a well-encapsulated Domain Model to and from JSON. One option is to directly manipulate the DOM. Another option is to take a more declarative approach and define <em>types</em> that model the shape of the JSON data, and then leverage type-based automation (here, Reflection) to automatically parse and write the JSON. </p> <p> I've deliberately chosen a Domain Model with some constraints, in order to demonstrate how persisting a non-trivial data model might work. With that setup, writing 'loosely coupled' code directly against the DOM requires 46 lines of code, while taking advantage of type-based automation requires 36 lines of code. Contrary to <a href="/2023/12/11/serializing-restaurant-tables-in-haskell">the Haskell example</a>, Reflection does seem to edge out a win this round. </p> <p> <strong>Next:</strong> <a href="/2023/12/25/serializing-restaurant-tables-in-c">Serializing restaurant tables in C#</a>. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>.
Serializing restaurant tables in Haskell https://blog.ploeh.dk/2023/12/11/serializing-restaurant-tables-in-haskell 2023-12-11T07:35:00+00:00 Mark Seemann <div id="post"> <p> <em>Using Aeson, with and without generics.</em> </p> <p> This article is part of a short series of articles about <a href="/2023/12/04/serialization-with-and-without-reflection">serialization with and without Reflection</a>. In this instalment I'll explore some options for serializing <a href="https://en.wikipedia.org/wiki/JSON">JSON</a> using <a href="https://hackage.haskell.org/package/aeson">Aeson</a>. </p> <p> The source code is <a href="https://github.com/ploeh/HaskellJSONSerialization">available on GitHub</a>. </p> <h3 id="013a78e039024c81a204ca10f1a7af69"> Natural numbers <a href="#013a78e039024c81a204ca10f1a7af69">#</a> </h3> <p> Before we start investigating how to serialize to and from JSON, we must have something to serialize. As described in the <a href="/2023/12/04/serialization-with-and-without-reflection">introductory article</a> we'd like to parse and write restaurant table configurations like this: </p> <p> <pre>{ &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;singleTable&quot;</span>:&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;capacity&quot;</span>:&nbsp;16, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;minimalReservation&quot;</span>:&nbsp;10 &nbsp;&nbsp;} }</pre> </p> <p> On the other hand, I'd like to represent the Domain Model in a way that <a href="/2022/10/24/encapsulation-in-functional-programming">encapsulates the rules</a> governing the model, <a href="https://blog.janestreet.com/effective-ml-video/">making illegal states unrepresentable</a>. </p> <p> As the first step, we observe that the numbers involved are all <a href="https://en.wikipedia.org/wiki/Natural_number">natural numbers</a>. 
While <a href="/2022/01/24/type-level-di-container-prototype">I'm aware</a> that <a href="https://www.haskell.org/">Haskell</a> has a built-in <a href="https://hackage.haskell.org/package/base/docs/GHC-TypeLits.html#t:Nat">Nat</a> type, I choose not to use it here, for a couple of reasons. One is that <code>Nat</code> is intended for type-level programming, and while this <em>might</em> be useful here, I don't want to pull in more exotic language features than are required. Another reason is that, in this domain, I want to model natural numbers as excluding zero (and I honestly don't remember if <code>Nat</code> allows zero, but I <em>think</em> that it does..?). </p> <p> Another option is to use <a href="/2019/05/13/peano-catamorphism">Peano numbers</a>, but again, for didactic reasons, I'll stick with something a bit more idiomatic. </p> <p> You can easily introduce a wrapper over, say, <code>Integer</code>, to model natural numbers: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;Natural&nbsp;=&nbsp;Natural&nbsp;Integer&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Ord</span>,&nbsp;<span style="color:#2b91af;">Show</span>)</pre> </p> <p> This, however, doesn't prevent you from writing <code>Natural (-1)</code>, so we need to make this a <a href="https://www.hillelwayne.com/post/constructive/">predicative data type</a>.
The first step is to only export the type, but <em>not</em> its data constructor: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Restaurants&nbsp;( &nbsp;&nbsp;<span style="color:blue;">Natural</span>, &nbsp;&nbsp;<span style="color:green;">--&nbsp;More&nbsp;exports&nbsp;here...</span> &nbsp;&nbsp;)&nbsp;<span style="color:blue;">where</span></pre> </p> <p> But this makes it impossible for client code to create values of the type, so we need to supply a <a href="https://wiki.haskell.org/Smart_constructors">smart constructor</a>: </p> <p> <pre><span style="color:#2b91af;">tryNatural</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:#2b91af;">Integer</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;<span style="color:blue;">Natural</span> tryNatural&nbsp;n &nbsp;&nbsp;|&nbsp;n&nbsp;&lt;&nbsp;1&nbsp;=&nbsp;Nothing &nbsp;&nbsp;|&nbsp;<span style="color:blue;">otherwise</span>&nbsp;=&nbsp;Just&nbsp;(Natural&nbsp;n)</pre> </p> <p> In this, as well as the other articles in this series, I've chosen to model the potential for errors with <code>Maybe</code> values. I could also have chosen to use <code>Either</code> if I wanted to communicate information along the 'error channel', but sticking with <code>Maybe</code> makes the code a bit simpler. Not so much in Haskell or <a href="https://fsharp.org/">F#</a>, but once we reach C#, <a href="/2022/07/25/an-applicative-reservation-validation-example-in-c">applicative validation</a> becomes complicated. </p> <p> There's no loss of generality in this decision, since both <code>Maybe</code> and <code>Either</code> are <code>Applicative</code> instances. 
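</p> <p> To illustrate that point, here's a minimal sketch of what an <code>Either</code>-based variant might look like. The names <code>tryNaturalE</code> and <code>tryPairE</code> are hypothetical (they aren't part of this article's code base), and the error message is only an example. </p> <p> <pre>newtype Natural = Natural Integer deriving (Eq, Ord, Show)

-- Hypothetical Either-based smart constructor; Left carries an error message.
tryNaturalE :: Integer -&gt; Either String Natural
tryNaturalE n
  | n &lt; 1 = Left (&quot;Expected a natural number, but got &quot; ++ show n ++ &quot;.&quot;)
  | otherwise = Right (Natural n)

-- Applicative composition has the same shape as it would with Maybe.
tryPairE :: Integer -&gt; Integer -&gt; Either String (Natural, Natural)
tryPairE c m = (,) &lt;$&gt; tryNaturalE c &lt;*&gt; tryNaturalE m</pre> </p> <p> Since only the <code>Applicative</code> interface is used to combine the results, swapping <code>Maybe</code> for <code>Either String</code> changes the type signatures, but not the shape of the composition.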
</p> <p> With the <code>tryNatural</code> function you can now (attempt to) create <code>Natural</code> values: </p> <p> <pre>ghci&gt; tryNatural (-1) Nothing ghci&gt; x = tryNatural 42 ghci&gt; x Just (Natural 42)</pre> </p> <p> This enables client developers to create <code>Natural</code> values, and due to the type's <code>Ord</code> instance, you can even compare them: </p> <p> <pre>ghci&gt; y = tryNatural 2112 ghci&gt; x &lt; y True</pre> </p> <p> Even so, there will be cases when you need to extract the underlying <code>Integer</code> from a <code>Natural</code> value. You could supply a normal function for that purpose, but in order to make some of the following code a little more elegant, I chose to do it with pattern synonyms: </p> <p> <pre>{-#&nbsp;COMPLETE&nbsp;N&nbsp;#-} pattern&nbsp;N&nbsp;::&nbsp;Integer&nbsp;-&gt;&nbsp;Natural pattern&nbsp;N&nbsp;i&nbsp;&lt;-&nbsp;Natural&nbsp;i</pre> </p> <p> That needs to be exported as well. </p> <p> So, eight lines of code to declare a predicative type that models a natural number. Incidentally, this'll be 2-3 lines of code in F#. </p> <h3 id="1e5a116c8a6f4a928cbea3f88eed42ad"> Domain Model <a href="#1e5a116c8a6f4a928cbea3f88eed42ad">#</a> </h3> <p> Modelling a restaurant table follows in the same vein. One invariant I would like to enforce is that for a 'single' table, the minimal reservation should be a <code>Natural</code> number less than or equal to the table's capacity. It doesn't make sense to configure a table for four with a minimum reservation of six. 
</p> <p> In the same spirit as above, then, define this type: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;SingleTable&nbsp;=&nbsp;SingleTable &nbsp;&nbsp;{&nbsp;singleCapacity&nbsp;::&nbsp;Natural &nbsp;&nbsp;,&nbsp;minimalReservation&nbsp;::&nbsp;Natural &nbsp;&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Ord</span>,&nbsp;<span style="color:#2b91af;">Show</span>)</pre> </p> <p> Again, only export the type, but not its data constructor. In order to extract values, then, supply another pattern synonym: </p> <p> <pre>{-#&nbsp;COMPLETE&nbsp;SingleT&nbsp;#-} pattern&nbsp;SingleT&nbsp;::&nbsp;Natural&nbsp;-&gt;&nbsp;Natural&nbsp;-&gt;&nbsp;SingleTable pattern&nbsp;SingleT&nbsp;c&nbsp;m&nbsp;&lt;-&nbsp;SingleTable&nbsp;c&nbsp;m</pre> </p> <p> Finally, define a <code>Table</code> type and two smart constructors: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;Table&nbsp;=&nbsp;Single&nbsp;SingleTable&nbsp;|&nbsp;Communal&nbsp;Natural&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>) <span style="color:#2b91af;">trySingleTable</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:#2b91af;">Integer</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Integer</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;<span style="color:blue;">Table</span> trySingleTable&nbsp;capacity&nbsp;minimal&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;c&nbsp;&lt;-&nbsp;tryNatural&nbsp;capacity &nbsp;&nbsp;m&nbsp;&lt;-&nbsp;tryNatural&nbsp;minimal &nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;c&nbsp;&lt;&nbsp;m&nbsp;<span style="color:blue;">then</span>&nbsp;Nothing&nbsp;<span style="color:blue;">else</span>&nbsp;Just&nbsp;(Single&nbsp;(SingleTable&nbsp;c&nbsp;m)) <span 
style="color:#2b91af;">tryCommunalTable</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:#2b91af;">Integer</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;<span style="color:blue;">Table</span> tryCommunalTable&nbsp;=&nbsp;<span style="color:blue;">fmap</span>&nbsp;Communal&nbsp;.&nbsp;tryNatural</pre> </p> <p> Notice that <code>trySingleTable</code> checks the invariant that the <code>capacity</code> must be greater than or equal to the minimal reservation. </p> <p> The point of this little exercise, so far, is that it <em>encapsulates</em> the contract implied by the Domain Model. It does this by using the static type system to its advantage. </p> <h3 id="a0f3606b82374c349dc0bf116db14493"> JSON serialization by hand <a href="#a0f3606b82374c349dc0bf116db14493">#</a> </h3> <p> At the boundaries of applications, however, <a href="/2023/10/16/at-the-boundaries-static-types-are-illusory">there are no static types</a>. Is the static type system still useful in that situation? </p> <p> For Haskell, the most common JSON library is Aeson, and I admit that I'm no expert. Thus, it's possible that there's an easier way to serialize to and deserialize from JSON. If so, please leave a comment explaining the alternative. </p> <p> The original rationale for this article series was to demonstrate how serialization can be done without Reflection, or, in the case of Haskell, <a href="https://hackage.haskell.org/package/base/docs/GHC-Generics.html">Generics</a> (not to be confused with .NET generics, which in Haskell usually is called <em>parametric polymorphism</em>). We'll return to Generics later in this article. </p> <p> In this article series, I consider the JSON format fixed. 
A single table should be rendered as shown above, and a communal table should be rendered like this: </p> <p> <pre>{&nbsp;<span style="color:#2e75b6;">&quot;communalTable&quot;</span>:&nbsp;{&nbsp;<span style="color:#2e75b6;">&quot;capacity&quot;</span>:&nbsp;42&nbsp;}&nbsp;}</pre> </p> <p> Often in the real world you'll have to conform to a particular protocol format, or, even if that's not the case, being able to control the shape of the wire format is important to deal with backwards compatibility. </p> <p> As I outlined in the <a href="/2023/12/04/serialization-with-and-without-reflection">introduction article</a> you can usually find a more weakly typed API to get the job done. For serializing <code>Table</code> to JSON it looks like this: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;JSONTable&nbsp;=&nbsp;JSONTable&nbsp;Table&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>) <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">ToJSON</span>&nbsp;<span style="color:blue;">JSONTable</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;toJSON&nbsp;(JSONTable&nbsp;(Single&nbsp;(SingleT&nbsp;(N&nbsp;c)&nbsp;(N&nbsp;m))))&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;object&nbsp;[<span style="color:#a31515;">&quot;singleTable&quot;</span>&nbsp;.=&nbsp;object&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;capacity&quot;</span>&nbsp;.=&nbsp;c, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;minimalReservation&quot;</span>&nbsp;.=&nbsp;m]] &nbsp;&nbsp;toJSON&nbsp;(JSONTable&nbsp;(Communal&nbsp;(N&nbsp;c)))&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;object&nbsp;[<span style="color:#a31515;">&quot;communalTable&quot;</span>&nbsp;.=&nbsp;object&nbsp;[<span style="color:#a31515;">&quot;capacity&quot;</span>&nbsp;.=&nbsp;c]]</pre> </p> <p> In order to separate concerns, I've defined this functionality in a new 
module that references the module that defines the Domain Model. Thus, to avoid orphan instances, I've defined a <code>JSONTable</code> <code>newtype</code> wrapper that I then make a <code>ToJSON</code> instance. </p> <p> The <code>toJSON</code> function pattern-matches on <code>Single</code> and <code>Communal</code> to write two different <a href="https://hackage.haskell.org/package/aeson/docs/Data-Aeson.html#t:Value">Values</a>, using Aeson's underlying Document Object Model (DOM). </p> <h3 id="9c3c05517701474d8ac88c59e6fd11e7"> JSON deserialization by hand <a href="#9c3c05517701474d8ac88c59e6fd11e7">#</a> </h3> <p> You can also go the other way, and when it looks more complicated, it's because it is. When serializing an encapsulated value, not a lot can go wrong because the value is already valid. When deserializing a JSON string, on the other hand, all sorts of things can go wrong: It might not even be a valid string, or the string may not be valid JSON, or the JSON may not be a valid <code>Table</code> representation, or the values may be illegal, etc. 
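</p> <p> To make those failure modes concrete, here's a minimal sketch that isn't part of the article's code base. It assumes only Aeson's <code>eitherDecode</code>, <code>withObject</code>, and <code>parseEither</code>; the names <code>capacityOf</code> and <code>tryCapacity</code> are hypothetical helpers invented for illustration. The first stage fails if the input isn't JSON at all, and the second stage fails if the JSON doesn't have the expected shape. Checking that the values are legal would be a third stage, which is left out here: </p> <p> <pre>{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Aeson (Value, eitherDecode, withObject, (.:))
import Data.Aeson.Types (Parser, parseEither)
import Data.ByteString.Lazy (ByteString)

-- Hypothetical helper, for illustration only: reads a &quot;capacity&quot; property.
capacityOf :: Value -&gt; Parser Integer
capacityOf = withObject &quot;table&quot; (.: &quot;capacity&quot;)

-- Stage one: is the input JSON at all? Stage two: does it have the right shape?
tryCapacity :: ByteString -&gt; Either String Integer
tryCapacity json = eitherDecode json &gt;&gt;= parseEither capacityOf

main :: IO ()
main = do
  print $ tryCapacity &quot;{ \&quot;capacity\&quot;: 42 }&quot;       -- Right 42
  print $ tryCapacity &quot;{ \&quot;capacity\&quot;: &quot;           -- a Left: not valid JSON
  print $ tryCapacity &quot;{ \&quot;capacity\&quot;: \&quot;big\&quot; }&quot;  -- a Left: not a number
  print $ tryCapacity &quot;[42]&quot;                       -- a Left: not an object</pre> </p> <p> Only the first call should produce a <code>Right</code> value; the other three should each produce a <code>Left</code> carrying an error message. 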
</p> <p> It's no surprise, then, that the <code>FromJSON</code> instance is bigger: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">FromJSON</span>&nbsp;<span style="color:blue;">JSONTable</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;parseJSON&nbsp;(Object&nbsp;v)&nbsp;=&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;&nbsp;&nbsp;single&nbsp;&lt;-&nbsp;v&nbsp;.:?&nbsp;<span style="color:#a31515;">&quot;singleTable&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;communal&nbsp;&lt;-&nbsp;v&nbsp;.:?&nbsp;<span style="color:#a31515;">&quot;communalTable&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">case</span>&nbsp;(single,&nbsp;communal)&nbsp;<span style="color:blue;">of</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(Just&nbsp;s,&nbsp;Nothing)&nbsp;-&gt;&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;capacity&nbsp;&lt;-&nbsp;s&nbsp;.:&nbsp;<span style="color:#a31515;">&quot;capacity&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;minimal&nbsp;&lt;-&nbsp;s&nbsp;.:&nbsp;<span style="color:#a31515;">&quot;minimalReservation&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">case</span>&nbsp;trySingleTable&nbsp;capacity&nbsp;minimal&nbsp;<span style="color:blue;">of</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Nothing&nbsp;-&gt;&nbsp;<span style="color:blue;">fail</span>&nbsp;<span style="color:#a31515;">&quot;Expected&nbsp;natural&nbsp;numbers.&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Just&nbsp;t&nbsp;-&gt;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;JSONTable&nbsp;t &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(Nothing,&nbsp;Just&nbsp;c)&nbsp;-&gt;&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;capacity&nbsp;&lt;-&nbsp;c&nbsp;.:&nbsp;<span style="color:#a31515;">&quot;capacity&quot;</span> 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">case</span>&nbsp;tryCommunalTable&nbsp;capacity&nbsp;<span style="color:blue;">of</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Nothing&nbsp;-&gt;&nbsp;<span style="color:blue;">fail</span>&nbsp;<span style="color:#a31515;">&quot;Expected&nbsp;a&nbsp;natural&nbsp;number.&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Just&nbsp;t&nbsp;-&gt;&nbsp;<span style="color:blue;">return</span>&nbsp;$&nbsp;JSONTable&nbsp;t &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;_&nbsp;-&gt;&nbsp;<span style="color:blue;">fail</span>&nbsp;<span style="color:#a31515;">&quot;Expected&nbsp;exactly&nbsp;one&nbsp;of&nbsp;singleTable&nbsp;or&nbsp;communalTable.&quot;</span> &nbsp;&nbsp;parseJSON&nbsp;_&nbsp;=&nbsp;<span style="color:blue;">fail</span>&nbsp;<span style="color:#a31515;">&quot;Expected&nbsp;an&nbsp;object.&quot;</span></pre> </p> <p> I could probably have done this more succinctly if I'd spent even more time on it than I already did, but it gets the job done and demonstrates the point. Instead of relying on run-time Reflection, the <code>FromJSON</code> instance is, unsurprisingly, a parser, composed from Aeson's specialised parser combinator API. </p> <p> Since both serialisation and deserialization are based on string values, you should write automated tests that verify that the code works. </p> <p> Apart from module declaration and imports etc. this hand-written JSON capability requires 27 lines of code. Can we do better with static types and Generics? </p> <h3 id="ddac03fba5134f0da9e613c29888ce83"> JSON serialisation based on types <a href="#ddac03fba5134f0da9e613c29888ce83">#</a> </h3> <p> The intent with the Aeson library is that you define a type (a <a href="https://en.wikipedia.org/wiki/Data_transfer_object">Data Transfer Object</a> (DTO) if you will), and then let 'compiler magic' do the rest. 
In Haskell, it's not run-time Reflection, but a compilation technology called Generics. As I understand it, it automatically 'writes' the serialization and parsing code and turns it into machine code as part of normal compilation. </p> <p> You're supposed to first turn on the </p> <p> <pre>{-#&nbsp;<span style="color:gray;">LANGUAGE</span>&nbsp;DeriveGeneric&nbsp;#-}</pre> </p> <p> language pragma and then tell the compiler to automatically derive <code>Generic</code> for the DTO in question. You'll see an example of that shortly. </p> <p> It's a fairly flexible system that you can tweak in various ways, but if it's possible to do it directly with the above <code>Table</code> type, please leave a comment explaining how. I tried, but couldn't make it work. To be clear, I <em>could</em> make it serializable, but not to the above JSON format. <ins datetime="2023-12-11T11:06Z">After enough <a href="/2023/10/02/dependency-whac-a-mole">Aeson Whac-A-Mole</a> I decided to change tactics.</ins> </p> <p> In <a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a> I explain how you're usually better off separating the role of serialization from the role of Domain Model. The way to do that is exactly by defining a DTO for serialisation, and let the Domain Model remain exclusively to model the rules of the application. The above <code>Table</code> type plays the latter role, so we need new DTO types. </p> <p> We may start with the building blocks: </p> <p> <pre><span style="color:blue;">newtype</span>&nbsp;CommunalDTO&nbsp;=&nbsp;CommunalDTO &nbsp;&nbsp;{&nbsp;communalCapacity&nbsp;::&nbsp;Integer &nbsp;&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Generic</span>)</pre> </p> <p> Notice how it declaratively derives <code>Generic</code>, which works because of the <code>DeriveGeneric</code> language pragma. 
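</p> <p> To get a feel for what deriving <code>Generic</code> makes available at compile time, here's a small sketch that isn't part of the article's code base. It uses only <code>GHC.Generics</code>, and <code>fieldName</code> is a hypothetical helper invented for illustration. A library like Aeson presumably relies on the same kind of structural information, although its actual implementation is more involved: </p> <p> <pre>{-# LANGUAGE DeriveGeneric #-}
module Main where

import GHC.Generics (Generic, from, selName, unM1)

newtype CommunalDTO = CommunalDTO
  { communalCapacity :: Integer
  } deriving (Eq, Show, Generic)

-- Hypothetical helper: peel off the datatype and constructor metadata layers
-- of the generic representation, then ask the selector layer for its name.
fieldName :: CommunalDTO -&gt; String
fieldName dto = selName (unM1 (unM1 (from dto)))

main :: IO ()
main = putStrLn $ fieldName (CommunalDTO 42)  -- communalCapacity</pre> </p> <p> The record field name is recovered from the type's generic representation, without any run-time inspection of the value itself. 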
</p> <p> From here, in principle, all that you need is just a single declaration to make it serializable: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">ToJSON</span>&nbsp;<span style="color:blue;">CommunalDTO</span></pre> </p> <p> While it does serialize to JSON, it doesn't have the right format: </p> <p> <pre>{&nbsp;<span style="color:#2e75b6;">&quot;communalCapacity&quot;</span>:&nbsp;42&nbsp;}</pre> </p> <p> The property name should be <code>capacity</code>, not <code>communalCapacity</code>. Why did I call the record field <code>communalCapacity</code> instead of <code>capacity</code>? Can't I just fix my <code>CommunalDTO</code> record? </p> <p> Unfortunately, I can't just do that, because I also need a <code>capacity</code> JSON property for the single-table case, and Haskell isn't happy about duplicated field names in the same module. (This language feature truly is one of the weak points of Haskell.) </p> <p> Instead, I can tweak the Aeson rules by supplying an <code>Options</code> value to the instance definition: </p> <p> <pre><span style="color:#2b91af;">communalJSONOptions</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">Options</span> communalJSONOptions&nbsp;= &nbsp;&nbsp;defaultOptions&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;fieldLabelModifier&nbsp;=&nbsp;\s&nbsp;-&gt;&nbsp;<span style="color:blue;">case</span>&nbsp;s&nbsp;<span style="color:blue;">of</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;communalCapacity&quot;</span>&nbsp;-&gt;&nbsp;<span style="color:#a31515;">&quot;capacity&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;_&nbsp;-&gt;&nbsp;s&nbsp;} <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">ToJSON</span>&nbsp;<span style="color:blue;">CommunalDTO</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;toJSON&nbsp;=&nbsp;genericToJSON&nbsp;communalJSONOptions 
&nbsp;&nbsp;toEncoding&nbsp;=&nbsp;genericToEncoding&nbsp;communalJSONOptions</pre> </p> <p> This instructs the compiler to modify how it generates the serialization code, and the generated JSON fragment is now correct. </p> <p> We can do the same with the single-table case: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;SingleDTO&nbsp;=&nbsp;SingleDTO &nbsp;&nbsp;{&nbsp;singleCapacity&nbsp;::&nbsp;Integer &nbsp;&nbsp;,&nbsp;minimalReservation&nbsp;::&nbsp;Integer &nbsp;&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Generic</span>) <span style="color:#2b91af;">singleJSONOptions</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">Options</span> singleJSONOptions&nbsp;= &nbsp;&nbsp;defaultOptions&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;fieldLabelModifier&nbsp;=&nbsp;\s&nbsp;-&gt;&nbsp;<span style="color:blue;">case</span>&nbsp;s&nbsp;<span style="color:blue;">of</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;singleCapacity&quot;</span>&nbsp;-&gt;&nbsp;<span style="color:#a31515;">&quot;capacity&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">&quot;minimalReservation&quot;</span>&nbsp;-&gt;&nbsp;<span style="color:#a31515;">&quot;minimalReservation&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;_&nbsp;-&gt;&nbsp;s&nbsp;} <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">ToJSON</span>&nbsp;<span style="color:blue;">SingleDTO</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;toJSON&nbsp;=&nbsp;genericToJSON&nbsp;singleJSONOptions &nbsp;&nbsp;toEncoding&nbsp;=&nbsp;genericToEncoding&nbsp;singleJSONOptions</pre> </p> <p> This takes care of that case, but we still need a container type that will hold either one or the other: </p> <p> <pre><span style="color:blue;">data</span>&nbsp;TableDTO&nbsp;=&nbsp;TableDTO 
&nbsp;&nbsp;{&nbsp;singleTable&nbsp;::&nbsp;Maybe&nbsp;SingleDTO &nbsp;&nbsp;,&nbsp;communalTable&nbsp;::&nbsp;Maybe&nbsp;CommunalDTO &nbsp;&nbsp;}&nbsp;<span style="color:blue;">deriving</span>&nbsp;(<span style="color:#2b91af;">Eq</span>,&nbsp;<span style="color:#2b91af;">Show</span>,&nbsp;<span style="color:#2b91af;">Generic</span>) <span style="color:#2b91af;">tableJSONOptions</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">Options</span> tableJSONOptions&nbsp;= &nbsp;&nbsp;defaultOptions&nbsp;{&nbsp;omitNothingFields&nbsp;=&nbsp;True&nbsp;} <span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">ToJSON</span>&nbsp;<span style="color:blue;">TableDTO</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;toJSON&nbsp;=&nbsp;genericToJSON&nbsp;tableJSONOptions &nbsp;&nbsp;toEncoding&nbsp;=&nbsp;genericToEncoding&nbsp;tableJSONOptions</pre> </p> <p> One way to model a <a href="https://en.wikipedia.org/wiki/Tagged_union">sum type</a> with a DTO is to declare both cases as <code>Maybe</code> fields. While it does allow illegal states to be representable (i.e. both kinds of tables defined at the same time, or none of them present) this is only par for the course at the application boundary. </p> <p> That's quite a bit of infrastructure to stand up, but at least most of it can be reused for parsing. 
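</p> <p> The role of <code>omitNothingFields</code> may be easier to see in isolation. The following sketch is an assumption-laden illustration rather than part of the article's code base: <code>PairDTO</code> and <code>compactOptions</code> are hypothetical stand-ins for <code>TableDTO</code> and <code>tableJSONOptions</code>, and the example only compares Aeson's default behaviour with the tweaked options: </p> <p> <pre>{-# LANGUAGE DeriveGeneric #-}
module Main where

import Data.Aeson (Options, defaultOptions, encode, genericToJSON, omitNothingFields)
import qualified Data.ByteString.Lazy.Char8 as BL
import GHC.Generics (Generic)

-- Hypothetical stand-in for TableDTO: two optional fields.
data PairDTO = PairDTO
  { optionA :: Maybe Integer
  , optionB :: Maybe Integer
  } deriving (Eq, Show, Generic)

compactOptions :: Options
compactOptions = defaultOptions { omitNothingFields = True }

main :: IO ()
main = do
  let dto = PairDTO Nothing (Just 42)
  -- With the defaults, the empty field is still written, with a null value.
  BL.putStrLn $ encode $ genericToJSON defaultOptions dto
  -- With omitNothingFields, only the populated field is written.
  BL.putStrLn $ encode $ genericToJSON compactOptions dto</pre> </p> <p> Without that option, each serialized table would also carry a <code>null</code> property for the case that isn't present, which wouldn't match the fixed JSON format. 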
</p> <h3 id="0a507081076c4afe9e2197f013f6f107"> JSON deserialisation based on types <a href="#0a507081076c4afe9e2197f013f6f107">#</a> </h3> <p> To allow parsing of JSON into the above DTO we can make them all <code>FromJSON</code> instances, e.g.: </p> <p> <pre><span style="color:blue;">instance</span>&nbsp;<span style="color:blue;">FromJSON</span>&nbsp;<span style="color:blue;">CommunalDTO</span>&nbsp;<span style="color:blue;">where</span> &nbsp;&nbsp;parseJSON&nbsp;=&nbsp;genericParseJSON&nbsp;communalJSONOptions</pre> </p> <p> Notice that you can reuse the same <code>communalJSONOptions</code> used for the <code>ToJSON</code> instance. Repeat that exercise for the two other record types. </p> <p> That's only half the battle, though, since this only gives you a way to parse and serialize the DTO. What you ultimately want is to persist or dehydrate <code>Table</code> data. </p> <h3 id="7e234848ee2d48b8a1da42c0e05bb088"> Converting DTO to Domain Model, and vice versa <a href="#7e234848ee2d48b8a1da42c0e05bb088">#</a> </h3> <p> As usual, converting a nice, encapsulated value to a more relaxed format is safe and trivial: </p> <p> <pre><span style="color:#2b91af;">toTableDTO</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">Table</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:blue;">TableDTO</span> toTableDTO&nbsp;(Single&nbsp;(SingleT&nbsp;(N&nbsp;c)&nbsp;(N&nbsp;m)))&nbsp;=&nbsp;TableDTO&nbsp;(Just&nbsp;(SingleDTO&nbsp;c&nbsp;m))&nbsp;Nothing toTableDTO&nbsp;(Communal&nbsp;(N&nbsp;c))&nbsp;=&nbsp;TableDTO&nbsp;Nothing&nbsp;(Just&nbsp;(CommunalDTO&nbsp;c))</pre> </p> <p> Going the other way is fundamentally a parsing exercise: </p> <p> <pre><span style="color:#2b91af;">tryParseTable</span>&nbsp;<span style="color:blue;">::</span>&nbsp;<span style="color:blue;">TableDTO</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;<span style="color:#2b91af;">Maybe</span>&nbsp;<span style="color:blue;">Table</span> 
tryParseTable&nbsp;(TableDTO&nbsp;(Just&nbsp;(SingleDTO&nbsp;c&nbsp;m))&nbsp;Nothing)&nbsp;=&nbsp;trySingleTable&nbsp;c&nbsp;m tryParseTable&nbsp;(TableDTO&nbsp;Nothing&nbsp;(Just&nbsp;(CommunalDTO&nbsp;c)))&nbsp;=&nbsp;tryCommunalTable&nbsp;c tryParseTable&nbsp;_&nbsp;=&nbsp;Nothing</pre> </p> <p> Such an operation may fail, so the result is a <code>Maybe Table</code>. It could also have been an <code>Either something Table</code>, if you wanted to return information about errors when things go wrong. It makes the code marginally more complex, but doesn't change the overall thrust of this exploration. </p> <p> Let's take stock of the type-based alternative. It requires 62 lines of code, distributed over three DTO types, their <code>Options</code>, their <code>ToJSON</code> and <code>FromJSON</code> instances, and finally the two conversions <code>tryParseTable</code> and <code>toTableDTO</code>. </p> <h3 id="473295252dbf46bd93fc31b1d2f505e5"> Conclusion <a href="#473295252dbf46bd93fc31b1d2f505e5">#</a> </h3> <p> In this article I've explored two alternatives for converting a well-encapsulated Domain Model to and from JSON. One option is to directly manipulate the DOM. Another option is to take a more declarative approach and define <em>types</em> that model the shape of the JSON data, and then leverage type-based automation (here, Generics) to automatically produce the code that parses and writes the JSON. </p> <p> I've deliberately chosen a Domain Model with some constraints, in order to demonstrate how persisting a non-trivial data model might work. With that setup, writing 'loosely coupled' code directly against the DOM requires 27 lines of code, while 'taking advantage' of type-based automation requires 62 lines of code. </p> <p> To be fair, the dice don't always land that way. You can't infer a general rule from a single example, and it's possible that I could have done something clever with Aeson to reduce the code. 
Even so, I think that there's a conclusion to be drawn, and it's this: </p> <p> Type-based automation (Generics, or run-time Reflection) may seem simple at first glance. Just declare a type and let some automation library do the rest. It may happen, however, that you need to tweak the defaults so much that it would be easier skipping the type-based approach and instead directly manipulating the DOM. </p> <p> I love static type systems, but I'm also watchful of their limitations. There's likely to be an inflection point where, on the one side, a type-based declarative API is best, while on the other side of that point, a more 'lightweight' approach is better. </p> <p> The position of such an inflection point will vary from context to context. Just be aware of the possibility, and explore alternatives if things begin to feel awkward. </p> <p> <strong>Next:</strong> <a href="/2023/12/18/serializing-restaurant-tables-in-f">Serializing restaurant tables in F#</a>. </p> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Serialization with and without Reflection https://blog.ploeh.dk/2023/12/04/serialization-with-and-without-reflection 2023-12-04T20:53:00+00:00 Mark Seemann <div id="post"> <p> <em>An investigation of alternatives.</em> </p> <p> I recently wrote a tweet that caused more responses than usual: </p> <blockquote> <p> "A decade ago, I used .NET Reflection so often that I know most the the API by heart. </p> <p> "Since then, I've learned better ways to solve my problems. I can't remember when was the last time I used .NET Reflection. I never need it. </p> <p> "Do you?" 
</p> <footer><cite><a href="https://twitter.com/ploeh/status/1727280699051495857">me</a></cite></footer> </blockquote> <p> Most people who read my tweets are programmers, and some are, perhaps, not entirely neurotypical, but I intended the last paragraph to be a <a href="https://en.wikipedia.org/wiki/Rhetorical_question">rhetorical question</a>. My point, really, was to point out that if I tell you it's possible to do without Reflection, one or two readers might keep that in mind and at least explore options the next time the urge to use Reflection arises. </p> <p> A common response was that Reflection is useful for (de)serialization of data. These days, the most common case is going to and from <a href="https://en.wikipedia.org/wiki/JSON">JSON</a>, but the problem is similar if the format is <a href="https://en.wikipedia.org/wiki/XML">XML</a>, <a href="https://en.wikipedia.org/wiki/Comma-separated_values">CSV</a>, or another format. In a sense, even <a href="/2023/10/16/at-the-boundaries-static-types-are-illusory">reading to and from a database is a kind of serialization</a>. </p> <p> In this little series of articles, I'm going to explore some alternatives to Reflection. I'll use the same example throughout, and I'll stick to JSON, but you can easily extrapolate to other serialization formats. </p> <h3 id="4b1340f8fad4452f950c3cfdde275365"> Table layouts <a href="#4b1340f8fad4452f950c3cfdde275365">#</a> </h3> <p> As always, I find the example domain of online restaurant reservation systems to be so rich as to furnish a useful example. Imagine a multi-tenant service that enables restaurants to take and manage reservations. </p> <p> When a new reservation request arrives, the system has to make a decision on whether to accept or reject the request. The layout, or configuration, of <a href="/2020/01/27/the-maitre-d-kata">tables plays a role in that decision</a>. 
</p> <p> Such a multi-tenant system may have an API for configuring the restaurant; essentially, entering data into the system about the size and policies regarding tables in a particular restaurant. </p> <p> Most restaurants have 'normal' tables where, if you reserve a table for three, you'll have the entire table for a duration. Some restaurants also have one or more communal tables, typically bar seating where you may get a view of the kitchen. Quite a few high-end restaurants have tables like these, because it enables them to cater to single diners without reserving an entire table that could instead have served two paying customers. </p> <p> <img src="/content/binary/ernst.jpg" alt="Bar seating at Ernst, Berlin."> </p> <p> In Copenhagen, on the other hand, it's also not uncommon to have a special room for larger parties. I think this has something to do with the general age of the buildings in the city. Most establishments are situated in older buildings, with all the trappings, including load-bearing walls, cellars, etc. As part of a restaurant's location, there may be a big cellar room, second-story room, or other room that's not practical for the daily operation of the place, but which works for parties of, say, 15-30 people. Such 'private dining' rooms can be used for private occasions or company outings. </p> <p> A <a href="https://en.wikipedia.org/wiki/Ma%C3%AEtre_d%27h%C3%B4tel">maître d'hôtel</a> may wish to configure the system with a variety of tables, including communal tables, and private dining tables as described above. </p> <p> One way to model such requirements is to distinguish between two kinds of tables: Communal tables, and 'single' tables, and where single tables come with an additional property that models the minimal reservation required to reserve that table. 
A JSON representation might look like this: </p> <p> <pre>{ &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;singleTable&quot;</span>:&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;capacity&quot;</span>:&nbsp;16, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;minimalReservation&quot;</span>:&nbsp;10 &nbsp;&nbsp;} }</pre> </p> <p> This may represent a private dining table that seats up to sixteen people, and where the maître d'hôtel has decided to only accept reservations for at least ten guests. </p> <p> A <code>singleTable</code> can also be used to model 'normal' tables without special limits. If the restaurant has a table for four, but is ready to accept a reservation for one person, you can configure a table for four, with a minimum reservation of one. </p> <p> Communal tables are different, though: </p> <p> <pre>{&nbsp;<span style="color:#2e75b6;">&quot;communalTable&quot;</span>:&nbsp;{&nbsp;<span style="color:#2e75b6;">&quot;capacity&quot;</span>:&nbsp;10&nbsp;}&nbsp;}</pre> </p> <p> Why not just model that as ten single tables that each seat one? </p> <p> You don't want to do that because you want to make sure that parties can eat together. Some restaurants have more than one communal table. Imagine that you only have two communal tables of ten seats each. What happens if you model this as twenty single-person tables? </p> <p> If you do that, you may accept reservations for parties of six, six, and six, because <em>6&nbsp;+&nbsp;6&nbsp;+&nbsp;6&nbsp;=&nbsp;18&nbsp;&lt;&nbsp;20</em>. When those three groups arrive, however, you discover that you have to split one of the parties! The party getting separated may not like that at all, and you are, after all, in the hospitality business. </p> <h3 id="368ebde4f05e434b938327469aba1640"> Exploration <a href="#368ebde4f05e434b938327469aba1640">#</a> </h3> <p> In each article in this short series, I'll explore serialization with and without Reflection in a few languages. 
I'll start with <a href="https://www.haskell.org/">Haskell</a>, since that language doesn't have run-time Reflection. It does have a related facility called <em>generics</em>, not to be confused with .NET or Java generics, which in Haskell are called <em>parametric polymorphism</em>. It's confusing, I know. </p> <p> Haskell generics look a bit like .NET Reflection, and there's some overlap, but it's not quite the same. The main difference is that Haskell generic programming all 'resolves' at compile time, so there's no run-time Reflection in Haskell. </p> <p> If you don't care about Haskell, you can skip that article. </p> <ul> <li><a href="/2023/12/11/serializing-restaurant-tables-in-haskell">Serializing restaurant tables in Haskell</a></li> <li><a href="/2023/12/18/serializing-restaurant-tables-in-f">Serializing restaurant tables in F#</a></li> <li><a href="/2023/12/25/serializing-restaurant-tables-in-c">Serializing restaurant tables in C#</a></li> </ul> <p> As you can see, the next article repeats the exercise in <a href="https://fsharp.org/">F#</a>, and if you also don't care about that language, you can skip that article as well. </p> <p> The C# article, on the other hand, should be readable to not only C# programmers, but also developers who work in sufficiently equivalent languages. </p> <h3 id="f367d5adccbe4d3792acc8f3828add85"> Descriptive, not prescriptive <a href="#f367d5adccbe4d3792acc8f3828add85">#</a> </h3> <p> The purpose of this article series is only to showcase alternatives. Based on the reactions my tweet elicited I take it that some people can't imagine how serialisation might look without Reflection. </p> <p> It is <em>not</em> my intent that you should eschew the Reflection-based APIs available in your languages. In .NET, for example, a framework like ASP.NET MVC expects you to model JSON or XML as <a href="https://en.wikipedia.org/wiki/Data_transfer_object">Data Transfer Objects</a>. 
This gives you an <a href="/2023/10/16/at-the-boundaries-static-types-are-illusory">illusion of static types at the boundary</a>. </p> <p> Even a Haskell web library like <a href="https://www.servant.dev/">Servant</a> expects you to model web APIs with static types. </p> <p> When working with such a framework, it doesn't always pay to fight against its paradigm. When I work with ASP.NET, I define DTOs just like everyone else. On the other hand, if communicating with a backend system, I <a href="/2022/01/03/to-id-or-not-to-id">sometimes choose to skip static types and instead work directly with a JSON Document Object Model</a> (DOM). </p> <p> I occasionally find that it better fits my use case, but not most of the time. </p> <h3 id="9af8390d4b1b4c50ae9148ea2eb841ad"> Conclusion <a href="#9af8390d4b1b4c50ae9148ea2eb841ad">#</a> </h3> <p> While some sort of Reflection or metadata-driven mechanism is often used to implement serialisation, it often turns out that such convenient language capabilities are programmed on top of an ordinary object model. Even isolated to .NET, I think I'm on my third JSON library, and most (all?) turned out to have an underlying DOM that you can manipulate. </p> <p> In this article I've set the stage for exploring how serialisation can work, with or (mostly) without Reflection. </p> <p> If you're interested in the philosophy of science and <a href="https://en.wikipedia.org/wiki/Epistemology">epistemology</a>, you may have noticed a recurring discussion in academia: A wider society benefits not only from learning what works, but also from learning what doesn't work. It would be useful if researchers published their failures along with their successes, yet few do (for fairly obvious reasons). </p> <p> Well, I depend neither on research grants nor salary, so I'm free to publish negative results, such as they are. 
</p> <p> Not that I want to go so far as to categorize what I present in the present articles as useless, but they're probably best applied in special circumstances. On the other hand, I don't know <em>your context</em>, and perhaps you're doing something I can't even imagine, and what I present here is just what you need. </p> <p> <strong>Next:</strong> <a href="/2023/12/11/serializing-restaurant-tables-in-haskell">Serializing restaurant tables in Haskell</a>. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="db4a9a94452a1237bf71989561dfd947"> <div class="comment-author">gdifolco <a href="#db4a9a94452a1237bf71989561dfd947">#</a></div> <div class="comment-content"> <q> <i> I'll start with <a href="https://www.haskell.org/">Haskell</a>, since that language doesn't have run-time Reflection. </i> </q> <p> Haskell (the language) does not provide primitives to access to data representation <i>per-se</i>, during the compilation GHC (the compiler) erase a lot of information (more or less depending on the profiling flags) in order to provide to the run-time system (RTS) a minimal "bytecode". 
</p> <p> That being said, GHC provides three ways to deal structurally with values: <ul> <li><a href="https://wiki.haskell.org/Template_Haskell">TemplateHaskell</a>: gives the ability to rewrite the AST at compile-time</li> <li><a href="https://hackage.haskell.org/package/base-4.19.0.0/docs/GHC-Generics.html#t:Generic">Generics</a>: gives the ability to have a type-level representation of a type structure</li> <li><a href="https://hackage.haskell.org/package/base-4.19.0.0/docs/Type-Reflection.html#t:Typeable">Typeable</a>: gives the ability to have a value-level representation of a type structure</li> </ul> </p> <p> <i>Template Haskell</i> is low-level, as it implies dealing with the AST. It is also harder to debug because, in order to be evaluated, the main compilation phase is stopped, then the <i>Template Haskell</i> code is run, and finally the main compilation phase continues. It also causes compilation cache issues. </p> <p> <i>Generics</i> take a type's structure and generate a representation at the type level. The main idea is to be able to go back and forth between the type and its representation, so you can define some behavior over a structure. The good thing is that, since the representation is known at compile-time, many optimizations can be done. On complex types it tends to slow down compilation and produce larger-than-usual binaries. It is generally the way libraries are implemented. </p> <p> <i>Typeable</i> is a purely value-level type representation: you get only one type whatever the type structure is. It is generally used when you have "dynamic" types, and it provides safe ways to do coercion. </p> <p> Haskell tends to push as many things as possible to compile-time, which may explain this tendency. 
</p> </div> <div class="comment-date">2023-12-05 21:47 UTC</div> </div> <div class="comment" id="9cf7ca25c1364eb39c489d9883d90cea"> <div class="comment-author"><a href="/">Mark Seemann</a> <a href="#9cf7ca25c1364eb39c489d9883d90cea">#</a></div> <div class="comment-content"> <p> Thank you for writing. I was already aware of Template Haskell and Generics, but Typeable is new to me. I don't consider the first two equivalent to Reflection, since they resolve at compile time. They're more akin to automated code generation, I think. Reflection, as I'm used to it from .NET, is a run-time feature where you can inspect and interact with types and values as the code is executing. </p> <p> I admit that I haven't had the time to more than browse the documentation of Typeable, and it's so abstract that it's not clear to me what it does, how it works, or whether it's actually comparable to Reflection. The first question that comes to my mind regards the type class <code>Typeable</code> itself. It has no instances. Is it one of those special type classes (like <a href="https://hackage.haskell.org/package/base/docs/Data-Coerce.html#t:Coercible">Coercible</a>) that one doesn't have to explicitly implement? </p> </div> <div class="comment-date">2023-12-08 17:11 UTC</div> </div> <div class="comment" id="d3ad8ff86d11b285e69adc3ebbc74165"> <div class="comment-author">gdifolco <a href="#d3ad8ff86d11b285e69adc3ebbc74165">#</a></div> <div class="comment-content"> <blockquote> I don't consider the first two equivalent to Reflection, since they resolve at compile time. They're more akin to automated code generation, I think. Reflection, as I'm used to it from .NET, is a run-time feature where you can inspect and interact with types and values as the code is executing. </blockquote> <p> I don't know the .NET ecosystem well, but I guess you can borrow information at run-time we have at compile-time with TemplateHaskell and Generics, I think you are right then. 
</p> <blockquote> I admit that I haven't had the time to more than browse the documentation of Typeable, and it's so abstract that it's not clear to me what it does, how it works, or whether it's actually comparable to Reflection. The first question that comes to my mind regards the type class <code>Typeable</code> itself. It has no instances. Is it one of those special type classes (like <a href="https://hackage.haskell.org/package/base/docs/Data-Coerce.html#t:Coercible">Coercible</a>) that one doesn't have to explicitly implement? </blockquote> <p> You can derive <code>Typeable</code> like any other type class: </p> <pre> data MyType = MyType { myString :: String, myInt :: Int } deriving stock (Eq, Show, Typeable) </pre> <p> It's pretty "low-level": <code>typeRep</code> gives you a <code>TypeRep a</code> (<code>a</code> being the type represented) which is <a href="https://hackage.haskell.org/package/base-4.19.0.0/docs/Type-Reflection.html#v:App">a representation of the type with primitive elements</a> (<a href="https://www.youtube.com/watch?v=uR_VzYxvbxg">More details here</a>). </p> <p> Then, you'll be able to either pattern match on it, or <a href="https://hackage.haskell.org/package/base-4.19.0.0/docs/Data-Typeable.html#v:cast">cast it</a> (which is not like casting in Java, for example, because you are just proving to the compiler that two types are equivalent). </p> </div> <div class="comment-date">2023-12-11 17:11 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Synchronizing concurrent teams https://blog.ploeh.dk/2023/11/27/synchronizing-concurrent-teams 2023-11-27T08:43:00+00:00 Mark Seemann <div id="post"> <p> <em>Or, rather: Try not to.</em> </p> <p> A few months ago I visited a customer and as the day was winding down we got to talk more informally. 
One of the architects mentioned, in an almost off-hand manner, "we've embarked on a <a href="https://en.wikipedia.org/wiki/Scaled_agile_framework">SAFe</a> journey..." </p> <p> "Yes..?" I responded, hoping that my inflection would sound enough like a question that he'd elaborate. </p> <p> Unfortunately, I'm apparently sometimes too subtle when dealing with people face-to-face, so I never got to hear just how that 'SAFe journey' was going. Instead, the conversation shifted to the adjacent topic of how to coordinate independent teams. </p> <p> I told them that, in my opinion, the best way to coordinate independent teams is to <em>not</em> coordinate them. I don't remember exactly how I proceeded from there, but I probably said something along the lines that I consider coordination meetings between teams to be an 'architecture smell'. That the need to talk to other teams was a symptom that teams were too tightly coupled. </p> <p> I don't remember if I said exactly that, but it would have been in character. </p> <p> The architect responded: "I don't like silos." </p> <p> How do you respond to that? </p> <h3 id="c5fdd5e722994d8fb75bb3f32f1cf86b"> Autonomous teams <a href="#c5fdd5e722994d8fb75bb3f32f1cf86b">#</a> </h3> <p> I couldn't very well respond that <em>silos are great</em>. First, it doesn't sound very convincing. Second, it'd be an argument suitable only in a kindergarten. <em>Are not! -Are too! -Not! -Too!</em> etc. </p> <p> After feeling momentarily <a href="https://en.wikipedia.org/wiki/Check_(chess)">checked</a>, for once I managed to think on my feet, so I replied, "I don't suggest that your teams should be isolated from each other. I do encourage people to talk to each other, but I don't think that teams should <em>coordinate</em> much. Rather, think of each team as an organism on the savannah. They interact, and what they do impacts others, but in the end they're autonomous life forms. I believe an architect's job is like a ranger's. 
You can't control the plants or animals, but you can nurture the ecosystem, herding it in a beneficial direction." </p> <p> <img src="/content/binary/samburu.jpg" alt="Gazelles and warthogs in Samburu National Reserve, Kenya."> </p> <p> That ranger metaphor is an old pet peeve of mine, originating from what I consider one of my most under-appreciated articles: <a href="/2012/12/18/ZookeepersmustbecomeRangers">Zookeepers must become Rangers</a>. It's closely related to the more popular metaphor of software architecture as gardening, but I like the wildlife variation because it emphasizes an even more hands-off approach. It removes the illusion that you can control a fundamentally unpredictable process, but replaces it with the hopeful emphasis on stewardship. </p> <p> How do ecosystems thrive? A software architect (or ranger) should nurture resilience in each subsystem, just like evolution has promoted plants' and animals' ability to survive a variety of unforeseen circumstances: Flood, drought, fire, predators, lack of prey, disease, etc. </p> <p> You want teams to work independently. This doesn't mean that they work in isolation, but rather that they are free to act according to their abilities and understanding of the situation. An architect can help them understand the wider ecosystem and predict tomorrow's weather, so to speak, but the team should remain autonomous. </p> <h3 id="d0637b2597964cfb9bf6ffd3f0559fd0"> Concurrent work <a href="#d0637b2597964cfb9bf6ffd3f0559fd0">#</a> </h3> <p> I'm assuming that an organisation has multiple teams because they're supposed to work concurrently. While team A is off doing one thing, team B is doing something else. You can attempt to herd them in the same general direction, but beware of tight coordination. </p> <p> What's the problem with coordination? Isn't it a kind of collaboration? Don't we consider that beneficial? </p> <p> I'm not arguing that teams should be antagonistic. 
Like all metaphors, we should be careful not to take the savannah metaphor too far. I'm not imagining that one team consists of lions, apex predators, killing and devouring other teams. </p> <p> Rather, the reason I'm wary of coordination is because it seems synonymous with <em>synchronisation</em>. </p> <p> In <a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a> I've already discussed how good practices for Continuous Integration are similar to earlier lessons about <a href="https://en.wikipedia.org/wiki/Optimistic_concurrency_control">optimistic concurrency</a>. It recently struck me that we can draw a similar parallel between concurrent team work and parallel computing. </p> <p> For decades we've known that the less synchronization, the faster parallel code is. Synchronization is costly. </p> <p> In team work, coordination is like thread synchronization. Instead of doing work, you stop in order to coordinate. This implies that one thread or team has to wait for the other to catch up. </p> <p> <img src="/content/binary/sync-wait.png" alt="Two horizontal bars presenting two processes, A and B. A is shorter than B, indicating that it finishes first."> </p> <p> Unless work is perfectly evenly divided, team A may finish before team B. In order to coordinate, team A must sit idle for a while, waiting for B to catch up. (In development organizations, <em>idleness</em> is rarely allowed, so in practice, team A embarks on some other work, with <a href="/2012/12/18/ZookeepersmustbecomeRangers">consequences that I've already outlined</a>.) </p> <p> If you have more than two teams, this phenomenon only becomes worse. You'll have more idle time. This reminds me of <a href="https://en.wikipedia.org/wiki/Amdahl%27s_law">Amdahl's law</a>, which briefly put expresses that there's a limit to how much of a speed improvement you can get from concurrent work. 
The limit is related to the percentage of the work that can <em>not</em> be parallelized. The greater the need to synchronize work, the lower the ceiling. Conversely, the more you can let concurrent processes run without coordination, the more you gain from parallelization. </p> <p> It seems to me that there's a direct counterpart in team organization. The more teams need to coordinate, the less is gained from having multiple teams. </p> <p> But really, <a href="/ref/mythical-man-month">Fred Brooks could have told you so in 1975</a>. </p> <h3 id="11f48be968f94f67b42d0934a4d08501"> Versioning <a href="#11f48be968f94f67b42d0934a4d08501">#</a> </h3> <p> A small development team may organize work informally. Work may be divided along 'natural' lines, each developer taking on tasks best suited to his or her abilities. If working in a code base with shared ownership, one developer doesn't <em>have</em> to wait on the work done by another developer. Instead, a programmer may complete the required work individually, or working together with a colleague. Coordination happens, but is both informal and frequent. </p> <p> As development organizations grow, teams are formed. Separate teams are supposed to work independently, but may in practice often depend on each other. Team A may need team B to make a change before they can proceed with their own work. The (felt) need to coordinate team activities arises. </p> <p> In my experience, this happens for a number of reasons. One is that teams may be divided along wrong lines; this is a socio-technical problem. Another, more technical, reason is that <a href="/2012/12/18/RangersandZookeepers">zookeepers</a> rarely think explicitly about versioning or avoiding breaking changes. Imagine that team A needs team B to develop a new capability. This new capability <em>implies</em> a breaking change, so the teams will now need to coordinate. 
</p> <p> Instead, team B should develop the new feature in such a way that it doesn't break existing clients. If all else fails, the new feature must exist side-by-side with the old way of doing things. With <a href="https://en.wikipedia.org/wiki/Continuous_deployment">Continuous Deployment</a> the new feature becomes available when it's ready. Team A still has to <em>wait</em> for the feature to become available, but no <em>synchronization</em> is required. </p> <h3 id="def545737be9487da7c4b01dcc3eb106"> Conclusion <a href="#def545737be9487da7c4b01dcc3eb106">#</a> </h3> <p> Yet another lesson about thread-safety and concurrent transactions seems to apply to people and processes. Parallel processes should be autonomous, with as little synchronization as possible. The more you coordinate development teams, the more you limit the speed of overall work. This seems to suggest that something akin to Amdahl's law also applies to development organizations. </p> <p> Instead of coordinating teams, encourage them to exist as autonomous entities, but set things up so that <em>not breaking compatibility</em> is a major goal for each team. </p> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Trimming a Fake Object https://blog.ploeh.dk/2023/11/20/trimming-a-fake-object 2023-11-20T06:44:00+00:00 Mark Seemann <div id="post"> <p> <em>A refactoring example.</em> </p> <p> When I introduce the <a href="http://xunitpatterns.com/Fake%20Object.html">Fake Object</a> testing pattern to people, a common concern is the maintenance burden of it. The point of the pattern is that you write some 'working' code only for test purposes. At a glance, it seems as though it'd be more work than using a dynamic mock library like <a href="https://www.devlooped.com/moq/">Moq</a> or <a href="https://site.mockito.org/">Mockito</a>. 
</p> <p> This article isn't really about that, but the benefit of a Fake Object is that it has a <em>lower</em> maintenance footprint because it gives you a single class to maintain when you change interfaces or base classes. Dynamic mock objects, on the contrary, lead to <a href="https://en.wikipedia.org/wiki/Shotgun_surgery">Shotgun surgery</a> because every time you change an interface or base class, you have to revisit multiple tests. </p> <p> In a <a href="/2023/11/13/fakes-are-test-doubles-with-contracts">recent article</a> I presented a Fake Object that may have looked bigger than most people would find comfortable for test code. In this article I discuss how to trim it via a set of refactorings. </p> <h3 id="22e6648934bb4ae59aa1a181940172ae"> Original Fake read registry <a href="#22e6648934bb4ae59aa1a181940172ae">#</a> </h3> <p> The article presented this <code>FakeReadRegistry</code>, repeated here for your convenience: </p> <p> <pre><span style="color:blue;">internal</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">FakeReadRegistry</span>&nbsp;:&nbsp;IReadRegistry { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;rooms; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;IDictionary&lt;DateOnly,&nbsp;IReadOnlyCollection&lt;Room&gt;&gt;&nbsp;views; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">FakeReadRegistry</span>(<span style="color:blue;">params</span>&nbsp;Room[]&nbsp;<span style="font-weight:bold;color:#1f377f;">rooms</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.rooms&nbsp;=&nbsp;rooms; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;views&nbsp;=&nbsp;<span 
style="color:blue;">new</span>&nbsp;Dictionary&lt;DateOnly,&nbsp;IReadOnlyCollection&lt;Room&gt;&gt;(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">GetFreeRooms</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">departure</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;EnumerateDates(arrival,&nbsp;departure) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Select(GetView) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Aggregate(rooms.AsEnumerable(),&nbsp;Enumerable.Intersect) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.ToList(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">RoomBooked</span>(Booking&nbsp;<span style="font-weight:bold;color:#1f377f;">booking</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(var&nbsp;<span style="font-weight:bold;color:#1f377f;">d</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span>&nbsp;EnumerateDates(booking.Arrival,&nbsp;booking.Departure)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>&nbsp;=&nbsp;GetView(d); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span 
style="font-weight:bold;color:#1f377f;">newView</span>&nbsp;=&nbsp;QueryService.Reserve(booking,&nbsp;view); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;views[d]&nbsp;=&nbsp;newView; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;IEnumerable&lt;DateOnly&gt;&nbsp;<span style="color:#74531f;">EnumerateDates</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">departure</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">d</span>&nbsp;=&nbsp;arrival; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">while</span>&nbsp;(d&nbsp;&lt;&nbsp;departure) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">yield</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;d; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;d&nbsp;=&nbsp;d.AddDays(1); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">GetView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(views.TryGetValue(date,&nbsp;<span style="color:blue;">out</span>&nbsp;var&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>)) 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;view; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">else</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;rooms; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> This is 47 lines of code, spread over five members (including the constructor). Three of the methods have a <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">cyclomatic complexity</a> (CC) of <em>2</em>, which is the maximum for this class. The remaining two have a CC of <em>1</em>. </p> <p> While you <em>can</em> play some <a href="/2023/11/14/cc-golf">CC golf</a> with those CC-2 methods, that tends to pull the code in a direction of being less <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a>. For that reason, I chose to present the code as above. Perhaps more importantly, it doesn't save that many lines of code. </p> <p> Had this been a piece of production code, no-one would bat an eye at size or complexity, but this is test code. To add spite to injury, those 47 lines of code implement this two-method interface: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IReadRegistry</span> { &nbsp;&nbsp;&nbsp;&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">GetFreeRooms</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">departure</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">RoomBooked</span>(Booking&nbsp;<span style="font-weight:bold;color:#1f377f;">booking</span>); }</pre> </p> <p> Can we improve the situation? 
</p> <h3 id="49f78ebabf3d4875b469e935151c064a"> Root cause analysis <a href="#49f78ebabf3d4875b469e935151c064a">#</a> </h3> <p> Before you rush to 'improve' code, it pays to understand why it looks the way it looks. </p> <p> Code is a wonderfully malleable medium, so you should regard nothing as set in stone. On the other hand, there's often a reason it looks like it does. It <em>may</em> be that the previous programmers were incompetent ogres for hire, but often there's a better explanation. </p> <p> I've outlined my thinking process in <a href="/2023/11/13/fakes-are-test-doubles-with-contracts">the previous article</a>, and I'm not going to repeat it all here. To summarise, though, I've applied the <a href="https://en.wikipedia.org/wiki/Dependency_inversion_principle">Dependency Inversion Principle</a>. </p> <blockquote> <p> "clients [...] own the abstract interfaces" </p> <footer><cite>Robert C. Martin, <a href="/ref/appp">APPP</a>, chapter 11</cite></footer> </blockquote> <p> In other words, I let the needs of the clients guide the design of the <code>IReadRegistry</code> interface, and then the implementation (<code>FakeReadRegistry</code>) had to conform. </p> <p> But that's not the whole truth. </p> <p> I was doing a programming exercise - the <a href="https://codingdojo.org/kata/CQRS_Booking/">CQRS booking</a> kata - and I was following the instructions given in the description. They quite explicitly outline the two dependencies and their methods. </p> <p> When trying a new exercise, it's a good idea to follow instructions closely, so that's what I did. Once you get a sense of a kata, though, there's no law saying that you have to stick to the original rules. After all, the purpose of an exercise is to train, and in programming, <a href="/2020/01/13/on-doing-katas">trying new things is training</a>. 
</p> <h3 id="740cb249aff74666af7af4784cc166b8"> Test code that wants to be production code <a href="#740cb249aff74666af7af4784cc166b8">#</a> </h3> <p> A major benefit of test-driven development (TDD) is that it provides feedback. It pays to be tuned in to that channel. The above <code>FakeReadRegistry</code> seems to be trying to tell us something. </p> <p> Consider the <code>GetFreeRooms</code> method. I'll repeat the single-expression body here for your convenience: </p> <p> <pre><span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;EnumerateDates(arrival,&nbsp;departure) &nbsp;&nbsp;&nbsp;&nbsp;.Select(GetView) &nbsp;&nbsp;&nbsp;&nbsp;.Aggregate(rooms.AsEnumerable(),&nbsp;Enumerable.Intersect) &nbsp;&nbsp;&nbsp;&nbsp;.ToList();</pre> </p> <p> Why is that the implementation? Why does it need to first enumerate the dates in the requested interval? Why does it need to call <code>GetView</code> for each date? </p> <p> Why don't I just do the following and be done with it? </p> <p> <pre><span style="color:blue;">internal</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">FakeStorage</span>&nbsp;:&nbsp;Collection&lt;Booking&gt;,&nbsp;IWriteRegistry,&nbsp;IReadRegistry { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;rooms; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">FakeStorage</span>(<span style="color:blue;">params</span>&nbsp;Room[]&nbsp;<span style="font-weight:bold;color:#1f377f;">rooms</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.rooms&nbsp;=&nbsp;rooms; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span 
style="font-weight:bold;color:#74531f;">GetFreeRooms</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">departure</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">booked</span>&nbsp;=&nbsp;<span style="color:blue;">this</span>.Where(<span style="font-weight:bold;color:#1f377f;">b</span>&nbsp;=&gt;&nbsp;b.Overlaps(arrival,&nbsp;departure)).ToList(); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;rooms &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Where(<span style="font-weight:bold;color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;!booked.Any(<span style="font-weight:bold;color:#1f377f;">b</span>&nbsp;=&gt;&nbsp;b.RoomName&nbsp;==&nbsp;r.Name)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.ToList(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">Save</span>(Booking&nbsp;<span style="font-weight:bold;color:#1f377f;">booking</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Add(booking); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> To be honest, that's what I did <em>first</em>. </p> <p> While there are two interfaces, there's only one Fake Object implementing both. That's often an easy way to address the <a href="https://en.wikipedia.org/wiki/Interface_segregation_principle">Interface Segregation Principle</a> while still keeping the Fake Object simple. </p> <p> This is much simpler than <code>FakeReadRegistry</code>, so why didn't I just keep that? </p> <p> I didn't feel it was an honest attempt at CQRS. 
In CQRS you typically write the data changes to one system, and then you have another logical process that propagates the information about the data modification to the <em>read</em> subsystem. There's none of that here. Instead of being based on one or more 'materialised views', the query is just that: A query. </p> <p> That was what I attempted to address with <code>FakeReadRegistry</code>, and I think it's a much more faithful CQRS implementation. It's also more complex, as CQRS tends to be. </p> <p> In both cases, however, it seems that there's some production logic trapped in the test code. Shouldn't <code>EnumerateDates</code> be production code? And how about the general 'algorithm' of <code>RoomBooked</code>: </p> <ul> <li>Enumerate the relevant dates</li> <li>Get the 'materialised' view for each date</li> <li>Calculate the new view for that date</li> <li>Update the collection of views for that date</li> </ul> <p> That seems like just enough code to warrant moving it to the production code. </p> <p> A word of caution before we proceed. When deciding to pull some of that test code into the production code, I'm making a decision about architecture. </p> <p> Until now, I'd been following the Dependency Inversion Principle closely. The interfaces exist because the client code needs them. Those interfaces could be implemented in various ways: You could use a relational database, a document database, files, blobs, etc. </p> <p> Once I decide to pull the above algorithm into the production code, I'm choosing a particular persistent data structure. This now locks the data storage system into a design where there's a persistent view per date, and another database of bookings. </p> <p> Now that I'd learned some more about the exercise, I felt confident making that decision. 
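</p> <p> The algorithm outlined above leans on <code>QueryService.Reserve</code>, which this article doesn't show. As a minimal sketch, and assuming that a view is simply the collection of rooms still free on a given date, it might look something like the following. The <code>Room</code> and <code>Booking</code> records and the <code>ViewSketch</code> class are hypothetical stand-ins that only include the members used in this article; they aren't the kata's actual types: </p> <p> <pre>using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins; only the members used in this article are included.
public sealed record Room(string Name);
public sealed record Booking(string RoomName, DateOnly Arrival, DateOnly Departure);

public static class ViewSketch
{
    // Assumed behaviour of QueryService.Reserve: the new view for a date is
    // the existing view without the room that was just booked.
    public static IReadOnlyCollection&lt;Room&gt; Reserve(
        Booking booking,
        IReadOnlyCollection&lt;Room&gt; view)
    {
        return view.Where(r =&gt; r.Name != booking.RoomName).ToList();
    }
}</pre> </p> <p> If the real function works along those lines, it's a pure function, which would make it easy to test in isolation from any storage concern. 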
</p> <h3 id="d1515fe093394daf8cf9e4a9ec687770"> Template Method <a href="#d1515fe093394daf8cf9e4a9ec687770">#</a> </h3> <p> The first move I made was to create a superclass so that I could employ the <a href="https://en.wikipedia.org/wiki/Template_method_pattern">Template Method</a> pattern: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">abstract</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">ReadRegistry</span>&nbsp;:&nbsp;IReadRegistry { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">GetFreeRooms</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">departure</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;EnumerateDates(arrival,&nbsp;departure) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Select(GetView) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Aggregate(Rooms.AsEnumerable(),&nbsp;Enumerable.Intersect) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.ToList(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">RoomBooked</span>(Booking&nbsp;<span style="font-weight:bold;color:#1f377f;">booking</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(var&nbsp;<span style="font-weight:bold;color:#1f377f;">d</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span>&nbsp;EnumerateDates(booking.Arrival,&nbsp;booking.Departure)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>&nbsp;=&nbsp;GetView(d); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">newView</span>&nbsp;=&nbsp;QueryService.Reserve(booking,&nbsp;view); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;UpdateView(d,&nbsp;newView); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">protected</span>&nbsp;<span style="color:blue;">abstract</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">UpdateView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>,&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">protected</span>&nbsp;<span style="color:blue;">abstract</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;Rooms&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">protected</span>&nbsp;<span style="color:blue;">abstract</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="font-weight:bold;color:#74531f;">TryGetView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>,&nbsp;<span style="color:blue;">out</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;IEnumerable&lt;DateOnly&gt;&nbsp;<span style="color:#74531f;">EnumerateDates</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;DateOnly&nbsp;<span 
style="font-weight:bold;color:#1f377f;">departure</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">d</span>&nbsp;=&nbsp;arrival; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">while</span>&nbsp;(d&nbsp;&lt;&nbsp;departure) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">yield</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;d; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;d&nbsp;=&nbsp;d.AddDays(1); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">GetView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(TryGetView(date,&nbsp;<span style="color:blue;">out</span>&nbsp;var&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;view; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">else</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Rooms; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> This looks similar to <code>FakeReadRegistry</code>, so how is this an improvement? </p> <p> The new <code>ReadRegistry</code> class is production code. It can, and should, be tested. 
(Due to the history of how we got here, <a href="/2023/11/13/fakes-are-test-doubles-with-contracts">it's already covered by tests</a>, so I'm not going to repeat that effort here.) </p> <p> True to the <a href="https://en.wikipedia.org/wiki/Template_method_pattern">Template Method</a> pattern, three <code>abstract</code> members await a child class' implementation. These are the <code>UpdateView</code> and <code>TryGetView</code> methods, as well as the <code>Rooms</code> read-only property (glorified getter method). </p> <p> Imagine that in the production code, these are implemented based on file/document/blob storage - one per date. <code>TryGetView</code> would attempt to read the document from storage, <code>UpdateView</code> would create or modify the document, while <code>Rooms</code> returns a default set of rooms. </p> <p> A Test Double, however, can still use an in-memory dictionary: </p> <p> <pre><span style="color:blue;">internal</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">FakeReadRegistry</span>&nbsp;:&nbsp;ReadRegistry { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;rooms; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;IDictionary&lt;DateOnly,&nbsp;IReadOnlyCollection&lt;Room&gt;&gt;&nbsp;views; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">protected</span>&nbsp;<span style="color:blue;">override</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;Rooms&nbsp;=&gt;&nbsp;rooms; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">FakeReadRegistry</span>(<span style="color:blue;">params</span>&nbsp;Room[]&nbsp;<span style="font-weight:bold;color:#1f377f;">rooms</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">this</span>.rooms&nbsp;=&nbsp;rooms; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;views&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;Dictionary&lt;DateOnly,&nbsp;IReadOnlyCollection&lt;Room&gt;&gt;(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">protected</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">UpdateView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>,&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;views[date]&nbsp;=&nbsp;view; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">protected</span>&nbsp;<span style="color:blue;">override</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="font-weight:bold;color:#74531f;">TryGetView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>,&nbsp;<span style="color:blue;">out</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;views.TryGetValue(date,&nbsp;<span style="color:blue;">out</span>&nbsp;view); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> Each <code>override</code> is a one-liner with cyclomatic complexity <em>1</em>. </p> <h3 id="91153011891b4b7791acfe0edc65f997"> First round of clean-up <a href="#91153011891b4b7791acfe0edc65f997">#</a> </h3> <p> An abstract class is already a polymorphic object, so we no longer need the <code>IReadRegistry</code> interface. Delete that, and update all code accordingly. 
Particularly, the <code>QueryService</code> now depends on <code>ReadRegistry</code> rather than <code>IReadRegistry</code>: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;ReadRegistry&nbsp;readRegistry; <span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">QueryService</span>(ReadRegistry&nbsp;<span style="font-weight:bold;color:#1f377f;">readRegistry</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.readRegistry&nbsp;=&nbsp;readRegistry; }</pre> </p> <p> Now move the <code>Reserve</code> function from <code>QueryService</code> to <code>ReadRegistry</code>. Once this is done, the <code>QueryService</code> looks like this: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">QueryService</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;ReadRegistry&nbsp;readRegistry; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">QueryService</span>(ReadRegistry&nbsp;<span style="font-weight:bold;color:#1f377f;">readRegistry</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.readRegistry&nbsp;=&nbsp;readRegistry; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">GetFreeRooms</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">departure</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;readRegistry.GetFreeRooms(arrival,&nbsp;departure); 
&nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> That class is only passing method calls along, so it clearly no longer serves any purpose. Delete it. </p> <p> This is not uncommon in CQRS. One might even argue that if CQRS is done right, there's almost no code on the query side, since all the data view updates happen as events propagate. </p> <h3 id="640e19ed6f904d71b925672028aeee45"> From abstract class to Dependency Injection <a href="#640e19ed6f904d71b925672028aeee45">#</a> </h3> <p> While the current state of the code is based on an abstract base class, the overall architecture of the system doesn't hinge on inheritance. From <a href="/2018/02/19/abstract-class-isomorphism">Abstract class isomorphism</a> we know that it's possible to refactor an abstract class to Constructor Injection. Let's do that. </p> <p> First add an <code>IViewStorage</code> interface that mirrors the three <code>abstract</code> members defined by <code>ReadRegistry</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IViewStorage</span> { &nbsp;&nbsp;&nbsp;&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;Rooms&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">UpdateView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>,&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="font-weight:bold;color:#74531f;">TryGetView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>,&nbsp;<span style="color:blue;">out</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>); }</pre> </p> <p> Then implement it with a Fake Object: </p> <p> <pre><span
style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">FakeViewStorage</span>&nbsp;:&nbsp;IViewStorage { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;IDictionary&lt;DateOnly,&nbsp;IReadOnlyCollection&lt;Room&gt;&gt;&nbsp;views; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;Rooms&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">FakeViewStorage</span>(<span style="color:blue;">params</span>&nbsp;Room[]&nbsp;<span style="font-weight:bold;color:#1f377f;">rooms</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Rooms&nbsp;=&nbsp;rooms; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;views&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;Dictionary&lt;DateOnly,&nbsp;IReadOnlyCollection&lt;Room&gt;&gt;(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">UpdateView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>,&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;views[date]&nbsp;=&nbsp;view; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="font-weight:bold;color:#74531f;">TryGetView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>,&nbsp;<span style="color:blue;">out</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>) 
&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;views.TryGetValue(date,&nbsp;<span style="color:blue;">out</span>&nbsp;view); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> Notice the similarity to <code>FakeReadRegistry</code>, which we'll get rid of shortly. </p> <p> Now inject <code>IViewStorage</code> into <code>ReadRegistry</code>, and make <code>ReadRegistry</code> a regular (<code>sealed</code>) class: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">ReadRegistry</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;IViewStorage&nbsp;viewStorage; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">ReadRegistry</span>(IViewStorage&nbsp;<span style="font-weight:bold;color:#1f377f;">viewStorage</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.viewStorage&nbsp;=&nbsp;viewStorage; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">GetFreeRooms</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">departure</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;EnumerateDates(arrival,&nbsp;departure) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Select(GetView) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Aggregate(viewStorage.Rooms.AsEnumerable(),&nbsp;Enumerable.Intersect) 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.ToList(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">RoomBooked</span>(Booking&nbsp;<span style="font-weight:bold;color:#1f377f;">booking</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(var&nbsp;<span style="font-weight:bold;color:#1f377f;">d</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span>&nbsp;EnumerateDates(booking.Arrival,&nbsp;booking.Departure)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>&nbsp;=&nbsp;GetView(d); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">newView</span>&nbsp;=&nbsp;Reserve(booking,&nbsp;view); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;viewStorage.UpdateView(d,&nbsp;newView); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="color:#74531f;">Reserve</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Booking&nbsp;<span style="font-weight:bold;color:#1f377f;">booking</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">existingView</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;existingView 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Where(<span style="font-weight:bold;color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;r.Name&nbsp;!=&nbsp;booking.RoomName) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.ToList(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;IEnumerable&lt;DateOnly&gt;&nbsp;<span style="color:#74531f;">EnumerateDates</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">departure</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">d</span>&nbsp;=&nbsp;arrival; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">while</span>&nbsp;(d&nbsp;&lt;&nbsp;departure) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">yield</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;d; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;d&nbsp;=&nbsp;d.AddDays(1); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">GetView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(viewStorage.TryGetView(date,&nbsp;<span style="color:blue;">out</span>&nbsp;var&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>)) 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;view; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">else</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;viewStorage.Rooms; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> You can now delete the <code>FakeReadRegistry</code> Test Double, since <code>FakeViewStorage</code> has now taken its place. </p> <p> Finally, we may consider if we can make <code>FakeViewStorage</code> even slimmer. While I usually favour composition over inheritance, I've found that deriving Fake Objects from collection base classes is often an efficient way to get a lot of mileage out of a few lines of code. <code>FakeReadRegistry</code>, however, had to inherit from <code>ReadRegistry</code>, so it couldn't derive from any other class. </p> <p> <code>FakeViewStorage</code> isn't constrained in that way, so it's free to inherit from <code>Dictionary&lt;DateOnly,&nbsp;IReadOnlyCollection&lt;Room&gt;&gt;</code>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">FakeViewStorage</span>&nbsp;:&nbsp;Dictionary&lt;DateOnly,&nbsp;IReadOnlyCollection&lt;Room&gt;&gt;,&nbsp;IViewStorage { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;Rooms&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">FakeViewStorage</span>(<span style="color:blue;">params</span>&nbsp;Room[]&nbsp;<span style="font-weight:bold;color:#1f377f;">rooms</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Rooms&nbsp;=&nbsp;rooms; 
&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">UpdateView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>,&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>[date]&nbsp;=&nbsp;view; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;<span style="font-weight:bold;color:#74531f;">TryGetView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>,&nbsp;<span style="color:blue;">out</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;TryGetValue(date,&nbsp;<span style="color:blue;">out</span>&nbsp;view); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> This last move isn't strictly necessary, but I found it worth at least mentioning. </p> <p> I hope you'll agree that this is a Fake Object that looks maintainable. </p> <h3 id="cd86ac335816431aa39a6538fd9ce95c"> Conclusion <a href="#cd86ac335816431aa39a6538fd9ce95c">#</a> </h3> <p> Test-driven development is a feedback mechanism. If something is difficult to test, it tells you something about your System Under Test (SUT). If your test code looks bloated, that tells you something too. Perhaps part of the test code really belongs in the production code. </p> <p> In this article, we started with a Fake Object that looked like it contained too much production code. Via a series of refactorings I moved the relevant parts to the production code, leaving me with a more idiomatic and conforming implementation. 
</p> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. CC golf https://blog.ploeh.dk/2023/11/14/cc-golf 2023-11-14T14:44:00+00:00 Mark Seemann <div id="post"> <p> <em>Noun. Game in which the goal is to minimise cyclomatic complexity.</em> </p> <p> <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">Cyclomatic complexity</a> (CC) is a rare code metric since it <a href="/2019/12/09/put-cyclomatic-complexity-to-good-use">can actually be useful</a>. In general, it's a good idea to minimise it as much as possible. </p> <p> In short, CC measures looping and branching in code, and this is often where bugs lurk. While it's only a rough measure, I nonetheless find the metric useful as a general guideline. Lower is better. </p> <h3 id="0a1e55fd6ebf422aac1f6441e4e34f99"> Golf <a href="#0a1e55fd6ebf422aac1f6441e4e34f99">#</a> </h3> <p> I'd like to propose the term "CC golf" for the activity of minimising cyclomatic complexity in an area of code. The name derives from <a href="https://en.wikipedia.org/wiki/Code_golf">code golf</a>, in which you have to implement some behaviour (typically an algorithm) in the fewest possible characters. </p> <p> Such games can be useful because they enable you to explore different ways to express yourself in code. It's always a good <a href="/2020/01/13/on-doing-katas">kata constraint</a>. The <a href="/2011/05/16/TennisKatawithimmutabletypesandacyclomaticcomplexityof1">first time I tried that was in 2011</a>, and when looking back on that code today, I'm not that impressed. Still, it taught me a valuable lesson about the <a href="https://en.wikipedia.org/wiki/Visitor_pattern">Visitor pattern</a> that I never forgot, and that later enabled me to <a href="/2018/06/25/visitor-as-a-sum-type">connect some important dots</a>. </p> <p> But don't limit CC golf to katas and the like. Try it in your production code too. 
Most production code I've seen could benefit from some CC golf, and if you <a href="https://stackoverflow.blog/2022/12/19/use-git-tactically/">use Git tactically</a> you can always stash the changes if they're no good. </p> <h3 id="f7e54cb8b9954fbd9ed8022ad09f5d7f"> Idiomatic tension <a href="#f7e54cb8b9954fbd9ed8022ad09f5d7f">#</a> </h3> <p> Alternative expressions with lower cyclomatic complexity may not always be idiomatic. Let's look at a few examples. In my <a href="/2023/11/13/fakes-are-test-doubles-with-contracts">previous article</a>, I listed some test code where some helper methods had a CC of <em>2</em>. Here's one of them: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;IEnumerable&lt;DateOnly&gt;&nbsp;<span style="color:#74531f;">EnumerateDates</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">departure</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">d</span>&nbsp;=&nbsp;arrival; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">while</span>&nbsp;(d&nbsp;&lt;&nbsp;departure) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">yield</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;d; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;d&nbsp;=&nbsp;d.AddDays(1); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> Can you express this functionality with a CC of <em>1?</em> In <a href="https://www.haskell.org/">Haskell</a> it's essentially built in as <code>(. pred) . 
enumFromTo</code>, and in <a href="https://fsharp.org/">F#</a> it's also idiomatic, although more verbose: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;enumerateDates&nbsp;(arrival&nbsp;:&nbsp;DateOnly)&nbsp;departure&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;Seq.initInfinite&nbsp;id&nbsp;|&gt;&nbsp;Seq.map&nbsp;arrival.AddDays&nbsp;|&gt;&nbsp;Seq.takeWhile&nbsp;(<span style="color:blue;">fun</span>&nbsp;d&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;d&nbsp;&lt;&nbsp;departure)</pre> </p> <p> Can we do the same in C#? </p> <p> If there's a general API in .NET that corresponds to the F#-specific <code>Seq.initInfinite</code> I haven't found it, but we can do something like this: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;IEnumerable&lt;DateOnly&gt;&nbsp;<span style="color:#74531f;">EnumerateDates</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">departure</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">const</span>&nbsp;<span style="color:blue;">int</span>&nbsp;infinity&nbsp;=&nbsp;<span style="color:blue;">int</span>.MaxValue;&nbsp;<span style="color:green;">//&nbsp;As&nbsp;close&nbsp;as&nbsp;int&nbsp;gets,&nbsp;at&nbsp;least</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Enumerable.Range(0,&nbsp;infinity).Select(arrival.AddDays).TakeWhile(<span style="font-weight:bold;color:#1f377f;">d</span>&nbsp;=&gt;&nbsp;d&nbsp;&lt;&nbsp;departure); }</pre> </p> <p> In C# infinite sequences are generally unusual, but <em>if</em> you were to create one, a combination of <code>while true</code> and <code>yield return</code> would be the most idiomatic. The problem with that, though, is that such a construct has a cyclomatic complexity of <em>2</em>. 
</p> <p> The above suggestion gets around that problem by pretending that <code>int.MaxValue</code> is infinity. Practically, at least, a 32-bit signed integer can't get larger than that anyway. I haven't tried to let F#'s <a href="https://fsharp.github.io/fsharp-core-docs/reference/fsharp-collections-seqmodule.html#initInfinite">Seq.initInfinite</a> run out, but by its type it seems <code>int</code>-bound as well, so in practice it, too, probably isn't infinite. (Or, if it is, the index that it supplies will have to overflow and wrap around to a negative value.) </p> <p> Is this alternative C# code better than the first? You be the judge of that. It has a lower cyclomatic complexity, but is less idiomatic. This isn't uncommon. In languages with a procedural background, there's often tension between lower cyclomatic complexity and how 'things are usually done'. </p> <h3 id="d24ea9b2a10f482693071d7dbe1c6604"> Checking for null <a href="#d24ea9b2a10f482693071d7dbe1c6604">#</a> </h3> <p> Is there a way to reduce the cyclomatic complexity of the <code>GetView</code> helper method? </p> <p> <pre><span style="color:blue;">private</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">GetView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(views.TryGetValue(date,&nbsp;<span style="color:blue;">out</span>&nbsp;var&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;view; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">else</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;rooms; }</pre> </p> <p> This is an example of the built-in API being in the way. 
In F#, you naturally write the same behaviour with a CC of <em>1:</em> </p> <p> <pre><span style="color:blue;">let</span>&nbsp;getView&nbsp;(date&nbsp;:&nbsp;DateOnly)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;views&nbsp;|&gt;&nbsp;Map.tryFind&nbsp;date&nbsp;|&gt;&nbsp;Option.defaultValue&nbsp;rooms&nbsp;|&gt;&nbsp;Set.ofSeq</pre> </p> <p> That <code>TryGet</code> idiom stands in the way of further CC reduction, it seems. It <em>is</em> possible to reach a CC of <em>1</em>, but the result is neither pretty nor idiomatic: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">GetView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>) { &nbsp;&nbsp;&nbsp;&nbsp;views.TryGetValue(date,&nbsp;<span style="color:blue;">out</span>&nbsp;var&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;view,&nbsp;rooms&nbsp;}.Where(<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x&nbsp;<span style="color:blue;">is</span>&nbsp;{&nbsp;}).First()!; }</pre> </p> <p> Perhaps there's a better way, but if so, it escapes me. Here, I use my knowledge that <code>view</code> is going to be <code>null</code> if <code>TryGetValue</code> doesn't find the dictionary entry. Thus, I can put it at the front of an array where I put the fallback value <code>rooms</code> as the second element. Then I filter the array by only keeping the elements that are <em>not</em> <code>null</code> (that's what the <code>x is { }</code> pun means; I usually read it as <em>x is something</em>). Finally, I return the first of these elements. </p> <p> I know that <code>rooms</code> is never <code>null</code>, but apparently the compiler can't tell. 
Thus, I have to suppress its anxiety with the <code>!</code> operator, telling it that this <em>will</em> result in a non-null value. </p> <p> I would never use such a code construct in a professional C# code base. </p> <h3 id="b1c8693ac29a43a1812e4b9ba9f86e6e"> Side effects <a href="#b1c8693ac29a43a1812e4b9ba9f86e6e">#</a> </h3> <p> The third helper method suggests another kind of problem that you may run into: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">RoomBooked</span>(Booking&nbsp;<span style="font-weight:bold;color:#1f377f;">booking</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(var&nbsp;<span style="font-weight:bold;color:#1f377f;">d</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span>&nbsp;EnumerateDates(booking.Arrival,&nbsp;booking.Departure)) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>&nbsp;=&nbsp;GetView(d); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">newView</span>&nbsp;=&nbsp;QueryService.Reserve(booking,&nbsp;view); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;views[d]&nbsp;=&nbsp;newView; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> Here the higher-than-one CC stems from the need to loop through dates in order to produce a side effect for each. 
Even in F# I do that: </p> <p> <pre><span style="color:blue;">member</span>&nbsp;this.RoomBooked&nbsp;booking&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">for</span>&nbsp;d&nbsp;<span style="color:blue;">in</span>&nbsp;enumerateDates&nbsp;booking.Arrival&nbsp;booking.Departure&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;newView&nbsp;=&nbsp;getView&nbsp;d&nbsp;|&gt;&nbsp;QueryService.reserve&nbsp;booking&nbsp;|&gt;&nbsp;Seq.toList &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;views&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;Map.add&nbsp;d&nbsp;newView&nbsp;views</pre> </p> <p> This also has a cyclomatic complexity of <em>2</em>. You could do something like this: </p> <p> <pre><span style="color:blue;">member</span>&nbsp;this.RoomBooked&nbsp;booking&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;enumerateDates&nbsp;booking.Arrival&nbsp;booking.Departure &nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Seq.iter&nbsp;(<span style="color:blue;">fun</span>&nbsp;d&nbsp;<span style="color:blue;">-&gt;</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;newView&nbsp;=&nbsp;getView&nbsp;d&nbsp;|&gt;&nbsp;QueryService.reserve&nbsp;booking&nbsp;|&gt;&nbsp;Seq.toList&nbsp;<span style="color:blue;">in</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;views&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;Map.add&nbsp;d&nbsp;newView&nbsp;views)</pre> </p> <p> but while that nominally has a CC of <em>1</em>, it has the same level of indentation as the previous attempt. This seems to indicate, at least, that it doesn't <em>really</em> address any complexity issue. 
</p> <p> You could also try something like this: </p> <p> <pre><span style="color:blue;">member</span>&nbsp;this.RoomBooked&nbsp;booking&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;enumerateDates&nbsp;booking.Arrival&nbsp;booking.Departure &nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Seq.map&nbsp;(<span style="color:blue;">fun</span>&nbsp;d&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;d,&nbsp;getView&nbsp;d&nbsp;|&gt;&nbsp;QueryService.reserve&nbsp;booking&nbsp;|&gt;&nbsp;Seq.toList) &nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Seq.iter&nbsp;(<span style="color:blue;">fun</span>&nbsp;(d,&nbsp;newView)&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;views&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;Map.add&nbsp;d&nbsp;newView&nbsp;views)</pre> </p> <p> which, again, may be nominally better, but forced me to wrap the <code>map</code> output in a tuple so that both <code>d</code> and <code>newView</code> are available to <code>Seq.iter</code>. I tend to regard that as a code smell. </p> <p> This latter version is, however, fairly easily translated to C#: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">RoomBooked</span>(Booking&nbsp;<span style="font-weight:bold;color:#1f377f;">booking</span>) { &nbsp;&nbsp;&nbsp;&nbsp;EnumerateDates(booking.Arrival,&nbsp;booking.Departure) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Select(<span style="font-weight:bold;color:#1f377f;">d</span>&nbsp;=&gt;&nbsp;(d,&nbsp;view:&nbsp;QueryService.Reserve(booking,&nbsp;GetView(d)))) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.ToList() &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.ForEach(<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;views[x.d]&nbsp;=&nbsp;x.view); }</pre> </p> <p> The standard .NET API doesn't have something equivalent to <code>Seq.iter</code> (although you could trivially write such an action), but <a href="https://stackoverflow.com/a/1509450/126014">you can 
convert any sequence to a <code>List&lt;T&gt;</code> and use its <code>ForEach</code> method</a>. </p> <p> In practice, though, I tend to <a href="https://ericlippert.com/2009/05/18/foreach-vs-foreach/">agree with Eric Lippert</a>. There's already an idiomatic way to iterate over each item in a collection, and <a href="https://peps.python.org/pep-0020/">being explicit</a> is generally helpful to the reader. </p> <h3 id="5f060ce8557043dfb0f374ef254cc922"> Church encoding <a href="#5f060ce8557043dfb0f374ef254cc922">#</a> </h3> <p> There's a general solution to most of CC golf: Whenever you need to make a decision and branch between two or more pathways, you can model that with a <a href="https://en.wikipedia.org/wiki/Tagged_union">sum type</a>. In C# you can mechanically model that with <a href="/2018/05/22/church-encoding">Church encoding</a> or <a href="/2018/06/25/visitor-as-a-sum-type">the Visitor pattern</a>. If you haven't tried that, I recommend it for the exercise, but once you've done it enough times, you realise that it requires little creativity. </p> <p> As an example, in 2021 I <a href="/2021/08/03/the-tennis-kata-revisited">revisited the Tennis kata</a> with the explicit purpose of translating <a href="/2016/02/10/types-properties-software">my usual F# approach to the exercise</a> to C# using Church encoding and the Visitor pattern. </p> <p> Once you've got a sense for how Church encoding enables you to simulate pattern matching in C#, there are few surprises. 
You may also rightfully question what is gained from such an exercise: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;IScore&nbsp;<span style="font-weight:bold;color:#74531f;">VisitPoints</span>(IPoint&nbsp;<span style="font-weight:bold;color:#1f377f;">playerOnePoint</span>,&nbsp;IPoint&nbsp;<span style="font-weight:bold;color:#1f377f;">playerTwoPoint</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;playerWhoWinsBall.Match( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;playerOne:&nbsp;playerOnePoint.Match&lt;IScore&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;love:&nbsp;<span style="color:blue;">new</span>&nbsp;Points(<span style="color:blue;">new</span>&nbsp;Fifteen(),&nbsp;playerTwoPoint), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fifteen:&nbsp;<span style="color:blue;">new</span>&nbsp;Points(<span style="color:blue;">new</span>&nbsp;Thirty(),&nbsp;playerTwoPoint), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;thirty:&nbsp;<span style="color:blue;">new</span>&nbsp;Forty(playerWhoWinsBall,&nbsp;playerTwoPoint)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;playerTwo:&nbsp;playerTwoPoint.Match&lt;IScore&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;love:&nbsp;<span style="color:blue;">new</span>&nbsp;Points(playerOnePoint,&nbsp;<span style="color:blue;">new</span>&nbsp;Fifteen()), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fifteen:&nbsp;<span style="color:blue;">new</span>&nbsp;Points(playerOnePoint,&nbsp;<span style="color:blue;">new</span>&nbsp;Thirty()), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;thirty:&nbsp;<span style="color:blue;">new</span>&nbsp;Forty(playerWhoWinsBall,&nbsp;playerOnePoint))); }</pre> </p> <p> Believe it or not, but that method has a CC of <em>1</em> despite the double indentation strongly suggesting 
that there's some branching going on. To a degree, this also highlights the limitations of the cyclomatic complexity metric. Conversely, <a href="/2021/03/29/table-driven-tennis-scoring">stupidly simple code may have a high CC rating</a>. </p> <p> Most of the examples in this article border on the pathological, and I don't recommend that you write code <em>like</em> that. I recommend that you do the exercise. In less pathological scenarios, there are real benefits to be reaped. </p> <h3 id="931c0946572041449fcce50da5f5219b"> Idioms <a href="#931c0946572041449fcce50da5f5219b">#</a> </h3> <p> In 2015 I published an article titled <a href="/2015/08/03/idiomatic-or-idiosyncratic">Idiomatic or idiosyncratic?</a> In it, I tried to explore the idea that the notion of idiomatic code can sometimes hold you back. I revisited that idea in 2021 in an article called <a href="/2021/05/17/against-consistency">Against consistency</a>. The point in both cases is that just because something looks unfamiliar, it doesn't mean that it's bad. </p> <p> Coding idioms somehow arose. If you believe that there's a portion of natural selection involved in the development of coding idioms, you may assume by default that idioms represent good ways of doing things. </p> <p> To a degree I believe this to be true. Many idioms represent the best way of doing things at the time they settled into the shape that we now know them. Languages and contexts change, however. Just look at <a href="/2019/07/15/tester-doer-isomorphisms">the many approaches to data lookups</a> there have been over the years. For many years now, C# has settled into the so-called <em>TryParse</em> idiom to solve that problem. In my opinion this represents a local maximum. </p> <p> Languages that provide <a href="/2018/06/04/church-encoded-maybe">Maybe</a> (AKA <code>option</code>) and <a href="/2018/06/11/church-encoded-either">Either</a> (AKA <code>Result</code>) types offer a superior alternative. 
These types naturally compose into <em>CC 1</em> pipelines, whereas <em>TryParse</em> requires you to stop what you're doing in order to check a return value. How very <a href="https://en.wikipedia.org/wiki/C_(programming_language)">C</a>-like. </p> <p> All that said, I still think you should write idiomatic code by default, but don't be a slave to what's considered idiomatic, just as you shouldn't be a slave to consistency. If there's a better way of doing things, choose the better way. </p> <h3 id="8bf3cb23fa5a4f3aa532a40b01dbefb1"> Conclusion <a href="#8bf3cb23fa5a4f3aa532a40b01dbefb1">#</a> </h3> <p> While cyclomatic complexity is a rough measure, it's one of the few useful programming metrics I know of. It should be as low as possible. </p> <p> Most professional code I encounter implements decisions almost exclusively with language primitives: <code>if</code>, <code>for</code>, <code>switch</code>, <code>while</code>, etc. Once, an organisation hired me to give a one-day <em>anti-if</em> workshop. There are other ways to make decisions in code. Most of those alternatives reduce cyclomatic complexity. </p> <p> That's not really a goal by itself, but reducing cyclomatic complexity tends to produce the beneficial side effect of structuring the code in a more sustainable way. It becomes easier to understand and change. </p> <p> As the cliché goes: <em>Choose the right tool for the job.</em> You can't, however, do that if you have nothing to choose from. If you only know of one way to do a thing, you have no choice. </p> <p> Play a little CC golf with your code from time to time. It may improve the code, or it may not. If it doesn't, just <a href="https://git-scm.com/docs/git-stash">stash</a> those changes. Either way, you've probably <em>learned</em> something. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. 
Fakes are Test Doubles with contracts https://blog.ploeh.dk/2023/11/13/fakes-are-test-doubles-with-contracts 2023-11-13T17:11:00+00:00 Mark Seemann <div id="post"> <p> <em>Contracts of Fake Objects can be described by properties.</em> </p> <p> The first time I tried my hand at the <a href="https://codingdojo.org/kata/CQRS_Booking/">CQRS Booking kata</a>, I abandoned it after 45 minutes because I found that I had little to learn from it. After all, I've already done umpteen variations of (restaurant) booking code examples, in several programming languages. The code example that accompanies my book <a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a> is only the largest and most complete of those. </p> <p> I also wrote <a href="https://learn.microsoft.com/en-us/archive/msdn-magazine/2011/april/azure-development-cqrs-on-microsoft-azure">an MSDN Magazine article</a> in 2011 about <a href="https://en.wikipedia.org/wiki/Command_Query_Responsibility_Segregation">CQRS</a>, so I think I have that angle covered as well. </p> <p> Still, while at first glance the kata seemed to have little to offer me, I've found myself coming back to it a few times. It does enable me to focus on something other than the 'production code'. In fact, it turns out that even if (or perhaps particularly <em>when</em>) you use test-driven development (TDD), there's precious little production code. Let's get that out of the way first. </p> <h3 id="b1192f76c2ef4f31b6bddcbc944664c7"> Production code <a href="#b1192f76c2ef4f31b6bddcbc944664c7">#</a> </h3> <p> The few times I've now done the kata, there's been almost no 'production code'. 
The implied <code>CommandService</code> has two lines of effective code: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">CommandService</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;IWriteRegistry&nbsp;writeRegistry; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;IReadRegistry&nbsp;readRegistry; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">CommandService</span>(IWriteRegistry&nbsp;<span style="font-weight:bold;color:#1f377f;">writeRegistry</span>,&nbsp;IReadRegistry&nbsp;<span style="font-weight:bold;color:#1f377f;">readRegistry</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.writeRegistry&nbsp;=&nbsp;writeRegistry; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.readRegistry&nbsp;=&nbsp;readRegistry; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">BookARoom</span>(Booking&nbsp;<span style="font-weight:bold;color:#1f377f;">booking</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;writeRegistry.Save(booking); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;readRegistry.RoomBooked(booking); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> The <code>QueryService</code> class isn't much more exciting: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">QueryService</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span 
style="color:blue;">readonly</span>&nbsp;IReadRegistry&nbsp;readRegistry; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">QueryService</span>(IReadRegistry&nbsp;<span style="font-weight:bold;color:#1f377f;">readRegistry</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.readRegistry&nbsp;=&nbsp;readRegistry; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="color:#74531f;">Reserve</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Booking&nbsp;<span style="font-weight:bold;color:#1f377f;">booking</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">existingView</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;existingView.Where(<span style="font-weight:bold;color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;r.Name&nbsp;!=&nbsp;booking.RoomName).ToList(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">GetFreeRooms</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">departure</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;readRegistry.GetFreeRooms(arrival,&nbsp;departure); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> The kata only suggests the <code>GetFreeRooms</code> method, which is only a single line. 
The only reason the <code>Reserve</code> function also exists is to pull a bit of testable logic back from the below <a href="http://xunitpatterns.com/Fake%20Object.html">Fake object</a>. I'll return to that shortly. </p> <p> I've also done the exercise in <a href="https://fsharp.org/">F#</a>, essentially porting the C# implementation, which only highlights how simple it all is: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;CommandService&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;bookARoom&nbsp;(writeRegistry&nbsp;:&nbsp;IWriteRegistry)&nbsp;(readRegistry&nbsp;:&nbsp;IReadRegistry)&nbsp;booking&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;writeRegistry.Save&nbsp;booking &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;readRegistry.RoomBooked&nbsp;booking <span style="color:blue;">module</span>&nbsp;QueryService&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;reserve&nbsp;booking&nbsp;existingView&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;existingView&nbsp;|&gt;&nbsp;Seq.filter&nbsp;(<span style="color:blue;">fun</span>&nbsp;r&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;r.Name&nbsp;&lt;&gt;&nbsp;booking.RoomName) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;getFreeRooms&nbsp;(readRegistry&nbsp;:&nbsp;IReadRegistry)&nbsp;arrival&nbsp;departure&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;readRegistry.GetFreeRooms&nbsp;arrival&nbsp;departure</pre> </p> <p> That's <em>both</em> the Command side and the Query side! </p> <p> This represents my honest interpretation of the kata. Really, there's nothing to it. </p> <p> The reason I still find the exercise interesting is that it explores other aspects of TDD than most katas. 
The most common katas require you to write a little algorithm: <a href="https://codingdojo.org/kata/Bowling/">Bowling</a>, <a href="https://codingdojo.org/kata/WordWrap/">Word wrap</a>, <a href="https://codingdojo.org/kata/RomanNumerals/">Roman Numerals</a>, <a href="https://codingdojo.org/kata/Diamond/">Diamond</a>, <a href="https://codingdojo.org/kata/Tennis/">Tennis</a>, etc. </p> <p> The CQRS Booking kata suggests no interesting algorithm, but rather teaches some important lessons about software architecture, separation of concerns, and, if you approach it with TDD, real-world test automation. In contrast to all those algorithmic exercises, this one strongly suggests the use of <a href="http://xunitpatterns.com/Test%20Double.html">Test Doubles</a>. </p> <h3 id="6d8d7717cfef428e91418b319c4fe971"> Fakes <a href="#6d8d7717cfef428e91418b319c4fe971">#</a> </h3> <p> You could attempt the kata with a dynamic 'mocking' library such as <a href="https://devlooped.com/moq">Moq</a> or <a href="https://site.mockito.org/">Mockito</a>, but I haven't tried. Since <a href="/2022/10/17/stubs-and-mocks-break-encapsulation">Stubs and Mocks break encapsulation</a> I favour Fake Objects instead. 
</p> <p> Creating a Fake write registry is trivial: </p> <p> <pre><span style="color:blue;">internal</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">FakeWriteRegistry</span>&nbsp;:&nbsp;Collection&lt;Booking&gt;,&nbsp;IWriteRegistry { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">Save</span>(Booking&nbsp;<span style="font-weight:bold;color:#1f377f;">booking</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Add(booking); &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> Its counterpart, the Fake read registry, turns out to be much more involved: </p> <p> <pre><span style="color:blue;">internal</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">FakeReadRegistry</span>&nbsp;:&nbsp;IReadRegistry { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;rooms; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;IDictionary&lt;DateOnly,&nbsp;IReadOnlyCollection&lt;Room&gt;&gt;&nbsp;views; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">FakeReadRegistry</span>(<span style="color:blue;">params</span>&nbsp;Room[]&nbsp;<span style="font-weight:bold;color:#1f377f;">rooms</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.rooms&nbsp;=&nbsp;rooms; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;views&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;Dictionary&lt;DateOnly,&nbsp;IReadOnlyCollection&lt;Room&gt;&gt;(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">public</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">GetFreeRooms</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">departure</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;EnumerateDates(arrival,&nbsp;departure) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Select(GetView) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Aggregate(rooms.AsEnumerable(),&nbsp;Enumerable.Intersect) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.ToList(); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">RoomBooked</span>(Booking&nbsp;<span style="font-weight:bold;color:#1f377f;">booking</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(var&nbsp;<span style="font-weight:bold;color:#1f377f;">d</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span>&nbsp;EnumerateDates(booking.Arrival,&nbsp;booking.Departure)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>&nbsp;=&nbsp;GetView(d); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">newView</span>&nbsp;=&nbsp;QueryService.Reserve(booking,&nbsp;view); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;views[d]&nbsp;=&nbsp;newView; 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;IEnumerable&lt;DateOnly&gt;&nbsp;<span style="color:#74531f;">EnumerateDates</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">departure</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">d</span>&nbsp;=&nbsp;arrival; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">while</span>&nbsp;(d&nbsp;&lt;&nbsp;departure) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">yield</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;d; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;d&nbsp;=&nbsp;d.AddDays(1); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">GetView</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">date</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(views.TryGetValue(date,&nbsp;<span style="color:blue;">out</span>&nbsp;var&nbsp;<span style="font-weight:bold;color:#1f377f;">view</span>)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;view; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">else</span> 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;rooms; &nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> I think I can predict the most common reaction: <em>That's much more code than the System Under Test!</em> Indeed. For this particular exercise, this may indicate that a 'dynamic mock' library may have been a better choice. I do, however, also think that it's an artefact of the kata description's lack of requirements. </p> <p> As is evident from the restaurant sample code that accompanies <a href="/code-that-fits-in-your-head">Code That Fits in Your Head</a>, once you add <a href="/2020/01/27/the-maitre-d-kata">realistic business rules</a> the production code grows, and the ratio of test code to production code becomes better balanced. </p> <p> The size of the <code>FakeReadRegistry</code> class also stems from the way the .NET base class library API is designed. The <code>GetView</code> helper method demonstrates that it requires four lines of code to look up an entry in a dictionary but return a default value if the entry isn't found. That's a one-liner in F#: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;getView&nbsp;(date&nbsp;:&nbsp;DateOnly)&nbsp;=&nbsp;views&nbsp;|&gt;&nbsp;Map.tryFind&nbsp;date&nbsp;|&gt;&nbsp;Option.defaultValue&nbsp;rooms&nbsp;|&gt;&nbsp;Set.ofSeq</pre> </p> <p> I'll show the entire F# Fake later, but you could also play some <a href="/2023/11/14/cc-golf">CC golf</a> with the C# code. That's a bit beside the point, though. </p> <h3 id="1f5b34534edd4b72947c8d6b4c8921bf"> Command service design <a href="#1f5b34534edd4b72947c8d6b4c8921bf">#</a> </h3> <p> Why does <code>FakeReadRegistry</code> look like it does? It's a combination of the kata description and my prior experience with CQRS. When adopting an asynchronous message-based architecture, I would usually not implement the write side exactly like that. 
Notice how the <code>CommandService</code> class' <code>BookARoom</code> method seems to repeat itself: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">BookARoom</span>(Booking&nbsp;<span style="font-weight:bold;color:#1f377f;">booking</span>) { &nbsp;&nbsp;&nbsp;&nbsp;writeRegistry.Save(booking); &nbsp;&nbsp;&nbsp;&nbsp;readRegistry.RoomBooked(booking); }</pre> </p> <p> While semantically the method seems to be making two different statements, structurally they're identical. If you rename the methods, you could wrap both method calls in a single <a href="https://en.wikipedia.org/wiki/Composite_pattern">Composite</a>. In a more typical CQRS architecture, you'd post a Command on a bus: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">BookARoom</span>(Booking&nbsp;<span style="font-weight:bold;color:#1f377f;">booking</span>) { &nbsp;&nbsp;&nbsp;&nbsp;bus.BookRoom(booking); }</pre> </p> <p> This makes that particular <code>BookARoom</code> method, and perhaps the entire <code>CommandService</code> class, look redundant. Why do we need it? </p> <p> As presented here, we don't, but in a real application, the Command service would likely perform some pre- and post-processing. For example, if this were a web application, the Command service might instead be a Controller concerned with validating and translating HTTP- or Web-based input to a Domain Object before posting to the bus. </p> <p> A realistic code base would also be asynchronous, which, on .NET, would imply the use of the <code>async</code> and <code>await</code> keywords, etc. </p> <h3 id="4dd49e22fe4c479381285eb1b886457e"> Read registry design <a href="#4dd49e22fe4c479381285eb1b886457e">#</a> </h3> <p> A central point of CQRS is that you can optimise the read side for the specific tasks that it needs to perform. 
Instead of performing a dynamic query every time a client requests a view, you can update and persist a view. Imagine having a JSON or HTML file that the system can serve upon request. </p> <p> Part of handling a Command or Event is that the system's background processes update the persistent views once per event. </p> <p> For the particular hotel booking system, I imagine that the read registry has a set of files, blobs, documents, or denormalised database rows. When it receives notification of a booking, it'll need to remove that room from the dates of the booking. </p> <p> While a booking may stretch over several days, I found it simplest to think of the storage system as subdivided into single dates, instead of ranges. Indeed, the <code>GetFreeRooms</code> method is a ranged query, so if you really wanted to denormalise the views, you could create a persistent view per range. This would, however, require that you precalculate and persist a view for October 2 to October 4, and another one for October 2 to October 5, and so on. The combinatorial explosion suggests that this isn't a good idea, so instead I imagine keeping a persistent view per date, and then performing a bit of on-the-fly calculation per query. </p> <p> That's what <code>FakeReadRegistry</code> does. It also falls back to a default collection of <code>rooms</code> for all the dates that are as yet untouched by a booking. This is, again, because I imagine that I might implement a real system like that. </p> <p> You may still protest that the <code>FakeReadRegistry</code> duplicates production code. True, perhaps, but if this really is a concern, you could <a href="/2023/11/20/trimming-a-fake-object">refactor it to the Template Method pattern</a>. </p> <p> Still, it's not really that complicated; it only looks that way because C# and the Dictionary API are too heavy on <a href="/2019/12/16/zone-of-ceremony">ceremony</a>. 
The Fake looks much simpler in F#: </p> <p> <pre><span style="color:blue;">type</span>&nbsp;FakeReadRegistry&nbsp;(rooms&nbsp;:&nbsp;IReadOnlyCollection&lt;Room&gt;)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;<span style="color:blue;">mutable</span>&nbsp;views&nbsp;=&nbsp;Map.empty &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;enumerateDates&nbsp;(arrival&nbsp;:&nbsp;DateOnly)&nbsp;departure&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Seq.initInfinite&nbsp;id &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Seq.map&nbsp;arrival.AddDays &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Seq.takeWhile&nbsp;(<span style="color:blue;">fun</span>&nbsp;d&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;d&nbsp;&lt;&nbsp;departure) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;getView&nbsp;(date&nbsp;:&nbsp;DateOnly)&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;views&nbsp;|&gt;&nbsp;Map.tryFind&nbsp;date&nbsp;|&gt;&nbsp;Option.defaultValue&nbsp;rooms&nbsp;|&gt;&nbsp;Set.ofSeq &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">interface</span>&nbsp;IReadRegistry&nbsp;<span style="color:blue;">with</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">member</span>&nbsp;this.GetFreeRooms&nbsp;arrival&nbsp;departure&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;enumerateDates&nbsp;arrival&nbsp;departure &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Seq.map&nbsp;getView &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Seq.fold&nbsp;Set.intersect&nbsp;(Set.ofSeq&nbsp;rooms) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Set.toList&nbsp;:&gt;&nbsp;_ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">member</span>&nbsp;this.RoomBooked&nbsp;booking&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">for</span>&nbsp;d&nbsp;<span style="color:blue;">in</span>&nbsp;enumerateDates&nbsp;booking.Arrival&nbsp;booking.Departure&nbsp;<span style="color:blue;">do</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;newView&nbsp;=&nbsp;getView&nbsp;d&nbsp;|&gt;&nbsp;QueryService.reserve&nbsp;booking&nbsp;|&gt;&nbsp;Seq.toList &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;views&nbsp;<span style="color:blue;">&lt;-</span>&nbsp;Map.add&nbsp;d&nbsp;newView&nbsp;views </pre> </p> <p> This isn't just more dense than the corresponding C# code, as F# tends to be, it also has a lower <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">cyclomatic complexity</a>. Both the <code>EnumerateDates</code> and <code>GetView</code> C# methods have a cyclomatic complexity of <em>2</em>, while their F# counterparts rate only <em>1</em>. </p> <p> For production code, cyclomatic complexity of <em>2</em> is fine if the code is covered by automatic tests. In test code, however, we should be wary of any branching or looping, since there are (typically) no tests of the test code. </p> <p> While I <em>am</em> going to show some tests of that code in what follows, I do that for a different reason. </p> <h3 id="da4d51ff041c4cf6b4007d53a67f2d76"> Contract <a href="#da4d51ff041c4cf6b4007d53a67f2d76">#</a> </h3> <p> When explaining Fake Objects to people, I've begun to use a particular phrase: </p> <blockquote> <p> A Fake Object is a polymorphic implementation of a dependency that fulfils the contract, but lacks some of the <em>ilities</em>. 
</p> </blockquote> <p> It's funny how you can arrive at something that strikes you as profound, only to discover that it was part of the definition all along: </p> <blockquote> <p> "We acquire or build a very lightweight implementation of the same functionality as provided by a component on which the SUT [System Under Test] depends and instruct the SUT to use it instead of the real DOC [Depended-On Component]. This implementation need not have any of the "-ilities" that the real DOC needs to have" </p> <footer><cite>Gerard Meszaros, <a href="/ref/xunit-patterns">xUnit Test Patterns</a></cite></footer> </blockquote> <p> A common example is a Fake Repository object that pretends to be a database, often by leveraging a built-in collection API. The above <code>FakeWriteRegistry</code> is as simple an example as you could have. A slightly more compelling example is <a href="/2023/08/14/replacing-mock-and-stub-with-a-fake">the FakeUserRepository shown in another article</a>. Such an 'in-memory database' fulfils the implied contract, because if you 'save' something in the 'database' you can later retrieve it again with a query. As long as the object remains in memory. </p> <p> The <em>ilities</em> that such a Fake database lacks are </p> <ul> <li>data persistence</li> <li>thread safety</li> <li>transaction support</li> </ul> <p> and perhaps others. Such qualities are clearly required in a real production environment, but are in the way in an automated testing context. The implied contract, however, is satisfied: What you save you can later retrieve. 
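</p> <p> As a minimal sketch of that implied contract, consider the following in-memory Fake. The <code>Note</code> record, the <code>INoteStore</code> interface, and the <code>FakeNoteStore</code> class are hypothetical names made up for this illustration; they aren't part of the kata code base or of the linked article. </p> <p> <pre>using System;
using System.Collections.Generic;
using Xunit;

// Hypothetical types, invented for this sketch only.
public sealed record Note(Guid Id, string Text);

public interface INoteStore
{
    void Save(Note note);
    Note? Find(Guid id);
}

// Fulfils the implied contract, but has no persistence,
// thread safety, or transaction support.
public sealed class FakeNoteStore : INoteStore
{
    private readonly Dictionary&lt;Guid, Note&gt; notes = new();

    public void Save(Note note) =&gt; notes[note.Id] = note;

    public Note? Find(Guid id) =&gt;
        notes.TryGetValue(id, out var note) ? note : null;
}

public sealed class FakeNoteStoreTests
{
    [Fact]
    public void WhatYouSaveYouCanLaterRetrieve()
    {
        var sut = new FakeNoteStore();
        var note = new Note(Guid.NewGuid(), "foo");

        sut.Save(note);

        // Records compare by value, so this checks the round trip.
        Assert.Equal(note, sut.Find(note.Id));
    }
}</pre> </p> <p> The dictionary supplies the save-then-retrieve behaviour, while everything is lost as soon as the object goes out of scope; that's the trade-off a Fake makes.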
</p> <p> Now consider the <code>IReadRegistry</code> interface: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">interface</span>&nbsp;<span style="color:#2b91af;">IReadRegistry</span> { &nbsp;&nbsp;&nbsp;&nbsp;IReadOnlyCollection&lt;Room&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">GetFreeRooms</span>(DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;DateOnly&nbsp;<span style="font-weight:bold;color:#1f377f;">departure</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">RoomBooked</span>(Booking&nbsp;<span style="font-weight:bold;color:#1f377f;">booking</span>); }</pre> </p> <p> Which contract does it imply, given what you know about the <em>CQRS Booking</em> kata? </p> <p> I would suggest the following: </p> <ul> <li><em>Precondition:</em> <code>arrival</code> should be less than (or equal to?) <code>departure</code>.</li> <li><em>Postcondition:</em> <code>GetFreeRooms</code> should always return a result. Null isn't a valid return value.</li> <li><em>Invariant:</em> After calling <code>RoomBooked</code>, <code>GetFreeRooms</code> should exclude that room when queried on overlapping dates.</li> </ul> <p> There may be more to the contract than this, but I find the third one most interesting. This is exactly what you would expect from a real system: If you reserve a room, you'd be surprised to see <code>GetFreeRooms</code> indicating that this room is free if queried about dates that overlap the reservation. </p> <p> This is the sort of implied interaction that <a href="/2022/10/17/stubs-and-mocks-break-encapsulation">Stubs and Mocks break</a>, but that <code>FakeReadRegistry</code> guarantees. </p> <h3 id="6ab8206598ab4bb990d93e5472d36054"> Properties <a href="#6ab8206598ab4bb990d93e5472d36054">#</a> </h3> <p> There's a close relationship between contracts and properties.
Once you can list preconditions, invariants, and postconditions for an object, there's a good chance that you can write code that exercises those qualities. Indeed, why not use property-based testing to do so? </p> <p> I don't wish to imply that you should (normally) write tests of your test code. The following rather serves as a concretisation of the notion that a Fake Object is a Test Double that implements the 'proper' behaviour. In the following, I'll subject the <code>FakeReadRegistry</code> class to that exercise. To do that, I'll use <a href="https://github.com/AnthonyLloyd/CsCheck">CsCheck</a> 2.14.1 with <a href="https://xunit.net/">xUnit.net</a> 2.5.3. </p> <p> Before tackling the above invariant, there's a simpler invariant specific to the <code>FakeReadRegistry</code> class. A <code>FakeReadRegistry</code> object takes a collection of <code>rooms</code> via its constructor, so for this particular implementation, we may wish to establish the reasonable invariant that <code>GetFreeRooms</code> doesn't 'invent' rooms on its own: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;Gen&lt;Room&gt;&nbsp;GenRoom&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;name&nbsp;<span style="color:blue;">in</span>&nbsp;Gen.String &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">select</span>&nbsp;<span style="color:blue;">new</span>&nbsp;Room(name); [Fact] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">GetFreeRooms</span>() { &nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">from</span>&nbsp;rooms&nbsp;<span style="color:blue;">in</span>&nbsp;GenRoom.ArrayUnique &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;arrival&nbsp;<span style="color:blue;">in</span>&nbsp;Gen.Date.Select(DateOnly.FromDateTime) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">from</span>&nbsp;i&nbsp;<span style="color:blue;">in</span>&nbsp;Gen.Int[1,&nbsp;1_000] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;departure&nbsp;=&nbsp;arrival.AddDays(i) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">select</span>&nbsp;(rooms,&nbsp;arrival,&nbsp;departure)) &nbsp;&nbsp;&nbsp;&nbsp;.Sample((<span style="font-weight:bold;color:#1f377f;">rooms</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">arrival</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">departure</span>)&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;FakeReadRegistry(rooms); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;sut.GetFreeRooms(arrival,&nbsp;departure); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Assert.Subset(<span style="color:blue;">new</span>&nbsp;HashSet&lt;Room&gt;(rooms),&nbsp;<span style="color:blue;">new</span>&nbsp;HashSet&lt;Room&gt;(actual)); &nbsp;&nbsp;&nbsp;&nbsp;}); }</pre> </p> <p> This property asserts that the <code>actual</code> value returned from <code>GetFreeRooms</code> is a subset of the <code>rooms</code> used to initialise the <code>sut</code>. Recall that the subset relation is <a href="https://en.wikipedia.org/wiki/Reflexive_relation">reflexive</a>; i.e. a set is a subset of itself. 
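</p> <p> Reflexivity matters here, because a freshly created registry has no bookings, so <code>GetFreeRooms</code> can be expected to return every room. The following sketch illustrates the distinction with nothing but <code>HashSet&lt;T&gt;</code> from the base class library; the room names and the <code>SubsetExamples</code> class are made up for the illustration. </p> <p> <pre>using System.Collections.Generic;
using Xunit;

public sealed class SubsetExamples
{
    [Fact]
    public void SubsetIsReflexive()
    {
        // Made-up room names, purely for illustration.
        var rooms = new HashSet&lt;string&gt; { "Fleur", "Jasmine", "Camellia" };
        var free = new HashSet&lt;string&gt;(rooms);

        // Equal sets satisfy the subset relation...
        Assert.True(free.IsSubsetOf(rooms));
        // ...but not the proper-subset relation.
        Assert.False(free.IsProperSubsetOf(rooms));

        // Once a room is taken away, both relations hold.
        free.Remove("Jasmine");
        Assert.True(free.IsSubsetOf(rooms));
        Assert.True(free.IsProperSubsetOf(rooms));
    }
}</pre> </p> <p> Had the assertion demanded a proper subset instead, a registry that returns every room when nothing is booked would fail the property.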
</p> <p> The same property written in F# with <a href="https://hedgehog.qa/">Hedgehog</a> 0.13.0 and <a href="https://github.com/SwensenSoftware/unquote">Unquote</a> 6.1.0 may look like this: </p> <p> <pre><span style="color:blue;">module</span>&nbsp;Gen&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;room&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Gen.alphaNum &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Gen.array&nbsp;(Range.linear&nbsp;1&nbsp;10) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&gt;&nbsp;Gen.map&nbsp;(<span style="color:blue;">fun</span>&nbsp;chars&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;{&nbsp;Name&nbsp;=&nbsp;String&nbsp;chars&nbsp;}) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;dateOnly&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;min&nbsp;=&nbsp;DateOnly(2000,&nbsp;1,&nbsp;1).DayNumber &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;max&nbsp;=&nbsp;DateOnly(2100,&nbsp;1,&nbsp;1).DayNumber &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Range.linear&nbsp;min&nbsp;max&nbsp;|&gt;&nbsp;Gen.int32&nbsp;|&gt;&nbsp;Gen.map&nbsp;DateOnly.FromDayNumber [&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;GetFreeRooms&nbsp;()&nbsp;=&nbsp;Property.check&nbsp;&lt;|&nbsp;property&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;rooms&nbsp;=&nbsp;Gen.room&nbsp;|&gt;&nbsp;Gen.list&nbsp;(Range.linear&nbsp;0&nbsp;100) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;arrival&nbsp;=&nbsp;Gen.dateOnly &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;i&nbsp;=&nbsp;Gen.int32&nbsp;(Range.linear&nbsp;1&nbsp;1_000) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;departure&nbsp;=&nbsp;arrival.AddDays&nbsp;i &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">let</span>&nbsp;sut&nbsp;=&nbsp;FakeReadRegistry&nbsp;rooms&nbsp;:&gt;&nbsp;IReadRegistry &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;sut.GetFreeRooms&nbsp;arrival&nbsp;departure &nbsp;&nbsp;&nbsp;&nbsp;test&nbsp;&lt;@&nbsp;Set.isSubset&nbsp;(Set.ofSeq&nbsp;actual)&nbsp;(Set.ofSeq&nbsp;rooms)&nbsp;@&gt;&nbsp;}</pre> </p> <p> Simpler syntax, same idea. </p> <p> Likewise, we can express the contract that describes the relationship between <code>RoomBooked</code> and <code>GetFreeRooms</code> like this: </p> <p> <pre>[Fact] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">RoomBooked</span>() { &nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">from</span>&nbsp;rooms&nbsp;<span style="color:blue;">in</span>&nbsp;GenRoom.ArrayUnique.Nonempty &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;arrival&nbsp;<span style="color:blue;">in</span>&nbsp;Gen.Date.Select(DateOnly.FromDateTime) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;i&nbsp;<span style="color:blue;">in</span>&nbsp;Gen.Int[1,&nbsp;1_000] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;departure&nbsp;=&nbsp;arrival.AddDays(i) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;room&nbsp;<span style="color:blue;">in</span>&nbsp;Gen.OneOfConst(rooms) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;id&nbsp;<span style="color:blue;">in</span>&nbsp;Gen.Guid &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;booking&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;Booking(id,&nbsp;room.Name,&nbsp;arrival,&nbsp;departure) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">select</span>&nbsp;(rooms,&nbsp;booking)) &nbsp;&nbsp;&nbsp;&nbsp;.Sample((<span style="font-weight:bold;color:#1f377f;">rooms</span>,&nbsp;<span 
style="font-weight:bold;color:#1f377f;">booking</span>)&nbsp;=&gt; &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">sut</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;FakeReadRegistry(rooms); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sut.RoomBooked(booking); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;sut.GetFreeRooms(booking.Arrival,&nbsp;booking.Departure); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Assert.DoesNotContain(booking.RoomName,&nbsp;actual.Select(<span style="font-weight:bold;color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;r.Name)); &nbsp;&nbsp;&nbsp;&nbsp;}); }</pre> </p> <p> or, in F#: </p> <p> <pre>[&lt;Fact&gt;] <span style="color:blue;">let</span>&nbsp;RoomBooked&nbsp;()&nbsp;=&nbsp;Property.check&nbsp;&lt;|&nbsp;property&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;rooms&nbsp;=&nbsp;Gen.room&nbsp;|&gt;&nbsp;Gen.list&nbsp;(Range.linear&nbsp;1&nbsp;100) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;arrival&nbsp;=&nbsp;Gen.dateOnly &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;i&nbsp;=&nbsp;Gen.int32&nbsp;(Range.linear&nbsp;1&nbsp;1_000) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;departure&nbsp;=&nbsp;arrival.AddDays&nbsp;i &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;room&nbsp;=&nbsp;Gen.item&nbsp;rooms &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let!</span>&nbsp;id&nbsp;=&nbsp;Gen.guid &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;booking&nbsp;=&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ClientId&nbsp;=&nbsp;id &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;RoomName&nbsp;=&nbsp;room.Name 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Arrival&nbsp;=&nbsp;arrival &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Departure&nbsp;=&nbsp;departure&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;sut&nbsp;=&nbsp;FakeReadRegistry&nbsp;rooms&nbsp;:&gt;&nbsp;IReadRegistry &nbsp;&nbsp;&nbsp;&nbsp;sut.RoomBooked&nbsp;booking &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;actual&nbsp;=&nbsp;sut.GetFreeRooms&nbsp;arrival&nbsp;departure &nbsp;&nbsp;&nbsp;&nbsp;test&nbsp;&lt;@&nbsp;not&nbsp;(Seq.contains&nbsp;room&nbsp;actual)&nbsp;@&gt;&nbsp;}</pre> </p> <p> In both cases, the property books a room and then proceeds to query <code>GetFreeRooms</code> to see which rooms are free. Since the query is exactly in the range from <code>booking.Arrival</code> to <code>booking.Departure</code>, we expect <em>not</em> to see the name of the booked room among the free rooms. </p> <p> (As I'm writing this, I think that there may be a subtle bug in the F# property. Can you spot it?) </p> <h3 id="fa249347697b49699b7ea62336746651"> Conclusion <a href="#fa249347697b49699b7ea62336746651">#</a> </h3> <p> A Fake Object isn't like other Test Doubles. While <a href="/2022/10/17/stubs-and-mocks-break-encapsulation">Stubs and Mocks break encapsulation</a>, a Fake Object not only stays encapsulated, but it also fulfils the contract implied by a polymorphic API (interface or base class). </p> <p> Or, put another way: When is a Fake Object the right Test Double? When you can describe the contract of the dependency. </p> <p> But if you <em>can't</em> describe the contract of a dependency, you should seriously consider if the design is right. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. 
A C# port of validation with partial round trip https://blog.ploeh.dk/2023/10/30/a-c-port-of-validation-with-partial-round-trip 2023-10-30T11:52:00+00:00 Mark Seemann <div id="post"> <p> <em>A raw port of the previous F# demo code.</em> </p> <p> This article is part of <a href="/2020/12/14/validation-a-solved-problem">a short article series</a> on <a href="/2018/11/05/applicative-validation">applicative validation</a> with a twist. The twist is that validation, when it fails, should return not only a list of error messages; it should also retain that part of the input that <em>was</em> valid. </p> <p> In the <a href="/2020/12/28/an-f-demo-of-validation-with-partial-data-round-trip">previous article</a> I showed <a href="https://fsharp.org/">F#</a> demo code, and since <a href="https://forums.fsharp.org/t/thoughts-on-input-validation-pattern-from-a-noob/1541">the original forum question</a> that prompted the article series was about F# code, for a long time, I left it there. </p> <p> Recently, however, I've found myself writing about validation in a broader context: </p> <ul> <li><a href="/2022/07/25/an-applicative-reservation-validation-example-in-c">An applicative reservation validation example in C#</a></li> <li><a href="/2022/08/15/aspnet-validation-revisited">ASP.NET validation revisited</a></li> <li><a href="/2022/08/22/can-types-replace-validation">Can types replace validation?</a></li> <li><a href="/2023/06/26/validation-and-business-rules">Validation and business rules</a></li> <li><a href="/2023/07/03/validating-or-verifying-emails">Validating or verifying emails</a></li> </ul> <p> Perhaps I should consider adding a <em>validation</em> tag to the blog... </p> <p> In that light I thought that it might be illustrative to continue <a href="/2020/12/14/validation-a-solved-problem">this article series</a> with a port to C#. </p> <p> Here, I use techniques already described on this site to perform the translation. Follow the links for details. 
</p> <p> The translation given here is direct, so it produces some fairly non-idiomatic C# code. </p> <h3 id="5cee653b6148484fb782d92fea2ca415"> Building blocks <a href="#5cee653b6148484fb782d92fea2ca415">#</a> </h3> <p> The original problem is succinctly stated, and I follow it as closely as possible. This includes potential errors that may be present in the original post. </p> <p> The task is to translate some input to a Domain Model with <a href="/2022/10/24/encapsulation-in-functional-programming">good encapsulation</a>. The input type looks like this, translated to a <a href="https://learn.microsoft.com/dotnet/csharp/language-reference/builtin-types/record">C# record</a>: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">record</span>&nbsp;<span style="color:#2b91af;">Input</span>(<span style="color:blue;">string</span>?&nbsp;<span style="font-weight:bold;color:#1f377f;">Name</span>,&nbsp;DateTime?&nbsp;<span style="font-weight:bold;color:#1f377f;">DoB</span>,&nbsp;<span style="color:blue;">string</span>?&nbsp;<span style="font-weight:bold;color:#1f377f;">Address</span>)</pre> </p> <p> Notice that every input value may be null. This indicates poor encapsulation, but is symptomatic of most input. <a href="/2023/10/16/at-the-boundaries-static-types-are-illusory">At the boundaries, static types are illusory</a>. Perhaps it would have been more idiomatic to model such input as a <a href="https://en.wikipedia.org/wiki/Data_transfer_object">Data Transfer Object</a>, but it makes little difference to what comes next. </p> <p> I consider <a href="/2020/12/14/validation-a-solved-problem">validation a solved problem</a>, because it's possible to model the process as an <a href="/2018/10/01/applicative-functors">applicative functor</a>. Really, <a href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/">validation is a parsing problem</a>. 
</p> <p> Since my main intent with this article is to demonstrate a technique, I will allow myself a few shortcuts. Like I did <a href="/2023/08/28/a-first-crack-at-the-args-kata">when I first encountered the Args kata</a>, I start by copying the <code>Validated</code> code from <a href="/2022/07/25/an-applicative-reservation-validation-example-in-c">An applicative reservation validation example in C#</a>; you can go there if you're interested in it. I'm not going to repeat it here. </p> <p> The target type looks similar to the above <code>Input</code> record, but doesn't allow null values: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">record</span>&nbsp;<span style="color:#2b91af;">ValidInput</span>(<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">Name</span>,&nbsp;DateTime&nbsp;<span style="font-weight:bold;color:#1f377f;">DoB</span>,&nbsp;<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">Address</span>);</pre> </p> <p> This could also have been a 'proper' class. The following code doesn't depend on that. </p> <h3 id="7af5ab9c8fca4dc193fc5854c2806ff4"> Validating names <a href="#7af5ab9c8fca4dc193fc5854c2806ff4">#</a> </h3> <p> Since I'm now working in an ostensibly object-oriented language, I can make the various validation functions methods on the <code>Input</code> record. 
Since I'm treating validation as a parsing problem, I'm going to name those methods with the <code>TryParse</code> prefix: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;Validated&lt;(Func&lt;Input,&nbsp;Input&gt;,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;),&nbsp;<span style="color:blue;">string</span>&gt; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#74531f;">TryParseName</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(Name&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Validated.Fail&lt;(Func&lt;Input,&nbsp;Input&gt;,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;),&nbsp;<span style="color:blue;">string</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:#a31515;">&quot;name&nbsp;is&nbsp;required&quot;</span>&nbsp;})); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(Name.Length&nbsp;&lt;=&nbsp;3) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Validated.Fail&lt;(Func&lt;Input,&nbsp;Input&gt;,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;),&nbsp;<span style="color:blue;">string</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="font-weight:bold;color:#1f377f;">i</span>&nbsp;=&gt;&nbsp;i&nbsp;<span style="color:blue;">with</span>&nbsp;{&nbsp;Name&nbsp;=&nbsp;<span style="color:blue;">null</span>&nbsp;},&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span 
style="color:#a31515;">&quot;no&nbsp;bob&nbsp;and&nbsp;toms&nbsp;allowed&quot;</span>&nbsp;})); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Validated.Succeed&lt;(Func&lt;Input,&nbsp;Input&gt;,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;),&nbsp;<span style="color:blue;">string</span>&gt;(Name); }</pre> </p> <p> As the two previous articles have explained, the result of trying to parse input is a type isomorphic to <a href="/2019/01/14/an-either-functor">Either</a>, but here called <code><span style="color:#2b91af;">Validated</span>&lt;<span style="color:#2b91af;">F</span>,&nbsp;<span style="color:#2b91af;">S</span>&gt;</code>. (The reason for this distinction is that we <em>don't</em> want the <a href="/2022/05/09/an-either-monad">monadic behaviour of Either</a>, because monads short-circuit.) </p> <p> When parsing succeeds, the <code>TryParseName</code> method returns the <code>Name</code> wrapped in a <code>Success</code> case. </p> <p> Parsing the name may fail in two different ways. If the name is missing, the method returns the input and the error message <em>"name is required"</em>. If the name is present, but too short, <code>TryParseName</code> returns another error message, and also resets <code>Name</code> to <code>null</code>. </p> <p> Compare the C# code with <a href="/2020/12/28/an-f-demo-of-validation-with-partial-data-round-trip">the corresponding F#</a> or <a href="/2020/12/21/a-haskell-proof-of-concept-of-validation-with-partial-data-round-trip">Haskell code</a> and notice how much more verbose the C# has to be. </p> <p> While it's possible to translate many functional programming concepts to a language like C#, syntax does matter, because it affects readability. </p> <h3 id="2113a955061341ab9e2dba711aaf8457"> Validating date of birth <a href="#2113a955061341ab9e2dba711aaf8457">#</a> </h3> <p> From here, the port is direct, if awkward. 
Here's how to validate the date-of-birth field: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;Validated&lt;(Func&lt;Input,&nbsp;Input&gt;,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;),&nbsp;DateTime&gt; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#74531f;">TryParseDoB</span>(DateTime&nbsp;<span style="font-weight:bold;color:#1f377f;">now</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(!DoB.HasValue) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Validated.Fail&lt;(Func&lt;Input,&nbsp;Input&gt;,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;),&nbsp;DateTime&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:#a31515;">&quot;dob&nbsp;is&nbsp;required&quot;</span>&nbsp;})); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(DoB.Value&nbsp;&lt;=&nbsp;now.AddYears(-12)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Validated.Fail&lt;(Func&lt;Input,&nbsp;Input&gt;,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;),&nbsp;DateTime&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="font-weight:bold;color:#1f377f;">i</span>&nbsp;=&gt;&nbsp;i&nbsp;<span style="color:blue;">with</span>&nbsp;{&nbsp;DoB&nbsp;=&nbsp;<span style="color:blue;">null</span>&nbsp;},&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:#a31515;">&quot;get&nbsp;off&nbsp;my&nbsp;lawn&quot;</span>&nbsp;})); &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Validated.Succeed&lt;(Func&lt;Input,&nbsp;Input&gt;,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;),&nbsp;DateTime&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;DoB.Value); }</pre> </p> <p> I suspect that the age check should really have been a greater-than relation, but I'm only reproducing the original code. </p> <h3 id="e1fc6b98e4fb4dad81ee5e354032acb8"> Validating addresses <a href="#e1fc6b98e4fb4dad81ee5e354032acb8">#</a> </h3> <p> The final building block is to parse the input address: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;Validated&lt;(Func&lt;Input,&nbsp;Input&gt;,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;),&nbsp;<span style="color:blue;">string</span>&gt; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#74531f;">TryParseAddress</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(Address&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Validated.Fail&lt;(Func&lt;Input,&nbsp;Input&gt;,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;),&nbsp;<span style="color:blue;">string</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x,&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:#a31515;">&quot;add1&nbsp;is&nbsp;required&quot;</span>&nbsp;})); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Validated.Succeed&lt;(Func&lt;Input,&nbsp;Input&gt;,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;),&nbsp;<span style="color:blue;">string</span>&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Address); }</pre> 
</p> <p> The <code>TryParseAddress</code> method only checks whether or not the <code>Address</code> field is present. </p> <h3 id="b11153a62fa945568e880cf771a7cb19"> Composition <a href="#b11153a62fa945568e880cf771a7cb19">#</a> </h3> <p> The above methods are <code>private</code> because the entire problem is simple enough that I can test the composition as a whole. Had I wanted to, however, I could easily have made them <code>public</code> and tested them individually. </p> <p> You can now use applicative composition to produce a single validation method: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;Validated&lt;(Input,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;),&nbsp;ValidInput&gt; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#74531f;">TryParse</span>(DateTime&nbsp;<span style="font-weight:bold;color:#1f377f;">now</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">name</span>&nbsp;=&nbsp;TryParseName(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">dob</span>&nbsp;=&nbsp;TryParseDoB(now); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">address</span>&nbsp;=&nbsp;TryParseAddress(); &nbsp;&nbsp;&nbsp;&nbsp;Func&lt;<span style="color:blue;">string</span>,&nbsp;DateTime,&nbsp;<span style="color:blue;">string</span>,&nbsp;ValidInput&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">createValid</span>&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="font-weight:bold;color:#1f377f;">n</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">d</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">a</span>)&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>&nbsp;ValidInput(n,&nbsp;d,&nbsp;a); &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:blue;">static</span>&nbsp;(Func&lt;Input,&nbsp;Input&gt;,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;)&nbsp;<span style="color:#74531f;">combineErrors</span>( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(Func&lt;Input,&nbsp;Input&gt;&nbsp;f,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;&nbsp;es)&nbsp;<span style="font-weight:bold;color:#1f377f;">x</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(Func&lt;Input,&nbsp;Input&gt;&nbsp;g,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;&nbsp;es)&nbsp;<span style="font-weight:bold;color:#1f377f;">y</span>) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;(<span style="font-weight:bold;color:#1f377f;">z</span>&nbsp;=&gt;&nbsp;y.g(x.f(z)),&nbsp;y.es.Concat(x.es).ToArray()); &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;createValid &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Apply(name,&nbsp;combineErrors) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Apply(dob,&nbsp;combineErrors) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Apply(address,&nbsp;combineErrors) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.SelectFailure(<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;(x.Item1(<span style="color:blue;">this</span>),&nbsp;x.Item2)); }</pre> </p> <p> This is where the <code>Validated</code> API is still awkward. You need to explicitly define a function to compose error cases. In this case, <code>combineErrors</code> composes the <a href="/2017/11/13/endomorphism-monoid">endomorphisms</a> and concatenates the collections. </p> <p> The final step 'runs' the endomorphism. <code>x.Item1</code> is the endomorphism, and <code>this</code> is the <code>Input</code> value being validated. 
Again, this isn't readable in C#, but it's where the endomorphism removes the invalid values from the input. </p> <h3 id="8aa59e20c1924002ae0d4e951df71619"> Tests <a href="#8aa59e20c1924002ae0d4e951df71619">#</a> </h3> <p> Since <a href="/2018/11/05/applicative-validation">applicative validation</a> is a functional technique, it's <a href="/2015/05/07/functional-design-is-intrinsically-testable">intrinsically testable</a>. </p> <p> Testing a successful validation is as easy as this: </p> <p> <pre>[Fact] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">ValidationSucceeds</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">now</span>&nbsp;=&nbsp;DateTime.Now; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">eightYearsAgo</span>&nbsp;=&nbsp;now.AddYears(-8); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">input</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;Input(<span style="color:#a31515;">&quot;Alice&quot;</span>,&nbsp;eightYearsAgo,&nbsp;<span style="color:#a31515;">&quot;x&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;input.TryParse(now); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">expected</span>&nbsp;=&nbsp;Validated.Succeed&lt;(Input,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;),&nbsp;ValidInput&gt;( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;ValidInput(<span style="color:#a31515;">&quot;Alice&quot;</span>,&nbsp;eightYearsAgo,&nbsp;<span style="color:#a31515;">&quot;x&quot;</span>)); 
&nbsp;&nbsp;&nbsp;&nbsp;Assert.Equal(expected,&nbsp;actual); }</pre> </p> <p> As is often the case, the error conditions are more numerous, or more interesting, if you will, than the success case, so this requires a parametrised test: </p> <p> <pre>[Theory,&nbsp;ClassData(<span style="color:blue;">typeof</span>(ValidationFailureTestCases))] <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="font-weight:bold;color:#74531f;">ValidationFails</span>( &nbsp;&nbsp;&nbsp;&nbsp;Input&nbsp;<span style="font-weight:bold;color:#1f377f;">input</span>, &nbsp;&nbsp;&nbsp;&nbsp;Input&nbsp;<span style="font-weight:bold;color:#1f377f;">expected</span>, &nbsp;&nbsp;&nbsp;&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">expectedMessages</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">now</span>&nbsp;=&nbsp;DateTime.Now; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;input.TryParse(now); &nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;(<span style="font-weight:bold;color:#1f377f;">inp</span>,&nbsp;<span style="font-weight:bold;color:#1f377f;">msgs</span>)&nbsp;=&nbsp;Assert.Single(actual.Match( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;onFailure:&nbsp;<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;x&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;onSuccess:&nbsp;<span style="font-weight:bold;color:#1f377f;">_</span>&nbsp;=&gt;&nbsp;Array.Empty&lt;(Input,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;)&gt;())); &nbsp;&nbsp;&nbsp;&nbsp;Assert.Equal(expected,&nbsp;inp); &nbsp;&nbsp;&nbsp;&nbsp;Assert.Equal(expectedMessages,&nbsp;msgs); }</pre> </p> <p> I also had to take <code>actual</code> apart in order 
to inspect its individual elements. When working with a pure and immutable data structure, I consider that a test smell. Rather, one should be able to use <a href="/2021/05/03/structural-equality-for-better-tests">structural equality for better tests</a>. Unfortunately, .NET collections don't have structural equality, so the test has to pull the message collection out of <code>actual</code> in order to verify it. </p> <p> Again, in F# or <a href="https://www.haskell.org/">Haskell</a> you don't have that problem, and the tests are much more succinct and robust. </p> <p> The test cases are implemented by this nested <code>ValidationFailureTestCases</code> class: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">ValidationFailureTestCases</span>&nbsp;: &nbsp;&nbsp;&nbsp;&nbsp;TheoryData&lt;Input,&nbsp;Input,&nbsp;IReadOnlyCollection&lt;<span style="color:blue;">string</span>&gt;&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">ValidationFailureTestCases</span>() &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Add(<span style="color:blue;">new</span>&nbsp;Input(<span style="color:blue;">null</span>,&nbsp;<span style="color:blue;">null</span>,&nbsp;<span style="color:blue;">null</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;Input(<span style="color:blue;">null</span>,&nbsp;<span style="color:blue;">null</span>,&nbsp;<span style="color:blue;">null</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:#a31515;">&quot;add1&nbsp;is&nbsp;required&quot;</span>,&nbsp;<span
style="color:#a31515;">&quot;dob&nbsp;is&nbsp;required&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;name&nbsp;is&nbsp;required&quot;</span>&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Add(<span style="color:blue;">new</span>&nbsp;Input(<span style="color:#a31515;">&quot;Bob&quot;</span>,&nbsp;<span style="color:blue;">null</span>,&nbsp;<span style="color:blue;">null</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;Input(<span style="color:blue;">null</span>,&nbsp;<span style="color:blue;">null</span>,&nbsp;<span style="color:blue;">null</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:#a31515;">&quot;add1&nbsp;is&nbsp;required&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;dob&nbsp;is&nbsp;required&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;no&nbsp;bob&nbsp;and&nbsp;toms&nbsp;allowed&quot;</span>&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Add(<span style="color:blue;">new</span>&nbsp;Input(<span style="color:#a31515;">&quot;Alice&quot;</span>,&nbsp;<span style="color:blue;">null</span>,&nbsp;<span style="color:blue;">null</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;Input(<span style="color:#a31515;">&quot;Alice&quot;</span>,&nbsp;<span style="color:blue;">null</span>,&nbsp;<span style="color:blue;">null</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:#a31515;">&quot;add1&nbsp;is&nbsp;required&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;dob&nbsp;is&nbsp;required&quot;</span>&nbsp;}); 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">eightYearsAgo</span>&nbsp;=&nbsp;DateTime.Now.AddYears(-8); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Add(<span style="color:blue;">new</span>&nbsp;Input(<span style="color:#a31515;">&quot;Alice&quot;</span>,&nbsp;eightYearsAgo,&nbsp;<span style="color:blue;">null</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;Input(<span style="color:#a31515;">&quot;Alice&quot;</span>,&nbsp;eightYearsAgo,&nbsp;<span style="color:blue;">null</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:#a31515;">&quot;add1&nbsp;is&nbsp;required&quot;</span>&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">fortyYearsAgo</span>&nbsp;=&nbsp;DateTime.Now.AddYears(-40); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Add(<span style="color:blue;">new</span>&nbsp;Input(<span style="color:#a31515;">&quot;Alice&quot;</span>,&nbsp;fortyYearsAgo,&nbsp;<span style="color:#a31515;">&quot;x&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;Input(<span style="color:#a31515;">&quot;Alice&quot;</span>,&nbsp;<span style="color:blue;">null</span>,&nbsp;<span style="color:#a31515;">&quot;x&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:#a31515;">&quot;get&nbsp;off&nbsp;my&nbsp;lawn&quot;</span>&nbsp;}); 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Add(<span style="color:blue;">new</span>&nbsp;Input(<span style="color:#a31515;">&quot;Tom&quot;</span>,&nbsp;fortyYearsAgo,&nbsp;<span style="color:#a31515;">&quot;x&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;Input(<span style="color:blue;">null</span>,&nbsp;<span style="color:blue;">null</span>,&nbsp;<span style="color:#a31515;">&quot;x&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:#a31515;">&quot;get&nbsp;off&nbsp;my&nbsp;lawn&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;no&nbsp;bob&nbsp;and&nbsp;toms&nbsp;allowed&quot;</span>&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Add(<span style="color:blue;">new</span>&nbsp;Input(<span style="color:#a31515;">&quot;Tom&quot;</span>,&nbsp;eightYearsAgo,&nbsp;<span style="color:#a31515;">&quot;x&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;Input(<span style="color:blue;">null</span>,&nbsp;eightYearsAgo,&nbsp;<span style="color:#a31515;">&quot;x&quot;</span>), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>[]&nbsp;{&nbsp;<span style="color:#a31515;">&quot;no&nbsp;bob&nbsp;and&nbsp;toms&nbsp;allowed&quot;</span>&nbsp;}); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} }</pre> </p> <p> All eight tests pass. 
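</p> <p> To make the structural-equality remark above concrete, here's a minimal sketch of how two equal-looking message collections compare in .NET. The <code>MessageComparison</code> class and its method names are hypothetical, introduced only for illustration; they aren't part of the code base discussed here. </p>

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, only meant to illustrate the difference.
public static class MessageComparison
{
    // Arrays inherit Object.Equals, which compares references, so two
    // distinct arrays with identical elements aren't considered equal.
    public static bool DefaultEquals(
        IReadOnlyCollection<string> x,
        IReadOnlyCollection<string> y)
    {
        return object.Equals(x, y);
    }

    // Enumerable.SequenceEqual compares element by element, in order.
    public static bool ElementwiseEquals(
        IReadOnlyCollection<string> x,
        IReadOnlyCollection<string> y)
    {
        return x.SequenceEqual(y);
    }
}
```

<p> Two separately created arrays with the same messages are equal according to <code>ElementwiseEquals</code>, but not according to <code>DefaultEquals</code>, which is why a tuple that contains such a collection can't simply be compared as a whole.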
</p> <h3 id="96596b52720f4a2688c216701f48d559"> Conclusion <a href="#96596b52720f4a2688c216701f48d559">#</a> </h3> <p> Once you know <a href="/2018/05/22/church-encoding">how to model sum types (discriminated unions) in C#</a>, translating something like applicative validation isn't difficult per se. It's a fairly automatic process. </p> <p> The code is hardly <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a> C#, and the type annotations are particularly annoying. Things work as expected though, and it isn't difficult to imagine how one could refactor some of this code to a more idiomatic form. </p> </div><hr> Domain Model first https://blog.ploeh.dk/2023/10/23/domain-model-first 2023-10-23T06:09:00+00:00 Mark Seemann <div id="post"> <p> <em>Persistence concerns second.</em> </p> <p> A few weeks ago, I published an article with the title <a href="/2023/09/18/do-orms-reduce-the-need-for-mapping">Do ORMs reduce the need for mapping?</a> Not surprisingly, this elicited more than one reaction. In this article, I'll respond to a particular kind of reaction. </p> <p> First, however, I'd like to reiterate the message of the previous article, which is almost revealed by the title: <em>Do <a href="https://en.wikipedia.org/wiki/Object%E2%80%93relational_mapping">object-relational mappers</a> (ORMs) reduce the need for mapping?</em> To which the article answers a tentative <em>no</em>. </p> <p> Do pay attention to the question. It doesn't ask whether ORMs are bad in general, or in all cases. It mainly analyses whether the use of ORMs reduces the need to write code that maps between different representations of data: From database to objects, from objects to <a href="https://en.wikipedia.org/wiki/Data_transfer_object">Data Transfer Objects</a> (DTOs), etc.
</p> <p> Granted, the article looks at a wider context, which I think is only a responsible thing to do. This could lead some readers to extrapolate from the article's specific focus to draw a wider conclusion. </p> <h3 id="951d538881fd4464a081ba3cd09162b0"> Encapsulation-first <a href="#951d538881fd4464a081ba3cd09162b0">#</a> </h3> <p> Most of the systems I work with aren't <a href="https://en.wikipedia.org/wiki/Create,_read,_update_and_delete">CRUD</a> systems, but rather systems where correctness is important. As an example, one of my clients does security-heavy digital infrastructure. Earlier in my career, I helped write web shops when these kinds of systems were new. Let me tell you: System owners were quite concerned that prices were correct, and that orders were taken and handled without error. </p> <p> In my book <a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a> I've tried to capture the essence of those kinds of system with the accompanying sample code, which pretends to be an online restaurant reservation system. While this may sound like a trivial CRUD system, <a href="/2020/01/27/the-maitre-d-kata">the business logic isn't entirely straightforward</a>. </p> <p> The point I was making in <a href="/2023/09/18/do-orms-reduce-the-need-for-mapping">the previous article</a> is that I consider <a href="/encapsulation-and-solid">encapsulation</a> to be more important than 'easy' persistence. I don't mind writing a bit of mapping code, since <a href="/2018/09/17/typing-is-not-a-programming-bottleneck">typing isn't a programming bottleneck</a> anyway. </p> <p> When prioritising encapsulation you should be able to make use of any design pattern, run-time assertion, as well as static type systems (if you're working in such a language) to guard correctness. 
You should be able to compose objects, define <a href="https://en.wikipedia.org/wiki/Value_object">Value Objects</a>, <a href="/2015/01/19/from-primitive-obsession-to-domain-modelling">wrap single values to avoid primitive obsession</a>, make constructors private, leverage polymorphism and effectively use any trick your language, <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiom</a>, and platform has on offer. If you want to use <a href="/2018/05/22/church-encoding">Church encoding</a> or the <a href="/2018/06/25/visitor-as-a-sum-type">Visitor pattern to represent a sum type</a>, you should be able to do that. </p> <p> When writing these kinds of systems, I start with the Domain Model without any thought of how to persist or retrieve data. </p> <p> In my experience, once the Domain Model starts to congeal, the persistence question tends to answer itself. There are usually one or two obvious ways to store and read data. </p> <p> Usually, a relational database isn't the most obvious choice. </p> <h3 id="1b562dd9077e4b27b912d782bdca14fb"> Persistence ignorance <a href="#1b562dd9077e4b27b912d782bdca14fb">#</a> </h3> <p> Write the best API you can to solve the problem, and then figure out how to store data. This is the allegedly elusive ideal of <em>persistence ignorance</em>, which turns out to be easier than rumour has it, once you cast a wider net than relational databases. </p> <p> It seems to me, though, that more than one person who has commented on my previous article has a hard time considering alternatives. And granted, I've consulted with clients who knew how to operate a particular database system, but nothing else, and who didn't want to consider adopting another technology. I do understand that such constraints are real, too. Thus, if you need to compromise for reasons such as these, you aren't doing anything wrong. You may still, however, try to get the best out of the situation.
</p> <p> One client of mine, for example, didn't want to operate anything other than <a href="https://en.wikipedia.org/wiki/Microsoft_SQL_Server">SQL Server</a>, which they already knew. For an asynchronous message-based system, then, we chose <a href="https://particular.net/nservicebus">NServiceBus</a> and configured it to use SQL Server as a persistent queue. </p> <p> Several comments still seem to assume that persistence must look a particular way. </p> <blockquote> <p> "So having a Order, OrderLine, Person, Address and City, all the rows needed to be loaded in advance, mapped to objects and references set to create the object graph to be able to, say, display shipping costs based on person's address." </p> <footer><cite><a href="/2023/09/18/do-orms-reduce-the-need-for-mapping#75ca5755d2a4445ba4836fc3f6922a5c">Vlad</a></cite></footer> </blockquote> <p> I don't wish to single out Vlad, but this is the first comment, and it captures the essence of the other comments well. I imagine that what he has in mind is something like this: </p> <p> <img src="/content/binary/orders-db-diagram.png" alt="Database diagram with five tables: Orders, OrderLines, Persons, Addresses, and Cities."> </p> <p> I've probably simplified things a bit too much. In a more realistic model, each person may have a collection of addresses, instead of just one. If so, it only strengthens Vlad's point, because that would imply even more tables to read. </p> <p> The unstated assumption, however, is that a fully <a href="https://en.wikipedia.org/wiki/Database_normalization">normalised</a> relational data model is the correct way to store such data. </p> <p> It's not. As I already mentioned, I spent the first four years of my programming career developing web shops. Orders were an integral part of that work. </p> <p> An order is a <em>document</em>. You don't want the customer's address to be updatable after the fact.
With a normalised relational model, if you change the customer's address row in the future, it's going to look as though the order went to that address instead of the address it actually went to. </p> <p> This also explains why the order lines should <em>not</em> point to the actual product entries in the product catalogue. Trust me, I almost shipped such a system once, when I was young and inexperienced. </p> <p> You should, at the very least, denormalise the database model. To a degree, this has already happened here, since the implied order has order lines that, I hope, are copies of the relevant product data, rather than linked to the product catalogue. </p> <p> Such insights, however, suggest that other storage mechanisms may be more appropriate. </p> <p> Putting that aside for a moment, though, how would a persistence-ignorant Domain Model look? </p> <p> I'd probably start with something like this: </p> <p> <pre><span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">order</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;Order( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;Person(<span style="color:#a31515;">&quot;Olive&quot;</span>,&nbsp;<span style="color:#a31515;">&quot;Hoyle&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;Address(<span style="color:#a31515;">&quot;Green&nbsp;Street&nbsp;15&quot;</span>,&nbsp;<span style="color:blue;">new</span>&nbsp;City(<span style="color:#a31515;">&quot;Oakville&quot;</span>),&nbsp;<span style="color:#a31515;">&quot;90125&quot;</span>)), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;OrderLine(123,&nbsp;1), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;OrderLine(456,&nbsp;3), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;OrderLine(789,&nbsp;2));</pre> </p> <p> (As <a
href="/ref/90125">the ZIP code</a> implies, I'm more of a <a href="https://en.wikipedia.org/wiki/Yes_(band)">Yes</a> fan, but still can't help but relish writing <code>new Order</code> in code.) </p> <p> With code like this, many a <a href="/ref/ddd">DDD</a>'er would start talking about Aggregate Roots, but that is, frankly, a concept that never made much sense to me. Rather, the above <code>order</code> is a <a href="https://en.wikipedia.org/wiki/Tree_(graph_theory)">tree</a> composed of immutable data structures. </p> <p> It trivially serializes to e.g. JSON: </p> <p> <pre>{ &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;customer&quot;</span>:&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;firstName&quot;</span>:&nbsp;<span style="color:#a31515;">&quot;Olive&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;lastName&quot;</span>:&nbsp;<span style="color:#a31515;">&quot;Hoyle&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;address&quot;</span>:&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;street&quot;</span>:&nbsp;<span style="color:#a31515;">&quot;Green&nbsp;Street&nbsp;15&quot;</span>, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;city&quot;</span>:&nbsp;{&nbsp;<span style="color:#2e75b6;">&quot;name&quot;</span>:&nbsp;<span style="color:#a31515;">&quot;Oakville&quot;</span>&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;zipCode&quot;</span>:&nbsp;<span style="color:#a31515;">&quot;90125&quot;</span> &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;}, &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;orderLines&quot;</span>:&nbsp;[ &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;sku&quot;</span>:&nbsp;123, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;quantity&quot;</span>:&nbsp;1 &nbsp;&nbsp;&nbsp;&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;{ 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;sku&quot;</span>:&nbsp;456, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;quantity&quot;</span>:&nbsp;3 &nbsp;&nbsp;&nbsp;&nbsp;}, &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;sku&quot;</span>:&nbsp;789, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#2e75b6;">&quot;quantity&quot;</span>:&nbsp;2 &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;] }</pre> </p> <p> All of this strongly suggests that this kind of data would be <em>much easier</em> to store and retrieve with a document database instead of a relational database. </p> <p> While that's just one example, it strikes me as a common theme when discussing persistence. For most online transaction processing systems, relational databases aren't necessarily the best fit. </p> <h3 id="3643959a545940f88001fb82297a286e"> The cart before the horse <a href="#3643959a545940f88001fb82297a286e">#</a> </h3> <p> <a href="/2023/09/18/do-orms-reduce-the-need-for-mapping#359a7bb0d2c14b8eb2dcb2ac6de4897d">Another comment</a> also starts with the premise that a data model is fundamentally relational. This one purports to model the relationship between sheikhs, their wives, and supercars. While I understand that the example is supposed to be tongue-in-cheek, the comment launches straight into problems with how to read and persist such data without relying on an ORM. </p> <p> Again, I don't intend to point fingers at anyone, but on the other hand, I can't suggest alternatives when a problem is presented like that. </p> <p> The whole point of developing a Domain Model <em>first</em> is to find a good way to represent the business problem in a way that encourages correctness and ease of use. </p> <p> If you present me with a relational model without describing the business goals you're trying to achieve, I don't have much to work with.
</p> <p> It may be that your business problem is truly relational, in which case an ORM probably is a good solution. I wrote as much in the previous article. </p> <p> In many cases, however, it looks to me as though programmers start with a relational model, only to proceed to complain that it's difficult to work with in object-oriented (or functional) code. </p> <p> If you, on the other hand, start with the business problem and figure out how to model it in code, the best way to store the data may suggest itself. Document databases are often a good fit, as are event stores. I've never had need for a graph database, but perhaps that would be a better fit for the <em>sheikh</em> domain suggested by <em>qfilip</em>. </p> <h3 id="8c32485e1ffd42f4ace9b83c98ae3184"> Reporting <a href="#8c32485e1ffd42f4ace9b83c98ae3184">#</a> </h3> <p> While I no longer feel that relational databases are particularly well-suited for online transaction processing, they are really good at one thing: Ad-hoc querying. Because it's such a rich and mature type of technology, and because <a href="https://en.wikipedia.org/wiki/SQL">SQL</a> is a powerful language, you can slice and dice data in multiple ways. </p> <p> This makes relational databases useful for reporting and other kinds of data extraction tasks. </p> <p> You may have business stakeholders who insist on a relational database for that particular reason. It may even be a good reason. </p> <p> If, however, the sole purpose of having a relational database is to support reporting, you may consider setting it up as a secondary system. Keep your online transactional data in another system, but regularly synchronize it to a relational database. If the only purpose of the relational database is to support reporting, you can treat it as a read-only system. This makes synchronization manageable. In general, you should avoid two-way synchronization if at all possible, but one-way synchronization is usually less of a problem. 
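</p> <p> As a rough sketch of what the one-way direction could look like, the following code flattens an order document into rows that a reporting table might hold. All names here (<code>OrderDocument</code>, <code>OrderLineDocument</code>, <code>ReportRow</code>, <code>ReportingProjection</code>) are hypothetical and only loosely based on the order example above; how the rows are written to the relational database is deliberately left out. </p>

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical document shapes, loosely based on the order example.
public sealed record OrderLineDocument(int Sku, int Quantity);

public sealed record OrderDocument(
    string OrderId,
    string CustomerName,
    string ZipCode,
    IReadOnlyCollection<OrderLineDocument> OrderLines);

// One flat row per order line, as a reporting table might want it.
public sealed record ReportRow(
    string OrderId,
    string CustomerName,
    string ZipCode,
    int Sku,
    int Quantity);

public static class ReportingProjection
{
    // A pure function from document to rows. Data flows in one direction
    // only, so the reporting side never needs to write anything back.
    public static IReadOnlyCollection<ReportRow> ToRows(OrderDocument order)
    {
        return order.OrderLines
            .Select(l => new ReportRow(
                order.OrderId,
                order.CustomerName,
                order.ZipCode,
                l.Sku,
                l.Quantity))
            .ToList();
    }
}
```

<p> Since such a projection has no side effects, a scheduled job could run it for each new document and overwrite the corresponding rows, which is one reason the one-way case tends to stay simple.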
</p> <p> Isn't that going to be more work, or more expensive? </p> <p> That question, again, has no single answer. Of course setting up and maintaining two systems is more work at the outset. On the other hand, there's a perpetual cost to be paid if you come up with the wrong architecture. If development is slow, and you have many bugs in production, or similar problems, the cause could be that you've chosen the wrong architecture and you're now fighting a losing battle. </p> <p> On the other hand, if you relegate relational databases exclusively to a reporting role, chances are that there's a lot of off-the-shelf software that can support your business users. Perhaps you can even hire a paratechnical power user to take care of that part of the system, freeing you to focus on the 'actual' system. </p> <p> All of this is only meant as inspiration. If you don't want to, or can't, do it that way, then this article doesn't help you. </p> <h3 id="1b0ce932168349f8abb9887f9ed219c8"> Conclusion <a href="#1b0ce932168349f8abb9887f9ed219c8">#</a> </h3> <p> When discussing databases, and particularly ORMs, some people approach the topic with the unspoken assumption that a relational database is the only option for storing data. Many programmers are so skilled in relational data design that they naturally use those skills when thinking new problems over. </p> <p> Sometimes problems are just relational in nature, and that's fine. More often than not, however, that's not the case. </p> <p> Try to model a business problem without concern for storage and see where that leads you. Test-driven development is often a great technique for such a task. Then, once you have a good API, consider how to store the data. The Domain Model that you develop in that way may naturally suggest a good way to store and retrieve the data. 
</p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="db4a9a94452a4cc7bf71989561dfd947"> <div class="comment-author"><a href="#db4a9a94452a4cc7bf71989561dfd947">qfilip</a></div> <div class="comment-content"> <q> <i> Again, I don't intend to point fingers at anyone, but on the other hand, I can't suggest alternatives when a problem is presented like that. </i> </q> <p> Heh, that's fair criticism, not finger pointing. I wanted to give a better example here, but I gave up halfway through writing it. You raised some good points. I'll have to rethink my approach on domain modeling further, before asking any meaningful questions. </p> <p> Years of working with EF-Core in a specific way got me... indoctrinated. Not all things are bad ofcourse, but I have missed the bigger picture in some areas, as far as I can tell. </p> <p> Thanks for dedicating so many articles to the subject. </p> </div> <div class="comment-date">2023-10-23 18:05 UTC</div> </div> </div> <hr> At the boundaries, static types are illusory https://blog.ploeh.dk/2023/10/16/at-the-boundaries-static-types-are-illusory 2023-10-16T08:07:00+00:00 Mark Seemann <div id="post"> <p> <em>Static types are useful, but have limitations.</em> </p> <p> Regular readers of this blog may have noticed that I like static type systems. Not the kind of static types offered by <a href="https://en.wikipedia.org/wiki/C_(programming_language)">C</a>, which strikes me as mostly being able to distinguish between way too many types of integers and pointers. <a href="/2020/01/20/algebraic-data-types-arent-numbers-on-steroids">A good type system is more than just numbers on steroids</a>. A type system like C#'s is <a href="/2019/12/16/zone-of-ceremony">workable, but verbose</a>.
The kind of type system I find most useful is when it has <a href="https://en.wikipedia.org/wiki/Algebraic_data_type">algebraic data types</a> and good type inference. The examples that I know best are the type systems of <a href="https://fsharp.org/">F#</a> and <a href="https://www.haskell.org/">Haskell</a>. </p> <p> As great as static type systems can be, they have limitations. <a href="https://www.hillelwayne.com/post/constructive/">Hillel Wayne has already outlined one kind of distinction</a>, but here I'd like to focus on another constraint. </p> <h3 id="ab0d595d35304a9ea9302197b4f796d3"> Application boundaries <a href="#ab0d595d35304a9ea9302197b4f796d3">#</a> </h3> <p> Any piece of software interacts with the 'rest of the world'; effectively everything outside its own process. Sometimes (but increasingly rarely) such interaction is exclusively by way of some user interface, but more and more, an application interacts with other software in some way. </p> <p> <img src="/content/binary/application-boundary.png" alt="An application depicted as an opaque disk with a circle emphasising its boundary. Also included are arrows in and out, with some common communication artefacts: Messages, HTTP traffic, and a database."> </p> <p> Here I've drawn the application as an opaque disc in order to emphasise that what happens inside the process isn't pertinent to the following discussion. The diagram also includes some common kinds of traffic. Many applications rely on some kind of database or send messages (email, SMS, Commands, Events, etc.). We can think of such traffic as the interactions that the application initiates, but many systems also receive and react to incoming data: HTTP traffic or messages that arrive on a queue, and so on. </p> <p> When I talk about application <em>boundaries</em>, I have in mind what goes on in that interface layer.
</p> <p> An application can talk to the outside world in multiple ways: It may read or write a file, access shared memory, call operating-system APIs, send or receive network packets, etc. Usually you get to program against higher-level abstractions, but ultimately the application is dealing with various binary protocols. </p> <h3 id="4991578e222e408bb08e261dce6454f1"> Protocols <a href="#4991578e222e408bb08e261dce6454f1">#</a> </h3> <p> The bottom line is that at a sufficiently low level of abstraction, what goes in and out of your application has no static type stronger than an array of bytes. </p> <p> You may counter-argue that higher-level APIs deal with that to present the input and output as static types. When you interact with a text file, you'll typically deal with a list of strings: One for each line in the file. Or you may manipulate <a href="https://en.wikipedia.org/wiki/JSON">JSON</a>, <a href="https://en.wikipedia.org/wiki/XML">XML</a>, <a href="https://en.wikipedia.org/wiki/Protocol_Buffers">Protocol Buffers</a>, or another wire format using a serializer/deserializer API. Sometimes, as is often the case with <a href="https://en.wikipedia.org/wiki/Comma-separated_values">CSV</a>, you may need to write a very simple parser yourself. Or perhaps <a href="/2023/08/28/a-first-crack-at-the-args-kata">something slightly more involved</a>. </p> <p> To demonstrate what I mean, there's no shortage of APIs like <a href="https://learn.microsoft.com/dotnet/api/system.text.json.jsonserializer.deserialize">JsonSerializer.Deserialize</a>, which enables you to write <a href="/2022/05/02/at-the-boundaries-applications-arent-functional">code like this</a>: </p> <p> <pre><span style="color:blue;">let</span>&nbsp;n&nbsp;=&nbsp;JsonSerializer.Deserialize&lt;Name&gt;&nbsp;(json,&nbsp;opts)</pre> </p> <p> and you may say: <em><code>n</code> is statically typed, and its type is <code>Name</code>! Hooray!</em> But you do realise that that's only half a truth, don't you?
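</p> <p> As a small illustration, here's a C# sketch of what the deserializer does when the input isn't quite what the type suggests. The <code>Name</code> class below is a hypothetical stand-in, since the definition used in the snippet above isn't shown, and <code>DeserializationDemo</code> is likewise an invented name. </p>

```csharp
using System.Text.Json;

// Hypothetical stand-in for the Name type; the original definition
// isn't shown in the article.
public sealed class Name
{
    public string? FirstName { get; set; }
    public string? LastName { get; set; }
}

public static class DeserializationDemo
{
    // Returns null if the document is the JSON literal null, and throws
    // JsonException if the text isn't valid JSON. Properties that are
    // missing from the document are silently left as null.
    public static Name? Parse(string json)
    {
        var opts = new JsonSerializerOptions
        {
            PropertyNameCaseInsensitive = true
        };
        return JsonSerializer.Deserialize<Name>(json, opts);
    }
}
```

<p> The compiler is satisfied that the result is a <code>Name</code>, but whether <code>LastName</code> actually holds a value depends entirely on what was in the document.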
</p> <p> An interaction at the application boundary is expected to follow some kind of <em>protocol</em>. This is even true if you're reading a text file. In these modern times, you may expect a text file to contain <a href="https://unicode.org/">Unicode</a>, but have you ever received a file from a legacy system and had to deal with its <a href="https://en.wikipedia.org/wiki/EBCDIC">EBCDIC</a> encoding? Or an <a href="https://en.wikipedia.org/wiki/ASCII">ASCII</a> file with a <a href="https://en.wikipedia.org/wiki/Code_page">code page</a> different from the one you expect? Or even just a file written on a Unix system, if you're on Windows, or vice versa? </p> <p> In order to correctly interpret or transmit such data, you need to follow a <em>protocol</em>. </p> <p> Such a protocol can be low-level, as the character-encoding examples I just listed, but it may also be much more high-level. You may, for example, consider an HTTP request like this: </p> <p> <pre>POST /restaurants/90125/reservations?sig=aco7VV%2Bh5sA3RBtrN8zI8Y9kLKGC60Gm3SioZGosXVE%3D HTTP/1.1 Content-Type: application/json { &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;at&quot;</span>:&nbsp;<span style="color:#a31515;">&quot;2021-12-08&nbsp;20:30&quot;</span>, &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;email&quot;</span>:&nbsp;<span style="color:#a31515;">&quot;snomob@example.com&quot;</span>, &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;name&quot;</span>:&nbsp;<span style="color:#a31515;">&quot;Snow&nbsp;Moe&nbsp;Beal&quot;</span>, &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;quantity&quot;</span>:&nbsp;1 }</pre> </p> <p> Such an interaction implies a protocol. Part of such a protocol is that the HTTP request's body is a valid JSON document, that it has an <code>at</code> property, that that property encodes a valid date and time, that <code>quantity</code> is a natural number, that <code>email</code> <a href="/2023/07/03/validating-or-verifying-emails">is present</a>, and so on. 
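</p> <p> As a rough sketch of what checking such a protocol by hand could look like, the following C# uses the JSON DOM (<code>JsonDocument</code>) to test a few of the rules just listed. The <code>ReservationProtocol</code> class, its <code>Check</code> method, and the error messages are hypothetical names made up for this illustration; they aren't part of the restaurant code base. </p>

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;

// Hypothetical helper, for illustration only.
public static class ReservationProtocol
{
    // Returns the protocol violations found in the request body.
    // An empty list means that the body follows the (assumed) protocol.
    public static IReadOnlyList<string> Check(string body)
    {
        var errors = new List<string>();

        JsonDocument doc;
        try
        {
            doc = JsonDocument.Parse(body);
        }
        catch (JsonException)
        {
            errors.Add("The body is not a valid JSON document.");
            return errors;
        }

        using (doc)
        {
            var root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Object)
            {
                errors.Add("The body is not a JSON object.");
                return errors;
            }

            // 'at' must exist, be a string, and encode a date and time.
            if (!root.TryGetProperty("at", out var at)
                || at.ValueKind != JsonValueKind.String
                || !DateTime.TryParse(
                    at.GetString(),
                    CultureInfo.InvariantCulture,
                    DateTimeStyles.None,
                    out _))
                errors.Add("'at' must be a valid date and time.");

            // 'email' must be present as a string.
            if (!root.TryGetProperty("email", out var email)
                || email.ValueKind != JsonValueKind.String)
                errors.Add("'email' must be present.");

            // 'quantity' must be an integer greater than zero.
            if (!root.TryGetProperty("quantity", out var quantity)
                || quantity.ValueKind != JsonValueKind.Number
                || !quantity.TryGetInt32(out var q)
                || q < 1)
                errors.Add("'quantity' must be a natural number.");
        }

        return errors;
    }
}
```

<p> Under these assumptions, the request body shown above yields an empty list, while a body such as <code>{"quantity": 0}</code> yields three violations. 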
</p> <p> You can model the expected input as a <a href="https://en.wikipedia.org/wiki/Data_transfer_object">Data Transfer Object</a> (DTO): </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">sealed</span>&nbsp;<span style="color:blue;">class</span>&nbsp;<span style="color:#2b91af;">ReservationDto</span> { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">string</span>?&nbsp;At&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;<span style="color:blue;">set</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">string</span>?&nbsp;Email&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;<span style="color:blue;">set</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">string</span>?&nbsp;Name&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;<span style="color:blue;">set</span>;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:blue;">int</span>&nbsp;Quantity&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;<span style="color:blue;">set</span>;&nbsp;} }</pre> </p> <p> and even set up your 'protocol handlers' (here, an ASP.NET Core <a href="https://learn.microsoft.com/aspnet/core/mvc/controllers/actions">action method</a>) to use such a DTO: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;Task&lt;ActionResult&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">Post</span>(ReservationDto&nbsp;<span style="font-weight:bold;color:#1f377f;">dto</span>)</pre> </p> <p> While this may look statically typed, it assumes a particular protocol. What happens when the bytes on the wire don't follow the protocol? </p> <p> Well, we've already been <a href="/2022/08/15/aspnet-validation-revisited">around that block</a> <a href="/2022/07/25/an-applicative-reservation-validation-example-in-c">more than once</a>. 
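</p> <p> To make the question concrete, here's a hedged sketch of how <code>System.Text.Json</code> treats input that deviates from the protocol. The DTO is repeated so that the example is self-contained; <code>BoundaryDemo</code>, <code>Describe</code>, and the case-insensitive options are assumptions for illustration only. </p>

```csharp
using System;
using System.Text.Json;

public sealed class ReservationDto
{
    public string? At { get; set; }
    public string? Email { get; set; }
    public string? Name { get; set; }
    public int Quantity { get; set; }
}

// Hypothetical demo class, for illustration only.
public static class BoundaryDemo
{
    private static readonly JsonSerializerOptions opts =
        new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

    // Describes what the serializer makes of the supplied JSON text.
    public static string Describe(string json)
    {
        try
        {
            var dto = JsonSerializer.Deserialize<ReservationDto>(json, opts);
            if (dto is null)
                return "null";
            return $"At={dto.At ?? "null"}, Quantity={dto.Quantity}";
        }
        catch (JsonException)
        {
            // Malformed JSON, or a value that can't be converted
            // to the declared property type.
            return "JsonException";
        }
    }
}
```

<p> With this sketch, <code>Describe("{}")</code> returns <code>At=null, Quantity=0</code> without any error, even though zero isn't a natural number, while <code>Describe("{\"quantity\":\"two\"}")</code> reports a <code>JsonException</code>. The static type is satisfied in the first case, although the protocol is not. 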
</p> <p> The point is that there's always an implied protocol at the application boundary, and you can choose to model it more or less explicitly. </p> <h3 id="41f3b4ad7a4b4429bba3f619c2af55d1"> Types as short-hands for protocols <a href="#41f3b4ad7a4b4429bba3f619c2af55d1">#</a> </h3> <p> In the above example, I've relied on <em>some</em> static typing to deal with the problem. After all, I did define a DTO to model the expected shape of input. I could have chosen other alternatives: Perhaps I could have used a JSON parser to explicitly <a href="https://learn.microsoft.com/dotnet/standard/serialization/system-text-json/use-dom">use the JSON DOM</a>, or even more low-level <a href="https://learn.microsoft.com/dotnet/standard/serialization/system-text-json/use-utf8jsonreader">used Utf8JsonReader</a>. Ultimately, I could have decided to write my own JSON parser. </p> <p> I'd rarely (or never?) choose to implement a JSON parser from scratch, so that's not what I'm advocating. Rather, my point is that you can leverage existing APIs to deal with input and output, and some of those APIs offer a convincing illusion that what happens at the boundary is statically typed. </p> <p> This illusion is partly API-specific, and partly language-specific. In .NET, for example, <code>JsonSerializer.Deserialize</code> <em>looks</em> like it'll always deserialize <em>any</em> JSON string into the desired model. Obviously, that's a lie, because the function will throw an exception if the operation is impossible (i.e. when the input is malformed). In .NET (and many other languages or platforms), you can't tell from an API's type what the failure modes might be. In contrast, aeson's <a href="https://hackage.haskell.org/package/aeson/docs/Data-Aeson.html#v:fromJSON">fromJSON</a> function returns a type that explicitly indicates that deserialization may fail. 
Even in Haskell, however, this is mostly an <a href="/2015/08/03/idiomatic-or-idiosyncratic">idiomatic</a> convention, because Haskell also 'supports' exceptions. </p> <p> At the boundary, a static type can be a useful shorthand for a protocol. You declare a static type (e.g. a DTO) and rely on built-in machinery to handle malformed input. You give up some fine-grained control in exchange for a more declarative model. </p> <p> I often choose to do that because I find such a trade-off beneficial, but I'm under no illusion that my static types fully model what goes 'on the wire'. </p> <h3 id="792212e79feb46889e71a6c08dedb88e"> Reversed roles <a href="#792212e79feb46889e71a6c08dedb88e">#</a> </h3> <p> So far, I've mostly discussed input validation. <a href="/2022/08/22/can-types-replace-validation">Can types replace validation?</a> No, but they can make most common validation scenarios easier. What happens when you return data? </p> <p> You may decide to return a statically typed value. A serializer can faithfully convert such a value to a proper wire format (JSON, XML, or similar). The recipient may not care about that type. After all, you may return a Haskell value, but the system receiving the data is written in <a href="https://www.python.org/">Python</a>. Or you return a C# object, but the recipient is <a href="https://en.wikipedia.org/wiki/JavaScript">JavaScript</a>. </p> <p> Should we conclude, then, that there's no reason to model return data with static types? Not at all, because by modelling output with static types, you are being <a href="https://en.wikipedia.org/wiki/Robustness_principle">conservative with what you send</a>. Since static types are typically more rigid than 'just code', there may be corner cases that a type can't easily express. While this may pose a problem when it comes to input, it's only a benefit when it comes to output. 
This means that you're <a href="/2021/11/29/postels-law-as-a-profunctor">narrowing the output funnel</a> and thus making your system easier to work with. </p> <p> <img src="/content/binary/liberal-conservative-at-boundary.png" alt="Funnels labelled 'liberal' and 'conservative' to the left of a line indicating an application boundary."> </p> <p> Now consider another role-reversal: When your application <em>initiates</em> an interaction, it starts by producing output and receives input as a result. This includes any database interaction. When you create, update, or delete a row in a database, you <em>send</em> data, and receive a response. </p> <p> Should you not consider <a href="https://en.wikipedia.org/wiki/Robustness_principle">Postel's law</a> in that case? </p> <p> <img src="/content/binary/conservative-liberal-at-boundary.png" alt="Funnels labelled 'conservative' and 'liberal' to the right of a line indicating an application boundary."> </p> <p> Most people don't, particularly if they rely on <a href="https://en.wikipedia.org/wiki/Object%E2%80%93relational_mapping">object-relational mappers</a> (ORMs). After all, if you have a static type (class) that models a database row, what's the harm using that when updating the database? </p> <p> Probably none. After all, based on what I've just written, using a static type is a good way to be conservative with what you send. 
Here's an example using <a href="https://en.wikipedia.org/wiki/Entity_Framework">Entity Framework</a>: </p> <p> <pre><span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">db</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;RestaurantsContext(ConnectionString); <span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">dbReservation</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;Reservation { &nbsp;&nbsp;&nbsp;&nbsp;PublicId&nbsp;=&nbsp;reservation.Id, &nbsp;&nbsp;&nbsp;&nbsp;RestaurantId&nbsp;=&nbsp;restaurantId, &nbsp;&nbsp;&nbsp;&nbsp;At&nbsp;=&nbsp;reservation.At, &nbsp;&nbsp;&nbsp;&nbsp;Name&nbsp;=&nbsp;reservation.Name.ToString(), &nbsp;&nbsp;&nbsp;&nbsp;Email&nbsp;=&nbsp;reservation.Email.ToString(), &nbsp;&nbsp;&nbsp;&nbsp;Quantity&nbsp;=&nbsp;reservation.Quantity }; <span style="color:blue;">await</span>&nbsp;db.Reservations.AddAsync(dbReservation); <span style="color:blue;">await</span>&nbsp;db.SaveChangesAsync();</pre> </p> <p> Here we send a statically typed <code>Reservation</code> 'Entity' to the database, and since we use a static type, we're being conservative with what we send. That's only good. </p> <p> What happens when we query a database? 
Here's a typical example: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;Task&lt;Restaurants.Reservation?&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">ReadReservation</span>(<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">restaurantId</span>,&nbsp;Guid&nbsp;<span style="font-weight:bold;color:#1f377f;">id</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">db</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;RestaurantsContext(ConnectionString); &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">r</span>&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;db.Reservations.FirstOrDefaultAsync(<span style="font-weight:bold;color:#1f377f;">x</span>&nbsp;=&gt;&nbsp;x.PublicId&nbsp;==&nbsp;id); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(r&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">null</span>; &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;Restaurants.Reservation( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;r.PublicId, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;r.At, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;Email(r.Email), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;Name(r.Name), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;r.Quantity); }</pre> </p> <p> Here I read a database row <code>r</code> and unquestioningly translate it to my domain model. Should I do that? 
What if the database schema has diverged from my application code? </p> <p> I suspect that much grief and trouble with relational databases, and particularly with ORMs, stem from the illusion that an ORM 'Entity' is a statically-typed view of the database schema. Typically, you can either use an ORM like Entity Framework in a code-first or a database-first fashion, but regardless of what you choose, you have two competing 'truths' about the database: The database schema and the Entity Classes. </p> <p> You need to be disciplined to keep those two views in synch, and I'm not asserting that it's impossible. I'm only suggesting that it may pay to explicitly acknowledge that static types may not represent any truth about what's actually on the other side of the application boundary. </p> <h3 id="1ab46f8e48a74b94ad9aa92cce2d915f"> Types are an illusion <a href="#1ab46f8e48a74b94ad9aa92cce2d915f">#</a> </h3> <p> Given that I usually find myself firmly in the static-types-are-great camp, it may seem odd that I now spend an entire article trashing them. Perhaps it looks as though I've had a revelation and made an about-face, but that's not the case. Rather, I'm fond of making the implicit explicit. This often helps improve understanding, because it helps delineate conceptual boundaries. </p> <p> This, too, is the case here. <a href="https://en.wikipedia.org/wiki/All_models_are_wrong">All models are wrong, but some models are useful</a>. So are static types, I believe. </p> <p> A static type system is a useful tool that enables you to model how your application should behave. The types don't really exist at run time. Even though .NET code (just to point out an example) compiles to <a href="https://en.wikipedia.org/wiki/Common_Intermediate_Language">a binary representation that includes type information</a>, once it runs, it <a href="https://en.wikipedia.org/wiki/Just-in-time_compilation">JITs</a> to machine code. 
In the end, it's just registers and memory addresses, or, if you want to be even more nihilistic, electrons moving around on a circuit board. </p> <p> Even at a higher level of abstraction, you may say: <em>But at least, a static type system can help you encapsulate rules and assumptions.</em> In a language like C#, for example, consider a <a href="https://www.hillelwayne.com/post/constructive/">predicative type</a> like <a href="/2022/08/22/can-types-replace-validation">this NaturalNumber</a> class: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">struct</span>&nbsp;<span style="color:#2b91af;">NaturalNumber</span>&nbsp;:&nbsp;IEquatable&lt;NaturalNumber&gt; { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:blue;">int</span>&nbsp;i; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">NaturalNumber</span>(<span style="color:blue;">int</span>&nbsp;candidate) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">if</span>&nbsp;(candidate&nbsp;&lt;&nbsp;1) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;ArgumentOutOfRangeException( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;nameof(candidate), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#a31515;">$&quot;The&nbsp;value&nbsp;must&nbsp;be&nbsp;a&nbsp;positive&nbsp;(non-zero)&nbsp;number,&nbsp;but&nbsp;was:&nbsp;</span>{candidate}<span style="color:#a31515;">.&quot;</span>); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.i&nbsp;=&nbsp;candidate; &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span 
style="color:green;">//&nbsp;Various&nbsp;other&nbsp;members&nbsp;follow...</span></pre> </p> <p> Such a class effectively protects the invariant that a <a href="https://en.wikipedia.org/wiki/Natural_number">natural number</a> is always a positive integer. Yes, that works well until someone does this: </p> <p> <pre><span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">n</span>&nbsp;=&nbsp;(NaturalNumber)FormatterServices.GetUninitializedObject(<span style="color:blue;">typeof</span>(NaturalNumber));</pre> </p> <p> This <code>n</code> value has the internal value <code>0</code>. Yes, <a href="https://learn.microsoft.com/dotnet/api/system.runtime.serialization.formatterservices.getuninitializedobject">FormatterServices.GetUninitializedObject</a> bypasses the constructor. This thing is evil, but it exists, and at least in the current discussion serves to illustrate the point that types are illusions. </p> <p> This isn't just a flaw in C#. Other languages have similar backdoors. One of the most famously statically-typed languages, Haskell, comes with <a href="https://hackage.haskell.org/package/base/docs/System-IO-Unsafe.html#v:unsafePerformIO">unsafePerformIO</a>, which enables you to pretend that nothing untoward is going on even if you've written some impure code. </p> <p> You may (and should) institute policies to not use such backdoors in your normal code bases. You don't need them. </p> <h3 id="de28c90a44e14b299f6eb30c09b08821"> Types are useful models <a href="#de28c90a44e14b299f6eb30c09b08821">#</a> </h3> <p> All this may seem like an argument that types are useless. That would, however, be to draw the wrong conclusion. Types don't exist at run time to the same degree that Python objects or JavaScript functions don't exist at run time. 
Any language (except <a href="https://en.wikipedia.org/wiki/Assembly_language">assembler</a>) is an abstraction: A way to model computer instructions so that programming becomes easier (one would hope, <a href="/2023/09/11/a-first-stab-at-the-brainfuck-kata">but then...</a>). This is true even for C, as low-level and detail-oriented as it may seem. </p> <p> If you grant that high-level programming languages (i.e. any language that is <em>not</em> machine code or assembler) are useful, you must also grant that you can't rule out the usefulness of types. Notice that this argument is one of logic, rather than of preference. The only claim I make here is that programming is based on useful illusions. That the abstractions are illusions doesn't prevent them from being useful. </p> <p> In statically typed languages, we effectively need to pretend that the type system is good enough, strong enough, generally trustworthy enough that it's safe to ignore the underlying reality. We work with, if you will, a provisional truth that serves as a user interface to the computer. </p> <p> Even though a computer program eventually executes on a processor where types don't exist, a good compiler can still check that our models look sensible. We say that it <em>type-checks</em>. I find that indispensable when modelling the internal behaviour of a program. Even in a large code base, a compiler can type-check whether all the various components look like they may compose correctly. That a program compiles is no guarantee that it works correctly, but if it doesn't type-check, it's strong evidence that the code's model is <em>internally</em> inconsistent. </p> <p> In other words, that a statically-typed program type-checks is a necessary, but not a sufficient condition for it to work. </p> <p> This holds as long as we're considering program internals. Some language platforms allow us to take this notion further, because we can link software components together and still type-check them. 
The .NET platform is a good example of this, since the IL code retains type information. This means that the C#, F#, or <a href="https://en.wikipedia.org/wiki/Visual_Basic_(.NET)">Visual Basic .NET</a> compiler can type-check your code against the APIs exposed by external libraries. </p> <p> On the other hand, you can't extend that line of reasoning to the boundary of an application. What happens at the boundary is ultimately untyped. </p> <p> Are types useless at the boundary, then? Not at all. <a href="https://lexi-lambda.github.io/blog/2020/01/19/no-dynamic-type-systems-are-not-inherently-more-open/">Alexis King has already dealt with this topic better than I could</a>, but the point is that types remain an effective way to capture the result of <a href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/">parsing input</a>. You can view receiving, handling, parsing, or validating input as implementing a protocol, as I've already discussed above. Such protocols are application-specific or domain-specific rather than general-purpose protocols, but they are still protocols. </p> <p> When I decide to write <a href="/2022/07/25/an-applicative-reservation-validation-example-in-c">input validation for my restaurant sample code base as a set of composable parsers</a>, I'm implementing a protocol. My starting point isn't raw bits, but rather a loose static type: A DTO. In other cases, I may decide to use a different level of abstraction. </p> <p> One of the (many) reasons I have for <a href="/2023/09/18/do-orms-reduce-the-need-for-mapping">finding ORMs unhelpful</a> is exactly because they insist on an illusion past its usefulness. 
Rather, I prefer implementing the protocol that talks to my database with a lower-level API, such as ADO.NET: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;Reservation&nbsp;<span style="color:#74531f;">ReadReservationRow</span>(SqlDataReader&nbsp;<span style="font-weight:bold;color:#1f377f;">rdr</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;Reservation( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(Guid)rdr[<span style="color:#a31515;">&quot;PublicId&quot;</span>], &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(DateTime)rdr[<span style="color:#a31515;">&quot;At&quot;</span>], &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;Email((<span style="color:blue;">string</span>)rdr[<span style="color:#a31515;">&quot;Email&quot;</span>]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;Name((<span style="color:blue;">string</span>)rdr[<span style="color:#a31515;">&quot;Name&quot;</span>]), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;NaturalNumber((<span style="color:blue;">int</span>)rdr[<span style="color:#a31515;">&quot;Quantity&quot;</span>])); }</pre> </p> <p> This actually isn't a particularly good protocol implementation, because it fails to take Postel's law into account. Really, this code should be a <a href="https://martinfowler.com/bliki/TolerantReader.html">Tolerant Reader</a>. In practice, not that much input contravariance is possible, but perhaps, at least, this code ought to gracefully handle a missing <code>Name</code> field. </p> <p> The point of this particular example isn't that it's perfect, because it's not, but rather that it's possible to drop down to a lower level of abstraction, and sometimes, this may be a more honest representation of reality. 
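</p> <p> As a hedged illustration of the Tolerant Reader idea, a helper along these lines could look up a column by name and fall back to a default when the column is absent or contains <code>NULL</code>. The <code>TolerantReader</code> class and <code>ReadStringOrDefault</code> method are hypothetical, and the sketch targets <code>IDataRecord</code>, which <code>SqlDataReader</code> implements. </p>

```csharp
using System;
using System.Data;

// Hypothetical helper, for illustration only.
public static class TolerantReader
{
    // Returns the string in the named column, or the fallback if the
    // column is absent from the result set or contains NULL.
    // A column that exists but isn't a string still causes GetString
    // to throw, so this is only tolerant of missing data.
    public static string ReadStringOrDefault(
        IDataRecord record, string column, string fallback)
    {
        for (var i = 0; i < record.FieldCount; i++)
        {
            if (string.Equals(
                    record.GetName(i),
                    column,
                    StringComparison.OrdinalIgnoreCase))
                return record.IsDBNull(i) ? fallback : record.GetString(i);
        }

        return fallback;
    }
}
```

<p> With such a helper, the <code>Name</code> argument above might be read as <code>ReadStringOrDefault(rdr, "Name", "")</code>; whether an empty name is an acceptable fallback is a domain decision that this sketch doesn't settle. 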
</p> <h3 id="ce2a1d57f63e4f39a28e801fd23164cf"> Conclusion <a href="#ce2a1d57f63e4f39a28e801fd23164cf">#</a> </h3> <p> It may be helpful to acknowledge that static types don't really exist. Even so, internally in a code base, a static type system can be a powerful tool. A good type system enables a compiler to check whether various parts of your code look internally consistent. Are you calling a procedure with the correct arguments? Have you implemented all methods defined by an interface? Have you handled all cases defined by a <a href="https://en.wikipedia.org/wiki/Tagged_union">sum type</a>? Have you correctly initialized an object? </p> <p> As useful as type systems are for this kind of work, you should also be aware of their limitations. A compiler can check whether a code base's internal model makes sense, but it can't verify what happens at run time. </p> <p> As long as one part of your code base sends data to another part of your code base, your type system can still perform a helpful sanity check, but for data that enters (or leaves) your application at run time, bets are off. You may attempt to model what input <em>should</em> look like, and it may even be useful to do that, but it's important to acknowledge that reality may not look like your model. </p> <p> You can write statically-typed, composable parsers. Some of them are quite elegant, but the good ones explicitly model that parsing of input is error-prone. When input is well-formed, the result may be a nicely <a href="/2022/10/24/encapsulation-in-functional-programming">encapsulated</a>, statically-typed value, but when it's malformed, the result is one or more error values. </p> <p> Perhaps the most important message is that databases, other web services, file systems, etc. involve input and output, too. Even if <em>you</em> write code that initiates a database query, or a web service request, should you implicitly trust the data that comes back? 
</p> <p> This question of trust doesn't have to imply security concerns. Rather, systems evolve and errors happen. Every time you interact with an external system, there's a risk that it has become misaligned with yours. Static types can't protect you against that. </p> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. What's a sandwich? https://blog.ploeh.dk/2023/10/09/whats-a-sandwich 2023-10-09T20:20:00+00:00 Mark Seemann <div id="post"> <p> <em>Ultimately, it's more about programming than food.</em> </p> <p> The <a href="https://en.wikipedia.org/wiki/Sandwich">Sandwich</a> was named after <a href="https://en.wikipedia.org/wiki/John_Montagu,_4th_Earl_of_Sandwich">John Montagu, 4th Earl of Sandwich</a> because of his fondness for this kind of food. As popular story has it, he found it practical because it enabled him to eat without greasing the cards he often played. </p> <p> A few years ago, a corner of the internet erupted in good-natured discussion about exactly what constitutes a sandwich. For instance, is the Danish <a href="https://en.wikipedia.org/wiki/Sm%C3%B8rrebr%C3%B8d">smørrebrød</a> a sandwich? It comes in two incarnations: <em>Højtbelagt</em>, the luxury version which is only consumable with knife and fork and the more modest, everyday <em>håndmad</em> (literally <em>hand food</em>), which, while open-faced, can usually be consumed without cutlery. </p> <p> <img src="/content/binary/bjoernekaelderen-hoejtbelagt.jpg" alt="A picture of elaborate Danish smørrebrød."> </p> <p> If we consider the 4th Earl of Sandwich's motivation as a yardstick, then the depicted <em>højtbelagte smørrebrød</em> is hardly a sandwich, while I believe a case can be made that a <em>håndmad</em> is: </p> <p> <img src="/content/binary/haandmadder.jpg" alt="Two håndmadder and half of a sliced apple."> </p> <p> Obviously, you need a different grip on a <em>håndmad</em> than on a sandwich. 
The bread (<em>rugbrød</em>) is much denser than wheat bread, and structurally more rigid. You eat it with your thumb and index finger on each side, and remaining fingers supporting it from below. The bottom line is this: A single piece of bread with something on top can also solve the original problem. </p> <p> What if we go in the other direction? How about a combo consisting of bread, meat, bread, meat, and bread? I believe that I've seen burgers like that. Can you eat that with one hand? I think that this depends more on how greasy and overfilled it is, than on the structure. </p> <p> What if you had five layers of meat and six layers of bread? This is unlikely to work with traditional Western leavened bread which, being a foam, will lose structural integrity when cut too thin. Imagining other kinds of bread, though, and thin slices of meat (or other 'content'), I don't see why it couldn't work. </p> <h3 id="00d495b0703a45a98f36607e99799c62"> FP sandwiches <a href="#00d495b0703a45a98f36607e99799c62">#</a> </h3> <p> As regular readers may have picked up over the years, I do like food, but this is, after all, a programming blog. </p> <p> A few years ago I presented a functional-programming design pattern named <a href="/2020/03/02/impureim-sandwich">Impureim sandwich</a>. It argues that it's often beneficial to structure a code base according to the <a href="https://www.destroyallsoftware.com/screencasts/catalog/functional-core-imperative-shell">functional core, imperative shell</a> architecture. </p> <p> The idea, in a nutshell, is that at every entry point (<code>Main</code> method, message handler, Controller action, etcetera) you first perform all impure actions necessary to collect input data for a <a href="https://en.wikipedia.org/wiki/Pure_function">pure function</a>, then you call that pure function (which may be composed by many smaller functions), and finally you perform one or more impure actions based on the function's return value. 
That's the <a href="/2020/03/02/impureim-sandwich">impure-pure-impure sandwich</a>. </p> <p> My experience with this pattern is that it's surprisingly often possible to apply it. Not always, but more often than you think. </p> <p> Sometimes, however, it demands a looser interpretation of the word <em>sandwich</em>. </p> <p> Even the examples from <a href="/2020/03/02/impureim-sandwich">the article</a> aren't standard sandwiches, once you dissect them. Consider, first, the <a href="https://www.haskell.org/">Haskell</a> example, here recoloured: </p> <p> <pre><span style="color:#600277;">tryAcceptComposition</span>&nbsp;::&nbsp;<span style="color:blue;">Reservation</span>&nbsp;<span style="color:blue;">-&gt;</span>&nbsp;IO&nbsp;(Maybe&nbsp;Int) tryAcceptComposition&nbsp;reservation&nbsp;<span style="color:#666666;">=</span>&nbsp;runMaybeT&nbsp;<span style="color:#666666;">$</span> <span style="background-color: lightsalmon;">&nbsp;&nbsp;liftIO&nbsp;(<span style="color:#dd0000;">DB</span><span style="color:#666666;">.</span>readReservations&nbsp;connectionString</span><span style="background-color: palegreen;">&nbsp;<span style="color:#666666;">$</span>&nbsp;date&nbsp;reservation</span><span style="background-color: lightsalmon;">)</span> <span style="background-color: palegreen;">&nbsp;&nbsp;<span style="color:#666666;">&gt;&gt;=</span>&nbsp;<span style="color:#dd0000;">MaybeT</span>&nbsp;<span style="color:#666666;">.</span>&nbsp;return&nbsp;<span style="color:#666666;">.</span>&nbsp;flip&nbsp;(tryAccept&nbsp;<span style="color:#09885a;">10</span>)&nbsp;reservation</span> <span style="background-color: lightsalmon;">&nbsp;&nbsp;<span style="color:#666666;">&gt;&gt;=</span>&nbsp;liftIO&nbsp;<span style="color:#666666;">.</span>&nbsp;<span style="color:#dd0000;">DB</span><span style="color:#666666;">.</span>createReservation&nbsp;connectionString</span></pre> </p> <p> The <code>date</code> function is a pure accessor that retrieves the date and time of the 
<code>reservation</code>. In C#, it's typically a read-only property: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;<span style="color:#2b91af;">Task</span>&lt;<span style="color:#2b91af;">IActionResult</span>&gt;&nbsp;Post(<span style="color:#2b91af;">Reservation</span>&nbsp;reservation) { <span style="background-color: lightsalmon;">&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">await</span>&nbsp;Repository.ReadReservations(</span><span style="background-color: palegreen;">reservation.Date</span><span style="background-color: lightsalmon;">)</span> <span style="background-color: palegreen;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Select(rs&nbsp;=&gt;&nbsp;maîtreD.TryAccept(rs,&nbsp;reservation))</span> <span style="background-color: lightsalmon;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.SelectMany(m&nbsp;=&gt;&nbsp;m.Traverse(Repository.Create)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Match(InternalServerError(<span style="color:#a31515;">&quot;Table&nbsp;unavailable&quot;</span>),&nbsp;Ok);</span> }</pre> </p> <p> Perhaps you don't think of a C# property as a function. After all, it's just an idiomatic grouping of language keywords: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;DateTimeOffset&nbsp;Date&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;}</pre> </p> <p> Besides, a function takes input and returns output. What's the input in this case? </p> <p> Keep in mind that a C# read-only property like this is only syntactic sugar for a getter method. In Java it would have been a method called <code>getDate()</code>. 
From <a href="/2018/01/22/function-isomorphisms">Function isomorphisms</a> we know that an instance method is isomorphic to a function that takes the object as input: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">static</span>&nbsp;DateTimeOffset&nbsp;GetDate(Reservation&nbsp;reservation)</pre> </p> <p> In other words, the <code>Date</code> property is an operation that takes the object itself as input and returns <code>DateTimeOffset</code> as output. The operation has no side effects, and will always return the same output for the same input. In other words, it's a pure function, and that's the reason I've now coloured it green in the above code examples. </p> <p> The layering indicated by the examples may, however, be deceiving. The green colour of <code>reservation.Date</code> is adjacent to the green colour of the <code>Select</code> expression below it. You might interpret this as though the pure middle part of the sandwich partially expands to the upper impure phase. </p> <p> That's not the case. The <code>reservation.Date</code> expression executes <em>before</em> <code>Repository.ReadReservations</code>, and only then does the pure <code>Select</code> expression execute. 
Perhaps this, then, is a more honest depiction of the sandwich: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;Task&lt;IActionResult&gt;&nbsp;Post(Reservation&nbsp;reservation) { <span style="background-color: palegreen;">&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;date&nbsp;=&nbsp;reservation.Date;</span> <span style="background-color: lightsalmon;">&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">await</span>&nbsp;Repository.ReadReservations(date)</span> <span style="background-color: palegreen;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Select(rs&nbsp;=&gt;&nbsp;maîtreD.TryAccept(rs,&nbsp;reservation))</span> <span style="background-color: lightsalmon;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.SelectMany(m&nbsp;=&gt;&nbsp;m.Traverse(Repository.Create)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Match(InternalServerError(<span style="color:#a31515;">&quot;Table&nbsp;unavailable&quot;</span>),&nbsp;Ok);</span> }</pre> </p> <p> The corresponding 'sandwich diagram' looks like this: </p> <p> <img src="/content/binary/pure-impure-pure-impure-box.png" alt="A box with green, red, green, and red horizontal tiers."> </p> <p> If you want to interpret the word <em>sandwich</em> narrowly, this is no longer a sandwich, since there's 'content' on top. That's the reason I started this article discussing Danish <em>smørrebrød</em>, also sometimes called <em>open-faced sandwiches</em>. Granted, I've never seen a <em>håndmad</em> with two slices of bread with meat both between and on top. On the other hand, I don't think that having a smidgen of 'content' on top is a showstopper. </p> <h3 id="c3a4d1243ee540af95571141c0dd500e"> Initial and eventual purity <a href="#c3a4d1243ee540af95571141c0dd500e">#</a> </h3> <p> Why is this important? 
Whether or not <code>reservation.Date</code> is a little light of purity in the otherwise impure first slice of the sandwich actually doesn't concern me that much. After all, my concern is mostly cognitive load, and there's hardly much gained by extracting the <code>reservation.Date</code> expression to a separate line, as I did above. </p> <p> The reason this interests me is that in many cases, the first step you may take is to validate input, and <a href="/2023/06/26/validation-and-business-rules">validation is a composed set of pure functions</a>. While pure, and <a href="/2020/12/14/validation-a-solved-problem">a solved problem</a>, validation may be a sufficiently significant step that it warrants explicit acknowledgement. It's not just a property getter, but complex enough that bugs could hide there. </p> <p> Even if you follow the <em>functional core, imperative shell</em> architecture, you'll often find that the first step is pure validation. </p> <p> Likewise, once you've performed impure actions in the second impure phase, you can easily have a final thin pure translation slice. 
In fact, the above C# example contains an example of just that: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;IActionResult&nbsp;Ok(<span style="color:blue;">int</span>&nbsp;value) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;OkActionResult(value); } <span style="color:blue;">public</span>&nbsp;IActionResult&nbsp;InternalServerError(<span style="color:blue;">string</span>&nbsp;msg) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;InternalServerErrorActionResult(msg); }</pre> </p> <p> These are two tiny pure functions used as the final translation in the sandwich: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;Task&lt;IActionResult&gt;&nbsp;Post(Reservation&nbsp;reservation) { <span style="background-color: palegreen;">&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;date&nbsp;=&nbsp;reservation.Date;</span> <span style="background-color: lightsalmon;">&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">return</span>&nbsp;<span style="color:blue;">await</span>&nbsp;Repository.ReadReservations(date)</span> <span style="background-color: palegreen;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Select(rs&nbsp;=&gt;&nbsp;maîtreD.TryAccept(rs,&nbsp;reservation))</span> <span style="background-color: lightsalmon;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.SelectMany(m&nbsp;=&gt;&nbsp;m.Traverse(Repository.Create)) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Match(</span><span style="background-color: palegreen;">InternalServerError(<span style="color:#a31515;">&quot;Table&nbsp;unavailable&quot;</span>),&nbsp;Ok</span><span style="background-color: lightsalmon;">);</span> }</pre> </p> <p> On the other hand, I didn't want to paint the <code>Match</code> operation green, since it's essentially a continuation of a <a 
href="https://learn.microsoft.com/dotnet/api/system.threading.tasks.task-1">Task</a>, and if we consider <a href="/2020/07/27/task-asynchronous-programming-as-an-io-surrogate">task asynchronous programming as an IO surrogate</a>, we should, at least, regard it with scepticism. While it might be pure, it probably isn't. </p> <p> Still, we may be left with an inverted 'sandwich' that looks like this: </p> <p> <img src="/content/binary/pure-impure-pure-impure-pure-box.png" alt="A box with green, red, green, red, and green horizontal tiers."> </p> <p> Can we still claim that this is a sandwich? </p> <h3 id="4d14e6795066473e95d1e5cdbcef6c2d"> At the metaphor's limits <a href="#4d14e6795066473e95d1e5cdbcef6c2d">#</a> </h3> <p> This latest development seems to strain the sandwich metaphor. Can we maintain it, or does it fall apart? </p> <p> What seems clear to me, at least, is that this ought to be the limit of how much we can stretch the allegory. If we add more tiers we get a <a href="https://en.wikipedia.org/wiki/Dagwood_sandwich">Dagwood sandwich</a> which is clearly a gimmick of little practicality. </p> <p> But again, I'm appealing to a dubious metaphor, so instead, let's analyse what's going on. </p> <p> In practice, it seems that you can rarely avoid the initial (pure) validation step. Why not? Couldn't you move validation to the functional core and do the impure steps without validation? </p> <p> The short answer is <em>no</em>, because <a href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/">validation done right is actually parsing</a>. At the entry point, you don't even know if the input makes sense. </p> <p> A more realistic example is warranted, so I now turn to the example code base from my book <a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a>. 
One blog post shows <a href="/2022/07/25/an-applicative-reservation-validation-example-in-c">how to implement applicative validation for posting a reservation</a>. </p> <p> A typical HTTP <code>POST</code> may include a JSON document like this: </p> <p> <pre>{ &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;id&quot;</span>:&nbsp;<span style="color:#a31515;">&quot;bf4e84130dac451b9c94049da8ea8c17&quot;</span>, &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;at&quot;</span>:&nbsp;<span style="color:#a31515;">&quot;2024-11-07T20:30&quot;</span>, &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;email&quot;</span>:&nbsp;<span style="color:#a31515;">&quot;snomob@example.com&quot;</span>, &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;name&quot;</span>:&nbsp;<span style="color:#a31515;">&quot;Snow&nbsp;Moe&nbsp;Beal&quot;</span>, &nbsp;&nbsp;<span style="color:#2e75b6;">&quot;quantity&quot;</span>:&nbsp;1 }</pre> </p> <p> In order to handle even such a simple request, the system has to perform a set of impure actions. One of them is to query its data store for existing reservations. After all, the restaurant may not have any remaining tables for that day. </p> <p> Which day, you ask? I'm glad you asked. The data access API comes with this method: </p> <p> <pre>Task&lt;IReadOnlyCollection&lt;Reservation&gt;&gt;&nbsp;ReadReservations( &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">int</span>&nbsp;restaurantId,&nbsp;DateTime&nbsp;min,&nbsp;DateTime&nbsp;max);</pre> </p> <p> You can supply <code>min</code> and <code>max</code> values to indicate the range of dates you need. How do you determine that range? You need the desired date of the reservation. In the above example it's 20:30 on November 7 2024. We're in luck, the data is there, and understandable. </p> <p> Notice, however, that due to limitations of wire formats such as JSON, the date is a string. The value might be anything. 
If it's sufficiently malformed, you can't even perform the impure action of querying the database, because you don't know what to query it about. </p> <p> If you want to keep the sandwich metaphor untarnished, you might decide to push the parsing responsibility to an impure action, but why make something impure that has a well-known pure solution? </p> <p> A similar argument applies when performing a final, pure translation step in the other direction. </p> <p> So it seems that we're stuck with implementations that don't quite fit the ideal of the sandwich metaphor. Is that enough to abandon the metaphor, or should we keep it? </p> <p> The layers in layered application architecture aren't really layers, and neither are vertical slices really slices. <a href="https://en.wikipedia.org/wiki/All_models_are_wrong">All models are wrong, but some are useful</a>. This is the case here, I believe. You should still keep the <a href="/2020/03/02/impureim-sandwich">Impureim sandwich</a> in mind when structuring code: Keep impure actions at the application boundary - in the 'Controllers', if you will; have only two phases of impurity - the initial and the ultimate; and maximise use of pure functions for everything else. Keep most of the pure execution between the two impure phases, but realistically, you're going to need a pure validation phase in front, and a slim translation layer at the end. </p> <h3 id="5b191dfc434149bab1d9d6bea029a4d4"> Conclusion <a href="#5b191dfc434149bab1d9d6bea029a4d4">#</a> </h3> <p> Despite the prevalence of food imagery, this article about functional programming architecture has avoided any mention of <a href="https://byorgey.wordpress.com/2009/01/12/abstraction-intuition-and-the-monad-tutorial-fallacy/">burritos</a>. Instead, it examines the tension between an ideal, the <a href="/2020/03/02/impureim-sandwich">Impureim sandwich</a>, and real-world implementation details. 
When you have to deal with concerns such as input validation or translation to egress data, it's practical to add one or two more thin slices of purity. </p> <p> In <a href="/2018/11/19/functional-architecture-a-definition">functional architecture</a> you want to maximise the proportion of pure functions. Adding more pure code is hardly a problem. </p> <p> The opposite is not the case. We shouldn't be cavalier about adding more impure slices to the sandwich. Thus, the adjusted definition of the Impureim sandwich seems to be that it may have at most two impure phases, but from one to three pure slices. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="aa4031dbe9a7467ba087c2731596f420"> <div class="comment-author">qfilip <a href="#aa4031dbe9a7467ba087c2731596f420">#</a></div> <div class="comment-content"> <p> Hello again... </p> <p> In one of your excellent talks (<a href="https://youtu.be/F9bznonKc64?feature=shared&t=3392">here</a>), you ended up refactoring the maitreD kata using the <code>traverse</code> function. Since this step is crucial for the "sandwich" to work, any post detailing its implementation would be nice. </p> <p> Thanks </p> </div> <div class="comment-date">2023-11-16 10:56 UTC</div> </div> <div class="comment" id="7ea7f0f5f3a24a939be3a1cb5b23e2f5"> <div class="comment-author"><a href="/">Mark Seemann</a> <a href="#7ea7f0f5f3a24a939be3a1cb5b23e2f5">#</a></div> <div class="comment-content"> <p> qfilip, thank you for writing. That particular talk fortunately comes with a set of companion articles: </p> <ul> <li><a href="/2019/02/04/how-to-get-the-value-out-of-the-monad">How to get the value out of the monad</a></li> <li><a href="/2019/02/11/asynchronous-injection">Asynchronous Injection</a></li> </ul> <p> The latter of the two comes with a link to <a href="https://github.com/ploeh/asynchronous-injection">a GitHub repository with all the sample code</a>, including the <code>Traverse</code> implementation. 
</p> <p> That said, a more formal description of traversals has long been on my to-do list, as you can infer from <a href="/2022/07/11/functor-relationships">this (currently inactive) table of contents</a>. </p> </div> <div class="comment-date">2023-11-16 11:18 UTC</div> </div> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Dependency Whac-A-Mole https://blog.ploeh.dk/2023/10/02/dependency-whac-a-mole 2023-10-02T07:52:00+00:00 Mark Seemann <div id="post"> <p> <em>AKA Framework Whac-A-Mole, Library Whac-A-Mole.</em> </p> <p> I have now three times used the name <a href="https://en.wikipedia.org/wiki/Whac-A-Mole">Whac-A-Mole</a> about a particular kind of relationship that may evolve with some dependencies. According to the <a href="https://en.wikipedia.org/wiki/Rule_of_three_(computer_programming)">rule of three</a>, I can now extract the explanation to a separate article. This is that article. </p> <h3 id="f9a98473c3ed40eda1f6288eec631795"> Architecture smell <a href="#f9a98473c3ed40eda1f6288eec631795">#</a> </h3> <p> <em>Dependency Whac-A-Mole</em> describes the situation when you're spending too much time investigating, learning, troubleshooting, and overall satisfying the needs of a dependency (i.e. library or framework) instead of delivering value to users. </p> <p> Examples include Dependency Injection containers, <a href="https://en.wikipedia.org/wiki/Object%E2%80%93relational_mapping">object-relational mappers</a>, validation frameworks, dynamic mock libraries, and perhaps the Gherkin language. </p> <p> From the above list it does <em>not</em> follow that those examples are universally bad. I can think of situations where some of them make sense. I might even use them myself. </p> <p> Rather, the Dependency Whac-A-Mole architecture smell occurs when a given dependency causes more trouble than the benefit it was supposed to provide. 
</p> <h3 id="9ae83d04788d4d4c9582ba02aa11b19b"> Causes <a href="#9ae83d04788d4d4c9582ba02aa11b19b">#</a> </h3> <p> We rarely set out to do the wrong thing, but we often make mistakes in good faith. You may decide to take a dependency on a library or framework because </p> <ul> <li>it worked well for you in a previous context</li> <li>it looks as though it'll address a major problem you had in a previous context</li> <li>you've heard good things about it</li> <li>you saw a convincing demo</li> <li>you heard about it in a podcast, conference talk, YouTube video, etc.</li> <li>a FAANG company uses it</li> <li>it's the latest tech</li> <li>you want it on your CV</li> </ul> <p> There could be other motivations as well, and granted, some of those I listed aren't really <em>good</em> reasons. Even so, I don't think anyone chooses a dependency with ill intent. </p> <p> And what might work in one context may turn out to not work in another. You can't always predict such consequences, so I imply no judgement on those who choose the 'wrong' dependency. I've done it, too. </p> <p> It is, however, important to be aware that this risk is always there. You picked a library with the best of intentions, but it turns out to slow you down. If so, acknowledge the mistake and kill your darlings. </p> <h3 id="02aa21e2bdc645f1b769c5a8412323f9"> Background <a href="#02aa21e2bdc645f1b769c5a8412323f9">#</a> </h3> <p> Whenever you use a library or framework, you need to learn how to use it effectively. You have to learn its concepts, abstractions, APIs, pitfalls, etc. Not only that, but you need to stay abreast of changes and improvements. </p> <p> Microsoft, for example, is usually good at maintaining backwards compatibility, but even so, things don't stand still. They evolve libraries and frameworks the same way I would do it: Don't introduce breaking changes, but do introduce new, better APIs going forward. 
This is essentially the <a href="https://martinfowler.com/bliki/StranglerFigApplication.html">Strangler pattern</a> that I also write about in <a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a>. </p> <p> While it's a good way to evolve a library or framework, the point remains: Even if you trust a supplier to prioritise backwards compatibility, it doesn't mean that you can stop learning. You have to stay up to date with all your dependencies. If you don't, sooner or later, the way that you use something like, say, <a href="https://en.wikipedia.org/wiki/Entity_Framework">Entity Framework</a> is 'the old way', and it's not really supported any longer. </p> <p> In order to be able to move forward, you'll have to rewrite those parts of your code that depend on that old way of doing things. </p> <p> Each dependency comes with benefits and costs. As long as the benefits outweigh the costs, it makes sense to keep it around. If, on the other hand, you spend more time dealing with it than it would take you to do the work yourself, consider getting rid of it. </p> <h3 id="439ea4466014446a9ddfc2e264c86fba"> Symptoms <a href="#439ea4466014446a9ddfc2e264c86fba">#</a> </h3> <p> Perhaps the infamous <em>left-pad</em> incident is too easy an example, but it does highlight the essence of this tension. Do you really need a third-party package to pad a string, or could you have done it yourself? </p> <p> You can spend much time figuring out how to fit a general-purpose library or framework to your particular needs. How do you make your object-relational mapper (ORM) fit a special database schema? How do you annotate a class so that it produces validation messages according to the requirements in your jurisdiction? How do you configure an automatic mapping library so that it correctly projects data? 
How do you tell a Dependency Injection (DI) Container how to compose a <a href="https://en.wikipedia.org/wiki/Chain-of-responsibility_pattern">Chain of Responsibility</a> where some objects also take strings or integers in their constructors? </p> <p> Do such libraries or frameworks save time, or could you have written the corresponding code quicker? To be clear, I'm not talking about writing your own ORM, your own DI Container, your own auto-mapper. Rather, instead of using a DI Container, <a href="/2014/06/10/pure-di">Pure DI</a> is likely easier. As an alternative to an ORM, what's the cost of just writing <a href="https://en.wikipedia.org/wiki/SQL">SQL</a>? Instead of an <a href="https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule">ad-hoc, informally-specified, bug-ridden</a> validation framework, have you considered <a href="/2018/11/05/applicative-validation">applicative validation</a>? </p> <p> Things become really insidious if your chosen library never really solves all problems. Every time you figure out how to use it for one exotic corner case, your 'solution' causes a new problem to arise. </p> <p> A symptom of <em>Dependency Whac-A-Mole</em> is when you have to advertise for people skilled in a particular technology. </p> <p> Again, it's not necessarily a problem. If you're getting tremendous value out of, say, Entity Framework, it makes sense to list expertise as a job requirement. If, on the other hand, you have to list a litany of libraries and frameworks as necessary skills, it might pay to stop and reconsider. You can call it your 'tech stack' all you like, but is it really an inadvertent case of <a href="https://en.wikipedia.org/wiki/Vendor_lock-in">vendor lock-in</a>? 
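</p> <p> To make the Chain of Responsibility question above concrete, here's a minimal sketch of composing such a chain with Pure DI. The <code>IGreeter</code> interface, the three handler classes, and the <code>CompositionRoot</code> class are hypothetical names invented for this illustration; they don't come from any of the code bases mentioned here. Notice that passing a string or an integer to a constructor requires no configuration at all: </p>

```csharp
using System;

// Hypothetical handler interface for a Chain of Responsibility.
public interface IGreeter
{
    string Greet(string name);
}

// Handles blank names. Takes a string and the next link in the chain.
public sealed class AnonymousGreeter : IGreeter
{
    private readonly string fallback;
    private readonly IGreeter next;

    public AnonymousGreeter(string fallback, IGreeter next)
    {
        this.fallback = fallback;
        this.next = next;
    }

    public string Greet(string name)
    {
        if (string.IsNullOrWhiteSpace(name))
            return "Hello, " + fallback + ".";
        return next.Greet(name);
    }
}

// Handles overly long names. Takes an integer and the next link in the chain.
public sealed class LongNameGreeter : IGreeter
{
    private readonly int maxLength;
    private readonly IGreeter next;

    public LongNameGreeter(int maxLength, IGreeter next)
    {
        this.maxLength = maxLength;
        this.next = next;
    }

    public string Greet(string name)
    {
        if (name.Length > maxLength)
            return "Hello, " + name.Substring(0, maxLength) + "...";
        return next.Greet(name);
    }
}

// The end of the chain. Handles everything that reaches it.
public sealed class DefaultGreeter : IGreeter
{
    public string Greet(string name)
    {
        return "Hello, " + name + ".";
    }
}

public static class CompositionRoot
{
    // Pure DI: the object graph is composed with plain constructor calls.
    public static IGreeter Compose()
    {
        return new AnonymousGreeter(
            "stranger",
            new LongNameGreeter(
                10,
                new DefaultGreeter()));
    }
}
```

<p> Calling <code>CompositionRoot.Compose().Greet("")</code> is handled by the string-configured handler, while an ordinary short name reaches the end of the chain. The compiler checks the whole graph, which is a trade-off to weigh against a container's registration API. 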
</p> <h3 id="381db0b94f094be2be2b95841e248669"> Anecdotal evidence <a href="#381db0b94f094be2be2b95841e248669">#</a> </h3> <p> I've used the term <em>Whac-A-Mole</em> a couple of times to describe the kind of situation where you feel that you're fighting a technology more than it's helping you. It seems to resonate with people other than me. </p> <p> Here are the original articles where I used the term: </p> <ul> <li><a href="/2022/08/15/aspnet-validation-revisited">ASP.NET validation revisited</a></li> <li><a href="/2022/08/22/can-types-replace-validation">Can types replace validation?</a></li> <li><a href="/2023/09/18/do-orms-reduce-the-need-for-mapping">Do ORMs reduce the need for mapping?</a></li> </ul> <p> These are only the articles where I explicitly use the term. I do, however, think that the phenomenon is more common. I'm particularly sensitive to it when it comes to Dependency Injection, where I generally believe that DI Containers make the technique harder than it has to be. Composing object graphs is easily done with code. </p> <h3 id="2ef657b607cd49408ced7110e28e2321"> Conclusion <a href="#2ef657b607cd49408ced7110e28e2321">#</a> </h3> <p> Sometimes a framework or library makes it more difficult to get things done. You spend much time kowtowing to its needs, researching how to do things 'the xyz way', learning its intricate extensibility points, keeping up to date with its evolving API, and engaging with its community to lobby for new features. </p> <p> Still, you feel that it makes you compromise. You might have liked to organise your code in a different way, but unfortunately you can't, because it doesn't fit the way the dependency works. As you solve issues with it, new ones appear. </p> <p> These are symptoms of <em>Dependency Whac-A-Mole</em>, an architecture smell that indicates that you're using the wrong tool for the job. If so, get rid of the dependency in favour of something better. Often, the better alternative is just plain vanilla code. 
</p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="9235995516070545f7cc3ee83d37023d"> <div class="comment-author"><a href="https://github.com/thomaslevesque">Thomas Levesque</a> <a href="#9235995516070545f7cc3ee83d37023d">#</a></div> <div class="comment-content"> <p> The most obvious example of this for me is definitely AutoMapper. I used to think it was great and saved so much time, but more often than not, the mapping configuration ended up being more complex (and fragile) than just mapping the properties manually. </p> </div> <div class="comment-date">2023-10-02 13:27 UTC</div> </div> <div class="comment" id="93b32bb03ee14d298b0d9b7cf65ddcae"> <div class="comment-author"><a href="/">Mark Seemann</a> <a href="#93b32bb03ee14d298b0d9b7cf65ddcae">#</a></div> <div class="comment-content"> <p> I could imagine. AutoMapper is not, however, a library I've used enough to evaluate. </p> </div> <div class="comment-date">2023-10-02 13:58 UTC</div> </div> <div class="comment" id="3e81ff9e535743148d8898e84ff69595"> <div class="comment-author"><a href="https://blog.oakular.xyz">Callum Warrilow</a> <a href="#3e81ff9e535743148d8898e84ff69595">#</a></div> <div class="comment-content"> <p> The moment I lost any faith in AutoMapper was after trying to debug a mapping that was silently failing on a single property. Three of us were looking at it for a good amount of time before one of us noticed a single character typo on the destination property. As the names did not match, no mapping occurred. It is unfortunately a black box, and obfuscated a problem that a manual mapping would have handled gracefully. <hr /> Mark, it is interesting that you mention Gherkin as potentially one of these moles. 
It is something I've been evaluating in the hopes of making our tests more business focused, but considering it again now, you can achieve a lot of what Gherkin offers with well-defined namespaces, classes and methods in your test assemblies, something like: <ul> <li>Namespace: GivenSomePrecondition</li> <li>TestClass: WhenCarryingOutAnAction</li> <li>TestMethod: ThenTheExpectedPostConditionResults</li> </ul> To get away from playing Whac-a-Mole, it would seem to require changing the question being asked, from <i>what product do I need to solve this problem?</i>, to <i>what tools and patterns do I have around me to solve this problem?</i>. </p> </div> <div class="comment-date">2023-10-11 15:54 UTC</div> </div> <div class="comment" id="eef76159a60b4ee482238b1cd990ab94"> <div class="comment-author"><a href="/">Mark Seemann</a> <a href="#eef76159a60b4ee482238b1cd990ab94">#</a></div> <div class="comment-content"> <p> Callum, I was expecting someone to comment on including Gherkin on the list. </p> <p> I don't consider all my examples as universally problematic. Rather, they often pop up in contexts where people seem to be struggling with a concept or a piece of technology with no apparent benefit. </p> <p> I'm sure that when <a href="https://dannorth.net/">Dan North</a> came up with the idea of BDD and Gherkin, he actually <em>used</em> it. When used in the way it was originally intended, I can see it providing value. </p> <p> Apart from Dan himself, however, I'm not aware that I've ever met anyone who has used BDD and Gherkin in that way. On the contrary, I've had more than one discussion that went like this: </p> <p> <em>Interlocutor:</em> "We use BDD and Gherkin. It's great! You should try it." </p> <p> <em>Me:</em> "Why?" </p> <p> <em>Interlocutor:</em> "It enables us to <em>organise</em> our tests." </p> <p> <em>Me:</em> "Can't you do that with the <a href="https://wiki.c2.com/?ArrangeActAssert">AAA</a> pattern?" </p> <p> <em>Interlocutor:</em> "..." 
</p> <p> <em>Me:</em> "Do any non-programmers ever look at your tests?" </p> <p> <em>Interlocutor:</em> "No..." </p> <p> If only programmers look at the test code, then why impose an artificial constraint? <em>Given-when-then</em> is just <em>arrange-act-assert</em> with different names, but free of Gherkin and the tooling that typically comes with it, you're free to write test code that follows normal good coding practices. </p> <p> (As an aside, yes: Sometimes <a href="https://www.dotnetrocks.com/?show=1542">constraints liberate</a>, but from what I've seen of Gherkin-based test code, this doesn't seem to be one of those cases.) </p> <p> Finally, to be quite clear, although I may be repeating myself: If you're using Gherkin to interact with non-programmers on a regular basis, it may be beneficial. I've just never been in that situation, or met anyone other than Dan North who has. </p> </div> <div class="comment-date">2023-10-15 14:35 UTC</div> </div> </div> <hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. The case of the mysterious comparison https://blog.ploeh.dk/2023/09/25/the-case-of-the-mysterious-comparison 2023-09-25T05:58:00+00:00 Mark Seemann <div id="post"> <p> <em>A ploeh mystery.</em> </p> <p> I was <a href="/2023/09/18/do-orms-reduce-the-need-for-mapping">recently playing around</a> with the example code from my book <a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a>, refactoring the <code>Table</code> class to use <a href="/2022/08/22/can-types-replace-validation">a predicative NaturalNumber wrapper</a> to represent a table's seating capacity. 
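</p> <p> If you haven't seen such a wrapper before, the following is a minimal sketch of the idea, not the actual type from the linked article. The class name matches the one used here, but the member names, the exception message, and the choice of a sealed class are assumptions made for illustration. The constructor enforces the precondition, so any <code>NaturalNumber</code> object that exists is guaranteed to wrap a positive integer: </p>

```csharp
using System;

// Hypothetical sketch of a predicative wrapper around int.
// An instance can only exist if the wrapped value is 1 or greater.
public sealed class NaturalNumber
{
    private readonly int value;

    public NaturalNumber(int candidate)
    {
        // Reject zero and negative numbers up front.
        if (1 > candidate)
            throw new ArgumentOutOfRangeException(
                nameof(candidate),
                "A natural number must be 1 or greater.");

        value = candidate;
    }

    // Explicit conversion back to the primitive type.
    public static explicit operator int(NaturalNumber number)
    {
        return number.value;
    }

    // Value-based equality: two wrappers of the same number are equal.
    public override bool Equals(object obj)
    {
        var other = obj as NaturalNumber;
        return other is null ? false : value == other.value;
    }

    public override int GetHashCode()
    {
        return value.GetHashCode();
    }

    public override string ToString()
    {
        return value.ToString();
    }
}
```

<p> With a type like this, <code>new NaturalNumber(0)</code> throws an exception, while <code>(int)new NaturalNumber(4)</code> converts back to the wrapped integer. 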
</p> <p> Originally, the <code>Table</code> constructor and corresponding read-only data looked like this: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;isStandard; <span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;Reservation[]&nbsp;reservations; <span style="color:blue;">public</span>&nbsp;<span style="color:blue;">int</span>&nbsp;Capacity&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;} <span style="color:blue;">private</span>&nbsp;<span style="color:#2b91af;">Table</span>(<span style="color:blue;">bool</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">isStandard</span>,&nbsp;<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">capacity</span>,&nbsp;<span style="color:blue;">params</span>&nbsp;Reservation[]&nbsp;<span style="font-weight:bold;color:#1f377f;">reservations</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.isStandard&nbsp;=&nbsp;isStandard; &nbsp;&nbsp;&nbsp;&nbsp;Capacity&nbsp;=&nbsp;capacity; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.reservations&nbsp;=&nbsp;reservations; }</pre> </p> <p> Since I wanted to show an example of how wrapper types can help make preconditions explicit, I changed it to this: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;<span style="color:blue;">bool</span>&nbsp;isStandard; <span style="color:blue;">private</span>&nbsp;<span style="color:blue;">readonly</span>&nbsp;Reservation[]&nbsp;reservations; <span style="color:blue;">public</span>&nbsp;NaturalNumber&nbsp;Capacity&nbsp;{&nbsp;<span style="color:blue;">get</span>;&nbsp;} <span style="color:blue;">private</span>&nbsp;<span style="color:#2b91af;">Table</span>(<span style="color:blue;">bool</span>&nbsp;<span 
style="font-weight:bold;color:#1f377f;">isStandard</span>,&nbsp;NaturalNumber&nbsp;<span style="font-weight:bold;color:#1f377f;">capacity</span>,&nbsp;<span style="color:blue;">params</span>&nbsp;Reservation[]&nbsp;<span style="font-weight:bold;color:#1f377f;">reservations</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.isStandard&nbsp;=&nbsp;isStandard; &nbsp;&nbsp;&nbsp;&nbsp;Capacity&nbsp;=&nbsp;capacity; &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">this</span>.reservations&nbsp;=&nbsp;reservations; }</pre> </p> <p> The only thing I changed was the type of <code>Capacity</code> and <code>capacity</code>. </p> <p> As I did that, two tests failed. </p> <h3 id="5942663d531c41c491e2b79116008c5e"> Evidence <a href="#5942663d531c41c491e2b79116008c5e">#</a> </h3> <p> Both tests failed in the same way, so I only show one of the failures: </p> <p> <pre>Ploeh.Samples.Restaurants.RestApi.Tests.MaitreDScheduleTests.Schedule &nbsp;&nbsp;Source: MaitreDScheduleTests.cs line 16 &nbsp;&nbsp;Duration: 340 ms &nbsp;&nbsp;Message: &nbsp;&nbsp;&nbsp;&nbsp;FsCheck.Xunit.PropertyFailedException : &nbsp;&nbsp;&nbsp;&nbsp;Falsifiable, after 2 tests (0 shrinks) (StdGen (48558275,297233133)): &nbsp;&nbsp;&nbsp;&nbsp;Original: &nbsp;&nbsp;&nbsp;&nbsp;&lt;null&gt; &nbsp;&nbsp;&nbsp;&nbsp;(Ploeh.Samples.Restaurants.RestApi.MaitreD, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[|Ploeh.Samples.Restaurants.RestApi.Reservation|]) &nbsp;&nbsp;&nbsp;&nbsp;---- System.InvalidOperationException : Failed to compare two elements in the array. &nbsp;&nbsp;&nbsp;&nbsp;-------- System.ArgumentException : At least one object must implement IComparable. 
&nbsp;&nbsp;Stack Trace: &nbsp;&nbsp;&nbsp;&nbsp;----- Inner Stack Trace ----- &nbsp;&nbsp;&nbsp;&nbsp;GenericArraySortHelper`1.Sort(T[] keys, Int32 index, Int32 length, IComparer`1 comparer) &nbsp;&nbsp;&nbsp;&nbsp;Array.Sort[T](T[] array, Int32 index, Int32 length, IComparer`1 comparer) &nbsp;&nbsp;&nbsp;&nbsp;EnumerableSorter`2.QuickSort(Int32[] keys, Int32 lo, Int32 hi) &nbsp;&nbsp;&nbsp;&nbsp;EnumerableSorter`1.Sort(TElement[] elements, Int32 count) &nbsp;&nbsp;&nbsp;&nbsp;OrderedEnumerable`1.ToList() &nbsp;&nbsp;&nbsp;&nbsp;Enumerable.ToList[TSource](IEnumerable`1 source) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color: red;">MaitreD.Allocate(IEnumerable`1 reservations)</span> line 91 &nbsp;&nbsp;&nbsp;&nbsp;<span style="color: red;">&lt;&gt;c__DisplayClass21_0.&lt;Schedule&gt;b__4(&lt;&gt;f__AnonymousType7`2 &lt;&gt;h__TransparentIdentifier1)</span> line 114 &nbsp;&nbsp;&nbsp;&nbsp;&lt;&gt;c__DisplayClass2_0`3.&lt;CombineSelectors&gt;b__0(TSource x) &nbsp;&nbsp;&nbsp;&nbsp;SelectIPartitionIterator`2.GetCount(Boolean onlyIfCheap) &nbsp;&nbsp;&nbsp;&nbsp;Enumerable.Count[TSource](IEnumerable`1 source) &nbsp;&nbsp;&nbsp;&nbsp;<span style="color: red;">MaitreDScheduleTests.ScheduleImp(MaitreD sut, Reservation[] reservations)</span> line 31 &nbsp;&nbsp;&nbsp;&nbsp;<span style="color: red;">&lt;&gt;c.&lt;Schedule&gt;b__0_2(ValueTuple`2 t)</span> line 22 &nbsp;&nbsp;&nbsp;&nbsp;ForAll@15.Invoke(Value arg00) &nbsp;&nbsp;&nbsp;&nbsp;Testable.evaluate[a,b](FSharpFunc`2 body, a a) &nbsp;&nbsp;&nbsp;&nbsp;----- Inner Stack Trace ----- &nbsp;&nbsp;&nbsp;&nbsp;Comparer.Compare(Object a, Object b) &nbsp;&nbsp;&nbsp;&nbsp;ObjectComparer`1.Compare(T x, T y) &nbsp;&nbsp;&nbsp;&nbsp;EnumerableSorter`2.CompareAnyKeys(Int32 index1, Int32 index2) &nbsp;&nbsp;&nbsp;&nbsp;ComparisonComparer`1.Compare(T x, T y) &nbsp;&nbsp;&nbsp;&nbsp;ArraySortHelper`1.SwapIfGreater(T[] keys, Comparison`1 comparer, Int32 a, Int32 b) &nbsp;&nbsp;&nbsp;&nbsp;ArraySortHelper`1.IntroSort(T[] keys, Int32 lo, 
Int32 hi, Int32 depthLimit, Comparison`1 comparer) &nbsp;&nbsp;&nbsp;&nbsp;GenericArraySortHelper`1.Sort(T[] keys, Int32 index, Int32 length, IComparer`1 comparer)</pre> </p> <p> The code highlighted with red is user code (i.e. my code). The rest comes from .NET or <a href="https://fscheck.github.io/FsCheck/">FsCheck</a>. </p> <p> While a stack trace like that can look intimidating, I usually navigate to the top stack frame of my own code. As I reproduce my investigation, see if you can spot the problem before I did. </p> <h3 id="fded951b0b3a4cac848941153e84eaa6"> Understand before resolving <a href="#fded951b0b3a4cac848941153e84eaa6">#</a> </h3> <p> Before starting the investigation proper, we might as well acknowledge what seems evident. I had a fully passing test suite, then I edited two lines of code, which caused the above error. The two nested exception messages contain obvious clues: <em>Failed to compare two elements in the array,</em> and <em>At least one object must implement IComparable</em>. </p> <p> The only edit I made was to change an <code>int</code> to a <code>NaturalNumber</code>, and <code>NaturalNumber</code> didn't implement <code>IComparable</code>. It seems straightforward to just make <code>NaturalNumber</code> implement that interface and move on, and as it turns out, that <em>is</em> the solution. </p> <p> As I describe in <a href="/code-that-fits-in-your-head">Code That Fits in Your Head</a>, when troubleshooting, first seek to understand the problem. I've seen too many people go immediately into 'action mode' when faced with a problem. It's often a suboptimal strategy. </p> <p> First, if the immediate solution turns out not to work, you can waste much time thrashing, trying various 'fixes' without understanding the problem. 
</p> <p> Second, even if the resolution is easy, as is the case here, if you don't understand the underlying cause and effect, you can easily build a <a href="https://en.wikipedia.org/wiki/Cargo_cult">cargo cult</a>-like 'understanding' of programming. This could become one such experience: <em>All wrapper types must implement <code>IComparable</code></em>, or some nonsense like that. </p> <p> Unless people are getting hurt or you are bleeding money because of the error, seek first to understand, and only then fix the problem. </p> <h3 id="f881a7f048144dd2a2521e336675d052"> First clue <a href="#f881a7f048144dd2a2521e336675d052">#</a> </h3> <p> The top user stack frame is the <code>Allocate</code> method: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;IEnumerable&lt;Table&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">Allocate</span>( &nbsp;&nbsp;&nbsp;&nbsp;IEnumerable&lt;Reservation&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">reservations</span>) { &nbsp;&nbsp;&nbsp;&nbsp;List&lt;Table&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">allocation</span>&nbsp;=&nbsp;Tables.ToList(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(var&nbsp;<span style="font-weight:bold;color:#1f377f;">r</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span>&nbsp;reservations) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">table</span>&nbsp;=&nbsp;allocation.Find(<span style="font-weight:bold;color:#1f377f;">t</span>&nbsp;=&gt;&nbsp;t.Fits(r.Quantity)); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(table&nbsp;<span style="color:blue;">is</span>&nbsp;{&nbsp;}) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;allocation.Remove(table); 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;allocation.Add(table.Reserve(r)); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;allocation; }</pre> </p> <p> The stack trace points to line 91, which is the first line of the method body, where it calls <code>Tables.ToList()</code>. This is consistent with the rest of the stack trace, which indicates that the exception is thrown from <a href="https://learn.microsoft.com/dotnet/api/system.linq.enumerable.tolist">ToList</a>. </p> <p> I am, however, not used to <code>ToList</code> throwing exceptions, so I admit that I was nonplussed. Why would <code>ToList</code> try to sort the input? It usually doesn't do that. </p> <p> Now, I <em>did</em> notice the <code>OrderedEnumerable`1</code> on the stack frame above <code>Enumerable.ToList</code>, but this early in the investigation, I failed to connect the dots. </p> <p> What does the caller look like? It's that scary <code>DisplayClass21</code>... 
</p> <h3 id="5250f81716324ff1918bea2e57d08ef4"> Immediate caller <a href="#5250f81716324ff1918bea2e57d08ef4">#</a> </h3> <p> The code that calls <code>Allocate</code> is the <code>Schedule</code> method, the System Under Test: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;IEnumerable&lt;TimeSlot&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">Schedule</span>( &nbsp;&nbsp;&nbsp;&nbsp;IEnumerable&lt;Reservation&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">reservations</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;r&nbsp;<span style="color:blue;">in</span>&nbsp;reservations &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">group</span>&nbsp;r&nbsp;<span style="color:blue;">by</span>&nbsp;r.At&nbsp;<span style="color:blue;">into</span>&nbsp;g &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">orderby</span>&nbsp;g.Key &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;seating&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;Seating(SeatingDuration,&nbsp;g.Key) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;overlapping&nbsp;=&nbsp;reservations.Where(seating.Overlaps) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">select</span>&nbsp;<span style="color:blue;">new</span>&nbsp;TimeSlot(g.Key,&nbsp;Allocate(overlapping).ToList()); }</pre> </p> <p> While it does <code>orderby</code>, it doesn't seem to be sorting the input to <code>Allocate</code>. While <code>overlapping</code> is a filtered subset of <code>reservations</code>, the code doesn't sort <code>reservations</code>. </p> <p> Okay, moving on, what does the caller of that method look like? 
</p> <h3 id="0db463ec75d64f93a1b188af9fe731f3"> Test implementation <a href="#0db463ec75d64f93a1b188af9fe731f3">#</a> </h3> <p> The caller of the <code>Schedule</code> method is this test implementation: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;<span style="color:blue;">void</span>&nbsp;<span style="color:#74531f;">ScheduleImp</span>( &nbsp;&nbsp;&nbsp;&nbsp;MaitreD&nbsp;<span style="font-weight:bold;color:#1f377f;">sut</span>, &nbsp;&nbsp;&nbsp;&nbsp;Reservation[]&nbsp;<span style="font-weight:bold;color:#1f377f;">reservations</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">actual</span>&nbsp;=&nbsp;sut.Schedule(reservations); &nbsp;&nbsp;&nbsp;&nbsp;Assert.Equal( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;reservations.Select(<span style="font-weight:bold;color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;r.At).Distinct().Count(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;actual.Count()); &nbsp;&nbsp;&nbsp;&nbsp;Assert.Equal( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;actual.Select(<span style="font-weight:bold;color:#1f377f;">ts</span>&nbsp;=&gt;&nbsp;ts.At).OrderBy(<span style="font-weight:bold;color:#1f377f;">d</span>&nbsp;=&gt;&nbsp;d), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;actual.Select(<span style="font-weight:bold;color:#1f377f;">ts</span>&nbsp;=&gt;&nbsp;ts.At)); &nbsp;&nbsp;&nbsp;&nbsp;Assert.All(actual,&nbsp;<span style="font-weight:bold;color:#1f377f;">ts</span>&nbsp;=&gt;&nbsp;AssertTables(sut.Tables,&nbsp;ts.Tables)); &nbsp;&nbsp;&nbsp;&nbsp;Assert.All( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;actual, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">ts</span>&nbsp;=&gt;&nbsp;AssertRelevance(reservations,&nbsp;sut.SeatingDuration,&nbsp;ts)); }</pre> </p> <p> Notice how the first line of code calls <code>Schedule</code>, while the rest is 'just' 
assertions. </p> <p> Because I had noticed that <code>OrderedEnumerable`1</code> on the stack, I was on the lookout for an expression that would sort an <code>IEnumerable&lt;T&gt;</code>. The <code>ScheduleImp</code> method surprised me, though, because the <code>reservations</code> parameter is an array. If there was any problem sorting it, it should have blown up much earlier. </p> <p> I really should be paying more attention, but despite my best resolution to proceed methodically, I was chasing the wrong clue. </p> <p> Which line of code throws the exception? The stack trace says line 31. That's not the <code>sut.Schedule(reservations)</code> call. It's the first assertion following it. I failed to notice that. </p> <h3 id="5116edf953f1438b9cb4b37c5b043bda"> Property <a href="#5116edf953f1438b9cb4b37c5b043bda">#</a> </h3> <p> I was stumped, and not knowing what to do, I looked at the fourth and final piece of user code in that stack trace: </p> <p> <pre>[Property] <span style="color:blue;">public</span>&nbsp;Property&nbsp;<span style="font-weight:bold;color:#74531f;">Schedule</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Prop.ForAll( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">from</span>&nbsp;rs&nbsp;<span style="color:blue;">in</span>&nbsp;Gens.Reservations &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;&nbsp;m&nbsp;<span style="color:blue;">in</span>&nbsp;Gens.MaitreD(rs) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">select</span>&nbsp;(m,&nbsp;rs)).ToArbitrary(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">t</span>&nbsp;=&gt;&nbsp;ScheduleImp(t.m,&nbsp;t.rs)); }</pre> </p> <p> No sorting there. What's going on? </p> <p> In retrospect, I'm struggling to understand what was going on in my mind. Perhaps you're about to lose patience with me. 
I was chasing the wrong 'clue', just as I said above that 'other' people do, but surely, it's understood, that I don't. </p> <h3 id="70ad2e90704d4d31bc0d045fff16a011"> WYSIATI <a href="#70ad2e90704d4d31bc0d045fff16a011">#</a> </h3> <p> In <a href="/code-that-fits-in-your-head">Code That Fits in Your Head</a> I spend some time discussing how code relates to human cognition. I'm no neuroscientist, but I try to read books on other topics than programming. I was partially inspired by <a href="/ref/thinking-fast-and-slow">Thinking, Fast and Slow</a> in which <a href="https://en.wikipedia.org/wiki/Daniel_Kahneman">Daniel Kahneman</a> (among many other topics) presents how <em>System 1</em> (the inaccurate <em>fast</em> thinking process) mostly works with what's right in front of it: <em>What You See Is All There Is</em>, or WYSIATI. </p> <p> That <code>OrderedEnumerable`1</code> in the stack trace had made me look for an <code>IEnumerable&lt;T&gt;</code> as the culprit, and in the source code of the <code>Allocate</code> method, one parameter is clearly what I was looking for. 
I'll repeat that code here for your benefit: </p> <p> <pre><span style="color:blue;">private</span>&nbsp;IEnumerable&lt;Table&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">Allocate</span>( &nbsp;&nbsp;&nbsp;&nbsp;IEnumerable&lt;Reservation&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">reservations</span>) { &nbsp;&nbsp;&nbsp;&nbsp;List&lt;Table&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">allocation</span>&nbsp;=&nbsp;Tables.ToList(); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">foreach</span>&nbsp;(var&nbsp;<span style="font-weight:bold;color:#1f377f;">r</span>&nbsp;<span style="font-weight:bold;color:#8f08c4;">in</span>&nbsp;reservations) &nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">table</span>&nbsp;=&nbsp;allocation.Find(<span style="font-weight:bold;color:#1f377f;">t</span>&nbsp;=&gt;&nbsp;t.Fits(r.Quantity)); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(table&nbsp;<span style="color:blue;">is</span>&nbsp;{&nbsp;}) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{ &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;allocation.Remove(table); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;allocation.Add(table.Reserve(r)); &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;} &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;allocation; }</pre> </p> <p> Where's the <code>IEnumerable&lt;T&gt;</code> in that code? </p> <p> <code>reservations</code>, right? </p> <h3 id="c99f8e1238284cc88868d5fe39f43f2a"> Revelation <a href="#c99f8e1238284cc88868d5fe39f43f2a">#</a> </h3> <p> As WYSIATI 'predicts', the brain gloms on to what's prominent. 
I was looking for <code>IEnumerable&lt;T&gt;</code>, and it's right there in the method declaration as the parameter <code>IEnumerable&lt;Reservation&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">reservations</span></code>. </p> <p> As covered in multiple places (<a href="/code-that-fits-in-your-head">my book</a>, <a href="/ref/programmers-brain">The Programmer's Brain</a>), the human brain has limited short-term memory. Apparently, while chasing the <code>IEnumerable&lt;T&gt;</code> clue, I'd already managed to forget another important datum. </p> <p> Which line of code throws the exception? This one: </p> <p> <pre>List&lt;Table&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">allocation</span>&nbsp;=&nbsp;Tables.ToList();</pre> </p> <p> The <code>IEnumerable&lt;T&gt;</code> isn't <code>reservations</code>, but <code>Tables</code>. </p> <p> While the code doesn't explicitly say <code>IEnumerable&lt;Table&gt;&nbsp;Tables</code>, that's just what it is. </p> <p> Yes, it took me way too long to notice that I'd been barking up the wrong tree all along. Perhaps you immediately noticed that, but have pity on me. I don't think this kind of human error is uncommon. </p> <h3 id="288f57cb1a4648a1926164e64aebfbe2"> The culprit <a href="#288f57cb1a4648a1926164e64aebfbe2">#</a> </h3> <p> Where does <code>Tables</code> come from? 
It's a read-only property originally injected via the constructor: </p> <p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:#2b91af;">MaitreD</span>( &nbsp;&nbsp;&nbsp;&nbsp;TimeOfDay&nbsp;<span style="font-weight:bold;color:#1f377f;">opensAt</span>, &nbsp;&nbsp;&nbsp;&nbsp;TimeOfDay&nbsp;<span style="font-weight:bold;color:#1f377f;">lastSeating</span>, &nbsp;&nbsp;&nbsp;&nbsp;TimeSpan&nbsp;<span style="font-weight:bold;color:#1f377f;">seatingDuration</span>, &nbsp;&nbsp;&nbsp;&nbsp;IEnumerable&lt;Table&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">tables</span>) { &nbsp;&nbsp;&nbsp;&nbsp;OpensAt&nbsp;=&nbsp;opensAt; &nbsp;&nbsp;&nbsp;&nbsp;LastSeating&nbsp;=&nbsp;lastSeating; &nbsp;&nbsp;&nbsp;&nbsp;SeatingDuration&nbsp;=&nbsp;seatingDuration; &nbsp;&nbsp;&nbsp;&nbsp;Tables&nbsp;=&nbsp;tables; }</pre> </p> <p> Okay, in the test then, where does it come from? That's the <code>m</code> in the above property, repeated here for your convenience: </p> <p> <pre>[Property] <span style="color:blue;">public</span>&nbsp;Property&nbsp;<span style="font-weight:bold;color:#74531f;">Schedule</span>() { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;Prop.ForAll( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">from</span>&nbsp;rs&nbsp;<span style="color:blue;">in</span>&nbsp;Gens.Reservations &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;&nbsp;m&nbsp;<span style="color:blue;">in</span>&nbsp;Gens.MaitreD(rs) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">select</span>&nbsp;(m,&nbsp;rs)).ToArbitrary(), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">t</span>&nbsp;=&gt;&nbsp;ScheduleImp(t.m,&nbsp;t.rs)); }</pre> </p> <p> The <code>m</code> variable is generated by <code>Gens.MaitreD</code>, so let's follow that clue: </p> <p> <pre><span 
style="color:blue;">internal</span>&nbsp;<span style="color:blue;">static</span>&nbsp;Gen&lt;MaitreD&gt;&nbsp;<span style="color:#74531f;">MaitreD</span>( &nbsp;&nbsp;&nbsp;&nbsp;IEnumerable&lt;Reservation&gt;&nbsp;<span style="font-weight:bold;color:#1f377f;">reservations</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;seatingDuration&nbsp;<span style="color:blue;">in</span>&nbsp;Gen.Choose(1,&nbsp;6) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;tables&nbsp;<span style="color:blue;">in</span>&nbsp;Tables(reservations) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">select</span>&nbsp;<span style="color:blue;">new</span>&nbsp;MaitreD( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;TimeSpan.FromHours(18), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;TimeSpan.FromHours(21), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;TimeSpan.FromHours(seatingDuration), &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tables); }</pre> </p> <p> We're not there yet, but close. 
The <code>tables</code> variable is generated by this <code>Tables</code> helper function: </p> <p> <pre><span style="color:gray;">///</span><span style="color:green;">&nbsp;</span><span style="color:gray;">&lt;</span><span style="color:gray;">summary</span><span style="color:gray;">&gt;</span> <span style="color:gray;">///</span><span style="color:green;">&nbsp;Generate&nbsp;a&nbsp;table&nbsp;configuration&nbsp;that&nbsp;can&nbsp;at&nbsp;minimum&nbsp;accomodate&nbsp;all</span> <span style="color:gray;">///</span><span style="color:green;">&nbsp;reservations.</span> <span style="color:gray;">///</span><span style="color:green;">&nbsp;</span><span style="color:gray;">&lt;/</span><span style="color:gray;">summary</span><span style="color:gray;">&gt;</span> <span style="color:gray;">///</span><span style="color:green;">&nbsp;</span><span style="color:gray;">&lt;</span><span style="color:gray;">param</span>&nbsp;<span style="color:gray;">name</span><span style="color:gray;">=</span><span style="color:gray;">&quot;</span>reservations<span style="color:gray;">&quot;</span><span style="color:gray;">&gt;</span><span style="color:green;">The&nbsp;reservations&nbsp;to&nbsp;accommodate</span><span style="color:gray;">&lt;/</span><span style="color:gray;">param</span><span style="color:gray;">&gt;</span> <span style="color:gray;">///</span><span style="color:green;">&nbsp;</span><span style="color:gray;">&lt;</span><span style="color:gray;">returns</span><span style="color:gray;">&gt;</span><span style="color:green;">A&nbsp;generator&nbsp;of&nbsp;valid&nbsp;table&nbsp;configurations.</span><span style="color:gray;">&lt;/</span><span style="color:gray;">returns</span><span style="color:gray;">&gt;</span> <span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;Gen&lt;IEnumerable&lt;Table&gt;&gt;&nbsp;<span style="color:#74531f;">Tables</span>( &nbsp;&nbsp;&nbsp;&nbsp;IEnumerable&lt;Reservation&gt;&nbsp;<span 
style="font-weight:bold;color:#1f377f;">reservations</span>) { &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;Create&nbsp;a&nbsp;table&nbsp;for&nbsp;each&nbsp;reservation,&nbsp;to&nbsp;ensure&nbsp;that&nbsp;all</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:green;">//&nbsp;reservations&nbsp;can&nbsp;be&nbsp;allotted&nbsp;a&nbsp;table.</span> &nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">tables</span>&nbsp;=&nbsp;reservations.Select(<span style="font-weight:bold;color:#1f377f;">r</span>&nbsp;=&gt;&nbsp;Table.Standard(r.Quantity)); &nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">from</span>&nbsp;moreTables&nbsp;<span style="color:blue;">in</span> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Gen.Choose(1,&nbsp;12).Select( &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#1f377f;">i</span>&nbsp;=&gt;&nbsp;Table.Standard(<span style="color:blue;">new</span>&nbsp;NaturalNumber(i))).ArrayOf() &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">let</span>&nbsp;allTables&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tables.Concat(moreTables).OrderBy(<span style="font-weight:bold;color:#1f377f;">t</span>&nbsp;=&gt;&nbsp;t.Capacity) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">select</span>&nbsp;allTables.AsEnumerable(); }</pre> </p> <p> And there you have it: <code>OrderBy(<span style="font-weight:bold;color:#1f377f;">t</span>&nbsp;=&gt;&nbsp;t.Capacity)</code>! </p> <p> The <code>Capacity</code> property was exactly the property I changed from <code>int</code> to <code>NaturalNumber</code> - the change that made the test fail. 
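</p> <p> The failure mode can be reproduced in isolation. The following is a minimal sketch, not code from the restaurant code base: <code>PlainNumber</code> and <code>ComparableNumber</code> are hypothetical stand-ins for a wrapper type without and with <code>IComparable&lt;T&gt;</code>, and <code>Demo</code> is just a name for the entry point. It assumes a recent .NET runtime where, as in the stack trace above, the sort wraps the comparison failure in an <code>InvalidOperationException</code>. </p> <p> <pre>using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper type that does NOT implement IComparable.
public sealed class PlainNumber
{
    public PlainNumber(int value) { Value = value; }
    public int Value { get; }
}

// Hypothetical wrapper type that DOES implement IComparable&lt;T&gt;.
public sealed class ComparableNumber : IComparable&lt;ComparableNumber&gt;
{
    public ComparableNumber(int value) { Value = value; }
    public int Value { get; }

    // By convention, any instance sorts after null.
    public int CompareTo(ComparableNumber? other) =&gt;
        other is null ? 1 : Value.CompareTo(other.Value);
}

public static class Demo
{
    public static void Main()
    {
        var plain = new[] { new PlainNumber(3), new PlainNumber(1) };

        // No exception yet: OrderBy only composes a deferred query.
        IEnumerable&lt;PlainNumber&gt; deferred = plain.OrderBy(n =&gt; n);

        try
        {
            // The sort, and therefore the comparison, runs here.
            deferred.ToList();
        }
        catch (InvalidOperationException e)
        {
            Console.WriteLine(e.Message);
            Console.WriteLine(e.InnerException?.Message);
        }

        var comparable =
            new[] { new ComparableNumber(3), new ComparableNumber(1) };

        // With a comparable key, the same kind of query sorts as expected.
        var values = comparable
            .OrderBy(n =&gt; n)
            .Select(n =&gt; n.Value)
            .ToList();
        Console.WriteLine(string.Join(", ", values)); // prints "1, 3"
    }
}</pre> </p> <p> In this sketch, the call to <code>OrderBy</code> succeeds for both types, because it only composes a query. The comparison is first attempted when <code>ToList</code> enumerates the sequence, which is why the exception can surface far from the expression that causes it. 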
</p> <p> As expected, the fix was to let <code>NaturalNumber</code> implement <code>IComparable&lt;NaturalNumber&gt;</code>. </p> <h3 id="534a507621b840dfb566cdb359261840"> Conclusion <a href="#534a507621b840dfb566cdb359261840">#</a> </h3> <p> I thought this little troubleshooting session was interesting enough to write down. I spent perhaps twenty minutes on it before I understood what was going on. Not disastrously long, but enough time that I was relieved when I figured it out. </p> <p> Apart from the obvious (look for the problem where it is), there is one other useful lesson to be learned, I think. </p> <p> <a href="https://learn.microsoft.com/dotnet/standard/linq/deferred-execution-lazy-evaluation">Deferred execution</a> can confuse even the most experienced programmer. It took me some time before it dawned on me that even though the <code>MaitreD</code> constructor had run and the object was 'safely' initialised, it actually wasn't. </p> <p> The implication is that there's a 'disconnect' between the constructor and the <code>Allocate</code> method. The error actually happens during initialisation (i.e. in the caller of the constructor), but it only manifests when you run the method. </p> <p> Ever since <a href="/2013/07/20/linq-versus-the-lsp">I discovered the IReadOnlyCollection&lt;T&gt; interface in 2013</a> I've resolved to favour it over <code>IEnumerable&lt;T&gt;</code>. This is one example of why that's a good idea. </p> <p> Despite my best intentions, I, too, cut corners from time to time. I've done it here, by accepting <code>IEnumerable&lt;Table&gt;</code> instead of <code>IReadOnlyCollection&lt;Table&gt;</code> as a constructor parameter. I really should have known better, and now I've paid the price. </p> <p> This is particularly ironic because I also love <a href="https://www.haskell.org/">Haskell</a> so much. Haskell is lazy by default, so you'd think that I run into such issues all the time. 
An expression like <code>OrderBy(<span style="font-weight:bold;color:#1f377f;">t</span>&nbsp;=&gt;&nbsp;t.Capacity)</code>, however, wouldn't have compiled in Haskell unless the sort key implemented the <a href="https://hackage.haskell.org/package/base/docs/Data-Ord.html#t:Ord">Ord</a> type class. Even C#'s type system can express that a generic type must implement an interface, but <a href="https://learn.microsoft.com/dotnet/api/system.linq.enumerable.orderby">OrderBy</a> doesn't do that. </p> <p> This problem could have been caught at compile-time, but unfortunately it wasn't. </p> </div> <div id="comments"> <hr> <h2 id="comments-header"> Comments </h2> <div class="comment" id="7207e4dc0287435facea31fc9ce49d36"> <div class="comment-author"><a href="https://github.com/JesHansen">Jes Hansen</a> <a href="#7207e4dc0287435facea31fc9ce49d36">#</a></div> <div class="comment-content"> <p> I made a <a href="https://github.com/dotnet/runtime/issues/92691">pull request</a> describing the issue. </p> <p> As this is likely a breaking change I don't have high hopes for it to be fixed, though&hellip; </p> </div> <div class="comment-date">2023-09-27 09:40 UTC</div> </div> </div><hr> This blog is totally free, but if you like it, please consider <a href="https://blog.ploeh.dk/support">supporting it</a>. Do ORMs reduce the need for mapping? https://blog.ploeh.dk/2023/09/18/do-orms-reduce-the-need-for-mapping 2023-09-18T14:40:00+00:00 Mark Seemann <div id="post"> <p> <em>With some Entity Framework examples in C#.</em> </p> <p> In a recent comment, a reader <a href="/2023/07/17/works-on-most-machines#4012c2cddcb64a068c0b06b7989a676e">asked me to expand on my position</a> on <a href="https://en.wikipedia.org/wiki/Object%E2%80%93relational_mapping">object-relational mappers</a> (ORMs), which is that I'm not a fan: </p> <blockquote> <p> I consider ORMs a waste of time: they create more problems than they solve. 
</p> <footer><cite><a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a>, subsection 12.2.2, footnote</cite></footer> </blockquote> <p> While I acknowledge that only a Sith deals in absolutes, I favour clear assertions over guarded language. I don't really mean it that categorically, but I do stand by the general sentiment. In this article I'll attempt to describe why I don't reach for ORMs when querying or writing to a relational database. </p> <p> As always, any exploration of such a kind is made in a <em>context</em>, and this article is no exception. Before proceeding, allow me to delineate the scope. If your context differs from mine, what I write may not apply to your situation. </p> <h3 id="a29a6dfd90604a358c5e2f8e76941f80"> Scope <a href="#a29a6dfd90604a358c5e2f8e76941f80">#</a> </h3> <p> It's been decades since I last worked on a system where the database 'came first'. The last time that happened, the database was hidden behind an XML-based <a href="https://en.wikipedia.org/wiki/Remote_procedure_call">RPC</a> API that tunnelled through HTTP. Not a <a href="https://en.wikipedia.org/wiki/REST">REST</a> API by a long shot. </p> <p> Since then, I've worked on various systems. Some used relational databases, some document databases, some worked with CSV, or really old legacy APIs, etc. Common to these systems was that they were <em>not</em> designed around a database. Rather, they were developed with an eye to the <a href="https://en.wikipedia.org/wiki/Dependency_inversion_principle">Dependency Inversion Principle</a>, keeping storage details out of the Domain Model. Many were developed with test-driven development (TDD). </p> <p> When I evaluate whether or not to use an ORM in situations like these, the core application logic is my main design driver. 
As I describe in <a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a>, I usually develop (vertical) feature slices one at a time, utilising an <a href="/outside-in-tdd">outside-in TDD</a> process, during which I also figure out how to save or retrieve data from persistent storage. </p> <p> Thus, in systems like these, storage implementation is an artefact of the software architecture. If a relational database is involved, the schema must adhere to the needs of the code; not the other way around. </p> <p> To be clear, then, this article doesn't discuss typical <a href="https://en.wikipedia.org/wiki/Create,_read,_update_and_delete">CRUD</a>-heavy applications that are mostly forms over relational data, with little or no application logic. If you're working with such a code base, an ORM might be useful. I can't really tell, since I last worked with such systems at a time when ORMs didn't exist. </p> <h3 id="b6446ab3f8b8410da2679b4fb915a69e"> The usual suspects <a href="#b6446ab3f8b8410da2679b4fb915a69e">#</a> </h3> <p> The most common criticism of ORMs (that I've come across) is typically related to the queries they generate. People who are skilled in writing <a href="https://en.wikipedia.org/wiki/SQL">SQL</a> by hand, or who are concerned about performance, may look at the SQL that an ORM generates and dislike it for that reason. </p> <p> It's my impression that ORMs have come a long way over the decades, but frankly, the generated SQL is not really what concerns me. It never was. </p> <p> In the abstract, Ted Neward already outlined the problems in the seminal article <a href="https://blogs.newardassociates.com/blog/2006/the-vietnam-of-computer-science.html">The Vietnam of Computer Science</a>. That problem description may, however, be too theoretical to connect with most programmers, so I'll try a more example-driven angle. 
</p>
<h3 id="6908e1b735ee41068baeeb9482a15953"> Database operations without an ORM <a href="#6908e1b735ee41068baeeb9482a15953">#</a> </h3>
<p> Once more I turn to the trusty example code base that accompanies <a href="/2021/06/14/new-book-code-that-fits-in-your-head">Code That Fits in Your Head</a>. In it, I used <a href="https://en.wikipedia.org/wiki/Microsoft_SQL_Server">SQL Server</a> as the example database, and ADO.NET as the data access technology. </p>
<p> I considered this more than adequate for saving and reading restaurant reservations. Here, for example, is the code that creates a new reservation row in the database: </p>
<p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;Task&nbsp;<span style="font-weight:bold;color:#74531f;">Create</span>(<span style="color:blue;">int</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">restaurantId</span>,&nbsp;Reservation&nbsp;<span style="font-weight:bold;color:#1f377f;">reservation</span>)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(reservation&nbsp;<span style="color:blue;">is</span>&nbsp;<span style="color:blue;">null</span>)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">throw</span>&nbsp;<span style="color:blue;">new</span>&nbsp;ArgumentNullException(nameof(reservation));

&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">conn</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;SqlConnection(ConnectionString);
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">cmd</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;SqlCommand(createReservationSql,&nbsp;conn);
&nbsp;&nbsp;&nbsp;&nbsp;cmd.Parameters.AddWithValue(<span style="color:#a31515;">&quot;@Id&quot;</span>,&nbsp;reservation.Id);
&nbsp;&nbsp;&nbsp;&nbsp;cmd.Parameters.AddWithValue(<span style="color:#a31515;">&quot;@RestaurantId&quot;</span>,&nbsp;restaurantId);
&nbsp;&nbsp;&nbsp;&nbsp;cmd.Parameters.AddWithValue(<span style="color:#a31515;">&quot;@At&quot;</span>,&nbsp;reservation.At);
&nbsp;&nbsp;&nbsp;&nbsp;cmd.Parameters.AddWithValue(<span style="color:#a31515;">&quot;@Name&quot;</span>,&nbsp;reservation.Name.ToString());
&nbsp;&nbsp;&nbsp;&nbsp;cmd.Parameters.AddWithValue(<span style="color:#a31515;">&quot;@Email&quot;</span>,&nbsp;reservation.Email.ToString());
&nbsp;&nbsp;&nbsp;&nbsp;cmd.Parameters.AddWithValue(<span style="color:#a31515;">&quot;@Quantity&quot;</span>,&nbsp;reservation.Quantity);

&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;conn.OpenAsync().ConfigureAwait(<span style="color:blue;">false</span>);
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;cmd.ExecuteNonQueryAsync().ConfigureAwait(<span style="color:blue;">false</span>);
}

<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">const</span>&nbsp;<span style="color:blue;">string</span>&nbsp;createReservationSql&nbsp;=&nbsp;<span style="color:maroon;">@&quot;
&nbsp;&nbsp;&nbsp;&nbsp;INSERT&nbsp;INTO&nbsp;[dbo].[Reservations]&nbsp;(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[PublicId],&nbsp;[RestaurantId],&nbsp;[At],&nbsp;[Name],&nbsp;[Email],&nbsp;[Quantity])
&nbsp;&nbsp;&nbsp;&nbsp;VALUES&nbsp;(@Id,&nbsp;@RestaurantId,&nbsp;@At,&nbsp;@Name,&nbsp;@Email,&nbsp;@Quantity)&quot;</span>;</pre> </p>
<p> Yes, there's mapping, even if it's 'only' from a Domain Object to command parameter strings. As I'll argue later, if there's a way to escape such mapping, I'm not aware of it. ORMs don't seem to solve that problem. </p>
<p> This, however, seems to be the reader's main concern: </p>
<blockquote> <p> "I can work with raw SQL ofcourse... but the mapping... oh the mapping..."
</p> <footer><cite><a href="/2023/07/17/works-on-most-machines#4012c2cddcb64a068c0b06b7989a676e">qfilip</a></cite></footer> </blockquote> <p> It's not a concern that I share, but again I'll remind you that if your context differs substantially from mine, what doesn't concern me could reasonably concern you. </p> <p> You may argue that the above example isn't representative, since it only involves a single table. No foreign key relationships are involved, so perhaps the example is artificially easy. </p> <p> In order to work with a slightly more complex schema, I decided to port the read-only in-memory restaurant database (the one that keeps track of the restaurants - the <em>tenants</em> - of the system) to SQL Server. </p> <h3 id="ef3d04206a20442dbd2c01336c48fd28"> Restaurants schema <a href="#ef3d04206a20442dbd2c01336c48fd28">#</a> </h3> <p> In the book's sample code base, I'd only stored restaurant configurations as JSON config files, since I considered it out of scope to include an online tenant management system. Converting to a relational model wasn't hard, though. 
Here's the database schema: </p>
<p> <pre><span style="color:blue;">CREATE</span>&nbsp;<span style="color:blue;">TABLE</span>&nbsp;[dbo]<span style="color:gray;">.</span>[Restaurants]<span style="color:blue;">&nbsp;</span><span style="color:gray;">(</span>
&nbsp;&nbsp;&nbsp;&nbsp;[Id]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">INT</span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:gray;">NOT</span>&nbsp;<span style="color:gray;">NULL,</span>
&nbsp;&nbsp;&nbsp;&nbsp;[Name]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">NVARCHAR&nbsp;</span><span style="color:gray;">(</span>50<span style="color:gray;">)</span>&nbsp;&nbsp;<span style="color:gray;">NOT</span>&nbsp;<span style="color:gray;">NULL</span>&nbsp;<span style="color:blue;">UNIQUE</span><span style="color:gray;">,</span>
&nbsp;&nbsp;&nbsp;&nbsp;[OpensAt]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">TIME</span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:gray;">NOT</span>&nbsp;<span style="color:gray;">NULL,</span>
&nbsp;&nbsp;&nbsp;&nbsp;[LastSeating]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">TIME</span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:gray;">NOT</span>&nbsp;<span style="color:gray;">NULL,</span>
&nbsp;&nbsp;&nbsp;&nbsp;[SeatingDuration]&nbsp;&nbsp;<span style="color:blue;">TIME</span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:gray;">NOT</span>&nbsp;<span style="color:gray;">NULL,</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">PRIMARY</span>&nbsp;<span style="color:blue;">KEY</span>&nbsp;<span style="color:blue;">CLUSTERED&nbsp;</span><span style="color:gray;">(</span>[Id]&nbsp;<span style="color:blue;">ASC</span><span style="color:gray;">)</span>
<span style="color:gray;">)</span>

<span style="color:blue;">CREATE</span>&nbsp;<span style="color:blue;">TABLE</span>&nbsp;[dbo]<span style="color:gray;">.</span>[Tables]<span style="color:blue;">&nbsp;</span><span style="color:gray;">(</span>
&nbsp;&nbsp;&nbsp;&nbsp;[Id]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">INT</span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:gray;">NOT</span>&nbsp;<span style="color:gray;">NULL</span>&nbsp;<span style="color:blue;">IDENTITY</span><span style="color:gray;">,</span>
&nbsp;&nbsp;&nbsp;&nbsp;[RestaurantId]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">INT</span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:gray;">NOT</span>&nbsp;<span style="color:gray;">NULL</span>&nbsp;<span style="color:blue;">REFERENCES</span>&nbsp;[dbo]<span style="color:gray;">.</span>[Restaurants]<span style="color:gray;">(</span>Id<span style="color:gray;">),</span>
&nbsp;&nbsp;&nbsp;&nbsp;[Capacity]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">INT</span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:gray;">NOT</span>&nbsp;<span style="color:gray;">NULL,</span>
&nbsp;&nbsp;&nbsp;&nbsp;[IsCommunal]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">BIT</span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:gray;">NOT</span>&nbsp;<span style="color:gray;">NULL,</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">PRIMARY</span>&nbsp;<span style="color:blue;">KEY</span>&nbsp;<span style="color:blue;">CLUSTERED&nbsp;</span><span style="color:gray;">(</span>[Id]&nbsp;<span style="color:blue;">ASC</span><span style="color:gray;">)</span>
<span style="color:gray;">)</span></pre> </p>
<p> This little subsystem requires two
database tables: one that keeps track of the overall restaurant configuration, such as name, opening and closing times, and another that lists all of a restaurant's physical tables. </p>
<p> You may argue that this is still too simple to realistically capture the intricacies of existing database systems, but conversely I'll remind you that the scope of this article is the sort of system where you develop and design the application first; not a system where you're given a relational database upon which you must create an application. </p>
<p> Had I been given this assignment in a realistic setting, a relational database probably wouldn't have been my first choice. Some kind of document database, or even blob storage, strikes me as a better fit. Still, this article is about ORMs, so I'll pretend that there are external circumstances that dictate a relational database. </p>
<p> To test the system, I also created a script to populate these tables. Here's part of it: </p>
<p> <pre><span style="color:blue;">INSERT</span>&nbsp;<span style="color:blue;">INTO</span>&nbsp;[dbo]<span style="color:gray;">.</span>[Restaurants]<span style="color:blue;">&nbsp;</span><span style="color:gray;">(</span>[Id]<span style="color:gray;">,</span>&nbsp;[Name]<span style="color:gray;">,</span>&nbsp;[OpensAt]<span style="color:gray;">,</span>&nbsp;[LastSeating]<span style="color:gray;">,</span>&nbsp;[SeatingDuration]<span style="color:gray;">)</span>
<span style="color:blue;">VALUES&nbsp;</span><span style="color:gray;">(</span>1<span style="color:gray;">,</span>&nbsp;<span style="color:red;">N&#39;Hipgnosta&#39;</span><span style="color:gray;">,</span>&nbsp;<span style="color:red;">&#39;18:00&#39;</span><span style="color:gray;">,</span>&nbsp;<span style="color:red;">&#39;21:00&#39;</span><span style="color:gray;">,</span>&nbsp;<span style="color:red;">&#39;6:00&#39;</span><span style="color:gray;">)</span>
<span style="color:blue;">INSERT</span>&nbsp;<span style="color:blue;">INTO</span>&nbsp;[dbo]<span style="color:gray;">.</span>[Tables]<span style="color:blue;">&nbsp;</span><span style="color:gray;">(</span>[RestaurantId]<span style="color:gray;">,</span>&nbsp;[Capacity]<span style="color:gray;">,</span>&nbsp;[IsCommunal]<span style="color:gray;">)</span>
<span style="color:blue;">VALUES&nbsp;</span><span style="color:gray;">(</span>1<span style="color:gray;">,</span>&nbsp;10<span style="color:gray;">,</span>&nbsp;1<span style="color:gray;">)</span>
<span style="color:blue;">INSERT</span>&nbsp;<span style="color:blue;">INTO</span>&nbsp;[dbo]<span style="color:gray;">.</span>[Restaurants]<span style="color:blue;">&nbsp;</span><span style="color:gray;">(</span>[Id]<span style="color:gray;">,</span>&nbsp;[Name]<span style="color:gray;">,</span>&nbsp;[OpensAt]<span style="color:gray;">,</span>&nbsp;[LastSeating]<span style="color:gray;">,</span>&nbsp;[SeatingDuration]<span style="color:gray;">)</span>
<span style="color:blue;">VALUES&nbsp;</span><span style="color:gray;">(</span>2112<span style="color:gray;">,</span>&nbsp;<span style="color:red;">N&#39;Nono&#39;</span><span style="color:gray;">,</span>&nbsp;<span style="color:red;">&#39;18:00&#39;</span><span style="color:gray;">,</span>&nbsp;<span style="color:red;">&#39;21:00&#39;</span><span style="color:gray;">,</span>&nbsp;<span style="color:red;">&#39;6:00&#39;</span><span style="color:gray;">)</span>
<span style="color:blue;">INSERT</span>&nbsp;<span style="color:blue;">INTO</span>&nbsp;[dbo]<span style="color:gray;">.</span>[Tables]<span style="color:blue;">&nbsp;</span><span style="color:gray;">(</span>[RestaurantId]<span style="color:gray;">,</span>&nbsp;[Capacity]<span style="color:gray;">,</span>&nbsp;[IsCommunal]<span style="color:gray;">)</span>
<span style="color:blue;">VALUES&nbsp;</span><span style="color:gray;">(</span>2112<span style="color:gray;">,</span>&nbsp;6<span style="color:gray;">,</span>&nbsp;1<span style="color:gray;">)</span>
<span style="color:blue;">INSERT</span>&nbsp;<span style="color:blue;">INTO</span>&nbsp;[dbo]<span style="color:gray;">.</span>[Tables]<span style="color:blue;">&nbsp;</span><span style="color:gray;">(</span>[RestaurantId]<span style="color:gray;">,</span>&nbsp;[Capacity]<span style="color:gray;">,</span>&nbsp;[IsCommunal]<span style="color:gray;">)</span>
<span style="color:blue;">VALUES&nbsp;</span><span style="color:gray;">(</span>2112<span style="color:gray;">,</span>&nbsp;4<span style="color:gray;">,</span>&nbsp;1<span style="color:gray;">)</span>
<span style="color:blue;">INSERT</span>&nbsp;<span style="color:blue;">INTO</span>&nbsp;[dbo]<span style="color:gray;">.</span>[Tables]<span style="color:blue;">&nbsp;</span><span style="color:gray;">(</span>[RestaurantId]<span style="color:gray;">,</span>&nbsp;[Capacity]<span style="color:gray;">,</span>&nbsp;[IsCommunal]<span style="color:gray;">)</span>
<span style="color:blue;">VALUES&nbsp;</span><span style="color:gray;">(</span>2112<span style="color:gray;">,</span>&nbsp;2<span style="color:gray;">,</span>&nbsp;0<span style="color:gray;">)</span>
<span style="color:blue;">INSERT</span>&nbsp;<span style="color:blue;">INTO</span>&nbsp;[dbo]<span style="color:gray;">.</span>[Tables]<span style="color:blue;">&nbsp;</span><span style="color:gray;">(</span>[RestaurantId]<span style="color:gray;">,</span>&nbsp;[Capacity]<span style="color:gray;">,</span>&nbsp;[IsCommunal]<span style="color:gray;">)</span>
<span style="color:blue;">VALUES&nbsp;</span><span style="color:gray;">(</span>2112<span style="color:gray;">,</span>&nbsp;2<span style="color:gray;">,</span>&nbsp;0<span style="color:gray;">)</span>
<span style="color:blue;">INSERT</span>&nbsp;<span style="color:blue;">INTO</span>&nbsp;[dbo]<span style="color:gray;">.</span>[Tables]<span style="color:blue;">&nbsp;</span><span style="color:gray;">(</span>[RestaurantId]<span style="color:gray;">,</span>&nbsp;[Capacity]<span style="color:gray;">,</span>&nbsp;[IsCommunal]<span style="color:gray;">)</span>
<span style="color:blue;">VALUES&nbsp;</span><span style="color:gray;">(</span>2112<span style="color:gray;">,</span>&nbsp;4<span style="color:gray;">,</span>&nbsp;0<span style="color:gray;">)</span>
<span style="color:blue;">INSERT</span>&nbsp;<span style="color:blue;">INTO</span>&nbsp;[dbo]<span style="color:gray;">.</span>[Tables]<span style="color:blue;">&nbsp;</span><span style="color:gray;">(</span>[RestaurantId]<span style="color:gray;">,</span>&nbsp;[Capacity]<span style="color:gray;">,</span>&nbsp;[IsCommunal]<span style="color:gray;">)</span>
<span style="color:blue;">VALUES&nbsp;</span><span style="color:gray;">(</span>2112<span style="color:gray;">,</span>&nbsp;4<span style="color:gray;">,</span>&nbsp;0<span style="color:gray;">)</span></pre> </p>
<p> There are more rows than this, but this should give you an idea of what the data looks like. </p>
<h3 id="ba5d810c332945398ab2a870711357f1"> Reading restaurant data without an ORM <a href="#ba5d810c332945398ab2a870711357f1">#</a> </h3>
<p> Due to the foreign key relationship, reading restaurant data from the database is a little more involved than reading from a single table.
</p>
<p> <pre><span style="color:blue;">public</span>&nbsp;<span style="color:blue;">async</span>&nbsp;Task&lt;Restaurant?&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">GetRestaurant</span>(<span style="color:blue;">string</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">name</span>)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">cmd</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;SqlCommand(readByNameSql);
&nbsp;&nbsp;&nbsp;&nbsp;cmd.Parameters.AddWithValue(<span style="color:#a31515;">&quot;@Name&quot;</span>,&nbsp;name);
&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">restaurants</span>&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;ReadRestaurants(cmd);
&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;restaurants.SingleOrDefault();
}

<span style="color:blue;">private</span>&nbsp;<span style="color:blue;">const</span>&nbsp;<span style="color:blue;">string</span>&nbsp;readByNameSql&nbsp;=&nbsp;<span style="color:maroon;">@&quot;
&nbsp;&nbsp;&nbsp;&nbsp;SELECT&nbsp;[Id],&nbsp;[Name],&nbsp;[OpensAt],&nbsp;[LastSeating],&nbsp;[SeatingDuration]
&nbsp;&nbsp;&nbsp;&nbsp;FROM&nbsp;[dbo].[Restaurants]
&nbsp;&nbsp;&nbsp;&nbsp;WHERE&nbsp;[Name]&nbsp;=&nbsp;@Name
&nbsp;&nbsp;&nbsp;&nbsp;SELECT&nbsp;[RestaurantId],&nbsp;[Capacity],&nbsp;[IsCommunal]
&nbsp;&nbsp;&nbsp;&nbsp;FROM&nbsp;[dbo].[Tables]
&nbsp;&nbsp;&nbsp;&nbsp;JOIN&nbsp;[dbo].[Restaurants]
&nbsp;&nbsp;&nbsp;&nbsp;ON&nbsp;[dbo].[Tables].[RestaurantId]&nbsp;=&nbsp;[dbo].[Restaurants].[Id]
&nbsp;&nbsp;&nbsp;&nbsp;WHERE&nbsp;[Name]&nbsp;=&nbsp;@Name&quot;</span>;</pre> </p>
<p> There is more than one option when deciding how to construct the query.
You could make one query with a join, in which case you'd get rows with repeated data and would then need to detect duplicates. Alternatively, you could do as I've done here: query each table to get multiple result sets. </p>
<p> I'm not claiming that this is better in any way. I only chose this option because I found the code that I had to write less offensive. </p>
<p> Since the <code>IRestaurantDatabase</code> interface defines three different kinds of queries (<code>GetAll()</code>, <code>GetRestaurant(int id)</code>, and <code>GetRestaurant(string name)</code>), I invoked the <a href="https://en.wikipedia.org/wiki/Rule_of_three_(computer_programming)">rule of three</a> and extracted a helper method: </p>
<p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">async</span>&nbsp;Task&lt;IEnumerable&lt;Restaurant&gt;&gt;&nbsp;<span style="font-weight:bold;color:#74531f;">ReadRestaurants</span>(SqlCommand&nbsp;<span style="font-weight:bold;color:#1f377f;">cmd</span>)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">conn</span>&nbsp;=&nbsp;<span style="color:blue;">new</span>&nbsp;SqlConnection(ConnectionString);
&nbsp;&nbsp;&nbsp;&nbsp;cmd.Connection&nbsp;=&nbsp;conn;
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">await</span>&nbsp;conn.OpenAsync();
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">using</span>&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">rdr</span>&nbsp;=&nbsp;<span style="color:blue;">await</span>&nbsp;cmd.ExecuteReaderAsync();

&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">var</span>&nbsp;<span style="font-weight:bold;color:#1f377f;">restaurants</span>&nbsp;=&nbsp;Enumerable.Empty&lt;Restaurant&gt;();
&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">while</span>&nbsp;(<span style="color:blue;">await</span>&nbsp;rdr.ReadAsync())
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;restaurants&nbsp;=&nbsp;restaurants.Append(ReadRestaurantRow(rdr));

&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">if</span>&nbsp;(<span style="color:blue;">await</span>&nbsp;rdr.NextResultAsync())
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">while</span>&nbsp;(<span style="color:blue;">await</span>&nbsp;rdr.ReadAsync())
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;restaurants&nbsp;=&nbsp;ReadTableRow(rdr,&nbsp;restaurants);

&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;restaurants;
}</pre> </p>
<p> The <code>ReadRestaurants</code> method does the overall work of opening the database connection, executing the query, and moving through rows and result sets. Again, we'll find mapping code hidden in helper methods: </p>
<p> <pre><span style="color:blue;">private</span>&nbsp;<span style="color:blue;">static</span>&nbsp;Restaurant&nbsp;<span style="color:#74531f;">ReadRestaurantRow</span>(SqlDataReader&nbsp;<span style="font-weight:bold;color:#1f377f;">rdr</span>)
{
&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-weight:bold;color:#8f08c4;">return</span>&nbsp;<span style="color:blue;">new</span>&nbsp;Restaurant(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">int</span>)rdr[<span style="color:#a31515;">&quot;Id&quot;</span>],
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(<span style="color:blue;">string</span>)rdr[<span style="color:#a31515;">&quot;Name&quot;</span>],
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;MaitreD(
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:blue;">new</span>&nbsp;TimeOfDay((