ploeh blog danish software design
A Binary Tree Zipper in C#
A port of another Haskell example, still just because.
This article is part of a series about Zippers. In this one, I port the Zipper data structure from the Learn You a Haskell for Great Good! article also called Zippers.
A word of warning: I'm assuming that you're familiar with the contents of that article, so I'll skip the pedagogical explanations; I can hardly do it better than it's done there. Additionally, I'll make heavy use of certain standard constructs to port Haskell code, most notably Church encoding to model sum types in languages that don't natively have them. Such as C#. In some cases, I'll implement the Church encoding using the data structure's catamorphism. Since the cyclomatic complexity of the resulting code is quite low, you may be able to follow what's going on even if you don't know what Church encoding or catamorphisms are, but if you want to understand the background and motivation for that style of programming, you can consult the cited resources.
The code shown in this article is available on GitHub.
Binary tree initialization and structure #
In the Haskell code, the binary Tree type is a recursive sum type, defined on a single line of code. C#, on the other hand, has no built-in language construct that supports sum types, so a more elaborate solution is required. At least two options are available to us. One is to model a sum type as a Visitor. Another is to use Church encoding. In this article, I'll do the latter.
I find the type name (Tree) used in the Zippers article a bit too vague, and since I consider explicit better than implicit, I'll use a more precise class name:
public sealed class BinaryTree<T>
Even so, there are different kinds of binary trees. In a previous article I've shown a catamorphism for a full binary tree. This variation is not as strict, since it allows a node to have zero, one, or two children. Or, strictly speaking, a node always has exactly two children, but both, or one of them, may be empty. BinaryTree<T> uses Church encoding to distinguish between the two, but we'll return to that in a moment.
First, we'll examine how the class allows initialization:
private readonly IBinaryTree root;

private BinaryTree(IBinaryTree root)
{
    this.root = root;
}

public BinaryTree() : this(Empty.Instance)
{
}

public BinaryTree(T value, BinaryTree<T> left, BinaryTree<T> right)
    : this(new Node(value, left.root, right.root))
{
}
The class uses a private root object to implement behaviour, and constructor chaining for initialization. The master constructor is private, since the IBinaryTree interface is private. The parameterless constructor implicitly indicates an empty node, whereas the other public constructor indicates a node with a value and two children. Yes, I know that I just wrote that explicit is better than implicit, but it turns out that with the target-typed new operator feature in C#, constructing trees in code becomes easier with this design choice:
BinaryTree<int> sut = new(
    42,
    new(),
    new(2, new(), new()));
As the variable name suggests, I've taken this code example from a unit test.
Private interface #
The class delegates method calls to the root field, which is an instance of the private, nested IBinaryTree interface:
private interface IBinaryTree
{
    TResult Aggregate<TResult>(
        Func<TResult> whenEmpty,
        Func<T, TResult, TResult, TResult> whenNode);
}
Why is IBinaryTree a private interface? Why does that interface even exist?
To be frank, I could have chosen another implementation strategy. Since there are only two mutually exclusive alternatives (node or empty), I could also have indicated which is which with a Boolean flag. You can see an example of that implementation tactic in the Table class in the sample code that accompanies Code That Fits in Your Head.
Using a Boolean flag, however, only works when there are exactly two choices. If you have three or more, things become more complicated. You could try to use an enum, but in most languages, these tend to be nothing but glorified integers, and are typically not type-safe. If you define a three-way enum, there's no guarantee that a value of that type takes only one of these three values, and a good compiler will typically insist that you check for any other value as well. The C# compiler certainly does.
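To make the point about enums concrete, here's a minimal sketch. The Shape enum and the Describe function are hypothetical names invented for this illustration; they aren't part of the article's code base. The sketch shows that an arbitrary integer converts to the enum type, which is why a switch over the three named values still needs a fallback arm.

```csharp
using System;

// An out-of-range integer converts to the enum type without complaint.
Shape bogus = (Shape)42;

Console.WriteLine(Enum.IsDefined(typeof(Shape), bogus)); // False
Console.WriteLine(Describe(Shape.Square));               // four corners
Console.WriteLine(Describe(bogus));                      // not a Shape at all

static string Describe(Shape shape)
{
    return shape switch
    {
        Shape.Circle => "no corners",
        Shape.Triangle => "three corners",
        Shape.Square => "four corners",
        // Without this arm, the compiler warns that the switch doesn't
        // handle unnamed enum values such as (Shape)42.
        _ => "not a Shape at all"
    };
}

// Hypothetical three-way enum, for illustration only.
enum Shape { Circle, Triangle, Square }
```

A Church-encoded type avoids the fallback arm, because client code can't manufacture a fourth case.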
Church encoding offers a better alternative, but since it makes use of polymorphism, the most idiomatic choice in C# is either an interface or a base class. Since I favour interfaces over base classes, that's what I've chosen here, but for the purposes of this little digression, it makes no difference: The following argument applies to base classes as well.
An interface (or base class) suggests to users of an API that they can implement it in order to extend behaviour. That's an impression I don't wish to give client developers. The purpose of the interface is exclusively to enable double dispatch to work. There are only two implementations of the IBinaryTree interface, and under no circumstances should there be more.
The interface is an implementation detail, which is why both it, and its implementations, are private.
Binary tree catamorphism #
The IBinaryTree interface defines a catamorphism for the BinaryTree<T> class. Since we may often view a catamorphism as a sort of 'generalized fold', and since these kinds of operations in C# are typically called Aggregate, that's what I've called the method.
An aggregate function affords a way to traverse a data structure and collect information into a single value, here of type TResult. The return type may, however, be a complex type, including another BinaryTree<T>. You'll see examples of complex return values later in this article.
As already discussed, there are exactly two implementations of IBinaryTree. The one representing an empty node is the simplest:
private sealed class Empty : IBinaryTree
{
    public static readonly Empty Instance = new();

    private Empty()
    {
    }

    public TResult Aggregate<TResult>(
        Func<TResult> whenEmpty,
        Func<T, TResult, TResult, TResult> whenNode)
    {
        return whenEmpty();
    }
}
The Aggregate implementation unconditionally calls the supplied whenEmpty function, which returns some TResult value unknown to the Empty class.
Although not strictly necessary, I've made the class a Singleton. Since I like to take advantage of structural equality to write better tests, it was either that, or overriding Equals and GetHashCode.
The other implementation gets around that problem by being a record:
private sealed record Node(T Value, IBinaryTree Left, IBinaryTree Right) : IBinaryTree
{
    public TResult Aggregate<TResult>(
        Func<TResult> whenEmpty,
        Func<T, TResult, TResult, TResult> whenNode)
    {
        return whenNode(
            Value,
            Left.Aggregate(whenEmpty, whenNode),
            Right.Aggregate(whenEmpty, whenNode));
    }
}
It, too, unconditionally calls one of the two functions passed to its Aggregate method, but this time whenNode. It does that, however, by first recursively calling Aggregate on both Left and Right. It needs to do that because the whenNode function expects the subtrees to have been already converted to values of the TResult return type. This is a common pattern with catamorphisms, and takes a bit of time getting used to. You can see similar examples in the articles Tree catamorphism, Rose tree catamorphism, and Full binary tree catamorphism.
The BinaryTree<T> class defines a public Aggregate method that delegates to its root field:
public TResult Aggregate<TResult>(
    Func<TResult> whenEmpty,
    Func<T, TResult, TResult, TResult> whenNode)
{
    return root.Aggregate(whenEmpty, whenNode);
}
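As a rough illustration of how client code might use Aggregate, the following sketch counts the nodes, sums the values, and measures the depth of a small tree. The BinaryTree<T> class is a condensed copy of the code shown so far; the three folds and the variable names are assumptions made for the example rather than code from the article's repository.

```csharp
using System;

BinaryTree<int> tree = new(
    42,
    new(),
    new(2, new(), new()));

// An empty tree counts as 0; a node counts as 1 plus both subtrees.
int count = tree.Aggregate(() => 0, (_, l, r) => 1 + l + r);

// By the time whenNode runs, l and r are already the subtrees' sums.
int sum = tree.Aggregate(() => 0, (x, l, r) => x + l + r);

// Depth: the longest path from this node down to an empty tree.
int depth = tree.Aggregate(() => 0, (_, l, r) => 1 + Math.Max(l, r));

Console.WriteLine((count, sum, depth)); // (2, 44, 2)

// Condensed copy of the article's class, limited to what the demo needs.
public sealed class BinaryTree<T>
{
    private readonly IBinaryTree root;

    private BinaryTree(IBinaryTree root) { this.root = root; }

    public BinaryTree() : this(Empty.Instance) { }

    public BinaryTree(T value, BinaryTree<T> left, BinaryTree<T> right)
        : this(new Node(value, left.root, right.root)) { }

    public TResult Aggregate<TResult>(
        Func<TResult> whenEmpty,
        Func<T, TResult, TResult, TResult> whenNode)
    {
        return root.Aggregate(whenEmpty, whenNode);
    }

    private interface IBinaryTree
    {
        TResult Aggregate<TResult>(
            Func<TResult> whenEmpty,
            Func<T, TResult, TResult, TResult> whenNode);
    }

    private sealed class Empty : IBinaryTree
    {
        public static readonly Empty Instance = new();

        private Empty() { }

        public TResult Aggregate<TResult>(
            Func<TResult> whenEmpty,
            Func<T, TResult, TResult, TResult> whenNode)
        {
            return whenEmpty();
        }
    }

    private sealed record Node(T Value, IBinaryTree Left, IBinaryTree Right) : IBinaryTree
    {
        public TResult Aggregate<TResult>(
            Func<TResult> whenEmpty,
            Func<T, TResult, TResult, TResult> whenNode)
        {
            return whenNode(
                Value,
                Left.Aggregate(whenEmpty, whenNode),
                Right.Aggregate(whenEmpty, whenNode));
        }
    }
}
```

In each fold, the l and r parameters are results, not subtrees, which is the part that tends to take some getting used to.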
The astute reader may now remark that the Aggregate method doesn't look like a Church encoding.
Binary tree Church encoding #
A Church encoding will typically have a Match method that enables client code to match on all the alternative cases in the sum type, without those confusing already-converted TResult values. It turns out that you can implement the desired Match method with the Aggregate method.
One of the advantages of doing meaningless coding exercises like this one is that you can pursue various ideas that interest you. One idea that interests me is the potential universality of catamorphisms. I conjecture that a catamorphism is an algebraic data type's universal API, and that you can implement all other methods or functions with it. I admit that I haven't done much research in the form of perusing existing literature, but at least it seems to be the case conspicuously often.
As it is here.
public TResult Match<TResult>(
    Func<TResult> whenEmpty,
    Func<T, BinaryTree<T>, BinaryTree<T>, TResult> whenNode)
{
    return root
        .Aggregate(
            () => (tree: new BinaryTree<T>(), result: whenEmpty()),
            (x, l, r) => (
                new BinaryTree<T>(x, l.tree, r.tree),
                whenNode(x, l.tree, r.tree)))
        .result;
}
Now, I readily admit that it took me a couple of hours tossing and turning in my bed before this solution came to me. I don't find it intuitive at all, but it works.
The Aggregate method requires that the whenNode function's left and right values are of the same TResult type as the return type. How do we reconcile that requirement with the Match method's variation, where its whenNode function requires the left and right values to be BinaryTree<T> values, but the return type is still TResult?
The way out of this conundrum, it turns out, is to combine both in a tuple. Thus, when Match calls Aggregate, the implied TResult type is not the TResult visible in the Match method declaration. Rather, it's inferred to be of the type (BinaryTree<T>, TResult). That is, a tuple where the first element is a BinaryTree<T> value, and the second element is a TResult value. The C# compiler's type inference engine then figures out that (BinaryTree<T>, TResult) must also be the return type of the Aggregate method call.
That's not what Match should return, but the second tuple element contains a value of the correct type, so it returns that. Since I've given the tuple elements names, the Match implementation accomplishes that by returning the result tuple field.
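The following sketch shows what calling Match might look like from client code, and makes one consequence of the tuple technique visible. The BinaryTree<T> class is a condensed copy of the article's code with only Match exposed; the IsEmpty helper and the calls counter are hypothetical additions for the demonstration.

```csharp
using System;

BinaryTree<int> tree = new(
    42,
    new(),
    new(2, new(), new()));

// Here l and r are real subtrees, so they can be matched on in turn.
string description = tree.Match(
    whenEmpty: () => "empty",
    whenNode: (x, l, r) =>
        $"{x}, left empty: {IsEmpty(l)}, right empty: {IsEmpty(r)}");
Console.WriteLine(description); // 42, left empty: True, right empty: False

// Because Match is built on the catamorphism, whenNode runs for every node
// during the fold, even though only the root's result is returned.
var calls = 0;
tree.Match(() => 0, (_, _, _) => ++calls);
Console.WriteLine(calls); // 2

// Hypothetical helper, not part of the article's API.
static bool IsEmpty<T>(BinaryTree<T> t) => t.Match(() => true, (_, _, _) => false);

// Condensed copy of the article's class, limited to what the demo needs.
public sealed class BinaryTree<T>
{
    private readonly IBinaryTree root;

    private BinaryTree(IBinaryTree root) { this.root = root; }

    public BinaryTree() : this(Empty.Instance) { }

    public BinaryTree(T value, BinaryTree<T> left, BinaryTree<T> right)
        : this(new Node(value, left.root, right.root)) { }

    public TResult Match<TResult>(
        Func<TResult> whenEmpty,
        Func<T, BinaryTree<T>, BinaryTree<T>, TResult> whenNode)
    {
        return root
            .Aggregate(
                () => (tree: new BinaryTree<T>(), result: whenEmpty()),
                (x, l, r) => (
                    new BinaryTree<T>(x, l.tree, r.tree),
                    whenNode(x, l.tree, r.tree)))
            .result;
    }

    private interface IBinaryTree
    {
        TResult Aggregate<TResult>(
            Func<TResult> whenEmpty,
            Func<T, TResult, TResult, TResult> whenNode);
    }

    private sealed class Empty : IBinaryTree
    {
        public static readonly Empty Instance = new();

        private Empty() { }

        public TResult Aggregate<TResult>(
            Func<TResult> whenEmpty,
            Func<T, TResult, TResult, TResult> whenNode) => whenEmpty();
    }

    private sealed record Node(T Value, IBinaryTree Left, IBinaryTree Right) : IBinaryTree
    {
        public TResult Aggregate<TResult>(
            Func<TResult> whenEmpty,
            Func<T, TResult, TResult, TResult> whenNode) =>
            whenNode(
                Value,
                Left.Aggregate(whenEmpty, whenNode),
                Right.Aggregate(whenEmpty, whenNode));
    }
}
```

The second call suggests that functions passed to this Match implementation are best kept pure and cheap, since they run once per node rather than once per call.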
Breadcrumbs #
That's just the tree that we want to zip. So far, we can only move from root to branches, but not the other way. Before we can define a Zipper for the tree, we need a data structure to store breadcrumbs (the navigation log, if you will).
In Haskell it's just another one-liner, but in C# this requires another full-fledged class:
public sealed class Crumb<T>
It's another sum type, so once more, I make the constructor private and use a private class field for the implementation:
private readonly ICrumb imp;

private Crumb(ICrumb imp)
{
    this.imp = imp;
}

internal static Crumb<T> Left(T value, BinaryTree<T> right)
{
    return new(new LeftCrumb(value, right));
}

internal static Crumb<T> Right(T value, BinaryTree<T> left)
{
    return new(new RightCrumb(value, left));
}
To stay consistent throughout the code base, I also use Church encoding to distinguish between a Left and Right breadcrumb, and the technique is similar. First, define a private interface:
private interface ICrumb
{
    TResult Match<TResult>(
        Func<T, BinaryTree<T>, TResult> whenLeft,
        Func<T, BinaryTree<T>, TResult> whenRight);
}
Then, use private nested types to implement the interface.
private sealed record LeftCrumb(T Value, BinaryTree<T> Right) : ICrumb
{
    public TResult Match<TResult>(
        Func<T, BinaryTree<T>, TResult> whenLeft,
        Func<T, BinaryTree<T>, TResult> whenRight)
    {
        return whenLeft(Value, Right);
    }
}
The RightCrumb record is essentially just the 'mirror image' of the LeftCrumb record, and just as was the case with BinaryTree<T>, the Crumb<T> class exposes an externally accessible Match method that just delegates to the private class field:
public TResult Match<TResult>(
    Func<T, BinaryTree<T>, TResult> whenLeft,
    Func<T, BinaryTree<T>, TResult> whenRight)
{
    return imp.Match(whenLeft, whenRight);
}
Finally, all the building blocks are ready for the actual Zipper.
Zipper data structure and initialization #
In the Haskell code, the Zipper is another one-liner, and really just a type alias. In C#, once more, we're going to need a full class.
public sealed class BinaryTreeZipper<T>
The Haskell article simply calls this type alias Zipper, but I find that name too general, since there's more than one kind of Zipper. I think I understand that the article chooses that name for didactic reasons, but here I've chosen a more consistent disambiguation scheme, so I've named the class BinaryTreeZipper<T>.
The Haskell example is just a type alias for a tuple, and the C# class is similar, although with significantly more ceremony:
public BinaryTree<T> Tree { get; }
public IEnumerable<Crumb<T>> Breadcrumbs { get; }

private BinaryTreeZipper(
    BinaryTree<T> tree,
    IEnumerable<Crumb<T>> breadcrumbs)
{
    Tree = tree;
    Breadcrumbs = breadcrumbs;
}

public BinaryTreeZipper(BinaryTree<T> tree) : this(tree, [])
{
}
I've here chosen to add an extra bit of encapsulation by making the master constructor private. This prevents client code from creating an arbitrary object with breadcrumbs without having navigated through the tree. To be honest, I don't think it violates any contract even if we allow this, but it at least highlights that the role of Breadcrumbs is to keep a log of what previously happened to the object.
Navigation #
We can now reproduce the navigation functions from the Haskell article.
public BinaryTreeZipper<T>? GoLeft()
{
    return Tree.Match<BinaryTreeZipper<T>?>(
        whenEmpty: () => null,
        whenNode: (x, l, r) => new BinaryTreeZipper<T>(
            l,
            Breadcrumbs.Prepend(Crumb.Left(x, r))));
}
Going left 'pattern-matches' on the Tree and, if not empty, constructs a new BinaryTreeZipper object with the left tree, and a Left breadcrumb that stores the 'current' node value and the right subtree. If the 'current' node is empty, on the other hand, the method returns null. This possibility is explicitly indicated by the BinaryTreeZipper<T>? return type; notice the question mark, which indicates that the value may be null. If you're working in a context or language where that feature isn't available, you may instead consider taking advantage of the Maybe monad (which is also what you'd idiomatically do in Haskell).
The GoRight method is similar to GoLeft.
We may also attempt to navigate up in the tree, undoing our last downward move:
public BinaryTreeZipper<T>? GoUp()
{
    if (!Breadcrumbs.Any())
        return null;

    var head = Breadcrumbs.First();
    var tail = Breadcrumbs.Skip(1);

    return head.Match(
        whenLeft: (x, r) => new BinaryTreeZipper<T>(
            new BinaryTree<T>(x, Tree, r),
            tail),
        whenRight: (x, l) => new BinaryTreeZipper<T>(
            new BinaryTree<T>(x, l, Tree),
            tail));
}
This is another operation that may fail. If we're already at the root of the tree, there are no Breadcrumbs, in which case the only option is to return a value indicating that the operation failed; here, null, but in other languages perhaps None or Nothing.
If, on the other hand, there's at least one breadcrumb, the GoUp method uses the most recent one (head) to construct a new BinaryTreeZipper<T> object that reconstitutes the opposite (sibling) subtree and the parent node. It does that by 'pattern-matching' on the head breadcrumb, which enables it to distinguish a left breadcrumb from a right breadcrumb.
Finally, we may keep trying to GoUp until we reach the root:
public BinaryTreeZipper<T> TopMost()
{
    return GoUp()?.TopMost() ?? this;
}
You'll see an example of that a little later.
Modifications #
Continuing the port of the Haskell code, we can Modify the current node with a function:
public BinaryTreeZipper<T> Modify(Func<T, T> f)
{
    return new BinaryTreeZipper<T>(
        Tree.Match(
            whenEmpty: () => new BinaryTree<T>(),
            whenNode: (x, l, r) => new BinaryTree<T>(f(x), l, r)),
        Breadcrumbs);
}
This operation always succeeds, since it chooses to ignore the change if the tree is empty. Thus, there's no question mark on the return type, indicating that the method never returns null.
Finally, we may replace a node with a new subtree:
public BinaryTreeZipper<T> Attach(BinaryTree<T> tree)
{
    return new BinaryTreeZipper<T>(tree, Breadcrumbs);
}
The following unit test demonstrates a combination of several of the methods shown above:
[Fact]
public void AttachAndGoTopMost()
{
    var sut = new BinaryTreeZipper<char>(freeTree);

    var farLeft = sut.GoLeft()?.GoLeft()?.GoLeft()?.GoLeft();
    var actual = farLeft?.Attach(new('Z', new(), new())).TopMost();

    Assert.NotNull(actual);
    Assert.Equal(
        new('P',
            new('O',
                new('L',
                    new('N',
                        new('Z', new(), new()),
                        new()),
                    new('T', new(), new())),
                new('Y',
                    new('S', new(), new()),
                    new('A', new(), new()))),
            new('L',
                new('W',
                    new('C', new(), new()),
                    new('R', new(), new())),
                new('A',
                    new('A', new(), new()),
                    new('C', new(), new())))),
        actual.Tree);
    Assert.Empty(actual.Breadcrumbs);
}
The test starts with freeTree (not shown) and first navigates to the leftmost empty node. Here it uses Attach to add a new 'singleton' subtree with the value 'Z'. Finally, it uses TopMost to return to the root node.
In the Assert phase, the test verifies that the actual object contains the expected values.
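For readers who'd like to run the round trip without the full code base, here's a deliberately simplified sketch of the same idea. It does not use the article's Church-encoded BinaryTree<T> and Crumb<T> types. Instead, it assumes plain records, with null standing in for the empty tree and a Boolean flag for the breadcrumb direction. The names Tree, Crumb, and Zipper are stand-ins chosen for brevity, and GoRight is written as the mirror image of GoLeft, which is an assumption about a method the article doesn't show.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var tree = new Tree('P',
    new Tree('O', null, null),
    new Tree('L', null, null));

// Go left twice to land on the empty slot under 'O', attach 'Z' there,
// and then zip all the way back up to the root.
var actual = new Zipper(tree)
    .GoLeft()?.GoLeft()?.Attach(new Tree('Z', null, null)).TopMost();

Console.WriteLine(actual?.Focus?.Left?.Left?.Value); // Z
Console.WriteLine(actual?.Crumbs.Any());             // False

// Simplified stand-ins (assumptions): a null Tree means 'empty'.
public sealed record Tree(char Value, Tree? Left, Tree? Right);

// WentLeft records which direction was taken; Sibling is the other subtree.
public sealed record Crumb(bool WentLeft, char Value, Tree? Sibling);

public sealed record Zipper(Tree? Focus, IEnumerable<Crumb> Crumbs)
{
    public Zipper(Tree? focus) : this(focus, Enumerable.Empty<Crumb>()) { }

    public Zipper? GoLeft() => Focus is null
        ? null
        : new Zipper(
            Focus.Left,
            Crumbs.Prepend(new Crumb(true, Focus.Value, Focus.Right)));

    public Zipper? GoRight() => Focus is null
        ? null
        : new Zipper(
            Focus.Right,
            Crumbs.Prepend(new Crumb(false, Focus.Value, Focus.Left)));

    public Zipper? GoUp()
    {
        if (!Crumbs.Any())
            return null;

        var head = Crumbs.First();
        var tail = Crumbs.Skip(1);

        // Rebuild the parent with the focus on the side it came from.
        var parent = head.WentLeft
            ? new Tree(head.Value, Focus, head.Sibling)
            : new Tree(head.Value, head.Sibling, Focus);
        return new Zipper(parent, tail);
    }

    public Zipper TopMost() => GoUp()?.TopMost() ?? this;

    public Zipper Attach(Tree? tree) => new Zipper(tree, Crumbs);
}
```

The trade-off is that null and a Boolean flag only work because each type has exactly two cases; the Church-encoded versions scale to more.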
Conclusion #
The Tree Zipper shown here is a port of the example given in the Haskell Zippers article. As I've already discussed in the introduction article, this data structure doesn't make much sense in C#, where you can easily implement a navigable tree with two-way links. Even if this requires state mutation, you can package such a data structure in a proper object with good encapsulation, so that operations don't leave any dangling pointers or the like.
As far as I can tell, the code shown in this article isn't useful in production code, but I hope that, at least, you still learned something from it. I always learn a new thing or two from doing programming exercises and writing about them, and this was no exception.
In the next article, I continue with the final of the Haskell article's three examples.
Next: FSZipper in C#.
Keeping cross-cutting concerns out of application code
Don't inject third-party dependencies. Use Decorators.
I recently came across a Stack Overflow question that reminded me of a topic I've been meaning to write about for a long time: Cross-cutting concerns.
When it comes to the usual suspects (logging, fault tolerance, caching), the best solution is usually to apply the Decorator pattern.
I often see code that uses Dependency Injection (DI) to inject, say, a logging interface into application code. You can see an example of that in Repeatable execution, as well as a suggestion for a better design. Not surprisingly, the better design involves logging Decorators.
The Stack Overflow question isn't about logging, but rather about fault tolerance; Circuit Breaker, retry policies, timeouts, etc.
Injected concern #
The question does a good job of presenting a minimal, reproducible example. At the outset, the code looks like this:
public class MyApi
{
    private readonly ResiliencePipeline pipeline;
    private readonly IOrganizationService service;

    public MyApi(ResiliencePipelineProvider<string> provider, IOrganizationService service)
    {
        this.pipeline = provider.GetPipeline("retry-pipeline");
        this.service = service;
    }

    public List<string> GetSomething(QueryByAttribute query)
    {
        var result = this.pipeline.Execute(() => service.RetrieveMultiple(query));
        return result.Entities.Cast<string>().ToList();
    }
}
The Stack Overflow question asks how to test this implementation, but I'd rather take the example as an opportunity to discuss design alternatives. Not surprisingly, it turns out that with a more decoupled design, testing becomes easier, too.
Before we proceed, a few words about this example code. I assume that this isn't Andy Cooke's actual production code. Rather, I interpret it as a reduced example that highlights the actual question. This is important because you might ask: Why bother testing two lines of code?
Indeed, as presented, the GetSomething method is so simple that you may consider not testing it. Thus, I interpret the second line of code as a stand-in for more complicated production code. Hold on to that thought, because once I'm done, that's all that's going to be left, and you may then think that it's so simple that it really doesn't warrant all this hoo-ha.
Coupling #
As shown, the MyApi class is coupled to Polly, because ResiliencePipeline is defined by that library. To be clear, all I've heard is that Polly is a fine library. I've used it for a few projects myself, but I also admit that I don't have that much experience with it. I'd probably use it again the next time I need a Circuit Breaker or similar, so the following discussion isn't a denouncement of Polly. Rather, it applies to all third-party dependencies, or perhaps even dependencies that are part of your language's base library.
Coupling is a major cause of spaghetti code and code rot in general. To write sustainable code, you should be cognizant of coupling. The most decoupled code is code that you can easily delete.
This doesn't mean that you shouldn't use high-quality third-party libraries like Polly. Among myriads of software engineering heuristics, we know that we should be aware of the not-invented-here syndrome.
When it comes to classic cross-cutting concerns, the Decorator pattern is usually a better design than injecting the concern into application code. The above example clearly looks innocuous, but imagine injecting a ResiliencePipeline, a logger, and perhaps a caching service, and your real application code eventually disappears in 'infrastructure code'.
It's not that we don't want to have these third-party dependencies, but rather that we want to move them somewhere else.
Resilient Decorator #
The concern in the above example is the desire to make the IOrganizationService dependency more resilient. The MyApi class only becomes more resilient as a transitive effect. The first refactoring step, then, is to introduce a resilient Decorator.
public sealed class ResilientOrganizationService(
    ResiliencePipeline pipeline,
    IOrganizationService inner)
    : IOrganizationService
{
    public QueryResult RetrieveMultiple(QueryByAttribute query)
    {
        return pipeline.Execute(() => inner.RetrieveMultiple(query));
    }
}
As Decorators must, this class composes another IOrganizationService while also implementing that interface itself. It does so by being an Adapter over the Polly API.
I've applied Nikola Malovic's 4th law of DI:
"Every constructor of a class being resolved should not have any implementation other then accepting a set of its own dependencies."
Instead of injecting a ResiliencePipelineProvider<string> only to call GetPipeline on it, it just receives a ResiliencePipeline and saves the object for use in the RetrieveMultiple method. It does that via a primary constructor, which is a recent C# language addition. It's just syntactic sugar for Constructor Injection, and as usual F# developers should feel right at home.
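To show the shape of the pattern without any third-party library, here's a minimal, self-contained sketch of a retrying Decorator. Everything in it is hypothetical: IInventory stands in for a port like IOrganizationService, FlakyInventory simulates transient faults, and the hand-rolled retry loop is only a crude substitute for what a Polly pipeline would do.

```csharp
using System;

// Composition: the consumer only ever sees an IInventory.
IInventory flaky = new FlakyInventory(failures: 2);
IInventory resilient = new RetryingInventory(flaky, maxAttempts: 3);

Console.WriteLine(resilient.CountInStock("widget")); // 7

// Hypothetical port that application code depends on.
public interface IInventory
{
    int CountInStock(string sku);
}

// Simulated dependency that throws a given number of times, then succeeds.
public sealed class FlakyInventory(int failures) : IInventory
{
    private int remaining = failures;

    public int CountInStock(string sku)
    {
        if (remaining-- > 0)
            throw new TimeoutException("simulated transient fault");
        return 7;
    }
}

// Decorator: implements IInventory by wrapping another IInventory.
public sealed class RetryingInventory(IInventory inner, int maxAttempts) : IInventory
{
    public int CountInStock(string sku)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return inner.CountInStock(sku);
            }
            catch (TimeoutException) when (attempt < maxAttempts)
            {
                // Swallow the transient fault and try again. On the last
                // attempt the filter is false, so the exception propagates.
            }
        }
    }
}
```

Code that consumes IInventory can't tell whether it talks to the flaky implementation or the Decorator, which is what keeps the resilience concern out of it.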
Simplifying MyApi #
Now that you have a resilient version of IOrganizationService you don't need to have any Polly code in MyApi. Remove it and simplify:
public class MyApi
{
    private readonly IOrganizationService service;

    public MyApi(IOrganizationService service)
    {
        this.service = service;
    }

    public List<string> GetSomething(QueryByAttribute query)
    {
        var result = service.RetrieveMultiple(query);
        return result.Entities.Cast<string>().ToList();
    }
}
As promised, there's almost nothing left of it now, but I'll remind you that I consider the second line of GetSomething as a stand-in for something more complicated that you might need to test. As it is now, though, testing it is trivial:
[Theory]
[InlineData("foo", "bar", "baz")]
[InlineData("qux", "quux", "corge")]
[InlineData("grault", "garply", "waldo")]
public void GetSomething(params string[] expected)
{
    var service = new Mock<IOrganizationService>();
    service
        .Setup(s => s.RetrieveMultiple(new QueryByAttribute()))
        .Returns(new QueryResult(expected));
    var sut = new MyApi(service.Object);

    var actual = sut.GetSomething(new QueryByAttribute());

    Assert.Equal(expected, actual);
}
The larger point, however, is that not only have you now managed to keep third-party dependencies out of your application code, you've also simplified it and made it easier to test.
Composition #
You can still create a resilient MyApi object in your Composition Root:
var service = new ResilientOrganizationService(pipeline, inner);
var myApi = new MyApi(service);
Decomposing the problem in this way, you decouple your application code from third-party dependencies. You can define ResilientOrganizationService in the application's Composition Root, which also keeps the Polly dependency there. Even so, you can implement MyApi as part of your application layer.
I usually illustrate Ports and Adapters, or, if you will, Clean Architecture as concentric circles, but in this diagram I've skewed the circles to make space for the boxes. In other words, the diagram is 'not to scale'. Ideally, the outermost layer is much smaller and thinner than any of the other layers. I've also included an inner green layer which indicates the architecture's Domain Model, but since I assume that MyApi is part of some application layer, I've left the Domain Model empty.
Reasons to decouple #
Why is it important to decouple application code from Polly? First, keep in mind that in this discussion Polly is just a stand-in for any third-party dependency. It's up to you as a software architect to decide how you'll structure your code, but third-party dependencies are one of the first things I look for. A third-party component changes with time, and often independently of your base platform. You may have to deal with breaking changes or security patches at inopportune times. The organization that maintains the component may cease to operate. This happens to commercial entities and open-source contributors alike, although for different reasons.
Second, even a top-tier library like Polly will undergo changes. If your time horizon is five to ten years, you'll be surprised how much things change. You may protest that no-one designs software systems with such a long view, but I think that if you ask the business people involved with your software, they most certainly expect your system to last a long time.
I believe that I heard on a podcast that some Microsoft teams had taken a dependency on Polly. Assuming, for the sake of argument, that this is true, while we may not wish to depend on some random open-source component, depending on Polly is safe, right? In the long run, it isn't. Five years ago, you had the same situation with Json.NET, but then Microsoft hired James Newton-King and had him make a JSON API as part of the .NET base library. While Json.NET isn't dead by any means, now you have two competing JSON libraries, and Microsoft uses their own in the frameworks and libraries that they release.
Deciding to decouple your application code from a third-party component is ultimately a question of risk management. It's up to you to make the bet. Do you pay the up-front cost of decoupling, or do you postpone it, hoping it'll never be necessary?
I usually do the former, because the cost is low, and there are other benefits as well. As I've already touched on, unit testing becomes easier.
Configuration #
Since Polly only lives in the Composition Root, you'll also need to define the ResiliencePipeline there. You can write the code that creates that pipeline wherever you like, but it might be natural to make it a creation function on the ResilientOrganizationService class:
public static ResiliencePipeline CreatePipeline()
{
    return new ResiliencePipelineBuilder()
        .AddRetry(new RetryStrategyOptions { MaxRetryAttempts = 4 })
        .AddTimeout(TimeSpan.FromSeconds(1))
        .Build();
}
That's just an example, and perhaps not what you'd like to do. Perhaps you rather want some of these values to be defined in a configuration file. Thus, this isn't what you have to do, but rather what you could do.
If you use this option, however, you could take the return value of this method and inject it into the ResilientOrganizationService constructor.
Conclusion #
Cross-cutting concerns, like caching, logging, security, or, in this case, fault tolerance, are usually best addressed with the Decorator pattern. In this article, you saw an example of using the Decorator pattern to decouple the concern of fault tolerance from the consumer of the service that you need to handle in a fault-tolerant manner.
The specific example dealt with the Polly library, but the point isn't that Polly is a particularly nasty third-party component that you need to protect yourself against. Rather, it just so happened that I came across a Stack Overflow question that used Polly, and I thought it was a nice example.
As far as I can tell, Polly is actually one of the top .NET open-source packages, so this article is not a denouncement of Polly. It's just a sketch of how to move useful dependencies around in your code base to make sure that they impact your application code as little as possible.
A List Zipper in C#
A port of a Haskell example, just because.
This article is part of a series about Zippers. In this one, I port the ListZipper data structure from the Learn You a Haskell for Great Good! article also called Zippers.
A word of warning: I'm assuming that you're familiar with the contents of that article, so I'll skip the pedagogical explanations; I can hardly do it better than it's done there.
The code shown in this article is available on GitHub.
Initialization and structure #
In the Haskell code, ListZipper is just a type alias, but C# doesn't have that, so instead, we'll have to introduce a class.
public sealed class ListZipper<T> : IEnumerable<T>
Since it implements IEnumerable<T>, it may be used like any other sequence, but it also comes with some special operations that enable client code to move forward and backward, as well as insert and remove values.
The class has the following fields, properties, and constructors:
private readonly IEnumerable<T> values;

public IEnumerable<T> Breadcrumbs { get; }

private ListZipper(IEnumerable<T> values, IEnumerable<T> breadcrumbs)
{
    this.values = values;
    Breadcrumbs = breadcrumbs;
}

public ListZipper(IEnumerable<T> values) : this(values, [])
{
}

public ListZipper(params T[] values) : this(values.AsEnumerable())
{
}
It uses constructor chaining to initialize a ListZipper object with proper encapsulation. Notice that the master constructor is private. This prevents client code from initializing an object with arbitrary Breadcrumbs. Rather, the Breadcrumbs (the log, if you will) is going to be the result of various operations performed by client code, and only the ListZipper class itself can use this constructor.
You may consider the constructor that takes a single IEnumerable<T> as the 'main' public constructor, and the other one as a convenience that enables a client developer to write code like new ListZipper<string>("foo", "bar", "baz").
The class' IEnumerable<T> implementation only enumerates the values:
public IEnumerator<T> GetEnumerator()
{
    return values.GetEnumerator();
}
In other words, when enumerating a ListZipper, you only get the 'forward' values. Client code may still examine the Breadcrumbs, since this is a public property, but it should have little need for that.
(I admit that making Breadcrumbs public is a concession to testability, since it enabled me to write assertions against this property. It's a form of structural inspection, which is a technique that I use much less than I did a decade ago. Still, in this case, while you may argue that it violates information hiding, it at least doesn't allow client code to put an object in an invalid state. Had the ListZipper class been a part of a reusable library, I would probably have hidden that data, too, but since this is exercise code, I found this an acceptable compromise. Notice, too, that in the original Haskell code, the breadcrumbs are available to client code.)
Regular readers of this blog may be aware that I usually favour IReadOnlyCollection<T> over IEnumerable<T>. Here, on the other hand, I've allowed values
to be any IEnumerable<T>
, which includes infinite sequences. I decided to do that because Haskell lists, too, may be infinite, and as far as I can tell, ListZipper
actually does work with infinite sequences. I have, at least, written a few tests with infinite sequences, and they pass. (I may still have missed an edge case or two. I can't rule that out.)
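To illustrate why lazily evaluated sequences make this plausible, here's a minimal sketch. It isn't part of the article's code base: InfiniteSketch, Naturals, and Step are hypothetical names, and Step only mimics the shape of a forward move (take the first element, keep the rest) on plain IEnumerable<int> values.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration; not the article's ListZipper<T>.
public static class InfiniteSketch
{
    // An infinite sequence: 0, 1, 2, ...
    public static IEnumerable<int> Naturals()
    {
        var i = 0;
        while (true)
            yield return i++;
    }

    // Mimics a forward move. Take, Skip, and Concat are deferred, so
    // nothing enumerates the infinite source until a caller asks for
    // actual elements.
    public static (IEnumerable<int> Values, IEnumerable<int> Breadcrumbs) Step(
        IEnumerable<int> values,
        IEnumerable<int> breadcrumbs)
    {
        var head = values.Take(1);
        var tail = values.Skip(1);
        return (tail, head.Concat(breadcrumbs));
    }
}
```

Applying Step twice to Naturals() with empty breadcrumbs leaves a Values sequence that starts 2, 3, 4 and a finite Breadcrumbs sequence of 1, 0. As long as client code only takes a finite prefix of Values, enumeration terminates.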
Movement #
It's not much fun just being able to initialize an object. You also want to be able to do something with it, such as moving forward:

public ListZipper<T>? GoForward()
{
    var head = values.Take(1);
    if (!head.Any())
        return null;

    var tail = values.Skip(1);
    return new ListZipper<T>(tail, head.Concat(Breadcrumbs));
}

You can move forward through any IEnumerable, so why make things so complicated? The benefit of this GoForward method (function, really) is that it records where it came from, which means that moving backwards becomes an option:

public ListZipper<T>? GoBack()
{
    var head = Breadcrumbs.Take(1);
    if (!head.Any())
        return null;

    var tail = Breadcrumbs.Skip(1);
    return new ListZipper<T>(head.Concat(values), tail);
}

This test may serve as an example of client code that makes use of those two operations:

[Fact]
public void GoBack1()
{
    var sut = new ListZipper<int>(1, 2, 3, 4);

    var actual = sut.GoForward()?.GoForward()?.GoForward()?.GoBack();

    Assert.Equal([3, 4], actual);
    Assert.Equal([2, 1], actual?.Breadcrumbs);
}

Going forward takes the first element off values and adds it to the front of Breadcrumbs. Going backwards is nearly symmetrical: It takes the first element off the Breadcrumbs and adds it back to the front of the values. Used in this way, Breadcrumbs works as a stack.
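The stack-like behaviour can be made explicit. The following sketch is a variation, not the implementation shown in this article: it assumes System.Collections.Immutable and swaps both IEnumerable<T> values for ImmutableStack<T>, which means that it gives up support for infinite sequences. StackZipper<T> is a hypothetical name.

```csharp
#nullable enable
using System.Collections.Immutable;

// Hypothetical variation on the idea; not the article's ListZipper<T>.
public sealed record StackZipper<T>(
    ImmutableStack<T> Values,
    ImmutableStack<T> Breadcrumbs)
{
    public StackZipper<T>? GoForward()
    {
        if (Values.IsEmpty)
            return null;

        // Pop the focus off Values and push it onto Breadcrumbs.
        var head = Values.Peek();
        return new StackZipper<T>(Values.Pop(), Breadcrumbs.Push(head));
    }

    public StackZipper<T>? GoBack()
    {
        if (Breadcrumbs.IsEmpty)
            return null;

        // The mirror image: pop off Breadcrumbs, push back onto Values.
        var head = Breadcrumbs.Peek();
        return new StackZipper<T>(Values.Push(head), Breadcrumbs.Pop());
    }
}
```

Keep in mind that ImmutableStack.CreateRange pushes items in order, so the last item ends up on top. To start with 1 in focus, you'd create the stack from 4, 3, 2, 1. Three forward moves followed by one move back then leave Values as 3, 4 and Breadcrumbs as 2, 1, mirroring the GoBack1 test.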
Notice that both GoForward and GoBack admit the possibility of failure. If values is empty, you can't go forward. If Breadcrumbs is empty, you can't go back. In both cases, the functions return null, which is also indicated by the ListZipper<T>? return types; notice the question mark, which indicates that the value may be null. If you're working in a context or language where that feature isn't available, you may instead consider taking advantage of the Maybe monad (which is also what you'd idiomatically do in Haskell).
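For readers who haven't seen that alternative, here's a minimal sketch of what such a type could look like in C#. This Maybe<T> is a hypothetical, stripped-down illustration, and not necessarily the shape of any Maybe type from other articles or libraries.

```csharp
#nullable enable
using System;

// Hypothetical minimal Maybe type, for illustration only.
public sealed class Maybe<T>
{
    private readonly bool hasValue;
    private readonly T value;

    // The empty case.
    public Maybe()
    {
        hasValue = false;
        value = default!;
    }

    // The populated case.
    public Maybe(T value)
    {
        hasValue = true;
        this.value = value;
    }

    // Monadic bind: only runs f if a value is present.
    public Maybe<TResult> SelectMany<TResult>(Func<T, Maybe<TResult>> f)
    {
        return hasValue ? f(value) : new Maybe<TResult>();
    }

    // Collapses both cases to a single result.
    public TResult Match<TResult>(TResult nothing, Func<T, TResult> just)
    {
        return hasValue ? just(value) : nothing;
    }
}
```

With a type like that, a chain of operations that may fail composes with SelectMany much as the ?. operator composes nullable return values: the first empty result short-circuits the rest of the chain.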
To be clear, the Zippers article does discuss handling failures using Maybe, but only applies it to its binary tree example. Thus, the error handling shown here is my own addition.
Modifications #
In addition to moving back and forth in the list, we can also modify it. The following operations are also not in the Zippers article, but are rather my own contributions. Adding a new element is easy:

public ListZipper<T> Insert(T value)
{
    return new ListZipper<T>(values.Prepend(value), Breadcrumbs);
}

Notice that this operation is always possible. Even if the list is empty, we can Insert a value. In that case, it just becomes the list's first and only element.
A simple test demonstrates usage:

[Fact]
public void InsertAtFocus()
{
    var sut = new ListZipper<string>("foo", "bar");

    var actual = sut.GoForward()?.Insert("ploeh").GoBack();

    Assert.NotNull(actual);
    Assert.Equal(["foo", "ploeh", "bar"], actual);
    Assert.Empty(actual.Breadcrumbs);
}

Likewise, we may attempt to remove an element from the list:

public ListZipper<T>? Remove()
{
    if (!values.Any())
        return null;

    return new ListZipper<T>(values.Skip(1), Breadcrumbs);
}

Contrary to Insert, the Remove operation will fail if values is empty. Notice that this doesn't necessarily imply that the list as such is empty, but only that the focus is at the end of the list (which, of course, never happens if values is infinite):

[Fact]
public void RemoveAtEnd()
{
    var sut = new ListZipper<string>("foo", "bar").GoForward()?.GoForward();

    var actual = sut?.Remove();

    Assert.Null(actual);
    Assert.NotNull(sut);
    Assert.Empty(sut);
    Assert.Equal(["bar", "foo"], sut.Breadcrumbs);
}

In this example, the focus is at the end of the list, so there's nothing to remove. The list, however, is not empty, but all the data currently reside in the Breadcrumbs.
Finally, we can combine insertion and removal to implement a replacement operation:

public ListZipper<T>? Replace(T newValue)
{
    return Remove()?.Insert(newValue);
}

As the name implies, this operation replaces the value currently in focus with a completely different value. Here's an example:

[Fact]
public void ReplaceAtFocus()
{
    var sut = new ListZipper<string>("foo", "bar", "baz");

    var actual = sut.GoForward()?.Replace("qux")?.GoBack();

    Assert.NotNull(actual);
    Assert.Equal(["foo", "qux", "baz"], actual);
    Assert.Empty(actual.Breadcrumbs);
}

Once more, this may fail if the current focus is empty, so Replace also returns a nullable value.
Conclusion #
For a C# developer, the ListZipper<T> class looks odd. Why would you ever want to use this data structure? Why not just use List<T>?
As I hope I've made clear in the introduction article, I can't, indeed, think of a good reason.
I've gone through this exercise to hone my skills, and to prepare myself for the more intimidating exercise it is to implement a binary tree Zipper.
Next: A Binary Tree Zipper in C#.
Zippers
Some functional programming examples ported to C#, just because.
Many algorithms rely on data structures that enable the implementation to move in more than one way. A simple example is a doubly-linked list, where an algorithm can move both forward and backward from a given element. Other examples are various tree-based algorithms, such as red-black trees where certain operations trigger reorganization of the tree. Yet other data structures, such as Fibonacci heaps, combine doubly-linked lists with trees that allow navigation in more than one direction.
In an imperative programming language, you can easily implement such data structures, as long as the language allows data mutation. Here's a simple example:

var node1 = new Node<string>("foo");
var node2 = new Node<string>("bar") { Previous = node1 };
node1.Next = node2;

It's possible to double-link node1 to node2 by first creating node1. At that point, node2 still doesn't exist, so you can't yet assign node1.Next, but once you've initialized node2, you can mutate the state of node1 by changing its Next property.
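The Node<T> class itself isn't shown here. A minimal sketch that would make the above snippet compile could look like the following; the member names Previous and Next come from the snippet, while the Value property and everything else are assumptions.

```csharp
#nullable enable

// Hypothetical minimal node type; an assumption, since the article
// doesn't show the definition.
public sealed class Node<T>
{
    public Node(T value)
    {
        Value = value;
    }

    public T Value { get; }

    // Settable links. Mutation is what makes the double link possible:
    // node1.Next can only be assigned after node2 has been created.
    public Node<T>? Previous { get; set; }

    public Node<T>? Next { get; set; }
}
```

If both properties were read-only and had to be supplied via the constructor, each node would need the other one to exist first, which is exactly the problem that immutable data structures run into.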
When data structures are immutable (as they must be in functional programming) this is no longer possible. How may you get around that limitation?
Alternatives #
Some languages get around this problem in various ways. Haskell, because of its lazy evaluation, enables a technique called tying the knot that, frankly, makes my head hurt.
Even though I write a decent amount of Haskell code, that's not something that I make use of. Usually, it turns out, you can solve most problems by thinking about them differently. By choosing another perspective, and another data structure, you can often arrive at a good, functional solution to your problem.
One family of general-purpose data structures is called Zippers. The general idea is that the data structure has a natural 'focus' (e.g. the head of a list), but it also keeps a record of 'breadcrumbs', that is, where the caller has previously been. This enables client code to 'go back' or 'go up', if the natural direction is to 'go forward' or 'go down'. It's a bit like Event Sourcing, in that every operation leaves a log entry that can later be used to reconstruct what happened. Repeatable Execution also comes to mind, although it's not quite the same.
For an introduction to Zippers, I recommend the excellent and highly readable article Zippers. In this article series, I'm going to assume that you're familiar with the contents of that article.
C# ports #
While I may add more articles to this series in the future, as I'm writing this, I have nothing more planned than writing about how it's possible to implement the article's three Zippers in C#.
Why would you want to do this?
To be honest, for production code, I can't think of a good reason. I did it for a few reasons, most of them didactic. Additionally, writing code for exercise helps you improve. If you know enough Haskell to understand what's going on in the Zippers article, you may consider porting some of it to your favourite language, as an exercise.
It may help you grok functional programming.
That's really it, though. There's no reason to use Zippers in a language like C#, which idiomatically makes use of mutation. If you want a doubly-linked list, you can just write code as shown in the beginning of this article.
If you're interested in an F# perspective on Zippers, Tomas Petricek has a cool article: Processing trees with F# zipper computation.
Conclusion #
Zippers constitute a family of data structures that enables you to move in multiple directions. Left and right in a list. Up or down in a tree. For an imperative programmer, that's literally just another day at the office, but in disciplined functional programming, making cyclic graphs can be surprisingly tricky.
Even in functional programming, I rarely reach for a Zipper, since I can often find a library with a higher level of abstraction that does what I need it to do. Still, learning of new ways to solve problems never seems a waste to me.
In the next three articles, I'll go through the examples from the Zipper article and show how I ported them to C#. While that article starts with a binary tree, I'll instead begin with the doubly-linked list, since it's the simplest of the three.
Next: A List Zipper in C#.
Using only a Domain Model to persist restaurant table configurations
A data architecture example in C# and ASP.NET.
This is part of a small article series on data architectures. In this, the third instalment, you'll see an alternative way of modelling data in a server-based application. One that doesn't rely on statically typed classes to model data. As the introductory article explains, the example code shows how to create a new restaurant table configuration, or how to display an existing resource. The sample code base is an ASP.NET 8.0 REST API.
Keep in mind that while the sample code does store data in a relational database, the term table in this article mainly refers to physical tables, rather than database tables.
The idea is to use 'raw' serialization APIs to handle communication with external systems. For the presentation layer, the example even moves representation concerns to middleware, so that it's nicely abstracted away from the application layer.
An architecture diagram like this attempts to capture the design:
Here, the arrows indicate mappings, not dependencies.
Like in the DTO-based Ports and Adapters architecture, the goal is to be able to design Domain Models unconstrained by serialization concerns, but also to be able to format external data unconstrained by Reflection-based serializers. Thus, while this architecture is centred on a Domain Model, there are no Data Transfer Objects (DTOs) to represent JSON, XML, or database rows.
HTTP interaction #
To establish the context of the application, here's how HTTP interactions may play out. The following is a copy of the identically named section in the article Using Ports and Adapters to persist restaurant table configurations, repeated here for your convenience.
A client can create a new table with a POST HTTP request:

POST /tables HTTP/1.1
content-type: application/json

{ "communalTable": { "capacity": 16 } }

Which might elicit a response like this:

HTTP/1.1 201 Created
Location: https://example.com/Tables/844581613e164813aa17243ff8b847af

Clients can later use the address indicated by the Location header to retrieve a representation of the resource:

GET /Tables/844581613e164813aa17243ff8b847af HTTP/1.1
accept: application/json

Which would result in this response:

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8

{"communalTable":{"capacity":16}}

By default, ASP.NET handles and returns JSON. Later in this article you'll see how well it deals with other data formats.
Boundary #
ASP.NET supports some variation of the model-view-controller (MVC) pattern, and Controllers handle HTTP requests. At the outset, the action method that handles the POST request looks like this:

[HttpPost]
public async Task<IActionResult> Post(Table table)
{
    var id = Guid.NewGuid();
    await repository.Create(id, table).ConfigureAwait(false);

    return new CreatedAtActionResult(
        nameof(Get),
        null,
        new { id = id.ToString("N") },
        null);
}

While this looks identical to the Post method for the Shared Data Model architecture, it's not, because it's not the same Table class. Not by a long shot. The Table class in use here is the one originally introduced in the article Serializing restaurant tables in C#, with a few inconsequential differences.
How does a Controller action method receive an input parameter directly in the form of a Domain Model, keeping in mind that this particular Domain Model is far from serialization-friendly? The short answer is middleware, which we'll get to in a moment. Before we look at that, however, let's also look at the Get method that supports HTTP GET requests:

[HttpGet("{id}")]
public async Task<IActionResult> Get(string id)
{
    if (!Guid.TryParseExact(id, "N", out var guid))
        return new BadRequestResult();

    Table? table = await repository.Read(guid).ConfigureAwait(false);
    if (table is null)
        return new NotFoundResult();
    return new OkObjectResult(table);
}

This, too, looks exactly like the Shared Data Model architecture, again with the crucial difference that the Table class is completely different. The Get method just takes the table object, wraps it in an OkObjectResult, and returns it.
The Table class is, in reality, extraordinarily opaque, and not at all friendly to serialization, so how does the service turn it into JSON?
JSON middleware #
Most web frameworks come with extensibility points where you can add middleware. A common need is to be able to add custom serializers. In ASP.NET they're called formatters, and can be added at application startup:

builder.Services.AddControllers(opts =>
{
    opts.InputFormatters.Insert(0, new TableJsonInputFormatter());
    opts.OutputFormatters.Insert(0, new TableJsonOutputFormatter());
});

As the names imply, TableJsonInputFormatter deserializes JSON input, while TableJsonOutputFormatter serializes strongly typed objects to JSON.
We'll look at each in turn, starting with TableJsonInputFormatter, which is responsible for deserializing JSON documents into Table objects, as used by, for example, the Post method.
JSON input formatter #
You create an input formatter by implementing the IInputFormatter interface, although in this example code base, inheriting from TextInputFormatter is enough:
internal sealed class TableJsonInputFormatter : TextInputFormatter
You can use the constructor to define which media types and encodings the formatter will support:

public TableJsonInputFormatter()
{
    SupportedMediaTypes.Add(MediaTypeHeaderValue.Parse("application/json"));
    SupportedEncodings.Add(Encoding.UTF8);
    SupportedEncodings.Add(Encoding.Unicode);
}

You'll also need to tell the formatter which .NET type it supports:

protected override bool CanReadType(Type type)
{
    return type == typeof(Table);
}

As far as I can tell, the ASP.NET framework will first determine which action method (that is, which Controller, and which method on that Controller) should handle a given HTTP request. For a POST request, as shown above, it'll determine that the appropriate action method is the Post method.
Since the Post method takes a Table object as input, the framework then goes through the registered formatters and asks them whether they can read from an HTTP request into that type. In this case, the TableJsonInputFormatter answers true only if the type is Table.
When CanReadType answers true, the framework then invokes a method to turn the HTTP request into an object:

public override async Task<InputFormatterResult> ReadRequestBodyAsync(
    InputFormatterContext context,
    Encoding encoding)
{
    using var rdr = new StreamReader(context.HttpContext.Request.Body, encoding);
    var json = await rdr.ReadToEndAsync().ConfigureAwait(false);

    var table = TableJson.Deserialize(json);
    if (table is { })
        return await InputFormatterResult.SuccessAsync(table).ConfigureAwait(false);
    else
        return await InputFormatterResult.FailureAsync().ConfigureAwait(false);
}

The ReadRequestBodyAsync method reads the HTTP request body into a string value called json, and then passes the value to TableJson.Deserialize. You can see the implementation of the Deserialize method in the article Serializing restaurant tables in C#. In short, it uses the default .NET JSON parser to probe a document object model. If it can turn the JSON document into a Table value, it does that. Otherwise, it returns null.
The above ReadRequestBodyAsync method then checks if the return value from TableJson.Deserialize is null. If it's not, it wraps the result in a value that indicates success. If it's null, it uses FailureAsync to indicate a deserialization failure.
With this input formatter in place as middleware, any action method that takes a Table parameter will automatically receive a deserialized JSON object, if possible.
JSON output formatter #
The TableJsonOutputFormatter class works much in the same way, but instead derives from the TextOutputFormatter base class:

internal sealed class TableJsonOutputFormatter : TextOutputFormatter

The constructor looks just like that of TableJsonInputFormatter, and instead of a CanReadType method, it has a CanWriteType method that also looks identical.
The WriteResponseBodyAsync method serializes a Table object to JSON:

public override Task WriteResponseBodyAsync(
    OutputFormatterWriteContext context,
    Encoding selectedEncoding)
{
    if (context.Object is Table table)
        return context.HttpContext.Response.WriteAsync(table.Serialize(), selectedEncoding);

    throw new InvalidOperationException("Expected a Table object.");
}

If context.Object is, in fact, a Table object, the method calls table.Serialize(), which you can also see in the article Serializing restaurant tables in C#. In short, it pattern-matches on the two possible kinds of tables and builds an appropriate abstract syntax tree or document object model that it then serializes to JSON.
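To give an impression of what building such a document object model may involve, here's a rough sketch that assumes the System.Text.Json.Nodes API. It's not the Serialize implementation from the linked article: TableJsonSketch and its two methods are hypothetical stand-ins for the two cases, they take plain integers rather than the Domain Model's types, and the singleTable and minimalReservation property names are assumptions.

```csharp
using System.Text.Json.Nodes;

// Hypothetical helpers; one method per kind of table.
public static class TableJsonSketch
{
    public static string SerializeCommunal(int capacity)
    {
        // Build the DOM explicitly instead of relying on a
        // Reflection-based serializer.
        var dom = new JsonObject
        {
            ["communalTable"] = new JsonObject { ["capacity"] = capacity }
        };
        return dom.ToJsonString();
    }

    public static string SerializeSingle(int capacity, int minimalReservation)
    {
        // Property names here are assumptions, not taken from the article.
        var dom = new JsonObject
        {
            ["singleTable"] = new JsonObject
            {
                ["capacity"] = capacity,
                ["minimalReservation"] = minimalReservation
            }
        };
        return dom.ToJsonString();
    }
}
```

Calling TableJsonSketch.SerializeCommunal(16) produces {"communalTable":{"capacity":16}}, which matches the response body shown in the HTTP interaction section. Since the code names every property explicitly, renaming a class or a member elsewhere can't silently change the wire format.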
Data access #
While the application stores data in SQL Server, it uses no object-relational mapper (ORM). Instead, it simply uses ADO.NET, as also outlined in the article Do ORMs reduce the need for mapping?
At first glance, the Create method looks simple:

public async Task Create(Guid id, Table table)
{
    using var conn = new SqlConnection(connectionString);
    using var cmd = table.Accept(new SqlInsertCommandVisitor(id));
    cmd.Connection = conn;
    await conn.OpenAsync().ConfigureAwait(false);
    await cmd.ExecuteNonQueryAsync().ConfigureAwait(false);
}

The main work, however, is done by the nested SqlInsertCommandVisitor class:

private sealed class SqlInsertCommandVisitor(Guid id) : ITableVisitor<SqlCommand>
{
    public SqlCommand VisitCommunal(NaturalNumber capacity)
    {
        const string createCommunalSql = @"
            INSERT INTO [dbo].[Tables] ([PublicId], [Capacity])
            VALUES (@PublicId, @Capacity)";
        var cmd = new SqlCommand(createCommunalSql);
        cmd.Parameters.AddWithValue("@PublicId", id);
        cmd.Parameters.AddWithValue("@Capacity", (int)capacity);
        return cmd;
    }

    public SqlCommand VisitSingle(NaturalNumber capacity, NaturalNumber minimalReservation)
    {
        const string createSingleSql = @"
            INSERT INTO [dbo].[Tables] ([PublicId], [Capacity], [MinimalReservation])
            VALUES (@PublicId, @Capacity, @MinimalReservation)";
        var cmd = new SqlCommand(createSingleSql);
        cmd.Parameters.AddWithValue("@PublicId", id);
        cmd.Parameters.AddWithValue("@Capacity", (int)capacity);
        cmd.Parameters.AddWithValue("@MinimalReservation", (int)minimalReservation);
        return cmd;
    }
}

It 'pattern-matches' on the two possible kinds of table and returns an appropriate SqlCommand that the Create method then executes. Notice that no 'Entity' class is needed. The code works straight on SqlCommand.
The same is true for the repository's Read method:

public async Task<Table?> Read(Guid id)
{
    const string readByIdSql = @"
        SELECT [Capacity], [MinimalReservation]
        FROM [dbo].[Tables]
        WHERE [PublicId] = @id";
    using var conn = new SqlConnection(connectionString);
    using var cmd = new SqlCommand(readByIdSql, conn);
    cmd.Parameters.AddWithValue("@id", id);

    await conn.OpenAsync().ConfigureAwait(false);
    using var rdr = await cmd.ExecuteReaderAsync().ConfigureAwait(false);

    if (!await rdr.ReadAsync().ConfigureAwait(false))
        return null;

    var capacity = (int)rdr["Capacity"];
    var minimalReservation = rdr["MinimalReservation"] as int?;
    if (minimalReservation is null)
        return Table.TryCreateCommunal(capacity);
    else
        return Table.TryCreateSingle(capacity, minimalReservation.Value);
}

It works directly on SqlDataReader. Again, no extra 'Entity' class is required. If the data in the database makes sense, the Read method returns a well-encapsulated Table object.
XML formats #
That covers the basics, but how well does this kind of architecture stand up to changing requirements?
One axis of variation is when a service needs to support multiple representations. In this example, I'll imagine that the service also needs to support not just one, but two, XML formats.
Granted, you may not run into that particular requirement that often, but it's typical of a kind of change that you're likely to run into. In REST APIs, for example, you should use content negotiation for versioning, and that's the same kind of problem.
To be fair, application code also changes for a variety of other reasons, including new features, changes to business logic, etc. I can't possibly cover all, though, and many of these are much better described than changes in wire formats.
As described in the introduction article, ideally the XML should support a format implied by these examples:

<communal-table>
  <capacity>12</capacity>
</communal-table>

<single-table>
  <capacity>4</capacity>
  <minimal-reservation>3</minimal-reservation>
</single-table>

Notice that while these two examples have different root elements, they're still considered to both represent a table. Although, at the boundaries, static types are illusory, we may still, loosely speaking, consider both of those XML documents as belonging to the same 'type'.
With both of the previous architectures described in this article series, I've had to give up on this schema. The present data architecture, finally, is able to handle this requirement.
HTTP interactions with element-biased XML #
The service should support the new XML format when presented with the "application/xml" media type, either as a content-type header or accept header. An initial POST request may look like this:

POST /tables HTTP/1.1
content-type: application/xml

<communal-table><capacity>12</capacity></communal-table>

Which produces a reply like this:

HTTP/1.1 201 Created
Location: https://example.com/Tables/a77ac3fd221e4a5caaca3a0fc2b83ffc

And just like before, a client can later use the address in the Location header to request the resource. By using the accept header, it can indicate that it wishes to receive the reply formatted as XML:

GET /Tables/a77ac3fd221e4a5caaca3a0fc2b83ffc HTTP/1.1
accept: application/xml

Which produces this response with XML content in the body:

HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8

<communal-table><capacity>12</capacity></communal-table>

How do you add support for this new format?
Element-biased XML formatters #
Not surprisingly, you can add support for the new format by adding new formatters:

opts.InputFormatters.Add(new ElementBiasedTableXmlInputFormatter());
opts.OutputFormatters.Add(new ElementBiasedTableXmlOutputFormatter());

Importantly, and in stark contrast to the DTO-based Ports and Adapters example, you don't have to change the existing code to add XML support. If you're concerned about design heuristics such as the Single Responsibility Principle, you may consider this a win. Apart from the two lines of code adding the formatters, all other code to support this new feature is in new classes.
Both of the new formatters support the "application/xml" media type.
Deserializing element-biased XML #
The constructor and CanReadType implementation of ElementBiasedTableXmlInputFormatter are nearly identical to code you've already seen here, so I'll skip the repetition. The ReadRequestBodyAsync implementation is also conceptually similar, but of course differs in the details.

public override async Task<InputFormatterResult> ReadRequestBodyAsync(
    InputFormatterContext context,
    Encoding encoding)
{
    var xml = await XElement
        .LoadAsync(context.HttpContext.Request.Body, LoadOptions.None, CancellationToken.None)
        .ConfigureAwait(false);

    var table = TableXml.TryParseElementBiased(xml);
    if (table is { })
        return await InputFormatterResult.SuccessAsync(table).ConfigureAwait(false);
    else
        return await InputFormatterResult.FailureAsync().ConfigureAwait(false);
}

As is also the case with the JSON input formatter, the ReadRequestBodyAsync method really only implements an Adapter over a more specialized parser function:

internal static Table? TryParseElementBiased(XElement xml)
{
    if (xml.Name == "communal-table")
    {
        var capacity = xml.Element("capacity")?.Value;
        if (capacity is { })
        {
            if (int.TryParse(capacity, out var c))
                return Table.TryCreateCommunal(c);
        }
    }

    if (xml.Name == "single-table")
    {
        var capacity = xml.Element("capacity")?.Value;
        var minimalReservation = xml.Element("minimal-reservation")?.Value;
        if (capacity is { } && minimalReservation is { })
        {
            if (int.TryParse(capacity, out var c) &&
                int.TryParse(minimalReservation, out var mr))
                return Table.TryCreateSingle(c, mr);
        }
    }

    return null;
}

In keeping with the common theme of the Domain Model Only data architecture, it deserializes by examining an Abstract Syntax Tree (AST) or document object model (DOM), specifically making use of the XElement API. This class is really part of the LINQ to XML API, but you'll probably agree that the above code example makes little use of LINQ.
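If you'd like to experiment with this style of probing outside of ASP.NET, the following self-contained sketch may serve as a starting point. XmlProbeSketch and TryReadCommunalCapacity are hypothetical names, and the function returns a plain int? rather than a Table. Unlike the parser above, it starts from a string, so it also has to deal with the XmlException that XElement.Parse throws when the input isn't well-formed XML.

```csharp
#nullable enable
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper for experimentation; not part of the article's code.
public static class XmlProbeSketch
{
    // Returns the capacity of an element-biased communal table, or null
    // if the input doesn't have the expected shape.
    public static int? TryReadCommunalCapacity(string xml)
    {
        XElement root;
        try
        {
            root = XElement.Parse(xml);
        }
        catch (XmlException)
        {
            // Malformed XML is treated like any other unparsable input.
            return null;
        }

        if (root.Name != "communal-table")
            return null;

        // Element returns null if there's no such child element, and
        // int.TryParse returns false when given null.
        var capacity = root.Element("capacity")?.Value;
        if (int.TryParse(capacity, out var c))
            return c;

        return null;
    }
}
```

Given the document from the POST example, the function returns 12. It returns null for a document with another root element, and for input that isn't XML at all.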
Serializing element-biased XML #
Hardly surprising, turning a Table object into element-biased XML involves steps similar to converting it to JSON. The ElementBiasedTableXmlOutputFormatter class' WriteResponseBodyAsync method contains this implementation:

public override Task WriteResponseBodyAsync(
    OutputFormatterWriteContext context,
    Encoding selectedEncoding)
{
    if (context.Object is Table table)
        return context.HttpContext.Response.WriteAsync(
            table.GenerateElementBiasedXml(),
            selectedEncoding);

    throw new InvalidOperationException("Expected a Table object.");
}

Again, the heavy lifting is done by a specialized function:

internal static string GenerateElementBiasedXml(this Table table)
{
    return table.Accept(new ElementBiasedTableVisitor());
}

private sealed class ElementBiasedTableVisitor : ITableVisitor<string>
{
    public string VisitCommunal(NaturalNumber capacity)
    {
        var xml = new XElement(
            "communal-table",
            new XElement("capacity", (int)capacity));
        return xml.ToString(SaveOptions.DisableFormatting);
    }

    public string VisitSingle(
        NaturalNumber capacity,
        NaturalNumber minimalReservation)
    {
        var xml = new XElement(
            "single-table",
            new XElement("capacity", (int)capacity),
            new XElement("minimal-reservation", (int)minimalReservation));
        return xml.ToString(SaveOptions.DisableFormatting);
    }
}

True to form, GenerateElementBiasedXml assembles an appropriate AST for the kind of table in question, and finally converts it to a string value.
Attribute-biased XML #
I was curious how far I could take this kind of variation, so for the sake of exploration, I invented yet another XML format to support. Instead of making exclusive use of XML elements, this format uses XML attributes for primitive values.

<communal-table capacity="12" />

<single-table capacity="4" minimal-reservation="3" />

In order to distinguish this XML format from the other, I invented the vendor media type "application/vnd.ploeh.table+xml". The new formatters only handle this media type.
There's not much new to report. The new formatters work like the previous. In order to parse the new format, a new function does that, still based on XElement:

internal static Table? TryParseAttributeBiased(XElement xml)
{
    if (xml.Name == "communal-table")
    {
        var capacity = xml.Attribute("capacity")?.Value;
        if (capacity is { })
        {
            if (int.TryParse(capacity, out var c))
                return Table.TryCreateCommunal(c);
        }
    }

    if (xml.Name == "single-table")
    {
        var capacity = xml.Attribute("capacity")?.Value;
        var minimalReservation = xml.Attribute("minimal-reservation")?.Value;
        if (capacity is { } && minimalReservation is { })
        {
            if (int.TryParse(capacity, out var c) &&
                int.TryParse(minimalReservation, out var mr))
                return Table.TryCreateSingle(c, mr);
        }
    }

    return null;
}

Likewise, converting a Table object to this format looks like code you've already seen:

internal static string GenerateAttributeBiasedXml(this Table table)
{
    return table.Accept(new AttributedBiasedTableVisitor());
}

private sealed class AttributedBiasedTableVisitor : ITableVisitor<string>
{
    public string VisitCommunal(NaturalNumber capacity)
    {
        var xml = new XElement(
            "communal-table",
            new XAttribute("capacity", (int)capacity));
        return xml.ToString(SaveOptions.DisableFormatting);
    }

    public string VisitSingle(
        NaturalNumber capacity,
        NaturalNumber minimalReservation)
    {
        var xml = new XElement(
            "single-table",
            new XAttribute("capacity", (int)capacity),
            new XAttribute("minimal-reservation", (int)minimalReservation));
        return xml.ToString(SaveOptions.DisableFormatting);
    }
}

Consistent with adding the first XML support, I didn't have to touch any of the existing Controller or data access code.
Evaluation #
If you're concerned with separation of concerns, the Domain Model Only architecture gracefully handles variation in external formats without impacting application logic, Domain Model, or data access. You deal with each new format in a consistent and independent manner. The architecture offers the ultimate data representation flexibility, since everything you can write as a stream of bytes you can implement.
Since, at the boundary, static types are illusory, this architecture is congruent with reality. For a REST service, at least, reality is what goes on the wire. While static types can also be used to model what wire formats look like, there's always a risk that you can use your IDE's refactoring tools to change a DTO in such a way that the code still compiles, but you've now changed the wire format. This could easily break existing clients.
When wire compatibility is important, I test-drive enough self-hosted tests that directly use and verify the wire format to give me a good sense of stability. Without DTO classes, it becomes increasingly important to cover externally visible behaviour with a trustworthy test suite, but really, if compatibility is important, you should be doing that anyway.
It almost goes without saying that a requirement for this architecture is that your chosen web framework supports it. As you've seen here, ASP.NET does, but that's not a given in general. Most web frameworks worth their salt will come with mechanisms that enable you to add new wire formats, but the question is how opinionated such extensibility points are. Do they expect you to work with DTOs, or are they more flexible than that?
You may consider the pure Domain Model Only data architecture too specialized for everyday use. I may do that, too. As I wrote in the introduction article, I don't intend these walk-throughs to be prescriptive. Rather, they explore what's possible, so that you and I have a bigger set of alternatives to choose from.
Hybrid architectures #
In the code base that accompanies Code That Fits in Your Head, I use a hybrid data architecture that I've used for years. ADO.NET for data access, as shown here, but DTOs for external JSON serialization. As demonstrated in the article Using Ports and Adapters to persist restaurant table configurations, using DTOs for the presentation layer may cause trouble if you need to support multiple wire formats. On the other hand, if you don't expect that this is a concern, you may decide to run that risk. I often do that.
When presenting these three architectures to a larger audience, one audience member told me that his team used another hybrid architecture: DTOs for the presentation layer, and separate DTOs for data access, but no Domain Model. I can see how this makes sense in a mostly CRUD-heavy application where nonetheless you need to be able to vary user interfaces independently from the database schema.
Finally, I should point out that the Domain Model Only data architecture is, in reality, also a kind of Ports and Adapters architecture. It just uses more low-level Adapter implementations than you idiomatically see.
Conclusion #
The Domain Model Only data architecture emphasises modelling business logic as a strongly-typed, well-encapsulated Domain Model, while eschewing using statically-typed DTOs for communication with external processes. What I most like about this alternative is that it leaves little confusion as to where functionality goes.
When you have, say, TableDto, Table, and TableEntity classes, you need a sophisticated and mature team to trust all developers to add functionality in the right place. If there's only a single Table Domain Model, it may be more evident to developers that only business logic belongs there, and other concerns ought to be addressed in different ways.
Even so, you may consider all the low-level parsing code not to your liking, and instead decide to use DTOs. I may too, depending on context.
Using a Shared Data Model to persist restaurant table configurations
A data architecture example in C# and ASP.NET.
This is part of a small article series on data architectures. In this, the second instalment, you'll see a common attempt at addressing the mapping issue that I mentioned in the previous article. As the introductory article explains, the example code shows how to create a new restaurant table configuration, or how to display an existing resource. The sample code base is an ASP.NET 8.0 REST API.
Keep in mind that while the sample code does store data in a relational database, the term table in this article mainly refers to physical tables, rather than database tables.
The idea in this data architecture is to use a single, shared data model for each business object in the service. This is in contrast to the Ports and Adapters architecture, where you typically have a Data Transfer Object (DTO) for (JSON or XML) serialization, another class for the Domain Model, and a third to support an object-relational mapper.
An architecture diagram may attempt to illustrate the idea like this:
While ostensibly keeping alive the idea of application layers, data models are allowed to cross layers to be used for database persistence, for business logic, and in the presentation layer.
Data model #
Since the goal is to use a single class to model all application concerns, it means that we also need to use it for database persistence. The most commonly used ORM in .NET is Entity Framework, so I'll use that for the example. It's not something that I normally do, so it's possible that I could have done it better than what follows.
Still, assume that the database schema defines the Tables table like this:
CREATE TABLE [dbo].[Tables] (
    [Id] INT NOT NULL IDENTITY PRIMARY KEY,
    [PublicId] UNIQUEIDENTIFIER NOT NULL UNIQUE,
    [Capacity] INT NOT NULL,
    [MinimalReservation] INT NULL
)
I used a scaffolding tool to generate Entity Framework code from the database schema and then modified what it had created. This is the result:
public partial class Table
{
    [JsonIgnore]
    public int Id { get; set; }

    [JsonIgnore]
    public Guid PublicId { get; set; }

    public string Type => MinimalReservation.HasValue ? "single" : "communal";

    public int Capacity { get; set; }

    public int? MinimalReservation { get; set; }
}
Notice that I added [JsonIgnore] attributes to two of the properties, since I didn't want to serialize them to JSON. I also added the calculated property Type to include a discriminator in the JSON documents.
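To make the effect of those attributes concrete, here is a minimal sketch that serializes the class with System.Text.Json outside of ASP.NET. The `TableJson` helper and its serializer options (web defaults for camelCase names, and omission of null values) are assumptions made for illustration; the article doesn't show how the service is configured.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Copy of the shared data model shown above.
public partial class Table
{
    [JsonIgnore]
    public int Id { get; set; }

    [JsonIgnore]
    public Guid PublicId { get; set; }

    public string Type => MinimalReservation.HasValue ? "single" : "communal";

    public int Capacity { get; set; }

    public int? MinimalReservation { get; set; }
}

// Hypothetical helper, not part of the article's code base.
public static class TableJson
{
    // Assumed options: camelCase property names, and null values left out.
    private static readonly JsonSerializerOptions options =
        new JsonSerializerOptions(JsonSerializerDefaults.Web)
        {
            DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull
        };

    public static string Serialize(Table table) =>
        JsonSerializer.Serialize(table, options);
}
```

Under those assumptions, a communal table with a capacity of 12 should serialize to a document with only the type and capacity properties; Id and PublicId never reach the wire.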
HTTP interaction #
A client can create a new table with a POST HTTP request:
POST /tables HTTP/1.1 content-type: application/json {"type":"communal","capacity":12}
Notice that the JSON document doesn't follow the desired schema described in the introduction article. It can't, because the data architecture is bound to the shared Table class. Or at least, if it's possible to attain the desired format with a single class and only some strategically placed attributes, I'm not aware of it. As the article Using only a Domain Model to persist restaurant table configurations will show, it is possible to attain that goal with the appropriate middleware, but I consider doing that to be an example of the third architecture, so not something I will cover in this article.
The service will respond to the above request like this:
HTTP/1.1 201 Created Location: https://example.com/Tables/777779466d2549d69f7e30b6c35bde3c
Clients can later use the address indicated by the Location header to retrieve a representation of the resource:
GET /Tables/777779466d2549d69f7e30b6c35bde3c HTTP/1.1 accept: application/json
Which elicits this response:
HTTP/1.1 200 OK Content-Type: application/json; charset=utf-8 {"type":"communal","capacity":12}
The JSON format still doesn't conform to the desired format because the Controller in question deals exclusively with the shared Table data model.
Boundary #
At the boundary of the application, Controllers handle HTTP requests with action methods (an ASP.NET term). The framework matches requests by a combination of naming conventions and attributes. The Post action method handles incoming POST requests.
[HttpPost] public async Task<IActionResult> Post(Table table) { var id = Guid.NewGuid(); await repository.Create(id, table).ConfigureAwait(false); return new CreatedAtActionResult( nameof(Get), null, new { id = id.ToString("N") }, null); }
Notice that the input parameter isn't a separate DTO, but rather the shared Table object. Since it's shared, the Controller can pass it directly to the repository without any mapping.
The same simplicity is on display in the Get method:
[HttpGet("{id}")] public async Task<IActionResult> Get(string id) { if (!Guid.TryParseExact(id, "N", out var guid)) return new BadRequestResult(); Table? table = await repository.Read(guid).ConfigureAwait(false); if (table is null) return new NotFoundResult(); return new OkObjectResult(table); }
Once the Get method has parsed the id it goes straight to the repository, retrieves the table and returns it if it's there. No mapping is required by the Controller. What about the repository?
Data access #
The SqlTablesRepository class reads data from, and writes data to, SQL Server using Entity Framework. The Create method is as simple as this:
public async Task Create(Guid id, Table table) { ArgumentNullException.ThrowIfNull(table); table.PublicId = id; await context.Tables.AddAsync(table).ConfigureAwait(false); await context.SaveChangesAsync().ConfigureAwait(false); }
The Read method is even simpler - a one-liner:
public async Task<Table?> Read(Guid id) { return await context.Tables .SingleOrDefaultAsync(t => t.PublicId == id).ConfigureAwait(false); }
Again, no mapping. Just return the database Entity.
XML serialization #
Simple, so far, but how does this data architecture handle changing requirements?
One axis of variation is when a service needs to support multiple representations. In this example, I'll imagine that the service also needs to support XML.
Granted, you may not run into that particular requirement that often, but it's typical of a kind of change that you're likely to run into. In REST APIs, for example, you should use content negotiation for versioning, and that's the same kind of problem.
To be fair, application code also changes for a variety of other reasons, including new features, changes to business logic, etc. I can't possibly cover all, though, and many of these are much better described than changes in wire formats.
As was also the case in the previous article, it quickly turns out that it's not possible to support any of the desired XML formats described in the introduction article. Instead, for the sake of exploring what is possible, I'll compromise and support XML documents like these examples:
<table> <type>communal</type> <capacity>12</capacity> </table> <table> <type>single</type> <capacity>4</capacity> <minimal-reservation>3</minimal-reservation> </table>
This schema, it turns out, is the same as the element-biased format from the previous article. I could, instead, have chosen to support the attribute-biased format, but, because of the shared data model, not both.
Notice how using statically typed classes, attributes, and Reflection to guide serialization leads toward certain kinds of formats. You can't easily support any arbitrary JSON or XML schema, but are rather nudged into a more constrained subset of possible formats. There's nothing too bad about this. As usual, there are trade-offs involved. You concede flexibility, but gain convenience: Just slap some attributes on your DTO, and it works well enough for most purposes. I mostly point it out because this entire article series is about awareness of choice. There's always some cost to be paid.
That said, supporting that XML format is surprisingly easy:
[XmlRoot("table")]
public partial class Table
{
    [JsonIgnore, XmlIgnore]
    public int Id { get; set; }

    [JsonIgnore, XmlIgnore]
    public Guid PublicId { get; set; }

    [XmlElement("type"), NotMapped]
    public string? Type { get; set; }

    [XmlElement("capacity")]
    public int Capacity { get; set; }

    [XmlElement("minimal-reservation")]
    public int? MinimalReservation { get; set; }

    public bool ShouldSerializeMinimalReservation() =>
        MinimalReservation.HasValue;

    internal void InferType()
    {
        Type = MinimalReservation.HasValue ? "single" : "communal";
    }
}
Most of the changes are simple additions of the XmlRoot, XmlElement, and XmlIgnore attributes. In order to serialize the <type> element, however, I also had to convert the Type property to a read/write property, which had some ripple effects.
For one, I had to add the NotMapped attribute to tell Entity Framework that it shouldn't try to save the value of that property in the database. As you can see in the above SQL schema, the Tables table has no Type column.
This also meant that I had to change the Read method in SqlTablesRepository to call the new InferType method:
public async Task<Table?> Read(Guid id) { var table = await context.Tables .SingleOrDefaultAsync(t => t.PublicId == id).ConfigureAwait(false); table?.InferType(); return table; }
I'm not happy with this kind of sequential coupling, but to be honest, this data architecture inherently has an appalling lack of encapsulation. Having to call InferType is just par for the course.
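The sequential coupling can be demonstrated in isolation. The following is a minimal sketch, assuming a trimmed copy of the Table class that keeps only the XML-related members; InferType is made public here so that it can be called from outside the assembly, and the `TableXml` helper is hypothetical.

```csharp
#nullable enable
using System;
using System.IO;
using System.Xml.Serialization;

// Trimmed copy of the shared model: EF and JSON attributes are left out.
[XmlRoot("table")]
public class Table
{
    [XmlElement("type")]
    public string? Type { get; set; }

    [XmlElement("capacity")]
    public int Capacity { get; set; }

    [XmlElement("minimal-reservation")]
    public int? MinimalReservation { get; set; }

    public bool ShouldSerializeMinimalReservation() =>
        MinimalReservation.HasValue;

    // Public in this sketch only; the article's version is internal.
    public void InferType()
    {
        Type = MinimalReservation.HasValue ? "single" : "communal";
    }
}

// Hypothetical helper that serializes a Table to an XML string.
public static class TableXml
{
    public static string Serialize(Table table)
    {
        var serializer = new XmlSerializer(typeof(Table));
        using var writer = new StringWriter();
        serializer.Serialize(writer, table);
        return writer.ToString();
    }
}
```

If a caller serializes a Table without first calling InferType, the Type property is still null and XmlSerializer leaves the type element out of the document. Nothing in the class's API prevents that mistake, which is what makes the coupling sequential.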
That said, despite a few stumbling blocks, adding XML support turned out to be surprisingly easy in this data architecture. Granted, I had to compromise on the schema, and could only support one XML schema, so we shouldn't really take this as an endorsement. To paraphrase Gerald Weinberg, if it doesn't have to work, it's easy to implement.
Evaluation #
There's no denying that the Shared Data Model architecture is simple. There's no mapping between different layers, and it's easy to get started. Like the DTO-based Ports and Adapters architecture, you'll find many documentation examples and getting-started guides that work like this. In a sense, you can say that it's what the ASP.NET framework, or, perhaps particularly the Entity Framework (EF), 'wants you to do'. To be fair, I find ASP.NET to be reasonably unopinionated, so what inveigling you may run into may be mostly attributable to EF.
While it may feel nice that it's easy to get started, instant gratification often comes at a cost. Consider the Table class shown here. Because of various constraints imposed by EF and the JSON and XML serializers, it has no encapsulation. One thing is that the sophisticated Visitor-encoded Table class introduced in the article Serializing restaurant tables in C# is completely out of the question, but you can't even add a required constructor like this one:
public Table(int capacity) { Capacity = capacity; }
Granted, it seems to work with both EF and the JSON serializer, which I suppose is a recent development, but it doesn't work with the XML serializer, which requires that
"A class must have a parameterless constructor to be serialized by XmlSerializer."
Even if this, too, changes in the future, DTO-based designs are at odds with encapsulation. If you doubt the veracity of that statement, I challenge you to complete the Priority Collection kata with serializable DTOs.
Another problem with the Shared Data Model architecture is that it so easily decays to a Big Ball of Mud. Even though the above architecture diagram hollowly insists that layering is still possible, a Shared Data Model is an attractor of behaviour. You'll soon find that a class like Table has methods that serve presentation concerns, others that implement business logic, and others again that address persistence issues. It has become a God Class.
From these problems it doesn't follow that the architecture doesn't have merit. If you're developing a CRUD-heavy application with a user interface (UI) that's merely a glorified table view, this could be a good choice. You would be coupling the UI to the database, so that if you need to change how the UI works, you might also have to modify the database schema, or vice versa.
This is still not an indictment, but merely an observation of consequences. If you can live with them, then choose the Shared Data Model architecture. I can easily imagine application types where that would be the case.
Conclusion #
In the Shared Data Model architecture you use a single model (here, a class) to handle all application concerns: UI, business logic, data access. While this shows a blatant disregard for the notion of separation of concerns, no law states that you must, always, separate concerns.
Sometimes it's okay to mix concerns, and then the Shared Data Model architecture is dead simple. Just make sure that you know when it's okay.
While this architecture is the ultimate in simplicity, it's also quite constrained. The third and final data architecture I'll cover, on the other hand, offers the ultimate in flexibility, at the expense (not surprisingly) of some complexity.
Next: Using only a Domain Model to persist restaurant table configurations.
Using Ports and Adapters to persist restaurant table configurations
A data architecture example in C# and ASP.NET.
This is part of a small article series on data architectures. In the first instalment, you'll see the outline of a Ports and Adapters implementation. As the introductory article explains, the example code shows how to create a new restaurant table configuration, or how to display an existing resource. The sample code base is an ASP.NET 8.0 REST API.
Keep in mind that while the sample code does store data in a relational database, the term table in this article mainly refers to physical tables, rather than database tables.
While Ports and Adapters architecture diagrams are usually drawn as concentric circles, you can also draw (subsets of) it as more traditional layered diagrams:
Here, the arrows indicate mappings, not dependencies.
HTTP interaction #
A client can create a new table with a POST HTTP request:
POST /tables HTTP/1.1 content-type: application/json { "communalTable": { "capacity": 16 } }
Which might elicit a response like this:
HTTP/1.1 201 Created Location: https://example.com/Tables/844581613e164813aa17243ff8b847af
Clients can later use the address indicated by the Location header to retrieve a representation of the resource:
GET /Tables/844581613e164813aa17243ff8b847af HTTP/1.1 accept: application/json
Which would result in this response:
HTTP/1.1 200 OK Content-Type: application/json; charset=utf-8 {"communalTable":{"capacity":16}}
By default, ASP.NET handles and returns JSON. Later in this article you'll see how well it deals with other data formats.
Boundary #
ASP.NET supports some variation of the model-view-controller (MVC) pattern, and Controllers handle HTTP requests. At the outset, the action method that handles the POST request looks like this:
[HttpPost] public async Task<IActionResult> Post(TableDto dto) { ArgumentNullException.ThrowIfNull(dto); var id = Guid.NewGuid(); await repository.Create(id, dto.ToTable()).ConfigureAwait(false); return new CreatedAtActionResult(nameof(Get), null, new { id = id.ToString("N") }, null); }
As is idiomatic in ASP.NET, input and output are modelled by data transfer objects (DTOs), in this case called TableDto. I've already covered this little object model in the article Serializing restaurant tables in C#, so I'm not going to repeat it here.
The ToTable method, on the other hand, is a good example of how trying to cut corners leads to more complicated code:
internal Table ToTable() { var candidate = Table.TryCreateSingle(SingleTable?.Capacity ?? -1, SingleTable?.MinimalReservation ?? -1); if (candidate is { }) return candidate.Value; candidate = Table.TryCreateCommunal(CommunalTable?.Capacity ?? -1); if (candidate is { }) return candidate.Value; throw new InvalidOperationException("Invalid TableDto."); }
Compare it to the TryParse method in the Serializing restaurant tables in C# article. That one is simpler, and less error-prone.
I think that I wrote ToTable that way because I didn't want to deal with error handling in the Controller, and while I test-drove the code, I never wrote a test that supplies malformed input. I should have, and so should you, but this is demo code, and I never got around to it.
Enough about that. The other action method handles GET requests:
[HttpGet("{id}")] public async Task<IActionResult> Get(string id) { if (!Guid.TryParseExact(id, "N", out var guid)) return new BadRequestResult(); var table = await repository.Read(guid).ConfigureAwait(false); if (table is null) return new NotFoundResult(); return new OkObjectResult(TableDto.From(table.Value)); }
The static TableDto.From method is identical to the ToDto method from the Serializing restaurant tables in C# article, just with a different name.
To summarize so far: At the boundary of the application, Controller methods receive or return TableDto objects, which are mapped to and from the Domain Model named Table.
Domain Model #
The Domain Model Table is also identical to the code shown in Serializing restaurant tables in C#. In order to comply with the Dependency Inversion Principle (DIP), mapping to and from TableDto is defined on the latter. The DTO, being an implementation detail, may depend on the abstraction (the Domain Model), but not the other way around.
In the same spirit, conversions to and from the database are defined entirely within the repository implementation.
Data access layer #
Keeping the example consistent, the code base also models data access with C# classes. It uses Entity Framework to read from and write to SQL Server. The class that models a row in the database is also a kind of DTO, even though here it's idiomatically called an entity:
public partial class TableEntity { public int Id { get; set; } public Guid PublicId { get; set; } public int Capacity { get; set; } public int? MinimalReservation { get; set; } }
I had a command-line tool scaffold the code for me, and since I don't usually live in that world, I don't know why it's a partial class. It seems to be working, though.
The SqlTablesRepository class implements the mapping between Table and TableEntity. For instance, the Create method looks like this:
public async Task Create(Guid id, Table table) { var entity = table.Accept(new TableToEntityConverter(id)); await context.Tables.AddAsync(entity).ConfigureAwait(false); await context.SaveChangesAsync().ConfigureAwait(false); }
That looks simple, but only because all the work is done by the TableToEntityConverter, which is a nested class:
private sealed class TableToEntityConverter : ITableVisitor<TableEntity> { private readonly Guid id; public TableToEntityConverter(Guid id) { this.id = id; } public TableEntity VisitCommunal(NaturalNumber capacity) { return new TableEntity { PublicId = id, Capacity = (int)capacity, }; } public TableEntity VisitSingle( NaturalNumber capacity, NaturalNumber minimalReservation) { return new TableEntity { PublicId = id, Capacity = (int)capacity, MinimalReservation = (int)minimalReservation, }; } }
Mapping the other way is easier, so the SqlTablesRepository does it inline in the Read method:
public async Task<Table?> Read(Guid id) { var entity = await context.Tables .SingleOrDefaultAsync(t => t.PublicId == id).ConfigureAwait(false); if (entity is null) return null; if (entity.MinimalReservation is null) return Table.TryCreateCommunal(entity.Capacity); else return Table.TryCreateSingle( entity.Capacity, entity.MinimalReservation.Value); }
Similar to the case of the DTO, mapping between Table and TableEntity is the responsibility of the SqlTablesRepository class, since data persistence is an implementation detail. According to the DIP it shouldn't be part of the Domain Model, and it isn't.
XML formats #
That covers the basics, but how well does this kind of architecture stand up to changing requirements?
One axis of variation is when a service needs to support multiple representations. In this example, I'll imagine that the service also needs to support not just one, but two, XML formats.
Granted, you may not run into that particular requirement that often, but it's typical of a kind of change that you're likely to run into. In REST APIs, for example, you should use content negotiation for versioning, and that's the same kind of problem.
To be fair, application code also changes for a variety of other reasons, including new features, changes to business logic, etc. I can't possibly cover all, though, and many of these are much better described than changes in wire formats.
As described in the introduction article, ideally the XML should support a format implied by these examples:
<communal-table> <capacity>12</capacity> </communal-table> <single-table> <capacity>4</capacity> <minimal-reservation>3</minimal-reservation> </single-table>
Notice that while these two examples have different root elements, they're still considered to both represent a table. Although, at the boundaries, static types are illusory, we may still, loosely speaking, consider both of those XML documents as belonging to the same 'type'.
To be honest, if there's a way to support this kind of schema by defining DTOs to be serialized and deserialized, I don't know what it looks like. That's not meant to imply that it's impossible. There's often an epistemological problem associated with proving things impossible, so I'll just leave it there.
To be clear, it's not that I don't know how to support that kind of schema at all. I do, as the article Using only a Domain Model to persist restaurant table configurations will show. I just don't know how to do it with DTO-based serialisation.
Element-biased XML #
Instead of the above XML schema, I will explore how hard it is to support a variant schema, implied by these two examples:
<table> <type>communal</type> <capacity>12</capacity> </table> <table> <type>single</type> <capacity>4</capacity> <minimal-reservation>3</minimal-reservation> </table>
This variation shares the same <table> root element and instead distinguishes between the two kinds of table with a <type> discriminator.
This kind of schema we can define with a DTO:
[XmlRoot("table")]
public class ElementBiasedTableXmlDto
{
    [XmlElement("type")]
    public string? Type { get; set; }

    [XmlElement("capacity")]
    public int Capacity { get; set; }

    [XmlElement("minimal-reservation")]
    public int? MinimalReservation { get; set; }

    public bool ShouldSerializeMinimalReservation() =>
        MinimalReservation.HasValue;

    // Mapping methods not shown here...
}
As you may have already noticed, however, this isn't the same type as TableJsonDto, so how are we going to implement the Controller methods that receive and send objects of this type?
Posting XML #
The service should still accept JSON as shown above, but now, additionally, it should also support HTTP requests like this one:
POST /tables HTTP/1.1 content-type: application/xml <table><type>communal</type><capacity>12</capacity></table>
How do you implement this new feature?
My first thought was to add a Post overload to the Controller:
[HttpPost] public async Task<IActionResult> Post(ElementBiasedTableXmlDto dto) { ArgumentNullException.ThrowIfNull(dto); var id = Guid.NewGuid(); await repository.Create(id, dto.ToTable()).ConfigureAwait(false); return new CreatedAtActionResult( nameof(Get), null, new { id = id.ToString("N") }, null); }
I just copied and pasted the original Post method and changed the type of the dto parameter. I also had to add a ToTable conversion to ElementBiasedTableXmlDto:
internal Table ToTable() { if (Type == "single") { var t = Table.TryCreateSingle(Capacity, MinimalReservation ?? 0); if (t is { }) return t.Value; } if (Type == "communal") { var t = Table.TryCreateCommunal(Capacity); if (t is { }) return t.Value; } throw new InvalidOperationException("Invalid Table DTO."); }
While all of that compiles, it doesn't work.
When you attempt to POST a request against the service, the ASP.NET framework now throws an AmbiguousMatchException indicating that "The request matched multiple endpoints". Which is understandable.
This led me to the first round of Framework Whac-A-Mole. What I'd like to do is to select the appropriate action method based on content-type or accept headers. How does one do that?
After some web searching, I came across a Stack Overflow answer that seemed to indicate a way forward.
Selecting the right action method #
One way to address the issue is to implement a custom ActionMethodSelectorAttribute:
public sealed class SelectTableActionMethodAttribute : ActionMethodSelectorAttribute
{
    public override bool IsValidForRequest(RouteContext routeContext, ActionDescriptor action)
    {
        if (action is not ControllerActionDescriptor cad)
            return false;

        if (cad.Parameters.Count != 1)
            return false;
        var dtoType = cad.Parameters[0].ParameterType;

        // Demo code only. This doesn't take into account a possible charset
        // parameter. See RFC 9110, section 8.3
        // (https://www.rfc-editor.org/rfc/rfc9110#field.content-type) for more
        // information.
        if (routeContext?.HttpContext.Request.ContentType == "application/json")
            return dtoType == typeof(TableJsonDto);
        if (routeContext?.HttpContext.Request.ContentType == "application/xml")
            return dtoType == typeof(ElementBiasedTableXmlDto);

        return false;
    }
}
As the code comment suggests, this isn't as robust as it should be. A content-type header may also look like this:
Content-Type: application/json; charset=utf-8
The exact string equality check shown above would fail in such a scenario, suggesting that a more sophisticated implementation is warranted. I'll skip that for now, since this demo code already compromises on the overall XML schema. For an example of more robust content negotiation implementations, see Using only a Domain Model to persist restaurant table configurations.
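As a hint of what a less brittle comparison might look like, here is a minimal sketch that parses the header value before comparing media types, so that parameters such as charset are ignored. It uses MediaTypeHeaderValue from System.Net.Http.Headers; the `ContentTypeMatcher` class and its `IsMediaType` method are hypothetical names that aren't part of the sample code base, and this is not the approach taken in the Domain Model Only article.

```csharp
#nullable enable
using System;
using System.Net.Http.Headers;

// Hypothetical helper, shown for illustration only.
public static class ContentTypeMatcher
{
    // Returns true when the header's media type equals the expected one,
    // regardless of parameters such as charset, and regardless of casing.
    public static bool IsMediaType(string? contentType, string expected)
    {
        if (!MediaTypeHeaderValue.TryParse(contentType, out var parsed))
            return false;

        return string.Equals(
            parsed.MediaType,
            expected,
            StringComparison.OrdinalIgnoreCase);
    }
}
```

With such a helper, the selector attribute could call ContentTypeMatcher.IsMediaType(request.ContentType, "application/json") instead of comparing strings directly, and a header like application/json; charset=utf-8 would still match.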
Adorn both Post action methods with this custom attribute, and the service now handles both formats:
[HttpPost, SelectTableActionMethod]
public async Task<IActionResult> Post(TableJsonDto dto)
// ...

[HttpPost, SelectTableActionMethod]
public async Task<IActionResult> Post(ElementBiasedTableXmlDto dto)
// ...
While that handles POST requests, it doesn't implement content negotiation for GET requests.
Getting XML #
In order to GET an XML representation, clients can supply an accept header value:
GET /Tables/153f224c91fb4403988934118cc14024 HTTP/1.1 accept: application/xml
which will reply with
HTTP/1.1 200 OK Content-Length: 59 Content-Type: application/xml; charset=utf-8 <table><type>communal</type><capacity>12</capacity></table>
How do we implement that?
Keep in mind that since this data-architecture variation uses two different DTOs to model JSON and XML, respectively, an action method can't just return an object of a single type and hope that the ASP.NET framework takes care of the rest. Again, I'm aware of middleware that'll deal nicely with this kind of problem, but not in this architecture; see Using only a Domain Model to persist restaurant table configurations for such a solution.
The best I could come up with, given the constraints I'd imposed on myself, then, was this:
[HttpGet("{id}")]
public async Task<IActionResult> Get(string id)
{
    if (!Guid.TryParseExact(id, "N", out var guid))
        return new BadRequestResult();

    var table = await repository.Read(guid).ConfigureAwait(false);
    if (table is null)
        return new NotFoundResult();

    // Demo code only. This doesn't take into account quality values.
    var accept =
        httpContextAccessor?.HttpContext?.Request.Headers.Accept.ToString();
    if (accept == "application/json")
        return new OkObjectResult(TableJsonDto.From(table.Value));
    if (accept == "application/xml")
        return new OkObjectResult(ElementBiasedTableXmlDto.From(table.Value));

    return new StatusCodeResult((int)HttpStatusCode.NotAcceptable);
}
As the comment suggests, this is once again code that barely passes the few tests that I have, but really isn't production-ready. An accept header may also look like this:
accept: application/xml; q=1.0,application/json; q=0.5
Given such an accept header, the service ought to return an XML representation with the application/xml content type, but instead, this Get method returns 406 Not Acceptable.
As I've already outlined, I'm not going to fix this problem, as this is only an exploration. It seems that we can already conclude that this style of architecture is ill-suited to deal with this kind of problem. If that's the conclusion, then why spend time fixing outstanding problems?
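For readers who'd like to see the shape of the missing piece anyway, here is a minimal sketch of how quality values could be taken into account. It relies on MediaTypeWithQualityHeaderValue from System.Net.Http.Headers; the `AcceptHeader` class and `SelectMediaType` method are hypothetical, the comma split is naive (it would break on quoted parameter values that contain commas), and wildcards such as */* are not handled.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http.Headers;

// Hypothetical helper, shown for illustration only.
public static class AcceptHeader
{
    // Picks the supported media type with the highest quality value,
    // or null if none of the acceptable types is supported.
    public static string? SelectMediaType(
        string? accept,
        IReadOnlyCollection<string> supported)
    {
        if (string.IsNullOrWhiteSpace(accept))
            return null;

        var candidates = new List<MediaTypeWithQualityHeaderValue>();
        foreach (var part in accept.Split(','))
        {
            if (MediaTypeWithQualityHeaderValue.TryParse(part.Trim(), out var parsed))
                candidates.Add(parsed);
        }

        return candidates
            .Where(c => (c.Quality ?? 1.0) > 0)         // q=0 means 'not acceptable'
            .OrderByDescending(c => c.Quality ?? 1.0)   // a missing q defaults to 1
            .Select(c => c.MediaType)
            .FirstOrDefault(m =>
                m is not null &&
                supported.Contains(m, StringComparer.OrdinalIgnoreCase));
    }
}
```

Given the accept header shown above and a service that supports both application/json and application/xml, this function should select application/xml. A null return value would correspond to the 406 Not Acceptable response.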
Attribute-biased XML #
Even so, just to punish myself, apparently, I also tried to add support for an alternative XML format that uses attributes to record primitive values. Again, I couldn't make the schema described in the introductory article work, but I did manage to add support for XML documents like these:
<table type="communal" capacity="12" /> <table type="single" capacity="4" minimal-reservation="3" />
The code is similar to what I've already shown, so I'll only list the DTO:
[XmlRoot("table")]
public class AttributeBiasedTableXmlDto
{
    [XmlAttribute("type")]
    public string? Type { get; set; }

    [XmlAttribute("capacity")]
    public int Capacity { get; set; }

    [XmlAttribute("minimal-reservation")]
    public int MinimalReservation { get; set; }

    public bool ShouldSerializeMinimalReservation() => 0 < MinimalReservation;

    // Mapping methods not shown here...
}
This DTO looks a lot like the ElementBiasedTableXmlDto class, only it adorns properties with XmlAttribute rather than XmlElement.
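To see the attribute-biased DTO at work outside of ASP.NET, here is a minimal round-trip sketch using XmlSerializer directly. The `AttributeBiasedXml` helper is hypothetical, and the mapping methods are omitted, as in the listing above. Note that MinimalReservation is a plain int in this DTO; one likely reason (an assumption, since the article doesn't say) is that XmlSerializer can't encode a nullable value type as an XML attribute.

```csharp
#nullable enable
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("table")]
public class AttributeBiasedTableXmlDto
{
    [XmlAttribute("type")]
    public string? Type { get; set; }

    [XmlAttribute("capacity")]
    public int Capacity { get; set; }

    [XmlAttribute("minimal-reservation")]
    public int MinimalReservation { get; set; }

    public bool ShouldSerializeMinimalReservation() => 0 < MinimalReservation;
}

// Hypothetical helper, shown for illustration only.
public static class AttributeBiasedXml
{
    private static readonly XmlSerializer serializer =
        new XmlSerializer(typeof(AttributeBiasedTableXmlDto));

    public static AttributeBiasedTableXmlDto? Parse(string xml)
    {
        using var reader = new StringReader(xml);
        return serializer.Deserialize(reader) as AttributeBiasedTableXmlDto;
    }

    public static string Write(AttributeBiasedTableXmlDto dto)
    {
        using var writer = new StringWriter();
        serializer.Serialize(writer, dto);
        return writer.ToString();
    }
}
```

Parsing the single-table example document should yield a DTO with Type "single", Capacity 4, and MinimalReservation 3. Writing a communal DTO leaves out the minimal-reservation attribute, because ShouldSerializeMinimalReservation returns false when the value is zero.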
Evaluation #
Even though I had to compromise on essential goals, I wasted an appalling amount of time and energy on yak shaving and Framework Whac-A-Mole. The DTO-based approach to modelling external resources really doesn't work when you need to do substantial content negotiation.
Even so, a DTO-based Ports and Adapters architecture may be useful when that's not a concern. If, instead of a REST API, you're developing a web site, you'll typically not need to vary representation independently of resource. In other words, a web page is likely to have at most one underlying model.
Compared to other large frameworks I've run into, ASP.NET is fairly unopinionated. Even so, the idiomatic way to use it is based on DTOs. DTOs to represent external data. DTOs to represent UI components. DTOs to represent database rows (although they're often called entities in that context). You'll find a ton of examples using this data architecture, so it's incredibly well-described. If you run into problems, odds are that someone has blazed a trail before you.
Even outside of .NET, this kind of architecture is well-known. While I've learned a thing or two from experience, I've picked up a lot of my knowledge about software architecture from people like Martin Fowler and Robert C. Martin.
When you also apply the Dependency Inversion Principle, you'll get good separations of concerns. This aspect of Ports and Adapters is most explicitly described in Clean Architecture. For example, a change to the UI generally doesn't affect the database. You may find that example ridiculous, because why should it, but consult the article Using a Shared Data Model to persist restaurant table configurations to see how this may happen.
The major drawback of the DTO-based data architecture is that much mapping is required. With three different DTOs (e.g. JSON DTO, Domain Model, and ORM Entity), you need four-way translation as indicated in the above figure. People often complain about all that mapping, and no: ORMs don't reduce the need for mapping.
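To make the four-way translation concrete, here's a minimal sketch. All three record types and the TableMappings class are hypothetical stand-ins invented for this illustration, not the code from the example repository, and all validation is left out.

```csharp
#nullable enable
using System;

// Hypothetical stand-ins for the three representations of a table.
public sealed record TableJsonDto(string Type, int Capacity, int? MinimalReservation);
public sealed record Table(bool IsCommunal, int Capacity, int MinimalReservation);
public sealed record TableEntity(bool Communal, int Capacity, int? MinimalReservation);

public static class TableMappings
{
    // 1. JSON DTO -> Domain Model
    public static Table ToDomain(this TableJsonDto dto) =>
        new Table(dto.Type == "communal", dto.Capacity, dto.MinimalReservation ?? 1);

    // 2. Domain Model -> JSON DTO
    public static TableJsonDto ToJsonDto(this Table table) =>
        new TableJsonDto(
            table.IsCommunal ? "communal" : "single",
            table.Capacity,
            table.IsCommunal ? (int?)null : table.MinimalReservation);

    // 3. Domain Model -> ORM Entity
    public static TableEntity ToEntity(this Table table) =>
        new TableEntity(
            table.IsCommunal,
            table.Capacity,
            table.IsCommunal ? (int?)null : table.MinimalReservation);

    // 4. ORM Entity -> Domain Model
    public static Table ToDomain(this TableEntity entity) =>
        new Table(entity.Communal, entity.Capacity, entity.MinimalReservation ?? 1);
}
```

Even in this toy version, every new property must be threaded through four methods, which is the maintenance burden that people complain about.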
Another problem is that this style of architecture is complicated. As I've argued elsewhere, Ports and Adapters often constitute an unstable equilibrium. While you can make it work, it requires a level of sophistication and maturity among team members that is not always present. And when it goes wrong, it may quickly deteriorate into a Big Ball of Mud.
Conclusion #
A DTO-based Ports and Adapters architecture is well-described and has good separation of concerns. In this article, however, we've seen that it doesn't deal successfully with realistic content negotiation. While that may seem like a shortcoming, it's a drawback that you may be able to live with. Particularly if you don't need to do content negotiation at all.
This way of organizing code around data is quite common. It's often the default data architecture, and I sometimes get the impression that a development team has 'chosen' to use it without considering alternatives.
It's not a bad architecture despite evidence to the contrary in this article. The scenario examined here may not be relevant. The main drawback of having all these objects playing different roles is all the mapping that's required.
The next data architecture attempts to address that concern.
Next: Using a Shared Data Model to persist restaurant table configurations.
Three data architectures for the server
A comparison, for educational purposes.
Use the right tool for the job. How often have you encountered that phrase when discussing software architecture?
There's nothing wrong with the sentiment per se, but it's almost devoid of meaning. It doesn't pass the 'not test'. Try to negate it and imagine if anyone would seriously hold that belief: Don't use the right tool for the job, said no-one ever.
Even so, the underlying idea is that there are better and worse ways to solve problems. In software architecture too. It follows that you should choose the better solution.
How to do that requires skill and experience. When planning a good software architecture, an important consideration is how it'll handle future requirements. This seems to indicate that an architect should be able to predict the future in order to pick the best architecture. Which is, in general, not possible. Predicting the future is not the topic of this article.
There is, however, a more practical issue related to the notion of using the right tool for the job. One that we can address.
Choice #
In order to choose the better solution, you need to be aware of alternatives. You can't choose if there's nothing to choose from. This seems obvious, but a flowchart may drive home the point in an even stronger fashion.
On the other hand, if you have options, you're now in a position to choose.
In order to make a decision, you must be able to identify alternatives. This is hardly earth-shattering, but perhaps a bit abstract. To make it concrete, in this article, I'll look at a particular example.
Default data architecture #
Many applications need some sort of persistent storage. Particularly when it comes to (relational) database-based systems, I've seen more than one organization defaulting to a single data architecture: A presentation layer with View Models, a business logic layer with Domain Models, and a data access layer with ORM objects. A few decades ago, you'd typically see that model illustrated with horizontal layers. This is no longer en vogue. Today, most organizations that I consult with will tell me that they've decided on Ports and Adapters. Even so, if you do it right, it's the same architecture.
Reusing a diagram from a recent article, we may draw it like this:
The architect or senior developer who made that decision is obviously aware of some of the lore in the industry. He or she can often name that data architecture as either Ports and Adapters, Hexagonal Architecture, Clean Architecture, or, more rarely, Onion Architecture.
I still get the impression that this way of arranging code was chosen by default, without much deliberation. I see it so often that it strikes me as a 'default architecture'. Are architects aware of alternatives? Can they compare the benefits and drawbacks of each alternative?
Three alternatives #
As an example, I'll explore three alternative data architectures, one of them being Ports and Adapters. My goal with this is only to raise awareness. Since I rarely (if ever) see my customers use anything other than Ports and Adapters, I think some readers may benefit from seeing some alternatives.
I'll show three ways to organize data with code, but that doesn't imply that these are the only three options. At the very least, some hybrid combinations are also possible. It's also possible that a fourth or fifth alternative exists, and I'm just not aware of it.
In three articles, you'll see each data architecture explored in more detail.
- Using Ports and Adapters to persist restaurant table configurations
- Using a Shared Data Model to persist restaurant table configurations
- Using only a Domain Model to persist restaurant table configurations
As the titles suggest, all three examples will attempt to address the same problem: How to persist restaurant table configuration for a restaurant. The scenario is the same as already outlined in the article Serialization with and without Reflection, and the example code base also attempts to follow the external data format of those articles.
Data formats #
In JSON, a table may be represented like this:
{ "singleTable": { "capacity": 16, "minimalReservation": 10 } }
Or like this:
{ "communalTable": { "capacity": 10 } }
But I'll also explore what happens if you need to support multiple external formats, such as XML. Generally speaking, a given XML specification may lean towards favouring a verbose style based on elements, or a terser style based on attributes. An example of the former could be:
<communal-table> <capacity>12</capacity> </communal-table>
or
<single-table> <capacity>4</capacity> <minimal-reservation>3</minimal-reservation> </single-table>
while examples of the latter style include
<communal-table capacity="12" />
and
<single-table capacity="4" minimal-reservation="3" />
As it turns out, only one of the three data architectures is flexible enough to fully address such requirements.
Comparisons #
A REST API is the kind of application where data representation flexibility is most likely to be an issue. Thus, that only one of the three alternative architectures is able to exhibit enough expressive power in that dimension doesn't disqualify the other two. Each comes with its own benefits and drawbacks.
| | Ports and Adapters | Shared Data Model | Domain Model only |
| Advantages | | | |
| Disadvantages | | | |
I'll discuss each alternative's benefits and drawbacks in their individual articles.
An important point of all this is that none of these articles are meant to be prescriptive. While I do have favourites, my biases are shaped by the kind of work I typically do. In other contexts, another alternative may prevail.
Example code #
As usual, example code is in C#. Of the three languages in which I'm most proficient (the other two being F# and Haskell), this is the most easily digestible for a larger audience.
All three alternatives are written with ASP.NET 8.0, and it's unavoidable that there will be some framework-specific details. In Code That Fits in Your Head, I made it an explicit point that while the examples in the book are in C#, the book (and the code in it) should be understandable by developers who normally use Java, C++, TypeScript, or similar C-based languages.
The book is, for that reason, light on .NET-specific details. Instead, I published an article that collects all the interesting .NET things I ran into while writing the book.
Not so here. The three articles cover enough ASP.NET particulars that readers who don't care about that framework are encouraged to skim-read.
I've developed the three examples as three branches of the same Git repository. The code is available upon request against a small support donation of 10 USD (or more). If you're one of my regular supporters, you have my gratitude and can get the code without further donation. Send me an email in both cases.
Conclusion #
There's more than one way to organize a code base to deal with data. Depending on context, one may be a better choice than another. Thus, it pays to be aware of alternatives.
In the remaining articles in this series, you'll see three examples of how to deal with persistent data from a database. In order to establish a baseline, the first covers the well-known Ports and Adapters architecture.
Next: Using Ports and Adapters to persist restaurant table configurations.
The end of trust?
Software development in a globalized, hostile world.
Imagine that you're perusing the thriller section in an airport book store and come across a book with the following back cover blurb:
Programmers are dying.
Holly-Ann Kerr works as a data scientist for an NGO that fights workplace discrimination. While scrubbing input, she discovers an unusual pattern in the data. Some employees seem to have an unusually high fatal accident rate. Programmers are dying in traffic accidents, falling on stairs, defective electrical wiring, smoking in bed. They work for a variety of companies. Some for Big Tech, others for specialized component vendors, some for IT-related NGOs, others again for utility companies. The deaths seem to have nothing in common, until H-A uncovers a disturbing pattern.
All victims had recently started in a new position. And all were of Iranian descent.
Is a racist killer on the loose? But if so, why is he only targeting new hires? And why only software developers?
When H-A shares her discovery with the wrong people, she soon discovers that she'll be the next victim.
Okay, I'm not a professional editor, so this could probably do with a bit of polish. Does it sound like an exciting piece of fiction, though?
I'm going to spoil the plot, since the book doesn't exist anyway.
An international plot #
(Apologies to Iranian readers. I have nothing against Iranians, but find the regime despicable. In any case, nothing in the following hinges on the ICC. You can replace it with another adversarial intelligence agency that you don't like, including, but not limited to RGB, FSB, or a clandestine Chinese intelligence organization. You could probably even swap the roles and make CIA, MI5, or Mossad be the bad guys, if your loyalties lie elsewhere.)
In the story, it turns out that clandestine Iranian special operations are attempting to recruit moles in software organizations that constitute the supply chain of Western digital infrastructure.
Intelligence bureaus and software organizations that directly develop sensitive software tend to have good security measures. Planting a mole in such an organization is difficult. The entire supply chain of software dependencies, on the other hand, is much more vulnerable. If you can get an employee to install a backdoor in left-pad, chances are that you may attain remote execution capabilities on an ostensibly secure system.
In my hypothetical thriller, the Iranians kill those software developers that they fail to recruit. After all, one can't run a clandestine operation if people notify the police that they've been approached by a foreign power.
Long game #
Does that plot sound far-fetched?
I admit that I did turn some plot elements up to 11. This is, after all, supposed to be a thriller.
The story is, however, 'loosely based on real events'. Earlier this year, a Microsoft developer revealed a backdoor that someone had intentionally planted in xz Utils. That version of the software was close to being merged into Debian and Red Hat Linux distributions. It would have enabled an attacker to execute arbitrary code on an infected machine.
The attack was singularly sophisticated. It also looks as though it was initiated years ago by one or more persons who contributed real, useful work to an open-source project, apparently (in hindsight) with the sole intention of gaining the trust of the rest of the community.
This is such a long game that it reeks of an adversarial state actor. The linked article speculates on which foreign power may be behind the attack. No, not the Iranians, after all.
If you think about it, it's an entirely rational gambit for a foreign intelligence agency to make. It's not that the NSA hasn't already tried something comparable. If anything, the xz hack mostly seems far-fetched because it's so unnecessarily sophisticated.
Usually, the most effective hacking techniques utilize human trust or gullibility. Why spend enormous effort developing sophisticated buffer overrun exploits if you can get a (perhaps unwitting) insider to run arbitrary code for you?
It'd be much cheaper, and much more reliable, to recruit moles on the inside of software companies, and get them to add the backdoors you need. It doesn't necessarily have to be new hires, but perhaps (I'm speculating) it's easier to recruit people before they've developed any loyalty to their new team mates.
The soft underbelly #
Which software organizations are the most promising targets? If it were me, I'd particularly try to go after various component vendors. One category may be companies that produce RAD tools such as grid GUIs, but also service providers that offer free SDKs to, say, send email, generate invoices, send SMS, charge credit cards, etc.
I'm not implying that any such company has ill intent, but since such software runs on many machines, it's a juicy target if you can sneak a backdoor into one.
Why not open-source software (OSS)? Many OSS libraries run on even more machines, so wouldn't that be an even more attractive target for an adversary? Yes, but on the other hand, most popular open-source code is also scrutinized by many independent agents, so it's harder to sneak in a backdoor. As the attempted xz hack demonstrates, even a year-long sophisticated attack is at risk of being discovered.
Doesn't commercial or closed-source code receive the same level of scrutiny?
In my experience, not always. Of course, some development organizations use proper shared-code-ownership techniques like code reviews or pair programming, but others rely on siloed solo development. Programmers just check in code that no-one else ever looks at.
In such an organization, imagine how easy it'd be for a mole to add a backdoor to a widely-distributed library. He or she wouldn't even have to resort to sophisticated ways to obscure the backdoor, because no colleague would be likely to look at the code. Particularly not if you bury it in seven levels of nested for loops and call the class MonitorManager or similar. As long as the reusable library ships as compiled code, it's unlikely that customers will discover the backdoor before it's too late.
Trust #
Last year I published an article on trust in software development. The point of that piece wasn't that you should suspect your colleagues of ill intent, but rather that you can trust neither yourself nor your co-workers for the simple reason that people make mistakes.
Since then, I've been doing some work in the digital security space, and I've been forced to think about concerns like supply-chain attacks. The implications are, unfortunately, that you can't automatically trust that your colleague has benign intentions.
This, obviously, will vary with context. If you're only writing a small web site for your HR department to use, it's hard to imagine how an adversarial state actor could take advantage of a backdoor in your code. If so, it's unlikely that anyone will go to the trouble of planting a mole in your organization.
On the other hand, if you're writing any kind of reusable library or framework, you just might be an interesting target. If so, you can no longer entirely trust your team mates.
As a Dane, that bothers me deeply. Denmark, along with the other Nordic countries, exhibits the highest levels of inter-societal trust in the world. I was raised to trust strangers, and so far, it's worked well enough for me. A business transaction in Denmark is often just a short email exchange. It's a great benefit to the business environment, and the economy in general, that we don't have to waste a lot of resources filling out forms, contracts, agreements, etc. Trust is grease that makes society run smoother.
Even so, Scandinavians aren't naive. We don't believe that we can trust everyone. To a large degree, we rely on a lot of subtle social cues to assess a given situation. Some people shouldn't be trusted, and we're able to identify those situations, too.
What remains is that insisting that you can trust your colleague, just because he or she is your colleague, would be descending into teleology. I'm not a proponent of wishful thinking if good arguments suggest the contrary.
Shared code ownership #
Perhaps you shouldn't trust your colleagues. How does that impact software development?
The good news is that this is yet another argument to practice the beneficial practices of shared code ownership. Crucially, what this should entail is not just that everyone is allowed to edit any line of code, but rather that all team members take responsibility for the entire code base. No-one should be allowed to write code in splendid isolation.
There are several ways to address this concern. I often phrase it as follows: There should be at least two pairs of eyes on every line of code before a merge to master.
As I describe in Code That Fits in Your Head, you can achieve that goal with pair programming, ensemble programming, or code reviews (including agile pull request reviews). That's a broad enough palette that it should be possible for every developer in every organization to find a modus vivendi that fits any personality and context.
Just looking at each other's code could significantly raise the bar for a would-be mole to add a backdoor to the code base. As an added benefit, it might also raise the general code quality.
What this does suggest to me, however, is that a too simplistic notion of running on trunk may be dangerous. Letting everyone commit to master and trusting that everyone means well no longer strikes me as a good idea (again, given the context, and all that).
Or, if you do, you should consider having some sort of systematic posterior review process. I've read of organizations that do that, but specific sources escape me at the moment. With Git, however, it's absolutely within the realms of the possible to make a diff of all changes since the last ex-post review, and then go through those changes.
Conclusion #
The world is changed. I feel it in the OWASP top 10. I sense it in the shifting geopolitical climate. I smell it on the code I review.
Much that once was, is lost. The dream of a global computer network with boundless trust is no more. There are countries whose interests no longer align with ours. Who pay full-time salaries to people whose job it is to wage 'cyber warfare' against us. We can't rule out that parts of such campaigns include planting moles in our midst. Moles whose task it is to weaken the foundations of our digital infrastructure.
In that light, should you always trust your colleagues?
Despite the depressing thought that I probably shouldn't, I'm likely to bounce back to my usual Danish most-people-are-to-be-trusted attitude tomorrow. On the other hand, I'll still insist that more than one person is involved with every line of code. Not only because every other person may be a foreign agent, but mostly, still, because humans are fallible, and two brains think better than one.
Comments
Or, if you do, you should consider having some sort of systematic post mortem review process. I've read of organizations that do that, but specific sources escape me at the moment.
My company has a Google Docs template for postmortem analysis that we use when something goes especially wrong. The primary focus is stating what went wrong according to the "five whys technique". Our template links to this post by Eric Ries. There is also this Wikipedia article on the subject. The section headings are "What happened" (one sentence), "Impact on Customers" (duration and severity), "What went wrong (5 Whys)", "What went right (optional)", "Corrective Actions" (and all of the content so far should be short enough to fit on one page), "Timeline" (a bulleted list asking for "Event beginning", "Time to Detect (monitoring)", "Time to Notify (alerting)", "Time to Respond (devops)", "Time to Troubleshoot (devops)", "Time to Mitigate (devops)", "Event end"), "Logs (optional)".
Tyson, thank you for writing. I now realize that 'post mortem' was a poor choice of words on my part, since it implies that something went wrong. I should have written 'posterior' instead. I'll update the article.
I've been digging around a bit to see if I can find the article that originally made me aware of that option. I'm fairly sure that it wasn't Optimizing the Software development process for continuous integration and flow of work, but that article, on the other hand, seems to be the source that other articles cite. It's fairly long, and also discusses other concepts; the relevant section here is the one called Non-blocking reviews.
A shorter overview of this kind of process can be found in Non-Blocking, Continuous Code Reviews - a case study.
In change management/risk control, your There should be at least two pairs of eyes on every line of code is called the four-eyes principle, and is a standard practice in my industry (IT services provider for the travel industry).
It goes further, and requires 2 more pairs of eyes for any changes from the code review, to the load of a specific software in production.
It has a nice side-effect during code reviews: it's an automatic way to disseminate knowledge in the team, so the bus factor is never 1.
I think that real people can mostly be trusted. But, software is not always run by people. Even when it is, a single non-trust-worthy person's action is amplified by software being run by mindless computers. It's like one rotten apple is enough to poison the full bag.
In the end, and a bit counter-intuitively, trusting people less now is leading to being able to trust more soon: people are forced to say "you can trust me, and here are the proofs". (Eg: the recently announced Apple's Private Cloud Compute).
Jiehong, thank you for writing. Indeed, in Code That Fits in Your Head I discuss how shared code ownership reduces the bus factor.
From this article and previous discussions I've had, I can see that the term trust is highly charged. People really don't like the notion that trust may be misplaced, or that mistrust, even, might be appropriate. I can't tell if it's a cultural bias of which I'm not aware. While English isn't my native language, I believe that I'm sufficiently acquainted with anglo-saxon culture to know of its most obvious quirks. Still, I'm sometimes surprised.
I admit that I, too, first consider whether I'm dealing with a deliberate adversary if I'm asked whether I trust someone, but soon after, there's a secondary interpretation that originates from normal human fallibility. I've already written about that: No, I don't trust my colleagues to be infallible, as I don't trust myself to be so.
Fortunately, it seems to me that the remedies that may address such concerns are the same, regardless of the underlying reasons.
Should interfaces be asynchronous?
Async and await are notorious for being contagious. Must all interfaces be Task-based, just in case?
I recently came across this question on Mastodon:
"To async or not to async?
"How would you define a library interface for a service that probably will be implemented with an in memory procedure - let's say returning a mapped value to a key you registered programmatically - and a user of your API might want to implement a decorator that needs a 'long running task' - for example you want to log a msg into your db or load additional mapping from a file?
"Would you define the interface to return a Task<string> or just a string?"
While seemingly a simple question, it's both fundamental and turns out to have deeper implications than you may at first realize.
Interpretation #
Before I proceed, I'll make my interpretation of the question more concrete. This is just how I interpret the question, so doesn't necessarily reflect the original poster's views.
The post itself doesn't explicitly mention a particular language, and since several languages now have async and await features, the question may be of more general interest than a question constrained to a single language. On the other hand, in order to have something concrete to discuss, it'll be useful to have some real code examples. From perusing the discussion surrounding the original post, I get the impression that the language in question may be C#. That suits me well, since it's one of the languages with which I'm most familiar, and is also a language where programmers of other C-based languages should still be able to follow along.
My interpretation of the implementation, then, is this:
public sealed class NameMap
{
    private readonly Dictionary<Guid, string> knownIds = new()
    {
        { new Guid("4778CA3D-FB1B-4665-AAC1-6649CEFA4F05"), "Bob" },
        { new Guid("8D3B9093-7D43-4DD2-B317-DCEE4C72D845"), "Alice" }
    };

    public string GetName(Guid guid)
    {
        return knownIds.TryGetValue(guid, out var name) ? name : "Trudy";
    }
}
Nothing fancy, but, as Fandermill writes in a follow-up post:
"Used examples that first came into mind, but it could be anything really."
The point, as I understand it, is that the intended implementation doesn't require asynchrony. A Decorator, on the other hand, may.
Should we, then, declare an interface like the following?
public interface INameMap
{
    Task<string> GetName(Guid guid);
}
If we do, the NameMap class can't automatically implement that interface because the return types of the two GetName methods don't match. What are the options?
Conform #
While the following may not be the 'best' answer, let's get the obvious solution out of the way first. Let the implementation conform to the interface:
public sealed class NameMap : INameMap
{
    private readonly Dictionary<Guid, string> knownIds = new()
    {
        { new Guid("4778CA3D-FB1B-4665-AAC1-6649CEFA4F05"), "Bob" },
        { new Guid("8D3B9093-7D43-4DD2-B317-DCEE4C72D845"), "Alice" }
    };

    public Task<string> GetName(Guid guid)
    {
        return Task.FromResult(
            knownIds.TryGetValue(guid, out var name) ? name : "Trudy");
    }
}
This variation of the NameMap class conforms to the interface by making the GetName method look asynchronous.
We may even keep the synchronous implementation around as a public method if some client code might need it:
public sealed class NameMap : INameMap
{
    private readonly Dictionary<Guid, string> knownIds = new()
    {
        { new Guid("4778CA3D-FB1B-4665-AAC1-6649CEFA4F05"), "Bob" },
        { new Guid("8D3B9093-7D43-4DD2-B317-DCEE4C72D845"), "Alice" }
    };

    public Task<string> GetName(Guid guid)
    {
        return Task.FromResult(GetNameSync(guid));
    }

    public string GetNameSync(Guid guid)
    {
        return knownIds.TryGetValue(guid, out var name) ? name : "Trudy";
    }
}
Since C# doesn't support return-type-based overloading, we need to distinguish these two methods by giving them different names. In C# it might be more idiomatic to name the asynchronous method GetNameAsync and the synchronous method just GetName, but for reasons that would be too much of a digression now, I've never much liked that naming convention. In any case, I'm not going to go in this direction for much longer, so it hardly matters how we name these two methods.
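For context, here's a minimal sketch of the kind of Decorator that motivates the Task-based interface in the first place. The LoggingNameMap class and its TextWriter dependency are assumptions made for illustration; the original question only mentions logging to a database or loading mappings from a file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public interface INameMap
{
    Task<string> GetName(Guid guid);
}

public sealed class NameMap : INameMap
{
    private readonly Dictionary<Guid, string> knownIds = new()
    {
        { new Guid("4778CA3D-FB1B-4665-AAC1-6649CEFA4F05"), "Bob" },
        { new Guid("8D3B9093-7D43-4DD2-B317-DCEE4C72D845"), "Alice" }
    };

    // The in-memory lookup is synchronous, so it wraps the result in a
    // completed Task to conform to the interface.
    public Task<string> GetName(Guid guid)
    {
        return Task.FromResult(
            knownIds.TryGetValue(guid, out var name) ? name : "Trudy");
    }
}

// Hypothetical Decorator: logs each lookup with a genuinely
// asynchronous write before returning the inner result.
public sealed class LoggingNameMap : INameMap
{
    private readonly INameMap inner;
    private readonly TextWriter log;

    public LoggingNameMap(INameMap inner, TextWriter log)
    {
        this.inner = inner;
        this.log = log;
    }

    public async Task<string> GetName(Guid guid)
    {
        var name = await inner.GetName(guid);
        await log.WriteLineAsync($"{guid} -> {name}");
        return name;
    }
}
```

Only the Decorator needs async and await here. The inner NameMap pays for the Task wrapper without gaining anything from it, which is the tension the question is about.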
Kinds of interfaces #
Another digression is, however, quite important. Before we can look at some more code, I'm afraid that we have to perform a bit of practical ontology, as it were. It starts with the question: Why do we even need interfaces?
I should also make clear, as a digression within a digression, that by 'interface' in this context, I'm really interested in any kind of mechanism that enables you to achieve polymorphism. In languages like C# or Java, we may in fact avail ourselves of the interface keyword, as in the above INameMap example, but we may equally well use a base class or perhaps just what C# calls a delegate. In other languages, we may use function or action types, or even function pointers.
Regardless of specific language constructs, there are, as far as I can tell, two kinds of interfaces:
- Interfaces that enable variability or extensibility in behaviour.
- Interfaces that mostly or exclusively exist to support automated testing.
While there may be some overlap between these two kinds, in my experience, the intersection between the two tends to be surprisingly small. Interfaces tend to mostly belong to one of those two categories.
Strategies and higher-order functions #
In design-patterns parlance, examples of the first kind are Builder, State, Chain of Responsibility, and Template Method, with the Strategy pattern perhaps the starkest representative. A Strategy is an encapsulated piece of behaviour that you pass around as a single 'thing' (an object).
And granted, you could also use a Strategy to access a database or make a web-service call, but that's not how the pattern was originally described. We'll return to that use case in the next section.
Rather, the first kind of interface exists to enable extensibility or variability in algorithms. Typical examples (from Design Patterns) include page layout, user interface component rendering, building a maze, finding the most appropriate help topic for a given application context, and so on. If we wish to relate this kind of interface to the SOLID principles, it mostly exists to support the Open-closed principle.
A good heuristic for identifying such interfaces is to consider the Reused Abstractions Principle (Jason Gorman, 2010, I'd link to it, but the page has fallen off the internet. Use your favourite web archive to read it.). If your code base contains multiple production-ready implementations of the same interface, you're reusing the interface, most likely to vary the behaviour of a general-purpose data structure.
And before the functional-programming (FP) crowd becomes too smug: FP uses this kind of interface all the time. In the FP jargon, however, we rather talk about higher-order functions and the interfaces we use to modify behaviour are typically modelled as functions and passed as lambda expressions. So when you write Cata((_, xs) => xs.Sum(), _ => 1) (as one does), you might as well just have passed a Visitor implementation to an Accept method.
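The equivalence can be sketched with a small rose tree. Everything below is an assumption made for illustration: the RoseTree, Leaf, and Node classes, the IRoseTreeVisitor interface, and the LeafCounter class are hypothetical, and this Visitor is simplified in that it receives already-folded branch results rather than raw subtrees.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Visitor interface bundling the two functions that Cata takes.
public interface IRoseTreeVisitor<N, L, TResult>
{
    TResult VisitNode(N value, IEnumerable<TResult> branches);
    TResult VisitLeaf(L value);
}

public abstract class RoseTree<N, L>
{
    public abstract TResult Cata<TResult>(
        Func<N, IEnumerable<TResult>, TResult> node,
        Func<L, TResult> leaf);

    // Accept is Cata with the two functions packaged as one object.
    public TResult Accept<TResult>(IRoseTreeVisitor<N, L, TResult> visitor) =>
        Cata<TResult>(visitor.VisitNode, visitor.VisitLeaf);
}

public sealed class Leaf<N, L> : RoseTree<N, L>
{
    private readonly L value;

    public Leaf(L value) { this.value = value; }

    public override TResult Cata<TResult>(
        Func<N, IEnumerable<TResult>, TResult> node,
        Func<L, TResult> leaf) => leaf(value);
}

public sealed class Node<N, L> : RoseTree<N, L>
{
    private readonly N value;
    private readonly IReadOnlyCollection<RoseTree<N, L>> branches;

    public Node(N value, params RoseTree<N, L>[] branches)
    {
        this.value = value;
        this.branches = branches;
    }

    // Fold each branch first, then combine the results at this node.
    public override TResult Cata<TResult>(
        Func<N, IEnumerable<TResult>, TResult> node,
        Func<L, TResult> leaf) =>
        node(value, branches.Select(b => b.Cata(node, leaf)).ToList());
}

// Object-oriented counterpart of Cata((_, xs) => xs.Sum(), _ => 1).
public sealed class LeafCounter<N, L> : IRoseTreeVisitor<N, L, int>
{
    public int VisitNode(N value, IEnumerable<int> branches) => branches.Sum();
    public int VisitLeaf(L value) => 1;
}
```

Given a tree of type RoseTree<string, int>, calling tree.Cata<int>((_, xs) => xs.Sum(), _ => 1) and tree.Accept(new LeafCounter<string, int>()) both count the leaves. The lambda pair and the Visitor object carry the same information.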
This hints at a more quantifiable distinction: If the interface models something that's intended to be a pure function, it'd typically be part of a higher-order API in FP, while we in object-oriented design (once again) lack the terminology to distinguish these interfaces from the other kind.
These days, in C# I mostly use these kinds of interfaces for the Visitor pattern.
Seams #
The other kind of interface exists to afford automated testing. In Working Effectively with Legacy Code, Michael Feathers calls such interfaces Seams. Modern object-oriented code bases often use Dependency Injection (DI) to control which Strategies are in use in a given context. The production system may use an object that communicates with a relational database, while an automated test environment might replace that with a Test Double.
Yes, I wrote Strategies. As I suggested above, a Strategy is really a replaceable object in its purest form. When you use DI you may call all those interfaces IUserRepository, ICommandHandler, IEmailGateway, and so on, but they're really all Strategies.
Contrary to the first kind of interface, you typically only find a single production implementation of each of these interfaces. If you find more than one, the rest are usually Decorators (one that logs, one that caches, one that works as a Circuit Breaker, etc.). All other implementations will be defined in the test code as dynamic mocks or Fakes.
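As a rough sketch of what such a Seam might look like, consider the following. The shape of IEmailGateway (a single Send method) and the two implementing classes are assumptions made for this example. The point is only to show the typical population of such an interface: Decorators in the production code, and a Fake in the test code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Seam; the Send signature is an assumption.
public interface IEmailGateway
{
    void Send(string recipient, string body);
}

// A Decorator: adds logging, then delegates to the wrapped gateway.
public sealed class LoggingEmailGateway : IEmailGateway
{
    private readonly IEmailGateway inner;
    private readonly Action<string> log;

    public LoggingEmailGateway(IEmailGateway inner, Action<string> log)
    {
        this.inner = inner;
        this.log = log;
    }

    public void Send(string recipient, string body)
    {
        log("Sending email to " + recipient);
        inner.Send(recipient, body);
    }
}

// A Fake, defined in test code: records recipients instead of sending.
public sealed class FakeEmailGateway : IEmailGateway
{
    private readonly List<string> recipients = new List<string>();

    public IReadOnlyList<string> Recipients
    {
        get { return recipients; }
    }

    public void Send(string recipient, string body)
    {
        recipients.Add(recipient);
    }
}
```

The one 'real' production implementation (the one that actually talks to a mail server) is deliberately left out here, since that's the non-deterministic part that the Seam exists to replace.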
Code bases that rely heavily on DI in order to support testing rub many people the wrong way. In 2014 David Heinemeier Hansson published a serious criticism of such test-induced damage. For the record, I agree with the criticism, but not with the conclusion. While I still practice test-driven development, I only define interfaces for true architectural dependencies. So, yes, my code bases may have an IReservationsRepository or IEmailGateway, but no ICommandHandler or IUserManager.
The bottom line, though, is that some interfaces exist to support testing. If there's a better way to make inherently non-deterministic systems behave deterministically in a test context, I've yet to discover it.
(As an aside, it's worth looking into tests that adopt non-deterministic behaviour as a driving principle, or at least an unavoidable state of affairs. Property-based testing is one such approach, but I also found the article When I'm done, I don't clean up by Arialdo Martini interesting. You may also want to refer to my article Waiting to happen for a discussion of how to make tests independent of system time.)
Where to define interfaces #
The reason the above distinction is important is that it fundamentally determines where interfaces should be defined. In short, the first kind of interface is part of an object model's API, and should be defined together with that API. The second kind, on the other hand, is part of a particular application's architecture, and should be defined by the client code that talks to the interface.
As an example of the first kind, consider this recent example, where the IPriorityEditor<T> interface is part of the PriorityCollection<T> API. You must ship the interface together with the class, because the Edit method takes an interface implementation as an argument. It's how client code interacts with the API.
Another example is this Table class that comes with an ITableVisitor<T> interface. In both cases, we'd expect interface implementations to be deterministic. These interfaces don't exist to support automated testing, but rather to afford a flexible programming model.
For the sake of argument, imagine that you package such APIs in reusable libraries that you publish via a package manager. In that case, it's obvious that the interface is as much part of the package as the class.
Contrast this with the other kind of interface, as described in the article Decomposing CTFiYH's sample code base or showcased in the article An example of state-based testing in C#. In the latter example, the interfaces IUserReader and IUserRepository are not part of any pre-packaged library. Rather, they are defined by the application code to support application-specific needs.
This may be even more evident if you contemplate the diagram in Decomposing CTFiYH's sample code base. Interfaces like IPostOffice and IReservationsRepository only exist to support the application. Following the Dependency Inversion Principle
"clients [...] own the abstract interfaces"
In these code bases, only the Controllers (or rather the tests that exercise them) need these interfaces, so the Controllers get to define them.
Should it be asynchronous, then? #
Okay, so should INameMap.GetName return string or Task<string>, then?
Hopefully, at this point, it should be clear that the answer depends on what kind of interface it is.
If it's the first kind, the return type should support the requirements of the API. If the object model doesn't need the return type to be asynchronous, it shouldn't be.
If it's the second kind of interface, the application code decides what it needs, and defines the interface accordingly.
In neither case, however, is it the concrete class' responsibility to second-guess what client code might need.
But client code may need the method to be asynchronous. What's the harm of returning Task<string>, just in case?
The problem, as you may well be aware, is that the asynchronous programming model is contagious. Once you've made an API asynchronous, you can't easily make it synchronous, whereas if you have a synchronous API, you can easily make it asynchronous. This follows from Postel's law, in this case: Be conservative with what you send.
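The asymmetry can be sketched in a few lines. The AsyncDirection class and its members are hypothetical names invented for this example, and the method bodies are stand-ins. What matters is the difference between the two adaptations.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical example class; names and bodies are assumptions.
public static class AsyncDirection
{
    // A synchronous API.
    public static string GetName(Guid guid)
    {
        return "name-" + guid.ToString("N").Substring(0, 4);
    }

    // Synchronous to asynchronous: wrap the result. No threads, no risk.
    public static Task<string> Lift(Guid guid)
    {
        return Task.FromResult(GetName(guid));
    }

    // An asynchronous API.
    public static async Task<string> GetNameAsync(Guid guid)
    {
        await Task.Yield();
        return GetName(guid);
    }

    // Asynchronous to synchronous: the caller has to block a thread,
    // which can deadlock when a synchronization context is in play.
    public static string Block(Guid guid)
    {
        return GetNameAsync(guid).GetAwaiter().GetResult();
    }
}
```

Lift is always safe. Block compiles, but it ties up the calling thread and is widely discouraged for that reason, so in practice callers of an asynchronous API tend to become asynchronous themselves.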
Library API #
Imagine, for the sake of argument, that the NameMap class is defined in a reusable library, wrapped in a package and imported into your code base via a package manager (NuGet, Maven, pip, NPM, Hackage, RubyGems, etc.).
Clearly it shouldn't implement any interface in order to 'support unit testing', since such interfaces should be defined by application code.
It could implement one or more 'extensibility' interfaces, if such interfaces are part of the wider API offered by the library. In the case of the NameMap class, we don't really know if that's the case. To complete this part of the argument, then, I'd just leave it as shown in the first code example above. It doesn't need to implement any interface, and GetName can just return string.
Domain Model #
What if, instead of an external library, the NameMap class is part of an application's Domain Model?
In that case, you could define application-level interfaces as part of the Domain Model. In fact, most people do. Even so, I'd recommend that you don't, at least if you're aiming for a Functional Core, Imperative Shell architecture, a functional architecture, or even a Ports and Adapters or, if you will, Clean Architecture. The interfaces that exist only to support testing are application concerns, so keep them out of the Domain Model and instead define them in the Application Model.
You don't have to follow my advice. If you want to define interfaces in the Domain Model, I can't stop you. But what if, as I recommend, you define application-specific interfaces in the Application Model? If you do that, your NameMap Domain Model can't implement your INameMap interface, because the dependencies point the other way, and most languages will not allow circular dependencies.
In that case, what do you do if, as the original toot suggested, you need to Decorate the GetName method with some asynchronous behaviour?
You can always introduce an Adapter:
public sealed class NameMapAdapter : INameMap
{
    private readonly NameMap imp;

    public NameMapAdapter(NameMap imp)
    {
        this.imp = imp;
    }

    public Task<string> GetName(Guid guid)
    {
        return Task.FromResult(imp.GetName(guid));
    }
}
Now any NameMap object can look like an INameMap. This is exactly the kind of problem that the Adapter pattern addresses.
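Here's a sketch of how the pieces might fit together once the Adapter is in place. The NameMap class below is an assumed, simplified shape (a dictionary lookup with a fallback value); the real class may look different. LoggingNameMap is a hypothetical asynchronous Decorator of the kind that could motivate the interface in the first place.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Assumed shape of the synchronous Domain Model class.
public sealed class NameMap
{
    private readonly IReadOnlyDictionary<Guid, string> names;

    public NameMap(IReadOnlyDictionary<Guid, string> names)
    {
        this.names = names;
    }

    public string GetName(Guid guid)
    {
        return names.TryGetValue(guid, out var name) ? name : "Unknown";
    }
}

// Application-specific interface, defined in the Application Model.
public interface INameMap
{
    Task<string> GetName(Guid guid);
}

// The Adapter from the article.
public sealed class NameMapAdapter : INameMap
{
    private readonly NameMap imp;

    public NameMapAdapter(NameMap imp)
    {
        this.imp = imp;
    }

    public Task<string> GetName(Guid guid)
    {
        return Task.FromResult(imp.GetName(guid));
    }
}

// Hypothetical Decorator with asynchronous behaviour.
public sealed class LoggingNameMap : INameMap
{
    private readonly INameMap inner;
    private readonly Func<string, Task> log;

    public LoggingNameMap(INameMap inner, Func<string, Task> log)
    {
        this.inner = inner;
        this.log = log;
    }

    public async Task<string> GetName(Guid guid)
    {
        var name = await inner.GetName(guid);
        await log(guid + " -> " + name);
        return name;
    }
}
```

Composition then happens in the Application Model, for example new LoggingNameMap(new NameMapAdapter(nameMap), log), while the NameMap class itself stays synchronous and unaware of the interface.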
But, you say, that's too much trouble! I don't want to have to maintain two classes that are almost identical.
I understand the concern, and it may even be appropriate. Maybe you're right. As usual, I don't really intend this article to be prescriptive. Rather, I'm offering ideas for your consideration, and you can choose to adopt them or ignore them as it best fits your context.
When it comes to whether or not an Adapter is an unwarranted extra complication, I'll return to that topic later in this article.
Application Model #
The final architectural option is when the concrete NameMap class is part of the Application Model, where you'd also define the application-specific INameMap interface. In that case, we must assume that the NameMap class implements some application-specific concern. If you want it to implement an interface so that you can wrap it in a Decorator, then do that. This means that the GetName method must conform to the interface, and if that means that it must be asynchronous, then so be it.
As Kent Beck wrote in a Facebook article that used to be accessible without a Facebook account (but isn't any longer):
"Things that change at the same rate belong together. Things that change at different rates belong apart."
If the concrete NameMap class and the INameMap interface are both part of the application model, it's not unreasonable to guess that they may change together. (Do be mindful of Shotgun Surgery, though. If you expect the interface and the concrete class to frequently change, then perhaps another design might be more appropriate.)
Easier Adapters #
Before concluding this article, let's revisit the topic of introducing an Adapter for the sole purpose of 'architectural purity'. Should you really go to such lengths only to 'do it right'? You decide, but
You can only be pragmatic if you know how to be dogmatic.
I'm presenting a dogmatic solution for your consideration, so that you know what it might look like. Would I follow my own 'dogmatic' advice? Yes, I usually do, but then, I wouldn't log the return value of a pure function, so I wouldn't introduce an interface for that purpose, at least. To be fair to Fandermill, he or she also wrote: "or load additional mapping from a file", which could be an appropriate motivation for introducing an interface. I'd probably go with an Adapter in that case.
Whether or not an Adapter is an unwarranted complication depends, however, on language specifics. In high-ceremony languages like C#, Java, or C++, adding an Adapter involves at least one new file, and dozens of lines of code.
Consider, on the other hand, a low-ceremony language like Haskell. The corresponding getName function might close over a statically defined map and have the type getName :: UUID -> String.
How do you adapt such a pure function to an API that returns IO (which is roughly comparable to task-based programming)? Trivially:
getNameM :: Monad m => UUID -> m String
getNameM = return . getName
For didactic purposes I have here shown the 'Adapter' as an explicit function, but in idiomatic Haskell I'd consider this below the Fairbairn threshold; I'd usually just inline the composition return . getName if I needed to adapt the getName function to the Kleisli category.
You can do the same in F#, where the composition would be getName >> Task.FromResult. F# compositions usually go in the (for Westerners) intuitive left-to-right direction, whereas Haskell compositions follow the mathematical right-to-left convention.
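For comparison, here's what a similar composition might look like in C#, under the assumption that client code depends on a delegate rather than on a named interface. The ToAsync extension method is a hypothetical helper written for this sketch; it's not part of the base class library.

```csharp
using System;
using System.Threading.Tasks;

public static class FuncAdapter
{
    // Hypothetical helper: lifts a synchronous function into a
    // Task-returning one by wrapping each result in a completed Task.
    public static Func<T, Task<TResult>> ToAsync<T, TResult>(
        this Func<T, TResult> f)
    {
        return x => Task.FromResult(f(x));
    }
}
```

Given a Func<Guid, string> named getName, the adapted function would be getName.ToAsync(). This only helps when the dependency is modelled as a delegate; once a named interface like INameMap is involved, C# still requires a class to implement it.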
The point, however, is that there's nothing conceptually complicated about an Adapter. Unfortunately, some languages require substantial ceremony to implement one.
Conclusion #
Should an API return a Task-based (asynchronous) value 'just in case'? In general: No.
You can't predict all possible use cases, so don't make an API more complicated than it has to be. If you need to implement an application-specific interface, use the Adapter design pattern.
A possible exception to this rule is if the entire API (the concrete implementation and the interface) only exists to support a specific application. If the interface and its concrete implementation are both part of the Application Model, you may as well skip the Adapter step and consider the concrete implementation as its own Adapter.
Comments
In this version of the data architecture, let's suppose that the controller that now accepts a Domain Object directly is part of a larger REST API. How would you handle discoverability of the API, as the usual OpenAPI (Swagger et al.) tools probably take offence at this type of request object?
Jes, thank you for writing. If by discoverability you mean 'documentation', I would handle that the same way I usually handle documentation requirements for REST APIs: by writing one or more documents that explain how the API works. If there are other possible uses of OpenAPI than that, and the GUI to perform ad-hoc experiments, I'm going to need to be taken to task, because then I'm not aware of them.
I've recently discussed my general misgivings about OpenAPI, and they apply here as well. I'm aware that other people feel differently about this, and that's okay too.
You may be right, but I haven't tried, so I don't know if this is the case.