FSZipper in C# by Mark Seemann
Another functional model of a file system, with code examples in C#.
This article is part of a series about Zippers. In this one, I port the FSZipper
data structure from the Learn You a Haskell for Great Good! article Zippers.
A word of warning: I'm assuming that you're familiar with the contents of that article, so I'll skip the pedagogical explanations; I can hardly do it better that it's done there. Additionally, I'll make heavy use of certain standard constructs to port Haskell code, most notably Church encoding to model sum types in languages that don't natively have them. Such as C#. In some cases, I'll implement the Church encoding using the data structure's catamorphism. Since the cyclomatic complexity of the resulting code is quite low, you may be able to follow what's going on even if you don't know what Church encoding or catamorphisms are, but if you want to understand the background and motivation for that style of programming, you can consult the cited resources.
The code shown in this article is available on GitHub.
File system item initialization and structure #
If you haven't already noticed, Haskell (and other statically typed functional programming languages like F#) makes heavy use of sum types, and the FSZipper
example is no exception. It starts with a one-liner to define a file system item, which may be either a file or a folder. In C# we must instead use a class:
public sealed class FSItem
Contrary to the two previous examples, the FSItem
class has no generic type parameter. This is because I'm following the Haskell example code as closely as possible, but as I've previously shown, you can model a file hierarchy with a general-purpose rose tree.
Staying consistent with the two previous articles, I'll use Church encoding to model a sum type, and as discussed in the previous article I use a private
implementation for that.
private readonly IFSItem imp; private FSItem(IFSItem imp) { this.imp = imp; } public static FSItem CreateFile(string name, string data) { return new(new File(name, data)); } public static FSItem CreateFolder(string name, IReadOnlyCollection<FSItem> items) { return new(new Folder(name, items)); }
Two static
creation methods enable client developers to create a single FSItem
object, or an entire tree, like the example from the Haskell code, here ported to C#:
private static readonly FSItem myDisk = FSItem.CreateFolder("root", [ FSItem.CreateFile("goat_yelling_like_man.wmv", "baaaaaa"), FSItem.CreateFile("pope_time.avi", "god bless"), FSItem.CreateFolder("pics", [ FSItem.CreateFile("ape_throwing_up.jpg", "bleargh"), FSItem.CreateFile("watermelon_smash.gif", "smash!!"), FSItem.CreateFile("skull_man(scary).bmp", "Yikes!") ]), FSItem.CreateFile("dijon_poupon.doc", "best mustard"), FSItem.CreateFolder("programs", [ FSItem.CreateFile("fartwizard.exe", "10gotofart"), FSItem.CreateFile("owl_bandit.dmg", "mov eax, h00t"), FSItem.CreateFile("not_a_virus.exe", "really not a virus"), FSItem.CreateFolder("source code", [ FSItem.CreateFile("best_hs_prog.hs", "main = print (fix error)"), FSItem.CreateFile("random.hs", "main = print 4") ]) ]) ]);
Since the imp
class field is just a private
implementation detail, a client developer needs a way to query an FSItem
object about its contents.
File system item catamorphism #
Just like the previous article, I'll start with the catamorphism. This is essentially the rose tree catamorphism, just less generic, since FSItem
doesn't have a generic type parameter.
public TResult Aggregate<TResult>( Func<string, string, TResult> whenFile, Func<string, IReadOnlyCollection<TResult>, TResult> whenFolder) { return imp.Aggregate(whenFile, whenFolder); }
The Aggregate
method delegates to its internal implementation class field, which is defined as the private
nested interface IFSItem
:
private interface IFSItem { TResult Aggregate<TResult>( Func<string, string, TResult> whenFile, Func<string, IReadOnlyCollection<TResult>, TResult> whenFolder); }
As discussed in the previous article, the interface is hidden away because it's only a vehicle for polymorphism. It's not intended for client developers to be used (although that would be benign) or implemented (which could break encapsulation). There are only, and should ever only be, two implementations. The one that represents a file is the simplest:
private sealed record File(string Name, string Data) : IFSItem { public TResult Aggregate<TResult>( Func<string, string, TResult> whenFile, Func<string, IReadOnlyCollection<TResult>, TResult> whenFolder) { return whenFile(Name, Data); } }
The File
record's Aggregate
method unconditionally calls the supplied whenFile
function argument with the Name
and Data
that was originally supplied via its constructor.
The Folder
implementation is a bit trickier, mostly due to its recursive nature, but also because I wanted it to have structural equality.
private sealed class Folder : IFSItem { private readonly string name; private readonly IReadOnlyCollection<FSItem> items; public Folder(string Name, IReadOnlyCollection<FSItem> Items) { name = Name; items = Items; } public TResult Aggregate<TResult>( Func<string, string, TResult> whenFile, Func<string, IReadOnlyCollection<TResult>, TResult> whenFolder) { return whenFolder( name, items.Select(i => i.Aggregate(whenFile, whenFolder)).ToList()); } public override bool Equals(object? obj) { return obj is Folder folder && name == folder.name && items.SequenceEqual(folder.items); } public override int GetHashCode() { return HashCode.Combine(name, items); } }
It, too, unconditionally calls one of the two functions passed to its Aggregate
method, but this time whenFolder
. It does that, however, by first recursively calling Aggregate
within a Select
expression. It needs to do that because the whenFolder
function expects the subtree to have been already converted to values of the TResult
return type. This is a common pattern with catamorphisms, and takes a bit of time getting used to. You can see similar examples in the articles Tree catamorphism, Rose tree catamorphism, Full binary tree catamorphism, as well as the previous one in this series.
I also had to make Folder
a class
rather than a record, because I wanted the type to have structural equality, and you can't override Equals on records (and if the base class library has any collection type with structural equality, I'm not aware of it).
File system item Church encoding #
True to the structure of the previous article, the catamorphism doesn't look quite like a Church encoding, but it's possible to define the latter from the former.
public TResult Match<TResult>( Func<string, string, TResult> whenFile, Func<string, IReadOnlyCollection<FSItem>, TResult> whenFolder) { return Aggregate( whenFile: (name, data) => (item: CreateFile(name, data), result: whenFile(name, data)), whenFolder: (name, pairs) => { var items = pairs.Select(i => i.item).ToList(); return (CreateFolder(name, items), whenFolder(name, items)); }).result; }
The trick is the same as in the previous article: Build up an intermediate tuple that contains both the current item
as well as the result
being accumulated. Once the Aggregate
method returns, the Match
method returns only the result
part of the resulting tuple.
I implemented the whenFolder
expression as a code block, because both tuple elements needed the items
collection. You can inline the Select
expression, but that would cause it to run twice. That's probably a premature optimization, but it also made the code a bit shorter, and, one may hope, a bit more readable.
Fily system breadcrumb #
Finally, things seem to be becoming a little easier. The port of FSCrumb
is straightforward.
public sealed class FSCrumb { public FSCrumb( string name, IReadOnlyCollection<FSItem> left, IReadOnlyCollection<FSItem> right) { Name = name; Left = left; Right = right; } public string Name { get; } public IReadOnlyCollection<FSItem> Left { get; } public IReadOnlyCollection<FSItem> Right { get; } public override bool Equals(object? obj) { return obj is FSCrumb crumb && Name == crumb.Name && Left.SequenceEqual(crumb.Left) && Right.SequenceEqual(crumb.Right); } public override int GetHashCode() { return HashCode.Combine(Name, Left, Right); } }
The only reason this isn't a record
is, once again, that I want to override Equals
so that the type can have structural equality. Visual Studio wants me to convert to a primary constructor. That would simplify the code a bit, but actually not that much.
(I'm still somewhat conservative in my choice of new C# language features. Not that I have anything against primary constructors which, after all, F# has had forever. The reason I'm holding back is for didactic reasons. Not every reader is on the latest language version, and some readers may be using another programming language entirely. On the other hand, primary constructors seem natural and intuitive, so I may start using them here on the blog as well. I don't think that they're going to be much of a barrier to understanding.)
Now that we have both the data type we want to zip, as well as the breadcrumb type we need, we can proceed to add the Zipper.
File system Zipper #
The FSZipper
C# class fills the position of the eponymous Haskell type alias. Data structure and initialization is straightforward.
public sealed class FSZipper { private FSZipper(FSItem fSItem, IReadOnlyCollection<FSCrumb> breadcrumbs) { FSItem = fSItem; Breadcrumbs = breadcrumbs; } public FSZipper(FSItem fSItem) : this(fSItem, []) { } public FSItem FSItem { get; } public IReadOnlyCollection<FSCrumb> Breadcrumbs { get; } // Methods follow here...
True to the style I've already established, I've made the master constructor private
in order to highlight that the Breadcrumbs
are the responsibility of the FSZipper
class itself. It's not something client code need worry about.
Going down #
The Haskell Zippers article introduces fsUp
before fsTo
, but if we want to see some example code, we need to navigate to somewhere before we can navigate up. Thus, I'll instead start with the function that navigates to a child node.
public FSZipper? GoTo(string name) { return FSItem.Match( (_, _) => null, (folderName, items) => { FSItem? item = null; var ls = new List<FSItem>(); var rs = new List<FSItem>(); foreach (var i in items) { if (item is null && i.IsNamed(name)) item = i; else if (item is null) ls.Add(i); else rs.Add(i); } if (item is null) return null; return new FSZipper( item, Breadcrumbs.Prepend(new FSCrumb(folderName, ls, rs)).ToList()); }); }
This is by far the most complicated navigation we've seen so far, and I've even taken the liberty of writing an imperative implementation. It's not that I don't know how I could implement it in a purely functional fashion, but I've chosen this implementation for a couple of reasons. The first of which is that, frankly, it was easier this way.
This stems from the second reason: That the .NET base class library, as far as I know, offers no functionality like Haskell's break function. I could have written such a function myself, but felt that it was too much of a digression, even for me. Maybe I'll do that another day. It might make for a nice little exercise.
The third reason is that C# doesn't afford pattern matching on sequences, in the shape of destructuring the head and the tail of a list. (Not that I know of, anyway, but that language changes rapidly at the moment, and it does have some pattern-matching features now.) This means that I have to check item
for null
anyway.
In any case, while the implementation is imperative, an external caller can't tell. The GoTo
method is still referentially transparent. Which means that it fits in your head.
You may have noticed that the implementation calls IsNamed
, which is also new.
public bool IsNamed(string name) { return Match((n, _) => n == name, (n, _) => n == name); }
This is an instance method I added to FSItem
.
In summary, the GoTo
method enables client code to navigate down in the file hierarchy, as this unit test demonstrates:
[Fact] public void GoToSkullMan() { var sut = new FSZipper(myDisk); var actual = sut.GoTo("pics")?.GoTo("skull_man(scary).bmp"); Assert.NotNull(actual); Assert.Equal( FSItem.CreateFile("skull_man(scary).bmp", "Yikes!"), actual.FSItem); }
The example is elementary. First go to the pics
folder, and from there to the skull_man(scary).bmp
.
Going up #
Going back up the hierarchy isn't as complicated.
public FSZipper? GoUp() { if (Breadcrumbs.Count == 0) return null; var head = Breadcrumbs.First(); var tail = Breadcrumbs.Skip(1); return new FSZipper( FSItem.CreateFolder(head.Name, [.. head.Left, FSItem, .. head.Right]), tail.ToList()); }
If the Breadcrumbs
collection is empty, we're already at the root, in which case we can't go further up. In that case, the GoUp
method returns null
, as does the GoTo
method if it can't find an item with the desired name. This possibility is explicitly indicated by the FSZipper?
return type; notice the question mark, which indicates that the value may be null. If you're working in a context or language where that feature isn't available, you may instead consider taking advantage of the Maybe monad (which is also what you'd idiomatically do in Haskell).
If Breadcrumbs
is not empty, it means that there's a place to go up to. It also implies that the previous operation navigated down, and the only way that's possible is if the previous node was a folder. Thus, the GoUp
method knows that it needs to reconstitute a folder, and from the head
breadcrumb, it knows that folder's name, and what was originally to the Left
and Right
of the Zipper's FSItem
property.
This unit test demonstrates how client code may use the GoUp
method:
[Fact] public void GoUpFromSkullMan() { var sut = new FSZipper(myDisk); // This is the same as the GoToSkullMan test var newFocus = sut.GoTo("pics")?.GoTo("skull_man(scary).bmp"); var actual = newFocus?.GoUp()?.GoTo("watermelon_smash.gif"); Assert.NotNull(actual); Assert.Equal( FSItem.CreateFile("watermelon_smash.gif", "smash!!"), actual.FSItem); }
This test first repeats the navigation also performed by the other test, then uses GoUp
to go one level up, which finally enables it to navigate to the watermelon_smash.gif
file.
Renaming a file or folder #
A Zipper enables you to navigate a data structure, but you can also use it to modify the element in focus. One option is to rename a file or folder.
public FSZipper Rename(string newName) { return new FSZipper( FSItem.Match( (_, dat) => FSItem.CreateFile(newName, dat), (_, items) => FSItem.CreateFolder(newName, items)), Breadcrumbs); }
The Rename
method 'pattern-matches' on the 'current' FSItem
and in both cases creates a new file or folder with the new name. Since it doesn't need the old name for anything, it uses the wildcard pattern to ignore that value. This operation is always possible, so the return type is FSZipper
, without a question mark, indicating that the method never returns null
.
The following unit test replicates the Haskell article's example by renaming the pics
folder to cspi
.
[Fact] public void RenamePics() { var sut = new FSZipper(myDisk); var actual = sut.GoTo("pics")?.Rename("cspi").GoUp(); Assert.NotNull(actual); Assert.Empty(actual.Breadcrumbs); Assert.Equal( FSItem.CreateFolder("root", [ FSItem.CreateFile("goat_yelling_like_man.wmv", "baaaaaa"), FSItem.CreateFile("pope_time.avi", "god bless"), FSItem.CreateFolder("cspi", [ FSItem.CreateFile("ape_throwing_up.jpg", "bleargh"), FSItem.CreateFile("watermelon_smash.gif", "smash!!"), FSItem.CreateFile("skull_man(scary).bmp", "Yikes!") ]), FSItem.CreateFile("dijon_poupon.doc", "best mustard"), FSItem.CreateFolder("programs", [ FSItem.CreateFile("fartwizard.exe", "10gotofart"), FSItem.CreateFile("owl_bandit.dmg", "mov eax, h00t"), FSItem.CreateFile("not_a_virus.exe", "really not a virus"), FSItem.CreateFolder("source code", [ FSItem.CreateFile("best_hs_prog.hs", "main = print (fix error)"), FSItem.CreateFile("random.hs", "main = print 4") ]) ]) ]), actual.FSItem); }
Since the test uses GoUp
after Rename
, the actual
value contains the entire tree, while the Breadcrumbs
collection is empty.
Adding a new file #
Finally, we can add a new file to a folder.
public FSZipper? Add(FSItem item) { return FSItem.Match<FSZipper?>( whenFile: (_, _) => null, whenFolder: (name, items) => new FSZipper( FSItem.CreateFolder(name, items.Prepend(item).ToList()), Breadcrumbs)); }
This operation may fail, since we can't add a file to a file. This is, again, clearly indicated by the return type, which allows null
.
This implementation adds the file to the start of the folder, but it would also be possible to add it at the end. I would consider that slightly more idiomatic in C#, but here I've followed the Haskell example code, which conses the new item
to the beginning of the list. As is idiomatic in Haskell.
The following unit test reproduces the Haskell article's example.
[Fact] public void AddPic() { var sut = new FSZipper(myDisk); var actual = sut.GoTo("pics")?.Add(FSItem.CreateFile("heh.jpg", "lol"))?.GoUp(); Assert.NotNull(actual); Assert.Equal( FSItem.CreateFolder("root", [ FSItem.CreateFile("goat_yelling_like_man.wmv", "baaaaaa"), FSItem.CreateFile("pope_time.avi", "god bless"), FSItem.CreateFolder("pics", [ FSItem.CreateFile("heh.jpg", "lol"), FSItem.CreateFile("ape_throwing_up.jpg", "bleargh"), FSItem.CreateFile("watermelon_smash.gif", "smash!!"), FSItem.CreateFile("skull_man(scary).bmp", "Yikes!") ]), FSItem.CreateFile("dijon_poupon.doc", "best mustard"), FSItem.CreateFolder("programs", [ FSItem.CreateFile("fartwizard.exe", "10gotofart"), FSItem.CreateFile("owl_bandit.dmg", "mov eax, h00t"), FSItem.CreateFile("not_a_virus.exe", "really not a virus"), FSItem.CreateFolder("source code", [ FSItem.CreateFile("best_hs_prog.hs", "main = print (fix error)"), FSItem.CreateFile("random.hs", "main = print 4") ]) ]) ]), actual.FSItem); Assert.Empty(actual.Breadcrumbs); }
This example also follows the edit with a GoUp
call, with the effect that the Zipper is once more focused on the entire tree. The assertion verifies that the new heh.jpg
file is the first file in the pics
folder.
Conclusion #
The code for FSZipper
is actually a bit simpler than for the binary tree. This, I think, is mostly attributable to the FSZipper
having fewer constituent sum types. While sum types are trivial, and extraordinarily useful in languages that natively support them, they require a lot of boilerplate in a language like C#.
Do you need something like FSZipper
in C#? Probably not. As I've already discussed, this article series mostly exists as a programming exercise.