Hello, pure command-line interaction by Mark Seemann
A gentle introduction to modelling impure interactions with pure code.
Dependency Injection is a well-described concept in object-oriented programming, but as I've explained earlier, its not functional, because it makes everything impure. In general, you should reject the notion of dependencies by instead designing your application on the concept of an impure/pure/impure sandwich. This is possible more often than you'd think, but there's still a large group of applications where this will not work. If your application needs to interact with the impure world for an extended time, you need a way to model such interactions in a pure way.
This article introduces a way to do that.
Command line API #
Imagine that you have to write a command-line program that can ask a series of questions and print appropriate responses. In the general case, this is a (potentially) long-running series of interactions between the user and the program. To keep it simple, though, in this article we'll start by looking at a degenerate example:
Please enter your name. Mark Hello, Mark!
The program is simply going to request that you enter your name. Once you've done that, it prints the greeting. In object-oriented programming, using Dependency Injection, you might introduce an interface. Keeping it simple, you can restrict such an interface to two methods:
public interface ICommandLine { string ReadLine(); void WriteLine(string text); }
Please note that this is clearly a toy example. In later articles, you'll see how to expand the example to cover some more complex interactions, but you could also read a more realistic example already. Initially, the example is degenerate, because there's only a single interaction. In this case, an impure/pure/impure sandwich is still possible, but such a design wouldn't scale to more complex interactions.
The problem with defining and injecting an interface is that it isn't functional. What's the functional equivalent, then?
Instruction set #
Instead of defining an interface, you can define a discriminated union that describes a limited instruction set for command-line interactions:
type CommandLineInstruction<'a> = | ReadLine of (string -> 'a) | WriteLine of string * 'a
You may notice that it looks a bit like the above C# interface. Instead of defining two methods, it defines two cases, but the names are similar.
The ReadLine
case is an instruction that an interpreter can evaluate. The data contained in the case is a continuation function. After evaluating the instruction, an interpreter must invoke this function with a string. It's up to the interpreter to figure out which string to use, but it could, for example, come from reading an input string from the command line. The continuation function is the next step in whatever program you're writing.
The WriteLine
case is another instruction for interpreters. The data contained in this case is a tuple. The first element of the tuple is input for the interpreter, which can choose to e.g. print the value on the command line, or ignore it, and so on. The second element of the tuple is a value used to continue whatever program this case is a part of.
This enables you to write a small, specialised Abstract Syntax Tree (AST), but there's currently no way to return from it. One way to do that is to add a third 'stop' case. If you're interested in that option, Scott Wlaschin covers this as one iteration in his excellent explanation of the AST design.
Instead of adding a third 'stop' case to CommandLineInstruction<'a>
, another option is to add a new wrapper type around it:
type CommandLineProgram<'a> = | Free of CommandLineInstruction<CommandLineProgram<'a>> | Pure of 'a
The Free
case contains a CommandLineInstruction
that always continues to a new CommandLineProgram
value. The only way you can escape the AST is via the Pure
case, which simply contains the 'return' value.
Abstract Syntax Trees #
With these two types you can write specialised programs that contain instructions for an interpreter. Notice that the types are pure by intent, although in F# we can't really tell. You can, however, repeat this exercise in Haskell, where the instruction set looks like this:
data CommandLineInstruction next = ReadLine (String -> next) | WriteLine String next deriving (Functor) type CommandLineProgram = Free CommandLineInstruction
Both of these types are pure, because IO
is nowhere in sight. In Haskell, functions are pure by default. This also applies to the String -> next
function contained in the ReadLine
case.
Back in F# land, you can write an AST that implements the command-line interaction shown in the beginning of the article:
// CommandLineProgram<unit> let program = Free (WriteLine ( "Please enter your name.", Free (ReadLine ( fun s -> Free (WriteLine ( sprintf "Hello, %s!" s, Pure ()))))))
This AST defines a little program. The first step is a WriteLine
instruction with the input value "Please enter your name."
. The WriteLine
case constructor takes a tuple as input argument. The first tuple element is that prompt, and the second element is the continuation, which has to be a new CommandLineInstruction<CommandLineProgram<'a>>
value.
In this example, the continuation value is a ReadLine
case, which takes a continuation function as input. This function should return a new program value, which it does by returning a WriteLine
.
This second WriteLine
value creates a string
from the outer value s
. The second tuple element for the WriteLine
case must, again, be a new program value, but now the program is done, so you can use the 'stop' value Pure ()
.
You probably think that I should quit the mushrooms. No one in their right mind will want to write code like this. Neither would I. Fortunately, you can make the coding experience much better, but you'll see how to do that later.
Interpretation #
The above program
value is a small CommandLineProgram<unit>
. It's a pure value. In itself, it doesn't do anything.
Clearly, we'd like it to do something. In order to make that happen, you can write an interpreter:
// CommandLineProgram<'a> -> 'a let rec interpret = function | Pure x -> x | Free (ReadLine next) -> Console.ReadLine () |> next |> interpret | Free (WriteLine (s, next)) -> Console.WriteLine s next |> interpret
This interpreter is a recursive function that pattern-matches all the cases in any CommandLineProgram<'a>
. When it encounters a Pure
case, it simply returns the contained value.
When it encounters a ReadLine
value, it calls Console.ReadLine ()
, which returns a string
value read from the command line. It then pipes that input value to its next
continuation function, which produces a new CommandLineInstruction<CommandLineProgram<'a>>
value. Finally, it pipes that continuation value recursively to itself.
A similar treatment is given to the WriteLine
case. Console.WriteLine s
writes s
to the command line, where after next
is recursively piped to interpret
.
When you run interpret program
, you get an interaction like this:
Please enter your name. ploeh Hello, ploeh!
The program
is pure; the interpret
function is impure.
Syntactic sugar #
Clearly, you don't want to write programs as ASTs like the above. Fortunately, you don't have to. You can add syntactic sugar in the form of computation expressions. The way to do that is to turn your AST types into a monad. In Haskell, you'd already be done, because Free
is a monad. In F#, some code is required.
Source functor #
The first step is to define a map function for the underlying instruction set union type. Conceptually, when you can define a map function for a type, you've created a functor (if it obeys the functor laws, that is). Functors are common, so it often pays off being aware of them.
// ('a -> 'b) -> CommandLineInstruction<'a> -> CommandLineInstruction<'b> let private mapI f = function | ReadLine next -> ReadLine (next >> f) | WriteLine (x, next) -> WriteLine (x, next |> f)
The mapI
function takes a CommandLineInstruction<'a>
value and maps it to a new value by mapping the 'underlying value'. I decided to make the function private
because later, I'm also going to define a map function for CommandLineProgram<'a>
, and I don't want to confuse users of the API with two different map functions. This is also the reason that the name of the function isn't simply map
, but rather mapI
, where the I stands for instruction.
mapI
pattern-matches on the (implicit) input argument. If it's a ReadLine
case, it returns a new ReadLine
value, but it uses the mapper function f
to translate the next
function. Recall that next
is a function of the type string -> 'a
. When you compose it with f
(which is a function of the type 'a -> 'b
), you get (string -> 'a) >> ('a -> 'b)
, or string -> 'b
. You've transformed the 'a
to a 'b
for the ReadLine
case. If you can do the same for the WriteLine
case, you'll have a functor.
Fortunately, the WriteLine
case is similar, although a small tweak is required. This case contains a tuple of data. The first element (x
) isn't a generic type (it's a string
), so there's nothing to map. You can use it as-is in the new WriteLine
value that you'll return. The WriteLine
case is degenerate because next
isn't a function, but rather a value. It has a type of 'a
, and f
is a function of the type 'a -> 'b
, so piping next
to f
returns a 'b
.
That's it: now you have a functor.
(In order to keep the category theorists happy, I should point out that such functors are really a sub-type of functors called endo-functors. Additionally, functors must obey some simple and intuitive laws in order to be functors, but that's all I'll say about that here.)
Free monad #
There's a reason I spend so much time talking about functors. The goal is syntactic sugar. You can get that with computation expressions. In order to create a computation expression builder, you need a monad.
You need a recipe for creating a monad. Fortunately, there's a type of monad called a free monad. It has the virtue that it enables you to create a monad from any functor.
Just what you need.
In Haskell, this happens automatically when you declare type CommandLineProgram = Free CommandLineInstruction
. Thanks to Haskell's type system, Free
is automatically a Monad
when the underlying type is a Functor
. In F#, you have to work for your monads, but the fact that Haskell can automate this means that there's a recipe that you can follow.
Earlier in this article, I mentioned in passing that there are alternative ways in which you can define a 'stop' case for your instruction set. The reason I chose to separate the API into two types (an 'instruction set', and a 'program') is that the instruction set is the underlying functor. The 'program' is part of the free monad. The other part is a bind
function (that obeys the monad laws).
// ('a -> CommandLineProgram<'b>) -> CommandLineProgram<'a> // -> CommandLineProgram<'b> let rec bind f = function | Free instruction -> instruction |> mapI (bind f) |> Free | Pure x -> f x
This recursive function pattern-matches on the (implicit) CommandLineProgram<'a>
argument. In the Pure
case, the 'return' value x
has the type 'a
, which fits as input for the f
function. The result is a value of type CommandLineProgram<'b>
.
In the Free
case, the instruction
is a functor with the map function mapI
. The first argument to the mapI
function must be a function with the type 'a -> 'b
. How can you compose such a function?
If you partially apply the recursive bind
function with f
(that is: bind f
), you get a function of the type CommandLineProgram<'a> -> CommandLineProgram<'b>
. This fits with mapI
, because instruction
has the type CommandLineInstruction<CommandLineProgram<'a>>
(refer back to the definition of the Free
case if need to convince yourself of that). The result of calling mapI
with instruction
is a CommandLineInstruction<CommandLineProgram<'b>>
value. In order to turn it into a CommandLineProgram<'b>
value, you wrap it in a new Free
case.
Although this required a bit of explanation, defining a bind
function for a free monad is a repeatable process. After all, in Haskell it's automated. In F#, you have to explicitly write the code, but it follows a recipe. Once you get the hang of it, there's not much to it.
Functor #
You'll occasionally need to explicitly use the bind
function, but often it'll 'disappear' into a computation expression. There are other building blocks to an API than a bind
function, though. You may, for example, need a map
function:
// ('a -> 'b) -> CommandLineProgram<'a> -> CommandLineProgram<'b> let map f = bind (f >> Pure)
This makes CommandLineProgram<'a>
a functor as well. This is the reason I made mapI
private, because mapI
makes the instruction set a functor, but the API is expressed in terms of AST programs, and it should be consistent. Within the same module, map
should work on the same data type as bind
.
Notice that map
can be defined as a composition of bind
and Pure
. This is part of the recipe. For a free monad, the map
function always looks like that. The f
function is a function with the type 'a -> 'b
, and Pure
is a case constructor with the type 'b -> CommandLineProgram<'b>
. Notice that I've used 'b
for the generic type argument instead of the usual 'a
. Hopefully, this makes it clear that when you compose these two functions together (f >> Pure
), you get a function of the type ('a -> 'b) >> ('b -> CommandLineProgram<'b>)
, or 'a -> CommandLineProgram<'b>
. That's just the type of function needed for the bind
function, so the whole composition turns out to type-check and work as intended.
API #
In order to work with an API, you need the ability to create values of the API's type(s). In this case, you must be able to create CommandLineProgram<'a>
values. While you can create them explicitly using the ReadLine
, WriteLine
, Free
, and Pure
case constructors, it'll be more convenient if you have some predefined functions and values for that.
// CommandLineProgram<string> let readLine = Free (ReadLine Pure) // string -> CommandLineProgram<unit> let writeLine s = Free (WriteLine (s, Pure ()))
In the ReadLine
case, there's no input to the instruction, so you can define readLine
as a predefined CommandLineProgram<string>
value.
The WriteLine
case, on the other hand, takes as an input argument a string to write, so you can define writeLine
as a function that returns a CommandLineProgram<unit>
value.
Computation expression #
The addition of map
and supporting API is, to be honest, a bit of digression. You're going to use these functions later, but they aren't required in order to create a computation expression builder. All you need is a bind
function and a way to lift a raw value into the monad. All of these are in place, so the builder is a matter of delegation:
type CommandLineBuilder () = member this.Bind (x, f) = CommandLine.bind f x member this.Return x = Pure x member this.ReturnFrom x = x member this.Zero () = Pure ()
This is a fairly minimal builder, but in my experience, most of times, this is all you need.
Create an instance of the CommandLineBuilder
class, and you can write computation expressions:
let commandLine = CommandLineBuilder ()
I usually put such an object in a module with an [<AutoOpen>]
attribute, so that it's available as a global object.
Producing ASTs with pretty code #
Using the commandLine
computation expression is like using the built-in async
or seq
expressions. You can use it to rewrite the above AST as readable code:
// CommandLineProgram<unit> let program = commandLine { do! CommandLine.writeLine "Please enter your name." let! name = CommandLine.readLine do! sprintf "Hello, %s!" name |> CommandLine.writeLine }
This produces the same AST as before, but with much more readable syntax. The AST is the same, and you can use the above interpret
function to run it. The interaction is the same as before:
Please enter your name. Free Hello, Free!
This is, obviously, a toy example, but in coming articles, you'll see how to gradually enhance the code to perform some more complex interactions.
Summary #
Functional programming emphasises pure functions, and a separation of pure and impure code. The simplest way to achieve such a separation is to design your code as an impure/pure/impure sandwich, but sometimes this isn't possible. When it's not possible, an alternative is to define an instruction set for an AST, and turn it into a free monad in order to enable enough syntactic sugar to keep the code readable.
While this may seem complicated, it has the benefit of making impurities explicit in the code. Whenever you see a CommandLineProgram
value, you know that, at run-time, something impure is likely to happen. It's not uncontrolled impurity, though. Inside a CommandLineProgram
, only reading from, and writing to, the command line will happen. It's not going to generate random values, change global variables, send an email, or any other unpredictable operation - that is, unless the interpreter does that...
Next: A pure command-line wizard.