In functional programming, the notion of dependencies must be rejected. Instead, applications should be composed from pure and impure functions.

This is the third article in a small article series called "From dependency injection to dependency rejection". In the previous article in the series, you learned that dependency injection can't be functional, because it makes everything impure. In this article, you'll see what to do instead.

Indirect input and output #

One of the first concepts you learned when you learned to program was that units of operation (functions, methods, procedures) take input and produce output. Input is in the form of input parameters, and output is in the form of return values. (Sometimes, though, a method returns nothing, but we know from category theory that nothing is also a value (called unit).)

In addition to such input and output, a unit with dependencies also takes indirect input, and produces indirect output:

A unit with dependencies and direct and indirect input and output.

When a unit queries a dependency for data, the data returned from the dependency is indirect input. In the restaurant reservation example used in this article series, when tryAccept calls readReservations, the returned reservations are indirect input.

Likewise, when a unit invokes a dependency, all arguments passed to that dependency constitute indirect output. In the example, when tryAccept calls createReservation, the reservation value it uses as input argument to that function call becomes output. The intent, in this case, is to save the reservation in a database.
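
To make the two directions concrete, here's a minimal sketch that runs the dependency-injection style tryAccept from the previous article against in-memory stand-ins. The trimmed-down Reservation record, the existing list, and the saved collection are assumptions made for illustration; they aren't part of the sample code base.

```fsharp
open System

// Assumed, trimmed-down record; only the fields used in this article.
type Reservation =
    { Date : DateTimeOffset
      Quantity : int
      IsAccepted : bool }

// The dependency-injection style tryAccept from the previous article.
let tryAccept capacity readReservations createReservation reservation =
    let reservedSeats =
        readReservations reservation.Date |> List.sumBy (fun x -> x.Quantity)
    if reservedSeats + reservation.Quantity <= capacity
    then createReservation { reservation with IsAccepted = true } |> Some
    else None

// Hypothetical in-memory stand-ins for the database.
let date = DateTimeOffset(2017, 2, 3, 19, 0, 0, TimeSpan.Zero)
let existing = [ { Date = date; Quantity = 4; IsAccepted = true } ]
let saved = ResizeArray<Reservation> ()

// Indirect input: the list returned here flows into tryAccept.
let readReservations (_ : DateTimeOffset) = existing

// Indirect output: the argument flows out of tryAccept. The returned ID is
// additional indirect input.
let createReservation r =
    saved.Add r
    saved.Count

let request = { Date = date; Quantity = 3; IsAccepted = false }
let actual = tryAccept 10 readReservations createReservation request
```

After the call, the saved collection holds the accepted reservation, even though nothing in the return type of tryAccept reveals that a write took place.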

From indirect output to direct output #

Instead of producing indirect output, you can refactor functions to produce direct output.

A unit with dependencies and direct input and output, but no indirect output.

Such a refactoring is often problematic in mainstream object-oriented languages like C# and Java, because you wish to control the circumstances in which the indirect output must be produced. Indirect output often implies side-effects, but perhaps the side-effect must only happen when certain conditions are fulfilled. In the restaurant reservation example, the desired side-effect is to add a reservation to a database, but this must only happen when the restaurant has sufficient remaining capacity to serve the requested number of people. Since languages like C# and Java are statement-based, it can be difficult to separate the decision from the action.

In expression-based languages like F# and Haskell, it's trivial to decouple decisions from effects.

In the previous article, you saw a version of tryAccept with this signature:

// int -> (DateTimeOffset -> Reservation list) -> (Reservation -> int) -> Reservation
// -> int option

The second function argument, with the type Reservation -> int, produces indirect output. The Reservation value is the output. The function even violates Command Query Separation and returns the database ID of the added reservation, so that's additional indirect input. The overall function returns int option: the database ID if the reservation was added, and None if it wasn't.

Refactoring the indirect output to direct output is easy, then: just remove the createReservation function and return the Reservation value instead:

// int -> (DateTimeOffset -> Reservation list) -> Reservation -> Reservation option
let tryAccept capacity readReservations reservation =
    let reservedSeats =
        readReservations reservation.Date |> List.sumBy (fun x -> x.Quantity)
    if reservedSeats + reservation.Quantity <= capacity
    then { reservation with IsAccepted = true } |> Some
    else None

Notice that this refactored version of tryAccept returns a Reservation option value. The implication is that the reservation was accepted if the return value is a Some case, and rejected if the value is None. The decision is embedded in the value, but decoupled from the side-effect of writing to the database.

This function clearly never writes to the database, so at the boundary of your application, you'll have to connect the decision to the effect. To keep the example consistent with the previous article, you can do this in a tryAcceptComposition function, like this:

// Reservation -> int option
let tryAcceptComposition reservation =
    reservation
    |> tryAccept 10 (DB.readReservations connectionString)
    |> Option.map (DB.createReservation connectionString)

Notice that the type of tryAcceptComposition remains Reservation -> int option. This is a true refactoring. The overall API remains the same, as does the behaviour. The reservation is added to the database only if there's sufficient remaining capacity, and in that case, the ID of the reservation is returned.

From indirect input to direct input #

Just as you can refactor from indirect output to direct output, you can refactor from indirect input to direct input.

A unit with dependencies and direct input and output.

Again, in statement-based languages like C# and Java, this may be problematic, because you may wish to defer a query, or base it on a decision inside the unit. In expression-based languages you can decouple decisions from effects, and deferred execution can always be done by lazy evaluation, if that's required. In the case of the current example, however, the refactoring is easy:

// int -> Reservation list -> Reservation -> Reservation option
let tryAccept capacity reservations reservation =
    let reservedSeats = reservations |> List.sumBy (fun x -> x.Quantity)
    if reservedSeats + reservation.Quantity <= capacity
    then { reservation with IsAccepted = true } |> Some
    else None

Instead of calling a (potentially impure) function, this version of tryAccept takes a list of existing reservations as input. It still sums over all the quantities, and the rest of the code is the same as before.

Obviously, the list of existing reservations must come from somewhere, like a database, so tryAcceptComposition will still have to take care of that:

// ('a -> 'b -> 'c) -> 'b -> 'a -> 'c
let flip f x y = f y x
 
// Reservation -> int option
let tryAcceptComposition reservation =
    reservation.Date
    |> DB.readReservations connectionString
    |> flip (tryAccept 10) reservation
    |> Option.map (DB.createReservation connectionString)

The type and behaviour of this composition is still the same as before, but the data flow is different. First, the function queries the database, which is an impure operation. Then, it pipes the resulting list of reservations to tryAccept, which is now a pure function. It returns a Reservation option that's finally mapped to another impure operation, which writes the reservation to the database if the reservation was accepted.

You'll notice that I also added a flip function in order to make the composition more concise, but I could also have used a lambda expression when invoking tryAccept. The flip function is a part of Haskell's standard library, but isn't in F#'s core library. It's not crucial to the example, though.
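
For comparison, here's a sketch of the lambda-expression alternative. In order to keep it self-contained, it uses a trimmed-down Reservation record and a hypothetical in-memory DB module; both are assumptions standing in for the real types and database access code.

```fsharp
open System

// Assumed, trimmed-down record; only the fields used in this article.
type Reservation =
    { Date : DateTimeOffset
      Quantity : int
      IsAccepted : bool }

// The pure tryAccept function from above.
let tryAccept capacity reservations reservation =
    let reservedSeats = reservations |> List.sumBy (fun x -> x.Quantity)
    if reservedSeats + reservation.Quantity <= capacity
    then { reservation with IsAccepted = true } |> Some
    else None

// Hypothetical in-memory stand-in for the DB module; not real database code.
module DB =
    let private store = ResizeArray<Reservation> ()

    let readReservations (_connectionString : string) (date : DateTimeOffset) =
        store |> Seq.filter (fun r -> r.Date.Date = date.Date) |> List.ofSeq

    let createReservation (_connectionString : string) (reservation : Reservation) =
        store.Add reservation
        store.Count // Plays the role of the database ID.

let connectionString = "not a real connection string"

// Reservation -> int option
let tryAcceptComposition reservation =
    reservation.Date
    |> DB.readReservations connectionString
    |> (fun reservations -> tryAccept 10 reservations reservation)
    |> Option.map (DB.createReservation connectionString)
```

The lambda expression makes the argument order explicit at the cost of a little more ceremony. Which version reads better is largely a matter of taste.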

Evaluation #

Did you notice that in the previous diagram, above, all arrows between the unit and its dependencies were gone? This means that the unit no longer has any dependencies:

A unit with direct input and output, but no dependencies.

Dependencies are, by their nature, impure, and since pure functions can't call impure functions, functional programming must reject the notion of dependencies. Pure functions can't depend on impure functions.

Instead, pure functions must take direct input and produce direct output, and the impure boundary of an application must compose impure and pure functions together in order to achieve the desired behaviour.

In the previous article, you saw how Haskell can be used to evaluate whether or not an implementation is functional. You can port the above F# code to Haskell to verify that this is the case.

tryAccept :: Int -> [Reservation] -> Reservation -> Maybe Reservation
tryAccept capacity reservations reservation =
  let reservedSeats = sum $ map quantity reservations
  in  if reservedSeats + quantity reservation <= capacity
      then Just $ reservation { isAccepted = True }
      else Nothing

This version of tryAccept is pure, and compiles, but as you learned in the previous article, that's not the crucial question. The question is whether the composition compiles?

tryAcceptComposition :: Reservation -> IO (Maybe Int)
tryAcceptComposition reservation = runMaybeT $
  liftIO (DB.readReservations connectionString $ date reservation)
  >>= MaybeT . return . flip (tryAccept 10) reservation
  >>= liftIO . DB.createReservation connectionString

This version of tryAcceptComposition compiles, and works as desired. The code exhibits a common pattern for Haskell: First, gather data from impure sources. Second, pass pure data to pure functions. Third, take the pure output from the pure functions, and do something impure with it.

It's like a sandwich, with the best parts in the middle, and some necessary stuff surrounding it.

Summary #

Dependencies are, by nature, impure. They're either non-deterministic, have side-effects, or both. Pure functions can't call impure functions (because that would make them impure as well), so pure functions can't have dependencies. Functional programming must reject the notion of dependencies.

Obviously, software is only useful with impure behaviour, so instead of injecting dependencies, functional programs must be composed in impure contexts. Impure functions can call pure functions, so at the boundary, an application must gather impure data, and use it to call pure functions. This automatically leads to the ports and adapters architecture.

This style of programming is surprisingly often possible, but it's not a universal solution; other alternatives exist.


Comments

Hi, Thank you for this blog post series. I also read your other posts on ports and adapters and the proposed architecture makes sense in terms of how it works, but I struggle to see the benefit in a real world application. Maybe let me explain my question with a quick example.

In the 2nd blog post of this series you demonstrated this function:

// int -> (DateTimeOffset -> Reservation list) -> (Reservation -> int) -> Reservation
// -> int option
let tryAccept capacity readReservations createReservation reservation =
    let reservedSeats =
        readReservations reservation.Date |> List.sumBy (fun x -> x.Quantity)
    if reservedSeats + reservation.Quantity <= capacity
    then createReservation { reservation with IsAccepted = true } |> Some
    else None

If I understand it correctly this function is pure if readReservations and createReservation are both pure otherwise it is impure.

I also understand the benefit of having a pure function, because it is a lot easier to understand the code, test the code and reason about it. That makes sense as well :).

So in the 3rd blog post you make tryAccept a pure function, by removing the function dependencies and replacing it with simple values:

// int -> Reservation list -> Reservation -> Reservation option
let tryAccept capacity reservations reservation =
    let reservedSeats = reservations |> List.sumBy (fun x -> x.Quantity)
    if reservedSeats + reservation.Quantity <= capacity
    then { reservation with IsAccepted = true } |> Some
    else None

However this was only possible because you essentially moved the impure code into another new function:

// Reservation -> int option
let tryAcceptComposition reservation =
    reservation.Date
    |> DB.readReservations connectionString
    |> flip (tryAccept 10) reservation
    |> Option.map (DB.createReservation connectionString)

So after all the application hasn't really reduced the total number of impure functions (still 3 in each case - readReservations, createReservation and tryAccept[Composition]).

The only difference I see is that one impure function has been refactored into 2 functions - one pure and one impure. Considering that the original tryAccept function was already fully testable from a unit testing point of view and quite readable, what is the benefit of this additional step? I would almost argue that the original tryAccept function was even easier to read/understand than the combination of tryAccept and tryAcceptComposition. I understand that impure functions like this are not truly functional, but in a real world application you must have some impure functions, and I would like to better understand where the trade-off benefit of that additional step lies. Am I missing something else?

2017-02-03 10:34 UTC

Dustin, thank you for writing. There are several answers to your question, depending on the perspective one is interested in. I'll see if I can cover the most important ones.

Is it functional? #

On the most fundamental level, I'm interested in learning functional programming. In order to do this, I seek out strictly functional solutions to problems. Haskell is a great help in that endeavour, because it's not a hybrid language. It only allows you to do functional programming.

Does it make sense to back-port Haskell solutions to F#, then? That depends on what one is trying to accomplish, but if the goal is nothing but learning how to do it functionally, then that goal is accomplished.

Toy examples #

On another level, the example I've presented here is obviously nothing but a toy example. It's simplified, because if I presented readers with a more realistic example, the complexity of the real problem could easily drown out the message of the example. Additionally, most readers would probably give up reading.

I'm asking my readers to pretend that the problem is more complex than the one I present here; pretend that this problem is a stand-in for a harder problem.

In this particular context, there could be all sorts of complications:

  • Reservations could be for time slots instead of whole dates. In order to keep the example simple, I treat each reservation as simply blocking out an entire date. I once dined at a restaurant where they started serving at 19:00, and if you weren't there on time, you'd miss the first courses. Most restaurants, though, allow you to make reservations for a particular time, and many have more than one serving on a single evening.
  • Most restaurants have tables, not seats. Again, the same restaurant I mentioned above seated 12 people at a bar-like arrangement facing the kitchen, but most restaurants have tables of varying sizes. If they get a reservation for three people, they may have to reserve a table for four.
  • Perhaps the restaurant would like to implement a feature where, if it receives a reservation that doesn't fill out a table (like a reservation for three people, and only four-people tables are left), it'd defer the decision to see if a 'better' reservation arrives later.
  • Some people make reservations, but never show up. For that reason, a restaurant may want to allow a degree of overbooking, just like airlines. How much overbooking to allow is a business decision.
  • A further wrinkle on the overbooking business rule is that you may have a different overbooking policy for Fridays than for, say, Wednesdays.
  • Perhaps the restaurant would like to implement a waiting-list feature as well.

As you can see, we could easily imagine that the business logic could be more convoluted. Keeping all of that decision logic pure would be beneficial.
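
As a rough sketch of how one of these complications could stay pure, consider the weekday-dependent overbooking policy. The percentages below are invented for the sake of illustration, and the Reservation record is a trimmed-down assumption.

```fsharp
open System

// Assumed, trimmed-down record; only the fields used in this article.
type Reservation =
    { Date : DateTimeOffset
      Quantity : int
      IsAccepted : bool }

// Hypothetical business rule: allow 20 % overbooking on Fridays, and 10 % on
// all other days. The numbers are made up.
let overbookingPercentage (date : DateTimeOffset) =
    match date.DayOfWeek with
    | DayOfWeek.Friday -> 20
    | _ -> 10

// int -> Reservation list -> Reservation -> Reservation option
let tryAccept capacity reservations reservation =
    let effectiveCapacity =
        capacity + capacity * overbookingPercentage reservation.Date / 100
    let reservedSeats = reservations |> List.sumBy (fun x -> x.Quantity)
    if reservedSeats + reservation.Quantity <= effectiveCapacity
    then { reservation with IsAccepted = true } |> Some
    else None

let friday = DateTimeOffset(2017, 2, 3, 19, 0, 0, TimeSpan.Zero)
let wednesday = DateTimeOffset(2017, 2, 8, 19, 0, 0, TimeSpan.Zero)

let fullHouse date = [ { Date = date; Quantity = 10; IsAccepted = true } ]

// With a capacity of 10, two extra seats fit on a Friday, but not on a Wednesday.
let onFriday =
    tryAccept 10 (fullHouse friday) { Date = friday; Quantity = 2; IsAccepted = false }
let onWednesday =
    tryAccept 10 (fullHouse wednesday) { Date = wednesday; Quantity = 2; IsAccepted = false }
```

The policy adds a branch to the decision logic, but the function still only takes values as input and returns a value, so the type of tryAccept is unchanged.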

Separation of concerns #

In my experience, there's an entire category of software defects that occur because of state mutation in business logic. You could have an area of your code that calls other code, which calls other code, and so on, for several levels of nesting. Somewhere, deep in the bowels of such a system, a conditional statement flips a boolean flag that consequently impacts how the rest of the program runs. I've seen plenty of examples of such software, and it's inhumane; it doesn't fit within human cognitive limits.

Code that allows arbitrary side-effects is difficult to reason about.

Knowing that a subgraph of your call tree is pure reduces defects like that. This is nothing but another way to restate the command-query separation principle. In F#, we still can't be sure unless we exert some discipline, but in Haskell, all it takes is a look at the type of a function or value. If it doesn't include IO, you know that it's pure.

Separating pure code from impure code is separation of concerns. Business logic is one concern, and I/O is another concern, and the better you can separate these, the fewer sources of defects you'll have. True, I haven't reduced the amount of code by much, but I've separated concerns by separating the code that contains (side) effects from the pure code.

Testability #

It's true that the partial application version of tryAccept is testable, because it has isolation, but the tests are more complicated than they have to be:

[<Property(QuietOnSuccess = true)>]
let ``tryAccept behaves correctly when it can accept``
    (NonNegativeInt excessCapacity)
    (expected : int) =
    Tuple2.curry id
    <!> Gen.reservation
    <*> Gen.listOf Gen.reservation
    |>  Arb.fromGen |> Prop.forAll <| fun (reservation, reservations) ->
    let capacity =
        excessCapacity
        + (reservations |> List.sumBy (fun x -> x.Quantity))
        + reservation.Quantity
    let readReservations = ((=!) reservation.Date) >>! reservations
    let createReservation =
        ((=!) { reservation with IsAccepted = true }) >>! expected
 
    let actual =
        tryAccept capacity readReservations createReservation reservation
 
    Some expected =! actual
 
[<Property(QuietOnSuccess = true)>]
let ``tryAccept behaves correctly when it can't accept``
    (PositiveInt lackingCapacity) =
    Tuple2.curry id
    <!> Gen.reservation
    <*> Gen.listOf Gen.reservation
    |>  Arb.fromGen |> Prop.forAll <| fun (reservation, reservations) ->
    let capacity =
        (reservations |> List.sumBy (fun x -> x.Quantity)) - lackingCapacity
    let readReservations _ = reservations
    let createReservation _ = failwith "Mock shouldn't be called."
 
    let actual =
        tryAccept capacity readReservations createReservation reservation
 
    None =! actual

(You can find these tests in commit d2387cceb81eabc349a63ab7df1249236e9b1d13 in the accompanying sample code repository.) Contrast those dependency-injection style tests to these tests against the pure version of tryAccept:

[<Property(QuietOnSuccess = true)>]
let ``tryAccept behaves correctly when it can accept``
    (NonNegativeInt excessCapacity) =
    Tuple2.curry id
    <!> Gen.reservation
    <*> Gen.listOf Gen.reservation
    |>  Arb.fromGen |> Prop.forAll <| fun (reservation, reservations) ->
    let capacity =
        excessCapacity
        + (reservations |> List.sumBy (fun x -> x.Quantity))
        + reservation.Quantity
 
    let actual = tryAccept capacity reservations reservation
 
    Some { reservation with IsAccepted = true } =! actual
 
[<Property(QuietOnSuccess = true)>]
let ``tryAccept behaves correctly when it can't accept``
    (PositiveInt lackingCapacity) =
    Tuple2.curry id
    <!> Gen.reservation
    <*> Gen.listOf Gen.reservation
    |>  Arb.fromGen |> Prop.forAll <| fun (reservation, reservations) ->
    let capacity =
        (reservations |> List.sumBy (fun x -> x.Quantity)) - lackingCapacity
 
    let actual = tryAccept capacity reservations reservation
 
    None =! actual

They're simpler, and since they don't use mocks, they're more robust. They were easier to write, and I subscribe to the spirit of GOOS: if tests are difficult to write, the system under test should be simplified.

2017-02-05 20:09 UTC

Hi Mark,

Thanks for your talk at NDC last month, and for writing this series! I feel that the functional community (myself included) has a habit of using examples that aren't obviously relevant to the sort of line-of-business programming most of us do in our day jobs, so articles like this are sorely needed.

We talked a little about this in person after your talk at the conference: I wanted to highlight a potential criticism of this style of programming. Namely, there's still some important business logic being carried out by your tryAcceptComposition function, like checking the capacity on the requested reservation date. How do you unit test that readReservations is called with the correct date? Likewise, how do you unit test that rejected reservations don't get saved? Real world business logic isn't always purely functional in nature. Sometimes the side effects that your code performs are part of the requirements.

The Haskell philosophy isn't about rejecting side effects outright - it's about measuring and controlling them. I wouldn't write tryAcceptComposition using IO. Instead I'd program to the interface, not the implementation, using an mtl-style class to abstract over monads which support saving and loading reservations.

class Monad m => MonadReservation m where
    readReservations :: ConnectionString -> Date -> m [Reservation]
    createReservation :: ConnectionString -> Reservation -> m ReservationId


tryAcceptComposition :: MonadReservation m => Reservation -> m (Maybe ReservationId)
tryAcceptComposition r = runMaybeT $ do
    reservations <- lift $ readReservations connectionString (date r)
    accepted <- MaybeT $ return $ tryAccept 10 reservations r
    lift $ createReservation connectionString accepted

Code that lives in a MonadReservation context can read and create reservations in the database but nothing else; it doesn't have all the power of IO. During unit testing I can use an instance of MonadReservation that returns canned values, and in production I can use a monad that actually talks to the database.

Since type classes are syntactic sugar for passing an argument, this is really just a nicer way of writing your original DI-style code. I don't advocate the "free monad" style that's presently trendy in Scala-land because I find it unnecessarily complex. 90% of the purported advantages of free monads are already supported by simpler language features.

I suppose the main downside of this design is that you can't express it in F#, at least not cleanly. It relies on type classes and higher-kinded types.

Hope you find this interesting, I'd love to hear what you think!

Benjamin

2017-02-06 16:28 UTC

Benjamin, thank you for writing. The alternative you propose looks useful in Haskell, but, as you've already suggested, it doesn't translate well into F#.

I write F# code professionally, whereas so far, I've only used Haskell to critique my F# code. (If someone who reads this comment would offer to pay me to write some Haskell code, please get in touch.) In other words, I still have much to learn about Haskell. I think I understand enough, however, to see that I'd be able to use your suggested design to unit test tryAcceptComposition using the Identity monad for Stubs, or perhaps MonadWriter or MonadState for Mocks. I'll have to try that one day...

In F#, I write integration tests. Such tests are important regardless, and often they more closely relate to actual requirements, so I find this a worthwhile effort anyway.

2017-02-11 22:42 UTC

Hi Mark,

thanks for the post series, which I find interesting and needed. There is one part of your post that I find deserves further exploration. You write:

in statement-based languages like C# and Java, this may be problematic, because you may wish to defer a query, or base it on a decision inside the unit. In expression-based languages you can decouple decisions from effects, and deferred execution can always be done by lazy evaluation, if that's required.

Firstly, I would say that you can write expression-based programs in any language that has expressions, which naturally includes C# and Java. But that's not particularly relevant to this discussion.

More to the point, you're glossing over this as though it were a minor detail, when in fact I don't think it is. Let's explore the case in which "you may wish to defer a query, or base it on a decision inside the unit". The way you do this "by lazy evaluation" would be - I assume - by passing a function as an argument to your unit. But this is then effectively dependency injection, because you're passing in a function which has side effects, which will be called (or not) from the unit.

So, it seems to me that your technique of extracting side effects out of the unit provides a good general guideline, but not a completely general way to replace dependency injection.

2017-02-16 11:47 UTC

Enrico, thank you for writing. There's a lot to unpack in that quote, which was one of the reasons I didn't expand it. It would have made the article too long, and wandered off compared to its main point. I don't mind going into these details here, though.

Direction of data #

In order to get the obvious out of the way first, the issue you point out is with my refactoring of indirect input to direct input. Refactoring from indirect output to direct output is, as far as I can tell, not on your agenda. Designing with direct input in mind seems uncontroversial to me, so that makes sense.

No hard rules #

On this blog, I often write articles as I figure out how to deal with problems. Sometimes, I report on my discoveries at a time where I've yet to accumulate years of experience. What I've learned so far is that dependency injection isn't functional. What I'm still exploring is what to do instead.

It's my experience that the type of refactoring I demonstrate here can surprisingly often be performed. I don't want to claim that it's always possible to do it like this. In fact, I'm still looking for good examples where this will not be possible. Whenever I think of a simple enough example that I could share it here, I always realise that if only I simplify the problem, I can put it into the shape seen here.

My thinking is, however, constrained by my professional experience. I've been doing web (service) development for so many years now that it constrains my imagination. When your execution scope is exclusively a single HTTP request at a time, you tend to keep things simple. I'd welcome a simplified, but still concrete example where the impure/pure/impure sandwich described here isn't going to be possible.

This may seem like a digression, but my point is that I don't claim to be the holder of a single, undeniable truth. Still, I find that this article describes a broadly applicable design and implementation technique.

Language specifics #

The next topic we need to consider is our choice of language. When I wrote that deferred execution can always be done by lazy evaluation, that's exactly how Haskell works. Haskell is lazily evaluated, so any value passed as direct input can be unevaluated until required. That goes for IO as well, but then, as we've learned, you can't pass impure data to a pure function.

All execution is, in that sense, deferred, unless explicitly forced. Thus, any potential need for deferred execution has no design implications.

F#, on the other hand, is an eagerly evaluated language, so there, deferred execution may have design implications.

Performance #

Perhaps it's my lack of imagination again, but I can't think of a well-designed system where deferred execution is required for purposes of correctness. As far as I can tell, deferred execution is a performance concern. You wish to defer execution of a query because that operation takes significant time.

That's a real concern, but I often find that people worry too much about performance. Again, this is probably my lack of wider experience, as I realise that performance can be important in smart phone apps, games, and the like. Clearly, performance is also important in the world of REST APIs, but I've met a lot of people who worry about performance without ever measuring it.

When you start measuring performance, you'll often be surprised to discover where your code spends actual time. So my design approach is always to prioritise making the system work first, and then, if there are performance problems, figure out how to tweak it so that it becomes satisfactory. In my experience, such tweaking is only necessary now and then. I'm not claiming that my code is the fastest it could be, but it's often fast enough, and as easy to maintain as I can make it.

The need for data #

Another concern is the need for data. If you consider the above tryAccept function, it always uses reservations. Thus, there's no gain in deferring the database query, because you'll always need the data.

Deferred execution is only required in those cases where you have conditional branching, and only in certain cases do you need to read a particular piece of data.

Even conditional branching isn't enough of a criterion, though, because you could have branching where, in 99.9 % of the cases, you'd be performing the query anyway. Would you, then, need deferred execution for the remaining 0.1 % of the cases?

Lazy sequences #

Still, let's assume that we've implemented a system using pure functions that take pure data, but to our dismay we discover that there's one query that takes time to execute, and that we truly only need it some of the time. In .NET, there are two distinct situations:

  • We need a scalar value
  • We need a collection of values

If we need a collection of values, we only need to make a minuscule change to the design of our function. Instead of taking an F# list, or an array, as direct input, we can make the function take a sequence (IEnumerable<T> in C#) as input. These can be implemented as lazily evaluated sequences, which gives us the deferred execution we need.
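
Here's a sketch of what that could look like. The early exit in this variation of tryAccept, the readReservationsLazily function, and the queryCount counter are all hypothetical; they only serve to show when the sequence is actually enumerated.

```fsharp
open System

// Assumed, trimmed-down record; only the fields used in this article.
type Reservation =
    { Date : DateTimeOffset
      Quantity : int
      IsAccepted : bool }

// Counts how many times the hypothetical query actually runs.
let queryCount = ref 0

// Hypothetical stand-in for a slow query. The body of the sequence expression
// doesn't run until something enumerates the sequence.
let readReservationsLazily date =
    seq {
        queryCount.Value <- queryCount.Value + 1
        yield { Date = date; Quantity = 4; IsAccepted = true }
        yield { Date = date; Quantity = 2; IsAccepted = true }
    }

// int -> seq<Reservation> -> Reservation -> Reservation option
let tryAccept capacity (reservations : seq<Reservation>) reservation =
    if reservation.Quantity > capacity
    then None // Decided without enumerating the sequence.
    else
        let reservedSeats = reservations |> Seq.sumBy (fun x -> x.Quantity)
        if reservedSeats + reservation.Quantity <= capacity
        then Some { reservation with IsAccepted = true }
        else None

let date = DateTimeOffset(2017, 2, 3, 19, 0, 0, TimeSpan.Zero)
let reservations = readReservationsLazily date // Nothing has been queried yet.

let tooLarge =
    tryAccept 10 reservations { Date = date; Quantity = 11; IsAccepted = false }
let queriesAfterTooLarge = queryCount.Value // Still 0.

let fits =
    tryAccept 10 reservations { Date = date; Quantity = 3; IsAccepted = false }
let queriesAfterFits = queryCount.Value // Now 1.
```

Keep in mind that a sequence like this isn't memoised. If the function enumerates it twice, the query runs twice, unless you cache it first, for example with Seq.cache.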

Lazy scalar values #

This leaves the corner case where we need a lazily evaluated scalar value. In such cases, I may have to make a concession to performance in my function design, but I wouldn't change the argument to a function, but rather to a lazy value.

Lazy values are deferred, but memoised, which is the reason I'd prefer them over function arguments.
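
A small sketch illustrates the difference. The queryCapacity function and the canSeat function are hypothetical examples, invented for this comment; the point is only that the lazy value is evaluated at most once.

```fsharp
// Counts how often the hypothetical expensive query actually runs.
let evaluations = ref 0

// Hypothetical stand-in for a slow scalar query, such as reading the
// restaurant's capacity from a remote configuration store.
let queryCapacity () =
    evaluations.Value <- evaluations.Value + 1
    10

// Decision logic that only forces the lazy value when it needs it.
let canSeat (capacity : Lazy<int>) quantity =
    if quantity <= 0
    then false                      // Decided without forcing the value.
    else quantity <= capacity.Value // Forces the value on first use.

let capacity = lazy (queryCapacity ())

let invalid = canSeat capacity 0  // The query hasn't run yet.
let small = canSeat capacity 4    // The query runs here.
let large = canSeat capacity 12   // The memoised result is reused.
```

Had canSeat taken a function of the type unit -> int instead, each call that reached the comparison would have run the query again.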

2017-02-18 19:54 UTC

In a previous comment, you said:

I'd welcome a simplified, but still concrete example where the impure/pure/impure sandwich described here isn't going to be possible.

I have on two occasions stumbled into cases where I can't find a good way to pull this off. The reason may be that there's a workflow seemingly consisting of several impure steps interleaved with pure decisions. The cases are similar, so I will share one of them as an example.

We have an API for allowing users to register. We also have a separate API for two-factor authentication (2FA). When users register, they have to complete a "proof" using the 2FA API to verify ownership of their mobile number.

The 2FA API has two relevant endpoints used internally by our other APIs: One is used for creating a proof, and returns a proof ID that can be passed on to the API client. This endpoint is used when a client makes a request without a proof ID, or with an invalid proof ID. The other endpoint is used for verifying that a proof has been completed. This endpoint is used when a client supplies a proof ID in the request.

(The 2FA API also has endpoints the client uses to complete the proof, but that is not relevant here.)

When a user registers, this is the workflow:

A flowchart describing the workflow for completing a registration.

Here is a simple implementation of the workflow using "dependency injection" (I will skip the actual composition function, similar to your tryAcceptComposition, which is not interesting here):

let completeRegistrationWorkflow
    (createProof: Mobile -> Async<ProofId>)
    (verifyProof: Mobile -> ProofId -> Async<bool>)
    (completeRegistration: Registration -> Async<unit>)
    (proofId: ProofId option)
    (registration: Registration)
    : Async<CompleteRegistrationResult> =
  async {
    match proofId with
    | None ->
        let! proofId = createProof registration.Mobile
        return ProofRequired proofId
    | Some proofId ->
        let! isValid = verifyProof registration.Mobile proofId
        if isValid then
          do! completeRegistration registration
          return RegistrationCompleted
        else
          let! proofId = createProof registration.Mobile
          return ProofRequired proofId
  }

There are two decisions that are pure and could conceivably be extracted to a pure (non-DI) function, indicated by blue in the flowchart:

  • Initially: Should we create a new proof or verify an existing proof? (based on whether the client supplied a proof ID)
  • After verifying a supplied proof: Should we complete the registration or create a new proof? (based on whether the supplied proof ID was valid)

My question then, is this: Is it possible to refactor this to direct input/output, in a way that actually reduces complexity where it matters? (In other words, in a way where the decisions mentioned above are part of a pure, non-DI function?)

Otherwise, I just want to say that this series has helped me become a better F# programmer. Since originally reading it, I have consistently tried to design for direct input/output as opposed to using "dependency injection", and my code has become better for it. Thanks!

2019-11-06 10:06 UTC

Christer, thank you for writing. That was such a fruitful question that I wrote a new article in order to answer it. I hope you find it useful, but if not, let's continue the discussion there or here, depending on what makes most sense.
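To make the question concrete, here is a minimal sketch of one way such a refactoring might look. It is not necessarily the solution from the follow-up article. The type definitions are hypothetical stand-ins for the ones implied by the question, and the Decision type and the decide function are names invented for this illustration. The sketch assumes that a missing proof ID can be treated the same way as an invalid one, which collapses the two decisions into one.

```fsharp
// Hypothetical stand-ins for the types used in the question.
type Mobile = Mobile of string
type ProofId = ProofId of System.Guid
type Registration = { Mobile : Mobile }
type CompleteRegistrationResult =
    | ProofRequired of ProofId
    | RegistrationCompleted

// Hypothetical: the outcome of the pure decision, expressed as data.
type Decision =
    | CreateProof of Mobile
    | CompleteRegistration of Registration

// Pure: no dependencies, only direct input and output.
let decide (proofIsValid : bool) (registration : Registration) : Decision =
    if proofIsValid
    then CompleteRegistration registration
    else CreateProof registration.Mobile

// Impure/pure/impure sandwich around the decision.
let completeRegistrationWorkflow
    (createProof : Mobile -> Async<ProofId>)
    (verifyProof : Mobile -> ProofId -> Async<bool>)
    (completeRegistration : Registration -> Async<unit>)
    (proofId : ProofId option)
    (registration : Registration)
    : Async<CompleteRegistrationResult> =
    async {
        // Impure: gather input. Assumption: no proof ID counts as invalid.
        let! proofIsValid =
            match proofId with
            | Some pid -> verifyProof registration.Mobile pid
            | None -> async { return false }
        // Pure: decide what to do.
        let decision = decide proofIsValid registration
        // Impure: act on the decision.
        match decision with
        | CreateProof mobile ->
            let! newProofId = createProof mobile
            return ProofRequired newProofId
        | CompleteRegistration r ->
            do! completeRegistration r
            return RegistrationCompleted
    }
```

With this shape, decide can be tested with plain values and no test doubles, while completeRegistrationWorkflow only gathers input and carries out whatever decide returned.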

2019-12-02 13:22 UTC

Hi Mark, excellent post series, thank you. Right after finishing this post I had the same questions Dustin Moris Gorski had and I think I still have some, even after your answer.

For simplicity's sake, I'll represent TryAccept as 3 operations: read, calculate and maybe create. Both read and create are injected so that I can change their implementations at runtime, if necessary. That is the same exact representation for both the C# and F# codes that use DI, and also the tryAcceptComposition code, with the difference that tryAcceptComposition depends on the actual implementation of DB, so it's relatively more brittle than the DI alternatives.

Although TryAccept and TryAcceptComposition are doing the same thing, I'm still trying to think why the latter looks more functional than the first and whether DI is really not necessary. I got to 2 conclusions and I wanted to know your opinion about them:

1) The difference between the 2 implementations (TryAccept and TryAcceptComposition) is that the first is following an imperative style, while the second is purely declarative, with one big composition. Both implementations perform the exact same operations, invoking the same dependencies, but with different code styles.

2) If we try and take your style of sandwich to extremes and push dependencies to the very edges of the application, to the point where it's just a simple big composition with dependencies at the beginning and end, we may replace "injection" with import statements, importing our dependencies in the same place the composition is written. I don't think this would work in every scenario as operations in the middle of this composition would need access to dependencies too (which could then be pushed out like TryAccept, but could make the code less readable). Do you think this is doable?

2020-01-14 20:37 UTC

Danilo, thank you for writing. Regarding your first point, I'm not sure I follow. Both functions are written in F#, which is an expression-based language (at least when you ignore the interop parts). Imperative code is usually defined as coding with assignments rather than expressions. There are no assignments here.

Also, tryAccept and tryAcceptComposition aren't equivalent. One is a specialisation of the other, so to speak. You can't change the behaviour of tryAcceptComposition unless you edit the actual code. You can, on the other hand, change the observable behaviour of tryAccept by composing it with different impure actions.
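The difference can be sketched as follows. This assumes that the pure tryAccept has roughly the shape shown in the article and that the capacity is 10, as in tryAcceptComposition. The in-memory store and the functions readFromMemory, saveToMemory, tryAcceptInMemory, and tryAcceptDryRun are hypothetical names made up for this illustration, not part of the article's code base.

```fsharp
open System

type Reservation =
    { Date : DateTimeOffset; Quantity : int; IsAccepted : bool }

// Pure: the decision depends only on direct input.
let tryAccept capacity (reservations : Reservation list) (reservation : Reservation) =
    let reservedSeats = reservations |> List.sumBy (fun x -> x.Quantity)
    if reservedSeats + reservation.Quantity <= capacity
    then Some { reservation with IsAccepted = true }
    else None

// Hypothetical in-memory stand-in for a database.
let store = ResizeArray<Reservation> ()

// Impure: reads mutable state.
let readFromMemory (date : DateTimeOffset) =
    store |> Seq.filter (fun r -> r.Date.Date = date.Date) |> List.ofSeq

// Impure: mutates state and returns the index as a makeshift ID.
let saveToMemory (reservation : Reservation) =
    store.Add reservation
    store.Count - 1

// One composition: read from memory, decide, save to memory.
let tryAcceptInMemory (reservation : Reservation) =
    let existing = readFromMemory reservation.Date
    tryAccept 10 existing reservation |> Option.map saveToMemory

// Another composition of the same pure function: decide, but only log.
let tryAcceptDryRun (reservation : Reservation) =
    let existing = readFromMemory reservation.Date
    tryAccept 10 existing reservation
    |> Option.iter (printfn "Would accept %A")
```

Each composition is hard-wired to its impure actions, just like tryAcceptComposition, so changing what it does means editing it. The pure tryAccept stays the same in both.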

As to your second observation:

"I don't think this would work in every scenario as operations in the middle of this composition would need access to dependencies too"

I'm still keen on getting some compelling examples of this. Christer van der Meeren kindly tried to supply such an example, but it turned out as another fine sandwich. This happens conspicuously often.

I've never claimed that the sandwich architecture is always possible, but whenever I try to find a compelling counter-example, it usually turns out to be refactorable into a sandwich after all. Do, however, refer to the comments to that later article.

2020-01-19 21:07 UTC

Hi Mark, I found myself thinking about this early this morning and wondering about scenarios where injecting dependencies may still be "functional" (i.e. pure) and would like your input. It also incorporates TDD (or at least testing strategies).

Here's my example: I was recently refactoring some code that was structured like this (C#-ish pseudo-code):

if (shouldNotify(record, supportingRecord, settings))
{
    generateEmail(...);
    record.status = calculateStatusAfterEmail(record, supportingRecord);
}
else
{
    record.status = calculateStatusWithoutEmail(record, supportingRecord);
}
				
Let's gloss over the side-effects all over the place here and just focus on the flow.

To ease unit testing, I extracted these private methods into a separate interface and am injecting it into this flow. Call me lazy, but I wanted to just mock/stub the results of these pure calls when testing the overall flow and wrote separate tests on the implementation of the new interface that could focus on all the different combinations of values. Is that a strategy you would support/recommend, or would you hesitate to extract pure functions from an implementation just to ease testing? I also convinced myself that I could justify this on the grounds of following the Single Responsibility Principle (i.e. this flow should not need to be modified if the status calculation logic changes).

Your thoughts? By the way, your new book on dependency injection strategies is on its way to me. Perhaps that book will provide me with your views on this, but I couldn't resist posting the question here.

2020-05-28 14:30 UTC

Sven, thank you for writing. I think that your question deserves a longer answer than what you'll get here. It's a topic that for some time I've been contemplating giving a more detailed treatment. It's a rich enough subject that a couple of articles would be required. I'm still pondering how to organise such content.

The short answer is that I'm increasingly looking for alternatives to the sort of interaction-based testing you imply. I think that my article series on the benefits of state-based testing outlines most of my current thinking on the topic.
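As a rough illustration of the state-based style, here is a sketch of the flow from the question, written in F#. All types, field names, and rules are hypothetical, since the question doesn't define them, and supportingRecord and settings are left out to keep the example short. Instead of sending the email and mutating the record, the function returns the updated record together with the email that ought to be sent, if any.

```fsharp
// All names and rules below are hypothetical stand-ins.
type Status = Pending | Notified | Closed
type Record = { Owner : string; Status : Status; Overdue : bool }
type Email = { To : string; Body : string }

// Pure stand-ins for the three extracted calculations.
let shouldNotify (record : Record) = record.Overdue
let calculateStatusAfterEmail (_ : Record) = Notified
let calculateStatusWithoutEmail (_ : Record) = Closed

// Pure: describes what should happen, but sends nothing.
let processRecord (record : Record) : Record * Email option =
    if shouldNotify record then
        let email = { To = record.Owner; Body = "Please review your record." }
        ({ record with Status = calculateStatusAfterEmail record }, Some email)
    else
        ({ record with Status = calculateStatusWithoutEmail record }, None)
```

A test can then compare the returned record and optional email with expected values, without stubs or mocks. Actually sending the email is left to an impure caller that inspects the returned Email option.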

2020-06-03 5:46 UTC

I read over all the comments and I would like clarification on when logic is required for resolving the dependency being provided. Benjamin Hodgson asked this:

Namely, there's still some important business logic being carried out by your tryAcceptComposition function, like checking the capacity on the requested reservation date. How do you unit test that readReservations is called with the correct date?

Suppose that readReservations didn't just get all reservations from the database, only to process those for one restaurant or one day. Then the restaurant ID and date may be needed to fetch the correct reservations, and this logic would have to happen outside of the unit. I think you touched on this complication in a response above, but I'd like to confirm that I understand your responses correctly: do you simply perform this logic outside of the unit and have an integration test for the entire workflow?

2021-12-15 22:11 UTC

Chris, thank you for writing. There are more nuances to that question than might be immediately apparent. At least, I'll answer the question in several ways.

First, I'd like to challenge the implicit assumption that one must always test everything. What to test and what not to test depend on multiple independent forces, such as how costly an error might be, how likely an error is to occur, and so on.

Take, as an example, the code base that accompanies my book Code That Fits in Your Head. This code base is also a restaurant reservation system, but has a realistic level of complexity, so I think it might better highlight the sort of considerations that might be warranted. The code that is most relevant to the question is this snippet from ReservationsController.TryCreate:

var reservations = await Repository
    .ReadReservations(restaurant.Id, reservation.At)
    .ConfigureAwait(false);
var now = Clock.GetCurrentDateTime();
if (!restaurant.MaitreD.WillAccept(now, reservations, reservation))
    return NoTables500InternalServerError();
 
await Repository.Create(restaurant.Id, reservation)
    .ConfigureAwait(false);

How likely is it that the code changes in such a way that it calls ReadReservations with the wrong input? Either because the original programmer who wrote the code made a mistake, or because a later developer introduced a change in which this code calls ReadReservations with incorrect input?

Now, such errors certainly do occur. Programmers are human, and to err is human.

Still, it's important to keep in mind that the risk of introducing an error into a code base is proportional to the amount of code. The more code, the more errors you'll have. This goes for test code as well. You can also, inadvertently, introduce errors into a test.

You can counteract some of that by writing the test first. That's the reason it's so important to see a test fail. Only by seeing it fail can you feel confident that it verifies something real.

So, depending on circumstances, it may be relevant and important to write a test that verifies that ReadReservations is called with the correct arguments. Still, I think it's important to weigh advantages and disadvantages before blindly insisting that it's necessary to protect against such a contingency.

When I originally wrote the above code, I wrote no unit test that explicitly verifies whether ReadReservations is called with the correct arguments. After all, in that context, there's only one DateTime value in scope: reservation.At. (This is not entirely true, because now is also a DateTime value, but as the code is written, it's not available for ReadReservations because it's retrieved after the ReadReservations call. But see below.)

In a context like this, you'd have to go out of your way to call ReadReservations with the wrong date. You could create a value on the spot, like this:

var reservations = await Repository
    .ReadReservations(restaurant.Id, new DateTime(2021, 12, 19, 18, 30, 0))
    .ConfigureAwait(false);

or this:

var reservations = await Repository
    .ReadReservations(restaurant.Id, reservation.At.AddDays(1))
    .ConfigureAwait(false);

None of these are impossible, but I do consider them to be conspicuous. You don't easily make those kinds of mistakes. You'd almost have to go out of your way to make these mistakes, and a good pairing partner or code review ought to catch such a mistake.

Another option is if someone finds a good reason to reorder the code:

var now = Clock.GetCurrentDateTime();
var reservations = await Repository
    .ReadReservations(restaurant.Id, now)
    .ConfigureAwait(false);

In this example, now is available to ReadReservations, and perhaps a later edit might introduce the error of calling it with now.

This is perhaps a more realistic, honest mistake.

None of the 171 unit tests in the code base catch any of the above three mistakes.

Is the code base unsafe, then?

No, because as luck would have it, an integration test (NoOverbookingRace) fails on any of these three edits. I admit that this is just a fortunate accident. I added that particular integration test for another reason: to reproduce an unrelated bug that occurred in 'production'.

I usually try to base the amount of testing I ask from a team on the quality of the team. The more I trust that they pair-program or perform regular code reviews, the less I insist on test coverage, and vice versa.

In summary, though, if I find that it's important to verify that ReadReservations was called with the correct arguments, I'd favour a state-based integration test.

2021-12-19 12:24 UTC


Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Thursday, 02 February 2017 08:56:00 UTC
