To log or not to log by Mark Seemann
There's no reason to make logging any harder than it has to be. Here's a compositional approach in F#.
Logging seems to be one of those cross-cutting concerns on which people tend to spend a lot of effort. For programmers coming from an object-oriented language like C#, finding a sane approach to logging seems to be particularly difficult.
In my book about Dependency Injection, I made an effort to explain that logging and other cross-cutting concerns are best addressed by Decorators (or dynamic interception). You can use the same design with F# functions.
Example scenario #
Consider, as an example, an HTTP API for a restaurant booking system. This example is taken from my Test-Driven Development with F# Pluralsight course, but you can see an almost identical example for free in this recording of a BuildStuff talk.
The application is composed of small, mostly pure functions:
let imp = Validate.reservationValid >> Rop.bind (Capacity.check 10 SqlGateway.getReservedSeats) >> Rop.map SqlGateway.saveReservation
Exactly what each function does isn't important in this context, but here are the types involved:
In short, the
imp function validates the input, applies some business rules if the input was valid, and saves the reservation to a database if the business rules allow it.
I strongly believe that in any well-designed code base, the core implementation should be independent of cross-cutting concerns such as logging. If the above reservation system is well-designed, it should be possible to retrofit logging onto it without changing the existing functions. Indeed, that turns out to be possible.
Adding logs #
You should use an existing logging library such as Serilog, log4net, NLog, etc. instead of rolling your own. In this example, imagine that you're using the well-known SomeExternalLoggingLibrary. In order to protect yourself against changes etc. in the external library, you first define your own, application-specific logging module:
module BookingLog = let logError = SomeExternalLoggingLibrary.logError let logInformation = SomeExternalLoggingLibrary.logInformation
Both functions have the type
fileName:string -> msg:string -> unit.
As a beginning, you can start by logging the final result of executing the
imp function. Since it has the type
ReservationRendition -> Rop.Result<unit, Error>, if you implement a log function that both accepts and returns
Rop.Result<unit, Error>, you can append that to the composition of
imp. Start with the logging function itself:
module BookingLog = // ... let logReservationsPost logFile result = match result with | Failure(ValidationError msg) -> logError logFile msg | Failure CapacityExceeded -> logError logFile "Capacity exceeded." | Success () -> logInformation logFile "Reservation saved." result
This function has the type
string -> Rop.Result<unit, Error> -> Rop.Result<unit, Error>. It matches on the cases of
result and logs something relevant for each case; then it returns
result without modifying it.
Since the logReservationsPost function both accepts and returns the same type, you can easily append it to the other functions while composing
let imp = Validate.reservationValid >> Rop.bind (Capacity.check 10 SqlGateway.getReservedSeats) >> Rop.map SqlGateway.saveReservation >> BookingLog.logReservationsPost logFile
Notice how BookingLog.logReservationsPost is simply added as the last line of composition. This compiles because that function returns its input.
Running the application with various input demonstrates that logging works as intended:
Information: Reservation saved. Error: Invalid date. Information: Reservation saved. Error: Capacity exceeded.
You've seen that you can append high-level logging of the final value, but can you also add logging deeper in the guts of the implementation?
Logging business behaviour #
Imagine that you need to also log what happens before and after Capacity.check is called. One option is to add a logging function with the same type as Capacity.check, that also Decorates Capacity.check, but I think it's simpler to add two functions that log the values before and after Capacity.check.
The type of Capacity.check is
int -> (DateTimeOffset -> int) -> Reservation -> Rop.Result<Reservation, Error>, but after partial application, it's only
Reservation -> Rop.Result<Reservation, Error>. In order to log what happens before Capacity.check is called, you can add a function that both accepts and returns a Reservation:
let logBeforeCapacityCheck logFile reservation = logInformation logFile (sprintf "Checking capacity for %s..." (reservation.Date.ToString "d")) reservation
This function has the type
string -> Reservation -> Reservation, and is placed within the BookingLog module. The logInformation function is used to log the input, which is then returned.
Likewise, you can also log what happens after Capacity.check is invoked. Since Capacity.check returns Rop.Result<Reservation, Error>, your log file must take that type as both input and output:
let logAfterCapacityCheck logFile result = match result with | Failure(ValidationError msg) -> logError logFile msg | Failure CapacityExceeded -> logError logFile (sprintf "Capacity exceeded.") | Success r -> logInformation logFile (sprintf "All is good for %s." (r.Date.ToString "d")) result
The logAfterCapacityCheck function has the type
string -> Rop.Result<Reservation, Error> -> Rop.Result<Reservation, Error>, and is also placed within the BookingLog module. Like the logReservationsPost function, it matches on
result and logs accordingly; then it returns
result. Do you see a pattern?
Because of these types, you can compose them into
let imp = Validate.reservationValid >> Rop.map (BookingLog.logBeforeCapacityCheck logFile) >> Rop.bind (Capacity.check 10 SqlGateway.getReservedSeats) >> BookingLog.logAfterCapacityCheck logFile >> Rop.map SqlGateway.saveReservation >> BookingLog.logReservationsPost logFile
Notice that BookingLog.logBeforeCapacityCheck and BookingLog.logAfterCapacityCheck are composed around Capacity.check. The final BookingLog.logReservationsPost is also still in effect. Running the application shows that logging still works:
Information: Checking capacity for 27.11.2015... Information: All is good for 27.11.2015. Information: Reservation saved.
The first two log entries are created by the logs around Capacity.check, whereas the last line is written by BookingLog.logReservationsPost.
Conditional logging #
Some programmers are concerned about the performance implications of logging. You may wish to be able to control whether or not to log.
The easiest way to do that is to make logging itself conditional:
let logError fileName msg = if log then SomeExternalLoggingLibrary.logError fileName msg else () let logInformation fileName msg = if log then SomeExternalLoggingLibrary.logInformation fileName msg else ()
log is a boolean value. If
log is false, the above two functions simply return
() (unit) without doing anything. This prevents costly IO from happening, so may already be enough of a performance optimisation. As always when performance is the topic: don't assume anything; measure.
In reality, you probably want to use more granular flags than a single
log flag, so that you can control informational logging independently from error logging, but I'm sure you get the overall idea.
Conditional compilation #
Even with boolean flags, you may be concerned that logging adds overhead even when the
log flag is false. After all, you still have a function like logBeforeCapacityCheck above: it uses sprintf to format a string, and that may still be too much if it happens too often (again: measure).
For the sake of argument, imagine that you've measured the cost of leaving the logging functions logReservationsPost, logBeforeCapacityCheck, and logAfterCapacityCheck in place when
log is false, and that you find that you'll need to turn them off in production. That's not a problem. Recall that before you added these functions, the application worked fine without logging. You compose these functions into
imp in order to add logging, but you don't have to. You can even make this decision at compile time:
let imp = Validate.reservationValid #if LOG >> Rop.map (BookingLog.logBeforeCapacityCheck logFile) #endif >> Rop.bind (Capacity.check 10 SqlGateway.getReservedSeats) #if LOG >> BookingLog.logAfterCapacityCheck logFile #endif >> Rop.map SqlGateway.saveReservation #if LOG >> BookingLog.logReservationsPost logFile #endif
Notice the presence of the conditional compilation flag
LOG. Only if the application is compiled with the
LOG flag will the logging code be compiled into the application; otherwise, it runs without any logging overhead at all.
Personally, I've never needed to control logging at this level, so this isn't a recommendation; it's only a demonstration that it's possible. What's much more important to me is that everything you've seen here has required zero changes of the application code. The only code being modified is the Composition Root, and I regard the Composition Root as a configuration file.
In a well-designed application, you should be able to append logging without impacting the core implementation. You can do that by taking a Decorator-like approach to logging, even in a Functional application. Due to the compositional nature of a well-designed code base, you can simply slide log functions in where you need them.
Even if you're concerned about the performance implications of logging, there are various ways by which you can easily turn off logging overhead if you don't need it. Only do this if you've measured the performance of your application and found that you need to do so. The point is that if you design the application to be composed from small functions, you can always fine-tune logging performance if you need to. You don't have to do a lot of up-front design to cater specifically to logging, though.
In my examples, I deliberately kept things crude in order to make it clear how to approach the problem, but I'm sure Scott Wlaschin could teach us how to refactor such code to a sophisticated monadic design.
It should also be noted that the approach outlined here leverages F#'s support for impure functions. All the log functions shown here return unit, which is a strong indicator of side-effects. In Haskell, logging would have to happen in an IO context, but that wouldn't impact the overall approach. The boundary of a Haskell application is an IO context, and the Composition Root belongs there.
It's implicit in these strategies that "application-specific logging module" will be a singleton, right? Otherwise, it would be necessary to pass a logger instance into every method.
In a multi-threaded context, this will result in all logs going to the same destination, with potential for jumbles if multiple threads are executing simultaneously, relatively unpredictable results if execution flows through multiple threads, etc.
I've never been able to come up with a way around the "jumble" problem other than passing a logger in every function call (or, when using OOP, giving every class a Logger property). But having every function take a "logger" parameter is not ideal either, for obvious reasons.
Do you have any thoughts on how to allow logging to be specific to each flow of execution, other than generating a logger at the entry point and passing it as a parameter to everything?
Ben, thank you for writing. It's not at all implicit that Singletons are involved. This article discusses a semi-functional design where there's no objects, just functions. It seems, however, that some of your concerns relate to object-oriented design.
In object-oriented design, I recommend modelling logging and other cross-cutting concerns as applications of the Decorator or Chain of Responsibility design patterns. As mentioned in this article, I discuss this in my book, but this blog also contains an example of this, although it's about instrumentation instead of logging. These two cross-cutting concerns are closely related in structure, though, so I hope you still find it informative. Such an approach to design eliminates the need for passing
logdependencies around in business logic and other places where it doesn't belong.
The way I've seen people address the problem with multi-threaded logging is to have a logger object per thread (or, safer this day, per HTTP request, or similar). This object simply collects all log data in memory until flushed. Some designs require client developers to explicitly call a
Flushmethod, but typically you can automate this so that it happens automatically when the thread or HTTP context is disposed of.
Flushis called, the infrastructure writes the entire log message to a queue. This can happen concurrently, but then you have a single-threaded subscriber on that queue that handles each message one at a time. This serialises the log messages, and you avoid interleaving of data.