Task asynchronous programming as an IO surrogate by Mark Seemann
Is task asynchronous programming a substitute for the IO container? An article for C# programmers.
This article is part of an article series about the IO container in C#. In the previous articles, you've seen how a type like IO<T> can be used to distinguish between pure functions and impure actions. While it's an effective and elegant solution to the problem, it depends on a convention: that all impure actions return IO objects, which are opaque to pure functions.
In reality, .NET base class library methods don't do that, and it's unrealistic that this is ever going to happen. It'd require a breaking reset of the entire .NET ecosystem to introduce this design.
A comparable reset did, however, happen a few years ago.
TAP reset
Microsoft introduced the task asynchronous programming (TAP) model some years ago. Operations that involve I/O got a new return type. Not IO<T>, but Task<T>.
The .NET framework team began a long process of adding asynchronous alternatives to existing APIs that involve I/O. Not as breaking changes, but by adding new, asynchronous methods side-by-side with older methods. ExecuteReaderAsync as an alternative to ExecuteReader, ReadAllLinesAsync side by side with ReadAllLines, and so on.
Modern APIs that expose only asynchronous methods also appeared. For example, the HttpClient class only affords asynchronous I/O-based operations.
The TAP reset was further strengthened by the move from .NET to .NET Core. Some frameworks, most notably ASP.NET, were redesigned on a fundamentally asynchronous core.
In 2020, most I/O operations in .NET are easily recognisable, because they return Task<T>.
Task as a surrogate IO
I/O operations are impure. Either you're receiving input from outside the running process, which is consistently non-deterministic, or you're writing to an external resource, which implies a side effect. It might seem natural to think of Task<T> as a replacement for IO<T>. Szymon Pobiega had a similar idea in 2016, and I investigated his idea in an article. This was based on F#'s Async<'a> container, which is equivalent to Task<T>, except when it comes to referential transparency.
Unfortunately, Task<T> is far from a perfect replacement of IO<T>, because the .NET base class library (BCL) still contains plenty of impure actions that 'look' pure. Examples include Console.WriteLine, the parameterless Random constructor, Guid.NewGuid, and DateTime.Now (arguably a candidate for the worst-designed API in the BCL). None of those methods return tasks, which they ought to if tasks should serve as easily recognisable signifiers of impurity.
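To make the problem concrete, here's a small sketch. The Stamp class and its Describe method are hypothetical names invented for illustration; only Guid.NewGuid and DateTime.Now come from the BCL. Nothing in the signature hints at impurity, yet two calls with the same argument return different strings.

```csharp
using System;

public static class Stamp
{
    // Looks pure: string in, string out, and no Task in sight.
    // It's impure, though, because Guid.NewGuid and DateTime.Now
    // are both non-deterministic.
    public static string Describe(string name)
    {
        return $"{name}-{Guid.NewGuid()}-{DateTime.Now.Ticks}";
    }
}
```

Since the return type is a plain string, neither the compiler nor a casual reader gets any signal that Describe can't be replaced with its return value.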
Still, you could write asynchronous Adapters over such APIs. Your Console Adapter might present this API:

    public static class Console
    {
        public static Task<string> ReadLine();
        public static Task WriteLine(string value);
    }

Moreover, the Clock API might look like this:

    public static class Clock
    {
        public static Task<DateTime> GetLocalTime();
    }
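How such Adapters are implemented isn't shown here. One possible sketch, assuming that it's acceptable to delegate to the existing BCL members, could look like the following. The method bodies are my assumptions rather than anything the API declarations above prescribe; TextReader.ReadLineAsync, TextWriter.WriteLineAsync, and Task.FromResult are existing BCL methods.

```csharp
using System;
using System.Threading.Tasks;

public static class Clock
{
    // Assumption: wrap the synchronous BCL property in a completed task.
    public static Task<DateTime> GetLocalTime()
    {
        return Task.FromResult(DateTime.Now);
    }
}

// Shadows System.Console, so the BCL class is fully qualified below.
public static class Console
{
    public static Task<string> ReadLine()
    {
        return System.Console.In.ReadLineAsync();
    }

    public static Task WriteLine(string value)
    {
        return System.Console.Out.WriteLineAsync(value);
    }
}
```

Note that each of these methods performs, or at least starts, its work as soon as it's called. The returned task only reports the outcome.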
Modern versions of C# enable you to write asynchronous entry points, so the hello world example shown in this article series becomes:
    static async Task Main(string[] args)
    {
        await Console.WriteLine("What's your name?");
        var name = await Console.ReadLine();

        var now = await Clock.GetLocalTime();
        var greeting = Greeter.Greet(now, name);

        await Console.WriteLine(greeting);
    }
That's nice idiomatic C# code, so what's not to like?
No referential transparency
The above Main example is probably as good as it's going to get in C#. I've nothing against that style of C# programming, but you shouldn't believe that this gives you compile-time checking of referential transparency. It doesn't.
Consider a simple function like this, written using the IO container shown in previous articles:

    public static string AmIEvil()
    {
        Console.WriteLine("Side effect!");
        return "No, I'm not.";
    }
Is this method referentially transparent? Surprisingly, despite the apparent side effect, it is. The reason becomes clearer if you write the code so that it explicitly ignores the return value:
    public static string AmIEvil()
    {
        IO<Unit> _ = Console.WriteLine("Side effect!");
        return "No, I'm not.";
    }
The Console.WriteLine method returns an object that represents a computation that might take place. This IO<Unit> object, however, never escapes the method, and thus never runs. The side effect never takes place, which means that the method is referentially transparent. You can replace AmIEvil() with its return value "No, I'm not.", and your program would behave exactly the same.
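If you don't have the earlier articles at hand, the following minimal sketch shows why nothing happens. It's a hypothetical stand-in, not necessarily the IO implementation from the series: the names Unit.Instance, UnsafeRun, IODemo, and the SideEffects counter are assumptions made for illustration. The essential point is that the container only stores a delegate.

```csharp
using System;

// Hypothetical stand-in for a unit type.
public sealed class Unit
{
    public static readonly Unit Instance = new Unit();
    private Unit() { }
}

public sealed class IO<T>
{
    private readonly Func<T> computation;

    // Storing the delegate doesn't execute it.
    public IO(Func<T> computation)
    {
        this.computation = computation;
    }

    // Only a top-level runner is supposed to call this.
    public T UnsafeRun()
    {
        return computation();
    }
}

public static class IODemo
{
    // Counts how often the side effect actually ran (demo only).
    public static int SideEffects = 0;

    public static IO<Unit> WriteLine(string value)
    {
        return new IO<Unit>(() =>
        {
            SideEffects++;
            Console.WriteLine(value);
            return Unit.Instance;
        });
    }

    public static string AmIEvil()
    {
        IO<Unit> _ = WriteLine("Side effect!"); // created, never run
        return "No, I'm not.";
    }
}
```

After calling IODemo.AmIEvil(), the counter is still zero, because nothing ever invoked the stored delegate.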
Consider what happens when you replace IO with Task:

    public static string AmIEvil()
    {
        Task _ = Console.WriteLine("Side effect!");
        return "Yes, I am.";
    }
Is this method a pure function? No, it's not. The problem is that the most common way that .NET libraries return tasks is that the task is already running when it's returned. This is also the case here. As soon as you call this version of Console.WriteLine, the task starts running on a background thread. Even though you ignore the task and return a plain string, the side effect sooner or later takes place. You can't replace a call to AmIEvil() with its return value. If you did, the side effect wouldn't happen, and that would change the behaviour of your program.
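You can observe this with a small experiment. The sketch below assumes an adapter implemented with Task.Run, which is only one of several ways such a method could be written; the HotTaskDemo class and its SideEffects counter are hypothetical names used to make the effect visible.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class HotTaskDemo
{
    // Counts how often the side effect actually ran (demo only).
    public static int SideEffects = 0;

    // Task.Run queues the delegate to the thread pool right away,
    // so the returned task is already 'hot'.
    public static Task WriteLine(string value)
    {
        return Task.Run(() =>
        {
            Interlocked.Increment(ref SideEffects);
            Console.WriteLine(value);
        });
    }

    public static string AmIEvil()
    {
        Task _ = WriteLine("Side effect!"); // ignored, but still runs
        return "Yes, I am.";
    }
}
```

Although AmIEvil never awaits the task, the counter is incremented shortly after the call, which is exactly the behaviour that breaks referential transparency.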
Contrary to IO, tasks don't guarantee referential transparency.
Conclusion
While it'd be technically possible to make C# distinguish between pure and impure code at compile time, it'd require such a breaking reset to the entire .NET ecosystem that it's unrealistic to hope for. It seems, though, that there's enough overlap with the design of IO<T> and task asynchronous programming that the latter might fill that role.
Unfortunately, it doesn't, because it fails to guarantee referential transparency. It's better than nothing, though. Most C# programmers have now learned that while Task objects come with a Result property, you shouldn't use it. Instead, you should write your entire program using async and await. That, at least, takes you halfway towards where you want to be.
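As an illustration of the difference, consider this sketch. The ResultVersusAwait class and its members are hypothetical; the point is only to contrast the two return types.

```csharp
using System;
using System.Threading.Tasks;

public static class ResultVersusAwait
{
    // Hypothetical impure action, marked as such by its Task return type.
    public static Task<DateTime> GetLocalTime()
    {
        return Task.FromResult(DateTime.Now);
    }

    // Discouraged: Result blocks the calling thread, and the plain int
    // return type makes this impure method look pure to its callers.
    public static int CurrentYearBlocking()
    {
        return GetLocalTime().Result.Year;
    }

    // Preferred: awaiting makes the Task return type propagate,
    // so callers can still see that something impure is going on.
    public static async Task<int> CurrentYear()
    {
        DateTime now = await GetLocalTime();
        return now.Year;
    }
}
```

With await, the Task return type spreads to every caller, much like IO would. With Result, the signal is lost at the first method that uses it.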
The compiler, on the other hand, doesn't help you when it comes to those impure actions that look pure. Neither does it protect you against asynchronous side effects. Diligence, code reviews, and programming discipline are still required if you want to separate pure functions from impure actions.
Comments
This is a great idea. It seems like the only problem with tasks is that they are usually already started, either on the current or a worker thread. If we return tasks that are not started yet, then side effects don't happen until we await them. And we have to await them to get their result or use them in other tasks. I experimented with a modified GetTime() method returning a new task that is not run yet:
Using a SelectMany method that ensures that tasks have been run, the time is not evaluated until the resulting task is awaited or another SelectMany is built using the task from the first SelectMany. The time of one such task is also evaluated only once. On repeated calls, the same result is returned:
Since I/O operations with side effects are usually asynchronous anyway, tasks and I/O are a good match.
Consistently not starting tasks and using this SelectMany method either ensures method purity or forces you to return the task. To avoid ambiguity with started tasks, a wrapper IO class could be constructed that always takes and creates unstarted tasks. Am I missing something, or do you think this would not be worth the effort? Are there more idiomatic ways to start enforcing purity in C#, other than, e.g., using the [Pure] attribute and StyleCop warnings for unused return values?
Matt, thank you for writing. That's essentially how F# asynchronous workflows work.
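The unstarted-task idea discussed in these comments can be sketched in C# as follows. This is not the commenter's original code, which isn't reproduced here; the ColdClock class, its Evaluations counter, and the Demo method are hypothetical. The Task<T> constructor, TaskStatus.Created, and RunSynchronously are existing BCL members.

```csharp
using System;
using System.Threading.Tasks;

public static class ColdClock
{
    // Counts how often the clock was actually read (demo only).
    public static int Evaluations = 0;

    // The Task<T> constructor creates a task without starting it.
    public static Task<DateTime> GetTime()
    {
        return new Task<DateTime>(() =>
        {
            Evaluations++;
            return DateTime.Now;
        });
    }

    public static bool Demo()
    {
        Task<DateTime> task = GetTime();
        bool cold = task.Status == TaskStatus.Created && Evaluations == 0;

        task.RunSynchronously(); // now the clock is read, exactly once
        DateTime first = task.Result;
        DateTime second = task.Result; // cached; not evaluated again

        return cold && first == second && Evaluations == 1;
    }
}
```

Be aware that await on its own doesn't start such a task, so awaiting an unstarted task never completes. Something, like the SelectMany method described in the comment, has to start it.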