Traversing lists of IO. A refactoring.

This article is part of a series named Alternative ways to design with functional programming. In the previous article, you saw how to refactor the example code base to a composition of standard F# combinators. It's a pragmatic solution to the problem of dealing with lots of data in a piecemeal fashion, but although it uses concepts and programming constructs from functional programming, I don't consider it a proper functional architecture.

You'd expect the Haskell version to be the most idiomatic of the three language variations, but ironically, I had more trouble making the code in this article look nice than I had with the F# variation. You'll see what the problem is later, but it boils down to a combination of Haskell's right-to-left default composition order, and precedence rules of some of the operators.

Please consult the previous articles for context about the example code base. The code shown in this article is from the combinators Git branch. It refactors the code shown in the article Porting song recommendations to Haskell.

The goal is to extract pure functions from the overall recommendations algorithm and compose them using standard combinators, such as =<<, <$>, and traverse.

Getting rid of local mutation #

My first goal was to get rid of the IORef-based local mutation shown in the 'baseline' code base. That wasn't too difficult. If you're interested in the micro-commits I made to get to that milestone, you can consult the Git repository. The interim result looked like this:

getRecommendations srvc un = do
  -- 1. Get user's own top scrobbles
  -- 2. Get other users who listened to the same songs
  -- 3. Get top scrobbles of those users
  -- 4. Aggregate the songs into recommendations

  -- Impure
  scrobbles <- getTopScrobbles srvc un
 
  -- Pure
  let scrobblesSnapshot = take 100 $ sortOn (Down . scrobbleCount) scrobbles
 
  -- Impure
  recommendations <-
    join <$>
    traverse (\scrobble ->
      fmap join $
      traverse (\otherListener ->
        fmap scrobbledSong .
        take 10 .
        sortOn (Down . songRating . scrobbledSong) .
        filter (songHasVerifiedArtist . scrobbledSong) <$>
        getTopScrobbles srvc (userName otherListener)) .
      take 20 <$>
      sortOn (Down . userScrobbleCount) .
      filter ((10_000 <=) . userScrobbleCount) =<<
      getTopListeners srvc (songId $ scrobbledSong scrobble))
    scrobblesSnapshot
 
  -- Pure
  return $ take 200 $ sortOn (Down . songRating) recommendations

Granted, it's not the most readable way to present the algorithm, but it is, after all, only an intermediate step. As usual, I'll remind the reader that Haskell code should, by default, be read from right to left. When split over multiple lines, this also means that an expression should be read from the bottom to the top. Armed with that knowledge (and general knowledge of Haskell), combined with some helpful indentation, it's not altogether unreadable, but not something I'd like to come back to after half a year. And definitely not something I would foist upon (hypothetical) colleagues.

The careful reader may notice that I've decided to use the reverse bind operator =<<, rather than the standard >>= operator. I usually do that with Haskell, because most of Haskell is composed from right to left, and =<< is consistent with that direction. The standard >>= operator, on the other hand, composes monadic actions from left to right. You could argue that that's more natural (to Western audiences), but since everything else stays right-to-left biased, using >>= confuses the reading direction.

As a Westerner, I prefer left-to-right reading order, but in general I've found it hard to fight Haskell's bias in the other direction.

As the -- Pure and -- Impure comments indicate, interleaving the pure functions with impure actions makes the entire expression impure. The more I do that, the less pure code remains.

Single expression #

Going from from the above snapshot to a single impure expression doesn't require many more steps.

getRecommendations srvc un =
  -- 1. Get user's own top scrobbles
  -- 2. Get other users who listened to the same songs
  -- 3. Get top scrobbles of those users
  -- 4. Aggregate the songs into recommendations

  take 200 . sortOn (Down . songRating) <$>
  ((\scrobbles ->
    join <$>
    traverse (\scrobble ->
      fmap join $
      traverse (\otherListener ->
        fmap scrobbledSong .
        take 10 .
        sortOn (Down . songRating . scrobbledSong) .
        filter (songHasVerifiedArtist . scrobbledSong) <$>
        getTopScrobbles srvc (userName otherListener)) .
      take 20 <$>
      sortOn (Down . userScrobbleCount) .
      filter ((10_000 <=) . userScrobbleCount) =<<
      getTopListeners srvc (songId $ scrobbledSong scrobble))
    (take 100 $ sortOn (Down . scrobbleCount) scrobbles)) =<<
  getTopScrobbles srvc un)

Neither did it improve readability.

Helper functions #

As in previous incarnations of this exercise, it helps if you extract some well-named helper functions, like this one:

getUsersOwnTopScrobbles :: [Scrobble-> [Scrobble]
getUsersOwnTopScrobbles = take 100 . sortOn (Down . scrobbleCount)

As a one-liner, that one perhaps isn't that impressive, but none of them are particularly complicated. The biggest function is this:

getTopScrobblesOfOtherUsers :: [Scrobble-> [Song]
getTopScrobblesOfOtherUsers =
  fmap scrobbledSong .
  take 10 .
  sortOn (Down . songRating . scrobbledSong) .
  filter (songHasVerifiedArtist . scrobbledSong)

You can see the rest in the Git repository. None of them are exported by the module, which makes them implementation details that you may decide to change or remove at a later date.

You can now compose the overall action.

getRecommendations srvc un =
  -- 1. Get user's own top scrobbles
  -- 2. Get other users who listened to the same songs
  -- 3. Get top scrobbles of those users
  -- 4. Aggregate the songs into recommendations

  (aggregateTheSongsIntoRecommendations . getTopScrobblesOfOtherUsers) . join <$>
  ((traverse (getTopScrobbles srvc . userName) .
      getOtherUsersWhoListenedToTheSameSongs) . join =<<
    (traverse (getTopListeners srvc . (songId . scrobbledSong)) .
      getUsersOwnTopScrobbles =<< getTopScrobbles srvc un))

Some of the parentheses break over multiple lines in a non-conventional way. This is my best effort to format the code in a way that emphasises the four steps comprising the algorithm, while still staying within the bounds of the language, and keeping hlint silent.

I could try to argue that if you squint a bit, the operators and other glue like join should fade into the background, but in this case, I don't even buy that argument myself.

It bothers me that it's so hard to compose the code in a way that approaches being self-documenting. I find that the F# composition in the previous article does a better job of that.

Syntactic sugar #

The stated goal in this article is to demonstrate how it's possible to use standard combinators to glue the algorithm together. I've been complaining throughout this article that, while possible, it leaves the code less readable than desired.

That one reader who actually knows Haskell is likely frustrated with me. After all, the language does offer a way out. Using the syntactic sugar of do notation, you can instead write the composition like this:

getRecommendations srvc un = do
  -- 1. Get user's own top scrobbles
  -- 2. Get other users who listened to the same songs
  -- 3. Get top scrobbles of those users
  -- 4. Aggregate the songs into recommendations

  userTops <- getTopScrobbles srvc un <&> getUsersOwnTopScrobbles
  otherListeners <- 
    traverse (getTopListeners srvc . (songId . scrobbledSong)) userTops <&>
    getOtherUsersWhoListenedToTheSameSongs . join
  songs <-
    traverse (getTopScrobbles srvc . userName) otherListeners <&>
    getTopScrobblesOfOtherUsers . join
  return $ aggregateTheSongsIntoRecommendations songs

By splitting the process up into steps with named variables, you can achieve the much-yearned-for top-to-bottom reading order. Taking advantage of the <&> operator from Data.Functor we also get left-to-right reading order on each line.

That's the best I've been able to achieve under the constraint that the IO-bound operations stay interleaved with pure functions.

Conclusion #

Mixing pure functions with impure actions like this is necessary when composing whole programs (usually at the entry point; i.e. main), but shouldn't be considered good functional-programming style in general. The entire getRecommendations action is impure, being non-deterministic.

Still, even Haskell code eventually needs to compose code in this way. Therefore, it's relevant covering how this may be done. Even so, alternative architectures exist.

Next: Song recommendations with pipes and filters.



Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Monday, 30 June 2025 05:54:00 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!
Published: Monday, 30 June 2025 05:54:00 UTC