When x, y, and z are great variable names by Mark Seemann
A common complaint against Functional Programming is the terse naming: x and y for variables, and f for functions. There are good reasons behind these names, though. Learn to love them here.
One of the facets of Functional Programming that bothered me when I first started to look into it is that, in examples, variables and functions often have terribly terse names like x, y, f, and so on. I wasn't alone in feeling like that, either:
"Functional programmer: (noun) One who names variables "x", names functions "f", and names code patterns "zygohistomorphic prepromorphism"" - James Iry
In this article, I'm not going to discuss zygohistomorphic prepromorphism, but I am going to discuss names like x and f.
Descriptive names
When I started my Functional Programming journey, I came from a SOLID Object-Oriented background, and I had read and internalised Clean Code - or so I thought.
Readable code should have descriptive names, and f and x hardly seem descriptive.
For a while, I thought that the underlying reason for those 'poor' names was that the people writing all that Functional example code were academics with little practical experience in software development. It seems I'm not the only person who had that thought.
It may be true that Functional Programming has a root in mathematics, and that it has grown out of academia rather than industry, but there are good reasons that some names seem rather generic.
Generics
In statically typed Functional languages like F# or Haskell, you rarely declare the types of functions and arguments. Instead, types are inferred, based on usage or implementation. It often turns out that a function is more generic than you thought when you started writing it.
Here's a simple example. When I did the Diamond kata with Property-Based Testing, I created this little helper function along the way:
let isTwoIdenticalLetters x =
    let hasIdenticalLetters = x |> Seq.distinct |> Seq.length = 1
    let hasTwoLetters = x |> Seq.length = 2
    hasIdenticalLetters && hasTwoLetters
As the name of the function suggests, it tells us if x is a string of two identical letters. It returns true for strings such as "ff", "AA", and "11", but false for values like "ab", "aA", and "TTT".
Okay, so there's already an x there, but this function works on any string, so what else should I have called it? In C#, I'd probably have called it text, but that's at best negligibly better than x.
Would you say that, based on the nice, descriptive name isTwoIdenticalLetters, you understand what the function does?
That may not be the case.
Consider the function's type: seq<'a> -> bool when 'a : equality. What!? That's not what we expected! Where's the string?
This function is more generic than I had in mind when I wrote it. System.String implements seq<char>, but this function can accept any seq<'a> (IEnumerable<T>), as long as the type argument 'a supports equality comparison.
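To make the inference concrete, here is a minimal sketch that contrasts the unannotated function with an annotated one. The wrapper isTwoIdenticalLettersInText and the value names below are hypothetical and only serve as illustration; they're not part of the original kata code.

```fsharp
// No annotations: the compiler infers
// seq<'a> -> bool when 'a : equality
let isTwoIdenticalLetters x =
    let hasIdenticalLetters = x |> Seq.distinct |> Seq.length = 1
    let hasTwoLetters = x |> Seq.length = 2
    hasIdenticalLetters && hasTwoLetters

// Hypothetical wrapper: the annotation pins the type to string -> bool,
// so only here would 'text' be an honest argument name.
let isTwoIdenticalLettersInText (text : string) =
    isTwoIdenticalLetters text

// A string is accepted because System.String is a seq<char>.
let fromString = isTwoIdenticalLetters "ff"
// A list of integers is accepted by the very same function.
let fromInts = isTwoIdenticalLetters [42; 42]
// The annotated wrapper only accepts strings.
let fromText = isTwoIdenticalLettersInText "ab"
```

Without the annotation, nothing in the implementation ties x to text, which is why the compiler is free to pick the more general type.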
So it turns out that text would have been a bad argument name after all. Perhaps xs would have been better than x, in order to indicate the plural nature of the argument, but that's about as much meaning as we can put into it. After all, this all works as well:
> isTwoIdenticalLetters [1; 1];;
val it : bool = true
> isTwoIdenticalLetters [TimeSpan.FromMinutes 1.; TimeSpan.FromMinutes 1.];;
val it : bool = true
> isTwoIdenticalLetters [true; true; true];;
val it : bool = false
That function name is misleading, so you'd want to rename it:
let isTwoIdenticalElements x =
    let hasIdenticalLetters = x |> Seq.distinct |> Seq.length = 1
    let hasTwoLetters = x |> Seq.length = 2
    hasIdenticalLetters && hasTwoLetters
That's better, but now the names of the values hasIdenticalLetters and hasTwoLetters are misleading as well. Both are boolean values, but they're not particularly about letters.
This may be more honest:
let isTwoIdenticalElements x =
    let hasIdenticalElements = x |> Seq.distinct |> Seq.length = 1
    let hasTwoElements = x |> Seq.length = 2
    hasIdenticalElements && hasTwoElements
This is better, but now I'm beginning to realize that I've been thinking too much about strings and letters, and not about the more general question this function apparently answers. A more straightforward (depending on your perspective) implementation may be this:
let isTwoIdenticalElements x =
    match x |> Seq.truncate 3 |> Seq.toList with
    | [y; z] -> y = z
    | _ -> false
This may be slightly more efficient, because it doesn't have to traverse the sequence twice, but most importantly, I think it looks more idiomatic.
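A related consequence of truncating first is that the function reads at most three elements, so it also returns for lazy, unbounded sequences. The sketch below illustrates that; the names naturals and result are assumptions made for the example.

```fsharp
let isTwoIdenticalElements x =
    match x |> Seq.truncate 3 |> Seq.toList with
    | [y; z] -> y = z
    | _ -> false

// A lazy, unbounded sequence: 0, 1, 2, ...
let naturals = Seq.initInfinite id

// Seq.truncate 3 stops after three elements, so the match sees
// a three-element list and falls through to the wildcard case.
let result = isTwoIdenticalElements naturals
```

The earlier versions call Seq.length, which has to walk the entire sequence before it can answer, so they're a poor fit for input like this.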
Notice the return of 'Functional' names like y and z. Although terse, these are appropriate names. Both y and z are values of the generic type argument 'a. If not y and z, then what would you call them? element1 and element2? How would those names be better?
Because of F#'s strong type inference, you'll frequently find that if you use as few type annotations as possible, functions turn out to be generic, both in the technical sense of the word and in its everyday English sense.
Likewise, when you create higher-order functions, functions passed in as arguments are often generic as well. Such a function could sometimes be any function that matches the required type, which means that f is often the most appropriate name for it.
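As a small illustration, consider a hypothetical higher-order function, applyTwice. It isn't from the Diamond kata; it's only a sketch of the situation described above, where the function argument can be anything of the right shape.

```fsharp
// Hypothetical helper: applies f to x, and then f to the result.
// Inferred type: ('a -> 'a) -> 'a -> 'a
let applyTwice f x = f (f x)

// f can add two to an integer...
let six = applyTwice (fun n -> n + 2) 2
// ...or append an exclamation mark to a string.
let shout = applyTwice (fun (s : string) -> s + "!") "hey"
```

Since f may be any function from 'a to 'a, a name like increment or transformText would promise more than the type does.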
Scope
Another problem I had with the Functional naming style when I started writing F# code was that names were often short. Having done Object-Oriented Programming for years, I'd learned that names should be sufficiently long to be descriptive. As Code Complete explains, teamMemberCount is better than tmc.
Using that argument, you'd think that element1 and element2 are better names than y and z. Let's try:
let isTwoIdenticalElements x =
    match x |> Seq.truncate 3 |> Seq.toList with
    | [element1; element2] -> element1 = element2
    | _ -> false
At this point, the discussion becomes subjective, but I don't think this change is helpful. Quite the contrary, these longer names only seem to add more noise to the code. Originally, the distance between where y and z are introduced and where they're used was only a few characters. In the case of z, that distance was 9 characters. After the rename, the distance between where element2 is introduced and used is now 16 characters.
There's nothing new about this. Remarkably, I can find support for my dislike of long names in small scopes in Clean Code (which isn't about Functional Programming at all). In the last chapter about smells and heuristics, Robert C. Martin has this to say about scope and naming:
"The length of a name should be related to the length of the scope. You can use very short variable names for tiny scopes, but for big scopes you should use longer names.
"Variable names like i and j are just fine if their scope is five lines long."
Do you use variable names like i in for loops in C# or Java? I do, so I find it appropriate to also use short names in small functions in F# and Haskell.
Well-factored Functional code consists of small, concise functions, just as well-factored SOLID code consists of small classes with clear responsibilities. When functions are small, scopes are small, so it's only natural that we encounter many tersely named variables like x, y, and f.
It's more readable that way.
Summary
There are at least two good reasons for naming values and functions with short names like f, x, and y.
- Functions are sometimes so generic that we can't say anything more meaningful about such values.
- Scopes are small, so short names are more readable than long names.
Comments
Dave, thank you for writing. FWIW, I don't think there's anything wrong with longer camel-cased names when a function or a value is more explicit. As an example, I still kept the name of the example function fairly long and explicit: isTwoIdenticalElements.
When I started with F#, I had many years of experience with writing C# code, and in the beginning, my F# code was more verbose than it is today. What I'm trying to explain with this article isn't that the short names are terse for the sake of being terse, but rather because sometimes, the functions and values are so generic that they could be anything. When that happens, f and x are good names. When functions and values are less generic, the names still ought to be more descriptive.