Code readability hypotheses by Mark Seemann
This post proposes a scientific experiment about code readability.
Once in a while someone tries to drag me into one of many so-called 'religious' debates over code readability:
- Tabs versus spaces?
- Where should the curly braces go?
- Should class members be prefixed (e.g. with an underscore)?
- Is the `this` C# keyword redundant?
In most cases, the argument revolves around what is most readable.
Let's look at the C# keyword `this` as an example. Many programmers feel that it's redundant because they can omit it without changing the program. The compiler doesn't care. However, as Martin Fowler wrote in Refactoring: "Any fool can write code that a computer can understand. Good programmers write code that humans can understand." Code is read much more than written, so I don't buy the argument about redundancy.
There are lots of circumstances when code is read. Often, programmers read code from inside their IDE. This has been used as an argument against Hungarian notation. However, at other times, you may be reading code as part of a review process - e.g. on GitHub or CodePlex. Or you may be reading code on a blog post, or in a book.
Often I feel that the `this` keyword helps me because it provides information about the scope of a variable. However, I also preach that code should consist of small classes with small methods, and if you follow that principle, variable scoping should be easily visible on a single screen.
Another argument against using the `this` keyword is that it's simply noise that makes the program harder to read, because there's more text to read. Such a statement is impossible to refute with logic, because it depends on the person reading the code.
Ultimately, the readability of code depends on circumstances and is highly subjective. For that reason, we can't arrive at a conclusion to any of those 'religious' debates by force of logic. What we could do, however, is perform a set of scientific experiments to figure out what is most readable.
A science experiment idea for future computer scientists
Here's an idea for a computer science experiment:
- Pick a 'religious' programming debate (e.g. whether the `this` keyword enhances or reduces readability).
- Form a hypothesis (e.g. predict that the `this` keyword enhances readability).
- Prepare a set of simple code listings, each with two alternatives (with and without `this`).
- Randomly select two groups of test subjects.
- Ask each group to explain to you what the code does (what is the result of calling a method, etc.). One group gets one alternative set, and the other group gets the other set.
- Measure how quickly members of each group arrive at a correct conclusion.
- Compare results against the original hypothesis.
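The assignment and comparison steps above could be sketched roughly as follows. This is only an illustration under assumptions of my own: the function names `assign_groups` and `permutation_test` are hypothetical, the post doesn't prescribe any particular statistical method, and a permutation test on the difference in mean time is just one reasonable way to compare two groups.

```python
import random
import statistics


def assign_groups(subjects, seed=None):
    """Randomly split the subjects into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def permutation_test(times_a, times_b, rounds=10_000, seed=None):
    """Compare the mean time-to-correct-answer of two groups.

    Returns the observed difference in means (a minus b) and the share of
    random relabellings whose difference is at least as large in magnitude.
    A small share suggests the observed difference is unlikely to be chance.
    """
    rng = random.Random(seed)
    observed = statistics.mean(times_a) - statistics.mean(times_b)
    pooled = list(times_a) + list(times_b)
    n_a = len(times_a)
    extreme = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / rounds


if __name__ == "__main__":
    # Hypothetical subject IDs; a real study would need far more people.
    with_this, without_this = assign_groups(range(20), seed=1)
    print(len(with_this), len(without_this))

    # Placeholder timings in seconds, invented only to exercise the code.
    # They are NOT measurements and say nothing about the actual question.
    observed, share = permutation_test(
        [41.0, 38.5, 44.2, 40.1], [47.1, 52.3, 45.0, 49.8], rounds=2000, seed=1
    )
    print(observed, share)
```

Only subjects who reach a correct conclusion should contribute a timing. How to treat those who don't is a design decision the sketch leaves open.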
Code readability hypotheses by Mark Seemann is licensed under a Creative Commons Attribution 3.0 Unported License.
I believe that these 'religious' debates are the result of overreliance on some popular productivity tools. If these tools had never existed, those debates would come up less often.
Where possible, I use `this`. From a code reader's perspective, I also prefer `this` since I can immediately understand whether a member of a class is a static member or an instance member. After all, it is possible to prefix a static member with `_` but not with `this`.
The important thing that emerges from this interesting post is that readability is a key feature of code.
I'm a fan of this concept: code is meant to be read. A person able to program is not a programmer, just as a person able to write is not a writer.
The good compromise is common sense.
The proposed scientific experiment could give us misleading information, since the human factor is a problematic variable to put into the equation (not to mention Heisenberg); but I would be curious to see the results.
Coming back to the `this` keyword, my personal opinion is that it can be omitted, if your naming convention supplies a readable alternative (like the `_` underscore prefix before private instance members).
I also agree that code should be readable.
That being said, the proposed method for testing the hypothesis does not hold up from a statistical point of view.
Just to mention a few of the issues:
The word 'religious' is mentioned and this is the main problem. Unlike science, which is based on deduction, religion is based on belief (or faith) and, precisely because of this, cannot be tested or verified.
Karsten, I think you're reading practical issues into my proposal. Would you agree that all but the last of your reservations could be addressed if one were to pick large enough groups of test subjects, without selection bias?
That's what I meant when I wrote "Randomly select two groups of test subjects".
If one were able to pick large enough groups of people, one should be able to control for variables such as demographics, experience, initial biases, and so on.
Your last reservation, obviously, is fundamentally different. How many people would one need in order to be able to control for such variables in a statistically sound fashion? Probably thousands, if not tens of thousands, of programmers, including professional developers. This precludes the use of normal test subject populations of first-year students.
Clearly, compiling such a group of test subjects would be a formidable undertaking. It'd be expensive.
My original argument, however, was that I consider it possible to scientifically examine contentious topics of programming. I didn't claim that it was realistic that anyone would fund such research.
To be clear, however, the main point of the entire article is that we can't reach any conclusions about these controversial issues by arguing about them. If we wish to settle them, we'll have to invest in scientific experiments. I have no illusions that there's any desire to invest the necessary funds and work into answering these questions, but I would love to be proven wrong.
There is a huge problem in the way that we evaluate software quality. Karsten is naming many of the reasons that make it difficult. But given that it is difficult, the only rational way to solve it without relying as heavily on credentials, personality and repetition is to establish guidelines and have multiple opinions/measurements of code samples.
I think we overvalue experience in evaluating code. On the contrary, there are real issues with expert bias, particularly among older programmers who have good credentials but are more financially and emotionally invested in a particular technology. In general, it is much more difficult to write good code than to read it. Therefore, most programmers can read, understand and appreciate good code, even though they may not have been able to write it from scratch themselves. The only exception would be true neophytes (whom there may be ways to exclude). We also need to be leery of software qualities that are hard to grasp and evaluate. Unexplainable or unclear instinct can lead to bad outcomes.
There is a real opportunity here to quantify aspects of code quality more scientifically. I wish more people devoted time to it.