This post proposes a scientific experiment about code readability

Once in a while someone tries to drag me into one of many so-called 'religious' debates over code readability:

  • Tabs versus spaces?
  • Where should the curly braces go?
  • Should class members be prefixed (e.g. with an underscore)?
  • Is the this C# keyword redundant?
  • etc.

In most cases, the argument revolves around what is most readable.

Let's look at the C# keyword this as an example. Many programmers feel that it's redundant because they can omit it without changing the program. The compiler doesn't care. However, as Martin Fowler wrote in Refactoring: "Any fool can write code that a computer can understand. Good programmers write code that humans can understand." Code is read much more than written, so I don't buy the argument about redundancy.

There are lots of circumstances when code is read. Often, programmers read code from inside their IDE. This has been used as an argument against Hungarian notation. However, at other times, you may be reading code as part of a review process - e.g. on GitHub or CodePlex. Or you may be reading code on a blog post, or in a book.

Often I feel that the this keyword helps me because it provides information about the scope of a variable. However, I also preach that code should consist of small classes with small methods, and if you follow that principle, variable scoping should be easily visible on a single screen.

Another argument against using the this keyword is that it's simply noise and that it makes it harder to read the program, simply because there's more text to read. Such a statement is impossible to refute with logic, because it depends on the person reading the code.

Ultimately, the readability of code depends on circumstances and is highly subjective. For that reason, we can't arrive at a conclusion to any of those 'religious' debates by force of logic. However, what we could do, on the other hand, is to perform a set of scientific experiments to figure out what is most readable.

A science experiment idea for future computer scientists

Here's an idea for a computer science experiment:

  1. Pick a 'religious' programming debate (e.g. whether or not the this keyword enhances or reduces readability).
  2. Form a hypothesis (e.g. predict that the this keyword enhances readability).
  3. Prepare a set of simple code listings, each with two alternatives (with and without this).
  4. Randomly select two groubs of test subjects.
  5. Ask each group to explain to you what the code does (what is the result of calling a method, etc.). One group gets one alternative set, and the other group gets the other set.
  6. Measure how quickly members of each group arrives at a correct conclusion.
  7. Compare results against original hypothesis.
Creative Commons License
Code readability hypotheses by Mark Seemann is licensed under a Creative Commons Attribution 3.0 Unported License.


Comments

Totally agreed.

I believe that these 'religious' debates is the result of overreliance to some popular Productivity Tools. If these tools never existed those debates would appear less often.

Where possible, I use this and base.

From a code reader's perspective, I also prefer this since I can immediately understand whether a member of a class is either a static member or an instance member. After all, it is possible to prefix a static member with _ but not with this.

2013-03-04 14:04 UTC

The import thing that emerges from this interesting post is that readabiliy is a key feature of code.

I'm a fun of this concept: code is meant to be read. A person able to program is not a programmer, like a person able to write is not a writer.

The good compromise is common sense.

The scenitifical experiment proposed could give use misleading informations, since the human factor is a problematic variable to put into the equation (not to mention Heisenberg); but I would be curious to see results.

Coming back to the this keyword, my personal opinion is that it can be omitted, if your naming convention supplies a readable alternative (like the _ underscore prefix before private instance members).

2013-03-04 18:57 UTC

I also agree that code should be readable.

However, this being said, the proposed method for testing the hypothesis does not hold from a statistical point of view.

Just to mentionen a few of the issues:

  1. The two groups could have different leves of experience which would influence their 'ability' to read and understand the code
  2. Demografics such as age, gender, native language, education, ..., would also influence the result
  3. The individual members of the group could be biased towards one or the other, in the used example for or against the use of the this keyword
  4. Simply to compile large enough groups to ensure a low statistics error would be an issue

The word 'religious' is mentioned and this is the main problem. Different from science that is based on deduction, religion is based on beliefe (or faith) and precisely because of this cannot be tested or verified.

2013-03-12 12:00 UTC


Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or Google Plus, or somewhere else with a permalink. Ping me with the link, and I may add it as a comment.

Published

Monday, 04 March 2013 13:21:00 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!