Saturday, June 13, 2009

On matters of code style

(Or, “You know where you can shove your curly brace!”)

Source code is written documentation. It is the best, most accurate, in fact perhaps *only* accurate documentation you have. It completely and perfectly describes what your application does. No requirements specification, technical specification, UML diagram or anything else comes close to the comprehensive precision of your source code.

Now, lots of written material is read for entertainment: the perhaps-soon-to-be-obsolete newspaper, novels, even, for many people, non-fiction. But the first rule of software documentation is that nobody reads it unless they have to. Except for a rare few, reading source code is not entertainment. When people read software documentation, they are reading in pursuit of some goal: an emergency bugfix perhaps, or working out how to fit in a change for some new functionality. Perhaps they are looking to improve application performance, or to lift some existing code as the basis for another project.

Whatever it is, people have an objective when reading software documentation. And like I said, your source code is your *best* documentation. Unlike those written specifications and pretty pictures, it is always up to date. It doesn’t suffer from the vagaries and ambiguities of the English language. It doesn’t require updating to reflect the latest design and behavior; it *is* the design and behavior.

Any non-trivial piece of software typically has quite a lengthy shelf life. It will often be far longer than initially anticipated, typically many years for business applications. Over its not inconsiderable life, with the ebb and flow of need, bugs will be fixed, features added, code refactored, performance enhanced, third-party libraries updated. All of this by a (more than likely) ever-changing team.

Now most, if not all, source code is read on screen. Some people might like to print it out for scrutiny and reflection, but not the majority, whether seeking out a bug or peer reviewing a colleague’s work. Reading on screen is hard. Today I stumbled across a copy of Jakob Nielsen’s “Designing Web Usability” when rooting about in my furnace room. I dipped into it as I remembered him having something to say about content being read on screen. According to him people read 25% slower on screen, and in a study he conducted 79% of people scanned or skimmed rather than read word for word. I think it’s pretty fair to say that this is *exactly* how software engineers read code.

Consider then the trio of factors here:
  1. source code is the best form of documentation you have for an application
  2. it lasts a long time and will be read by a variety of people over its lifetime
  3. it will be read on screen, likely skimmed and scanned by busy people on a mission

It seems blindingly clear to me, then, that the layout and style of your code are important. People need to be able to read, digest and work with source code as quickly as possible. I don’t think it’s any exaggeration to claim that many hundreds of hours of time and thousands of dollars can be saved over an application’s life if the code is easy to read. A simple back-of-the-envelope calculation would demonstrate this for almost all business software.
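
As a rough sketch of that back-of-the-envelope calculation, here is one way to run the numbers in Java. Every figure in it (team size, reading sessions per year, years in service, minutes saved per session, hourly cost) is an assumption picked purely for illustration, not a measurement, and the class and method names are hypothetical.

```java
// Back-of-the-envelope sketch. All inputs are assumed values, not measurements.
final class ReadabilityEstimate {

    /** Hours saved over the application's life if each reading session gets a little shorter. */
    static double hoursSaved(int    developers,
                             int    readingSessionsPerDeveloperPerYear,
                             int    yearsInService,
                             double minutesSavedPerSession) {

        double totalSessions = (double) developers
                             * readingSessionsPerDeveloperPerYear
                             * yearsInService;

        return totalSessions * minutesSavedPerSession / 60.0;
    }

    /** Converts saved hours into money at an assumed hourly cost. */
    static double dollarsSaved(double hoursSaved, double hourlyCostInDollars) {
        return hoursSaved * hourlyCostInDollars;
    }

    public static void main(String[] args) {
        // Assumed: 5 developers, 200 reading sessions each per year,
        // 5 years in service, 6 minutes saved per session, $75 per hour.
        double hours   = hoursSaved(5, 200, 5, 6.0);
        double dollars = dollarsSaved(hours, 75.0);

        System.out.printf("%.0f hours, $%.0f%n", hours, dollars);
    }
}
```

With those assumed inputs the estimate comes to 500 hours and $37,500. Plug in your own team's figures; the point is only that small per-read savings multiply over years and people.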

So we need to make our code easy to read, and to give it a layout and style that support all those subsequent times it will be read, debugged, modified, refactored and reused. This of course raises the question: what characteristics make code easy to read? I believe it can be done by adhering to two simple principles:
  1. write code that is easy to scan
  2. use identifiers that are descriptive

What makes your code easy to scan is going to vary a bit; it's going to depend upon the language, but I think there are some general things to keep in mind. Here's my list, spawned for the most part from writing code for the last 7 years or so in Java:

  • I want to be able to quickly see the start and end of classes. This is not normally an issue since there's one per file...unless you're using inner classes which can make things a little more exciting.
  • I want to be able to easily see significant independent pieces of code, specifically the start and end of methods, iterative blocks and conditionals
  • Method arguments should be highly scan-able
  • Judicious use of alignment and indentation can provide an easier to scan layout
  • Remember, white space is free. Use it to improve readability.
  • Any application of style and particular idioms should be used habitually. When you are working in some code and notice that it does not adhere to your preferred layout and style, you should consider changing it. It's an easy and quick improvement that will probably pay dividends.
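
To make a few of these points concrete, here is a small Java sketch of one possible layout. It is only an illustration of the list above, not a prescribed standard, and the class, method and parameter names are hypothetical.

```java
import java.util.List;

// Hypothetical class, invented only to illustrate layout.
final class InvoiceTotals {

    // One argument per line, with types and names aligned,
    // so the parameter list can be scanned top to bottom.
    static long totalInCents(List<Long> lineAmountsInCents,
                             long       discountInCents,
                             int        taxRatePercent) {

        // Blank lines separate the independent steps of the method.
        long subtotal = 0;
        for (long amount : lineAmountsInCents) {
            subtotal += amount;
        }

        // Braces even on a short conditional make its start and end easy to spot.
        long discounted = subtotal - discountInCents;
        if (discounted < 0) {
            discounted = 0;
        }

        // Aligned assignments let the eye run down the right-hand sides.
        long tax   = discounted * taxRatePercent / 100;
        long total = discounted + tax;

        return total;
    }
}
```

Whether you align parameters or assignments like this is a matter of taste. What matters more is that whichever choice you make is applied consistently across the code base.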

Descriptive identifiers need a bit of thinking. Sometimes the way something should be named is obvious. Sometimes, however, it's not, and merits discussion with peers. Sometimes a good name comes to you later, after your first stab at things. This is where the "rename" refactoring in IDEs is really useful. Poor naming is worth fixing; left untouched, it makes people waste mental energy that could be better spent. In particular give due consideration to:
  • Good clear names for packages, classes, methods and variables
  • Exploiting the existence of a domain-specific vocabulary. However, clarity is key; sometimes business users are inconsistent and you need to work hard to identify synonyms or subtle differences in seemingly interchangeable terms
  • Brevity. Overly long, unwieldy identifiers are not easily scannable. Similarly, strive to reduce code. Less is more and all that...
  • Avoiding redundancy in names, e.g. consider if you really need to prepend some system identifier to a series of classes, or if you have to append 'Servlet' to the end of all servlets that you write...
  • Using plain, direct language. Overly abstract terms are often unhelpful. Complexity is inevitably there in most systems, but aim to simplify terms when reasonable to do so.
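
As a hedged illustration of the naming points above, here is a short Java sketch. The subscription domain and every name in it are hypothetical, chosen only to show plain, brief, domain-flavoured identifiers.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical domain class. Named "Subscription", not "AcmeBillingSubscriptionObject":
// no system prefix and no redundant suffix.
final class Subscription {

    private final LocalDate renewalDate;

    Subscription(LocalDate renewalDate) {
        this.renewalDate = renewalDate;
    }

    // Reads as a plain question in the domain's own vocabulary.
    // A vaguer alternative would be something like check(LocalDate d).
    boolean isOverdueOn(LocalDate today) {
        return today.isAfter(renewalDate);
    }

    // Brief but still descriptive; compare with
    // calculateNumberOfDaysBetweenTodayAndSubscriptionRenewalDate.
    long daysUntilRenewal(LocalDate today) {
        return ChronoUnit.DAYS.between(today, renewalDate);
    }
}
```

A caller then reads almost like a sentence, for example `subscription.isOverdueOn(today)`, which is the kind of line a busy reader can take in at a glance.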

Finally, I was going to provide an illustrative example. Some code laid out in gnarly 'before' fashion followed by an 'after' rendering of almost poetic beauty. But since this post has languished in unpublished draft mode for months I decided I'm never going to get around to that. So you'll just have to use your imagination.