These seven tips aren't the last, or even the first, word on good code, but they do address an all-too-absent requirement that shouldn't be overlooked: readability.
The essence of pretty code? One can infer much about its structure from a glance, without completely reading it. I call this visual parsing: discerning the flow and relative importance of code from its shape. Engineering such code requires a certain amount of artifice to transform otherwise working code into working, readable code, taking the extra step to leave visual cues for the reader, not the compiler.
These Pillars of Pretty Code are somewhat intertwined. The first five are formulaic; the last two require intuition. Just about all of them are evident in make.c, part of Jam, an open source build tool I wrote in 1992. (Make.c is available at ftp://ftp.perforce.com/jam/src/make.c, and will be referred to throughout this article.) Here, I'm using a C example, but note that these practices can be applied to just about any high-level programming language.
1. Blend In
Code changes should blend in with the original style. It should be impossible to discern previous changes to a file without seeing the previous revisions. Remember, nothing obscures essential visual cues more than a shift in style.
This practice should be applied as widely as possible: absolutely within functions, generally within a file, and if you're lucky, across the system.
When presented with really ugly or neglected code, and you can't infer anything about its structure from a glance, you may have to consider reformatting it wholesale. The deep understanding you gain will then be available for every subsequent reader.
Now on revision 44, make.c has had no major rewrites.
2. Bookish
Keep columns narrow. Just as with books and magazines, code should be narrow to focus the gaze. As I mention in "Overcome Indentation," the left edge of the code holds the structure and the right, the detail. Long lines mix zones of structure and detail, confusing the reader.
There are many remedies for long lines: use shorter names (see "Declutter"); line up multiple function arguments, one per line (see "Make Alike Look Alike"); and just plain streamline logic (again, see "Overcome Indentation").
As a rule of thumb, 80 columns fits everywhere, though admittedly it isn't physically possible to format some code (such as wide tables) within this strict limit.
To keep itself narrow, make.c uses both short variable names and a strong hand on indentation.
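As a rough illustration of keeping lines narrow (this is not taken from make.c; `fmt_status` and its parameters are hypothetical names invented for the sketch), a declaration and a call that would each run close to 80 columns can be broken so that every argument sits on its own line:

```c
#include <stdio.h>

/* Hypothetical helper: write a one-line build summary into buf. */
int fmt_status(
    char       *buf,
    size_t      len,
    const char *name,
    int         made,
    int         failed)
{
    /* One argument per line keeps the call well inside 80 columns. */
    return snprintf(
        buf,
        len,
        "%s: %d made, %d failed",
        name,
        made,
        failed);
}
```

The structure (a declaration, a single return) stays at the left edge, and the detail of each argument falls into its own short line.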
3. Disentangle Code Blocks
Break code into logical blocks within functions, and disentangle the purpose of separate blocks, so that each does a single thing or single kind of thing. A reader can avoid a total reading only if a cursory inspection can reveal the whole block's nature.
Approaches vary: When a function is actually a series of minifunctions, each minifunction is a block and should be fairly self-contained. That is, information passed from block to block should be carefully considered.
In an alternate approach, when a function is a single large operation, separate blocks could be organized along the lines of type of activity: initializing variables, checking parameters, computing results, returning results, and printing debug output.
This practice is applied recursively for sub-blocks within large blocks (such as big while loops).
Make.c is a hybrid: It separates a block of debugging/tracing, and is otherwise a series of minifunctions, with each block's purpose segregated.
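A minimal sketch of the "type of activity" arrangement might look like the following. The function `pos_mean` is a hypothetical example, not code from make.c; each block does one kind of thing, and the blocks are separated by blank lines so the shape alone shows where one ends and the next begins.

```c
#include <stddef.h>

/* Hypothetical example: mean of the positive entries of v[0..n-1]. */
double pos_mean(const double *v, size_t n)
{
    double sum;
    size_t cnt;
    size_t i;

    /* Check parameters. */

    if (!v || !n)
        return 0;

    /* Initialize. */

    sum = 0;
    cnt = 0;

    /* Compute. */

    for (i = 0; i < n; i++)
    {
        if (v[i] <= 0)
            continue;

        sum += v[i];
        cnt++;
    }

    /* Return the result. */

    return cnt ? sum / cnt : 0;
}
```

Only `sum` and `cnt` travel from one block to the next, which keeps the information passed between blocks easy to track.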
4. Comment Code Blocks
Set off code blocks with white space and comments that describe each block. Sometimes large code blocks (with multiline comments) may embed small blocks (with single line comments).
Comments should rephrase what happens in the code block, rather than be a literal translation into English. That way, even if your code is inscrutable and your comments gibberish, the reader can at least attempt to triangulate on the actual purpose.
Big comments are needed for subtle or problematic code blocks, not necessarily big code blocks. Historically, I have a ratio of 15 percent blank and 25 percent comment lines. For easy identification, make.c goes so far as to number the blocks and sub-blocks.
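The following sketch shows one way numbered block comments could be laid out. It is an assumption about the general shape, not an excerpt from make.c, and `trim` is a hypothetical function. Multiline comments mark the large blocks, single-line comments mark the sub-blocks, and each comment states the purpose rather than restating the C.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical example: trim leading and trailing white space in place. */
char *trim(char *s)
{
    char *e;

    /*
     * Step 1: find where the text really starts and ends.
     */

    /* Step 1a: the text starts after any leading white space */

    while (isspace((unsigned char)*s))
        s++;

    /* Step 1b: the text ends before any trailing white space */

    e = s + strlen(s);

    while (e > s && isspace((unsigned char)e[-1]))
        e--;

    /*
     * Step 2: drop everything after the text.
     */

    *e = 0;

    return s;
}
```

The casts to `unsigned char` are kept because `isspace` requires them for correctness; they are not decoration.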
5. Declutter
Reduce, reduce, reduce. Remove anything that will distract the reader.
Use short names (like i, x) for variables with a short, local scope or ubiquitous names. Use medium length for member names. Use longer names only for global scope (such as distant or OS interfaces). Generally, the tighter the scope, the shorter the name. Long, descriptive names may help the first-time reader, but hinder him thereafter.
Eliminate superfluous syntactic sugar (like != 0, needless casts and heavy parenthesizing). Such stuff may help educate a novice programmer, but is unneeded by anyone doing serious debugging and a hindrance to someone trying to get the big (or medium) picture.
Drop ifdef notdef and any other dead code altogether. It's hard enough reading live code. SCM systems hold old code.
Almost exclusively, make.c uses short names, has just about no syntactic sugar, and has no dead code.
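To make the contrast concrete, here is a small sketch with the same logic written twice. Both functions are hypothetical and behave identically; the first carries long local names, `!= 0`, a needless cast and extra parentheses, and the second drops them.

```c
#include <stddef.h>

/* Cluttered: long names for a tiny scope, != 0, needless cast and parens. */
size_t count_set_cluttered(const int *flag_array, size_t flag_array_length)
{
    size_t number_of_set_flags = 0;
    size_t flag_array_index;

    for (flag_array_index = 0;
         (flag_array_index < flag_array_length);
         flag_array_index++)
        if (((int)(flag_array[flag_array_index])) != 0)
            number_of_set_flags = (number_of_set_flags + 1);

    return (number_of_set_flags);
}

/* Decluttered: same behavior, short names for short scopes. */
size_t count_set(const int *f, size_t n)
{
    size_t cnt = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (f[i])
            cnt++;

    return cnt;
}
```

In the second version the loop and the test can be taken in at a glance, which is the point of the exercise.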
6. Make Alike Look Alike
Two or more pieces of code that do the same or similar thing should be made to look the same. Nothing speeds the reader along better than seeing a pattern.
Further, these similar-looking pieces of code should be lined up one after the other. Such grouping reduces the number of entities the reader has to grasp, a critical approach to simplifying the apparent complexity of code.
This practice is best used in conjunction with "Disentangle Code Blocks"; a separated code block, composed of a pattern of lines with a single purpose, is a simple entity. Unfortunately, it must also be applied everywhere and requires finesse. Fortunately, it rarely affects the generated code. Examples help:
- Initialize variables together.
- Consistently use "this" (or don't).
- Line up parameters on a long function call.
- Consistently use {} around if/else clauses: Either all blocks have them, or none do.
- Put the { of if/for/while on its own line (because the closing } is).
- Break apart conditionals at the &&s or ||s and align them.
Make.c's "4d" block is an elaborate example of how many lines of code can be made to look like a single entity.
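A few of the listed practices can be sketched in a handful of lines. The `struct file` record and both functions below are hypothetical, invented only to show the layout: members and initializations are aligned into columns, and the conditional is broken at each `&&` so that the three comparisons read as one pattern.

```c
#include <string.h>

/* Hypothetical record, for illustration only. */
struct file
{
    const char *name;
    long        size;
    long        time;
};

/* Initialize all members together, one per line, aligned. */
void file_init(struct file *f, const char *name)
{
    f->name = name;
    f->size = 0;
    f->time = 0;
}

/* Return nonzero if a and b describe the same file. */
int same_file(const struct file *a, const struct file *b)
{
    return a->size == b->size
        && a->time == b->time
        && !strcmp(a->name, b->name);
}
```

Because each comparison has the same shape, the one line that differs (the string compare) stands out without any comment pointing at it.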
7. Overcome Indentation
The left edge of the code defines its structure, while the right side holds the detail. You must fight indentation to safeguard this property. Code that moves too quickly from left to right (and back again) mixes major control flow with minor detail.
Forcibly align the main flow of control down the left side, with one level of indentation for if/while/for/do/switch statements. Use break, continue, return, even goto to coerce the code into left-side alignment. Rearrange conditionals so that the block with the quickest exit comes first, and then return (or break, or continue) so that the other leg can continue at the same indentation level.
Real code requires substantial sub-blocks, necessarily indented, and these sub-blocks bring their own indentation battles to be fought. Indent only to move from structure to detail, and not because of an artifact of the programming language.
This is the most difficult of the seven pillars, as it requires the most artifice and can often influence the implementation of individual functions. It's no accident that make.c rarely goes more than two levels of indentation.
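As a small sketch of the quick-exit rearrangement (both functions are hypothetical and are not taken from make.c), the same bounded string copy is written first with nested tests and then with early returns. The two behave identically; only the shape differs.

```c
#include <stddef.h>
#include <string.h>

/* Nested: the main path drifts right with every test. */
int copy_nested(char *dst, size_t len, const char *src)
{
    int ok = 0;

    if (dst)
    {
        if (src)
        {
            if (strlen(src) < len)
            {
                strcpy(dst, src);
                ok = 1;
            }
        }
    }

    return ok;
}

/* Flattened: quick exits first, main path down the left edge. */
int copy_flat(char *dst, size_t len, const char *src)
{
    if (!dst || !src)
        return 0;

    if (strlen(src) >= len)
        return 0;

    strcpy(dst, src);

    return 1;
}
```

In the flattened version the failure cases are disposed of at one level of indentation, and the work the function exists to do sits at the left margin.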
The Final Word
These Seven Pillars of Pretty Code aren't the last, or even the first word on good code. But they are, I believe, what distinguishes readable code from the rest.
Code Transformation
Seiwald's Seven Pillars of Pretty Code in action.
This specific before/after example may seem like overkill in a function so trivial, but it demonstrates the techniques, which can be applied equally well to less obvious code.
Here's a breakdown of the specific steps I followed:
Bookish. I used steps from "Declutter" and "Overcome Indentation," which left the lines very short.
Disentangled Code Blocks. Searching for
Comment Code Blocks. I separated out logical blocks with blank lines and comments.
Declutter. I used 0 for
Make Alike Look Alike. I lined up function arguments, and initialized
Overcome Indentation. I extracted main path (finding the
Before
After
-CS
Christopher Seiwald, an expert on Software Configuration Management (SCM), is the founder and CTO of Perforce Software. Prior to founding Perforce in 1995, Seiwald was the manager of network development for Ingres, responsible for developing the communications layer of the networked database product.