Naming in code

int i;

How many times have we seen this? We process it quickly because we have seen it in code examples in K&R, Thinking in C++/Java, or whatever your intro to structured programming was.

Even before that, we saw i used in sequences and series in Algebra. That assumes you took algebra before learning to code, which is no longer a safe assumption; I actually learned Basic before I took Algebra. Anyway, i as an index is very common. If we need more than one, we go to j, and then to k.

char c;

Is pretty common too.

float f;

double d;

Probably the defaults, but people here tend to use more explicit names:

float r; //rate

double a; //acceleration

All these come from the world of math done on a chalk board. If you’ve done any electricity you know:

p=(i*i)*r; //twinkle twinkle little star, power equals i squared r.

e=m*(c*c);

a=(b*h)/2;

But it still takes a second to translate:

a=pi*(r*r);

into the more familiar.

a=πr²

The fact that we can read these so quickly comes from their familiarity. But we can’t use the Greek letter pi as a variable name (at least not without knowing significantly more key bindings in vi than I have at the tips of my fingers). I am doing this up in HTML, so I have to go and look up not only the pi character, but the superscript 2 character as well.

This basic introduction to variable naming should illustrate the point that short names are a benefit if you can quickly translate them to their original intentions. It should not be too hard to show that the inverse is also true: if you cannot quickly translate them back to their original meaning, they are not a benefit. I would go so far as to state that they are a hindrance.
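Here is a small C++ sketch of that point. The function names and the pi constant are my own illustrative choices, not taken from any particular code base. The idea is that when the surrounding name states the formula, the chalkboard letters translate back instantly.

```cpp
// Hypothetical helpers: the function name supplies the context,
// so the one-letter names from the chalkboard formula stay readable.
constexpr double pi = 3.14159265358979323846;

// a = pi * r^2
inline double circleArea(double r) {
    return pi * (r * r);
}

// a = (b * h) / 2
inline double triangleArea(double b, double h) {
    return (b * h) / 2;
}
```

Calling triangleArea(4.0, 3.0) gives 6.0, and nobody reading the body has to ask what b and h are.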

How many people can quickly translate the following acronyms:

SINCGARS

CINCPACFLT

RPM

TANSTAAFL

TARFU

RPG

If you have worked in the Army, you know SINCGARS as the encrypted radio system, although you probably couldn’t state what it stands for. If you were in the Navy, or drove around Pearl Harbor, you would recognize CINCPACFLT as the Commander in Chief, Pacific Fleet. RPM is revolutions per minute to most people, or the Red Hat Package Manager to Linux people. To people who studied economics, TANSTAAFL is short for “There ain’t no such thing as a free lunch.” TARFU is an old WWII slang expression saying that things are two levels above SNAFU (Situation normal, all …), having surpassed FUMTU (… more than usual) to reach “Things are really …”, which is one step below our well-loved FUBAR. In all these cases, the acronym is helpful to people already familiar with the term, and a barrier to understanding for people on the outside.

I put RPG last because it illustrates the real problem with abbreviation. We can accept that Red Hat chose RPM because it was already popular as an acronym due to the geek cachet of rotational dynamics. But RPG is common enough that at least three different realms have used it as an acronym. In the Army, it means rocket propelled grenade. In the mainframe world, it means Report Program Generator (very COBOLesque). In the Geek civilian world it means Role Playing Game. These three circles can intersect, causing a need to disambiguate the acronym. If you were designing a military role playing game system, you might accept that RPG means the game, and use the full term for the weapon. If you were building a mainframe program to track the concepts in various Role Playing Games, you might need to use RPG to list the RPGs that refer to RPGs. While you can probably parse the previous sentence, you have to expand each of the acronyms to clarify. Each of these terms resides inside a namespace. Once you know which namespace you are using, you can revert to the TLA (Three Letter Acronym).
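To make the namespace idea concrete, here is a minimal C++ sketch. The namespace names (army, mainframe, gaming) and the describe() and favorite() functions are invented for illustration; the point is only that the same three letters can coexist once each lives in its own namespace.

```cpp
#include <string>

// Hypothetical namespaces, one per community that uses the acronym "RPG".
namespace army {
struct RPG {
    std::string describe() const { return "rocket propelled grenade"; }
};
}  // namespace army

namespace mainframe {
struct RPG {
    std::string describe() const { return "Report Program Generator"; }
};
}  // namespace mainframe

namespace gaming {
struct RPG {
    std::string describe() const { return "role playing game"; }
};

// Inside the namespace, the short name is unambiguous again:
// RPG here resolves to gaming::RPG without any qualification.
inline std::string favorite() {
    RPG game;
    return game.describe();
}
}  // namespace gaming
```

Code that sits where the circles intersect has to spell out army::RPG or gaming::RPG; code that lives entirely inside one namespace can go back to the TLA.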

A common pattern of software application development is to have a three-tiered design. Say you are writing a program to design and control data centers. Your domain model will contain such objects as Hosts (physical computers), Hostnames, Racks, Ports, Operating Systems, install images, and so forth. Adding a new host to the system may be a multi-stage process: specify the physical makeup, figure out where to put it, give it a name, put an OS on it, etc. Now you might have different views of a host depending on whether you are in the User Interface, business logic, or Data tier of your application. I would tend to create a namespace around each. For instance, in C++:

Web::Host, Business::Host, Data::Host.

All three of these would live in the namespace of the application:

Datacenter::Web::Host, Datacenter::Business::Host, Datacenter::Data::Host.
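A rough sketch of what those three Hosts might look like follows. Every field and both conversion functions (fromData, fromBusiness) are assumptions made up for the example; a real data center application would carry far more.

```cpp
#include <string>

// Hypothetical fields; each tier keeps only what it cares about.
namespace Datacenter {

namespace Data {
// The storage tier's view: roughly one row of a hosts table.
struct Host {
    int id;
    std::string hostname;
    int rackId;
};
}  // namespace Data

namespace Business {
// The business logic's view: what the rules operate on.
struct Host {
    std::string hostname;
    bool hasOperatingSystem;
};

// Build the business view from a stored row. This sketch assumes a
// freshly loaded row has no OS installed yet.
inline Host fromData(const Data::Host& row) {
    return Host{row.hostname, false};
}
}  // namespace Business

namespace Web {
// The web UI's view: just what gets displayed.
struct Host {
    std::string label;
};

inline Host fromBusiness(const Business::Host& host) {
    return Host{host.hostname +
                (host.hasOperatingSystem ? " (installed)" : " (no OS)")};
}
}  // namespace Web

}  // namespace Datacenter
```

Inside each tier the code just says Host. The qualified names only show up at the seams where one tier hands off to the next.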

Note that I named the namespace after the specific UI type. Say we later want a Qt-based application, or a command line interface; the namespaces could reflect that:

Datacenter::Qt::Host, Datacenter::CLI::Host

But suppose there is some commonality between these two. Some might go for a generic term like “Common”, but I would caution against that: you want to be as specific as possible. Say the code for validating an IPv6 address is common to all of the UI layers. Put it into a validation package:

Datacenter::Validator::IPv6Address

Note that I move between the terms package and namespace. From a programmatic perspective, they provide the same functionality; they are just different mechanisms depending on the language used (C++ vs Java).

Note that now the Validator package can be used by any stage of the application. You can validate a file load in the Data layer using the same mechanism as the UI layer.
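As a sketch of that sharing, the example below has one validator called from both a UI function and a Data function. It assumes a POSIX system and leans on inet_pton to do the actual parsing; the isValid, acceptFormField, and acceptFileRecord names are hypothetical.

```cpp
#include <arpa/inet.h>   // inet_pton (POSIX)
#include <netinet/in.h>  // in6_addr
#include <sys/socket.h>  // AF_INET6
#include <string>

namespace Datacenter {

namespace Validator {
class IPv6Address {
public:
    // True when text parses as an IPv6 address.
    // inet_pton returns 1 on success and 0 when the text is not valid.
    static bool isValid(const std::string& text) {
        in6_addr parsed;
        return inet_pton(AF_INET6, text.c_str(), &parsed) == 1;
    }
};
}  // namespace Validator

namespace Web {
// Hypothetical UI-tier check on a submitted form field.
inline bool acceptFormField(const std::string& input) {
    return Validator::IPv6Address::isValid(input);
}
}  // namespace Web

namespace Data {
// Hypothetical Data-tier check on a field read during a file load.
inline bool acceptFileRecord(const std::string& field) {
    return Validator::IPv6Address::isValid(field);
}
}  // namespace Data

}  // namespace Datacenter
```

Both tiers accept "2001:db8::1" and reject "not-an-address", and neither one owns the rule.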

Here’s a rule of thumb:  Make a name as simple as you can.  Use a namespace to avoid name clashes.

Note that the rule states simple, not short. Acronyms require an added level of translation. While an acronym may be easier to type, it is not easier to understand.

Here’s another rule of thumb.  The primary focus of the developer should be maintainability.

Not performance, not getting features out of the door as fast as possible, not robustness, not correctness, but maintainability. Why? Because it is a crosscutting concern that will support all of those other goals. The easier it is to maintain your code, the easier it will be to change it for any purpose. The lead programmer on your team will get hit by a bus, or promoted, or hired away to work at a cool new startup, or decide that he wants a career change, or go on maternity leave, or…and leave someone newly hired to your organization to take over. Or someone newly transferred to this group who has never looked at the code before. You might even be lucky enough to have a changeover before the newbie gets the full brunt of the change requests, but it doesn’t matter, because it won’t sink in until the person who wrote it is gone…see ya and I wouldn’t want to be ya.

How do you understand a code base? What tools do you have? One powerful one is refactoring. Make a copy of the working tree and go crazy renaming methods, extracting blocks from long functions, moving things around, etc. Make the code as pretty as you like. After a long painful night of it, you will understand the code, in both its pre- and post-refactored form. Then be prepared to throw out the refactored version and work with the original code. Now that you understand it, you can refactor a little at a time.

One of my favorite lines of code from early bproc had a fragment something like this:

req->req.req

The first req was a pointer to a message that had come in. The second was the message header, and the third was the message type. The current code reads more like this:

message->header.type

Although it is almost twice as many characters, the second version is instantly understandable.
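For the curious, here is roughly what declarations behind the readable version could look like. This is a hypothetical reconstruction, not the actual bproc definitions: the MessageType values and the isResponse helper are invented for the example.

```cpp
// Hypothetical types; the real bproc structures are not shown here.
enum class MessageType { Request, Response };

struct Header {
    MessageType type;
};

struct Message {
    Header header;
};

// The access path reads the same way you would describe it out loud:
// the message's header's type.
inline bool isResponse(const Message* message) {
    return message->header.type == MessageType::Response;
}
```

Name all three levels req and the same expression collapses back into req->req.req.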