Relational Database Field Names

[discussion moved from EveryWordCanBeAbbreviatedToFourLetters]

{Why not call this topic DatabaseFieldNames?? I see nothing specific to relational yet.}

Database column names are often best constructed using groups of three or four words. If the words are kept short, the column names should not get too long.

But, at a high cost to readability. Which may or may not be a good tradeoff.

Full words also have a high cost to readability. If four 4 character words are used to make up a column name that is only 19 characters with underscores. A human eye can see that much at one go. If four 8 character words are used to make up a column name that is 37 characters. This is very difficult to see all at once. That means that you have to read the name rather than seeing it, and that also means that you can generally only fit one or two column names on a line rather than 3 to 5. This means a number of thing:
  1. If you name your variables with names similar to the column names, you take up an entire line of code: 37 + 37 + 3 = 77 characters for a simple assignment statement. This does not even include margins.
  2. This means that you have to remember what was at the beginning of the name as you scan through it taking up more mental energy I would rather spend some other way.
  3. This means that many editors, printers, API's will not adequately handle the code.
  4. This means that people who make column names will be tempted to use fewer words to create column names thus making them less precise in their menaning and therefore making their use harder to understand.

The upshot is that using full words means that focusing on an individual piece of code will be harder, and if a good set of abbreviations can be developed, we avoid all of these problems. It is, I admit, a tradeoff. Full words are easier to read by themselves. I would like to see an AbbreviationStandard? that gets around some of these problems.

You make some interesting points. MeaningfulNames has a few counterpoints. I assume the attempt to list some abbreviations is work towards a standard?

Yes, see above

I think it is a fallacy that variables and schema column names have to be fully self-documenting, especially the more commonly used ones. The database should have a DataDictionary that allows for a fuller description of the column that any developer can reference. Unfortunately, such is not standardized. But, nothing stops you from making your own. Similarly, variables should have a description comment next to them if you cannot find a concise name.

Instead of a special purpose DataDictionary, perhaps a general purpose HyperlinkedTextEditor would do the trick. With this editor you could browse around until you found where the relevant comments/documentation were, and then back again to work on the code.

As names for common concepts become more agreed upon, will they get shortened further? For instance FirstName? or CountryCode? are more or less ubiquitous could they and similar fields/attribute names become shorter to the point of being fn, cc or similar as database/object computing theory matures in the future? I ask this since if you compare mathematical/scientific notation (within context) practitioneers agree upon F as force, c as the speed of light, v as velocity, d(y)/dx as differetiation etc in shared technical discussion though these are highly abreviated.

I don't think database column names cleave into discrete contexts as well as mathematical/scientific symbols. There are too many diverse interests looking at the same names.

This is all silly. Names should always be full words. Any name that is too long as full words, points to a problem in your design, factor it better, you shouldn't be using multiple(more than three) words that often. Abbreviations are used by bad programmers.

I think it is funny that the person advocating for always using full words shortens "should not" to "shouldn't".

To be fair, "shouldn't" doesn't add any ambiguity to the word that I know of. But it does require a reasonable knowledge of English abbreviation rules to read. (I'm against "long field names", by the way.)

The problem with "Names should always be full words" is that it is not, in itself, a convention. This leads to an unfortunate diversity in name building. I might call something "Walk_Dog_In_Park" and the next guy wants to name it "Park_Walks_For_Dog" or "Park_Oriented_Dog_Ambulation" or "Dog_Park_Walk".

Later, when I look at his code (or he at mine), I may wonder what he could possibly mean -- since he and I string our words together differently.

If you have a convention that dictates how names are organized, the "full word" constraint becomes less important. In fact, mnemonics can be quite useful for frequently used terms.

Long names versus short names seems to be one of those never-ending HolyWars because it is probably a personal preference. I personally do not like long names, as long as they are explained somewhere. It would be nice if DBA's kept a list of abbreviations around for all to reference. For example, have a list such as:

   Schd - Schedule
   Rte  - Route
   Mth  - Month
   Cd   - Code
   Dt   - Date
   Invc - Invoice
   Acct - Account
   Tm   - Time

What happens when you write software for some other domain such as accounting, telephony, workforce management, customer management, networking, vetinary management ... Do you create a lookup table for each new abbreviation you create in that field?

If so you are creating a pile of extra work for every maintenance programmer who comes after you. It takes extra mental processing to remember 'Cd = Code' and 'Invc = Invoice'. Not to mention the different ways people abbreviate (e.g. 'Inv' is more common at my workplace).

Just use full words. Literary scholars throughout the centuries have resisted the urge to abbreviate every 5th word in the name of easy typing. They also seem to think it doesn't make their text any clearer. I agree there is a very large exception to that in the science and mathematics communities who use single letter abbreviations all over the place. But their formulas were often written on chalkboards without any intellisense. They have also gone on to use greek letters because their domains contained more than 26 concepts.

I know I would rather see:

Energy = Mass * SpeedOfLight ^ 2


e = m * c ^ 2

if I had a limited understanding of that domain (ala most maintenance programmers).

For a simple one-off usage, I generally agree. But lots of verbose formulas/expressions can be difficult to read, especially when they span multiple lines, at least for me. The more a given variable is referenced, the more abbreviating helps. Put a longer description or comment at the point of declaration, or the column note section if the DB has such.

And your example is an exaggeration. A more realistic example would resemble:

  Engy = Mass * spLight ^ 2

This example is not complicated enough to demonstrate the tradeoffs either way, though. For longer or complex expressions, the structure (shape) of most expressions is easier to grok if it's relatively compact, at least for my own WetWare. Long names can stretch out such expressions too much, including causing line-wrapping. The nesting, parameter divisions, positions, etc. are easier to spot when expressions are compact.

 // Example A
 foo(bar(flib, brot), x, moof(zip, bop, grit), mip + sid); 
 // Example B
 fooFlamFopper(bartonic(flibTrammoof, brotonix), exxySimmo, moofopitanon(zipPamNoffGripp, boppityDoo, gritMatterTad), mippotronic + sidSadDodSodMod);

"A" is far easier for me to grok the structure of in terms of placement and nesting. Maybe YOU have FastEyes and "B" is almost as visually digestible to you as "A". But if so, you are probably not the norm. I suppose my visual mechanisms may be below average for developers, but I doubt the average is much higher such that most would agree that "A" is noticeably easier to visually analyze than "B". The fact that the common math convention is closer to "A" lends credence to this estimate so that I don't think I am an outlier.

I know what I like and I know what confuses me and I reject the notion that YOU know my mind better than me and know what confuses me better than I do. That being said, I'll respect your preference as your preference and realize that no coding style will make everyone happy. Naming is a place WherePsychologyMatters (including biology differences). (This issue came up in another topic. If I find it again, I'll relink it.)

Existing RDBMS do need better or more standardized DataDictionary(s) such that long descriptions can be given in the dictionary. That way the formal column name does not have to carry the entire burden of thoroughly describing the column. If the RDBMS does not provide one, then making your own and placing it in the application, project, or company table namespace is not that difficult with typical tools.

Good naming is an art that seeks to balance brevity against self-documentation. It takes skill, experience, thought, and patience to do it well. I marvel (and learn from) those who get it right and cuss those who bush it up.

e = mc^2 was the first example that came to my head when I thought of single letter names, so it may have been a bad example. But your counter-example of using 'Engy' and 'sLight' just makes me ask 'why bother'? Accepting the fact that all IT professionals should be touch typing at a reasonable speed, it isn't that hard to type the extra few characters per word to save the hassles of:

- Abbreviations that differ from person to person, causing them to check which version was used when referring to the object (e.g. 'Invc' vs 'Inv').

- Forcing future readers to mentally convert the abbreviations each time they are read (or until they are part of muscle memory).

- Causing a delay for readers for cases where the abbreviated word is ambiguous (e.g. 'Cd', 'Dt', 'Tm')

If typing is not the issue then the only other complaint I can find is that full words are harder to read then shortened ones. Again I will point to the fact that 99.9% of English literature uses full words. I'm sure that if people had a hard time reading full words they would've let the authors know a few centuries ago.

Perhaps it's because the words are joined together using PascalCase . Long phrases without any spaces appear to be hard to read for some people, compared to abbreviated forms. While I can accept that some people feel this way, it doesn't hold that 'SpeedOfLight' should be shortened as it's a very short phrase. And if that is the only objection (that long phrases without spaces are hard to read) I would hope those who are pro-abbreviation would agree to avoid abbreviation if we ever evolve to a language that doesn't need PascalCase (e.g. a graphical language where each object on the screen has a label floating above it written with spaces between words.). In response to your last point regarding brevity vs self-documentation, I would argue that communicating the full intent of an object is the most important thing. Development is complex and difficult, the last thing you want to do is cause a reader to do extra mental processing, such as converting abbreviations and looking in data dictionaries. Communicating the full meaning of an object is more important than saving screen real eatate, except for very rare cases where English fails to describe an object within 8 words or so.

Bloat and verbosity also hinders communication, and data-dictionaries are not evil. Enough of this debate already. Everybody has their own personal preferences; I like my preferences and you like yours. Unless you have something really novel to say on this subject, let's AgreeToDisagree and move on.

See also: SelfDocumentingCode, DatabaseBestPractices
CategoryDatabase, CategoryNaming

View edit of June 12, 2014 or FindPage with title or text search