Meaningful comments help.
Not to be confused with HotComments (an antipattern).
Rationale:
Not all of us are compilers. Natural language is our preferred means of communication, albeit less precise and more ambiguous than formal languages like programming languages. Comments provide meta information about the program, reasons for choosing this implementation, known issues, hints for future readers (including yourself).
Commenting interfaces is an especially good thing to do. Implementations don't need to be commented heavily if their interfaces are sufficiently documented. That is, a well-written implementation for a given method is rarely more than 20 language statements (typically around 5 to 10). Thus, if you know what the method is supposed to it, it should be straightforward to understand the mechanics of it.
Meaningful in this context has two parts: First: the comment can be understood by the people who read it. Two: the comment says something that is likely to be not understood by the same readers unless it was present. So: who are your readers?
Arguments:
"Commenting isn't really self-documenting code."
It's close though. Comments have a tendency to go out of phase with the code, especially when code is rapidly changing. Also, bugs in the comments (there are such things) are hard to catch because it requires a human being to sit down and parse the text with his or her own brain.
However, they are very close to the bone and have a higher probability of being correct. If you don't spent more than minute writing a comment (unless this Wiki page!), it's likely that the comment is small enough that it will change when the code below it will change.
"Comments are annoying!"
So is maintaining UndocumentedTrickyCode. Which one costs more money? [hint: Maintaining code can cost upto 100 times as much as the initial development; cf. CodeComplete by SteveMcConnell].
"Comments go out of phase."
Only if you don't refactor the comments along with the code. It's just like refactoring the tests along with the code. On a related note, beware of the ThePalimpsestEffect.
Exceptions:
Self-documented code doesn't need comments. That's the whole point!
Examples:
There are plenty of good commenting examples around. Here is one very bad technique:
int iFoo = iBar * 9; // foo is 9 times barNot only a totally useless comment because it provides no information not immediately obvious from the code, but it also doesn't say why foo is 9 times bar, or what 9 is (see also NamedConstants). Plus, it is a comment at the end of the line. This limits the amount of space the comment can fit into and promotes this fluffy kind of comments. No, it doesn't. If it doesn't fit one line, it can go extend over subsequent lines. [It also violates MeaningfulNames.] Even better would be:
// According to lore, cats have 9 lives. static int s_kiCAT_LIFE_FACTOR = 9;It may be entirely useless to use NamedConstants for the cat life factor if you brought the comment down to the same line. However, if you had many animal to human life conversions (dogs to human lives, elephant to human lifes), it's better to centralize the data instead of distributing it throughout the code....
int iCatLives = iNormalLives * s_kiCAT_LIFE_FACTOR;
(eww, s_kiCAT_LIFE_FACTOR? How about CatLifeFactor? instead?)
Another very good technique would be something like
int numCatLives(int numHumanLives) // <insert literature reference here> { return numHumanLives * 9; }Overcommenting is worse than no comments, because it makes the source code longer, and provides more opportunity for non-testable errors.
If the code and the comments don't match, they're probably both wrong.
Other than the literature reference, which I might have to think about, I see no code here that would need a comment if written expressively. Should we work a Comment Example somewhere? --RonJeffries
ok...
Imagine the problem being a producer-consumer using fork()ed processes that communicate through shared memory and semaphores (on Linux). We're using that very expressive language, C.
/* prodcons.c -- Simple Producer/Consumer solution. ---------- February 6, 2000 -- Sunir Shah */#include <sys/sem.h> #include <signal.h> #include <sys/shm.h>
#define NUMBER_OF_CONSUMERS 10
typedef enum { false, true } bool;
/* Returns a semaphore handle; -1 on error */ int CreateSemaphore();
/* Destroys a semaphore. Returns true on success */ bool DestroySemaphore( int hSemaphore );
/* Wait/signal a semaphore. Returns false on OS error */ bool WaitForSemaphore( int hSemaphore ); bool SignalSemaphore( int hSemaphoreHandle );
/* Allocates a shared memory segment. iSize is the size is bytes. The handle will be stuffed into *pHandle Returns NULL on error, or a pointer to the memory. */ char *AllocateSharedMemory( int iSize, int *pHandle );
/* Deallocates a shared memory segment. Returns true on success */ bool DeallocateSharedMemory( char const *pBuffer, int hSharedMemoryHandle );
/* On error, returns an error message. NULL on success. */ char const *Initialize();
void ConsumerMain(); void ProducerMain();
void ForkConsumers(); void KillConsumers();
pid_t aiConsumerPIDs[NUMBER_OF_CONSUMERS]; char *pSharedMemory = NULL; int hSharedMemory = -1; int hProducerSemaphore = -1; int hConsumerSemaphore = -1;
main() { char const *szError;
/* We must allocate the shared memory, create semaphores, etc. BEFORE we fork the consumers so that they can have access to them.
if( szError = Initialize() ) { Deinitialize(); printf( "%s\n", szError ); return 1; }Comments I feel are really important are: the file header, the description of Initialize()'s and AllocateSharedMemory() signatures and the comment inside main() detailing the important fact that everything has to be allocated before the fork()s. If things get shifted around, subtle bugs will be induced. --ssForkConsumers();
ProducerMain();
KillConsumers();
Deinitialize();
return 0; }
/* Rest of program left as an exercise for the reader. ;) ... */
Seems unlikely to me that anyone would move an initialize method out of the front of the code, but maybe i'm an optimist.
I frequently do. If it makes the code easier to understand, factoring out a method is my favorite way to reducing complexity. --AlbertBrandl
If the signatures have to be away from the code, I suppose they could be commented without, um, comment. OTOH it seemed pretty clear to me without. One might say *pReturnedHandle, perhaps. But the code would tell for sure, and if the code says, I'd not comment.
A couple of suggestions I would make to the above code that would help reduce the need for comments are
typedef int handle;I believe applying these changes would eliminate at least some of the need for comments. If these are operating system or library calls that one cannot modify, I would probably favor the comments as opposed to using either wrapper functions or #define macros to change the signatures. -- WayneMackbool CreateSemaphore(handle * theSemaphore);
bool AllocateSharedMemory( int iSize, handle *pHandle, void * theMemorySegment );
bool Initialize(const char * * theErrorMessage);
Please note that the comment regarding allocation before threading could be eliminated simply by embedding it within the if block. This makes the sequence explicit.
main()
{ char const *szError = Initialize();There are a lot of ways this code could be improved by rewriting rather than just commenting.if( szError == null)
{ ForkConsumers();
ProducerMain();
KillConsumers();
Deinitialize();
return 0;
}
else
{ Deinitialize(); printf( "%s\n", szError ); return 1; } }
Is the above code an example of best commenting practice or of over-commenting?
I assume that it an example of best practice but surely the declaration of DestroySemaphore, for example, does not need a comment. What the function does is clear from its name, i.e. it destroys the specified semaphore. Moreover the given comment does not add any value, it only repeats the name of the function.
The only possible ambiguity is the meaning of the returned bool. This IMO is again clear and does not need a comment. If its meaning is not clear then the following change would be better than an comment:
enum CompletionStatus { Failure, Success };-- JohnWilkinsonCompletionStatus DestroySemaphore( int hSemaphore );
Concur. I have created a lot of enumerated types exactly like this just to make clear what pass and return values are supposed to convey. That way there is no question in some reader's mind what the method needs to see coming in or what it will come back with. On the other hand, I do not depend on some Hungarian notation in the variable names to describe what a variable might contain -- it had better be clear through the var name and its type, otherwise I use a comment to indicate range, etc.
I have a particular beef with AutoGeneratedComments?. My favorite IDE for Java inserts template comments when it generates the body of a method. I don't use Javadoc so I hate this particular feature. I always turn it off, but I have to work with code from other teams which is littered with method declarations that start something like this:
/** * @see SuperclassName#foo(int pitch). * * @param int pitch - * * @return int * * @exceptions IOException */ public int foo(int pitch) throws IOException { ...
counter++; // increment counterisn't very descriptive.
nr_lines++; // we've found another line; // increment the line counteris much more useful.
number_of_lines++;may be all you need. (See MeaningfulVariable)
numLines++; // found another lineSomehow shifting for a capital letter feels utterly natural while typing, but shifting for a punctuation mark like underscore does not. Note that I avoided the spurious comment about incrementing the line counter partly because I think it's obvious to a programmer and partly because I feel that the comment flow is clearer when it ReadsLikeProse.
My code tends to have few comments. I tend to write comments to explain algorithms that are not readily obvious and I find that such sections of code will often have lines just like the above increment. If the code looks like this:
numLines = 0; while ( (lin = fin.readLine()) != null) { do something or other ... numLines++; }then I don't feel the need for a comment. But if it was more like this:
numLines = 0; while ( (lin = fin.readLine()) != null) { do something or other if (isFunny(lin)) { numLines++; // got another funny one } do some more stuff ... }then I feel inclined to explain the code. Since there are not many comments in the code to begin with, the eye is naturally drawn to the ones that are there so they had better be meaningful enough to be worth distracting your reader from the code itself. --MichaelDillon
Why not just remove the need for the comment by renaming numLines to numFunnyLines?
I would suggest creating a method CountFunnyLines(). You might even want to go to the level of creating a FunnyLine class or module with a Count() method. The implication of "do something or other" and "do some more stuff" is that there are multiple level of abstraction going on here, though I may be overanalyzing an example code snippet. -- WayneMack
Comments that simply repeat what the code says are not meaningful. Do not write them. After making your code communicate as much information as possible, strive to write comments that communicate important information that is still not obvious from the code. While writing code your head is full of such information; learn to recognize it and make it useful by capturing it. -- JimPerry
And when you have written comments communicating what isn't obvious from the code, refactor the code to remove the need for the comments. In my code, this is when things actually get named properly. -- EricUlevik
Assembly Language suffers from the age-old tradition of putting comments on the end of the line. There's not enough space on the end of the line to write meaningful comments, so most programmers end up commenting the obvious.
INC AX ;Increment record count CMP AX,10 ;Compare with 10 JLE OK ;Jump if <= MOV AX,10 ;Set to 10 OK:Most assembly reads better if you use a block comment followed by a block of code:
; Limit the record count to 10. ; This works around a kernel bug that causes buffer overruns when ; reading more than 10 records.In assembly as in all languages, assume the reader knows the language. If the reader doesn't know the language, no amount of comments is going to do any good.INC AX CMP AX,10 JLE OK MOV AX,10 OK:
Subroutines work in assembly as do constant defines (replace 10 with "theMaximumRecordsAllowedByKernel"). If your assembler supports renaming registers, one can also substitute a name for the AX register (I am not sure if many assemblers still support this, so I will omit this from the example). This would create something like the following (please feel free to correct the syntax, I'm only showing an example).
INC AX JSB LimitRecordCountLimitRecordCount:
CMP AX,theMaximumRecordsAllowedByKernel JLE LessThanMaximumRecords MOV AX,theMaximumRecordsAllowedByKernelLessThanMaximumRecords:
RETEven in assembly, a block comment is a good hint to use a subroutine.
This page mirrored in WikiPagesAboutRefactoring as of April 29, 2006