Number Pollution

It has been suggested in ZeroOneInfinityRule that zero, one and infinity are the only MagicNumbers we should accept in our programs without well reasoned justification. I'm wondering how many numbers we routinely see in programs and what their justifications might be? Search your own or other people's code and report back here.

Could we have some justifications of the above appearing as literals? I don't believe there is any ;)

One often sees numeric codes in programs. These are not the same as the numbers mentioned above in that their magnitude is unimportant except that it be unique. Here are some familiar numeric codes.

Our never ending quest for unique codes has also been tabulated in the many ways we denote TheEnd of program input.

I assert that some numbers are themselves better explanations of their purpose than any textual name for them could be; 365.2425 is an excellent example of one. ("precise_length_of_year"? "precise_days_in_year"? "days_in_year"? None of those do it for me.) 180, 360, and 256 are others. Pi should be referred to by name though (preferably, one defined by the language), because if it's referred to by number nobody is actually going to type out all the digits. Or, as the old joke goes, because the value of pi is subject to change. Of course, one should still use a named constant rather than violating OAOO. -- DanielKnapp

It is silly to iterate over powers of two above. Suffice it to say that, due to binary representations, the numbers 2^N and more often 2^N-1, for positive integer N, often show up.

However, they usually show up as system defined boundaries of some sort, so should *almost always* be abstracted behind a constant, or macro if you have to.

char *fname[256]; is markedly inferior to char *fname[MAX_FILENAME_LENGTH]; for this reason.

As mentioned above, there are a few numbers that occur naturally enough. Nobody will wonder what angle=x/360.0; means (although you had better check you are working in degrees!). 256, on the other hand, is very rarely useful as a fixed constant... if your code is peppered with 256's (or 255's) and one of your system defined constraints changes (common enough in porting), you have to search everywhere!

Similarly, I have to disagree with 365.2425. Something like const float days_in_year=365.2425; is much better that a constant in the code, IMO.

Definitely. What if you had used the direct numeric constant several times in your code, and you later discover that someone measured it even more accurately and it's actually 365.2423225 or something? Murphy's Law dictates that no matter how much you use Search & Replace, you're going to miss changing one of those and screw up your program in barely detectable ways (but enough to cause massive confusion).

Numbers like 365.25 and 180,360 are horrible when hard-coded for another reason: they are not dimensionless. You really _mean_ days_per_year, or speed_of_light_in_mph. Those constants have the advantage of telling you what units are being used. -- AlainPicard

Good point! Of course, I try and work in dimensionless quantities wherever possible, which also helps.

You can define DAYS and YEARS to be 1, and then write 365.2425 * DAYS / YEARS, or 300000 * KM / SECOND. This is a solution to the units problem. -- AmirLivne

That seems to me to be a good way of getting hard-to-find errors during maintenance. The code does nothing, can't be checked by a compiler, and gives the impression of being meaningful. Without an explanation I would wonder why the implied information isn't in a comment.

I agree that defining DAYS and YEARS as suggested is a bad idea. Adding unnecessary terms/factors to formulae is another form of NumberPollution. And finding them defined to be 1 would confuse me. Better to put units into the symbolic names, as in days_per_year, or degrees_per_circle, or whatever. However, a problem with hard-coding the units into the names like this is that it makes it more difficult to write code that can be used in both metric and non-metric locales. I worked in the traffic engineering field, and half the clients wanted the canonical units of speed to be miles-per-hour, and the others wanted kilometers-per-hour. --KrisJohnson

Where possible, use dimensionless quantities. If computation time is not crucial, you can carry around units explicitly.

All you need to do is decide on some canonical internal units (be they metric or non-metric). Then if you want to input or output that data in a locale that wants the other one, you just convert on the fly.

The problem with that approach (canonical internal units) is that you can get rounding inaccuracies. For example, converting input to canonical form and then back to whatever the input units are might give you a different number. Using the same abstract "dimensionless units" throughout (externally and internally) is often simpler.

Er, that's not a choice you have freedom to make, in most cases. If your values have dimensions, you need to deal with that. You can't just declare them to be dimensionless.

"Dimensionless" may not be the right term. What I meant was "abstract units": units of "speed", units of "length", etc., without specifying whether the units are miles, feet, kilometers, meters, etc. As long as you consistently use the same units throughout, and don't need to convert between those units and other systems, the specific units don't matter. If you have the need to convert between units, and you have the luxury of sufficient CPU power and memory, then by all means, attach dimensions to all your values. But when you're writing code for CPUs with speeds measured in KHz, less than 8K of RAM, communicating real-time data at 1200 baud, simplifications like this are often necessary.

Dimensionless quantities refers to a different approach. Essentially, this limits the number of units you have to deal with, and yield expressions that look the same regardless of the units of the quantities you place in the (i.e., there are not SI and cgs versions of an equation). In some sense the true character of an expression has been divorced from the units you happen to be working in. It is a standard practice in physics, and commonly shows up in some engineering disciplins as well. So this doesn't remove the problem, but minimizes its impact.

I was just kidding about creating constants for "units", and you have got me wrong. -- AmirLivne

Can someone please explain the DimensionlessQuantities? to me? Specific values only exist along with specific units; you can't just say "It's a speed of about 3000." without specifying a unit for that. I understand you can state that E = m * c^2 without specifying units, but if you need to calculate a specific value, you first need to pick a unit. -- AalbertTorsius.

Dimensionless units are most often ratios -- the number of teeth on one gear as compared to the number of teeth on the other, time dilation multipliers, coefficient of friction, index of refraction -- that you use to get one quantity from another with the same units. For instance, if we wanted to get the amount of force being applied to an object because of friction against a surface, we take the amount of force the object is applying to that surface, and multiply it by the coefficient of friction. (note that this only works when the object is moving relative to the surface... stationary objects get a potential resistance sort of thing) I don't think this is what they're trying to refer to up there when they're saying 'dimensionless quantities', but it's the correct definition -- a dimensionless quantity is a quantity that has no units assocated with it, usually because they've all been cancelled out.

The abstract units thing, on the other hand, is something I deal with all the time -- the helmet tracking system used on the comanche (used to overlay symbology on the real world so you get a tank with a box on it and some text next to it instead of just a tank) puts out 16-bit words that, depending on the context, could be 0.06mm per bit, or 0.005 degrees, or something dimensionless that goes into a transformation matrix. -- DanUznanski

Thank you Dan, very insightful. I thought what was meant was using the entities, without specifying units as in the E = m* c ^ 2 example above. (Someone may want to RefactorThis out of thread-mode.)

On AbstractUnit?s: aren't you just creating an extra unit? If I create an AbstractUnit? for measuring angles, with 360 degrees being equal to 100%, I've just created the necessity for another conversion. Aren't you better off just picking one (I prefer metric) and converting to everything else? -- AalbertTorsius

Actually, you should use 2*Pi (i.e. radians) instead of 360 degress, because radians is the dimensionless quantity (i.e. radian is by definition a length / length quantity) for measuring angles. Of course, you can further divide it by 2*Pi such that 1.0 represents 360 degress, but that seems redundant. -- OliverChung

In CeePlusPlus, you can use template metaprogramming to add dimensional analysis with no runtime overhead:

Is there some non-numeric way to express this:

	private static final int RETRY_COUNT = 10;

The question is ill-posed. Is there a non-numeric way to express "10"? Yes, no, maybe, what do you mean? "Ten"? No? On the face of it, you seem to be asking for something self-contradictory, so you need to explain what you really mean.

See also TheSecretLivesOfNumbers.

View edit of April 13, 2009 or FindPage with title or text search