Computers represent numbers by approximations. There is a finite amount of data available to represent every number from negative infinity to positive infinity, so some numbers had to be left out. The numbers which got left out are rounded to the closest number which can be stored in the computer.
There are several different ways to represent numbers on a computer. These representations are different ways of choosing which numbers get left out. (Most numbers in computers IeeeSevenFiftyFour
floating point notation, called "floats" or "doubles" in C, or "reals" in Fortran).
When computers do arithmetic on these approximations, they give results that are approximations of the exact result. The calculated result almost never equal to the exact result. Occasionally the calculated result is very different from the exact result. Most of the time, the calculated result is very close to the exact result.
Due to rounding error, the computer may tell you that 10/5 - 48/24 equals 1.5e-29, which is very close to zero but not equal.
This page is somewhat misguided, as it conflates representations of numbers with numbers, floating point with 'real', and presents false information "Doing arithmetic on real numbers will result in rounding error" is simply not true as written. If there seems to be value in it (right now this page is not referenced that I can see) I will come back and fix it.
It's true that the literal reading of the title is incorrect, but the context I intended was "representations of real numbers on finite digital computers" - I thought it best to avoid picking on single or double precision. That said, maybe I should have said FloatingPointNumbersAreNotEqual?, lest a SmugLispWeenie visit to remind me of how Lisp deals with numbers. (-8 -- MatthewAstley
It's not just the SmugLispWeenie
s. Anyone who has to do "pure math" will avoid using floating-point numbers. There are plenty of ways to represent real numbers, rational numbers, complex numbers and so on without the loss of accuracy that happens when floats are used. Floats work well for some situations. As said below, the fact that so many programmer use them in inappropriate ways is a widespread educational problem.
On the other hand, everyone doing FEA (finite element approximation) in the process of designing real bridges and real airplanes and real plastic-injected toys - uses floating point numbers. They're more than adequate to the task.
Do you do FEA? The people I have met who do - and I have many many, as I used to specialize in highly parallel computing algorithms and tools - worry a great deal about using floats, and go to great lengths to do proper error propagation analysis.
Well, but that conflation is foisted on us to begin with. We are given these programming languages with finite precision floating point number representations and led to believe that you can do math with them. You can, sort of, but you get the wrong answer because the tautologies of mathematics are not preserved. In math, X = Y, but in C, X == Y plus or minus some tolerance that is the programmers' problem to estimate and implement (UntrackedUncertainty?
). If reals were "equal" we wouldn't need ints. -- DavidFlater
C doesn't have any reals, and doesn't claim to (unless my memory is very bad, and the standard writers decided to be stupid for a few days). C has floats though, and the process you describe, however poorly characterized on this page, is well understood. It is a pedagogical problem that so many programmers don't understand floats, but that is a different problem.
The language encourages conflation by giving us elegant support for approximate numerics but not for testing approximate equality. Surely somebody somewhere has got a C++ or Java class that packages a float along with its uncertainty and carries this through all operations. -- DavidFlater
C was designed as a systems language, and it shows. There is little support for mathematics, elegant or otherwise. The type system is good at exposing underlying hardware, but causes many unnatural (from a mathematical point of view) results from mathematical operations. This is certainly not limited to FP operations, consider that in C 1 / 2 = 0, and 1 + MAX_INTEGER (or whatever) overflows. C++ has inherited this weirdness, as has Java and others to certain degrees. What it boils down to is that these languages have poor support for mathematics, so, as you suggest, language extensions or libraries are really the only way out.
Which is as it should be. They are, after all, general-purpose
programming languages, and as such have poor support for most specific
purposes, such as mathematics.
[For languages that provide standard support for math concepts at all
, I think I'd prefer they get the abstractions right and support idiom-based optimizations. (I.e. provide exact reals, and optimize to IEEE floats when it is known the available or required precision is under 7 digits). If a language isn't willing to do this, I think I'd favor the language simply requiring all math be done via libraries.]
It is perfectly straightforward to operate on real numbers in a computer, you do it the same way as on paper - with symbols. If the numbers happen to be smallish integers or rationals, you might want to use a decimal (or other base) expansion. Ah, but you say you can't write down that expansion on paper? Same problem with computers. Oh well, it doesn't really matter does it? The rationals are dense in the field of reals, so we can just use a rational approximation to whatever precision we need. Problem solved.
What's that you say? Running out of memory and time with large rationals in all your calculations? Well, you can't have your cake and eat it too, you know :)
Floating point representations are a compromise, but they are often a good one. They are evenly spaced in the logarithm (except near zero!
), so there is good coverage for a given cost in storage. They are often implemented in hardware, so you can have efficient computations if you can abide by the limitations of the machine representation. However, the price you pay is an inability to represent every number you might wish to exactly. FP operations that can have an exact result do, as a rule. If you are doing integer operations for example, below a reasonably large number, no error is acquired. When FP operations can't represent a result exactly, they will approximate it. The rules of this approximation are easy to understand, and any programmer working with FP should take the time to understand them. It is up to *the programmer* to choose an appropriate representation for numbers in her calculation, balancing accuracy, speed, and storage ... and understand what the choice means.
This article seems relevant:
IEEE floating point math can do exact integer calculations,
a few comments on roundoff error
It seems to say that we don't really *need* ints.
If you calculate lots and lots of numbers, you might want to check out MatrixTemplateLibrary
Alternatively to rational approximations, use Corecursion or LazyEvaluation
, thereby allowing you to work with exact transcendental numbers (like pi, and e). One represents real-numbers as an infinite stream of digits in some base along with base^exponent and potentially a sign (if not using BalancedTernary
as the base). Critically, the amount of memory and processing time required to implement this depends upon the size of the computation and the precision of the final result rather than upon the size or precision of the intermediate computations.
This approach still has problems with infinitesimals (0.999999... vs. 1), UnknowableNumbers
, and requires some careful handling for normal forms (0.1*10^1 vs. 0.01*10^2), so it isn't perfect. But it is pretty damn good. It can also be extended naturally with finite precision concepts.
See also: FloatingPointCurrency