# Pointer Arithmetic

Arithmetic performed on pointers. It assumes that a pointer is little more than a physical machine address, or has components that are. Useful for several things:

• Accessing sub-fields of an object (assuming you know the layout of an object)
• Traversing an array of some object (here, an array is defined as a sequence of objects arranged contiguously in memory).

A prominent feature of C/C++ (and a few other system languages). Widely ConsideredHarmful outside the systems programming domain; most high-level languages do not allow pointer arithmetic. There are two main reasons for this:

• it limits how the language implementation can lay out objects or arrays. In particular it cannot use different layouts depending on context.
• the combination of pointer arithmetic and lack of BoundsChecking is particularly error-prone, easily leading to WildPointers.

However, some languages, for example CycloneLanguage, support a restricted form of PointerArithmetic that only applies to arrays, and that always performs BoundsChecking. This variant is not ConsideredHarmful.

Even within C/C++, pointer arithmetic is somewhat restricted. If you have a pointer to some type Foo, you can only adjust it in increments of the size of one Foo (written "sizeof (Foo)"). Finer adjustments require casting the pointer to an (unsigned) char *, doing the math on that, and casting back. All objects in C and all objects of PlainOldData types in C++ can be treated as arrays of bytes (unsigned char in C; either char or unsigned char in C++). The representation of objects in terms of bytes is largely unspecified, though the layout of structs can be determined using the offsetof macro so long as they are PlainOldData.

According to the C and C++ standards, pointer arithmetic is only valid within an array or just off the end of it: it is valid to point to "one past the end" of the array provided that such a pointer is never dereferenced. Most (if not all) C/C++ implementations do not check this; proving that pointers stay within an array's bounds is undecidable at compile time, and checking at runtime is deemed too expensive (for C/C++).

But not too expensive for CycloneLanguage. As usual, the undecidability argument is a RedHerring: many bounds checks can be eliminated at compile time, and the rest can be checked at runtime.

It's not a RedHerring, it's just different design constraints for Cyclone vs C/C++. It is certainly true that some bounds checks (possibly all, in unusual programs) will have to be done at runtime due to undecidability, just as he said. Whether that matters or not all depends.

More to the point, perhaps, I've seen many instances of inner loops where it's undecidable, from real world code (when working on code generators). You can say the slowdown doesn't matter for some particular purpose, but if it matters, you shouldn't just hope it doesn't arise -- it does arise in practice.

This appears to be the approach taken by C/C++. If the validation does not matter, one does not need to do it; if the validation is important, then one is free to validate.

A few other notes:

• A common trick in C programs to simulate arrays with 1-based indexing is to use PointerArithmetic to create a pointer to the element before the first element of a real array; i.e.

```
Foo zeroBasedArray[size];
Foo *oneBasedArray = zeroBasedArray - 1;
```
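This trick is highly unportable, and can actually fail on some architectures. Merely computing a pointer before the start of an array is outside what the C and C++ standards allow, even if it is never dereferenced. On most modern CPUs, pointers are harmless until dereferenced; but on some older machines the mere act of loading an invalid pointer into a register may cause a GeneralProtectionFault.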

• Another one of the common CeeIdioms in SystemProgramming, which is now a part of AnsiCee (it is currently not a part of ANSI CeePlusPlus, though most compilers will let you do it), is the ZeroLengthArray at the end of a structure. It's not uncommon to find a variable-length data structure consisting of a header followed by zero or more instances of some (fixed length) substructure; the ZeroLengthArray allows you to do this. One then uses PointerArithmetic to access the elements of the array. However, attempting to use the ZeroLengthArray is technically UndefinedBehavior! One cannot write portable code in this manner, but in the domain of systems programming, it is often known what the behavior of this construct will be: if it is known that valid data lies beyond the header, this trick will work.

• What is part of AnsiCee (since C99) is the FlexibleArray ("int a[];"), not a ZeroLengthArray ("int a[0];"). The latter is a GnuCeeism.

CategoryPointer

Last edited November 8, 2014.