A pointer that points neither to a valid object (of the indicated type, if applicable) nor to a distinguished null value, if applicable. (In CeeLanguage, pointers one element past the end of an array are legal but not dereferenceable; they are very similar to NullPointers in this regard.)
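The one-past-the-end case can be sketched in C as follows. This is only an illustration; `sum_range` and `end_of` are hypothetical names, not anything from the page. The end pointer is formed and compared against, but never dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the one-past-the-end pointer for an array of n ints.
 * Forming this pointer is legal as long as n does not exceed the array length. */
const int *end_of(const int *base, size_t n)
{
    return base + n;
}

/* Sum the ints in the half-open range [first, last).
 * `last` may be a one-past-the-end pointer: it is only compared, never read. */
int sum_range(const int *first, const int *last)
{
    int total = 0;
    for (const int *p = first; p != last; ++p) {
        total += *p;   /* p != last here, so p points at a real element */
    }
    return total;
}
```

Like a null pointer, the end pointer acts as a marker that the loop tests for rather than a location it reads.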
WildPointers can be broadly categorized with respect to time-related behavior in exactly 4 ways; a pointer value is invalid because:
- It never had a valid value (uninitialized, and has garbage value)
- It used to have a valid value, which did not change, but transitioned from valid to invalid (a dangling reference/DanglingPointer, e.g., to a freed heap cell or a stack cell after return from function or exit from inner scope)
- It has a value that will be valid in the future, but isn't yet: pointer was created but referent not yet initialized (bad design allows pointer access before initialization is guaranteed, or a similar issue with concurrent access during creation)
  - Many CeePlusPlus compilers perform this sort of optimization (stashing a pointer-to-garbage into a register before calling the appropriate constructor, which then turns the garbage into a valid object), which is why DoubleCheckedLockingIsBroken (or at least isn't portable, though strictly speaking, no C++ code that uses semaphores is portable).
- It has been overwritten with a garbage value, e.g., by:
  - pointer arithmetic that adds or subtracts a value that takes the pointer out of bounds
  - casting an inappropriate integer to a pointer
  - being overwritten with the address of a valid object of the wrong type following a typecast
  - being clobbered by a write through another WildPointer (domino effect)
  - inappropriate concurrent access, especially on architectures which cannot read/write pointers atomically
  - broken register spilling or failure to properly unwind the stack; use of setjmp/longjmp can cause this problem.
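Two of the cases above (never valid, and valid-then-dangling) have a common defensive idiom in C, sketched here. The helper names `new_int` and `free_and_null` are hypothetical. The sketch only protects the one pointer variable passed in; any other copies of the address still dangle after the free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate an int and initialize it before anyone
 * can see it. Returns NULL on allocation failure, never garbage. */
int *new_int(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL) {
        *p = value;
    }
    return p;
}

/* Hypothetical helper: free *pp and reset it to NULL, so the caller's
 * variable becomes a null pointer instead of a dangling one. */
void free_and_null(int **pp)
{
    if (pp != NULL) {
        free(*pp);     /* free(NULL) is a no-op, so a repeat call is harmless */
        *pp = NULL;
    }
}
```

Calling `free_and_null(&p)` twice in a row is safe, because the second call sees NULL; it does nothing for aliases of `p` held elsewhere.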
The above list is exactly 4 items because it covers all of the relevant possibilities with regard to time (never, past, current, future). (This is my own analysis, not something I found in the literature, so by all means let me know if I overlooked some possibility. -- dm)
Lists of general causes can be arbitrarily large.
Those invalid conditions in turn can be caused by (among many, many possibilities):
- PointerArithmetic done incorrectly
- Freeing an object twice (in languages without GarbageCollection)
- Bugs in garbage collectors (OK, not the programmer's fault)
- Pointers to (or within) ActivationRecords that no longer exist.
- Pointers to uninitialized storage; or pointers with uninitialized values.
- Unsafe pointer casts.
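As a small illustration of guarding against the first cause in this list, pointer arithmetic can be routed through a checked helper. This is a sketch under the assumption that the caller knows the array length; `elem_at` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return a pointer to base[index] if the index is
 * in bounds, otherwise NULL. The arithmetic is performed only after the
 * check, so no out-of-bounds pointer is ever formed. */
int *elem_at(int *base, size_t len, size_t index)
{
    if (base == NULL || index >= len) {
        return NULL;
    }
    return base + index;
}
```

An out-of-range request yields a null pointer, which the caller can test for, rather than a wild one.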
WildPointers are never truly a good thing long term, although they have been known to accidentally produce desirable behavior in the short term. They often lead to HeisenBugs, e.g. when a WildPointer accidentally initializes an otherwise-uninitialized variable, causing the program to work until a different compilation option is used, some code or data is added or changed, or the program is run under the IDE/debugger. Then the locations of things in RAM shift, the WildPointer overwrites something less beneficial, and the code stops working or a more severe problem comes up.
Many languages attempt to (and some succeed at) eliminating WildPointers, using measures such as:
- No objects on the stack; or, if objects do live on the stack, no pointers to them that might outlive them (especially by disallowing pointers to stack objects).
- GarbageCollection, or managed pointers
- No PointerArithmetic
- Default initialization (i.e., to NULL) for everything
- No UnsafeCasts.
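A rough idea of what a "managed pointer" buys can be sketched even in C, using handles that carry a generation counter so that stale references are detected instead of followed. Everything here (`Slot`, `Handle`, `pool_acquire`, `pool_get`, `pool_release`, the fixed `POOL_SIZE`) is a hypothetical design for illustration, not how any particular language implements it. The extra check on every access is an instance of the run-time price such safeguards carry.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8

typedef struct {
    int      value;
    unsigned generation;  /* bumped every time the slot is released */
    int      in_use;
} Slot;

typedef struct {
    size_t   index;
    unsigned generation;  /* generation the slot had when handed out */
} Handle;

static Slot pool[POOL_SIZE];  /* zero-initialized: every slot starts free */

/* Claim a free slot. Returns 1 and fills *out on success, 0 if the pool is full. */
int pool_acquire(int value, Handle *out)
{
    for (size_t i = 0; i < POOL_SIZE; ++i) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].value = value;
            out->index = i;
            out->generation = pool[i].generation;
            return 1;
        }
    }
    return 0;
}

/* Resolve a handle to a pointer, or NULL if it is stale or out of range. */
int *pool_get(Handle h)
{
    if (h.index >= POOL_SIZE) {
        return NULL;
    }
    Slot *s = &pool[h.index];
    if (!s->in_use || s->generation != h.generation) {
        return NULL;
    }
    return &s->value;
}

/* Release a slot. A stale handle is ignored, so a double release is harmless. */
void pool_release(Handle h)
{
    if (pool_get(h) != NULL) {
        pool[h.index].in_use = 0;
        pool[h.index].generation++;
    }
}
```

A handle kept after `pool_release` resolves to NULL even when its slot has been reused, which turns a would-be dangling reference into a checkable null.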
Y'know, ultimately you can't legislate against accidents. At some point the programmer has to be responsible for what he writes. Removing a powerful mechanism from a language (or group of languages) on the grounds that some people have accidents with it is an abdication of responsibility. They pay us more because we (supposedly) have the skill and wisdom not to point guns at our own feet.
At some level of the programming stack (near the bottom), PointerArithmetic is unavoidable. Perhaps it is always generated by a compiler; but all the high-level constructs that we use ultimately boil down to operations on the fundamental machine types--ints, floats, and addresses (give or take a few).
On the other hand, at sufficiently high levels of abstraction, it becomes more of a liability than an asset.
You can add overheads to the compiler and/or the runtime environment to "keep us safe" from all the vagaries of the human mind, but every one of these "safeguards" exacts a price.
A WildPointer is a bad thing, but alertness of the programmer would seem to be a better idea than YetAnotherSeatBelt.
This is becoming less true every day, thanks to MooresLaw.
Do you mean to say that the price is no longer there?
Depends on the application. For CrudScreens, shell scripts, and numerous other types of programs, the price is low enough that it shouldn't be a design concern. For high-performance applications, systems programming, etc., the price is still significant.
Use the language(s) and technique(s) appropriate for your application. That advice has always been sound.
Wild Pointers Are A GoodThing
By the definition of "Wild Pointer" above, a wild pointer is a necessary condition. There are reasons to have a pointer to an invalid object.
- To hold transient objects (objects with a life time less than the life of the program).
- To serve as iterators and point to either a memory location before or after a set of objects.
Please note that GarbageCollection does not completely solve the problem of memory management. It replaces the problem of early deallocation of memory (and pointers to invalid memory locations) with the problem of late or non-existent deallocation of memory (and memory leaks).
If one has two "owners" of a single object, there are lots of data change issues. Double deallocation is only one of them. -- AnonymousDonor
- Why don't your transient pointers go out of scope when their lifetime expires, thus eliminating the wild pointer?
- Managed pointers (e.g. auto_ptr in CeePlusPlus) can help with this, but frequently scope lifetimes don't jibe with pointer referent lifetimes
- Iterators that do this safely use a distinguished "null" value to mark the beginning or end of the iteration. Note that null values don't have to be (void*)0; any value that's specifically checked for by the algorithm counts as a null by the definition above.
- That's not at all a universal statement; it doesn't apply to the above example of a pointer equal to a guard value just beyond the end of an array, for instance.
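The point about a distinguished value that is not `(void*)0` can be sketched with a circular list whose end marker is the address of a sentinel node. This is one possible arrangement, with hypothetical names (`List`, `Node`, `list_begin`, `list_end`, and so on); the iterator is compared against the sentinel's address, and the sentinel's `value` field is never read.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
    struct Node *next;
    int value;
} Node;

typedef struct {
    Node sentinel;  /* its address is the distinguished "end" value */
} List;

/* An empty list is a sentinel that points at itself. */
void list_init(List *l)
{
    l->sentinel.next = &l->sentinel;
}

/* Link caller-owned node n at the front of the list. */
void list_push_front(List *l, Node *n, int value)
{
    n->value = value;
    n->next = l->sentinel.next;
    l->sentinel.next = n;
}

Node *list_begin(List *l) { return l->sentinel.next; }
Node *list_end(List *l)   { return &l->sentinel; }

/* Iterate until the iterator equals the sentinel address; the sentinel's
 * value field is never read. */
int list_sum(List *l)
{
    int total = 0;
    for (Node *it = list_begin(l); it != list_end(l); it = it->next) {
        total += it->value;
    }
    return total;
}
```

Here the "null" that ends the iteration is a non-zero address, and no pointer in the structure is ever NULL.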
GarbageCollection does not lead to MemoryLeaks. The purpose of a heap allocator is to make sure a free block of memory can always be allocated upon request. A garbage collector simply frees unused memory whenever a memory request would otherwise fail (or frees a bit with each allocation, for IncrementalGarbageCollection). The programmer cannot tell the difference.
- That's not the point; he was just saying that one can still have memory leaks that GC doesn't fix -- not that GC causes them.
Resource leaks are another concern, and a known problem with garbage collectors, but that's beyond the scope of this page. -- JonathanTang