Null Isa Hack

The observations below apply to JavaLanguage, CeePlusPlus, SmalltalkLanguage, PythonLanguage, PascalLanguage, CeeLanguage and all the other languages that allow null pointers. Simply put, extensive applications written in ObjectiveCaml or HaskellLanguage and other functional languages show more than enough evidence that programmers can manage without nulls. And they manage quite well.

What happens in JavaLanguage CeeCeePlusPlus et. comp, is that there are 2 issues: there are pointers that can have a special value (t) and there are pointers that cannot. Those are technically (from the TypeTheory point of view) 2 different types, and the corresponding pointers are also very different and the usage patterns are different. Now the type systems being what they are for all the above mentioned languages,they fail to make the difference. So programmers cannot express this difference in language constructs and be verified by the compiler and have appropriate object code generated. The result is typically a mess and it leads to serious leaks in the modularity.

Let's take the following example:

 clientCode1() {
   char* value1= "DUMMY VALUE";

clientCode2() { char* value2= getenv("USER_OPTION"); /* possibly returns NULL */ f1(value2); f2(value2); }

void f1(char *param) { printf("%s\n",param); }

void f2(char *param) { if ( param == NULL ) { printf("I GOT A NULL\n"); } else { printf("%s\n",param) }
So we have 4 pieces of code: clientCode1 and clientCode2 are the producers of pointer values, and f1 and f2 are consumers of those values. What we need to be able to do is the following:
 if (value2 != NULL) { f1 (value2) }
 else {/* treat special case in here */ }

In the absence of the above we have the following very messy options: The latter case is what is supposed to happen in practice. Since chains of calls are typically significantly long, pointers are omnipresent (especially in Java, Smalltalk), doing it 100% right is very tedious. Therefore you get NullPointerException, or even better memory access violation, SIGSEGV and other goodies.

Go take a look at GatedCommunityPattern before demanding that the API do all this mumbo jumbo, please. These issues have been done to death for nigh unto 20 years now in the C community; why do we need to rehash all this good stuff yet again?

They might have been beaten this problem to death but with no good results. The latest standards for C, C++, Java , C# all suffer from NullIsaHack problem. And most of the significant size Java projects still suffer from NPE. Even things like EclipseIde, IntellijIdea, and NetBeans will throw NPE at you in some of their less tested modules.

In the same time you have 2 extremely easy and elegant solutions that do away with the NullIsaHack problem: Therefore GatedCommunityPattern is a workaround to what is fundamentally a LanguageSmell.

I simply do not get it. What is the hang-up over null. Why is null a problem at all? The only reason I can see is an historical one that no-longer applies: to whit, if pointers are (virtual-)addresses (as they were in C and some versions of C++), then there's a problem trying to access a nll pointer as an address. But in a modern language - and certainly any with a VM - WHO CARES!? Null is benign. No, no NullObject, NULL itself. Send null as message/method, get null back. Simple. Inspect null as a boolean, see "false", as an int one sees "zero" (kinda like wave-particle duality) null is the ultimate default value and takes on different appearances when viewed through the lens of a particular type. I guess it's clear also what I think of purely pedantic approaches to type safety (a la ML): they don't lead to elegance or clarity, but just to stringent unnatural restrictions.

I would say the exact opposite - NULL only makes sense in C++. In C++, a pointer is not an object, it's a pointer to a block of memory that may point to that object, or a block of crud with undefined behaviour. In that case, it makes sense to define special cases as being specific addresses of blocks of crud with undefined behaviour. However, in a language with sophisticated references, it makes no sense to support the Null object - since the Null object represents a reference to something that is NOT of the type of the reference. When I write "Circle mycircle = Null", that's not a circle, even though I just defined it as a circle. That's NULL. This behaviour makes sense in C++, but in Java/C# it's just stupid.

The reason you need NULL is because it's the only mechanism to say "This might be a Circle"... the blind spot comes from the way you check it. You don't ask "Is this object a circle" but "is it not null"? If you were asking "is it a Circle" then it would be obvious. What you've really got is equivalent to the C-style "Union" operator, except that you've enough metadata to determine which type of the union it contains. Logically, why not allow full unions? Why not allow circleOrTofuOrNull, instead of just the usual circleOrNull? After all, doing any non-null operations on a null are errors.... so why not allow tofuOrCircle, where doing non-Tofu operations on a Tofu are errors? Why is null the only "not-Circle" that's allowed to go in a Circle reference? Hell, that's half the reason we have exceptions too - if we can't return the expected object, we have to return a different type... which violates the type of the receiving variable, so we have to do something wierd. The tricks with stack-traces and popping off the stack are just gravy upon that fundamental need.

One solution to get rid of all NullPointerException: switch to a decent language like a FunctionalProgrammingLanguage. You'll soon wonder where all those null pointers disappeared - they do have plenty of "pointers" or, better said, you get to handle dynamically allocated and garbage collected structures and do it a lot without ever meeting a null pointer or looking for it.

Yeah, thanks for the hint. Now, getting back to reality... -- AnonymousCoward

There are only 2 ways the 0/NULL/null/nil can pop its ugly head: 1. uninitialized structures 1. when they are used as special values, for example to mark the end of the LinkedList or the end of branch in a binary tree. And the code processes those special values as if they were no different than any other proper value.

Interesting point. However, the gist seems to be "null is bad" so the language should avoid null. (Please enlighten me on this if I got that wrong.) That's basically the opposite of the point of the question "Why have NPEs?" - to me, null is absolutely fine; it's the NPE that's not nice..

Yes, languages should avoid null because we don't need no stinking null. Uninitialized structures are a bad idea to begin with, and special cases are special cases. The problem is that any language environment can't provide a default good way to respond to dereferencing a null other than NullPointerException or memory violation, core dump and such fine stuff. In a language without nulls you just don't get these goodies, because you don't get to dereference null objects.

Saying "Uninitialized structures are a bad idea" seems to be like saying "an empty cup is a bad idea". Yet how, without a cup being empty, can it ever be used to hold anything. In the case of structures is amounts to saying: everything needs to have a default value. As was learned 30+ years ago with SQL: you can't reduce the real world to that - it doesn't support a robust enough model. So the same with programming. Otherwise you insist on default values everywhere. Rather if one just says "null is fine" then, voila, you have your default value everywhere: empty/unset/space/whatever call it what you will.

You should choose your references better. SQL is maybe 15 years old, certainly not 30, and NULL is one of the greatest design mistakes and the source of the most common errors in SQL. A fresh reading of ChrisDate on the subject might do you some good.

Hmmmm. Interesting. I have never gone to great length to consider that possibility and it would take me a while to investigate it thoroughly. The first thing that comes to mind is the case where an uninitialized object/type does make sense. Local variables, for instance. Or user-supplied values. Of course one could opt for a pre-assigned nonsense or default value, but for those with no clear default one, it seems to me, is in effect renaming null. Like I said, I haven't delved into this (no extensive practical exp. with a language that never uses null - unlike the exp with ObjectiveCee and JavaLanguage which do use null but with very different side-effects: in the former null is benign, in the latter it causes what might be the most commonly thrown "exception") so I can't see it quite clearly. Following the idea/need for default values, however, one then either is often initializing new objects just to avoid null and might also be defining static default instances on a per-class basis.... if that is the approach then it seems to add a fair bit of complexity and not quite resolve the problem...? Thoughts (or better: personal experience of deployed apps that never use null)?

Again, NullObject pattern doesn't resolve all the semantic issues for which NULL is currently used so in those cases NullObject is not benign at all. It will lead to logic errors which are no more benign anyway than NPE; potentially it is worse because NPE at least is an "in your face" kind of error. For example of large programs built without NULL, I'd recommend you take a look at a lot of projects built using Haskell, ML (and Ocaml).

Having alternate, non-NPE, behavior is NOT the same as having a NullObject. It seems to be a question of preference: is the "in your face" approach preferred? I find it both less desirable and having negative consequences whereas default "do nothing" behavior is better. It's also an extension of thinking as objects. First we learn to think as the thing we're programming: "be" the widget. Then we can learn to "not-be" or be nobody, be space. Nothing there, so nothing is done. Another way to think about it is: is it "ok" to talk to a wall? My answer is "sure" just don't expect any (re-)action. Same thing with NULL - send it all the methods/messages you want. It basically means you don't worry about NPE (or rather the conditions of variables being NULL, since, what I propose is entirely doing away with NPE - just, simply, get rid of it and have the obvious "default" non-behavior for all null references) unless it makes sense from the specific logic of the program itself. In short: why should a language try to tell the programmer the semantic meaning of a variable's value? That makes no sense. But NPE in effect does that. It says: null + invocation MUST be an error. Yet, in truth, most often it is not. I prefer languages that don't get in the programmer's way. As for ML, at the risk of being controversial, if you treat a programmer as a moron (by putting them in a straight-jacket) guess what: you'll get moronic programmers. If you treat programmers as smart - expect that of them and provide flexible languages - that too will be a self-fulfilling approach.

NULL + invocation is a programming error more often than not in my experience. What is a NULL at the end of the list supposed to respond to nextNode() (yet another NULL? infinite loops anyone?) What is a NULL supposed to respond to a trivial function returning let's say an int 0 so exp(2, null) should be 1 as well as cosine(null)? You should know your mathematics better than that. If a programming error prevented the initialization of a variable a mathematical calculation should be able to return a flawed result that is perfectly reasonable for the unsuspecting user?

Your OO mumbo-jumbo (StopUsingMetaphors please) can't brush off the real issue. While the language (either at runtime or compile-time) cannot decide the normal semantics of a NULL value, it can decide it is a programming error, doing nothing is even more a semantic decision made by the language designer on behalf of programmers and what is even worse, is that the decision is hidden away from programmers.

As for ML attracting moronic programmers, I can assure you that the least of ML programmers are way smarter than 90% or more of the rest of programming population. A smart programmer will expect a well designed language to let him handle the real important stuff about a program, and there's no shortage of hard problems to solve when programming. The time of the real smart programmers is way too precious to be wasted chasing for trivialities that can be handled by the language (like who allocates, who deallocates, who initializes and where, who uses it and where, who catches an exception , etc, etc ...).

For the first case, you just don't get them in proper languages, and you really don't need them anyway.

For the second case you use SumType?. For example:

 type 'a tree = Empty | Node of 'a tree * 'a * 'a tree
That is a tree is either a Empty (the special case, no stupid null in here) or a Node which is triple containing a properly typed tree - the left branch('a tree), an element of the proper type ('a), and the right branch. So that's it, you won't get to do anything with nulls.

The quasi-impossibility in C/C++/Java family to easily create such sum types with special values, practically forces people to use null - and it is typically used both for input params and for returning results, and for designing structures and this is a hack force unto us by language design. It is tantamount to a huge hole in their type systems.

Which is well known, hence the NullObject pattern. The NullObject pattern is also a hack as SumgLispWeenies? rightfully noticed. NullObject is useful typically to implement null behavior, when it is possible but for example as a return value from a generic Hashmap.find(key), or for many other uses. The problem is that NullObject implements the same interface as regular objects, while the client may really need to treat it as special case in the client context, therefore NullObject polymorphism is really a hindrance for those clients.

NiceLanguage differentiates between "nullable" and "non-nullable" types and statically checks for null-safety.

 String s1 = null; // illegal! s1 cannot be null
 String s2 = "hello"; // OK
 ?String s2 = null; // OK
 ?String s3 = gets(); // OK, s3 might be null now
 int l = s3.length(); // Compile-time error! potential NPE
 if (s3 != null ) { // compiler now knows s3 is not null
 l = s3.length(); // this is legal
The language is Java-like with additional static checking and some functional-inspired features. The compiler generates Java bytecode.

I can see the utility of insisting that some refs are not-NULL and to throw an exception on assignment. This is a way of saying s1 (et. al.) is a REQUIRED variable (must have a value). BUT it doesn't do anything for the NPE question: s2 & s3 can still throw NPE - so to me this basic issue is unresolved. NPE is silly.

No. The Nice compiler will report an error if you try to dereference s2 or s3 without first checking that they are not null. After a successful compilation, there is a guarantee that no NullPointerException will occur at runtime.

Note: Special emphasis made of the Java context for this discussion to delimit it from other languages' use of NULL. Remember, C, C++, Pascal, and other languages use NULL to good advantage. Please don't lump their use of NULL in with Java and some of these other things.

In view of what's been discussed here how can one claim that C, C++, Pascal and other languages (sic!) use NULL to good advantage? At least Java throws NullPointerException which is far better than C/C++/Pascal (and other languages?) that crash the program. How can one claim that program crashing is better than NullPointerException? [Also keep in mind that catching a NullPointerException allows the program to continue execution, but catching a segmentation fault does not guaranteed the cause is really due to NULL dereferencing, nor can you be certain that no memory overwriting has occurred.]

Not to mention that in C/C++/Pascal one can have undefined pointers (pointers who are not assigned a value before they are used, thus "pointing" to a random memory location), and dereferencing an undefined pointer can succeed (with non-negligible probability: think when a pointer variable lands in the stack space where another pointer variables was previously there) with terrible and uncontrollable results. C/C++/Pascal handling of pointers, including null pointers is far worse than Java.

Gee, wow! Imagine that! If somebody uses pointers before they contain a valid value or after the target space has been deallocated there is going to be a problem?!? Hey, I better get on the phone to K & R so they can include that in their next book! Oh, wait - it's already there, ain't it? And just by the way - most modern compilers offer some sort of runtime null pointer handling built in to the Cstart or other runtime procedure handling. That's nice, but not the point.

Hey, the fact that the language admits such usage is a hack. Maybe it was a decent enough of a hack during good old days of K&R but it is a lousy hack by modern standards.

As with any powerful tool, NULL can cause no end of grief if it is misused. My Desert Eagle is a very powerful pistol, but it's only dangerous if I allow it to point at my foot. I learned long ago how not to point it at my foot, so there's no need for a target discrimination override on the trigger. When one is using a powerful language like C (frequently described as a "super assembler") one has to keep track of one's use of resources. Otherwise you end up with all kinds of problems, not just this NULL thing. But hey, Java's NULL pointer exception thing is really great. Just don't include the rest of the programming language world in with Java when whining about that particular exception handling.

As with any powerful but obsoleted tool, NULL has no merit any longer. This was the case made on this page, and you made no counter-argument why NULL still has some merit. Your pistol analogy is nice, but how about this analogy: do you think it makes sense any longer to have cars without seat belts ? Well, a language design like C that allows one to use illegal pointer values is exactly like a car design without the seat belts. How's that for an ArgumentByAnalogy.

[The analogy would hold if the car forced you to wear the seatbelt, for example, by stopping the engine anytime the seatbelt was not used correctly. While we're beating analogies to death, consider the screwdriver: a useful tool that is also dangerous when used incorrectly. Used as a prybar, it has a funny way of making holes in hands. It can be used as a weapon. Is it reasonable to argue that a screwdriver is obsolete because it permits incorrect uses? Perhaps there is value in the tool's simplicity, even if it can be dangerous in the hands of a fool.] {And since we come to deeply imbricate analogies, I think of using of matches to light the gas stove. Sure it was a fine tool (some people are still using it) and if you're careful enough you won't burn your fingers with it, but modern cooking machines avoid this hassle altogether and they light the fire themselves. Even better, if there's no more burning for whatever reason, they will turn the gas off. The bottom line is that NULL is a fine tool to use if you're constrained by CeeLanguage (or JavaLanguage for that mater), but if you were to design C again, you wouldn't need to use NULL, because it's obsoleted and unsafe. Modern type systems, compiler technology and language design can tell you when your pointers are not initialized.

Not a bad analogy, but analogy is only so useful. After a while it becomes obfuscating rather than clarifying. NULL still has merit in that it still performs the function it was originally intended to perform; a special case pointer value that is well known and can be used for all kinds of processing and comparison purposes.

I don't have statistics at hand but pointer mistakes are common. Ponder Java vs. C handling of bad pointers: a complex environment like VisualC++ can be crashed entirely by such a mistake, and you lose a lot of time and maybe a lot of work, while a complex environment like EclipseIde may show you a NullPointerException message triggered in some obscure module. Which one do you prefer as a user?

As a developer I learn to respect and properly use the tools I have. If I do things that are going to hurt me I should know better. "Doctor, it hurts when I do this." "Well, DontDoThat." Pretty simple, really. There is nothing wrong with NULL when it is used properly. I'm sure that statistics will say that so-and-so many errors are the result of pointer mistakes. Okay, if it hurts then don't do that. Use the tool that is appropriate to the task. If you don't know how to avoid pointer errors then use something that will CYA, such as Java.

For a C weenie it's pretty strange if Visual C++ has never crashed on you (not even a "benign" internal compiler error?). Either you're not a C Weenie or you haven't programmed for Windows. How about gcc, gdb et. comp., then, have they never crashed?

Oh, man!! If I don't know how to avoid pointer errors? :) Are you free of pointer errors? If I think that worrying about pointer errors is a useless past-time, and chasing pointer errors introduced by different programmers because of poor communication is even more useless of an occupation ("I didn't know that function X can return NULL", or "I didn't realize that I shouldn't pass a pointer to a stack object", and on and on the stories go), then I can simply use a language where such errors simply do not happen. As a developer maybe you are too fond of the tools you have. You can have better tools: for a C weenie you might like OcamlLanguage, it's kind of close to the metal in a nice sort of way.

As long as we're slinging derisive labels around: I guess if you can't hack <ahem> the heat, stay out of the kitchen. Java weenies don't belong in an environment where they don't have their mama wiping their noses and don't have the runtime environment cleaning up resources after they drop them on the toy room floor. Don't play with Daddy's tools, Billy.

I'm curious how this discussion interacts with dynamically typed languages. Lisp, and Ruby, for example both have a "nil" object which really doesn't seem to cause problems. I know in C++ (my job's primary language) that the language forces me to use NULL for potentially uninitialized values (no argument here, it can be annoying). But I've never really felt the problem with dynamically typed languages. In those cases, I expect null and branch on it (base case of tree insertion, etc...). Is Null bad in such a context? Why? -- DaveFayram

Aren't there other existing topics on this? See NullConsideredHarmful

NullIsaHack because you can have perfectly good "nullable" behaviour with Option or Maybe types.
    -- haskell language
    data Maybe a = Just a | Nothing
    fromJust :: Maybe a -> a
    fromJust (Just a) = a
    fromJust Nothing  = undefined -- NOT a null value, this is throwing an exception.

-- An example which uses a list instead of a hashtable, -- because I'm too lazy to write a hashtable. -- Or a trie, for that matter. data List a = Cons a (List a) | Nil get :: Int -> List a -> Maybe a get index (Cons _ tail) = get (index-1) tail get 0 (Cons head _) = Just head get _ Nil = Nothing

list = Cons 0 (Cons 1 (Cons 2 Nil))) val0 = get 0 list --> Just 0 res0 = fromJust val0 --> 0 val1 = get 100 list --> Nothing res1 = fromJust val1 --> undefined; will crash if you try to output it.

CategoryProgrammingLanguage CategoryNull

View edit of May 29, 2011 or FindPage with title or text search