Null Considered Harmful

Does anyone else have a problem with the number of NullPointerExceptions that get thrown in Java? These can be seriously difficult to debug for the beginner, and also a pretty hard slog for an experienced developer, especially if you somehow lose your stacktrace on the way to receiving the exception.

For a discussion on an entirely different approach to null valued object references, check out: NullIsBenign. Imagine a java flag that would give default null behavior rather than throwing NPE!, i.e., java -dontPanic ...

Is it possible to get rid of this class of bug? It occurs any time you try:

and object is null.

It seems in Java that any time you declare an object, you are declaring 0-1 references to it, that is, either a reference to an object, or a reference to no object (to null):

 public Object o = ...	// this is either a reference to an object, or null

public Object[] os = ... // this is either a reference to an array of 0-m objects, or null. // In addition, the elements of the array may refer to objects, or they may be null.
For completeness, how about adding, "full", meaning it can't be null:

 public full Object o = ...	// this is exactly 1 object, and can never have null assigned to it.

public full Object[] os = ... // this is exactly 1 array of 0-m objects (presumably default-constructed).

public full Object[full] = ... // this is exactly 1 array of 0-m references to objects, none of which are null.

To support this, I think you'd need to at least be able to cast using (full Object) to this new full type. Additionally, it might be worth having some compiler shortcuts like:

where this kind of compiles to the equivalent of:

 if (x != null) {
Anyway, it's just an idea. I think it would also help firm up a lot of contracts, and mean that you throw IllegalArgumentExceptions? less often. Would this work? Are there any other languages that would support it? Could the JVM as it is support some kind of extra syntax like this? -- RobMoffat

This is implemented in the NiceLanguage, which is an extension of Java. The idea is when you declare the type of a variable or a method argument, you choose whether it can be null or not. It's a similar idea to this "full" modifier, but the other way around: types do not allow the null value by default, you make them accept it by prefixing them with a question mark (?). So the example is written:

 public ?Object o = ... // this is 0 or 1 object
 public Object o = ...	// this is exactly 1 object
 public ?Object[] os = ... // this is exactly 1 array of 0-m objects, each of them possibly null.
 public Object[] os = ...  // this is exactly 1 array of 0-m objects, none of which are null.
The compiler checks every method call and assignment. So it guarantees that no NullPointerException is thrown at runtime. Tests comparing to null are considered in the checks. For instance, the following compiles:

 ?String s1 = getAPossiblyNullString();
 if (s1 != null)
   /* In this branch, the compiler knows that s1 is not null, 
      so can call methods on it.
   System.out.println("The string has length: " + s1.length());
   System.out.println("The string is null");
If you know that a value is not null although its type is, you can "cast" with the notNull construct. Then a runtime check is inserted. It will fail exactly at that location if the value is null, so there is no need to trace back the bug. -- DanielBonniot

The notation 'Object[?]' above is strange. Why is it that way? I'd prefer the relevant two lines from above as

 public ?(Object[]) os = ... // this is 0 or 1 array of 0-m objects, none of which are null.
 public (?Object)[] os = ...  // this is exactly 1 array of 0-m objects, each potentially null.
 public ?(?Object[]) os = ...  // this is 0 or 1 array of 0-m objects, each potentially null.
These are all valid syntax.

The notation ?Object[] is potentially ambiguous. By convention, it means (?Object)[]. You can get ?(Object[]) either by writing exactly that, or by using the shorthand Object[?], for those who don't like parenthesis.

[Note: I changed your description, because ? is actually the marker that null is allowed, not the opposite. Also, Object[?] was previously referenced above by my mistake. Possibly null arrays were not discussed yet.]

The LavaLanguage has a keyword 'mandatory' for fields with the effect that they are not allowed to be assigned null.

I think Java has too many keywords as it is. Something like NoNullBeyondMethodScope seems like a less intrusive solution. -- ChristianTaubman

The problem with this approach is that it is only a convention. You cannot guarantee that null will not be passed as an argument. So either you explicitly check at the start of each of your methods (obfuscating the code), or you can get NullPointerExceptions at runtime that might be difficult to relate to the original error.

If you check for null in the classes that make calls to third-party code, that should be enough. -- ChristianTaubman

It's a good solution if you need to use javac. I tend to think that getting the compiler (nicec) to automatically do the job for you is a good thing, since it's possible. -- DanielBonniot

Would assert() in JDK 1.4 and higher be your friend here?

No, as it does not get you automatic checking at compile time.

This idea is quite common in functional languages. In strongly typed languages such as ML and Haskell, by default variables cannot be null. Where a null value makes sense, a separate type can be used; ML has the concept of an optional value, which is an enumerated type:

type 'a optional = Some of a | None;;

A similar concept can be implemented in C++ using templates - see, for example the Boost optional<> template at .

[Database specific stuff moved to NullsAndRelationalModel]

The Null in a relational database is something totally different than a Null Object. It's more of a value meaning Unknown. For example, if you had a database record on me (an anonymous person) and a field IsCertifiedGenius?, there is no way that you could correctly place either True or a False into that field - you just don't know. Therefore you've got to leave the field empty - that is, Null.

Or if this is a relational/constraint MultiParadigmDatabase, you could leave the field unconstrained.

Unless you're omniscient or limit your database (both records and fields) to things you have 100% knowledge about, you need a third value meaning unknown. You can't model most real world items otherwise. This is a totally different concept to say null pointers or null objects, which mean basically "I haven't been set up" (and you get problems with null references when "I haven't been set up but I've been asked to do something anyway").

Regarding "genius" example, you basically have a 3-way choice: Yes, No, and Unknown. Just another list of choices such as could be found in a pull-down list. I don't think we need a formal Null to represent that. Besides, in practice you usually assume "No" until the information is provided. Or define it as "Cert verified", so "no" is sufficient if it has not been verified. Or perhaps split it into 2 fields: "claims cert" and "cert verified" (or "date verified", however this gets into the issue of non-set dates). I realize that null objects and null fields may be different issues, but they both fit the title. Perhaps split this into NullCellsConsideredHarmful? and NullObjectsConsideredHarmful??

Actually, that doesn't work out. First, if a field has a possible default value, declare it as having a default when defining the associated column. Second, when you get to the place where it makes sense to allow "unknown"s and where no defaults should be set, you either get back to NULL or, much more complexly (and as proposed above) and additional (redundant) field, with then custom logic to manipulate the range of values. I've seen managers override engineers to try to avoid NULLs - they didn't get the concept it seems. Only, of course, to have things need re-engineering down the road, when the system was incapable of correctly handling the problem domain. Once one acknowledges the conditions that lead to a possible NULL value, the rest (seems to me) amounts to a diversity of approaches of how best to represent it. Using 2 fields seems to be a harder way than the standard ThreeValuedLogic.

The concept that "If you check for null in the classes that make calls to third-party code, that should be enough." is also a fallacy - it assumes your own code is perfect. You might be validated in commenting out your null checks after you've thoroughly tested your code, but given the messy consequences, any performance hit is likely to be lesser part of the trade-off.

Unless you've got some real reason not to, initializing everything that otherwise would be a null seems a pretty good idea. Don't just leave them floating around in the system waiting to cause a (sometimes subtle)problem. Eiffel's concept of having a benign no-methods object as the default reference seems good. Obviously the program's not doing whatever the programmer thought they intended, but at least it doesn't cause the system to collapse into a gibbering pile of jelly.

OR, much better yet, have a switch for running Java that says, in effect, NULL IS NOT A PROBLEM: return null rather than throwing NPE's. I've had full fledged production ObjectiveC systems that never broke because of the greater flexibility of having default (non-exception) null handling. The problem/situation you raise is EXACTLY the reason that I think NPE is the absolutely WRONG way to go. It is ungraceful, to say the least. Having default null behavior doesn't replace good testing. However, having NPE means that any otherwise benign null (i.e., a method that doesn't actually get invoked because the obj. ref is null - nothing happens, system state remains unchanged, all's good) will give you a guaranteed exception that might cause system collapse. Sounds like a very silly, or rather dumb, approach to an imaginary problem. In short, NullIsBenign!

[RelationalDatabase]s are based on SetTheory. A table is an unordered set of rows. A row is an unordered set of key/value tuples. A null is not a value - it indicates that there is no tuple for that key in the row.

[More stuff moved to NullsAndRelationalModel]

(Talking about Java, not databases)

I've always thought that a NullPointerException was not a problem. Java has just saved you from a situation where C would crash. You get a stack trace to show the place where the dereference occurred, so you can start finding out why you expected a non-null value.

What other real course of action do you have? Put in a reference to a "fake" object instead. That's not useful. In practice I generally find that I can located the cause of a NullPointerException pretty quickly. I think most people are playing ShootTheMessenger here.
I find Java's RuntimeExceptions, especially NullPointerException, to be useful debugging/code prettification tools. I gradually debug/refactor the code (using things like NullObject in the refactoring) until I prove to myself that the exception can't be thrown, then I remove the try/catch. -- KarlKnechtel

Here's the Java extension I'd like to see. You should be able to define a method, analogous to a static method, that applies when the object is null. For example,

public null int size() {return 0;}

Then, when foo is null, foo.size() returns 0. For backwards compatibility, the default behavior when you don't define a null method will be to throw a NullPointerException. -- AnonymousUser

You can actually define methods that work on null in the NiceLanguage:

 package test;

class Foo {}

// Declaration of a method size with a possibly-null argument foo int size(?Foo foo);

size(null) { return 0; }

size(Foo foo) { ... }

void main(String[] args) { ?Foo foo = null; int x = foo.size(); // 0 }
-- DanielBonniot

In the presence of polymorphism, how would it know which method to call?

Nice has MultiMethods, so this is part of the method resolution algorithm, but I'm not sure of the exact details.

What if the declared type is an interface?

Why would that cause any problem?

Do you allow implementations in an interface, but only for null methods?

See <>.

What if a class implements two interfaces that both define a null method and then that method is called on a null variable of that class. -- JonathanTang

This can only happen if you declare a null method twice (note that the declaration was outside the class in the above example), which is an error, I think.

CycloneLanguage also supports never-null references/pointers. -- DavidSarahHopwood

What would a language look like if it didn't have Null? A LanguageWithoutNull? might be pretty interesting.

Haskell, and other languages with the HindleyMilner type system, don't have the concept of null.

As mentioned above such languages usually provide an optional type (in HaskellLanguage):

 data Maybe a = Just a | Nothing

Maybe is a datatype that can be either just some value or nothing. Usage is simple:

 findCustomerNamed :: String -> Maybe Customer
 fullName :: Customer -> String
 lastOrder :: Customer -> Maybe Order
 date :: Order -> Date

fullNameOf :: String -> Maybe String fullNameOf name = case findCustomerNamed name of Just customer = Just (fullName customer) Nothing = Nothing

Of course using the case ... of construct can become cumbersome:

 lastOrderDateOf :: String -> Maybe Date
 lastOrderDateOf name = case findCustomerNamed name of
                             Just customer = case lastOrder customer of
                                                  Just order = Just (date order)
                                                  Nothing   = Nothing
                             Nothing       = Nothing

So we can define a map-like operation for optional values and rewrite our code: (In the Prelude as "fmap" for general Functors!)

 map :: (a -> b) -> Maybe a -> Maybe b
 map f Nothing  = Nothing
 map f (Just x) = f x

fullNameOf name = map fullName (findCustomerNamed name) lastOrderDateOf name = map date (map lastOrder (findCustomerNamed name))

We also can use the monadic bind operator (>>=) because Maybe can be used as a Monad:

 fullNameOf      name = findCustomerNamed name >>= fullName
 lastOrderDateOf name = findCustomerNamed name >>= lastOrder >>= date

Which is very similar to the full concept described above.

Both Maybe and >>= are part of the HaskellLanguage prelude.

-- AliMoe?

The default acceptance of "null" as an allowed value for a variable also makes static analysis much harder, unless the programmer adds lot of "not null" annotations. It is much easier to do good static analysis if the default is not to allow null and the programmer has to explicitly state where null is allowed. NiceLanguage does this, and we also do it in the Escher Tool language so that we can support VerifiedDesignByContract. In practice, we find that not allowing null is much more common than allowing null. - DavidCrocker

See also: NoNullBeyondMethodScope NullObject NullIsaHack

Contrast: NullIsBenign
What if a language allowed overriding of null as an type, similar to operator overloading? With an event handler thingy to trap/resolve nulls. For example in math, A+B+Null will return null but your override could return A+B. And Null operations could return an enum indicating which definition of null we are using: Null.Empty, Null.Uninitialized, Null.Undefined, Null.Unknown, etc. Each of those could have programmable meanings. Testing for IsNull? all the time is a PITA.
I rarely have unexpected NPE's, because I litter my code with assertions and precondition checks. That way you often get the error at exactly where it needs fixing. - WouterLievens?
C# added ? as a modifier on value types to allow them to be null (such as int? i = null;), which made me really want a ! modifier for objects to make them non-null (to be able to say void foo(SomeObject?! o) {...}). Sometimes it feels like half the code in these projects I end up working on is just if (arg == null) throw new ArgumentNullException?("arg");, and I can't even make a macro to lessen the load. -- ScottMcMurray
While its proof of concept uses Flash, this paper <> may raise some general concerns about the security implications of null dereferences in general. - Khan Klatt
I always thought Objective-C's treatment of NULL (nil) was interesting. Nil responds to all messages (methods) with nil. Anything you send to nil returns nil. nil == 0 so it equates to false. It saves a lot of trouble and prevents needless crashes where things could otherwise carry on.
CategoryJava CategoryConsideredHarmful CategoryNull

View edit of October 25, 2012 or FindPage with title or text search