What Is Null

What is this thing called "NULL"?!?

Null or Nil Characteristics: Is this standard terminology? In my C days, I thought of {char *s = NULL;} as a NullString, and {char *s = "";} as an EmptyString. Yes, this is pretty standard.
A few other things that act like NULL in some contexts.

EDS had "OWL," a research language related to the SmalltalkLanguage. It supported several NULL-related concepts: (I think there were two or three others too, but I no longer have access to the documentation.)

I often found relational database ThreeValuedLogic to be annoying. This system's 5-value "boolean" logic could be really confusing... Like "true and unknown ==> unknown", "true or unknown ==> true". -- JeffGrigg
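
The two rules quoted above are the ones ordinary three-valued logic gives. Here is a minimal sketch of that three-valued case only (not OWL's five-valued one, whose rules are not documented here). The names TriDemo and Tri are made up for illustration.

```java
class TriDemo {
    // Declared in ascending "truth" order: FALSE < UNKNOWN < TRUE.
    enum Tri {
        FALSE, UNKNOWN, TRUE;

        // AND is the lesser of the two operands in that order.
        Tri and(Tri other) {
            return this.ordinal() <= other.ordinal() ? this : other;
        }

        // OR is the greater of the two operands.
        Tri or(Tri other) {
            return this.ordinal() >= other.ordinal() ? this : other;
        }

        // NOT swaps TRUE and FALSE and leaves UNKNOWN alone.
        Tri not() {
            switch (this) {
                case TRUE:  return FALSE;
                case FALSE: return TRUE;
                default:    return UNKNOWN;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(Tri.TRUE.and(Tri.UNKNOWN));   // UNKNOWN
        System.out.println(Tri.TRUE.or(Tri.UNKNOWN));    // TRUE
        System.out.println(Tri.FALSE.and(Tri.UNKNOWN));  // FALSE
    }
}
```

The annoying part is visible in the last line: whether an unknown operand "infects" the result depends on what the other operand happens to be.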


Yes. The problem with trying to add semantics to null is that the list goes on forever. ChrisDate (from the database community) wrote a paper a few years ago that showed a huge number of possible meanings for null in a given context (e.g. not applicable, not known, not declared, and so on). If I remember correctly, the essence of the article was that trying to enumerate different forms of null was basically impossible, since somebody will always come up with a new meaning. The way we went at AT&T (where I was working at the time) was to use different NullObject classes, each with a different meaning. When an application needed a new meaning, we added a new NullObject class. So, there were no fixed semantics. -- AnthonyLauder
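
A rough sketch of what "one NullObject class per meaning" might look like. This is not the AT&T code; PhoneNumber and the class names below are invented for illustration.

```java
class NullObjectDemo {
    // The interface every phone number, real or "null", implements.
    interface PhoneNumber {
        String display();
        boolean isPresent();
    }

    static final class RealPhoneNumber implements PhoneNumber {
        private final String digits;
        RealPhoneNumber(String digits) { this.digits = digits; }
        public String display() { return digits; }
        public boolean isPresent() { return true; }
    }

    // One NullObject class per meaning; a new meaning is just a new class.
    static final class UnknownPhoneNumber implements PhoneNumber {
        public String display() { return "(not known)"; }
        public boolean isPresent() { return false; }
    }

    static final class NotApplicablePhoneNumber implements PhoneNumber {
        public String display() { return "(not applicable)"; }
        public boolean isPresent() { return false; }
    }

    public static void main(String[] args) {
        PhoneNumber[] numbers = {
            new RealPhoneNumber("555-0100"),
            new UnknownPhoneNumber(),
            new NotApplicablePhoneNumber()
        };
        for (PhoneNumber n : numbers) {
            System.out.println(n.display());
        }
    }
}
```

Callers never test for null; they call display() and each kind of "nothing" answers for itself.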

[argument about different treatment of nil in Lisp and Scheme moved to IsSchemeLisp]

Will it not always be the case that a separate return value is required to indicate "not found" in a hash lookup? PerlLanguage has "exists" as well as "defined". The instant one tries to allow storage of a value meaning "does not exist" (as opposed to merely "undefined") in the hash, you go back to needing another return value to distinguish the cases.

It is just the problem of QuotingMetaCharacters all over again. -- MatthewAstley
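
Java's maps have the same shape of problem: get() returns null both for a missing key and for a key whose stored value is null, so a second question (containsKey) is needed. A small sketch; MapLookupDemo and describe are names made up here.

```java
import java.util.HashMap;
import java.util.Map;

class MapLookupDemo {
    // Describes a lookup using both the value and a separate "exists" check.
    static String describe(Map<String, String> map, String key) {
        if (!map.containsKey(key)) {
            return "absent";
        }
        String value = map.get(key);
        return value == null ? "present but null" : "present: " + value;
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("a", "apple");
        map.put("b", null);   // HashMap permits null values

        // get() alone cannot tell "b" (stored null) from "c" (never stored).
        System.out.println(map.get("b") == map.get("c"));  // true
        System.out.println(describe(map, "a"));
        System.out.println(describe(map, "b"));
        System.out.println(describe(map, "c"));
    }
}
```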


Null String and Null String Pointer a problem

One of the problems with C/C++-type languages is that a null (zero-length) string and a null string pointer are handled inconsistently in libraries. Most take either as meaning "no value", but some require a zero-length string and fall over if passed a null pointer or reference.
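
Java has the same split between a null reference and a zero-length string. A common defensive habit, sketched here with made-up helper names, is to normalize at the boundary so that both mean "no value":

```java
class NullVsEmptyDemo {
    // Treats a null reference and a zero-length string alike: both mean "no value".
    static boolean hasNoValue(String s) {
        return s == null || s.isEmpty();
    }

    // Normalizes null to the empty string for callees that cannot cope with null.
    static String orEmpty(String s) {
        return s == null ? "" : s;
    }

    public static void main(String[] args) {
        String nullString = null;
        String emptyString = "";
        System.out.println(hasNoValue(nullString));        // true
        System.out.println(hasNoValue(emptyString));       // true
        System.out.println(orEmpty(nullString).length());  // 0
    }
}
```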

With the addition of Unicode or STL strings this becomes even more confusing, where there are now three possible Null values:

When calling system functions it is often necessary to convert the string to old-fashioned CharStar strings, and the zero-length string needs to be converted to one of the others, again inconsistently.
According to a recent interview with Joshua Bloch, Java 1.5's automatic unboxing will convert null to 0. Oh dear. There is some hope, though. It's not settled, and it could change to throwing an exception.

After some discussion they decided to throw an exception instead. There's a mention about it at LambdaTheUltimate.
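
The behaviour as shipped is easy to check: unboxing a null Integer throws NullPointerException. The class and method names below are made up for the demonstration.

```java
class UnboxingDemo {
    // Unboxes the argument; a null Integer has no int value to produce.
    static int unbox(Integer boxed) {
        return boxed;   // the compiler inserts boxed.intValue() here
    }

    // Returns true if unboxing the argument throws NullPointerException.
    static boolean unboxingThrows(Integer boxed) {
        try {
            unbox(boxed);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(unboxingThrows(42));    // false
        System.out.println(unboxingThrows(null));  // true
    }
}
```
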
In type theory, NIL/NULL/whathaveyou is usually handled in one of several ways:

or

Myself, I prefer the former. For one, it allows multiple NULLs with different meanings; though I'm not aware of any languages which define multiple NULLs like that. For another, many FunctionalProgrammingLanguages use the BottomType to indicate a raised exception (which I think is also braindead) or to indicate divergence (which kindasorta makes sense - "returning" a non-existent type to indicate a condition which, according to the HaltingProblem Theorem, we cannot detect...) For a third reason, it allows us to define two types of references; those which can be NULL and those which can't. (C++ has this capability; though it's easily bypassed... a functional Java variant called Nice (NiceLanguage) also provides references which are guaranteed to not be NULL). If we use BottomType for NULL, then **ALL** objects in the system might possibly be NULL (including things like ints and bools where it doesn't make sense in most cases).

-- engineer_scotty (ScottJohnson)
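
One way to approximate "two kinds of references, those which can be NULL and those which can't" in stock Java is java.util.Optional, which moves "maybe absent" into the type. This is only a sketch: findNickname and its data are invented, and Java itself does not stop a plain String (or the Optional reference) from being null.

```java
import java.util.Optional;

class OptionalDemo {
    // The return type says "there may be no result"; callers must unwrap it.
    static Optional<String> findNickname(String name) {
        if (name.equals("Alice")) {
            return Optional.of("Al");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(findNickname("Alice").orElse("(none)"));  // Al
        System.out.println(findNickname("Bob").orElse("(none)"));    // (none)
    }
}
```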

Also see NullConsideredHarmful.


What is NULL? Baby, don't hurt me...
I've found this helpful. Given the other discussions on site about Null in databases, I thought I'd contribute several sections of definitions from ISO's evolving set of Geographic Information standards.

ISO/PDTS 19103 Conceptual Schema Language (also in 19107 Spatial Schema): NULL means that the value asked for is undefined. This Technical Specification assumes that all NULL values are equivalent. If a NULL is returned when an object has been requested, this means that no object matching the criteria specified exists. EMPTY refers to objects that can be interpreted as being sets that contain no elements. Unlike programming systems that provide strongly typed aggregates, this Technical Specification uses the mathematical tautology that there is one and only one empty set. Any object representing the empty set is equivalent to any other set that does the same. Other than being empty, such sets lack any inherent information, and so a NULL value is assumed equivalent to an EMPTY set in the appropriate context.

But ISO/CD 19136 Geography Markup Language (an XML encoding) introduces a "convenience type" gml:NullType which is a union of several enumerated values: inapplicable, missing, template (i.e. value will be available later), unknown, withheld and anyURI (for which the example is the lovely but non-existent http://my.big.org/explanations/theDogAteIt). This allows much more clarity of meaning than merely allowing 'minOccurs=0'. We're investigating it for a large database model we're developing. (PeterParslow)
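
In program terms, the idea of a null that carries its reason could be sketched roughly like this. It is not the GML schema; NilReason and Field are illustrative names, and the anyURI alternative is left out.

```java
class NilReasonDemo {
    // The enumerated reasons listed above (the anyURI alternative is omitted).
    enum NilReason { INAPPLICABLE, MISSING, TEMPLATE, UNKNOWN, WITHHELD }

    // Holds either a value or the reason it is absent, never both.
    static final class Field<T> {
        private final T value;
        private final NilReason reason;

        private Field(T value, NilReason reason) {
            this.value = value;
            this.reason = reason;
        }

        static <T> Field<T> of(T value) {
            if (value == null) throw new NullPointerException("use absent(reason)");
            return new Field<>(value, null);
        }

        static <T> Field<T> absent(NilReason reason) {
            if (reason == null) throw new NullPointerException("reason required");
            return new Field<>(null, reason);
        }

        boolean isPresent() { return value != null; }

        @Override public String toString() {
            return isPresent() ? String.valueOf(value) : "nil(" + reason + ")";
        }
    }

    public static void main(String[] args) {
        Field<Double> depth = Field.of(12.5);
        Field<Double> height = Field.absent(NilReason.WITHHELD);
        System.out.println(depth);   // 12.5
        System.out.println(height);  // nil(WITHHELD)
    }
}
```
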
The problem with NULL is that it is context-sensitive and each context can change the definition. To make NULL consistent and remove context-sensitivity, you would need several types similar to NULL (uninitialized vars, pointer to nothing, empty object, empty var, etc). In math it is even more of a problem since the possible meanings of NULL change the actual result. Does A+B+Null = A+B, or undefined, or null, or does it throw an error? This is confounding for developers and language designers, almost a HolyWar (then again everything in language design is).
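
The three answers to "what is A+B+Null?" can be put side by side. SQL, for instance, uses the first for its SUM() aggregate and the second for its + operator. The method names are made up for this sketch.

```java
import java.util.Arrays;
import java.util.List;

class NullSumDemo {
    // Policy 1: ignore nulls, so A+B+Null = A+B.
    static int sumSkippingNulls(List<Integer> values) {
        int total = 0;
        for (Integer v : values) {
            if (v != null) total += v;
        }
        return total;
    }

    // Policy 2: any null operand makes the whole result null.
    static Integer sumPropagatingNulls(List<Integer> values) {
        int total = 0;
        for (Integer v : values) {
            if (v == null) return null;
            total += v;
        }
        return total;
    }

    // Policy 3: a null operand is an error.
    static int sumRejectingNulls(List<Integer> values) {
        int total = 0;
        for (Integer v : values) {
            if (v == null) throw new IllegalArgumentException("null operand");
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, null);
        System.out.println(sumSkippingNulls(values));     // 3
        System.out.println(sumPropagatingNulls(values));  // null
    }
}
```
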


Donald Rumsfeld has a profound understanding of null -

"There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know."
All of this, and nobody's mentioned NaN (in FP math)? (As of January 2007)
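
For the record, NaN behaves a lot like a propagating null: it spreads through arithmetic and is not equal to anything, itself included. A quick check in Java (NaNDemo and equalsItself are made-up names):

```java
class NaNDemo {
    // True for every double except NaN.
    static boolean equalsItself(double d) {
        return d == d;
    }

    public static void main(String[] args) {
        double nan = 0.0 / 0.0;   // floating-point division yields NaN, not an exception
        System.out.println(Double.isNaN(nan));        // true
        System.out.println(equalsItself(nan));        // false
        System.out.println(Double.isNaN(nan + 1.0));  // true: NaN propagates
    }
}
```
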
Nice, schmice. You can have non-nullable references in stock Java now, with generics:

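public final class Ref<T> {
	private T obj;	// private, so nothing outside the class can store null here

	public Ref (T obj) {
		set(obj);
	}

	// Rejects null, so get() can never return it.
	public void set (T obj) {
		if (obj == null) throw new NullPointerException();
		this.obj = obj;
	}

	public T get () {
		return obj;
	}
}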

For an immutable version, just make the set method private.

The downside is that, because of the way Java generics work, a Ref<T> may be more cumbersome to use than a T. For example, if Bar derives from Foo, you can assign a Bar to a reference of type Foo, but not a Ref<Bar> to a Ref<Foo>. So you need those variables to be of type Ref<? extends Foo>, which is SyntacticSalt at its finest. (This assignability rule exists for a simple reason: otherwise you could assign a Ref<Bar> reference to a Ref<Foo> reference, pass a Foo that is not a Bar to the set method invoked on the latter, and then pull a non-Bar out via the get method on the Ref<Bar> reference. It does mean that you can't use the set method on a Ref<? extends Foo>, although you can assign the variable a new Ref object with a new referent.)
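
The assignability rules can be seen in a few lines. Foo and Bar are the placeholder classes from the paragraph above; the Ref here is a compact copy so the sketch stands alone, and the commented-out lines are the ones the compiler rejects.

```java
class VarianceDemo {
    static class Foo { }
    static class Bar extends Foo { }

    // A compact copy of the non-null reference holder.
    static final class Ref<T> {
        private T obj;
        Ref(T obj) { set(obj); }
        void set(T obj) {
            if (obj == null) throw new NullPointerException();
            this.obj = obj;
        }
        T get() { return obj; }
    }

    public static void main(String[] args) {
        Ref<Bar> barRef = new Ref<>(new Bar());

        // Ref<Foo> fooRef = barRef;          // does not compile: a Ref<Bar> is not a Ref<Foo>
        Ref<? extends Foo> fooView = barRef;  // compiles: a Ref of "some subtype of Foo"

        Foo foo = fooView.get();              // reading as Foo is safe
        // fooView.set(new Foo());            // does not compile: the referent type is unknown

        fooView = new Ref<Foo>(new Foo());    // rebinding the variable is still allowed
        System.out.println(foo instanceof Bar);  // true
    }
}
```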


WhatIsNull you ask?

In some languages (OzLanguage being an example) one may introduce a single-assignment variable without actually assigning it. You may then utilize this unassigned variable inside a data structure. One may unify two unassigned variables - i.e. asserting they are the same, such that if one is assigned, then so will be the other. Similarly, you may assign such a variable to a structure that contains yet more unassigned variables. All this is a weakly expressive form of ConstraintProgramming. More expressive ConstraintProgramming would allow you to restrict variables as well as assign them, i.e. to say that X > Y without knowing X or Y. In a secure language, the authority to assign a variable or manipulate constraints may be separate from the ability to view the variable. OzLanguage added security via 'X=!!Y', which says that X can view Y but cannot assign Y.

I've long believed that Nulls - at least in the context of data (SqlNull) - should really be identified by this sort of 'unassigned free variable'. This would allow us to make very interesting observations, the way we do with algebraic and geometric expressions in mathematics. It would be possible, for example, to perform proper joins between tables containing these free variables. If we were using a TableBrowser, one might distribute authority to assign these 'variables', such that one could meaningfully assign to a field that was previously unknown... and update the proper record.

Of course, it would still lead to interesting issues, such as: the sum of 10+X+12 can only be reduced to 22+X, and a join between two tables limited by X>Y may return some interesting 'contingent' records.
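
A toy sketch of the 10+X+12 case: a sum that keeps unassigned variables symbolic and folds them into the constant once they are assigned. This only illustrates the idea; it is not how OzLanguage or any database does it, and Sum and its methods are invented names.

```java
import java.util.Map;
import java.util.TreeMap;

class FreeVarDemo {
    // A known constant plus unassigned variables: constant + sum(coeff * var).
    static final class Sum {
        private final int constant;
        private final TreeMap<String, Integer> coefficients = new TreeMap<>();

        private Sum(int constant) { this.constant = constant; }

        static Sum of(int constant) { return new Sum(constant); }

        // Adding a known number only changes the constant part.
        Sum plus(int n) {
            Sum r = new Sum(constant + n);
            r.coefficients.putAll(coefficients);
            return r;
        }

        // Adding an unassigned variable keeps it symbolic.
        Sum plusVar(String name) {
            Sum r = new Sum(constant);
            r.coefficients.putAll(coefficients);
            r.coefficients.merge(name, 1, Integer::sum);
            return r;
        }

        // Assigning a variable folds its contribution into the constant.
        Sum assign(String name, int value) {
            Integer coeff = coefficients.get(name);
            if (coeff == null) return this;
            Sum r = new Sum(constant + coeff * value);
            r.coefficients.putAll(coefficients);
            r.coefficients.remove(name);
            return r;
        }

        @Override public String toString() {
            StringBuilder sb = new StringBuilder(String.valueOf(constant));
            for (Map.Entry<String, Integer> e : coefficients.entrySet()) {
                sb.append("+");
                if (e.getValue() != 1) sb.append(e.getValue()).append("*");
                sb.append(e.getKey());
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) {
        Sum s = Sum.of(10).plusVar("X").plus(12);
        System.out.println(s);                 // 22+X
        System.out.println(s.assign("X", 5));  // 27
    }
}
```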


See Also: CantHideFromNulls
CategoryNull CategoryDefinition

Last edited August 14, 2010.