Global Variables Considered Harmful

W.A. Wulf, M. Shaw; Global Variables Considered Harmful, ACM SIGPLAN Notices 8:2, Feb 1973, pp. 80-86

Considered by many to be one of the classic papers of computer science. Nowadays its thesis seems obvious - global variables enjoy a reputation only slightly better than that of the infamous GoTo statement. I still use 'em occasionally, but cringe whenever I do.

However, the interesting thing about this paper is that it claims to skewer a prominent language feature embraced by many here on WardsWiki, one whose effect is similar to that of GlobalVariables: LexicalScoping, as found in languages such as PascalLanguage, AlgolLanguage, and, more importantly, LispLanguage (and to a lesser extent, SmalltalkLanguage).

In fact, the arguments in the paper are really targeted at NestedScopes; they apply equally (or more forcefully) to DynamicScoping combined with NestedScopes.

In any case, the paper gives four main arguments:

  1. Side effects. Just as with globals, functions that modify variables other than their own locals can cause surprises of all sorts; if pass-by-reference is used, aliasing can occur when it isn't expected.

  2. Indiscriminate access. The programmer cannot prevent sub-procedures from modifying the values of a procedure's local variables.

  3. Vulnerability. New declarations may be interposed between when a variable is declared in an outer scope and when it is used in an inner scope.

  4. No overlapping definitions. It is difficult to control shared access to variables.

This paper is outdated; nested lexical scoping is considered a very desirable feature these days. Here are counterarguments to each of the points above:

Side effects: Some languages have "final" or immutable variables. In that case, accessing only final variables in outer scopes (either as a coding convention, or enforced by the language) would eliminate any concerns about "surprising" side effects. Note that nested lexical scoping is also applicable to PureFunctionalLanguages, in which all variables are immutable.

The point about pass-by-reference is largely irrelevant since very few modern programming languages use pass-by-reference. (Languages that pass "references" by value are not pass-by-reference.)

[Huh? Pass-by-reference ALWAYS requires passing a reference by value. That's how it works. The question is whether the referenced object is a COPY of the caller's object, or an ALIAS for the caller's object. Most modern languages pass by reference for non-primitive types.]

Indiscriminate access: Essentially a repetition of point 1, and the same counterargument applies.

Vulnerability: "New declarations may be interposed between when a variable is declared in an outer scope and when it is used in an inner scope." So DontDoThat. This is not difficult to avoid, and it would be easy for a compiler to warn about this situation (it should not be an error).

Note that any rule other than accessing the "innermost" declaration of a variable, would change the meaning of a code fragment if it is moved to another context, and some local variable coincidentally shadows a variable in an outer scope. In any case, variables can and should be renamed in cases where shadowing makes a program too confusing.

No overlapping definitions: It is not difficult to control shared access to variables because of LexicalScoping. On the contrary, ObjectCapabilityLanguages generally support nested lexical scoping precisely because it makes it easier to avoid using global variables, and thereby makes controlling shared access to variables easier.

Many arguments against global variables that are independent of both LexicalScoping and NestedScopes are given in GlobalVariablesAreBad.

In addition (the paper doesn't discuss this), implementing LexicalScoping (with inner functions having access to outer function variables) poses lots of difficulties for the language implementor. You need a StaticChain (or a "display") to be able to access variables defined in enclosing scopes; implementing FirstClass LexicalClosures becomes a pain in the butt.

This is merely conjecture on my part, but I think one of the reasons that CeeLanguage was so successful early on is that it threw LexicalScoping into the bin. It simplified both the semantics and the implementation of the language greatly. The success of C (and C++, NoFlamesPlease?) is good evidence that LexicalScoping is not needed.

[Huh? C is lexically scoped. And lexical scope has nothing to do with "procedures being able to modify global variables". That is a totally orthogonal issue. How is it any better when a dynamically scoped language allows global variables to be altered by a procedure? In fact, it can be much worse, since the name of the variable, built into the procedure, could then enable the procedure to refer to some variable some poor sap who calls the procedure just happened to name the same way]

Yes, C is lexically scoped. The above comments were made before this page was changed to distinguish between LexicalScoping and NestedScopes. It is LexicalScoping combined with NestedScopes that causes the implementation difficulty.

In fact, DynamicScoping combined with NestedScopes is even more difficult. HenryBaker wrote a paper on this, which is available on-line. Despite the title "The Buried Binding and Dead Binding Problems of Lisp 1.5", the problems it describes apply to DynamicScoping in general.

<The DynamicScoping/LexicalScoping thing doesn't apply to C/C++, as you cannot have functions within functions in either language. However, you can have classes within functions (and member functions within those classes) in C++; a member function has access to the attributes of the defining class, but not to any other outer scope. A class defined in another class has no access to the members of the outer class, either - except through a reference to the outer class. Java InnerClasses do bring back some elements of LexicalScoping - an inner class defined in a function is similar to a LexicalClosure in some ways.>

Class-like variables can be used in C by declaring them static at file scope within a .c file. These variables will be visible to all functions within the file, but hidden from functions outside the file.

[This misunderstands the point. Here's an example of C extended to have nested lexical scoping:]

  /* Hypothetical extension: nested functions are not valid standard C. */
  #include <stdio.h>

  typedef void (*FuncPtr)(void);

  /* f returns pointer to function */
  FuncPtr f(void) {
      int i = 0;
      /* here is a nested lexically scoped func that accesses i */
      void g(void) { printf("%d\n", i++); }
      g();         /* ==> 0 */
      g();         /* ==> 1 */
      return g;    /* return ptr to nested function g from f */
  }

  int main(void) {
      FuncPtr h;
      h = f();     /* prints 0, then 1 */
      (*h)();      /* indirect call to g(), prints 2 */
      return 0;
  }

This indicates the difficulty. When g() is called via the FuncPtr h in main(), g() still has valid access to the stack variable i in f(), which means that the C compiler is not allowed to throw away the stack frame(s) created by f() when f() returns. Also, if f() is recursive, the i that g() references must be the one in the most recent stack frame created by f().

This is full "nested lexical scoping". As someone said above, it requires a StaticChain/display to achieve this effect, and it has to do fancy stuff with tracking stack frames, etc. It's all hugely complicated compared with C semantics... and all of the obvious implementations can require an arbitrary amount of computation, in the worst case, just to access the value of "i".

<The StaticChain/display is required for any sort of nested lexical scoping, even in the absence of FirstClass LexicalClosures.>

As they said, it is indeed arguable that leaving out this crud helped with C's success.

Furthermore, although Lisp's "modern" lexical scoping is preferable to the older dynamic scoping, and although it is sort of necessary just to be able to create local variables with e.g. the LET macro, at least in the paradigm the Lisp world is accustomed to, there is indeed a strong argument that it is nonetheless evil in the absolute.

<Macros, as opposed to functions, get LexicalScoping for free - after all, when a macro is expanded it doesn't create a new scope (unless the macro definition contains code to cause a new scope to be created). Of course, macros are not first-class entities in any language that I'm aware of.>

<That's what I meant; I couldn't think of a better way to say it. It might be possible to UnifyMacrosAndFunctions? at some point - LispLanguage comes close in that it can read in new code on the fly (and process macros invoked therein). As discussed in CeePreprocessorStatements; there are things that can be done only with macros (and things that can only be done with functions, at least in current languages).>

And from the point of view of the innermost function, it is all about a variable that is in fact "global" from that nested function's point of view.

-- DougMerritt

[No, macros are dynamic scoping, and C/C++ (as opposed to C/C++ considering macros as part of the language) are lexically scoped. Lexical/dynamic scope have nothing to do with whether or not nested scopes are allowed. C is simpler to implement (and in many cases simpler to understand) than the Algol family of languages because it does not have nested scopes, not because it does not have lexical scope. When you define a global variable in a file with a function, that one level of scope allowed is lexical. The function, when called, always refers to that variable which is in its own lexical scope. However, when a macro is expanded, if it has a free variable in its definition, that free variable will end up referring to the free variable in its context of expansion. So, macros are scoped dynamically, not lexically]

[[More precisely, macros are ShallowBinding dynamic scoping. They look up variables in the scope of their expansion, but they don't trace unbound variables down the dynamic call stack. DynamicScoping, as used in early Lisp dialects and Elisp, will continue searching the call stack until it finds the appropriate variable. This is DeepBinding]]

<You're correct, of course - with the possible exception of HygienicMacros. However, I wasn't considering those>

<Perhaps we need a page for NestedScopes?>

Perhaps, but now I'm wondering if there's an authoritative source for getting the terminology right, because after all, you can have scoping based on lexical level as well as NestedScopes but still not need chains/displays; many of these issues are more often than not all crammed together into a single term, like LexicalScoping has done.

See also GlobalVariablesAreBad, GlobalConstantsConsideredHarmful

CategoryPaper ConsideredHarmful

Last edited July 10, 2010.