Helping The Compiler Is Evil

As a language designer, you should know that every language feature you introduce to help yourself (under the rubric of HelpingTheCompiler) comes at the expense of your users (programmers). There are a whole bunch of language features specifically introduced to help compilers, and even more whose purpose turns out in hindsight to have been helping the compiler. Here are some:

 10 GOTO 10
 Map<Int,Str> m = new HashMap<Int,Str>

Referential transparency is a good example. No end user expects the same function call, made at different times, to produce the same result. That's because the world is pervasively dynamic, not static. What users expect are concepts to help them deal with that dynamism. What does referential transparency do instead? It merely ignores the world's dynamism so the compiler can optimize the computation.

Do monads help users manage the world's dynamism? I doubt it. So what does? Well, pure OO for one.

The claims above are [untrue]. None of the features vilified above is targeted at helping the compiler; they are targeted at helping their users. Indeed, for the compiler writer, the fewer constraints the language has, the easier the task. Every student is now expected to be able to write an interpreter for a Lisp-like language by the end of the first year in university, while asking the same student to write one for a language like HaskellLanguage or MlLanguage is entirely unreasonable, on account of their type systems.

The same goes for ReferentialTransparency: it is a help for users. Even for impure languages it is important to decouple two distinct concepts: identifier binding and variable assignment. In FunctionalProgramming languages, names are by default identifiers, and the values bound to them are by default mathematical values, unless one explicitly makes them von Neumann-style values, i.e. references to locations in memory that can be mutated. This, of course, is designed to help users meet their ProofObligation. Decades of practice in functional programming tend to show that in the absence of side-effects, or with limited side-effects, proofs are easier, and consequently functional programs have fewer bugs.
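
The distinction between binding a name to a value and assigning to a mutable location can be sketched in Python, with the caveat that Python lets any name be rebound, so this only approximates what ML-family languages enforce. Point and Ref are hypothetical names invented for this illustration; Ref loosely imitates an ML-style reference cell.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Point:
    """An immutable value: once constructed, its fields never change."""
    x: int
    y: int


class Ref:
    """Hypothetical reference cell: an explicit, mutable location."""

    def __init__(self, value):
        self.value = value


origin = Point(0, 0)      # binding: 'origin' names a value
try:
    origin.x = 1          # trying to mutate a value is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False

counter = Ref(0)          # a mutable location, created on purpose
counter.value += 1        # assignment: the location now holds 1
```

Mutation is still available, but it has to be asked for by name, which is roughly the default the comment above describes.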

I think this comment misunderstands what "Helping the Compiler" means. The trivial case of, say, Lisp is easy. Making an optimizing, analyzing and generally "good" Lisp compiler is extremely difficult. This is because the compiler has no "help" in figuring things out. In fact, it's so hard that in order to get good performance, Common Lisp defines "declare and declaim" statements explicitly to help the compiler do its job.

While the grammar of, say, C++ is much more complex, making an optimizing compiler is much easier because the language is embedded with all kinds of extra information that the compiler can use to make the fastest code representations. Likewise with ML family languages, the programmer works in a certain fashion and promises to maintain certain coding styles and is held to certain contracts so that the compiler can infer on its own where to make optimizations.

But the thing is that an optimizing compiler was not the design goal of RobinMilner when he designed MlLanguage. Nor was it the reason behind Haskell. The fact that ML may be an easier language to optimize than Lisp is a good side-effect of StaticTyping design. And these days when even a Python interpreter may pass as good enough the argument is moot. The biggest issue that affects the optimization is whether all the functions are by default polymorphic, and how much type safety one decides to include at run-time. Another item on the list was ReferentialTransparency, and in this case the design goal behind it was not to help optimizing compilers.

I concur. Some of the items on that list are more a flavor issue than any real implementation issue. StaticTyping is one of those "Help The Compiler" things, though.

For somebody accustomed to the ways of Lisp, Smalltalk or other dynamically typed languages, StaticTyping may be annoying and thus may be regarded as a way to help the compiler validate his "obviously correct" code. For people with a different background (TypefulProgramming), types are a tool that helps the user. In the words of John Reynolds, types are tools for abstraction, and syntactic tools at that. Like any high intellectual activity, programming requires a minimum of discipline, and thinking through types is one of those disciplines that can help those who master it to write better programs with fewer bugs. Other approaches require less discipline. If you regard the types as a necessary intellectual discipline needed to help you program better, there's a world of difference. Some people don't need that discipline and claim they write good programs; nevertheless, other people claim that the discipline helps them.

Remember that GrandMasterProgrammers like EwDijkstra, NiklausWirth and DonKnuth had absolutely no problem not only with working with StaticTyping, but also with declaring all the variables at the beginning of a block. That would seem horrendous to the sloppy Java (and Python and Perl, etc.) programmers of today, who are so used to declaring variables at the point where they are used. Yet it seems that Wirth would not, even after decades, give up the idea of mandating variable declarations first, and incidentally even the ML family doesn't seem to budge towards encouraging the sloppiness of the programmer. Is it a necessary discipline or is it a gratuitous help to the compiler? It's hard to tell, but judging by the quality of some programs written by the above-mentioned grand masters, vis-a-vis the sloppiness of most Java programs, the results should speak for themselves. Another parallel would be the discipline of classical arts versus the sloppiness of contemporary "arts". A colleague of mine has a very funny opinion on that: after the 1800s, they stopped making music and started making noise.

Flexibility is generally a good thing. If you feel you need the discipline given by declaring all your variables at the start of a block, nothing stops you from doing so. Languages are tools, like pencil and paper, and pencil and paper don't force you to write sonnets. Meanwhile, some people write far better free verse than they ever could do sonnets; the freedom of form allows them to worry about more important things. Making things difficult is no way to enforce quality.

Yeah, but freedom without discipline is just a kind of noise.

If someone needs to be "forced to think", they might be in the wrong profession. The places where BondageAndDiscipline on the part of the compiler can be helpful are those where a little bit of redundancy can be introduced, just enough so that the programmer's intent is verified. One good example is the existence of variable declarations. Even in languages where the programmer need not specify anything in a variable declaration (other than that the variable exists), they are a GoodThing, as they dramatically reduce the incidence of typos becoming new (and unplanned-for) variables rather than harmless compiler errors. Typos accepted by the compiler can still occur (typing 1 when I mean i is my favorite), but their incidence is reduced. But where the declaration goes strikes me as immaterial.
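
A small Python sketch of the typo problem, assuming we treat __slots__ as a stand-in for a declaration that says nothing beyond "this name exists". Loose and Declared are hypothetical classes made up for the illustration, and the check here happens at run time rather than compile time.

```python
class Loose:
    """No declarations: any attribute name is accepted on assignment."""


class Declared:
    """__slots__ lists the only attribute names instances may have."""
    __slots__ = ("total",)

    def __init__(self):
        self.total = 0


loose = Loose()
loose.total = 0
loose.totl = 5            # typo: silently creates a new attribute

declared = Declared()
try:
    declared.totl = 5     # same typo: rejected, 'totl' was never declared
    typo_accepted = True
except AttributeError:
    typo_accepted = False
```

The declaration carries no type at all, yet it is enough redundancy to turn the typo into an error instead of a second variable.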

If we had to be forced to think, then we all should be given 300 baud terminals. :)

Whether or not these features are at the expense of "users" depends heavily on who your users are. A mathematician very much expects the same function call applied at different times to result in the same thing, because that's how the objects they deal with every day behave. That is perhaps why Haskell, oriented towards an academic audience, goes to such lengths to maintain it.

That all depends on the function call, doesn't it? Everyone would expect Math.Sum(4,4,4) to have the same result each time, but how about anObject.PrintTo(aStream)? I think not; it depends entirely on the current state of the object. You are both correct, but talking past each other. Users actually expect both ReferentialTransparency and its opposite, depending on context, just like everything else. People always take context into account; if only computers worked as well.
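
Both expectations can be put side by side in a few lines of Python. This is only a sketch: total stands in for Math.Sum, and Counter with its print_to method is a hypothetical object standing in for anObject.

```python
import io


def total(*numbers):
    """Pure function: the result depends only on the arguments."""
    return sum(numbers)


class Counter:
    """Hypothetical stateful object: what it prints depends on its state."""

    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1

    def print_to(self, stream):
        stream.write(f"count={self.count}\n")


first = total(4, 4, 4)
second = total(4, 4, 4)   # same call, same result, whenever it is made

out = io.StringIO()
counter = Counter()
counter.print_to(out)     # writes "count=0"
counter.increment()
counter.print_to(out)     # same call, different output: writes "count=1"
```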

Sum is a function, PrintTo is a procedure. It's perfectly reasonable for those to be different types. Haskell deals with it by making actions first-class objects. A mathematician expects the same function applied to the same argument to always give the same result, but obviously the result of an action can vary.

On classes vs prototypes, I'd mostly agree with you, but most class-oriented OO languages are oriented towards programmers used to class-oriented languages. For most people, it's more important to go with what they're familiar with than the "purer" concept. -- JonathanTang

Are you sure it isn't simply that the language designers in question are more familiar with classes than prototypes? Programming is all too often characterized as a handicraft with most of its practitioners ignorant of existing research and established results. Hell, the idea that there is research in "computer science" doesn't even enter most programmers' minds, would astonish them even. -- RK

The one PrototypeBasedLanguage that is mainstream - JavaScript - doesn't have a very good reputation, it seems. Much of this is the fault of many incompatible implementations (a fallout of the BrowserWars) as well as the use of JavaScript to spread viruses, SpyWare, pop-up ads, and other nastiness to users running WindowsWhatever. OTOH, JavaScript certainly has its ugly bits, even divorced from these issues. [I'm curious what you think those ugly bits are... I think JavaScript is a fantastic little language; you can't blame a language because people abuse it to send spam.]

At any rate, designing PrototypeBasedLanguages isn't an area demanding lots of research (other than efficient implementation, perhaps). The concept of prototypes is very straightforward... -- ScottJohnson

Yup. And it's also very straightforward that they're superior to classes as they allow the easy creation of all kinds of proxies without resorting to the ugly doesNotUnderstand hack. I've argued this on several pages, not least ClassesPrototypesComparison [...].

[minor flame war between RK and Costin deleted]

Certainly, ReferentialTransparency is useful in many cases - especially, as Jonathan points out, when dealing with math and the like. However, mandating it everywhere is in my view a bit obnoxious. I find it amusing the contortions used by programmers in pure FunctionalProgrammingLanguages in order to emulate state (monads, threading, LinearTypes, etc.) while studiously avoiding anything that might resemble a variable.

The excellent book ConceptsTechniquesAndModelsOfComputerProgramming (published this year and already a classic IMHO [seconded]) contains an excellent treatise on the benefits and drawbacks of mutable state. In particular, they give examples of several data structures for which a purely-functional implementation requires considerably more time to perform operations (greater asymptotic complexity, or a larger BigOh if you will) than stateful equivalents.

Monads can actually hide a lot of different things: state, non-determinism, fallibility, I/O. I can even use a monad to multiplex I/O using select() and make it look like multithreaded code. This is no contortion, just a very convenient mechanism. The state hidden in the IO monad is not emulated either; I can have ordinary variables and references to them. By the way, the "considerably more time" is just a logarithmic factor in reality.
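
To give a feel for how a monad can hide fallibility, here is a rough Python imitation of the pattern, not Haskell's actual Maybe type. It assumes None plays the role of "no result", and bind, parse_int, reciprocal and parse_then_invert are hypothetical helpers written for this sketch.

```python
def bind(value, func):
    """Apply func unless an earlier step already failed (None)."""
    return None if value is None else func(value)


def parse_int(text):
    """Return the integer in text, or None if it does not parse."""
    try:
        return int(text)
    except ValueError:
        return None


def reciprocal(n):
    """Return 1/n, or None when n is zero."""
    return None if n == 0 else 1 / n


def parse_then_invert(text):
    # The failure checks live inside bind, not in the pipeline itself.
    return bind(bind(text, parse_int), reciprocal)


ok = parse_then_invert("4")           # 0.25
bad_number = parse_then_invert("x")   # None: parsing failed
div_by_zero = parse_then_invert("0")  # None: reciprocal failed
```

The pipeline reads as straight-line code while the error plumbing stays in one place, which is the convenience being claimed above.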

[Haskell has a didactic problem, in that people think they need to learn monads to do I/O. No. You need to learn that Haskell has first-class actions, which are constructed by composing smaller actions, all the way up to Main.main, which is the action of running the program. If I was teaching Haskell, I'd introduce monads afterwards: "Oh, btw, IO is a monad, and do-notation works for all monads." And if I was inventing Haskell, I'd say "Bindable" or "CanFail" instead of "Monad".]
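
The notion of first-class actions composed from smaller ones can be imitated in Python with zero-argument callables. This is a loose analogy under that assumption, not how Haskell's IO works internally; say and sequence are hypothetical helpers invented here.

```python
def say(text, log):
    """Build an action that records text when run. Building runs nothing."""
    return lambda: log.append(text)


def sequence(*actions):
    """Compose actions into one larger action that runs them in order."""
    def run():
        for action in actions:
            action()
    return run


log = []
# 'main' is itself just an action, assembled from smaller ones.
main = sequence(say("hello", log),
                sequence(say("big", log), say("world", log)))

before = list(log)   # still empty: composing actions has no effect
main()               # running the top-level action performs the effects
```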

OTOH, StaticTyping has many uses besides helping the compiler and providing optimization opportunities. It can serve as documentation, as well as aid in demonstrating correctness. The obnoxious thing is languages which require the user to specify a type for everything. I see nothing wrong with a) providing DynamicTyping as a default; b) allowing programmers to specify types if they want, and c) allowing the compiler to perform TypeInference where appropriate. The best of all worlds!
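
Python's optional annotations are one existing take on points a) and b), and external checkers such as mypy cover part of c). A minimal sketch, with greet and add as made-up functions:

```python
def greet(name):
    # No annotations: nothing is checked until the code runs.
    return "Hello, " + name


def add(a: int, b: int) -> int:
    # Annotations document intent; the interpreter does not enforce them.
    return a + b


total = add(2, 3)        # a checker can infer that 'total' is an int
joined = add("a", "b")   # runs anyway at run time; a checker would flag it
```

Whether the last line counts as a feature or a hazard is more or less the disagreement running through this page.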

Totally agreed. Does that mean we should all use VisualBasic? ;) Just kidding.

Help the compiler in what way? While none of them help the compiler by simplifying the compiler writer's task (each feature in a language is additional work for the designer), some do "help" by providing optimization opportunities. StaticTyping especially; CeePlusPlus is a good example. Many of the optimizations done by a C++ compiler are precisely what makes the language brittle when used in an environment where components of a program are deployed piecemeal; see FragileBinaryInterfaceProblem for that discussion. Prototypes are arguably easier to implement than classes; it's (slightly) easier to dispatch a method call through a local pointer stored directly in the object than through a VeeTable. Add in the pointer-munging that C++ has to do to implement MultipleInheritance, and it gets truly mind-bending.

Prototypes are not easier than classes since they do not by themselves replace classes. You have to implement maps to get the space savings of classes. And of course, you have to add multiple inheritance because there's nothing stopping someone from creating multiple parent slots in Self ....
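
For readers who have not seen prototype dispatch, here is a toy Python sketch with a single parent slot. Proto, lookup and send are hypothetical names; this is not how Self or JavaScript implement it, and it has none of the maps mentioned above, so every object pays for its own slot dictionary.

```python
class Proto:
    """Toy prototype object: its own slots plus an optional parent."""

    def __init__(self, parent=None, **slots):
        self.parent = parent
        self.slots = dict(slots)

    def lookup(self, name):
        """Search this object's slots, then walk up the parent chain."""
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.parent
        raise AttributeError(name)

    def send(self, name, *args):
        """Dispatch: find the slot, then call it with the receiver."""
        return self.lookup(name)(self, *args)


animal = Proto(
    name="an animal",
    describe=lambda self: "I am " + self.lookup("name"),
)
dog = Proto(parent=animal, name="a dog")   # overrides only 'name'

animal_says = animal.send("describe")
dog_says = dog.send("describe")            # method found on the parent
```

Supporting several parent slots would mean replacing the single parent pointer with an ordered search over parents, which is where the multiple-inheritance complications come back in.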

Though not as rarefied as some of the stuff discussed above, I ran into my first "helping the compiler" problem with the IBM and MS Assemblers in the early '80s. I was told it was "necessary" for the correct assembly of the opcodes. Silly me, I actually bought that - until I ran into the EricIsaacson A86 assembler. Eric actually worked for Intel designing their assembler, and knows the instruction set and the processor better than anyone (aside from the EEs that crafted the silicon itself).

Eric dispensed with all the "helping" crap and allowed the programmer to just write code. I was immediately more productive, and my stuff just about always ran "right out of the box" without the hair-tearing debug sessions I'd had with the IBM/MS version.

The only requirement of the original A86 was that you had to have the label "main:" somewhere in the code. Oh, and the code had to be syntactically correct, but if I have to add that disclaimer [too late] then we're missing the point.

He made AssemblyLanguage fun for me again. -- GarryHamilton

Last edited May 28, 2012.