Premature Optimization

Premature optimization is the root of all evil -- DonaldKnuth

In DonaldKnuth's paper "StructuredProgrammingWithGoToStatements", he wrote: "Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."
However, PrematureOptimization can be defined (in less loaded terms) as optimizing before we know that we need to.

Optimizing up front is often regarded as breaking YouArentGonnaNeedIt (YAGNI). But by the time we decide that we need to optimize, we might be too close to UniformlySlowCode to OptimizeLater. We can use PrematureOptimization as a RiskMitigation strategy to push back the point of UniformlySlowCode, and lower our exposure to the risk of UniformlySlowCode preventing us from reaching our performance target with OptimizeLater.

For those who don't work to strict memory or CPU cycle limits, PrematureOptimization is an AntiPattern, since there is only cost and no benefit. For those who do, it is often confused with poor coding, or with misguided attempts at writing optimal code.

A common misconception is that optimized code is necessarily more complicated, and that therefore optimization always represents a trade-off. However, in practice, better factored code often runs faster and uses less memory as well. In this regard, optimization is closely related to refactoring, since in both cases we are paying into the code so that we may draw back out again later if we need to. We don't (yet) have PrematureRefactoring? regarded as CategoryEvil.

Another common misconception is that any level of execution speed, or resource usage, can be achieved once the code is complete. There are both practical and physical limits given any target platform. PrematureOptimization is not a solution to this, but it can help us DesignForPerformance. When working in an environment where resources are less limited, this is unlikely to be a problem.

The opposite of LazyOptimization. -- DavidMitchell

See: OptimizeLater, RulesOfOptimization, PrematureAnything, ProfileFirst, ProfileBeforeOptimizing, OptimizingTheIdleLoop
ThreadMode discussion:

Optimization practices allow you to produce a better product. Customers like that. -- GeraldLindsly

Premature 'Operations Research' is almost impossible to do wrong. -- gl

It happens once in a while though... Consider Hiroshima and Nagasaki. -- UN

Holy Mackeral, there, Saphire! A new Wiki record! GodwinsLaw invoked on the third sentence! [Well, Nazis aren't mentioned specifically, but bringing US nukage into the discussion has the same effect, eh?]

Sure, a little extreme, but isn't that a clear example of "PrematureSomethingOrOther?" Apologies for jumping on CarHoare's headline there. Thought it would get some attention though. Thx. -- gl

I largely agree with ["premature optimization is the root of all evil"], but I have to wonder about some things... how would you feel about:

"Let's just make the app run as a simple single-threaded CGI executable. We'll get the whole thing finished and then it will be easy to refactor it into a J2EE server later on."

Some decisions have to be made early, or not at all. -- UnknownAuthor
Belated pessimization is the leaf of no good. -- LenLattanzi
I actually had a huge problem with premature optimization until I saw the quote above, and realized it was true. Computer games have always traditionally been heavily optimized by hand, and I used to optimize the most trivial pieces of code. I still shudder at string classes that copy the string data around.

It's a bad idea because of course refactoring and maintaining optimized code is very difficult. Pointer caching (e.g. using *p++ in a for() loop instead of p[i]) makes the code less obvious, and changing the iteration becomes error-prone. Removing redundancy (e.g. if p == q + 50 is always true, eliminate q and replace with p - 50) suffers from similar problems, since redundancy in the first place exists to match the programmer's model.

One might say that code optimization is a form of refactoring in its own right, but which moves you away from what the language wants you to write and towards what the machine wants you to write. -- EddieEdwards
The pointer examples is especially interesting, because a compiler can perform them automatically over short regions of code! Furthermore, manual pointer management is much harder for compilers to understand.... If you have to optimize, try to avoid things that can be automated! -- UnknownAuthor
Also note that in C++ STL, using *p and ++p is the recommended approach. Also, in a C/C++ for() loop you would rarely, if ever, have need for *p++; you would place ++p in the for() loop "arguments" list [I don't know the correct name for the stuff that goes in the parentheses] and the *p access would be in the body. -- WayneMack
Actually, this is not that unique to game programming. Anyone who has ever had to develop a product that meets the capabilities and performances shown in a RAD prototype has experienced this problem. You can do almost anything with demo software, but making it actually work is different. -- WayneMack
A former coworker of mine (who will remain nameless) once said, "Performance anxiety leads to premature optimization." How true. -- ApoorvaMuralidhara
The problem with premature optimization is that you never know in advance where the bottlenecks will be. Also, you make the design and the code very hard to modify later, when requirements change. Yes, requirements DO change. That's called maintenance and even for systems that are still in analysis, design and construction, that holds.

You can optimize a design in order to find the most fast way to solve a problem, only to find out later that your optimization doesn't allow the system to do what the customer wants it to do.

This can also be expressed as:
  1. Make it work.
  2. Make it right.
  3. Make it fast.
This was expressed by one of the original XPers, AFAIK. (Yes; see MakeItWorkMakeItRightMakeItFast -- ed.) The only problem is that UML painters and code crunchers usually apply it in every little step, meaning there is premature optimization all over the project.

The following set of heuristics is a correction:
  1. Make it work.
  2. Make it right (the code is readable [uses IntentionRevealingNames] and every idea is expressed OnceAndOnlyOnce).
  3. Make everything work.
  4. Make everything right.
  5. Use the system and find performance bottlenecks.
  6. Use a profiler in those bottlenecks to determine what needs to be optimized. See ProfileBeforeOptimizing.
  7. Make it fast. You maintained unit tests, right? Then you can refactor the code mercilessly in order to improve the performance.
-- GuillermoSchwarz

I'm not sure you can't know where your bottlenecks are going to be. Don't you guys do order estimates before you code? -- Anonymous

As in BigOh order? You could, but those are sensitive to data set size and other factors that you might not know ahead of time, or that could change over time during development. A team could expend much effort coding an nlogn solution in the belief that the customer's assertion that the data is enormous or the performance factor is critical, only to find out in acceptance tests that the customer's idea of 'enormous' is laughably small. If it does turn out to be a suboptimal algorithm, BigOh can tell you where you are and what your choices for better solutions might be. Better to get a correct solution as outlined above, and then when performance bottlenecks are found, use BigOh to determine if the bottleneck is genuinely caused by that order n^2 code or by the fact that the data is being read over NFS instead of off a local disk. -- StevenNewton
This was originally attributed to "CarHoare" above, but tracking down the origin of this quote I found that it was actually Knuth who said it first. Although Knuth did calls it "Hoare's dictum" 15 years later, Hoare himself disclaimed it. See http://shreevatsa.wordpress.com/2008/05/16/premature-optimization-is-the-root-of-all-evil/ I think we can assume, with fair certainty, that Knuth was the first to use the phrase, and that it passed into folkore subsequently, leading to the wrong people being credited. Does anyone have any further information? -- ShreevatsaR
Absolute statements are the root of all evil. If you optimize too much upfront, you may have wasted time on something that was unnecessary. If you don't optimize upfront, you may waste time later reorganizing a lot of code or the customer may suffer in silence (realizing it's often too late to complain once the produce is "finished".)

The same applies to design in general. If you design too much upfront, you may have wasted time abstracting cases that will never be used. If you don't design upfront, you may waste time later reorganizing...

The key is balance. There are no answers, only questions.
We all become what we do. Do junk and you will only be a good junk producer. Do marvels and you will become better and better at it.

It takes quite a while to learn how to do well. But it pays also.

If you have enough grey matter, that's a no-brainer: do it as well as possible. <-- oxymoron?
After a month in production, I recently finished un-breaking a process I rewrote from scratch for my company. Despite painstaking efforts to replicate the production environment elsewhere, we couldn't reproduce it in any environment but production. At wits' end, I finally got approval to run production data through the system on my development workstation. Lo and behold, the same symptoms manifested plain as day. I profiled the application and traced the CPU load to an XSL transformation.

As it turns out, early on in the design I had thought it would be a good idea to execute XSL transforms in parallel, because these were CPU bound and we had 4 cores in production. So I had thrown the transformation into a thread pool. Unfortunately, the specific configuration of data in production somehow caused the thread pool to run out of control, overloading the CPU and causing timeouts leading to data loss. I dialed the thread pool down to a single thread, and everything just worked.

I now have a post-it note on my monitor that says, in all caps, OptimizeLater!
See also RulesOfOptimization
CategoryOptimization, CategoryEvil

EditText of this page (last edited January 29, 2014) or FindPage with title or text search