Cache Calculations

A more general form of ConstantFolding? and LoopInvariantAnalysis.

Many calculations in a system are performed multiple times on the same sets of data. Consider a function:

	f(x,y,z,...)

This is a "pure" function - i.e. it has no inputs other than the parameters and no outputs other than the result.

If two calls are made to f() and it can be shown that the parameters are the same, the value of f() can be cached the first time, and read from cache the second time.
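A minimal sketch of this idea in Python (the names memoize and f are illustrative, not part of the text above): results are stored in a dictionary keyed on the argument tuple, so a repeated call with equal parameters is a lookup rather than a recomputation.

```python
def memoize(f):
    cache = {}
    def wrapper(*args):
        if args not in cache:        # first call with these arguments:
            cache[args] = f(*args)   # compute and store the result
        return cache[args]           # later calls: read from the cache
    return wrapper

@memoize
def f(x, y, z):
    return x * y + z                 # stands in for any expensive pure function
```

This only works because f() is pure; caching a function with hidden inputs or side effects would return stale or wrong answers.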

Classes can sometimes be defined to do just this job, and many such classes show up in real life. When the function is a constructor, the FlyweightPattern can be used to CacheCalculations. If the parameters are few, discrete, and bounded, then an array can be used to cache the results (e.g. a sine table or a LightMap). CacheCalculations is an extremely general optimization.
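The sine table mentioned above can be sketched as follows (a hypothetical example, not from the text): the parameter is discrete and bounded — an integer number of degrees — so a plain list serves as the cache.

```python
import math

# Precompute sin() for every integer degree 0..359.
SINE_TABLE = [math.sin(math.radians(d)) for d in range(360)]

def sin_deg(d):
    return SINE_TABLE[d % 360]   # table lookup instead of recomputation
```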

It makes sense to try to FactorCache from the calculating code: the calculating code can be understood without the cache, and the cache without the calculating code. Sometimes a recursive algorithm, such as a naive implementation of a routine to find the longest common subsequence of two strings, can be made dramatically more efficient by adding a cache. Instead of recalculating the same subresults exponentially many times over, each result is calculated once and reused. This is what lies behind the technique of DynamicProgramming.
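As a sketch of that example (assuming Python's functools.lru_cache as the factored-out cache): the naive recursion takes exponential time because the same (i, j) subproblems recur, while caching them bounds the work by the number of distinct subproblems, O(len(a) * len(b)).

```python
from functools import lru_cache

def lcs_length(a, b):
    @lru_cache(maxsize=None)          # the cache, factored out of the algorithm
    def rec(i, j):
        if i == len(a) or j == len(b):
            return 0
        if a[i] == b[j]:
            return 1 + rec(i + 1, j + 1)
        return max(rec(i + 1, j), rec(i, j + 1))
    return rec(0, 0)
```

Note that the recursive logic is unchanged; deleting the lru_cache line gives back the naive (correct but exponential) algorithm.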


You get this for free in a language with LazyEvaluation, where each deferred expression is evaluated at most once and its value remembered - though strictly this caches individual expressions, not every call of f() with equal arguments.

In a language without LazyEvaluation, you can choose the caching strategy.
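For instance (a sketch using Python's standard library), the strategy might be a bounded least-recently-used cache rather than an unbounded one, trading some repeat work for bounded memory:

```python
from functools import lru_cache

# Keep at most 128 results; least-recently-used entries are evicted first.
@lru_cache(maxsize=128)
def expensive(n):
    return sum(i * i for i in range(n))   # stands in for any costly pure call
```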

In languages like C++ you either have to rewrite the caching class for each function whose results are cached, or you can use templates and write it once, provided you are prepared to pass the arguments in as a single object.

This is a special case of a more general approach: DontRepeatExpensiveOperations?. The expensive operation may be a calculation or may involve accessing a database or some other non-calculational activity. This is by far the most useful OptimizationPattern I've found, and it's the only approach that ever provides order-of-magnitude performance improvements.


This is a critical optimization in chess programs (such as DeepBlue) and other state-space searchers. A chess program's TranspositionTable? stores results of leaf node evaluations and possibly values of internal sub-trees of the search tree, typically in a HashTable indexed on a hash of the current position (using an incrementally updated Zobrist hash). This can allow whole sub-trees to be skipped, often adding a ply or two of search depth in the middlegame and doubling the search depth in simple endgames, where transpositions are more likely.
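A sketch of the Zobrist hashing idea (the piece/square encoding here is illustrative): one random 64-bit key is assigned to each (piece, square) pair, and a position's hash is the XOR of the keys for its occupied squares. Because XOR is its own inverse, moving a piece updates the hash with two XORs instead of a full recomputation.

```python
import random

random.seed(0)                        # fixed seed so the keys are reproducible
PIECES, SQUARES = 12, 64              # 6 piece types x 2 colours, 64 squares
ZOBRIST = [[random.getrandbits(64) for _ in range(SQUARES)]
           for _ in range(PIECES)]

def full_hash(position):
    """position maps square index -> piece index."""
    h = 0
    for square, piece in position.items():
        h ^= ZOBRIST[piece][square]
    return h

def move_hash(h, piece, from_sq, to_sq):
    # XOR out the key for the old square, XOR in the key for the new one.
    return h ^ ZOBRIST[piece][from_sq] ^ ZOBRIST[piece][to_sq]
```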

However, one needs to take care that the cached value of a complex calculation doesn't become stale, and really is applicable to the current situation. For instance, a cached position value in a chess program usually also stores the "draft" or maximum depth for which the value was calculated, as well as flags for whether the value is exact or only an upper or lower bound.
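A sketch of such a guarded probe (entry layout and names are assumptions, though the shape is typical of alpha-beta searchers): an entry is reused only if it was searched at least as deeply as now required, and only if its bound type actually answers the current query.

```python
EXACT, LOWER, UPPER = 0, 1, 2

table = {}   # position hash -> (value, draft, flag)

def probe(h, depth, alpha, beta):
    entry = table.get(h)
    if entry is None:
        return None
    value, draft, flag = entry
    if draft < depth:                      # searched too shallowly: stale for us
        return None
    if flag == EXACT:
        return value
    if flag == LOWER and value >= beta:    # a lower bound suffices for a cutoff
        return value
    if flag == UPPER and value <= alpha:   # an upper bound suffices for a cutoff
        return value
    return None                            # bound doesn't settle this query
```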


See also PartialEvaluation, MemoizationStrategy

but not to be confused with CacheAccessCalculations?

CategoryOptimization CategoryFunctionalProgramming

EditText of this page (last edited November 20, 2004) or FindPage with title or text search