The following pattern language summarizes the use of temporary variables in Smalltalk. I ran across this language while writing the Smalltalk Best Practice Patterns. It seems kind of low-level to me sometimes, but I was interested in how the patterns work together and how a seemingly complex idea like how to use and name temps really boils down to just a few patterns.
Temporary variables let you store and reuse the value of expressions. They can be used to improve the performance or readability of methods. The following patterns motivate and guide the use of temporary variables. The examples are taken from Smalltalk, but the discussion applies to any language with procedure-scoped variables.
1. Temporary Variable
2. Collecting Temporary Variable
3. Caching Temporary Variable
4. Explaining Temporary Variable
5. Reusing Temporary Variable
6. Role Suggesting Temporary Variable Name
??? What are the patterns that tell you to write a method???
How do you save the value of an expression for later use within a method?
A state-less language like FP contains no notion whatever of a variable. If you need the value of an expression, you evaluate the expression. If you need it in many places in a program, you evaluate it many places.
While the abstract properties of a state-less language might be attractive to a language designer, practical programs use variables to simplify the expression of computation. Each variable has several distinguishing features. The scope of a variable is the textual area within which it can be used. The extent of a variable defines how long its value lasts. The type of a variable is the signature of messages sent to it.
Long extent, wide scope, and large type all make variables difficult to manage. If all three factors are present in many variables, you have a program which can only be understood in its entirety. Limiting the scope, extent, and type of variables wherever possible produces programs which are easier to understand, modify, and reuse.
Another language design decision is whether to require the explicit declaration of variables. Early languages like FORTRAN detected the presence of variables automatically. Algol and its descendants required explicit declaration. Explicit declaration puts a burden on the programmer writing the program, but pays off when someone else needs to understand the program. The presence and use of variables is often the first place a reader begins.
Therefore, wherever possible create variables whose scope and extent are a single method. Declare them just below the method selector. Assign them a value as soon as the expression is valid.
Collecting Temporary Variable (2) saves intermediate results for later use. Caching Temporary Variable (3) improves performance by saving values. Explaining Temporary Variable (4) improves readability by breaking up complex expressions. Reusing Temporary Variable (5) lets you use the value of a side-effecting expression more than once in a method.
??? How to explain this back link???
How do you gradually collect values to be used later in a method?
The right set of enumeration protocols would make this question moot. In particular, inject:into: often eliminates the need for a temporary variable. Code like:
| sum |
sum := 0.
self children do: [:each | sum := sum + each size].
^sum
can be rewritten as:
^self children inject: 0 into: [:sum :each | sum + each size]
The variety of enumeration strategies in complex programs makes it impossible to generalize the inject:into: idiom. For example, what if you want to merge two collections together so that you have an element from collection a, an element from collection b, and so on? This would require a special enumeration method:
^self as with: self bs inject: Array new into: [:sum :eachA :eachB | (sum copyWith: eachA) copyWith: eachB]
It is much simpler to create a Stream for the duration of the method:
| results |
results := Array new writeStream.
self as with: self bs do: [:eachA :eachB | results nextPut: eachA; nextPut: eachB].
^results contents
Therefore, when you need to collect or merge objects over a complex enumeration, use a temporary variable to hold the collection or merged value.
Role Suggesting Temporary Variable Name (6) explains how to name the variable.
??? Again, what does the back link look like??? This has to also have links to the performance tuning pattern language, probably something about already having measured a bottleneck.
How do you improve the performance of a method?
Many performance related decisions sacrifice programming style on the altar of demanding users with limited machine resources. Successful performance tuning hinges on being explicitly aware of this tradeoff and only introducing changes that pay back in increased performance more than they cost in increased maintenance.
As with variables, the scope and extent of a performance tuning decision dramatically affect its cost. Performance related changes that are confined to a single object are good; changes that only affect a single method are even better.
All performance tuning boils down to two techniques: either you execute code less often or you execute code that costs less. Of these, the first is often the most valuable. It relies on the fact that, for reasons of readability, expressions are often executed several times even though they return the same value. Caching saves the value of the expression so that the next time the value is needed it is available instantly.
The biggest problem with caches is their assumption that the expression returns the same value. What if this isn't true? What if it is true for a while, but then the value of the expression changes? You can limit the complexity introduced by the need to keep a cache valid by limiting the scope and extent of the variable used for the cache.
Therefore, set a temporary variable to the value of the expression as soon as it is valid. Use the variable instead of the expression in the remainder of the method.
For example, you might have some graphics code which uses the bounds of the receiver. If calculating the bounds is expensive, you can transform:
self children do: [:each | ...self bounds...]
into:
| bounds |
bounds := self bounds.
self children do: [:each | ...bounds...]
If the cost of calculating the bounds dominates the cost of the method, this takes a method which is linear in cost in the number of children and turns it into one which is constant.
Role Suggesting Temporary Variable Name (6) explains how to name the variable. Instance Variable Cache (?) caches expression values if they are used from many methods.
Temporary Variable (1) can be used to improve the readability of complex methods.
How can you simplify a complex expression within a method?
In the passion of the moment, you can write expressions within methods which are quite complex. Most expressions are simple at first. As soon as you are looking at live data, though, you realize your naive assumptions will never work. You add this complexity, then that one, then another, until you have many layers of messages piled on each other. While you are debugging it is all understandable because you have so much context. Coming back to such a method in six months is quite a different experience.
Fixing the method right might require changes to several objects. While you are just exploring, such a commitment might be inappropriate.
Therefore, take an identifiable subexpression out of the complex expression. Assign its value to a temporary variable before the complex expression. Use the variable in the complex expression.
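As an illustration, here is a minimal workspace sketch of this transformation. The margin arithmetic and the names width, height, margin, innerWidth, and innerHeight are assumptions invented for the example; they do not come from any particular class library.

```smalltalk
| width height margin area innerWidth innerHeight explainedArea |
width := 120.
height := 80.
margin := 10.

"Before: the whole computation is a single expression."
area := (width - (2 * margin)) * (height - (2 * margin)).

"After: each identifiable subexpression is assigned to a temporary
 first, and the final expression uses the temporaries."
innerWidth := width - (2 * margin).
innerHeight := height - (2 * margin).
explainedArea := innerWidth * innerHeight.
```

Both forms compute the same value; the second names the intermediate steps so a later reader does not have to reconstruct them.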
Role Suggesting Temporary Variable Name (6) explains how to name the variable. Composed Method (?) puts the subexpression where it belongs and gives it a name.
Temporary Variable (1) can be used to reuse the value of expressions which cannot be executed more than once.
How do you use an expression several places in a method when its value may change?
Methods without temporary variables are easier to understand than methods with temporary variables. However, you sometimes encounter expressions whose values change, either because of side-effects of evaluating the expression or because of outside effects, but you need to use the value more than once. Using a temporary variable is worth the cost in such a case, because the code simply wouldn't work otherwise.
For example, if you are reading from a stream, the evaluation of "stream next" causes the stream to change. If you are matching the value read against a list of keywords, you must save the value. Thus:
stream next = a ifTrue: [...].
stream next = b ifTrue: [...].
stream next = c ifTrue: [...]
is probably not what you mean, because each comparison reads a different element from the stream. Instead, you need to save the value in a temporary variable so you only execute "stream next" once.
| token |
token := stream next.
token = a ifTrue: [...] ...
Resources that are affected by the outside world also require this treatment. For example, "Time millisecondClockValue" cannot be executed more than once if you want to be guaranteed the same answer.
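A small workspace sketch of the clock case follows. The names now, warnAt, and timeoutAt and the 250 and 500 millisecond offsets are hypothetical, chosen only to show two values derived from a single reading.

```smalltalk
| now warnAt timeoutAt |

"Read the clock exactly once and keep the answer."
now := Time millisecondClockValue.

"Both deadlines are derived from the same reading, so they are
 guaranteed to be exactly 250 milliseconds apart."
warnAt := now + 250.
timeoutAt := now + 500.
```

Had each deadline been computed from its own call to "Time millisecondClockValue", the gap between them could differ from 250 by however long the clock advanced between the two calls.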
Therefore, execute the expression once and set a temporary variable. Use the variable instead of the expression in the remainder of the method.
Role Suggesting Temporary Variable Name (6) explains how to name the variable.
Collecting Temporary Variable (2) stores the intermediate results of a computation. Caching Temporary Variable (3) improves performance by eliminating redundant computation. Explaining Temporary Variable (4) makes methods containing complex expressions easier to read. Reusing Temporary Variable (5) correctly executes methods containing side-effecting expressions.
What do you call a temporary variable?
There are two important dimensions to communicate about a variable. The first is its type. Readers wishing to modify code need to know what responsibilities are assumed for an object occupying a variable. The second important dimension is role, that is, how the object is used in the computation. Understanding the role is important to understanding the method in which the variable is used. Different kinds of variables require different naming treatment to communicate type and role.
Temporary variables communicate their type by context. If you are staring at:
| sum |
sum := 0.
... sum ...
you cannot possibly be confused about its type. Even if a temporary variable is initialized by an expression, you will be able to understand its type as long as the expression is well written:
| bounds |
bounds := self bounds.
... bounds ...
Role, on the other hand, requires explicit communication. You have probably had the experience of reading code whose temporary variables had names like "a", "b", and the ever popular "temp". As a reader, you have to go through the code holding these useless names in your head until the light comes on. "Aha! 'b' is really the left offset of the parent widget."
Therefore, name temporary variables for the role they play in the computation. Use variable naming as an opportunity to communicate valuable tactical information to future readers.
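To make the contrast concrete, here is a workspace sketch of the same computation written twice. The literal array and the names childSizes and totalSize are assumptions made up for this illustration.

```smalltalk
| a b childSizes totalSize |

"Names that hide the role: the reader has to trace the loop to
 learn that b is a running total of the elements of a."
a := #(3 5 8).
b := 0.
a do: [:each | b := b + each].

"The same computation with role suggesting names."
childSizes := #(3 5 8).
totalSize := 0.
childSizes do: [:each | totalSize := totalSize + each].
```

The two halves behave identically; only the second tells the reader what each variable is for without reading the loop body.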