This paper has the following outline:
ENVY/Developer [1] is the most widely used source code and configuration management system for Smalltalk development. Unfortunately, many people find ENVY very difficult to apply. The difficulty is not because ENVY does not work well, but because few people really understand the philosophy for how to use it well. This paper demonstrates the best practices for effectively partitioning source code using ENVY. These practices make ENVY much easier to employ and produce an improved software development and maintenance process.
The format of this paper is a series of sections called "patterns" that form a collection called a "pattern language." Some readers may find this format unusual. Accustomed to reading a paper linearly from beginning to end, a reader may not be prepared to let the patterns show him his own path. This paper demonstrates that the pattern format greatly reduces the effort required both to present these guidelines and to understand them. Readers should find that patterns help convey difficult material clearly.
A pattern is a format, originally developed by Christopher Alexander [3], that documents the solution to a common problem in a context. An expert in a field can use patterns to document the techniques he has learned that make him an expert. The pattern format helps the author document the technique completely yet concisely. Thus one can read these patterns to quickly learn directly from the expert.
A pattern has four main parts:
You do not necessarily have to read all of the sections in a pattern to gain its benefit. These steps show how to read a pattern quickly and still learn from it:
"A pattern language is a collection of patterns that reinforce each other to solve an entire domain of problems." A pattern language combines the patterns in a way that will guide the reader through an entire solution process. When done well, a pattern language combines its patterns in such a way that its combined whole produces a greater benefit than the sum of its parts.
You do not necessarily have to read the patterns in a language in order, nor do you have to read all of them to learn from the language. These steps show how to read a pattern language quickly and learn from it as well:
I often mentor Smalltalk programmers about how to use ENVY/Developer to manage the configuration of their code. The two most common questions they ask are, "How many applications should I use?" and "How many subapplications should I use?" The goal of this pattern language is to answer those two questions.
However, before I can answer those two questions, I need to answer some more fundamental questions first. If you read these patterns in the order presented, the answers to the two common questions will make more sense. But if you can't wait to read about the common questions, read those patterns first, then read the more fundamental ones that they reference.
| To answer this question: | Read this section: |
| --- | --- |
| How should I architect my (sub)system to help make my development team more productive? | 1. Layered and Sectioned Architecture |
| How should I organize a (sub)system so that the components are fairly independent from one another? | 1.1 Separate Layers |
| How should I organize a layer into sets of functionality? | 1.2 Separate Sections |
| How should I store my layers in ENVY so that I can manage them as encapsulated layers of code? | 2. Layer in an Application |
| How should I store my sections in ENVY so that I can manage them as encapsulated sections of code? | 3. Section in a Subapplication |
| How many applications should I divide my (sub)system into? | 4. Two Applications |
| How many subapplications should I divide my application into? | 5. No Subapplications |
Problem
How should I architect my (sub)system to help make my development team more productive?
Context
References within code produce code dependencies. A code component that references another becomes dependent on that other component. If the other component does not work properly, the dependent one probably will not work either. A code reference can take many forms, but in Smalltalk a reference is usually a class reference. Thus whenever a component uses a global variable that is a class, it is creating a dependency on that class and its code.
Dependencies between components make code difficult to implement, maintain, and reuse. A developer cannot finish implementing a component until the components it depends on have been implemented. Changes to a component with many dependents can adversely affect those dependents. A component cannot be reused if the components it depends on are not available. Thus minimizing and simplifying the dependencies between components makes code better.
Dependencies between components become paths of communication between developers. Code structure reflects the organizational structure that created it. [5] Numerous complex dependencies in the code require similarly complicated paths of communication within the team. Code dependencies and team communication paths become one and the same.
Figure 1: The productivity of each team member declines as a team gets larger.
A team that must devote greater amounts of its resources to communication loses its effectiveness. Worst case, a team of n people requires on the order of n^2 paths of communication (n(n-1)/2 pairwise paths, to be exact). Each new member joining a team of n adds n new paths, so coordination overhead goes up and productivity per team member goes down. Figure 1 sketches the relationship between a team's size and the productivity of its members. A team must simplify its communication or it cannot grow effectively.
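As a rough illustration of the worst case described above, the following sketch counts the pairwise communication paths in a fully connected team. The function name `communication_paths` is hypothetical and chosen only for this example.

```python
def communication_paths(team_size: int) -> int:
    """Worst-case pairwise communication paths in a fully connected team."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    # Every pair of people is one potential path: n * (n - 1) / 2.
    return team_size * (team_size - 1) // 2


if __name__ == "__main__":
    for n in (3, 10, 20, 50):
        # Paths added when one more person joins a team of n.
        added = communication_paths(n + 1) - communication_paths(n)
        print(f"{n:>2} people: {communication_paths(n):>4} paths; "
              f"one more person adds {added}")
```

The count grows quadratically while the head count grows linearly, which is the effect that Figure 1 sketches.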
Dependencies within code are necessary, but the system architecture should minimize and simplify those dependencies. This will make the code better and the team more productive.
Solution
Develop a system with a Layered and Sectioned Architecture to reduce dependencies between components and help make the developers more productive.
A complex system implemented with a large amount of code will be difficult to develop and manage as one huge chunk of code. Code should be designed and implemented with components that have high cohesion and low adhesion. Divide the system into components, each with a relatively simple and well-defined interface. Make the dependencies between these components as simple as possible by reducing the references between components. When references are necessary, make them one-way references when possible.
Two components can be related in one of three ways. Their relationship tells you how to architect them:
A system designed with a Layered and Sectioned Architecture will produce these well encapsulated components. To learn how to create such an architecture, read the patterns that show you how to Separate Layers and Separate Sections.
Example
Teams of 3-10 people tend to be highly productive, but larger teams do not. [6] Yet a system may be so large and require so much code to be written that it may seem to require 20-50 (or more) people to develop it.
The answer is not to develop the system with a team of 20-50 people. The effort required to coordinate all of those people will cause the process to grind to a standstill. Instead, divide the system into well encapsulated subsystems, each of which can be developed by a team of 3-10 people. Also create an architecture and infrastructure team to define the subsystems and their interfaces and enforce compliance.
If the subsystems are well encapsulated, this means that the dependencies between them will be few and simple. Similarly, the process to develop them will only require a small number of simple paths of communication between the teams. This will help keep the teams productive, reducing the project's risk.
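The benefit of partitioning can be estimated with a small calculation. This is only a sketch under stated assumptions: people communicate freely within their own team, and each pair of teams shares a single path (for example, an agreed subsystem interface). The function names are hypothetical.

```python
from math import comb
from typing import Sequence


def paths_single_team(people: int) -> int:
    """Worst-case paths when everyone may need to talk to everyone."""
    return comb(people, 2)


def paths_partitioned(team_sizes: Sequence[int]) -> int:
    """Worst-case paths when people talk freely only inside their own team.

    Assumption: each pair of teams shares exactly one path, such as an
    agreed subsystem interface.
    """
    within_teams = sum(comb(size, 2) for size in team_sizes)
    between_teams = comb(len(team_sizes), 2)
    return within_teams + between_teams


if __name__ == "__main__":
    print(paths_single_team(40))        # one team of 40 people
    print(paths_partitioned([8] * 5))   # five teams of 8 people
```

Under these assumptions, 40 people in one team have 780 worst-case paths, while five teams of 8 have 150.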
Problem
How should I organize a (sub)system so that the components are fairly independent from one another?
Context
Code designed with a Layered and Sectioned Architecture is easier to maintain and reuse.
Reducing the references to a code component helps encapsulate that code. If two components refer to each other, changes in either can affect the other. The two components would be easier to maintain if only one of them referred to the other, but not vice versa. Then changes to the one that isn't referenced cannot affect the other.
As components are grouped to separate a set that is referenced from another doing the referencing, these sets form layers stacked one on top of the other. Like floors of an office building, each floor depends on the ones beneath it for support but does not depend on those on top of it (or in the office tower next door). Similarly, a layer of code references those below it but not those above it or at the same level.
Each layer should be simple enough that one team can be responsible for implementing and maintaining it. Otherwise, two teams developing a layer will need to coordinate their activities, coordination that increases communication overhead within and between the teams.
Smalltalk code is a hyper-linked system where all of the code appears to refer to all of the other code. Dividing such a system into loosely coupled components is difficult because it requires that the code in some of the components not refer to that in the other components. The Smalltalk development environment allows, if not encourages, any code to refer to any other code, so it actually hampers the development of loosely coupled components. Discipline on the part of the developers of a system is required to design these loosely coupled components and to implement code that obeys the components' boundaries.
Solution
Divide a (sub)system into Separate Layers to separate its components.
Design the (sub)system using a layered architecture. Each layer will depend on the ones below it but not those above it or at the same level. Each layer will be simple enough to be developed by one team. This way the code in the lower layers will be better encapsulated because it will not depend on any of the code in the layers above it. Each team will be better "encapsulated" because it will only be responsible for its own layer(s) and not miscellaneous parts of the entire system.
To separate code into layers, diagram (mentally or on paper) how the classes refer to each other. If two classes refer to each other, either directly or indirectly, they will have to go in the same layer. If one class (call it ClassA) references another (ClassB) but not vice versa, ClassB can go in a layer that is lower than the one ClassA is in. Repeat this process until all of the classes in each layer refer to each other or to classes in lower layers. A class should never refer to another class in a higher layer.
Sometimes when diagramming classes, a developer will find that a large number of classes all reference each other eventually. This is spaghetti code that will be difficult to maintain and reuse. It should be redesigned and refactored such that it can be divided into layers.
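The diagramming procedure can be approximated mechanically. The sketch below is one possible reading of it, not a tool that ENVY or Smalltalk provides: it groups classes that reference each other directly or indirectly, then places each group one level above the highest group it references. The function names and the input shape (a mapping from each class name to the class names it references) are assumptions made for this example.

```python
from typing import Dict, FrozenSet, List, Set


def reachable(refs: Dict[str, Set[str]], start: str) -> Set[str]:
    """Every class that `start` references, directly or indirectly."""
    seen: Set[str] = set()
    stack = list(refs.get(start, ()))
    while stack:
        cls = stack.pop()
        if cls not in seen:
            seen.add(cls)
            stack.extend(refs.get(cls, ()))
    return seen


def separate_layers(refs: Dict[str, Set[str]]) -> List[List[FrozenSet[str]]]:
    """Group mutually referencing classes, then stack the groups by level.

    Level 0 holds groups that reference no other group. Groups on the
    same level are peers: neither references the other.
    """
    classes = set(refs) | {c for targets in refs.values() for c in targets}
    reach = {c: reachable(refs, c) for c in classes}

    # Classes that can reach each other must share a group.
    group_of: Dict[str, FrozenSet[str]] = {
        c: frozenset({c} | {d for d in reach[c] if c in reach[d]})
        for c in classes
    }

    level: Dict[FrozenSet[str], int] = {}

    def level_of(group: FrozenSet[str]) -> int:
        if group not in level:
            below = {group_of[d] for c in group for d in refs.get(c, ())}
            below.discard(group)
            level[group] = 1 + max((level_of(g) for g in below), default=-1)
        return level[group]

    levels: Dict[int, List[FrozenSet[str]]] = {}
    for group in set(group_of.values()):
        levels.setdefault(level_of(group), []).append(group)
    return [levels[i] for i in sorted(levels)]
```

If one group in the result contains most of the classes, that is the spaghetti-code situation described above: the references need to be redesigned before the code can be layered.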
Example
The graphics system in VisualWorks [7] is layered. Here are some of those layers:
Each of these layers requires the one before it, but none of them uses those that come after it.
The layers help keep changes in one layer from affecting the others. For example, if you wanted to rearchitect the windowing system, you would need to rewrite the third layer. However, these changes to the third layer would hardly affect the first two if at all. On the other hand, if you want to rewrite the graphics system, that wouldn't affect the first layer, but it would probably require rewriting the windowing system as well. Changes to a layer affect the layers above it but not those below it.
VisualWorks has another layer, File System, which implements streams to read from and write to the operating system's files. Since it's implemented using objects (as all of VisualWorks is), File System is built on top of Kernel. However, since File System doesn't do anything graphical, it's not built on top of Graphics System. File System and Graphics System are peers, so changes to one cannot affect the other. Thus changes to one of those layers only affect part of the system, not the entire system. The only changes that can affect the entire system are those made to Kernel. This layered system is much safer than one where all code affects all other code.
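The reasoning about which layers a change can affect amounts to following the "built on top of" relationships upward. The sketch below models the layers named in this example as a simple mapping. "Windowing System" is an assumed label for the third layer, and `affected_by_change` is a hypothetical helper, not part of VisualWorks or ENVY.

```python
from typing import Dict, Set

# Each layer maps to the layers directly beneath it.
# "Windowing System" is an assumed label for the third layer.
LAYERS_BELOW: Dict[str, Set[str]] = {
    "Kernel": set(),
    "Graphics System": {"Kernel"},
    "Windowing System": {"Graphics System"},
    "File System": {"Kernel"},
}


def affected_by_change(changed: str, below: Dict[str, Set[str]]) -> Set[str]:
    """Layers built, directly or indirectly, on top of `changed`."""
    affected: Set[str] = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for layer, deps in below.items():
            if current in deps and layer not in affected:
                affected.add(layer)
                frontier.append(layer)
    return affected


if __name__ == "__main__":
    for layer in LAYERS_BELOW:
        print(layer, "->", sorted(affected_by_change(layer, LAYERS_BELOW)))
```

In this model a change to Kernel reaches every other layer, a change to Graphics System reaches only the windowing layer, and a change to File System reaches nothing else.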
Problem
How should I organize a layer into sets of functionality?
Context
Code designed with a Layered and Sectioned Architecture is easier to maintain and reuse.
If one component of code references the other but not vice versa, they can be divided into Separate Layers. However, this can still leave a large number of components that all reference each other and thus must be stored within the same layer.
Reducing the references between components helps encapsulate them. The components may all have to reference each other eventually, since they're in the same layer, but it's better to have relatively few references than many. Components that have many references between them will be more dependent on each other than those with few references between them. Changes can often be made to a component without requiring corresponding changes to the other components that depend on it only slightly.
As components are grouped into highly dependent sets that are fairly independent of each other, these sets form sections of code with high cohesion and low adhesion. Like multiple businesses on the same floor of an office building, the activities in each business depend on each other much more than on the activities in other businesses. Similarly, a section of code references the other sections some but mostly references itself.
Each section should be simple enough that one developer can be responsible for implementing and maintaining it. Otherwise, two developers implementing a section will need to coordinate their activities, coordination that increases communication overhead within the team.
Developing Smalltalk code into highly cohesive sections with low adhesion is difficult. The Smalltalk development environment allows and even encourages the code in various sections to reference each other heavily. Discipline is required on the part of Smalltalk developers to design their code into sections and implement code that obeys those sections' boundaries.
Solution
Divide each of the Separate Layers into Separate Sections to separate its sets of functionality.
Design each of the Separate Layers with a sectioned architecture. Each section will be highly cohesive and the sections will have low adhesion with each other. Each section will be simple enough to be developed by one person. This way each component will become better encapsulated as its connections to other components are minimized. Similarly, each developer will be better "encapsulated" because he will only be responsible for his own section(s), not miscellaneous parts of the entire system.
To separate code into sections, diagram (mentally or on paper) how the classes refer to each other. If one class references another one a lot and/or vice versa, put them in the same section. If two classes reference each other relatively little, put them in separate sections. If a resulting section contains numerous classes, repeat the process to divide it into sections. Repeat this process until all of the classes in a section refer to each other a lot and have fewer references to classes in other sections.
Sometimes when diagramming classes, a developer will find that a large number of classes all reference each other heavily. This is spaghetti code that will be difficult to maintain and reuse. It should be redesigned and refactored such that only a few classes at a time reference each other heavily.
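One way to experiment with this grouping is to count references between classes and merge any two classes whose count reaches a chosen threshold. This is only a sketch: the pattern does not quantify "a lot," so the threshold is an assumption, as are the function name `separate_sections` and the input shape (reference counts keyed by a referencing class and a referenced class).

```python
from typing import Dict, List, Set, Tuple


def separate_sections(
    ref_counts: Dict[Tuple[str, str], int],
    threshold: int,
) -> List[Set[str]]:
    """Put classes in one section when either references the other heavily."""
    parent: Dict[str, str] = {}

    def find(cls: str) -> str:
        # Union-find lookup with path halving.
        parent.setdefault(cls, cls)
        while parent[cls] != cls:
            parent[cls] = parent[parent[cls]]
            cls = parent[cls]
        return cls

    for (source, target), count in ref_counts.items():
        root_source, root_target = find(source), find(target)
        # Heavy references in either direction merge the two sections.
        if count >= threshold and root_source != root_target:
            parent[root_source] = root_target

    sections: Dict[str, Set[str]] = {}
    for cls in parent:
        sections.setdefault(find(cls), set()).add(cls)
    return list(sections.values())


if __name__ == "__main__":
    # Hypothetical reference counts for illustration only.
    counts = {
        ("Bill", "BillItem"): 12,
        ("BillItem", "Bill"): 7,
        ("Bill", "Customer"): 1,
        ("Customer", "Address"): 9,
    }
    print(separate_sections(counts, threshold=5))
```

Raising the threshold splits the code into more, smaller sections; a section that still contains numerous classes can be run through the same process with a higher threshold.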
Example
VisualWorks contains a feature called BOSS (the Binary Object Streaming System) for writing an object to disk and reading it back into the image. Because it can be applied to any object, it is one of the most basic features of the VisualWorks system, and thus is part of the bottom layer (Kernel). Although it requires a number of classes to implement, only one of those classes (BinaryObjectStorage) is referenced by the rest of VisualWorks. [8] Thus BOSS is a section of code. The classes reference each other heavily, but they reference the rest of the system little. All of the references to the BOSS classes are channeled through one class with a relatively simple interface.
Problem
How should I store my layers in ENVY so that I can manage them as encapsulated layers of code?
Context
Code designed with a Layered and Sectioned Architecture is easier to maintain and reuse. However, developing Smalltalk code with Separate Layers goes against the hyper-linked nature of that code.
Even if Smalltalk code has a Layered and Sectioned Architecture, these layers are difficult to recognize. A large set of code contains numerous classes, all of which seem to reference each other. A developer learning such a library of code often feels that he must learn all of it to understand any of it. The code may be in layers, but the developer cannot see those layers.
Actions are often applied to an entire layer of code at a time. A layer should be simple enough that one team can be responsible for implementing and maintaining it. When filing out a class, one often wishes to file out its associated classes without filing out the other classes in the system. This means that one should file out the layer as a whole.
As maintenance is performed on code, the layers must be preserved so that the advantages of the Layered and Sectioned Architecture are preserved. This is difficult to accomplish when the layers are not apparent.
Smalltalk needs a mechanism for distinguishing separate layers of code. ENVY adds such a mechanism to Smalltalk, a type of component called an "application." Just as a layer contains tightly coupled classes, so does an application. A layer must know what layers come immediately before it, so an application records this information.
While developing Smalltalk code with a Layered and Sectioned Architecture is difficult, it is necessary in order to partition the code into ENVY applications. ENVY allows each application to be loaded by itself as long as its prerequisite applications are loaded. This requires that the code in the application not attempt to collaborate with or reference any other code except that in its application's prerequisites. So a Layered and Sectioned Architecture is required for effective use of ENVY.
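The loading constraint can be checked on paper or with a small model. The sketch below is not ENVY's API; it is a hypothetical Python model that reports class references reaching outside an application's prerequisites. It assumes prerequisites apply transitively (an application may reference code in its prerequisites' prerequisites), and all names are illustrative.

```python
from typing import Dict, List, Set, Tuple


def boundary_violations(
    app_of: Dict[str, str],              # class -> application that stores it
    prerequisites: Dict[str, Set[str]],  # application -> direct prerequisites
    refs: Dict[str, Set[str]],           # class -> classes it references
) -> List[Tuple[str, str]]:
    """Return (referencing class, referenced class) pairs that break a boundary."""

    def visible(app: str) -> Set[str]:
        # The application itself plus everything beneath it (assumed transitive).
        seen = {app}
        stack = [app]
        while stack:
            for prereq in prerequisites.get(stack.pop(), ()):
                if prereq not in seen:
                    seen.add(prereq)
                    stack.append(prereq)
        return seen

    violations: List[Tuple[str, str]] = []
    for cls in sorted(refs):
        allowed = visible(app_of[cls])
        for target in sorted(refs[cls]):
            if app_of[target] not in allowed:
                violations.append((cls, target))
    return violations


if __name__ == "__main__":
    # Hypothetical classes and applications for illustration only.
    app_of = {"Bill": "BillingDomain", "Customer": "BillingDomain",
              "BillEditor": "BillingUI"}
    prerequisites = {"BillingDomain": set(), "BillingUI": {"BillingDomain"}}
    refs = {"BillEditor": {"Bill"}, "Bill": {"Customer", "BillEditor"}}
    print(boundary_violations(app_of, prerequisites, refs))
```

Here the domain class `Bill` references the window class `BillEditor`, which lives in a higher layer, so the pair is reported as a violation.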
Solution
Store each Layer in an Application so that you can manage them as layers.
Design a system using a Layered and Sectioned Architecture, then give each layer a name. Implement the code using ENVY, creating an application for each layer and naming each application the same as its corresponding layer. Set each application's prerequisites to be the applications of the layers immediately beneath the layer being defined.
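To make the role of prerequisites concrete, the sketch below models an application as a name plus the names of the applications immediately beneath it, and derives a load order in which every prerequisite precedes the application that needs it. This is a hypothetical Python model, not ENVY's own interface; the `Application` class, `load_order` function, and application names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Application:
    """A layer stored as an application (hypothetical model, not ENVY's API)."""
    name: str
    prerequisites: Tuple[str, ...] = ()


def load_order(apps: Iterable[Application], target: str) -> List[str]:
    """Order in which to load `target` so prerequisites always come first."""
    by_name = {app.name: app for app in apps}
    order: List[str] = []
    visiting: Set[str] = set()

    def visit(name: str) -> None:
        if name in order:
            return
        if name in visiting:
            raise ValueError(f"circular prerequisites involving {name}")
        if name not in by_name:
            raise KeyError(f"unknown application {name}")
        visiting.add(name)
        for prereq in by_name[name].prerequisites:
            visit(prereq)
        visiting.discard(name)
        order.append(name)

    visit(target)
    return order


if __name__ == "__main__":
    apps = [
        Application("BillingDomain"),
        Application("BillingUI", prerequisites=("BillingDomain",)),
    ]
    print(load_order(apps, "BillingUI"))
```

A circular prerequisite chain raises an error in this model, which mirrors the rule that a layer never depends on a layer above it.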
Example
VisualWorks is implemented using a Layered and Sectioned Architecture, but this is very difficult to see in the base VisualWorks image (without ENVY). Base VisualWorks looks like a thousand classes that all seem to use each other to implement themselves.
ENVY clearly shows the Layered and Sectioned Architecture of VisualWorks. The main applications and the architecture they embody are shown in Figure 2. Each of the ten blocks in the figure is an application/layer. Two applications along the same horizontal line are peer applications, which means that they do not depend on each other but they do both depend on the same foundation. Each vertical slice shows a column of layers with direct dependencies. The simplest column is the one with VisualWorksDevelopment on top; it is:
Figure 2: Applications show the layered architecture of VisualWorks.
Problem
How should I store my sections in ENVY so that I can manage them as encapsulated sections of code?
Context
Code designed with a Layered and Sectioned Architecture is easier to maintain and reuse. However, designing code with Separate Layers and Separate Sections goes against the hyper-linked nature of Smalltalk code.
Even if Smalltalk code has a sectioned architecture, these sections are difficult to recognize. A developer learning a set of classes has difficulty realizing that some of the classes collaborate much more closely than others.
Actions are often applied to an entire section of code at a time. A section should be simple enough that one developer can be responsible for implementing and maintaining it. When filing out a class, one needs to also file out its close collaborators, which can easily be done by filing out the section as a whole.
As maintenance is performed on code, the sections must be preserved, but to do so the sections must be apparent.
Smalltalk needs a mechanism for distinguishing separate sections of code. ENVY adds such a mechanism to Smalltalk, a type of component called a "subapplication." Just as a section contains classes that collaborate a lot, so does a subapplication. A section must know what layer it belongs in, so a subapplication records this information.
While developing Smalltalk code with a Layered and Sectioned Architecture is difficult, it is necessary in order to use ENVY subapplications effectively. Subapplications show logical divisions in the code. If the divisions do not exist, separating them into subapplications is misleading and counterproductive. So a Layered and Sectioned Architecture is necessary for the effective use of ENVY.
Solution
Store each Section in a Subapplication so that you can manage them as sections.
Design a system using a Layered and Sectioned Architecture, then give each section a name. Implement the code using ENVY, creating a subapplication for each section and naming each subapplication the same as its corresponding section. Since the section is part of a particular layer, create the subapplication to be part of the application for that layer. If the section is within a larger section, create its subapplication within the larger section's corresponding subapplication.
Example
The VisualWorksBase application contains the fundamental window builder code, a complex framework that combines together many sets of functionality. To make this code easier to manage, ENVY divided it into sections:
Many of these sections are in turn divided into subsections.
ENVY clearly shows the sectioned architecture of VisualWorksBase. Those subapplications and the architecture they embody are shown in Figure 3.
Figure 3: Subapplications show the sections of the VisualWorksBase application.
Problem
How many applications should I divide my (sub)system into?
Context
When implementing Smalltalk code in an ENVY image, each piece of code must be stored in an application. Thus you will need at least one application to store your code in.
The base ENVY image already includes several applications that contain the vendor (ParcPlace and OTI) code. You could store your code in these, but this would introduce several problems, the primary one being the difficulty distinguishing your code from the vendors'. Thus you should store your code in your own applications, not the vendors'.
You could store all of your code in one application. However, then ENVY would manage it all as one unit of code and would not be able to help you manage different sets of code independently. To manage these sets independently, design your subsystem with Separate Layers and store each Layer in an Application.
One layer is often insufficient for managing most subsystems. Most can easily be broken into two distinct layers, one containing the domain models and the other containing the application models. As design and implementation progress, other layers may become apparent as well.
Solution
Start a (sub)system with Two Applications.
One of the applications will contain the domain model layer(s) and the other will contain the application model layer(s). [9] As the need for other layers becomes apparent, create applications for those as well.
Example
A Billing subsystem might contain domain objects such as Bill and Customer and application models for windows such as BillEditor and BillBrowser. Implement this using two ENVY applications:
Later, the subsystem might be expanded to contain some automated processing features. This would introduce a new layer and corresponding application:
Problem
How many subapplications should I divide my application into?
Context
When implementing Smalltalk code in an ENVY image, each piece of code must be stored in an application. However, code does not have to be stored in a subapplication; it can be stored directly in an application.
You could store all of a layer's code in the application. However, then ENVY would manage it all as one unit of code and would not be able to help you manage different sets of code independently. To manage these sets independently, design your layer with Separate Sections and store each Section in a Subapplication.
Sections are often difficult to anticipate during design and so often are not discovered until implementation.
Solution
Start an application with No Subapplications.
Since a newly created application contains no subapplications, don't create any subapplications to begin with. As the need for sections becomes apparent, create subapplications to represent them.
Example
Let's say you're going to implement a number of user interface windows for your Billing application. Create an application for them called "BillingUI."
As you implement these windows, you may find that you have two main sets of windows. Perhaps you have one set of windows for reviewing large lists of data and another for entering and editing specific data elements. The windows in each group work together closely but interact little with those in the other group. Meanwhile, there is the main menu window that gives the user access to both sets of windows. When you discover this division, create subapplications to represent it:
The main menu window class is common to both sets so it would go in the BillingUI application itself.
I would like to thank: Ward Cunningham and Ken Auer for their extensive help in revising this paper; Kent Beck for inspiring me to see the connection between coding processes and people's; and everyone at PLoP '95 and KSC who made suggestions for improving this paper.
Bobby Woolf is a Senior Member of Development Staff at Knowledge Systems Corp. in Cary, North Carolina. He mentors Smalltalk developers in the use of VisualWorks, ENVY, and design patterns. He welcomes your comments at woolf@acm.org or at http://www.ksccary.com.