(This is just an early cut; hopefully, there will be more to come.)
Here's some stuff I know about packages:
Forces that act on Packages
- Tightly coupled classes need to be changed together - changes in one frequently cause the need for change in others.
- Clients need to adopt these groups of changes as a whole.
- Individual developers need to know which changes exist and what they must do to integrate those changes.
- Adopting new changes to a class means adopting old changes as well.
- There are generally too many classes in a project to treat each one as a UnitOfModularity.
- Libraries can be used to group classes physically the same way that packages can be used to group them logically.
- Clients of a library depend on the whole library - if any part of the library changes, the client must be revalidated.
- It is cheaper to make many changes to one library than it is to make the same change to many libraries.
- Depended on classes provide a needed context for the testing of dependent classes.
- Testing least dependent classes first allows errors to be localized to their originating classes automatically.
- If dependencies form a loop then errors can only be localized to the loop, not to individual classes within the loop.
- Dependencies should not propagate through a package (from the UML definition of package).
- It is expensive to change a depended on class because the dependent classes may have to change and will have to be retested.
- Packages that insulate against the propagation of requirement changes help to make the overall system more flexible.
Some of these forces have been distilled into a set of principles that govern package design:
- The ReuseReleaseEquivalencePrinciple
- The granule of reuse is the same as the granule of release. Only components that are released through a tracking system can be effectively reused.
- The CommonClosurePrinciple
- Classes within a released component should share common closure. That is, if one needs to be changed, they all are likely to need to be changed. What affects one, affects all.
- The CommonReusePrinciple
- Classes within a released component should be reused together. That is, it is impossible to separate the components from each other in order to reuse less than the total.
- The AcyclicDependenciesPrinciple
- The dependency structure for released components must be a directed acyclic graph. There can be no cycles.
- The StableDependenciesPrinciple
- The dependencies between packages should be oriented in the direction of increasing stability. A package should only depend on packages more stable than it is.
- The StableAbstractionsPrinciple
- The stability exhibited by a package is directly proportional to its level of abstraction. The more abstract a package is, the more stable it tends to be. The more concrete a package is, the more unstable it tends to be.
Here are some other ways to resolve these forces:
Designing a Package
- Encapsulate packages as libraries. Java does this with .jar files and directory structures, C++ does it with static or dynamic libraries and their associated header files. I don't know how Smalltalk does it.
- Track package releases through revision control. See Robert Martin's article Granularity for more on this. Versions allow team members to control what they need to adopt and when. The are also important for the tracking of bugs.
- A package should have a PublishedInterface. Some classes in a package should be made available for the use of external clients. No other class in the package should be visible to clients. Any other class in the package should be there to directly support the InterfaceClasses?.
- InterfaceClasses? should be fully insulating. Clients of a package should not be exposed to any dependencies that originate from outside of the package, and they should not depend on the implementation of any class. Since InterfaceClasses? are the only classes in a package that clients are exposed to, they must provide all of the DependencyInsulation? for the package. In C++ this means supplying CompilationInsulation?, which can be difficult to write and maintain. In any language it means having as stable an interface as possible to keep from breaking client code.
- Non-InterfaceClasses? need not be insulating at all. Insulating classes take care of all the DependencyInsulation? needs of the entire package. No other classes need to carry any of this burden. Instead, they should be constructed to be as efficient and flexible as possible.
- UnitTestPackages? instead of classes Since packages are the UnitOfReuse?, you only need to write UnitTests at the package level. Doing so allows the classes within a package to be tightly coupled and even to have dependency loops between them. I'm not sure that I can support the notion that these are desirable traits, but I will say that it is time-consuming and, therefore, costly to remove them. You can save on that cost if you UnitTestPackages?.
I write UnitTest
s method-by-method so I develop faster. A subset of the resulting tests will generally serve as good tests for a package. But the tests aren't there to fulfil some abstract requirement, and to be omitted if possible. They are there because I can't develop without them. -- KentBeck
How do you write UnitTest
s for private methods? I assume that you do it by calling them indirectly through public methods. Well, testing an entire package through its public interface is essentially the same thing. If you can't get complete coverage by testing through the interface classes then something's wrong. The UnitTest
s in ExtremeProgramming
substitutes for more traditional analysis documents - it tells us what the code must do. My assertion here, then, is that it is sufficient to define what a package must do and let the methods called by the UnitTest
s test the methods that they call. -- PhilGoodwin
I'm programming in Smalltalk, so I can test private methods directly. I find this is important, because I write tests to help me think. When I'm involved in a tricky bit of logic, it is enough for me to think about the logic, and not try to also guess how I can invoke it through some sequence of public messages. If the purpose of the unit tests was to verify and communicate the correct operation of the public interfaces, then your argument makes sense. If they are also a thinking tool, then they belong wherever they can help my thinking. Note that I'm not arguing against tests' role in public specification. I just use tests for that and other things, too. -- kb
It occurs to me that you can write a public member function in C++ that will do the testing of private methods on behalf of the caller, so you could use the same method in C++ as well. I do think, though, that regression testing should occur at the package level and that new tests should be added at that level when possible. -- pg
I tend to write unit tests on class-level, feature-by-feature. If I find myself in a situation where I'd like access to a private method for unit testing, it makes me wonder if my class is not taking too much responsibility, and if I should not delegate that responsibility to a class that I can test independently. So unit testing - specifically test-driven development - gives me a quick feedback on the complexity of my design. If class becomes too complex, I see it quickly not only by a desire to test private methods, but also but combinatorial explosion in my unit tests. -- GregWdowiak
(It looks like the forces of the CommonClosurePrinciple
tend towards FineGrainedPackages?
, perhaps overly so. -- StevenBlack
I think this puts too much weight on packages. Sometimes they are just ad-hoc collections of classes. If we insist that all classes in a package be tightly coupled, that means we must have a lot of packages that just contain a single class. Shouldn't Class be in its own package? Math?i System? Locale? StringTokenizer?
I don't like the Java package system, especially the rule that protected members can be accessed by any class in the same package. Remember that there is no language-level security for packages - anyone can add a new class to any package. I ignore that rule and treat packages as nothing more than a convenient way to break up the name space, like the directory structure of a file system. My unit of reuse is the class.
I don't usually find that my packages contain only a single class unless I'm working on a fairly small project, in which cases the forces driving this particular discussion are fairly weak - there may be no pressing need for packages at all.
Libraries that are shipped as standalone products are kind of their own sort of beast - every thing is meant to be reusable on its own or close to it. I'm just learning Java, so I'm not in a position to critique its library structure.
One of the forces that is motivating me to try and develop rules about packages that go beyond what I've seen from RobertMartin
is that it's such a pain to make classes reusable (especially in C++), and in many cases actually makes the classes themselves behave less efficiently. If I can separate the structural responsibilities of a class from its behavioral responsibilities by breaking up the class, then I think that I can get more efficiency and flexibility out of my classes. If I do it at the granularity of packages, then I also can benefit from grouping the structural responsibilities of a group of classes into one or two classes that represent the entire package. One of the big advantages of doing this is that I can develop a relatively stable interface for a package and release it while the rest of the code is really only of prototype quality. I can then refactor fairly liberally without impacting the rest of the system. -- pg
The neat thing about packages in my opinion is that they can be applied independent of languages - Java Packages, MS COM or .NET "objects" (really a collection of classes which you instantiate as object variables), Perl Modules all try to group together sets of related classes in a logical way. On top of that they have their own diagram in UML (ClassDiagram
s and/or other packages go "inside" each package). When I think of packages, I think of an electronic circuit board (top-level package), unto which is embedded IC chips (classes) with conducting networks between them (association, aggregation, dependency lines in the ClassDiagram
s). So the Microsoft Crypto.dll, MsXml?
, etc., are all really packages. Even using SOAP (SimpleObjectAccessProtocol
) across a network or the internet, effectively this provides communication between interfaces to packages.