Large Scale Cpp Software Design

Large Scale C++ Software Design by JohnLakos

ISBN 0201633620

Purists revile Lakos's large number of design guidelines that eschew The Latest C++ Feature, like 'template's, in favor of adamantly portable techniques like RegisteredPackage Prefixes.

He lists every rule needed to flatten partial and total rebuild times, and most of these rules also apply to normal-sized C++ projects that also compile quickly.

More important for XP, he specifies in a level of detail needed in large projects how to ensure that UnitTests run in the correct order (from most to least depended-on) so that any detected bugs are known to live in only the unit under test, not the ones it depends on.
Observation: There's a lot of good material on this page that when summarized, refactored, distilled, or further documented, could help one to be knowledgeable about this subject. It is presently somewhat difficult to sort out those things which are widely accepted from the things which are true or false.

The value that comes from the book

I wanted to add this, because this was not explicitly stated in the discussion to date (20060430). For me, the biggest benefit of this book is the discussion of coupling, levelization, and refactoring to reduce coupling, as well as the relationship to unit testing. This book's influence can be seen in tools such as Pasta/OptimalAdvisor? and HeadwaySoftware's Structure101. I think this book is the strongest voice on coupling and dependencies that I've seen; perhaps RobertMartin comes in second with his discussions on DependencyInversionPrinciple and the AcyclicDependenciesPrinciple. I have come to believe that paying attention to dependencies, which is facilitated by visualization of dependencies, makes the largest beneficial impact on software quality on non-trivial software systems. -- ChrisDailey?

Yes-- the discussion is valuable. Not the perscriptions that the book offers. Just because an author can make an argument sound perfectly reasonable, does not mean that it is. Perhaps it was for the particular project that he writes about, and the author just assumes that it is good for yours too, and convinces you. That is the sign of a good salesman, not a good engineer. Let's see "pay attention to physical dependecies" sounds real good , and it should be done. But did one real stop to ask just what are the "physical dependencies" that are being discussed?? In the book, they are in fact "makefile dependencies." MentorGraphics had big problems with compile times -- Nowadays, a project the size of MentorGraphics would be compiled in a few minutes tops. So is it really so important now to engineer a project to slice the build time down 40%, saving a few minutes tops on an old box? What about the dependencies that I really need help with, like Operating system context, DB connections, interprocess protocols, etc. These are not discussed. The book is wildly out of date, and while perhaps you can find value in any well written technical book, debating the merits of *this* book is like debating the merits of the 16th century De Re Metallica. Good book. Wouldn't dare suggest we follow its precepts though. Perhaps the reported second book is a little more up to date, but judging from the recent Lakos inspired proposals to revise the C++ standard (N1850 and N1851), and other comments on this page, I'm not holding my breath.

Run Unit Tests Frequently

Note that for XP purposes, UnitTest order isn't as important. You always know that what broke was something you touched since the last time you ran the UnitTests, and you should be running them frequently. -- JohnBrewer
Changes in operating parameters

Unless the last thing you did was upgrade the operating system, or switch to a different compiler, or build on a different architecture ... XP may not have anything to say about portability (I don't know - does it?) but it doesn't make the requirement go away. -- DanBarlow


As far as I remember, this book was published in 1997. At that time, compilers did not widely support features like namespaces - even today, HP's aC++ compiler does not put the Standard Library in namespace std. -- OliverKamps
Understanding, not Absolutism Most Important

The book is a repository of the SaneSubset of a single, huge project. Nearly anyone writing not so huge and portable a system would consider the subset too sane. While the alternative to 'template's poorly applied is often The Template Design Pattern well applied, no alternative to 'namespace' doesn't suck. But the book still can't advocate 'namespace'. Purists need to understand the book, not treat it as a list of absolute requirements and then chafe on a yoke they put their own heads into. (Or worse, think they must rescue newbies from the same.) I'm referring to some specific purists who've written very interesting, if misguided, things about why they "reject" the book, as if common sense needed a rejection. -- PhlIp
Namespaces, Packages and Prefixes

Do you think that book couldn't advocate 'namespace' if published today? If so, why? I would assume that namespaces were supported by today's compilers widely enough to be advocated. I don't believe they were supported widely enough in 1997. I would assume that even in a very large project putting a class into a namespace instead of using a package prefix would work nicely. (Am I missing anything?) You'd probably still need the package prefix (the namespace name or an abbreviation for it) for file names. -- OliverKamps
If and When important in the mindset of the comparison

If the book were a repository of the SaneSubset of a single, huge project written today... -- PhlIp

I'm sorry, but I'm not getting this. Could it or could it not advocate 'namespace'? -- ok

Preference for approaches based on PrefixNotation

Lakos talks about namespaces and prefers his prefix notation based on some emotional, gut-level, arguments. He just seems to feel that the code flows better if all the classes have a package prefix. I've tried it on one project and I didn't do much for me although I can see why it would appeal to someone else. Most of the advice in the book is purely pragmatic. This piece is not, at least it's not entirely. It's a useful way of managing the global namespace. Namespaces are another, equally useful, way of doing it. -- PhilGoodwin
Read also Martin's approach

BTW, it's good practice to put each namespace in a separate directory as suggested by RobertMartin. If you're going to read Lakos, you should probably also read Martin. RobertMartin's approach to things is complementary to Lakos's and his design principles fill in gaps in Lakos's work while standing on their own at any project scale. -- PhilGoodwin

Observation: This point is really not worth threading. There is nothing to "get".

Basis is experience on a successful, real-life project, UnderstandingRequired

Lakos based his book on experience writing a huge, real-life project, not on an "approach" or things he lectures to others. Purists need to understand the book, not treat it as a list of absolute requirements and then chafe on a yoke they put their own heads into. He could not "investigate" or "evangelize" namespaces because his primary goal was shipping a bunch of code on time. He just seems to feel, on a gut level, that getting paid to fulfill requirements is more important than sitting on a mountain ruminating. The book details exactly what he got working on all the C++ compilers available around 1994. This is a valuable resource because tracing how he arrived at his conclusions teaches you how to arrive at your own.

In fact, the "real-life" project the book draws its (only) experience from is the Mentor Graphics disaster. The code was way late, and nearly put Mentor Graphics out of business. The only compiler the book tested against were not "all the compilers of 1994" but Cfront 3.0 . This is all in the preface. I work with Lakos, and I can indeed attest that he seems to regard the book as a list of absolute requirements, so much so that decreed the STL "logically flawed" (because the containers operator== did not include all the visible properties of a container, like the allocator, a violation of one of the precepts). So he wanted to replace it. That was real productive. The ACE library equally did not come up to snuff, and placed in the lists of no-nos just like 'avoid unsigned int' (another precept of the book). His team began their effort to replace all these library with ones that followed the precepts - the rest of us just did the obvious and used the readily available libraries, whether or not they conformed. Of course, management eventually became tired of paying for this, and the Lakos team went on to build a code management tool - although the company is still laboring with the standard C++ Library un standard, to solve how operator== copes with allocators - arrgghh. However, The rest of us in the company actually interested in shipping product had to find our way around all the precepts and defenders of the faith. The "physical design" of the book ONLY concerns how to statically link in large number of libraries. So yes the methods in the book do offer a sound way to create huge monolithic applications. But that is the irony - huge monolithic applications are not scalable. -- LCD

Realize today is different and practice and theory produce different products

Folks, use namespaces. Use them in preference to RegisteredPackagePrefixes. If Lakos were here he'd tell you the same. But he can't go and write a book full of "theory", or "nice to have" ideals, if the book claims to target the ultimate in portability and short compile times, based on a real project, and if your or my favorite "theory" is not available on every single compiler. -- PhlIp
Disagreement over possible advocacy

No he wouldn't - he still (February, 2003) advocates RPPs over namespace without convincing rationale - much to the consternation of those of us who may be compelled to use it. -- Anonymous I'm sorry

The book actually says that RegisteredPackagePrefixes are better than namespaces in so many words. I don't know what Lakos would recommend today but at the time he authored the book he had considered both approaches and had a strong preference for Prefixes.

--begin-AndrewMarlow?-- He still says to use package prefixes. I have had this argument with him many times quite recently (January 2004). I have given up. He does not like namespaces with the one exception that a top-level namespace is permitted to avoid symbol name clashes in the global namespace. --end-AndrewMarlow?--
Cross Platform and Namespaces not included Well, I have to support code across a number of platforms and not all of them have namespaces.

Include file Structure

John mentions in the first chapter that you should have RedundantIncludeGuards... this practice has always bothered me.

It seems to me that the only time that the guards really come into play is when you have headers included within headers, and this seems unnecessary. (Or if you add one twice to the same file.)

Wouldn't it be better to have something like the following in the header only?

 #error	//or whatever stops the compile

Seeing as the redundant guard method is so commonly used (and therefor proven :), what am I missing? -- SvenDowideit

Back in the old days, a complier would open an #include file *from disk* every time it saw the #include directive mentioned. The redundant include guards of course prevented this. xlc6 is the only compiler left breathing that I know of that still reads from disk, rather than just keeping the whole file in memory. I used to use Redundant Include Guards, now I don't. --LanceDiduck?

In C++ it is usually necessary to mix interface and implementation within a header file (i.e. private declarations and inline methods). If a header file was pure interface, then it might be reasonable to require a client to include the declarations of the types that the interface depends on. But it is not reasonable to have a client include declarations for items used solely in the implementation of the class it uses. Therefore it is necessary to have include files including other include files, therefore it is prudent to guard all include files (to minimize effort I have editor macros set up to do it for me - using the directory and file names, it is easy to create skeletons for the classes). -- DaveWhipp

I wondered about this for a bit ... when you put an include into a header file you are forcing a client to include declarations used solely in the implementation of the class it is using - you are just using the pre-processor to do it, rather than forcing the client to be aware of the added dependencies. This is a place were you are obscuring a mess made because are choosing to make methods inline before you need to ... (oops that's Heretical, isn't it?). The way that I am heading is that either you want to hide the implementation for real (and therefor you make a pure interface class), or you realize that you are exposing the implementation to the client and so you make them aware of the mess they are inheriting (by making them put the includes needed into their implementation file) -- SvenDowideit (I hope that this can be refactored by someone who can communicate clearly :)

A way to visualize the mess is DoxyGen, which generates include graphs. The reason to have SelfContainedHeaders is that if you add one that is not, and later remove it, you are often left with totally useless include artifacts. They tend to assemble in heaps, too. Also, if I want to use some class interface, I'm not interested in the ImplementationDetail? of what it needs to implement itself. Finally, when that implementation changes, with "externalized" include dependencies, the chance is much higher that such a change breaks massive amounts of client code. -- JuergenHermann

The chance of continued development including many more definitions than necessary also goes up, and then your compiler mistakenly recompiles many more files than necessary (the guard blocks do not help here). From this, I think that for a Large project you are considerably better off forcing the user of the class to know what dependencies they are adding. Dis-allowing recurring #includes also encourages header files to be an interface containing as little implementation as possible. -- SvenDowideit

I'm not sure I understand - you have foo.h which uses string.h and bar.h which also uses string.h. Later. some program.cpp needs to use foo.h and bar.h. These are recurring includes.

Thanks :) (I'm very poor at starting examples :) I feel that program.cpp needs string.h too, so why put it into foo.h and bar.h? you may be saving typing, but in reality the dependencies exist either way, only if you don't allow #includes in header files they are explicit (and the client will probably think twice about using a bloated component..)

Imagine if the class defined in foo.h uses the string class only for private functions. Those private functions will require foo.h to include string.h. program.cpp may make no use of strings so it would seem odd to include string.h in 'program.cpp''' just because class foo needs strings.

If program.cpp needs foo.h and bar.h, it shouldn't need to know that they require string.h, or widget.h, or whatever. If foo.cpp starts with
 #include "foo.h"
then you know that foo.h doesn't have any implied include dependencies. program.cpp can
 #include "foo.h"
and know that it will compile. This is covered in the book. -- DirckBlaskey

This does lead to a problem: If program.cpp needs foo.h and bar.h, but foo.h already includes bar.h, your compile will work. If, at some point in the future, program.cpp stops requiring foo.h (so you take it out), all of the bar-related stuff will stop compiling. This is moderately confusing (but not enough to stop using this idiom). -- RogerLipscombe

It's a good idea to try to remember to include bar.h upfront, to avoid the mystery later, even if it compiles ok without it. No way to detect or enforce it, though.

Lakos enforces this through the rule that bar.cpp must include bar.h first.

Large-scale? How does one measure the size? (SystemSizeMetrics)

I have present knowledge of what Lakos eschews

I work with John Lakos. He still eschews namespaces, eschews exceptions, avoids templates.

Alternatives -- We should use, He didn't use, It Isn't, A Scalable method

Yes we should avoid "cyclic dependencies" and this has always been a well known rule. Cfront 1.0 shipped with "dependency checkers." However there are far far better ways of creating "large scale" software than the methods found in this book. This is why purists revile it - not because Lakos doesn't use the latest feature, but because the methods espoused simply do not scale. A better title of the book would be "C++ projects using Cfront 3 that have a lot of statically linked libraries." Basically, "large" embedded systems programming.
The problem with the book are the hasty generalizations. For instance, consider the "test" showing that "instance specific" allocators are preferable to "class-specific" allocators. What is really shown is a typical profiling homework question that shows that instance specific allocators have the best performance in the case shown. The author concludes that instance specific allocators are preferable in all cases - the real answer is that this method should be used when profiling warrants it, not as an overall strategy. How well do you think a program scales when I have to keep track not only of the object but the allocator it came from? Or that "three tiered package group" structures worked for Mentor Graphics IC CAD system, therefore it is the best method in general. Or that each .h/.c "component" have a tester associated with it, that is run during the build. That is a fine idea, until you have 1000's of "unit test" programs that must be compiled, linked and run for each clean build of each configuration. If my build wasn't slow before , NOW it is. There are few good tidbits here and there, but there are far better way to scale C++ systems then the overall methods presented. The book is good because the author is not lazy about detail and expository, and crafted a very fine volume. However as an intellectual work it is very wanting.

DeleteMe: DeleteAnonymousAccusations

<more ThreadMess, moved from JohnLakos page>

Readers beware, this is a great book, but many of the C++ idioms are hopelessly dated.

Hopelessly dated? Can someone with more C++ experience than I expand on this a bit? He purposely avoids some of the most recent C++ features, such as namespaces, giving good reasons why he makes that tradeoff. -- StevenNewton

Per the rambling circular debate written on the page LargeScaleCppSoftwareDesign, many of JL's specific recommendations, findings, and prejudices were obsolete long before they saw ink.

However, most of his recommendations (such as defeating package-level circular dependencies, or make #include "foo.h" the >first< line of foo.cpp) are valid.

But the most important part of the book are his experiments, not their results. Learn from them. Don't just use the specific recommendations to beat people over the head with (including John). -- PhlIp

CategoryBook, CategoryCpp, CategoryScaling

EditText of this page (last edited October 23, 2010) or FindPage with title or text search