The fundamental assumption of ContinuousIntegration is that there's only one interesting version, the current one. Multiple code lines make ContinuousIntegration more difficult, and can lead to IntegrationHell. This means "avoid version control branching". Thus, don't say "We'll make a new branch to add this hairy feature, and then merge it back in when it works". Instead, do everything incrementally so that you can add the hairy feature bit by bit, without needing to branch.
Patching old versions. Bug fixes may be required in previous versions but customers don't wish to use the latest version.
Don't release buggy software. Why don't customers always want the latest version? Perhaps some reasons can be removed. And then we can always ComponentizeTheSystem?
There are several misplaced assumptions in the above ...
- Myth#1: The fundamental assumption of ContinuousIntegration is that there's only one interesting version, the current one.
ContinuousIntegration is really specific to a codeline. It doesn't directly apply to the case of multiple codelines. The assumption is that the latest-and-greatest (LAG) version (for any given codeline) is the most up-to-date, and that the more in-sync I am with it, the smaller the delta (difference) between what I'm working on and what the latest stuff is. Therefore there is less disruption/rework for me when I re-sync with the LAG stuff, and less disruption/rework for others when I commit my changes to the codeline. It increases feedback frequency from the LAG to me and back to the LAG again.
- Myth#2: Multiple code lines make ContinuousIntegration more difficult, and can lead to IntegrationHell.
This too is not really correct, partly for reasons stated above (ContinuousIntegration is really specific to a codeline). And version branching for DualMaintenance is unlikely to lead to what IntegrationHell describes, because the IntegrationHell resulted from a completely different problem. Version-branching for DualMaintenance is an altogether different beast than the "big bang" integration that results in IntegrationHell.
- Myth#3: This means "avoid version control branching".
Now we are at the crux of the matter. The problem is not version control branching!
Version control branching is one (common) solution to the particular problem of having to support the maintenance and evolution of more than one version of the codebase at a time. This is typically called the DualMaintenance problem or the MultipleMaintenance problem. Dual/Multiple Maintenance is what you really want to avoid!
(And why you want to do everything incrementally with ContinuousIntegration as much as possible.)
No one that I know likes having to do DualMaintenance. It always adds more work and complexity. Branching (if done right, using a Mainline and other key ScmPatterns) is one way of managing and minimizing the additional overhead that DualMaintenance entails. But when you are genuinely faced with the DualMaintenance problem and can't make it go away - chances are that version control branching is going to be one of your best friends for solving this problem efficiently and effectively. -- BradAppleton
Extracted from ContinuousIntegration
For many projects, multiple codelines can actually make things much easier than forcing adherence to one and only one codeline. For other projects, it definitely hurts. Use one codeline if you can easily get away with it. It would be nice to have a universally applicable rule of thumb that works all, or even most, of the time; unfortunately that isn't the case here (but perhaps that could be argued for the specific nature of XP projects).
In many years, this reader has split codelines about twice, and regretted it. A single code line with conditional compilation works better.
Conditional compilation certainly works better than branching "done wrong". It typically does not work better than branching "done right". There are right ways and wrong ways to go about splitting - and many who try to "wing it" instead of first seeking existing best practice often regret their split attempts. Also, using more than one codeline is useful for a great many things that have little in common with conditional compilation. Most of them (but not all) are concerned with issues of parallel-development activity/releases rather than environmental and functional variations (keep in mind the difference between ParallelDevelopmentVersusConcurrentDevelopment?).
Conditional compilation and codeline-splitting are intended to solve different problems. Splitting codelines to accommodate platform variations is usually ill-advised; even conditional compilation isn't usually the best choice here (cf. Spencer & Collyer in "#ifdef Considered Harmful"). The recommended solution for this is separate compilation rather than conditional compilation: split the platform-specific code into separate compilation units or sub-units, much like Doug Schmidt's WrapperFacade?
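Here is a rough sketch of the separate-units idea, in Python rather than C for brevity. Each platform gets its own unit behind one shared interface, and the platform is chosen exactly once instead of at every call site. The names PathStyle, PosixPathStyle, WindowsPathStyle, path_style_for and config_file are made up for illustration.

```python
from typing import Protocol


class PathStyle(Protocol):
    """The one interface that platform-neutral code is written against."""

    def join(self, *parts: str) -> str: ...


class PosixPathStyle:
    """Unit holding only the POSIX-specific behaviour."""

    def join(self, *parts: str) -> str:
        return "/".join(parts)


class WindowsPathStyle:
    """Unit holding only the Windows-specific behaviour."""

    def join(self, *parts: str) -> str:
        return "\\".join(parts)


def path_style_for(platform: str) -> PathStyle:
    """The single place where the platform decision is made."""
    styles = {"posix": PosixPathStyle, "windows": WindowsPathStyle}
    return styles[platform]()


def config_file(style: PathStyle) -> str:
    """Platform-neutral code: no platform conditionals in here."""
    return style.join("etc", "app", "settings.ini")
```

The calling code never tests the platform itself, so adding a third platform means adding one new unit and one table entry, not touching every conditional.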
There are good reasons and bad reasons to split codelines; and once you do there are good ways and bad ways to do it. But the splitting itself is not intrinsically bad; it's splitting for the wrong reasons or in the wrong ways. There are in fact many good reasons and ways to go about it and many shops use them daily. Not only does it contribute to their success in these cases but it's often the lynchpin for it.
In many years, I have come across several hundred projects of all shapes and sizes. A great many of them rarely split codelines and/or regretted it when they did; an equally great many of them split codelines (often after trying everything to avoid it) and life was considerably better for them afterward, and they have continued this trend with marvelous success. Perhaps it's an acquired taste, and those who have bad initial experiences tend to discard it while those who have success with it do not. But the good reasons and practices for codeline-splitting not only recur, but abound with much regularity (and, unfortunately, so do the bad ones).
But what happens when it is imperative that an old version be restored for an essential and urgent maintenance release?
Why not just release the current version with the needed fix? It is usually a sales or marketing decision to patch an old version. Given the engineering costs associated with maintaining a branch and the goodwill generated by giving the customer an upgrade as well as a fix, you can make a good business case for this approach.
See the description of DualMaintenance for a list of several reasons why the above may be an unacceptable solution. It is certainly a good question to ask to challenge whether you really are in one of those unavoidable situations - so if you haven't asked it yet, you certainly should. And perhaps the answer should even be periodically revisited to see if the current relationship with the customer(s) would allow you to eliminate having to do DualMaintenance from that point onward.
It is certainly a good question to ask to challenge whether you really are in one of those unavoidable situations
Definitely good to ask. Groups should be asking early in the project development cycle. But I guess I've seen it the other way. People go through amazing effort to not have a branch. Usually it is people copying code in and out of someone's private work area, which is just amazingly crude to me.
This problem has been around for a very long time, and there are systems that are very good at dealing with it. Using only one codeline is a very lazy and inadequate way of dealing with the problem.
My experience is that branching in VSS, i.e. maintaining two code versions, roughly doubles the maintenance work. In one company I halved the number of PROMs being blown by finding a way to make the debug build of the code and the release version one and the same (with almost no run time overhead). In another company I am currently in the process of merging a protocol converter with the protocol converter tester (which drives the code under test but is derived from the same original code). The tester had got about a year and a half out of date, with the result that it was virtually useless. I agree with the reasons given earlier on the page for forking paths (or at least being able to roll back to an older version), but the reasons have to be strong, like the ones given, to justify the additional effort. It is easy to do at first, like cut and paste programming, but it bites back later.
Two codelines, double the work.
It is not the additional codeline that doubles the work!
It is the work that is done to solve the classic DualMaintenance problem of having to maintain more than one historical configuration. Whether you solve this with branching, or conditional compilation, or some other means - unless you do something else to make the problem go away entirely, then the solution will always be more work than having to maintain only one historical path.
Branching often undeservedly gets such a bad rap that the following is worth repeating:
- Branching is not the problem! DualMaintenance is the problem!
Branching is merely one of the better solutions to the DualMaintenance problem. DualMaintenance is more work no matter how you solve it. Having to do DualMaintenance can double the work. Solving it with branching can decrease that work significantly, typically more so than other solution alternatives.
Using conditional compilation as a DualMaintenance solution (as suggested above) can in fact be shown to be equivalent to doing the same kind of 'delta encoding' that version control tools do when version-branching. So while it may seem simpler, it's really reinventing the wheel and creating a lot more opportunities for error and a lot more condition points in the code to add complexity to every change and merge thereafter (whereas branching separates the problem so that the extra complexity and effort is only incurred for the legacy changes that must be propagated to the more recent codeline, but is somewhat more effort when you have to do it).
So in fact conditional compilation can be more work than branching (and if you do it with runtime control flow switches, it can add even more conditional combinations of code to test). The determining factor is how much activity is going to be on the legacy maintenance codeline as compared to the latest-and-greatest development codeline. It comes down to: (cost-of-occurrence * likelihood-of-occurrence) versus (cost-of-prevention * 100%).
If change activity would be equally loaded on the two codelines, then using one version branch (codeline) with lots of conditional-branches in the control-flow of the code will likely be less effort and less complexity and less redundancy. If change activity on the legacy codeline will be significantly less frequent (and/or smaller in scope) than on the latest-and-greatest codeline (say, less than 50% as active/far-reaching), then version-branching of the codeline will end up being less effort and less complexity than conditional-branching of the code.
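The tradeoff can be put into a toy model, sketched here in Python. The model and its unit costs are assumptions made up for illustration, not measurements: branching is charged a merge cost only for each legacy change that must be propagated, while conditional code is charged a smaller overhead on every change to either version.

```python
def branching_cost(legacy_changes: int, merge_cost: float) -> float:
    """Extra effort with a version branch: each legacy change is merged once."""
    return legacy_changes * merge_cost


def conditional_cost(legacy_changes: int, mainline_changes: int, overhead: float) -> float:
    """Extra effort with conditionals: every change works around them."""
    return (legacy_changes + mainline_changes) * overhead


def cheaper_approach(
    legacy_changes: int,
    mainline_changes: int,
    merge_cost: float,
    overhead: float,
) -> str:
    """Return "branch" when branching is strictly cheaper, else "conditional"."""
    branch = branching_cost(legacy_changes, merge_cost)
    conditional = conditional_cost(legacy_changes, mainline_changes, overhead)
    return "branch" if branch < conditional else "conditional"


# Hypothetical unit costs: a merge costs 3 units, conditional overhead 1 unit.
quiet_legacy = cheaper_approach(10, 100, merge_cost=3.0, overhead=1.0)
busy_legacy = cheaper_approach(100, 100, merge_cost=3.0, overhead=1.0)
```

With these invented numbers a quiet legacy line (10 changes against 100) favours the branch, and an equally busy one favours conditionals. Where the crossover actually falls depends on your own merge and overhead costs.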
BUT - you have to do your branching and merging in a coherent and consistent fashion in order to be effective. If you don't really know what you're doing when you split off a codeline and don't make proper use of a mainline to control the depth and breadth of your branches and minimize the complexity of merging/propagating changes, then you can make things worse instead of better. And this is the situation that many run into, and why there is such an intense fear/dislike/frustration with branching. It happens as a result of OnceBittenTwiceShy? "syndrome", which seems to close one's mind against things that would appear to repeat the failed experience (it wasn't necessarily the idea or approach that was wrong, it might have been how it was applied/executed, but it's easier to blame the thing than ourselves for doing the thing badly).
So many are more comfortable with code-branching solutions rather than codeline-branching solutions because they are more comfortable with code rather than (multiple) codelines and prefer it - even when it would be more work. It's easier to stick with what you know than to learn something new and accept that previous failed attempts may have been a result of "doing the thing wrong" rather than of "doing the wrong thing."
All that said, branching in VSS is not the best idea. VSS doesn't have very good support for branching. The VSS tech notes from MS even say that VSS doesn't have great support for such parallel development and that such branching is not recommended. CVS has decent support, but it's still more work with CVS than it ought to be. For much better support for branching that makes branches "lighter weight" to use and less complex when merging and updating, take a look at PerforceVersionControl, or SubVersion, which implement branching with workspaces (sandboxes) that are first-class repositories in their own right.
So the best solution is to do what you can to avoid having to face the DualMaintenance problem (as mentioned above). But if you do have to face it, then branching can be (and often is) a far better solution to the problem than conditional compilation (provided you do it properly :)
Programming Rule #42. RefactorProjects?.
is a nice theory, but thankfully engineers don't run the company. There's more economic value in not automatically upgrading your end customer so you can upsell them. Often (usually?) each customer has a custom solution, tweaked just enough that it doesn't translate to other customers. This is why you can sell support contracts that pay for such custom maintenance. The caveat is that you have to patch each branch when the changes hit common components.
However, an alternate solution is to componentize the system so that the latest core components, which are usually the most frequently fixed, can be shipped without significantly upgrading the end user. This is what happens with many Windows system DLLs for instance. Thus, the system Whizbang 1.0.3 sold to Joe's Garage may have version 3.0.7 of the database access component, and version 2.1.2 of the networking component, as well as 1.0.3 of the frontend and 1.1.2 of Joe's Garage's Special Oil Checking Component. Nonetheless, you can fix bugs in 1.0.3 by recomposing it of database access 3.2.1 and networking 4.1.7 without losing too much economic value from the upsell to 2.0.1.
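The recomposition can be sketched as a version manifest, here in Python. The version numbers are the ones from the example above; the component keys and the recompose helper are made up for illustration.

```python
from typing import Dict

# Whizbang 1.0.3 as sold to Joe's Garage.
whizbang_1_0_3: Dict[str, str] = {
    "database": "3.0.7",
    "networking": "2.1.2",
    "frontend": "1.0.3",
    "oil_checking": "1.1.2",
}


def recompose(manifest: Dict[str, str], upgrades: Dict[str, str]) -> Dict[str, str]:
    """Return a new manifest with only the named components upgraded.

    Upgrading a component the system does not contain is treated as a
    mistake, so the customer never silently gains new functionality.
    """
    unknown = set(upgrades) - set(manifest)
    if unknown:
        raise KeyError(f"not part of this system: {sorted(unknown)}")
    return {**manifest, **upgrades}


# The bug fix release: newer core components, same frontend and custom part.
patched = recompose(whizbang_1_0_3, {"database": "3.2.1", "networking": "4.1.7"})
```

The customer still has frontend 1.0.3 and their custom component; only the shared core moved forward.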
Naturally you will have integration problems, but this is why you need regression tests. Of course, you have them anyway, so this isn't a major problem. As an added benefit, you can build new systems out of the components without much trouble.
By the way, this is why applications still ship as a glob of DLLs even though you can no longer share DLLs between applications. It's still nice to be able to upgrade a whole section of code without having to use the patch utility.
I like this name for this pattern. We tried to write it up under the name "Shared Source Escalation" for PLOP'99 but your name and description looks much better to me!
Lots of embedded systems apps have the pattern of multiple functionalities in a single code set, activated by the presence/absence of hardware or other triggering mechanisms. I designed some 1/16 DIN process automation instruments a decade or so ago that used the MCS-51 (8051, 8052) as a core. The startup would look to certain ports as inputs and see if a particular value was present; if so, certain hardware was present and the instrument took on the personality of a This Widget. Otherwise, it became a That Widget. The personalities involved programming certain I/O ports as either inputs or outputs, so the CPU had to know what it was hooked up to.
There are many other ways to do this in hardware applications and even more ways to do this in software. The point is that it is not necessary to have mutually exclusive code even when you have mutually exclusive options for the app.
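A rough sketch of that startup probe, in Python rather than 8051 code. The port number, the identifying value, the port directions and every name here are hypothetical; the real instruments are not described in that detail.

```python
from typing import Callable, Dict, Tuple

PROBE_PORT = 1         # hypothetical port read at startup
THIS_WIDGET_ID = 0x5A  # hypothetical value that marks This Widget hardware

# Hypothetical I/O direction per port for each personality.
PORT_DIRECTIONS: Dict[str, Dict[int, str]] = {
    "this_widget": {2: "input", 3: "output"},
    "that_widget": {2: "output", 3: "output"},
}


def detect_personality(read_port: Callable[[int], int]) -> str:
    """Pick a personality from what the probe port reports."""
    if read_port(PROBE_PORT) == THIS_WIDGET_ID:
        return "this_widget"
    return "that_widget"


def configure(read_port: Callable[[int], int]) -> Tuple[str, Dict[int, str]]:
    """Run the probe once and return the personality with its port setup."""
    personality = detect_personality(read_port)
    return personality, PORT_DIRECTIONS[personality]
```

Both personalities live in the same code set; the hardware decides at startup which one runs.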
I'm not sure how this can be done in a business environment. When I release version 1.0 of my app, I ship it with bugs. I don't mean to, but there are invariably bugs that I can't find that the customer does.
The way that I handle this is to immediately produce a bugfix branch to 1.0. On the main line, I add the 2.0 functionality. When an emergency bug report comes in, I make the fix on the bugfix branch, merge the branch to the mainline, and release the latest rev on the bugfix branch as a bug fix patch.
If I UseOneCodeLine, I would be forced to do the fix on the main line, and then release the main line, with possibly a third of the 2.0 functionality. Even ExtremeProgramming doesn't guarantee that that piece of functionality will make sense to the customer.
How do you do bug fixes between major releases if you UseOneCodeLine?
I suspect that UseOneCodeLine can only be approached but not necessarily realized in some project settings. In this case, I'd look at trying to come up with some way to catch more of the "customer bugs" before major releases. I've heard of using special beta customers that get releases early to provide feedback. The NetBeans project recently introduced Q-Builds, which are a step above the NightlyBuild in that they have to go through a partial QA process. The spirit of UseOneCodeLine is that the more code lines, the worse; the fewer code lines, the better. So do whatever you can to approach one code line.
It seems that we are losing sight of the scope of this pattern; it is a ContinuousIntegration pattern, and so it's an ExtremeProgramming thing. In XP, there are no major releases which need maintenance releases, and there are no big new features which are added in one go. In XP, UseOneCodeLine is a no-brainer - it is simply impossible to do ContinuousIntegration
Can you provide the evidence that "In XP, there are no major releases which need maintenance releases"? I'm sure it is a goal. I'm less convinced that it is always (or even usually) the reality. I think DualMaintenance describes several possible reasons why, and they recur commonly in reality.
UseOneCodeLine does not mean that your program in the PerlLanguage should contain fewer than 2 lines of code.
Although of course it probably should.
What does the number of codelines have to do with continuous integration? You submit on a codeline. Your tools should be able to build that code line and every other code line that is used. Sounds like you could use better tools.
It seems that the original poster means "one line" to mean "one integration path". I.e., you don't have a "Stable" branch and a "Current" branch, with maintenance on both.
Am I right that version branching is just a massive CodeDuplication?
Not really. Lots of version control tools appear to support the idea that it is, because it is easier for many to visualize it that way (structurally) than to visualize the timeline and tree of changes. But in fact version-branching is almost the opposite of code-duplication.
Conceptually, it may seem like I am cutting and pasting the entire codebase to make a copy that can vary independently. But the way version branching works, it is not that way at all - in fact it is the opposite in the sense that you don't cut+paste+copy, you only add/create what needs to be different and reuse the rest. It is refactoring code over time instead of over space. The duplication is perceived because I may need copies of the codebase to build and may have to build twice - but if I use conditional compilation I have to do that anyway. I might need an extra sandbox - but that is not the same as code duplication.
The only real duplication happens when I have to merge a change I already made in a legacy codeline to the latest-and-greatest one. That's not massive, it's small bits and pieces. That's the tradeoff version-branching makes: that it will be less effort to make significantly fewer changes to a legacy version and then have to check them in twice (once in the legacy line, then again when merging to the mainline), than to have to incorporate every change so that it works properly (or simply isn't executed) in a single codeline that becomes more complex to change, merge, build, and test/debug as a result of having to do this for every change made. -- BradAppleton
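The "only add what needs to be different and reuse the rest" idea can be shown with a toy model in Python. This is an illustration only and not how any particular version control tool stores branches; the file names and contents are made up.

```python
from collections import ChainMap

# The release the legacy branch was split from (file name -> content).
release_1_0 = {
    "parser.c": "parser 1.0",
    "net.c": "net 1.0",
    "ui.c": "ui 1.0",
}

# The branch records only the files that differ from the release.
branch_changes = {"net.c": "net 1.0 + hotfix"}

# Lookups try the branch's own changes first, then fall back to the release.
legacy_branch = ChainMap(branch_changes, release_1_0)

files_visible_on_branch = len(legacy_branch)   # every file can be read
files_stored_for_branch = len(branch_changes)  # only the difference is stored
```

The branch sees all three files but holds a private copy of one. The sandbox you check out to build it is a full copy, which is probably where the impression of duplication comes from.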
The great dislike of multiple code lines is interesting. I think it must be tool related. It really hasn't been difficult. We use PerforceVersionControl and it's very easy to branch and merge changes into the feature branches and then merge changes back to the main code line. What's the big deal? -- Unknown
I also think this overdeveloped branching fear is tool-related. The most common VersionControl tools used today are CVS and VSS and neither has very good branching support. These tools are common because they are freely/widely available (either free or already part of the development environment). And when someone gets bitten badly by a branching pitfall, there is a tendency to swear off branching altogether rather than learn to do it better. But more and more tools have better and better branching support, and SubVersion (the CVS replacement) is almost at version 1.0 - so maybe things will change.
Until then, the folks who use ClearCase, CMSynergy, SpectrumSCM, MKS, StarTeam, and other tools know how easy it can be. But most of the VSS and CVS world does not. It is a scary place to venture from if you have had the branching-fear instilled within you, and it is very hard to open one's mind up to this other world (and tools) where branching is extremely easy and hassle-free.
Going back from maintenance branching to the topic at the top of the page:
don't say "We'll make a new branch to add this hairy feature, and then merge it back in when it works". Instead, do everything incrementally so that you can add the hairy feature bit by bit, without needing to branch.
This doesn't mean that if a feature's going to take longer than a single release cycle to develop (or at least, longer than the next one), and/or you have multiple development groups developing at different rates, you should solve it by rolling it all into one trunk line with lots of conditionals all through the code:
// combined stuff
// my new stuff
// bob's new stuff
// old stuff
Instead, it means that you should split it up more into tiny bits that don't have so many dependency problems and are okay to release before it's all done. In other words, break the hairy feature up into lots of smooth features before you start on it. What to do when that's not possible?