Copy And Paste Programming

Name: CopyAndPasteProgramming

Problem: Reusing code with a minimum of effort.

Context: Trying to optimize reuse when maintaining an existing system.

Forces: Design-time reuse is a lot of work. If a system has already been implemented, the reusable assets are there for the picking. During maintenance, you often do not know the assumptions that a module is making, so changing it is risky. It is safer to add new code than to change old code. Restructuring existing code to be more reusable (called ReFactoring by Opdyke et al. [1]) is expensive, particularly if the system is large.

Supposed Solution: Minimize reuse costs by cloning code and modifying it to suit new contexts.

Resulting Context: Reuse is easy, though specific blocks of reusable code are duplicated across the source spectrum. Individual bug fixes or enhancements to the original source must be individually ported to all clones if and when they can be tracked down.

Design Rationale: Short-term pay-off is more easily justified than a boondoggle that promises long-term investment without any conclusive proof.

Appropriate Usage: If the copies of the code will be heavily modified, it can be better to have the separate copies so that each can be changed independently. Later CodeHarvesting eliminates redundancy after the finer points have been worked out. It can also be handy to have two variants of a piece of code when debugging one variant only, setting breakpoints in the variant of interest.

Synonyms: CloneAndModifyProgramming, SnarfAndBarfProgramming, ClipboardInheritance?, ClipboardCoding?, RapeAndPasteProgramming

-- JimCoplien, inspired by KyleBrown and AdeleGoldberg, and a few words by JeffGrigg


The name RapeAndPasteProgramming was originally a derogatory term coined by AdeleGoldberg to refer to a style of programming practiced by certain SmalltalkLanguage aficionados, specifically SamAdams but others probably fit, where they would quickly steal code from other parts of a system to build a rapid version of an application and then refactor it later. She preferred a "purer" top-down approach that didn't involve snipping bits of code from old demos, system browsers, etc. This wouldn't have been possible without (a) Smalltalk containing all of that source code and (b) Smalltalk making it easy to locate methods, cut them out and paste them into other classes. -- KyleBrown


As much as I hate to admit it, if you have n variations of a piece of code, CopyAndPasteProgramming is probably the fastest way to produce the n+1th variation (even for n=1).

DavidHooker's SevenPrinciplesOfSoftwareDevelopment is a good summary of why CopyAndPasteProgramming is bad. It doesn't violate the first principle, but it breaks the other six, especially the last: Think! (Probably the most frequently broken principle.-) -- PaulChisholm


CopyAndPasteProgramming makes code reviews more tedious. Have you ever been forced to read the same novel over and over again? How about a really bad novel?

If the pasted code works, it makes boring or misleading reading. If the original code has flaws, not only do you have to hunt down all the clones, but what if you miss some? It's like a re-occurring rash - Damn, I thought I got rid of that! - it keeps coming back when you think its gone for good. -- ToddCoram


CopyAndPasteProgramming is not an antipattern, it is just a pattern applied in the wrong context. Writing reusable code takes a lot of work, and is only worthwhile if you are going to use it enough times to make it pay off. The first or second time you use some code, CopyAndPasteProgramming is probably the right solution. After that, you have strong evidence that the code will be reused enough to make it worthwhile to make it reusable. If you reuse something 15 times using CopyAndPasteProgramming then there is something wrong, but it is the application of the pattern that is wrong, not the pattern. -- RalphJohnson

Sorry, Ralph. I find that hard to believe. I have seen many instances where one copy and paste introduces hard to track defects. It clearly violates the OnceAndOnlyOnce rule. Also, in real life, a single person does not work on all of the code and Programmer A is unaware that Programmer B has copied and pasted his code elsewhere. When A changes his code later, B's code is a stale copy of A's code and hence, probably buggy. -- SriramGopalan
I find that reusable code is also more resilient to changes in the environment,as well as the inevitable requirements drift that affects all systems that are successful enough to be actually used. Moreover, there is such a thing as degree of reusability, such that even code that is not fully generalized (i.e. that has very strong preconditions that are unlikely to be realized in other application domains or contexts) can still benefit from making the preconditions as weak as possible, while explicitly stating any that remain. The most important issue here is that code reuse entails bug reuse, which entails test state explosion. If you reuse via CopyAndPasteProgramming, you mush test for each new 'implementation. When bugs are found or requirements change, update propagation becomes an issue. In the end, the more reuse you get the higher the cost, until diminishing returns negate the benefit of reuse as well as the purported time savings gained from not generalizing, and just as the cost becomes unbearable enough to justify rectification, you will realize that the risk of change will also drive up the cost of repair. It is a false economy. This is definitely an AntiPattern given that is does look like a good idea at the time, while greyer heads will counsel otherwise. If that is not the defining attribute of an AntiPattern, I do not know what is. -- MarcGrundfest

"Writing reusable code takes a lot of work, and is only worthwhile if you are going to use it enough times to make it pay off." - I disagree. In 10 years of programming I have found that most of the reusable code I wrote took less time to implement, some directly, the rest by not requiring much debugging effort over the course of the project. I also note that by using these principles, on average I wrote significantly more (and tidier and well-commented) code than my counterparts who used cut and paste in the same time, most/all of it useful and beneficial to the end product and with fewer bugs and shorter test times. Basically - "there is no substitute for knowing what you are doing". The problem with "cut-and-paste" is not just a principle, but the fact that the people who want to do it often take the path because they are lazy or do not understand enough about programming, or do not have the skills to visualize processes required to write generic code (not understanding is why they think it will be harder to do, and assuming it takes longer usually comes from never having tried). It depends on your line of work, but if you write database applications, for example, anything up to 70% of the code is identical for every system you create, so the savings of generic code are enormous over time. If you have your wits about you, you can finish the first project at normal speed and the rest faster (or with a richer feature set developed using the time saved). As your generics become enhanced over time, your applications become richer. Eventually you produce 'off-the-shelf' quality in the same time as others produce ugly junk that barely works. Of course, some people work on wildly variant systems, but even so, if you are cutting and pasting, it pretty well proves you have something that should have been generic in the first place.

Pattern in a wrong context - isn't this the definition of anti-pattern? -- ScottVachalek

[Hence why it may be a GreyPattern.]

ThreeStrikesAndYouRefactor is a pretty good summary of what Ralph is talking about. -- DougWay

Perhaps, but it is not necessarily a good way to proceed because the assumption of refactoring is the assumption of "type first, think second, get it right later". Some people prefer "think now, type second, check it is right before moving on" because it saves a lot of time and makes the end result much more reliable. I've seen too many projects that are complete disasters because it was assumed that there would be time to refactor out the errors down the track. In reality, the problem being solved at the beginning by this assumption was "how do I avoid thinking?" and "how do I avoid typing?"


It is true that this "tool" has costs and benefits. The benefits comes immediately and costs are postponed. Someone I know who's very fond of CopyAndPasteProgramming is also heavily in credit card debt. Mere coincidence? ;) If you had reasonable plan concerning how long you wanted the code to be used or updated, then you could make a rational decision whether to use this methodology. But one uniform tendency of software is to exist longer than it is planned for. Indeed, it is uncertain whether the planning process concerning how long the software will last is even primarily rational in many cases. Indeed, the use of methodologies to improve code is often needed to reign-in the *psychological* dynamic towards trading the long-term for the short-term. This is made doubly hard considering that some code really will absolutely only be used once, some code likely will be used only once, and other code you are just fooling yourself if you think it will die. The problem is that allowing CopyAndPasteProgramming allows your misjudgment to suit something that is very comfortable for a moment.

All that taken into, consider that in any programming toolbox there are going to be a finite number of tools - some of these include ways to create programs quickly. Should CopyAndPasteProgramming be one of this finite number?

-- JoeSolbrig?


Allow yourself to replicate code 'once'. Add a comment that points to the other copy at either end. When (if) you come back to the either place and want to copy again, there's probably enough information now to find the right way to factor the three uses out. -- AndreasKrueger

Doesn't TwoIsAnImpossibleNumber apply here? -- TomRossen


I teach my programmers that if they CopyAndPaste even 'once', then they need to sit back a bit and re-think the problem; its time for some low-level redesign, at least. I've noticed that only after they gone through that a couple of times does the whole business of 'abstraction' become concrete to them. More than any book or abstract lecture, this one simple principle has caused most of my team to finally see the practicality of object-orientation. -- KevinLacobie


A less profound reason why CopyAndPasteProgramming is bad is that it can lead to XySymmetryBugs. -- DaveHarris


I recently worked with a small team who made heavy use of "copy-paste" programming, instead of call-level reuse. One of their members argued that copy-paste programming was better, because you can customize any of the copies to the particular customer needs for that module more easily. He said that call-level reuse can limit your ability to meet new and unique requirements that don't fit the "model" of the existing reusable code. Force-fitting to the model, or reworking the model, can be difficult and time-consuming.

I think he had a point about some rare cases. But most in business systems, I've found almost everything to be repeating patterns of simple rules. (Thus, call-level reuse makes 99% of the work much easier, and 1% of the work somewhat harder. A good trade-off, I think. ;-)

-- JeffGrigg

{I have to disagree that the patterns are predictable. Business rules often gain weird twists that tend to gum up frameworks. Many of them I concluded that I could not have foreseen in advanced. Generacy is difficult and takes work, skill, and often trial-and-error. See also: WarStories and AreBusinessAppsBoring. -- top}

It is really common for large projects to use "copy and paste" heavily. I think those programmers are paid by the number of LOC they write. They absolutely do not like small functions and for some reason unknown to me (they say it runs faster but I need proof of that) they think that spreading a design decision all over the place is THE right thing to do. Imagine how much "copy and paste" the result certainly looks like spaghetti. They even use GOTO...

Of course, when a terrible bug that must be fixed is spread over in 2000 different places, their hands shake before changing those lines (since there are no unit tests in place) and they of course decide not to fix it and just live with it.

The solution: Refactor mercilessly until no line is duplicated. It requires a lot of thinking and the first ideas aren't that good. You go slower but you learn, I recommend it for people who are learning OO. Even when you use that new method you created only twice, an added benefit is that you have a test for it. If you ever happen to use the method again, it is already tested!

If your project is in a hurry, allow copy and paste if it is faster, at least as much as it doesn't become hard to refactor it. -- GuillermoSchwarz


Earlier someone stated that premature generalization can be a mistake. DonRoberts and RalphJohnson expand on this in their patterns for evolving frameworks. But if one fears being smeared with the CopyAndPasteProgramming epithet, you will waste time trying to get it right before you know how.

There is always great fear of replicating bugs. Why not be more mellow, more brave, and see the eventual discovery of a duplicated bug as a good thing. A signal that you are now ready to begin refactoring replicated code that has evolved so little the same bug remains in each copy.

I understand that such an approach is impossible in a conventional software development environment, where no one will refactor after the fact, or no one will take the care to run down multiple copies of a bug once discovered. But maybe with a little extremism we can do more of CopyAndPasteProgramming. -- ScottJohnston


In a code review this morning, someone inadvertently made reference to something called "cuss and paste". That seems funny and appropriate to me on multiple levels ... -- GlennVanderburg


I have generally found this approach to be more prevalent among inexperienced developers. They usually stop when I point out that they are creating more testing work for themselves if they cut & paste rather than creating a subroutine (because we make them check both sides of every branch, etc.). Even inexperienced programmers prefer coding to testing.

My favorite instance was some incorrect COBOL code intended to handle leap years. The code had been cut and pasted into several programs, resulting in a significant clean-up effort. -- AlexStockdale

'Member that goofy y2k boondoggle? Imagine if all that date-processing code wasn't CopyAndPasteProgramming... one JustaProgrammer; five minutes; on to the real problems. Then again, for the less ethical, generous application of C&PP may well be the path to JobSecurity, especially if doubled up by running everything through a CodeObfuscator.


CopyAndPasteProgramming is LarryWall's example of FalseLaziness. -- RobertField


For a related, though more abstract discussion, see OnceAndOnlyOnce. -- AndreasKrueger

Basically, CopyAndPasteProgramming is diametrically opposed to OnceAndOnlyOnce.


See also NarrowTheInterface for one solution and CopyAndPasteTherapy for a more general discussion of (proposed) solutions.


I just got done with an assignment writing servlets in Java where my predecessor had made heavy use of CopyAndPaste. Several different servlets doing that same thing to different pieces of data all had the same routines database lookups and HTML writing. I didn't have the time or the support to rewrite everything, but I did abstract out quite a lot of things and created a superclass servlet to inherit from. I shudder to think how many places things will have to change if they make certain changes to the business logic they were discussing. -- StevenNewton


I just put a page on the emacs wiki for people to suggest tips to make programming easier with emacs http://www.emacswiki.org/cgi-bin/wiki.pl?ProgrammingEffectivelyWithEmacs.

The first suggestion was to use emacs macros to automate repetitive tasks. I found this to be a very nasty suggestion.

The first reason is that it makes highly used CopyAndPasteProgramming with customizations likely.

The second is that it elevates CopyAndPasteProgramming to the level of a process level technique.


It seems like environments that don't force people to see the whole text of what they are duplicating let people off the hook too easily (why not copy-and-paste... this tool makes it so damn easy!). So, I ask:

Does the use of non-file based development environments promote CopyAndPasteProgramming? By this, I mean development environments where you can copy-and-paste functions by drag-and-dropping subroutine icons from class to class.

-- EricRunquist


Waaayyy back at the top, PaulChisholm talked about the n + 1 issue. I thought that there was OnlyOneOriginalProgram ever written..... All others are imperfect CopyAndPasteProgramming (and similar ilk) copies.... -- HalSmith
Here's a little Java utility that may help find examples of this here CopyAndPasteProgramming - CpdTool -- TomCopeland
And here - http://www.devx.com/Java/Article/17947?trk=DXRSS_JAVA - is an article about Java utility that encourages copy and paste by making it easier to do. "when you implement a class that's similar to an existing one, you find yourself copying the old class source to the new file, editing the new source, and modifying class and member names as well as member types in several spots." Words fail. -- StevenNewton


I like to copy and paste! It reduces risk when quick changes are required in a production environment. Do it and don't feel ashamed. -- FirstEclipse

Quick changes in a production environment? Maybe you should feel ashamed.
I agree with FirstEclipse...I consider this Leverage Programming (LP). What in the world can help to create so much in so little a time without affecting one's ego to say "I cant do it in this much time". -- Niteen See Also: DuplicatedCode


I've found that CopyAndPaste works well in conjunction with RefactorMercilessly. First I copy and paste a bit of code. Then I get it working in its new environment. Finally I RefactorMercilessly to remove the duplicate code. At that point I know the exact requirements of the refactoring, so I can be sure to DoTheSimplestThingThatCouldPossiblyWork. -- RainerDeyke


This technique is OK if you're doing code reviews, but often times it takes the place of code review: "Hey, this database accessor written for Oracle 7 in 1998 as a SELECT * FROM BIG-HONKIN-TABLE is in production so it's OK to use it for my customer profile self-service app."


When I first noticed people doing it, I could almost see their lips moving as they pressed Ctrl-C followed by Ctrl-V. "Copy ... Paste ... Copy ... Paste ... Just one more to go... Copy ... Paste! Yay, I'm done!"


CopyAndPaste can be better, when you are copying good code, than reinventing the wheel. Having twenty versions of the same functionality is more of a nightmare. (You have to decide which one to cut and paste.) The only way to prevent this practice is to present all components as black boxes that cannot be changed just subclassed. However, God help you if the blackbox changes. This is where MicroSoft makes all it's money. -- MarkSpanglet


Note that this is sometimes referred to as CutAndPasteProgramming. However, Cut and paste is strictly not an accurate description of the process as it implies the code is deleted at the source, whereas in fact the technique is usually used to duplicate.

I often find myself doing what I would call CutAndPasteRefactoring?. That is to say, implementing the refactored version of something can often be done by moving some of the code from the old version into the new context and adapting it to that context.


CopyAndPasteProgramming is often used in large projects. Reasons for that in large projects: Believe me or not, working in large projects isn't that funny.

-- AndreasSchweikardt

Mitigating forces to the above include CollectiveCodeOwnership, VersionControl, and ContinuousIntegration.


Copy and paste is not inherently evil. A DuplicationRefactoringThreshold of around 3 or 4 is generally acceptable. Perfect factoring can make code tough to read because of all the indirection hopping. It sometimes can take longer to absorb the abstractions than it can to just read the implementation code. The "only have to change in one spot" may be overdone. It is sometimes easier to change in multiple spots than it is to find the single spot if there is a lot of indirection. Making easy-to-understand abstractions is a tough skill that few will master. Thus, it may be better to allow some duplication than to let them create tangled abstractions. This is based on my observations and experiments in factoring. At times I have overdone it and confused the daylights out of both myself and others. I could often not find an easy way to avoid the problems without a crystal ball. Duplication is the devil we know and tangled indirection is the unpredictable devil.


Curiously enough, I often find that CopyAndPasteProgramming is a valuable intermediate step when ReFactoring.

That is perfectly acceptable. It only becomes a problem if it stops being an intermediate step.
I think copy-and-paste programming has its place at a certain level: copying between separate codebases. To avoid it you need to introduce a SharedLibrary, which leads to BigDesignUpFront, versioning issues, and can easily lead to shared-library hell.

But within a project, you should RefactorMercilessly.

- BrianSlesinsky


The problem I keep encountering is having a poorly-documented code base, and lots of employee turnover. Since no-one has the time to understand the code, they copy-and-paste since that approach is fairly low risk. It takes perspective and time to understand the code to the extent needed to refactor. And in the organizations where I work (State Government, large corporations - the typical consulting client), unit tests are either non-existent or poorly organized, and the code isn't very modular either. Therefore, even if one sees a refactoring opportunity, testing the refactoring and untangling the tendrils of the old design are pretty difficult. Another inhibitor is the sense that "gee, if I start cleaning it up here, I'll have to clean up over there, and there, and there, and there, ..." - so rather than go down that path it's "do it this way for now, and we'll do the real cleanup later." -- MaxRahder

I think US companies have decided that they have to use heavy FutureDiscounting because changing direction fast is their comparative advantage, not efficiency and not planning. I used to mumble heavily about "stupid companies", but to some extent there is method in their madness I have to conclude. CopyAndPaste is FutureDiscounting galore. -- top


Copy and paste programming is just fine at times. For example, imagine a framework of Object-to-Relation Modelling based "Entity" objects, based on normalized tables. The method of loading objects from the database should be similar (for example, some sort of static CDBObject LoadFromDatabase?(int iIdentifier) method) but we want to have type safety in return values. Copy and paste programming is great here.

Before someone mentions templates, which are a better solution, some languages don't support templates at all (for example, all .NET Framework 1.1 languages).

It is not clear to me exactly what is being copied. Further, some dispute the practice of mirroring schemas with copy-cat classes.

A static function that does error checking (to see the identifier is valid in the DB, for example), creates an object and returns it in a TYPE-SAFE manner (not as an abstract base-class, but as the actual concrete class). Just an example.


I have heard copy-paste programming being defended by many - often mediocre - programmers over the years. It was always those programmers that were to blame for introducing sloppy code and avoidable bugs. They didn't take the time to understand what they were supposed to achieve, what the code they pasted in was trying to achieve, and what the differences between the two contexts were. It is also very easy to overlook a needed change to make the pasted code work for your context. The only acceptable use of copy paste is when you're starting on some new application, want to reuse older, proven code you have, without wanting to deal with setting up multi-project dependencies.

Are you talking about within an application or between applications? Intra-app factoring is much easier than inter-app factoring because often GenericBusinessFrameworkUnobtainable.


OO code doesn't just create re-usable code... it creates Usable code. If a code base is not Usable, it is not good.

You should never copy and paste program. Instead refactor the code you wish to draw from, then Use the object you have created where you need it.

Example?

I have become more rigorous in avoiding copy and paste programming over time. I now consider almost any copying within the same code base to be unacceptable. If I am tempted to copy code, then I've already (perhaps unconsciously) realized that there is some inherent similarity with an existing piece of code, and so the original code that I wanted to copy should be put into a form that can be reused instead. This also applies to manually copying (i.e. re-typing something that you see in another part of the code).

This helps distinguish inherent similarity from coincidental similarity. If I didn't actually need to look at another piece of code to write what I'm writing, then any similarity to existing code is probably coincidental. There is no reason to try to avoid the duplication, since there is no reason the two pieces of code will need to stay the same in the future.

Part of becoming rigorous involves learning various techniques that make it practical. For C++, these include

-- VaughnCato?


EditHint: Merge with CopyAndPaste? There appears to be duplicate coverage.
See: UsingToolsToAvoidCopyAndPasteProgramming, SwipedFromTheBestWithPride
An AntiPattern. (CategoryAntiPattern) Could be a GreyPattern.
CategoryDevelopmentAntiPattern

EditText of this page (last edited July 17, 2012) or FindPage with title or text search