Yagni And Databases

Do database concepts and practices conflict with YAGNI (YouArentGonnaNeedIt)?

This topic currently has two main areas of discussion:
Moved from YouArentGonnaNeedIt

Isn't one of the goals of software design, to make the software malleable in the face of anticipated changes? Is it good to decide the structure of the software without thinking about how it might change?

Let me give an example, although slightly contrived.

Say we need a database, and we are planning to use the RDBMS from vendor A. We might expect that in the future we will want to use a different database. So we design the software to insulate the rest of the application from the database dependency. Does this contradict the YAGNI maxim? -- SureshVv?

Addressed at DatabaseVendorLock.

Yes, it does contradict the maxim. XPers would build the code to be well-factored, but fitted precisely to the DB connection we had. Consider the possible futures:

Further, the following is always true: we don't switch to all possible different databases. Any generalization installed for a database to which we do not switch is wasted forever.

Now let me emphasize that we are NOT talking about building flimsy or poorly-factored code. The DB will be properly isolated from the bulk of the system, not for future-oriented reasons, but because proper factoring (OnceAndOnlyOnce) will make it come out that way.

So the changes will be isolated to the original classes. Worst case, they'd be rewritten. Realistically, since one RDB is much like another (thanks to DrCodd), the changes will likely not be major. -- RonJeffries
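
As a rough illustration of "well-factored, but fitted precisely to the DB connection we had", here is a minimal sketch assuming Python and its standard sqlite3 module. The CustomerStore class, the customer table, and the greeting function are hypothetical names invented for this example, not anything from the discussion above. There is no vendor-abstraction layer; the SQL simply ends up in one class because nothing else is allowed to duplicate it.

```python
import sqlite3


class CustomerStore:
    """Hypothetical class: every piece of customer SQL lives here and nowhere else."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS customer (id INTEGER PRIMARY KEY, name TEXT)"
        )

    def add(self, name):
        cur = self.conn.execute("INSERT INTO customer (name) VALUES (?)", (name,))
        self.conn.commit()
        return cur.lastrowid

    def name_of(self, customer_id):
        row = self.conn.execute(
            "SELECT name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


def greeting(store, customer_id):
    # Application code talks to CustomerStore and never sees SQL.
    return "Hello, " + store.name_of(customer_id)


store = CustomerStore(sqlite3.connect(":memory:"))
new_id = store.add("Alice")
print(greeting(store, new_id))
```

If the vendor changed, the rewrite would be confined to CustomerStore; greeting and anything like it would not need to move.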

Unfortunately, vendors don't seem too eager to stick to a standard. Part of the problem is that SQL is a more complex language than it needs to be. (SqlFlaws).

Added reference to DoTheSimplestThingThatCouldPossiblyWork, which is part of making YAGNI work as well as is OAOO. -- RonJeffries

From ReinventingTheDatabaseInApplication:

"Well, it is a statistical issue. If say 80 percent of all apps in a given shop will eventually end up needing at least 2 of the referenced [nine] database features, then the total cost will likely be less than keep reinventing them one-by-one until you realize you need to gut the whole app to use a database instead."

My experience is that there are certain things that keep popping up as a future requirement. Databases don't have to be big and bulky (NimbleDatabase), so the "cost" of using them is not (or does not have to be) high if by chance you really don't need one, and if designed well, make things simpler than code-centric solutions IMO. You can query and browse your "state" for testing purposes, for example. On the flip side, writing DB-like operations from scratch for each new DB-like feature later needed is costly.

We can use past experience to guide present development. You don't have to start over each time from scratch. That is not proper use of the ol' noggin.

(Comparisons of YAGNI to financial analysis can be found in DecisionMathAndYagni.)

When you build a building, you generally plan for wind, rain, and to some extent earthquakes even though they are not present at the time of building. Why? Because experience has taught us that those will fairly likely be needed, and that the trouble caused by not having those features is heavy. The features of databases are generally like this.

This is a bogus analogy. When you build a building you plan for wind, rain, earthquakes etc... according to the known characteristics of the region in which you are building. It's part of the spec that a building should be able to withstand the local climate so far as it is known.

But there is a reason it is in the specs: because YagNi would mean a flat house upon the first wind storm. Ad-hoc query storms and multiple viewpoint storms are common in business applications also.

No, that's not what YAGNI means. If something is in the specs, you are going to need it.

So, if the government did not dictate housing laws, you would build crap that falls over just because the client has no experience with weather-proofing?

Why do you think the government has to set building codes? Before building codes, big chunks of London burned down every few years. What does that have to do with YAGNI? If it's in the spec, you need it. YAGNI applies to things that aren't in the spec.

Databases are like gearing up for events that are fairly likely to happen based on past experience, just like weather extremes that pop up every few years. Just because the government does not dictate them is no reason to screw over the naive customer. If the customer does not specify a language, do you use assembly because higher-level languages are not a requirement?

Yes, if you need a database, use a database. If you don't need a database, YAGNI.

Do you "need" a high-level language? If not, then wouldn't YagNi dictate assembler?

I've needed many different languages, including assembly language. YAGNI doesn't dictate language choice.

{Then why should YagNi dictate not to use a database? Languages are tools, and YagNi does not dictate languages and therefore probably does not dictate tools and a database is a tool. Thus, perhaps YagNi does not apply to the decision to use a database or not.}

YAGNI doesn't dictate not to use a database. If you need a database, use a database.

How about we start an application. I write in perl. You write in smalltalk. Someone else writes in python. I write on windows. You write on linux. Someone else writes on a mac. Someone else makes up their own language and their own OS. Given XP this isn't an issue because we don't know there's an issue until there's an issue. If this strikes you as absurd, it is no more absurd than not using experience to find other potential issues. If you can decide on an OS, language, and tool set before development, why can't you decide on other things that strike you as making sense?

[Because of experience. :)]

Another analogy

I liken it to building a motorized bicycle on your own because your original requirements only require going to the local market to pick up a few small things. However, you find that when it rains you have to put a hood over the bicycle. Then you realize when it rains heavier that the tires get stuck in the town mud, so you put thicker tires on it. Then you later need to add lights to drive at night. Then you have a child and need to carry it. At first the child is small enough to put in a back-pack, but after a few years you need a separate seat. Then you need to carry loads of food and diapers. One day it dawns on you that you should have bought a car rather than incrementally kept adding to your bicycle.

The features of databases are the same way in my opinion. If you add even half the features incrementally, you will have a bigger mess and cost than if you just simply used a database to begin with. Perhaps we need more NimbleDatabases on the market if initial efficiency is your concern.

I use a kind of DecisionMathAndYagni. The probability of needing at least about half of the 9 or so database features (DatabaseDefinition) is pretty strong. Building those 5 or so incrementally would be costlier than just using a database up front. Perhaps I cannot predict which of the 9 features will be needed, but I can say with confidence after years of observation that on average each app will need at least 4 or 5 of them after 7 years or so. (This repeats some stuff already said above. I need to refactor one of these days.)
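
The arithmetic behind this kind of bet can be sketched as follows. This is only an illustration: it assumes each feature is needed independently with the same probability, and the probability and cost figures are placeholders picked for the example, not measurements from any shop.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(at least k of n features are needed), assuming each is needed
    independently with probability p (a simplifying assumption)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_FEATURES = 9             # "9 or so" database features, from the text
P_NEEDED = 0.5             # assumed chance a given feature is needed within ~7 years
COST_PER_HANDROLLED = 10   # assumed cost units to hand-build one feature
COST_DATABASE = 25         # assumed up-front cost of simply using a database

chance_of_four_plus = prob_at_least(4, N_FEATURES, P_NEEDED)
expected_handrolled = N_FEATURES * P_NEEDED * COST_PER_HANDROLLED

print("P(need at least 4 of 9):", round(chance_of_four_plus, 3))
print("expected hand-rolled cost:", expected_handrolled, "vs database:", COST_DATABASE)
```

Change P_NEEDED or the two cost figures and the conclusion can flip, which is more or less what the two sides of this page are arguing about.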

Further, databases are more standardized than incremental, roll-your-own solutions. I don't have to learn new interfaces and/or APIs at every shop to handle all those things. And roll-your-own solutions generally don't offer relational techniques, but the inferior (IMO) NavigationalDatabase.

It seems like the hands-down best bet to me. Yagni is just plain flawed in some areas.

I don't think proponents of YouArentGonnaNeedIt use super-low-level tools like assembler and machine language or raw RTE bytecode. They use higher level languages and tools. Databases are another such high level tool. Thus, you are not truly starting from scratch, but using past lessons by using higher-level tools to start with.

When you need it.

But then you have to retrofit.

That assumes the project went from not needing it to needing it. It's usually obvious which software needs a database and which doesn't.

Some things cannot be incrementally achieved without back-tracking.

Apply OnceAndOnlyOnce. If the code suddenly needs to use a database it only changes in one place. That keeps the cost of back tracking low.

Why not start with assembler language and organically factor your way up to a higher-level language then?

I'm not saying don't start with a database if you know you'll need a database. I'm saying use it when you need it.

Like I said many times, a good portion of the 9 or so database features will eventually be needed on most projects I encounter. If I started out with arrays and lists of class pointers, then I would have to remove all that stuff and shift those features to the database, overhauling a lot of code. Plus, it results in a paradigm mismatch because relational and navigational techniques are too different to be swapped out one-for-one. It would be like gradually trying to change text from English to Chinese by swapping words out one-by-one. It does not work because the boundaries and context etc. are all so different between the languages. You cannot just swap words.

Is someone asking you not to start with a database? I'm not. If you write apps that need fast ad hoc queries, start with a database.

If you know about them ahead of time, they are not "ad hoc" :-)

Sometimes you know you'll need fast queries but you don't know which ones.

But there are very few times where you are certain you will NOT need them, except for maybe special niches.

My experience does not support that. Most of the projects I've worked on never needed fast ad hoc queries. I think you are generalizing your experience to the entire industry.

If one is depending on experience of a domain, then one is generally ignoring YagNi anyhow. If 3/4 of all projects in a shop have needed ad hoc queries and other DB stuff, pure YagNi would dictate ignoring that pattern if no such need can currently be identified for the current project.

Experience trumps YAGNI. [ This is a bizarre statement. Experience trumps experience? YagNi is just a heuristic. It should be taken as far as it needs to be taken. It isn't meant to be used like a sword, lopping off chances for new technology. It's meant to keep developers on task when they're in the trenches, a place where you can lose sight of your overall goals quite easily. ]

Tell that to XP fans. But, it would be nice to identify which domains can allegedly make effective use of databases and which can't, and isolate the differences.

List Of Pointers Example

A possible example to explore is in RefactoringIsNotRelational. The example uses a list of items or references to items. When an item's status changes (such as from "waiting" to "processed"), the implementation moves the pointer from one list (say "waiting list") to the other ("processed list"). A Yagni approach may start out like this. After all, it tends to model the real world of a stack of papers being passed to different booths. However, to later move this to a database so that we can share the status information with other apps/tools would be more work than using a database or database-like API from the start. The database approach would change a status field of a record, not move record pointers from list A to list B.
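
A minimal sketch of the two shapes being compared, assuming Python and its standard sqlite3 module. The item names, the item table, and the helper functions are invented for illustration and are not taken from RefactoringIsNotRelational.

```python
import sqlite3

# Navigational version: an item's status is implied by which list holds it.
waiting = ["order-1", "order-2"]
processed = []


def process_in_lists(item):
    waiting.remove(item)
    processed.append(item)


# Database version: status is a column, so a status change is one UPDATE.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name TEXT PRIMARY KEY, status TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO item (name, status) VALUES (?, 'waiting')",
    [("order-1",), ("order-2",)],
)


def process_in_db(name):
    conn.execute("UPDATE item SET status = 'processed' WHERE name = ?", (name,))
    conn.commit()


def names_with_status(status):
    rows = conn.execute(
        "SELECT name FROM item WHERE status = ? ORDER BY name", (status,)
    )
    return [r[0] for r in rows]


process_in_lists("order-1")
process_in_db("order-1")
print(waiting, processed)
print(names_with_status("waiting"), names_with_status("processed"))
```

Callers of the list version reach into waiting and processed directly, so moving to the table later means touching each of them; that is the retrofit cost the paragraph above is pointing at.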

People take YAGNI way too far. YAGNI is a guiding principle against gold-plating. But taken to the extreme YAGNI could then apply to ANYTHING: "we don't need a computer for this project, it can be done on paper!" It's a rationalization. People take these principles way too literally. There is no such thing as a universally applicable principle devoid of context.

YAGNI only applies to 1) "Requirements" that aren't spelled out by the customer (GoldPlating); 2) Requirements that the customer wants but aren't specified for this iteration (violation of PlanningGame). If the customer specifies that they want it done on a computer (obviously, otherwise why did they call you in the first place?), then doing so isn't YAGNI. Of course, there are times when the customer really doesn't need what they're asking for (CustomerGoldPlating??) so pointing that out is probably in everybody's best interest--so if they can do it on paper then maybe that's the ideal solution after all. --JasonBucata

Concerning the comment that YAGNI implies using assembly until the customer demands a HighLevelLanguage: Somebody might make the same argument that DoTheSimplestThingThatCouldPossiblyWork and SimpleTools? imply the same thing. However, what DoTheSimplestThingThatCouldPossiblyWork (and related PerlVirtueLaziness?), would actually prescribe is that they buy the whole application off the shelf, and send everybody home. Only if that can't possibly work do you delve into writing code--and so you'd look into code generators, report writers, etc. to do the code writing for you. Of course there starts to be interesting considerations here, whether you should buy such tools, write them yourselves, or skip them and do the low-level stuff yourself after all--IOW, actual thought is required (see WhatIsSimplest and SimplestOrEasiest). YAGNI is either not at all applicable, or is balanced out by other XP practices so that (hopefully) the RightThing happens. --JasonBucata

The bottom line is that SoftwareDevelopmentIsGambling. Requirements for implementation tools are generally not binary, but a list of desired goals (hopefully) with various rankings given to them.

I don't think YagNi applies.

From the YagNi page:
    Don't build it until you actually need a given feature

You don't build a database like Oracle, you use it, so by the above definition the YagNi argument does not apply. As implied by above posts, the 'build' definition is more reasonable than a 'use' definition, otherwise you always end up with MachineCode. -- JaredSulem

I think the implication is that using a database at the start may not be DoTheSimplestThingThatCouldPossiblyWork. For example, using maps and lists instead of databases may allegedly be the proper YagNi approach because it may (initially) be simpler. Plus using a database may involve extra installation steps. To some extent I agree with this. However, revamping maps and lists into database calls when a database is later needed can be a fair amount of effort. I suppose one could wrap such calls into functions or methods to prepare for later data structure overhauls, but that also violates YagNi. If you are going to go through the effort of creating wrappers, then you might as well start with a database instead, since you are already accepting a future-preparation cost.
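
For concreteness, the wrapper idea might look something like the sketch below, assuming Python and its standard sqlite3 module. DictContacts, SqlContacts, and remember_and_lookup are hypothetical names made up for this example.

```python
import sqlite3


class DictContacts:
    """Starting point: a plain dict hidden behind two wrapper methods."""

    def __init__(self):
        self._phones = {}

    def save(self, name, phone):
        self._phones[name] = phone

    def phone_of(self, name):
        return self._phones.get(name)


class SqlContacts:
    """Later replacement offering the same two methods, backed by SQLite."""

    def __init__(self, conn):
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS contact (name TEXT PRIMARY KEY, phone TEXT)"
        )

    def save(self, name, phone):
        self._conn.execute(
            "INSERT OR REPLACE INTO contact (name, phone) VALUES (?, ?)",
            (name, phone),
        )
        self._conn.commit()

    def phone_of(self, name):
        row = self._conn.execute(
            "SELECT phone FROM contact WHERE name = ?", (name,)
        ).fetchone()
        return row[0] if row else None


def remember_and_lookup(contacts):
    # Caller code is identical whichever class is passed in.
    contacts.save("Bob", "555-0100")
    return contacts.phone_of("Bob")


results = [
    remember_and_lookup(DictContacts()),
    remember_and_lookup(SqlContacts(sqlite3.connect(":memory:"))),
]
print(results)
```

Note that the wrapper only protects callers for as long as the two methods are all they need. The moment someone wants a query across contacts, the dict version has to grow its own search code, which is the reinvention cost argued about above.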

So let me see if I understand the issue. Let's assume, for the sake of discussion, that a particular application is collecting enough information that a database is needed. Let's say, for example, that we're talking about a data storage facility for a call-center backend of a telephone marketing and sales company. The customer is not technical and has no idea what a "relational", "data base", or "SQL" is. The customer has heard, however, that "agile is good" and has retained us to roll out the new application as soon as possible.

I suggest that we had better have a very good idea of what tables, columns, indices, and foreign keys are going to be needed well before we start gathering data. I think we'd better think now about what the customer is going to need before the customer knows they need it. I further suggest that a consultant who cites YAGNI as a motivation for NOT doing this has never faced a legacy data migration issue before.

When the application is up and running, and has collected several million records, it's a little late to discover that an extra field was needed to contain something like "Extra comments" -- especially after a few hundred of those several million entries have used the second line of the address field to hold those comments.
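
A small sketch of why this hurts, assuming Python and its standard sqlite3 module. The contact table, its columns, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contact (id INTEGER PRIMARY KEY, address1 TEXT, address2 TEXT)"
)
conn.executemany(
    "INSERT INTO contact (address1, address2) VALUES (?, ?)",
    [
        ("12 Main St", "Apt 4"),              # a genuine second address line
        ("9 Elm Rd", "call back after 5pm"),  # a comment stored in the wrong field
    ],
)

# Adding the missing column is the easy part...
conn.execute("ALTER TABLE contact ADD COLUMN extra_comments TEXT")

# ...but every existing row gets NULL in it, and nothing in the schema says
# which address2 values are really comments.
rows = conn.execute(
    "SELECT address2, extra_comments FROM contact ORDER BY id"
).fetchall()
print(rows)
```

The ALTER TABLE is one line. Deciding, row by row, which of several million address2 values are comments is the part that ends up on the purchase order.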

Data migration is hard, tedious, expensive. The purchase order to pay the bill for that migration is very difficult for the IT manager to explain to her CFO, especially if it follows our earlier bill for designing and building the application.

If the problem requires a database, then I suggest we should be very cautious about making any assumptions about XP practices in the code that interacts with that database. Very cautious.

This is another aspect of XpAndPublishedInterfaces. The problem is that XP falls apart around the parts of the code that will be "Annealed" by existing content. In this case, the content of the DB becomes brittle, making the code that interacts with that content brittle. XP works fine until your product goes live... at that point, refactoring becomes expensive (expensive refactoring makes XP impossible). Which is not to say that XP could not be used up to a point... but at some point before the DB gets annealed by going live, some long-term thinking must be done.

Maybe UserStories could be created for backward compatibility of any published interfaces that the system deals with? This does not need to be designed up front per se, just before going live. It should be okay if the system doesn't deal with old file formats that only existed internally, for example. -- JasonWynja?
See YagniAndSpikeSolutions, XpAndEncapsulation, ReinventingTheDatabaseInApplication, UsingDatabaseUpFrontConsideredHarmful

Last edited February 2, 2009