Abstraction Deviation Domain Smell

I find that deviations from abstractions are often a domain (business rule) smell. They may be an old bad habit that stuck with the business. Or perhaps they signal a different way to go about the problem. The problem is convincing the customer to change to a cleaner abstraction.

I've found the hard way that sometimes "silly" rules are there for a reason. Business processes are often organically grown over time and shaped by trial and error. If you simply replace it with a simpler abstraction, you'll often find out later that you broke something that used to work or lost a feature you didn't know was a feature. Re-engineering is a tricky path. I'm not saying don't try, but rather prepare for such surprises in estimates and via flexibility in design.

For example, I once was working with existing code that used a lot of Trim() functions. It seemed a massive violation of OnceAndOnlyOnce, so I removed most of the Trim's and simply made a "standard" loading mechanism that automatically trimmed the input parameters. Hundreds of Trim's became one Trim. I patted myself on the back and gave myself an abstraction gold-star award for my own forehead (in my mind).

Then one day someone discovered that a certain data element needed a space after it (or was it before) because multiple other processes down the line, outside of our jurisdiction expected it that way. I think it was a sorting trick to force something at the top of reports or lists, but it was an UndocumentedFeature such that nobody really knew. I had to add all those damned trim's back and turn in my mental abstraction award. (I couldn't make an exception to the rule for the rare spaced data, because it came from yet another feeder process(es) which couldn't be determined at code-time.) --top

{Nothing you've written above has anything to do with abstraction. It has everything to do with excess repetition, perhaps, but not abstraction. Abstraction is about finding general mechanisms that appropriately encompass a category of specific cases. For example, a set of "Report" mechanisms -- which might be one class, a group of classes, a collection of functions, or a library or framework with any or all of these -- may be an abstraction to handle all printed reports. "Report" is an abstraction because it encompasses all specific reports, but does not represent any particular report.}

Repetition factoring and abstraction have a lot in common. In fact, measuring repetition is about the only way to objectively identify and/or measure "abstraction". Outside of that, we only have nebulous personal definitions and hard-to-text mental notions of what "abstraction" is.

{Be that as it may, nothing you've described above has anything to do with abstraction. Regardless what "nebulous personal definitions and hard-to-text mental notions" you may hold, adding or removing trim()s is neither adding or subtracting abstraction. An abstraction is a general mechanism that appropriately encompasses a category of specific cases without being a specific case. How does your gaggle of trim()s relate to that?}

I suppose we are headed down the path of a LaynesLaw over "abstraction". I disagree with your working definition because it's not about hierarchical classifications. That's too narrow and specific. It's generally defined as excluding or hiding details irrelevant to the problem at hand. --top

[Huh? It seems you're objecting to his working definition because it's not about hierarchical classifications (which is true) and at the same time complaining that hierarchical classifications are too narrow and specific (which is irrelevant because, as noted, his working definition is not about hierarchical classifications). Finally, you present a definition of 'abstraction' (generally defined as excluding or hiding details irrelevant to the problem at hand) that could be used just as effectively to say: "{Be that as it may, nothing you've described above has anything to do with abstraction.}". I know that logic and rational reasoning is well outside your normal modus operandi, TopMind, but you usually do a better job of faking it than what you wrote above.]

OffendedButResistingCounterInsult. "Specific case" generally implies a hierarchy.

[No, it does not. For example, "this particular blade of grass in my hand" is a specific case of "green things" and a specific case of "things called 'blade of grass'" and a specific case of "objects with tensile strength", and a specific case of "things that might cut my hand". Where is the hierarchy?]

But we'll table that for now. As far as the second issue, I assumed that trimming wasn't a relevant detail of all those submitting processes and thus moved it to a spot where it was a relevant detail. They didn't need their semi-result trimmed until just before submitting it to 2nd stage. Thus, I assumed that trimming was only a relevant detail to the 2nd stage. Thus, I did use abstraction. It was just the wrong abstraction.

[What you did was make an assumption. It was just the wrong assumption. If trimming isn't a 'relevant detail' at a given point in reasoning, abstraction would hide the irrelevant detail from local consideration or expression, not eliminate it.]

Suppose your power company wanted you to staple a special paper to your electric bill. They need that special paper internally. However, it shouldn't be relevant to the bill submitter. Every bill submitter shouldn't be required to staple some odd paper to their bill. It's an irrelevant detail to the bill payer. Fixing this is similar to the kind of clean-up I attempted to apply.

[Maybe so. Similar to the kind of clean-up you attempted to apply, that clean-up has nothing to do with abstraction.]

I already know you disagree it's "abstraction", but I still don't know why.

[Abstraction is about how we think about and discuss things. It's a meta-linguistic word. "Cat" is an abstraction because it doesn't discuss a particular cat. "Red" is an abstraction because it doesn't discuss a particular red object. Your 'clean-up' is about changing a specific process - not about describing that process, much less abstracting it.]

You still have not clearly explained why it's not "abstraction". And how can a "clean-up" be about "describing that process"? Abstracting the documentation? I cannot X-ray the abstraction determination algorithm, formula, or neuron-firing pattern in your head. You'll have to use words to describe your working criteria for "abstraction". I don't necessarily disagree with your "cat" and "red" examples of abstraction, but classification and "instance-of" relationships are only a subset abstraction.

[You made what is called a category mistake. To call what you earlier called an 'abstraction' makes as little sense as opening up your computer to look for the Internet, or saying that the physical activity of "clean-up" is about "describing a process" (it isn't, which was the stated point). Looking for an 'abstraction' in someone's head is another category mistake, unless you're thinking of 'neurons' and 'neural firing patterns' in the abstract sense; by definition, abstractions aren't physical, specific things. And every abstraction represents an 'instance-of' relationship (by every definition of abstraction that I've ever seen), but it is still an error to call this 'classification' (since abstraction doesn't necessarily involve the process of classification) and it is also an error to call it a 'subset' (since there isn't necessarily a defined set from which to take a subset).]

The definitions I've seen emphasize extracting essentials from non-essentials (for a given purpose), and yours seems too closely related to classification (which is a subset of essentialness-extraction techniques). Anyhow, we probably cannot and should not continue without a clear and agreed-upon working definition of "abstraction". Good day, Sir. -t

[The way you want to use 'abstraction', you'd need to say a water purification filter performs 'abstraction' (since it extracts non-essentials for the given purpose of drinking water). That's ridiculous, and your 'Trim()' and 'odd paper' examples are ridiculous for the very same reasons. Abstraction by all but the most archaic of definitions is about concepts, fundamentally. It is not about 'classification'. I think your opinion is strong evidence of illiteracy on your part. Go educate yourself, sir.]

Essential concepts, not essential particles. My apologies. I guess I thought that kind of thing was obvious from the context, but it seems not.

http://en.wiktionary.org/wiki/abstraction contains BOTH our definitions, so is of little help. However, containing both means that both can apply.

  2. (philosophy) The act of leaving out of consideration one 
  or more properties of a complex object so as to attend to 
  others; analysis. 

9. (computing) Any generalization technique that ignores or hides details to capture some kind of commonality between different instances for the purpose of controlling the intellectual complexity of engineered systems, particularly software systems.
I suppose you could argue that the "computing" definition applies more in our context, but this is more about conceptual abstractions that the customer could possibly relate to also (business rules), not so much about coding in say Java.

[Neither of those definitions applies to your 'odd paper' or 'Trim()' examples.]

Because you say so, it must be true. Anyhow, being true-blue certified "abstraction" or not does not necessarily detract from my illustration that re-engineering designs and code to simplify a system carries with it risks. If you can offer better or more fitting actual examples, be my guest...

{Be that as it may, your original paragraph re trim() et al. has nothing to do with abstraction or deviation from it. Perhaps you'd consider moving it somewhere more appropriate, or re-write it to actually deal with abstraction?}

You have not shown with clear logic that I am wrong. Stop trying to hand-wave me away.

[The logic is opaque only to you, TopMind. Your poor excuse for mangling the English language has been hand-waving from the beginning. You fail to grasp even the elementary difference between leave properties out of consideration to support analysis and remove those things from the product. There's glory for you.]

You just have an outlier writing and thinking style. I showed you how to present logic, but you ignored my recommendations, thinking your convoluted insipid style is sufficient. I didn't remove "things", I just refactored them to fewer spots. Support analysis? Please clarify. Usually when one says "cleaner abstraction", they mean a simpler system. Why change the system unless it will result in either performance gains or a simpler system to maintain or use? Or, are you talking about "better" business processes independent of software?

Further, if we are talking about "improving" a system, why get caught up in the distinction between abstraction-based improvements and non-abstraction-based-improvements (if there is such a beast)? Or are you just being anal for the sake of analness?

{Going in reverse order, we care about the distinction between abstraction based improvements and non-abstraction based improvements on this page because this page is about abstractions. (And there certainly are such beasts. If your change had actually been an improvement, it would have qualified as a non-abstraction based improvement.)}

{While one usually wants a simpler system from a cleaner abstraction, it doesn't follow that everything that results in a simpler system is a cleaner abstraction. A->B is not the same thing as B->A. }

{You did remove things. You removed the spaces that another part of your project needed.}

{And finally, where, on this page, have you showed us how to present logic? And why should we follow your rules?}

To clean up this page, I plan to move my "trim" anecdote to ReEngineeringPitFalls to avoid any hard connection with the term "abstraction"; and move the discussion about the meaning of "abstraction" to "AbstractionDefinitionDiscussion?". Is this acceptable? -t

I'd like to suggest refactoring this page to something like:

Example: A program may process input from various sources. Throughout the program, each function could do it's own cleaning or sanitization of the input as needed, or use the raw input if appropriate. But that would involve lots of duplication and increase the chances of accidentally using raw input where clean was needed (or vice versa).

To simplify and clarify things, this is abstracted out into an Input type that handles the cleaning. That way, a given function can just use input.Clean() or input.Raw() as appropriate. The function doesn't have to worry about how things are being cleaned. If a new rule (input must be transliterated as well as trimmed) needs to be added, then it only has to be added in one place and the functions don't need to know about it. It's also obvious whether you're using clean or raw input.

However, this abstraction may break down when different functions need different types of cleaning. Examples of cleaning might be trimming whitespace, urldecoding, converting all punctuation to dashes, transliteration, escaping, filtering out something depending on the source of the input, etc. One function might require stripping punctuation while others do not. Different functions may require different escaping, depending on where they're sending the data. Some might need the input trimmed but not transliterated.

As a result, various functions start deviating from the abstraction, implementing their own cleaning routines on input.Raw(). The benefits of the abstraction are lost as the code grows with exceptions to the rules and special cases.

Sometimes refactoring the abstraction helps (parameterize Input.Clean()), if the problem is due to technical limitations of the underlying system. But often the problem is due to business rules which may be leftovers from a forgotten past. Simply changing those business rules would simplify everything and remove the need for a lot of 'special case' code. (Remove the ability for certain people to sometimes embed scripts in input, drop obsolete naming conventions that were put in place for a previous system, switch to a newer external system that doesn't choke if there's a '+' in the input, or that can sort properly without space padding, etc.).

However, those special cases were usually put into place for a reason. Simply changing or dropping them for the sake of better abstraction is likely to break something, unless you fully understand why they're there.

Note that often one may not have the rank or position necessary to "fix" the domain rules even if we were certain they could in theory be cleaned.

See also: InvolveTheCustomer, EightyTwentyRule, ThatsNotaBugItsaFeature

View edit of December 19, 2011 or FindPage with title or text search