Measuring Abstraction

If somebody says, "My language/paradigm/technique is a higher level of abstraction than yours", how does one go about measuring abstraction level?

There's a paper on this topic that may or may not have the ultimate answer, but I believe does a good job of addressing and dismissing the general sort of argument below, and moving on to more workable approaches: "A Definition of Abstraction"; Martin Ward, January 17, 2003 -- ( the link above seems outdated, here's one indirection from : ) DougMerritt

A summary of that paper:

You skipped the interesting part where they concluded that size-based definitions are inadequate, and their best attempted definition...

It also tends to focus on "toy" examples like sorts and tower-of-honoi. I often find these hard to extrapolate to "real" applications because they don't deal with enough factors. The hard part of real stuff is dealing with all the interweaving variables (dimensions or factors), not just taking simple inputs and getting simple outputs.

True. Do you have further suggestions for reading that might be better, or at least also interesting? I found that one by googling today, and didn't find much else so far.

Less and/or simpler code may be one way. However, sometimes smaller code requires more rework to incorporate changes. For example, if your system currently serves five countries, it might be less code to hard-wire in those five countries. But adding more countries could require more work than looping through a centrally-defined list, especially if that list of 5 is used in more than one place.

Thus, perceived abstraction probably depends on things as how one perceives the likelyhood of certain changes. Are there ways to measure besides those that typically lead to HolyWars?

Take a simple task, such as gaining an array which contains all of the squares of the first n numbers. In C:

          for (i = 0; i < n; i++) {
                     array[i] = i*i;


          array [assigned to the value of] [iota] n * [iota] n;

(bear in mind that [assigned to the value of] and [iota] are single characters, so the whole thing is eleven characters long)

Having accomplished the same task in both C and APL, I get the feeling that APL has greater abstraction when it comes to arrays.

I am not sure symbol counts make a very good measure because APL uses symbols in place of words by extending the alphabet. It is sort of like Chinese compared to English. Pictures tend to be more compact. However, they complicate your base "alphabet", which may not be a good thing. I tend to consider things like LISP more abstract because you can make your own sub-language easier rather than rely on APL Inc. to include the right symbols/operations in its library. However, I don't know how to measure that either.

For methodologies like AspectOrientedProgramming for which there can be no simple example, it gets tricky.
I think it's like asking the "right" way to unscramble an anagram. You could always make the argument that language feature A in language Z is isomorphic to lanugage feature B in language Y. Which makes a task "simpler" is a difficult question.

3(k+j) or 3k + 3j - which is "simpler"? If it's in the equation 3k + 3j = 6x then you'd want the first form; if its in the equation 3(k+j) = 7j you'd want the second form.

There is no "good" answer. But you might not want abstraction for its own sake, you might want abstraction for the sake of ease of extension. In which case, measure ease of extension, not abstraction.
Before you can measure something (in this case "abstraction"), you need to define it - and the measurement becomes trivial. So the problem present by this page is not really measuring abstraction, but defining abstraction!

I don't know what the formal definition of abstraction is, by the one that I personally devised is this: "Abstraction is the process of mapping a model onto the thing being abstracted".

While I have explained this in more detail on ArgumentsAgainstOop, a terse explanation follows. A model of something is a simplified representation of that thing, which doesn't necessarily have any relationship to that thing; so a mechanical mental model or a real scale model are both valid models. Mapping is the process of associating parts of the models with parts of the thing being abstracted.

Therefore an abstraction excludes non-essential details by not having anything to associate those non-essentials details with on the model. Conversely, the essential details are associated with elements of the model. One extreme for a model is to use the thing being abstracted as the model itself - this provides zero abstraction because no details are lost. Therefore the more details that are ignore (become irrelevant), the more abstracted you are becoming. In the limit, only a single element is mapped onto the (very simple) model, and this provides the ultimate abstraction.

So with abstraction clearly defined, you can see that to measure the relative abstraction of two languages (which need to be made to perform the same task), you need to compare how many "details" each language requires. I would suggest that anything which can vary is a detail, since it provides something meaningful. This means that keywords are ignored.

As an example, concatenating two (immutable) strings:

In the CeeLanguage requires at least three pointers (one for each string), two variables to calculate the length of the new string, an index variable for stepping through the string being read, an index variable to step through the string being written, and a variable to hold the value of a character so that you can see if it is the zero terminating character. A total 8 details.

While the JavaLanguage requires just three string variables. A total of 3 details.

So in this crude example, Java is 2-3 times more abstract than C. A counter-argument against this is that if C came with a library of string functions (as it probably does:-) then appending a string would be no more complex in C. My response to this would be that because both languages support abstraction, you must consider the most *basic* representation that is reasonable in each language. Since C supports direct memory access, use that. Since Java doesn't, you can't. --ChrisHandley

Remember, CeeIsNotThePinnacleOfProcedural.

{That's a cute definition. I disagree with the counter-counter-argument, though. If C has to ways of doing something, one more abstract than the other, that shows the language has a range of possible levels. I note many high level languages allow you to drop directly to assembly code.}

But one can measure that to some extent by comparing code size. But it may also imply that Python is more abstract than Java because Python will usually be smaller. Python proponents would agree, but not Java propononets. The only thing anyone will agree on is "abstraction is that language/paradigm/methodology which makes me most happy." We need a way out of the subjectivty black-hole that software engineering is trapped in. --top

I was rather pleased with my definition since it seems intuitively right (to me). To better justify my counter-counter-argument, remember that pretty much all languages support some kind of abstraction, and yes some support access to lower-levels (Java's JNI could be seen as that). The job in question is not to determine how good those languages are at providing abstraction (or providing special-case access to low-level features), but rather to evaluate the abstraction level of the lowest commonly-used features. I agree this is a tricky call to make, but it is way better than saying you can't measure abstraction at all! --ChrisHandley

Perhaps it is time for a specific example.

I have a problem with defining abstraction as just eliminating details. I thought, in addition, an abstraction includes generalization (which may occur when you eliminate details but may not). --DoraiThodla

See also LevelsOfAbstraction, FuzzyDistinctionBetweenInterfaceAndImplementation

View edit of March 14, 2008 or FindPage with title or text search