Number Of Keystrokes

From ComputerLanguageBenchmarksGame:

I think LinesOfCode is a ridiculous measure: if, for example, I were told up front that a low LOC was considered "good", I would pack as much into one line as I could. You can do this in the ForthLanguage and in CeePlusPlus, to name but two examples. Using a measure like LOC to rate programs or programmers reminds me of the Tech Support debacle that Gateway experienced recently: [...]

But, if we replace "low number of lines of code" with "low number of keystrokes", we get something which I think is an amazingly accurate measure of the work a program has caused, despite being a very technical measure. (Or we'd end up with a GraphicalProgrammingLanguage driven by mouse. ;-) Forces:

  1. the trouble of actually writing code is directly proportional to the number of key presses the programmer has to make;
  2. even though most of programming time is spent thinking, the amount of that time is, in my experience, proportional to the amount of writing;
  3. different programming languages and different algorithms tend to produce differing amounts of thought work and design work, but the amount of thought work needed is, in my experience, directly related to how well the programmer understands the language used and the problem to be solved. Thus, the number of keystrokes in the code corresponds directly to the work caused by the implementation phase(s) / mode(s) of the development.
  4. so, the only work which is not reflected in number of keystrokes, is the work of learning new things. I don't think anybody expects this work to be estimable in any sensible way.

Also, the number of keystrokes is relatively hard to bluff as a measure of productivity. About the only thing I can think of is making a big bunch of #defines and using short and non-descriptive variable names. But, if your #defines are not appropriate abstractions, they will be of little shortening value; and shorter variable names are not a problem as long as you are actively developing the software (at least I don't ever forget the meaning of my variables unless there is a pause in the development of the software).

Also, defining language-level abbreviations for common cases, as Perl does, will not spoil this measure. Unless the abbreviations are very well-chosen, they don't help much, and make other constructs correspondingly longer. And if the abbreviations are well-chosen, they genuinely increase the expressiveness of the language.

This way of thinking points to a theoretical claim: the complexity of a problem is directly proportional to its KolmogorovComplexity, and the value of a tool for solving a problem can be directly estimated by the KolmogorovComplexity it gives for the problem.
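KolmogorovComplexity itself is not computable, so any practical use of the claim above needs a stand-in. A minimal sketch, assuming compressed size as a crude upper-bound proxy (the name `compressed_size` and the two sample programs are made up for illustration, and zlib is just one convenient compressor, not something the discussion proposes):

```python
import zlib


def compressed_size(source: str) -> int:
    """Bytes after zlib compression: a rough upper-bound proxy, not true Kolmogorov complexity."""
    return len(zlib.compress(source.encode("utf-8"), 9))


# Two hypothetical programs with the same behavior: print 'hello' fifty times.
repetitive = "print('hello')\n" * 50
factored = "for _ in range(50):\n    print('hello')\n"

for name, source in (("repetitive", repetitive), ("factored", factored)):
    print(name, "typed:", len(source), "compressed:", compressed_size(source))
```

The repetitive version is far longer as typed, but compression should remove most of that difference, which is roughly what the Kolmogorov view would predict. Very short inputs carry a fixed compression overhead, so the proxy is only meaningful for comparisons, not as an absolute number.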

Counting the number of keystrokes often goes against maintainability, but not always. After all, when changes come, the length of the expressions to be changed governs how much work the changes take. What this does not take into account is understanding the change, understanding where the change is to land and testing the change; but these are things you have to do anyway, and I claim they are not affected drastically by the terseness of a program. -- PanuKalliokoski
I believe NumberOfTokens? is a completely superior - but still fundamentally flawed - measure. -- DanielBrockman
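To make the difference between counting tokens and counting characters concrete, here is a small sketch using Python's standard `tokenize` module. The helper names `count_tokens` and `count_chars`, the choice of which token types to skip, and both sample functions are assumptions made for illustration only:

```python
import io
import tokenize

# Layout-only token types that are not counted (an assumption of this sketch).
SKIP = {
    tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
    tokenize.DEDENT, tokenize.COMMENT, tokenize.ENDMARKER,
}


def count_tokens(source: str) -> int:
    """Number of lexical tokens, ignoring layout and comments."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return sum(1 for tok in tokens if tok.type not in SKIP)


def count_chars(source: str) -> int:
    """Characters in the final text, a stand-in for keystrokes."""
    return len(source)


short_names = "def f(a, b):\n    return a + b\n"
long_names = "def add_numbers(first, second):\n    return first + second\n"

for source in (short_names, long_names):
    print(count_tokens(source), "tokens,", count_chars(source), "characters")
```

Both functions have the same number of tokens, while the character count rewards the shorter identifiers. A token count is therefore insensitive to the variable-name shortening mentioned above, though it still says nothing about the thinking behind the code.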


Also, the number of keystrokes is relatively hard to bluff as a measure of productivity.

Given that a 'low number' is a good thing, and that WellFactoredCode will have a lower number than code that violates OnceAndOnlyOnce, treating number of keystrokes as a 'measure of productivity' is a really bad idea. You'll end up with one of two scenarios: WhateverGetsMeasuredGetsOptimized.


The trouble of actually writing code is directly proportional to the number of key presses the programmer has to make.

I strongly disagree. In my experience the number of keystrokes has little to do with the effort required to write code. A commonly used pattern may take more keystrokes to enter than a novel one, yet the novel one will require significantly more mental effort. Configuring depends on typing. Programming depends on thinking. -- EricHodges

That's why I say "actually writing", meaning the physical act of writing. Moreover, in my experience, the speed of typing does not vary much depending on how familiar one is with some construct; only constructs like "while" get some bonus for being written all the time and turning into reflexes. What you address in your criticism, thinking, is dealt with in the other points above, not this one. -- PanuKalliokoski

If all you're saying is that the effort required to enter text into a computer is proportional to the number of key presses, then we agree. If you're drawing a correlation between that effort and the effort of programming, we still disagree. Some of the hardest programming work I've done has involved very little typing, while much of the easiest programming work involves a great deal of typing. -- EH

In the point you quoted, yes, that's all I'm saying. In the other points, I try to make the case that the only thing not reflected in number of keystrokes is the effort of learning new things, leaving number of keystrokes a very good measure of expressivity of language / complexity of a problem.

Did you actually read my blurb above? -- PanuKalliokoski

Yes I did. You leave out the biggest part of programming, problem solving. Keystrokes may map to expressiveness of a given language, but it doesn't map to complexity of a problem. Some very complex problems have small solution expressions, and vice versa. -- EH

Would you like to give an example of a complex problem that has small solution implementation, or a simple problem with a big solution implementation? And I'm not leaving problem solving out: my theory is that it has two parts, pattern generation ("liquid thinking" in psychological terms) and pattern application ("crystallized thinking" in psychological terms). Pattern generation is what I refer to by "learning" in the blurb above, and the speed of pattern application, in my experience, does not vary as much on the "type" of a pattern as on the size of the pattern when applied.

Complexity of a problem is, of course, very programmer dependent exactly because programmers have different sets of patterns to apply to the problem and so the work of pattern generation varies a lot. What I claim is that the number of keystrokes gives a good estimate of the complexity of a problem irrespective of the programmer. -- PanuKalliokoski

A complex problem with a small solution: How much energy is in any given amount of matter? The solution can be expressed as "e=mc^2", but arriving at that solution took considerable effort. It took me only 7 keystrokes to express that solution, but it takes me many more keystrokes to express solutions to much less difficult problems. -- EH

I guessed you were talking about something like this. But you're talking about a very different thing than I am. First, I'm talking about algorithms and description of behavior, which programming is; your "problem" is about finding out empirical truths about the real world. Second, "e=mc^2", though succinct, (1) is not entirely correct: the exact amount of energy depends on many other factors, and (2) certainly didn't take you much thinking (unless you were spending the last 6 hours trying to find out the solution). Third, even though a remarkable physical finding at the time, this equation is the bulk of physics and is used constantly by scientists without a second thought. So, unless you count the heritage of earlier programming in every program's "amount of work", counting the trouble of first finding out the equation "e=mc^2" is indeed moot. Fourth, your chosen problem is not what "e=mc^2" was the solution of at the time it was found out: the big finding was that energy and matter are in fact interconvertible; if you already know that, the equation "e=mc^2" arises naturally from the units of mass, speed and energy.

I thought, because you said,

"Some of the hardest programming work I've done has involved very little typing, while much of the easiest programming work involves a great deal of typing. ",

that you probably had some practical example from the work of programming, and I was asking for such an example. Of course, I'm likely to claim that the "hardness" of this programming work is in learning new things, the cost of which, as I admit above, is not reflected in the number of keystrokes. But on the other hand, I am interested in finding genuine counterexamples for this thought. -- PanuKalliokoski

I was hoping you could apply my example to programming. If not, think of a geometric problem, like the ones described on GeographyExample. The process of writing software to generate and manipulate those maps depends on solving several geometry problems. Once solved, their expression as code can be quite simple, but the act of solving them takes considerably more effort than the act of expressing them as code. Programming isn't just a process of expressing well known solutions to well known problems, or learning new things from someone else and applying those. It's usually the process of solving problems no one else has ever encountered. For problems with existing solutions, it's usually better to buy instead of build. -- EH

Re "It's usually the process of solving problems no one else has ever encountered": are you serious? In my experience, and as well explained in ProgrammingIsNotFun (part "most programs are boring"), most of programming work is application of existing solutions to new problems. You very seldom run into problems you actually know no (elegant) solution for. It happens, but it happens rarely, and it is totally dependent on the programmer, not the problem at hand. Re "better to buy instead of build": exactly, most of programming work is about the application of existing tools in new contexts. The situation is actually the same, whether you implemented the abstraction yourself or bought the implementation from outside. -- PK

I am serious. We must have very different experiences as software developers. I very seldom run into problems for which I know an elegant solution. Perhaps you know more elegant solutions than I do. Perhaps I avoid work that has already been solved by someone else. -- EH

Well, of course the solutions differ in elegance. But I very seldom run into a problem (in programming) for which I don't know at least one obvious solution. As you say, our experiences or styles of coding differ.

This discussion about elegance points to another definition: in my opinion, the elegance of a solution is almost directly its brevity. This links to the interesting property that every problem has its KolmogorovComplexity in a given language, which is the length of the shortest description of the behavior; an isomorphic solution, however, may be of any length whatsoever. -- PK

But there's not a one to one correlation between brevity of the expression of a solution (the end result) and the process of arriving at that solution (solving the problem). I'm tempted to say there's an inverse relation, if any. It's often easier to find a long solution than a small one. -- EH

Now that's true, and I feel it is the kernel of this question: you talk about the trouble of finding a solution, which I feel is negligible because "you just have to go and try it out", and I talk about the trouble of using that solution. I do acknowledge that sometimes you have a problem for which you can't come up with any solution at all, but that's very rare in my experience.

Our experiences do differ, and they differ in an interesting way: I feel that your way emphasizes design, whereas mine emphasizes experience. Please tell me if you see it some other way. And I would still like you to take an actual example problem with which you've struggled in the past. -- PK

I'm not talking about design as I understand the term. I'm talking about problem solving. I see every set of requirements as a problem and every piece of software that meets those requirements as a possible solution. Design, to me, has to do with finding an appropriate implementation of the solution given the constraints of time and budget. Solving the problem is something else. -- EricHodges

[interesting but lengthy and not directly related discussion about GeographyExample moved into IcosahedronImplementation]


Actually, the GeographyExample (I didn't read the discussion through, just the problem statement) goes to prove my point, in my opinion: it's a problem for which you can try different mathematical approaches (pattern application), which yield implementations with differing length (and correctness). I claim that the work and time taken by the implementation of different solutions corresponds roughly to the NumberOfKeystrokes, if you try coding them out. -- PanuKalliokoski

But I don't try coding them out. I solve them mathematically first, with pen and paper, then express the solution I want to implement. It would take more time to express all of the possible solutions as code first. -- EH

Whooee - sounds like BigDesignUpFront to me. I usually "work out" the problem directly by coding, because many things that look fine on paper turn out to be extremely inelegant while implementing them. -- PK

Quite the opposite. I see writing a lot of code to solve a geometry problem as BDUF. -- EH

How on earth could trying different things out so that they're instantly usable as such be BDUF? -- PanuKalliokoski

Because you can apply geometry theory to solve the problem more rapidly. It's the simplest thing that could possibly work. -- EH

I think your argumentation seems to point in the direction of BigDesignUpFront being sometimes feasible, not experimentation with code being BigDesignUpFront. -- PK


By the way, I noted a clear terminological difference between us - you seem to me to think, complexity of problem =(def) the work for finding out its solution (without prior knowledge about the area). But this is highly culture and thinker dependent, so I think it says little about the problem. For me, the complexity of a problem is in the number of relations, concepts, objects, etc. which it takes to express the problem and the solution. -- PK

By complexity of the problem I mean the difficulty (amount of energy expended) to find a solution. Some problems with only a few concepts and relationships still take a great deal of effort to solve. -- EH


I believe "number of keystrokes" suffers the same problem as "lines of code." Although each one could be suitably defined to become a true unit of measure, there is not much value to the measure when done. Fewer keystrokes could be either beneficial or detrimental and this could only be determined on a case by case basis. There is no way to predict whether fewer keystrokes produce a desirable result; rather, one needs to look at how the fewer keystrokes were achieved. -- WayneMack

You have very good points. They point to the question, "what is measuring number of keystrokes good for?" For me, the answer is: lengthy code is a CodeSmell (if you have to write so much that typing speed becomes a bounding factor for development instead of thinking speed, you probably are not abstracting properly), NumberOfKeystrokes remains valuable for estimating the expressiveness of a programming language, and the complexity (as understood by me) of a problem is reflected in the number of keystrokes needed to implement it. This last measure is of very dubious value; the others, however, remain valuable at least to me. -- PanuKalliokoski


[Discussion about the length of identifiers moved into AreLongAndDescriptiveRelated.]

Thank you guys for your input! I appreciate very much peer opinions about my controversial thoughts. I'll probably summarize this all after some time. -- PK


I think NumberOfTokens? is superior to NumberOfKeystrokes, if by keystrokes we mean characters/bytes, but I wonder if the original author really meant keystrokes in the sense of including cursor movement, deletion etc.

The trouble of actually writing code is directly proportional to the number of key presses the programmer has to make.

This definition seems to imply that we have to count these keystrokes too. And if we have to repeatedly change some code until it finally runs (because it is hard to get right, e.g. due to language terseness), we may come up with a much higher number of keystrokes than just counting the final number of characters.

Probably an even better measure would be AccumulatedTokenChangeEffort?, though it is as immeasurable (without support by the IDE) as the real NumberOfKeystrokes.

Of course even this discounts the ImaginaryKeystrokes? that the programmer considers but never executes.

-- GunnarZarncke
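AccumulatedTokenChangeEffort? is not defined anywhere on this page, so the following is only one possible reading: take saved revisions of a piece of code, and add up the tokens typed for the first draft plus the tokens deleted and inserted between each pair of successive revisions. The function names, the whitespace-based tokenizer and the sample revisions are all hypothetical; the diffing uses `difflib.SequenceMatcher` from the Python standard library:

```python
import difflib


def change_effort(old: list[str], new: list[str]) -> int:
    """Tokens deleted plus tokens inserted when going from old to new."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    effort = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            effort += (i2 - i1) + (j2 - j1)
    return effort


def accumulated_effort(revisions: list[str]) -> int:
    """First draft size plus the change effort of every later revision."""
    if not revisions:
        return 0
    tokenized = [rev.split() for rev in revisions]  # naive whitespace tokens
    total = len(tokenized[0])
    for old, new in zip(tokenized, tokenized[1:]):
        total += change_effort(old, new)
    return total


# Hypothetical history of one expression being corrected twice.
revisions = [
    "x = a + b",
    "x = a - b",
    "x = ( a - b ) * c",
]

print("final tokens:", len(revisions[-1].split()))
print("accumulated effort:", accumulated_effort(revisions))
```

The accumulated figure comes out higher than the token count of the final text, which is the point made above about code that has to be changed repeatedly until it runs. It still only sees saved revisions, so edits made and undone between saves remain invisible without editor support.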


so, the only work which is not reflected in number of keystrokes, is the work of learning new things. I don't think anybody expects this work to be estimable in any sensible way.

The people deciding whether or not to pay for it do.

What I claim is that the number of keystrokes gives a good estimate of the complexity of a problem irrespective of the programmer.

As with LoC, this is another after-the-fact measurement not very practical for predictive estimation. And given that lines of code are just sequences of keystrokes, it's really pretty much the same thing, the only difference being how many of those keystrokes are newlines. What makes the number of newlines so important?
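The relation between the two measures can be checked on a toy case. This is a minimal sketch, assuming that "keystrokes" means characters in the final text and that blank lines do not count as lines of code; `lines_of_code`, `keystroke_estimate` and the two sample programs are invented for illustration:

```python
def lines_of_code(source: str) -> int:
    """Non-blank lines in the source text."""
    return sum(1 for line in source.splitlines() if line.strip())


def keystroke_estimate(source: str) -> int:
    """Characters in the final text; ignores cursor movement and deletions."""
    return len(source)


# The same hypothetical program, once spread out and once packed into one line.
spread_out = "a = 1\nb = 2\nc = a + b\nprint(c)\n"
packed = "a = 1; b = 2; c = a + b; print(c)\n"

for name, source in (("spread out", spread_out), ("packed", packed)):
    print(name, "LOC:", lines_of_code(source), "characters:", keystroke_estimate(source))
```

Packing the statements onto one line cuts the line count from four to one while the character count moves by only three, so the two measures differ mainly in how easily layout alone can shift them.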


CategoryMetrics

(last edited January 12, 2011)