Argument By Lab Toy

Just because a language or tool is effective for a sample textbook-like application or a given problem sample doesn't mean its powers will scale to or apply to the real world and/or all domains.


The LudicFallacy? is a term coined by NassimTaleb?, in TheBlackSwanBook. "Ludic" is from the Latin ludus, meaning "play, game, sport, pastime." It is summarized as "the misuse of games to model real-life situations."Taleb explains the fallacy as "basing studies of chance on the narrow world of games and dice."

It is a central argument in the book and a rebuttal of the predictive mathematical models used to predict the future – as well as an attack on the idea of applying naïve and simplified statistical models in complex domains


(Moved from ExampleSizeIssues)

One of the reason that programming contest examples don't seem to apply to the real world is in dealing with all the exceptions (deviations) to abstractions that are usually found in the real world (EightyTwentyRule). Programming contest examples tend to weed all these out in order to keep the problem statement relatively simple. Thus, one is looking at problems with consistency and high abstraction built into them. But in the real world the problem statement is rarely simple (if it even exists yet). Large-scale abstractions are harder to find or have more exceptions than lab or textbook examples typically present (EightyTwentyRule). Real problems are full of little sub-rules, exceptions, and grey areas that may cause DiscontinuitySpikes with small change requests by the customer. Finding the fuzzy spots is often a big part of the developer's job. Otherwise, it would probably be off-shored to a lower-wage country (AmericanCulturalAssumption). In fact, a practitioner may judge languages and tools more on how well they handle the exceptions than on how many high-level abstraction features they have. "Academic" languages tend to have the focus the other way around. -top

[more to come...]
Note that lab toys are often good for demonstrating specific worse-case and/or best-case issues/scenarios even if they are not so good at demonstrating the interaction of myriads of factors (such as FeatureInteraction).
Was: (such as scalability or FeatureInteraction)

A removed comment suggested that "lab toy examples" are not good for testing scalability. I would tend to disagree. The test software after all could use large data-sets of dummy data, such as random values, to test scalability of data size. Another testing technique is to use large quantities of desktop computers at night when they are not being used in production. But it is true that one has to be specifically targeting scalability of test software/systems in order to test scalability. --top

"ArgumentByLabToy" connotes the sort of napkin or blackboard problem. It remains worth noting that the sort of 'toy examples' one can explain to someone via a medium such as WikiWiki are inherently poor for discussing or demonstrating scalability concerns.

As a general statement, I disagree. A lab toy can demonstrate how a program that sorts a thousand items fine will bog down under 100 million, for example. Many specific weaknesses or limits can still be demonstrated with a lab example. True, those issues caused by heavy FeatureInteraction are usually more difficult to demonstrate, but not all problems fall into that class.

You are wrong. If I provide you a sorting program, it would NOT tell you how it will bog down under 100 million items. To understand that, you would need more than the example. For example, you would either need to follow and understand the mathematics, or you would need to take the example out of the lab, flesh it out for real-world use, and actually run it.

Demonstrating the existence of a problem and demonstrating the reason or cause of it are two different "levels". You can demonstrate that "X doesn't scale" using such a device regardless of algorithm and math analysis.

My above words regarded even demonstrating the mere existence of a particular class of problems (scalability), not about its reason or cause or whether it is necessary. (Argument by example cannot ever demonstrate that a problem with a solution is a necessary one, vs. caused by a naive or contrived example. This is why argument by example is so often used as a straw-man.) While argument by example can usually show that a problem is possible, a 'lab-toy' example is (by its own nature) too small to demonstrate scalability problems. This has nothing to do with the size of the code, but rather with the size and completeness of the full example (including data).

Further, you there is a lot more than one scalability problem. Another scalability problem, for example, is showing how a data structure that can handle one find-and-replace just fine will bog down under 100 concurrent users doing different things.

True, it's not meant to be an exhaustive test. But it may not have to be for the targeted purpose. Often the point is to expose at least one flaw such that you don't need to expose a second. If you expose just one flaw, then a claim that "X doesn't scale" carries far more weight. You only need to shoot a horse once to kill it.

For example, you only need to show ONE case where bag processing is less efficient than set processing to make the case that there exists scalability problems with bags. You don't need to show every case. This doesn't rule out other scenarios that may favor bags, but that may not be the goal of a debate.

Now you're stretching beyond what any sort of argument by example can demonstrate. Suppose I showed you an implementation of an algorithm to process bags, and a variation optimized for sets. Then I provided example data that showed some scalability problems. Suppose, for the sake of argument, that the reviewer of that data wants to believe that bags have no scalability problem. What will that reviewer do? (1) deny that the example data is realistic (you never see that many duplicates in real data!), (2) say that the performance problem in question could have been avoided by (pick one: unique a projection, delay projection til a later step, hand-waving indexing techniques, etc.), (3) point out that the bag and set code aren't exactly the same and say that the bag code wasn't optimized as well (maybe some optimizations could have helped the bag code, too! then what would the difference be?), (4) all of the above. My experience is that (4) is the usual response. Anyone who wants to dismiss a specific example as a reason for dismissing a general problem will find excuses to do so. People who ask for such examples while eagerly seeking excuse to dismiss them, are simply lazy sophist scum. Good thing we don't have people like that on this wiki, eh?

Argument by example, including argument by lab-toy, is fundamentally to show that something is possible. This can be used in the role of a prototype (i.e. to derail disbelievers), or to counter a claim of non-existence. Argument by example cannot ever demonstrate a problem is necessary. In the domain of code, this is because BadCodeCanBeWrittenInAnyLanguage. Perhaps it only takes one hit to kill the horse, but unless the shot necessarily hits, it will have 'missed' in the eyes of the fans of the horse.

I see nothing wrong with the example turning into one or more of the above responses/issues. That's a natural evolution of debate. Whether any of the four are valid or not is a new sub-debate, similar to StepwiseRefinement. For #1, a new sub-debate explores the probability of a given data pattern(s). For #2, it may require re-adjusting the example, perhaps at the burden of the critic of your algorithm. For #3, it would probably be the burden of the critic to supply a better bag algorithm to replace yours. Nobody ever claimed an example was certain to end debate. It may merely become a work-space upon which to explore the issues of the debate. It's usually a vast improvement over using just English in my opinion. -t

Examples can be used in explanation. That is different from using them in argument. It is not sane, reasonable, or rational to present an argument by example if you do not expect it to convince people towards a conclusion. If TopMind wants to present or ask for an example for explanation, however, then he ought to take that position clearly and become a better student - because that's what explanations are about: teaching. If you confuse explanation and debate, people will (quite naturally) think you a sophist, a troll, a HostileStudent, an idiot, and any number of other unsavory titles.

This doesn't make a whole lot of sense. You are parsing words too heavily. The boundary between "explanation" and "evidence" can be very fuzzy. Potential problems or issues can be demonstrated with lab-toy code and runs. Lab toys are a tool for communication, and like any tool they can be misapplied, abused, etc. As it tool it cannot be summarily accepted and summarily dismissed. The devil's in the details. And by the way, I'm not your fucking student, so stop being a hostile teacher.

Even if it doesn't make sense to TopMind, this makes sense to almost everyone else. If I ask you for an 'explanation', I am asking for you to teach me, as a student. If I ask you for an 'argument', I am asking for you to prove yourself to me, as a peer with a similar level of expertise. TopMind protects only his ignorance and his self-destructive ego by asking for argument when he really wants an explanation. You are not willing to become anyone's student, even briefly, and that is a huge black mark against you in the 'SetTheBozoBit' category. At some point in the past, someone asked you to go fuck yourself; you've been succeeding for the last decade or two, at least.

Why should I believe that you take accurate polls? You are an exaggerator and a drama-queen. Most academic IT papers end up in the Bozo Bit Bucket because they mistake theoretical elegance or a narrow pet concept for general practicality. It's a common mistake in the academic world. Your disconnect from the real world or inability to tie it to the real world will likely lead to the trashing of your theoretical babblings, or worship by a small like-minded cult at the best. They should fire you guys to stop wasting tax-payer dollars and student tuition. Or maybe they feel it's worth it because one in a thousand comes up with a good idea by accident. Every now and then the goat craps gold, but it still leaves a lot of [bleep] around. It's academic Russian-roulette. Software engineering is not (yet) a real discipline outside of machine performance; it's full of alchemists with pseudorigor. "Removing nulls makes my equations simpler, therefore it's better for production. Cubic cows make my farm simulation simpler, therefore cubic cows are better for a real farm." Tripe.

Well, this topic has degenerated into a familiar pattern. Yes, and it's all my fault because I'm bad/evil/stupid blah blah blah...

As far as the value of lab-toys for debate, ideally both sides would be given hundreds of millions of dollars and a big staff to produce production-quality tests. But, that's not going to happen. Lab toys, descriptive scenarios, and theoretical elegance are consolation prizes that we are unfortunately stuck with. We can complain about the limitations of these consolation prizes, but it won't do any good. We have to live with them because we are not billionaires and thus have no real alternatives. And I don't dismiss the value of theoretical elegance, I just don't over-weigh it compared to other factors, unlike you. A lot of our debate is not really me claiming there is no theoretical elegance to your ideas, but rather the weight that should be given to such. You offer no rigor that says TE should be weighed more, only vague chicken-little claims for those who don't heed TE.


It kind of reminds of me the basketball drill where the shooter can make 75% of their shots by standing in the same place and having an assistant keep passing them basket balls to shoot. One thinks, "Wow, why can't they do that in the game itself?! Our team would have the trophy for sure!" But it's different when you are running all over the court at different angles and have to pay attention to lots of other parts of the teams and games, including the defensive player on your ass.


See Also: ExampleSizeIssues, FeatureInteraction, TooManyVariablesForScience
CategoryExample, CategoryEvidence, CategoryTesting

EditText of this page (last edited August 26, 2014) or FindPage with title or text search