Distribution Is Optimization

Original statement (false):

Distribution is an optimization technique: you distribute a program across multiple computers because you want it to scale, or you want the user interface to be responsive, or for some similar reason.

Therefore, the decision to distribute should be made according to the RulesOfOptimization, and never up front.

Revised statement: (controversial)

Distribution is sometimes an optimization technique. Examples include scalability and responsiveness. When distributing for these reasons, the decision should be made according to the RulesOfOptimization, and never up front.

Distillation of discussion:

Distribution is not just an optimization technique. There are legitimate needs for distribution. See below for examples.

When distribution is done as an optimization, the traditional approach is to design this into the architecture up front. The DistributionStories page includes anecdotes about this approach succeeding and failing. No one has yet contributed a story about distributing an application after first developing it under a non-distributed architecture.

See also:

Arguments in favor:

Arguments against:

This example does not require distribution, but only a client-server network. Distribution is something different. This example could also be used to describe dumb terminals connected to a time-sharing mainframe.

For me, a "client-server system" is a distributed system. The server logic is centralized, but the system as a whole (which includes the clients) is distributed. Your mainframe example is also a distributed system, no matter how dumb the clients are. And anyway, your example is extreme. Most client-server systems have at least some small intelligence on their clients. And if the client can talk to the server via a protocol, it's already intelligent enough for me. Hmm... maybe centralization is just an abstraction: everything, if inspected closely enough, is a communicant, distributed system?

Again, this is not an argument for distribution, but for client-server resource sharing.

Interesting points:

They are optimizations. SetiAtHome could've been done by buying a massive supercomputer; instead, they decided to use other resources. You could design ATMs to not be distributed, with one centralized access point that everybody would have to share. (Arguably, this is a function bank teller windows used to fulfill.) In these cases, of course, optimization seems extremely necessary. So perhaps we're dealing with a special kind of optimization that doesn't have to follow the RulesOfOptimization?

SETI could not afford a supercomputer powerful enough to process data as fast as SETI at home. Therefore, the use of distribution was based on economic factors, not optimization factors.

SetiAtHome could also have been accomplished on one workstation; it simply would require millions of years of processing time. Distribution in this case is an optimization - it offers an increase in speed without increasing the core functionality.

A SetiAtHome implementation that can only run on a single machine is not unoptimized - it is broken, since it is simply not a solution. Or, if you have a powerful and expensive machine it is just a solution that is too expensive (for today's machines, at least) and brings nothing new.

ATM networks are not an optimization. Would a bank's customers travel from all over the globe to the single ATM machine to get cash? Of course they wouldn't. Therefore, the distributed nature of ATM machines is a functional requirement of the system, not an optimization. Similarly, a bank's customers want to be able take cash out of ATM machines owned by other banks. Thus the banking network federates the systems owned and managed by different, competing organizations. Banks will not allow their competitors access to their systems, and will not run their business systems on machines owned by another bank. Therefore, distribution is a functional requirement.

I disagree. Many Enterprise Java applications are distributed specifically for "scalability." It's a choice made specifically for performance reasons, not as a result of functional or economical requirements.

Many EJB applications are built because EJB is a trendy buzzword, or because the system needs part of what EJB provides (usually transactional persistence), or to ensure that application code is independent of the underlying database and transaction monitor. Distribution is then a side effect of using EJB not the reason for using EJB.

Okay. I was responding to the page title and argument. It is generally considered, and agrees with my experience, that it is often easier to start from scratch than distribute a mature monolithic system. This depends on the quality of the system, however. I tend to write systems with a view to later distribution which obviously makes it a lot easier. This process can be applied retroactively as an architectural refactoring, but not many shops employ (or even recognize the existence of) coding architects (grumble moan ;)). -- RichardHenderson

I agree that the conventional wisdom is that distribution must be planned in advance. Is this actually true, though? Have you tried to distribute a well-factored single-process system? If so, what was the driving force behind that decision, and how did it turn out? -- JimLittle

I specialize in architectural refactoring of big systems. The forces, among others, can be: performance, contingency, scaling, geographical cohesion. I did it, with some pain. I watched a couple of other groups fail, but then they didn't hire me :). They then tried to write it from scratch and still had problems. The key issue they ignored is physical failure. See DistributedTransactionsAreEvil. -- RichardHenderson

Richard wrote:

I tend to write systems with a view to later distribution which obviously makes it a lot easier.

Can you say more about what that means, and it particular how such a system might differ from a system that was simply well-factored?

What could distribution possibly have to do with factoring?

To my mind, writing systems "with a view to later distribution" could mean one of two things:
  1. You're adding "we might need this hook six months from now" code - which means you're spending time on something the client doesn't need today. YouArentGonnaNeedIt.
  2. You're writing code that does only what it needs to do, but is extremely well-factored, so when the call to distribute code comes down six weeks or months or years from now, the process will be as painless as possible.

Or perhaps I'm missing something in that original sentence.

There are several reasons to distribute a system that have nothing to do with optimization:

I'm sure someone will argue that all of these are optimizations in some way, but they don't match the common definition of "optimization" which is "making something faster than the simplest implementation by using a different algorithm". Connecting diverse systems using the network is simpler than is creating a new monolithic system with all the functionality of the existing systems.

I have worked on several distributed systems over the years, and hope for future scalability is, in my mind, the stupidest reason to go through the difficulties of designing a distributed system. That is often a case of YouArentGonnaNeedIt.

Perhaps, some of the confusion comes from the negative association with optimization one gets from a maxim like OptimizeLater. There are cases in which optimization may be part of what the client asks for, from day one: Nobody in the SETI project needed to work to prove the requirement for optimization-through-distribution.

So optimization isn't always a bad thing. Seems to me that the forces behind OptimizeLater are that:
  1. Programmers need to guard against their understandable impulse to make code faster, because doing so might distract from whatever else might give the customer more value
  2. Technical implementation is hard, and it's often impossible to predict how slow software will be until you've implemented it, and then profiled it.

The first one is negated if the customer asks explicitly for optimization, i.e., optimization is customer value. The second one is negated if you already know enough about the implementation to know you'll need to optimize. For example, the SETI people can say, right at the beginning, "We don't know how long it'll take to compute this all on one workstation, but we know it'll take way too long. Distribute it."

Actually, SetiAtHome is probably a bad example, since one of the goals (and possibly the goal) was to increase awareness, hence funding, for SETI, and you don't do that by running it on your supercomputer. Their main goal was to involve the public and show them some pretty pictures on the computer, which is apparent from the fact that they kept sending the same data blocks for months.


View edit of February 11, 2004 or FindPage with title or text search