# Genetic Algorithm

Genetic Algorithms (GAs) were developed by Prof. JohnHolland and his students at the University of Michigan during the 1960s and 1970s. Essentially, they are a method of "breeding" computer programs and solutions to optimization or search problems by means of simulated evolution. Processes loosely based on natural selection, crossover, and mutation are repeatedly applied to a population of binary strings which represent potential solutions. Over time, the number of above-average individuals increases, and highly-fit building blocks are combined from several fit individuals to find good solutions to the problem at hand.

The Canonical GA (pseudo code):

``` choose initial population
evaluate each individual's fitness
determine population's average fitness
repeat
select best-ranking individuals to reproduce
mate pairs at random
apply crossover operator
apply mutation operator
evaluate each individual's fitness
determine population's average fitness
until terminating condition (e.g. until at least one individual has
the desired fitness or enough generations have passed)

```
This pseudo-code does not hint what the averaging is for.
• To let you know what sort of progress they've made so far, and also sometimes the individual rankings are made against the group average. So it's not necessarily absolutely necessary.
• Perhaps we should present a "bare bones" algorithm and "fancy" one. Another thing that's optional is the variation methods used. Only one of cross-over and mutation ("bit-flipping") is necessary to qualify as a GA. In some experiments, one or the other seems to work well by itself, depending on the optimization topic and other factors. Also, cross-over that allows position transposition and inversion generally approximates direct mutation. In other words, the more "random" crossover is, the less need there is for direct mutation because crossover can potentially provide sufficient randomness by itself.

Ending conditions can be:

• Fixed number of generations reached
• Budgeting: allocated computation time/money used up
• An individual is found that satisfies minimum criteria
• The highest ranking individual's fitness is reaching or has reached a plateu such that successive iterations are not producing better results anymore.
• Manual inspection. May require start-and-stop ability
• Combinations of the above

GAs are sensitive to the mutation and crossover rates, which is unsurprising when you think about an extreme case: a 99% chance of a mutation of each bit on each generation means too little stability; the genome will always be essentially random.

GAs are sensitive to the encoding scheme of its bits; using gray code, for instance, means that adjacent solutions in the search space will be adjacent in the encoding space as well, requiring fewer mutations to discover.

GAs are sensitive to the gradient of the search space curve leading towards solutions, as are almost all search algorithms (except exhaustive search and pure random search); if there's no way to tell when you're getting close to a solution, then there's no way to optimize the search. Thus GAs will not help a naive search for factors of a huge number: you don't know when you're close to the right factors, you only know when you've hit them exactly.

Most importantly of all, GAs are critically dependent on the fitness function. If you can't define the fitness function, then you can't apply a GA. This is obvious in some cases: if you want a GA to create music, but you don't know music theory, then you won't know how to create a fitness function to evaluate the degree of harmony of the results. But it is less obvious in other cases; if you have bugs in your fitness function, then the GA will evolve to "take advantage" of those bugs -- what else could they do?

• Re: "if you want a GA to create music, but you don't know music theory, then you won't know how to create a fitness function..." - Actually one can simply listen to it to judge whether it's "good". However, that's going to be a lot of manual work. "Art" evolution has been done on the Internet by allowing passer-by's to judge the art. The best-voted variations were then mutated into multiple candidates for another round of judging. If those scored lower than their parent, they were not displayed as often for votes. Thus, they were competing for attention and votes. It was pretty interesting. 60's hippies would have dug it.

I discussed this in my talk on "The Evolution of Dishonesty" in regard to my 1986 work on breeding Tic Tac Toe colonies. Most people are surprised to hear that it didn't matter whether the underlying mechanism implemented simulated CPUs with branching or whether they were neural nets trained by GA evolution of weights. The evolution results and rate of convergence were essentially identical. -- DougMerritt

More material on GAs:

I did transport scheduling (an NpHard problem) using GAs back in 1993, and it worked at least as well as the alternatives. It didn't perform a whole lot better than random generation of solutions, but that may have been that we didn't apply the right optimizations. -- PeteBevin

Confusion over the terms GeneticAlgorithm and GeneticProgramming:

A genetic algorithm is simply the algorithm used to simulate evolution. It takes candidate solutions, selects some of the best using user-defined evaluation functions, applies user-defined transformations (often called mutation and crossover, but implementations of these depend on the problem), and makes new candidate solutions.

JohnHolland's 'classical' genetic algorithm specifically used raw bitstrings as the candidate solutions, a problem-specific evaluation function depending on the problem, and simple bit-flip mutations and bitstring swapping for cross-over. This is just one type of genetic algorithm. See Faulkenauer's GeneticAlgorithmsAndGroupingProblems for a good introduction to the differences between implementations.

GeneticProgramming is yet another example of the genetic algorithm. It uses instruction trees as candidate solutions, problem-specific evaluation functions, and sometimes complicated tree pruning, swapping, and random branch generation (among other methods) to generate new candidate solutions. Aside from the user-specific strategies (see StrategyPattern), the main algorithm is the same.

So, whether the candidate solutions are 'data structures' or 'programs', it doesn't matter, they all can be manipulated by a genetic algorithm.

There's a thriving business in crafting EvolutionaryAlgorithms so that their performance improves relative to random search -- I've done it myself for years as a consultant. In essence, GAs and other EvolutionaryAlgorithms involve an intricate interplay between the problem you're trying to solve, its representation, the search operators you choose, and the performance measure you choose (like "how many individual evaluations of different potential solutions", or "how much wall clock time to improve on the starting guesses").

The whole process comes down to MetaOptimization, which I suppose is my coinage of about 10 years back. -- BillTozier

For those interested in learning more about GeneticAlgorithms and EvolutionaryComputation? in general, a great place to start is http://gnu.egr.msu.edu/pub/EC/ [BrokenLink].

WikiGnomes: that link referred to "Encore" a.k.a. "The Hitch-hiker's Guide To Evolutionary Computation". It appears that this originated at alife.santefe.edu, and was once very widely mirrored, but now is gone, nor did I see it at alife.org. I only found one copy: http://www.cs.bham.ac.uk/Mirrors/ftp.de.uu.net/EC/clife/ Anyone know of some other mirrors to help prevent bitrot? (Or do you want to mirror it?)

One of the things I like most about GeneticAlgorithms is that if you decide some aspect of the solution is more important than you originally thought, you can just change the scoring function to weight it higher (or add it).

Why is this specific to genetic algorithms? You can do this with any form of optimization, no?

Because the population at any iteration includes a wide variety of solution-fragments. When you change the scoring function, a new solution may arise from the previously less-optimal individuals. In a traditional optimization algorithm with only one or a few candidate solutions, no candidate will be near the new goal and you may have trouble escaping a local maximum.

In other words: Using an evolutionary approach to optimization is a good idea in some DynamicOptimization? problems (which is what the previous writer was getting at). The lion's share of GA applications (and theory), though, applies them to static problems, where the goals, constraints, and the problem definition itself are fixed at the start. That is, OfflineOptimization?. -- BillTozier

The primary problem with GAs is that they require that the entire search-space be a valid input to the evaluation function. Most physical problems are constrained. The result is a lot of culled mutoids that could never live, making it very inefficient. Worse still, genetic algorithms can severely distort the search space. This is a good thing for getting a fast convergence, but prevents GAs from being general. Simulated annealing (and SimulatedQuenching) do not suffer from this. Hybrid functions try to blend the generality of stochastic techniques with the very fast convergence of GAs. -- RichardHenderson.

This is not true (there is no such requirement of GAs). An important concern is to choose appropriate transformations to supply the GA with new candidate solutions. See Faulkenauer in his GeneticAlgorithmsAndGroupingProblems book ISBN 0471971502 . This essentially means that GAs are not a panacea and you can't just toss a GA at a problem willy nilly without thinking. Which is just another way of saying the NoFreeLunchTheorem. All MetaHeuristics (GAs, SimulatedAnnealing, TabuSearch, etc.) suffer from this limitation, so to say it of GAs, as if it made them worse than other techniques, is misleading.

I think I understand you. On the first point, it is possible to transform the set of inputs appropriately, but this is not in any way simple. Annealing techniques tend to directly model the network equations and constraints. This is a huge advantage. Furthermore GAs are weak heuristics. They cannot guarantee to search the entire parameter space. What they can do is integrate a huge number of variables, and converge quickly to an entire set of good solutions. For small to medium scale models, however, particularly homogenous [homogeneous?] ones, my experience is that simulated annealing has superior characteristics. I'm sure this is entirely problem dependent. I must now see what is this thing 'TabuSearch' :). -- RichardHenderson

The interest in GA (this applies to NeuralNetworks also, that other great comp sci fad of modern times) comes from capitalists wanting to reduce their dependence on labour by replacing skilled programmers with raw processing power. Additionally, from corporations wanting to reduce their product liability by blaming the inevitable errors on the intrinsic unknowability and unpredictability of GA. OTOH, if programmers produce bugs then the corporation could be liable for failing to use code review and formal methods (corporations live in abject fear that one day consumers and citizens will cease to tolerate bugs).

I've worked for a company that produces a GA. We use it because the problem domain under consideration is very large and no precise solution is applicable in all cases, so this is available as a reasonable back-up. So your invective is inaccurate.

I don't understand your point. Can you explain why a GA was used instead of an algorithm programmed by a beta team?

We have some extra algorithms programmed in. The GA is for when these aren't that applicable. Optimizing a non-linear function over a constrained binary space is hard.

Not only capitalists want to reduce dependence human labour - I'm an anarchist/socialist/utopist and I want to reduce dependence on unnecessary labour as well, so we humans can devote our time to art and philosophy.

The only reason GA produces anything remotely interesting is because silicon is thousands of times faster than flesh. So why is it that computer science people waste their time on GAs instead of AI? Ultimately, even GA enthusiasts admit that GA produces substandard solutions when you have fixed expectations. Since this applies to everything but pure research, one wonders why GA isn't dismissed as irrelevant mathematics instead of being taught by comp sci departments. The only aspect under which GA (and neural nets) are useful is as simulations useful in developing principles to help us understand already existing phenomena. That is, sandboxes in which we can study evolution and cognitive science.

This view is both narrow and wrong, and it will only become more wrong as time goes on. GeneticAlgorithms are a type of AI. For one example off the top of my head, one branch of GAs, called GeneticProgramming, has been used to develop useful patents which have never before existed, and which passed all criteria required for a patent (i.e. non-obvious, etc.).

The following comments go wrong in using a wrong definition of AI (look at the algorithms presented in any introductory AI text). They become correct or at least arguably correct in reference to "strong AI". I have therefore inserted "strong AI" below:

GA seem like strong AI only if your level of abstraction is sufficiently low. If you're trying to evolve a complex program by doing random CopyAndPasteProgramming on low-level pieces of code, then the idiocy (literal inability to learn from past experience) of GA becomes apparent.

Ultimately, GA can never be strong AI because a strong AI must be able to construct abstractions, evaluate the completeness of those abstractions and decide when to work at a higher level of abstraction. GAs do the first, maybe, but they cannot do the second and third so cannot switch over to a higher level of abstraction when appropriate. GAs do not encapsulate solutions so as to be reused systematically. Unless of course, you program a strong AI that happens to use GA instead of neural nets at its base. That would not make GA a type of strong AI any more than it makes neural nets a type of strong AI.

The idea that GA is strong AI comes from the meme theory of mind, a theory which explains nothing new and only serves to deny the cognitive functions of the mind. See MemesShmemes.

Also, automated theorem provers have no doubt constructed novel and elegant mathematical theorems which humans had never thought of before. Nobody would call them strong AI on that basis.

I use the pragmatic definition of AI, which is verifiable. The cognitive definition will always be up for debate, and that belongs on another page. The basis for calling GA AI: If you took a course on AI and it didn't talk about GAs, I'd say you got ripped off.

Note that e.g. theorem provers are a canonical example of weak AI, whereas there has never been even one instance of a strong AI (and many/most people are still arguing that strong AI is impossible).

Given how easy for the human mind to interpret real-life, organic evolution as design, I see no reason to dismiss the classification offhandedly of GAs as a form of AI.

Sometimes, I feel people are missing the point. I was originally fascinated by genetic algorithms for their simple appeal of beng able generally to solve any problem no matter whether an algorithmic solution already existed for it. (I was breeding NeuralNetworks long before I learned about BackPropagation?.) But to confuse it with GeneticProgramming is a grave mistake. I've implemented GeneticProgramming several times. I'm ready to admit that you can happily throw a lot more processing power at it, making it more useful. But its solution domain is quite distinct from that of GAs. Firstly GAs usually operate on fixed structures, making the design of a GA very problem-specific. GP, on the other hand, can change the size of its solution (you can even include brevity in your fitness measure), and so it can adapt. If your original guess at the complexity of a solution was completely wrong, the GP programs will simply grow until a solution becomes viable. Furthermore, it's more general than any other solution-finding mechanism I know of. It's not AI, I'll readily admit, but it can solve jsut about any problem for which you can define a UnitTest. In fact, I'd guess you could create a general GP system that spits out solutions to UnitTests. Only that we don't have the CPU power currently to do anything but offline evolution.

Consider a modern GeforceFX chip with 8 pipelines of 64k instructions each. If we remove all the fancy memory bandwidth stuff and set up 512 or better 1024 pipelines instead, then we're getting into a realm where for, say, a robot control program, we can each second evolve a new program from what we learned the previous second. Society of Minds if anybody has read that.

Please bring out your pros and contras here (especially the hard core statisticians ;). -- SvenNeumann

AI is a term that sort of fell from favor when it became apparent how cheeky it was to call something artificial that we could not define in the first place. It broke down into the subdisciplines of robotics, machine vision, A-life, neural nets, GP, etc.

GeneticProgramming is an application of the GeneticAlgorithm using a linear array or parse tree of code as the chromosome instead of an encoded string or similar representation.

Sven, you are a man of vision: "...but it can solve jsut about any problem for which you can define a UnitTest. In fact, I'd guess you could create a general GP system that spits out solutions to UnitTests. Only that we don't have the CPU power currently to do anything but offline evolution."

WardCunningham and I have collaborated to create such a system. The concept is called ExtremeGeneticProgramming and involves using the UnitTestAsFitnessFunction. I believe it is the purest form of TestFirstDesign. Self-directed teams shepherd the evolution of software by PairProgramming unit tests using an integrated version of the FrameworkForIntegratedTest. The GP application, called Neovolve, then uses the tests to converge on a solution.

Luckily, GP lends itself to parallelizing. Using the PowerKernel? Java parallel computing framework has allowed me to use standard PC's on regular networks to scale computational power to solve GeneticProgramming problems.

Results can be found on the FitWiki at http://fit.c2.com and at http://www.neocoretechs.com

-- JonGroff

Aah, an excellent example of GreenspunsTenthRuleOfProgramming.

How's that again? If I'm not mistaken, they've got a very formally specified, unbuggy implementation of something that's not really common lisp... and it's being used for a 'higher' purpose (i.e., that which makes it apply to Greenspuns is the starting point, not the finish point of the software).

The Greenspun comment does not apply. XGP bugs are optimized out from the beginning rather than being added in later such as the code you are probably most familiar with.