Cultured Perl: Genetic Algorithms Applied with Perl
By Teodor Zlatanov2004-11-03
So What is the Genetic Algorithm?
The genetic algorithm is simple enough for anyone to understand, using biological terms taught in high school. Take a population of individuals, each with DNA of their own. Then measure each individual's fitness (measured as a function applied to the individual's DNA) and make the individual more likely to procreate if he is more fit. Extremely unfit individuals are killed off. Every survivor gets a chance to procreate (and it is important that procreation is never denied to a survivor, only made less likely if he is less fit). Merging the parents' DNA, and then applying a random mutation to the merged DNA simulates procreation. The new individual will, in theory, be as fit as the parents, plus or minus a small variance caused by the mutation. The cycle is then repeated.
There are, obviously, many variables that can affect the genetic algorithm, including population size, generations (iterations of the algorithm), merging method, fitness function, how fitness affects procreation chances, and how much mutation happens.
There are some drawbacks to the algorithm, as well. It works best if the fitness function is applied to the DNA as a series of bits. In other words, it's best if the DNA is a series of binary options, yes or no. Blue eyes? Black eyes? Red hair? Black hair? Merging of parent DNA and subsequent mutation should not allow certain combinations of bits, because the resulting DNA will not be a valid solution to the original problem. Remember that the "DNA" is nothing more than a solution to a mathematical fitness formula. Some values used in that formula may be invalid -- for instance, dividing by zero.
In addition, the genetic algorithm is not bound by time. You pick the number of generations. You can define some goal -- for instance, "look for an individual with fitness of 0.99999" and stop there, but then the algorithm may never end because it hasn't found that individual. That can be a problem if you set unrealistic goals or too few generations. Trial and error, and a good feel for the problem, are the best ways around this issue.
The fitness formula should return a floating point number between 0 and 1. Other ranges can be used, but in my experience floating point numbers work best. You may want a range between 0 and 32767, for instance, if you want the fitness to be a 7-bit integer for optimization.
Of course, delaying optimization until you know it's needed is a good idea, so you should at least start with a simple fitness formula. The fitness formula is the most-used function in the genetic algorithm (it will be invoked (population size) x (generations times)), so you should make it as simple and as fast as possible.
There are three "good" ways to exit the genetic algorithm. First, you can decide to exit if there is no variety in the DNA pool anymore. This is a tricky test, actually, and a CPAN module for finding the distance of strings might be useful here, as long as you can represent the DNA as a string. Second, you can exit if you have reached a fitness goal. Unless you have a very good understanding of the fitness formula (in which case, you probably don't need the genetic algorithm anyway), setting a fitness goal can result in either infinite loops, or an individual who is only "good enough." Third, you can exit the algorithm after a set number of iterations, or "generations."
In practice, all three ways (or at least the second and third one) are used to control the genetic algorithm. A few test runs, maybe 10 or 20, will give you a good feel for how quickly the algorithm converges, what kind of fitness you can expect.
Genetic Algorithms FAQ
See the Resources section for the link to Genetic Algorithms FAQ. It also points to suites of genetic algorithm software, both free and commercial. See the Algorithm GA that was obtained from the Genetic Algorithms FAQ.
Tutorial Pages:
» Create Your Own Darwinian Breeding Grounds
» History
» So What is the Genetic Algorithm?
» A Simple Example
» Breeding words
» Conclusion
» Resources
First published by IBM DeveloperWorks
| Related Tutorials: » Random subroutines in Perl » Log Script Use » Creating Perl Modules for Web Sites » Bit Vector, Using Perl Vec » Build a Perl/CGI Voting System » Perl Range Operator |
