Genetic Algorithms

How biology triumphs over brute force

ISA Manipal
May 7, 2021

A genetic algorithm (GA) is a search-based optimization technique based on the principles of genetics and natural selection. Genetic algorithms are a subset of a much larger branch of computation known as evolutionary computation. They are used to efficiently find near-optimal solutions to NP-hard problems (problems that take impractically long to solve exactly, even on supercomputers). GAs replicate the process of natural selection, where the fittest individuals are selected for reproduction to produce the offspring of the next generation.

How it Works

A GA begins with a set of candidate solutions known as the population. Each solution is scored on how well it fits the problem and assigned a fitness value by a fitness function. The better-fitting solutions undergo recombination and mutation (as in natural genetics), producing novel solutions. The process repeats over generations until a stopping criterion is reached.

Flowchart representing Genetic Algorithm.

Initialisation

The first step in solving a problem with a genetic algorithm is to define a population: a set of candidate solutions for the current generation. The most effective way to initialize a population is random generation. Seeding the population with a heuristic may improve the next generation's results, but it reduces diversity: the generated solutions end up remarkably similar to the ones we already had. This leads to a condition known as premature convergence, where the algorithm settles at a local optimum instead of reaching the global optimum due to a lack of diversity in the population.
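As a concrete sketch, random initialization for a binary-encoded population might look like the following (the chromosome length and population size are illustrative values, not from the article):

```python
import random

CHROMOSOME_LENGTH = 10   # illustrative; depends on how the problem is encoded
POPULATION_SIZE = 20     # illustrative

def random_individual(length=CHROMOSOME_LENGTH):
    """One candidate solution: a random bit string."""
    return [random.randint(0, 1) for _ in range(length)]

def init_population(size=POPULATION_SIZE):
    """Random initialization keeps diversity high,
    guarding against premature convergence."""
    return [random_individual() for _ in range(size)]

population = init_population()
```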

Further, there are two classification models for a population: the Steady State Model and the Generational Model. In the steady state model, only a few individuals are replaced in each iteration. In contrast, in the generational model, the entire population is replaced by new individuals in the next iteration.

Fitness Function

A fitness function takes an individual from the population as input and returns a value that indicates how good it is as a solution relative to the other individuals. The fitness function is the most critical part of a genetic algorithm and must be chosen carefully: it should return a quantitative value that allows solutions to be compared, and it should not be overly complex or slow to compute. Otherwise it will hamper the overall efficiency of the algorithm, since the fitness of every individual is evaluated in each iteration.
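For illustration, here is a deliberately simple fitness function for a binary chromosome, the classic "OneMax" objective (an assumption for demonstration, not a function from the article). It is quantitative and cheap to evaluate, exactly the properties described above:

```python
def fitness(individual):
    """OneMax fitness: the number of 1-bits in the chromosome.
    Returns a single number, so any two solutions are directly
    comparable, and it costs O(n) per evaluation."""
    return sum(individual)

print(fitness([1, 1, 0, 1]))  # 3
```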

Selection

Parent selection is the process of choosing the parents that mate and recombine to create offspring for the next generation. Good parent selection helps the algorithm converge faster, since fitter parents tend to produce fitter offspring. However, it is important to maintain diversity, or selection leads to premature convergence.

Pie chart for Roulette Selection.

One of the most common ways to select parents is the fitness-proportionate selection method. Each parent is assigned a probability of selection based on its fitness score, so fitter solutions are more likely to be chosen while diversity in the population is still maintained. A straightforward way to visualize this method is a pie chart divided into n slices (for n individuals), where each individual gets a slice proportional to its fitness score. As on a roulette wheel, there is a fixed point, and whichever slice the wheel stops on, that individual is selected as a parent. An adaptation of this method is stochastic universal sampling, where multiple evenly spaced fixed points on the wheel choose all the parents at once.

There are other techniques used in specific cases, but they are all based on the same principle: higher fitness means a higher probability of being selected. For example, the rank selection method is used when the fitness scores of all the individuals are close to each other; the probability of selection is then based on rank instead.
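Both methods can be sketched in a few lines. The function names and the use of Python's `random.choices` (which accepts per-item weights) are illustrative choices, not from the article:

```python
import random

def roulette_select(population, fitnesses, k=2):
    """Fitness-proportionate (roulette wheel) selection: each
    individual's slice of the wheel is proportional to its fitness."""
    return random.choices(population, weights=fitnesses, k=k)

def rank_select(population, fitnesses, k=2):
    """Rank selection: weights come from rank order, useful when
    raw fitness values are nearly equal."""
    order = sorted(range(len(population)), key=lambda i: fitnesses[i])
    ranks = [0] * len(population)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank          # worst individual gets rank 1
    return random.choices(population, weights=ranks, k=k)
```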

Crossover

In a genetic algorithm, crossover, also known as recombination, is a genetic operator used to combine the genetic information of two parents to produce an offspring. It is a way to create novel solutions in a population stochastically.

There are multiple ways to do a crossover:

Single point crossover: a random crossover point is selected, and the tails of the two parents are swapped to generate novel solutions.

Single point crossover

Multiple point crossover: like single point crossover, but with several crossover points; alternate segments are swapped to generate offspring.

Multiple point crossover

Uniform crossover: each gene is treated as a separate segment, and whether it is swapped is decided by a coin flip (a 50% chance per gene).
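Two of these operators sketched on list-encoded chromosomes (the function names and encoding are assumptions for illustration):

```python
import random

def single_point_crossover(p1, p2):
    """Swap the tails of the two parents after a random cut point."""
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def uniform_crossover(p1, p2):
    """Each gene independently comes from either parent
    with a 50% chance (a coin flip per gene)."""
    c1, c2 = [], []
    for g1, g2 in zip(p1, p2):
        if random.random() < 0.5:
            g1, g2 = g2, g1      # swap this gene between the offspring
        c1.append(g1)
        c2.append(g2)
    return c1, c2
```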

Mutation

Mutation is the process that helps maintain the diversity of the solutions in the population. It is a random tweak to a chromosome that creates a new individual. The probability of mutation is kept low (~0.05) so that the search does not degenerate into a random walk.

There are multiple mutation operators:

  1. Bit flip mutation: in a binary-encoded genetic algorithm (every gene is a 1 or a 0), a randomly selected bit is flipped from 1 to 0 or vice versa.
  2. Random resetting: a more general form of bit flip mutation that works on non-binary GAs; a random gene is selected and assigned a new value from the set of permissible values.
  3. Swap mutation: two genes are selected at random from a chromosome (individual), and their values are swapped.
  4. Scramble mutation: a subset of the chromosome (individual) is selected and shuffled randomly to generate a new individual.
  5. Inversion mutation: like scramble mutation, a subset is chosen, but instead of shuffling it, the operator reverses the order of the genes in the subset.
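Two of these operators sketched for list-encoded chromosomes (the function names and the default rate are illustrative; 0.05 matches the typical value mentioned above):

```python
import random

def bit_flip_mutation(individual, rate=0.05):
    """Flip each bit independently with a small probability."""
    return [1 - g if random.random() < rate else g for g in individual]

def swap_mutation(individual):
    """Exchange the values of two randomly chosen genes."""
    mutated = individual[:]
    i, j = random.sample(range(len(mutated)), 2)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated
```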

Termination

After mutation, one full cycle of producing a new population is complete, and it is time either to start the process again or to terminate and output the result. Termination typically happens either when the solution has shown no improvement for n iterations or when a fixed number of iterations has been completed. Choosing a sensible termination condition matters, as it returns results quickly with good enough accuracy.
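The two stopping criteria can be combined in a driver loop like the following sketch (`run_ga`, `evolve_one_generation`, and the `patience` parameter are illustrative names, not from the article):

```python
def run_ga(evolve_one_generation, initial_population, fitness,
           max_generations=100, patience=10):
    """Stop after max_generations, or earlier if the best fitness
    has not improved for `patience` consecutive generations."""
    population = initial_population
    best = max(fitness(ind) for ind in population)
    stale = 0
    for _ in range(max_generations):
        population = evolve_one_generation(population)
        current_best = max(fitness(ind) for ind in population)
        if current_best > best:
            best, stale = current_best, 0
        else:
            stale += 1
            if stale >= patience:
                break            # no improvement for `patience` generations
    return population, best
```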

The Knapsack Problem — an Example

The knapsack problem is one of the classic applications of genetic algorithms. The goal is to select items for the knapsack that maximize total profit while not exceeding the knapsack's capacity.

Let’s start with 5 gold items, each with a different weight and value. To find the most profitable combination, we could try out every combination and compare the totals.

We can assign all the items a value of 1 if they are in the knapsack and 0 if they are not in the knapsack.

As each item can either be in the knapsack or not, the total number of combinations is 2^5 = 32.

This is easy for a computer when the number of items is small, but as the number of items grows, the number of combinations, and therefore the time taken, increases exponentially.
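The brute-force approach can be sketched directly. The item weights, values, and capacity below are made-up illustrative numbers (the article does not specify them):

```python
from itertools import product

# Illustrative 5-item instance (not from the article).
weights = [3, 4, 5, 8, 10]
values = [4, 5, 6, 10, 12]
capacity = 12

best_value, best_combo = 0, None
for combo in product([0, 1], repeat=5):   # all 2**5 = 32 combinations
    w = sum(wi for wi, c in zip(weights, combo) if c)
    v = sum(vi for vi, c in zip(values, combo) if c)
    if w <= capacity and v > best_value:
        best_value, best_combo = v, combo

print(best_combo, best_value)
```

For these numbers the optimum value is 15; the point is that `product` must enumerate 2^n combinations, which is exactly what a GA avoids.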

So, we approach this problem using genetic algorithms.

Step 1: In the first step, we randomly generate a population of viable solutions

Step 2: We use a fitness function to determine which solutions are better. In our case, a solution is better if the total value of its gold items is higher and its total weight does not exceed the capacity of the knapsack.

Step 3: We now select the parents for the next generation of solutions. Using the roulette method, let's say we get 00110 and 11011 as our parents.

Step 4: Now, we crossover the parents to generate our offspring.

Let’s have our crossover happen at position 3: the parents 001|10 and 110|11 swap tails, producing the offspring 00111 and 11010.

Step 5: There can also be random mutation in any gene of the offspring.

Step 6: Let's say we terminate our program after 10 iterations, so the process runs 9 more times, after which the fittest individual in the population is returned.
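Putting the six steps together, here is an end-to-end sketch of the GA for a knapsack instance. The item weights and values, capacity, and population size are illustrative assumptions; the 10-iteration limit and the ~0.05 mutation rate come from the text above:

```python
import random

# Illustrative 5-item instance (not from the article).
WEIGHTS = [3, 4, 5, 8, 10]
VALUES = [4, 5, 6, 10, 12]
CAPACITY = 12
POP_SIZE = 8
GENERATIONS = 10
MUTATION_RATE = 0.05

def fitness(ind):
    """Total value if within capacity, otherwise 0 (infeasible)."""
    w = sum(wi for wi, g in zip(WEIGHTS, ind) if g)
    v = sum(vi for vi, g in zip(VALUES, ind) if g)
    return v if w <= CAPACITY else 0

def roulette(population, k):
    weights = [fitness(ind) for ind in population]
    if sum(weights) == 0:                    # all infeasible: fall back to uniform
        return random.choices(population, k=k)
    return random.choices(population, weights=weights, k=k)

def crossover(p1, p2):
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(ind):
    return [1 - g if random.random() < MUTATION_RATE else g for g in ind]

# Step 1: random initial population
population = [[random.randint(0, 1) for _ in WEIGHTS] for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):                 # Step 6: fixed iteration count
    next_gen = []
    while len(next_gen) < POP_SIZE:
        p1, p2 = roulette(population, 2)     # Steps 2-3: fitness + selection
        c1, c2 = crossover(p1, p2)           # Step 4: single point crossover
        next_gen += [mutate(c1), mutate(c2)] # Step 5: bit flip mutation
    population = next_gen[:POP_SIZE]

best = max(population, key=fitness)
print(best, fitness(best))
```

Because every step after initialization is stochastic, the answer can vary between runs; for a tiny instance like this it usually finds the optimum well before exhausting the 2^n search space.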

Applications

Genetic algorithms are used where a result is needed quickly and a near-optimal answer is good enough. For example, the travelling salesman problem appears in real life when a GPS needs to find an efficient route from one place to another within seconds: no one will wait 10 minutes for the program to compute the provably optimal route, and a route close to optimal is enough for the average user. GAs are also beneficial for the knapsack problem, where the search space grows exponentially with the number of items to be considered.

Written by Siddhant Gupta

Written by ISA Manipal

The Official Student Section of the International Society of Automation at the Manipal Institute of Technology, Manipal.
