A Genetic Algorithm Function Optimizer in C++

Introduction

This post is an example of how a genetic algorithm can be applied to optimize standard mathematical functions, such as the Rosenbrock function (image obtained from the Wikipedia page).

The Rosenbrock function, introduced by Howard H. Rosenbrock in 1960, is a non-convex function used to test the performance of optimization algorithms. It is defined by:

f(x,y) = (1-x)^2 + 100(y-x^2)^2

As shown in the diagram, the global minimum lies inside a long, narrow, parabolic-shaped flat valley.

Locating the valley itself is fairly straightforward; converging to the global minimum, located at (x,y) = (1,1) where f(1,1) = 0, is trickier.

The Genetic Algorithm

This software employs the standard genetic operators of crossover, mutation and selection, applied to chromosome representations of floating-point numbers.

In this application of the genetic algorithm, the IEEE 754 standard is used to represent floating-point numbers as binary arrays.

In short, the first bit represents the sign (0 = plus, 1 = minus); the following 8 bits store the biased exponent (bias = 127); and the remaining 23 bits represent the fractional part (the mantissa).
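
To make the encoding concrete, here is a minimal standalone sketch (not part of the downloadable code) of unpacking a float into such a 32-bit array, assuming the platform stores floats in IEEE 754 single precision:

#include <cstdint>
#include <cstring>
#include <iostream>

// Sketch only: unpack a float into 32 bits laid out as
// [sign | 8-bit biased exponent | 23-bit mantissa].
// Assumes the platform uses IEEE 754 single precision.
void FloatToBits( float value, unsigned char bits[32] )
{
	std::uint32_t raw = 0;
	std::memcpy( &raw, &value, sizeof(raw) );	// reinterpret the float's bytes

	for ( int i = 0; i < 32; ++i )
	{
		// bits[0] = sign, bits[1..8] = exponent, bits[9..31] = mantissa
		bits[i] = static_cast<unsigned char>( ( raw >> ( 31 - i ) ) & 1u );
	}
}

int main()
{
	unsigned char bits[32];
	FloatToBits( -1.5f, bits );	// expected: 1 01111111 10000000000000000000000

	for ( int i = 0; i < 32; ++i )
		std::cout << static_cast<int>( bits[i] );
	std::cout << std::endl;

	return 0;
}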

The standard genetic operators crossover and mutation are used to generate new, possibly fitter chromosomes. The crossover operator works by exchanging portions of bit strings between pairs of arbitrarily selected ‘parent’ chromosomes to produce new ‘child’ chromosomes. This is illustrated by the diagram below, showing how the portions of the bit string between the crossover points in ‘parent’ chromosomes P1 and P2 are exchanged to produce ‘child’ chromosomes C1 and C2:

crossover
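
For illustration only, single-point crossover over plain bit arrays might look like the following sketch (the vector-of-bits representation here is an assumption, not the downloadable code):

#include <utility>
#include <vector>

// Sketch of single-point crossover: the bits from 'point' onwards are
// swapped between the two parents, turning them into the two children.
void SinglePointCrossover( std::vector<unsigned char>& p1,
                           std::vector<unsigned char>& p2,
                           std::size_t point )
{
	for ( std::size_t i = point; i < p1.size() && i < p2.size(); ++i )
	{
		std::swap( p1[i], p2[i] );	// exchange the tail portions of the bit strings
	}
}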

The mutation operator works by arbitrarily selecting a single chromosome in the population and flipping a randomly selected number of bits to produce a modified version of the chromosome. This is illustrated by the diagram below, showing how individual bits are modified to produce the new chromosome:

mutation
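
Again purely as an illustration (using std::rand for brevity), bit-flip mutation can be sketched as:

#include <cstdlib>
#include <vector>

// Sketch of mutation: flip 'numBits' randomly chosen bits of the chromosome.
void MutateBits( std::vector<unsigned char>& chromosome, int numBits )
{
	for ( int i = 0; i < numBits; ++i )
	{
		std::size_t pos = static_cast<std::size_t>( std::rand() ) % chromosome.size();
		chromosome[pos] ^= 1u;	// flip the selected bit
	}
}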

The downloadable C++ implementation should be reasonably self-documenting and self-explanatory.

The code

The 32-bit binary encodings are implemented using the Chromosome class, which houses the binary values and performs other simple get/set tasks:

class Chromosome
{
public:
	Chromosome(const int& size);
	~Chromosome(void);

	void SetChromosome( const int& index, const unsigned char& value );
	unsigned char GetChromosome( const int& index );
	void SetFitness( const double& value );
	double GetFitness() const;
	int size() const;
	void Print( const int& index ) const;

private:

	std::unique_ptr<int[]> chr;
	double fitness;
	int chrSize;
};
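
A possible implementation of the main members, given the declaration above, might look like this (a sketch rather than the actual downloadable code):

// Sketch: the bits are held in the heap array 'chr'; each element stores 0 or 1.
Chromosome::Chromosome( const int& size )
	: chr( new int[size] ), fitness( 0.0 ), chrSize( size )
{
}

void Chromosome::SetChromosome( const int& index, const unsigned char& value )
{
	chr[index] = value;
}

unsigned char Chromosome::GetChromosome( const int& index )
{
	return static_cast<unsigned char>( chr[index] );
}

void Chromosome::SetFitness( const double& value )
{
	fitness = value;
}

double Chromosome::GetFitness() const
{
	return fitness;
}

int Chromosome::size() const
{
	return chrSize;
}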

The Population class maintains a population of Chromosome objects and applies the standard genetic operators crossover, mutation and selection to them. It also generates an initial population of Chromosomes with random values and provides helper functions for converting between the binary and decimal representations of the bit strings:

class Population
{
public:
	Population(void);
	~Population(void);

	void SetChromosomeSize( const int& size );
	void CreateRandomPopulation( const int& size );	
	void Crossover( const int& index1,
		            const int& index2,
					const int& point );
	void Crossover( const int& index1,
		            const int& index2,
					const int& point1,
					const int& point2 );

	void Mutation( const int& index );
	double EvaluatePopulation();
	double CalcChromosomeFitness( const int& index,
								  float& xv,
								  float& yv);

	double GetChromosomeFitness( const int& index ) const;
	void CopyChromosome( const int& source, const int& dest );

private:
	Chromosome* CreateRandomChromosome();	
	std::string GetXstring( Chromosome* chr );
	std::string GetYstring( Chromosome* chr );
	float GetFloat32( std::string Binary );
	int Binary2Hex( std::string Binary );
	double CalculateFitnessFunction( const float& x, 
		                             const float& y );
private:
	std::vector< Chromosome* > pop;
	int chrSize;
};
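
As an example of the binary-to-decimal helpers, GetFloat32 could be implemented along the lines of the sketch below, which reinterprets the 32-character bit string as an IEEE 754 single-precision value (this is an assumption about the implementation, not a copy of it; it requires <bitset>, <cstdint> and <cstring>):

// Sketch: interpret a string of 32 '0'/'1' characters, most significant
// bit first, as an IEEE 754 single-precision float.
float Population::GetFloat32( std::string Binary )
{
	std::bitset<32> bits( Binary );
	std::uint32_t raw = static_cast<std::uint32_t>( bits.to_ulong() );

	float value = 0.0f;
	std::memcpy( &value, &raw, sizeof(value) );	// reinterpret the bit pattern
	return value;
}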

The next class of interest is the GeneticAlgorithm class, which should be fairly self-explanatory: it runs the genetic algorithm on the population of chromosomes, with the intention of progressively improving the overall fitness over successive iterations:

class GeneticAlgorithm
{
public:
	GeneticAlgorithm(void);
	~GeneticAlgorithm(void);
	void Initialize( const int& crate,
					 const int& mrate,
					 const int& psize,
					 const int& iter,
					 const int& csize,
					 const int& tsize,
					 const std::string& path );	
	void Run();

private:

	void CreatePopulation();
	double Evaluate();
	void Crossover();
	void Mutate();
	void Select();
	void SetParameters( const int& crate,
		                const int& mrate,
						const int& psize,
						const int& iter,
						const int& csize,
						const int& tsize );
	void LogResult( const double& result, 
		            const int& iter,
		            const int& count );

private:
	int mutationRate;
	int crossoverRate;
	int populationSize;
	int numberIterations;
	int chromosomeSize;
	int tournamentSize;
	int bestFitnessIndex;
	double bestFitness;
	Population pop;
	Log log;
};
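
In outline, Run() follows the usual generational loop; a simplified sketch (paraphrasing rather than reproducing the downloadable code, with illustrative LogResult arguments) is:

// Simplified sketch of the main loop: evaluate, breed, mutate, select
// and log the best fitness on each iteration.
void GeneticAlgorithm::Run()
{
	CreatePopulation();	// random initial chromosomes

	for ( int i = 0; i < numberIterations; ++i )
	{
		bestFitness = Evaluate();	// score every chromosome
		Crossover();	// breed new child chromosomes
		Mutate();	// flip a few randomly chosen bits
		Select();	// tournament selection for the next generation

		LogResult( bestFitness, i, i );	// arguments here are illustrative
	}
}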

Experimental Results

I have run a number of simple experiments to see how well the algorithm converges over 10,000 iterations. For the first experiment, using the IEEE 754 bit string representation for floating-point numbers, the algorithm parameters used were as follows:

crossover_rate = 50;
mutation_rate = 5;
population_size = 100;
tournament_size = population_size / 4;

Binary tournament selection is the preferred ‘survival of the fittest’ mechanism. It works by picking arbitrary pairs of chromosomes from the population and retaining the fitter of each pair for inclusion in the new generation.
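
Reduced to its essentials, binary tournament selection can be sketched as follows (illustrative only; the real Select() works directly on the Population, and here lower fitness is better because the objective is being minimized):

#include <cstdlib>
#include <vector>

// Sketch of binary tournament selection: pick two chromosomes at random
// and return the index of the fitter (lower-valued) one.
int BinaryTournament( const std::vector<double>& fitness )
{
	int a = std::rand() % static_cast<int>( fitness.size() );
	int b = std::rand() % static_cast<int>( fitness.size() );

	return ( fitness[a] < fitness[b] ) ? a : b;
}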

The code has a simple logger enabling you to dump results to a text file, if required. In this example I had it return the best fitness found so far at every generation.

Here is a graph showing the best value found at every 10th iteration:

The global minimum for this particular function is 0.0; in this run the algorithm managed to get it down to 0.02. As you can see, no further reduction takes place for the next 800 or so iterations. Feel free to try out this code and improve upon it, or apply it to your own area of interest.

Example usage

Here is an example of the main.cpp usage, showing the parameters used to obtain the results in the console output below.

// GAfunction.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include "Log.h"
#include "GeneticAlgorithm.h"

const int encoding_type   = 1;
const int crossover_rate  = 50;
const int mutation_rate   = 2;
const int population_size = 500;
const int number_iterations = 30000;
const int chromosome_size = 32;
const int tournament_size = population_size / 10;
const int precision = 6;
const int epoch = 50;


int _tmain(int argc, _TCHAR* argv[])
{
	// Run the GA!
	GeneticAlgorithm ga;	

	// Set the function type and hence the constraints
	Constraint constraint = Constraint::Rosenbrock;

	ga.Initialize( epoch,
		           precision,
				   encoding_type,
		           crossover_rate, 
		           mutation_rate, 
				   population_size,
				   number_iterations,
				   chromosome_size,
				   tournament_size,
				   "C:\\dump\\best.txt",
				   constraint );
	ga.Run();

	return 0;
}

This gives the following console output:

functionoptimizer1

As can be seen, the algorithm steadily approaches x = y = 1.0 to yield the desired optimal result of f(x,y) = 0.

Running the algorithm for an increased number of iterations with double the population size yields even more precise values for x and y, as shown:

functionoptimizer2

Booth’s function

Another example optimization test function whose objective function is:

f(x,y) = (x+2y-7)^2 + (2x+y-5)^2

… and whose known optimal is

f(1,3) = 0

… and whose constraints are:

-10 <= x,y <= 10

Visual representation:

BoothsFunction

Console output after 20,000 iterations:

functionoptimizer3

McCormick function

Objective function:

f(x,y) = sin(x+y) + (x-y)^2 - 1.5x + 2.5y + 1

… known optimal:

f(-0.54719, -1.54719) = -1.9133

… constraints:

-1.5 <= x <= 4
-3 <= y <= 4

Visual representation:

McCormick

Console output after 30,000 iterations:

functionoptimizer4
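
The three objective functions above lend themselves to being selected through the Constraint enumeration used in main.cpp; the sketch below shows one way this dispatch might look (only Constraint::Rosenbrock appears in the code shown here, so the other enumerators are assumptions, and the real CalculateFitnessFunction may well be structured differently):

#include <cmath>

// Hypothetical enumeration; Booth and McCormick are assumed values.
enum class Constraint { Rosenbrock, Booth, McCormick };

// Sketch of an objective-function dispatch over the three test functions.
double Objective( Constraint c, double x, double y )
{
	switch ( c )
	{
	case Constraint::Rosenbrock:	// minimum f(1,1) = 0
		return ( 1.0 - x ) * ( 1.0 - x ) + 100.0 * ( y - x * x ) * ( y - x * x );

	case Constraint::Booth:	// minimum f(1,3) = 0, with -10 <= x,y <= 10
		return ( x + 2.0 * y - 7.0 ) * ( x + 2.0 * y - 7.0 )
		     + ( 2.0 * x + y - 5.0 ) * ( 2.0 * x + y - 5.0 );

	case Constraint::McCormick:	// minimum f(-0.54719, -1.54719) = approx. -1.9133
		return std::sin( x + y ) + ( x - y ) * ( x - y ) - 1.5 * x + 2.5 * y + 1.0;
	}

	return 0.0;
}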

Code samples

A Visual Studio 2010 console application is downloadable from here:


Comments and feedback always welcome.

Some other genetic algorithm related posts:

Genetic Algorithm based routing optimization: an update
Genetic Algorithms Applied to Travelling Salesman Problems in C++
A Genetic Algorithm Based Routing Optimization Tool