GPR Manual
Table of Contents
 1 mgl-gpr ASDF System Details
 2 Background
 3 Evolutionary Algorithms
 4 Genetic Programming
 5 Differential Evolution
[in package MGL-GPR]
1 mgl-gpr ASDF System Details
 Version: 0.0.1
 Description: MGL-GPR is a library of evolutionary algorithms such as Genetic Programming (evolving typed expressions from a set of operators and constants) and Differential Evolution.
 Licence: MIT, see COPYING.
 Author: Gábor Melis
 Mailto: mega@retes.hu
 Homepage: http://quotenil.com
2 Background
Evolutionary algorithms are optimization tools that assume little of the task at hand. Often they are population based, that is, there is a set of individuals that each represent a candidate solution. Individuals are combined and changed with crossover and mutation-like operators to produce the next generation. Individuals with lower fitness have a lower probability to survive than those with higher fitness. In this way, the fitness function defines the optimization task.
Typically, EAs are quick to get up and running and can produce reasonable results across a wide variety of domains, but they may need a bit of fiddling to perform well, and domain-specific approaches will almost always have better results. All in all, EAs can be very useful to cut down on the tedium of human trial and error. However, they have serious problems scaling to a higher number of variables.
This library grew from the Genetic Programming implementation I wrote while working for Ravenpack, who agreed to release it under an MIT licence. Several years later I cleaned it up and documented it. Enjoy.
3 Evolutionary Algorithms
Evolutionary algorithm is an umbrella term. In this section we first discuss the concepts common to the concrete evolutionary algorithms Genetic Programming and Differential Evolution.
[class] EVOLUTIONARY-ALGORITHM
The EVOLUTIONARY-ALGORITHM is an abstract base class for generational, population-based optimization algorithms.
3.1 Populations
The currently implemented EAs are generational. That is, they maintain a population of candidate solutions (also known as individuals) which they replace with the next generation of individuals.
[accessor] POPULATION-SIZE EVOLUTIONARY-ALGORITHM (:POPULATION-SIZE)
The number of individuals in a generation. This is a very important parameter. Too low and there won't be enough diversity in the population, too high and convergence will be slow.
[accessor] POPULATION EVOLUTIONARY-ALGORITHM (= (MAKE-ARRAY 0 :ADJUSTABLE 0 :FILL-POINTER T))
An adjustable array with a fill-pointer that holds the individuals that make up the population.
[reader] GENERATION-COUNTER EVOLUTIONARY-ALGORITHM (= 0)
A counter that starts from 0 and is incremented by ADVANCE. All accessors of EVOLUTIONARY-ALGORITHM are allowed to be specialized on a subclass, which allows them to be functions of GENERATION-COUNTER.
[function] ADD-INDIVIDUAL EA INDIVIDUAL
Adds INDIVIDUAL to POPULATION of EA. Usually called when initializing the EA.
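To make the storage model concrete, here is a minimal sketch of how an adjustable vector with a fill-pointer grows as individuals are added. The names *SKETCH-POPULATION* and SKETCH-ADD-INDIVIDUAL are hypothetical and not part of the library; ADD-INDIVIDUAL itself may well do more than this.

```lisp
;; A stand-in for the POPULATION slot: an empty, adjustable vector
;; with a fill-pointer, similar to the documented default.
(defparameter *sketch-population*
  (make-array 0 :adjustable t :fill-pointer t))

;; Hypothetical helper, only an illustration of the storage model.
(defun sketch-add-individual (population individual)
  ;; VECTOR-PUSH-EXTEND grows the vector when it is full.
  (vector-push-extend individual population)
  population)

(sketch-add-individual *sketch-population* '(+ 1 2))
(sketch-add-individual *sketch-population* '(sin 3))
```

After the two calls, LENGTH of the vector reflects the fill-pointer, that is, the number of individuals added so far.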
3.2 Evaluation
[reader] EVALUATOR EVOLUTIONARY-ALGORITHM (:EVALUATOR)
A function of two arguments: the EVOLUTIONARY-ALGORITHM object and an individual. It must return the fitness of the individual. For Genetic Programming, the evaluator often simply calls EVAL, or COMPILE + FUNCALL, and compares the result to some gold standard. It is also typical to slightly penalize solutions with too many nodes to control complexity and evaluation cost (see COUNT-NODES). For Differential Evolution, individuals are conceptually (and often implemented as) vectors of numbers so the fitness function may include an L1 or L2 penalty term.
Alternatively, one can specify MASS-EVALUATOR instead.
[reader] MASS-EVALUATOR EVOLUTIONARY-ALGORITHM (:MASS-EVALUATOR = NIL)
NIL or a function of three arguments: the EVOLUTIONARY-ALGORITHM object, the population vector and the fitness vector into which the fitnesses of the individuals in the population vector shall be written. By specifying MASS-EVALUATOR instead of an EVALUATOR, one can, for example, distribute costly evaluations over multiple threads. MASS-EVALUATOR has precedence over EVALUATOR.
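As an illustration of the three-argument contract, the following sketch fills the fitness vector in a single pass. SKETCH-FITNESS and SKETCH-MASS-EVALUATOR are hypothetical names, the individuals are assumed to be plain numbers, and the sketch is single threaded; a real mass evaluator could split the same work among threads.

```lisp
;; Hypothetical per-individual fitness: prefer numbers close to 10.
(defun sketch-fitness (individual)
  (/ 1 (1+ (abs (- individual 10)))))

;; Takes the EA object, the population vector and the fitness vector
;; to be filled. The EA object is unused in this sketch.
(defun sketch-mass-evaluator (ea population fitnesses)
  (declare (ignore ea))
  ;; Write the fitness of each individual into the matching slot.
  (map-into fitnesses #'sketch-fitness population))
```

Such a function would presumably be passed as the :MASS-EVALUATOR initarg when creating the EA object.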
[reader] FITNESS-KEY EVOLUTIONARY-ALGORITHM (:FITNESS-KEY = #'IDENTITY)
A function that returns a real number for an object returned by EVALUATOR. It is called when two fitnesses are to be compared. The default value is #'IDENTITY, which is sufficient when EVALUATOR returns real numbers. However, sometimes the evaluator returns more information about the solution (such as fitness in various situations) and FITNESS-KEY can be used to select the fitness value.
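A small sketch of the idea, with hypothetical names throughout: the evaluator returns a property list with extra detail, and the key function picks out the real number used for comparisons. How the library itself compares fitnesses internally is not shown here.

```lisp
;; Hypothetical evaluator result: a plist carrying the fitness plus
;; some extra information about the solution.
(defun sketch-evaluate (individual)
  (list :fitness (- (abs individual)) :size 1))

;; The key extracts the real number to compare.
(defun sketch-fitness-key (result)
  (getf result :fitness))

;; Compare two evaluator results through the key.
(defun sketch-fitter-p (result-1 result-2)
  (> (sketch-fitness-key result-1)
     (sketch-fitness-key result-2)))
```

With this setup, one would pass something like #'SKETCH-FITNESS-KEY as the :FITNESS-KEY initarg.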
3.3 Training
Training is easy: one creates an object of a subclass of
EVOLUTIONARY-ALGORITHM such as GENETIC-PROGRAMMING or
DIFFERENTIAL-EVOLUTION, creates the initial population by adding
individuals to it (see ADD-INDIVIDUAL) and calls ADVANCE in a loop
to move on to the next generation until a certain number of
generations has passed or until the FITTEST individual is good enough.
[generic-function] ADVANCE EA
Create the next generation and place it in POPULATION of EA.
[reader] FITTEST EVOLUTIONARY-ALGORITHM (= NIL)
The fittest individual ever seen and its fitness as a cons cell.
[accessor] FITTEST-CHANGED-FN EVOLUTIONARY-ALGORITHM (:FITTEST-CHANGED-FN = NIL)
If non-NIL, a function that's called when FITTEST is updated with three arguments: the EVOLUTIONARY-ALGORITHM object, the fittest individual and its fitness. Useful for tracking progress.
4 Genetic Programming
4.1 Background
What is Genetic Programming? This is what Wikipedia has to say:
In artificial intelligence, genetic programming (GP) is an
evolutionary algorithm-based methodology inspired by biological
evolution to find computer programs that perform a user-defined
task. Essentially GP is a set of instructions and a fitness
function to measure how well a computer has performed a task. It
is a specialization of genetic algorithms (GA) where each
individual is a computer program. It is a machine learning
technique used to optimize a population of computer programs
according to a fitness landscape determined by a program's ability
to perform a given computational task.
Lisp has a long history of Genetic Programming because GP involves manipulation of expressions which is of course particularly easy with sexps.
4.2 Tutorial
GPR works with typed expressions. Mutation and crossover never produce expressions that fail with a type error. Let's define a couple of operators that work with real numbers and also return a real:
    (defparameter *operators* (list (operator (+ real real) real)
                                    (operator (- real real) real)
                                    (operator (* real real) real)
                                    (operator (sin real) real)))
One cannot build an expression out of these operators because they
all have at least one argument. Let's define some literal classes
too. The first produces random numbers, the second always returns
the symbol *X*:
    (defparameter *literals* (list (literal (real)
                                     (- (random 32.0) 16.0))
                                   (literal (real)
                                     '*x*)))
Armed with *OPERATORS* and *LITERALS*, one can already build
random expressions with RANDOM-EXPRESSION, but we also need to
define how good a certain expression is, which is called fitness.
In this example, we are going to perform symbolic regression, that is, try to find an expression that approximates some target expression well:
    (defparameter *target-expr* '(+ 7 (sin (expt (* *x* 2 pi) 2))))
Think of *TARGET-EXPR* as a function of *X*. The evaluator
function will bind the special *X* to the input and simply EVAL
the expression to be evaluated.
(defvar *x*)
The evaluator function calculates the average difference between
EXPR and TARGET-EXPR, penalizes large expressions and returns
the fitness of EXPR. Expressions with higher fitness have a higher
chance to produce offspring.
    (defun evaluate (gp expr target-expr)
      (declare (ignore gp))
      (/ 1
         (1+
          ;; Calculate average difference from target.
          (/ (loop for x from 0d0 to 10d0 by 0.5d0
                   summing (let ((*x* x))
                             (abs (- (eval expr)
                                     (eval target-expr)))))
             21))
         ;; Penalize large expressions.
         (let ((min-penalized-size 40)
               (size (count-nodes expr)))
           (if (< size min-penalized-size)
               1
               (exp (min 120 (/ (- size min-penalized-size) 10d0)))))))
When an expression is to undergo mutation, a randomizer function is
called. Here we change literal numbers slightly, or produce an
entirely new random expression that will be substituted for EXPR:
    (defun randomize (gp type expr)
      (if (and (numberp expr)
               (< (random 1.0) 0.5))
          ;; Nudge a numeric literal by a value in [-0.5, 0.5).
          (+ expr (random 1.0) -0.5)
          (random-gp-expression gp (lambda (level)
                                     (<= 3 level))
                                :type type)))
That's about it. Now we create a GP instance hooking everything up,
set up the initial population and just call ADVANCE a couple of
times to create new generations of expressions.
    (defun run ()
      (let ((*print-length* nil)
            (*print-level* nil)
            (gp (make-instance
                 'gp
                 :toplevel-type 'real
                 :operators *operators*
                 :literals *literals*
                 :population-size 1000
                 :copy-chance 0.0
                 :mutation-chance 0.5
                 :evaluator (lambda (gp expr)
                              (evaluate gp expr *target-expr*))
                 :randomizer 'randomize
                 :selector (lambda (gp fitnesses)
                             (declare (ignore gp))
                             (hold-tournament fitnesses :n-contestants 2))
                 :fittest-changed-fn
                 (lambda (gp fittest fitness)
                   (format t "Best fitness until generation ~S: ~S for~%  ~S~%"
                           (generation-counter gp) fitness fittest)))))
        ;; Fill the initial population with random expressions.
        (loop repeat (population-size gp) do
          (add-individual gp (random-gp-expression gp (lambda (level)
                                                        (<= 5 level)))))
        (loop repeat 1000 do
          (when (zerop (mod (generation-counter gp) 20))
            (format t "Generation ~S~%" (generation-counter gp)))
          (advance gp))
        (destructuring-bind (fittest . fitness) (fittest gp)
          (format t "Best fitness: ~S for~%  ~S~%" fitness fittest))))
Note that this example can be found in example/symbolic-regression.lisp.
4.3 Expressions
Genetic programming works with a population of individuals. The
individuals are sexps that may be evaluated directly by EVAL or by
other means. The internal nodes and the leaves of the sexp as a tree
represent the application of operators and literal objects,
respectively. Note that currently there is no way to represent
literal lists.
[class] EXPRESSION-CLASS
An object of EXPRESSION-CLASS defines two things: how to build a random expression that belongs to that expression class and what lisp type those expressions evaluate to.
[reader] RESULT-TYPE EXPRESSION-CLASS (:RESULT-TYPE)
Expressions belonging to this expression class must evaluate to a value of this lisp type.
[reader] WEIGHT EXPRESSION-CLASS (:WEIGHT = 1)
The probability of an expression class to be selected from a set of candidates is proportional to its weight.
[class] OPERATOR EXPRESSION-CLASS
Defines how the symbol NAME in the function position of a list can be combined with arguments: how many and of what types. The following defines + as an operator that adds two FLOATs:
    (make-instance 'operator
                   :name '+
                   :result-type 'float
                   :argument-types '(float float))
See the macro OPERATOR for a shorthand for the above.
Currently no lambda list keywords are supported and there is no way to define how an expression with a particular operator is to be built. See RANDOM-EXPRESSION.
[reader] ARGUMENT-TYPES OPERATOR (:ARGUMENT-TYPES)
A list of lisp types. One for each argument of this operator.
[macro] OPERATOR (NAME &REST ARG-TYPES) RESULT-TYPE &KEY (WEIGHT 1)
Syntactic sugar for instantiating operators. The example given for OPERATOR could be written as:
    (operator (+ float float) float)
See WEIGHT for what WEIGHT means.
[class] LITERAL EXPRESSION-CLASS
This is slightly misnamed. An object belonging to the LITERAL class is not a literal itself, it's a factory for literals via its BUILDER function. For example, the following literal builds bytes:
    (make-instance 'literal
                   :result-type '(unsigned-byte 8)
                   :builder (lambda () (random 256)))
In practice, one rarely writes it out like that, because the LITERAL macro provides a more convenient shorthand.
[reader] BUILDER LITERAL (:BUILDER)
A function of no arguments that returns a random literal that belongs to its literal class.
[macro] LITERAL (RESULT-TYPE &KEY (WEIGHT 1)) &BODY BODY
Syntactic sugar for defining literal classes. The example given for LITERAL could be written as:
    (literal ((unsigned-byte 8)) (random 256))
See WEIGHT for what WEIGHT means.
[function] RANDOM-EXPRESSION OPERATORS LITERALS TYPE TERMINATE-FN
Return an expression built from OPERATORS and LITERALS that evaluates to values of TYPE. TERMINATE-FN is a function of one argument: the level of the root of the subexpression to be generated in the context of the entire expression. If it returns T then a LITERAL will be inserted (by calling its BUILDER function), else an OPERATOR with all its necessary arguments.
The algorithm recursively generates the expression starting from level 0 where only operators and literals with a RESULT-TYPE that's a subtype of TYPE are considered and one is selected with the unnormalized probability given by its WEIGHT. On lower levels, the ARGUMENT-TYPES specification of operators is similarly satisfied and the resulting expression should evaluate without a type error.
The building of expressions cannot backtrack. If it finds itself in a situation where no literals or operators of the right type are available then it will fail with an error.
4.4 Basics
To start the evolutionary process one creates a GP object,
adds to it the individuals (see ADD-INDIVIDUAL) that make up the
initial population and calls ADVANCE in a loop to move on to the
next generation.
[class] GENETIC-PROGRAMMING EVOLUTIONARY-ALGORITHM
The GENETIC-PROGRAMMING class defines the search space and how mutation and recombination occur, and holds various parameters of the evolutionary process and the individuals themselves.
[function] RANDOM-GP-EXPRESSION GP TERMINATE-FN &KEY (TYPE (TOPLEVEL-TYPE GP))
Creating the initial population by hand is tedious. This convenience function calls RANDOM-EXPRESSION to create a random individual that produces GP's TOPLEVEL-TYPE. By passing in another TYPE one can create expressions that fit somewhere else in a larger expression, which is useful in a RANDOMIZER function.
4.5 Search Space
The search space of the GP is defined by the available operators, literals and the type of the final result produced. The evaluator function acts as the guiding light.
[reader] OPERATORS GENETIC-PROGRAMMING (:OPERATORS)
The set of OPERATORs from which (together with LITERALs) individuals are built.
[reader] LITERALS GENETIC-PROGRAMMING (:LITERALS)
The set of LITERALs from which (together with OPERATORs) individuals are built.
[reader] TOPLEVEL-TYPE GENETIC-PROGRAMMING (:TOPLEVEL-TYPE = T)
The type of the results produced by individuals. If the problem is to find the minimum of a 1d real function then this may be the symbol REAL. If the problem is to find the shortest route, then this may be a vector. It all depends on the representation of the problem, the operators and the literals.
[function] COUNT-NODES TREE &KEY INTERNAL
Count the nodes in the sexp TREE. If INTERNAL then don't count the leaves.
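One plausible reading of this counting rule is sketched below. SKETCH-COUNT-NODES is a hypothetical name, and the sketch assumes that each operator application counts as a single internal node (the operator symbol is not counted separately); the library's own function may differ in detail.

```lisp
(defun sketch-count-nodes (tree &key internal)
  (if (atom tree)
      ;; A leaf counts as one node unless only internal nodes
      ;; are wanted.
      (if internal 0 1)
      ;; An operator application counts as one node plus whatever
      ;; its arguments contribute.
      (1+ (loop for argument in (rest tree)
                sum (sketch-count-nodes argument :internal internal)))))
```

Under this assumption, (+ 1 (sin x)) has four nodes in total, two of which are internal.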
4.6 Reproduction
The RANDOMIZER and SELECTOR functions define how mutation and
recombination occur.
[reader] RANDOMIZER GENETIC-PROGRAMMING (:RANDOMIZER)
Used for mutations, this is a function of three arguments: the GP object, the type the expression must produce and the current expression to be replaced with the returned value. It is called with subexpressions of individuals.
[reader] SELECTOR GENETIC-PROGRAMMING (:SELECTOR)
A function of two arguments: the GP object and a vector of fitnesses. It must return an index into the fitness vector. The individual whose fitness was thus selected will be selected for reproduction, be it copying, mutation or crossover. Typically, this defers to HOLD-TOURNAMENT.
[function] HOLD-TOURNAMENT FITNESSES &KEY SELECT-CONTESTANT-FN N-CONTESTANTS KEY
Select N-CONTESTANTS (all different) for the tournament randomly, represented by indices into FITNESSES, and return the one with the highest fitness. If SELECT-CONTESTANT-FN is NIL then contestants are selected randomly with uniform probability. If SELECT-CONTESTANT-FN is a function, then it's called with FITNESSES to return an index (that may or may not be already selected for the tournament). Specifying SELECT-CONTESTANT-FN allows one to conduct 'local' tournaments biased towards a particular region of the index range. KEY is NIL or a function that selects the real fitness value from elements of FITNESSES.
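Tournament selection with uniformly drawn contestants can be sketched as follows. SKETCH-HOLD-TOURNAMENT is a hypothetical name, the default of two contestants is an assumption, and SELECT-CONTESTANT-FN is left out. The sketch also assumes that N-CONTESTANTS does not exceed the length of FITNESSES, otherwise the drawing loop would never finish.

```lisp
(defun sketch-hold-tournament (fitnesses &key (n-contestants 2)
                                              (key #'identity))
  (let ((contestants '()))
    ;; Draw distinct indices uniformly at random.
    (loop while (< (length contestants) n-contestants)
          do (pushnew (random (length fitnesses)) contestants))
    ;; Return the index of the contestant with the highest fitness.
    (let ((best (first contestants)))
      (dolist (index (rest contestants) best)
        (when (> (funcall key (aref fitnesses index))
                 (funcall key (aref fitnesses best)))
          (setf best index))))))
```

Larger tournaments increase selection pressure: when every individual takes part, the fittest one always wins.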
4.7 Environment
The new generation is created by applying a reproduction operator
until POPULATION-SIZE is reached in the new generation. At each
step, a reproduction operator is randomly chosen.
[accessor] COPY-CHANCE GENETIC-PROGRAMMING (:COPY-CHANCE = 0)
The probability of the copying reproduction operator being chosen. Copying simply creates an exact copy of a single individual.
[accessor] MUTATION-CHANCE GENETIC-PROGRAMMING (:MUTATION-CHANCE = 0)
The probability of the mutation reproduction operator being chosen. Mutation creates a randomly altered copy of an individual. See RANDOMIZER.
If neither copying nor mutation were chosen, then a crossover will take place.
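The choice between the three reproduction operators can be pictured as a single random roll. The sketch below assumes that the two chances partition the unit interval and that crossover takes the remainder; SKETCH-CHOOSE-REPRODUCTION-OPERATOR is a hypothetical name and the library may make the choice differently.

```lisp
(defun sketch-choose-reproduction-operator (copy-chance mutation-chance)
  (let ((roll (random 1.0)))
    (cond
      ;; The first COPY-CHANCE part of [0, 1) means copying.
      ((< roll copy-chance) :copy)
      ;; The next MUTATION-CHANCE part means mutation.
      ((< roll (+ copy-chance mutation-chance)) :mutation)
      ;; Whatever remains is crossover.
      (t :crossover))))
```

With the tutorial's settings of 0.0 for copying and 0.5 for mutation, roughly half of the new individuals would come from mutation and half from crossover.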
[accessor] KEEP-FITTEST-P GENETIC-PROGRAMMING (:KEEP-FITTEST-P = T)
If true, then the fittest individual is always copied without mutation to the next generation. Of course, it may also have other offspring.
5 Differential Evolution
The concepts in this section are covered by Differential Evolution: A Survey of the State-of-the-Art.
[class] DIFFERENTIAL-EVOLUTION EVOLUTIONARY-ALGORITHM
Differential evolution (DE) is an evolutionary algorithm in which individuals are represented by vectors of numbers. New individuals are created by taking linear combinations or by randomly swapping some of these numbers between two individuals.
[reader] MAP-WEIGHTS-INTO-FN DIFFERENTIAL-EVOLUTION (:MAP-WEIGHTS-INTO-FN = #'MAP-INTO)
The vector of numbers (the 'weights') is most often stored in some kind of array. All individuals must have the same number of weights, but the actual representation can be anything as long as the function in this slot mimics the semantics of MAP-INTO, which is the default.
[reader] CREATE-INDIVIDUAL-FN DIFFERENTIAL-EVOLUTION (:CREATE-INDIVIDUAL-FN)
Holds a function of one argument, the DE, that returns a new individual that need not be initialized in any way. Typically this just calls MAKE-ARRAY.
[reader] MUTATE-FN DIFFERENTIAL-EVOLUTION (:MUTATE-FN)
One of the supplied mutation functions: MUTATE/RAND/1, MUTATE/BEST/1, MUTATE/CURRENT-TO-BEST/2.
[reader] CROSSOVER-FN DIFFERENTIAL-EVOLUTION (:CROSSOVER-FN = #'CROSSOVER/BINARY)
A function of three arguments, the DE and two individuals, that destructively modifies the second individual by using some parts of the first one. Currently, the implemented crossover function is CROSSOVER/BINARY.
 [function] MUTATE/RAND/1 DE CURRENT BEST POPULATION NURSERY &KEY (F 0.5)
 [function] MUTATE/BEST/1 DE CURRENT BEST POPULATION NURSERY &KEY (F 0.5)
 [function] MUTATE/CURRENT-TO-BEST/2 DE CURRENT BEST POPULATION NURSERY &KEY (F 0.5)
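These functions are not documented further here. For orientation, the textbook DE/rand/1 scheme from the differential evolution literature builds a donor vector as a + F * (b - c) from three distinct, randomly chosen individuals a, b and c. The sketch below shows only that arithmetic on plain vectors; SKETCH-RAND/1-DONOR is a hypothetical name, and its signature deliberately differs from the library functions, whose internals may not match this.

```lisp
;; Textbook DE/rand/1 donor: A + F * (B - C), element by element.
;; Choosing A, B and C from the population is left to the caller.
(defun sketch-rand/1-donor (a b c &key (f 0.5))
  (map 'vector
       (lambda (x y z) (+ x (* f (- y z))))
       a b c))
```

In the textbook DE/best/1 variant the same formula is used with the best individual in place of a.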
[function] CROSSOVER/BINARY DE INDIVIDUAL-1 INDIVIDUAL-2 &KEY (CROSSOVER-RATE 0.5)
Destructively modify INDIVIDUAL-2 by replacing each element with a probability of 1 - CROSSOVER-RATE with the corresponding element in INDIVIDUAL-1. At least one element is changed. Return INDIVIDUAL-2.
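The description above can be sketched on plain vectors as follows. SKETCH-CROSSOVER/BINARY is a hypothetical name that omits the DE argument, and forcing one randomly chosen index is just one way to guarantee that at least one element is replaced; the library may do this differently.

```lisp
(defun sketch-crossover/binary (individual-1 individual-2
                                &key (crossover-rate 0.5))
  (let* ((n (length individual-2))
         ;; This index is always replaced, so at least one element
         ;; is taken from INDIVIDUAL-1.
         (forced (random n)))
    (dotimes (i n individual-2)
      (when (or (= i forced)
                (< (random 1.0) (- 1 crossover-rate)))
        (setf (aref individual-2 i) (aref individual-1 i))))))
```

With a CROSSOVER-RATE of 0 every element is taken from INDIVIDUAL-1, while a rate of 1 replaces only the single forced element.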
 [function] SELECT-DISTINCT-RANDOM-NUMBERS TABOOS N LIMIT
5.1 SANSDE
[class] SANSDE DIFFERENTIAL-EVOLUTION
SaNSDE is a special DE that dynamically adjusts how crossover and mutation are performed. The only parameters are the generic EA ones: POPULATION-SIZE, EVALUATOR, etc. One also has to specify MAP-WEIGHTS-INTO-FN and CREATE-INDIVIDUAL-FN.