λ

GPR Manual

Table of Contents

[in package MGL-GPR]

λ

1 mgl-gpr ASDF System Details

λ

2 Background

Evolutionary algorithms are optimization tools that assume little of the task at hand. Often they are population based, that is, there is a set of individuals that each represent a candidate solution. Individuals are combined and changed with crossover and mutationlike operators to produce the next generation. Individuals with lower fitness have a lower probability to survive than those with higher fitness. In this way, the fitness function defines the optimization task.

Typically, EAs are quick to get up and running, can produce reasonable results across a wild variety of domains, but they may need a bit of fiddling to perform well and domain specific approaches will almost always have better results. All in all, EA can be very useful to cut down on the tedium of human trial and error. However, they have serious problems scaling to higher number of variables.

This library grew from the Genetic Programming implementation I wrote while working for Ravenpack who agreed to release it under an MIT licence. Several years later I cleaned it up, and documented it. Enjoy.

λ

3 Evolutionary Algorithms

Evolutionary algorithm is an umbrella term. In this section we first discuss the concepts common to conrete evolutionary algorithms Genetic Programming and Differential Evolution.

λ

3.1 Populations

The currenly implemented EAs are generational. That is, they maintain a population of candidate solutions (also known as individuals) which they replace with the next generation of individuals.

λ

3.2 Evaluation

λ

3.3 Training

Training is easy: one creates an object of a subclass of EVOLUTIONARY-ALGORITHM such as GENETIC-PROGRAMMING or DIFFERENTIAL-EVOLUTION, creates the initial population by adding individuals to it (see ADD-INDIVIDUAL) and calls ADVANCE in a loop to move on to the next generation until a certain number of generations or until the FITTEST individual is good enough.

λ

4 Genetic Programming

λ

4.1 Background

What is Genetic Programming? This is what Wikipedia has to say:

In artificial intelligence, genetic programming (GP) is an
evolutionary algorithm-based methodology inspired by biological
evolution to find computer programs that perform a user-defined
task. Essentially GP is a set of instructions and a fitness
function to measure how well a computer has performed a task. It
is a specialization of genetic algorithms (GA) where each
individual is a computer program. It is a machine learning
technique used to optimize a population of computer programs
according to a fitness landscape determined by a program's ability
to perform a given computational task.

Lisp has a long history of Genetic Programming because GP involves manipulation of expressions which is of course particularly easy with sexps.

λ

4.2 Tutorial

GPR works with typed expressions. Mutation and crossover never produce expressions that fail with a type error. Let's define a couple of operators that work with real numbers and also return a real:

(defparameter *operators* (list (operator (+ real real) real)
                                (operator (- real real) real)
                                (operator (* real real) real)
                                (operator (sin real) real)))

One cannot build an expression out of these operators because they all have at least one argument. Let's define some literal classes too. The first is produces random numbers, the second always returns the symbol *X*:

(defparameter *literals* (list (literal (real)
                                 (- (random 32.0) 16.0))
                               (literal (real)
                                 '*x*)))

Armed with *OPERATORS* and *LITERALS*, one can already build random expressions with RANDOM-EXPRESSION, but we also need to define how good a certain expression is which is called fitness.

In this example, we are going to perform symbolic regression, that is, try to find an expression that approximates some target expression well:

(defparameter *target-expr* '(+ 7 (sin (expt (* *x* 2 pi) 2))))

Think of *TARGET-EXPR* as a function of *X*. The evaluator function will bind the special *X* to the input and simply EVAL the expression to be evaluated.

(defvar *x*)

The evaluator function calculates the average difference between EXPR and TARGET-EXPR, penalizes large expressions and returns the fitness of EXPR. Expressions with higher fitness have higher chance to produce offsprings.

(defun evaluate (gp expr target-expr)
  (declare (ignore gp))
  (/ 1
     (1+
      ;; Calculate average difference from target.
      (/ (loop for x from 0d0 to 10d0 by 0.5d0
               summing (let ((*x* x))
                         (abs (- (eval expr)
                                 (eval target-expr)))))
         21))
     ;; Penalize large expressions.
     (let ((min-penalized-size 40)
           (size (count-nodes expr)))
       (if (< size min-penalized-size)
           1
           (exp (min 120 (/ (- size min-penalized-size) 10d0)))))))

When an expression is to undergo mutation, a randomizer function is called. Here we change literal numbers slightly, or produce an entirely new random expression that will be substituted for EXPR:

(defun randomize (gp type expr)
  (if (and (numberp expr)
           (< (random 1.0) 0.5))
      (+ expr (random 1.0) -0.5)
      (random-gp-expression gp (lambda (level)
                                 (<= 3 level))
                            :type type)))

That's about it. Now we create a GP instance hooking everything up, set up the initial population and just call ADVANCE a couple of times to create new generations of expressions.

(defun run ()
  (let ((*print-length* nil)
        (*print-level* nil)
        (gp (make-instance
             'gp
             :toplevel-type 'real
             :operators *operators*
             :literals *literals*
             :population-size 1000
             :copy-chance 0.0
             :mutation-chance 0.5
             :evaluator (lambda (gp expr)
                          (evaluate gp expr *target-expr*))
             :randomizer 'randomize
             :selector (lambda (gp fitnesses)
                         (declare (ignore gp))
                         (hold-tournament fitnesses :n-contestants 2))
             :fittest-changed-fn
             (lambda (gp fittest fitness)
               (format t "Best fitness until generation ~S: ~S for~%  ~S~%"
                       (generation-counter gp) fitness fittest)))))
    (loop repeat (population-size gp) do
      (add-individual gp (random-gp-expression gp (lambda (level)
                                                    (<= 5 level)))))
    (loop repeat 1000 do
      (when (zerop (mod (generation-counter gp) 20))
        (format t "Generation ~S~%" (generation-counter gp)))
      (advance gp))
    (destructuring-bind (fittest . fitness) (fittest gp)
      (format t "Best fitness: ~S for~%  ~S~%" fitness fittest))))

Note that this example can be found in example/symbolic-regression.lisp.

λ

4.3 Expressions

Genetic programming works with a population of individuals. The individuals are sexps that may be evaluated directly by EVAL or by other means. The internal nodes and the leafs of the sexp as a tree represent the application of operators and literal objects, respectively. Note that currently there is no way to represent literal lists.

λ

4.4 Basics

To start the evolutionary process one creates a GP object, adds to it the individuals (see ADD-INDIVIDUAL) that make up the initial population and calls ADVANCE in a loop to move on to the next generation.

λ

4.5 Search Space

The search space of the GP is defined by the available operators, literals and the type of the final result produced. The evaluator function acts as the guiding light.

λ

4.6 Reproduction

The RANDOMIZER and SELECTOR functions define how mutation and recombination occur.

λ

4.7 Environment

The new generation is created by applying a reproduction operator until POPULATION-SIZE is reached in the new generation. At each step, a reproduction operator is randomly chosen.

If neither copying nor mutation were chosen, then a crossover will take place.

λ

5 Differential Evolution

The concepts in this section are covered by Differential Evolution: A Survey of the State-of-the-Art.

λ

5.1 SANSDE