Genetic Algorithm Performance Testing II

With the superiority of mutation mode 1 at least tentatively established, I have started conducting more thorough optimization tests on the genetic algorithm engine. So far I have only completed two full tests, since the method I am using takes about half a day for a single run.

The following chart shows the efficiency of two different populations. The averaged data was taken over the mutation rate interval [1%, 10%] with a step size of 0.1%. Each population ran 3000 times at each mutation rate to ensure accurate averages. The first curve, labeled GA Comp 4, represents the efficiency of a population of complexity 4; the second curve represents the efficiency of a complexity-5 population. Complexity, for the purposes of this test, is a single number that sets the size of the population, the number of alleles in each individual, and the number of unique alleles available to the engine. So, for example, a complexity-4 population comprises 4 individuals, each having 4 alleles drawn from a set of 4 unique alleles, while a complexity-5 population comprises 5 individuals, each having 5 alleles drawn from a set of 5 unique alleles. Obviously, the higher the complexity, the more difficult it is for the engine to reach the "optimal individual."
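
For concreteness, here is a minimal Python sketch (not the actual XIAS code) of how a complexity-c population could be generated under this definition:

```python
import random

def make_population(complexity):
    """Build a complexity-c population: c individuals, each with c alleles
    drawn from a set of c unique alleles."""
    allele_set = [chr(ord('A') + i) for i in range(complexity)]  # e.g. ['A', 'B', 'C', 'D']
    return [[random.choice(allele_set) for _ in range(complexity)]
            for _ in range(complexity)]

print(make_population(4))  # e.g. [['B', 'A', 'D', 'D'], ['C', 'A', 'B', 'A'], ...]
```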

In the previous post, I estimated that mutation mode 1 runs most efficiently around 4.5%-5.5%. I based that estimate on a complexity-4 population. The above results confirm that the optimal rate for a complexity-4 population is just shy of 4.5%. This value, however, is not the most efficient for a complexity-5 population, as the red data demonstrates. Rather, as the complexity increases to 5, the optimal mutation rate increases as well - to roughly 5.5%.

It is interesting to note that the efficiency of a complexity-4 population can be approximated quite well by a polynomial of order 3, while the efficiency of a complexity-5 population can be approximated nicely by a polynomial of order 4. These approximation curves are overlaid in the graph above.
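
For reference, overlay curves like these can be produced with a standard least-squares polynomial fit. A minimal NumPy sketch (with placeholder arrays standing in for the measured data) might look like this:

```python
import numpy as np

# Tested mutation rates: 1.0% through 10.0% in 0.1% steps
rates = np.arange(1.0, 10.05, 0.1)

# Placeholder arrays standing in for the averaged optimization times
eff_c4 = np.random.rand(len(rates))  # complexity-4 data (placeholder)
eff_c5 = np.random.rand(len(rates))  # complexity-5 data (placeholder)

cubic_c4   = np.polyfit(rates, eff_c4, 3)  # order-3 fit for the complexity-4 curve
quartic_c5 = np.polyfit(rates, eff_c5, 4)  # order-4 fit for the complexity-5 curve

# Evaluate the fitted polynomials for overlaying on the chart
fit_c4 = np.polyval(cubic_c4, rates)
fit_c5 = np.polyval(quartic_c5, rates)
```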

Though I'd need several more runs to draw a concrete conclusion, the immediate evidence suggests that, as a rough generalization, the genetic algorithm performs most efficiently when the mutation rate is set roughly half a percentage point above the complexity factor of the population being optimized - about 4.5% for complexity 4 and 5.5% for complexity 5.

Now, how does any of this relate to music and algorithmic composition? That I still have to figure out.

Genetic Algorithm Performance Testing

Having laid down the foundations of the XIAS genetic algorithm engine, I decided a performance test would be necessary to get the most out of the engine. Genetic algorithms, unlike most of the other methods I've explored, are highly sensitive to small quantitative tweaks to the engine, so real performance can be gained through extensive parameter testing.

Some preliminary tests were run tonight to determine the most effective mutation rate as well as the best mutation mode. I developed two mutation modes for the engine: Mode 1 allows an allele to mutate to any other allele that has been specified as part of the allele set of the engine. Mode 2 allows an allele to mutate only to an allele that already exists on the individual. In order to test these modes, the engine ran through 1000 genetic algorithm populations, keeping track of how many sub-loops it took to optimize the population each time. Each population was allowed a maximum of 100 loops to become perfectly optimized. Obviously, the idea is for the populations to optimize before those 100 loops are up.  The population optimization times are then averaged across the 1000 populations in order to achieve more accurate results.  Note that lower numbers on the graph are better (because the population optimized faster).
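
As a rough illustration of the two modes and the averaging procedure, here is a hypothetical Python sketch (not the engine's actual code; `run_population` is a placeholder for a full optimization run of one population):

```python
import random

def mutate(individual, allele_set, rate, mode):
    """Mode 1: an allele may mutate to any allele in the engine's allele set.
    Mode 2: an allele may mutate only to an allele already on the individual."""
    mutated = list(individual)
    for i in range(len(mutated)):
        if random.random() < rate:  # rate as a fraction, e.g. 0.05 for 5%
            pool = allele_set if mode == 1 else list(individual)
            mutated[i] = random.choice(pool)
    return mutated

def average_optimization_time(run_population, trials=1000, max_loops=100):
    """Average the number of loops a population needs to optimize, over many trials.
    `run_population` is a placeholder that runs one population and returns its
    loop count (capped at max_loops if it never fully optimizes)."""
    return sum(min(run_population(), max_loops) for _ in range(trials)) / trials
```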

A couple of notes about the engine that stem from the graph:

  • Mutation mode 1, in general, optimizes faster than mutation mode 2
  • Mutation mode 1 is most efficient with a mutation rate of around 4.5% to 5.5%
  • Mutation mode 2 is most efficient with a mutation rate of around 6% to 7%
  • Mutation mode 1 provides more consistent optimization (notice that mode 2's curve changes directions several times, whereas mode 1's curve changes direction only once, at the minimum)
  • The optimization time of each mode can be roughly approximated by a cubic curve

Reproduction in Genetic Algorithms

I've come up with three different ways of performing the reproduction process in the genetic algorithm engine.  Note that all processes start out with ten individuals and, accordingly, perform ten reproductions.  The question is how to optimize the outcome of the reproduction so that the children end up with an optimal gene pool but still retain some diversity.

The first method, which I call "nearest neighbor," simply pairs the individual of rank n with the individual of rank n+1. The last individual pairs with the first.

The second method, "first priority," pairs the top-ranked individual with all other individuals, then pairs the second-ranked individual with the third.

The final method, "multiple neighbor," starts at the top-ranked individual, n, and pairs it with n+1, n+2, ..., n+i, then moves on to individual n+1, pairing it with n+2, n+3, ..., n+1+i, continuing this pattern until the number of reproductions matches the number of individuals in the original population.  The integer i should be chosen based on population size.  Note that the "first priority" method is essentially a special case of the multiple neighbor method wherein i is equal to the size of the original population.
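
A quick Python sketch of the three pairing schemes (hypothetical code; `ranked` is assumed to be a list of individuals sorted from best to worst, and the index wrapping is my own addition to keep indices in range):

```python
def nearest_neighbor_pairs(ranked):
    """Pair the individual of rank n with rank n+1; the last pairs with the first."""
    m = len(ranked)
    return [(ranked[i], ranked[(i + 1) % m]) for i in range(m)]

def first_priority_pairs(ranked):
    """Pair the top-ranked individual with every other individual, then pair
    the second-ranked individual with the third (m pairings for m individuals)."""
    pairs = [(ranked[0], other) for other in ranked[1:]]
    pairs.append((ranked[1], ranked[2]))
    return pairs

def multiple_neighbor_pairs(ranked, i):
    """Pair individual n with n+1, n+2, ..., n+i, then individual n+1 with
    n+2, ..., n+1+i, and so on until the number of pairings equals the
    population size.  Indices wrap around (my own assumption)."""
    m = len(ranked)
    pairs, n = [], 0
    while len(pairs) < m:
        for step in range(1, i + 1):
            if len(pairs) == m:
                break
            pairs.append((ranked[n], ranked[(n + step) % m]))
        n += 1
    return pairs
```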

I'm betting that multiple neighbor with an i value of about a third of the population size will probably perform the best.  Only tests will tell which of these methods of reproduction (if not all of them) should be implemented in the final engine.

Notes on Implementing Genetic Algorithms

XIAS (cross-integrated algorithm system) supports the following algorithm types so far:

  • Fractal
  • Grammar
  • L-System

The next natural extension of XIAS is an implementation of genetic algorithms.  I have a bit of experience with genetic algorithms from designing the EvoSpeak engine, which basically combined stochastics and genetic algorithms to evolve towards good-sounding melodies.

In keeping with the overall theme of the XIAS project, the designs for this genetic algorithm implementation focus on portability and ease-of-use.  As such, the system may seem a little oversimplified.  Only time will tell whether it will suffice for algorithmic composition needs or whether I will need to extend the features of the genetic algorithms.  For now, here are some notes that I've brainstormed for the implementation of this new system.

Data Structure of the GA Engine

  • Engine
    • Alleles (effectively the algorithm's domain; string)
    • Cumulative point distribution (count; integer)
  • Individuals
    • Data String (genetic data; string)
    • Points (count; integer)
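
In Python terms, that structure might look roughly like this (a sketch of the outline above, not the engine's actual implementation):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Individual:
    data_string: str     # genetic data
    points: int = 0      # points earned from fitness evaluation

@dataclass
class Engine:
    alleles: str                                   # the allele set (the algorithm's domain)
    cumulative_points: int = 0                     # cumulative point distribution
    individuals: List[Individual] = field(default_factory=list)
```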

GA Engine Operation

  • Create m individuals with random alleles
  • Rate each of the individuals
    • Pass data string to transform function
    • Feed transformed output to fitness evaluation function
    • Distribute points according to fitness performance
  • Breed the top n individuals (ranked by the point distribution) with each other to create m new individuals
  • Repeat with the new generation of individuals
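
Here is a minimal sketch of one pass through that loop, assuming individuals are plain strings of alleles and that `transform`, `fitness`, and `mutate` stand in for the plugin-specific functions described above (the parent-pairing step is just a placeholder, since the reproduction scheme is still the open question discussed below):

```python
import random

def run_generation(population, transform, fitness, n_parents, mutate):
    """One pass of the GA loop: rate every individual, then breed the top n."""
    # Rate each individual: transform its data string, then evaluate fitness
    scores = [fitness(transform(ind)) for ind in population]

    # Rank by points and keep the top n as parents
    ranked = [ind for _, ind in sorted(zip(scores, population), reverse=True)]
    parents = ranked[:n_parents]

    # Breed the parents to create m new individuals.  How to pair the parents
    # is still an open question; random pairing is used here purely as a placeholder.
    m = len(population)
    children = []
    for _ in range(m):
        a, b = random.sample(parents, 2)
        child = ''.join(random.choice(pair) for pair in zip(a, b))  # uniform crossover
        children.append(mutate(child))
    return children
```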

Mutation in the GA Engine

  • Deterministic Approach
    • Create a function that takes two parent alleles and returns a third allele different from both (think cross product)
    • Run the function on every qth allele
  • Nondeterministic Approach
    • Replace a child allele with a random allele with a given frequency
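
A hypothetical sketch of the two approaches (the "third allele" rule and the allele set `ALLELES` are placeholders, not the engine's actual choices):

```python
import random

ALLELES = "ABCD"  # hypothetical allele set

def deterministic_mutation(child, parent_a, parent_b, q):
    """Deterministic approach: on every qth allele, derive a third allele from
    the two parent alleles (a stand-in for the 'cross product' style function)."""
    def third_allele(x, y):
        # Example rule: the first allele in the set that differs from both parents
        return next(a for a in ALLELES if a != x and a != y)
    out = list(child)
    for i in range(q - 1, len(out), q):  # every qth position
        out[i] = third_allele(parent_a[i], parent_b[i])
    return ''.join(out)

def nondeterministic_mutation(child, rate):
    """Nondeterministic approach: replace an allele with a random allele
    at the given frequency (e.g. rate=0.05 for 5%)."""
    return ''.join(random.choice(ALLELES) if random.random() < rate else a
                   for a in child)
```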

Quick Summary of GA Engine Implementation

  • Randomize Individuals
  • Rate
    • Functional Transformation
    • Fitness Evaluation
    • Point Allocation
  • Reproduce
    • Allele Combination
    • Allele Mutation
  • Repeat

The only real problem left to solve concerns reproduction.  How do we choose which individuals to combine with which in order to produce a population of exactly m members, given that we want the children to have the best possible combinations of genes based on the point distribution of the parents?

Notes from Talk with Peter Torpey at MIT

  • EchoNest is doing the "inverse" of what I'm doing
  • Could use "cloud" processing to make small generative music devices possible (instead of having an on-board rendering engine, send rendering parameters to a server and render the compositions on a large, specialized parallel processing network at the server.)
  • Though there are no dedicated algorithmic composition projects going on at the Media Lab right now, it's definitely something that would fit in

More to come later when I remember the rest of the stuff we talked about!

ChillZone Pianist v2

ChillZone Pianist is finally getting the update it deserves.  The first of many style options, left-hand arpeggiation, is tentatively working.  The plugin takes advantage of the new generalized algorithm include file (code-named XIAS).  In particular, it uses a Lindenmayer system to control an underlying grammatical engine, with the words generated randomly using Brownian fractal motion.  The preliminary results sound great.  Though some of the arpeggiations the program comes up with aren't too compelling, most of the output sounds good and is both coherent and creative.

After some more initial testing, I'll implement an array of new styles so that ChillZone Pianist will finally be a flexible piano plugin.  I intend to flesh out the options for chord, rhythmic chord, melody, and abstract, as well as arpeggiation.

Meeting with Peter Torpey at MIT

Peter Torpey, a research assistant at the MIT Media Lab in the Opera of the Future Group, has agreed to give me some of his time on Friday when I visit MIT for the second time!

When I toured MIT the first time over spring break, I had the pleasure of meeting Professor Machover as well as Peter Torpey (as detailed in previous posts). I told them both about my algorithmic composition project, and Peter showed a degree of interest. He's going to give me some time this Friday to show him what I've been working on and, perhaps, give me some advice or general feedback on the project.

It's going to be an exciting week for mGen. This will be the first time professionals see the program!

New Grammar Engine in the Works

In keeping with the spirit of the recent drive to create a "generalized" algorithm system (starting with the XenCut fractal cutting engine mentioned a while ago), I'm beginning yet another grammar engine. The goal this time? Lighter and more portable than WordEngine. Easy applicability, easy conversions between words and soft patterns of notes, etc. The engine will also be more generalized. Words are simply comma-delimited strings now, and have no duration, velocity, or time data. To create full sets of note, velocity, time, and duration data, all one has to do is combine four words.
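
As an illustration of the combination idea (a hypothetical sketch; the function name and word formats are my own), combining four comma-delimited words into full note data could look like this:

```python
def combine_words(pitch_word, velocity_word, time_word, duration_word):
    """Combine four comma-delimited words into full note data:
    one (pitch, velocity, time, duration) tuple per note."""
    columns = [w.split(',') for w in (pitch_word, velocity_word, time_word, duration_word)]
    return list(zip(*columns))

notes = combine_words("C4,E4,G4", "80,72,75", "0,1,2", "1,1,2")
# -> [('C4', '80', '0', '1'), ('E4', '72', '1', '1'), ('G4', '75', '2', '2')]
```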

The key component that I want to emphasize in development is the interoperability of the algorithms. For example, the XenCut/generalized fractal engine could be used to create random words for the new grammar engine's dictionary. A Lindenmayer system could then be used to string words together into phrases, and a stochastic engine could control the distribution of phrases.
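
To make the idea concrete, here is a toy Python sketch of that kind of pipeline; the word generator, L-system rules, and dictionary are all placeholders rather than actual XIAS components:

```python
import random

# 1. A stand-in "fractal" word generator: a Brownian-style random walk over a
#    pitch set produces one comma-delimited word for the dictionary.
def random_word(length, pitch_set=("C4", "D4", "E4", "G4", "A4")):
    i = random.randrange(len(pitch_set))
    steps = []
    for _ in range(length):
        i = max(0, min(len(pitch_set) - 1, i + random.choice((-1, 0, 1))))
        steps.append(pitch_set[i])
    return ','.join(steps)

# 2. A Lindenmayer system strings word symbols together into phrase patterns.
def lsystem(axiom, rules, iterations):
    s = axiom
    for _ in range(iterations):
        s = ''.join(rules.get(ch, ch) for ch in s)
    return s

# 3. A stochastic layer controls how often each phrase pattern appears.
dictionary = {'A': random_word(4), 'B': random_word(4)}
patterns = [lsystem(axiom, {'A': 'AB', 'B': 'A'}, 3) for axiom in ('A', 'B')]
piece = random.choices(patterns, weights=(3, 1), k=8)  # weighted phrase distribution
phrases = [[dictionary[ch] for ch in pattern] for pattern in piece]
```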

I'm hoping that this development of easy-to-use algorithms will lead to the creation of the ultimate hybrid plugin that will bring together the strengths of all methods as I originally envisioned when I first began the program.

Fractals Getting Old

Sadly, fractals are starting to get old. They've had a good run in mGen, and I don't anticipate them being topped by any other algorithm type anytime soon. Nonetheless, I'm getting tired of hearing them and my standards for coherence are going up. At first, fractals sounded quite coherent and creative, but I think it's time to push for more coherence.

Unfortunately, I don't know how to progress as far as algorithms go. I could try a grammar system layered on top of a fractal decision engine...but where will the grammar source come from? Would a fractal decision engine even sound good? If not, what else could I use to make choices (random functions are just too simple at this point in development)?

Looks like it's going to take some serious creative inspiration to get the next wave of development started.