Tag Archives: evolutionary methods

Cranking out Engines

It's almost mid-March already.  I don't like the fact that the samples haven't improved appreciably in a while.  As I noted earlier, it's mostly due to the fact that I've been upgrading internals rather than working on sound.  Still, it's time to step on it.

Over the past few weeks, I've been working like mad to re-code all the old engines in C++, taking advantage of the massive optimizations possible therein.  So far, the following engines are at least partially functional as part of the new C++ library:

  • Artificial Neural Network Engine (was actually never implemented in AHK and has yet to be used in a plugin)
  • Contour Grammar
  • Evolutionary Engine
  • Fraccut
  • Markov Engine
  • Multi-State Engine

ALL of the new implementations are better than their predecessors, both in terms of efficiency and ease of use.  Certain complex engines, such as the Markov engine, may see speed increases of over a thousandfold thanks to the redesign.

By the end of the month, these myriad engines should be coming together to form some really powerful new plugins.  All it takes is code.

ChromoRhythm: Mediocre Results

Though I haven't fully fleshed out the ChromoRhythm data structure, I'm already getting the unfortunate feeling that the drum module simply isn't going to be able to stand up to GGrewve.  It's too unstructured to bring any coherence to a drum channel without extensive use of a complicated fitness function and a large population of randomly mutated DNA strands, which results in unnecessarily high run times.

In short, Chromo's genetic algorithm engine isn't looking like a good choice for drum beats.  The engine, even in early stages, consumes too much time and produces mediocre results.

I'm not yet ready to give up on finding a superior alternative to GGrewve.  I may abandon this particular genetic algorithm approach for the moment and focus on a flexible stochastic system implemented on top of a grammar foundation.

As I like to remind myself, one of the greatest benefits of my style of research lies in my ability to immediately turn around when I sense a dead end.  Without a concrete aim or hypothesis, other than the overall goal of creating an infinite music system, I am completely free to try whatever methods I choose, creating and abandoning modules as necessary for maximum productivity.  Killing a module within two days of creating it isn't a bad thing.  It means more experience, more knowledge of what doesn't work, and more opportunity to find out what does work.

ChromoRhythm

ChromoRhythm is a new generative drum module that combines many of the recent ideas I've had surrounding percussive generation.  One of the main features of ChromoRhythm will be the genetic (chromosome-based) configuration storage system, which will facilitate some degree of dynamic mutation of the drum styles.  Though I have yet to decide how the evolutionary fitness function will be implemented and whether it will be subjective or objective, I believe that the ability to convert information between chromosome-like data structures and configuration variables will enable a degree of variability unknown to previous drum modules.

Naturally, ChromoRhythm will be measured against GGrewve, since GGrewve is the current standard in drum pattern generation (and an excellent standard, at that).  Though I doubt that ChromoRhythm will ever be able to surpass GGrewve in human-feeling beats, the new data paradigm employed by the engine should allow it to trump GGrewve in terms of dynamic playing and variability of style.

Many of the details of Chromo are still being worked out.  Here are a few things that are already coded or conceptually established (a rough sketch of the chromosome-to-configuration idea follows the list):

  • DNA data structure consists of a long string of numbers
  • Internal function converts between DNA strings (for use with the generalized XIAS evolutionary engine) and global configuration variable states
  • Engine uses a "base style" for each individual drum, as specified by the DNA
  • Engine layers "variations" on top of the base style for each individual drum, as specified by the DNA
  • Engine adjusts playing dynamics based on movement intensity as well as coordination accenting instructions; DNA specifies the degree and exact parameters of the dynamics
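
Since the chromosome is just a string of numbers, the conversion layer is conceptually tiny. Here's a minimal C++ sketch of the idea, assuming a one-digit-per-variable encoding; the field names are hypothetical, and the real DNA will pack in far more:

    #include <string>

    // Hypothetical configuration variables for a single drum; the real
    // engine will have many more, and genes may span multiple digits.
    struct DrumConfig {
        int baseStyle;       // "base style" index for this drum
        int variationLayer;  // which variation gets layered on top
        int dynamicsDepth;   // how strongly dynamics respond to intensity
    };

    // Decode a digit-string chromosome into configuration variables.
    DrumConfig decode(const std::string& dna) {
        return { dna[0] - '0', dna[1] - '0', dna[2] - '0' };
    }

    // Encode configuration variables back into a chromosome.
    std::string encode(const DrumConfig& cfg) {
        return { char('0' + cfg.baseStyle),
                 char('0' + cfg.variationLayer),
                 char('0' + cfg.dynamicsDepth) };
    }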

One of the main questions on which the development of ChromoRhythm hinges concerns the fitness function.  I see two ways of doing it:

Objective Fitness Function - ChromoRhythm takes a drummer, randomly mutates the DNA of the drummer, then runs a fitness function on the mutations.  The fitness function is the first-ever "ear module" employed in the development of mGen, and mimics a user that listens to the beat.  The fitness function assigns points based on how well the drummer lines up with the coordination instructions, how effectively the dynamics change the style, and how pleasing the beat is in general.  The XIAS evolutionary engine then evolves the drummer mutations to the next generation and repeats the process for a set number of generations (or, alternatively, until the population reaches a specified mean fitness value).

Subjective Fitness Function - ChromoRhythm creates a pool of potential drummers to begin with.  The user selects an option in the interface that makes Chromo generate a channel for each potential drummer during the composition of a single piece.  The user then listens to the piece, muting all but one channel, and assigning points to the drummers based on how much he or she likes the style.  The XIAS evolutionary engine then evolves the drummer pool, as detailed above.

One may see that these two methods have very little in common - in fact, the first details a method of evolving substyles of a single drummer configuration, while the second involves evolving overarching drummer configurations.  Perhaps, then, both of these methods can be combined.

Needless to say, the logistics of implementing a genetic algorithm aren't at all simple.  Conceptualizing the process takes far longer than actually implementing it.  Determining what exactly should be evolved, how populations should be treated, which individual should be played, and how fitness points should be assigned will all heavily affect the output of ChromoRhythm.

Notes on Implementing Genetic Algorithms

XIAS (cross-integrated algorithm system) supports the following algorithm types so far:

  • Fractal
  • Grammar
  • L-System

The next natural extension of XIAS is an implementation of genetic algorithms.  I have a bit of experience with genetic algorithms from designing the EvoSpeak engine, which basically combined stochastics and genetic algorithms to evolve towards good-sounding melodies.

In keeping with the overall theme of the XIAS project, the designs for this genetic algorithm implementation focus on portability and ease of use.  As such, the system may seem a little oversimplified.  Only time will tell whether it will suffice for algorithmic composition needs or whether I will need to extend the features of the genetic algorithms.  For now, here are some notes that I've brainstormed for the implementation of this new system, along with rough C++ sketches of how each piece might look.

Data Structure of the GA Engine

  • Engine
    • Alleles (effectively the algorithm's domain; string)
    • Cumulative point distribution (count; integer)
  • Individuals
    • Data String (genetic data; string)
    • Points (count; integer)
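
In C++ terms, that structure is nearly a direct transcription. A minimal sketch, not the final engine:

    #include <string>
    #include <vector>

    struct Individual {
        std::string data;  // genetic data: a string drawn from the alleles
        int points = 0;    // fitness points earned this generation
    };

    struct GAEngine {
        std::string alleles;       // effectively the algorithm's domain
        int cumulativePoints = 0;  // cumulative point distribution
        std::vector<Individual> population;
    };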

GA Engine Operation

  • Create m individuals with random alleles
  • Rate each of the individuals
    • Pass data string to transform function
    • Feed transformed output to fitness evaluation function
    • Distribute points according to fitness performance
  • Breed the top n individuals according to point distribution with each other to create m new individuals
  • Repeat with the new generation of individuals
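
And here's what one run of that loop might look like, using the structs above. The transform and fitness stubs stand in for the plugin-specific pieces, and the uniform crossover is just one of several reasonable breeding rules:

    #include <algorithm>
    #include <cstdlib>
    #include <string>
    #include <vector>

    // Stubs: transform() turns a DNA string into musical output,
    // fitness() scores that output. Real plugins supply their own.
    std::string transform(const std::string& dna) { return dna; }
    int fitness(const std::string& output) { return (int)output.size(); }

    void run(GAEngine& ga, int m, int n, int generations, int geneLength) {
        // Create m individuals with random alleles.
        for (int i = 0; i < m; ++i) {
            Individual ind;
            for (int g = 0; g < geneLength; ++g)
                ind.data += ga.alleles[std::rand() % ga.alleles.size()];
            ga.population.push_back(ind);
        }
        for (int gen = 0; gen < generations; ++gen) {
            // Rate: transform, evaluate fitness, distribute points.
            ga.cumulativePoints = 0;
            for (Individual& ind : ga.population) {
                ind.points = fitness(transform(ind.data));
                ga.cumulativePoints += ind.points;
            }
            // Keep the top n individuals by points...
            std::sort(ga.population.begin(), ga.population.end(),
                      [](const Individual& a, const Individual& b) {
                          return a.points > b.points;
                      });
            ga.population.resize(n);
            // ...and breed them with each other to create m new individuals.
            std::vector<Individual> next;
            while ((int)next.size() < m) {
                const Individual& p1 = ga.population[std::rand() % n];
                const Individual& p2 = ga.population[std::rand() % n];
                Individual child;  // uniform crossover of the two parents
                for (int g = 0; g < geneLength; ++g)
                    child.data += (std::rand() % 2 ? p1.data : p2.data)[g];
                next.push_back(child);
            }
            ga.population = next;  // repeat with the new generation
        }
    }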

Mutation in the GA Engine

  • Deterministic Approach
    • Create a function that takes two parent alleles and returns a third allele different from both (think cross product)
    • Run the function on every qth allele
  • Nondeterministic Approach
    • Replace a child allele with a random allele with a given frequency
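
Both approaches come out to only a few lines each. The cross-product-style combination rule below is an arbitrary placeholder (any function returning a third, different allele would do), and it assumes at least three distinct alleles and equal-length parents:

    #include <cstdlib>
    #include <string>

    // Deterministic: map two parent alleles to a third allele different
    // from both (think cross product), applied to every q-th gene.
    char crossAllele(char a, char b, const std::string& alleles) {
        size_t i = (alleles.find(a) + alleles.find(b) + 1) % alleles.size();
        if (alleles[i] == a || alleles[i] == b)
            i = (i + 1) % alleles.size();
        return alleles[i];
    }

    void mutateDeterministic(std::string& child, const std::string& partner,
                             const std::string& alleles, int q) {
        for (size_t g = q - 1; g < child.size(); g += q)
            child[g] = crossAllele(child[g], partner[g], alleles);
    }

    // Nondeterministic: replace a child allele with a random allele
    // at a given frequency (here, a percent chance per gene).
    void mutateRandom(std::string& child, const std::string& alleles, int pct) {
        for (char& gene : child)
            if (std::rand() % 100 < pct)
                gene = alleles[std::rand() % alleles.size()];
    }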

Quick Summary of GA Engine Implementation

  • Randomize Individuals
  • Rate
    • Functional Transformation
    • Fitness Evaluation
    • Point Allocation
  • Reproduce
    • Allele Combination
    • Allele Mutation
  • Repeat

The only real problem left to solve concerns reproduction.  How do we choose which individuals to combine with which in order to produce a population of exactly m members, given that we want the children to have the best possible combinations of genes based on the point distribution of the parents?
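
One standard answer from the genetic algorithm literature, though not necessarily the one I'll settle on, is fitness-proportionate ("roulette wheel") selection: each parent is drawn with probability proportional to its points, so strong scorers breed more often without shutting the weak out entirely. A sketch, reusing the Individual struct from above:

    #include <cstdlib>
    #include <vector>

    // Spin the wheel once; assumes totalPoints > 0.
    int pickParent(const std::vector<Individual>& pop, int totalPoints) {
        int spin = std::rand() % totalPoints;
        for (size_t i = 0; i < pop.size(); ++i) {
            spin -= pop[i].points;
            if (spin < 0) return (int)i;
        }
        return (int)pop.size() - 1;  // safety net
    }

Breeding then reduces to calling pickParent twice per child, m times: the population size stays exactly m, and the children still skew toward the best gene combinations.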

EvoSpeak - Optimization

Yes, I'm STILL working on getting the analysis part of EvoSpeak working. I now have the structure of the species' brains figured out and I've optimized the analysis engine a LOT, thanks to the new storage method of the analysis filters. So things are looking pretty good and soon enough I should be working within the interface of EvoSpeak instead of grinding around in the code.

I'll admit, progress on mGen is coming slowly. It still looks like I'm going to make the code deadline that I set for Wednesday (17,000 lines), which is encouraging. How is mGen shaping up in the big picture/long run? That's what I'm more worried about. I'm going to have to step back and take a serious look at what's going on after I hit this deadline. I don't even really have anything for Mr. Taranto to help with yet, even though our meeting is in under two weeks. It's time to step up to the plate.

EvoSpeak - Analysis

Work is starting to get pretty messy with EvoSpeak. I'm trying to design a very generalized analysis framework to allow easy analysis of just about any relationship. Doing so is not at all easy. My strategy is to set up a "perception" matrix that simulates the state of any given sample at any given point in time. The idea is that an analysis function can build a filter matrix and then call the perception function, which will subsequently compare the filter matrix to the perception matrix and gather appropriate statistics.

The first analysis to be implemented is, of course, a zeroth-order Markov. Fancy language aside, what it boils down to is this: did the species use a certain word (melodic or rhythmic) in a certain sample? So a zeroth-order Markov simply deals with the innate "quality" of certain words over other words, not taking into account ANY contextual variables.
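
As a sketch, the whole zeroth-order analysis is little more than a rating-weighted tally; the sample layout here is hypothetical:

    #include <map>
    #include <string>
    #include <vector>

    struct Sample {
        std::vector<std::string> words;  // words the species used
        int rating;                      // user's grade, 1 to 5
    };

    // Tally each word's innate "quality" with no contextual variables
    // at all: words from well-liked samples accumulate more points.
    std::map<std::string, int> zerothOrder(const std::vector<Sample>& samples) {
        std::map<std::string, int> quality;
        for (const Sample& s : samples)
            for (const std::string& w : s.words)
                quality[w] += s.rating;
        return quality;
    }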

Problems arise quickly with this general framework. It's very difficult to obtain certain state values to populate the matrix because of the grammar engine design. The melodic and rhythmic data streams are asynchronous, so melodic events don't necessarily line up with rhythmic events, which makes finding synchronous data (like perception data) very difficult. Apparently, I've already messed up in trying to separate the streams, as indicated by the presence of some anomalies in the preliminary statistics.

On top of all that, the analysis is a LOT slower than I thought it would be, even after serious reconstruction and optimization. I knew that it would take a lot of loops and recursions to do the analysis...but I thought the computer would just chew through them. Already a simple zeroth-order Markov analysis on the melody alone costs about 2.6 seconds. Using that number and performing the necessary exponential extrapolation, a second-order Markov analysis would take a whopping sixteen minutes, which is simply unacceptable. And that's only to level up once. I'm definitely going to have to figure something out there.
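
To put numbers on that extrapolation: sixteen minutes is about 960 seconds, roughly 370 times the 2.6-second baseline, which is what you'd expect if each additional order of analysis multiplies the work by a vocabulary of around 19 words. The exact scaling depends on the implementation, but the exponential blowup is the point.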

While I'm running into some obstacles, EvoSpeak is still advancing steadily, and I'm confident that the analysis functions will soon be fully functional.

EvoSpeak - Experience & Dörf

I finished a lot of EvoSpeak coding tonight. The training process is mostly coded. The species will spit out samples, the user listens and grades the performance, and then the results are submitted and stored to the species' memory. It's also possible to create new species now from within the configuration utility.

I created my first creature, Dörf, today. He speaks the default language. Why the name? I'm not sure, I just liked it. I've trained him 24 times so far, so he has 120 xp. He's actually ready to level up to level 2, since it only requires 100 xp to do so. Well, I guess as soon as I finish making the leveling-up algorithm (which is the real meat of this whole process, since it provides the "brains" for each species), he'll be good to go.

I look forward to working with Dörf; I hope he's a memorable first EvoSpeak species.

EvoSpeak - Getting Closer

I finished the preview builder and now have a working random pattern generator and previewer for EvoSpeak. I still can't submit ratings so species don't gain experience yet, but the hardest work is done...until it comes time to build the "leveling" mechanism (i.e. the Markov analysis tool).

And the results of the initial grammar runs? Good! Overall, I am very satisfied with what I'm hearing. Based on the twenty or so previews that I've listened to so far, the engine is much more interesting than GrammGen. It sounds a lot better.

The thing I really like, however, is that switching languages dramatically changes the previews. Of course the same was true for GrammGen, but I never built a second language for GrammGen because of the relative difficulty of editing the languages. In EvoSpeak there's a built-in language editor. It's as easy as slapping in some pipe-delimited numbers for rhythm and melody and listening to the results.

It took me thirty seconds to build a language that could be used for repetitive arps in the background. So I think I've found my solution for arpeggiation! The simpler the language, the more likely it is to repeat words - which is exactly what you want in a background pattern. After listening to some previews of the new language, I'm certain that this will be a very promising and flexible system.

So far EvoSpeak is going very well! The real question, however, has yet to be answered: will the "experience" and analysis system actually allow EvoSpeak to improve the quality of its output? The answer would seem to be a very obvious yes if I do everything right. But at the same time, it's hard to believe that listening to samples and pressing buttons can train a program to make better music. But who knows, I guess I'll just have to find out.

PS - It's worth noting, in case I was never clear about this, that EvoSpeak is NOT a grammatical subdivision engine like GGrewve; rather, it's a grammatical chain engine like GrammGen. Chains are simpler and easier to work with, but subdivision is more powerful. And yes, I coined both of those terms, which is why you won't find information on them anywhere else :)

EvoSpeak - Progress and Ideas

I'm still working on EvoSpeak, getting the engine all set up. I finished the random generating algorithms that will provide training material from which EvoSpeak will "learn." They also define the basis of the new grammar system, whose syntax is simpler even than that of GrammGen, but whose power is much greater.

Next I need to create the functions that will analyze the training material to figure out what attributes it has in terms of melody and rhythm. All of this analysis data will be stored in a training file that will also indicate how well the user likes the material. After a certain number of training pieces have been graded by the user, EvoSpeak will dig up all the analysis data and perform an extensive statistical analysis on it to try to find correlations and develop a "brain," so to speak, that will allow the program to function as an output device.

I'm still trying to figure out exactly what variables/attributes should be part of the "brain." This has always been my problem with statistical models; I've never known exactly what variables to draw statistics from. Now I've got to tackle the issue. I'll start simple - state variables (such as what beat the rhythmic or melodic object falls on) and first-order memory variables (what the last rhythmic or melodic object was) should work fine for the first version.
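
A sketch of what gathering those two classes of variables might look like, with illustrative types (a "word" here is just a string ID):

    #include <map>
    #include <string>
    #include <utility>
    #include <vector>

    // Context = (beat the word falls on, previous word). The nested map
    // counts how often each word appears in each context, weighted by
    // the user's rating of the sample.
    using Context = std::pair<int, std::string>;
    using Stats   = std::map<Context, std::map<std::string, int>>;

    void accumulate(Stats& stats, const std::vector<std::string>& words,
                    const std::vector<int>& beats, int rating) {
        for (size_t i = 1; i < words.size(); ++i)
            stats[{beats[i], words[i - 1]}][words[i]] += rating;
    }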

I plan to have EvoSpeak set up in an intuitive "leveling" kind of way that reflects a simple game. Before EvoSpeak will work, the user must first create a new "creature" that speaks a certain "language." At first the creature will have no idea how to speak the language; like a child, the creature must be shown how to use words to make sentences. The user "trains" the creature by listening to samples and rating them on a scale of 1 (strong dislike) to 5 (strong like). The creature gains XP (experience points) when the user listens to samples and submits ratings. When the creature has enough XP, it can "level up." During the leveling-up process (unbeknownst to the user), the creature actually goes back and analyzes all of the samples and ratings and essentially "learns" from the previous batch of material. The leveling system is good because it will ensure that correlations are relatively strong before they will be used to generate (i.e. the creature won't work without the user having trained it to a certain level).
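
Mechanically, the gating could be as simple as the sketch below. The five-xp-per-rating and 100-xp figures match the numbers in the Dörf post above; the threshold curve beyond level 2 is a pure guess:

    #include <vector>

    struct Creature {
        int xp = 0;
        int level = 1;
        std::vector<int> ratings;  // 1-5 grades from training sessions
    };

    void train(Creature& c, int rating) {
        c.ratings.push_back(rating);
        c.xp += 5;  // each rated sample grants experience
    }

    bool canLevelUp(const Creature& c) {
        return c.xp >= 100 * c.level;  // hypothetical threshold curve
    }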

At higher levels, creatures may learn the ability to analyze deeper variables other than states and first-order memories. Perhaps the creature gains more memory with each level (this is equivalent to increasing the order of the Markov chain analysis). Or perhaps the creature starts analyzing surface contours (3-variable functions) instead of 2-dimensional dependencies.

These are pretty abstract and crazy ideas, but I think they make sense, and I think they will provide a refreshing and intuitive break from the usual grind of KBSs. I'm interested to start training my first creature! And if the leveling system actually makes the music sound better (as intended)...well...I think I could spend all day leveling my creatures (is this starting to sound like Pokemon? That's neither my intent nor my inspiration).

EvoSpeak

Yes, I know. Way too many new plugins lately. I can't help it, I have to find some new inspiration. The assisted compositions are falling into a rut and mGen needs some variety pretty badly. So I'm trying a new melody method, EvoSpeak.

EvoSpeak will be the first plugin featuring a true hybrid engine. It will incorporate an evolutionary system based on a grammar engine. The grammar engine will not be a derivative of WordSmith or the WordEngine, nor will it copy the format of GGrewve. Instead, it will be based on the lighter and more manageable GrammGen engine, with some obvious modifications for greater efficiency. I've basically developed two grammar systems, and I have yet to hit the sweet spot for melodies. GrammGen introduced a very simple and very lightweight grammar system based on loosely defined words. The results were interesting but have failed to keep my attention for very long. The GGrewve engine brought with it a much deeper and extremely effective grammar engine. The GGrewve (and, subsequently, WordSmith) engine, however, is not altogether flexible. It basically requires MIDI files to create styles, since the words are much more complex than those of GrammGen.

With the EvoSpeak engine, I hope to achieve the impressive coherence and variety of the GGrewve engine but with the efficiency and originality of the GrammGen system. The evolutionary model should allow a fundamentally simple grammar to come together into a much more complex system. Loosely speaking, the EvoSpeak engine is also a statistical model. The evolutionary model actually "evolves" via a statistical model, but "speaks" via a grammar.

Implementation is in alpha stages right now, so I'm really not sure what to expect, since this is my first real endeavor into evolutionary models (I tried a very, very basic evolutionary drum model as one of my first plugins ever, but it was way too simple and poorly coded to achieve anything).