Interface Woes

I just can't get the main interface up to par with my expectations.  When it comes right down to it, I'm absolutely worthless at graphic/interface design, which means mGen is severely lacking in the aesthetics department.  I know it doesn't affect the music making, but it's very uninspiring to have such a boring, inartistic interface.  In a huge effort involving every smidgen of my artistic capabilities (and lack thereof), I have designed a new concept interface.

The new interface probably won't replace the current one; it will probably just be a higher-level tool for those that don't want to get as in-depth.  I'm not going to reveal any screenshots, but it basically takes all the technicality out of the work.  There's no compositional skeleton, no ugly "modular displays," and no render settings (well...not yet, at least).  You load modules by clicking on large tile-buttons that transform to indicate the loaded plugin.  Of course this means that generative and post-processing modules will be limited because there aren't infinite tiles.  Nonetheless, I think this interface will work nicely for quick test runs and for users with little technical skill.

Along with the new interface, I think I'll also re-code the framework from scratch, since it's getting to be the oldest part of mGen.  Granted, everything works right now.  But on days like today when a bug shows up, it's a nightmare to try to figure out what's causing it with the current framework, which was designed back in the day when I had no clue how to organize data structures appropriately.  It's way more complicated than it needs to be, and with a slick new interface I'd like a slick new backend as well.  This time I'll also try to make the framework componentry more independent of the interface, so I don't have to rewrite the framework if I change the interface again.

Hopefully mGen will be looking and functioning a lot smoother in the next week or so.

Moving Forward

Tonight was a happier night for mGen.  I'm in the process of making those changes of which I spoke in the last entry.  But today I focused more on EvoSpeak than anything.  I've got to stop writing a half-functional plugin and then leaving it in the dust.  I think EvoSpeak has a lot of potential and I need to finish it.

Training is already well-implemented in EvoSpeak and several species are coming along nicely.  I already have several simple languages as well.  As I mentioned earlier, it's amazing how much the language variations change the output.  It's the difference between listening to English and Chinese being spoken.  As far as analysis goes, the species now do zeroth-order Markov (at all levels), multivariable zeroth-order Markov comparing rhythm and melody (starting at level 3), zeroth-order Markov with beat relation, which gives the analysis a spatial/time dimension (starting at level 3), and first-order Markov (starting at level 5).  The analysis takes a very long time, much longer than I'd like it to.  But I'm sure I'll be able to optimize it further in the future.  Right now my most advanced species, Vlad, takes about a minute and a half to level up; he's at level 11 and has to sift through over 200 samples during the analysis.

At any rate, I finally built the backend of EvoSpeak, meaning it now functions as a plugin instead of just a configuration utility for a nonexistent plugin.  The backend isn't dynamic yet so the species makes a pattern and sticks to it for the entire composition, but it's a proof-of-concept kind of thing.

And my first results with EvoSpeak as a real plugin?  Very impressive.  I used Arpy, my most coherent species, to generate the sample.  Although Arpy is only level 3, his language is built to be an arpeggiation language, so it's very coherent and produces extremely nice results.  It was very fulfilling.

When airframe gets going and starts handling the new part system, we might be looking at the possibility of another 'Sample #3' (the sample that mGen composed completely autonomously a few months ago, which almost had me in tears, as I recounted in an earlier entry).

I look forward to 'Sample #4.'

Don't Blame airframe

The first active tests of airframe today were basically all failures.  The coherence hasn't improved at all.  But I'm not sure why I ever expected that it would.  You can't take all the problems that you've been ignoring for so long and blame them all on one module.  It's not airframe's fault.  There's nothing wrong with the L-system.

The fact of the matter is, generative modules aren't musicians yet.  They don't act, they don't think, they don't respond.  They aren't dynamic like musicians.  As I've said before, they shouldn't rely solely on the structure module.  They're too rigid right now.  I don't know how to fix this.  Obviously I need more variety in almost all the plugins.  But it's more than variety...it's the ability to know when to play what.  It's one of those "essential questions" that you can't defeat with sheer complexity.

So my game plan?  Multifaceted.

FIRST, rework the part classification system so that it's no longer based on instrument name.  "Piano" and "Strings" don't help the structure module arrange the piece at all.  It would be a lot more useful if parts were named according to their playing characteristics - "Melodic - Lead," "Melodic - Background," "Sustained," "Percussive," etc.  This way, the structure module's arrangements will make more sense.

SECOND, completely nix the "part intensity" instruction.  Musicians can figure out when to change volume based on how the composer moves his hands; it doesn't require individual instructions beforehand.  In other words, the generative modules will be responsible for crafting their own volume instructions.  Instead, there will be three part instruction states: "On," "Off," and "Focus."  This will eliminate the quantitative junk from the recommendation system, which is totally unnecessary and which I'm really getting fed up with.  This new black-and-white three-state system will tell modules whether to play, rest, or take focus.  The focus instruction isn't quite as obvious as the others, but it basically means that the module can draw attention to itself, via a shift in velocity or a more aggressive playing style.  It's not really a solo; it's meant to keep track of what the listener is focusing on so that modules can constantly shift the listener's attention and keep the song fresh.  (A rough sketch of this three-state idea follows this list.)

THIRD, make the generative modules "dynamic" in their playing.  Changeable styles, evolving parts, etc.  The musician is responsible for giving his part direction.  Simple to say, very, very hard to actually do.  This will be the most difficult.

FOURTH, establish "real" qualitative and quantitative criteria for classifying structure parts.  "Intensity" is the only variable I use right now, even though some modules recognize "tension" and "fullness" as well.  These abstractions really mean very little to the program right now though, probably because I don't have a good understanding of them myself.  Maybe a single "emotion" state variable would be more appropriate?  At any rate, I feel that the current system just isn't adequate.
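
To make the second point a bit more concrete, here's a minimal sketch of what the three-state instruction might look like in code.  The names and structure are hypothetical - this isn't mGen's actual data format, just the shape of the idea.

```python
# Minimal sketch of the On/Off/Focus idea (hypothetical names, not mGen's real format).
from enum import Enum

class PartState(Enum):
    OFF = 0     # rest for this section
    ON = 1      # play normally
    FOCUS = 2   # draw the listener's attention (higher velocity, busier style, etc.)

def arrange_section(part_names, focus_part):
    """Assign one state per part for a section, keeping exactly one part in focus."""
    return {name: (PartState.FOCUS if name == focus_part else PartState.ON)
            for name in part_names}

# Example: the lead takes focus while the background parts simply play.
print(arrange_section(["Melodic - Lead", "Melodic - Background", "Sustained"],
                      focus_part="Melodic - Lead"))
```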

This stuff won't happen overnight, so let's hope I have the endurance.

airframe - Lindenmayer Upgrade

I've been working on airframe lately in hopes of getting the structure up to par with the other modules (the progression module is my next target).  I've come to the conclusion that generative modules need to be a lot smarter than they are right now, because they shouldn't rely too heavily on the structure module to provide "solid" structure data.  Structures are really not very complex.  To be perfectly honest, the best option would probably be a very simple KBS with a pre-programmed list of structure forms (ABABCB, etc.).  Until then, I'm overshooting the complexity.  So the generative modules will be responsible for keeping the piece coherent and figuring out what to do if the structure module sends overly complicated (or overly simplified) instructions.
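
For what it's worth, the pre-programmed-forms idea really would be simple; here's a sketch of roughly what I mean (the form list and section names are just illustrative, not anything mGen actually uses).

```python
# Hypothetical sketch of a "list of structure forms" KBS: pick a known form and
# expand each letter into a section description.
import random

FORMS = ["ABABCB", "AABA", "ABAB", "ABACAB"]           # illustrative list only
SECTION_NAMES = {"A": "verse", "B": "chorus", "C": "bridge"}

def simple_structure(measures_per_section=8):
    form = random.choice(FORMS)
    return [{"label": letter,
             "section": SECTION_NAMES[letter],
             "measures": measures_per_section}
            for letter in form]

for section in simple_structure():
    print(section)
```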

The core system of airframe has been decided upon - at least for now.  I'm proud to introduce it as the first mGen plugin to use a Lindenmayer system, also known as an L-system, which looks like a simple grammar system at first glance but has roots in fractals.  I think the L-system will provide a refreshing dose of simplicity and organized complexity.  I really won't know what to expect until I hear it, and if it works, it'll be almost too good to be true given how easy an L-system is to implement.
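
For anyone unfamiliar with L-systems, the core mechanic is tiny: rewrite every symbol in a string in parallel, over and over.  A minimal sketch (the production rules below are made up for illustration, not airframe's actual grammar):

```python
# Minimal L-system sketch: parallel rewriting of symbols for a few generations.
RULES = {"A": "AB", "B": "AC", "C": "A"}   # hypothetical production rules

def expand(axiom, generations):
    s = axiom
    for _ in range(generations):
        s = "".join(RULES.get(symbol, symbol) for symbol in s)
    return s

# Read the result as a sequence of song sections; the self-similarity is what
# gives the "organized complexity."
print(expand("A", 5))
```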

Of course I plan on layering other processes over the L-system engine to refine the structure output (maybe an ear module?), so airframe will technically still be a hybrid plugin.

EvoSpeak is also still progressing.

EvoSpeak - Optimization

Yes, I'm STILL working on getting the analysis part of EvoSpeak working. I now have the structure of the species' brains figured out and I've optimized the analysis engine a LOT, thanks to the new storage method of the analysis filters. So things are looking pretty good and soon enough I should be working within the interface of EvoSpeak instead of grinding around in the code.

I'll admit, progress on mGen is coming slowly. It still looks like I'm going to make the code deadline that I set for Wednesday (17,000 lines), which is encouraging. How is mGen shaping up in the big picture/long run? That's what I'm more worried about. I'm going to have to step back and take a serious look at what's going on after I hit this deadline. I don't even really have anything for Mr. Taranto to help with yet, even though our meeting is in under two weeks. It's time to step up to the plate.

EvoSpeak - Analysis

Work is starting to get pretty messy with EvoSpeak. I'm trying to design a very generalized analysis framework to allow easy analysis of just about any relationship. Doing so is not at all easy. My strategy is to set up a "perception" matrix that simulates the state of any given sample at any given point in time. The idea is that an analysis function can build a filter matrix and then call the perception function, which will subsequently compare the filter matrix to the perception matrix and gather appropriate statistics.
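
In sketch form, the idea looks something like the following.  The field names and the assumption that melody and rhythm share a time index are mine, made for simplicity (and the paragraph after next explains why that assumption is exactly the hard part).

```python
# Sketch of the perception/filter idea under assumed data shapes: a "perception"
# is a snapshot of a sample's state at one point in time, and a filter just
# names the fields an analysis cares about.
def perceive(sample, time):
    """Hypothetical snapshot of a sample's state at a given time step."""
    return {
        "beat": time % sample["beats_per_measure"],
        "melodic_word": sample["melody"][time],
        "rhythmic_word": sample["rhythm"][time],
    }

def matches(perception, filter_matrix):
    """True if every field named in the filter has the required value."""
    return all(perception.get(key) == value for key, value in filter_matrix.items())

def count_matches(sample, filter_matrix):
    """Gather a simple statistic: how often the filter matches across the sample."""
    return sum(matches(perceive(sample, t), filter_matrix)
               for t in range(len(sample["melody"])))
```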

The first analysis to be implemented is, of course, a zeroth-order Markov. Fancy language aside, what it boils down to is this: did the species use a certain word (melodic or rhythmic) in a certain sample? So a zeroth-order Markov simply deals with the innate "quality" of certain words over other words, not taking into account ANY contextual variables.
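
A zeroth-order analysis is therefore little more than frequency counting.  A rough sketch, assuming samples carry a word list and a user rating (the rating cutoff and field names are my own placeholders):

```python
# Zeroth-order Markov in sketch form: count how often each word appears in the
# well-rated samples, ignoring all context.
from collections import Counter

def zeroth_order_stats(samples):
    counts = Counter()
    for sample in samples:
        if sample["rating"] >= 4:                 # only learn from liked samples
            counts.update(sample["melodic_words"])
    total = sum(counts.values()) or 1
    return {word: n / total for word, n in counts.items()}   # a word's innate "quality"
```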

Problems arise quickly with this general framework. It's very difficult to obtain certain state values to populate the matrix because of the grammar engine design. The melodic and rhythmic data streams are asynchronous, so melodic events don't necessarily line up with rhythmic events, which makes finding synchronous data (like perception data) very difficult. Apparently, I've already messed up in trying to separate the streams, as indicated by the presence of some anomalies in the preliminary statistics.

On top of all that, the analysis is a LOT slower than I thought it would be, even after serious reconstruction and optimization. I knew that it would take a lot of loops and recursion to do the analysis...but I thought the computer would just chew through them. Already a simple zeroth-order Markov analysis on the melody alone costs about 2.6 seconds. Using that number and performing the necessary exponential extrapolation, a second-order Markov analysis would take a whopping sixteen minutes, which is simply unacceptable. And that's only to level up once. I'm definitely going to have to figure something out there.
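
For the record, that sixteen-minute figure is a back-of-the-envelope extrapolation; the vocabulary size below is a made-up placeholder chosen so the arithmetic lines up with the measured 2.6 seconds.

```python
# Back-of-the-envelope only: assume the cost grows by roughly a factor of the
# vocabulary size for each added order of the Markov analysis.
vocab_size = 19             # hypothetical word count
zeroth_order_cost = 2.6     # seconds, measured
second_order_cost = zeroth_order_cost * vocab_size ** 2
print(second_order_cost / 60)   # ~15.6 minutes
```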

While I'm running into some obstacles, EvoSpeak is still advancing steadily and I'm confident that the analysis functions will soon be fully-functional.

EvoSpeak - Experience & Dörf

I finished a lot of EvoSpeak coding tonight. The training process is mostly coded. The species will spit out samples, the user listens and grades the performance, and then the results are submitted and stored to the species' memory. It's also possible to create new species now from within the configuration utility.

I created my first creature, Dörf, today. He speaks the default language. Why the name? I'm not sure, I just liked it. I've trained him 24 times so far, so he has 120 xp. He's actually ready to level up to level 2, since it only requires 100 xp to do so. Well, I guess as soon as I finish making the leveling-up algorithm (which is the real meat of this whole process, since it provides the "brains" for each species), he'll be good to go.

I look forward to working with Dörf; I hope he's a memorable first EvoSpeak species.

EvoSpeak - Getting Closer

I finished the preview builder and now have a working random pattern generator and previewer for EvoSpeak. I still can't submit ratings so species don't gain experience yet, but the hardest work is done...until it comes time to build the "leveling" mechanism (i.e. the Markov analysis tool).

And the results of the initial grammar runs? Good! Overall, I am very satisfied with what I'm hearing. Based on the twenty or so previews that I've listened to so far, the engine is much more interesting than GrammGen. It sounds a lot better.

The thing I really like, however, is that switching languages dramatically changes the previews. Of course the same was true for GrammGen, but I never built a second language for GrammGen because of the relative difficulty of editing the languages. In EvoSpeak there's a built-in language editor. It's as easy as slapping in some pipe-delimited numbers for rhythm and melody and listening to the results.
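
To give a feel for how lightweight that is, here's an illustrative mock-up of a language definition; I'm guessing at the exact shape, not reproducing EvoSpeak's actual file format.

```python
# Illustrative only: a tiny pipe-delimited language for rhythm and melody.
language = {
    "rhythm": "1|2|2|4",          # hypothetical rhythmic words (e.g. note lengths)
    "melody": "0|2|4|7|4|2",      # hypothetical melodic words (e.g. scale steps)
}

rhythm_words = [int(x) for x in language["rhythm"].split("|")]
melody_words = [int(x) for x in language["melody"].split("|")]
print(rhythm_words, melody_words)
```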

It took me thirty seconds to build a language that could be used for repetitive arps in the background. So I think I've found my solution for arpeggiation! The simpler the language, the more likely it is to repeat words - which is exactly what you want in a background pattern. After listening to some previews of the new language, I'm certain that this will be a very promising and flexible system.

So far EvoSpeak is going very well! The real question, however, has yet to be answered: will the "experience" and analysis system actually allow EvoSpeak to improve the quality of its output? The answer would seem to be a very obvious yes if I do everything right. But at the same time, it's hard to believe that listening to samples and pressing buttons can train a program to make better music. But who knows, I guess I'll just have to find out.

PS - It's worth noting, in case I was never clear about this, that EvoSpeak is NOT a grammatical subdivision engine like GGrewve; rather, it's a grammatical chain engine like GrammGen. Chains are simpler and easier to work with, but subdivision is more powerful. And yes, I coined both of those terms, which is why you won't find information on them anywhere else :)

EvoSpeak - Progress and Ideas

I'm still working on EvoSpeak, getting the engine all set up. I finished the random generating algorithms that will provide training material from which EvoSpeak will "learn." They also define the basis of the new grammar system, whose syntax is simpler even than that of GrammGen, but whose power is much greater.

Next I need to create the functions that will analyze the training material to figure out what attributes they have in terms of melody and rhythm. All of this analysis data will be stored in a training file that will also indicate how well the user likes the material. After a certain number of training pieces have been graded by the user, EvoSpeak will dig up all the analysis data and perform an extensive statistical analysis on it to try to find correlations and develop a "brain," so to speak, that will allow the program to function as an output device.

I'm still trying to figure out exactly what variables/attributes should be part of the "brain." This has always been my problem with statistical models; I've never known exactly what variables to draw statistics from. Now I've got to tackle the issue. I'll start simple - state variables (such as what beat the rhythmic or melodic object falls on) and first-order memory variables (what the last rhythmic or melodic object was) should work fine for the first version.
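
In code, those two variable families might look like this - the field names are assumptions, but the distinction between a state variable and a first-order memory variable is the point.

```python
# Sketch: extract a "state" variable (which beat a word falls on) and a
# first-order "memory" variable (the previous word) for each word in a sample.
def training_features(words, beats_per_measure=4):
    features = []
    previous = None
    for beat, word in enumerate(words):
        features.append({
            "state_beat": beat % beats_per_measure,   # state variable
            "memory_previous": previous,              # first-order memory variable
            "word": word,
        })
        previous = word
    return features
```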

I plan to have EvoSpeak set up in an intuitive "leveling" kind of way that reflects a simple game. Before EvoSpeak will work, the user must first create a new "creature" that speaks a certain "language." At first the creature will have no idea how to speak the language; like a child, the creature must be shown how to use words to make sentences. The user "trains" the creature by listening to samples and rating them on a scale of 1 (strong dislike) to 5 (strong like). The creature gains XP (experience points) when the user listens to samples and submits ratings. When the creature has enough XP, it can "level up." During the leveling-up process (unbeknownst to the user), the creature actually goes back and analyzes all of the samples and ratings and essentially "learns" from the previous batch of material. The leveling system is good because it ensures that correlations are relatively strong before they're used for generation (i.e. the creature won't work until the user has trained it to a certain level).
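
The bookkeeping side of that is straightforward; a rough sketch follows. The 5 XP per rating and the 100 XP threshold for level 2 are consistent with figures mentioned elsewhere in this log, while the higher thresholds are placeholders.

```python
# Rough sketch of creature XP bookkeeping (thresholds beyond level 2 are made up).
XP_PER_RATING = 5
LEVEL_THRESHOLDS = [0, 100, 250, 450]   # XP required to reach levels 1..4

class Creature:
    def __init__(self, name, language):
        self.name, self.language = name, language
        self.xp, self.level = 0, 1
        self.training = []                # (sample, rating) pairs awaiting analysis

    def rate_sample(self, sample, rating):
        self.training.append((sample, rating))
        self.xp += XP_PER_RATING

    def can_level_up(self):
        return (self.level < len(LEVEL_THRESHOLDS)
                and self.xp >= LEVEL_THRESHOLDS[self.level])
```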

At higher levels, creatures may learn the ability to analyze deeper variables other than states and first-order memories. Perhaps the creature gains more memory with each level (this is equivalent to increasing the order of the Markov chain analysis). Or perhaps the creature starts analyzing surface contours (3-variable functions) instead of 2-dimensional dependencies.

These are pretty abstract and crazy ideas, but I think they make sense, and I think they will provide a refreshing and intuitive break from the usual grind of KBSs. I'm interested to start training my first creature! And if the leveling system actually makes the music sound better (as intended)...well...I think I could spend all day leveling my creatures (is this starting to sound like Pokemon? That's neither my intent nor my inspiration).

EvoSpeak

Yes, I know. Way too many new plugins lately. I can't help it, I have to find some new inspiration. The assisted compositions are falling into a rut and mGen needs some variety pretty badly. So I'm trying a new melody method, EvoSpeak.

EvoSpeak will be the first plugin featuring a true hybrid engine. It will incorporate an evolutionary system based on a grammar engine. The grammar engine will not be a derivative of WordSmith or the WordEngine, nor will it copy the format of GGrewve. This grammar engine will be based on the lighter and more manageable GrammGen engine, with some obvious modifications for greater efficiency. I've basically developed two grammar systems and I have yet to hit the sweet spot for melodies. GrammGen introduced a very simple and very lightweight grammar system based on loosely-defined words. The results were interesting but have failed to keep my attention for very long. The GGrewve engine brought with it a much deeper and extremely effective grammar engine. The GGrewve (and, subsequently, WordSmith) engine, however, is not altogether flexible. It basically requires MIDI files to create styles, since the words are much more complex than those of GrammGen.

With the EvoSpeak engine, I hope to achieve the impressive coherence and variety of the GGrewve engine but with the efficiency and originality of the GrammGen system. The evolutionary model should allow a fundamentally simple grammar to come together into a much more complex system. Loosely speaking, the EvoSpeak engine is also a statistical model. The evolutionary model actually "evolves" via a statistical model, but "speaks" via a grammar.
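
In other words, whatever statistics the evolution step produces end up biasing the grammar's word choices at generation time. A loose sketch, with all names as placeholders:

```python
# Loose sketch: learned word scores (however the "brain" produced them) weight
# which grammar words get chosen when the creature "speaks."
import random

def speak(word_scores, length):
    words = list(word_scores)
    weights = [word_scores[w] for w in words]
    return [random.choices(words, weights=weights)[0] for _ in range(length)]

# word_scores would come from the statistical analysis done while leveling up.
print(speak({"A": 0.5, "B": 0.3, "C": 0.2}, length=8))
```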

Implementation is in alpha stages right now so I'm really not sure what to expect since this is my first real endeavor into evolutionary models (I tried a very, very basic evolutionary drum model as one of my first plugins ever, but it was way too simple and poorly-coded to achieve anything).