I've set aside the physics engine for a while to undertake another huge project - one that lies at the core of everything I do. My new goal: eliminate marching cubes. It's slow, it's memory intensive, it has no respect for surface topology, and it's not easy to make adaptive.
Instead, I'm building an algorithm that will polygonize a surface using... the surface! Just as my adapted version of marching cubes ('crawling cubes') crawls along the surface, so will my new algorithm. But, unlike MC, my algorithm will orient triangles so that they actually approximate the surface.
This is definitely going to be the most challenging algorithm I've ever designed, and I'm sure edge cases and subtleties that I haven't considered yet are going to turn this into an absolute nightmare... but the reward for success is huge: polygonization quality and speed are absolutely essential to my entire engine, since every mesh has to be polygonized at some point. MC is currently responsible for many of the annoyances that I see on a daily basis: poorly calculated normals (due to sliver triangles), high poly counts (due to the lack of adaptivity), and meshes with holes (due to the lack of a bulletproof method for determining the isosurface bounds). A good surface-based algorithm will fix all of the above.
At any rate, I'm excited to have some initial results to share! It's not much, but you can see the formation of something that loosely resembles a surface! There's no adaptability yet, and the surface doesn't handle self-intersection (which I anticipate will be the most difficult part of the algorithm). Still, it's an encouraging start, and the surface quality clearly beats out anything MC could produce.
Huge breakthrough today!
I've written a new algorithm that I call the "Crawling Cubes Algorithm". It works extremely well and is ultra-efficient. Right now, it's looking about 4-5 times faster than my most heavily optimized Marching Cubes, and roughly 20x faster than vanilla Marching Cubes. Crawling Cubes is almost algorithmically optimal, meaning that it makes almost no unnecessary evaluations of the density field.
Crawling Cubes is specifically designed for metaball fields. It takes advantage of the fact that it can gain access to the positions of the metaballs. Using this information, Crawling Cubes is able to find a "starting point" cell on every disjoint isosurface. The algorithm then propagates over the surface until all cells containing the surface have been polygonized. Since any given cube that intersects an isosurface has perfect information about whether or not each of its neighbors intersects the isosurface, Crawling Cubes can propagate over the entire surface without wasting any evaluations on cells that don't intersect the surface.
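The seed-and-propagate idea described above can be sketched roughly as follows. This is a minimal Python illustration, not the engine's code: the `r^2 / d^2` falloff, the isolevel of 1, the "walk along +x from the ball's centre" seeding rule, and every name in it (`density`, `crawl`, `FACES`, ...) are assumptions made for the example. It only collects the surface cells; triangle generation is left out.

```python
from collections import deque
from math import floor

ISO = 1.0  # assumed isolevel

def density(p, balls):
    """Assumed falloff: sum of r^2 / d^2 over all metaballs (cx, cy, cz, r)."""
    x, y, z = p
    return sum(r * r / max((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2, 1e-12)
               for cx, cy, cz, r in balls)

CORNERS = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
# For each of the 6 neighbor directions, the 4 cell corners on the shared face.
FACES = {}
for axis in range(3):
    for side in (0, 1):
        step = [0, 0, 0]
        step[axis] = 1 if side else -1
        FACES[tuple(step)] = [c for c in CORNERS if c[axis] == side]

def mixed(flags):
    """True when some, but not all, samples are inside the surface."""
    return any(flags) and not all(flags)

def crawl(balls, h):
    """Return (surface cells, number of density evaluations) for cell size h."""
    cache = {}

    def inside(i, j, k):  # memoised sample at grid corner (i, j, k)
        if (i, j, k) not in cache:
            cache[(i, j, k)] = density((i * h, j * h, k * h), balls) >= ISO
        return cache[(i, j, k)]

    def corner_flags(cell, offsets):
        i, j, k = cell
        return [inside(i + a, j + b, k + c) for a, b, c in offsets]

    seen, queue = set(), deque()
    for cx, cy, cz, _ in balls:
        # Seed: start in the cell holding the ball's centre and walk along +x
        # while the cell is entirely inside; stop at the first mixed cell.
        cell = (floor(cx / h), floor(cy / h), floor(cz / h))
        while all(corner_flags(cell, CORNERS)):
            cell = (cell[0] + 1, cell[1], cell[2])
        if mixed(corner_flags(cell, CORNERS)) and cell not in seen:
            seen.add(cell)
            queue.append(cell)

    while queue:
        cell = queue.popleft()
        # (a real implementation would emit this cell's triangles here)
        for step, face in FACES.items():
            # A face with mixed corners is cut by the surface, so the
            # neighbor sharing that face must contain the surface too.
            if mixed(corner_flags(cell, face)):
                nxt = (cell[0] + step[0], cell[1] + step[1], cell[2] + step[2])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen, len(cache)
```

Calling `crawl([(0.0, 0.0, 0.0, 1.0)], 0.25)` returns the set of cells cut by a unit sphere plus the number of field samples taken, which stays well below the count for a full grid because only corners of surface cells (and the short seed walk) are ever evaluated. One seed per ball is itself an assumption; it would miss an enclosed cavity that contains no ball centre.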
To be fair, I didn't come up with this idea on my own. I found the idea on this page and quickly realized how brilliant it was and how perfectly it would work for metaball meshes.
The algorithm can now handle that 60k-poly mesh of 30 metaballs that I mentioned in my last post in well under a second. Furthermore, Crawling Cubes never causes topological artifacts. Unlike my previous hacky optimizations, this one preserves all topology (at the expense of generality, since it works only for metaball meshes).
I'm really, really excited about the development of this new algorithm and can't wait to put it to good use with ultra-high-quality procedural models!
I've spent an insane amount of time over the past few days trying to optimize the Marching Cubes algorithm, since it's going to be such a pivotal part of my project. Ideally, I'd like all meshes to be parametrized in terms of isosurfaces for memory efficiency as well as for the ability to quickly prototype models.
I experimented with both octree culling and optimizations to the original, brute-force algorithm. Surprisingly, the brute-force algorithm ended up outperforming the octree in all cases. Furthermore, I failed to find a reliable way of predicting whether or not an octree node would intersect the isosurface given just the density values at its corners, so reconstruction often produced meshes with holes in them.
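A tiny Python example shows why corner values alone can mislead an octree cull. The falloff (`r^2 / d^2`), the isolevel of 1, and the names `density` and `corners_look_empty` are assumptions for illustration: a small metaball sits entirely inside a node, so all eight corners read "outside" even though the node clearly contains surface.

```python
from itertools import product

def density(p, balls):
    """Assumed falloff: sum of r^2 / d^2 over all metaballs (cx, cy, cz, r)."""
    x, y, z = p
    return sum(r * r / max((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2, 1e-12)
               for cx, cy, cz, r in balls)

def corners_look_empty(lo, size, balls, iso=1.0):
    """True if all 8 node corners fall on the same side of the isolevel."""
    flags = [density((lo[0] + a * size, lo[1] + b * size, lo[2] + c * size),
                     balls) >= iso
             for a, b, c in product((0, 1), repeat=3)]
    return all(flags) or not any(flags)

balls = [(0.5, 0.5, 0.5, 0.2)]  # small ball fully inside the unit node
culled = corners_look_empty((0.0, 0.0, 0.0), 1.0, balls)
centre_inside = density((0.5, 0.5, 0.5), balls) >= 1.0
print(culled, centre_inside)  # prints: True True
```

The node would be culled, yet its centre is inside the surface, which is exactly the kind of miss that leaves a hole in the reconstructed mesh.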
Ultimately, I ended up cutting polygonization time down by a factor of four. I'm still using a brute-force technique. However, by using some rather hacky linear interpolation on selected parts of the density field, I've managed to avoid having to compute the entire field directly. The method basically involves computing a derivative of the field and then using a very conservative linear approximation of the density to skip a few computations whenever the values are determined to be at a "safe" distance from the isolevel.
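One way such a derivative-based skip could work along a single row of samples is sketched below in Python. This is a guess at the general shape of the trick, not the engine's actual code: the `r^2 / d^2` falloff, the analytic x-derivative, the `safety` factor, the `max_skip` cap, and the function names are all assumptions. It only classifies samples as inside or outside; samples next to the surface are still evaluated exactly because their margin is small.

```python
def density_and_slope(x, y, z, balls):
    """Return (f, df/dx) for the assumed r^2 / d^2 metaball falloff."""
    f = slope = 0.0
    for cx, cy, cz, r in balls:
        dx = x - cx
        d2 = max(dx * dx + (y - cy) ** 2 + (z - cz) ** 2, 1e-12)
        f += r * r / d2
        slope += -2.0 * r * r * dx / (d2 * d2)
    return f, slope

def classify_row(x0, h, n, y, z, balls, iso=1.0, safety=4.0, max_skip=3):
    """Inside/outside flags for n samples spaced h apart, plus evaluations used."""
    flags, evals, i = [], 0, 0
    while i < n:
        f, slope = density_and_slope(x0 + i * h, y, z, balls)
        evals += 1
        side = f >= iso
        flags.append(side)
        i += 1
        # Linear estimate of how far the field can move per step, inflated
        # by a safety factor because the true field is not linear.
        margin = abs(f - iso)
        per_step = safety * abs(slope) * h
        if margin >= max_skip * per_step:
            skip = max_skip
        else:
            skip = int(margin / per_step)
        skip = min(skip, n - i)
        # Assume the next `skip` samples stay on the same side; do not evaluate.
        flags.extend([side] * skip)
        i += skip
    return flags, evals
```

For a row through a single unit ball, e.g. `classify_row(-3.0, 0.1, 61, 0.0, 0.0, [(0.0, 0.0, 0.0, 1.0)])`, the flags match a full evaluation while using fewer samples. This is a heuristic, though: a field that curves sharply enough can outrun the linear estimate, which is why the safety factor and the cap on consecutive skips are deliberately cautious.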
With this technique, the engine can build a 60k poly model (high quality) of a density field containing roughly 30 metaballs in less than 2 seconds. That's pretty fast! I'm going to continue to seek optimizations for this algorithm, but for now I'm much more pleased than I used to be with the speed. Isosurface extraction is starting to look very feasible for a game engine!
With the power of a working isosurface extraction algorithm, I decided, quite naturally, to test the capabilities of marching cubes on the 3-dimensional analogue of my favorite noise function - Perlin noise.
Coding a Perlin noise function to work in tandem with the marching cubes algorithm was simply a matter of porting the 3D Perlin noise function that I had already written in HLSL over to equivalent CPU code.
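For reference, a CPU version of 3D Perlin noise might look something like the Python sketch below. It follows the structure of Ken Perlin's "improved noise" (quintic fade, hashed gradients, trilinear blend), but it is not the HLSL function mentioned above: the seeded shuffle used for the permutation table and the names `make_perm` and `perlin3` are assumptions for illustration.

```python
import math
import random

def make_perm(seed=0):
    """Doubled 256-entry permutation table (seeded shuffle; an assumption)."""
    p = list(range(256))
    random.Random(seed).shuffle(p)
    return p + p

def fade(t):
    """Quintic smoothing curve 6t^5 - 15t^4 + 10t^3."""
    return t * t * t * (t * (t * 6 - 15) + 10)

def lerp(t, a, b):
    return a + t * (b - a)

def grad(h, x, y, z):
    """Dot product of (x, y, z) with one of 12 gradient directions picked by h."""
    h &= 15
    u = x if h < 8 else y
    v = y if h < 4 else (x if h in (12, 14) else z)
    return (u if h & 1 == 0 else -u) + (v if h & 2 == 0 else -v)

def perlin3(x, y, z, perm):
    xi, yi, zi = math.floor(x), math.floor(y), math.floor(z)
    X, Y, Z = xi & 255, yi & 255, zi & 255      # lattice cell, wrapped to 0..255
    x, y, z = x - xi, y - yi, z - zi            # position inside the cell
    u, v, w = fade(x), fade(y), fade(z)
    A = perm[X] + Y
    AA, AB = perm[A] + Z, perm[A + 1] + Z
    B = perm[X + 1] + Y
    BA, BB = perm[B] + Z, perm[B + 1] + Z
    # Blend the gradient contributions from the 8 cell corners.
    return lerp(w,
                lerp(v,
                     lerp(u, grad(perm[AA], x, y, z), grad(perm[BA], x - 1, y, z)),
                     lerp(u, grad(perm[AB], x, y - 1, z), grad(perm[BB], x - 1, y - 1, z))),
                lerp(v,
                     lerp(u, grad(perm[AA + 1], x, y, z - 1), grad(perm[BA + 1], x - 1, y, z - 1)),
                     lerp(u, grad(perm[AB + 1], x, y - 1, z - 1), grad(perm[BB + 1], x - 1, y - 1, z - 1))))
```

Used as a density field, each grid corner would call something like `perlin3(x * freq, y * freq, z * freq, perm)` (with `freq` a hypothetical scale), so every sample costs eight gradient evaluations and seven interpolations before marching cubes even starts.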
The results, though quite intriguing to look at, aren't particularly beautiful.
Unfortunately, 3D Perlin noise does not seem to be feasible for real-time use. Generating one or two meshes with it may be acceptable, but making heavy use of the function simply isn't an option if loading times are to be kept reasonable. It's slow. Really slow. Perlin noise is already a computationally expensive function, and I'm now running the 3D version of it (which is, by nature, more expensive than 2D) on the CPU rather than the GPU. The CPU simply isn't capable of crunching the math fast enough.
In the long run, I'm going to look into alternatives to 3D Perlin noise. The function would be a great asset for A New Reality, but it's not practical for real-time use at the moment - at least on the CPU. In the future, I'll try to look into DirectX 10's stream output, as I know that it can be used to perform marching cubes on the GPU (probably in a ludicrously small fraction of the time that it takes the CPU).
Since I haven't had much time recently for major coding sessions (between classes and a social life), most of my work over the past few weeks has involved optimization. After downloading Intel's Parallel Studio trial, I immediately fell in love with the large suite of optimization tools it offers. Armed with such tools, I revisited my implementations of the Marching Cubes algorithm and my mesh simplification algorithm, which together take up a huge amount of CPU time. Both, however, are absolutely critical to A New Reality.
After lots of tuning, I have the isosurface extraction algorithm running about 15-20 times faster and the mesh simplification algorithm running about 5 times faster. Combined, these speed increases allow for the creation of ultra-high detail meshes in almost no time.
On a side note, I also wrote my own (simple) implementation of the STL vector that runs about 5 times faster than the STL version for typical usage. Not bad! As a bonus, it shaves 2 KB off of the packed executable. And here I was thinking all along that STL was made by the Gods of optimization.
After a great many problems and long nights of debugging, I finally have the marching cubes polygonization algorithm up and working in my engine! I was hesitant to implement the algorithm in the first place, since it requires a lot of constant global memory (several KB for the lookup tables), and I'm trying to keep my executable size as low as possible. In the end, however, it was worth the few extra kilobytes. The results are incredible! Isosurface extraction may be the answer to all of A New Reality's problems!
Using isosurface extraction, it will be loads easier to create procedural models. My fear, however, is performance. The algorithm is sluggish at best, even on powerful CPUs. No wonder, considering it has to sample tens, possibly hundreds, of thousands of grid points in the density function. For high-quality procedural modeling to be feasible using isosurface extraction, I'm going to have to hope that CPU power increases a lot in the near future. Until then, I will continue attempting to optimize the algorithm as well as explore multithreaded options to prevent lockups during mesh creation.
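To put rough numbers on that sampling cost, here is a back-of-the-envelope calculation in Python. The grid resolutions and the metaball count are example figures, and `brute_force_cost` is a hypothetical helper: a brute-force pass samples every corner of an n x n x n grid of cells, and each sample sums over every metaball.

```python
def brute_force_cost(cells_per_axis, num_balls):
    """Return (corner samples, per-ball falloff evaluations) for a full grid."""
    samples = (cells_per_axis + 1) ** 3   # corners, not cells
    return samples, samples * num_balls

for n in (32, 64):
    samples, ball_evals = brute_force_cost(n, 30)
    print(n, samples, ball_evals)
# prints:
# 32 35937 1078110
# 64 274625 8238750
```

So a 32-cell grid already needs tens of thousands of samples and a 64-cell grid hundreds of thousands, with the per-sample cost growing linearly in the number of metaballs.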
I hope to post some results of the algorithm soon!