
Ambient Occlusion II

January 25, 2013 General News 7 Comments

Last time I talked about AO, but I left out a teensy little detail: although per-vertex AO is very easy to implement, and also extremely fast to render, it's extremely slow to compute during the pre-process.  Getting high-quality, noise-free AO requires somewhere in the vicinity of 1000 samples of the density field per vertex.  Not exactly a cheap operation!  On the CPU, it quickly becomes prohibitively expensive as either the complexity of the density field or the resolution of the mesh increases.

Today, I moved the computation to the GPU, and have once again been blown away by the computational abilities of modern GPUs.  Now that I have every piece of the mesh computation process - the field evaluation, the gradient evaluation, and, finally, the AO evaluation - running on the GPU, it's simply mind-blowing how high I can push the quality of the mesh and the complexity of the density field.

Here's a mesh consisting of 50 unioned and subtracted round boxes (round boxes are very expensive compared to sharp-edged ones), contoured on a grid of 300 x 300 x 300 (that's an insane level of detail, FYI), resulting in half a million vertices, each of which takes 1024 AO samples.  The GPU performs this work in ~3 seconds.  Incredible.

High-Quality, Per-Vertex AO Computed on the GPU

 

But that's not even the most amazing part.  The amazing thing is that, after profiling, it would seem that the GPU actually takes less than 1 second to complete this work.  It is OpenGL's shader compiler (which, of course, is running on the CPU) that takes the majority of the time.  This isn't too surprising, as the shaders to compute these things are massive, since I actually bake the field equation into the shader.  I'm sure GL spends a long time analyzing and optimizing the equation, which is a good thing, because the shader runs absurdly fast.
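To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It uses only the figures quoted above (roughly half a million vertices, 1024 samples each, ~3 seconds in total, under 1 second on the GPU), all of which are approximate, so treat the results as rough orders of magnitude rather than measurements.

```python
# Rough throughput estimate from the approximate figures quoted in the post.
vertices = 500_000            # "half a million vertices"
samples_per_vertex = 1024     # AO samples per vertex

# Every AO sample is one evaluation of the density field.
field_evals = vertices * samples_per_vertex

total_seconds = 3.0           # "~3 seconds", including shader compilation
gpu_seconds = 1.0             # upper bound: the GPU part takes "less than 1 second"

print(f"field evaluations:     {field_evals:,}")
print(f"end-to-end rate:       {field_evals / total_seconds:,.0f} per second")
print(f"GPU-only rate, at least {field_evals / gpu_seconds:,.0f} per second")
```

That works out to about 512 million field evaluations, and since the GPU part takes under a second, the shader itself is sustaining more than half a billion evaluations per second.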

Unfortunately, this brings up a few unwanted issues - I now have a CPU bottleneck that can't be easily offloaded to another thread.  Since the bottleneck is inside GL, I will need to explore multithreaded GL contexts in order to compile the shaders in another thread while the game runs, because I can't have the game stalling every time a new asset enters the region and the corresponding shaders have to be compiled.  Sadly, this probably won't be too easy, but I'm sure I'll learn a lot...!

Another, less tractable problem is that the shader compiler flat-out crashes once the field passes a certain complexity.  I will need to explore this some more.  It might just be the fact that my field function dumps an incredibly ugly equation into the shader (it's literally a single line, with hundreds of functions wrapped together).  Perhaps breaking it up will prevent the crash.  Or maybe I've hit some kind of hard limit on the allowed complexity of pixel shaders.  If that's the case, I could explore a solution that uploads the equation as a texture, and write a shader that understands how to parse an equation from a texture.  But that would no doubt be significantly slower than baking the equation into the shader...probably at least an order of magnitude slower :/
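As an illustration of what "breaking it up" could mean, here is a minimal sketch of a code generator that emits one temporary per operation instead of one giant nested expression. Everything in it is an assumption made for the example: the tuple-based expression tree, the helper names `roundBox` and `neg`, and the choice of `float` temporaries are hypothetical and are not taken from the actual field code. Whether this form would avoid the compiler crash is untested.

```python
def emit_nested(node):
    """Emit the whole tree as a single nested expression (the one-line style)."""
    if isinstance(node, str):          # leaf: a variable or constant name
        return node
    op, *args = node
    return f"{op}({', '.join(emit_nested(a) for a in args)})"


def emit_flat(node):
    """Emit one statement per operation; returns (statements, result_name)."""
    statements = []

    def visit(n):
        if isinstance(n, str):
            return n
        op, *args = n
        names = [visit(a) for a in args]   # emit the operands first
        name = f"t{len(statements)}"       # fresh temporary for this operation
        statements.append(f"float {name} = {op}({', '.join(names)});")
        return name

    return statements, visit(node)


# Hypothetical field: (box0 union box1) minus box2, written as min/max of distances.
tree = ("max",
        ("min", ("roundBox", "p", "b0", "r0"), ("roundBox", "p", "b1", "r1")),
        ("neg", ("roundBox", "p", "b2", "r2")))

print(emit_nested(tree))
statements, result = emit_flat(tree)
print("\n".join(statements))
print(f"return {result};")
```

The flat form computes exactly the same expression, but each statement contains a single function call, so the compiler never has to parse one line with hundreds of nested calls.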

But for now, I will allow myself to be happy with these results, and am most definitely looking forward to working on ships again with this technology in hand!

Ambient Occlusion

January 24, 2013 General News 0 Comments

Finally, just like every other modern game, Limit Theory now has highly exaggerated ambient occlusion.  At least mine isn't screen-space, though.  A while back I dropped SSAO because I was tired of it.  It's expensive, requires absurd levels of tweakery just to get right, varies with distance to the object, etc.  It's bad, let's face it.  Per-vertex AO probably isn't an option for many games, because the tessellation level usually varies a lot across a mesh.  For per-vertex to work, you need a consistent level of tessellation across the whole surface.  Hey, guess which really stupid method of creating surfaces provides just that?  Yep, uniform-grid contouring of distance fields.  Marching Cubes, Dual Contouring, you name it.  I used to think that the uniform level of tessellation, even on really flat surfaces, was a big problem.  Now I realize that we can actually use it to our advantage by storing useful information at the vertex level!

Oh, another bonus: the occlusion factor is almost too easy to compute with a scalar field.  As in, the function that I wrote to take a mesh and bake AO data in is literally 15 lines of very readable, simple code.  Man, I love fields 🙂
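For readers wondering what such a bake might look like, here is a minimal sketch. It is not the actual Limit Theory code; it assumes a field that is negative inside solid material, samples random points in the hemisphere above each vertex, and counts how many land inside the solid. The names `vertex_ao`, `hemisphere_direction`, and `floor_and_wall`, the sampling radius, and the uniform distance weighting are all choices made for this example.

```python
import math
import random


def hemisphere_direction(normal, rng):
    """Return a random unit vector in the hemisphere around `normal`."""
    while True:
        # Rejection-sample a point inside the unit ball, then normalize it.
        d = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        length_sq = sum(c * c for c in d)
        if 1e-12 < length_sq <= 1.0:
            break
    inv = 1.0 / math.sqrt(length_sq)
    d = [c * inv for c in d]
    # Flip directions that point into the surface.
    if sum(a * b for a, b in zip(d, normal)) < 0.0:
        d = [-c for c in d]
    return d


def vertex_ao(field, position, normal, samples=1024, radius=0.5, seed=0):
    """Estimate openness at a vertex: 1.0 = unoccluded, 0.0 = fully occluded.

    Assumes field(p) < 0 means that p lies inside solid material.
    """
    rng = random.Random(seed)
    occluded = 0
    for _ in range(samples):
        d = hemisphere_direction(normal, rng)
        t = rng.uniform(0.0, radius)            # distance along the direction
        q = [p + t * c for p, c in zip(position, d)]
        if field(q) < 0.0:                      # the sample landed inside the solid
            occluded += 1
    return 1.0 - occluded / samples


def floor_and_wall(p):
    """Hypothetical test field: solid below y = 0 and behind x = 0."""
    return min(p[0], p[1])


open_floor = vertex_ao(floor_and_wall, (5.0, 0.0, 0.0), (0.0, 1.0, 0.0))
corner = vertex_ao(floor_and_wall, (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(open_floor, corner)
```

A vertex out on the open floor comes back as 1.0 (nothing above it is solid), while a vertex right where the floor meets the wall loses roughly half of its hemisphere to the wall and comes back noticeably darker. Baking a whole mesh is then just a loop over its vertices, storing one value per vertex.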

The result is beautiful! (At least I think.)

Per-Vertex Ambient Occlusion

 

I love this kind of occlusion, because, unlike SSAO, it does not vary with distance.  So you can zoom in as far as you like, and those darkly shaded corners are still dark (a result that is impossible with screen-space ambient occlusion techniques).  Note that I exaggerated it a lot for this shot; it probably won't be that heavy in the game!

Anyway, I'm very excited about using this in LT.  I can't wait to see how good ships will look once I get the ship generator converted to use scalar fields rather than polygons!!!

PS ~ You might be concerned about performance, given the large number of vertices that methods like Marching Cubes (MC) and Dual Contouring (DC) output.  Indeed, at first it seems like a valid concern.  But then again, vertices are extremely cheap these days.  Vertex shaders are significantly lighter than fragment shaders, and also get run significantly less frequently.  This means that in almost every modern game, you will find yourself very much bottlenecked by the fragment shader.  Furthermore, with field methods, you get perfect LOD meshes for free: you just turn down the contouring resolution.  So even if you're putting more vertices on screen, you can easily control this by setting up LOD levels.  With analytic normals, even low-resolution meshes tend to look quite good!  So I don't think vertex density is something to be too worried about.
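To make the LOD point concrete, here is a small sketch that counts the grid cells a surface passes through at several contouring resolutions. Dual Contouring places one vertex in each cell that contains a sign change, so this count is a reasonable stand-in for the vertex count of the resulting mesh. The unit-sphere field, the grid bounds, and the function names are assumptions chosen for the example, not part of the actual engine.

```python
import math


def sphere_field(p, radius=1.0):
    """Signed distance to a sphere at the origin (negative inside)."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius


def count_surface_cells(field, n, lo=-1.5, hi=1.5):
    """Count the cells of an n x n x n grid whose corners straddle the surface."""
    step = (hi - lo) / n
    # Evaluate the field once per grid corner and keep only the inside/outside bit.
    inside = [[[field((lo + i * step, lo + j * step, lo + k * step)) < 0.0
                for k in range(n + 1)]
               for j in range(n + 1)]
              for i in range(n + 1)]
    count = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                corners = [inside[i + a][j + b][k + c]
                           for a in (0, 1) for b in (0, 1) for c in (0, 1)]
                if any(corners) and not all(corners):   # sign change in this cell
                    count += 1
    return count


counts = {n: count_surface_cells(sphere_field, n) for n in (8, 16, 32)}
for n, c in counts.items():
    print(f"resolution {n:>2}: {c} surface cells")
```

Halving the resolution cuts the surface cell count by roughly a factor of four, since the count scales with surface area divided by the cell face area. The same field can therefore produce a whole chain of LOD meshes just by changing `n`.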