The One Man MMO Project

The story of a lone developer's quest to build an online world :: MMO programming, design, and industry commentary


Scalable Ambient Obscurance for the Mathematically Uninclined

Scalable Ambient Obscurance (SAO) is one of the many new art and rendering improvements in **Bold New World**, the latest update to *The Imperial Realm::Miranda*, my seamless open-world real-time strategy game currently in Early Access. Learn more at theimperialrealm.com.

SAO is a well-regarded algorithm for rendering ambient occlusion (the dark area in concave corners where ambient light has trouble reaching) and is a significant rendering upgrade for Miranda. The original paper by Morgan McGuire, Michael Mara and David Luebke does a good job of describing the algorithm, and there is a reference HLSL implementation. I needed something in GLSL for my OpenGL engine, and at first blush, SAO has a **lot** of moving parts.

You can see the difference SAO makes in the image above. The effect on the lighting of the rocks is remarkable. With the procedural rocks I was building at the same time, I really needed something that would bring them to life beyond normal maps and texturing effects. You might notice a slight straight line of darkening on the ground between the rock formations. I spent quite a bit of time tweaking the parameters of the SAO shader to get rid of unwanted occlusion, and I talk about that in detail below. The real bonus of SAO is that it also improves the lighting on all the building and vehicle models that weren't otherwise getting any work for Bold New World.

[Miranda's Bold New World trailer shows SAO in action.]

I had a terrible time getting SAO to work, mainly because I started from this example from three.js which is written in GLSL. **Do NOT use this example!** It is missing several important elements of the algorithm and does others incorrectly. I wasted a lot of time trying to rationalize the two implementations into something I could understand and that worked.

[For a first attempt, this is actually a pretty good result.]

**The SAO shader here is not identical to the reference implementation.** I changed a number of elements during my implementation to take into account the language differences, render environment and artistic preferences. I note the differences versus the original implementation and why I chose them.

There is arguably a bit of math here, but I am trying to keep it to the "this does this" and "just do this" variety. I hope that, combined with the source code, it might help someone else avoid the math entirely.

As I've said before, my render setup is +X to the right, +Y up and +Z coming out of the screen. The C++ code snippets are particular to Miranda, but hopefully they should provide enough information to reproduce the effect in any program.

SAO's primary input is a depth buffer of the scene. SAO uses this depth to calculate the locations of the points it samples on the hemisphere around the point it is currently shading. I do a depth pre-pass where I render the scene with this shader to prepare the depth buffer in world coordinates (in my case, metres.) The render target for the depth pass is a full resolution single channel 32-bit float. The sceneDepth uniform is packed from zNear and zFar, the near and far clip plane distances in world coordinates.
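Here's a sketch of how that uniform can be packed on the C++ side (the struct and function names here are illustrative, not Miranda's actual code; the packing itself matches the comments in the depth shader):

```cpp
#include <cassert>
#include <cmath>

// Stand-in for the engine's 3-component vector type.
struct Vector3 { float r, g, b; };

// Pack the clip planes the way the depth pre-pass shader expects:
// r = 2*zNear*zFar, g = zFar+zNear, b = zFar-zNear.
Vector3 MakeSceneDepthUniform( float zNear, float zFar )
{
    return Vector3{ 2.0f * zNear * zFar, zFar + zNear, zFar - zNear };
}

// Same expression as the shader: depthm = r / ( g - ndcZ * b ).
// With the packing above, ndcZ = -1 recovers zNear and ndcZ = +1 recovers zFar.
float DepthFromNdc( const Vector3& sceneDepth, float ndcZ )
{
    return sceneDepth.r / ( sceneDepth.g - ndcZ * sceneDepth.b );
}
```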

By far the trickiest part of SAO is reconstructing the view space position. I spent a lot of time getting the depthm value to calculate correctly. Luckily I was able to use GDebugger to examine the depth buffer for sane values.

```glsl
uniform mat4 modelViewProjectionMatrix;
uniform vec3 sceneDepth;

attribute vec3 vertex3;

varying float depthm;

void main()
{
    vec4 vertex = vec4( vertex3.xyz, 1.0 );
    vec4 position = modelViewProjectionMatrix * vertex;
    /* https://stackoverflow.com/questions/6652253/getting-the-true-z-value-from-the-depth-buffer */
    /* sceneDepth.r = 2.0 * zNear * zFar */
    /* sceneDepth.g = zFar + zNear */
    /* sceneDepth.b = zFar - zNear */
    depthm = sceneDepth.r / ( sceneDepth.g - ( ( 2.0 * position.z / position.w ) - 1.0 ) * sceneDepth.b );
    gl_Position = position;
}
```

```glsl
varying float depthm;

void main()
{
    gl_FragColor = vec4( depthm, 0.0, 0.0, 0.0 );
}
```

Once the depth buffer is built, we're ready for the SAO pass. The SAO render target must be full resolution. I originally had a half-resolution buffer, and with that SAO only worked properly on objects very near the camera. I spent a lot of time trying to puzzle out that little conundrum. The render target for the SAO pass is a single byte (R8.)

All that needs to be rendered for this pass is a single fullscreen quad. Here's the C++ setup for that (vertex parameters are X,Y,Z,U,V) since the UV coordinates feed into the SAO shader when reconstructing the view position.

```cpp
float left   = -1.0f;
float right  =  1.0f;
float top    =  1.0f;
float bottom = -1.0f;
Vertex vertices[] = {
    Vertex( left,  top,    0.05f, 0.0f, 1.0f ),
    Vertex( left,  bottom, 0.05f, 0.0f, 0.0f ),
    Vertex( right, bottom, 0.05f, 1.0f, 0.0f ),
    Vertex( right, top,    0.05f, 1.0f, 1.0f )
};
```

The projection matrix and inverse projection matrix are calculated the regular way, for me it looks like this:

```cpp
mProjectionMatrix = Projections::Perspective( fovy, aspect, FRUSTUM_Z_NEAR_M, FRUSTUM_Z_FAR_M );
mInverseProjectionMatrix = mProjectionMatrix;
mInverseProjectionMatrix.Inverse();
```

We also need to calculate the height of a 1m object 1m from the camera in pixels. mRenderDevice.GetHeight returns the height of the render device in pixels.

```cpp
Vector4 nt( 0.0f,  0.5f, -1.0f, 1.0f );
Vector4 nb( 0.0f, -0.5f, -1.0f, 1.0f );
nt = mProjectionMatrix * nt;
nb = mProjectionMatrix * nb;
float mSaoProjectionScale = ( ( nt[ 1 ] / nt[ 3 ] ) - ( nb[ 1 ] / nb[ 3 ] ) ) * mRenderDevice.GetHeight();
```
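Assuming Projections::Perspective follows the standard gluPerspective convention (an assumption; your engine's matrix layout may differ), the nt/nb construction reduces to a closed form, because the projection's y scale is cot(fovy/2) and w = -z = 1 for both points. A hedged sketch:

```cpp
#include <cassert>
#include <cmath>

// Closed form of the nt/nb calculation above: the NDC y-span of a 1m-tall
// object 1m from the camera is 1/tan(fovy/2), times the screen height in pixels.
float SaoProjectionScale( float fovyRadians, float screenHeightPx )
{
    return screenHeightPx / std::tan( fovyRadians * 0.5f );
}
```

This is handy as a sanity check against the matrix version: at a 90 degree vertical FOV the scale equals the screen height exactly.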

The vertex shader's only job in the SAO pass is to send texCoord0 along to the fragment shader to be used in position reconstruction.

```glsl
uniform mat4 modelViewProjectionMatrix;

attribute vec3 vertex3;
attribute vec2 texCoords0;

varying vec2 texCoord0;

void main()
{
    texCoord0 = texCoords0;
    vec4 vertex = vec4( vertex3.xyz, 1.0 );
    gl_Position = modelViewProjectionMatrix * vertex;
}
```

The entirety of the SAO fragment shader is presented here with discussion.

```glsl
// Scalable Ambient Obscurance-ish ambient occlusion shader
// Paper: http://research.nvidia.com/sites/default/files/pubs/2012-06_Scalable-Ambient-Obscurance/McGuire12SAO.pdf
// Examples:
// http://casual-effects.com/g3d/G3D10/data-files/shader/AmbientOcclusion/AmbientOcclusion_AO.pix
// http://casual-effects.com/g3d/G3D10/G3D-app.lib/include/G3D-app/AmbientOcclusionSettings.h
// https://github.com/PeterTh/gedosato/blob/master/pack/assets/dx9/SAO.fx
// https://gist.github.com/bhouston/1dc2a760783314b95bd9
// Visual example but don't look at the shader: https://threejs.org/examples/#webgl_postprocessing_sao
```

See the Tuning section below for how to use these settings.

```glsl
// Number of samples to take on disk around sample location. Higher value better quality and worse performance.
const int NUM_SAMPLES = 19;
// Number of rotations around disk.
const int NUM_ROTATIONS = 7;
// The size in scene coordinates (metres) of the disk to search for ambient occlusion.
const float kernelRadiusM = 1.0;
// Increment bias slowly to get rid of shading on otherwise flat geometry. If you turn it up
// too far, you'll get white lines on edges.
const float bias = 0.18;
// Increment this until shading looks the way you like (makes it darker.) 1.0 - 6.0.
const float intensity = 4.0;
// This is how much a point needs to differ in depth from the sample point to be considered a discontinuity.
const float maxDepthDiffM = 2.0;
// This is the distance from max distance where we start to fade out the AO effect.
const float saoFadeDepthM = 50.0;

// Screen geometry depth in metres (32-bit single channel texture.)
uniform sampler2D geometryDepth;
// Screen resolution in pixels.
uniform vec2 screenResolution;
// Random value (if you want temporal change in your AO.)
uniform float random;
// Height of 1m object 1m from camera in pixels.
uniform float saoProjectionScale;
// Max distance for a point to be processed for SAO (metres.)
uniform float saoMaxDistanceM;
// Perspective projection matrix for scene.
uniform mat4 saoProjection;
// Inverse of perspective projection matrix for scene.
uniform mat4 saoInverseProjection;

// Texture coordinates for full screen polygon (0,0-1,1)
varying vec2 texCoord0;

const float TWOPI = 2.0 * 3.1415926535897932384626433832795;
const float ANGLE_STEP = TWOPI * float( NUM_ROTATIONS ) / float( NUM_SAMPLES );
const float oneOverKernelRadiusMSquared = 1.0 / ( kernelRadiusM * kernelRadiusM );
```

This random is a popular GLSL random function I found on StackOverflow.

```glsl
// Get a random value from 0-1.
float rand( vec2 co )
{
    return fract( sin( dot( co.xy, vec2( 12.9898, 78.233 ) ) ) * 43758.5453 );
}
```
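If you want to inspect the sample pattern offline, the same hash is easy to port to the CPU. A small sketch (float precision will differ slightly from the GPU, but the range and determinism are the same; fract(x) in GLSL is x - floor(x)):

```cpp
#include <cassert>
#include <cmath>

// CPU port of the shader's hash: deterministic pseudo-random value in [0,1)
// from a 2D coordinate.
float Rand( float x, float y )
{
    float s = std::sin( x * 12.9898f + y * 78.233f ) * 43758.5453f;
    return s - std::floor( s );
}
```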

To me, getViewPosition is the really clever part of SAO: it uses the depth buffer and screen coordinates to calculate the view space position of any point on screen. The view space point can be used to figure out if the scene geometry at the screen location is obscuring the point we are processing. This was by far the most difficult part to get correct. Depth buffers in most implementations seem to be in normalized device coordinates (NDC) (-1 to 1), but I had a pre-existing depth buffer in metres which was used by other passes, so I needed to get the depth converted back. The screen UVs in the 0-1 range also get converted to NDC. The saoInverseProjection matrix then transforms the NDC coordinates back into view space. Don't forget to divide by w. The saoInverseProjection matrix is just the inverse of the standard perspective projection.

```glsl
// Get view space position from screen coordinates and depth in metres.
vec3 getViewPosition( const in vec2 screenPosition, const in float depthm )
{
    // Convert metres to normalized device coordinates (-1 to 1)
    // https://stackoverflow.com/questions/6652253/getting-the-true-z-value-from-the-depth-buffer
    float depth = 0.5 * ( -1.0 * saoProjection[2].z * depthm + saoProjection[3].z ) / depthm + 0.5;
    // Screen coordinates (0-1) to normalized device coordinates (-1 to 1 for x,y,z)
    vec4 ndcPosition = vec4( screenPosition.xy * 2.0 - 1.0, depth, 1.0 );
    // Normalized device coordinates to view space coordinates.
    // https://stackoverflow.com/questions/1352564/mapping-from-normalized-device-coordinates-to-view-space
    vec4 clipSpacePosition = saoInverseProjection * ndcPosition;
    return vec3( clipSpacePosition.xyz / clipSpacePosition.w );
}
```
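As a sanity check on the metres-to-depth conversion, here is the same expression on the CPU, assuming a standard gluPerspective-style matrix (an assumption; Projections::Perspective may use a different convention). In GLSL's column-major indexing, saoProjection[2].z is the matrix's (z,z) element and saoProjection[3].z is the z-translation term. With those values the conversion maps the near plane to 0 and the far plane to 1:

```cpp
#include <cassert>
#include <cmath>

// The two projection-matrix terms the shader reads.
// P22 corresponds to saoProjection[2].z, P32 to saoProjection[3].z.
struct DepthTerms { float P22, P32; };

// Standard OpenGL-style perspective values for these two terms.
DepthTerms PerspectiveDepthTerms( float zNear, float zFar )
{
    return DepthTerms{ -( zFar + zNear ) / ( zFar - zNear ),
                       -2.0f * zFar * zNear / ( zFar - zNear ) };
}

// Same expression as the shader: view depth in metres -> post-projection depth.
float DepthFromMetres( const DepthTerms& p, float depthm )
{
    return 0.5f * ( -p.P22 * depthm + p.P32 ) / depthm + 0.5f;
}
```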

SAO works by taking a number of samples in a hemisphere around the point it is trying to shade and calculating how close they are to the surface normal at that point. The closer they are, the more they occlude. It does this by drawing a vector from the centre of the hemisphere to the sample point.

There is a lot of variety in the implementation of this function in the many examples I looked at, specifically in the falloff function. The shader examples linked at the top of the shader offer a number of falloff functions to choose from. After some experimentation I decided not to use a falloff function at all. I was also surprised to discover what I think is a bug in the reference implementation.

```glsl
// Get occlusion value for sample point (1 is occluded.)
float getOcclusion( const in vec3 centerViewPosition, const in vec3 centerViewNormal, const in vec3 sampleViewPosition )
{
    // epsilon prevents divide by zero.
    const float epsilon = 0.0002;
    vec3 viewDelta = sampleViewPosition - centerViewPosition;
    float viewDeltaLengthSquared = dot( viewDelta, viewDelta );
    float vn = dot( viewDelta, centerViewNormal );
    // First clamp: one at centre of sample hemisphere.
    // Second clamp: one where vector near normal; bias provides dead zone at centre.
    // vn is viewDelta length squared. Divide by viewDeltaLengthSquared to get into range 0-1.
    float occlusion = clamp( 1.0 - ( viewDeltaLengthSquared * oneOverKernelRadiusMSquared ), 0.0, 1.0 ) *
                      clamp( vn - bias, 0.0, 1.0 ) / ( epsilon + viewDeltaLengthSquared );
    // Most implementations have a falloff function here. I like it better without.
    // The standard example seems to be bugged, I think occlusion should be the *last* parameter to mix, don't really
    // like either version, and intensity here seems to have a different range than intensity below.
    //return occlusion * mix( 0.9 + intensity * 0.5, 1.1 - intensity * 0.15, occlusion );
    return occlusion;
}
```

```glsl
// Get ambient occlusion value for pixel. 0 is occluded.
float getAmbientOcclusion( const in vec3 centerViewPosition, const in float centredepthm )
{
```

Here's some shader magic: this recreates the normal at the center position from screen-space derivatives. It's not perfect, but it works well enough for this.

```glsl
    vec3 centerViewNormal = normalize( cross( dFdx( centerViewPosition ), dFdy( centerViewPosition ) ) );
```

In order for SAO to work for both nearby and distant features in your scene (particularly if it is large, like Miranda's), we calculate the hemisphere radius in pixels at whatever depth the centre is. If the radius is less than 1 pixel we can't calculate SAO for this pixel.

```glsl
    float discRadiusPx = saoProjectionScale * kernelRadiusM / centredepthm;
    // Can't calculate AO with this small of a radius
    if ( discRadiusPx <= 1.0 )
    {
        return 1.0;
    }
```

We take a number of samples around the hemisphere. The angle is calculated from the screen position and a random value that is fed to the shader by the game. The SAO shading changes constantly if the angle is truly random, but I preferred the SAO to be stable when the camera wasn't moving, so I commented out the random value.

The algorithm for calculating the sampleUv coordinates in the reference implementation is entirely different from what is presented here. This algorithm spirals out from one pixel to discRadiusPx pixels over NUM_ROTATIONS.

```glsl
    // If random is included, AO shading changes every frame. For Miranda it looks better static.
    float angle = rand( texCoord0 /* + random */ ) * TWOPI;
    float occlusion = 0.0;
    for( int i = 0; i < NUM_SAMPLES; i++ )
    {
        // Ensure that the samples are at least 1 pixel away
        float radiusPx = max( 1.0, discRadiusPx * float( i ) / float( NUM_SAMPLES - 1 ) );
        // Convert radius in pixels to texture coordinates.
        float radius = radiusPx / screenResolution[ 1 ];
        // Get sample from angle and radius from centre.
        vec2 sampleUv = texCoord0 + vec2( cos( angle ), sin( angle ) ) * radius;
        angle += ANGLE_STEP;
```
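The spiral is easy to inspect on the CPU. This sketch mirrors the loop above and generates the (radius, angle) pairs, which you can plot to see the pattern (the Sample struct and function name are illustrative, not shader code):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

const int   NUM_SAMPLES   = 19;
const int   NUM_ROTATIONS = 7;
const float TWOPI         = 6.2831853f;
const float ANGLE_STEP    = TWOPI * float( NUM_ROTATIONS ) / float( NUM_SAMPLES );

struct Sample { float radiusPx, angle; };

// Mirrors the shader loop: radii grow linearly from 1px out to discRadiusPx
// while the angle winds NUM_ROTATIONS times around the disc.
std::vector<Sample> SpiralSamples( float discRadiusPx, float startAngle )
{
    std::vector<Sample> samples;
    float angle = startAngle;
    for( int i = 0; i < NUM_SAMPLES; i++ )
    {
        float radiusPx = std::fmax( 1.0f, discRadiusPx * float( i ) / float( NUM_SAMPLES - 1 ) );
        samples.push_back( Sample{ radiusPx, angle } );
        angle += ANGLE_STEP;
    }
    return samples;
}
```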

SAO implementations recommend rendering a slightly larger depth buffer than the actual screen so that SAO can sample outside the regular screen borders to calculate the SAO at the borders of the screen. I decided just to let the SAO effect fade out at the screen borders and didn't go for the extra complexity of rendering a larger depth buffer. The borders of Miranda's screen are a little busy, so you can't really see it anyway.

```glsl
        // If sample is outside texture, then ignore. Ideally we would render a
        // border around the screen so we could sample there.
        if( ( sampleUv.x < 0.0 ) || ( sampleUv.x > 1.0 ) || ( sampleUv.y < 0.0 ) || ( sampleUv.y > 1.0 ) )
        {
            continue;
        }
```

This is another diversion from the reference algorithm. Because I have the depth in metres, it is a simple matter to detect depth discontinuities compared to the original algorithm's clever and not entirely reliable use of the values of the calculated normals.

```glsl
        float sampledepthm = texture2D( geometryDepth, sampleUv ).r;
        // If sample is beyond max distance, or the depth difference between the
        // points is larger than maxDepthDiffM (depth discontinuity) then ignore.
        if ( ( sampledepthm > saoMaxDistanceM ) || ( abs( sampledepthm - centredepthm ) > maxDepthDiffM ) )
        {
            continue;
        }
        occlusion += getOcclusion( centerViewPosition, centerViewNormal, getViewPosition( sampleUv, sampledepthm ) );
    }
```

This is a place where there are a number of choices of function in the example shaders linked above. Feel free to tweak this function to best visual effect.

```glsl
    return pow( clamp( 1.0 - sqrt( occlusion / float( NUM_SAMPLES ) ), 0.0, 1.0 ), intensity );
}
```

```glsl
void main()
{
    float depthm = texture2D( geometryDepth, texCoord0 ).r;
    if ( depthm > saoMaxDistanceM )
    {
        gl_FragColor = vec4( 1.0, 1.0, 1.0, 1.0 );
    }
    else
    {
        float ambientOcclusion = getAmbientOcclusion( getViewPosition( texCoord0, depthm ), depthm );
```

[This is what it looks like if SAO is applied to the sky.]

This was my addition to darken the scene more where there was occlusion.

```glsl
        // With our settings and functions, ambientOcclusion ranges from 0.5 to 1.0
        // so remap that to 0.0 to 1.0 so it darkens the scene better.
        ambientOcclusion = ( ambientOcclusion - 0.5 ) * 2.0;
        // Fade AO out at a distance from the camera. Nobody wants AO on the sky.
        float fadeNear = saoMaxDistanceM - saoFadeDepthM;
        if( depthm > fadeNear )
        {
            ambientOcclusion = mix( ambientOcclusion, 1.0, ( depthm - fadeNear ) / saoFadeDepthM );
        }
        gl_FragColor = vec4( ambientOcclusion, ambientOcclusion, ambientOcclusion, 1.0 );
    }
```

I needed a lot of visualizations to debug this shader, here are a few of the prettier ones.

```glsl
    // Debug code
    // Render normals to buffer.
    // vec3 centerViewPosition = getViewPosition( texCoord0, depthm );
    // vec3 centerViewNormal = normalize( cross( dFdx( centerViewPosition ), dFdy( centerViewPosition ) ) );
    // gl_FragColor = vec4( ( centerViewNormal + 1.0 ) * 0.5, 1.0 );
    // Render position to buffer.
    // gl_FragColor = vec4( getViewPosition( texCoord0, depthm ) + 0.5, 1.0 );
    // Render NDC depth to buffer.
    // float depth = 0.5 * ( -1.0 * saoProjection[2].z * depthm + saoProjection[3].z ) / depthm + 0.5;
    // gl_FragColor = vec4( depth / 2.0 + 0.5, 0.0, 0.0, 1.0 );
    // Render depth to buffer.
    // gl_FragColor = vec4( depthm / 650.0, depthm / 650.0, depthm / 650.0, 1.0 );
}
```

The output render target from the SAO pass (textureSource) is blurred using a two pass (horizontal and vertical) 7 sample Gaussian blur. I ping-pong the render targets and the first blur pass render target is half resolution to save memory and to make it even blurrier. The second blur pass renders back into the same buffer the SAO pass rendered into. The blur passes are each just a single full-screen quad.

```cpp
// Must match width/height scale of render target.
const float32 SAO_BLUR_COEF = 0.5;
// scale for H and V passes.
mSAOScaleH[ 0 ] = 1.0 / ( mRenderDevice.GetWidth() * SAO_BLUR_COEF );
mSAOScaleH[ 1 ] = 0.0;
mSAOScaleV[ 0 ] = 0.0;
mSAOScaleV[ 1 ] = 1.0 / ( mRenderDevice.GetHeight() * SAO_BLUR_COEF );
```

```glsl
uniform mat4 modelViewProjectionMatrix;

attribute vec3 vertex3;
attribute vec2 texCoords0;

varying vec2 texCoord0;

void main()
{
    texCoord0 = texCoords0;
    vec4 vertex = vec4( vertex3.xyz, 1.0 );
    gl_Position = modelViewProjectionMatrix * vertex;
}
```

```glsl
uniform vec2 scale;
uniform sampler2D textureSource;

varying vec2 texCoord0;

void main()
{
    vec4 color = vec4( 0.0 );
    color += texture2D( textureSource, texCoord0.st + vec2( -3.0 * scale.x, -3.0 * scale.y ) ) * 0.015625;
    color += texture2D( textureSource, texCoord0.st + vec2( -2.0 * scale.x, -2.0 * scale.y ) ) * 0.09375;
    color += texture2D( textureSource, texCoord0.st + vec2( -1.0 * scale.x, -1.0 * scale.y ) ) * 0.234375;
    color += texture2D( textureSource, texCoord0.st + vec2(  0.0,            0.0           ) ) * 0.3125;
    color += texture2D( textureSource, texCoord0.st + vec2(  1.0 * scale.x,  1.0 * scale.y ) ) * 0.234375;
    color += texture2D( textureSource, texCoord0.st + vec2(  2.0 * scale.x,  2.0 * scale.y ) ) * 0.09375;
    color += texture2D( textureSource, texCoord0.st + vec2(  3.0 * scale.x,  3.0 * scale.y ) ) * 0.015625;
    gl_FragColor = color;
}
```
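The seven hard-coded weights are row six of Pascal's triangle divided by 2^6 (1, 6, 15, 20, 15, 6, 1 over 64), so they sum to exactly 1 and approximate a Gaussian. A small sketch showing where they come from (the function name is mine, not the engine's):

```cpp
#include <cassert>
#include <cmath>

// Weight for tap k of an (n+1)-tap binomial blur kernel: C(n,k) / 2^n.
float BinomialWeight( int n, int k )
{
    float c = 1.0f;
    for( int i = 0; i < k; i++ )
    {
        c = c * float( n - i ) / float( i + 1 );
    }
    return c / std::pow( 2.0f, float( n ) );
}
```

This makes it easy to regenerate the weights if you ever change the kernel width.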

**Ambient** occlusion is only meant to occlude the ambient part of the lighting calculation, so the output from the blur is multiplied by the ambient color in the fragment shader of the final shading pass. I ended up having to make my ambient lighting a bit brighter in order for the SAO to show a little better.

```glsl
float saovalue = texture2D( sao, gl_FragCoord.xy / screenResolution.xy ).r;
vec4 color = ( ( lambert * diffuseColor * shadow ) + ambientColor * saovalue ) * vec4( texColor.rgb, 1.0 );
```

Ambient occlusion is more art than science, so be prepared to spend quite a lot of time experimenting with its many settings to get the ideal results for your scene.

The first setting to tweak is kernelRadiusM. Depending on the scale of objects in your scene, this is the maximum distance from a sample point at which you want scene geometry to affect the ambient occlusion. I originally thought 0.5m would be sufficient, but it looked better with a full metre.

Next is maxDepthDiffM which is the threshold where we consider there to be a depth discontinuity between an object and its background (we don't want occlusion on such edges.) This could probably be the same as kernelRadiusM, but I tried twice that for no particular reason.

NUM_SAMPLES/NUM_ROTATIONS are the primary determinants of the SAO shader's performance. The larger NUM_SAMPLES is, the better it looks and the slower it runs. The recommendation is that NUM_SAMPLES and NUM_ROTATIONS be relatively prime so that the samples don't align on multiple rotations.
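A quick way to check a candidate pair is a gcd test; two integers are relatively prime exactly when their greatest common divisor is 1. A sketch:

```cpp
#include <cassert>

// Euclid's algorithm for the greatest common divisor.
int Gcd( int a, int b )
{
    while( b != 0 )
    {
        int t = b;
        b = a % b;
        a = t;
    }
    return a;
}

// True when successive rotations never land samples on the same radial line.
bool RelativelyPrime( int numSamples, int numRotations )
{
    return Gcd( numSamples, numRotations ) == 1;
}
```

The shader's defaults pass: 19 and 7 are relatively prime.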

bias controls what is considered occluded. Originally nearly every edge had occlusion on it. That wasn't so pretty. I increased bias in 0.01 increments until most of those dark lines went away.

Increasing intensity makes the occlusion darker.

Lastly, adjust saoMaxDistanceM to eliminate any remaining SAO on distant features (like the sky.)

[Rendering the raw SAO buffer (below the UI) after tuning.]

I am really impressed with the results of SAO on Miranda's scene. I spent two weeks learning how the algorithm works and implementing SAO as well as several days on and off adjusting it afterwards. I hope this article speeds that up for the next person.

Soon after SAO was complete, I noticed that the SAO effect flickered off when the camera was moving towards objects and came back as soon as the camera stopped. It took me days of puzzling to figure out what was going wrong. I suspected a problem with depth testing somewhere in the SAO pipeline, most likely in the blur passes, but glClear was being called with the correct parameters at the correct times. Eventually I set up the game to draw the depth buffer to the screen, and as soon as I did I knew that the depth buffer was the problem. It's a shame that screenshots were broken while I was having this problem, because all the incorrect values in the depth buffer as the camera moved around were beautiful. A bit of searching turned up the answer to my problem: glClear does not affect the depth buffer if glDepthMask is false. In Miranda, glDepthMask is manipulated on a per-material basis, and it worked out that other passes were leaving glDepthMask off during the depth pre-pass, so only part of the depth buffer would get updated when the camera was moving toward an object. :(


Copyright (C)2009-2019 onemanmmo.com. All Rights Reserved