Tech Feature: SSAO and Temporal Blur

Some links in this article have expired and have been removed.

Screen space ambient occlusion (SSAO) is the standard solution for approximating ambient occlusion in video games. Ambient occlusion is used to represent how exposed each point is to the indirect lighting from the scene. Direct lighting is light emitted from a light source, such as a lamp or a fire. The direct light then illuminates objects in the scene. These illuminated objects make up the indirect lighting. Making each object in the scene cast indirect lighting is very expensive. Ambient occlusion is a way to approximate this by using a light source with constant color and information from nearby geometry to determine how dark a part of an object should be. The idea behind SSAO is to get geometry information from the depth buffer.

There are many publicised algorithms for high quality SSAO. This tech feature will instead focus on improvements that can be made after the SSAO has been generated.

SSAO Algorithm

SOMA uses a fast and straightforward algorithm for generating medium frequency AO. The algorithm runs at half resolution which greatly increases the performance. Running at half resolution doesn’t reduce the quality by much, since the final result is blurred.

For each pixel on the screen, the shader calculates the position of the pixel in view space and then compares that position with the view space position of nearby pixels. How occluded the pixel gets is based on how close the points are to each other and if the nearby point is in front of the surface normal. The occlusion for each nearby pixel is then added together for the final result. 

SOMA uses a radius of 1.5m to look for nearby points that might occlude. Sampling points that are outside of the 1.5m range is a waste of resources, since they will not contribute to the AO. Our algorithm samples 16 points in a growing circle around the main pixel. The size of the circle is determined by how close the main pixel is to the camera and how large the search radius is. For pixels that are far away from the camera, a radius of just a few pixels can be used. The closer the point gets to the camera the more the circle grows – it can grow up to half a screen. Using only 16 samples to select from half a screen of pixels results in a grainy result that flickers when the camera is moving.

Grainy result from the SSAO algorithm

Bilateral Blur

Blurring can be used to remove the grainy look of the SSAO. Blur combines the value of a large number of neighboring pixels. The further away a neighboring pixel is, the less the impact it will have on the final result. Blur is run in two passes, first in the horizontal direction and then in the vertical direction.

The issue with blurring SSAO this way quickly becomes apparent. AO from different geometry leaks between boundaries causing a bright halo around objects. Bilateral weighting can be used to fix the leaks between objects. It works by comparing the depth of the main pixel to the depth of the neighboring pixel. If the distance between the depth of the main and the neighbor is outside of a limit the pixel will be skipped. In SOMA this limit is set to 2cm.

To get good-looking blur the number of neighboring pixels to sample needs to be large. Getting rid of the grainy artifacts requires over 17×17 pixels to be sampled at full resolution.

Temporal Filtering 

Temporal Filtering is a method for reducing the flickering caused by the low number of samples. The result from the previous frame is blended with the current frame to create smooth transitions. Blending the images directly would lead to a motion-blur-like effect. Temporal Filtering removes the motion blur effect by reverse reprojecting the view space position of a pixel to the view space position it had the previous frame and then using that to sample the result. The SSAO algorithm runs on screen space data but AO is applied on world geometry. An object that is visible in one frame may not be seen in the next frame, either because it has moved or because the view has been blocked by another object. When this happens the result from the previous frame has to be discarded. The distance between the points in world space determines how much of the result from the previous frame should be used.

Explanation of Reverse Reprojection used in Frostbite 2 [2]

Temporal Filtering introduces a new artifact. When dynamic objects move close to static objects they leave a trail of AO behind. Frostbite 2’s implementation of Temporal Filtering solves this by disabling the Temporal Filter for stable surfaces that don’t get flickering artifacts. I found another way to remove the trailing while keeping Temporal Filter for all pixels.

https://www.youtube.com/watch?v=U1NBIJMVLL8
Shows the trailing effect that happens when a dynamic object is moved. The Temporal Blur algorithm is then applied and most of the trailing is removed.

Temporal Blur

(A) Implementation of Temporal Filtered SSAO (B) Temporal Blur implementation 

I came up with a new way to use Temporal Filtering when trying to remove the trailing artifacts. By combining two passes of cheap blur with Temporal Filtering all flickering and grainy artifacts can be removed without leaving any trailing. 

When the SSAO has been rendered, a cheap 5×5 bilateral blur pass is run on the result. Then the blurred result from the previous frame is applied using Temporal Filtering. A 5×5 bilateral blur is then applied to the image. In addition to using geometry data to calculate the blending amount for the Temporal Filtering the difference in SSAO between the frames is used, removing all trailing artifacts. 

Applying a blur before and after the Temporal Filtering and using the blurred image from the previous frame results in a very smooth image that becomes more blurred for each frame, it also removes any flickering. Even a 5×5 blur will cause the resulting image to look as smooth as a 64×64 blur after a few frames.

Because the image gets so smooth the upsampling can be moved to after the blur. This leads to Temporal Blur being faster, since running four 5×5 blur passes in half resolution is faster than running two 17×17 passes in full resolution. 

Upsampling

All of the previous steps are performed in half resolution. To get the final result it has to be scaled up to full resolution. Stretching the half resolution image to twice its size will not look good. Near the edges of geometry there will be visible bleeding; non-occluded objects will have a bright pixel halo around them. This can be solved using the same idea as the bilateral blurring. Normal linear filtering is combined with a weight calculated by comparing the distance in depth between the main pixel and the depth value of the four closest half resolution pixels.

Summary 

Combining SSAO with the Temporal Blur algorithm produces high quality results for a large search radius at a low cost. The total cost of the algoritm is 1.1ms (1920×1080 AMD 5870). This is more than twice as fast as a normal SSAO implementation.

SOMA uses high frequency AO baked into the diffuse texture in addition to the medium frequency AO generated by the SSAO.

Temporal Blur could be used to improve many other post effects that need to produce smooth-looking results.

Ambient Occlusion is only one part of the rendering pipeline, and it should be combined with other lighting techniques to give the final look.

References

  1. http://gfx.cs.princeton.edu/pubs/Nehab_2007_ARS/NehEtAl07.pdf
  2. http://dice.se/wp-content/uploads/GDC12_Stable_SSAO_In_BF3_With_STF.pdf
Code Appendix

 // SSAO Main loop



//Scale the radius based on how close to the camera it is

 float fStepSize = afStepSizeMax * afRadius / vPos.z; 
 float fStepSizePart = 0.5 * fStepSize / ((2 + 16.0));    



 for(float d = 0.0; d < 16.0; d+=4.0)
 {
        //////////////
        // Sample four points at the same time
        vec4 vOffset = (d + vec4(2, 3, 4, 5))* fStepSizePart;
        

        //////////////////////
        // Rotate the samples
        vec2 vUV1 = mtxRot * vUV0;
        vUV0 = mtxRot * vUV1;

        vec3 vDelta0 = GetViewPosition(gl_FragCoord.xy + vUV1 * vOffset.x) - vPos;
        vec3 vDelta1 = GetViewPosition(gl_FragCoord.xy - vUV1 * vOffset.y) - vPos;
        vec3 vDelta2 = GetViewPosition(gl_FragCoord.xy + vUV0 * vOffset.z) - vPos;
        vec3 vDelta3 = GetViewPosition(gl_FragCoord.xy - vUV0 * vOffset.w) - vPos;

        vec4 vDistanceSqr = vec4(dot(vDelta0, vDelta0),
                                 dot(vDelta1, vDelta1),
                                 dot(vDelta2, vDelta2),
                                 dot(vDelta3, vDelta3));

        vec4 vInvertedLength = inversesqrt(vDistanceSqr);

        vec4 vFalloff = vec4(1.0) + vDistanceSqr * vInvertedLength * fNegInvRadius;

        vec4 vAngle = vec4(dot(vNormal, vDelta0),
                            dot(vNormal, vDelta1),
                            dot(vNormal, vDelta2),
                            dot(vNormal, vDelta3)) * vInvertedLength;


        ////////////////////
        // Calculates the sum based on the angle to the normal and distance from point
        fAO += dot(max(vec4(0.0), vAngle), max(vec4(0.0), vFalloff));

}



//////////////////////////////////
// Get the final AO by multiplying by number of samples

fAO = max(0, 1.0 - fAO / 16.0);



------------------------------------------------------------------------------ 



// Upsample Code

 
vec2 vClosest = floor(gl_FragCoord.xy / 2.0);
vec2 vBilinearWeight = vec2(1.0) - fract(gl_FragCoord.xy / 2.0);

float fTotalAO = 0.0;
float fTotalWeight = 0.0;

for(float x = 0.0; x < 2.0; ++x)
for(float y = 0.0; y < 2.0; ++y)
{
       // Sample depth (stored in meters) and AO for the half resolution  
       float fSampleDepth = textureRect(aHalfResDepth, vClosest + vec2(x,y)); 
       float fSampleAO = textureRect(aHalfResAO, vClosest + vec2(x,y));

       // Calculate bilinear weight 
       float fBilinearWeight = (x-vBilinearWeight .x) * (y-vBilinearWeight .y);
       // Calculate upsample weight based on how close the depth is to the main depth
       float fUpsampleWeight = max(0.00001, 0.1 - abs(fSampleDepth – fMainDepth)) * 30.0;

       // Apply weight and add to total sum 
       fTotalAO += (fBilinearWeight + fUpsampleWeight) * fSampleAO;
       fTotalWeight += (fBilinearWeight + fUpsampleWeight);
}

// Divide by total sum to get final AO
float fAO = fTotalAO / fTotalWeight;



-------------------------------------------------------------------------------------



// Temporal Blur Code



//////////////////
// Get current frame depth and AO
vec2 vScreenPos = floor(gl_FragCoord.xy) + vec2(0.5);
float fAO = textureRect(aHalfResAO, vScreenPos.xy);

float fMainDepth = textureRect(aHalfResDepth, vScreenPos.xy);   



//////////////////
// Convert to view space position

vec3 vPos = ScreenCoordToViewPos(vScreenPos, fMainDepth);



/////////////////////////
// Convert the current view position to the view position it 

// would represent the last frame and get the screen coords
vPos = (a_mtxPrevFrameView * (a_mtxViewInv * vec4(vPos, 1.0))).xyz;

vec2 vTemporalCoords = ViewPosToScreenCoord(vPos);
        
//////////////
// Get the AO from the last frame
float fPrevFrameAO = textureRect(aPrevFrameAO, vTemporalCoords.xy);
float fPrevFrameDepth = textureRect(aPrevFrameDepth, vTemporalCoords.xy);



/////////////////
// Get to view space position of temporal coords
vec3 vTemporalPos = ScreenCoordToViewPos(vTemporalCoords.xy, fPrevFrameDepth);
        
///////
// Get weight based on distance to last frame position (removes ghosting artifact)
float fWeight = distance(vTemporalPos, vPos) * 9.0;



////////////////////////////////




// And weight based on how different the amount of AO is (removes trailing artifact)

// Only works if both fAO and fPrevFrameAO is blurred 
fWeight += abs(fPrevFrameAO - fAO ) * 5.0;

////////////////
// Clamp to make sure atleast 1.0 / FPS of a frame is blended
fWeight = clamp(fWeight, afFrameTime, 1.0);       


fAO = mix(fPrevFrameAO , fAO , fWeight);
   

------------------------------------------------------------------------------