DirectX 12 – Ambient Occlusion – The Second Approach


After some time trying to implement and understand ambient occlusion (with a failed first attempt), I finished a better implementation. This came after I filled some gaps in my understanding of the perspective projection matrix and the depth buffer (I wrote three posts about those topics).

Now, I wanted to describe from the beginning what I know and learned about ambient occlusion.

What is ambient light?

Ambient light is a light type in computer graphics used to simulate global illumination. It represents an omnidirectional, fixed-intensity, fixed-color light source that affects all objects in the scene equally. When the scene is rendered, every object is brightened with the specified intensity and color. This type of light source is mainly used to give the scene a basic view of the objects in it. It is the simplest type of lighting to implement, and it models how light that has been scattered or reflected many times produces a uniform effect. In the following image, you can see a 3D scene rendered only with ambient lighting.


What is ambient occlusion?

Ambient occlusion is simply a simulation of the shadowing caused by objects blocking the ambient light. Because ambient light is environmental, unlike other types of lighting, ambient occlusion does not depend on light direction. Ambient lighting can be combined with ambient occlusion to represent how exposed each point of the scene is, affecting the amount of ambient light it can reflect. This produces diffuse, non-directional lighting throughout the scene, casting no clear shadows, but with enclosed and sheltered areas darkened. The result is usually visually similar to an overcast day. In the next image, we applied ambient occlusion to the previous scene (the effect was exaggerated on purpose).


OK, we can notice the difference between the two pictures, but what about a real game (environment lighting + area lights, etc.)?

Check this example from our DirectX 12 framework


Check these examples from shipped games




OK, you convinced us, but does ambient occlusion exist in the real world?

In the real world, the shadow from the sun grounds objects (for example, this car)


But if the car is already in shadow, it is ambient occlusion that grounds it, as in the following image


If we take out the ambient occlusion from the previous image, we get the following result


Unlike local methods such as Phong shading, ambient occlusion is a global method, meaning that the illumination at each point is a function of other geometry in the scene. However, it is a very crude approximation to full global illumination. The appearance achieved by ambient occlusion alone is similar to the way an object might appear on an overcast day.

How does it affect ambient lighting?

The ambient term in the lighting equation is

Ambient Color = Ambient Light Color * Ambient Factor * Diffuse Albedo 

where the Ambient Factor (a scalar) specifies the total amount of indirect (ambient) light a surface receives from a light source, and the Diffuse Albedo (RGB) determines the amount of incoming light that the surface reflects due to diffuse reflectance. The Ambient Light Color is usually white (1.0, 1.0, 1.0).

All the ambient term does is uniformly brighten up the object a bit so that it does not go completely black in shadow (there is no real physics calculation at all). The idea is that the indirect light has scattered and bounced around the scene so many times that it strikes the object equally in every direction.

The idea of ambient occlusion is that the amount of indirect light a point p on a surface receives is proportional to how occluded it is to incoming light over the hemisphere about p.


The occlusion factor measures how occluded the point is (i.e. how much light it does not receive). For the purposes of calculation, we work with its complement (i.e. how much light the point does receive). This quantity is called Ambient Accessibility and is derived from ambient occlusion as

Ambient Accessibility = 1.0 – Ambient Occlusion

where Ambient Occlusion belongs to the interval [0.0, 1.0].

Finally, the ambient term is updated to the following expression

Ambient Color = Ambient Light Color * Ambient Factor * Diffuse Albedo * Ambient Accessibility
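The updated ambient term can be sketched as a small C++ function (the type and function names here are illustrative, not from the post's framework):

```cpp
#include <array>

using Color = std::array<float, 3>;

// Ambient Color = Ambient Light Color * Ambient Factor * Diffuse Albedo * Ambient Accessibility
// where Ambient Accessibility = 1.0 - Ambient Occlusion.
Color AmbientTerm(const Color& ambientLightColor, float ambientFactor,
                  const Color& diffuseAlbedo, float ambientOcclusion)
{
    const float ambientAccessibility = 1.0f - ambientOcclusion;
    Color result{};
    for (int i = 0; i < 3; ++i)
        result[i] = ambientLightColor[i] * ambientFactor * diffuseAlbedo[i] * ambientAccessibility;
    return result;
}
```

A fully occluded point (occlusion = 1.0) receives no ambient light at all, while an unoccluded point receives the full ambient term.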

How is it implemented?

In ray tracing, ambient occlusion is simulated by casting rays from a point over the hemisphere above it and checking each ray for intersection with the scene.


You can see a ray tracer result in the following image


But that technique is too slow to be used in real-time computer graphics…

You can precompute ambient occlusion offline by generating an ambient occlusion map (a texture that contains ambient occlusion data). This works well for static models, but not for animated ones. Nor can we generate ambient occlusion maps at runtime, because we would need to generate one texture per model, and that would hurt performance. The next image shows an ambient occlusion map generated offline


As you can see, the ambient occlusion texture is monochromatic. This is because its values belong to the [0.0, 1.0] interval.

Then, what are the options for real-time computer graphics?

Vladimir Kajalin was working at Crytek when he developed a technique called Screen Space Ambient Occlusion (SSAO), used for the first time in 2007 in the video game Crysis.

The algorithm is implemented in the pixel shader and needs access to the depth buffer information. We will go step by step and look at several improvements we can add to the implementation.

The basic steps are the following:

  • Generate a sample kernel. These samples are distributed over a sphere. The sphere radius R is a parameter that should be appropriate to the scale of the scene.


This can be generated on the CPU side in the following way (note that we do not take the radius R into account here, because it can be applied directly in the pixel shader)
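The CPU-side generation can be sketched as follows (the original post's code was an image, so the names here are illustrative; rejection sampling is used to avoid biasing samples towards the corners of the unit cube, as discussed in the comments below):

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

struct Float3 { float x, y, z; };

// Returns a random float in [-1.0, 1.0].
static float RandomSNorm()
{
    return 2.0f * (static_cast<float>(std::rand()) / RAND_MAX) - 1.0f;
}

// Generates 'sampleCount' unit vectors distributed over a sphere.
// The radius R is intentionally omitted: it is applied in the pixel shader.
std::vector<Float3> GenerateSphereSampleKernel(std::size_t sampleCount)
{
    std::vector<Float3> kernel;
    kernel.reserve(sampleCount);
    while (kernel.size() < sampleCount)
    {
        Float3 v{ RandomSNorm(), RandomSNorm(), RandomSNorm() };
        const float lengthSq = v.x * v.x + v.y * v.y + v.z * v.z;
        // Rejection sampling: discard points outside the unit sphere (and
        // near-zero vectors), then normalize onto the sphere surface.
        if (lengthSq < 1e-6f || lengthSq > 1.0f)
            continue;
        const float invLength = 1.0f / std::sqrt(lengthSq);
        kernel.push_back({ v.x * invLength, v.y * invLength, v.z * invLength });
    }
    return kernel;
}
```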


You can store this information in an HLSL StructuredBuffer.

  • Project each sample into screen space to get its coordinates in the depth buffer. You can use the Texture2D::Load() method to do this.
  • Sample the depth buffer. If the sample position is behind the sampled depth (i.e., inside geometry), then it contributes to the occlusion factor.
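The per-sample test from the steps above can be sketched in C++ (this is an illustration of the shader logic, not the actual HLSL; all names are hypothetical):

```cpp
#include <cstddef>
#include <vector>

// Sketch of the occlusion accumulation performed per pixel in the shader.
// 'sampleDepths' are the view-space depths of the kernel samples around the
// current fragment; 'storedDepths' are the depths read from the depth buffer
// at the samples' projected screen-space positions.
float ComputeAmbientAccessibility(const std::vector<float>& sampleDepths,
                                  const std::vector<float>& storedDepths)
{
    float occlusion = 0.0f;
    for (std::size_t i = 0; i < sampleDepths.size(); ++i)
    {
        // If the sample lies behind the geometry recorded in the depth
        // buffer, it is inside an occluder and contributes to occlusion.
        if (storedDepths[i] < sampleDepths[i])
            occlusion += 1.0f;
    }
    occlusion /= static_cast<float>(sampleDepths.size());
    return 1.0f - occlusion; // ambient accessibility
}
```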

The ambient accessibility texture generated with a sample kernel of 128 samples looks like the following image


and with 32 samples looks like the following image


We should definitely use the 128-sample version; it looks much better…

That is true: reducing the number of samples produces banding artifacts in the result, but to achieve decent performance we need to reduce the number of samples. This problem can be addressed by randomly rotating the sample kernel at each pixel. To achieve this, we need to do the following:

  • Generate a noise texture that contains random float3 vectors used to rotate the sample kernel. This effectively increases the sample count and minimizes the banding artifacts.


The Crysis implementation used a 4×4 texture and tiled it over the screen. This causes the orientation of the kernel to repeat and, because the texture is small, it repeats at a high frequency. To remove this high frequency, we can add a blurring step that preserves the low-frequency detail of the image. This is cheaper than generating a noise texture of screen width × screen height dimensions.

  • The noise texture will be sampled in the pixel shader taking its dimensions into account. In our case, we have a screen width of 1920, a screen height of 1080, and a 4×4 noise texture. The noise texture will then be sampled in the following way.
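The texture-coordinate scaling can be sketched like this (a C++ illustration of the shader math; the function name is hypothetical):

```cpp
#include <array>

// The fragment's UV in [0, 1] is scaled so that the 4x4 noise texture tiles
// across the whole screen. With a 1920x1080 screen and a 4x4 noise texture,
// the scale is (1920 / 4, 1080 / 4) = (480, 270).
std::array<float, 2> NoiseTexCoord(float u, float v,
                                   float screenWidth, float screenHeight,
                                   float noiseSize)
{
    const float scaleU = screenWidth / noiseSize;
    const float scaleV = screenHeight / noiseSize;
    return { u * scaleU, v * scaleV };
}
```

With a wrap sampler, these scaled coordinates make the small noise texture repeat over the screen.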


  • Generate the sample kernel reorientation matrix. We choose to perform the rotation around the fragment’s normal. For this, we will need to store geometry normals in a buffer. If you work with a deferred shading renderer, you get this for free (your geometry buffer already stores normal information). The matrix construction looks like the following code
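The matrix construction can be sketched as a Gram-Schmidt orthonormalization (a C++ illustration of the shader code; type and function names are hypothetical):

```cpp
#include <cmath>

struct Float3 { float x, y, z; };
struct Float3x3 { Float3 tangent, bitangent, normal; };

static float Dot(const Float3& a, const Float3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

static Float3 Cross(const Float3& a, const Float3& b)
{
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

static Float3 Normalize(const Float3& v)
{
    const float invLength = 1.0f / std::sqrt(Dot(v, v));
    return { v.x * invLength, v.y * invLength, v.z * invLength };
}

// Gram-Schmidt: build an orthonormal basis (tangent, bitangent, normal)
// whose third axis is the fragment normal; the random vector sampled from
// the noise texture selects the rotation of the kernel around that normal.
Float3x3 BuildKernelBasis(const Float3& normal, const Float3& randomVec)
{
    const Float3 n = Normalize(normal);
    // Remove the component of the random vector along the normal.
    const float d = Dot(randomVec, n);
    const Float3 t = Normalize({ randomVec.x - n.x * d,
                                 randomVec.y - n.y * d,
                                 randomVec.z - n.z * d });
    const Float3 b = Cross(n, t);
    return { t, b, n };
}
```

Multiplying a kernel sample by this matrix rotates it from the z-oriented kernel space into the fragment's normal-oriented space.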


  • Apply a blur step to the ambient accessibility texture (there are a lot of blur shaders available on the internet, so I am not going to post the code here)

After these improvements, the result with 32 samples is the following


You may notice that for this scene we were using normal mapping. This is more noticeable now because we are reorienting our sample kernel along the fragment’s normal vector.

In the following image, you can compare the difference with blur (left) and without blur (right)


What are the benefits of Screen Space Ambient Occlusion?

Compared to other ambient occlusion solutions, SSAO has the following advantages:

  • Independent from scene complexity.
  • No data pre-processing needed, no loading time and no memory allocations in system memory.
  • Works with dynamic scenes.
  • Operates in the same consistent way for every pixel on the screen.
  • It can be executed entirely on the GPU (except for sample kernel and noise texture generation).
  • May be easily integrated into any modern graphics pipeline.

Of course, it has its disadvantages as well:

  • Rather local and in many cases view-dependent, as it is dependent on adjacent texel depths which may be generated by any geometry whatsoever.
  • Difficult to correctly smooth/blur out the noise without interfering with depth discontinuities, such as object edges (the occlusion should not “bleed” onto objects).

Are there any improvements to Crysis’s SSAO?

In one of his posts, John Chapman explains the problem with Crysis’s SSAO implementation:

“The Crysis method produces occlusion factors with a particular ‘look’ – because the sample kernel is a sphere, flat walls end up looking grey because ~50% of the samples end up being inside the surrounding geometry. Concave corners darken as expected, but convex ones appear lighter since fewer samples fall inside geometry. Although these artifacts are visually acceptable, they produce a stylistic effect which strays somewhat from photorealism.”

To solve this issue, our samples must lie in the normal-oriented hemisphere of the fragment. We therefore need to distribute our sample kernel vectors on the hemisphere oriented along the z-axis. The function will be the following
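The hemisphere variant of the kernel generation can be sketched like this (names are illustrative; the only change from the sphere version is that the z component is sampled from [0, 1], so every sample lies in the positive-z hemisphere):

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

struct Float3 { float x, y, z; };

// Random float in [-1.0, 1.0].
static float RandomSNorm()
{
    return 2.0f * (static_cast<float>(std::rand()) / RAND_MAX) - 1.0f;
}

// Random float in [0.0, 1.0].
static float RandomUNorm()
{
    return static_cast<float>(std::rand()) / RAND_MAX;
}

// Samples distributed over the hemisphere oriented along the z-axis
// (z >= 0); they are rotated to the fragment's normal in the shader.
std::vector<Float3> GenerateHemisphereSampleKernel(std::size_t sampleCount)
{
    std::vector<Float3> kernel;
    kernel.reserve(sampleCount);
    while (kernel.size() < sampleCount)
    {
        Float3 v{ RandomSNorm(), RandomSNorm(), RandomUNorm() };
        const float lengthSq = v.x * v.x + v.y * v.y + v.z * v.z;
        if (lengthSq < 1e-6f || lengthSq > 1.0f)
            continue; // rejection sampling to avoid corner bias
        const float invLength = 1.0f / std::sqrt(lengthSq);
        kernel.push_back({ v.x * invLength, v.y * invLength, v.z * invLength });
    }
    return kernel;
}
```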


Additionally, in the noise generation, the z component must be zero. Since our kernel is oriented along the z-axis, we want the random rotation to occur around that axis. The function will be the following
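The noise generation with a zero z component can be sketched like this (names are illustrative):

```cpp
#include <cstdlib>
#include <vector>

struct Float3 { float x, y, z; };

// Random float in [-1.0, 1.0].
static float RandomSNorm()
{
    return 2.0f * (static_cast<float>(std::rand()) / RAND_MAX) - 1.0f;
}

// Random rotation vectors for the noise texture. The z component is zero so
// the random rotation happens around the z-axis, which is the axis the
// hemisphere kernel is oriented along.
std::vector<Float3> GenerateNoiseVectors(std::size_t count)
{
    std::vector<Float3> noise;
    noise.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        noise.push_back({ RandomSNorm(), RandomSNorm(), 0.0f });
    return noise;
}
```

For the 4×4 noise texture described earlier, `count` would be 16.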


Another improvement is to distribute the sample kernel differently: for example, as described in John Chapman’s article, with a falloff so that more samples are placed close to the kernel’s center.


In the following video, you can see our current SSAO implementation

As you can see in the video, there is a kind of halo around each piece of geometry. This can be fixed by introducing a range check that prevents erroneous occlusion between large depth discontinuities (as described in John Chapman’s post).
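The range check can be sketched like this (a C++ illustration of the shader logic from Chapman's post; the function name is hypothetical):

```cpp
#include <cmath>

// Range check: a sampled depth only counts as an occluder when its depth is
// within the kernel radius of the fragment's depth. Large depth
// discontinuities (e.g. a distant background behind an edge) are rejected,
// which removes the halo artifact around geometry.
float RangeCheck(float fragmentDepth, float sampledDepth, float radius)
{
    return std::fabs(fragmentDepth - sampledDepth) <= radius ? 1.0f : 0.0f;
}
```

In the shader, each sample's occlusion contribution is multiplied by this factor before accumulation.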


Another problem is that we are not checking if the current sample is inside screen borders. That can cause the problem shown in the following video

We can skip the current sample if it is outside the screen borders.
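The border check can be sketched like this (a C++ illustration; the function name is hypothetical):

```cpp
// Samples whose projected texture coordinates fall outside [0, 1] have no
// valid depth-buffer data behind them, so they are skipped entirely instead
// of contributing a bogus occlusion value.
bool IsInsideScreen(float u, float v)
{
    return u >= 0.0f && u <= 1.0f && v >= 0.0f && v <= 1.0f;
}
```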


In the following video, you can see the improved version (range check + screen borders check)

Then, is SSAO the definitive technique to use?

There are a lot of different methods to choose from:

  • SSDO – Screen Space Directional Occlusion
  • HDAO – High Definition Ambient Occlusion
  • HBAO+ – Horizon Based Ambient Occlusion+
  • AAO – Alchemy Ambient Occlusion
  • ABAO – Angle Based Ambient Occlusion
  • PBAO
  • VXAO – Voxel Accelerated Ambient Occlusion

Also, an excellent article by Sean Barrett describes some problems with SSAO.

Here is another great article that compares different screen space ambient occlusion methods (thanks, Pablo Zurita, for this recommendation)


In addition to all the links cited in the article, I used the following references:

Uncharted 2 HDR

Ambient Occlusion chapter in Introduction to 3D Game Programming with DirectX 12

Source Code

The implementation is located here


4 thoughts on “DirectX 12 – Ambient Occlusion – The Second Approach”

  1. Hi Nico!

    Long time! 🙂

    Great post! I always look forward to your posts on DX!

This time I can give you a hint about a very subtle issue regarding your random sampling: in your current implementation, it’s biased towards the vertices of the unit cube / unit square. I can do a shameless self-promotion of the issue and how to avoid it from one post I made:

In essence, don’t use just Normalize(vec3(random(), random(), random())); either use rejection sampling or use a latitude/longitude distribution.



  2. Hi, Mariano!

    Thanks for the article, I just read it and it is great.

    In my case, I am using 32 samples. Given this small number of samples, do you think a different random sampling algorithm could make a difference in the final result?


    1. Hi Nico!

Probably with 32 samples the difference would be negligible (and not worth the effort), but if you want, you could just use rejection sampling, which simply rejects sample vectors whose length is more than 1 (so that they all fall within the unit sphere); it’s just a while loop that, most likely, won’t execute many times.

Still, I thought there were considerably more samples; that’s why I suggested the possible bias.

      Again, great article!


  3. Yes, I think rejection sampling has worse performance but a much better distribution. I only generate the sample kernel once, at loading time, so it will not impact real-time performance.

    Thanks for your suggestion! I implemented some particle systems with DirectX 11, one of them organized particles on the surface of a sphere and I remember it looked like the first animation in your article (not uniform distribution)

