It's been a while since we announced VXGI, our real-time voxel global illumination technology. Since then, a lot of effort has gone into productizing it, integrating it into Unreal Engine 4, and making it available for free. And it's a great technology: it can produce realistic lighting with good occlusion for dynamic scenes with no preprocessing. VXGI allows applications to compute multi-bounce indirect diffuse and specular lighting, use the cone tracing functions in application shaders for special needs like refractive materials or light map computation, and compute ambient occlusion effects based on voxel data.
One of these features has received too little attention: the ambient occlusion mode, or VXAO as we call it. The idea is simple: we remove the lighting part and keep only the occlusion part. Naturally, it is much less resource intensive: computing VXAO for a frame can be 2-10x faster than computing a full global illumination solution, depending on settings. On the other hand, VXAO is 3-4x slower than HBAO+, but its results are much better.
If you have worked with screen-space ambient occlusion algorithms, you know the primary issues that they come with. These are:
- Dark halos or lack of occlusion behind foreground objects;
- Unstable results near screen borders;
- Locality, which means that only a small volume around a surface contributes to AO;
- Blurriness, which comes from a blur filter required because computing a complete solution for every pixel would be too expensive.
VXAO has none of these issues because it is based on a different principle. Instead of relying on screen space data, it gathers information from a world space voxel representation of the scene, which covers a large area around the viewer. It doesn't matter if some object is not visible to the viewer - it can be behind something else, or even behind the viewer - it's still there, and it still contributes to ambient occlusion. VXAO uses voxel cone tracing, so objects that are relatively far from the surface under consideration can still contribute, and taking them into account is not as expensive as it would be in a screen space algorithm.
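To make the idea of voxel cone tracing more concrete, here is a minimal sketch of tracing a single occlusion cone through a prefiltered voxel grid. It is a generic illustration of the technique, not the actual VXAO code: the sparse dictionary grid, the helper names `build_mips` and `trace_cone`, and the step size are all assumptions made for this example.

```python
import math

def build_mips(opacity):
    """Build coarser opacity levels by averaging 2x2x2 blocks of the level below."""
    mips = [opacity]
    for _ in range(3):  # three extra levels is plenty for this toy grid
        coarse = {}
        for (x, y, z), a in mips[-1].items():
            key = (x // 2, y // 2, z // 2)
            coarse[key] = coarse.get(key, 0.0) + a / 8.0
        mips.append(coarse)
    return mips

def trace_cone(mips, origin, direction, half_angle, max_dist, voxel_size=1.0):
    """March one cone through the grid and return occlusion in [0, 1]."""
    occlusion = 0.0
    dist = voxel_size  # start one voxel away to avoid self-occlusion
    while dist < max_dist and occlusion < 0.99:
        # The cone gets wider with distance, so we read coarser levels.
        diameter = max(voxel_size, 2.0 * dist * math.tan(half_angle))
        level = min(len(mips) - 1, int(math.log2(diameter / voxel_size)))
        cell = voxel_size * (2 ** level)
        pos = [o + d * dist for o, d in zip(origin, direction)]
        key = tuple(int(math.floor(p / cell)) for p in pos)
        sample = mips[level].get(key, 0.0)
        # Front-to-back accumulation: nearer samples hide farther ones.
        occlusion += (1.0 - occlusion) * sample
        dist += 0.5 * diameter  # hypothetical step: half the cone diameter
    return occlusion

# Toy scene: an opaque one-voxel-thick slab at z = 4 above the shaded point.
slab = {(x, y, 4): 1.0 for x in range(8) for y in range(8)}
mips = build_mips(slab)
point = (4.5, 4.5, 0.5)
toward_slab = trace_cone(mips, point, (0.0, 0.0, 1.0), math.radians(15), 8.0)
away_from_slab = trace_cone(mips, point, (0.0, 0.0, -1.0), math.radians(15), 8.0)
print(toward_slab, away_from_slab)
```

Note that nothing in this loop refers to the camera or the screen: the slab occludes the point whether or not it is visible to the viewer. A real implementation would trace several cones per pixel over the hemisphere and combine them.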
Let's take a look at the images produced by one of the best SSAO libraries, HBAO+, and VXAO (Fig. 1 and 2).
The difference between these images is obvious, but some areas deserve special attention.
- Ground under the tank: no occlusion from HBAO+, some occlusion from VXAO.
- Bottom part of the tank tracks: no occlusion from HBAO+, significant occlusion from VXAO.
- Metal stand on the left side: a lot of occlusion from HBAO+, almost none from VXAO.
- Barrel behind the fire hydrant: halo around the hydrant from HBAO+, no halo from VXAO.
If we multiply these channels by surface albedo, as one would do in an actual game engine, the difference becomes less obvious, but it’s still significant (Fig. 3 and 4).
Now let's compare the view dependence of VXAO with that of HBAO+.
As you can see, VXAO results are very stable. There are some changes in occlusion as objects move closer or further away, but they are not too noticeable. The reason for such changes is that when an object moves away, its voxel representation becomes coarser, which means greater error. Ultimately, when objects are outside of the voxel coverage area, VXAO cannot compute any occlusion for them. In this case the alpha channel of the computed AO buffer will contain a zero, indicating that a fallback solution should be used. There is a smooth transition from one (full confidence) to zero (we know nothing), so that the fallback solution can be blended without a sharp boundary.
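The blending with a fallback solution described above can be sketched as a simple linear interpolation driven by the confidence value. This is only an illustration: the function name `blend_ao` and the idea of feeding it an SSAO result as the fallback are assumptions for the example, not part of the VXAO API.

```python
def blend_ao(vxao, confidence, fallback_ao):
    """Blend VXAO with a fallback AO value using the confidence (alpha) channel.

    confidence = 1 means VXAO is fully trusted, 0 means the pixel is outside
    the voxel coverage area and only the fallback is used.
    """
    c = min(1.0, max(0.0, confidence))
    return vxao * c + fallback_ao * (1.0 - c)

# Inside the voxel volume, in the transition zone, and outside the volume.
print(blend_ao(0.25, 1.0, 0.75))
print(blend_ao(0.25, 0.5, 0.75))
print(blend_ao(0.25, 0.0, 0.75))
```

Because the confidence falls off smoothly, the result moves gradually from the VXAO value to the fallback value with no visible seam.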
But how well does VXAO handle dynamic scenes? The answer is, quite well. Most of the voxel data can be preserved between frames, unless there are lots of moving objects or the camera moves quickly. And even if no voxel data can be preserved, voxelizing geometry again is not too expensive. Voxelization of a typical, high-detail game scene with a few million triangles can be done in about 3-5 milliseconds on a modern GPU like GeForce GTX 980.
Overall, there are three major passes in the VXAO algorithm: voxelization, voxel post-processing, and cone tracing. Voxelization is performed by rendering the triangle meshes into a 3D texture, so its performance depends heavily on the total number of triangles, the size of these triangles, and the number of draw calls required to render them. Post-processing combines passes like clearing, filtering and downsampling voxels, and its performance depends on the total number of voxels produced during voxelization. Typical post-processing time is 0.5 – 1.5 ms. And finally, cone tracing is performed in screen space, so its performance depends on the screen resolution, shading rate, and the cone tracing pass in 1080p resolution.
VXAO video memory requirements are not frightening, either. Depending on settings, the voxel textures can take from 6 to 100 MB. Some screen-sized or smaller 2D textures are also used, which can add a few dozen megabytes in very high resolution modes. For comparison, VXAO uses 4 or 8 bytes per voxel, and full VXGI adds 24, 48 or 72 bytes to that, so VXGI’s memory requirements are normally around 500 MB and can go up to 7 GB if the highest quality settings are used.
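To see how the bytes-per-voxel figures turn into megabytes, here is a back-of-the-envelope calculation. The per-voxel sizes come from the numbers above, but the grid layout (128 voxels per side, 5 levels of detail) is a made-up configuration for illustration only; the real voxel storage layout is not described here and may differ.

```python
def voxel_memory_mb(voxels_per_side, levels, bytes_per_voxel):
    """Memory for `levels` cubic voxel grids of the same resolution, in MB."""
    voxel_count = voxels_per_side ** 3 * levels
    return voxel_count * bytes_per_voxel / (1024 * 1024)

# Hypothetical configuration: 128^3 voxels per level, 5 levels.
side, levels = 128, 5
print("VXAO, 4 bytes per voxel:", voxel_memory_mb(side, levels, 4), "MB")
print("VXAO, 8 bytes per voxel:", voxel_memory_mb(side, levels, 8), "MB")
# Full VXGI adds 24, 48 or 72 bytes per voxel on top of the occlusion data.
print("VXGI, 4 + 24 bytes per voxel:", voxel_memory_mb(side, levels, 4 + 24), "MB")
```

The point of the exercise is the ratio rather than the absolute numbers: lighting data multiplies the per-voxel cost several times over, which is why the occlusion-only mode is so much lighter on memory.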
Finally, if you saw the original announcement of VXGI at Maxwell launch, you may think it works only on Maxwell. That's not true. Maxwell does have some useful hardware features, but the only one relevant to VXAO is pass-through geometry shaders, which improve voxelization performance by approximately 30%, and they can be safely replaced with regular geometry shaders. So VXGI in general and VXAO in particular can work on all DX11 class GPUs, including ones made by NVIDIA competitors, but Maxwell GPUs deliver the best performance. It’s not limited to DX11 either: DX12 and OpenGL 4.5 are also supported.
A few months ago we started working with Nixxes Software to help them integrate VXAO into Rise of the Tomb Raider, and it was successfully released in a patch last week. This is the first use of VXAO in a real-world game, and it proves the point that VXAO is the next step in ambient occlusion technology. There are plenty of game locations where VXAO shines. For example, many indoor locations start looking much more natural – see Fig. 5 and 6.
In Rise of the Tomb Raider, not all locations benefit from VXAO equally well, and the reason for that lies in the game’s art and lighting solution. The game has separate channels for ambient lighting and ambient occlusion, and how exactly they are used is determined by materials. The ambient occlusion channel is often applied on top of direct lighting as well, and because VXAO is not a local effect and tends to add occlusion to large surfaces, some lights become dimmer. So we had to apply VXAO to the ambient light channel instead and keep HBAO+ in the ambient occlusion channel in order to achieve the best look. It became clear that VXAO is mostly a long-range effect, and it’s useful to combine it with some short-range SSAO technique to highlight small features which cannot be adequately represented by voxels. For this reason, VXAO now includes an optional screen-space occlusion pass so that you don’t have to work with a separate SSAO library.
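The reasoning behind that choice can be shown with a toy shading formula. This is a deliberately simplified, hypothetical material model written for this example (the function `shade` and its parameters are not taken from the game or from VXAO); it only demonstrates why a long-range occlusion term dims direct lights when it is placed in the ambient occlusion channel.

```python
def shade(direct, ambient, ao_channel, ambient_scale, ao_on_direct=True):
    """Toy material: the AO channel always darkens ambient light and, for
    some materials, direct light as well. ambient_scale multiplies only
    the ambient light channel."""
    direct_term = direct * (ao_channel if ao_on_direct else 1.0)
    ambient_term = ambient * ambient_scale * ao_channel
    return direct_term + ambient_term

# A large surface with long-range occlusion (VXAO = 0.5) and no small-scale
# creases nearby (HBAO+ = 1.0).
direct, ambient, vxao, hbao = 1.0, 0.25, 0.5, 1.0

# Option A: VXAO goes into the ambient occlusion channel.
vxao_in_ao_channel = shade(direct, ambient, ao_channel=vxao, ambient_scale=1.0)
# Option B: VXAO scales the ambient light channel, HBAO+ stays in the AO channel.
vxao_in_ambient = shade(direct, ambient, ao_channel=hbao, ambient_scale=vxao)

print(vxao_in_ao_channel, vxao_in_ambient)
```

In option A the direct light loses half of its intensity, while in option B only the ambient contribution is attenuated, which matches the behavior we were after.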
Currently, VXAO support in Rise of the Tomb Raider is limited to specific enthusiast-class GPUs in order to provide the most enjoyable gaming experience. There is no doubt that global illumination and shading techniques place a heavy load on the GPU, so in order to maintain high frame rates for all RoTR players, VXAO is available only on higher performance systems.
Hopefully by now you have decided to try using VXAO in your game or application. If you’re using Unreal Engine 4, that’s easy: we have integrated VXGI into UE 4.10, and it’s available on GitHub for free: https://github.com/NvPhysX/UnrealEngine branch VXGI-4.10. To start using VXAO, just set the console variable “r.VXGI.AmbientOcclusionMode” to 1 and enable VXGI Diffuse Tracing in the Post-Process Volume settings. Note: if you get a 404 error on the link above, create an Unreal Engine account and link it to your GitHub account, as described here: https://github.com/EpicGames/Signup.
If you’re using a different engine, you’ll have to integrate VXAO into it. The integration process is not too complicated, although integrating VXAO is very different from working with screen-space methods. You can get the distribution package here: https://developer.nvidia.com/NVIDIA-VXAO and take a look at the samples and the integration tutorial supplied with them.