Hybrid Frustum Traced Shadows
I recently delivered a presentation on “Advanced Geometrically Correct Shadows for Modern Game Engines” at GDC 2016. If you weren’t fortunate enough to attend GDC, then you can catch up with what I was talking about right here! The presentation itself can be found on our developer website:
NVIDIA HFTS (Hybrid Frustum Traced Shadows) is an advanced hybrid shadow technique that combines frustum tracing, screen-space anti-aliasing and variable penumbra soft shadow filters. Frustum tracing is a form of ray tracing, and as such it produces geometrically exact hard shadows. NVIDIA HFTS smoothly transitions from a geometrically accurate hard shadow to a super soft result in real time. Additionally, it addresses issues that other shadow technologies do not, such as shadow detachment, aliasing, and interference from overlapping blockers. Real time ray traced shadows have been the holy grail of graphics programmers for years, and we believe the inclusion of HFTS in Tom Clancy’s The Division is an industry first in this regard. HFTS takes advantage of a new GPU hardware feature called Conservative Rasterization, which ensures that even the smallest triangles do not escape being frustum traced.
How is Frustum Tracing Different to Ray Tracing?
A conventional ray tracer shoots a ray from a point in the scene into a stored hierarchy of triangles. Based upon where the ray intersects the hierarchy, a subset of triangles is then tested for intersection with the ray. This approach is problematic from the point of view of hierarchy construction, storage and incoherent memory access. HFTS does not store a hierarchy of triangles, and thus circumvents these problems. Instead it creates a list of the screen pixels that map to a given light space texel, a structure also known as an “Irregular Z-Buffer”. When triangles are rendered in light space, a frustum (bounding volume) is constructed using the triangle itself and the light direction. Then all of the screen pixels that map to it are tested using a classic point-in-frustum algorithm. If a screen pixel is deemed to be inside the frustum, then it is marked as in shadow. Figures 1 and 2 below depict the fundamental difference between the two tracing techniques:
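To make the point-in-frustum test concrete, here is a minimal CPU-side sketch in C++. It assumes a directional light, so the frustum is the semi-infinite volume swept by the triangle along the light direction. The names `Vec3` and `pointInTriangleShadowFrustum` are hypothetical and are not taken from the HFTS implementation, which runs on the GPU.

```cpp
#include <array>

struct Vec3 { float x, y, z; };

static Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Returns true if p lies inside the volume swept by the triangle along
// lightDir (the direction the light travels). Assumption: directional light.
bool pointInTriangleShadowFrustum(const Vec3& p,
                                  const std::array<Vec3, 3>& tri,
                                  const Vec3& lightDir)
{
    // Cap plane: p must be strictly on the far side of the triangle from the light.
    const Vec3 n = cross(sub(tri[1], tri[0]), sub(tri[2], tri[0]));
    const float facing = dot(n, lightDir);
    if (facing == 0.0f) return false;            // edge-on triangle sweeps no volume
    if (dot(n, sub(p, tri[0])) * facing <= 0.0f) return false;

    // Side planes: each contains one edge and the light direction.
    for (int i = 0; i < 3; ++i) {
        const Vec3& a = tri[i];
        const Vec3& b = tri[(i + 1) % 3];
        const Vec3& c = tri[(i + 2) % 3];        // opposite vertex, used to orient the plane
        const Vec3 planeN = cross(sub(b, a), lightDir);
        const float inside = dot(planeN, sub(c, a));
        if (dot(planeN, sub(p, a)) * inside < 0.0f) return false;
    }
    return true;
}
```

In the scheme described above, a test like this would run once per screen pixel stored in the light space texels that the triangle covers, rather than against every pixel on screen.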
Constructing the Irregular Z-Buffer
The irregular z-buffer can be thought of simply as a way to build lists of screen pixels that map to light space texels. A pixel shader is used to perform a full screen pass, where each screen pixel is transformed into light space. An atomic exchange then swaps the address of the screen pixel with the address currently stored in the light space texel, and the displaced address becomes that pixel's link to the rest of the list. In this way a fixed memory footprint list structure is created, as shown below in Figure 3:
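The list construction can be illustrated on the CPU with a short C++ sketch. It assumes one head slot per light space texel and one "next" slot per screen pixel, which is what keeps the memory footprint fixed. The type and member names (`IrregularZBuffer`, `insert`, `forEachPixel`) are hypothetical, and `std::atomic::exchange` stands in for the atomic exchange the pixel shader performs.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

using Index = std::int32_t;
constexpr Index kEndOfList = -1;

struct IrregularZBuffer {
    std::vector<std::atomic<Index>> head;  // one list head per light space texel
    std::vector<Index> next;               // one link per screen pixel

    IrregularZBuffer(std::size_t lightTexels, std::size_t screenPixels)
        : head(lightTexels), next(screenPixels, kEndOfList)
    {
        for (auto& h : head) h.store(kEndOfList);
    }

    // Push a screen pixel onto the list of the texel it maps to.
    // The exchange returns the previous head, which becomes this pixel's link.
    void insert(Index pixel, std::size_t texel) {
        const Index previousHead = head[texel].exchange(pixel);
        next[static_cast<std::size_t>(pixel)] = previousHead;
    }

    // Visit every screen pixel stored in one texel's list.
    template <typename Fn>
    void forEachPixel(std::size_t texel, Fn&& fn) const {
        for (Index p = head[texel].load(); p != kEndOfList;
             p = next[static_cast<std::size_t>(p)]) {
            fn(p);
        }
    }
};
```

Because each pixel is inserted exactly once, the structure needs no allocation beyond the two arrays, however unevenly the pixels are distributed over the light space texels.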
Conservative raster is a cool new feature that came along with NVIDIA’s new Maxwell architecture, which means it is accessible today on GeForce GTX 900 Series, second generation Maxwell GPUs. It allows rasterization to generate fragments for every pixel touched by a primitive, as shown in Figure 4 below. Conservative raster is an absolute requirement for frustum tracing to ensure that no triangle is missed. If any single triangle is missed, you end up with holes in the frustum traced shadow result.
In addition to being available in DirectX 11.3 & 12, it is also possible to enable this feature in OpenGL and in DirectX 11:
Direct3D11
Use the NVIDIA NvAPI to create an ID3D11RasterizerState. The interface for this is shown below:
NvAPI_D3D11_RASTERIZER_DESC_EX NVAPI_RS_DESC = {};
// Fill in the remaining rasterizer state fields (fill mode, cull mode, etc.) as usual
NVAPI_RS_DESC.ConservativeRasterEnable = TRUE; // Enable conservative raster
NvAPI_D3D11_CreateRasterizerState( pD3D11Device, (const NvAPI_D3D11_RASTERIZER_DESC_EX*)&NVAPI_RS_DESC, &m_pConservativeRaster );
OpenGL
Use the GL_NV_conservative_raster extension, the specs of which are located here: GL_NV Conservative Raster Extension
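To show why conservative raster matters for small triangles, the following C++ sketch compares the standard pixel-centre coverage rule with a simple overestimating conservative rule. It is a CPU illustration only, not a description of how the hardware implements the feature. It assumes counter-clockwise triangles in pixel coordinates, and the function names are hypothetical. The conservative test shown may also accept a few pixels near sharp corners that the triangle does not actually touch, which is harmless when the goal is never to miss one.

```cpp
struct Vec2 { float x, y; };

// Signed edge function: positive when p is to the left of the directed edge a->b.
static float edgeFunction(const Vec2& a, const Vec2& b, const Vec2& p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// Standard rule: the pixel is covered only if its centre is inside the triangle.
bool coversPixelCentre(const Vec2 tri[3], int px, int py) {
    const Vec2 centre{px + 0.5f, py + 0.5f};
    for (int i = 0; i < 3; ++i) {
        if (edgeFunction(tri[i], tri[(i + 1) % 3], centre) < 0.0f) return false;
    }
    return true;
}

// Conservative (overestimating) rule: for each edge, test the pixel corner
// that lies furthest towards the inside of that edge.
bool coversPixelConservative(const Vec2 tri[3], int px, int py) {
    for (int i = 0; i < 3; ++i) {
        const Vec2& a = tri[i];
        const Vec2& b = tri[(i + 1) % 3];
        // Inward normal of a counter-clockwise edge a->b.
        const float nx = -(b.y - a.y);
        const float ny = b.x - a.x;
        const Vec2 corner{px + (nx >= 0.0f ? 1.0f : 0.0f),
                          py + (ny >= 0.0f ? 1.0f : 0.0f)};
        if (edgeFunction(a, b, corner) < 0.0f) return false;
    }
    return true;
}
```

A triangle that sits entirely inside one pixel without touching its centre is rejected by the first function and accepted by the second, which is exactly the case that would otherwise leave a hole in the frustum traced result.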
Frustum Tracing vs Shadow Mapping
Frustum tracing produces a perfect hard shadow, and that holds true no matter how close you zoom in to look at the details. In effect, it gives you infinite resolution. By contrast, conventional shadow mapping tends to suffer from a lack of resolution, and so produces aliased results. This is shown below in Figures 5 and 6:
It is of course possible to shoot additional rays to achieve an anti-aliased result, but getting this correct is quite tricky, and it is also rather expensive. A simple trick is to make good use of a screen space AA technique such as FXAA. In Figure 7 below, you can clearly see the smoothing benefit that this provides:
In The Division the game engine already had an SMAA implementation in place, so this AA step came for free.
Hard shadows on their own may be useful in some niche circumstances, but the vast majority of games can benefit from contact-hardening style shadows that transition from being hard to soft as the distance between receiver and blocker increases. In order to achieve this you first need to generate a robust interpolation factor, as described in Figure 8 below:
Using the interpolation factor it is then possible to transition from a perfect frustum traced result to the soft shadow result of PCSS. This generates extremely high quality results, as can be seen in Figure 9 below:
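The blend itself can be sketched in a few lines of C++. The linear ramp used here for the interpolation factor is only a placeholder assumption, since the actual factor is the one described in Figure 8. The function name `hybridShadow` and the `transitionDistance` parameter are hypothetical.

```cpp
#include <algorithm>

// hardShadow:        frustum traced result (0 = fully shadowed, 1 = fully lit)
// softShadow:        PCSS result for the same pixel
// blockerToReceiver: distance between the blocker and the receiver
// transitionDistance: distance at which the result becomes fully soft;
//                     assumed to be greater than zero
float hybridShadow(float hardShadow, float softShadow,
                   float blockerToReceiver, float transitionDistance)
{
    // Placeholder interpolation factor: a linear ramp clamped to [0, 1].
    const float t = std::min(std::max(blockerToReceiver / transitionDistance, 0.0f), 1.0f);
    // Linear interpolation from the hard result to the soft result.
    return hardShadow + (softShadow - hardShadow) * t;
}
```

At the contact point the factor is zero, so the hard frustum traced result is used unchanged. Far from the blocker the factor reaches one and the PCSS result takes over completely.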
Figures 10 and 11 below are a comparison taken from Tom Clancy’s The Division, clearly showing the difference between regular PCSS and HFTS:
Call to Action
Because of great new GPU hardware features like conservative raster, we can explore new approaches to problems. This new hybrid approach to shadows combines the best of both worlds, with razor sharp hard shadows transitioning to super soft shadows. The technique is now available as part of ShadowWorks, and you can check it out here:
Additionally, check out the Tom Clancy's The Division Graphics & Performance Guide.