# NVIDIA DXR Helpers: Introduction

The [DXR Tutorial](/rtx/raytracing/dxr/DX12-Raytracing-tutorial-Part-1) uses a small number of helper classes to bridge the gap between the principles of DXR and the actual implementation. These helpers make heavy use of the STL and modern C++ to keep the code size to a minimum. The entirety of the helper code is present on this page, with no dependencies other than the DirectX headers and the STL. The classes are completely independent of each other, and the helper methods directly contain the DX12 code without further indirection.

This document describes the contents of these helpers, which have been designed to be usable either as is, or as a source of code that can be easily extracted and integrated into existing applications. As such, the helpers contain the minimum amount of data and leave the user to manage the GPU activity: in particular, the helpers do not perform any GPU memory allocations. Since every application may have its own memory management, the helpers do not use any smart pointers, leaving the responsibility of pointer management to the application.

This document aims to provide information on the underlying helpers of the tutorial, and does not claim to document the DXR specification and usage exhaustively. For a complete description of DXR, we recommend reading the documentation of the DXR SDK, available in the [DirectXTech Forums](http://www.directxtech.com/). Each section can be read independently, so some content is repeated from one section to another.

The source files of the helper classes can be found here: [DXRHelpers.zip](/rtx/raytracing/dxr/tutorial/Files/DXRHelpers.zip)

# Quick reference

* [`BottomLevelASGenerator`](#toc3): Generating the bottom-level acceleration structure (BLAS)
  * [`AddVertexBuffer`](#toc3.1): Add a vertex buffer to the geometry of the BLAS
  * [`ComputeASBufferSizes`](#toc3.2): Compute the amount of memory required to build the BLAS
  * [`Generate`](#toc3.3): Build the BLAS and store it into a user-provided buffer
* [`TopLevelASGenerator`](#toc4): Create and hold the acceleration structure of the scene
  * [`AddInstance`](#toc4.1): Add an instance to the acceleration structure
  * [`ComputeASBufferSizes`](#toc4.2): Compute the memory requirements to build the TLAS
  * [`Generate`](#toc4.3): Generate and store the TLAS
* [`RootSignatureGenerator`](#toc5): Simple generation of complex root signatures
  * [`AddRangeParameter`](#toc5.1): Add a reference to a range of views within the active heap
  * [`AddHeapRangesParameter`](#toc5.2): Add an explicit reference to a buffer or constants
  * [`Generate`](#toc5.3): Generate the root signature from the parameters
* [`RayTracingPipelineGenerator`](#toc6): Assembling components to generate the raytracing pipeline
  * [`AddLibrary`](#toc6.5): Add a DXIL library representing a shader program
  * [`AddHitGroup`](#toc6.6): Combine intersection, any hit and closest hit programs into a hit group
  * [`AddRootSignatureAssociation`](#toc6.6): Associate programs or hit groups to a root signature
  * [`SetMaxPayloadSize`, `SetMaxAttributeSize`, `SetMaxRecursionDepth`](#toc6.7): Set the global pipeline properties
  * [`Generate`](#toc6.11): Create the pipeline subobjects and generate the raytracing pipeline
* [`ShaderBindingTableGenerator`](#toc7): Constructing the SBT associating geometry and shaders
  * [`AddRayGenerationProgram`](#toc7.2): Add a ray generation program and its resource pointers
  * [`AddMissProgram`](#toc7.3): Add a miss program and its resource pointers
  * [`AddHitGroup`](#toc7.4): Add a hit group and its resource pointers
  * [`Generate`](#toc7.7): Build the SBT and store it into a user-provided buffer
  * [`Reset`](#toc7.9): Remove all program and hit group references from the SBT
  * [`Getters`](#toc7.10): Access the size of the entries and SBT sections to facilitate the `DispatchRays` setup

# Bottom-Level Acceleration Structure

The `BottomLevelAS` class facilitates setting up the geometry to be used as input of the bottom-level acceleration structure (BLAS) builder. This bottom-level hierarchy stores the triangle data in a way suitable for fast ray-triangle intersection at runtime. To be built, this data structure requires some scratch space which has to be allocated by the application. Similarly, the resulting data structure is stored in an application-controlled buffer.

To use the helper, the application must first add all the vertex buffers to be contained in the final structure, using `AddVertexBuffer`. After all buffers have been added, `ComputeASBufferSizes` prepares the build and provides the required sizes for the scratch data and the final result. The `Generate` call finally computes the acceleration structure and stores it in the result buffer. Note that the build is enqueued in the command list, meaning that the scratch buffer needs to be kept until the command list execution is finished.

Here is an example usage:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
// Add the vertex buffers (geometry)
BottomLevelAS bottomLevelAS;
bottomLevelAS.AddVertexBuffer(vertexBuffer, 0, vertexCount, sizeof(Vertex), identityMat.Get(), 0);
bottomLevelAS.AddVertexBuffer(vertexBuffer2, 0, vertexCount2, sizeof(Vertex), identityMat.Get(), 0);
...

// Find the size for the buffers
UINT64 scratchSizeInBytes = 0;
UINT64 resultSizeInBytes = 0;
bottomLevelAS.ComputeASBufferSizes(GetRTDevice(), false, &scratchSizeInBytes, &resultSizeInBytes);
AccelerationStructureBuffers buffers;
buffers.pScratch = nv_helpers_dx12::CreateBuffer(..., scratchSizeInBytes, ...);
buffers.pResult = nv_helpers_dx12::CreateBuffer(..., resultSizeInBytes, ...);

// Generate acceleration structure
bottomLevelAS.Generate(m_commandList.Get(), rtCmdList, buffers.pScratch.Get(),
                       buffers.pResult.Get(), false, nullptr);
return buffers;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This class contains a few members: the vector of geometry descriptors, the scratch and storage memory sizes computed by `ComputeASBufferSizes`, and a flag indicating whether the geometry is dynamic or not.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/// Vertex buffer descriptors used to generate the AS
std::vector<D3D12_RAYTRACING_GEOMETRY_DESC> m_vertexBuffers = {};

/// Amount of temporary memory required by the builder
UINT64 m_scratchSizeInBytes = 0;

/// Amount of memory required to store the AS
UINT64 m_resultSizeInBytes = 0;

/// Flags for the builder, specifying whether to allow iterative updates, or
/// when to perform an update
D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAGS m_flags;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## AddVertexBuffer

The `AddVertexBuffer` method adds a vertex buffer, along with its index buffer in GPU memory, to the acceleration structure. Vertex positions are assumed to be represented by 3 float32 values. At this stage, the method creates a `D3D12_RAYTRACING_GEOMETRY_DESC` descriptor for the geometry, and adds it to the vector of geometries to combine within the BLAS.

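The offset, count and stride arguments of `AddVertexBuffer` follow directly from the layout of the application's vertex type. The sketch below is only an illustration under assumptions: the `Vertex` struct, the `VertexRange` struct and the `MakeVertexRange` function are hypothetical names that are not part of the helpers, and the position is assumed to be the first member of each vertex.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical interleaved vertex: the position comes first, followed by
// other per-vertex data that the BLAS builder skips over using the stride.
struct Vertex {
  float position[3]; // read by the builder as 3 x float32
  float color[4];    // not used by the acceleration structure
};

// Hypothetical bundle of the values passed as vertexOffsetInBytes,
// vertexCount and vertexSizeInBytes.
struct VertexRange {
  std::uint64_t offsetInBytes;
  std::uint32_t count;
  std::uint32_t strideInBytes;
};

// Describe `count` vertices starting at element `first` of a Vertex buffer.
inline VertexRange MakeVertexRange(std::uint32_t first, std::uint32_t count) {
  // The byte offset below points at a position only if it is the first member
  static_assert(offsetof(Vertex, position) == 0,
                "position must be the first member of Vertex");
  return {static_cast<std::uint64_t>(first) * sizeof(Vertex), count,
          static_cast<std::uint32_t>(sizeof(Vertex))};
}
```

With this layout, `MakeVertexRange(2, 3)` describes three vertices starting 56 bytes into the buffer, with a stride of 28 bytes.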
Note that when adding geometry to the BLAS it is possible to pass a `transformBuffer`, which contains a 4x4 transform matrix located at `transformOffsetInBytes`. This allows the application to combine multiple objects within a single BLAS, which is particularly useful to optimize performance on the static parts of the scene. If not provided, an identity matrix is assumed.

This implementation limits the original flexibility of the API:

* No custom intersector support, only triangles
* Vertex positions are in a 3xfloat32 format
* Indices are 32-bit values

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void BottomLevelAS::AddVertexBuffer(
    ID3D12Resource *vertexBuffer,
    // Buffer containing the vertex coordinates, possibly interleaved with
    // other vertex data
    UINT64 vertexOffsetInBytes,
    // Offset of the first vertex in the vertex buffer
    uint32_t vertexCount,
    // Number of vertices to consider in the buffer
    UINT vertexSizeInBytes,
    // Size of a vertex including all its other data, used to stride in the
    // buffer
    ID3D12Resource *indexBuffer,
    // Buffer containing the vertex indices describing the triangles
    UINT64 indexOffsetInBytes,
    // Offset of the first index in the index buffer
    uint32_t indexCount,
    // Number of indices to consider in the buffer
    ID3D12Resource *transformBuffer,
    // Buffer containing a 4x4 transform matrix in GPU memory, to be applied
    // to the vertices. If nullptr, no transform is applied
    UINT64 transformOffsetInBytes,
    // Offset of the transform matrix in the transform buffer
    bool isOpaque /* = true */
    // If true, the geometry is considered opaque, optimizing the search for
    // a closest hit
) {
  // Create the DX12 descriptor representing the input data, assumed to be
  // triangles with 3xf32 vertex coordinates and 32-bit indices
  D3D12_RAYTRACING_GEOMETRY_DESC descriptor = {};
  descriptor.Type = D3D12_RAYTRACING_GEOMETRY_TYPE_TRIANGLES;
  descriptor.Triangles.VertexBuffer.StartAddress =
      vertexBuffer->GetGPUVirtualAddress() + vertexOffsetInBytes;
  descriptor.Triangles.VertexBuffer.StrideInBytes = vertexSizeInBytes;
  descriptor.Triangles.VertexCount = vertexCount;
  descriptor.Triangles.VertexFormat = DXGI_FORMAT_R32G32B32_FLOAT;
  descriptor.Triangles.IndexBuffer =
      indexBuffer ? (indexBuffer->GetGPUVirtualAddress() + indexOffsetInBytes) : 0;
  descriptor.Triangles.IndexFormat = indexBuffer ? DXGI_FORMAT_R32_UINT : DXGI_FORMAT_UNKNOWN;
  descriptor.Triangles.IndexCount = indexCount;
  descriptor.Triangles.Transform =
      transformBuffer ? (transformBuffer->GetGPUVirtualAddress() + transformOffsetInBytes) : 0;
  descriptor.Flags =
      isOpaque ? D3D12_RAYTRACING_GEOMETRY_FLAG_OPAQUE : D3D12_RAYTRACING_GEOMETRY_FLAG_NONE;

  m_vertexBuffers.push_back(descriptor);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## ComputeASBufferSizes

Once all the geometry has been added to the vector of geometry descriptors, we need to estimate the two amounts of memory required to build the BLAS: the size of the scratch space, which is used as temporary storage during the build, and the size of the actual BLAS. This method returns both values, so that the application can allocate the appropriate amounts of memory.

The description of the work to be performed by the builder is provided in the `D3D12_GET_RAYTRACING_ACCELERATION_STRUCTURE_PREBUILD_INFO_DESC` structure.
It provides the pointers to the vertex data descriptors, and a flag indicating whether the AS will be static (`D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_NONE`) or possibly updated over time (`D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_UPDATE`). This flag is stored in the helper for later use during the build.

This information is then passed to `GetRaytracingAccelerationStructurePrebuildInfo`, which provides the required amounts of scratch and storage memory. Note that the buffers will have the same properties as constant buffers, meaning that their size needs to be 256-byte aligned. The required sizes are returned so that the application can allocate the buffers before calling the BLAS builder.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void BottomLevelAS::ComputeASBufferSizes(
    ID3D12DeviceRaytracingPrototype *device,
    // Device on which the build will be performed
    bool allowUpdate,
    // If true, the resulting acceleration structure will allow iterative
    // updates
    UINT64 *scratchSizeInBytes,
    // Required scratch memory on the GPU to build the acceleration structure
    UINT64 *resultSizeInBytes
    // Required GPU memory to store the acceleration structure
) {
  // The generated AS can support iterative updates. This may change the final
  // size of the AS as well as the temporary memory requirements, and hence has
  // to be set before the actual build
  m_flags = allowUpdate ? D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_UPDATE
                        : D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_NONE;

  // Describe the work being requested, in this case the construction of a
  // (possibly dynamic) bottom-level hierarchy, with the given vertex buffers
  D3D12_GET_RAYTRACING_ACCELERATION_STRUCTURE_PREBUILD_INFO_DESC prebuildDesc = {};
  prebuildDesc.Type = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL;
  prebuildDesc.DescsLayout = D3D12_ELEMENTS_LAYOUT_ARRAY;
  prebuildDesc.NumDescs = static_cast<UINT>(m_vertexBuffers.size());
  prebuildDesc.pGeometryDescs = m_vertexBuffers.data();
  prebuildDesc.Flags = m_flags;

  // This structure is used to hold the sizes of the required scratch memory
  // and resulting AS
  D3D12_RAYTRACING_ACCELERATION_STRUCTURE_PREBUILD_INFO info = {};

  // Building the acceleration structure (AS) requires some scratch space, as
  // well as space to store the resulting structure. This function computes a
  // conservative estimate of the memory requirements for both, based on the
  // geometry size.
  device->GetRaytracingAccelerationStructurePrebuildInfo(&prebuildDesc, &info);

  // Buffer sizes need to be 256-byte-aligned
  *scratchSizeInBytes =
      ROUND_UP(info.ScratchDataSizeInBytes, D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT);
  *resultSizeInBytes =
      ROUND_UP(info.ResultDataMaxSizeInBytes, D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT);

  // Store the memory requirements for use during build
  m_scratchSizeInBytes = *scratchSizeInBytes;
  m_resultSizeInBytes = *resultSizeInBytes;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Generate

The BLAS builder `Generate` takes as input the scratch and storage buffers, whose sizes have been computed above. Note that these buffers must be in the default heap, and created with the `D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS` flag before calling `Generate`.

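The buffer sizes passed to `Generate` are the ones rounded up to 256 bytes by `ROUND_UP` in `ComputeASBufferSizes`. The definition of that macro is not shown in this document, so the following is only a minimal sketch of an equivalent rounding, assuming the alignment is a power of two. The names `RoundUp` and `kPlacementAlignment` are hypothetical, and 256 is the value of `D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT` in `d3d12.h`.

```cpp
#include <cassert>
#include <cstdint>

// Round `value` up to the next multiple of `alignment`.
// Only valid when `alignment` is a power of two.
constexpr std::uint64_t RoundUp(std::uint64_t value, std::uint64_t alignment) {
  return (value + alignment - 1) & ~(alignment - 1);
}

// Stand-in for D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT
constexpr std::uint64_t kPlacementAlignment = 256;

// Sizes that are already aligned are left unchanged, others grow to the
// next multiple of 256
static_assert(RoundUp(0, kPlacementAlignment) == 0, "zero stays zero");
static_assert(RoundUp(1, kPlacementAlignment) == 256, "rounds up");
static_assert(RoundUp(256, kPlacementAlignment) == 256, "already aligned");
static_assert(RoundUp(257, kPlacementAlignment) == 512, "next multiple");
```

An application allocating the scratch and result buffers can rely on the returned sizes as they are, since the helper has already applied this rounding.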
If the BLAS is dynamic, then after it has been built once it is possible to set the `updateOnly` parameter and, in that case, also provide a pointer to the current BLAS. The update can be done in-place or not, so it is possible to have `previousResult==resultBuffer`.

The `Generate` call is the only one actually performing any GPU work, hence it requires a command list. Until DXR is natively supported in DirectX12, the `Generate` method requires a pointer to a regular `ID3D12GraphicsCommandList` command list as well as a pointer to the same command list, cast as an `ID3D12CommandListRaytracingPrototype`.

Whether the BLAS is dynamic or not has been indicated in the `ComputeASBufferSizes` method. This allows us to partially check the consistency between the `ComputeASBufferSizes` and `Generate` calls.

The builder work is described in the `D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC` descriptor, which provides the set of geometries to add and the target buffers. This descriptor is then passed to `BuildRaytracingAccelerationStructure`, which enqueues the builder work on the command list. In case the BLAS is used directly within the same command list, the helper contains a barrier to ensure the build is finished before processing further commands. Since the buffers are accessed as unordered access resources, the barrier is a `D3D12_RESOURCE_BARRIER_TYPE_UAV`.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void BottomLevelAS::Generate(
    ID3D12GraphicsCommandList *commandList,
    // Command list on which the build will be enqueued
    ID3D12CommandListRaytracingPrototype *rtCmdList,
    // Same command list, cast into a raytracing list. This will not be
    // needed anymore with Windows 10 RS5.
    ID3D12Resource *scratchBuffer,
    // Scratch buffer used by the builder to store temporary data
    ID3D12Resource *resultBuffer,
    // Result buffer storing the acceleration structure
    bool updateOnly,
    // If true, simply refit the existing acceleration structure
    ID3D12Resource *previousResult
    // Optional previous acceleration structure, used if an iterative update
    // is requested
) {
  D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAGS flags = m_flags;

  // The stored flags represent whether the AS has been built for updates or
  // not. If yes and an update is requested, the builder is told to only update
  // the AS instead of fully rebuilding it
  if (flags == D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_UPDATE && updateOnly)
    flags = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PERFORM_UPDATE;

  // Sanity checks
  if (m_flags != D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_UPDATE && updateOnly)
    throw std::logic_error("Cannot update a bottom-level AS not originally built for updates");
  if (updateOnly && previousResult == nullptr)
    throw std::logic_error("Bottom-level hierarchy update requires the previous hierarchy");
  if (m_resultSizeInBytes == 0 || m_scratchSizeInBytes == 0)
    throw std::logic_error("Invalid scratch and result buffer sizes - ComputeASBufferSizes needs "
                           "to be called before Generate");

  // Create a descriptor of the requested builder work, to generate a
  // bottom-level AS from the input parameters
  D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC buildDesc = {};
  buildDesc.Type = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL;
  buildDesc.DescsLayout = D3D12_ELEMENTS_LAYOUT_ARRAY;
  buildDesc.NumDescs = static_cast<UINT>(m_vertexBuffers.size());
  buildDesc.pGeometryDescs = m_vertexBuffers.data();
  buildDesc.DestAccelerationStructureData = {resultBuffer->GetGPUVirtualAddress(),
                                             m_resultSizeInBytes};
  buildDesc.ScratchAccelerationStructureData = {scratchBuffer->GetGPUVirtualAddress(),
                                                m_scratchSizeInBytes};
  buildDesc.SourceAccelerationStructureData =
      previousResult ? previousResult->GetGPUVirtualAddress() : 0;
  buildDesc.Flags = flags;

  // Generate the AS
  rtCmdList->BuildRaytracingAccelerationStructure(&buildDesc);

  // Wait for the builder to complete by setting a barrier on the resulting
  // buffer. This is particularly important as the construction of the
  // top-level hierarchy may be called right afterwards, before executing the
  // command list.
  D3D12_RESOURCE_BARRIER uavBarrier;
  uavBarrier.Type = D3D12_RESOURCE_BARRIER_TYPE_UAV;
  uavBarrier.UAV.pResource = resultBuffer;
  uavBarrier.Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE;
  commandList->ResourceBarrier(1, &uavBarrier);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# Top-Level Acceleration Structure

The `TopLevelAS` class embeds the code required to compute the top-level acceleration structure (TLAS), which binds together a set of BLAS described in the section above. The top-level hierarchy stores a set of instances represented by bottom-level hierarchies in a way suitable for fast intersection at runtime. To be built, this data structure requires some scratch space which has to be allocated by the application. Similarly, the resulting data structure is stored in an application-controlled buffer.

To use the helper, the application must first add all the instances to be contained in the final structure, using `AddInstance`. After all instances have been added, `ComputeASBufferSizes` prepares the build and provides the required sizes for the scratch data and the final result. The `Generate` call finally computes the acceleration structure and stores it in the result buffer. Note that the build is enqueued in the command list, meaning that the scratch buffer needs to be kept until the command list execution is finished.

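Since the helpers leave resource lifetime to the application, one possible way to keep scratch buffers alive until the GPU has finished with them is a small deferred-release queue keyed on fence values. This is only a sketch of one such scheme, not something the helpers provide: the `DeferredReleaseQueue` class and its methods are hypothetical, and `Resource` stands for whatever owning handle the application uses (for example a `Microsoft::WRL::ComPtr<ID3D12Resource>`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>

// Hypothetical application-side helper: owns resources until the GPU has
// passed the fence value signaled after the command list that uses them.
template <typename Resource>
class DeferredReleaseQueue {
public:
  // Keep `resource` alive until `fenceValue` has been reached on the GPU.
  // Fence values are expected to be passed in non-decreasing order.
  void Retain(std::uint64_t fenceValue, Resource resource) {
    assert(m_pending.empty() || m_pending.back().first <= fenceValue);
    m_pending.emplace_back(fenceValue, std::move(resource));
  }

  // Drop every resource whose fence value is at most `completedFenceValue`
  // and return how many were released.
  std::size_t Collect(std::uint64_t completedFenceValue) {
    std::size_t released = 0;
    while (!m_pending.empty() && m_pending.front().first <= completedFenceValue) {
      m_pending.pop_front();
      ++released;
    }
    return released;
  }

  // Number of resources still waiting for the GPU
  std::size_t PendingCount() const { return m_pending.size(); }

private:
  std::deque<std::pair<std::uint64_t, Resource>> m_pending;
};
```

In a DX12 application, `completedFenceValue` would typically come from `ID3D12Fence::GetCompletedValue`, and `Collect` would be called once per frame.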
Here is an example usage from the tutorial:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// Add all instances of the scene
TopLevelAS topLevelAS;
topLevelAS.AddInstance(instances1, matrix1, instanceId1, hitGroupIndex1);
topLevelAS.AddInstance(instances2, matrix2, instanceId2, hitGroupIndex2);
...

// Find the size of the buffers to store the AS
UINT64 scratchSize, resultSize, instanceDescsSize;
topLevelAS.ComputeASBufferSizes(GetRTDevice(), true, &scratchSize, &resultSize,
                                &instanceDescsSize);

// Create the AS buffers
AccelerationStructureBuffers buffers;
buffers.pScratch = nv_helpers_dx12::CreateBuffer(..., scratchSize, ...);
buffers.pResult = nv_helpers_dx12::CreateBuffer(..., resultSize, ...);
buffers.pInstanceDesc = nv_helpers_dx12::CreateBuffer(..., instanceDescsSize, ...);

// Generate the top level acceleration structure
topLevelAS.Generate(m_commandList.Get(), rtCmdList, buffers.pScratch.Get(),
                    buffers.pResult.Get(), buffers.pInstanceDesc.Get(), updateOnly,
                    updateOnly ? buffers.pResult.Get() : nullptr);
return buffers;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The `TopLevelAS` class contains an internal structure to store the description of the instances, namely a pointer to the corresponding BLAS, a transform matrix, the instance index accessible as `InstanceID()` in the HLSL code, and the hit group index defining the first hit group of the Shader Binding Table corresponding to that particular instance.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/// Helper struct storing the instance data
struct Instance
{
  Instance(ID3D12Resource* blAS, const DirectX::XMMATRIX& tr, UINT iID, UINT hgId);
  /// Bottom-level AS
  ID3D12Resource* bottomLevelAS;
  /// Transform matrix
  const DirectX::XMMATRIX& transform;
  /// Instance ID visible in the shader
  UINT instanceID;
  /// Hit group index used to fetch the shaders from the SBT
  UINT hitGroupIndex;
};
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The instances are stored in a vector for later use:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/// Instances contained in the top-level AS
std::vector<Instance> m_instances;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The class also contains a flag indicating whether the TLAS supports dynamic updates or not, and the amounts of memory required by the builder: the scratch memory to store temporary data during the build only, the size of the buffer containing the description of the instances, and the final buffer containing the TLAS itself.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/// Construction flags, indicating whether the AS supports iterative updates
D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAGS m_flags;

/// Size of the temporary memory used by the TLAS builder
UINT64 m_scratchSizeInBytes;

/// Size of the buffer containing the instance descriptors
UINT64 m_instanceDescsSizeInBytes;

/// Size of the buffer containing the TLAS
UINT64 m_resultSizeInBytes;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## AddInstance

This method adds an instance to the top-level acceleration structure.
The instance is represented by a bottom-level AS, a transform, an instance ID and the index of the hit group indicating which shaders are executed upon hitting any geometry within the instance. It simply enqueues the instance data in the vector of instances.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void TopLevelAS::AddInstance(
    ID3D12Resource *bottomLevelAS,
    // Bottom-level acceleration structure containing the actual geometric
    // data of the instance
    const DirectX::XMMATRIX &transform,
    // Transform matrix to apply to the instance, allowing the same
    // bottom-level AS to be used at several world-space positions
    UINT instanceID,
    // Instance ID, which can be used in the shaders to identify this
    // specific instance
    UINT hitGroupIndex
    // Hit group index, corresponding to the index of the hit group in the
    // Shader Binding Table that will be invoked upon hitting the geometry
) {
  m_instances.emplace_back(Instance(bottomLevelAS, transform, instanceID, hitGroupIndex));
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## ComputeASBufferSizes

Once all instances have been added to the vector of instance descriptors, we need to estimate the three amounts of memory required to build the TLAS: the size of the scratch space, which is used as temporary storage during the build, the size of the buffer holding the instance descriptors, and the size of the actual TLAS. This method returns all three values, so that the application can allocate the appropriate amounts of memory.

The description of the work to be performed by the builder is provided in the `D3D12_GET_RAYTRACING_ACCELERATION_STRUCTURE_PREBUILD_INFO_DESC` structure. It provides the number of instances, and a flag indicating whether the AS will be static (`D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_NONE`) or possibly updated over time (`D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_UPDATE`).
This flag is stored in the helper for later use during the build.

This information is then passed to `GetRaytracingAccelerationStructurePrebuildInfo`, which provides the required amounts of scratch and storage memory. Note that the buffers will have the same properties as constant buffers, meaning that their size needs to be 256-byte aligned. The size of the instance descriptor buffer is simply given by the number of instances and the size of the `D3D12_RAYTRACING_INSTANCE_DESC` structure. The required sizes are returned so that the application can allocate the buffers before calling the TLAS builder. See the `Generate` section for the requirements on the buffers themselves.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void TopLevelAS::ComputeASBufferSizes(
    ID3D12DeviceRaytracingPrototype *device,
    // Device on which the build will be performed
    bool allowUpdate,
    // If true, the resulting acceleration structure will allow iterative
    // updates
    UINT64 *scratchSizeInBytes,
    // Required scratch memory on the GPU to build the acceleration structure
    UINT64 *resultSizeInBytes,
    // Required GPU memory to store the acceleration structure
    UINT64 *descriptorsSizeInBytes
    // Required GPU memory to store instance descriptors, containing the
    // matrices, indices etc.
) {
  // The generated AS can support iterative updates. This may change the final
  // size of the AS as well as the temporary memory requirements, and hence has
  // to be set before the actual build
  m_flags = allowUpdate ? D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_UPDATE
                        : D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_NONE;

  // Describe the work being requested, in this case the construction of a
  // (possibly dynamic) top-level hierarchy, with the given instance descriptors
  D3D12_GET_RAYTRACING_ACCELERATION_STRUCTURE_PREBUILD_INFO_DESC prebuildDesc = {};
  prebuildDesc.Type = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_TOP_LEVEL;
  prebuildDesc.DescsLayout = D3D12_ELEMENTS_LAYOUT_ARRAY;
  prebuildDesc.NumDescs = static_cast<UINT>(m_instances.size());
  prebuildDesc.Flags = m_flags;

  // This structure is used to hold the sizes of the required scratch memory and
  // resulting AS
  D3D12_RAYTRACING_ACCELERATION_STRUCTURE_PREBUILD_INFO info = {};

  // Building the acceleration structure (AS) requires some scratch space, as
  // well as space to store the resulting structure. This function computes a
  // conservative estimate of the memory requirements for both, based on the
  // number of bottom-level instances.
  device->GetRaytracingAccelerationStructurePrebuildInfo(&prebuildDesc, &info);

  // Buffer sizes need to be 256-byte-aligned
  info.ResultDataMaxSizeInBytes =
      ROUND_UP(info.ResultDataMaxSizeInBytes, D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT);
  info.ScratchDataSizeInBytes =
      ROUND_UP(info.ScratchDataSizeInBytes, D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT);

  m_resultSizeInBytes = info.ResultDataMaxSizeInBytes;
  m_scratchSizeInBytes = info.ScratchDataSizeInBytes;

  // The instance descriptors are stored as-is in GPU memory, so we can deduce
  // the required size from the instance count
  m_instanceDescsSizeInBytes =
      ROUND_UP(sizeof(D3D12_RAYTRACING_INSTANCE_DESC) * static_cast<UINT64>(m_instances.size()),
               D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT);

  *scratchSizeInBytes = m_scratchSizeInBytes;
  *resultSizeInBytes = m_resultSizeInBytes;
  *descriptorsSizeInBytes = m_instanceDescsSizeInBytes;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Generate

The TLAS builder `Generate` takes as input the scratch, instance descriptor and storage buffers, whose sizes have been computed above. The scratch and storage buffers must be in the default heap, and created with the `D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS` flag before calling `Generate`. The instance descriptor buffer must be in the upload heap as it will be mapped within the `Generate` method.

If the TLAS is dynamic, then after it has been built once it is possible to set the `updateOnly` parameter and, in that case, also provide a pointer to the current TLAS. The update can be done in-place or not, so it is possible to have `previousResult==resultBuffer`.

The `Generate` call is the only one actually performing any GPU work, hence it requires a command list. Until DXR is natively supported in DirectX12, the `Generate` method requires a pointer to a regular `ID3D12GraphicsCommandList` command list as well as a pointer to the same command list, cast as an `ID3D12CommandListRaytracingPrototype`.

Whether the TLAS is dynamic or not has been indicated in the `ComputeASBufferSizes` method. This allows us to partially check the consistency between the `ComputeASBufferSizes` and `Generate` calls.

`Generate` first maps the instance descriptor buffer and copies the instance data into it. The actual builder work is described in the `D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC` descriptor, which provides the set of instances to add and the scratch, instance and result buffers. This descriptor is then passed to `BuildRaytracingAccelerationStructure`, which enqueues the builder work on the command list. In case the TLAS is used directly within the same command list, the helper contains a barrier to ensure the build is finished before processing further commands. Since the buffers are accessed as unordered access resources, the barrier is a `D3D12_RESOURCE_BARRIER_TYPE_UAV`.

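The consistency check between `ComputeASBufferSizes` and `Generate` boils down to a small decision on the build flags. The sketch below isolates that decision so that it can be read without the DX12 headers. It rests on assumptions: the `BuildFlags` enum is a stand-in for `D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAGS`, and `SelectBuildFlags` is a hypothetical function that is not part of the helpers.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-in for the three D3D12 build flags used by the helpers
enum class BuildFlags { None, AllowUpdate, PerformUpdate };

// Return the flags to pass to the builder, given the flags stored by
// ComputeASBufferSizes and the arguments of Generate.
inline BuildFlags SelectBuildFlags(BuildFlags storedFlags, bool updateOnly,
                                   const void *previousResult) {
  // An update is only legal if the AS was sized and built with AllowUpdate
  if (updateOnly && storedFlags != BuildFlags::AllowUpdate)
    throw std::logic_error("Cannot update an AS not originally built for updates");
  // An update refits an existing AS, which therefore has to be provided
  if (updateOnly && previousResult == nullptr)
    throw std::logic_error("An update requires the previous AS");
  // Full builds keep the stored flags, updates ask the builder to refit
  return updateOnly ? BuildFlags::PerformUpdate : storedFlags;
}
```

Performing both checks before touching `previousResult` ensures that an invalid update request is reported as an exception rather than as a null pointer dereference.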
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void TopLevelAS::Generate(
    ID3D12GraphicsCommandList *commandList,
    // Command list on which the build will be enqueued
    ID3D12CommandListRaytracingPrototype *rtCmdList,
    // Same command list, cast into a raytracing list. This will not be
    // needed anymore with Windows 10 RS5.
    ID3D12Resource *scratchBuffer,
    // Scratch buffer used by the builder to store temporary data
    ID3D12Resource *resultBuffer,
    // Result buffer storing the acceleration structure
    ID3D12Resource *descriptorsBuffer,
    // Auxiliary result buffer containing the instance descriptors, has to be
    // in upload heap
    bool updateOnly /*= false*/,
    // If true, simply refit the existing acceleration structure
    ID3D12Resource *previousResult /*= nullptr*/
    // Optional previous acceleration structure, used if an iterative update
    // is requested
) {
  // Copy the descriptors in the target descriptor buffer
  D3D12_RAYTRACING_INSTANCE_DESC *instanceDescs = nullptr;
  descriptorsBuffer->Map(0, nullptr, (void **)&instanceDescs);
  if (!instanceDescs)
    throw std::logic_error("Cannot map the instance descriptor buffer - is it "
                           "in the upload heap?");

  UINT instanceCount = static_cast<UINT>(m_instances.size());

  // Initialize the memory to zero on the first time only
  if (!updateOnly) {
    ZeroMemory(instanceDescs, m_instanceDescsSizeInBytes);
  }

  // Create the description for each instance
  for (uint32_t i = 0; i < instanceCount; i++) {
    // Instance ID visible in the shader in InstanceID()
    instanceDescs[i].InstanceID = m_instances[i].instanceID;
    // Index of the hit group invoked upon intersection
    instanceDescs[i].InstanceContributionToHitGroupIndex = m_instances[i].hitGroupIndex;
    // Instance flags, including backface culling, winding, etc - TODO: should
    // be accessible from outside
    instanceDescs[i].Flags = D3D12_RAYTRACING_INSTANCE_FLAG_NONE;
    // Instance transform matrix, transposed so that its first three rows match
    // the row-major 3x4 layout expected by the INSTANCE_DESC
    DirectX::XMMATRIX m = XMMatrixTranspose(m_instances[i].transform);
    memcpy(instanceDescs[i].Transform, &m, sizeof(instanceDescs[i].Transform));
    // Get access to the bottom level
    instanceDescs[i].AccelerationStructure = m_instances[i].bottomLevelAS->GetGPUVirtualAddress();
    // Visibility mask, always visible here - TODO: should be accessible from
    // outside
    instanceDescs[i].InstanceMask = 0xFF;
  }

  descriptorsBuffer->Unmap(0, nullptr);

  // Sanity checks, performed before previousResult is dereferenced below
  if (m_flags != D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_UPDATE && updateOnly)
    throw std::logic_error("Cannot update a top-level AS not originally built for updates");
  if (updateOnly && previousResult == nullptr)
    throw std::logic_error("Top-level hierarchy update requires the previous hierarchy");

  // If this is an update operation we need to provide the source buffer
  D3D12_GPU_VIRTUAL_ADDRESS pSourceAS = updateOnly ? previousResult->GetGPUVirtualAddress() : 0;

  D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAGS flags = m_flags;
  // The stored flags represent whether the AS has been built for updates or
  // not. If yes and an update is requested, the builder is told to only update
  // the AS instead of fully rebuilding it
  if (flags == D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_UPDATE && updateOnly) {
    flags = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PERFORM_UPDATE;
  }

  // Create a descriptor of the requested builder work, to generate a top-level
  // AS from the input parameters
  D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC buildDesc = {};
  buildDesc.Type = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_TOP_LEVEL;
  buildDesc.DescsLayout = D3D12_ELEMENTS_LAYOUT_ARRAY;
  buildDesc.InstanceDescs = descriptorsBuffer->GetGPUVirtualAddress();
  buildDesc.NumDescs = instanceCount;
  buildDesc.DestAccelerationStructureData = {resultBuffer->GetGPUVirtualAddress(),
                                             m_resultSizeInBytes};
  buildDesc.ScratchAccelerationStructureData = {scratchBuffer->GetGPUVirtualAddress(),
                                                m_scratchSizeInBytes};
  buildDesc.SourceAccelerationStructureData = pSourceAS;
  buildDesc.Flags = flags;

  // Build the top-level AS
  rtCmdList->BuildRaytracingAccelerationStructure(&buildDesc);

  // Wait for the builder to complete by setting a barrier on the resulting
  // buffer. This can be important in case the rendering is triggered
  // immediately afterwards, without executing the command list
  D3D12_RESOURCE_BARRIER uavBarrier;
  uavBarrier.Type = D3D12_RESOURCE_BARRIER_TYPE_UAV;
  uavBarrier.UAV.pResource = resultBuffer;
  uavBarrier.Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE;
  commandList->ResourceBarrier(1, &uavBarrier);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# Root Signature Compiler

The `RootSignatureCompiler` class is not specific to DXR: it applies to DirectX12 in general, and
simplifies writing root signatures by letting the user add components one at a time. In the context
of DXR the order in which the addition methods are called matters, as it maps directly to the slots
of the heap or of the Shader Binding Table to which buffer pointers will be bound.

Example to create an empty root signature:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
nv_helpers_dx12::RootSignatureCompiler rsc;
return rsc.Generate(m_device.Get(), true);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Example to create a signature with one constant buffer as a root parameter, by default bound to
`register(b0, space0)`:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
nv_helpers_dx12::RootSignatureCompiler rsc;
rsc.AddRootParameter(D3D12_ROOT_PARAMETER_TYPE_CBV);
return rsc.Generate(m_device.Get(), true);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Root signature referencing ranges in the heap, with explicit register setting.
Each range is defined by its starting register number, the number of successive buffers of that
type, the register space, the range type, and the index in the heap where the buffer pointer can be
found. For example, `{0,1,0, D3D12_DESCRIPTOR_RANGE_TYPE_UAV, 0}` means (in order) that

* The first buffer will be accessible in the shader as `u0`
* There is only one buffer in that range
* The register space is `space0`
* The buffer is an Unordered Access View (UAV)
* The buffer pointer is stored in the first slot of the heap

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
nv_helpers_dx12::RootSignatureCompiler rsc;
rsc.AddHeapRangesParameter({{0,1,0, D3D12_DESCRIPTOR_RANGE_TYPE_UAV, 0},
                            {0,1,0, D3D12_DESCRIPTOR_RANGE_TYPE_SRV, 1},
                            {0,1,0, D3D12_DESCRIPTOR_RANGE_TYPE_CBV, 2}});
return rsc.Generate(m_device.Get(), true);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To simplify the implementation of the helper we use `std::tuple`, along with an enumeration that
makes tuple access more readable:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
enum {
  RSC_BASE_SHADER_REGISTER = 0,
  RSC_NUM_DESCRIPTORS = 1,
  RSC_REGISTER_SPACE = 2,
  RSC_RANGE_TYPE = 3,
  RSC_OFFSET_IN_DESCRIPTORS_FROM_TABLE_START = 4
};
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All root signature parameters, whether they are heap descriptor ranges or actual root parameters
(buffers or constants), are described by a `D3D12_ROOT_PARAMETER` structure.
The helper class stores the list of those parameters:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// Root parameter descriptors
std::vector<D3D12_ROOT_PARAMETER> m_parameters;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the case of range descriptors the `D3D12_ROOT_PARAMETER` structure stores a pointer to an array
of ranges. Since root parameters can be added iteratively and the helper uses `std::vector`, memory
can get reallocated, which would invalidate such pointers. Instead, the helper stores the ranges
separately in `m_ranges`. For each root parameter, `m_rangeLocations` indicates the index of the
corresponding range array. The pointers are then computed when compiling the root signature.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/// Heap range descriptors
std::vector<std::vector<D3D12_DESCRIPTOR_RANGE>> m_ranges;
/// For each entry of m_parameters, indicate the index of the range array in m_ranges, and ~0u if
/// the parameter is not a heap range descriptor
std::vector<UINT> m_rangeLocations;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## AddHeapRangesParameter

The `AddHeapRangesParameter` method adds a set of heap range descriptors as a parameter of the root
signature. It appends the ranges to the vector of range vectors, and creates a
`D3D12_ROOT_PARAMETER` recording that the root parameter describes a set of ranges in the heap
(`D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE`) along with the number of ranges. Instead of directly
storing the pointer to the ranges, we store in `m_rangeLocations` the index of the `m_ranges` entry
holding the actual ranges. The pointer will be resolved in `Generate`, after all parameters have
been added.
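
The reason for storing indices rather than pointers can be shown in isolation. The following is a minimal sketch that uses hypothetical stand-in types (`Range`, `Parameter`, `DeferredTable`) instead of the D3D12 structures, so that it compiles without the DirectX headers. It is not the helper's code: it only mirrors the idea of recording an index when a parameter is added and resolving the pointers in a single pass once no more additions can reallocate the storage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for D3D12_DESCRIPTOR_RANGE and D3D12_ROOT_PARAMETER
struct Range {
  uint32_t baseRegister;
  uint32_t count;
};
struct Parameter {
  const Range *ranges; // Left null until Resolve() is called
  uint32_t rangeCount;
};

class DeferredTable {
public:
  // Store the ranges and remember where they live by index, not by address
  void AddRanges(const std::vector<Range> &ranges) {
    m_ranges.push_back(ranges);
    m_parameters.push_back({nullptr, static_cast<uint32_t>(ranges.size())});
    m_rangeLocations.push_back(static_cast<uint32_t>(m_ranges.size() - 1));
  }

  // A parameter without ranges still gets an entry, to keep both vectors aligned
  void AddPlain() {
    m_parameters.push_back({nullptr, 0});
    m_rangeLocations.push_back(~0u);
  }

  // Once all additions are done the storage is stable, so addresses can be taken
  const std::vector<Parameter> &Resolve() {
    for (size_t i = 0; i < m_parameters.size(); i++) {
      if (m_rangeLocations[i] != ~0u)
        m_parameters[i].ranges = m_ranges[m_rangeLocations[i]].data();
    }
    return m_parameters;
  }

private:
  std::vector<std::vector<Range>> m_ranges;
  std::vector<Parameter> m_parameters;
  std::vector<uint32_t> m_rangeLocations;
};
```

Had `AddRanges` stored `m_ranges.back().data()` directly, a later `push_back` on `m_ranges` could move the inner vectors and leave earlier parameters pointing at stale memory. The helper's own implementation of the method follows: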
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void RootSignatureCompiler::AddHeapRangesParameter(
    const std::vector<D3D12_DESCRIPTOR_RANGE> &ranges) {
  m_ranges.push_back(ranges);

  // A set of ranges on the heap is a descriptor table parameter
  D3D12_ROOT_PARAMETER param = {};
  param.ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE;
  param.DescriptorTable.NumDescriptorRanges = static_cast<UINT>(ranges.size());
  // The range pointer is kept null here, and will be resolved when generating the root signature
  // (see explanation of m_rangeLocations below)
  param.DescriptorTable.pDescriptorRanges = nullptr;

  // All parameters (heap ranges and root parameters) are added to the same parameter list to
  // preserve order
  m_parameters.push_back(param);

  // The descriptor table descriptor ranges require a pointer to the descriptor ranges. Since new
  // ranges can be dynamically added in the vector, we separately store the index of the range set.
  // The actual address will be solved when generating the actual root signature
  m_rangeLocations.push_back(static_cast<UINT>(m_ranges.size() - 1));
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To avoid explicitly creating a vector of `D3D12_DESCRIPTOR_RANGE`, and to allow the use of
initialization lists, an `AddHeapRangesParameter` overload creates the descriptors from a vector of
`std::tuple`:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void RootSignatureCompiler::AddHeapRangesParameter(
    std::vector<std::tuple<UINT,                        // BaseShaderRegister
                           UINT,                        // NumDescriptors
                           UINT,                        // RegisterSpace
                           D3D12_DESCRIPTOR_RANGE_TYPE, // RangeType
                           UINT                         // OffsetInDescriptorsFromTableStart
                           >>
        ranges) {
  // Build and store the set of descriptors for the ranges
  std::vector<D3D12_DESCRIPTOR_RANGE> rangeStorage;
  for (const auto &input : ranges) {
    D3D12_DESCRIPTOR_RANGE r = {};
    r.BaseShaderRegister = std::get<RSC_BASE_SHADER_REGISTER>(input);
    r.NumDescriptors = std::get<RSC_NUM_DESCRIPTORS>(input);
    r.RegisterSpace = std::get<RSC_REGISTER_SPACE>(input);
    r.RangeType = std::get<RSC_RANGE_TYPE>(input);
    r.OffsetInDescriptorsFromTableStart =
        std::get<RSC_OFFSET_IN_DESCRIPTORS_FROM_TABLE_START>(input);
    rangeStorage.push_back(r);
  }

  // Add those ranges to the heap parameters
  AddHeapRangesParameter(rangeStorage);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## AddRootParameter

This method adds a root parameter to the shader, defined by its type: constant buffer (CBV), shader
resource (SRV), unordered access (UAV), or root constant (CBV, directly defined by its value
instead of a buffer). The `shaderRegister` and `registerSpace` arguments indicate how to access the
parameter in the HLSL code, e.g. an SRV with `shaderRegister==1` and `registerSpace==0` is
accessible via `register(t1, space0)`. In case of a root constant, the last parameter indicates how
many successive 32-bit constants will be bound.

The root parameter descriptor `D3D12_ROOT_PARAMETER` is simply built and added to the vector of
parameters. Since a root parameter does not refer to a range, the value of `m_rangeLocations` at
this index is set to `~0u`. This value is arbitrary, however: the only requirement is that
`m_parameters` and `m_rangeLocations` are kept aligned.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void RootSignatureCompiler::AddRootParameter(D3D12_ROOT_PARAMETER_TYPE type,
                                             UINT shaderRegister /*= 0*/,
                                             UINT registerSpace /*= 0*/,
                                             UINT numRootConstants /*= 1*/) {
  D3D12_ROOT_PARAMETER param = {};
  param.ParameterType = type;
  // The descriptor is a union, so specific values need to be set in case the parameter is a
  // constant instead of a buffer.
  if (type == D3D12_ROOT_PARAMETER_TYPE_32BIT_CONSTANTS) {
    param.Constants.Num32BitValues = numRootConstants;
    param.Constants.RegisterSpace = registerSpace;
    param.Constants.ShaderRegister = shaderRegister;
  } else {
    param.Descriptor.RegisterSpace = registerSpace;
    param.Descriptor.ShaderRegister = shaderRegister;
  }

  // We default the visibility to all shaders
  param.ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL;

  // Add the root parameter to the set of parameters,
  m_parameters.push_back(param);
  // and indicate that there will be no range
  // location to indicate since this parameter is not part of the heap
  m_rangeLocations.push_back(~0u);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Generate

The `Generate` call creates the root signature from the set of parameters, in the order of the
addition calls. DXR introduces the concept of global and local root signatures, where global ones
are the usual root signatures and local ones are the root signatures of the shaders used in the
raytracing pipeline.

The method first goes through the vector of parameters `m_parameters`, and resolves the range
pointers for the heap range descriptors only
(`ParameterType == D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE`). From the parameters the helper
builds a `D3D12_ROOT_SIGNATURE_DESC` providing the pointer to the array of `D3D12_ROOT_PARAMETER`.
The creation of the root signature itself follows the usual template: the root signature is
serialized from its parameters, and the actual `ID3D12RootSignature` is created from the serialized
result.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ID3D12RootSignature *RootSignatureCompiler::Generate(ID3D12Device *device, bool isLocal) {
  // Go through all the parameters, and set the actual addresses of the heap range descriptors based
  // on their indices in the range set array
  for (size_t i = 0; i < m_parameters.size(); i++) {
    if (m_parameters[i].ParameterType == D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE) {
      m_parameters[i].DescriptorTable.pDescriptorRanges = m_ranges[m_rangeLocations[i]].data();
    }
  }

  // Specify the root signature with its set of parameters
  D3D12_ROOT_SIGNATURE_DESC rootDesc = {};
  rootDesc.NumParameters = static_cast<UINT>(m_parameters.size());
  rootDesc.pParameters = m_parameters.data();
  // Set the flags of the signature. By default root signatures are global, for example for vertex
  // and pixel shaders. For raytracing shaders the root signatures are local.
  rootDesc.Flags =
      isLocal ? D3D12_ROOT_SIGNATURE_FLAG_LOCAL_ROOT_SIGNATURE : D3D12_ROOT_SIGNATURE_FLAG_NONE;

  // Create the root signature from its descriptor
  ID3DBlob *pSigBlob;
  ID3DBlob *pErrorBlob;
  HRESULT hr = D3D12SerializeRootSignature(&rootDesc, D3D_ROOT_SIGNATURE_VERSION_1_0, &pSigBlob,
                                           &pErrorBlob);
  if (FAILED(hr)) {
    throw std::logic_error("Cannot serialize root signature");
  }
  ID3D12RootSignature *pRootSig;
  hr = device->CreateRootSignature(0, pSigBlob->GetBufferPointer(), pSigBlob->GetBufferSize(),
                                   IID_PPV_ARGS(&pRootSig));
  if (FAILED(hr)) {
    throw std::logic_error("Cannot create root signature");
  }
  return pRootSig;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# Raytracing Pipeline

The raytracing pipeline combines the raytracing shaders into a state object, which can be thought of
as an executable GPU program. For that, it requires the shaders compiled as DXIL libraries, where
each library exports symbols in a way similar to DLLs.
Those symbols are then used to refer to the shader libraries when creating hit groups, associating
the shaders to their root signatures and declaring the steps of the pipeline. All the calls to this
helper class can be made in arbitrary order. Some basic sanity checks are also performed when
compiling in debug mode.

Note that the `RayTracingPipeline` helper addresses a common use case of the raytracing pipeline,
in which all pipeline subobjects are defined within the same collection. More advanced usages are
described in the DXR specification; even then, most of the code of this class could be reused.

Simple usage of this class: we first import DXIL libraries containing the code for the shaders,
along with the names of the shader functions. We create a hit group from one of the imported
symbols (`ClosestHit`), and associate each shader symbol with its corresponding root signature. The
final step of the setup is to set the size of the ray payload, which carries data between the hit
or miss shaders and the ray generation shader, and the size of the intersection attributes. The
latter are generated by the intersection shader, and contain 2 floating-point values for the
built-in intersector (the barycentric coordinates of the hit). The recursion depth indicates how
many `TraceRay` calls can be nested, i.e. how many hit shaders can be recursively called. For
example, tracing only primary rays from the camera corresponds to a depth of 1. If shadow rays are
traced from the hits, then the depth will be 2. In general, it is best to keep that depth as low as
possible. Finally, we compile the pipeline into what can be thought of as an executable program
representing the raytracing process.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// Add all compiled shaders and declared functions
pipeline.AddLibrary(m_rayGenLibrary.Get(), {L"RayGen"});
pipeline.AddLibrary(m_missLibrary.Get(), {L"Miss"});
pipeline.AddLibrary(m_hitLibrary.Get(), {L"ClosestHit"});

// Create a hit group for hit shaders
pipeline.AddHitGroup(L"HitGroup", L"ClosestHit");

// Associate all the root signatures with the shaders
pipeline.AddRootSignatureAssociation(m_rayGenSignature.Get(), {L"RayGen"});
pipeline.AddRootSignatureAssociation(m_missSignature.Get(), {L"Miss"});
pipeline.AddRootSignatureAssociation(m_hitSignature.Get(), {L"HitGroup"});

// Defining the maximum payload for all shaders
pipeline.SetMaxPayloadSize(4 * sizeof(float)); // RGB + distance

// Defining the maximum attribute for all shaders
pipeline.SetMaxAttributeSize(2 * sizeof(float)); // barycentric coordinates

// How many recursions of TraceRay will be allowed
pipeline.SetMaxRecursionDepth(1);

rtStateObject = pipeline.Generate();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Internally, this class defines a few concepts following the API, such as DXIL libraries, hit
groups, and root signature associations.

A `Library` simply stores an `IDxcBlob` pointer to the DXIL code, a list of exported symbol strings,
and the descriptors that will be used to set up the pipeline subobjects: the export name
descriptors `m_exports`, and the library descriptor `m_libDesc` that points to the array of export
names.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
struct Library {
  Library(IDxcBlob *dxil, const std::vector<std::wstring> exportedSymbols);
  Library(const Library &source);

  IDxcBlob *m_dxil;
  const std::vector<std::wstring> m_exportedSymbols;
  std::vector<D3D12_EXPORT_DESC> m_exports;
  D3D12_DXIL_LIBRARY_DESC m_libDesc;
};
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A `HitGroup` gathers the symbols corresponding to the intersection, any hit and closest hit shaders
forming the hit group. When no intersection shader is provided, the default triangle intersector is
used. When no any hit shader is provided, it is replaced by a default pass-through any hit shader.
The `m_desc` descriptor is the DirectX12 descriptor containing pointers to the shader symbol
strings.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
struct HitGroup {
  HitGroup(const std::wstring &hitGroupName, const std::wstring &closestHitSymbol,
           const std::wstring &anyHitSymbol = L"", const std::wstring &intersectionSymbol = L"");
  HitGroup(const HitGroup &source);

  std::wstring m_hitGroupName;
  std::wstring m_closestHitSymbol;
  std::wstring m_anyHitSymbol;
  std::wstring m_intersectionSymbol;
  D3D12_HIT_GROUP_DESC m_desc = {};
};
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When importing a shader, we need to associate it with its root signature to be able to fetch its
resources. The `RootSignatureAssociation` struct contains a pointer to the root signature, and the
symbols representing the shaders associated with it. Note that the symbol strings are actually
stored in `m_symbols`, while `m_symbolPointers` only stores pointers to those strings, to be passed
in the association descriptor.
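
Keeping both the strings and raw pointers to them has a consequence for copies: a member-wise copy would duplicate the pointers as they are, so the copy would still refer to the strings of the original object. The sketch below illustrates this with a hypothetical `SymbolSet` struct, a simplified analogue that only uses the standard library. It is an illustration under that assumption rather than the helper's code. The deleted assignment operator is an addition of this sketch and is not part of the helper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified analogue of a struct owning strings and exposing raw pointers to them
struct SymbolSet {
  explicit SymbolSet(const std::vector<std::wstring> &symbols)
      : m_symbols(symbols), m_symbolPointers(symbols.size()) {
    // Point at the strings owned by this object, not at the caller's copies
    for (size_t i = 0; i < m_symbols.size(); i++)
      m_symbolPointers[i] = m_symbols[i].c_str();
  }

  // Delegate to the main constructor so that the pointers are rebuilt against
  // the strings owned by the new object
  SymbolSet(const SymbolSet &source) : SymbolSet(source.m_symbols) {}

  // Assignment would copy stale pointers, so this sketch simply forbids it
  SymbolSet &operator=(const SymbolSet &) = delete;

  // True if every stored pointer refers to a string owned by this object
  bool PointsToOwnStrings() const {
    for (size_t i = 0; i < m_symbols.size(); i++) {
      if (m_symbolPointers[i] != m_symbols[i].c_str())
        return false;
    }
    return true;
  }

  std::vector<std::wstring> m_symbols;
  std::vector<const wchar_t *> m_symbolPointers;
};
```

Because the copy constructor rebuilds the pointers, such objects can be stored in a `std::vector` and survive its reallocations, which copy the elements. The helper's actual structure is declared as follows: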
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
struct RootSignatureAssociation {
  RootSignatureAssociation(ID3D12RootSignature *rootSignature,
                           const std::vector<std::wstring> &symbols);
  RootSignatureAssociation(const RootSignatureAssociation &source);

  ID3D12RootSignature *m_rootSignature;
  std::vector<std::wstring> m_symbols;
  std::vector<LPCWSTR> m_symbolPointers;
  D3D12_SUBOBJECT_TO_EXPORTS_ASSOCIATION m_association = {};
};
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using those wrapper structures, the class stores the libraries, hit groups and root signature
associations:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
std::vector<Library> m_libraries = {};
std::vector<HitGroup> m_hitGroups = {};
std::vector<RootSignatureAssociation> m_rootSignatureAssociations = {};
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The properties of the pipeline itself are also stored in the class: the payload size, the
intersection attributes size and the maximum recursion level:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
UINT m_maxPayLoadSizeInBytes = 0;
/// Attribute size, initialized to 2 for the barycentric coordinates used by the built-in triangle
/// intersection shader
UINT m_maxAttributeSizeInBytes = 2 * sizeof(float);
/// Maximum recursion depth, initialized to 1 to at least allow tracing primary rays
UINT m_maxRecursionDepth = 1;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For simplicity, the pipeline stores two pointers to the device: the actual `ID3D12Device`, and the
same pointer cast to an `ID3D12DeviceRaytracingPrototype`. This second pointer is only necessary
until DXR is part of the core DirectX12.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ID3D12Device *m_device;
ID3D12DeviceRaytracingPrototype *m_rtDevice;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The current implementation of DXR requires pipelines to contain at least one empty global and one
empty local root signature, which do not have to be associated with any shader. The helper takes
care of creating those and adding them to the pipeline automatically.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ID3D12RootSignature *m_dummyLocalRootSignature;
ID3D12RootSignature *m_dummyGlobalRootSignature;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Library

The internal structure `Library` stores the pointer to the provided DXIL library and the exported
symbol strings, and generates the set of export descriptors `m_exports` containing pointers to the
strings:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RayTracingPipeline::Library::Library(IDxcBlob *dxil,
                                     const std::vector<std::wstring> exportedSymbols)
    : m_exports(exportedSymbols.size()), m_dxil(dxil), m_exportedSymbols(exportedSymbols) {
  // Create one export descriptor per symbol
  for (size_t i = 0; i < m_exportedSymbols.size(); i++) {
    m_exports[i] = {};
    m_exports[i].Name = m_exportedSymbols[i].c_str();
    m_exports[i].ExportToRename = nullptr;
    m_exports[i].Flags = D3D12_EXPORT_FLAG_NONE;
  }

  // Create a library descriptor combining the DXIL code and the export names
  m_libDesc.DXILLibrary.BytecodeLength = dxil->GetBufferSize();
  m_libDesc.DXILLibrary.pShaderBytecode = dxil->GetBufferPointer();
  m_libDesc.NumExports = static_cast<UINT>(m_exportedSymbols.size());
  m_libDesc.pExports = m_exports.data();
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The copy
constructor has to be defined so that the export descriptors are set correctly. The default copy
constructor would copy the string pointers of the source object into the descriptors, leaving them
dangling once the original `Library` object goes out of scope:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RayTracingPipeline::Library::Library(const Library &source)
    : Library(source.m_dxil, source.m_exportedSymbols) {
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Hit groups

In a way similar to the `Library` struct, the hit group stores the strings corresponding to each
shader symbol, and creates the descriptor pointing to those strings:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RayTracingPipeline::HitGroup::HitGroup(const std::wstring &hitGroupName,
                                       const std::wstring &closestHitSymbol,
                                       const std::wstring &anyHitSymbol /*= L""*/,
                                       const std::wstring &intersectionSymbol /*= L""*/)
    : m_hitGroupName(hitGroupName), m_closestHitSymbol(closestHitSymbol),
      m_anyHitSymbol(anyHitSymbol), m_intersectionSymbol(intersectionSymbol) {
  // Indicate which shader program is used for closest hit, leave the other
  // ones undefined (default behavior), export the name of the group
  m_desc.HitGroupExport = m_hitGroupName.c_str();
  m_desc.ClosestHitShaderImport = m_closestHitSymbol.empty() ? nullptr : m_closestHitSymbol.c_str();
  m_desc.AnyHitShaderImport = m_anyHitSymbol.empty() ? nullptr : m_anyHitSymbol.c_str();
  m_desc.IntersectionShaderImport =
      m_intersectionSymbol.empty() ? nullptr : m_intersectionSymbol.c_str();
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The copy constructor also has to be defined so that the export descriptors are set correctly.
The default copy constructor would copy the string pointers of the source object into the
descriptors, leaving them dangling once the original `HitGroup` object goes out of scope:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RayTracingPipeline::HitGroup::HitGroup(const HitGroup &source)
    : HitGroup(source.m_hitGroupName, source.m_closestHitSymbol, source.m_anyHitSymbol,
               source.m_intersectionSymbol) {
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Root Signature association

The `RootSignatureAssociation` structure stores a pointer to the root signature, and the strings
corresponding to the names of the shaders associated with that root signature. From these, it also
generates a vector of string pointers to be used directly in the root signature association
descriptor built when compiling the pipeline.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RayTracingPipeline::RootSignatureAssociation::RootSignatureAssociation(
    ID3D12RootSignature *rootSignature, const std::vector<std::wstring> &symbols)
    : m_rootSignature(rootSignature), m_symbols(symbols), m_symbolPointers(symbols.size()) {
  for (size_t i = 0; i < m_symbols.size(); i++)
    m_symbolPointers[i] = m_symbols[i].c_str();
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As in `Library` and `HitGroup`, the copy constructor has to be defined so that the export
descriptors are set correctly.
The default copy constructor would copy the string pointers of the source object into the
descriptors, leaving them dangling once the original `RootSignatureAssociation` object goes out of
scope:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RayTracingPipeline::RootSignatureAssociation::RootSignatureAssociation(
    const RootSignatureAssociation &source)
    : RootSignatureAssociation(source.m_rootSignature, source.m_symbols) {
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## RayTracingPipeline

The constructor simply stores the device pointers, and builds the empty local and global root
signatures required to form a valid pipeline.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RayTracingPipeline::RayTracingPipeline(ID3D12Device *device,
                                       ID3D12DeviceRaytracingPrototype *rtDevice)
    : m_device(device), m_rtDevice(rtDevice) {
  // The pipeline creation requires having at least one empty global and local root signatures, so
  // we systematically create both, as this does not incur any overhead
  CreateDummyRootSignatures();
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## AddLibrary

The `AddLibrary` method adds a DXIL library to the pipeline. This library has to be compiled with
the dxc compiler (and not `D3DCompile`), using a `lib_6_3` target. The exported symbols must
correspond exactly to the names of the shaders declared in the library, although unused ones can be
omitted. Note that in the HLSL code, the semantic of each shader needs to be indicated by
decorating the shader function, for example: `[shader("raygeneration")] void RayGen()`.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void RayTracingPipeline::AddLibrary(IDxcBlob *dxilLibrary,
                                    const std::vector<std::wstring> &symbolExports) {
  m_libraries.emplace_back(Library(dxilLibrary, symbolExports));
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## AddHitGroup

Adds a hit group into the pipeline. As a reminder, in DXR the hit-related shaders are grouped into
hit groups. Such shaders are:

- The intersection shader, which can be used to intersect custom geometry, and is called upon
  hitting the bounding box of the object. A default one exists to intersect triangles
- The any hit shader, called on each intersection, which can be used to perform early alpha-testing
  and allow the ray to continue if needed. Default is a pass-through.
- The closest hit shader, invoked on the hit point closest to the ray start.

The shaders in a hit group share the same root signature, and are only referred to by the hit group
name in other places of the program.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void RayTracingPipeline::AddHitGroup(const std::wstring &hitGroupName,
                                     const std::wstring &closestHitSymbol,
                                     const std::wstring &anyHitSymbol /*= L""*/,
                                     const std::wstring &hitSymbol /*= L""*/) {
  m_hitGroups.emplace_back(HitGroup(hitGroupName, closestHitSymbol, anyHitSymbol, hitSymbol));
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## AddRootSignatureAssociation

The shaders and hit groups may have various root signatures. This call associates a root signature
with one or more symbols. All imported symbols must be associated with exactly one root signature,
otherwise the pipeline compilation will fail.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void RayTracingPipeline::AddRootSignatureAssociation(ID3D12RootSignature *rootSignature,
                                                     const std::vector<std::wstring> &symbols) {
  m_rootSignatureAssociations.emplace_back(RootSignatureAssociation(rootSignature, symbols));
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## SetMaxPayloadSize

The payload is the way hit or miss shaders can exchange data with the shader that called
`TraceRay`. When several ray types are used (e.g. primary and shadow rays), this value must be the
size of the largest payload. Note that to optimize performance, this size should be kept as low as
possible.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void RayTracingPipeline::SetMaxPayloadSize(UINT sizeInBytes) {
  m_maxPayLoadSizeInBytes = sizeInBytes;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## SetMaxAttributeSize

When hitting geometry, a number of surface attributes can be generated by the intersector. With
the built-in triangle intersector the attributes are the barycentric coordinates, with a size of
`2*sizeof(float)`.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void RayTracingPipeline::SetMaxAttributeSize(UINT sizeInBytes) {
  m_maxAttributeSizeInBytes = sizeInBytes;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## SetMaxRecursionDepth

Upon hitting a surface, a closest hit shader can issue a new `TraceRay` call. This parameter
indicates the maximum level of recursion. Note that this depth should be kept as low as possible,
typically 2, to allow hit shaders to trace shadow rays. For best performance, recursive ray tracing
algorithms should be flattened to a loop in the ray generation program.
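
As an illustration of what flattening means, the sketch below compares a recursive and a loop-based formulation of the same computation on the CPU. It is a deliberately simplified scalar model, not shader code and not part of the helpers: the `Bounce` struct and both functions are hypothetical, and each bounce is reduced to an emitted term plus a reflectance factor applied to whatever the next bounce returns.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-bounce data: light emitted at the hit point, and the
// fraction of the light arriving from the next bounce that is reflected back
struct Bounce {
  float emitted;
  float reflectance;
};

// Recursive formulation: each hit "traces" the next bounce before returning,
// so the nesting depth grows with the number of bounces
float ShadeRecursive(const std::vector<Bounce> &path, size_t depth = 0) {
  if (depth >= path.size())
    return 0.0f;
  return path[depth].emitted + path[depth].reflectance * ShadeRecursive(path, depth + 1);
}

// Flattened formulation: a single loop carries a running throughput, so no
// call is ever nested inside another one
float ShadeLoop(const std::vector<Bounce> &path) {
  float radiance = 0.0f;
  float throughput = 1.0f;
  for (const Bounce &b : path) {
    radiance += throughput * b.emitted;
    throughput *= b.reflectance;
  }
  return radiance;
}
```

Both functions compute the same sum. In a DXR setting the loop would live in the ray generation shader, with each hit shader writing its contribution and the next ray into the payload instead of calling `TraceRay` itself, which keeps the required recursion depth at 1. The setter itself only records the requested depth: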
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void RayTracingPipeline::SetMaxRecursionDepth(UINT maxDepth)
{
  m_maxRecursionDepth = maxDepth;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Generate

The `Generate` method builds an array of subobjects, each of which represents an element of the pipeline such as library imports, hit groups or root signature associations. The latter require two subobjects each: one to declare the root signature, and one to associate shader symbols with it. The shader configuration subobject contains the payload and attribute sizes. This configuration is associated with all shaders. The pipeline automatically adds empty root signatures, one local and one global, as required by the raytracing pipeline compiler. The last subobject is the pipeline configuration, setting the maximum recursion depth.

Since association subobjects refer to other subobjects by pointers, it is important to pre-allocate the vector of `D3D12_STATE_SUBOBJECT` to avoid reallocations and pointer invalidations. The `currentIndex` value will be used to set the values of the subobjects in the array.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ID3D12StateObjectPrototype *RayTracingPipeline::Generate()
{
  // The pipeline is made of a set of sub-objects, representing the DXIL libraries, hit group
  // declarations, root signature associations, plus some configuration objects
  UINT64 subobjectCount =
      m_libraries.size() +                     // DXIL libraries
      m_hitGroups.size() +                     // Hit group declarations
      1 +                                      // Shader configuration
      1 +                                      // Shader payload association
      2 * m_rootSignatureAssociations.size() + // Root signature declaration + association
      2 +                                      // Empty global and local root signatures
      1;                                       // Final pipeline subobject

  // Initialize a vector with the target object count.
  // It is necessary to make the allocation before adding subobjects as some subobjects reference
  // other subobjects by pointer. Using push_back may reallocate the array and invalidate those
  // pointers.
  std::vector<D3D12_STATE_SUBOBJECT> subobjects(subobjectCount);

  UINT currentIndex = 0;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The first subobjects define the DXIL libraries and their imported symbols:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Add all the DXIL libraries
  for (const Library &lib : m_libraries)
  {
    D3D12_STATE_SUBOBJECT libSubobject = {};
    libSubobject.Type = D3D12_STATE_SUBOBJECT_TYPE_DXIL_LIBRARY;
    libSubobject.pDesc = &lib.m_libDesc;

    subobjects[currentIndex++] = libSubobject;
  }
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Similarly, we add the hit groups:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Add all the hit group declarations
  for (const HitGroup &group : m_hitGroups)
  {
    D3D12_STATE_SUBOBJECT hitGroup = {};
    hitGroup.Type = D3D12_STATE_SUBOBJECT_TYPE_HIT_GROUP;
    hitGroup.pDesc = &group.m_desc;

    subobjects[currentIndex++] = hitGroup;
  }
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The shader configuration `D3D12_RAYTRACING_SHADER_CONFIG` stores the maximum payload and attribute sizes required by the shaders in the pipeline. We then create a subobject storing a pointer to this configuration object.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Add a subobject for the shader payload configuration
  D3D12_RAYTRACING_SHADER_CONFIG shaderDesc = {};
  shaderDesc.MaxPayloadSizeInBytes = m_maxPayLoadSizeInBytes;
  shaderDesc.MaxAttributeSizeInBytes = m_maxAttributeSizeInBytes;

  D3D12_STATE_SUBOBJECT shaderConfigObject = {};
  shaderConfigObject.Type = D3D12_STATE_SUBOBJECT_TYPE_RAYTRACING_SHADER_CONFIG;
  shaderConfigObject.pDesc = &shaderDesc;

  subobjects[currentIndex++] = shaderConfigObject;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The shader configuration needs to be associated with the ray generation and miss shaders, as well as with the hit groups. Since the API calls only define the imported symbols and the composition of the hit groups, we first build a list containing the names of the ray generation, miss and hit groups, but not the names of the intersection, any hit and closest hit programs. From that list, we generate a vector of string pointers to be used in the descriptors.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Build a list of all the symbols for ray generation, miss and hit groups
  // Those shaders have to be associated with the payload definition
  std::vector<std::wstring> exportedSymbols = {};
  std::vector<LPCWSTR> exportedSymbolPointers = {};
  BuildShaderExportList(exportedSymbols);

  // Build an array of the string pointers
  exportedSymbolPointers.reserve(exportedSymbols.size());
  for (const auto &name : exportedSymbols)
  {
    exportedSymbolPointers.push_back(name.c_str());
  }
  const WCHAR **shaderExports = exportedSymbolPointers.data();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From that list, we can now create a `D3D12_SUBOBJECT_TO_EXPORTS_ASSOCIATION` object storing the pointers to the symbols of the shaders, and a pointer to the subobject located at the previous index in the subobjects array, that is, the shader configuration subobject.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Add a subobject for the association between shaders and the shader configuration
  D3D12_SUBOBJECT_TO_EXPORTS_ASSOCIATION shaderPayloadAssociation = {};
  shaderPayloadAssociation.NumExports = static_cast<UINT>(exportedSymbols.size());
  shaderPayloadAssociation.pExports = shaderExports;

  // Associate the set of shaders with the payload defined in the previous subobject
  shaderPayloadAssociation.pSubobjectToAssociate = &subobjects[(currentIndex - 1)];
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

That association object is then added into a `D3D12_STATE_SUBOBJECT` in the pipeline:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Create and store the payload association object
  D3D12_STATE_SUBOBJECT shaderPayloadAssociationObject = {};
  shaderPayloadAssociationObject.Type =
      D3D12_STATE_SUBOBJECT_TYPE_SUBOBJECT_TO_EXPORTS_ASSOCIATION;
  shaderPayloadAssociationObject.pDesc = &shaderPayloadAssociation;
  subobjects[currentIndex++] = shaderPayloadAssociationObject;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Now that the shader configuration association is complete, we need to associate the imported shaders with their root signatures. Note that as with any DirectX12 shader, the resources used in the HLSL code must at least be a subset of the resources declared in the root signature, and ideally be exactly the same to avoid confusion and errors. We perform this association for every root signature association:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  for (RootSignatureAssociation &assoc : m_rootSignatureAssociations)
  {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In a way similar to the shader configuration association, associating a root signature with a shader symbol requires two subobjects: one to declare the root signature, and another to associate that root signature with a set of symbols. The first subobject simply stores the pointer to the root signature:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    // Add a subobject to declare the root signature
    D3D12_STATE_SUBOBJECT rootSigObject = {};
    rootSigObject.Type = D3D12_STATE_SUBOBJECT_TYPE_LOCAL_ROOT_SIGNATURE;
    rootSigObject.pDesc = &assoc.m_rootSignature;

    subobjects[currentIndex++] = rootSigObject;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To build the actual association object, we first gather the pointers to the symbol strings. Then, we associate the symbols with the root signature by setting `pSubobjectToAssociate` to the previous object in the subobjects array, that is, the root signature declaration.
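The declaration/association pattern relies on the addresses of the array elements staying fixed while the array is being filled. The following minimal sketch reproduces that pattern outside of D3D12. `Subobject`, `Association` and `PairsStayLinked` are hypothetical stand-ins introduced for illustration, not types or functions of the helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for D3D12_STATE_SUBOBJECT and
// D3D12_SUBOBJECT_TO_EXPORTS_ASSOCIATION, reduced to the fields that matter here.
struct Subobject {
  int type = 0;               // 1 = declaration, 2 = association
  const void *desc = nullptr;
};
struct Association {
  const Subobject *target = nullptr;
};

// Builds `pairCount` (declaration, association) pairs in pre-sized vectors and reports
// whether every association still points at its declaration once the array is complete.
bool PairsStayLinked(std::size_t pairCount) {
  // Sizing both vectors up front means no later reallocation can move their elements.
  std::vector<Subobject> subobjects(2 * pairCount);
  std::vector<Association> associations(pairCount);

  std::size_t currentIndex = 0;
  for (std::size_t i = 0; i < pairCount; ++i) {
    Subobject declaration;
    declaration.type = 1;
    subobjects[currentIndex++] = declaration;

    // Point at the element written just before, as the helper does with currentIndex - 1.
    associations[i].target = &subobjects[currentIndex - 1];

    Subobject association;
    association.type = 2;
    association.desc = &associations[i];
    subobjects[currentIndex++] = association;
  }
  assert(currentIndex == subobjects.size());

  for (std::size_t i = 0; i < pairCount; ++i) {
    if (associations[i].target != &subobjects[2 * i]) return false;
    if (associations[i].target->type != 1) return false;
    if (subobjects[2 * i + 1].desc != &associations[i]) return false;
  }
  return true;
}
```

Had the sketch used `push_back` on an empty vector instead, a reallocation could move the elements and leave earlier `target` pointers dangling, which is the situation the pre-allocation avoids.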
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    // Add a subobject for the association between the exported shader symbols and the root
    // signature
    assoc.m_association.NumExports = static_cast<UINT>(assoc.m_symbolPointers.size());
    assoc.m_association.pExports = assoc.m_symbolPointers.data();
    assoc.m_association.pSubobjectToAssociate = &subobjects[(currentIndex - 1)];

    D3D12_STATE_SUBOBJECT rootSigAssociationObject = {};
    rootSigAssociationObject.Type = D3D12_STATE_SUBOBJECT_TYPE_SUBOBJECT_TO_EXPORTS_ASSOCIATION;
    rootSigAssociationObject.pDesc = &assoc.m_association;

    subobjects[currentIndex++] = rootSigAssociationObject;
  }
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As stated at the beginning of the section, a valid pipeline must contain empty local and global root signatures. We add two subobjects containing the pointers to the automatically generated ones. Note that this requirement may not exist in the final release of DXR.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // The pipeline construction always requires an empty global root signature
  D3D12_STATE_SUBOBJECT globalRootSig;
  globalRootSig.Type = D3D12_STATE_SUBOBJECT_TYPE_ROOT_SIGNATURE;
  ID3D12RootSignature *dgSig = m_dummyGlobalRootSignature;
  globalRootSig.pDesc = &dgSig;

  subobjects[currentIndex++] = globalRootSig;

  // The pipeline construction always requires an empty local root signature
  D3D12_STATE_SUBOBJECT dummyLocalRootSig;
  dummyLocalRootSig.Type = D3D12_STATE_SUBOBJECT_TYPE_LOCAL_ROOT_SIGNATURE;
  ID3D12RootSignature *dlSig = m_dummyLocalRootSignature;
  dummyLocalRootSig.pDesc = &dlSig;

  subobjects[currentIndex++] = dummyLocalRootSig;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The final subobject in the pipeline is the pipeline configuration, which indicates the maximum recursion level allowed:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Add a subobject for the ray tracing pipeline configuration
  D3D12_RAYTRACING_PIPELINE_CONFIG pipelineConfig = {};
  pipelineConfig.MaxTraceRecursionDepth = m_maxRecursionDepth;

  D3D12_STATE_SUBOBJECT pipelineConfigObject = {};
  pipelineConfigObject.Type = D3D12_STATE_SUBOBJECT_TYPE_RAYTRACING_PIPELINE_CONFIG;
  pipelineConfigObject.pDesc = &pipelineConfig;

  subobjects[currentIndex++] = pipelineConfigObject;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The pipeline descriptor is the input to the actual compilation. It contains the pointer to the array of subobjects defining the pipeline.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Describe the ray tracing pipeline state object
  D3D12_STATE_OBJECT_DESC pipelineDesc = {};
  pipelineDesc.Type = D3D12_STATE_OBJECT_TYPE_RAYTRACING_PIPELINE;
  pipelineDesc.NumSubobjects = currentIndex;
  pipelineDesc.pSubobjects = subobjects.data();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From this descriptor we can finally create the raytracing pipeline state object:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  ID3D12StateObjectPrototype *rtStateObject = nullptr;

  // Create the state object
  HRESULT hr = m_rtDevice->CreateStateObject(&pipelineDesc, IID_PPV_ARGS(&rtStateObject));
  if (FAILED(hr))
  {
    throw std::logic_error("Could not create the raytracing state object");
  }
  return rtStateObject;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## CreateDummyRootSignatures

This private method, which creates the empty root signatures, is straightforward and uses the classical root signature template found in the DirectX samples. Note that we could also have used the `RootSignatureGenerator` for this, but chose not to in order to avoid nested helpers and facilitate code copy-pasting in other applications.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void RayTracingPipeline::CreateDummyRootSignatures()
{
  // Creation of the global root signature
  D3D12_ROOT_SIGNATURE_DESC rootDesc = {};
  rootDesc.NumParameters = 0;
  rootDesc.pParameters = nullptr;
  // A global root signature is the default, hence this flag
  rootDesc.Flags = D3D12_ROOT_SIGNATURE_FLAG_NONE;

  HRESULT hr = 0;

  ID3DBlob *serializedRootSignature;
  ID3DBlob *error;

  // Create the empty global root signature
  hr = D3D12SerializeRootSignature(&rootDesc, D3D_ROOT_SIGNATURE_VERSION_1,
                                   &serializedRootSignature, &error);
  if (FAILED(hr))
  {
    throw std::logic_error("Could not serialize the global root signature");
  }
  hr = m_device->CreateRootSignature(0, serializedRootSignature->GetBufferPointer(),
                                     serializedRootSignature->GetBufferSize(),
                                     IID_PPV_ARGS(&m_dummyGlobalRootSignature));

  serializedRootSignature->Release();
  if (FAILED(hr))
  {
    throw std::logic_error("Could not create the global root signature");
  }

  // Create the local root signature, reusing the same descriptor but altering the creation flag
  rootDesc.Flags = D3D12_ROOT_SIGNATURE_FLAG_LOCAL_ROOT_SIGNATURE;
  hr = D3D12SerializeRootSignature(&rootDesc, D3D_ROOT_SIGNATURE_VERSION_1,
                                   &serializedRootSignature, &error);
  if (FAILED(hr))
  {
    throw std::logic_error("Could not serialize the local root signature");
  }
  hr = m_device->CreateRootSignature(0, serializedRootSignature->GetBufferPointer(),
                                     serializedRootSignature->GetBufferSize(),
                                     IID_PPV_ARGS(&m_dummyLocalRootSignature));

  serializedRootSignature->Release();
  if (FAILED(hr))
  {
    throw std::logic_error("Could not create the local root signature");
  }
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## BuildShaderExportList

This private method builds a list containing the export symbols for the ray generation shaders, miss shaders, and hit group names.
It also performs some sanity checks to obtain more explicit error messages in case of invalid symbols.

The method starts by building the set `exports` containing all the symbols exported by the libraries. In debug mode it also verifies that no symbols are duplicated.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void RayTracingPipeline::BuildShaderExportList(std::vector<std::wstring> &exportedSymbols)
{
  // Get all names from libraries
  // Get names associated to hit groups
  // Return list of libraries+hit group names - shaders in hit groups

  std::unordered_set<std::wstring> exports;

  // Add all the symbols exported by the libraries
  for (const Library &lib : m_libraries)
  {
    for (const auto &exportName : lib.m_exportedSymbols)
    {
#ifdef _DEBUG
      // Sanity check in debug mode: check that no name is exported more than once
      if (exports.find(exportName) != exports.end())
      {
        throw std::logic_error("Multiple definition of a symbol in the imported DXIL libraries");
      }
#endif
      exports.insert(exportName);
    }
  }
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In debug mode, we now check that the shader names referenced in the hit groups actually correspond to exported symbols from the libraries. We also add the hit group names to the `all_exports` set for further checks.
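The set manipulation performed by this method can be tried in isolation from D3D12. The following minimal sketch validates the hit group symbols against the library exports, then replaces the shaders owned by hit groups with the hit group names. `HitGroupNames` and `ShaderExportList` are hypothetical names chosen for this illustration, and the result is sorted only to make the output deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>
#include <stdexcept>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical description of a hit group: its name and the symbols it combines.
// Empty strings stand for "use the default shader".
struct HitGroupNames {
  std::wstring name;
  std::wstring closestHit;
  std::wstring anyHit;
  std::wstring intersection;
};

// Returns the sorted names of ray generation shaders, miss shaders and hit groups.
std::vector<std::wstring> ShaderExportList(const std::vector<std::wstring> &librarySymbols,
                                           const std::vector<HitGroupNames> &hitGroups) {
  const std::unordered_set<std::wstring> known(librarySymbols.begin(), librarySymbols.end());
  std::unordered_set<std::wstring> result = known;

  for (const HitGroupNames &group : hitGroups) {
    for (const std::wstring *symbol : {&group.closestHit, &group.anyHit, &group.intersection}) {
      if (symbol->empty()) continue;
      // Validate against the full export set, so shaders shared by several hit groups pass.
      if (known.count(*symbol) == 0) {
        throw std::logic_error("Hit group references a symbol missing from the libraries");
      }
      result.erase(*symbol);
    }
    result.insert(group.name);
  }

  std::vector<std::wstring> sorted(result.begin(), result.end());
  std::sort(sorted.begin(), sorted.end());
  return sorted;
}
```

With library symbols `RayGen`, `Miss` and `ClosestHit`, and one hit group `HitGroup` using `ClosestHit`, the sketch yields `HitGroup`, `Miss` and `RayGen`: the closest hit shader is only reachable through its hit group name.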
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#ifdef _DEBUG
  // Sanity check in debug mode: verify that the hit groups do not reference an unknown shader name
  std::unordered_set<std::wstring> all_exports = exports;

  for (const auto &hitGroup : m_hitGroups)
  {
    if (!hitGroup.m_anyHitSymbol.empty() &&
        exports.find(hitGroup.m_anyHitSymbol) == exports.end())
      throw std::logic_error("Any hit symbol not found in the imported DXIL libraries");

    if (!hitGroup.m_closestHitSymbol.empty() &&
        exports.find(hitGroup.m_closestHitSymbol) == exports.end())
      throw std::logic_error("Closest hit symbol not found in the imported DXIL libraries");

    if (!hitGroup.m_intersectionSymbol.empty() &&
        exports.find(hitGroup.m_intersectionSymbol) == exports.end())
      throw std::logic_error("Intersection symbol not found in the imported DXIL libraries");

    all_exports.insert(hitGroup.m_hitGroupName);
  }
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Still in debug mode only, we verify that the symbols referenced by root signature associations are actually either a ray generation/miss shader, or a hit group:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Sanity check in debug mode: verify that the root signature associations do not reference an
  // unknown shader or hit group name
  for (const auto &assoc : m_rootSignatureAssociations)
  {
    for (const auto &symb : assoc.m_symbols)
    {
      if (!symb.empty() && all_exports.find(symb) == all_exports.end())
      {
        throw std::logic_error("Root association symbol not found in the "
                               "imported DXIL libraries and hit group names");
      }
    }
  }
#endif
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The `exports` set contains all the symbols exported by the libraries, but this method must output the list of ray generation shaders, miss shaders and hit groups.
We therefore remove the names of intersection, any hit and closest hit shaders from the set, and add the hit group names instead.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Go through all hit groups and remove the symbols corresponding to intersection, any hit and
  // closest hit shaders from the symbol set
  for (const auto &hitGroup : m_hitGroups)
  {
    if (!hitGroup.m_anyHitSymbol.empty())
      exports.erase(hitGroup.m_anyHitSymbol);
    if (!hitGroup.m_closestHitSymbol.empty())
      exports.erase(hitGroup.m_closestHitSymbol);
    if (!hitGroup.m_intersectionSymbol.empty())
      exports.erase(hitGroup.m_intersectionSymbol);
    exports.insert(hitGroup.m_hitGroupName);
  }
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We then iterate over the set to build the target vector of names containing ray generation and miss shaders, plus the hit group names:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  for (const auto &name : exports)
  {
    exportedSymbols.push_back(name);
  }
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# Shader Binding Table

The Shader Binding Table (SBT) is the cornerstone of DXR's raytracing setup: it associates the contents of the acceleration structures with the shaders and their resources. The `ShaderBindingTable` class is a helper to construct the SBT. It helps maintain the proper offsets of each element, which are required when constructing the SBT, but also when filling the input descriptor to `DispatchRays`.

Each record in the SBT consists of a shader or hit group name, followed by a set of 64-bit values representing either pointers in the heap, buffer pointers, or 32-bit constants.

In a simple example, we first obtain the pointer to the beginning of the heap, and add the ray generation program `RayGen`, which will have to access only the heap, as described in its root signature.
This heap access is provided by adding the heap pointer to the ray generation program resources. The simple `Miss` shader only communicates results through its payload, and therefore does not require any resources. We then declare a first hit group `HitGroup` that will be used by primary rays, and another `ShadowHitGroup` that will be called when tracing shadow rays. Please refer to the [DXR Tutorial](/rtx/raytracing/dxr/DX12-Raytracing-tutorial-Part-1) for an explanation of the hit group mappings to the geometry instances.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
D3D12_GPU_DESCRIPTOR_HANDLE srvUavHeapHandle = m_srvUavHeap->GetGPUDescriptorHandleForHeapStart();
UINT64* heapPointer = reinterpret_cast< UINT64* >(srvUavHeapHandle.ptr);

m_sbtHelper.AddRayGenerationProgram(L"RayGen", {heapPointer});
m_sbtHelper.AddMissProgram(L"Miss", {});
m_sbtHelper.AddHitGroup(L"HitGroup", {(void*)(m_constantBuffers[i]->GetGPUVirtualAddress())});
m_sbtHelper.AddHitGroup(L"ShadowHitGroup", {});
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Once the entries in the SBT have been defined, the size of the required SBT buffer on the GPU is computed by a call to `ComputeSBTSize`. In a way similar to the acceleration structure setup, this allows the application to know how much memory will be required to store the SBT on the GPU, and allocate the buffer as needed. Note that the helper will map the SBT buffer, and hence this buffer needs to be created on the upload heap.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// Create the SBT on the upload heap
// ComputeSBTSize returns the required size in bytes, see its definition below
uint32_t sbtSize = m_sbtHelper.ComputeSBTSize(GetRTDevice());
m_sbtStorage = nv_helpers_dx12::CreateBuffer(m_device.Get(), sbtSize, D3D12_RESOURCE_FLAG_NONE,
                                             D3D12_RESOURCE_STATE_GENERIC_READ,
                                             nv_helpers_dx12::kUploadHeapProps);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using the application-allocated buffer, the SBT is then generated by calling the `Generate` method:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
m_sbtHelper.Generate(m_sbtStorage.Get(), m_rtStateObjectProps.Get());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The contents of the SBT are used during the raytracing process, and for that the `DispatchRays` call needs to obtain the appropriate pointers and offsets to address the right shaders and resources. This consistency is enforced by using the helper when creating the `D3D12_DISPATCH_RAYS_DESC` upon rendering. The helper introduces a number of `Get*` methods for each shader category (ray generation, miss, hit group) to access the size of an SBT entry for that shader category, and the size of the SBT section for that category. The helper arbitrarily places the ray generation section first, followed by the miss shaders, then the hit groups. That is why the `StartAddress` of the ray generation section is at the beginning of the SBT buffer, while the address of the first miss is offset by `rayGenerationSectionSizeInBytes`. Similarly, we offset the address of the first hit group.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
D3D12_DISPATCH_RAYS_DESC desc = {};
// The ray generation shaders are at the beginning of the SBT
uint32_t rayGenerationSectionSizeInBytes = m_sbtHelper.GetRayGenSectionSize();
desc.RayGenerationShaderRecord.StartAddress = m_sbtStorage->GetGPUVirtualAddress();
desc.RayGenerationShaderRecord.SizeInBytes = rayGenerationSectionSizeInBytes;

// The miss section starts after the ray generation shaders
uint32_t missSectionSizeInBytes = m_sbtHelper.GetMissSectionSize();
desc.MissShaderTable.StartAddress =
    m_sbtStorage->GetGPUVirtualAddress() + rayGenerationSectionSizeInBytes;
desc.MissShaderTable.SizeInBytes = missSectionSizeInBytes;
desc.MissShaderTable.StrideInBytes = m_sbtHelper.GetMissEntrySize();

// The hit groups section starts after the miss shaders
uint32_t hitGroupsSectionSize = m_sbtHelper.GetHitGroupSectionSize();
desc.HitGroupTable.StartAddress = m_sbtStorage->GetGPUVirtualAddress() +
                                  rayGenerationSectionSizeInBytes + missSectionSizeInBytes;
desc.HitGroupTable.SizeInBytes = hitGroupsSectionSize;
desc.HitGroupTable.StrideInBytes = m_sbtHelper.GetHitGroupEntrySize();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Private class members

An `SBTEntry` structure stores the name of the shader, and a vector containing the set of 64-bit values representing its resources (either heap/buffer pointers or 32-bit constants):

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
class SBTEntry
{
public:
  SBTEntry(const std::wstring &entryPoint, const std::vector<void *> &inputData);

  const std::wstring m_entryPoint;
  const std::vector<void *> m_inputData;
};
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The SBT helper maintains a list of shaders in each category: ray generation, miss and hit group.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
std::vector<SBTEntry> m_rayGen;
std::vector<SBTEntry> m_miss;
std::vector<SBTEntry> m_hitGroup;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For each category, the size of an entry in the SBT depends on the maximum number of resources used by the shaders in that category. The helper computes those values automatically in `GetEntrySize`.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
uint32_t m_rayGenEntrySize;
uint32_t m_missEntrySize;
uint32_t m_hitGroupEntrySize;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The program names are translated into program identifiers. The size in bytes of an identifier is provided by the device and is the same for all categories.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
UINT m_progIdSize;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## AddRayGenerationProgram

This method adds a ray generation program by name, with its list of data pointers or values according to the layout of its root signature.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void ShaderBindingTable::AddRayGenerationProgram(const std::wstring &entryPoint,
                                                 const std::vector<void *> &inputData)
{
  m_rayGen.emplace_back(SBTEntry(entryPoint, inputData));
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## AddMissProgram

Adds a miss program by name, with its list of data pointers or values according to the layout of its root signature.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void ShaderBindingTable::AddMissProgram(const std::wstring &entryPoint,
                                        const std::vector<void *> &inputData)
{
  m_miss.emplace_back(SBTEntry(entryPoint, inputData));
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## AddHitGroup

Adds a hit group by name, with its list of data pointers or values according to the layout of its root signature.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void ShaderBindingTable::AddHitGroup(const std::wstring &entryPoint,
                                     const std::vector<void *> &inputData)
{
  m_hitGroup.emplace_back(SBTEntry(entryPoint, inputData));
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## ComputeSBTSize

The size of the Shader Binding Table depends on the set of programs and hit groups it contains, and on how many resources are required for each category of shader programs. We first query the size of a program identifier, which is dependent on the driver implementation. Then, for each shader category (ray generation, miss, hit group) we use the private `GetEntrySize()` method to compute the amount of memory required for an entry of each category. The size of the SBT is then given by the number of programs in each category and their SBT entry sizes. Note that the SBT size needs to be a multiple of 256, hence the rounding. After calling `ComputeSBTSize` the application only has to allocate the SBT buffer on the upload heap.
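The size computation can be worked through with plain integers. The following sketch is a standalone illustration under stated assumptions: `RoundUp`, `EntrySize` and `SbtSize` are hypothetical free functions (the helper itself uses a `ROUND_UP` macro and class members), each category is represented only by the argument count of its entries, and the identifier size and record alignment are passed in rather than queried from a device.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Rounds `value` up to the next multiple of `alignment` (alignment must be non-zero).
constexpr std::uint32_t RoundUp(std::uint32_t value, std::uint32_t alignment) {
  return ((value + alignment - 1) / alignment) * alignment;
}

// Entry size of one category: identifier plus 8 bytes per argument of the largest entry,
// rounded up to the record alignment. `argumentCounts` holds one count per entry.
std::uint32_t EntrySize(std::uint32_t programIdSize,
                        const std::vector<std::uint32_t> &argumentCounts,
                        std::uint32_t recordAlignment) {
  std::uint32_t maxArgs = 0;
  for (std::uint32_t count : argumentCounts) {
    maxArgs = std::max(maxArgs, count);
  }
  return RoundUp(programIdSize + 8 * maxArgs, recordAlignment);
}

// Total size: each category contributes (entry size x entry count), rounded up to 256.
std::uint32_t SbtSize(std::uint32_t programIdSize, const std::vector<std::uint32_t> &rayGen,
                      const std::vector<std::uint32_t> &miss,
                      const std::vector<std::uint32_t> &hitGroup,
                      std::uint32_t recordAlignment) {
  const std::uint32_t total =
      EntrySize(programIdSize, rayGen, recordAlignment) * static_cast<std::uint32_t>(rayGen.size()) +
      EntrySize(programIdSize, miss, recordAlignment) * static_cast<std::uint32_t>(miss.size()) +
      EntrySize(programIdSize, hitGroup, recordAlignment) * static_cast<std::uint32_t>(hitGroup.size());
  return RoundUp(total, 256);
}
```

For the example of the previous section (one ray generation program with one argument, one miss program without arguments, and two hit groups with one and zero arguments), and assuming a 32-byte identifier with 16-byte record alignment, the entry sizes are 48, 32 and 48 bytes. The three sections then occupy 48 + 32 + 96 = 176 bytes, which rounds up to 256.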
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
uint32_t ShaderBindingTable::ComputeSBTSize(ID3D12DeviceRaytracingPrototype *rtDevice)
{
  // Size of a program identifier
  m_progIdSize = rtDevice->GetShaderIdentifierSize();

  // Compute the entry size of each program type depending on the maximum number of parameters in
  // each category
  m_rayGenEntrySize = GetEntrySize(m_rayGen);
  m_missEntrySize = GetEntrySize(m_miss);
  m_hitGroupEntrySize = GetEntrySize(m_hitGroup);

  // The total SBT size is the sum of the entries for ray generation, miss and hit groups, aligned
  // on 256 bytes
  uint32_t sbtSize = ROUND_UP(m_rayGenEntrySize * static_cast<UINT>(m_rayGen.size()) +
                                  m_missEntrySize * static_cast<UINT>(m_miss.size()) +
                                  m_hitGroupEntrySize * static_cast<UINT>(m_hitGroup.size()),
                              256);
  return sbtSize;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## GetEntrySize

This private method is invoked by `ComputeSBTSize`, and computes the size of the SBT entries for a set of entries, which is determined by finding the entry with the maximum number of parameters in its root signature. An SBT entry then contains the program identifier, plus 8 bytes for each parameter. The entries need to be aligned on `D3D12_RAYTRACING_SHADER_RECORD_BYTE_ALIGNMENT` bytes, which is currently 16.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
uint32_t ShaderBindingTable::GetEntrySize(const std::vector<SBTEntry> &entries)
{
  // Find the maximum number of parameters used by a single entry
  size_t maxArgs = 0;
  for (const auto &shader : entries)
  {
    maxArgs = max(maxArgs, shader.m_inputData.size());
  }
  // An SBT entry is made of a program ID and a set of parameters, taking 8 bytes each. Those
  // parameters can either be 8-byte pointers, or 4-byte constants
  uint32_t entrySize = m_progIdSize + 8 * static_cast<uint32_t>(maxArgs);

  // The entries of the shader binding table must be 16-byte-aligned
  entrySize = ROUND_UP(entrySize, D3D12_RAYTRACING_SHADER_RECORD_BYTE_ALIGNMENT);

  return entrySize;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Generate

Once the SBT size has been computed and the application has allocated the SBT buffer on the upload heap, the `Generate` method builds the actual contents of the SBT. We first map the SBT buffer to allow writing to it, hence the need to have the buffer on the upload heap. Then, for each shader category, we copy the shader identifiers and resources using the private method `CopyShaderData`. This method returns the number of bytes written in the SBT to store this category of shader. We call this method first for the ray generation, then for the miss shaders, and finally for the hit groups, before unmapping the buffer.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// Generate the SBT and store it into sbtBuffer, which has to be pre-allocated on the upload heap.
// Access to the raytracing pipeline object is required to fetch program identifiers using their
// names
void ShaderBindingTable::Generate(ID3D12Resource *sbtBuffer,
                                  ID3D12StateObjectPropertiesPrototype *raytracingPipeline)
{
  // Map the SBT
  uint8_t *pData;
  HRESULT hr = sbtBuffer->Map(0, nullptr, (void **)&pData);
  if (FAILED(hr))
  {
    throw std::logic_error("Could not map the shader binding table");
  }
  // Copy the shader identifiers followed by their resource pointers or root constants: first the
  // ray generation, then the miss shaders, and finally the set of hit groups
  uint32_t offset = 0;

  offset = CopyShaderData(raytracingPipeline, pData, m_rayGen, m_rayGenEntrySize);
  pData += offset;

  offset = CopyShaderData(raytracingPipeline, pData, m_miss, m_missEntrySize);
  pData += offset;

  offset = CopyShaderData(raytracingPipeline, pData, m_hitGroup, m_hitGroupEntrySize);

  // Unmap the SBT
  sbtBuffer->Unmap(0, nullptr);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## CopyShaderData

For each entry, this private method copies the shader identifier followed by its resource pointers and/or root constants in `outputData`, with a stride in bytes of `entrySize`, and returns the size in bytes actually written to `outputData`.

We iterate through the list of entries, and check whether that symbol is actually defined in the raytracing pipeline. We then copy the shader identifier and its array of resources to the SBT. At the end we return the number of bytes written, which is given by the number of entries times the size of an entry.
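To make the strided layout concrete, the following sketch packs records into a plain byte buffer instead of a mapped D3D12 resource. `Record`, `MakeRecord`, `PackRecords` and `ReadArgument` are hypothetical names used only for this illustration, and the identifier bytes are filled with an arbitrary value since no pipeline is available to provide real ones.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical record: opaque identifier bytes followed by 64-bit arguments.
struct Record {
  std::vector<std::uint8_t> identifier;
  std::vector<std::uint64_t> arguments;
};

// Creates a record whose identifier consists of `idSize` bytes all equal to `fill`.
Record MakeRecord(std::uint8_t fill, std::uint32_t idSize, std::vector<std::uint64_t> arguments) {
  return Record{std::vector<std::uint8_t>(idSize, fill), std::move(arguments)};
}

// Writes each record at a multiple of `entrySize`: identifier first, arguments right after.
// Bytes between the last argument and the next record are left as zero padding.
std::vector<std::uint8_t> PackRecords(const std::vector<Record> &records, std::uint32_t idSize,
                                      std::uint32_t entrySize) {
  std::vector<std::uint8_t> buffer(records.size() * entrySize, 0);
  std::uint8_t *cursor = buffer.data();
  for (const Record &record : records) {
    assert(record.identifier.size() == idSize);
    assert(idSize + 8 * record.arguments.size() <= entrySize);
    std::memcpy(cursor, record.identifier.data(), idSize);
    if (!record.arguments.empty()) {
      std::memcpy(cursor + idSize, record.arguments.data(), 8 * record.arguments.size());
    }
    cursor += entrySize;
  }
  return buffer;
}

// Reads back argument `arg` of record `index` from a packed buffer.
std::uint64_t ReadArgument(const std::vector<std::uint8_t> &buffer, std::uint32_t idSize,
                           std::uint32_t entrySize, std::size_t index, std::size_t arg) {
  std::uint64_t value = 0;
  std::memcpy(&value, buffer.data() + index * entrySize + idSize + 8 * arg, sizeof(value));
  return value;
}
```

With a 32-byte identifier and a 48-byte entry size, two records occupy 96 bytes, and the second record starts at byte 48 regardless of how many arguments the first one carries. This fixed stride is what allows `DispatchRays` to address entries by index.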
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
uint32_t ShaderBindingTable::CopyShaderData(
    ID3D12StateObjectPropertiesPrototype *raytracingPipeline, uint8_t *outputData,
    const std::vector<SBTEntry> &shaders, uint32_t entrySize)
{
  uint8_t *pData = outputData;
  for (const auto &shader : shaders)
  {
    // Get the shader identifier, and check whether that identifier is known
    void *id = raytracingPipeline->GetShaderIdentifier(shader.m_entryPoint.c_str());
    if (!id)
    {
      std::wstring errMsg(std::wstring(L"Unknown shader identifier used in the SBT: ") +
                          shader.m_entryPoint);
      throw std::logic_error(std::string(errMsg.begin(), errMsg.end()));
    }
    // Copy the shader identifier
    memcpy(pData, id, m_progIdSize);
    // Copy all its resources pointers or values in bulk
    memcpy(pData + m_progIdSize, shader.m_inputData.data(), shader.m_inputData.size() * 8);

    pData += entrySize;
  }
  // Return the number of bytes actually written to the output buffer
  return static_cast<uint32_t>(shaders.size()) * entrySize;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Reset

This method clears the registered programs and hit groups, and resets the cached entry sizes and
program identifier size to zero.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// Reset the sets of programs and hit groups
void ShaderBindingTable::Reset()
{
  m_rayGen.clear();
  m_miss.clear();
  m_hitGroup.clear();

  m_rayGenEntrySize = 0;
  m_missEntrySize = 0;
  m_hitGroupEntrySize = 0;
  m_progIdSize = 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Getters

The following getters simplify the call to `DispatchRays`, where the offsets of the shader programs
must exactly follow the SBT layout. Their implementation is straightforward: each one reads the
precomputed entry size and, for the section sizes, multiplies it by the number of entries in that
category.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// Get the size in bytes of the SBT section dedicated to ray generation programs
UINT ShaderBindingTable::GetRayGenSectionSize() const
{
  return m_rayGenEntrySize * static_cast<UINT>(m_rayGen.size());
}

// Get the size in bytes of one ray generation program entry in the SBT
UINT ShaderBindingTable::GetRayGenEntrySize() const
{
  return m_rayGenEntrySize;
}

// Get the size in bytes of the SBT section dedicated to miss programs
UINT ShaderBindingTable::GetMissSectionSize() const
{
  return m_missEntrySize * static_cast<UINT>(m_miss.size());
}

// Get the size in bytes of one miss program entry in the SBT
UINT ShaderBindingTable::GetMissEntrySize()
{
  return m_missEntrySize;
}

// Get the size in bytes of the SBT section dedicated to hit groups
UINT ShaderBindingTable::GetHitGroupSectionSize() const
{
  return m_hitGroupEntrySize * static_cast<UINT>(m_hitGroup.size());
}

// Get the size in bytes of one hit group entry in the SBT
UINT ShaderBindingTable::GetHitGroupEntrySize() const
{
  return m_hitGroupEntrySize;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
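
Since `Generate` writes the ray generation section first, then the miss section, then the hit groups, the start of each section can be derived from the section sizes returned by the getters. The sketch below illustrates that arithmetic only; `SbtSizes`, `SbtAddresses` and `ComputeSbtAddresses` are hypothetical names that are not part of the helpers, and the sizes are passed as plain integers so the example compiles without the DirectX headers.

```cpp
#include <cstdint>

// Hypothetical plain-data mirror of the section sizes returned by the getters
struct SbtSizes
{
  uint32_t rayGenSectionSize;   // e.g. from GetRayGenSectionSize()
  uint32_t missSectionSize;     // e.g. from GetMissSectionSize()
  uint32_t hitGroupSectionSize; // e.g. from GetHitGroupSectionSize()
};

// Start address of each section within the SBT buffer
struct SbtAddresses
{
  uint64_t rayGenStart;
  uint64_t missStart;
  uint64_t hitGroupStart;
};

// Sections are laid out back to back, in the order in which Generate writes them
constexpr SbtAddresses ComputeSbtAddresses(uint64_t sbtStart, SbtSizes sizes)
{
  return SbtAddresses{sbtStart,
                      sbtStart + sizes.rayGenSectionSize,
                      sbtStart + sizes.rayGenSectionSize + sizes.missSectionSize};
}
```

In an application, `sbtStart` would typically come from `sbtBuffer->GetGPUVirtualAddress()`. In the released D3D12 headers, the resulting addresses would fill the `StartAddress` fields of `RayGenerationShaderRecord`, `MissShaderTable` and `HitGroupTable` in `D3D12_DISPATCH_RAYS_DESC`, with the section sizes going to `SizeInBytes` and the entry sizes to `StrideInBytes`. The prototype API used in this document may name these fields differently.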